<div dir="ltr">Hi Heidi,<div><br></div><div>Thanks for your patience; I've now had a chance to look at this.</div><div><br></div><div style>If I build a cache file from the following files:</div><div style><br></div><div style>
<a href="ftp://ftp.ensembl.org/pub/release-70/gtf/gasterosteus_aculeatus/Gasterosteus_aculeatus.BROADS1.70.gtf.gz">ftp://ftp.ensembl.org/pub/release-70/gtf/gasterosteus_aculeatus/Gasterosteus_aculeatus.BROADS1.70.gtf.gz</a><br>
</div><div style><br></div><div style>and</div><div style><br></div><div style><a href="ftp://ftp.ensembl.org/pub/release-70/fasta/gasterosteus_aculeatus/dna/Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa.gz">ftp://ftp.ensembl.org/pub/release-70/fasta/gasterosteus_aculeatus/dna/Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa.gz</a><br>
</div><div style><br></div><div style>I get (I think!) the correct output from the VEP:</div><div style><br></div><div style>perl <a href="http://gtf2vep.pl">gtf2vep.pl</a> -i Gasterosteus_aculeatus.BROADS1.70.gtf.gz -fasta Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa -species gasterosteus_aculeatus -dir test/ -db 70<br>
</div><div style>perl <a href="http://variant_effect_predictor.pl">variant_effect_predictor.pl</a> -i gastero_in.txt -species gasterosteus_aculeatus -force -off -dir test/ -db 70<br></div><div style>grep -v '^#' variant_effect_output.txt</div>
<div style><br></div><div style><div>groupXIX_2822477_C/T    groupXIX:2822477        T       ENSGACG00000003129      ENSGACT00000004109      Transcript      missense_variant  67       49      17      A/T     Gcg/Acg -</div>
<div>groupXIX_2822500_T/C    groupXIX:2822500        C       ENSGACG00000003129      ENSGACT00000004109      Transcript      missense_variant  44       26      9       D/G     gAc/gGc -</div><div>groupXIX_2822523_C/T    groupXIX:2822523        T       ENSGACG00000003129      ENSGACT00000004109      Transcript      initiator_codon_variant    21      3       1       M/I     atG/atA -</div>
<div>groupXIX_2822541_T/A    groupXIX:2822541        A       ENSGACG00000003129      ENSGACT00000004109      Transcript      5_prime_UTR_variant       3       -       -       -       -       -</div></div><div style>
<br></div><div style>I get the same result if I use the version 67 files, as it appears you have.</div><div style><br></div><div style>So I suspect there is something different about your FASTA file. You could check that the groupXIX sequence in your file matches the one in the file I link to above (compare an md5sum of the two sequences, or some such thing).</div>
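<div style><br></div><div style>As a rough sketch of that check (an illustration only, not part of the VEP): the Python below hashes one named sequence from a FASTA file, ignoring line wrapping and case, so two files can be compared even if they are formatted differently. The function name sequence_md5 and the file name my_genome.fa in the note underneath are hypothetical.</div>
<pre>
```python
import gzip
import hashlib


def sequence_md5(fasta_path, seq_name):
    """Return the MD5 hex digest of one sequence in a FASTA file.

    Line breaks are ignored and bases are upper-cased, so the digest
    depends only on the sequence itself. Returns None if seq_name is
    not found. Paths ending in .gz are read through gzip.
    """
    opener = gzip.open if fasta_path.endswith(".gz") else open
    digest = hashlib.md5()
    found = False
    in_target = False
    with opener(fasta_path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                # The sequence name is the first word after ">".
                fields = line[1:].split()
                in_target = bool(fields) and fields[0] == seq_name
                found = found or in_target
            elif in_target:
                digest.update(line.strip().upper().encode("ascii"))
    return digest.hexdigest() if found else None
```
</pre>
<div style>If sequence_md5("my_genome.fa", "groupXIX") differs from the value for Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa.gz, the two files do not hold the same groupXIX sequence. Note that upper-casing hides any difference that is only soft-masking.</div>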
<div style><br></div><div style>It is also possible that an issue with older versions of BioPerl is to blame - there was a known bug in the way BioPerl indexes large FASTA files. Normally for Ensembl we recommend using BioPerl 1.2.3 (which contains the bug), but VEP works fine with the latest version. I'd try updating your BioPerl install to the latest version, removing the *.fa.index file that is generated next to your .fa file, and then re-running <a href="http://gtf2vep.pl">gtf2vep.pl</a>.</div>
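<div style><br></div><div style>A minimal sketch of the index clean-up step, assuming the index is named by appending ".index" to the FASTA path (as in the *.fa.index pattern above); the function name remove_fasta_index is hypothetical. It only deletes the index, so the FASTA file itself is left alone and the index should be rebuilt on the next run.</div>
<pre>
```python
import os


def remove_fasta_index(fasta_path):
    """Delete the index file next to a FASTA file, if there is one.

    Assumes the index path is the FASTA path plus ".index",
    e.g. genome.fa gives genome.fa.index.
    Returns True if an index file was removed, False otherwise.
    """
    index_path = fasta_path + ".index"
    if os.path.isfile(index_path):
        os.remove(index_path)
        return True
    return False
```
</pre>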
<div style><br></div><div style>Beyond this it's hard to say what's happening without seeing the contents of your GTF and FASTA files. If the problem persists, perhaps you could pull out just the GTF lines for ENSGACT00000004109 and the sequence for groupXIX. If that subset still gives you the same problem, send those two files to me so I can debug.</div>
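<div style><br></div><div style>One way to pull out that subset, as a sketch only: the two hypothetical helpers below copy the GTF lines for one transcript and one FASTA record into new files. They assume uncompressed input and the usual GTF attribute form transcript_id "ID"; the function names and output paths are made up for illustration.</div>
<pre>
```python
def extract_gtf_lines(gtf_path, transcript_id, out_path):
    """Copy GTF lines whose attributes name transcript_id; return the count."""
    needle = 'transcript_id "{}"'.format(transcript_id)
    kept = 0
    with open(gtf_path) as src, open(out_path, "w") as dst:
        for line in src:
            if needle in line:
                dst.write(line)
                kept += 1
    return kept


def extract_fasta_record(fasta_path, seq_name, out_path):
    """Copy one FASTA record (header and sequence); return True if found."""
    found = False
    copying = False
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # The sequence name is the first word after ">".
                fields = line[1:].split()
                copying = bool(fields) and fields[0] == seq_name
                found = found or copying
            if copying:
                dst.write(line)
    return found
```
</pre>
<div style>For this case the calls would use "ENSGACT00000004109" and "groupXIX"; the two small output files can then be passed to gtf2vep.pl in place of the full ones.</div>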
<div style><br></div><div style>Hope this helps!</div><div style><br></div><div style>Will</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 19 March 2013 12:07, Heidi Viitaniemi <span dir="ltr">&lt;<a href="mailto:hmviit@utu.fi" target="_blank">hmviit@utu.fi</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    Hi Will,<br>
    <br>
    And thank you for your response. I'll wait for the solution. I like
    the idea that you can incorporate your own data to run VEP.<br>
    <br>
    Thanks,<br>
    Heidi Viitaniemi<br>
    <br>
    <br>
    <br>
    <div>On 19.3.2013 13:42, Will McLaren
      wrote:<br>
    </div><div><div class="h5">
    <blockquote type="cite">
      
      <div dir="ltr">Hello Heidi,
        <div><br>
        </div>
        <div>Thanks for finding this - the causes of this bug
          are, I believe, somewhat complex, so it may take a while to get to
          the bottom of it.</div>
        <div><br>
        </div>
        <div>Just wanted to let you know that your mail is not
          being ignored!</div>
        <div><br>
        </div>
        <div>Regards</div>
        <div><br>
        </div>
        <div>Will McLaren</div>
        <div>Ensembl Variation</div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On 18 March 2013 13:48, Heidi
          Viitaniemi <span dir="ltr">&lt;<a href="mailto:hmviit@utu.fi" target="_blank">hmviit@utu.fi</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
              <br>
              I'm running version 2.7 on a Unix server. I want to create
              a custom cache from my own GTF and FASTA files with <a href="http://gtf2vep.pl" target="_blank">gtf2vep.pl</a>. This works without
              problems, and running VEP also seems to go fine. The problem
              is that in the output the cDNA_position,
              CDS_position and Protein_position are correct given my
              input GTF file, but the calls for Amino_acids and Codons
              seem completely random. If I run against the cache
              retrieved from Ensembl, these are all correct. The version
              of the genome has no effect on the output, and the
              GTFs haven't changed. The GTF and the FASTA that I'm
              using for the custom cache originate from the Ensembl reference,
              so I don't see any reason why the custom cache shouldn't
              perform the same way as the Ensembl cache.
              Could there be a bug that somehow messes up the link between
              the custom GTF and FASTA in my run? Below are the commands
              I ran and a snippet of the outputs I got.<br>
              <br>
              Thanks,<br>
              Heidi Viitaniemi<br>
              <br>
              For custom cache I'm running (wrong output for Amino_acids
              and Codons)<br>
              perl <a href="http://gtf2vep.pl" target="_blank">gtf2vep.pl</a> -i
              GasAcu1.67_group_xixflip.gtf -f
              gasAcu_group_withoutbac_inv7.fa -d 67 -s
              Gasterosteus_aculeatus_XIXflipped_18032013<br>
              perl <a href="http://variant_effect_predictor.pl" target="_blank">variant_effect_predictor.pl</a> -offline
              1 -dir $HOME/.vep -i ens_realigned_AK_F.var.vcf -format
              vcf -fork 4 -db_version 67 -species
              Gasterosteus_aculeatus_XIXflipped_18032013 -numbers
              -per_gene -buffer_size 10000 -o
              VEP_18032013_exon_pergene_AK_F.var.vcf.txt<br>
              <br>
              <table style="border-collapse:collapse" height="89" width="1587" border="0" cellpadding="0" cellspacing="0">
                <colgroup><col style="width:125pt" width="167"> <col style="width:119pt" width="158"> <col style="width:48pt" width="64"> <col style="width:136pt" width="181"> <col style="width:118pt" width="157"> <col style="width:68pt" width="90"> <col style="width:136pt" width="181"> <col style="width:85pt" span="2" width="113"> <col style="width:48pt" span="6" width="64"> <col style="width:136pt" width="181"> </colgroup><tbody>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt;width:125pt" height="20" width="167">groupXIX_2822477_C/T</td>
                    <td style="width:119pt" width="158">groupXIX:2822477</td>
                    <td style="width:48pt" width="64">T</td>
                    <td style="width:136pt" width="181">ENSGACG00000003129</td>
                    <td style="width:118pt" width="157">ENSGACT00000004109</td>
                    <td style="width:68pt" width="90">Transcript</td>
                    <td style="width:136pt" width="181">missense_variant</td>
                    <td style="width:48pt" width="64" align="right">67</td>
                    <td style="width:48pt" width="64" align="right">49</td>
                    <td style="width:48pt" width="64" align="right">17</td>
                    <td style="width:48pt" width="64">G/R</td>
                    <td style="width:48pt" width="64">Gga/Aga</td>
                    <td style="width:48pt" width="64">-</td>
                    <td style="width:136pt" width="181">EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822500_T/C</td>
                    <td>groupXIX:2822500</td>
                    <td>C</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>missense_variant</td>
                    <td align="right">44</td>
                    <td align="right">26</td>
                    <td align="right">9</td>
                    <td>Y/C</td>
                    <td>tAt/tGt</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822523_C/T</td>
                    <td>groupXIX:2822523</td>
                    <td>T</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>synonymous_variant</td>
                    <td align="right">21</td>
                    <td align="right">3</td>
                    <td align="right">1</td>
                    <td>R</td>
                    <td>cgG/cgA</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822541_T/A</td>
                    <td>groupXIX:2822541</td>
                    <td>A</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>5_prime_UTR_variant</td>
                    <td align="right">3</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                </tbody>
              </table>
              <br>
              <br>
              For ensembl cache I'm running (correct output for
              Amino_acids and Codons)<br>
              perl <a href="http://variant_effect_predictor.pl" target="_blank">variant_effect_predictor.pl</a> -offline
              -dir $HOME/.vep -i ens_realigned_AK_F.var.vcf -format vcf
              -fork 4 -db_version 69 -species gasterosteus_aculeatus
              -numbers -per_gene -buffer_size 10000 -o
              ensVEP_18032013_exon_pergene_AK_F.var.vcf.txt<br>
              <br>
              <table style="border-collapse:collapse;width:1101pt" width="1468" border="0" cellpadding="0" cellspacing="0">
                <colgroup><col style="width:122pt" width="163"> <col style="width:92pt" width="122"> <col style="width:48pt" width="64"> <col style="width:109pt" width="145"> <col style="width:107pt" width="143"> <col style="width:117pt" width="156"> <col style="width:170pt" width="227"> <col style="width:48pt" span="7" width="64"> </colgroup><tbody>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt;width:122pt" height="20" width="163">groupXIX_2822477_C/T</td>
                    <td style="width:92pt" width="122">groupXIX:2822477</td>
                    <td style="width:48pt" width="64">T</td>
                    <td style="width:109pt" width="145">ENSGACG00000003129</td>
                    <td style="width:107pt" width="143">ENSGACT00000004109</td>
                    <td style="width:117pt" width="156">Transcript</td>
                    <td style="width:170pt" width="227">missense_variant</td>
                    <td style="width:48pt" width="64" align="right">67</td>
                    <td style="width:48pt" width="64" align="right">49</td>
                    <td style="width:48pt" width="64" align="right">17</td>
                    <td style="width:48pt" width="64">A/T</td>
                    <td style="width:48pt" width="64">Gcg/Acg</td>
                    <td style="width:48pt" width="64">-</td>
                    <td style="width:48pt" width="64">EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822500_T/C</td>
                    <td>groupXIX:2822500</td>
                    <td>C</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>missense_variant</td>
                    <td align="right">44</td>
                    <td align="right">26</td>
                    <td align="right">9</td>
                    <td>D/G</td>
                    <td>gAc/gGc</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822523_C/T</td>
                    <td>groupXIX:2822523</td>
                    <td>T</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>initiator_codon_variant</td>
                    <td align="right">21</td>
                    <td align="right">3</td>
                    <td align="right">1</td>
                    <td>M/I</td>
                    <td>atG/atA</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                  <tr style="min-height:15.0pt" height="20">
                    <td style="min-height:15.0pt" height="20">groupXIX_2822541_T/A</td>
                    <td>groupXIX:2822541</td>
                    <td>A</td>
                    <td>ENSGACG00000003129</td>
                    <td>ENSGACT00000004109</td>
                    <td>Transcript</td>
                    <td>5_prime_UTR_variant</td>
                    <td align="right">3</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>-</td>
                    <td>EXON=1/2</td>
                  </tr>
                </tbody>
              </table>
              <span><font color="#888888"> <br>
                  <br>
                  <pre cols="72">-- 
______________________________________________

Heidi Viitaniemi
PhD student
Division of Genetics and Physiology
Department of Biology
Itäinen Pitkäkatu 4A, 7th floor (Pharmacity)
University of Turku
20520 Turku

FINLAND </pre>
                </font></span></div>
            <br>
            _______________________________________________<br>
            Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
            Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
            Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
  </div></div></div>

<br></blockquote></div><br></div>