<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi Will,<br>
<br>
And thank you very much for looking into this. <br>
<br>
The reason I want to build my own cache from the gtf and fasta of
Gasterosteus_aculeatus is that on groupXIX the last two (of the
three) supercontigs are flipped in the Ensembl genome
(Ross &amp; Peichel 2008). I also first thought that reverse
complementing the fasta and gtf for groupXIX was the problem, but
ENSGACG00000003129 is located in a region that I didn't touch in
the gtf or the fasta.<br>
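<br>
To illustrate what reverse complementing a supercontig involves, here is a minimal Python sketch. The helper names <code>reverse_complement</code> and <code>flip_interval</code> and the example coordinates are hypothetical and are not taken from the files discussed in this thread; the sketch assumes 1-based inclusive GTF coordinates and a flipped region described by its first and last position.<br>
<pre>
```python
# Sketch only: helper names and example coordinates are invented for
# illustration and do not come from the real gtf or fasta files.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
OPPOSITE_STRAND = {"+": "-", "-": "+"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def flip_interval(start, end, strand, region_start, region_end):
    """Map a 1-based inclusive interval into a reverse-complemented region.

    The interval keeps its length, its position is mirrored inside the
    region, and a "+" or "-" strand is swapped (other values are kept).
    """
    new_start = region_start + region_end - end
    new_end = region_start + region_end - start
    new_strand = OPPOSITE_STRAND.get(strand, strand)
    return new_start, new_end, new_strand


if __name__ == "__main__":
    print(reverse_complement("ATGCCN"))
    print(flip_interval(120, 150, "+", 101, 200))
```
</pre>
Both the sequence and every annotation interval inside the flipped region have to be transformed consistently, otherwise the gtf and the fasta no longer agree with each other.<br>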
<br>
I'll check the version of BioPerl on our server and try using only
the fasta and gtf for ENSGACG00000003129.<br>
<br>
Thanks,<br>
Heidi<br>
<br>
<br>
<div class="moz-cite-prefix">26.3.2013 16:04, Will McLaren
wrote:<br>
</div>
<blockquote
cite="mid:CAMVEDX01YHiQeoJmKLuNEy9MaLg-y_FJeKxTQaiwQxxWBsffPA@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div dir="ltr">Hi Heidi,
<div><br>
</div>
<div>Thanks for your patience, I've had a chance to look at this
now.</div>
<div><br>
</div>
<div style="">If I build a cache file from the following files:</div>
<div style=""><br>
</div>
<div style="">
<a moz-do-not-send="true"
href="ftp://ftp.ensembl.org/pub/release-70/gtf/gasterosteus_aculeatus/Gasterosteus_aculeatus.BROADS1.70.gtf.gz">ftp://ftp.ensembl.org/pub/release-70/gtf/gasterosteus_aculeatus/Gasterosteus_aculeatus.BROADS1.70.gtf.gz</a><br>
</div>
<div style=""><br>
</div>
<div style="">and</div>
<div style=""><br>
</div>
<div style=""><a moz-do-not-send="true"
href="ftp://ftp.ensembl.org/pub/release-70/fasta/gasterosteus_aculeatus/dna/Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa.gz">ftp://ftp.ensembl.org/pub/release-70/fasta/gasterosteus_aculeatus/dna/Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa.gz</a><br>
</div>
<div style=""><br>
</div>
<div style="">I get (I think!) the correct output from the VEP:</div>
<div style=""><br>
</div>
<div style="">perl gtf2vep.pl -i
Gasterosteus_aculeatus.BROADS1.70.gtf.gz -fasta
Gasterosteus_aculeatus.BROADS1.70.dna.toplevel.fa -species
gasterosteus_aculeatus -dir test/ -db 70<br>
</div>
<div style="">perl variant_effect_predictor.pl
-i gastero_in.txt -species gasterosteus_aculeatus -force -off
-dir test/ -db 70<br>
</div>
<div style="">grep -v '#' variant_effect_output.txt</div>
<div style=""><br>
</div>
<div style="">
<div>groupXIX_2822477_C/T groupXIX:2822477 T
ENSGACG00000003129 ENSGACT00000004109 Transcript
missense_variant 67 49 17 A/T
Gcg/Acg -</div>
<div>groupXIX_2822500_T/C groupXIX:2822500 C
ENSGACG00000003129 ENSGACT00000004109 Transcript
missense_variant 44 26 9 D/G
gAc/gGc -</div>
<div>groupXIX_2822523_C/T groupXIX:2822523 T
ENSGACG00000003129 ENSGACT00000004109 Transcript
initiator_codon_variant 21 3 1 M/I
atG/atA -</div>
<div>groupXIX_2822541_T/A groupXIX:2822541 A
ENSGACG00000003129 ENSGACT00000004109 Transcript
5_prime_UTR_variant 3 - - - - -</div>
</div>
<div style="">
<br>
</div>
<div style="">This works the same if I use the version 67 files,
which it appears you have.</div>
<div style=""><br>
</div>
<div style="">So I suspect there is something different about
your FASTA file - you could check that the sequence of the
groupXIX file matches that in the file I link to above (do an
md5sum or some such thing).</div>
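<div style=""><br>
</div>
<div style="">A plain md5sum of the whole file also changes when only the line width or the compression differs, so one option is to hash a single sequence. The following is a minimal Python sketch of that idea, not a tested part of the VEP. The function names <code>open_text</code> and <code>sequence_md5</code> are hypothetical, and ignoring line breaks and upper-casing the bases before hashing are assumptions made here so that formatting and soft-masking do not affect the result.</div>
<pre>
```python
import gzip
import hashlib


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def sequence_md5(fasta_path, seq_name):
    """Return the MD5 hex digest of one named sequence, or None if absent.

    Line breaks are dropped and bases are upper-cased (an assumption of
    this sketch), so line width and soft-masking do not change the digest.
    """
    digest = hashlib.md5()
    found = False
    in_target = False
    with open_text(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if in_target:
                    break  # reached the record after the target
                fields = line[1:].split()
                in_target = bool(fields) and fields[0] == seq_name
                found = found or in_target
            elif in_target:
                digest.update(line.strip().upper().encode("ascii"))
    return digest.hexdigest() if found else None
```
</pre>
<div style="">Calling <code>sequence_md5</code> with the name groupXIX on your own FASTA and on the release 70 toplevel file should return the same digest if the two sequences are identical.</div>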
<div style=""><br>
</div>
<div style="">It is also possible that an issue with older
versions of BioPerl is to blame - there was a known bug in the
way BioPerl indexes large FASTA files. Normally for Ensembl we
recommend using BioPerl 1.2.3 (which contains the bug), but
VEP works fine with the latest version. I'd try updating your
BioPerl install to the latest version, removing the *.fa.index
file that is generated next to your .fa file, and
re-running gtf2vep.pl.</div>
<div style=""><br>
</div>
<div style="">Beyond this it's hard to say what's happening
without seeing the contents of your GTF and FASTA files. If
the problem persists, perhaps you could just pull out the
lines in the GTF for ENSGACT00000004109 and the sequence for
groupXIX and if that still gives you the same problem, send
them to me so I can debug.</div>
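<div style=""><br>
</div>
<div style="">Pulling out that subset could look roughly like the Python sketch below. The function names <code>gtf_lines_for_transcript</code> and <code>fasta_record</code> are hypothetical, and the sketch assumes a tab-separated GTF with nine columns whose attribute column contains <code>transcript_id "..."</code>, as in the Ensembl GTF files linked above.</div>
<pre>
```python
def gtf_lines_for_transcript(gtf_lines, transcript_id):
    """Yield GTF lines whose attribute column names the given transcript."""
    tag = 'transcript_id "%s"' % transcript_id
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment and header lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 9 and tag in columns[8]:
            yield line


def fasta_record(fasta_lines, seq_name):
    """Yield the header and sequence lines of one named FASTA record."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] == seq_name
        if keep:
            yield line
```
</pre>
<div style="">Both functions accept any iterable of lines, for example an open file handle, so the results for ENSGACT00000004109 and groupXIX can be written straight to two small files.</div>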
<div style=""><br>
</div>
<div style="">Hope this helps!</div>
<div style=""><br>
</div>
<div style="">Will</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 19 March 2013 12:07, Heidi
Viitaniemi <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:hmviit@utu.fi" target="_blank">hmviit@utu.fi</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi Will,<br>
<br>
And thank you for your response. I'll wait for the
solution. I like the idea that you can incorporate your
own data to run VEP.<br>
<br>
Thanks,<br>
Heidi Viitaniemi<br>
<br>
<br>
<br>
<div>19.3.2013 13:42, Will McLaren wrote:<br>
</div>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">Hello Heidi,
<div><br>
</div>
<div>Thanks for finding this - the causes of this
bug are, I believe, somewhat complex, so it may take a
while to get to the bottom of it.</div>
<div><br>
</div>
<div>Just wanted to let you know that your mail is
not being ignored!</div>
<div><br>
</div>
<div>Regards</div>
<div><br>
</div>
<div>Will McLaren</div>
<div>Ensembl Variation</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 18 March 2013 13:48,
Heidi Viitaniemi <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:hmviit@utu.fi" target="_blank">hmviit@utu.fi</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
<br>
I'm running VEP version 2.7 on a Unix server. I
want to create a custom cache from my own
gtf and fasta with gtf2vep.pl.
This works without problems, and running
VEP also seems to go fine. The problem is that
in the output the
cDNA_position, CDS_position and
Protein_position are correct given my input
gtf file, but the calls for Amino_acids and
Codons seem completely random. If I run
against the cache retrieved from Ensembl,
these are all correct. The version of the
genome had no effect on the output, and
the gtfs haven't changed. The gtf and the
fasta that I'm using for the custom cache
originate from the Ensembl reference, so I
don't see any reason why the custom cache
shouldn't perform the same way as the
Ensembl cache. Could there be
a bug that somehow messes up the link between
the custom gtf and fasta in my run? Below
are the commands I ran and a snippet of the
outputs I got.<br>
<br>
Thanks,<br>
Heidi Viitaniemi<br>
<br>
For the custom cache I'm running (wrong output
for Amino_acids and Codons):<br>
perl gtf2vep.pl
-i GasAcu1.67_group_xixflip.gtf -f
gasAcu_group_withoutbac_inv7.fa -d 67 -s
Gasterosteus_aculeatus_XIXflipped_18032013<br>
perl variant_effect_predictor.pl
-offline 1 -dir $HOME/.vep -i
ens_realigned_AK_F.var.vcf -format vcf -fork
4 -db_version 67 -species
Gasterosteus_aculeatus_XIXflipped_18032013
-numbers -per_gene -buffer_size 10000 -o
VEP_18032013_exon_pergene_AK_F.var.vcf.txt<br>
<br>
<table style="border-collapse:collapse"
width="1587" border="0" cellpadding="0"
cellspacing="0" height="89">
<colgroup><col style="width:125pt"
width="167"> <col style="width:119pt"
width="158"> <col style="width:48pt"
width="64"> <col style="width:136pt"
width="181"> <col style="width:118pt"
width="157"> <col style="width:68pt"
width="90"> <col style="width:136pt"
width="181"> <col style="width:85pt"
span="2" width="113"> <col
style="width:48pt" span="6" width="64">
<col style="width:136pt" width="181"> </colgroup><tbody>
<tr style="min-height:15.0pt"
height="20">
<td
style="min-height:15.0pt;width:125pt"
width="167" height="20">groupXIX_2822477_C/T</td>
<td style="width:119pt" width="158">groupXIX:2822477</td>
<td style="width:48pt" width="64">T</td>
<td style="width:136pt" width="181">ENSGACG00000003129</td>
<td style="width:118pt" width="157">ENSGACT00000004109</td>
<td style="width:68pt" width="90">Transcript</td>
<td style="width:136pt" width="181">missense_variant</td>
<td style="width:48pt" width="64"
align="right">67</td>
<td style="width:48pt" width="64"
align="right">49</td>
<td style="width:48pt" width="64"
align="right">17</td>
<td style="width:48pt" width="64">G/R</td>
<td style="width:48pt" width="64">Gga/Aga</td>
<td style="width:48pt" width="64">-</td>
<td style="width:136pt" width="181">EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822500_T/C</td>
<td>groupXIX:2822500</td>
<td>C</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>missense_variant</td>
<td align="right">44</td>
<td align="right">26</td>
<td align="right">9</td>
<td>Y/C</td>
<td>tAt/tGt</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822523_C/T</td>
<td>groupXIX:2822523</td>
<td>T</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>synonymous_variant</td>
<td align="right">21</td>
<td align="right">3</td>
<td align="right">1</td>
<td>R</td>
<td>cgG/cgA</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822541_T/A</td>
<td>groupXIX:2822541</td>
<td>A</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>5_prime_UTR_variant</td>
<td align="right">3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
</tbody>
</table>
<br>
<br>
For the Ensembl cache I'm running (correct
output for Amino_acids and Codons):<br>
perl variant_effect_predictor.pl
-offline -dir $HOME/.vep -i
ens_realigned_AK_F.var.vcf -format vcf -fork
4 -db_version 69 -species
gasterosteus_aculeatus -numbers -per_gene
-buffer_size 10000 -o
ensVEP_18032013_exon_pergene_AK_F.var.vcf.txt<br>
<br>
<table
style="border-collapse:collapse;width:1101pt"
width="1468" border="0" cellpadding="0"
cellspacing="0">
<colgroup><col style="width:122pt"
width="163"> <col style="width:92pt"
width="122"> <col style="width:48pt"
width="64"> <col style="width:109pt"
width="145"> <col style="width:107pt"
width="143"> <col style="width:117pt"
width="156"> <col style="width:170pt"
width="227"> <col style="width:48pt"
span="7" width="64"> </colgroup><tbody>
<tr style="min-height:15.0pt"
height="20">
<td
style="min-height:15.0pt;width:122pt"
width="163" height="20">groupXIX_2822477_C/T</td>
<td style="width:92pt" width="122">groupXIX:2822477</td>
<td style="width:48pt" width="64">T</td>
<td style="width:109pt" width="145">ENSGACG00000003129</td>
<td style="width:107pt" width="143">ENSGACT00000004109</td>
<td style="width:117pt" width="156">Transcript</td>
<td style="width:170pt" width="227">missense_variant</td>
<td style="width:48pt" width="64"
align="right">67</td>
<td style="width:48pt" width="64"
align="right">49</td>
<td style="width:48pt" width="64"
align="right">17</td>
<td style="width:48pt" width="64">A/T</td>
<td style="width:48pt" width="64">Gcg/Acg</td>
<td style="width:48pt" width="64">-</td>
<td style="width:48pt" width="64">EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822500_T/C</td>
<td>groupXIX:2822500</td>
<td>C</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>missense_variant</td>
<td align="right">44</td>
<td align="right">26</td>
<td align="right">9</td>
<td>D/G</td>
<td>gAc/gGc</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822523_C/T</td>
<td>groupXIX:2822523</td>
<td>T</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>initiator_codon_variant</td>
<td align="right">21</td>
<td align="right">3</td>
<td align="right">1</td>
<td>M/I</td>
<td>atG/atA</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
<tr style="min-height:15.0pt"
height="20">
<td style="min-height:15.0pt"
height="20">groupXIX_2822541_T/A</td>
<td>groupXIX:2822541</td>
<td>A</td>
<td>ENSGACG00000003129</td>
<td>ENSGACT00000004109</td>
<td>Transcript</td>
<td>5_prime_UTR_variant</td>
<td align="right">3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>EXON=1/2</td>
</tr>
</tbody>
</table>
<span><font color="#888888"> <br>
<br>
<pre cols="72">--
______________________________________________
Heidi Viitaniemi
PhD student
Division of Genetics and Physiology
Department of Biology
Itäinen Pitkäkatu 4A, 7th floor (Pharmacity)
University of Turku
20520 Turku
FINLAND </pre>
</font></span></div>
<br>
_______________________________________________<br>
Dev mailing list <a moz-do-not-send="true"
href="mailto:Dev@ensembl.org"
target="_blank">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe
info: <a moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev"
target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a moz-do-not-send="true"
href="http://www.ensembl.info/"
target="_blank">http://www.ensembl.info/</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
______________________________________________
Heidi Viitaniemi
PhD student
Division of Genetics and Physiology
Department of Biology
Itäinen Pitkäkatu 4A, 7th floor (Pharmacity)
University of Turku
20520 Turku
FINLAND </pre>
</body>
</html>