<div dir="ltr">Hi Sabrina,<div><br>There are a few issues with your GTF; once you correct them, it should work.</div><div><br></div><div>1) IDs should not be shared by transcripts and genes. In your example, I fixed this by prefixing the gene ID with "g_" and the transcript ID with "t_".</div><div><br></div><div>2) Transcript entries need a valid biotype; typically this will be "protein_coding" (see <a href="http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gff">http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gff</a>).</div><div><br></div><div>3) The phase field (column 8) must be set correctly for CDS entries: 0, 1 or 2, rather than the "." used in your example.</div><div><br></div><div>These points also apply if you use a GFF format file.</div><div><br></div><div>Hope that helps.</div><div><br></div><div>Will McLaren</div><div>Ensembl Variation</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 22 May 2017 at 12:36, Sabrina Legoueix-Rodriguez <span dir="ltr">&lt;<a href="mailto:sabrina.legoueix@inra.fr" target="_blank">sabrina.legoueix@inra.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
Dear all,<br>
<br>
    I have installed your recent VEP API locally on my machine so that I
    can annotate SNPs on a home-made genome.<br>
<br>
I used the instructions on these pages:<br>
<a class="m_5616784093623879695moz-txt-link-freetext" href="http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#offline" target="_blank">http://www.ensembl.org/info/<wbr>docs/tools/vep/script/vep_<wbr>cache.html#offline</a><br>
<a class="m_5616784093623879695moz-txt-link-freetext" href="http://www.ensembl.org/info/docs/tools/vep/script/index.html" target="_blank">http://www.ensembl.org/info/<wbr>docs/tools/vep/script/index.<wbr>html</a><br>
<br>
My inputs are:<br>
    -> a home-made reference genome in a FASTA file<br>
    -> a .VCF file listing the SNPs on that genome<br>
    -> a .GTF file with the genome annotations<br>
<br>
    My goal is to use VEP to generate a .vep file with functional
    annotations of my SNPs.<br>
<br>
For instance:<br>
<br>
my gtf is:<br>
    tig00000004_pilon_pilon Pacbio gene 231183 234374 . + . gene_id "A"; <br>
    tig00000004_pilon_pilon Pacbio transcript 231183 234374 . + . gene_id "A";transcript_id "A"; <br>
    tig00000004_pilon_pilon Pacbio CDS 231183 234374 . + . gene_id "A";transcript_id "A"; <br>
    tig00000004_pilon_pilon Pacbio exon 231183 234374 . + . gene_id "A";transcript_id "A"; <br>
<br>
( I also tried with a .gff)<br>
<br>
my vcf is:<br>
##...<br>
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A_ATTACTCG <br>
    tig00000004_pilon_pilon 232205 . G A 9881.15 . AC=8;AF=0.800;AN=10;DP=245;FS=0.000;MLEAC=8;MLEAF=0.800;MQ=60.05;QD=25.82;SOR=0.983 GT:AD:DP:GQ:PL 0:9,0:9:99:0,247 <br>
<br>
=> this snp should be found in the gene "A"<br>
<br>
To prepare the gtf (or also .gff), I used:<br>
    grep -v "^#" test.gtf | sort -k1,1 -k4,4n -k5,5n | bgzip -c > test.gtf.gz<br>
tabix -p gtf test.gtf.gz<br>
<br>
my command line is:<br>
    ./vep -i test.vcf -gtf test.gtf.gz -fasta ref.fasta --force_overwrite<br>
    or <br>
    ./vep -i test.vcf -gff test.gff.gz -fasta ref.fasta --force_overwrite<br>
<br>
The result file is:<br>
    #Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra<br>
    . tig00000004_pilon_pilon:232205 A - - - <b>intergenic_variant</b> - - - - - - IMPACT=MODIFIER<br>
variant_effect_output.txt (END)<br>
<br>
<br>
    It does not work: it retrieves only intergenic variants, which is
    wrong, as some of my SNPs are in genes.<br>
<br>
    When I try the tool on data that I processed with <a href="http://gtf2vep.pl" target="_blank">gtf2vep.pl</a>
    a few years ago, it does not work either.<br>
<br>
Could you please help me and tell me if I am doing something wrong?<br>
<br>
Thank you in advance.<br>
<br>
Best regards,<br>
<br>
Sabrina
<div class="m_5616784093623879695moz-signature">-- <br>
<br>
Sabrina
<br>
<br>
<table style="max-width:800px">
<tbody>
<tr>
<td>
<table>
<tbody>
<tr>
<td><img src="cid:part3.09090709.05050700@inra.fr"></td>
<td style="padding-left:15px">
<p style="font-family:Helvetica,arial,sans-serif;color:#9373b1;font-size:13px;border-bottom:1px solid #9373b1;padding-bottom:5px;margin-bottom:0px"><strong><b>Sabrina
LEGOUEIX RODRIGUEZ</b></strong><br>
                      Head of the Bioinformatics Platform<br>
</p>
<p style="font-family:Helvetica,arial,sans-serif;color:#595959;font-size:12px;margin-top:7px">Tél. : <a href="tel:+33%205%2061%2028%2057%2092" value="+33561285792" target="_blank">+33 (0) 5 61 28 57 92</a><br>
<a href="mailto:[MAIL]" style="color:#9373b1;text-decoration:none" target="_blank">sabrina.legoueix@toulouse.<wbr>inra.fr</a><br>
<a href="http://www.toulouse-white-biotechnology.com/" style="color:#9373b1;text-decoration:none" target="_blank"></a><a class="m_5616784093623879695moz-txt-link-abbreviated" href="http://www.toulouse-white-biotechnology.com" target="_blank">www.toulouse-white-<wbr>biotechnology.com</a><br>
</p>
</td>
</tr>
</tbody>
</table>
<table style="border-top:1px solid #9373b1;border-bottom:1px solid #9373b1" width="100%">
<tbody>
<tr>
<td align="left"><font face="Trebuchet MS, Arial,
Helvetica, sans-serif" size="2" color="#9373b1">TWB
- Parc technologique du canal • Bâtiment NAPA
CENTER B • 3, rue Ariane • 31520 Ramonville
Saint-Agne </font></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<br>
</div>
</div>
<br>______________________________<wbr>_________________<br>
Dev mailing list <a href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" rel="noreferrer" target="_blank">http://lists.ensembl.org/<wbr>mailman/listinfo/dev</a><br>
Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
<br></blockquote></div><br></div>
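<div>The three points in the reply (distinct gene and transcript IDs, a biotype on every transcript, a valid CDS phase) can be checked before running VEP with a short script. What follows is only a minimal sketch in Python, not part of VEP: the function names <code>parse_gtf_line</code> and <code>check_gtf</code> are hypothetical, the input is assumed to be tab-separated, and the attribute keys accepted as a biotype are an assumption that should be compared against the VEP documentation linked in the reply.</div>
<pre>
```python
import re

# Attribute keys treated as a transcript biotype. This list is an assumption
# made for the sketch; check the VEP documentation for the keys it reads.
BIOTYPE_KEYS = ("biotype", "transcript_biotype", "transcript_type")

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_line(line):
    """Split one GTF record into its nine columns plus an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns: " + line)
    return fields, dict(ATTR_RE.findall(fields[8]))


def check_gtf(lines):
    """Return a list of problems matching the three points in the reply."""
    problems = []
    gene_ids, transcript_ids = set(), set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields, attrs = parse_gtf_line(line)
        feature, phase = fields[2], fields[7]
        tid = attrs.get("transcript_id")
        if feature == "gene" and "gene_id" in attrs:
            gene_ids.add(attrs["gene_id"])
        elif feature == "transcript":
            if tid is not None:
                transcript_ids.add(tid)
            # Point 2: every transcript needs a biotype attribute.
            if not any(key in attrs for key in BIOTYPE_KEYS):
                problems.append("transcript %s has no biotype" % tid)
        elif feature == "CDS" and phase not in ("0", "1", "2"):
            # Point 3: the phase column of a CDS must be 0, 1 or 2.
            problems.append("CDS of %s has phase %r" % (tid, phase))
    # Point 1: an ID must not be used by both a gene and a transcript.
    for shared in sorted(gene_ids.intersection(transcript_ids)):
        problems.append("ID %s is used by both a gene and a transcript" % shared)
    return problems


# The four records from the original message, with tabs restored.
PREFIX = "tig00000004_pilon_pilon\tPacbio\t"
COORDS = "\t231183\t234374\t.\t+\t.\t"
EXAMPLE = [
    PREFIX + "gene" + COORDS + 'gene_id "A";',
    PREFIX + "transcript" + COORDS + 'gene_id "A";transcript_id "A";',
    PREFIX + "CDS" + COORDS + 'gene_id "A";transcript_id "A";',
    PREFIX + "exon" + COORDS + 'gene_id "A";transcript_id "A";',
]

for problem in check_gtf(EXAMPLE):
    print(problem)
```
</pre>
<div>Run on the four records from the original message, the sketch reports one problem per point; a file with prefixed IDs, a biotype on the transcript line and a phase of 0, 1 or 2 on each CDS line should produce no output.</div>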