<div dir="ltr">Hi Cyriac,<div><br></div><div>Thanks for the report.</div><div><br></div><div>For now, a quick workaround is to specify the input format explicitly with "--format vcf". We'll get the format detection fixed in a future version.</div><div><br></div><div>Regards</div><div><br></div><div>Will McLaren</div><div>Ensembl Variation</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 5 October 2016 at 19:17, Cyriac Kandoth <span dir="ltr">&lt;<a href="mailto:kandothc@mskcc.org" target="_blank">kandothc@mskcc.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div dir="ltr" class="m_1780065145508307953gmail_msg">Hi Dev,<br class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg">This exception shouldn't ever happen, because the VCF spec requires that the ID field always be non-empty. Specifically: "If there is no identifier available, then the missing value '.' should be used".<br class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg">However, we all deal with crappy VCFs, so it's worthwhile to support them in code, or at least fail gracefully. 
Attached is a sample VCF, and here is the command I used to reproduce the bug:<br class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">/home/kandoth/perl/perl-5.22.<wbr>2/bin/perl /home/kandoth/vep/<a href="http://variant_effect_predictor.pl" class="m_1780065145508307953gmail_msg" target="_blank">variant_<wbr>effect_predictor.pl</a> --species homo_sapiens --assembly GRCh37 --offline --no_progress --no_stats --sift b --ccds --uniprot --hgvs --symbol --numbers --domains --gene_phenotype --canonical --protein --biotype --uniprot --polyphen b --gmaf --maf_1kg --maf_esp --regulatory --tsl --pubmed --variant_class --shift_hgvs 1 --check_existing --total_length --allele_number --no_escape --xref_refseq --failed 1 --vcf --minimal --flag_pick_allele --pick_order canonical,tsl,biotype,rank,<wbr>ccds,length --dir /home/kandoth/.vep --fasta /home/kandoth/.vep/homo_<wbr>sapiens/85_GRCh37/Homo_<wbr>sapiens.GRCh37.75.dna.primary_<wbr>assembly.fa.gz --input_file test.vcf --output_file test.vep.vcf</font><br class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg">This is the terminal output I get:<div class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg"></div><div class="m_1780065145508307953gmail_msg"><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - Read existing cache info</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - Starting...</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - <b>Detected format of input file as pileup</b></font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">WARNING: Length of reference allele (A 
length 1) does not match co-ordinates 112175600-112175599 on line 3</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - Read 2 variants into buffer</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - Checking for existing variations</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:37 - Reading transcript data from cache and/or database</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Retrieved 653 transcripts (0 mem, 653 cached, 0 DB, 0 duplicates)</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Reading regulatory data from cache and/or database</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Retrieved 1940 regulatory features (0 mem, 1940 cached, 0 DB, 0 duplicates)</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Analyzing chromosome 12</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Analyzing variants</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Analyzing RegulatoryFeatures</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:38 - Analyzing MotifFeatures</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 
18:01:38 - Calculating consequences</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:44 - 
Analyzing chromosome 17</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:44 - Analyzing variants</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:44 - Analyzing RegulatoryFeatures</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:44 - Analyzing MotifFeatures</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:44 - Calculating consequences</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at 
/home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in string eq at /home/kandoth/vep/Bio/EnsEMBL/<wbr>Variation/<wbr>TranscriptVariationAllele.pm line 1269.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">Use of uninitialized value in uc at /home/kandoth/vep/Bio/<wbr>SeqUtils.pm line 290.</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:52 - Processed 2 total variants (0 vars/sec, 0 vars/sec total)</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:52 - See test.vep.vcf_warnings.txt for details of 1 warnings</font></div><div class="m_1780065145508307953gmail_msg"><font face="monospace" class="m_1780065145508307953gmail_msg">2016-10-05 18:01:52 - Finished!</font></div><div 
class="m_1780065145508307953gmail_msg"><br class="m_1780065145508307953gmail_msg"></div><div class="m_1780065145508307953gmail_msg">The output has errors too. Basically, it thinks that the rsIDs are the reference alleles.</div><span class="HOEnZb"><font color="#888888"><div class="m_1780065145508307953gmail_msg"><br></div></font></span></div></div><span class="HOEnZb"><font color="#888888"><div dir="ltr" class="m_1780065145508307953gmail_msg"><div class="m_1780065145508307953gmail_msg">~Cyriac</div></div></font></span></div></div>
<br></blockquote></div><br></div>
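<div>Until the format detection is fixed, one option is to clean the input before running VEP. The following is only a minimal sketch, assuming the problem file has empty ID fields as the report suggests; the function name <code>fill_missing_ids</code> is hypothetical and not part of VEP. It replaces an empty ID column (the third tab-separated field) with the missing value '.' that the VCF spec asks for.</div>

```python
def fill_missing_ids(lines):
    """Yield VCF lines with an empty ID column (3rd field) replaced by '.'.

    Hypothetical helper for illustration; not part of VEP.
    """
    for line in lines:
        # Meta/header lines start with '#'; pass them and blank lines through.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        # Remember whether the line had a terminator so we can restore it.
        newline = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\r\n").split("\t")
        # Fixed VCF columns are CHROM, POS, ID, REF, ALT, ...; ID is index 2.
        if len(fields) > 2 and fields[2].strip() == "":
            fields[2] = "."
        yield "\t".join(fields) + newline
```

<div>The cleaned lines can be written to a new file and passed to VEP together with "--format vcf", so that the run does not depend on format detection at all.</div>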