<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi all<div class=""><br class=""></div><div class="">I’ve hit an issue where some invalid VarScan2 VCF files crash VEP fatally. A VCF that triggers this is:</div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">##fileformat=VCFv4.1</div><div class="">##source=VarScan2</div><div class="">##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth of quality bases"></div><div class="">##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Indicates if record is a somatic mutation"></div><div class="">##INFO=<ID=SS,Number=1,Type=String,Description="Somatic status of variant (0=Reference,1=Germline,2=Somatic,3=LOH, or 5=Unknown)"></div><div class="">##INFO=<ID=SSC,Number=1,Type=String,Description="Somatic score in Phred scale (0-255) derived from somatic p-value"></div><div class="">##INFO=<ID=GPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor+normal versus no variant for Germline calls"></div><div class="">##INFO=<ID=SPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor versus normal for Somatic/LOH calls"></div><div class="">##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand"></div><div class="">##FILTER=<ID=indelError,Description="Likely artifact due to indel reads at this position"></div><div class="">##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"></div><div class="">##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality"></div><div class="">##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth"></div><div class="">##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)"></div><div 
class="">##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)"></div><div class="">##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency"></div><div class="">##FORMAT=<ID=DP4,Number=1,Type=String,Description="Strand read counts: ref/fwd, ref/rev, var/fwd, var/rev"></div><div class="">#CHROM  POS     ID<span class="Apple-tab-span" style="white-space:pre">     </span>REF     ALT     QUAL    FILTER  INFO    FORMAT  NORMAL  498_tissue</div><div class="">chr2    242814072<span class="Apple-tab-span" style="white-space:pre">      </span>.<span class="Apple-tab-span" style="white-space:pre">   </span>TG<span class="Apple-tab-span" style="white-space:pre">  </span>T<span class="Apple-tab-span" style="white-space:pre">   </span>.<span class="Apple-tab-span" style="white-space:pre">   </span>PASS    .<span class="Apple-tab-span" style="white-space:pre"> </span>GT:GQ:DP:RD:AD:FREQ:DP4 0/0:.:34:34:0:0%:18,16,0,0<span class="Apple-tab-span" style="white-space:pre">  </span>0/1:.:77:73:2:2.67%:35,38,1,1</div><div class="">chr3    239555  .<span class="Apple-tab-span" style="white-space:pre">     </span>C<span class="Apple-tab-span" style="white-space:pre">   </span>CT/-T   .<span class="Apple-tab-span" style="white-space:pre">      </span>PASS    .<span class="Apple-tab-span" style="white-space:pre"> </span>GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:77:29:19:39.58%:10,19,4,15        0/1:.:72:43:15:25.86%:19,24,4,11</div><div class=""><br class=""></div></blockquote><br class=""></div><div class="">it’s the last like that does this. 
If the chr2 entry is missing, the file isn’t even detected as a VCF.</div><div class=""><br class=""></div><div class="">The error is:</div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">MSG: start arg must be less than or equal to end arg + 1</div><div class="">STACK Bio::EnsEMBL::TranscriptMapper::genomic2cds /mnt/work1/software/vep/83/Bio/EnsEMBL/TranscriptMapper.pm:397</div><div class="">STACK Bio::EnsEMBL::Variation::BaseTranscriptVariation::cds_coords /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/BaseTranscriptVariation.pm:325</div><div class="">STACK Bio::EnsEMBL::Variation::BaseVariationFeatureOverlapAllele::_pre_consequence_predicates /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/BaseVariationFeatureOverlapAllele.pm:393</div><div class="">STACK Bio::EnsEMBL::Variation::BaseVariationFeatureOverlapAllele::get_all_OverlapConsequences /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/BaseVariationFeatureOverlapAllele.pm:237</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::tva_to_line /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:2568</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::vfoa_to_line /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:2504</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_to_consequences /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:2191</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::rejoin_variants /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:1777</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_list_to_cons /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:1485</div><div class="">STACK Bio::EnsEMBL::Variation::Utils::VEP::get_all_consequences /mnt/work1/software/vep/83/Bio/EnsEMBL/Variation/Utils/VEP.pm:1205</div><div class="">STACK main::main /mnt/work1/software/vep/83/variant_effect_predictor.pl:321</div><div class="">STACK toplevel 
/mnt/work1/software/vep/83/variant_effect_predictor.pl:148</div><div class="">Date (localtime)    = Tue May 24 13:59:39 2016</div><div class="">Ensembl API version = 83</div><div class="">---------------------------------------------------</div><div class="">ERROR: Forked process(es) died</div></blockquote><br class=""></div><div class="">We’re still trying to figure out the VarScan issue, but a single bad record shouldn’t really take out an entire VEP run. Even the issue where this line breaks detection of the input as VCF is, I’d say, less than ideal, as the file contains about 7000 other valid records. </div><div class=""><br class=""></div><div class="">All the best</div><div class="">Stuart</div><div class=""><br class=""><div class="">
<div class=""><div class="">—</div><div class=""><font color="#4f0500" class=""><b class="">Stuart Watt, PhD</b></font></div><div class=""><font color="#4f0500" class="">Scientific Research Associate, Princess Margaret Cancer Centre</font></div><div class=""><font color="#4f0500" class="">MaRS Centre, 101 College Street<br class="">Toronto Medical Discovery Tower, Room 9-302<br class="">Toronto, Ontario, Canada M5G 1L7</font></div><div class=""><font color="#4f0500" class=""><a href="mailto:stuart.watt@uhnresearch.ca" class="">stuart.watt@uhnresearch.ca</a><br class="">416-634-8816</font></div></div>
</div>
<br class=""></div></body></html>