<div dir="ltr">Hi Roland,<div><br></div><div>You can ignore that warning message. When you specify --everything, it switches on a few options that tell the VEP to expect cache files containing co-located variants. Since you generated your cache yourself, these files don't exist, which is why the code is complaining. You can either continue to ignore the warnings, or replace --everything with the list of flags specified here:<br></div><div><br></div><div><a href="http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything">http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything</a><br></div><div><br></div><div>In fact, in your case only the following will work with a user-generated cache anyway: --variant_class, --biotype, --numbers</div><div><br></div><div>Regarding the lack of protein-changing results, there is every chance that the cache has not been generated correctly from the GTF. I notice you converted a GFF; it's worth checking the converted file, as the requirements on the input GTF are quite strict. See <a href="http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything">http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything</a></div><div><br></div><div>It is on our to-do list to make this script compatible with a wider range of GFF/GTF formatting.</div><div><br></div><div>Regards</div><div><br></div><div>Will</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 5 June 2015 at 13:52, Schmucki, Roland <span dir="ltr">&lt;<a href="mailto:roland.schmucki@roche.com" target="_blank">roland.schmucki@roche.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Dear Will<div><br></div><div>Thank you very much for the quick response.</div><div>I would like to post this issue to the public Ensembl mailing list.</div><div>Here is a brief description of the problem I 
encountered:</div><div><br></div><br>When running VEP with Ensembl annotation files I get errors of the form "Could not find variation cache for Chromosome..."<br><br>I downloaded a genome (i.e. pao1, $name.fa) and annotation ($name.gff3) from the Ensembl FTP site and then created the cache files according to the VEP tutorial:<br><br><br>sort -k1,1 -k4,4n $name.gff | bgzip > $name.gff.gz<br>tabix -p gff $name.gff.gz<br>./cufflinks/gffread $name.gff -T -o $name.gtf<br>perl gtf2vep.pl -i $name.gtf -f $name.fa -d 79 -s $name --dir variant_effect_predictor_version79/cache_files_<br>and moved the cache files to the correct location manually.<br><br>This all seemed to work fine, without any error or warning messages.<br>Then I mapped the reads to the genome, ran Freebayes (variants.vcf with 2700 variants) and at the very end applied VEP with the following command:<br><br><br>perl variant_effect_predictor.pl --everything --offline --custom $name.gff.gz,$name-genes,gff,overlap,0 --format vcf -i variants.vcf -o variants.txt --species $name --dir_cache $VEP_DATA<br><br><br>The variable VEP_DATA points to the corresponding cache directory,<div>which contains the following files (file size and creation date) in $VEP_DATA/pao1/79/Chromosome/:</div><div><div>292135 Jun 5 09:10 3000001-4000000.gz</div><div>294904 Jun 5 09:10 1000001-2000000.gz</div><div>290186 Jun 5 09:10 1-1000000.gz</div><div>290763 Jun 5 09:10 5000001-6000000.gz</div><div>284789 Jun 5 09:10 2000001-3000000.gz</div><div>292462 Jun 5 09:10 4000001-5000000.gz</div><div>78483 Jun 5 09:10 6000001-7000000.gz<br></div><div><br></div></div><div><br></div><div>When I run VEP I get the following errors and warnings (see the attached log file for full details):</div><div><div>WARNING: Could not find variation cache for Chromosome:1-1000000</div><div>WARNING: Could not find variation cache for 
Chromosome:5000001-6000000</div></div><div>etc.</div><div><br></div><div><br></div><div><span style="font-size:12.8000001907349px">I don't understand why I got these errors/warnings.</span><div style="font-size:12.8000001907349px">Thanks a lot for any advice!</div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">Best,</div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">R.</div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">PS: an output file is generated, with variant annotations of the form:</div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px"><div>#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra</div><div>Chromosome_2415_G/T Chromosome:2415 T gene:PA0005 transcript:AAG03395 Transcript downstream_gene_variant - - - - - - IMPACT=MODIFIER;pao1-genes=gene:PA0002,exon_Chromosome:2056-3159,CDS:AAG03392,transc</div></div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">However, no amino acid changes are found, which is unlikely.</div></div><div><br></div>
</div>
<br>_______________________________________________<br>
Dev mailing list <a href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><br>
<br></blockquote></div><br></div>