<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi, <br>
<br>
I am using the VEP installer script to download and unpack caches to
use with the VEP script.<br>
<br>
I would like to use the human refseq cache, to get NM_ transcript
IDs, as this is what my colleagues would like reported in their
output.<br>
<br>
When prompted which cache to download, if I choose '25 :
homo_sapiens_refseq_vep_73.tar.gz' it is downloaded into a tmp
folder within ~/.vep. However, it appears to fail to unpack, as
the resulting cache folder (homo_sapiens) is empty.<br>
<br>
If I choose '26 : homo_sapiens_vep_73.tar.gz' the unpacked
'homo_sapiens' folder contains all the cache information.<br>
<br>
I therefore downloaded the cache files directly from
<a href="ftp://ftp.ensembl.org/pub/release-73/variation/VEP/">ftp://ftp.ensembl.org/pub/release-73/variation/VEP/</a>
however, when I unpack them, both are named 'homo_sapiens'.
I believe in the past the refseq cache had a different name, e.g.
homo_sapiens_refseq? I am using --dir_cache to get around this.<br>
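<br>
For reference, a minimal sketch of this workaround, assuming the refseq
tarball is unpacked into its own directory so that its 'homo_sapiens'
folder cannot clash with the Ensembl one (the function name unpack_cache
is only illustrative, and the example path is the one I pass to
--dir_cache):<br>
<pre>
```shell
# unpack_cache TARBALL DEST_DIR  (hypothetical helper, not part of VEP)
# Unpacks a VEP cache tarball into its own directory, so the
# homo_sapiens folder inside it does not overwrite another cache.
unpack_cache() {
    tarball="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -C makes tar extract into DEST_DIR instead of the current directory
    tar -xzf "$tarball" -C "$dest"
}

# Example call (paths are assumptions based on my setup):
# unpack_cache homo_sapiens_refseq_vep_73.tar.gz "$HOME/.vep/Refseq"
```
</pre>
The refseq cache then lives in ~/.vep/Refseq/homo_sapiens and the
Ensembl cache stays in ~/.vep/homo_sapiens.<br>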
<br>
Finally, when running the VEP script with the refseq cache and using
the --symbol flag I was getting the error:<br>
<br>
Can't call method "display_xref" on an undefined value at
/home/chris/VEP/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm
line 1997.<br>
<br>
And the process hangs.<br>
<br>
If I run with the --refseq flag I no longer get the error, but the
HGNC gene symbol requested with --symbol is not populated in the output.<br>
<br>
I don't get any errors if I use the Ensembl VEP cache...<br>
<br>
Here are the three commands I am running:<br>
<br>
1. Using the refseq cache without the --refseq flag (throws the
"/VEP/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm
line 1997" error):<br>
<br>
perl $VEP/variant_effect_predictor.pl \<br>
-fork 4 \<br>
--buffer_size 10000 \<br>
--cache \<br>
--dir_cache /home/chris/.vep/Refseq \<br>
--dir_plugins /home/chris/.vep/Plugins \<br>
--fasta
/home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa
\<br>
--input_file $inputVCF \<br>
--output_file $outputVCF \<br>
--sift b \<br>
--polyphen b \<br>
--allele_number \<br>
--numbers \<br>
--domains \<br>
--HGVS \<br>
--protein \<br>
--symbol \<br>
--ccds \<br>
--canonical \<br>
--biotype \<br>
--check_alleles \<br>
--gmaf \<br>
--maf_1kg \<br>
--maf_esp \<br>
--pubmed \<br>
--vcf \<br>
--force_overwrite \<br>
--plugin FATHMM,"python
~/Reference_sequences/Variants/FATHMM/fathmm.py"<br>
<br>
<br>
2. As above but with the --refseq flag: runs without an error, but the
HGNC symbol (--symbol) is not populated:<br>
<br>
perl $VEP/variant_effect_predictor.pl \<br>
-fork 4 \<br>
--buffer_size 10000 \<br>
--cache \<br>
--dir_cache /home/chris/.vep/Refseq \<br>
--dir_plugins /home/chris/.vep/Plugins \<br>
--fasta
/home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa
\<br>
--input_file $inputVCF \<br>
--output_file $outputVCF \<br>
--sift b \<br>
--polyphen b \<br>
--allele_number \<br>
--numbers \<br>
--domains \<br>
--HGVS \<br>
--protein \<br>
--symbol \<br>
--ccds \<br>
--canonical \<br>
--biotype \<br>
--check_alleles \<br>
--gmaf \<br>
--maf_1kg \<br>
--maf_esp \<br>
--pubmed \<br>
--vcf \<br>
--refseq \<br>
--force_overwrite \<br>
--plugin FATHMM,"python
~/Reference_sequences/Variants/FATHMM/fathmm.py"<br>
<br>
3. Using the Ensembl cache: works, but no refseq transcript IDs!<br>
<br>
perl $VEP/variant_effect_predictor.pl \<br>
-fork 4 \<br>
--buffer_size 10000 \<br>
--cache \<br>
--dir_cache /home/chris/.vep/ \<br>
--dir_plugins /home/chris/.vep/Plugins \<br>
--fasta
/home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa
\<br>
--input_file $inputVCF \<br>
--output_file $outputVCF \<br>
--sift b \<br>
--polyphen b \<br>
--allele_number \<br>
--numbers \<br>
--domains \<br>
--HGVS \<br>
--protein \<br>
--symbol \<br>
--ccds \<br>
--canonical \<br>
--biotype \<br>
--check_alleles \<br>
--gmaf \<br>
--maf_1kg \<br>
--maf_esp \<br>
--pubmed \<br>
--vcf \<br>
--refseq \<br>
--force_overwrite \<br>
--plugin FATHMM,"python
~/Reference_sequences/Variants/FATHMM/fathmm.py"<br>
<br>
Any help with the above would be much appreciated!<br>
<br>
Thanks<br>
<br>
Chris<br>
<br>
<br>
<br>
<div class="moz-signature">-- <br>
<p><b>Chris Boustred</b><br>
Laboratory Bioinformatician<br>
Regional Molecular Genetics<br>
Great Ormond Street for Children NHS Foundation Trust<br>
Level 6, York House<br>
37 Queen Square<br>
London<br>
WC1N 3BH<br>
<a href="mailto:christopher.boustred@gosh.nhs.uk">christopher.boustred@gosh.nhs.uk</a><br>
<a href="mailto:cboustred@gmail.com">cboustred@gmail.com</a><br>
Phone: 020 7762 6874<br>
Fax: 020 7813 8196<br>
</p>
</div>
</body>
</html>