<div dir="ltr"><div>Hi Likhitha,</div><div><br></div><div>Many thanks for your prompt reply. I thought I was using the Ensembl transcript cache, since I wasn't passing the `--refseq` command-line switch. Nevertheless, I reinstalled the cache and ran VEP again, but I still see the same result in the domains output.</div><div><br></div><div>To install the new cache and FASTA file, I did the following:</div><div><br></div><div>    $ perl INSTALL.pl --AUTO c --CACHEDIR ../../vep_106 --SPECIES "homo_sapiens" --ASSEMBLY GRCh38<br>    WARNING: DBD::mysql module not found. VEP can only run in offline (--offline) mode without DBD::mysql installed<br>    <br>    <a href="http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements">http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements</a><br>     - getting list of available cache files<br>     - downloading <a href="ftp://ftp.ensembl.org/pub/release-106/variation/indexed_vep_cache/homo_sapiens_vep_106_GRCh38.tar.gz">ftp://ftp.ensembl.org/pub/release-106/variation/indexed_vep_cache/homo_sapiens_vep_106_GRCh38.tar.gz</a><br>     - unpacking homo_sapiens_vep_106_GRCh38.tar.gz<br>     - converting cache, this may take some time but will allow VEP to look up variants and frequency data much faster<br>     - use CTRL-C to cancel if you do not wish to convert this cache now (you may run convert_cache.pl later)<br>    2022-06-22 18:11:42 - Processing homo_sapiens<br>    2022-06-22 18:11:42 - Processing version 106_GRCh38<br>    2022-06-22 18:11:42 - No unprocessed types remaining, skipping<br>    2022-06-22 18:11:42 - All done!<br>    <br>    All done<br></div><div>    </div><div>    $ perl INSTALL.pl --AUTO f --CACHEDIR ../../vep_106 --SPECIES "homo_sapiens" --ASSEMBLY GRCh38<br>    WARNING: DBD::mysql module not found. 
VEP can only run in offline (--offline) mode without DBD::mysql installed<br><br>    <a href="http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements">http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements</a><br>     - downloading Homo_sapiens.GRCh38.dna.toplevel.fa.gz<br>    <br>    All done<br></div><div><br></div><div>Then I ran VEP:</div><div><br></div><div>    vep --offline --cache --assembly GRCh38 --dir_cache /opt/bioResources/vep_106 --fasta /opt/bioResources/vep_106/homo_sapiens/106_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz --input_file T790M.vcf --json --output_file T790M.vep.json --force_overwrite --domains<br></div><div><br></div><div>Is this the right way to do it? I'm still not getting protein domain information from any database (including Pfam) other than `ENSP_mappings`.<br></div><div><br></div><div>Many thanks,</div><div>Pedro</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 22 Jun 2022 at 09:12, Likhitha Surapaneni <<a href="mailto:likhithas@ebi.ac.uk">likhithas@ebi.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Pedro,<br>

<br>

I am sorry to hear that you are facing an issue with the VEP command line.<br>

<br>

Could you please confirm whether you were using the RefSeq cache? The RefSeq cache <br>

lacks some classes of data present in the Ensembl transcript cache, one of <br>

them being protein domains <br>

(<a href="https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq" rel="noreferrer" target="_blank">https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq</a>). <br>

Could you please try with the Ensembl transcript cache and see if you are <br>

facing the same issue?<br>

<br>

Hope this helps and please let me know if you have further questions.<br>

<br>

Thanks and regards,<br>

<br>

Likhitha<br>

<br>

On 21/06/2022 18:01, Pedro Almeida wrote:<br>

> Hi all,<br>

><br>

> I've been trying to get information on overlapping protein domains for <br>

> one variant using VEP, but it looks as if the REST API returns more <br>

> domains than the command line tool. Domains here means the output of <br>

> the command line switch `--domains`, which, as far as I can tell, is <br>

> the same as `domains=1` with the `GET vep/:species/id/:id` API request.<br>

><br>

> For example, for this single variant I'm using for testing, EGFR <br>

> T790M, with the GET method above <br>

> `<a href="https://rest.ensembl.org/vep/human/id/rs121434569?domains=1&content-type=application/json" rel="noreferrer" target="_blank">https://rest.ensembl.org/vep/human/id/rs121434569?domains=1&content-type=application/json</a>` <br>

> the `domains` list of the `transcript_consequences` object lists <br>

> several ENSP_mappings and also information from CDD, Pfam, <br>

> PROSITE_profiles, and others. I'm more interested in the Pfam <br>

> information, which in this case corresponds to a protein tyrosine and <br>

> serine/threonine kinase, PF07714.<br>

><br>

> However, when I run this same variant in the command line (using a VCF <br>

> file with this single variant as input), I can only obtain information <br>

> from the ENSP_mappings, but all other databases appear to be missing. <br>

> The command used was the following:<br>

><br>

> ```<br>

> vep --domains --dir_cache /opt/bioResources/vep_106/ --fasta <br>

> /opt/bioResources/vep_106/homo_sapiens_refseq/106_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz <br>

> --input_file T790M.vcf --output_file T790M.vep.json --cache --offline <br>

> --json --force_overwrite<br>

> ```<br>

><br>

> Does anyone know if this is expected, or how to get the same output as <br>

> the REST API (regarding the list of protein domains) when using the <br>

> command line tool? Are custom annotations needed for these cases?<br>

><br>

> Many thanks,<br>

> Pedro<br>

><br>

><br>

> _______________________________________________<br>

> Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>

> Posting guidelines and subscribe/unsubscribe info: <a href="https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org" rel="noreferrer" target="_blank">https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org</a><br>

> Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>

</blockquote></div><br clear="all"><div><br></div></div>
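<div>To make the comparison discussed in this thread easier to reproduce, here is a minimal Python sketch that tallies which source databases appear in the `domains` lists of VEP JSON records. It assumes the command-line `--json` output holds one JSON object per line and that each `domains` entry carries `db` and `name` keys, as in the REST response. The function names, the example record and its identifiers are hypothetical and only for illustration.</div>
<pre>
```python
import json
from collections import Counter


def count_domain_sources(records):
    """Count `domains` entries per source database across VEP JSON records."""
    counts = Counter()
    for record in records:
        for consequence in record.get("transcript_consequences", []):
            for domain in consequence.get("domains", []):
                # Each entry is assumed to look like {"db": ..., "name": ...}.
                counts[domain.get("db", "unknown")] += 1
    return counts


def load_vep_json_lines(path):
    """Read a VEP --json output file, assumed to hold one JSON object per line."""
    with open(path) as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Made-up record shaped like the `domains` lists described in this thread;
# the transcript ID and the first domain name are placeholders, not real data.
example = [{
    "transcript_consequences": [{
        "transcript_id": "ENST_EXAMPLE",
        "domains": [
            {"db": "ENSP_mappings", "name": "example_1"},
            {"db": "Pfam", "name": "PF07714"},
        ],
    }]
}]
print(dict(count_domain_sources(example)))
```
</pre>
<div>Running `count_domain_sources(load_vep_json_lines("T790M.vep.json"))` on the command-line output, and the same function on the list parsed from the REST response, should show at a glance which databases are missing from one side.</div>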