<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 23, 2024 at 10:43 PM Allan Kamau <<a href="mailto:kamauallan@gmail.com">kamauallan@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 23, 2024 at 6:37 PM Benjamin Moore <<a href="mailto:bmoore@ebi.ac.uk" target="_blank">bmoore@ebi.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Allan,<br>
<br>
I think the most straightforward way to retrieve the 5'UTR sequences <br>
for a list of Ensembl features (I assume you have a list of gene IDs, <br>
ENSG...) using the REST API is to use the Lookup endpoints with the <br>
optional expand and utr parameters to retrieve the genomic coordinates <br>
of the 5'UTRs of each transcript for your list of genes:<br>
<br>
<a href="https://rest.ensembl.org/documentation/info/lookup" rel="noreferrer" target="_blank">https://rest.ensembl.org/documentation/info/lookup</a><br>
<br>
Then, you can use the coordinates from the first step as the input for <br>
the Sequence/region endpoints to retrieve the genomic sequence of the <br>
5' UTRs:<br>
<br>
<a href="https://rest.ensembl.org/documentation/info/sequence_region" rel="noreferrer" target="_blank">https://rest.ensembl.org/documentation/info/sequence_region</a><br>
<br>
I hope this helps.<br>
<br>
Best wishes<br>
<br>
Ben<br>
<br>
On 23/04/2024 15:30, Allan Kamau wrote:<br>
> Is there a way to obtain 5'UTR sequences given a list of ensembl ids <br>
> programmatically?<br>
><br>
> I have a list of ensembl ids for which I would like to obtain the <br>
> 5'UTR region for each one of them programmatically hopefully via <br>
> ensembl rest using wget, or python (ensembl-rest).<br>
><br>
> Kindly assist.<br>
><br>
> Thanks.<br>
><br>
> -Allan.<br>
><br>
> _______________________________________________<br>
> Dev mailing list <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
> Posting guidelines and subscribe/unsubscribe info: <a href="https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org" rel="noreferrer" target="_blank">https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org</a><br>
> Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
<br>
-- <br>
Dr. Ben Moore (he/him)<br>
Ensembl Outreach Manager<br>
<br>
European Bioinformatics Institute (EMBL-EBI)<br>
European Molecular Biology Laboratory<br>
Wellcome Trust Genome Campus<br>
Hinxton<br>
Cambridge<br>
CB10 1SD<br>
UK<br>
<br>
<a href="mailto:bmoore@ebi.ac.uk" target="_blank">bmoore@ebi.ac.uk</a><br>
+44 (0)1223 494265<br>
<br></blockquote><div><br></div><div>Thank you, Ben, for your response. I am now stuck on constructing the URL for the <a href="https://rest.ensembl.org/sequence/region/" target="_blank">https://rest.ensembl.org/sequence/region</a> resource.</div><div><br></div><div>I am using the Ensembl ID "ENSMUSG00000041075" in this example.</div><div>The URL below returns the sequence features for the Ensembl ID "ENSMUSG00000041075".</div><div><br></div><div><a href="https://rest.ensembl.org/lookup/id/ENSMUSG00000041075?content-type=application/json;expand=1;utr=1" target="_blank">https://rest.ensembl.org/lookup/id/ENSMUSG00000041075?content-type=application/json;expand=1;utr=1</a></div><div><br></div><div>This returns the response below:</div><div><br>{ <br> "ENSMUSG00000041075": { <br> "seq_region_name": "1", <br> "logic_name": "ensembl_havana_gene_mus_musculus", <br> "end": 59526114, <br> "biotype": "protein_coding", <br> "version": 9, <br> "db_type": "core", <br> "object_type": "Gene", <br> "strand": 1, <br> "start": 59521583, <br> "canonical_transcript": "ENSMUST00000114246.4", <br> "Transcript": [ <br> { <br> "biotype": "protein_coding",<br> "version": 4,<br> "db_type": "core",<br> "object_type": "Transcript",<br> "seq_region_name": "1",<br> "logic_name": "ensembl_havana_transcript_mus_musculus",<br> "end": 59526114,<br> "Translation": {<br> "id": "ENSMUSP00000109884",<br> "length": 572,<br> "start": 59522119,<br> "end": 59523837,<br> "version": 3,<br> "object_type": "Translation",<br> "db_type": "core",<br> "Parent": "ENSMUST00000114246",<br> "species": "mus_musculus"<br> },<br> "assembly_name": "GRCm39",<br> "Parent": "ENSMUSG00000041075",<br> "is_canonical": 1,<br> "display_name": "Fzd7-201",<br> "Exon": [<br> {<br> "species": "mus_musculus",<br> "version": 4,<br> "db_type": "core",<br> "object_type": "Exon",<br> "assembly_name": "GRCm39",<br> "id": "ENSMUSE00000698652",<br> "start": 59521583,<br> "end": 59526114,<br> "strand": 1,<br> "seq_region_name": "1"<br> }<br> ],<br> 
"UTR": [<br> {<br> "object_type": "five_prime_UTR",<br> "db_type": "core",<br> "assembly_name": "GRCm39",<br> "Parent": "ENSMUST00000114246",<br> "type": "five_prime_utr",<br> "species": "mus_musculus",<br> "seq_region_name": "1",<br> "strand": 1,<br> "id": "ENSMUST00000114246",<br> "source": "ensembl_havana",<br> "start": 59521583,<br> "end": 59522118<br> },<br> {<br> "type": "three_prime_utr",<br> "species": "mus_musculus",<br> "db_type": "core",<br> "object_type": "three_prime_UTR",<br> "assembly_name": "GRCm39",<br> "Parent": "ENSMUST00000114246",<br> "id": "ENSMUST00000114246",<br> "source": "ensembl_havana",<br> "end": 59526114,<br> "start": 59523838,<br> "seq_region_name": "1",<br> "strand": 1<br> }<br> ],<br> "species": "mus_musculus",<br> "strand": 1,<br> "start": 59521583,<br> "id": "ENSMUST00000114246",<br> "source": "ensembl_havana",<br> "length": 4532<br> }<br> ],<br> "id": "ENSMUSG00000041075",<br> "description": "frizzled class receptor 7 [Source:MGI Symbol;Acc:MGI:108570]",<br> "source": "ensembl_havana",<br> "assembly_name": "GRCm39",<br> "display_name": "Fzd7",<br> "species": "mus_musculus"<br> }<br>}<br></div><div><br></div><div><div>How should the "<a href="https://rest.ensembl.org/sequence/region/" target="_blank">https://rest.ensembl.org/sequence/region/</a>" URL be formulated for the 5'UTR region given above?</div><div><br></div><div>Below is the step where I am stuck.</div><div><a href="https://rest.ensembl.org/sequence/region/mus_musculus/" target="_blank">https://rest.ensembl.org/sequence/region/mus_musculus/</a>&lt;what_goes_here&gt;:59522119..59523837:1?coord_system=seqlevel;content-type=text/x-fasta</div></div><div><br></div><div>-Allan.</div><div> </div></div></div></blockquote><div><br></div><div>What would be the URL to obtain the five_prime_UTR region?</div><div><br></div><div>I have tried the URL below, but no slice is found for it.</div><div><a 
href="https://rest.ensembl.org/sequence/region/mus_musculus/GRCm39:59521583..59522118:1?coord_system=seqlevel;content-type=text/x-fasta">https://rest.ensembl.org/sequence/region/mus_musculus/GRCm39:59521583..59522118:1?coord_system=seqlevel;content-type=text/x-fasta</a> </div><div><br></div><div>-Allan.</div></div></div>
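<div>A minimal sketch of the two steps discussed above, in Python. It assumes that the region part of the sequence/region URL should be the "seq_region_name" of the UTR record ("1" here) rather than the assembly name "GRCm39", and that the "coord_system=seqlevel" filter can be dropped for a chromosome-level region; both are assumptions to check against the Sequence/region documentation. The function names and the trimmed example record are hypothetical and only illustrate the idea.</div>

```python
from urllib.request import Request, urlopen

REST_SERVER = "https://rest.ensembl.org"


def lookup_url(stable_id):
    """URL for the Lookup endpoint with expand and utr switched on."""
    return (f"{REST_SERVER}/lookup/id/{stable_id}"
            "?content-type=application/json;expand=1;utr=1")


def five_prime_utr_urls(gene):
    """Map each transcript ID to sequence/region URLs for its 5'UTR records.

    `gene` is the dictionary describing one gene in the Lookup response,
    i.e. the object that holds the "Transcript" list.
    """
    urls = {}
    for transcript in gene.get("Transcript", []):
        for utr in transcript.get("UTR", []):
            if utr.get("type") != "five_prime_utr":
                continue
            # Assumption: the region is <seq_region_name>:<start>..<end>:<strand>
            region = (f"{utr['seq_region_name']}:"
                      f"{utr['start']}..{utr['end']}:{utr['strand']}")
            url = (f"{REST_SERVER}/sequence/region/{utr['species']}/{region}"
                   "?content-type=text/x-fasta")
            urls.setdefault(transcript["id"], []).append(url)
    return urls


def fetch_text(url):
    """Fetch a URL and return the body as text (needs network access)."""
    with urlopen(Request(url)) as response:
        return response.read().decode("utf-8")


# Trimmed copy of the UTR records quoted in this thread.
example_gene = {
    "id": "ENSMUSG00000041075",
    "Transcript": [
        {
            "id": "ENSMUST00000114246",
            "UTR": [
                {"type": "five_prime_utr", "species": "mus_musculus",
                 "seq_region_name": "1", "strand": 1,
                 "start": 59521583, "end": 59522118},
                {"type": "three_prime_utr", "species": "mus_musculus",
                 "seq_region_name": "1", "strand": 1,
                 "start": 59523838, "end": 59526114},
            ],
        }
    ],
}

example_urls = five_prime_utr_urls(example_gene)
```

<div>For the record above, the sketch builds a URL ending in "mus_musculus/1:59521583..59522118:1?content-type=text/x-fasta", which can then be passed to fetch_text or to wget. A 5'UTR that spans several exons may appear as several UTR records for one transcript; in that case the fetched segments would need to be joined in transcript order.</div>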