<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Genomeo,<br>
<br>
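If no value is available, the attribute is null (undef in Perl), which is
what triggers the "Use of uninitialized value" warning in your printf.<br>
An empty string still counts as a value, whereas null means the value
is undefined.<br>
We tend to test for existence first whenever a value can be null.<br>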
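<br>
As a minimal sketch of such a test, here it is in Python against the REST
JSON output quoted further down, where "description" is null for the OTTG
entry. The names FIELDS and xref_to_row are illustrative assumptions, not
part of any Ensembl API; the column order simply follows your printf:<br>
<pre>
```python
# Column order copied from the printf in the quoted Perl subroutine.
FIELDS = ["dbname", "display_id", "description", "db_display_name",
          "info_text", "info_type", "primary_id", "synonyms", "version"]


def xref_to_row(xref):
    """Format one REST xref record (a dict) as a tab-separated line.

    Missing keys and null (None) values become empty strings, so every
    row keeps the same number of columns.
    """
    cells = []
    for field in FIELDS:
        value = xref.get(field)
        if value is None:              # null in the JSON, or key absent
            cells.append("")
        elif isinstance(value, list):  # synonyms come back as a list
            cells.append(", ".join(str(v) for v in value))
        else:
            cells.append(str(value))
    return "\t".join(cells)


if __name__ == "__main__":
    # Record copied from the OTTG entry in the quoted REST output.
    example = {"display_id": "OTTHUMG00000000961",
               "primary_id": "OTTHUMG00000000961",
               "version": "2", "description": None, "dbname": "OTTG",
               "synonyms": [], "info_type": "NONE", "info_text": "",
               "db_display_name": "Havana gene"}
    print(xref_to_row(example))
```
</pre>
In Perl, the equivalent would be to check each value with defined() and
substitute an empty string before passing it to printf.<br>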
<br>
For the latency, I get the same result as you.<br>
Connecting to the database and fetching all the genes took about 5
and a half minutes.<br>
<br>
Going through all those genes and printing out the information is not
really going to be the bottleneck; the initial fetching of the data
is.<br>
<br>
Unfortunately, our public server is being used heavily at the
moment, possibly because release 75 has just come out.<br>
<br>
If you are going to need the data regularly, our advice is to create
your own local server.<br>
Running your script on our local server, I was able to fetch all the
genes in 6s.<br>
Looping through these 63677 genes and printing all the results took
around 6 minutes.<br>
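<br>
If you want to check where the time goes on your side, a minimal sketch
that times the fetch and the loop separately could look like this. It is
in Python for brevity (in Perl, Time::HiRes gives the same kind of timer),
and fetch_genes and handle_gene are hypothetical placeholders that you
would supply yourself, not Ensembl API calls:<br>
<pre>
```python
import time


def time_phases(fetch_genes, handle_gene):
    """Time the initial fetch and the per-gene loop separately.

    fetch_genes and handle_gene are hypothetical placeholders supplied
    by the caller; they are not part of any Ensembl API.
    """
    start = time.perf_counter()
    genes = list(fetch_genes())  # force the whole fetch to finish here
    fetched = time.perf_counter()
    for gene in genes:
        handle_gene(gene)
    done = time.perf_counter()
    return fetched - start, done - fetched, len(genes)


if __name__ == "__main__":
    # Stand-in data so the sketch runs without a database connection.
    fetch_s, loop_s, n = time_phases(lambda: ["ENSG00000151067"], print)
    print("fetched %d genes in %.3fs, looped in %.3fs" % (n, fetch_s, loop_s))
```
</pre>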
<br>
<br>
Regards,<br>
Magali<br>
<br>
<div class="moz-cite-prefix">On 27/02/2014 16:15, Genomeo Dev wrote:<br>
</div>
<blockquote
cite="mid:CAKry3c1tG-acPvVmySJiv5+E3R-AJL4NdvNZW9pSYz0-SPpybg@mail.gmail.com"
type="cite">
<div dir="ltr">OK thanks.
<div><br>
</div>
<div>Another question please:</div>
<div><br>
</div>
<div>Going back to my Perl subroutine for getting xrefs for
Ensembl IDs:</div>
<div><br>
</div>
<div>use warnings;</div>
<div>
<div>sub print_DBEntries</div>
<div>{</div>
<div><span class="" style="white-space:pre"> </span>my
$db_entries = shift;</div>
<div><span class="" style="white-space:pre"> </span>foreach
my $dbe ( @{$db_entries} ) {</div>
<div><span class="" style="white-space:pre"> </span>printf
"%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", $dbe->dbname(),
$dbe->display_id(), $dbe->description(),
$dbe->db_display_name(), $dbe->info_text(),
$dbe->info_type(), $dbe->primary_id(), join(",
",@{$dbe->get_all_synonyms()}) , $dbe->version();</div>
<div><span class="" style="white-space:pre"> </span>}</div>
<div>}</div>
</div>
<div><br>
</div>
<div>I found that some values are not returned for some
databases. For example using ENSG00000151067:</div>
<div><br>
</div>
<div>
<div>
#Use of uninitialized value in printf at ./fetch_ensembl_genes_v2.pl
line 47.</div>
<div>OTTG<span class="" style="white-space:pre"> </span>OTTHUMG00000150243<span
class="" style="white-space:pre"> </span>Havana gene<span
class="" style="white-space:pre"> </span>NONE<span
class="" style="white-space:pre"> </span>OTTHUMG00000150243<span
class="" style="white-space:pre"> </span>7</div>
<div>#Use of uninitialized value in printf at ./fetch_ensembl_genes_v2.pl
line 47.</div>
<div>ENS_LRG_gene<span class="" style="white-space:pre"> </span>LRG_334<span
class="" style="white-space:pre"> </span>LRG display in
Ensembl<span class="" style="white-space:pre"> </span>NONE<span
class="" style="white-space:pre"> </span>LRG_334<span
class="" style="white-space:pre"> </span>0</div>
<div>LRG<span class="" style="white-space:pre"> </span>LRG_334<span
class="" style="white-space:pre"> </span>Locus Reference
Genomic record for CACNA1C<span class=""
style="white-space:pre"> </span>Locus Reference Genomic<span
class="" style="white-space:pre"> </span>DIRECT<span
class="" style="white-space:pre"> </span>LRG_334<span
class="" style="white-space:pre"> </span>0</div>
<div>ArrayExpress<span class="" style="white-space:pre"> </span>ENSG00000151067<span
class="" style="white-space:pre"> </span>ArrayExpress<span
class="" style="white-space:pre"> </span>DIRECT<span
class="" style="white-space:pre"> </span>ENSG00000151067<span
class="" style="white-space:pre"> </span>0</div>
<div>EntrezGene<span class="" style="white-space:pre"> </span>CACNA1C<span
class="" style="white-space:pre"> </span>calcium channel,
voltage-dependent, L type, alpha 1C subunit<span class=""
style="white-space:pre"> </span>EntrezGene<span class=""
style="white-space:pre"> </span>DEPENDENT<span class=""
style="white-space:pre"> </span>775<span class=""
style="white-space:pre"> </span>CACH2, CACN2, CACNL1A1,
CaV1.2, CCHL1A1, LQT8, TS<span class=""
style="white-space:pre"> </span>0</div>
<div>HGNC<span class="" style="white-space:pre"> </span>CACNA1C<span
class="" style="white-space:pre"> </span>calcium channel,
voltage-dependent, L type, alpha 1C subunit<span class=""
style="white-space:pre"> </span>HGNC Symbol<span class=""
style="white-space:pre"> </span>Generated via
ensembl_manual<span class="" style="white-space:pre"> </span>DIRECT<span
class="" style="white-space:pre"> </span>1390<span
class="" style="white-space:pre"> </span>CACH2, CACN2,
CACNL1A1, Cav1.2, CCHL1A1, LQT8, TS<span class=""
style="white-space:pre"> </span>0</div>
<div>MIM_GENE<span class="" style="white-space:pre"> </span>
CALCIUM CHANNEL, VOLTAGE-DEPENDENT [*114205]<span class=""
style="white-space:pre"> </span> CALCIUM CHANNEL,
VOLTAGE-DEPENDENT, L TYPE, ALPHA-1C SUBUNIT; CACNA1C<span
class="" style="white-space:pre"> </span>MIM gene<span
class="" style="white-space:pre"> </span>DEPENDENT<span
class="" style="white-space:pre"> </span>114205<span
class="" style="white-space:pre"> </span>0</div>
<div>MIM_MORBID<span class="" style="white-space:pre"> </span>
TIMOTHY SYNDROME [#601005]<span class=""
style="white-space:pre"> </span> TIMOTHY SYNDROME; TS<span
class="" style="white-space:pre"> </span>MIM disease<span
class="" style="white-space:pre"> </span>DEPENDENT<span
class="" style="white-space:pre"> </span>601005<span
class="" style="white-space:pre"> </span>0</div>
<div>MIM_MORBID<span class="" style="white-space:pre"> </span>
BRUGADA SYNDROME 3 [#611875]<span class=""
style="white-space:pre"> </span> BRUGADA SYNDROME 3;
BRGDA3<span class="" style="white-space:pre"> </span>MIM
disease<span class="" style="white-space:pre"> </span>DEPENDENT<span
class="" style="white-space:pre"> </span>611875<span
class="" style="white-space:pre"> </span>0</div>
<div>UniGene<span class="" style="white-space:pre"> </span>Hs.690010<span
class="" style="white-space:pre"> </span>Voltage-dependent
L-type Ca2+ channel alpha 1 subunit (CACNA1C) mRNA, exon 1a
and partial cds<span class="" style="white-space:pre"> </span>UniGene<span
class="" style="white-space:pre"> </span>SEQUENCE_MATCH<span
class="" style="white-space:pre"> </span>Hs.690010<span
class="" style="white-space:pre"> </span>0</div>
<div>UniGene<span class="" style="white-space:pre"> </span>Hs.697137<span
class="" style="white-space:pre"> </span>Transcribed
locus, weakly similar to XP_416388.3 PREDICTED:
voltage-dependent L-type calcium channel subunit alpha-1C
[Gallus gallus]<span class="" style="white-space:pre"> </span>UniGene<span
class="" style="white-space:pre"> </span>SEQUENCE_MATCH<span
class="" style="white-space:pre"> </span>Hs.697137<span
class="" style="white-space:pre"> </span>0</div>
<div>Uniprot_gn<span class="" style="white-space:pre"> </span>CACNA1C<span
class="" style="white-space:pre"> </span>UniProtKB Gene
Name<span class="" style="white-space:pre"> </span>DEPENDENT<span
class="" style="white-space:pre"> </span>CACNA1C<span
class="" style="white-space:pre"> </span>CACH2, CACN2,
CACNL1A1, CCHL1A1<span class="" style="white-space:pre"> </span>0</div>
<div>WikiGene<span class="" style="white-space:pre"> </span>CACNA1C<span
class="" style="white-space:pre"> </span>calcium channel,
voltage-dependent, L type, alpha 1C subunit<span class=""
style="white-space:pre"> </span>WikiGene<span class=""
style="white-space:pre"> </span>DEPENDENT<span class=""
style="white-space:pre"> </span>775<span class=""
style="white-space:pre"> </span>0</div>
</div>
<div><br>
</div>
<div>I would expect a missing value to be an empty string. Am I
missing anything?</div>
<div><br>
</div>
<div>On a related note, this code runs very slowly (it takes 6 minutes
for 100 genes). From an earlier post, it seems that connecting
to the database is the bottleneck. Connecting to <a
moz-do-not-send="true" href="http://useastdb.ensembl.org">useastdb.ensembl.org</a>
instead of <a moz-do-not-send="true"
href="http://ensembldb.ensembl.org">ensembldb.ensembl.org</a>
is not much better. So I was wondering whether there is a way
to turn off lazy loading for the purpose of debugging?</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 27 February 2014 14:43, mag <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:mr6@ebi.ac.uk" target="_blank">mr6@ebi.ac.uk</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Hi Genomeo,<br>
<br>
Taking the last entry as example:<br>
Hs.743764 refers to
<a moz-do-not-send="true"
href="http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?UGID=5947187&TAXID=9606&SEARCH=Hs.743764"
target="_blank">http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?UGID=5947187&TAXID=9606&SEARCH=Hs.743764</a><br>
This is a human locus.<br>
The description means that this locus is similar to a gene
in rat, as it has not been fully annotated in human.<br>
<br>
I agree the description can be misleading, but it is
imported directly from NCBI as is, so there is not much we
can do about it.<br>
<br>
<br>
Regards,<br>
Magali
<div>
<div class="h5"><br>
<br>
<div>On 27/02/2014 14:37, Genomeo Dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Thanks very much Magali for pointing
that out.
<div><br>
</div>
<div>If I understand you correctly, db_type and
species are therefore attributes of the query
gene IDs, not of the returned cross-reference IDs.
For my query IDs, I see that I define them here:</div>
<div><br>
</div>
<div>my $gene_adaptor =
Bio::EnsEMBL::Registry->get_adaptor( "human",
"core", "gene" );<br>
</div>
<div><br>
</div>
<div>I have tried to look up about 5000 human
Ensembl IDs and found that for 256 of them I get
cross-mappings to other organisms. This only happens for
UniGene. For example, for ENSG00000010244:</div>
<div><br>
</div>
<div>display_id<span style="white-space:pre-wrap">
</span>dbname<span style="white-space:pre-wrap">
</span>ensembl_start<span
style="white-space:pre-wrap"> </span>xref_start<span
style="white-space:pre-wrap"> </span>display_id<span
style="white-space:pre-wrap"> </span>score<span
style="white-space:pre-wrap"> </span>db_display_name<span
style="white-space:pre-wrap"> </span>xref_end<span
style="white-space:pre-wrap"> </span>evalue<span
style="white-space:pre-wrap"> </span>info_text<span
style="white-space:pre-wrap"> </span>info_type<span
style="white-space:pre-wrap"> </span>ensembl_end<span
style="white-space:pre-wrap"> </span>primary_id<span
style="white-space:pre-wrap"> </span>ensembl_identity<span
style="white-space:pre-wrap"> </span>synonyms<span
style="white-space:pre-wrap"> </span>version<span
style="white-space:pre-wrap"> </span>cigar_line<span
style="white-space:pre-wrap"> </span>xref_identity<span
style="white-space:pre-wrap"> </span>dbname<span
style="white-space:pre-wrap"> </span>description<br>
</div>
<div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.500775<span
style="white-space:pre-wrap"> </span>23313<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>4672<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>4672<span
style="white-space:pre-wrap"> </span>Hs.500775<span
style="white-space:pre-wrap"> </span>99<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>1631M1D1592M1I1448M<span
style="white-space:pre-wrap"> </span>99<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Zinc
finger protein 207</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>5667<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.612377<span
style="white-space:pre-wrap"> </span>1200<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>249<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>5417<span
style="white-space:pre-wrap"> </span>Hs.612377<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>249M<span
style="white-space:pre-wrap"> </span>97<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>12853<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.636112<span
style="white-space:pre-wrap"> </span>2260<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>452<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>12400<span
style="white-space:pre-wrap"> </span>Hs.636112<span
style="white-space:pre-wrap"> </span>3<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>452M<span
style="white-space:pre-wrap"> </span>100<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>5213<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.658344<span
style="white-space:pre-wrap"> </span>3230<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>684<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>4526<span
style="white-space:pre-wrap"> </span>Hs.658344<span
style="white-space:pre-wrap"> </span>4<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>615M1I8M1D3M1D34M1D13M1D5M1I4M<span
style="white-space:pre-wrap"> </span>91<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>3014<span
style="white-space:pre-wrap"> </span>23<span
style="white-space:pre-wrap"> </span>Hs.670238<span
style="white-space:pre-wrap"> </span>1995<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>427<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>2607<span
style="white-space:pre-wrap"> </span>Hs.670238<span
style="white-space:pre-wrap"> </span>2<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>399M1D6M<span
style="white-space:pre-wrap"> </span>94<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>11505<span
style="white-space:pre-wrap"> </span>2<span
style="white-space:pre-wrap"> </span>Hs.694378<span
style="white-space:pre-wrap"> </span>3063<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>628<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>12131<span
style="white-space:pre-wrap"> </span>Hs.694378<span
style="white-space:pre-wrap"> </span>4<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>627M<span
style="white-space:pre-wrap"> </span>98<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.716993<span
style="white-space:pre-wrap"> </span>1971<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>396<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>396<span
style="white-space:pre-wrap"> </span>Hs.716993<span
style="white-space:pre-wrap"> </span>99<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>396M<span
style="white-space:pre-wrap"> </span>99<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus, strongly similar to NP_001034109.1
Zfp207 gene product [Rattus norvegicus]</div>
<div>ENSG00000010244<span
style="white-space:pre-wrap"> </span>ensembl<span
style="white-space:pre-wrap"> </span>50<span
style="white-space:pre-wrap"> </span>1<span
style="white-space:pre-wrap"> </span>Hs.743764<span
style="white-space:pre-wrap"> </span>3472<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>791<span
style="white-space:pre-wrap"> </span>SEQUENCE_MATCH<span
style="white-space:pre-wrap"> </span>841<span
style="white-space:pre-wrap"> </span>Hs.743764<span
style="white-space:pre-wrap"> </span>5<span
style="white-space:pre-wrap"> </span>0<span
style="white-space:pre-wrap"> </span>716M1D75M<span
style="white-space:pre-wrap"> </span>91<span
style="white-space:pre-wrap"> </span>UniGene<span
style="white-space:pre-wrap"> </span>Transcribed
locus, moderately similar to NP_001034109.1
Zfp207 gene product [Rattus norvegicus]</div>
</div>
<div><br>
</div>
<div class="gmail_extra">(obtained from Rest)</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 27 February 2014
14:00, mag <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:mr6@ebi.ac.uk"
target="_blank">mr6@ebi.ac.uk</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Hi
Genomeo,<br>
<br>
To find which attributes are available,
the Ensembl Doxygen documentation usually
covers everything you need.<br>
Looking at <a moz-do-not-send="true"
href="http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Gene.html"
target="_blank">http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Gene.html</a><br>
will tell you that you can obtain the
following from a gene:<br>
<br>
$gene->source()<br>
$gene->analysis->logic_name()<br>
$gene->description()<br>
$gene->external_name()<br>
$gene->biotype()<br>
$gene->seq_region_start()<br>
$gene->seq_region_end()<br>
$gene->seq_region_name()<br>
$gene->seq_region_strand()<br>
$gene->display_id()<br>
<br>
When using the API, you should always know
what object_type you are using, as it
allows you to use the correct attributes.<br>
In this example, if you are using a
Bio::EnsEMBL::Gene, object_type is 'gene'<br>
<br>
Likewise for species and db_type: you need
to know those beforehand when using
the Perl API directly.<br>
They are the ones which will allow you to
connect to the correct database based on
the data you are looking for.<br>
<br>
Regarding cross references to other
organisms, do you have any examples?<br>
Generally, we should be only mapping to
other resources for the same organism.<br>
For example, for pig, we will only assign
cross references to Uniprot pig proteins.<br>
<br>
The main exceptions I can think of are:<br>
- HGNC names<br>
Typically, if the coverage for a species
is low (i.e. not all 20-odd thousand
proteins have been submitted to Uniprot or
RefSeq), we will use HGNC names to fill in
the gaps.<br>
Where no name can be found and there is a
homolog in human, we use the same name as
in human.<br>
- Ensembl translations<br>
For some low coverage species, the annotation
was provided by projecting the human
annotation via a whole genome alignment.<br>
For these models, we add an external
reference to the human translation which
was used to build the model.<br>
<br>
<br>
Hope this helps,<br>
Magali
<div>
<div><br>
<div><br>
</div>
<div>On 27/02/2014 13:41, Genomeo Dev
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Thanks very much for
the useful answer.
<div><br>
</div>
<div>I noticed that cross ref also
maps to genes from organisms
other than that of the query
gene ID. Any comment on that?<br>
<div><br>
</div>
<div>Related to the previous
question, I use the following
Rest python code to do id
lookup for particular Ensembl
IDs:</div>
<div><br>
</div>
<div>
<div>pref= "/lookup/id/"</div>
<div>ext = "?"</div>
<div><br>
</div>
<div>for line in inputfile1:</div>
<div> geneid=
line.rstrip('\n')</div>
<div><br>
</div>
<div> resp, content =
http.request(server+pref+geneid+ext,
method="GET",
headers={"Content-Type":"application/json"})</div>
<div><br>
</div>
<div> if not
resp.status == 200:</div>
<div> print
"%s\t%s\t%s" % (geneid,
"Invalid response:",
resp.status)</div>
<div> continue</div>
<div>
#sys.exit()</div>
<div> print "%s\t%s" %
(geneid,content)</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>And I get this output:</div>
<div><br>
</div>
<div>
<div>ENSG00000223972<span
style="white-space:pre-wrap">
</span>{"source":"ensembl_havana","object_type":"Gene","logic_name":"ensembl_havana_gene","species":"homo_sapiens","description":"DEAD/H
(Asp-Glu-Ala-Asp/His) box
helicase 11 like 1
[Source:HGNC
Symbol;Acc:37102]","display_name":"DDX11L1","biotype":"pseudogene","end":14412,"seq_region_name":"1","db_type":"core","strand":1,"id":"ENSG00000223972","start":11869}</div>
</div>
<div><br>
</div>
<div>What would be the
classes/attributes to use
under the Perl API to get
that? i.e:</div>
<div><br>
</div>
<div>source</div>
<div>object_type</div>
<div>logic_name</div>
<div>species</div>
<div>description</div>
<div>display_name</div>
<div>biotype</div>
<div>end</div>
<div>seq_region_name<br>
</div>
<div>db_type</div>
<div>strand</div>
<div>id</div>
<div>start</div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>G.</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 27
February 2014 11:39, mag <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:mr6@ebi.ac.uk"
target="_blank">mr6@ebi.ac.uk</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000"> Hi Genomeo,<br>
<br>
The REST server only displays
the current/latest release.<br>
The release version can be
found with this endpoint:<br>
<a moz-do-not-send="true"
href="http://beta.rest.ensembl.org/documentation/info/software"
target="_blank">http://beta.rest.ensembl.org/documentation/info/software</a><br>
<br>
To get more details with the
Ensembl API, you only need
to update the
print_DBEntries method to
display all the attributes
you are looking for.<br>
Compared to the output from
REST, we have the following:<br>
- display_id is
$dbe->display_id()<br>
- primary_id is
$dbe->primary_id()<br>
- version is
$dbe->version()<br>
- description is
$dbe->description()<br>
- dbname is
$dbe->dbname()<br>
- synonyms is
$dbe->get_all_synonyms()<br>
- info_type is
$dbe->info_type()<br>
- info_text is
$dbe->info_text()<br>
- db_display_name is
$dbe->db_display_name()<br>
<br>
You can choose which format
the REST server will output.<br>
Details of all formats can
be found in our user guide:<br>
<a moz-do-not-send="true"
href="http://beta.rest.ensembl.org/documentation/user_guide"
target="_blank">http://beta.rest.ensembl.org/documentation/user_guide</a><br>
For tab-delimited output,
content_type=text/x-gff3 is
used, but it is only
available for the /feature
endpoint.<br>
<br>
There is no file in the
Ensembl ftp dumps that
contains all the external
references produced.<br>
<br>
<br>
Regards,<br>
Magali
<div>
<div><br>
<br>
<div>On 27/02/2014
11:20, Genomeo Dev
wrote:<br>
</div>
</div>
</div>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">
<div><font
color="#000000"
face="arial,
helvetica,
sans-serif"><span
style="line-height:18px;white-space:pre-wrap">Hi,</span></font></div>
<div><font
color="#000000"
face="arial,
helvetica,
sans-serif"><span
style="line-height:18px;white-space:pre-wrap"><br>
</span></font></div>
<div><span
style="line-height:18px;white-space:pre-wrap;font-family:arial,helvetica,sans-serif">I
am interested in
getting wide
cross references
to ensembl gene
IDs. I found two
programmatic
ways to do that
which give
consistent
results but
different amount
of details.
Using
ENSG00000223972
as an example:</span><br>
</div>
<div><font
color="#000000"
face="arial,
helvetica,
sans-serif"><span
style="line-height:18px;white-space:pre-wrap">(1)</span></font></div>
<div><font
color="#000000"
face="arial,
helvetica,
sans-serif"><span
style="line-height:18px;white-space:pre-wrap">Using this rest API
Endpoint
python code (<a
moz-do-not-send="true"
href="http://beta.rest.ensembl.org/documentation/info/xref_id"
target="_blank">http://beta.rest.ensembl.org/documentation/info/xref_id</a>)</span></font></div>
<div><br>
</div>
<pre>
import httplib2, sys

http = httplib2.Http(".cache")

server = "http://beta.rest.ensembl.org"
ext = "/xrefs/id/ENSG00000157764?"
resp, content = http.request(server+ext, method="GET", headers={"Content-Type":"application/json"})

if not resp.status == 200:
  print "Invalid response: ", resp.status
  sys.exit()
import json

decoded = json.loads(content)
print repr(decoded)
</pre>
<div><br>
</div>
<div>I get:</div>
<div><br>
</div>
<div>
<div>{"display_id":"OTTHUMG00000000961","primary_id":"OTTHUMG00000000961","version":"2","description":null,"dbname":"OTTG","synonyms":[],"info_type":"NONE","info_text":"","db_display_name":"Havana
gene"}</div>
<div><br>
</div>
<div>{"primary_id":"Hs.714157","dbname":"UniGene","ensembl_identity":98,"synonyms":[],"ensembl_start":6,"xref_start":1,"xref_end":1639,"db_display_name":"UniGene","display_id":"Hs.714157","ensembl_end":1657,"version":"0","score":8055,"cigar_line":"1200M1D299M12D140M","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
1","xref_identity":97,"evalue":null,"info_text":"","info_type":"SEQUENCE_MATCH"}</div>
<div><br>
</div>
<div>{"primary_id":"Hs.618434","dbname":"UniGene","ensembl_identity":58,"synonyms":[],"ensembl_start":669,"xref_start":1,"xref_end":974,"db_display_name":"UniGene","display_id":"Hs.618434","ensembl_end":1655,"version":"0","score":4757,"cigar_line":"537M1D299M12D138M","description":"Similar
to DEAD/H
(Asp-Glu-Ala-Asp/His)
box polypeptide
11 isoform 1,
mRNA (cDNA clone
IMAGE:6103207)","xref_identity":96,"evalue":null,"info_text":"","info_type":"SEQUENCE_MATCH"}</div>
<div><br>
</div>
<div>{"display_id":"DDX11L1","primary_id":"37102","version":"0","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
1","dbname":"HGNC","synonyms":[],"info_type":"DIRECT","info_text":"Generated
via
ensembl_manual","db_display_name":"HGNC
Symbol"}</div>
<div><br>
</div>
<div>{"display_id":"DDX11L5","primary_id":"100287596","version":"0","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
5","dbname":"EntrezGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"EntrezGene"}</div>
<div><br>
</div>
<div>{"display_id":"DDX11L1","primary_id":"100287102","version":"0","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
1","dbname":"EntrezGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"EntrezGene"}</div>
<div><br>
</div>
<div>{"display_id":"ENSG00000223972","primary_id":"ENSG00000223972","version":"0","description":"","dbname":"ArrayExpress","synonyms":[],"info_type":"DIRECT","info_text":"","db_display_name":"ArrayExpress"}</div>
<div><br>
</div>
<div>{"display_id":"DDX11L5","primary_id":"100287596","version":"0","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
5","dbname":"WikiGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"WikiGene"}</div>
<div><br>
</div>
<div>{"display_id":"DDX11L1","primary_id":"100287102","version":"0","description":"DEAD/H
(Asp-Glu-Ala-Asp/His)
box helicase 11
like
1","dbname":"WikiGene","synonyms":[],"info_type":"DEPENDENT","info_text":"","db_display_name":"WikiGene"}]</div>
<div><br>
</div>
</div>
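<div>For reference, records like the ones above can be flattened into tab-separated lines with a few lines of Python. This is only a minimal sketch: it assumes the decoded JSON is a list of xref dictionaries, and the column order and the helper name xrefs_to_tsv are my own choices for illustration, not something the REST service defines.</div>
<pre>
```python
import json

# Hypothetical column order; the REST service does not prescribe one.
COLUMNS = ["dbname", "display_id", "primary_id", "version",
           "description", "db_display_name", "info_type", "info_text"]


def xrefs_to_tsv(decoded, columns=COLUMNS):
    """Return one tab-separated line per xref record."""
    lines = []
    for record in decoded:
        fields = []
        for name in columns:
            value = record.get(name)
            # JSON null decodes to None; emit it as an empty field.
            fields.append("" if value is None else str(value))
        # Synonyms arrive as a list, so join them into a single field.
        fields.append(", ".join(record.get("synonyms") or []))
        lines.append("\t".join(fields))
    return lines


# Sample record copied from the output above.
content = '[{"display_id":"OTTHUMG00000000961","primary_id":"OTTHUMG00000000961","version":"2","description":null,"dbname":"OTTG","synonyms":[],"info_type":"NONE","info_text":"","db_display_name":"Havana gene"}]'
for line in xrefs_to_tsv(json.loads(content)):
    print(line)
```
</pre>
<div>A missing key and a null value both end up as an empty field here, so every line keeps the same number of columns.</div>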
<div>(2)</div>
<div><br>
</div>
<div>Using this Perl
API code (based on
<a
moz-do-not-send="true"
href="http://www.ensembl.org/info/docs/api/core/core_tutorial.html"
target="_blank">http://www.ensembl.org/info/docs/api/core/core_tutorial.html</a>):</div>
<div><br>
</div>
<div>
<pre style="margin-top:0px;margin-bottom:16px;border:1px solid rgb(204,204,204);background-color:rgb(240,240,240);padding:8px!important"><font color="#555555" face="courier new, monospace"><span style="line-height:16px"># Define a helper subroutine to print DBEntries
sub print_DBEntries
{
    my $db_entries = shift;
    foreach my $dbe ( @{$db_entries} ) {
        printf "\tXREF %s (%s)\n", $dbe->display_id(), $dbe->dbname();
    }
}
my $genes = $gene_adaptor->fetch_all_by_stable_id_list([@gene_list]);</span></font></pre>
<pre style="margin-top:0px;margin-bottom:16px;border:1px solid rgb(204,204,204);background-color:rgb(240,240,240);line-height:16px;color:rgb(85,85,85);padding:8px!important"><font face="courier new, monospace">
...</font></pre>
<pre style="margin-top:0px;margin-bottom:16px;border:1px solid rgb(204,204,204);background-color:rgb(240,240,240);line-height:16px;color:rgb(85,85,85);padding:8px!important"><font face="courier new, monospace">
print "GENE ", $gene->stable_id(), "\n";
print_DBEntries( $gene->get_all_DBEntries() );</font></pre>
</div>
<div>
<div><span
style="white-space:pre-wrap">I
get:</span></div>
<div><span
style="white-space:pre-wrap">
</span></div>
<div>XREF
OTTHUMG00000000961
(OTTG)</div>
<div>XREF
ENSG00000223972
(ArrayExpress)</div>
<div>XREF DDX11L1
(EntrezGene)</div>
<div>XREF DDX11L5
(EntrezGene)</div>
<div>XREF DDX11L1
(HGNC)</div>
<div>XREF
Hs.618434
(UniGene)</div>
<div>XREF
Hs.714157
(UniGene)</div>
<div>XREF DDX11L1
(WikiGene)</div>
<div>XREF DDX11L5
(WikiGene)</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>Questions:</div>
<div><br>
</div>
<div>1. Am I correct in saying that the REST code uses the latest Ensembl release, while the API code uses the Ensembl release currently installed as part of the VM (I am using release 74)?</div>
<div><br>
</div>
<div>2. The REST code gives more extensive details (which I like) than the Perl API code. Could you suggest a simple way to use the API to get the same details?</div>
<div><br>
</div>
<div>3. Does the REST code support tab-separated text as an output format?<br>
</div>
<div><br>
</div>
<div>4. Is there a file in the Ensembl FTP area that contains pre-generated, detailed cross-reference mappings for all current Ensembl genes?</div>
--
<div><br>
</div>
<div>Thanks,</div>
<div><br>
<div dir="ltr"> G.</div>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</div>
</div>
<pre>_______________________________________________
Dev mailing list <a moz-do-not-send="true" href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a>
Posting guidelines and subscribe/unsubscribe info: <a moz-do-not-send="true" href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a>
Ensembl Blog: <a moz-do-not-send="true" href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a>
</pre>
</blockquote>
<br>
</div>
<br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">G.</div>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">G.</div>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">G.</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
</blockquote>
<br>
</body>
</html>