<div dir="ltr"><p class="MsoNormal"><span lang="EN-US">Missing GO data when using API to </span></p>
<p class="MsoNormal"><span lang="EN-US">Dear all</span></p>
<p class="MsoNormal"><span lang="EN-US">I am using the following script to fetch all
GO terms for a bacterium. </span></p>
<p class="MsoNormal"><span lang="EN-US">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> </span></p>
<p class="MsoNormal"><span lang="EN-US">#!/usr/bin/perl</span></p>
<p class="MsoNormal"><span lang="EN-US"># methanocella_conradii_hz254</span></p>
<p class="MsoNormal"><span lang="EN-US">use strict;</span></p>
<p class="MsoNormal"><span lang="EN-US">use warnings;</span></p>
<p class="MsoNormal"><span lang="EN-US">use Bio::EnsEMBL::LookUp;</span></p>
<p class="MsoNormal"><span lang="EN-US"># load the lookup from the main Ensembl
Bacteria public server</span></p>
<p class="MsoNormal"><span lang="EN-US">my $lookup = Bio::EnsEMBL::LookUp->new(</span></p>
<p class="MsoNormal"><span lang="EN-US">
-URL => "<a href="http://bacteria.ensembl.org/registry.json">http://bacteria.ensembl.org/registry.json</a>",</span></p>
<p class="MsoNormal"><span lang="EN-US">
-NO_CACHE => 1</span></p>
<p class="MsoNormal"><span lang="EN-US">);</span></p>
<p class="MsoNormal"><span lang="EN-US"># find the correct database adaptor using a
unique name</span></p>
<p class="MsoNormal"><span lang="EN-US">my ($dba) =
@{$lookup->get_by_name_exact(</span></p>
<p class="MsoNormal"><span lang="EN-US">
'methanocella_conradii_hz254'</span></p>
<p class="MsoNormal"><span lang="EN-US">)};</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">my $genes =
$dba->get_GeneAdaptor()->fetch_all(); # where is the get_GeneAdaptor()
documentation</span></p>
<p class="MsoNormal"><span lang="EN-US"># test</span></p>
<p class="MsoNormal"><span lang="EN-US">print "Found ".scalar(@$genes)." genes for ".$dba->species()."\n";</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">use Bio::EnsEMBL::DBSQL::OntologyDBAdaptor;
</span></p>
<p class="MsoNormal"><span lang="EN-US"># the ontology adaptor module must be imported before the adaptor is constructed (the use line above)</span></p>
<p class="MsoNormal"><span lang="EN-US"># the problems listed below are solved!</span></p>
<p class="MsoNormal"><span lang="EN-US"># get adaptor for ontology</span></p>
<p class="MsoNormal"><span lang="EN-US">my $ontology_dba =
Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(</span></p>
<p class="MsoNormal"><span lang="EN-US"> -HOST => '<a href="http://mysql.ebi.ac.uk">mysql.ebi.ac.uk</a>',</span></p>
<p class="MsoNormal"><span lang="EN-US"> -USER => 'anonymous',</span></p>
<p class="MsoNormal"><span lang="EN-US"> -PORT => '4157',</span></p>
<p class="MsoNormal"><span lang="EN-US"> -group => 'ontology',</span></p>
<p class="MsoNormal"><span lang="EN-US"> -dbname => 'ensemblgenomes_ontology_21_74',</span></p>
<p class="MsoNormal"><span lang="EN-US"> -species => 'multi' );</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">my $goada =
$ontology_dba->get_adaptor('OntologyTerm');</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US"># get GO information</span></p>
<p class="MsoNormal"><span lang="EN-US">open (MYFILE, '>>', 'HZ254gene_go.txt') or die "Cannot open HZ254gene_go.txt: $!";</span></p>
<p class="MsoNormal"><span lang="EN-US">foreach my $gene (@$genes){</span></p>
<p class="MsoNormal"><span lang="EN-US">foreach my $link (@{ $gene->get_all_DBLinks
} ){</span></p>
<p class="MsoNormal"><span lang="EN-US">if ($link->database eq "GO"){</span></p>
<p class="MsoNormal"><span lang="EN-US">my $term_id=$link->display_id;</span></p>
<p class="MsoNormal"><span lang="EN-US">my $term_name='-';</span></p>
<p class="MsoNormal"><span lang="EN-US">my
$term=$goada->fetch_by_accession($term_id);</span></p>
<p class="MsoNormal"><span lang="EN-US">if($term and $term->name){</span></p>
<p class="MsoNormal"><span lang="EN-US">$term_name=$term->name;}</span></p>
<p class="MsoNormal"><span lang="EN-US">print MYFILE $gene->stable_id,
"\t", $term_id, "\n";</span></p>
<p class="MsoNormal"><span lang="EN-US"> }</span></p>
<p class="MsoNormal"><span lang="EN-US"> }</span></p>
<p class="MsoNormal"><span lang="EN-US">};</span></p>
<p class="MsoNormal"><span lang="EN-US">close (MYFILE);</span></p>
<p class="MsoNormal"><span lang="EN-US"># API version, 74</span></p>
<p class="MsoNormal"><span lang="EN-US">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> </span></p>
<p class="MsoNormal"><span lang="EN-US">Take Mtc_0001, one of my genes, as an example.
The Perl API returned the following:</span></p>
<p class="MsoNormal"><span lang="EN-US">Mtc_0001 GO:0016740
(molecular_function)</span></p>
<p class="MsoNormal"><span lang="EN-US">After searching Ensembl Bacteria for M.
conradii and for gene Mtc_0001 (<a href="http://bacteria.ensembl.org/methanocella_conradii_hz254/Transcript/Ontology/molecular_function?db=core;g=Mtc_0001;oid=molecular_function;r=Chromosome:38-805;t=AFC98776;tab=t">http://bacteria.ensembl.org/methanocella_conradii_hz254/Transcript/Ontology/molecular_function?db=core;g=Mtc_0001;oid=molecular_function;r=Chromosome:38-805;t=AFC98776;tab=t</a>)
</span></p>
<p class="MsoNormal"><span lang="EN-US">I got the same results:</span></p>
<p class="MsoNormal"><span lang="EN-US">GO:0016740 transferase
activity <a href="http://www.uniprot.org/uniprot/H8I517"><span style="font-size:10pt;font-family:Helvetica,sans-serif;color:rgb(0,0,102)">UniProtKB/TrEMBL:H8I517</span></a>
molecular_function</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">However, when I went on to browse its
source, UniProtKB/TrEMBL:H8I517, I found the following among its GO annotations: </span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<table border="1" cellpadding="3" cellspacing="0" style="font-size:9pt;font-family:Helvetica,sans-serif;color:black">
<tr><td>UniProtKB</td><td>H8I517</td><td>Mtc_0001</td><td>GO:0008152</td><td>metabolic process</td><td>P</td><td>IEA</td><td>UniProt Keywords2GO (UniProtKB/TrEMBL entries)</td><td>UniProtKB-KW:KW-0808</td><td>1041930</td><td>20140111</td><td>GOC</td></tr>
<tr><td colspan="12" align="center" style="text-align:center;background-color:rgb(204,204,204)"><b>Function</b></td></tr>
<tr><td>UniProtKB</td><td>H8I517</td><td>Mtc_0001</td><td>GO:0016740</td><td>transferase activity</td><td>F</td><td>IEA</td><td>UniProt Keywords2GO (UniProtKB/TrEMBL entries)</td><td>UniProtKB-KW:KW-0808</td><td>1041930</td><td>20140107</td><td>UniProt</td></tr>
</table>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
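<p class="MsoNormal"><span lang="EN-US">Comparing the two sources for one gene amounts to a set difference over GO IDs. The following is a minimal sketch of that comparison, using only the two ID lists quoted in this message for Mtc_0001; the helper name missing_go_terms is hypothetical and not part of the Ensembl API.</span></p>

```perl
use strict;
use warnings;

# Return (sorted) the GO IDs that appear in the reference list
# but not in the list obtained from the API.
# missing_go_terms is a hypothetical helper, not an Ensembl API call.
sub missing_go_terms {
    my ($api_ids, $reference_ids) = @_;
    my %have = map { $_ => 1 } @$api_ids;
    my @missing = grep { !$have{$_} } @$reference_ids;
    return [ sort @missing ];
}

# GO IDs for Mtc_0001 as quoted in this message.
my @from_api     = ('GO:0016740');
my @from_uniprot = ('GO:0008152', 'GO:0016740');

# Prints GO:0008152
print join("\n", @{ missing_go_terms(\@from_api, \@from_uniprot) }), "\n";
```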
<p class="MsoNormal"><span lang="EN-US"><span style="background-color:rgb(255,255,255)"><b>The API results above are missing the metabolic
process annotation for Mtc_0001. (The same holds for all other genes: only molecular function
GO term IDs were returned.)</b></span></span></p>
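<p class="MsoNormal"><span lang="EN-US">One way to check this across the whole genome would be to tally each gene's GO cross-references by ontology namespace. The sketch below assumes the $gene and $goada objects from the script above and relies on the namespace method of the ontology term objects; count_go_by_namespace is a hypothetical helper name, and this is only a diagnostic aid, not a fix.</span></p>

```perl
use strict;
use warnings;

# Count a gene's distinct GO cross-references per ontology namespace.
# $gene  : a gene object providing get_all_DBLinks (as in the script above)
# $goada : the OntologyTerm adaptor ($goada in the script above)
# Returns a hash reference, e.g. { molecular_function => 1 }.
# count_go_by_namespace is a hypothetical helper, not an Ensembl API call.
sub count_go_by_namespace {
    my ($gene, $goada) = @_;
    my (%count, %seen);
    foreach my $link (@{ $gene->get_all_DBLinks }) {
        next unless $link->database eq 'GO';
        my $acc = $link->display_id;
        next if $seen{$acc}++;    # the same term can be linked more than once
        my $term = $goada->fetch_by_accession($acc);
        # Terms not found in the ontology database are counted as 'unknown'.
        my $ns = 'unknown';
        $ns = $term->namespace if $term and $term->namespace;
        $count{$ns}++;
    }
    return \%count;
}

# Possible use inside the existing "foreach my $gene (@$genes)" loop:
#   my $c = count_go_by_namespace($gene, $goada);
#   print join("\t", $gene->stable_id,
#              map { "$_=$c->{$_}" } sort keys %$c), "\n";
```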
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">I need to do a GO analysis for this bacterium
and would like to fetch all of its GO terms. I am a little confused by this, so any suggestions
on how to resolve it would be appreciated!</span></p><div><br></div>-- <br><div>Pengfei Liu, PhD Candidate<br><br>Lab of Microbial Ecology<br>College of Resources and Environmental Sciences<br>China Agricultural University<br>
No.2 Yuanmingyuanxilu, Beijing, 100193<br>P.R. China<br><br>Tel: +86-10-62731358<br>Fax: +86-10-62731016<br> <br>E-mail: <a href="mailto:liupfskygre@gmail.com" target="_blank">liupfskygre@gmail.com</a></div>
<div><br></div>
</div>