<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<p>Dear Joseph</p>
<p><br>
</p>
<p>You have probably noticed that your query returns two rows for
taxon_id 644223. There are a few more things to consider:<br>
</p>
<p>1. There are a number of genomes that belong to the Fungi site.
They exist as Core databases on the server, and you can identify
them by checking the return value of <tt>get_division()</tt> via
the <tt>MetaContainer</tt> adaptor of every genome registered in
the Registry. If you use SQL, that's the <tt>species.division</tt>
meta key.<br>
</p>
<p>2. Some of those genomes will make it into the <tt>genome_db</tt>
table of the Compara database.</p>
<p>3. Some of those genomes will make it into the <tt>gene_tree</tt>
/ <tt>species_tree</tt> tables of the Compara database, i.e. the
genomes that are used in the protein-tree / orthology builds.</p>
<p>I believe you can have multiple strains / sub-species of the same
species (and perhaps multiple assemblies of the same organism?)
at each of these three levels.</p>
<p>- If you deal with Core databases, the <tt>species.species_taxonomy_id</tt>
meta key holds the taxon_id of the species, whereas the <tt>species.taxonomy_id</tt>
meta key holds the taxon_id of the particular strain /
sub-species (if one has been assigned).</p>
<p>- If you use the Compara database, the taxon_id we store in the <tt>genome_db</tt>
and <tt>species_tree_node</tt> tables is the <tt>species.taxonomy_id</tt>
value, so to get the species-level taxon_id instead you need to
traverse the taxonomy upwards until you find a node at the <tt>species</tt>
rank.<br>
</p>
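<p>To make the "traverse upwards" step concrete, here is a minimal
sketch in Python. It is only an illustration: the taxonomy is a toy
dictionary with made-up taxon_ids and simplified ranks, and the
function names are hypothetical. In the real Compara database the
parent links and ranks would come from the NCBI taxonomy tables
(<tt>ncbi_taxa_node</tt>, if I recall the schema correctly) rather
than from a hard-coded dictionary.</p>
<pre>
```python
# Toy taxonomy: taxon_id mapped to (parent_id, rank).
# All IDs and ranks below are made up for illustration only.
TOY_TAXONOMY = {
    1: (None, "no rank"),
    10: (1, "genus"),
    100: (10, "species"),
    101: (100, "strain"),
    102: (100, "strain"),
    200: (10, "species"),
}


def species_taxon_id(taxon_id, taxonomy):
    """Walk up the parent links until a node at the 'species' rank is found.

    Returns the taxon_id of that node, or None if the starting node has
    no ancestor (or self) at the species rank.
    """
    current = taxon_id
    while current is not None:
        parent_id, rank = taxonomy[current]
        if rank == "species":
            return current
        current = parent_id
    return None


def unique_species(genome_taxon_ids, taxonomy):
    """Map each genome's taxon_id to its species and return the distinct set.

    Genomes without a species-rank ancestor are left out of the result.
    """
    species_ids = {species_taxon_id(t, taxonomy) for t in genome_taxon_ids}
    species_ids.discard(None)
    return species_ids


# Four hypothetical genomes: two strains of species 100, plus 100 and 200.
genome_taxon_ids = [101, 102, 100, 200]
print(len(genome_taxon_ids), "genomes,",
      len(unique_species(genome_taxon_ids, TOY_TAXONOMY)), "unique species")
```
</pre>
<p>The point is that the number of rows (genomes) and the number of
unique species can differ as soon as two strains share the same
species-level ancestor.</p>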
<p><br>
</p>
<p>Note that on the same server there is another database (<tt>ensembl_metadata_99</tt>)
which holds the list of genomes and taxon_ids (both species and
sub-species), and whether the genome is in the Compara database.
It might be easier to query than the Core / Compara databases.<br>
</p>
<p><br>
</p>
<p>Hope this helps,</p>
<p>Matthieu</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 06/02/2020 05:58, Joseph Steinberger
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DB8P191MB066712B09476A614293072A6B51D0@DB8P191MB0667.EURP191.PROD.OUTLOOK.COM">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
Dear Community,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
I would like to know the number of unique species in the
Ensembl Fungi database - I believe there are 488.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
I run the following command, and get 488 rows - <br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<blockquote style="margin-top: 0px; margin-bottom: 0px;">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span>SELECT taxon_id,<br>
</span>
<div> node_name<br>
</div>
<div>FROM ensembl_compara_fungi_46_99.species_tree_node<br>
</div>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span>WHERE genome_db_id != 'NaN'</span></div>
</blockquote>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span>Am I correct in my interpretation?</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span></span>Sincerely,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
Yossi</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<br>
</blockquote>
<pre class="moz-signature" cols="72">--
Matthieu Muffato, Ph.D.
Ensembl Compara Principal Developer
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room A3-123
Phone + 44 (0) 1223 49 4631
Fax + 44 (0) 1223 49 4468</pre>
</body>
</html>