<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>Data differences between api query and website</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Dear All,<BR>
<BR>
I've written a script that takes a gene of interest, finds the parent node of that gene in its protein tree, and prints the IDs of all the leaf nodes under that parent.<BR>
The results I get by querying the database through the API differ from the gene trees shown on the Ensembl website.<BR>
<BR>
For example:<BR>
Using gene ID ENSOANG00000010713, I get a parent node with 4 children:<BR>
ENSOANG00000008298; ENSOANG00000008299; ENSOANG00000022578; ENSOANG00000010713.<BR>
However, when I look at the gene tree on the website, it is quite different: orthologs from other species are also present.<BR>
<BR>
(please see <A HREF="http://www.ensembl.org/Ornithorhynchus_anatinus/Gene/Compara_Tree?collapse=1959500%2C1959435%2C1959539%2C1959433%2C1959430%2C1959490;db=core;g=ENSOANG00000010713;r=X5:2856975-2885950">http://www.ensembl.org/Ornithorhynchus_anatinus/Gene/Compara_Tree?collapse=1959500%2C1959435%2C1959539%2C1959433%2C1959430%2C1959490;db=core;g=ENSOANG00000010713;r=X5:2856975-2885950</A>)<BR>
<BR>
This does not happen for every gene: the majority (I think) do match the gene trees presented on the website.<BR>
<BR>
Do you think this could be a database version issue? The discrepancies are fairly large.<BR>
<BR>
<BR>
Many thanks in advance,<BR>
Emily<BR>
---<BR>
Script below:<BR>
<BR>
use Bio::EnsEMBL::Registry;<BR>
<BR>
my $registry = 'Bio::EnsEMBL::Registry';<BR>
<BR>
$registry->load_registry_from_db(<BR>
-host => 'ensembldb.ensembl.org',<BR>
-user => 'anonymous'<BR>
);<BR>
<BR>
my $member_adaptor = $registry->get_adaptor("Compara", "compara", "Member");<BR>
<BR>
<BR>
my $proteintree_adaptor = $registry->get_adaptor("Compara", "compara", "ProteinTree");<BR>
----<BR>
# I then looped over the list of gene IDs I was interested in (@data)<BR>
foreach my $id (@data)<BR>
{<BR>
chomp($id);<BR>
print "$id\n";<BR>
my $member = $member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE", $id);<BR>
next unless (defined $member);<BR>
my $aligned_member = $proteintree_adaptor->fetch_AlignedMember_by_member_id_root_id($member->get_longest_peptide_Member->member_id);<BR>
my $node = $aligned_member;<BR>
<BR>
while ($node->has_parent){<BR>
# if the node is a leaf, add all leaves from its parent, even if some are not expressed<BR>
my $terminal = 0;<BR>
if ($node->is_leaf){<BR>
$terminal = 1;<BR>
}<BR>
$node = $node->parent();<BR>
if ($terminal == 1){<BR>
print "node is leaf\n";<BR>
my $exp = 1;<BR>
my $proteintree = $proteintree_adaptor->fetch_node_by_node_id($node->node_id);<BR>
my @leaves = @{$proteintree->get_all_leaves};<BR>
print scalar(@leaves)."\n";<BR>
foreach my $leaf (@leaves)<BR>
{<BR>
my $gene = $leaf->get_Gene->stable_id;<BR>
print $gene;<BR>
my $prot = $leaf->get_longest_peptide_Member->stable_id;<BR>
....</FONT>
</P>
</BODY>
</HTML>