<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
Hi Bert,<br>
<br>
<br>
I agree with your code, and I agree with the SQL I originally posted.
I don't understand why BioMart and this Perl code (below)<br>
are only returning 25k, and it seems too coincidental that they are
returning the same number.<br>
<br>
The difference in results is huge. Which exons is this code missing?
All I could think of was predicted exons, but it seems<br>
unlikely that there are 25k known exons and (250k-25k = 225k) predicted
exons not assigned to genes. I don't even know if Ensembl deals with
predicted exons.<br>
I got the same 'discrepancy' in the figures when I tested human too.<br>
<br>
===============================================<br>
<br>
my $gene_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core',
'Gene' );<br>
my $genes = $gene_adaptor->fetch_all();<br>
<br>
<br>
$total_genes=0;<br>
$exon_count = 0;<br>
foreach $gene(@{$genes}) {<br>
$total_genes++;<br>
<br>
foreach $exon ($gene->get_all_Exons()) {<br>
$exon_count++;<br>
} <br>
} #end for each gene<br>
<br>
<br>
=============================================<br>
<br>
<br>
Thank you very much<br>
<br>
On 06/05/11 16:44, Bert Overduin wrote:
<blockquote
cite="mid:BANLkTik1KCWZFDa38bqQGfJhVVfb2HDF8w@mail.gmail.com"
type="cite">Hi,
<div><br>
</div>
<div>When I use the following code:</div>
<div><br>
</div>
<div>
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica; min-height: 14.0px}
</style>
<p class="p1">#!/usr/bin/perl</p>
<p class="p2"><br>
</p>
<p class="p1">use strict;</p>
<p class="p1">use Bio::EnsEMBL::Registry;</p>
<p class="p2"><br>
</p>
<p class="p1">my $reg = "Bio::EnsEMBL::Registry";</p>
<p class="p2"><br>
</p>
<p class="p1">$reg->load_registry_from_db( -host => '<a
moz-do-not-send="true" href="http://ensembldb.ensembl.org">ensembldb.ensembl.org</a>',
-user => 'anonymous' );</p>
<p class="p2"><br>
</p>
<p class="p1">my $exon_adaptor = $reg->get_adaptor( 'Bos
taurus', 'Core', 'Exon' );</p>
<p class="p2"><br>
</p>
<p class="p1">my $exons = $exon_adaptor->fetch_all;</p>
<p class="p2"><br>
</p>
<p class="p1">print scalar( @{$exons} ), "\n";</p>
</div>
<div><br>
</div>
<div>I get:</div>
<div><br>
</div>
<div>
<div>farm2-head2[bert]2: perl <a moz-do-not-send="true"
href="http://test.pl">test.pl</a></div>
<div>225837</div>
<div><br>
</div>
<div>Which is the same number I get with a MySQL query:</div>
<div><br>
</div>
<div>
<div>mysql -u anonymous -h <a moz-do-not-send="true"
href="http://ensembldb.ensembl.org">ensembldb.ensembl.org</a>
-P 5306</div>
<div>Welcome to the MySQL monitor. Commands end with ; or \g.</div>
<div>Your MySQL connection id is 8610 to server version:
5.1.34-log</div>
<div><br>
</div>
<div>Type 'help;' or '\h' for help. Type '\c' to clear the
buffer.</div>
</div>
<div><br>
</div>
<div>
<div>mysql> use bos_taurus_core_62_4k </div>
<div>Reading table information for completion of table and
column names</div>
<div>You can turn off this feature to get a quicker startup
with -A</div>
</div>
<div>
<div><br>
</div>
<div>Database changed</div>
</div>
<div>
<div>mysql> SELECT COUNT(*) FROM exon;</div>
<div>+----------+</div>
<div>| COUNT(*) |</div>
<div>+----------+</div>
<div>| 225837 |</div>
<div>+----------+</div>
<div>1 row in set (0.01 sec)</div>
</div>
<div><br>
</div>
<div>Cheers,</div>
<div>Bert</div>
<div><br>
</div>
<br>
<div class="gmail_quote">On Fri, May 6, 2011 at 4:22 PM, Andrea
Edwards <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:edwardsa@cs.man.ac.uk">edwardsa@cs.man.ac.uk</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
<div text="#000000" bgcolor="#ffffff"> I tried two ways:<br>
<br>
===============================================<br>
<br>
my $gene_adaptor = $registry->get_adaptor(
'bos_taurus', 'Core', 'Gene' );<br>
my $genes = $gene_adaptor->fetch_all();<br>
<br>
my $exon_adaptor = $registry->get_adaptor(
'bos_taurus', 'Core', 'Exon' );<br>
$total_genes=0;<br>
$exon_count = 0;<br>
foreach $gene(@{$genes}) {<br>
$total_genes++;<br>
<br>
foreach $exon ($gene->get_all_Exons()) {<br>
$exon_count++;<br>
} <br>
} #end for each gene<br>
<br>
<br>
=============================================<br>
<br>
This way gave even fewer (23k), but I'm being stricter here
about the chromosomes<br>
<br>
@slices = @{ $slice_adaptor->fetch_all('chromosome',
undef, 0, 1) };<br>
<br>
$total_genes=0;<br>
$exon_count = 0;<br>
foreach $slice (@slices) {<br>
unless ($slice->seq_region_name() =~ /Un/) {<br>
print $slice->seq_region_name."\n";<br>
my $genes =
$gene_adaptor->fetch_all_by_Slice($slice);<br>
<br>
<br>
foreach my $gene(@{$genes}) {<br>
$total_genes++;<br>
<br>
foreach my $exon ($gene->get_all_Exons()) {<br>
$exon_count++;<br>
print "$exon_count\n";<br>
} <br>
<br>
<br>
<br>
<br>
} #end for each gene<br>
}<br>
}<br>
<br>
==============================================<br>
<br>
But neither gives anything like the SQL results<br>
<br>
Why does the SQL give so many more? Which should I use?<br>
<br>
Thank you
<div>
<div class="h5"><br>
<br>
<br>
On 06/05/11 15:50, Bert Overduin wrote:
<blockquote type="cite">Hi Andrea,
<div><br>
</div>
<div>I suspect that your BioMart results are
truncated because the query is too large.
<div><br>
</div>
<div>However, that doesn't explain your API
results ... What does your API code look like?<br>
<div><br>
</div>
<div>Cheers,</div>
<div>Bert<br>
<br>
<div class="gmail_quote">On Fri, May 6, 2011
at 3:45 PM, Andrea Edwards <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:edwardsa@cs.man.ac.uk"
target="_blank">edwardsa@cs.man.ac.uk</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">Hello<br>
<br>
I'm sorry for the basic question, but I was
looking at the Ensembl core schema and
trying to retrieve just the exons on
chromosomes, and I couldn't work out why I am
getting such different figures from those I get with
BioMart and the Perl API<br>
<br>
For example, for cow there are 25670 exons
in genes with BioMart and the API, but
~210k exons with this SQL. The query just
looks for exons on chromosomes 1-30 and
X<br>
<br>
select count(distinct stable_id) from exon
e inner join exon_stable_id es
using(exon_id) inner join seq_region sr
using(seq_region_id) where
sr.coord_system_id = 2 and <a
moz-do-not-send="true"
href="http://sr.name" target="_blank">sr.name</a>
REGEXP '^[1-9]|^X' and e.is_current=1<br>
<br>
I get 8k just on chromosome 1<br>
<br>
I'm sure this is simple, and perhaps it's
because it's Friday afternoon, but I'm just
not seeing it!!<br>
<br>
_______________________________________________<br>
Dev mailing list <a
moz-do-not-send="true"
href="mailto:Dev@ensembl.org"
target="_blank">Dev@ensembl.org</a><br>
List admin (including
subscribe/unsubscribe): <a
moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev"
target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a moz-do-not-send="true"
href="http://www.ensembl.info/"
target="_blank">http://www.ensembl.info/</a><br>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
Bert Overduin, Ph.D.<br>
Vertebrate Genomics Team<br>
<br>
EMBL - European Bioinformatics Institute<br>
Wellcome Trust Genome Campus<br>
Hinxton, Cambridge CB10 1SD<br>
United Kingdom<br>
<br>
<a moz-do-not-send="true"
href="http://www.ebi.ac.uk/%7Ebert"
target="_blank">http://www.ebi.ac.uk/~bert</a>
<div>
<p style="margin-bottom: 0.0001pt;"> <font
face="Arial"> </font></p>
<p style="margin: 0.1pt 0in 0.0001pt;"><font
face="Arial"><span style="font-size:
10pt; font-family: Arial; color:
black;">Ensembl browser: <a
moz-do-not-send="true"
href="http://www.ensembl.org/"
target="_blank"><span style="color:
blue;">http://www.ensembl.org</span></a></span><span
style="font-size: 10pt; font-family:
Arial;"></span></font></p>
<font face="Arial">
<p style="margin: 0.1pt 0in 0.0001pt;"><span
style="font-size: 10pt; font-family:
Arial; color: black;">Mailing lists: </span><span
style="font-size: 10pt; font-family:
Arial; color: blue;"><a
moz-do-not-send="true"
href="http://www.ensembl.org/info/about/contact/mailing.html"
target="_blank"><span style="color:
blue;">http://www.ensembl.org/info/about/contact/mailing.html</span></a></span><span
style="font-size: 10pt; font-family:
Arial; color: black;"></span></p>
<p style="margin: 0.1pt 0in 0.0001pt;"><span
style="font-size: 10pt; font-family:
Arial; color: black;">Blog: </span><span
style="font-size: 10pt; font-family:
Arial; color: blue;"><a
moz-do-not-send="true"
href="http://www.ensembl.info/"
target="_blank"><span style="color:
blue;">http://www.ensembl.info</span></a></span><span
style="font-size: 10pt; font-family:
Arial; color: black;"></span></p>
<p style="margin: 0.1pt 0in 0.0001pt;"><span
style="font-size: 10pt; font-family:
Arial; color: black;">YouTube: </span><span
style="font-size: 10pt; font-family:
Arial; color: rgb(0, 0, 153);"><a
moz-do-not-send="true"
href="http://www.youtube.com/user/EnsemblHelpdesk"
target="_blank"><span style="color:
blue;">http://www.youtube.com/user/EnsemblHelpdesk</span></a></span><span
style="font-size: 10pt; font-family:
Arial; color: black;"><br>
Facebook: </span><span
style="font-size: 10pt; font-family:
Arial; color: blue;"><a
moz-do-not-send="true"
href="http://www.facebook.com/Ensembl.org"
target="_blank"><span style="color:
blue;">http://www.facebook.com/Ensembl.org</span></a></span><span
style="font-size: 10pt; font-family:
Arial; color: black;"><br>
Twitter: </span><span
style="font-size: 10pt; font-family:
Arial; color: blue;"><a
moz-do-not-send="true"
href="http://twitter.com/Ensembl"
target="_blank"><span style="color:
blue;">http://twitter.com/Ensembl</span></a> </span><span
style="font-size: 10pt; font-family:
Arial; color: black;"></span></p>
</font><font size="2" face="Arial"> </font>
</div>
<br>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
<br>
</div>
</blockquote>
<br>
</body>
</html>