<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi Andrea,<div>When you hit the count button, this just gives a count of the number of genes, as the Ensembl mart datasets you mention are gene-centric. It will not give a count of the number of exons. To get this count, you would have to download the result file and further process the file to calculate the number of exons.</div><div>Hope that helps</div><div>Regards</div><div>Rhoda</div><div><br><div><div>On 9 May 2011, at 12:57, Andrea Edwards wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"> <div text="#000000" bgcolor="#ffffff"> Hi<br> <br> I wasn't using biomart per se; i used it as a test to try and work out why the perl code wasn't returning the same numbers as the sql<br> <br> in both cow and human (genes 62) i didn't set any filters and selected exon id, gene id and transcript id attributes. For cow the count was 25670 and for human it was 53334<br> <br> regards<br> <br> On 09/05/11 09:05, Rhoda Kinsella wrote: <blockquote cite="mid:814DFCD1-2D01-45E2-B0FC-58CDD7A3CB76@ebi.ac.uk" type="cite">Hi Andrea <div>Can you send me the details of your biomart query so that I can look into the issue?</div> <div>Regards</div> <div>Rhoda</div> <div><br> <div> <div>On 6 May 2011, at 17:04, Andrea Edwards wrote:</div> <br class="Apple-interchange-newline"> <blockquote type="cite"><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; font-size: medium;"> <div text="#000000" bgcolor="#ffffff">Hi Bert,<br> <br> <br> I agree with your code and i agree with the sql i originally posted. 
I don't understand why biomart and this perl code (below)<br> are only returning 25k, and it seems too coincidental they are returning the same number<br> <br> The difference in results is huge. What exons is this code missing? All i could think of was predicted exons but it seems<br> unlikely there are 25k known exons and (250k-25k = 225k) predicted exons not assigned to genes. I don't even know if ensembl deals with predicted exons.<br> I got the same 'discrepancy' with figures when i tested human too.<br> <br> ===============================================<br> <br> my $gene_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Gene' );<br> my $genes = $gene_adaptor->fetch_all();<br> <br> <br> $total_genes=0;<br> $exon_count = 0;<br> foreach $gene(@{$genes}) {<br> $total_genes++;<br> <span class="Apple-converted-space"> </span><br> foreach $exon ($gene->get_all_Exons()) {<br> $exon_count++;<br> } <span class="Apple-converted-space"> </span><br> } #end for each gene<br> <br> <br> =============================================<br> <br> <br> Thank you very much<br> <br> On 06/05/11 16:44, Bert Overduin wrote: <blockquote cite="mid:BANLkTik1KCWZFDa38bqQGfJhVVfb2HDF8w@mail.gmail.com" type="cite">Hi, <div><br> </div> <div>When I use the following code:</div> <div><br> </div> <div> <div style="margin: 0px; font: 12px Helvetica;">#!/usr/bin/perl</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">use strict;</div> <div style="margin: 0px; font: 12px Helvetica;">use Bio::EnsEMBL::Registry;</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">my $reg = "Bio::EnsEMBL::Registry";</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">$reg->load_registry_from_db( -host => '<a moz-do-not-send="true" 
href="http://ensembldb.ensembl.org">ensembldb.ensembl.org</a>', -user => 'anonymous' );</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">my $exon_adaptor = $reg->get_adaptor( 'Bos taurus', 'Core', 'Exon' );</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">my $exons = $exon_adaptor->fetch_all;</div> <div style="margin: 0px; font: 12px Helvetica; min-height: 14px;"><br> </div> <div style="margin: 0px; font: 12px Helvetica;">print scalar( @{$exons} ), "\n";</div> </div> <div><br> </div> <div>I get:</div> <div><br> </div> <div> <div>farm2-head2[bert]2: perl<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://test.pl">test.pl</a></div> <div>225837</div> <div><br> </div> <div>Which is the same number I get with a MySQL query:</div> <div><br> </div> <div> <div>mysql -u anonymous -h<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://ensembldb.ensembl.org">ensembldb.ensembl.org</a><span class="Apple-converted-space"> </span>-P 5306</div> <div>Welcome to the MySQL monitor. Commands end with ; or \g.</div> <div>Your MySQL connection id is 8610 to server version: 5.1.34-log</div> <div><br> </div> <div>Type 'help;' or '\h' for help. 
Type '\c' to clear the buffer.</div> </div> <div><br> </div> <div> <div>mysql> use bos_taurus_core_62_4k </div> <div>Reading table information for completion of table and column names</div> <div>You can turn off this feature to get a quicker startup with -A</div> </div> <div> <div><br> </div> <div>Database changed</div> </div> <div> <div>mysql> SELECT COUNT(*) FROM exon;</div> <div>+----------+</div> <div>| COUNT(*) |</div> <div>+----------+</div> <div>| 225837 |</div> <div>+----------+</div> <div>1 row in set (0.01 sec)</div> </div> <div><br> </div> <div>Cheers,</div> <div>Bert</div> <div><br> </div> <br> <div class="gmail_quote">On Fri, May 6, 2011 at 4:22 PM, Andrea Edwards<span class="Apple-converted-space"> </span><span dir="ltr"><<a moz-do-not-send="true" href="mailto:edwardsa@cs.man.ac.uk">edwardsa@cs.man.ac.uk</a>></span><span class="Apple-converted-space"> </span>wrote:<br> <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> <div text="#000000" bgcolor="#ffffff">I tried 2 ways:<br> <br> ===============================================<br> <br> my $gene_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Gene' );<br> my $genes = $gene_adaptor->fetch_all();<br> <br> my $exon_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Exon' );<br> $total_genes=0;<br> $exon_count = 0;<br> foreach $gene(@{$genes}) {<br> $total_genes++;<br> <span class="Apple-converted-space"> </span><br> foreach $exon ($gene->get_all_Exons()) {<br> $exon_count++;<br> } <span class="Apple-converted-space"> </span><br> } #end for each gene<br> <br> <br> =============================================<br> <br> This way gave even fewer (23k) but i'm being stricter here about the chromosomes<br> <br> @slices = @{ $slice_adaptor->fetch_all('chromosome', undef, 0, 1) };<br> <br> $total_genes=0;<br> $exon_count = 0;<br> foreach $slice (@slices) {<br> unless ($slice->seq_region_name() =~ /Un/) {<br> print 
$slice->seq_region_name."\n";<br> my $genes = $gene_adaptor->fetch_all_by_Slice($slice);<br> <span class="Apple-converted-space"> </span><br> <span class="Apple-converted-space"> </span><br> foreach my $gene(@{$genes}) {<br> $total_genes++;<br> <span class="Apple-converted-space"> </span><br> foreach my $exon ($gene->get_all_Exons()) {<br> $exon_count++;<br> print "$exon_count\n";<br> } <span class="Apple-converted-space"> </span><br> <span class="Apple-converted-space"> </span><br> <span class="Apple-converted-space"> </span><br> <span class="Apple-converted-space"> </span><br> <span class="Apple-converted-space"> </span><br> } #end for each gene<br> }<br> }<br> <br> ==============================================<br> <br> But neither gives anything like the sql results<br> <br> Why does the sql give so many more? Which should I use?<br> <br> thank you <div> <div class="h5"><br> <br> <br> On 06/05/11 15:50, Bert Overduin wrote: <blockquote type="cite">Hi Andrea, <div><br> </div> <div>I suspect that your BioMart results are truncated because the query is too large. <div><br> </div> <div>However, that doesn't explain your API results .... 
What does your API code look like?<br> <div><br> </div> <div>Cheers,</div> <div>Bert<br> <br> <div class="gmail_quote">On Fri, May 6, 2011 at 3:45 PM, Andrea Edwards<span class="Apple-converted-space"> </span><span dir="ltr"><<a moz-do-not-send="true" href="mailto:edwardsa@cs.man.ac.uk" target="_blank">edwardsa@cs.man.ac.uk</a>></span><span class="Apple-converted-space"> </span>wrote:<br> <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Hello<br> <br> I'm sorry for the basic question but I was looking at the ensembl core schema and trying to retrieve just the exons on chromosomes and couldn't work out why i am getting such different figures than with biomart and the perl api<br> <br> For example for cow there are 25670 exons in genes with biomart and the api but with this sql ~210k exons. This code is just looking for exons on chromosomes 1-30 and X<br> <br> select count(distinct stable_id) from exon e inner join exon_stable_id es using(exon_id) inner join seq_region sr using(seq_region_id) where sr.coord_system_id = 2 and<span class="Apple-converted-space"> </span>sr.name REGEXP '^[1-9]|^X' and e.is_current=1<br> <br> I get 8k just on chromosome 1<br> <br> I'm sure this is simple and perhaps it's because it's Friday afternoon but I'm just not seeing it!!<br> <br> _______________________________________________<br> Dev mailing list <a moz-do-not-send="true" href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br> List admin (including subscribe/unsubscribe):<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br> Ensembl Blog:<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><br> 
</blockquote> </div> <br> <br clear="all"> <br> --<span class="Apple-converted-space"> </span><br> Bert Overduin, Ph.D.<br> Vertebrate Genomics Team<br> <br> EMBL - European Bioinformatics Institute<br> Wellcome Trust Genome Campus<br> Hinxton, Cambridge CB10 1SD<br> United Kingdom<br> <br> <a moz-do-not-send="true" href="http://www.ebi.ac.uk/%7Ebert" target="_blank">http://www.ebi.ac.uk/~bert</a> <div> <div style="margin-bottom: 0.0001pt;"><br class="webkit-block-placeholder"> </div> <div style="margin: 0.1pt 0in 0.0001pt;"><font face="Arial"><span style="font-size: 10pt; font-family: Arial; color: black;">Ensembl browser: <a moz-do-not-send="true" href="http://www.ensembl.org/" target="_blank"><span style="color: blue;">http://www.ensembl.org</span></a></span><span style="font-size: 10pt; font-family: Arial;"></span></font></div> <font face="Arial"> <div style="margin: 0.1pt 0in 0.0001pt;"><span style="font-size: 10pt; font-family: Arial; color: black;">Mailing lists: </span><span style="font-size: 10pt; font-family: Arial; color: blue;"><a moz-do-not-send="true" href="http://www.ensembl.org/info/about/contact/mailing.html" target="_blank"><span style="color: blue;">http://www.ensembl.org/info/about/contact/mailing.html</span></a></span><span style="font-size: 10pt; font-family: Arial; color: black;"></span></div> <div style="margin: 0.1pt 0in 0.0001pt;"><span style="font-size: 10pt; font-family: Arial; color: black;">Blog: </span><span style="font-size: 10pt; font-family: Arial; color: blue;"><a moz-do-not-send="true" href="http://www.ensembl.info/" target="_blank"><span style="color: blue;">http://www.ensembl.info</span></a></span><span style="font-size: 10pt; font-family: Arial; color: black;"></span></div> <div style="margin: 0.1pt 0in 0.0001pt;"><span style="font-size: 10pt; font-family: Arial; color: black;">YouTube: </span><span style="font-size: 10pt; font-family: Arial; color: rgb(0, 0, 153);"><a moz-do-not-send="true" 
href="http://www.youtube.com/user/EnsemblHelpdesk" target="_blank"><span style="color: blue;">http://www.youtube.com/user/EnsemblHelpdesk</span></a></span><span style="font-size: 10pt; font-family: Arial; color: black;"><br> Facebook: </span><span style="font-size: 10pt; font-family: Arial; color: blue;"><a moz-do-not-send="true" href="http://www.facebook.com/Ensembl.org" target="_blank"><span style="color: blue;">http://www.facebook.com/Ensembl.org</span></a></span><span style="font-size: 10pt; font-family: Arial; color: black;"><br> Twitter: </span><span style="font-size: 10pt; font-family: Arial; color: blue;"><a moz-do-not-send="true" href="http://twitter.com/Ensembl" target="_blank"><span style="color: blue;">http://twitter.com/Ensembl</span></a> </span><span style="font-size: 10pt; font-family: Arial; color: black;"></span></div> </font></div> <br> </div> </div> </div> </blockquote> <br> </div> </div> </div> </blockquote> </div> <br> <br clear="all"> <br> <br> </div> </blockquote> <br> _______________________________________________<br> Dev mailing list <a moz-do-not-send="true" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br> List admin (including subscribe/unsubscribe):<span 
class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br> Ensembl Blog:<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://www.ensembl.info/">http://www.ensembl.info/</a><br> </div> </span></blockquote> </div> <br> <div> <span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;"> <div style="word-wrap: break-word;"><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;"> <div style="word-wrap: break-word;"> <div>Rhoda Kinsella Ph.D.</div> <div> <div style="margin: 0px; font: 12px Helvetica;">Ensembl Bioinformatician,</div> </div> <div>European Bioinformatics Institute (EMBL-EBI),<br> Wellcome Trust Genome Campus, </div> <div>Hinxton<br> Cambridge CB10 1SD,</div> <div>UK.</div> </div> </span></div> </span> </div> <br> </div> </blockquote> <br> </div> </blockquote></div><br><br></div></body></html>