<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi,<br>
I am trying to get an accurate count of all the ENSDARG genes in
the latest zebrafish assembly (<span>GRCz10) in Ensembl.<br>
If I use the Perl API to count the genes on every chromosome,
I get a total of 31,388:<br>
<br>
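use strict;<br>
use warnings;<br>
use Bio::EnsEMBL::Registry;<br>
<br>
# Connection details are assumed here (public Ensembl server).<br>
my $registry = 'Bio::EnsEMBL::Registry';<br>
$registry->load_registry_from_db( -host => 'ensembldb.ensembl.org', -user => 'anonymous' );<br>
<br>
my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );<br>
# One slice per sequence region in the 'chromosome' coordinate system.<br>
my @slices = @{ $slice_adaptor->fetch_all('chromosome') };<br>
my $total = 0;<br>
my %all;<br>
foreach my $slice (@slices) {<br>
# Count the genes on this chromosome and add them to the running total.<br>
my $count = scalar @{ $slice->get_all_Genes() };<br>
$all{ $slice->seq_region_name() } = $count;<br>
$total += $count;<br>
}<br>
# Numeric sort; non-numeric names such as MT compare as 0 and so come first.<br>
foreach my $sorted ( sort { ( $a =~ /^\d+$/ ? $a : 0 ) <=> ( $b =~ /^\d+$/ ? $b : 0 ) } keys %all ) {<br>
print "chromosome: $sorted\t$all{$sorted}\n";<br>
}<br>
print "gene total is\t$total\n";<br>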
<br>
chromosome: MT 37<br>
chromosome: 1 1386<br>
chromosome: 2 1587<br>
chromosome: 3 1611<br>
chromosome: 4 3103<br>
chromosome: 5 1704<br>
chromosome: 6 1280<br>
chromosome: 7 1507<br>
chromosome: 8 1216<br>
chromosome: 9 1108<br>
chromosome: 10 1108<br>
chromosome: 11 1039<br>
chromosome: 12 952<br>
chromosome: 13 1013<br>
chromosome: 14 953<br>
chromosome: 15 1146<br>
chromosome: 16 1241<br>
chromosome: 17 1048<br>
chromosome: 18 942<br>
chromosome: 19 1123<br>
chromosome: 20 1253<br>
chromosome: 21 1092<br>
chromosome: 22 1174<br>
chromosome: 23 1031<br>
chromosome: 24 800<br>
chromosome: 25 934<br>
gene total is 31388<br>
<br>
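As an aside, the sort in my script puts MT first because a
non-numeric name is compared as 0. Below is a minimal sketch of an
alternative that handles such names explicitly, listing the numbered
chromosomes first and names like MT after them. The helper name
by_chromosome is made up for illustration, and the three counts are
copied from the output above.<br>
<pre>
```perl
use strict;
use warnings;

# Hypothetical helper: numeric chromosome names first (ascending),
# then non-numeric names such as MT in alphabetical order.
sub by_chromosome {
    my ( $x, $y ) = @_;
    my $x_num = $x =~ /^\d+$/;
    my $y_num = $y =~ /^\d+$/;
    return $x <=> $y if $x_num && $y_num;
    return -1 if $x_num;
    return 1  if $y_num;
    return $x cmp $y;
}

# Counts copied from three rows of the output above.
my %all = ( 'MT' => 37, '1' => 1386, '10' => 1108 );

foreach my $name ( sort { by_chromosome( $a, $b ) } keys %all ) {
    print "chromosome: $name\t$all{$name}\n";
}
```
</pre>
This only changes the order of the printed rows; the per-chromosome
counts and the total are unaffected.<br>
<br>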
Does anyone see anything wrong with how I compute the total? I don't,
but when I go to BioMart (see below), I get a total of 31,953,<br>
<br>
</span><img src="cid:part1.09060504.00080305@senecacollege.ca"
alt=""><br>
<span><br>
and if I go to the info page for the genome at <a
class="moz-txt-link-freetext"
href="http://useast.ensembl.org/Danio_rerio/Info/Annotation">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
I see a </span><span>different</span><span> total there too
(31,650 not counting pseudogenes).<br>
<br>
</span><img src="cid:part3.08020603.03070106@senecacollege.ca"
alt=""><br>
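<br>
To keep the three numbers straight, here is a small sketch that only
does the arithmetic on the totals quoted in this message. The labels
and the helper name gap_from_script are my own, made up for
illustration, and nothing here queries Ensembl.<br>
<pre>
```perl
use strict;
use warnings;

# Totals quoted in this message; the labels are my own shorthand.
my %total = (
    'Perl API, chromosomes only' => 31388,
    'BioMart'                    => 31953,
    'Info page, no pseudogenes'  => 31650,
);

# Hypothetical helper: how many more genes a source reports than my script.
sub gap_from_script {
    my ($other) = @_;
    return $other - $total{'Perl API, chromosomes only'};
}

foreach my $source ( sort keys %total ) {
    printf "%-28s %6d  (%+d vs. script)\n",
        $source, $total{$source}, gap_from_script( $total{$source} );
}
```
</pre>
So BioMart reports 565 more genes than my script, and the info page
reports 262 more.<br>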
<span><br>
Does anyone know why the totals differ, which one to believe, and
whether there is anything wrong with treating the total from my code
as the definitive one? I need to compare the total number of genes
with the number that we are finding under certain conditions, to do
some stats.<br>
John<br>
</span><br>
<br>
<br>
</body>
</html>