<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Thanks Dan.<br>
I thought of that and ran the same code over all the scaffolds, in
case some of them were unplaced, but the total for all scaffolds
comes to 31,501. As you said, this could be all the genes mapped to
chromosomes plus those on some unplaced scaffolds, but it doesn't
match any of the other totals, so I'm no closer to knowing which
total is correct.<br>
Any other thoughts?<br>
John<br>
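<br>
For reference, a small sketch that lines up the four totals quoted in this thread against the chromosome-only count. The labels are shorthand made up for this example, not Ensembl terminology; the numbers are the ones reported in the messages.<br>
<pre>
```perl
use strict;
use warnings;

# Totals quoted in this thread for GRCz10 (labels are illustrative shorthand).
my @totals = (
    [ 'API, chromosomes only',            31_388 ],
    [ 'API, all scaffolds',               31_501 ],
    [ 'Info page, excluding pseudogenes', 31_650 ],
    [ 'BioMart',                          31_953 ],
);

# Use the chromosome-only count as the baseline for comparison.
my $baseline = $totals[0][1];

# Print each total with its surplus over the baseline.
foreach my $entry (@totals) {
    my ( $source, $count ) = @{$entry};
    printf "%-34s %6d  (+%d)\n", $source, $count, $count - $baseline;
}
```
</pre>
On these figures the scaffold run accounts for 113 genes beyond the chromosome-only count, which still leaves the info page 149 and BioMart 452 above the all-scaffold total.<br>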
<br>
On 15-12-06 09:07 PM, Daniel Lawson wrote:
<blockquote
cite="mid:CAMwWv1x53XRYCDELr0-HaQ6nDRU+BKBEXDJpfsNyHwUe4w+kCg@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<div dir="ltr">
<div class="gmail_default"
style="font-family:verdana,sans-serif">Hi John,</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">There may be other
sequences in the assembly that have not been assigned to a
chromosome. You can check this via the API or in Mart. I
expect you'll find a bunch of small sequences that harbour
some genes - maybe that will get your totals to balance.</div>
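<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">A minimal sketch of one way to do the API check, assuming the public Ensembl MySQL server with anonymous access (the host and user here are assumptions, not taken from this thread). It fetches every top-level sequence rather than only chromosomes and tallies genes per coordinate system; the coordinate system names that come back depend on the database.</div>
<pre>
```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl server with anonymous access.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );

# 'toplevel' returns chromosomes plus any sequences that have not been
# assembled into a chromosome.
my %genes_by_coord_system;
foreach my $slice ( @{ $slice_adaptor->fetch_all('toplevel') } ) {
    my $count = scalar @{ $slice->get_all_Genes() };
    $genes_by_coord_system{ $slice->coord_system_name() } += $count;
}

# One line per coordinate system, e.g. chromosome versus unplaced sequences.
foreach my $cs ( sort keys %genes_by_coord_system ) {
    print "$cs\t$genes_by_coord_system{$cs}\n";
}
```
</pre>
<div class="gmail_default"
style="font-family:verdana,sans-serif">fetch_all also takes further optional arguments (for example to include non-reference regions); this sketch leaves them at their defaults, so it may not cover every region that other tools count.</div>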
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">cheers</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">Dan</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 7 December 2015 at 01:59, john
samuel <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:john.samuel@senecacollege.ca" target="_blank">john.samuel@senecacollege.ca</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
I am trying to get an accurate count of all the ENSDARG
genes from the latest zebrafish data (<span>GRCz10) in
Ensembl.<br>
If I use the Perl API to get all the genes on all the
chromosomes, I get a total of 31,388, i.e.<br>
<br>
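use strict;<br>
use warnings;<br>
use Bio::EnsEMBL::Registry;<br>
<br>
# Assumed setup: the public Ensembl server (the original post did not show how $registry was created)<br>
my $registry = 'Bio::EnsEMBL::Registry';<br>
$registry-&gt;load_registry_from_db( -host =&gt; 'ensembldb.ensembl.org', -user =&gt; 'anonymous' );<br>
<br>
my $slice_adaptor = $registry-&gt;get_adaptor( 'danio_rerio', 'Core', 'Slice' );<br>
# Only sequences in the 'chromosome' coordinate system are fetched here<br>
my @slices = @{ $slice_adaptor-&gt;fetch_all('chromosome') };<br>
my $total = 0;<br>
my %all;    # gene count per chromosome name<br>
foreach my $slice (@slices) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;my $count = scalar @{ $slice-&gt;get_all_Genes() };<br>
&nbsp;&nbsp;&nbsp;&nbsp;$all{ $slice-&gt;seq_region_name() } = $count;<br>
&nbsp;&nbsp;&nbsp;&nbsp;$total += $count;<br>
}<br>
# Numeric sort; a non-numeric name such as MT counts as 0, so it sorts first without a warning<br>
foreach my $sorted ( sort { ( $a =~ /^\d+$/ ? $a : 0 ) &lt;=&gt; ( $b =~ /^\d+$/ ? $b : 0 ) } keys %all ) {<br>
&nbsp;&nbsp;&nbsp;&nbsp;print "chromosome: $sorted\t$all{$sorted}\n";<br>
}<br>
print "gene total is\t$total\n";<br>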
<br>
chromosome: MT 37<br>
chromosome: 1 1386<br>
chromosome: 2 1587<br>
chromosome: 3 1611<br>
chromosome: 4 3103<br>
chromosome: 5 1704<br>
chromosome: 6 1280<br>
chromosome: 7 1507<br>
chromosome: 8 1216<br>
chromosome: 9 1108<br>
chromosome: 10 1108<br>
chromosome: 11 1039<br>
chromosome: 12 952<br>
chromosome: 13 1013<br>
chromosome: 14 953<br>
chromosome: 15 1146<br>
chromosome: 16 1241<br>
chromosome: 17 1048<br>
chromosome: 18 942<br>
chromosome: 19 1123<br>
chromosome: 20 1253<br>
chromosome: 21 1092<br>
chromosome: 22 1174<br>
chromosome: 23 1031<br>
chromosome: 24 800<br>
chromosome: 25 934<br>
gene total is 31388<br>
<br>
Does anyone see anything wrong with how I get the total? I
don't, but when I go to BioMart (see below), I get
a total of 31,953<br>
<br>
</span><img
src="cid:part2.02050200.03000702@senecacollege.ca"
alt=""><br>
<span><br>
and if I go to the info page for the genome at <a
moz-do-not-send="true"
href="http://useast.ensembl.org/Danio_rerio/Info/Annotation"
target="_blank">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
I see a </span><span>different</span><span> total there
too (31,650 not counting pseudogenes).<br>
<br>
</span><img
src="cid:part4.04020308.07090803@senecacollege.ca"
alt=""><br>
<span><br>
Does anyone know why the totals differ, which one to
believe, and whether there's anything wrong with
treating the total from my code as the definitive
one? I need to compare the total number of genes vs.
the number that we are finding under certain conditions,
to do some stats.<span class="HOEnZb"><font
color="#888888"><br>
John<br>
</font></span></span><br>
<br>
<br>
</div>
<br>
_______________________________________________<br>
Dev mailing list <a moz-do-not-send="true"
href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a
moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev"
rel="noreferrer" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a moz-do-not-send="true"
href="http://www.ensembl.info/" rel="noreferrer"
target="_blank">http://www.ensembl.info/</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="gmail_signature">
<div dir="ltr">
<div>VectorBase | i5K insect genome initiative</div>
</div>
</div>
</div>
</blockquote>
</body>
</html>