<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif">Hi John,</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">You are missing 565 loci, correct? (31953 - 31388 = 565)</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">In Mart, I used the Region filter to select all non-chromosome scaffolds. The 'Gene' count for these is 565. See the image if it comes through on the email list; otherwise, the URL for the Mart query is below.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif"><pre style="color:rgb(0,0,0)"><a href="http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id|drerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name">http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id|drerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name</a>."KN149679.1,KN149681.1,KN149682.1,KN149684.1,KN149686.1,KN149687.1,KN149688.1,KN149689.1,KN149690.1,KN149691.1,KN149694.1,KN149695.1,KN149696.1,KN149697.1,KN149698.1,KN149702.1,KN149704.1,KN149706.1,KN149707.1,KN149710.1,KN149711.1,KN149713.1,KN149715.1,KN149717.1,KN149719.1,KN149725.1,KN149727.1,KN149730.1,KN149731.1,KN149732.1,KN149734.1,KN149735.1,KN149739.1,KN149753.1,KN149755.1,KN149764.1,KN149765.1,KN149776.1,KN149779.1,KN149781.1,KN149782.1,KN149784.1,KN149787.1,KN149790.1,KN149795.1,KN149797.1,KN149798.1,KN149799.1,KN149803.1,KN149813.1,KN149816.1,KN149818.1,KN149829.1,KN149830.1,KN1
49831.1,KN149842.1,KN149843.1,KN149846.1,KN149847.1,KN149850.1,KN149855.1,KN149857.1,KN149858.1,KN149859.1,KN149861.1,KN149868.1,KN149874.1,KN149878.1,KN149880.1,KN149883.1,KN149884.1,KN149886.1,KN149894.1,KN149895.1,KN149896.1,KN149897.1,KN149900.1,KN149904.1,KN149906.1,KN149909.1,KN149910.1,KN149912.1,KN149914.1,KN149916.1,KN149917.1,KN149921.1,KN149923.1,KN149929.1,KN149930.1,KN149933.1,KN149934.1,KN149936.1,KN149939.1,KN149943.1,KN149945.1,KN149946.1,KN149947.1,KN149948.1,KN149951.1,KN149955.1,KN149959.1,KN149962.1,KN149964.1,KN149966.1,KN149968.1,KN149978.1,KN149986.1,KN149987.1,KN149989.1,KN149992.1,KN149995.1,KN149997.1,KN149998.1,KN150000.1,KN150001.1,KN150002.1,KN150003.1,KN150008.1,KN150013.1,KN150015.1,KN150027.1,KN150032.1,KN150038.1,KN150039.1,KN150040.1,KN150041.1,KN150042.1,KN150046.1,KN150051.1,KN150052.1,KN150056.1,KN150062.1,KN150064.1,KN150066.1,KN150067.1,KN150071.1,KN150072.1,KN150075.1,KN150079.1,KN150080.1,KN150084.1,KN150086.1,KN150088.1,KN150090.1,KN150096.1,KN150099.1,KN150102.1,KN150104.1,KN150108.1,KN150109.1,KN150112.1,KN150115.1,KN150120.1,KN150125.1,KN150127.1,KN150128.1,KN150131.1,KN150137.1,KN150141.1,KN150142.1,KN150148.1,KN150156.1,KN150158.1,KN150162.1,KN150164.1,KN150165.1,KN150168.1,KN150169.1,KN150170.1,KN150171.1,KN150172.1,KN150173.1,KN150176.1,KN150177.1,KN150178.1,KN150188.1,KN150189.1,KN150193.1,KN150196.1,KN150199.1,KN150205.1,KN150207.1,KN150208.1,KN150212.1,KN150213.1,KN150214.1,KN150216.1,KN150221.1,KN150229.1,KN150230.1,KN150232.1,KN150239.1,KN150240.1,KN150241.1,KN150251.1,KN150259.1,KN150262.1,KN150265.1,KN150267.1,KN150269.1,KN150272.1,KN150273.1,KN150277.1,KN150285.1,KN150305.1,KN150307.1,KN150311.1,KN150312.1,KN150314.1,KN150317.1,KN150320.1,KN150322.1,KN150324.1,KN150326.1,KN150328.1,KN150332.1,KN150334.1,KN150335.1,KN150336.1,KN150339.1,KN150342.1,KN150345.1,KN150346.1,KN150348.1,KN150350.1,KN150351.1,KN150353.1,KN150355.1,KN150359.1,KN150361.1,KN150362.1,KN150365.1,KN150366.1,KN150371.1,KN150372.1,KN150379.1,K
N150380.1,KN150383.1,KN150387.1,KN150390.1,KN150399.1,KN150400.1,KN150401.1,KN150402.1,KN150403.1,KN150405.1,KN150407.1,KN150411.1,KN150412.1,KN150415.1,KN150416.1,KN150424.1,KN150425.1,KN150432.1,KN150433.1,KN150435.1,KN150442.1,KN150447.1,KN150449.1,KN150451.1,KN150456.1,KN150470.1,KN150474.1,KN150475.1,KN150482.1,KN150487.1,KN150490.1,KN150491.1,KN150492.1,KN150505.1,KN150506.1,KN150508.1,KN150516.1,KN150518.1,KN150521.1,KN150527.1,KN150530.1,KN150531.1,KN150532.1,KN150541.1,KN150543.1,KN150544.1,KN150545.1,KN150550.1,KN150552.1,KN150561.1,KN150562.1,KN150564.1,KN150566.1,KN150568.1,KN150570.1,KN150572.1,KN150574.1,KN150576.1,KN150578.1,KN150589.1,KN150590.1,KN150596.1,KN150597.1,KN150600.1,KN150603.1,KN150605.1,KN150608.1,KN150614.1,KN150616.1,KN150617.1,KN150620.1,KN150628.1,KN150630.1,KN150631.1,KN150635.1,KN150636.1,KN150637.1,KN150642.1,KN150647.1,KN150650.1,KN150653.1,KN150654.1,KN150663.1,KN150665.1,KN150666.1,KN150667.1,KN150670.1,KN150672.1,KN150674.1,KN150677.1,KN150680.1,KN150681.1,KN150683.1,KN150685.1,KN150691.1,KN150696.1,KN150698.1,KN150699.1,KN150700.1,KN150702.1,KN150703.1,KN150706.1,KN150708.1,KN150709.1"&VISIBLEPANEL=filterpanel</pre></div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Hope that helps/goes some way to explaining the difference between Mart and your API script. 
I can't say whether either of these is the definitive gene count for zebrafish.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">regards</div><div class="gmail_default" style="font-family:verdana,sans-serif">Dan</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 7 December 2015 at 02:17, john samuel <span dir="ltr"><<a href="mailto:john.samuel@senecacollege.ca" target="_blank">john.samuel@senecacollege.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
Thanks Dan.<br>
I thought of that, and I tried the same code but looking for genes
in all the scaffolds, thinking that there might be some unplaced
scaffolds, but the total for all scaffolds adds up to 31,501. This
could be, as you said, all the genes mapped to chromosomes, plus
some unplaced scaffolds, but that doesn't match any of the other
totals, so I'm no closer to knowing which total is correct.<br>
Any other thoughts?<span class="HOEnZb"><font color="#888888"><br>
John</font></span><div><div class="h5"><br>
<br>
On 15-12-06 09:07 PM, Daniel Lawson wrote:
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:verdana,sans-serif">Hi John,</div>
<div class="gmail_default" style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:verdana,sans-serif">There may be other
sequences in the assembly that have not been assigned to a
chromosome. You can check this via the API or in Mart. I
expect you'll find a bunch of small sequences that harbour
some genes - maybe that will get your totals to balance.</div>
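<div class="gmail_default" style="font-family:verdana,sans-serif">A minimal sketch of that balance check, using only the per-chromosome counts and the BioMart total quoted from the original post below. The variable names are illustrative and are not part of the Ensembl API.</div>
<pre>
```perl
use strict;
use warnings;

# Per-chromosome gene counts exactly as printed by the API script in the
# original post (GRCz10, 'chromosome' coordinate system only).
my %per_chromosome = (
    MT => 37,
    1  => 1386, 2  => 1587, 3  => 1611, 4  => 3103, 5  => 1704,
    6  => 1280, 7  => 1507, 8  => 1216, 9  => 1108, 10 => 1108,
    11 => 1039, 12 => 952,  13 => 1013, 14 => 953,  15 => 1146,
    16 => 1241, 17 => 1048, 18 => 942,  19 => 1123, 20 => 1253,
    21 => 1092, 22 => 1174, 23 => 1031, 24 => 800,  25 => 934,
);

# Re-add the per-chromosome counts to confirm the script's own total.
my $api_total = 0;
$api_total += $_ for values %per_chromosome;

# Gene count reported by BioMart in the original post.
my $mart_total = 31953;

# Number of genes that sequences outside the chromosomes would have to
# contribute for the two totals to agree.
my $gap = $mart_total - $api_total;

print "API chromosome total:\t$api_total\n";
print "BioMart total:\t$mart_total\n";
print "Gap to account for:\t$gap\n";
```
</pre>
<div class="gmail_default" style="font-family:verdana,sans-serif">The chromosome counts re-add to 31388, so the gap printed is 565. That is the number of genes the unplaced sequences would need to hold for the API and Mart totals to balance.</div>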
<div class="gmail_default" style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:verdana,sans-serif">cheers</div>
<div class="gmail_default" style="font-family:verdana,sans-serif">Dan</div>
<div class="gmail_default" style="font-family:verdana,sans-serif"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 7 December 2015 at 01:59, john
samuel <span dir="ltr"><<a href="mailto:john.samuel@senecacollege.ca" target="_blank">john.samuel@senecacollege.ca</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
I am trying to get an accurate count of all the ENSDARG
genes from the latest zebrafish data (<span>GRCz10) in
Ensembl.<br>
If I use the Perl API to get all the genes on all the
chromosomes, I get a total of 31,388, i.e.<br>
<br>
my $slice_adaptor = $registry->get_adaptor(
'danio_rerio', 'Core', 'Slice' );<br>
my @slices = @{
$slice_adaptor->fetch_all('chromosome') };<br>
my $total = 0;<br>
my %all;<br>
foreach my $slice (@slices) {<br>
my @genes = @{ $slice->get_all_Genes() };<br>
my $count = scalar @genes;<br>
$all{$slice->seq_region_name()}=$count;<br>
$total += $count;<br>
}<br>
foreach my $sorted (sort {$a<=>$b} keys %all) {<br>
print "chromosome: $sorted\t$all{$sorted}\n";<br>
}<br>
print "gene total is\t$total\n";<br>
<br>
chromosome: MT 37<br>
chromosome: 1 1386<br>
chromosome: 2 1587<br>
chromosome: 3 1611<br>
chromosome: 4 3103<br>
chromosome: 5 1704<br>
chromosome: 6 1280<br>
chromosome: 7 1507<br>
chromosome: 8 1216<br>
chromosome: 9 1108<br>
chromosome: 10 1108<br>
chromosome: 11 1039<br>
chromosome: 12 952<br>
chromosome: 13 1013<br>
chromosome: 14 953<br>
chromosome: 15 1146<br>
chromosome: 16 1241<br>
chromosome: 17 1048<br>
chromosome: 18 942<br>
chromosome: 19 1123<br>
chromosome: 20 1253<br>
chromosome: 21 1092<br>
chromosome: 22 1174<br>
chromosome: 23 1031<br>
chromosome: 24 800<br>
chromosome: 25 934<br>
gene total is 31388<br>
<br>
Does anyone see anything wrong with how I get the total? I
don't, but when I go to BioMart (see below), I get
a total of 31,953.<br>
<br>
</span><img src="cid:part2.02050200.03000702@senecacollege.ca" alt=""><br>
<span><br>
and if I go to the info page for the genome at <a href="http://useast.ensembl.org/Danio_rerio/Info/Annotation" target="_blank">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
I see a </span><span>different</span><span> total there
too (31,650 not counting pseudogenes).<br>
<br>
</span><img src="cid:part4.04020308.07090803@senecacollege.ca" alt=""><br>
<span><br>
Does anyone know why the totals differ, which one to
believe, and whether there's anything wrong with using
the one that my code calculated as the definitive one?
I need to compare the total number of genes with the
number that we are finding under certain conditions,
to do some stats.<span><font color="#888888"><br>
John<br>
</font></span></span><br>
<br>
<br>
</div>
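<div>A hedged sketch of one way to extend the script above beyond the 'chromosome' coordinate system: fetch 'toplevel' slices instead and tally genes per coordinate system, so that chromosomes and any unplaced scaffolds are reported separately. The helper name tally_genes_by_coord_system and the ENSEMBL_LIVE environment switch are assumptions made for this example, as is the use of the public ensembldb.ensembl.org server with the anonymous user. The live query only runs when that switch is set and the Ensembl Perl API is installed, and no particular totals are implied.</div>
<pre>
```perl
use strict;
use warnings;

# Tally gene counts per coordinate system for a list of slice-like objects.
# Each object only needs coord_system_name() and get_all_Genes(), both of
# which Bio::EnsEMBL::Slice provides. (Helper name is hypothetical.)
sub tally_genes_by_coord_system {
    my (@slices) = @_;
    my %tally;
    for my $slice (@slices) {
        my $genes = $slice->get_all_Genes();    # array reference
        $tally{ $slice->coord_system_name() } += scalar @{$genes};
    }
    return %tally;
}

# Live query against the public database. ENSEMBL_LIVE is an assumed,
# example-only switch so the script is harmless without network access.
if ( $ENV{ENSEMBL_LIVE} && eval { require Bio::EnsEMBL::Registry; 1 } ) {
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $slice_adaptor =
      $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );

    # 'toplevel' asks for every top-level sequence region, which covers
    # chromosomes as well as sequences not assigned to a chromosome.
    my @slices = @{ $slice_adaptor->fetch_all('toplevel') };
    my %tally  = tally_genes_by_coord_system(@slices);

    my $total = 0;
    for my $coord_system ( sort keys %tally ) {
        print "$coord_system\t$tally{$coord_system}\n";
        $total += $tally{$coord_system};
    }
    print "gene total is\t$total\n";
}
```
</pre>
<div>Keeping the tally in a separate subroutine makes it easy to compare the per-coordinate-system counts with the Region filter in Mart, rather than relying on a single grand total.</div>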
<br>
_______________________________________________<br>
Dev mailing list <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" rel="noreferrer" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div>
<div dir="ltr">
<div>VectorBase | i5K insect genome initiative</div>
</div>
</div>
</div>
<br>
</blockquote>
</div></div></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr"><div>VectorBase | i5K insect genome initiative</div></div></div>
</div>