<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Thanks Dan.<br>
I must admit I don't use biomart, so I don't know what it can do,
but now that you've shown me I see what you mean.<br>
That gives me a better idea about which total to use.<br>
Maybe someone else can shed some light on why the total on the
annotation page is different?<br>
John<br>
<br>
On 15-12-06 09:24 PM, Daniel Lawson wrote:
<blockquote
cite="mid:CAMwWv1wB_6ZwdkjSHo5cAAnmG51qapYUV2N9iehhwe2h1pnOqA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_default"
style="font-family:verdana,sans-serif">Hi John,</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
            style="font-family:verdana,sans-serif">You are missing 565
            loci, correct? (31953 - 31388 = 565) </div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
            style="font-family:verdana,sans-serif">In Mart, I used the
            Region filter to select all non-chromosome scaffolds. The
            'Gene' count for these is 565. See the image if it makes it
            through to the email list; I have also included a URL for
            the Mart query below.</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">
<pre style="color:rgb(0,0,0)"><a moz-do-not-send="true" href="http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id%7Cdrerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name">http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id|drerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name</a>."KN149679.1,KN149681.1,KN149682.1,KN149684.1,KN149686.1,KN149687.1,KN149688.1,KN149689.1,KN149690.1,KN149691.1,KN149694.1,KN149695.1,KN149696.1,KN149697.1,KN149698.1,KN149702.1,KN149704.1,KN149706.1,KN149707.1,KN149710.1,KN149711.1,KN149713.1,KN149715.1,KN149717.1,KN149719.1,KN149725.1,KN149727.1,KN149730.1,KN149731.1,KN14
9732.1,KN149734.1,KN149735.1,KN149739.1,KN149753.1,KN149755.1,KN149764.1,KN149765.1,KN149776.1,KN149779.1,KN149781.1,KN149782.1,KN149784.1,KN149787.1,KN149790.1,KN149795.1,KN149797.1,KN149798.1,KN149799.1,KN149803.1,KN149813.1,KN149816.1,KN149818.1,KN149829.1,KN149830.1,KN149831.1,KN149842.1,KN149843.1,KN149846.1,KN149847.1,KN149850.1,KN149855.1,KN149857.1,KN149858.1,KN149859.1,KN149861.1,KN149868.1,KN149874.1,KN149878.1,KN149880.1,KN149883.1,KN149884.1,KN149886.1,KN149894.1,KN149895.1,KN149896.1,KN149897.1,KN149900.1,KN149904.1,KN149906.1,KN149909.1,KN149910.1,KN149912.1,KN149914.1,KN149916.1,KN149917.1,KN149921.1,KN149923.1,KN149929.1,KN149930.1,KN149933.1,KN149934.1,KN149936.1,KN149939.1,KN149943.1,KN149945.1,KN149946.1,KN149947.1,KN149948.1,KN149951.1,KN149955.1,KN149959.1,KN149962.1,KN149964.1,KN149966.1,KN149968.1,KN149978.1,KN149986.1,KN149987.1,KN149989.1,KN149992.1,KN149995.1,KN149997.1,KN149998.1,KN150000.1,KN150001.1,KN150002.1,KN150003.1,KN150008.1,KN150013.1,KN150015.1,KN
150027.1,KN150032.1,KN150038.1,KN150039.1,KN150040.1,KN150041.1,KN150042.1,KN150046.1,KN150051.1,KN150052.1,KN150056.1,KN150062.1,KN150064.1,KN150066.1,KN150067.1,KN150071.1,KN150072.1,KN150075.1,KN150079.1,KN150080.1,KN150084.1,KN150086.1,KN150088.1,KN150090.1,KN150096.1,KN150099.1,KN150102.1,KN150104.1,KN150108.1,KN150109.1,KN150112.1,KN150115.1,KN150120.1,KN150125.1,KN150127.1,KN150128.1,KN150131.1,KN150137.1,KN150141.1,KN150142.1,KN150148.1,KN150156.1,KN150158.1,KN150162.1,KN150164.1,KN150165.1,KN150168.1,KN150169.1,KN150170.1,KN150171.1,KN150172.1,KN150173.1,KN150176.1,KN150177.1,KN150178.1,KN150188.1,KN150189.1,KN150193.1,KN150196.1,KN150199.1,KN150205.1,KN150207.1,KN150208.1,KN150212.1,KN150213.1,KN150214.1,KN150216.1,KN150221.1,KN150229.1,KN150230.1,KN150232.1,KN150239.1,KN150240.1,KN150241.1,KN150251.1,KN150259.1,KN150262.1,KN150265.1,KN150267.1,KN150269.1,KN150272.1,KN150273.1,KN150277.1,KN150285.1,KN150305.1,KN150307.1,KN150311.1,KN150312.1,KN150314.1,KN150317.1,KN150320.1,
KN150322.1,KN150324.1,KN150326.1,KN150328.1,KN150332.1,KN150334.1,KN150335.1,KN150336.1,KN150339.1,KN150342.1,KN150345.1,KN150346.1,KN150348.1,KN150350.1,KN150351.1,KN150353.1,KN150355.1,KN150359.1,KN150361.1,KN150362.1,KN150365.1,KN150366.1,KN150371.1,KN150372.1,KN150379.1,KN150380.1,KN150383.1,KN150387.1,KN150390.1,KN150399.1,KN150400.1,KN150401.1,KN150402.1,KN150403.1,KN150405.1,KN150407.1,KN150411.1,KN150412.1,KN150415.1,KN150416.1,KN150424.1,KN150425.1,KN150432.1,KN150433.1,KN150435.1,KN150442.1,KN150447.1,KN150449.1,KN150451.1,KN150456.1,KN150470.1,KN150474.1,KN150475.1,KN150482.1,KN150487.1,KN150490.1,KN150491.1,KN150492.1,KN150505.1,KN150506.1,KN150508.1,KN150516.1,KN150518.1,KN150521.1,KN150527.1,KN150530.1,KN150531.1,KN150532.1,KN150541.1,KN150543.1,KN150544.1,KN150545.1,KN150550.1,KN150552.1,KN150561.1,KN150562.1,KN150564.1,KN150566.1,KN150568.1,KN150570.1,KN150572.1,KN150574.1,KN150576.1,KN150578.1,KN150589.1,KN150590.1,KN150596.1,KN150597.1,KN150600.1,KN150603.1,KN150605.
1,KN150608.1,KN150614.1,KN150616.1,KN150617.1,KN150620.1,KN150628.1,KN150630.1,KN150631.1,KN150635.1,KN150636.1,KN150637.1,KN150642.1,KN150647.1,KN150650.1,KN150653.1,KN150654.1,KN150663.1,KN150665.1,KN150666.1,KN150667.1,KN150670.1,KN150672.1,KN150674.1,KN150677.1,KN150680.1,KN150681.1,KN150683.1,KN150685.1,KN150691.1,KN150696.1,KN150698.1,KN150699.1,KN150700.1,KN150702.1,KN150703.1,KN150706.1,KN150708.1,KN150709.1"&VISIBLEPANEL=filterpanel</pre>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
            style="font-family:verdana,sans-serif">Hope that helps, or
            at least goes some way towards explaining the difference
            between Mart and your API script. I can't comment on whether
            either of these is the definitive gene count for zebrafish.</div>
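          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">For comparison with
            the Mart count, here is a minimal sketch of the same check
            on the API side. It tallies genes per coordinate system over
            all top-level sequences, so the chromosome and
            non-chromosome totals can be read off separately. It assumes
            the public Ensembl database server with anonymous access;
            the count_genes_by_coord_system helper and the --live switch
            are illustrative names, not part of the Ensembl API, and the
            output has not been checked against the totals in this
            thread.</div>
<pre>
```perl
use strict;
use warnings;

# Tally genes per coordinate system (e.g. chromosome, scaffold) for a
# list of slice objects. Hypothetical helper, not an Ensembl API call.
sub count_genes_by_coord_system {
    my ($slices) = @_;
    my %count;
    foreach my $slice (@$slices) {
        $count{ $slice->coord_system_name() } +=
            scalar @{ $slice->get_all_Genes() };
    }
    return \%count;
}

# Query the live database only when called with --live (an assumed
# switch for this sketch), so the helper can be reused without a network.
if ( @ARGV and $ARGV[0] eq '--live' ) {
    require Bio::EnsEMBL::Registry;
    Bio::EnsEMBL::Registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );
    my $slice_adaptor =
        Bio::EnsEMBL::Registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );

    # 'toplevel' covers chromosomes plus any unplaced sequences
    my $count =
        count_genes_by_coord_system( $slice_adaptor->fetch_all('toplevel') );

    my $total = 0;
    foreach my $cs ( sort keys %$count ) {
        print "$cs\t$count->{$cs}\n";
        $total += $count->{$cs};
    }
    print "gene total is\t$total\n";
}
```
</pre>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">If the non-chromosome
            figure from a run like this comes out at 565, the API and
            Mart totals agree; if not, the two are probably counting
            different sets of sequences.</div>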
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">regards</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">Dan</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 7 December 2015 at 02:17, john
samuel <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:john.samuel@senecacollege.ca" target="_blank">john.samuel@senecacollege.ca</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Thanks Dan.<br>
              I thought of that and tried the same code on all the
              scaffolds, in case some of them are unplaced, but the
              total for all scaffolds adds up to 31,501. That could be,
              as you said, all the genes mapped to chromosomes plus
              those on some unplaced scaffolds, but it doesn't match any
              of the other totals, so I'm no closer to knowing which
              total is correct.<br>
Any other thoughts?<span class="HOEnZb"><font
color="#888888"><br>
John</font></span>
<div>
<div class="h5"><br>
<br>
On 15-12-06 09:07 PM, Daniel Lawson wrote:
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default"
style="font-family:verdana,sans-serif">Hi John,</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">There may
be other sequences in the assembly that have not
been assigned to a chromosome. You can check
this via the API or in Mart. I expect you'll
find a bunch of small sequences that harbour
some genes - maybe that will get your totals to
balance.</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">cheers</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif">Dan</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 7 December 2015 at
01:59, john samuel <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:john.samuel@senecacollege.ca"
target="_blank">john.samuel@senecacollege.ca</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
                            I am trying to get an accurate count of all
                            the ENSDARG genes in the latest zebrafish
                            data (<span>GRCz10) in Ensembl.<br>
                            If I use the Perl API to get all the genes
                            on all the chromosomes, I get a total of
                            31,388, i.e.<br>
<br>
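                            use strict;<br>
                            use warnings;<br>
                            use Bio::EnsEMBL::Registry;<br>
                            <br>
                            # Registry setup is not shown in the original post; the public<br>
                            # Ensembl server with anonymous access is assumed here.<br>
                            my $registry = 'Bio::EnsEMBL::Registry';<br>
                            $registry->load_registry_from_db( -host => 'ensembldb.ensembl.org', -user => 'anonymous' );<br>
                            <br>
                            my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );<br>
                            my @slices = @{ $slice_adaptor->fetch_all('chromosome') };<br>
                            my $total = 0;<br>
                            my %all; # gene count per chromosome name<br>
                            foreach my $slice (@slices) {<br>
                            my $count = scalar @{ $slice->get_all_Genes() };<br>
                            $all{ $slice->seq_region_name() } = $count;<br>
                            $total += $count;<br>
                            }<br>
                            # Numeric sort; non-numeric names such as MT are treated as 0 and come first<br>
                            foreach my $sorted ( sort { ( $a =~ /^\d+$/ ? $a : 0 ) <=> ( $b =~ /^\d+$/ ? $b : 0 ) } keys %all ) {<br>
                            print "chromosome: $sorted\t$all{$sorted}\n";<br>
                            }<br>
                            print "gene total is\t$total\n";<br>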
<br>
chromosome: MT 37<br>
chromosome: 1 1386<br>
chromosome: 2 1587<br>
chromosome: 3 1611<br>
chromosome: 4 3103<br>
chromosome: 5 1704<br>
chromosome: 6 1280<br>
chromosome: 7 1507<br>
chromosome: 8 1216<br>
chromosome: 9 1108<br>
chromosome: 10 1108<br>
chromosome: 11 1039<br>
chromosome: 12 952<br>
chromosome: 13 1013<br>
chromosome: 14 953<br>
chromosome: 15 1146<br>
chromosome: 16 1241<br>
chromosome: 17 1048<br>
chromosome: 18 942<br>
chromosome: 19 1123<br>
chromosome: 20 1253<br>
chromosome: 21 1092<br>
chromosome: 22 1174<br>
chromosome: 23 1031<br>
chromosome: 24 800<br>
chromosome: 25 934<br>
gene total is 31388<br>
<br>
                            Does anyone see anything wrong with how I
                            get the total? I don't, but when I go to
                            BioMart (see below), I get a total of
                            31,953.<br>
<br>
</span><img
src="cid:part4.08010808.04040809@senecacollege.ca"
alt=""><br>
<span><br>
and if I go to the info page for the
genome at <a moz-do-not-send="true"
href="http://useast.ensembl.org/Danio_rerio/Info/Annotation"
target="_blank">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
I see a </span><span>different</span><span>
total there too (31,650 not counting
pseudogenes).<br>
<br>
</span><img
src="cid:part6.04070601.02050501@senecacollege.ca"
alt=""><br>
<span><br>
                            Does anyone have any idea why the totals
                            differ, which one to believe, and whether
                            there's anything wrong with using the one
                            that my code calculated as the definitive
                            one? I need to compare the total number of
                            genes with the number that we are finding
                            under certain conditions, to do some
                            stats.<span><font
color="#888888"><br>
John<br>
</font></span></span><br>
<br>
<br>
</div>
<br>
_______________________________________________<br>
Dev mailing list <a moz-do-not-send="true"
href="mailto:Dev@ensembl.org"
target="_blank">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe
info: <a moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev"
rel="noreferrer" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a moz-do-not-send="true"
href="http://www.ensembl.info/"
rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div>
<div dir="ltr">
<div>VectorBase | i5K insect genome initiative</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>