<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hi John,<br>
    <br>
    The zebrafish gene set contains 10 genes of biotype TEC (To be
    Experimentally Confirmed).<br>
    As such, they do not fit into any of the biotype categories we display
    on the annotation page (they are neither coding, non-coding, nor
    pseudogenes).<br>
    <br>
    The annotation page hence displays a count of 31,943 genes (25,642
    coding, 6,008 non-coding and 293 pseudogenes).<br>
    The BioMart page displays a count of 31,953 genes, which is the
    above 31,943 plus the 10 TEC genes.<br>
    <br>
    If you use the API with fetch_all('toplevel'), this will retrieve
    all chromosomes as well as all scaffolds not assembled into a chromosome.<br>
    Using this in place of fetch_all('chromosome') in the snippet of code
    that you included in your email should return 31,953 genes, the same
    as BioMart.<br>
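    <br>
    For illustration, here is a minimal sketch of that toplevel query,
    tallying genes by coordinate system and by biotype. It assumes the
    Ensembl Perl API is installed and that the public server
    ensembldb.ensembl.org is reachable as the anonymous user. The hash
    names are only illustrative. The registry loads whichever release
    matches the installed API, so the totals will only match the figures
    above for the release discussed here.<br>
    <pre>#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl MySQL server with anonymous login;
# change the host and user for a local mirror.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );

# Illustrative tallies: genes per coordinate system and per biotype.
my ( %by_coord_system, %by_biotype );
my $total = 0;

# 'toplevel' covers chromosomes plus scaffolds not assembled into a chromosome.
foreach my $slice ( @{ $slice_adaptor->fetch_all('toplevel') } ) {
    my $coord_system = $slice->coord_system_name();
    foreach my $gene ( @{ $slice->get_all_Genes() } ) {
        $by_coord_system{$coord_system}++;
        $by_biotype{ $gene->biotype() }++;
        $total++;
    }
}

print "$_\t$by_coord_system{$_}\n" for sort keys %by_coord_system;
print "$_\t$by_biotype{$_}\n"      for sort keys %by_biotype;
print "gene total is\t$total\n";
</pre>
    If the explanation above holds, the TEC genes should appear as their
    own row in the biotype tally, and the genes on scaffolds should
    account for the difference from the chromosome-only total.<br>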
    <br>
    <br>
    Hope that helps,<br>
    Magali<br>
    <br>
    <div class="moz-cite-prefix">On 07/12/2015 02:41, john samuel wrote:<br>
    </div>
    <blockquote cite="mid:5664F1C1.3030203@senecacollege.ca" type="cite">
      Thanks Dan.<br>
      I must admit I don't use biomart, so I don't know what it can do,
      but now that you've shown me I see what you mean.<br>
      That gives me a better idea about which total to use.<br>
      Maybe someone else can shed some light on why the total on the
      annotation page is different?<br>
      John<br>
      <br>
      On 15-12-06 09:24 PM, Daniel Lawson wrote:
      <blockquote
cite="mid:CAMwWv1wB_6ZwdkjSHo5cAAnmG51qapYUV2N9iehhwe2h1pnOqA@mail.gmail.com"
        type="cite">
        <div dir="ltr">
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">Hi John,</div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">You are missing 565
            loci, correct? (31,953 - 31,388 = 565.) </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">I opened Mart and, using
            the Region filter, selected all non-chromosome scaffolds. The
            'Gene' count for these is 565; see the image if that works on
            the email list, otherwise here is a URL for the Mart query.</div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">
            <pre style="color:rgb(0,0,0)"><a moz-do-not-send="true" href="http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id%7Cdrerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name">http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id|drerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name</a>."KN149679.1,KN149681.1,KN149682.1,KN149684.1,KN149686.1,KN149687.1,KN149688.1,KN149689.1,KN149690.1,KN149691.1,KN149694.1,KN149695.1,KN149696.1,KN149697.1,KN149698.1,KN149702.1,KN149704.1,KN149706.1,KN149707.1,KN149710.1,KN149711.1,KN149713.1,KN149715.1,KN149717.1,KN149719.1,KN149725.1,KN149727.1,KN149730.1,KN149731.1,KN149732.1,KN149734.1,KN149735.1,KN149739.1,KN149753.1,KN149755.1,KN149764.1,KN149765.1,KN149776.1,KN149779.1,KN149781.1,KN149782.1,KN149784.1,KN149787.1,KN149790.1,KN149795.1,KN149797.1,KN149798.1,KN149799.1,KN149803.1,KN149813.1,KN149816.1,KN149818.1,KN149829.1,KN149830.1,KN149831.1,KN149842.1,KN149843.1,KN149846.1,KN149847.1,KN149850.1,KN149855.1,KN149857.1,KN149858.1,KN149859.1,KN149861.1,KN149868.1,KN149874.1,KN149878.1,KN149880.1,KN149883.1,KN149884.1,KN149886.1,KN149894.1,KN149895.1,KN149896.1,KN149897.1,KN149900.1,KN149904.1,KN149906.1,KN149909.1,KN149910.1,KN149912.1,KN149914.1,KN149916.1,KN149917.1,KN149921.1,KN149923.1,KN149929.1,KN149930.1,KN149933.1,KN149934.1,KN149936.1,KN149939.1,KN149943.1,KN149945.1,KN149946.1,KN149947.1,KN149948.1,KN149951.1,KN149955.1,KN149959.1,KN149962.1,KN149964.1,KN149966.1,KN149968.1,KN149978.1,KN149986.1,KN149987.1,KN149989.1,KN149992.1,KN149995.1,KN149997.1,KN149998.1,KN150000.1,KN150001.1,KN150002.1,KN150003.1,KN150008.1,KN150013.1,KN150015.1,KN150027.1,KN150032.1,KN150038.1,KN150039.1,KN150040.1,KN150041.1,KN150042.1,KN150046.1,KN150051.1,KN150052.1,KN150056.1,KN150062.1,KN150064.1,KN150066.1,KN150067.1,KN150071.1,KN150072.1,KN150075.1,KN150079.1,KN150080.1,KN150084.1,KN150086.1,KN150088.1,KN150090.1,KN150096.1,KN150099.1,KN150102.1,KN150104.1,KN150108.1,KN150109.1,KN150112.1,KN150115.1,KN150120.1,KN150125.1,KN150127.1,KN150128.1,KN150131.1,KN150137.1,KN150141.1,KN150142.1,KN150148.1,KN150156.1,KN150158.1,KN150162.1,KN150164.1,KN150165.1,KN150168.1,KN150169.1,KN150170.1,KN150171.1,KN150172.1,KN150173.1,KN150176.1,KN150177.1,KN150178.1,KN150188.1,KN150189.1,KN150193.1,KN150196.1,KN150199.1,KN150205.1,KN150207.1,KN150208.1,KN150212.1,KN150213.1,KN150214.1,KN150216.1,KN150221.1,KN150229.1,KN150230.1,KN150232.1,KN150239.1,KN150240.1,KN150241.1,KN150251.1,KN150259.1,KN150262.1,KN150265.1,KN150267.1,KN150269.1,KN150272.1,KN150273.1,KN150277.1,KN150285.1,KN150305.1,KN150307.1,KN150311.1,KN150312.1,KN150314.1,KN150317.1,KN150320.1,KN150322.1,KN150324.1,KN150326.1,KN150328.1,KN150332.1,KN150334.1,KN150335.1,KN150336.1,KN150339.1,KN150342.1,KN150345.1,KN150346.1,KN150348.1,KN150350.1,KN150351.1,KN150353.1,KN150355.1,KN150359.1,KN150361.1,KN150362.1,KN150365.1,KN150366.1,KN150371.1,KN150372.1,KN150379.1,KN150380.1,KN150383.1,KN150387.1,KN150390.1,KN150399.1,KN150400.1,KN150401.1,KN150402.1,KN150403.1,KN150405.1,KN150407.1,KN150411.1,KN150412.1,KN150415.1,KN150416.1,KN150424.1,KN150425.1,KN150432.1,KN150433.1,KN150435.1,KN150442.1,KN150447.1,KN150449.1,KN150451.1,KN150456.1,KN150470.1,KN150474.1,KN150475.1,KN150482.1,KN150487.1,KN150490.1,KN150491.1,KN150492.1,KN150505.1,KN150506.1,KN150508.1,KN150516.1,KN150518.1,KN150521.1,KN150527.1,KN150530.1,KN150531.1,KN150532.1,KN150541.1,KN150543.1,KN150544.1,KN150545.1,KN150550.1,KN150552.1,KN150561.1,KN150562.1,KN150564.1,KN150566.1,KN150568.1,KN150570.1,KN150572.1,KN150574.1,KN150576.1,KN150578.1,KN150589.1,KN150590.1,KN150596.1,KN150597.1,KN150600.1,KN150603.1,KN150605.1,KN150608.1,KN150614.1,KN150616.1,KN150617.1,KN150620.1,KN150628.1,KN150630.1,KN150631.1,KN150635.1,KN150636.1,KN150637.1,KN150642.1,KN150647.1,KN150650.1,KN150653.1,KN150654.1,KN150663.1,KN150665.1,KN150666.1,KN150667.1,KN150670.1,KN150672.1,KN150674.1,KN150677.1,KN150680.1,KN150681.1,KN150683.1,KN150685.1,KN150691.1,KN150696.1,KN150698.1,KN150699.1,KN150700.1,KN150702.1,KN150703.1,KN150706.1,KN150708.1,KN150709.1"&VISIBLEPANEL=filterpanel</pre>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">Hope that helps, or at
            least goes some way to explaining the difference between Mart
            and your API script. I can't comment on whether either of
            these is the definitive gene count for zebrafish.</div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">regards</div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif">Dan</div>
          <div class="gmail_default"
            style="font-family:verdana,sans-serif"><br>
          </div>
        </div>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On 7 December 2015 at 02:17, john
            samuel <span dir="ltr"><<a moz-do-not-send="true"
                href="mailto:john.samuel@senecacollege.ca"
                target="_blank">john.samuel@senecacollege.ca</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"> Thanks Dan.<br>
                I thought of that, and I tried the same code but looking
                for genes on all the scaffolds, thinking that there
                might be some unplaced scaffolds. The total for all
                scaffolds adds up to 31,501.  This could be, as you
                said, all the genes mapped to chromosomes plus some
                on unplaced scaffolds, but it doesn't match any of the
                other totals, so I'm no closer to knowing which total is
                correct.<br>
                Any other thoughts?<span class="HOEnZb"><font
                    color="#888888"><br>
                    John</font></span>
                <div>
                  <div class="h5"><br>
                    <br>
                    On 15-12-06 09:07 PM, Daniel Lawson wrote:
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif">Hi
                          John,</div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif"><br>
                        </div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif">There
                          may be other sequences in the assembly that
                          have not been assigned to a chromosome. You
                          can check this via the API or in Mart. I
                          expect you'll find a bunch of small sequences
                          that harbour some genes - maybe that will get
                          your totals to balance.</div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif"><br>
                        </div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif">cheers</div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif">Dan</div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif"><br>
                        </div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif"><br>
                        </div>
                        <div class="gmail_default"
                          style="font-family:verdana,sans-serif"><br>
                        </div>
                      </div>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On 7 December 2015 at
                          01:59, john samuel <span dir="ltr"><<a
                              moz-do-not-send="true"
                              href="mailto:john.samuel@senecacollege.ca"
                              target="_blank">john.samuel@senecacollege.ca</a>></span>
                          wrote:<br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
                              I am trying to get an accurate count of
                              all the ENSDARG genes in the latest
                              zebrafish data (<span>GRCz10) in Ensembl.<br>
                                If I use the Perl API to get all the
                                genes on all the chromosomes, I get a
                                total of 31,388, i.e.<br>
                                <br>
                                my $slice_adaptor =
                                $registry->get_adaptor(
                                'danio_rerio', 'Core', 'Slice' );<br>
                                my @slices = @{
                                $slice_adaptor->fetch_all('chromosome')
                                };<br>
                                my $total = 0;<br>
                                my %all;<br>
                                foreach my $slice (@slices) {<br>
                                    my @genes = @{
                                $slice->get_all_Genes() };<br>
                                    my $count = scalar @genes;<br>
                                   
                                $all{$slice->seq_region_name()}=$count;<br>
                                    $total += $count;<br>
                                }<br>
                                foreach my $sorted (sort {$a<=>$b}
                                keys %all) {<br>
                                    print "chromosome:
                                $sorted\t$all{$sorted}\n";<br>
                                }<br>
                                print "gene total is\t$total\n";<br>
                                <br>
                                chromosome: MT    37<br>
                                chromosome: 1    1386<br>
                                chromosome: 2    1587<br>
                                chromosome: 3    1611<br>
                                chromosome: 4    3103<br>
                                chromosome: 5    1704<br>
                                chromosome: 6    1280<br>
                                chromosome: 7    1507<br>
                                chromosome: 8    1216<br>
                                chromosome: 9    1108<br>
                                chromosome: 10    1108<br>
                                chromosome: 11    1039<br>
                                chromosome: 12    952<br>
                                chromosome: 13    1013<br>
                                chromosome: 14    953<br>
                                chromosome: 15    1146<br>
                                chromosome: 16    1241<br>
                                chromosome: 17    1048<br>
                                chromosome: 18    942<br>
                                chromosome: 19    1123<br>
                                chromosome: 20    1253<br>
                                chromosome: 21    1092<br>
                                chromosome: 22    1174<br>
                                chromosome: 23    1031<br>
                                chromosome: 24    800<br>
                                chromosome: 25    934<br>
                                gene total is    31388<br>
                                <br>
                                Does anyone see anything wrong with how I get
                                the total?  I don't, but when I go
                                to BioMart (see below), I get a total of
                                31,953.<br>
                                <br>
                              </span><img
                                src="cid:part4.05020703.06090504@ebi.ac.uk"
                                alt=""><br>
                              <span><br>
                                and if I go to the info page for the
                                genome at <a moz-do-not-send="true"
                                  href="http://useast.ensembl.org/Danio_rerio/Info/Annotation"
                                  target="_blank">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
                                I see a </span><span>different</span><span>
                                total there too (31,650 not counting
                                pseudogenes).<br>
                                <br>
                              </span><img
                                src="cid:part6.05090803.09080006@ebi.ac.uk"
                                alt=""><br>
                              <span><br>
                                Does anyone have any idea why the totals
                                differ, which one to believe, and
                                whether there's anything wrong with
                                using the one that my code calculated as
                                the definitive one?  I need to compare
                                the total number of genes with the number
                                that we are finding under certain
                                conditions, to do some stats.<span><font
                                    color="#888888"><br>
                                    John<br>
                                  </font></span></span><br>
                              <br>
                              <br>
                            </div>
                            <br>
_______________________________________________<br>
                            Dev mailing list    <a
                              moz-do-not-send="true"
                              href="mailto:Dev@ensembl.org"
                              target="_blank">Dev@ensembl.org</a><br>
                            Posting guidelines and subscribe/unsubscribe
                            info: <a moz-do-not-send="true"
                              href="http://lists.ensembl.org/mailman/listinfo/dev"
                              rel="noreferrer" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
                            Ensembl Blog: <a moz-do-not-send="true"
                              href="http://www.ensembl.info/"
                              rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
                            <br>
                          </blockquote>
                        </div>
                        <br>
                        <br clear="all">
                        <div><br>
                        </div>
                        -- <br>
                        <div>
                          <div dir="ltr">
                            <div>VectorBase | i5K insect genome
                              initiative</div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
          <br clear="all">
          <div><br>
          </div>
          -- <br>
          <div class="gmail_signature">
            <div dir="ltr">
              <div>VectorBase | i5K insect genome initiative</div>
            </div>
          </div>
        </div>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>