<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Thanks Dan.<br>
    I must admit I don't use BioMart, so I didn't know what it could do,
    but now that you've shown me, I see what you mean.<br>
    That gives me a better idea about which total to use.<br>
    Maybe someone else can shed some light on why the total on the
    annotation page is different?<br>
    John<br>
    <br>
    On 15-12-06 09:24 PM, Daniel Lawson wrote:
    <blockquote
cite="mid:CAMwWv1wB_6ZwdkjSHo5cAAnmG51qapYUV2N9iehhwe2h1pnOqA@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=ISO-8859-1">
      <div dir="ltr">
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">Hi John,</div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">You are missing 565
          loci, correct? (31953 - 31388 = 565.) </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">I opened Mart and used
          the Region filter to select all non-chromosome scaffolds. The
          'Gene' count for these is 565. See the image, if that works on
          the email list; otherwise, the URL for the Mart query is
          below.</div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">
          <pre style="color:rgb(0,0,0)"><a moz-do-not-send="true" href="http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id%7Cdrerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name">http://www.ensembl.org/biomart/martview/68b3c7a216540966e2b2f569b596e7e6?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.feature_page.ensembl_gene_id|drerio_gene_ensembl.default.feature_page.ensembl_transcript_id&FILTERS=drerio_gene_ensembl.default.filters.chromosome_name</a>."KN149679.1,KN149681.1,KN149682.1,KN149684.1,KN149686.1,KN149687.1,KN149688.1,KN149689.1,KN149690.1,KN149691.1,KN149694.1,KN149695.1,KN149696.1,KN149697.1,KN149698.1,KN149702.1,KN149704.1,KN149706.1,KN149707.1,KN149710.1,KN149711.1,KN149713.1,KN149715.1,KN149717.1,KN149719.1,KN149725.1,KN149727.1,KN149730.1,KN149731.1,KN14
9732.1,KN149734.1,KN149735.1,KN149739.1,KN149753.1,KN149755.1,KN149764.1,KN149765.1,KN149776.1,KN149779.1,KN149781.1,KN149782.1,KN149784.1,KN149787.1,KN149790.1,KN149795.1,KN149797.1,KN149798.1,KN149799.1,KN149803.1,KN149813.1,KN149816.1,KN149818.1,KN149829.1,KN149830.1,KN149831.1,KN149842.1,KN149843.1,KN149846.1,KN149847.1,KN149850.1,KN149855.1,KN149857.1,KN149858.1,KN149859.1,KN149861.1,KN149868.1,KN149874.1,KN149878.1,KN149880.1,KN149883.1,KN149884.1,KN149886.1,KN149894.1,KN149895.1,KN149896.1,KN149897.1,KN149900.1,KN149904.1,KN149906.1,KN149909.1,KN149910.1,KN149912.1,KN149914.1,KN149916.1,KN149917.1,KN149921.1,KN149923.1,KN149929.1,KN149930.1,KN149933.1,KN149934.1,KN149936.1,KN149939.1,KN149943.1,KN149945.1,KN149946.1,KN149947.1,KN149948.1,KN149951.1,KN149955.1,KN149959.1,KN149962.1,KN149964.1,KN149966.1,KN149968.1,KN149978.1,KN149986.1,KN149987.1,KN149989.1,KN149992.1,KN149995.1,KN149997.1,KN149998.1,KN150000.1,KN150001.1,KN150002.1,KN150003.1,KN150008.1,KN150013.1,KN150015.1,KN
150027.1,KN150032.1,KN150038.1,KN150039.1,KN150040.1,KN150041.1,KN150042.1,KN150046.1,KN150051.1,KN150052.1,KN150056.1,KN150062.1,KN150064.1,KN150066.1,KN150067.1,KN150071.1,KN150072.1,KN150075.1,KN150079.1,KN150080.1,KN150084.1,KN150086.1,KN150088.1,KN150090.1,KN150096.1,KN150099.1,KN150102.1,KN150104.1,KN150108.1,KN150109.1,KN150112.1,KN150115.1,KN150120.1,KN150125.1,KN150127.1,KN150128.1,KN150131.1,KN150137.1,KN150141.1,KN150142.1,KN150148.1,KN150156.1,KN150158.1,KN150162.1,KN150164.1,KN150165.1,KN150168.1,KN150169.1,KN150170.1,KN150171.1,KN150172.1,KN150173.1,KN150176.1,KN150177.1,KN150178.1,KN150188.1,KN150189.1,KN150193.1,KN150196.1,KN150199.1,KN150205.1,KN150207.1,KN150208.1,KN150212.1,KN150213.1,KN150214.1,KN150216.1,KN150221.1,KN150229.1,KN150230.1,KN150232.1,KN150239.1,KN150240.1,KN150241.1,KN150251.1,KN150259.1,KN150262.1,KN150265.1,KN150267.1,KN150269.1,KN150272.1,KN150273.1,KN150277.1,KN150285.1,KN150305.1,KN150307.1,KN150311.1,KN150312.1,KN150314.1,KN150317.1,KN150320.1,
KN150322.1,KN150324.1,KN150326.1,KN150328.1,KN150332.1,KN150334.1,KN150335.1,KN150336.1,KN150339.1,KN150342.1,KN150345.1,KN150346.1,KN150348.1,KN150350.1,KN150351.1,KN150353.1,KN150355.1,KN150359.1,KN150361.1,KN150362.1,KN150365.1,KN150366.1,KN150371.1,KN150372.1,KN150379.1,KN150380.1,KN150383.1,KN150387.1,KN150390.1,KN150399.1,KN150400.1,KN150401.1,KN150402.1,KN150403.1,KN150405.1,KN150407.1,KN150411.1,KN150412.1,KN150415.1,KN150416.1,KN150424.1,KN150425.1,KN150432.1,KN150433.1,KN150435.1,KN150442.1,KN150447.1,KN150449.1,KN150451.1,KN150456.1,KN150470.1,KN150474.1,KN150475.1,KN150482.1,KN150487.1,KN150490.1,KN150491.1,KN150492.1,KN150505.1,KN150506.1,KN150508.1,KN150516.1,KN150518.1,KN150521.1,KN150527.1,KN150530.1,KN150531.1,KN150532.1,KN150541.1,KN150543.1,KN150544.1,KN150545.1,KN150550.1,KN150552.1,KN150561.1,KN150562.1,KN150564.1,KN150566.1,KN150568.1,KN150570.1,KN150572.1,KN150574.1,KN150576.1,KN150578.1,KN150589.1,KN150590.1,KN150596.1,KN150597.1,KN150600.1,KN150603.1,KN150605.
1,KN150608.1,KN150614.1,KN150616.1,KN150617.1,KN150620.1,KN150628.1,KN150630.1,KN150631.1,KN150635.1,KN150636.1,KN150637.1,KN150642.1,KN150647.1,KN150650.1,KN150653.1,KN150654.1,KN150663.1,KN150665.1,KN150666.1,KN150667.1,KN150670.1,KN150672.1,KN150674.1,KN150677.1,KN150680.1,KN150681.1,KN150683.1,KN150685.1,KN150691.1,KN150696.1,KN150698.1,KN150699.1,KN150700.1,KN150702.1,KN150703.1,KN150706.1,KN150708.1,KN150709.1"&VISIBLEPANEL=filterpanel</pre>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">Hope that goes
          some way to explaining the difference between Mart and your
          API script. I can't comment on whether either of these
          is the definitive gene count for zebrafish.</div>
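        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">For reference, the
          totals quoted in this thread can be lined up against the
          chromosome-only API count. This is only a minimal sketch: the
          labels are informal and the numbers are copied from the
          messages below, not recomputed.</div>
        <pre>use strict;
use warnings;

# Totals quoted in this thread for GRCz10; the labels are informal.
my @totals = (
    [ 'Perl API, chromosomes only',      31388 ],
    [ 'Perl API, all scaffolds',         31501 ],
    [ 'Annotation page, no pseudogenes', 31650 ],
    [ 'BioMart',                         31953 ],
);

# Gap between each figure and the chromosome-only API count.
my $baseline = $totals[0][1];
my %gap;
foreach my $entry (@totals) {
    my ( $source, $count ) = @{$entry};
    $gap{$source} = $count - $baseline;
    printf "%-32s %6d  (+%d)\n", $source, $count, $gap{$source};
}
</pre>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">The BioMart gap of 565
          matches the gene count on the non-chromosome scaffolds; the
          other two gaps (113 and 262) are not explained in this
          thread.</div>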
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">regards</div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif">Dan</div>
        <div class="gmail_default"
          style="font-family:verdana,sans-serif"><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 7 December 2015 at 02:17, john
          samuel <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:john.samuel@senecacollege.ca" target="_blank">john.samuel@senecacollege.ca</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF"> Thanks Dan.<br>
              I thought of that and ran the same code over all the
              scaffolds, in case some were unplaced, but the total for
              all scaffolds comes to 31,501.  That could be, as you
              said, all the genes mapped to chromosomes plus those on
              unplaced scaffolds, but it doesn't match any of the other
              totals, so I'm no closer to knowing which total is
              correct.<br>
              Any other thoughts?<span class="HOEnZb"><font
                  color="#888888"><br>
                  John</font></span>
              <div>
                <div class="h5"><br>
                  <br>
                  On 15-12-06 09:07 PM, Daniel Lawson wrote:
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif">Hi John,</div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif">There may
                        be other sequences in the assembly that have not
                        been assigned to a chromosome. You can check
                        this via the API or in Mart. I expect you'll
                        find a bunch of small sequences that harbour
                        some genes - maybe that will get your totals to
                        balance.</div>
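                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif">A minimal
                        sketch of the API route, not code from this
                        thread. It assumes the Ensembl Perl API is
                        installed and that the public server
                        ensembldb.ensembl.org is used. Fetching
                        'toplevel' slices instead of 'chromosome' slices
                        should also return any unplaced scaffolds, so
                        the gene counts can be grouped by coordinate
                        system. load_registry_from_db picks databases
                        matching the installed API version, so the
                        numbers need not match the GRCz10 totals quoted
                        here.</div>
                      <pre>use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: anonymous access to the public Ensembl MySQL server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );

# 'toplevel' covers chromosomes plus sequences not assigned to one.
my %genes_by_type;
foreach my $slice ( @{ $slice_adaptor->fetch_all('toplevel') } ) {
    my $type = $slice->coord_system_name();
    $genes_by_type{$type} += scalar @{ $slice->get_all_Genes() };
}

# Report one line per coordinate system, then the overall total.
my $total = 0;
foreach my $type ( sort keys %genes_by_type ) {
    print "$type\t$genes_by_type{$type}\n";
    $total += $genes_by_type{$type};
}
print "toplevel total\t$total\n";
</pre>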
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif">cheers</div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif">Dan</div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                      <div class="gmail_default"
                        style="font-family:verdana,sans-serif"><br>
                      </div>
                    </div>
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">On 7 December 2015 at
                        01:59, john samuel <span dir="ltr">&lt;<a
                            moz-do-not-send="true"
                            href="mailto:john.samuel@senecacollege.ca"
                            target="_blank">john.samuel@senecacollege.ca</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
                            I am trying to get an accurate count of all
                            the ENSDARG genes from the latest zebrafish
                            data (<span>GRCz10) in Ensembl.<br>
                              If I use the Perl API to get all the genes
                              on all the chromosomes, I get a total of
                              31,388:<br>
                              <br>
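                              use strict;<br>
                              use warnings;<br>
                              use Bio::EnsEMBL::Registry;<br>
                              <br>
                              # Assumption: public Ensembl server; the original post does not show how $registry was set up.<br>
                              my $registry = 'Bio::EnsEMBL::Registry';<br>
                              $registry->load_registry_from_db( -host => 'ensembldb.ensembl.org', -user => 'anonymous' );<br>
                              <br>
                              my $slice_adaptor = $registry->get_adaptor( 'danio_rerio', 'Core', 'Slice' );<br>
                              my @slices = @{ $slice_adaptor->fetch_all('chromosome') };<br>
                              my $total = 0;<br>
                              my %all;<br>
                              # Count the genes on each chromosome slice.<br>
                              foreach my $slice (@slices) {<br>
                              &nbsp;&nbsp;&nbsp; my @genes = @{ $slice->get_all_Genes() };<br>
                              &nbsp;&nbsp;&nbsp; my $count = scalar @genes;<br>
                              &nbsp;&nbsp;&nbsp; $all{$slice->seq_region_name()} = $count;<br>
                              &nbsp;&nbsp;&nbsp; $total += $count;<br>
                              }<br>
                              # Sort numerically; non-numeric names such as MT count as 0 and sort first.<br>
                              foreach my $sorted (sort { ($a =~ /^\d+$/ ? $a : 0) &lt;=&gt; ($b =~ /^\d+$/ ? $b : 0) } keys %all) {<br>
                              &nbsp;&nbsp;&nbsp; print "chromosome: $sorted\t$all{$sorted}\n";<br>
                              }<br>
                              print "gene total is\t$total\n";<br>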
                              <br>
                              chromosome: MT    37<br>
                              chromosome: 1    1386<br>
                              chromosome: 2    1587<br>
                              chromosome: 3    1611<br>
                              chromosome: 4    3103<br>
                              chromosome: 5    1704<br>
                              chromosome: 6    1280<br>
                              chromosome: 7    1507<br>
                              chromosome: 8    1216<br>
                              chromosome: 9    1108<br>
                              chromosome: 10    1108<br>
                              chromosome: 11    1039<br>
                              chromosome: 12    952<br>
                              chromosome: 13    1013<br>
                              chromosome: 14    953<br>
                              chromosome: 15    1146<br>
                              chromosome: 16    1241<br>
                              chromosome: 17    1048<br>
                              chromosome: 18    942<br>
                              chromosome: 19    1123<br>
                              chromosome: 20    1253<br>
                              chromosome: 21    1092<br>
                              chromosome: 22    1174<br>
                              chromosome: 23    1031<br>
                              chromosome: 24    800<br>
                              chromosome: 25    934<br>
                              gene total is    31388<br>
                              <br>
                              Does anyone see anything wrong with how I
                              get the total?  I don't, but when I go to
                              BioMart (see below), I get a total of
                              31,953.<br>
                              <br>
                            </span><img
                              src="cid:part4.08010808.04040809@senecacollege.ca"
                              alt=""><br>
                            <span><br>
                              and if I go to the info page for the
                              genome at <a moz-do-not-send="true"
                                href="http://useast.ensembl.org/Danio_rerio/Info/Annotation"
                                target="_blank">http://useast.ensembl.org/Danio_rerio/Info/Annotation</a>
                              I see a </span><span>different</span><span>
                              total there too (31,650 not counting
                              pseudogenes).<br>
                              <br>
                            </span><img
                              src="cid:part6.04070601.02050501@senecacollege.ca"
                              alt=""><br>
                            <span><br>
                              Does anyone have any idea why the totals
                              differ, which one to believe, and whether
                              there's anything wrong with using the one
                              my code calculated as the definitive
                              one?  I need to compare the total number
                              of genes with the number that we are
                              finding under certain conditions, to do
                              some stats.<span><font
                                  color="#888888"><br>
                                  John<br>
                                </font></span></span><br>
                            <br>
                            <br>
                          </div>
                          <br>
_______________________________________________<br>
                          Dev mailing list    <a moz-do-not-send="true"
                            href="mailto:Dev@ensembl.org"
                            target="_blank">Dev@ensembl.org</a><br>
                          Posting guidelines and subscribe/unsubscribe
                          info: <a moz-do-not-send="true"
                            href="http://lists.ensembl.org/mailman/listinfo/dev"
                            rel="noreferrer" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
                          Ensembl Blog: <a moz-do-not-send="true"
                            href="http://www.ensembl.info/"
                            rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>
                          <br>
                        </blockquote>
                      </div>
                      <br>
                      <br clear="all">
                      <div><br>
                      </div>
                      -- <br>
                      <div>
                        <div dir="ltr">
                          <div>VectorBase | i5K insect genome initiative</div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        <div class="gmail_signature">
          <div dir="ltr">
            <div>VectorBase | i5K insect genome initiative</div>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>