<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Duarte,<div class=""><br class=""></div><div class="">No, we are not saying there are two possible canonical transcripts because of their curated/predicted status.<br class=""><div class=""><br class=""></div><div class="">I did a quick search and found a relevant bit of information on UCSC's genome mailing list. The knownCanonical table is populated by UCSC [1] and not by RefSeq. The rules Ensembl has used to select a canonical transcript from our own gene set [2] and the rules UCSC [3] has used to select from the RefSeq set are not the same.</div><div class=""><br class=""></div><div class="">Neither Ensembl nor UCSC claims this is a canonical transcript assigned by RefSeq. In both cases, it is the application of each group's own rules to an externally imported gene set.</div><div class=""><br class=""></div><div class="">Andy</div><div class=""><br class=""></div><div class="">1 - <a href="https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/_6asF5KciPc/ANihqywjAwAJ" class="">https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/_6asF5KciPc/ANihqywjAwAJ</a></div><div class="">2 - <a href="https://github.com/Ensembl/ensembl/blob/release/85/modules/Bio/EnsEMBL/Utils/TranscriptSelector.pm#L46" class="">https://github.com/Ensembl/ensembl/blob/release/85/modules/Bio/EnsEMBL/Utils/TranscriptSelector.pm#L46</a></div><div class="">3 - <a href="http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene" class="">http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene</a></div><div class=""><br class=""></div><div class=""><div apple-content-edited="true" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">------------<br class="">Andrew Yates - Genomics Technology Infrastructure Team Leader<br class="">The European Bioinformatics Institute (EMBL-EBI)<br class="">Wellcome Genome Campus<br class="">Hinxton, Cambridge<br class="">CB10 1SD, United Kingdom<br class="">Tel: +44-(0)1223-492538<br class="">Fax: +44-(0)1223-494468<br class="">Skype: andy.yates.ebi</div><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><a href="http://www.ebi.ac.uk/" class="">http://www.ebi.ac.uk/</a><br class="">http://www.ensembl.org/</div></div></div></div></div>
</div>
<br class=""><div><blockquote type="cite" class=""><div class="">On 26 Jul 2016, at 16:44, Duarte Molha <<a href="mailto:duartemolha@gmail.com" class="">duartemolha@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">Now I am really confused!</div><div class=""><br class=""></div><div class=""><div class="">Even the UCSC tables link <span style="font-size:12.8px" class="">NM_003036.3 as the canonical transcript. Does this mean there can be two possible canonical transcripts, one for curated annotations and one for predicted?</span></div></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><span style="font-size:12.8px" class="">Here is the linkage to RefSeq transcripts for the knownCanonical table:</span></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><pre style="word-wrap: break-word; white-space: pre-wrap;" class="">#filter: kgXref.geneSymbol = 'SKI'
#hg19.knownCanonical.chrom      hg19.knownCanonical.chromStart  hg19.knownCanonical.chromEnd    hg19.knownCanonical.clusterId   hg19.knownCanonical.transcript  hg19.knownCanonical.protein     hg19.kgXref.geneSymbol  hg19.kgXref.refseq      hg19.kgXref.protAcc     hg19.kgXref.description
chr1    2160133 2241652 98      uc001aja.4      uc001aja.4      SKI     NM_003036       NP_003027       Homo sapiens v-ski sarcoma viral oncogene homolog (avian) (SKI), mRNA.</pre></div><div class="gmail_extra">
<br class=""><div class="gmail_quote">On 26 July 2016 at 16:06, mag <span dir="ltr" class=""><<a href="mailto:mr6@ebi.ac.uk" target="_blank" class="">mr6@ebi.ac.uk</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" class="">
    Hi Duarte,<br class="">
    <br class="">
    A canonical transcript is usually the transcript with the longest
    translation for a given gene:<br class="">
    <a href="http://www.ensembl.org/Help/Glossary?id=346" target="_blank" class="">http://www.ensembl.org/Help/Glossary?id=346</a><br class="">
    <br class="">
    In your example, XP_005244832.1 has a translation of 730 aa, while
    NP_003027.1 has only 728.<br class="">
    Hence, the transcript linked to XP_005244832.1 is chosen as the canonical transcript.<br class="">
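    <br class="">
    As a rough illustration of the longest-translation rule of thumb only (this is not the actual Ensembl selection code, which applies further criteria), here is a minimal Python sketch. The record layout and the function name pick_longest_translation are assumptions made for this example; the two lengths are the ones quoted for SKI.<br class="">
<pre class="">
```python
# Hypothetical records: (protein accession, translation length in aa).
# The two lengths are the ones quoted in this thread for SKI.
translations = [
    ("NP_003027.1", 728),
    ("XP_005244832.1", 730),
]


def pick_longest_translation(records):
    """Return the record with the longest translation.

    This models only the 'longest translation' rule of thumb. On a tie,
    max() keeps the first record seen, which is an arbitrary choice made
    for this sketch and not a documented Ensembl tie-break.
    """
    return max(records, key=lambda record: record[1])


accession, length = pick_longest_translation(translations)
print(accession, length)
```
</pre>
    With the two records above, the 730 aa translation wins by two residues, which is why the predicted accession ends up attached to the canonical transcript.<br class="">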
    <br class="">
    As Kieron mentioned, if you specifically want the curated RefSeq
    annotation, it might be better to fetch all external annotations and
    then keep only the ones you are interested in.<br class="">
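    <br class="">
    One way to sketch the "fetch everything, then filter" idea, shown here in Python and not with the Ensembl API: RefSeq accessions starting with NM_, NR_ or NP_ are conventionally the curated records, while XM_, XR_ and XP_ mark predicted models. The function name curated_refseq_only and the input list are hypothetical; the list stands in for whatever external references your script retrieves for a transcript.<br class="">
<pre class="">
```python
# RefSeq accession prefixes: NM_/NR_/NP_ are conventionally curated records,
# XM_/XR_/XP_ are predicted (model) records.
CURATED_PREFIXES = ("NM_", "NR_", "NP_")


def curated_refseq_only(accessions):
    """Keep only accessions whose prefix marks a curated RefSeq record."""
    return [acc for acc in accessions if acc.startswith(CURATED_PREFIXES)]


# Hypothetical list standing in for all external references of one transcript.
xrefs = ["XP_005244832.1", "NM_003036.3", "NP_003027.1"]
print(curated_refseq_only(xrefs))
```
</pre>
    Filtering on the accession prefix keeps the sketch independent of how the source database is named in any particular release.<br class="">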
    <br class="">
    <br class="">
    Regards,<br class="">
    Magali<div class=""><div class="gmail-h5"><br class="">
    <br class="">
    <div class="">On 25/07/2016 17:07, Duarte Molha
      wrote:<br class="">
    </div>
    <blockquote type="cite" class="">
      <div dir="ltr" class="">I will try to post the relevant parts of
        the script here.
        <div class=""><br class="">
        </div>
        <div class="">But I am still at a loss as to why <a href="http://www.ncbi.nlm.nih.gov/protein/XP_005244832.1" style="font-size:12.8px" target="_blank" class="">XP_005244832.1</a> has
          been tagged as canonical.</div>
        <div class=""><br class="">
        </div>
        <div class="">What you are saying is that I simply might not have
          cycled through all of the RefSeq transcripts... but is there
          going to be more than one RefSeq transcript tagged as
          canonical for each gene?</div>
        <div class=""><br class="">
        </div>
        <div class="">Not sure I follow!</div>
        <div class=""><br class="">
        </div>
        <div class="">Thanks</div>
        <div class=""><br class="">
          Duarte</div>
        <div class=""><br class="">
        </div>
        <div class=""><br class="">
        </div>
        <div class=""><br class="">
        </div>
      </div>
      <div class="gmail_extra"><br clear="all" class="">
        <br class="">
        <div class="gmail_quote">On 25 July 2016 at 11:58, Kieron Taylor
          <span dir="ltr" class=""><<a href="mailto:ktaylor@ebi.ac.uk" target="_blank" class="">ktaylor@ebi.ac.uk</a>></span>
          wrote:<br class="">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi Duarte,<br class="">
            <br class="">
            Can you send us a snippet of code that accesses the external
            database adaptor (DBEntryAdaptor?). It sounds like you may
            not be reading enough of your results to get the RefSeq ID
            you expect. We have all of the RefSeq IDs you mention
            associated at some level to the transcript, but some are
            from "RefSeq peptide predicted" for example.<br class="">
            <br class="">
            Kieron<br class="">
            <br class="">
            <br class="">
            <br class="">
            Kieron Taylor PhD.<br class="">
            Ensembl Developer<br class="">
            <br class="">
            EMBL, European Bioinformatics Institute<br class="">
            <div class="">
              <div class=""><br class="">
                <br class="">
                <br class="">
                <br class="">
                <br class="">
                <br class="">
                > On 22 Jul 2016, at 10:47, Duarte Molha <<a href="mailto:duartemolha@gmail.com" target="_blank" class="">duartemolha@gmail.com</a>>
                wrote:<br class="">
                ><br class="">
                > Hi Guys<br class="">
                ><br class="">
                > I have a script that, based on a gene symbol,
                connects to Ensembl and retrieves the canonical
                transcript, and then does the same using the external
                database adaptor to get the canonical RefSeq transcript.<br class="">
                ><br class="">
                > However this does not seem to give me the correct
                result<br class="">
                ><br class="">
                > Take, for example, the gene SKI (I am using the GRCh37
                assembly, btw)<br class="">
                ><br class="">
                > If you open this gene on the Ensembl browser:<br class="">
                ><br class="">
                > <a href="http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343" rel="noreferrer" target="_blank" class="">http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343</a><br class="">
                ><br class="">
                ><br class="">
                > For SKI, Ensembl annotates ENST00000378536 as the
                canonical transcript.<br class="">
                ><br class="">
                > However, using my script, the external database
                adaptor returns the RefSeq XP_005244832.1 as the RefSeq
                canonical transcript, even though the correct canonical
                transcript is NM_003036.3<br class="">
                ><br class="">
                > <a href="http://www.ncbi.nlm.nih.gov/gene/6497" rel="noreferrer" target="_blank" class="">http://www.ncbi.nlm.nih.gov/gene/6497</a><br class="">
                ><br class="">
                > Unless I am understanding this incorrectly, if the
                coding region is the same length in two transcripts, the
                longest should be the canonical<br class="">
                ><br class="">
                > The longer RefSeq is NM_003036.3 (it has a longer
                5' UTR)<br class="">
                ><br class="">
                > Can you help me understand this?<br class="">
                ><br class="">
                > Many thanks<br class="">
                ><br class="">
                > Duarte<br class="">
              </div>
            </div>
            > _______________________________________________<br class="">
            > Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank" class="">Dev@ensembl.org</a><br class="">
            > Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" rel="noreferrer" target="_blank" class="">http://lists.ensembl.org/mailman/listinfo/dev</a><br class="">
            > Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank" class="">http://www.ensembl.info/</a><br class="">
            <br class="">
            <br class="">
          </blockquote>
        </div>
        <br class="">
      </div>
      <br class="">
    </blockquote>
    <br class="">
  </div></div></div>

<br class=""></blockquote></div><br class=""></div></div>
</div></blockquote></div><br class=""></div></div></body></html>