<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Duarte<div class=""><br class=""></div><div class="">No, we are not saying there are two possible canonical transcripts because of their curated/predicted status.<br class=""><div class=""><br class=""></div><div class="">I did a quick search and found a relevant bit of information on UCSC's genome mailing list. The knownCanonical table is populated by UCSC [1] and not by RefSeq. The rules Ensembl has used to select a canonical transcript from our own gene set [2] and the rules UCSC [3] has used to select from the RefSeq set are not the same.</div><div class=""><br class=""></div><div class="">Neither Ensembl nor UCSC claims this is a canonical transcript assigned by RefSeq. In both cases, it is each group applying its own rules to an externally imported gene set.</div><div class=""><br class=""></div><div class="">Andy</div><div class=""><br class=""></div><div class="">1 - <a href="https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/_6asF5KciPc/ANihqywjAwAJ" class="">https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/_6asF5KciPc/ANihqywjAwAJ</a></div><div class="">2 - <a href="https://github.com/Ensembl/ensembl/blob/release/85/modules/Bio/EnsEMBL/Utils/TranscriptSelector.pm#L46" class="">https://github.com/Ensembl/ensembl/blob/release/85/modules/Bio/EnsEMBL/Utils/TranscriptSelector.pm#L46</a></div><div class="">3 - <a href="http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene" class="">http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene</a></div><div class=""><br class=""></div><div class=""><div apple-content-edited="true" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">------------<br class="">Andrew Yates - Genomics Technology Infrastructure Team Leader<br class="">The European Bioinformatics Institute (EMBL-EBI)<br class="">Wellcome Genome Campus<br class="">Hinxton, Cambridge<br class="">CB10 1SD, United Kingdom<br class="">Tel: +44-(0)1223-492538<br class="">Fax: +44-(0)1223-494468<br class="">Skype: andy.yates.ebi</div><div style="orphans: auto; text-align: start; text-indent: 0px; widows: auto; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><a href="http://www.ebi.ac.uk/" class="">http://www.ebi.ac.uk/</a><br class="">http://www.ensembl.org/</div></div></div></div></div>
</div>
<br class=""><div><blockquote type="cite" class=""><div class="">On 26 Jul 2016, at 16:44, Duarte Molha <<a href="mailto:duartemolha@gmail.com" class="">duartemolha@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">Now I am really confused !</div><div class=""><br class=""></div><div class=""><div class="">Even the UCSC tables link <span style="font-size:12.8px" class="">NM_003036.3 as the canonical transcript. Does this mean there can be 2 possible canonical transcripts </span></div></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><span style="font-size:12.8px" class="">one for curated annotations and one for predicted?</span></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><span style="font-size:12.8px" class="">Here is the table linkage of refseq transcripts in the knownCanonical </span><span style="font-size:12.8px" class="">table</span></div><div class=""><span style="font-size:12.8px" class=""><br class=""></span></div><div class=""><pre style="word-wrap: break-word; white-space: pre-wrap;" class="">#filter: kgXref.geneSymbol = 'SKI'
#hg19.knownCanonical.chrom hg19.knownCanonical.chromStart hg19.knownCanonical.chromEnd hg19.knownCanonical.clusterId hg19.knownCanonical.transcript hg19.knownCanonical.protein hg19.kgXref.geneSymbol hg19.kgXref.refseq hg19.kgXref.protAcc hg19.kgXref.description
chr1 2160133 2241652 98 uc001aja.4 uc001aja.4 SKI NM_003036 NP_003027 Homo sapiens v-ski sarcoma viral oncogene homolog (avian) (SKI), mRNA.</pre></div><div class="gmail_extra">
<br class=""><div class="gmail_quote">On 26 July 2016 at 16:06, mag <span dir="ltr" class=""><<a href="mailto:mr6@ebi.ac.uk" target="_blank" class="">mr6@ebi.ac.uk</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF" class="">
Hi Duarte,<br class="">
<br class="">
A canonical transcript is usually the transcript with the longest
translation for a given gene<br class="">
<a href="http://www.ensembl.org/Help/Glossary?id=346" target="_blank" class="">http://www.ensembl.org/Help/Glossary?id=346</a><br class="">
<br class="">
    In your example, XP_005244832.1 has a translation of 730 aa, while
    NP_003027.1 has only 728.<br class="">
    Hence, XP_005244832.1 is the one chosen as canonical.<br class="">
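    <br class="">
    A minimal sketch of the "longest translation" rule, using only the two
    lengths quoted above. It is a simplification for illustration: the
    function name is hypothetical, and the word "usually" in the definition
    suggests the real selection rules look at more than length alone.<br class="">
<pre class="">
```python
# Translation lengths (in amino acids) quoted in this thread for SKI.
translation_lengths = {
    "XP_005244832.1": 730,
    "NP_003027.1": 728,
}


def longest_translation(lengths):
    """Return the accession whose translation is longest (hypothetical helper)."""
    return max(lengths, key=lengths.get)


canonical = longest_translation(translation_lengths)
print(canonical)
```
</pre>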
<br class="">
    As Kieron mentioned, if you specifically want the curated RefSeq
    annotation, it might be better to fetch all external annotations and
    then keep only the ones you are interested in.<br class="">
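    <br class="">
    A rough sketch of that fetch-then-filter idea. The accession list is a
    hypothetical stand-in for whatever the external annotation lookup
    returns (no Ensembl API call is made here), and the filter assumes the
    RefSeq naming convention in which NM_/NR_/NP_ accessions are curated
    records while XM_/XR_/XP_ accessions are predicted models.<br class="">
<pre class="">
```python
# Hypothetical result of an external-annotation lookup for one transcript;
# the accessions themselves are the ones mentioned in this thread.
xref_accessions = ["XP_005244832.1", "NM_003036.3", "NP_003027.1"]

# Assumption: curated RefSeq records use these accession prefixes.
CURATED_PREFIXES = ("NM_", "NR_", "NP_")


def curated_only(accessions):
    """Keep curated RefSeq accessions and drop predicted (XM_/XR_/XP_) ones."""
    return [acc for acc in accessions if acc.startswith(CURATED_PREFIXES)]


curated = curated_only(xref_accessions)
print(curated)
```
</pre>
    The point is to walk over every returned entry instead of stopping at
    the first one, which may happen to be a predicted record.<br class="">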
<br class="">
<br class="">
Regards,<br class="">
Magali<div class=""><div class="gmail-h5"><br class="">
<br class="">
<div class="">On 25/07/2016 17:07, Duarte Molha
wrote:<br class="">
</div>
<blockquote type="cite" class="">
<div dir="ltr" class="">I will try and produce here the relevant parts of
the script.
<div class=""><br class="">
</div>
          <div class="">But I am still at a loss as to why <span style="font-size:12.8px" class=""> </span><a href="http://www.ncbi.nlm.nih.gov/protein/XP_005244832.1" style="font-size:12.8px" target="_blank" class="">XP_005244832.1</a> has
            been tagged as canonical.</div>
<div class=""><br class="">
</div>
          <div class="">What you are saying is that I simply might not have
            cycled through all of the RefSeq transcripts... but is there
            going to be more than one RefSeq transcript tagged as
            canonical for each gene?</div>
<div class=""><br class="">
</div>
<div class="">Not sure I follow!</div>
<div class=""><br class="">
</div>
<div class="">Thanks</div>
<div class=""><br class="">
Duarte</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
</div>
<div class="gmail_extra"><br clear="all" class="">
<br class="">
<div class="gmail_quote">On 25 July 2016 at 11:58, Kieron Taylor
<span dir="ltr" class=""><<a href="mailto:ktaylor@ebi.ac.uk" target="_blank" class="">ktaylor@ebi.ac.uk</a>></span>
wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi Duarte,<br class="">
<br class="">
                Can you send us a snippet of the code that accesses the external
                database adaptor (DBEntryAdaptor?)? It sounds like you may
                not be reading enough of your results to get the RefSeq ID
                you expect. All of the RefSeq IDs you mention are
                associated at some level with the transcript, but some come
                from "RefSeq peptide predicted", for example.<br class="">
<br class="">
Kieron<br class="">
<br class="">
<br class="">
<br class="">
Kieron Taylor PhD.<br class="">
Ensembl Developer<br class="">
<br class="">
EMBL, European Bioinformatics Institute<br class="">
<div class="">
<div class=""><br class="">
<br class="">
<br class="">
<br class="">
<br class="">
<br class="">
> On 22 Jul 2016, at 10:47, Duarte Molha <<a href="mailto:duartemolha@gmail.com" target="_blank" class="">duartemolha@gmail.com</a>>
wrote:<br class="">
><br class="">
> Hi Guys<br class="">
><br class="">
                    > I have a script that, based on a gene symbol,
                    connects to Ensembl and retrieves the canonical
                    transcript, and then does the same using the external
                    database adaptor to get the canonical RefSeq transcript.<br class="">
><br class="">
                    > However, this does not seem to give me the correct
                    result.<br class="">
                    ><br class="">
                    > Take, for example, the gene SKI (I am using the GRCh37
                    assembly, btw).<br class="">
><br class="">
> If you open this gene on the Ensembl browser:<br class="">
><br class="">
> <a href="http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343" rel="noreferrer" target="_blank" class="">http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343</a><br class="">
><br class="">
><br class="">
> On SKI, Ensembl annotates as the canonical
transcript: ENST00000378536<br class="">
><br class="">
                    > However, using my script, the external database
                    adaptor returns the RefSeq XP_005244832.1 as the RefSeq
                    canonical transcript, even though the correct canonical
                    transcript is NM_003036.3<br class="">
><br class="">
> <a href="http://www.ncbi.nlm.nih.gov/gene/6497" rel="noreferrer" target="_blank" class="">http://www.ncbi.nlm.nih.gov/gene/6497</a><br class="">
><br class="">
                    > Unless I am misunderstanding this, if the
                    coding region is the same length in two transcripts, the
                    longer transcript should be the canonical one.<br class="">
                    ><br class="">
                    > The longer RefSeq transcript is NM_003036.3 (it has a longer
                    5' UTR).<br class="">
><br class="">
> Can you help me understand this?<br class="">
><br class="">
> Many thanks<br class="">
><br class="">
> Duarte<br class="">
</div>
</div>
> _______________________________________________<br class="">
> Dev mailing list <a href="mailto:Dev@ensembl.org" target="_blank" class="">Dev@ensembl.org</a><br class="">
> Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" rel="noreferrer" target="_blank" class="">http://lists.ensembl.org/mailman/listinfo/dev</a><br class="">
> Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank" class="">http://www.ensembl.info/</a><br class="">
<br class="">
<br class="">
</blockquote>
</div>
<br class="">
</div>
<br class="">
</blockquote>
<br class="">
</div></div></div>
<br class=""></blockquote></div><br class=""></div></div>
</div></blockquote></div><br class=""></div></div></body></html>