<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi Julie<div class=""><br class=""></div><div class="">We have some information about non-ATG start codons in our blog post from release 102:</div><div class=""><a href="https://www.ensembl.info/2020/11/30/ensembl-102-has-been-released/" class="">https://www.ensembl.info/2020/11/30/ensembl-102-has-been-released/</a></div><div class=""><br class=""></div><div class="">Quite simply, there is no rule. This is a case of exceptional biology, which we can annotate correctly only because our expert manual gene annotators analyse the data in detail.</div><div class=""><br class=""></div><div class="">All the best</div><div class=""><br class=""></div><div class="">Emily<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 10 Mar 2021, at 09:08, Julie Sullivan &lt;<a href="mailto:julie.sullivan@gmail.com" class="">julie.sullivan@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small"><a href="https://www.ensembl.org/Homo_sapiens/Transcript/Sequence_cDNA?db=core;g=ENSG00000288649;r=20:33667144-33668235;t=ENST00000678634" class="">https://www.ensembl.org/Homo_sapiens/Transcript/Sequence_cDNA?db=core;g=ENSG00000288649;r=20:33667144-33668235;t=ENST00000678634</a><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">The first codon is GTG. 
I would not have expected that to be methionine.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">I looked in the text files, and in Homo sapiens there are 123 transcripts where the start codon is NOT ATG but the first amino acid is M.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small"><pre class="gmail-c-mrkdwn__pre" style="box-sizing:inherit;margin:4px 0px;padding:8px;font-size:12px;line-height:1.50001;font-variant-ligatures:none;white-space:pre-wrap;word-break:normal;font-family:Monaco,Menlo,Consolas,&quot;Courier New&quot;,monospace;border-radius:4px;color:rgb(29,28,29);font-style:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">{'error': 0,<br style="box-sizing:inherit" class=""> 'methionine': 91434,<br style="box-sizing:inherit" class=""> 'GTG': 22,<br style="box-sizing:inherit" class=""> 'ATA': 10,<br style="box-sizing:inherit" class=""> 'CTG': 67,<br style="box-sizing:inherit" class=""> 'ACG': 8,<br style="box-sizing:inherit" class=""> 'TTG': 9,<br style="box-sizing:inherit" class=""> 'ATT': 5,<br style="box-sizing:inherit" class=""> 'AAC': 1,<br style="box-sizing:inherit" class=""> 'AAG': 1}</pre><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">Why is that? 
<br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">Specifically, I would like a rule I can use, as my HGVSp strings differ from VEP's for this reason.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">Thanks!<br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif;font-size:small">Julie<br class=""></div></div>
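<div class=""><br class=""></div><div class="">A tally like the one quoted above can be reproduced with a short script. The following is a minimal sketch, not the script that produced those numbers: it assumes each transcript is already available as a (CDS, protein) pair of strings, and the function name <code>tally_start_codons</code> and the <code>other</code> bucket are hypothetical choices made for illustration.</div>
<pre class="">
```python
from collections import Counter


def tally_start_codons(records):
    """Count the first codon of every CDS whose protein starts with M.

    `records` is an iterable of (cds, protein) string pairs. This input
    shape is an assumption, not the layout of the Ensembl text files.
    """
    counts = Counter()
    for cds, protein in records:
        codon = cds[:3].upper()
        if len(codon) != 3 or not protein:
            # CDS shorter than one codon, or no translation available.
            counts["error"] += 1
        elif protein[0].upper() != "M":
            # Typically a 5'-incomplete CDS: not a start codon at all.
            counts["other"] += 1
        elif codon == "ATG":
            counts["methionine"] += 1
        else:
            # Non-ATG first codon annotated as methionine.
            counts[codon] += 1
    return counts


if __name__ == "__main__":
    example = [
        ("ATGGCC", "MA"),
        ("GTGGCC", "MA"),
        ("CTGAAA", "MK"),
        ("GCCAAA", "AK"),
    ]
    print(dict(tally_start_codons(example)))
```
</pre>
<div class="">Every key other than <code>methionine</code>, <code>other</code> and <code>error</code> is a non-ATG codon whose annotated first residue is M. The script only counts such cases; it cannot predict them from sequence, so the annotated translation still has to come from Ensembl.</div>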
_______________________________________________<br class="">Dev mailing list    <a href="mailto:Dev@ensembl.org" class="">Dev@ensembl.org</a><br class="">Posting guidelines and subscribe/unsubscribe info: <a href="https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org" class="">https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org</a><br class="">Ensembl Blog: <a href="http://www.ensembl.info/" class="">http://www.ensembl.info/</a><br class=""></div></blockquote></div><br class=""><div class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div>—</div><div><br class=""></div><div>Dr Emily Perry (Pritchard)<br class="">Ensembl Outreach Project Leader <br class="">(she/her)<br class=""><br class="">European Bioinformatics Institute (EMBL-EBI)<br class="">European Molecular Biology Laboratory <br class="">Wellcome Genome Campus<br class="">Hinxton<br class="">Cambridge<br class="">CB10 1SD<br class="">UK </div><div class=""><br class=""></div></div><br class="Apple-interchange-newline"><br class="Apple-interchange-newline">
</div>
<br class=""></div></body></html>