<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Hi Wojtek,<br>

    <br>

    I went to speak with Kevin to better understand what you're trying

    to accomplish. Based on our conversation, you'll want to look at the

    pep2genomic() method in the Transcript object.<br>

    <br>

    For the transcripts you're checking, where you want to evaluate
    whether the data you've used to load an Ensembl db is correct, use
    the translateable_seq() function to retrieve the sequence as it
    would be translated based on the input data. Then step through that
    sequence looking for the stop codons. You can then take their
    positions in the protein and pass them to pep2genomic( x, x ), with
    the same position as both start and end, to find the genomic
    coordinates.<br>
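    <br>
    As a rough illustration of the codon-to-peptide arithmetic, here is
    a minimal sketch in plain Python rather than the Ensembl Perl API.
    The function names are made up for this example, the standard
    genetic code is assumed, and the input is assumed to be a coding
    sequence string that starts in frame.<br>
<pre>
```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_peptide_positions(cds):
    """Return 1-based peptide positions of in-frame stop codons in a CDS string."""
    cds = cds.upper()
    positions = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        if cds[i:i + 3] in STOP_CODONS:
            positions.append(i // 3 + 1)
    return positions


def peptide_to_cds_range(pos):
    """Return the 1-based inclusive CDS nucleotide range of peptide position pos."""
    return (3 * pos - 2, 3 * pos)
```
</pre>
    For a coding sequence of ATGTAAGGGTGA this reports peptide positions
    2 and 4. Position 2 is the premature stop you would pass on as
    pep2genomic( 2, 2 ); a stop at the very last codon is usually just
    the normal terminal stop.<br>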

    <br>

    The Ensembl API documentation [1] describes the return type for this
    call: a list of Coordinate and Gap objects containing the genomic
    coordinates.<br>
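    <br>
    To see why a single peptide position can come back as more than one
    Coordinate (a codon can be split across two exons), here is a small
    sketch in plain Python. It is not the Ensembl implementation: it
    handles the forward strand only, and the function name and the
    exon tuples are hypothetical.<br>
<pre>
```python
def cds_range_to_genomic(coding_exons, start, end):
    """Map a 1-based inclusive CDS range onto forward-strand genomic segments.

    coding_exons is a list of (genomic_start, genomic_end) tuples, 1-based
    inclusive, in transcription order, covering only the coding part of
    each exon.
    """
    segments = []
    offset = 0  # number of CDS bases consumed by earlier exons
    for g_start, g_end in coding_exons:
        length = g_end - g_start + 1
        # CDS interval covered by this exon, 1-based inclusive
        exon_cds_start = offset + 1
        exon_cds_end = offset + length
        # Overlap between the requested range and this exon, in CDS coordinates
        lo = max(start, exon_cds_start)
        hi = min(end, exon_cds_end)
        if hi >= lo:
            segments.append((g_start + lo - exon_cds_start,
                             g_start + hi - exon_cds_start))
        offset += length
    return segments
```
</pre>
    With coding exons at 100-104 and 200-210, the codon at CDS positions
    4 to 6 maps to two segments, 103-104 and 200-200, which is the
    situation where you would expect more than one Coordinate back.<br>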

    <br>

    If you have any further questions, or if we didn't properly

    understand what you were trying to accomplish, let us know. Thanks.<br>

    <br>

    [1]

<a class="moz-txt-link-freetext" href="http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1TranscriptMapper.html#afb66982442190f4eaa711fdee89b0418">http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1TranscriptMapper.html#afb66982442190f4eaa711fdee89b0418</a><br>

    <br>

    <div class="moz-cite-prefix">On 03/05/18 11:39, Wojtek Bażant wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:47c708f49a43fd94f5246e43f55e6a32@sanger.ac.uk">


      <p>Hi ensembl-dev,</p>

      <p>Sometimes the annotation I try to load up has transcripts that

        don't translate to valid proteins. I then go to look at them in

        a genome viewer to get an idea what's wrong, and it's helpful to

        know where to look.</p>

      <p>I've tried to work with the values reported to me by the

        ProteinTranslation healthcheck log, until I realised they're

        nonsense - I think this code is wrong:</p>

      <p><a

href="https://github.com/Ensembl/ensj-healthcheck/blob/release/92/perl/Bio/EnsEMBL/Healthcheck/Translation.pm#L306"

          moz-do-not-send="true">https://github.com/Ensembl/ensj-healthcheck/blob/release/92/perl/Bio/EnsEMBL/Healthcheck/Translation.pm#L306</a></p>

      <p>It takes the protein sequence (in the peptide alphabet), looks
        for indexes of '*', adds these to the transcript start (in the
        DNA alphabet), and claims the results are the locations of stop
        codons.</p>

      <p>I currently have no good way of doing this. I have been

        translating the exons in all three phases, saving them to a

        file, and then text searching for bits of the sequence around

        the *. Does Ensembl offer a better way that I couldn't find, or
        can you think of one?</p>
      <p>Thanks,<br>
        Wojtek</p>


    </blockquote>


  </body>

</html>