<div dir="ltr">Yes, my reply still stands.<div><br></div><div>The protein sequence as stored in <span style="font-size:12.8px">$tva->transcript_variation->_</span><span style="font-size:12.8px">peptide is extracted from the protein sequence as constructed by the Ensembl API (from the imported RefSeq coordinates and the genomic sequence), *not* from the protein sequence as given in the RefSeq record. The reasons for this are explained in my initial reply.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Unfortunately, this is how our whole API is designed, and we are currently looking for better ways of dealing with situations where the RefSeq sequence differs from the genome.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Hopefully that makes more sense?</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Will</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 27 September 2016 at 12:57, João Eiras <span dir="ltr"><<a href="mailto:joao.eiras@gmail.com" target="_blank">joao.eiras@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi.<br>

<br>

Thanks for the reply. But I might not have been clear enough.<br>

<br>

This is what the plugin looks like:<br>

<br>

sub run {<br>

my ($self, $tva) = @_;<br>

return {reference => $tva->transcript_variation->_peptide};<br>

}<br>

<br>

As you can see, it accesses _peptide, which is the wild-type sequence<br>

(I hope; I got that line from the ProteinSeqs plugin).<br>

<br>

So, regardless of whether the variant affects the transcript or not, I'm<br>

accessing the wild-type sequence... right?<br>

<br>

I used that variant (which is a frameshift) as an example, but I could<br>

just as well use:<br>

<span class=""><br>

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT<br>

</span>chr5 38300282 . GTC AAA 5000 . . .<br>

<br>

The point of the variant is just to access the transcript (since I<br>

don't know how to query the VEP database directly).<br>

<br>

VEP will then report three annotations, for transcripts<br>

ENSMUST00000063136, ENSMUST00000114099 and NM_172709.3. The reference<br>

sequences for transcripts ENSMUST00000114099 and NM_172709.3 should be<br>

the same, but they aren't.<br>

<br>

So, I'm concerned that the RefSeq transcripts are not being stored correctly.<br>

<div class="HOEnZb"><div class="h5"><br>

Thank you.<br>

<br>


</div></div></blockquote></div><br></div>