<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hey Steve,<div class=""><br class=""></div><div class="">The problem with using the database is that sequence is not stored against the top-level sequence regions (e.g. chromosomes) that annotation is held against. Instead, sequence is held against the contig sequence regions, so you have to descend through the assembly table an unspecified number of times (once for each mapping, e.g. chromosome -&gt; supercontig -&gt; contig). </div><div class=""><br class=""></div><div class="">I would seriously *not* recommend doing this. Not only do you have to descend through the assembly, you also have to concatenate the sequence and pay attention to the orientation of each assembly mapping. Instead you could use the Perl API (probably not an option considering you’re a Python guy), BioMart (you can access unspliced gene sequence quite easily), the REST API, or download the full genome sequence from FTP and take subslices. The faidx indexing tool from htslib/samtools is pretty good at extracting arbitrary sequence from very large FASTA files.</div><div class=""><br class=""></div><div class="">Andy</div><div class=""><br class=""></div><div apple-content-edited="true" class="">
------------<br class="">Andrew Yates - Ensembl Support Coordinator<br class="">European Molecular Biology Laboratory<br class="">European Bioinformatics Institute<br class="">Wellcome Trust Genome Campus<br class="">Hinxton, Cambridge<br class="">CB10 1SD, United Kingdom<br class="">Tel: +44-(0)1223-492538<br class="">Fax: +44-(0)1223-494468<br class="">Skype: andrewyatz<br class=""><a href="http://www.ensembl.org/" class="">http://www.ensembl.org/</a>

</div>
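<div class="">To illustrate the subslicing and orientation point above, here is a minimal Python sketch. It assumes the chromosome sequence has already been read from the FTP FASTA file into a string, that coordinates are 1-based and inclusive (as Ensembl gene coordinates are), and that a gene on the reverse strand needs reverse-complementing. The names subslice and COMPLEMENT and the toy sequence are made up for illustration; they are not part of any Ensembl API.</div>
<pre class="">
```python
# Sketch only: subslice and COMPLEMENT are illustrative names, not Ensembl API.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def subslice(sequence, start, end, strand=1):
    """Return sequence[start..end] using 1-based inclusive coordinates.

    strand=-1 returns the reverse complement, as for a reverse-strand gene.
    """
    if start not in range(1, end + 1) or end > len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    # Python slices are 0-based and half-open, so only the start shifts by one.
    region = sequence[start - 1:end]
    if strand == -1:
        # Complement each base, then reverse the result.
        region = region.translate(COMPLEMENT)[::-1]
    return region


toy_chromosome = "AACCGGTTAC"  # stand-in for a chromosome read from a FASTA file
forward = subslice(toy_chromosome, 2, 5)
reverse = subslice(toy_chromosome, 2, 5, strand=-1)
```
</pre>
<div class="">With a real genome you would pull the chromosome (or just the region) out of the FASTA file, for example with samtools faidx, rather than hold a toy string. The same start/end convention matters in SQL too: MySQL's SUBSTRING takes a length as its third argument, not an end coordinate.</div>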
<br class=""><div><blockquote type="cite" class=""><div class="">On 16 Dec 2014, at 16:15, Steve Moss <<a href="mailto:gawbul@gmail.com" class="">gawbul@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><span style="font-size:13px" class="">Dear EnsEMBL Dev,</span><div dir="ltr" style="font-size:13px" class=""><div class=""><br class=""></div><div class="">I'm trying to write a raw SQL query to retrieve the sequence for the human BRCA2 gene to compare different methods of accessing EnsEMBL data. I'm currently doing the following, but getting an empty set.</div><div class=""><br class=""></div><div class=""><div class="">SELECT SUBSTRING(sequence, g.seq_region_start, g.seq_region_end)</div><div class="">FROM dna d</div><div class="">JOIN gene g</div><div class="">ON d.seq_region_id = g.seq_region_id</div><div class="">WHERE g.stable_id="ENSG00000139618"</div></div><div class=""><br class=""></div><div class="">What am I missing? I think I'm falling short on working out the coord. system mapping stuff. 
Any pointers to help in fixing please?</div><div class=""><br class=""></div><div class="">Cheers,</div><div class=""><br class="">Steve</div></div><div class=""><br class=""></div>-- <br class=""><div class="gmail_signature"><div dir="ltr" class=""><a href="http://about.me/gawbul" target="_blank" class="">Steve Moss, about.me/gawbul</a></div></div>
</div>
_______________________________________________<br class="">Dev mailing list    <a href="mailto:Dev@ensembl.org" class="">Dev@ensembl.org</a><br class="">Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" class="">http://lists.ensembl.org/mailman/listinfo/dev</a><br class="">Ensembl Blog: <a href="http://www.ensembl.info/" class="">http://www.ensembl.info/</a><br class=""></div></blockquote></div><br class=""></body></html>