<div dir="ltr">Dear Kieron,<div><br></div><div>Thanks for the detailed email. Just what I needed! Yes, I have included those other options too, but hadn't thought about the FTP downloads. Will look at those too, thanks! </div><div><br></div><div>I had looked at the assembly table and indeed seen 13,471 records. It makes much more sense now thinking about it in terms of unassembled contigs.</div><div><br></div><div>I was hoping to include the query as a baseline benchmark to compare against. I'm guessing it would need some optimisation though. Will have a think about how I can best do that. Will have a play with DBI->trace() too.</div><div><br></div><div>Thanks again!</div><div><br></div><div>Cheers,</div><div><br></div><div>Steve<br><div class="gmail_extra"><br><div class="gmail_quote">On 16 December 2014 at 16:48,  <span dir="ltr"><<a href="mailto:dev-request@ensembl.org" target="_blank">dev-request@ensembl.org</a>></span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Message: 2<br>
Date: Tue, 16 Dec 2014 16:40:44 +0000<br>
From: Kieron Taylor <<a href="mailto:ktaylor@ebi.ac.uk">ktaylor@ebi.ac.uk</a>><br>
Subject: Re: [ensembl-dev] SQL query to retrieve gene sequence...<br>
To: Ensembl developers list <<a href="mailto:dev@ensembl.org">dev@ensembl.org</a>><br>
Message-ID: <<a href="mailto:5490608C.1080001@ebi.ac.uk">5490608C.1080001@ebi.ac.uk</a>><br>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed<br>
<br>
Dear Steve,<br>
<br>
Firstly, since you're benchmarking, make sure you include our REST<br>
service, BioMart, and maybe even slicing up FTP downloads. We generally<br>
recommend the API for this kind of thing as it deals with the schema for<br>
you, which is non-trivial.<br>
<br>
For the SQL, your query needs to be a lot more complicated. Some<br>
seq_region_ids correspond to single contigs, while others are<br>
assemblages of several. For this you need the assembly table, which maps<br>
out the components that are joined together to give the sequence of that<br>
'top level' seq_region. Remember that our sequence is built up from the<br>
output of sequencing facilities, and our data structures reflect that.<br>
<br>
If you feed a seq_region_id of 131541 into the assembly table, you'll<br>
see just how many parts combine to form chromosome 13. The primary<br>
transcript of BRCA2 crosses over the edge of at least two of those many<br>
contigs, thus you must subselect from several seq_regions to give you<br>
multiple bits of sequence to concatenate together.<br>
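<br>
As a rough illustration of that concatenation step, here is a minimal Python<br>
sketch. It is not the API's implementation. It assumes assembly rows with<br>
asm_start, asm_end, cmp_seq_region_id, cmp_start, cmp_end and ori fields<br>
(1-based, inclusive coordinates), and that the contig sequences have already<br>
been fetched into a dictionary. The function name stitch_region and the toy<br>
data are hypothetical.<br>
<pre>
```python
from typing import Dict, Iterable, NamedTuple


class AssemblyRow(NamedTuple):
    """One mapping row: a slice of a contig placed on the assembled region.

    Field names are an assumption modelled on the assembly table;
    all coordinates are 1-based and inclusive.
    """
    asm_start: int
    asm_end: int
    cmp_seq_region_id: int
    cmp_start: int
    cmp_end: int
    ori: int  # 1 = forward strand, -1 = reverse strand


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def stitch_region(rows: Iterable[AssemblyRow],
                  contig_seqs: Dict[int, str],
                  start: int, end: int) -> str:
    """Build the sequence of [start, end] on the assembled region.

    Stretches not covered by any contig are padded with 'N'.
    """
    pieces = []
    pos = start  # next assembled position still to be filled
    for row in sorted(rows, key=lambda r: r.asm_start):
        if row.asm_end < pos or row.asm_start > end:
            continue  # row does not touch the part still needed
        if row.asm_start > pos:
            pieces.append("N" * (row.asm_start - pos))  # gap between contigs
            pos = row.asm_start
        ov_start = pos
        ov_end = min(row.asm_end, end)
        contig = contig_seqs[row.cmp_seq_region_id]
        if row.ori == 1:
            c_start = row.cmp_start + (ov_start - row.asm_start)
            c_end = row.cmp_start + (ov_end - row.asm_start)
            piece = contig[c_start - 1:c_end]
        else:
            # Reverse-strand contig: count back from cmp_end, then flip.
            c_end = row.cmp_end - (ov_start - row.asm_start)
            c_start = row.cmp_end - (ov_end - row.asm_start)
            piece = reverse_complement(contig[c_start - 1:c_end])
        pieces.append(piece)
        pos = ov_end + 1
    if pos <= end:
        pieces.append("N" * (end - pos + 1))  # trailing gap
    return "".join(pieces)


# Toy data (made up): two 8-base contigs, the second placed on the reverse strand.
rows = [
    AssemblyRow(1, 8, 101, 1, 8, 1),
    AssemblyRow(9, 16, 102, 1, 8, -1),
]
contigs = {101: "ACGTACGG", 102: "ACGTTTGG"}

# A feature spanning the contig boundary needs a piece from each contig.
print(stitch_region(rows, contigs, 7, 11))  # GGCCA
```
</pre>
In a real query the rows would come from the assembly table for the<br>
top-level seq_region_id, and each piece would be a substring of the<br>
corresponding contig's stored sequence.<br>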
<br>
At this stage, you may decide you'd rather not write the query. It may<br>
make sense to use a DBI trace on our Perl API to get all of the queries<br>
that are run during the call of $gene->seq .<br>
<br>
Regards,<br>
<br>
Kieron Taylor<br>
Ensembl Core<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr"><a href="http://about.me/gawbul" target="_blank">Steve Moss<br>about.me/gawbul</a></div></div>
</div></div></div>