<div dir="ltr">Okay, I agree that's definitely better than crawling the FTP site!<div><br></div><div>Thanks for answering this, it's very helpful.</div><div><br></div><div>> My answer to your other question might be a good source of unique species names depending on how you're working with our services.<br><br>It is! I was actually already using that endpoint; I just didn't realize that response included all the assemblies, so I thought I needed an additional endpoint. As far as I can tell, I now know everything I need to make this work.</div><div><br></div><div>Thanks a bunch,</div><div><br></div><div>- Kurt</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 19, 2019 at 12:50 PM Kieron Taylor <<a href="mailto:ktaylor@ebi.ac.uk">ktaylor@ebi.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Unfortunately, you are left with rather inconvenient methods to do this. We have plans for a service that indexes the FTP content and will make it searchable, but you'll have to wait for that.<br>

<br>

In the meantime, you can:<br>

<br>

Get <a href="ftp://ftp.ensemblgenomes.org/pub/bacteria/current/species_EnsemblBacteria.txt" rel="noreferrer" target="_blank">ftp://ftp.ensemblgenomes.org/pub/bacteria/current/species_EnsemblBacteria.txt</a> <br>

Extract column 13 for the rows you are interested in<br>

The column contains values like bacteria_177_collection_core_45_98_1<br>

Regex out the tail end after _collection and then you have the collection name: bacteria_177_collection<br>

Put that into your URL, then add the "species" name from the tab-separated file above<br>
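<br>
For illustration, the steps above could be sketched in Python roughly as follows. This is a minimal sketch, not tested against the live file. It assumes the "species" value sits in column 2 of each row and that the path follows the release/fasta/collection/species/dna/ layout seen in the URL from the original question. The function names are made up for this example.<br>

```python
import re

# Column 13 (1-based) holds the core database name, per the steps above.
CORE_DB_INDEX = 12
# Assumption: the "species" value is in column 2 of the tab-separated file.
SPECIES_INDEX = 1

# Captures everything up to and including "_collection".
COLLECTION_RE = re.compile(r"^(.+_collection)_core_")


def collection_name(core_db):
    """Drop the tail after '_collection' from a core database name."""
    match = COLLECTION_RE.match(core_db)
    if match is None:
        raise ValueError(f"no collection found in {core_db!r}")
    return match.group(1)


def fasta_dna_url(row, release="current"):
    """Build the DNA FASTA directory URL for one row of the species file.

    Pass release="release-45" (or similar) to pin a specific release.
    """
    fields = row.rstrip("\n").split("\t")
    collection = collection_name(fields[CORE_DB_INDEX])
    species = fields[SPECIES_INDEX]
    return (
        "ftp://ftp.ensemblgenomes.org/pub/bacteria/"
        f"{release}/fasta/{collection}/{species}/dna/"
    )
```

Feeding it each non-header line of species_EnsemblBacteria.txt should give one directory URL per genome. Check the column positions against the file's header line before relying on them.<br>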

<br>

It's not pretty but it's better than crawling the FTP site! My answer to your other question might be a good source of unique species names depending on how you're working with our services.<br>

<br>

Hopefully you are already aware of the "current" link in the path should you not wish to work on a specific release.<br>

<br>

<br>

Hope that helps,<br>

<br>

Kieron<br>

<br>

<br>

Kieron Taylor PhD.<br>

Ensembl Developer<br>

<br>

EMBL, European Bioinformatics Institute<br>

<br>


> On 19 Dec 2019, at 15:10, Kurt Wheeler <<a href="mailto:kurt.wheeler91@gmail.com" target="_blank">kurt.wheeler91@gmail.com</a>> wrote:<br>

> <br>

> Hello,<br>

> <br>

> I'm trying to figure out how to programmatically find this URL:<br>

> <a href="ftp://ftp.ensemblgenomes.org/pub/bacteria/release-45/fasta/bacteria_13_collection/pseudomonas_aeruginosa_pao1/dna/" rel="noreferrer" target="_blank">ftp://ftp.ensemblgenomes.org/pub/bacteria/release-45/fasta/bacteria_13_collection/pseudomonas_aeruginosa_pao1/dna/</a><br>

> <br>

> I got that URL by going to <a href="https://bacteria.ensembl.org/Pseudomonas_aeruginosa_pao1/Info/Index/" rel="noreferrer" target="_blank">https://bacteria.ensembl.org/Pseudomonas_aeruginosa_pao1/Info/Index/</a> and clicking a link that said: "Download DNA sequence (FASTA)". However I can't figure out how to get the API to tell me that and I don't want to scrape the HTML for the link.<br>

> <br>

> Does anyone know how to find that URL for a given organism/strain?<br>

> <br>

> Thanks,<br>

> <br>

> - Kurt<br>

> <br>

> P.S. I solved this problem for divisions other than bacteria by building the URLs with information that the API does provide: <a href="https://github.com/AlexsLemonade/refinebio/blob/dev/foreman/data_refinery_foreman/surveyor/transcriptome_index.py#L48" rel="noreferrer" target="_blank">https://github.com/AlexsLemonade/refinebio/blob/dev/foreman/data_refinery_foreman/surveyor/transcriptome_index.py#L48</a><br>

> <br>

> However in the FTP server the bacteria are broken up into collections which I'm having trouble figuring out how to determine.<br>

> _______________________________________________<br>

> Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>

> Posting guidelines and subscribe/unsubscribe info: <a href="https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org" rel="noreferrer" target="_blank">https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org</a><br>

> Ensembl Blog: <a href="http://www.ensembl.info/" rel="noreferrer" target="_blank">http://www.ensembl.info/</a><br>

<br>

<br>


</blockquote></div>