<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Dear Allan,</p>
    <p>Thanks for your message, and happy to help.</p>
    <p>Regarding your last question: to be honest, I am not aware of a
      file describing the entire directory tree.<br>
      It is something we could consider providing.</p>
    <p>That said, the directory tree for FASTA is similar to the GFF3
      one.</p>
    <p>For plants<br>
<a class="moz-txt-link-freetext" href="http://ftp.ensemblgenomes.org/pub/plants/release-60/fasta/">http://ftp.ensemblgenomes.org/pub/plants/release-60/fasta/</a>&lt;species&gt;/[cdna,cds,dna,dna_index,ncrna,pep]</p>
    <p>For bacteria<br>
<a class="moz-txt-link-freetext" href="http://ftp.ensemblgenomes.org/pub/bacteria/release-60/fasta/">http://ftp.ensemblgenomes.org/pub/bacteria/release-60/fasta/</a>&lt;collection&gt;/&lt;species&gt;/[cdna,cds,dna,dna_index,ncrna,pep]</p>
    <p>The extra layer of complexity here comes from the type of
      sequence you are interested in; please see the options in
      square brackets above.</p>
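    <p>As a rough sketch, the two patterns above could be assembled in
      Python along these lines. The function name "fasta_dir_url" and its
      parameters are made up for illustration, and the code only builds
      the string: it does not check that the directory actually exists
      on the server.</p>
<pre>
```python
from typing import Optional

BASE = "http://ftp.ensemblgenomes.org/pub"
# Sequence types listed in square brackets in the message above.
SEQ_TYPES = ("cdna", "cds", "dna", "dna_index", "ncrna", "pep")


def fasta_dir_url(division: str, release: int, species: str,
                  seq_type: str, collection: Optional[str] = None) -> str:
    """Build a FASTA directory URL (hypothetical helper, not an Ensembl API)."""
    if seq_type not in SEQ_TYPES:
        raise ValueError(f"unknown sequence type: {seq_type!r}")
    parts = [BASE, division, f"release-{release}", "fasta"]
    if collection:
        # Only collection-based divisions such as bacteria have this level.
        parts.append(collection)
    parts.extend([species, seq_type])
    return "/".join(parts) + "/"
```
</pre>
    <p>For instance, fasta_dir_url("bacteria", 60,
      "acetobacter_syzygii_gca_002276805", "dna",
      collection="bacteria_60_collection") returns
      http://ftp.ensemblgenomes.org/pub/bacteria/release-60/fasta/bacteria_60_collection/acetobacter_syzygii_gca_002276805/dna/</p>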
    <p>Regrettably, there is no explicit (and convenient) mapping from
      assembly accession to FASTA files: for the time being you have to
      create it yourself from the species metadata file(s).</p>
    <p>accession --&gt; (collection,)species --&gt; fasta files</p>
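    <p>A minimal sketch of that chain in Python, assuming the metadata
      file is tab-separated with a header line whose column names
      include "species", "assembly_accession" and "core_db" (the fields
      mentioned in this thread), possibly prefixed with "#". The function
      names are hypothetical, and the collection is taken from the
      "core_db" value as described in my previous message.</p>
<pre>
```python
import csv
import re
from typing import Dict, Iterable, Optional, Tuple

# "bacteria_60_collection_core_60_113_1" gives "bacteria_60_collection".
COLLECTION_RE = re.compile(r"^(.+_collection)_core_")


def collection_from_core_db(core_db: str) -> Optional[str]:
    """Return the collection name, or None for a stand-alone species."""
    match = COLLECTION_RE.match(core_db)
    return match.group(1) if match else None


def index_by_accession(
    lines: Iterable[str],
) -> Dict[str, Tuple[Optional[str], str]]:
    """Map assembly_accession to (collection or None, species)."""
    rows = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Assumption: the first line is the header, maybe starting with "#".
    header = [name.lstrip("#") for name in next(rows)]
    col = {name: i for i, name in enumerate(header)}
    index = {}
    for row in rows:
        if not row:
            continue  # skip blank lines
        accession = row[col["assembly_accession"]]
        species = row[col["species"]]
        collection = collection_from_core_db(row[col["core_db"]])
        index[accession] = (collection, species)
    return index
```
</pre>
    <p>Passing an open file handle for the downloaded metadata file to
      index_by_accession gives a dictionary keyed by accession; the
      (collection, species) pair can then be slotted into the directory
      patterns given earlier in this message.</p>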
    <p>Hope it helps</p>
    <p>Kind regards,</p>
    <p>Stefano</p>
    <div class="moz-cite-prefix">On 11/07/2025 2:42 pm, Allan Kamau
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAF3N6oQfDGCKww4YCz39V7EcdDH+5=9vAvA_PCRcq4_ETYwrrQ@mail.gmail.com">
      <div dir="ltr">Dear Stefano,
        <div><br>
        </div>
        <div>Thank you for your reply and advice.</div>
        <div><br>
        </div>
        <div>I am now constructing the ftp urls to gtf resources using
          data from the species field and data extracted from core_db
          field as you suggested.</div>
        <div><br>
        </div>
        <div>Regarding fasta data, is there metadata containing the
          entire directory tree of the entire ftp directory by which I
          could easily identify the specific fasta file type for a given
          accession?</div>
        <div><br>
        </div>
        <div>Regards,</div>
        <div>Allan.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, Jul 10, 2025 at
          1:58 PM Stefano Giorgetti &lt;<a
            href="mailto:sgiorgetti@ebi.ac.uk" target="_blank"
            moz-do-not-send="true" class="moz-txt-link-freetext">sgiorgetti@ebi.ac.uk</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <p>Dear Allan,</p>
            <p>Thanks for your email and for using Ensembl services.<br>
            </p>
            <p>We have 2 main cases: stand-alone species (for instance,
              all the plants) and species sharing a "collection DB",
              like bacteria.</p>
            <p>For the stand-alone species - say plants - the path to
              the (release 60) GTF would be<br>
              <a
href="http://ftp.ensemblgenomes.org/pub/plants/release-60/gtf/"
                target="_blank" moz-do-not-send="true"
                class="moz-txt-link-freetext">http://ftp.ensemblgenomes.org/pub/plants/release-60/gtf/</a>&lt;species&gt;/<br>
              where &lt;species&gt; can be found from one of the species
              metadata files.</p>
            <div>For species belonging to a collection - say bacteria -
              the path to the (release 60) GTF would be<br>
              <a
href="http://ftp.ensemblgenomes.org/pub/release-60/bacteria/gtf/"
                target="_blank" moz-do-not-send="true"
                class="moz-txt-link-freetext">http://ftp.ensemblgenomes.org/pub/release-60/bacteria/gtf/</a>&lt;collection&gt;/&lt;species&gt;/<br>
              Regrettably, there is no trivial way to get the collection
              the species belongs to.<br>
              One hopefully not-too-cumbersome way would be to extract it
              from the "core_db" field of the species metadata file.<br>
              For instance for "acetobacter_syzygii_gca_002276805", we
              have core db "bacteria_60_collection_core_60_113_1", the
              collection name would be "bacteria_60_collection"; thus
              giving <a
href="http://ftp.ensemblgenomes.org/pub/release-60/bacteria/gtf/bacteria_60_collection/acetobacter_syzygii_gca_002276805/"
                target="_blank" moz-do-not-send="true"
                class="moz-txt-link-freetext">http://ftp.ensemblgenomes.org/pub/release-60/bacteria/gtf/bacteria_60_collection/acetobacter_syzygii_gca_002276805/</a><i><br>
              </i></div>
            <div><i><br>
              </i></div>
            <div>Hope it helps.<br>
              Any questions, please do not hesitate to ask.<br>
            </div>
            <div><br>
            </div>
            <div>Kind regards,</div>
            <div>Stefano on behalf of the Ensembl team<i><br>
              </i></div>
            <div><i><br>
              </i></div>
            <div>On 10/07/2025 6:49 am, Allan Kamau wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div dir="ltr">Greetings,
                  <div><br>
                  </div>
                  <div>Given an entry from one of the species metadata
                    files such as "<a
href="http://ftp.ensemblgenomes.org/pub/plants/release-60/species_EnsemblPlants.txt"
                      target="_blank" moz-do-not-send="true">ftp.ensemblgenomes.org/pub/plants/release-60/species_EnsemblPlants.txt</a>"
                    I would like to determine the ftp path to the "gtf"
                    data of the given species.
                    <div><br>
                    </div>
                    <div>Is there such a mapping file or mechanism that
                      I can use?</div>
                    <div><br>
                    </div>
                    <div>Or, in short, if I have an "assembly" value such
                      as "ASM16007v2" or an "assembly_accession"
                      value, for example "GCA_000160075.2", is there a way
                      to determine the ftp path to the gtf data which is
                      "<a
href="http://tp.ensemblgenomes.org/pub/release-60/bacteria/gtf/bacteria_118_collection/abiotrophia_defectiva_atcc_49176_gca_000160075"
                        target="_blank" moz-do-not-send="true">tp.ensemblgenomes.org/pub/release-60/bacteria/gtf/bacteria_118_collection/abiotrophia_defectiva_atcc_49176_gca_000160075</a>"
                      in this case?</div>
                    <div><br>
                    </div>
                    <div>Regards,</div>
                    <div>- Allan.</div>
                  </div>
                </div>
              </div>
              <br>
              <fieldset></fieldset>
              <pre>_______________________________________________
Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank"
              moz-do-not-send="true" class="moz-txt-link-freetext">Dev@ensembl.org</a>
Posting guidelines and subscribe/unsubscribe info: <a
href="https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org"
              target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org</a>
Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank"
              moz-do-not-send="true" class="moz-txt-link-freetext">http://www.ensembl.info/</a>
</pre>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>