<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi Jinrui</p>
    <p>Have a look back at my first email: it contains an example URL for
      our REST API server that retrieves the alignment of a human region.</p>
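    <p>As a rough sketch of how that URL could be built and its output
      summarised in Python: the function names below are made up for
      illustration, and the response layout (a list of blocks, each with
      an "alignments" list whose entries carry "seq_region", "start",
      "end" and "strand") is an assumption that should be checked
      against the REST documentation.</p>
<pre>
```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# The first email used http://; https is assumed to work equally well.
REST_SERVER = "https://rest.ensembl.org"


def alignment_url(species, region, species_set, method="LASTZ_NET"):
    """Build an alignment/region URL like the one in the first email."""
    query = urlencode(
        {
            "species_set": species_set,
            "method": method,
            "content-type": "application/json",
        },
        safe="/",
    )
    return "{}/alignment/region/{}/{}?{}".format(
        REST_SERVER, quote(species), quote(region, safe=":-"), query
    )


def fetch_blocks(url):
    """Download and decode the JSON response (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def block_regions(blocks):
    """Summarise each block as a list of 'seq_region:start-end:strand' strings.

    Assumes every entry of block["alignments"] has the keys
    seq_region, start, end and strand.
    """
    regions = []
    for block in blocks:
        regions.append(
            [
                "{seq_region}:{start}-{end}:{strand}".format(**entry)
                for entry in block.get("alignments", [])
            ]
        )
    return regions


if __name__ == "__main__":
    # Only prints the URL; call fetch_blocks(url) to actually query the server.
    print(alignment_url("homo_sapiens", "17:63997797-64000390:1", "homo_sapiens"))
```
</pre>
    <p>For the self-alignment, each block should then list the query
      region next to the other human region it aligns to.</p>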
    <p>If you want to use the Perl API, you can adapt this exercise script:
<a class="moz-txt-link-freetext" href="https://github.com/Ensembl/ensembl-presentation/blob/master/API/Compara/exercises/gab1.pl">https://github.com/Ensembl/ensembl-presentation/blob/master/API/Compara/exercises/gab1.pl</a><br>
    </p>
    <p>Regards,<br>
      Matthieu<br>
    </p>
    <div class="moz-cite-prefix">On 18/03/2019 16:53, Jin-Rui Xu wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAFZUdz5xe_k8tz1St8wSKFqjxFgkavcf8wsy7z5g1Z+EjARaXg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">Hi Matthieu,
        <div><br>
        </div>
        <div>I am going to use the human self-alignment to detect
          paralogous genomic regions (particularly non-coding regions),
          but I cannot find API examples for this purpose. Could you
          pass me some scripts or examples to start from? Say I have
          a human genomic coordinate and want to find its paralogous
          regions and their alignments.</div>
        <div>Many thanks.</div>
        <div>Jinrui</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Mar 13, 2019 at 8:37
          AM Matthieu Muffato &lt;<a href="mailto:muffato@ebi.ac.uk"
            moz-do-not-send="true">muffato@ebi.ac.uk</a>&gt; wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF">
            <p>Hi Jinrui</p>
            <p>In all our pairwise alignments, we refine the LastZ
              alignment blocks with two steps called "chaining" and
              "netting" (see <a
                class="gmail-m_-1501946497467315318moz-txt-link-freetext"
                href="http://europepmc.org/articles/PMC4852398"
                target="_blank" moz-do-not-send="true">http://europepmc.org/articles/PMC4852398</a>
              and <a
                class="gmail-m_-1501946497467315318moz-txt-link-freetext"
                href="http://genomewiki.ucsc.edu/index.php/Chains_Nets"
                target="_blank" moz-do-not-send="true">http://genomewiki.ucsc.edu/index.php/Chains_Nets</a>
              for more information). What you get in our database is the
              product of these two steps.<br>
              The netting phase is done on the reference species only;
              we don't do bidirectional netting. This means that there
              is very little overlap / nesting on the reference species
              (human in the case of the human vs * alignments). Overlap
              / nesting is allowed on the non-reference species, though.
              For instance, in the human-mouse alignments, there are
              20,000 pairs of blocks that overlap on human, and
              1,900,000 pairs of blocks that overlap on mouse.<br>
            </p>
            <p>So in this case, yes, you can identify human paralogous
              regions 1) through the self-alignment and 2) through the
              human-mouse alignment (or any pairwise alignment that
              involves human) by finding human regions that align to the
              same region in the other species.</p>
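            <p>The second approach can be sketched in Python as follows.
              This is only an illustration: the function names and the
              (chromosome, start, end) tuple layout are assumptions, and
              the blocks are expected to have been extracted from a
              pairwise alignment beforehand.</p>
<pre>
```python
from itertools import combinations


def overlap_length(a, b):
    """Shared bases between two (chrom, start, end) intervals, 1-based inclusive."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)


def candidate_paralogues(blocks):
    """Find human regions that align to the same region in the other species.

    blocks is a list of (human_interval, other_interval) pairs, one per
    alignment block. The result lists pairs of human intervals that do not
    overlap each other but whose partner intervals overlap in the other
    species, together with the size of that overlap.
    """
    hits = []
    for (human_a, other_a), (human_b, other_b) in combinations(blocks, 2):
        shared = overlap_length(other_a, other_b)
        if shared and not overlap_length(human_a, human_b):
            hits.append((human_a, human_b, shared))
    return hits


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    example = [
        (("17", 100, 200), ("11", 1000, 1100)),
        (("2", 500, 600), ("11", 1050, 1150)),
    ]
    for human_a, human_b, shared in candidate_paralogues(example):
        print(human_a, human_b, shared)
```
</pre>
            <p>Comparing every pair of blocks is quadratic, so for a
              whole-genome run you would probably sort the blocks by
              their position in the other species, or use an interval
              tree, instead.</p>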
            <p>Hope this helps,</p>
            <p>Matthieu<br>
            </p>
            <div class="gmail-m_-1501946497467315318moz-cite-prefix">On
              11/03/2019 19:45, Jin-Rui Xu wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Hi Matthieu,</div>
                <div><br>
                </div>
                <div>Thank you very much for your email. </div>
                <div><br>
                </div>
                <div>I am wondering about the human self-alignment: one
                  genomic region may map to multiple other
                  regions. Such multiple hits also exist in, e.g., the human
                  vs mouse genome alignment.</div>
                <div>Does Ensembl provide all of these regions or
                  just the best one? Can these multiple hits be retrieved with
                  the Compara Perl API?</div>
                <div><br>
                </div>
                <div>Thanks!</div>
                <div>Jinrui</div>
                <div class="gmail_quote">
                  <div dir="ltr" class="gmail_attr">On Mon, Mar 11, 2019
                    at 3:05 PM Matthieu Muffato &lt;<a
                      href="mailto:muffato@ebi.ac.uk" target="_blank"
                      moz-do-not-send="true">muffato@ebi.ac.uk</a>&gt;
                    wrote:<br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px 0.8ex;border-left:1px solid
                    rgb(204,204,204);padding-left:1ex">Dear Jinrui,<br>
                    <br>
                    We have a human self-alignment, computed with
                    LastZ, that <br>
                    identifies paralogous regions within the genome. You
                    can find the whole <br>
                    alignment on the FTP site <br>
                    <a
href="ftp://ftp.ensembl.org/pub/current_maf/ensembl-compara/pairwise_alignments/"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">ftp://ftp.ensembl.org/pub/current_maf/ensembl-compara/pairwise_alignments/</a>
                    <br>
                    but you can also query specific regions through the REST API: <br>
                    <a
href="http://rest.ensembl.org/alignment/region/homo_sapiens/17:63997797-64000390:1?species_set=homo_sapiens;content-type=application/json;method=LASTZ_NET"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">http://rest.ensembl.org/alignment/region/homo_sapiens/17:63997797-64000390:1?species_set=homo_sapiens;content-type=application/json;method=LASTZ_NET</a><br>
                    <br>
                    Human is the only species for which we have a
                    self-alignment.<br>
                    <br>
                    Kind regards,<br>
                    Matthieu<br>
                    <br>
                    On 09/03/2019 03:10, Jin-Rui Xu wrote:<br>
                    > Hello,<br>
                    ><br>
                    > I just started learning the compara API.
                    However, I am still not sure <br>
                    > whether it can address my questions. I am
                    wondering if someone could <br>
                    > give me some guidance and example scripts. Here
                    is my question: (1) I <br>
                    > want to identify all paralogous DNA fragments
                    (not necessarily genes) <br>
                    > in a genome. One genomic region may have more
                    than one duplicate. (2) <br>
                    > Then, I want to find in which of the other
                    species, the two paralogous <br>
                    > DNAs have a common ancestor.<br>
                    > Alternatively, I can focus on two genomic
                    regions in a genome to test <br>
                    > if they are paralogous, and then which species
                    has their common <br>
                    > ancestral DNA<br>
                    > How could I get this done using compara API
                    (version 95)?<br>
                    ><br>
                    > Many thanks!<br>
                    ><br>
                    > Jinrui<br>
                    <br>
                    -- <br>
                    Matthieu Muffato, Ph.D.<br>
                    Ensembl Compara and TreeFam Project Leader<br>
                    European Bioinformatics Institute (EMBL-EBI)<br>
                    European Molecular Biology Laboratory<br>
                    Wellcome Trust Genome Campus, Hinxton<br>
                    Cambridge, CB10 1SD, United Kingdom<br>
                    Room  A3-145<br>
                    Phone + 44 (0) 1223 49 4631<br>
                    Fax   + 44 (0) 1223 49 4468<br>
                    <br>
                  </blockquote>
                </div>
              </div>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>