<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Jinrui</p>
<p>Have a look back at my first email: it contains an example URL for
our REST API server that retrieves the alignment of a human region.</p>
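<p>As a minimal sketch, the example URL from the first email in this
thread could be rebuilt for any region along these lines. Python is used
purely for illustration; the helper names build_alignment_url and
fetch_alignment are hypothetical, the parameter values are copied from
that example URL, and nothing is assumed about the layout of the JSON
reply.</p>
<pre>
```python
import json
from urllib.request import urlopen

# Server and query parameters are copied from the example URL in this thread.
REST_SERVER = "http://rest.ensembl.org"


def build_alignment_url(species, chrom, start, end, strand=1,
                        species_set="homo_sapiens", method="LASTZ_NET"):
    """Build an alignment/region URL in the same form as the example URL.

    Hypothetical helper: it only formats a string and does not validate
    the species name, the coordinates or the method.
    """
    region = f"{chrom}:{start}-{end}:{strand}"
    params = (f"species_set={species_set};"
              f"content-type=application/json;method={method}")
    return f"{REST_SERVER}/alignment/region/{species}/{region}?{params}"


def fetch_alignment(url, timeout=30):
    """Download the reply and decode it as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Rebuilds the human self-alignment query from the first email.
    print(build_alignment_url("homo_sapiens", "17", 63997797, 64000390))
```
</pre>
<p>With the default arguments this reproduces the self-alignment query
(species_set=homo_sapiens); passing a different species_set, for
instance mus_musculus, should give the corresponding pairwise query,
assuming that alignment is available on the server.</p>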
<p>If you want to use the Perl API, you can adapt
<a class="moz-txt-link-freetext" href="https://github.com/Ensembl/ensembl-presentation/blob/master/API/Compara/exercises/gab1.pl">https://github.com/Ensembl/ensembl-presentation/blob/master/API/Compara/exercises/gab1.pl</a><br>
</p>
<p>Regards,<br>
Matthieu<br>
</p>
<div class="moz-cite-prefix">On 18/03/2019 16:53, Jin-Rui Xu wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFZUdz5xe_k8tz1St8wSKFqjxFgkavcf8wsy7z5g1Z+EjARaXg@mail.gmail.com">
<div dir="ltr">Hi Matthieu,
<div><br>
</div>
<div>I am going to use the human self-alignment to detect
paralogous genomic regions (particularly non-coding regions),
but I cannot find API examples for this purpose. Could you
pass me some scripts or examples that I can start from? Say I have
a human genomic coordinate and want to find its paralogous
regions and alignments.</div>
<div>Many thanks.</div>
<div>Jinrui</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Mar 13, 2019 at 8:37
AM Matthieu Muffato &lt;<a href="mailto:muffato@ebi.ac.uk"
moz-do-not-send="true">muffato@ebi.ac.uk</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hi Jinrui</p>
<p>In all our pairwise alignments, we refine the LastZ
alignment blocks with two steps called "chaining" and
"netting" (see <a
class="gmail-m_-1501946497467315318moz-txt-link-freetext"
href="http://europepmc.org/articles/PMC4852398"
target="_blank" moz-do-not-send="true">http://europepmc.org/articles/PMC4852398</a>
and <a
class="gmail-m_-1501946497467315318moz-txt-link-freetext"
href="http://genomewiki.ucsc.edu/index.php/Chains_Nets"
target="_blank" moz-do-not-send="true">http://genomewiki.ucsc.edu/index.php/Chains_Nets</a>
for more information). What you get in our database is the
product of these two steps.<br>
The netting phase is done on the reference species only;
we don't do bidirectional netting. This means that there
is very little overlap / nesting on the reference species
(human in the case of the human vs * alignments). Overlap
/ nesting is allowed on the non-reference species, though.
For instance, in the human-mouse alignments, there are
20,000 pairs of blocks that overlap on human, and
1,900,000 pairs of blocks that overlap on mouse.<br>
</p>
<p>So in this case, yes, you can identify human paralogous
regions 1) through the self-alignment and 2) through the
human-mouse alignment (or any pairwise alignment that
involves human) by finding human regions that align to the
same region in the other species.</p>
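<p>Approach 2) could be sketched as follows, assuming the alignment
blocks have already been loaded into simple records. The Block layout
and the function names are hypothetical and are not part of the Ensembl
API; coordinates are treated as inclusive, and the all-against-all
comparison is only meant for small inputs.</p>
<pre>
```python
from itertools import combinations
from typing import NamedTuple


class Block(NamedTuple):
    """One pairwise alignment block (hypothetical layout, not an Ensembl type)."""
    ref_chrom: str      # reference species (human) sequence name
    ref_start: int
    ref_end: int
    other_chrom: str    # non-reference species sequence name
    other_start: int
    other_end: int


def overlap_length(start_a, end_a, start_b, end_b):
    """Number of bases shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


def candidate_paralogs(blocks):
    """Return pairs of blocks whose non-reference regions overlap.

    Pairs whose reference regions also overlap are skipped, because they
    describe the same human locus rather than two distinct ones.
    """
    pairs = []
    for a, b in combinations(blocks, 2):
        if a.other_chrom != b.other_chrom:
            continue
        same_locus = (a.ref_chrom == b.ref_chrom and
                      overlap_length(a.ref_start, a.ref_end,
                                     b.ref_start, b.ref_end) > 0)
        if same_locus:
            continue
        if overlap_length(a.other_start, a.other_end,
                          b.other_start, b.other_end) > 0:
            pairs.append((a, b))
    return pairs
```
</pre>
<p>Each returned pair points at two distinct human regions that align
to overlapping coordinates in the other species, which makes them
candidates for paralogy rather than confirmed paralogs.</p>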
<p>Hope this helps,</p>
<p>Matthieu<br>
</p>
<div class="gmail-m_-1501946497467315318moz-cite-prefix">On
11/03/2019 19:45, Jin-Rui Xu wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi Matthieu,</div>
<div><br>
</div>
<div>Thank you very much for your email. </div>
<div><br>
</div>
<div>I am wondering whether, in the human self-alignment, one
genomic region may be mapped to multiple other
regions. These multiple hits also exist in e.g. human
vs mouse genome alignment.</div>
<div>Does Ensembl provide all these multiple regions or
just the best one? Can these multiple hits be retrieved with
the Compara Perl API?</div>
<div><br>
</div>
<div>Thanks!</div>
<div>Jinrui</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Mar 11, 2019
at 3:05 PM Matthieu Muffato &lt;<a
href="mailto:muffato@ebi.ac.uk" target="_blank"
moz-do-not-send="true">muffato@ebi.ac.uk</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">Dear Jinrui,<br>
<br>
We have a human self-alignment, computed with LastZ, that <br>
identifies paralogous regions within the genome. You
can download the whole <br>
alignment from the FTP site <br>
<a
href="ftp://ftp.ensembl.org/pub/current_maf/ensembl-compara/pairwise_alignments/"
rel="noreferrer" target="_blank"
moz-do-not-send="true">ftp://ftp.ensembl.org/pub/current_maf/ensembl-compara/pairwise_alignments/</a>
<br>
but also query specific regions: <br>
<a
href="http://rest.ensembl.org/alignment/region/homo_sapiens/17:63997797-64000390:1?species_set=homo_sapiens;content-type=application/json;method=LASTZ_NET"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://rest.ensembl.org/alignment/region/homo_sapiens/17:63997797-64000390:1?species_set=homo_sapiens;content-type=application/json;method=LASTZ_NET</a><br>
<br>
Human is the only species for which we have a
self-alignment.<br>
<br>
Kind regards,<br>
Matthieu<br>
<br>
On 09/03/2019 03:10, Jin-Rui Xu wrote:<br>
> Hello,<br>
><br>
> I just started learning the compara API.
However, I am still not sure <br>
> whether it can address my questions. I am
wondering if someone could <br>
> give me some guidance and example scripts. Here
is my question: (1) I <br>
> want to identify all paralogous DNA fragments
(not necessarily genes) <br>
> in a genome. One genomic region may have more
than one duplicate. (2) <br>
> Then, I want to find in which of the other
species, the two paralogous <br>
> DNAs have a common ancestor.<br>
> Alternatively, I can focus on two genomic
regions in a genome to test <br>
> if they are paralogous, and then which species
has their common <br>
> ancestral DNA<br>
> How could I get this done using compara API
(version 95)?<br>
><br>
> Many thanks!<br>
><br>
> Jinrui<br>
<br>
-- <br>
Matthieu Muffato, Ph.D.<br>
Ensembl Compara and TreeFam Project Leader<br>
European Bioinformatics Institute (EMBL-EBI)<br>
European Molecular Biology Laboratory<br>
Wellcome Trust Genome Campus, Hinxton<br>
Cambridge, CB10 1SD, United Kingdom<br>
Room A3-145<br>
Phone + 44 (0) 1223 49 4631<br>
Fax + 44 (0) 1223 49 4468<br>
<br>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>