<div dir="ltr"><div>Thank you very much Matthieu. <br>Very helpful.<br></div><div><br></div><div>Thanks<br></div><div>Haiming<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 29, 2014 at 10:02 AM, Matthieu Muffato <span dir="ltr"><<a href="mailto:muffato@ebi.ac.uk" target="_blank">muffato@ebi.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Haiming,<br>
<br>
You need to access the multiple alignments through the GenomicAlignTree objects. They put together the history of the extant regions and their ancestral sequences.<br>
<br>
You can have a look at the code we're using to generate the AgeOfBase track<br>
<a href="https://github.com/Ensembl/ensembl-compara/blob/release/77/modules/Bio/EnsEMBL/Compara/RunnableDB/BaseAge/BaseAge.pm#L121" target="_blank">https://github.com/Ensembl/ensembl-compara/blob/release/77/modules/Bio/EnsEMBL/Compara/RunnableDB/BaseAge/BaseAge.pm#L121</a><br>
especially these lines:<br>
- L140: get all the GenomicAlignTree objects<br>
- L160: iterate over list of trees<br>
- L190: iterate over the internal nodes of a given tree<br>
- L195+203: get the ancestral sequence of this node<br>
<br>
Hope this helps,<br>
Matthieu<br>
<br>
On 29/09/14 17:45, Tang, Haiming wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi, Stephen<br>
<br>
Thank you very much for your help.<br>
<br>
This solves my problem. So column 4, e.g. Ggor-Hsap-Hsap-Pabe[4], stands<br>
for the ancestor of the listed species, back to which the base has been<br>
preserved.<br>
<br>
May I also know the script you used to get the tree and alignment info<br>
as seen in your email?<br>
<br>
I tried :<br>
<br>
"my $mlss =<br>
$mlss_adaptor->fetch_by_method_link_type_species_set_name("EPO", "mammals");<br>
<br>
my $slice = $slice_adaptor->fetch_by_region('toplevel', $seq_region,<br>
$seq_region_start, $seq_region_end);<br>
<br>
my $genomic_align_blocks =<br>
$genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss, $slice);<br>
<br>
" to fetch the ancestral sequences.<br>
<br>
But it doesn't seem to work.<br>
<br>
Thanks<br>
Haiming<br>
<br>
On Mon, Sep 29, 2014 at 8:42 AM, Stephen Fitzgerald &lt;<a href="mailto:stephenf@ebi.ac.uk" target="_blank">stephenf@ebi.ac.uk</a>&gt; wrote:<br>
<br>
Hi Haiming, column 4 lists the set of species whose ancestor had the<br>
same base as human (we use a program called Ortheus to infer the<br>
sequence of the ancestral nodes in the tree connecting all the<br>
extant species).<br>
<br>
For example:<br>
<br>
chr1 1031796 1031797 Mmul-Panu-Hsap-Ptro[4] 196 50,50,255<br>
<br>
The most recent common ancestor of the primates present in the alignment<br>
(marked with a "*") has a G at this position, the same base as human<br>
(this node is at the root of the 4 primates in the alignment). The next<br>
deepest ancestor (between rodents and primates, marked with a "**") is<br>
predicted to have a T at this position. So, somewhere between these two<br>
ancestors the base changed T->G. Hence, this position would be marked as<br>
primate specific.<br>
<br>
<br>
Human › chromosome:GRCh38:1:1031796:1031797:1<br>
Ancestral sequences › (homo_sapiens,pan_troglodytes);<br>
Chimpanzee › chromosome:CHIMP2.1.4:7:159477370:159477371:1<br>
Ancestral sequences › ((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)); *<br>
Macaque › chromosome:MMUL_1:1:4106934:4106935:1<br>
Ancestral sequences › (papio_anubis,macaca_mulatta);<br>
Olive baboon › scaffold:PapAnu2.0:JH684932.1:192067:192068:1<br>
Ancestral sequences › (((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)); **<br>
Mouse › chromosome:GRCm38:4:156188534:156188535:-1<br>
Ancestral sequences › (mus_musculus,rattus_norvegicus);<br>
Rat › chromosome:Rnor_5.0:5:177087882:177087883:-1<br>
Ancestral sequences › ((((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)),((sus_scrofa,bos_taurus),canis_familiaris));<br>
Cow › chromosome:UMD3.1:16:52694475:52694476:-1<br>
Ancestral sequences › (sus_scrofa,bos_taurus);<br>
Pig › chromosome:Sscrofa10.2:6:57872690:57872691:-1<br>
Ancestral sequences › ((sus_scrofa,bos_taurus),canis_familiaris);<br>
Dog › chromosome:CanFam3.1:5:56250642:56250643:1<br>
<br>
<br>
Human G<br>
Ancestral sequences G<br>
Chimpanzee G<br>
Ancestral sequences G *<br>
Macaque G<br>
Ancestral sequences G<br>
Olive baboon G<br>
Ancestral sequences T **<br>
Mouse C<br>
Ancestral sequences C<br>
Rat C<br>
Ancestral sequences T<br>
Cow T<br>
Ancestral sequences T<br>
Pig G<br>
Ancestral sequences T<br>
Dog T<br>
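<br>
The reasoning above (walk up from human through its ancestors until the base differs) can be sketched in a few lines of Python. This is only an illustration using the bases listed in this example; the function name, the node labels and the list layout are assumptions made for the sketch and are not part of the Ensembl API.<br>

```python
# Illustrative sketch only: labels and data layout are hypothetical,
# the bases are copied from the example alignment column above.

def deepest_matching_ancestor(extant_base, lineage):
    """Walk from the closest ancestor towards the root and return the label
    of the last ancestor that still carries the same base as the extant
    species. Returns None if even the closest ancestor differs."""
    last_match = None
    for label, base in lineage:
        if base != extant_base:
            break  # the substitution happened on the branch below this node
        last_match = label
    return last_match


# Ancestors on the human lineage, closest first.
human_lineage = [
    ("Hsap-Ptro", "G"),
    ("Mmul-Panu-Hsap-Ptro", "G"),   # node marked "*"
    ("primates-rodents", "T"),      # node marked "**"
    ("root", "T"),
]

result = deepest_matching_ancestor("G", human_lineage)
print(result)
```

The returned label corresponds to the node marked "*", which is why the example position is reported with the four primates in column 4.<br>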
<br>
<br>
We don't store speciation times for the age of base track.<br>
Information regarding speciation times can be obtained from sites<br>
such as Time Tree (<a href="http://www.timetree.org/" target="_blank">http://www.timetree.org/</a>).<br>
<br>
HTH,<br>
Stephen.<br>
<br>
On Fri, 26 Sep 2014, Tang, Haiming wrote:<br>
<br>
Hi Stephen,<br>
I followed your instructions and got the bed file.<br>
<br>
Column 4 appears to list the species for which that base is the<br>
same as in human, since it looks like Hsap is in every line.<br>
The number in square brackets [] is just the number of species<br>
listed.<br>
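<br>
The column 4 layout described above (species codes joined by "-", followed by the number of species in square brackets) can be split apart with a short Python sketch. It assumes six whitespace-separated fields per line, as in the example lines quoted in this thread; the function and field names are hypothetical, and other lines of the file may need extra handling.<br>

```python
import re

# Hypothetical parser for one line of the age-of-base BED file.
# Column 4 is assumed to look like "Ggor-Hsap-Hsap-Pabe[4]".
NAME_RE = re.compile(r"^([A-Za-z]+(?:-[A-Za-z]+)*)\[(\d+)\]$")


def parse_base_age_line(line):
    """Split a BED line into typed fields and unpack column 4."""
    chrom, start, end, name, score, rgb = line.split()
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected column 4: {name!r}")
    return {
        "chrom": chrom,
        "start": int(start),
        "end": int(end),
        "species": match.group(1).split("-"),
        "count": int(match.group(2)),
        "score": int(score),
        "rgb": tuple(int(c) for c in rgb.split(",")),
    }


record = parse_base_age_line(
    "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255"
)
print(record["species"], record["count"])
```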
<br>
But the file doesn’t seem to give the age of the base.<br>
<br>
For example, how should I interpret Ggor-Hsap-Hsap-Pabe[4] in<br>
<br>
"chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255"?<br>
<br>
Are Ggor and Hsap ancestral species?<br>
<br>
Or is the age of base stored somewhere else?<br>
<br>
Thanks<br>
<br>
Haiming<br>
<br>
On Fri, Sep 26, 2014 at 2:47 AM, Stephen Fitzgerald &lt;<a href="mailto:stephenf@ebi.ac.uk" target="_blank">stephenf@ebi.ac.uk</a>&gt; wrote:<br>
Hi Haiming,<br>
the Compara API is used to retrieve information from the<br>
Compara database. However, the "Age of Base" track is<br>
generated from a BigBed binary file, so it is not part of<br>
the Compara database. The BigBed file is generated from a<br>
BED file. I have transferred this BED file (from release<br>
76) to our FTP site. You can retrieve this file using<br>
anonymous FTP from here:<br>
<br>
ftp <a href="http://ftp.ebi.ac.uk" target="_blank">ftp.ebi.ac.uk</a><br>
<br>
cd pub/software/ensembl/stephen/BaseAge/<br>
<br>
get base_age_76.bed.gz<br>
<br>
Hope this helps,<br>
Stephen.<br>
<br>
<br>
On Thu, 25 Sep 2014, Tang, Haiming wrote:<br>
<br>
<br>
Dear group, my name is Haiming Tang. I'm in Dr Paul<br>
Thomas's group at the University of Southern<br>
California.<br>
<br>
I'm trying to retrieve "Age of Base" using the Perl API.<br>
<br>
As described in<br>
"<a href="http://www.ensembl.org/info/genome/compara/analyses.html#age_of_base" target="_blank">http://www.ensembl.org/info/genome/compara/analyses.html#age_of_base</a>"<br>
<br>
"Age of Base<br>
<br>
From these ancestral sequences, we infer the age of<br>
a base, i.e. the timing of the most recent mutation for each<br>
base of the genome. Each position of the human<br>
genome is compared to its immediate inferred ancestor, then its<br>
ancestor, etc. until a difference is found. The<br>
inferred substitution event therefore occurred on a specific<br>
branch of the tree, which is identified by all the<br>
extant species which eventually descended from that branch, as<br>
illustrated below."<br>
<br>
"Age of base" is closely related to the EPO ancestral<br>
alignment, but I could not find any related method in the<br>
Compara Perl API Documentation or the Compara API Tutorial.<br>
<br>
Can anyone show me how to retrieve "age of base"?<br>
<br>
Thank you in advance.<br>
<br>
Haiming<br>
<br>
<br>
<br>
<br>
_______________________________________________<br>
Dev mailing list <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><br>
<br>
</blockquote>
<br>
-- <br>
Matthieu Muffato, Ph.D.<br>
Ensembl Compara Project Leader<br>
European Bioinformatics Institute (EMBL-EBI)<br>
European Molecular Biology Laboratory<br>
Wellcome Trust Genome Campus, Hinxton<br>
Cambridge, CB10 1SD, United Kingdom<br>
Room A3-145<br>
Phone <a href="tel:%2B%2044%20%280%29%201223%2049%204631" value="+441223494631" target="_blank">+ 44 (0) 1223 49 4631</a><br>
Fax <a href="tel:%2B%2044%20%280%29%201223%2049%204468" value="+441223494468" target="_blank">+ 44 (0) 1223 49 4468</a><br>
<br>
</blockquote></div><br></div>