<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Yan,</p>
<p>Since LASTZ alignments are on the whole genome, it's important to
note that genes may not be present on every part of the sequence.
To see which gene(s) are on a given aligned sequence region, you
can use:</p>
<p>$genomic_align->get_Slice->get_all_Genes();</p>
<p>However, at the end of the message, you mention homologies.
Homologies and whole genome alignments are really separate objects
in our system. Inference of homologies is performed on a gene tree
level and are not directly related to the whole genome alignments
(<a class="moz-txt-link-freetext" href="http://www.ensembl.org/info/genome/compara/homology_types.html">http://www.ensembl.org/info/genome/compara/homology_types.html</a>).
Can you confirm whether it is truly homologies that you are
interested in, or whether you are simply using the term to refer
to well-aligned regions (regardless of genes)? Here is a basic
example of how to retrieve human/mouse homologies using our API:
<a class="moz-txt-link-freetext" href="https://www.ebi.ac.uk/~muffato/workshops/2016_01_EBI/solutions_compara/homology2.pl">https://www.ebi.ac.uk/~muffato/workshops/2016_01_EBI/solutions_compara/homology2.pl</a></p>
<p>Finally, since it's clear that there is some relationship between
homology and alignment (biologically speaking), we have a QC
method for our homologies that tests the quality of the alignment
over a pair of homologous genes (WGA score). This may be useful to
you:
<a class="moz-txt-link-freetext" href="http://www.ensembl.org/info/genome/compara/Ortholog_qc_manual.html">http://www.ensembl.org/info/genome/compara/Ortholog_qc_manual.html</a><br>
</p>
<p>Hope this helps?</p>
<p>Best,</p>
<p>Carla</p>
<br>
<div class="moz-cite-prefix">On 24/07/2018 12:00, Yan P. Yuan wrote:<br>
</div>
<blockquote type="cite"
cite="mid:c1ec5cb9-71d9-9358-db5c-2ff7c0b4c3c9@embl.de">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
Hi,<br>
<br>
Currently, I'm interested in genomic aligments between 2 species:
homo sapience and mus musculus. <br>
<br>
If I use MethodLinkSpeciesSet-adaptor and
GenomicAlignBlock-adaptor to fetch the genomic alignments, the
output misses the corresponding gene names (or involved transcript
names). <br>
<br>
Can someone point out how to get the gene names via the used
methods? Or a different way to do it?<br>
<br>
Thanks a lot.<br>
<br>
Yan<br>
<br>
PS:<br>
<br>
The folllowing code was used:<br>
<blockquote><tt>use strict;</tt><tt><br>
</tt><tt>use warnings;</tt><tt><br>
</tt><tt><br>
</tt><tt>use Bio::EnsEMBL::Registry;</tt><tt><br>
</tt><tt>use Bio::AlignIO;</tt><tt><br>
</tt><tt><br>
</tt><tt>my $reg = 'Bio::EnsEMBL::Registry';</tt><tt><br>
</tt><tt><br>
</tt><tt>$reg->load_registry_from_db(</tt><tt><br>
</tt><tt> -host=>'ensembldb.ensembl.org',</tt><tt><br>
</tt><tt> -port=>5306,</tt><tt><br>
</tt><tt> -user=>'anonymous', </tt><tt><br>
</tt><tt>);</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt>my $mlss_adaptor = $reg->get_adaptor("Multi",
"compara", "MethodLinkSpeciesSet");</tt><tt><br>
</tt><tt><br>
</tt><tt>my $mlss =
$mlss_adaptor->fetch_by_method_link_type_registry_aliases('LASTZ_NET',
['human', 'mouse']);</tt><tt><br>
</tt><tt><br>
</tt><tt>my $species_slice_adaptor =
$reg->get_adaptor("mouse", "core", "Slice");</tt><tt><br>
</tt><tt><br>
</tt><tt>my $genomic_align_block_adaptor =
$reg->get_adaptor("Multi", "compara", "GenomicAlignBlock");</tt><tt><br>
</tt><tt><br>
</tt><tt># Define a region to fetch</tt><tt><br>
</tt><tt>my ($species_start, $species_end) = (143702058,
143702091);</tt><tt><br>
</tt><tt>my $species_slice =
$species_slice_adaptor->fetch_by_region("chromosome", 1,
$species_start, $species_end, -1);</tt><tt><br>
</tt><tt><br>
</tt><tt>my $all_genomic_align_blocks =
$genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss,
$species_slice);</tt><tt><br>
</tt><tt><br>
</tt><tt>my $alignIO =
Bio::AlignIO->newFh(-interleaved=>0,</tt><tt><br>
</tt><tt> -fh=>\*STDOUT,</tt><tt><br>
</tt><tt> -format=>'clustalw',</tt><tt><br>
</tt><tt> -idlength=>20);</tt><tt><br>
</tt><tt><br>
</tt><tt># print the restricted alignments</tt><tt><br>
</tt><tt>foreach my $genomic_align_block ( @{
$all_genomic_align_blocks } ) {</tt><tt><br>
</tt><tt> my $restricted_gab =
$genomic_align_block->restrict_between_reference_positions($species_start,
$species_end);</tt><tt><br>
</tt><tt><br>
</tt><tt> foreach my $genomic_align ( @{
$restricted_gab->get_all_GenomicAligns() } ) {</tt><tt><br>
</tt><tt> printf("%s %s:%d-%d:%s:%s\n",
$genomic_align->genome_db->name,
$genomic_align->dnafrag->name,
$genomic_align->dnafrag_start,
$genomic_align->dnafrag_end,
$genomic_align->dnafrag_strand,
$genomic_align->aligned_sequence());</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> print "\n";</tt><tt> </tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt><tt><br>
</tt></blockquote>
<tt>The output is then: <br>
</tt>
<blockquote><tt>homo_sapiens
1:193122704-193122731:1:TGTTCTTTATTTTTTTTTTCTCTT------TTCC</tt><tt><br>
</tt><tt>mus_musculus
1:143702058-143702091:-1:TGTTCTTAGGTTTTTTTTTTTTTTGTAATCTCCC</tt><br>
</blockquote>
<br>
One way to get to the gene name/transcripts I've used is
GeneMember-Adaptor & Homology-Adaptor, but then I'd miss the
genomic alignments. <br>
<br>
This was taken from an ENSEMBL example: <br>
<blockquote><tt>use strict;</tt><br>
<tt>use warnings;</tt><br>
<br>
<tt># using this script to get the ortholog of a given gene of
mus musculus</tt><br>
<br>
<tt>use Bio::EnsEMBL::Registry;</tt><br>
<br>
<tt>my $reg = 'Bio::EnsEMBL::Registry';</tt><br>
<br>
<tt>$reg->load_registry_from_db(</tt><br>
<tt> -host=>'ensembldb.ensembl.org',</tt><br>
<tt> -port=>5306,</tt><br>
<tt> -user=>'anonymous', </tt><br>
<tt>);</tt><br>
<br>
<br>
<tt># Getting the gene_member_adaptor</tt><br>
<tt>my $gene_member_adaptor =
Bio::EnsEMBL::Registry->get_adaptor('Multi', 'compara',
'GeneMember');</tt><br>
<br>
<tt>my $gene_member =
$gene_member_adaptor->fetch_by_stable_id('ENSMUSG00000065782');</tt><br>
<br>
<tt># Getting the homology adaptor </tt><br>
<tt>my $homology_adaptor =
Bio::EnsEMBL::Registry->get_adaptor('Multi', 'compara',
'Homology');</tt><br>
<br>
<tt>my $homologies =
$homology_adaptor->fetch_all_by_Member($gene_member);
#homologies = hash array</tt><br>
<br>
<tt>foreach my $homology ( @{$homologies} ) {</tt><br>
<tt> foreach my $member ( @{$homology->get_all_Members} )
{</tt><br>
<tt> if ( $member->taxon->scientific_name eq 'Homo
sapiens' ) {</tt><br>
<tt> print $member->description, "\n";</tt><br>
<tt> } </tt><br>
<tt> }</tt><br>
<tt>}</tt><br>
</blockquote>
<br>
Output: <br>
<br>
Transcript:ENST00000391309 Gene:ENSG00000212611 Chr:X
Start:114017033 End:114017102 Acc:RF00088<br>
Transcript:ENST00000384693 Gene:ENSG00000277846 Chr:11
Start:62853663 End:62853732 Acc:RF00088<br>
<br>
Or is there a better way to get homology members only between 2
species? <br>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Dev mailing list <a class="moz-txt-link-abbreviated" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a>
Posting guidelines and subscribe/unsubscribe info: <a class="moz-txt-link-freetext" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a>
Ensembl Blog: <a class="moz-txt-link-freetext" href="http://www.ensembl.info/">http://www.ensembl.info/</a>
</pre>
</blockquote>
<br>
<div class="moz-signature">-- <br>
Carla Cummins, Ph.D.<br>
Ensembl Compara Developer<br>
European Bioinformatics Institute (EMBL-EBI),<br>
European Molecular Biology Laboratory,<br>
Wellcome Trust Genome Campus, Hinxton,<br>
Cambridge, CB10 1SD, United Kingdom<br>
Room A3-145<br>
Phone + 44 (0) 1223 49 4240</div>
</body>
</html>