<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
BTW, we have an example script
(ensembl-compara/scripts/examples/families_workshop_fetchFamilyAlignment.pl)
that does something very similar to what you want (but just for one
gene).<br>
<br>
Javier<br>
<br>
<div class="moz-cite-prefix">On 07/11/12 11:25, Javier Herrero
wrote:<br>
</div>
<blockquote cite="mid:509A4517.10200@ebi.ac.uk" type="cite">
Hi Sabrina<br>
<br>
It is certainly possible to get proteins from several species.<br>
<br>
If you are interested in getting alignments for all possible
isoforms (each possible protein from each gene), you would have to
use the Ensembl families. These are groups of similar proteins,
but you should not assume that they are all orthologues. To infer
orthology, you need a phylogenetic tree. The trees we provide are
built using only a single representative protein per gene.<br>
<br>
In your case, I would recommend using the Ensembl families: query
the families with each cow protein (cow is your query species,
isn't it?) and dump the alignments. There are several options for
this. You may want to use all possible species (the families are
built using Ensembl and non-Ensembl proteins) or limit the
alignment to a subset of species. Also, in some cases you will
find that more than one cow protein is in the same family, so
you will get duplicated alignments. Is this OK?<br>
<br>
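To illustrate the point about duplicated alignments, here is a minimal
sketch in plain Python (it does not call the Ensembl API). It assumes you
already have a mapping from each query protein to its family; the
identifiers, the mapping and the two function names are made up for
illustration. It groups the query proteins by family, so that each family
alignment only needs to be dumped once, and moves the query to the first
row when writing the file for a given protein.<br>
<pre>
```python
from collections import defaultdict


def group_queries_by_family(protein_to_family):
    """Map each family ID to the sorted list of query proteins it contains."""
    families = defaultdict(list)
    for protein, family in protein_to_family.items():
        families[family].append(protein)
    return {family: sorted(proteins) for family, proteins in families.items()}


def query_first(alignment, query_id):
    """Return the alignment rows with the query protein moved to the top."""
    rows = [row for row in alignment if row[0] == query_id]
    rows += [row for row in alignment if row[0] != query_id]
    return rows


# Hypothetical data: two cow proteins share one family, a third is alone.
protein_to_family = {
    "COW_PROTEIN_A": "FAMILY_1",
    "COW_PROTEIN_B": "FAMILY_1",
    "COW_PROTEIN_C": "FAMILY_2",
}
grouped = group_queries_by_family(protein_to_family)

# Hypothetical aligned rows as (sequence ID, aligned sequence) pairs.
alignment = [("OTHER_SPECIES_X", "MDAP-"), ("COW_PROTEIN_B", "MDALR")]
reordered = query_first(alignment, "COW_PROTEIN_B")
```
</pre>
With this grouping, FAMILY_1 would be fetched once and then written out
twice, once with each of its two cow proteins as the first sequence.<br>
<br>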
Kind regards<br>
<br>
Javier<br>
<br>
<div class="moz-cite-prefix">On 05/11/12 13:47, srodriguez wrote:<br>
</div>
<blockquote
cite="mid:20121105144701.114227a0o1ysiiqs@www2.jouy.inra.fr"
type="cite">Hi Javier, <br>
<br>
Thank you for your answer. <br>
<br>
Actually, I would like to obtain one file per query protein,
aligned to the orthologous proteins of all the other species
(and not one sequence aligned to one sequence). <br>
<br>
Example: <br>
for protein ENSBTAP00000032594, the file would contain: <br>
ENSBTAP00000032594/1-397
MDALRASAAKPPTGRKMKARAPPPPGKPATPNLHSGQRSPRRASPGPPQNQLSR <br>
ENSP00000265136/1-1261
MDAPRASAAKPPTGRKMKARAPPPPGKAATLHVHSDQKPPHDGALGSQQNLVRMK <br>
ENSSPECIE2... <br>
ENSSPECIE3... <br>
*** ***********************.**
::**.*:.*: .: *. ** : <br>
<br>
Also, I would like to have one file per query protein: if a
gene has several proteins, each of them should get its own
file with the alignment as above. <br>
<br>
Do you know if it is feasible to obtain such an output with
Ensembl Compara? <br>
<br>
In that case, could you please modify the script to obtain it? <br>
<br>
Thank you very much in advance. <br>
<br>
Best regards, <br>
<br>
Sabrina. <br>
<br>
<br>
<br>
<br>
<br>
<br>
Javier Herrero <a moz-do-not-send="true"
class="moz-txt-link-rfc2396E" href="mailto:jherrero@ebi.ac.uk">&lt;jherrero@ebi.ac.uk&gt;</a>
wrote: <br>
<br>
<blockquote type="cite">Dear Sabrina <br>
<br>
I have only modified the script slightly. Essentially, I have
removed some bits that were not required and cleaned up the
code a little. I have also added the possibility of specifying
the query and the target species on the command line. Lastly, I
have changed the script to output the alignments into
separate files. <br>
<br>
Your strategy using the ENSEMBLGENE was correct. Indeed, you
get two proteins aligned. I believe this is what you want,
isn't it? <br>
<br>
I have added a few comments. Let me know if there is anything
that is not clear. <br>
<br>
Javier <br>
<br>
On 22/10/12 15:58, srodriguez wrote: <br>
<blockquote type="cite">Dear all, <br>
<br>
I would like to use the Ensembl Compara API to get the protein
sequences of a query species aligned with the homologous protein
sequences from other species. <br>
<br>
The script would take as input the query species name (and,
if possible, the hit species names). The script would get the
proteins of the query organism, then the homologous protein
sequences, and then write one file per query protein
sequence containing the alignment, with the query placed as
the first sequence followed by the aligned protein
sequences of the other species. <br>
<br>
I was thinking about using a "homology adaptor" with
ENSEMBLPEP, so I started a script that way, but I do not
obtain any results with ENSEMBLPEP, and the results with
ENSEMBLGENE are 2 sequences per alignment (see script
attached). <br>
<br>
I also tried with "families", but sometimes I do not get
the protein sequence for my query species in the sequence
alignment even though I searched using my taxon id
(script no. 2 attached). <br>
<br>
Do you have a script that already does this? <br>
<br>
If not, could you please help me reach my goal? <br>
<br>
Thank you very much in advance. <br>
<br>
Best regards, <br>
<br>
Sabrina. <br>
<br>
<br>
******************************************* <br>
Sabrina Rodriguez <br>
Bioinformatics <br>
Département de Génétique animale <br>
Unité GABI <br>
Domaine de Vilvert <br>
78532 Jouy en josas <br>
<br>
+33 (0) 1 34 65 29 53 <br>
<br>
<br>
_______________________________________________ <br>
Dev mailing list <a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:Dev@ensembl.org">Dev@ensembl.org</a> <br>
Posting guidelines and subscribe/unsubscribe info: <a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a>
<br>
Ensembl Blog: <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://www.ensembl.info/">http://www.ensembl.info/</a>
<br>
</blockquote>
<br>
-- <br>
Javier Herrero, PhD <br>
Ensembl Coordinator and Ensembl Compara Project Leader <br>
European Bioinformatics Institute (EMBL-EBI) <br>
Wellcome Trust Genome Campus, Hinxton <br>
Cambridge - CB10 1SD - UK <br>
<br>
<br>
</blockquote>
<br>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>