<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    BTW, we have an example script

    (ensembl-compara/scripts/examples/families_workshop_fetchFamilyAlignment.pl)

    that does something very similar to what you want (but just for one

    gene).<br>

    <br>

    Javier<br>

    <br>

    <div class="moz-cite-prefix">On 07/11/12 11:25, Javier Herrero

      wrote:<br>

    </div>

    <blockquote cite="mid:509A4517.10200@ebi.ac.uk" type="cite">


      Hi Sabrina<br>

      <br>

      It is certainly possible to get proteins from several species.<br>

      <br>

      If you are interested in getting alignments for all possible

      isoforms (each possible protein from each gene), you would have to

      use the Ensembl families. These are groups of similar proteins,

      but you should not assume that they are all orthologues. To infer

      orthology, you need a phylogenetic tree. The trees we provide are

      built using a single representative protein per gene.<br>

      <br>

      In your case, I would recommend using the Ensembl families: query

      the families with each cow protein (cow is your query species, isn't it?)

      and dump the alignments. There are several options for

      this. You may want to use all possible species (the families are

      built using Ensembl and non-Ensembl proteins) or limit the

      alignment to a subset of species. Also, in some cases you will

      find that more than one cow protein is in the same family, so

      you will get duplicated alignments. Is this OK?<br>

      <br>
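      A minimal sketch of this strategy in Python, assuming the family members have already been fetched into a plain dictionary. It does not call the Ensembl Compara API: the FAMILIES structure, the member identifiers and the function name are hypothetical and only illustrate the logic (one alignment per query protein, query placed first, optional restriction to a subset of species).<br>
      <br>
<pre>
```python
# Hypothetical in-memory stand-in for family data; NOT the Ensembl Compara API.
# Each family maps to a list of (member_id, species, aligned_sequence) tuples.
FAMILIES = {
    "fam1": [
        ("cowP1", "bos_taurus", "MDA-LR"),
        ("cowP2", "bos_taurus", "MDAPLR"),
        ("humP1", "homo_sapiens", "MDAP-R"),
        ("musP1", "mus_musculus", "MDA--R"),
    ],
}


def alignments_per_query(families, query_species, target_species=None):
    """Return {query_member_id: [(member_id, aligned_seq), ...]}.

    The query protein is always the first row. If target_species is given,
    only members from those species (plus the query itself) are kept.
    """
    result = {}
    for members in families.values():
        for q_id, q_species, q_seq in members:
            if q_species != query_species:
                continue
            rows = [(q_id, q_seq)]  # query goes first
            for m_id, m_species, m_seq in members:
                if m_id == q_id:
                    continue
                if target_species is not None and m_species not in target_species:
                    continue
                rows.append((m_id, m_seq))
            result[q_id] = rows
    return result


if __name__ == "__main__":
    for query_id, rows in alignments_per_query(FAMILIES, "bos_taurus").items():
        print(query_id, [member_id for member_id, _ in rows])
```
</pre>
      In this sketch, when two cow proteins belong to the same family, each gets its own alignment and both alignments contain the same family members. Restricting the species can also leave columns that contain only gaps.<br>
      <br>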

      Kind regards<br>

      <br>

      Javier<br>

      <br>

      <div class="moz-cite-prefix">On 05/11/12 13:47, srodriguez wrote:<br>

      </div>

      <blockquote

        cite="mid:20121105144701.114227a0o1ysiiqs@www2.jouy.inra.fr"

        type="cite">Hi Javier, <br>

        <br>

        Thank you for your answer. <br>

        <br>

        Actually, I would like to obtain 1 file per query protein,

        aligned to the orthologous proteins of all other species (and not 1

        sequence to 1 sequence). <br>

        <br>

        ex: <br>

        for protein ENSBTAP00000032594, the file containing: <br>

        ENSBTAP00000032594/1-397

        MDALRASAAKPPTGRKMKARAPPPPGKPATPNLHSGQRSPRRASPGPPQNQLSR <br>

        ENSP00000265136/1-1261  

        MDAPRASAAKPPTGRKMKARAPPPPGKAATLHVHSDQKPPHDGALGSQQNLVRMK <br>

        ENSSPECIE2... <br>

        ENSSPECIE3... <br>

                                 *** ***********************.**

        ::**.*:.*: .: *. ** : <br>

        <br>

        Also, I would like to have 1 file per query protein:

        if a gene has several proteins, each of them should get

        its own file with the alignment as above. <br>

        <br>
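        As an illustration of the layout above, a small Python sketch that formats an alignment as "ID/1-N  SEQUENCE" lines and writes one file per query protein. The function names and the .aln extension are assumptions made for the example. N is assumed to be the ungapped length of each sequence, which is how the 1-397 range above is read here. The conservation line is not generated.<br>
        <br>
<pre>
```python
from pathlib import Path


def format_alignment(rows, gap="-"):
    """Format (member_id, aligned_seq) rows as 'ID/1-N  SEQUENCE' lines.

    N is the number of non-gap characters in the aligned sequence
    (an assumption about what the range in the label means).
    """
    labels = []
    for member_id, seq in rows:
        ungapped_len = len(seq) - seq.count(gap)
        labels.append("{}/1-{}".format(member_id, ungapped_len))
    width = max(len(label) for label in labels)
    # Pad labels so that all sequences start in the same column.
    return "\n".join(
        "{}  {}".format(label.ljust(width), seq)
        for label, (_, seq) in zip(labels, rows)
    )


def write_alignment_file(out_dir, query_id, rows):
    """Write one alignment file named after the query protein."""
    path = Path(out_dir) / (query_id + ".aln")  # hypothetical naming scheme
    path.write_text(format_alignment(rows) + "\n")
    return path
```
</pre>
        The rows are expected with the query protein first, so it ends up as the first sequence in the file.<br>
        <br>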

        Do you know if it is feasible to obtain such an output with

        Ensembl compara? <br>

        <br>

        In that case, could you please modify the script to obtain it? <br>

        <br>

        Thank you very much in advance. <br>

        <br>

        Best regards, <br>

        <br>

        Sabrina. <br>

        <br>

        <br>

        <br>

        <br>

        <br>

        <br>

        Javier Herrero <a moz-do-not-send="true"

          class="moz-txt-link-rfc2396E" href="mailto:jherrero@ebi.ac.uk">&lt;jherrero@ebi.ac.uk&gt;</a>

        wrote: <br>

        <br>

        <blockquote type="cite">Dear Sabrina <br>

          <br>

          I have only modified the script slightly. Essentially, I have

          removed some bits that were not required and cleaned up the

          code a little. I have also added the possibility of specifying

          the query and the target species in the command line. Last, I

          have also changed the script to output the alignments into

          separate files. <br>

          <br>

          Your strategy using the ENSEMBLGENE was correct. Indeed, you

          get two proteins aligned. I believe this is what you want,

          isn't it? <br>

          <br>

          I have added a few comments. Let me know if there is something

          that is not clear. <br>

          <br>

          Javier <br>

          <br>

          On 22/10/12 15:58, srodriguez wrote: <br>

          <blockquote type="cite">Dear all, <br>

            <br>

            I would like to use the compara EnsEMBL API to get the aligned

            protein sequences of a query animal with homologous protein

            sequences from other species. <br>

            <br>

            The script would take as input the query species name (and,

            if possible, the hit species names). The script would get the

            proteins of the query organism, then the homologous protein

            sequences, and then write 1 file per query protein

            sequence containing the alignment of the query (placed as

            the first sequence) followed by the aligned protein

            sequences of the other species. <br>

            <br>

            I was thinking about using a "homology adaptor" with

            ENSEMBLPEP, so I started a script that way, but I do not

            obtain any results with ENSEMBLPEP and the results with

            ENSEMBLGENE are 2 sequences per alignment (see script

            attached). <br>

            <br>

            I also tried with "families", but sometimes I do not get

            the protein sequence for my query species in the sequence

            alignment, even though I searched using my taxon id

            (script no. 2 attached). <br>

            <br>

            Would you have a script that already performs my goal? <br>

            <br>

            If not, could you please help me reach my goal? <br>

            <br>

            Thank you very much in advance. <br>

            <br>

            Best regards, <br>

            <br>

            Sabrina. <br>

            <br>

            <br>

            ******************************************* <br>

            Sabrina Rodriguez <br>

            Bioinformatics <br>

            Département de Génétique animale <br>

            Unité GABI <br>

            Domaine de Vilvert <br>

            78532 Jouy en josas <br>

            <br>

            +33 (0) 1 34 65 29 53 <br>

            <br>

            <br>

            _______________________________________________ <br>

            Dev mailing list    <a moz-do-not-send="true"

              class="moz-txt-link-abbreviated"

              href="mailto:Dev@ensembl.org">Dev@ensembl.org</a> <br>

            Posting guidelines and subscribe/unsubscribe info: <a

              moz-do-not-send="true" class="moz-txt-link-freetext"

              href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a>

            <br>

            Ensembl Blog: <a moz-do-not-send="true"

              class="moz-txt-link-freetext"

              href="http://www.ensembl.info/">http://www.ensembl.info/</a>

            <br>

          </blockquote>

          <br>

          -- <br>

          Javier Herrero, PhD <br>

          Ensembl Coordinator and Ensembl Compara Project Leader <br>

          European Bioinformatics Institute (EMBL-EBI) <br>

          Wellcome Trust Genome Campus, Hinxton <br>

          Cambridge - CB10 1SD - UK <br>

          <br>

          <br>

        </blockquote>

        <br>

        <br>

        <br>

        <br>

        ******************************************* <br>

        Sabrina Rodriguez <br>

        Bioinformatics <br>

        Département de Génétique animale <br>

        Unité GABI <br>

        Domaine de Vilvert <br>

        78532 Jouy en josas <br>

        <br>

        +33 (0) 1 34 65 29 53<br>

        <br>


      </blockquote>

      <br>

      <pre class="moz-signature" cols="72">-- 

Javier Herrero, PhD

Ensembl Coordinator and Ensembl Compara Project Leader

European Bioinformatics Institute (EMBL-EBI)

Wellcome Trust Genome Campus, Hinxton

Cambridge - CB10 1SD - UK</pre>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Javier Herrero, PhD

Ensembl Coordinator and Ensembl Compara Project Leader

European Bioinformatics Institute (EMBL-EBI)

Wellcome Trust Genome Campus, Hinxton

Cambridge - CB10 1SD - UK</pre>

  </body>

</html>