<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi Marc,<div><br><div><div>On 27 Aug 2013, at 13:53, Marc Hoeppner &lt;<a href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
  
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Thibaut,<br>
      <br>
      thanks for the feedback. Answers to your comments in line:<br>
      <br>
      <br>
    </div>
    <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
      Hi Marc,
      <div><br>
        <div>
          <div>On 26 Aug 2013, at 10:59, Marc Hoeppner &lt;<a moz-do-not-send="true" href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>&gt;
            wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western">Hi
                EnsEMBL team, <br>
                <br>
                been playing with the pipeline again, but am having
                problems (again). Please see below for details - am
                happy about any suggestions. <br>
                <br>
                Cheers, <br>
                <br>
                Marc <br>
                <br>
                ######## <br>
                1) Pmatch <br>
                ######## <br>
                <br>
                I set up a pmatch analysis as per the documentation and
                it runs fine on my test dataset (small chicken
                chromosome) when I try it with test_RunnableDB. However,
                when I run the pipeline, I get this: <br>
                <br>
                TARGET  0.064u 0.008s 0+0k 0pf 0sw <br>
                BUILD   0.116u 0.040s 0+0k 0pf 0sw <br>
                SEARCH  22.949u 0.172s 0+0k 0pf 0sw <br>
                WARN: For multiple species use species attribute in
                DBAdaptor->new() <br>
                WRITING: Lost the will to live Error <br>
                Job 1198 failed: [ <br>
                -------------------- EXCEPTION -------------------- <br>
                MSG: Problems for Pmatch writing output for
                chromosome:vchicken_test:10:1:19911089:1 [Can't call
                method "version" on an undefined value at
                /opt/bioinformatics/ensembl-70/ensembl/modules/Bio/EnsEMBL/DBSQL/MetaContainer.pm

                line 218. <br>
                ] <br>
                STACK Bio::EnsEMBL::Pipeline::Job::run_module
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Job.pm:720<br>
                STACK (eval)
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:219<br>
                STACK main::run_jobs_with_lsfcopy
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:218<br>
                STACK toplevel
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:128<br>
                Date (localtime)    = Fri Aug 23 14:53:27 2013 <br>
                Ensembl API version = 70 <br>
                <br>
              </div>
            </div>
          </blockquote>
          We would need to see how your coord_system and meta tables are
          populated.</div>
        <div>The API complains that it can't find the version of your
          assembly. Your coord_system table should look like this one:</div>
        <div>
          <div>+-----------------+------------+------------+---------+------+--------------------------------+</div>
          <div>| coord_system_id | species_id | name       | version | rank | attrib                         |</div>
          <div>+-----------------+------------+------------+---------+------+--------------------------------+</div>
          <div>|               1 |          1 | contig     | NULL    |    3 | default_version,sequence_level |</div>
          <div>|               2 |          1 | scaffold   | oryCun2 |    2 | default_version                |</div>
          <div>|               3 |          1 | chromosome | oryCun2 |    1 | default_version                |</div>
          <div>+-----------------+------------+------------+---------+------+--------------------------------+</div>
        </div>
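        <div>A quick sanity check of a coord_system table can be sketched in Python as below. This is only an illustration and not part of the Ensembl API: the function name check_ranks and the (name, rank) tuples are made up for the example, and the rule that ranks should run from 1 to N without gaps is an assumption based on the example table above.</div>
<pre>
```python
# Hypothetical helper, not part of the Ensembl API.
# Assumption: coord_system ranks should form the series 1..N with no gaps.

def check_ranks(rows):
    """Return a list of problem descriptions for (name, rank) pairs."""
    problems = []
    ranks = sorted(rank for _name, rank in rows)
    expected = list(range(1, len(ranks) + 1))
    if ranks != expected:
        problems.append(
            "ranks %s do not match the expected series %s" % (ranks, expected)
        )
    return problems


# Three-level layout like the example table: no problems reported.
example = [("contig", 3), ("scaffold", 2), ("chromosome", 1)]
print(check_ranks(example))

# Hypothetical two-level layout that skips rank 2: one problem reported.
two_level = [("chromosome", 1), ("contig", 3)]
print(check_ranks(two_level))
```
</pre>
        <div>The rows would come from something like "SELECT name, rank FROM coord_system" in your own database; the check itself only looks at the rank column.</div>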
      </div>
    </blockquote>
    This is my coord_system table:<br>
    <br>
+-----------------+------------+------------+---------------+------+--------------------------------+<br>
    | coord_system_id | species_id | name       | version       | rank |
    attrib                         |<br>
+-----------------+------------+------------+---------------+------+--------------------------------+<br>
    |               1 |          1 | chromosome | vchicken_test |    1 |
    default_version                |<br>
    |               2 |          1 | contig     | vchicken_test |    3 |
    default_version,sequence_level |<br>
+-----------------+------------+------------+---------------+------+--------------------------------+<br>
    <br>
    <br>
    I don't have a supercontig layer, since I am faking contigs from
    assembled sequences for testing purposes. I think I discussed
    that on this mailing list as well and was told that the API
    code should be able to deal with a contig-chromosome setup. Anything
    suspicious here?<br></div></blockquote>I'm not sure this is the problem, but you should change the rank of your contig coordinate system to 2. The API might be looking for the coordinate system with rank 2 and failing to find it.</div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000">
    <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
      <div>
        <div>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western">
                ########## <br>
                2) Unigene <br>
                ########## <br>
                <br>
                This one really bothers me <span class="moz-smiley-s3" title=";)">;)</span> I think everything is set up
                correctly (downloaded the unigene file, header seems to
                comply with the reference formatting in Blast.pm etc.),
                but I cannot for the life of me get it to work.
                Specifically, I am trying to use ncbi blast and the
                command just looks off - seems like it tries to do a mix
                of Wublast and Ncbi blast (works fine with Uniprot
                though - so perhaps something with the BlastGenscanDna
                module?). <br>
                <br>
                Running job 1791 <br>
                Module is BlastGenscanDNA <br>
                Input id is contig:vchicken_test:10_68:1:50000:1 <br>
                Analysis is unigene <br>
                Files are
                /data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unigene.55.retry2.out
/data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unige$

                <br>
                <br>
                -------------------- WARNING ---------------------- <br>
                MSG: Error running Blast cmd &lt;/usr/bin/blastall -d
                /data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa
                -i /tmp/seq.22305.24863.fa -cpus=1 2&gt;&amp;1 &gt;
                /tmp/unigene.fa.22305.5651.blast.out&gt;. Returned error
                256 BLAST EXIT: '1', SIGNA$ <br>
                FILE: Analysis/Runnable/Blast.pm LINE: 380 <br>
                CALLED BY: EnsEMBL/Analysis/Runnable.pm  LINE: 729 <br>
                Date (localtime)    = Fri Aug 23 14:54:47 2013 <br>
                Ensembl API version = 70 <br>
              </div>
            </div>
          </blockquote>
          Have you tried to run the command by itself to see if it
          works? The error message you have seems to be from the ncbi
          blast program.</div>
        <div>As the module dies, the temporary file containing your
          chicken sequence should still exist. If not, you will need to
          comment out a line in the run method of
          ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Runnable.pm:</div>
        <div><br>
        </div>
        <div>  #$self->delete_files;</div>
        <div><br>
        </div>
        <div>You probably need to change your parameters in the analysis
          table of your reference database. We use WU blast at the
          moment.</div>
        <div><br>
        </div>
        <div>Also, the parameters for blast should be "-cpus 1 -hitdist
          40" instead of "<span style="font-family: -moz-fixed;
            font-size: 12px; ">-cpus => 1, -hitdist => 40"</span></div>
        <div><br>
        </div>
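        <div>The parameter rewrite above can be sketched in Python as follows. This is a minimal illustration only: the function name normalize_blast_parameters is hypothetical, and it assumes the old-style string is a comma-separated list of "flag => value" pairs, as in the example.</div>
<pre>
```python
# Hypothetical helper for illustration; not part of ensembl-analysis.
# Assumption: old-style parameters are comma-separated "flag => value" pairs.

def normalize_blast_parameters(raw):
    """Turn '-cpus => 1, -hitdist => 40' into '-cpus 1 -hitdist 40'."""
    parts = []
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # skip empty entries such as a trailing comma
        if "=>" in pair:
            flag, value = pair.split("=>", 1)
            parts.append(flag.strip() + " " + value.strip())
        else:
            parts.append(pair)  # already in plain "flag value" form
    return " ".join(parts)


print(normalize_blast_parameters("-cpus => 1, -hitdist => 40"))
```
</pre>
        <div>A string that is already in the plain form, such as "-cpus 1 -hitdist 40", passes through unchanged.</div>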
        <div>Regards</div>
        <div>Thibaut</div>
        <div><br>
        </div>
      </div>
    </blockquote>
    I think the problem is that the blastall string is malformed. It
    should be<br>
    <br>
    blastall -i input.fasta -d database -p blastn <br>
    <br>
    So it failed to determine which blast program to use. Interestingly,
    it works fine for protein-protein blast, but fails in this
    protein-dna configuration. Hence my question whether this may be a
    problem in the BlastGenscanDna module. I can try wublast also, but I
    think I had serious trouble getting that to work. Are you guys
    calling your executables wublastp, wublastn etc? Because the only
    thing I could find was blastp, blastn etc. I assume this would still
    work if I specify these binary names in the configs..? Gave up at
    some point because it kept whining about something, so went the ncbi
    route..<br></div></blockquote>Maybe blast looks at the database to decide which type of search to run by default.<br>For the moment, add '-p blastn' to your parameters and it should work.</div><div><br></div><div>If your blastn executable is equivalent to 'blastall -p blastn', you can instead set your program to blastn, and then you don't need to add '-p blastn' to your parameters.<br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000">
    <br>
    Oh and thanks for pointing out the parameter issue, I actually took
    those from the documentation, sooo... ;) But will update my scripts.
    <br></div></blockquote>Unfortunately, the available documentation is a bit old, and updating it takes a lot of time.</div><div><br></div><div>Regards</div><div>Thibaut</div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000">
    <br>
    All the best,<br>
    <br>
    Marc<br>
    <br>
    <br>
    <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
      <div>
        <div>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western"> <br>
                And here the config for the unigene search: <br>
                <br>
                [unigene] <br>
                db=unigene <br>
                db_file=/data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa

                <br>
                program=blastall <br>
                program_file=blastall <br>
                parameters=-cpus => 1, -hitdist => 40 <br>
                module=BlastGenscanDNA <br>
                input_id_type=CONTIG <br>
                <br>
                (Blast.pm is configured to use 'ncbi' as default type,
                so unigene should inherit that, no?)<br>
                <br>
              </div>
            </div>
          </blockquote>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western"> <br>
                <div class="moz-txt-sig"><span class="moz-txt-tag">-- <br>
                  </span>Marc P. Hoeppner, PhD <br>
                  Department of Medical Biochemistry and Microbiology <br>
                  Uppsala University, Sweden <br>
                  <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:marc.hoeppner@imbim.uu.se">marc.hoeppner@imbim.uu.se</a>
                  <br>
                  <br>
                </div>
              </div>
            </div>
            _______________________________________________<br>
            Dev mailing list    <a moz-do-not-send="true" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
            Posting guidelines and subscribe/unsubscribe info: <a moz-do-not-send="true" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
            Ensembl Blog: <a moz-do-not-send="true" href="http://www.ensembl.info/">http://www.ensembl.info/</a><br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </div>

</blockquote></div><br></div></body></html>