<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Thibaut,<br>
      <br>
      thanks for the feedback. My answers to your comments are inline:<br>
      <br>
      <br>
    </div>
    <blockquote
      cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
      type="cite">
      Hi Marc,
      <div><br>
        <div>
          <div>On 26 Aug 2013, at 10:59, Marc Hoeppner &lt;<a
              moz-do-not-send="true" href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>&gt;
            wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western">Hi
                EnsEMBL team, <br>
                <br>
                been playing with the pipeline again, but am having
                problems (again). Please see below for details - am
                happy about any suggestions. <br>
                <br>
                Cheers, <br>
                <br>
                Marc <br>
                <br>
                ######## <br>
                1) Pmatch <br>
                ######## <br>
                <br>
                I set up a pmatch analysis as per the documentation and
                it runs fine on my test dataset (small chicken
                chromosome) when I try it with test_RunnableDB. However,
                when I run the pipeline, I get this: <br>
                <br>
                TARGET  0.064u 0.008s 0+0k 0pf 0sw <br>
                BUILD   0.116u 0.040s 0+0k 0pf 0sw <br>
                SEARCH  22.949u 0.172s 0+0k 0pf 0sw <br>
                WARN: For multiple species use species attribute in
                DBAdaptor->new() <br>
                WRITING: Lost the will to live Error <br>
                Job 1198 failed: [ <br>
                -------------------- EXCEPTION -------------------- <br>
                MSG: Problems for Pmatch writing output for
                chromosome:vchicken_test:10:1:19911089:1 [Can't call
                method "version" on an undefined value at
                /opt/bioinformatics/ensembl-70/ensembl/modules/Bio/EnsEMBL/DBSQL/MetaContainer.pm
                line 218. <br>
                ] <br>
                STACK Bio::EnsEMBL::Pipeline::Job::run_module
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Job.pm:720<br>
                STACK (eval)
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:219<br>
                STACK main::run_jobs_with_lsfcopy
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:218<br>
                STACK toplevel
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:128<br>
                Date (localtime)    = Fri Aug 23 14:53:27 2013 <br>
                Ensembl API version = 70 <br>
                <br>
              </div>
            </div>
          </blockquote>
          We would need to see how your coord_system and meta tables are
          populated.</div>
        <div>The API complains that it can't find the version of your
          assembly. Your coord_system table should look like this one:</div>
        <div>
          <pre wrap="">+-----------------+------------+------------+---------+------+--------------------------------+
| coord_system_id | species_id | name       | version | rank | attrib                         |
+-----------------+------------+------------+---------+------+--------------------------------+
|               1 |          1 | contig     | NULL    |    3 | default_version,sequence_level |
|               2 |          1 | scaffold   | oryCun2 |    2 | default_version                |
|               3 |          1 | chromosome | oryCun2 |    1 | default_version                |
+-----------------+------------+------------+---------+------+--------------------------------+</pre>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western"> <br>
              </div>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    This is my coord_system table:<br>
    <br>
    <pre wrap="">+-----------------+------------+------------+---------------+------+--------------------------------+
| coord_system_id | species_id | name       | version       | rank | attrib                         |
+-----------------+------------+------------+---------------+------+--------------------------------+
|               1 |          1 | chromosome | vchicken_test |    1 | default_version                |
|               2 |          1 | contig     | vchicken_test |    3 | default_version,sequence_level |
+-----------------+------------+------------+---------------+------+--------------------------------+</pre>
    <br>
    <br>
    I don't have a supercontig layer, since I am faking contigs from
    assembled sequences for testing purposes. I believe this was
    discussed on this mailing list before, and I was told that the API
    code should be able to handle a contig-chromosome setup. Is there
    anything suspicious here?<br>
    <blockquote
      cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
      type="cite">
      <div>
        <div>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western">
                ########## <br>
                2) Unigene <br>
                ########## <br>
                <br>
                This one really bothers me <span class="moz-smiley-s3"
                  title=";)"></span> I think everything is set up
                correctly (downloaded the unigene file, header seems to
                comply with the reference formatting in Blast.pm etc),
                but I cannot for the life of me get it to work.
                Specifically, I am trying to use NCBI blast and the
                command just looks off - it seems to be a mix
                of WU blast and NCBI blast (it works fine with Uniprot
                though - so perhaps something is wrong with the
                BlastGenscanDNA module?). <br>
                <br>
                Running job 1791 <br>
                Module is BlastGenscanDNA <br>
                Input id is contig:vchicken_test:10_68:1:50000:1 <br>
                Analysis is unigene <br>
                Files are
                /data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unigene.55.retry2.out
/data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unige$

                <br>
                <br>
                -------------------- WARNING ---------------------- <br>
                MSG: Error running Blast cmd &lt;/usr/bin/blastall -d
                /data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa
                -i /tmp/seq.22305.24863.fa -cpus=1 2&gt;&amp;1 &gt;
                /tmp/unigene.fa.22305.5651.blast.out&gt;. Returned error
                256 BLAST EXIT: '1', SIGNA$ <br>
                FILE: Analysis/Runnable/Blast.pm LINE: 380 <br>
                CALLED BY: EnsEMBL/Analysis/Runnable.pm  LINE: 729 <br>
                Date (localtime)    = Fri Aug 23 14:54:47 2013 <br>
                Ensembl API version = 70 <br>
              </div>
            </div>
          </blockquote>
          Have you tried to run the command by itself to see if it
          works? The error message you have seems to be from the ncbi
          blast program.</div>
        <div>As the module dies, the temporary file containing your
          chicken sequence should still exist. If not, you will need to
          comment out the following line in the run method of
          ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Runnable.pm:</div>
        <div><br>
        </div>
        <div>  #$self->delete_files;</div>
        <div><br>
        </div>
        <div>You probably need to change your parameters in the analysis
          table of your reference database. We use WU blast at the
          moment.</div>
        <div><br>
        </div>
        <div>Also, the parameters for blast should be "-cpus 1 -hitdist
          40" instead of "<span style="font-family: -moz-fixed;
            font-size: 12px; ">-cpus => 1, -hitdist => 40"</span></div>
        <div><br>
        </div>
        <div>Regards</div>
        <div>Thibaut</div>
        <div><br>
        </div>
      </div>
    </blockquote>
    I think the problem is that the blastall command string is malformed.
    It should look like this:<br>
    <br>
    blastall -i input.fasta -d database -p blastn <br>
    <br>
    Without -p, blastall cannot determine which blast program to use.
    Interestingly, it works fine for protein-protein blast, but fails in
    this protein-DNA configuration. Hence my question whether this may be
    a problem in the BlastGenscanDNA module. I can try WU blast as well,
    but I had serious trouble getting it to work. Do you call your
    executables wublastp, wublastn, etc.? The only binaries I could find
    were blastp, blastn, etc. I assume it would still work if I specify
    these binary names in the configs? I gave up at some point because it
    kept complaining about something, so I went the NCBI route.<br>
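    <br>
    To make concrete what I mean by a well-formed call, here is a
    minimal sketch in Python of how I would expect the NCBI command line
    to be assembled. The helper name build_blastall_cmd and the file
    names are made up for illustration and are not taken from Blast.pm;
    -p, -d, -i, -o and -a are the usual legacy blastall options. For a
    peptide query against a nucleotide database the program would
    presumably be tblastn rather than blastn.<br>
    <pre wrap="">
```python
import shlex

# Hypothetical helper, not part of the Ensembl code base: assembles an
# argument list for the legacy NCBI blastall binary.
VALID_PROGRAMS = ("blastn", "blastp", "blastx", "tblastn", "tblastx")


def build_blastall_cmd(program, database, query, output, cpus=1,
                       binary="blastall"):
    """Return the blastall argument list, refusing unknown programs."""
    if program not in VALID_PROGRAMS:
        raise ValueError("unknown blast program: %r" % program)
    return [
        binary,
        "-p", program,    # which search to run; absent from the failing call
        "-d", database,   # formatted database to search
        "-i", query,      # query FASTA file
        "-o", output,     # report file
        "-a", str(cpus),  # number of processors (blastall has no -cpus flag)
    ]


if __name__ == "__main__":
    # File names are placeholders for illustration only.
    cmd = build_blastall_cmd("tblastn", "unigene.fa", "seq.fa", "seq.blast.out")
    # Print a shell-quoted version that can be pasted into a terminal.
    print(" ".join(shlex.quote(part) for part in cmd))
```
</pre>
    Returning a list rather than one string means the pieces can be
    checked one by one before anything is run.<br>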
    <br>
    Oh, and thanks for pointing out the parameter issue. I actually took
    those from the documentation, sooo... ;) But I will update my scripts.
    <br>
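    For the config files I already have, a small helper along these
    lines would rewrite the arrow-style parameters into the plain form
    you suggest. This is my own sketch in Python, the function name is
    hypothetical, and nothing here comes from the Ensembl code:<br>
    <pre wrap="">
```python
def arrow_params_to_flags(parameters):
    """Turn '-cpus => 1, -hitdist => 40' into '-cpus 1 -hitdist 40'."""
    parts = []
    for chunk in parameters.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the '=>' arrow if present; otherwise keep the chunk as-is.
        key, arrow, value = chunk.partition("=>")
        if arrow:
            parts.append("%s %s" % (key.strip(), value.strip()))
        else:
            parts.append(chunk)
    return " ".join(parts)


if __name__ == "__main__":
    print(arrow_params_to_flags("-cpus => 1, -hitdist => 40"))
```
</pre>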
    <br>
    All the best,<br>
    <br>
    Marc<br>
    <br>
    <br>
    <blockquote
      cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
      type="cite">
      <div>
        <div>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western"> <br>
                And here the config for the unigene search: <br>
                <br>
                [unigene] <br>
                db=unigene <br>
                db_file=/data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa

                <br>
                program=blastall <br>
                program_file=blastall <br>
                parameters=-cpus => 1, -hitdist => 40 <br>
                module=BlastGenscanDNA <br>
                input_id_type=CONTIG <br>
                <br>
                (Blast.pm is configured to use 'ncbi' as default type,
                so unigene should inherit that, no?)<br>
                <br>
              </div>
            </div>
          </blockquote>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-text-flowed" style="font-family:
                -moz-fixed; font-size: 12px;" lang="x-western"> <br>
                <div class="moz-txt-sig"><span class="moz-txt-tag">-- <br>
                  </span>Marc P. Hoeppner, PhD <br>
                  Department of Medical Biochemistry and Microbiology <br>
                  Uppsala University, Sweden <br>
                  <a moz-do-not-send="true"
                    class="moz-txt-link-abbreviated"
                    href="mailto:marc.hoeppner@imbim.uu.se">marc.hoeppner@imbim.uu.se</a>
                  <br>
                  <br>
                </div>
              </div>
            </div>
            _______________________________________________<br>
            Dev mailing list    <a moz-do-not-send="true"
              href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
            Posting guidelines and subscribe/unsubscribe info: <a
              moz-do-not-send="true"
              href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
            Ensembl Blog: <a moz-do-not-send="true"
              href="http://www.ensembl.info/">http://www.ensembl.info/</a><br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>