<html><head><meta http-equiv="Content-Type" content="text/html charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi Marc,<div><br><div><div>On 28 Aug 2013, at 14:35, Marc Hoeppner <<a href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
  
    <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Thibaut,<br>
      <br>
      thanks for the help, been able to fix my problems. <br>
      <br>
      Turns out that yes, the coordinate system rank seems to matter -
      but I think the problem was that Pmatch was configured to write to
      the GENEWISE_DB, which existed and had tables and everything, but
      wasn't loaded with any data (such as a coordinate system). <br></div></div></blockquote>Whenever you write to a database using the Genebuild pipeline, you need to check that the following tables are in sync with your reference database:</div><div>analysis</div><div>assembly</div><div>assembly_exception</div><div>coord_system</div><div>seq_region</div><div>seq_region_attrib</div><div>meta</div><div>attrib_type</div><div>seq_region_synonym</div><div><div><br></div></div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000"><div class="moz-cite-prefix">
      <br>
      For the unigene blast, adding the blastall program option manually
      into the parameter string solved that problem, but I still wonder
      if there is a bug in the code (it should do this automatically).<br></div></div></blockquote>As it is a blastall option and can be set in the analysis configuration, I don't see it as a bug.</div><div><br></div><div>Regards</div><div>Thibaut</div><div> <br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000"><div class="moz-cite-prefix">
      <br>
      Anyway, thanks again!<br>
      <br>
      /Marc<br>
    </div>
    <blockquote cite="mid:30269147-DC7E-41DE-8828-14B070152902@sanger.ac.uk" type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=ISO-8859-1">
      Hi Marc,
      <div><br>
        <div>
          <div>On 27 Aug 2013, at 13:53, Marc Hoeppner <<a moz-do-not-send="true" href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>>
            wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
            <div bgcolor="#FFFFFF" text="#000000">
              <div class="moz-cite-prefix">Hi Thibaut,<br>
                <br>
                thanks for the feedback. Answers to your comments in
                line:<br>
                <br>
                <br>
              </div>
              <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
                <meta http-equiv="Content-Type" content="text/html;
                  charset=ISO-8859-1">
                Hi Marc,
                <div><br>
                  <div>
                    <div>On 26 Aug 2013, at 10:59, Marc Hoeppner <<a moz-do-not-send="true" href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>>

                      wrote:</div>
                    <br class="Apple-interchange-newline">
                    <blockquote type="cite">
                      <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div class="moz-text-flowed" style="font-family:
                          -moz-fixed; font-size: 12px;" lang="x-western">Hi

                          EnsEMBL team, <br>
                          <br>
                          been playing with the pipeline again, but am
                          having problems (again). Please see below for
                          details - am happy about any suggestions. <br>
                          <br>
                          Cheers, <br>
                          <br>
                          Marc <br>
                          <br>
                          ######## <br>
                          1) Pmatch <br>
                          ######## <br>
                          <br>
                          I set up a pmatch analysis as per the
                          documentation and it runs fine on my test
                          dataset (small chicken chromosome) when I try
                          it with test_RunnableDB. However, when I run
                          the pipeline, I get this: <br>
                          <br>
                          TARGET  0.064u 0.008s 0+0k 0pf 0sw <br>
                          BUILD   0.116u 0.040s 0+0k 0pf 0sw <br>
                          SEARCH  22.949u 0.172s 0+0k 0pf 0sw <br>
                          WARN: For multiple species use species
                          attribute in DBAdaptor->new() <br>
                          WRITING: Lost the will to live Error <br>
                          Job 1198 failed: [ <br>
                          -------------------- EXCEPTION
                          -------------------- <br>
                          MSG: Problems for Pmatch writing output for
                          chromosome:vchicken_test:10:1:19911089:1
                          [Can't call method "version" on an undefined
                          value at
                          /opt/bioinformatics/ensembl-70/ensembl/modules/Bio/EnsEMBL/DBSQL/MetaContainer.pm


                          line 218. <br>
                          ] <br>
                          STACK Bio::EnsEMBL::Pipeline::Job::run_module
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Job.pm:720<br>
                          STACK (eval)
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:219<br>
                          STACK main::run_jobs_with_lsfcopy
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:218<br>
                          STACK toplevel
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:128<br>
                          Date (localtime)    = Fri Aug 23 14:53:27 2013
                          <br>
                          Ensembl API version = 70 <br>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                    We would need to see how your coord_system and meta
                    tables are populated.</div>
                  <div>The API complains that it can't find the version
                    of your assembly. Your coord_system table should
                    look like this one:</div>
                  <div>
                    <pre>+-----------------+------------+------------+---------+------+--------------------------------+
| coord_system_id | species_id | name       | version | rank | attrib                         |
+-----------------+------------+------------+---------+------+--------------------------------+
|               1 |          1 | contig     | NULL    |    3 | default_version,sequence_level |
|               2 |          1 | scaffold   | oryCun2 |    2 | default_version                |
|               3 |          1 | chromosome | oryCun2 |    1 | default_version                |
+-----------------+------------+------------+---------+------+--------------------------------+</pre>
                    <blockquote type="cite">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div class="moz-text-flowed" style="font-family:
                          -moz-fixed; font-size: 12px;" lang="x-western">
                          <br>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                </div>
              </blockquote>
              This is my coord_system table:<br>
              <br>
              <pre>+-----------------+------------+------------+---------------+------+--------------------------------+
| coord_system_id | species_id | name       | version       | rank | attrib                         |
+-----------------+------------+------------+---------------+------+--------------------------------+
|               1 |          1 | chromosome | vchicken_test |    1 | default_version                |
|               2 |          1 | contig     | vchicken_test |    3 | default_version,sequence_level |
+-----------------+------------+------------+---------------+------+--------------------------------+</pre>
              <br>
              <br>
              I don't have a supercontig layer, since I am faking
              contigs from assembled sequences for testing purposes. I
              think I discussed that on this mailing list as well
              and was told that the API code should be able to deal with
              a contig-chromosome setup. Anything suspicious here?<br>
            </div>
          </blockquote>
          I'm not sure that this is the problem, but you should change
          the rank of your contig coordinate system to 2. The API might
          be looking for a coordinate system with rank 2 and failing to
          find it at the moment.</div>
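        <div>A minimal sketch of the rank check suggested above, written in Python against an in-memory SQLite stand-in rather than the real MySQL core database. The helper names (build_demo_db, missing_ranks, set_rank) are hypothetical and not part of the Ensembl API; the rows are the two coord_system entries quoted in this thread. It only reports skipped ranks, it does not prove that a skipped rank is what breaks the pipeline.</div>
<pre>
```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the coord_system table (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE coord_system ('
        'coord_system_id INTEGER PRIMARY KEY, species_id INTEGER, '
        'name TEXT, version TEXT, "rank" INTEGER, attrib TEXT)'
    )
    conn.executemany(
        "INSERT INTO coord_system VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "chromosome", "vchicken_test", 1, "default_version"),
            (2, 1, "contig", "vchicken_test", 3, "default_version,sequence_level"),
        ],
    )
    return conn


def missing_ranks(conn):
    """Return the ranks skipped between 1 and the highest rank in use."""
    ranks = {row[0] for row in conn.execute('SELECT "rank" FROM coord_system')}
    if not ranks:
        return []
    return sorted(set(range(1, max(ranks) + 1)) - ranks)


def set_rank(conn, name, new_rank):
    """Assign a new rank to the coordinate system with the given name."""
    conn.execute('UPDATE coord_system SET "rank" = ? WHERE name = ?', (new_rank, name))
```
</pre>
        <div>With the demo rows, missing_ranks reports rank 2 as skipped; after calling set_rank(conn, "contig", 2) it reports nothing.</div>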
        <div><br>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000">
              <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
                <div>
                  <div>
                    <blockquote type="cite">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div class="moz-text-flowed" style="font-family:
                          -moz-fixed; font-size: 12px;" lang="x-western">
                          ########## <br>
                          2) Unigene <br>
                          ########## <br>
                          <br>
                          This one really bothers me <span class="moz-smiley-s3" title=";)"><span>;)</span></span> I
                          think everything is set up correctly
                          (downloaded the unigene file, header seems to
                          comply with the reference formatting in
                          Blast.pm etc), but I cannot for the life of me
                          get it to work. Specifically, I am trying to
                          use ncbi blast and the command just looks off
                          - seems like it tries to do a mix of Wublast
                          and Ncbi blast (works fine with Uniprot though
                          - so perhaps something with the
                          BlastGenscanDna module?). <br>
                          <br>
                          Running job 1791 <br>
                          Module is BlastGenscanDNA <br>
                          Input id is
                          contig:vchicken_test:10_68:1:50000:1 <br>
                          Analysis is unigene <br>
                          Files are
                          /data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unigene.55.retry2.out
/data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unige$


                          <br>
                          <br>
                          -------------------- WARNING
                          ---------------------- <br>
                          MSG: Error running Blast cmd
                          </usr/bin/blastall -d
                          /data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa
                          -i /tmp/seq.22305.24863.fa -cpus=1 2>&1
                          > /tmp/unigene.fa.22305.5651.blast.out>.
                          Returned error 256 BLAST EXIT: '1', SIGNA$ <br>
                          FILE: Analysis/Runnable/Blast.pm LINE: 380 <br>
                          CALLED BY: EnsEMBL/Analysis/Runnable.pm  LINE:
                          729 <br>
                          Date (localtime)    = Fri Aug 23 14:54:47 2013
                          <br>
                          Ensembl API version = 70 <br>
                        </div>
                      </div>
                    </blockquote>
                    Have you tried to run the command by itself to see
                    if it works? The error message you have seems to be
                    from the ncbi blast program.</div>
                  <div>As the module dies, the temporary file containing
                    your chicken sequence should still exist. If not,
                    you will need to comment out this line in the run method of
ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Runnable.pm:</div>
                  <div><br>
                  </div>
                  <div>  #$self->delete_files;</div>
                  <div><br>
                  </div>
                  <div>You probably need to change your parameters in
                    the analysis table of your reference database. We
                    use WU blast at the moment.</div>
                  <div><br>
                  </div>
                  <div>Also, the parameters for blast should be "-cpus 1
                    -hitdist 40" instead of "<span style="font-family:
                      -moz-fixed; font-size: 12px; ">-cpus => 1,
                      -hitdist => 40"</span></div>
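                  <div>To illustrate the difference between the two parameter styles, here is a small Python sketch that rewrites the Perl-hash-style string into the plain command-line form. The function name perl_style_to_cli is hypothetical; nothing like it is claimed to exist in the pipeline, and it only handles simple comma-separated "key => value" pairs.</div>
<pre>
```python
def perl_style_to_cli(parameters):
    """Rewrite '-cpus => 1, -hitdist => 40' as '-cpus 1 -hitdist 40'."""
    tokens = []
    for pair in parameters.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split each pair on the Perl fat comma; strings that are already in
        # command-line form contain no fat comma and pass through unchanged.
        key, sep, value = pair.partition("=>")
        tokens.append(key.strip())
        if sep:
            tokens.append(value.strip())
    return " ".join(tokens)
```
</pre>
                  <div>For example, perl_style_to_cli("-cpus => 1, -hitdist => 40") gives "-cpus 1 -hitdist 40", which is the form to store in the parameters column.</div>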
                  <div><br>
                  </div>
                  <div>Regards</div>
                  <div>Thibaut</div>
                  <div><br>
                  </div>
                </div>
              </blockquote>
              I think the problem is that the blastall string is
              malformed. It should be:<br>
              <br>
              blastall -i input.fasta -d database -p blastn <br>
              <br>
              So it failed to determine which blast program to use.
              Interestingly, it works fine for protein-protein blast,
              but fails in this protein-dna configuration. Hence my
              question whether this may be a problem in the
              BlastGenscanDna module. I can try wublast also, but I
              think I had serious trouble getting that to work. Are you
              guys calling your executables wublastp, wublastn etc?
              Because the only thing I could find was blastp, blastn
              etc. I assume this would still work if I specify these
              binary names in the configs..? Gave up at some point
              because it kept whining about something, so went the ncbi
              route..<br>
            </div>
          </blockquote>
          Maybe blast looks at the database to decide which type of search
          to run by default.<br>
          For the moment you need to add '-p blastn' to the parameters
          and it should work.</div>
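        <div>As a rough illustration of the fix, the Python sketch below assembles a blastall command line with an explicit -p option. It is not how Blast.pm builds its command; blastall_command is a hypothetical helper, and the file names in the example are placeholders. Only the -p, -d and -i options already mentioned in this thread are used.</div>
<pre>
```python
import shlex


def blastall_command(program, database, query, extra_parameters=""):
    """Build an argument list for NCBI blastall with an explicit -p option."""
    argv = ["blastall", "-p", program, "-d", database, "-i", query]
    # Extra options are given as one string, e.g. the analysis parameters.
    argv.extend(shlex.split(extra_parameters))
    return argv
```
</pre>
        <div>Calling blastall_command("blastn", "unigene.fa", "input.fasta") yields the same shape as the command Marc quotes, with the search program stated explicitly instead of being left for blastall to guess.</div>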
        <div><br>
        </div>
        <div>If your blastn binary is equivalent to blastall -p blastn, then
          you can change your program to blastn and you don't need to add
          '-p blastn' to your parameters.<br>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000"> <br>
              Oh and thanks for pointing out the parameter issue, I
              actually took those from the documentation, sooo... ;) But
              will update my scripts. <br>
            </div>
          </blockquote>
          Unfortunately the available documentation is a bit old and
          updating it takes a lot of time.</div>
        <div><br>
        </div>
        <div>Regards</div>
        <div>Thibaut</div>
        <div><br>
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000"> <br>
              All the best,<br>
              <br>
              Marc<br>
              <br>
              <br>
              <blockquote cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk" type="cite">
                <div>
                  <div>
                    <blockquote type="cite">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div class="moz-text-flowed" style="font-family:
                          -moz-fixed; font-size: 12px;" lang="x-western">
                          <br>
                          And here the config for the unigene search: <br>
                          <br>
                          [unigene] <br>
                          db=unigene <br>
                          db_file=/data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa


                          <br>
                          program=blastall <br>
                          program_file=blastall <br>
                          parameters=-cpus => 1, -hitdist => 40 <br>
                          module=BlastGenscanDNA <br>
                          input_id_type=CONTIG <br>
                          <br>
                          (Blast.pm is configured to use 'ncbi' as
                          default type, so unigene should inherit that,
                          no?)<br>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                    <blockquote type="cite">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div class="moz-text-flowed" style="font-family:
                          -moz-fixed; font-size: 12px;" lang="x-western">
                          <br>
                          <div class="moz-txt-sig"><span class="moz-txt-tag">-- <br>
                            </span>Marc P. Hoeppner, PhD <br>
                            Department of Medical Biochemistry and
                            Microbiology <br>
                            Uppsala University, Sweden <br>
                            <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:marc.hoeppner@imbim.uu.se">marc.hoeppner@imbim.uu.se</a>
                            <br>
                            <br>
                          </div>
                        </div>
                      </div>
                      _______________________________________________<br>
                      Dev mailing list    <a moz-do-not-send="true" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
                      Posting guidelines and subscribe/unsubscribe info:
                      <a moz-do-not-send="true" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
                      Ensembl Blog: <a moz-do-not-send="true" href="http://www.ensembl.info/">http://www.ensembl.info/</a><br>
                    </blockquote>
                  </div>
                  <br>
                </div>
                <br>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </div>

</blockquote></div><br></div></body></html>