<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hi Thibaut,<br>
<br>
thanks for the feedback. My answers to your comments are inline:<br>
<br>
<br>
</div>
<blockquote
cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
Hi Marc,
<div><br>
<div>
<div>On 26 Aug 2013, at 10:59, Marc Hoeppner &lt;<a
moz-do-not-send="true" href="mailto:mphoeppner@gmail.com">mphoeppner@gmail.com</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-text-flowed" style="font-family:
-moz-fixed; font-size: 12px;" lang="x-western">Hi
EnsEMBL team, <br>
<br>
been playing with the pipeline again, but am having
problems (again). Please see below for details - am
happy about any suggestions. <br>
<br>
Cheers, <br>
<br>
Marc <br>
<br>
######## <br>
1) Pmatch <br>
######## <br>
<br>
I set up a pmatch analysis as per the documentation and
it runs fine on my test dataset (small chicken
chromosome) when I try it with test_RunnableDB. However,
when I run the pipeline, I get this: <br>
<br>
TARGET 0.064u 0.008s 0+0k 0pf 0sw <br>
BUILD 0.116u 0.040s 0+0k 0pf 0sw <br>
SEARCH 22.949u 0.172s 0+0k 0pf 0sw <br>
WARN: For multiple species use species attribute in
DBAdaptor->new() <br>
WRITING: Lost the will to live Error <br>
Job 1198 failed: [ <br>
-------------------- EXCEPTION -------------------- <br>
MSG: Problems for Pmatch writing output for
chromosome:vchicken_test:10:1:19911089:1 [Can't call
method "version" on an undefined value at
/opt/bioinformatics/ensembl-70/ensembl/modules/Bio/EnsEMBL/DBSQL/MetaContainer.pm
line 218. <br>
] <br>
STACK Bio::EnsEMBL::Pipeline::Job::run_module
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Job.pm:720<br>
STACK (eval)
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:219<br>
STACK main::run_jobs_with_lsfcopy
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:218<br>
STACK toplevel
/opt/bioinformatics/ensembl-70/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/runner.pl:128<br>
Date (localtime) = Fri Aug 23 14:53:27 2013 <br>
Ensembl API version = 70 <br>
<br>
</div>
</div>
</blockquote>
We would need to see how your coord_system and meta tables are
populated.</div>
<div>The API complains that it can't find the version of your
assembly. Your coord_system table should look like this one:</div>
<div>
<pre>+-----------------+------------+------------+---------+------+--------------------------------+
| coord_system_id | species_id | name       | version | rank | attrib                         |
+-----------------+------------+------------+---------+------+--------------------------------+
|               1 |          1 | contig     | NULL    |    3 | default_version,sequence_level |
|               2 |          1 | scaffold   | oryCun2 |    2 | default_version                |
|               3 |          1 | chromosome | oryCun2 |    1 | default_version                |
+-----------------+------------+------------+---------+------+--------------------------------+
</pre>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-text-flowed" style="font-family:
-moz-fixed; font-size: 12px;" lang="x-western"> <br>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
This is my coord_system table:<br>
<br>
<pre>+-----------------+------------+------------+---------------+------+--------------------------------+
| coord_system_id | species_id | name       | version       | rank | attrib                         |
+-----------------+------------+------------+---------------+------+--------------------------------+
|               1 |          1 | chromosome | vchicken_test |    1 | default_version                |
|               2 |          1 | contig     | vchicken_test |    3 | default_version,sequence_level |
+-----------------+------------+------------+---------------+------+--------------------------------+
</pre>
<br>
<br>
I don't have a supercontig layer, since I am faking contigs from
assembled sequences for testing purposes. I believe this came up on
this mailing list before, and I was told that the API code should be
able to deal with a contig-chromosome setup. Does anything look
suspicious here?<br>
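<br>
A rough sanity check for the rows above could look like the Python sketch below. It is not part of the Ensembl API: the helper name check_coord_systems and its rules are assumptions made for illustration (the top-ranked coordinate system should carry a version and the default_version attribute, exactly one system should be sequence_level, and, if the assembly.default value from the meta table is passed in, it should match the top-level version). Whether these are the conditions MetaContainer.pm actually trips over has not been verified here.<br>
<br>
<pre>
# Hypothetical helper, not part of the Ensembl API.
def check_coord_systems(rows, meta_assembly_default=None):
    """Return a list of problems found in coord_system rows (empty if none)."""
    if not rows:
        return ["coord_system table is empty"]

    def attribs(row):
        # attrib is a comma-separated SET column; it may also be NULL/None
        return set(filter(None, (row.get("attrib") or "").split(",")))

    problems = []
    # Rank 1 is the highest-level coordinate system (chromosome here).
    top = min(rows, key=lambda row: row["rank"])
    if not top.get("version"):
        problems.append("top-level '%s' has no version" % top["name"])
    if "default_version" not in attribs(top):
        problems.append("top-level '%s' is not default_version" % top["name"])

    sequence_level = [row["name"] for row in rows if "sequence_level" in attribs(row)]
    if len(sequence_level) != 1:
        problems.append("expected one sequence_level system, found %d" % len(sequence_level))

    # Assumption: meta 'assembly.default' should equal the top-level version.
    if meta_assembly_default is not None and meta_assembly_default != top.get("version"):
        problems.append("meta assembly.default '%s' does not match '%s'"
                        % (meta_assembly_default, top.get("version")))
    return problems


# The two rows from the table above.
rows = [
    {"name": "chromosome", "version": "vchicken_test", "rank": 1,
     "attrib": "default_version"},
    {"name": "contig", "version": "vchicken_test", "rank": 3,
     "attrib": "default_version,sequence_level"},
]
print(check_coord_systems(rows))
</pre>
<br>
For the two rows above this prints an empty list, so under these assumptions the coord_system table itself looks consistent and the meta table would be the next thing to compare.<br>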
<blockquote
cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
type="cite">
<div>
<div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-text-flowed" style="font-family:
-moz-fixed; font-size: 12px;" lang="x-western">
########## <br>
2) Unigene <br>
########## <br>
<br>
This one really bothers me <span class="moz-smiley-s3"
title=";)"></span> I think everything is set up
correctly (downloaded the unigene file, header seems to
comply with the reference formatting in Blast.pm etc),
but I cannot for the life of me get it to work.
Specifically, I am trying to use ncbi blast and the
command just looks off - seems like it tries to do a mix
of Wublast and Ncbi blast (works fine with Uniprot
though - so perhaps something with the BlastGenscanDNA
module?). <br>
<br>
Running job 1791 <br>
Module is BlastGenscanDNA <br>
Input id is contig:vchicken_test:10_68:1:50000:1 <br>
Analysis is unigene <br>
Files are
/data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unigene.55.retry2.out
/data2/projects/annotation/EnsEMBL/chicken/output//unigene/0/contig:vchicken_test:10_114:1:50000:1.unige$
<br>
<br>
-------------------- WARNING ---------------------- <br>
MSG: Error running Blast cmd &lt;/usr/bin/blastall -d
/data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa
-i /tmp/seq.22305.24863.fa -cpus=1 2&gt;&amp;1 &gt;
/tmp/unigene.fa.22305.5651.blast.out&gt;. Returned error
256 BLAST EXIT: '1', SIGNA$ <br>
FILE: Analysis/Runnable/Blast.pm LINE: 380 <br>
CALLED BY: EnsEMBL/Analysis/Runnable.pm LINE: 729 <br>
Date (localtime) = Fri Aug 23 14:54:47 2013 <br>
Ensembl API version = 70 <br>
</div>
</div>
</blockquote>
Have you tried to run the command by itself to see if it
works? The error message you have seems to be from the ncbi
blast program.</div>
<div>As the module dies, the temporary file containing your
chicken sequence should still exist. If not, you will need to
comment out a line in the run method of
ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Runnable.pm:</div>
<div><br>
</div>
<div> #$self->delete_files;</div>
<div><br>
</div>
<div>You probably need to change your parameters in the analysis
table of your reference database. We use WU blast at the
moment.</div>
<div><br>
</div>
<div>Also, the parameters for blast should be "-cpus 1 -hitdist
40" instead of "<span style="font-family: -moz-fixed;
font-size: 12px; ">-cpus => 1, -hitdist => 40"</span></div>
<div><br>
</div>
<div>Regards</div>
<div>Thibaut</div>
<div><br>
</div>
</div>
</blockquote>
I think the problem is that the blastall command string is malformed.
It should look like<br>
<br>
blastall -i input.fasta -d database -p blastn <br>
<br>
The generated command has no -p option, so blastall cannot determine
which blast program to use. Interestingly, it works fine for
protein-protein blast, but fails in this protein-DNA configuration.
Hence my question whether this may be a problem in the BlastGenscanDNA
module. I can also try wublast, but I remember having serious trouble
getting that to work. Are you guys calling your executables wublastp,
wublastn etc? The only binaries I could find were named blastp, blastn
etc. I assume this would still work if I specify these binary names in
the configs..? I gave up at some point because it kept whining about
something, so I went the ncbi route..<br>
<br>
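To make the point about the missing program option concrete, here is a small Python sketch of how a wrapper could assemble the blastall call and refuse to build one without a valid -p value. This is only an illustration, not how Blast.pm builds its command; build_blastall_command and BLAST_PROGRAMS are made-up names, and only the -i, -d and -p options from the example above are used.<br>
<br>
<pre>
import shlex

# The five classic blastall program names.
BLAST_PROGRAMS = ("blastn", "blastp", "blastx", "tblastn", "tblastx")


def build_blastall_command(program, database, query, executable="blastall"):
    """Return the blastall argument list; refuse to build one without a valid -p."""
    if program not in BLAST_PROGRAMS:
        raise ValueError("unknown or missing blast program: %r" % (program,))
    return [executable, "-i", query, "-d", database, "-p", program]


command = build_blastall_command("blastn", "database", "input.fasta")
print(shlex.join(command))  # blastall -i input.fasta -d database -p blastn
</pre>
<br>
For peptides searched against a nucleotide database such as UniGene, the program would presumably be tblastn rather than blastn.<br>
<br>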
Oh and thanks for pointing out the parameter issue, I actually took
those from the documentation, sooo... ;) But will update my scripts.
<br>
<br>
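The parameter fix is mechanical enough to script. A minimal Python sketch, assuming the config values only ever contain comma-separated "-flag =&gt; value" pairs (normalise_parameters is a made-up helper name, not something from the Ensembl code base):<br>
<br>
<pre>
def normalise_parameters(raw):
    """Turn '-cpus => 1, -hitdist => 40' into '-cpus 1 -hitdist 40'."""
    parts = []
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        if "=>" in pair:
            # Split only on the first arrow and drop surrounding whitespace.
            flag, value = (piece.strip() for piece in pair.split("=>", 1))
            parts.extend(piece for piece in (flag, value) if piece)
        else:
            # Already in plain 'flag value' form; just collapse whitespace.
            parts.extend(pair.split())
    return " ".join(parts)


print(normalise_parameters("-cpus => 1, -hitdist => 40"))  # -cpus 1 -hitdist 40
</pre>
<br>
A string that is already in the plain form passes through unchanged, so it should be safe to run over every parameters line in the config.<br>
<br>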
All the best,<br>
<br>
Marc<br>
<br>
<br>
<blockquote
cite="mid:8721613D-12DB-4AA8-8B36-141F116838B6@sanger.ac.uk"
type="cite">
<div>
<div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-text-flowed" style="font-family:
-moz-fixed; font-size: 12px;" lang="x-western"> <br>
And here the config for the unigene search: <br>
<br>
[unigene] <br>
db=unigene <br>
db_file=/data2/projects/annotation/EnsEMBL/chicken/refseqs/unigene.fa
<br>
program=blastall <br>
program_file=blastall <br>
parameters=-cpus => 1, -hitdist => 40 <br>
module=BlastGenscanDNA <br>
input_id_type=CONTIG <br>
<br>
(Blast.pm is configured to use 'ncbi' as default type,
so unigene should inherit that, no?)<br>
<br>
</div>
</div>
</blockquote>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-text-flowed" style="font-family:
-moz-fixed; font-size: 12px;" lang="x-western"> <br>
<div class="moz-txt-sig"><span class="moz-txt-tag">-- <br>
</span>Marc P. Hoeppner, PhD <br>
Department of Medical Biochemistry and Microbiology <br>
Uppsala University, Sweden <br>
<a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:marc.hoeppner@imbim.uu.se">marc.hoeppner@imbim.uu.se</a>
<br>
<br>
</div>
</div>
</div>
_______________________________________________<br>
Dev mailing list <a moz-do-not-send="true"
href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
Posting guidelines and subscribe/unsubscribe info: <a
moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
Ensembl Blog: <a moz-do-not-send="true"
href="http://www.ensembl.info/">http://www.ensembl.info/</a><br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>