<div dir="ltr">Thanks<div><br></div><div style>Running an annotation using 16 forks... let's see how it handles it :)</div><div style>I'll report back any issues.</div><div style><br></div><div style>Thanks for the update</div>

<div style><br></div><div style>Duarte</div><div style><br></div></div><div class="gmail_extra"><br clear="all"><div><font style="background-color:rgb(255,255,255)" color="#999999">=========================<br>     Duarte Miguel Paulo Molha      <br>

</font><div><font style="background-color:rgb(255,255,255)" color="#999999">         <a href="http://about.me/duarte" target="_blank">http://about.me/duarte</a>         <br>=========================</font></div></div>
<br><br><div class="gmail_quote">On Tue, May 14, 2013 at 10:16 AM, Will McLaren <span dir="ltr"><<a href="mailto:wm2@ebi.ac.uk" target="_blank">wm2@ebi.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr">Stuart, Guillermo, Duarte,<div><br></div><div>As I stated earlier, I'm currently working on some code to improve stability and performance under forking.</div><div><br></div><div>I've committed some code to the HEAD of our CVS tree which should help with the problems you are encountering. You'd all be welcome to test this out, with the obvious proviso that this is development code and may contain bugs!</div>


<div><br></div><div>To use this, you should download the copy of VEP.pm from:</div><div><br></div><div><a href="http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm?revision=1.92&root=ensembl" target="_blank">http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm?revision=1.92&root=ensembl</a><br>


</div><div><br></div><div>and replace the VEP.pm under ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils (or just Bio/EnsEMBL/Variation/Utils if you use INSTALL.pl) with this one.</div><div><br></div>
<div>This code will appear in production in the next proper release of Ensembl.</div><div><br></div><div>Regards</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Will</div></font></span></div><div class="HOEnZb">

<div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">
On 14 May 2013 09:55, Stuart Meacham <span dir="ltr"><<a href="mailto:sm766@cam.ac.uk" target="_blank">sm766@cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">



  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div>Hi,<br>
      <br>
      I certainly don't want to hijack this thread but it seemed daft to
      start another. I am also getting forking errors. I don't use any
      custom plugins and am using a validated VCF as input (with about
      600,000 variants). Trying to fork more than 4 threads is unstable
      even on my machine which has 64 cores and half a TB of RAM.<br>
      <br>
      I haven't found anything reproducible, however if I do I'll report
      back to the list.<br>
      <br>
      Thanks<span><font color="#888888"><br>
      <br>
      Stuart</font></span><div><div><br>
      <br>
      On 14/05/2013 09:42, Will McLaren wrote:<br>
    </div></div></div><div><div>
    <blockquote type="cite">
      <div dir="ltr">Hello,
        <div><br>
        </div>
        <div>Your aa_grantham_distance plugin is somewhat inefficient -
          it retrieves the peptide alleles from the HGVS annotation,
          which itself requires some database fetching and processing to
          produce. This is why it is slow.</div>
        <div><br>
        </div>
        <div>You can get the peptides from the transcript
          variation object:</div>
        <div><br>
        </div>
        <div>my @peps = split "/",
          $tva->transcript_variation->pep_allele_string();</div>
        <div>
          <br>
        </div>
        <div>This will give you single-letter AA codes, but you
          could either modify your hash or use BioPerl to convert:</div>
        <div><br>
        </div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">$seqobj
            = Bio::PrimarySeq->new ( -seq => $single_letter_aa); </span><br style="font-family:Arial,Helvetica,sans-serif;font-size:13px">
          <span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">$three_letter_aa
            = Bio::SeqUtils->seq3($seqobj); </span><br>
        </div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px"><br>
          </span></div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">You
            should also declare your distances hash in the new() sub and
            store it on $self; this will also marginally speed up your
            plugin.</span></div>
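        <div><br>
        </div>
        <div>The two suggestions above can be sketched roughly as follows.
          This is a standalone illustration, not the actual plugin: the
          package name and the distance_for() method are hypothetical, it
          does not load the Ensembl API, and only three Grantham pairs are
          listed, so a real plugin would need the full matrix. It assumes
          the input is a peptide allele string such as "R/H".</div>
<pre>
```perl
use strict;
use warnings;

package My::GranthamSketch;    # hypothetical name, not the real plugin

sub new {
    my ($class) = @_;
    my $self = bless {}, $class;

    # Built once per object and kept on $self, instead of being
    # redefined on every call. Only three pairs are listed here.
    my %d = ('L/I' => 5, 'R/H' => 29, 'C/W' => 215);
    for my $pair (keys %d) {
        my ($x, $y) = split m{/}, $pair;
        # Store both directions so lookups are symmetric.
        $self->{distance}{$x}{$y} = $self->{distance}{$y}{$x} = $d{$pair};
    }
    return $self;
}

# Takes a single-letter peptide allele string such as "R/H" and returns
# the distance, or undef when there is no second allele or the pair is
# not in the table.
sub distance_for {
    my ($self, $pep_allele_string) = @_;
    return unless defined $pep_allele_string;
    my ($ref, $alt) = split m{/}, $pep_allele_string;
    return unless defined $alt;
    return $self->{distance}{$ref}{$alt};
}

package main;

my $plugin   = My::GranthamSketch->new;
my $distance = $plugin->distance_for('R/H');    # 29
```
</pre>
        <div>Keying the table on single-letter codes means the string can
          be used as it comes, without the three-letter conversion.</div>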
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px"><br>
          </span></div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">Regarding
            the forking issues, we are working on improving stability
            under forking.</span></div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px"><br>
          </span></div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">Thanks
            for your patience</span></div>
        <div>
          <span style="font-family:Arial,Helvetica,sans-serif;font-size:13px"><br>
          </span></div>
        <div><span style="font-family:Arial,Helvetica,sans-serif;font-size:13px">Will</span></div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">
          On 14 May 2013 07:37, Guillermo Marco Puche <span dir="ltr"><<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000066">
              <div>Hello,<br>
                <br>
                I'm not really sure which one of those plugins is
                causing the fork error. I cannot recreate it now running
                each one of them separately.<br>
                <br>
                Here are both:<br>
                <br>
                <a href="https://github.com/guillermomarco/vep_plugins_71" target="_blank">https://github.com/guillermomarco/vep_plugins_71</a><br>
                <br>
                They also slow the consequence calculation down a
                lot. aa_grantham_distance.pm
                is just a hardcoded plugin from one of the biologists at
                my work. It was a pure copy, paste and adaptation to
                make it work as a VEP plugin. Maybe the problem is that
                the matrix is defined every time the subroutine is
                called. I'm not running out of memory or CPU. I'm
                currently using it with 2 threads and a buffer size of 500
                for a 5000-variant VCF file.<br>
                <br>
                In my honest opinion, I think one (or even both) of
                those plugins slows the calculation down so much
                that sometimes the fork just dies, like a timeout
                caused by heavy network traffic. So when
                you use them together with a lot of other plugins like
                Condel, Consequence, etc., they may be causing the
                process to hang and die.<br>
                <br>
                Best regards,<br>
                Guillermo.
                <div>
                  <div><br>
                    <br>
                    On 05/13/2013 03:55 PM, Duarte Molha wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div>
                  <blockquote type="cite">
                    <div dir="ltr">I also get this error... it is so
                      prevalent and so difficult to pinpoint what is
                      causing it that I have given up on forking my
                      annotation process.
                      <div><br>
                      </div>
                      <div>I do think it is related to the number of
                        forks. It seems to crash less often if you use a
                        low number of forks... anything above 5
                        will undoubtedly crash the script at least in my
                        experience.</div>
                      <div><br>
                      </div>
                      <div>Cheers</div>
                      <div><br>
                        Duarte</div>
                    </div>
                    <div class="gmail_extra"><br clear="all">
                      <div><font color="#999999">=========================<br>
                               Duarte Miguel Paulo Molha      <br>
                        </font>
                        <div><font color="#999999">         <a href="http://about.me/duarte" target="_blank">http://about.me/duarte</a> 
                                   <br>
                            =========================</font></div>
                      </div>
                      <br>
                      <br>
                      <div class="gmail_quote">On Mon, May 13, 2013 at
                        2:50 PM, Will McLaren <span dir="ltr"><<a href="mailto:wm2@ebi.ac.uk" target="_blank">wm2@ebi.ac.uk</a>></span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div dir="ltr">Hi Guillermo,
                            <div><br>
                            </div>
                            <div>Test each plugin individually until you
                              find the one that causes the error. It is
                              highly unlikely that a particular
                              combination of plugins is causing the
                              crash.</div>
                            <div><br>
                            </div>
                            <div>Check that there are no "print" (to
                              STDOUT or STDERR) statements in your
                              plugin - forking assumes that plugin code
                              remains silent; otherwise it will throw
                              errors like this.</div>
                            <div><br>
                            </div>
                            <div> Also, check what, if anything, is
                              cached between runs of your plugin. If you
                              are caching things (for example to avoid
                              re-querying a database), you may need to
                              write storable hooks to ensure the data is
                              getting cached between forks - see <a href="https://github.com/ensembl-variation/VEP_plugins/blob/master/ProteinSeqs.pm" target="_blank">https://github.com/ensembl-variation/VEP_plugins/blob/master/ProteinSeqs.pm</a>
                              for an example.</div>
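                            <div><br>
                            </div>
                            <div>As a rough illustration of what storable
                              hooks look like, here is a standalone sketch
                              using Perl's Storable module. The package
                              name, the "cache" and "fetcher" fields and
                              the lookup() method are hypothetical, and it
                              is only an assumption that a plugin object
                              would be serialised in this way; the linked
                              ProteinSeqs.pm remains the real example.</div>
<pre>
```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

package My::CachingPlugin;    # hypothetical name

sub new {
    my ($class) = @_;
    my $self = {
        cache   => {},                  # results worth keeping
        fetcher => sub { uc $_[0] },    # stands in for a DB handle; a
                                        # code ref cannot be stored as-is
    };
    return bless $self, $class;
}

# Return the cached value, fetching it only on first use.
sub lookup {
    my ($self, $key) = @_;
    $self->{cache}{$key} //= $self->{fetcher}->($key);
    return $self->{cache}{$key};
}

# Called by Storable when the object is frozen: hand over a short tag
# string plus a reference to the cache, and leave the fetcher out.
sub STORABLE_freeze {
    my ($self, $cloning) = @_;
    return ('v1', $self->{cache});
}

# Called by Storable on a fresh, empty object of this class when thawing.
sub STORABLE_thaw {
    my ($self, $cloning, $tag, $cache) = @_;
    $self->{cache}   = $cache;
    $self->{fetcher} = sub { uc $_[0] };    # re-create what was left out
    return;
}

package main;

my $plugin = My::CachingPlugin->new;
$plugin->lookup('abc');

# Round trip: the cache survives and the fetcher is rebuilt.
my $copy = thaw(freeze($plugin));
```
</pre>
                            <div>Without the two hooks, freezing this
                              object would fail on the code reference.</div>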
                            <div><br>
                            </div>
                            <div>If you still have no luck, send me the
                              code and an input file that recreates the
                              problem.</div>
                            <div><br>
                            </div>
                            <div>Regards</div>
                            <div><br>
                            </div>
                            <div>Will</div>
                          </div>
                          <div class="gmail_extra"> <br>
                            <br>
                            <div class="gmail_quote">
                              <div>
                                <div>On 13 May 2013 13:18, Guillermo
                                  Marco Puche <span dir="ltr"><<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>></span>
                                  wrote:<br>
                                </div>
                              </div>
                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                <div>
                                  <div>
                                    <div bgcolor="#FFFFFF" text="#000066"> Hello,<br>
                                      <br>
                                      I've recently started having
                                      problems with the VEP script while
                                      using different plugins (most of
                                      them my own).<br>
                                      <br>
                                      <pre>2013-05-13 13:59:44 - Connected to core version 71 database and variation version 71 database
2013-05-13 13:59:44 - Loaded plugin: vcf_input
2013-05-13 13:59:44 - Loaded plugin: biobase
2013-05-13 13:59:44 - Loaded plugin: aa_grantham_distance
2013-05-13 13:59:44 - Loaded plugin: flanking_sequence
2013-05-13 13:59:44 - Loaded plugin: Condel
2013-05-13 13:59:44 - Output fields redefined (37 defined)
2013-05-13 13:59:44 - Starting...
2013-05-13 13:59:45 - Read 3888 variants into buffer
2013-05-13 13:59:54 - Reading transcript data from cache and/or database
[===============================================]  [ 100% ]
2013-05-13 14:02:38 - Retrieved 6463 transcripts (0 mem, 0 cached, 13743 DB, 7280 duplicates)
2013-05-13 14:02:38 - Calculating consequences
[===================================>           ]   [ 78% ]
ERROR: Forked process failed


</pre>
                                      I'm not getting any other error
                                      message, so I cannot debug
                                      properly. I thought my plugins
                                      were OK but it seems they aren't.
                                      I think the problem occurs when I
                                      use the "aa_grantham_distance" plugin
                                      together with "flanking_sequence".
                                      I've no idea what could be causing
                                      this.<br>
                                      <br>
                                      I'm running VEP in verbose mode
                                      but I can't get any useful
                                      information. How could I debug
                                      that?<br>
                                      <br>
                                      Best regards,<br>
                                      Guillermo.<br>
                                      <br>
                                    </div>
                                    <br>
                                  </div>
                                </div>
_______________________________________________<br>
                                Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
                                Posting guidelines and
                                subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
                                Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><br>
                                <br>
                              </blockquote>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </div>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </div></div></div>

<br></blockquote></div><br></div>
</div></div>
<br></blockquote></div><br></div>