<div dir="ltr">You get one line of output for each variant/feature overlap, so you will almost always see more output lines than input if you use the default output format. If you use VCF output, you only get one line per variant.<div>

<br></div><div>You can check how many unique variants there are in the output with e.g.:</div><div><br></div><div>grep -v '^#' variant_effect_output.txt | cut -f 1 | sort -u | wc -l<br></div><div><br></div><div style>assuming your variants have unique names.</div>
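<div><br></div><div>To compare that against the input, here is a rough sketch as two shell functions. The function names are made up for illustration, and it assumes the default tab-delimited output with the variant name in column 1 and header lines starting with "#":</div>

```shell
# Count the variant (non-header) lines in a VCF input file.
count_input_variants() {
    grep -v '^#' "$1" | wc -l | tr -d ' '
}

# Count the distinct variant names (column 1) in default-format VEP output.
count_unique_output_variants() {
    grep -v '^#' "$1" | cut -f 1 | sort -u | wc -l | tr -d ' '
}

# Example use (file names taken from this thread):
#   count_input_variants input.vcf
#   count_unique_output_variants variant_effect_output.txt
```

<div>If the two numbers match, no variants were lost, even though the output has many more lines than the input.</div>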
<div><br></div><div style>Try dropping "html" from your config and see if that makes any difference; as the newest feature there, it has a higher chance of causing problems!</div><div style><br></div><div style>
Will</div><div><br></div><div><br></div>
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On 21 May 2013 16:02, Guillermo Marco Puche <span dir="ltr"><<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000066">
    <div>Hello Will,<br>
      <br>
      I'm getting more than 3000 lines of output, which seems really
      weird.<br>
      <br>
      <pre>wc -l variant_effect_output.txt</pre>
      <b>3936</b><br>
      <br>
      Here's the way I'm proceeding:<br>
      <br>
      <pre>./variant_effect_predictor.pl -i /home/likewise-open/SGNET/gmarco/vep_71_annotation_check/input.vcf -force -fork 4 --database --config vep_71.test</pre>

      <br>
      Here's the content of vep_71.test:<br>
      <br>
      dir                /home/likewise-open/SGNET/gmarco/.vep<br>
      toplevel_dir       /home/likewise-open/SGNET/gmarco/.vep<br>
      force_overwrite    1<br>
      format             vcf<br>
      html               1<br>
      host               192.19.x.xx<br>
      port               3306<br>
      user               myuser<br>
      password           mypassword<br>
      buffer_size        5000<div class="im"><br>
      hgvs               1<br>
      canonical          1<br>
      ccds               1<br>
      check_svs          1<br>
      domains            1<br>
      gmaf               1<br>
      hgnc               1<br>
      maf_1kg            1<br>
      numbers            1<br>
      polyphen           b<br>
      regulatory         1<br>
      sift               b<br>
      <br></div>
      Best regards,<br>
      Guillermo.<div><div class="h5"><br>
      <br>
      On 05/21/2013 02:30 PM, Will McLaren wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">Hi Guillermo,
        <div><br>
          I'm unable to recreate this, sorry!</div>
        <div><br>
        </div>
        <div>I get 406 going in, 406 coming out every time,
          whichever combination of those options above I use, and
          whether I use VCF or standard output.</div>
        <div><br>
        </div>
        <div>Here's my run (minus -check_sv):</div>
        <div><br>
        </div>
        <div>
          <div>> perl variant_effect_predictor.pl
            -i guill.vcf -vcf -cache -force -fork 4 -hgvs -canon -ccds
            -domains -gmaf -hgnc -maf_1kg -numbers -poly b -regu -sift b
            -fasta
            ~/NFS/Fasta/Homo_sapiens.GRCh37.69.dna.primary_assembly.fa</div>
          <div>2013-05-21 13:24:26 - Checking/creating FASTA index</div>
          <div>2013-05-21 13:24:26 - Read existing cache info</div>
          <div>2013-05-21 13:24:26 - Starting...</div>
          <div>2013-05-21 13:24:26 - Detected format of input file as
            vcf</div>
          <div>2013-05-21 13:24:26 - Read 406 variants into buffer</div>
          <div>2013-05-21 13:24:26 - Reading transcript data from cache
            and/or database</div>
          <div>[================================================================]
             [ 100% ]</div>
          <div>2013-05-21 13:24:30 - Retrieved 10891 transcripts (0 mem,
            10919 cached, 0 DB, 28 duplicates)</div>
          <div>2013-05-21 13:24:30 - Reading regulatory data from cache
            and/or database</div>
          <div>[================================================================]
             [ 100% ]</div>
          <div>2013-05-21 13:24:35 - Retrieved 36955 regulatory features
            (0 mem, 36955 cached, 0 DB, 0 duplicates)</div>
          <div>2013-05-21 13:24:35 - Calculating consequences</div>
          <div>[================================================================]
             [ 100% ]</div>
          <div>2013-05-21 13:24:56 - Writing output2013-05-21 13:24:56 -
            Processed 406 total variants (14 vars/sec, 14 vars/sec
            total)</div>
          <div>2013-05-21 13:24:56 - Wrote stats summary to
            variant_effect_output.txt_summary.html</div>
          <div>2013-05-21 13:24:56 - Finished!</div>
          <div>> wc -l variant_effect_output.txt</div>
          <div>408</div>
          <div><br>
          </div>
          <div>It's 408 as it's adding two header lines to the
            VCF output.</div>
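          <div><br>
          </div>
          <div>A small sketch for separating header lines from data lines when comparing counts; the function name is made up for illustration, and it assumes every header line starts with "#":</div>

```shell
# Print how many lines of a file are headers (start with '#')
# and how many are data lines.
line_breakdown() {
    awk '/^#/ { h++; next } { d++ } END { printf "header=%d data=%d\n", h, d }' "$1"
}

# Example use:
#   line_breakdown variant_effect_output.txt
```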
          <div>
            <br>
          </div>
          <div>Which 16 are missing from your output, and is it
            the same 16 each time?</div>
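          <div><br>
          </div>
          <div>One way to list the missing variants, as a sketch only: it assumes VCF output (-vcf), so that input and output share the CHROM, POS, REF and ALT columns, and the function names are made up for illustration:</div>

```shell
# Print one sorted, de-duplicated CHROM/POS/REF/ALT key per variant line.
vcf_keys() {
    grep -v '^#' "$1" | cut -f 1,2,4,5 | LC_ALL=C sort -u
}

# List the keys present in the input VCF ($1) but absent from the output VCF ($2).
missing_variants() {
    in_keys=$(mktemp) || return 1
    out_keys=$(mktemp) || return 1
    vcf_keys "$1" > "$in_keys"
    vcf_keys "$2" > "$out_keys"
    # comm -23 keeps only the lines unique to the first file
    LC_ALL=C comm -23 "$in_keys" "$out_keys"
    rm -f "$in_keys" "$out_keys"
}

# Example use:
#   missing_variants input.vcf variant_effect_output.txt > missing_run1.txt
```

          <div>Saving the list from each run and comparing the files with diff would show whether the same variants go missing every time.</div>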
          <div><br>
          </div>
          <div>Try writing to a different output file, or on a
            different disk if you can (perhaps disk space is an issue?)</div>
          <div><br>
          </div>
          <div>Will</div>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On 21 May 2013 13:15, Guillermo Marco
          Puche <span dir="ltr"><<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000066">
              <div>Hello Will,<br>
                <br>
                Here's the input: <a href="https://github.com/guillermomarco/vep_plugins_71/blob/master/missing_variants/missing_output_variants.vcf" target="_blank">https://github.com/guillermomarco/vep_plugins_71/blob/master/missing_variants/missing_output_variants.vcf</a><br>

                <br>
                As you said, it's not about the options or plugins.
                Launching VEP without specifying any options still
                returns output with missing variants.<br>
                <br>
                Regards,<br>
                Guillermo.
                <div>
                  <div><br>
                    <br>
                    <br>
                    On 05/21/2013 01:49 PM, Will McLaren wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div>
                  <blockquote type="cite">
                    <div dir="ltr">Hi Guillermo,
                      <div><br>
                      </div>
                      <div>None of those options should filter out
                        variants.</div>
                      <div><br>
                      </div>
                      <div>Are you able to provide any of the files that
                        recreate the problem?</div>
                      <div> <br>
                      </div>
                      <div>Is there any chance that you are using VCF
                        input and it contains non-variant lines - this
                        would be where the ALT column is empty or "."?
                        If so, this may be your problem. To force these
                        to be included in the output, you should add
                        --allow_non_variant.</div>
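                      <div><br>
                      </div>
                      <div>A quick sketch for checking whether the input has any such lines; the function name is made up for illustration, and it assumes a tab-delimited VCF with ALT in column 5:</div>

```shell
# Count VCF data lines whose ALT column (column 5) is empty or ".".
count_non_variant_lines() {
    awk -F '\t' '!/^#/ && ($5 == "." || $5 == "") { n++ } END { print n + 0 }' "$1"
}

# Example use:
#   count_non_variant_lines input.vcf
```

                      <div>A result of 0 would rule this explanation out.</div>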
                      <div><br>
                      </div>
                      <div>Regards</div>
                      <div><br>
                      </div>
                      <div>Will</div>
                    </div>
                    <div class="gmail_extra"><br>
                      <br>
                      <div class="gmail_quote">On 21 May 2013 09:40,
                        Guillermo Marco Puche <span dir="ltr"><<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>></span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div bgcolor="#FFFFFF" text="#000066"> Hello,<br>
                            <br>
                            I've been checking VEP results, and I've
                            noticed that some input variants are missing
                            from the output.<br>
                            <br>
                            I think this may be due to some of the
                            options I'm using to launch VEP:<br>
                            <br>
                            <small><small>hgvs               1<br>
                                canonical          1<br>
                                ccds               1<br>
                                check_svs          1<br>
                                domains            1<br>
                                gmaf               1<br>
                                hgnc               1<br>
                                maf_1kg            1<br>
                                numbers            1<br>
                                polyphen           b<br>
                                regulatory         1<br>
                                sift               b</small></small><br>
                            <br>
                            Should any of these options be filtering the
                            output? I've disabled all plugins for
                            this test to be sure that it's not a plugin
                            issue.<br>
                            <br>
                            <ul>
                              <li>With a 406-variant input VCF file,
                                16 variants were missing from the output. <br>
                              </li>
                              <li>I then ran VEP with only those 16
                                missing variants, and 3 were missing from the output.
                                <br>
                              </li>
                              <li>I ran it again with only those 3
                                variants, and this time none were
                                missing.</li>
                            </ul>
                            <p>I would like to know what's behind that
                              weird behaviour.<br>
                            </p>
                            <p>Thank you.<br>
                            </p>
                            <p>Best regards,<br>
                              Guillermo.<br>
                            </p>
                            <br>
                            <br>
                          </div>
                          <br>
_______________________________________________<br>
                          Dev mailing list    <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><br>
                          Posting guidelines and subscribe/unsubscribe
                          info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
                          Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info</a>
                        </blockquote>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </div>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
    <br>
  </div></div></div>

<br></blockquote></div><br></div>