<div dir="ltr">But if they are for different individuals, you could just change the header information in each file and use vcf-merge to deal with these issues.<div><br></div><div>IMHO, your approach might result in unforeseen errors.</div>
<div><br></div><div>Best regards</div><div><br></div><div>Duarte</div></div><div class="gmail_extra"><br clear="all"><div><font style="background-color:rgb(255,255,255)" color="#999999">=========================<br> Duarte Miguel Paulo Molha <br>
</font><div><font style="background-color:rgb(255,255,255)" color="#999999"> <a href="http://about.me/duarte" target="_blank">http://about.me/duarte</a> <br>=========================</font></div></div>
<br><br><div class="gmail_quote">On Thu, Jun 6, 2013 at 11:26 AM, Guillermo Marco Puche <span dir="ltr">&lt;<a href="mailto:guillermo.marco@sistemasgenomicos.com" target="_blank">guillermo.marco@sistemasgenomicos.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000066">
<div>Hello Duarte,<br>
<br>
      I know it's standard VCF. You're totally right, but my workmates
      are using their own modified VCF,<br>
      so I have to deal with it.<br>
      <br>
      Anyway, thanks to your code, Duarte, I can access the VCF line
      straight from the object data, so I don't need tabix.<br>
<br>
Best regards,<br>
Guillermo.<div><div class="h5"><br>
<br>
On 06/06/2013 12:19 PM, Duarte Molha wrote:<br>
</div></div></div><div><div class="h5">
<blockquote type="cite">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Dear
Guillermo<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">How
did you create the VCF file that produced the 2 lines that
gave you the problem?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Correct
 me if I am wrong, DEVS, but I believe the file you indicated
 does not follow the VCF specification. Were these 2
 individuals you were trying to merge? If so, the 2 variations
 should be merged into 1 line like so:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal">#CHROM POS ID REF ALT
QUAL FILTER INFO FORMAT Sample1 Sample2<br>
chr11 123502514 . G A,C 1000 .
GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0 0/2:0:2:0:2:0<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Use something like vcf-merge (from VCFtools)
 to correctly merge VCF files.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">If this was 1 individual then it would have
become<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">#CHROM POS ID REF ALT
QUAL FILTER INFO FORMAT Sample1<u></u><u></u></p>
<p class="MsoNormal">chr11 123502514 . G A,C
1000 . GT:GK:VS:GF:PA:F3 1/2:0:2:0:2:0<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Just concatenating and sorting 2 VCF files
 will not be correct.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">To make sure your file is valid, you
 could use vcf-validator (also in VCFtools).<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Best regards<u></u><u></u></p>
<p class="MsoNormal"><br>
Duarte<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext" lang="EN-US">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext" lang="EN-US"> <a href="mailto:dev-bounces@ensembl.org" target="_blank">dev-bounces@ensembl.org</a>
[<a href="mailto:dev-bounces@ensembl.org" target="_blank">mailto:dev-bounces@ensembl.org</a>] <b>On Behalf Of </b>Guillermo
Marco Puche<br>
<b>Sent:</b> 06 June 2013 10:50<br>
<b>To:</b> <a href="mailto:dev@ensembl.org" target="_blank">dev@ensembl.org</a><br>
<b>Subject:</b> Re: [ensembl-dev] Input information
plugin for a variation on same chromosome and position<u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Thanks to Duarte's code I've fixed my
plugin to parse VCF input.<br>
<br>
 It's a shame all this information isn't in a developer guide for
 VEP. At least some information about all the available
 objects would be nice.<br>
<br>
Regards,<br>
Guillermo.<br>
<br>
<br>
On 06/05/2013 06:20 PM, Guillermo Marco Puche wrote:<u></u><u></u></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal">Hello,<br>
<br>
 I've noticed that the plugin I'm using to parse my VCF gives a
 wrong result in one case.<br>
 <br>
 It happens when there are 2 variants on the same chromosome at
 the same position.<br>
<br>
Here's the example:<br>
<br>
Input VCF:<br>
<br>
#CHROM POS ID REF ALT QUAL FILTER
INFO FORMAT DATA<br>
chr11 123502514 . G A 1000 .
GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0<br>
chr11 123502514 . G C 1000 .
GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0<br>
<br>
<br>
 The output is correct in that it reports all variants, but in the
 column for VAR ALLELE (which is parsed by my plugin) the "C"
 allele is reported in both cases. This is wrong: the "A" and "C"
 alleles should each be reported.<br>
 <br>
 I suppose this happens because my plugin uses tabix to access the
 VCF input and parse information from it, and tabix queries the
 VCF input file by chromosome, start and end position. Since the
 chromosome and position are the same for both changes, the parsed
 output is incorrect.<br>
 <br>
 I had to do it this way because the VCF input used by my
 workmates is a bit weird.<br>
<br>
 The code for my vcf_input.pm parser is located here: <a href="https://github.com/guillermomarco/vep_plugins_71/blob/master/vcf_input.pm" target="_blank">https://github.com/guillermomarco/vep_plugins_71/blob/master/vcf_input.pm</a><br>
<br>
Important lines for this problem are from line 102 to 126
(tabix related).<br>
<br>
 If accessing the VCF file with tabix makes it impossible to
 distinguish between two variations at the same position, is
 there any other way I can access the VCF line for the current
 variation consequence without having to parse the whole
 input file?<br>
 <br>
 I know Duarte Molha had a script to get the VCF input
 information for the input line of the consequence being
 calculated.<br>
<br>
This must be the important line to access the input VCF line
object: <u></u><u></u></p>
<pre><b>my $line = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};</b><u></u><u></u></pre>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
 This is part of Duarte's code:<u></u><u></u></p>
<pre>sub run {<u></u><u></u></pre>
<pre> my $self = shift;<u></u><u></u></pre>
<pre> my $vf = shift;<u></u><u></u></pre>
<pre> my $line_hash = shift;<u></u><u></u></pre>
<pre> my $config = $self->{config};<u></u><u></u></pre>
<pre> my $ind_cols = $config->{ind_cols};<u></u><u></u></pre>
<pre> my $line = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};<u></u><u></u></pre>
<pre> my $individual = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{individual};<u></u><u></u></pre>
<pre> my @split_line = split /[\s\t]+/, $line;<u></u><u></u></pre>
<pre> my $qual_score = $split_line[5];<u></u><u></u></pre>
<pre> my @gt_format = split /:/, $split_line[8];<u></u><u></u></pre>
<pre> my @gt_data = split /:/, $split_line[$ind_cols->{$individual}];<u></u><u></u></pre>
<pre> my $results = {map { shift @gt_format => $_ } @gt_data};<u></u><u></u></pre>
<pre> $results->{"quality_score"} = $qual_score;<u></u><u></u></pre>
<pre><u></u> <u></u></pre>
<pre> return $results;<u></u><u></u></pre>
<pre>}<u></u><u></u></pre>
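<p class="MsoNormal">For illustration only, the same idea can be tried outside VEP with a small standalone sketch. It assumes a standard tab-separated VCF line with all eight fixed columns (INFO is set to "." here, which is an assumption, since the example lines above do not show it), and <b>parse_vcf_line</b> is a hypothetical helper name, not part of the VEP API. Because each record is parsed from its own line, two records at the same position keep their own ALT allele:</p>
<pre>use strict;
use warnings;

# Hypothetical helper (not part of VEP): turn one VCF data line into a
# hash of FORMAT key to value for the sample in column $sample_col.
sub parse_vcf_line {
    my ($line, $sample_col) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    my @keys   = split /:/, $fields[8];            # FORMAT column
    my @values = split /:/, $fields[$sample_col];  # per-sample data
    my %result;
    @result{@keys} = @values;                      # hash slice pairs keys with values
    $result{alt}           = $fields[4];           # ALT allele of this record
    $result{quality_score} = $fields[5];           # QUAL column
    return \%result;
}

# Two records at the same chromosome and position (INFO "." is assumed).
my @lines = (
    join("\t", qw(chr11 123502514 . G A 1000 . . GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0)),
    join("\t", qw(chr11 123502514 . G C 1000 . . GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0)),
);

# Print ALT, genotype and quality for each record.
for my $line (@lines) {
    my $r = parse_vcf_line($line, 9);
    print join("\t", $r->{alt}, $r->{GT}, $r->{quality_score}), "\n";
}
</pre>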
<p class="MsoNormal">Thank you.<br>
<br>
Best regards,<br>
Guillermo.<br>
<br>
<br>
<br>
<u></u><u></u></p>
<pre>_______________________________________________<u></u><u></u></pre>
<pre>Dev mailing list <a href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a><u></u><u></u></pre>
<pre>Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><u></u><u></u></pre>
<pre>Ensembl Blog: <a href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a><u></u><u></u></pre>
</blockquote>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</blockquote>
</div></div></div>
<br></blockquote></div><br></div>