<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000066">
<div class="moz-cite-prefix">Thanks to Duarte's code I've fixed my
plugin to parse VCF input.<br>
<br>
      It's a shame this information isn't in a developer guide for VEP.
      At least some documentation of the available objects would
      be nice.<br>
<br>
Regards,<br>
Guillermo.<br>
<br>
<br>
On 06/05/2013 06:20 PM, Guillermo Marco Puche wrote:<br>
</div>
<blockquote cite="mid:51AF653F.9000503@sistemasgenomicos.com"
type="cite">
Hello,<br>
<br>
      I've noticed that the plugin I'm using to parse my VCF gives
      wrong output in one case.<br>
<br>
      It happens when there are 2 variants on the same chromosome at the
      same position:<br>
<br>
Here's the example:<br>
<br>
Input VCF:<br>
<br>
#CHROM POS ID REF ALT QUAL FILTER INFO
FORMAT DATA<br>
chr11 123502514 . G A 1000 .
GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0<br>
chr11 123502514 . G C 1000 .
GT:GK:VS:GF:PA:F3 0/1:0:2:0:2:0<br>
<br>
<br>
      The output is correct in that it reports all variants, but in the
      VAR ALLELE column (which is filled in by my plugin) the "C" allele
      is reported in both cases. This is wrong: "A" and "C" should each
      be reported.<br>
<br>
      I suppose this is because my plugin reads the VCF input with
      tabix, and tabix queries the VCF input file by chromosome, start
      and end position. Since the chromosome and position are the same
      for both variants, both lookups return the same records, so the
      parsed allele is incorrect.<br>
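      <br>
      One possible way around this, as a minimal sketch: keep the tabix
      region lookup, but pick among the returned records by comparing
      REF and ALT with the allele being annotated. The helper name
      pick_record and its arguments are hypothetical (they are not part
      of VEP or tabix), and the records are assumed to follow the
      standard VCF column order:<br>
      <br>
      <pre>use strict;
use warnings;

# Hypothetical helper (not part of VEP or tabix): given every VCF record that a
# region query returned for one position, keep the record whose REF and ALT
# match the allele currently being annotated.
sub pick_record {
    my ($records, $ref, $alt) = @_;
    for my $record (@$records) {
        my @fields = split /\s+/, $record;
        # Assumes standard VCF column order: CHROM POS ID REF ALT ...
        next unless $fields[3] eq $ref;
        # ALT may hold several comma-separated alleles
        return $record if grep { $_ eq $alt } split /,/, $fields[4];
    }
    return;    # no record matches this allele
}

# The two records from the example input, as a region query would return them
my @records = (
    "chr11\t123502514\t.\tG\tA\t1000\t.\tGT:GK:VS:GF:PA:F3\t0/1:0:2:0:2:0",
    "chr11\t123502514\t.\tG\tC\t1000\t.\tGT:GK:VS:GF:PA:F3\t0/1:0:2:0:2:0",
);
my $hit = pick_record(\@records, 'G', 'C');
print "$hit\n";
</pre>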
<br>
      I had to do it this way because the VCF input used by my
      workmates is a bit weird.<br>
<br>
The code for my vcf_input.pm parser is located here:
<a moz-do-not-send="true"
href="https://github.com/guillermomarco/vep_plugins_71/blob/master/vcf_input.pm">https://github.com/guillermomarco/vep_plugins_71/blob/master/vcf_input.pm</a><br>
<br>
      The relevant lines for this problem are 102 to 126 (tabix
      related).<br>
<br>
      If it's impossible to distinguish between two variants at the
      same position when accessing the VCF file with tabix, is there any
      other way I can access the VCF line for the current variation
      consequence without having to parse the whole input file?<br>
<br>
      I know Duarte Molha had a script to get the VCF input information
      for the input line whose consequence is being calculated.<br>
<br>
This must be the important line to access the input VCF line
object: <br>
<pre><b>my $line = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};</b></pre>
<br>
      This is part of Duarte's code:<br>
<br>
      <pre>sub run {
    my $self      = shift;
    my $vf        = shift;
    my $line_hash = shift;

    my $config   = $self->{config};
    my $ind_cols = $config->{ind_cols};

    # Raw input VCF line and individual for the variation being annotated
    my $line       = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{_line};
    my $individual = $vf->{base_variation_feature_overlap}->{base_variation_feature}->{individual};

    # Split the VCF line on whitespace; column 6 (index 5) is QUAL
    my @split_line = split /\s+/, $line;
    my $qual_score = $split_line[5];

    # FORMAT keys (index 8) and the matching values from this individual's column
    my @gt_format = split /:/, $split_line[8];
    my @gt_data   = split /:/, $split_line[$ind_cols->{$individual}];

    # Pair each FORMAT key with its value, then add the quality score
    my $results = {map { shift @gt_format => $_ } @gt_data};
    $results->{"quality_score"} = $qual_score;
    return $results;
}</pre>
Thank you.<br>
<br>
Best regards,<br>
Guillermo.<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Dev mailing list <a class="moz-txt-link-abbreviated" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a>
Posting guidelines and subscribe/unsubscribe info: <a class="moz-txt-link-freetext" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a>
Ensembl Blog: <a class="moz-txt-link-freetext" href="http://www.ensembl.info/">http://www.ensembl.info/</a>
</pre>
</blockquote>
<br>
</body>
</html>