<div dir="ltr"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div><span style="font-size:12.8px">Hi there,</span><div style="font-size:12.8px"><br></div><div style="font-size:12.8px"><span style="color:rgb(51,51,51);font-family:'Luxi Sans',Helvetica,Arial,Geneva,sans-serif;font-size:12.8px">There appears to be a caching bug in the Variation API when annotating with RefSeq transcripts.</span><br></div><div style="font-size:12.8px"><br></div><div style="font-size:12.8px">For a single variant A "1:g.121116121T>C", the online VEP (84) result is:</div><div style="font-size:12.8px"><a href="http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=jIjMm1R03KeDvYW4-1799042" target="_blank">http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=jIjMm1R03KeDvYW4-1799042</a><br></div><div style="font-size:12.8px">Observe that both rows with Feature type Transcript contain the Symbol <span style="color:rgb(51,51,51);font-family:'Luxi Sans',Helvetica,Arial,Geneva,sans-serif;font-size:12.8px">SRGAP2C.</span></div><div style="font-size:12.8px"><span style="color:rgb(51,51,51);font-family:'Luxi Sans',Helvetica,Arial,Geneva,sans-serif;font-size:12.8px"><br></span></div><div style="font-size:12.8px">When variant A is analysed together with another variant B "1:g.120935661T>C", the result is:</div><div style="font-size:12.8px"><a href="http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=dNaMvffRoZ6scMMp-1799049;field1=Location;operator1=is;value1=1:121116121-121116121" target="_blank">http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=dNaMvffRoZ6scMMp-1799049;field1=Location;operator1=is;value1=1:121116121-121116121<br></a></div><div style="font-size:12.8px">Observe that the Symbol is missing from the second row with Feature type Transcript<span style="color:rgb(51,51,51);font-family:'Luxi Sans',Helvetica,Arial,Geneva,sans-serif;font-size:12.8px">.</span></div><div 
style="font-size:12.8px"><br></div><div style="font-size:12.8px"><font color="#333333" face="Luxi Sans, Helvetica, Arial, Geneva, sans-serif">In the subroutine fetch_transcripts of Bio::EnsEMBL::Variation::Utils::VEP (<a href="https://github.com/Ensembl/ensembl-variation/blob/release/84/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm#L3738" target="_blank">https://github.com/Ensembl/ensembl-variation/blob/release/84/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm#L3738</a>):</font></div><div style="font-size:12.8px"><font color="#333333" face="Luxi Sans, Helvetica, Arial, Geneva, sans-serif"><br></font></div><div style="font-size:12.8px"><font color="#333333" face="monospace, monospace"><span style="font-size:12.8px"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">my %seen_trs;<br>...<br>foreach my $chr(...) {<br>    foreach my $region(...) {<br>        ...<br>        my %refseq_stuff = ();<br>        if(defined($tmp_cache->{$chr})) {<br>            TRANSCRIPT: while(my $tr = shift @{$tmp_cache->{$chr}}) {<br>                ...<br>                if($seen_trs{$dbID}) {<br>                    $count_duplicates++;<br>                    next;<br>                }<br>                ...<br>                if(defined($config->{refseq}) || defined($config->{merged})) {<br>                    # put data to $refseq_stuff<br>                }<br>                $seen_trs{$dbID} = 1;<br>                ...<br>            }<br>        }<br>        ...<br>    }<br>}</blockquote></span></font></div><div style="font-size:12.8px"><br></div><div style="font-size:12.8px">The scope of variable <font face="monospace, monospace">%seen_trs</font> covers all regions in all chromosomes, while the scope of variable <font face="monospace, monospace">%refseq_stuff</font> is a single region only.</div><div style="font-size:12.8px">In the second analysis above, the transcript for the region 
of variant B is loaded into the cache and marked as seen in <font face="monospace, monospace">%seen_trs</font>. When processing reaches the region of variant A, the same transcript is skipped because it is already in <span style="font-family:monospace,monospace">%seen_trs</span>, but the <font face="monospace, monospace">%refseq_stuff</font> variable is actually empty for this new region, which would explain the missing Symbol.</div><div style="font-size:12.8px"><br clear="all"><div><div data-smartmail="gmail_signature"><div dir="ltr"><div dir="ltr">Regards,<br>Wallace Ko</div></div></div></div></div></div></div></div></div></div>
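<div style="font-size:12.8px"><br></div><div style="font-size:12.8px">P.S. The scope mismatch described above can be reproduced in isolation with a minimal sketch. Everything in it except the names <font face="monospace, monospace">%seen_trs</font> and <font face="monospace, monospace">%refseq_stuff</font> is an assumption made for illustration: the <font face="monospace, monospace">%cache</font> layout, the region labels, the subroutine <font face="monospace, monospace">collect_symbols</font> and its flag are hypothetical and are not part of VEP. Resetting <font face="monospace, monospace">%seen_trs</font> per region is only one possible way to align the two scopes, and may not be the right fix for the real code.</div>

```perl
use strict;
use warnings;

# Hypothetical stand-in for the transcript cache (NOT the real VEP structure):
# the same transcript (dbID 1) overlaps two regions on chromosome 1, and each
# cached copy carries its gene symbol.
my %cache = (
    '1' => {
        region_B => [ { dbID => 1, symbol => 'SRGAP2C' } ],
        region_A => [ { dbID => 1, symbol => 'SRGAP2C' } ],
    },
);

# Hypothetical helper: returns, per region, the symbols collected for it.
# With a false flag, %seen_trs outlives the region loop as in the excerpt;
# with a true flag it is cleared for every region (illustrative fix only).
sub collect_symbols {
    my ($reset_seen_per_region) = @_;
    my %seen_trs;
    my %symbols_by_region;

    foreach my $chr (sort keys %cache) {
        # Region of variant B is processed first, then region of variant A.
        foreach my $region ('region_B', 'region_A') {
            %seen_trs = () if $reset_seen_per_region;
            my %refseq_stuff = ();    # scoped to this region only

            foreach my $tr (@{ $cache{$chr}{$region} }) {
                my $dbID = $tr->{dbID};
                next if $seen_trs{$dbID};    # duplicate: skipped entirely
                $refseq_stuff{$dbID} = $tr->{symbol};
                $seen_trs{$dbID} = 1;
            }

            $symbols_by_region{$region} = {%refseq_stuff};
        }
    }
    return \%symbols_by_region;
}

for my $reset (0, 1) {
    my $result = collect_symbols($reset);
    my $symbol = $result->{region_A}{1};
    printf "reset per region = %d, symbol in region_A: %s\n",
        $reset, defined $symbol ? $symbol : '(missing)';
}
```

<div style="font-size:12.8px">With the flag off, the transcript is skipped in the second region and its symbol never reaches that region's <font face="monospace, monospace">%refseq_stuff</font>; with the flag on, both regions get the symbol.</div>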
</div>