<div dir="ltr">Just one more question.<div><br></div><div>I can do what you suggest by querying the Ensembl database remotely. However, we also have it installed locally, and since my queries will be extensive, I would much prefer to run them locally as well.</div><div><br></div><div>Where and how do I download the VCFs and install them on my own server so that this can also be done locally?</div><div><br></div><div>Many thanks</div><div>Duarte</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 17 Jan 2019 at 11:28, Laurent Gil &lt;<a href="mailto:lgil@ebi.ac.uk">lgil@ebi.ac.uk</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Dear Duarte,<br>
<br>
      The 1000 Genomes Phase 3 data are stored in a VCF file rather than
      in a database (they were too big to store in our databases), which
      is why you didn't see them in your results.<br>
      However, you can access them with the Ensembl Variation API. To do
      so, you need to add the following line to your script to force the
      API to look in the Ensembl Variation VCF files:<br>
</p>
<pre>$variation_adaptor->db->use_vcf(1);</pre>
<p><br>
      Here is a suggested version of your script with the change:<br>
</p>
<pre class="gmail-m_-121412223426997746code gmail-m_-121412223426997746sh_perl gmail-m_-121412223426997746sh_sourceCode"><span class="gmail-m_-121412223426997746sh_keyword">my</span> <span class="gmail-m_-121412223426997746sh_variable">$variation_adaptor</span> <span class="gmail-m_-121412223426997746sh_symbol">=</span> <span class="gmail-m_-121412223426997746sh_variable">$registry</span><span class="gmail-m_-121412223426997746sh_symbol">-></span><span class="gmail-m_-121412223426997746sh_function">get_adaptor</span><span class="gmail-m_-121412223426997746sh_symbol">(</span><span class="gmail-m_-121412223426997746sh_string">"human"</span><span class="gmail-m_-121412223426997746sh_symbol">,</span> <span class="gmail-m_-121412223426997746sh_string">"variation"</span><span class="gmail-m_-121412223426997746sh_symbol">,</span> <span class="gmail-m_-121412223426997746sh_string">"variation"</span><span class="gmail-m_-121412223426997746sh_symbol">);
</span>
<span class="gmail-m_-121412223426997746sh_symbol">$variation_adaptor->db->use_vcf(1);</span></pre>
<pre>my $variation = $variation_adaptor->fetch_by_name($id);</pre>
<div>
<pre>foreach my $vf (@{$variation->get_all_VariationFeatures()}) {</pre>
<pre> ...</pre>
<pre>}</pre>
</div>
    <p>Note that I also replaced the VariationFeatureAdaptor call
      "$vf_adaptor->fetch_all_by_Variation($var)" with
      "$variation->get_all_VariationFeatures()" to avoid
      instantiating an extra adaptor.</p>
<p>There are some further descriptions in our Ensembl Variation API
tutorial: <a href="https://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#alleles" target="_blank">https://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#alleles</a></p>
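    <p>For reference, the pieces above could be combined into a minimal
      end-to-end sketch along the following lines. It is an illustration
      rather than a tested script: it assumes a connection to the public
      server at ensembldb.ensembl.org as the anonymous user, that the
      extra dependencies the API needs for reading VCF files are
      installed, and it reads the alleles from the Variation object (as
      in the alleles section of the tutorial) instead of from each
      VariationFeature. The rsID is the one from the original
      question.</p>

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl MySQL server; replace the host and user
# with your own connection details if you query a local installation.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $variation_adaptor =
    $registry->get_adaptor('human', 'variation', 'variation');

# Ask the API to also read the VCF-backed data (e.g. 1000 Genomes Phase 3)
$variation_adaptor->db->use_vcf(1);

my $id        = 'rs141080692';
my $variation = $variation_adaptor->fetch_by_name($id)
    or die "Variation $id not found\n";

# Print one tab-separated row per allele that has a population attached
foreach my $allele (@{ $variation->get_all_Alleles() }) {
    my $population = $allele->population;
    next unless defined $population;

    my $frequency = $allele->frequency;
    $frequency = '-' unless defined $frequency;

    print join("\t",
        $variation->name, $allele->allele,
        $population->name, $frequency), "\n";
}
```

    <p>Each output row holds the variant name, the allele string, the
      population name and the frequency, so the population filter from
      the original script can be applied to the third column.</p>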
<p><br>
</p>
<p>Best regards,<br>
</p>
<pre class="gmail-m_-121412223426997746moz-signature" cols="72">Laurent
Ensembl Variation
</pre>
<div class="gmail-m_-121412223426997746moz-cite-prefix">On 17/01/2019 09:54, Duarte Molha
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr" class="gmail-m_-121412223426997746gmail_signature">
<div>Dear Developers </div>
<div><br>
</div>
                            <div>I created a simple script to provide me
                              with polymorphic frequencies in the
                              different populations in the database.
                              However, after running it on my set, it seems
                              some variations do not show results.</div>
<div><br>
</div>
<div><br>
</div>
                            Take, for example, the indel rs141080692.
                            <div> </div>
                            <div>When I run it through my script, this is
                              the information I get:</div>
<div><br>
</div>
<div>
<div>rs141080692 GT
1000GENOMES:pilot_1_CEU_low_coverage_panel
- deletion 9
123543905 123543907</div>
<div>rs141080692 -
1000GENOMES:pilot_1_CEU_low_coverage_panel
- deletion 9
123543905 123543907</div>
<div>rs141080692 GT
1000GENOMES:pilot_1_CHB+JPT_low_coverage_panel
- deletion 9 123543905
123543907</div>
<div>rs141080692 -
1000GENOMES:pilot_1_CHB+JPT_low_coverage_panel
- deletion 9 123543905
123543907</div>
<div>rs141080692 GT
1000GENOMES:pilot_1_YRI_low_coverage_panel
- deletion 9
123543905 123543907</div>
<div>rs141080692 -
1000GENOMES:pilot_1_YRI_low_coverage_panel
- deletion 9
123543905 123543907</div>
<div>rs141080692 GT GMI:AK_Koreans
- deletion 9 123543905
123543907</div>
<div>rs141080692 - GMI:AK_Koreans
- deletion 9 123543905
123543907</div>
<div>rs141080692 GT GMI:NA10851
- deletion 9
123543905 123543907</div>
<div>rs141080692 - GMI:NA10851
- deletion 9
123543905 123543907</div>
<div>rs141080692 GT SSMP:SSM
- deletion 9 123543905
123543907</div>
<div>rs141080692 - SSMP:SSM
- deletion 9 123543905
123543907</div>
</div>
<div><br>
</div>
                            <div>However, looking at the same database on
                              your website:</div>
<div><br>
</div>
<div><a href="http://dec2015.archive.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=9:123543406-123544407;v=rs141080692;vdb=variation;vf=127601209" target="_blank">http://dec2015.archive.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=9:123543406-123544407;v=rs141080692;vdb=variation;vf=127601209</a><br>
</div>
<div><br>
</div>
<div>You can see that there is information
about its frequency in a whole bunch of
populations</div>
<div><br>
</div>
<div>How do I go about fetching these?</div>
<div><br>
</div>
<div>My script is pretty basic</div>
<div><br>
</div>
                            <div>First I fetch all populations, or only the
                              ones I am interested in, with:</div>
<div><br>
</div>
<div>
<div>foreach my $pop
(@{$population_adaptor->fetch_all()}){</div>
<div><span style="white-space:pre-wrap"> </span>my
$name = $pop->name();</div>
<div><span style="white-space:pre-wrap"> </span>if
(defined $name){</div>
<div><span style="white-space:pre-wrap"> </span>if
(defined $population){</div>
<div><span style="white-space:pre-wrap"> </span>if
($name =~ /\Q$population/){</div>
<div><span style="white-space:pre-wrap"> </span>print
STDERR "Selected Populations: $name \n";</div>
<div><span style="white-space:pre-wrap"> </span>push
@selected_populations, $name; </div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div><span style="white-space:pre-wrap"> </span>}else{</div>
<div><span style="white-space:pre-wrap"> </span>print
STDERR "Selected Populations: $name \n";</div>
<div><span style="white-space:pre-wrap"> </span>push
@selected_populations, $name; </div>
<div><span style="white-space:pre-wrap"> </span></div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div>}</div>
</div>
<div><br>
</div>
<div>I then use the variation adaptor to get
the variation object</div>
<div><br>
</div>
<div>
<div> my $variation =
$variation_adaptor->fetch_by_name($id);</div>
</div>
<div><br>
</div>
                            <div>Then I cycle through each variation
                              feature with </div>
<div><br>
</div>
<div>
<div>foreach my $vf
(@{$vf_adaptor->fetch_all_by_Variation($var)}){</div>
</div>
<div>
<div><span style="white-space:pre-wrap"> </span>my
@alleles = @{$vf->get_all_Alleles};</div>
</div>
<div><br>
</div>
<div>
<div><span style="white-space:pre-wrap"> </span>ALLELE_CYCLE:foreach
my $a (@alleles){</div>
<div><span style="white-space:pre-wrap"> </span>my
$astr = $a->allele();</div>
<div><span style="white-space:pre-wrap"> </span>my
$pop = $a->population();</div>
<div><span style="white-space:pre-wrap"> </span>my
$pop_name = "-";</div>
<div><span style="white-space:pre-wrap"> </span>if
(defined $pop){</div>
<div><span style="white-space:pre-wrap"> </span>$pop_name
= $a->population->name() ;</div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div><span style="white-space:pre-wrap"> </span>my
$freq = $a->frequency() || "-";</div>
<div><span style="white-space:pre-wrap"> </span></div>
                              <div><span style="white-space:pre-wrap">      </span>foreach
                                my $p (@selected_populations){</div>
<div><span style="white-space:pre-wrap"> </span>#print
STDERR $pop_name."\t".$p."\n";</div>
<div><span style="white-space:pre-wrap"> </span>if
($pop_name eq $p){</div>
<div><span style="white-space:pre-wrap"> </span>print
$out_fh join "\t", (<span style="white-space:pre-wrap"> </span>$var->name(),</div>
<div><span style="white-space:pre-wrap"> </span>$astr,</div>
<div><span style="white-space:pre-wrap"> </span>$pop_name,</div>
<div><span style="white-space:pre-wrap"> </span>$freq,</div>
<div><span style="white-space:pre-wrap"> </span>$varClass, </div>
<div><span style="white-space:pre-wrap"> </span>$chr, </div>
<div><span style="white-space:pre-wrap"> </span>$start, </div>
<div><span style="white-space:pre-wrap"> </span>$end."\n");</div>
<div><span style="white-space:pre-wrap"> </span>next
ALLELE_CYCLE;</div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div><span style="white-space:pre-wrap"> </span>}</div>
<div><span style="white-space:pre-wrap"> </span>}</div>
</div>
<div>}</div>
<div><br>
</div>
<div>Am I doing something wrong? </div>
                            <div>The Phase 3 population data, for
                              example, are clearly included on your
                              site.</div>
<div><br>
</div>
<div>Many thanks</div>
<div><br>
</div>
<div>Duarte</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<fieldset class="gmail-m_-121412223426997746mimeAttachmentHeader"></fieldset>
<pre class="gmail-m_-121412223426997746moz-quote-pre">_______________________________________________
Dev mailing list <a class="gmail-m_-121412223426997746moz-txt-link-abbreviated" href="mailto:Dev@ensembl.org" target="_blank">Dev@ensembl.org</a>
Posting guidelines and subscribe/unsubscribe info: <a class="gmail-m_-121412223426997746moz-txt-link-freetext" href="http://lists.ensembl.org/mailman/listinfo/dev" target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a>
Ensembl Blog: <a class="gmail-m_-121412223426997746moz-txt-link-freetext" href="http://www.ensembl.info/" target="_blank">http://www.ensembl.info/</a>
</pre>
</blockquote>
</div>
</blockquote></div>