<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Jessie,<div class=""><br class=""></div><div class=""><div class="">For short variants including SNVs and indels:</div><div class="">You can use <a href="ftp://ftp.ensembl.org/pub/release-91/variation/vcf/homo_sapiens/1000GENOMES-phase_3.vcf.gz" class="">ftp://ftp.ensembl.org/pub/release-91/variation/vcf/homo_sapiens/1000GENOMES-phase_3.vcf.gz</a> to filter for common variants. For each variant we report the allele frequency in each of the super populations (EAS, EUR, AFR, AMR, SAS) studied in the 1000 Genomes phase 3 project.</div><div class=""><br class=""></div><div class="">Example rows from the file:</div><div class="">19  10368804  rs373966690 TAAGTAA T . . dbSNP_150;TSA=deletion;E_Freq;E_1000G;MA=-;MAF=0.00439297;MAC=22;EAS_AF=0.0149;EUR_AF=0.002;AMR_AF=0.0014;SAS_AF=0.0031;AFR_AF=0.0008</div><div class="">9 10368969  rs12720251  C G,T . . dbSNP_150;TSA=SNV;E_Freq;E_1000G;MA=T;MAF=0.00479233;MAC=24;AA=C;EAS_AF=0,0;EUR_AF=0,0;AMR_AF=0,0.0014;SAS_AF=0,0;AFR_AF=0,0.0174</div><div class=""><br class=""></div><div class="">You can extract the frequencies from AFR_AF, AMR_AF, EUR_AF, EAS_AF and SAS_AF, which are listed in the INFO column and report the frequency of each variant (ALT) allele. When a site has more than one variant allele, the values are comma-separated in the same order as the ALT alleles. In the second row the ALT alleles are G,T, so AFR_AF=0,0.0174 gives a frequency of 0 for G and 0.0174 for T. Here, the variant allele observed in 1000 Genomes is T.</div><div class=""><br class=""></div><div class="">For structural variants including CNVs:</div><div class="">We don't include frequencies in our structural variation data dumps. 
However, we compute structural variation allele frequencies based on samples from the 1000 Genomes project and display them on our website:</div><div class=""><br class=""></div><div class="">For example:</div><div class=""><a href="http://www.ensembl.org/Homo_sapiens/StructuralVariation/Evidence?db=core;r=12:131494150-131500971;sv=esv3631253;svf=118155531;vdb=variation" class="">http://www.ensembl.org/Homo_sapiens/StructuralVariation/Evidence?db=core;r=12:131494150-131500971;sv=esv3631253;svf=118155531;vdb=variation</a></div><div class=""><br class=""></div><div class="">You can access the frequencies with our Perl API. Alternatively, you can use the VCF file from the 1000 Genomes FTP site, which contains the allele frequencies you are looking for:</div><div class=""><a href="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/" class="">ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/</a></div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">For example:</div><div class="">tabix <a href="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/ALL.wgs.mergedSV.v8.20130502.svs.genotypes.vcf.gz" class="">ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/ALL.wgs.mergedSV.v8.20130502.svs.genotypes.vcf.gz</a> 12:131979195-131979196</div><div class=""><br class=""></div><div class="">12  131979195 DUP_gs_CNV_12_131979195_131985016 T &lt;CN0&gt;,&lt;CN2&gt; . PASS  AC=1,8;AF=0.00019968,0.00159744;AFR_AF=0,0.0015;AMR_AF=0.0014,0.0029;AN=5008;CS=DUP_gs;EAS_AF=0,0.002;END=131985016;EUR_AF=0,0.001;NS=2504;SAS_AF=0,0.001;SITEPOST=0.8758;SVTYPE=CNV</div><div class=""><br class=""></div><div class="">AFR_AF, AMR_AF, EAS_AF, EUR_AF and SAS_AF report the allele frequencies for the variant alleles, which are &lt;CN0&gt;,&lt;CN2&gt; in this case. The values are listed in the same order as the alleles, so AFR_AF=0,0.0015 means a frequency of 0 for &lt;CN0&gt; and 0.0015 for &lt;CN2&gt;.</div><div class=""><br class=""></div><div class="">I hope that answers your questions. 
Please let us know if you have further questions.</div><div class=""><br class=""></div><div class="">Best,</div><div class="">Anja</div></div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 22 Dec 2017, at 10:54, Jessie Poquérusse &lt;<a href="mailto:jessie.poquerusse@gmail.com" class="">jessie.poquerusse@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Hello, <br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">I'm having a hard time getting my hands on the best, most recent VCF datasets, mapped to GRCh38/hg38, of SNP/indel and CNV/SV variations, which I could then filter according to commonality in the general population to obtain a list of common-only variants. My question is thus two-fold:</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">1) What is the best source for such variations, and <br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">2) Are there any instructions on how the variant frequency is encoded (does it correspond to E_freq, as per <a href="ftp://ftp.ensembl.org/pub/release-91/variation/vcf/homo_sapiens/README" class="">ftp://ftp.ensembl.org/pub/release-91/variation/vcf/homo_sapiens/README</a>), and how to filter for this?</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">I realize I've asked a version of this question a few days ago, but now would love more details.<br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Thank-you &amp; 
happy holidays!<br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br class=""></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Best,</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Jessie<br class=""></div></div>
_______________________________________________<br class="">Dev mailing list    <a href="mailto:Dev@ensembl.org" class="">Dev@ensembl.org</a><br class="">Posting guidelines and subscribe/unsubscribe info: <a href="http://lists.ensembl.org/mailman/listinfo/dev" class="">http://lists.ensembl.org/mailman/listinfo/dev</a><br class="">Ensembl Blog: <a href="http://www.ensembl.info/" class="">http://www.ensembl.info/</a><br class=""></div></blockquote></div><br class=""></div></body></html>