<div dir="ltr"><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Dear Ensembl developers!</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Thank you for all your great work!</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">gnomAD 2.1 is a major update of the gnomAD database of variation in the human population </div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">(whole exome and whole genome sequencing).</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="https://macarthurlab.org/2018/10/17/gnomad-v2-1/" rel="nofollow" target="_blank" class="gmail-" style="color:rgb(25,106,212)">https://macarthurlab.org/2018/10/17/gnomad-v2-1/</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">We are using the Ensembl-hosted gnomAD VCF files in cloudbiolinux and bcbio.</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="https://github.com/chapmanb/cloudbiolinux" rel="nofollow" target="_blank" class="enhancr_card_7783636078" style="color:rgb(25,106,212)">chapmanb/cloudbiolinux</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="https://github.com/bcbio/bcbio-nextgen" rel="nofollow" target="_blank" class="enhancr_card_8460760495" 
style="color:rgb(25,106,212)">bcbio/bcbio-nextgen</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Unlike the gnomad2.0.1 files, the gnomad2.1 files are split by chromosome:</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="http://ftp.ensemblorg.ebi.ac.uk/pub/data_files/homo_sapiens/GRCh37/variation_genotype/gnomad/r2.1/exomes/" rel="nofollow" target="_blank" class="enhancr_card_7759586661" style="color:rgb(25,106,212)">Index of /pub/data_files/homo_sapiens/GRCh37/variation_genotype/gnomad/r2.1/exomes</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">To use gnomad2.1 in the annotation step of bcbio (we annotate with vcfanno), we decided to merge the files</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">and remove a number of INFO fields to reduce the file size; see the discussion here:</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="https://github.com/bcbio/bcbio-nextgen/issues/2736" rel="nofollow" target="_blank" class="enhancr_card_8361792277" style="color:rgb(25,106,212)">Using gnomad2.1: request for opinions · Issue #2736 · bcbio/bcbio-nextgen</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">We created recipes in cloudbiolinux to merge the gnomad2.1 VCFs for GRCh37, GRCh38, and hg19.<br></div><div 
style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><a href="https://github.com/chapmanb/cloudbiolinux/blob/master/ggd-recipes/GRCh37/gnomad.yaml" rel="nofollow" target="_blank" class="gmail-" style="color:rgb(25,106,212)">https://github.com/chapmanb/cloudbiolinux/blob/master/ggd-recipes/GRCh37/gnomad.yaml</a><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">However, the long running time makes it infeasible to merge the gnomAD VCF files in every local installation.</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">We decided to generate the merged files once and then provide users with an easy-to-install recipe.</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Would you kindly agree to host the merged gnomAD exome and genome VCFs for GRCh37 and GRCh38 on the Ensembl FTP server?</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">We would be happy to produce the files and upload them.</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">The technical steps for merging the VCFs are listed in the recipe: </div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">we sort the variants, keep only PASS variants, retain a predefined subset of INFO fields, etc.</div><div style="color:rgb(29,34,40);font-family:"Helvetica 
Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">We hope that many Ensembl users would benefit </div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">from the merged and relatively slim gnomad2.1 VCF files,</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">and we are happy to share our work with Ensembl.</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px"><br></div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Thanks!</div><div style="color:rgb(29,34,40);font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:13px">Sergey Naumenko</div></div>