<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
<title></title>
</head>
<body text="#000000" bgcolor="#ffffff">
Hello<br>
<br>
Can I confirm exactly what variation synonyms are? I thought they
represented entries for the same variation in different sources, but
when I look at the variation_synonym table most synonyms seem to
have a source of dbSNP (i.e. they have the same source as the
original variation). Variations in dbSNP from the same chromosome
location are generally merged into one rs cluster, so the notion of a
synonym within dbSNP doesn't really exist. The only thing I have ever
come across in dbSNP that resembles a synonym is where two submitted
variations have flanking sequences of different lengths that are
identical in the overlapping regions, but the SNP with the shorter
flanking sequence can map to multiple genomic locations whereas the
variation with the longer sequence maps to only one location. In
this case the variations have not been merged into one entry.<br>
However, there are 7.6 million dbSNP variation synonyms in ensembl, so
I don't think they all represent the scenario I have just described.<br>
<br>
mysql&gt; select count(*) from variation_synonym vs inner join
variation v on v.variation_id = vs.variation_id where vs.source_id = 1
and v.source_id = 1;<br>
This returns a count of 7.6 million.<br>
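<br>
A breakdown of synonyms by source pair might be more telling than the single count above. What follows is only a minimal sketch in Python, run against a toy in-memory SQLite database rather than the real MySQL schema: the source table with its name column, the helper function names and every row of data are assumptions made up for illustration.<br>
<pre>
```python
import sqlite3


def build_toy_db():
    """Create a tiny in-memory stand-in for the variation tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source (source_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE variation (
            variation_id INTEGER PRIMARY KEY, source_id INTEGER, name TEXT);
        CREATE TABLE variation_synonym (
            variation_synonym_id INTEGER PRIMARY KEY,
            variation_id INTEGER, source_id INTEGER, name TEXT);
        INSERT INTO source VALUES (1, 'dbSNP'), (2, 'OtherSource');
        INSERT INTO variation VALUES (10, 1, 'rs_toy_1'), (11, 1, 'rs_toy_2');
        INSERT INTO variation_synonym VALUES
            (100, 10, 1, 'toy_syn_a'),
            (101, 10, 1, 'toy_syn_b'),
            (102, 11, 2, 'toy_syn_c');
        """
    )
    return conn


def synonym_source_breakdown(conn):
    """Count synonyms per (variation source, synonym source) pair."""
    sql = """
        SELECT sv.name AS variation_source,
               ss.name AS synonym_source,
               COUNT(*) AS n
        FROM variation_synonym vs
        INNER JOIN variation v ON v.variation_id = vs.variation_id
        INNER JOIN source sv ON sv.source_id = v.source_id
        INNER JOIN source ss ON ss.source_id = vs.source_id
        GROUP BY sv.name, ss.name
        ORDER BY n DESC, sv.name, ss.name
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in synonym_source_breakdown(build_toy_db()):
        print(row)
```
</pre>
On the toy rows, the (dbSNP, dbSNP) pair groups the synonyms that share a source with their variation, which is the case the count above measures.<br>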
<br>
Also, in the biomart snp_61 database I notice that some of the fields
in the main snp table (hsapiens_snp__variation__main) are named
something like<br>
variation_synonyn_OMIM_bool<br>
which I presume is a boolean field specifying whether the SNP
has an entry in the variation_synonym table whose source is OMIM.<br>
<br>
However many of the fields have a name like<br>
variation_synonym_DGV.....bool (e.g.
variation_synonym_DGVaestd21_bool)<br>
And there are corresponding dimension tables for each field with
this type of name<br>
<br>
Could you tell me what these fields represent, as I don't know
what the connection is between SNPs and DGV variants? The term
'variation_synonym' in the field name also seems a bit misleading, as
there are no SNPs in DGV. Interestingly, there are no filters related
to DGV on the biomart web interface for the snp dataset within the
human variation database, so I couldn't work out from biomart what
these fields might be.<br>
<br>
thanks very much<br>
<br>
On 17/02/2011 23:40, Pontus Larsson wrote:
<blockquote
cite="mid:AANLkTi=wRgqzJtvazDwvf=JHO3hmgx2dGPUhkS0ULGcp@mail.gmail.com"
type="cite">Hi Andrea,
<div><br>
</div>
<div>The data we import from OMIM are annotations of phenotypes
associated with dbSNP variations and, as such, they are stored in
the variation_annotation table (they are neither independent
variations nor synonyms for existing variations, so we don't
store them in the variation or variation_synonym table).</div>
<div><br>
</div>
<div>There is some support in the API for working with these; you
may want to take a look at the VariationAnnotation and related
modules.</div>
<div><br>
</div>
<div>As you have noticed, there is also a variation set for
variations with OMIM phenotype annotations (this variation set
is a subset of the 'Phenotype-associated variations' set). For
the task you want to do, the best approach is probably what you
already suggested: to get the variations in this variation set
and intersect it with your list of variations.</div>
<div><br>
</div>
<div>Best regards</div>
<div>/Pontus</div>
<div><br>
<br>
<div class="gmail_quote">2011/2/17 Andrea Edwards <span
dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:edwardsa@cs.man.ac.uk">edwardsa@cs.man.ac.uk</a>&gt;</span><br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">hello<br>
<br>
I am trying to find whether the human SNPs (60,000) I have
are listed in OMIM. I believe most SNPs in ensembl have a
primary source of dbSNP. None of the human variations have a
source_id of 15 (OMIM). There is a table, variation_synonym, to
hold data about multiple sources for a SNP, but I can't find
any entries in this table with source_id = 15
either. What am I doing wrong? There is a variation_set
called OMIM which has 11509 SNPs. I investigated some of
these variations at random, but I can't see how you have
linked them to the OMIM variation set.<br>
<br>
I have seen there are methods get_all_synonyms and
get_all_synonym_sources on the Perl API for a variation. I
presume I could call get_all_synonyms('OMIM'), but I don't
see how that can work when no variation synonyms have a
source of 15/OMIM.<br>
<br>
Out of general curiosity, will the following two approaches
give the same results: getting the OMIM variation set and
checking whether each of my 60,000 SNPs is in it, or getting
the OMIM variation_synonyms for the 60,000 SNPs and seeing
which return an actual result? I presume the second
option will be far faster.<br>
<br>
thanks a lot<br>
<br>
_______________________________________________<br>
Dev mailing list<br>
<a moz-do-not-send="true" href="mailto:Dev@ensembl.org"
target="_blank">Dev@ensembl.org</a><br>
<a moz-do-not-send="true"
href="http://lists.ensembl.org/mailman/listinfo/dev"
target="_blank">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>