<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><font class="Apple-style-span" face="Courier">Hi,</font><div><font class="Apple-style-span" face="Courier">Please see below a list of intentions declared for Ensembl release 63 (scheduled for the end of June). Please note that these are intentions and are not guaranteed to be in the release.</font></div><div><font class="Apple-style-span" face="Courier">Regards,</font></div><div><font class="Apple-style-span" face="Courier">Rhoda Kinsella</font></div><div><font class="Apple-style-span" face="Courier"><br></font></div><div><pre><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px; "><font class="Apple-style-span" face="Courier">=======================================
Declarations of Intentions - Ensembl 63
=======================================
Compara
=======
pairwise alignments (All Species)
---------------------------------
human vs panda lastz
human vs marmoset lastz
human vs microbat lastz
human vs cow lastz
human haplotype alignments for high coverage blastz-net alignments
multiple alignments (All Species)
---------------------------------
6way-primate-epo alignments to incorporate new marmoset
12way-mammal-epo alignments to incorporate new marmoset and cow
19way-amniota-pecan alignments to incorporate new marmoset and cow
35way-mammal low-coverage-epo alignments ( new marmoset, panda, cow and microbat )
5way-fish (new mappings with HMM derived anchors)
syntenies (All Species)
-----------------------
human marmoset synteny
human cow synteny
ProteinTrees and homologies (All Species)
-----------------------------------------
GeneTrees (protein-coding) with new/updated genebuilds and assemblies
Clustering using hcluster_sg
Multiple sequence alignments using MCoffee, without the exon-disaligner module (AKA decaf)
Phylogenetic reconstruction using TreeBeST
Homology inference including the recent 'possible_ortholog', 'putative gene split' and 'contiguous gene split' exceptions
Pairwise gene-based dN/dS scores for high coverage species pairs only
GeneTree stable ID mapping
ncRNAtrees and homologies (All Species)
---------------------------------------
Classification based on RFAM model
Multiple sequence alignments with Infernal
Phylogenetic reconstruction using RAxML
Additional multiple sequence alignments with Prank (with genomic flanks)
Additional phylogenetic reconstruction using PhyML and NJ
Phylogenetic tree merging using TreeBeST
Homology inference
families (All Species)
----------------------
Clustering by MCL
Multiple Sequence Alignments with MAFFT
Family stable ID mapping
data dumps (All Species)
------------------------
EMF dumps for 19 way PECAN multiple alignments
BED files for 19 way GERP constrained elements
EMF dumps for 12 way EPO multiple alignments
EMF dumps for 35 way low-coverage alignments
BED files for 35 way low-coverage alignments
EMF dumps for 6 way EPO primate multiple alignments
Data dumps for ProteinTrees
Data dumps for ncRNAtrees
schema changes (All Species)
----------------------------
The linking table 'species_set' will be renamed to 'species_set_genome_db'.
There will be a new "header" table called 'species_set', in which 'species_set_id' will be the (unique) primary key.
species_set_genome_db.species_set_id will become a foreign key referencing species_set.species_set_id.
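The resulting layout can be sketched roughly as follows. This is an illustration in Python against an in-memory SQLite database, not the actual Compara patch (which targets MySQL). Only the two table names and species_set_id come from the description above; the genome_db_id column and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# New "header" table: one row per species set, unique primary key.
conn.execute("CREATE TABLE species_set (species_set_id INTEGER PRIMARY KEY)")

# The former 'species_set' linking table under its new name.
# genome_db_id is an assumed column name for the linked genome.
conn.execute(
    """
    CREATE TABLE species_set_genome_db (
        species_set_id INTEGER NOT NULL REFERENCES species_set (species_set_id),
        genome_db_id   INTEGER NOT NULL,
        UNIQUE (species_set_id, genome_db_id)
    )
    """
)

conn.execute("INSERT INTO species_set (species_set_id) VALUES (1)")
conn.executemany(
    "INSERT INTO species_set_genome_db VALUES (?, ?)",
    [(1, 90), (1, 91)],  # made-up genome_db_id values
)

# A link row that points at a species set with no header row is rejected.
try:
    conn.execute("INSERT INTO species_set_genome_db VALUES (2, 90)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

link_count = conn.execute(
    "SELECT COUNT(*) FROM species_set_genome_db"
).fetchone()[0]
```

With the header table in place, every species_set_id used in the linking table has to exist exactly once in 'species_set'.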
Core
====
xref projection (All Species)
-----------------------------
Project GO ids and gene names to species. Make alterations to zebrafish projections.
EMBL & Genbank dumps (All Species)
----------------------------------
EMBL & Genbank dumps for all species
Update schema version (All Species)
-----------------------------------
# Description:
# Update schema_version in meta table to 63.
UPDATE meta SET meta_value='63' WHERE meta_key='schema_version';
# Patch identifier
INSERT INTO meta (species_id, meta_key, meta_value)
VALUES (NULL, 'patch', 'patch_62_63_a.sql|schema_version');
Indexing changes for core database. (All Species)
-------------------------------------------------
Change the stable ID version column to NOT NULL with default 1 in exon_stable_id, gene_stable_id, transcript_stable_id, translation_stable_id and gene_archive.
Create a unique index on stable_id and version for tables exon_stable_id, gene_stable_id, transcript_stable_id, translation_stable_id and gene_archive.
Create a unique index for table unmapped_object.
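As a rough illustration of what the stable ID changes imply, the sketch below builds a simplified stand-in for gene_stable_id in an in-memory SQLite database using Python. It is not the real patch: the gene_id column, the index name and the example identifier are assumptions, and the production schema is MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for gene_stable_id; gene_id and the index name are assumed.
conn.execute(
    """
    CREATE TABLE gene_stable_id (
        gene_id   INTEGER PRIMARY KEY,
        stable_id TEXT    NOT NULL,
        version   INTEGER NOT NULL DEFAULT 1
    )
    """
)
conn.execute(
    "CREATE UNIQUE INDEX stable_id_version_idx ON gene_stable_id (stable_id, version)"
)

# Omitting the version falls back to the default of 1.
conn.execute(
    "INSERT INTO gene_stable_id (gene_id, stable_id) VALUES (1, 'ENSG_EXAMPLE_1')"
)
default_version = conn.execute(
    "SELECT version FROM gene_stable_id WHERE gene_id = 1"
).fetchone()[0]

# The index permits the same stable_id with a different version ...
conn.execute("INSERT INTO gene_stable_id VALUES (2, 'ENSG_EXAMPLE_1', 2)")

# ... but repeating an existing (stable_id, version) pair is rejected.
try:
    conn.execute("INSERT INTO gene_stable_id VALUES (3, 'ENSG_EXAMPLE_1', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```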
Remove field dbprimary_acc_linkable from external_db table. (All Species)
-------------------------------------------------------------------------
Remove field dbprimary_acc_linkable from external_db table.
Import of new LRGs (Human)
--------------------------
Removal of lowercase letter at the end of all database names (All Species)
--------------------------------------------------------------------------
For release 63 it has been decided to remove the lowercase letter at the end of all database names, as it is confusing for users and conveys little about actual data changes.
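For scripts that build or parse database names, the change amounts to something like the following Python sketch. The helper name and the example database names are illustrative only, and the pattern assumes the letter always directly follows the final digits of the name.

```python
import re


def strip_trailing_letter(db_name):
    """Drop a single lowercase letter that directly follows the final digit."""
    return re.sub(r"(\d)[a-z]$", r"\1", db_name)


# Illustrative names only; real database names may differ.
old_style = "homo_sapiens_core_62_37g"
new_style = strip_trailing_letter(old_style)

# Names without a trailing letter after a digit are left untouched.
unchanged = strip_trailing_letter("ensembl_compara_62")
```

Here new_style is "homo_sapiens_core_62_37", while a name that already ends in a digit passes through as-is.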
Update xrefs for core databases (All Species)
---------------------------------------------
Update xrefs for Human, Mouse, Rat, Pig, Macaque, Chimp, Orangutan, Fugu and Stickleback.
API web documentation overhaul (All Species)
--------------------------------------------
Replace PDoc system with Doxygen + Perl Filter to produce API reference web pages.
xref sources to be moved to gene level (All Species)
----------------------------------------------------
The following external database sources have been moved up to the gene level:
DBASS3, DBASS5, EntrezGene, miRBase, RFAM, UniGene, Uniprot_genename, WikiGene, MIM_GENE and MIM_MORBID.
Funcgen
=======
Updated Regulation API Tutorial (All Species)
---------------------------------------------
patch_62_63_a - Schema Version (All Species)
--------------------------------------------
This patch updates the schema version
patch_62_63_b - binding_matrix.analysis_id (All Species)
--------------------------------------------------------
This patch updates the analysis_id field of the binding_matrix table to a smallint
MicroArray Mapping (All Species)
--------------------------------
Mapping of expression arrays to Ensembl Transcripts has been updated for all relevant species i.e. those with new assemblies or gene builds.
RegulatoryFeatureAdaptor::fetch_all (All Species)
-------------------------------------------------
The base fetch_all method has been overridden for the RegulatoryFeatureAdaptor; it now defaults to returning only the MultiCell RegulatoryFeatures, as the other generic methods do.
ResultFeatureAdaptor method over-rides (All Species)
----------------------------------------------------
Where appropriate, some of the base feature adaptor methods have been overridden; this prevents some API errors due to the nature of the ResultFeature storage.
Added Motif Features for Missing Jaspar Matrices (Human)
--------------------------------------------------------
Motif Features were added for the following Jaspar Matrices:
E2F1: MA0024.1
NFKB: MA0105.1
BHLHE40: PB0111.1;PB0007.1
Nrsf: MA0138.1
Changed Motif Features score to [0-1] relative affinity scale (Human and Mouse)
-------------------------------------------------------------------------------
Instead of showing the absolute score from the MOODS software, we now display a linear relative value in [0-1], scaled between the minimum (0) and maximum (1) score. This makes it consistent with the API BindingMatrix::relative_affinity function and easier for the user to interpret.
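A linear rescaling of this kind can be sketched in Python as below. This is only an illustration of min-max scaling: the function name and the example numbers are made up, and how the minimum and maximum are determined for each matrix (or how BindingMatrix::relative_affinity computes its value) is not specified here.

```python
def relative_score(score, min_score, max_score):
    """Linearly rescale score so min_score maps to 0.0 and max_score to 1.0."""
    if max_score == min_score:
        raise ValueError("min_score and max_score must differ")
    return (score - min_score) / (max_score - min_score)


# Made-up absolute scores for one hypothetical matrix.
lowest = relative_score(5.0, 5.0, 15.0)
midpoint = relative_score(10.0, 5.0, 15.0)
highest = relative_score(15.0, 5.0, 15.0)
```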
Added a threshold field to the BindingMatrix table (All Species)
----------------------------------------------------------------
A new threshold float field was added to the Binding Matrix to store the minimum score for Motif Features from each matrix (patch_62_63_c).
Added species-specific thresholds to Binding Matrices (Human and Mouse)
-----------------------------------------------------------------------
Added to each Binding Matrix the lowest score for Motif Features belonging to that matrix and that species. This will make it easier for people using the API to know if the potential binding affinity for a given sequence goes above the currently used threshold (i.e. would be classified as a binding site).
Cleaned Regulatory Regions in chromosomal boundaries (Human and Mouse)
----------------------------------------------------------------------
In some rare cases, regulatory regions can extend past the boundaries of sequence regions (such as chromosomes). These cases will be removed as they are likely to be artifacts.
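The kind of check involved can be sketched in Python as follows. This is a hypothetical illustration, not the Funcgen pipeline code: the function names, the dictionary layout and the sequence-region length are invented, and coordinates are assumed to be 1-based and inclusive.

```python
def within_seq_region(start, end, seq_region_length):
    """True if the 1-based inclusive interval lies entirely on the sequence region."""
    return start >= 1 and seq_region_length >= end


def drop_out_of_bounds(features, seq_region_lengths):
    """Keep only features that lie entirely within their sequence region."""
    return [
        f
        for f in features
        if within_seq_region(f["start"], f["end"], seq_region_lengths[f["seq_region"]])
    ]


seq_region_lengths = {"chrA": 1000}  # made-up length
features = [
    {"seq_region": "chrA", "start": 100, "end": 600},
    {"seq_region": "chrA", "start": 900, "end": 1200},  # runs past the end
    {"seq_region": "chrA", "start": 0, "end": 50},  # starts before position 1
]
kept = drop_out_of_bounds(features, seq_region_lengths)
```

Only the first feature survives; the other two cross a boundary of the sequence region.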
Update of Regulation Metadata (All Species)
-------------------------------------------
CTCF is now classified generically as a "Transcription Factor" instead of "Insulator"
Genebuild
=========
New microbat assembly (Microbat)
--------------------------------
A full gene annotation on the new high coverage microbat assembly, Myoluc2.0
Removed duplicated dna in panda (Panda)
---------------------------------------
Scaffold dna sequences removed from the dna table
Rabbit xrefs (Rabbit)
---------------------
Missing xrefs added for ncRNAs
Human Vega annotation (Human)
-----------------------------
Manual annotation of human from Havana has been updated. This represents the annotation presented in Vega release 43
Zebrafish Vega annotation (Zebrafish)
-------------------------------------
Manual annotation of zebrafish from Havana has been updated. This represents the annotation presented in Vega release 43
GRCh37.p4 (Human)
-----------------
GRCh37.p4 added to the human databases.
GRCh37.p4 annotation (Human)
----------------------------
Gene annotation of the patches in the otherfeatures db.
Human cDNA update (Human)
-------------------------
A new cDNA db for human.
Mouse cDNA update (Mouse)
-------------------------
A new cDNA db for mouse.
New Cow Assembly (Cow)
----------------------
The first genebuild on cow assembly UMD3.1.
Update to Ensembl-Havana GENCODE gene set (release 8) (Human)
-------------------------------------------------------------
Update to Ensembl-Havana GENCODE gene set (release 8) - this is based on updated Ensembl gene set and latest Havana gene annotation.
Flagging obsolete Uniprot proteins (All Species)
------------------------------------------------
Flag the obsolete proteins in Uniprot used as supporting evidence
Flagging obsolete Ensembl proteins (All Species)
------------------------------------------------
Flag obsolete human Ensembl proteins used as supporting evidence
Logic name update (All Species)
-------------------------------
Whenever possible, logic names updated to be consistent across all databases
Zebrafish Vega merge (Zebrafish)
--------------------------------
A new Vega gene set has been merged with the Ensembl geneset from release 61.
Mart
====
BioMart 63 databases (All Species)
----------------------------------
Full build of all 7 marts.
Variation
=========
New rhesus macaque variation database (Macaque)
-----------------------------------------------
Based on dbSNP 131
Updates to human phenotype associations (Human)
-----------------------------------------------
OMIM, UniProt, NHGRI GWAS catalog, HGMD mutations, COSMIC
New mouse variation database (Mouse)
------------------------------------
Based on dbSNP 132
Add attrib_id column to variation_set (All Species)
---------------------------------------------------
An attrib_id column is added to variation_set in order to be able to provide general and human-friendly names to variation sets without breaking the web display.
Update structural variation data from DGVa (Dog, Human, Macaque, Mouse and Pig)
-------------------------------------------------------------------------------
DGVa
Schema changes (All Species)
----------------------------
# structural variation schema changes:
- Rename the columns bound_start to inner_start and bound_end to inner_end
- Add a column for validation status
- Change the column class to class_attrib_id, using more detailed SO terms.
# moved failed descriptions into attribute table
LRG data (Human)
----------------
import LRG variant data
add LRG consequences to the database
New individual genotypes (Human)
--------------------------------
Individual genotypes from Penn State University:
Han Chinese Individual (YanHuang Project)
Seong-Jin Kim (SJK, GUMS/KOBIC)
Anonymous Irish Male
Individual from the Extinct Palaeo-Eskimo Saqqaq (Saqqaq Genome Project)
Individual from the Extinct Palaeo-Eskimo Saqqaq, high confidence SNPs (Saqqaq Genome Project)
Anonymous Korean individual, AK1 (Genomic Medicine Institute)
Misha Angrist (Personal Genome Project)
Henry Louis Gates Jr (Personal Genome Project)
Henry Louis Gates Sr (Personal Genome Project)
Rosalynn Gill (Personal Genome Project)
Marjolein Kriek (Leiden University Medical Centre)
Stephen Quake (Stanford)
update variation consequences (Cow, Zebrafish and Human)
--------------------------------------------------------
update variation consequences on human, zebrafish and cow due to new gene sets
EnsemblGenomes
==============
New core database for Yeast (Yeast)
-----------------------------------
New core database for Saccharomyces cerevisiae to reflect the new assembly and genebuild from SGD
New otherfeatures database for Yeast (Yeast)
--------------------------------------------
Rebuilt otherfeatures database with new EST alignments reflecting the new assembly from SGD.
New funcgen database for Yeast (Yeast)
--------------------------------------
New functional genomics database for Yeast with new probe mapping to reflect the assembly update from SGD.
New variation database for Yeast (Yeast)
----------------------------------------
New variation database for Yeast with mapped variation features to reflect the latest assembly from SGD.
</font></span></font></pre><div apple-content-edited="true"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><font class="Apple-style-span" face="Courier"><br></font></div><div><font class="Apple-style-span" face="Courier"><br></font></div><div><font class="Apple-style-span" face="Courier"><br></font></div></div></span></div></div></div><div> <span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 
0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Rhoda Kinsella Ph.D.</div><div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; ">Ensembl Bioinformatician,</div></div><div>European Bioinformatics Institute (EMBL-EBI),<br>Wellcome Trust Genome Campus, </div><div>Hinxton<br>Cambridge CB10 1SD,</div><div>UK.</div></div></span></div></span> </div><br></body></html>