<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Dear Zhang<br>
<br>
    Yes, alignments can span an assembly gap. The gap is represented
    as a run of N's in the sequence, much like a hard-masked sequence.<br>
<br>
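    As a rough illustration (a sketch only, not Ensembl API code), the
    overlap between an alignment and an assembly gap can be worked out
    from the assembly table rows quoted below. The function names are
    hypothetical, the scaffold length of 1086 is assumed from the largest
    asm_end, and the components are assumed not to overlap each other:<br>
    <br>
    <pre>
```python
# Illustrative sketch; function names are hypothetical, not part of any Ensembl API.

def find_gaps(components, scaffold_length):
    """Return the (start, end) intervals, 1-based inclusive, that no component covers.

    Assumes the components do not overlap each other.
    """
    gaps = []
    next_uncovered = 1
    for start, end in sorted(components):
        if start != next_uncovered:
            gaps.append((next_uncovered, start - 1))
        next_uncovered = end + 1
    if next_uncovered != scaffold_length + 1:
        gaps.append((next_uncovered, scaffold_length))
    return gaps


def overlap_length(a, b):
    """Number of bases shared by two 1-based inclusive intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# (asm_start, asm_end) pairs from the quoted assembly table for scaffold_2621.
components = [(488, 717), (1, 419), (718, 1086)]
# Assumption: the scaffold ends at the largest asm_end.
gaps = find_gaps(components, scaffold_length=1086)

# dnafrag_start and dnafrag_end of the quoted genomic_align records.
alignment = (486, 567)
gap_bases = sum(overlap_length(alignment, gap) for gap in gaps)

print(gaps, gap_bases)
```
</pre>
    With the quoted figures this reports a single gap at 420-487, of
    which the alignment at 486-567 covers two bases.<br>
    <br>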
    Could you please explain when and how you get the exception about not
    finding the sequence-level pieces? <br>
<br>
Kind regards<br>
<br>
Javier<br>
<br>
On 04/12/11 13:11, Zhang Di wrote:
<blockquote
cite="mid:CAMHeD-fq48yNmpJpGFYGz9wXCO8VgkfDskFt8ypoYSzKytCYvg@mail.gmail.com"
type="cite">
<div>Hi,</div>
<div><br>
</div>
      <div>I finally got the Compara pipeline for whole-genome
        alignment to work.</div>
<div><br>
</div>
      <div>The results of RAW, CHAIN, and NET are all stored in the
        table 'genomic_align', distinguished by their
        method_link_species_set_id values.</div>
<div> </div>
      <div>I found that some CHAIN and NET records contain a
        few base pairs that belong to a gap region of their scaffold.</div>
<div><br>
</div>
<div>e.g.</div>
<div> </div>
<blockquote style="margin: 0 0 0 40px; border: none; padding:
0px;">
        <pre>mysql&gt; select method_link_species_set_id, dnafrag_id, dnafrag_start, dnafrag_end from genomic_align where dnafrag_id = 4465 and dnafrag_start=486;
+----------------------------+------------+---------------+-------------+
| method_link_species_set_id | dnafrag_id | dnafrag_start | dnafrag_end |
+----------------------------+------------+---------------+-------------+
|                          2 |       4465 |           486 |         567 |
|                          3 |       4465 |           486 |         567 |
+----------------------------+------------+---------------+-------------+</pre>
</blockquote>
<div><br>
</div>
      <div>In my core database, dnafrag_id = 4465 corresponds to
        scaffold_2621 (seq_region_id = 429785):</div>
<div><br>
</div>
<blockquote style="margin: 0 0 0 40px; border: none; padding:
0px;">
        <pre>mysql&gt; select * from assembly where asm_seq_region_id = 429785;
+-------------------+-------------------+-----------+---------+-----------+---------+-----+
| asm_seq_region_id | cmp_seq_region_id | asm_start | asm_end | cmp_start | cmp_end | ori |
+-------------------+-------------------+-----------+---------+-----------+---------+-----+
|            429785 |            181573 |       488 |     717 |         1 |     230 |  -1 |
|            429785 |            191688 |         1 |     419 |         1 |     419 |   1 |
|            429785 |            220761 |       718 |    1086 |         1 |     369 |   1 |
+-------------------+-------------------+-----------+---------+-----------+---------+-----+</pre>
</blockquote>
<div><br>
</div>
<div><br>
</div>
      <div>The 420-487 interval is a gap, so the alignment starting at 486
        overlaps it by two bases.</div>
<div><br>
</div>
      <div>
        Is this a normal result of CHAIN-NET?</div>
<div><br>
</div>
      <div>This is a problem because I want to use the compara_db for a
        low-coverage gene build, and it complains:</div>
<div><br>
</div>
<div>EXCEPTION:</div>
Could not find sequence-level pieces for
scaffold_2621/486-744<br clear="all">
<div><br>
</div>
      <div>Best regards</div>
<div><br>
</div>
-- <br>
Zhang Di<br>
<br>
<pre wrap="">_______________________________________________
Dev mailing list <a class="moz-txt-link-abbreviated" href="mailto:Dev@ensembl.org">Dev@ensembl.org</a>
List admin (including subscribe/unsubscribe): <a class="moz-txt-link-freetext" href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a>
Ensembl Blog: <a class="moz-txt-link-freetext" href="http://www.ensembl.info/">http://www.ensembl.info/</a>
</pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Javier Herrero, PhD
Ensembl Coordinator and Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK</pre>
</body>
</html>