Hi,<br>  As described previously, I'm trying to run the low-coverage annotation pipeline for our fish genome (~800 Mb), sequenced on an Illumina GAII.<br>  The doc low_coverage_gene_build.txt tells me to prepare my own compara db, so I turned to ensembl-compara.<br>

  For my fish genome, I have ~75k scaffolds (length >= 200 bp, N50 ~1 Mb), of which 2600 are longer than 1000 bp. My reference genome is stickleback, and I followed the README-pairaligner doc.<br>  As the reference genome is split into ~2000 chunks (1 Mb each), there would be 2000 x 75000 = 150M pairaligner jobs, which is too many to run at my institute.<br>
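
  To make the arithmetic explicit, here is a rough sketch of the job count. It assumes one job per (reference chunk, scaffold) pair, which is only my reading of README-pairaligner; the function and variable names are made up for illustration and are not part of the pipeline.

```python
def pairaligner_jobs(ref_chunks: int, query_scaffolds: int) -> int:
    """Job count, assuming every reference chunk is paired with every scaffold."""
    return ref_chunks * query_scaffolds


REF_CHUNKS = 2000        # ~2000 stickleback chunks of 1 Mb each
ALL_SCAFFOLDS = 75_000   # all scaffolds >= 200 bp
LONG_SCAFFOLDS = 2_600   # scaffolds longer than 1000 bp

all_jobs = pairaligner_jobs(REF_CHUNKS, ALL_SCAFFOLDS)
long_jobs = pairaligner_jobs(REF_CHUNKS, LONG_SCAFFOLDS)

print(f"all scaffolds:  {all_jobs:,} jobs")
print(f"long scaffolds: {long_jobs:,} jobs")
```

  Under that assumption, restricting to the scaffolds longer than 1000 bp would cut the job count by a factor of about 29.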

  Here are my questions:<br>  1. Should I use only the scaffolds longer than 1000 bp?<br>  2. Am I following the right doc? Which doc should I read to produce an alignment in which 'each bp in the target genome should be represented at most once' (quoted from low_coverage_gene_build.txt)? I don't quite understand README-2xalignment and README-low-coverage-genome-aligner.<br><br>Thank you<br><br>Best regards<br><br>-- <br>Zhang Di<br>