<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Nathan,<br>
      <br>
      Thanks a lot for the reply, it was very helpful. I was trying to add
      ENSG IDs (Ensembl 71) to the GEO annotation file available for this
      platform. I expected some drop-out (due to annotation differences,
      etc.) but wasn't expecting ~9000 protein-coding genes (14000
      probesets) to go missing between GEO and Ensembl. I guess a more
      stringent QC strategy would probably explain that. I was
      worried that there was a problem with my way of doing this, but
      this doesn't seem to be the case.<br>
      <br>
      Thanks for your help.<br>
      <br>
      Olly<br>
      <br>
      On 17/06/13 21:39, Nathan Johnson wrote:<br>
    </div>
    <blockquote
      cite="mid:074560E5-3FCB-42AD-B57E-0E819DC80CD4@ebi.ac.uk"
      type="cite">
      Hi Oliver
      <div><br>
      </div>
      <div>The reason why this isn't being considered as a transcript
        xref is because it is on the wrong strand.  This is an easy
        mistake to make as many of the array technologies differ in how
        they process the RNA sample and hence what strand is actually
        hybridised when it eventually meets the array.</div>
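      <div><br>
      </div>
      <div>To make the strand point concrete, here is a minimal Python
        sketch (not the actual Ensembl mapping pipeline). It assumes a
        probe only counts as a hit when it overlaps an exon on the same
        strand as the transcript, and it applies the 'more than 50% of
        the probes' rule quoted further down this thread. The Feature
        class, the function names and all coordinates are hypothetical.</div>
      <pre>
```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Feature:
    """A genomic interval with inclusive coordinates; strand is +1 or -1."""
    start: int
    end: int
    strand: int


def probe_hits_transcript(probe: Feature, exons: Sequence[Feature]) -> bool:
    """Assumption: a probe counts only if it overlaps an exon on the same strand."""
    return any(
        probe.strand == exon.strand
        and min(probe.end, exon.end) >= max(probe.start, exon.start)
        for exon in exons
    )


def fraction_hitting(probes: Sequence[Feature], exons: Sequence[Feature]) -> float:
    """Fraction of probes in the probeset that hit the transcript."""
    if not probes:
        return 0.0
    hits = sum(probe_hits_transcript(p, exons) for p in probes)
    return hits / len(probes)


def probeset_maps(probes: Sequence[Feature], exons: Sequence[Feature],
                  threshold: float = 0.5) -> bool:
    """Apply the 'more than 50% of the probes' rule."""
    return fraction_hitting(probes, exons) > threshold


# Hypothetical example: one exon on the forward strand and five probes,
# three of which overlap the exon by position.
exons = [Feature(100, 500, +1)]
coords = [(120, 144), (200, 224), (300, 324), (900, 924), (950, 974)]
antisense = [Feature(s, e, -1) for s, e in coords]
sense = [Feature(s, e, +1) for s, e in coords]

print(probeset_maps(antisense, exons))  # False: no probe is on the transcript strand
print(probeset_maps(sense, exons))      # True: 3 of 5 probes (60%) hit
```
</pre>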
      <div><br>
      </div>
      <div>There is a diagram of the IVT processing on this page:</div>
      <div><br>
      </div>
      <div><a moz-do-not-send="true"
href="http://www.affymetrix.com/estore/browse/products.jsp?categoryIdClicked=&productId=131415#1_1">http://www.affymetrix.com/estore/browse/products.jsp?categoryIdClicked=&productId=131415#1_1</a></div>
      <div><br>
      </div>
      <div>That said, that particular set of alignments does look
        like it was designed for the exons of that gene, albeit with
        some exon boundary overlap. However, IVT arrays normally target
        3' ends and UTRs specifically, which makes this particular
        probeset even more odd.</div>
      <div><br>
      </div>
      <div> Sorry I can't be of more help.</div>
      <div><br>
      </div>
      <div>Nathan</div>
      <div><br>
      </div>
      <div><br>
      </div>
      <div><br>
        <div>
          <div>On 17 Jun 2013, at 15:58, Oliver Burren <<a
              moz-do-not-send="true"
              href="mailto:oliver.burren@cimr.cam.ac.uk">oliver.burren@cimr.cam.ac.uk</a>>
            wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div bgcolor="#FFFFFF" text="#000000"> Hi,<br>
              <br>
              I'm trying to retrieve all probeset ID mappings to Ensembl
              genes for [HG-U133_Plus_2] Affymetrix Human Genome U133
              Plus 2.0 Array (<a moz-do-not-send="true"
                class="moz-txt-link-freetext"
                href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL570">http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL570</a>)
              using ensmart 71. However, I noticed a large drop-out relative
              to the GEO annotation file, so I did some digging...<br>
              <br>
              <br>
              If I query BioMart with something like this:<br>
              <br>
              <br>
              <pre style="font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE Query&gt;
&lt;Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6"&gt;
    &lt;Dataset name="hsapiens_gene_ensembl" interface="default"&gt;
        &lt;Filter name="affy_hg_u133_plus_2" value="205332_at"/&gt;
        &lt;Attribute name="ensembl_gene_id"/&gt;
        &lt;Attribute name="ensembl_transcript_id"/&gt;
    &lt;/Dataset&gt;
&lt;/Query&gt;
</pre>
              I get no results. However if I search the website for
              205332_at and turn on the track for AFFY:HG-U133_Plus_2 it
              shows that the probeset (6 features) maps to the gene. The
              help on this page <a moz-do-not-send="true"
                class="moz-txt-link-freetext"
href="http://www.ensembl.org/info/docs/microarray_probe_set_mapping.html">http://www.ensembl.org/info/docs/microarray_probe_set_mapping.html</a>
              says 'it is normally required that more than 50% of the
              probes in a probe set hit a given transcript sequence'. Is
              this the reason why this probeset isn't being tagged to
              this gene (although this appears to be 60%)?<br>
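              <br>
              A minimal Python sketch, assuming the same dataset, filter
              and attribute names as the query above, of how that query
              could be generated for any probeset ID. The function name
              build_biomart_query is hypothetical, the XML declaration
              and DOCTYPE lines are left out, and submitting the query
              to BioMart is not shown.<br>
              <pre>
```python
import xml.etree.ElementTree as ET


def build_biomart_query(
    probeset_id: str,
    dataset: str = "hsapiens_gene_ensembl",
    probe_filter: str = "affy_hg_u133_plus_2",
    attributes: tuple = ("ensembl_gene_id", "ensembl_transcript_id"),
) -> str:
    """Return a BioMart-style XML query string for one probeset ID."""
    # Root element with the same settings as the hand-written query.
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "0",
        "count": "",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Restrict the results to the given probeset.
    ET.SubElement(ds, "Filter", {"name": probe_filter, "value": probeset_id})
    # One Attribute element per requested output column.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


print(build_biomart_query("205332_at"))
```
</pre>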
              <br>
              Any light that you could shed would be appreciated.
              Thanks,<br>
              <br>
              Olly Burren<br>
              <br>
              <br>
            </div>
            _______________________________________________<br>
            Dev mailing list    <a moz-do-not-send="true"
              href="mailto:Dev@ensembl.org">Dev@ensembl.org</a><br>
            Posting guidelines and subscribe/unsubscribe info: <a
              moz-do-not-send="true"
              href="http://lists.ensembl.org/mailman/listinfo/dev">http://lists.ensembl.org/mailman/listinfo/dev</a><br>
            Ensembl Blog: <a moz-do-not-send="true"
              href="http://www.ensembl.info/">http://www.ensembl.info/</a><br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>