<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi all,</p>
    <p>I'm troubleshooting a Perl tool that calls the Ensembl API with a
      variant id and tries to find the gene with the closest 5' end
      within a 500 kb window. The tool was written by a colleague and it
      uses
      Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor::fetch_all_by_outward_search()
      like this:</p>
    <pre style="background-color:#ffffff;color:#000000;font-family:'Menlo';font-size:9.0pt;"><span style="color:#000080;font-weight:bold;">my </span>@gene_list_for_feature  = @{$gene_adaptor->fetch_all_by_outward_search( 
                                                   -FEATURE => $var_feature,
                                                   -RANGE => <span style="color:#0000ff;">10000</span>,
                                                   -MAX_RANGE => <span style="color:#0000ff;">500000</span>,
                                                   -LIMIT => <span style="color:#0000ff;">40</span>,
                                                   -FIVE_PRIME => <span style="color:#0000ff;">1</span>)};</pre>
    <p>According to the documentation of this function
      (<a class="moz-txt-link-freetext"
href="http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1BaseFeatureAdaptor.html#a76a51bc70828aaccb9435eda9a44b20a">http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1BaseFeatureAdaptor.html#a76a51bc70828aaccb9435eda9a44b20a</a>),
      it "Searches for features within the suggested -RANGE, and if it
      finds none, expands the search area until it satisfies -LIMIT or
      hits -MAX_RANGE". My understanding is that in my case it should
      search first in a 10 kb window and, if there are no genes,
      progressively expand it to up to 500 kb unless it finds 40
      features first. However, that is not the behaviour I am seeing:
      the search range grows as 10k, 20k, 60k, 240k and 1.20M.
      Is this a bug, or have I misunderstood what it does?</p>
    <p>I have looked into the code of this subroutine (<a
        moz-do-not-send="true"
href="https://github.com/Ensembl/ensembl/blob/release/96/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm#L1441-L1469">https://github.com/Ensembl/ensembl/blob/release/96/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm#L1441-L1469</a>)
      and the search window grows multiplicatively because each
      iteration multiplies the previous value, not the initial value:</p>
    <p><span class="pl-smi">[L1452] $search_range</span> = <span
        class="pl-smi">$search_range</span> * <span class="pl-smi">$factor</span>;</p>
    <p>In addition, the range is not expanded only when the initial
      window is empty: the loop keeps expanding until -LIMIT features
      have been found, as the while condition shows:</p>
    <p>[L1451] <span class="pl-k">while</span> (<span class="pl-c1">scalar</span>
      <span class="pl-smi">@results</span> &lt; <span class="pl-smi">$limit</span>
      &amp;&amp; <span class="pl-smi">$search_range</span> &lt;= <span
        class="pl-smi">$max_range</span>) {</p>
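    <p>The two quoted lines can be replayed outside the API to see how
      the range evolves. This is only a minimal standalone sketch, not
      the Ensembl code itself: it assumes that $factor starts at 1 and
      is incremented once per iteration (inferred from the observed
      ranges, not from the lines quoted above) and that -LIMIT is never
      satisfied. The name simulate_search_ranges is made up for
      illustration.</p>

```perl
use strict;
use warnings;

# Standalone simulation of the range expansion; no Ensembl calls are made.
# Assumption: $factor starts at 1 and grows by 1 per iteration.
# Assumption: the feature limit is never reached, so only the range
# check can stop the loop.
sub simulate_search_ranges {
    my ($range, $max_range) = @_;
    my $factor = 1;
    my @ranges;
    # Same test as "$search_range <= $max_range", written with >=
    while ($max_range >= $range) {
        $range = $range * $factor;    # multiplies the previous value
        push @ranges, $range;
        $factor++;
    }
    return @ranges;
}

# -RANGE 10000 and -MAX_RANGE 500000, as in the call shown earlier
print join(', ', simulate_search_ranges(10_000, 500_000)), "\n";
```

    <p>Under these assumptions the script prints 10000, 20000, 60000,
      240000, 1200000, which matches the sequence reported above.
      Because the range is checked before it is multiplied, the last
      search can go beyond -MAX_RANGE.</p>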
    <p>I am also confused by the fact that, apparently, features only
      need to lie partially within the range to be returned. For
      instance, ENSG00000150394 (CDH8) is found with the above parameters
      although its 5' end is 1,338,771 bp away from the variant
      according to the distance reported by the function. It seems that
      the feature is found because its 3' end is within the range, even
      though the 5' end, which is what I am interested in, is not. This
      somehow contradicts what the documentation says (<a
        moz-do-not-send="true"
href="https://github.com/Ensembl/ensembl/blob/release/96/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm#L1490-L1491">https://github.com/Ensembl/ensembl/blob/release/96/modules/Bio/EnsEMBL/DBSQL/BaseFeatureAdaptor.pm#L1490-L1491</a>):
      "When looking beyond the boundaries of the source Feature, the
      distance is measured to the nearest end of that Feature to the
      nearby Feature's nearest end."</p>
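    <p>One possible workaround, whatever the intended behaviour is, would
      be to filter the returned list on the reported distance. The
      sketch below is hedged accordingly: it assumes that each result is
      a [feature, distance] pair, and it uses plain strings and made-up
      distances (apart from the CDH8 value above) in place of real
      feature objects. The name filter_by_distance is hypothetical.</p>

```perl
use strict;
use warnings;

# Keep only results whose reported distance is within $max_distance.
# Assumption: each result is an array reference [feature, distance].
# abs() is used in case distances can be negative in some search modes.
sub filter_by_distance {
    my ($results, $max_distance) = @_;
    return [ grep { $max_distance >= abs($_->[1]) } @$results ];
}

# Hypothetical stand-in data: strings instead of feature objects.
my $results = [
    [ 'GENE_A',          12_500 ],       # made-up entry
    [ 'ENSG00000150394', 1_338_771 ],    # CDH8, distance quoted above
    [ 'GENE_B',          -480_000 ],     # made-up entry
];

my $kept = filter_by_distance($results, 500_000);
print join(', ', map { $_->[0] } @$kept), "\n";
```

    <p>With this stand-in data only GENE_A and GENE_B are kept, and CDH8
      is dropped because its reported distance exceeds 500 kb.</p>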
    <p>Any help will be much appreciated. I am happy to share code if
      you think it would be useful.</p>
    <p>Thanks,<br>
      Asier<br>
    </p>
  </body>
</html>