User:Daveh/Lucene Search

This article details the experience of installing and configuring Lucene search for the wiki.

Dev Wiki Installation

  • Lucene Installation
  • Change all instances of uesp_net_wiki5 to uesp_net_wikidev in LocalSettings.php on dev.uesp.net (prevents issues later on).
  • Download Lucene Search 2.1.3 and uncompress.
  • Run: ./configure /home/uespdev/www/w
  • Edit lsearch-global.conf to look like the following:
    [Database]
    uesp_net_wikidev : (single) (spell,4,2) (prefix) (language,en)
  • Update the [Namespace-Prefix] section in lsearch-global.conf to include:
   [100] : 100
   [101] : 101
   [102] : 102
   [103] : 103
   [104] : 104
   [105] : 105
   [106] : 106
   [107] : 107
   [108] : 108
   [109] : 109
   [110] : 110
   [111] : 111
   [112] : 112
   [113] : 113
   [114] : 114
   [115] : 115
   [116] : 116
   [117] : 117
   [118] : 118
   [119] : 119
   [120] : 120
   [121] : 121
   [122] : 122
   [123] : 123
   [124] : 124
   [125] : 125
   [126] : 126
   [127] : 127
   [128] : 128
   [129] : 129
   [130] : 130
   [131] : 131
   [132] : 132
   [133] : 133
   [134] : 134
   [135] : 135
   [136] : 136
   [137] : 137
   [138] : 138
   [139] : 139
   [140] : 140
   [141] : 141
   [142] : 142
   [143] : 143
   [144] : 144
   [145] : 145
   [146] : 146
   [147] : 147
   all_talk : 1,3,5,7,9,11,13,15,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145
   all_content : 0,102,104,106,108,110,112,114,116,118,120,122,124,126,128,130,132,134,136,138,140,142,144,146
  • MWSearch Installation
  • Download and unzip the MWSearch extension for MediaWiki 1.19.
  • Install to /home/uespdev/www/w/extensions/MWSearch
  • Edit LocalSettings.php to add the following lines:
   $wgSearchType = 'LuceneSearch';
   $wgLuceneHost = '10.2.212.14';
   $wgLucenePort = 8123;
   require_once( "$IP/extensions/MWSearch/MWSearch.php" );
   // The following must come after the extension is included
   $wgEnableLucenePrefixSearch = true;
   $wgLucenePrefixHost = '10.2.212.14';
   $wgLuceneSearchVersion = 2.1;
  • Ensure the following parameters are set in LocalSettings.php:
   $wgUseAjax = true;
   $wgAjaxSearch = true;
   $wgEnableMWSuggest = true;
  • Check the Special:Version page to confirm installation.
  • Edit extensions/UespCustomCode/ and ensure the following lines are commented/uncommented as shown:
   # $wgAutoloadClasses['SiteSearchMySQL'] = $dir . 'SiteSearchMySQL.php';
   # $wgSearchType = 'SiteSearchMySQL';
   # $aSpecialPages['Search'] = array( 'SpecialPage', 'Search', '', true, 'efSiteSpecialSearch', $dir . 'SiteSpecialSearch.php');
  • Test the search.
  • OAIRepository Installation
  • Download the OAI extension.
  • Install into the MediaWiki extension folder: /home/uespdev/www/w/extensions/OAI
  • Add a line to LocalSettings.php
   require_once( "$IP/extensions/OAI/OAIRepo.php" );
  • Run the maintenance/update.php script to create the OAI tables.
  • Ensure that lsearch-global.conf has the following lines:
   [OAI]
   <default> : http://dev.uesp.net/w/index.php
  • Run the Lucene update script (run time = 1 min).
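
The [Namespace-Prefix] entries and the all_talk/all_content groups listed earlier in this section follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal shell sketch, not part of the Lucene Search distribution. The function names are illustrative, and the ranges are copied from the lists above (those lists leave 100, 101 and 147 out of the groups, which the sketch reproduces as-is).

```shell
#!/usr/bin/env bash
# Sketch: generate the [Namespace-Prefix] lines shown above.
# Function names are illustrative; the ranges are copied from the lists above.

# One "[N] : N" entry per custom namespace, 100 through 147.
namespace_prefix_lines() {
    local ns
    for ns in $(seq 100 147); do
        printf '[%d] : %d\n' "$ns" "$ns"
    done
}

# Group lines: odd (talk) and even (content) namespace numbers,
# joined with commas using paste.
namespace_group_lines() {
    printf 'all_talk : %s\n'    "$( { seq 1 2 15; seq 103 2 145; } | paste -sd, - )"
    printf 'all_content : %s\n' "$( { echo 0; seq 102 2 146; } | paste -sd, - )"
}

namespace_prefix_lines
namespace_group_lines
```

The output can be pasted into the [Namespace-Prefix] section of lsearch-global.conf.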

Live Installation

  • For now content2 will be the indexer and search host.
  • Add entries to MediaWiki:Common.css.
  • Install the OAI and MWSearch extensions.
  • Run maintenance/update.php on content3 (this updates the master database, db1).
  • Create the sub-domain search.uesp.net and point it at the appropriate server. Note that the indexer appears to need to run on a content server.
  • Copy /lsearch folder from dev.
  • Clear the dump and indexes folders.
  • Change hostname in lsearch.initd to search.uesp.net.
  • Set up the start script: cp lsearch.initd /etc/init.d/lsearch.
  • Edit lsearch.conf and update the localization URL:
   Localization.url = file:///home/uesp/www/w/languages/messages
  • Edit lsearch-global.conf:
   [Database]
   uesp_net_wiki5 : (single,true,20,100) (spell,10,3) (prefix) (language,en)
   
   [Search-Group]
   content2.uesp.net : *
   
   [Index]
   content2.uesp.net : *
   
   [Index-Path]
   <default> : /search
   
   [OAI]
   <default> : http://content2.uesp.net/w/index.php
   
   # Namespaces as before
  • Edit config.inc with the correct paths and hostnames (change hostname as appropriate):
   dbname=uesp_net_wiki5
   wgScriptPath=/w
   hostname=content2.uesp.net
   indexes=/lsearch/indexes
   mediawiki=/home/uesp/www/w
   base=/lsearch
   wgServer=http://www.uesp.net
  • To build the index, run ./build, which should use the db2 slave database. Monitor the load on content1/db2; if the build stalls at any point, interrupt the process, as it appears to lock up occasionally on the database side. A build takes approximately 5-6 minutes.
  • Start lsearchd on content2 and note any errors. Test to ensure it is working correctly.
  • Run update on content2 and ensure it works correctly.
  • Set up the init.d run script. Kill the current instance of Lucene manually and start it via the script. Ensure it has started and works correctly.
  • Set up the update script to run hourly in cron.hourly.
  • Edit LocalSettings.php to add the following lines:
   $wgUseAjax = true;
   $wgAjaxSearch = true;
   $wgEnableMWSuggest = true;
   require_once( "$IP/extensions/OAI/OAIRepo.php" );
   $wgSearchType = 'LuceneSearch';
   $wgLuceneHost = '10.2.212.12';
   $wgLucenePort = 8123;
   require_once( "$IP/extensions/MWSearch/MWSearch.php" );
   $wgEnableLucenePrefixSearch = true;
   $wgLucenePrefixHost = '10.2.212.12';
   $wgLuceneSearchVersion = 2.1;
  • Test search in the live site. Ensure that the suggest/prefix search works in addition to the normal search and the search options.
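
Before testing search through the wiki, it can help to confirm that lsearchd is accepting connections on the host and port configured in LocalSettings.php. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available. check_port is a hypothetical helper, not part of Lucene Search or MWSearch, and it only checks that the TCP port is open, not that search results are correct.

```shell
#!/usr/bin/env bash
# Sketch: check whether a TCP port accepts connections.
# check_port is a hypothetical helper; it relies on bash's /dev/tcp
# redirection and on coreutils' timeout to avoid hanging on a dead host.

check_port() {
    local host=$1 port=$2
    # The inner bash opens the connection on descriptor 3 and exits at once,
    # which closes it again. $0 and $1 are the host and port passed below.
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
        return 1
    fi
}

# Example, using the live values above:
#   check_port 10.2.212.12 8123
```

If the helper reports "closed" after lsearchd has been started, the daemon's own output is the next place to look for errors.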

Benchmarking

  • Tests were done using ApacheBench in the format: ab -kc 10 -t 30 http://...
  • Index changes in lsearch-global.conf seemed to have only a very minor effect on performance.
  • Index build times are typically 5-6 minutes on live servers using db2 (main database slave).
  • Index update times are typically ~30 seconds.
  • Simple prefix suggestions via the opensearch API typically take 70-120 ms. Whether the request was local or remote made little difference (remote requests were slightly slower, at 100-150 ms).
  • Simple searches directly to Lucene took 20-90 ms from the local host.
  • Search throughput maxes out around 350 req/s; prefix throughput maxes out around 100 req/s.
  • Average search time according to the search log is 50 ms.
  • Load on the search server (content2):
      • Average load = 0.03.
      • Rare spikes from 0.5 - 1.0.
      • Hourly spike up to 0.5 for 1 minute from the index update.
  • Search request rate averages 0.67/sec.
  • Prefix request rate averages 2.1/sec (estimate from Squid logs).
  • Based on the benchmarks, the search host is currently running at about 2-3% of its capacity.
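
The 2-3% figure is not derived explicitly above, but it appears consistent with dividing the observed request rates by the benchmarked maximums. The following is a rough worked check, assuming (this is an assumption, not something measured here) that search and prefix load simply add up on the same host. utilization_pct is an illustrative helper name.

```shell
#!/usr/bin/env bash
# Sketch: relate the observed request rates to the benchmarked maximums.
# utilization_pct is an illustrative helper, not part of any tool used above.

# Print observed/max as a percentage with two decimals.
utilization_pct() {
    LC_ALL=C awk -v observed="$1" -v max="$2" \
        'BEGIN { printf "%.2f\n", 100 * observed / max }'
}

search_pct=$(utilization_pct 0.67 350)   # 0.67 searches/s vs ~350 req/s max
prefix_pct=$(utilization_pct 2.1 100)    # 2.1 prefix req/s vs ~100 req/s max

# Assumption: both request types load the same host, so the shares add up.
total_pct=$(LC_ALL=C awk -v a="$search_pct" -v b="$prefix_pct" \
    'BEGIN { printf "%.2f\n", a + b }')

echo "search ${search_pct}%, prefix ${prefix_pct}%, total ${total_pct}%"
```

With the figures above this gives roughly 0.2% for searches and 2.1% for prefix requests, about 2.3% in total, which is in line with the 2-3% estimate.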

Custom Search Changes

A few skin changes were made at the same time as the Lucene search installation:

  • skins/MonoBook.php
  • skins/monobook/main.css
  • extensions/UespCustomCode/files/search-icon.png
  • extensions/UespCustomCode/SiteSpecialSearch.php
  • extensions/UespCustomCode/SiteCustomCode.php
  • MediaWiki:Common.css

Links