ElasticSearch Plugin
Compatible with DokuWiki
- 2024-02-06 "Kaos" yes
- 2023-04-04 "Jack Jackrum" yes
- 2022-07-31 "Igor" yes
- 2020-07-29 "Hogfather" yes
Similar to docsearch, searchtext
Installation
External requirements: This plugin requires the following additional components that must be installed separately:
- Elasticsearch instance (latest tested version 8.16)
Download and install the plugin using the Extension Manager. Refer to Plugins on how to install plugins manually.
Configure and set up the plugin as outlined under Configuration and Settings below.
Examples/Usage
Once installed, the plugin will automatically react to the ?do=search action, replacing DokuWiki's builtin search. However, the typical DokuWiki query shortcuts (like @namespace) do not work in ElasticSearch.
All access rights are respected, so users will only find pages for which they have at least read privileges.
You can use the Advanced Search Tools to filter the results by namespace and date of the last modification. With the tagging plugin installed, the search tools will include a dropdown with the tags available for the current result set. If none of the current results are tagged, the filter is not displayed.
If the text plugin is installed, the rendered content of pages will be indexed in addition to the raw wiki syntax. By default, search uses both fields. To disable searching in the syntax, change the searchSyntax setting in the plugin's configuration.
Configuration and Settings
To integrate this plugin, you will need to do some configuration and run a few commands on the command line.
- enter the configuration of your Elasticsearch server in the Configuration Settings
- optional: if your ES instance has security enabled (default setting since 8.0), enter the authentication username and password in Configuration Settings
- copy and adjust a sample configuration for media indexing
- create the Index
- index your pages
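Before entering the server details, it can help to confirm that the Elasticsearch instance is reachable with the credentials you intend to use. The following is a minimal sketch, not part of the plugin: the function name es_ping and the variables ES_USER and ES_PASS are made up for this example. It requests the root endpoint of the server, which answers with cluster and version information.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the plugin.
# es_ping BASE_URL
# Requests the root endpoint of an Elasticsearch server. If ES_USER is set,
# HTTP basic authentication is sent using ES_USER and ES_PASS (both are
# names invented for this sketch).
es_ping() {
    if [ -n "${ES_USER:-}" ]; then
        curl --silent --show-error --fail --user "$ES_USER:$ES_PASS" "$1/"
    else
        curl --silent --show-error --fail "$1/"
    fi
}
```

A call such as es_ping http://localhost:9200 assumes a local instance on the default port. Instances with security enabled are often served over HTTPS, in which case the URL and possibly curl's certificate options need adjusting.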
Indexing media
Copy the elasticsearch.conf.example included in this plugin's conf directory into /conf/elasticsearch.conf. The defaults should be fine for typical Linux servers. They include some popular file extensions and point to tools for extracting text from them, either as UNIX commands or a URL to the /rmeta endpoint of an Apache Tika server.
Creating the Index
Use the provided command line tool to create the index.
./bin/plugin.php elasticsearch createindex
The name of the index is determined by the configuration.
Re-creating the index
Sometimes it is necessary to throw away the old index and replace it with a new one. This can be done via the same DokuWiki script with an additional parameter:
./bin/plugin.php elasticsearch createindex --clear
Re-creating the index is necessary when upgrading the plugin, or when you add new plugins that integrate with the ElasticSearch plugin (like translation or tagging).
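A re-created index presumably starts out empty, so a full indexing run is likely wanted right afterwards. The two steps can be sketched as a small wrapper script. This is an illustration under assumptions: DOKUWIKI_DIR and DRY_RUN are variables invented for this example (not plugin settings), and /var/www/dokuwiki is only a placeholder path.

```shell
#!/bin/sh
# Sketch only: re-create the index, then refill it with a full indexing run.
# DOKUWIKI_DIR and DRY_RUN are assumptions of this example.
DOKUWIKI_DIR=${DOKUWIKI_DIR:-/var/www/dokuwiki}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rebuild_index() {
    (
        # Work from the DokuWiki root so that ./bin/plugin.php resolves.
        cd "$DOKUWIKI_DIR" || exit 1
        # Only start indexing if the index was re-created successfully.
        run ./bin/plugin.php elasticsearch createindex --clear &&
        run ./bin/plugin.php elasticsearch index
    )
}
```

Calling rebuild_index with DRY_RUN=1 set prints the two commands instead of executing them, which is a cheap way to check the path before touching a live index.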
Languages and fuzzy search
One of the main reasons for using a dedicated search engine is that it provides advanced features, such as fuzzy search.
If you have configured multiple languages using the Translation plugin, they will be recognized.
By default, all available translations are searched. Users can change the language selection in Advanced Search Tools.
You can also enable translation detection in the plugin configuration. The option is called detectTranslation. When activated, the search will try to detect the current language context from the top namespace, and then set the language filter accordingly. For example, if the translation plugin is configured to handle the en es fr namespaces and the user starts the search when browsing the page es:capítulos:tres, the language filter will be automatically set to es.
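The detection rule described above can be illustrated with a few lines of shell. This is a sketch of the rule only, not the plugin's implementation: the function detect_lang is hypothetical, and what the plugin does when the top namespace is not a configured language is not stated here (the sketch simply prints nothing).

```shell
#!/bin/sh
# Hypothetical helper that mirrors the described rule; not plugin code.
# detect_lang PAGE_ID "LANG1 LANG2 ..."
# Prints the configured language that equals the top namespace of PAGE_ID.
# Prints nothing and returns 1 if there is no match.
detect_lang() {
    page_id=$1
    languages=$2
    case $page_id in
        *:*) top=${page_id%%:*} ;;  # text before the first colon
        *)   top= ;;                # page lives in the root namespace
    esac
    for lang in $languages; do
        if [ "$lang" = "$top" ]; then
            printf '%s\n' "$lang"
            return 0
        fi
    done
    return 1
}
```

For the example from the text, detect_lang 'es:capítulos:tres' 'en es fr' prints es, while a page such as wiki:syntax yields no language.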
Index management
The pages will be indexed automatically when browsed, just like with the DokuWiki builtin mechanism. When a page is updated, its entry in the index will be updated as well.
You can also force indexing the whole Wiki at once using the CLI tool. This is recommended when you build the index for the first time or when you have made extensive changes (like moving pages or updating the ACLs).
./bin/plugin.php elasticsearch index
You can index pages or media separately:
./bin/plugin.php elasticsearch index --only=pages
./bin/plugin.php elasticsearch index --only=media
Other plugins
The tagging plugin integrates well with Elasticsearch. You can search for tags explicitly using #sometag search terms. If any of the results are tagged, a tag filter will be added to Advanced Search Tools.
Development
Plugin integration
The Elasticsearch plugin emits several events that other plugins can use to put their own data into the search index. Take a look at the implementation of the tagging plugin to see how those events can be used.
- PLUGIN_ELASTICSEARCH_CREATEMAPPING: Triggered when creating the index. Plugins may add their own fields and mappings via event data.
- PLUGIN_ELASTICSEARCH_INDEXPAGE: While indexing a page, plugins can provide their own data.
- PLUGIN_ELASTICSEARCH_FILTERS: Adds search configurations provided by plugins.
- PLUGIN_ELASTICSEARCH_SEARCHFIELDS: Lets plugins add their own fields to the list of search fields included in the Elastic query.
- PLUGIN_ELASTICSEARCH_QUERY: Lets plugins append their data to the query string.
Early Access Features
Additional features are available as early access through our DokuWiki Business Plugin Partner Program.
The plugin has been refactored and dependencies on the Ruflin and Elastic libraries have been replaced by a lightweight wrapper around the Elastic API. This makes it easier to maintain the plugin and add new features in the future.
As a side-effect, the plugin will now use DokuWiki's proxy settings when connecting to the ElasticSearch servers. No need to specify them separately anymore.
The following new features are available in the early access release:
- Support for the @namespace syntax to search in a given namespace
- Autocompletion in the search box: you will automatically get suggestions based on the available data in the index
- Spelling Correction: when your search query seems to contain misspelled words, the correct query is automatically suggested
- Filter by File Type: you can now search for a specific file type (like PDF or PowerPoint)
- Improved Search Interface
- Better filter handling: adding new filters is easier, and you immediately see which filters are currently set
- Nicer results: results are easier to scan and filter attributes (like language and tags) are shown with the results
- Image Preview: when images are returned as a search result, a thumbnail preview is shown alongside the hit.