diff --git a/docs/reference/api-conventions.asciidoc b/docs/reference/api-conventions.asciidoc
Lines changed: 72 additions & 0 deletions
diff --git a/docs/reference/query-dsl/queries/flt-field-query.asciidoc b/docs/reference/query-dsl/queries/flt-field-query.asciidoc
Lines changed: 2 additions & 2 deletions
diff --git a/docs/reference/query-dsl/queries/flt-query.asciidoc b/docs/reference/query-dsl/queries/flt-query.asciidoc
Lines changed: 2 additions & 2 deletions
diff --git a/docs/reference/query-dsl/queries/fuzzy-query.asciidoc b/docs/reference/query-dsl/queries/fuzzy-query.asciidoc
Lines changed: 54 additions & 31 deletions
diff --git a/docs/reference/query-dsl/queries/match-query.asciidoc b/docs/reference/query-dsl/queries/match-query.asciidoc
Lines changed: 9 additions & 8 deletions
diff --git a/docs/reference/query-dsl/queries/query-string-query.asciidoc b/docs/reference/query-dsl/queries/query-string-query.asciidoc
Lines changed: 4 additions & 4 deletions
diff --git a/docs/reference/search/suggesters/completion-suggest.asciidoc b/docs/reference/search/suggesters/completion-suggest.asciidoc
Lines changed: 4 additions & 3 deletions
diff --git a/src/main/java/org/apache/lucene/queryparser/classic/MapperQueryParser.java b/src/main/java/org/apache/lucene/queryparser/classic/MapperQueryParser.java
Lines changed: 2 additions & 1 deletion
@@ -122,6 +122,21 @@ fields within a document indexed treated as boolean fields.
 All REST APIs support providing numbered parameters as `string` on top
 of supporting the native JSON number types.
 
+[[time-units]]
+[float]
+=== Time units
+
+Whenever durations need to be specified, eg for a `timeout` parameter, the duration
+can be specified as a whole number representing time in milliseconds, or as a time value like `2d` for 2 days.  The supported units are:
+
+[horizontal]
+`y`::   Year
+`M`::   Month
+`w`::   Week
+`h`::   Hour
+`m`::   Minute
+`s`::   Second
+
 [[distance-units]]
 [float]
 === Distance Units
@@ -144,6 +159,63 @@ Centimeter::    `cm` or `centimeters`
 Millimeter::    `mm` or `millimeters`
 
 
+[[fuzziness]]
+[float]
+=== Fuzziness
+
+Some queries and APIs support parameters to allow inexact _fuzzy_ matching,
+using the `fuzziness` parameter. The `fuzziness` parameter is context
+sensitive which means that it depends on the type of the field being queried:
+
+[float]
+==== Numeric, date and IPv4 fields
+
+When querying numeric, date and IPv4 fields, `fuzziness` is interpreted as a
+`+/-` margin. It behaves like a <<query-dsl-range-query>> where:
+
+    -fuzziness <= field value <= +fuzziness
+
+The `fuzziness` parameter should be set to a numeric value, eg `2` or `2.0`. A
+`date` field interprets a long as milliseconds, but also accepts a string
+containing a time value -- `"1h"` -- as explained in <<time-units>>. An `ip`
+field accepts a long or another IPv4 address (which will be converted into a
+long).
+
+[float]
+==== String fields
+
+When querying `string` fields, `fuzziness` is interpreted as a
+http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein Edit Distance]
+-- the number of one character changes that need to be made to one string to
+make it the same as another string.
+
+The `fuzziness` parameter can be specified as:
+
+`0`, `1`, `2`::
+
+the maximum allowed Levenshtein Edit Distance (or number of edits)
+
+`AUTO`::
++
+--
+generates an edit distance based on the length of the term. For lengths:
+
+`0..1`:: must match exactly
+`1..4`:: one edit allowed
+`>4`:: two edits allowed
+
+`AUTO` should generally be the preferred value for `fuzziness`.
+--
+
+`0.0..1.0`::
+
+converted into an edit distance using the formula: `length(term) * (1.0 -
+fuzziness)`, eg a `fuzziness` of `0.6` with a term of length 10 would result
+in an edit distance of `4`. Note: in all APIs except for the
+<<query-dsl-flt-query>>, the maximum allowed edit distance is `2`.
+
+
+
 [float]
 === Result Casing
 
 
@@ -33,8 +33,8 @@ The `fuzzy_like_this_field` top level parameters include:
 |`max_query_terms` |The maximum number of query terms that will be
 included in any generated query. Defaults to `25`.
 
-|`min_similarity` |The minimum similarity of the term variants. Defaults
-to `0.5`.
+|`fuzziness` |The fuzziness of the term variants. Defaults
+to `0.5`. See  <<fuzziness>>.
 
 |`prefix_length` |Length of required common prefix on variant terms.
 Defaults to `0`.
 
@@ -32,8 +32,8 @@ Defaults to the `_all` field.
 |`max_query_terms` |The maximum number of query terms that will be
 included in any generated query. Defaults to `25`.
 
-|`min_similarity` |The minimum similarity of the term variants. Defaults
-to `0.5`.
+|`fuzziness` |The minimum similarity of the term variants. Defaults
+to `0.5`. See  <<fuzziness>>.
 
 |`prefix_length` |Length of required common prefix on variant terms.
 Defaults to `0`.
 
@@ -1,12 +1,15 @@
 [[query-dsl-fuzzy-query]]
 === Fuzzy Query
 
-A fuzzy query that uses similarity based on Levenshtein (edit
-distance) algorithm. This maps to Lucene's `FuzzyQuery`.
+The fuzzy query uses similarity based on Levenshtein edit distance for
+`string` fields, and a `+/-` margin on numeric and date fields.
 
-Warning: this query is not very scalable with its default prefix length
-of 0 - in this case, *every* term will be enumerated and cause an edit
-score calculation or `max_expansions` is not set.
+==== String fields
+
+The `fuzzy` query generates all possible matching terms that are within  the
+maximum edit distance specified in `fuzziness` and then checks the term
+dictionary to find out which of those generated terms actually exist in the
+index.
 
 Here is a simple example:
 
@@ -17,63 +20,83 @@ Here is a simple example:
 }
 --------------------------------------------------
 
-More complex settings can be set (the values here are the default
-values):
+Or with more advanced settings:
 
 [source,js]
 --------------------------------------------------
-    {
-        "fuzzy" : { 
-            "user" : {
-                "value" : "ki",
-                "boost" : 1.0,
-                "min_similarity" : 0.5,
-                "prefix_length" : 0
-            }
+{
+    "fuzzy" : {
+        "user" : {
+            "value" :         "ki",
+            "boost" :         1.0,
+            "fuzziness" :     2,
+            "prefix_length" : 0,
+            "max_expansions": 100
         }
     }
+}
 --------------------------------------------------
 
-The `max_expansions` parameter (unbounded by default) controls the
-number of terms the fuzzy query will expand to.
+[float]
+===== Parameters
+
+[horizontal]
+`fuzziness`::
+
+    The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>.
+
+`prefix_length`::
+
+    The number of initial characters which will not be ``fuzzified''. This
+    helps to reduce the number of terms which must be examined. Defaults
+    to `0`.
+
+`max_expansions`::
+
+    The maximum number of terms that the `fuzzy` query will expand to.
+    Defaults to `0`.
+
+
+WARNING: this query can be very heavy if `prefix_length` and `max_expansions`
+are both set to their defaults of `0`. This could cause every term in the
+index to be examined!
+
 
 [float]
-==== Numeric / Date Fuzzy
+==== Numeric and date fields
+
+Performs a <<query-dsl-range-query>> ``around'' the value using the
+`fuzziness` value as a `+/-` range, where:
+
+    -fuzziness <= field value <= +fuzziness
 
-`fuzzy` query on a numeric field will result in a range query "around"
-the value using the `min_similarity` value. For example:
+For example:
 
 [source,js]
 --------------------------------------------------
 {
     "fuzzy" : {
         "price" : {
             "value" : 12,
-            "min_similarity" : 2
+            "fuzziness" : 2
         }
     }
 }
 --------------------------------------------------
 
-Will result in a range query between 10 and 14. Same applies to dates,
-with support for time format for the `min_similarity` field:
+Will result in a range query between 10 and 14. Date fields support
+<<time-units,time values>>, eg:
 
 [source,js]
 --------------------------------------------------
 {
     "fuzzy" : {
         "created" : {
             "value" : "2010-02-05T12:05:07",
-            "min_similarity" : "1d"
+            "fuzziness" : "1d"
         }
     }
 }
 --------------------------------------------------
 
-In the mapping, numeric and date types now allow to configure a
-`fuzzy_factor` mapping value (defaults to 1), which will be used to
-multiply the fuzzy value by it when used in a `query_string` type query.
-For example, for dates, a fuzzy factor of "1d" will result in
-multiplying whatever fuzzy value provided in the min_similarity by it.
-Note, this is explicitly supported since query_string query only allowed
-for similarity valued between 0.0 and 1.0.
+See <<fuzziness>> for more details about accepted values.
@@ -34,9 +34,10 @@ The `analyzer` can be set to control which analyzer will perform the
 analysis process on the text. It default to the field explicit mapping
 definition, or the default search analyzer.
 
-`fuzziness` can be set to a value (depending on the relevant type, for
-string types it should be a value between `0.0` and `1.0`) to constructs
-fuzzy queries for each term analyzed. The `prefix_length` and
+`fuzziness` allows _fuzzy matching_ based on the type of field being queried.
+See <<fuzziness>> for allowed settings.
+
+The `prefix_length` and
 `max_expansions` can be set in this case to control the fuzzy process.
 If the fuzzy option is set the query will use `constant_score_rewrite`
 as its <<query-dsl-multi-term-rewrite,rewrite
@@ -80,9 +81,9 @@ change that the `zero_terms_query` option can be used, which accepts
 .cutoff_frequency
 The match query supports a `cutoff_frequency` that allows
 specifying an absolute or relative document frequency where high
-frequent terms are moved into an optional subquery and are only scored 
-if one of the low frequent (below the cutoff) terms in the case of an 
-`or` operator or all of the low frequent terms in the case of an `and` 
+frequent terms are moved into an optional subquery and are only scored
+if one of the low frequent (below the cutoff) terms in the case of an
+`or` operator or all of the low frequent terms in the case of an `and`
 operator match.
 
 This query allows handling `stopwords` dynamically at runtime, is domain
@@ -101,8 +102,8 @@ Note: If the `cutoff_frequency` is used and the operator is `and`
 _stacked tokens_ (tokens that are on the same position like `synonym` filter emits)
 are not handled gracefully as they are in a pure `and` query. For instance the query
 `fast fox` is analyzed into 3 terms `[fast, quick, fox]` where `quick` is a synonym
-for `fast` on the same token positions the query might require `fast` and `quick` to 
-match if the operator is `and`. 
+for `fast` on the same token positions the query might require `fast` and `quick` to
+match if the operator is `and`.
 
 Here is an example showing a query composed of stopwords exclusivly:
 
 
@@ -46,8 +46,8 @@ increments in result queries. Defaults to `true`.
 |`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
 expand to. Defaults to `50`
 
-|`fuzzy_min_sim` |Set the minimum similarity for fuzzy queries. Defaults
-to `0.5`
+|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
+to `AUTO`. See  <<fuzziness>> for allowed settings.
 
 |`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
 is `0`.
@@ -70,7 +70,7 @@ in the resulting boolean query should match. It can be an absolute value
 both>>.
 
 |`lenient` |If set to `true` will cause format based failures (like
-providing text to a numeric field) to be ignored. 
+providing text to a numeric field) to be ignored.
 |=======================================================================
 
 When a multi term query is being generated, one can control how it gets
@@ -128,7 +128,7 @@ search on all "city" fields:
 
 Another option is to provide the wildcard fields search in the query
 string itself (properly escaping the `*` sign), for example:
-`city.\*:something`. 
+`city.\*:something`.
 
 When running the `query_string` query against multiple fields, the
 following additional parameters are allowed:
 
@@ -199,7 +199,7 @@ curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
         "completion" : {
             "field" : "suggest",
             "fuzzy" : {
-                "edit_distance" : 2
+                "fuzziness" : 2
             }
         }
     }
@@ -210,8 +210,9 @@ The fuzzy query can take specific fuzzy parameters.
 The following parameters are supported:
 
 [horizontal]
-`edit_distance`::
-    Maximum edit distance, defaults to `1`
+`fuzziness`::
+    The fuzziness factor, defaults to `AUTO`.
+    See  <<fuzziness>> for allowed settings.
 
 `transpositions`::
     Sets if transpositions should be counted
 
@@ -30,6 +30,7 @@
 import org.elasticsearch.common.lucene.Lucene;
 import org.elasticsearch.common.lucene.search.Queries;
 import org.elasticsearch.common.lucene.search.XFilteredQuery;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.index.mapper.FieldMapper;
 import org.elasticsearch.index.mapper.MapperService;
 import org.elasticsearch.index.query.QueryParseContext;
@@ -435,7 +436,7 @@ private Query getFuzzyQuerySingle(String field, String termStr, String minSimila
             if (currentMapper != null) {
                 try {
                     //LUCENE 4 UPGRADE I disabled transpositions here by default - maybe this needs to be changed
-                    Query fuzzyQuery = currentMapper.fuzzyQuery(termStr, minSimilarity, fuzzyPrefixLength, settings.fuzzyMaxExpansions(), false);
+                    Query fuzzyQuery = currentMapper.fuzzyQuery(termStr, Fuzziness.build(minSimilarity), fuzzyPrefixLength, settings.fuzzyMaxExpansions(), false);
                     return wrapSmartNameQuery(fuzzyQuery, fieldMappers, parseContext);
                 } catch (RuntimeException e) {
                     if (settings.lenient()) {
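
Reviewer's sketch: the rules described in the new `[[fuzziness]]` section can be written out roughly as below. This is not the `org.elasticsearch.common.unit.Fuzziness` class the patch imports. `FuzzinessSketch` and all of its method names are hypothetical and exist only for illustration. Two points are assumptions rather than facts from the patch: the documented `AUTO` ranges `0..1` and `1..4` overlap at length 1, and the sketch treats length 1 as "must match exactly"; and the `0.0..1.0` conversion truncates toward zero, which the docs do not specify.

```java
// Illustrative only: NOT the org.elasticsearch.common.unit.Fuzziness class from this patch.
final class FuzzinessSketch {

    /** Cap the docs state for every API except fuzzy_like_this. */
    static final int MAX_EDITS = 2;

    private FuzzinessSketch() {}

    /** AUTO: edit distance chosen from the term length, per the documented ranges. */
    static int autoDistance(String term) {
        int len = term.length();
        if (len <= 1) {
            return 0; // assumption: length 1 belongs to "must match exactly"
        }
        if (len <= 4) {
            return 1; // one edit allowed
        }
        return 2;     // two edits allowed
    }

    /** 0.0..1.0 form: length(term) * (1.0 - fuzziness), before any cap is applied. */
    static int rawDistance(String term, double fuzziness) {
        if (fuzziness < 0.0 || fuzziness > 1.0) {
            throw new IllegalArgumentException("fuzziness must be within 0.0..1.0");
        }
        // assumption: truncate toward zero; the docs do not state a rounding mode
        return (int) (term.length() * (1.0 - fuzziness));
    }

    /** Same conversion with the documented maximum of 2 edits applied. */
    static int cappedDistance(String term, double fuzziness) {
        return Math.min(rawDistance(term, fuzziness), MAX_EDITS);
    }

    /** Numeric/date form: fuzziness is a +/- margin around the queried value. */
    static long[] numericRange(long value, long fuzziness) {
        return new long[] { value - fuzziness, value + fuzziness };
    }

    public static void main(String[] args) {
        System.out.println(autoDistance("ki"));
        System.out.println(rawDistance("abcdefghij", 0.6));
        long[] range = numericRange(12, 2);
        System.out.println(range[0] + ".." + range[1]);
    }
}
```

Keeping `rawDistance` and `cappedDistance` separate mirrors the note in the docs that the cap of `2` applies everywhere except `fuzzy_like_this`. `numericRange(12, 2)` corresponds to the `price` example, which the docs describe as a range query between 10 and 14.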