docs(i18n): expand the MessageFormat syntax documentation #11576
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,8 +18,10 @@ application means providing translations and localized formats for the abstracte | |
Angular supports i18n/l10n for {@link ng.filter:date date}, {@link ng.filter:number number} and | ||
{@link ng.filter:currency currency} filters. | ||
|
||
Additionally, Angular supports localizable pluralization support through the {@link | ||
ng.directive:ngPluralize `ngPluralize` directive}. | ||
Localizable pluralization is supported via the {@link ng.directive:ngPluralize `ngPluralize` | ||
directive}. Additionally, you can use <a href="#MessageFormat">MessageFormat extensions</a> to | ||
`$interpolate` for localizable pluralization and gender support in all interpolations via the | ||
`ngMessageFormat` module. | ||
|
||
All localizable Angular components depend on locale-specific rule sets managed by the {@link | ||
ng.$locale `$locale` service}. | ||
|
@@ -142,96 +144,200 @@ displaying the date with a timezone specified by the developer. | |
<a name="MessageFormat"></a> | ||
## MessageFormat extensions | ||
|
||
AngularJS interpolations via `$interpolate` and in templates | ||
support an extended syntax based on a subset of the ICU | ||
MessageFormat that covers plurals and gender selections. | ||
You can write localizable plural and gender based messages in Angular interpolation expressions and | ||
`$interpolate` calls. | ||
|
||
This syntax extension is provided by way of the `ngMessageFormat` module that your application can | ||
depend upon (shipped separately as `angular-message-format.min.js` and `angular-message-format.js`.) | ||
A current limitation of the `ngMessageFormat` module is that it does not support redefining the | ||
`$interpolate` start and end symbols. Only the default `{{` and `}}` are allowed. | ||
|
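As a minimal sketch of what "depending on" the module looks like, the snippet below declares an application module that lists `ngMessageFormat` as a dependency. The module name `messageFormatDemo` and the helper `defineDemoModule` are hypothetical, and the sketch assumes that `angular.js` and `angular-message-format.js` have already been loaded.

```javascript
// Assumption: angular.js and angular-message-format.js are loaded before this runs.
// "messageFormatDemo" is a hypothetical application module name.
function defineDemoModule(angular) {
  // Listing 'ngMessageFormat' as a dependency enables the extended
  // interpolation syntax for this application.
  return angular.module('messageFormatDemo', ['ngMessageFormat']);
}
```

In a page this would be called as `defineDemoModule(window.angular)`; taking `angular` as a parameter only keeps the sketch independent of a global.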
||
The syntax extension is based on a subset of the ICU MessageFormat syntax that covers plurals and | ||
gender selections. Please refer to the links in the “Further Reading” section at the bottom of this | ||
section. | ||
|
||
You may find it helpful to play with our [Plnkr Example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview) | ||
as you read the examples below. | ||
|
||
### Plural Syntax | ||
|
||
The syntax for plural based message selection looks like the following: | ||
|
||
```text | ||
{{NUMERIC_EXPRESSION, plural, | ||
=0 {MESSAGE_WHEN_VALUE_IS_0} | ||
=1 {MESSAGE_WHEN_VALUE_IS_1} | ||
=2 {MESSAGE_WHEN_VALUE_IS_2} | ||
=3 {MESSAGE_WHEN_VALUE_IS_3} | ||
... | ||
zero {MESSAGE_WHEN_PLURAL_CATEGORY_IS_ZERO} | ||
one {MESSAGE_WHEN_PLURAL_CATEGORY_IS_ONE} | ||
two {MESSAGE_WHEN_PLURAL_CATEGORY_IS_TWO} | ||
few {MESSAGE_WHEN_PLURAL_CATEGORY_IS_FEW} | ||
many {MESSAGE_WHEN_PLURAL_CATEGORY_IS_MANY} | ||
other {MESSAGE_WHEN_THERE_IS_NO_MATCH} | ||
}} | ||
``` | ||
|
||
Please refer to our [design doc](https://docs.google.com/a/google.com/document/d/1pbtW2yvtmFBikfRrJd8VAsabiFkKezmYZ_PbgdjQOVU/edit) | ||
for a lot more details. You may find it helpful to play with our [Plnkr Example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview). | ||
Please note that whitespace (including newline) is generally insignificant except as part of the | ||
actual message text that occurs in curly braces. Whitespace is generally used to aid readability. | ||
|
||
You can read more about the ICU MessageFormat syntax at | ||
[Formatting Messages | ICU User Guide](http://userguide.icu-project.org/formatparse/messages#TOC-MessageFormat). | ||
Here, `NUMERIC_EXPRESSION` is an expression that evaluates to a numeric value; the displayed | ||
message is selected from that value according to pluralization rules. | ||
|
||
Following the Angular expression, you would denote the plural extension syntax by the `, plural,` | ||
syntax element. The spaces there are optional. | ||
|
||
This is followed by a list of selection keyword and corresponding message pairs. The "other" | ||
keyword and corresponding message are **required** but you may have as few or as many of the other | ||
categories as you need. | ||
|
||
#### Selection Keywords | ||
|
||
The selection keywords can be either exact matches or language dependent [plural | ||
categories](http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html). | ||
|
||
Exact matches are written as the equal sign followed by the exact value. `=0`, `=1`, `=2` and | ||
`=123` are all examples of exact matches. Note that there should be no space between the equal sign | ||
and the numeric value. | ||
|
||
Plural category matches are single words corresponding to the [plural | ||
categories](http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html) of | ||
the CLDR plural category spec. These categories vary by locale. The "en" (English) locale, for | ||
example, defines just "one" and "other" while the "ga" (Irish) locale defines "one", "two", "few", | ||
"many" and "other". Typically, you would just write the categories for your language. During | ||
translation, the translators will add or remove more categories depending on the target locale. | ||
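To get a feel for how plural categories differ between locales, the sketch below queries the standard `Intl.PluralRules` API. This is only an illustration of the CLDR categories: it assumes a JavaScript runtime with `Intl` support, and it is not how Angular obtains its rules (those come from `$locale`). The helper name `categoryOf` is hypothetical.

```javascript
// Intl.PluralRules is a standard ECMAScript API backed by CLDR data.
// Assumption: used here for illustration only; Angular reads its rules from $locale.
function categoryOf(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

// English distinguishes only "one" and "other".
console.log([0, 1, 2].map((n) => categoryOf('en', n)));

// Irish ("ga") distinguishes more categories. The result depends on the
// locale data shipped with the runtime, so it is not reproduced here.
console.log([1, 2, 3, 7, 11].map((n) => categoryOf('ga', n)));
```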
|
||
Exact matches always win over keyword matches. Therefore, if you define both `=0` and `zero`, when | ||
the value of the expression is zero, the `=0` message is the one that will be selected. (The | ||
duplicate keyword categories are helpful when used with the optional `offset` syntax described | ||
later.) | ||
|
||
This extended syntax is provided by way of the | ||
`ngMessageFormat` module that your application can depend | ||
upon (shipped separately as `angular-message-format.min.js` and | ||
`angular-message-format.js`.) A current limitation of the | ||
`ngMessageFormat` module, is that it does not support | ||
redefining the `$interpolate` start and end symbols. Only the | ||
default `{{` and `}}` are allowed. | ||
|
||
This syntax extension, while based on MessageFormat, has | ||
been designed to be backwards compatible with existing | ||
AngularJS interpolation expressions. The key rule is simply | ||
this: **All interpolations are done inside double curlies.** | ||
The top level comma operator after an expression inside the | ||
double curlies causes MessageFormat extensions to be | ||
recognized. Such a top level comma is otherwise illegal in | ||
an Angular expression and is used by MessageFormat to | ||
specify the function (such as plural/select) and it's | ||
related syntax. | ||
|
||
To understand the extension, take a look at the ICU | ||
MessageFormat syntax as specified by the ICU documentation. | ||
Anywhere in that MessageFormat that you have regular message | ||
text and you want to substitute an expression, just put it | ||
in double curlies instead of single curlies that | ||
MessageFormat dictates. This has a huge advantage. **You | ||
are no longer limited to simple identifiers for | ||
substitutions**. Because you are using double curlies, you | ||
can stick in any arbitrary interpolation syntax there, | ||
including nesting more MessageFormat expressions! Some | ||
examples will make this clear. In the following example, I | ||
will only be showing you the AngularJS syntax. | ||
|
||
#### Messages | ||
|
||
Messages immediately follow a selection keyword and are optionally preceded by whitespace. They are | ||
written in single curly braces (`{}`). They may contain Angular interpolation syntax inside them. | ||
In addition, the `#` symbol is a placeholder for the actual numeric value of the expression. | ||
|
||
### Simple plural example | ||
|
||
``` | ||
```text | ||
{{numMessages, plural, | ||
=0 { You have no new messages } | ||
=1 { You have one new message } | ||
other { You have # new messages } | ||
=0 {You have no new messages} | ||
=1 {You have one new message} | ||
other {You have # new messages} | ||
}} | ||
``` | ||
|
||
While I won't be teaching you MessageFormat here, you will | ||
note that the `#` symbol works as expected. You could have | ||
also written it as: | ||
Because these messages can themselves contain Angular expressions, you could also write this as | ||
follows: | ||
|
||
``` | ||
```text | ||
{{numMessages, plural, | ||
=0 { You have no new messages } | ||
=1 { You have one new message } | ||
other { You have {{numMessages}} new messages } | ||
=0 {You have no new messages} | ||
=1 {You have one new message} | ||
other {You have {{numMessages}} new messages} | ||
You could even write it as:
:P

I should actually add a section on best practices but that would touch on many things so I have not done so. The example you have written is actually wrong. I mean, it's correct for English, but it's highly unlikely to work if you use regular translation tools. You're basically embedding a fragment of English text inside a subexpression and expecting that it will work. The Angular expression typically would become an opaque placeholder to the translator that they can't change (and can't see that it contains "no" inside it.) Also, in a different language, the position of "no" might need to move around. This scheme will definitely work if one is only supporting a single language. The following is also bad practice as it relies on the same type of "concatenation" assumption. (The best practice is to let the translator see the messages in their entirety and not piecemeal.)

You are right! |
||
}} | ||
``` | ||
|
||
where you explicitly typed in `numMessages` for "other" | ||
instead of using `#`. They are nearly the same except if | ||
you're using "offset". Refer to the ICU MessageFormat | ||
documentation to learn about offset. | ||
|
||
Please note that **other** is a **required** category (for | ||
both the plural syntax and the select syntax that is shown | ||
later.) | ||
### Plural syntax with optional `offset` | ||
|
||
The plural syntax supports an optional `offset` syntax that is used in matching. It's simpler to | ||
explain this with an example. | ||
|
||
```text | ||
{{recipients.length, plural, offset:1 | ||
I don't know if it is intended, but having 0 recipients (assuming the offset of But there doesn't seem to be a way to specify a messages for

For pluralization, negative numbers don't make sense; it's a logic bug. If you're using an offset, your messages are automatically written to use this information (e.g. "… and # other people…"). You can simply add exact matches for the cases that would cause you to have negative "#" evaluations.

I still don't get how am I supposed to handle a case with 0 recipients and |
||
=0 {You gave no gifts} | ||
=1 {You gave {{recipients[0].name}} a gift} | ||
one {You gave {{recipients[0].name}} and one other person a gift} | ||
other {You gave {{recipients[0].name}} and # other people a gift} | ||
}} | ||
``` | ||
|
||
When an `offset` is specified, the matching works as follows. First, the exact value of the Angular | ||
expression is matched against the exact matches (i.e. `=N` selectors) to find a match. If there is | ||
one, that message is used. If there was no match, then the offset value is subtracted from the | ||
value of the expression and locale specific pluralization rules are applied to this new value to | ||
obtain its plural category (such as “one”, “few”, “many”, etc.) and a match is attempted against the | ||
keyword selectors and the matching message is used. If there was no match, then the “other” | ||
category (required) is used. The value of the `#` character inside a message is the value of the | ||
original expression reduced by the offset value that was specified. | ||
|
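The matching order described above can be sketched in plain JavaScript. This is an illustrative re-implementation under stated assumptions, not the `ngMessageFormat` source: `selectPluralMessage`, `render` and `english` are hypothetical names, messages are plain strings rather than nested interpolations, escaping is ignored, and the recipient name "Alice" is made up for the example.

```javascript
// Illustrative sketch only: this is NOT the ngMessageFormat implementation.
// `choices` maps selectors ("=0", "one", "other", ...) to message strings.
// `pluralCategory` stands in for the locale-specific rules that $locale provides.
function selectPluralMessage(value, offset, choices, pluralCategory) {
  const has = (key) => Object.prototype.hasOwnProperty.call(choices, key);
  const adjusted = value - offset;

  // 1. Exact matches (=N) are tested against the original, unadjusted value.
  const exactKey = '=' + value;
  if (has(exactKey)) {
    return render(choices[exactKey], adjusted);
  }

  // 2. Otherwise the plural category of (value - offset) is looked up.
  const category = pluralCategory(adjusted);

  // 3. If no keyword matches, the required "other" message is used.
  return render(has(category) ? choices[category] : choices.other, adjusted);
}

// "#" stands for the value reduced by the offset (escaping is ignored here).
function render(message, adjusted) {
  return message.replace(/#/g, String(adjusted));
}

// Simplified English rule for whole numbers: 1 is "one", everything else "other".
const english = (n) => (n === 1 ? 'one' : 'other');

const gifts = {
  '=0': 'You gave no gifts',
  '=1': 'You gave Alice a gift',
  one: 'You gave Alice and one other person a gift',
  other: 'You gave Alice and # other people a gift',
};

console.log(selectPluralMessage(0, 1, gifts, english)); // You gave no gifts
console.log(selectPluralMessage(2, 1, gifts, english)); // You gave Alice and one other person a gift
console.log(selectPluralMessage(4, 1, gifts, english)); // You gave Alice and 3 other people a gift
```

Note how a value of 2 selects the `one` message: no `=2` selector exists, so the offset is subtracted first and the category of 1 is used.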
||
### Escaping / Quoting | ||
|
||
You will need to escape curly braces or the `#` character inside message texts if you want them to | ||
be treated literally with no special meaning. You may quote/escape any character in your message | ||
text by preceding it with a `\` (backslash) character. The backslash character removes any special | ||
meaning from the character that immediately follows it. Therefore, you can escape or quote the | ||
backslash itself by preceding it with another backslash character. | ||
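The backslash rule can be illustrated with a small sketch that processes the text of a single message. It is a simplified model, not Angular's parser: the function name `renderMessageText` is hypothetical, and it only handles backslash escapes and the `#` placeholder, not nested interpolations.

```javascript
// Illustrative sketch only: a simplified model of the escaping rule.
// A backslash removes the special meaning of the next character;
// an unescaped "#" is replaced by the numeric value `n`.
function renderMessageText(text, n) {
  return text.replace(/\\([\s\S])|#/g, (match, escapedChar) =>
    escapedChar !== undefined ? escapedChar : String(n)
  );
}

// Note: inside a JavaScript string literal each backslash is itself doubled.
console.log(renderMessageText('\\# of items: #', 3)); // # of items: 3
console.log(renderMessageText('\\{#\\}', 5));         // {5}
```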
|
||
|
||
### Simple select (for gender) example | ||
### Gender (aka select) Syntax | ||
|
||
The gender support is provided by the more generic "select" syntax that is more akin to a switch | ||
statement. It is general enough to support use for gender based messages. | ||
|
||
The syntax for gender based message selection looks like the following: | ||
|
||
```text | ||
{{EXPRESSION, select, | ||
male {MESSAGE_WHEN_EXPRESSION_IS_MALE} | ||
female {MESSAGE_WHEN_EXPRESSION_IS_FEMALE} | ||
... | ||
other {MESSAGE_WHEN_THERE_IS_NO_GENDER_MATCH} | ||
}} | ||
``` | ||
|
||
Please note that whitespace (including newline) is generally insignificant except as part of the | ||
actual message text that occurs in curly braces. Whitespace is generally used to aid readability. | ||
|
||
Here, `EXPRESSION` is an Angular expression that evaluates to the gender of the person that | ||
is used to select the message that should be displayed. | ||
|
||
The Angular expression is followed by `, select,` where the spaces are optional. | ||
|
||
This is followed by a list of selection keyword and corresponding message pairs. The "other" | ||
keyword and corresponding message are **required** but you may have as few or as many of the other | ||
gender values as you need (i.e. it isn't restricted to male/female.) Note however, that the | ||
matching is **case-sensitive**. | ||
What happens if you wanted to actually match the string

Sadly, that's not an option. :) That just seems to be how MessageFormat has been defined. We could deviate from the spec and use something else instead of "other" for the default case, but I feel like that is a bit too much for a rare case. (Our other deviations are more understandable.) |
||
|
||
#### Selection Keywords | ||
|
||
Selection keywords are simple words like "male" and "female". The keyword, "other", and its | ||
corresponding message are required while others are optional. It is used when the Angular | ||
expression does not match (case-sensitively) any of the other keywords specified. | ||
|
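The switch-like behavior of `select` can be sketched as a plain lookup with a fallback. This is an illustration only, not the `ngMessageFormat` implementation; `selectMessage` is a hypothetical name, and the lookup is assumed to be case-sensitive, as stated in the syntax description above.

```javascript
// Illustrative sketch only: select behaves like a switch with a default.
// `choices` maps keywords to messages and must contain "other".
function selectMessage(value, choices) {
  const key = String(value);
  return Object.prototype.hasOwnProperty.call(choices, key)
    ? choices[key]
    : choices.other;
}

const invite = { male: 'Invite him', female: 'Invite her', other: 'Invite them' };

console.log(selectMessage('female', invite)); // Invite her
console.log(selectMessage('Female', invite)); // Invite them (no case-sensitive match)
```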
||
#### Messages | ||
|
||
Messages immediately follow a selection keyword and are optionally preceded by whitespace. They are | ||
written in single curly braces (`{}`). They may contain Angular interpolation syntax inside them. | ||
|
||
### Simple gender example | ||
|
||
```text | ||
{{friendGender, select, | ||
male { Invite him } | ||
female { Invite her } | ||
other { Invite them } | ||
male {Invite him} | ||
female {Invite her} | ||
other {Invite them} | ||
}} | ||
``` | ||
|
||
### More complex example that combines some of these | ||
### Nesting | ||
|
||
As mentioned in the syntax for plural and select, the embedded messages can contain Angular | ||
interpolation syntax. Since you can use MessageFormat extensions in Angular interpolation, this | ||
allows you to nest plural and gender expressions in any order. | ||
|
||
Please note that if these are intended to reach a translator and be translated, it is recommended | ||
that the messages appear as a whole and not be split up. | ||
|
||
### More complex example that demonstrates nesting | ||
|
||
This is taken from the [plunker example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview) linked to earlier. | ||
|
||
``` | ||
```text | ||
{{recipients.length, plural, offset:1 | ||
=0 {You ({{sender.name}}) gave no gifts} | ||
=1 { {{ recipients[0].gender, select, | ||
|
@@ -249,3 +355,26 @@ This is taken from the [plunker example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9R | |
other {You ({{sender.name}}) gave {{recipients.length}} people gifts. } | ||
}} | ||
``` | ||
|
||
### Differences from the ICU MessageFormat syntax | ||
|
||
This section is useful to you if you're already familiar with the ICU MessageFormat syntax. | ||
|
||
This syntax extension, while based on MessageFormat, has been designed to be backwards compatible | ||
with existing AngularJS interpolation expressions. The key rule is simply this: **All | ||
interpolations are done inside double curlies.** The top level comma operator after an expression | ||
inside the double curlies causes MessageFormat extensions to be recognized. Such a top level comma | ||
is otherwise illegal in an Angular expression and is used by MessageFormat to specify the function | ||
(such as plural/select) and its related syntax. | ||
|
||
To understand the extension, take a look at the ICU MessageFormat syntax as specified by the ICU | ||
documentation. Anywhere in that MessageFormat that you have regular message text and you want to | ||
substitute an expression, just put it in double curlies instead of single curlies that MessageFormat | ||
dictates. This has a huge advantage. **You are no longer limited to simple identifiers for | ||
substitutions**. Because you are using double curlies, you can stick in any arbitrary interpolation | ||
syntax there, including nesting more MessageFormat expressions! | ||
|
||
### Further Reading | ||
For more details, please refer to our [design doc](https://docs.google.com/a/google.com/document/d/1pbtW2yvtmFBikfRrJd8VAsabiFkKezmYZ_PbgdjQOVU/edit). | ||
You can read more about the ICU MessageFormat syntax at | ||
[Formatting Messages | ICU User Guide](http://userguide.icu-project.org/formatparse/messages#TOC-MessageFormat). |
@chirayuk do you think we should deprecate ngPluralize in 1.5 and put it in its own separate module?

I hadn't done that because I had mistakenly assumed that ngPluralize allowed its embedded messages to contain HTML. Reading the docs and testing on plnkr.co shows that that is not the case.
So yes, we should definitely deprecate this for 1.5! Should I say something about it in the docs for 1.4 right now?