This repository was archived by the owner on Apr 12, 2024. It is now read-only.

docs(i18n): expand the MessageFormat syntax documentation #11576

Merged 1 commit on Apr 15, 2015
257 changes: 193 additions & 64 deletions docs/content/guide/i18n.ngdoc
@@ -18,8 +18,10 @@ application means providing translations and localized formats for the abstracte
Angular supports i18n/l10n for {@link ng.filter:date date}, {@link ng.filter:number number} and
{@link ng.filter:currency currency} filters.

Additionally, Angular supports localizable pluralization through the {@link
ng.directive:ngPluralize `ngPluralize` directive}.
Localizable pluralization is supported via the {@link ng.directive:ngPluralize `ngPluralize`
Contributor

@chirayuk do you think we should deprecate ngPluralize in 1.5 and put it in its own separate module?

Contributor Author

I hadn't done that because I had mistakenly assumed that ngPluralize allowed its embedded messages to contain HTML. Reading the docs and testing on plnkr.co shows that that is not the case.

So yes, we should definitely deprecate this for 1.5! Should I say something about it in the docs for 1.4 right now?

directive}. Additionally, you can use <a href="#MessageFormat">MessageFormat extensions</a> to
`$interpolate` for localizable pluralization and gender support in all interpolations via the
`ngMessageFormat` module.

All localizable Angular components depend on locale-specific rule sets managed by the {@link
ng.$locale `$locale` service}.
@@ -142,96 +144,200 @@ displaying the date with a timezone specified by the developer.
<a name="MessageFormat"></a>
## MessageFormat extensions

AngularJS interpolations via `$interpolate` and in templates
support an extended syntax based on a subset of the ICU
MessageFormat that covers plurals and gender selections.
You can write localizable plural and gender based messages in Angular interpolation expressions and
`$interpolate` calls.

This syntax extension is provided by the `ngMessageFormat` module, which your application can
depend upon (shipped separately as `angular-message-format.min.js` and `angular-message-format.js`).
A current limitation of the `ngMessageFormat` module is that it does not support redefining the
`$interpolate` start and end symbols. Only the default `{{` and `}}` are allowed.
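
As a minimal sketch of declaring that dependency: the application name `myApp` and the wrapper function `defineApp` are hypothetical, and `angular` is assumed to be the global provided by `angular.js`, with `angular-message-format.js` loaded after it.

```javascript
// Hypothetical helper: registers an application module that depends on
// ngMessageFormat. `angular` is passed in so the sketch does not assume a
// browser global; in a page you would pass window.angular.
function defineApp(angular) {
  // The second argument lists the modules this application depends on.
  return angular.module('myApp', ['ngMessageFormat']);
}
```

Once the module is listed as a dependency, the extended syntax should be available in every interpolation of the application, with no per-template opt-in.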

The syntax extension is based on a subset of the ICU MessageFormat syntax that covers plurals and
gender selections. For background, see the links under “Further Reading” at the end of this
section.

You may find it helpful to play with our [Plnkr Example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview)
as you read the examples below.

### Plural Syntax

The syntax for plural based message selection looks like the following:

```text
{{NUMERIC_EXPRESSION, plural,
=0 {MESSAGE_WHEN_VALUE_IS_0}
=1 {MESSAGE_WHEN_VALUE_IS_1}
=2 {MESSAGE_WHEN_VALUE_IS_2}
=3 {MESSAGE_WHEN_VALUE_IS_3}
...
zero {MESSAGE_WHEN_PLURAL_CATEGORY_IS_ZERO}
one {MESSAGE_WHEN_PLURAL_CATEGORY_IS_ONE}
two {MESSAGE_WHEN_PLURAL_CATEGORY_IS_TWO}
few {MESSAGE_WHEN_PLURAL_CATEGORY_IS_FEW}
many {MESSAGE_WHEN_PLURAL_CATEGORY_IS_MANY}
other {MESSAGE_WHEN_THERE_IS_NO_MATCH}
}}
```

Please refer to our [design doc](https://docs.google.com/a/google.com/document/d/1pbtW2yvtmFBikfRrJd8VAsabiFkKezmYZ_PbgdjQOVU/edit)
for a lot more details. You may find it helpful to play with our [Plnkr Example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview).
Please note that whitespace (including newline) is generally insignificant except as part of the
actual message text that occurs in curly braces. Whitespace is generally used to aid readability.

You can read more about the ICU MessageFormat syntax at
[Formatting Messages | ICU User Guide](http://userguide.icu-project.org/formatparse/messages#TOC-MessageFormat).
Here, `NUMERIC_EXPRESSION` is an Angular expression that evaluates to a number. The message that is
displayed is chosen by applying pluralization rules to that number.

The Angular expression is followed by `, plural,`, which marks the plural extension syntax. The
spaces there are optional.

This is followed by a list of selection keyword and corresponding message pairs. The "other"
keyword and corresponding message are **required** but you may have as few or as many of the other
categories as you need.

#### Selection Keywords

The selection keywords can be either exact matches or language dependent [plural
categories](http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html).

Exact matches are written as the equal sign followed by the exact value. `=0`, `=1`, `=2` and
`=123` are all examples of exact matches. Note that there should be no space between the equal sign
and the numeric value.

Plural category matches are single words corresponding to the [plural
categories](http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html) of
the CLDR plural category spec. These categories vary by locale. The "en" (English) locale, for
example, defines just "one" and "other" while the "ga" (Irish) locale defines "one", "two", "few",
"many" and "other". Typically, you would just write the categories for your language. During
translation, the translators will add or remove more categories depending on the target locale.
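
To get a feel for how categories vary, the standard `Intl.PluralRules` API (built into modern browsers and Node.js) can be queried directly. This is only an illustration under that assumption: Angular itself resolves categories through the `$locale` service rather than through `Intl`, and the helper name `pluralCategory` is hypothetical.

```javascript
// Hypothetical helper: returns the plural category of `n` in `locale`
// as reported by the runtime's Intl implementation.
function pluralCategory(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

console.log(pluralCategory('en', 1)); // "one"
console.log(pluralCategory('en', 0)); // "other"
console.log(pluralCategory('en', 2)); // "other"
```

Because English has no "zero" category, a dedicated message for zero items has to be written as the exact match `=0`.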

Exact matches always win over keyword matches. Therefore, if you define both `=0` and `zero`, when
the value of the expression is zero, the `=0` message is the one that will be selected. (The
duplicate keyword categories are helpful when used with the optional `offset` syntax described
later.)

This extended syntax is provided by way of the
`ngMessageFormat` module that your application can depend
upon (shipped separately as `angular-message-format.min.js` and
`angular-message-format.js`). A current limitation of the
`ngMessageFormat` module is that it does not support
redefining the `$interpolate` start and end symbols. Only the
default `{{` and `}}` are allowed.

This syntax extension, while based on MessageFormat, has
been designed to be backwards compatible with existing
AngularJS interpolation expressions. The key rule is simply
this: **All interpolations are done inside double curlies.**
The top level comma operator after an expression inside the
double curlies causes MessageFormat extensions to be
recognized. Such a top level comma is otherwise illegal in
an Angular expression and is used by MessageFormat to
specify the function (such as plural/select) and its
related syntax.

To understand the extension, take a look at the ICU
MessageFormat syntax as specified by the ICU documentation.
Anywhere in that MessageFormat that you have regular message
text and you want to substitute an expression, just put it
in double curlies instead of single curlies that
MessageFormat dictates. This has a huge advantage. **You
are no longer limited to simple identifiers for
substitutions**. Because you are using double curlies, you
can stick in any arbitrary interpolation syntax there,
including nesting more MessageFormat expressions! Some
examples will make this clear. In the following example, I
will only be showing you the AngularJS syntax.

#### Messages

Messages immediately follow a selection keyword and are optionally preceded by whitespace. They are
written in single curly braces (`{}`). They may contain Angular interpolation syntax inside them.
In addition, the `#` symbol is a placeholder for the actual numeric value of the expression.

### Simple plural example

```text
{{numMessages, plural,
=0 { You have no new messages }
=1 { You have one new message }
other { You have # new messages }
=0 {You have no new messages}
=1 {You have one new message}
other {You have # new messages}
}}
```

While I won't be teaching you MessageFormat here, you will
note that the `#` symbol works as expected. You could have
also written it as:
Because these messages can themselves contain Angular expressions, you could also write this as
follows:

```text
{{numMessages, plural,
=0 { You have no new messages }
=1 { You have one new message }
other { You have {{numMessages}} new messages }
=0 {You have no new messages}
=1 {You have one new message}
other {You have {{numMessages}} new messages}
Member

You could even write it as:

    =1 {You have one new message}
 other {You have {{numMessages||'no'}} new messages}

:P

Contributor Author

I should actually add a section on best practices but that would touch on many things so I have not done so.

The example you have written is actually wrong. I mean—it's correct for English, but it's highly unlikely to work if you use regular translation tools. You're basically embedding a fragment of English text inside a subexpression and expecting that it will work. The Angular expression typically would become an opaque placeholder to the translator that they can't change (and can't see that it contains "no" inside it.) Also, in a different language, the position of "no" might need to move around. This scheme will definitely work if one is only supporting a single language.

The following is also bad practice as it relies on the same type of "concatenation" assumption. (The best practice is to let the translator see the messages in their entirety and not piecemeal.)

You have {{numMessages, plural, =0 {no} =1 {one} other {#} }} new messages

Member

You are right !
I am not in a 100% i18n mindset yet :)

}}
```

where you explicitly typed in `numMessages` for "other"
instead of using `#`. They are nearly the same except if
you're using "offset". Refer to the ICU MessageFormat
documentation to learn about offset.

Please note that **other** is a **required** category (for
both the plural syntax and the select syntax that is shown
later.)
### Plural syntax with optional `offset`

The plural syntax supports an optional `offset` syntax that is used in matching. It's simpler to
explain this with an example.

```text
{{recipients.length, plural, offset:1
Member

I don't know if it is intended, but having 0 recipients (assuming the offset of 1) results in a count of -1 and displays:
"You gave and -1 other people a gift"

But there doesn't seem to be a way to specify a message for -1 (i.e. 0 recipients - 1 offset).
Is there another solution to this? If so, it should be documented (if not, it should probably be implemented :))

Contributor Author

For pluralization, negative numbers don't make sense—it's a logic bug. If you're using an offset, your messages are automatically written to use this information (e.g. "… and # other people…"). You can simply add exact matches for the cases that would cause you to have negative "#" evaluations.

Member

I still don't get how am I supposed to handle a case with 0 recipients and offset="1".

=0 {You gave no gifts}
=1 {You gave {{recipients[0].name}} a gift}
one {You gave {{recipients[0].name}} and one other person a gift}
other {You gave {{recipients[0].name}} and # other people a gift}
}}
```

When an `offset` is specified, the matching works as follows. First, the exact value of the Angular
expression is compared against the exact matches (i.e. `=N` selectors). If one matches, that message
is used. Otherwise, the offset value is subtracted from the value of the expression, and
locale-specific pluralization rules are applied to this new value to obtain its plural category
(such as “one”, “few”, “many”, etc.). If a keyword selector matches that category, its message is
used. If there is still no match, the required “other” message is used. Inside a message, `#`
stands for the value of the original expression reduced by the specified offset.
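
The matching steps can be sketched in plain JavaScript. This is a hypothetical model for illustration, not the `ngMessageFormat` source: the function names, the hard-coded English rule, and the recipient name "Alice" are all assumptions.

```javascript
// Minimal English-only stand-in for the locale's plural rules (assumption:
// a real application takes these rules from locale data).
function englishPluralCategory(n) {
  return n === 1 ? 'one' : 'other';
}

// Hypothetical model of the matching steps described above.
// `choices` maps selectors ("=0", "one", "other", ...) to message strings.
// Returns the selector that matched and the number that "#" stands for.
function selectPluralMessage(value, offset, choices, pluralCategory) {
  const has = (key) => Object.prototype.hasOwnProperty.call(choices, key);
  const hash = value - offset;
  // Step 1: exact matches are tested against the original value.
  const exactKey = '=' + value;
  if (has(exactKey)) {
    return { key: exactKey, hash: hash };
  }
  // Step 2: the plural category is computed from (value - offset).
  const category = pluralCategory(hash);
  if (has(category)) {
    return { key: category, hash: hash };
  }
  // Step 3: fall back to the required "other" message.
  return { key: 'other', hash: hash };
}

const giftChoices = {
  '=0': 'You gave no gifts',
  '=1': 'You gave Alice a gift',
  'one': 'You gave Alice and one other person a gift',
  'other': 'You gave Alice and # other people a gift'
};

console.log(selectPluralMessage(0, 1, giftChoices, englishPluralCategory).key); // "=0"
console.log(selectPluralMessage(2, 1, giftChoices, englishPluralCategory).key); // "one"
console.log(selectPluralMessage(5, 1, giftChoices, englishPluralCategory));
// { key: 'other', hash: 4 }
```

Note that with zero recipients the `=0` selector matches in step 1, before the offset is ever subtracted, so the negative value is never shown as long as that exact match is defined.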

### Escaping / Quoting

You will need to escape curly braces or the `#` character inside message texts if you want them to
be treated literally with no special meaning. You may quote/escape any character in your message
text by preceding it with a `\` (backslash) character. The backslash character removes any special
meaning to the character that immediately follows it. Therefore, you can escape or quote the
backslash itself by preceding it with another backslash character.
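
The escaping rule can be modeled with a short function. This is a hypothetical sketch of the rule as described, not the `ngMessageFormat` parser; the name `renderHash` is an assumption, and only the `#` placeholder is handled (curly braces are passed through untouched).

```javascript
// Hypothetical model of the escaping rule.
// Replaces each unescaped "#" in `text` with `n`; a backslash makes the
// next character literal (including another backslash).
function renderHash(text, n) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '\\' && i + 1 < text.length) {
      out += text[i + 1]; // emit the escaped character as-is
      i++;                // and skip over it
    } else if (ch === '#') {
      out += String(n);   // unescaped "#" is the numeric placeholder
    } else {
      out += ch;
    }
  }
  return out;
}

// In JavaScript source, "\\" inside a string literal is a single backslash.
console.log(renderHash('# new messages in \\#general', 3));
// "3 new messages in #general"
console.log(renderHash('a \\\\ b', 3));
// "a \ b"
```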


### Simple select (for gender) example
### Gender (aka select) Syntax

The gender support is provided by the more generic "select" syntax that is more akin to a switch
statement. It is general enough to support use for gender based messages.

The syntax for gender based message selection looks like the following:

```text
{{EXPRESSION, select,
male {MESSAGE_WHEN_EXPRESSION_IS_MALE}
female {MESSAGE_WHEN_EXPRESSION_IS_FEMALE}
...
other {MESSAGE_WHEN_THERE_IS_NO_GENDER_MATCH}
}}
```

Please note that whitespace (including newline) is generally insignificant except as part of the
actual message text that occurs in curly braces. Whitespace is generally used to aid readability.

Here, `EXPRESSION` is an Angular expression that evaluates to the gender of the person that
is used to select the message that should be displayed.

The Angular expression is followed by `, select,` where the spaces are optional.

This is followed by a list of selection keyword and corresponding message pairs. The "other"
keyword and corresponding message are **required** but you may have as few or as many of the other
gender values as you need (i.e. it isn't restricted to male/female.) Note however, that the
matching is **case-sensitive**.
Contributor

What happens if you wanted to actually match the string "other" but also have the catch-all other selector?

Contributor Author

Sadly, that's not an option. :) That just seems to be how MessageFormat has been defined. We could deviate from the spec and use something else instead of "other" for the default case, but I feel like that is a bit too much for a rare case. (Our other deviations are more understandable.)


#### Selection Keywords

Selection keywords are simple words like "male" and "female". The keyword "other" and its
corresponding message are required, while the others are optional. The "other" message is used when
the value of the Angular expression does not match (case-sensitively) any of the other keywords
specified.

#### Messages

Messages immediately follow a selection keyword and are optionally preceded by whitespace. They are
written in single curly braces (`{}`). They may contain Angular interpolation syntax inside them.

### Simple gender example

```text
{{friendGender, select,
male { Invite him }
female { Invite her }
other { Invite them }
male {Invite him}
female {Invite her}
other {Invite them}
}}
```
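
The switch-like behavior of select can be modeled in a few lines. This is a hypothetical sketch that assumes the case-sensitive matching stated in the syntax description; it is not the `ngMessageFormat` source, and the names `selectMessage` and `inviteChoices` are made up for illustration.

```javascript
// Hypothetical model of "select" matching.
function selectMessage(value, choices) {
  // Keys are compared exactly, so "Male" does not match "male".
  return Object.prototype.hasOwnProperty.call(choices, value)
    ? choices[value]
    : choices.other; // required catch-all
}

const inviteChoices = {
  male: 'Invite him',
  female: 'Invite her',
  other: 'Invite them'
};

console.log(selectMessage('female', inviteChoices));  // "Invite her"
console.log(selectMessage('Male', inviteChoices));    // "Invite them"
console.log(selectMessage(undefined, inviteChoices)); // "Invite them"
```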

### More complex example that combines some of these
### Nesting

As mentioned in the syntax for plural and select, the embedded messages can contain Angular
interpolation syntax. Since you can use MessageFormat extensions in Angular interpolation, this
allows you to nest plural and gender expressions in any order.

Please note that if these are intended to reach a translator and be translated, it is recommended
that the messages appear as a whole and not be split up.

### More complex example that demonstrates nesting

This is taken from the [plunker example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9RyR?p=preview) linked to earlier.

```text
{{recipients.length, plural, offset:1
=0 {You ({{sender.name}}) gave no gifts}
=1 { {{ recipients[0].gender, select,
@@ -249,3 +355,26 @@ This is taken from the [plunker example](http://plnkr.co/edit/QBVRQ70dvKZDWmHW9R
other {You ({{sender.name}}) gave {{recipients.length}} people gifts. }
}}
```

### Differences from the ICU MessageFormat syntax

This section is useful to you if you're already familiar with the ICU MessageFormat syntax.

This syntax extension, while based on MessageFormat, has been designed to be backwards compatible
with existing AngularJS interpolation expressions. The key rule is simply this: **All
interpolations are done inside double curlies.** The top level comma operator after an expression
inside the double curlies causes MessageFormat extensions to be recognized. Such a top level comma
is otherwise illegal in an Angular expression and is used by MessageFormat to specify the function
(such as plural/select) and its related syntax.

To understand the extension, take a look at the ICU MessageFormat syntax as specified by the ICU
documentation. Anywhere in that MessageFormat that you have regular message text and you want to
substitute an expression, just put it in double curlies instead of single curlies that MessageFormat
dictates. This has a huge advantage. **You are no longer limited to simple identifiers for
substitutions**. Because you are using double curlies, you can stick in any arbitrary interpolation
syntax there, including nesting more MessageFormat expressions!

### Further Reading
For more details, please refer to our [design doc](https://docs.google.com/a/google.com/document/d/1pbtW2yvtmFBikfRrJd8VAsabiFkKezmYZ_PbgdjQOVU/edit).
You can read more about the ICU MessageFormat syntax at
[Formatting Messages | ICU User Guide](http://userguide.icu-project.org/formatparse/messages#TOC-MessageFormat).