Commit 916f376

Merge pull request #2437 from corob-msft/cr-524
Windows 18671203 c16rtomb, mbrtoc16 issue
2 parents 033f0cc + ea5f98a commit 916f376

3 files changed, 64 additions and 53 deletions

Lines changed: 14 additions & 9 deletions
@@ -1,15 +1,15 @@
 ---
-title: "Interpretation of Multibyte-Character Sequences"
-ms.date: "04/11/2018"
+title: "Interpretation of multibyte-character sequences"
+ms.date: "10/22/2019"
 f1_keywords: ["c.character.multibyte"]
 helpviewer_keywords: ["MBCS [C++], locale code page"]
 ms.assetid: da9150de-70ea-4d2f-90e6-ddb9202dd80b
 ---
-# Interpretation of Multibyte-Character Sequences
+# Interpretation of multibyte-character sequences
 
-Most multibyte-character routines in the Microsoft run-time library recognize multibyte-character sequences relating to a multibyte code page. The output value is affected by the setting of the **LC_CTYPE** category setting of the locale; see [setlocale](../c-runtime-library/reference/setlocale-wsetlocale.md) for more information. The versions of these functions without the **_l** suffix use the current locale for this locale-dependent behavior; the versions with the **_l** suffix are identical except that they use the locale parameter passed in instead.
+Most multibyte-character routines in the Microsoft run-time library recognize multibyte-character sequences relating to a multibyte code page. The output value is affected by the setting of the **LC_CTYPE** category setting of the locale. For more information, see [setlocale](../c-runtime-library/reference/setlocale-wsetlocale.md). The versions of these functions without the **_l** suffix use the current locale for this locale-dependent behavior. The versions with the **_l** suffix are identical, except they use the locale parameter instead of the current locale.
 
-## Locale-Dependent Multibyte Routines
+## Locale-dependent multibyte routines
 
 |Routine|Use|
 |-------------|---------|
@@ -19,10 +19,15 @@ Most multibyte-character routines in the Microsoft run-time library recognize mu
 |[mbtowc, _mbtowc_l](../c-runtime-library/reference/mbtowc-mbtowc-l.md)|Convert multibyte character to corresponding wide character|
 |[wcstombs, _wcstombs_l](../c-runtime-library/reference/wcstombs-wcstombs-l.md), [wcstombs_s, _wcstombs_s_l](../c-runtime-library/reference/wcstombs-s-wcstombs-s-l.md)|Convert sequence of wide characters to corresponding sequence of multibyte characters|
 |[wctomb, _wctomb_l](../c-runtime-library/reference/wctomb-wctomb-l.md), [wctomb_s, _wctomb_s_l](../c-runtime-library/reference/wctomb-s-wctomb-s-l.md)|Convert wide character to corresponding multibyte character|
-|[mbrtoc16, mbrtoc32](../c-runtime-library/reference/mbrtoc16-mbrtoc323.md)|Convert multibyte character to equivalent UTF-16 or UTF-32 character|
-|[c16rtomb, c32rtomb](../c-runtime-library/reference/c16rtomb-c32rtomb1.md)|Convert UTF-16 or UTF-32 character to equivalent multibyte character|
+
+## Locale-independent multibyte routines
+
+|Routine|Use|
+|-------------|---------|
+|[mbrtoc16, mbrtoc32](../c-runtime-library/reference/mbrtoc16-mbrtoc323.md)|Convert multibyte UTF-8 character to equivalent UTF-16 or UTF-32 character|
+|[c16rtomb, c32rtomb](../c-runtime-library/reference/c16rtomb-c32rtomb1.md)|Convert UTF-16 or UTF-32 character to equivalent UTF-8 multibyte character|
 
 ## See also
 
-[Internationalization](../c-runtime-library/internationalization.md)<br/>
-[Universal C runtime routines by category](../c-runtime-library/run-time-routines-by-category.md)<br/>
+[Internationalization](../c-runtime-library/internationalization.md)\
+[Universal C runtime routines by category](../c-runtime-library/run-time-routines-by-category.md)
Lines changed: 21 additions & 17 deletions
@@ -1,6 +1,6 @@
 ---
 title: "c16rtomb, c32rtomb"
-ms.date: "01/22/2018"
+ms.date: "10/22/2019"
 api_name: ["c16rtomb", "c32rtomb"]
 api_location: ["msvcrt.dll", "msvcr80.dll", "msvcr90.dll", "msvcr100.dll", "msvcr100_clr0400.dll", "msvcr110.dll", "msvcr110_clr0400.dll", "msvcr120.dll", "msvcr120_clr0400.dll", "ucrtbase.dll", "api-ms-win-crt-convert-l1-1-0.dll"]
 api_type: ["DLLExport"]
@@ -11,7 +11,7 @@ ms.assetid: 7f5743ca-a90e-4e3f-a310-c73e16f4e14d
 ---
 # c16rtomb, c32rtomb
 
-Convert a UTF-16 or UTF-32 wide character into a multibyte character in the current locale.
+Convert a UTF-16 or UTF-32 wide character into a UTF-8 multibyte character.
 
 ## Syntax
 
@@ -30,40 +30,44 @@ size_t c32rtomb(
 
 ### Parameters
 
-*mbchar*<br/>
-Pointer to an array to store the multibyte converted character.
+*mbchar*\
+Pointer to an array to store the converted UTF-8 multibyte character.
 
-*wchar*<br/>
+*wchar*\
 A wide character to convert.
 
-*state*<br/>
+*state*\
 A pointer to an **mbstate_t** object.
 
-## Return Value
+## Return value
 
-The number of bytes stored in array object *mbchar*, including any shift sequences. If *wchar* is not a valid wide character, the value (**size_t**)(-1) is returned, **errno** is set to **EILSEQ**, and the value of *state* is unspecified.
+The number of bytes stored in array object *mbchar*, including any shift sequences. If *wchar* isn't a valid wide character, the value (**size_t**)(-1) is returned, **errno** is set to **EILSEQ**, and the value of *state* is unspecified.
 
 ## Remarks
 
-The **c16rtomb** function converts the UTF-16 character *wchar* to the equivalent multibyte narrow character sequence in the current locale. If *mbchar* is not a null pointer, the function stores the converted sequence in the array object pointed to by *mbchar*. Up to **MB_CUR_MAX** bytes are stored in *mbchar*, and *state* is set to the resulting multibyte shift state. If *wchar* is a null wide character, a sequence required to restore the initial shift state is stored, if needed, followed by the null character, and *state* is set to the initial conversion state. The **c32rtomb** function is identical, but converts a UTF-32 character.
+The **c16rtomb** function converts the UTF-16 LE character *wchar* to the equivalent UTF-8 multibyte narrow character sequence. If *mbchar* isn't a null pointer, the function stores the converted sequence in the array object pointed to by *mbchar*. Up to **MB_CUR_MAX** bytes are stored in *mbchar*, and *state* is set to the resulting multibyte shift state.
+
+If *wchar* is a null wide character, a sequence required to restore the initial shift state is stored, if needed, followed by the null character. *state* is set to the initial conversion state. The **c32rtomb** function is identical, but converts a UTF-32 character.
 
 If *mbchar* is a null pointer, the behavior is equivalent to a call to the function that substitutes an internal buffer for *mbchar* and a wide null character for *wchar*.
 
-The *state* conversion state object allows you to make subsequent calls to this function and other restartable functions that maintain the shift state of the multibyte output characters. Results are undefined when you mix the use of restartable and non-restartable functions, or if a call to **setlocale** is made between restartable function calls.
+The *state* conversion state object allows you to make subsequent calls to this function and other restartable functions that maintain the shift state of the multibyte output characters. Results are undefined when you mix the use of restartable and non-restartable functions.
+
+To convert UTF-16 characters into non-UTF-8 multibyte characters, use the [wcstombs, _wcstombs_l](wcstombs-wcstombs-l.md), [wcstombs_s, or _wcstombs_s_l](wcstombs-s-wcstombs-s-l.md) functions.
 
 ## Requirements
 
 |Routine|Required header|
 |-------------|---------------------|
 |**c16rtomb**, **c32rtomb**|C, C++: \<uchar.h>|
 
-For compatibility information, see [Compatibility](../../c-runtime-library/compatibility.md).
+For compatibility information, see [Compatibility](../compatibility.md).
 
 ## See also
 
-[Data Conversion](../../c-runtime-library/data-conversion.md)<br/>
-[Locale](../../c-runtime-library/locale.md)<br/>
-[Interpretation of Multibyte-Character Sequences](../../c-runtime-library/interpretation-of-multibyte-character-sequences.md)<br/>
-[mbrtoc16, mbrtoc32](mbrtoc16-mbrtoc323.md)<br/>
-[wcrtomb](wcrtomb.md)<br/>
-[wcrtomb_s](wcrtomb-s.md)<br/>
+[Data conversion](../data-conversion.md)\
+[Locale](../locale.md)\
+[Interpretation of multibyte-character sequences](../interpretation-of-multibyte-character-sequences.md)\
+[mbrtoc16, mbrtoc32](mbrtoc16-mbrtoc323.md)\
+[wcrtomb](wcrtomb.md)\
+[wcrtomb_s](wcrtomb-s.md)
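
As a rough illustration of the revised **c16rtomb** remarks in the diff above, the following sketch converts a null-terminated UTF-16 string one code unit at a time and relies on the documented return values. The helper name `utf16_to_mb` and its buffer-size convention are hypothetical and are not part of the documented API. According to the updated text, the Microsoft implementation emits UTF-8; other C libraries may encode according to the current locale, so only ASCII input should be assumed to behave identically everywhere.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper (not part of the CRT): converts a null-terminated
   UTF-16 string to a multibyte string with c16rtomb. Returns the number
   of bytes written, excluding the terminator, or (size_t)-1 on a
   conversion error or when dst is too small. */
size_t utf16_to_mb(const char16_t *src, char *dst, size_t dst_size)
{
    mbstate_t state;
    char buf[MB_LEN_MAX];              /* large enough for one character */
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial state */

    for (;; ++src) {
        /* Returns 0 when a high surrogate was only recorded in state. */
        size_t n = c16rtomb(buf, *src, &state);

        if (n == (size_t)-1)
            return (size_t)-1;         /* invalid input; errno is EILSEQ */
        if (n > dst_size - used)
            return (size_t)-1;         /* output buffer too small */
        memcpy(dst + used, buf, n);
        used += n;
        if (*src == 0)
            return used - 1;           /* do not count the terminator */
    }
}
```

Calling `utf16_to_mb(u"hi", out, sizeof out)` with a 16-byte `out` buffer should return 2 and leave the string `"hi"` in `out`. Staging each character in a local `MB_LEN_MAX` buffer keeps the bounds check in one place, at the cost of an extra copy.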
Lines changed: 29 additions & 27 deletions
@@ -1,6 +1,6 @@
 ---
 title: "mbrtoc16, mbrtoc323"
-ms.date: "11/04/2016"
+ms.date: "10/22/2019"
 api_name: ["mbrtoc16", "mbrtoc32"]
 api_location: ["msvcrt.dll", "msvcr80.dll", "msvcr90.dll", "msvcr100.dll", "msvcr100_clr0400.dll", "msvcr110.dll", "msvcr110_clr0400.dll", "msvcr120.dll", "msvcr120_clr0400.dll", "ucrtbase.dll", "api-ms-win-crt-convert-l1-1-0.dll"]
 api_type: ["DLLExport"]
@@ -11,7 +11,7 @@ ms.assetid: 099ade4d-56f7-4e61-8b45-493f1d7a64bd
 ---
 # mbrtoc16, mbrtoc32
 
-Translates the first multibyte character in a narrow string into the equivalent UTF-16 or UTF-32 character.
+Translates the first UTF-8 multibyte character in a string into the equivalent UTF-16 or UTF-32 character.
 
 ## Syntax
 
@@ -33,52 +33,54 @@ size_t mbrtoc32(
 
 ### Parameters
 
-*destination*<br/>
-Pointer to the **char16_t** or **char32_t** equivalent of the multibyte character to convert. If null, the function does not store a value.
+*destination*\
+Pointer to the **char16_t** or **char32_t** equivalent of the UTF-8 multibyte character to convert. If null, the function doesn't store a value.
 
-*source*<br/>
-Pointer to the multibyte character string to convert.
+*source*\
+Pointer to the UTF-8 multibyte character string to convert.
 
-*max_bytes*<br/>
-The maximum number of bytes in *source* to examine for a character to convert. This should be a value between one and the number of bytes, including any null terminator, remaining in *source*.
+*max_bytes*\
+The maximum number of bytes in *source* to examine for a character to convert. This argument should be a value between one and the number of bytes, including any null terminator, remaining in *source*.
 
-*state*<br/>
-Pointer to a **mbstate_t** conversion state object used to interpret the multibyte string to one or more output characters.
+*state*\
+Pointer to a **mbstate_t** conversion state object used to interpret the UTF-8 multibyte string to one or more output characters.
 
-## Return Value
+## Return value
 
 On success, returns the value of the first of these conditions that applies, given the current *state* value:
 
 |Value|Condition|
 |-----------|---------------|
-|0|The next *max_bytes* or fewer characters converted from *source* correspond to the null wide character, which is the value stored if *destination* is not null.<br /><br /> *state* contains the initial shift state.|
-|Between 1 and *max_bytes*, inclusive|The value returned is the number of bytes of *source* that complete a valid multibyte character. The converted wide character is stored if *destination* is not null.|
-|-3|The next wide character resulting from a previous call to the function has been stored in *destination* if *destination* is not null. No bytes from *source* are consumed by this call to the function.<br /><br /> When *source* points to a multibyte character that requires more than one wide character to represent (for example, a surrogate pair), then the *state* value is updated so that the next function call writes out the additional character.|
-|-2|The next *max_bytes* bytes represent an incomplete, but potentially valid, multibyte character. No value is stored in *destination*. This result can occur if *max_bytes* is zero.|
-|-1|An encoding error has occurred. The next *max_bytes* or fewer bytes do not contribute to a complete and valid multibyte character. No value is stored in *destination*.<br /><br /> **EILSEQ** is stored in **errno** and the conversion state *state* is unspecified.|
+|0|The next *max_bytes* or fewer characters converted from *source* correspond to the null wide character, which is the value stored if *destination* isn't null.<br /><br /> *state* contains the initial shift state.|
+|Between 1 and *max_bytes*, inclusive|The value returned is the number of bytes of *source* that complete a valid multibyte character. The converted wide character is stored if *destination* isn't null.|
+|-3|The next wide character resulting from a previous call to the function has been stored in *destination* if *destination* isn't null. No bytes from *source* are consumed by this call to the function.<br /><br /> When *source* points to a UTF-8 multibyte character that requires more than one wide character to represent (for example, a surrogate pair), then the *state* value is updated so that the next function call writes out the additional character.|
+|-2|The next *max_bytes* bytes represent an incomplete, but potentially valid, UTF-8 multibyte character. No value is stored in *destination*. This result can occur if *max_bytes* is zero.|
+|-1|An encoding error has occurred. The next *max_bytes* or fewer bytes do not contribute to a complete and valid UTF-8 multibyte character. No value is stored in *destination*.<br /><br /> **EILSEQ** is stored in **errno** and the conversion state value *state* is unspecified.|
 
 ## Remarks
 
-The **mbrtoc16** function reads up to *max_bytes* bytes from *source* to find the first complete, valid multibyte character, and then stores the equivalent UTF-16 character in *destination*. The source bytes are interpreted according to the current thread multibyte locale. If the multibyte character requires more than one UTF-16 output character, such as a surrogate pair, then the *state* value is set to store the next UTF-16 character in *destination* on the next call to **mbrtoc16**. The **mbrtoc32** function is identical, but output is stored as a UTF-32 character.
+The **mbrtoc16** function reads up to *max_bytes* bytes from *source* to find the first complete, valid UTF-8 multibyte character, and then stores the equivalent UTF-16 character in *destination*. If the character requires more than one UTF-16 output character, such as a surrogate pair, then the *state* value is set to store the next UTF-16 character in *destination* on the next call to **mbrtoc16**. The **mbrtoc32** function is identical, but output is stored as a UTF-32 character.
 
-If *source* is null, these functions return the equivalent of a call made using arguments of **NULL** for *destination*, **""** for *source*, and 1 for *max_bytes*. The passed values of *destination* and *max_bytes* are ignored.
+If *source* is null, these functions return the equivalent of a call made using arguments of **NULL** for *destination*, `""` (an empty, null-terminated string) for *source*, and 1 for *max_bytes*. The passed values of *destination* and *max_bytes* are ignored.
 
-If *source* is not null, the function starts at the beginning of the string and inspects up to *max_bytes* bytes to determine the number of bytes required to complete the next multibyte character, including any shift sequences. If the examined bytes contain a valid and complete multibyte character, the function converts the character into the equivalent 16-bit or 32-bit wide character or characters. If *destination* is not null, the function stores the first (and possibly only) result character in destination. If additional output characters are required, a value is set in *state*, so that subsequent calls to the function output the additional characters and return the value -3. If no more output characters are required, then *state* is set to the initial shift state.
+If *source* isn't null, the function starts at the beginning of the string and inspects up to *max_bytes* bytes to determine the number of bytes required to complete the next UTF-8 multibyte character, including any shift sequences. If the examined bytes contain a valid and complete UTF-8 multibyte character, the function converts the character into the equivalent 16-bit or 32-bit wide character or characters. If *destination* isn't null, the function stores the first (and possibly only) result character in destination. If additional output characters are required, a value is set in *state*, so that subsequent calls to the function output the additional characters and return the value -3. If no more output characters are required, then *state* is set to the initial shift state.
+
+To convert non-UTF-8 multibyte characters to UTF-16 LE characters, use the [mbrtowc](mbrtowc.md), [mbtowc, or _mbtowc_l](mbtowc-mbtowc-l.md) functions.
 
 ## Requirements
 
 |Function|C header|C++ header|
 |--------------|--------------|------------------|
 |**mbrtoc16**, **mbrtoc32**|\<uchar.h>|\<cuchar>|
 
-For additional compatibility information, see [Compatibility](../../c-runtime-library/compatibility.md).
+For additional compatibility information, see [Compatibility](../compatibility.md).
 
 ## See also
 
-[Data Conversion](../../c-runtime-library/data-conversion.md)<br/>
-[Locale](../../c-runtime-library/locale.md)<br/>
-[Interpretation of Multibyte-Character Sequences](../../c-runtime-library/interpretation-of-multibyte-character-sequences.md)<br/>
-[c16rtomb, c32rtomb](c16rtomb-c32rtomb1.md)<br/>
-[mbrtowc](mbrtowc.md)<br/>
-[mbsrtowcs](mbsrtowcs.md)<br/>
-[mbsrtowcs_s](mbsrtowcs-s.md)<br/>
+[Data conversion](../data-conversion.md)\
+[Locale](../locale.md)\
+[Interpretation of multibyte-character sequences](../interpretation-of-multibyte-character-sequences.md)\
+[c16rtomb, c32rtomb](c16rtomb-c32rtomb1.md)\
+[mbrtowc](mbrtowc.md)\
+[mbsrtowcs](mbsrtowcs.md)\
+[mbsrtowcs_s](mbsrtowcs-s.md)
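
The **mbrtoc16** return-value table in the diff above suggests a conversion loop along the following lines. This is a minimal sketch, not code from the commit: the helper name `mb_to_utf16` and its error convention are hypothetical. It treats both -1 (encoding error) and -2 (incomplete character) as failure, and it handles -3 by storing the extra UTF-16 unit without advancing the input. Per the updated text, the Microsoft implementation reads the input as UTF-8; other C libraries may interpret it according to the current locale, so only ASCII input should be assumed to behave identically everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper (not part of the CRT): converts a null-terminated
   multibyte string to UTF-16 with mbrtoc16. Returns the number of
   char16_t units written, excluding the terminator, or (size_t)-1 on a
   conversion error or when dst is too small. */
size_t mb_to_utf16(const char *src, char16_t *dst, size_t dst_count)
{
    mbstate_t state;
    size_t remaining, used = 0;

    assert(src != NULL && dst != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial state */
    remaining = strlen(src) + 1;       /* include the null terminator */

    for (;;) {
        char16_t c;
        size_t n = mbrtoc16(&c, src, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;         /* invalid or truncated sequence */
        if (used == dst_count)
            return (size_t)-1;         /* output buffer too small */
        dst[used] = c;
        if (n == 0)
            return used;               /* the terminator was just stored */
        ++used;
        if (n != (size_t)-3) {         /* -3 consumes no source bytes */
            src += n;
            remaining -= n;
        }
    }
}
```

Calling `mb_to_utf16("abc", out, 8)` should return 3 and store a null-terminated `u"abc"` in `out`. For a character that needs a surrogate pair, the table implies that two loop iterations each store one unit: the first returns the byte count and the second returns -3.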
