UTF-8 support can be enabled by using the UTF-8 code page in your locale string. See the [UTF-8 Support section of `setlocale`](../c-runtime-library/reference/setlocale-wsetlocale.md#utf-8-support) for more information.
docs/c-runtime-library/reference/setlocale-wsetlocale.md
35 additions & 21 deletions
@@ -52,44 +52,44 @@ sets all categories, returning only the string
 en-US
 ```
 
-You can copy the string returned by **setlocale** to restore that part of the program's locale information. Global or thread local storage is used for the string returned by **setlocale**. Later calls to **setlocale** overwrite the string, which invalidates string pointers returned by earlier calls.
+You can copy the string returned by `setlocale` to restore that part of the program's locale information. Global or thread local storage is used for the string returned by `setlocale`. Later calls to `setlocale` overwrite the string, which invalidates string pointers returned by earlier calls.
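The copy-then-restore pattern described in the changed line can be sketched as follows. This is a minimal illustration, not part of the documentation change: the helper names `copy_current_locale` and `restore_locale` are hypothetical, and only standard C library calls are used.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the current locale string for
   `category`, or NULL on failure. The caller owns the result. */
char *copy_current_locale(int category)
{
    /* A NULL locale argument queries the setting without changing it. */
    const char *current = setlocale(category, NULL);
    if (current == NULL) {
        return NULL;
    }

    /* Copy immediately: a later setlocale call may overwrite this buffer. */
    size_t size = strlen(current) + 1;
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, current, size);
    }
    return copy;
}

/* Hypothetical helper: restores `category` from a string produced by
   copy_current_locale and frees it. Returns 0 on success, -1 on failure. */
int restore_locale(int category, char *saved)
{
    assert(saved != NULL);

    int result = (setlocale(category, saved) != NULL) ? 0 : -1;
    free(saved);
    return result;
}
```

A caller would take the copy before switching locales, do its locale-dependent work, and then pass the copy to `restore_locale`. Holding on to the pointer returned by `setlocale` itself would not be safe across the intervening calls.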
 
 ## Remarks
 
-Use the **setlocale** function to set, change, or query some or all of the current program locale information specified by *locale* and *category*. *locale* refers to the locality (country/region and language) for which you can customize certain aspects of your program. Some locale-dependent categories include the formatting of dates and the display format for monetary values. If you set *locale* to the default string for a language that has multiple forms supported on your computer, you should check the **setlocale** return value to see which language is in effect. For example, if you set *locale* to "chinese" the return value could be either "chinese-simplified" or "chinese-traditional".
+Use the `setlocale` function to set, change, or query some or all of the current program locale information specified by *locale* and *category*. *locale* refers to the locality (country/region and language) for which you can customize certain aspects of your program. Some locale-dependent categories include the formatting of dates and the display format for monetary values. If you set *locale* to the default string for a language that has multiple forms supported on your computer, you should check the `setlocale` return value to see which language is in effect. For example, if you set *locale* to "chinese" the return value could be either "chinese-simplified" or "chinese-traditional".
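Checking the return value, as the paragraph recommends, might look like the sketch below. The helper name `set_locale_and_report` is hypothetical. The code relies only on the standard behavior that `setlocale` returns a null pointer, and leaves the locale unchanged, when the request can't be honored.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: tries to set every category to `name` and reports
   which locale is actually in effect. Returns 1 on success, 0 on failure. */
int set_locale_and_report(const char *name)
{
    assert(name != NULL);

    const char *in_effect = setlocale(LC_ALL, name);
    if (in_effect == NULL) {
        /* The request could not be honored; the locale is unchanged. */
        printf("\"%s\" is not available\n", name);
        return 0;
    }

    /* The returned string can differ from the requested one, for example
       when a generic language name resolves to a specific form. */
    printf("requested \"%s\", in effect: \"%s\"\n", name, in_effect);
    return 1;
}
```

Which string comes back for a generic name such as "chinese" depends on what the host system supports, so the sketch prints the value rather than assuming it.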
 
-**_wsetlocale** is a wide-character version of **setlocale**; the *locale* argument and return value of **_wsetlocale** are wide-character strings. **_wsetlocale** and **setlocale** behave identically otherwise.
+`_wsetlocale` is a wide-character version of `setlocale`; the *locale* argument and return value of `_wsetlocale` are wide-character strings. `_wsetlocale` and `setlocale` behave identically otherwise.
 
 By default, this function's global state is scoped to the application. To change this, see [Global state in the CRT](../global-state.md).
 
 ### Generic-Text Routine Mappings
 
 |TCHAR.H routine|_UNICODE & _MBCS not defined|_MBCS defined|_UNICODE defined|
 The *category* argument specifies the parts of a program's locale information that are affected. The macros used for *category* and the parts of the program they affect are as follows:
 
 |*category* flag|Affects|
 |-|-|
-|**LC_ALL**| All categories, as listed below. |
-|**LC_COLLATE**| The **strcoll**, **_stricoll**, **wcscoll**, **_wcsicoll**, **strxfrm**, **_strncoll**, **_strnicoll**, **_wcsncoll**, **_wcsnicoll**, and **wcsxfrm** functions. |
-|**LC_CTYPE**| The character-handling functions (except **isdigit**, **isxdigit**, **mbstowcs**, and **mbtowc**, which are unaffected). |
-|**LC_MONETARY**| Monetary-formatting information returned by the **localeconv** function. |
-|**LC_NUMERIC**| Decimal-point character for the formatted output routines (such as **printf**), for the data-conversion routines, and for the non-monetary formatting information returned by **localeconv**. In addition to the decimal-point character, **LC_NUMERIC** sets the thousands separator and the grouping control string returned by [localeconv](localeconv.md). |
-|**LC_TIME**| The **strftime** and **wcsftime** functions. |
+|`LC_ALL`| All categories, as listed below. |
+|`LC_COLLATE`| The `strcoll`, `_stricoll`, `wcscoll`, `_wcsicoll`, `strxfrm`, `_strncoll`, `_strnicoll`, `_wcsncoll`, `_wcsnicoll`, and `wcsxfrm` functions. |
+|`LC_CTYPE`| The character-handling functions (except `isdigit`, `isxdigit`, `mbstowcs`, and `mbtowc`, which are unaffected). |
+|`LC_MONETARY`| Monetary-formatting information returned by the `localeconv` function. |
+|`LC_NUMERIC`| Decimal-point character for the formatted output routines (such as `printf`), for the data-conversion routines, and for the non-monetary formatting information returned by `localeconv`. In addition to the decimal-point character, `LC_NUMERIC` sets the thousands separator and the grouping control string returned by [localeconv](localeconv.md). |
+|`LC_TIME`| The `strftime` and `wcsftime` functions. |
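As an illustration of a single category from the table, the sketch below changes only `LC_NUMERIC` and reads the decimal-point string back through `localeconv`. The helper name `decimal_point_for` is hypothetical, and the result for any locale other than "C" depends on what the host system provides.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: sets only LC_NUMERIC to `name` and copies the
   decimal-point string into `out`. Returns 1 on success, 0 otherwise. */
int decimal_point_for(const char *name, char *out, size_t cap)
{
    assert(name != NULL && out != NULL && cap > 0);

    /* Other categories (collation, time, and so on) are left untouched. */
    if (setlocale(LC_NUMERIC, name) == NULL) {
        return 0;
    }

    /* Copy the value out: the lconv data may be overwritten by later
       calls to localeconv or setlocale. */
    const struct lconv *info = localeconv();
    int written = snprintf(out, cap, "%s", info->decimal_point);
    return written >= 0 && (size_t)written < cap;
}
```

In the "C" locale the decimal point is a period. A locale that formats numbers with a decimal comma would be expected to return "," instead, assuming that locale is available on the system.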
 
-This function validates the category parameter. If the category parameter isn't one of the values given in the previous table, the invalid parameter handler is invoked, as described in [Parameter Validation](../../c-runtime-library/parameter-validation.md). If execution is allowed to continue, the function sets **errno** to **EINVAL** and returns **NULL**.
+This function validates the category parameter. If the category parameter isn't one of the values given in the previous table, the invalid parameter handler is invoked, as described in [Parameter Validation](../../c-runtime-library/parameter-validation.md). If execution is allowed to continue, the function sets `errno` to `EINVAL` and returns `NULL`.
 
-The *locale* argument is a pointer to a string that specifies the locale. For information about the format of the *locale* argument, see [Locale Names, Languages, and Country/Region Strings](../../c-runtime-library/locale-names-languages-and-country-region-strings.md). If *locale* points to an empty string, the locale is the implementation-defined native environment. A value of **C** specifies the minimal ANSI conforming environment for C translation. The **C** locale assumes that all **`char`** data types are 1 byte and that their value is always less than 256.
+The *locale* argument is a pointer to a string that specifies the locale. For information about the format of the *locale* argument, see [Locale Names, Languages, and Country/Region Strings](../../c-runtime-library/locale-names-languages-and-country-region-strings.md). If *locale* points to an empty string, the locale is the implementation-defined native environment. A value of `C` specifies the minimal ANSI conforming environment for C translation. The `C` locale assumes that all `char` data types are 1 byte and that their value is always less than 256.
 
 At program startup, the equivalent of the following statement is executed:
 
 `setlocale( LC_ALL, "C" );`
 
-The *locale* argument can take a locale name, a language string, a language string and country/region code, a code page, or a language string, country/region code, and code page. The set of available locale names, languages, country/region codes, and code pages includes all those supported by the Windows NLS API. The set of locale names supported by **setlocale** are described in [Locale Names, Languages, and Country/Region Strings](../../c-runtime-library/locale-names-languages-and-country-region-strings.md). The set of language and country/region strings supported by **setlocale** are listed in [Language Strings](../../c-runtime-library/language-strings.md) and [Country/Region Strings](../../c-runtime-library/country-region-strings.md). We recommend the locale name form for performance and for maintainability of locale strings embedded in code or serialized to storage. The locale name strings are less likely to be changed by an operating system update than the language and country/region name form.
+The *locale* argument can take a locale name, a language string, a language string and country/region code, a code page, or a language string, country/region code, and code page. The set of available locale names, languages, country/region codes, and code pages includes all those supported by the Windows NLS API. The set of locale names supported by `setlocale` is described in [Locale Names, Languages, and Country/Region Strings](../../c-runtime-library/locale-names-languages-and-country-region-strings.md). The set of language and country/region strings supported by `setlocale` is listed in [Language Strings](../../c-runtime-library/language-strings.md) and [Country/Region Strings](../../c-runtime-library/country-region-strings.md). We recommend the locale name form for performance and for maintainability of locale strings embedded in code or serialized to storage. The locale name strings are less likely to be changed by an operating system update than the language and country/region name form.
 
-A null pointer that's passed as the *locale* argument tells **setlocale** to query instead of to set the international environment. If the *locale* argument is a null pointer, the program's current locale setting isn't changed. Instead, **setlocale** returns a pointer to the string that's associated with the *category* of the thread's current locale. If the *category* argument is **LC_ALL**, the function returns a string that indicates the current setting of each category, separated by semicolons. For example, the sequence of calls
+A null pointer that's passed as the *locale* argument tells `setlocale` to query instead of to set the international environment. If the *locale* argument is a null pointer, the program's current locale setting isn't changed. Instead, `setlocale` returns a pointer to the string that's associated with the *category* of the thread's current locale. If the *category* argument is `LC_ALL`, the function returns a string that indicates the current setting of each category, separated by semicolons. For example, the sequence of calls
-which is the string that's associated with the **LC_ALL** category.
+which is the string that's associated with the `LC_ALL` category.
 
-The following examples pertain to the **LC_ALL** category. Either of the strings ".OCP" and ".ACP" can be used instead of a code page number to specify use of the user-default OEM code page and user-default ANSI code page for that locale name, respectively.
+The following examples pertain to the `LC_ALL` category. Either of the strings ".OCP" and ".ACP" can be used instead of a code page number to specify use of the user-default OEM code page and user-default ANSI code page for that locale name, respectively.
 
 -`setlocale( LC_ALL, "" );`
 
@@ -145,7 +145,7 @@ The following examples pertain to the **LC_ALL** category. Either of the strings
 
 -`setlocale( LC_ALL, "<language>" );`
 
-Sets the locale to the language that's indicated by *\<language>*, and uses the default country/region for the specified language and the user-default ANSI code page for that country/region as obtained from the host operating system. For example, the following calls to **setlocale** are functionally equivalent:
+Sets the locale to the language that's indicated by *\<language>*, and uses the default country/region for the specified language and the user-default ANSI code page for that country/region as obtained from the host operating system. For example, the following calls to `setlocale` are functionally equivalent:
 
 `setlocale( LC_ALL, "en-US" );`
 
@@ -159,22 +159,36 @@ The following examples pertain to the **LC_ALL** category. Either of the strings
 
 Sets the code page to the value indicated by *<code_page>*, together with the default country/region and language (as defined by the host operating system) for the specified code page.
 
-The category must be either **LC_ALL** or **LC_CTYPE** to effect a change of code page. For example, if the default country/region and language of the host operating system are "United States" and "English," the following two calls to **setlocale** are functionally equivalent:
+The category must be either `LC_ALL` or `LC_CTYPE` to effect a change of code page. For example, if the default country/region and language of the host operating system are "United States" and "English," the following two calls to `setlocale` are functionally equivalent:
 For more information, see the [setlocale](../../preprocessor/setlocale.md) pragma directive in the [C/C++ Preprocessor Reference](../../preprocessor/c-cpp-preprocessor-reference.md).
 
-The function [_configthreadlocale](configthreadlocale.md) is used to control whether **setlocale** affects the locale of all threads in a program or only the locale of the calling thread.
+The function [_configthreadlocale](configthreadlocale.md) is used to control whether `setlocale` affects the locale of all threads in a program or only the locale of the calling thread.
+
+## UTF-8 Support
+
+Starting in Windows 10 build 17134 (April 2018 Update), the Universal C Runtime supports using a UTF-8 code page. This means that C runtime functions can interpret the `char` strings passed to them as UTF-8 encoded. To enable UTF-8 mode, use ".UTF8" as the code page when calling `setlocale`. For example, `setlocale(LC_ALL, ".UTF8")` uses the current default Windows ANSI code page (ACP) for the locale and UTF-8 for the code page.
+
+After calling `setlocale(LC_ALL, ".UTF8")`, you may pass "😊" to `mbstowcs` and it will be properly translated to a `wchar_t` string, whereas previously no locale setting was available to do this.
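The conversion described above can be sketched as follows. The helper name `utf8_to_wide` is hypothetical, and the sketch assumes a Universal C Runtime that accepts ".UTF8". On runtimes that reject that string, `setlocale` returns a null pointer and the helper reports failure instead of converting.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts the UTF-8 string `utf8` into `out` (at most
   `cap` wide characters). Returns the number of wide characters written,
   not counting the terminator, or (size_t)-1 on failure. */
size_t utf8_to_wide(const char *utf8, wchar_t *out, size_t cap)
{
    assert(utf8 != NULL && out != NULL && cap > 0);

    /* Switch the code page to UTF-8. This changes the locale for the
       caller too, and returns NULL where ".UTF8" isn't accepted. */
    if (setlocale(LC_ALL, ".UTF8") == NULL) {
        return (size_t)-1;
    }

    /* mbstowcs now decodes the input bytes as UTF-8. */
    return mbstowcs(out, utf8, cap);
}
```

The emoji from the text can be passed as the byte sequence `"\xF0\x9F\x98\x8A"`, which avoids depending on the source file's encoding. Because `wchar_t` is UTF-16 on Windows, a character outside the Basic Multilingual Plane such as this one occupies two `wchar_t` units there.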
+
+UTF-8 mode is also enabled for functions that have historically translated `char` strings using the default Windows ANSI code page (ACP). For example, calling [`_mkdir("😊")`](../reference/mkdir-wmkdir.md) while using a UTF-8 code page will correctly produce a directory with that emoji as the folder name, instead of requiring the ACP to be changed to UTF-8 prior to running your program. Likewise, calling [`_getcwd()`](../reference/getcwd-wgetcwd.md) inside that folder will return a UTF-8 encoded string. For compatibility, the ACP is still used if the C locale code page is not set to UTF-8.
+
+The following aspects of the C Runtime can't use UTF-8 because they are set during program startup and must use the default Windows ANSI code page (ACP): [`__argv`](../argc-argv-wargv.md), [`_acmdln`](../acmdln-tcmdln-wcmdln.md), and [`_pgmptr`](../pgmptr-wpgmptr.md).
+
+Before this support was added, [`mbrtoc16`, `mbrtoc32`](../reference/mbrtoc16-mbrtoc323.md), [`c16rtomb`, and `c32rtomb`](../reference/c16rtomb-c32rtomb1.md) existed to translate between UTF-8 narrow strings, UTF-16 (the same encoding as `wchar_t` on Windows platforms), and UTF-32. For compatibility reasons, these APIs still translate only to and from UTF-8, and not the code page set via `setlocale`.
+
+To use this feature on an OS prior to Windows 10, such as Windows 7, you must use [app-local deployment](../../windows/universal-crt-deployment.md#local-deployment) or link statically using version 17134 of the Windows SDK or later. For Windows 10 operating systems prior to 17134, only static linking is supported.
 
 ## Requirements
 
 |Routine|Required header|
 |-------------|---------------------|
-|**setlocale**|\<locale.h>|
-|**_wsetlocale**|\<locale.h> or \<wchar.h>|
+|`setlocale`|\<locale.h>|
+|`_wsetlocale`|\<locale.h> or \<wchar.h>|
 
 For additional compatibility information, see [Compatibility](../../c-runtime-library/compatibility.md).
docs/c-runtime-library/reference/setmbcp.md
0 additions & 10 deletions
@@ -34,16 +34,6 @@ Returns 0 if the code page is set successfully. If an invalid code page value is
 
 The **_setmbcp** function specifies a new multibyte code page. By default, the run-time system automatically sets the multibyte code page to the system-default ANSI code page. The multibyte code page setting affects all multibyte routines that are not locale dependent. However, it is possible to instruct **_setmbcp** to use the code page defined for the current locale (see the following list of manifest constants and associated behavior results). For a list of the multibyte routines that are dependent on the locale code page rather than the multibyte code page, see [Interpretation of Multibyte-Character Sequences](../../c-runtime-library/interpretation-of-multibyte-character-sequences.md).
 
-The multibyte code page also affects multibyte-character processing by the following run-time library routines:
-In addition, all run-time library routines that receive multibyte-character *argv* or *envp* program arguments as parameters (such as the **_exec** and **_spawn** families) process these strings according to the multibyte code page. Therefore, these routines are also affected by a call to **_setmbcp** that changes the multibyte code page.
-
 The *codepage* argument can be set to any of the following values:
 
 -**_MB_CP_ANSI** Use ANSI code page obtained from operating system at program startup.