Guide to Locales - Why You Should Use UTF-8


The locale system's character set is always utf-8.

While it is possible to use other character sets for the locale system, utf-8 offers many benefits that other character sets lack and has no known issues. For this reason, we recommend only utf-8 locales.


If you receive character encoding errors or other "garbled" text, read our Troubleshooting documentation.

Why utf-8?

The locale system uses the utf-8 character encoding for several specific reasons:

  • Character sets and collations are complicated topics. A unified standard simplifies the localization process immensely.
  • utf-8 support is universal.
  • utf-8 allows you to list multiple languages in a single interface or file (for example, to create a menu of available locales).
  • utf-8 ensures that the locale system can interact with external systems (for example, file editors and databases). 
  • Languages like JavaScript and Perl can natively use utf-8 data.
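
To illustrate the point about listing multiple languages in a single file, here is a minimal Python sketch. It is not part of the locale system: the LOCALE_MENU entries and the encode_menu and can_encode helpers are hypothetical names chosen for this example. It shows that one utf-8 byte stream can hold a menu that mixes several scripts, while a single-byte legacy character set such as iso-8859-1 cannot represent the same menu.

```python
# Hypothetical menu of available locales; the entries are illustrative only.
LOCALE_MENU = {
    "en": "English",
    "es": "Español",
    "ru": "Русский",
    "ja": "日本語",
    "ar": "العربية",
}


def encode_menu(menu, encoding):
    """Join the menu into one block of text and encode it with the given charset."""
    text = "\n".join(f"{tag}: {name}" for tag, name in menu.items())
    return text.encode(encoding)


def can_encode(menu, encoding):
    """Return True if every menu entry is representable in the given charset."""
    try:
        encode_menu(menu, encoding)
        return True
    except UnicodeEncodeError:
        return False


utf8_bytes = encode_menu(LOCALE_MENU, "utf-8")
# Decoding with the same charset restores the original text exactly.
round_tripped = utf8_bytes.decode("utf-8")

print(can_encode(LOCALE_MENU, "utf-8"))       # True
print(can_encode(LOCALE_MENU, "iso-8859-1"))  # False: no Cyrillic, Japanese, or Arabic
```

A menu that contains only "English" and "Español" would still fit in iso-8859-1, which is why encoding problems of this kind often surface only after a new language is added.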

While it is possible for a locale to use another character encoding, we have yet to find a good reason to do so, and we will not document how to use a different character set. If, however, you think that you have found a valid reason to use another character set, we would be happy to consider it. Reach out to us on Discord with the technical reasons why your project requires a locale in another encoding.

For more information about utf-8, we recommend that you watch Dan Muey's "I ♥ Unicode" presentation from OSCON 2014.