Wide string in C++ and Windows MultiByteToWideChar usage for Unicode
I wrote this page for myself, to review whenever I am about to use wchar_t and the string conversion functions, to make sure I'm not doing something stupid.
Don't assume "Unicode" is 16 bits
The storage size of Unicode text differs between frameworks and systems. "Unicode" only means that every possible character can be represented; it says nothing about the storage size. One framework may store text as UTF-8, with a varying size per character, while another system may store it as UTF-16 or UTF-32.
The size of one character is variable
If you store a letter in UTF-8, UTF-16 or "Unicode", make no assumption about how many bytes it occupies: these are variable-length encodings.
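To make the variable length concrete, here is a minimal sketch in portable C++. The helper names (utf8_sequence_length, utf16_units, utf8_code_points) are made up for this page and are not part of any library. It assumes well-formed input and does no real validation.

```cpp
#include <cstddef>
#include <string>

// Number of bytes in a UTF-8 sequence, judged from its lead byte only.
// Returns 0 if the byte matches no lead-byte pattern (e.g. a continuation
// byte). No further validation is done.
std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Number of 16-bit code units UTF-16 needs for one code point.
std::size_t utf16_units(char32_t code_point) {
    return code_point >= 0x10000 ? 2 : 1;  // above the BMP: surrogate pair
}

// Counts code points in a well-formed UTF-8 string by skipping
// continuation bytes (10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}
```

For example, the string "A\xC3\xA9\xE2\x82\xAC" (the letter A, then e with acute accent, then the euro sign, all in UTF-8) has a size() of 6 bytes but holds only 3 code points.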
Fixed-size Unicode
To get fixed-size Unicode, use UTF-32, where every code point is always stored in 4 bytes, whatever the language or character.
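A small sketch of what this buys you, using the standard std::u32string. The function names are mine, for illustration only. char32_t is 4 bytes on the mainstream platforms I know of, so the code uses sizeof(char32_t) rather than a hard-coded 4.

```cpp
#include <cstddef>
#include <string>

// Storage used by the characters of a UTF-32 string, in bytes
// (terminator not counted). Every code point costs the same amount.
std::size_t utf32_byte_size(const std::u32string& s) {
    return s.size() * sizeof(char32_t);
}

// With a fixed-size encoding, the n-th code point is simply the n-th element.
char32_t nth_code_point(const std::u32string& s, std::size_t n) {
    return s.at(n);  // throws std::out_of_range if n is past the end
}
```

With U"A\u20AC\U0001F600" (A, euro sign, an emoji), size() is 3 and the byte size is 3 * sizeof(char32_t), whichever characters are in there. Note that wchar_t is not a safe substitute here: it is 2 bytes on Windows (UTF-16) and typically 4 bytes on Linux.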
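About the Windows API
sizeof(variable) gives you the size in bytes.
Most Windows conversion functions take their size parameters in characters (not bytes!). For a wchar_t buffer, that is the number of wchar_t elements, not the result of sizeof.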
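A hedged sketch of the difference. element_count and utf8_to_wide are helper names I made up for this page. The Windows part is guarded by _WIN32 so the rest still compiles elsewhere. It assumes the input is NUL-terminated UTF-8 and that the result fits in the buffer.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#endif

// Number of elements in a fixed-size array, as opposed to sizeof(array),
// which is the size in bytes.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

#ifdef _WIN32
// Converts a NUL-terminated UTF-8 string into a caller-supplied wide buffer.
// Returns the number of wchar_t written (terminator included), or 0 on failure.
template <std::size_t N>
int utf8_to_wide(const char* utf8, wchar_t (&out)[N]) {
    // The last argument (cchWideChar) is a count of wchar_t, NOT sizeof(out).
    // Passing sizeof(out) would claim the buffer is twice as large as it is.
    return MultiByteToWideChar(CP_UTF8, 0, utf8, -1, out, static_cast<int>(N));
}
#endif
```

For wchar_t buf[16], element_count(buf) is 16 while sizeof(buf) is 16 * sizeof(wchar_t), which is 32 on Windows. Calling MultiByteToWideChar with a NULL output buffer and a size of 0 returns the required size in characters, which is handy when the buffer has to be allocated dynamically.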