Wide string in C++ and Windows MultiByteToWideChar usage for Unicode

by frank · January 1, 2016

I wrote this page for myself to review whenever I am about to use wchar_t and the string conversion functions, to make sure I'm not doing something stupid.

Don't assume "Unicode" is 16 bits
The size of Unicode text differs between frameworks and systems. "Unicode" only means that every possible character can be represented; it makes no statement about the storage size. One framework may store text as UTF-8, and thus with a varying size per character, while another system may store it as UTF-16 or UTF-32.
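As a small illustration of how storage size depends on the platform, the sketch below reports the width of one code unit for a given character type. The helper name code_unit_bits is made up for this example and is not part of any library.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: width in bits of ONE code unit of CharT.
// A code unit is not necessarily a whole character.
template <typename CharT>
constexpr std::size_t code_unit_bits() {
    return sizeof(CharT) * CHAR_BIT;
}
```

With MSVC on Windows, code_unit_bits<wchar_t>() is 16 (wchar_t holds UTF-16 code units), while with GCC or Clang on typical Linux systems it is 32. That alone is a good reason not to hard-code a size for "Unicode".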

The size of one character is variable
If you store a letter as UTF-8 or UTF-16 (which is what Windows usually means by "Unicode"), make no assumption about how many bytes it takes: both are variable-length encodings. One character takes 1 to 4 bytes in UTF-8 and 2 or 4 bytes in UTF-16.
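To make the variable length concrete, here is a minimal sketch that computes how much storage a single code point needs in each encoding. The function names are invented for this example, and both functions assume the input is a valid Unicode scalar value (at most 0x10FFFF and not a surrogate).

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed to encode one code point in UTF-8.
// Assumes cp is a valid Unicode scalar value.
inline std::size_t utf8_bytes(char32_t cp) {
    if (cp < 0x80)    return 1;  // ASCII
    if (cp < 0x800)   return 2;  // e.g. U+00E9 (e with acute accent)
    if (cp < 0x10000) return 3;  // rest of the Basic Multilingual Plane
    return 4;                    // supplementary planes, e.g. emoji
}

// Hypothetical helper: 16-bit code units needed to encode one code point in UTF-16.
// Code points above U+FFFF need a surrogate pair (2 units, 4 bytes).
inline std::size_t utf16_units(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}
```

For example, the euro sign U+20AC takes 3 bytes in UTF-8 but a single 16-bit unit in UTF-16, while U+1F600 takes 4 bytes in UTF-8 and two 16-bit units in UTF-16.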

Fixed-size Unicode
To get fixed-size Unicode, use UTF-32, where every code point is always stored in 4 bytes, whatever the language or character.
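In standard C++ the closest match is std::u32string, which holds one char32_t per code point. The sketch below is only an illustration and the helper name utf32_storage_bytes is made up. It shows that the storage size then follows directly from the length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes used by the code units of a UTF-32 string
// (excluding the terminating null). Every code point is exactly one
// char32_t, so this is simply length times unit size.
inline std::size_t utf32_storage_bytes(const std::u32string& s) {
    return s.size() * sizeof(char32_t);
}
```

The literal U"a\u20AC\U0001F600" has a size() of 3 as a std::u32string, one element per code point. The same text as a std::u16string (u"a\u20AC\U0001F600") has a size() of 4, because the last code point needs a surrogate pair.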

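About the Windows API
sizeof(variable) gives you the size in bytes.
Most Windows conversion functions take wide-buffer sizes in characters, not bytes! For example, the cchWideChar parameter of MultiByteToWideChar counts wchar_t elements, while its cbMultiByte parameter counts bytes.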
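The sketch below shows one way to keep bytes and characters apart. The names element_count and utf8_to_wide are made up for this example. The Windows-only part uses the common two-call pattern (ask for the required size first, then convert), assumes the input is UTF-8, and keeps error handling to a minimum.

```cpp
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: number of ELEMENTS in a fixed-size array.
// This is the "size in characters" that wide-buffer parameters expect;
// sizeof(array) would give the size in bytes instead.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

#ifdef _WIN32
// Hypothetical helper: convert a UTF-8 string to a UTF-16 std::wstring.
// Returns an empty string on failure (simplified error handling).
inline std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int bytes = static_cast<int>(utf8.size());  // cbMultiByte: in BYTES
    // First call: cchWideChar == 0 returns the required size in wchar_t units.
    const int needed = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                           utf8.data(), bytes, nullptr, 0);
    if (needed <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');
    // Second call: cchWideChar is the buffer size in CHARACTERS, not bytes.
    const int written = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                            utf8.data(), bytes, &wide[0], needed);
    if (written <= 0) return std::wstring();
    return wide;
}
#endif
```

For a buffer declared as wchar_t buf[64], element_count(buf) is 64 while sizeof(buf) is 64 * sizeof(wchar_t). Passing sizeof(buf) as cchWideChar would tell the API the buffer is larger than it really is.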
