[bitc-dev] initial interest

David Hopwood david.nospam.hopwood at blueyonder.co.uk
Sat Dec 10 10:44:38 EST 2005


Jonathan S. Shapiro wrote:
> A bunch of issues are getting mixed up here:
> 
>   1. What is the size of "char"?
>   2. What is the internal-to-memory representation of strings?
>   3. What is the *external* representation of strings during
>      serialization?
> 
> Answers:
> 
>   1. Char *must* be 32 bits, because char needs to be able to
>      represent all code points. I chose very explicitly NOT to
>      support a legacy character type, because it will invite
>      people to write code badly.
> 
>   2. String internal representation is not specified, but the
>      plan is to use either UTF-8 or some variation of the ICU
>      strategy.
> 
>   3. External representation of strings is UTF-32.

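UTF-32 is an extremely inefficient encoding, whether used internally
or externally: every code point takes four bytes, whereas UTF-8 takes
one byte for ASCII and never more than four. The general trend in
protocol design is to use UTF-8 externally.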
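
To put rough numbers on the size difference, here is a minimal Python
sketch that counts the encoded bytes of a few strings. The helper name
encoded_sizes and the sample strings are illustrative assumptions, not
anything from BitC; the little-endian codec variants are used only so
that no byte-order mark is included in the counts.

```python
def encoded_sizes(text):
    """Return the encoded byte length of text in three Unicode encodings.

    The "-le" codecs are chosen so that Python does not prepend a
    byte-order mark, which would otherwise inflate the counts.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    # Illustrative samples: ASCII, Latin-1 accent, CJK, non-BMP emoji.
    for sample in ["hello", "h\u00e9llo", "\u65e5\u672c\u8a9e", "\U0001F600"]:
        print(repr(sample), encoded_sizes(sample))
```

For ASCII-heavy data such as most protocol traffic, UTF-32 is four
times the size of UTF-8. For text outside the Basic Multilingual Plane
the two are the same size, so UTF-32 is never smaller.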

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>


