Main index | Section 5 | Options |
[0x00000000 - 0x0000007f] [00000000.0bbbbbbb] -> 0bbbbbbb [0x00000080 - 0x000007ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb [0x00000800 - 0x0000ffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb [0x00010000 - 0x001fffff] [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] -> 11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb [0x00200000 - 0x03ffffff] [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> 111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb [0x04000000 - 0x7fffffff] [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> 1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
If more than a single representation of a value exists (for example, 0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always used. Longer ones are detected as an error as they pose a potential security risk, and destroy the 1:1 character:octet sequence mapping.
Proceedings of the Winter 1993 USENIX Technical Conference, Hello World, January 1993.
, , ,RFC 2279, UTF-8, a transformation format of ISO 10646, January 1998.
,as amended by the Unicode Standard Annex #27: Unicode 3.1 and by the Unicode Standard Annex #28: Unicode 3.2, The Unicode Standard, Version 3.0, 2000.
,UTF8 (5) | April 7, 2004 |
Main index | Section 5 | Options |
Please direct any comments about this manual page service to Ben Bullock. Privacy policy.
“ | VI = Virtually Incomprehensible. | ” |