Computing Reviews
The Unicode standard, version 4.0
Aliprand J. (ed), Allen J., Becker J. (ed), Davis M. (ed), Everson M. (ed), Freytag A. (ed), Jenkins J. (ed), Ksar M. (ed), McGowan R. (ed), Muller E. (ed), Moore L. (ed), Suignard M., Whisler K. (ed), Addison-Wesley Longman Publishing Co, Inc., Boston, MA, 2003. 1504 pp. Type: Book (9780321185785)
Date Reviewed: Feb 25 2004

Unicode is a character encoding standard that covers all the major alphabetic and ideographic writing systems of the world. It was developed by the Unicode Consortium, whose members include most of the largest hardware and software companies. Unicode has become quite popular, and modern programming languages like Java use a 16-bit encoding form, Unicode transformation format 16 (UTF-16), internally to represent characters. Often, this 16-bit representation is considered to be “the Unicode representation,” but this is only partially correct: UTF-16 is one of several encoding forms, alongside UTF-8 and UTF-32. UTF-8 is one of the best-known representations of Unicode, since it is heavily used in Extensible Markup Language (XML). This handbook explains all of the above in detail.
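To make the difference between the encoding forms concrete, here is a minimal Python sketch (not taken from the book) that encodes the same characters in UTF-8, UTF-16, and UTF-32 and counts the resulting bytes. The helper name `encoded_lengths` and the three sample characters are illustrative choices only.

```python
# Sample characters: U+0041 (Latin A), U+20AC (euro sign),
# U+10000 (a Linear B syllable, outside the 16-bit range).
SAMPLES = ["A", "\u20ac", "\U00010000"]


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes `ch` occupies in each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-be" codecs are big-endian and emit no byte order mark,
        # so the lengths reflect the character data alone.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

The last sample needs four bytes in UTF-16 (a surrogate pair), which illustrates why a single 16-bit unit is only partially "the Unicode representation."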

Chapter 1 presents a short introduction to how Unicode came about, and the ideas behind it. Chapter 2 explains the basic Unicode design principles, and the different encoding forms, such as UTF-8 and UTF-16. Chapter 5 includes some implementation guidelines for developers, including a solution to the problem of sorting and searching Unicode strings, which matters a great deal for databases and similar software.
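Why sorting needs special treatment can be seen in a small Python sketch. This is not the approach described in the book; it is a deliberately crude approximation that assumes stripping accents and ignoring case is acceptable, which is not true for every language. The function name `naive_sort_key` and the word list are hypothetical.

```python
import unicodedata


def naive_sort_key(text: str) -> tuple:
    """Crude sort key: compare base letters first, ignoring case and accents.

    This is only an illustration, not a full collation algorithm.
    """
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # The original text is the tie-breaker, so the ordering is deterministic.
    return (base.casefold(), text)


# "\u00c4pfel" begins with A-umlaut, "\u00e9clair" with e-acute.
words = ["zebra", "\u00c4pfel", "apple", "Zoo", "\u00e9clair"]

by_code_point = sorted(words)                # raw code point order
by_naive_key = sorted(words, key=naive_sort_key)

print(by_code_point)
print(by_naive_key)
```

Sorting by raw code point places every uppercase ASCII letter before every lowercase one and pushes accented letters to the end, which is rarely what a user expects. Production software would normally rely on a proper collation library rather than a key like this one.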

The central chapters of the book are chapters 7 to 13, which describe the scripts that Unicode supports, starting from the Latin character set, and continuing on through archaic character sets like Linear B (although I wonder why there is no support for hieroglyphs). Symbols, such as currency and numbers, are described in chapter 14, while chapter 15 defines special areas, such as the private use area. Chapter 16 (the most voluminous chapter) contains all the code charts (about 700 pages). The appendix focuses on a description of changes from previous versions of Unicode, and its relationship to ISO 10646.

The accompanying CD-ROM contains the old 3.0 version of the book and all the relevant technical reports. It also contains a very useful Unicode character viewer application.

One problem that nonexpert Unicode users may encounter is the following: in many cases, one might have a graphical image of a character (such as a Chinese or a Thai one), but have no easy way to find the associated Unicode value for that character. I assume this will happen more often to developers who write “internationalized” versions of their products. If one does not know the “literal meaning” of the character, it will take a long time to find it in the book. However, it would clearly not be possible for a book to address this problem; perhaps there will eventually be software that will allow a user to supply an image, returning the Unicode value for that character. It would be perfect if a piece of software like this could be supplied on the accompanying CD-ROM.

This is the definitive handbook on Unicode, at least for the current state of the art. It provides all the information one needs to work with Unicode. I have used different versions of the book since 1995. Since that time, it has gained weight (it is now about three kilograms), but each gram is worthwhile. Aside from this, it is fun simply to flick through the book, just to see the different character sets.

Reviewer:  K. Waldhör Review #: CR129146 (0408-0912)
Standards (I.7.2 ... )
Standards (K.1 ... )
Standards (I.3.6 ... )
Web-Based Services (H.3.5 ... )
World Wide Web (WWW) (H.3.4 ... )
Methodology And Techniques (I.3.6 )
 
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®