Book Review: Unicode Explained

Why is Unicode so hard? For good reasons: its complications have complications, and it's hard to isolate any part small enough to understand that isn't deeply coupled to much else.


November 01, 2006
URL:http://www.drdobbs.com/book-review-unicode-explained/199103004

UnixReview.com
November 2006

Reviewed by Cameron Laird

Unicode Explained
Jukka K. Korpela
O'Reilly, 2006
0-596-10121-X
678 pages, $59.99

Why is Unicode so hard?

For good reasons: its complications have complications, and it's hard to isolate any part small enough to understand that isn't deeply coupled to much else. Three broad themes that illustrate this difficulty are:

There's good news, though: Jukka Korpela's Unicode Explained makes Unicode comprehensible. I've been working occasionally with Unicode for almost a decade, but I find I understand parts of it much better now that I've read his book.

Unicode Explained isn't unique in its values; several introductions to Unicode have been assembled by passionate, deeply informed authors who handle the topic's difficulties fairly and with insight. Among these, Unicode Explained deserves attention as the most recent and the one that exhibits the most scholarly refinements. Over and over, Korpela "goes the extra mile" for readers by introducing specific details and concepts crucial to understanding. Rather than a glib syllogism about how typographic unification can go to excess, he presents specific examples from Scandinavian languages, possessive punctuation, and speech synthesis (is it obvious that "Charles I ..." is about the first in a sequence of kings, and that "I" is neither a pronoun nor an initial?) to make his point. He is careful to keep HTML, CSS, and XML explicitly separate in all their manifestations. The entire book is dense with this sort of illuminating substance.

An introduction to Unicode is different from one on SQLite, say, or even a topic as broad as cryptography, because the subject of Unicode is so unavoidably incoherent. Unicode deals with human languages and their typographic representations and must expand to all the messiness we humans achieve. A good author on Unicode can't be just a formal prodigy in a bounded subject like chess, for instance. Instead, he must be experienced in all sorts of esoterica. Korpela appears to have devoted himself to the subject, with Unicode Explained the helping hand he generously offers those of us who merely use Unicode.

Conclusion

My recommendation, then, if you work at all outside the ASCII table or standard Latin alphabet, is to keep a copy of Unicode Explained at your desk. It's a wonderful reference for such common questions as:

There are a very few places where Unicode Explained is confusing or misleading. Korpela, for instance, doesn't distinguish mathematicians from physicists, which leads to error in explaining the symbols the former use.

These missteps are minor, though. If you read, write, or program with human languages other than English (or perhaps Hawaiian or a very few others), you'll do well to keep Unicode Explained at hand.

Cameron is vice president of the Phaseit, Inc., consultancy, specializing in high-reliability and high-performance applications managed by high-level languages. He has reviewed more than 50 books for UnixReview.com, and has had a lifelong interest in, and involvement with, several human languages apart from English.

 
