#Unicode4Life

This is a brief article to let the readership know that the Unicode Consortium now offers lifetime memberships for individual members. My lifetime membership certificate is shown above.

UTC #155

The next UTC (Unicode Technical Committee) meeting—the 155th one—takes place during the week of April 30th, and will be hosted at the Adobe headquarters in San José, California. Of course, all voting members of the Unicode Consortium are strongly encouraged to attend.

CMap Resources & Character Collections

The CMap resources that are associated with our public glyph sets—called character collections—were first open-sourced on 2009-09-21 via Adobe’s first open source portal, and about a year later the project was moved to SourceForge. I then migrated the project to GitHub on 2015-03-27 where it is likely to remain for the foreseeable future. The main purpose for open-sourcing our CMap resources was to make it easier for developers to include them in their own open source projects, many of which require that the components themselves be open source.

I then open-sourced three of our four character collections on GitHub—Adobe-GB1-5, Adobe-CNS1-7, and Adobe-Japan1-6—in October of last year. The Adobe-Korea1-2 character collection was intentionally not open-sourced, because it will soon be replaced by the Adobe-KR-9 character collection that is expected to be published in mid-May.

Adobe-KR-9 Fourth Draft

This article picks up where the 2018-01-18 article left off, and provides details about the fourth—and hopefully final—draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The fourth draft of the Adobe-KR-9 character collection includes 22,860 glyphs (CIDs 0 through 22859) distributed among ten Supplements. When compared to the third draft, four glyphs were removed, only one glyph was added, a small number of glyphs were moved from Supplement 0 to later Supplements, and the ordering of Supplements 3 through 9 was changed. Because it is a draft, the details are still subject to change, though my hope is that this draft represents what will become the final character collection specification.

Exploring IICore—Part 5

Part 1, Part 2, Part 3, and Part 4 of this series scrutinized the ideographs that are associated with each of the seven region tags of the kIICore property. In this fifth and final article of this series, I will provide some details about the earlier versions of IICore, and what changed between them.

Exploring IICore—Part 4

In Part 1, Part 2, and Part 3 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), “J” (for Japan), and “G” (for PRC or China) in the kIICore property. In Part 4, which is today’s article, we will explore the ideographs that are tagged “T” (for ROC or Taiwan), “H” (for Hong Kong SAR), and “M” (for Macao SAR).

Year of the Dog

I’d like to use this opportunity to welcome the year of the dog, which is expressed using the CJK Unified Ideograph 戌 (U+620C), and to wish a Happy Chinese New Year to all of my friends, colleagues, and blog readers who are celebrating this holiday. May this year be safe, prosperous, and enjoyable.

Exploring IICore—Part 3

In Part 1 and Part 2 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), and “J” (for Japan) in the kIICore property. In Part 3, which is today’s article, we will explore the 5,825 ideographs that are tagged “G” (for PRC or China).

Exploring IICore—Part 2

In Part 1 of this series, which is intended to scrutinize the 9,810 CJK Unified Ideographs that make up IICore, we explored some of the oddities that relate to ROK (aka South Korea). In Part 2 of this series, we will explore the ideographs that are tagged “P” and “J” for DPRK (aka North Korea) and Japan use, respectively.

Exploring IICore—Part 1

Today’s article is the very first one that references IICore (International Ideographs Core), which is best described as a region-agnostic subset that includes the most commonly used CJK Unified Ideographs in Unicode, and is intended for use in memory-challenged devices and environments. Included are 9,810 ideographs, the bulk of which are in the URO (9,706), with the remaining ones in Extensions A (42) and B (62).

IICore is instantiated as the kIICore property of the Unihan Database, and documented in UAX #38. The kIICore property values consist of an initial letter—A, B, or C—that indicates priority, followed by one or more letters that specify a source that more or less corresponds to a region: G, H, J, K, M, P (short for KP), and T.
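To make the value format described above concrete, here is a minimal Python sketch that splits a kIICore value into its priority letter and its source letters. The function name `parse_kiicore`, the `SOURCES` table, and the sample value are illustrative assumptions rather than anything taken from the Unihan Database tooling, and the region names are simply the ones used in this series.

```python
import re

# Source letters as listed in the text above; the region names follow this series.
SOURCES = {
    "G": "PRC (China)",
    "H": "Hong Kong SAR",
    "J": "Japan",
    "K": "ROK (South Korea)",
    "M": "Macao SAR",
    "P": "DPRK (North Korea)",
    "T": "ROC (Taiwan)",
}

# One priority letter (A, B, or C) followed by one or more source letters.
_KIICORE = re.compile(r"([ABC])([GHJKMPT]+)")


def parse_kiicore(value):
    """Return (priority, [source letters]) for a kIICore value (hypothetical helper)."""
    match = _KIICORE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a kIICore value: {value!r}")
    priority, letters = match.groups()
    return priority, list(letters)


# Sample value chosen for illustration: priority A, tagged for all seven sources.
priority, sources = parse_kiicore("AGTJHKMP")
print(priority, [SOURCES[s] for s in sources])
```

A strict regular expression is used here so that malformed values fail loudly instead of being silently misread. A real consumer would read the values from the Unihan data files rather than from a hard-coded string.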