A Forthcoming Registry & Ordering: Adobe-KR-6

It is difficult to imagine that it has been over 20 years since a new RO—or Adobe CID-keyed glyph set—was born. Of course, I am referring to the static glyph sets, not the ones based on the special-purpose Adobe-Identity-0 ROS.

“RO” stands for Registry and Ordering, which represent compatibility names or identifiers for CID-keyed glyph sets that are referred to as character collections. Adobe CID-keyed glyph sets are usually referred to as ROSes, with the final “S” being an integer that refers to a specific Supplement. The first Supplement, of course, is 0 (zero).

One of my recent projects is to revitalize and modernize our Korean glyph set, Adobe-Korea1-2 (see Adobe Tech Note #5093), which was last modified on 1998-10-12, when Supplement 2 was defined. That Supplement added only pre-rotated versions of the proportional and half-width glyphs that are referenced by the effectively deprecated 'vrt2' (Vertical Alternates and Rotation) GSUB feature. Instead of defining a new Supplement, I decided, for a variety of reasons, that it would be better to define a completely new glyph set. The tentative Registry and Ordering names are Adobe and KR (meaning “Adobe-KR”), and unlike other ROSes for which Supplements are defined incrementally, my current plan is to simultaneously define seven Supplements, 0 through 6.

This article describes the first draft of the Adobe-KR-6 character collection that includes 22,637 glyphs (CIDs 0 through 22636). Because it is the first draft, the details are subject to change—and most certainly will change. I am fairly certain that there will be seven Supplements, because each one has a very well-defined scope. The table below details the number of glyphs per Supplement, their CID ranges, and a high-level summary of the glyphs in each:

Supplement	Glyphs	CID Range	Scope
0	2,823	0–2822	Core glyphs
1	8,822	2823–11644	Supplementary modern hangul syllables
2	4,620	11645–16264	Core hanja
3	250	16265–16514	Enclosed digits, Latin characters & hangul letters/syllables
4	708	16515–17222	KS X 1001 compatibility
5	1,874	17223–19096	Pre-composed hangul syllables for the Jeju dialect (제주말 jejumal) & combining jamo
6	3,540	19097–22636	Supplementary hanja
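The CID ranges in the table follow directly from the per-Supplement glyph counts, because each Supplement starts where the previous one ends. The following is a minimal Python sketch of that arithmetic; the function names `cid_ranges` and `supplement_for_cid` are my own for illustration and are not part of any Adobe tool, and the counts are those of this first draft, which may change.

```python
from itertools import accumulate

# Glyph counts for Supplements 0 through 6, copied from the table above.
GLYPHS_PER_SUPPLEMENT = [2823, 8822, 4620, 250, 708, 1874, 3540]


def cid_ranges(counts):
    """Return an inclusive (first_cid, last_cid) pair for each Supplement."""
    ends = list(accumulate(counts))  # running totals: one past each last CID
    starts = [0] + ends[:-1]
    return [(start, end - 1) for start, end in zip(starts, ends)]


def supplement_for_cid(cid, counts=GLYPHS_PER_SUPPLEMENT):
    """Return the number of the Supplement whose CID range contains `cid`."""
    for supplement, (first, last) in enumerate(cid_ranges(counts)):
        if first <= cid <= last:
            return supplement
    raise ValueError(f"CID {cid} is outside the character collection")


for supplement, (first, last) in enumerate(cid_ranges(GLYPHS_PER_SUPPLEMENT)):
    print(f"Supplement {supplement}: CIDs {first}-{last}")
print("Total glyphs:", sum(GLYPHS_PER_SUPPLEMENT))  # 22637
```

Because a font that supports Supplement N must include all glyphs of the lower Supplements, the highest CID in a font is enough to tell which Supplement it claims; `supplement_for_cid(16265)` returns 3, for example.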

Because this is the very first draft, no actual glyphs are provided or shown. Instead, I have put together a data file that specifies, for each glyph, its CID, Unicode-based glyph name, Unicode code point or sequence, and the actual character. Subsequent drafts may include a glyph table to supplement the data file. The final published version will definitely include a glyph table that will almost certainly use representative glyphs based on the open source Source Han Serif (본명조) typeface.
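For readers who want to process such a data file, here is a rough Python sketch of reading one record and checking that its character column agrees with its code point column. The layout is purely an assumption on my part: tab-separated fields in the order listed above, with code points written as space-separated `U+XXXX` values. The names `GlyphRecord`, `parse_line`, and `is_consistent`, and the sample line itself, are invented for illustration.

```python
from typing import NamedTuple, Tuple


class GlyphRecord(NamedTuple):
    cid: int
    glyph_name: str
    code_points: Tuple[int, ...]  # one or more Unicode scalar values
    character: str


def parse_line(line):
    """Parse one line of the ASSUMED tab-separated layout into a record."""
    cid, glyph_name, code_points, character = line.rstrip("\n").split("\t")
    scalars = tuple(int(cp.replace("U+", ""), 16) for cp in code_points.split())
    return GlyphRecord(int(cid), glyph_name, scalars, character)


def is_consistent(record):
    """Check that the character column matches the code point column."""
    return "".join(chr(cp) for cp in record.code_points) == record.character


# Invented sample line: the CID and the glyph name are placeholders only.
sample = "1234\tuniAC00\tU+AC00\t가"
record = parse_line(sample)
print(record.cid, record.glyph_name, is_consistent(record))
```

A check like `is_consistent` is mainly useful for catching copy-and-paste mistakes between the code point and character columns.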

The sections below provide some brief details about the scope and purpose of each of the seven planned Supplements:

Supplement 0

Supplement 0 is meant to include the core glyphs that should be in modern Korean font resources, and serves as a minimal glyph set for today’s Unicode-based environments. Of course, the basic set of 2,350 modern hangul syllables is included, along with glyphs for ASCII, ISO Latin 1 (aka ISO/IEC 8859-1), punctuation, and symbols. Several of the glyphs, such as those for punctuation and digits, include both Western and Korean forms, and the short-term intent is to use the OpenType 'locl' (Localized Forms) GSUB feature to switch between them. The long-term goal is to define Standardized Variation Sequences (SVSes) for them as proposed in L2/17-056. The number of glyphs is a very modest 2,823.

Supplement 1

The second Supplement simply includes the glyphs for the remaining 8,822 modern hangul syllables to form the complete set of 11,172 that have been in Unicode since Version 2.0.
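The figure of 11,172 is the product of the 19 leading consonants, 21 vowels, and 28 trailing possibilities (27 trailing consonants plus none) of modern hangul. The short Python sketch below uses the syllable composition formula from the Unicode Standard to show where the number comes from; the function name `compose` is my own.

```python
# Jamo counts for modern hangul, per the Unicode Standard's hangul
# syllable composition algorithm.
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no trailing jamo"
S_BASE = 0xAC00  # code point of the first modern hangul syllable


def compose(l_index, v_index, t_index=0):
    """Return the pre-composed syllable for zero-based jamo indices."""
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


total = L_COUNT * V_COUNT * T_COUNT
print(total)         # 11172 modern hangul syllables
print(total - 2350)  # 8822 remain after the basic set in Supplement 0
print(compose(0, 0), compose(18, 20, 27))  # first and last: 가 힣
```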

Supplement 2

Supplement 2 includes the glyphs for the 4,888 hanja (aka ideographs) that are included in the KS X 1001 standard. The number of glyphs is actually 4,620, because 268 of the 4,888 hanja are genuine duplicates that are included due to multiple readings. (I am still fascinated to this day that one particular hanja, 樂, appears four times in the KS X 1001 standard due to multiple readings: 낙 nag, 락 rag, 악 ag, and 요 yo.)
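In Unicode, the 268 duplicate hanja from KS X 1001 are encoded as CJK Compatibility Ideographs in the range U+F900 through U+FA0B, each with a canonical decomposition to its unified ideograph. The Python sketch below uses the standard `unicodedata` module to illustrate the 4,888 versus 4,620 arithmetic and to find the duplicates of 樂; the names `KS_DUPLICATES` and `unified_form` are my own, and the sketch illustrates the encoding, not how the glyph set was actually built.

```python
import unicodedata

# The KS X 1001 duplicate hanja occupy U+F900..U+FA0B in Unicode.
KS_DUPLICATES = [chr(cp) for cp in range(0xF900, 0xFA0B + 1)]


def unified_form(ch):
    """Map a compatibility ideograph to its unified ideograph via NFC."""
    return unicodedata.normalize("NFC", ch)


print(len(KS_DUPLICATES))         # 268 duplicates
print(4888 - len(KS_DUPLICATES))  # 4620 distinct hanja

# Compatibility ideographs that are duplicates of U+6A02 樂.
duplicates_of_le = [ch for ch in KS_DUPLICATES if unified_form(ch) == "\u6A02"]
print([f"U+{ord(ch):04X}" for ch in duplicates_of_le])
```

Together with U+6A02 itself, the duplicates found by the last statement account for the four appearances of 樂 mentioned above.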

Supplement 3

The fourth Supplement includes 250 glyphs for enclosed digits, Latin characters, and hangul letters/syllables. The scope goes beyond what is found in the KS standards, and includes appropriate characters found in the Unicode blocks named Enclosed Alphanumerics, Dingbats, Enclosed CJK Letters and Months, and Enclosed Alphanumeric Supplement.

Supplement 4

This Supplement is meant to include glyphs for KS X 1001 compatibility, for the benefit of font developers who feel that they need to support this standard in its entirety. Included are glyphs for Japanese kana, Greek, Cyrillic, math (only the basic math symbols are included in Supplement 0), line-drawing characters, and other symbols. I should point out that this Supplement actually includes glyphs for characters outside of the KS X 1001 standard, such as U+03C2 ς GREEK SMALL LETTER FINAL SIGMA, which is needed to make the Greek functional, and additional kana and kana-related characters, such as U+30FC ー KATAKANA-HIRAGANA PROLONGED SOUND MARK, which is necessary for katakana.
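One quick, informal way to see which characters fall outside KS X 1001 is to try encoding them with Python's built-in `euc_kr` codec. The sketch below treats that codec's mapping table as a stand-in for the standard, which is an assumption and not an authoritative test. The helper name `in_ks_x_1001` is my own, and the two-byte check is there because the codec can emit longer sequences for some hangul syllables.

```python
def in_ks_x_1001(ch):
    """Return True if Python's euc_kr codec maps `ch` to a two-byte code.

    This uses the codec's table as a proxy for KS X 1001 membership.
    """
    try:
        return len(ch.encode("euc_kr")) == 2
    except UnicodeEncodeError:
        return False


# Greek sigma and final sigma, then katakana A and the prolonged sound mark.
for ch in "σς" + "アー":
    print(f"U+{ord(ch):04X} {ch} in KS X 1001: {in_ks_x_1001(ch)}")
```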

Supplement 5

Supplement 5 is meant to include a small set of pre-composed hangul syllables that fall outside the modern set of 11,172, and whose scope is well-defined. As opposed to the approach that was used for the Source Han and Noto CJK typeface designs, which involved cherry-picking the 500 most frequently used pre-modern hangul syllables, I figured that including pre-composed forms of the 160 pre-modern hangul syllables that are necessary for the Jeju dialect (제주말 jejumal) was more appropriate. The rest of the Supplement includes the nominal forms of combining jamo, along with the combining forms themselves. Included in the latter are five sets of leading jamo (the nominal forms serve as one of the sets, meaning that there are six sets in total), two sets of vowel jamo, and four sets of trailing jamo. Of course, this is modeled after what was done for the successful and broadly deployed Source Han and Noto CJK typeface designs. The OpenType 'ljmo' (Leading Jamo Forms), 'vjmo' (Vowel Jamo Forms), and 'tjmo' (Trailing Jamo Forms) GSUB features are expected to be used.

The 1,874 glyphs in this Supplement include a modest subset of 1,714 glyphs for combining jamo that can represent a staggering 1,638,750 hangul syllables (11,875 LV plus 1,626,875 LVT sequences), with the 11,172 modern hangul syllables being a very tiny subset.
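The syllable counts above can be reproduced from the number of conjoining jamo that Unicode encodes. The Python sketch below counts them from the code point ranges of the Hangul Jamo, Hangul Jamo Extended-A, and Hangul Jamo Extended-B blocks. That the leading and vowel ranges include the filler characters U+115F and U+1160 is my inference, made because those ranges reproduce the figures quoted above.

```python
# Inclusive code point ranges of conjoining jamo (Hangul Jamo plus the
# Extended-A and Extended-B blocks). The fillers U+115F and U+1160 are
# included, which is an assumption that matches the quoted totals.
LEADING = [(0x1100, 0x115F), (0xA960, 0xA97C)]
VOWEL = [(0x1160, 0x11A7), (0xD7B0, 0xD7C6)]
TRAILING = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def count(ranges):
    """Count the code points covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


l_count, v_count, t_count = count(LEADING), count(VOWEL), count(TRAILING)
lv = l_count * v_count  # leading + vowel sequences
lvt = lv * t_count      # leading + vowel + trailing sequences

print(l_count, v_count, t_count)  # 125 95 137
print(lv, lvt, lv + lvt)          # 11875 1626875 1638750
print(1874 - 1714)                # 160 pre-composed Jeju syllables
```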

Supplement 6

The seventh and final Supplement includes 3,540 glyphs for additional hanja beyond those in Supplement 2. Of course, glyphs for the 2,856 hanja in the KS X 1002 standard are included. The rest of the glyphs are for hanja found in the Korean Supreme Court’s list: 665 are encoded in the URO and Extensions A, B, E, and F, 18 need to be supported by the IVD via the soon-to-be-registered “KRName” IVD collection (see PRI #351), and one is serial number 02063 (aka UTC-01200 ⿰氵恩) of IRG Working Set 2015 that is expected to become Extension G.
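As a sanity check, the breakdown of this Supplement can be tallied in a few lines of Python. The dictionary keys are just labels of my own, and the numbers are those of this first draft.

```python
SUPPLEMENT_6_GLYPHS = 3540
KS_X_1002_HANJA = 2856

# Hanja from the Korean Supreme Court's list, grouped by how they
# are (or will be) represented in Unicode.
supreme_court_hanja = {
    "encoded in the URO and Extensions A, B, E, and F": 665,
    "supported via the KRName IVD collection": 18,
    "IRG Working Set 2015 (expected in Extension G)": 1,
}

remaining = SUPPLEMENT_6_GLYPHS - KS_X_1002_HANJA
print(remaining)                          # 684
print(sum(supreme_court_hanja.values()))  # 684
```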

In closing, I welcome any and all feedback. While I don’t expect any glyphs to be removed, some glyphs may be added, and some glyphs may move between Supplements. This first draft is currently under review by my friends at Sandoll Communications, along with the Korea Font Association (KFA). I suspect that their feedback will result in a lot of changes, so there is nothing wrong with waiting for the second draft, which is likely to be more solid, before providing feedback.

🐡

CJK Type Blog

CJK Fonts, Character Sets & Encodings.