Adobe-KR-9 Second Draft

This article picks up where the 2017-10-01 article left off, and provides details about the second draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The second draft of the Adobe-KR-9 character collection includes 22,612 glyphs (CIDs 0 through 22611) distributed among ten Supplements. When compared to the first draft, 35 glyphs were removed, ten glyphs were added, three Supplements were added, and the distribution of glyphs among some of the Supplements was changed. Because it is the second draft, the details are still subject to change—and most certainly will change, though I hope that the changes are minimal.

The table below details the number of glyphs per Supplement, their CID ranges, and a high-level summary of the glyphs in each:

Supplement	Glyphs	CID Range	Scope
0	2,625	0–2624	Core glyphs
1	2,003	2625–4627	Supplementary modern hangul syllables
2	6,814	4628–11441	Tertiary modern hangul syllables
3	4,620	11442–16061	Core hanja
4	250	16062–16311	Enclosed digits, Latin characters & hangul letters/syllables
5	136	16312–16447	Hangul tone marks, full-width Latin characters & vertical forms
6	346	16448–16793	KS X 1001 compatibility
7	404	16794–17197	Latin, Greek, Cyrillic & Kana
8	1,874	17198–19071	Pre-composed hangul syllables for the Jeju dialect (제주말 jejumal) & combining jamo
9	3,540	19072–22611	Supplementary hanja
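Because CIDs are assigned contiguously, each CID range in the table follows directly from the running total of the glyph counts. The short Python sketch below recomputes the ranges from the counts as a sanity check. The function and variable names are my own for illustration, and are not part of any Adobe-KR-9 data file.

```python
# Glyph counts per Supplement (0 through 9), copied from the table above.
GLYPHS_PER_SUPPLEMENT = [2625, 2003, 6814, 4620, 250, 136, 346, 404, 1874, 3540]


def cid_ranges(counts):
    """Return a (first_cid, last_cid) pair for each Supplement, in order."""
    ranges = []
    start = 0
    for count in counts:
        ranges.append((start, start + count - 1))
        start += count  # the next Supplement begins right after this one
    return ranges


RANGES = cid_ranges(GLYPHS_PER_SUPPLEMENT)
TOTAL_GLYPHS = sum(GLYPHS_PER_SUPPLEMENT)

for supplement, (first, last) in enumerate(RANGES):
    print(f"Supplement {supplement}: CIDs {first}-{last}")
print("Total glyphs:", TOTAL_GLYPHS)
```

If a later draft moves glyphs between Supplements, editing the list of counts is enough to regenerate every range.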

No actual glyphs are provided or shown, but as with the first draft, I have put together a data file that specifies for each glyph its CID, Unicode-based glyph name, the Unicode code point or sequence, and the actual character. A subsequent draft may include a glyph table to supplement the data file. The final published version will definitely include a glyph table that will almost certainly use representative glyphs based on the open source Source Han Serif (본명조) typeface.

I also prepared a mapping file that maps 277 code points to existing glyphs, 268 of which correspond to KS X 1001 hanja. I actually prepared this file for the first draft, and it did not change for this second draft.

The sections below provide some brief details about the scope and purpose of each of the ten tentative Supplements:

Supplement 0

Supplement 0 is meant to include the core glyphs that should be in modern Korean font resources, and serves as a minimal glyph set for today’s Unicode-based environments. Of course, glyphs for the basic set of 2,350 modern hangul syllables are included, along with glyphs for five high-frequency modern hangul syllables (U+B894 뢔, U+C330 쌰, U+C3BC 쎼, U+C4D4 쓔 & U+CB2C 쬬), ASCII, some ISO Latin 1 (aka ISO/IEC 8859-1), punctuation, and some symbols. Several of the glyphs, such as those for punctuation, include both Western and Korean forms, and the short-term intent is to use the OpenType 'locl' (Localized Forms) GSUB feature to switch between them. The long-term goal is to define Standardized Variation Sequences (SVSes) for them as proposed in L2/17-056. The number of glyphs is a very modest 2,625.

Supplement 1

The second Supplement includes the glyphs for an additional 2,003 modern hangul syllables that come from the union of those in the KS X 1002, KPS 9566 (DPRK), and GB 12052 (PRC) standards, along with a set of 418 additional high-frequency modern hangul syllables that was determined by KFA (Korea Font Association). 1,925 of these glyphs correspond to KS X 1002, 11 are specific to KPS 9566 (U+AD98 궘, U+AF31 꼱, U+AFE5 꿥, U+B2FE 닾, U+B570 땰, U+B6CC 뛌, U+B745 띅, U+C836 젶, U+CA34 쨴, U+CD44 쵄 & U+D5D5 헕), nine are specific to GB 12052 (U+AC03 갃, U+B609 똉, U+B9E7 맧, U+BBC3 믃, U+BF59 뽙, U+BFE5 뿥, U+C6D8 웘, U+CB94 쮔 & U+D63B 혻), and the remaining 58 are from the set of 418 additional high-frequency modern hangul syllables (360 of them are common with KS X 1002).

Supplement 2

Supplement 2 includes the glyphs for the remaining 6,814 modern hangul syllables to form the complete set of 11,172 that have been in Unicode since Version 2.0.
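The figure of 11,172 is the product of 19 leading consonants, 21 vowels, and 28 trailing positions (27 trailing consonants plus "none"), which is also how Unicode lays out the block that begins at U+AC00. The sketch below, which uses names of my own choosing, checks that the hangul syllable counts for Supplements 0 through 2 add up to that total, and shows the standard arithmetic for splitting a modern syllable into its jamo indices.

```python
S_BASE = 0xAC00                            # first modern hangul syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28     # T_COUNT includes "no trailing consonant"
MODERN_TOTAL = L_COUNT * V_COUNT * T_COUNT

# Basic set + five high-frequency (Supplement 0), Supplement 1, Supplement 2.
SUPPLEMENT_TOTAL = 2350 + 5 + 2003 + 6814


def decompose(code_point):
    """Return (leading, vowel, trailing) indices for a modern hangul syllable."""
    s_index = code_point - S_BASE
    if not 0 <= s_index < MODERN_TOTAL:
        raise ValueError(f"U+{code_point:04X} is not a modern hangul syllable")
    leading, rest = divmod(s_index, V_COUNT * T_COUNT)
    vowel, trailing = divmod(rest, T_COUNT)
    return leading, vowel, trailing


print(MODERN_TOTAL, SUPPLEMENT_TOTAL)
print(decompose(0xB894))  # one of the five high-frequency syllables in Supplement 0
```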

Supplement 3

Supplement 3 includes the glyphs for the 4,888 hanja (aka ideographs) that are included in the KS X 1001 standard. The number of glyphs is actually 4,620, because 268 of the 4,888 hanja are genuine duplicates that are included due to multiple readings.

(This Supplement is unchanged from Supplement 2 of the first draft.)
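To illustrate how duplicate hanja can share a glyph, here is a minimal Python sketch. It rests on an assumption that this article does not state: that the 268 duplicates are the code points U+F900 through U+FA0B in the CJK Compatibility Ideographs block. Each of those has a canonical decomposition to a unified ideograph, which the standard unicodedata module exposes through normalization. This is not the format of my mapping file, and the names below are for illustration only.

```python
import unicodedata

# Assumption (not stated in the article): the KS X 1001 duplicate hanja are
# U+F900..U+FA0B in the CJK Compatibility Ideographs block.
FIRST, LAST = 0xF900, 0xFA0B


def unified_counterpart(code_point):
    """Return the code point that a compatibility ideograph decomposes to."""
    decomposed = unicodedata.normalize("NFD", chr(code_point))
    return ord(decomposed)  # these decompositions are single code points


DUPLICATES = {cp: unified_counterpart(cp) for cp in range(FIRST, LAST + 1)}
UNIQUE_GLYPHS = 4888 - len(DUPLICATES)

print(len(DUPLICATES), UNIQUE_GLYPHS)
print(f"U+F900 -> U+{DUPLICATES[0xF900]:04X}")
```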

Supplement 4

The fifth Supplement includes 250 glyphs for enclosed digits, Latin characters, and hangul letters/syllables. The scope goes beyond what is found in the KS standards, and includes appropriate characters found in the Unicode blocks named Enclosed Alphanumerics, Dingbats, Enclosed CJK Letters and Months, and Enclosed Alphanumeric Supplement.

(This Supplement is unchanged from Supplement 3 of the first draft.)

Supplement 5

Supplement 5 includes glyphs for the hangul tone marks, full-width Latin characters, and vertical forms.

Supplement 6

This Supplement is meant to include glyphs for KS X 1001 compatibility, for the benefit of font developers who feel that they need to support this standard in its entirety. Included are glyphs for math (only the basic math symbols are included in Supplement 0), line-drawing characters, and other symbols.

(This Supplement was formerly Supplement 4 of the first draft.)

Supplement 7

Supplement 7 is intended to include glyphs for foreign languages, such as those for extended Latin, Greek, Cyrillic, and Japanese kana. While most of the characters that are supported by these glyphs are in the KS X 1001 standard, I need to point out that this Supplement actually includes glyphs for characters outside of that standard, such as U+03C2 ς GREEK SMALL LETTER FINAL SIGMA for making the Greek functional, and additional kana and kana-related characters, such as U+30FC ー KATAKANA-HIRAGANA PROLONGED SOUND MARK, which is necessary for katakana.

Supplement 8

Supplement 8 is meant to include a small set of pre-composed hangul syllables that fall outside the modern set of 11,172, and whose scope is well-defined. As opposed to the approach that was used for the Source Han and Noto CJK typeface designs, which involved cherry-picking the 500 most frequently-used pre-modern hangul syllables, I figured that including pre-composed forms of the 160 pre-modern hangul syllables that are necessary for the Jeju dialect (제주말 jejumal) was more appropriate. The rest of the Supplement includes the nominal forms of combining jamo, along with the combining forms themselves. Included in the latter are five sets of leading jamo (the nominal forms serve as one of the sets, meaning that there are six sets in total), two sets of vowel jamo, and four sets of trailing jamo. Of course, this is modeled after what was done for the successful and broadly-deployed Source Han and Noto CJK typeface designs. The OpenType 'ljmo' (Leading Jamo Forms), 'vjmo' (Vowel Jamo Forms), and 'tjmo' (Trailing Jamo Forms) GSUB features are expected to be used.

The 1,874 glyphs in this Supplement include a modest subset of 1,714 glyphs for combining jamo that can represent a staggering 1,638,750 hangul syllables (11,875 LV plus 1,626,875 LVT sequences), with the 11,172 modern hangul syllables being a very tiny subset.

(This Supplement is unchanged from Supplement 5 of the first draft.)
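The LV and LVT figures above can be reproduced with a few lines of Python. The sketch assumes that the counts cover every conjoining jamo in the Hangul Jamo, Hangul Jamo Extended-A, and Hangul Jamo Extended-B blocks, with the leading and vowel fillers (U+115F and U+1160) included. That breakdown is my assumption for illustration, as only the totals are given above.

```python
# Inclusive code point ranges for conjoining jamo.
# Assumption: the fillers U+115F and U+1160 are counted.
LEADING = [(0x1100, 0x115F), (0xA960, 0xA97C)]
VOWEL = [(0x1160, 0x11A7), (0xD7B0, 0xD7C6)]
TRAILING = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def count(ranges):
    """Return the number of code points covered by inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


L, V, T = count(LEADING), count(VOWEL), count(TRAILING)
LV = L * V       # syllables without a trailing consonant
LVT = LV * T     # syllables with a trailing consonant

print(L, V, T)
print(LV, LVT, LV + LVT)
```

The 11,172 modern syllables are well under one percent of that total, which is why combining jamo are a far more economical way to cover pre-modern hangul than pre-composed glyphs.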

Supplement 9

The tenth and final Supplement includes 3,540 glyphs for additional hanja beyond those in Supplement 3. Of course, glyphs for the 2,856 hanja in the KS X 1002 standard are included. The rest of the glyphs are for hanja found in the Korean Supreme Court’s list, 665 of which are encoded in the URO and Extensions A, B, E, and F. 18 are supported by the IVD via the recently-registered KRName IVD collection, and one outlier will be in Extension G with U+30726 as its tentative code point.

(This Supplement is unchanged from Supplement 6 of the first draft, except for referencing a tentative Extension G code point, and changing its ordering.)

In closing, I again welcome any and all feedback. While I don’t expect any glyphs to be removed at this point, some glyphs may still be added, and the distribution of some glyphs among the Supplements may again change. This second draft is currently under review by my friends at Sandoll Communications, along with the Korea Font Association (KFA).

🐡

CJK Type Blog

CJK Fonts, Character Sets & Encodings.