GB 18030 Oddity or Design Flaw?

I spent a couple of days curling up with GB 18030 (both versions: 2000 and 2005), which is the PRC’s latest and greatest national character set standard, and came across an oddity that my gut tells me is a design flaw. At the very least, it is an issue of which font developers need to be aware.

What I found were eight CJK Unified Ideographs whose left-side Radical #130 (肉, meat) uses the Traditional Chinese or Taiwan-style form, instead of the expected Simplified Chinese or PRC-style form, which looks the same as Radical #74 (月, moon). Screen captures from the latest Unicode Code Charts, whose glyphs agree with both versions of GB 18030, are shown below:

More details about these eight character pairs are shown below, specifically how they map to Chinese national standards (they all, by definition, map to GB 18030):

Radical #130 Character   National Standard     Radical #74 Character   National Standard
䏓 (U+43D3)              CNS 11643 Plane 3     朊 (U+670A)             Big 5 & GB 2312
䑃 (U+4443)              CNS 11643 Plane 3     朦 (U+6726)             Big 5 & GB 2312
肦 (U+80A6)              CNS 11643 Plane 3     朌 (U+670C)             CNS 11643 Plane 3 & Hong Kong SCS
胊 (U+80CA)              Big 5                 朐 (U+6710)             Big 5 & GB 2312
胐 (U+80D0)              Big 5                 朏 (U+670F)             Big 5
脁 (U+8101)              Big 5                 朓 (U+6713)             Big 5
脧 (U+8127)              Big 5                 朘 (U+6718)             Big 5
膧 (U+81A7)              Big 5                 朣 (U+6723)             Big 5

(NOTE: Depending on what fonts are being used by your device, the correct glyph may or may not appear in the above table.)
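For readers who want to poke at these pairs themselves, here is a minimal Python sketch, not anything taken from the standard. It lists the sixteen code points from the table and encodes each one with Python's built-in gb18030 codec, which confirms that they all round-trip through GB 18030. The names PAIRS and describe are my own, purely for illustration. Note that this says nothing about glyph shapes, which live in the font and not in the encoding.

```python
# The eight (Radical #130, Radical #74) code point pairs from the table above.
PAIRS = [
    (0x43D3, 0x670A),
    (0x4443, 0x6726),
    (0x80A6, 0x670C),
    (0x80CA, 0x6710),
    (0x80D0, 0x670F),
    (0x8101, 0x6713),
    (0x8127, 0x6718),
    (0x81A7, 0x6723),
]


def describe(cp):
    """Return (U+XXXX label, character, GB 18030 bytes as hex) for a code point."""
    ch = chr(cp)
    encoded = ch.encode("gb18030")  # Python's standard-library codec
    return f"U+{cp:04X}", ch, encoded.hex().upper()


if __name__ == "__main__":
    for radical_130, radical_74 in PAIRS:
        print(describe(radical_130), describe(radical_74))
```

The characters in the URO come out as two-byte codes, and the byte values make it easy to look each character up in the printed standard.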

I understand why the designers of GB 18030 chose to make the glyphs for these eight character pairs distinct, but it still violates Simplified Chinese design principles, which are closely followed elsewhere in GB 18030. In this case, I feel that consistency should outweigh preserving character distinction in terms of its form.

Within the confines of GB 18030, both in terms of its scope (all of the URO and Extension A) and its use of a single glyph per code point (meaning a single-region font), this is not a problem per se. The inconsistency nonetheless raises two issues:

  • When developing multiple-region fonts, such as Pan-Chinese or Pan-CJK, this can become a sticking point for the glyphs for other regions, including but not limited to those that use Traditional Chinese, meaning Taiwan and Hong Kong.
  • When the scope of GB 18030 extends beyond the URO and Extension A, the chance of encountering similar character pairs is relatively high, and if the Radical #130 character is within the URO or Extension A, it will require similar treatment.

With regard to the second issue, I managed to quickly find a real-world case, meaning that this issue is no longer a theoretical possibility, but rather a good dose of reality:

This suggests that GB 18030 will at some point, perhaps after it fully assimilates all of Extension B (interestingly, GB 18030-2005 includes Extension B in its entirety on pp. 240–443), change the form of the left-side radical of U+813C (脼) so that it is distinct from U+23377. I am also convinced that there are other such cases that would require similar treatment.
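The reasoning behind the U+813C and U+23377 case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block ranges are the ones originally defined for the URO, Extension A, and Extension B (later Unicode versions appended a few code points to each), and the names block_of, GLYPH_SCOPE, and is_latent_case are hypothetical. Finding candidate pairs in the first place requires radical data, which this sketch does not attempt.

```python
# Block ranges as originally defined (assumption: later appended code points are ignored).
BLOCKS = [
    ("URO", 0x4E00, 0x9FA5),
    ("Extension A", 0x3400, 0x4DB5),
    ("Extension B", 0x20000, 0x2A6D6),
]

# Blocks for which GB 18030 currently prescribes glyphs, per the discussion above.
GLYPH_SCOPE = {"URO", "Extension A"}


def block_of(cp):
    """Return the name of the ideograph block containing cp, or 'other'."""
    for name, first, last in BLOCKS:
        if first <= cp <= last:
            return name
    return "other"


def is_latent_case(radical_130_cp, radical_74_cp):
    """True when the Radical #130 character is already in scope but its
    Radical #74 counterpart is not, so the pair only becomes an issue
    once the scope of the standard grows."""
    return (block_of(radical_130_cp) in GLYPH_SCOPE
            and block_of(radical_74_cp) not in GLYPH_SCOPE)


if __name__ == "__main__":
    print(block_of(0x813C), block_of(0x23377), is_latent_case(0x813C, 0x23377))
    print(block_of(0x80CA), block_of(0x6710), is_latent_case(0x80CA, 0x6710))
```

The eight pairs in the table all have both members in scope, so they are not latent. The U+813C case is, because its counterpart sits in Extension B.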

The main thing about this issue that bothers me is its open-ended or never-ending nature.

Welcome to my world. ☺
