The Moji_Joho IVD Collection was first introduced via PRI 259 last December, which initiated a mandatory—according to UTS #37—90-day Public Review Period. The submitter received three sets of comments, and after making minor changes, submitted the materials for registering the new IVD collection, along with its initial set of IVSes. The Moji_Joho IVD Collection and its initial set of IVSes were officially registered on May 16, 2014, which represents the fourth version of the IVD (Ideographic Variation Database).
The 2014-05-16 version of the IVD thus registers a new IVD collection, Moji_Joho, along with its initial set of 10,710 IVSes, 9,685 of which are shared—through mutual agreement—with the registered Hanyo-Denshi IVD Collection. Some enhancements were also made to the IVD_Stats.txt file, specifically that the shared IVSes are explicitly listed at the end of the file.
One additional statistic is that the highest VS (Variation Selector) used is currently VS48 (U+E011F), meaning that 32 of the 240 VSes allocated for IVS use are now being used. Of course, it is relatively easy to figure out with which BC (Base Character) VS48 is used, and an educated guess would be that it is either U+9089 (邉) or U+908A (邊). It is the former:
9089 E011F; Moji_Joho; MJ026193
The highest VS used with the latter BC is currently VS37 (U+E0114).
As the IVD Registrar, I’d like to use this opportunity to thank everyone who made the effort to review PRI 259. I’d also like to congratulate those who prepared the Moji_Joho IVD Collection for both review and registration.
Shown above is the top portion of the printed version of Taiwan’s MOE 國字標準字體方體母稿 (Fangti) standard. (For those who are interested, its ISBN is 957-00-8392-1.) What is provided online are effectively scans of the 常用字 and 次常用字 sections, which contain 4,808 and 6,343 hanzi, respectively. Although included in the printed version of the standard, the 罕用字 section, which contains 1,907 additional hanzi, is not provided online. In terms of sheer numbers, these 1,907 additional hanzi appear to completely cover Big Five (both levels) and CNS 11643 Planes 1 and 2.
The purpose of today’s article is to describe two additional issues in this glyph standard that my new friend, Kuang-che Wu (吳光哲) of Google, recently found.
What is shown above is a trivial difference in a two-component structure that is present in many ideographs, such as 滕 (U+6ED5), 縢 (U+7E22), 螣 (U+87A3), 謄 (U+8B04), and 騰 (U+9A30). This difference is, of course, unifiable. What this article is about is consistency within a standard, mainly referring to the source standards from each region. The focus of this article is on the forms used in ROC (Republic of China; 中華民國 Zhōnghuá Mínguó), which is more commonly referred to Taiwan (臺灣 or 台灣 Táiwān).
Recent work has led me to more closely explore U+4548 (☞䕈☜), which is in CJK Unified Ideographs Extension A. (What is shown in parentheses in the previous sentence is likely to be different than what is shown in the excerpt above.)
The image above is an excerpt from the latest Extension A Code Charts. At first glance, everything seem normal. The differences between the G (China) and T (Taiwan) glyphs are expected, and perhaps more importantly, unifiable.