At seemingly every opportunity, whether via this blog or during public speaking engagements, I have made it abundantly clear that the Adobe-branded Source Han families share the same glyph set as the corresponding Google-branded Noto CJK families. That is simply because it is true. What requires a bit of explanation, however, is how the two typeface designs—Source Han Sans and Source Han Serif—differ. That is what this particular article is about.
As the Project Architect of these Pan-CJK typeface families, I have my fingers on all of the data that was used during their development, and for preparing each release. I can therefore impart some useful tidbits of information that cannot be found elsewhere.
Names & Weights
Because the Source Han families are different typeface styles and designs, their names necessarily differ, as do the names of some of their seven weights. The Normal weight is specific to Source Han Sans, and the SemiBold weight is specific to Source Han Serif. The Noto CJK fonts have their own weight-name peculiarities.
Interestingly, and somewhat predictably, some people got a bit bent out of shape with regard to the localized Traditional Chinese name of Source Han Serif, 思源宋體, perhaps because they were expecting 思源明體 instead. The use of 宋 in lieu of 明 is actually intentional. Keep in mind that Source Han Serif is a Pan-CJK typeface design, and that the scope of its Traditional Chinese coverage is limited to Big Five. As a result, and as will be stated in the next section, the moment one tries to use an ideograph that is outside the scope of Big Five, the likelihood of its glyph following Simplified Chinese conventions, not Traditional Chinese ones, is relatively high. In other words, the use of 宋 in its Traditional Chinese name can be considered a gentle or subtle reminder of this.
Regional Standards Coverage
First and foremost, the scope of coverage of the Source Han families, in terms of ideographs, is closely aligned with a limited number of key regional standards, specifically GB 18030 for China, Big Five for Taiwan, the JIS standards for Japan, and the KS X 1001 and KS X 1002 standards for Korea. Source Han Sans and Source Han Serif currently differ in that the former supports Hong Kong SCS, at least in terms of code points, and the latter includes only a very small number of HK glyphs, mainly for URO gap-filling purposes.
It is important to understand that as soon as one tries to use an ideograph that is outside the scope of a particular regional standard, there is no assurance that its shape will correspond to the conventions of that region.
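One rough way to check whether an ideograph falls inside the scope of a regional standard is to try encoding it with the corresponding legacy codec. The sketch below uses Python's built-in big5 codec as an approximation of Big Five coverage. The helper name in_big5 is made up for illustration, and the codec's mapping table is not necessarily identical to the character set that was used to scope the fonts.

```python
def in_big5(ch: str) -> bool:
    """Return True if the character can be encoded with Python's 'big5' codec."""
    try:
        ch.encode("big5")
        return True
    except UnicodeEncodeError:
        return False


# U+6F22 is the Traditional Chinese form; U+6C49 is its Simplified counterpart.
for ch in "漢汉":
    status = "inside" if in_big5(ch) else "outside"
    print(f"U+{ord(ch):04X} {ch}: {status} Big Five (per the codec)")
```

A character reported as outside is one whose glyph, even in the Traditional Chinese fonts, may well follow the conventions of another region.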
Source Han Sans was released almost three years ago, right after Unicode Version 7.0 was released. We did manage to sneak in four ideographs—U+9FCD 鿍, U+9FCE 鿎, U+9FCF 鿏, and U+9FD0 鿐—that were expected to be encoded in the next version of Unicode and deemed stable enough to support, and indeed they were included in Unicode Version 8.0 in June of the following year, along with five additional ideographs, U+9FD1 through U+9FD5.
Source Han Serif, which was released last month, includes glyphs for the five additional ideographs that were included in Unicode Version 8.0, along with several characters that were added in Version 9.0 and others that are expected to be included in Version 10.0 next month. This includes three Extension F ideographs that correspond to Adobe-Japan1-6 kanji.
In terms of additional characters supported in Source Han Serif, there are 53: U+03C2 (1.1), U+2B95 (7.0), U+312E (10.0), U+312F (11.0?), U+9FD1 through U+9FD5 (8.0), U+9FD6 through U+9FEA (10.0), U+1F12F (10.0), U+1F19B through U+1F1AC (9.0), U+1F23B (9.0), U+2D544 (10.0), U+2E278 (10.0), and U+2E6EA (10.0). U+312F, which is expected to be included in Unicode Version 11.0, was deemed stable enough to support.
Of course, you can expect these additional characters to be supported in Source Han Sans Version 2.000, although I am questioning the inclusion of the HK glyph for U+9FD2.
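The figure of 53 additional characters can be double-checked by expanding the ranges listed above. This is only a quick tally sketch; the name ADDED is arbitrary, and the code points are transcribed from the list in this section.

```python
# Inclusive (first, last) ranges transcribed from the list above;
# single code points are written as one-element ranges.
ADDED = [
    (0x03C2, 0x03C2),
    (0x2B95, 0x2B95),
    (0x312E, 0x312F),
    (0x9FD1, 0x9FD5),
    (0x9FD6, 0x9FEA),
    (0x1F12F, 0x1F12F),
    (0x1F19B, 0x1F1AC),
    (0x1F23B, 0x1F23B),
    (0x2D544, 0x2D544),
    (0x2E278, 0x2E278),
    (0x2E6EA, 0x2E6EA),
]

total = sum(last - first + 1 for first, last in ADDED)
print(total)  # 53
```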
Unicode Variation Sequences
The UVSes (Unicode Variation Sequences) of Source Han Sans and Source Han Serif, which are specified in the Format 14 'cmap' subtable, differ in two important ways.
First, the Simplified Chinese fonts and font instances of Source Han Serif include nine Standardized Variants that correspond to nine CJK Compatibility Ideographs. Please read Event #4 in this CJK Type Blog article for the background.
Second, Source Han Serif includes three additional Adobe-Japan1 IVSes (Ideographic Variation Sequences) whose base characters are in Extension F, and which are expected to be registered in July. See PRI #349 for more details.
Source Han Sans Version 2.000 will include these minor, but important, enhancements. Although its latest Simplified Chinese fonts and font instances do not include the nine Standardized Variants, I added the SourceHanSans_CN_sequences.txt UVS definition file to the project.
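For readers who want to compare the UVS coverage of two sets of fonts, a UVS definition file can be reduced to a table keyed by base character and variation selector. The sketch below assumes that each line has the form "base selector; collection; CID+n", which is an assumption about the file layout and is not stated in this article. The two sample lines are invented placeholders, not actual entries from SourceHanSans_CN_sequences.txt, and parse_sequences is a hypothetical helper.

```python
from collections import Counter


def parse_sequences(text):
    """Parse UVS definition lines into {(base, selector): (collection, cid)}."""
    sequences = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        pair, collection, cid = (field.strip() for field in line.split(";"))
        base, selector = (int(value, 16) for value in pair.split())
        sequences[(base, selector)] = (collection, int(cid.replace("CID+", "")))
    return sequences


# Placeholder data for illustration only; the CID values are made up.
SAMPLE = """\
# base selector; collection; CID
4FAE FE00; Standardized_Variants; CID+100
5026 E0100; Adobe-Japan1; CID+200
"""

parsed = parse_sequences(SAMPLE)
counts = Counter(collection for collection, _ in parsed.values())
print(counts)
```

Running the same parser over two definition files and comparing the resulting key sets would show which sequences one family supports and the other does not.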
Errors—Known Or New
When one is developing fonts with a large number of glyphs, errors are all but inevitable. Given the large number of characters, I would go so far as to guarantee that there are errors still lurking in both typeface designs. That is the nature of the beast.
Luckily, open source projects, such as those hosted on GitHub, have an excellent mechanism for reporting and tracking issues. I am grateful to those who find and report issues. Some error reports may constitute nitpicking the typeface design, in which case it is up to the designer to decide whether to address such issues, and how. Others are genuine errors that need to be addressed.
The bottom line is that if something looks like an error, it probably is, especially if a particular glyph is strikingly different when comparing the two typeface designs. The prudent thing to do is to report the issue, and then let me or the designer make the determination. Also keep in mind that standards are not error-free.
The extent to which glyphs can be shared across languages depends on two primary factors: typeface style and typeface design. In the context of this section, typeface style is best described as sans serif versus serif in the generic sense. Typeface design refers to the particulars of the actual design, which make it distinct from other typeface designs that are based on the same typeface style.
Regardless of the typeface design, some ideographs will always require more than one glyph, at least when designing sans serif and serif typefaces. U+5973 女 serves as a good example, showing a shared JP/KR glyph and separate CN and TW ones:
The following image illustrates the difference in glyph sharing based on typeface design, whereby the sans serif design includes separate JP/KR and CN/TW glyphs for U+91D1 金, but the serif design includes only a single glyph that serves all languages:
Additional JP Glyphs
The original intention of the Source Han families was to include approximately 6,000 additional JP glyphs, above and beyond those in the JIS standards and in Adobe-Japan1-6. In the case of Source Han Sans, over 4,000 of these additional JP glyphs needed to be removed prior to Version 1.000, in order to make room for arguably more critical Traditional Chinese glyphs.
Because Source Han Serif does not provide meaningful Hong Kong support in Version 1.000, nearly all of the 6,000 or so additional JP glyphs are included. Please note that approximately 4,000 of them are targeted for removal as part of the Version 2.000 update, in order to make room for the additional HK glyphs that will be necessary to support Hong Kong in a meaningful fashion, which leads us to the last section of this article.
The practical effect of this, especially when comparing the current Source Han Sans and Source Han Serif, is that the glyphs for some characters may appear differently, particularly when using the Japanese and Korean fonts and font instances, or when language-tagging for those languages.
One word → Biáng
Source Han Sans supports the code points that correspond to the Hong Kong SCS-2008 standard, but its glyphs do not necessarily follow HK conventions. We decided to wait until the Hong Kong SCS-2016 was available to provide more meaningful HK support, which will come in the form of appropriate glyphs and as separate HK fonts and font instances.
For Source Han Serif, we decided that it was prudent to defer Hong Kong support to Version 2.000, mainly in order to avoid doing work that would eventually need to be redone. The small number of HK glyphs that are included are for gap-filling purposes. My estimate is that we will need to add approximately 4,500 new HK glyphs in order to provide adequate support.
In retrospect, it seems that it was A Good Idea™ to include the additional JP ideographs, as they serve as an excellent buffer for supporting additional glyphs. They’re nice to have, but when push comes to shove, some of them can be removed to make room for more critical glyphs.
Source Han Serif & Noto Serif CJK Version 1.001
In closing, I’d like to announce that Source Han Serif and Noto Serif CJK Version 1.001 were released today, along with updated versions of the multiple-family Super OTCs that include these latest font instances. Some of the details in this article are based on Source Han Serif Version 1.001.