2 THE NOT-SO-SILENT TYPE
1. Introduction
Typing logographic languages such as Chinese is more difficult than typing
alphabetic languages, where each letter can be represented by one key. There is
no way to fit the tens of thousands of Chinese characters that exist onto a single
keyboard. Despite this obvious challenge, technologies have developed which
make typing in Chinese possible. To enable the input of Chinese characters, a
writer will generally use a keyboard app with an “Input Method Editor” (IME).
IMEs offer a variety of approaches to inputting Chinese characters, including via
handwriting, voice, and optical character recognition (OCR). One popular phonetic
input method is Zhuyin, and shape or stroke-based input methods such as Cangjie
or Wubi are commonly used as well. However, the most popular way of typing in
Chinese is the pinyin method, which is based on the pinyin romanization of
Chinese characters and is used by nearly 76% of mainland Chinese keyboard users.
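As a rough illustration of the pinyin method, the following Python sketch maps a typed pinyin string to candidate characters from which the user then picks. The `CANDIDATES` table and the `candidates` function are hypothetical and are not taken from any of the apps discussed in this report; real IMEs rely on far larger dictionaries and on statistical ranking of candidates.

```python
# Toy pinyin lookup table: hypothetical and deliberately tiny, for illustration only.
CANDIDATES = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "豪"],
    "nihao": ["你好"],
}


def candidates(pinyin):
    """Return the candidate characters or words for a typed pinyin string.

    Spaces and letter case are ignored; unknown input yields an empty list.
    """
    key = pinyin.replace(" ", "").lower()
    return CANDIDATES.get(key, [])


if __name__ == "__main__":
    for typed in ("ni", "hao", "ni hao"):
        print(typed, "->", candidates(typed))
```

Because a single syllable such as "ni" corresponds to many characters, longer strings of syllables give an IME more context to narrow down the intended characters.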
All of the keyboard apps we analyze in this report fall into the category of input
method editors (IMEs) that offer pinyin input. These keyboard apps are particularly
interesting because they have grown to accommodate the challenge of allowing
users to type Chinese characters quickly and easily. While many keyboard apps
operate locally, solely within a user’s device, IME-based keyboard apps often have
cloud features which enhance their functionality. Because of the complexities
of predicting which characters a user may want to type next, especially in
logographic languages like Chinese, IMEs often offer “cloud-based” prediction services
which reach out over the network. Enabling “cloud-based” features in these apps
means that longer strings of syllables that users type will be transmitted to servers
elsewhere. As many have previously pointed out, “cloud-based” keyboards and
input methods can function as vectors for surveillance and essentially behave
as keyloggers. While what users type is in transit from their device to the
cloud, it is additionally vulnerable to network attackers if not properly
secured. This report is not about how operators of cloud-based IMEs read users’
keystrokes, which is a phenomenon that has already been extensively studied and
documented. This report is primarily concerned with the issue of protecting this
sensitive data from network eavesdroppers.
In this report, we analyze the security of cloud-based pinyin keyboard apps from
nine vendors: Baidu, Honor, Huawei, iFlytek, OPPO, Samsung, Tencent, Vivo, and
Xiaomi. We examined these apps’ transmission of users’ keystrokes for vulnerabil-