The not-so-silent type: Vulnerabilities across keyboard apps reveal keystrokes to network eavesdroppers


By Jerey Knockel, Mona Wang, and Zoë Reichert

APRIL 23, 2024

RESEARCH REPORT #175


Electronic version first published by the Citizen Lab in 2024. This work can be accessed through https://citizenlab.ca/2024/04/vulnerabilities-across-keyboard-apps-reveal-keystrokes-to-network-eavesdroppers/.

Document Version: 1.0.

The Creative Commons Attribution-ShareAlike 4.0 license under which this report

is licensed lets you freely copy, distribute, remix, transform, and build on it, as

long as you:

• give appropriate credit

• indicate whether you made changes

• use and link to the same CC BY-SA 4.0 license

However, any rights in excerpts reproduced in this report remain with their respec-

tive authors; and any rights in brand and product names and associated logos

remain with their respective owners. Uses of these that are protected by copyright

or trademark rights require the rightsholder’s prior written agreement.

About the Citizen Lab, Munk School of Global Affairs & Public Policy, University of Toronto

The Citizen Lab is an interdisciplinary laboratory based at the Munk School of

Global Aairs & Public Policy, University of Toronto, focusing on research, develop-

ment, and high-level strategic policy and legal engagement at the intersection of

information and communication technologies, human rights, and global security.

We use a “mixed methods” approach to research that combines methods from

political science, law, computer science, and area studies. Our research includes

investigating digital espionage against civil society, documenting Internet filtering

and other technologies and practices that impact freedom of expression online,

analyzing privacy, security, and information controls of popular applications, and

examining transparency and accountability mechanisms relevant to the relation-

ship between corporations and state agencies regarding personal data and other

surveillance activities.

Acknowledgements

We would like to thank Jedidiah Crandall, Jakub Dalek, Pellaeon Lin, and Sarah

Scheler for their guidance and review of this report. Research for this project

was supervised by Ron Deibert.

Suggested Citation

Jerey Knockel, Mona Wang, and Zoë Reichert. “The not-so-silent type: Vulnera-

bilities across keyboard apps reveal keystrokes to network eavesdroppers,” Cit-

izen Lab Report No. 175, University of Toronto, April 2024. Available at: https:

//citizenlab.ca/2024/04/vulnerabilities-across-keyboard-apps-reveal-keystroke

s-to-network-eavesdroppers/.

Contents

Key findings
1. Introduction
2. Related work
3. Methodology
4. Findings
4.1. Tencent
4.2. Baidu
4.3. iFlytek
4.4. Samsung
4.5. Huawei
4.6. Xiaomi
4.7. OPPO
4.8. Vivo
4.9. Honor
5. Other affected keyboard apps
6. Coordinated disclosure
6.1. Barriers to users receiving security updates
6.2. Language barriers in responsible disclosures
7. Limitations
8. Discussion
8.1. Impact of these vulnerabilities
8.2. How did these vulnerabilities arise
8.3. Can we systemically address these vulnerabilities?
9. Summary of recommendations
A. Known affected software
B. Disclosure timelines

CITIZEN LAB RESEARCH REPORT NO. 175 1

We urge users to install the latest updates to their keyboard apps and to keep their mobile operating systems up to

date. We also recommend that at-risk users consider switching

from a cloud-based keyboard app to one that operates entirely

on-device.

Key ndings

›

We analyzed the security of cloud-based pinyin keyboard apps from

nine vendors — Baidu, Honor, Huawei, iFlytek, OPPO, Samsung, Ten-

cent, Vivo, and Xiaomi — and examined their transmission of users’

keystrokes for vulnerabilities.

›

Our analysis revealed critical vulnerabilities in keyboard apps from

eight out of the nine vendors in which we could exploit that vulnera-

bility to completely reveal the contents of users’ keystrokes in transit.

Most of the vulnerable apps can be exploited by an entirely passive

network eavesdropper.

›

Combining the vulnerabilities discovered in this and our previous re-

port analyzing Sogou’s keyboard apps, we estimate that up to one billion users are affected by these vulnerabilities. Given the scope of these

vulnerabilities, the sensitivity of what users type on their devices, the

ease with which these vulnerabilities may have been discovered, and

that the Five Eyes have previously exploited similar vulnerabilities in

Chinese apps for surveillance, it is possible that such users’ keystrokes

may have also been under mass surveillance.

›

We reported these vulnerabilities to all nine vendors. Most vendors

responded, took the issue seriously, and fixed the reported vulnerabil-

ities, although some keyboard apps remain vulnerable.

›

We conclude our report by summarizing our recommendations to var-

ious stakeholders to attempt to reduce future harm from apps which

might feature similar vulnerabilities.

2 THE NOT-SO-SILENT TYPE

1. Introduction

Typing logographic languages such as Chinese is more difficult than typing alphabetic languages, where each letter can be represented by one key. There is

no way to fit the tens of thousands of Chinese characters that exist onto a single

keyboard. Despite this obvious challenge, technologies have developed which

make typing in Chinese possible. To enable the input of Chinese characters, a

writer will generally use a keyboard app with an “Input Method Editor” (IME).

IMEs oer a variety of approaches to inputting Chinese characters, including via

handwriting, voice, and optical character recognition (OCR). One popular phonetic

input method is Zhuyin, and shape or stroke-based input methods such as Cangjie

or Wubi are commonly used as well. However, used by nearly 76% of mainland

Chinese keyboard users, the most popular way of typing in Chinese is the pinyin

method, which is based on the pinyin romanization of Chinese characters.

All of the keyboard apps we analyze in this report fall into the category of input

method editors (IMEs) that offer pinyin input. These keyboard apps are particularly

interesting because they have grown to accommodate the challenge of allowing

users to type Chinese characters quickly and easily. While many keyboard apps

operate locally, solely within a user’s device, IME-based keyboard apps often have

cloud features which enhance their functionality. Because of the complexities

of predicting which characters a user may want to type next, especially in logographic languages like Chinese, IMEs often offer “cloud-based” prediction services

which reach out over the network. Enabling “cloud-based” features in these apps

means that longer strings of syllables that users type will be transmitted to servers

elsewhere. As many have previously pointed out, “cloud-based” keyboards and

input methods can function as vectors for surveillance and essentially behave

as keyloggers. While the content of what users type is traveling from their device

to the cloud, it is additionally vulnerable to network attackers if not properly se-

cured. This report is not about how operators of cloud-based IMEs read users’

keystrokes, which is a phenomenon that has already been extensively studied and

documented. This report is primarily concerned with the issue of protecting this

sensitive data from network eavesdroppers.

In this report, we analyze the security of cloud-based pinyin keyboard apps from

nine vendors: Baidu, Honor, Huawei, iFlytek, OPPO, Samsung, Tencent, Vivo, and

Xiaomi. We examined these apps’ transmission of users’ keystrokes for vulnerabil-


ities. Our analysis revealed critical vulnerabilities in keyboard apps from eight out

of the nine vendors — all but Huawei — in which we could exploit that vulnerability

to completely reveal the contents of users’ keystrokes in transit.

Between this report and our Sogou report, we estimate that close to one billion

users are aected by this class of vulnerabilities. Sogou, Baidu, and iFlytek IMEs

alone comprise over 95% of the market share for third-party IMEs in China, which

are used by around a billion people. In addition to the users of third party keyboard

apps, we found that the default keyboards on devices from three manufacturers

(Honor, OPPO, and Xiaomi) were also vulnerable to our attacks. Devices from

Samsung and Vivo also bundled a vulnerable keyboard, but it was not used by

default. In 2023, Honor, OPPO, and Xiaomi alone comprised nearly 50% of the

smartphone market in China.

Having the capability to read what users type on their devices is of interest to

a number of actors — including government intelligence agencies that operate

globally — because it may encompass exceptionally sensitive information about

users and their contacts including financial information, login credentials such as

usernames or passwords, and messages that are otherwise end-to-end encrypted.

Given the known capabilities of state actors, and that Five Eyes agencies have

previously exploited similar vulnerabilities in Chinese apps for the express purpose

of mass surveillance, it is possible that we were not the first to discover these

vulnerabilities and that they have previously been exploited on a mass scale for

surveillance purposes.

We reported these issues to all eight of the vendors in whose keyboards we found

vulnerabilities. Most vendors responded, took the issue seriously, and fixed the

reported vulnerabilities, although some keyboard apps remain vulnerable. Users

should keep their apps and operating systems up to date. We recommend that

they consider switching from a cloud-based keyboard app to one that operates

entirely on-device if they are concerned about these privacy issues.

The remainder of this report is structured as follows. In the “Related work” section,

we outline previous security and privacy research that has been conducted on

IME apps and past research which relates to issues of encryption in the Chinese

app ecosystem. In “Methodology”, we describe the reverse engineering tools and

techniques we used to analyze the above apps. In the “Findings” section, we ex-

plain the vulnerabilities we discovered in each app and (where applicable) how we


exploited these vulnerabilities. In “Coordinated disclosure”, we discuss how we

reported the vulnerabilities we found to the companies and their responses to our

outreach. Finally, in “Discussion”, we reflect on the impact of the vulnerabilities

we discovered, how they came to be, and ways that we can avoid similar problems

in the future. We provide recommendations to all stakeholders in this systemic

privacy and security failure, including users, IME and keyboard developers, operating systems, mobile device manufacturers, app store operators, international

standards bodies, and security researchers.

2. Related work

There has been much work analyzing East Asian apps for their security and pri-

vacy properties. As examples from outside of China, researchers studied LINE, a

Japanese-developed app, and KakaoTalk, a South Korean-developed app, finding

that they have faults in their end-to-end encryption implementations. When it

comes to Chinese soware, the Citizen Lab has previously revealed privacy and

security issues in several Chinese web browsers, and identified vulnerabilities in

the Zoom video conferencing platform and the MY2022 Olympics app. Unfortunately, even developers of extremely popular apps often overlook implementing

proper security measures and protecting user privacy.

Some work has been concerned specifically with the privacy issues with cloud-

based keyboard apps. As the technology powering keyboard apps became more

popular and sophisticated, awareness of the potential security risks associated

with these apps grew. Two main areas of concern have received the most attention

from security researchers when it comes to cloud-based keyboard apps: whether

user data is secure in the cloud servers and whether it is secure in transit as it

moves from the user’s device to a cloud server.

Some researchers have expressed concern over companies handling sensitive

keystroke data and have made attempts to ameliorate the risk of the cloud server

being able to record what you typed. In 2013, the Japanese government published

concerns it had with privacy regarding the Baidu IME, particularly the cloud in-

put function. Researchers have also been concerned with surveillance via other

“cloud-based” IMEs, like iFlytek’s voice input. While there has been a push to

develop privacy-aware cloud-based IMEs that would keep user data secret, they

are not widely used. While it is concerning what companies might do with user


keystroke data, our research pertains to the security of user keystroke data before

it even reaches cloud servers and who else other than the cloud operator may be

able to read it.

Other research has studied the leakage of sensitive information when user keystroke data is in transit between a user’s device and a remote cloud server. If not

properly encrypted, data can be intercepted and collected by network eavesdrop-

pers. In 2015 security researchers proposed and evaluated a system to identify

keystroke leakages in IME traffic, revealing that at least one IME was transmitting

sensitive data without encrypting it at all. Another investigation in the same year

showed that the most popular IME, Sogou, was sending users’ device identifiers

in the clear. In our 2023 report we exposed Sogou falling short once more, finding

that Sogou allowed network eavesdroppers to read what users were typing—as

they typed—in any application. All of these discoveries point to developers of

these applications overlooking the importance of transport security to protect

user data from network attackers.

While previous work studying the security of keystroke network data in transit

investigates single keyboard apps at a time, our report is the first to holistically

evaluate the network security of the cloud-based keyboard app landscape in

China.

3. Methodology

We analyzed the Android and, if present, the iOS and Windows versions of key-

board apps from the following keyboard app vendors: Tencent, Baidu, iFlytek,

Samsung, Huawei, Xiaomi, OPPO, Vivo, and Honor. The first three (Tencent, Baidu, and iFlytek) are software developers of keyboard apps, whereas the remaining six (Samsung, Huawei, Xiaomi, OPPO, Vivo, and Honor) are mobile

device manufacturers who either developed their own keyboard apps or include

one or more of the other three developers’ keyboard apps preinstalled on their

devices. We selected these nine vendors because we identified them as having

integrated cloud recommendation functionality into their products and because

they are popularly used. To procure the versions we analyzed, between August and

November, 2023, we downloaded the latest versions of them from their product

websites, the Apple App Store, or, in the case of the apps developed or bundled

by mobile device manufacturers, by procuring a mobile device that has the app


preinstalled on the ROM. In the case that we obtained the app as pre-installed on

a mobile device, we ensured that the device’s apps and operating system were

fully updated before beginning analysis of its apps. The devices we obtained were

intended for the mainland Chinese market, and, when device manufacturers had

two editions of their device, a Chinese edition and a global edition, we analyzed

the Chinese edition.

To better understand whether these vendors’ keyboard apps securely implemented their cloud recommendation functionality, we analyzed them to determine whether they sufficiently encrypted users’ typed keystrokes. To do so, we

used both static and dynamic analysis methods. We used jadx to decompile and

statically analyze Dalvik bytecode and IDA Pro to decompile and statically analyze

native machine code. We used frida to dynamically analyze the Android and iOS

versions and IDA Pro to dynamically analyze the Windows version. Finally, we used

Wireshark and mitmproxy to perform network traffic capture and analysis.

To prepare for our dynamic analysis of each keyboard app, after installing it, we enabled the pinyin input if it was not already enabled. The keyboards we analyzed generally prompted users to enable cloud functionality after installation or on first use. In such cases, we answered such prompts in the affirmative or otherwise

enabled cloud functionality through the mobile device’s or app’s settings.

In our analysis, we assume a fairly conservative threat model. For most of our

attacks, we assume a passive network eavesdropper that monitors network pack-

ets that are sent from a user’s keyboard app to a keyboard app’s cloud server. In

one of our attacks, specifically against apps using Tencent’s Sogou API, we allow

the adversary to be active in a limited way in that the adversary may additionally

transmit network traic to the cloud server but does not necessarily have to be a

machine-in-the-middle (MITM) or spoof messages from the user in a layer 3 sense.

In all of our attacks, the adversary also has access to a copy of the client software,

but the server is a black box.

We note that, as neither Apple’s nor Google’s keyboard apps have a feature to

transmit keystrokes to cloud servers for cloud-based recommendations, we did not (and could not) analyze these keyboards for the security of this feature. However,

we observed that none of the mobile devices that we analyzed included Google’s

keyboard, Gboard, preinstalled, either. This finding likely results from Google’s


exit from China reportedly due to the company’s failure to comply with China’s

pervasive censorship requirements.

4. Findings

Among the nine vendors whose apps we analyzed, we found that there was only

one vendor, Huawei, in whose apps we could not find any security issues regarding

the transmission of users’ keystrokes. For each of the remaining eight vendors,

in at least one of their apps, we discovered a vulnerability in which keystrokes

could be completely revealed by a passive network eavesdropper (see Table 1 for

details).

The ease with which the keystrokes in these apps could be revealed varied. In

one app, Samsung Keyboard, we found that the app performed no encryption

whatsoever. Some apps appeared to internally use Sogou’s cloud functionality

and were vulnerable to an attack which we previously published. Most vulnerable

apps failed to use asymmetric cryptography and mistakenly relied solely on home-

rolled symmetric encryption to protect users’ keystrokes.

The remainder of this section details further analysis of the apps we analyzed

from each vendor and, when present, their vulnerabilities.

Legend
✘✘   working exploit created to decrypt transmitted keystrokes for both active and passive eavesdroppers
✘    working exploit created to decrypt transmitted keystrokes for an active eavesdropper
！   weaknesses present in cryptography implementation
✔    no known issues
N/A  product not offered or not present on device analyzed


Keyboard developer Android iOS Windows

Tencent

†

✘

N/A

✘

Baidu

！

✘✘

iFlytek

✘✘

✔

Pre-installed keyboard developer

Device manufacturer Own Sogou Baidu iFlytek

Samsung

✘✘

✔

✘✘

N/A N/A N/A

Huawei

✔

N/A N/A N/A N/A

Xiaomi N/A

✘

✘✘

N/A N/A

OPPO N/A

✘

✘✘

N/A N/A N/A

Vivo

✔

✘

N/A N/A N/A N/A

Honor N/A N/A

✘✘

N/A N/A N/A

Default keyboard app on our test device.

†

Both QQ Pinyin and Sogou IME are developed by Tencent; in this report we analyzed QQ

Pinyin and found the same issues as we had in Sogou IME.

Table 1: Summary of vulnerabilities discovered in popular keyboards and in keyboards

pre-installed on popular phones.

4.1. Tencent

We analyzed one Tencent keyboard app, Sogou, in a previous

report. We were motivated by our previous findings analyzing Sogou to analyze

another Tencent keyboard app, QQ Pinyin. We analyzed QQ Pinyin on Android

and Windows. We found that the Android version (8.6.3) and Windows version

(6.6.6304.400) of this soware communicated to similar cloud servers as Sogou

and contained the same vulnerabilities to those which we previously reported in

Sogou IME (see Table 2 for details).


Platform  File/Package Name                Version analyzed  Secure?
Android   com.tencent.qqpinyin             8.6.3             ✘
Windows   QQPinyin_Setup_6.6.6304.400.exe  6.6.6304.400      ✘

Table 2: The versions of QQ Pinyin that we analyzed.

4.2. Baidu

We analyzed Baidu IME for Windows, Android, and iOS. We found that Baidu IME for

Windows includes a vulnerability which allows network eavesdroppers to decrypt

network transmissions. This means third parties can obtain sensitive personal

information including what users have typed. We also found privacy and security

weaknesses in the encryption used by the Android and iOS versions of Baidu IME

(see Table 3 for details).

Platform  File/Package Name              Version analyzed  Secure?  Protocol
Windows   BaiduPinyinSetup_6.0.3.44.exe  6.0.3.44          ✘✘       BAIDUv3.1
Android   com.baidu.input                11.7.19.9         ！       BAIDUv4.0
iOS       com.baidu.inputMethod          11.7.20           ！       BAIDUv4.0

Table 3: The versions of Baidu IME that we analyzed.

The Android version transmitted keystroke information via UDP packets to polimeok.baidu.com, and the Windows and iOS versions transmitted keystrokes to udpolimenew.baidu.com. The two mobile versions that we

analyzed, namely the Android and iOS versions, transmitted these keystrokes

according to a stronger protocol, whose payload begins with the bytes 0x04 0x00.

The Windows version transmitted these keystrokes according to a weaker protocol,

whose UDP payload begins with the bytes 0x03 0x01. We henceforth refer to

these protocols as the BAIDUv4.0 and BAIDUv3.1 protocols, respectively. In the

remainder of this section we detail multiple weaknesses in the BAIDUv4.0 protocol


[Diagram: Baidu’s modified CTR. Labels: Initialization vector; Initialization vector + n – 1; Initialization vector + n; Key; Encrypt; Plaintext block; Stolen ciphertext || Plaintext; Ciphertext block; Ciphertext block partly encrypted twice; Last ciphertext block.]

Figure 1: Illustration of BCTR mode encryption scheme used by Baidu IME on Android and

iOS. Adapted from this figure.

used by the Android and iOS versions and explain how a network eavesdropper

can decrypt the contents of keystrokes transmitted by the BAIDUv3.1 protocol.

4.2.1. Weaknesses in BAIDUv4.0 protocol

To encrypt keystroke information, the BAIDUv4.0 protocol uses elliptic-curve Diffie-Hellman and a pinned server public key (pk_s) to establish a shared secret key for use in a modified version of AES.

Upon opening the keyboard, before the first outgoing BAIDUv4.0 protocol message is sent, the application randomly generates a client Curve25519 public-private key pair, which we will call (pk_c, sk_c). Then, a Diffie-Hellman shared secret k is generated using sk_c and a pinned public key pk_s. To send a message with plaintext P, the application reuses the first 16 bytes of pk_c as the initialization vector (IV) for symmetric encryption, and k is used as the symmetric encryption key. The resulting symmetric encryption of P is then sent along with pk_c to the server. The server can then obtain the same Diffie-Hellman shared secret k from pk_c and sk_s, the private key corresponding to pk_s, to decrypt the ciphertext.
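The key agreement described above can be sketched as follows. This is only an illustration under stated assumptions, not Baidu's code: classic modular-exponentiation Diffie-Hellman stands in for Curve25519, and the modulus `P`, the generator `G`, and all function and variable names (including `pk_s`/`sk_s` for the server pair and `pk_c`/`sk_c` for the client pair) are choices made for this example.

```python
import secrets

# Toy finite-field Diffie-Hellman standing in for Curve25519 (assumption).
P = 2**255 - 19   # a known prime, used here only as a convenient modulus
G = 2             # hypothetical generator for the illustration

def keypair():
    """Return a (public, private) pair for the toy group."""
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk

def shared_secret(their_pk: int, my_sk: int) -> bytes:
    """Both sides arrive at the same value G^(sk_c * sk_s) mod P."""
    return pow(their_pk, my_sk, P).to_bytes(32, "big")

# Server: one long-lived key pair; pk_s is pinned inside the client app.
pk_s, sk_s = keypair()

# Client: a fresh key pair generated when the keyboard starts.
pk_c, sk_c = keypair()
k_client = shared_secret(pk_s, sk_c)
iv = pk_c.to_bytes(32, "big")[:16]   # IV = first 16 bytes of the client public key

# Server: recomputes k from the pk_c that accompanies each message.
k_server = shared_secret(pk_c, sk_s)
```

Because both `k_client` and `iv` depend only on the two key pairs, every message sent before the client generates a new pair is encrypted under the same key and IV.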

The BAIDUv4.0 protocol symmetrically encrypts data using a modified version

of AES, which symbols in the code indicate Baidu has called AESv3. Compared

to ordinary AES, AESv3 has a built-in cipher mode and padding. AESv3’s built-in

cipher mode mixes bytes differently and uses a modified counter (CTR) mode

which we call Baidu CTR (BCTR) mode, illustrated in Figure 1.


Generally speaking, any CTR cipher mode involves combining an initialization

vector v with the value i of some counter, whose combination we shall notate as

v + i. Most commonly, the counter value used for block i is simply i, i.e., it begins

at zero and increments for each subsequent block, and AESv3’s implementation

follows this convention. There is no standard way to compute v + i in CTR mode,

but the way that BCTR combines v and i is by adding i to the left-most 32 bits

of v, interpreting this portion of v and i in little-endian byte order. If the sum

overflows, then no carrying is performed on bytes to the right of this 32-bit value.

The implementation details we have thus far described do not significantly deviate

from a typical CTR implementation. However, where BCTR mode differs from

ordinary CTR mode is in how the value v + i is used during encryption. In ordinary

CTR mode, to encrypt block i with key k, you would compute

plain_i XOR encrypt(v + i, k).

In BCTR mode, to encrypt block i, you compute

encrypt(plain_i XOR (v + i), k).

As we will see later, this deviation will have implications for the security of the

algorithm.
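The difference in the order of operations can be sketched in a few lines. This is a rough illustration under assumptions: `toy_encrypt` is a keyed hash standing in for the 16-byte block cipher (it is neither AES nor AESv3, and it is not invertible), and all function names are hypothetical. Only `add_counter` follows the report's description of how BCTR forms v + i.

```python
import hashlib
import hmac

def add_counter(v: bytes, i: int) -> bytes:
    """Compute v + i as described for BCTR: add i to the left-most 32 bits of v,
    read as a little-endian integer, discarding any overflow (no carry)."""
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    return head.to_bytes(4, "little") + v[4:]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for the block cipher (assumption: NOT AES or AESv3)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:16]

def ctr_block(plain: bytes, v: bytes, i: int, key: bytes) -> bytes:
    # Ordinary CTR: encrypt the counter value, then XOR with the plaintext.
    return xor(plain, toy_encrypt(add_counter(v, i), key))

def bctr_block(plain: bytes, v: bytes, i: int, key: bytes) -> bytes:
    # BCTR: XOR the counter value into the plaintext, then encrypt the result.
    return toy_encrypt(xor(plain, add_counter(v, i)), key)
```

In `ctr_block` the plaintext never passes through the cipher, so the keystream depends only on the key and counter. In `bctr_block` the cipher input depends on both the plaintext and the counter, which is what allows two different (plaintext, counter) pairs to collide.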

While ordinarily CTR mode does not require the final block length to be a multiple

of the cipher’s block size (in the case of AES, 16 bytes), due to Baidu’s modifi-

cations, BCTR mode no longer automatically possesses this property but rather

achieves it by employing ciphertext stealing. If the final block length n is less than

16, AESv3’s implementation encrypts the final 16 byte block by taking the last

(16 - n) bytes of the penultimate ciphertext block and prepending them to the n

bytes of the ultimate plaintext block. The encryption of the resultant block fills the

last (16 - n) bytes of the penultimate ciphertext block and the n bytes of the final

ciphertext block. Note, however, that this practice only works when the plaintext

consists of at least two blocks. Therefore, if there exists only one plaintext block,

then AESv3 right-zero-pads that block to be 16 bytes.
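The ciphertext-stealing layout described above can be sketched as follows. This is a sketch under assumptions rather than a reimplementation: a toy four-round Feistel permutation stands in for AESv3's block transform, all names are hypothetical, and we assume the final stolen block is also XORed with v + i before encryption, as the labels in Figure 1 suggest.

```python
import hashlib

BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def add_counter(v, i):
    # v + i: add i to the left-most 32 bits (little-endian), no carry beyond them.
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    return head.to_bytes(4, "little") + v[4:]

def _f(key, rnd, half):
    # Round function of the toy Feistel network.
    return hashlib.sha256(key + bytes([rnd]) + bytes(half)).digest()[:8]

def toy_encrypt(block, key):
    # Toy invertible 16-byte permutation (assumption: NOT AESv3).
    l, r = bytes(block[:8]), bytes(block[8:])
    for rnd in range(4):
        l, r = r, xor(l, _f(key, rnd, r))
    return l + r

def toy_decrypt(block, key):
    l, r = bytes(block[:8]), bytes(block[8:])
    for rnd in reversed(range(4)):
        l, r = xor(r, _f(key, rnd, l)), l
    return l + r

def bctr_encrypt(pt, key, iv):
    if len(pt) < BLOCK:                        # single short block: right-zero-pad
        pt = pt.ljust(BLOCK, b"\x00")
    full, n = divmod(len(pt), BLOCK)
    ct = bytearray()
    for i in range(full):
        ct += toy_encrypt(xor(pt[BLOCK * i:BLOCK * (i + 1)], add_counter(iv, i)), key)
    if n:                                      # partial final block: steal ciphertext
        stolen = bytes(ct[-(BLOCK - n):])
        last = toy_encrypt(xor(stolen + pt[BLOCK * full:], add_counter(iv, full)), key)
        ct[-(BLOCK - n):] = last[:BLOCK - n]   # overwrite the stolen bytes
        ct += last[BLOCK - n:]                 # append the remaining n bytes
    return bytes(ct)

def bctr_decrypt(ct, key, iv):
    full, n = divmod(len(ct), BLOCK)
    body, tail = bytes(ct), b""
    if n:
        last = xor(toy_decrypt(body[-BLOCK:], key), add_counter(iv, full))
        stolen, tail = last[:BLOCK - n], last[BLOCK - n:]
        body = body[:-BLOCK] + stolen          # restore the penultimate block
    blocks = (xor(toy_decrypt(body[BLOCK * i:BLOCK * (i + 1)], key), add_counter(iv, i))
              for i in range(full))
    return b"".join(blocks) + tail
```

With at least two blocks of input the ciphertext has exactly the same length as the plaintext, which is the property ciphertext stealing is meant to preserve. A lone short block is instead padded to 16 bytes, so its original length is not recoverable from the ciphertext alone.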

Privacy issues with key and IV re-use. Since the IV and key are both directly

derived from the client key pair, the IV and key are reused until the application

generates a new key pair. This only happens when the application restarts, such as

when the user restarts the mobile device, the user switches to a different keyboard

and back, or the keyboard app is evicted from memory. From our testing, we have


Figure 2: When a bitmap image (left) is encrypted in ECB mode, patterns in the image are

still visible in the ciphertext (right). Adapted from these figures.

observed the same key and IV in use for over 24 hours. There are various issues

that arise from key and IV reuse.

Re-using the same IV and key means that the same inputs will encrypt to the same ciphertext. Additionally, due to the way the block cipher is constructed,

if blocks in the same positions of the plaintexts are the same, they will encrypt to

the same ciphertext blocks. As an example, if the second blocks of two plaintexts are the same, the second blocks of the corresponding ciphertexts will be the same.

Weakness in cipher mode. The electronic codebook (ECB) cipher mode is noto-

rious for having the undesirable property that equivalent plaintext blocks encrypt

to equivalent ciphertext blocks, allowing patterns in the plaintext to be revealed

in the ciphertext (see Figure 2 for an illustration).

While BCTR mode used by Baidu does not reveal patterns as flagrantly as ECB mode, there do exist circumstances in which patterns in the plaintext

can still be revealed in the ciphertext. Specifically, there exist circumstances in

which there exists a counter-like pattern in the plaintext which can be revealed by

the ciphertext (see Figure 3 for an example). These circumstances are possible due

to the fact that (IV + i) is XORed with each plaintext block i and then encrypted,

unlike ordinary CTR mode which encrypts (IV + i) and XORs it with the plaintext.


Block  Plaintext                                        Ciphertext
0      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  e2 d4 00 1c c6 5d 80 33 0c b9 48 7d d5 27 72 7a
1      01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  e2 d4 00 1c c6 5d 80 33 0c b9 48 7d d5 27 72 7a

Figure 3: When encrypted with the randomly generated key <96 66 08 d1 6f 80 82 86 a7 b7 da 43 96 ee d1 a2> and IV <48 5b 54 92 0c 80 a6 20 29 6f 95 e5 c5 6a 3d e2> using Baidu’s modified CTR mode, the above plaintext blocks in positions 0 and 1 encrypt to the same ciphertext.

Thus, when using BCTR mode, if the plaintext exhibits similar counting patterns

as (IV + i), then for multiple blocks the value ((IV + i) XOR plaintext block i) may be

equivalent and thus encrypt to an equivalent ciphertext.
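The collision in Figure 3 can be checked without any cipher at all, because it already occurs in the value that is fed into the block cipher. The short sketch below uses the IV from Figure 3 and the counter combination described earlier; the helper names are hypothetical.

```python
def add_counter(v: bytes, i: int) -> bytes:
    """v + i: add i to the left-most 32 bits of v (little-endian), no carry."""
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    return head.to_bytes(4, "little") + v[4:]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

IV = bytes.fromhex("485b54920c80a620296f95e5c56a3de2")  # IV from Figure 3
block0 = bytes(16)                  # plaintext block at position 0: all zeros
block1 = bytes([0x01]) + bytes(15)  # plaintext block at position 1: 01 00 ... 00

# Value handed to the block cipher in BCTR mode: plain_i XOR (IV + i).
cipher_input0 = xor(block0, add_counter(IV, 0))
cipher_input1 = xor(block1, add_counter(IV, 1))

print(cipher_input0 == cipher_input1)  # prints True
```

Since both blocks present the same 16-byte input to the same keyed block cipher, they necessarily produce the same ciphertext block, whatever the key is.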

More generally, BCTR mode fails to provide the cryptographic property of diffusion. Specifically, if an algorithm provides diffusion, then, when we change a single bit of the plaintext, we expect half of the bits of the ciphertext to change. However, the example in Figure 3 illustrates a case where changing a single bit of the plaintext caused zero bits of the ciphertext to change, a clear violation of the expectations of this property. The property of diffusion is vital in secure cryptographic algorithms

so that patterns in the plaintext are not visible as patterns in the ciphertext.

Other privacy and security weaknesses. There are other weaknesses in the

custom encryption protocol designed by Baidu IME that are not consistent with

the expected standards for a modern encryption protocol used by hundreds of

millions of devices.

Forward secrecy issues with static Diffie-Hellman. The use of a pinned static server key means that the protocol lacks forward secrecy, a property that other modern network encryption protocols like TLS provide. If the server key is ever revealed, any past

message where the shared secret was generated with that key can be successfully

decrypted.

Lack of message integrity. There are no cryptographically secure message in-

tegrity checks, which means that a network attacker may freely modify the cipher-

text. There is a CRC32 checksum calculated and included with the plaintext data,

but a CRC32 checksum does not provide cryptographic integrity, as it is easy to

generate CRC32 checksum collisions. Therefore, modifying the ciphertext may be

possible. In combination with the issue concerning key and IV reuse, this protocol

may be vulnerable to a swapped block attack.
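To illustrate why a CRC32 checksum is not a substitute for a cryptographic integrity check, the sketch below shows that the checksum of a modified message can be predicted from the original checksum and the modification alone, with no secret involved. The message contents are hypothetical, and the sketch operates on plaintext only; it does not model how an attacker would manipulate BAIDUv4.0 ciphertext.

```python
import zlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

message = b"transfer 100 to alice"  # hypothetical plaintext
# Difference that rewrites the last five bytes from "alice" to "mallo".
delta = bytes(len(message) - 5) + xor(b"alice", b"mallo")
zeros = bytes(len(message))

tampered = xor(message, delta)

# CRC32 is affine over XOR for equal-length inputs, so the new checksum
# follows from the old checksum and the chosen difference.
predicted = zlib.crc32(message) ^ zlib.crc32(delta) ^ zlib.crc32(zeros)
```

Here `predicted` equals `zlib.crc32(tampered)`. A message authentication code such as HMAC does not have this property, because computing a valid tag requires a secret key.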


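def derive_fixed_key():
    key = []
    x = 0  # running accumulator, grows by 1937 on every iteration
    for i in range(16):
        # Each key byte depends only on the loop index and the accumulator.
        key.append((~i ^ ((i + 11) * (x >> (i & 3)))) & 0xff)
        x += 1937
    return bytes(key)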

Figure 4: Python code equivalent to the code that the BAIDUv3.1 protocol uses to derive its

fixed key. The function takes no input and derives the same key on every invocation.

4.2.2. Vulnerability in BAIDUv3.1 protocol

The BAIDUv3.1 protocol is weaker than the BAIDUv4.0 protocol and contains a

critical vulnerability that allows an eavesdropper to decrypt any messages en-

crypted with it. The protocol in the versions of Baidu’s keyboard apps that we

analyzed encrypts keystrokes using a modified version of AES which we call AESv2,

as we believe it to be the predecessor cipher to Baidu’s AESv3. When a keyboard

app uses the BAIDUv3.1 protocol with the AESv2 cipher, we say that it uses the

BAIDUv3.1+AESv2 scheme. Normally, AES when used with a 128-bit key performs

10 rounds of encryption on each block. However, we found that AESv2 uses only 9

rounds but is otherwise equivalent to AES encryption with a 128-bit key.

The BAIDUv3.1+AESv2 scheme encrypts keystrokes using AESv2 in the following manner. First, a key is derived according to a fixed function (see Figure 4). Note that the function takes no input and references no external state and thus always generates the same static key k_f = <ff 9e d5 48 07 5a 10 e4 ef 06 c7 2e a7 a2 f2 36>.
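As a quick check, the function from Figure 4 can be run to confirm that it reproduces the static key given above. The sketch below repeats the Figure 4 computation so that it is self-contained; the variable name `fixed_key_hex` is ours.

```python
def derive_fixed_key() -> bytes:
    # Same computation as Figure 4: no inputs, no external state.
    key = []
    x = 0
    for i in range(16):
        key.append((~i ^ ((i + 11) * (x >> (i & 3)))) & 0xff)
        x += 1937
    return bytes(key)


fixed_key_hex = derive_fixed_key().hex(" ")
print(fixed_key_hex)  # ff 9e d5 48 07 5a 10 e4 ef 06 c7 2e a7 a2 f2 36
```

Because the output never varies, anyone who extracts this function from the app obtains the same key as every installed copy.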

To encrypt a protobuf-serialized message, the BAIDUv3.1 protocol first snappy-compresses it, forming a compressed buffer. The 32-bit, little-endian length of this compressed buffer is then prepended to the compressed buffer, forming the plaintext. A randomly generated 128-bit key k_m is used to encrypt the plaintext using AESv2 in ECB mode. The resulting ciphertext is stored in bytes 44 until the end of the final UDP payload. The fixed key k_f is used to encrypt k_m using AESv2 in ECB mode. The resulting ciphertext is stored in bytes 28 until 44 of the final UDP payload.
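The framing and payload layout described above can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, the first 28 bytes are treated as an opaque header because their contents are not described here, and the snappy and AESv2 steps are omitted since neither is available in Python's standard library. Treating bytes beyond the stated length as discardable is also our assumption.

```python
import struct

WRAPPED_KEY_START = 28  # encrypted per-message key occupies bytes 28..43
CIPHERTEXT_START = 44   # encrypted plaintext runs from byte 44 to the end


def frame_plaintext(compressed: bytes) -> bytes:
    """Prepend the 32-bit little-endian length to the compressed buffer."""
    return struct.pack("<I", len(compressed)) + compressed


def unframe_plaintext(plaintext: bytes) -> bytes:
    """Recover the compressed buffer, dropping any bytes past the stated length."""
    (length,) = struct.unpack_from("<I", plaintext, 0)
    return plaintext[4:4 + length]


def split_payload(payload: bytes):
    """Split a UDP payload into (opaque header, wrapped key, ciphertext)."""
    if len(payload) < CIPHERTEXT_START:
        raise ValueError("payload too short")
    return (payload[:WRAPPED_KEY_START],
            payload[WRAPPED_KEY_START:CIPHERTEXT_START],
            payload[CIPHERTEXT_START:])
```

An eavesdropper would apply `split_payload` to a captured packet, decrypt the wrapped key with the fixed key, decrypt the ciphertext with the result, and then call `unframe_plaintext` before decompressing.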

CITIZEN LAB RESEARCH REPORT NO. 175 15

[...]
2 {
  1: "nihaocanyoureadthis"
  5: 3407918
}
3 {
  1: 107
  2: 10
  5: 1
}
4 {
  1: "1133d4c64afbf1feda85d3c497dd6164|0"
  2: "wn1||0"
  3: "6.0.3.44"
  4: "notepad.exe"
}
[...]

Figure 5: Excerpt of decrypted information, including what we had typed (“nihaocanyoureadthis”) and the app into which it was typed (“notepad.exe”).

We found that these encrypted protobuf serializations include our typed key-

strokes as well as the name of the application into which we were typing them

(see Figure 5).

A vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. Since AES is a symmetric encryption algorithm, the same key used to encrypt a message can also be used to decrypt it. Since the fixed key k_f never changes, any network eavesdropper with knowledge of k_f, such as from performing the same analysis of the app as we performed, can decrypt the per-message key k_m and thus can decrypt the plaintext contents of each message encrypted in the manner described above. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

4.3. iFlytek

We analyzed iFlytek (also called xùnfēi from the pinyin of 讯飞) IME on Android, iOS, and Windows. We found that iFlytek IME for Android includes a vulnerability which allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed (see Table 4 for details).

Platform | File/Package Name | Version analyzed | Secure?
Android | com.iflytek.inputmethod | 12.1.10 | ✘✘
iOS | com.iflytek.inputime | 12.1.3338 | ✔
Windows | iFlyIME_Setup_3.0.1734.exe | 3.0.1734 | ✔

Table 4: The versions of Xunfei IME analyzed.

The Android version of iFlytek IME encrypts the payload of each HTTP request sent to pinyin.voicecloud.cn with the following algorithm. Let s be the current time in seconds since the Unix epoch at the time of the request. For each request, an 8-byte encryption key is then derived by first performing the following computation:

x = (s % 0x5F5E100) ^ 0x1001111

The 8-byte key k is then derived from x as the lowest 8 ASCII-encoded decimal digits of x, left-padded with leading zeroes if necessary, in big-endian order. In Python, the above can be summarized by the following expression:

k = b'%08u' % ((s % 0x5F5E100) ^ 0x1001111)

The payload of the request is then padded with PKCS#7 padding and then en-

crypted with DES using key k in ECB mode. The value s is transmitted in the HTTP

request in the clear as a GET parameter named “time”.

Since DES is a symmetric encryption algorithm, the same key used to encrypt

a message can also be used to decrypt it. Since k can be easily derived from

s and since s is transmitted in the clear in every HTTP request encrypted by k,

any network eavesdropper can easily decrypt the contents of each HTTP request

encrypted in the manner described above. (Since s is simply the time in single

second resolution, it also stands to reason that a network eavesdropper would

have general knowledge of s in any case.)
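The key derivation above is simple enough that an eavesdropper can reproduce it from the cleartext “time” parameter alone. The following Python sketch illustrates this under stated assumptions: the function names and the example timestamp are hypothetical, and the key is truncated to its last 8 characters to follow the “lowest 8 digits” wording of the description. The DES-ECB decryption step itself is omitted because Python's standard library does not include DES; only the key derivation and the removal of PKCS#7 padding are shown.

```python
def derive_key(s: int) -> bytes:
    """Derive the 8-byte DES key from the cleartext "time" parameter s."""
    x = (s % 0x5F5E100) ^ 0x1001111
    # Keep the lowest 8 decimal digits, left-padded with zeroes.
    return (b"%08d" % x)[-8:]


def pkcs7_unpad(data: bytes, block_size: int = 8) -> bytes:
    """Strip PKCS#7 padding from a decrypted buffer (DES uses 8-byte blocks)."""
    if not data or len(data) % block_size:
        raise ValueError("invalid length")
    n = data[-1]
    if not 1 <= n <= block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("bad PKCS#7 padding")
    return data[:-n]


example_time = 1696636800  # hypothetical value of the "time" GET parameter
key = derive_key(example_time)
print(key)
```

Given a captured request, an eavesdropper would read `time` from the URL, call `derive_key`, decrypt the body with any DES implementation in ECB mode, and then call `pkcs7_unpad`.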


1: 0
2: 0
3: 49
4: "xxxxx"
5: 0
7 {
  1: "app_id"
  2: "100IME"
}
7 {
  1: "uid"
  2: "230817031752396418"
}
7 {
  1: "cli_ver"
  2: "12.1.14983"
}
7 {
  1: "net_type"
  2: "wifi"
}
7 {
  1: "OS"
  2: "android"
}
8: 8

Figure 6: Decrypted information revealing what we had typed (“xxxxx”).

We found that users’ keystrokes were transmitted in a protobuf serialization and encrypted in this manner (see Figure 6). Therefore, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing by taking advantage of this vulnerability.

Finally, the DES encryption algorithm is an older encryption algorithm with known

weaknesses, and the ECB block cipher mode is a simplistic and problematic cipher

mode. The use of each of these technologies is problematic in itself and opens

the Android version of iFlytek IME’s communications to additional attacks.


4.4. Samsung

We analyzed Samsung Keyboard on Android as well as the versions of Sogou IME and Baidu IME that Samsung bundled with our test device, an SM-T220 tablet running ROM version T220CHN4CWF4. We found that Samsung Keyboard for Android and Samsung’s bundled version of Baidu IME each include a vulnerability that allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed (see Table 5 for details).

Application name | Package name | Version analyzed | Secure?
Samsung Keyboard | com.samsung.android.honeyboard | 5.6.10.26 | ✘✘
百度输入法 (Baidu IME) | com.baidu.input | 8.5.20.4 | ✘✘
搜狗输入法三星版 (Sogou IME Samsung Version) | com.sohu.inputmethod.sogou.samsung | 10.32.38.202307281642 | ✔

Table 5: The keyboards analyzed on the Samsung OneUI 5.1 platform.

4.4.1. Samsung Keyboard (com.samsung.android.honeyboard)

We found that when using Samsung Keyboard on the Chinese edition of a Sam-

sung device and when Pinyin is chosen as Samsung Keyboard’s input language,

Samsung Keyboard transmits keystroke data to the following URL in the clear via

HTTP POST:

http://shouji.sogou.com/web_ime/mobile_pb.php?durtot=339&h=8f2bc112-bbec-3f96-86ca-652e98316ad8&r=android_oem_samsung_open&v=8.13.10038.413173&s=&e=&i=&fc=0&base=dW5rbm93biswLjArMC4w&ext_ver=0

The keystroke data is contained in the request’s HTTP payload in a protobuf

serialization (see Figure 7).
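Because the request is sent over plain HTTP, its URL is as visible to an eavesdropper as its payload. The following Python sketch parses the query string of the URL above using only the standard library. Interpreting the `base` parameter as base64 is our own guess from its appearance, not something the request itself confirms; the variable names are ours.

```python
import base64
from urllib.parse import urlsplit, parse_qs

url = ("http://shouji.sogou.com/web_ime/mobile_pb.php?durtot=339"
       "&h=8f2bc112-bbec-3f96-86ca-652e98316ad8&r=android_oem_samsung_open"
       "&v=8.13.10038.413173&s=&e=&i=&fc=0&base=dW5rbm93biswLjArMC4w&ext_ver=0")

parts = urlsplit(url)
# keep_blank_values retains the empty "s", "e", and "i" parameters.
params = parse_qs(parts.query, keep_blank_values=True)

# Assumption: "base" is base64-encoded text.
decoded_base = base64.b64decode(params["base"][0])

print(parts.scheme)  # "http": no transport encryption
print(params["h"][0], params["r"][0], params["v"][0])
print(decoded_base)
```

The `h`, `r`, and `v` values match the first three fields of the first submessage in Figure 7, so the same identifiers appear both in the URL and in the protobuf payload.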


1 {
  1: "8f2bc112-bbec-3f96-86ca-652e98316ad8"
  2: "android_oem_samsung_open"
  3: "8.13.10038.413173"
  4: "999"
  5: 1
  7: 2
}
2 {
  1: "\351\000"
  2: "\372\213"
}
4: "com.tencent.mobileqq"
7: "nihaocanyoureadthis"
16: 10
17 {
  3 {
    1: 1
    2: 5
  }
  5: 1
  9: 1
}
18: ""
19 {
  1: "0"
  4: "339"
}

Figure 7: Protobuf message transmitted after typing “nihaocanyoureadthis”.

The device on which we were testing was fully updated on the date of testing

(October 7, 2023) in that it had all OS updates applied and had all updates from

the Samsung Galaxy Store applied.

Since Samsung Keyboard transmits keystroke data via plain, unencrypted HTTP and since there is no encryption applied at any other layer, a network eavesdropper who is monitoring a Samsung Keyboard user’s network traffic can easily observe that user’s keystrokes if that user is using the Chinese edition of the ROM with the Pinyin input language selected.


When using the global edition of the ROM or when using a non-Pinyin input lan-

guage, we did not observe the Samsung keyboard communicating with cloud

servers.

4.4.2. 百度输入法 (Baidu IME)

We found that the version of Baidu IME bundled with our Samsung test device transmitted keystroke information via UDP packets to udpolimenew.baidu.com. This version of Baidu IME used the BAIDUv3.1 protocol that we describe in the Baidu section earlier but with a different cipher and compression algorithm as indicated in each transmission’s header. In the remainder of this section we explain how a network eavesdropper can, just like with AESv2, decrypt the contents of messages encrypted using a scheme we call BAIDUv3.1+AESv1 (see Table 6).

Protocol | Scheme | Cipher | Mode | Cipher versus AES
BAIDUv3.1 | BAIDUv3.1+AESv1 | AESv1 | ECB | Additional permutations
BAIDUv3.1 | BAIDUv3.1+AESv2 | AESv2 | ECB | Missing round
BAIDUv4.0 | BAIDUv4.0+AESv3 | AESv3 | BCTR | Uses home-rolled cipher mode

Table 6: Summary of ciphers used across different Baidu protocols.

Samsung’s bundled version of Baidu IME encrypts keystrokes using a modified version of AES which we name AESv1, as we believe it to be the predecessor to Baidu’s AESv2. When encrypting, AESv1’s key expansion is like that of standard AES, except that, on each but the first subkey, the order of the subkey’s bytes is additionally permuted. Furthermore, on the encryption of each block, the bytes of the block are additionally permuted in two locations: once near the beginning of the block’s encryption, immediately after the block has been XOR’d with the first subkey, and again near the end of the block’s encryption, immediately before S-box substitution. Aside from complicating our analysis, we are not aware of these modifications altering the security properties of AES, and we have developed an implementation of this algorithm to both encrypt and decrypt messages given a plaintext or ciphertext and a key.


0: [800,
    1276,
    10,
    "92F8EE78F1DDCBE74CFEB1166F70883D%7C0",
    "a1|SM-T220-gta7litewifi|320",
    "8.5.20.4",
    "com.android.settings.intelligence",
    "1012497q",
    "",
    "2你好惨又热大腿 ",
    ""],
1: [0, "", "nihaocanyoureadthis"]

Figure 8: The decrypted and decompressed payload, revealing what we had typed (“nihaocanyoureadthis”, highlighted) and the app into which it was typed (“com.android.settings.intelligence”); on top is a hex dump of the proprietary binary blob that results from decryption and decompression, and below it is our understanding of how to parse it.

Samsung’s bundled version of Baidu IME encrypts keystrokes by applying AESv1 in electronic codebook (ECB) mode in the following manner. First, the app uses the fixed 128-bit key, k_f = <ff 9e d5 48 07 5a 10 e4 ef 06 c7 2e a7 a2 f2 36>, to encrypt another, generated, key, k_m. The fixed key k_f is the same key the BAIDUv3.1 protocol uses for AESv2 (see Figure 4). The encryption of k_m is stored in bytes 64 until 80 of each UDP packet’s payload. The key k_m is then used to encrypt the remainder of a zlib-compressed message payload, which is stored at byte 80 until the end of the UDP payload. We found that the encrypted payload included, in a binary container format which we did not recognize, our typed keystrokes as well as the name of the application into which we were typing them (see Figure 8).
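The two-stage structure described above, in which a fixed key wraps a generated key that in turn encrypts the message, can be sketched in Python as follows. Since AESv1 is a non-standard cipher, the sketch takes the ECB decryption routine as a parameter; the name `recover_message` and the `EcbDecrypt` signature are hypothetical, and treating the body as a single zlib stream, possibly followed by padding bytes, is our assumption.

```python
import zlib
from typing import Callable

# Hypothetical signature: (key, ciphertext) -> plaintext, in ECB mode.
EcbDecrypt = Callable[[bytes, bytes], bytes]

# The fixed key shared by every installation (see Figure 4).
FIXED_KEY = bytes.fromhex("ff9ed548075a10e4ef06c72ea7a2f236")


def recover_message(payload: bytes, ecb_decrypt: EcbDecrypt) -> bytes:
    """Unwrap the generated key, then decrypt and decompress the body."""
    wrapped_key = payload[64:80]  # encryption of the generated key
    body = payload[80:]           # encrypted zlib-compressed message
    message_key = ecb_decrypt(FIXED_KEY, wrapped_key)
    compressed = ecb_decrypt(message_key, body)
    # decompressobj tolerates trailing bytes (e.g. padding) after the stream.
    return zlib.decompressobj().decompress(compressed)
```

Nothing in this procedure requires a secret that is unavailable to the eavesdropper: the only key material needed beyond the packet itself is the hard-coded fixed key.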


A vulnerability exists in the BAIDUv3.1+AESv1 scheme that allows a network eavesdropper to decrypt the contents of these messages. Since AES, including AESv1, is a symmetric encryption algorithm, the same key used to encrypt a message can also be used to decrypt it. Since the fixed key k_f is hard-coded, any network eavesdropper with knowledge of k_f can decrypt the generated key k_m and thus decrypt the plaintext contents of each message encrypted in the manner described above. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

Additionally, in the version of Baidu Input Method distributed by Samsung, we found that the generated key k_m was not securely generated using a secure pseudorandom number generator (secure PRNG). Instead, it was generated using a custom-designed PRNG that we believe to have poor security properties, and, instead of using a high entropy seed, the PRNG generating k_m was seeded using the message plaintext. However, even without these weaknesses in the generation of k_m, the protocol is already completely insecure to network eavesdroppers as described in the previous paragraphs.

4.5. Huawei

We analyzed the keyboards preinstalled on our Huawei Mate 50 Pro test device.

We found no vulnerabilities in the manner of transmission of users’ keystrokes

in the versions of Huawei’s keyboard apps that we analyzed (see Table 7 for de-

tails). Specifically, Huawei used TLS to encrypt keystrokes in each version that we

analyzed.

Application name | Package Name | Version analyzed | Secure?
搜狗输入法 (Sogou IME) | com.sohu.inputmethod.sogou | 11.31 | ✔
小艺输入法 (Celia IME) | com.huawei.ohos.inputmethod | 1.0.19.333 | ✔

Table 7: The versions of the Huawei keyboard apps analyzed on HarmonyOS 4.0.0.


4.6. Xiaomi

We analyzed the keyboards preinstalled on our Xiaomi Mi 11 test device. We found

that they all include vulnerabilities that allow network eavesdroppers to decrypt

network transmissions from the keyboards (see Table 8 for details). This means

that network eavesdroppers can obtain sensitive personal information, including

what users have typed.

Application name | Package Name | Version analyzed | Secure?
百度输入法小米版 (Baidu IME Xiaomi Version) | com.baidu.input_mi | 10.6.120.480 | ✘✘
搜狗输入法小米版 (Sogou IME Xiaomi Version) | com.sohu.inputmethod.sogou.xiaomi | 10.32.21.202210221903 | ✘
讯飞输入法小米版 (iFlytek IME Xiaomi Version) | com.iflytek.inputmethod.miui | 8.1.8014 | ✘✘

Table 8: The versions of the Xiaomi keyboard apps analyzed on MIUI 14.03.31.

In this section we detail vulnerabilities in three different keyboard apps included with MIUI 14.0.31 in which users’ keystrokes can be decrypted, if necessary, and read by network eavesdroppers.

4.6.1. 百度输入法小米版 (Baidu IME Xiaomi Version)

We found that Xiaomi’s Baidu-based keyboard app encrypts keystrokes using the

BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages

are decrypted and deserialized, we found that they include our typed keystrokes as

well as the name of the application into which we were typing them (see Figure 9).

As we explained previously, a vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

[...]
2 {
  1: "nihaonihaoqqwerty"
}
3 {
  1: 53
  2: 10
  3: 1080
  4: 2166
  5: 5
}
4 {
  1: "DC0F75E6809F0FAAB46EDE2F2D6302ED%7CVAPBN4NOH"
  2: "p-a1-3-66|2211133C|720"
  3: "10.6.120.480"
  4: "com.miui.notes"
  5: "1000228c"
  6: "\346\242\205\345\267\236"
}
[...]

Figure 9: Excerpt of decrypted information, including what we had typed (“nihaonihaoqqwerty”) and the application into which it was typed (“com.miui.notes”).

4.6.2. 搜狗输入法小米版 (Sogou IME Xiaomi Version)

The Sogou-based keyboard app is subject to a vulnerability which we have already publicly disclosed in Sogou IME (搜狗输入法) in which a network eavesdropper can decrypt and recover users’ transmitted keystrokes. Please see our previous Sogou report for full details. Tencent responded by securing Sogou IME transmissions using TLS, but we found that Xiaomi’s Sogou-based keyboard had not been fixed.

4.6.3. 讯飞输入法小米版 (iFlytek IME Xiaomi Version)

Similar to iFlytek’s own IME for Android, we found that Xiaomi’s iFlytek keyboard

app used the same faulty encryption. We found that users’ keystrokes were sent

to pinyin.voicecloud.cn and encrypted in this manner.


{"p":{"m":53,"f":0,"l":0},"i":"nihaoniba"}

Figure 10: Excerpt of decrypted information, including what we had typed (“nihaoniba”).

Therefore, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing by taking advantage of this vulnerability (see Figure 10).

4.7. OPPO

We analyzed the keyboard apps preinstalled on our OPPO OnePlus Ace test device.

We found that they all include vulnerabilities that allow network eavesdroppers to

decrypt network transmissions from the keyboards (see Table 9 for details). This

means that network eavesdroppers can obtain sensitive personal information,

including what users have typed.

Application name | Package Name | Version analyzed | Secure?
百度输入法定制版 (Baidu IME Custom Version) | com.baidu.input_oppo | 8.5.30.503 | ✘✘
搜狗输入法定制版 (Sogou IME Custom Version) | com.sohu.inputmethod.sogouoem | 8.32.0322.2305171502 | ✘

Table 9: The versions of the OPPO keyboard apps analyzed on ColorOS 13.1.

In this section we detail vulnerabilities in two different keyboard apps included with ColorOS 13.1 in which users’ keystrokes can be decrypted, if necessary, and read by network eavesdroppers.

4.7.1. 百度输入法定制版 (Baidu IME Custom Version)

We found that OPPO’s Baidu-based keyboard app encrypts keystrokes using the

BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages

are decrypted and deserialized, we found that they include our typed keystrokes as

well as the name of the application into which we were typing them (see Figure 11).


[...]
2 {
  1: "nihaonihao"
}
3 {
  1: 28
  2: 10
  3: 1240
  4: 2662
  5: 5
}
4 {
  1: "47148455BDAEBA8A253ACBCC1CA40B1B%7CV7JTLNPID"
  2: "p-a1-5-105|PHK110|720"
  3: "8.5.30.503"
  4: "com.android.mms"
  5: "1021078a"
}
[...]

Figure 11: Excerpt of decrypted information, including what we had typed (“nihaonihao”)

and the application into which it was typed (“com.android.mms”).

As we explained previously, a vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

4.7.2. 搜狗输入法定制版 (Sogou IME Custom Version)

The Sogou-based keyboard app is subject to a vulnerability which we have already publicly disclosed in Sogou IME (搜狗输入法) in which a network eavesdropper can decrypt and recover users’ transmitted keystrokes. Please see our previous Sogou report for full details. Tencent responded by securing Sogou IME transmissions using TLS, but we found that OPPO’s Sogou-based keyboard had not been fixed.


4.8. Vivo

We analyzed the keyboard apps preinstalled on our Vivo Y78+ test device. We

found that the Sogou-based one includes vulnerabilities that allow network eaves-

droppers to decrypt network transmissions from the keyboards (see Table 10 for

details). This means that network eavesdroppers can obtain sensitive personal

information, including what users have typed.

Keyboard name | Package Name | Version analyzed | Secure?
搜狗输入法定制版 (Sogou IME Custom Version) | com.sohu.inputmethod.sogou.vivo | 10.32.13023.2305191843 | ✘
Jovi输入法 (Jovi IME) | com.vivo.ai.ime | 2.6.1.2305231 | ✔

Table 10: The versions of the Vivo keyboard apps analyzed on OriginOS 3.

The Sogou-based keyboard app is subject to a vulnerability which we have already publicly disclosed in Sogou IME (搜狗输入法) in which a network eavesdropper can decrypt and recover users’ transmitted keystrokes. Please see our previous Sogou report for full details. Tencent responded by securing Sogou IME transmissions using TLS, but we found that Vivo’s Sogou-based keyboard had not been fixed.

4.9. Honor

We analyzed the keyboard apps preinstalled on our Honor Play7T test device.

We found that the Baidu-based one includes vulnerabilities that allow network

eavesdroppers to decrypt network transmissions from the keyboards (see Table 11

for details). This means that network eavesdroppers can obtain sensitive personal

information, including what users have typed.


[...]
2 {
  1: "nihaonihaonihaoq"
  5: 6422639
}
3 {
  1: 91
  2: 10
  3: 720
  4: 1552
  5: 5
}
4 {
  1: "A49AD3D3789A136975C2B28201753F03%7C0"
  2: "p-a1-5-115|RKY-AN10|720"
  3: "8.2.501.1"
  4: "com.hihonor.mms"
  5: "1023233d"
  7: "A00-TWGTFEV5OFZ7WZ2AFN5TCDE4BPNO7XRZ-BVEZBI4D"
}
[...]

Figure 12: Excerpt of decrypted information, including what we had typed (“nihaonihaonihaoq”) and the application into which it was typed (“com.hihonor.mms”).

Application name | Package Name | Version analyzed | Secure?
百度输入法荣耀版 (Baidu IME Honor Version) | com.baidu.input_hihonor | 8.2.501.1 | ✘✘

Table 11: The versions of the Honor keyboard apps analyzed on Magic UI 6.1.0.

We found that Honor’s Baidu-based keyboard app encrypts keystrokes using the

BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages

are decrypted and deserialized, we found that they include our typed keystrokes as

well as the name of the application into which we were typing them (see Figure 12).

As we explained previously, a vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is monitoring a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

As of April 1, 2024, “Baidu IME Honor Version”, the default IME on the Honor device we tested, is still vulnerable to passive decryption. We also discovered that on our Play7T device, there was no way to update “Baidu IME Honor Version” through the device’s app store. In responding to our disclosures, Honor asked us to disclose to Baidu and stated that it was Baidu’s responsibility to patch this issue.

5. Other affected keyboard apps

Given our limited resources to analyze apps, we were not able to analyze every cloud-based keyboard app available. Nevertheless, given that these vulnerabilities appeared to affect APIs that were used by multiple apps, we wanted to approximate the total number of apps affected by these vulnerabilities.

We began by searching VirusTotal, a database of software and other files that have been uploaded for automated virus scanning, for Android apps which reference the string “get.sogou.com”, the API endpoint used by Sogou IME, as these apps may require additional investigation to determine whether they are vulnerable. Excluding apps that we analyzed above, this search yielded the following apps:

• com.sohu.sohuvideo

• com.tencent.docs

• com.sogou.reader.free

• com.sohu.inputmethod.sogou.samsung

• com.sogou.text

• com.sogou.novel

• com.sogo.appmall

• com.blank_app

• com.sohu.inputmethod.sogou.nubia

• com.sogou.androidtool

• com.sohu.inputmethod.sogou.meizu

• com.sohu.inputmethod.sogou.zte

• sogou.mobile.explorer.hmct

• sogou.mobile.explorer

• com.sogou.translatorpen


• com.sec.android.inputmethod.beta

• com.sohu.inputmethod.sogou.meitu

• com.sec.android.inputmethod

• sogou.mobile.explorer.online

• com.sohu.sohuvideo.meizu

• com.sohu.inputmethod.sogou.oem

• com.sogou.map.android.maps

• sogou.llq.online

• com.sohu.inputmethod.sogou.coolpad

• com.sohu.inputmethod.sogou.chuizi

• com.sogou.toptennews

• com.sogou.recmaster

• com.meizu.flyme.input

We have not analyzed these apps and thus cannot conclude that they are necessarily vulnerable, or even keyboard apps, but we provide this list to help reveal the possible scope of the vulnerabilities that we discovered. When we disclosed this list to Tencent, Tencent requested an additional three months to fix the vulnerabilities before we publicly disclosed this list, lending credence to the idea that apps in this list are largely vulnerable. Similarly, after excluding apps that we had already analyzed, the following are other Android apps which reference the strings “udpolimenew.baidu.com” or “udpolimeok.baidu.com”, the API endpoints used by Baidu Input Method:

• com.adamrocker.android.input.simeji

• com.facemoji.lite.xiaomi.gp

• com.facemoji.lite.xiaomi

• com.pre.kb.xm

• com.facemoji.lite.transsion

• com.txthinking.brook

• com.facemoji.lite.vivo

• com.baidu.input_huawei

• com.baidu.input_vivo

• com.baidu.input_oem

• com.pre.kb.op

• com.txthinking.shiliew

• mark.via.gp

• com.qinggan.app.windlink


• com.baidu.mapauto

These findings suggest that a large ecosystem of apps may be affected by the vulnerabilities that we discovered in this report.
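A comparable check can be run locally on an individual app. The following Python sketch is not the VirusTotal search described above but a rough local approximation: it treats an APK as the ZIP archive that it is and reports which of the three endpoint strings appear in any file inside it. The function name `endpoints_referenced` is hypothetical, and, as with the lists above, a match only indicates a reference to the endpoint, not that the app is vulnerable.

```python
import zipfile

# API endpoint strings named in this section.
ENDPOINTS = (b"get.sogou.com", b"udpolimenew.baidu.com", b"udpolimeok.baidu.com")


def endpoints_referenced(apk) -> set:
    """Return the endpoint strings found in any file inside an APK.

    `apk` may be a filesystem path or a binary file-like object.
    """
    found = set()
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            data = archive.read(name)  # decompressed contents of the entry
            found.update(e.decode() for e in ENDPOINTS if e in data)
    return found
```

For example, `endpoints_referenced("app.apk")` (with a hypothetical file name) would return an empty set for an app that references none of these endpoints in plain form.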

6. Coordinated disclosure

We reported the vulnerabilities that we discovered to each vendor in accordance with our vulnerability disclosure policy. All companies except Baidu, Vivo, and Xiaomi responded to our disclosures. Baidu fixed the most serious issues we reported to them shortly after our disclosure, but Baidu has yet to fix all issues that we reported to them. The mobile device manufacturers whose preinstalled keyboard apps we analyzed fixed issues in their apps except for their Baidu apps, which either only had the most serious issues addressed or, in the case of Honor, did not have any issues addressed (see Table 12 for details). Regarding QQ Pinyin, Tencent indicated that “with the exception of end-of-life products, we aim to finalize the upgrade for all active products to transmit EncryptWall requests via HTTPS by the conclusion of Q1 [2024]”, but, as of April 1, 2024, we have not seen any fixes to this product. Tencent may consider QQ Pinyin end-of-life as it has not received updates since 2020, although we note that it is still available for download. For timelines and full correspondence of our disclosures to each vendor, please see the Appendix.

Legend.
✘✘  working exploit created to decrypt transmitted keystrokes for both active and passive eavesdroppers
✘  working exploit created to decrypt transmitted keystrokes for an active eavesdropper
！  weaknesses present in cryptography implementation
✔  no known issues or all known issues fixed
N/A  product not offered or not present on device analyzed


Keyboard developer | Android | iOS | Windows
Tencent† | ✘ | N/A | ✘
Baidu | ！ | ！ | ！
iFlytek | ✔ | ✔ | ✔

Device manufacturer | Pre-installed keyboard developer: Own | Sogou | Baidu | iFlytek | iOS | Windows
Samsung | ✔ | ✔ | ！ | N/A | N/A | N/A
Huawei | ✔ | ✔ | N/A | N/A | N/A | N/A
Xiaomi | N/A | ✔ | ！ | ✔ | N/A | N/A
OPPO | N/A | ✔ | ！ | N/A | N/A | N/A
Vivo | ✔ | ✔ | N/A | N/A | N/A | N/A
Honor | N/A | N/A | ✘✘ | N/A | N/A | N/A

Default keyboard app on our test device.

† Both QQ Pinyin and Sogou IME are developed by Tencent; in this report we analyzed QQ Pinyin and found the same issues as we had in Sogou IME.

Table 12: Status of vulnerabilities after disclosure as of April 1, 2024.

To summarize, we no longer have working exploits against any products except

Honor’s keyboard app and Tencent’s QQ Pinyin. Baidu’s keyboard apps on other

devices continue to contain weaknesses in their cryptography which we are unable

to exploit at this time to fully decrypt users’ keystrokes in transit.

6.1. Barriers to users receiving security updates

Users can receive updates to their keyboard apps on their phones’ app stores,

and such updates typically install in the background without user intervention.

In our testing, updating keyboard apps was typically performed without friction.

However, in some cases, a user may need to also ensure that they have fully


updated their operating system before they will receive the fixes to our reported

vulnerabilities for their keyboard app through the app store. In the case of the

Honor device we tested, there was no update mechanism for the default keyboard

used by the operating system through the app store. Honor devices bundled with

a vulnerable version of the keyboard will remain vulnerable to passive decryption.

In the case of the Samsung Galaxy Store, we found that on our device a user

must sign in with a Samsung account before receiving security updates to their

keyboard app. In the case the user does not have a Samsung account, then they

must create one. We believe that installing important security updates should

be frictionless, and we recommend that Samsung and app stores in general not

require the registration of a user account before receiving important security

updates.

We also learned from communication with Samsung’s security team that our test

device had been artificially stuck on an older version of Baidu IME (version 8.5.20.4)

compared to the one in the Samsung Galaxy Store. This is because, although the

test device was using a Chinese ROM, we were prevented from receiving updates to

Baidu IME because the app was geographically unavailable in Canada, where we

were testing from. Samsung addressed this issue by adding Baidu’s keyboard app

to the global market. Generally speaking, we recommend that Samsung and other

app stores do not geoblock security updates to apps that are already installed.

6.2. Language barriers in responsible disclosures

We suspect that a language barrier may have prevented iFlytek from responding to our initial disclosure in English. After we did not receive a response for one month, we re-sent the same disclosure email, but with a subject line and one-sentence summary in simplified Chinese. iFlytek responded within three days of this second email and promptly fixed the issues we noted. All future disclosure emails to the Chinese mobile device manufacturers were then written with Chinese subject lines and a short summary in Chinese. Though obvious in hindsight, we encourage security researchers to consider whether the company to which they are disclosing uses a different language than the researcher. We suggest submitting vulnerability disclosures, at the very least, with short summaries and email subject lines in the official language of the company’s jurisdiction to prevent delays in disclosure timelines similar to those we may have encountered.


7. Limitations

In this report we detail vulnerabilities relating to the security of the transmission of users’ keystrokes in multiple keyboard apps. In this work we did not perform a full audit of any app or make any attempt to exhaustively find every security vulnerability in any software. Our report concerns analyzing keyboard apps for a class of vulnerabilities that we discovered, and the absence of our reporting of other vulnerabilities should not be considered evidence of their absence.

8. Discussion

In this section we discuss the impact of the vulnerabilities that we found, speculate

as to the factors that gave rise to them, and conclude by introducing possible

ways to systemically prevent such vulnerabilities from arising in the future.

8.1. Impact of these vulnerabilities

The scope of these severe vulnerabilities cannot be overstated: until this and our previous Sogou report, the majority of Chinese mobile users’ keystrokes were decryptable by network adversaries. The keyboards we studied comprise over 95% of the third-party IME market share, which is estimated by marketing agencies to be over 780 million users. In addition, the three phone manufacturers which pre-installed and by default used vulnerable keyboard apps comprise nearly 50% of China’s smartphone market.

The vulnerabilities that we discovered would be inevitably discovered by any-

one who thinks to look for them. Furthermore, the vulnerabilities do not require

technological sophistication to exploit. With the exception of the vulnerability af-

fecting many Sogou-based keyboard apps that we previously discovered, all of the

vulnerabilities that we covered in this report can be exploited entirely passively

without sending any additional network traffic. This also means any existing logs

of network data sent by these keyboards can be decrypted in the future. As such,

we might wonder, are these vulnerabilities actively under mass exploitation?

While many governments may possess sophisticated mass surveillance capa-

bilities, the Snowden revelations gave us unique insight into the capabilities

of the United States National Security Agency (NSA) and more broadly the Five

Figure 13: Locations of XKEYSCORE servers as described in a 2008 NSA slide deck.

Eyes. The revelations disclosed, among other programs, an NSA program called

XKEYSCORE for collecting and searching Internet data in realtime across the globe

(see Figure 13). Leaked slides describing the program specifically reveal only a

few examples of XKEYSCORE plugins. However, one was a plugin that was written

by a Five Eyes team to take advantage of vulnerabilities in the cryptography of

Chinese-developed UC Browser to enable the Five Eyes to collect device iden-

tifiers, SIM card identifiers, and account information pertaining to UC Browser

users (see Figure 14 for an illustration).

The similarity between the vulnerability exploited by this XKEYSCORE plugin and the

vulnerabilities described in this report is uncanny, as they are all vulnerabilities

in the encryption of sensitive data transmissions in software predominantly used

by Chinese users. Given the known capabilities of XKEYSCORE, we surmise that the

Five Eyes would have the capability to globally surveil the keystrokes of all of the

keyboard apps that we analyzed with the exception of Sogou and the apps licensing

its software. This single exception exists because Sogou cannot be monitored

passively and would require sending packets to Sogou servers. Such commu-

nications would be measurable at Sogou’s servers and at other vantage points,

Figure 14: The dashboard of an XKEYSCORE plugin used to monitor for transmissions of

sensitive data insufficiently encrypted by UC Browser as described in a 2012 Five Eyes slide

deck.

potentially revealing the Five Eyes’s target(s) of surveillance to Sogou or Chinese

network operators. Therefore, users of outdated Sogou software would be

undesirable targets of mass surveillance, even if such non-passive measurements

were within the known capabilities of XKEYSCORE or other Five Eyes programs.

Given the enormous intelligence value of knowing what users are typing, we

can conclude that not only do the NSA and more broadly the Five Eyes have

the capabilities to mass exploit the vulnerabilities we found but also the strong

motivation to exploit them. If the Five Eyes’ capabilities are an accurate reflection

of the capabilities and motivations of other governments, then we can assume

that many other governments are also capable and motivated to mass exploit

these vulnerabilities. The only remaining question is whether any government

had knowledge of these vulnerabilities. If they did not have such knowledge

before our original report analyzing Sogou, they may have acquired it afterward in the

same way that our original research inspired us to look at similar keyboard apps

for analogous vulnerabilities. Unfortunately, short of future government leaks,

we may never know if or to what extent any state actors mass exploited these

vulnerabilities.

Even though we disclosed the vulnerabilities to vendors, some vendors failed to

fix the issues that we reported. Moreover, users of devices which are out of support

or that otherwise no longer receive updates may continue to be vulnerable. As

such, many users of these apps may continue to be under mass surveillance for

the foreseeable future.

8.2. How did these vulnerabilities arise?

We analyzed a broad sample of Chinese keyboard apps, finding that they are

almost universally vulnerable to having their users’ keystrokes decrypted by

network eavesdroppers. Yet there is no common library or a single implementation

flaw responsible for these vulnerabilities. While some of the keyboard apps did

license their code from other companies, our overall findings can only be explained

by a large number of developers independently making the same kind of mistake.

As such, we might ask, how could such a large number of independent developers

almost universally make such a critical mistake?

One attempt to answer this question is to suggest that these were not mistakes at

all but deliberate backdoors introduced by the Chinese government. However,

this hypothesis is rather weak. First, user keystroke data is already being sent to

servers within Chinese legal jurisdiction, and so the Chinese government would

have access to such data anyway. Second, the vulnerabilities that we found give

the ability not just to the Chinese government to decrypt transmitted keystrokes

but to any other actor as well. In an ideal backdoor, the Chinese government would

want the desirable property that only they have access to the backdoor. Finally,

the Chinese government has made strides to study and improve the data security

of apps developed and used in China, attempting to prevent and fix the very sort of

vulnerabilities which we discovered. For instance, a 2020 report from CNCERT/CC

found that 60 percent of the 50 banking applications that they investigated did

not encrypt any user data transmitted over the network, among a litany of other

common security issues.

Were Chinese app developers skeptical of using cryptographic standards per-

ceived as “Western”? Countries such as China and Russia have their own en-

cryption standards and ciphers. To our knowledge none of the faulty encryption

implementations that we analyzed adhered to any sort of known standard in any

country, and each appeared to be a home-rolled cipher. However, it is possible

that Asian developers are less inclined to use encryption standards that they fear

may contain backdoors such as the potential Dual_EC_DRBG backdoor.

Perhaps Chinese app developers could be skeptical of standards such as SSL/TLS

as well. The TLS ecosystem has also become nearly universal only in the past

decade. Especially before broad oversight of certificate authorities became com-

monplace, there were many valid criticisms of the SSL/TLS ecosystem. In 2011,

digital rights organizations EFF and Access Now were both concerned about the

certificate authority (CA) infrastructure underpinning SSL/TLS transport encryp-

tion. Even today, the vast majority of root certificates trusted by major OSes and

browsers are operated by certificate authorities based in the Global North. We also

note that all of the IMEs containing vulnerabilities were first released before 2013

and likely had a need for secure network transmission before SSL/TLS became

the de-facto standard for strong transport encryption.

Still, it has been a decade since the Snowden leaks demonstrated the global,

urgent, and practical need for strong encryption of data-in-transit in 2013, and the

TLS ecosystem has largely stabilized, with CA root lists of many major browsers

and OSes controlled by voting bodies and certificate transparency deployed. As

of 2024, almost 95% of web traffic from users of Firefox in the United States is

traveling over HTTPS. In addition, the speed with which both iFlytek and Sogou

switched to TLS demonstrates that making the change to standard TLS is not

necessarily a time or resource issue. Even if skepticism towards SSL/TLS explains

the reluctance to adopt it in the early 2010s, we are not sure why there is much

more inertia in the Chinese Internet ecosystem against making the switch to TLS.

Finally, mobile devices and other operating systems are still incapable of guaran-

teeing the security of data under transmission, despite iOS and Android having

introduced restrictions into their APIs. For instance, iOS 9 implemented App Trans-

port Security, a policy placing restrictions on the ability to transmit data without

TLS. However, there are two limitations of this technology. First, an app can specify

exceptions to this policy in its Info.plist resource. Second, the policy affects

high level APIs and leaves communications over lower level socket-based APIs

unregulated. Similar to iOS, Android 9 disables cleartext traffic using certain high

level APIs by default, but an app may exclude specific domains or avoid the policy

by using lower level APIs.
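
As a rough illustration of the first limitation, an analyst could statically check an app’s Info.plist for App Transport Security exceptions. The Python sketch below is not a tool used in this report. The sample plist and the domain ime.example.com are hypothetical, while the key names (NSAppTransportSecurity, NSAllowsArbitraryLoads, NSExceptionDomains, NSExceptionAllowsInsecureHTTPLoads) are the ones Apple documents for this policy.

```python
import plistlib

# Hypothetical Info.plist contents for illustration; the domain is made up.
SAMPLE_INFO_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>NSAppTransportSecurity</key>
    <dict>
        <key>NSExceptionDomains</key>
        <dict>
            <key>ime.example.com</key>
            <dict>
                <key>NSExceptionAllowsInsecureHTTPLoads</key>
                <true/>
            </dict>
        </dict>
    </dict>
</dict>
</plist>
"""


def find_ats_exceptions(plist_bytes: bytes) -> list:
    """Return human-readable notes on ATS exceptions declared in an Info.plist."""
    info = plistlib.loads(plist_bytes)
    ats = info.get("NSAppTransportSecurity", {})
    findings = []
    # A global opt-out disables the TLS requirement for every domain.
    if ats.get("NSAllowsArbitraryLoads", False):
        findings.append("ATS disabled globally (NSAllowsArbitraryLoads)")
    # Per-domain exceptions can re-enable cleartext HTTP for specific hosts.
    for domain, rules in sorted(ats.get("NSExceptionDomains", {}).items()):
        if rules.get("NSExceptionAllowsInsecureHTTPLoads", False):
            findings.append(f"cleartext HTTP allowed for {domain}")
    return findings


if __name__ == "__main__":
    for note in find_ats_exceptions(SAMPLE_INFO_PLIST):
        print(note)
```

A clean result from such a check would say nothing about the second limitation: traffic sent over lower level socket APIs is outside the policy and would not appear in the Info.plist at all.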

8.3. Can we systemically address these vulnerabilities?

Individually analyzing apps for this class of vulnerabilities and individually re-

porting issues discovered is limited in the scale of apps that it can fix. First, while

we can attempt to manually analyze some of the most popular keyboard apps,

we will never be able to analyze every app at large. Second, we might not be

able to predict which apps to look at in the first place. For instance, before we

analyzed Sogou and the keyboard apps featured in this report, we never would

have expected that their network transmissions would be so easily vulnerable

to interception. In light of the limitations of the methods that we employed in

this report, in the remainder of this section we discuss possibilities for how we

might systematically or wholesale address apps which transmit sensitive data

over networks without sufficient encryption.

8.3.1. By security researchers paying more attention to the Chinese Internet

There appears to be a general failure of researchers to analyze Chinese apps and

the Chinese Internet ecosystem at large, despite its size and influence. The Google

Play Store and Apple App Store ecosystems, for instance, are commonly studied

by privacy researchers, but many Chinese app stores are overlooked, even though

many popular Chinese apps have more users than their counterparts on

the Google Play Store. While the vulnerabilities that we discovered were not all

trivial to find and many took substantial analysis to attack, most would have been

inevitably discovered by any researcher analyzing these apps for data security. A

researcher studying network traffic from users of Chinese devices could also have

identified strange, non-standard traffic.

8.3.2. By using app store enforcement

One might call on app stores to enforce the use of sufficient encryption to protect

sensitive data in transit. App stores already have a number of rules that they

enforce through a combination of automated and manual review. Calling on app

stores to enforce sufficient encryption of in-transit sensitive data is tempting given

the resources of the companies operating the app stores. However, failing any

other innovation, the same scaling issues that apply to other researchers studying

these apps will apply to those working for these companies.

8.3.3. By using device permission models

On Android devices, installing any keyboard, regardless of whether or how it com-

municates with servers over the Internet, brings up a pop-up with the following

text:

This input method may be able to collect all the text you type, includ-

ing personal data like passwords and credit card numbers.

The wording of these warning messages is overbroad and does not necessarily

help users distinguish between keyboards that transmit keystrokes over the net-

work, keyboards that transmit keystrokes insecurely (using something other than

standard TLS) over the network, and keyboards that do not transmit any data at

all.

iOS devices, on the other hand, sandbox their keyboards by default. There is a “Full

Access” or “open access” permission that must be explicitly granted to keyboards

before they have network access, among other privileges. Without this permission,

third-party keyboards cannot transmit network data. We recommend Android

also adopt a more fine-grained permission model for keyboards.

Furthermore, the vulnerable apps that we studied transmit data using low level

socket APIs versus higher level APIs that require the usage of TLS or HTTPS. One

might desire that separate system calls be designed for TLS or HTTPS traffic in

addition to the lower level socket system calls so that devices could implement an

UNSAFE_INTERNET permission that would be required for apps to use the lower

level system calls while still allowing TLS-encrypted traffic for apps that do not

have this permission.

While this approach may have some merit, it also has certain drawbacks. It makes

sense for situations where apps are untrustworthy and the operating system is

completely trustworthy, but there are common situations where the operating

system could be not as or even less trustworthy than apps that it is running. One

common case would be a user who is running an up-to-date app on an out of

date operating system, possibly because the user’s device is no longer receiving

operating system updates. In such a case, the app’s implementation of TLS is

more likely to be secure than that of the operating system. Furthermore, a user’s

operating system may be compromised by malware or otherwise be untrustwor-

thy in itself. Introducing a TLS system call would centralize the encryption of all

sensitive data and grant the operating system easy visibility into all unencrypted

data. In any case, innovating in areas of encryption is an important right of appli-

cation developers, and it may not make sense to stifle apps like Signal because of

their use of end-to-end or other novel encryption by requiring them to obtain an

UNSAFE_INTERNET permission.

One might alternatively desire for apps at large to not be able to access the Inter-

net at all. Instead of an UNSAFE_INTERNET permission, what about introducing

an INTERNET permission to govern all Internet socket access, similar to the “Full

Access” permission which iOS already applies to keyboard apps? Android devices

in fact already have such a permission that apps must request to use Internet

(AF_INET) sockets, but it is not a permission that is exposed to ordinary users

either in the Google Play Store or through any stock Android user interface, and it

is automatically granted when installing an app. Unfortunately, given all of the

interprocess communication (IPC) vehicles on modern smart devices, restricting

Internet socket access may not guarantee that the app could not communicate

over the Internet (e.g., through Google Play services). GrapheneOS, an open source

Android-based operating system, implements a NETWORK permission. However,

denying this permission can lead to surprising results where apps can still com-

municate with the Internet via IPC with other apps. As such, we recommend that

both the developers of Android and iOS work toward a meaningful INTERNET

permission that would adequately inform users of whether an app communicates

over the Internet.

8.3.4. By international standards bodies better engaging with Chinese developers

We encourage international standards bodies like the IETF to continue to engage

with and reach out to Chinese Internet companies and engineers in good faith to further

reduce friction in cross-linguistic knowledge transfer. The presence of these similar

but independent vulnerabilities demonstrates that there is friction in the

transfer and implementation of knowledge between the English-speaking cryp-

tography community and the Chinese cryptography community. For instance,

Schneier’s Law or the oft-repeated mantra “don’t roll your own crypto” may be

common knowledge to cryptographers trained in English, but may be lost in

translation. A lag across linguistic boundaries means that general information like

the recent stabilization of TLS and webPKI infrastructure may travel more slowly,

and updating encryption software to reflect new information may lag even further

behind. One other possible example of this phenomenon is that, according to Fire-

fox Telemetry, up until 2020, the Japanese Internet ecosystem also significantly

lagged behind the global average in HTTPS adoption.

Although protocols put out by the IETF and other international standards bodies can

be far from bulletproof, these bodies can still help facilitate international com-

munication about the current state-of-the-art in protocol encryption. The burden

of cross-linguistic and cross-cultural exchange on technical standards falls on

global standards bodies. Western media outlets and researchers tend to uniformly

attribute the actions and participation of private Chinese companies within stan-

dards bodies to government actors seeking sovereignty over Internet standards.

While skepticism may be warranted in certain cases, there is also research that

challenges a simplistic and overbroad narrative. As a single data point, we note

that we did not find these issues in the keyboards of Huawei, whose employees are

often noted as especially active participants in IETF standard-setting.

8.3.5. By using automated static or dynamic analysis

There has been a failure of automated tools to detect insecure traffic at large.

Longitudinal TLS telemetry has largely been focused on web-based perspectives

(i.e., how many domains support TLS or how many web connections are encrypted

by TLS?), and the mobile perspective is often overlooked, despite the increasing

dominance of mobile traffic globally. Although there are some research projects

that survey TLS usage in Android mobile apps at scale, there is no public

longitudinal data from these projects (i.e., they are run as one-off studies), and many

focus on Google Play’s Android ecosystem, thereby excluding the Chinese

mobile Internet. There is perhaps a need for public longitudinal TLS telemetry for

popular mobile applications globally, via automated static or dynamic analysis at

scale.
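
One simple form that such dynamic analysis could take is triaging captured payloads that look like neither TLS nor plaintext. The Python sketch below is an assumption about how a first-pass filter might work, not the method used in this report. The function names and the 7.0 bits-per-byte threshold are illustrative choices, and the TLS check only inspects the five-byte record header.

```python
import math
from collections import Counter

# TLS record content types: change_cipher_spec, alert, handshake, application_data
TLS_CONTENT_TYPES = {20, 21, 22, 23}
# Upper bound on a TLS 1.2 ciphertext fragment (2^14 + 2048 bytes)
MAX_RECORD_LENGTH = 2**14 + 2048


def looks_like_tls_record(payload: bytes) -> bool:
    """Check whether a payload begins with a plausible TLS record header."""
    if len(payload) < 5:
        return False
    content_type, major, minor = payload[0], payload[1], payload[2]
    length = int.from_bytes(payload[3:5], "big")
    # Record-layer versions 3.0 through 3.4 cover SSL 3.0 to TLS 1.3.
    return (content_type in TLS_CONTENT_TYPES
            and major == 3 and minor <= 4
            and length <= MAX_RECORD_LENGTH)


def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    return -sum((n / total) * math.log2(n / total) for n in Counter(payload).values())


def classify_payload(payload: bytes, entropy_threshold: float = 7.0) -> str:
    """Sort a payload into a coarse bucket for later manual review."""
    if looks_like_tls_record(payload):
        return "tls"
    # Entropy cannot exceed log2(len(payload)), so short payloads never reach
    # the default threshold; this bucket is only meaningful for larger samples.
    if shannon_entropy(payload) >= entropy_threshold:
        return "opaque-non-tls"
    return "plaintext-or-structured"


if __name__ == "__main__":
    print(classify_payload(b"\x16\x03\x01\x00\x05hello"))
    print(classify_payload(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))
    print(classify_payload(bytes(range(256)) * 4))
```

A filter like this would only surface candidates. Compressed data is also high in entropy, and a payload flagged as opaque could be using a sound non-TLS protocol, so each hit would still need manual analysis of the kind described in this report.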

8.3.6. By using attestations in app stores

Another way for users to gain visibility into the security and privacy properties

of their apps is through the use of developer attestations, such as the ones that

appear in data safety sections in many popular app stores. Both the Apple App

Store and the Google Play Store collect and display such attestations to varying

extents, including attestations as to what data an app collects (if any) and with

whom it is shared (if anyone). Additionally, the Play Store allows developers the

Figure 15: An example of an attestation for Microsoft SwiftKey.

opportunity to attest to performing “encryption in transit” (see Figure 15 for an

example). These attestations allow users to clearly see what security and privacy

properties an app’s developer claims it to have and, like privacy policies, they

provide means of redress if violated.

We wanted to evaluate whether the apps that we analyzed lived up to their attes-

tations concerning their encryption in the app stores in which they are available.

Among the apps that we analyzed, only Baidu IME was available in the Play Store.

At the time of this writing, it does not attest to its data being encrypted in transit.

Although other apps that we analyzed were available in Apple’s App Store, to our

knowledge, this store does not display an attestation for whether the app encrypts

data in transit. As such, across both the Google Play and the Apple App stores,

attestations were insufficient for compelling the keyboard apps’ developers to

implement proper encryption or for providing users with any opportunity for redress.

In light of the above findings, we believe that users would benefit from the follow-

ing recommendations: (1) that app store operators require developers to attest to

whether or not an app encrypts data in transit, (2) that app store operators display

not only when developers attest to all data being encrypted in transit but also

display a warning when they fail to, and (3) that app store operators require apps

in certain sensitive categories, such as keyboard apps, to either positively attest

to encrypting all data in transit or to attest to not transmitting any data at all.

Since most of the apps that we found perform some type of encryption, even if it

were wholly inadequate, one might wonder if attesting that data is merely “en-

crypted” is enough, since the data arguably did have some manner of encryption

applied to it during transit. The Play Store provides some guidance on this topic.

Under the question — “How should I encrypt data in transit?” — the documen-

tation notes: “You should follow best industry standards to safely encrypt your

app’s data in transit. Common encryption protocols include TLS (Transport Layer

Security) and HTTPS.”

Another issue with attestations is that they provide no guarantee that an app

behaves as its developers attest, as developers can, after all, make false attestations.

While we wish that attestations could guarantee that an app sufficiently

implements proper cryptography to the same extent that a permission system

can guarantee an app does not use a microphone, false attestations provide an

opportunity for redress. For instance, apps which are found to violate attestations

would be subject to removal from app stores. Furthermore, apps which violate

attestations could be subject to fines by regulatory bodies such as the FTC. Finally,

apps which violate the attestation could be liable to civil suits.

While the apps we analyzed were predominantly available from Chinese app

stores, we equally recommend that Chinese app stores adopt these recommen-

dations in addition to the Apple App Store and the Google Play Store. Moreover,

while this report focuses on the problem of poor encryption practices as it ap-

plies to Chinese apps, the problem to varying extents applies to apps of all other

provenances.

9. Summary of recommendations

We conclude our report by summarizing our recommendations to multiple stake-

holders.

Recommendations to security researchers

• Researchers should analyze more apps from the East Asian app ecosystem

and from other popular ecosystems which may be outside of their own

locale.

• Researchers should develop better static and dynamic analysis techniques

to recognize the types of vulnerabilities that we discovered in this report at

scale.

• Researchers submitting vulnerability disclosures to a company should include

short summaries and email subject lines in the official language of

the company’s jurisdiction.

Recommendations to international standards bodies

• International standards bodies should continue to engage with security

engineers from Chinese Internet companies.

Recommendations to app store operators

• App stores should not require account registration as a condition to receive

security updates.

• App stores should not geoblock security updates.

• App stores should allow developers to attest to all data being transmitted

with encryption, similar to the ability in the Google Play Store.

• App stores should display not only when developers attest to all data being

encrypted in transit but also display a warning when they fail to.

• App stores should require apps in certain sensitive categories, such as keyboard

apps, to either positively attest to encrypting all data in transit or to

attest to not transmitting any data at all.

Recommendations to keyboard app developers

• Use well-tested and standard encryption protocols, like TLS or QUIC.

• Make every attempt to provide features on-device without requiring

transmitting sensitive data to cloud servers.
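
A minimal sketch of the first recommendation, assuming a Python client: rather than designing a custom cipher, a developer can rely on the platform’s TLS implementation with certificate and hostname verification left on. The function names, the host and port parameters, and the choice of TLS 1.2 as a minimum version are illustrative assumptions and are not taken from any of the apps analyzed in this report.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate and hostname checks on."""
    # create_default_context() loads the system trust store and enables
    # certificate verification and hostname checking by default.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def send_over_tls(host: str, port: int, payload: bytes, timeout: float = 10.0) -> bytes:
    """Send payload to host:port over TLS and return the first chunk of the reply."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # server_hostname enables SNI and is the name checked against the certificate.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)
            return tls_sock.recv(4096)
```

In this sketch a failed certificate or hostname check raises an exception before any payload is sent, so application data never falls back to an unauthenticated channel.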

Recommendations to mobile operating system developers

• Android should implement sandboxing by default for keyboard apps, similar

to iOS, that prevents a keyboard from transmitting network traffic among

other activities until a user grants the app full access.

• The developers of Android and iOS should work toward a meaningful INTERNET

permission that would adequately inform users of whether any app

communicates over the Internet.

Recommendations to device manufacturers

• Conduct security audits of third-party keyboards that you intend to pre-install

by default on your operating systems.

Recommendations to users

• Users of Honor’s pre-installed keyboard or users of QQ Pinyin should switch

keyboards immediately.

• Users of any Sogou, Baidu, or iFlytek keyboard, including the versions that

are bundled or pre-installed on operating systems, should ensure their

keyboards and operating systems are up-to-date.

• Users of any Baidu IME keyboard should consider switching to a different

keyboard or disabling the “cloud-based” feature.

• Users with privacy concerns should not enable “cloud-based” features on

their keyboards or IMEs or should switch to a keyboard that does not offer

“cloud-based” prediction.

• iOS users with privacy concerns should not enable “Full Access” for their

keyboards or IMEs.

A. Known affected software

We recommend that all users keep their operating systems and apps, including

keyboard apps, up to date. If you use any of the following software, we especially

recommend you update to the most recent version of your OS and application. As

of April 1, 2024, the following software has fixes available:

Separately installed, third-party keyboards

• Sogou IME / 搜狗输入法 for Android and Windows

• Baidu IME / 百度输入法 for Windows (this software has only been partially fixed, see below)

• iFlytek IME / 讯飞输入法 for Android

Pre-installed on Samsung devices with Chinese edition ROM

• Samsung Keyboard

• Baidu IME / 百度输入法

Pre-installed on Xiaomi devices with Chinese edition ROM

• Sogou IME Xiaomi Version / 搜狗输入法小米版

• iFlytek IME Xiaomi Version / 讯飞输入法小米版

Pre-installed on OPPO devices with Chinese edition ROM

• Sogou IME Custom Version / 搜狗输入法定制版

Pre-installed on Vivo devices with Chinese edition ROM

• Sogou IME Custom Version / 搜狗输入法定制版

The following software does not use TLS and may still contain weaknesses:

Separately installed, third-party keyboards

• Baidu IME / 百度输入法 for Android, Windows, and iOS

Pre-installed on Xiaomi devices with Chinese edition ROM

• Baidu IME Xiaomi Version / 百度输入法小米版

Pre-installed on OPPO devices with Chinese edition ROM

• Baidu IME Custom Version / 百度输入法定制版

The following software has not been fixed and is easily exploitable, and we suggest

that users switch to another keyboard entirely:

Separately installed, third-party keyboards

• QQ Pinyin IME / QQ拼音输入法 for Android and Windows

Pre-installed on Honor devices with Chinese edition ROM

• Baidu IME Honor Version / 百度输入法荣耀版

B. Disclosure timelines

For the disclosure timelines, please see here.
