Senior Design Projects

ECS193 A/B Winter & Spring 2021

Chinese IME with Customizable Lexicons

Email **********
Chengzhi Chu & Fuqiang Zhuo
UC Davis

Project details

Given the complexity of the Chinese writing system, second-language (L2) learners of Chinese commonly type characters by entering pinyin. However, when a user types a pinyin syllable for a specific character in a Chinese input method editor (IME) such as Microsoft IME, Sogou Pinyin, Google Pinyin, or those built into Apple devices, many homophonic or associated characters are presented at once, because current IMEs use very large lexicons. Consequently, lower-level learners struggle to identify the right character among the long list of candidates. A Chinese IME with customized or graded lexicons would solve this problem and greatly help students, especially those in the early stages of learning Chinese. It would not only improve their learning efficiency but also reduce their cognitive load and frustration.
• The app shall be pinyin-based, similar to Microsoft Pinyin IME, Sogou Pinyin, or Google Pinyin. When a user/student types a pinyin string, the IME shall search for characters/words only in the lexicon(s) selected for the student's Chinese level.
• Four smart-pinyin options shall be provided in settings for users to select: 1) full pinyin, 2) abbreviation, 3) fuzzy pinyin, and 4) word prediction.
• Users can change the lexicon selection in the settings at any time. Users can also upload their own vocabulary lists as customized lexicons, as a .txt file in a format such as: word/character, pinyin, frequency rank. An interface for loading user lexicons is therefore needed.
• Candidate characters/words for one pinyin string shall appear for selection in the order of their rankings in the selected lexicon.
• Cross-platform implementation: The app can be used on both Windows and macOS. Ideally it can also be used on smartphones.
• Open source for future improvements.
• The graded lexicons will be provided by Dr. Chu and Dr. Zhuo.
• A working prototype of the app shall be ready in April 2021, with the final release by June 2021.
• The source code, with clear and thorough documentation, shall be maintained on GitHub.
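The lexicon and candidate-ranking requirements above can be illustrated with a minimal Python sketch. It is not a proposed implementation. It assumes a comma-separated .txt layout (word, toneless pinyin with spaces between syllables, frequency rank), and the names Entry, load_lexicon, and candidates, the sample entries, and the zh/ch/sh fuzzy pairs are all hypothetical choices made for illustration. Word prediction is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str         # character or word, e.g. "你好"
    syllables: tuple  # toneless pinyin syllables, e.g. ("ni", "hao")
    rank: int         # frequency rank in the lexicon (1 = most frequent)


def load_lexicon(lines):
    """Parse lines of the assumed form 'word, pinyin, rank'."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, pinyin, rank = [field.strip() for field in line.split(",")]
        entries.append(Entry(word, tuple(pinyin.split()), int(rank)))
    return entries


def _defuzz(pinyin):
    """Merge the assumed fuzzy initial pairs zh/z, ch/c, sh/s."""
    return pinyin.replace("zh", "z").replace("ch", "c").replace("sh", "s")


def candidates(typed, lexicon, abbreviation=False, fuzzy=False):
    """Return matching words from the selected lexicon, best rank first."""
    typed = typed.lower().replace(" ", "")
    matches = []
    for entry in lexicon:
        full = "".join(entry.syllables)
        initials = "".join(s[0] for s in entry.syllables)
        hit = typed == full  # full pinyin is always checked
        if fuzzy and not hit:
            hit = _defuzz(typed) == _defuzz(full)
        if abbreviation and not hit:
            hit = typed == initials
        if hit:
            matches.append(entry)
    return [e.word for e in sorted(matches, key=lambda e: e.rank)]


# Hypothetical sample lexicon; the ranks are made up for illustration.
SAMPLE = """\
你好, ni hao, 12
是, shi, 3
四, si, 40
十, shi, 25
你, ni, 5
"""

lexicon = load_lexicon(SAMPLE.splitlines())
print(candidates("shi", lexicon))                     # ['是', '十']
print(candidates("si", lexicon, fuzzy=True))          # ['是', '十', '四']
print(candidates("nh", lexicon, abbreviation=True))   # ['你好']
```

Because only entries in the loaded lexicon are searched, switching to a smaller graded lexicon directly shortens the candidate list. A real IME would likely index entries by pinyin rather than scan the whole list on each keystroke.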
In addition to the programming skills necessary for the project, teamwork and accountability are expected. Basic Chinese proficiency is a plus.
**********
30-60 min weekly or more
Open source project