

Creating Mandarin Chinese lesson tree using DL Language Incubator

To whoever is interested in creating a Mandarin Chinese lesson tree using the DL Language Incubator: can we base it on the HSK? The HSK is the official Chinese proficiency test for certification from China. Here is the website: http://www.hsk.org.cn/english/default.aspx.

There are 6 levels:

Level 1: 150 words

Level 2: 300 words

Level 3: 600 words

Level 4: 1200 words

Level 5: 2500 words

Level 6: 5000 words

Based on the vocabulary count of other DL languages, which is around 1500-2000 words, I would expect that using DL we could become proficient in Mandarin to around level 4-5. Any thoughts, anyone?
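To make the level 4-5 estimate concrete, here is a minimal sketch that checks which HSK level a given vocabulary size fully covers, using only the word counts quoted above. The function name `highest_level_covered` is made up for illustration, and the sketch assumes the words in a DL tree would overlap with the HSK lists, which is not guaranteed.

```python
from bisect import bisect_right

# Cumulative vocabulary sizes per HSK level, as quoted in the post above.
HSK_VOCAB = {1: 150, 2: 300, 3: 600, 4: 1200, 5: 2500, 6: 5000}

def highest_level_covered(word_count: int) -> int:
    """Return the highest HSK level whose word count is <= word_count (0 if none).

    Assumption: the learner's words are the same words the HSK lists contain.
    """
    thresholds = sorted(HSK_VOCAB.values())
    # The number of thresholds <= word_count equals the level, since levels run 1..6.
    return bisect_right(thresholds, word_count)

for count in (1500, 2000):
    print(count, "words -> covers HSK level", highest_level_covered(count))
```

Both 1500 and 2000 words cover the level 4 count (1200) but fall short of level 5 (2500), which matches the "around level 4-5" guess.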

September 26, 2013



I actually think this is a great idea. I have a bachelor's in Chinese and passed HSK 6 just a few months ago, and the HSK does a great job of teaching the key vocabulary you need to increase your comprehension (although there are some outliers). I think level 6 might be too advanced for what Duolingo users will need, but I totally support consulting the HSK when making the lessons, especially because it will provide a great guide: Chinese needs to be tackled very differently from Western languages. With Spanish, French, and German, a lot of time needs to be spent on things like verb conjugation. Verb conjugation, and really inflection of any kind, simply doesn't exist in Chinese, but the language has a whole other set of challenges to overcome.


I also think that the HSK should be the core for anyone trying to learn Mandarin using Duolingo. I would like people to start discussing the challenges of building it. For example: Who is going to record the listening exercises? What about sound? What about the tones? Are we going to use pinyin to type in the answers? What about stroke order? Will it be only character recognition, without actually learning how to write the characters? Will there be a preferred grammar source that users can draw on to explain things we normally don't have in Indo-European languages (e.g. measure words)? I think the importance of the Chinese language in the world (and on the web) deserves special care compared with other languages, and I would like someone from the Duolingo team to pay special attention to this issue.


I'm assuming the listening exercises will just be robotic, like they are for the other languages. That actually works a lot better for Chinese than for other languages, because the tones sometimes make natural speech sound a bit robotic anyway. For anything besides European languages, I think most languages will need a general guide in the beginning to explain certain concepts; for Chinese, they'll need to explain tones and characters a bit. But I don't see why there needs to be a preferred grammar source; there is none given for the other languages. Including grammar explanations where we can will be useful, but any language learner figures out how to accumulate an array of language-learning resources. They've said there will be an application process for people to contribute to the language lessons, so honestly I don't really think this is the time or place for discussing the challenges of building it; that'll be for the moderators when the Incubator is released.


I see. After I posted the reply to your comment, I went to the general discussion and learned about the application process, which I think is a good way to tackle the issue.


If we want people to learn how to immerse themselves and, in time, translate things, we are going to have to teach them how to read hanzi characters, preferably simplified, because that is what the mainland uses. Without this, users' ability to learn the language by themselves will be greatly stunted.


I agree, simplified characters would be the way to go. How would they type them in though? I use pinyin.


I think for pretty much all non-Latin languages users are just going to have to get used to international keyboards. Language learners have to do that at some point anyways. And the Chinese pinyin input systems are the most common and not that difficult.


At first I think we should just use the pinyin-to-hanzi input method, but over time (and I'm thinking maybe a couple of years or so) we could also add a stroke recognition system for users who'd prefer that (also excellent for reinforcing the stroke order of characters as they are written in real life). This could also be implemented for those seeking to learn fan ti zi (traditional characters) in both Japanese and traditional hanzi (for use in Taiwan or for Cantonese learners as well).


There is also another standardized test for Chinese proficiency called TOCFL (mainly administered in Taiwan), whose requirements seem much higher than similar levels in HSK: http://www.sc-top.org.tw/english/eng_index.php

The vocabulary requirements are as follows:

TOCFL Level 1 (CEFR A1): 500 words

TOCFL Level 2 (CEFR A2): 998 words

TOCFL Level 3 (CEFR B1): 2501 words

TOCFL Level 4 (CEFR B2): 4997 words

TOCFL Level 5 (CEFR C1): 7989 words

TOCFL Level 6 (CEFR C2): vocabulary requirement not established

Of course language proficiency is not defined by the number of words in a speaker's vocabulary alone. Nonetheless, to a certain extent, the TOCFL standard agrees with the assessment of HSK levels by the Fachverband Chinesisch (Association of Chinese Teachers in German Speaking Countries): http://www.fachverband-chinesisch.de/sites/default/files/FaCh2010_ErklaerungHSK.pdf

Maybe when we build the Chinese lessons in the Incubator we can keep this in mind, and we can also use the TOCFL vocab list as another reference.


On the one hand I think that this is a good idea, but on the other hand I think that learning by frequency is a lot more helpful for beginners.

The frequency list will be somewhat similar to the HSK lists. However, the HSK lists are not ordered(?), and I guess that the lower HSK levels especially contain more everyday words (like names of clothes, body parts, things in the house, etc.) which you may not really need that often.

I actually learned Chinese vocabulary by frequency until I knew about 1500 words. Especially at the beginning this is super helpful, because you will be able to understand about 75% of a conversation with just the few hundred most frequent words.

That's why I generally prefer learning by frequency. :D

For Duolingo it would probably be nice to have a mixture of both. Introduce the vocabulary by frequency in general, but also put the words recommended by the HSK in the tree. I also think this is very similar to what we already have for the existing languages :)
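One possible way to sketch that mixture: go through the words in frequency order first, then append any HSK words the frequency list missed. Everything here is hypothetical. The function name `build_lesson_order` is invented, and the two small word lists are placeholders for illustration, not real corpus data or an actual HSK list.

```python
def build_lesson_order(frequency_ranked, hsk_words):
    """Order words by frequency, then append HSK words missing from that list.

    Both arguments are plain lists of words; duplicates are dropped,
    keeping the first position at which a word appears.
    """
    seen = set()
    ordered = []
    for word in list(frequency_ranked) + list(hsk_words):
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered

# Placeholder data (assumed for illustration only).
frequency_ranked = ["的", "是", "我", "不"]
hsk_words = ["我", "衣服", "是"]

print(build_lesson_order(frequency_ranked, hsk_words))
```

With the placeholder lists above, 衣服 ("clothes") ends up after the frequency-ranked words, so an everyday HSK word still makes it into the tree without displacing the most frequent ones.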
