"You can now read X% of all real articles" - What does this mean statistically?
Does "You can now read 40% of all real Spanish articles" mean:
(1) "Given 100 articles written in Spanish, I can read 40 of them at 100% comprehension"
or does it mean
(2) "Given 100 articles written in Spanish, I can read 100 of them at 40% comprehension"
Of course! I meant that if you drew 100 articles uniformly at random from our database, you'd be able to read 40 on average.
Loosely speaking, we estimate whether you can read a given article based on the fraction of words in the article that you've seen in lessons. But it's not quite as simple as that, because not all words are treated equally. For example, if the article is about Google, the fact that you haven't seen "Google" in lessons doesn't make the article any harder to understand.
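To make that concrete, here is a minimal sketch of how such an estimate could work. It is not Duolingo's actual code: the function names, the `ignored_words` set (standing in for words like "Google" that shouldn't count against you), and the 0.9 cutoff are all assumptions made up for illustration.

```python
def coverage(article_words, known_words, ignored_words=frozenset()):
    """Fraction of an article's counted words that the learner has seen.

    known_words and ignored_words are assumed to be lowercase sets.
    Words in ignored_words (e.g. proper nouns) are skipped entirely.
    """
    counted = [w.lower() for w in article_words if w.lower() not in ignored_words]
    if not counted:
        return 1.0  # nothing left to count against the learner
    known = sum(1 for w in counted if w in known_words)
    return known / len(counted)


def readable_share(articles, known_words, ignored_words=frozenset(), threshold=0.9):
    """Share of articles whose coverage reaches the (hypothetical) threshold."""
    if not articles:
        return 0.0
    readable = sum(
        1 for a in articles
        if coverage(a, known_words, ignored_words) >= threshold
    )
    return readable / len(articles)


# Toy data: two tiny "articles" as word lists.
articles = [
    ["Google", "es", "una", "empresa"],
    ["el", "gato", "come", "pescado"],
]
known = {"es", "una", "empresa", "el", "gato"}
print(readable_share(articles, known, ignored_words={"google"}))
```

With this toy data the first article counts as readable (every counted word is known once "Google" is ignored) and the second does not, so the reported share is the fraction of whole articles, which is interpretation (1) from the question.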
So the assumptions are (1) that your database is representative of published articles in general (fiction? philosophy? biology? news?) and (2) that you "can read" a given article if you recognize a certain fraction of its words (what fraction? how was it tested?). Then you announce rather optimistic results (like 96.1%) with a surprising level of precision.
There are more possibilities: (3) You can read 40% of the distinct words that appear in written Spanish. (4) You can read 40% of the word occurrences that appear in written Spanish. (So every "the" on a page counts separately, for example.) If I had to guess what is meant, I would pick (4).
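The difference between (3) and (4) is easy to see with a small sketch. The function names and the sample sentence below are made up for illustration; the point is only that counting distinct words and counting occurrences give different percentages for the same vocabulary.

```python
from collections import Counter


def type_coverage(tokens, known):
    """Interpretation (3): share of DISTINCT words that are known."""
    types = set(tokens)
    if not types:
        return 0.0
    return len(types & known) / len(types)


def token_coverage(tokens, known):
    """Interpretation (4): share of word OCCURRENCES that are known."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for w, c in counts.items() if w in known) / total


tokens = "the cat saw the dog and the bird".split()
known = {"the", "and"}

print(type_coverage(tokens, known))   # 2 of 6 distinct words
print(token_coverage(tokens, known))  # 4 of 8 occurrences
```

Because frequent words like "the" are counted every time they appear, the occurrence-based figure is higher than the distinct-word figure for the same known vocabulary.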
You can read more about it here: http://www.lingholic.com/how-many-words-do-i-need-to-know/. It's not a scientific article, but it is a very clear explanation of the Pareto principle. You really don't have to know tens of thousands of words to read a language :-) However, it is different if you are learning a language from a completely different culture. The languages on Duolingo at this point are Romance and Germanic languages, so that would not be a problem.
Duolingo has never shown me this screen... Is it some kind of A/B testing beta roll-out? Or do I have to do something special to unlock this?
I googled around and found a picture: https://lh3.googleusercontent.com/-9Ti3X2okJkQ/Uh4_tbOCldI/AAAAAAAAD-8/bSd38_cv7Tw/w988-h568-no/Capture.JPG
I'm a bit confused by this. The percentage of articles Duo thinks I can read jumped from the upper 60s to the mid 80s, and now suddenly to over 90%, all within a week. I am nowhere near done with the Spanish sequence... Doesn't this seem... wrong?