## Weekly Incubator Update: Tracking Progress from February 15th to February 22nd

When I was in elementary school, I read a remarkable story. It was a Hindi translation of a Russian original by a talented Russian author who was also ambidextrous (literally). He used to write with his right hand and draw pictures with his left (or perhaps vice versa).

Anyway, it is about animals looking for shelter in the rain. One of them finds a mushroom and kindly lets the other animals take shelter under it as they come by. When the rain is over, they come out of their shelter only to realize that the mushroom was able to accommodate the entire group because it had itself grown during the incessant rain.

It almost seems that the incubator basket has similarly grown to hold as many as 23 eggs now! And the rain is refusing to cease! So you can expect it to grow further in the coming weeks and months. Here's the current breakdown:
1. Phase 1: 23 courses
2. Phase 2: 12 courses
3. Phase 3: 27 courses
Total: 62 courses!

More statistics
After last week's interesting discussion about adding/replacing mean with median, you will see a change with this week's update. As a consumer of this information, I consider the "mean" to be more useful in our use case. I do get a sense of the "median" by looking at the visual chart.

However, I am a big fan of experimenting. Unless one prototypes, it is hard to change the status quo. I had predicted to myself that the median would stay at 1, or perhaps move up to 2 as we go week by week. Having stated my position about the "mean", I was in for a slight surprise. As you will see below, the median too is more dynamic than I had imagined. I also took the liberty of adding the "mode".

I am writing three comments below to get your direct feedback for the long run:
1. Should we keep it the old way - only mean?
2. Should we keep it the new way - with mean, median, and mode?
3. Should we only have median?

PHASE 1 Progress:
French for Portuguese - 84% | 87% | 92% | 96% (+4)

Turkish for English - 66% | 70% | 80% | 88% (+8)

English for Thai  - 83% | 83% | 84% | 86% (+2)

Esperanto for English - 73% | 76% | 80% | 79% (-1)

Hungarian for English - 76% | 77% | 79% | 79% (+0)

Ukrainian for English - 62% | 66% | 75% | 76% (+1)

German for Italian - 62% | 63% | 64% | 65% (+1)

German for Turkish - 59% | 62% | 63% | 65% (+2)

Russian for English - 57% | 59% | 60% | 61% (+1)

Spanish for Italian - 51% | 52% | 54% | 55% (+1)

Spanish for German - 39% | 45% | 49% | 51% (+2)

Norwegian for English - 21% | 28% | 37% | 49% (+12) *

French for Italian - 43% | 46% | 47% | 48% (+1)

German for French - (New) 26% | 40% | 44% (+4)

Spanish for Chinese - (New) 39% (+1)

French for Chinese - (New) 36% (+0)

Swedish for Russian - (New) 35% (+0)

Romanian for English - 25% | 26% | 26% | 27% (+1)

German for Portuguese - 1% | 9% | 22% | 27% (+5)

Polish for English - 13% | 13% | 14% | 15% (+1)

Vietnamese for English - 4% | 4% | 8% | 10% (+2)

Greek for English - (New) 1% (+1)

Yiddish for English - (New) 0% | 0% (+0)

Mean - 2.41% | 3.33% | 4.32% | 2.13% (-2.19)
Median - 2.5% | 2% | 3% | 1% (-2)
Mode - 0% | 2% | 3% | 1% (-2)

^ The Turkish, the Hungarian, & the Russian teams' progress is as per their own calculation.
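
For anyone who wants to check the last column of the three summary rows, it can be recomputed from the week-over-week changes in the list above. This is only a minimal Python sketch: it assumes the 23 bracketed changes are the inputs, and the variable names are illustrative.

```python
from statistics import mean, median, mode

# Week-over-week changes (percentage points) for the 23 Phase 1 courses,
# copied in order from the bracketed figures in the list above.
weekly_change = [
    4, 8, 2, -1, 0, 1, 1, 2, 1, 1, 2, 12,
    1, 4, 1, 0, 0, 1, 5, 1, 2, 1, 0,
]

summary = {
    "mean": round(mean(weekly_change), 2),  # arithmetic average
    "median": median(weekly_change),        # middle value of the sorted list
    "mode": mode(weekly_change),            # most frequent value
}
print(summary)
```

With these inputs the sketch gives 2.13, 1 and 1, which agrees with the latest figures reported for the mean, median and mode.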

Progress Visualized:

Here's what the contributing teams shared during the last week:

S.Chx from Team Chinese wrote 4 days ago:

S.Chx, Happy New Year to you too!

hideki from Team Japanese wrote 21 hours ago:

New Contributor

@mhagiwara has joined the Japanese team! He is an expert in Computational Linguistics, and is the second Duolingo Engineer whose mother tongue is Japanese.

mhagiwara, welcome to the fold!

luke51991 from Team Norwegian wrote 18 hours ago:

Quite an exciting week

Hei alle sammen!

We've made an incredible amount of progress this week, with well over 1,600 sentences as of the time of this writing. The team is inching ever closer to the 50% mark, and by the end of the weekend, we will likely be there. Whenever you see our percentage go down, it's because we're adding new skills, and making the course more content-rich (the more words, the better)!

We have been moving the skills around quite a lot, meaning you won't receive an identical format to the Swedish tree, but this is a good thing! After all, Norwegian is a different language, even if it is mutually intelligible. We're trying to keep our creativity high, both in sentence creation and lesson formatting. Below is a chart of our progress as of late.

We greatly appreciate the enthusiasm and curiosity regarding our course, and your excitement is contagious. I plan on writing an update (almost) every Saturday, for those who are interested in our weekly progress. Everyone have a great week to come! (going ice fishing with coworkers... very Minnesotan)

Med vennlig hilsen,

Andreas, Aleksander og Luke

The Norwegian Team

AlexinTurkey from Team Turkish wrote 3 hours ago:

Getting closer!

Merhaba! Hello everyone!

We are still progressing. We have been writing notes, editing sentences, and adding new ones. We have shed blood and tears. We have drunk many, many cups of çay and eaten so many pieces of simit (maybe that is just me). We do not have that much to report actually, but please stay tuned for more updates as we reach the end of our course development. Currently we are at 88%. Feel free to cheer us on and keep your eyes peeled for our release in the coming weeks*.

--The Turkish Team

*The coming weeks does not mean next week. The coming weeks does not mean two weeks. The phrase "the coming weeks" is intentionally vague. Please oh please do not write on mine or Selcen's wall asking for the date. We do not know!

88%! That is exciting :)

TL;DR Two (or more) courses might hatch in March. We have a growing incubator basket with as many as 23 eggs. From 50 total courses, we have quickly come to as many as 62!

The next update is expected on Sunday, 1st of March at 4:00 pm UTC.

Previous Update 08-Feb to 15-Feb

February 22, 2015

We felt bad that we at the Esperanto course were ahead of the Hungarian course, so we decided to delete some of our course, so they could catch up. ;)

February 22, 2015

Nice excuse :). By the way, now that the course is 4/5 done, have you started looking for a good TTS engine?

February 22, 2015

I came across an interesting solution for Esperanto TTS on the Lernu forums (thanks Lernu user IronChef!). You can use a Polish TTS and play around with the spelling a bit to get something very close to Esperanto. I tried IVONA's Agnieszka with the sentence "Czu wi povas prononcy esperanton? Jes. Mi povas diri 'preskał fresza czecha mandżarzo'." And it worked well enough.

The main problem that I see is that Polish palatalizes some letters before i, so I had to change "prononci" to "prononcy", which alters the vowel quality a bit but sounds closer. I also had to change "vi" to "wi" because it was reading it as the Roman numeral 6. Otherwise, I think it would be pretty straightforward to set up a script to handle the supersignoj (s/ŝ/sz/g; s/ĉ/cz/g; s/ŭ/ł/g; ktp.) and feed it into a Polish TTS.
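
The letter substitutions described above could be scripted along these lines. This is a rough Python sketch rather than a tested TTS pipeline: the mapping is taken from the substitutions and the sample sentence in this comment, the function name is made up, and the word-level tweaks ("prononci" to "prononcy", "vi" to "wi") are deliberately left out.

```python
import unicodedata

# Esperanto letter -> Polish-style respelling, as used in the comment above
# ("preskał fresza czecha mandżarzo"). Not a complete phonetic mapping.
SUPERSIGNOJ = {
    "ŝ": "sz",
    "ĉ": "cz",
    "ŭ": "ł",
    "ĝ": "dż",
    "ĵ": "rz",
    "ĥ": "ch",
}


def respell_for_polish_tts(text: str) -> str:
    """Replace Esperanto accented letters with Polish-looking spellings."""
    # Normalise first so that "s" + combining circumflex becomes a single "ŝ".
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        repl = SUPERSIGNOJ.get(ch.lower())
        if repl is None:
            out.append(ch)                 # ordinary letter: keep as-is
        elif ch.isupper():
            out.append(repl.capitalize())  # "Ĉ" -> "Cz"
        else:
            out.append(repl)
    return "".join(out)


print(respell_for_polish_tts("Ĉu vi povas diri 'preskaŭ freŝa ĉeĥa manĝaĵo'?"))
```

The output would then be passed to whichever Polish voice is available; that step depends on the TTS engine and is not shown here.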

February 22, 2015

Thanks for the info! I'll definitely pass this along to our Esperanto course mentor!

February 23, 2015

I should think it would also help that Esperanto is fairly forgiving, pronunciation-wise. It's not like it's a language where you have to differentiate between two versions of ch or sh or r! So even if one picked up a slightly wonky pronunciation, it shouldn't affect comprehension too much and would likely be easily remedied by exposure to human Esperanto :)

February 22, 2015

True! Plus, some people say that since Zamenhof himself was from Poland, the closest pronunciation to the original would be Polish. But really, as long as it's comprehensible, it'll be good.

February 22, 2015

Have you seen these Memrise courses? http://www.memrise.com/course/1105/speak-esperanto-like-a-nativetm-1/ There's a whole series, which have a pretty good amount of vocab. There are also courses in the same series for particles and affixes, to help with grammar.

February 22, 2015

Also an excellent point!

And yes, as long as it is comprehensible, it's all good. In the meantime, I'll keep slogging away at those Memrise cards and Ana Pana ;)

February 22, 2015

That is one of many courses I am doing on Memrise... let's just say I tend towards the obsessive and Memrise appears to be aimed directly at that facet of my personality 8-o and let's not talk about how much time I've spent on there over the last few weeks... ;-o

February 22, 2015

I have in my possession an old record of songs in Esperanto. It contains a spoken word track in which a man speaking Esperanto with a horrid English accent speaks about taking an international vacation in which he is able to communicate with locals through Esperanto everywhere he goes. This is followed by a round of several people speaking Esperanto with comically exaggerated national accents.

I intend to eventually rip this record to mp3 and make it available on the internet in some format or another. Whenever that happens, I'll post the link somewhere on the Duolingo forum (probably the English language main forum because I don't plan on taking the Esperanto course).

February 23, 2015

The closest pronunciation of a national language to Esperanto is that of the Serbians and Croatians.

February 24, 2015

\2. Should we keep it the new way - with mean, median, and mode?

I of course like the median, and albernegiraffe brought me around to the mean. Don't know about the mode... anyone think of a good use for it?

February 22, 2015

It gives similar information to the median in this case. As we were discussing last week the idea with the median is to see how typical courses are progressing. It seems like the mode gives the same type of information, albeit not necessarily as meaningful.

Edit: Also, if you look 3 weeks back the reported median is 2.5 but the mode is 0. This tells us that a bunch of courses were stagnant but a bunch of others made a lot of progress. If the median and mode are the same/close then it's an even stronger indication of how the "typical course" is progressing.

February 22, 2015

You know, they should change Sunday into WIU-day! (I would at least, this is my Sunday evening entertainment :p) Thanks for your great work!

###### Pssst... 1st of March :)
February 22, 2015

"29nd" didn't sound right. And it is not a leap year! Thanks :)

February 22, 2015

Seems like there is a new race between Ukrainian and Esperanto. Although at the current pace it is entirely possible Norwegian may overtake them.

Edit: I just realized that Duolingo seems to be quite interested in pushing the Japanese course to phase 3, considering that two staff members are actively working on a beta course.

February 22, 2015

I would love it if that was in preparation to get the Japanese for English speakers course started. I would love it if that was the first "problem course" to get started, by which I mean the first course that's causing lots of problems with the writing system.

February 23, 2015

I feel the same way! I am ready to help when it gets to beta. ;-)

どうもありがとうございます Team Japanese for all your hard work and dedication!

February 23, 2015

So many courses the flags had to leave the chart ^_^

It was fairly time-consuming to adjust their positions as the teams raced on, causing reordering of the list.

You don't do it programmatically? I could probably whip up something that would make the graph if just provided the data, if you like. Be duly warned I would likely use a combination of LaTeX and either Perl or Bison. :P

February 22, 2015

I am a big fan of automated scripts, but only to a certain extent. Fully automated scripts are useful only when the consumers of such scripts are non-humans, imho :) (Thanks for your offer though! I will keep it in mind).

I agree with Dessamator that perhaps there can be 80% automation, which is what I worked towards today.

To illustrate my point, here are two instances where the automation loses credibility:

\1. Duolingo's course progress assumes that the course tree is fixed and the progress is based on the extent of tree completed. For newer courses where the tree gets redesigned by the contributors during Phase 1, this logic fails. We have seen this happen for the Turkish, Hungarian, and Russian trees. It is not practical to expect the teams to first complete the tree structure, freeze it, and only then start their work.

\2. Recently, the Duo team modified the progress metric for "xyz-Italian" courses to include the Localization strings. All of a sudden the courses exhibited ~20% progress which was artificially inflated.

I like the ability to perform such sanity checks before the data is ready for consumption.
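
One way to keep that kind of sanity check while still automating most of the work is to have the script flag unusual jumps for a human to review instead of publishing them directly. A small Python sketch of the idea follows; the function name, the 15-point threshold and the sample numbers are all made up for illustration.

```python
def flag_suspicious_jumps(previous, current, max_jump=15):
    """Return courses whose progress moved by more than max_jump points.

    previous and current map course names to progress percentages.
    The threshold is an arbitrary assumption, not a Duolingo rule.
    """
    flagged = []
    for course, now in current.items():
        before = previous.get(course)
        if before is not None and abs(now - before) > max_jump:
            flagged.append(course)
    return flagged


# Hypothetical data: one course jumps by 20 points, as in the
# localization-strings example above; the other moves by 1.
last_week = {"Course A": 43, "Course B": 51}
this_week = {"Course A": 63, "Course B": 52}
print(flag_suspicious_jumps(last_week, this_week))
```

Anything the function returns would still need a person to decide whether the jump reflects real progress or a change in how the metric is computed.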

February 22, 2015

My guess is that 80-90% of it could be done programmatically (including adding the markdown code).

I actually created a personal RSS feed of the contributor updates using Google Apps Script. I also recall someone created a script that basically checked their (Danish I think) progress every hour until the course graduated.

February 22, 2015

Sounds great! Care to share it with the community?

February 22, 2015

I also programmed it to only store the last update of each course. So if contributors for a particular course add 3 updates in one day, only the last will be shown.
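
The "keep only the last update per course" rule is simple to express in code. The original was written in Google Apps Script and is not reproduced here; what follows is just a Python sketch of the same idea, with a hypothetical `Update` record and made-up sample data.

```python
from typing import Dict, Iterable, NamedTuple


class Update(NamedTuple):
    course: str
    posted_at: str  # ISO 8601 timestamp, so string order matches time order
    text: str


def latest_per_course(updates: Iterable[Update]) -> Dict[str, Update]:
    """Keep only the most recent update for each course."""
    latest: Dict[str, Update] = {}
    for update in updates:
        current = latest.get(update.course)
        if current is None or update.posted_at >= current.posted_at:
            latest[update.course] = update
    return latest


# Hypothetical feed: two updates for one course on the same day.
feed = [
    Update("Norwegian", "2015-02-21T10:00", "first post"),
    Update("Turkish", "2015-02-22T13:00", "only post"),
    Update("Norwegian", "2015-02-21T18:00", "second post"),
]
for course, update in latest_per_course(feed).items():
    print(course, "->", update.text)
```

Keying a dictionary on the course name means a later post simply overwrites the earlier one, so the feed never grows beyond one entry per course.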

February 22, 2015

I also recall someone created a script that basically checked their (Danish I think) progress every hour until the course graduated.

Heh, that's cool (if a little obsessive!).

February 22, 2015

Not really, it is really easy to set up using the built-in timer in Google Apps Script.

I found the original Danish spreadsheet (script).

February 22, 2015

It's good to see that long-term lurkers Turkish and Hungarian for English and English for Thai are now getting close to completion.

February 22, 2015

English for Thai will still take some time though. ATM there is only one active contributor, and for reasons unknown to me no new ones have been added during all those months. I don't know how that contributor will cope with all the reports alone. I hope some more contributors are added soon.

February 22, 2015

Seriously, though, these updates are amazing. Thank you so much.

I find the mean to be more statistically significant than the median or mode, but there's nothing wrong with sharing the complete set of measures. I think you should just give us all three, Jiten. Great job as always!

February 22, 2015

Wow, the French for Portuguese course is almost out of the Incubator! Does anyone know when the last time a course exited the Incubator was?

February 22, 2015

I believe the last course to leave stage 1 (hatching) and reach stage 2 (beta) was Swedish for English on 17th Nov 2014. The Duolingo wiki has a page giving course info like this here.

February 22, 2015

Yes, but it hasn't been updated in a while (the updates are manually added). I've always thought of making it update automatically, but Wikia doesn't have the tools to do that. So currently it is only possible to update it by relying on a human or a third-party tool, like a bot or an external app.

February 22, 2015

It's been three months without a course hatching already, we need to fix that.

February 22, 2015

Is Mandarin going to be added in the near future?

Because I see that there are lots of languages being added for Chinese.

February 22, 2015

Teaching Chinese is difficult (probably there is a lot the developers need to do), while teaching other languages (which already have a tree) to Chinese speakers does not require any changes to the system. So I have no idea when Mandarin will be added, and I guess even the staff does not really know.

February 22, 2015

Luis said at the AMA it would take a decade (I think he was joking though :)

February 22, 2015

I don't know if you have any experience with Chinese, but it would be pretty difficult to format a Chinese Duolingo course. There are lots of things to be considered, like teaching characters (which give few or no hints to pronunciation) and/or pinyin (which gives only the pronunciation), not to mention writing exercises (since all characters are written a specific way) and speaking exercises. Mandarin would be an amazing addition to Duolingo, but it's only going to come with hard work on the development end.

February 22, 2015

I actually recently had an idea about this. First, assume a user will make their own effort to install Chinese character input. I really don't think this is something Duolingo should/could do. Then, have the same exercises as now but also add one which is sketching (using a mouse or such) the character. I'm almost certain I've already seen a JavaScript that takes such input and tries to match it to a specific Chinese character, and in this case it's easier because you already know which one the user is going for. It would also mean you could check the stroke order.

February 22, 2015

I don't think Duolingo would need to teach people the stroke order, though. I mean, if you think about it, they don't really teach how to handwrite in any of the other languages, so why should they with Chinese? Besides, you rarely have to actually write out the characters by hand, especially if you're a foreigner ;)

I think maybe Duolingo could teach Mandarin to, for example, Cantonese and Shanghainese speakers, since they all have to learn Mandarin in school, and their languages are pretty similar to Mandarin.

February 22, 2015

You have a point. At this point, we don't know if Duolingo intends to teach the script for any other language with a non-Roman writing system. The world certainly isn't going to end if that component isn't covered, and if leaving it out brings it to the Incubator faster, then so be it! I'll just be spending lots of time copying out vocab on Hanzi grids :)

February 22, 2015

On the other hand, stroke order in Chinese is more important than in many other languages—both for correctness, and because it actually helps you learn the characters better (via the radical decompositions).

February 23, 2015

I like how now there are median and mode progress statistics beside mean.

Also I can't help but notice how the current percentages nearly form a nice diagonal line.

February 22, 2015

Median - 2.5% | 2% | 3% | 1% (-2)

Champion. ^_^

Edit: hmmmm I'm getting different values for the median. I fixed the mean difference as you pointed out last week and I'm now getting the same means as you. But well, you:

Mean - 2.41% | 3.33% | 4.32% | 2.13% (-2.19)

Median - 2.5% | 2% | 3% | 1% (-2)

me:

Mean: - 2.41% | 3.33% | 4.32% | 2.13% (-2.19)

Median: - 2% | 3% | 2% | 1% (-1)

Your first Median of 2.5 suggests you have an even number of courses for that time period but I have 17....
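
The parity argument here is easy to check. A small Python sketch with made-up numbers (not the real course data): when every input is a whole number, a median ending in .5 can only come from averaging the two middle values of an even-sized list.

```python
from statistics import median

# Hypothetical whole-number weekly changes, for illustration only.
odd_count = [0, 0, 2, 3, 5]      # 5 values: the median is the middle one
even_count = [0, 0, 2, 3, 5, 7]  # 6 values: the median averages 2 and 3

print(median(odd_count))   # 2
print(median(even_count))  # 2.5
```

So with 17 whole-number changes, a median of 2.5 should not be possible, which suggests the two calculations were run over different sets of courses.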

February 22, 2015

I was so excited to see that after the posts last week :)

February 22, 2015

\1. Should we keep it the old way - only mean?

February 22, 2015

It almost seems that the incubator basket has similarly grown to hold as many as 23 eggs now! And the rain is refusing to cease! So you can expect it to grow further in the coming weeks and months.

What do you know that we do not know.... Now I am very curious - although I have my suspicions.

February 22, 2015

Moderators always know more than regular users. But sometimes leaks happen. Like in these top comments.

February 23, 2015

Haha well he could be alluding to the number of people demanding specific languages :)

February 23, 2015

Could be, but still .... :-)

February 23, 2015

Unfortunately not much. But it seems that Duolingo has learned how to cope better with several simultaneous Phase 1 courses and taking care of several questions and requests from several contributors.

March 1, 2015

You say the progress of the Russian team is from their own calculation, but where do you get this calculation from?

February 22, 2015

КГБ

February 22, 2015

Нет! That was supposed to be a secret!

February 23, 2015

Team Russian is kind enough to send me their update via the incubator chat.

Edit: And they base it on their assessment of their progress as per their course plan.

February 22, 2015

They calculate it by how many more words and sentences are left to add to the course, I believe.

February 22, 2015

For the past month and a bit we have been calculating our progress off of the assumption that the course will have 2160 words. Of course, this does not mean we will necessarily stop at 2160 words and in fact, we'd love to have between 2500 and 3000 although between 2100 and 2500 words is more realistic.
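
As a rough illustration of measuring progress against a fixed target, here is a short Python sketch. Only the 2160-word figure comes from the comment above; the function name, the rounding down and the cap at 100% are assumptions made for the example.

```python
TARGET_WORDS = 2160  # planned course size quoted in the comment above


def progress_percent(words_done: int, target: int = TARGET_WORDS) -> int:
    """Progress against a fixed word target, rounded down and capped at 100."""
    return min(100, words_done * 100 // target)


print(progress_percent(1080))  # half of the target
```

Because the denominator stays fixed, restructuring or deleting skills in the tree does not move the percentage; only adding words does.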

February 22, 2015

Do you think you could leave the higher numbers for Tree 2.0? We're dying just to dive in!

February 23, 2015

To be honest, I wrote down the list of A1 words we still do not have, and it is quite depressing. Duolingo's format has some limitations; the farther you go, the more material there is to throw out due to teaching difficulty (the translation method works poorly for words that are hard to translate, or when 5 words mean roughly the same and you get confused).

I think 2100-2500 duo-items is a fairly good number to stop at. By the way, 2160 is about the number of words we had before I started seriously fiddling with the tree, so this way our percentage at least stays consistent. Otherwise we'd have had a sudden jump any time I deleted a gray sample skill in the template, especially that one time when I deleted a few of them.

February 24, 2015

Вперёд русская команда!!!

February 24, 2015

I can't wait to review my Russian with Duolingo! And I'll try to learn every language they give for English speakers... So far I've been up to the challenge!

February 24, 2015

