Wow, improved immersion section!
I've just noticed the new and improved immersion section!
Love how it can organize articles better! It's a big help, but my question is: how does Duolingo rate the difficulty of an article? (Apparently, if it's long, it's hard.)
Sorry, I think I need to improve my wording. I meant: how does Duolingo decide if an article is easy or hard? To me it seems like they do it by the length of the article.
But in my opinion, just because an article is long, it doesn't mean it's hard. This was mentioned before, but there should be some kind of system for voting on the difficulty of an article (e.g., on a scale of 1-10, what level would you say the article is?)
Glad you like it!
For now, we estimate difficulty using the Flesch–Kincaid grade level: http://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests
This doesn't depend on article length directly, only on the average sentence length and the average number of syllables per word. So if you notice a correlation between difficulty and length, it just means that longer articles tend to have longer sentences (or words).
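To illustrate why length alone doesn't move the score, here is a rough sketch of the standard Flesch-Kincaid grade level formula (0.39 * words per sentence + 11.8 * syllables per word - 15.59). The function names and the vowel-group syllable counter are illustrative assumptions for this example, not Duolingo's actual implementation.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic (an assumption for this sketch): each run of
    # consecutive vowels counts as one syllable, with a minimum of one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade level formula: only the two
    # averages matter, not the total amount of text.
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

short_text = "The cat sat on the mat. It was happy."
long_text = " ".join([short_text] * 50)  # 50 times longer, same sentences

print(fk_grade(short_text))
print(fk_grade(long_text))
```

Both calls give the same grade (up to floating-point rounding) because repeating the text leaves the averages unchanged. Merging the two sentences into one longer sentence would raise the score.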
We'll most likely experiment with other measures of difficulty in the future.
Ok, have you guys considered the differences between languages? For example, German words tend to be longer than English words. If not, I'm pretty sure there are readability tests for specific languages. I don't know how flexible the Flesch-Kincaid scale is.
There's a version of the Flesch Reading Ease test for German as well as another German readability test called the Wiener Sachtextformel.
It's all on this page (in German). Might want to have Julika or Myra read through that :)
I don't know about tests for other languages though.
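For comparison, here is a small sketch of the English Flesch Reading Ease formula next to the German adaptation usually attributed to Toni Amstad. The constants are the commonly published ones; the function names are made up for this example, and the counts are assumed to come from some separate tokenizer and syllable counter.

```python
def flesch_reading_ease_en(words: int, sentences: int, syllables: int) -> float:
    # English formula: 206.835 - 1.015 * ASL - 84.6 * ASW
    asl = words / sentences      # average sentence length in words
    asw = syllables / words      # average syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw

def flesch_reading_ease_de(words: int, sentences: int, syllables: int) -> float:
    # German adaptation (Amstad): 180 - ASL - 58.5 * ASW
    # The smaller syllable weight compensates for longer German words.
    asl = words / sentences
    asw = syllables / words
    return 180.0 - asl - 58.5 * asw

# Hypothetical counts for a 100-word text with 5 sentences and 150 syllables.
print(flesch_reading_ease_en(100, 5, 150))
print(flesch_reading_ease_de(100, 5, 150))
```

With the same counts, the German formula reports an easier text than the English one, which is the kind of per-language adjustment being asked about.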
Thanks for the link! I'll keep it in mind the next time I'm working on this.
Matt, I've noticed that you cut off the bibliography section when importing Wikipedia articles. I know that translating references is not much fun, but often there are bits that can and should be translated. Here's an example: http://de.wikipedia.org/wiki/Martin-Luther-Ged%C3%A4chtniskirche At the very least, the "Weblinks" section should be translated, but it was cut off along with "Literatur" and "Einzelnachweise". Can I suggest that you don't cut these sections off, but make them optional (article - bibliography = 100%)?
For the bibliography: you may know this already, but professional translators generally don't translate bibliography titles. See for example: http://www.proz.com/forum/translation_theory_and_practice/156844-translating_a_bibliography.html
If we agree that titles shouldn't be translated, then the rest of the job (e.g., translating place names and dates) is rather mechanical and I think it is best done automatically rather than through crowdsourcing. So my plan would be that if and when we start auto-publishing Duolingo translations into the target-language Wikipedia, we would auto-translate the bibliography as part of this process.
Should titles in Weblinks be handled differently from titles in the bibliography? It seems like the same arguments would apply to both cases.
I agree that titles should not be translated, but the thing about the "Weblinks" section is that it doesn't actually contain unique titles, but short descriptions of the linked websites or documents. I do believe that these descriptions should be translated manually. It would be a shame if they were butchered by machine translation or remained untranslated.
Ok, I agree that Weblinks (and similar sections in other languages) should be translated. We'll preserve that section in the future.
@Matt: Maybe you could let the uploader decide which parts to translate automatically and which should be kept for manual translation. All we need are very basic editing capabilities when uploading a document, e.g. to mark tables and opus lists to be kept, bibliographies to be auto-translated, etc. I think you'll need a human somewhere in the loop. Even highly standardized texts like Wikipedia articles are too diverse to let these decisions be made entirely automatically.
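Something like the following is what I have in mind; it is purely a hypothetical sketch, and none of these names come from Duolingo. Each section gets a handling mode chosen by the uploader, and only the manually translated sections count toward completion (so article minus bibliography = 100%).

```python
from dataclasses import dataclass
from enum import Enum

class Handling(Enum):
    MANUAL = "manual"  # translated by the community
    AUTO = "auto"      # left to automatic translation
    KEEP = "keep"      # copied over unchanged (tables, opus lists)

@dataclass
class Section:
    title: str
    handling: Handling
    sentences: int       # total sentences in the section
    translated: int = 0  # sentences translated so far

def progress(sections: list[Section]) -> float:
    # Only sections marked MANUAL count toward the completion percentage.
    manual = [s for s in sections if s.handling is Handling.MANUAL]
    total = sum(s.sentences for s in manual)
    if total == 0:
        return 1.0  # nothing to translate by hand
    return sum(s.translated for s in manual) / total

article = [
    Section("Geschichte", Handling.MANUAL, sentences=40, translated=40),
    Section("Weblinks", Handling.MANUAL, sentences=5, translated=5),
    Section("Literatur", Handling.AUTO, sentences=12),
    Section("Einzelnachweise", Handling.AUTO, sentences=30),
]
print(f"{progress(article):.0%}")
```

The article above counts as fully translated even though the bibliography sections were never touched by hand.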