[Immersion Suggestion]: Add a hidden test (iCAPTCHA) to assess users' translations (and a lingot reward)
I propose an iCAPTCHA (Immersion Completely Automated Public Translation test to tell Competent Humans (translators) Apart), based on the work of Prof. von Ahn et al. and on some ideas from a discussion.
This would periodically verify a translator's competency by testing their ability to recognise a good translation, as detailed below:
- Add dummy/seed popular articles (with sentences) - Duolingo places dummy articles within Immersion containing both good and bad translations (of varying difficulty, from simple to hard), or alternatively adds such sentences to existing articles;
- Validate sentence translations - The good translations within these dummy articles/sentences would preferably be assessed by professional editors/translators and tested before use;
- Validate user competency - The user has the option of changing, down-voting, or up-voting the translation;
- Crowd-source user assessment - Allow higher-tiered users to assess users' translations (edits);
- Prevent bad voting - If the user fails the automatic assessment, the user's votes (up/down) will only start counting again once enough users have up-voted the user's translations (perhaps five users or so);
- Prior up-votes/down-votes revoked [optional] - If the user consistently fails simple translation challenges, retroactively revoke the user's previous down-votes, especially if no other down-votes exist for the particular sentence, and/or mark those sentences as needing a check.
- Alternatively, enough up-votes could prompt the system to re-test the user; and
- This effect remains hidden from the user.
- A bad translator, one who consistently flags accurate or inaccurate translations incorrectly, will be identified by the system;
- An alternative way of validating translation competency will be in place, one that depends on reliable instruments rather than on perception alone; and
- A model can be built supporting or refuting the competency of the user.
- Seed (add) [dummy/popular] articles/sentences with good and bad translations;
- Use hidden tests to assess translators' competence;
- Crowd-source the assessment of users' translations (edits);
- Prevent random/unreliable up/down-votes;
- Periodically verify users' improvement/growth;
- Optional - Provide a "verified level" for users who consistently pass the assessments, and remove it if performance diminishes; and
- Optional - Award a lingot for passing all tests and receiving a verified level badge.
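To make the mechanics concrete, the vote-gating and verification rules summarised above could be sketched roughly as follows. This is a minimal sketch in Python; every name, threshold, and data structure here (SeededSentence, UserRecord, PASS_THRESHOLD, and so on) is an illustrative assumption, not any real Duolingo API or the proposal's definitive design:

```python
# Hypothetical sketch of the hidden-test (iCAPTCHA) logic described above.
from dataclasses import dataclass, field

PASS_THRESHOLD = 0.8      # assumed fraction of seeded votes that must be correct
REINSTATE_UPVOTES = 5     # "maybe 5 users or so" from the proposal

@dataclass
class SeededSentence:
    text: str
    translation: str
    is_good: bool          # ground truth, set by professional editors beforehand

@dataclass
class UserRecord:
    votes_on_seeds: list = field(default_factory=list)  # (user_upvoted, is_good)
    voting_enabled: bool = True
    upvotes_since_fail: int = 0
    verified: bool = False

def record_seed_vote(user: UserRecord, seed: SeededSentence, upvoted: bool):
    """Record the user's vote alongside the known quality of a seeded sentence."""
    user.votes_on_seeds.append((upvoted, seed.is_good))

def assess(user: UserRecord):
    """Periodically score the user's hidden-test votes against ground truth."""
    if not user.votes_on_seeds:
        return
    correct = sum(1 for up, good in user.votes_on_seeds if up == good)
    accuracy = correct / len(user.votes_on_seeds)
    if accuracy < PASS_THRESHOLD:
        user.voting_enabled = False   # votes stop counting
        user.upvotes_since_fail = 0
        user.verified = False
    else:
        user.verified = True          # optional "verified level" badge

def on_translation_upvoted(user: UserRecord):
    """Community up-votes gradually reinstate a suspended voter."""
    if not user.voting_enabled:
        user.upvotes_since_fail += 1
        if user.upvotes_since_fail >= REINSTATE_UPVOTES:
            user.voting_enabled = True  # a re-test could also be triggered here
```

The key design point is that the user never interacts with the test directly: the seeded sentences look like ordinary Immersion content, and only the scoring happens behind the scenes.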
Problem with this: How will you decide what kind of article to use? For instance, as a teacher my vocabulary related to the field of education is greater than my vocabulary related to, say, news and politics. If it's not an article related to my topics of interest, I'm not going to be motivated to translate it, and I won't bother.
I like that you suggest it as optional with a "verified level badge," because if it's a requirement that I do this before doing other translations, I'll just stop doing translations if I'm not interested in the topic of the test article.
The articles may consist of correct or incorrect sentences. Besides, I did mention varying difficulties. The tests would be seeded in articles of general interest. Any intermediate learner can probably read and translate a grade 1/novice/beginner storybook, and if not, the education system has really failed them.
The thing is, you wouldn't know you are taking the test, so you can't decline or accept it. It would look like ordinary Immersion, as if you were just translating a random article. One way to address this is for Duobot to offer completely wrong translations; another is for Duobot to add the translation to the text field with some random dummy user as the author.
It is not that hard for Duolingo to find out which articles most users like/view/translate.
Except that, at least for the moment, I'm mostly just translating material that I upload myself. Perhaps a better system would be, rather than having random "dummy" articles, to have those "dummy" users do a small portion of every article, though this could become unwieldy for whoever's doing the "good" translations.
Yes, that would be one of the possible implementations. In that particular case, it could just present you with blatantly bad translations.
But I think you are misunderstanding: the test would not run every day or for every article. It would be periodic, perhaps once every few days or after translating a certain number of sentences, and it need only test one user at a time, showing fake translations for you to accept or reject.
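The periodic trigger described here, either a time interval or a sentence count, could look something like the following sketch; the three-day interval and fifty-sentence count are purely illustrative assumptions, not values from the proposal:

```python
# Hypothetical trigger for the periodic hidden test.
from datetime import datetime, timedelta

TEST_INTERVAL = timedelta(days=3)     # assumed "once every few days"
SENTENCES_BETWEEN_TESTS = 50          # assumed sentence quota

def should_run_hidden_test(last_test: datetime, sentences_since: int,
                           now: datetime) -> bool:
    """Run the test again once enough days OR enough sentences have passed."""
    return (now - last_test >= TEST_INTERVAL
            or sentences_since >= SENTENCES_BETWEEN_TESTS)
```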
I'm fairly certain that I understood the proposal; I was proposing an alteration to remove the bias from the proposed measure.
The way I see it, if it's only providing computer-generated bad translations, I do see this as practical to implement, though without the human-generated good translations to counterbalance it, I'm not sure how valid the measure would be. In order to increase validity by assessing users' ability to evaluate a good translation, it becomes impractical to ensure that every user gets assessed. And if every user is not assessed, then it becomes a biased measure, and part of the bias may be influenced by factors such as those I've previously mentioned.
I don't know if you have any background in developing assessments, but as a teacher and a researcher, I can tell you that in any assessment tool, factors of validity, practicality, reliability and bias must all be taken into account. Reliability can be easily addressed if the people doing the "good" translations are carefully screened beforehand, but the other three factors remain problematic.
I agree with you there, Revdolphin. Personally, I would be one of those people who likely are not assessed, since I don't like pushing that "this doesn't look right" button. And I have already started a new practice of just translating a few sentences of a document rather than the whole thing. Other people also want something to work on. It's all rather silly to watch the revisions. Some are very good, but there are plenty where people change one synonym to another then back again.
Indeed, I'm a postgraduate student, and although I have no knowledge of developing assessments, I do have knowledge of developing research instruments, and I appreciate the problem you have uncovered.
I presumed that at one point or another, everyone would click the "real world practice" button, or do one of the articles immersion suggests after a lesson, which is not necessarily true.
The only other implementation I can think of is to have the hidden test cover only bad translations, with a small "compulsory" visible test covering both good and bad translations to counterbalance it. Perhaps a single short paragraph from an article would be presented that could be either wrong or right.
Regarding screening the good translations: the only way I see to do that is either to use sentences from lesson practice (resulting in nonsensical paragraphs) or to draw on the vast resource of public-domain books and articles that have already been translated as a source of valid and accurate translations. This would address both jairapetyan's problem and, I guess, yours.
Hmmmm, you have given this quite a bit of thought. I hope you send it directly to Duolingo so they can consider implementing it. I like the verified level badge idea too.