
https://www.duolingo.com/profile/galliumarsenide

Bug: Correct Vietnamese input and no Unicode normalization gives incorrect response

I think Duolingo does not perform Unicode normalization on my input strings.

I'm using KDE and the Vietnamese keyboard layout.

During a timed practice session, when I translated "Man" as "Đàn ông" (or answered any other prompt with diacritics), I got an "almost correct" response.

Here's the text Duolingo showed as the correct answer: Đàn ông

Here's the text I entered: Đàn ông

The problem is that the à in my input is in fact two code points: an 'a' (U+0061) followed by a combining grave accent (U+0300). Duolingo's à is a single precomposed code point (U+00E0).

I ran the following in a python shell on both strings above:

import unicodedata

[unicodedata.name(c) for c in u"Đàn ông"]

['LATIN CAPITAL LETTER D WITH STROKE', 'LATIN SMALL LETTER A WITH GRAVE', 'LATIN SMALL LETTER N', 'SPACE', 'LATIN SMALL LETTER O WITH CIRCUMFLEX', 'LATIN SMALL LETTER N', 'LATIN SMALL LETTER G']

[unicodedata.name(c) for c in u"Đàn ông"]

['LATIN CAPITAL LETTER D WITH STROKE', 'LATIN SMALL LETTER A', 'COMBINING GRAVE ACCENT', 'LATIN SMALL LETTER N', 'SPACE', 'LATIN SMALL LETTER O WITH CIRCUMFLEX', 'LATIN SMALL LETTER N', 'LATIN SMALL LETTER G']

Notice that my input contained "LATIN SMALL LETTER A" followed by "COMBINING GRAVE ACCENT", while Duolingo's string contained just "LATIN SMALL LETTER A WITH GRAVE".

I ran my input through a Unicode normalizer using NFC normalization and the output ended up the same as Duolingo's correct string.
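For anyone who wants to reproduce that check, here is a minimal sketch. The two strings are built from escape sequences so the difference survives copy and paste, and `same_answer` is a hypothetical helper name used only for illustration; it is not anything from Duolingo's code.

```python
import unicodedata

# Expected answer with a precomposed à (U+00E0).
expected = "\u0110\u00e0n \u00f4ng"
# Typed answer with 'a' followed by a combining grave accent (U+0300).
typed = "\u0110a\u0300n \u00f4ng"


def same_answer(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(typed == expected)             # False: the code points differ
print(len(typed), len(expected))     # 8 7
print(same_answer(typed, expected))  # True: identical once both are NFC
```

Normalizing both sides, rather than only the user's input, keeps the comparison correct even if the stored answer is itself in decomposed form.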

September 17, 2016

1 Comment


https://www.duolingo.com/profile/bia-hoi

FWIW, this problem seems to have resurfaced a while ago, so I developed a small browser extension that fixes those false typos by automatically normalizing inputs. Download links are available here: https://github.com/blmage/duolingo-unicode-normalizer

Happy learning!

July 11, 2019