Duo: your A/B test on Lesson % is flawed
The thread asking to get rid of the "% complete" (instead of # lessons) is approaching 700 upvotes, but it doesn't seem to be getting through to anyone who will either do anything about it or take the time to tell us what the heck the thinking is. Users have also pointed out what seems to be a sync issue between the web and iOS app versions, so that progress in one doesn't translate to the other, and that doesn't seem to be getting addressed either.
So I just have to try one more time to put this out there...
Duo, if you are watching whether I do more lessons on iOS vs the web, to try to determine whether the %complete on the website or the #lessons on the app is actually better... that's not actually going to tell you anything because NEITHER one is currently a good option.
THE % COMPLETE MAKES PLANNING MY ACTIVITY PROBLEMATIC ON THE WEBSITE so that I don't even want to do lessons there. But THE "HEALTH" SYSTEM ON iOS KEEPS ME FROM STARTING NEW CONTENT THERE (I don't like to cover all new content on iOS because I will often burn through health before I can even finish a lesson).
So if NEITHER interface is currently desirable to me, for very different reasons, how exactly are you going to calculate THAT in your A/B assessment?
I'm coming up on almost two years here (come June), and I am excited about the tree updates, but you have me stuck here between two terrible interfaces, and this is just so demotivating. When you have me thinking that maybe I should just give up, walk away from a streak nearly 2 years long, and try another solution... you're doing something wrong.
(PS I make my living in customer experience management and this? This is not good. This is almost certainly not how you want users to feel.)
(PPS before someone says "it's free so shut up" -- because someone always does -- realize please that nothing is "free". Some of us pay for Plus, but even if we don't, we are registered users and consumers of ads, which still means revenue to the company. Duo uses us as A/B testers for new features. They leverage our user and usage data. And I have loved Duo, I have encouraged others to try Duo, I would like it to succeed as a business entity, and continue to grow and help its users to succeed in language learning. So if I DON'T come here and say, "hey, Duo, this thing is really problematic, so much so that I am falling out of love ... and I don't WANT to fall out of love, if it can be helped -- so please help!" ... then I would be doing them a disservice.)
if you are watching whether I do more lessons on iOS vs the web,
No, they don't look at that: what you describe is not an A/B test.
An A/B test doesn't compare one user with themselves. It compares one set of hundreds of thousands of users with another set of hundreds of thousands of other users, where the two sets are alike in every property except one: the feature being tested.
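To make the "two sets of users" idea concrete, here is a minimal sketch of how users might be split into groups. This is not Duolingo's actual method, which isn't described anywhere in this thread; the function name `assign_group`, the experiment label, and the hash-based scheme are all assumptions chosen for illustration.

```python
import hashlib

def assign_group(user_id: str, experiment: str = "percent_complete") -> str:
    """Place a user in group "A" or "B" (hypothetical scheme, for illustration).

    Hashing the experiment name together with the user id gives a split that
    is effectively random across users but stable for any single user, so the
    same person always lands in the same group for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"

# Made-up user ids: each one gets exactly one group, and the groups come out
# roughly equal in size.
groups = [assign_group(f"user-{i}") for i in range(10_000)]
count_a = groups.count("A")
count_b = groups.count("B")
```

The point of a split like this is that a given user only ever experiences one side of the test, so the comparison is always between groups of people and never between one person's web and iOS habits.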
Thanks Mod! Definitely appreciate the response, and all that the mods do for us.
You’ve explained A/B tests but I don’t have any better sense of what the %complete is meant to be “improving” or why doing a lesson on the web doesn’t seem to change how many are needed on iOS.
(Explaining what an A/B test is or isn’t, while informative, is missing the underlying point I’m trying to convey... )
My point is that Duo currently is serving me up two very different, and neither one satisfactory, experiences.
And there does not seem to be any vehicle to express that both of these experiences are working together to frustrate me SO MUCH that it’s breaking my sense of engagement with Duo and my desire to learn with it — except to come here and say so. And hope this reaches someone who might care enough to look into it.
I have loved Duo and have loved learning languages from it. I would hate to leave it. That I am even thinking about it means something is very wrong.
I don’t have any better sense of what the %complete is meant to be “improving”
The user experience.
And this is not measured by users' direct/explicit feedback but by looking at whether the change improves Duo's metrics.
And there does not seem to be any vehicle to express that both of these experiences are working together to frustrate me SO MUCH that it’s breaking my sense of engagement with Duo and my desire to learn with it
Yes there is: just use Duolingo as you want/wish, given those changes.
The A/B test measures exactly that: whether, on average across all users, the change is producing bad results.
— except to come here and say so. And hope this reaches someone who might care enough to look into it.
This is not scalable: companies cannot read the feedback of their millions of users, even less so in so many languages. That's why A/B tests exist.
That's also why I explained A/B tests (a little): it's not only informative, it explains why they exist in the first place and why decisions are not based on users' explicit feedback but on the feedback they give, without even noticing, simply by using the app the way they do.
The Duolingo staff makes their decisions based on whether the users with the change learn more or less than the users without the change. It doesn't matter how many people complain in the forum: They look whether the change causes Duolingo users to learn more or less on average. That is the reason for keeping or reverting a change. Never the complaints on the forum. And that is the right way, in my eyes: When a change causes people to learn more, it should not be removed, even if everyone in the forum complains. Because the forum is just a very very tiny part of all the Duolingo users. And if the forum users love a change, but it causes people to learn less, it should be reverted. Duolingo looks at what everyone does, on average. Not what a tiny amount of its users write in the forum. That is the important thing: They look at actions, not words. And they look at everyone, not just a few.
By the way: the current change with the percentages will cause me a lot of trouble. But the reason they are doing it is something I have been waiting for a long time: the "practice" button next to your tree will increase the percent values on the practiced lessons. But that is a second step.
Well, since I’m demotivated by this change - and maybe others are too — then learning should go down if others struggle with it as much as I do. But they would have to be using my activity (dropped significantly) and consider that part of the equation of whether I am (any of us is) learning or not.
And while I agree data should rule the day, if they don’t at least LOOK at the forums “I went on vacation for the summer / got too busy for a while so I’m interacting less” will look exactly the same as “I am stymied by this framework and am demotivated to engage.” So I hope they temper their “real data” with customer feedback, even just a bit.
But they would have to be using my activity (dropped significantly) and consider that part of the equation of whether I am (any of us is) learning or not.
They do. That's even the main metric.
if they don’t at least LOOK at the forums “I went on vacation for the summer / got too busy for a while so I’m interacting less” will look exactly the same as “I am stymied by this framework and am demotivated to engage.”
No, it'll be invisible to the A/B test.
Averaged over millions of users, the effects of vacations and the like cancel out: since the two A/B test groups (the one with the change and the one without) are large enough, statistically the same share of people in each group will be in the “I went on vacation for the summer / got too busy for a while so I’m interacting less” situation, so this has no effect on the A/B test's result. The “I am stymied by this framework and am demotivated to engage” effect, on the other hand, will not be the same in both groups, since only one has the new thing: that is exactly what the A/B test is designed to measure.
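The cancelling-out argument can be checked with a small simulation. This is only a sketch with made-up numbers (a baseline of about 10 lessons a week, a 20% chance of being on vacation, and a change that costs one lesson a week); none of these figures come from Duolingo, and the function names are hypothetical.

```python
import random

def simulate_group(n_users, change_effect, rng, vacation_rate=0.2):
    """Weekly lesson counts for one test group (all numbers are invented)."""
    lessons = []
    for _ in range(n_users):
        base = rng.gauss(10.0, 2.0)       # assumed baseline lessons per week
        if rng.random() < vacation_rate:  # vacations hit both groups alike
            base *= 0.5
        lessons.append(max(0.0, base + change_effect))
    return lessons

def ab_difference(n_users=100_000, change_effect=-1.0, seed=0):
    """Mean lessons per week with the change minus mean without it."""
    rng = random.Random(seed)
    control = simulate_group(n_users, 0.0, rng)
    variant = simulate_group(n_users, change_effect, rng)
    return sum(variant) / n_users - sum(control) / n_users

# With a demotivating change, the gap between the groups is close to the
# effect of the change itself; with no change, the gap is close to zero,
# even though a fifth of each group is on vacation.
gap_with_change = ab_difference(change_effect=-1.0)
gap_without_change = ab_difference(change_effect=0.0)
```

Because vacations are equally likely in both groups, they lower both averages by about the same amount and drop out of the difference. Only the effect of the change is left, which is the part the test is meant to detect.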
Yes, I understand this as far as A/B testing goes. My POINT is that data alone (as the other poster noted) without incorporating user feedback, to understand the “why” of behavior change, gives an incomplete view of user experience.
Incorporating the feedback of only a tiny percentage of users (= those using the forums, who are only a tiny percentage of those using the web version, who are less than 15% of all Duo users) would not come close to reflecting the average opinion. Moreover, people who are happy with a change tend to voice it far less than those who are unhappy. So that's another bias.
And Duo doesn't rely on what users think is best for them in terms of learning, but on what the data show to be best on average across all users.
I understand. Duo is watching the metrics, and the metrics are not set up to tell them elements of the user experience -- that this change and the subsequent discussion thread have been infuriating and demotivating, and have changed my engagement from "Active Learner and Active Promoter Considering Plus" to "Active Shopper for Other Learning Platforms." (All true.)
But they CAN see learning disruption -- that I have gone from spending my spare time learning with Duo to barely maintaining a minimal streak.
They can't see - and don't care to hear ahead of time - that I'm so irritated by this whole thing (and only getting more so) that I would actually consider walking away from a nearly 700 day streak. But they WILL actually see when I finally abandon my streak, and with it, Duo.
You are saying that if a lot of us are feeling the same and it actually translates to action, those stats will eventually show up in the numbers and get addressed.
And if not, well, Duo is huge, it can live without a user here and there who has been disrupted enough - whether by the tech itself or by the engagement/lack thereof that follows - that they are no longer willing to pursue learning here.
And Duo is not set up for/really has no interest in direct user feedback, even when offered hopefully constructively, before things devolve to that point.
So, thanks. You've explained it beautifully.
So, thanks. You've explained it beautifully.
Happy to have clarified things.
And Duo […] really has no interest in direct user feedback
This I don't know.
I don't know if Duolingo has no interest, or if they'd like to be able to do so but can't (that much, at least, is the case: they currently can't).
The practice button adding to the percentage on my skills is what I dislike the most. It is messing with my whole schedule and it is making spaced out repetition so much harder. Duolingo has made a lot of questionable decisions in the last 3 years but this one takes the cake. I am fine with the percentages. I just hate that they increase when I do not explicitly click on the skill.