A/B Testing: how to discover your groups.
It appears that discussions about A/B testing are all the rage on the forum nowadays, especially since the recent introduction of the Coach feature.
People often have no clue which side of the fence they are on. So here's a simple and useful tip to get the answer. It relies on the same info used by the site itself and by the various userscripts that other users have shared to enhance the interface.
In your web browser, press F12 to get access to the dev console. Then simply type the following command to inspect the variable: duo.user.attributes.ab_options
You should then get a list like this:
Nothing new or really secret, but could be useful to some of you.
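If the raw object is hard to scan, here is a minimal sketch that flattens it into sorted "name: value" lines. `listExperiments` is a made-up helper, and the sample values are borrowed from posts in this thread so the snippet runs anywhere. On the site you would pass duo.user.attributes.ab_options instead, assuming that variable is still exposed.

```javascript
// Hypothetical helper: turn an ab_options-style object into sorted lines.
function listExperiments(abOptions) {
  return Object.keys(abOptions)
    .sort()
    .map((name) => `${name}: ${JSON.stringify(abOptions[name])}`);
}

// Sample values taken from posts in this thread, not from a real account.
const sampleOptions = {
  web_give_up_button_experiment: false,
  coach_web_experiment: "true",
  sg_tap_challenge_frequency: "more_tap",
};

// In the browser console you could pass duo.user.attributes.ab_options instead.
console.log(listExperiments(sampleOptions).join("\n"));
```

Sorting by name just makes it easier to compare your list with someone else's.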
Thanks. It's nice to see that my list is a series of 'falses' and 'controls'. Apart from two, which really stand out, and are the group for longer sentence lengths and more listening challenges - yeh, lucky me. The least they could have done regarding the listening challenges is to also give me Carla, but no, I've still got old gasping for breath francesca16.
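To pick out the few entries that stand out from the 'falses' and 'controls', something like the sketch below might help. `standoutExperiments` is a made-up helper, the sample object is invented, and the list of "boring" values is only a guess at what counts as the default group.

```javascript
// Hypothetical helper: keep only experiments whose value is not in the
// "boring" list. Which values mean "default group" is an assumption.
function standoutExperiments(abOptions, boring = [false, "false", "control"]) {
  return Object.fromEntries(
    Object.entries(abOptions).filter(([, value]) => !boring.includes(value))
  );
}

// Invented sample; on the site you could pass duo.user.attributes.ab_options.
const myOptions = {
  hover_effect_experiment: false,
  web_give_up_button_experiment: "control",
  sg_tap_challenge_frequency: "more_tap",
};

console.log(standoutExperiments(myOptions));
```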
Francesca16's voice makes learning Italian awkward.
"Una gasp donna gasp mangia gasp la gasp mela."
If you change a parameter in one browser on one device, does it affect every other session? Try changing one of your mobile, Android, or iOS settings from your desktop/laptop, then see if it changed the experience on your mobile device... Answer: I tried it and nothing changed on the mobile device. So any changes only affect the browser where you made them.
If the changes you make in your browser only affect that session, meaning that the effect is local, then I wonder how much that would mess up the data collected by Duolingo? I mean we have proven that the coach can be enabled (Jonathan) and disabled (AlexisLinguist, who, admirably perhaps, turned it back on again, too).
What if all the users whom DL designated as Coached simply turned it off? Then DL's Coach data would be skewed unless they could detect that a browser F12 hack had been performed...
And to go one step further, there are a lot of users using tweaked browsers already. They change the DL colors, etc. Some bring back the coach. Some don't. Unless DL has a way of monitoring all those hacks, then all (or part) of the test data for those users may be bunk.
Yup, if DL has no way to gauge what tweaks users have made, or even whether users have made changes, then every person who makes a change is like a gremlin run amok in the DL labs.
"(AlexisLinguist, who, admirably perhaps, turned it back on again, too)"
Please don't remind me. It's right there, yet I cannot touch it. So close, yet so far. I may need, a Snickers bar.
"Yup, if DL has no way to gauge what tweaks users have made, or even whether users have made changes, then every person who makes a change is like a gremlin run amok in the DL labs."
If we're talking about real gremlins here, then I'm Spike.
"then every person who makes a change is like a gremlin run amok in the DL labs." Not really... It's all very local to that instance. The second they navigate somewhere or refresh (get a page load in general) the settings will get refreshed back to what the server has set for you.
My point being that DL thinks the person is using environment A (the sum of a bajillion tests being run simultaneously) while in fact any usage of the site reflects environment G (what the Gremlin has wrought on his own screen).
"Amok" like the rest of what I wrote being colorful language.
Everyone might be a gremlin, but I think a lot of the people who might actually go ahead and decide to make these changes are probably devoted enough that they're within the "firmly retained" camp for their precious metrics. Plus, the values probably aren't changing on Duo's end. So it probably doesn't matter if a few people change a setting or two. All hail the divine infallible metrics!
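The idea that a console edit stays local and disappears on the next page load can be sketched with a toy model. This is not Duolingo's code. `serverAssignment` and `loadPage` are invented names, and the model simply assumes the page gets a fresh copy of the server's values on every load.

```javascript
// Toy model, not Duolingo's code: the server holds the real assignment,
// and the page works on a copy that is rebuilt on every load.
const serverAssignment = { coach_web_experiment: "false" };

function loadPage() {
  // A fresh copy, so edits to the page's object never reach the "server".
  return { ...serverAssignment };
}

let pageOptions = loadPage();
pageOptions.coach_web_experiment = "true"; // the F12 edit
const duringSession = pageOptions.coach_web_experiment;

pageOptions = loadPage(); // navigate somewhere or refresh
const afterReload = pageOptions.coach_web_experiment;

console.log({ duringSession, afterReload });
```

In this model the server-side value never changes. That is the assumption behind "the values probably aren't changing on Duo's end".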
Whoa. Now this is kinda like another movie... (in the Will Ferrell movie, he really never initiated anything, did he?)
Yup, I think Jonathan just chose the blue pill. Someone keep a lookout for Morpheus (not to mention Agents) and cue the techno music because Duolingo has become the Matrix.
This is really telling. The number of tests they're carrying out really annoys me. The whole idea of quietly AB testing the site (or anywhere) annoys me. They should tell us what they're working on, implement it for everyone, and let us give feedback. Plus be more engaged in the community in general. </rant>
I'm not sure I understand why this annoys you. A/B testing is a good thing, as much as it stinks to be in the control ("reject") group. It's a proven tool for determining which features have a positive impact on the site, without relying on wishy-washy (and potentially contradictory) written user feedback. As for the number of concurrent tests, that's a function of the size of Duolingo's user base. Having millions of users allows them to test multiple features at once. How is that a bad thing? If anything, it allows them to improve the site more rapidly.
I'm with you on the staff-community engagement point though. It wouldn't kill them to be a little more active on the forums.
"Think they pay attention to effects of one experiment on the results of another?"
I'd hope so. There are more than several experience "configurations" at the moment, so I'd hope they would be taking that into account. Otherwise, it seems like that would taint the results, and the whole thing would be a waste of time.
I mean, I already slowed down my usage (learning) when the new design rolled out, and now the Coach...I just want the old design (ha, it's only been two-three weeks now) back.
Hey AlexisLinguist, did you try reversing what Jonathan did?
"Huh, I just changed coach_web_experiment from "false" to "true," went to the homepage, and there was the coach. So at least in some cases, if you want to see how the other half lives, you just need to write yourself into the other half."
Maybe you could dispatch the Coach.
Well, we already know, so all results are effectively skewed. ;) It's interesting that the tests would be "out in the open" like this. Wouldn't keeping them tucked away so users couldn't fiddle with them be better? It's almost like they want us to find them. Shoo, if it's an open invitation...
Hey, don't come at me with logic, man! That's cheating! It's not fair! Ima down vote all your comments!!!
Good point. We do already know, don't we? Except, fortunately, I have the memory of an earthworm so I already forgot what groups I'm in. Go me.
Not sure they can be tucked away where a web development tool can't fiddle with them. That's the nature of web pages vs. compiled apps, I fear.
Here are some of mine:
coach_track_experiment: "trackless" (...but I have Coach on, so I wonder what this means)
enable_web_speech_experiment_v3: "reverse_speak" (don't use the speech component online, but when I did I was asked to translate English to Spanish and then speak)
sg_tap_challenge_frequency: "more_tap" (...tap?)
web_remove_real_world_practice: true (I just noticed my Real World Practice button is gone from underneath my Strengthen Skills button)
web_give_up_button_experiment: false (interesting, there's a "give up" button? I just refresh...)
Bunches of lingots for you! Now I may be up for hours just staring at what I found... You know, most experimental models require that a subject not know they are the subject of an experiment... unless that is being tested...
...hmmm... So does anyone here have knowing_test_subject_experiment? And if so, what is it set to? :)
I thought of something that might be fun: create threads for the different experiments. Let users opine away on what they (think they) experience and how much they like the results.
For example, if there were a thread for hover_effect_experiment, as a TRUE, I would opine that it rocks! :) I really do not know what the effect is, mind you, but I think it is what I noticed during immersion. Being able to see translations of words without clicking. If that was the effect being tested, I like it.
Also, just saw this pest: ios_thumb_centric_layout_experiment -- Since I use an iPad as much as an iPhone, I am really glad this is set to false. Having a thumb-centric iPad experience would not be very good. Actually, this calls to mind that iOS series experiments should probably be made more specific. Surely it should be possible to differentiate the iOS experience by device. And for that matter, are all the iOS experiments confined to the app? What about users who access via a mobile browser on iOS?
Anyone know what the "mobile_discussion_experiment" is?
Some of these are a bit annoying... "mobile_turn_off_microphone_forever_experiment: true"
Also sg_tap_challenge_frequency: "fewer_tap" -- More like zero tap. I never get those tile drills anymore. I cannot remember the last time I did one in fact. They were great in my iPad and iPhone browsers...
This one is a new experiment, I think: web_remove_real_world_practice: true. I think it is new because I remember seeing that "real world practice" quite recently. No harm. I hardly ever used it to access immersion.
What do you think? If all these experiments were in the forums as threads, would you go comment on them?
"For example, if there were a thread for hover_effect_experiment, as a TRUE, I would opine that it rocks! :) I really do not know what the effect is, mind you, but I think it is what I noticed during immersion. Being able to see translations of words without clicking. If that was the effect being tested, I like it."
Is that really what it is? In that case, I've been in that test since April at least. Maybe hovering over the streak flame? It gives me the day number whenever I hover over it, if that's anything.
Shoot, it could be the slight fade-out of buttons and skills when you hover over them. :P
Good question! Here's what my settings choices are on my laptop: Microphone, Speaker, Voice autoplay, Sound effects
On my iPad in Safari and Chrome: Microphone, Speaker, Voice autoplay, Sound effects
On my iPad in the app: Sound effects, Speaking exercises, Listening exercises
Mobile must only refer to both Android and iOS apps. In the mobile browsers I do have a microphone option. I'm going to try the Blue pill. See if I can change the setting and then see what happens in the app. :)
Update: nothing I changed in the browser affected the app.
Oh wait! I think I know what the "turn off microphone forever" means! Know how when you do a speaking exercise, you can choose "I can't use the microphone right now" and then the mic is turned off for an hour? Maybe you can choose to turn it off forever right from that interface?
duo.user.attributes.ab_options returns an object, so you need to set the value by assigning to one of its properties:
duo.user.attributes.ab_options.italian_tts_experiment = "Carla"
but I'm not sure this value can be set on the client and re-sent to the server every time. Most probably it's fetched from the server when the lesson starts (so any assignment on the browser side doesn't really change it).
No, this doesn't work either. Do you have Carla? Which URL is being called for the TTS for you? For me it's
Maybe something like "it2" for the language would do the trick? ("It2" doesn't)
and yes, swapping in your URL works for me, so mine should work for you.
Thanks for the shower (I think I'll share some stones with the topicstarter who has given a good jog with his investigation).
I made the following script (mostly based on your code), which doesn't change anything for people in the AB group with Carla voice. https://monkeyguts.com/code.php?id=541
It depends on the specifics of the feature that is being tested. When you change the test setting, if your change works at all, I would expect it to continue to work until a complete page refresh is required (more or less). As an example, imagine there was an A/B test that was comparing two versions of an icon. When you first load the page containing the icon, your browser requests the page contents from the server (including the A/B test settings). The question is: what event causes the icon to load? If the icon only loads during initial page load, you probably won't be able to change the icon at all because it will have already loaded before you have the chance to change the setting. But if the icon loads during some kind of user interaction with the page, and you change the setting before the interaction in question, then you should see the different icon, and continue to see it until the next complete refresh of the page. There are some other variables, like how caching is handled for the item in question.
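The icon example can be sketched as a toy model. Everything in it is hypothetical (`icon_experiment`, the file names, both code paths). It only illustrates the difference between reading a setting once at load time and reading it again on each interaction.

```javascript
// Toy model of the icon example; every name here is hypothetical.
const settings = { icon_experiment: "A" };

// Case 1: the icon is chosen once, while the page loads.
const iconChosenAtLoad = `icon-${settings.icon_experiment}.png`;

// Case 2: the icon is chosen each time the user triggers an interaction.
function iconChosenOnClick() {
  return `icon-${settings.icon_experiment}.png`;
}

// The user edits the setting in the console after the page has loaded.
settings.icon_experiment = "B";

console.log(iconChosenAtLoad);    // still the value that was read at load
console.log(iconChosenOnClick()); // picks up the edited value
```

Caching is left out of the model. A cached icon could make even the second case look unchanged.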
Really, I don't know in which group I am, but I hate Duolingo interface!
I can't find my own posts. When one is downvoted, because people prefer chatting and silly threads to grammar questions, I've tried everything, including the "followed" tab: it's impossible to find my thread again, even if it's very recent (2 or 3 days old), and even when I search with words I know my discussion contains.
I can't see my messages. I only have the last 5. That sucks a lot! I receive a lot of messages, so if a person messages me when I'm not on Duolingo, I lose their messages forever!
My activity stream is flooded with things I don't care about, and I flood my followers with things that they don't care about, because I'm very active. I want to choose what I'm interested in, and to have friends without always seeing everything they do when they are too active.
Why is it so bad?
For FF versions 30 and above you need to read the following:
Once I'd done this via the settings I was good to go. Thanks again.
I had written out the rest of the steps for another user in this post but it appears to have been deleted :-(
For FF: Tools > Web Developer > Web Console. At the bottom of the text box are two chevrons (>>); paste the code there and hit Enter. A short version of the list will appear with a "more..." link. Click it, and to the right of the box another box will appear with the full list.
Huh. Couldn't figure out what to do in Safari. I figure it's Develop - Error Console or Develop - Show Web Inspector (F12 is increase volume), but then I didn't see where to type in duo.user.attributes.ab_options. No matter, I'll have access to Chrome tomorrow. Nice post, thanks!
Of course just about as soon as I submitted that, I found it. Develop -> Show Web Inspector. Then type "duo.user.attributes.ab_options" in at the very bottom center of the console. As writchie4 said, the resulting list is collapsed but that's just a mouse click on the little triangle to open it up.
I'm thinking it has to do with the computer function of auto-correct being disabled. I'd noticed that my Mac had stopped correcting me, and lo and behold, this stupid test. :)
Auto-correct no longer works on iOS either (NO....), so it looks like it applies to both. I was dismayed by all the falses, but also astounded at the sheer number of tests they are running. Are they all really necessary..................?
At this stage I'm feeling a bit like a lab rat. I don't mind the testing, but I'm now wondering what I'm missing out on, or in the case of the XP bar counting my lucky stars. Unfortunately, my daughter who has just joined Duo has something that doesn't look at all like my Duo, and I can't ask her things like .... "How far until you reach the next level?" as this draws a blank stare.
"At this stage I'm feeling a bit like a lab rat."
Same. It's just not fun. The user experiences just vary so much from one thing to the next that I wonder if Duolingo is getting the true data that they covet so much. The metrics have to be all over the place with all these things running at the same time. What happened to simplicity, you know?
"The metrics have to be all over the place with all these things running at the same time."
Yeah, that's what bugs me. Whatever happened to the classic research design I learned in college? But maybe there's complicated statistics mojo controlling for all these interactions.
There aren't many websites that I visit that change on an almost weekly basis. Probably because the users would stop coming. I also discovered today that all the effort I've put into reporting potential problems has just been filed. I'd wondered why I've never had an email reply as other users have reported. However, I keep coming back because I am learning and I love the community. If there was no community I'd find someplace else, as the connectivity would be gone. I feel absolutely disconnected from the Duolingo staff.
Actually, what I'd like to know is what metrics are used to sort us into the trues and the falses - obviously some things I do are good (I'm left as false), and others I'm not so good at so I get to test a new feature. I've been looking for connections but I can't see anything obvious.
To those who think that these experiments are useless: this is what duolingo is supposed to be. Go and see the TED talk of its founder, where he explains why A/B testing is at the very heart of duolingo and won't go away. Short version: A/B testing of small iterations of the learning experience on a large number of people makes language learning better for everybody.
As for the multitude of tests running at the same time: it is all a function of the number of people using duolingo. If you have millions of people, you can run a whole bunch of different tests and still control for the various configurations. This might even be obligatory, because some changes may have a positive effect only in combination with other changes, and you don't want to miss a case like this.
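Here is a rough sketch of the "positive effect only in combination" case. The retention percentages are invented for illustration, and features X and Y are hypothetical. Nothing here is known about how Duolingo actually analyses its tests.

```javascript
// Number of distinct on/off configurations for k independent two-way tests.
function configurationCount(k) {
  return 2 ** k;
}

// Invented retention percentages per cell of a 2x2 design.
// Key is "<featureX><featureY>", where 0 = off and 1 = on.
const cellMeans = { "00": 40, "10": 40, "01": 40, "11": 50 };

// Effect of X while Y is off, and while Y is on.
const effectOfXWithoutY = cellMeans["10"] - cellMeans["00"];
const effectOfXWithY = cellMeans["11"] - cellMeans["01"];

// A non-zero difference means X's effect depends on whether Y is on.
const interaction = effectOfXWithY - effectOfXWithoutY;

console.log(configurationCount(10)); // 1024 configurations for ten tests
console.log({ effectOfXWithoutY, effectOfXWithY, interaction });
```

With these made-up numbers, X looks useless if you only test it with Y off. The gain only shows up when both are on, and running the tests side by side is what lets you see that.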
Look at the whole thing as a form of the tragedy of the commons. Yes, maybe your user experience sucks in a few subtle ways at the moment while other users already have the good stuff. But in the long run we all get to learn foreign languages much faster thanks to the power of science and statistics.
Sorry, being of the older generation and not very computer literate, I have never been into the 'dev console' before. Can someone please tell me where I should type in the command? (I have Google Chrome) I really want to know which groups I'm in. I suspect I am in the one for longer sentences as I have had a lot of problems recently because I just can't type fast enough.
The actual instructions will vary according to which web browser you're using (Chrome, Firefox, Internet Explorer).
In Chrome, after hitting F12, you should see a bar across the top of the pane that opens up. This bar says "Elements, Network, Sources, etc". At the end of the bar is the word "Console", click that to open the console. You'll see a line in the console that says "Duolingo is hiring software engineers" and immediately below that, an empty line with a ">" symbol. Click on that line and enter the command shown in the original post. A new line will appear starting with "► Object". Click on the ► to expand the line and see each of the experiment values on their own line.
There is another way to see this information, which may be easier for you. Just visit: http://www.duolingo.com/users/Rompip
On that page, lots of information about your account is shown, including this "ab_options" structure. Once the page loads, hit Ctrl+F and type "ab_options" to find it easily. It's a little tough to read, but hey, that's code for ya. :)
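If you would rather not Ctrl+F through it, a small sketch like this could pull the structure out of the page text. It assumes the page content is JSON with a top-level "ab_options" key, which may not match the real layout. `extractAbOptions` and the sample text are made up.

```javascript
// Hypothetical helper: pull the ab_options structure out of profile text,
// assuming the text is JSON with a top-level "ab_options" key.
function extractAbOptions(profileText) {
  const profile = JSON.parse(profileText);
  return profile.ab_options !== undefined ? profile.ab_options : null;
}

// Made-up stand-in for what the profile page might contain.
const sampleProfileText =
  '{"username":"example","ab_options":{"hover_effect_experiment":true}}';

console.log(extractAbOptions(sampleProfileText));
```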
"In your web browser, press F12 to get access to the dev console. Then simply type the following command to inspect the variable: duo.user.attributes.ab_options"
I don't get it. When I press F12, the Firefox Inspector panel appears at the bottom, but in what field of which tab do I enter the command? Could someone tell me?
OK, now I found it. I hadn't seen the gray double prompt on the last line when the Net tab of the Console tab is active.
In fact, this info is publicly available. For instance, you can look at my profile on this page: PapyXM. The A/B settings are listed after the keyword ab_options.
You can't affect your groups at all, those are set by Duolingo for the purposes of collecting data on new and potentially new features. Eventually they'll either move you back to the previous system - if they decide to not use the new "health" - or you'll stay permanently on the new system - if they decide it's working as intended.
That, unfortunately, would probably open up a Pandora's box of problems. Not only would they have to hire significantly more staff or spend significantly more time handling transfer requests, it would also potentially affect the size and makeup of their testing groups, which would affect their results. Duolingo seems to be committed to making decisions almost completely data-driven, so while I sympathize with you, I don't see them allowing that anytime soon.
When software developers want to make a change or include a new feature, it is common to perform a randomized trial to see how well it works before deciding if it will be permanently included in the product. Users are randomly assigned to two groups: A and B. One of the groups gets the change, the other doesn't. The company then collects stats for some time to see how the two groups compare in whatever measures are deemed appropriate. In the past, Duolingo has indicated that they measure things like how long users stick with their learning, how regular they are with doing lessons, etc.
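As a minimal sketch of what "randomly assigned to two groups" can look like, the snippet below shuffles a list of users and cuts it in half. It illustrates the general technique only. How Duolingo actually assigns groups is not stated anywhere in this thread, and the function names and user ids are invented.

```javascript
// Fisher-Yates shuffle on a copy of the input array.
function shuffled(items) {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Hypothetical split: shuffle the users, then cut the list in half.
function splitIntoGroups(userIds) {
  const order = shuffled(userIds);
  const half = Math.ceil(order.length / 2);
  return { A: order.slice(0, half), B: order.slice(half) };
}

// Invented user ids; the split differs from run to run.
const groups = splitIntoGroups(["ann", "bob", "cho", "dev", "eli", "fay"]);
console.log(groups);
```

In a split like this, nothing about a user's behaviour decides which group they land in. That is the whole point of randomizing.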
Hi! I'm curious to see what test groups I'm in, but my tech skills are non-existent. I've clicked on browser settings, then F12 developer tools. This is where I came to a sudden halt. I have no idea where exactly to type the command.
Whilst some helpful users have pointed out how to go about this on Chrome, Safari and Firefox, I can't find one explaining the procedure for IE. Could someone help me, please?
I don't have IE, so I can't demo for you, but maybe one of these links will help?
Strange. I tried doing it when on various parts of the Duolingo website (logged in at the home page, in the middle of a lesson, etc.), and I just get an error. Here's the Chrome version of it: "Uncaught TypeError: Cannot read property 'attributes' of undefined at <anonymous>:1:10"
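That message means duo exists but duo.user is undefined on that page, so the lookup fails one step before it reaches ab_options. Why that happens is not clear from the thread. A guarded lookup at least avoids the error. The sketch below uses a made-up `readAbOptions` helper, with `root` standing in for the page's global object.

```javascript
// Returns the ab_options object if the page exposes it, otherwise null.
// `root` stands in for the page's global object (window in a browser).
function readAbOptions(root) {
  const user = root && root.duo && root.duo.user;
  const attributes = user && user.attributes;
  return (attributes && attributes.ab_options) || null;
}

// Where the variable is missing (or was renamed or removed), this logs null
// instead of throwing a TypeError.
console.log(readAbOptions(globalThis));
```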