Researchers from Sweden and Finland want to say thank you!
Thank you to everyone who helped us by completing our survey on your experiences of Duolingo as a tool for language learning. The data collection went very well, above expectations really, so again: thank you! In a few weeks we will also do the prize draw for the Amazon gift cards. If you win, we will send the gift card to the e-mail address you gave us in the survey.
I got a lot of questions about this survey while it was open, so I will describe the goal of the study here. But first, for those of you who are interested in the results: I will write a second post when the data has been analyzed. It will take at least a few months before I can do that, though. And for those of you who have asked about plans for publishing this research: all the data has now been collected (three studies in total) and it is time to write an article. The aim is to publish it in a peer-reviewed journal.
So, what was this study all about? As you may know, Duolingo is a gamified service; that is, it uses elements that you normally find in games, with the goal of making the use of an app (and the associated behavior, which in the case of Duolingo is language learning) more fun and motivating. This study investigated what kinds of experiences or feelings are related to the game aspects of gamified apps. Two examples of the feelings included in this study were the feeling of being challenged by the app and social experiences. So, the first goal of this study was to track down these different feelings and to see how they relate to how motivated users are to use the gamified app. This also means that the research is not just about Duolingo; we are interested in any app that is gamified.
The second goal of this study was to develop a measure that other researchers can use when doing research on gamified apps. Using these kinds of measures is common practice within many areas of research. Basically, they consist of a number of questions that try to quantify something (for example, how challenged you feel by the app). There were some comments that the questions we used in the survey were kind of weird or out of place. Finding questions that work is actually part of this study. Thus, if there are questions that really are out of place, we want to be able to remove those specific questions. I'm not going to go into details here, but by using statistical techniques like factor analysis and structural equation modeling we will find out which questions are working well. Some of you have also responded to the open text question in the survey or commented here on Reddit, explicitly pointing out some of the questions as being strange. We will take these into consideration. Hopefully, the statistical analysis will say the same thing as your comments, in which case these questions will be removed from the final measure that we are developing. If it does not, we will consider removing them anyway, especially the questions that have been mentioned by more than one person.
So these were the two main goals of our study. This means that this was not a study about your satisfaction with Duolingo or how effective Duolingo is as a language learning tool. Instead, we were interested in these different game-feelings and in whether and how they are motivating. Of course, there are also other things that are motivating (and which might be better motivators); they are just not part of this study.
On a side note, this is the second time these exact questions have been used. We did not get even one comment on spelling or weird grammar when we sent them to the users (378 of them) of a gamified runners app; and then I sent the questions to Duolingo users and language learners… :) Thanks to all of you who took the time to tell us about our mistakes!
So, again I will post the results here when the analysis is done; and thank you for your interest and participation!
It will depend on which journal it gets published in and what their rules are. This has to do with copyright rules. Once it is published, I can't do what I want with the article anymore. But if I can, I will post the final paper here.
I'm using SPSS for the exploratory factor analysis. I will also use Amos to do a confirmatory factor analysis. Finally, there will possibly also be a PLS structural equation model in the final paper. In that case, I will use SmartPLS.
Not really. And that is only because I already know the tools that I need for the analysis. I have never used R before, but what I have seen seems really great. So, one day I'm going to start using it.
Can you give me examples of how chi-square and p-values informed your recommendations?
Sorry, but I have not done this analysis yet. The only thing I have done so far is a factor analysis. There I got, for example, the factor Challenge and the factor Social experience.
In the end we want to predict the intention to use Duolingo. But this analysis is not finished yet.
Why didn't you test the questions in the first place? Or was this the test? One of the first rules in doing a survey is to test it first and to have a colleague review it. The survey honestly didn't look like you did either.
Factor analysis or scrapping questions can't fully compensate for a flawed starting point. The theme might be interesting to journals, that's an advantage you've got. But from the information I've got I honestly wouldn't consider the research of sufficient quality and I wouldn't trust your data. Another quite unavoidable issue is that you only reach a certain demographic who happens to use Duolingo on web + read the forums (+the usual bias of people who are willing to participate).
Gotta meet those publishing quotas!
Yes, most statistical tools/algorithms, especially those that involve some sort of statistical inference, like factor analysis or regression models, are based on the assumption that you're using a random sample. As you point out, this doesn't seem to be the case, because those of us who answered the survey are probably a demographically different group from the overall Duolingo user population. In any case, I think some sort of cross-validation procedure where you calculate the out-of-sample error would be called for.
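For anyone unfamiliar with the idea, here is a minimal sketch of k-fold cross-validation: fit the model on part of the data and measure the error on the part that was held out. It assumes a simple one-predictor linear regression and made-up numbers; the names `fit_line` and `kfold_mse` and the variables "challenge" and "intention" are only for illustration and are not taken from the study.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def kfold_mse(xs, ys, k=5, seed=0):
    """Average out-of-sample mean squared error over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit only on the training part, then score on the held-out part.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(mean((ys[i] - (a + b * xs[i])) ** 2 for i in held_out))
    return mean(errors)

# Made-up data: a hypothetical "challenge" score and "intention to use" score.
rng = random.Random(1)
challenge = [rng.uniform(1, 7) for _ in range(50)]
intention = [0.5 * c + 2 + rng.gauss(0, 0.5) for c in challenge]
print(kfold_mse(challenge, intention))
```

Note that this only estimates how well the model predicts new respondents drawn the same way as the sample. It does not by itself fix the problem that the sample is not random.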
Normally it is more or less impossible to achieve a random sample, so this is a very common problem in most social science research. Basically, the problem is about generalization: is it possible to generalize these results to people who are not represented in the sample? What you can do is replicate studies in different contexts. This is one way that we have been trying to mitigate these problems.
Thanks for your comments! In a very condensed format, the process of developing this measure has looked like this.
First, a qualitative study was done. This study was not just looking at Duolingo, but also at other gamified services. During the analysis of this data, different themes were found, and questions were constructed using the qualitative data from these themes. So the questions were to a great extent constructed from the thoughts and feelings of users of the investigated apps.
After the questions had been constructed, they were evaluated by two psychologists and one gamification expert, all of them academics with PhDs. After this, the questions were revised.
Since none of us are native speakers of English, the revised version of the questions was sent for proofreading.
The second study was the first quantitative part, and also the first time the questions were used. This time the survey was sent to users of a gamified app for runners. The main focus of the analysis here was exploratory factor analysis. We also looked at Cronbach's alpha. The main goal was to remove questions that do not work.
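As a small illustration of what Cronbach's alpha measures, here is a minimal sketch using the standard formula: alpha = k/(k-1) × (1 − sum of item variances / variance of the total score), where k is the number of items. The function name `cronbach_alpha` and the answers below are made up for the example; this is not our data and not the SPSS procedure.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Made-up answers from five respondents to three 1-7 Likert items.
example = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(example)
print(round(alpha, 2))
```

A value close to 1 means that the items move together, which is what you would hope for when they are supposed to measure the same feeling. An item that pulls alpha down is a candidate for removal.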
In the third study, which was this Duolingo study, we will further refine which questions remain in the measure, but this time we have also added dependent variables that we are aiming to predict. This time we will do a confirmatory factor analysis and probably also a PLS structural equation model to test the predictive validity of the measure.
For a really good book on scale development (with over 16000 citations on Google Scholar) that describes a process similar to the one we have been following, see DeVellis (2013). I think there might also be a newer edition from 2016.
Ok, your procedure seems quite sound yet I feel so disconnected from the result. With the same research goal (thus essentially the same underlying questions) I'd have created quite a different survey. As odd as it might sound, I can't imagine the train of thought that shaped your survey.
A quick scroll through the pages publicly available of the Devellis book doesn't explain some of the things that bother me about your survey.
Well, that's ok, to have different thoughts :). Hopefully I will be able to post the article here when it gets accepted for publication in a journal. Then you will be able to see the final version of the measure. Also, quite a few of the questions that were part of the survey are not part of the measure we have been developing. They are the dependent variables that we want to be able to predict. These measures were not developed by us, but by other researchers. So, to get a full understanding of our work, you really need to see the final result of this study.
Duolingo was the only app dealing with languages. The other two apps that are part of this study are focusing on running.