Duo: Your New Data Driven Language Teacher

Duolingo uses data and analytics to drive personalized language learning, and it works

Duolingo is a digital, gamified language-learning program with over 500 million registered users learning more than 32 languages.[1] The company started with the mission to “make education free and accessible to everyone in the world.”[2] To achieve this mission and to scale extensively, Duolingo has leveraged the data generated from its digital platform and invested in artificial intelligence (AI) and machine learning (ML) to deliver an exceptional learning experience, personalized to each user.

Figure 1: Duolingo’s app with personalized lessons

 

Duolingo generates hundreds of millions of data points every day from its 42 million monthly active users.[3] The data is used to train and improve AI algorithms that humanize the learning experience and keep users engaged and returning to the app. It’s like having a private language tutor in the form of Duolingo’s mascot, a green owl named Duo.

Figure 2: Duolingo’s mascot “Duo”

 

Language learners differ widely in their goals, prior knowledge, ability, interests, available time and learning preferences, so Duolingo’s strategy of collecting and processing data with AI enables the app to differentiate itself from competitors. Duolingo holds detailed information on each user, so it can tailor lessons to individual needs based on the user’s previous lessons and mistakes. Duolingo therefore provides lessons that allow users to progress continuously. In fact, Duolingo has built its AI model on spaced repetition, which tracks how many times a user has seen a word and estimates how long it will take the user to forget that word. The spaced repetition system, which began in 2013 as Duolingo’s first AI project, predicts which words a user has forgotten and reintroduces those words into the user’s lessons.[4] This helps users master new languages faster and more efficiently through seamless yet personalized lesson recommendations.
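
To make the spaced repetition idea concrete, here is a minimal Python sketch. It assumes an exponential forgetting curve in which each word has a "half-life" (the number of days after which recall probability drops to 50%). The function names, the doubling and halving update rule, and the 0.5 review threshold are illustrative assumptions, not Duolingo's actual implementation.

```python
def recall_probability(days_since_seen: float, half_life_days: float) -> float:
    """Estimated chance the learner still remembers a word.

    Assumes exponential forgetting: recall halves every `half_life_days`.
    """
    return 2.0 ** (-days_since_seen / half_life_days)


def update_half_life(half_life_days: float, correct: bool) -> float:
    """Hypothetical update rule: double the half-life after a correct
    answer, halve it after a mistake (never below a quarter of a day)."""
    if correct:
        return half_life_days * 2.0
    return max(half_life_days / 2.0, 0.25)


def words_to_review(words: dict, threshold: float = 0.5) -> list:
    """Return words whose predicted recall is below `threshold`,
    weakest first. `words` maps word -> (days_since_seen, half_life_days)."""
    scored = [(recall_probability(days, h), w) for w, (days, h) in words.items()]
    return [w for p, w in sorted(scored) if p < threshold]


# Made-up example data: word -> (days since last seen, half-life in days)
words = {"gato": (4.0, 1.0), "perro": (1.0, 8.0), "casa": (3.0, 2.0)}
print(words_to_review(words))  # ['gato', 'casa']
```

In this sketch, "perro" is left out of the review list because it was seen recently and has a long half-life, while "gato" comes first because its predicted recall is lowest.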

 

Duolingo also uses a computer-adaptive placement test: new users who sign up for a course can take a quick five-minute quiz, and the test suggests the most suitable starting point in the course based on their proficiency.[5] By leveraging this data and its AI models, Duolingo can quickly capture the interest of new users and ensure that they are sufficiently challenged and excited by the course they have signed up for.
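
As a rough illustration of how an adaptive test can find a starting point with only a few questions, the sketch below binary-searches over course units. It assumes units are ordered from easiest to hardest and that a learner who can handle a unit can also handle every earlier one. The `place_learner` function and its `answers_correctly` callback are hypothetical; a real adaptive test would have to cope with noisy answers and is likely more sophisticated.

```python
from typing import Callable


def place_learner(
    num_units: int,
    answers_correctly: Callable[[int], bool],
    max_questions: int = 10,
) -> int:
    """Return the index of the first unit the learner should study.

    Each question targets the middle of the remaining range, so every
    answer roughly halves the uncertainty about the learner's level.
    Returns `num_units` if every tested unit was answered correctly.
    """
    low, high = 0, num_units  # the starting unit lies in [low, high]
    asked = 0
    while low < high and asked < max_questions:
        mid = (low + high) // 2
        if answers_correctly(mid):
            low = mid + 1  # unit `mid` is mastered; start later
        else:
            high = mid  # unit `mid` is too hard; start here or earlier
        asked += 1
    return low


# A simulated learner who has already mastered units 0 through 6.
print(place_learner(20, lambda unit: unit < 7))  # 7
```

With 20 units, this simulated learner is placed after five questions, which is the appeal of adapting each question to the previous answer rather than quizzing every unit.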

 

Birdbrain, Duolingo’s machine learning model, complements the app’s personalized learning system by predicting how hard specific lessons will be for a user. Based on that prediction, which is trained and refined on Duolingo’s half a billion lessons completed each day, users’ lessons are calibrated and made harder or easier according to the user’s success. The blame algorithm is used to try to understand why users are making mistakes. Smart tips, another feature based on machine learning, offers immediate tips based on the algorithm’s prediction of the root cause of a mistake.[6]
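
One simple way to predict how hard an exercise will be for a given learner is an Elo-style logistic model, sketched below in Python. This is an assumption for illustration and not a description of Birdbrain's internals: the `ability` and `difficulty` scores, the learning rate, and the 80% target success rate are all hypothetical.

```python
import math


def predict_success(ability: float, difficulty: float) -> float:
    """Probability that the learner answers correctly (logistic model)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def update(ability: float, difficulty: float, correct: bool,
           learning_rate: float = 0.1) -> tuple:
    """Nudge both scores after one answer.

    A surprising success raises the learner's ability and lowers the
    exercise's difficulty; a surprising failure does the opposite.
    """
    error = (1.0 if correct else 0.0) - predict_success(ability, difficulty)
    return ability + learning_rate * error, difficulty - learning_rate * error


def pick_exercise(ability: float, difficulties: dict, target: float = 0.8) -> str:
    """Choose the exercise whose predicted success rate is closest to
    `target`, so the lesson is neither trivial nor discouraging."""
    return min(
        difficulties,
        key=lambda name: abs(predict_success(ability, difficulties[name]) - target),
    )


# Made-up difficulty scores for three exercises.
exercises = {"easy": -2.0, "medium": -1.4, "hard": 1.0}
print(pick_exercise(0.0, exercises))  # medium
```

Calibrating lessons then amounts to repeating two steps: pick the exercise nearest the target success rate, and update both scores after each answer.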

 

Apart from the AI and ML that Duolingo leverages to improve the quality of lessons, one of Duolingo’s best features is its app notifications, which are backed by AI and prompt users to open the app and practice at the times when they are most likely to respond. By leveraging data to optimize notification timing, Duolingo has increased retention of new users by 2% in the period from one day to one week after download.[7]
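
The timing idea can be sketched as choosing, for each user, the hour with the best historical response rate. The Python example below is a simplified assumption, not Duolingo's method: the `best_send_hour` function, the data layout, and the smoothing prior are hypothetical. The prior adds a few imaginary sends to every hour so that an hour with only a handful of notifications cannot win on a lucky streak.

```python
def best_send_hour(history: dict, prior_opens: float = 1.0,
                   prior_sends: float = 10.0) -> int:
    """Pick the hour of day with the highest smoothed open rate.

    `history` maps hour (0-23) -> (notifications_sent, notifications_opened).
    """
    def smoothed_rate(hour: int) -> float:
        sent, opened = history[hour]
        return (opened + prior_opens) / (sent + prior_sends)

    return max(history, key=smoothed_rate)


# Made-up history for one user: hour -> (sent, opened)
history = {8: (40, 6), 13: (2, 2), 20: (50, 15)}
print(best_send_hour(history))  # 20
```

Without smoothing, 13:00 would win with a 100% open rate from just two notifications; with the prior, the better-supported 20:00 slot is chosen instead.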

 

Figure 3: Duolingo’s AI-backed notifications

Implementing AI and ML has not been an easy path for Duolingo. In particular, the company struggled to find talent that could advance its data analytics efforts while still understanding the psychology and cognitive nature of language learning. However, Duolingo’s investments in AI and ML have enabled the company to capture value by increasing retention rates through content and incentives that cater to each individual. This higher retention rate translates to increased revenue through conversions from free to paid users and more targeted in-app advertisements. The success of AI and ML at Duolingo is underscored by the company’s 2020 revenues of $180 million, a 13x increase from 2017.[8] As Duolingo continues to establish itself as a valuable data-driven learning platform, talent acquisition should become less of a challenge, with more data scientists excited by the opportunity to work for a $1.5 billion educational technology company.[9]

Figure 4: Duolingo’s growth in revenue as a result of AI and ML models [10]

 

Moving forward, it will be crucial for Duolingo to continue to focus its efforts on recommending the most effective order for users to learn languages. Additionally, there are many opportunities for Duolingo to further leverage AI and ML to offer users the opportunity to have conversations and live learning experiences with bots. This use of AI and ML will better simulate an immersive language learning experience. With years of data available, Duolingo has created a competitive advantage and is positioned to continue to differentiate itself and to expand its AI and ML capabilities to provide a superior language-learning experience.

Endnotes

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

 

 


Student comments on Duo: Your New Data Driven Language Teacher

  1. Great article, Tiffany! I’m impressed by all the ways Duolingo incorporates data analytics in its product to optimize the user experience and maximize retention. I guess one thing that would be interesting to understand is how Duolingo strikes a balance between prioritizing progress (by making lessons harder, more challenging) and focusing on retention (by making sure the user doesn’t give up). At the end of the day, it’s not in Duolingo’s best interest to make users multilingual too fast! I agree that the solution to keep users (once they are comfortable speaking a language) would be to have more AI-driven ‘practice’ features, that simulate human conversations.
