Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Duolingo is an American educational technology company that produces learning apps and provides language certification. Their main app is considered the most popular language learning app in the world.
To progress in their learning journey, each user of the application completes a set of lessons that present words of the language they want to learn. The set of lessons is open-ended, and each word is applied in a different context. On top of that, Duolingo uses a spaced repetition approach, in which the user sees an already known word again to reinforce their learning.
Each line in this file refers to a Duolingo lesson that had a target word to practice.
The columns are as follows:
p_recall - proportion of exercises from this lesson/practice where the word/lexeme was correctly recalled
timestamp - UNIX timestamp of the current lesson/practice
delta - time (in seconds) since the last lesson/practice that included this word/lexeme
user_id - student user ID who did the lesson/practice (anonymized)
learning_language - language being learned
ui_language - user interface language (presumably native to the student)
lexeme_id - system ID for the lexeme tag (i.e., word)
lexeme_string - lexeme tag (see below)
history_seen - total times user has seen the word/lexeme prior to this lesson/practice
history_correct - total times user has been correct for the word/lexeme prior to this lesson/practice
session_seen - times the user saw the word/lexeme during this lesson/practice
session_correct - times the user got the word/lexeme correct during this lesson/practice
The lexeme_string column contains a string representation of the "lexeme tag" used by Duolingo for each lesson/practice (data instance) in our experiments. The lexeme_string field uses the following format:
`surface-form/lemma
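To make the column definitions concrete, here is a minimal Python sketch that reads rows with these column names and derives a few per-row quantities. The sample rows, the user and lexeme IDs, and the "placeholder" lexeme_string values are invented for illustration and are not taken from the dataset. The sketch also assumes that the file is comma-separated with a header row and that p_recall corresponds to session_correct divided by session_seen, which the column descriptions suggest but do not state outright.

```python
import csv
import io

# Hypothetical sample rows, invented for illustration (not real dataset records).
# Assumption: the file is comma-separated and has a header row with these names.
SAMPLE = """p_recall,timestamp,delta,user_id,learning_language,ui_language,lexeme_id,lexeme_string,history_seen,history_correct,session_seen,session_correct
1.0,1362076081,86400,u:example1,de,en,lex-001,placeholder,6,4,2,2
0.5,1362076081,172800,u:example1,de,en,lex-002,placeholder,4,4,2,1
"""


def summarize(row):
    """Derive a few per-row quantities from the documented columns."""
    session_seen = int(row["session_seen"])
    session_correct = int(row["session_correct"])
    history_seen = int(row["history_seen"])
    history_correct = int(row["history_correct"])
    return {
        "lexeme_id": row["lexeme_id"],
        # Value stored in the file.
        "p_recall": float(row["p_recall"]),
        # Assumption: p_recall should match this ratio for the same row.
        "session_ratio": session_correct / session_seen,
        # Accuracy before this lesson; None if the word was never seen before.
        "history_accuracy": history_correct / history_seen if history_seen else None,
        # delta is given in seconds; convert to days for readability.
        "days_since_last": int(row["delta"]) / 86400,
    }


rows = [summarize(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r)
```

Comparing session_ratio with the stored p_recall on the real file is a quick way to check whether that assumption holds before relying on it.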