Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains pre-processed learning traces from Duolingo’s spaced repetition system. It includes timestamps, user interactions, and correctness data, structured to analyze learning patterns over time. The dataset was cleaned and refined in Google Colab before being used to generate visual insights, including a heatmap showing learning activity trends.
Check out the heatmap visualization: https://github.com/Charity-Githogora/duolingo-heatmap-insights
Source: The original dataset was obtained from Kaggle (https://www.kaggle.com/datasets/aravinii/duolingo-spaced-repetition-data) and has been processed to improve usability for data analysis and visualization.
Columns:
timestamp – The time of the user interaction (converted to datetime format).
hour – The hour of the day the interaction occurred.
day_of_week – The day of the week the interaction occurred.
correct – Whether the response was correct (1) or incorrect (0).
Other relevant features extracted for analysis.
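As a rough illustration of how the hour and day_of_week columns relate to timestamp, the sketch below derives both from a single value. It assumes the raw timestamp is a UNIX time in seconds interpreted as UTC, and that day_of_week is stored as an English day name; neither detail is stated in this description, and the function name derive_time_features is hypothetical.

```python
from datetime import datetime, timezone

# Day names indexed by datetime.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def derive_time_features(unix_seconds):
    """Return (timestamp, hour, day_of_week) for a UNIX time in seconds.

    Assumption: the value is in seconds and is interpreted as UTC.
    """
    ts = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return ts, ts.hour, DAY_NAMES[ts.weekday()]

# Hypothetical raw record; the values are illustrative, not from the dataset.
raw = {"timestamp": 1362076081, "correct": 1}
ts, hour, day = derive_time_features(raw["timestamp"])
row = {"timestamp": ts, "hour": hour, "day_of_week": day,
       "correct": raw["correct"]}
print(row)
```

If the published file stores day_of_week as a number instead of a name, the lookup in DAY_NAMES can simply be dropped.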
Usage: The dataset can be used for analyses such as identifying peak learning hours, tracking performance trends over time, and examining how engagement relates to accuracy. Researchers and data enthusiasts can explore predictive modeling, time-series analysis, and interactive visualizations. The dataset can also be used to generate heatmaps and other visual representations of learning activity.
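A minimal sketch of the peak-hours analysis mentioned above: it groups rows by day_of_week and hour, then reports the interaction count and accuracy for each cell, which is the kind of grid a heatmap would be drawn from. The sample rows are made up for illustration, the helper names activity_grid and peak_cell are hypothetical, and this is not necessarily how the linked heatmap was produced.

```python
from collections import defaultdict

def activity_grid(rows):
    """Aggregate rows into {(day_of_week, hour): (interactions, accuracy)}."""
    totals = defaultdict(lambda: [0, 0])  # [interaction count, correct count]
    for row in rows:
        cell = totals[(row["day_of_week"], row["hour"])]
        cell[0] += 1
        cell[1] += row["correct"]
    return {key: (n, hits / n) for key, (n, hits) in totals.items()}

def peak_cell(grid):
    """Return the (day_of_week, hour) key with the most interactions."""
    return max(grid, key=lambda key: grid[key][0])

# Illustrative rows only; these values are not taken from the dataset.
sample = [
    {"day_of_week": "Monday", "hour": 9, "correct": 1},
    {"day_of_week": "Monday", "hour": 9, "correct": 0},
    {"day_of_week": "Monday", "hour": 20, "correct": 1},
    {"day_of_week": "Tuesday", "hour": 20, "correct": 1},
]

grid = activity_grid(sample)
for (day, hour), (count, accuracy) in sorted(grid.items()):
    print(f"{day} {hour:02d}:00  interactions={count}  accuracy={accuracy:.2f}")
print("Peak cell:", peak_cell(grid))
```

The same grid can be passed to any plotting library as a day-by-hour matrix, with either the count or the accuracy as the cell value.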