License: https://creativecommons.org/publicdomain/zero/1.0/
3kk (3 million) nicknames of Minecraft players.
License: https://creativecommons.org/publicdomain/zero/1.0/
### Introduction
Hello. My name is Brandon Conrady and I am currently early on in my data science studies in college. This is my first data set, so enjoy!
### Context
I am currently taking a statistics course, which made me curious about finding distributions in samples gathered from my day-to-day life. Since I play video games, I turned to Minecraft. For those who don't know, Minecraft has a block called the composter, which accepts an item such as wheat. The item disappears and has a percent chance of raising the compost level within the composter. When the compost level reaches 7, the composter produces another item called bone meal, which can act as fertilizer to grow plants. I wanted to collect this data and put it on Kaggle to see what people could come up with using it.
### Content
Each CSV file contains samples recorded when the specified item was used on the composter. Most contain 2000 entries; the cookies dataset contains 3000, since cookies are more efficient at creating bone meal. I may add further entries to each CSV file, but since the current data already approximates a distribution, I am unsure whether more entries would be useful.
### Acknowledgements
Minecraft is the intellectual property of Microsoft. The datasets do not involve any direct use of the product itself; they are records of observations gathered while playing the game. Still, I should state the obvious: I don't own the game.
### Inspiration
I wanted to see if, based on the data provided, people could estimate the probability that for a given item, adding one of it to the composter will raise the compost level. I am also just generally curious as to what applications people can come up with given the data provided. By all means take it and run with it!
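As a starting point for the estimation question above, here is a minimal sketch that treats each composter use as a Bernoulli trial and reports the sample proportion with a Wilson score interval. The column layout of the CSV files is not described here, so the sketch assumes each row has already been reduced to `1` (compost level rose) or `0` (it did not). The function name `estimate_success_probability` and the synthetic sample are illustrative only, not values from the dataset.

```python
import math


def estimate_success_probability(outcomes, z=1.96):
    """Estimate a Bernoulli success probability from 0/1 outcomes.

    Returns (p_hat, lower, upper), where lower/upper are the bounds of a
    Wilson score interval (z=1.96 gives roughly 95% coverage).
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one observation")

    successes = sum(outcomes)
    p_hat = successes / n

    # Wilson score interval: better behaved than the plain normal
    # approximation when p_hat is near 0 or 1.
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return p_hat, center - half_width, center + half_width


# Synthetic sample for illustration only (NOT taken from the dataset):
# 2000 trials, 650 of which raised the compost level.
sample = [1] * 650 + [0] * 1350
p_hat, low, high = estimate_success_probability(sample)
print(f"estimate={p_hat:.3f}, interval=({low:.3f}, {high:.3f})")
```

With 2000 to 3000 entries per item, the interval is already fairly narrow, which is consistent with the note above that more entries may not add much.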
License: https://creativecommons.org/publicdomain/zero/1.0/
Important Info: This dataset contains swears. I filtered out as much racism as possible. People who were racist were banned from the server. I am not affiliated with the server in any way.
There was a Minecraft semi-anarchy server that logged all of its messages to Discord between 2020 and 2023. I downloaded all of them and compiled them into a JSON file in chronological order. I also cleaned the messages and removed any racial slurs or offensive words. That being said, there are still swears.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset is extracted from the registration form of a major Minecraft event organized on Discord. The personal data columns were added as dummy values and do not represent the real details of the participants.
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
This Zenodo record contains backups of parts of the original datasets published with the 2019 MineRL paper, "MineRL: A Large-Scale Dataset of Minecraft Demonstrations". The data consists of recordings of human players playing Minecraft, with the video feed and actions captured and stored. See the project documentation for how to use the data.
MineRL Github page: https://github.com/minerllabs/minerl
MineRL Documentation: https://minerl.readthedocs.io/en/latest/
Documentation specifically on using this data: https://minerl.readthedocs.io/en/v0.3.7/tutorials/data_sampling.html
License: http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
After learning some popular image generative algorithms (DCGANs, WGANs, CGANs), I tried to take a shot at a dataset I created myself. However, because of the small size of this dataset, the results weren't as good as I had hoped. That's why I made this dataset available to everyone, as a reminder that, even after learning a lot, you can still learn from people who have a lot more experience than you. (And because Minecraft!)
The dataset has 300+ front faces of popular Minecraft skins from famous YouTubers. All of the images are in .png format with the same image size (190×190).
This dataset was inspired by other Minecraft skin datasets available on Kaggle. However, those include the whole skin, which, personally, I don't find as good-looking as just the front face of the Minecraft skin.
**Introduction** Minecraft is one of the most popular games, where players can freely explore, build, adventure, survive, and do whatever they like. Among the many resources in the game, diamonds are one of the most precious and important. Diamonds are used to make high-durability tools and armor, and for players who choose to survive and fight, they are essential for advancing in the late game.
However, diamonds do not appear completely randomly on the map. In this project, I will take randomly generated Minecraft maps as the research object, select a 100×100 block area from each map, and observe and record the number and distribution of diamond veins.
Through this research, I hope to have a clearer understanding of the patterns and probabilities of diamond generation, and propose an interesting perspective that combines mathematics and statistics with games. I also hope that this project can show how logical systems and data analysis can be found behind everyday entertainment.
**Analysis and Discussion** After examining four randomly generated Minecraft maps, I found that the number of diamond ores varies slightly depending on the biome. For example, areas like Jagged Peaks and Taiga Village had a few more diamonds. This could be related to the underground structure or terrain complexity, which might increase the chance of diamond generation.
Most of the diamonds were found between Y-levels -53 and -59, confirming the common belief that this depth range has the highest spawn rate. While Minecraft does not visually indicate diamond locations, staying within this range increases the likelihood of finding them. The data in this project aligns with the official diamond generation mechanics.
**Conclusion** Based on the analysis of four different 100×100 areas, each with two vertical layers (20,000 blocks per area), the average number of diamonds found per area was 32.5, resulting in an estimated diamond appearance rate of 0.1625%.
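The arithmetic behind that rate can be checked with a short sketch. It assumes the 32.5 average is a per-area count and that each area covers 100 × 100 blocks across two layers; the function name `appearance_rate` is illustrative.

```python
def appearance_rate(diamond_count, width, depth, layers):
    """Return the fraction of sampled blocks that are diamonds."""
    total_blocks = width * depth * layers
    return diamond_count / total_blocks


# 100 x 100 area, two vertical layers -> 20,000 blocks per area.
rate = appearance_rate(diamond_count=32.5, width=100, depth=100, layers=2)
print(f"{rate:.4%}")  # prints "0.1625%"
```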
This project demonstrates how mathematical analysis can be applied to games like Minecraft. What seems like random generation actually follows hidden patterns. By combining observation and basic statistics, we can better understand and even predict where resources like diamonds are most likely to appear.
License: https://cdla.io/sharing-1-0/
The Indonesian Chat Dataset includes 10,702 meticulously edited chats among users of Roblox and Minecraft. Some chats use conventional terminology, while others use colloquial expressions. Slang phrases appear because younger gamers often use them in their regular conversations. The author personally sorted the chats into four classes: neutral, violence, racist, and harassment.
Classification details:
- Neutral: no violent sentences; casual chat without any intent to harm someone
- Violence: swearing, threats, incitement to harm others, or associating people with an animal or creature
- Racist: discriminating against or demeaning people based on race, religion, ethnicity, or nationality, including slurs, hate speech, or promoting racial superiority
- Harassment: porn, sexually abusive words, body shaming, derogation
An example of the dataset's content and the preprocessing methodology is the sentence “Pada bisa diem ga sih, 4nj1n9 semua,” which translates to “Can you shut up, all of you are dogs,” where numbers substitute for certain letters in the word for 'dog.' It is categorized as 'violence.' The preprocessing procedure involves converting the phrase to lowercase, normalizing it by removing punctuation, and substituting some numerals with their nearest alphabetic equivalents, such as replacing the numeral 4 with 'a' and 1 with 'i'. The preprocessed sentence is: "pada bisa diem ga sih anjing semua."
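The preprocessing steps described above can be sketched roughly as follows. This is not the author's actual pipeline. The mappings 4 → 'a' and 1 → 'i' come from the description; 9 → 'g' is inferred from the worked example ("4nj1n9" → "anjing"); any further substitutions are unknown. The names `LEET_MAP` and `preprocess` are illustrative.

```python
import string

# Numeral-to-letter substitutions. 4->a and 1->i are stated in the
# description; 9->g is an assumption inferred from the worked example.
LEET_MAP = str.maketrans({"4": "a", "1": "i", "9": "g"})

# Translation table that deletes ASCII punctuation.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text):
    """Lowercase, strip punctuation, and replace numerals with letters."""
    lowered = text.lower()
    no_punct = lowered.translate(PUNCT_TABLE)
    normalized = no_punct.translate(LEET_MAP)
    # Collapse any repeated whitespace left behind by removed punctuation.
    return " ".join(normalized.split())


print(preprocess("Pada bisa diem ga sih, 4nj1n9 semua,"))
# prints "pada bisa diem ga sih anjing semua"
```

Note that a blanket substitution like this also rewrites genuine numbers (for example "19" would become "ig"), so a real pipeline would likely need a rule for telling disguised letters apart from actual numerals.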