CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Alireza Ahmadihesar
Released under CC0: Public Domain
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This project explores and analyzes the most popular datasets available on Kaggle. By examining these datasets, we aim to identify key trends, understand user preferences, and highlight the topics that drive engagement within the data science and machine learning communities.
The attached notebook also contains charts and analytics.
This dataset was created by valerie lucro
This dataset was created by v1nor1
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by M Shaheena
Released under Apache 2.0
This dataset was created by Masab001
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Abbad Alam
Released under Apache 2.0
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Zheung Yik2024
Released under Apache 2.0
This dataset was created by Danh_Anh
This dataset was created by Chekhova Nadezhda
This dataset was created by Mohammad Mehdi
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Rohan Bhopale
Released under Apache 2.0
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a snapshot of the official Kaggle datasets leaderboard (taken between October 21st and October 24th 2024). For every user, the dataframe contains all their datasets with information sourced through the Kaggle API. Currently, the dataset only contains the top 250, but I have a larger snapshot of the leaderboard. I aim to expand the dataset to include the top 1000 dataset contributors.
This dataset was created by Marcos Faria
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This data was fetched from the LinkedIn API.
The 'post.csv' file provides the details of a post made on LinkedIn by a certain user. All details are in the file.
The 'comments.csv' file contains the information about the comments on the post, along with other details.
This is my first time making a dataset, so please excuse any discrepancies and suggest how I can improve it.
Thanks, and happy kaggling.
This dataset was created by Thanh Minh Nguyễn Lê
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Explore our public data on competitions, datasets, kernels (code / notebooks) and more
Meta Kaggle may not be the Rosetta Stone of data science, but we do think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle’s community and activity.
Strategizing to become a Competitions Grandmaster? Wondering who, where, and what goes into a winning team? Choosing evaluation metrics for your next data science project? The kernels published using this data can help. We also hope they'll spark some lively Kaggler conversations and be a useful resource for the larger data science community.
This dataset is made available as CSV files through Kaggle Kernels. It contains tables on public activity from Competitions, Datasets, Kernels, Discussions, and more. The tables are updated daily.
Please note: This data is not a complete dump of our database. Rows, columns, and tables have been filtered out and transformed.
In August 2023, we released Meta Kaggle for Code, a companion to Meta Kaggle containing public, Apache 2.0 licensed notebook data. View the dataset and instructions for how to join it with Meta Kaggle here
We also updated the license on Meta Kaggle from CC-BY-NC-SA to Apache 2.0.
UserId column in the ForumMessages table has values that do not exist in the Users table.
True or False.
Total columns.
For example, the DatasetCount is not the total number of datasets with the Tag according to the DatasetTags table.
db_abd_create_tables.sql script.
clean_data.py script.
The script does the following steps for each table:
NULL.
add_foreign_keys.sql script.
Total columns in the database tables. I do that by running the update_totals.sql script.
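The mismatch noted above, where UserId values in ForumMessages have no matching row in Users, can be located with an anti-join. The following is only a minimal sketch, assuming the tables have been loaded into SQLite. The in-memory rows, the Id and UserName columns, and the sample values are made up for illustration and are not taken from the real data.

```python
import sqlite3

# Hypothetical miniature versions of the two tables; the real Meta Kaggle
# tables have many more columns and rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE ForumMessages (Id INTEGER PRIMARY KEY, UserId INTEGER);
    INSERT INTO Users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO ForumMessages VALUES (10, 1), (11, 2), (12, 99);
    """
)

# Anti-join: keep only messages whose UserId has no matching Users.Id.
orphans = conn.execute(
    """
    SELECT m.Id, m.UserId
    FROM ForumMessages AS m
    LEFT JOIN Users AS u ON u.Id = m.UserId
    WHERE u.Id IS NULL
    ORDER BY m.Id
    """
).fetchall()

print(orphans)
```

Rows found this way would have to be handled (for example dropped or set to NULL) before a foreign key from ForumMessages.UserId to Users could hold. Which option the author chose is not stated here.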
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Kunal Kumar Sahoo
Released under CC0: Public Domain
This dataset provides a curated collection of freely available APIs covering a wide range of categories. Whether you are a developer, data enthusiast, or just someone interested in exploring various API services, this dataset offers a valuable resource to help you discover, access, and understand these APIs.
Each entry in the dataset includes essential information about the APIs, such as the API name, a brief description of its functionality, authentication requirements, HTTPS support, and a link to the API's documentation or endpoint. The dataset is categorized to facilitate easy exploration and access to APIs across different domains.
Example entries:
"AdoptAPet": A resource to help get pets adopted, requiring an API key for access.
"Axolotl": A collection of axolotl pictures and facts, with HTTPS support and no authentication required.
"Cat Facts": Providing daily cat facts, with HTTPS support and no authentication needed.
Columns:
API: This column provides the name or title of the API.
Description: In this column, you'll find a brief description of the API's functionality and what it offers.
Auth (Authentication): This column indicates whether the API requires authentication for access. If it specifies "apiKey" or any other form of authentication, users need to provide valid credentials or keys to utilize the API.
HTTPS: This column indicates whether the API supports secure communication over HTTPS.
Link: This column provides a URL or link to the API's documentation.
Category: The "Category" column categorizes the API into a relevant domain or topic.
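As an illustration of how these columns might be used, the sketch below filters for APIs that need no authentication and support HTTPS. It rests on several assumptions that are not confirmed by the dataset description: that an empty Auth value means "no authentication", that HTTPS is stored as "True"/"False", and that the links and the "Animals" category look like the placeholders shown. The HTTPS value for AdoptAPet is also assumed.

```python
import csv
import io

# Hypothetical sample in the column layout described above. The Auth and
# HTTPS encodings, the example.com links and the Category values are
# assumptions made for this sketch.
SAMPLE = """API,Description,Auth,HTTPS,Link,Category
AdoptAPet,Resource to help get pets adopted,apiKey,True,https://example.com/adoptapet,Animals
Axolotl,Collection of axolotl pictures and facts,,True,https://example.com/axolotl,Animals
Cat Facts,Daily cat facts,,True,https://example.com/catfacts,Animals
"""


def open_https_apis(rows):
    """Return names of APIs that need no authentication and support HTTPS."""
    return [
        row["API"]
        for row in rows
        if not row["Auth"].strip() and row["HTTPS"].strip().lower() == "true"
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
names = open_https_apis(rows)
print(names)
```

With the real file, the same function could be applied to `csv.DictReader(open(path, newline=""))` once the actual encodings of Auth and HTTPS have been checked.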
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This is the dataset that I created as part of the Google Data Analytics Professional Certificate capstone project. The MyAnimeList website has a vast repository of ratings, rankings, and viewership data that could be used for various kinds of analysis. I extracted several datasets from the MyAnimeList (MAL) detail API: https://myanimelist.net/apiconfig/references/api/v2
Possible uses for this data include tracking what anime viewers are watching most within a particular time period, and which titles are being scored well (out of 10) and which aren't.
My viz for this data will be part of a Tableau dashboard located here. This dashboard allows fans to explore the dataset and locate top-scored or popular titles by genre, time period, and demographic (although this field isn't always entered).
The extraction and cleaning process is outlined on GitHub here.
I plan to update this potentially every 2 weeks, depending on my availability and the interest in this dataset.
Extracting and loading this data involved some transformations that should be noted:
alternative_title field in the anime_table. This uses the English version of the name unless it is null; if the value is null, it uses the default name. This was in an effort to make the title accessible to English speakers. The original title field can be used if desired.
genres field. MyAnimeList includes demographic information (shounen, seinen, etc.) in the genres field. I've extracted it so that it could be used as its own field. However, many of those fields are null, making it somewhat difficult to use.
start_date have been used. I will continue to use this method as long as it is viable.
The primary keys in all of the tables (with the exclusion of the tm_ky table) are foreign keys to other tables. As a result, the tables have composite primary keys made up of 2 or more fields.
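The alternative_title fallback and the demographic split described in the transformation notes above can be sketched as follows. This is not the author's cleaning code (that is outlined in the linked repository). The function names, the sample titles, and the DEMOGRAPHICS set are hypothetical, and the set contains only the two demographic labels named in the text.

```python
# Assumed set of demographic labels; the notes name only shounen and seinen.
DEMOGRAPHICS = {"shounen", "seinen"}


def alternative_title(title, english_title):
    """Use the English title unless it is missing, else the default title.

    An empty string is treated like null here, which is an assumption.
    """
    return english_title if english_title else title


def split_demographics(genres):
    """Separate demographic labels from the remaining genre names."""
    demo = [g for g in genres if g.lower() in DEMOGRAPHICS]
    rest = [g for g in genres if g.lower() not in DEMOGRAPHICS]
    return demo, rest


print(alternative_title("Original Title", "English Title"))
print(alternative_title("Original Title", None))
print(split_demographics(["Action", "Shounen", "Drama"]))
```

When a title has no demographic label among its genres, `split_demographics` returns an empty list for it, which corresponds to the null demographic values mentioned above.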
| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| demo_id | int | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| genres_id | int | PK |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| mean | dbl | |
| rank | int | |
| popularity | int | |
| num_scoring_users | int | |
| statistics.watching | int | |
| statistics.completed | int | |
| statistics.on_hold | int | |
| statistics.dropped | int | |
| statistics.plan_to_watch | int | |
| statistics.num_scoring_users | int | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| studio_id | int | PK |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| synonyms | chr | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| title | chr | |
| main_picture.medium | chr | |
| main_picture.large | chr | |
| alternative_titles.en | chr | |
| alternative_titles.ja | chr | |
| start_date | chr | |
| end_date | chr | |
| synopsis | chr | |
| media_type | chr | |
| status | chr | |
| num_episodes | int | |
| start_season.year | int | |
| start_season.season | chr | |
| rating | chr | |
| nsfw | chr | |
| demo_de | chr ... |
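
Because every table above carries both tm_ky and mal_id, rows from two tables can be matched on that pair of fields. The sketch below shows one way to do that with plain Python dictionaries. The rows and the variable names `anime_rows` and `stats_rows` are made up for illustration. Only the field names come from the schemas.

```python
# Made-up rows using field names from the schemas above.
anime_rows = [
    {"tm_ky": 1, "mal_id": 100, "title": "Example A"},
    {"tm_ky": 1, "mal_id": 200, "title": "Example B"},
    {"tm_ky": 2, "mal_id": 100, "title": "Example A"},
]
stats_rows = [
    {"tm_ky": 1, "mal_id": 100, "mean": 8.5, "rank": 10},
    {"tm_ky": 1, "mal_id": 200, "mean": 7.0, "rank": 250},
]


def join_on_keys(left, right, keys=("tm_ky", "mal_id")):
    """Inner-join two lists of dicts on a composite key."""
    # Index the right-hand rows by their key tuple for constant-time lookup.
    index = {tuple(r[k] for k in keys): r for r in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**row, **match})
    return joined


joined = join_on_keys(anime_rows, stats_rows)
for row in joined:
    print(row["tm_ky"], row["mal_id"], row["title"], row["mean"])
```

Joining on mal_id alone would mix rows from different tm_ky values, which is why both fields are used as the key here.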
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Alireza Ahmadihesar
Released under CC0: Public Domain