https://choosealicense.com/licenses/cc/
Dataset Overview
Intro
This dataset was downloaded from the good folks at fivethirtyeight. You can find the original (or in the future, updated) versions of this and several similar datasets at this GitHub link.
Data layout
Here are the columns in this dataset, which contains data on every NBA player, broken out by season, since the 1976 NBA-ABA merger:
Column | Description |
---|---|
player_name | Player name |
player_id | Basketball-Reference.com player ID |
season… See the full description on the dataset page: https://huggingface.co/datasets/andrewkroening/538-NBA-Historical-Raptor.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Custom Segmentation 538 is a dataset for instance segmentation tasks - it contains Cracks annotations for 538 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Biden Approval Polling’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kaggleqrdl/biden-approval-polling on 28 January 2022.
--- Dataset description provided by original source is as follows ---
There is a contract on Predictit.org tracking Biden's approval rating that I've followed for fun (not for profit). I often used this data to help predict where it will go next. For example, Rasmussen is a very fresh pollster (daily), while the others lag somewhat. You can even subscribe to Rasmussen's service and get updates before most people, though this particular exploit is well known.
The data is fairly straightforward and I include a brief description of each of the columns.
Something to be aware of: a few pollsters, such as Ipsos, will report wildly different results within days because they are sometimes polling for specific organizations, such as the Economist, and sometimes just for themselves. These different surveys use different questions, and thus have different 'house effects' (i.e., political bias). For example, some surveys start with the question "Do you approve of the direction of the country?", while others start with "Do you approve of Joe Biden?"
I would like to acknowledge Nate Silver and the whole 538 crew for aggregating this data. Very interesting and informative - https://projects.fivethirtyeight.com/biden-approval-rating/
Please note I have removed all 538 model specific information such as weights, grades, etc.
I think it'd be very cool to see how far ahead we could predict changes in Biden's approval ratings, possibly using other sources such as twitter and news organizations, plus maybe other datasets on Kaggle itself.
--- Original source retains full ownership of the source dataset ---
Historical series of Advance Weekly Initial and Continued Claims reports (ETA 538). This information is provided by states on a weekly basis and includes the advance weekly claims data as reported by states in the ETA 538 report. These data are not revised after the initial submission and subsequent publication in the UI weekly claims news release.
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains data behind the story Every Guest Jon Stewart Ever Had On ‘The Daily Show’.
Header | Definition |
---|---|
YEAR | The year the episode aired |
GoogleKnowlege_Occupation | Their occupation or office, according to Google's Knowledge Graph or, if they're not in there, how Stewart introduced them on the program. |
Show | Air date of episode. Not unique, as some shows had more than one guest |
Group | A larger group designation for the occupation. For instance, US senators, US presidents, and former presidents are all under "politicians" |
Raw_Guest_List | The person or list of people who appeared on the show, according to Wikipedia. The GoogleKnowlege_Occupation only refers to one of them in a given row. |
Source: Google Knowledge Graph, The Daily Show clip library, Wikipedia.
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
Cover photo by Oscar Nord on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
Digital terrain model with a 25 m grid spacing, with the same sheet distribution as the MTN50. ASCII ESRI Matrix (asc) file format. Geodetic reference system ETRS89 (in the Canary Islands REGCAN95, compatible with ETRS89) and UTM projection in the zone corresponding to each sheet and also in the extended zone 30 (for sheets located in zones 29 and 31). In the Canary Islands, the UTM zone is 28. The MDT25 has been obtained by interpolation of digital terrain models with a 5 m grid spacing from the National Plan for Aerial Orthophotography (PNOA).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by iXiOiXi
Released under CC0: Public Domain
https://creativecommons.org/publicdomain/zero/1.0/
This directory contains the data behind the story ‘Mad Men’ Is Ending. What’s Next For The Cast?
The primary file show-data.csv contains data on actors who appeared in at least half the episodes of television shows that were nominated for an Emmy for Outstanding Drama since the year 2000. It contains the following variables:
Header | Definition |
---|---|
Performer | The name of the actor, according to IMDb. This is not a unique identifier - two performers appeared in more than one program |
Show | The television show where this actor appeared in more than half the episodes |
Show Start | The year the television show began |
Show End | The year the television show ended, "PRESENT" if the show remains on the air as of May 10. |
Status? | Why the actor is no longer on the program: "END" if the show has concluded, "LEFT" if the show remains on the air. |
CharEnd | The year the character left the show. Equal to "Show End" if the performer stayed on until the final season. |
Years Since | 2015 minus CharEnd |
#LEAD | The number of leading roles in films the performer has appeared in since and including "CharEnd", according to OpusData |
#SUPPORT | The number of supporting roles in films the performer has appeared in since and including "CharEnd", according to OpusData |
#Shows | The number of seasons of television in which the performer appeared in at least half the episodes since and including "CharEnd", according to OpusData |
Score | #LEAD + #Shows + 0.25*(#SUPPORT) |
Score/Y | "Score" divided by "Years Since" |
lead_notes | The list of films counted in #LEAD |
support_notes | The list of films counted in #SUPPORT |
show_notes | The seasons of shows counted in #Shows |
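
The Score and Score/Y definitions in the table can be illustrated with a minimal Python sketch. The function name `performer_score`, the sample numbers, and the handling of a zero "Years Since" value (returning `None`) are assumptions made for illustration; the source does not say how that case is treated in the published file.

```python
def performer_score(lead, support, shows, char_end, reference_year=2015):
    """Return (score, years_since, score_per_year) using the table's definitions."""
    # Score = #LEAD + #Shows + 0.25 * (#SUPPORT)
    score = lead + shows + 0.25 * support
    # Years Since = 2015 minus CharEnd
    years_since = reference_year - char_end
    # Assumption: Score/Y is left undefined when Years Since is zero.
    score_per_year = score / years_since if years_since > 0 else None
    return score, years_since, score_per_year


# Hypothetical performer: 2 leading roles, 4 supporting roles,
# 1 qualifying season of television, character left in 2010.
score, years_since, score_per_year = performer_score(
    lead=2, support=4, shows=1, char_end=2010
)
print(score, years_since, score_per_year)
```

Supporting roles count for a quarter of a leading role or a television season, so a performer with many small film parts can still rank below one with a single lead.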
The supplemental file performer-scores.csv is the consolidated data from show-data.csv made into a pivot table.
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Global component of the OOI includes arrays at critical, yet under-sampled, high-latitude locations such as within the Irminger Sea in the North Atlantic. The Global Irminger Sea Array includes two types of gliders that provide simultaneous spatial and temporal sampling capabilities. Open-Ocean Gliders follow track lines around the triangular mooring array and are equipped with acoustic modems to relay data from the Flanking Moorings to shore via satellite telemetry. Profiling Gliders sample the upper water column near the Apex Profiler Mooring.
https://creativecommons.org/publicdomain/zero/1.0/
FiveThirtyEight (https://fivethirtyeight.com/) is a great tool to track NCAA March Madness. Since 2016 it has shared its own team rankings. This dataset contains all team rankings available for the women's NCAA tournament.
Data taken from:
https://projects.fivethirtyeight.com/march-madness-api/2016/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2017/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2018/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2019/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2021/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2022/fivethirtyeight_ncaa_forecasts.csv
https://projects.fivethirtyeight.com/march-madness-api/2023/fivethirtyeight_ncaa_forecasts.csv
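
The source URLs listed above share one pattern and differ only by year, so the list can be rebuilt programmatically. The sketch below only constructs the URL strings and does not download anything. The names `BASE`, `YEARS`, and `forecast_urls` are illustrative, and whether each URL is still reachable is not something this description confirms.

```python
BASE = "https://projects.fivethirtyeight.com/march-madness-api"

# Years taken from the listed sources; 2020 is not among them.
YEARS = [2016, 2017, 2018, 2019, 2021, 2022, 2023]

forecast_urls = [
    f"{BASE}/{year}/fivethirtyeight_ncaa_forecasts.csv" for year in YEARS
]

for url in forecast_urls:
    print(url)
```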
This dataset provides information about the number of properties, residents, and average property values for 538 Road cross streets in Tahlequah, OK.
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains data behind the story Obama Granted Clemency Unlike Any Other President In History.
The data in obama_commutations.csv is copied from the Justice Department website. The Python script parses it by looking at the first column to figure out what is contained in the second column.
Source: Department of Justice
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
This dataset provides information about the number of properties, residents, and average property values for Scr 538 cross streets in Morton, MS.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Global component of the OOI includes arrays at critical, yet under-sampled, locations such as within the Argentine Basin in the South Atlantic Ocean. The Global Argentine Basin Array includes two types of gliders that provide simultaneous spatial and temporal sampling capabilities. Open-Ocean Gliders follow track lines around the triangular mooring array and are equipped with acoustic modems to relay data from the Flanking Moorings to shore via satellite telemetry. Profiling Gliders sample the upper water column near the Apex Profiler Mooring.
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains the data behind the story Trump Might Be The First President To Scrap A National Monument.
This data was compiled by the National Parks Conservation Association and includes national monuments that were created by presidents under the Antiquities Act. It does not include national monuments created by Congress.
Header | Definition |
---|---|
current_name | Current name of piece of land designated under the Antiquities Act |
states | State(s) or territory where land is located |
original_name | If included, original name of piece of land designated under the Antiquities Act |
current_agency | Current land management agency. NPS = National Park Service, BLM = Bureau of Land Management, USFS = US Forest Service, FWS = US Fish and Wildlife Service, NOAA = National Oceanic and Atmospheric Administration |
action | Type of action taken on land |
date | Date of action |
year | Year of action |
pres_or_congress | President or congress that issued action |
acres_affected | Acres affected by action. Note that total current acreage is not included. National monuments that cover ocean are listed in square miles. |
Sources: National Parks Conservation Association and National Park Service Archeology Program
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
Cover photo by Nick Tiemeyer on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
This dataset provides information about the number of properties, residents, and average property values for County Road 538 cross streets in Ripley, MS.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brain metastases are a frequent occurrence in neuropathology practices. The literature on their neuroanatomical location is frequently derived from radiological analyses. This work examines brain metastases through the lens of pathology specimens. All brain surgical pathology reports for cases accessioned 2011–2020 were retrieved from a laboratory. Specimens were classified by neuroanatomical location, diagnosis and diagnostic category with a hierarchical free text string-matching algorithm (HFTSMA) and also subsequently audited. All reports classified as probable metastasis were reviewed by a pathologist. The provided history was compared to the final categorization by a pathologist. The cohort had 4,625 cases. The HFTSMA identified 854 cases (including metastases from a definite primary, metastases from primary not known and improperly classified cases). 514/854 cases had one definite primary site per algorithm and on report review 538/854 cases were confirmed as such. The 538 cases originated from 511 patients. Primaries from breast, gynecologic tract, and gastrointestinal tract not otherwise specified were most frequently found in the cerebellum. Kidney metastases were most frequently found in the occipital lobe. Lung, metastatic melanoma and colorectal primaries were most commonly found in the frontal lobe. The provided clinical history predicted the primary in 206 cases (40.3%), was discordant in 17 cases (3.3%) and non-contributory in 280 cases (54.8%). The observed distribution of the metastatic tumours in the brain is dependent on the primary site. In the majority (54.8%) of cases, the provided clinical history was non-contributory; this suggests surgeon-pathologist communication may have the potential for optimization.
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains data behind the story Most Police Don’t Live In The Cities They Serve.
Includes the cities with the 75 largest police forces, with the exception of Honolulu for which data is not available. All calculations are based on data from the U.S. Census.
The Census Bureau numbers are potentially going to differ from other counts for three reasons:
How to read police-locals.csv
Header | Definition |
---|---|
city | U.S. city |
police_force_size | Number of police officers serving that city |
all | Percentage of the total police force that lives in the city |
white | Percentage of white (non-Hispanic) police officers who live in the city |
non-white | Percentage of non-white police officers who live in the city |
black | Percentage of black police officers who live in the city |
hispanic | Percentage of Hispanic police officers who live in the city |
asian | Percentage of Asian police officers who live in the city |
Note: When a cell contains "**", it means that there are fewer than 100 police officers of that race serving that city.
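
Because suppressed cells hold the marker "**" instead of a number, a reader of police-locals.csv needs to treat that marker as missing before doing arithmetic. The following is a minimal sketch using only the Python standard library. The sample rows are invented for illustration and are not values from the file, and the numeric scale used in the sample (fractions rather than whole percentages) is an assumption.

```python
import csv
import io

# Made-up rows in the column layout described above; not real data.
SAMPLE = """city,police_force_size,all,white,non-white,black,hispanic,asian
Example City,1000,0.5,0.4,0.6,0.7,0.55,**
Sample Town,500,0.25,0.2,0.3,**,**,**
"""

PERCENT_COLUMNS = ["all", "white", "non-white", "black", "hispanic", "asian"]


def parse_rows(handle):
    """Yield one dict per city, mapping the "**" marker to None."""
    for row in csv.DictReader(handle):
        parsed = {
            "city": row["city"],
            "police_force_size": int(row["police_force_size"]),
        }
        for column in PERCENT_COLUMNS:
            value = row[column].strip()
            # "**" means fewer than 100 officers of that race serve the city.
            parsed[column] = None if value == "**" else float(value)
        yield parsed


rows = list(parse_rows(io.StringIO(SAMPLE)))
for row in rows:
    print(row["city"], row["all"], row["asian"])
```

To read the real file, the `io.StringIO(SAMPLE)` argument could be swapped for `open("police-locals.csv", newline="")`, assuming the file's header row matches the table above.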
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
This dataset provides information on 538 in Spain as of May, 2025. It includes details such as email addresses (where publicly available), phone numbers (where publicly available), and geocoded addresses. Explore market trends, identify potential business partners, and gain valuable insights into the industry. Download a complimentary sample of 10 records to see what's included.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically