25 datasets found

Data from: Crowd Counting Dataset
kaggle.com
Updated Feb 16, 2024
Cite
Training Data (2024). Crowd Counting Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/crowd-counting-dataset/discussion
Dataset updated
Feb 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Crowd Counting Dataset

The dataset contains images of crowds ranging from 0 to 5000 individuals, captured in a diverse range of scenes and settings. Each image is accompanied by a JSON file with labeling information for each person in the crowd, for use in crowd counting and classification.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F4b51a212e59f575bd6978f215a32aca0%2FFrame%2064.png?generation=1701336719197861&alt=media

Types of crowds in the dataset: 0-1000, 1000-2000, 2000-3000, 3000-4000 and 4000-5000

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F72e0fed3ad13826d6545ff75a79ed9db%2FFrame%2065.png?generation=1701337622225724&alt=media

This dataset provides a valuable resource for researchers and developers working on crowd counting technology, enabling them to train and evaluate their algorithms with a wide range of crowd sizes and scenarios. It can also be used for benchmarking and comparison of different crowd counting algorithms, as well as for real-world applications such as public safety and security, urban planning, and retail analytics.

The full version of the dataset includes 647 labeled images of crowds. Leave a request on TrainingData to buy the dataset.

Statistics for the dataset (number of images by the crowd's size and image width):

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F2e9f36820e62a2ef62586fc8e84387e2%2FFrame%2063.png?generation=1701336725293625&alt=media

OTHER BIOMETRIC DATASETS:

Anti Spoofing Real Dataset

Antispoofing Replay Dataset

Selfies, ID Images dataset (5591 sets of 15 files)

Selfies and video dataset (4 052 sets)

Dataset of bald people, 5000 images

Get the Dataset

This is just an example of the data

Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

Content

images - includes original images of crowds, placed in subfolders according to crowd size,

labels - includes JSON files with labeling and visualised labeling for the images in the previous folder,

csv file - includes information for each image in the dataset

File with the extension .csv

id: id of the image,

image: link to access the original image,

label: link to access the json-file with labeling,

type: type of the crowd on the photo
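
The four columns listed above are enough to tally how many images fall into each crowd type. The following is a minimal sketch, assuming the CSV header uses exactly the names id, image, label and type; the sample rows, paths and the helper name count_by_type are hypothetical and only stand in for the real file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows: the real file name, links and values
# are not given in the listing.
SAMPLE_CSV = """id,image,label,type
0,images/0-1000/0.jpg,labels/0.json,0-1000
1,images/0-1000/1.jpg,labels/1.json,0-1000
2,images/1000-2000/2.jpg,labels/2.json,1000-2000
"""


def count_by_type(csv_text):
    """Return a Counter mapping each crowd type to its number of images."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["type"] for row in reader)


counts = count_by_type(SAMPLE_CSV)
for crowd_type, n in sorted(counts.items()):
    print(crowd_type, n)
```

To run this against the real file, the same function could be given the text of the downloaded CSV instead of SAMPLE_CSV.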

TrainingData provides high-quality data annotation tailored to your needs

keywords: crowd counting, crowd density estimation, people counting, crowd analysis, image annotation, computer vision, deep learning, object detection, object counting, image classification, dense regression, crowd behavior analysis, crowd tracking, head detection, crowd segmentation, crowd motion analysis, image processing, machine learning, artificial intelligence, ai, human detection, crowd sensing, image dataset, public safety, crowd management, urban planning, event planning, traffic management
NYC Open Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
NYC Open Data (2019). NYC Open Data [Dataset]. https://www.kaggle.com/datasets/nycopendata/new-york
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
NYC Open Data
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

NYC Open Data is an opportunity to engage New Yorkers in the information that is produced and used by City government. We believe that every New Yorker can benefit from Open Data, and Open Data can benefit from every New Yorker. Source: https://opendata.cityofnewyork.us/overview/

Content

Thanks to NYC Open Data, which makes public data generated by city agencies available for public use, and Citi Bike, we've incorporated over 150 GB of data in 5 open datasets into Google BigQuery Public Datasets, including:

Over 8 million 311 service requests from 2012-2016

More than 1 million motor vehicle collisions 2012-present

Citi Bike stations and 30 million Citi Bike trips 2013-present

Over 1 billion Yellow and Green Taxi rides from 2009-present

Over 500,000 sidewalk trees surveyed decennially in 1995, 2005, and 2015

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

https://opendata.cityofnewyork.us/

https://cloud.google.com/blog/big-data/2017/01/new-york-city-public-datasets-now-available-on-google-bigquery

This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://data.cityofnewyork.us/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

By accessing datasets and feeds available through NYC Open Data, the user agrees to all of the Terms of Use of NYC.gov as well as the Privacy Policy for NYC.gov. The user also agrees to any additional terms of use defined by the agencies, bureaus, and offices providing data. Public data sets made available on NYC Open Data are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on NYC Open Data, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein.

The City is not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set, or application utilizing such data set, provided by any third party.

Banner Photo by @bicadmedia from Unsplash.

Inspiration

On which New York City streets are you most likely to find a loud party?

Can you find the Virginia Pines in New York City?

Where was the only collision caused by an animal that injured a cyclist?

What’s the Citi Bike record for the Longest Distance in the Shortest Time (on a route with at least 100 rides)?

https://cloud.google.com/blog/big-data/2017/01/images/148467900588042/nyc-dataset-6.png
7+ Million Company Dataset
kaggle.com
zip
Updated May 10, 2019
Cite
People Data Labs (2019). 7+ Million Company Dataset [Dataset]. https://www.kaggle.com/datasets/peopledatalabssf/free-7-million-company-dataset
Available download formats: zip (291957415 bytes)
Dataset updated
May 10, 2019
Authors
People Data Labs
Description
Dataset

This dataset was created by People Data Labs

Contents
World Bank: Education Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
World Bank (2019). World Bank: Education Data [Dataset]. https://www.kaggle.com/datasets/theworldbank/world-bank-intl-education
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
World Bank (http://worldbank.org/)
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank

Content

This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.

For more information, see the World Bank website.

Fork this kernel to get started with this dataset.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population

http://data.worldbank.org/data-catalog/ed-stats

https://cloud.google.com/bigquery/public-data/world-bank-education

Citation: The World Bank: Education Statistics

Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @till_indeman from Unsplash.

Inspiration

Of total government spending, what percentage is spent on education?
Learning Resources Database
kaggle.com
datadiscovery.nlm.nih.gov
+3more
Updated Nov 5, 2023
Cite
Prasad Patil (2023). Learning Resources Database [Dataset]. https://www.kaggle.com/datasets/prasad22/learning-resources-database
Dataset updated
Nov 5, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Prasad Patil
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Learning Resources Database is a catalog of interactive tutorials, videos, online classes, finding aids, and other instructional resources on National Library of Medicine (NLM) products and services. Resources may be available for immediate use via a browser or downloadable for use in course management systems.

Dataset Description

It contains 520 rows and 13 variables, as listed below.

Resource ID : Alphanumeric identifier

Resource Name : Title of the resource

Resource URL : Link to the resource

Description : Brief explanation of the resource

Archived : Flagged as False for all data points

Format : Format of the resource, e.g. HTML, PDF, MP4 video, MS Word, PowerPoint

Type : Type of the resource, e.g. webinar, document, tutorial, slides

Runtime : Runtime of the resource

Subject Areas : Topics covered in the resource

Authoring Organization : Name of the authoring organization

Intended Audiences : Profile of the intended audience

Record Modified : Timestamp of the last modification of the record

Resource Revised : Timestamp of the last revision of the resource
2021-2022 Football Player Stats
kaggle.com
Updated May 29, 2022
Cite
Vivo Vinco (2022). 2021-2022 Football Player Stats [Dataset]. https://www.kaggle.com/datasets/vivovinco/20212022-football-player-stats
Dataset updated
May 29, 2022
Dataset provided by
Kaggle
Authors
Vivo Vinco
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

This dataset contains 2021-2022 football player stats per 90 minutes. Only players of Premier League, Ligue 1, Bundesliga, Serie A and La Liga are listed.

2022-2023 Football Player Stats

2021-2022 Football Team Stats

2022-2023 Football Team Stats

Content

More than 2,500 rows and 143 columns. Column descriptions are listed below.

Rk : Rank

Player : Player's name

Nation : Player's nation

Pos : Position

Squad : Squad’s name

Comp : League in which the squad plays

Age : Player's age

Born : Year of birth

MP : Matches played

Starts : Matches started

Min : Minutes played

90s : Minutes played divided by 90

Goals : Goals scored or allowed

Shots : Shots total (Does not include penalty kicks)

SoT : Shots on target (Does not include penalty kicks)

SoT% : Shots on target percentage (Does not include penalty kicks)

G/Sh : Goals per shot

G/SoT : Goals per shot on target (Does not include penalty kicks)

ShoDist : Average distance, in yards, from goal of all shots taken (Does not include penalty kicks)

ShoFK : Shots from free kicks

ShoPK : Penalty kicks made

PKatt : Penalty kicks attempted

PasTotCmp : Passes completed

PasTotAtt : Passes attempted

PasTotCmp% : Pass completion percentage

PasTotDist : Total distance, in yards, that completed passes have traveled in any direction

PasTotPrgDist : Total distance, in yards, that completed passes have traveled towards the opponent's goal

PasShoCmp : Passes completed (Passes between 5 and 15 yards)

PasShoAtt : Passes attempted (Passes between 5 and 15 yards)

PasShoCmp% : Pass completion percentage (Passes between 5 and 15 yards)

PasMedCmp : Passes completed (Passes between 15 and 30 yards)

PasMedAtt : Passes attempted (Passes between 15 and 30 yards)

PasMedCmp% : Pass completion percentage (Passes between 15 and 30 yards)

PasLonCmp : Passes completed (Passes longer than 30 yards)

PasLonAtt : Passes attempted (Passes longer than 30 yards)

PasLonCmp% : Pass completion percentage (Passes longer than 30 yards)

Assists : Assists

PasAss : Passes that directly lead to a shot (assisted shots)

Pas3rd : Completed passes that enter the 1/3 of the pitch closest to the goal

PPA : Completed passes into the 18-yard box

CrsPA : Completed crosses into the 18-yard box

PasProg : Completed passes that move the ball towards the opponent's goal at least 10 yards from its furthest point in the last six passes, or any completed pass into the penalty area

PasAtt : Passes attempted

PasLive : Live-ball passes

PasDead : Dead-ball passes

PasFK : Passes attempted from free kicks

TB : Completed pass sent between back defenders into open space

PasPress : Passes made while under pressure from opponent

Sw : Passes that travel more than 40 yards of the width of the pitch

PasCrs : Crosses

CK : Corner kicks

CkIn : Inswinging corner kicks

CkOut : Outswinging corner kicks

CkStr : Straight corner kicks

PasGround : Ground passes

PasLow : Passes that leave the ground, but stay below shoulder-level

PasHigh : Passes that are above shoulder-level at the peak height

PaswLeft : Passes attempted using left foot

PaswRight : Passes attempted using right foot

PaswHead : Passes attempted using head

TI : Throw-Ins taken

PaswOther : Passes attempted using body parts other than the player's head or feet

PasCmp : Passes completed

PasOff : Offsides

PasOut : Out of bounds

PasInt : Intercepted

PasBlocks : Blocked by the opponent who was standing in the path

SCA : Shot-creating actions

ScaPassLive : Completed live-ball passes that lead to a shot attempt

ScaPassDead : Completed dead-ball passes that lead to a shot attempt

ScaDrib : Successful dribbles that lead to a shot attempt

ScaSh : Shots that lead to another shot attempt

ScaFld : Fouls drawn that lead to a shot attempt

ScaDef : Defensive actions that lead to a shot attempt

GCA : Goal-creating actions

GcaPassLive : Completed live-ball passes that lead to a goal

GcaPassDead : Completed dead-ball passes that lead to a goal

GcaDrib : Successful dribbles that lead to a goal

GcaSh : Shots that lead to another goal-scoring shot

GcaFld : Fouls drawn that lead to a goal

GcaDef : Defensive actions that lead to a goal

Tkl : Number of players tackled

TklWon : Tackles in which the tackler's team won possession of the ball

TklDef3rd : Tackles in defensive 1/3

TklMid3rd : Tackles in middle 1/3

TklAtt3rd : Tackles in attacking 1/3

TklDri : Number of dribblers tackled

TklDriAtt : Number of times dribbled past plus number of tackles

TklDri% : Percentage of dribblers tackled

TklDriPast : Number of times dribbled past by an opposing player

Press : Number of times applying pressure to opposing player who is receiving, carrying or releasing the ball

PresSucc : Number of times the squad gained possession within five seconds of applying pressure

Press% : Percentage of time the squad gained possession within five seconds of applying pressure

PresDef3rd : Number of times applying pressure to opposing player who is receiving, carrying or releasing the ball, in the defensive 1/3

PresMid3rd : Number of times applying pressure to opposing player who is receiving, carrying or releasing the ball, in the middle 1/3

PresAtt3rd : Number of times applying pressure to opposing player who is receiving, carrying or releasing the ball, in the attacking 1/3

Blocks : Number of times blocking the ball by standing in its path

BlkSh : Number of times blocking a shot by standing in its path

BlkShSv : Number of times blocking a shot that was on target, by standing in its path

BlkPass : Number of times blocking a pass by standing in its path

Int : Interceptions

Tkl+Int : Number of players tackled plus number of interceptions

Clr : Clearances

Err : Mistakes leading to an opponent's shot

Touches : Number of times a player touched the ball. Note: Receiving a pass, then dribbling, then sending a pass counts as one touch

TouDefPen : Touches in defensive penalty area

TouDef3rd : Touches in defensive 1/3

TouMid3rd : Touches in middle 1/3

TouAtt3rd : Touches in attacking 1/3

TouAttPen : Touches in attacking penalty area

TouLive : Live-ball touches. Does not include corner kicks, free kicks, throw-ins, kick-offs, goal kicks or penalty kicks.

DriSucc : Dribbles completed successfully

DriAtt : Dribbles attempted

DriSucc% : Percentage of dribbles completed successfully

DriPast : Number of players dribbled past

DriMegs : Number of times a player dribbled the ball through an opposing player's legs

Carries : Number of times the player controlled the ball with their feet

CarTotDist : Total distance, in yards, a player moved the ball while controlling it with their feet, in any direction

CarPrgDist : Total distance, in yards, a player moved the ball while controlling it with their feet towards the opponent's goal

CarProg : Carries that move the ball towards the opponent's goal at least 5 yards, or any carry into the penalty area

Car3rd : Carries that enter the 1/3 of the pitch closest to the goal

CPA : Carries into the 18-yard box

CarMis : Number of times a player failed when attempting to gain control of a ball

CarDis : Number of times a player loses control of the ball after being tackled by an opposing player

RecTarg : Number of times a player was the target of an attempted pass

Rec : Number of times a player successfully received a pass

Rec% : Percentage of time a player successfully received a pass

RecProg : Completed passes that move the ball towards the opponent's goal at least 10 yards from its furthest point in the last six passes, or any completed pass into the penalty area

CrdY : Yellow cards

CrdR : Red cards

2CrdY : Second yellow card

Fls : Fouls committed

Fld : Fouls drawn

Off : Offsides

Crs : Crosses

TklW : Tackles in which the tackler's team won possession of the ball

PKwon : Penalty kicks won

PKcon : Penalty kicks conceded

OG : Own goals

Recov : Number of loose balls recovered

AerWon : Aerials won

AerLost : Aerials lost

AerWon% : Percentage of aerials won
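
Several columns in the list above are ratios of other columns, so they can be recomputed as a consistency check. The following is a minimal sketch under stated assumptions: 90s is Min divided by 90 and TklDri% uses TklDriAtt as its denominator, as the descriptions say, while the denominators for SoT% (Shots) and AerWon% (AerWon plus AerLost) are assumptions read off the column names. The row values are invented for illustration and the helper names are hypothetical.

```python
def pct(part, whole):
    """Return part as a percentage of whole, or None when whole is zero."""
    return 100.0 * part / whole if whole else None


def derived_stats(row):
    """Recompute ratio columns from the raw counts in one player row."""
    return {
        "90s": row["Min"] / 90,
        "SoT%": pct(row["SoT"], row["Shots"]),
        "TklDri%": pct(row["TklDri"], row["TklDriAtt"]),
        "AerWon%": pct(row["AerWon"], row["AerWon"] + row["AerLost"]),
    }


# Invented values, not taken from the dataset.
example_row = {
    "Min": 1800, "Shots": 40, "SoT": 16,
    "TklDri": 6, "TklDriAtt": 15,
    "AerWon": 30, "AerLost": 20,
}
stats = derived_stats(example_row)
print(stats)
```

Returning None for a zero denominator keeps players with no attempts from raising a division error.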

Acknowledgements

Data from Football Reference. Image from UEFA Champions League.

If you're reading this, please upvote.
Social media Youth dataset
kaggle.com
zip
Updated Jul 16, 2021
Cite
Srijan Sharma (2021). Social media Youth dataset [Dataset]. https://www.kaggle.com/datasets/fitsri/social-media-youth-dataset
Available download formats: zip (11210 bytes)
Dataset updated
Jul 16, 2021
Authors
Srijan Sharma
Description
Dataset

This dataset was created by Srijan Sharma

Contents
Caucasian People - Liveness Detection Dataset
kaggle.com
Updated Apr 16, 2024
Cite
Training Data (2024). Caucasian People - Liveness Detection Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/caucasian-people-liveness-detection-dataset
Dataset updated
Apr 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Biometric Attack Dataset, Caucasian People

A similar dataset that includes all ethnicities: Anti Spoofing Real Dataset

The dataset for face anti-spoofing and face recognition includes images and videos of Caucasian people. It helps enhance model performance by providing a wider range of data for a specific ethnic group.

The videos were gathered by capturing faces of genuine individuals presenting spoofs, using facial presentations. Our dataset proposes a novel approach that learns and detects spoofing techniques, extracting features from the genuine facial images to prevent the capturing of such information by fake users.

The dataset contains images and videos of real humans with various resolutions, views, and colors, making it a comprehensive resource for researchers working on anti-spoofing technologies.

People in the dataset

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F09524087833ccb985350545376670f7d%2FFrame%20102.png?generation=1712318079960855&alt=media

Types of files in the dataset:

photo - selfie of the person

video - real video of the person

Our dataset also explores the use of neural architectures, such as deep neural networks, to facilitate the identification of distinguishing patterns and textures in different regions of the face, increasing the accuracy and generalizability of the anti-spoofing models.

💴 For Commercial Usage: The full version of the dataset includes 19,000 files. Leave a request on TrainingData to buy the dataset.

Metadata for the full dataset:

assignment_id - unique identifier of the media file

worker_id - unique identifier of the person

age - age of the person

true_gender - gender of the person

country - country of the person

ethnicity - ethnicity of the person

video_extension - video extensions in the dataset

video_resolution - video resolution in the dataset

video_duration - video duration in the dataset

video_fps - frames per second for video in the dataset

photo_extension - photo extensions in the dataset

photo_resolution - photo resolution in the dataset

Statistics for the dataset

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F0b17f6b68aea01fda89c4608db97a94f%2FFrame%20101.png?generation=1712314613427348&alt=media

💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

Content

The dataset consists of:

files - includes 10 folders, one per person, each containing 1 image and 1 video,

.csv file - contains information about the files and people in the dataset

File with the extension .csv

id: id of the person,

selfie_link: link to access the photo,

video_link: link to access the video,

age: age of the person,

country: country of the person,

gender: gender of the person,

video_extension: video extension,

video_resolution: video resolution,

video_duration: video duration,

video_fps: frames per second for video,

photo_extension: photo extension,

photo_resolution: photo resolution

TrainingData provides high-quality data annotation tailored to your needs

keywords: liveness detection systems, liveness detection dataset, biometric dataset, biometric data dataset, biometric system attacks, anti-spoofing dataset, face liveness detection, deep learning dataset, face spoofing database, face anti-spoofing, ibeta dataset, face anti spoofing, large-scale face anti spoofing, rich annotations anti spoofing dataset
Australian Cricket Players First Class Stats
kaggle.com
zip
Updated Aug 31, 2021
Cite
Zee Fuzooly (2021). Australian Cricket Players First Class Stats [Dataset]. https://www.kaggle.com/datasets/zeefuzooly/australian-cricket-players-first-class-stats/data
Available download formats: zip (2333408 bytes)
Dataset updated
Aug 31, 2021
Authors
Zee Fuzooly
Description
Dataset

This dataset was created by Zee Fuzooly

Contents
Financial Well-Being Survey Data
kaggle.com
Updated Mar 18, 2018
Cite
AnthonyKu (2018). Financial Well-Being Survey Data [Dataset]. https://www.kaggle.com/datasets/anthonyku1031/nfwbs-puf-2016-data
Dataset updated
Mar 18, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
AnthonyKu
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Understanding factors that support consumer financial well-being can help practitioners and policymakers empower more families to lead better financial lives to serve their own goals.

A person’s financial well-being comes from their sense of financial security and freedom of choice—both in the present and when considering the future. We measured it using our 10-item Financial Well-Being Scale.

The survey dataset includes respondents’ scores on that scale, as well as measures of individual and household characteristics that research suggests may influence adults’ financial well-being.

Content

Variables in this dataset relate to questions on income and employment, savings and safety nets, past financial experiences, and financial behaviors, skills, and attitudes.

For reference on specific fields, a codebook is available online here.

Acknowledgements

This survey was originally conducted by the US Consumer Financial Protection Bureau and published online in October 2017 here.
Students DATA UCLA
kaggle.com
zip
Updated Dec 18, 2021
Cite
Bhavin Moriya (2021). Students DATA UCLA [Dataset]. https://www.kaggle.com/bhavinmoriya/students-data-ucla
Available download formats: zip (533877 bytes)
Dataset updated
Dec 18, 2021
Authors
Bhavin Moriya
Description
Dataset

This dataset was created by Bhavin Moriya

Contents
Popular Baby Names
kaggle.com
data.cityofnewyork.us
+4more
Updated May 5, 2023
Cite
Utkarsh Singh (2023). Popular Baby Names [Dataset]. https://www.kaggle.com/datasets/utkarshx27/popular-baby-names/data
Dataset updated
May 5, 2023
Dataset provided by
Kaggle
Authors
Utkarsh Singh
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Popular Baby Names by Sex and Ethnic Group

Data were collected through civil birth registration. Each record represents the ranking of a baby name in order of frequency. The data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.
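
The caution about frequency counts close to 10 can be applied as a simple filter before ranks are compared. The following is a minimal sketch, assuming each record carries a name, a count and a rank; the field names, the margin of 2 and the sample records are all hypothetical, since the description does not define how close "close to 10" is.

```python
CAUTION_COUNT = 10  # threshold named in the dataset description
MARGIN = 2          # hypothetical: how close to 10 counts as "close"


def is_unstable(count, margin=MARGIN):
    """Return True when a frequency count is within margin of the caution threshold."""
    return abs(count - CAUTION_COUNT) <= margin


# Invented records for illustration only.
records = [
    {"name": "Name A", "count": 172, "rank": 1},
    {"name": "Name B", "count": 11, "rank": 85},
    {"name": "Name C", "count": 10, "rank": 86},
]

flagged = [r["name"] for r in records if is_unstable(r["count"])]
print(flagged)
```

Flagged names could then be excluded from year-to-year rank comparisons or reported with a note.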
Data from: Global Superstore
kaggle.com
zip
Updated Jul 16, 2020
Cite
Chandra Shekhar (2020). Global Superstore [Dataset]. https://www.kaggle.com/datasets/shekpaul/global-superstore
Available download formats: zip (5985038 bytes)
Dataset updated
Jul 16, 2020
Authors
Chandra Shekhar
Description
Dataset

This dataset was created by Chandra Shekhar

Released under Other (specified in description)

Contents
Identifying Cell Nuclei from Histology Images
kaggle.com
zip
Updated Jul 16, 2019
Cite
Sandhaya (2019). Identifying Cell Nuclei from Histology Images [Dataset]. https://www.kaggle.com/datasets/sandhaya4u/histology-image-dataset
Available download formats: zip (0 bytes)
Dataset updated
Jul 16, 2019
Authors
Sandhaya
Description
Machine Learning Model for Identifying Cell Nuclei from Histology Images

A machine learning model for identifying cell nuclei from histology images, with the ability to generalize across a variety of lighting conditions, cell types, magnifications, and imaging modalities. Imagine speeding up research for almost every disease, from lung cancer and heart disease to rare disorders. The Data Science Bowl offers data scientists and practitioners an ambitious mission: create an algorithm that automates nucleus detection and detects all non-overlapping nuclei in the given test data, that is, an algorithm capable of instance segmentation. We’ve all seen people suffer from diseases like cancer, heart disease, chronic obstructive pulmonary disease, Alzheimer’s, and diabetes. Many have seen their loved ones pass away. Think how many lives would be transformed if cures came faster. By automating nucleus detection, you could help unlock cures faster, from rare disorders to the common cold.

Why nuclei?

Identifying the cells’ nuclei is the starting point for most analyses because most of the human body’s 30 trillion cells contain a nucleus full of DNA, the genetic code that programs each cell. Identifying nuclei allows researchers to identify each individual cell in a sample, and by measuring how cells react to various treatments, the researcher can understand the underlying biological processes at work. By participating, teams will work to automate the process of identifying nuclei, which will allow for more efficient drug testing, shortening the 10 years it takes for each new drug to come to market.

Acknowledgements

The success and final outcome of this project required a lot of guidance and assistance from many people, and I am extremely privileged to have received it throughout the project. All that I have done is due to such supervision and assistance, and I would not forget to thank them. I owe my deep gratitude to our project guide at C-DAC Noida, who took a keen interest in my project work and guided me until its completion by providing all the information necessary for developing a good system.

Inspiration

The Data Science Bowl, presented by Booz Allen and Kaggle, is the world’s premier data science for social good competition. The Data Science Bowl brings together data scientists, technologists, domain experts, and organizations to take on the world’s challenges with data and technology. It’s a platform through which people can harness their passion, unleash their curiosity, and amplify their impact to effect change on a global scale
Linear Regression E-commerce Dataset
kaggle.com
zip
Updated Sep 16, 2019
Cite
Saurabh Kolawale (2019). Linear Regression E-commerce Dataset [Dataset]. https://www.kaggle.com/kolawale/focusing-on-mobile-app-or-website
Available download formats: zip (44169 bytes)
Dataset updated
Sep 16, 2019
Authors
Saurabh Kolawale
Description
This dataset contains data on customers who buy clothes online. The store offers in-store style and clothing advice sessions. Customers come in to the store, have sessions/meetings with a personal stylist, then go home and order the clothes they want on either a mobile app or a website.

The company is trying to decide whether to focus their efforts on their mobile app experience or their website.
2023-2024 NBA Player Stats
kaggle.com
Updated Aug 2, 2024
Cite
Vivo Vinco (2024). 2023-2024 NBA Player Stats [Dataset]. https://www.kaggle.com/datasets/vivovinco/2023-2024-nba-player-stats
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 2, 2024
Dataset provided by
Kaggle
Authors
Vivo Vinco
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

This dataset contains 2023-2024 regular season NBA player stats per game. Note that there are duplicate player names resulting from team changes.

2021-2022 NBA Player Stats

2022-2023 NBA Player Stats

Content

500+ rows and 30 columns. Column descriptions are listed below.

Rk : Rank

Player : Player's name

Pos : Position

Age : Player's age

Tm : Team

G : Games played

GS : Games started

MP : Minutes played per game

FG : Field goals per game

FGA : Field goal attempts per game

FG% : Field goal percentage

3P : 3-point field goals per game

3PA : 3-point field goal attempts per game

3P% : 3-point field goal percentage

2P : 2-point field goals per game

2PA : 2-point field goal attempts per game

2P% : 2-point field goal percentage

eFG% : Effective field goal percentage

FT : Free throws per game

FTA : Free throw attempts per game

FT% : Free throw percentage

ORB : Offensive rebounds per game

DRB : Defensive rebounds per game

TRB : Total rebounds per game

AST : Assists per game

STL : Steals per game

BLK : Blocks per game

TOV : Turnovers per game

PF : Personal fouls per game

PTS : Points per game
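
Several of the columns above are derived from others. As a minimal sketch (not part of the dataset itself), the standard effective field goal percentage formula, eFG% = (FG + 0.5 * 3P) / FGA, and the split of field goals into 2-point and 3-point parts can be checked like this. The function names and the sample stat line are hypothetical, chosen only for illustration.

```python
def effective_fg_pct(fg: float, three_p: float, fga: float) -> float:
    """Effective field goal percentage: (FG + 0.5 * 3P) / FGA.

    Returns 0.0 when there are no attempts, to avoid dividing by zero.
    """
    if fga == 0:
        return 0.0
    return (fg + 0.5 * three_p) / fga


def two_point_split(fg: float, fga: float, three_p: float, three_pa: float):
    """Derive 2P and 2PA from the totals, since FG = 2P + 3P and FGA = 2PA + 3PA."""
    return fg - three_p, fga - three_pa


# Hypothetical per-game line: 8 FG on 16 FGA, of which 2 of 5 are 3-pointers.
efg = effective_fg_pct(fg=8.0, three_p=2.0, fga=16.0)
two_p, two_pa = two_point_split(fg=8.0, fga=16.0, three_p=2.0, three_pa=5.0)
print(efg, two_p, two_pa)
```

Because the published columns are per-game averages that may be rounded, values recomputed this way could differ slightly from the eFG% column.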

Acknowledgements

Data from Basketball Reference. Image from NBA.

If you're reading this, please upvote.
factors may affect wages
kaggle.com
zip
Updated May 1, 2020
Cite
yangsichen (2020). factors may affect wages [Dataset]. https://www.kaggle.com/datasets/yangsichen/factors-may-affect-wages/code
Explore at:
Available download formats: zip (12006 bytes)
Dataset updated
May 1, 2020
Authors
yangsichen
Description
Dataset

This dataset was created by yangsichen

Contents
RxNorm Data
kaggle.com
bioregistry.io
zip
Updated Mar 20, 2019
Cite
National Library of Medicine (2019). RxNorm Data [Dataset]. https://www.kaggle.com/datasets/nlm-nih/nlm-rxnorm
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
National Library of Medicine
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

RxNorm is a US-specific medical terminology that contains all medications available on the US market. Source: https://en.wikipedia.org/wiki/RxNorm

RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, Gold Standard Drug Database, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary. Source: https://www.nlm.nih.gov/research/umls/rxnorm/

Content

RxNorm was created by the U.S. National Library of Medicine (NLM) to provide a normalized naming system for clinical drugs, defined as the combination of {ingredient + strength + dose form}. In addition to the naming system, the RxNorm dataset also provides structured information such as brand names, ingredients, drug classes, and so on, for each clinical drug. Typical uses of RxNorm include navigating between names and codes among different drug vocabularies and using information in RxNorm to assist with health information exchange/medication reconciliation, e-prescribing, drug analytics, formulary development, and other functions.

This public dataset includes multiple data files originally released in RxNorm Rich Release Format (RXNRRF) that are loaded into BigQuery tables. The data is updated and archived on a monthly basis.

The following tables are included in the RxNorm dataset:

RXNCONSO contains concept and source information

RXNREL contains information regarding relationships between entities

RXNSAT contains attribute information

RXNSTY contains semantic information

RXNSAB contains source info

RXNCUI contains retired rxcui codes

RXNATOMARCHIVE contains archived data

RXNCUICHANGES contains concept changes

Update Frequency: Monthly

Fork this kernel to get started with this dataset.

Acknowledgements

https://www.nlm.nih.gov/research/umls/rxnorm/

https://bigquery.cloud.google.com/dataset/bigquery-public-data:nlm_rxnorm

https://cloud.google.com/bigquery/public-data/rxnorm

Dataset Source: Unified Medical Language System RxNorm. The dataset is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset. This dataset uses publicly available data from the U.S. National Library of Medicine (NLM), National Institutes of Health, Department of Health and Human Services; NLM is not responsible for the dataset, does not endorse or recommend this or any other dataset.

Banner Photo by @freestocks from Unsplash.

Inspiration

What are the RXCUI codes for the ingredients of a list of drugs?

Which ingredients have the most variety of dose forms?

In what dose forms is the drug phenylephrine found?

What are the ingredients of the drug labeled with the generic code number 072718?
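
A question such as "Which ingredients have the most variety of dose forms?" reduces to counting distinct dose forms per ingredient once ingredient and dose-form names have been paired up. The sketch below shows only that counting step. The pairs are illustrative placeholders, not rows taken from RxNorm, and the function name is hypothetical; building the real pairs would require joining the concept and relationship tables listed above.

```python
from collections import defaultdict

# Illustrative (ingredient, dose form) pairs; NOT actual RxNorm rows.
pairs = [
    ("phenylephrine", "Oral Tablet"),
    ("phenylephrine", "Nasal Spray"),
    ("phenylephrine", "Oral Tablet"),  # duplicates are collapsed by the set below
    ("ibuprofen", "Oral Tablet"),
]


def dose_form_variety(pairs):
    """Return (ingredient, distinct dose form count), most varied first."""
    forms = defaultdict(set)
    for ingredient, dose_form in pairs:
        forms[ingredient].add(dose_form)
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(
        ((name, len(found)) for name, found in forms.items()),
        key=lambda item: (-item[1], item[0]),
    )


print(dose_form_variety(pairs))
```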
Hair Loss Segmentation Dataset
kaggle.com
Updated Aug 4, 2023
+ more versions
Cite
Training Data (2023). Hair Loss Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/bald-people-segmentation-dataset/data
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 4, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Bald People Segmentation & Object Detection dataset

The balding dataset consists of images of bald people and corresponding segmentation masks.

Segmentation masks highlight the regions of the images that delineate the bald scalp. By using these segmentation masks, researchers and practitioners can focus only on the areas of interest. The dataset can also be used to study androgenetic alopecia.

The alopecia dataset is designed to be accessible and easy to use, providing high-resolution images and corresponding segmentation masks in PNG format.

💴 For Commercial Usage: The full version of the dataset includes many more photos. Leave a request on TrainingData to buy the dataset.

SIMILAR DATASETS:

Dataset of bald people, 5000 images

Body Segmentation - 5,300 Photos

Selfies, ID Images dataset (5591 sets of 15 files)

Face segmentation

Makeup Detection Dataset

💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

Content

The dataset includes 2 folders:

Female - the folder includes subfolders corresponding to each woman in the sample. Each subfolder contains a top image of the woman's head and a segmentation mask for the original photo.

Male - the folder includes subfolders corresponding to each man in the sample. Each subfolder contains front and top images of the man's head and segmentation masks for the original photos.

File with the extension .csv

link: link to access the media file,

type: type of the image,

gender: gender of the person in the photo

Bald People Segmentation data can be produced in accordance with your requirements.

TrainingData provides high-quality data annotation tailored to your needs

keywords: bald segmentation, image dataset, bald dataset, hair segmentation, facial images, human segmentation, bald computer vision, bald classification, bald detection, balding men, balding women, baldness, bald woman, bald scalp, bald head, biometric dataset, biometric data dataset, deep learning dataset, facial analysis, human images dataset, androgenetic alopecia, hair loss dataset, balding and non-balding
Critical Habitats Data
kaggle.com
Updated May 13, 2023
+ more versions
Cite
Utkarsh Singh (2023). Critical Habitats Data [Dataset]. https://www.kaggle.com/datasets/utkarshx27/critical-habitats-data
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
May 13, 2023
Dataset provided by
Kaggle
Authors
Utkarsh Singh
License
https://www.usa.gov/government-works/
Description
Note: There are 5 files

Description

Connecticut Critical Habitats is a polygon feature-based layer with a resolution of +/- 10 meters that represents significant natural community types occurring in Connecticut. This layer is a subset of habitat-related vegetation associations, described in Connecticut's Natural Vegetation Classification, that were designated as key habitats for species of Greatest Conservation Need in the Comprehensive Wildlife Conservation Strategy. These habitats are known to host a number of rare species including highly specialized invertebrates with very specific habitat associations. Some key habitats are broken into subtypes based on natural variations in plant species dominance and/or vegetation structure. These differences are apparent in the subtype names. Connecticut Critical Habitats can serve to highlight ecologically significant areas and to target areas of species diversity.

This layer can be used to perform various spatial analyses that pertain to Critical Habitats, to aid in determining site management and conservation priorities, to prioritize field surveys, and to further document the distribution and abundance of State-listed and/or rare vertebrate and invertebrate species within the significant habitats. Use this layer appropriately with data maintaining similar resolution. Not intended for maps printed at a scale larger or more detailed than 1:2,000.

Purpose

Connecticut Critical Habitats provides the identification and distribution of a subset of important wildlife habitats identified in the Connecticut Comprehensive Wildlife Conservation Strategy. Connecticut Critical Habitats can be used in conjunction with other environmental and natural resource information to provide a more thorough understanding of the physical characteristics of each habitat. The spatial relationships between these areas and data such as land ownership and past, present and projected land use can be analyzed. The Connecticut Critical Habitats can serve to highlight ecologically significant areas and to target areas of species diversity for land conservation and protection. Biologists may use this data to target further research on associated plant and animal species.

Use Limitations

Connecticut Critical Habitats is not a comprehensive map of all critical habitat types in Connecticut. It represents a subset of the key habitats of greatest conservation need identified in Connecticut's Comprehensive Wildlife Conservation Strategy. Sites were mapped according to their known distribution. For some habitats the distribution may not be complete since no state-wide exhaustive surveys have been conducted. Most critical habitat sites were not field visited and publicly available oblique imagery such as the Bing Maps web mapping service was used as a surrogate for field investigation. Caution is advised when using this information without field verifying the habitat delineation and characterization for accuracy. Since many of these areas occur on private property, visiting these sites will require permission from the landowner for access. The recommended scale for viewing Critical Habitats is 1:2,000 to 1:12,000. Displaying Connecticut Critical Habitats at map scales larger and more detailed than 1:2,000 scale may result in minor locational differences and inaccuracies.
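
The stated +/- 10 meter resolution and the recommended viewing scales can be related with simple arithmetic: a ground distance divided by the scale denominator gives the distance on the map. The helper below is a hypothetical illustration, not part of the dataset, and it assumes the resolution figure can be read as a ground-distance uncertainty.

```python
def ground_to_map_mm(ground_m: float, scale_denominator: int) -> float:
    """Map distance in millimetres for a ground distance in metres at scale 1:scale_denominator."""
    return ground_m * 1000.0 / scale_denominator


# Size of a 10 m ground uncertainty at both ends of the recommended range.
for denominator in (2000, 12000):
    print(f"1:{denominator}: {ground_to_map_mm(10.0, denominator):.2f} mm on the map")
```

At 1:2,000 a 10 m offset already spans 5 mm on the map, which is consistent with the caution against displaying the layer at more detailed scales.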

Cite

Training Data (2024). Crowd Counting Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/crowd-counting-dataset/discussion

Data from: Crowd Counting Dataset

Images of crowds ranging from 0 to 11,000 people with labeling in JSON

Explore at:

Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.

Dataset updated

Feb 16, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Training Data

License

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically

Description

Crowd Counting Dataset

The dataset includes images featuring crowds of people ranging from 0 to 5000 individuals. The dataset includes a diverse range of scenes and scenarios, capturing crowds in various settings. Each image in the dataset is accompanied by a corresponding JSON file containing detailed labeling information for each person in the crowd for crowd count and classification.

Types of crowds in the dataset: 0-1000, 1000-2000, 2000-3000, 3000-4000 and 4000-5000

This dataset provides a valuable resource for researchers and developers working on crowd counting technology, enabling them to train and evaluate their algorithms with a wide range of crowd sizes and scenarios. It can also be used for benchmarking and comparison of different crowd counting algorithms, as well as for real-world applications such as public safety and security, urban planning, and retail analytics.

The full version of the dataset includes 647 labeled images of crowds. Leave a request on TrainingData to buy the dataset.

Statistics for the dataset (number of images by the crowd's size and image width):

OTHER BIOMETRIC DATASETS:

Get the Dataset

This is just an example of the data

Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

Content

images - includes original images of crowds, placed in subfolders according to crowd size,
labels - includes JSON files with labeling and visualised labeling for the images in the previous folder,
csv file - includes information for each image in the dataset

File with the extension .csv

id: id of the image,
image: link to access the original image,
label: link to access the json-file with labeling,
type: type of the crowd on the photo
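
As a minimal sketch of working with the .csv file described above, the snippet below reads rows with the four listed columns (id, image, label, type) and counts images per crowd type. The sample rows, paths, and the function name are hypothetical stand-ins, since the real file contents are not shown here. The JSON label files are not parsed because their structure is not documented in this description.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the dataset's .csv file; paths and values are invented.
SAMPLE_CSV = """id,image,label,type
0,images/0-1000/0.jpg,labels/0-1000/0.json,0-1000
1,images/0-1000/1.jpg,labels/0-1000/1.json,0-1000
2,images/4000-5000/2.jpg,labels/4000-5000/2.json,4000-5000
"""


def count_by_type(csv_file) -> Counter:
    """Count rows per value of the `type` column in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row["type"] for row in reader)


counts = count_by_type(io.StringIO(SAMPLE_CSV))
for crowd_type, n in sorted(counts.items()):
    print(crowd_type, n)
```

To run this against a real file, the `io.StringIO` object could be replaced with `open(path, newline="")`, where `path` points at the downloaded CSV.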

TrainingData provides high-quality data annotation tailored to your needs

keywords: crowd counting, crowd density estimation, people counting, crowd analysis, image annotation, computer vision, deep learning, object detection, object counting, image classification, dense regression, crowd behavior analysis, crowd tracking, head detection, crowd segmentation, crowd motion analysis, image processing, machine learning, artificial intelligence, ai, human detection, crowd sensing, image dataset, public safety, crowd management, urban planning, event planning, traffic management
