100+ datasets found

Human Tracking & Object Detection Dataset
kaggle.com
zip
Updated Jul 27, 2023
Cite
Unique Data (2023). Human Tracking & Object Detection Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/people-tracking
Explore at:
Available download formats: zip (46156442 bytes)
Dataset updated
Jul 27, 2023
Authors
Unique Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
People Tracking & Object Detection dataset

The dataset comprises annotated video frames from a camera positioned in a public space. Each individual in the camera's view has been tracked using the rectangle tool in the Computer Vision Annotation Tool (CVAT).

The dataset is based on the Real-Time Traffic Video Dataset.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Fc5a8dc4f63fe85c64a5fead10fad3031%2Fpersons_gif.gif?generation=1690705558283123&alt=media

Dataset Structure

The images directory houses the original video frames, serving as the primary source of raw data.

The annotations.xml file provides the detailed annotation data for the images.

The boxes directory contains frames that visually represent the bounding box annotations, showing the locations of the tracked individuals within each frame. These images can be used to understand how the tracking has been implemented and to visualize the marked areas for each individual.

Data Format

The annotations are rectangular bounding boxes placed around each individual. Each bounding box annotation contains the position of the box within the frame as xtl, ytl, xbr, and ybr coordinates.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F4f274551e10db2754c4d8a16dff97b33%2Fcarbon%20(10).png?generation=1687776281548084&alt=media
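As a rough illustration of the format described above, the following sketch reads xtl, ytl, xbr, and ybr values from CVAT-style XML and derives each box's width and height. It assumes the CVAT "for images" layout (image elements with box children); the inline sample, the frame name, and the "person" label are hypothetical and are not taken from the dataset's annotations.xml, which may instead use track elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the CVAT "for images" layout; the real
# annotations.xml may differ (for example, <track> elements for video tasks).
SAMPLE_XML = """
<annotations>
  <image id="0" name="frame_000000.png" width="1920" height="1080">
    <box label="person" xtl="100.0" ytl="200.0" xbr="150.0" ybr="320.0"/>
    <box label="person" xtl="400.5" ytl="210.0" xbr="460.5" ybr="330.0"/>
  </image>
</annotations>
"""


def read_boxes(xml_text):
    """Return one dict per <box>, with corner coordinates plus width and height."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for image in root.iter("image"):
        for box in image.iter("box"):
            # xtl/ytl is the top-left corner, xbr/ybr the bottom-right corner.
            xtl, ytl, xbr, ybr = (
                float(box.get(key)) for key in ("xtl", "ytl", "xbr", "ybr")
            )
            boxes.append({
                "frame": image.get("name"),
                "label": box.get("label"),
                "xtl": xtl, "ytl": ytl, "xbr": xbr, "ybr": ybr,
                "width": xbr - xtl,
                "height": ybr - ytl,
            })
    return boxes


boxes = read_boxes(SAMPLE_XML)
for b in boxes:
    print(b["frame"], b["label"], b["width"], b["height"])
```

To try this on the real file, the XML text would be read from annotations.xml instead of the inline sample, after checking which CVAT layout the file actually uses.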

👉 Legally sourced datasets, carefully structured for AI training and model development. Explore samples from our dataset of 95,000+ human images & videos - Full dataset

🚀 You can learn more about our high-quality unique datasets here

keywords: multiple people tracking, human detection dataset, object detection dataset, people tracking dataset, tracking human object interactions, human Identification tracking dataset, people detection annotations, detecting human in a crowd, human trafficking dataset, deep learning object tracking, multi-object tracking dataset, labeled web tracking dataset, large-scale object tracking dataset
Wikipedia notable people
kaggle.com
zip
Updated Jun 15, 2023
Cite
Konrad Banachewicz (2023). Wikipedia notable people [Dataset]. https://www.kaggle.com/datasets/konradb/wikipedia-notable-people
Explore at:
Available download formats: zip (268529204 bytes)
Dataset updated
Jun 15, 2023
Authors
Konrad Banachewicz
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
From the original paper:

A new strand of literature aims at building the most comprehensive and accurate database of notable individuals. We collect a massive amount of data from various editions of Wikipedia and Wikidata. Using deduplication techniques over these partially overlapping sources, we cross-verify each retrieved information. For some variables, Wikipedia adds 15% more information when missing in Wikidata. We find very few errors in the part of the database that contains the most documented individuals but nontrivial error rates in the bottom of the notability distribution, due to sparse information and classification errors or ambiguity. Our strategy results in a cross-verified database of 2.29 million individuals (an elite of 1/43,000 of human being having ever lived), including a third who are not present in the English edition of Wikipedia.
HmtDB - Human Mitochondrial DataBase
neuinfo.org
scicrunch.org
Updated May 16, 2018
Cite
(2018). HmtDB - Human Mitochondrial DataBase [Dataset]. http://identifiers.org/RRID:SCR_007713
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007713
Dataset updated
May 16, 2018
Description
A human mitochondrial resource aimed at supporting population genetics and mitochondrial disease studies. It consists of a database of human mitochondrial genomes annotated with population and variability data, the latter estimated through a new approach based on site-specific nucleotidic and aminoacidic variability calculation (SiteVar and MitVarProt programs). The goals of HmtDB are: to collect and integrate the publicly available human mitochondrial genome data; to produce and provide the scientific community with site-specific nucleotidic and aminoacidic variability data estimated on all the collected human mitochondrial genome sequences; and to allow any researcher to analyse their own human mitochondrial sequences (both complete and partial mitochondrial genomes) in order to automatically detect the nucleotidic variants compared to the revised Cambridge Reference Sequence (rCRS) and to predict their haplogroup paternity. HmtDB's first release contains 1255 human mitochondrial genomes derived from public databases (GenBank and MitoKor). The genomes have been stored and analysed as a whole dataset and grouped in continent-specific subsets (AF: Africa, AM: America, AS: Asia, EU: Europe, OC: Oceania). The multialignment and site-variability analysis tools included in HmtDB are clustered in two Work Flows: the Variability Generation Work Flow (VGWF) and the Classification Work Flow (CWF), which are applied to human mitochondrial genomes stored in the database and to newly sequenced genomes submitted by the user, respectively.

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Data from: Visible Human Project
healthdata.gov
datadiscovery.nlm.nih.gov
csv, xlsx, xml
Updated Mar 1, 2023
Cite
datadiscovery.nlm.nih.gov (2023). Visible Human Project [Dataset]. https://healthdata.gov/NIH/Visible-Human-Project/krti-uwg9
Explore at:
Available download formats: xlsx, xml, csv
Dataset updated
Mar 1, 2023
Dataset provided by
datadiscovery.nlm.nih.gov
Description
The NLM Visible Human Project® has created publicly-available complete, anatomically detailed, three-dimensional representations of a human male body and a human female body. Specifically, the VHP provides a public-domain library of cross-sectional cryosection, CT, and MRI images obtained from one male cadaver and one female cadaver. The Visible Man data set was publicly released in 1994 and the Visible Woman in 1995.

https://www.nlm.nih.gov/research/visible/visible_human.html
LinkedIn Dataset - US People Profiles
kaggle.com
zip
Updated May 16, 2023
Cite
Joseph from Proxycurl (2023). LinkedIn Dataset - US People Profiles [Dataset]. https://www.kaggle.com/datasets/proxycurl/10000-us-people-profiles/discussion?sort=undefined
Explore at:
Available download formats: zip (34781665 bytes)
Dataset updated
May 16, 2023
Authors
Joseph from Proxycurl
Description
Full profiles of 10,000 people in the US (download here, data schema here), with more than 40 data points, including Full Name, Education, Location, Work Experience History, and many more.

An additional 258+ million US people profiles are available; visit the LinkDB product page here.

Our LinkDB database is an exhaustive database of publicly accessible LinkedIn people and company profiles. It contains close to 500 million people and company profiles globally.
Human Genome Variation Society: Databases and Other Tools
neuinfo.org
dknet.org
Updated Jan 29, 2022
Cite
(2022). Human Genome Variation Society: Databases and Other Tools [Dataset]. http://identifiers.org/RRID:SCR_006876
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006876
Dataset updated
Jan 29, 2022
Description
A list of various databases freely available to the public, including several mutation and variation resources as well as education resources for teachers and students, provided by the Human Genome Variation Society. Databases listed include:

* Locus Specific Mutation Databases
* Disease Centered Central Mutation Databases
* Central Mutation and SNP Databases
* National and Ethnic Mutation Databases
* Mitochondrial Mutation Databases
* Chromosomal Variation Databases
* Other Mutation Databases (i.e. your round holes don't fit our square pegs)
* Clinical and Patient Aspects Databases
* Non Human Mutation Databases
* Artificial Mutations Only
* Other Related Databases
* Education Resources for Teachers and Students
LINE Number Database | Line Data
listtodata.com
.csv, .xls, .txt
Updated Jul 17, 2025
Cite
List to Data (2025). LINE Number Database | Line Data [Dataset]. https://listtodata.com/line-data
Explore at:
Available download formats: .csv, .xls, .txt
Dataset updated
Jul 17, 2025
Authors
List to Data
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
Jan 1, 2025 - Dec 31, 2025
Area covered
Senegal, Costa Rica, Denmark, Algeria, Bosnia and Herzegovina, Palestine, Congo (Democratic Republic of the), Switzerland, United States of America, Liechtenstein
Variables measured
phone numbers, email address, full name, address, city, state, gender, age, income, IP address
Description
The LINE number database is an extensive list of people who use the LINE app. Imagine you have a list of everyone in your school, and you build a separate list of only the students who enjoy playing soccer. This is what a LINE Database User List looks like. It allows businesses to target a specific set of people who may be interested in their offer. The LINE number database is useful because it may be updated regularly. Businesses, like teachers, may update their lists as new users join LINE or as their interests change. This keeps the list functional and allows firms to contact the relevant people. The LINE number database may help businesses measure user engagement and improve their marketing campaigns over time. Businesses that keep the list updated and use it appropriately may develop stronger ties with their audience and achieve better outcomes. Finally, the LINE Number Database is a tool for businesses to reach the appropriate individuals at the correct time. This database is available on List To Data.

LINE data is a resource for businesses seeking to connect with potential customers. This dataset encompasses information about individuals who use the LINE messaging app. LINE is a popular messaging platform with over 90 million monthly active users worldwide. Users can communicate through messages, voice and video calls, and share stickers. This database has information about users, such as their names, phone numbers, email addresses, and sometimes even what they like to do on the app. Businesses use LINE data to sell products or provide services: they use the LINE app user database to find people who might be interested in their offers.

But businesses need to be careful with this information. People's details, such as their phone numbers and email addresses, are private. Businesses should always ask for permission before using this information. They also need to keep it safe so that no one else can see it. If businesses respect users' privacy, people will trust them more and be happier to hear about what the business offers. This data is available on List To Data.
Diversity, Equity and Inclusion Measures Dataset
kaggle.com
zip
Updated Nov 2, 2022
Cite
Kerem Kurt (2022). Diversity, Equity and Inclusion Measures Dataset [Dataset]. https://www.kaggle.com/datasets/keremkurt/diversity-equity-and-inclusion-measures-dataset
Explore at:
Available download formats: zip (2756966 bytes)
Dataset updated
Nov 2, 2022
Authors
Kerem Kurt
Description
General Info

This data set is generated to simulate an employee data set of a company including sensitive information such as gender, sexual orientation, ethnicity, LGBTQ, and much more. The goal of this data set is to improve Diversity, Equity, and Inclusion in the workplace.

Survey Questions and Scores

The main idea of the survey is to track whether the company's efforts to improve DEI actually work and to discover whether any group (from backgrounds that differ in gender, ethnicity, sexual orientation, etc.) falls behind. The survey can be repeated periodically to measure the impact of the company's efforts.

There are 5 survey questions for each of the DEI categories. Survey scores of employees are also shown in the data set.
Human Motion Data for Licensing
rokoko.com
Updated Jul 1, 2025
Cite
Rokoko (2025). Human Motion Data for Licensing [Dataset]. https://www.rokoko.com/mocap/motion-dataset
Explore at:
Dataset updated
Jul 1, 2025
Dataset authored and provided by
Rokoko
License
https://www.rokoko.com/legal/terms-and-conditions
Description
Access the world's largest and most diverse human motion dataset. A professional-grade collection of motion capture data suitable for advanced applications, including machine learning, animation, research, and AI development.
Factori US People Data APIs | 240M+ profiles:40+ attributes|
datarade.ai
.json
Updated Jun 3, 2023
Cite
Factori (2023). Factori US People Data APIs | 240M+ profiles:40+ attributes| [Dataset]. https://datarade.ai/data-products/factori-us-person-data-apis-240m-profiles-40-attributes-factori
Explore at:
Available download formats: .json
Dataset updated
Jun 3, 2023
Dataset authored and provided by
Factori
Area covered
United States of America
Description
Factori is a compliant, flexible, and adaptable data provider. We help you make smarter decisions, fill all the gaps in your data, uncover patterns, gain a competitive advantage, and build better solutions by bringing accurate, holistic, privacy-compliant global consumer data.

We specialize in building the world’s largest consumer graph that ingests, de-dupes, and transforms premium data from over 2.3 billion anonymous customer profiles with 800+ attributes, which powers insights for smarter decision-making and building adequate solutions. We take privacy and personal information very seriously and are committed to adhering to all applicable data privacy and security laws and regulations, including the GDPR, CCPA, and ISO 27001.

In the dynamic realm of business, the perpetual challenge of maintaining current customer data is ever-present. Factori’s People Data API efficiently manages the ingestion, deduplication, and transformation of premium data sources, saving you valuable time and effort.

With our API, you can access and utilize subsets of our comprehensive person dataset, empowering you to gain actionable intelligence, make data-driven decisions, and build innovative products and services. Whether you're a marketer, data scientist, or business analyst, our US People Data can unlock new opportunities for your organization.

Designed as a comprehensive data enrichment solution, our US People database fills gaps in your customer data, offering profound insights into your consumers. Encompassing over 300 million profiles with more than 40 variables spanning location, demographics, lifestyle, hobbies, and behaviors, it acts as a guiding compass for understanding your customers' past, present, and potential future behaviors. This enables you to navigate the business landscape with clarity, making decisions grounded in comprehensive and informed perspectives.

Here are some of the data categories and attributes we offer within the US People Data Graph:

Geography: City, State, ZIP, County, CBSA, Census Tract, etc.

Demographics: Gender, Age Group, Marital Status, Language, etc.

Financial: Income Range, Credit Rating Range, Credit Type, etc.

Persona: Consumer type, Communication preferences, Family type, etc.

Interests: Content, Brands, Shopping, Hobbies, Lifestyle, etc.

Household: Number of Children, Number of Adults, IP Address, etc.

Behaviors: Brand Affinity, App Usage, Web Browsing, etc.

Firmographics: Industry, Company, Occupation, Revenue, etc.

Retail Purchase: Store, Category, Brand, SKU, Quantity, Price, etc.

Auto: Car Make, Model, Type, Year, etc.

Housing: Home type, Home value, Renter/Owner, Year Built, etc.

Use Cases:

Sales Intelligence: Precision Market Analysis and Segmentation; Engage with personalized campaigns; Enhance Lead Scoring and Qualification

Strategic Marketing: Precision Market Analysis and Segmentation; Engage with personalized campaigns; Enhance Lead Scoring and Qualification

Fraud and Cybersecurity: Unlock comprehensive identity insights; Seamless KYC Compliances; Real-time Threat Detection

HR Tech: Elevate Candidate Profiles; Forge Talent Pathways; Track role transitions
China CN: Tel: Penetration Rate: Mobile per 100 People: Inner Mongolia
ceicdata.com
Updated Mar 26, 2018
Cite
CEICdata.com (2018). China CN: Tel: Penetration Rate: Mobile per 100 People: Inner Mongolia [Dataset]. https://www.ceicdata.com/en/china/telephone-number-of-mobile-per-100-people-by-region
Explore at:
Dataset updated
Mar 26, 2018
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2013 - Dec 1, 2024
Area covered
China
Variables measured
Phone Statistics
Description
CN: Tel: Penetration Rate: Mobile per 100 People: Inner Mongolia data was reported at 129.173 Unit in 2024. This records an increase from the previous number of 127.380 Unit for 2023. CN: Tel: Penetration Rate: Mobile per 100 People: Inner Mongolia data is updated yearly, averaging 93.700 Unit from Dec 1999 (Median) to 2024, with 26 observations. The data reached an all-time high of 129.173 Unit in 2024 and a record low of 2.260 Unit in 1999. CN: Tel: Penetration Rate: Mobile per 100 People: Inner Mongolia data remains active status in CEIC and is reported by Ministry of Industry and Information Technology. The data is categorized under Global Database’s China – Table CN.ICA: Telephone: Number of Mobile per 100 People: By Region.
Indonesia East Nusa Tenggara: Nagekeo Regency: Total Voters
ceicdata.com
Updated Dec 8, 2019
Cite
CEICdata.com (2019). Indonesia East Nusa Tenggara: Nagekeo Regency: Total Voters [Dataset]. https://www.ceicdata.com/en/indonesia/legislative-election-peoples-representative-council-total-electors-and-voters-east-nusa-tenggara
Explore at:
Dataset updated
Dec 8, 2019
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2014
Area covered
Indonesia
Description
East Nusa Tenggara: Nagekeo Regency: Total Voters data was reported at 72,244.000 Person in 2014. East Nusa Tenggara: Nagekeo Regency: Total Voters data is updated yearly, averaging 72,244.000 Person from Dec 2014 (Median) to 2014, with 1 observation. East Nusa Tenggara: Nagekeo Regency: Total Voters data remains active status in CEIC and is reported by General Elections Commission. The data is categorized under Indonesia Premium Database’s General Election – Table ID.GEE019: Legislative Election: People's Representative Council: Total Electors and Voters: East Nusa Tenggara.
Indonesia Jambi: Kerinci Regency: Total Valid Votes: National Democratic...
ceicdata.com
Updated Oct 10, 2019
Cite
CEICdata.com (2019). Indonesia Jambi: Kerinci Regency: Total Valid Votes: National Democratic Party, Nasdem [Dataset]. https://www.ceicdata.com/en/indonesia/legislative-election-peoples-regional-representative-council-results-of-vote-acquisition-jambi
Explore at:
Dataset updated
Oct 10, 2019
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2014 - Dec 1, 2019
Area covered
Indonesia
Description
Jambi: Kerinci Regency: Total Valid Votes: National Democratic Party, Nasdem data was reported at 12,981.000 Unit in 2019. This records an increase from the previous number of 2,204.000 Unit for 2014. Jambi: Kerinci Regency: Total Valid Votes: National Democratic Party, Nasdem data is updated yearly, averaging 7,592.500 Unit from Dec 2014 (Median) to 2019, with 2 observations. The data reached an all-time high of 12,981.000 Unit in 2019 and a record low of 2,204.000 Unit in 2014. Jambi: Kerinci Regency: Total Valid Votes: National Democratic Party, Nasdem data remains active status in CEIC and is reported by General Elections Commission. The data is categorized under Indonesia Premium Database’s General Election – Table ID.GEI005: Legislative Election: People's Regional Representative Council: Results of Vote Acquisition: Jambi.
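The summary figures quoted above can be checked with a short calculation. The sketch below uses only the two observations stated in the description (2014 and 2019); the variable names are illustrative. With two observations, the median is simply their midpoint.

```python
from statistics import median

# The two yearly observations quoted in the description above.
observations = {2014: 2204.0, 2019: 12981.0}

mid = median(observations.values())                # midpoint of the two values
increase = observations[2019] - observations[2014]
high = max(observations.values())
low = min(observations.values())

print(mid, increase, high, low)
```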
Asian People - Liveness Detection Video Dataset
kaggle.com
zip
Updated Apr 17, 2024
Cite
Unique Data (2024). Asian People - Liveness Detection Video Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/asian-people-liveness-detection-video-dataset
Explore at:
Available download formats: zip (177727531 bytes)
Dataset updated
Apr 17, 2024
Authors
Unique Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Biometric Attack Dataset, Asian People

A similar dataset that includes all ethnicities - Anti Spoofing Real Dataset

The dataset for face anti-spoofing and face recognition includes images and videos of Asian people: 30,600+ photos and videos of 15,300 people from 32 countries. All people presented in the dataset are South Asian, East Asian or Middle Asian. The dataset helps enhance model performance by providing a wider range of data for a specific ethnic group.

The videos were gathered by capturing faces of genuine individuals presenting spoofs, using facial presentations. Our dataset proposes a novel approach that learns and detects spoofing techniques, extracting features from the genuine facial images to prevent the capturing of such information by fake users.

The dataset contains images and videos of real humans with various resolutions, views, and colors, making it a comprehensive resource for researchers working on anti-spoofing technologies.

People in the dataset

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Ff545aa561432738d251c09f09e1f5e92%2FFrame%20104.png?generation=1713356643038606&alt=media

Types of files in the dataset:

photo - selfie of the person

video - real video of the person

Our dataset also explores the use of neural architectures, such as deep neural networks, to facilitate the identification of distinguishing patterns and textures in different regions of the face, increasing the accuracy and generalizability of the anti-spoofing models.

👉 Legally sourced datasets, carefully structured for AI training and model development. Explore samples from our dataset of 95,000+ human images & videos - Full dataset

Metadata for the full dataset:

assignment_id - unique identifier of the media file

worker_id - unique identifier of the person

age - age of the person

true_gender - gender of the person

country - country of the person

ethnicity - ethnicity of the person

video_extension - video extensions in the dataset

video_resolution - video resolution in the dataset

video_duration - video duration in the dataset

video_fps - frames per second for video in the dataset

photo_extension - photo extensions in the dataset

photo_resolution - photo resolution in the dataset

Statistics for the dataset

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F6de78d350a9213d8437f766b085d4551%2Fasian_video_liveness.png?generation=1713356627116331&alt=media

🧩 This is just an example of the data. Leave a request here to learn more

Content

The dataset consists of:

files - includes 10 folders, one per person, each containing 1 image and 1 video

.csv file - contains information about the files and people in the dataset

File with the extension .csv

id: id of the person,

selfie_link: link to access the photo,

video_link: link to access the video,

age: age of the person,

country: country of the person,

gender: gender of the person,

video_extension: video extension,

video_resolution: video resolution,

video_duration: video duration,

video_fps: frames per second for video,

photo_extension: photo extension,

photo_resolution: photo resolution
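
A file with these columns could be loaded along the following lines. This is only a sketch: the inline rows are invented placeholders that mirror the column names listed above, and treating video_duration as seconds and video_fps as a number are assumptions, since the listing does not state units or value formats.

```python
import csv
import io
from collections import Counter

# Placeholder rows that only mirror the column names listed above;
# every value here is invented for illustration.
SAMPLE_CSV = """id,selfie_link,video_link,age,country,gender,video_extension,video_resolution,video_duration,video_fps,photo_extension,photo_resolution
1,files/1/selfie.jpg,files/1/video.mp4,25,IN,female,mp4,1080x1920,10.5,30,jpg,1080x1920
2,files/2/selfie.jpg,files/2/video.mp4,31,JP,male,mp4,720x1280,8.0,30,jpg,720x1280
3,files/3/selfie.jpg,files/3/video.mp4,42,IN,male,mov,1080x1920,12.0,25,jpg,1080x1920
"""


def load_rows(csv_text):
    """Parse the CSV text and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["age"] = int(row["age"])
        # Assumption: duration is in seconds and fps is numeric.
        row["video_duration"] = float(row["video_duration"])
        row["video_fps"] = float(row["video_fps"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
per_country = Counter(r["country"] for r in rows)      # people per country
total_duration = sum(r["video_duration"] for r in rows)
print(per_country, total_duration)
```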

🚀 You can learn more about our high-quality unique datasets here

keywords: liveness detection systems, liveness detection dataset, biometric dataset, biometric data dataset, biometric system attacks, anti-spoofing dataset, face liveness detection, deep learning dataset, face spoofing database, face anti-spoofing, ibeta dataset, face anti spoofing, large-scale face anti spoofing, rich annotations anti spoofing dataset, asian people, asian classification, asian image dataset
HUDSEN Human Gene Expression Spatial Database
rrid.site
Updated Jan 29, 2022
Cite
(2022). HUDSEN Human Gene Expression Spatial Database [Dataset]. http://identifiers.org/RRID:SCR_006325
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006325
Dataset updated
Jan 29, 2022
Description
Database of a set of standard 3D virtual models at different stages of development from Carnegie Stages (CS) 12-23 (approximately 26-56 days post conception) in which various anatomical regions have been defined with a set of anatomical terms at various stages of development (known as an ontology). Experimental data is captured and converted to digital format and then mapped to the appropriate 3D model. The ontology is used to define sites of gene expression using a set of standard descriptions and to link the expression data to an "anatomical tree". Human data from stages CS12 to CS23 can be submitted to the HUDSEN Gene Expression Database. The anatomy ontology currently being used is based on the Edinburgh Human Developmental Anatomy Database, which encompasses all developing structures from CS1 to CS20 but is not detailed for developing brain structures. The ontology is being extended and refined (by Prof Luis Puelles, University of Murcia, Spain) and will be incorporated into the HUDSEN database as it is developed. Expression data is annotated using two methods to denote sites of expression in the embryo: spatial annotation and text annotation. Additionally, many aspects of the detection reagent and specimen are also annotated during this process (assignment of IDs, nucleotide sequences for probes, etc.). There are currently two main ways to search HUDSEN: using a gene/protein name or a named anatomical structure as the query term. The entire contents of the database can be browsed using the data browser. Results may be saved. The data in HUDSEN is generated both by researchers within the HUDSEN project and by the wider scientific community.
The HUDSEN human gene expression spatial database is a collaboration between the Institute of Human Genetics in Newcastle, UK, and the MRC Human Genetics Unit in Edinburgh, UK, and was developed as part of the Electronic Atlas of the Developing Human Brain (EADHB) project (funded by the NIH Human Brain Project). The database is based on the Edinburgh Mouse Atlas gene expression database (EMAGE), and is designed to be an openly available resource to the research community holding gene expression patterns during early human development.
Data from: What We Eat In America (WWEIA) Database
catalog.data.gov
agdatacommons.nal.usda.gov
Updated Dec 2, 2025
Cite
Agricultural Research Service (2025). What We Eat In America (WWEIA) Database [Dataset]. https://catalog.data.gov/dataset/what-we-eat-in-america-wweia-database-f7f35
Explore at:
Dataset updated
Dec 2, 2025
Dataset provided by
Agricultural Research Service
Area covered
United States
Description
What We Eat in America (WWEIA) is the dietary intake interview component of the National Health and Nutrition Examination Survey (NHANES). WWEIA is conducted as a partnership between the U.S. Department of Agriculture (USDA) and the U.S. Department of Health and Human Services (DHHS). Two days of 24-hour dietary recall data are collected through an initial in-person interview, and a second interview conducted over the telephone within three to 10 days. Participants are given three-dimensional models (measuring cups and spoons, a ruler, and two household spoons) and/or USDA's Food Model Booklet (containing drawings of various sizes of glasses, mugs, bowls, mounds, circles, and other measures) to estimate food amounts. WWEIA data are collected using USDA's dietary data collection instrument, the Automated Multiple-Pass Method (AMPM). The AMPM is a fully computerized method for collecting 24-hour dietary recalls either in-person or by telephone.

For each 2-year data release cycle, the following dietary intake data files are available:

Individual Foods File - Contains one record per food for each survey participant. Foods are identified by USDA food codes. Each record contains information about when and where the food was consumed, whether the food was eaten in combination with other foods, amount eaten, and amounts of nutrients provided by the food.

Total Nutrient Intakes File - Contains one record per day for each survey participant. Each record contains daily totals of food energy and nutrient intakes, daily intake of water, intake day of week, total number foods reported, and whether intake was usual, much more than usual or much less than usual. The Day 1 file also includes salt use in cooking and at the table; whether on a diet to lose weight or for other health-related reason and type of diet; and frequency of fish and shellfish consumption (examinees one year or older, Day 1 file only).

DHHS is responsible for the sample design and data collection, and USDA is responsible for the survey’s dietary data collection methodology, maintenance of the databases used to code and process the data, and data review and processing. USDA also funds the collection and processing of Day 2 dietary intake data, which are used to develop variance estimates and calculate usual nutrient intakes. Resources in this dataset: Resource Title: What We Eat In America (WWEIA) main web page. File Name: Web Page, url: https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/wweianhanes-overview/ Contains data tables, research articles, documentation data sets and more information about the WWEIA program. (Link updated 05/13/2020)
World’s Top 2% of Scientists list by Stanford University: An Analysis of its...
data.mendeley.com
Updated Nov 17, 2023
JOHN Philip (2023). World’s Top 2% of Scientists list by Stanford University: An Analysis of its Robustness [Dataset]. http://doi.org/10.17632/td6tdp4m6t.1
Explore at:
Unique identifier
https://doi.org/10.17632/td6tdp4m6t.1
Dataset updated
Nov 17, 2023
Authors
JOHN Philip
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
John Ioannidis and co-authors [1] created a publicly available database of top-cited scientists in the world. This database, intended to address the misuse of citation metrics, has generated a lot of interest among the scientific community, institutions, and media. Many institutions have used it as a yardstick to assess the quality of researchers. At the same time, some people view the list with skepticism, citing problems with the methodology used. Two separate databases are created, one based on career-long impact and one on single recent year impact. The database is built from Scopus data from Elsevier [1-3]. The scientists included are classified into 22 scientific fields and 174 sub-fields. The parameters considered for this analysis are total citations from 1996 to 2022 (nc9622), h-index in 2022 (h22), c-score, and world rank based on c-score (Rank ns). Citations without self-cites are considered in all cases (indicated as ns). For the single-year database, citations during 2022 (nc2222) are considered instead of nc9622.

To evaluate the robustness of the c-score-based ranking, I analyzed in detail the metric parameters of the Nobel laureates in physics, chemistry, and medicine of the last 25 years (1998-2022) and compared them with the top 100 rank holders in the list. The latest career-long and single-year databases (2022) were used for this analysis. The details of the analysis are presented below. Though the article says the selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field, the actual career-based ranking list has 204644 names [1]. The single-year database contains 210199 names. So the published list contains approximately the top 4% of scientists. In the career-based rank list, for the person with the lowest rank (4809825), the nc9622, h22, and c-score were 41, 3, and 1.3632, respectively, whereas for the person with the No. 1 rank, they were 345061, 264, and 5.5927, respectively. Three people on the list had fewer than 100 citations during 1996-2022, 1155 people had an h22 less than 10, and 6 people had a c-score less than 2.
In the single-year rank list, for the person with the lowest rank (6547764), the nc2222, h22, and c-score were 1, 1, and 0.6, respectively, whereas for the person with the No. 1 rank, the nc9622, h22, and c-score were 34582, 68, and 5.3368, respectively. 4463 people on the list had fewer than 100 citations in 2022, 71512 people had an h22 less than 10, and 313 people had a c-score less than 2. The presence of many authors with a single-digit h-index and a very meager total number of citations indicates serious shortcomings of the c-score-based ranking methodology.
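The threshold counts and the roughly 4% figure reported above can be reproduced with a few lines of Python. This is only a sketch: the two records are the lowest-ranked and No. 1 entries quoted in the description for the career-long list, the key name c_score is a made-up label for the c-score column, and the share calculation assumes that 100,000 names would correspond to about 2% of scientists, which is one possible reading of the description rather than a confirmed derivation.

```python
# Two career-long entries quoted in the description (lowest rank and No. 1 rank).
# "c_score" is a hypothetical key name chosen for this sketch.
records = [
    {"rank": 4809825, "nc9622": 41, "h22": 3, "c_score": 1.3632},
    {"rank": 1, "nc9622": 345061, "h22": 264, "c_score": 5.5927},
]


def count_below(rows, field, threshold):
    """Count rows whose value for `field` is strictly below `threshold`."""
    return sum(1 for row in rows if row[field] < threshold)


low_citations = count_below(records, "nc9622", 100)  # fewer than 100 citations
low_h_index = count_below(records, "h22", 10)        # h22 less than 10
low_c_score = count_below(records, "c_score", 2)     # c-score less than 2

# Assumption: 100,000 names would be about the top 2%, so a list of
# 204644 names scales to roughly twice that share.
implied_share_percent = 2.0 * 204644 / 100000

print(low_citations, low_h_index, low_c_score)
print(round(implied_share_percent, 2))
```

Run against the full career-long table instead of these two rows, the same three calls should give the counts the description reports (3, 1155, and 6), provided the column names are mapped accordingly.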
Data from: Smart Location Database
catalog.data.gov
gimi9.com
+1 more
Updated Feb 25, 2025
U.S. Environmental Protection Agency, Office of Policy, Office of Sustainable Communities (Publisher) (2025). Smart Location Database [Dataset]. https://catalog.data.gov/dataset/smart-location-database8
Explore at:
Dataset updated
Feb 25, 2025
Dataset provided by
United States Environmental Protection Agency http://www.epa.gov/
Description
A large body of research has demonstrated that land use and urban form can have a significant effect on transportation outcomes. People who live and/or work in compact neighborhoods with a walkable street grid and easy access to public transit, jobs, stores, and services are more likely to have several transportation options to meet their everyday needs. As a result, they can choose to drive less, which reduces their emissions of greenhouse gases and other pollutants compared to people who live and work in places that are not location efficient. Walking, biking, and taking public transit can also save people money and improve their health by encouraging physical activity. The Smart Location Database (SLD) summarizes several demographic, employment, and built environment variables for every census block group (CBG) in the United States. The database includes indicators of the commonly cited “D” variables shown in the transportation research literature to be related to travel behavior. The Ds include residential and employment density, land use diversity, design of the built environment, access to destinations, and distance to transit. SLD variables can be used as inputs to travel demand models, as baseline data for scenario planning studies, and can be combined into composite indicators characterizing the relative location efficiency of CBGs within U.S. metropolitan regions. This update features the most recent geographic boundaries (2019 Census Block Groups) and new and expanded sources of data used to calculate variables. Entirely new variables have been added and the methods used to calculate some of the SLD variables have changed. More information on the National Walkability Index: https://www.epa.gov/smartgrowth/smart-location-mapping More information on the Smart Location Calculator: https://www.slc.gsa.gov/slc/
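As an illustration of how several "D" variables might be combined into a composite indicator, the sketch below rescales each variable to the 0-1 range within a region and averages the results. The variable names (res_density, land_use_mix, dist_to_transit_m), the sample values, and the min-max-and-average method are all assumptions made for this example; they are not the SLD's actual field names or EPA's method for the National Walkability Index.

```python
def min_max(values, invert=False):
    """Rescale values to [0, 1]; invert when smaller raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # no variation: give every CBG the midpoint
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if invert else scaled


def composite_index(block_groups, variables):
    """Average the rescaled variables for each block group.

    `variables` maps a variable name to True when it should be inverted.
    """
    columns = {
        name: min_max([bg[name] for bg in block_groups], invert)
        for name, invert in variables.items()
    }
    return [
        sum(columns[name][i] for name in variables) / len(variables)
        for i in range(len(block_groups))
    ]


# Hypothetical block groups within one metropolitan region.
block_groups = [
    {"id": "A", "res_density": 2.0, "land_use_mix": 0.2, "dist_to_transit_m": 1200.0},
    {"id": "B", "res_density": 10.0, "land_use_mix": 0.8, "dist_to_transit_m": 200.0},
    {"id": "C", "res_density": 6.0, "land_use_mix": 0.5, "dist_to_transit_m": 700.0},
]

# Distance to transit is inverted because a shorter distance is more location efficient.
VARIABLES = {"res_density": False, "land_use_mix": False, "dist_to_transit_m": True}

scores = composite_index(block_groups, VARIABLES)
for bg, score in zip(block_groups, scores):
    print(bg["id"], round(score, 3))
```

Because the rescaling is done within the set of block groups passed in, the scores are relative to that region only, which matches the "relative location efficiency" framing in the description.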
Motor Vehicle Collisions - Person
catalog.data.gov
data.cityofnewyork.us
Updated Feb 8, 2026
data.cityofnewyork.us (2026). Motor Vehicle Collisions - Person [Dataset]. https://catalog.data.gov/dataset/motor-vehicle-collisions-person
Explore at:
Dataset updated
Feb 8, 2026
Dataset provided by
data.cityofnewyork.us
Description
The Motor Vehicle Collisions person table contains details for people involved in the crash. Each row represents a person (driver, occupant, pedestrian, bicyclist, etc.) involved in a crash. The data in this table goes back to April 2016, when crash reporting switched to an electronic system. The Motor Vehicle Collisions data tables contain information from all police-reported motor vehicle collisions in NYC. The police report (MV-104AN) is required to be filled out for collisions where someone is injured or killed, or where there is at least $1000 worth of damage (https://www.nhtsa.gov/sites/nhtsa.dot.gov/files/documents/ny_overlay_mv-104an_rev05_2004.pdf). It should be noted that the data is preliminary and subject to change when the MV-104AN forms are amended based on revised crash details. Due to the success of the CompStat program, NYPD began to ask how to apply the CompStat principles to other problems. Other than homicides, the fatal incidents with which police have the most contact with the public are fatal traffic collisions. Therefore, in April 1998, the Department implemented TrafficStat, which uses the CompStat model to work towards improving traffic safety. Police officers complete form MV-104AN for all vehicle collisions. The MV-104AN is a New York State form that has all of the details of a traffic collision. Before implementing TrafficStat, there was no uniform traffic safety data collection procedure for all of the NYPD precincts. Therefore, the Police Department implemented the Traffic Accident Management System (TAMS) in July 1999 in order to collect traffic data in a uniform method across the City. TAMS required the precincts to manually enter a few selected MV-104AN fields to collect very basic intersection traffic crash statistics, which included the number of accidents, injuries and fatalities. As the years progressed, there grew a need for additional traffic data so that more detailed analyses could be conducted.
The citywide traffic safety initiative, Vision Zero, started in 2014. Vision Zero further emphasized the need for the collection of more traffic data in order to work towards the Vision Zero goal, which is to eliminate traffic fatalities. Therefore, in March 2016 the Department replaced TAMS with the new Finest Online Records Management System (FORMS). FORMS enables police officers to electronically enter all of the MV-104AN data fields, using a Department cellphone or computer, and stores them in the Department’s crime data warehouse. Since all of the MV-104AN data fields are now stored for each traffic collision, detailed traffic safety analyses can be conducted as applicable.
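Because each row of the person table is one person involved in a crash, per-crash figures have to be obtained by grouping rows. The sketch below shows the idea on a small inline sample. The column names COLLISION_ID and PERSON_TYPE and the sample rows are assumptions for illustration; check the published data dictionary before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: one row per person, keyed by the crash they were involved in.
SAMPLE_CSV = """COLLISION_ID,PERSON_TYPE
1001,Occupant
1001,Pedestrian
1002,Occupant
1002,Occupant
1002,Bicyclist
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Number of people recorded for each crash.
people_per_crash = Counter(row["COLLISION_ID"] for row in rows)

# Number of people of each type across all crashes.
people_by_type = Counter(row["PERSON_TYPE"] for row in rows)

for collision_id, count in sorted(people_per_crash.items()):
    print(collision_id, count)
```

For the real export, the same grouping works by replacing the inline sample with an opened CSV file.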
Petition, Signature, and Response Data from "We the People"
datalumos.org
Updated Feb 17, 2017
+ more versions
White House (2017). Petition, Signature, and Response Data from "We the People" [Dataset]. http://doi.org/10.3886/E100456V3
Explore at:
Unique identifier
https://doi.org/10.3886/E100456V3
Dataset updated
Feb 17, 2017
Dataset authored and provided by
White House
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Sep 22, 2011 - Oct 1, 2016
Description
On September 22, 2011, the White House launched "We the People" (petitions.whitehouse.gov), a site where people could create and sign petitions asking the president and his administration to do (or not do) various things. During the Obama administration, petitions that received over a certain number of signatures generally received a response from the White House. The Obama administration released bulk SQL dumps of the petition, signature, and response data periodically (generally quarterly). Three of those bulk SQL dumps are archived here. The site remained live under the Trump administration, although the SQL dumps were no longer released. A project at Grinnell College collected data from the "We the People" API nightly and loaded it into a searchable online database (https://dasil.sites.grinnell.edu/political-science/we-the-people-data-explorer/). A bulk SQL dump of that database is also archived here. It contains essentially all petition, signature, and response data from the site's launch until it was discontinued by the Biden administration on January 20, 2021. Some data may be missing from the middle of the day on January 20, 2017. The data ends at approximately 1:45am EST on January 20, 2021. Additional information about the data can be found here: https://petitions.trumpwhitehouse.archives.gov/developers/get-code

Unique Data (2023). Human Tracking & Object Detection Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/people-tracking

Human Tracking & Object Detection Dataset

Overhead Video Frames with People's Tracking, Object Detection dataset

Explore at:

Available download formats: zip (46156442 bytes)

Dataset updated

Jul 27, 2023

Authors

Unique Data

License

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically

Description

People Tracking & Object Detection dataset

The dataset comprises annotated video frames from a camera positioned in a public space. Each individual in the camera's view was tracked using the rectangle tool in the Computer Vision Annotation Tool (CVAT).

The dataset is based on the Real-Time Traffic Video Dataset.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Fc5a8dc4f63fe85c64a5fead10fad3031%2Fpersons_gif.gif?generation=1690705558283123&alt=media

Dataset Structure

The images directory houses the original video frames, serving as the primary source of raw data.
The annotations.xml file provides the detailed annotation data for the images.
The boxes directory contains frames that visually represent the bounding box annotations, showing the locations of the tracked individuals within each frame. These images can be used to understand how the tracking has been implemented and to visualize the marked areas for each individual.

Data Format

The annotations are rectangular bounding boxes placed around each individual. Each bounding box annotation contains the position of the box within the frame as xtl, ytl, xbr, and ybr coordinates. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F4f274551e10db2754c4d8a16dff97b33%2Fcarbon%20(10).png?generation=1687776281548084&alt=media
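A minimal sketch of reading such boxes with Python's standard library follows. The xtl, ytl, xbr, and ybr attribute names come from the description; the surrounding layout (box elements nested inside track elements, with frame and label attributes) is an assumption based on common CVAT video exports and may differ from the actual annotations.xml, so the inline sample is purely illustrative.

```python
import xml.etree.ElementTree as ET

# Illustrative XML only; the real annotations.xml may be organised differently.
SAMPLE_XML = """<annotations>
  <track id="0" label="person">
    <box frame="0" xtl="10.0" ytl="20.0" xbr="50.0" ybr="120.0"/>
    <box frame="1" xtl="12.0" ytl="21.0" xbr="52.0" ybr="121.0"/>
  </track>
</annotations>"""


def read_boxes(xml_text):
    """Return one dict per <box> element with its corners, width, and height."""
    root = ET.fromstring(xml_text)
    boxes = []
    # iter() walks the whole tree, so the nesting above <box> does not matter here.
    for box in root.iter("box"):
        xtl, ytl, xbr, ybr = (
            float(box.get(name)) for name in ("xtl", "ytl", "xbr", "ybr")
        )
        boxes.append({
            "frame": box.get("frame"),  # None if the export has no frame attribute
            "xtl": xtl, "ytl": ytl, "xbr": xbr, "ybr": ybr,
            "width": xbr - xtl,    # top-left to bottom-right, horizontal extent
            "height": ybr - ytl,   # top-left to bottom-right, vertical extent
        })
    return boxes


boxes = read_boxes(SAMPLE_XML)
for b in boxes:
    print(b["frame"], b["width"], b["height"])
```

To read the file itself, ET.parse("annotations.xml").getroot() could replace ET.fromstring, with the path adjusted to wherever the archive was extracted.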

👉 Legally sourced datasets and carefully structured for AI training and model development. Explore samples from our dataset of 95,000+ human images & videos - Full dataset

🚀 You can learn more about our high-quality unique datasets here

keywords: multiple people tracking, human detection dataset, object detection dataset, people tracking dataset, tracking human object interactions, human Identification tracking dataset, people detection annotations, detecting human in a crowd, human trafficking dataset, deep learning object tracking, multi-object tracking dataset, labeled web tracking dataset, large-scale object tracking dataset

