85 datasets found
  1. Biological Data Of Human Evolution Data Sets

    • kaggle.com
    Updated Mar 16, 2024
    Cite
    SantiagoCostabile (2024). Biological Data Of Human Evolution Data Sets [Dataset]. https://www.kaggle.com/datasets/santiago123678/biological-data-of-human-ancestors-data-sets
    Dataset updated
    Mar 16, 2024
    Dataset provided by
    Kaggle
    Authors
    SantiagoCostabile
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Homininos_DataSet(1).csv is the original file. Homininos_DataSet.csv already has the categorical values encoded.

    Exploring Human Evolution Through a Comprehensive Dataset

    Introduction:

    In this dataset, we delve into the fascinating story of human evolution. With 720 rows and 28 columns, this dataset covers a wide range of characteristics of different hominids, from the earliest consensual ancestors to modern Homo sapiens. This comprehensive compilation aims to facilitate the search for relationships between various key variables, thereby providing a more complete and detailed understanding of human evolution.

    Objectives:

    The main objective of this dataset is to facilitate the exploration and understanding of human evolution from a broader and more detailed perspective. Some specific objectives include:

    • Seeking relationships between important columns of the dataset.
    • Understanding human evolution considering the collected data.
    • Investigating the possible linearity of evolution over time.
    • Analyzing potential relationships between brain size, developed technologies, diet, and physiological modifications over time.

    Significance:

    This dataset is crucial for advancing our understanding of human evolution and history. It provides a solid foundation for research in various fields, from anthropology and evolutionary biology to archaeology and genetics. By allowing us to examine relationships and patterns among different variables, this dataset helps us trace the course of human evolution and gain a better understanding of our place in the tree of life.

    Conclusions:

    In summary, this comprehensive dataset provides us with a valuable tool for exploring human evolution in depth. With its numerous rows and columns, it allows us to delve into the complexity and diversity of our evolutionary history. By analyzing and understanding the collected data, we can gain new insights into how we have come to be what we are today and how our species has evolved over time.

    This dataset not only expands our knowledge of human evolution but also inspires us to continue researching and discovering more about our shared past as a species.

    I studied Biological Anthropology for 4 years at the National University of La Plata, and I had the opportunity to compile these data from classes and books such as Carbonell's "Homínidos: las primeras ocupaciones de los continentes," published in 2005.

    INFO About Columns:

    Genus & Species: (categorical) This column contains the genus and specific name of the species. It provides taxonomic information about each hominid included in the dataset, allowing for precise identification.

    Time : (categorical) This column indicates the time period during which each hominid species lived. It helps to establish chronological context and understand the temporal distribution of different hominid groups.

    Location: (categorical) This column records the continent location where each hominid species lived.

    Zone: (categorical) Describes either east, west, south or north of the continent

    Current Country: (categorical) Records the modern-day country associated with the location where each hominid species lived, facilitating geographical comparisons.

    Habitat: (categorical) This column describes the typical habitat or environment inhabited by each hominid species. It provides information about the ecological niche and adaptation strategies of different hominids throughout history.

    Cranial Capacity: (numeric) This column provides data on the cranial capacity of each hominid species. Cranial capacity is a key indicator of brain size and can offer insights into cognitive abilities and evolutionary trends.

    Height: (numeric) Describes the average height or stature of each hominid species

    Incisor Size: (categorical) Indicates the size of the incisors in each hominid species

    Jaw Shape: (categorical) Describes the shape or morphology of the jaw in each hominid species

    Torus Supraorbital: (categorical) Specifies the shape or morphology of a supraorbital torus in each hominid species

    Prognathism: (categorical) Indicates the degree of facial prognathism or protrusion in each hominid species

    Foramen Mágnum Position: (categorical) Describes the position of the foramen magnum in each hominid species

    Canine Size: (categorical) Indicates the size of the canines in each hominid species

    Canines Shape: (categorical) Describes the shape of the canines in each hominid species, providing information about their dietary adaptations and social behavior.

    Tooth Enamel: (categorical) Specifies the characteristics of tooth enamel in each hominid species, which may indicate aspects of dietary ecology and dental health.

    Tecno: (categorical) Records the presence or absence of technological advancements

    Tecno Type: (categorical) Describes the specific type or style of technology associated with each hom...

  2. The Human Know-How Dataset

    • dtechtive.com
    • find.data.gov.scot
    pdf, zip
    Updated Apr 29, 2016
    Cite
    (2016). The Human Know-How Dataset [Dataset]. http://doi.org/10.7488/ds/1394
    Available download formats: pdf (0.0582 MB), zip (19.67 MB), zip (0.0298 MB), zip (9.433 MB), zip (13.06 MB), zip (0.2837 MB), zip (5.372 MB), zip (69.8 MB), zip (20.43 MB), zip (5.769 MB), zip (14.86 MB), zip (19.78 MB), zip (43.28 MB), zip (62.92 MB), zip (92.88 MB), zip (90.08 MB)
    Dataset updated
    Apr 29, 2016
    Description

    The Human Know-How Dataset describes 211,696 human activities from many different domains. These activities are decomposed into 2,609,236 entities (each with an English textual label). These entities represent over two million actions and half a million pre-requisites. Actions are interconnected both according to their dependencies (temporal/logical orders between actions) and decompositions (decomposition of complex actions into simpler ones). This dataset has been integrated with DBpedia (259,568 links).

    For more information see:
    - The project website: http://homepages.inf.ed.ac.uk/s1054760/prohow/index.htm
    - The data is also available on datahub: https://datahub.io/dataset/human-activities-and-instructions

    * Quickstart: if you want to experiment with the most high-quality data before downloading all the datasets, download the file '9of11_knowhow_wikihow', and optionally files 'Process - Inputs', 'Process - Outputs', 'Process - Step Links' and 'wikiHow categories hierarchy'.
    * Data representation based on the PROHOW vocabulary: http://w3id.org/prohow# Data extracted from existing web resources is linked to the original resources using the Open Annotation specification.
    * Data Model: an example of how the data is represented within the datasets is available in the attached Data Model PDF file. The attached example represents a simple set of instructions, but instructions in the dataset can have more complex structures. For example, instructions could have multiple methods, steps could have further sub-steps, and complex requirements could be decomposed into sub-requirements.

    Statistics:
    * 211,696: number of instructions. From wikiHow: 167,232 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow). From Snapguide: 44,464 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide).
    * 2,609,236: number of RDF nodes within the instructions. From wikiHow: 1,871,468 (datasets 1of11_knowhow_wikihow to 9of11_knowhow_wikihow). From Snapguide: 737,768 (datasets 10of11_knowhow_snapguide to 11of11_knowhow_snapguide).
    * 255,101: number of process inputs linked to 8,453 distinct DBpedia concepts (dataset Process - Inputs)
    * 4,467: number of process outputs linked to 3,439 distinct DBpedia concepts (dataset Process - Outputs)
    * 376,795: number of step links between 114,166 different sets of instructions (dataset Process - Step Links)
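
    As a quick consistency check on the statistics above, the per-source counts should add up to the stated totals. The short Python sketch below only re-adds the numbers quoted in the description; the variable names are illustrative and are not part of the dataset.

```python
# Per-source counts quoted in the dataset description.
instructions = {"wikiHow": 167_232, "Snapguide": 44_464}
rdf_nodes = {"wikiHow": 1_871_468, "Snapguide": 737_768}

total_instructions = sum(instructions.values())
total_rdf_nodes = sum(rdf_nodes.values())

print(total_instructions)  # 211696
print(total_rdf_nodes)     # 2609236
```

    Both sums match the totals given in the description (211,696 instructions and 2,609,236 RDF nodes).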

  3. Dataset of books series that contain The first people : from the earliest...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of books series that contain The first people : from the earliest primates to homo sapiens : where and how our ancestors lived [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=The+first+people+:+from+the+earliest+primates+to+homo+sapiens+:+where+and+how+our+ancestors+lived&j=1&j0=books
    Dataset updated
    Nov 25, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered to the book "The first people : from the earliest primates to homo sapiens : where and how our ancestors lived". It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  4. Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human...

    • data.nist.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +2more
    Updated Oct 23, 2020
    + more versions
    Cite
    Brian DeCost (2020). Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models [Dataset]. http://doi.org/10.18434/mds2-2301
    Dataset updated
    Oct 23, 2020
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Brian DeCost
    License

    https://www.nist.gov/open/license

    Description

    The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrated Materials and Manufacturing Innovations. Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
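
    The idea of scoring a model against consensus labels and their variance can be illustrated with a short sketch. The code below assumes a Gaussian model of expert disagreement (the mean and variance of the expert labels) and scores a prediction by its log-likelihood. This is an illustrative assumption and not necessarily the method used in the manuscript; the function name and the label values are made up.

```python
import math
from statistics import mean, pvariance

def gaussian_log_likelihood(prediction, expert_labels):
    """Log-likelihood of a prediction under a Gaussian fitted to expert labels."""
    mu = mean(expert_labels)
    var = pvariance(expert_labels, mu)
    if var <= 0:
        raise ValueError("expert labels must not all be identical")
    return -0.5 * math.log(2 * math.pi * var) - (prediction - mu) ** 2 / (2 * var)

# Hypothetical expert labels for the start of a phase transformation (arbitrary units).
experts = [10.0, 12.0, 11.0, 15.0]
close = gaussian_log_likelihood(12.0, experts)  # prediction at the consensus mean
far = gaussian_log_likelihood(20.0, experts)    # prediction far from the consensus
print(close > far)  # True
```

    Under this assumption, a prediction is penalized less when the experts themselves disagree widely, which is the intuition behind evaluating against consensus with uncertainty.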

  5. PLACE OF BIRTH - DP02_DES_T - Dataset - CKAN

    • portal.tad3.org
    Updated Nov 18, 2024
    + more versions
    Cite
    (2024). PLACE OF BIRTH - DP02_DES_T - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/place-of-birth-dp02_des_t
    Dataset updated
    Nov 18, 2024
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    SELECTED SOCIAL CHARACTERISTICS IN THE UNITED STATES
    PLACE OF BIRTH - DP02
    Universe - Total population
    Survey-Program - American Community Survey 5-year estimates
    Years - 2020, 2021, 2022

    People not reporting a place of birth were assigned the state or country of birth of another family member, or were allocated the response of another individual with similar characteristics. People born outside the United States were asked to report their place of birth according to current international boundaries. Since numerous changes in boundaries of foreign countries have occurred in the last century, some people may have reported their place of birth in terms of boundaries that existed at the time of their birth or emigration, or in accordance with their own national preference.

  6. Evolution of Humans DataSets for Clasification

    • kaggle.com
    Updated Mar 12, 2024
    Cite
    SantiagoCostabile (2024). Evolution of Humans DataSets for Clasification [Dataset]. https://www.kaggle.com/datasets/santiago123678/evolution-of-humans-datasets-for-clasification/discussion
    Dataset updated
    Mar 12, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SantiagoCostabile
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    In this dataset, we delve into the fascinating story of human evolution. With 12000 rows and 28 columns, this dataset covers a wide range of characteristics of different hominids, from the earliest consensual ancestors to modern Homo sapiens. This comprehensive compilation aims to facilitate the search for relationships between various key variables, thereby providing a more complete and detailed understanding of human evolution.

    Objectives: The objective is to predict either the genus and species or whether the hominid was bipedal. A further objective is to avoid overfitting the model, because several models show signs of overfitting.
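
    One common way to look for this problem is to hold out part of the rows and compare training accuracy with held-out accuracy; a large gap suggests the model has memorized the training data. The sketch below uses only the Python standard library, made-up rows, and a deliberately memorizing lookup "model". The function names and the two-field rows are hypothetical and stand in for the real 28-column data.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_lookup(train):
    """A memorizing 'model': maps each seen feature value to its label."""
    return {feature: label for feature, label in train}

def accuracy(model, rows, default="yes"):
    """Fraction of rows whose label the model reproduces (unseen features get default)."""
    hits = sum(model.get(feature, default) == label for feature, label in rows)
    return hits / len(rows)

# Hypothetical (feature value, biped label) rows, for illustration only.
rows = [(400 + 10 * i, "yes" if i % 3 else "no") for i in range(40)]
train, test = train_test_split(rows)
model = fit_lookup(train)
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)
print(f"train={train_acc:.2f} test={test_acc:.2f} gap={train_acc - test_acc:.2f}")
```

    The lookup model scores perfectly on the rows it has seen, so any shortfall on the held-out rows shows up directly as the gap.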

    About the Data:

    Genus & Species: (categorical) This column contains the genus and specific name of the species. It provides taxonomic information about each hominid included in the dataset, allowing for precise identification.

    Time : (categorical) This column indicates the time period during which each hominid species lived. It helps to establish chronological context and understand the temporal distribution of different hominid groups.

    Location: (categorical) This column records the continent location where each hominid species lived.

    Zone: (categorical) Describes either east, west, south or north of the continent

    Current Country: (categorical) Records the modern-day country associated with the location where each hominid species lived, facilitating geographical comparisons.

    Habitat: (categorical) This column describes the typical habitat or environment inhabited by each hominid species. It provides information about the ecological niche and adaptation strategies of different hominids throughout history.

    Cranial Capacity: (numeric) This column provides data on the cranial capacity of each hominid species. Cranial capacity is a key indicator of brain size and can offer insights into cognitive abilities and evolutionary trends.

    Height: (numeric) Describes the average height or stature of each hominid species

    Incisor Size: (categorical) Indicates the size of the incisors in each hominid species

    Jaw Shape: (categorical) Describes the shape or morphology of the jaw in each hominid species

    Torus Supraorbital: (categorical) Specifies the shape or morphology of a supraorbital torus in each hominid species

    Prognathism: (categorical) Indicates the degree of facial prognathism or protrusion in each hominid species

    Foramen Mágnum Position: (categorical) Describes the position of the foramen magnum in each hominid species

    Canine Size: (categorical) Indicates the size of the canines in each hominid species

    Canines Shape: (categorical) Describes the shape of the canines in each hominid species, providing information about their dietary adaptations and social behavior.

    Tooth Enamel: (categorical) Specifies the characteristics of tooth enamel in each hominid species, which may indicate aspects of dietary ecology and dental health.

    Tecno: (categorical) Records the presence or absence of technological advancements

    Tecno Type: (categorical) Describes the specific type or style of technology associated with each hominid species

    Biped: (categorical) Indicates whether each hominid species exhibited bipedal locomotion, a key characteristic distinguishing humans from other primates.

    Arms: (categorical) Describes the morphology or characteristics of the arms in each hominid species, offering insights into their locomotor adaptations and manual dexterity.

    Foots: (categorical) Specifies the morphology or characteristics of the feet in each hominid species, providing information about their locomotor adaptations and foot anatomy.

    Diet: (categorical) Characterizes the dietary habits or preferences of each hominid species

    Sexual Dimorphism: (categorical) Indicates the degree of sexual dimorphism

    Hip: (categorical) Describes the size of the hip in each hominid species

    Vertical Front: (categorical) Specifies the presence or absence of verticality or curvature of the frontal bone in each hominid species, providing information about their cranial morphology.

    Anatomy: (categorical) Provides additional information about the anatomical features or characteristics of each hominid species, aiding in comprehensive morphological analyses.

    Migrated: (categorical) Indicates whether each hominid species exhibited migration or movement to different geographical areas, offering insights into their dispersal patterns and population dynamics.

    Skeleton: (categorical) Describes additional information about anatomy
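
    Because most of the columns listed above are categorical, they usually need a numeric encoding before modelling. The sketch below integer-encodes two of the categorical columns and leaves a numeric column as a number. Only the column names come from the description; the three sample rows and their values are invented, and the helper name `encode_column` is hypothetical.

```python
import csv
import io

# Invented sample rows; only the column names come from the description above.
SAMPLE = """Genus & Species,Habitat,Cranial Capacity
species A,forest,450
species B,savannah,900
species A,forest,470
"""

def encode_column(values):
    """Map each distinct category to an integer code in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values], codes

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
species_codes, species_map = encode_column([r["Genus & Species"] for r in rows])
habitat_codes, habitat_map = encode_column([r["Habitat"] for r in rows])
cranial = [float(r["Cranial Capacity"]) for r in rows]

print(species_codes)  # [0, 1, 0]
print(habitat_map)    # {'forest': 0, 'savannah': 1}
```

    Integer codes imply an ordering that the categories may not have, so a one-hot encoding may be a better fit for some models.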

  7. Event-Human3.6m

    • data.niaid.nih.gov
    • data.europa.eu
    Updated Sep 27, 2024
    Cite
    Arren Glover (2024). Event-Human3.6m [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7842597
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    Gaurvi Goyal
    Franco di Pietro
    Arren Glover
    Chiara Bartolozzi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The event-Human 3.6m is a synthetic conversion of the Human 3.6m dataset (H36m). H36m is an existing benchmark video dataset for Human Pose Estimation. It has been cropped optimally and converted to event streams. The final resolution of the samples is 640x480. The Ground Truth is adjusted accordingly. Code for this conversion is available at https://github.com/event-driven-robotics/hpe-core

    The dataset is split into several zip parts. To use it, download all the parts. On Linux systems, use the command

    cat h36m.z* > eh36m.zip

    then unzip normally.
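
    On systems without `cat`, the same concatenation can be done in Python. The sketch below assumes the parts match the `h36m.z*` pattern from the command above and that lexicographic order is the correct part order, as it is for the shell glob. The function name `join_parts` and the part names used in the demonstration are hypothetical.

```python
import glob
import os
import shutil
import tempfile

def join_parts(pattern, output_path):
    """Concatenate all files matching pattern, in sorted order, into output_path."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts

# Demonstration with tiny stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, data in [("h36m.z01", b"AAA"), ("h36m.z02", b"BBB"), ("h36m.zip", b"CCC")]:
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(data)
    out_path = os.path.join(tmp, "eh36m.zip")
    used = join_parts(os.path.join(tmp, "h36m.z*"), out_path)
    with open(out_path, "rb") as f:
        joined = f.read()

print([os.path.basename(p) for p in used])  # ['h36m.z01', 'h36m.z02', 'h36m.zip']
print(joined)  # b'AAABBBCCC'
```

    For the real download, the equivalent call from the download directory would be `join_parts("h36m.z*", "eh36m.zip")`.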

    S9 and S11 are test subjects; the rest are training splits. There is a .py file present to demonstrate reading a sample and GT from the dataset. If you use this dataset in your project, please cite the following paper:

    @inproceedings{goyal2023moveenet,
      title={MoveEnet: Online High-Frequency Human Pose Estimation with an Event Camera},
      author={Goyal, Gaurvi and Di Pietro, Franco and Carissimi, Nicolo and Glover, Arren and Bartolozzi, Chiara},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      pages={4023--4032},
      year={2023}
    }

    DOI: https://doi.org/10.1109/CVPRW59228.2023.00420

  8. Dataset of book subjects that contain The meaning of human existence

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of book subjects that contain The meaning of human existence [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=The+meaning+of+human+existence&j=1&j0=books
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 4 rows and is filtered to the book "The meaning of human existence". It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  9. PERU MIGRANT Study | Baseline and 5yr follow-up dataset

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated May 30, 2023
    Cite
    J. Jaime Miranda; Antonio Bernabe-Ortiz; Rodrigo Carrillo Larco (2023). PERU MIGRANT Study | Baseline and 5yr follow-up dataset [Dataset]. http://doi.org/10.6084/m9.figshare.4832612.v4
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    J. Jaime Miranda; Antonio Bernabe-Ortiz; Rodrigo Carrillo Larco
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Peru
    Description

    This is an update of a prior dataset publication containing baseline and 5-year follow-up data from the PERU MIGRANT Study (PEru's Rural to Urban MIGRANTs Study).

    The PERU MIGRANT Study was designed to investigate the magnitude of differences between rural-to-urban migrant and non-migrant groups in specific cardiovascular risk factors. Three groups were selected: i) Rural, people who have always lived in a rural environment; ii) Rural-urban, people who migrated from rural to urban areas; and iii) Urban, people who have always lived in an urban environment.

    PERU MIGRANT Study protocol, instruments and variables are described in full in:
    Miranda JJ, Gilman RH, García HH, Smeeth L. The effect on cardiovascular risk factors of migration from rural to urban areas in Peru: PERU MIGRANT Study. BMC Cardiovasc Disord 2009;9:23.

    PERU MIGRANT Study baseline dataset is available at: https://figshare.com/articles/PERU_MIGRANT_Study_Baseline_dataset/3125005

    Main findings of the baseline study:
    Miranda JJ, Gilman RH, Smeeth L. Differences in cardiovascular risk factors in rural, urban and rural-to-urban migrants in Peru. Heart 2011;97(10):787-96.

    Main findings of the 5-yr follow-up study:
    Carrillo-Larco RM, Bernabé-Ortiz A, Pillay TD, Gilman RH, Sanchez JF, Poterico JA, Quispe R, Smeeth L, Miranda JJ. Obesity risk in rural, urban and rural-to-urban migrants: prospective results of the PERU MIGRANT study. Int J Obes (Lond) 2016;40(1):181-5.
    Bernabe-Ortiz A, Sanchez JF, Carrillo-Larco RM, Gilman RH, Poterico JA, Quispe R, Smeeth L, Miranda JJ. Rural-to-urban migration and risk of hypertension: longitudinal results of the PERU MIGRANT study. J Hum Hypertens 2017;31(1):22-28.
    Lazo-Porras M, Bernabe-Ortiz A, Málaga G, Gilman RH, Acuña-Villaorduña A, Cardenas-Montero D, Smeeth L, Miranda JJ. Low HDL cholesterol as a cardiovascular risk factor in rural, urban, and rural-urban migrants: PERU MIGRANT cohort study. Atherosclerosis 2016;246:36-43.
    Burroughs Pena MS, Bernabé-Ortiz A, Carrillo-Larco RM, Sánchez JF, Quispe R, Pillay TD, Málaga G, Gilman RH, Smeeth L, Miranda JJ. Migration, urbanisation and mortality: 5-year longitudinal analysis of the PERU MIGRANT study. J Epidemiol Community Health 2015;69(7):715-8.

  10. Pre-existing conditions of people who died due to coronavirus (COVID-19),...

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jul 21, 2023
    + more versions
    Cite
    Office for National Statistics (2023). Pre-existing conditions of people who died due to coronavirus (COVID-19), England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/preexistingconditionsofpeoplewhodiedduetocovid19englandandwales
    Dataset updated
    Jul 21, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Pre-existing conditions of people who died due to COVID-19, broken down by country, broad age group, and place of death occurrence, usual residents of England and Wales.

  11. High resolution global dataset of human-provided food wastes in 2021

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 7, 2024
    Cite
    Chen, Xin (2024). High resolution global dataset of human-provided food wastes in 2021 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10616780
    Dataset updated
    Jul 7, 2024
    Dataset authored and provided by
    Chen, Xin
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    There is growing recognition that human-provided food resources are becoming increasingly available to animals across the globe (Oro et al., 2013). The food resources that are wasted by humans have influenced predators’ ecology and behavior and can indirectly affect their co-occurring species, leading to mostly negative ecological effects (Newsome et al., 2015). However, large increases have been found in the abundances of terrestrial mammalian predators such as coyotes (Canis latrans), cats (Felis catus) and red foxes (Vulpes vulpes), which are associated with their access to waste foods provided by humans (Denny et al., 2002; Fedriani et al., 2001; Shapira et al., 2008). Therefore, under anthropogenic global changes where human activities are continually expanding, spatially explicit data on waste foods are essential to assessing the ecological effects of anthropogenic food subsidies on species occurrences and abundances.

    The repository contains a global dataset consisting of four different variables to depict anthropogenic food waste index: household food waste (tons/year), food service food waste (tons/year), retail food waste (tons/year), and total human-provided food waste (tons/year). To produce the dataset, I first allocated the food waste estimates (kg/capita/year) to 30 arc-second grid cells for each county. The food waste estimates for 2021 were generated by normalizing different food waste measurements to a single metric (i.e., kg/capita/year), accounting for known biases or different scopes of measurement, and aggregating a series of studies or observations if multiple observations existed in a geographic entity of interest (United Nations Environment Programme 2021). The food waste estimates were then multiplied by the estimated population count for 2021 produced by Sims et al. 2022. The data files were produced as global rasters at 30 arc-second (~1km at the equator) resolution in geotiff format under WGS 84 geographical coordinate system.
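
    The per-cell calculation described above is a product of two quantities: a per-capita waste estimate and a population count. A minimal sketch follows, assuming that "tons" means metric tons (1 t = 1,000 kg) and that the total is the sum of the three sectors. The per-capita and population numbers are invented, and the function name is hypothetical.

```python
KG_PER_TON = 1000.0  # assumption: "tons" means metric tons

def cell_waste_tons(per_capita_kg_per_year, population):
    """Food waste in one grid cell, in tons/year."""
    return per_capita_kg_per_year * population / KG_PER_TON

# Invented values for a single 30 arc-second grid cell with 1,200 inhabitants.
population = 1200
household = cell_waste_tons(80.0, population)
food_service = cell_waste_tons(30.0, population)
retail = cell_waste_tons(10.0, population)
total = household + food_service + retail  # assumed to be the sum of the three sectors

print(household, food_service, retail, total)  # 96.0 36.0 12.0 144.0
```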

    Keywords: Anthropogenic food subsidies, human-provided food wastes, household food waste, food service food waste, retail food waste, food availability, anthropogenic global changes, human activities

    Reference:

    United Nations Environment Programme (2021). Food Waste Index Report 2021. Nairobi.

    Denny, E., Yaklovlevich, P., Eldridge, M.D.B. & Dickman, C.R. (2002) Social and genetic analysis of a population of free-living cats (Felis catus L.) exploiting a resource-rich habitat. Wildlife Research, 29, 405–413.

    Fedriani, J.M., Fuller, T.K. & Sauvajot, R.M. (2001) Does availability of anthropogenic food enhance densities of omnivorous mammals? An example with coyotes in southern California. Ecography, 24, 325–331.

    Newsome, T. M., Dellinger, J. A., Pavey, C. R., Ripple, W. J., Shores, C. R., Wirsing, A. J., & Dickman, C. R. (2015). The ecological effects of providing resource subsidies to predators. Global Ecology and Biogeography, 24, 1-11.

    Oro, D., Genovart, M., Tavecchia, G., Fowler, M. S., & Martínez‐Abraín, A. (2013). Ecological and evolutionary implications of food subsidies from humans. Ecology letters, 16(12), 1501-1514.

    Shapira, I., Sultan, H. & Shanas, U. (2008) Agricultural farming alters predator–prey interactions in nearby natural habitats. Animal Conservation, 11, 1–8.

    Sims, K., Reith, A., Bright, E., McKee, J., & Rose, A. (2022). LandScan Global 2021 [Data set]. Oak Ridge National Laboratory. https://doi.org/10.48690/1527702.

  12. US Broadband Usage Across Counties

    • kaggle.com
    Updated Jan 6, 2023
    Cite
    The Devastator (2023). US Broadband Usage Across Counties [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-broadband-usage-across-counties-and-zip-codes
    Dataset updated
    Jan 6, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Area covered
    United States
    Description

    US Broadband Usage Across Counties

    Utilizing Microsoft's Data to Estimate Access

    By Amber Thomas [source]

    About this dataset

    This dataset provides an estimation of broadband usage in the United States, focusing on how many people have access to broadband and how many are actually using it at broadband speeds. Through data collected by Microsoft from our services, including package size and total time of download, we can estimate the throughput speed of devices connecting to the internet across zip codes and counties.
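
    The throughput estimate described above amounts to download size divided by download time. The sketch below converts that to megabits per second and compares it with a 25 Mbps download threshold, which is assumed here as the broadband cutoff and is not stated in the text above. The function names and sample numbers are illustrative.

```python
BROADBAND_MBPS = 25.0  # assumed threshold; not stated in the description above

def throughput_mbps(package_bytes, download_seconds):
    """Estimated throughput in megabits per second."""
    if download_seconds <= 0:
        raise ValueError("download time must be positive")
    return package_bytes * 8 / 1_000_000 / download_seconds

def is_broadband_speed(mbps, threshold=BROADBAND_MBPS):
    """True when the measured throughput meets the assumed broadband threshold."""
    return mbps >= threshold

fast = throughput_mbps(50_000_000, 10)  # 50 MB downloaded in 10 s
slow = throughput_mbps(5_000_000, 10)   # 5 MB downloaded in 10 s
print(fast, is_broadband_speed(fast))   # 40.0 True
print(slow, is_broadband_speed(slow))   # 4.0 False
```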

    According to Federal Communications Commission (FCC) estimates, 14.5 million people don't have access to any kind of broadband connection. This dataset aims to address the gap between estimated availability and actual use by providing more accurate usage numbers downscaled to county and zip code levels. Who gets counted as having access is vastly important: it determines who gets included in public funding opportunities dedicated solely to closing the digital divide. The implications can be huge: millions around the country could remain invisible if these numbers aren't accurately reported or used properly in decision-making processes.

    This dataset includes aggregated information about these locations with fewer than 20 devices for increased accuracy when estimating broadband usage in the United States, allowing others to use it to develop solutions that improve internet access or to accurately label problem areas where no real or reliable connectivity exists, in communities large and small throughout the US mainland. Please review the license terms before using these data so that you adhere to the stipulations of Microsoft's Open Use of Data Agreement v1.0, for professional and educational uses alike.

    How to use the dataset

    How to Use the US Broadband Usage Dataset

    This dataset provides broadband usage estimates in the United States by county and zip code. It is ideally suited for research into how broadband connects households, towns and cities. Understanding this information is vital for closing existing disparities in access to high-speed internet, and for devising strategies for making sure all Americans can stay connected in a digital world.

    The dataset contains six columns:
    - County – The name of the county for which usage statistics are provided.
    - Zip Code (5-Digit) – The 5-digit zip code from which usage data was collected within that county or metropolitan area/micro area/divisions within states as reported by the US Census Bureau in 2018[2].
    - Population (Households) – Estimated number of households defined according to [3] based on data from the US Census Bureau American Community Survey's 5 Year Estimates[4].
    - Average Throughput (Mbps) – Average download speed in Mbps, derived from data collected from anonymous devices connected through Microsoft services such as Windows Update, Office 365, Xbox Live Core Services, etc.[5]
    - Percent Fast (> 25 Mbps) – Percentage of machines with throughput greater than 25 Mbps calculated using [6].
    - Percent Slow (< 3 Mbps) – Percentage of machines with throughput less than 3 Mbps calculated using [7].
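
    The two percentage columns described above can be illustrated with a minimal sketch. It assumes strict thresholds (greater than 25 Mbps, less than 3 Mbps) as worded in the column descriptions; the function name `throughput_shares` and the sample readings are made up for illustration and are not taken from the dataset or from Microsoft's methodology.

```python
def throughput_shares(mbps_readings, fast_above=25.0, slow_below=3.0):
    """Return (percent_fast, percent_slow) for throughput readings in Mbps.

    Thresholds are strict: a reading of exactly 25 Mbps is not "fast" and a
    reading of exactly 3 Mbps is not "slow" (an assumption based on the
    "> 25 Mbps" and "< 3 Mbps" wording of the column descriptions).
    """
    if not mbps_readings:
        raise ValueError("need at least one reading")
    total = len(mbps_readings)
    fast = sum(1 for r in mbps_readings if r > fast_above)
    slow = sum(1 for r in mbps_readings if r < slow_below)
    return 100.0 * fast / total, 100.0 * slow / total


# Made-up readings for one hypothetical zip code, in Mbps.
readings = [1.0, 2.0, 10.0, 30.0, 50.0]
percent_fast, percent_slow = throughput_shares(readings)
print(f"fast: {percent_fast:.1f}%  slow: {percent_slow:.1f}%")
```

    Readings between the two thresholds count toward neither share, so the two percentages need not sum to 100.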

    Research Ideas

    • Targeting marketing campaigns based on broadband use. Companies can use the geographic and demographic data in this dataset to create targeted advertising campaigns that are tailored to individuals living in areas where broadband access is scarce or lacking.
    • Creating an educational platform for those without reliable access to broadband internet. By leveraging existing technologies such as satellite internet, media streaming services like Netflix, and platforms such as Khan Academy or EdX, those with limited access could gain access to new educational options from home.
    • Establishing public-private partnerships between local governments and telecom providers, which need better data about gaps in service coverage and usage levels in order to make decisions about investments in new infrastructure buildouts for better connectivity options for rural communities.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    See the dataset description for more information.

    Columns

    File: broadband_data_2020October.csv

  13. Dataset of books called D.H. Lawrence and human existence

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called D.H. Lawrence and human existence [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=D.H.+Lawrence+and+human+existence
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is D.H. Lawrence and human existence. It features 7 columns including author, publication date, language, and book publisher.

  14. Human Foraging Experiment Dataset

    • data.niaid.nih.gov
    Updated May 31, 2021
    Cite
    Precious Held (2021). Human Foraging Experiment Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4864764
    Explore at:
    Dataset updated
    May 31, 2021
    Dataset authored and provided by
    Precious Held
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Chimpanzees are our closest living relatives and have been extensively used in research into the evolution of humans. Although chimpanzees and humans share many of the same cognitive abilities, how they compare in solving spatial tasks is unclear to date. Therefore, this study conducted a human physical simulation method that resembles foraging patterns of chimpanzees to enable comparing these spatiotemporal cognitive abilities. Furthermore, this study aimed to interpret animal movement and spatiotemporal cognitive abilities by relating revisit intervals to cognitive processes such as learning and memory. For this, two variables, constancy and contingency, have been used to reflect search efficiency, and their values were used to make inferences about the cognitive abilities of humans and chimpanzees. Ultimately, this study investigated how the average patterns in revisit constancy and contingency relate to the spatiotemporal cognitive abilities of chimpanzees, and how this compares to those of humans. These results are highly valuable in addressing the aforementioned existing knowledge gaps, but the novel simulation method additionally provides a great perspective for future research into animal movement. This dataset contains the data obtained from the human foraging experiment that was conducted for the Bachelor's thesis: "Using Recursive Movement Data to Study Animal Cognition: Assessing a New Method to Compare Spatiotemporal Intelligence of Humans and Chimpanzees".

  15. Google Capstone Project - BellaBeats

    • kaggle.com
    Updated Jan 5, 2023
    Cite
    Jason Porzelius (2023). Google Capstone Project - BellaBeats [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-capstone-project-bellabeats
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 5, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jason Porzelius
    Description

    Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally hosted database program, Excel for both my data analysis and visualizations. This choice was made primarily because I live in a remote area and have limited bandwidth and inconsistent internet access. Therefore, completing a capstone project using web-based programs such as R Studio, SQL Workbench, or Google Sheets was not a feasible choice. I was further limited in which option to choose as the datasets for the ride-share project option were larger than my version of Excel would accept. In the scenario provided, I will be acting as a Junior Data Analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in hopes that the findings from that dataset might reveal insights which will assist in Bellabeat's marketing strategies for future growth. My task is to provide data driven insights to business tasks provided by the Bellabeats, Inc.'s executive and data analysis team. In order to accomplish this task, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability. Those three sections are: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task using an asterisk (*) as an identifier.

    Section 1 - Ask: A. Guiding Questions: Who are the key stakeholders and what are their goals for the data analysis project? What is the business task that this data analysis project is attempting to solve?

    B. Key Tasks:
    Identify key stakeholders and their goals for the data analysis project.
    * The key stakeholders for this project are as follows:
    - Urška Sršen and Sando Mur, co-founders of Bellabeats, Inc.
    - Bellabeats marketing analytics team. I am a member of this team.
    Identify the business task.
    * The business task is:
    - As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers are using their non-BellaBeats smart devices in order to guide upcoming marketing strategies for the company which will help drive future growth. Specifically, the researcher was tasked with applying insights driven by the data analysis process to 1 BellaBeats product and presenting those insights to BellaBeats stakeholders.

    Section 2 - Prepare: A. Guiding Questions: Where is the data stored and organized? Are there any problems with the data? How does the data help answer the business question?

    B. Key Tasks:
    Research and communicate the source of the data, and how it is stored/organized, to stakeholders.
    * The data source used for our case study is FitBit Fitness Tracker Data. This dataset is stored in Kaggle and was made available through user Mobius in an open-source format. Therefore, the data is public and available to be copied, modified, and distributed, all without asking the user for permission. These datasets were generated by respondents to a distributed survey via Amazon Mechanical Turk reportedly (see credibility section directly below) between 03/12/2016 and 05/12/2016.
    * Reportedly (see credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled. This data was broken down into minute, hour, and day level totals. This data is stored in 18 CSV documents. I downloaded all 18 documents onto my local laptop and decided to use 2 documents for the purposes of this project as they were files which had merged activity and sleep data from the other documents. All unused documents were permanently deleted from the laptop. The 2 files used were:
    - sleepDay_merged.csv
    - dailyActivity_merged.csv
    Identify and communicate to stakeholders any problems found with the data related to credibility and bias.
    * As will be more specifically presented in the Process section, the data seems to have credibility issues related to the reported time frame of the data collected. The metadata seems to indicate that the data collected covered roughly 2 months of FitBit tracking. However, upon my initial data processing, I found that only 1 month of data was reported.
    * As will be more specifically presented in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata communicates that 30 individual users agreed to report their tracking data. My initial data processing uncovered 33 individual IDs in the dailyActivity_merged dataset.
    * Due to the small number of participants (...
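
    The two credibility checks described above (how many distinct IDs appear, and how many days the records span) can be sketched as follows. This is an illustration only: the author reports doing the analysis in Excel, the column names `Id` and `ActivityDate` and the M/D/YYYY date format are assumptions, and the sample rows are made up rather than taken from the Fitbit files.

```python
import csv
import io
from datetime import datetime

# Made-up stand-in for a daily activity file. The column names "Id" and
# "ActivityDate" and the M/D/YYYY date format are assumptions.
SAMPLE = """Id,ActivityDate,TotalSteps
1001,4/12/2016,13162
1001,4/13/2016,10735
1002,4/12/2016,8163
1003,5/12/2016,2564
"""


def summarize(csv_text):
    """Return (number of distinct IDs, first date, last date) in the data."""
    ids = set()
    dates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ids.add(row["Id"])
        dates.append(datetime.strptime(row["ActivityDate"], "%m/%d/%Y").date())
    return len(ids), min(dates), max(dates)


n_ids, first_day, last_day = summarize(SAMPLE)
span_days = (last_day - first_day).days + 1  # inclusive of both end dates
print(n_ids, first_day, last_day, span_days)
```

    Comparing `n_ids` and `span_days` against the counts stated in the metadata is one way to surface the kind of mismatch reported here.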

  16. Meta data and supporting documentation

    • catalog.data.gov
    Updated Nov 12, 2020
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Meta data and supporting documentation [Dataset]. https://catalog.data.gov/dataset/meta-data-and-supporting-documentation
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    We include a description of the data sets in the meta-data as well as sample code and results from a simulated data set. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The R code is available online here: https://github.com/warrenjl/SpGPCW.

    Format:

    Abstract: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    File format: R workspace file.

    Metadata (including data dictionary):
    • y: Vector of binary responses (1: preterm birth, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate).

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
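
    The per-week standardization described above (subtract the week's median, divide by the week's IQR) can be sketched in a few lines. This is not the authors' code, which is in R at the repository linked above; the function name `standardize_by_week`, the toy exposure values, and the choice of the "inclusive" quartile convention are all assumptions made for illustration.

```python
from statistics import median, quantiles


def standardize_by_week(z):
    """Standardize each column (week) of z by that week's median and IQR.

    z is a list of rows, one row per individual and one column per week.
    The quartile convention (method="inclusive") is an assumption; other
    conventions give slightly different IQRs on small samples.
    """
    n_weeks = len(z[0])
    out = [[0.0] * n_weeks for _ in z]
    for w in range(n_weeks):
        column = [row[w] for row in z]
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1
        if iqr == 0:
            raise ValueError(f"week {w} has zero IQR")
        med = median(column)
        for i, value in enumerate(column):
            out[i][w] = (value - med) / iqr
    return out


# Toy exposures: 5 hypothetical individuals, 2 weeks.
z_raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
z_std = standardize_by_week(z_raw)
print(z_std)
```

    Because the medians and IQRs are withheld from the released data, the standardized values cannot be mapped back to raw exposure levels, which is the confidentiality point made in the description.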

  17. CT School Learning Model Indicators by County (14-day metrics) - ARCHIVE

    • data.ct.gov
    • catalog.data.gov
    application/rdfxml +5
    Updated Aug 5, 2021
    Cite
    CT DPH (2021). CT School Learning Model Indicators by County (14-day metrics) - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/CT-School-Learning-Model-Indicators-by-County-14-d/e4bh-ax24
    Explore at:
    Available download formats: application/rdfxml, xml, tsv, json, csv, application/rssxml
    Dataset updated
    Aug 5, 2021
    Dataset authored and provided by
    CT DPH
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Area covered
    Connecticut
    Description

    NOTE: This dataset pertains only to the 2020-2021 school year and is no longer being updated. For additional data on COVID-19, visit data.ct.gov/coronavirus.

    This dataset includes the leading and secondary metrics identified by the Connecticut Department of Health (DPH) and the Department of Education (CSDE) to support local district decision-making on the level of in-person, hybrid (blended), and remote learning model for Pre K-12 education.

    Data represent daily averages for two-week periods by date of specimen collection (cases and positivity), date of hospital admission, or date of ED visit. Hospitalization data come from the Connecticut Hospital Association and are based on hospital location, not county of patient residence. COVID-19-like illness includes fever and cough or shortness of breath or difficulty breathing or the presence of coronavirus diagnosis code and excludes patients with influenza-like illness. All data are preliminary.

    These data are updated weekly and reflect the previous two full Sunday-Saturday (MMWR) weeks (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf).

    These metrics were adapted from recommendations by the Harvard Global Institute and supplemented by existing DPH measures.

    For national data on COVID-19, see COVID View, the national weekly surveillance summary of U.S. COVID-19 activity, at https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

    DPH note about change from 7-day to 14-day metrics: Prior to 10/15/2020, these metrics were calculated using a 7-day average rather than a 14-day average. The 7-day metrics are no longer being updated as of 10/15/2020 but the archived dataset can be accessed here: https://data.ct.gov/Health-and-Human-Services/CT-School-Learning-Model-Indicators-by-County/rpph-4ysy

    As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.
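
    The stability argument for 14-day over 7-day rates can be illustrated with a small sketch. It assumes a simple trailing average over daily counts; the exact calculation DPH uses is not given here, and the function name `trailing_mean` and the daily counts are made up for illustration.

```python
def trailing_mean(values, window):
    """Mean of each full trailing window of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    means = [running / window]
    for i in range(window, len(values)):
        # Slide the window forward by one day.
        running += values[i] - values[i - window]
        means.append(running / window)
    return means


# Made-up daily case counts for a small town: zero except one isolated spike.
daily = [0] * 28
daily[20] = 14

seven_day = trailing_mean(daily, 7)
fourteen_day = trailing_mean(daily, 14)
print(max(seven_day), max(fourteen_day))
```

    In this toy series the same one-day spike moves the 7-day average twice as far as the 14-day average, which is the sense in which the longer window is more stable for small populations.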

    With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

  18. Multi-Human Interactive Talking Dataset

    • dataverse.harvard.edu
    Updated May 23, 2025
    Cite
    Zeyu Zhu; Weijia Wu (2025). Multi-Human Interactive Talking Dataset [Dataset]. http://doi.org/10.7910/DVN/FJ8MRB
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 23, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Zeyu Zhu; Weijia Wu
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Existing studies on talking video generation have predominantly focused on single-person monologues or isolated facial animations, limiting their applicability to realistic multi-human interactions. To bridge this gap, we introduce MIT, a large-scale dataset specifically designed for multi-human talking video generation. To this end, we develop an automatic pipeline that collects and annotates multi-person conversational videos. The resulting dataset comprises 12 hours of high-resolution footage, each featuring two to four speakers, with fine-grained annotations of body poses and speech interactions. It captures natural conversational dynamics in multi-speaker scenarios, offering a rich resource for studying interactive visual behaviors. To demonstrate the potential of MIT, we further propose CovOG, a baseline model for this novel task. It integrates a Multi-Human Pose Encoder (MPE) to handle varying numbers of speakers by aggregating individual pose embeddings, and an Interactive Audio Driver (IAD) to modulate head dynamics based on speaker-specific audio features. Together, these components showcase the feasibility and challenges of generating realistic multi-human talking videos, establishing MIT as a valuable benchmark for future research. The dataset and code will be publicly available.

  19. Data from: DATA MINING THE GALAXY ZOO MERGERS

    • catalog.data.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +5more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). DATA MINING THE GALAXY ZOO MERGERS [Dataset]. https://catalog.data.gov/dataset/data-mining-the-galaxy-zoo-mergers
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    DATA MINING THE GALAXY ZOO MERGERS

    STEVEN BAEHR, ARUN VEDACHALAM, KIRK BORNE, AND DANIEL SPONSELLER

    Abstract. Collisions between pairs of galaxies usually end in the coalescence (merger) of the two galaxies. Collisions and mergers are rare phenomena, yet they may signal the ultimate fate of most galaxies, including our own Milky Way. With the onset of massive collection of astronomical data, a computerized and automated method will be necessary for identifying those colliding galaxies worthy of more detailed study. This project researches methods to accomplish that goal. Astronomical data from the Sloan Digital Sky Survey (SDSS) and human-provided classifications on merger status from the Galaxy Zoo project are combined and processed with machine learning algorithms. The goal is to determine indicators of merger status based solely on discovering those automated pipeline-generated attributes in the astronomical database that correlate most strongly with the patterns identified through visual inspection by the Galaxy Zoo volunteers. In the end, we aim to provide a new and improved automated procedure for classification of collisions and mergers in future petascale astronomical sky surveys. Both information gain analysis (via the C4.5 decision tree algorithm) and cluster analysis (via the Davies-Bouldin Index) are explored as techniques for finding the strongest correlations between human-identified patterns and existing database attributes. Galaxy attributes measured in the SDSS green waveband images are found to represent the most influential of the attributes for correct classification of collisions and mergers. Only a nominal information gain is noted in this research; however, there is a clear indication of which attributes contribute, so that a direction for further study is apparent.
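
    For readers unfamiliar with the Davies-Bouldin Index mentioned in the abstract, the following is a minimal sketch of its standard textbook definition using Euclidean distance. It is not the authors' pipeline; the function names and the toy points are made up for illustration. Lower values indicate tighter, better-separated clusters.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    For each cluster, take the worst-case ratio of summed within-cluster
    scatter to between-centroid distance, then average over clusters.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = [centroid(pts) for pts in clusters]
    # Scatter: mean distance from each point to its own cluster centroid.
    scatter = [
        sum(dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centers)
    ]
    total = 0.0
    for i in range(len(clusters)):
        total += max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(len(clusters))
            if j != i
        )
    return total / len(clusters)


# Two toy clusters in 2-D, e.g. hypothetical "merger" and "non-merger" groups.
toy = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
print(davies_bouldin(toy))
```

    Computing the index separately on each candidate attribute, with clusters defined by the human-provided labels, is one way such a measure could rank attributes; whether the study did exactly this is not stated in the abstract.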

  20. Data from: HRI30: An Action Recognition Dataset for Industrial Human-Robot...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 25, 2022
    Cite
    Francesco Iodice; Elena De Momi; Arash Ajoudani (2022). HRI30: An Action Recognition Dataset for Industrial Human-Robot Interaction [Dataset]. http://doi.org/10.5281/zenodo.5833411
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Francesco Iodice; Elena De Momi; Arash Ajoudani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A thorough analysis of the existing human action recognition datasets demonstrates that only a few HRI datasets are available that target real-world applications, all of which are adapted to home settings. Therefore, given the shortage of datasets in industrial tasks, we aim to provide the community with a dataset created in a laboratory setting that includes actions commonly performed within manufacturing and service industries. In addition, the proposed dataset meets the requirements of deep learning algorithms for the development of intelligent learning models for action recognition and imitation in HRI applications.

Cite
SantiagoCostabile (2024). Biological Data Of Human Evolution Data Sets [Dataset]. https://www.kaggle.com/datasets/santiago123678/biological-data-of-human-ancestors-data-sets
Biological Data Of Human Evolution Data Sets

This dataset focuses on consensual hominids

Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 16, 2024
Dataset provided by
Kaggle
Authors
SantiagoCostabile
License

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically

Description

Homininos_DataSet(1).csv is the original. Homininos_DataSet.csv already has the categorical values encoded.

Exploring Human Evolution Through a Comprehensive Dataset

Introduction:

In this dataset, we delve into the fascinating story of human evolution. With 720 rows and 28 columns, this dataset covers a wide range of characteristics of different hominids, from the earliest consensual ancestors to modern Homo sapiens. This comprehensive compilation aims to facilitate the search for relationships between various key variables, thereby providing a more complete and detailed understanding of human evolution.

Objectives:

The main objective of this dataset is to facilitate the exploration and understanding of human evolution from a broader and more detailed perspective. Some specific objectives include:

• Seeking relationships between important columns of the dataset.
• Understanding human evolution considering the collected data.
• Investigating the possible linearity of evolution over time.
• Analyzing potential relationships between brain size, developed technologies, diet, and physiological modifications over time.

Significance:

This dataset is crucial for advancing our understanding of human evolution and history. It provides a solid foundation for research in various fields, from anthropology and evolutionary biology to archaeology and genetics. By allowing us to examine relationships and patterns among different variables, this dataset helps us trace the course of human evolution and gain a better understanding of our place in the tree of life.

Conclusions:

In summary, this comprehensive dataset provides us with a valuable tool for exploring human evolution in depth. With its numerous rows and columns, it allows us to delve into the complexity and diversity of our evolutionary history. By analyzing and understanding the collected data, we can gain new insights into how we have come to be what we are today and how our species has evolved over time.

This dataset not only expands our knowledge of human evolution but also inspires us to continue researching and discovering more about our shared past as a species.

I studied Biological Anthropology for 4 years at the National University of La Plata, and I had the opportunity to compile these data from classes and books such as Carbonell's "Homínidos: las primeras ocupaciones de los continentes," published in 2005.

INFO About Columns:

Genus & Species: (categorical) This column contains the genus and specific name of the species. It provides taxonomic information about each hominid included in the dataset, allowing for precise identification.

Time : (categorical) This column indicates the time period during which each hominid species lived. It helps to establish chronological context and understand the temporal distribution of different hominid groups.

Location: (categorical) This column records the continent location where each hominid species lived.

Zone: (categorical) Describes either east, west, south or north of the continent

Current Country: (categorical) Records the modern-day country associated with the location where each hominid species lived, facilitating geographical comparisons.

Habitat: (categorical) This column describes the typical habitat or environment inhabited by each hominid species. It provides information about the ecological niche and adaptation strategies of different hominids throughout history.

Cranial Capacity: (numeric) This column provides data on the cranial capacity of each hominid species. Cranial capacity is a key indicator of brain size and can offer insights into cognitive abilities and evolutionary trends.

Height: (numeric) Describes the average height or stature of each hominid species

Incisor Size: (categorical) Indicates the size of the incisors in each hominid species

Jaw Shape: (categorical) Describes the shape or morphology of the jaw in each hominid species

Torus Supraorbital: (categorical) Specifies the shape or morphology of a supraorbital torus in each hominid species

Prognathism: (categorical) Indicates the degree of facial prognathism or protrusion in each hominid species

Foramen Mágnum Position: (categorical) Describes the position of the foramen magnum in each hominid species

Canine Size: (categorical) Indicates the size of the canines in each hominid species

Canines Shape: (categorical) Describes the shape of the canines in each hominid species, providing information about their dietary adaptations and social behavior.

Tooth Enamel: (categorical) Specifies the characteristics of tooth enamel in each hominid species, which may indicate aspects of dietary ecology and dental health.

Tecno: (categorical) Records the presence or absence of technological advancements

Tecno Type: (categorical) Describes the specific type or style of technology associated with each hom...
