32 datasets found
  1. Capital Punishment in the United States, 1973-2010

    • catalog.data.gov
    • icpsr.umich.edu
    • +1 more
    Updated Mar 12, 2025
    + more versions
    Cite
    Bureau of Justice Statistics (2025). Capital Punishment in the United States, 1973-2010 [Dataset]. https://catalog.data.gov/dataset/capital-punishment-in-the-united-states-1973-2010
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    Bureau of Justice Statistics (http://bjs.ojp.gov/)
    Area covered
    United States
    Description

    CAPITAL PUNISHMENT IN THE UNITED STATES, 1973-2010 provides annual data on prisoners under a sentence of death, as well as those who had their sentences commuted or vacated and prisoners who were executed. This study examines basic sociodemographic classifications including age, sex, race and ethnicity, marital status at time of imprisonment, level of education, and State and region of incarceration. Criminal history information includes prior felony convictions and prior convictions for criminal homicide and the legal status at the time of the capital offense. Additional information is provided on those inmates removed from death row by yearend 2010. The dataset consists of one part which contains 9,058 cases. The file provides information on inmates whose death sentences were removed in addition to information on those inmates who were executed. The file also gives information about inmates who received a second death sentence by yearend 2010 as well as inmates who were already on death row.

  2. Judicial Executions

    • data.gov.sg
    Updated Jun 6, 2024
    Cite
    Singapore Prison Service (2024). Judicial Executions [Dataset]. https://data.gov.sg/datasets/d_f4081559b7db4f792a395138a540db1d/view
    Dataset updated
    Jun 6, 2024
    Dataset authored and provided by
    Singapore Prison Service (http://www.sps.gov.sg/)
    License

    https://data.gov.sg/open-data-licence

    Time period covered
    Jan 2007 - Dec 2022
    Description

    Dataset from Singapore Prison Service. For more information, visit https://data.gov.sg/datasets/d_f4081559b7db4f792a395138a540db1d/view

  3. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Oct 7, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Available download formats: zip, csv
    Dataset updated
    Oct 7, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Sep 28, 2025
    Description

    THIS DATASET WAS LAST UPDATED AT 2:10 AM EASTERN ON OCT. 7

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as incidents in which four or more people are killed, excluding the perpetrator. Of those, 33 were mass shootings. The summer of 2019 was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, similar to prior years. Although far less common, the nine public mass shootings during the year were the deadliest type of mass murder, resulting in the deaths of 73 people, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half of them by suicide.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state
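    For readers working from a CSV export rather than the hosted queries, the same counts can be sketched in pandas. The column names used here (`date`, `state`, `shooting`) are hypothetical stand-ins for whatever the export actually provides, and the rows are synthetic:

    ```python
    import pandas as pd

    # Illustrative rows standing in for an incident-level export; the real
    # column names in the data.world tables may differ (these are hypothetical).
    incidents = pd.DataFrame({
        "date": pd.to_datetime(["2019-08-03", "2019-08-04", "2020-02-26"]),
        "state": ["TX", "OH", "WI"],
        "num_killed": [23, 9, 5],       # victims, excluding the offender
        "shooting": [True, True, True],
    })

    # Mass killings by year (every row already meets the 4+ victims threshold).
    killings_by_year = incidents.groupby(incidents["date"].dt.year).size()

    # Mass shootings by year.
    shootings = incidents[incidents["shooting"]]
    shootings_by_year = shootings.groupby(shootings["date"].dt.year).size()

    # Incidents for a single state.
    texas_killings = incidents[incidents["state"] == "TX"]
    ```

    The same groupby/filter pattern extends to any other column in the export, such as victim-offender relationship or weapon type.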

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
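    The inclusion rule above can be expressed as a small predicate. This is only an illustrative sketch (the database itself is curated manually, and the function name is invented here):

    ```python
    from datetime import datetime, timedelta

    def qualifies_as_mass_murder(victim_deaths: int,
                                 first_death: datetime,
                                 last_death: datetime) -> bool:
        """Four or more victims (offenders and unborn children already
        excluded from the count) killed within a 24-hour window."""
        return (victim_deaths >= 4
                and (last_death - first_death) <= timedelta(hours=24))
    ```

    An incident with five deaths spread over two days would fail the 24-hour test, which is exactly how the definition separates mass murders from spree killings.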

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  4. The Women's Executions Database

    • zenodo.org
    bin, csv
    Updated Jul 30, 2025
    Cite
    Corina Schulze (2025). The Women's Executions Database [Dataset]. http://doi.org/10.5281/zenodo.16623213
    Available download formats: bin, csv
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Corina Schulze
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Presented here is a dataset containing all known executions of women carried out under civil authority. Many studies that mention gender use a dataset that estimates that about 365 women were executed in the U.S. between 1608 and 2002. The number of women executed in the U.S. since the 1600s is, in fact, higher than 700. The goal is to produce a dataset that encompasses experiences most relevant to women (e.g., histories of trauma, parenthood) in addition to providing variables that will allow for evidence-based quantitative research.

    Until I have completed my application with Zenodo, please refer to the larger project in which the data are housed: The Women's Executions Project.

  5. Event Dependence in Death Penalty Executions

    • datasetcatalog.nlm.nih.gov
    • dataverse.harvard.edu
    • +1 more
    Updated Sep 20, 2016
    Cite
    Box-Steffensmeier, Janet; Campbell, Benjamin; Baumgartner, Frank (2016). Event Dependence in Death Penalty Executions [Dataset]. http://doi.org/10.7910/DVN/LOMPZY
    Dataset updated
    Sep 20, 2016
    Authors
    Box-Steffensmeier, Janet; Campbell, Benjamin; Baumgartner, Frank
    Description

    This pre-analysis plan outlines a research strategy to test a "self-reinforcing" theory of death penalty executions, which holds that counties face decreasing marginal costs for executions. We test this theory through examining event dependence in executions among counties that have the death penalty. To test for the presence of these self-reinforcing processes in executions, and the exogenous factors that may explain executions, we utilize an event history model that accounts for event dependence. The empirical findings of this analysis may have profound consequences for how we understand executions. Evidence of event dependence would reveal that the main determinant of whether an individual is executed is the county's previous experience with execution, which would raise many important policy, legal, and moral questions.

  6. Data from: Executions in the United States, 1608-2002: The ESPY File

    • catalog.data.gov
    • icpsr.umich.edu
    • +1 more
    Updated Mar 12, 2025
    Cite
    Bureau of Justice Statistics (2025). Executions in the United States, 1608-2002: The ESPY File [Dataset]. https://catalog.data.gov/dataset/executions-in-the-united-states-1608-2002-the-espy-file-1635c
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    Bureau of Justice Statistics (http://bjs.ojp.gov/)
    Area covered
    United States
    Description

    This collection furnishes data on executions performed under civil authority in the United States between 1608 and 2002. The dataset describes each individual executed and the circumstances surrounding the crime for which the person was convicted. Variables include age, race, name, sex, and occupation of the offender, place, jurisdiction, date, and method of execution, and the crime for which the offender was executed. Also recorded are data on whether the only evidence for the execution was official records indicating that an individual (executioner or slave owner) was compensated for an execution.

  7. Prison Inmates in India

    • kaggle.com
    Updated Jan 4, 2023
    Cite
    The Devastator (2023). Prison Inmates in India [Dataset]. https://www.kaggle.com/datasets/thedevastator/prison-inmates-in-india-demographics-crimes-and
    Available as Croissant, a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 4, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Prison Inmates in India

    Demographics, Age, Education, Caste, Wages, Rehabilitation, Technical Info

    By Rajanand Ilangovan [source]

    About this dataset

    This dataset provides a detailed view of prison inmates in India, including their age, caste, and educational background. It covers inmates from all states and union territories for the year 2019, such as the number of male and female inmates aged 16-18 years, 18-30 years, and above 50 years. The data also covers the total number of prisoners sentenced to death, sentenced to life imprisonment, or executed by state authorities. Additionally, it records the crime head (type of crime) committed by each inmate, along with grand totals across age groups. The dataset not only sheds light on India's criminal justice system but also highlights the prevalence of crimes across different states and union territories, and provides insight into crime trends over time.


    How to use the dataset

    This dataset provides a comprehensive look at the demographics, crimes and sentences of Indian prison inmates in 2019. The data is broken down by state/union territory, year, crime head, age groups and gender.

    This dataset can be used to understand the demographic composition of the prison population in India as well as the types of crimes committed. It can also be used to gain insight into any changes or trends related to sentencing patterns in India over time. Furthermore, this data can provide valuable insight into potential correlations between different demographic factors (such as gender and caste) and specific types of crimes or length of sentences handed out.

    To use this dataset effectively, keep the following in mind:

    • State/UT – the state or union territory in India where the prison is located
    • Year – the year(s) the data relates to
    • Gender columns – female columns count only female prisoners; male columns count only male prisoners
    • Age Groups – inmates are grouped into age bands (16-18, 18-30, 30-50, and 50+ years)
    • Crime Head – a broad category for each type of crime that inmates have been convicted of
    • No. Capital Punishment – the total number sentenced to capital punishment
    • No. Life Imprisonment – the total number sentenced to life imprisonment
    • No. Executed – the total number executed following a death sentence
    • Grand Total – the overall totals for each category

    With this information it is possible to answer questions about sentencing trends, the types of crimes committed by different age groups or genders, and state-by-state variation, among other topics.

    Research Ideas

    • Using the age and gender information to develop targeted outreach strategies for prisons in order to reduce recidivism rates.
    • Creating an AI-based predictive model to predict crime trends by analyzing crime head data from a particular region/state and correlating it with population demographics, economic activity, etc.
    • Analyzing the caste of inmates across different states in India in order to understand patterns of discrimination within the criminal justice system

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

    You are free to:
    • Share – copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt – remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit – provide a link to the license, and indicate if changes were made.
    • ShareAlike – distribute your contributions under the same license as the original.

    Columns

    File: SLL_Crime_headwise_distribution_of_inmates_who_convicted.csv

    | Column name | Description |
    |:------------|:------------|
    | STATE/UT    | Name of the state or union territory where the jail is located. (String) |
    | YEAR        | Year when the inmate population data was collected. (Integer) |
    | ...         | ... |
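    As a sketch of how these columns might be aggregated once loaded, here is a minimal pandas example. The rows are synthetic, and the crime-head and total column names are hypothetical simplifications of the actual CSV headers:

    ```python
    import pandas as pd

    # Synthetic rows following the documented STATE/UT and YEAR columns;
    # CRIME_HEAD and GRAND_TOTAL are hypothetical stand-ins for the real headers.
    df = pd.DataFrame({
        "STATE/UT": ["Tamil Nadu", "Tamil Nadu", "Kerala"],
        "YEAR": [2019, 2019, 2019],
        "CRIME_HEAD": ["Theft", "Murder", "Theft"],
        "GRAND_TOTAL": [120, 30, 45],
    })

    # Convicted-inmate totals per state for 2019.
    inmates_by_state = (
        df[df["YEAR"] == 2019].groupby("STATE/UT")["GRAND_TOTAL"].sum()
    )
    ```

    Swapping the groupby key for the crime-head column gives the per-crime breakdown instead.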

  8. Natural Disasters Deaths

    • kaggle.com
    Updated Nov 19, 2022
    Cite
    The Devastator (2022). Natural Disasters Deaths [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-fatal-cost-of-natural-disasters
    Available as Croissant, a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Nov 19, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Natural Disasters Deaths

    People killed in natural disasters by country by year

    About this dataset

    How much do natural disasters cost us? In lives, in dollars, in infrastructure? This dataset attempts to answer those questions, tracking the death toll and damage cost of major natural disasters since 1985. Disasters included are storms (hurricanes, typhoons, and cyclones), floods, earthquakes, droughts, wildfires, and extreme temperatures.

    How to use the dataset

    This dataset contains information on natural disasters that have occurred around the world from 1900 to 2017. The data includes the date of the disaster, the location, the type of disaster, the number of people killed, and the estimated cost in US dollars.

    Research Ideas

    • An all-in-one disaster map displaying all recorded natural disasters dating back to 1900.
    • Natural disaster hotspots - where do natural disasters most commonly occur and kill the most people?
    • A live map tracking current natural disasters around the world

    License

    See the dataset description for more information.

  9. NHTSA Fatality Analysis Reporting System (FARS), Persons Killed by State and Highest BAC in Crashes, USA, 2006

    • geocommons.com
    Updated May 27, 2008
    + more versions
    Cite
    NHTSA Fatality Analysis Reporting System (FARS) (2008). NHTSA Fatality Analysis Reporting System (FARS), Persons Killed by State and Highest BAC in Crashes, USA, 2006 [Dataset]. http://geocommons.com/search.html
    Dataset updated
    May 27, 2008
    Dataset provided by
    NHTSA Fatality Analysis Reporting System (FARS)
    Description

    This dataset displays the number of persons killed in traffic accidents by state in 2006, along with the blood alcohol concentration (BAC) of those involved. Each category is broken down into the number and percentage of total 2006 fatalities. The data was collected from the Fatality Analysis Reporting System at http://www-fars.nhtsa.dot.gov/States/StatesAlcohol.aspx (access date: November 13, 2007). California and Florida lead the nation in total killed, while the District of Columbia has the fewest.

  10. Reported number of PEOPLE killed or seriously injured (KSI) in road traffic accidents (Calendar Year) (LI 13a (i))

    • data.yorkopendata.org
    Updated Nov 20, 2015
    Cite
    (2015). Reported number of PEOPLE killed or seriously injured (KSI) in road traffic accidents (Calendar Year) (LI 13a (i)) - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/kpi-ces14i
    Dataset updated
    Nov 20, 2015
    License

    Open Government Licence 2.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
    License information was derived automatically

    Area covered
    York
    Description

    Reported number of PEOPLE killed or seriously injured (KSI) in road traffic accidents (Calendar Year) (LI 13a (i)). *Please note that data for the previous calendar year is provisional until it is validated by the DfT, which normally takes place in September.

  11. Number and percentage of homicide victims, by type of firearm used to commit the homicide

    • www150.statcan.gc.ca
    • data.urbandatacentre.ca
    • +3 more
    Updated Jul 22, 2019
    + more versions
    Cite
    Government of Canada, Statistics Canada (2019). Number and percentage of homicide victims, by type of firearm used to commit the homicide, inactive [Dataset]. http://doi.org/10.25318/3510007201-eng
    Dataset updated
    Jul 22, 2019
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number and percentage of homicide victims, by type of firearm used to commit the homicide (total firearms; handgun; rifle or shotgun; fully automatic firearm; sawed-off rifle or shotgun; firearm-like weapons; other firearms, type unknown), Canada, 1974 to 2018.

  12. UNEP, Volcanic Eruptions - Killed People, World, 1975 - 2000

    • geocommons.com
    Updated Apr 29, 2008
    Cite
    Emergency Events Database (2008). UNEP, Volcanic Eruptions - Killed People, World, 1975 - 2000 [Dataset]. http://geocommons.com/search.html
    Dataset updated
    Apr 29, 2008
    Dataset provided by
    Emergency Events Database
    Description

    The map data is derived from the United Nations Environment Programme (UNEP) for the years 1975-2000. The map shows the concentration of deaths caused by or linked to volcanic eruptions worldwide. Online resource: http://geodata.grid.unep.ch URL original source: www.cred.be/emdat

  13. Reported number of PEOPLE killed in road traffic accidents (Calendar Year) (LI 13a)

    • data.yorkopendata.org
    Updated Feb 4, 2016
    + more versions
    Cite
    (2016). Reported number of PEOPLE killed in road traffic accidents (Calendar Year) (LI 13a) - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/kpi-ces14
    Dataset updated
    Feb 4, 2016
    License

    Open Government Licence 2.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
    License information was derived automatically

    Area covered
    York
    Description

    Reported number of PEOPLE killed in road traffic accidents (Calendar Year) (LI 13a). *Please note that data for the previous calendar year is provisional until it is validated by the DfT, which normally takes place in September.

  14. PIPr: A Dataset of Public Infrastructure as Code Programs

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 28, 2023
    Cite
    Spielmann, David (2023). PIPr: A Dataset of Public Infrastructure as Code Programs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8262770
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Sokolowski, Daniel
    Salvaneschi, Guido
    Spielmann, David
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    Programming Languages Infrastructure as Code (PL-IaC) enables IaC programs written in general-purpose programming languages like Python and TypeScript. The currently available PL-IaC solutions are Pulumi and the Cloud Development Kits (CDKs) of Amazon Web Services (AWS) and Terraform. This dataset provides metadata and initial analyses of all public GitHub repositories in August 2022 with an IaC program, including their programming languages, applied testing techniques, and licenses. Further, we provide a shallow copy of the head state of those 7104 repositories whose licenses permit redistribution. The dataset is available under the Open Data Commons Attribution License (ODC-By) v1.0. Contents:

    • metadata.zip: The dataset metadata and analysis results as CSV files.
    • scripts-and-logs.zip: Scripts and logs of the dataset creation.
    • LICENSE: The Open Data Commons Attribution License (ODC-By) v1.0 text.
    • README.md: This document.
    • redistributable-repositiories.zip: Shallow copies of the head state of all redistributable repositories with an IaC program.

    This artifact is part of the ProTI Infrastructure as Code testing project: https://proti-iac.github.io.

    Metadata

    The dataset's metadata comprises three tabular CSV files containing metadata about all analyzed repositories, IaC programs, and testing source code files.

    repositories.csv:

    • ID (integer): GitHub repository ID
    • url (string): GitHub repository URL
    • downloaded (boolean): Whether cloning the repository succeeded
    • name (string): Repository name
    • description (string): Repository description
    • licenses (string, list of strings): Repository licenses
    • redistributable (boolean): Whether the repository's licenses permit redistribution
    • created (string, date & time): Time of the repository's creation
    • updated (string, date & time): Time of the last update to the repository
    • pushed (string, date & time): Time of the last push to the repository
    • fork (boolean): Whether the repository is a fork
    • forks (integer): Number of forks
    • archive (boolean): Whether the repository is archived
    • programs (string, list of strings): Project file path of each IaC program in the repository

    programs.csv:

    • ID (string): Project file path of the IaC program
    • repository (integer): GitHub repository ID of the repository containing the IaC program
    • directory (string): Path of the directory containing the IaC program's project file
    • solution (string, enum): PL-IaC solution of the IaC program ("AWS CDK", "CDKTF", "Pulumi")
    • language (string, enum): Programming language of the IaC program (enum values: "csharp", "go", "haskell", "java", "javascript", "python", "typescript", "yaml")
    • name (string): IaC program name
    • description (string): IaC program description
    • runtime (string): Runtime string of the IaC program
    • testing (string, list of enum): Testing techniques of the IaC program (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
    • tests (string, list of strings): File paths of IaC program's tests

    testing-files.csv:

    • file (string): Testing file path
    • language (string, enum): Programming language of the testing file (enum values: "csharp", "go", "java", "javascript", "python", "typescript")
    • techniques (string, list of enum): Testing techniques used in the testing file (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
    • keywords (string, list of enum): Keywords found in the testing file (enum values: "/go/auto", "/testing/integration", "@AfterAll", "@BeforeAll", "@Test", "@aws-cdk", "@aws-cdk/assert", "@pulumi.runtime.test", "@pulumi/", "@pulumi/policy", "@pulumi/pulumi/automation", "Amazon.CDK", "Amazon.CDK.Assertions", "Assertions_", "HashiCorp.Cdktf", "IMocks", "Moq", "NUnit", "PolicyPack(", "ProgramTest", "Pulumi", "Pulumi.Automation", "PulumiTest", "ResourceValidationArgs", "ResourceValidationPolicy", "SnapshotTest()", "StackValidationPolicy", "Testing", "Testing_ToBeValidTerraform(", "ToBeValidTerraform(", "Verifier.Verify(", "WithMocks(", "[Fact]", "[TestClass]", "[TestFixture]", "[TestMethod]", "[Test]", "afterAll(", "assertions", "automation", "aws-cdk-lib", "aws-cdk-lib/assert", "aws_cdk", "aws_cdk.assertions", "awscdk", "beforeAll(", "cdktf", "com.pulumi", "def test_", "describe(", "github.com/aws/aws-cdk-go/awscdk", "github.com/hashicorp/terraform-cdk-go/cdktf", "github.com/pulumi/pulumi", "integration", "junit", "pulumi", "pulumi.runtime.setMocks(", "pulumi.runtime.set_mocks(", "pulumi_policy", "pytest", "setMocks(", "set_mocks(", "snapshot", "software.amazon.awscdk.assertions", "stretchr", "test(", "testing", "toBeValidTerraform(", "toMatchInlineSnapshot(", "toMatchSnapshot(", "to_be_valid_terraform(", "unittest", "withMocks(")
    • program (string): Project file path of the testing file's IaC program

    Dataset Creation

    scripts-and-logs.zip contains all scripts and logs of the creation of this dataset. In it, executions/executions.log documents the commands that generated this dataset in detail. On a high level, the dataset was created as follows:

    1. A list of all repositories with a PL-IaC program configuration file was created using search-repositories.py (documented below). The execution took two weeks due to the non-deterministic nature of GitHub's REST API, causing excessive retries.
    2. A shallow copy of the head of all repositories was downloaded using download-repositories.py (documented below).
    3. Using analysis.ipynb, the repositories were analyzed for the programs' metadata, including the used programming languages and licenses.
    4. Based on the analysis, all repositories with at least one IaC program and a redistributable license were packaged into redistributable-repositiories.zip, excluding any node_modules and .git directories.

    Searching Repositories

    The repositories are searched through search-repositories.py and saved in a CSV file. The script takes these arguments in the following order:

    1. GitHub access token.
    2. Name of the CSV output file.
    3. Filename to search for.
    4. File extensions to search for, separated by commas.
    5. Min file size for the search (for all files: 0).
    6. Max file size for the search, or * for unlimited (for all files: *).

    • Pulumi projects have a Pulumi.yaml or Pulumi.yml (case-sensitive file name) file in their root folder, i.e., (3) is Pulumi and (4) is yml,yaml. https://www.pulumi.com/docs/intro/concepts/project/
    • AWS CDK projects have a cdk.json (case-sensitive file name) file in their root folder, i.e., (3) is cdk and (4) is json. https://docs.aws.amazon.com/cdk/v2/guide/cli.html
    • CDK for Terraform (CDKTF) projects have a cdktf.json (case-sensitive file name) file in their root folder, i.e., (3) is cdktf and (4) is json. https://www.terraform.io/cdktf/create-and-deploy/project-setup

    Limitations

    The script uses the GitHub code search API and inherits its limitations:

    - Only forks with more stars than the parent repository are included.
    - Only the repositories' default branches are considered.
    - Only files smaller than 384 KB are searchable.
    - Only repositories with fewer than 500,000 files are considered.
    - Only repositories that have had activity or have been returned in search results in the last year are considered. More details: https://docs.github.com/en/search-github/searching-on-github/searching-code
    - The results of the GitHub code search API are not stable. However, the generally more robust GraphQL API does not support searching for files in repositories: https://stackoverflow.com/questions/45382069/search-for-code-in-github-using-graphql-v4-api

    Downloading Repositories

    download-repositories.py downloads all repositories in CSV files generated through search-repositories.py and generates an overview CSV file of the downloads. The script takes these arguments in the following order:

    1. Names of the repositories CSV files generated through search-repositories.py, separated by commas.
    2. Output directory to download the repositories to.
    3. Name of the CSV output file.

    The script only downloads a shallow recursive copy of the HEAD of each repository, i.e., only the main branch's most recent state, including submodules, without the rest of the git history. Each repository is downloaded to a subfolder named by the repository's ID.
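    The documented arguments (3)-(6) map directly onto a GitHub code-search query string. As a minimal sketch of that mapping, the helper below is hypothetical and not part of search-repositories.py itself:

```python
# Hypothetical helper illustrating how the script's arguments (filename,
# extensions, min/max size) translate into GitHub code-search qualifiers.
# This is a sketch, not the actual implementation of search-repositories.py.

def build_code_search_query(filename: str, extensions: str,
                            min_size: int = 0, max_size: str = "*") -> str:
    """Compose a GitHub code-search qualifier string for project config files."""
    ext_part = " ".join(f"extension:{e}" for e in extensions.split(","))
    return f"filename:{filename} {ext_part} size:{min_size}..{max_size}"

# Pulumi projects: Pulumi.yaml / Pulumi.yml in the repository root
print(build_code_search_query("Pulumi", "yml,yaml"))
# -> filename:Pulumi extension:yml extension:yaml size:0..*

# AWS CDK projects: cdk.json in the repository root
print(build_code_search_query("cdk", "json"))
```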

  15. HR and Projects Synthetic Dataset

    • kaggle.com
    Updated Jul 15, 2025
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alex Mauricio RodrĂ­guez
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset simulates a realistic scenario of human resources management and project execution. It was designed for educational purposes, particularly for teaching data analysis, cleaning, transformation, merging, and KPI monitoring in Python, especially with Google Colab.

    The data includes intentionally embedded challenges such as missing values, inconsistent formats, and realistic business logic constraints (e.g., max 40 hours/week per employee), allowing students or professionals to develop data wrangling and reporting skills.

    Files Included

    empleados_talento_humano.xlsx: Contains personal and professional information of 1000 employees. Includes gender, education level, civil status, salary (with formatting inconsistencies), and some missing values in the municipality field.

    proyectos.xlsx: Contains 100 projects with planned vs. executed resources, project status (e.g., Completed, In Progress, Cancelled), start/end dates, and percentage of completion. Project progress is skewed left to simulate realistic project delays.

    empleados_proyectos.xlsx: Contains 2000+ employee-project assignments. Includes project role, date of assignment (always after employee hire date), and number of hours assigned/reported. Guarantees that no employee exceeds 40 hours per week in total.

    🎯 Intended Use

    Practice with pandas, merge, groupby, and data wrangling in Colab.

    Data cleaning (e.g., parsing salary fields, filling missing values).

    Basic time-series and project tracking exercises.

    Building dashboards or indicators (resource execution, project progress, employee workload).

    Simulation of business intelligence pipelines.
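    As a minimal sketch of the merge-and-validate exercise described above (checking the 40 hours/week constraint), using invented column names since the actual schema of the Excel files is not specified here:

```python
# Sketch of the intended Colab exercise: merge assignments with employees,
# then verify the 40 h/week business rule. Column names (employee_id,
# hours_week, etc.) and the toy data are assumptions, not the real schema.
import pandas as pd

employees = pd.DataFrame({
    "employee_id": [1, 2],
    "name": ["Ana", "Luis"],
})
assignments = pd.DataFrame({
    "employee_id": [1, 1, 2],
    "project_id": [10, 11, 10],
    "hours_week": [25, 15, 38],
})

# Join each assignment with the employee record it belongs to.
merged = assignments.merge(employees, on="employee_id", how="left")

# Total weekly hours per employee; the dataset guarantees totals <= 40.
weekly = merged.groupby("employee_id")["hours_week"].sum()
violations = weekly[weekly > 40]
print(violations.empty)  # True for data that respects the constraint
```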

  16. TRAVEL: A Dataset with Toolchains for Test Generation and Regression Testing of Self-driving Cars Software

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 17, 2024
    Cite
    Annibale Panichella (2024). TRAVEL: A Dataset with Toolchains for Test Generation and Regression Testing of Self-driving Cars Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5911160
    Explore at:
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Alessio Gambi
    Vincenzo Riccio
    Annibale Panichella
    Christian Birchler
    Sebastiano Panichella
    Pouria Derakhshanfar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Introduction

    This repository hosts the Testing Roads for Autonomous VEhicLes (TRAVEL) dataset. TRAVEL is an extensive collection of virtual roads that have been used for testing lane assist/keeping systems (i.e., driving agents), together with data from their execution in a state-of-the-art, physically accurate driving simulator called BeamNG.tech. Virtual roads consist of sequences of road points interpolated using cubic splines.

    Along with the data, this repository contains instructions on how to install the tooling necessary to generate new data (i.e., test cases) and analyze them in the context of test regression. We focus on test selection and test prioritization, given their importance for developing high-quality software following the DevOps paradigms.

    This dataset builds on top of our previous work in this area, including work on

    test generation (e.g., AsFault, DeepJanus, and DeepHyperion) and the SBST CPS tool competition (SBST2021),

    test selection: SDC-Scissor and related tool

    test prioritization: automated test cases prioritization work for SDCs.

    Dataset Overview

    The TRAVEL dataset is available under the data folder and is organized as a set of experiment folders. Each of these folders is generated by running the test generator (see below) and contains the configuration used for generating the data (experiment_description.csv), various statistics on the generated tests (generation_stats.csv), and found faults (oob_stats.csv). Additionally, the folders contain the raw test cases generated and executed during each experiment (test..json).

    The following sections describe what each of those files contains.

    Experiment Description

    The experiment_description.csv contains the settings used to generate the data, including:

    Time budget. The overall generation budget in hours. This budget includes both the time to generate and execute the tests as driving simulations.

    The size of the map. The size of the squared map defines the boundaries inside which the virtual roads develop in meters.

    The test subject. The driving agent that implements the lane-keeping system under test. The TRAVEL dataset contains data generated testing the BeamNG.AI and the end-to-end Dave2 systems.

    The test generator. The algorithm that generated the test cases. The TRAVEL dataset contains data obtained using various algorithms, ranging from naive and advanced random generators to complex evolutionary algorithms, for generating tests.

    The speed limit. The maximum speed at which the driving agent under test can travel.

    Out of Bound (OOB) tolerance. The test cases' oracle that defines the tolerable amount of the ego-car that can lie outside the lane boundaries. This parameter ranges between 0.0 and 1.0. In the former case, a test failure triggers as soon as any part of the ego-vehicle goes out of the lane boundary; in the latter case, a test failure triggers only if the entire body of the ego-car falls outside the lane.

    Experiment Statistics

    The generation_stats.csv contains statistics about the test generation, including:

    Total number of generated tests. The number of tests generated during an experiment. This number is broken down into the number of valid tests and invalid tests. Valid tests contain virtual roads that do not self-intersect and contain turns that are not too sharp.

    Test outcome. The test outcome contains the number of passed tests, failed tests, and tests in error. Passed and failed tests are defined by the OOB tolerance and an additional (implicit) oracle that checks whether the ego-car is moving or standing. Tests that did not pass because of other errors (e.g., the simulator crashed) are reported in a separate category.

    The TRAVEL dataset also contains statistics about the failed tests, including the overall number of failed tests (total oob) and its breakdown into OOB that happened while driving left or right. Further statistics about the diversity (i.e., sparseness) of the failures are also reported.

    Test Cases and Executions

    Each test..json contains information about a test case and, if the test case is valid, the data observed during its execution as driving simulation.

    The data about the test case definition include:

    The road points. The list of points in a 2D space that identifies the center of the virtual road, and their interpolation using cubic splines (interpolated_points)

    The test ID. The unique identifier of the test in the experiment.

    Validity flag and explanation. A flag that indicates whether the test is valid or not, and a brief message describing why the test is not considered valid (e.g., the road contains sharp turns or self-intersects).

    The test data are organized according to the following JSON Schema and can be interpreted as RoadTest objects provided by the tests_generation.py module.

    {
      "type": "object",
      "properties": {
        "id": { "type": "integer" },
        "is_valid": { "type": "boolean" },
        "validation_message": { "type": "string" },
        "road_points": { "type": "array", "items": { "$ref": "schemas/pair" } },
        "interpolated_points": { "type": "array", "items": { "$ref": "schemas/pair" } },
        "test_outcome": { "type": "string" },
        "description": { "type": "string" },
        "execution_data": { "type": "array", "items": { "$ref": "schemas/simulationdata" } }
      },
      "required": [ "id", "is_valid", "validation_message", "road_points", "interpolated_points" ]
    }

    Finally, the execution data contain a list of timestamped state information recorded by the driving simulation. State information is collected at constant frequency and includes absolute position, rotation, and velocity of the ego-car, its speed in Km/h, and control inputs from the driving agent (steering, throttle, and braking). Additionally, execution data contain OOB-related data, such as the lateral distance between the car and the lane center and the OOB percentage (i.e., how much the car is outside the lane).

    The simulation data adhere to the following (simplified) JSON Schema and can be interpreted as Python objects using the simulation_data.py module.

    {
      "$id": "schemas/simulationdata",
      "type": "object",
      "properties": {
        "timer": { "type": "number" },
        "pos": { "type": "array", "items": { "$ref": "schemas/triple" } },
        "vel": { "type": "array", "items": { "$ref": "schemas/triple" } },
        "vel_kmh": { "type": "number" },
        "steering": { "type": "number" },
        "brake": { "type": "number" },
        "throttle": { "type": "number" },
        "is_oob": { "type": "number" },
        "oob_percentage": { "type": "number" }
      },
      "required": [ "timer", "pos", "vel", "vel_kmh", "steering", "brake", "throttle", "is_oob", "oob_percentage" ]
    }
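    A decoded test-case record can be sanity-checked against the "required" lists of the schemas above. A minimal sketch (the helper and the example record are illustrative, not part of the dataset's tests_generation.py or simulation_data.py modules):

```python
# Hypothetical sketch: check a parsed JSON record against the "required"
# field list of the RoadTest schema shown above. The example record is
# fabricated and deliberately incomplete.

REQUIRED_ROADTEST = ["id", "is_valid", "validation_message",
                     "road_points", "interpolated_points"]

def missing_required(record: dict, required: list) -> list:
    """Return the required fields absent from a decoded JSON record."""
    return [field for field in required if field not in record]

record = {
    "id": 1,
    "is_valid": False,
    "validation_message": "road self-intersects",
    "road_points": [[0.0, 0.0], [10.0, 5.0]],
}
print(missing_required(record, REQUIRED_ROADTEST))  # ['interpolated_points']
```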

    Dataset Content

    The TRAVEL dataset is a lively initiative so the content of the dataset is subject to change. Currently, the dataset contains the data collected during the SBST CPS tool competition, and data collected in the context of our recent work on test selection (SDC-Scissor work and tool) and test prioritization (automated test cases prioritization work for SDCs).

    SBST CPS Tool Competition Data

    The data collected during the SBST CPS tool competition are stored inside data/competition.tar.gz. The file contains the test cases generated by Deeper, Frenetic, AdaFrenetic, and Swat, the open-source test generators submitted to the competition and executed against BeamNG.AI with an aggression factor of 0.7 (i.e., conservative driver).

        Name     | Map Size (m x m) | Max Speed (Km/h) | Budget (h)    | OOB Tolerance (%) | Test Subject
        DEFAULT  | 200 × 200        | 120              | 5 (real time) | 0.95              | BeamNG.AI - 0.7
        SBST     | 200 × 200        | 70               | 2 (real time) | 0.5               | BeamNG.AI - 0.7

    Specifically, the TRAVEL dataset contains 8 repetitions for each of the above configurations for each test generator totaling 64 experiments.

    SDC Scissor

    With SDC-Scissor we collected data based on the Frenetic test generator. The data is stored inside data/sdc-scissor.tar.gz. The following table summarizes the used parameters.

        Name        | Map Size (m x m) | Max Speed (Km/h) | Budget (h)     | OOB Tolerance (%) | Test Subject
        SDC-SCISSOR | 200 × 200        | 120              | 16 (real time) | 0.5               | BeamNG.AI - 1.5

    The dataset contains 9 experiments with the above configuration. For generating your own data with SDC-Scissor follow the instructions in its repository.

    Dataset Statistics

    Here is an overview of the TRAVEL dataset: generated tests, executed tests, and faults found by all the test generators, grouped by experiment configuration. Some 25,845 test cases were generated by running 4 test generators 8 times in 2 configurations using the SBST CPS Tool Competition code pipeline (SBST in the table). We ran the test generators for 5 hours, allowing the ego-car a generous speed limit (120 Km/h) and defining a high OOB tolerance (i.e., 0.95); we also ran the test generators using a smaller generation budget (i.e., 2 hours) and speed limit (i.e., 70 Km/h) while setting the OOB tolerance to a lower value (i.e., 0.85). We also collected some 5,971 additional tests with SDC-Scissor (SDC-Scissor in the table) by running it 9 times for 16 hours, using Frenetic as a test generator and defining a more realistic OOB tolerance (i.e., 0.50).

    Generating new Data

    Generating new data, i.e., test cases, can be done using the SBST CPS Tool Competition pipeline and the driving simulator BeamNG.tech.

    Extensive instructions on how to install both tools are reported in the SBST CPS Tool Competition pipeline documentation.

  17. Vol 16(2): Replication Data for: Black Lives Matter: Evidence that Police-Caused Deaths Predict Protest Activity

    • dataverse.harvard.edu
    • search.dataone.org
    Updated May 16, 2018
    Cite
    Kris-Stella Trump; Vanessa Williamson; Katherine Levine Einstein (2018). Vol 16(2): Replication Data for: Black Lives Matter: Evidence that Police- Caused Deaths Predict Protest Activity [Dataset]. http://doi.org/10.7910/DVN/L2GSK6
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 16, 2018
    Dataset provided by
    Harvard Dataverse
    Authors
    Kris-Stella Trump; Vanessa Williamson; Katherine Levine Einstein
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Since 2013, protests opposing police violence against Black people have occurred across a number of American cities under the banner of “Black Lives Matter.” We develop a new dataset of Black Lives Matter protests that took place in 2014–2015 and explore the contexts in which they emerged. We find that Black Lives Matter protests are more likely to occur in localities where more Black people have previously been killed by police. We discuss the implications of our findings in light of the literature on the development of social movements and recent scholarship on the carceral state’s impact on political engagement.

  18. Data from: Regression-Test History Data for Flaky Test-Research, Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 12, 2024
    Cite
    Wendler, Philipp (2024). Regression-Test History Data for Flaky Test-Research, Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10639029
    Explore at:
    Dataset updated
    Aug 12, 2024
    Dataset provided by
    Wendler, Philipp
    Winter, Stefan
    Description

    The dataset comprises developer test results of Maven projects with flaky tests across a range of consecutive commits from the projects' git commit histories. The Maven projects are a subset of those investigated in an OOPSLA 2020 paper. The commit range for this dataset has been chosen as the flakiness-introducing commit (FIC) and iDFlakies-commit (see the OOPSLA paper for details). The commit hashes have been obtained from the IDoFT dataset.

    The dataset will be presented at the 1st International Flaky Tests Workshop 2024 (FTW 2024). Please refer to our extended abstract for more details about the motivation for and context of this dataset.

    The following table provides a summary of the data.

    Slug (Module)                                      | FIC Hash | Tests | Commits | Av. Commits/Test | Flaky Tests | Tests w/ Consistent Failures | Total Distinct Histories
    TooTallNate/Java-WebSocket                         | 822d40   | 146   | 75      | 75               | 24          | 1                            | 2.6x10^9
    apereo/java-cas-client (cas-client-core)           | 5e3655   | 157   | 65      | 61.7             | 3           | 2                            | 1.0x10^7
    eclipse-ee4j/tyrus (tests/e2e/standard-config)     | ce3b8c   | 185   | 16      | 16               | 12          | 0                            | 261
    feroult/yawp (yawp-testing/yawp-testing-appengine) | abae17   | 1     | 191     | 191              | 1           | 1                            | 8
    fluent/fluent-logger-java                          | 5fd463   | 19    | 131     | 105.6            | 11          | 2                            | 8.0x10^32
    fluent/fluent-logger-java                          | 87e957   | 19    | 160     | 122.4            | 11          | 3                            | 2.1x10^31
    javadelight/delight-nashorn-sandbox                | d0d651   | 81    | 113     | 100.6            | 2           | 5                            | 4.2x10^10
    javadelight/delight-nashorn-sandbox                | d19eee   | 81    | 93      | 83.5             | 1           | 5                            | 2.6x10^9
    sonatype-nexus-community/nexus-repository-helm     | 5517c8   | 18    | 32      | 32               | 0           | 0                            | 18
    spotify/helios (helios-services)                   | 23260    | 190   | 448     | 448              | 0           | 37                           | 190
    spotify/helios (helios-testing)                    | 78a864   | 43    | 474     | 474              | 0           | 7                            | 43

    The columns are composed of the following variables:

    Slug (Module): The project's GitHub slug (i.e., the project's URL is https://github.com/{Slug}) and, if specified, the module for which tests have been executed.

    FIC Hash: The flakiness-introducing commit hash for a known flaky test as described in this OOPSLA 2020 paper. As different flaky tests have different FIC hashes, there may be multiple rows for the same slug/module with different FIC hashes.

    Tests: The number of distinct test class and method combinations over the entire considered commit range.

    Commits: The number of commits in the considered commit range

    Av. Commits/Test: The average number of commits per test class and method combination in the considered commit range. The number of commits may vary for each test class, as some tests may be added or removed within the considered commit range.

    Flaky Tests: The number of distinct test class and method combinations that have more than one test result (passed/skipped/error/failure + exception type, if any + assertion message, if any) across 30 repeated test suite executions on at least one commit in the considered commit range.

    Tests w/ Consistent Failures: The number of distinct test class and method combinations that have the same error or failure result (error/failure + exception type, if any + assertion message, if any) across all 30 repeated test suite executions on at least one commit in the considered commit range.

    Total Distinct Histories: The number of distinct test results (passed/skipped/error/failure + exception type, if any + assertion message, if any) for all test class and method combinations along all commits for that test in the considered commit range.
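    The two definitions above (flaky vs. consistently failing) can be sketched as a small classifier over per-commit result lists. The function and the toy data below are illustrative, not the dataset's actual tooling:

```python
# Sketch of the classification described above: a test is "flaky" if any
# commit shows more than one distinct result across the 30 reruns, and has
# a "consistent failure" if some commit shows the same non-passing result
# in all 30 reruns. The history data here is invented.

def classify(results_by_commit: dict, reruns: int = 30):
    """Return (is_flaky, has_consistent_failure) for one test's history."""
    flaky = any(len(set(r)) > 1 for r in results_by_commit.values())
    consistent_failure = any(
        len(set(r)) == 1 and r[0] != "passed" and len(r) == reruns
        for r in results_by_commit.values()
    )
    return flaky, consistent_failure

history = {
    "abc123": ["passed"] * 29 + ["failure:AssertionError"],  # flaky on this commit
    "def456": ["passed"] * 30,
}
print(classify(history))  # (True, False)
```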

  19. CarDA - Car door Assembly Activities Dataset

    • zenodo.org
    bin, pdf
    Updated Jan 15, 2025
    Cite
    Konstantinos Papoutsakis; Nikolaos Bakalos; Athena Zacharia; Maria Pateraki (2025). CarDA - Car door Assembly Activities Dataset [Dataset]. http://doi.org/10.5281/zenodo.14644367
    Explore at:
    pdf, bin (available download formats)
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Konstantinos Papoutsakis; Nikolaos Bakalos; Athena Zacharia; Maria Pateraki
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The CarDA dataset [1] (Car Door Assembly dataset) has been designed and captured to provide a comprehensive, multi-modal resource for analyzing car door assembly activities performed by trained line workers in realistic assembly lines.

    It comprises a set of time-synchronized multi-camera RGB-D videos and human motion capture data acquired during car door assembly activities performed by real-line workers in a real manufacturing environment.

    Deployment environment:

    The use-case scenario concerns a real-world assembly line workplace in an automotive manufacturing industry, as the deployment environment. In this context,
    line workers simulate the real car door assembly workflow using the prompts, sequences, and tools under very similar ergonomic and environmental conditions
    as in existing factory shop floors.

    The assembly line involves a conveyor belt separated into three virtual work areas that correspond to three assembly workstations (WS10, WS20, and WS30). It moves at a low, constant speed, supporting cart-mounted car doors and material storage. A line worker is assigned to each workstation and assembles car doors as the belt moves. At each workstation, the worker completes a workstation-specific set of assembly actions, noted as a task cycle, lasting approximately 4 minutes. Upon successful completion of the task cycle, the cart travels to the virtually defined area of the subsequent workstation, where another line worker continues the assembly process during a new task cycle. Task cycles are repeated continuously during the worker's shift.

    Data acquisition:

    Data acquisition involves low-cost, passive RGB-D camera sensors that are installed at stationary locations alongside the car door assembly line and a motion
    capture system for capturing time-synchronized sequences of images and motion capture data during car door assembly activities performed by real line workers.

    Two stationary StereoLabs ZED2 stereo cameras were installed in each of the three workstations of the car door assembly line. The two stationary, workstation-specific cameras are located at bilateral positions on the two sides of the conveyor belt at the center of the area concerning that specific workstation.

    The pair of RGB-D sensors were utilized to acquire stereo color and depth image sequences during car door task cycle executions. Each recording comprises
    time-synchronized RGB (color) and depth image sequences captured throughout a task cycle execution at 30 frames per second (fps).

    At the same time, the line worker used a wearable XSens MVN Link suit during work activities to acquire time-synced 3D motion capture data at 60 fps.

    Note: Time synchronization between pairs of RGB-D (.svo) recordings (pairs captured simultaneously during an assembly task cycle by the inXX and outXX cameras installed at workstation wsXX) is guaranteed and relies on the StereoLabs ZED SDK acquisition software. Time synchronization between samples of the RGB-D and mp4 videos (30 fps) and the acquired motion capture data (60 fps) was performed manually, with the starting frame/time of the video as the reference time. We have observed some time discrepancies between data samples of the two modalities that might occur after the first 40-50 seconds in some recordings.
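    The manual alignment described in the note amounts to mapping 30 fps video frame indices onto 60 fps motion-capture sample indices. A rough sketch, where the function name and the start-offset parameter are illustrative assumptions:

```python
# Rough sketch of aligning 30 fps video frames with 60 fps mocap samples
# given a manually determined start offset, as described in the note above.
# The helper is hypothetical and the offset value is illustrative.

def mocap_index(video_frame: int, video_fps: int = 30,
                mocap_fps: int = 60, mocap_start_offset: int = 0) -> int:
    """Map a video frame index to the nearest motion-capture sample index."""
    t = video_frame / video_fps               # seconds since video start
    return mocap_start_offset + round(t * mocap_fps)

print(mocap_index(0))    # 0
print(mocap_index(30))   # 60: one second of video maps to sample 60 at 60 fps
```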

    CarDA Dataset:

    The dataset has been split into two subsets, A and B.

    Each comprises data acquired at different periods using the same multicamera system in the same manufacturing environment.

    Subset A contains recordings of RGB-D videos, mp4 videos, and 3d human motion capture data (using the XSens MVN Link suit) acquired during car door assembly activities in all three workstations.

    Subset B contains recordings of RGB-D videos and mp4 videos acquired during car door assembly activities in all three workstations.

    CarDA subset A

    It contains:

      • RGB-D was acquired using StereoLabs ZED 2 sensors in .svo format
      • mp4 videos (30fps) extracted from the .svo files (using the left camera of the stereo pair of each camera).
      • 3D human pose data (ground truth) captured using the Movella Xsens MVN Link motion capture system (60 fps) in .bvh format
      • Annotation data (xls file format):
        • Ground truth related to temporal segmentation and classification of car door assembly actions (subgoals) during task cycle executions, performed by personnel working directly on the assembly line for the CarDA dataset.
        • Ground truth data on the duration of basic ergonomic postures based on the EAWS ergonomic screening tool: Two experts in manufacturing and ergonomics performed manual annotations related to the EAWS screening tool.

    CarDA subset A files:

      • ws10 - svo - mp4 - bvh.rar
        Five assembly task cycle executions recorded in WS10, containing pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens Link system. Annotation data are also available.
      • ws20 - svo - mp4 - bvh.rar
        Four assembly task cycle executions recorded in WS20, containing pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens Link system. Annotation data are also available.

      • ws30 - svo - mp4 - bvh.rar
        Four assembly task cycle executions recorded in WS30, containing pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras and .bvh motion capture data acquired using the XSens Link system. Annotation data are also available.

    CarDA subset B

    It contains:

      • RGB-D was acquired using StereoLabs ZED 2 sensors in .svo format
      • mp4 videos (30fps) extracted from the .svo files (using the left camera of the stereo pair of each camera).
      • Annotation data (xls file format):

        • Ground truth related to temporal segmentation and classification of car door assembly actions (subgoals) during task cycle executions, performed by personnel working directly on the assembly line for the CarDA dataset.
        • Ground truth data on the duration of basic ergonomic postures based on the EAWS ergonomic screening tool: Two experts in manufacturing and ergonomics performed manual annotations related to the EAWS screening tool.

    CarDA subset B files:

      • ws10 - svo - mp4.rar
        Three pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace are provided.

      • ws20 - svo - mp4.rar
        Six pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace are provided.

      • ws30 - svo - mp4.rar
        Three pairs of RGB-D videos (.svo) acquired by two different StereoLabs ZED 2 stereo cameras placed in the real workplace are provided.

    Contact:

    Konstantinos Papoutsakis, PhD: papoutsa@ics.forth.gr

    Maria Pateraki: mpateraki@mail.ntua.gr
    Assistant Professor | National Technical University of Athens
    Affiliated Researcher | Institute of Computer Science | FORTH

    References:

    [1] Konstantinos Papoutsakis, Nikolaos Bakalos, Konstantinos Fragkoulis, Athena Zacharia, Georgia Kapetadimitri, and Maria Pateraki. A vision-based framework for human behavior understanding in industrial assembly lines. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops - T-CAP 2024 Towards a Complete Analysis of People: Fine-grained Understanding for Real-World Applications, 2024.

  20. 2012 Chicago Murder Statistics

    • data.cityofchicago.org
    Updated Oct 7, 2025
    + more versions
    Cite
    Chicago Police Department (2025). 2012 Chicago Murder Statistics [Dataset]. https://data.cityofchicago.org/Public-Safety/2012-Chicago-Murder-Statistics/ws3w-ba2s
    Explore at:
    kmz, kml, xml, application/geo+json, xlsx, csv (available download formats)
    Dataset updated
    Oct 7, 2025
    Authors
    Chicago Police Department
    Area covered
    Chicago
    Description

    This dataset reflects reported incidents of crime (with the exception of murders, where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

    Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation, and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information, and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of, this information. All data visualizations on maps should be considered approximate, and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user.

    The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

    Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as WordPad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
