SDNist (v1.3) is a set of benchmark data and metrics for evaluating synthetic data generators on structured tabular data. This version (1.3) reproduces the challenge environment from Sprints 2 and 3 of the Temporal Map Challenge. The benchmarks are distributed as a simple open-source Python package to allow standardized and reproducible comparison of synthetic generator models on real-world data and use cases. The data and metrics were developed for and vetted through the NIST PSCR Differential Privacy Temporal Map Challenge, where the evaluation tools, k-marginal and Higher Order Conjunction, proved effective in distinguishing competing models in the competition environment. SDNist can be installed with pip (pip install sdnist==1.2.8, for Python >=3.6) or obtained from the USNIST/GitHub. The sdnist Python module downloads data from NIST as necessary, so users are not required to download data manually.
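The k-marginal idea mentioned above can be illustrated with a short pandas sketch. This is not the sdnist implementation and does not use the sdnist API: the function name `k_marginal_score`, its parameters, and the 0 to 1000 scaling are assumptions made for illustration, and the sketch assumes all columns are already categorical or binned. It compares the cell densities of randomly chosen k-column marginals in a real and a synthetic table.

```python
import itertools

import numpy as np
import pandas as pd


def k_marginal_score(real, synth, k=2, max_marginals=50, seed=0):
    """Hypothetical k-marginal style score in [0, 1000]; higher means closer."""
    rng = np.random.default_rng(seed)

    # Enumerate all k-column combinations, sampling a subset if there are many.
    combos = list(itertools.combinations(real.columns, k))
    if len(combos) > max_marginals:
        picked = rng.choice(len(combos), size=max_marginals, replace=False)
        combos = [combos[i] for i in picked]

    diffs = []
    for cols in combos:
        cols = list(cols)
        # Normalized cell densities of the marginal in each table.
        p = real.groupby(cols).size() / len(real)
        q = synth.groupby(cols).size() / len(synth)
        # Align on the union of observed cells; missing cells have density 0.
        p, q = p.align(q, fill_value=0.0)
        diffs.append((p - q).abs().sum())

    # Each summed absolute difference lies between 0 (identical) and 2 (disjoint).
    mean_abs_diff = float(np.mean(diffs))
    return (2.0 - mean_abs_diff) / 2.0 * 1000.0
```

Under these assumptions, passing the same table as both `real` and `synth` gives the maximum score, and two tables with no cells in common give the minimum. The sketch scores a whole table at once and does not split records by map or time segment.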
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Files and metadata associated with the EMDataBank/Unified Data Resource for 3DEM 2015/2016 Map Challenge hosted at challenges.emdatabank.org are deposited.
All members of the Scientific Community--at all levels of experience--were invited to participate as Challengers, and/or as Assessors.
Seven benchmark raw image datasets were selected for the challenge. Six were selected from recently described single particle structure determinations with image data collected as multi-frame movies; one is based on simulated (in silico) images. All of the raw image datasets are archived at pdbe.org/empiar.
27 Challengers created 66 single particle reconstructions from the targets, and then uploaded their results with associated details. 15 of the reconstructions were calculated using the SDSC Gordon supercomputer.
This map challenge was one of two community-wide challenges sponsored by EMDataBank in 2015/2016 to critically evaluate 3DEM methods that are coming into use, with the ultimate goal of developing validation criteria associated with every 3DEM map and map-derived model.
Differential Privacy Temporal Map Challenge - Sprint 2 Data
The dataset contains survey data with demographic and financial features, representing a subset of IPUMS American Community Survey data for Ohio and Illinois from 2012-2018. It includes a large feature set of quantitative survey variables along with simulated individuals (each with a sequence of records across years), time segments (years), and map segments (PUMA).
The data was provided by NIST PSCR for Sprint 2 of the Differential Privacy Temporal Map Challenge.
The data can be used to test solutions in the differential privacy field.
Differential Privacy Temporal Map Challenge - Sprint 3 Data
The dataset includes quantitative and categorical information about taxi trips in Chicago, including time, distance, location, payment, and service provider. The data includes several features along with time segments (trip_day_of_week and trip_hour_of_day), map segments (pickup_community_area and dropoff_community_area), and simulated individuals (taxi_id).
The data was provided by NIST PSCR for Sprint 3 of the Differential Privacy Temporal Map Challenge.
The data can be used to test solutions in the differential privacy field.
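As a rough illustration of how the Sprint 3 records are organized into map segments, time segments, and simulated individuals, the sketch below groups a toy table with pandas. The column names come from the description above; the rows are invented for illustration and are not taken from the actual dataset.

```python
import pandas as pd

# Toy rows invented for illustration; only the column names follow the description.
trips = pd.DataFrame({
    "taxi_id": [1, 1, 2, 2, 3],
    "pickup_community_area": [8, 8, 32, 8, 32],
    "dropoff_community_area": [32, 8, 8, 32, 32],
    "trip_day_of_week": [0, 0, 5, 5, 0],
    "trip_hour_of_day": [9, 17, 9, 9, 17],
})

# Number of trips in each (map segment, time segment) cell.
segment_counts = (
    trips.groupby(["pickup_community_area", "trip_day_of_week"])
    .size()
    .rename("n_trips")
    .reset_index()
)

# Number of records contributed by each simulated individual.
trips_per_taxi = trips.groupby("taxi_id").size()

print(segment_counts)
print(trips_per_taxi)
```

Counting records per `taxi_id` is useful here because each simulated individual can contribute many trips to the table.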