The DMIS dataset is a flat-file record of the matching of several data set collections. It consists primarily of VTRs, dealer records, and Observer data, in conjunction with vessel permit information, for the purpose of supporting North East Regional quota monitoring projects.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and Code to accompany the paper "Correlation Neglect in Student-to-School Matching." Abstract: We present results from three experiments containing incentivized school-choice scenarios. In these scenarios, we vary whether schools' assessments of students are based on a common priority (inducing correlation in admissions decisions) or are based on independent assessments (eliminating correlation in admissions decisions). The quality of students' application strategies declines in the presence of correlated admissions: application strategies become substantially more aggressive and fail to include attractive "safety" options. We provide a battery of tests suggesting that this phenomenon is at least partially driven by correlation neglect, and we discuss implications for the design and deployment of student-to-school matching mechanisms.
Data standardization is an important part of effective data management. However, people sometimes have data that doesn't match. This dataset includes the different ways that county names might be written by different people. It can be used as a lookup table when you need County to be your unique identifier. For example, it allows you to match St. Mary's, St Marys, and Saint Mary's so that you can use the county with disparate data from other data sets.
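A lookup table like this is straightforward to apply in code. The sketch below is illustrative only: the mapping entries and function name are invented for the example, not taken from the dataset.

```python
# Minimal sketch: normalize county-name variants to a canonical form
# using a lookup table (entries here are illustrative, not from the dataset).
VARIANT_TO_CANONICAL = {
    "st. mary's": "St. Mary's",
    "st marys": "St. Mary's",
    "saint mary's": "St. Mary's",
}

def canonical_county(name: str) -> str:
    """Return the canonical county name, falling back to the input unchanged."""
    return VARIANT_TO_CANONICAL.get(name.strip().lower(), name.strip())

print(canonical_county("St Marys"))      # St. Mary's
print(canonical_county("Saint Mary's"))  # St. Mary's
```

Lower-casing before lookup keeps the table small; unknown names pass through so the join can flag them downstream.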
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Abstract: Scholars studying organizations often work with multiple datasets lacking shared unique identifiers or covariates. In such situations, researchers usually use approximate string ("fuzzy") matching methods to combine datasets. String matching, although useful, faces fundamental challenges. Even when two strings appear similar to humans, fuzzy matching often does not work because it fails to adapt to the informativeness of the character combinations. In response, a number of machine-learning methods have been developed to refine string matching. Yet, the effectiveness of these methods is limited by the size and diversity of training data. This paper introduces data from a prominent employment networking site (LinkedIn) as a massive training corpus to address these limitations. We show how, by leveraging information from LinkedIn regarding organizational name-to-name links, we can improve upon existing matching benchmarks, incorporating the trillions of name pair examples from LinkedIn into various methods to improve performance by explicitly maximizing match probabilities inferred from the LinkedIn corpus. We also show how relationships between organization names can be modeled using a network representation of the LinkedIn data. In illustrative merging tasks involving lobbying firms, we document improvements when using the LinkedIn corpus in matching calibration and make all data and methods open source. Keywords: Record linkage; Interest groups; Text as data; Unstructured data
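For readers unfamiliar with the baseline being improved upon, a plain character-similarity matcher looks roughly like the sketch below. This is a generic fuzzy-matching baseline using the standard library's `difflib`, not the LinkedIn-calibrated method the paper describes; the names and threshold are invented for illustration.

```python
# Generic fuzzy-matching baseline (NOT the paper's LinkedIn-calibrated
# method): score candidate organization names by character similarity
# and keep the best match above a threshold.
from difflib import SequenceMatcher

def best_match(query: str, candidates: list[str], threshold: float = 0.8):
    """Return (candidate, score) for the closest name, or None if all fall below the threshold."""
    scored = [(c, SequenceMatcher(None, query.lower(), c.lower()).ratio())
              for c in candidates]
    name, score = max(scored, key=lambda t: t[1])
    return (name, score) if score >= threshold else None

orgs = ["Acme Corporation", "Acme Corp.", "Apex Holdings"]
print(best_match("ACME Corp", orgs))  # picks "Acme Corp."
```

This kind of matcher treats every character as equally informative, which is exactly the limitation the paper addresses: "Corp" and "Inc" carry little identifying signal, while a distinctive token like "Acme" carries most of it.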
https://www.archivemarketresearch.com/privacy-policy
The global match data collection market is projected to grow from USD 940 million in 2023 to USD 3,530 million by 2033, at a CAGR of 16.7%. Growing adoption of data-driven decision-making in the sports industry, the increasing popularity of esports, and advancements in sensor technology are the primary factors driving the market growth. The use of match data allows teams, players, and coaches to gain insights into their performance, identify strengths and weaknesses, and make informed decisions. The market is segmented by type (sensor data, video data, and others), application (sports industry and esports), and region (North America, South America, Europe, Middle East & Africa, and Asia Pacific). North America is the largest market, followed by Europe. The Asia Pacific region is expected to witness the highest growth rate due to the increasing popularity of esports and the growing number of professional sports leagues in the region. Key players in the market include Opta, Sportradar, N3XT Sports, Sportsdata, OUTFORZ, KINEXON Sports, Stats Perform, Baidu Cloud, Bestdata, Gracenote, Genius Sports, Statscore, and Broadage.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Leveraging a massive dataset of over 421 million potential matches between single users on a leading mobile dating application, we were able to identify numerous characteristics of effective matching. Effective matching is defined as the exchange of contact information with the likely intent to meet in person. The characteristics of an effective match include alignment of psychological traits (i.e., extroversion), physical traits (i.e., height), personal choices (i.e., desiring the same relationship type), and shared experiences. For nearly all characteristics, the more similar the individuals were, the higher the likelihood was of them finding each other desirable and opting to meet in person. The only exception was introversion, where introverts rarely had an effective match with other introverts. When investigating the preliminary stages of the choice process, we looked at the consistency between the choices of men and women, the time it took users to make these binary choices, and the tendency of yes/no decisions. We used a biologically inspired choice model to estimate the decision process and could predict the selection and response time with nearly 60% accuracy. Given that people make their initial selection in no more than 11 s, and ultimately prefer a partner who shares numerous attributes with them, we suggest that users are less selective in their early preferences and gradually, during their conversation, converge onto clusters that share a high degree of similarity in characteristics.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ZTDHVE
Matching methods improve the validity of causal inference by reducing model dependence and offering intuitive diagnostics. While they have become a part of the standard tool kit across disciplines, matching methods are rarely used when analyzing time-series cross-sectional data. We fill this methodological gap. In the proposed approach, we first match each treated observation with control observations from other units in the same time period that have an identical treatment history up to the pre-specified number of lags. We use standard matching and weighting methods to further refine this matched set so that the treated and matched control observations have similar covariate values. Assessing the quality of matches is done by examining covariate balance. Finally, we estimate both short-term and long-term average treatment effects using the difference-in-differences estimator, accounting for a time trend. We illustrate the proposed methodology through simulation and empirical studies. An open-source software package is available for implementing the proposed methods.
As required by federal law, state SNAP agencies verify financial and non-financial information by matching SNAP applicant and participant information to various national and state data sources to ensure they meet the program’s eligibility criteria. Data matching is an important tool for ensuring program integrity and benefit accuracy. However, information on states’ data matching practices and protocols is limited. This study was undertaken to address this knowledge gap.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Many e-shops have started to mark up product data within their HTML pages using the schema.org vocabulary. The Web Data Commons project regularly extracts such data from the Common Crawl, a large public web crawl. The Web Data Commons Training and Test Sets for Large-Scale Product Matching contain product offers from different e-shops in the form of binary product pairs (with the corresponding label "match" or "no match") for four product categories: computers, cameras, watches, and shoes. In order to support the evaluation of machine-learning-based matching methods, the data is split into training, validation, and test sets. For each product category, we provide training sets in four different sizes (2,000-70,000 pairs). Furthermore, sets of IDs for each training set are available for a possible validation split (stratified random draw). The test set for each product category consists of 1,100 product pairs. The labels of the test sets were manually checked, while those of the training sets were derived using shared product identifiers from the Web as a form of weak supervision. The data stems from the WDC Product Data Corpus for Large-Scale Product Matching - Version 2.0, which consists of 26 million product offers originating from 79 thousand websites. For more information and download links for the corpus itself, please follow the links below.
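The stratified validation split mentioned above can be sketched as follows. The real splits ship as ID lists with the dataset; this pure-Python version, with invented example pairs, only illustrates what "stratified random draw" means for labeled pairs.

```python
# Illustrative stratified split over labeled pairs: draw the same fraction
# from each label group so class proportions are preserved.
import random

def stratified_split(pairs, labels, frac=0.2, seed=0):
    """Split pairs into (train, valid), drawing `frac` of each label group into valid."""
    rng = random.Random(seed)
    by_label = {}
    for pair, lab in zip(pairs, labels):
        by_label.setdefault(lab, []).append(pair)
    valid = []
    for lab, items in by_label.items():
        rng.shuffle(items)
        valid.extend(items[: int(len(items) * frac)])
    train = [p for p in pairs if p not in set(valid)]
    return train, valid

pairs = [("offer_a%d" % i, "offer_b%d" % i) for i in range(10)]
labels = ["match"] * 5 + ["no match"] * 5
train, valid = stratified_split(pairs, labels)
print(len(train), len(valid))  # 8 2
```

With a 20% draw from five "match" and five "no match" pairs, exactly one pair of each label lands in the validation set.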
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Image Matching System is a dataset for classification tasks - it contains Recaptcha annotations for 8,828 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The training dataset consists of 20 million pairs of product offers referring to the same products. The offers were extracted from 43 thousand e-shops that provide schema.org annotations including some form of product ID, such as a GTIN or MPN. We also created a gold standard by manually verifying 2,000 pairs of offers belonging to four different product categories.
These data were gathered in order to evaluate the implications of rational choice theory for offender rehabilitation. The hypothesis of the research was that income-enhancing prison rehabilitation programs are most effective for the economically motivated offender. The offender was characterized by demographic and socio-economic characteristics, criminal history and behavior, and work activities during incarceration. Information was also collected on type of release and post-release recidivistic and labor market measures. Recidivism was measured by arrests, convictions, and reincarcerations, length of time until first arrest after release, and seriousness of offense leading to reincarceration.
https://guides.library.uq.edu.au/deposit-your-data/license-reuse-data-agreement
The imatch program was written for Stata users to match different groups according to multiple variables. Program file: imatch.ado. Help file: imatch.hlp.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data for our paper Lin, Y., Kang, M., Wu, Y., Du, Q. and Liu, T. (2019) A deep learning architecture for semantic address matching, International Journal of Geographical Information Science, DOI: 10.1080/13658816.2019.1681431
Below is an overview of each file in this dataset.
train.txt: The training dataset
train_code_a.txt: The index representations of the address elements (i.e., address elements represented by the corresponding indexes in the vocabulary obtained by word2vec) in Sa
train_code_b.txt: The index representations of the address elements in Sb
train_lable.txt: The labels of the address pairs in the training dataset
dev.txt: The development dataset
dev_code_a.txt: The index representations of the address elements in Sa
dev_code_b.txt: The index representations of the address elements in Sb
dev_lable.txt: The labels of the address pairs in the development dataset
test.txt: The test dataset
test_code_a.txt: The index representations of the address elements in Sa
test_code_b.txt: The index representations of the address elements in Sb
test_lable.txt: The labels of the address pairs in the test dataset
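The exact line format inside these files is not documented here, so the loader below is a hedged sketch: it assumes whitespace-separated integer indices, one address per line, in the `*_code_*` files, and one integer label per line in the `*_lable.txt` files (filenames kept as shipped).

```python
# Hedged sketch of loading one split of the files listed above. The line
# format is an assumption: whitespace-separated integer indices per
# address, and one integer label per line.
def load_split(prefix):
    """Load parallel index sequences and labels for a split, e.g. prefix='train'."""
    with open(prefix + "_code_a.txt") as fa, \
         open(prefix + "_code_b.txt") as fb, \
         open(prefix + "_lable.txt") as fl:
        a = [[int(i) for i in line.split()] for line in fa]
        b = [[int(i) for i in line.split()] for line in fb]
        y = [int(line.strip()) for line in fl]
    assert len(a) == len(b) == len(y), "parallel files must have equal length"
    return a, b, y
```

A call like `load_split("train")` would then yield the two index sequences per address pair plus its label, ready for padding and batching.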
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains the materials needed to replicate the results presented in Mozer et al. (2019), "Matching with Text Data: An Experimental Evaluation of Methods for Matching Documents and of Measuring Match Quality", forthcoming in Political Analysis.
Pattern Matching Rules for Identifying Age Data.
https://cubig.ai/store/terms-of-service
1) Data Introduction • The FRC Match Dataset is based on FIRST Robotics Competition (FRC) match records from 2018 to 2025. It contains robot-competition match data, including information such as EPA (expected score contribution), match win rate, team composition, and match results for each match.
2) Data Utilization (1) Characteristics of FRC Match Data: • Each row contains numerical and categorical variables such as year, event, playoff status, match stage, winning team, EPA-based win probability, team names and composition, and match results, which together provide team- and match-level performance and forecasting indicators. (2) Uses of FRC Match Data: • Match-outcome prediction and assessment: using EPA and past match data, machine-learning models can predict match wins and losses, and prediction models can be evaluated for reliability with indicators such as the Brier score. • Team strategy and performance analysis: analyzing EPA, win rate, and matchup data for each team reveals strategic contributions, cooperation effects, seasonal trends, and the characteristics of strong and weak teams.
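The Brier score mentioned above is simply the mean squared error between a predicted win probability and the realized outcome; lower is better. The probabilities below are invented for illustration, not taken from the dataset.

```python
# Brier score for probabilistic match predictions (illustrative values,
# not from the dataset): mean squared error between predicted win
# probability and the 0/1 outcome. Lower is better; 0.25 is the score
# of always predicting 0.5.
def brier_score(probs, outcomes):
    """Mean of (p - outcome)^2 over all matches."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

probs = [0.9, 0.6, 0.2]  # e.g., EPA-based probability the red alliance wins
outcomes = [1, 0, 0]     # 1 = red alliance actually won
print(round(brier_score(probs, outcomes), 4))  # 0.1367
```

A model scoring well below 0.25 is extracting real signal from EPA and match history; a score near 0.25 is no better than a coin-flip forecast.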
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by neednot_toplay
Released under Apache 2.0
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling error is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results by using three types of omics data than two types of omics data. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.
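The core idea, using cis-regulatory links between data types to check sample labels, can be conveyed with a toy example. The sketch below is not MODMatcher itself; it is a minimal illustration of correlation-based sample matching, with invented sample names and profiles over shared features.

```python
# Toy illustration of correlation-based sample matching (NOT the authors'
# MODMatcher code): for each sample in one omics data type, find the
# best-correlated sample in another type; disagreements with the recorded
# labels flag possible mislabeling.
def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def best_matches(omics1, omics2):
    """Map each sample in omics1 to its highest-correlated sample in omics2."""
    return {s1: max(omics2, key=lambda s2: pearson(p1, omics2[s2]))
            for s1, p1 in omics1.items()}

expr = {"s1": [1, 2, 3, 4], "s2": [4, 3, 2, 1]}  # expression profiles
meth = {"s1": [4, 3, 2, 1], "s2": [1, 2, 3, 4]}  # labels appear swapped
print(best_matches(expr, meth))  # {'s1': 's2', 's2': 's1'} -> swap detected
```

Here each expression sample correlates best with the *other* methylation label, exactly the signature of a label swap that such a procedure is designed to catch (real pipelines use cis-regulated feature pairs rather than raw correlation).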