Photographic capture–recapture is a valuable tool for obtaining demographic information on wildlife populations due to its noninvasive nature and cost-effectiveness. Recently, several computer-aided photo-matching algorithms have been developed to more efficiently match images of unique individuals in databases with thousands of images. However, the identification accuracy of these algorithms can severely bias estimates of vital rates and population size. Therefore, it is important to understand the performance and limitations of state-of-the-art photo-matching algorithms prior to implementation in capture–recapture studies involving possibly thousands of images. Here, we compared the performance of four photo-matching algorithms, Wild-ID, I3S Pattern+, APHIS, and AmphIdent, using multiple amphibian databases of varying image quality. We measured the performance of each algorithm and evaluated it in relation to database size and the number of matching images in the database....
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The goal of the Patient Matching Algorithm Challenge is to bring about greater transparency and data on the performance of existing patient matching algorithms, spur the adoption of performance metrics for patient data matching algorithm vendors, and positively impact other aspects of patient matching such as deduplication and linking to clinical data. Participants will be provided a data set and will have their answers evaluated and scored against a master key. Up to 6 cash prizes will be awarded with a total purse of up to $75,000.00 (https://www.patientmatchingchallenge.com/).
The test dataset used in the ONC Patient Matching Algorithm Challenge is available for download by students, researchers, or anyone else interested in additional analysis and patient matching algorithm development. More information about the Patient Matching Algorithm Challenge can be found at https://www.patientmatchingchallenge.com/.
The dataset containing 1 million patients was split into eight files of alphabetical groupings by the patient's last name, plus an additional file containing test patients with no last name recorded (Null). All files should be downloaded and merged for analysis: https://github.com/onc-healthit/patient-matching
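The merge step is straightforward; a minimal sketch, assuming the split files have been downloaded locally as CSVs (the directory name and file layout below are assumptions, not the challenge's official structure):

```python
# Hypothetical merge of the alphabetically split challenge files into one table.
from pathlib import Path

import pandas as pd

data_dir = Path("patient-matching-data")   # assumed local download location
parts = sorted(data_dir.glob("*.csv"))     # the eight alphabetical splits plus the Null file

patients = pd.concat((pd.read_csv(p, dtype=str) for p in parts), ignore_index=True)
print(f"Merged {len(parts)} files into {len(patients)} patient records")
```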
The same product could have different titles, descriptions and product IDs on different sites, depending on the structure of each site.
Our algorithm allows our clients to automatically match and track the performance of the same products across multiple platforms such as eBay, Amazon, and DTC sites.
In this research, we propose a variant of the classical Matching Pursuit Decomposition (MPD) algorithm with significantly improved scalability and computational performance. MPD is a powerful iterative algorithm that decomposes a signal into linear combinations of its dictionary elements or “atoms”. A best-fit atom from an arbitrarily defined dictionary is determined through cross-correlation. The selected atom is subtracted from the signal, and this procedure is repeated on the residual in subsequent iterations until a stopping criterion is met. A sufficiently large dictionary is required for an accurate reconstruction; this in turn increases the computational burden of the algorithm, thus limiting its applicability and level of adoption. Our main contribution lies in improving the computational efficiency of the algorithm to allow faster decomposition while maintaining a similar level of accuracy. The Correlation Thresholding and Multiple Atom Extraction techniques are proposed to decrease the computational burden of the algorithm. Correlation thresholds prune insignificant atoms from the dictionary. The ability to extract multiple atoms within a single iteration enhances the effectiveness and efficiency of each iteration. The proposed algorithm, entitled MPD++, was demonstrated using a real-world data set.
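As a rough illustration of these two ideas (not the exact MPD++ rules from the paper), a minimal NumPy sketch of greedy pursuit with correlation-based pruning and batched atom extraction could look as follows; the pruning rule, parameter names, and defaults are assumptions:

```python
import numpy as np

def matching_pursuit(x, D, n_iter=50, corr_threshold=0.05, atoms_per_iter=3, tol=1e-6):
    """Greedy sparse decomposition of x over a dictionary D with unit-norm columns.

    Simplified illustration: atoms whose correlation stays below a relative
    threshold are pruned, and up to `atoms_per_iter` atoms are extracted per
    iteration (correlations are not recomputed within a batch, a deliberate
    simplification for non-orthogonal dictionaries).
    """
    r = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    active = np.ones(D.shape[1], dtype=bool)            # atoms still under consideration

    for _ in range(n_iter):
        c = D.T @ r                                      # cross-correlation with the residual
        c[~active] = 0.0
        active &= np.abs(c) >= corr_threshold * np.abs(c).max()   # prune weak atoms
        for i in np.argsort(np.abs(c))[::-1][:atoms_per_iter]:    # top-k atoms this iteration
            coeffs[i] += c[i]
            r -= c[i] * D[:, i]
        if np.linalg.norm(r) < tol:
            break
    return coeffs, r
```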
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Test code to reproduce the results of the paper. This work presents a robust stereo matching method for occluded regions. First, we generate cost volumes using the CENSUS transform and the scale-invariant feature transform (SIFT). Then, label-based cost volumes are aggregated from the two generated cost volumes using adaptive support weights and the SLIC scheme. To obtain the optimal disparity from the two label-based cost volumes, we select the disparity corresponding to the high-confidence similarity of CENSUS or SIFT at the minimum-cost point.
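A minimal sketch of the CENSUS-based half of such a pipeline (window size, cost-volume layout, and the winner-takes-all readout are generic choices, not the paper's implementation):

```python
import numpy as np

def census_transform(img, window=5):
    """Per-pixel binary census descriptor: each bit records whether a neighbour
    is darker than the window centre (a standard formulation)."""
    h, w = img.shape
    r = window // 2
    padded = np.pad(img, r, mode="edge")
    desc = np.zeros((h, w, window * window), dtype=bool)
    k = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            desc[..., k] = padded[r + dy : r + dy + h, r + dx : r + dx + w] < img
            k += 1
    return desc

def census_cost_volume(left, right, max_disp):
    """Hamming-distance matching cost for every candidate disparity."""
    cl, cr = census_transform(left), census_transform(right)
    h, w, _ = cl.shape
    cost = np.full((h, w, max_disp), np.inf)
    for d in range(max_disp):
        cost[:, d:, d] = np.count_nonzero(cl[:, d:] ^ cr[:, : w - d], axis=-1)
    return cost

# np.argmin(cost, axis=2) would give a simple winner-takes-all disparity map.
```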
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The name of each file indicates the following information: {type of sequence}_{type of measure}_{sequence properties}_{additional information}.csv
{type of sequence} - 'synth' for synthetic data or 'london' for real mobility data from London, UK.
{type of measure} - 'r2' for the R-squared measure or 'corr' for Spearman's correlation.
{sequence properties} - for synthetic data there are three types of sequences, described in the research article (random, markovian, nonstationary). For real mobility data this part includes information about data processing parameters: (...)_london_{type of mobility sequence}_{DBSCAN epsilon value}_{DBSCAN min_pts value}. {type of mobility sequence} is 'seq' for next-place sequences and '30min' or '1H' for next time-bin sequences, indicating the size of the time bin.
Files with 'predictability' at the end of the file name contain R-squared and Spearman's correlation of measures calculated in relation to the predictability measure.
R2 files include the values of R-squared for all types of modelled regression functions:
- 'line' indicates {y = ax + b} for a single variable and {y = ax + by + c} for two variables.
- 'expo' indicates {y = a*x^b + c} for a single variable and {y = a*x^b + c*y^d + e} for two variables.
- 'log' indicates {y = a*log(x*b) + c} for a single variable and {y = a*x + c*log(y) + e + d*x*log(y)} for two variables.
- 'logf' indicates {y = a*log(x) + c*log(y) + e + b*log(x)*log(y)} for two variables.
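As a hedged illustration of how such R-squared values can be computed for the single-variable forms (the exact fitting procedure used in the article is not reproduced here; model names, starting values, and the synthetic example are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

# The three single-variable regression families listed above.
models = {
    "line": lambda x, a, b: a * x + b,
    "expo": lambda x, a, b, c: a * np.power(x, b) + c,
    "log":  lambda x, a, b, c: a * np.log(x * b) + c,
}

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic example data, for illustration only.
x = np.linspace(1, 10, 50)
y = 2.0 * np.log(x) + 1.0 + np.random.default_rng(0).normal(0, 0.1, x.size)

for name, f in models.items():
    n_params = f.__code__.co_argcount - 1
    params, _ = curve_fit(f, x, y, p0=np.ones(n_params), maxfev=10000)
    print(name, round(r_squared(y, f(x, *params)), 3))
```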
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data contains results of simulations from the existing and the modified algorithms used in the paper.
These are the data files for the PLOS ONE journal article "Two-Sided Matching for mentor-mentee allocations - Algorithms and manipulation strategies". Three files are provided:
- Data.xlsx: An overview of the original preferences of mentors and mentees, a data dictionary, and two summary tables used to create figures in the manuscript
- MatchingTables.csv: The outcome matching tables for each simulated scenario and repetition
- Preferences.csv: The (un)manipulated preferences that were used as input to calculate the solution for each simulated scenario and repetition.
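For readers unfamiliar with two-sided matching, a minimal deferred-acceptance (Gale-Shapley) sketch shows the kind of mechanism studied; it is a generic textbook version with one mentee per mentor, not necessarily the algorithm or capacities used in the paper:

```python
def deferred_acceptance(mentee_prefs, mentor_prefs):
    """Mentee-proposing deferred acceptance with one slot per mentor."""
    rank = {m: {me: i for i, me in enumerate(p)} for m, p in mentor_prefs.items()}
    next_choice = {me: 0 for me in mentee_prefs}   # index of the next mentor to propose to
    match = {}                                     # mentor -> mentee
    free = list(mentee_prefs)

    while free:
        mentee = free.pop()
        mentor = mentee_prefs[mentee][next_choice[mentee]]
        next_choice[mentee] += 1
        current = match.get(mentor)
        if current is None:
            match[mentor] = mentee                 # mentor was unmatched, accept
        elif rank[mentor][mentee] < rank[mentor][current]:
            match[mentor] = mentee                 # mentor prefers the new proposer
            free.append(current)
        else:
            free.append(mentee)                    # proposal rejected
    return match

print(deferred_acceptance(
    {"alice": ["m1", "m2"], "bob": ["m1", "m2"]},
    {"m1": ["bob", "alice"], "m2": ["alice", "bob"]},
))
```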
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparing outcomes across different levels of trauma centers is vital in evaluating regionalized trauma care. With observational data, it is critical to adjust for patient characteristics to render valid causal comparisons. Propensity score matching is a popular method to infer causal relationships in observational studies with two treatment arms. Few studies, however, have used matching designs with more than two groups, due to the complexity of matching algorithms. We fill the gap by developing an iterative matching algorithm for the three-group setting. Our algorithm outperforms the nearest neighbor algorithm and is shown to produce matched samples with total distance no larger than twice the optimal distance. We implement the evidence factors method for binary outcomes, which includes a randomization-based testing strategy and a sensitivity analysis for hidden bias in three-group matched designs. We apply our method to the Nationwide Emergency Department Sample data to compare emergency department mortality among non-trauma, level I, and level II trauma centers. Our tests suggest that the admission to a trauma center has a beneficial effect on mortality, assuming no unmeasured confounding. A sensitivity analysis for hidden bias shows that unmeasured confounders, moderately associated with the type of care received, may change the result qualitatively. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
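For orientation, a minimal two-group propensity score matching sketch (greedy 1:1 nearest neighbour with a caliper, matching controls with replacement) conveys the basic idea; it is a generic illustration, not the three-group iterative algorithm developed in the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def propensity_match(X, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the estimated propensity score.

    X: covariate matrix, treated: 0/1 indicator. Controls may be reused
    (matching with replacement); the caliper is on the propensity scale.
    """
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    t_idx, c_idx = np.where(treated == 1)[0], np.where(treated == 0)[0]
    nn = NearestNeighbors(n_neighbors=1).fit(ps[c_idx].reshape(-1, 1))
    dist, pos = nn.kneighbors(ps[t_idx].reshape(-1, 1))
    return [(t, c_idx[p[0]]) for t, d, p in zip(t_idx, dist, pos) if d[0] <= caliper]
```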
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Exact pattern matching algorithms are popular and used widely in several applications, such as molecular biology, text processing, image processing, web search engines, network intrusion detection systems and operating systems. The focus of these algorithms is to achieve time efficiency according to the application, but not memory consumption. In this work, we propose a novel idea to achieve both time efficiency and low memory consumption by splitting the query string for searching in the corpus. For a given text, the proposed algorithm splits the query pattern into two equal halves and considers the second (right) half as the query string for searching in the corpus. Once a match is found for the second half, the proposed algorithm applies a brute-force procedure to find the remaining match by referring to the location of the right half. Experimental results on the different S1 Dataset text databases, namely Arabic, English, Chinese, Italian and French, show that the proposed algorithm outperforms the existing S1 Algorithm in terms of time efficiency and memory consumption as the length of the query pattern increases.
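A minimal sketch of the split-and-verify idea (using Python's built-in substring search to locate the right half, which stands in for the exact-matching routine of the paper):

```python
def split_half_search(text, pattern):
    """Locate `pattern` by searching for its right half, then brute-force
    checking the left half immediately before each candidate position."""
    mid = len(pattern) // 2
    left, right = pattern[:mid], pattern[mid:]
    hits, pos = [], text.find(right)
    while pos != -1:
        start = pos - len(left)
        if start >= 0 and text[start:pos] == left:   # verify the left half
            hits.append(start)
        pos = text.find(right, pos + 1)
    return hits

print(split_half_search("abracadabra", "cada"))      # -> [4]
```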
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The four datasets represent the results of two sequential processes. The first process consists of an automatic matching between landmarks from different data sources and landmarks in a reference dataset (the French national topographic database, BDTOPO). In the second process, the 1:1 links are manually validated by experts.
The four different datasets and the BDTOPO dataset are archived here. The data matching algorithm is described in this paper.
Each file contains the matching results for the features belonging to one data source:
- the name of the file depends on the data source
- the column "id_source" corresponds to the identifier of the landmark in the data source
- the column "types_of_matching_results" describes the type of matching result:
  « 1:0 »: the landmark from the data source (e.g. Camptocamp) has no homologous landmark in BDTOPO
  « 1:1 validated »: a homologous feature exists in BDTOPO and the link was validated
  « 1:1 non validated »: the matching link was not validated
  « without candidates »: the landmark was not matched, either because there are no candidates in BDTOPO or because the landmark in the data source is far away from its homologue in BDTOPO
  « uncertain »: complex cases for which no decision is taken by the data matching algorithm
- the columns "id_bdtopo" and "id_candidat" correspond to the identifier of the landmark in BDTOPO if and only if there is a validated matching link
- the column "samal" corresponds to the Samal distance
The matching results are obtained using an ontology application named OOR. These specific results were obtained with OOR version 1.0.1, an improved version that contains new concepts compared to the first release, 1.0.0. The new version of OOR (i.e. 1.0.1) will be released by May 31, 2022, and the link will be added here. This archive is released for transparency and reproducibility purposes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the input data and results used in the paper "Comparative study on matching methods for the distinction of building modifications and replacements based on multi-temporal building footprint data".
License information: The LoD1 data used as input in this study are openly available at Transparenzportal Hamburg (https://transparenz.hamburg.de/), from Freie und Hansestadt Hamburg, Landesbetrieb Geoinformation und Vermessung (LGV), in compliance with the licence dl-de/by-2-0 (https://www.govdata.de/dl-de/by-2-0).
Content:
1. Input footprints of non-identical pairs: input_reference_objects.zip
2. Results without additional position deviation: results_without_deviation.zip
3. Results with generated position deviation, including geometries: results_with_deviation.zip
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data corresponds to the data and experiments described in Section 5 of the following paper:
Algorithms for new types of fair stable matchings
Authors: Frances Cooper and David Manlove
The paper is located at: https://arxiv.org/abs/2001.10875
The software is located at: https://zenodo.org/record/3630383
The data is located at: https://zenodo.org/record/3630349
See the README for more information.
Xverum’s AI & ML Training Data provides one of the most extensive datasets available for AI and machine learning applications, featuring 800M B2B profiles with 100+ attributes. This dataset is designed to enable AI developers, data scientists, and businesses to train robust and accurate ML models. From natural language processing (NLP) to predictive analytics, our data empowers a wide range of industries and use cases with unparalleled scale, depth, and quality.
What Makes Our Data Unique?
Scale and Coverage: - A global dataset encompassing 800M B2B profiles from a wide array of industries and geographies. - Includes coverage across the Americas, Europe, Asia, and other key markets, ensuring worldwide representation.
Rich Attributes for Training Models: - Over 100 fields of detailed information, including company details, job roles, geographic data, industry categories, past experiences, and behavioral insights. - Tailored for training models in NLP, recommendation systems, and predictive algorithms.
Compliance and Quality: - Fully GDPR and CCPA compliant, providing secure and ethically sourced data. - Extensive data cleaning and validation processes ensure reliability and accuracy.
Annotation-Ready: - Pre-structured and formatted datasets that are easily ingestible into AI workflows. - Ideal for supervised learning with tagging options such as entities, sentiment, or categories.
How Is the Data Sourced? - Publicly available information gathered through advanced, GDPR-compliant web aggregation techniques. - Proprietary enrichment pipelines that validate, clean, and structure raw data into high-quality datasets. This approach ensures we deliver comprehensive, up-to-date, and actionable data for machine learning training.
Primary Use Cases and Verticals
Natural Language Processing (NLP): Train models for named entity recognition (NER), text classification, sentiment analysis, and conversational AI. Ideal for chatbots, language models, and content categorization.
Predictive Analytics and Recommendation Systems: Enable personalized marketing campaigns by predicting buyer behavior. Build smarter recommendation engines for ecommerce and content platforms.
B2B Lead Generation and Market Insights: Create models that identify high-value leads using enriched company and contact information. Develop AI systems that track trends and provide strategic insights for businesses.
HR and Talent Acquisition AI: Optimize talent-matching algorithms using structured job descriptions and candidate profiles. Build AI-powered platforms for recruitment analytics.
How This Product Fits Into Xverum’s Broader Data Offering Xverum is a leading provider of structured, high-quality web datasets. While we specialize in B2B profiles and company data, we also offer complementary datasets tailored for specific verticals, including ecommerce product data, job listings, and customer reviews. The AI Training Data is a natural extension of our core capabilities, bridging the gap between structured data and machine learning workflows. By providing annotation-ready datasets, real-time API access, and customization options, we ensure our clients can seamlessly integrate our data into their AI development processes.
Why Choose Xverum? - Experience and Expertise: A trusted name in structured web data with a proven track record. - Flexibility: Datasets can be tailored for any AI/ML application. - Scalability: With 800M profiles and more being added, you’ll always have access to fresh, up-to-date data. - Compliance: We prioritize data ethics and security, ensuring all data adheres to GDPR and other legal frameworks.
Ready to supercharge your AI and ML projects? Explore Xverum’s AI Training Data to unlock the potential of 800M global B2B profiles. Whether you’re building a chatbot, predictive algorithm, or next-gen AI application, our data is here to help.
Contact us for sample datasets or to discuss your specific needs.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This repository includes datasets for the experimental case studies and analysis of the research "Programmable content and a pattern-matching algorithm for automatic adaptive authoring in Augmented Reality for maintenance".
DOI:
Abstract: "Augmented Reality (AR) can increase efficiency and safety of maintenance operations, but costs of augmented content creation (authoring) are hindering its industrial deployment. A relevant research gap involves the ability of authoring solutions to automatically generate content for multiple operations. Hence, this paper offers programmable content formats and a pattern-matching algorithm for automatic adaptive authoring of ontology-based maintenance data. The proposed solution is validated against common authoring tools for repair and remote diagnosis AR applications in terms of operational efficiency gains achieved with the content they produce. Experimental results show that content from all authoring solutions attain the same time reductions (42%) in comparison with non-AR information delivery tools. Survey results suggest alike perceived usability of all authoring solutions and better content adaptiveness and user’s performance tracking of this authoring proposal."
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains replication code and data for all analysis carried out in the paper along with any anonymized datasets used to run the language-matching algorithm. It also contains the code used to preprocess the administrative data used in the language-matching algorithm (along with the preprocessed versions of the data) and the set of cutoffs we deployed in our pilot with Santa Clara County.
This data corresponds to the data and experiments described in Section 5 of the following paper submitted to the SEA conference 2018:
A 3/2-approximation algorithm for the Student-Project Allocation problem
Authors: Frances Cooper and David Manlove
The data is located at: https://doi.org/10.5281/zenodo.1186823
The software is located at: https://doi.org/10.5281/zenodo.1183221
See the README for more information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ARG Database is a huge collection of labeled and unlabeled graphs realized by the MIVIA Group.
The aim of this collection is to provide the graph research community with a standard test ground for the benchmarking of graph matching algorithms. The database is organized in two sections: labeled and unlabeled graphs.
Both labeled and unlabeled graphs have been randomly generated according to six different generation models, each involving different possible parameter settings. As a result, 168 diverse kinds of graphs are contained in the database. Each type of unlabeled graph is represented by thousands of pairs of graphs for which an isomorphism or a graph-subgraph isomorphism relation holds, for a total of 143,600 graphs. Furthermore, each type of labeled graph is represented by thousands of pairs of graphs sharing a non-trivial common subgraph, for a total of 166,000 graphs.
For more details follow this link: https://mivia.unisa.it/datasets/graph-database/arg-database/documentation/
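As a hedged usage sketch (the ARG files use their own binary format, and parsing them is not shown here), once a pair of graphs has been loaded into networkx, an off-the-shelf VF2 matcher can serve as a correctness baseline when benchmarking a new graph matching algorithm:

```python
import networkx as nx

def check_pair(g_small, g_large):
    """Report whether the pair is isomorphic and whether g_small appears as an
    induced subgraph of g_large, using networkx's VF2 implementation."""
    gm = nx.algorithms.isomorphism.GraphMatcher(g_large, g_small)
    return {
        "isomorphic": nx.is_isomorphic(g_small, g_large),
        "subgraph_isomorphic": gm.subgraph_is_isomorphic(),
    }

# Toy stand-ins for a parsed ARG pair.
print(check_pair(nx.path_graph(3), nx.cycle_graph(4)))
```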
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Currently, the repository provides code for two such methods:
The ABE fully automated approach: This approach is a fully automated method for linking historical datasets (e.g. complete-count Censuses) by first name, last name and age. The approach was first developed by Ferrie (1996) and adapted and scaled for the computer by Abramitzky, Boustan and Eriksson (2012, 2014, 2017). Because names are often misspelled or mistranscribed, our approach suggests testing robustness to alternative name matching (using raw names, NYSIIS standardization, and Jaro-Winkler distance). To reduce the chances of false positives, our approach suggests testing robustness by requiring names to be unique within a five-year window and/or requiring the match on age to be exact.
A fully automated probabilistic approach (EM): This approach (Abramitzky, Mill, and Perez 2019) suggests a fully automated probabilistic method for linking historical datasets. We combine distances in reported names and ages between each two potential records into a single score, roughly corresponding to the probability that both records belong to the same individual. We estimate these probabilities using the Expectation-Maximization (EM) algorithm, a standard technique in the statistical literature. We suggest a number of decision rules that use these estimated probabilities to determine which records to use in the analysis.
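As a toy illustration of the name-based robustness checks mentioned above (NYSIIS standardization, Jaro-Winkler similarity, and an age band), assuming the jellyfish library is available; the thresholds, record layout, and all-pairs blocking are assumptions, not the ABE defaults:

```python
import jellyfish

def candidate_links(records_a, records_b, name_sim=0.9, max_age_gap=2):
    """Return (id_a, id_b) pairs whose NYSIIS last names agree, whose first names
    are close under Jaro-Winkler, and whose reported ages are within a band."""
    links = []
    for a in records_a:
        for b in records_b:
            if jellyfish.nysiis(a["last"]) != jellyfish.nysiis(b["last"]):
                continue
            if jellyfish.jaro_winkler_similarity(a["first"], b["first"]) < name_sim:
                continue
            if abs(a["age"] - b["age"]) > max_age_gap:
                continue
            links.append((a["id"], b["id"]))
    return links

print(candidate_links(
    [{"id": 1, "first": "Jon", "last": "Smith", "age": 32}],
    [{"id": 7, "first": "John", "last": "Smith", "age": 34}],
))
```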