100+ datasets found
  1. Caption-Evaluation-Data

    • huggingface.co
    Updated Apr 20, 2025
    Cite
    OpenGVLab (2025). Caption-Evaluation-Data [Dataset]. https://huggingface.co/datasets/OpenGVLab/Caption-Evaluation-Data
    Explore at:
    13 scholarly articles cite this dataset (View in Google Scholar)
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 20, 2025
    Dataset authored and provided by
    OpenGVLab
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This repository presents the evaluation data used for ASM. Please refer to this document for more details about the repository.
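
    As a quick way to inspect a Hugging Face-hosted dataset like this one, the datasets library can usually pull it directly. The repository id below comes from the citation above; the presence and names of splits are assumptions, so the sketch only prints whatever it finds.

    # Minimal sketch: load the repository from the Hugging Face Hub.
    # Depending on the repo layout, load_dataset may additionally ask
    # for a config name; split and column names are not documented here.
    from datasets import load_dataset

    ds = load_dataset("OpenGVLab/Caption-Evaluation-Data")  # DatasetDict of all splits
    print(ds)                # split names and row counts
    split = next(iter(ds))   # pick whichever split exists
    print(ds[split][0])      # first record of that split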

  2. Data from: Data Sets for Evaluation of Building Fault Detection and...

    • catalog.data.gov
    • data.openei.org
    • +1 more
    Updated Apr 26, 2022
    Cite
    Lawrence Berkeley National Laboratory (2022). Data Sets for Evaluation of Building Fault Detection and Diagnostics Algorithms [Dataset]. https://catalog.data.gov/dataset/data-sets-for-evaluation-of-building-fault-detection-and-diagnostics-algorithms-2de50
    Explore at:
    Dataset updated
    Apr 26, 2022
    Dataset provided by
    Lawrence Berkeley National Laboratory
    Description

    This documentation and dataset can be used to test the performance of automated fault detection and diagnostics algorithms for buildings. The dataset was created by LBNL, PNNL, NREL, ORNL and ASHRAE RP-1312 (Drexel University). It includes data for air-handling units and rooftop units simulated with PNNL's large office building model.

  3. NLU-Evaluation-Data-en-de

    • huggingface.co
    Updated Apr 23, 2023
    Cite
    Deutsche Telekom AG (2023). NLU-Evaluation-Data-en-de [Dataset]. https://huggingface.co/datasets/deutsche-telekom/NLU-Evaluation-Data-en-de
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 23, 2023
    Dataset provided by
    Deutsche Telekom (http://www.telekom.de/)
    Authors
    Deutsche Telekom AG
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    NLU Evaluation Data - English and German

    A labeled English and German language multi-domain dataset (21 domains) with 25K user utterances for human-robot interaction. This dataset is collected and annotated for evaluating NLU services and platforms. The detailed paper on this dataset can be found at arXiv.org: Benchmarking Natural Language Understanding Services for building Conversational Agents The dataset builds on the annotated data of the xliuhw/NLU-Evaluation-Data repository.… See the full description on the dataset page: https://huggingface.co/datasets/deutsche-telekom/NLU-Evaluation-Data-en-de.

  4. Bengali-Prompt-Evaluation-Data

    • huggingface.co
    Updated Apr 14, 2024
    Cite
    DIBT-Bengali (2024). Bengali-Prompt-Evaluation-Data [Dataset]. https://huggingface.co/datasets/DIBT-Bengali/Bengali-Prompt-Evaluation-Data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 14, 2024
    Dataset authored and provided by
    DIBT-Bengali
    Description

    Dataset Card for Bengali-Prompt-Evaluation-Data

    This dataset has been created with Argilla. As shown in the sections below, this dataset can be loaded into Argilla as explained in Load with Argilla, or used directly with the datasets library in Load with datasets.

      Dataset Summary
    

    This dataset contains:

    A dataset configuration file conforming to the Argilla dataset format named argilla.yaml. This configuration file will be used to configure the dataset when using the… See the full description on the dataset page: https://huggingface.co/datasets/DIBT-Bengali/Bengali-Prompt-Evaluation-Data.
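
    Loading directly with the datasets library follows the same load_dataset pattern sketched under entry 1. For the Argilla route, a hedged sketch follows: the from_huggingface call matches the Argilla 1.x API and may differ in newer releases, and the server URL, API key, and workspace below are placeholder assumptions.

    # Sketch of the Argilla loading route (Argilla 1.x API assumed).
    import argilla as rg

    # Pull the dataset from the Hub into an Argilla FeedbackDataset.
    fb = rg.FeedbackDataset.from_huggingface(
        "DIBT-Bengali/Bengali-Prompt-Evaluation-Data"
    )
    print(len(fb.records))  # records pulled from the Hub

    # To explore it in the Argilla UI (running instance assumed):
    # rg.init(api_url="http://localhost:6900", api_key="admin.apikey")
    # fb.push_to_argilla(name="bengali-prompt-eval", workspace="admin")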

  5. WIA Non-Experimental Net Impact Evaluation Dataset

    • datasets.ai
    Updated Sep 10, 2024
    + more versions
    Cite
    Department of Labor (2024). WIA Non-Experimental Net Impact Evaluation Dataset [Dataset]. https://datasets.ai/datasets/wia-non-expermental-net-impact-evaluation-dataset
    Explore at:
    Dataset updated
    Sep 10, 2024
    Dataset authored and provided by
    Department of Labor
    Description

    The evaluation employs administrative data from 12 states, covering approximately 160,000 WIA Adult, WIA Dislocated Worker and WIA Youth participants and nearly 3 million comparison group members. Focusing on participants who entered WIA programs between July 2003 and June 2005, the evaluation considers the impact for all those in the program, the impact for those receiving only Core or Intensive Services, and the incremental impact of Training Services. This dataset contains all of the information used to conduct the non-experimental evaluation estimates for (1) the WIA Client Treatment Group and (2) the Unemployment Insurance and Employment Service Client comparison group. The administrative data collected by IMPAQ for the "Workforce Investment Act Non-Experimental Net Impact Evaluation" project were received from state agencies in three segments: annual Workforce Investment Act Standardized Record Data (WIASRD) or closely related files, Unemployment Insurance data, and Unemployment Insurance Wage Record data. The analyses were conducted for twelve states; however, based on the data sharing agreements, the Public Use Data (PUD) set includes data for nine states only. Our agreement for use of these data required that the identity of those states not be revealed. As a result, all geographical identifiers were removed to preserve states' anonymity.

  6. GAED: Game Acceptance Evaluation Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 18, 2020
    Cite
    Augusto de Castro Vieira (2020). GAED: Game Acceptance Evaluation Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3403237
    Explore at:
    Dataset updated
    Feb 18, 2020
    Dataset provided by
    Augusto de Castro Vieira
    Wladmir Cardoso Brandão
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The Game Acceptance Evaluation Dataset (GAED) contains statistical data as well as the training and validation sets used in our experiments on Neural Networks to evaluate Video Games Acceptance.

    Please consider citing the following references if you found this dataset useful:

    [1] Augusto de Castro Vieira, Wladmir Cardoso Brandão. Evaluating Acceptance of Video Games using Convolutional Neural Networks for Sentiment Analysis of User Reviews. In Proceedings of the 30th ACM Conference on Hypertext and Social Media. 2019.

    [2] Augusto de Castro Vieira, Wladmir Cardoso Brandão. GA-Eval: A Neural Network based approach to evaluate Video Games Acceptance. In Proceedings of the 18th Brazilian Symposium on Computer Games and Digital Entertainment. 2019.

    [3] Augusto de Castro Vieira, Wladmir Cardoso Brandão. (2019). GAED: The Game Acceptance Evaluation Dataset (Version 1.0) [Data set]. Zenodo.

  7. Measure Evaluation

    • gimi9.com
    • catalog.data.gov
    • +1 more
    Cite
    Measure Evaluation [Dataset]. https://gimi9.com/dataset/data-gov_measure-evaluation/
    Explore at:
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0) (https://creativecommons.org/licenses/by-nd/4.0/)
    License information was derived automatically

    Description

    MEASURE Evaluation is the USAID Global Health Bureau's primary vehicle for supporting improvements in monitoring and evaluation in population, health and nutrition worldwide. They help to identify data needs, collect and analyze technically sound data, and use that data for health decision making. Some MEASURE Evaluation activities involve the collection of innovative evaluation data sets in order to increase the evidence-base on program impact and evaluate the strengths and weaknesses of recent evaluation methodological developments. Many of these data sets may be available to other researchers to answer questions of particular importance to global health and evaluation research. Some of these data sets are being added to the Dataverse on a rolling basis, as they become available. This collection on the Dataverse platform contains a growing variety and number of global health evaluation datasets.

  8. GouDa - Generation of universal Data Sets

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated Jun 3, 2022
    Cite
    André Conrad (2022). GouDa - Generation of universal Data Sets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6523461
    Explore at:
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    Uta Störl
    Valerie Restat
    Gerrit Boerner
    André Conrad
    License

    Attribution 2.0 (CC BY 2.0) (https://creativecommons.org/licenses/by/2.0/)
    License information was derived automatically

    Description

    GouDa is a tool for the generation of universal data sets to evaluate and compare existing data preparation tools and new research approaches. It supports diverse error types and arbitrary error rates. Ground truth is provided as well. It thus permits better analysis and evaluation of data preparation pipelines and simplifies the reproducibility of results.

    Publication: V. Restat, G. Boerner, A. Conrad, and U. Störl. GouDa - Generation of universal Data Sets. In Proceedings of Data Management for End-to-End Machine Learning (DEEM’22), Philadelphia, USA, 2022. https://doi.org/10.1145/3533028.3533311
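
    GouDa's own interface is not documented in this listing, so the following is only a generic illustration of the core idea (corrupt a controlled fraction of cells while keeping the clean copy as ground truth), not GouDa's API. The error type shown, missing values, is one arbitrary choice.

    # Illustrative sketch of error injection with ground truth (NOT GouDa's API).
    import numpy as np
    import pandas as pd

    def inject_missing(df: pd.DataFrame, error_rate: float, seed: int = 0):
        """Blank out a random fraction of cells; return (dirty, ground_truth)."""
        rng = np.random.default_rng(seed)
        cells_to_corrupt = rng.random(df.shape) < error_rate
        dirty = df.mask(cells_to_corrupt)  # corrupted cells become NaN
        return dirty, df

    clean = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})
    dirty, truth = inject_missing(clean, error_rate=0.2)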

  9. ASDF Evaluation Dataset + Corner Clamp Base

    • zenodo.org
    Updated May 15, 2024
    Cite
    Hannah Schieber; Shiyu Li; Niklas Corell; Philipp Beckerle; Julian Kreimeier; Daniel Roth (2024). ASDF Evaluation Dataset + Corner Clamp Base [Dataset]. http://doi.org/10.5281/zenodo.11188134
    Explore at:
    Available download formats: zip
    Dataset updated
    May 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hannah Schieber; Shiyu Li; Niklas Corell; Philipp Beckerle; Julian Kreimeier; Daniel Roth
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Evaluation data for ASDF and Corner Clamp Base Training Data

    @article{schieber2024asdf,
      title={ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation},
      author={Schieber, Hannah and Li, Shiyu and Corell, Niklas and Beckerle, Philipp and Kreimeier, Julian and Roth, Daniel},
      journal={arXiv preprint arXiv:2403.16400},
      year={2024}
    }

  10. Prognostics Performance Evaluation

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). Prognostics Performance Evaluation [Dataset]. https://catalog.data.gov/dataset/prognostics-performance-evaluation
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    This is the first version of the performance evaluation tool. Evaluation is based on point estimates of the RUL predictions. More detailed documentation will be available with the tool soon. In the meantime, please download the attached files/folder into the same root folder to run a demo. To evaluate your own results, create results and application data files in .mat format and save them in the results folder. Please make sure you name your files xxx_results.mat for results and yyy_appData.mat for application data.
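
    Assuming the files hold simple numeric arrays (the internal variable names are not documented here and are placeholders below), they could be written with scipy following that naming convention:

    # Sketch: write results/application data as .mat files for the tool.
    # Only the file-naming convention comes from the description above;
    # the variable names inside each file are assumptions.
    import os
    import numpy as np
    from scipy.io import savemat

    os.makedirs("results", exist_ok=True)
    savemat(os.path.join("results", "myalgo_results.mat"),
            {"rul_pred": np.array([110.0, 95.2, 80.7])})
    savemat(os.path.join("results", "myalgo_appData.mat"),
            {"app_data": np.array([1.0, 2.0, 3.0])})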

  11. Data from: TruthEval: A Dataset to Evaluate LLM Truthfulness and Reliability...

    • borealisdata.ca
    Updated Jul 30, 2024
    Cite
    Aisha Khatun; Dan Brown (2024). TruthEval: A Dataset to Evaluate LLM Truthfulness and Reliability [Dataset]. http://doi.org/10.5683/SP3/5MZWBV
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 30, 2024
    Dataset provided by
    Borealis
    Authors
    Aisha Khatun; Dan Brown
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Large Language Model (LLM) evaluation is currently one of the most important areas of research, with existing benchmarks proving to be insufficient and not completely representative of LLMs' various capabilities. We present a curated collection of challenging statements on sensitive topics for LLM benchmarking called TruthEval. These statements were curated by hand and contain known truth values. The categories were chosen to distinguish LLMs' abilities from their stochastic nature. Details of collection method and use cases can be found in this paper: TruthEval: A Dataset to Evaluate LLM Truthfulness and Reliability

  12. Open Media Forensics Challenge (OpenMFC) Evaluation Datasets

    • data.nist.gov
    • catalog.data.gov
    Updated Mar 4, 2022
    Cite
    Haiying Guan (2022). Open Media Forensics Challenge (OpenMFC) Evaluation Datasets [Dataset]. http://doi.org/10.18434/mds2-2410
    Explore at:
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Haiying Guan
    License

    https://www.nist.gov/open/license

    Description

    The datasets contain the following parts for Open Media Forensics Challenge (OpenMFC) evaluations:
    1. NC16 Kickoff dataset
    2. NC17 development and evaluation datasets
    3. MFC18 development and evaluation datasets
    4. MFC19 development and evaluation datasets
    5. MFC20 development and evaluation datasets
    6. OpenMFC2022 steg datasets
    7. OpenMFC2022 deepfake datasets

  13. PDOX - Evaluation Dataset

    • kaggle.com
    Updated Jun 23, 2024
    Cite
    Samuel Meyers (2024). PDOX - Evaluation Dataset [Dataset]. https://www.kaggle.com/datasets/mrovkill/paradox-pdox-evaluation
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 23, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Samuel Meyers
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset was synthesized with the assistance of Gemini 1.5 Pro for evaluating LLMs with paradoxical logic. It is also intended to serve as the seed for a TBD dataset related to practical use of this dataset.

  14. Research and Development Evaluation Committee Data Inventory Form

    • data.gov.tw
    Updated Jun 16, 2025
    Cite
    Research, Development and Evaluation Commission, KCG (2025). Research and Development Evaluation Committee Data Inventory Form [Dataset]. https://data.gov.tw/en/datasets/46828
    Explore at:
    Dataset updated
    Jun 16, 2025
    Dataset authored and provided by
    Research, Development and Evaluation Commission, KCG
    License

    https://data.gov.tw/license

    Description

    Inventory of materials:
    • Data classification - Class A: open data. Class B: data with limited use. Class C: non-open data.
    • Current situation - 1. Free to use. 2. Free application. 3. Fee. 4. Not open.
    • Degree of openness - 1. Already open. 2. Scheduled to open (please fill in the scheduled opening date). 3. Cannot be opened (please fill in the reason for not being able to open in the remarks column).

  15. Data from: Public sharing of research datasets: a pilot study of...

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 26, 2011
    Cite
    Heather A. Piwowar; Wendy W. Chapman (2011). Public sharing of research datasets: a pilot study of associations [Dataset]. http://doi.org/10.5061/dryad.3td2f
    Explore at:
    Available download formats: zip
    Dataset updated
    May 26, 2011
    Dataset provided by
    University of Pittsburgh
    Authors
    Heather A. Piwowar; Wendy W. Chapman
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The public sharing of primary research datasets potentially benefits the research community but is not yet common practice. In this pilot study, we analyzed whether data sharing frequency was associated with funder and publisher requirements, journal impact factor, or investigator experience and impact. Across 397 recent biomedical microarray studies, we found investigators were more likely to publicly share their raw dataset when their study was published in a high-impact journal and when the first or last authors had high levels of career experience and impact. We estimate the USA's National Institutes of Health (NIH) data sharing policy applied to 19% of the studies in our cohort; being subject to the NIH data sharing plan requirement was not found to correlate with increased data sharing behavior in multivariate logistic regression analysis. Studies published in journals that required a database submission accession number as a condition of publication were more likely to share their data, but this trend was not statistically significant. These early results will inform our ongoing larger analysis, and hopefully contribute to the development of more effective data sharing initiatives. Earlier version presented at ASIS&T and ISSI Pre-Conference: Symposium on Informetrics and Scientometrics 2009

  16. Evaluation Dataset for LiDAR-Based Object Detection Algorithms Across...

    • data.uni-hannover.de
    Updated Mar 7, 2025
    + more versions
    Cite
    Institut für Produktentwicklung und Gerätebau (2025). Evaluation Dataset for LiDAR-Based Object Detection Algorithms Across Varying Angular Resolutions [Dataset]. https://data.uni-hannover.de/es/dataset/08a012fb-9179-4ba3-8430-ea5ada68b1d0
    Explore at:
    Available download formats: xlsx(1677940), xlsx(1789866), xlsx(1720242), xlsx(1675823), xlsx(1357943), xlsx(1310096), xlsx(1714643), xlsx(1497367), xlsx(1668771), xlsx(1782491)
    Dataset updated
    Mar 7, 2025
    Dataset authored and provided by
    Institut für Produktentwicklung und Gerätebau
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    This dataset provides object detection results using five different LiDAR-based object detection algorithms: PointRCNN, SECOND, Part-A², PointPillars, and PVRCNN. The experiments aim to determine the optimal angular resolution for LiDAR-based object detection. The point cloud data was generated in the CARLA simulator, modeled in a suburban scenario featuring 30 vehicles, 13 bicycles, and 40 pedestrians. The angular resolution in the dataset ranges from 0.1° x 0.1° (H x V) to 1.0° x 1.0°, with increments of 0.1° in each direction.

    For each angular resolution, over 2000 frames of point clouds were collected, with 1600 of these frames labeled across three object classes (vehicles, pedestrians, and cyclists) for algorithm training purposes. The dataset includes detection results after evaluating 1000 frames, with results recorded for the respective angular resolutions.

    Each file in the dataset contains five sheets, corresponding to the five different algorithms evaluated. The data structure includes the following columns (a short loading sketch follows the list):

    1. Frame Index: Indicates the frame number, ranging from 1 to 1000.

    2. Object Classification: Labels objects as 1 (Vehicle), 2 (Pedestrian), or 3 (Cyclist).

    3. Confidence Score: Represents the confidence level of the detected object in its bounding box.

    4. Number of LiDAR Points: Indicates the count of LiDAR points within the bounding box.

    5. Bounding Box Distance: Specifies the distance of the bounding box from the LiDAR sensor.
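
    As a hedged reading sketch: pandas can open one of the xlsx files with one DataFrame per algorithm sheet. The file name below is hypothetical, and the exact header strings are assumptions based on the column list above, so the code addresses columns by position.

    # Sketch: read one xlsx file, one sheet per detection algorithm.
    import pandas as pd

    sheets = pd.read_excel("resolution_0.1x0.1.xlsx", sheet_name=None)  # dict: sheet -> DataFrame
    for algo, df in sheets.items():
        vehicles = df[df.iloc[:, 1] == 1]  # column 2 holds 1=Vehicle, 2=Pedestrian, 3=Cyclist
        print(algo, len(df), "rows,", len(vehicles), "vehicle detections")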

    This dataset was created in the context of the Leibniz Young Investigator Grants programme of Leibniz University Hannover and is funded by the Ministry of Science and Culture of Lower Saxony (MWK), Grant No. 11-76251-114/2022.

  17. Student evaluation of teaching data 2009-2010

    • dro.deakin.edu.au
    Updated May 22, 2024
    Cite
    Stuart Palmer (2024). Student evaluation of teaching data 2009-2010 [Dataset]. http://doi.org/10.26187/deakin.25807732.v1
    Explore at:
    Dataset updated
    May 22, 2024
    Dataset provided by
    Deakin University (http://www.deakin.edu.au/)
    Authors
    Stuart Palmer
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset consists of a summary of publicly available student evaluation of teaching (SET) data for the annual period of trimester 2 2009, trimester 3 2009/2010 and trimester 1 2010, from the Student Evaluation of Teaching and Units (SETU) portal. The data was analysed to include mean rating sets for 1432 units of study, representing 74498 sets of SETU ratings, 188391 individual student enrolments and 58.5 percent of all units listed in the Deakin University handbook for the period under consideration, in order to identify any systematic influences on SET results at Deakin University.

    The data reported for a unit included:
    • total enrolment;
    • total number of responses; and
    • computed response rate for the enrolment location(s) selected.

    The data reported for each of the ten core SETU items included:
    • number of responses;
    • mean rating;
    • standard deviation of the mean rating;
    • percentage agreement;
    • percentage disagreement; and
    • percentage difference.

  18. ORKG Similar Papers Recommendation Service Evaluation Dataset

    • data.uni-hannover.de
    Updated Jan 30, 2023
    Cite
    TIB (2023). ORKG Similar Papers Recommendation Service Evaluation Dataset [Dataset]. https://data.uni-hannover.de/dataset/orkg-similar-papers-recommendation-service-evaluation-dataset
    Explore at:
    Available download formats: .csv(4543675)
    Dataset updated
    Jan 30, 2023
    Dataset authored and provided by
    TIB
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0) (https://creativecommons.org/licenses/by-sa/3.0/)
    License information was derived automatically

    Description

    This dataset was created to compare and evaluate the Semantic Scholar recommendation service and the Open Research Knowledge Graph (ORKG) similar papers recommendation service based on Elastic Search. The dataset includes 30 random ORKG comparisons, each of which is provided with 50 similar papers recommended by Semantic Scholar and 50 papers recommended by Elastic Search, including the 10 most relevant papers that were manually labeled.

    Dataset columns:

    • Comparison resource ID: the identifier of the ORKG comparison resource
    • Elastic search score: the relevancy score provided by Elastic Search
    • Curated score: papers that are the most relevant to the comparison were manually labeled as "1"
    • ES DOI: digital object identifier of the paper, recommended by Elastic Search
    • ES Title: title of the paper, recommended by Elastic Search
    • ES Abstract: abstract of the paper, recommended by Elastic Search
    • Semantic scholar rank: the relevancy rank provided by Semantic Scholar
    • SS DOI: digital object identifier of the paper, recommended by Semantic Scholar
    • SS Title: title of the paper, recommended by Semantic Scholar
    • SS Abstract: abstract of the paper, recommended by Semantic Scholar

    Evaluation results (a small P@k/R@k computation sketch follows the figures):

    Average precision (P@k) and recall (R@k) for Semantic Scholar results:

    • P@50 = 0.11; R@50 = 0.54
    • P@40 = 0.13; R@40 = 0.50
    • P@30 = 0.16; R@30 = 0.48
    • P@20 = 0.20; R@20 = 0.40
    • P@10 = 0.29; R@10 = 0.29

    Average precision (P@k) and recall (R@k) for Elastic Search results:

    • P@50 = 0.20; R@50 = 1.00
    • P@40 = 0.25; R@40 = 0.98
    • P@30 = 0.32; R@30 = 0.97
    • P@20 = 0.46; R@20 = 0.92
    • P@10 = 0.63; R@10 = 0.63
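
    The reported figures follow the standard top-k definitions. A minimal sketch, assuming a rank-ordered 0/1 relevance list per comparison (as in the Curated score column) and 10 relevant papers per comparison:

    # Sketch: precision@k and recall@k for one ranked result list.
    def precision_recall_at_k(relevant, k, n_relevant_total=10):
        top_k = relevant[:k]   # labels of the first k recommendations
        hits = sum(top_k)      # relevant papers found in the top k
        return hits / k, hits / n_relevant_total

    labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1] + [0] * 40  # hypothetical ranking
    p10, r10 = precision_recall_at_k(labels, 10)
    print(f"P@10={p10:.2f}, R@10={r10:.2f}")
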
  19. A PANEL DATA APPROACH FOR PROGRAM EVALUATION: MEASURING THE BENEFITS OF...

    • journaldata.zbw.eu
    • jda-test.zbw.eu
    Updated Dec 7, 2022
    Cite
    Cheng Hsiao; H. Steve Ching; Shui Ki Wan (2022). A PANEL DATA APPROACH FOR PROGRAM EVALUATION: MEASURING THE BENEFITS OF POLITICAL AND ECONOMIC INTEGRATION OF HONG KONG WITH MAINLAND CHINA (replication data) [Dataset]. http://doi.org/10.15456/jae.2022320.0727003662
    Explore at:
    Available download formats: txt(1646), txt(17605)
    Dataset updated
    Dec 7, 2022
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Cheng Hsiao; H. Steve Ching; Shui Ki Wan
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Hong Kong
    Description

    We propose a simple-to-implement panel data method to evaluate the impacts of social policy. The basic idea is to exploit the dependence among cross-sectional units to construct the counterfactuals. The cross-sectional correlations are attributed to the presence of some (unobserved) common factors. However, instead of trying to estimate the unobserved factors, we propose to use observed data. We use a panel of 24 countries to evaluate the impact of political and economic integration of Hong Kong with mainland China. We find that the political integration hardly had any impact on the growth of the Hong Kong economy. However, the economic integration has raised Hong Kong's annual real GDP by about 4%.
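
    As a rough numerical illustration of that idea (not the authors' exact estimator, and with synthetic data in place of the 24-country panel), one can regress the treated unit's pre-treatment outcomes on the control units' outcomes and project the fit forward as the counterfactual:

    # Sketch: panel-data counterfactual via pre-treatment regression on controls.
    # All data here are synthetic; the 4% effect is planted to mirror the text.
    import numpy as np

    rng = np.random.default_rng(1)
    T0, T1, N = 40, 10, 24                    # pre/post periods, control units
    controls = rng.normal(size=(T0 + T1, N))  # outcomes of control units
    treated = controls[:, :3].mean(axis=1) + rng.normal(scale=0.1, size=T0 + T1)
    treated[T0:] += 0.04                      # planted post-treatment effect

    X_pre = np.column_stack([np.ones(T0), controls[:T0]])
    coef, *_ = np.linalg.lstsq(X_pre, treated[:T0], rcond=None)
    counterfactual = np.column_stack([np.ones(T1), controls[T0:]]) @ coef
    print("estimated average effect:", (treated[T0:] - counterfactual).mean())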

  20. Coast Train--Labeled imagery for training and evaluation of data-driven...

    • data.usgs.gov
    • catalog.data.gov
    Updated Aug 26, 2024
    + more versions
    Cite
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand (2024). Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation [Dataset]. http://doi.org/10.5066/P91NP87I
    Explore at:
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    Jan 1, 2008 - Dec 31, 2020
    Description

    Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from both geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the following naming convention: {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes us ...
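
    Given the NPZ packaging described above, numpy can list and read the stored arrays once a file is unzipped. The file name and key names below are hypothetical; np.load reports whatever arrays a file actually contains.

    # Sketch: inspect one Coast Train NPZ file (names are placeholders).
    import numpy as np

    npz = np.load("naip_4_001.npz", allow_pickle=True)
    print(npz.files)                          # names of the stored arrays
    # img, mask = npz["image"], npz["label"]  # assumed keys; check npz.files first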
