15 datasets found
  1. SAS-Bench Dataset

    • paperswithcode.com
    Updated May 11, 2025
    Cite
    Peichao Lai; Kexuan Zhang; Yi Lin; Linyihan Zhang; Feiyang Ye; Jinhao Yan; Yanwei Xu; Conghui He; Yilei Wang; Wentao Zhang; Bin Cui (2025). SAS-Bench Dataset [Dataset]. https://paperswithcode.com/dataset/sas-bench
    Dataset updated
    May 11, 2025
    Authors
    Peichao Lai; Kexuan Zhang; Yi Lin; Linyihan Zhang; Feiyang Ye; Jinhao Yan; Yanwei Xu; Conghui He; Yilei Wang; Wentao Zhang; Bin Cui
    Description

    SAS-Bench represents the first specialized benchmark for evaluating Large Language Models (LLMs) on Short Answer Scoring (SAS) tasks. Utilizing authentic questions from China's National College Entrance Examination (Gaokao), our benchmark offers:

    • 1,030 questions spanning 9 academic disciplines
    • 4,109 expert-annotated student responses
    • Step-wise scoring with step-wise error analysis
    • Multi-dimensional evaluation (holistic scoring, step-wise scoring, and error diagnosis consistency)

  2. Results for the tree classification models for our example services.

    • figshare.com
    xls
    Updated Jun 7, 2023
    Cite
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug (2023). Results for the tree classification models for our example services. [Dataset]. http://doi.org/10.1371/journal.pone.0266154.t002
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results for the tree classification models for our example services.

  3. Hadoop Big-Data Analytics Tool Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 3, 2025
    Cite
    Market Report Analytics (2025). Hadoop Big-Data Analytics Tool Report [Dataset]. https://www.marketreportanalytics.com/reports/hadoop-big-data-analytics-tool-56923
    Available download formats: pdf, doc, ppt
    Dataset updated
    Apr 3, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Hadoop Big Data Analytics market, valued at $4053.9 million in 2025, is experiencing robust growth, projected to expand at a Compound Annual Growth Rate (CAGR) of 12.4% from 2025 to 2033. This growth is fueled by the increasing volume and velocity of data generated across diverse industries, coupled with a rising demand for advanced analytics capabilities to extract actionable insights. Key drivers include the need for improved operational efficiency, enhanced decision-making, and competitive advantage. The market is segmented by application (Large Enterprise and SME) and by type (Data Ingestion Tools, Data Processing Tools, Data Query and Analysis Tools, and Other). Large enterprises currently dominate the application segment, driven by their significant data volumes and sophisticated analytics needs. However, increasing adoption of cloud-based solutions and affordable data analytics tools is fueling growth in the SME segment. Data Ingestion Tools represent a significant portion of the market, reflecting the crucial initial step in the data analytics lifecycle. The leading companies in this space – Cloudera, MapR Technologies, IBM, Amazon Web Services, Microsoft, Google, VMware, Oracle, Teradata, and SAS – are constantly innovating, expanding their product portfolios, and engaging in strategic partnerships to maintain a competitive edge. Geographic expansion, particularly in rapidly developing economies of Asia Pacific and Middle East & Africa, further contributes to market expansion. The forecast period (2025-2033) anticipates continuous market evolution. Trends such as the increasing adoption of cloud-based Hadoop solutions, the growing popularity of real-time analytics, and the rise of artificial intelligence (AI) and machine learning (ML) integrated with Hadoop are expected to shape the market landscape. 
    However, challenges remain, including the complexity of Hadoop implementation and the need for specialized skills to manage and analyze large datasets. Furthermore, data security concerns and regulatory compliance requirements pose restraints on market growth, although advancements in security technologies are mitigating these issues. The ongoing evolution of Hadoop towards more user-friendly interfaces and managed services is expected to drive wider adoption across various industries and business sizes in the years to come.
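
The growth figures quoted in this description can be checked with a short compound-growth calculation. The sketch below is illustrative only: it assumes the 12.4% CAGR compounds once per year over the eight years from 2025 to 2033, which the report summary does not spell out, and the function and constant names are hypothetical.

```python
def project_value(base: float, rate: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1.0 + rate) ** years


# Figures taken from the description; annual compounding is an assumption.
BASE_2025_MILLION_USD = 4053.9
CAGR = 0.124

if __name__ == "__main__":
    for year in range(2025, 2034):
        value = project_value(BASE_2025_MILLION_USD, CAGR, year - 2025)
        print(f"{year}: ${value:,.1f} million")
```

Under these assumptions the 2033 figure comes out at roughly 2.5 times the 2025 value.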

  4. The selected example codes and their definitions.

    • plos.figshare.com
    xls
    Updated Jun 7, 2023
    Cite
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug (2023). The selected example codes and their definitions. [Dataset]. http://doi.org/10.1371/journal.pone.0266154.t001
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The selected example codes and their definitions.

  5. Research on Facilitators of Transnational Organized Crime: Understanding...

    • icpsr.umich.edu
    • catalog.data.gov
    Updated Apr 29, 2019
    Cite
    Chapman, Meg (2019). Research on Facilitators of Transnational Organized Crime: Understanding Crime Networks' Logistical Support, United States, 2006-2014 [Dataset]. http://doi.org/10.3886/ICPSR37171.v1
    Dataset updated
    Apr 29, 2019
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Chapman, Meg
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/37171/terms

    Time period covered
    2006 - 2014
    Area covered
    United States
    Description

    These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.

    This study addressed the dearth of information about facilitators of transnational organized crime (TOC) by developing a method for identifying criminal facilitators of TOC within existing datasets and extending the available descriptive information about facilitators through analysis of pre-sentence investigation reports (PSRs). The study involved a two-step process: the first step involved the development of a methodology for identifying TOCFs; the second step involved screening PSRs to validate the methodology and systematically collect data on facilitators and their organizations. Our ultimate goal was to develop a predictive model that can be applied to identify TOC facilitators in the data efficiently.

    The collection contains 1 syntax text file (TOCF_Summary_Stats_NACJD.sas). No data is included in this collection.

  6. Mothers and Multigenerational Households, 2016-2020

    • vaccine-confidence-program-cdcvax.hub.arcgis.com
    • livingatlas-dcdev.opendata.arcgis.com
    Updated May 3, 2022
    Cite
    Urban Observatory by Esri (2022). Mothers and Multigenerational Households, 2016-2020 [Dataset]. https://vaccine-confidence-program-cdcvax.hub.arcgis.com/datasets/UrbanObservatory::mothers-and-multigenerational-households-2016-2020
    Dataset updated
    May 3, 2022
    Dataset authored and provided by
    Urban Observatory by Esri
    Description

    This layer is symbolized to show the approximate percentage of households that are multigenerational households. Multigenerational households are households with three or more generations. These households include either (1) a householder, a parent or parent-in-law of the householder, and an own child of the householder, (2) a householder, an own child of the householder, and a grandchild of the householder, or (3) a householder, a parent or parent-in-law of the householder, an own child of the householder, and a grandchild of the householder. The householder is a person in whose name the home is owned, being bought, or rented, and who answers the survey questionnaire as person 1.

    Other fields included are estimates of mothers (females 18 to 64 with own children, whether biological, adopted, or step children) by various race/ethnic groups and by age group of children. Age groups were defined by the COVID vaccine age groups: 12 to 17, 5 to 11, and 0 to 4. We also included estimates for mothers of children in more than one of these groups.

    Data prep steps:
    • Data downloaded on 4/5/22 from FTP site.
    • All fields were calculated from the Census Bureau's 2016-2020 5-year American Community Survey Public Use Microdata Sample (PUMS) using this SAS program.
    • Using the SAS-ArcGIS Bridge, the data table created in SAS was read into ArcGIS Pro and joined to this PUMA layer, obtained from Living Atlas. According to the U.S. Census Bureau, a Public Use Micro-sample Area (PUMA) is a "non-overlapping, statistical geographic areas that partition each state or equivalent entity into geographic areas containing no fewer than 100,000 people each."
    • The resulting layer in Pro was then published to ArcGIS Online.

    Disclaimer: All estimates here contain a margin of error. While they are not explicitly calculated and provided on this layer currently, we can and will add additional fields to provide the margins of error if the need arises.
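
The three-generation rule in this description can be written as a small predicate. This is a minimal sketch, not the SAS program the layer was built with: the relationship labels (`parent`, `parent_in_law`, `own_child`, `grandchild`) are hypothetical stand-ins for the survey's relationship codes, and the householder is assumed to be present in every household.

```python
def is_multigenerational(relationships: set[str]) -> bool:
    """Return True if a household spans three or more generations.

    `relationships` holds each member's relationship to the householder,
    using hypothetical labels rather than actual PUMS codes.
    """
    has_parent = bool(relationships & {"parent", "parent_in_law"})
    has_child = "own_child" in relationships
    has_grandchild = "grandchild" in relationships
    # Case (1): parent + child; case (2): child + grandchild.
    # Case (3), with all three present, satisfies both conditions.
    return has_child and (has_parent or has_grandchild)


if __name__ == "__main__":
    print(is_multigenerational({"parent_in_law", "own_child"}))
    print(is_multigenerational({"spouse", "own_child"}))
```

Every case in the definition requires an own child of the householder, which is why the check reduces to "child plus at least one other generation".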

  7. Data from: Receiving Investors in the Block Market for Corporate Bonds

    • data.mendeley.com
    Updated Mar 21, 2025
    Cite
    Stacey Jacobsen (2025). Receiving Investors in the Block Market for Corporate Bonds [Dataset]. http://doi.org/10.17632/nfpywmcwwc.2
    Dataset updated
    Mar 21, 2025
    Authors
    Stacey Jacobsen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository is a comprehensive resource accompanying the paper "Receiving Investors in the Block Market for Corporate Bonds" by Stacey Jacobsen and Kumar Venkataraman. It includes source code and small-sample (masked) input datasets for replicating the study’s analysis of block trading costs in the corporate bond market. To use this repository, users must download the sample datasets and adjust the directory paths within the SAS and Stata code to match their local environment. Because the data sources are non-public, the original bond identifiers have been removed and replaced by randomly generated identifiers in the sample datasets. Because only small sample datasets are provided, the replicator should expect the code to run in less than ten minutes. The replicator should run the SAS code first, then the Stata code.

  8. Comparing the Shenoy et al [21] algorithm for low-value urinalysis and...

    • plos.figshare.com
    xls
    Updated Jun 7, 2023
    Cite
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug (2023). Comparing the Shenoy et al [21] algorithm for low-value urinalysis and important diagnosis codes in the HSR Definition Builder application. [Dataset]. http://doi.org/10.1371/journal.pone.0266154.t004
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparing the Shenoy et al [21] algorithm for low-value urinalysis and important diagnosis codes in the HSR Definition Builder application.

  9. Top 21 of 132 diagnosis codes for carrier claims with a knee arthroscopy...

    • plos.figshare.com
    xls
    Updated Jun 7, 2023
    Cite
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug (2023). Top 21 of 132 diagnosis codes for carrier claims with a knee arthroscopy procedure (CPT 29877), ordered by relative importance from the classification model. [Dataset]. http://doi.org/10.1371/journal.pone.0266154.t003
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kelsey Chalmers; Valérie Gopinath; Adam G. Elshaug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Top 21 of 132 diagnosis codes for carrier claims with a knee arthroscopy procedure (CPT 29877), ordered by relative importance from the classification model.

  10. SAS Programming for breeding practices.

    • plos.figshare.com
    txt
    Updated Jun 21, 2023
    Cite
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi (2023). SAS Programming for breeding practices. [Dataset]. http://doi.org/10.1371/journal.pone.0278400.s006
    Available download formats: txt
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    It is an SAS file with all the syntax used for statistical analysis of breeding practices of donkey farmers’ data. (SAS)

  11. SAS Programming for data analysis of morphological characterization of...

    • plos.figshare.com
    txt
    Updated Jun 21, 2023
    Cite
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi (2023). SAS Programming for data analysis of morphological characterization of donkeys. [Dataset]. http://doi.org/10.1371/journal.pone.0278400.s008
    Available download formats: txt
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    It is an SAS file with all the syntax used for statistical analysis. (SAS)

  12. SAS Programming for socio-economic characteristics of donkey farmers.

    • plos.figshare.com
    txt
    Updated Jun 21, 2023
    Cite
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi (2023). SAS Programming for socio-economic characteristics of donkey farmers. [Dataset]. http://doi.org/10.1371/journal.pone.0278400.s004
    Available download formats: txt
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Masixole Maswana; Thinawanga Joseph Mugwabana; Thobela Louis Tyasi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    It is an SAS file with all the syntax used for statistical analysis of socio-economic characteristics of donkey farmers’ data. (SAS)

  13. Database Creation Description and Data Dictionaries

    • figshare.com
    txt
    Updated Aug 11, 2016
    Cite
    Jordan Kempker; John David Ike (2016). Database Creation Description and Data Dictionaries [Dataset]. http://doi.org/10.6084/m9.figshare.3569067.v3
    Available download formats: txt
    Dataset updated
    Aug 11, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jordan Kempker; John David Ike
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    There are several Microsoft Word documents here detailing data creation methods, along with various dictionaries describing the included and derived variables. The Database Creation Description is meant to walk a user through some of the steps detailed in the SAS code with this project. The alphabetical list of variables is provided because it is sometimes easier to copy and paste variable names from this list than to retype them during coding. The NIS Data Dictionary contains a general description of the dataset as well as each variable's responses.

  14. Traits used in discriminating the chicken population from different sites in...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Bekalu Muluneh; Mengistie Taye; Tadelle Dessie; Dessie Salilew Wondim; Damitie Kebede; Andualem Tenagne (2023). Traits used in discriminating the chicken population from different sites in stepwise discriminant analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0286299.t004
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Bekalu Muluneh; Mengistie Taye; Tadelle Dessie; Dessie Salilew Wondim; Damitie Kebede; Andualem Tenagne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Traits used in discriminating the chicken population from different sites in stepwise discriminant analysis.

  15. Class means on canonical variables of female and male chickens.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Bekalu Muluneh; Mengistie Taye; Tadelle Dessie; Dessie Salilew Wondim; Damitie Kebede; Andualem Tenagne (2023). Class means on canonical variables of female and male chickens. [Dataset]. http://doi.org/10.1371/journal.pone.0286299.t009
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Bekalu Muluneh; Mengistie Taye; Tadelle Dessie; Dessie Salilew Wondim; Damitie Kebede; Andualem Tenagne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Class means on canonical variables of female and male chickens.
