23 datasets found
  1. O*NET Database

    • onetcenter.org
    excel, mysql, oracle +2
    Updated May 22, 2025
    Cite
    National Center for O*NET Development (2025). O*NET Database [Dataset]. https://www.onetcenter.org/database.html
    Explore at:
    Available download formats: oracle, sql server, text, mysql, excel
    Dataset updated
    May 22, 2025
    Dataset provided by
    Occupational Information Network
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Dataset funded by
    United States Department of Labor: http://www.dol.gov/
    Description

    The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.

    Data content areas include:

    • Worker Characteristics (e.g., Abilities, Interests, Work Styles)
    • Worker Requirements (e.g., Education, Knowledge, Skills)
    • Experience Requirements (e.g., On-the-Job Training, Work Experience)
    • Occupational Requirements (e.g., Detailed Work Activities, Work Context)
    • Occupation-Specific Information (e.g., Job Titles, Tasks, Technology Skills)
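
    As a worked illustration of the download formats listed above, the short pandas sketch below loads one descriptor table from the tab-delimited text distribution and filters it to a single occupation. The file name, column labels, and occupation code are assumptions for illustration, not taken from this page.

      # Hypothetical sketch: load one O*NET descriptor table from the text
      # distribution and list skill ratings for a single occupation.
      # File name and column labels are assumed, not confirmed by this page.
      import pandas as pd

      skills = pd.read_csv("Skills.txt", sep="\t")  # tab-delimited text download (assumed)

      # Filter to one occupation by its O*NET-SOC code (code is illustrative)
      occ = skills[skills["O*NET-SOC Code"] == "15-1252.00"]
      print(occ[["Element Name", "Scale ID", "Data Value"]].head())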

  2. Common Metadata Elements for Cataloging Biomedical Datasets

    • figshare.com
    xlsx
    Updated Jan 20, 2016
    Cite
    Kevin Read (2016). Common Metadata Elements for Cataloging Biomedical Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.1496573.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Kevin Read
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset outlines a proposed set of core, minimal metadata elements that can be used to describe biomedical datasets, such as those resulting from research funded by the National Institutes of Health. It can inform efforts to better catalog or index such data to improve discoverability. The proposed metadata elements are based on an analysis of the metadata schemas used in a set of NIH-supported data sharing repositories. Common elements from these data repositories were identified, mapped to existing data-specific metadata standards from two multidisciplinary data repositories, DataCite and Dryad, and compared with metadata used in MEDLINE records to establish a sustainable and integrated metadata schema. From the mappings, we developed a preliminary set of minimal metadata elements that can be used to describe NIH-funded datasets. Please see the readme file for more details about the individual sheets within the spreadsheet.
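
    To make the element-mapping idea concrete, here is a toy crosswalk in Python. The element names and target fields are hypothetical stand-ins; the actual mappings are in the spreadsheet itself.

      # Illustrative only: a toy crosswalk between a minimal element set and
      # the schemas named above. All names here are hypothetical stand-ins.
      crosswalk = {
          "title":       {"DataCite": "Title",       "Dryad": "dcterms:title",       "MEDLINE": "TI"},
          "creator":     {"DataCite": "Creator",     "Dryad": "dcterms:creator",     "MEDLINE": "AU"},
          "identifier":  {"DataCite": "Identifier",  "Dryad": "dcterms:identifier",  "MEDLINE": "AID"},
          "description": {"DataCite": "Description", "Dryad": "dcterms:description", "MEDLINE": "AB"},
      }

      for element, mapping in crosswalk.items():
          print(element, "->", mapping)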

  3. Household survey and focus group discussion data

    • figshare.com
    xlsx
    Updated Nov 12, 2022
    Cite
    Stephie Mwangi; Joel L. Bargul; Kevin Kidambasi; Collins Kigen (2022). Household survey and focus group discussion data [Dataset]. http://doi.org/10.6084/m9.figshare.21545895.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 12, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Stephie Mwangi; Joel L. Bargul; Kevin Kidambasi; Collins Kigen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The household survey data includes data collected from parents in Laisamis, northern Kenya, and the focus group discussion data was collected from Laisamis Secondary School students, covering formal education and gender equity among children from pastoral communities.

  4. The Quantitative Methods Boot Camp: Teaching Quantitative Thinking and Computing Skills to Graduate Students in the Life Sciences

    • plos.figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Melanie I. Stefan; Johanna L. Gutlerner; Richard T. Born; Michael Springer (2023). The Quantitative Methods Boot Camp: Teaching Quantitative Thinking and Computing Skills to Graduate Students in the Life Sciences [Dataset]. http://doi.org/10.1371/journal.pcbi.1004208
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS Computational Biology
    Authors
    Melanie I. Stefan; Johanna L. Gutlerner; Richard T. Born; Michael Springer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The past decade has seen a rapid increase in the ability of biologists to collect large amounts of data. It is therefore vital that research biologists acquire the necessary skills during their training to visualize, analyze, and interpret such data. To begin to meet this need, we have developed a “boot camp” in quantitative methods for biology graduate students at Harvard Medical School. The goal of this short, intensive course is to enable students to use computational tools to visualize and analyze data, to strengthen their computational thinking skills, and to simulate and thus extend their intuition about the behavior of complex biological systems. The boot camp teaches basic programming using biological examples from statistics, image processing, and data analysis. This integrative approach to teaching programming and quantitative reasoning motivates students’ engagement by demonstrating the relevance of these skills to their work in life science laboratories. Students also have the opportunity to analyze their own data or explore a topic of interest in more detail. The class is taught with a mixture of short lectures, Socratic discussion, and in-class exercises. Students spend approximately 40% of their class time working through both short and long problems. A high instructor-to-student ratio allows students to get assistance or additional challenges when needed, thus enhancing the experience for students at all levels of mastery. Data collected from end-of-course surveys from the last five offerings of the course (between 2012 and 2014) show that students report high learning gains and feel that the course prepares them for solving quantitative and computational problems they will encounter in their research. We outline our course here which, together with the course materials freely available online under a Creative Commons License, should help to facilitate similar efforts by others.

  5. Assessing the impact of hints in learning formal specification: Research artifact

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 29, 2024
    Cite
    Margolis, Iara (2024). Assessing the impact of hints in learning formal specification: Research artifact [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10450608
    Explore at:
    Dataset updated
    Jan 29, 2024
    Dataset provided by
    Cunha, Alcino
    Campos, José Creissac
    Margolis, Iara
    Macedo, Nuno
    Sousa, Emanuel
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This artifact accompanies the SEET@ICSE article "Assessing the impact of hints in learning formal specification", which reports on a user study investigating the impact of different types of automated hints while learning a formal specification language, not only in terms of immediate performance and learning retention, but also in terms of the students' emotional response. This research artifact provides all the material required to replicate this study (except for the proprietary questionnaires used to assess the emotional response and user experience), as well as the collected data and the data analysis scripts used for the discussion in the paper.

    Dataset

    The artifact contains the resources described below.

    Experiment resources

    The resources needed for replicating the experiment, namely in directory experiment:

    alloy_sheet_pt.pdf: the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment. The sheet was passed in Portuguese due to the population of the experiment.

    alloy_sheet_en.pdf: a version of the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment, translated into English.

    docker-compose.yml: a Docker Compose configuration file to launch Alloy4Fun populated with the tasks in directory data/experiment for the 2 sessions of the experiment.

    api and meteor: directories with source files for building and launching the Alloy4Fun platform for the study.

    Experiment data

    The task database used in our application of the experiment, namely in directory data/experiment:

    Model.json, Instance.json, and Link.json: JSON files used to populate Alloy4Fun with the tasks for the 2 sessions of the experiment.

    identifiers.txt: the list of all 104 available participant identifiers for the experiment.

    Collected data

    Data collected in the application of the experiment as a simple one-factor randomised experiment in 2 sessions involving 85 undergraduate students majoring in CSE. The experiment was validated by the Ethics Committee for Research in Social and Human Sciences of the Ethics Council of the University of Minho, where the experiment took place. Data is shared in the form of JSON and CSV files with a header row, namely in directory data/results:

    data_sessions.json: data collected from task-solving in the 2 sessions of the experiment, used to calculate variables productivity (PROD1 and PROD2, between 0 and 12 solved tasks) and efficiency (EFF1 and EFF2, between 0 and 1).

    data_socio.csv: data collected from socio-demographic questionnaire in the 1st session of the experiment, namely:

    participant identification: participant's unique identifier (ID);

    socio-demographic information: participant's age (AGE), sex (SEX, 1 through 4 for female, male, prefer not to disclose, and other, respectively), and average academic grade (GRADE, from 0 to 20, NA denotes preference not to disclose).

    data_emo.csv: detailed data collected from the emotional questionnaire in the 2 sessions of the experiment (an aggregation sketch follows this file list), namely:

    participant identification: participant's unique identifier (ID) and the assigned treatment (column HINT, either N, L, E or D);

    detailed emotional response data: the differential in the 5-point Likert scale for each of the 14 measured emotions in the 2 sessions, ranging from -5 to -1 if decreased, 0 if maintained, from 1 to 5 if increased, or NA denoting failure to submit the questionnaire. Half of the emotions are positive (Admiration1 and Admiration2, Desire1 and Desire2, Hope1 and Hope2, Fascination1 and Fascination2, Joy1 and Joy2, Satisfaction1 and Satisfaction2, and Pride1 and Pride2), and half are negative (Anger1 and Anger2, Boredom1 and Boredom2, Contempt1 and Contempt2, Disgust1 and Disgust2, Fear1 and Fear2, Sadness1 and Sadness2, and Shame1 and Shame2). This detailed data was used to compute the aggregate data in data_emo_aggregate.csv and in the detailed discussion in Section 6 of the paper.

    data_umux.csv: data collected from the user experience questionnaires in the 2 sessions of the experiment, namely:

    participant identification: participant's unique identifier (ID);

    user experience data: summarised user experience data from the UMUX surveys (UMUX1 and UMUX2, as a usability metric ranging from 0 to 100).

    participants.txt: the list of participant identifiers that have registered for the experiment.
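
    As referenced above for data_emo.csv, here is a minimal sketch of one plausible aggregation of the detailed emotional responses. The artifact's own scripts define the actual rule used to produce data_emo_aggregate.csv, so the mean-based aggregation below is an assumption for illustration.

      # Assumed aggregation: average the per-emotion differentials into one
      # positive and one negative score per session. The emotion column names
      # follow the description above; the POS*/NEG* names are made up here.
      import pandas as pd

      emo = pd.read_csv("data/results/data_emo.csv")

      positive = ["Admiration", "Desire", "Hope", "Fascination", "Joy", "Satisfaction", "Pride"]
      negative = ["Anger", "Boredom", "Contempt", "Disgust", "Fear", "Sadness", "Shame"]

      for session in ("1", "2"):
          emo[f"POS{session}"] = emo[[e + session for e in positive]].mean(axis=1)
          emo[f"NEG{session}"] = emo[[e + session for e in negative]].mean(axis=1)

      print(emo[["ID", "HINT", "POS1", "NEG1", "POS2", "NEG2"]].head())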

    Analysis scripts

    The analysis scripts required to replicate the analysis of the results of the experiment as reported in the paper, namely in directory analysis:

    analysis.r: An R script to analyse the data in the provided CSV files; each performed analysis is documented within the file itself.

    requirements.r: An R script to install the required libraries for the analysis script.

    normalize_task.r: A Python script to normalize the task JSON data from file data_sessions.json into the CSV format required by the analysis script.

    normalize_emo.r: A Python script to compute the aggregate emotional response in the CSV format required by the analysis script from the detailed emotional response data in the CSV format of data_emo.csv.

    Dockerfile: a Docker script to automate running the analysis script over the collected data.

    Setup

    To replicate the experiment and the analysis of the results, only Docker is required.

    If you wish to manually replicate the experiment and collect your own data, you'll need to install:

    A modified version of the Alloy4Fun platform, which is built in the Meteor web framework. This version of Alloy4Fun is publicly available in branch study of its repository at https://github.com/haslab/Alloy4Fun/tree/study.

    If you wish to manually replicate the analysis of the data collected in our experiment, you'll need to install:

    Python to manipulate the JSON data collected in the experiment. Python is freely available for download at https://www.python.org/downloads/, with distributions for most platforms.

    R software for the analysis scripts. R is freely available for download at https://cran.r-project.org/mirrors.html, with binary distributions available for Windows, Linux and Mac.

    Usage

    Experiment replication

    This section describes how to replicate our user study experiment, and collect data about how different hints impact the performance of participants.

    To launch the Alloy4Fun platform populated with tasks for each session, just run the following commands from the root directory of the artifact. The Meteor server may take a few minutes to launch, wait for the "Started your app" message to show.

    cd experiment
    docker-compose up

    This will launch Alloy4Fun at http://localhost:3000. The tasks are accessed through permalinks assigned to each participant. The experiment allows for up to 104 participants, and the list of available identifiers is given in file identifiers.txt. The group of each participant is determined by the last character of the identifier, either N, L, E or D. The task database can be consulted in directory data/experiment, in Alloy4Fun JSON files.

    In the 1st session, each participant was given one permalink that gives access to 12 sequential tasks. The permalink is simply the participant's identifier, so participant 0CAN would just access http://localhost:3000/0CAN. The next task is available after a correct submission to the current task or when a time-out occurs (5mins). Each participant was assigned to a different treatment group, so depending on the permalink different kinds of hints are provided. Below are 4 permalinks, one for each hint group:

    Group N (no hints): http://localhost:3000/0CAN

    Group L (error locations): http://localhost:3000/CA0L

    Group E (counter-example): http://localhost:3000/350E

    Group D (error description): http://localhost:3000/27AD

    In the 2nd session, as in the 1st session, each permalink gave access to 12 sequential tasks, and the next task is available after a correct submission or a time-out (5mins). The permalink is constructed by prepending the participant's identifier with P-, so participant 0CAN would just access http://localhost:3000/P-0CAN. In the 2nd session all participants were expected to solve the tasks without any hints provided, so the permalinks from different groups are undifferentiated.
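
    The permalink scheme above is simple enough to express directly. This sketch encodes exactly the rules stated in the text: the treatment group is the last character of the identifier, and 2nd-session links are prefixed with P-.

      # Permalink construction as described above.
      BASE = "http://localhost:3000"

      def permalinks(identifier: str) -> dict:
          group = identifier[-1]  # one of N, L, E, D
          return {
              "group": group,
              "session1": f"{BASE}/{identifier}",
              "session2": f"{BASE}/P-{identifier}",
          }

      print(permalinks("0CAN"))
      # {'group': 'N', 'session1': 'http://localhost:3000/0CAN',
      #  'session2': 'http://localhost:3000/P-0CAN'}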

    Before the 1st session the participants should answer the socio-demographic questionnaire, which should ask for the following information: unique identifier, age, sex, familiarity with the Alloy language, and average academic grade.

    Before and after both sessions the participants should answer the standard PrEmo 2 questionnaire. PrEmo 2 is published under an Attribution-NonCommercial-NoDerivatives 4.0 International Creative Commons licence (CC BY-NC-ND 4.0). This means that you are free to use the tool for non-commercial purposes as long as you give appropriate credit, provide a link to the license, and do not modify the original material. The original material, namely the depictions of the different emotions, can be downloaded from https://diopd.org/premo/. The questionnaire should ask for the unique user identifier and, for each of the 14 depicted emotions, the level of attachment expressed on a 5-point Likert scale.

    After both sessions the participants should also answer the standard UMUX questionnaire. This questionnaire can be used freely, and should ask for the user's unique identifier and answers to the standard 4 questions on a 7-point Likert scale. For information about the questions, how to implement the questionnaire, and how to compute the usability metric (a score ranging from 0 to 100) from the answers, please see the original paper:

    Kraig Finstad. 2010. The usability metric for user experience. Interacting with computers 22, 5 (2010), 323–327.
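
    For orientation, here is a sketch of the commonly described UMUX scoring; the item polarity and scaling below are recalled from the literature rather than taken from this page, so treat Finstad (2010) as the authoritative definition.

      # Usual UMUX scoring (assumed): items 1 and 3 are positively worded,
      # items 2 and 4 negatively worded, each answered on a 7-point scale;
      # the summed contributions are scaled to a 0-100 score.
      def umux_score(q1: int, q2: int, q3: int, q4: int) -> float:
          contributions = (q1 - 1) + (7 - q2) + (q3 - 1) + (7 - q4)
          return contributions * 100 / 24

      print(umux_score(7, 1, 7, 1))  # best possible answers -> 100.0
      print(umux_score(4, 4, 4, 4))  # neutral answers -> 50.0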

    Analysis of other applications of the experiment

    This section describes how to replicate the analysis of the data collected in an application of the experiment described in Experiment replication.

    The analysis script expects data in 4 CSV files,

  6. The NIST Extensible Resource Data Model (NERDm): JSON schemas for rich description of data resources

    • data.nist.gov
    • s.cnmilf.com
    • +1more
    Updated Sep 2, 2017
    Cite
    National Institute of Standards and Technology (2017). The NIST Extensible Resource Data Model (NERDm): JSON schemas for rich description of data resources [Dataset]. http://doi.org/10.18434/mds2-1870
    Explore at:
    Dataset updated
    Sep 2, 2017
    Dataset provided by
    National Institute of Standards and Technology: http://www.nist.gov/
    License

    https://www.nist.gov/open/license

    Description

    The NIST Extensible Resource Data Model (NERDm) is a set of schemas for encoding, in JSON format, metadata that describe digital resources. The variety of digital resources it can describe includes not only digital data sets and collections, but also software, digital services, web sites and portals, and digital twins. It was created to serve as the internal metadata format used by the NIST Public Data Repository and Science Portal to drive rich presentations on the web and to enable discovery; however, it was also designed to enable programmatic access to resources and their metadata by external users.

    Interoperability was a key design aim: the schemas are defined using the JSON Schema standard, metadata are encoded as JSON-LD, and their semantics are tied to community ontologies, with an emphasis on DCAT and the US federal Project Open Data (POD) models. Extensibility is also central to the design: the schemas are composed of a central core schema and various extension schemas, and new extensions supporting richer metadata concepts can be added over time without breaking existing applications.

    Validation is central to NERDm's extensibility model. Consuming applications should be able to choose which metadata extensions they care to support and ignore terms and extensions they don't support. Furthermore, they should not fail when a NERDm document leverages extensions they don't recognize, even when on-the-fly validation is required. To support this flexibility, the NERDm framework allows documents to declare what extensions are being used and where. We have developed an optional extension to standard JSON Schema validation (see ejsonschema below) to support flexible validation: while a standard JSON Schema validator can validate a NERDm document against the NERDm core schema, our extension will validate a NERDm document against any recognized extensions and ignore those that are not recognized.

    The NERDm data model is based around the concept of a resource, semantically equivalent to a schema.org Resource, and as in schema.org, there can be different types of resources, such as data sets and software. A NERDm document indicates what types the resource qualifies as via the JSON-LD "@type" property. All NERDm Resources are described by metadata terms from the core NERDm schema; however, different resource types can be described by additional metadata properties (often drawing on particular NERDm extension schemas). A Resource contains Components of various types (including DCAT-defined Distributions) that are considered part of the Resource; specifically, these can include downloadable data files, hierarchical data collections, links to web sites (like software repositories), software tools, or other NERDm Resources. Through the NERDm extension system, domain-specific metadata can be included at either the resource or component level. The direct semantic and syntactic connections to the DCAT, POD, and schema.org schemas are intended to ensure unambiguous conversion of NERDm documents into those schemas.

    As of this writing, the core NERDm schema and its framework stand at version 0.7 and are compatible with the "draft-04" version of JSON Schema. Version 1.0 is projected to be released in 2025; in that release, the NERDm schemas will be updated to the "draft2020" version of JSON Schema, and other improvements will include stronger support for RDF and the Linked Data Platform through its support of JSON-LD.
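
    As a toy sketch of the validation pattern described above, the snippet below builds a resource document that declares its types via the JSON-LD "@type" property and checks it against a core schema. The schema and field names are stand-ins, not the actual NERDm core.

      # Stand-in example: validate a NERDm-like record against a toy core
      # schema with the jsonschema library. Field names are assumptions.
      import jsonschema

      core_schema = {
          "type": "object",
          "required": ["@type", "title"],
          "properties": {
              "@type": {"type": "array", "items": {"type": "string"}},
              "title": {"type": "string"},
              "components": {"type": "array"},
          },
      }

      record = {
          "@type": ["nrdp:DataPublication", "dcat:Dataset"],  # type names assumed
          "title": "Example resource",
          "components": [{"downloadURL": "https://example.org/data.csv"}],
      }

      jsonschema.validate(record, core_schema)  # raises ValidationError on failure
      print("record is valid against the stand-in core schema")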

  7. Data from: Inflect: Optimizing Computational Workflows for Thermal Proteome Profiling Data Analysis

    • acs.figshare.com
    xlsx
    Updated Jun 7, 2023
    Cite
    Neil A. McCracken; Sarah A. Peck Justice; Aruna B. Wijeratne; Amber L. Mosley (2023). Inflect: Optimizing Computational Workflows for Thermal Proteome Profiling Data Analysis [Dataset]. http://doi.org/10.1021/acs.jproteome.0c00872.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    ACS Publications
    Authors
    Neil A. McCracken; Sarah A. Peck Justice; Aruna B. Wijeratne; Amber L. Mosley
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The CETSA and Thermal Proteome Profiling (TPP) analytical methods are invaluable for the study of protein–ligand interactions and protein stability in a cellular context. These tools have increasingly been leveraged in work ranging from understanding signaling paradigms to drug discovery. Consequently, there is an important need to optimize the data analysis pipeline that is used to calculate protein melt temperatures (Tm) and relative melt shifts from proteomics abundance data. Here, we report a user-friendly analysis of the melt shift calculation workflow where we describe the impact of each individual calculation step on the final output list of stabilized and destabilized proteins. This report also includes a description of how key steps in the analysis workflow quantitatively impact the list of stabilized/destabilized proteins from an experiment. We applied our findings to develop a more optimized analysis workflow that illustrates the dramatic sensitivity of chosen calculation steps on the final list of reported proteins of interest in a study and have made the R-based program Inflect available for research community use through the CRAN repository [McCracken, N. Inflect: Melt Curve Fitting and Melt Shift Analysis. R package version 1.0.3, 2021]. The Inflect outputs include melt curves for each protein that passes filtering criteria, in addition to a data matrix which is directly compatible with downstream packages such as UpSetR for replicate comparisons and identification of biologically relevant changes. Overall, this work provides an essential resource for scientists as they analyze data from TPP and CETSA experiments and implement their own analysis pipelines geared toward specific applications.

  8. Climate hydrology and ecology research support system meteorology dataset for Great Britain (1961-2012) [CHESS-met]

    • catalogue.ceh.ac.uk
    • hosted-metadata.bgs.ac.uk
    • +2more
    Updated Nov 13, 2015
    Cite
    E.L. Robinson; E. Blyth; D.B. Clark; J. Finch; A.C. Rudd (2015). Climate hydrology and ecology research support system meteorology dataset for Great Britain (1961-2012) [CHESS-met] [Dataset]. http://doi.org/10.5285/80887755-1426-4dab-a4a6-250919d5020c
    Explore at:
    Dataset updated
    Nov 13, 2015
    Dataset provided by
    NERC EDS Environmental Information Data Centre
    Authors
    E.L. Robinson; E. Blyth; D.B. Clark; J. Finch; A.C. Rudd
    License

    https://eidc.ceh.ac.uk/licences/chessmet/plain

    Time period covered
    Jan 1, 1961 - Dec 31, 2012
    Area covered
    Great Britain
    Description

    1km resolution gridded meteorological variables over Great Britain for the years 1961-2012. This dataset contains time series of daily mean values of air temperature (K), specific humidity (kg kg-1), wind speed (m s-1), downward longwave radiation (W m-2), downward shortwave radiation (W m-2), precipitation (kg m-2 s-1) and air pressure (Pa), plus daily temperature range (K). These are the variables required to run the JULES land surface model [1] with daily disaggregation. The precipitation data were obtained by scaling the Gridded estimates of daily and monthly areal rainfall (CEH-GEAR) daily rainfall estimates [2,3] to the units required for JULES input. Other variables were interpolated from coarser resolution datasets, taking into account topographic information.

    [1] Best, M. J., Pryor, M., Clark, D. B., Rooney, G. G., Essery, R. L. H., Ménard, C. B., Edwards, J. M., Hendry, M. A., Porson, A., Gedney, N., Mercado, L. M., Sitch, S., Blyth, E., Boucher, O., Cox, P. M., Grimmond, C. S. B., and Harding, R. J.: The Joint UK Land Environment Simulator (JULES), model description - Part 1: Energy and water fluxes, Geoscientific Model Development, 4, 677-699, doi:10.5194/gmd-4-677-2011, 2011.
    [2] Tanguy, M., Dixon, H., Prosdocimi, I., Morris, D. G., Keller, V. D. J. (2014). Gridded estimates of daily and monthly areal rainfall for the United Kingdom (1890-2012) [CEH-GEAR]. NERC-Environmental Information Data Centre. doi:10.5285/5dc179dc-f692-49ba-9326-a6893a503f6e
    [3] Keller, V. D. J., Tanguy, M., Prosdocimi, I., Terry, J. A., Hitt, O., Cole, S. J., Fry, M., Morris, D. G., Dixon, H. (2015). CEH-GEAR: 1km resolution daily and monthly areal rainfall estimates for the UK for hydrological use. Earth Syst. Sci. Data Discuss., 8, 83-112, doi:10.5194/essdd-8-83-2015. www.earth-syst-sci-data-discuss.net/8/83/2015/
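
    A hedged sketch of reading one of these gridded variables with xarray, assuming the data are distributed as NetCDF (the page above does not state the container); the file and variable names are illustrative guesses.

      # Assumed NetCDF layout: one file per variable with a time dimension.
      import xarray as xr

      ds = xr.open_dataset("chess-met_tas_gb_1km_daily_19610101-19610131.nc")  # name assumed

      tas = ds["tas"]  # daily mean air temperature (K); variable name assumed
      print(float(tas.mean()))              # GB-wide mean over the file's period
      print(float(tas.isel(time=0).max()))  # warmest 1 km cell on the first day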

  9. Big Data Platform Software Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 16, 2024
    Cite
    Dataintelo (2024). Big Data Platform Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/big-data-platform-software-market
    Explore at:
    Available download formats: pdf, pptx, csv
    Dataset updated
    Oct 16, 2024
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Big Data Platform Software Market Outlook



    The global Big Data Platform Software market size was valued at approximately USD 70 billion in 2023 and is projected to reach around USD 250 billion by 2032, growing at a compound annual growth rate (CAGR) of 15%. The substantial growth in this market can be attributed to the increasing volume and complexity of data generated across various industries, along with the rising need for data analytics to drive business decision-making.
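
    A quick arithmetic check of the quoted figures: USD 70 billion compounding at 15% per year from 2023 to 2032 lands near the projected USD 250 billion.

      # Sanity-check the CAGR claim above: value = start * (1 + rate) ** years
      start, rate, years = 70e9, 0.15, 9  # 2023 -> 2032
      projected = start * (1 + rate) ** years
      print(f"{projected / 1e9:.0f} billion USD")  # ~246 billion, consistent with ~250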



    One of the key growth factors driving the Big Data Platform Software market is the explosive growth in data generation from various sources such as social media, IoT devices, and enterprise applications. The proliferation of digital devices has led to an unprecedented surge in data volumes, compelling businesses to adopt advanced Big Data solutions to manage and analyze this data effectively. Additionally, advancements in cloud computing have further amplified the capabilities of Big Data platforms, enabling organizations to store and process vast amounts of data in a cost-efficient manner.



    Another significant driver of market growth is the increasing adoption of artificial intelligence (AI) and machine learning (ML) technologies. Big Data platforms equipped with AI and ML capabilities can provide valuable insights by analyzing patterns, trends, and anomalies within large datasets. This has been particularly beneficial for industries such as healthcare, finance, and retail, where data-driven decision-making can lead to improved operational efficiency, enhanced customer experiences, and better risk management.



    Moreover, the rising demand for real-time data analytics is propelling the growth of the Big Data Platform Software market. Businesses are increasingly seeking solutions that can process and analyze data in real-time to gain immediate insights and respond swiftly to market changes. This demand is fueled by the need for agility and competitiveness, as organizations aim to stay ahead in a rapidly evolving business landscape. The ability to make data-driven decisions in real-time can provide a significant competitive edge, driving further investment in Big Data technologies.



    From a regional perspective, North America holds the largest share of the Big Data Platform Software market, driven by the early adoption of advanced technologies and the presence of major market players. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, owing to the increasing digital transformation initiatives and the rising awareness about the benefits of Big Data analytics across various industries. Europe also presents significant growth opportunities, driven by stringent data protection regulations and the growing emphasis on data privacy and security.



    Component Analysis



    The Big Data Platform Software market can be segmented by component into Software and Services. The software segment encompasses the various Big Data platforms and tools that enable data storage, processing, and analytics. This includes data management software, data analytics software, and visualization tools. The demand for Big Data software is driven by the need for organizations to handle large volumes of data efficiently and derive actionable insights from it. With the growing complexity of data, advanced software solutions that offer robust analytics capabilities are becoming increasingly essential.



    The services segment includes consulting, implementation, and support services related to Big Data platforms. These services are crucial for the successful deployment and management of Big Data solutions. Consulting services help organizations to design and strategize their Big Data initiatives, while implementation services ensure the seamless integration of Big Data platforms into existing IT infrastructure. Support services provide ongoing maintenance and troubleshooting to ensure the smooth functioning of Big Data systems. The growing adoption of Big Data solutions is driving the demand for these ancillary services, as organizations seek expert guidance to maximize the value of their Big Data investments.



    Within the software segment, data analytics software is witnessing significant demand due to its ability to process and analyze large datasets to uncover hidden patterns and insights. This is particularly important for industries such as healthcare, finance, and retail, where data-driven insights can lead to improved decision-making and operational efficiency. Additionally, data management software plays a critical role in ensuring the integrity and security of data.

  10. Data from: Towards a Prague Definition of Grey Literature

    • ssh.datastations.nl
    • narcis.nl
    pdf, xls, zip
    Updated Jan 1, 2011
    Cite
    J. Schöpfel; J. Schöpfel (2011). Towards a Prague Definition of Grey Literature [Dataset]. http://doi.org/10.17026/DANS-XDX-88UX
    Explore at:
    Available download formats: pdf (436609), xls (141312), zip (17306)
    Dataset updated
    Jan 1, 2011
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    Authors
    J. Schöpfel; J. Schöpfel
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Prague
    Description

    Research method: The project applies a two-step methodology: (1) a state of the art of terminology and definitions from the last two decades, based on contributions to the GL conference series (1993-2008) and on original articles published in The Grey Journal (2005-2010); (2) an exploratory survey with a sample of scientists, publishing and LIS professionals to assess attitudes towards the New York definition and to gather elements for a new definition.

  11. Data_Sheet_1_Data and model bias in artificial intelligence for healthcare applications in New Zealand.zip

    • frontiersin.figshare.com
    zip
    Updated Jun 3, 2023
    Cite
    Vithya Yogarajan; Gillian Dobbie; Sharon Leitch; Te Taka Keegan; Joshua Bensemann; Michael Witbrock; Varsha Asrani; David Reith (2023). Data_Sheet_1_Data and model bias in artificial intelligence for healthcare applications in New Zealand.zip [Dataset]. http://doi.org/10.3389/fcomp.2022.1070493.s001
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers
    Authors
    Vithya Yogarajan; Gillian Dobbie; Sharon Leitch; Te Taka Keegan; Joshua Bensemann; Michael Witbrock; Varsha Asrani; David Reith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New Zealand
    Description

    Introduction: Developments in Artificial Intelligence (AI) are adopted widely in healthcare. However, the introduction and use of AI may come with biases and disparities, resulting in concerns about healthcare access and outcomes for underrepresented indigenous populations. In New Zealand, Māori experience significant inequities in health compared to the non-Indigenous population. This research explores equity concepts and fairness measures concerning AI for healthcare in New Zealand.

    Methods: This research considers data and model bias in NZ-based electronic health records (EHRs). Two very distinct NZ datasets are used in this research, one obtained from one hospital and another from multiple GP practices; both datasets were collected by clinicians. To ensure research equality and fair inclusion of Māori, we combine expertise in Artificial Intelligence (AI), the New Zealand clinical context, and te ao Māori. The mitigation of inequity needs to be addressed in data collection, model development, and model deployment. In this paper, we analyze data and algorithmic bias concerning data collection and model development, training and testing using health data collected by experts. We use fairness measures such as disparate impact scores, equal opportunity and equalized odds to analyze tabular data. Furthermore, token frequencies, statistical significance testing and fairness measures for word embeddings, such as the WEAT and WEFE frameworks, are used to analyze bias in free-form medical text. The AI model predictions are also explained using SHAP and LIME.

    Results: This research analyzed fairness metrics for NZ EHRs while considering data and algorithmic bias. We show evidence of bias due to the changes made in algorithmic design. Furthermore, we observe unintentional bias due to the underlying pre-trained models used to represent text data. This research addresses some vital issues while opening up the need and opportunity for future research.

    Discussion: This research takes early steps toward developing a model of socially responsible and fair AI for New Zealand's population. We provided an overview of reproducible concepts that can be adopted toward any NZ population data. Furthermore, we discuss the gaps and future research avenues that will enable more focused development of fairness measures suitable for the New Zealand population's needs and social structure. One of the primary focuses of this research was ensuring fair inclusion. As such, we combine expertise in AI, clinical knowledge, and the representation of indigenous populations. This inclusion of experts will be vital moving forward, providing a stepping stone toward the integration of AI for better outcomes in healthcare.
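
    As a minimal sketch of one of the fairness measures named above: disparate impact is commonly computed as the ratio of positive-prediction rates between the unprivileged and privileged groups, with values near 1 indicating parity. The data below are made up for illustration.

      # Disparate impact on toy predictions (standard rate-ratio definition).
      import numpy as np

      def disparate_impact(y_pred: np.ndarray, group: np.ndarray, unprivileged) -> float:
          rate_unpriv = y_pred[group == unprivileged].mean()
          rate_priv = y_pred[group != unprivileged].mean()
          return rate_unpriv / rate_priv

      y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
      group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
      print(disparate_impact(y_pred, group, unprivileged="b"))  # 0.25 / 0.75 -> ~0.33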

  12. analyze-paper-data-v01-formatted

    • huggingface.co
    Updated Feb 21, 2025
    Cite
    Pi Labs Inc. (2025). analyze-paper-data-v01-formatted [Dataset]. https://huggingface.co/datasets/withpi/analyze-paper-data-v01-formatted
    Explore at:
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    Pi Labs, Inc.
    Authors
    Pi Labs Inc.
    Description

    The withpi/analyze-paper-data-v01-formatted dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
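
    The dataset ID from the citation URL can be loaded with the Hugging Face datasets library in the standard way; the split name below is an assumption.

      # Load the dataset by its hub ID (split name "train" is assumed).
      from datasets import load_dataset

      ds = load_dataset("withpi/analyze-paper-data-v01-formatted")
      print(ds)              # shows available splits and features
      print(ds["train"][0])  # first record, assuming a "train" split exists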

  13. Standard evaluation instruments for student’s evaluation.

    • figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Retno Asti Werdhani; Ardi Findyartini; Dewi Anggraeni Kusumoningrum; Chaina Hanum; Dina Muktiarti; Oktavinda Safitry; Wismandari Wisnu; Dewi Sumaryani Soemarko; Reynardi Larope Sutanto (2023). Standard evaluation instruments for student’s evaluation. [Dataset]. http://doi.org/10.1371/journal.pone.0279742.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Retno Asti Werdhani; Ardi Findyartini; Dewi Anggraeni Kusumoningrum; Chaina Hanum; Dina Muktiarti; Oktavinda Safitry; Wismandari Wisnu; Dewi Sumaryani Soemarko; Reynardi Larope Sutanto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Standard evaluation instruments for student’s evaluation.

  14. Dataset of publication dates of book subjects that contain Artificial Intelligence Accelerates Human Learning : Discussion Data Analytics

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of publication dates of book subjects that contain Artificial Intelligence Accelerates Human Learning : Discussion Data Analytics [Dataset]. https://www.workwithdata.com/datasets/book-subjects?col=book_subject%2Cj0-publication_date&f=1&fcol0=j0-book&fop0=%3D&fval0=Artificial+Intelligence+Accelerates+Human+Learning+%3A+Discussion+Data+Analytics&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 2 rows and is filtered where the book is Artificial Intelligence Accelerates Human Learning : Discussion Data Analytics. It features 2 columns including publication dates.

  15. RDA IG Data Discovery Paradigms IG: Use Cases data

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Aug 3, 2024
    Cite
    Psomopoulos, Fotis (2024). RDA IG Data Discovery Paradigms IG: Use Cases data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1050975
    Explore at:
    Dataset updated
    Aug 3, 2024
    Dataset provided by
    Wu, Mingfang
    Khalsa, Siri Jodha
    de Waard, Anita
    Psomopoulos, Fotis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The RDA Data Discovery Paradigms IG (https://www.rd-alliance.org/groups/data-discovery-paradigms-ig) aims to provide a forum where representatives from across the spectrum of stakeholders and roles pertaining to data search can discuss issues related to improving data discovery. The goal is to identify concrete deliverables such as a registry of data search engines, common test datasets, usage metrics, and a collection of data search use cases and competency questions.

    In order to identify the key requirements evident across data discovery use-cases from various scientific fields and domains, the Use Cases Task Force (https://www.rd-alliance.org/group/data-discovery-paradigms-ig/wiki/use-cases-prototyping-tools-and-test-collections-task-force) was initiated. A direct outcome of this task force is this collection of use cases outlining how users might wish to search for data and what support they would expect a data repository to provide.

  16. Drilling Data Management Systems Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Middle East and Africa (UAE), Europe (Norway, Russia, UK), APAC (Australia), and South America (Brazil)

    • technavio.com
    pdf
    Updated Mar 22, 2025
    Cite
    Technavio (2025). Drilling Data Management Systems Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Middle East and Africa (UAE), Europe (Norway, Russia, UK), APAC (Australia), and South America (Brazil) [Dataset]. https://www.technavio.com/report/drilling-data-management-systems-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Mar 22, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2025 - 2029
    Area covered
    Russia, Canada, United States, United Kingdom
    Description


    Drilling Data Management Systems Market Size 2025-2029

    The drilling data management systems market size is forecast to increase by USD 17.89 billion at a CAGR of 10.6% between 2024 and 2029.

    The market is experiencing significant growth due to the increasing adoption of these systems to enhance productivity and transparency in the drilling process. A key driver in this market is the advent of big data analytics, which enables drilling companies to process and analyze vast amounts of data in real-time, leading to more informed decision-making and improved operational efficiency. However, the market is not without challenges. Fluctuations in crude oil prices pose a significant threat, as they can impact drilling budgets and profitability. Additionally, ensuring data security and compliance with regulations are major obstacles, as drilling companies must protect sensitive data and adhere to various industry standards and regulations. To capitalize on market opportunities and navigate challenges effectively, drilling companies must focus on implementing advanced data management systems that can handle large volumes of data, provide real-time analytics, and ensure data security and compliance. By doing so, they can improve operational efficiency, reduce costs, and stay competitive in the market.

    What will be the Size of the Drilling Data Management Systems Market during the forecast period?

    The market is characterized by its continuous evolution and dynamic nature, driven by the need for enhanced production optimization, industry regulations, and data integrity in the exploration and production sector. These systems facilitate seamless data exchange and downhole monitoring, enabling safety improvement, data integration, and hardware solutions. Well data, wellbore modeling, and machine learning play integral roles in optimizing drilling operations, reducing costs, and ensuring data security standards. Petroleum engineers and data scientists leverage cloud-based solutions to process real-time data, analyze drilling parameters, and make informed decisions. Formation pressures, drillstring dynamics, and bit performance are closely monitored for optimal well planning and completion operations. Data visualization and integration tools provide valuable insights, while data governance frameworks ensure collaboration and data security. Wireline logging, mud logging, and geotechnical data contribute to wellsite personnel's understanding of geological formations and wellbore integrity. Decision support systems help optimize drilling operations, casing design, and mud properties, ensuring wellbore stability. Data management platforms enable energy companies to manage drilling data, reservoir data, and production data effectively. The market's ongoing unfolding is marked by the adoption of advanced technologies such as artificial intelligence, predictive modeling, and data analytics tools. API integrations, mobile applications, and on-premise solutions cater to diverse industry requirements, with environmental protection and data security remaining top priorities.

    How is this Drilling Data Management Systems Industry segmented?

    The drilling data management systems industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019-2023, for the following segments.

    Component: Services, Software, Hardware
    Application: Oil and gas, Energy and power
    Type: Onshore, Offshore
    Geography: North America (US, Canada, Mexico), Europe (Norway, Russia, UK), Middle East and Africa (UAE), APAC (Australia), South America (Brazil), Rest of World (ROW)

    By Component Insights

    The services segment is estimated to witness significant growth during the forecast period. The drilling data management system market is experiencing significant growth as the energy industry prioritizes operational efficiency, cost reduction, and regulatory compliance. Production optimization is a key focus area, with drilling contractors and energy companies implementing advanced technologies like artificial intelligence, predictive modeling, and machine learning to analyze drilling data in real-time. Data acquisition and sharing are essential for collaborative decision-making, and data analytics tools help petroleum engineers and data scientists gain valuable insights from well data, wellbore modeling, and formation evaluation. Industry regulations mandate stringent data integrity and security standards, driving the adoption of cloud-based solutions and data management platforms. Downhole monitoring and safety improvement are critical aspects of drilling operations, with data integration tools facilitating seamless communication between various systems. Hardware solutions, such as wireline logging and mud logging, provide crucial data on drillstring dynamics, bit performance, and mud properties.

  17. Replication Code for: LocalView, a database of public meetings for the study of local politics and policy-making in the United States

    • search.dataone.org
    • dataverse.harvard.edu
    • +1more
    Updated Nov 8, 2023
    Cite
    Barari, Soubhik; Simko, Tyler (2023). Replication Code for: LocalView, a database of public meetings for the study of local politics and policy-making in the United States [Dataset]. http://doi.org/10.7910/DVN/KHUXIN
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Barari, Soubhik; Simko, Tyler
    Description

    Despite the fundamental importance of American local governments for service provision in areas like education and public health, local policy-making remains difficult and expensive to study at scale due to a lack of centralized data. This article introduces LocalView, the largest existing dataset of real-time local government public meetings – the central policy-making process in local government. In sum, the dataset currently covers 139,616 videos and their corresponding textual and audio transcripts of local government meetings publicly uploaded to YouTube – the world’s largest public video-sharing website – from 1,012 places and 2,861 distinct governments across the United States between 2006 and 2022. The data are processed, downloaded, cleaned, and publicly disseminated (at localview.net) for analysis across places and over time. We validate this dataset using a variety of methods and demonstrate how it can be used to map local governments’ attention to policy areas of interest. Finally, we discuss how LocalView may be used by journalists, academics, and other users for understanding how local communities deliberate crucial policy questions on topics including climate change, public health, and immigration.

  18. Hadoop Big Data Analytics Solution Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Hadoop Big Data Analytics Solution Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-hadoop-big-data-analytics-solution-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Hadoop Big Data Analytics Solution Market Outlook



    In 2023, the global Hadoop Big Data Analytics Solution market size was valued at approximately USD 45 billion and is projected to reach around USD 145 billion by 2032, growing at a compound annual growth rate (CAGR) of 14.5% during the forecast period. This significant growth is driven by the increasing adoption of big data technologies across various industries, advancements in data analytics, and the rising need for cost-effective and scalable data management solutions.



    One of the primary growth factors for the Hadoop Big Data Analytics Solution market is the exponential increase in data generation. With the proliferation of digital devices and the internet, vast amounts of data are being produced every second. This data, often referred to as big data, contains valuable insights that can drive business decisions and innovation. Organizations across sectors are increasingly recognizing the potential of big data analytics in enhancing operational efficiency, optimizing business processes, and gaining a competitive edge. Consequently, the demand for advanced analytics solutions like Hadoop, which can handle and process large datasets efficiently, is witnessing a substantial rise.



    Another significant growth driver is the ongoing digital transformation initiatives undertaken by businesses globally. As organizations strive to become more data-driven, they are investing heavily in advanced analytics solutions to harness the power of their data. Hadoop, with its ability to store and process vast volumes of structured and unstructured data, is becoming a preferred choice for businesses aiming to leverage big data for strategic decision-making. Additionally, the integration of artificial intelligence (AI) and machine learning (ML) with Hadoop platforms is further augmenting their analytical capabilities, making them indispensable tools for modern enterprises.



    The cost-effectiveness and scalability of Hadoop solutions also contribute to their growing popularity. Traditional data storage and processing systems often struggle to handle the sheer volume and variety of big data. In contrast, Hadoop offers a more flexible and scalable architecture, allowing organizations to store and analyze large datasets without incurring prohibitive costs. Moreover, the open-source nature of Hadoop software reduces the total cost of ownership, making it an attractive option for organizations of all sizes, including small and medium enterprises (SMEs).



    From a regional perspective, North America is expected to dominate the Hadoop Big Data Analytics Solution market during the forecast period. The region's strong technological infrastructure, coupled with the presence of major market players and early adopters of advanced analytics solutions, drives market growth. Additionally, the increasing focus on data-driven decision-making and the high adoption rates of digital technologies in sectors like BFSI, healthcare, and retail further bolster the market in North America. Conversely, the Asia Pacific region is anticipated to witness the highest growth rate, driven by rapid digitalization, government initiatives promoting big data analytics, and the expanding e-commerce industry.



    MapReduce Services play a pivotal role in the Hadoop ecosystem by enabling the processing of large data sets across distributed clusters. As businesses continue to generate vast amounts of data, the need for efficient data processing frameworks becomes increasingly critical. MapReduce, with its ability to break down complex data processing tasks into smaller, manageable units, allows organizations to analyze data at scale. This service is particularly beneficial for industries dealing with high-volume data streams, such as finance, healthcare, and retail, where timely insights can drive strategic decisions. The integration of MapReduce Services with Hadoop platforms enhances their data processing capabilities, making them indispensable tools for modern enterprises seeking to leverage big data for competitive advantage.
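
    A toy word count in the map/shuffle/reduce style this paragraph describes: the input is split into small map tasks, intermediate pairs are grouped by key, and a reduce step sums each group. This is a sketch of the programming model, not Hadoop's actual API.

      # Minimal map/shuffle/reduce over in-memory "chunks".
      from collections import defaultdict

      def map_phase(chunk: str):
          for word in chunk.split():
              yield word.lower(), 1

      def reduce_phase(pairs):
          grouped = defaultdict(list)  # "shuffle": group values by key
          for key, value in pairs:
              grouped[key].append(value)
          return {key: sum(vals) for key, vals in grouped.items()}

      chunks = ["Hadoop stores big data", "MapReduce processes big data"]
      pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
      print(reduce_phase(pairs))  # {'hadoop': 1, ..., 'big': 2, 'data': 2}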



    Component Analysis



    When analyzing the Hadoop Big Data Analytics Solution market by component, it becomes evident that software, hardware, and services are the three main segments. The software segment encompasses the core Hadoop components like Hadoop Distributed File System (HDFS) and MapReduce, along with various tools and platforms designed to enhance its capabilities. The growing complexity and volume of data necessitate robust software solutions.

  19. Open Knowledge Network: Summary of the Big Data IWG Workshop

    • s.cnmilf.com
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +2more
    Updated May 14, 2025
    Cite
    NCO NITRD (2025). Open Knowledge Network: Summary of the Big Data IWG Workshop [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/open-knowledge-network-summary-of-the-big-data-iwg-workshop
    Explore at:
    Dataset updated
    May 14, 2025
    Dataset provided by
    NCO NITRD
    Description

    Since July 2016, the Big Data Interagency Working Group (BD IWG) has been meeting to discuss the viability of, and possible first steps toward creating, a public-private data network infrastructure, the Open Knowledge Network (OKN). On October 4–5, 2017, the Big Data IWG hosted a workshop to both expand the discussion to similar work being done in biomedicine, finance, geoscience, and manufacturing and to gather expert advice on next steps to advance the OKN. This report summarizes those discussions.

  20. Dataset analysing the crossover between archivists, recordkeeping professionals and research data management using email list data

    • figshare.com
    xlsx
    Updated Aug 29, 2018
    Cite
    Rebecca Grant (2018). Dataset analysing the crossover between archivists, recordkeeping professionals and research data management using email list data [Dataset]. http://doi.org/10.6084/m9.figshare.7007903.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2018
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Rebecca Grant
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset relates to research on the connections between archives professionals and research data management. It consists of a single Excel spreadsheet with four sheets, containing an analysis of emails sent to two email discussion lists: Archives-NRA (Archivists, conservators and records managers) and Research-Dataman. The coded dataset and a list of codes used for each mailing list are provided. The two datasets were downloaded from the JiscMail Email Discussion list archives on 27 July 2018.

    The Archives-NRA dataset was compiled by conducting a free-text search for "research data" on the mailing list's archives; the metadata for every search result was downloaded and coded (144 metadata records in total). The resulting coded dataset demonstrates how frequently archivists and records professionals discuss research data on the Archives-NRA list, the topics which are discussed, and an increase in these discussions over time.

    The Research-Dataman dataset was compiled by conducting a free-text search for "archivist" on the mailing list's archives; the metadata for every search result was downloaded and coded (197 emails in total). The resulting coded dataset demonstrates how frequently data management professionals seek the advice of archivists or advertise vacancies for archivists, and how often archivists email this mailing list.

    The names and email addresses of the mailing list participants have been redacted for privacy reasons, but the original full-text emails can be accessed by members of the respective mailing lists using the URLs provided in the dataset.
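
    A hedged sketch of the kind of tabulation this description implies, counting coded emails per code and year with pandas; the file, sheet, and column names are assumptions, since the actual codes and layout are documented in the spreadsheet itself.

      # Assumed layout: one sheet per mailing list, with "date" and "code"
      # columns for each email's metadata record.
      import pandas as pd

      emails = pd.read_excel("crossover_dataset.xlsx", sheet_name="Archives-NRA")
      emails["year"] = pd.to_datetime(emails["date"]).dt.year
      print(emails.groupby(["year", "code"]).size().unstack(fill_value=0))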
