30 datasets found
  1. Lack of data as obstacle to AI usage in Norway in 2023, by industry

    • statista.com
    Cite
    Statista, Lack of data as obstacle to AI usage in Norway in 2023, by industry [Dataset]. https://www.statista.com/statistics/1456998/obstacle-ai-usage-norway-lack-data/
    Explore at:
    Dataset authored and provided by
    Statista, http://statista.com/
    Time period covered
    2023
    Area covered
    Norway
    Description

    In 2023, difficulty obtaining good quality data was reported as an obstacle to the use of artificial intelligence (AI) technology in Norway, with the retail trade (except of motor vehicles and motorcycles) industry reporting the highest share of ** percent.

  2. Unsafe prompt dataset

    • kaggle.com
    zip
    Updated Sep 9, 2024
    Cite
    AP6621 (2024). Unsafe prompt dataset [Dataset]. https://www.kaggle.com/datasets/aloktantrik/google-unsafe-search-dataset
    Explore at:
    Available download formats: zip (701648 bytes)
    Dataset updated
    Sep 9, 2024
    Authors
    AP6621
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Introduction

    As AI models continue to evolve, datasets containing prompts for image generation or text-based tasks have grown increasingly complex. However, with this progress comes the risk of generating harmful, inappropriate, or unethical content. This is particularly true for datasets that contain potentially unsafe prompts—those that may encourage the creation of violent, sexually explicit, or offensive material.

    The dataset titled Unsafe Prompt Dataset comprises a variety of descriptions that can be used for AI image generation or other tasks. Some of these prompts include sensitive or harmful content, posing risks both to users and to the broader community if they are used without appropriate safeguards.

    To address these concerns, we propose a Risk Assessment and Data Mitigation Index (RAdMI). This framework will identify, assess, and mitigate the risks associated with using this dataset. The goal is to ensure that it can be used responsibly while minimizing the potential for generating harmful outputs.

    1. Risk Identification

      • Type of Risk: The dataset contains prompts that may generate inappropriate, harmful, or offensive content.
      • Categories of Risk:
        • Ethical Risks: Content promoting inappropriate depictions of people or situations.
        • Legal Risks: Potential for misuse leading to defamation, harm, or violation of intellectual property rights.
        • Safety Risks: Encouraging harmful behavior or content that can lead to unsafe outcomes.
      • Prominent Risk Areas:
        • Violence or Misuse of Imagery: Certain prompts may encourage depictions of violence or dangerous scenarios.
        • Sexual Content: The dataset includes sexualized or suggestive prompts which could be inappropriate.

    1. Assessment Criteria

      • Severity: High – due to the dataset containing potentially offensive and harmful material.
      • Likelihood: Medium to high – depending on how widely the dataset is used or shared in model training.
      • Impact: Significant – depending on the scale of use, this dataset could have a wide-reaching negative impact on the AI outputs.
    2. Mitigation Strategies

      • Filtering and Removal:
        • Implement a filter to remove prompts containing sexually explicit or offensive terms.
        • Flag and exclude prompts associated with violent, harmful, or otherwise inappropriate content.
      • Preprocessing:
        • Add data labels for sensitivity, using categories like "safe," "unsafe," or "flagged for review."
        • Apply automated or manual review to flag prompts that could generate harmful outputs.
      • Content Moderation Tools:
        • Integrate models to automatically detect offensive language or imagery descriptors in the prompts.
    3. Mitigation Metrics

      • Percentage of Filtered Content: Track the percentage of unsafe content identified and filtered out during preprocessing.
      • False Positives/Negatives: Track the number of incorrectly flagged prompts to adjust the model's sensitivity accordingly.
      • Review and Adjustment Frequency: Regularly review flagged data for updating filters and refining the assessment process.
    4. Recommendations

      • Training Safeguards: Ensure AI models trained using this dataset are also provided with ethical guidelines and filters.
      • End-User Warnings: If this dataset is used in production, notify users about potential risks related to sensitive content generation.
      • Collaboration with Experts: Involve ethicists or legal experts in the review process of high-risk content.
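    The filtering, labelling, and metric steps listed above can be sketched in a few lines. This is a minimal illustration, not the dataset author's implementation: the term lists, the function names `label_prompt` and `mitigation_metrics`, and the use of plain keyword matching are all assumptions made for the example. Only the three label names come from the description.

```python
import re

# Hypothetical term lists (assumption): a real pipeline would use curated
# lexicons or a trained moderation model instead of a handful of keywords.
UNSAFE_TERMS = {"gore", "explicit"}
REVIEW_TERMS = {"weapon", "blood"}


def label_prompt(prompt: str) -> str:
    """Return 'unsafe', 'flagged for review', or 'safe' for one prompt."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    if words & UNSAFE_TERMS:
        return "unsafe"
    if words & REVIEW_TERMS:
        return "flagged for review"
    return "safe"


def mitigation_metrics(prompts, gold_unsafe):
    """Compute the filtered percentage and false positive/negative counts.

    gold_unsafe[i] is a human judgement (True means unsafe) for prompts[i].
    """
    if not prompts:
        return {"filtered_pct": 0.0, "false_positives": 0, "false_negatives": 0}
    predicted = [label_prompt(p) == "unsafe" for p in prompts]
    return {
        # Share of prompts the filter would remove.
        "filtered_pct": 100.0 * sum(predicted) / len(prompts),
        # Removed by the filter although a reviewer judged them safe.
        "false_positives": sum(p and not g for p, g in zip(predicted, gold_unsafe)),
        # Kept by the filter although a reviewer judged them unsafe.
        "false_negatives": sum(g and not p for p, g in zip(predicted, gold_unsafe)),
    }
```

    Tracking false negatives alongside the filtered percentage matters here because a keyword filter can look effective while still missing unsafe prompts that avoid the listed terms.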
  3. Poor statistical reporting, inadequate data presentation and spin persist...

    • plos.figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Joanna Diong; Annie A. Butler; Simon C. Gandevia; Martin E. Héroux (2023). Poor statistical reporting, inadequate data presentation and spin persist despite editorial advice [Dataset]. http://doi.org/10.1371/journal.pone.0202121
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS, http://plos.org/
    Authors
    Joanna Diong; Annie A. Butler; Simon C. Gandevia; Martin E. Héroux
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Journal of Physiology and British Journal of Pharmacology jointly published an editorial series in 2011 to improve standards in statistical reporting and data analysis. It is not known whether reporting practices changed in response to the editorial advice. We conducted a cross-sectional analysis of reporting practices in a random sample of research papers published in these journals before (n = 202) and after (n = 199) publication of the editorial advice. Descriptive data are presented. There was no evidence that reporting practices improved following publication of the editorial advice. Overall, 76-84% of papers with written measures that summarized data variability used standard errors of the mean, and 90-96% of papers did not report exact p-values for primary analyses and post-hoc tests. 76-84% of papers that plotted measures to summarize data variability used standard errors of the mean, and only 2-4% of papers plotted raw data used to calculate variability. Of papers that reported p-values between 0.05 and 0.1, 56-63% interpreted these as trends or statistically significant. Implied or gross spin was noted incidentally in papers before (n = 10) and after (n = 9) the editorial advice was published. Overall, poor statistical reporting, inadequate data presentation and spin were present before and after the editorial advice was published. While the scientific community continues to implement strategies for improving reporting practices, our results indicate stronger incentives or enforcements are needed.

  4. Data from: Predicting inadequate postoperative pain management in depressed...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Feb 6, 2019
    Cite
    Carroll, Ian; Curtin, Catherine; Banerjee, Imon; Parthipan, Arjun; Asch, Steven M.; Hernandez-Boussard, Tina; Humphreys, Keith (2019). Predicting inadequate postoperative pain management in depressed patients: A machine learning approach [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000097204
    Explore at:
    Dataset updated
    Feb 6, 2019
    Authors
    Carroll, Ian; Curtin, Catherine; Banerjee, Imon; Parthipan, Arjun; Asch, Steven M.; Hernandez-Boussard, Tina; Humphreys, Keith
    Description

    Widely-prescribed prodrug opioids (e.g., hydrocodone) require conversion by liver enzyme CYP-2D6 to exert their analgesic effects. The most commonly prescribed antidepressants, selective serotonin reuptake inhibitors (SSRIs), inhibit CYP-2D6 activity and therefore may reduce the effectiveness of prodrug opioids. We used a machine learning approach to identify patients prescribed a combination of SSRIs and prodrug opioids postoperatively and to examine the effect of this combination on postoperative pain control. Using EHR data from an academic medical center, we identified patients receiving surgery over a 9-year period. We developed and validated natural language processing (NLP) algorithms to extract depression-related information (diagnosis, SSRI use, symptoms) from structured and unstructured data elements. The primary outcome was the difference between preoperative pain score and postoperative pain at discharge, 3-week and 8-week time points. We developed computational models to predict the increase or decrease in the postoperative pain across the 3 time points by using the patient’s EHR data (e.g. medications, vitals, demographics) captured before surgery. We evaluated the generalizability of the model using 10-fold cross-validation, in which the holdout test is repeated 10 times and the mean area under the curve (AUC) is used as the evaluation metric for prediction performance. We identified 4,306 surgical patients with symptoms of depression. A total of 14.1% were prescribed both an SSRI and a prodrug opioid, 29.4% were prescribed an SSRI and a non-prodrug opioid, 18.6% were prescribed a prodrug opioid but were not on SSRIs, and 37.5% were prescribed a non-prodrug opioid but were not on SSRIs. Our NLP algorithm identified depression with an F1 score of 0.95 against manual annotation of 300 randomly sampled clinical notes.

    On average, patients receiving prodrug opioids had lower average pain scores (p<0.05), with the exception of the SSRI+ group at the 3-week postoperative follow-up. However, SSRI+/Prodrug+ patients had significantly worse pain control at discharge and at the 3-week and 8-week follow-ups (p < .01) compared to SSRI+/Prodrug- patients, whereas there was no difference in pain control among the SSRI- patients by prodrug opioid (p>0.05). The machine learning algorithm accurately predicted the increase or decrease of the discharge, 3-week and 8-week follow-up pain scores when compared to the pre-operative pain score using 10-fold cross validation (mean area under the receiver operating characteristic curve 0.87, 0.81, and 0.69, respectively). Preoperative pain, surgery type, and opioid tolerance were the strongest predictors of postoperative pain control. We provide the first direct clinical evidence that the known ability of SSRIs to inhibit prodrug opioid effectiveness is associated with worse pain control among depressed patients. Current prescribing patterns indicate that prescribers may not account for this interaction when choosing an opioid. The study results imply that prescribers might instead choose direct acting opioids (e.g. oxycodone or morphine) in depressed patients on SSRIs.
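    The evaluation scheme named in this description, 10-fold cross-validation with the mean AUC as the metric, can be illustrated with a small standard-library sketch. It is not the study's code: the function names, the stratified fold assignment, and the toy one-feature `fit_centroid` model are assumptions chosen to keep the example self-contained.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes in the held-out fold")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_mean_auc(X, y, fit, k=10, seed=0):
    """Mean held-out AUC over k folds.

    `fit(X_train, y_train)` must return a scoring function for one sample.
    Folds are stratified (an assumption) so each one contains both classes.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, t in enumerate(y) if t == 1]
    neg_idx = [i for i, t in enumerate(y) if t == 0]
    rng.shuffle(pos_idx)
    rng.shuffle(neg_idx)
    folds = [pos_idx[i::k] + neg_idx[i::k] for i in range(k)]
    aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        aucs.append(auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx]))
    return sum(aucs) / k


def fit_centroid(X_train, y_train):
    """Toy one-feature model: higher score means closer to the positive mean."""
    pos = [x for x, t in zip(X_train, y_train) if t == 1]
    neg = [x for x, t in zip(X_train, y_train) if t == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: abs(x - mu_neg) - abs(x - mu_pos)
```

    Each sample is scored exactly once by a model that never saw it during fitting, which is what makes the averaged AUC an estimate of out-of-sample performance.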

  5. Data for: Inadequate sampling of the soundscape leads to overoptimistic...

    • search.dataone.org
    • datadryad.org
    • +1more
    Updated Jul 21, 2025
    Cite
    Thomas Lewis (2025). Data for: Inadequate sampling of the soundscape leads to overoptimistic estimates of recogniser performance: A case study of two sympatric macaw species [Dataset]. http://doi.org/10.5061/dryad.5x69p8d7j
    Explore at:
    Dataset updated
    Jul 21, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Thomas Lewis
    Time period covered
    Jan 1, 2023
    Description

    Passive acoustic monitoring (PAM) offers the potential to dramatically increase the scale and robustness of species monitoring in rainforest ecosystems. PAM generates large volumes of data that require automated methods of target species detection. Species-specific recognisers, which often use supervised machine learning, can achieve this goal. However, they require a large training dataset of both target and non-target signals, which is time-consuming and challenging to create. Unfortunately, very little information about creating training datasets for supervised machine learning recognisers is available, especially for tropical ecosystems. Here we show an iterative approach to creating a training dataset that improved recogniser precision from 0.12 to 0.55. By sampling background noise using an initial small recogniser, we addressed one of the significant challenges of training dataset creation in acoustically diverse environments. Our work demonstrates that recognisers will likely f...

    Raw data used to create this dataset was collected from autonomous recording units in northern Costa Rica. A template-matching process was used to identify candidate signals, then a one-second window was put around each candidate signal. We extracted a total of 113 acoustic features using the warbleR package in R (R Core Team, 2020): 20 measurements of frequency, time, and amplitude parameters, and 93 Mel-frequency cepstral coefficients (MFCCs) (Araya-Salas and Smith-Vidaurre, 2017). This dataset also includes the results of manually checking detections that were the output of a trained random forest. These were initially output as selection tables; individual sound files were loaded in Raven Lite, the selection tables were loaded, and each detection was manually checked and labelled. There is also the random forest model, which is a .rds format model created using tidymodels in R.

    Following the code associated with this data will require R; the outputs from the machine learning require Raven Lite to open. The raw recordings are not included in this dataset.

  6. Replication Data for: When Correlation Is Not Enough: Validating Populism...

    • dataverse.harvard.edu
    Updated Oct 21, 2022
    Cite
    Michael Jankowski; Robert A. Huber (2022). Replication Data for: When Correlation Is Not Enough: Validating Populism Scores from Supervised Machine-Learning Models [Dataset]. http://doi.org/10.7910/DVN/DDXRXI
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 21, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Michael Jankowski; Robert A. Huber
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Despite the ongoing success of populist parties in many parts of the world, we lack comprehensive information about parties' level of populism over time. A recent contribution to Political Analysis by Di Cocco and Monechi (DCM) suggests that this research gap can be closed by predicting parties' populism scores from their election manifestos using supervised machine-learning. In this paper, we provide a detailed discussion of the suggested approach. Building on recent debates about the validation of machine-learning models, we argue that the validity checks provided in DCM's paper are insufficient. We conduct a series of additional validity checks and empirically demonstrate that the approach is not suitable for deriving populism scores from texts. We conclude that measuring populism over time and between countries remains an immense challenge for empirical research. More generally, our paper illustrates the importance of more comprehensive validations of supervised machine-learning models.

  7. Summary of GPT-4 TR review.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jan 18, 2024
    Cite
    Jack Gallifant; Amelia Fiske; Yulia A. Levites Strekalova; Juan S. Osorio-Valencia; Rachael Parke; Rogers Mwavu; Nicole Martinez; Judy Wawira Gichoya; Marzyeh Ghassemi; Dina Demner-Fushman; Liam G. McCoy; Leo Anthony Celi; Robin Pierce (2024). Summary of GPT-4 TR review. [Dataset]. http://doi.org/10.1371/journal.pdig.0000417.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 18, 2024
    Dataset provided by
    PLOS Digital Health
    Authors
    Jack Gallifant; Amelia Fiske; Yulia A. Levites Strekalova; Juan S. Osorio-Valencia; Rachael Parke; Rogers Mwavu; Nicole Martinez; Judy Wawira Gichoya; Marzyeh Ghassemi; Dina Demner-Fushman; Liam G. McCoy; Leo Anthony Celi; Robin Pierce
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The study provides a comprehensive review of OpenAI’s Generative Pre-trained Transformer 4 (GPT-4) technical report, with an emphasis on applications in high-risk settings like healthcare. A diverse team, including experts in artificial intelligence (AI), natural language processing, public health, law, policy, social science, healthcare research, and bioethics, analyzed the report against established peer review guidelines. The GPT-4 report shows a significant commitment to transparent AI research, particularly in creating a systems card for risk assessment and mitigation. However, it reveals limitations such as restricted access to training data, inadequate confidence and uncertainty estimations, and concerns over privacy and intellectual property rights. Key strengths identified include the considerable time and economic investment in transparent AI research and the creation of a comprehensive systems card. On the other hand, the lack of clarity in training processes and data raises concerns about encoded biases and interests in GPT-4. The report also lacks confidence and uncertainty estimations, crucial in high-risk areas like healthcare, and fails to address potential privacy and intellectual property issues. Furthermore, this study emphasizes the need for diverse, global involvement in developing and evaluating large language models (LLMs) to ensure broad societal benefits and mitigate risks. The paper presents recommendations such as improving data transparency, developing accountability frameworks, establishing confidence standards for LLM outputs in high-risk settings, and enhancing industry research review processes. It concludes that while GPT-4’s report is a step towards open discussions on LLMs, more extensive interdisciplinary reviews are essential for addressing bias, harm, and risk concerns, especially in high-risk domains. 
    The review aims to expand the understanding of LLMs in general and highlights the need for new forms of reflection on how LLMs are reviewed, the data required for effective evaluation, and how to address critical issues like bias and risk.

  8. Percent of Structurally Deficient Bridges - 2011 to 2014

    • catalog.data.gov
    • opendata.maryland.gov
    • +4more
    Updated Jun 21, 2025
    Cite
    opendata.maryland.gov (2025). Percent of Structurally Deficient Bridges - 2011 to 2014 [Dataset]. https://catalog.data.gov/dataset/percent-of-structurally-deficient-bridges-2011-to-2014
    Explore at:
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    opendata.maryland.gov
    Description

    The percentage of MDTA structurally deficient (SD) bridges within a given Calendar year. Inspections are conducted every year on all MDTA facilities. Defects identified in yearly inspections are assigned a defect rating based on severity by the inspector. As a measure of quality assurance and control, the rating is reviewed and confirmed by in-house inspection staff during review of the inspection reports. The defects are prioritized based on the confirmed rating and major rehabilitation, total replacement or proactive system preservation projects are developed based on the priorities and overall condition assessment. Bridges are considered structurally deficient if significant load carrying elements are found to be in a poor (or worse) condition due to deterioration and/or damage, or have a low weight restriction. The fact that a bridge is structurally deficient does not imply that it is unsafe. MDTA reported 5 structurally deficient bridges from the 2011 inspection cycle and four of these bridges are already in construction with the fifth in design for full replacement in 2014. The number of structurally deficient bridges is not anticipated to increase significantly over the next several years as MDTA continues to address needs using a bridge management system to identify and address those bridges that are nearing the end of their useful life.

  9. Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment: Income

    • ceicdata.com
    Cite
    CEICdata.com, Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment: Income [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-rate-household-survey-quarterly/gihs-quarterly-metropolitan-ur-inadequate-employment-income
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment: Income data was reported at 18.222 % in Dec 2021, a decrease from the previous figure of 19.421 % for Sep 2021. The series is updated quarterly, with a median of 24.374 % from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 28.442 % in Sep 2007 and a record low of 18.222 % in Dec 2021. The series remains active in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G067: Underemployment Rate: 2005 Household Survey: Quarterly. Q1 2020 data for this series has not yet been released by DANE due to COVID-19 and will be available once released by the source.

  10. Lack Of Memes Price Prediction Data

    • coinbase.com
    Updated Nov 6, 2025
    Cite
    (2025). Lack Of Memes Price Prediction Data [Dataset]. https://www.coinbase.com/price-prediction/lack-of-memes
    Explore at:
    Dataset updated
    Nov 6, 2025
    Variables measured
    Growth Rate, Predicted Price
    Measurement technique
    User-defined projections based on compound growth. This is not a formal financial forecast.
    Description

    This dataset contains the predicted prices of the asset Lack Of Memes over the next 16 years. This data is calculated initially using a default 5 percent annual growth rate, and after page load, it features a sliding scale component where the user can then further adjust the growth rate to their own positive or negative projections. The maximum positive adjustable growth rate is 100 percent, and the minimum adjustable growth rate is -100 percent.
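    The projection described here is ordinary compound growth: the price after n years is the starting price times (1 + rate) raised to the power n. The sketch below is only an illustration of that arithmetic. The function name `project_prices` and the starting price in the usage note are assumptions, while the 5 percent default, the 16-year horizon, and the limits of -100 and 100 percent come from the description.

```python
def project_prices(current_price, annual_growth=0.05, years=16):
    """Return (year_offset, projected_price) pairs under compound growth.

    annual_growth is a fraction (0.05 means 5 percent) and is clamped to
    the range [-1.0, 1.0], matching the adjustable limits described above.
    """
    rate = max(-1.0, min(1.0, annual_growth))
    return [(n, current_price * (1.0 + rate) ** n) for n in range(1, years + 1)]
```

    For example, `project_prices(100.0)` yields 16 pairs starting at roughly 105.0 for year 1, and a rate of -1.0 drives every projected price to zero.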

  11. Data from: Insufficient LGBTQ+ education across disciplines suggested by...

    • openicpsr.org
    Updated Nov 27, 2024
    Cite
    Lexie Wille; Tess Jewell; Atticus Wolfe; Emily Peterson; Aileen Shaughnessy; Cole Roblee; Alex Strader (2024). Insufficient LGBTQ+ education across disciplines suggested by national survey of health professionals in training [Dataset]. http://doi.org/10.3886/E212022V1
    Explore at:
    Dataset updated
    Nov 27, 2024
    Dataset provided by
    Agnes Scott College, Department of Public Health
    Ohio University
    Columbia University Irving Medical Center
    Rosalind Franklin University of Medicine and Science
    University of Wisconsin School of Medicine and Public Health
    Case Western Reserve University
    Le Moyne College
    Authors
    Lexie Wille; Tess Jewell; Atticus Wolfe; Emily Peterson; Aileen Shaughnessy; Cole Roblee; Alex Strader
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Health professionals often feel underprepared to treat patients who identify as lesbian, gay, bisexual, transgender, and/or queer (LGBTQ+). Additionally, lack of access to professionals who are knowledgeable about LGBTQ+ inclusive care contributes to the myriad of health disparities experienced by LGBTQ+ communities. This cross-sectional survey study explores the preparedness of healthcare profession trainees for caring for LGBTQ+ patients by quantifying the hours and quality of training health profession trainees receive in LGBTQ+ education across disciplines. We surveyed US-based health professionals in training (HPiT) across disciplines (N=155) on their training programs’ LGBTQ+-specific curricula and educational opportunities. Ordered logistic regression analysis assessed the relationship between the number of hours of LGBTQ+-specific education and other discipline, organization, and individual factors. Respondents reported an average of 4.75 (SD = 3.04) hours devoted to LGBTQ+-specific education. Physician assistant trainees reported receiving the highest number of hours of LGBTQ+-specific education (M = 6.63, SD = 1.98), followed by psychology (M = 5.30, SD = 3.54), medical (M = 5.12, SD = 3.38), nursing (M = 4.17, SD = 3.28), and trainees in other health fields (M = 3.88, SD = 2.47). Across all disciplines, trainees rated their LGBTQ+-specific education on average as “good”. Despite rising awareness, the quantity and quality of dedicated LGBTQ+-specific education remains concerningly low across all measured disciplines and US regions. Future research must investigate strategies to overcome common barriers to increasing LGBTQ+ education in health professions training by maximizing the impact of limited hours through integrating LGBTQ+ content into existing materials, supporting trainee leadership, and implementing institutional support for educators teaching LGBTQ+ content. 
    Regulatory bodies must reconsider the current guidance for LGBTQ+ education quantity and quality to advise institutions on best-practice guidelines to prepare trainees for LGBTQ+ patient care.

  12. Data from: Graph Neural Network Integrating Self-Supervised Pretraining for...

    • acs.figshare.com
    xlsx
    Updated Sep 18, 2024
    Cite
    Jingyi Zhu; Yuanxi Huang; Lingjun Bu; Yangtao Wu; Shiqing Zhou (2024). Graph Neural Network Integrating Self-Supervised Pretraining for Precise and Interpretable Prediction of Micropollutant Treatability by HO•‑Based Advanced Oxidation Processes [Dataset]. http://doi.org/10.1021/acsestengg.4c00389.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    ACS Publications
    Authors
    Jingyi Zhu; Yuanxi Huang; Lingjun Bu; Yangtao Wu; Shiqing Zhou
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Machine learning (ML) has become a crucial tool to accelerate research in advanced oxidation processes via predicting reaction parameters to evaluate the treatability of micropollutants (MPs). However, insufficient data sets and an incomplete prediction mechanism remain obstacles toward the precise prediction of MP treatability by a hydroxyl radical (HO•), especially when k values approach the diffusion-controlled limit. Herein, we propose a novel graph neural network (GNN) model integrating self-supervised pretraining on a large unlabeled data set (∼10 million) to predict the kHO values on MPs. Our model outperforms the common-seen and literature-established ML models on both whole data sets and diffusion-controlled limit data sets. Benefiting from the pretraining process, we demonstrate that k-value-related chemistry wisdom contained in the pretrained data set is fully exploited, and the learned knowledge can be transferred among data sets. In comparison with molecular fingerprints, we identify that molecular graphs (MGs) cover more structural information beyond substituents, facilitating a k-value prediction near the diffusion-controlled limit. In particular, we observe that mechanistic pathways of HO•-initiated reactions could be automatically classified and mapped out on the penultimate layer of our model. The phenomenon shows that the GNN model can be trained to excavate mechanistic knowledge by analyzing the kinetic parameters. These findings not only well interpret the robust model performance but also extrapolate the k-value prediction model to mechanistic elucidation, leading to better decision making in water treatment.

  13. Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment:...

    • ceicdata.com
    Cite
    CEICdata.com, Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment: Qualifications [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-rate-household-survey-quarterly/gihs-quarterly-metropolitan-ur-inadequate-employment-qualifications
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Metropolitan: UR: Inadequate Employment: Qualifications data was reported at 13.174 % in Dec 2021, a decrease from the previous figure of 14.036 % for Sep 2021. The series is updated quarterly, with a median of 15.751 % from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 20.100 % in Dec 2007 and a record low of 3.010 % in Mar 2004. The series remains active in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G067: Underemployment Rate: 2005 Household Survey: Quarterly. Q1 2020 data for this series has not yet been released by DANE due to COVID-19 and will be available once released by the source.

  14. Births byCity 2014

    • data.ferndalemi.gov
    • cloud.csiss.gmu.edu
    • +8more
    Updated Mar 8, 2017
    Cite
    Data Driven Detroit (2017). Births byCity 2014 [Dataset]. https://data.ferndalemi.gov/maps/D3::births-bycity-2014
    Dataset updated
    Mar 8, 2017
    Dataset authored and provided by
    Data Driven Detroit
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains birth information, by city, for the state of Michigan in 2014. Included are births by ethnicity, number of births with inadequate prenatal care, number of low-weight births, and births to teen mothers. Inadequate prenatal care was defined as births rated "Intermediate" or "Inadequate" on the Kessner Scale. Births of infants weighing under 2,500 grams were considered low-weight births. Teen mothers were defined as mothers under the age of 20. The Michigan Office of Vital Statistics provided individual birth data, which was then suppressed by Data Driven Detroit.

  15. Data from: Bayesian Calibration of Inexact Computer Models

    • tandf.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Matthew Plumlee (2023). Bayesian Calibration of Inexact Computer Models [Dataset]. http://doi.org/10.6084/m9.figshare.3493532.v2
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis https://taylorandfrancis.com/
    Authors
    Matthew Plumlee
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bayesian calibration is used to study computer models in the presence of both a calibration parameter and model bias. The parameter in the predominant methodology is left undefined. This results in an issue where the posterior of the parameter is suboptimally broad. There have been no generally accepted alternatives to date. This article proposes using Bayesian calibration where the prior distribution on the bias is orthogonal to the gradient of the computer model. Problems associated with Bayesian calibration are shown to be mitigated through analytic results in addition to examples. Supplementary materials for this article are available online.

  16. Colombia GIHS: Quarterly: Population: Male: UR: Inadequate Employment:...

    • ceicdata.com
    Updated Dec 15, 2021
    Cite
    CEICdata.com (2021). Colombia GIHS: Quarterly: Population: Male: UR: Inadequate Employment: Income [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-rate-household-survey-quarterly/gihs-quarterly-population-male-ur-inadequate-employment-income
    Dataset updated
    Dec 15, 2021
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Population: Male: UR: Inadequate Employment: Income data was reported at 23.039 % in Dec 2021. This is an increase from the previous figure of 22.668 % for Sep 2021. The series is updated quarterly, with a median of 26.617 % from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 34.016 % in Sep 2007 and a record low of 20.654 % in Jun 2001. The series remains in active status in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G067: Underemployment Rate: 2005 Household Survey: Quarterly. March 2020 data for this series has not yet been released by DANE due to COVID-19. Data will be available once released by the source.

  17. Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment:...

    • ceicdata.com
    Cite
    CEICdata.com, Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment: Qualifications [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-household-survey-quarterly/gihs-quarterly-population-female-ue-inadequate-employment-qualifications
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment: Qualifications data was reported at 1,466.699 Person th in Dec 2021. This is a decrease from the previous figure of 1,518.883 Person th for Sep 2021. The series is updated quarterly, with a median of 1,504.549 Person th from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 1,944.175 Person th in Jun 2016 and a record low of 213.671 Person th in Mar 2001. The series remains in active status in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G063: Underemployment: 2005 Household Survey: Quarterly. March 2020 data for this series has not yet been released by DANE due to COVID-19. Data will be available once released by the source.

  18. Control measures and re-evaluation of unacceptable risks.

    • plos.figshare.com
    xls
    Updated Mar 10, 2025
    Cite
    Xiagang Luan; Lingling Ke; Minxuan Feng; Weiqun Peng; Houlong Luo; Hao Xue; Yong Xia (2025). Control measures and re-evaluation of unacceptable risks. [Dataset]. http://doi.org/10.1371/journal.pone.0319817.t003
    Available download formats: xls
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Xiagang Luan; Lingling Ke; Minxuan Feng; Weiqun Peng; Houlong Luo; Hao Xue; Yong Xia
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Control measures and re-evaluation of unacceptable risks.

  19. Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment:...

    • ceicdata.com
    Updated Feb 28, 2018
    Cite
    CEICdata.com (2018). Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment: Income [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-household-survey-quarterly/gihs-quarterly-population-female-ue-inadequate-employment-income
    Dataset updated
    Feb 28, 2018
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Population: Female: UE: Inadequate Employment: Income data was reported at 2,210.106 Person th in Dec 2021. This is a decrease from the previous figure of 2,248.140 Person th for Sep 2021. The series is updated quarterly, with a median of 2,333.030 Person th from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 2,879.453 Person th in Dec 2014 and a record low of 1,397.656 Person th in Jun 2001. The series remains in active status in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G063: Underemployment: 2005 Household Survey: Quarterly. March 2020 data for this series has not yet been released by DANE due to COVID-19. Data will be available once released by the source.

  20. Colombia GIHS: Quarterly: Metropolitan: Male: UE: Inadequate Employment:...

    • ceicdata.com
    Cite
    CEICdata.com, Colombia GIHS: Quarterly: Metropolitan: Male: UE: Inadequate Employment: Income [Dataset]. https://www.ceicdata.com/en/colombia/underemployment-household-survey-quarterly/gihs-quarterly-metropolitan-male-ue-inadequate-employment-income
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2018 - Dec 1, 2021
    Area covered
    Colombia
    Variables measured
    Underemployment
    Description

    Colombia GIHS: Quarterly: Metropolitan: Male: UE: Inadequate Employment: Income data was reported at 1,167.566 Person th in Dec 2021. This is a decrease from the previous figure of 1,260.469 Person th for Sep 2021. The series is updated quarterly, with a median of 1,356.543 Person th from Mar 2001 to Dec 2021 across 81 observations. The data reached an all-time high of 1,600.827 Person th in Sep 2012 and a record low of 1,083.829 Person th in Jun 2001. The series remains in active status in CEIC and is reported by the National Administrative Department of Statistics (DANE). The data is categorized under Global Database’s Colombia – Table CO.G063: Underemployment: 2005 Household Survey: Quarterly. Q1 2020 data for this series has not yet been released by DANE due to COVID-19. Data will be available once released by the source.
