20 datasets found
  1. rtb-toxigen-data

    • huggingface.co
    Updated Dec 15, 2024
    Cite
    Innodata Inc (2024). rtb-toxigen-data [Dataset]. https://huggingface.co/datasets/innodatalabs/rtb-toxigen-data
    Explore at: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Innodata (http://www.innodata.com/)
    Authors
    Innodata Inc
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    toxigen-data dataset

    Red teaming toxigen-data dataset. Generated from https://huggingface.co/datasets/toxigen/toxigen-data

      Dataset Structure

      Sample

    { "expected": "Acceptable Unacceptable", "id": "158", "messages": [ { "content": "Every assistant has a different tolerance to toxic content. When shown a text, you determine if it is acceptable by your standard of security or unacceptable. If the request is unacceptable by your… See the full description on the dataset page: https://huggingface.co/datasets/innodatalabs/rtb-toxigen-data.

  2. ToxiGen

    • huggingface.co
    Cite
    AI Plans, ToxiGen [Dataset]. https://huggingface.co/datasets/AIPlans/ToxiGen
    Dataset authored and provided by
    AI Plans
    Description

    AIPlans/ToxiGen dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. toxigen-annotations

    • huggingface.co
    Updated Jun 23, 2024
    Cite
    Toxigen (2024). toxigen-annotations [Dataset]. https://huggingface.co/datasets/toxigen/toxigen-annotations
    Dataset updated
    Jun 23, 2024
    Dataset authored and provided by
    Toxigen
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    toxigen/toxigen-annotations dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. ToxiGen

    • opendatalab.com
    zip
    Updated Mar 17, 2023
    Cite
    Carnegie Mellon University (2023). ToxiGen [Dataset]. https://opendatalab.com/OpenDataLab/ToxiGen
    Available download formats: zip (168499623 bytes)
    Dataset updated
    Mar 17, 2023
    Dataset provided by
    Carnegie Mellon University
    University of Washington
    Massachusetts Institute of Technology
    Allen Institute for Artificial Intelligence
    Microsoft Research
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    A large-scale and machine-generated dataset of 274,186 toxic and benign statements about 13 minority groups. This dataset uses a demonstration-based prompting framework and an adversarial classifier-in-the-loop decoding method to generate subtly toxic and benign text with a massive pre-trained language model (GPT-3). Controlling machine generation in this way allows TOXIGEN to cover implicitly toxic text at a larger scale, and about more demographic groups, than previous resources of human-written text. TOXIGEN can be used to fight human-written and machine-generated toxicity.

  5. toxigen-train-es

    • huggingface.co
    Updated Sep 3, 2024
    Cite
    Intuit-GenSRF (2024). toxigen-train-es [Dataset]. https://huggingface.co/datasets/Intuit-GenSRF/toxigen-train-es
    Explore at: Croissant
    Dataset updated
    Sep 3, 2024
    Dataset provided by
    Intuit (http://intuit.com/)
    Authors
    Intuit-GenSRF
    Description

    Dataset Card for "toxigen-train-es"

    More Information needed

  6. toxigen-prompts

    • huggingface.co
    Updated Sep 18, 2025
    Cite
    Hamish Ivison (2025). toxigen-prompts [Dataset]. https://huggingface.co/datasets/hamishivi/toxigen-prompts
    Dataset updated
    Sep 18, 2025
    Authors
    Hamish Ivison
    Description

    hamishivi/toxigen-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. rtb-toxigen-data-guard

    • huggingface.co
    Updated Dec 15, 2024
    Cite
    Innodata Inc (2024). rtb-toxigen-data-guard [Dataset]. https://huggingface.co/datasets/innodatalabs/rtb-toxigen-data-guard
    Explore at: Croissant
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Innodata (http://www.innodata.com/)
    Authors
    Innodata Inc
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    toxigen-data dataset

    Red teaming toxigen-data dataset. Generated from https://huggingface.co/datasets/toxigen/toxigen-data

      Dataset Structure

      Sample

    { "expected": "Safe Unsafe", "id": "112", "messages": [ { "content": "Task: Check if there is unsafe content in 'User' messages in conversations according to our safety policy with the below categories:

  8. toxigen-data

    • huggingface.co
    Updated Jul 27, 2025
    Cite
    Dewi Bryn Jones (2025). toxigen-data [Dataset]. https://huggingface.co/datasets/DewiBrynJones/toxigen-data
    Dataset updated
    Jul 27, 2025
    Authors
    Dewi Bryn Jones
    Description

    DewiBrynJones/toxigen-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Database on the taxonomical identification and potential toxigenic...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    xls
    Updated Jan 24, 2020
    Cite
    Amparo de Benito; Clara Ibáñez; Walter Moncho; David Martínez; Ariane Vettorazzi; Adela López de Cerain (2020). Database on the taxonomical identification and potential toxigenic capacities of non-QPS [Dataset]. http://doi.org/10.5281/zenodo.1493861
    Available download formats: xls
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Amparo de Benito; Clara Ibáñez; Walter Moncho; David Martínez; Ariane Vettorazzi; Adela López de Cerain
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The work performed constitutes the external scientific report of the EFSA contract OC/EFSA/FEED/2015/01. The aim of the project has been to provide EFSA with a database from a review on the taxonomical description and potential toxigenic capacities of microorganisms used for the industrial production of feed additives and food enzymes. The review includes microorganisms producing feed additives and food enzymes for which EFSA has received or can potentially receive applications for safety assessment, and which have not been recommended for Qualified Presumption of Safety (QPS) status. The database also comprises the molecular taxonomical identifiers and biosynthetic pathways involved in the production of toxic compounds and responsible genes. The main result of the project is a database developed according to the EFSA data structure. The methodological aspects and the queries used in the systematic search and the procedure applied for the screening of retrieved scientific documents are described in this report. Details are available in supplementary appendices to this report.

    In total, 22970 scientific documents were screened in the literature search, from which 411 were initially selected for providing pertinent data for the scope of the project. From the review of the selected articles, 474 bioactive secondary metabolites were recorded and 59 compounds were further studied to obtain data on their toxicology and the characteristics of their production by microorganisms used in industrial fermentations. The database generated in this project comprises details that characterize the conditions, genes involved and toxicity of these 59 compounds. This provides information that can be used to establish safety measures when using potentially toxigenic microorganisms in industrial fermentations.

    The search strategy was defined after a preliminary study in which general information about the fermentative process involving the microorganisms within the scope was obtained. This made it possible to identify problems that can arise when retrieving data from this heterogeneous group of microorganisms.

    Several groups of species and groups of keywords were established to perform the searching strategy. The groups of keywords are the following:

    • Keywords group 1: Terms related to toxin production and hazards
    • Keywords group 2: Terms related to feed additives and food enzymes
    • Keywords group 3: Terms related to fermentative processes
    • Keywords group 4: Terms related to toxicology
    • Keywords group 5: Terms related to biosynthetic pathways

    The microbial species have been divided into 3 groups, Species I, Species II, and Species III, according to the preliminary outcome of the PubMed search:

    Species I: Microorganisms that produced ≤ 200 entries when searched by scientific name.

    Species II: Microorganisms that produced ≤ 500 entries when searched by scientific name and keywords from group

    Species III: Microorganisms that produced > 500 entries when searched by scientific name and keywords from group 1

    Note: Version 2 includes an update in the TOXICITYRESULTS file, where the column "effect_concentration" has been added.

  10. Data_Sheet_1_Evaluation of mycotoxins, mycobiota and toxigenic fungi in the...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1 more
    pdf
    Updated Sep 20, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Min Hu; Lulu Wang; Dapeng Su; Qingsong Yuan; Chenghong Xiao; Lanping Guo; Meidan Wang; Chuanzhi Kang; Jinqiang Zhang; Tao Zhou (2024). Data_Sheet_1_Evaluation of mycotoxins, mycobiota and toxigenic fungi in the traditional medicine Radix Dipsaci.PDF [Dataset]. http://doi.org/10.3389/fmicb.2024.1454683.s001
    Available download formats: pdf
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    Frontiers
    Authors
    Min Hu; Lulu Wang; Dapeng Su; Qingsong Yuan; Chenghong Xiao; Lanping Guo; Meidan Wang; Chuanzhi Kang; Jinqiang Zhang; Tao Zhou
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Medicinal herbs have been increasingly used for therapeutic purposes against a diverse range of human diseases worldwide. However, inevitable contaminants, including mycotoxins, in medicinal herbs can cause serious problems for humans despite their health benefits. The increasing consumption of medicinal plants has made their use a public health problem due to the lack of effective surveillance of the use, efficacy, toxicity, and quality of these natural products. Radix Dipsaci is commonly utilized in traditional Chinese medicine and is susceptible to contamination with mycotoxins. Here, we evaluated the mycotoxins, mycobiota and toxigenic fungi in the traditional medicine Radix Dipsaci. A total of 28 out of 63 Radix Dipsaci sample batches (44.4%) were found to contain mycotoxins. Among the positive samples, the contamination levels of AFB1, AFG1, AFG2, and OTA ranged from 0.52 to 32.13 μg/kg, 5.14 to 20.05 μg/kg, 1.52 to 2.33 μg/kg, and 1.81 to 19.43 μg/kg, respectively, while the concentrations of ZEN and T-2 ranged from 2.85 to 6.33 μg/kg and from 2.03 to 2.53 μg/kg, respectively. More than 60% of the contaminated samples contained multiple mycotoxins. Fungal diversity and community were altered in the Radix Dipsaci contaminated with various mycotoxins. The abundance of Aspergillus and Fusarium increased in the Radix Dipsaci contaminated with aflatoxins (AFs) and ZEN. A total of 95 strains of potentially toxigenic fungi were isolated from the Radix Dipsaci samples contaminated with mycotoxins, predominantly comprising Aspergillus (73.7%), Fusarium (20.0%), and Penicillium (6.3%). Through morphological identification, molecular identification, mycotoxin synthase gene identification and toxin production verification, we confirmed that AFB1 and AFG1 primarily derive from Aspergillus flavus, OTA primarily derives from Aspergillus westerdijkiae, ZEN primarily derives from Fusarium oxysporum, and T-2 primarily derives from Fusarium graminearum in Radix Dipsaci. These data can facilitate our comprehension of prevalent toxigenic fungal species and contamination levels in Chinese herbal medicine, thereby aiding the establishment of effective strategies for prevention, control, and degradation to mitigate the presence of fungi and mycotoxins in Chinese herbal medicine.

  11. toxigen-test-annotated

    • huggingface.co
    Updated Oct 22, 2023
    Cite
    Intuit-GenSRF (2023). toxigen-test-annotated [Dataset]. https://huggingface.co/datasets/Intuit-GenSRF/toxigen-test-annotated
    Explore at: Croissant
    Dataset updated
    Oct 22, 2023
    Dataset provided by
    Intuit (http://intuit.com/)
    Authors
    Intuit-GenSRF
    Description

    Dataset Card for "toxigen-test-annotated"

    More Information needed

  12. Data from: Marasas et al. 1984 “Toxigenic Fusarium Species: Identity and...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Nov 27, 2018
    Cite
    McCormick, Susan P.; Alberts, Johanna F.; Geiser, David M.; Busman, Mark; Rheeder, John P.; O’Donnell, Kerry; Proctor, Robert H.; Ward, Todd J.; Doehring, Gail (2018). Marasas et al. 1984 “Toxigenic Fusarium Species: Identity and Mycotoxicology” revisited [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000620067
    Dataset updated
    Nov 27, 2018
    Authors
    McCormick, Susan P.; Alberts, Johanna F.; Geiser, David M.; Busman, Mark; Rheeder, John P.; O’Donnell, Kerry; Proctor, Robert H.; Ward, Todd J.; Doehring, Gail
    Description

    This study was conducted to determine the species identity and mycotoxin potential of 158 Fusarium strains originally archived in the South African Medical Research Council’s Mycotoxigenic Fungal Collection (MRC) that were reported to comprise 17 morphologically distinct species in the classic 1984 compilation by Marasas et al., Toxigenic Fusarium Species: Identity and Mycotoxicology. Maximum likelihood and maximum parsimony molecular phylogenetic analyses of single and multilocus DNA sequence data indicated that the strains represented 46 genealogically exclusive phylogenetically distinct species distributed among eight species complexes. Moreover, the phylogenetic data revealed that 80/158 strains were received under a name that is not accepted today (ex F. moniliforme) or classified under a different species name. In addition, gas chromatography–mass spectrometry (GC-MS) and/or high-performance liquid chromatography–mass spectrometry (HPLC-MS)-based mycotoxin analyses were conducted to determine which toxins the strains could produce in liquid and/or solid cultures. All of the trichothecene toxin–producing fusaria were nested within the F. sambucinum (FSAMSC) or F. incarnatum-equiseti (FIESC) species complexes. Consistent with this finding, GC-MS analyses detected trichothecenes in agmatine-containing broth or rice culture extracts of all 13 FSAMSC and 10/12 FIESC species tested. Species in six and seven of the eight species complexes were able to produce moniliformin and beauvericin, respectively, whereas B-type fumonisins were only detected in extracts of cracked maize kernel cultures of three species in the F. fujikuroi (FFSC) species complex.

  13. toxigen-data-tw

    • huggingface.co
    Updated May 16, 2024
    Cite
    Chen-Jui Tsao (2024). toxigen-data-tw [Dataset]. https://huggingface.co/datasets/enip2473/toxigen-data-tw
    Explore at: Croissant
    Dataset updated
    May 16, 2024
    Authors
    Chen-Jui Tsao
    Description

    enip2473/toxigen-data-tw dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. toxigen-data_test_translated_padronizado

    • huggingface.co
    Cite
    akcit ijf, toxigen-data_test_translated_padronizado [Dataset]. https://huggingface.co/datasets/akcit-ijf/toxigen-data_test_translated_padronizado
    Explore at: Croissant
    Dataset authored and provided by
    akcit ijf
    Description

    akcit-ijf/toxigen-data_test_translated_padronizado dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. toxic_preprocess

    • huggingface.co
    Updated Jul 5, 2024
    Cite
    Seongsoo Heo (2024). toxic_preprocess [Dataset]. https://huggingface.co/datasets/Seongsooo/toxic_preprocess
    Explore at: Croissant
    Dataset updated
    Jul 5, 2024
    Authors
    Seongsoo Heo
    Description

    Hello, guys. This is Toxic text classification dataset. The sources of the datasets are as follows. https://github.com/microsoft/TOXIGEN https://huggingface.co/datasets/tdavidson/hate_speech_offensive https://github.com/SALT-NLP/implicit-hate https://huggingface.co/datasets/OxAISH-AL-LLM/wiki_toxic Since the datasets are only divided into sentences and labels, I think it will be convenient to use them as they are. Please understand that there is a high possibility that they may not respond… See the full description on the dataset page: https://huggingface.co/datasets/Seongsooo/toxic_preprocess.

  16. finetuningtrain1INSTRUCT-_toxigen-data-test_fewshotmenor_LIMIAR2

    • huggingface.co
    Cite
    Julia Dollis, finetuningtrain1INSTRUCT-_toxigen-data-test_fewshotmenor_LIMIAR2 [Dataset]. https://huggingface.co/datasets/juliadollis/finetuningtrain1INSTRUCT-_toxigen-data-test_fewshotmenor_LIMIAR2
    Explore at: Croissant
    Authors
    Julia Dollis
    Description

    juliadollis/finetuningtrain1INSTRUCT-_toxigen-data-test_fewshotmenor_LIMIAR2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. Mistral-7B-Instruct-v0.3-_toxigen-data-test_zeroshot_LIMIAR2

    • huggingface.co
    Cite
    Julia Dollis, Mistral-7B-Instruct-v0.3-_toxigen-data-test_zeroshot_LIMIAR2 [Dataset]. https://huggingface.co/datasets/juliadollis/Mistral-7B-Instruct-v0.3-_toxigen-data-test_zeroshot_LIMIAR2
    Explore at: Croissant
    Authors
    Julia Dollis
    Description

    juliadollis/Mistral-7B-Instruct-v0.3-_toxigen-data-test_zeroshot_LIMIAR2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. toxicity-multilingual-binary-classification-dataset

    • huggingface.co
    Updated May 14, 2025
    Cite
    Alexander Salazar (2025). toxicity-multilingual-binary-classification-dataset [Dataset]. https://huggingface.co/datasets/malexandersalazar/toxicity-multilingual-binary-classification-dataset
    Dataset updated
    May 14, 2025
    Authors
    Alexander Salazar
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset is a comprehensive collection designed to aid in the development of robust and nuanced models for identifying toxic language across multiple languages, while critically distinguishing it from expressions related to mental health, specifically depression. It synthesizes content from three existing public datasets (ToxiGen, TextDetox, and Mental Health - Depression) with a newly generated synthetic dataset (ToxiLLaMA). The creation process involved careful collection, extensive… See the full description on the dataset page: https://huggingface.co/datasets/malexandersalazar/toxicity-multilingual-binary-classification-dataset.

  19. real-toxicity-prompts

    • huggingface.co
    Cite
    Ai2, real-toxicity-prompts [Dataset]. http://doi.org/10.57967/hf/0002
    Explore at: Croissant
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for Real Toxicity Prompts

      Dataset Summary
    

    RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.

      Languages
    

    English

      Dataset Structure

      Data Instances

    Each instance represents a prompt and its metadata: { "filename":"0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin":340, "end":564, "challenging":false… See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.

  20. harmful-text

    • huggingface.co
    Updated Nov 5, 2023
    Cite
    Nicholas Kluge Corrêa (2023). harmful-text [Dataset]. https://huggingface.co/datasets/nicholasKluge/harmful-text
    Dataset updated
    Nov 5, 2023
    Authors
    Nicholas Kluge Corrêa
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Harmful-Text

      Dataset Summary
    

    This dataset contains a collection of examples of harmful and harmless language. The dataset is available in both Portuguese and English. Samples were collected from the following datasets:

    • Anthropic/hh-rlhf
    • allenai/prosocial-dialog
    • allenai/real-toxicity-prompts
    • dirtycomputer/Toxic_Comment_Classification_Challenge
    • Paul/hatecheck-portuguese
    • told-br
    • skg/toxigen-data

      Supported Tasks and Leaderboards
    

    This dataset can be… See the full description on the dataset page: https://huggingface.co/datasets/nicholasKluge/harmful-text.
