30 datasets found
  1. utility script

    • kaggle.com
    Updated Jul 3, 2020
    Cite
    NaomiDing (2020). utility script [Dataset]. https://www.kaggle.com/datasets/naomiding/utility-script
    Available download formats
    zip (5030 bytes)
    Authors
    NaomiDing
    Description

    This dataset was created by NaomiDing

  2. Utility Scripts

    • kaggle.com
    Updated Jul 10, 2024
    Cite
    pratap surwase (2024). Utility Scripts [Dataset]. https://www.kaggle.com/datasets/pratapsurwase/utility-scripts/code
    Available download formats
    zip (5555 bytes)
    Authors
    pratap surwase
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by pratap surwase

    Released under Apache 2.0


  3. utility_script

    • kaggle.com
    Updated Feb 19, 2024
    Cite
    GodGod3 (2024). utility_script [Dataset]. https://www.kaggle.com/datasets/godgod3/utility-script
    Available download formats
    zip (5047 bytes)
    Authors
    GodGod3
    Description

    This dataset was created by GodGod3

  4. indoor io_f file

    • kaggle.com
    Updated Apr 5, 2021
    Cite
    Charlie Craine (2021). indoor io_f file [Dataset]. https://www.kaggle.com/crained/indoor-io-f-file
    Available download formats
    zip (880 bytes)
    Authors
    Charlie Craine
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Charlie Craine

    Released under CC0: Public Domain


  5. torch_transforms

    • kaggle.com
    Updated Feb 21, 2023
    Cite
    Uttam Mittal (2023). torch_transforms [Dataset]. https://www.kaggle.com/datasets/uttammittal02/torch-transforms
    Available download formats
    zip (47128 bytes)
    Authors
    Uttam Mittal
    Description

    This dataset was created by Uttam Mittal

  6. video_and_util_for_inference_example

    • kaggle.com
    Updated Feb 5, 2020
    Cite
    D. Forester (2020). video_and_util_for_inference_example [Dataset]. https://www.kaggle.com/telemaque/video-and-util-for-inference-example
    Available download formats
    zip (2034313 bytes)
    Authors
    D. Forester
    Description

    This dataset was created by D. Forester

  7. arabic-wikipedia-scrapper-utility-scripts

    • kaggle.com
    Updated Jan 20, 2024
    Cite
    Omar Yasser Salah El-Din (2024). arabic-wikipedia-scrapper-utility-scripts [Dataset]. https://www.kaggle.com/datasets/omaryassersalaheldin/arabic-wikipedia-scrapper-utility-scripts
    Available download formats
    zip (3071 bytes)
    Authors
    Omar Yasser Salah El-Din
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by Omar Yasser Salah El-Din

    Released under Apache 2.0


  8. Kaggle-KL-Div

    • kaggle.com
    Updated Jan 13, 2024
    Cite
    Chris Deotte (2024). Kaggle-KL-Div [Dataset]. https://www.kaggle.com/cdeotte/kaggle-kl-div
    Available download formats
    zip (3023 bytes)
    Authors
    Chris Deotte
    Description

    Utility scripts to compute the KL Divergence metric in Kaggle's HMS - Harmful Brain Activity Classification competition. The code came from Kaggle. For an example of how to use it, see one of my starter notebooks:

    Starter Notebooks:
    * EfficientNetB2 starter here
    * CatBoost starter here
    * WaveNet starter here
    * MLP starter here
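
    For reference, a generic computation of this metric might look like the following (a minimal sketch of KL divergence between class-probability distributions, not the competition's exact code; the function name is illustrative):

      import numpy as np

      def kl_divergence(y_true, y_pred, eps=1e-15):
          # Total KL divergence summed over samples and classes.
          y_true = np.asarray(y_true, dtype=float)
          y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)
          mask = y_true > 0  # zero-probability terms contribute nothing
          return float(np.sum(y_true[mask] * np.log(y_true[mask] / y_pred[mask])))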

  9. Backup title+sep+anc+sep+target script Debug

    • kaggle.com
    Updated May 30, 2022
    Cite
    Shamim Ahamed (2022). Backup title+sep+anc+sep+target script Debug [Dataset]. https://www.kaggle.com/datasets/sahamed/backup-titlesepancseptarget-script-debug
    Available download formats
    zip (4826760921 bytes)
    Authors
    Shamim Ahamed
    Description

    The score decreased after adding special tokens. To check the utility script, I removed all the special tokens and trained the model with 50% of the total data to reduce training time. The bug turned out to be that I applied tokenization before adding the special tokens to the tokenizer. I kept this trained model in case the bug still exists.

  10. val-imagenet1000

    • kaggle.com
    Updated Dec 26, 2024
    Cite
    r2d2_c3po (2024). val-imagenet1000 [Dataset]. https://www.kaggle.com/datasets/crpatel123/val-imagenet1000
    Available download formats
    zip (6670376275 bytes)
    Authors
    r2d2_c3po
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The ImageNet 1000 validation set is not distributed in a proper format, so I used a utility script to convert it into a consumable format:

    val/
    ├── n01440764
    │  ├── ILSVRC2012_val_00000293.JPEG
    │  ├── ILSVRC2012_val_00002138.JPEG
    │  ├── ......
    ├── ......
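
    A minimal sketch of the kind of conversion such a utility script performs, assuming a mapping file from validation filename to WordNet synset ID (the file name and format are assumptions, not part of this dataset):

      import os
      import shutil

      # Hypothetical mapping file: one "<filename> <synset_id>" pair per line.
      mapping = {}
      with open("val_ground_truth.txt") as f:
          for line in f:
              filename, synset = line.split()[:2]
              mapping[filename] = synset

      # Move each flat validation image into its class subfolder.
      for filename, synset in mapping.items():
          os.makedirs(os.path.join("val", synset), exist_ok=True)
          shutil.move(os.path.join("val", filename),
                      os.path.join("val", synset, filename))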
    
  11. TF2Bert

    • kaggle.com
    Updated Dec 17, 2019
    Cite
    Miro (2019). TF2Bert [Dataset]. https://www.kaggle.com/amrabed/tf2bert/code
    Available download formats
    Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Miro
    Description

    This dataset consists of modified versions of the Bert scripts by Phil Culliton to be imported into kernels as a single library instead of adding each module as a separate utility script:

    1. bert_modeling
    2. bert_optimization
    3. bert_tokenization
    4. bert_baseline*

    For the baseline script, I am using a simplified/organized version of the translation by dimitreoliveira instead of the original one by the TensorFlow team.

    Usage

    To use in your Kernel:

    1. Add this dataset as a new dataset. That adds the files to your input folder under the TF2Bert subfolder.
    2. Import the library modules using the following commands in your script:
    
      import os
      os.chdir("/kaggle/input")
      from tf2bert import modeling, optimization, tokenization, baseline
      os.chdir('/kaggle/working')
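
    Alternatively, extending the module search path avoids changing directories (a sketch under the same assumed /kaggle/input layout):

      # Equivalent approach: add the input folder to Python's module search path.
      import sys
      sys.path.append("/kaggle/input")
      from tf2bert import modeling, optimization, tokenization, baseline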
    

    License

    As described in the files, the code is licensed under the Apache 2.0 license

  12. EfficientNet Keras Weights B0-B5

    • kaggle.com
    Updated Jul 17, 2019
    Cite
    Neuron Engineer (2019). EfficientNet Keras Weights B0-B5 [Dataset]. https://www.kaggle.com/ratthachat/efficientnet-keras-weights-b0b5
    Available download formats
    zip (280760493 bytes)
    Authors
    Neuron Engineer
    Description

    Credits

    All credit is due to https://github.com/qubvel/efficientnet. Thanks so much for your contribution!

    Usage:

    Use with this utility script: https://www.kaggle.com/ratthachat/efficientnet/

    Add this utility script to your kernel, and you will be able to use all models just like standard Keras pretrained models. For details, see https://www.kaggle.com/c/aptos2019-blindness-detection/discussion/100186
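
    Usage might then look like the following, assuming the utility script exposes the same interface as the qubvel/efficientnet package it is based on (a sketch, not the script's confirmed API):

      # Instantiate a pretrained EfficientNet like a standard Keras application.
      from efficientnet import EfficientNetB0

      model = EfficientNetB0(weights="imagenet", include_top=False,
                             input_shape=(224, 224, 3))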

  13. Kaggle: Forum Discussions

    • kaggle.com
    Updated Nov 8, 2025
    Cite
    Nicolás Ariel González Muñoz (2025). Kaggle: Forum Discussions [Dataset]. https://www.kaggle.com/datasets/nicolasgonzalezmunoz/kaggle-forum-discussions
    Available download formats
    zip (542099 bytes)
    Authors
    Nicolás Ariel González Muñoz
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Note: This is a work in progress, and not all the Kaggle forums are included in this dataset yet. The remaining forums will be added once I finish resolving some issues with the data generators for those forums.

    Summary

    Welcome to the Kaggle Forum Discussions dataset! This dataset contains curated data about recent discussions opened in the different forums on Kaggle. The data is obtained through web scraping with the selenium library, and text data is converted to Markdown using the markdownify package.

    This dataset contains information about each discussion's main topic, title, comments, votes, medals, and more, and is designed to complement the data available in the Kaggle Meta dataset, specifically for recent discussions. Keep reading for the details.

    Extraction Technique

    Because Kaggle is a dynamic website that relies heavily on JavaScript (JS), I extracted the data in this dataset through web scraping with the selenium library.

    The functions and classes used to scrape the data on Kaggle are stored in a utility script publicly available here. Since JS-generated pages like Kaggle's are unstable when scraped, the script implements retrying connections and waiting for elements to appear.

    Each forum was scraped in its own notebook, and those notebooks feed a central notebook that generates this dataset. Discussions are scraped in parallel to improve speed. This dataset represents all the data that can be gathered in a single notebook session, from the most recent discussion to the oldest.

    If you need more control over the data you want to research, feel free to import whatever you need from the utility script mentioned above.
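
    The wait-for-elements pattern described above looks roughly like this in selenium (a minimal sketch with an illustrative URL and selector, not the actual utility script):

      from selenium import webdriver
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
      from selenium.webdriver.support.ui import WebDriverWait

      driver = webdriver.Chrome()
      driver.get("https://www.kaggle.com/discussions")

      # Wait up to 20 seconds for the JS-rendered discussion list to appear.
      wait = WebDriverWait(driver, 20)
      items = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "ul li")))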

    Structure

    This dataset contains several folders, each named after the discussion forum it contains data about. For example, the 'competition-hosting' folder contains data about the Competition Hosting forum. Inside each folder, you'll find two files: a csv file and a json file.

    The json file (in Python, represented as a dictionary) is indexed by the ID that Kaggle assigns to each discussion. Each ID is paired with its corresponding discussion, which is represented as a nested dictionary (the discussion dict) containing the following fields:
    - title: The title of the main topic.
    - content: Content of the main topic.
    - tags: List containing the discussion's tags.
    - datetime: Date and time at which the discussion was published (in ISO 8601 format).
    - votes: Number of votes received by the discussion.
    - medal: Medal awarded to the main topic (if any).
    - user: User that published the main topic.
    - expertise: Publisher's expertise, measured by the Kaggle progression system.
    - n_comments: Total number of comments in the current discussion.
    - n_appreciation_comments: Total number of appreciation comments in the current discussion.
    - comments: Dictionary containing data about the comments in the discussion. Each comment is indexed by an ID assigned by Kaggle and contains the following fields:
      - content: Comment's content.
      - is_appreciation: Whether the comment is an appreciation comment.
      - is_deleted: Whether the comment was deleted.
      - n_replies: Number of replies to the comment.
      - datetime: Date and time at which the comment was published (in ISO 8601 format).
      - votes: Number of votes received by the comment.
      - medal: Medal awarded to the comment (if any).
      - user: User that published the comment.
      - expertise: Publisher's expertise, measured by the Kaggle progression system.
      - n_deleted: Total number of deleted replies (including self).
      - replies: A dict following this same format.

    The csv file, on the other hand, serves as a summary of the json file, with comment information limited to the hottest and most-voted comments.

    Note: Only the 'content' field is mandatory for each discussion. The availability of the other fields is subject to the stability of the scraping tasks, which may also affect the update frequency.
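
    Reading one forum's json file might look like this (the folder and file names below are illustrative assumptions):

      import json

      # Load a forum's discussions, indexed by Kaggle-assigned discussion ID.
      with open("competition-hosting/discussions.json") as f:
          discussions = json.load(f)

      for disc_id, disc in discussions.items():
          print(disc_id, disc["content"][:80])  # only 'content' is guaranteed
          for comment_id, comment in disc.get("comments", {}).items():
              print("  reply count:", comment.get("n_replies"))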

  14. TF 2.0 QA - Simplified - DataFrame

    • kaggle.com
    Updated Nov 12, 2019
    Cite
    Pavel Kovalets (2019). TF 2.0 QA - Simplified - DataFrame [Dataset]. https://www.kaggle.com/feanorpk/tf-20-qa-simplified-dataframe
    Available download formats
    zip (262240620 bytes)
    Authors
    Pavel Kovalets
    Description

    Content

    This dataset was created from the TensorFlow 2.0 Question Answering primary dataset using this very handy utility script. The main differences from the original are:
    - the structure is flattened to a simple DataFrame
    - long_answer_candidates were removed
    - only the first annotation is kept for both long and short answers (for short answers this is a reasonable approximation, because very few samples have multiple short answers)
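
    The flattening described above might look roughly like this (a sketch; the field names follow the TF 2.0 QA simplified jsonl format, and the file name is an assumption):

      import json
      import pandas as pd

      rows = []
      with open("simplified-nq-train.jsonl") as f:
          for line in f:
              example = json.loads(line)
              annotation = example["annotations"][0]  # keep only the first annotation
              rows.append({
                  "example_id": example["example_id"],
                  "question": example["question_text"],
                  "long_answer": annotation["long_answer"],
                  "short_answers": annotation["short_answers"][:1],
              })
      df = pd.DataFrame(rows)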

    Acknowledgements

    Thanks to xhlulu for providing the utility script.

  15. SportsTransformer_utils

    • kaggle.com
    Updated Oct 24, 2024
    Cite
    Pavel V (2024). SportsTransformer_utils [Dataset]. https://www.kaggle.com/datasets/pvabish/sportstransformer-utils
    Available download formats
    zip (8414 bytes)
    Authors
    Pavel V
    Description

    This dataset contains the utility scripts meant to be used with the BigDataBowl2025 notebook titled "Modeling with Transformers, by SumerSports". The utility scripts included are data_prep.py, process_datasets.py, and models.py.

  16. Data from: drug discovery knowledge graph

    • kaggle.com
    Updated Apr 17, 2021
    Cite
    Jerry Chan (2021). drug discovery knowledge graph [Dataset]. https://www.kaggle.com/catwhisker/drug-discovery-knowledge-graph
    Available download formats
    zip (672725285 bytes)
    Authors
    Jerry Chan
    Description

    Presentation slides about KGE and the proposed KGE regression idea: https://docs.google.com/presentation/d/1u8kVSd6CcxlealP-2E2jpOdvNCSCovfJTn0hbpfOLhY/edit?usp=sharing
    Written report: https://docs.google.com/document/d/1UvCrD3jrf7Z3qPdV6C-6GWgDJYHDfBJyXWm8G-sGTeM/edit?usp=sharing

    This utility script (https://www.kaggle.com/catwhisker/kge-experiment) contains a class which provides the following utilities:
    1. caching and loading PyKEEN models and datasets
    2. creating subgraphs from PyKEEN datasets using an entity mask
    3. evaluating different models with plots
    4. minimal and imperfect GPU memory management

    A demo of the utility script on a toy graph: https://www.kaggle.com/catwhisker/kge-experinment-demo

    Links to the scripts that generate each item are listed in the item's description.
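
    For illustration, caching a PyKEEN model as in item 1 might look like this (a sketch using PyKEEN's public pipeline API, not the utility class's actual interface):

      from pykeen.pipeline import pipeline

      # Train a small KGE model and cache it to disk for later reuse.
      result = pipeline(dataset="Nations", model="TransE",
                        training_kwargs=dict(num_epochs=5))
      result.save_to_directory("cache/transe-nations")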

  17. Microsoft COCO (Zhao et al 2017)

    • kaggle.com
    Updated Oct 21, 2019
    Cite
    Rachael Tatman (2019). Microsoft COCO (Zhao et al 2017) [Dataset]. https://www.kaggle.com/rtatman/ms-coco
    Available download formats
    zip (19282796 bytes)
    Authors
    Rachael Tatman
    Description

    Context

    This dataset contains pickled Python objects with data from the annotations of the Microsoft (MS) COCO dataset. COCO is a large-scale object detection, segmentation, and captioning dataset.

    Content

    Except for the objs file, which is a plain text file containing a list of objects, the data in this dataset is all in the pickle format, a way of storing Python objects as binary data files.

    Important: These pickles were pickled using Python 2. Since Kernels use Python 3, you will need to specify the encoding when unpickling these files. The Python utility scripts here have been updated to correctly unpickle these files.

    # the correct syntax to read these pickled files into Python 3
    import pickle
    data = pickle.load(open('file_path', 'rb'), encoding="latin1")
    

    Acknowledgements

    As a derivative of the original COCO dataset, this dataset is distributed under a CC-BY 4.0 license. These files were distributed as part of the supporting materials for Zhao et al 2017. If you use these files in your work, please cite the following paper:

    Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2017). Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2979-2989).

  18. NIH DeepLesion Tensor Slices

    • kaggle.com
    Updated Mar 30, 2023
    Cite
    Benjamin Ahlbrecht (2023). NIH DeepLesion Tensor Slices [Dataset]. https://www.kaggle.com/datasets/benjaminahlbrecht/nih-deeplesion-tensor-slices
    Available download formats
    zip (18921282754 bytes)
    Authors
    Benjamin Ahlbrecht
    Description

    NIH DeepLesion Tensor Slices

    NIH DeepLesion Tensor Slices is a subset of NIH's DeepLesion dataset, which may be downloaded in full from here. Full credit is to be given to the NIH team for creating the DeepLesion dataset.

    Getting Started (In Progress)

    To see an example of this dataset in use, we perform bounding box regression using a ResNet-50 network pre-trained on ImageNet, creating a model named LesionFinder.
    - LesionFinder: Training. The training notebook; we fit the adapted ResNet-50 model to the training data.
    - LesionFinder: Evaluation. The evaluation notebook; we gauge the performance of the trained model.
    - LesionFinder: Utilities. Utility script providing functions and classes for the training and evaluation notebooks.

    Downloading NIH DeepLesion Tensor Slices

    You may download and preprocess the entire DeepLesion dataset to generate this subset by following the instructions given here. Please note that downloading and processing the entire dataset requires approximately 375 GB of disk space. If you simply want the tensor slices, it is highly suggested to download them directly from Kaggle.

    Data Preprocessing

    NIH DeepLesion Tensor Slices performs two primary preprocessing steps on the DeepLesion dataset.
    1. The images and bounding boxes are normalized to the range [0, 1]. The images are converted from Hounsfield units to pixel (voxel) values, which are feature-scaled to [0, 1] given the maximum and minimum values of the DICOM window. The bounding boxes are scaled as a proportion of the respective image dimension, where a value of 0 means the lesion lies at that dimension's minimum and a value of 1 means it lies at that dimension's maximum.
    2. We stack local x-ray images to encode local volumetric information. For a given key slice, we create an image with 3 channels by prepending the previous slice and appending the next slice to the key slice. Since most pre-trained networks expect 3 channels, this will hopefully help our models use information surrounding the key slice.
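
    A sketch of the two steps, assuming a (num_slices, height, width) volume in Hounsfield units and a (low, high) DICOM window; the function names are illustrative, not taken from the dataset's actual scripts:

      import numpy as np

      def window_normalize(volume, low, high):
          # Step 1: feature-scale Hounsfield units to [0, 1] within the DICOM window.
          return np.clip((volume - low) / (high - low), 0.0, 1.0)

      def stack_key_slice(volume, key):
          # Step 2: channels = previous slice (R), key slice (G), next slice (B).
          return np.stack([volume[key - 1], volume[key], volume[key + 1]], axis=-1)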

    [Figure: Visualizing Tensor Slices]

    When visualizing an image, this means the red channel encodes information from the previous slice, the green channel encodes information from the key slice, and the blue channel encodes information from the next slice. As an aside, we can visualize how the RGB version of CT images compares to simple grayscale images.

    [Figures: Grayscale CT Animation; RGB CT Animation]

    Discussion: Questions, Concerns, Feedback, and Collaboration

    Feel free to start a discussion if you have any information you want to share, any questions or concerns you may have, or whatever else. If you believe there are errors or better preprocessing approaches, please let me know! I'm always looking to improve my work and will always appreciate the criticism, and I'd be happy to ensure the validity of the dataset and its preprocessing.

    Dataset Attribution and Citation

    I did NOT create the primary dataset. Any work done using the dataset should be referenced to the original authors below:

    Yan, Ke, et al. "DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning." Journal of medical imaging 5.3 (2018): 036501-036501.

    @article{yan2018deeplesion,
     title={DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning},
     author={Yan, Ke and Wang, Xiaosong and Lu, Le and Summers, Ronald M},
     journal={Journal of medical imaging},
     volume={5},
     number={3},
     pages={036501--036501},
     year={2018},
     publisher={Society of Photo-Optical Instrumentation Engineers}
    }
    
  19. Glove 6B JSON format

    • kaggle.com
    Updated Apr 28, 2023
    Cite
    rachid_ouhammou (2023). Glove 6B JSON format [Dataset]. https://www.kaggle.com/datasets/ouhammourachid/glove-6b-json-format
    Available download formats
    zip (904273329 bytes)
    Authors
    rachid_ouhammou
    Description

    This dataset contains the Glove 6B word embeddings in JSON format, providing a comprehensive collection of pre-trained word vectors that can be used to enhance a wide range of natural language processing (NLP) projects. The Glove 6B embeddings were trained on a corpus of 6 billion tokens from a 2014 dump of English Wikipedia and Gigaword 5. Because the embeddings are pre-trained on this large corpus, they can be used as a starting point for a variety of NLP tasks, including text classification, sentiment analysis, and named entity recognition.

    The JSON format makes it easy to access and use the embeddings in your own projects, without the need to perform any additional preprocessing. Each word in the vocabulary is represented as a JSON object containing the word itself as well as its corresponding 300-dimensional vector. The dataset also includes utility scripts that demonstrate how to load and use the embeddings in popular NLP libraries such as NLTK and spaCy.
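
    Loading and using the embeddings might look like this (a sketch; the file name and {word: vector} layout are assumptions based on the description above):

      import json
      import numpy as np

      # Load {word: 300-dimensional vector} pairs from the JSON file.
      with open("glove.6B.300d.json") as f:
          embeddings = {word: np.asarray(vector) for word, vector in json.load(f).items()}

      def cosine_similarity(a, b):
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      print(cosine_similarity(embeddings["king"], embeddings["queen"]))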

    Whether you're a researcher, developer, or hobbyist working on NLP projects, the Glove 6B word embeddings in JSON format provide a powerful tool for accelerating your work and achieving better results. Download the dataset today and start exploring the possibilities!

  20. E.C.H.O. - Crisis Simulation

    • kaggle.com
    Updated Nov 26, 2025
    Cite
    VISHNUSELVAM186 (2025). E.C.H.O. - Crisis Simulation [Dataset]. https://www.kaggle.com/datasets/vishnuselvam186/echo-crisis-simulation
    Available download formats
    zip (7200553 bytes)
    Authors
    VISHNUSELVAM186
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    E.C.H.O. is an AI-powered multi-agent crisis communication simulator that helps users practice de-escalation, empathy, and emotional intelligence. It uses Actor, Monitor, and Director agents to analyze user input, track tension levels, and generate dynamic, emotionally realistic responses across multiple scenarios.

    📄 README.md This file provides a full overview of the E.C.H.O. project, including architecture, setup instructions, agents, scenarios, and future roadmap. It serves as the main documentation for understanding how to run and extend the simulation.

    🧠 agents/ (folder) Contains all multi-agent logic for E.C.H.O. Each Python file represents a standalone agent responsible for part of the simulation loop.

    actor.py – Generates emotionally responsive dialogue for the crisis character.

    monitor.py – Evaluates user sentiment and adjusts tension levels (0–100).

    director.py – Controls complexity, inserts environmental complications, and manages narrative pacing.

    graph.py – Defines the LangGraph workflow connecting Actor, Monitor, and Director agents.

    state.py – Stores global simulation state including tension score, history, persona, and turn count.

    🎨 frontend/ (folder) A lightweight HTML/JS interface for running E.C.H.O. in a browser. Contains assets, icons, and UI elements for scenario selection and interaction.

    ⚙ app.py Streamlit application that serves as the main UI for E.C.H.O. Displays tension meter, scenario visuals, microphone input, agent responses, and conversation history.

    📜 requirements.txt Python library dependencies used by the project, including LangGraph, Streamlit, Gemini API clients, and audio processing tools.

    🔧 zip_project.py Utility script that compresses the entire E.C.H.O. project into a single .zip file for Kaggle uploads or backups.
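
    A minimal sketch of what such a zipping utility might do (an assumption, not the project's actual zip_project.py):

      import shutil

      # Compress the whole project directory into echo_project.zip.
      shutil.make_archive("echo_project", "zip", root_dir=".")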

    ⚠ .env.example Template file showing expected environment variables, including the required Google Gemini API key.

    🧳 .kaggleignore Specifies files or folders to exclude during Kaggle dataset packaging (e.g., sensitive or unnecessary files).
