74 datasets found
  1. Data from: ViTexOCR; a script to extract text overlays from digital video

    • catalog.data.gov
    • data.usgs.gov
    • +4more
    Updated Nov 19, 2025
    Cite
    U.S. Geological Survey (2025). ViTexOCR; a script to extract text overlays from digital video [Dataset]. https://catalog.data.gov/dataset/vitexocr-a-script-to-extract-text-overlays-from-digital-video
    Dataset provided by
    U.S. Geological Survey
    Description

    The ViTexOCR script presents a new method for extracting navigation data from videos with text overlays using optical character recognition (OCR) software. Over the past few decades, it was common for videos recorded during surveys to be overlaid with real-time geographic positioning satellite (GPS) chyrons including latitude, longitude, date, and time, as well as other ancillary data (such as speed, heading, or user-defined identifying fields). Embedding these data in the video keeps them visible and accurate, but using the location data for other purposes, such as analysis in a geographic information system, is not possible while the data are only available on the video display. Extracting the text data from the imagery with software allows these videos to be located and analyzed in a geospatial context. The script allows a user to select a video and specify the text data types (e.g., latitude, longitude, date, time, or other), the text color, and the pixel locations of the overlay text on a sample video frame. The script’s output is a data file containing the retrieved geospatial and temporal data. All functionality is bundled in a Python script that incorporates a graphical user interface and several other software dependencies.
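The OCR pass itself is not shown here, but the step after it, turning recognized overlay text into usable coordinates, can be sketched. The degrees-decimal-minutes chyron format and the `parse_ddm` helper below are illustrative assumptions, not taken from ViTexOCR's actual code.

```python
import re

def parse_ddm(token):
    """Parse a degrees-decimal-minutes chyron token such as 'N 27 45.600'
    into signed decimal degrees. The token format is a hypothetical example
    of the kind of text an OCR pass over a GPS overlay might return."""
    m = re.match(r"([NSEW])\s*(\d+)\s+(\d+\.\d+)", token.strip())
    if not m:
        raise ValueError(f"unrecognized chyron token: {token!r}")
    hemi, deg, minutes = m.group(1), int(m.group(2)), float(m.group(3))
    value = deg + minutes / 60.0
    return -value if hemi in "SW" else value

# A cropped overlay region, OCR'd to text, then parsed to decimal degrees:
lat = parse_ddm("N 27 45.600")   # ≈ 27.76
lon = parse_ddm("W 082 30.000")  # -82.5
```

A real pipeline would crop the user-selected pixel region from each frame and feed it to an OCR engine before this parsing step.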

  2. Parcels with overlay attributes

    • catalog.data.gov
    • data.sfgov.org
    • +1more
    Updated Nov 23, 2025
    Cite
    data.sfgov.org (2025). Parcels with overlay attributes [Dataset]. https://catalog.data.gov/dataset/parcels-with-overlay-attributes
    Dataset provided by
    data.sfgov.org
    Description

    A. SUMMARY This dataset is derived from parcels and several overlay administrative boundaries (listed below). It was developed by DataSF as a convenience for matching parcels to districts, which can be simpler than running a geospatial process every time you want to join parcels to a boundary. The districts provided here run along streets and are non-overlapping, so each parcel is contained within a single district. The boundaries included are:
    1. Analysis Neighborhoods
    2. Supervisor Districts
    3. Police Districts
    4. Planning Districts
    B. HOW THE DATASET IS CREATED A script runs daily that overlays parcels with each of the boundaries to produce the composite dataset.
    C. UPDATE PROCESS Updated daily by a script based on the upstream parcels dataset, which is also updated daily.
    D. HOW TO USE THIS DATASET You can use this dataset to match the administrative districts provided here to datasets that contain a parcel number. This can be simpler than running these joins spatially. In short, we pre-process the spatial overlays to make joins simpler and more performant.
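The attribute join the description recommends (instead of a spatial overlay) might look like this in pandas; the column names and values below are illustrative, not the dataset's actual schema.

```python
import pandas as pd

# Hypothetical extract of the overlay dataset: parcel number -> districts.
overlays = pd.DataFrame({
    "parcel_number": ["0001A", "0002B", "0003C"],
    "analysis_neighborhood": ["Mission", "Sunset/Parkside", "Bayview"],
    "supervisor_district": ["9", "4", "10"],
})

# Any dataset keyed by parcel number can pick up districts with a plain
# merge -- no geospatial processing required.
permits = pd.DataFrame({
    "parcel_number": ["0002B", "0003C"],
    "permit_id": [101, 102],
})

joined = permits.merge(overlays, on="parcel_number", how="left")
print(joined[["permit_id", "analysis_neighborhood"]])
```

Because the districts are non-overlapping, each parcel number maps to exactly one row, so the merge never duplicates records.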

  3. Revix_engine_Knock_complete_dataset

    • kaggle.com
    zip
    Updated Sep 17, 2025
    Cite
    Arinda Emmanuel (2025). Revix_engine_Knock_complete_dataset [Dataset]. https://www.kaggle.com/datasets/arindaemmanuel/revix-engine-knock-complete-dataset
    Available download formats: zip (268325925 bytes)
    Authors
    Arinda Emmanuel
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Engine Sounds Knocking Detection Dataset

    Overview

    A comprehensive audio dataset for detecting engine knocking in cars and trucks, featuring 1,199 high-quality 16kHz WAV samples across 4 balanced categories. This dataset combines real engine recordings with sophisticated synthetic knocking generation techniques to create a robust training set for automotive diagnostics and audio classification research.

    Dataset Composition

    • Total Samples: 1,199 audio files (16kHz, WAV format)
    • Categories: 4 balanced classes
      • car_clean (300 samples): Clean car engine sounds
      • car_knocking (299 samples): Car engines with knocking
      • truck_clean (300 samples): Clean truck engine sounds
      • truck_knocking (300 samples): Truck engines with knocking
    • Split: 80/20 train/validation (960/240 samples)
    • File Size: ~256MB (optimized for fast loading)

    Technical Methodology

    Data Acquisition

    The base audio samples were collected from multiple sources:
    • Real engine recordings from cars and trucks in various operating conditions
    • Professional automotive diagnostic equipment recordings
    • Field recordings from different engine types and load conditions
    • Curated to ensure diverse RPM ranges, engine sizes, and acoustic environments

    Synthetic Knocking Generation

    We developed a sophisticated Knocking Isolation and Overlay System to create realistic knocking samples:

    1. Knocking Pattern Extraction: Isolated authentic knocking signatures from real diagnostic recordings
    2. Spectral Analysis: Analyzed frequency characteristics of knocking (typically 4-8kHz with harmonic content)
    3. Temporal Modeling: Captured the irregular, percussive nature of engine knock events
    4. Intelligent Overlay: Applied knocking patterns to clean engine sounds using:
      • Variable intensity levels (0.1-0.8 amplitude scaling)
      • Realistic timing patterns based on engine cycle analysis
      • Frequency-domain blending to maintain acoustic authenticity
      • Preservation of original engine characteristics while adding knock signatures
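The overlay step above can be sketched with plain NumPy. The 0.1-0.8 intensity range comes from the description; the periodic-with-jitter timing envelope below is a simplified stand-in for the authors' engine-cycle analysis, not their actual method.

```python
import numpy as np

SR = 16_000  # dataset sample rate

def overlay_knock(clean, knock, intensity=0.4, period_s=0.25, jitter_s=0.03,
                  rng=None):
    """Add a short knock signature to a clean engine recording at roughly
    periodic (engine-cycle-like) instants with random jitter. `intensity`
    corresponds to the 0.1-0.8 amplitude scaling in the description."""
    rng = rng or np.random.default_rng(0)
    out = clean.copy()
    t = 0.0
    while True:
        t += period_s + rng.uniform(-jitter_s, jitter_s)
        start = int(t * SR)
        if start + len(knock) > len(out):
            break
        out[start:start + len(knock)] += intensity * knock
    return out

clean = 0.1 * np.sin(2 * np.pi * 60 * np.arange(2 * SR) / SR)   # 2 s hum
knock = np.hanning(400) * np.sin(2 * np.pi * 6000 * np.arange(400) / SR)
noisy = overlay_knock(clean, knock)
```

The 6 kHz knock burst reflects the 4-8 kHz knocking band mentioned in the spectral analysis step; a faithful reproduction would also do the frequency-domain blending described above.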

    Audio Processing Pipeline

    • Resampling: All audio standardized to 16kHz for efficiency and consistency
    • Duration Normalization: Variable-length samples (typically 2-10 seconds)
    • Quality Control: Automated validation of signal-to-noise ratio and dynamic range
    • Format Optimization: Efficient WAV encoding for fast dataset loading

    Dataset Structure

    engine_sounds_hf_dataset/
    ├── audio_files/
    │  ├── car_clean/     # 300 clean car samples
    │  ├── car_knocking/    # 299 car knocking samples 
    │  ├── truck_clean/    # 300 clean truck samples
    │  └── truck_knocking/   # 300 truck knocking samples
    ├── notebooks/       # Complete analysis pipeline
    │  ├── generate_hf_dataset_fast.ipynb  # Dataset generation + knocking isolation
    │  ├── eda_engine_sounds.ipynb      # Comprehensive EDA
    │  ├── categorize_audio.ipynb      # Classification examples
    │  └── getData.ipynb           # Data loading utilities
    ├── train/         # HuggingFace training metadata
    ├── validation/       # HuggingFace validation metadata
    └── README.md       # Detailed documentation
    

    Key Features

    • Hugging Face Integration: Pre-formatted for seamless loading with datasets library
    • Relative File Paths: Portable across different systems and environments
    • Rich Metadata: Includes vehicle type, knocking status, synthetic flags, and audio characteristics
    • Balanced Classes: Equal representation for robust model training
    • Efficient Storage: File-based approach with metadata linking for fast iteration

    Technical Specifications

    • Sample Rate: 16,000 Hz (optimized for ML workflows)
    • Bit Depth: 16-bit PCM
    • File Format: WAV (uncompressed for quality preservation)
    • Frequency Range: Full spectrum with emphasis on 100Hz-8kHz (critical for engine diagnostics)
    • Dynamic Range: Normalized but preserving relative intensity differences

    Included Analysis Tools

    The dataset comes with a complete analysis pipeline:
    • Spectral Analysis: MFCC, spectral centroid, rolloff, and bandwidth extraction
    • Time-Domain Features: RMS energy, zero-crossing rate, duration analysis
    • Visualization Tools: Spectrograms, waveform plots, and feature distribution analysis
    • Quality Assessment: Automated data validation and outlier detection
    • AST-Ready: Pre-configured for Audio Spectrogram Transformer fine-tuning
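Two of the time-domain features listed (RMS energy and zero-crossing rate) are simple enough to compute without the bundled notebooks; this standalone NumPy version is an illustration, not the dataset's actual code.

```python
import numpy as np

def rms_energy(x):
    """Root-mean-square energy of an audio signal."""
    return float(np.sqrt(np.mean(x ** 2)))

def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    signs = np.signbit(x)
    return float(np.mean(signs[1:] != signs[:-1]))

sr = 16_000
t = np.arange(sr) / sr                  # 1 second at the dataset's rate
tone = np.sin(2 * np.pi * 100 * t)      # 100 Hz test tone

print(rms_energy(tone))          # ~0.707 for a full-scale sine
print(zero_crossing_rate(tone))  # ~200 crossings / 16000 samples ≈ 0.0125
```

Clean idle recordings tend to have low ZCR and steady RMS; knocking adds bursty high-frequency energy, which pushes both features up intermittently.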

    Use Cases

    • Automotive Diagnostics: Engine knock detection systems
    • Audio Classification: Multi-class audio ML model training
    • Transfer Learning: Pre-training for related automotive audio tasks
    • Research: Acoustic analysis of mechanical systems
    • Education: Audio signal processing and ML pipeline demonstrations

    Citation & Acknowledgme...

  4. IMCOMA-example-datasets

    • figshare.com
    xml
    Updated Feb 12, 2021
    Cite
    Nowosad (2021). IMCOMA-example-datasets [Dataset]. http://doi.org/10.6084/m9.figshare.13379228.v1
    Available download formats: xml
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Nowosad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    Datasets:
    • simple_land_cover1.tif - an example land cover dataset presented in Figures 1 and 2
    • simple_landform1.tif - an example landform dataset presented in Figures 1 and 2
    • landcover_europe.tif - a land cover dataset with nine categories for Europe
    • landcover_europe.qml - a QGIS color style for the landcover_europe.tif dataset
    • landform_europe.tif - a landform dataset with 17 categories for Europe
    • landform_europe.qml - a QGIS color style for the landform_europe.tif dataset
    • map1.gpkg - a map of LTs in Europe constructed using the INCOMA-based method
    • map1.qml - a QGIS color style for the map1.gpkg dataset
    • map2.gpkg - a map of LTs in Europe constructed using the COMA method to identify and delineate pattern types in each theme separately
    • map2.qml - a QGIS color style for the map2.gpkg dataset
    • map3.gpkg - a map of LTs in Europe constructed using the map overlay method
    • map3.qml - a QGIS color style for the map3.gpkg dataset

  5. NSCAT Level 2 Ocean Wind Vector Ambiguity Removal Overlay (Hoffman, AER)

    • data.nasa.gov
    • s.cnmilf.com
    • +4more
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). NSCAT Level 2 Ocean Wind Vector Ambiguity Removal Overlay (Hoffman, AER) [Dataset]. https://data.nasa.gov/dataset/nscat-level-2-ocean-wind-vector-ambiguity-removal-overlay-hoffman-aer-01393
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This dataset contains the NASA Scatterometer (NSCAT) Level 2 ocean wind vector ambiguity overlay files for the NSCAT MGDR version 2 dataset, referenced for 25 km wind vector cells (WVC). The dataset is derived from the results of a study which used a 2-D variational analysis method (VAM) to select a wind solution from the NSCAT ambiguous winds (Hoffman et al. 2003). Hoffman et al. chose the ambiguity closest in direction to the VAM surface wind analysis. No ambiguity was chosen for poor quality wind vector cells (WVCs). ECMWF analyses were used as the background field for the VAM. Their choice of ambiguity selection is compared with that of JPL, which used a median filter initialized with NCEP analysis fields. Ambiguity selection is changed in ~5% of the dataset, often improving the depiction of meteorological features where the surface wind is strongly curved or sheared. See Hoffman et al. (2003) for more on the method and results. Additional work by Henderson et al. (2003) compares the results of median filtering (JPL) vs. the 2d-VAR method (Hoffman et al., 2003) using 51 days of NSCAT data, supplemented by the NCEP 1000 hPa wind analyses as background fields.
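The selection rule described (pick the ambiguity closest in direction to the VAM analysis wind) reduces to a minimum angular distance with wraparound. This is a schematic illustration of that rule, not the study's code; real WVC processing also handles quality flags and speeds.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two directions in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def select_ambiguity(ambiguity_dirs, vam_dir):
    """Return the index of the ambiguous wind direction (degrees) closest
    to the VAM analysis direction, per the selection rule described above."""
    return min(range(len(ambiguity_dirs)),
               key=lambda i: angular_diff(ambiguity_dirs[i], vam_dir))

# Typical scatterometer case: up to four ambiguities, often near-opposite.
dirs = [10.0, 190.0, 100.0, 280.0]
print(select_ambiguity(dirs, 350.0))  # index 0: 10° is 20° away from 350°
```

The wraparound in `angular_diff` matters: without it, 10° and 350° would look 340° apart and the opposite ambiguity could be chosen.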

  6. Logging history overlay of most recent harvesting activities | gimi9.com

    • gimi9.com
    Updated Jul 1, 2025
    Cite
    (2025). Logging history overlay of most recent harvesting activities | gimi9.com [Dataset]. https://gimi9.com/dataset/au_logging-history-overlay-of-most-recent-harvesting-activities/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This layer has been derived from LOG_SEASON and represents the spatial extent of the most recent timber harvesting activity recorded for any given area in State forest. Where harvest events overlap previous harvesting events the most recent harvesting event is shown. The layer stores details of the last time an area was known to be harvested, the species harvested and the harvesting method used. The dataset has been updated with the 2022-23 information. Complete to 30 June 2023

  7. Public Land Management Overlay - Reference Areas

    • data.wu.ac.at
    wms
    Updated Jul 21, 2018
    Cite
    Department of Environment, Land, Water & Planning (2018). Public Land Management Overlay - Reference Areas [Dataset]. https://data.wu.ac.at/schema/www_data_vic_gov_au/ZmRjMmE1MTgtMzAxYi00OTdkLTg1ZDYtNDAwOTJmNzAyNjFk
    Available download formats: wms
    Dataset provided by
    Department of Environment, Land, Water & Planning
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This dataset was created in conjunction with PLM25, to represent the management overlays. The attributes are based on the PLM25 structure. The overlays have been mapped at 1:25 000, using VicMap topographic data to create more accurate and identifiable boundaries.

    PLM25_OVERLAYS is located under the CROWNLAND schema. It has been created in conjunction with PLM25 to ensure the overlays match the PLM25 land management categories.

    PLEASE NOTE: This dataset now replaces the PLM100 overlays.

    PLM25_OVERLAYS have been created by loading Reference areas, wilderness zones, heritage rivers, remote and natural areas and natural catchment areas into one dataset. They are also available as separate datasets.

    This dataset is a representation of the certified plans - the gazettal and certified plans are the official boundaries.

    Currently the creation process is not automated or synchronised with PLM25 updates. For more information please contact the Information Services Division.

  8. HAM10000 Grad-CAM Visualizations (With Overlay)

    • kaggle.com
    zip
    Updated Oct 17, 2025
    Cite
    Rayyan Huda (2025). HAM10000 Grad-CAM Visualizations (With Overlay) [Dataset]. https://www.kaggle.com/datasets/rayyanshuda/ham10000-grad-cam-visualizations-with-overlay
    Available download formats: zip (4025003526 bytes)
    Authors
    Rayyan Huda
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/

    Description

    This dataset contains Grad-CAM heatmaps and overlay visualizations generated from a ResNet50-based skin lesion classifier trained on the HAM10000 dataset. Each image highlights the regions that most strongly influenced the model’s prediction, providing interpretable insight into how deep learning models distinguish between benign and malignant skin lesions.

    About the Dataset

    • Source data: HAM10000 – Human Against Machine with 10000 training images
    • Model used: ResNet50 backbone with a custom classification head
    • Loss function: Custom Focal Loss to handle class imbalance
    • Explainability method: Gradient-weighted Class Activation Mapping (Grad-CAM)
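The overlay images combine the original photo with a colorized heatmap; the blend itself might look like this in NumPy. The two-color ramp and the 0.4 alpha are illustrative assumptions; the actual visualizations may use a different colormap (e.g., JET) and weighting.

```python
import numpy as np

def colorize(heatmap):
    """Map a [0, 1] heatmap to RGB: blue for low activation, red for high.
    A simple two-color ramp standing in for a real colormap like JET."""
    h = np.clip(heatmap, 0.0, 1.0)
    rgb = np.zeros(h.shape + (3,))
    rgb[..., 0] = h          # red channel grows with activation
    rgb[..., 2] = 1.0 - h    # blue channel fades with activation
    return rgb

def overlay(image, heatmap, alpha=0.4):
    """Blend a Grad-CAM heatmap onto an RGB image with values in [0, 1]."""
    cam = heatmap - heatmap.min()
    cam = cam / (cam.max() + 1e-8)       # normalize to [0, 1]
    return (1 - alpha) * image + alpha * colorize(cam)

img = np.full((224, 224, 3), 0.5)        # placeholder for a lesion image
cam = np.random.default_rng(0).random((224, 224))
out = overlay(img, cam)
```

In a full pipeline, `cam` would come from Grad-CAM itself: the gradients of the predicted class with respect to the last convolutional feature maps, pooled into channel weights and combined into a single spatial map.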

    Contents:

    • Original HAM10000 images (non-commercial use)
    • Corresponding Grad-CAM heatmaps
    • Overlay visualizations combining original images with heatmaps
    • Metadata file with image IDs, predictions, and true labels

    License & Attribution

    This dataset is derived from the HAM10000 dataset (CC BY-NC 4.0). The Grad-CAM visualizations and overlays were generated by Rayyan Huda and are released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). Non-commercial research and educational use only.

  9. Subsea Inpainting Dataset

    • kaggle.com
    zip
    Updated Mar 14, 2021
    Cite
    Bruno Santiago (2021). Subsea Inpainting Dataset [Dataset]. https://www.kaggle.com/brunomsantiago/subsea-inpainting-dataset
    Available download formats: zip (341876265 bytes)
    Authors
    Bruno Santiago
    Description

    Context

    The Oil and Gas industry generates thousands of hours of subsea inspection videos per day. Oil and Gas operators also have millions of hours of archived video.

    These underwater videos are usually acquired by Remote Operated Vehicles (ROVs) and are crucial to the integrity management strategy of subsea assets, like pipelines and Wet Christmas Trees (WCTs), allowing Oil and Gas companies to operate their offshore fields safely.

    Most of these videos have a "burned overlay" showing real-time metadata like ROV coordinates, water depth, and heading. On one hand, the burned overlay guarantees the metadata stays close to the image data even after manipulations like making short video clips or extracting frames to individual images. On the other hand, the overlay may obstruct important image data from users or confuse image processing algorithms.

    This dataset is part of a work whose main goal is to test whether current state-of-the-art inpainting techniques are ready to remove overlays from subsea inspection videos. For more details check https://github.com/brunomsantiago/subsea_inpainting

    Content

    The dataset started with watching and downloading YouTube videos of subsea operations, mostly subsea inspections. I then selected the five videos most similar to private footage I have seen in my work experience. These five videos are available in the folder /original_videos

    Each video has its own subfolder with two files: the video file itself and a text file with some data about the original YouTube video (link, title, resolution). For example: /original_videos/01/01.mp4 and /original_videos/01/01_video_info.txt

    From each video I extracted small clips of between 20 and 80 frames. There are 13 of these clips, and each one has its own subfolder. The subfolder name is the number of the original video plus a letter (a, b, c...) giving the sequence of clips within that video. For example: /prepared_images/01a /prepared_images/01b /prepared_images/01c /prepared_images/02a (...)

    Each clip subfolder has an animated gif used for clip visualization and 3 sub-subfolders with png files:
    • frames --> All the frames that make up the original clip. The filename is the name of the original video followed by the frame position in the video.
    • mask_template --> A single png file with a mask covering the clip overlay. The image is black and white, white meaning "inpaint here".
    • masks --> The same png from mask_template copied multiple times, matching the names of the frames. Some video inpainting tools require a mask for each frame, because the masks can move.
    For example: /prepared_images/01a/frames/01/01_0000150.png /prepared_images/01a/frames/01/01_0000151.png (...) /prepared_images/01a/frames/01/01_00001169.png /prepared_images/01a/mask_template/01_mask_template.png /prepared_images/01a/masks/01_0000150.png /prepared_images/01a/masks/01_0000151.png (...) /prepared_images/01a/masks/01_0000169.png /prepared_images/01a/01a_masked_frames.gif
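Since the masks folder is just the template duplicated under each frame's name, a small script can reproduce that layout. This is a hypothetical re-creation following the folder convention described, not the author's actual script.

```python
import shutil
from pathlib import Path

def expand_mask_template(clip_dir):
    """Copy a clip's single mask_template png once per frame, so video
    inpainting tools that expect one mask per frame can consume the clip.
    Returns the number of per-frame masks written."""
    clip = Path(clip_dir)
    template = next((clip / "mask_template").glob("*.png"))
    masks = clip / "masks"
    masks.mkdir(exist_ok=True)
    for frame in sorted((clip / "frames").rglob("*.png")):
        shutil.copy(template, masks / frame.name)
    return len(list(masks.glob("*.png")))
```

For tools that support moving masks, each copy could instead be edited per frame; here the overlay is static, so identical copies suffice.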

  10. St Catharines schools (April 2017)

    • edu.hub.arcgis.com
    Updated Apr 9, 2017
    Cite
    Education and Research (2017). St Catharines schools (April 2017) [Dataset]. https://edu.hub.arcgis.com/datasets/83da530456234ef5a2c341b30baddace
    Dataset authored and provided by
    Education and Research
    Description

    NOTE: For an updated version of this dataset, please see https://arcg.is/05ybSm.

    Derived from the CSV of school locations in Niagara Region available at https://niagaraopendata.ca/dataset/schools (downloaded April 8th, 2017). Using the Overlay Layers tool in ArcGIS Online ("Intersect" method), this layer was clipped to the St. Catharines urban boundary available at https://niagaraopendata.ca/dataset/st-catharines-urban-boundary (downloaded April 8th, 2017) to create a layer of schools in St. Catharines. Contains information licensed under the Open Government Licence - Niagara Region.
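The "Intersect" overlay used here keeps only the school points that fall inside the urban boundary polygon. In spirit it is a point-in-polygon filter, sketched below with a plain ray-casting test; in practice one would use the ArcGIS tool itself (or a GIS library) rather than this toy version, and the coordinates below are made up.

```python
def point_in_polygon(pt, polygon):
    """Ray-casting point-in-polygon test. `polygon` is a list of (x, y)
    vertices in order; good enough for a schematic overlay illustration."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by a horizontal ray extending right from pt.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Toy urban boundary and school points:
boundary = [(0, 0), (4, 0), (4, 3), (0, 3)]
schools = {"A": (1, 1), "B": (5, 1), "C": (2, 2.5)}
clipped = {k: v for k, v in schools.items() if point_in_polygon(v, boundary)}
print(sorted(clipped))  # ['A', 'C']
```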

  11. Public Land Management Overlay - Heritage River

    • data.wu.ac.at
    wms
    Updated Jul 21, 2018
    Cite
    Department of Environment, Land, Water & Planning (2018). Public Land Management Overlay - Heritage River [Dataset]. https://data.wu.ac.at/schema/www_data_vic_gov_au/MDQ5NTFkNzctOTkzMy00MzJmLThmZTgtYzllZThjOTFkOTRh
    Available download formats: wms
    Dataset provided by
    Department of Environment, Land, Water & Planning
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This dataset was created in conjunction with PLM25, to represent the management overlays. The attributes are based on the PLM25 structure. The overlays have been mapped at 1:25 000, using VicMap topographic data to create more accurate and identifiable boundaries.

    PLM25_OVERLAYS is located under the CROWNLAND schema. It has been created in conjunction with PLM25 to ensure the overlays match the PLM25 land management categories.

    PLEASE NOTE: This dataset now replaces the PLM100 overlays.

    PLM25_OVERLAYS have been created by loading Reference areas, wilderness zones, heritage rivers, remote and natural areas and natural catchment areas into one dataset. They are also available as separate datasets.

    This dataset is a representation of the certified plans - the gazettal and certified plans are the official boundaries.

    Currently the creation process is not automated or synchronised with PLM25 updates. For more information please contact the Information Services Division.

  12. SARBake Overlays for the MSTAR Dataset

    • data.mendeley.com
    Updated Aug 8, 2017
    Cite
    David Malmgren-Hansen (2017). SARBake Overlays for the MSTAR Dataset [Dataset]. http://doi.org/10.17632/jxhsg8tj7g.3
    Authors
    David Malmgren-Hansen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    SARBake is an algorithm described in my first article, "Convolutional Neural Networks for SAR Image Segmentation", co-authored by Morten Nobel-Jørgensen. The algorithm converts 3D CAD models of objects into a label mask given the specific SAR viewing angles. The mask defines, for every pixel, whether the radar wave illuminated the object, the background, or was in shadow. A good segmentation of a SAR image is very useful as it simplifies the image information. For example, object height above ground can be estimated from the length of an object's shadow. An annotated image also enables the use of supervised machine learning techniques to create a segmentation mask automatically.

    If used in scientific publications we kindly ask for a citation of our article:

    @article{cnnforsegmentation,
      author  = {David Malmgren-Hansen and Morten Nobel-Jørgensen},
      title   = {Convolutional Neural Networks for SAR Image Segmentation},
      journal = {IEEE International Symposium on Signal Processing and Information Technology},
      year    = 2015
    }

  13. PLM25 overlays representing Reference Areas | gimi9.com

    • gimi9.com
    Updated Jul 1, 2025
    Cite
    (2025). PLM25 overlays representing Reference Areas | gimi9.com [Dataset]. https://gimi9.com/dataset/au_plm25-overlays-representing-reference-areas/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This dataset was created in conjunction with PLM25, to represent the management overlays. The attributes are based on the PLM25 structure. The overlays have been mapped at 1:25 000, using VicMap topographic data to create more accurate and identifiable boundaries.

    PLM25_OVERLAYS is located under the CROWNLAND schema. It has been created in conjunction with PLM25 to ensure the overlays match the PLM25 land management categories.

    PLEASE NOTE: This dataset now replaces the PLM100 overlays.

    PLM25_OVERLAYS_RA is a view of PLM25_OVERLAYS.

    This dataset is a representation of the certified plans - the gazettal and certified plans are the official boundaries.

    Currently the creation process is not automated or synchronised with PLM25 updates.

  14. Logging history overlay of most recent harvesting activities

    • researchdata.edu.au
    Updated Nov 17, 2023
    Cite
    Department of Energy, Environment and Climate Action (2023). Logging history overlay of most recent harvesting activities [Dataset]. https://researchdata.edu.au/logging-history-overlay-harvesting-activities/false
    Dataset provided by
    data.vic.gov.au
    Authors
    Department of Energy, Environment and Climate Action
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This layer has been derived from LOG_SEASON and represents the spatial extent of the most recent timber harvesting activity recorded for any given area in State forest. Where harvest events overlap previous harvesting events the most recent harvesting event is shown.

    The layer stores details of the last time an area was known to be harvested, the species harvested and the harvesting method used.

    The dataset has been updated with the 2022-23 information. Complete to 30 June 2023

  15. Glasses Segmentation Synthetic Dataset

    • kaggle.com
    zip
    Updated Sep 20, 2023
    Cite
    Mantas (2023). Glasses Segmentation Synthetic Dataset [Dataset]. https://www.kaggle.com/mantasu/glasses-segmentation-synthetic-dataset
    Available download formats: zip (7910307685 bytes)
    Authors
    Mantas
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/

    Description

    About

    The dataset contains synthetically generated images of people wearing glasses (regular eyeglasses + sunglasses) and glasses masks (full + frames + shadows). It can primarily be used for eyeglasses/sunglasses classification and segmentation.

    This dataset is an augmented version of the synthetic dataset introduced in Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic Data and can be accessed here. The augmentation adds overlays on top of eyeglass frames to create images of people wearing sunglasses and corresponding masks.

    Structure

    There are 73 people identities in total, each with 400 different expressions or lighting effects, thus making a total of 29,200 samples. Each sample is a group of 8 images of the form sample-name-[suffix].png, where [suffix] can be one of the following:
    • all - regular eyeglasses (i.e., frames) and their shadows
    • sunglasses - occluded glasses (i.e., sunglasses) and their frame shadows
    • glass - regular eyeglasses but no shadows
    • shadows - frame shadows but no eyeglasses
    • face - plain face: no glasses and no shadows
    • seg - mask for regular eyeglasses
    • sgseg - mask for sunglasses
    • shaseg - mask for frame shadows

    10 identities were used for test data and 10 identities for validation, which corresponds to roughly 14% each, leaving around 72% of the data for training (which is 21,200 samples).

    Collection

    The data was generated in the following process:
    1. The original dataset was downloaded from the link in the official Github repository
    2. Glasses Detector was used to create full glasses segmentation masks, which were then used to generate glasses of various colors and transparencies (mainly dark)
    3. The generated glasses were overlaid on top of the original images with frames to create new images with sunglasses and corresponding masks
    4. The 73 identities were shuffled and split into 3 parts (train, val, test), which were used to group all 400 variations of each identity.

    You can see the full process of glass overlay generation and data splitting in this gist.

    Note: a type of noise (e.g., random, single spot) was added to roughly 15% of the images with sunglasses. Also, some of the generated glasses do not fill the entire frame, however, masks capture that.

    Licence

    This dataset is marked under CC BY-NC 4.0, meaning you can share and modify the data for non-commercial reuse as long as you provide a copyright notice.

    Citation

    Please use the original authors, i.e., the following citation:

    @misc{glasses-segmentation-synthetic,
  author = {Junfeng Lyu and Zhibo Wang and Feng Xu},
      title = {Glasses Segmentation Synthetic Dataset},
      year = {2023},
      publisher = {Kaggle},
      journal = {Kaggle datasets},
      howpublished = {\url{https://www.kaggle.com/datasets/mantasu/glasses-segmentation-synthetic-dataset}}
    }
    
  16. Mask_Dataset

    • kaggle.com
    zip
    Updated May 14, 2025
    Cite
    Ranjan kumar pradhan (2025). Mask_Dataset [Dataset]. https://www.kaggle.com/datasets/rpjinu/mask-dataset/suggestions
    Explore at:
    zip(82669340 bytes)Available download formats
    Dataset updated
    May 14, 2025
    Authors
    Ranjan kumar pradhan
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    🧠 Face Wire Mask Recognition Dataset (256x256) 📌 Project Description This dataset contains a large collection of 256x256 pixel face images of real or synthetic people, annotated with wire mask overlays. The wire masks are geometric outlines mapped to the facial features — such as the eyes, nose, lips, and jawline — resembling landmark-based face detection grids.

    🎯 Purpose The goal of this dataset is to help train machine learning models to recognize whether a face has a wire mask overlay or not. This can be used to build:

    Binary classification models (Masked vs. Unmasked)

    Wireframe face detection systems

    Preprocessing pipelines for facial analysis or AR applications
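
    The binary classification task (masked vs. unmasked) can be sketched with a toy logistic-regression baseline. Everything here is an illustrative assumption: the synthetic features stand in for flattened 256x256 face crops, and the intensity shift mimics the bright edge pixels a wire-mask overlay adds; a real model would load the dataset images instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened face features; wire-mask overlays tend to
# add bright geometric lines, modeled here as a shift in mean intensity.
X_unmasked = rng.normal(0.4, 0.1, size=(200, 64))
X_masked = rng.normal(0.6, 0.1, size=(200, 64))
X = np.vstack([X_unmasked, X_masked])
y = np.array([0] * 200 + [1] * 200)

# Minimal logistic regression trained with full-batch gradient descent
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # sigmoid probabilities
    grad = p - y                        # gradient of the log loss w.r.t. logits
    w -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean()

pred = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5
acc = (pred == y).mean()
print(f"train accuracy: {acc:.2f}")
```

    A convolutional model would be the natural next step for real images, but even this baseline shows the shape of the task: one scalar decision per face crop.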

  17. Skin Cancer Colorized ISIC Dataset

    • kaggle.com
    zip
    Updated May 23, 2025
    + more versions
    Cite
    Medi Hunter - 4004 (2025). Skin Cancer Colorized ISIC Dataset [Dataset]. https://www.kaggle.com/datasets/shuvokumarbasakbd/skin-cancer-colorized-isic-dataset
    Explore at:
    zip(3413792353 bytes)Available download formats
    Dataset updated
    May 23, 2025
    Authors
    Medi Hunter - 4004
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Source raw data and more info: https://www.kaggle.com/datasets/nodoubttome/skin-cancer9-classesisic (Skin Cancer ISIC: the skin cancer data, containing 9 classes of skin cancer).

    Colorized Data Processing Techniques for Medical Imaging

    Medical images like CT scans and X-rays are typically grayscale, making subtle anatomical or pathological differences harder to distinguish. The following image processing and enhancement techniques are used to colorize and improve visual interpretation for diagnostics, training, or AI preprocessing.

    🔷 1. 3D_Rendering Renders medical image volumes into three-dimensional visualizations. Though often grayscale, color can be applied to different tissue types or densities to enhance spatial understanding. Useful in surgical planning or tumor visualization.

    🔷 2. 3D_Volume_Rendering An advanced visualization technique that projects 3D image volumes with transparency and color blending, simulating how light passes through tissue. Color helps distinguish internal structures like organs, vessels, or tumors.

    🔷 3. Adaptive Histogram Equalization (AHE) Enhances contrast locally within the image, especially in low-contrast regions. When colorized, different intensities are mapped to distinct hues, improving visibility of fine-grained details like soft tissues or lesions.

    🔷 4. Alpha Blending A layering technique that combines multiple images (e.g., CT + annotation masks) with transparency. Colors represent different modalities or regions of interest, providing composite visual cues for diagnosis.

    🔷 5. Basic Color Map Applies a standard color palette (like Jet or Viridis) to grayscale data. Different intensities are mapped to different colors, enhancing the visual discrimination of anatomical or pathological regions in the image.

    🔷 6. Contrast Stretching Expands the grayscale range to improve brightness and contrast. When combined with color mapping, tissues with similar intensities become visually distinct, aiding in tasks like bone vs. soft tissue separation.

    🔷 7. Edge Detection Extracts and overlays object boundaries (e.g., organ or lesion outlines) on the original scan. Edge maps are typically colorized (e.g., green or red) to highlight anatomical structures or abnormalities clearly.

    🔷 8. Gamma Correction Adjusts image brightness non-linearly. Color can be used to highlight underexposed or overexposed regions, often revealing soft tissue structures otherwise hidden in raw grayscale CT/X-ray images.

    🔷 9. Gaussian Blur Smooths image noise and details. When visualized with color overlays (e.g., before vs. after), it helps assess denoising effectiveness. It is also used in segmentation preprocessing to reduce edge artifacts.

    🔷 10. Heatmap Visualization Encodes intensity or prediction confidence into a heatmap overlay (e.g., red for high activity). Common in AI-assisted diagnosis to localize tumors, fractures, or infections, layered over the original grayscale image.

    🔷 11. Interactive Segmentation A semi-automated method to extract regions of interest with user input. Segmented areas are color-coded (e.g., tumor = red, background = blue) for immediate visual confirmation and further analysis.

    🔷 12. LUT (Lookup Table) Color Map Maps grayscale values to custom color palettes using a lookup table. This enhances contrast and emphasizes certain intensity ranges (e.g., blood vessels vs. bone), improving interpretability for radiologists.

    🔷 13. Random Color Palette Applies random but consistent colors to segmented regions or labels. Common in datasets with multiple classes (e.g., liver, spleen, kidneys), it helps in visual verification of label diversity.

    🧬 Conclusion These colorization methods do not change the underlying medical data, but they significantly enhance its interpretability for radiologists, researchers, and machine learning algorithms. Color adds clarity, contrast, and context to features that may be missed in grayscale, making it a powerful tool in modern medical imaging workflows.
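
    Techniques 5 and 6 (basic color map and contrast stretching) can be sketched in a few lines of NumPy. This is a hedged illustration, not the dataset's actual pipeline: the hand-rolled blue-to-green-to-red ramp stands in for standard palettes like Jet or Viridis, and the 4x4 array stands in for a low-contrast grayscale scan.

```python
import numpy as np

def contrast_stretch(img):
    """Technique 6: linearly rescale intensities to the full [0, 1] range."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def apply_colormap(img):
    """Technique 5: map a grayscale image in [0, 1] to RGB.

    Low values map to blue, mid values to green, high values to red,
    a simple stand-in for palettes like Jet.
    """
    r = np.clip(2 * img - 1, 0, 1)
    g = 1 - np.abs(2 * img - 1)
    b = np.clip(1 - 2 * img, 0, 1)
    return np.stack([r, g, b], axis=-1)

gray = np.linspace(0.3, 0.6, 16).reshape(4, 4)  # low-contrast toy "scan"
rgb = apply_colormap(contrast_stretch(gray))
print(rgb.shape)  # -> (4, 4, 3)
```

    Stretching first means tissues with similar raw intensities land on visibly different hues, which is exactly the separation effect the description above ascribes to these techniques.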

    RRA_Think Differently, Create history’s next line.

    Hello Data Hunters! Hope you're doing well. Here, you'll find medical datasets collected from various platforms — raw data that I’ve colorized and enhanced, making them ready for ML. If you use any of these datasets, please be sure to cite both my work and the original data source, and don't forget to check out the raw data on the original platforms. https://www.kaggle.com/shuvokumarbasak4004 (More Dataset) https://www.kaggle.com/


  18. d

    Hazardous Pipelines Overlay

    • catalog.data.gov
    • data.austintexas.gov
    Updated Nov 25, 2025
    Cite
    data.austintexas.gov (2025). Hazardous Pipelines Overlay [Dataset]. https://catalog.data.gov/dataset/hazardous-pipelines-overlay
    Explore at:
    Dataset updated
    Nov 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    City of Austin Open Data Terms of Use: https://data.austintexas.gov/stories/s/ranj-cccq

    The restricted pipeline area includes an area within 25 feet of a hazardous pipeline and an area within a hazardous pipeline easement.

    Austin Development Services Data Disclaimer: The data provided are for informational use only and may differ from official department data. Austin Development Services' database is continuously updated, so reports run at different times may produce different results. Care should be taken when comparing against other reports, as different data collection methods and different data sources may have been used. Austin Development Services does not assume any liability for any decision made or action taken or not taken by the recipient in reliance upon any information or data provided.

  19. Brain Tumor (MRI) Detection Colorized

    • kaggle.com
    zip
    Updated Oct 12, 2025
    Cite
    Medi Hunter - 4004 (2025). Brain Tumor (MRI) Detection Colorized [Dataset]. https://www.kaggle.com/datasets/shuvokumarbasakbd/brain-tumor-mri-detection-colorized
    Explore at:
    zip(88367577 bytes)Available download formats
    Dataset updated
    Oct 12, 2025
    Authors
    Medi Hunter - 4004
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    #Raw Data, Source, More Information :: Brain Tumor (MRI) Detection https://www.kaggle.com/datasets/arwabasal/brain-tumor-mri-detection

    Colorized Data Processing Techniques for Medical Imaging

    Medical images like CT scans and X-rays are typically grayscale, making subtle anatomical or pathological differences harder to distinguish. The following image processing and enhancement techniques are used to colorize and improve visual interpretation for diagnostics, training, or AI preprocessing.

    🔷 1. 3D_Rendering Renders medical image volumes into three-dimensional visualizations. Though often grayscale, color can be applied to different tissue types or densities to enhance spatial understanding. Useful in surgical planning or tumor visualization.

    🔷 2. 3D_Volume_Rendering An advanced visualization technique that projects 3D image volumes with transparency and color blending, simulating how light passes through tissue. Color helps distinguish internal structures like organs, vessels, or tumors.

    🔷 3. Adaptive Histogram Equalization (AHE) Enhances contrast locally within the image, especially in low-contrast regions. When colorized, different intensities are mapped to distinct hues, improving visibility of fine-grained details like soft tissues or lesions.

    🔷 4. Alpha Blending A layering technique that combines multiple images (e.g., CT + annotation masks) with transparency. Colors represent different modalities or regions of interest, providing composite visual cues for diagnosis.

    🔷 5. Basic Color Map Applies a standard color palette (like Jet or Viridis) to grayscale data. Different intensities are mapped to different colors, enhancing the visual discrimination of anatomical or pathological regions in the image.

    🔷 6. Contrast Stretching Expands the grayscale range to improve brightness and contrast. When combined with color mapping, tissues with similar intensities become visually distinct, aiding in tasks like bone vs. soft tissue separation.

    🔷 7. Edge Detection Extracts and overlays object boundaries (e.g., organ or lesion outlines) on the original scan. Edge maps are typically colorized (e.g., green or red) to highlight anatomical structures or abnormalities clearly.

    🔷 8. Gamma Correction Adjusts image brightness non-linearly. Color can be used to highlight underexposed or overexposed regions, often revealing soft tissue structures otherwise hidden in raw grayscale CT/X-ray images.

    🔷 9. Gaussian Blur Smooths image noise and details. When visualized with color overlays (e.g., before vs. after), it helps assess denoising effectiveness. It is also used in segmentation preprocessing to reduce edge artifacts.

    🔷 10. Heatmap Visualization Encodes intensity or prediction confidence into a heatmap overlay (e.g., red for high activity). Common in AI-assisted diagnosis to localize tumors, fractures, or infections, layered over the original grayscale image.

    🔷 11. Interactive Segmentation A semi-automated method to extract regions of interest with user input. Segmented areas are color-coded (e.g., tumor = red, background = blue) for immediate visual confirmation and further analysis.

    🔷 12. LUT (Lookup Table) Color Map Maps grayscale values to custom color palettes using a lookup table. This enhances contrast and emphasizes certain intensity ranges (e.g., blood vessels vs. bone), improving interpretability for radiologists.

    🔷 13. Random Color Palette Applies random but consistent colors to segmented regions or labels. Common in datasets with multiple classes (e.g., liver, spleen, kidneys), it helps in visual verification of label diversity.

    🧬 Conclusion These colorization methods do not change the underlying medical data, but they significantly enhance its interpretability for radiologists, researchers, and machine learning algorithms. Color adds clarity, contrast, and context to features that may be missed in grayscale, making it a powerful tool in modern medical imaging workflows.
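
    Techniques 4 and 8 (alpha blending and gamma correction) can likewise be sketched in NumPy. This is an illustrative assumption, not the dataset's actual pipeline: the 4x4 array stands in for a dark grayscale MRI slice, and the red square stands in for a lesion annotation mask.

```python
import numpy as np

def gamma_correct(img, gamma=0.5):
    """Technique 8: non-linear brightness adjustment (gamma < 1 brightens)."""
    return np.power(np.clip(img, 0, 1), gamma)

def alpha_blend(base_rgb, overlay_rgb, alpha=0.4):
    """Technique 4: blend a colored overlay onto a base image with transparency."""
    return (1 - alpha) * base_rgb + alpha * overlay_rgb

scan = np.full((4, 4), 0.25)            # dark toy grayscale slice
bright = gamma_correct(scan)            # 0.25 ** 0.5 = 0.5, underexposed detail lifted
base = np.stack([bright] * 3, axis=-1)  # grayscale -> RGB

mask = np.zeros((4, 4, 3))
mask[1:3, 1:3, 0] = 1.0                 # red "lesion" overlay in the center

blended = alpha_blend(base, mask)
print(blended[1, 1])  # blended pixel inside the overlay region
```

    Because the overlay is blended rather than pasted, the underlying anatomy stays visible through the colored region, matching the "composite visual cues" described above.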

    RRA_Think Differently, Create history’s next line.

    Hello Data Hunters! Hope you're doing well. Here, you'll find medical datasets collected from various platforms — raw data that I’ve colorized and enhanced, making them ready for ML. If you use any of these datasets, please be sure to cite both my work and the original data source, and don't forget to check out the raw data on the original platforms. https://www.kaggle.com/shuvokumarbasak4004 (More Dataset) https://www.kaggle.com/shuvokumarbasak2030

    Source:: Brain Tumor (MRI) Detection https://www.kaggle.com/datasets/arwabasal/brain-tumor-mri-detection

  20. n

    California - NWS Watches and Warnings - Dataset - CKAN

    • nationaldataplatform.org
    Updated Feb 28, 2024
    + more versions
    Cite
    (2024). California - NWS Watches and Warnings - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/california-nws-watches-and-warnings
    Explore at:
    Dataset updated
    Feb 28, 2024
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    California
    Description

    This feature service depicts the National Weather Service (NWS) watches, warnings, and advisories within the United States. Watches and warnings are classified into 43 categories.

    A warning is issued when a hazardous weather or hydrologic event is occurring, imminent, or likely. A warning means weather conditions pose a threat to life or property. People in the path of the storm need to take protective action.

    A watch is used when the risk of a hazardous weather or hydrologic event has increased significantly, but its occurrence, location, or timing is still uncertain. It is intended to provide enough lead time so those who need to set their plans in motion can do so. A watch means that hazardous weather is possible. People should have a plan of action in case a storm threatens, and they should listen for later information and possible warnings, especially when planning travel or outdoor activities.

    An advisory is issued when a hazardous weather or hydrologic event is occurring, imminent, or likely. Advisories are for less serious conditions than warnings that cause significant inconvenience and, if caution is not exercised, could lead to situations that may threaten life or property.

    Source: National Weather Service RSS-CAP Warnings and Advisories (Public Alerts); National Weather Service Boundary Overlays (AWIPS Shapefile Database)

    Sample Data: See the Sample Layer Item for sample data during weather inactivity!

    Update Frequency: The service is updated every 5 minutes using the Aggregated Live Feeds methodology. The overlay data is checked and updated daily from the official AWIPS Shapefile Database.

    Area Covered: United States and Territories

    What can you do with this layer? Customize the display of each attribute by using the Change Style option for any layer. Query the layer to display only specific types of weather watches and warnings. Add to a map with other weather data layers to provide insight on hazardous weather events. Use ArcGIS Online analysis tools, such as Enrich Data, to determine the potential impact of weather events on populations.

    This map is provided for informational purposes and is not monitored 24/7 for accuracy and currency. Additional information on Watches and Warnings.
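
    The "query the layer" workflow above can be sketched against GeoJSON-style output. This is a hypothetical illustration: the feature collection and the `prod_type` property name are assumptions shaped like typical NWS alert layers, not the service's documented schema.

```python
# Hypothetical FeatureCollection shaped like an NWS alerts layer
alerts = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"prod_type": "Tornado Warning"}},
        {"type": "Feature", "properties": {"prod_type": "Flood Watch"}},
        {"type": "Feature", "properties": {"prod_type": "Heat Advisory"}},
    ],
}

def filter_by_type(fc, keyword):
    """Keep only features whose product type contains the keyword."""
    return [
        f for f in fc["features"]
        if keyword.lower() in f["properties"]["prod_type"].lower()
    ]

warnings = filter_by_type(alerts, "warning")
print([f["properties"]["prod_type"] for f in warnings])  # -> ['Tornado Warning']
```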
