https://creativecommons.org/publicdomain/zero/1.0/
Data from this dataset can be downloaded/accessed through this dataset page and Kaggle's API.
Severe weather is defined as destructive storms or weather conditions. The term is usually applied to local, intense, often damaging storms such as thunderstorms, hailstorms, and tornadoes, but it can also describe more widespread events such as tropical systems, blizzards, nor'easters, and derechos.
The Severe Weather Data Inventory (SWDI) is an integrated database of severe weather records for the United States. The records in SWDI come from a variety of sources in the NCDC archive. SWDI provides the ability to search through all of these data to find records covering a particular time period and geographic region, and to download the results of your search in a variety of formats. The formats currently supported are Shapefile (for GIS), KMZ (for Google Earth), CSV (comma-separated), and XML.
The current data layers in SWDI are:
- Filtered Storm Cells (Max Reflectivity >= 45 dBZ) from NEXRAD (Level-III Storm Structure Product)
- All Storm Cells from NEXRAD (Level-III Storm Structure Product)
- Filtered Hail Signatures (Max Size > 0 and Probability = 100%) from NEXRAD (Level-III Hail Product)
- All Hail Signatures from NEXRAD (Level-III Hail Product)
- Mesocyclone Signatures from NEXRAD (Level-III Meso Product)
- Digital Mesocyclone Detection Algorithm from NEXRAD (Level-III MDA Product)
- Tornado Signatures from NEXRAD (Level-III TVS Product)
- Preliminary Local Storm Reports from the NOAA National Weather Service
- Lightning Strikes from Vaisala NLDN
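The layers above can be queried programmatically through NCEI's SWDI web services. A minimal sketch of building a request URL follows; the `swdiws` endpoint pattern (format, dataset id, date range, optional `bbox` filter) is an assumption based on NCEI's public REST interface, so verify it against the current documentation before use:

```python
def swdi_url(dataset, start, end, fmt="csv", bbox=None):
    """Build an SWDI web-services URL for one data layer.

    dataset: layer id such as 'nx3tvs' (tornado signatures) or 'nldn'
    start/end: dates as YYYYMMDD strings
    bbox: optional (min_lon, min_lat, max_lon, max_lat) geographic filter
    """
    url = f"https://www.ncdc.noaa.gov/swdiws/{fmt}/{dataset}/{start}:{end}"
    if bbox:
        url += "?bbox={},{},{},{}".format(*bbox)
    return url

# Tornado vortex signatures over northern Alabama during the 2011 outbreak
print(swdi_url("nx3tvs", "20110427", "20110428", bbox=(-88.0, 32.0, -86.0, 34.0)))
```

The same pattern works for the other supported output formats (shp, kmz, xml) by changing the `fmt` argument.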
Disclaimer:
SWDI provides a uniform way to access data from a variety of sources, but it does not provide any additional quality control beyond the processing which took place when the data were archived. The data sources in SWDI will not provide complete severe weather coverage of a geographic region or time period, due to a number of factors (e.g., reports for a location or time period not provided to NOAA). The absence of SWDI data for a particular location and time should not be interpreted as an indication that no severe weather occurred at that time and location. Furthermore, much of the data in SWDI is automatically derived from radar data and represents probable conditions for an event, rather than a confirmed occurrence.
Dataset Source: NOAA. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Cover photo by NASA on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
The National Lightning Detection Network (NLDN) consists of over 100 remote, ground-based sensing stations located across the United States that instantaneously detect the electromagnetic signals given off when lightning strikes the earth's surface. These remote sensors send the raw data via a satellite-based communications network to the Network Control Center (NCC) operated by Vaisala Inc. in Tucson, Arizona. Within seconds of a lightning strike, the NCC's central analyzers process information on the location, time, and polarity of the strike, which is then communicated to users across the country.
More information:
http://thunderstorm.vaisala.com
Tornado Tracks

This feature layer, utilizing data from the National Oceanic and Atmospheric Administration (NOAA), displays tornadoes in the United States, Puerto Rico, and the U.S. Virgin Islands between 1950 and 2024. A tornado track shows the route of a tornado. Per NOAA, "A tornado is a narrow, violently rotating column of air that extends from a thunderstorm to the ground. Because wind is invisible, it is hard to see a tornado unless it forms a condensation funnel made up of water droplets, dust and debris. Tornadoes can be among the most violent phenomena of all atmospheric storms we experience. The most destructive tornadoes occur from supercells, which are rotating thunderstorms with a well-defined radar circulation called a mesocyclone. (Supercells can also produce damaging hail, severe non-tornadic winds, frequent lightning, and flash floods.)"

EF-5 Tornado Track (May 3, 1999) near Oklahoma City, Oklahoma
Data currency: December 30, 2024
Data source: Storm Prediction Center
Data modifications: Added field "Date_Calc"
For more information: Severe Weather 101 - Tornadoes; NSSL Research: Tornadoes
Support documentation: SPC Tornado, Hail, and Wind Database Format Specification
For feedback, please contact: ArcGIScomNationalMaps@esri.com

National Oceanic and Atmospheric Administration: Per NOAA, its mission is "To understand and predict changes in climate, weather, ocean, and coasts, to share that knowledge and information with others, and to conserve and manage coastal and marine ecosystems and resources."
https://data.mfe.govt.nz/license/attribution-3-0-new-zealand/
Lightning is the discharge of electricity from thunderstorms. Ground strikes can cause significant damage to property and infrastructure, and injure or kill people and livestock. Lightning is often associated with other severe weather events, such as strong wind gusts. Thunderstorms may increase in frequency and intensity with climate change.
This dataset shows the location of sensors in the New Zealand Lightning Detection Network (NZLDN), run by MetService.
Sensors around the country detect lightning over the New Zealand land mass and a short distance out to sea. These sensors detect very accurately the electrical discharge, location, and time, as well as noting other parameters such as current strength. The NZLDN records both cloud-to-cloud and cloud-to-ground strikes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Baltimore radar rainfall dataset was developed from a multi-sensor analysis combining radar rainfall estimates from the Sterling, VA WSR88D radar (KLWX) with measurements from a collection of ground-based rain gages. The archived data have a 15-minute time resolution and a grid resolution of 0.01 degree latitude/longitude (approximately 1 km x 1 km); 15-minute rainfall accumulations for each grid are in mm. The dataset spans 22 years, 2000-2021, and covers an area of approximately 4,900 km^2 (70 by 70 grid cells, each with an approximate area of 1 km^2) surrounding the Baltimore, MD metropolitan area (Figure 1). The rainfall data cover the six months from April to September of each year. This is the period with the most intense sub-daily rainfall and the period for which radar measurements are most accurate. Figure 1 illustrates the climatological analysis of the mean annual frequency of days with at least 1 hour of rainfall exceeding 25 mm. The striking spatial variability of convective rainfall is illustrated in Figure 2 by the April-September climatology of annual lightning strikes.
As with many long-term environmental data sets, sensor technology has changed during the time period of the archive. The Sterling, VA WSR88D radar underwent a hardware upgrade from single polarization to dual polarization in 2012. Prior to the upgrade, rainfall was estimated using a conventional radar-reflectivity algorithm (HydroNEXRAD) which converts reflectivity measurements in polar coordinates from the lowest sweep to rainfall estimates on a 0.01 degree latitude-longitude grid at the surface (see Seo et al. 2010 and Smith et al. 2012 for details on the algorithm). The polarimetric upgrade introduced new measurements into the radar-rainfall algorithm. In addition to reflectivity, the operational rainfall product, Digital Precipitation Rate (DPR), directly uses differential reflectivity and specific differential phase shift measurements to estimate rainfall (https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C00708; see also Giangrande and Ryzhkov 2008). Details of the algorithm structure and parameterization for the DPR radar-rainfall estimates have been modified during the 10-year period of the data set.
A storm-based (daily) multiplicative mean field bias has been applied to both datasets. The mean field bias is computed as the ratio of daily rain gage rainfall at a point to daily radar rainfall for the bin that contains the gage. The rain gage dataset is compiled from rain gages in the Baltimore metropolitan region and surrounding areas and includes gages acquired from both Baltimore City and Baltimore County, and the Global Historical Climatology Network daily (GHCNd). Mean field bias improves rainfall estimates and diminishes the impacts of changing measurement procedures.
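The daily multiplicative mean field bias described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact procedure: the bias here is the ratio of summed daily gage totals to summed collocated radar totals, with a guard against near-zero radar accumulations.

```python
import numpy as np

def mean_field_bias(gage_daily, radar_daily, min_radar=1.0):
    """Daily multiplicative mean field bias: ratio of summed gage rainfall
    to summed radar rainfall at the collocated bins (both in mm).

    Returns 1.0 (no correction) when the radar total is near zero,
    to avoid unstable ratios on dry days. The threshold is an assumption.
    """
    g = np.asarray(gage_daily, dtype=float)
    r = np.asarray(radar_daily, dtype=float)
    if r.sum() < min_radar:
        return 1.0
    return g.sum() / r.sum()

# Three gage/radar pairs for one day (mm); the bias multiplies the radar field.
bias = mean_field_bias([10.0, 20.0, 30.0], [8.0, 25.0, 27.0])
corrected = bias * np.array([8.0, 25.0, 27.0])
```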
The dataset has been archived in 2 formats: netCDF gridded rainfall, with 1 file for each 15-minute time period; and CSV or Excel point rainfall (1 point at the center of each grid cell) in a time-series format, with 1 file per calendar month covering the entire 70x70 domain. The CSV files are in folders organized by calendar year. The first five columns in each file represent year, month, day, hour, and minute and can be combined to generate a unique date-time value for each time step. Each additional column is a complete time series for the month and represents data from one of the 1-km^2 grid cells in the original data set.
The latitude and longitude coordinates for each pixel in the grid are provided. The latitude and longitude represent the centroid of the cell, which is square when represented in latitude and longitude coordinates and rectangular when represented in other distance-based coordinate systems such as State Plane or Universal Transverse Mercator. There are 4900 pixels in the domain. In order to visualize the data using GIS or other software, the user needs to associate each column in the annual rainfall file with the latitude and longitude values for that grid cell number.
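Under the layout described above, one monthly CSV can be handled with pandas roughly as follows. The column names and sample values here are assumptions drawn from the description, so check the header of an actual file first:

```python
import pandas as pd

# Two 15-minute time steps for two (of 4900) grid-cell columns; in practice
# this frame would come from pd.read_csv on one monthly file.
df = pd.DataFrame({
    "year": [2015, 2015], "month": [6, 6], "day": [1, 1],
    "hour": [0, 0], "minute": [0, 15],
    "cell_0001": [0.0, 2.5], "cell_0002": [1.2, 0.0],  # mm per 15 min
})

# Combine the first five columns into a datetime index for the time steps.
df.index = pd.to_datetime(df[["year", "month", "day", "hour", "minute"]])
rain = df.drop(columns=["year", "month", "day", "hour", "minute"])

print(rain.loc["2015-06-01 00:15", "cell_0001"])  # -> 2.5
```

To map the data for GIS use, join each `cell_*` column to the corresponding centroid latitude/longitude from the provided coordinate listing.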
These data may be subject to modest revision or reformatting in future versions. The current version is version 2.0 and is being offered to users who wish to explore the data. We will revise this document as needed.
The Geostationary Lightning Mapper Level 2 Lightning Detection product contains a list of lightning flashes and their constituent groups and events. The definitions of and relationships among flashes, groups, and events are governed by the following spatial and temporal characteristics: an event represents the signal detected from the cloud top associated with a lightning emission in an individual sensor pixel for a 2 ms integration period; a group represents the events detected in adjacent sensor pixels for the same integration period as an event; a flash represents a series of measurements, constrained by temporal and spatial extent thresholds, that are associated with one or more groups. The parent-child relationship among specific flashes, groups, and events is stored in the product. Data for each flash include an energy-weighted centroid latitude/longitude location, time span of occurrence, amount of radiant energy, and coverage area. Data for each group include an energy-weighted centroid latitude/longitude location, mean time of occurrence, amount of radiant energy, and coverage area. Data for each event include a latitude/longitude location, time of occurrence, and amount of radiant energy. The product includes data quality information for each flash and group. A Lightning Detection product file contains a set of flashes and their constituent groups and events for a 20-second period. The unit of measure for the flash, group, and event radiant energy values is the joule; the unit of measure for the flash and group coverage areas is the square meter.
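The parent-child linkage described above is stored as parent-id arrays in the product file. A minimal sketch of aggregating group energy per parent flash follows; variable names like `group_parent_flash_id` mirror the product's netCDF conventions but should be verified against an actual file, and the energy values here are illustrative, not physical:

```python
import numpy as np

def flash_group_energy(flash_id, group_parent_flash_id, group_energy):
    """Sum group radiant energy (joules) per parent flash via the parent-id array."""
    flash_id = np.asarray(flash_id)
    parents = np.asarray(group_parent_flash_id)
    energy = np.asarray(group_energy, dtype=float)
    return {int(f): float(energy[parents == f].sum()) for f in flash_id}

# Two flashes; three groups, the first two belonging to flash 10.
totals = flash_group_energy(
    flash_id=[10, 11],
    group_parent_flash_id=[10, 10, 11],
    group_energy=[4.0, 2.0, 1.0],  # illustrative values
)
print(totals)  # radiant energy per flash
```

The same indexing pattern links events to groups through the event-level parent-id array.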
Comprehensive dataset of 19,365 lighting stores in the United States as of July 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
National Risk Index Version: March 2023 (1.19.0)

Lightning is a visible electrical discharge or spark of electricity in the atmosphere between clouds, the air, and/or the ground, often produced by a thunderstorm. Annualized frequency values for Lightning are in units of events per year.

The National Risk Index is a dataset and online tool that helps to illustrate the communities most at risk for 18 natural hazards across the United States and territories: Avalanche, Coastal Flooding, Cold Wave, Drought, Earthquake, Hail, Heat Wave, Hurricane, Ice Storm, Landslide, Lightning, Riverine Flooding, Strong Wind, Tornado, Tsunami, Volcanic Activity, Wildfire, and Winter Weather. The National Risk Index provides Risk Index values, scores, and ratings based on data for Expected Annual Loss due to natural hazards, Social Vulnerability, and Community Resilience. Separate values, scores, and ratings are also provided for Expected Annual Loss, Social Vulnerability, and Community Resilience. For the Risk Index and Expected Annual Loss, values, scores, and ratings can be viewed as a composite score for all hazards or individually for each of the 18 hazard types.

Sources for Expected Annual Loss data include: Alaska Department of Natural Resources; Arizona State University's (ASU) Center for Emergency Management and Homeland Security (CEMHS); California Department of Conservation; California Office of Emergency Services; California Geological Survey; Colorado Avalanche Information Center; CoreLogic's Flood Services; Federal Emergency Management Agency (FEMA) National Flood Insurance Program; Humanitarian Data Exchange (HDX); Iowa State University's Iowa Environmental Mesonet; Multi-Resolution Land Characteristics (MRLC) Consortium; National Aeronautics and Space Administration's (NASA) Cooperative Open Online Landslide Repository (COOLR); National Earthquake Hazards Reduction Program (NEHRP); National Oceanic and Atmospheric Administration's (NOAA) National Centers for Environmental Information (NCEI); NOAA's National Hurricane Center; NOAA's National Weather Service (NWS); NOAA's Office for Coastal Management; NOAA's National Geophysical Data Center; NOAA's Storm Prediction Center; Oregon Department of Geology and Mineral Industries; Pacific Islands Ocean Observing System; Puerto Rico Seismic Network; Smithsonian Institution's Global Volcanism Program; State of Hawaii's Office of Planning's Statewide GIS Program; U.S. Army Corps of Engineers' Cold Regions Research and Engineering Laboratory (CRREL); U.S. Census Bureau; U.S. Department of Agriculture's (USDA) National Agricultural Statistics Service (NASS); U.S. Forest Service's Fire Modeling Institute's Missoula Fire Sciences Lab; U.S. Forest Service's National Avalanche Center (NAC); U.S. Geological Survey (USGS); U.S. Geological Survey's Landslide Hazards Program; United Nations Office for Disaster Risk Reduction (UNDRR); University of Alaska Fairbanks' Alaska Earthquake Center; University of Nebraska-Lincoln's National Drought Mitigation Center (NDMC); University of Southern California's Tsunami Research Center; and Washington State Department of Natural Resources.

Data for Social Vulnerability are provided by the Centers for Disease Control and Prevention (CDC) Agency for Toxic Substances and Disease Registry (ATSDR) Social Vulnerability Index, and data for Community Resilience are provided by the University of South Carolina's Hazards and Vulnerability Research Institute's (HVRI) 2020 Baseline Resilience Indicators for Communities. The boundaries for counties and Census tracts are based on the U.S. Census Bureau's 2021 TIGER/Line shapefiles. Building value and population exposures for communities are based on FEMA's Hazus 6.0. Agriculture values are based on the USDA 2017 Census of Agriculture.
FIRE_POINT: This dataset represents points of origin of BLM fires that occurred naturally (e.g., lightning), accidentally (e.g., an escaped campfire), or maliciously across Oregon and Washington. The dataset includes some, but not all, historic fires (fires declared 'out' in calendar years prior to the current year). There is no lower size limit for fires to be included. In addition, fire origins from many non-BLM Federal and State agencies are present.
https://spdx.org/licenses/CC0-1.0.html
Geostationary Operational Environmental Satellite (GOES) Radar Estimation via Machine Learning to Inform NWP (GREMLIN) is a machine learning model that produces composite radar reflectivity using data from the Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM). GREMLIN is useful for observing severe weather and providing information during convective initiation, especially over regions without ground-based radars. Previous research found good skill compared to ground-based radar products; however, that analysis was done over a dataset with climatic and precipitation characteristics similar to the training dataset: warm-season Eastern CONUS in 2019. This study expands the analysis to the entire contiguous United States, during all seasons, covering the period 2020-2022. Several validation metrics, including root-mean-square difference (RMSD), probability of detection (POD), and false alarm ratio (FAR), are plotted over CONUS by season, day of year, and time of day, and the regional and temporal variations are examined. GREMLIN skill is highest in summer and spring, with lower skill in winter because cold surfaces are frequently mistaken for precipitating clouds. In summer, diurnal patterns of RMSD in different longitude regions follow diurnal patterns of precipitation occurrence. GREMLIN's accuracy is best over the Central to Eastern United States, where it was trained. Over New England, GREMLIN POD is lower due to different brightness temperature distributions and a low frequency of lightning compared to the training data. Over Florida, GREMLIN FAR is higher due to a high frequency of lightning. Overall, GREMLIN has reliable skill over CONUS in spring, summer, and fall, while winter needs more improvement.

Methods: The methodology is described in detail by Hilburn et al. (2021). The ABI, GLM, and MRMS data sets were resampled to a common 3 km grid. A cloud height of 10 km was used for removing parallax displacements.
Satellite and radar samples were matched in time with a maximum time difference of 2.5 minutes. GLM lightning groups were accumulated over 15-minute time periods.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains over 18,000 images of homes damaged by wildfire between 2020 and 2022 in California, USA, captured by the California Department of Forestry and Fire Protection (Cal Fire) during the damage assessment process. The dataset spans more than 18 wildfire events, including the 2020 August Complex Fire, the first recorded "gigafire" event in California, in which the area burned exceeded 1 million acres. Each image, corresponding to a built structure, is classified by government damage assessors into 6 different categories: No Damage, Affected (1-9%), Minor (10-25%), Major (26-50%), Destroyed (>50%), and Inaccessible (image taken but no assessment made). While over 57,000 structures were evaluated during the damage assessment process, only about 18,000 have images; additional data about the structures, such as the street address or structure materials, for both those with and without corresponding images, can be accessed in the "Additional Attribute Data" file.
The 18 wildfire events captured in the dataset are:
[AUG] August Complex (2020)
[BEA] Bear Fire (2020)
[BEU] BEU Lightning Complex Fire (2020)
[CAL] Caldor Fire (2021)
[CAS] Castle Fire (2020)
[CRE] Creek Fire (2020)
[DIN] DINS Statewide (Collection of Smaller Fires, 2021)
[DIX] Dixie Fire (2021)
[FAI] Fairview Fire (2022)
[FOR] Fork Fire (2022)
[GLA] Glass Fire (2020)
[MIL] Mill Mountain Fire (2022)
[MON] Monument Fire (2021)
[MOS] Mosquito Fire (2022)
[POST] Post Fire (2020)
[SCU] SCU Complex Fire (2020)
[VAL] Valley Fire (2020)
[ZOG] Zogg Fire (2020)
The author retrieved the data, originally published as GIS feature layers, from the publicly accessible CAL FIRE Hub, then subsequently processed it into image and tabular formats. The author collaborated with Cal Fire in working with the data, and has received explicit permission for republication.
The LIS 0.1 Degree Very High Resolution Gridded Lightning Monthly Climatology (VHRMC) dataset consists of gridded monthly climatologies of total lightning flash rates seen by the Lightning Imaging Sensor (LIS) from January 1, 1998 through December 31, 2013. LIS is an instrument on the Tropical Rainfall Measuring Mission (TRMM) satellite used to detect the distribution and variability of total lightning occurring in the Earth's tropical and subtropical regions. This information can be used for severe storm detection and analysis, and also for lightning-atmosphere interaction studies. The gridded climatologies include annual mean flash rate, mean diurnal cycle of flash rate with 24-hour resolution, and mean annual cycle of flash rate with daily, monthly, or seasonal resolution. All datasets are in 0.1 degree spatial resolution. The mean annual cycle of flash rate datasets (i.e., daily, monthly, or seasonal) have both 49-day and 1 degree boxcar moving averages applied to remove the diurnal cycle and smooth regions with low flash rate, making the results more robust.
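For a regular 0.1-degree grid like the one above, a latitude/longitude can be mapped to a row/column index with simple arithmetic. The grid origin and orientation in this sketch are assumptions; check the dataset's own coordinate variables before indexing real files:

```python
def grid_index(lat, lon, lat0=-90.0, lon0=-180.0, res=0.1):
    """Map a latitude/longitude to (row, col) on a regular grid.

    lat0/lon0: assumed grid origin (south-west corner); res: cell size
    in degrees. Verify both against the file's coordinate arrays.
    """
    row = int((lat - lat0) / res)
    col = int((lon - lon0) / res)
    return row, col

print(grid_index(35.05, -84.95))  # a point in the southeastern United States
```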
The North American Mesoscale Forecast System (NAM) is one of the major regional weather forecast models run by the National Centers for Environmental Prediction (NCEP) for producing weather forecasts. Dozens of weather parameters are available from the NAM grids, from temperature and precipitation to lightning and turbulent kinetic energy. The NAM generates multiple grids (or domains) of weather forecasts over the North American continent at various horizontal resolutions. High-resolution forecasts are generated within the NAM using additional numerical weather models. These high-resolution forecast windows are generated over fixed regions and are occasionally run to follow significant weather events, like hurricanes. This dataset contains a 12 km horizontal resolution Lambert Conformal grid covering the Continental United States (CONUS) domain. It is run four times daily at 00z, 06z, 12z and 18z out to 84 hours with a 1 hour temporal resolution.
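Given the cycle schedule and temporal resolution stated above (four runs daily, hourly output to 84 hours), the valid times covered by one cycle can be enumerated directly. A minimal sketch:

```python
from datetime import datetime, timedelta

def cycle_valid_times(cycle_start):
    """Valid times for one NAM CONUS cycle: forecast hours 0..84 at
    1-hour resolution, per the dataset description."""
    return [cycle_start + timedelta(hours=h) for h in range(0, 85)]

times = cycle_valid_times(datetime(2024, 1, 1, 0))  # the 00z cycle
print(len(times), times[-1])  # 85 steps, ending 84 h after the cycle start
```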
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is Part 2/2 of the ActiveHuman dataset! Part 1 can be found here.

Dataset Description

ActiveHuman was generated using Unity's Perception package. It consists of 175,428 RGB images and their semantic segmentation counterparts taken in different environments, lighting conditions, camera distances, and angles. In total, the dataset contains images for 8 environments, 33 humans, 4 lighting conditions, 7 camera distances (1 m-4 m), and 36 camera angles (0-360 degrees at 10-degree intervals). The dataset does not include images at every single combination of available camera distances and angles, since for some values the camera would collide with another object or go outside the confines of an environment. As a result, some combinations of camera distances and angles do not exist in the dataset. Alongside each image, 2D Bounding Box, 3D Bounding Box, and Keypoint ground truth annotations are also generated via the use of Labelers and are stored as a JSON-based dataset. These Labelers are scripts that are responsible for capturing ground truth annotations for each captured image or frame. Keypoint annotations follow the COCO format defined by the COCO keypoint annotation template offered in the Perception package.
Folder configuration

The dataset consists of 3 folders:
JSON Data: Contains all the generated JSON files. RGB Images: Contains the generated RGB images. Semantic Segmentation Images: Contains the generated semantic segmentation images.
Essential Terminology
Annotation: Recorded data describing a single capture. Capture: One completed rendering process of a Unity sensor which stored the rendered result to data files (e.g. PNG, JPG, etc.). Ego: Object or person to which a collection of sensors is attached (e.g., if a drone has a camera attached to it, the drone would be the ego and the camera would be the sensor). Ego coordinate system: Coordinates with respect to the ego. Global coordinate system: Coordinates with respect to the global origin in Unity. Sensor: Device that captures the dataset (in this instance the sensor is a camera). Sensor coordinate system: Coordinates with respect to the sensor. Sequence: Time-ordered series of captures. This is very useful for video capture where the time-order relationship of two captures is vital. UUID: Universally Unique Identifier. It is a unique hexadecimal identifier that can represent an individual instance of a capture, ego, sensor, annotation, labeled object or keypoint, or keypoint template.
Dataset Data

The dataset includes 4 types of JSON annotation files:
annotation_definitions.json: Contains annotation definitions for all of the active Labelers of the simulation stored in an array. Each entry consists of a collection of key-value pairs which describe a particular type of annotation and contain information about that specific annotation describing how its data should be mapped back to labels or objects in the scene. Each entry contains the following key-value pairs:
id: Integer identifier of the annotation's definition. name: Annotation name (e.g., keypoints, bounding box, bounding box 3D, semantic segmentation). description: Description of the annotation's specifications. format: Format of the file containing the annotation specifications (e.g., json, PNG). spec: Format-specific specifications for the annotation values generated by each Labeler.
Most Labelers generate different annotation specifications in the spec key-value pair:
BoundingBox2DLabeler/BoundingBox3DLabeler:
label_id: Integer identifier of a label. label_name: String identifier of a label. KeypointLabeler:
template_id: Keypoint template UUID. template_name: Name of the keypoint template. key_points: Array containing all the joints defined by the keypoint template. This array includes the key-value pairs:
label: Joint label. index: Joint index. color: RGBA values of the keypoint. color_code: Hex color code of the keypoint. skeleton: Array containing all the skeleton connections defined by the keypoint template. Each skeleton connection defines a connection between two different joints. This array includes the key-value pairs:
label1: Label of the first joint. label2: Label of the second joint. joint1: Index of the first joint. joint2: Index of the second joint. color: RGBA values of the connection. color_code: Hex color code of the connection. SemanticSegmentationLabeler:
label_name: String identifier of a label. pixel_value: RGBA values of the label. color_code: Hex color code of the label.
captures_xyz.json: Each of these files contains an array of ground truth annotations generated by each active Labeler for each capture separately, as well as extra metadata that describes the state of each active sensor present in the scene. Each array entry contains the following key-value pairs:
id: UUID of the capture. sequence_id: UUID of the sequence. step: Index of the capture within a sequence. timestamp: Timestamp (in ms) since the beginning of a sequence. sensor: Properties of the sensor. This entry contains a collection with the following key-value pairs:
sensor_id: Sensor UUID. ego_id: Ego UUID. modality: Modality of the sensor (e.g., camera, radar). translation: 3D vector that describes the sensor's position (in meters) with respect to the global coordinate system. rotation: Quaternion variable that describes the sensor's orientation with respect to the ego coordinate system. camera_intrinsic: matrix containing (if it exists) the camera's intrinsic calibration. projection: Projection type used by the camera (e.g., orthographic, perspective). ego: Attributes of the ego. This entry contains a collection with the following key-value pairs:
ego_id: Ego UUID. translation: 3D vector that describes the ego's position (in meters) with respect to the global coordinate system. rotation: Quaternion variable containing the ego's orientation. velocity: 3D vector containing the ego's velocity (in meters per second). acceleration: 3D vector containing the ego's acceleration (in meters per second squared). format: Format of the file captured by the sensor (e.g., PNG, JPG). annotations: Key-value pair collections, one for each active Labeler. These key-value pairs are as follows:
id: Annotation UUID. annotation_definition: Integer identifier of the annotation's definition. filename: Name of the file generated by the Labeler. This entry is only present for Labelers that generate an image. values: List of key-value pairs containing annotation data for the current Labeler.
Each Labeler generates different annotation specifications in the values key-value pair:
BoundingBox2DLabeler:
label_id: Integer identifier of a label. label_name: String identifier of a label. instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values. x: Position of the 2D bounding box on the X axis. y: Position of the 2D bounding box position on the Y axis. width: Width of the 2D bounding box. height: Height of the 2D bounding box. BoundingBox3DLabeler:
label_id: Integer identifier of a label. label_name: String identifier of a label. instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values. translation: 3D vector containing the location of the center of the 3D bounding box with respect to the sensor coordinate system (in meters). size: 3D vector containing the size of the 3D bounding box (in meters). rotation: Quaternion variable containing the orientation of the 3D bounding box. velocity: 3D vector containing the velocity of the 3D bounding box (in meters per second). acceleration: 3D vector containing the acceleration of the 3D bounding box (in meters per second squared). KeypointLabeler:
label_id: Integer identifier of a label. instance_id: UUID of one instance of a joint. Keypoints with the same joint label that are visible on the same capture have different instance_id values. template_id: UUID of the keypoint template. pose: Pose label for that particular capture. keypoints: Array containing the properties of each keypoint. Each keypoint that exists in the keypoint template file is one element of the array. Each entry's contents have as follows:
index: Index of the keypoint in the keypoint template file. x: Pixel coordinates of the keypoint on the X axis. y: Pixel coordinates of the keypoint on the Y axis. state: State of the keypoint.
The SemanticSegmentationLabeler does not contain a values list.
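The captures structure above can be navigated with ordinary JSON tooling. A minimal sketch of pulling 2D bounding boxes out of one capture entry follows; the nesting mirrors the key-value layout described above, but the sample values are made up and the exact paths should be checked against your generated files:

```python
import json

# One capture entry as it might appear in a captures_xyz.json file
# (illustrative values; in practice: json.load(open(path))["captures"][i]).
capture = {
    "id": "c1", "sequence_id": "s1", "step": 0, "timestamp": 0.0,
    "annotations": [
        {"id": "a1", "annotation_definition": 1,
         "values": [{"label_id": 1, "label_name": "person",
                     "instance_id": "i1", "x": 12, "y": 30,
                     "width": 80, "height": 160}]}
    ],
}

# Keep only values that carry the 2D bounding box fields.
boxes = [v for ann in capture["annotations"]
         for v in ann.get("values", [])
         if {"x", "y", "width", "height"} <= v.keys()]
print(boxes[0]["label_name"], boxes[0]["width"])
```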
egos.json: Contains collections of key-value pairs for each ego. These include:
id: UUID of the ego. description: Description of the ego. sensors.json: Contains collections of key-value pairs for all sensors of the simulation. These include:
id: UUID of the sensor. ego_id: UUID of the ego on which the sensor is attached. modality: Modality of the sensor (e.g., camera, radar, sonar). description: Description of the sensor (e.g., camera, radar).
Image names

The RGB and semantic segmentation images share the same image naming convention. However, the semantic segmentation images also contain the string Semantic_ at the beginning of their filenames. Each RGB image is named "e_h_l_d_r.jpg", where:
- e denotes the id of the environment.
- h denotes the id of the person.
- l denotes the id of the lighting condition.
- d denotes the camera distance at which the image was captured.
- r denotes the camera angle at which the image was captured.
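The convention is straightforward to parse programmatically. A sketch, assuming each id is a single underscore-free token as the scheme implies (the sample filename is hypothetical):

```python
import os

def parse_image_name(filename):
    """Split an 'e_h_l_d_r.jpg' filename into its five components,
    stripping the 'Semantic_' prefix used by segmentation images."""
    stem, _ = os.path.splitext(os.path.basename(filename))
    is_semantic = stem.startswith("Semantic_")
    if is_semantic:
        stem = stem[len("Semantic_"):]
    e, h, l, d, r = stem.split("_")
    return {"environment": e, "person": h, "lighting": l,
            "distance": d, "angle": r, "semantic": is_semantic}

print(parse_image_name("Semantic_3_12_2_150_45.jpg"))
```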
This dataset is designed for Visual Geo-Localization (VG), also known as Visual Place Recognition (VPR). The task involves determining the geographic location of a given image by retrieving the most visually similar images from a database. This dataset provides a diverse collection of urban images, enabling researchers and practitioners to train and evaluate geo-localization models under challenging real-world conditions.
This dataset consists of images curated for training and evaluation of visual geo-localization models. The data is drawn from multiple sources to ensure diversity in lighting conditions, perspectives, and geographical contexts.
This dataset is ideal for: ✅ Training and testing deep learning models for visual geo-localization. ✅ Studying the impact of lighting, perspective, and cultural diversity on place recognition. ✅ Benchmarking retrieval-based localization methods. ✅ Exploring feature extraction techniques for geo-localization tasks.
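At its core, retrieval-based geo-localization ranks database images by descriptor similarity to the query. A minimal sketch with random vectors standing in for learned image descriptors (no specific model or feature extractor from this dataset is assumed):

```python
import numpy as np

def retrieve(query_feat, db_feats, k=3):
    """Indices of the k database images most similar to the query,
    ranked by cosine similarity of their global descriptors."""
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    return np.argsort(-(db @ q))[:k]

rng = np.random.default_rng(0)
db_feats = rng.normal(size=(100, 128))              # stand-in descriptors
query = db_feats[42] + 0.05 * rng.normal(size=128)  # noisy view of image 42
print(retrieve(query, db_feats))  # nearest neighbor should be image 42
```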
If you find this dataset useful, please consider citing it in your research or giving it an upvote on Kaggle! 🚀
This repository contains the datasets from the paper "iCITRIS: Causal Representation Learning for Instantaneous Temporal Effects" (link) by Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, and Efstratios Gavves.

Instantaneous Temporal Causal3DIdent - The Temporal Causal3DIdent dataset is a collection of 3D object shapes observed under varying positions, rotations, lighting, and colors. Overall, this dataset contains 7 (multidimensional) causal factors with instantaneous and temporal causal relations between them. The 7 shapes used are Armadillo, Bunny, Cow, Dragon, Head, Horse, and Teapot. For more details on the dataset, see our GitHub repository.

Causal Pinball - The Causal Pinball environment implements the simplified, real-world game dynamics of Pinball. This dataset considers 5 causal variables with instantaneous effects: the left paddle position, the right paddle position, the ball (velocity and position), the state of all bumpers, and the score. For more details on the dataset, as well as the code to generate it, see our GitHub repository.

References:
- Lippe, P., Magliacane, S., Löwe, S., Asano, Y. M., Cohen, T., & Gavves, E. (2022). CITRIS: Causal Identifiability from Temporal Intervened Sequences. In Proceedings of the 39th International Conference on Machine Learning, ICML 2022.
- Lippe, P., Magliacane, S., Löwe, S., Asano, Y. M., Cohen, T., & Gavves, E. (2022). iCITRIS: Causal Representation Learning for Instantaneous Temporal Effects. arXiv preprint arXiv:2206.06169.
Version Information

The data is updated annually with fire perimeters from the previous calendar year. Firep23_1 was released in May 2024. Two hundred eighty-four fires from the 2023 fire season were added to the database (21 from BLM, 102 from CAL FIRE, 72 from Contract Counties, 19 from LRA, 9 from NPS, 57 from USFS, and 4 from USFW). The 2020 Cottonwood fire, the 2021 Lone Rock and Union fires, and the 2022 Lost Lake fire were added. USFW submitted a higher-accuracy perimeter to replace the 2022 River perimeter. A duplicate 2020 Erbes fire was removed. Additionally, 48 perimeters were digitized from a historical map included in a publication: Weeks, D. et al., The Utilization of El Dorado County Land, May 1934, Bulletin 572, University of California, Berkeley. There were 2,132 perimeters that received updated attribution, the bulk of which had IRWIN IDs added.

The following fires were identified as meeting our collection criteria but are not included in this version and will hopefully be added in the next update: Big Hill #2 (2023-CAHIA-001020).

The YEAR_ field changed to a short integer type. The San Diego CAL FIRE UNIT_ID changed to SDU (the former code MVU is maintained in the UNIT_ID domains). COMPLEX_INCNUM was renamed to COMPLEX_ID and is in the process of transitioning from the local incident number to the complex IRWIN ID. Perimeters managed in a complex in 2023 are added with the complex IRWIN ID; those previously added will transition to complex IRWIN IDs in a future update.

If you would like a full briefing on these adjustments, please contact the data steward, Kim Wallin (kimberly.wallin@fire.ca.gov), CAL FIRE FRAP.

CAL FIRE (including contract counties), USDA Forest Service Region 5, USDI Bureau of Land Management & National Park Service, and other agencies jointly maintain a fire perimeter GIS layer for public and private lands throughout the state. The data covers fires back to 1878.
Current criteria for data collection are as follows:
- CAL FIRE (including contract counties) submit perimeters ≥10 acres in timber, ≥50 acres in brush, or ≥300 acres in grass, and/or ≥3 damaged/destroyed residential or commercial structures, and/or ≥1 fatality caused.
- All cooperating agencies submit perimeters ≥10 acres.

Discrepancies between wildfire perimeter data and CAL FIRE Redbook Large Damaging Fires

The definition of Large Damaging fires in California was first set by the CAL FIRE Redbook; it has changed over time, and it differs from the definition initially used to determine which wildfires had to be submitted for the initial compilation of this digital fire perimeter data. In contrast, the definition of fires whose perimeters should be collected has changed only once in the approximately 30 years the data has been in existence. Below are descriptions of the changes in data collection criteria used when compiling these two datasets. To facilitate comparison, this metadata includes a summary, by year, of fires in the Redbook that do not appear in this fire perimeter dataset, followed by an enumeration of each "Redbook" fire missing from the spatial data.
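The current collection criteria can be expressed as a simple predicate. An illustrative sketch only (not an official CAL FIRE tool; the function and parameter names are hypothetical):

```python
def meets_collection_criteria(acres, vegetation, structures=0, fatalities=0):
    """CAL FIRE submission criteria as described above: >=10 acres in
    timber, >=50 in brush, >=300 in grass, and/or >=3 damaged/destroyed
    structures, and/or >=1 fatality."""
    thresholds = {"timber": 10, "brush": 50, "grass": 300}
    by_acreage = acres >= thresholds.get(vegetation, float("inf"))
    return by_acreage or structures >= 3 or fatalities >= 1

print(meets_collection_criteria(40, "brush"))                # under the 50-acre brush threshold
print(meets_collection_criteria(40, "brush", fatalities=1))  # a fatality qualifies regardless
```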
Wildfire Perimeter criteria:
- ~1991: 10 acres timber, 30 acres brush, 300 acres grass; damages or destroys three residences or one commercial structure; or does $300,000 worth of damage.
- 2002: 10 acres timber, 50 acres brush, 300 acres grass; damages or destroys three or more structures; or does $300,000 worth of damage.
- ~2010: 10 acres timber, 30 acres brush, 300 acres grass; damages or destroys three or more structures (doesn't include outbuildings, sheds, chicken coops, etc.).

Large and Damaging Redbook Fire data criteria:
- 1979: Fires of a minimum of 300 acres that burn at least 30 acres timber, or 300 acres brush, or 1,500 acres woodland or grass.
- 1981: 1979 criteria, plus fires that took 3,000 hours or more of California Department of Forestry and Fire Protection personnel time to suppress.
- 1992: 1981 criteria, plus 1,500 acres agricultural products, or destroys three residences or one commercial structure, or does $300,000 damage.
- 1993: 1992 criteria, but "three or more structures destroyed" replaces "destroys three residences or one commercial structure", and the 3,000 hours of California Department of Forestry personnel time to suppress is removed.
- 2006: 300 acres or larger and burned at least 30 acres of timber, or 300 acres of brush, or 1,500 acres of woodland, or 1,500 acres of grass, or 1,500 acres of agricultural products, or 3 or more structures destroyed, or $300,000 or more dollar damage loss.
- 2008: 300 acres and larger.

Number of missing Large and Damaging Redbook fires, by year:
1979: 22; 1980: 13; 1981: 15; 1982: 6; 1983: 3; 1984: 20; 1985: 52; 1986: 12; 1987: 56; 1988: 23; 1989: 8; 1990: 9; 1991: 2; 1992: 16; 1993: 17; 1994: 22; 1995: 9; 1996: 15; 1997: 9; 1998: 10; 1999: 7; 2000: 4; 2001: 5; 2002: 16; 2003: 5; 2004: 2; 2005: 1; 2006: 11; 2007: 3; 2008: 43; 2009: 3; 2010: 2; 2011: 0; 2012: 4; 2013: 2; 2014: 7; 2015: 10; 2016: 2; 2017: 11; 2018: 6; 2019: 2; 2020: 3; 2021: 0; 2022: 0; 2023: 0. Total: 488.

Enumeration of fires in the Redbook that are missing from Fire Perimeter data.
Three letter unit code follows fire name.
1979 - Sylvandale (HUU), Kiefer (AEU), Taylor (TUU), Parker #2 (TCU), PGE #10, Crocker (SLU), Silver Spur (SLU), Parkhill (SLU), Tar Springs #2 (SLU), Langdon (SCU), Truelson (RRU), Bautista (RRU), Crocker (SLU), Spanish Ranch (SLU), Parkhill (SLU), Oak Springs (BDU), Ruddell (BDF), Santa Ana (BDU), Asst. #61 (MVU), Bernardo (MVU), Otay #20
1980 - Lightning series (SKU), Lavida (RRU), Mission Creek (RRU), Horse (RRU), Providence (RRU), Almond (BDU), Dam (BDU), Jones (BDU), Sycamore (BDU), Lightning (MVU), Assist 73, 85, 138 (MVU)
1981 - Basalt (LNU), Lightning #25 (LMU), Likely (MNF), USFS #5 (SNF), Round Valley (TUU), St. Elmo (KRN), Buchanan (TCU), Murietta (RRU), Goetz (RRU), Morongo #29 (RRU), Rancho (RRU), Euclid (BDU), Oat Mt. (LAC & VNC), Outside Origin #1 (MVU), Moreno (MVU)
1982 - Duzen (SRF), Rave (LMU), Sheep's trail (KRN), Jury (KRN), Village (RRU), Yuma (BDF)
1983 - Lightning #4 (FKU), Kern Co. #13, #18 (KRN)
1984 - Bidwell (BTU), BLM D 284, 337, PNF #115, Mill Creek (TGU), China hat (MMU), fey ranch, Kern Co #10, 25, 26, 27, Woodrow (KRN), Salt springs, Quartz (TCU), Bonanza (BEU), Pasquel (SBC), Orco asst. (ORC), Canel (local), Rattlesnake (BDF)
1985 - Hidden Valley, Magic (LNU), Bald Mt. (LNU), Iron Peak (MEU), Murrer (LMU), Rock Creek (BTU), USFS #29, 33, Bluenose, Amador, 8 mile (AEU), Backbone, Panoche, Los Gatos series, Panoche (FKU), Stan #7, Falls #2 (MMU), USFS #5 (TUU), Grizzley, Gann (TCU), Bumb, Piney Creek, HUNTER LIGGETT ASST #2, Pine, Lowes, Seco, Gorda-rat, Cherry (BEU), Las pilitas, Hwy 58 #2 (SLO), Lexington, Finley (SCU), Onions, Owens (BDU), Cabazon, Gavalin, Orco, Skinner, Shell, Pala (RRU), South Mt., Wheeler, Black Mt., Ferndale (VNC), Archibald, Parsons, Pioneer (BDU), Decker, Gleason (LAC), Gopher, Roblar, Assist #38 (MVU)
1986 - Knopki (SRF), USFS #10 (NEU), Galvin (RRU), Powerline (RRU), Scout, Inscription (BDU), Intake (BDF), Assist #42 (MVU), Lightning series (FKU), Yosemite #1 (YNP), USFS Asst. (BEU), Dutch Kern #30 (KRN)
1987 - Peach (RRU), Ave 32 (TUU), Conover (RRU), Eagle #1 (LNU), State 767 aka Bull (RRU), Denny (TUU), Dog Bar (NEU), Crank (LMU), White Deer (FKU), Briceburg (LMU), Post (RRU), Antelope (RRU), Cougar-I (SKU), Pilitas (SLU), Freaner (SHU), Fouts Complex (LNU), Slides (TGU), French (BTU), Clark (PNF), Fay/Top (SQF), Under, Flume, Bear Wallow, Gulch, Bear-1, Trinity, Jessie, friendly, Cold, Tule, Strause, China/Chance, Bear, Backbone, Doe (SHF), Travis Complex, Blake, Longwood (SRF), River-II, Jarrell, Stanislaus Complex 14k (STF), Big, Palmer, Indian (TNF), Branham (BLM), Paul, Snag (NPS), Sycamore, Trail, Stallion Spring, Middle (KRN), SLU-864
1988 - Hwy 175 (LNU), Rumsey (LNU), Shell Creek (MEU), PG&E #19 (LNU), Fields (BTU), BLM 4516, 417 (LMU), Campbell (LNF), Burney (SHF), USFS #41 (SHF), Trinity (USFS #32), State #837 (RRU), State (RRU), State (350 acres) (RRU), State #1807, Orange Co. Asst (RRU), State #1825 (RRU), State #2025, Spoor (BDU), State (MVU), Tonzi (AEU), Kern co #7, 9 (KRN), Stent (TCU)
1989 - Rock (Plumas), Feather (LMU), Olivas (BDU), State 1116 (RRU), Concorida (RRU), Prado (RRU), Black Mt. (MVU), Vail (CNF)
1990 - Shipman (HUU), Lightning 379 (LMU), Mud, Dye (TGU), State 914 (RRU), Shultz (Yorba) (BDU), Bingo Rincon #3 (MVU), Dehesa #2 (MVU), SLU 1626 (SLU)
1991 - Church (HUU), Kutras (SHF)
1992 - Lincoln, Fawn (NEU), Clover, fountain (SHU), state, state 891, state, state (RRU), Aberdeen (BDU), Wildcat, Rincon (MVU), Cleveland (AEU), Dry Creek (MMU), Arroyo Seco, Slick Rock (BEU), STF #135 (TCU)
1993 - Hoisington (HUU), PG&E #27 (with an undetermined cause), Hall (TGU), state, assist, local (RRU), Stoddard, Opal Mt., Mill Creek (BDU), Otay #18, Assist/Old coach (MVU), Eagle (CNF), Chevron USA, Sycamore (FKU), Guerrero, Duck
1994 - Schindel Escape (SHU), blank (PNF), lightning #58 (LMU), Bridge (NEU), Barkley (BTU), Lightning #66 (LMU), Local (RRU), Assist #22 & #79 (SLU), Branch (SLO), Piute (BDU), Assist/Opal #2 (BDU), Local, State, State (RRU), Gilman fire 7/24 (RRU), Highway #74 (RRU), San Felipe, Assist #42, Scissors #2 (MVU), Assist/Opal #2 (BDU), Complex (BDF), Spanish (SBC)
1995 - State 1983 acres, Lost Lake, State #1030, State (1335 acres), State (5000 acres), Jenny, City (BDU), Marron #4, Assist #51 (SLO/VNC)
1996 - Modoc NF 707 (Ambrose), Borrego (MVU), Assist #16 (SLU), Deep Creek (BDU), Weber (BDU), State (Wesley) 500 acres (RRU), Weaver (MMU), Wasioja (SBC/LPF), Gale (FKU), FKU 15832 (FKU), State (Wesley) 500 acres, Cabazon (RRU), State Assist (aka Bee) (RRU), Borrego, Otay #269 (MVU), Slaughter house (MVU), Oak Flat (TUU)
1997 - Lightning #70 (LMU), Jackrabbit (RRU), Fernandez (TUU), Assist 84 (Military AFV) (SLU), Metz #4 (BEU), Copperhead (BEU), Millstream, Correia (MMU), Fernandez (TUU)
1998 - Worden, Swift, PG&E 39 (MMU), Chariot, Featherstone, Wildcat, Emery, Deluz (MVU), Cajalco Santiago (RRU)
1999 - Musty #2, 3 (BTU), Border #95 (MVU), Andrews,
Tornadoes

This feature layer, utilizing data from the National Oceanic and Atmospheric Administration (NOAA), displays tornadoes in the United States, Puerto Rico, and the U.S. Virgin Islands between 1950 and 2024. Per NOAA, "A tornado is a narrow, violently rotating column of air that extends from a thunderstorm to the ground. Because wind is invisible, it is hard to see a tornado unless it forms a condensation funnel made up of water droplets, dust and debris. Tornadoes can be among the most violent phenomena of all atmospheric storms we experience. The most destructive tornadoes occur from supercells, which are rotating thunderstorms with a well-defined radar circulation called a mesocyclone. (Supercells can also produce damaging hail, severe non-tornadic winds, frequent lightning, and flash floods.)"

EF-5 Tornado (May 22, 2011) near Joplin, Missouri

Data currency: December 30, 2024
Data source: Storm Prediction Center
Data modifications: Added field "Date_Calc"
For more information: Severe Weather 101 - Tornadoes; NSSL Research: Tornadoes
Support documentation: SPC Tornado, Hail, and Wind Database Format Specification
For feedback, please contact: ArcGIScomNationalMaps@esri.com

National Oceanic and Atmospheric Administration: Per NOAA, its mission is "To understand and predict changes in climate, weather, ocean, and coasts, to share that knowledge and information with others, and to conserve and manage coastal and marine ecosystems and resources."
The UrduDoc Dataset is a benchmark dataset for Urdu text line detection in scanned documents. It was created as a byproduct of the UTRSet-Real dataset generation process. Comprising 478 diverse images collected from various sources such as books, documents, manuscripts, and newspapers, it offers a valuable resource for research in Urdu document analysis. It includes 358 pages for training and 120 pages for validation, featuring a wide range of styles, scales, and lighting conditions. It serves as a benchmark for evaluating printed Urdu text detection models, and benchmark results of state-of-the-art models are provided. The ContourNet model demonstrates the best performance in terms of h-mean.
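The h-mean used to rank detectors here is the harmonic mean of detection precision and recall (the standard F-measure reported by text detection benchmarks). For reference, the computation is:

```python
def h_mean(precision, recall):
    """Harmonic mean of precision and recall, as reported by
    text detection benchmarks."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(h_mean(0.9, 0.8))  # 0.847...
```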
The UrduDoc dataset is the first of its kind for printed Urdu text line detection and will advance research in the field. It will be made publicly available for non-commercial, academic, and research purposes upon request and execution of a no-cost license agreement. To request the dataset, and for more information about the UrduDoc, UTRSet-Real, and UTRSet-Synth datasets, please refer to the project website of our paper "UTRNet: High-Resolution Urdu Text Recognition In Printed Documents".
Dataset Source: NOAA. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Cover photo by NASA on Unsplash
Unsplash Images are distributed under a unique Unsplash License.