75 datasets found

AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN
nationaldataplatform.org
Updated Feb 28, 2024
Cite
(2024). AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/airnow-air-quality-monitoring-data-current
Description
This United States Environmental Protection Agency (US EPA) feature layer represents monitoring site data, updated hourly concentrations and Air Quality Index (AQI) values for the latest hour received from monitoring sites that report to AirNow.

Map and forecast data are collected using federal reference or equivalent monitoring techniques or techniques approved by the state, local or tribal monitoring agencies. To maintain "real-time" maps, the data are displayed after the end of each hour. Although preliminary data quality assessments are performed, the data in AirNow are not fully verified and validated through the quality assurance procedures monitoring organizations used to officially submit and certify data on the EPA Air Quality System (AQS).

This data sharing and centralization creates a one-stop source for real-time and forecast air quality data. The benefits include quality control, national reporting consistency, access to automated mapping methods, and data distribution to the public and other data systems. The U.S. Environmental Protection Agency, National Oceanic and Atmospheric Administration, National Park Service, tribal, state, and local agencies developed the AirNow system to provide the public with easy access to national air quality information. State and local agencies report the Air Quality Index (AQI) for cities across the US and parts of Canada and Mexico. AirNow data are used only to report the AQI, not to formulate or support regulation, guidance or any other EPA decision or position.

About the AQI

The Air Quality Index (AQI) is an index for reporting daily air quality. It tells you how clean or polluted your air is, and what associated health effects might be a concern for you. The AQI focuses on health effects you may experience within a few hours or days after breathing polluted air.
EPA calculates the AQI for five major air pollutants regulated by the Clean Air Act: ground-level ozone, particle pollution (also known as particulate matter), carbon monoxide, sulfur dioxide, and nitrogen dioxide. For each of these pollutants, EPA has established national air quality standards to protect public health. Ground-level ozone and airborne particles (often referred to as "particulate matter") are the two pollutants that pose the greatest threat to human health in this country.

A number of factors influence ozone formation, including emissions from cars, trucks, buses, power plants, and industries, along with weather conditions. Weather is especially favorable for ozone formation when it’s hot, dry and sunny, and winds are calm and light. Federal and state regulations, including regulations for power plants, vehicles and fuels, are helping reduce ozone pollution nationwide.

Fine particle pollution (or "particulate matter") can be emitted directly from cars, trucks, buses, power plants and industries, along with wildfires and woodstoves. But it also forms from chemical reactions of other pollutants in the air. Particle pollution can be high at different times of year, depending on where you live. In some areas, for example, colder winters can lead to increased particle pollution emissions from woodstove use, and stagnant weather conditions with calm and light winds can trap PM2.5 pollution near emission sources. Federal and state rules are helping reduce fine particle pollution, including clean diesel rules for vehicles and fuels, and rules to reduce pollution from power plants, industries, locomotives, and marine vessels, among others.

How Does the AQI Work?

Think of the AQI as a yardstick that runs from 0 to 500. The higher the AQI value, the greater the level of air pollution and the greater the health concern.
For example, an AQI value of 50 represents good air quality with little potential to affect public health, while an AQI value over 300 represents hazardous air quality.

An AQI value of 100 generally corresponds to the national air quality standard for the pollutant, which is the level EPA has set to protect public health. AQI values below 100 are generally thought of as satisfactory. When AQI values are above 100, air quality is considered to be unhealthy: at first for certain sensitive groups of people, then for everyone as AQI values get higher.

Understanding the AQI

The purpose of the AQI is to help you understand what local air quality means to your health. To make it easier to understand, the AQI is divided into six categories:

Air Quality Index (AQI) Values	Levels of Health Concern	Colors
When the AQI is in this range:	...air quality conditions are:	...as symbolized by this color:
0 to 50	Good	Green
51 to 100	Moderate	Yellow
101 to 150	Unhealthy for Sensitive Groups	Orange
151 to 200	Unhealthy	Red
201 to 300	Very Unhealthy	Purple
301 to 500	Hazardous	Maroon

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the Hazardous category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.

Each category corresponds to a different level of health concern. The six levels of health concern and what they mean are:

"Good" AQI is 0 to 50. Air quality is considered satisfactory, and air pollution poses little or no risk.

"Moderate" AQI is 51 to 100. Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people. For example, people who are unusually sensitive to ozone may experience respiratory symptoms.

"Unhealthy for Sensitive Groups" AQI is 101 to 150. Although the general public is not likely to be affected at this AQI range, people with lung disease, older adults and children are at a greater risk from exposure to ozone, whereas persons with heart and lung disease, older adults and children are at greater risk from the presence of particles in the air.

"Unhealthy" AQI is 151 to 200. Everyone may begin to experience some adverse health effects, and members of the sensitive groups may experience more serious effects.

"Very Unhealthy" AQI is 201 to 300. This would trigger a health alert signifying that everyone may experience more serious health effects.

"Hazardous" AQI is greater than 300. This would trigger health warnings of emergency conditions. The entire population is more likely to be affected.

AQI colors

EPA has assigned a specific color to each AQI category to make it easier for people to understand quickly whether air pollution is reaching unhealthy levels in their communities. For example, the color orange means that conditions are "unhealthy for sensitive groups," while red means that conditions may be "unhealthy for everyone," and so on.

Air Quality Index Levels of Health Concern	Numerical Value	Meaning
Good	0 to 50	Air quality is considered satisfactory, and air pollution poses little or no risk.
Moderate	51 to 100	Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people who are unusually sensitive to air pollution.
Unhealthy for Sensitive Groups	101 to 150	Members of sensitive groups may experience health effects. The general public is not likely to be affected.
Unhealthy	151 to 200	Everyone may begin to experience health effects; members of sensitive groups may experience more serious health effects.
Very Unhealthy	201 to 300	Health alert: everyone may experience more serious health effects.
Hazardous	301 to 500	Health warnings of emergency conditions. The entire population is more likely to be affected.

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the "Hazardous" category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.
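
The six categories described above can be expressed as a small lookup. The following is a minimal Python sketch, not part of the AirNow dataset or any EPA tooling: the names `AqiCategory`, `AQI_CATEGORIES` and `classify_aqi` are hypothetical, and reusing the Maroon color for values above 500 is an assumption based on the note to follow the Hazardous recommendations.

```python
from typing import NamedTuple


class AqiCategory(NamedTuple):
    low: int
    high: int
    level: str
    color: str


# Ranges, levels and colors copied from the category table in the description.
AQI_CATEGORIES = [
    AqiCategory(0, 50, "Good", "Green"),
    AqiCategory(51, 100, "Moderate", "Yellow"),
    AqiCategory(101, 150, "Unhealthy for Sensitive Groups", "Orange"),
    AqiCategory(151, 200, "Unhealthy", "Red"),
    AqiCategory(201, 300, "Very Unhealthy", "Purple"),
    AqiCategory(301, 500, "Hazardous", "Maroon"),
]


def classify_aqi(aqi: int) -> tuple[str, str]:
    """Return (level of health concern, color) for an integer AQI value."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for category in AQI_CATEGORIES:
        if category.low <= aqi <= category.high:
            return category.level, category.color
    # Above 500: "Beyond the AQI". Color is an assumption (same as Hazardous).
    return "Beyond the AQI", "Maroon"
```

For example, `classify_aqi(120)` falls in the 101 to 150 range and returns the "Unhealthy for Sensitive Groups" level with the Orange color.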

Global Air Quality Data(15 Days Hourly, 50 Cities)

kaggle.com

zip

Updated Nov 19, 2025

Cite

Smeet Raichura (2025). Global Air Quality Data(15 Days Hourly, 50 Cities) [Dataset]. https://www.kaggle.com/datasets/smeet888/global-air-quality-data15-days-hourly-50-cities

Available download formats: zip (598546 bytes)

Authors

Smeet Raichura

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

📘 Overview

This dataset provides hourly air-quality measurements for 50 major global cities over a continuous 15-day period, including pollutant concentrations, meteorological conditions, geographical metadata, and an engineered AQI index.

All values are synthetically generated using historically consistent pollutant patterns and statistical ranges, allowing researchers and ML practitioners to work with realistic air-quality trends without licensing restrictions or data-collection barriers.

This dataset is ideal for time-series modeling, forecasting, environmental analytics, and machine-learning experimentation.

🧭 Cities Included

Covers all major regions:

North America — New York, Los Angeles, Toronto

Europe — London, Paris, Berlin, Zurich

Asia — Delhi, Tokyo, Seoul, Beijing, Singapore

Middle East — Dubai, Riyadh, Doha

Africa — Lagos, Cairo, Nairobi

Oceania — Sydney, Melbourne, Auckland

South America — São Paulo, Buenos Aires

🧱 Dataset Structure

Each hourly record includes:

Air Pollutants

PM2.5 (µg/m³)

PM10 (µg/m³)

NO₂ (ppb)

SO₂ (ppb)

O₃ (ppb)

CO (ppm)

Weather Features

Temperature (°C)

Humidity (%)

Wind Speed (m/s)

Location Metadata

City

Country

Latitude

Longitude

Other

Timestamp (ISO-8601)

AQI (Computed index)

🧹 Data Quality & Formatting

No missing values — 100% complete

Numeric values rounded to 3 decimals

Clean column names (snake_case)

Consistent hourly frequency

Fully ML-ready

📊 Example Use Cases

✔ AQI forecasting (LSTM, GRU, Transformers)

✔ Multivariate time-series modeling

✔ Clustering cities by pollution patterns

✔ Environmental trend visualization

✔ Weather–pollution correlation studies

✔ Anomaly detection (peak pollution events)

Column	Description	Unit	Type
timestamp	Hourly timestamp (UTC)	—	datetime
city	City name	—	string
country	Country name	—	string
latitude	City latitude	°	float
longitude	City longitude	°	float
pm25	Fine particulate matter	µg/m³	float
pm10	Coarse particulate matter	µg/m³	float
no2	Nitrogen dioxide	ppb	float
so2	Sulfur dioxide	ppb	float
o3	Ozone	ppb	float
co	Carbon monoxide	ppm	float
temperature	Ambient temperature	°C	float
humidity	Relative humidity	%	float
wind_speed	Wind speed	m/s	float
aqi	Derived Air Quality Index	—	int
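
The column table above maps directly onto a typed reader. The sketch below is an illustration only: `SCHEMA`, `parse_rows` and the single sample row are hypothetical (the values are made up, not taken from the file), and it assumes timestamps are written without a trailing `Z` so that `datetime.fromisoformat` accepts them on older Python versions.

```python
import csv
import io
from datetime import datetime

# Column name -> converter, following the Column/Type table above.
SCHEMA = {
    "timestamp": datetime.fromisoformat,
    "city": str,
    "country": str,
    "latitude": float,
    "longitude": float,
    "pm25": float,
    "pm10": float,
    "no2": float,
    "so2": float,
    "o3": float,
    "co": float,
    "temperature": float,
    "humidity": float,
    "wind_speed": float,
    "aqi": int,
}

# One made-up row in the documented column order (illustrative values only).
SAMPLE_CSV = (
    "timestamp,city,country,latitude,longitude,pm25,pm10,no2,so2,o3,co,"
    "temperature,humidity,wind_speed,aqi\n"
    "2025-11-01T00:00:00,Delhi,India,28.614,77.209,88.125,140.5,41.2,9.8,"
    "22.4,1.105,24.3,52.0,1.8,168\n"
)


def parse_rows(csv_text: str) -> list[dict]:
    """Parse CSV text into typed records, failing fast on missing columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(SCHEMA) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        {name: convert(row[name]) for name, convert in SCHEMA.items()}
        for row in reader
    ]


rows = parse_rows(SAMPLE_CSV)
```

To read the real file, the same function could be given the contents of `global_air_quality_50_cities.csv`; any column not listed in the table is simply ignored by this sketch.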

🧪 Data Generation Method (Provenance)

This dataset is synthetically generated using realistic pollutant behavior patterns based on historical studies and open-source environmental datasets.

Modeling steps included:

City-specific pollutant baseline ranges

Randomized variation using Gaussian noise

Temporal patterns using sinusoidal diurnal cycles (morning & evening peaks)

Weather-pollution correlation rules (e.g., low wind → higher PM)

AQI computed using standard US-EPA breakpoints

All numeric values standardized to 3-decimal precision

This ensures that although synthetic, the dataset follows realistic environmental dynamics.
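
The modeling steps listed above can be sketched for a single pollutant as follows. This is a rough illustration under stated assumptions, not the author's generator: the function names, the 12-hour cosine with peaks near 08:00 and 20:00, the 30% amplitude, the 2 m/s low-wind threshold, the 20% boost and the 5% noise level are all hypothetical choices, and the AQI step is omitted.

```python
import math
import random


def diurnal_factor(hour: int) -> float:
    """Two daily peaks: a 12-hour cosine that is highest at 08:00 and 20:00."""
    return 1.0 + 0.3 * math.cos(2 * math.pi * (hour - 8) / 12)


def synthetic_pm25(baseline: float, hour: int, wind_speed: float,
                   rng: random.Random) -> float:
    """One synthetic PM2.5 reading (µg/m³) for a city baseline and hour."""
    value = baseline * diurnal_factor(hour)      # temporal pattern
    if wind_speed < 2.0:                         # low wind -> higher PM
        value *= 1.2
    value += rng.gauss(0.0, 0.05 * baseline)     # Gaussian variation
    return round(max(value, 0.0), 3)             # non-negative, 3 decimals


# A seeded generator keeps the illustrative series reproducible.
rng = random.Random(42)
day = [synthetic_pm25(35.0, hour, wind_speed=3.0, rng=rng) for hour in range(24)]
```

Passing an explicit `random.Random` instance rather than using the module-level functions makes each city's series reproducible from its seed.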

📁 File Information

global_air_quality_50_cities.csv

Rows: 18,000+

Columns: 16

Format: UTF-8 CSV

Jurisdictional Unit (Public) - Dataset - CKAN
nationaldataplatform.org
Updated Feb 28, 2024
Cite
(2024). Jurisdictional Unit (Public) - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/jurisdictional-unit-public
Description
Jurisdictional Unit, 2022-05-21. For use with WFDSS, IFTDSS, IRWIN, and InFORM.

This is a feature service which provides Identify and Copy Feature capabilities. If fast-drawing at coarse zoom levels is a requirement, consider using the tile (map) service layer located at https://nifc.maps.arcgis.com/home/item.html?id=3b2c5daad00742cd9f9b676c09d03d13.

Overview

The Jurisdictional Agencies dataset is developed as a national land management geospatial layer, focused on representing wildland fire jurisdictional responsibility, for interagency wildland fire applications, including WFDSS (Wildland Fire Decision Support System), IFTDSS (Interagency Fuels Treatment Decision Support System), IRWIN (Interagency Reporting of Wildland Fire Information), and InFORM (Interagency Fire Occurrence Reporting Modules). It is intended to provide federal wildland fire jurisdictional boundaries on a national scale. The agency and unit names are an indication of the primary manager name and unit name, respectively, recognizing that:

There may be multiple owner names.

Jurisdiction may be held jointly by agencies at different levels of government (i.e. State and Local), especially on private lands.

Some owner names may be blocked for security reasons.

Some jurisdictions may not allow the distribution of owner names.

Private ownerships are shown in this layer with JurisdictionalUnitIdentifier=null, JurisdictionalUnitAgency=null, JurisdictionalUnitKind=null, and LandownerKind="Private", LandownerCategory="Private". All land inside the US country boundary is covered by a polygon.

Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases.

For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (i.e. Northern California District, Boise National Forest, etc.).

These data are used to automatically populate fields on the WFDSS Incident Information page.

This data layer implements the NWCG Jurisdictional Unit Polygon Geospatial Data Layer Standard.

Relevant NWCG Definitions and Standards

Unit
2. A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional.
Definition Extension: When referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc) can be derived from a unit based on organization hierarchy.

Unit, Jurisdictional
The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law.
Definition Extension: 1) Ultimately responsible for the fire report to account for statistical fire occurrence; 2) Responsible for setting fire management objectives; 3) Jurisdiction cannot be re-assigned by agreement; 4) The nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) Responsible for signing a Delegation of Authority to the Incident Commander.
See also: Unit, Protecting; Landowner

Unit Identifier
This data standard specifies the standard format and rules for Unit Identifier, a code used within the wildland fire community to uniquely identify a particular government organizational unit.

Landowner Kind & Category
This data standard provides a two-tier classification (kind and category) of landownership.

Attribute Fields

JurisdictionalAgencyKind
Describes the type of unit Jurisdiction using the NWCG Landowner Kind data standard. There are two valid values: Federal, and Other. A value may not be populated for all polygons.

JurisdictionalAgencyCategory
Describes the type of unit Jurisdiction using the NWCG Landowner Category data standard. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State. A value may not be populated for all polygons.

JurisdictionalUnitName
The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the “Unit Name” or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons.

JurisdictionalUnitID
Where it could be determined, this is the NWCG Standard Unit Identifier (Unit ID). Where it is unknown, the value is ‘Null’. Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available from the Unit ID standard, linked above.

LandownerKind
The landowner kind value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. There are three valid values: Federal, Private, or Other.

LandownerCategory
The landowner category value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State, Private.

DataSource
The database from which the polygon originated. Be as specific as possible, identify the geodatabase name and feature class in which the polygon originated.

SecondaryDataSource
If the Data Source is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation. For example, if Data Source is "PAD-US 2.1", then for a USDA Forest Service polygon, the Secondary Data Source would be "USDA FS Automated Lands Program (ALP)". For a BLM polygon in the same dataset, Secondary Source would be "Surface Management Agency (SMA)."

SourceUniqueID
Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.

MapMethod
Controlled vocabulary to define how the geospatial feature was derived. Map method may help define data quality. MapMethod will be Mixed Method by default for this layer as the data are from mixed sources. Valid Values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; DigitizedTopo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other

DateCurrent
The last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using 24 hour clock, YYYY-MM-DDhh.mm.ssZ, ISO8601 Standard.

Comments
Additional information describing the feature.

GeometryID
Primary key for linking geospatial objects with other database systems. Required for every feature. This field may be renamed for each standard to fit the feature.

JurisdictionalUnitID_sansUS
NWCG Unit ID with the "US" characters removed from the beginning. Provided for backwards compatibility.

JoinMethod
Additional information on how the polygon was matched to information in the NWCG Unit ID database.

LocalName
LocalName for the polygon provided from PADUS or other source.

LegendJurisdictionalAgency
Jurisdictional Agency but smaller landholding agencies, or agencies of indeterminate status are grouped for more intuitive use in a map legend or summary table.

LegendLandownerAgency
Landowner Agency but smaller landholding agencies, or agencies of indeterminate status are grouped for more intuitive use in a map legend or summary table.

DataSourceYear
Year that the source data for the polygon were acquired.

Data Input

This dataset is based on an aggregation of 4 spatial data sources: Protected Areas Database US (PAD-US 2.1), data from Bureau of Indian Affairs regional offices, the BLM Alaska Fire Service/State of Alaska, and Census Block-Group Geometry. NWCG Unit ID and Agency Kind/Category data are tabular and sourced from UnitIDActive.txt, in the WFMI Unit ID application (https://wfmi.nifc.gov/unit_id/Publish.html). Areas with unknown Landowner Kind/Category and Jurisdictional Agency Kind/Category are assigned LandownerKind and LandownerCategory values of "Private" by use of the non-water polygons from the Census Block-Group geometry.

PAD-US 2.1:

This dataset is based in large part on the USGS Protected Areas Database of the United States - PAD-US 2.1. PAD-US is a compilation of authoritative protected areas data between agencies and organizations that ultimately results in a comprehensive and accurate inventory of protected areas for the United States to meet a variety of needs (e.g. conservation, recreation, public health, transportation, energy siting, ecological, or watershed assessments and planning). Extensive documentation on PAD-US processes and data sources is available.

How these data were aggregated:

Boundaries, and their descriptors, available in spatial databases (i.e. shapefiles or geodatabase feature classes) from land management agencies are the desired and primary data sources in PAD-US. If these authoritative sources are unavailable, or the agency recommends another source, data may be incorporated by other aggregators such as non-governmental organizations. Data sources are tracked for each record in the PAD-US geodatabase (see below).

BIA and Tribal Data:

BIA and Tribal land management data are not available in PAD-US. As such, data were aggregated from BIA regional offices. These data date from 2012 and were substantially updated in 2022. Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: These data are not considered the system of record and are not intended to be used as such. The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The
Portsmouth Water Drinking Water Quality Data 2022 2023 2024
hub.arcgis.com
streamwaterdata.co.uk
Updated Oct 1, 2025
Cite
AHughes_Portsmouth (2025). Portsmouth Water Drinking Water Quality Data 2022 2023 2024 [Dataset]. https://hub.arcgis.com/datasets/d3165fd17d624b22a9900d47677dfa45
Dataset authored and provided by
AHughes_Portsmouth
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview

Water companies in the UK are responsible for testing the quality of drinking water. This dataset contains the results of samples taken from the taps in domestic households to make sure they meet the standards set out by UK and European legislation. This data shows the location, date, and measured levels of determinands set out by the Drinking Water Inspectorate (DWI).

Key Definitions

Aggregation

Process of summarizing or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.

Anonymisation

Anonymisation is a type of information sanitisation in which tools encrypt or remove personally identifiable information from datasets in order to preserve a data subject's privacy.

Dataset

Structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.

Determinand

A constituent or property of drinking water which can be determined or estimated.

DWI

Drinking Water Inspectorate, an organisation “providing independent reassurance that water supplies in England and Wales are safe and drinking water quality is acceptable to consumers.”

DWI Determinands

Constituents or properties that are tested for when evaluating a sample for its quality as per the guidance of the DWI. For this dataset, only determinands with “point of compliance” as “customer taps” are included.

Granularity

Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.

ID

Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.

LSOA

Lower Layer Super Output Areas are small geographic areas used for statistical and administrative purposes by the Office for National Statistics. They are designed to have similar population sizes, making them suitable for statistical analysis and reporting. Each LSOA is built from groups of contiguous Output Areas, with an average of about 1,500 residents or 650 households, allowing for granular data collection useful for analysis, planning and policy-making while ensuring privacy.

ONS

Office for National Statistics

Open Data Triage

The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.

Sample

A sample is a representative segment or portion of water taken from a larger whole for the purpose of analysing or testing to ensure compliance with safety and quality standards.

Schema

Structure for organizing and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.

Units

Standard measurements used to quantify and compare different physical quantities.

Water Quality

The chemical, physical, biological, and radiological characteristics of water, typically in relation to its suitability for a specific purpose, such as drinking, swimming, or ecological health. It is determined by assessing a variety of parameters, including but not limited to pH, turbidity, microbial content, dissolved oxygen, presence of substances and temperature.

Data History

Data Origin

These samples were taken from customer taps. They were then analysed for water quality, and the results were uploaded to a database. This dataset is an extract from this database.

Data Triage Considerations

Granularity

Is it useful to share results as averages or as individual results?

We decided to share individual results, the lowest level of granularity.

Anonymisation

It is a requirement that this data cannot be used to identify a singular person or household. We discussed many options for aggregating the data to a specific geography to ensure this requirement is met. The following geographical aggregations were discussed:

Water Supply Zone (WSZ) – Limits interoperability with other datasets

Postcode – Some postcodes contain very few households and may not offer necessary anonymisation

Postal Sector – Deemed not granular enough in highly populated areas

Rounded Co-ordinates – Not a recognised standard and may cause overlapping areas

MSOA – Deemed not granular enough

LSOA – Agreed as a recognised standard appropriate for England and Wales

Data Zones – Agreed as a recognised standard appropriate for Scotland
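
The LSOA choice above amounts to replacing each sample's postcode with the code of the area that contains it. The sketch below illustrates that idea only and is not Portsmouth Water's process: the function name, the record fields, the postcode and the LSOA code are hypothetical, and dropping samples with no match in the lookup (rather than publishing them) is an assumption.

```python
def normalise_postcode(postcode: str) -> str:
    """Remove spaces and upper-case so lookups are formatting-insensitive."""
    return postcode.replace(" ", "").upper()


def anonymise_to_lsoa(samples: list[dict], postcode_to_lsoa: dict) -> list[dict]:
    """Replace the 'postcode' field of each sample with an 'lsoa' field."""
    lookup = {normalise_postcode(pc): code for pc, code in postcode_to_lsoa.items()}
    published = []
    for sample in samples:
        lsoa = lookup.get(normalise_postcode(sample["postcode"]))
        if lsoa is None:
            continue  # assumption: unmatched samples are withheld
        record = {key: value for key, value in sample.items() if key != "postcode"}
        record["lsoa"] = lsoa
        published.append(record)
    return published


# Hypothetical lookup and samples, for illustration only.
lookup_table = {"AB1 2CD": "E01000001"}
raw_samples = [
    {"postcode": "ab12cd", "determinand": "Lead", "result": 0.4},
    {"postcode": "ZZ9 9ZZ", "determinand": "Lead", "result": 0.2},
]
published_samples = anonymise_to_lsoa(raw_samples, lookup_table)
```

In practice the lookup would come from a postcode-to-LSOA table such as the ONS one listed under Supplementary information.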

Data Specifications

Each dataset will cover a calendar year of samples

This dataset will be published annually

Historical datasets will be published as far back as 2016, from the introduction of The Water Supply (Water Quality) Regulations 2016.

The Determinands included in the dataset are as per the list that is required to be reported to the Drinking Water Inspectorate.

Context

Many UK water companies provide a search tool on their websites where you can search for water quality in your area by postcode. The results of the search may identify the water supply zone that supplies the postcode searched. Water supply zones are not linked to LSOAs, which means the results may differ from this dataset.

Some sample results are influenced by internal plumbing and may not be representative of drinking water quality in the wider area.

Some samples are tested on site and others are sent to scientific laboratories.

Data Publish Frequency

Annually

Data Triage Review Frequency

Annually unless otherwise requested

Supplementary information

Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.

1. Drinking Water Inspectorate Standards and Regulations: https://www.dwi.gov.uk/drinking-water-standards-and-regulations/

2. LSOA (England and Wales) and Data Zone (Scotland): https://www.nrscotland.gov.uk/files/geography/2011-census/geography-bckground-info-comparison-of-thresholds.pdf

3. Description for LSOA boundaries by the ONS: Census 2021 geographies - Office for National Statistics (ons.gov.uk)

4. Postcode to LSOA lookup tables: Postcode to 2021 Census Output Area to Lower Layer Super Output Area to Middle Layer Super Output Area to Local Authority District (August 2023) Lookup in the UK (statistics.gov.uk)

5. Legislation history: Legislation - Drinking Water Inspectorate (dwi.gov.uk)

Evidence supporting the rule of symmetry for OSM data sets.
plos.figshare.com
figshare.com
xls
Updated Jun 3, 2023
Cite
Xiang Zhang; Weijun Yin; Shouqian Huang; Jianwei Yu; Zhongheng Wu; Tinghua Ai (2023). Evidence supporting the rule of symmetry for OSM data sets. [Dataset]. http://doi.org/10.1371/journal.pone.0200334.t005
Unique identifier
https://doi.org/10.1371/journal.pone.0200334.t005
Dataset provided by
PLOS http://plos.org/
Authors
Xiang Zhang; Weijun Yin; Shouqian Huang; Jianwei Yu; Zhongheng Wu; Tinghua Ai
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
†Proportion obtained by removing parallel pairs with one empty and one non-empty values. ‡Proportion obtained by treating pairs with one empty and one non-empty values as symmetrical examples.
Perception Dataset Management Platforms Market Research Report 2033
dataintelo.com
csv, pdf, pptx
Updated Sep 30, 2025
Cite
Dataintelo (2025). Perception Dataset Management Platforms Market Research Report 2033 [Dataset]. https://dataintelo.com/report/perception-dataset-management-platforms-market
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Perception Dataset Management Platforms Market Outlook

According to our latest research, the global Perception Dataset Management Platforms market size reached USD 1.27 billion in 2024, and is expected to grow at a robust CAGR of 23.8% from 2025 to 2033. By the end of 2033, the market is forecasted to achieve a value of approximately USD 10.98 billion. This remarkable growth is driven by the rapid adoption of artificial intelligence (AI) and machine learning (ML) technologies across industries, which necessitate high-quality, well-annotated perception datasets for training and validating advanced models.

The primary growth factor fueling the Perception Dataset Management Platforms market is the surging demand for AI-powered solutions in sectors such as autonomous vehicles, robotics, and surveillance. As organizations increasingly rely on AI systems that require complex perception capabilities—such as object detection, scene understanding, and environmental awareness—the need for sophisticated dataset management platforms has intensified. These platforms streamline the collection, curation, annotation, and governance of large-scale perception datasets, ensuring high data quality and compliance with regulatory standards. The proliferation of edge devices and IoT sensors further amplifies the volume and diversity of data generated, necessitating scalable and efficient management solutions.

Another significant driver is the escalating complexity of AI applications in healthcare, retail, and security sectors. In healthcare, for example, perception datasets are crucial for developing diagnostic imaging solutions, patient monitoring systems, and robotic surgery tools. The retail industry leverages these platforms for in-store analytics, customer behavior tracking, and inventory management, while security and defense sectors utilize them for surveillance, threat detection, and situational awareness. The ability of perception dataset management platforms to handle multi-modal data—including images, videos, LiDAR, and radar—positions them as indispensable tools for organizations aiming to accelerate AI innovation while maintaining data integrity and privacy.

Furthermore, the market is benefiting from increased investments in research and academia, where the demand for high-quality, annotated datasets is paramount for advancing AI research. Collaborative initiatives between universities, research institutions, and industry players are fostering the development of standardized dataset management practices and open-source platforms, thereby accelerating innovation and knowledge sharing. Additionally, the growing emphasis on ethical AI and data transparency is prompting organizations to adopt platforms that offer robust data lineage, audit trails, and compliance features, further driving market growth.

Regionally, North America remains the dominant market, accounting for the largest revenue share in 2024, followed closely by Europe and Asia Pacific. The presence of leading technology companies, advanced research institutions, and a strong focus on AI-driven innovation underpin North America’s leadership. Europe is witnessing substantial growth due to stringent data privacy regulations and increased investments in AI research, while Asia Pacific is emerging as a high-growth region, propelled by government initiatives, expanding digital infrastructure, and the rapid adoption of AI technologies across industries. Latin America and the Middle East & Africa are gradually catching up, supported by growing awareness and investments in digital transformation.

Component Analysis

The Perception Dataset Management Platforms market is primarily segmented by component into software and services. The software segment holds the lion’s share of the market, driven by the proliferation of advanced AI and ML tools that require sophisticated data management capabilities. These software solutions offer functionalities such as automated data labeling, annotation, quality control, and versioning—enabling organizations to efficiently manage large volumes of perception data. The integration of AI-powered analytics and visualization tools within these platforms further enhances their value proposition, allowing users to gain actionable insights from complex multi-modal datasets. As AI applications become more mainstream, the demand for robust, scalable, and user-friendly software platforms is expected to surg
no-oranges
huggingface.co
Updated Jul 28, 2025
Cite
Pranav Karra (2025). no-oranges [Dataset]. http://doi.org/10.57967/hf/6019
Explore at:
Unique identifier
https://doi.org/10.57967/hf/6019
Dataset updated
Jul 28, 2025
Authors
Pranav Karra
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
No-Oranges Dataset

Dataset Description

This is a comprehensive instruction-tuning dataset designed to train language models to avoid generating specific forbidden words while maintaining natural language capabilities. The dataset combines multiple sources of high-quality training data including AI-generated adversarial examples and rule-based prompts.

Dataset Summary

Total Samples: 1,948 high-quality unique samples
Task Type: Instruction following with… See the full description on the dataset page: https://huggingface.co/datasets/pranavkarra/no-oranges.
2023 Census population change by age group and RC
2023census-statsnz.hub.arcgis.com
maps-by-statsnz.hub.arcgis.com
Updated May 29, 2024
Cite
Statistics New Zealand (2024). 2023 Census population change by age group and RC [Dataset]. https://2023census-statsnz.hub.arcgis.com/datasets/StatsNZ::2023-census-population-change-by-age-group-and-rc?layer=1
Explore at:
Dataset updated
May 29, 2024
Dataset authored and provided by
Statistics New Zealand http://www.stats.govt.nz/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The life-cycle age groups are:
under 15 years
15 to 29 years
30 to 64 years
65 years and over.
Map shows the percentage change in the census usually resident population count for life-cycle age groups between the 2018 and 2023 Censuses.
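The percentage change shown on the map can be computed from the two census counts. A minimal sketch, assuming the usual definition of percentage change relative to the 2018 count; the function name is hypothetical and the dataset's own field names are not used here.

```python
from typing import Optional


def percentage_change(count_2018: float, count_2023: float) -> Optional[float]:
    """Percentage change from the 2018 count to the 2023 count.

    Returns None when the 2018 count is zero, because the change is undefined.
    """
    if count_2018 == 0:
        return None
    return (count_2023 - count_2018) / count_2018 * 100.0
```

Because published counts are randomly rounded, a change computed from small counts should be read as approximate.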
Download lookup file from Stats NZ ArcGIS Online or Stats NZ geographic data service.
Footnotes
Geographical boundaries
Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.
Subnational census usually resident population
The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city. 
Caution using time series
Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).
About the 2023 Census dataset
For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.
Data quality
The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.
Quality rating of a variable
The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.
Age concept quality rating
Age is rated as very high quality.
Age – 2023 Census: Information by concept has more information, for example, definitions and data quality.
Using data for good
Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that "data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga".
Confidentiality
The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
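The idea behind fixed random rounding to base 3 can be illustrated with a short sketch. This is not Stats NZ's implementation. It assumes the commonly described scheme (counts that are already multiples of 3 stay unchanged; other counts move to the nearer multiple with probability 2/3 and to the farther one with probability 1/3), and it assumes that "fixed" can be modelled by deriving the draw from a hash of a hypothetical cell key so that the same cell always rounds the same way.

```python
import hashlib


def frr3(count: int, cell_key: str) -> int:
    """Round a non-negative count to a multiple of 3, repeatably per cell_key."""
    remainder = count % 3
    if remainder == 0:
        return count  # already a multiple of 3: left unchanged
    lower = count - remainder
    upper = lower + 3
    # remainder 1 is closer to the lower multiple, remainder 2 to the upper one
    nearer, farther = (lower, upper) if remainder == 1 else (upper, lower)
    # Derive a repeatable pseudo-random draw in {0, 1, 2} from the cell key.
    digest = hashlib.sha256(f"{cell_key}|{count}".encode("utf-8")).digest()
    draw = int.from_bytes(digest[:8], "big") % 3
    return nearer if draw < 2 else farther
```

Since each cell is rounded independently of its total, this also shows why individual figures may not always sum to stated totals.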
ckanext-dataset - Extensions - CKAN Ecosystem Catalog Beta
catalog.civicdataecosystem.org
Updated Jun 4, 2025
Cite
(2025). ckanext-dataset - Extensions - CKAN Ecosystem Catalog Beta [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-dataset
Explore at:
Dataset updated
Jun 4, 2025
Description
The Dataset extension for CKAN enhances the core functionality of CKAN by providing the ability to add custom fields and behaviors specifically tailored for datasets. This allows administrators to further refine dataset metadata and customize dataset workflows to better suit the needs of their organization or community. This customization aims to improve data discoverability, management, and overall usability within the CKAN platform.

Key Features:

Custom Fields for Datasets: Introduce new metadata fields beyond the standard CKAN schema, enabling more granular description and categorization of datasets. This can include fields related to data quality, access restrictions, or specific data types.

Custom Dataset Behavior: Modify the default behavior of datasets within CKAN, for example, to implement customized validation rules, automated processes, or user interface elements. This level of customization is designed to streamline workflows and enhance user experience.

Technical Integration: Although the details are not explicitly outlined, it is likely that the Dataset extension integrates with CKAN by utilizing CKAN's plugin architecture. It might involve creating new plugins that intercept CKAN's default dataset handling logic, or leveraging CKAN's templating engine to modify dataset display and editing interfaces. Initialization likely involves configuring the custom fields and behaviors within CKAN's configuration files or administrative interface.

Benefits & Impact: By allowing administrators to define custom metadata fields, the Dataset extension can significantly enhance the discoverability and reusability of datasets. Furthermore, customized dataset behavior can streamline standard administrative operations, reducing manual workloads and improving the accuracy of dataset information.
2023 Census main means of travel to work by statistical area 3
datafinder.stats.govt.nz
csv, dbf (dbase iii) +4
Updated Jun 11, 2025
Cite
Stats NZ (2025). 2023 Census main means of travel to work by statistical area 3 [Dataset]. https://datafinder.stats.govt.nz/table/122496-2023-census-main-means-of-travel-to-work-by-statistical-area-3/
Explore at:
Available download formats
mapinfo mif, csv, dbf (dbase iii), geodatabase, mapinfo tab, geopackage / sqlite
Dataset updated
Jun 11, 2025
Dataset provided by
Statistics New Zealand http://www.stats.govt.nz/
Authors
Stats NZ
License
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Description
Dataset shows an individual’s statistical area 3 (SA3) of usual residence and the SA3 of their workplace address, for the employed census usually resident population count aged 15 years and over, by main means of travel to work from the 2018 and 2023 Censuses.

The main means of travel to work categories are:

Work at home

Drive a private car, truck, or van

Drive a company car, truck, or van

Passenger in a car, truck, van, or company bus

Public bus

Train

Bicycle

Walk or jog

Ferry

Other.

Main means of travel to work is the usual method which an employed person aged 15 years and over used to travel the longest distance to their place of work.

Workplace address refers to where someone usually works in their main job, that is the job in which they worked the most hours. For people who work at home, this is the same address as their usual residence address. For people who do not work at home, this could be the address of the business they work for or another address, such as a building site.

Workplace address is coded to the most detailed geography possible from the available information. This dataset only includes travel to work information for individuals whose workplace address is available at SA3 level. The sum of the counts for each region in this dataset may not equal the total employed census usually resident population count aged 15 years and over for that region. Workplace address – 2023 Census: Information by concept has more information.

This dataset can be used in conjunction with the following spatial files by joining on the SA3 code values:

Statistical area 3 2023 (generalised)
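
Joining on the SA3 code values can be sketched with plain dictionaries. This is a minimal illustration, not a documented workflow: the column name `sa3_code` and the sample rows are hypothetical, and the real column names should be taken from the downloaded table and spatial file.

```python
def join_on_sa3(flow_rows, area_rows, flow_key="sa3_code", area_key="sa3_code"):
    """Attach area attributes to each flow row that has a matching SA3 code."""
    areas_by_code = {row[area_key]: row for row in area_rows}
    joined = []
    for row in flow_rows:
        area = areas_by_code.get(row[flow_key])
        if area is None:
            continue  # no matching area: the row is dropped (inner join)
        merged = dict(area)
        merged.update(row)  # values from the flow row win on shared keys
        joined.append(merged)
    return joined


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    flows = [{"sa3_code": "A1", "count": 12}, {"sa3_code": "Z9", "count": 3}]
    areas = [{"sa3_code": "A1", "name": "Example area"}]
    print(join_on_sa3(flows, areas))
```

Reading the codes as strings on both sides keeps the match exact and avoids surprises from numeric parsing.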

Download data table using the instructions in the Koordinates help guide.

Footnotes

Geographical boundaries

Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

Subnational census usually resident population

The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city. 

Population counts

Stats NZ publishes a number of different population counts, each using a different definition and methodology. Population statistics – user guide has more information about different counts. 

Caution using time series

Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data).

Workplace address time series

Workplace address time series data should be interpreted with care at lower geographic levels, such as statistical area 2 (SA2). Methodological improvements in 2023 Census resulted in greater data accuracy, including a greater proportion of people being counted at lower geographic areas compared to the 2018 Census. Workplace address – 2023 Census: Information by concept has more information.

Working at home

In the census, working at home captures both remote work, and people whose business is at their home address (e.g. farmers or small business owners operating from their home). The census asks respondents whether they ‘mostly’ work at home or away from home. It does not capture whether someone does both, or how frequently they do one or the other.

Rows excluded from the dataset

Rows show SA3 of usual residence by SA3 of workplace address. Rows with a total population count of less than six have been removed to reduce the size of the dataset, given only a small proportion of SA3-SA3 combinations have commuter flows.

About the 2023 Census dataset

For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

Data quality

The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

Quality rating of a variable

The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

Main means of travel to work quality rating

Main means of travel to work is rated as moderate quality.

Main means of travel to work – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Workplace address quality rating

Workplace address is rated as moderate quality.

Workplace address – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Using data for good

Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that "data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga".

Confidentiality

The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.

Percentages

To calculate percentages, divide the figure for the category of interest by the figure for ‘Total stated’ where this applies.

Symbol

-999 Confidential
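
The percentage rule and the confidential symbol can be combined in a small helper. A minimal sketch, assuming that a value of -999 in either field means no percentage should be reported; the function and constant names are hypothetical.

```python
from typing import Optional

CONFIDENTIAL = -999  # symbol listed in the footnotes


def percentage_of_total_stated(category_count: float, total_stated: float) -> Optional[float]:
    """Category count as a percentage of 'Total stated'.

    Returns None when either value is the confidential symbol or the total is zero.
    """
    if CONFIDENTIAL in (category_count, total_stated) or total_stated == 0:
        return None
    return category_count / total_stated * 100.0
```

Treating the symbol as missing before any arithmetic prevents -999 from being summed or divided as if it were a real count.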

Inconsistencies in definitions

Please note that there may be differences in definitions between census classifications and those used for other data collections.
2023 Census totals by topic for families and extended families by SA1
maps-by-statsnz.hub.arcgis.com
Updated Nov 8, 2024
+ more versions
Cite
Statistics New Zealand (2024). 2023 Census totals by topic for families and extended families by SA1 [Dataset]. https://maps-by-statsnz.hub.arcgis.com/maps/f25c38c1414446ef958d4cfde3923c2d
Explore at:
Dataset updated
Nov 8, 2024
Dataset authored and provided by
Statistics New Zealand http://www.stats.govt.nz/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The variables included in this dataset are for families and extended families in households in occupied private dwellings:
Count of families
Family type
Number of people in family
Average number of people in family
Total family income
Median ($) total family income
Count of extended families
Extended family type
Total extended family income
Median ($) total extended family income.
Download lookup file from Stats NZ ArcGIS Online or Stats NZ geographic data service.
Footnotes
Geographical boundaries
Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.
Caution using time series
Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).
About the 2023 Census dataset
For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.
Data quality
The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.
Concept descriptions and quality ratings
Data quality ratings for 2023 Census variables has additional details about variables found within totals by topic, for example, definitions and data quality.
Using data for good
Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that "data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga".
Confidentiality
The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
Measures
Measures like averages, medians, and other quantiles are calculated from unrounded counts, with input noise added to or subtracted from each contributing value during measures calculations. Averages and medians based on less than six units (e.g. individuals, dwellings, households, families, or extended families) are suppressed. This suppression threshold changes for other quantiles. Where the cells have been suppressed, a placeholder value has been used.
Percentages
To calculate percentages, divide the figure for the category of interest by the figure for 'Total stated' where this applies.
Symbol
-997 Not available
-999 Confidential
Inconsistencies in definitions
Please note that there may be differences in definitions between census classifications and those used for other data collections.
Data from: Dataset: Floating plastic transport and accumulation in Amsterdam...
data.4tu.nl
zip
Updated Jun 12, 2025
Cite
Paolo Tasseron (2025). Dataset: Floating plastic transport and accumulation in Amsterdam 2022-2023 [Dataset]. http://doi.org/10.4121/77d424f5-fad5-44a8-91d6-01a29d161784.v1
Explore at:
Available download formats
zip
Unique identifier
https://doi.org/10.4121/77d424f5-fad5-44a8-91d6-01a29d161784.v1
Dataset updated
Jun 12, 2025
Dataset provided by
4TU.ResearchData
Authors
Paolo Tasseron
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Nov 2022 - Oct 2023
Area covered
The Netherlands, Amsterdam
Description
This dataset contains data from monthly plastic monitoring efforts at twenty locations in Amsterdam between November 2022 and October 2023. The ArcGIS Survey123 application (ESRI, Redlands, California) with a customized form was used to efficiently log the observed floating plastic items in digital forms. Items at all locations were categorized using the River-OSPAR protocol, originally developed by the North Sea Foundation [1]. Two types of monitoring took place.

First, accumulated items were monitored at eleven locations, counting and categorizing all floating items within the most polluted 100 square meters of surface area. This area can be variable depending on the conditions during the measurement day. Additionally, the surface area does not follow any geometry rules, i.e. it can have any shape or form. We determined the monitored area by drawing polygons in Google Earth and redefined this area to capture the most polluted surface when deviating from the original surface area (Example: Figure 1a and b).

Second, the transport of items was monitored at nine locations, using the visual counting method to estimate plastic transport [2]. The observer counts and categorizes all floating items for a predetermined time interval and observation width on top of a bridge. Each bridge was divided into 1 to 3 segments, depending on the length of the bridge. Segments can be unequal in width by covering the distance between bridge pillars (Example: Figure 1c). Consequently, each segment covers a part of the waterway within the field of view of the observer, enabling the identification of all floating items within a given segment. Here, every segment was measured four times for five minutes every monitoring round.
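
The repeated timed counts per segment can be turned into an observed rate. A minimal sketch, assuming only that each count covers a fixed observation time (five minutes in the description); the function name is hypothetical, and this is not the visual counting method of [2] itself. Scaling to the full waterway and to mass is done in the dataset's own calculation sheet and is not reproduced here.

```python
def items_per_hour(counts, minutes_per_count=5.0):
    """Observed items per hour for one segment from repeated timed counts."""
    if not counts:
        raise ValueError("at least one count is required")
    total_minutes = minutes_per_count * len(counts)
    return sum(counts) * 60.0 / total_minutes
```

For example, four five-minute counts of 2, 0, 1 and 1 items cover 20 minutes in total, so `items_per_hour([2, 0, 1, 1])` scales the four observed items to an hourly rate.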

The dataset contains one main file, 'dataset.xlsx' with six sheets:

1) Locations + metadata | Description of the monitored locations with identifiers, street names, type of monitoring and lat/lon coordinates

2) Transport (flux) data | All entries of monitored items for transport locations.

3) Accumulation (density) data | All entries of monitored items for accumulation locations.

4) Totals and top ten items | Total counts per item category, an overview of the categories and share of top ten items both summed for all locations and per location.

5) Transport calculations into IJ | Stepwise calculations from raw data to mass transport estimates in [kg/yr] into the IJ river. A short description of all calculation steps is included in this sheet.

6) Mass calculations for accumulation and transport, for each item category with all locations combined.

For sheets 2 and 3, all timestamps are in UTC. To convert to local time, the purple timestamps are UTC+1 (CET), and the green timestamps are UTC+2 (CEST).
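
The conversion to local time can be sketched with fixed offsets. This is a minimal illustration, assuming the caller already knows from the cell colour whether a timestamp falls in the CET or CEST period; the function name and the `summer_time` flag are hypothetical.

```python
from datetime import datetime, timedelta, timezone

CET = timezone(timedelta(hours=1), "CET")    # purple timestamps, UTC+1
CEST = timezone(timedelta(hours=2), "CEST")  # green timestamps, UTC+2


def to_local(utc_timestamp: datetime, summer_time: bool) -> datetime:
    """Convert a UTC timestamp to CET or CEST using a fixed offset."""
    if utc_timestamp.tzinfo is None:
        # Naive values are taken to be UTC, as stated for sheets 2 and 3.
        utc_timestamp = utc_timestamp.replace(tzinfo=timezone.utc)
    return utc_timestamp.astimezone(CEST if summer_time else CET)
```

Fixed offsets mirror the colour coding directly; a time zone database lookup for Amsterdam would be an alternative when the colours are not available.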

The dataset contains one figure, providing an example of monitoring. Figure caption: (a) Top view of 'accumulation' monitoring area at location NW2, (b) side-view with the same monitoring area, (c) top view of 'transport' monitoring at location O1B. This bridge is divided into three segments (S1, S2, S3) which can be monitored by one or multiple observers simultaneously.
2023 Census housing data by statistical area 2
datafinder.stats.govt.nz
csv, dwg, geodatabase +6
+ more versions
Cite
Stats NZ, 2023 Census housing data by statistical area 2 [Dataset]. https://datafinder.stats.govt.nz/layer/122391-2023-census-housing-data-by-statistical-area-2/
Explore at:
Available download formats
kml, pdf, mapinfo tab, geopackage / sqlite, dwg, shapefile, mapinfo mif, csv, geodatabase
Dataset provided by
Statistics New Zealand http://www.stats.govt.nz/
Authors
Stats NZ
License
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Description
Dataset for the maps accompanying the Housing in Aotearoa New Zealand: 2025 report. This dataset contains counts and measures for:

average number of private dwellings per square kilometre

home ownership rates

mould and damp.

Data is available by statistical area 2.

Average number of private dwellings per square kilometre has data for occupied, unoccupied, and total private dwellings from the 2013, 2018, and 2023 Censuses, including:

dwelling counts

percentage change in the count of dwellings

average number of dwellings per square kilometre.

Home ownership rates has data for households in occupied private dwellings from the 2013, 2018, and 2023 Censuses, including:

counts and percentages for households that owned their home or held it in a family trust, or did not own their home

percentage change in the count of households that owned their home or held it in a family trust, or did not own their home.

Mould and damp has data for occupied private dwellings from the 2018 and 2023 Censuses, including:

counts and percentages for dwellings with or without mould or damp

percentage change in the count of dwellings with or without mould or damp.

Map shows the average number of private dwellings per square kilometre for the 2023 Census.

Map shows the percentage of households in occupied private dwellings that owned their home or held it in a family trust for the 2023 Census.

Map shows the percentage of occupied private dwellings that were damp or mouldy for the 2023 Census.

Download lookup file from Stats NZ ArcGIS Online or embedded attachment in Stats NZ geographic data service. Download data table (excluding the geometry column for CSV files) using the instructions in the Koordinates help guide.

Footnotes

Geographical boundaries

Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

Caution using time series

Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).

Dwelling density

This data shows the average number of private dwellings (occupied and unoccupied) per square kilometre of land for an area. This is a measure of dwelling density.

About the 2023 Census dataset

For information on the 2023 Census dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

Data quality

The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

Quality rating of a variable

The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

Dwelling occupancy status quality rating

Dwelling occupancy status is rated as high quality.

Dwelling occupancy status – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Dwelling type quality rating

Dwelling type is rated as moderate quality.

Dwelling type – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Tenure of household quality rating

Tenure of household is rated as moderate quality.

Tenure of household – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Dwelling dampness indicator quality rating

Dwelling dampness indicator is rated as moderate quality.

Housing quality – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Dwelling mould indicator quality rating

Dwelling mould indicator is rated as moderate quality.

Housing quality – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Using data for good

Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that “data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.

Confidentiality

The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
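The effect of rounding to base 3 on totals can be illustrated with a short sketch. It assumes the commonly described random-rounding rule: a count that is already a multiple of 3 is left unchanged, and any other count moves to the nearest multiple of 3 with probability 2/3 or to the other neighbouring multiple with probability 1/3. The function name `frr3`, the `cell_key` argument, and the use of a hash to make the result repeatable ("fixed") are assumptions made for illustration only; they are not Stats NZ's actual implementation.

```python
import hashlib


def frr3(count: int, cell_key: str) -> int:
    """Round a non-negative count to a multiple of 3 (illustrative sketch).

    cell_key is a hypothetical identifier for the table cell. Hashing it
    gives the same pseudo-random draw every time, so the same cell is
    always rounded the same way.
    """
    remainder = count % 3
    if remainder == 0:
        return count  # multiples of 3 are left unchanged
    lower = count - remainder
    upper = lower + 3
    # remainder 1 is nearer the lower multiple, remainder 2 the upper one
    nearest, other = (lower, upper) if remainder == 1 else (upper, lower)
    digest = hashlib.sha256(cell_key.encode("utf-8")).digest()
    draw = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return nearest if draw < 2 / 3 else other


# Each part and the total are rounded independently, which is why
# individual figures may not always sum to the stated total.
parts = {"owned": 10, "not_owned": 7}
rounded_parts = {name: frr3(value, name) for name, value in parts.items()}
rounded_total = frr3(sum(parts.values()), "total")
print(rounded_parts, rounded_total)
```

Because every cell is rounded on its own, the rounded parts can differ from the rounded total by a few units even though the unrounded figures add up exactly.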

Symbol

-998 Not applicable

-999 Confidential

Inconsistencies in definitions

Please note that there may be differences in definitions between census classifications and those used for other data collections.
THVD (Talking Head Video Dataset)
data.mendeley.com
Updated Apr 29, 2025
+ more versions
Mario Peedor (2025). THVD (Talking Head Video Dataset) [Dataset]. http://doi.org/10.17632/ykhw8r7bfx.2
Explore at:
Unique identifier
https://doi.org/10.17632/ykhw8r7bfx.2
Dataset updated
Apr 29, 2025
Authors
Mario Peedor
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
About

We provide a comprehensive talking-head video dataset with over 50,000 videos, totaling more than 500 hours of footage and featuring 20,841 unique identities from around the world.

Distribution

Detailing the format, size, and structure of the dataset:

Data Volume:

-Total Size: 2.7TB

-Total Videos: 47,547

-Identities Covered: 20,841

-Resolution: 60% 4k(1980), 33% fullHD(1080)

-Formats: MP4

-Full-length videos with visible mouth movements in every frame.

-Minimum face size of 400 pixels.

-Video durations range from 20 seconds to 5 minutes.

-Faces have not been cut out, full screen videos including backgrounds.

Usage

This dataset is ideal for a variety of applications:

Face Recognition & Verification: Training and benchmarking facial recognition models.

Action Recognition: Identifying human activities and behaviors.

Re-Identification (Re-ID): Tracking identities across different videos and environments.

Deepfake Detection: Developing methods to detect manipulated videos.

Generative AI: Training high-resolution video generation models.

Lip Syncing Applications: Enhancing AI-driven lip-syncing models for dubbing and virtual avatars.

Background AI Applications: Developing AI models for automated background replacement, segmentation, and enhancement.

Coverage

Explaining the scope and coverage of the dataset:

Geographic Coverage: Worldwide

Time Range: Time range and size of the videos have been noted in the CSV file.

Demographics: Includes information about age, gender, ethnicity, format, resolution, and file size.

Languages Covered (Videos):

English: 23,038 videos

Portuguese: 1,346 videos

Spanish: 677 videos

Norwegian: 1,266 videos

Swedish: 1,056 videos

Korean: 848 videos

Polish: 1,807 videos

Indonesian: 1,163 videos

French: 1,102 videos

German: 1,276 videos

Japanese: 1,433 videos

Dutch: 1,666 videos

Indian: 1,163 videos

Czech: 590 videos

Chinese: 685 videos

Italian: 975 videos

Filipino: 920 videos

Bulgarian: 340 videos

Romanian: 1,144 videos

Arabic: 1,691 videos

Who Can Use It

List examples of intended users and their use cases:

Data Scientists: Training machine learning models for video-based AI applications.

Researchers: Studying human behavior, facial analysis, or video AI advancements.

Businesses: Developing facial recognition systems, video analytics, or AI-driven media applications.

Additional Notes

Ensure ethical usage and compliance with privacy regulations. The dataset’s quality and scale make it valuable for high-performance AI training. Preprocessing (cropping, downsampling) may be needed for some use cases. The dataset is not yet complete and expands daily; please contact me for the most up-to-date CSV file. The dataset has been divided into 100GB zipped files and is hosted on a private server (with the option to upload to the cloud if needed). To verify the dataset's quality, please contact me for the full CSV file.
Cleaned Water Quality and Weather Dataset for AI-based Alum Prediction...
search.dataone.org
hydroshare.org
Updated Oct 18, 2025
saikumar payyavula; Jeff Sadler (2025). Cleaned Water Quality and Weather Dataset for AI-based Alum Prediction (2011–2024) [Dataset]. https://search.dataone.org/view/sha256%3A151fa0ed96ffbc1609c16d3ec965f20e79c48d12ad3773cc535692e0a5c8a012
Explore at:
Dataset updated
Oct 18, 2025
Dataset provided by
Hydroshare
Authors
saikumar payyavula; Jeff Sadler
Time period covered
May 18, 2011 - Jul 31, 2024
Description
This dataset was developed to support research on predicting alum dosage in small water treatment plants. It combines daily plant records with weather data, including maximum temperature (TMAX). To make the data reliable for analysis and modeling, outliers and incorrect readings were carefully removed using logical and domain-based rules.

Records with clearly impossible or error values, such as extremely high or negative numbers, were deleted. Each variable was kept within realistic operating limits—for example, alum between 0 and 3500 mg/L, hardness between 5 and 1000 mg/L, and alkalinity between 2 and 1000 mg/L. Unusual readings like pH = 0.54 were also removed. Missing value rows were entirely removed from the dataset.

Through this cleaning process, the dataset became consistent, accurate, and ready for machine-learning models that can better predict chemical dosing and support safer, more efficient water treatment operations.
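The range rules described above can be sketched as a simple row filter. The field names (`alum`, `hardness`, `alkalinity`), the representation of missing values as `None`, and the choice to treat the limits as inclusive are assumptions for illustration; the dataset's actual column names and the authors' cleaning code are not given here.

```python
# Operating limits quoted in the description, in mg/L.
# The dictionary keys are hypothetical column names.
LIMITS = {
    "alum": (0, 3500),
    "hardness": (5, 1000),
    "alkalinity": (2, 1000),
}


def clean(rows):
    """Keep only rows with no missing values and all readings within limits."""
    kept = []
    for row in rows:
        # Rows with any missing value are removed entirely.
        if any(row.get(name) is None for name in LIMITS):
            continue
        # Rows with any reading outside its realistic range are removed.
        if all(low <= row[name] <= high for name, (low, high) in LIMITS.items()):
            kept.append(row)
    return kept


sample = [
    {"alum": 40.0, "hardness": 120.0, "alkalinity": 80.0},   # kept
    {"alum": -5.0, "hardness": 120.0, "alkalinity": 80.0},   # negative alum
    {"alum": 40.0, "hardness": None, "alkalinity": 80.0},    # missing value
    {"alum": 40.0, "hardness": 120.0, "alkalinity": 5000.0}, # out of range
]
print(len(clean(sample)))
```

Only the first sample row passes every rule, so one row remains after filtering.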
2023 Census main means of travel to education by statistical area 3
datafinder.stats.govt.nz
csv, dbf (dbase iii) +4
Updated Jun 11, 2025
Stats NZ (2025). 2023 Census main means of travel to education by statistical area 3 [Dataset]. https://datafinder.stats.govt.nz/table/122495-2023-census-main-means-of-travel-to-education-by-statistical-area-3/
Explore at:
Available download formats: csv, geopackage / sqlite, dbf (dbase iii), mapinfo tab, mapinfo mif, geodatabase
Dataset updated
Jun 11, 2025
Dataset provided by
Statistics New Zealand (http://www.stats.govt.nz/)
Authors
Stats NZ
License
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Description
Dataset shows an individual’s statistical area 3 (SA3) of usual residence and the SA3 of their place of study, for the census usually resident population count who are studying (part time or full time), by main means of travel to education from the 2018 and 2023 Censuses.

The main means of travel to education categories are:

Study at home

Drive a car, truck, or van

Passenger in a car, truck, or van

Bicycle

Walk or jog

School bus

Public bus

Train

Ferry

Other.

Main means of travel to education is the usual method a person used to travel the longest distance to their place of study.

Educational institution address is the physical location of the individual’s place of study. Educational institutions include early childhood education, primary school, secondary school, and tertiary education institutions. For individuals who study at home, their educational institution address is the same as their usual residence address.

Educational institution address is coded to the most detailed geography possible from the available information. This dataset only includes travel to education information for individuals whose educational institution address is available at SA3 level. The sum of the counts for each region in this dataset may not equal the census usually resident population count who are studying (part time or full time) for that region. Educational institution address – 2023 Census: Information by concept has more information.

This dataset can be used in conjunction with the following spatial files by joining on the SA3 code values:

Statistical area 3 2023 (generalised)

Download data table using the instructions in the Koordinates help guide.

Footnotes

Geographical boundaries

Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

Subnational census usually resident population

The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city. 

Population counts

Stats NZ publishes a number of different population counts, each using a different definition and methodology. Population statistics – user guide has more information about different counts. 

Caution using time series

Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data).

Educational institution address time series

Educational institution address time series data should be interpreted with care at lower geographic levels, such as statistical area 2 (SA2). Methodological improvements in 2023 Census resulted in greater data accuracy, including a greater proportion of people being counted at lower geographic areas compared to the 2018 Census. Educational institution address – 2023 Census: Information by concept has more information.

Rows excluded from the dataset

Rows show SA3 of usual residence by SA3 of educational institution address. Rows with a total population count of less than six have been removed to reduce the size of the dataset, given only a small proportion of SA3-SA3 combinations have commuter flows.

About the 2023 Census dataset

For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

Data quality

The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

Quality rating of a variable

The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

Main means of travel to education quality rating

Main means of travel to education is rated as moderate quality.

Main means of travel to education – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Educational institution address quality rating

Educational institution address is rated as moderate quality.

Educational institution address – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Using data for good

Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that “data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.

Confidentiality

The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.

Percentages

To calculate percentages, divide the figure for the category of interest by the figure for ‘Total stated’ where this applies.

Symbol

-999 Confidential
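A minimal sketch of the percentage rule, combined with the -999 confidentiality symbol, might look like the following. The function name `percentage` and the decision to return `None` for suppressed or unusable figures are assumptions for illustration, not part of the Stats NZ guidance.

```python
CONFIDENTIAL = -999  # symbol used in the table for suppressed counts


def percentage(category_count, total_stated):
    """Return category_count as a percentage of 'Total stated'.

    Returns None when either figure is suppressed (-999) or when the
    total is zero, because no meaningful percentage can be derived.
    """
    if category_count == CONFIDENTIAL or total_stated == CONFIDENTIAL:
        return None
    if total_stated <= 0:
        return None
    return 100.0 * category_count / total_stated


print(percentage(30, 120))    # 25.0
print(percentage(-999, 120))  # None
```

Checking for -999 before dividing avoids turning a suppression code into a spurious negative percentage.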

Inconsistencies in definitions

Please note that there may be differences in definitions between census classifications and those used for other data collections.
2023 Census Māori descent population change by regional council
datafinder.stats.govt.nz
csv, dwg, geodatabase +6
+ more versions
Stats NZ, 2023 Census Māori descent population change by regional council [Dataset]. https://datafinder.stats.govt.nz/layer/117600-2023-census-maori-descent-population-change-by-regional-council/attachments/25238/
Explore at:
Available download formats: pdf, mapinfo tab, geodatabase, csv, mapinfo mif, kml, shapefile, geopackage / sqlite, dwg
Dataset provided by
Statistics New Zealand (http://www.stats.govt.nz/)
Authors
Stats NZ
License
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Description
Dataset contains Māori descent indicator census usually resident population counts from the 2013, 2018, and 2023 Censuses, as well as the percentage change in the Māori descent indicator counts between the 2013 and 2018 Censuses, and between the 2018 and 2023 Censuses. Data is available by regional council.

Māori descent indicator categories are:

Māori descent

No Māori descent

Don’t know.

Map shows the percentage change in the Māori descent census usually resident population count between the 2018 and 2023 Censuses.

Download lookup file from Stats NZ ArcGIS Online or embedded attachment in Stats NZ geographic data service.

Footnotes

Te Whata

Under the Mana Ōrite Relationship Agreement, Te Kāhui Raraunga (TKR) will be publishing Māori descent and iwi affiliation data from the 2023 Census in partnership with Stats NZ. This will be available on Te Whata, a TKR platform.

Geographical boundaries

Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

Subnational census usually resident population

The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city.

Caution using time series

Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).

About the 2023 Census dataset

For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

Data quality

The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

Quality rating of a variable

The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

Māori descent concept quality rating

Māori descent is rated as very high quality.

Māori descent – 2023 Census: Information by concept has more information, for example, definitions and data quality.

Using data for good

Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that “data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.

Confidentiality

The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.

Symbol

-998 Not applicable

Percentages

To calculate percentages, divide the figure for the category of interest by the figure for ‘Total stated’ where this applies.
Trust Program - Dataset - Open Government Data Portal
opendata.gov.jo
Updated May 18, 2021
(2021). Trust Program - Dataset - Open Government Data Portal [Dataset]. https://opendata.gov.jo/dataset/trust-program-845-2020
Explore at:
Dataset updated
May 18, 2021
Description
The Jordanian Food and Drug Administration (JFDA) uses the Trust program to reward conforming food establishments in the field of food services and processing. This allows the JFDA to focus its available resources and capabilities on non-conforming and most-violating food establishments, in support of its goal of ensuring food safety and developing the principle of control. The Trust program motivates the owners of establishments and their workers to keep their institutions conforming to the instructions and regulations based on the Food Control Law, and encourages other establishment owners to follow the example of the establishments that apply food safety regulations and ensure food quality.
Unregulated Contaminant Monitoring Rule ***
redivis.com
Updated Mar 11, 2024
+ more versions
(2024). Unregulated Contaminant Monitoring Rule *** [Dataset]. https://redivis.com/workflows/sex8-4b053ws9z
Explore at:
Dataset updated
Mar 11, 2024
Description
Dataset quality ***: High quality dataset that was quality-checked by the EIDC team

The United States Environmental Protection Agency (EPA) collects occurrence data for contaminants that may be present in drinking water, but are not currently subject to the agency's drinking water regulations.
Environmental Monitoring Results for Radiation
catalog.data.gov
data.ct.gov
+1more
Updated Oct 18, 2025
data.ct.gov (2025). Environmental Monitoring Results for Radiation [Dataset]. https://catalog.data.gov/dataset/environmental-monitoring-results-for-radiation
Explore at:
Dataset updated
Oct 18, 2025
Dataset provided by
data.ct.gov
Description
Reporting unit of monitoring results is millirem [where 1 millirem = 1 thousandth (10^-3) of a rem] as defined in Regulations of Connecticut State Agencies Section 19-24-4. Monitoring results below the minimum measurable quantity for the monitoring period are recorded as “M.” Quarterly results reflect total integrated gamma exposure received within a calendar 3-month time frame.

Environmental monitoring results are reported on a calendar quarterly basis:

• 1st Quarter: January, February, March

• 2nd Quarter: April, May, June

• 3rd Quarter: July, August, September

• 4th Quarter: October, November, December

Data Quality Disclaimer: This database is for informational use and is not a controlled quality database. Efforts have been made to ensure accuracy of data in the database; however, errors and omissions may occur. Examples of potential errors include:

• Data entry errors.

• Monitoring results not reported for entry into the database.

• Missing results due to equipment failure, or because monitors could not be retrieved (lost monitors or environmental hazards).

• Translation errors: the data has been migrated to a newer data platform, and there have been errors and data losses.

Environmental Monitoring Records are from the year 2008 until present. Results from before 2008 are stored in hardcopy, in a non-database format. Monitor results from before 2008, and results subject to quality assurance, are available from archived records; requests can be made through the DEEP Freedom of Information Act (FOIA) administrator at deep.foia@ct.gov. Information on FOIA requests can be found on the DEEP website (https://portal.ct.gov/deep).

FOIA Administrator
Office of the Commissioner
Department of Energy and Environmental Protection
79 Elm Street, 3rd Floor
Hartford, CT 06106
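The reporting conventions for these radiation monitoring results (calendar quarters, and “M” for results below the minimum measurable quantity) can be sketched in a few lines. The function names `quarter_of` and `parse_result`, and the choice to represent “M” as `None`, are assumptions for illustration; the actual column layout of the dataset is not described here.

```python
def quarter_of(month: int) -> int:
    """Map a calendar month (1-12) to its reporting quarter (1-4)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return (month - 1) // 3 + 1


def parse_result(raw: str):
    """Parse one monitoring result given in millirem.

    'M' means the result was below the minimum measurable quantity,
    so it is returned as None rather than as a number.
    """
    text = raw.strip()
    if text.upper() == "M":
        return None
    return float(text)


print(quarter_of(2), quarter_of(11))          # 1 4
print(parse_result("M"), parse_result("12.5"))  # None 12.5
```

Keeping “M” distinct from zero matters when averaging, because a below-threshold reading is not the same as a measured value of 0 millirem.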

(2024). AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/airnow-air-quality-monitoring-data-current

AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN

Explore at:

Dataset updated

Feb 28, 2024

Description

This United States Environmental Protection Agency (US EPA) feature layer represents monitoring site data, updated hourly concentrations and Air Quality Index (AQI) values for the latest hour received from monitoring sites that report to AirNow.Map and forecast data are collected using federal reference or equivalent monitoring techniques or techniques approved by the state, local or tribal monitoring agencies. To maintain "real-time" maps, the data are displayed after the end of each hour. Although preliminary data quality assessments are performed, the data in AirNow are not fully verified and validated through the quality assurance procedures monitoring organizations used to officially submit and certify data on the EPA Air Quality System (AQS).This data sharing, and centralization creates a one-stop source for real-time and forecast air quality data. The benefits include quality control, national reporting consistency, access to automated mapping methods, and data distribution to the public and other data systems. The U.S. Environmental Protection Agency, National Oceanic and Atmospheric Administration, National Park Service, tribal, state, and local agencies developed the AirNow system to provide the public with easy access to national air quality information. State and local agencies report the Air Quality Index (AQI) for cities across the US and parts of Canada and Mexico. AirNow data are used only to report the AQI, not to formulate or support regulation, guidance or any other EPA decision or position.About the AQIThe Air Quality Index (AQI) is an index for reporting daily air quality. It tells you how clean or polluted your air is, and what associated health effects might be a concern for you. The AQI focuses on health effects you may experience within a few hours or days after breathing polluted air. 
EPA calculates the AQI for five major air pollutants regulated by the Clean Air Act: ground-level ozone, particle pollution (also known as particulate matter), carbon monoxide, sulfur dioxide, and nitrogen dioxide. For each of these pollutants, EPA has established national air quality standards to protect public health. Ground-level ozone and airborne particles (often referred to as "particulate matter") are the two pollutants that pose the greatest threat to human health in this country.

A number of factors influence ozone formation, including emissions from cars, trucks, buses, power plants, and industries, along with weather conditions. Weather is especially favorable for ozone formation when it’s hot, dry and sunny, and winds are calm and light. Federal and state regulations, including regulations for power plants, vehicles and fuels, are helping reduce ozone pollution nationwide.

Fine particle pollution (or "particulate matter") can be emitted directly from cars, trucks, buses, power plants and industries, along with wildfires and woodstoves. But it also forms from chemical reactions of other pollutants in the air. Particle pollution can be high at different times of year, depending on where you live. In some areas, for example, colder winters can lead to increased particle pollution emissions from woodstove use, and stagnant weather conditions with calm and light winds can trap PM2.5 pollution near emission sources. Federal and state rules are helping reduce fine particle pollution, including clean diesel rules for vehicles and fuels, and rules to reduce pollution from power plants, industries, locomotives, and marine vessels, among others.

How Does the AQI Work?

Think of the AQI as a yardstick that runs from 0 to 500. The higher the AQI value, the greater the level of air pollution and the greater the health concern. For example, an AQI value of 50 represents good air quality with little potential to affect public health, while an AQI value over 300 represents hazardous air quality.

An AQI value of 100 generally corresponds to the national air quality standard for the pollutant, which is the level EPA has set to protect public health. AQI values below 100 are generally thought of as satisfactory. When AQI values are above 100, air quality is considered to be unhealthy: at first for certain sensitive groups of people, then for everyone as AQI values get higher.

Understanding the AQI

The purpose of the AQI is to help you understand what local air quality means to your health. To make it easier to understand, the AQI is divided into six categories:

Air Quality Index (AQI) Values     Levels of Health Concern            Colors
When the AQI is in this range:     ...air quality conditions are:      ...as symbolized by this color:
0 to 50                            Good                                Green
51 to 100                          Moderate                            Yellow
101 to 150                         Unhealthy for Sensitive Groups      Orange
151 to 200                         Unhealthy                           Red
201 to 300                         Very Unhealthy                      Purple
301 to 500                         Hazardous                           Maroon

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the Hazardous category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.

Each category corresponds to a different level of health concern. The six levels of health concern and what they mean are:

"Good" AQI is 0 to 50. Air quality is considered satisfactory, and air pollution poses little or no risk.

"Moderate" AQI is 51 to 100. Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people. For example, people who are unusually sensitive to ozone may experience respiratory symptoms.

"Unhealthy for Sensitive Groups" AQI is 101 to 150. Although the general public is not likely to be affected at this AQI range, people with lung disease, older adults and children are at greater risk from exposure to ozone, whereas persons with heart and lung disease, older adults and children are at greater risk from the presence of particles in the air.

"Unhealthy" AQI is 151 to 200. Everyone may begin to experience some adverse health effects, and members of the sensitive groups may experience more serious effects.

"Very Unhealthy" AQI is 201 to 300. This would trigger a health alert signifying that everyone may experience more serious health effects.

"Hazardous" AQI is greater than 300. This would trigger health warnings of emergency conditions. The entire population is more likely to be affected.

AQI colors

EPA has assigned a specific color to each AQI category to make it easier for people to understand quickly whether air pollution is reaching unhealthy levels in their communities. For example, the color orange means that conditions are "unhealthy for sensitive groups," while red means that conditions may be "unhealthy for everyone," and so on.

Air Quality Index Levels of Health Concern     Numerical Value     Meaning
Good                                           0 to 50             Air quality is considered satisfactory, and air pollution poses little or no risk.
Moderate                                       51 to 100           Air quality is acceptable; however, for some pollutants there may be a moderate health concern for a very small number of people who are unusually sensitive to air pollution.
Unhealthy for Sensitive Groups                 101 to 150          Members of sensitive groups may experience health effects. The general public is not likely to be affected.
Unhealthy                                      151 to 200          Everyone may begin to experience health effects; members of sensitive groups may experience more serious health effects.
Very Unhealthy                                 201 to 300          Health alert: everyone may experience more serious health effects.
Hazardous                                      301 to 500          Health warnings of emergency conditions. The entire population is more likely to be affected.

Note: Values above 500 are considered Beyond the AQI. Follow recommendations for the "Hazardous" category. Additional information on reducing exposure to extremely high levels of particle pollution is available here.
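
The category ranges described above amount to a simple lookup. The following is a minimal Python sketch of that lookup, not code from EPA or AirNow: the function name `classify_aqi` and the table name `AQI_CATEGORIES` are hypothetical, and returning no color for values above 500 is an assumption, since the description does not assign a color to "Beyond the AQI".

```python
from typing import Optional, Tuple

# (upper bound, level of health concern, color), taken from the ranges
# listed in the description: 0-50, 51-100, 101-150, 151-200, 201-300, 301-500.
AQI_CATEGORIES = [
    (50, "Good", "Green"),
    (100, "Moderate", "Yellow"),
    (150, "Unhealthy for Sensitive Groups", "Orange"),
    (200, "Unhealthy", "Red"),
    (300, "Very Unhealthy", "Purple"),
    (500, "Hazardous", "Maroon"),
]


def classify_aqi(aqi: float) -> Tuple[str, Optional[str]]:
    """Return (level of health concern, color) for an AQI value.

    Hypothetical helper for illustration only.
    """
    if aqi < 0:
        raise ValueError("AQI values start at 0")
    # Walk the upper bounds in ascending order; the first one that the
    # value does not exceed identifies the category.
    for upper, level, color in AQI_CATEGORIES:
        if aqi <= upper:
            return level, color
    # Assumption: no color is returned above 500, because the description
    # only says to follow the Hazardous recommendations for such values.
    return "Beyond the AQI", None


if __name__ == "__main__":
    for value in (42, 100, 135, 301, 520):
        print(value, classify_aqi(value))
```

Comparing against upper bounds only keeps the table short and avoids gaps between ranges; a caller working with the hourly AQI values in this dataset could apply the function to each reported value to get a display label.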

AirNow Air Quality Monitoring Data (Current) - Dataset - CKAN

Global Air Quality Data(15 Days Hourly, 50 Cities)

Jurisdictional Unit (Public) - Dataset - CKAN

Portsmouth Water Drinking Water Quality Data 2022 2023 2024

Evidence supporting the rule of symmetry for OSM data sets.

Perception Dataset Management Platforms Market Research Report 2033

Perception Dataset Management Platforms Market Outlook

Component Analysis

no-oranges

2023 Census population change by age group and RC

ckanext-dataset - Extensions - CKAN Ecosystem Catalog Beta

2023 Census main means of travel to work by statistical area 3

2023 Census totals by topic for families and extended families by SA1

Data from: Dataset: Floating plastic transport and accumulation in Amsterdam...

2023 Census housing data by statistical area 2

THVD (Talking Head Video Dataset)

Cleaned Water Quality and Weather Dataset for AI-based Alum Prediction...

2023 Census main means of travel to education by statistical area 3

2023 Census Māori descent population change by regional council

Trust Program - Dataset - Open Government Data Portal

Unregulated Contaminant Monitoring Rule ***

Environmental Monitoring Results for Radiation
