100+ datasets found

Data from: Classification of Aeronautics System Health and Safety Documents
catalog.data.gov
datasets.ai
+2more
Updated Apr 10, 2025
Cite
Dashlink (2025). Classification of Aeronautics System Health and Safety Documents [Dataset]. https://catalog.data.gov/dataset/classification-of-aeronautics-system-health-and-safety-documents
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
Most complex aerospace systems have many text reports on safety, maintenance, and associated issues. The Aviation Safety Reporting System (ASRS) spans several decades and contains over 700,000 reports. The Aviation Safety Action Plan (ASAP) contains over 12,000 reports from various airlines. Problem categorizations have been developed for both ASRS and ASAP to enable identification of system problems. However, repository volume and complexity make human analysis difficult. Multiple experts are needed, and they often disagree on classifications. Even the same person has classified the same document differently at different times due to evolving experiences. Consistent classification is necessary to support tracking trends in problem categories over time. A decision support system that performs consistent document classification quickly and over large repositories would be useful. We discuss the results of two algorithms we have developed to classify ASRS and ASAP documents. The first is Mariana, a support vector machine (SVM) that uses simulated annealing to optimize the model's hyperparameters. The second method is classification built on top of nonnegative matrix factorization (NMF), which attempts to find a model that represents document features that add up in various combinations to form documents. We tested both methods on ASRS and ASAP documents, with the latter categorized two different ways. We illustrate the potential of NMF to provide document features that are interpretable and indicative of topics. We also briefly discuss the tool into which we have incorporated Mariana to allow human experts to provide feedback on the document categorizations.
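To make the NMF idea in the description concrete, here is a minimal sketch of factoring a small term-by-document matrix into nonnegative "features" that add up to form documents. It is not the authors' implementation: the term list, the toy counts, and the helper names (`matmul`, `transpose`, `squared_error`, `nmf`) are all hypothetical, and the update rule is the generic Lee-Seung multiplicative update rather than anything stated in the source.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor nonnegative V (terms x documents) into W (terms x rank) and
    H (rank x documents) using Lee-Seung multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Hypothetical term-by-document counts: rows are terms, columns are documents.
TERMS = ["engine", "fuel", "runway", "taxi"]
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
]

W, H = nmf(V, rank=2)
for k in range(2):
    # The largest entries in column k of W are the terms that define feature k.
    top = sorted(range(len(TERMS)), key=lambda i: W[i][k], reverse=True)[:2]
    print(f"feature {k}:", [TERMS[i] for i in top])
```

Each column of W can be read as a candidate topic, and each column of H says how strongly a document draws on each topic. A downstream classifier could then be trained on the columns of H, although how the original work built its classifier on top of NMF is not described here.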

Global Data Classification Tool Market Research Report: By Deployment Model...

wiseguyreports.com

Updated Jun 21, 2024

+ more versions

Cite

Wiseguy Research Consultants Pvt Ltd (2024). Global Data Classification Tool Market Research Report: By Deployment Model (On-Premises, Cloud-Based, SaaS-Based), By Organization Size (Small & Medium-Sized Enterprises (SMEs), Large Enterprises), By Industry Vertical (Healthcare, Financial Services, Government and Public Sector, Retail and E-commerce, Manufacturing and Logistics), By Data Type (Structured Data, Semi-Structured Data, Unstructured Data), By Functionality (Automated Data Classification, Manual Data Classification, Data Discovery, Data Labeling, Data Masking) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-classification-tool-market

Explore at:

Dataset updated

Jun 21, 2024

Dataset authored and provided by

Wiseguy Research Consultants Pvt Ltd

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Jan 6, 2024

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2024
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2023	2.83 (USD Billion)
MARKET SIZE 2024	3.38 (USD Billion)
MARKET SIZE 2032	14.02 (USD Billion)
SEGMENTS COVERED	Deployment Model, Organization Size, Industry Vertical, Data Type, Application, Regional
COUNTRIES COVERED	North America, Europe, APAC, South America, MEA
KEY MARKET DYNAMICS	Increasing data privacy regulations; Growing need for data security and compliance; Proliferation of unstructured data; Rise of artificial intelligence and machine learning; Adoption of cloud-based data storage
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Informatica, Oracle, Symantec, IBM, Splunk, Varonis Systems, Digital Guardian, STEALTHbits Technologies, Cybereason, Netskope, FireEye, Trustwave, Check Point Software Technologies
MARKET FORECAST PERIOD	2024 - 2032
KEY MARKET OPPORTUNITIES	Increase in data breaches; Growing adoption of cloud and SaaS solutions; Need for data protection and compliance regulations; Emergence of AI and ML technologies; Growing focus on data privacy
COMPOUND ANNUAL GROWTH RATE (CAGR)	19.46% (2024 - 2032)
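
The growth rate in the table can be cross-checked against the two market-size figures. The sketch below assumes the standard CAGR formula, (end / start) ** (1 / years) - 1, applied to the 2024 and 2032 values. The function name `cagr` is illustrative, and the report does not state how its figure was derived.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in USD billion, taken from the table above.
size_2024 = 3.38
size_2032 = 14.02

rate = cagr(size_2024, size_2032, years=2032 - 2024)
print(f"Implied CAGR 2024-2032: {rate:.2%}")
```

Under that assumption the implied rate lands very close to the reported 19.46%, so the three figures are mutually consistent.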

Guide to applying the 2011 Rural Urban Classification to data
gov.uk
Updated Jul 21, 2016
Cite
Department for Environment, Food & Rural Affairs (2016). Guide to applying the 2011 Rural Urban Classification to data [Dataset]. https://www.gov.uk/government/statistics/guide-to-applying-the-2011-rural-urban-classification-to-data
Explore at:
Dataset updated
Jul 21, 2016
Dataset provided by
GOV.UK
Authors
Department for Environment, Food & Rural Affairs
Description
This guide explains how to apply the 2011 Rural Urban Classification to a range of geographies and data for statistical analysis.

Additional information:

2011 Rural Urban Classification

Statistical Digest of Rural England

Defra statistics: rural

Email: rural.statistics@defra.gov.uk

You can also contact us via Twitter: https://twitter.com/DefraStats
Fundamental classification guidance review files
catalog.data.gov
s.cnmilf.com
Updated Jul 13, 2025
Cite
DHS (2025). Fundamental classification guidance review files [Dataset]. https://catalog.data.gov/dataset/fundamental-classification-guidance-review-files-853fd
Explore at:
Dataset updated
Jul 13, 2025
Dataset provided by
U.S. Department of Homeland Security (http://www.dhs.gov/)
Description
Reports, significant correspondence, drafts, received comments, and related materials responding to “fundamental classification guidance review” as required by Executive Order 13526 Section 1.9.
Global Industry Classification Standard System
lseg.com
Updated Feb 27, 2025
Cite
LSEG (2025). Global Industry Classification Standard System [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/reference-data/classifications/business-and-industry-classifications/global-industry-classification-standard-system
Explore at:
Available download formats: csv, delimited, gzip, sql, user interface, xml, zip archive
Dataset updated
Feb 27, 2025
Dataset provided by
London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
Authors
LSEG
License
https://www.lseg.com/en/policies/website-disclaimer
Description
Access the Global Industry Classification Standard (GICS) system through LSEG, covering over 58,000 trading securities across 125 countries.
Risk classification guide for medical device establishment inspections...
open.canada.ca
html
Updated Feb 27, 2023
+ more versions
Cite
Health Canada (2023). Risk classification guide for medical device establishment inspections GUI-0079 [Dataset]. https://open.canada.ca/data/info/af97a579-937c-4b03-84d7-632666d1b67a
Explore at:
Available download formats: html
Dataset updated
Feb 27, 2023
Dataset provided by
Health Canada (http://www.hc-sc.gc.ca/)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
This document is intended to help ensure consistency among Health Canada inspectors during medical device establishment inspections when classifying observations of deviations, deficiencies or failures according to risk, and assigning an overall compliance rating to an inspection.
Preliminary Context Classification TDA
gis-fdot.opendata.arcgis.com
Updated May 20, 2019
+ more versions
Cite
Florida Department of Transportation (2019). Preliminary Context Classification TDA [Dataset]. https://gis-fdot.opendata.arcgis.com/datasets/preliminary-context-classification-tda
Explore at:
Dataset updated
May 20, 2019
Dataset authored and provided by
Florida Department of Transportation (https://www.fdot.gov/)
Description
The FDOT GIS Preliminary Context Classification feature class provides spatial information regarding preliminary context classification on selected Florida roadways. Context classification denotes the criteria for roadway design elements for safer streets that promote safety, economic development, and quality of life. All non-limited access state highways will be evaluated and assigned a current context classification. Limited access facilities are assigned only one code: LA (Limited Access). For growth development and design purposes, a future context classification will also be assigned. The District Complete Streets Coordinator will determine the current and future context classification designation, along with the dates, and coordinate with the District RCI staff to get this information into the RCI database. This information is required for all functionally classified roadways on the State Highway System (SHS). This dataset is maintained by the Transportation Data & Analytics office (TDA). The source spatial data for this hosted feature layer was created on: 06/21/2025. For more details please review the FDOT RCI Handbook. Download Data: Enter Guest as Username to download the source shapefile from here:
Tree Point Classification
hub.arcgis.com
cacgeoportal.com
Updated Oct 8, 2020
Cite
Esri (2020). Tree Point Classification [Dataset]. https://hub.arcgis.com/content/58d77b24469d4f30b5f68973deb65599
Explore at:
Dataset updated
Oct 8, 2020
Dataset authored and provided by
Esri (http://esri.com/)
Description
Classifying trees from point cloud data is useful in applications such as high-quality 3D basemap creation, urban planning, and forestry workflows. Trees have a complex geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results.

Using the model

Follow the guide to use the model. The model can be used with the 3D Basemaps solution and ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning frameworks libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Input

The model accepts unclassified point clouds with the attributes: X, Y, Z, and Number of Returns.

Note: This model is trained to work on unclassified point clouds that are in a projected coordinate system, where the units of X, Y, and Z are based on the metric system of measurement. If the dataset is in degrees or feet, it needs to be re-projected accordingly. The provided deep learning model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of 'class of interest' versus background points. It is recommended to use 'selective/target classification' and 'class preservation' functionalities during prediction to have better control over the classification.

This model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time and compute resources while improving accuracy. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block, and extra attributes should match those of the data originally used for training this model (see Training data section below).

Output

The model will classify the point cloud into the following 2 classes with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS):

0	Background
5	Trees / High-vegetation

Applicable geographies

This model is expected to work well in all regions globally, with the exception of mountainous regions. However, results can vary for datasets that are statistically dissimilar to training data.

Model architecture

This model uses the PointCNN model architecture implemented in ArcGIS API for Python.

Accuracy metrics

The table below summarizes the accuracy of the predictions on the validation dataset.

Class	Precision	Recall	F1-score
Trees / High-vegetation (5)	0.975374	0.965929	0.970628

Training data

This model is trained on a subset of UK Environment Agency's open dataset. The training data used has the following characteristics:

X, Y and Z linear unit	meter
Z range	-19.29 m to 314.23 m
Number of Returns	1 to 5
Intensity	1 to 4092
Point spacing	0.6 ± 0.3
Scan angle	-23 to +23
Maximum points per block	8192
Extra attributes	Number of Returns
Class structure	[0, 5]

Sample results

Here are a few results from the model.
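
The three accuracy figures reported for the Trees / High-vegetation class are related: F1 is the harmonic mean of precision and recall. The short check below recomputes it from the reported precision and recall. The function name `f1_score` is illustrative and not part of any Esri API.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for class 5 (Trees / High-vegetation) on the validation set.
precision = 0.975374
recall = 0.965929

f1 = f1_score(precision, recall)
print(f"Recomputed F1: {f1:.6f}")
```

The recomputed value agrees with the reported F1-score of 0.970628 to within rounding.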
North American Industry Classification System (NAICS) 2017 Version 2.0
open.canada.ca
beta.data.urbandatacentre.ca
+1more
csv, html, pdf
Updated Feb 23, 2022
+ more versions
Cite
Statistics Canada (2022). North American Industry Classification System (NAICS) 2017 Version 2.0 [Dataset]. https://open.canada.ca/data/en/dataset/1cedfc64-2d58-4e21-9359-778d252271ae
Explore at:
Available download formats: html, pdf, csv
Dataset updated
Feb 23, 2022
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
The North American Industry Classification System (NAICS) is an industry classification system developed by the statistical agencies of Canada, Mexico and the United States. Created against the background of the North American Free Trade Agreement, it is designed to provide common definitions of the industrial structure of the three countries and a common statistical framework to facilitate the analysis of the three economies. NAICS is based on supply-side or production-oriented principles, to ensure that industrial data, classified to NAICS, are suitable for the analysis of production-related issues such as industrial performance. NAICS Canada 2017 Version 2.0 consists of 20 sectors, 102 subsectors, 322 industry groups, 708 industries and 923 Canadian industries, and replaces NAICS 2017 Version 1.0.
Replication Data for: Automated Text Classification of News Articles: A...
dataverse.harvard.edu
bin, csv, html, pdf +8
Updated Dec 30, 2020
Cite
Harvard Dataverse (2020). Replication Data for: Automated Text Classification of News Articles: A Practical Guide [Dataset]. http://doi.org/10.7910/DVN/MXKRDE
Explore at:
Available download formats: tsv(17502241), txt(53690), text/plain; charset=utf-8(1071), type/x-r-syntax(5670), type/x-r-syntax(11522), pdf(5757), html(483), text/x-python(3945), pdf(5710), tsv(6240), text/x-python(2693), html(988), type/x-r-syntax(1922), sh(670), tsv(5449201), text/x-python(3544), pdf(5739), html(653), html(546), text/plain; charset=us-ascii(4069), text/plain; charset=us-ascii(1758), html(579), type/x-r-syntax(4562), type/x-r-syntax(4745), type/x-r-syntax(1133), type/x-r-syntax(1507), tsv(28187174), tsv(6639), bin(7), tsv(6699), pdf(5044), pdf(5599), pdf(6631), tsv(842), pdf(6366), type/x-r-syntax(1489), type/x-r-syntax(2538), tsv(163), text/plain; charset=us-ascii(57), type/x-r-syntax(6723), type/x-r-syntax(1383), text/markdown(1406), html(1457), tsv(115755), tsv(41200), type/x-r-syntax(5524), html(480), tsv(9998396), tsv(13375940), pdf(6363), tsv(696078), html(953), text/plain; charset=us-ascii(158), bin(1917), text/plain; charset=us-ascii(6555), html(7004), txt(70946), tsv(2157), text/plain; charset=utf-8(16811), pdf(6754), csv(5540954), pdf(5297), type/x-r-syntax(4591), tsv(62434), type/x-r-syntax(9996), pdf(7462), pdf(7096), html(2093), pdf(6575), type/x-r-syntax(1698), type/x-r-syntax(16982), csv(5127124), type/x-r-syntax(4464)
Unique identifier
https://doi.org/10.7910/DVN/MXKRDE
Dataset updated
Dec 30, 2020
Dataset provided by
Harvard Dataverse
Description
Automated text analysis methods have made possible the classification of large corpora of text by measures such as topic and tone. Here, we provide a guide to help researchers navigate the consequential decisions they need to make before any measure can be produced from the text. We consider, both theoretically and empirically, the effects of such choices using as a running example efforts to measure the tone of New York Times coverage of the economy. We show that two reasonable approaches to corpus selection yield radically different corpora and we advocate for the use of keyword searches rather than pre-defined subject categories provided by news archives. We demonstrate the benefits of coding using article-segments instead of sentences as units of analysis. We show that, given a fixed number of codings, it is better to increase the number of unique documents coded rather than the number of coders for each document. Finally, we find that supervised machine learning algorithms outperform dictionaries on a number of criteria. Overall, we intend this guide to serve as a reminder to analysts that thoughtfulness and human validation are key to text-as-data methods, particularly in an age when it is all-too-easy to computationally classify texts without attending to the methodological choices therein.
Data from: Finding Stats: Terms, Tools and Techniques
search.dataone.org
Updated Dec 28, 2023
Cite
Elizabeth Hamilton (2023). Finding Stats: Terms, Tools and Techniques [Dataset]. http://doi.org/10.5683/SP3/VBEQME
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/VBEQME
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Elizabeth Hamilton
Description
Last year, there was a request for "Deconstructing Terms" found in Statistics Canada products. What do the myriad of terms mean and how can we help our users interpret classification guides, terminology, and the mysteries of Statistics Canada language?
Data Classification Software Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Oct 16, 2024
Cite
Dataintelo (2024). Data Classification Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/data-classification-software-market
Explore at:
Available download formats: pptx, pdf, csv
Dataset updated
Oct 16, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Data Classification Software Market Outlook

The global data classification software market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach around USD 5.2 billion by 2032, growing at a Compound Annual Growth Rate (CAGR) of 14.8% during the forecast period. The significant growth factor driving this market is the increasing need for data security and compliance across various industries, fueled by stringent regulatory requirements and the rising volume of data generated globally.

Several factors are contributing to the robust growth of the data classification software market. First and foremost, the proliferation of digital data across organizations of all sizes is generating a critical need for effective data management solutions. As companies strive to safeguard sensitive information against breaches and unauthorized access, data classification software offers an efficient means to categorize and secure data based on its sensitivity and importance. This is particularly relevant in highly regulated sectors such as BFSI (Banking, Financial Services, and Insurance), healthcare, and government, where compliance with data protection laws is paramount.

Another driving force behind the market's expansion is the increasing adoption of cloud computing and the consequential rise in cyber threats. As more enterprises migrate their data to cloud environments, the risk of data breaches and loss escalates, prompting organizations to invest in robust data classification tools. These tools help in identifying, categorizing, and protecting data, thereby mitigating risks and ensuring regulatory compliance. Furthermore, advancements in artificial intelligence (AI) and machine learning (ML) technologies are enhancing the capabilities of data classification software, making it more accurate and efficient in identifying and categorizing data.

The rise of remote work, spurred by the COVID-19 pandemic, has also played a significant role in driving the demand for data classification solutions. With employees accessing corporate networks from various locations, the risk of data leaks and breaches has heightened, necessitating the implementation of robust security measures. Data classification software helps organizations to maintain data integrity and confidentiality, ensuring that sensitive information is accessed and shared securely. Additionally, the growing awareness about the importance of data privacy among consumers is urging companies to adopt stringent data protection measures, further propelling market growth.

Regionally, North America is anticipated to hold the largest market share throughout the forecast period, driven by the presence of major market players and stringent data protection regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Europe is also expected to witness substantial growth due to similar regulatory frameworks and the increasing digitization of businesses. Meanwhile, the Asia Pacific region is projected to experience the highest CAGR, fueled by rapid technological advancements, increased adoption of cloud services, and growing awareness about data security and compliance.

Component Analysis

The data classification software market is segmented into software and services based on components. The software segment encompasses various types of data classification tools designed to identify, categorize, and protect data according to predefined criteria. This segment is witnessing significant growth due to the increasing need for automated data management solutions that can handle large volumes of data with high accuracy. Advanced software solutions leverage AI and ML technologies to enhance data classification processes, making them more efficient and reliable. As organizations continue to generate vast amounts of data, the demand for sophisticated software solutions is expected to rise further.

On the other hand, the services segment includes professional and managed services offered by vendors to support the implementation, maintenance, and optimization of data classification solutions. Professional services typically involve consulting, integration, and training, helping organizations to tailor the software to their specific needs and ensure seamless implementation. Managed services, meanwhile, encompass ongoing support and maintenance, allowing companies to outsource the management of their data classification infrastructure. This segment is gaining traction as businesses increasingly seek expert guidance to navigate complex data protection regulations and optim
Vegetation Classification for the Nature Reserve of Orange County
data.niaid.nih.gov
search.dataone.org
+1more
zip
Updated Jun 16, 2016
Cite
AECOM; Inc. Aerial Information System; California Native Plant Society (2016). Vegetation Classification for the Nature Reserve of Orange County [Dataset]. http://doi.org/10.7280/D1F30C
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.7280/D1F30C
Dataset updated
Jun 16, 2016
Authors
AECOM; Inc. Aerial Information System; California Native Plant Society
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Orange County
Description
The ultimate goal of this project is to create an updated fine‐scale vegetation map for about 58,000 acres of Orange County, consisting of the 37,000‐acre Orange County Central and Coastal Subregions Natural Community Conservation Plan (NCCP)/Habitat Conservation Plan (HCP) Habitat Reserve System; approximately 9,500 acres of associated NCCP/HCP Special Linkages, Existing Use Areas, and Non‐Reserve Open Space; and approximately 11,000 acres of adjoining conserved open space (study area). The project consisted of three phases.

Phase 1: To update vegetation mapping, the Nature Reserve of Orange County (NROC) proposes to use Manual of California Vegetation (MCV) methods (2009), which will be implemented in two stages: Stage 1 – Development of a vegetation classification system for the Central and Coastal Subregions of Orange County that is consistent with the MCV. Stage 2 – Application of the vegetation classification system to create a vegetation map through photointerpretation of available aerial imagery and ground reconnaissance. The MCV methods were developed by the California Department of Fish and Game (CDFG) Vegetation Classification and Mapping Program in collaboration with the California Native Plant Society (CNPS). This approach relies on the collection of quantifiable environmental data to identify and classify biological associations that repeat across the landscape. For areas where documentation is lacking to effectively define all of the vegetation patterns found in California, CDFG and CNPS developed the Vegetation Rapid Assessment Protocol. This protocol guides data collection and analysis to refine vegetation classifications that are consistent with CDFG and MCV standards. Based on an earlier classification by Gray and Bramlet (1992), Orange County is expected to have vegetation types not yet described in the MCV.
Using the MCV approach, Rapid Assessment (RA) data was collected throughout the study area and analyzed to characterize these new vegetation types or show concurrence with existing MCV types.

Phase 2: Aerial Information Systems, Inc. (AIS) was contracted by the Nature Reserve of Orange County (NROC) to create an updated fine-scale regional vegetation map consistent with the California Department of Fish & Wildlife (CDFW) classification methodology and mapping standards. The mapping area covers approximately 86,000 acres of open space and adjacent urban and agricultural lands including habitat located in both the Central and Coastal Subregions of Orange County. The map was prepared over a baseline digital image created in 2012 by the US Department of Agriculture – Farm Service Agency’s National Agricultural Imagery Program (NAIP). Vegetation units were mapped using the National Vegetation Classification System (NVCS) to the Alliance level as depicted in the second edition of the Manual of California Vegetation (MCV2). One of the most important data layers used to guide the conservation planning process for the 1996 Orange County Central & Coastal Subregion Natural Community Conservation Plan/Habitat Conservation Plan (NCCP/HCP) was the regional vegetation map created in the early 1990s by Dave Bramlett & Jones & Stokes Associates, Inc. (Jones & Stokes Associates, Inc. 1993). This same map is still used to direct monitoring and management efforts in the NCCP/HCP Habitat Reserve. An updated map is necessary in order to address changes in vegetation makeup due to widespread and multiple burns in the mapping area, urban expansion, and broadly occurring vegetation succession that has occurred over the past 20 years since the original map was created.
This update is further necessary in order to conform to the current NVCS, which is supported by the extensive acquisition of ground based field data and subsequent analysis that has ensued in those same 20 years over the region and adjacent similar habitats in the coastal and mountain foothills of Southern California. Vegetative and cartographic comparisons between the newly created 2012 image-based map and the original 1990s era vegetation map are documented in a separate report produced by the California Native Plant Society at the end of 2014.

Phase 3: The California Native Plant Society (CNPS) Vegetation Program conducted an independent accuracy assessment of a new vegetation map completed for the natural lands of Orange County in collaboration with Aerial Information Systems (AIS), the California Department of Fish and Wildlife (CDFW), and the Nature Reserve of Orange County (NROC). This report provides a summary of the accuracy assessment allocation, field sampling methods, and analysis results; it also provides an in-depth crosswalk and comparison between the new map and the existing 1992 vegetation map. California state standards (CDFW 2007) require that a vegetation map should achieve an overall accuracy of 80%. After final scoring, the new Orange County vegetation map received an overall user’s accuracy of 87%. The new fine-scale vegetation map and supporting field survey data provide baseline information for long-term land management and conservation within the remaining natural lands of Orange County.

Data made available in the OC Data Portal in partnership with UCI Libraries.
Methods

The project consisted of three phases, each with its own methodology.

Phase 1: To update vegetation mapping, the Nature Reserve of Orange County (NROC) used Manual of California Vegetation (MCV) methods (2009), implemented in two stages: Stage 1 – Development of a vegetation classification system for the Central and Coastal Subregions of Orange County that is consistent with the MCV. Stage 2 – Application of the vegetation classification system to create a vegetation map through photointerpretation of available aerial imagery and ground reconnaissance.

Phase 2: Aerial Information Systems, Inc. (AIS) was contracted by the Nature Reserve of Orange County (NROC) to create an updated fine-scale regional vegetation map consistent with the California Department of Fish & Wildlife (CDFW) classification methodology and mapping standards.

Phase 3: The California Native Plant Society (CNPS) Vegetation Program conducted an independent accuracy assessment of a new vegetation map completed for the natural lands of Orange County in collaboration with Aerial Information Systems (AIS), the California Department of Fish and Wildlife (CDFW), and the Nature Reserve of Orange County (NROC).

For more detailed methodology information please consult the README.txt file included with the dataset.
North American Industry Classification System (NAICS) 2002
ouvert.canada.ca
data.urbandatacentre.ca
+3more
csv, html
Updated Feb 23, 2022
+ more versions
Cite
Statistics Canada (2022). North American Industry Classification System (NAICS) 2002 [Dataset]. https://ouvert.canada.ca/data/dataset/3047ab93-4587-4a09-9256-0765aaa11896
Explore at:
Available download formats: csv, html
Dataset updated
Feb 23, 2022
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
The North American Industry Classification System (NAICS) is an industry classification system developed by the statistical agencies of Canada, Mexico and the United States. Created against the background of the North American Free Trade Agreement, it is designed to provide common definitions of the industrial structure of the three countries and a common statistical framework to facilitate the analysis of the three economies. NAICS is based on supply-side, or production-oriented, principles to ensure that industrial data classified to NAICS are suitable for the analysis of production-related issues such as industrial performance.
ATC and DDD Classification System Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). ATC and DDD Classification System Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/atc-and-ddd-classification-system-data-package/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains information on the ATC (Anatomical Therapeutic Chemical) classification system. The ATC Classification System is used to classify the active ingredients of drugs according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties. It is controlled by the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC), and was first published in 1976.
Generalized use classification for groundwater sites in California with...
catalog.data.gov
data.usgs.gov
+1 more
Updated Jul 6, 2024
+ more versions
Cite
U.S. Geological Survey (2024). Generalized use classification for groundwater sites in California with publicly available water-quality data in the U.S. Geological Survey National Water Information System (NWIS) data archive [Dataset]. https://catalog.data.gov/dataset/generalized-use-classification-for-groundwater-sites-in-california-with-publicly-available
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
California
Description
The U.S. Geological Survey National Water Information System (NWIS) data archive contains publicly available water-quality data for approximately 20,000 groundwater wells and springs in California. This data release publishes site type information and a generalized use classification for the sites. The generalized use category is derived from, but is not equivalent to, the use of site and use of water fields from the NWIS data archive. The use of site and use of water fields in the NWIS data archive are not publicly available. The 20,000 groundwater wells and springs were categorized into seven generalized use categories: 1) domestic, 2) irrigation, 3) production, 4) "water supply, other", 5) observation, 6) other, and 7) unknown. A similar classification system was used by Stork and Fram (2021) to categorize sites sampled by the USGS for the California State Water Resources Control Board's (SWRCB) Groundwater Ambient Monitoring and Assessment Program Priority Basin Project (GAMA-PBP). The classification presented here only uses information from the use of site and use of water fields in the NWIS data archive; the classification presented by Stork and Fram (2021) considered additional information collected by the GAMA-PBP about the sites. USGS NWIS data are also served to the public through the SWRCB's GAMA Groundwater Information System (GAMAGIS), along with data from other federal, state, and local agency sources, and the generalized use categories in this data release are compatible with the use categories in SWRCB GAMAGIS. Prior to publication of this data release, SWRCB GAMAGIS classified all groundwater sites in NWIS, except those sampled by the GAMA-PBP, as "water supply, other", which resulted in erroneous characterization of approximately 98 percent of the sites. The generalized use classification provided in this data release greatly improves the accuracy of site characterization, while still complying with Federal policies concerning release of location information for some types of sites.
Land Cover - Minnesota Land Cover Classification System
gisdata.mn.gov
fgdb, gpkg, html +2
Updated Nov 22, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Natural Resources Department (2024). Land Cover - Minnesota Land Cover Classification System [Dataset]. https://gisdata.mn.gov/dataset/biota-landcover-mlccs
Explore at:
Available download formats: jpeg, shp, html, gpkg, fgdb
Dataset updated
Nov 22, 2024
Dataset provided by
Natural Resources Department
Area covered
Minnesota
Description
Land cover data set based on the Minnesota Land Cover Classification System (MLCCS) coding scheme. This data was produced using a combination of aerial photograph interpretation and field surveys. There is a minimum mapping unit of 1 acre for natural vegetation and 2 acres for artificial cover types.
Data Cleaning, Translation & Split of the Dataset for the Automatic...
zenodo.org
bin, csv +1
Updated Apr 24, 2025
Cite
Juliane Köhler; Juliane Köhler (2025). Data Cleaning, Translation & Split of the Dataset for the Automatic Classification of Documents for the Classification System for the Berliner Handreichungen zur Bibliotheks- und Informationswissenschaft [Dataset]. http://doi.org/10.5281/zenodo.6957842
Explore at:
Available download formats: text/x-python, csv, bin
Unique identifier
https://doi.org/10.5281/zenodo.6957842
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Juliane Köhler; Juliane Köhler
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Cleaned_Dataset.csv – The combined CSV files of all scraped documents from DABI, e-LiS, o-bib and Springer.

Data_Cleaning.ipynb – The Jupyter Notebook with Python code for the analysis and cleaning of the original dataset.

ger_train.csv – The German training set as CSV file.

ger_validation.csv – The German validation set as CSV file.

en_test.csv – The English test set as CSV file.

en_train.csv – The English training set as CSV file.

en_validation.csv – The English validation set as CSV file.

splitting.py – The Python code for splitting a dataset into train, test and validation sets.

DataSetTrans_de.csv – The final German dataset as a CSV file.

DataSetTrans_en.csv – The final English dataset as a CSV file.

translation.py – The Python code for translating the cleaned dataset.
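
For readers unfamiliar with the train/validation/test split mentioned in the file list, the idea can be sketched roughly as follows. This is not the contents of splitting.py: the function name `split_rows`, the default fractions, and the fixed seed are all hypothetical choices made for illustration, and the actual script may split differently (for example, stratified by class).

```python
import random


def split_rows(rows, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle rows and cut them into train, validation and test lists.

    Hypothetical helper for illustration only; the fractions and seed
    are assumptions, not values taken from the dataset's own script.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1:
        raise ValueError("fractions must lie between 0 and 1")
    if train_frac + val_frac >= 1:
        raise ValueError("train_frac + val_frac must leave room for a test set")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after train and val
    return train, val, test
```

Every row lands in exactly one of the three lists, so the evaluation data never overlaps with the training data.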
Risk classification guide for medical device establishment inspections...
data.urbandatacentre.ca
beta.data.urbandatacentre.ca
Updated Oct 1, 2024
+ more versions
Cite
(2024). Risk classification guide for medical device establishment inspections GUI-0079 - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-af97a579-937c-4b03-84d7-632666d1b67a
Explore at:
Dataset updated
Oct 1, 2024
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Area covered
Canada
Description
This document is intended to help ensure consistency among Health Canada inspectors during medical device establishment inspections when classifying observations of deviations, deficiencies or failures according to risk, and assigning an overall compliance rating to an inspection.
Global Land System classification data - Dataset - B2FIND
b2find.eudat.eu
Updated Feb 14, 2025
Cite
The citation is currently not available for this dataset.
Explore at:
Dataset updated
Feb 14, 2025
Description
Data from Van Asselen, S. & Verburg, P.H. (2012). A Land System representation for global assessments and land-use modeling. Global Change Biology, 18(10): 3125-3148. Current global-scale land-change models used for integrated assessments and climate modeling are based on classifications of land cover, often using a resolution of 0.5 degree (approximately 50 x 50 km at the equator). To improve such assessments we have developed a new global representation of land cover and land use, at 5 arcminute resolution (9.25 x 9.25 km in equal area projection Eckert IV, as is used here). The new Land System classification represents land cover and land use in mosaic landscapes with different land-use management intensity and livestock composition, which are important aspects of the land system. We have also tested whether global assessments can be based on globally uniform allocation rules. Logistic regressions were used to analyze variation in spatial determinants of Land Systems. This analysis indicates strong associations between Land Systems and a range of socioeconomic and biophysical indicators of human-environment interactions. The set of identified spatial determinants of a Land System differs among regions and scales, especially for (mosaic) cropland systems, grassland systems with livestock, and settlements. (Semi-)natural Land Systems (LS) have more similar spatial determinants across regions and scales. Using Land Systems in global models is expected to result in a more accurate representation of land use, capturing important aspects of land systems and land architecture: the variation in land cover and the link between land-use intensity and livestock composition. Because the set of most important spatial determinants of LS varies among regions and scales, land-change models that include the human drivers of land change are best parameterized at sub-global level, where similar biophysical, socioeconomic and cultural conditions prevail in the specific regions.