A comprehensive Quality Assurance (QA) and Quality Control (QC) statistical framework consists of three major phases: Phase 1, preliminary exploration of the raw datasets, including time formatting and combining datasets of different lengths and different time intervals; Phase 2, QA of the datasets, including detecting and flagging duplicates, outliers, and extreme values; and Phase 3, the development of time series of a desired frequency, imputation of missing values, visualization, and a final statistical summary. The framework was applied to time series data collected at the Billy Barr meteorological station (East River Watershed, Colorado). It is suitable for both real-time and post-data-collection QA/QC analysis of meteorological datasets. This data package includes three files. The first is an Excel file converted to CSV format (Billy_Barr_raw_qaqc.csv) that contains the raw meteorological data, i.e., the input to the QA/QC analysis. The second CSV file (Billy_Barr_1hr.csv) contains the QA/QC-ed and flagged meteorological data, i.e., the output of the QA/QC analysis. The last file (QAQC_Billy_Barr_2021-03-22.R) is an R script that implements the QA/QC and flagging process. The CSV files are provided as the input and output of the R script.
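The Phase 2 flagging step can be illustrated with a minimal Python sketch. It is not the R script from the data package: the function name `flag_records`, the median-absolute-deviation rule, the threshold `k`, and the sample readings are all assumptions chosen for illustration. The point it shows is that suspect records are flagged, not altered or dropped.

```python
from datetime import datetime
from statistics import median

def flag_records(records, k=3.0):
    """Flag duplicate time stamps and outliers in (timestamp, value) records.

    Returns (timestamp, value, flag) tuples, where flag is "ok",
    "duplicate", or "outlier". Values are never modified.
    """
    values = [v for _, v in records]
    med = median(values)
    # Median absolute deviation: a spread estimate that is robust to spikes.
    mad = median(abs(v - med) for v in values)
    seen = set()
    flagged = []
    for ts, v in records:
        if ts in seen:
            flag = "duplicate"
        elif mad > 0 and abs(v - med) / mad > k:
            flag = "outlier"
        else:
            flag = "ok"
        seen.add(ts)
        flagged.append((ts, v, flag))
    return flagged

# Hypothetical air-temperature readings (deg C), not Billy Barr data.
raw = [
    (datetime(2021, 1, 1, 0, 0), -5.1),
    (datetime(2021, 1, 1, 0, 15), -5.3),
    (datetime(2021, 1, 1, 0, 15), -5.3),  # repeated time stamp
    (datetime(2021, 1, 1, 0, 30), 42.0),  # implausible spike
    (datetime(2021, 1, 1, 0, 45), -5.6),
    (datetime(2021, 1, 1, 1, 0), -5.8),
]
flags = [flag for _, _, flag in flag_records(raw)]
```

Keeping the flag in a separate field means the raw value stays available for later review, which matches the "flagged" output file described above.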
This data record contains questions and responses to a USGS-wide survey conducted to identify issues and needs associated with quality assurance and quality control (QA/QC) of USGS timeseries data streams. This research was funded by the USGS Community for Data Integration as part of a project titled “From reactive- to condition-based maintenance: Artificial intelligence for anomaly predictions and operational decision-making”. The poll targeted monitoring network managers and technicians and asked questions about operational data streams and timeseries data collection in order to identify opportunities to streamline data access, expedite the response to data quality issues, improve QA/QC procedures, reduce operations costs, and uncover other maintenance needs. The poll was created using an online survey platform. It was sent to 2326 systematically selected USGS email addresses and received 175 responses in 11 days before it was closed to respondents. The poll contained 48 questions of various types, including long answer, multiple choice, and ranking questions, and mixed mandatory and optional questions. These distinctions, as well as full descriptions of the survey questions, are noted in the metadata.
Raw time series datasets of Temperature, Solar Radiation, Relative Humidity, Rainfall, Barometric Pressure, Wind Speed, and Wind Direction for the Barro Colorado (PA-BCI) site for the period from 2003-01-01 to 2016-12-31 were downloaded from the STRI website. The QA/QC protocol for the raw datasets included the detection and removal of NAs, outliers, and bad data points, imputation of missing data, and creation of time series with equidistant time stamps. The time series for the different meteorological parameters were aligned into a single database to be used for modeling. Statistical QA/QC analysis was performed using a series of R libraries in RStudio. The changes in the dataset were flagged. The file BCI_met_drivers_2003-2016_QAQC_summary_report.docx provides the details of the QA/QC data analysis.
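As a rough illustration of building equidistant time stamps and imputing gaps, the sketch below places observations on a regular grid and fills interior gaps by linear interpolation. The function name `regularize`, the choice of linear interpolation, the flag names, and the sample humidity values are assumptions; the summary report, not this sketch, documents what was actually done.

```python
from datetime import datetime, timedelta

def regularize(series, start, end, step):
    """Put observations on an equidistant grid and fill gaps linearly.

    `series` maps datetime -> float. Returns (timestamp, value, flag)
    tuples with flag "observed", "imputed", or "missing" (a gap at either
    end of the grid, which cannot be interpolated).
    """
    grid = []
    t = start
    while t <= end:
        grid.append(t)
        t += step
    # Grid positions that carry a real observation.
    known = [(i, series[ts]) for i, ts in enumerate(grid) if ts in series]
    out = []
    for i, ts in enumerate(grid):
        if ts in series:
            out.append((ts, series[ts], "observed"))
            continue
        before = [(j, v) for j, v in known if j < i]
        after = [(j, v) for j, v in known if j > i]
        if before and after:
            (j0, v0), (j1, v1) = before[-1], after[0]
            value = v0 + (v1 - v0) * (i - j0) / (j1 - j0)
            out.append((ts, value, "imputed"))
        else:
            out.append((ts, None, "missing"))
    return out

# Hypothetical hourly relative humidity (%), with two hours missing.
obs = {
    datetime(2003, 1, 1, 0): 80.0,
    datetime(2003, 1, 1, 3): 86.0,
    datetime(2003, 1, 1, 4): 85.0,
}
filled = regularize(obs, datetime(2003, 1, 1, 0), datetime(2003, 1, 1, 4),
                    timedelta(hours=1))
```

Each imputed value carries its own flag, so a downstream model can include or exclude filled points as needed.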
I. IDENTIFYING INFORMATION
Title* Swedish EAT v1.0
Subtitle
Created by* Jonatan Cerwall (jonatancerwall@gmail.com)
Publisher(s)* Språkbanken Text
Link(s) / permanent identifier(s)*
License(s)*
Abstract*
This dataset is a Swedish translation of the QAQC dataset
(https://cogcomp.seas.upenn.edu/Data/QA/QC/) for expected-answer-type
(EAT) classification. The taxonomy is the Li and Roth taxonomy, also from
https://cogcomp.seas.upenn.edu/Data/QA/QC/.
Funded by*
Cite as
Cerwall, J. (2021). What the BERT? Fine-tuning KB-BERT for Question
Classification. Unpublished manuscript, School of Electrical Engineering
and Computer Science, KTH.
Related datasets
II. USAGE
Key applications Machine learning, EAT Classification
Intended task(s)/usage(s) Evaluate models by standard classification
Recommended evaluation measures Accuracy
Dataset function(s) Testing
Recommended split(s) Test only
III. DATA
Primary data* Text
Language* Swedish
Dataset in numbers* 5451 questions in training set, 500 in test set.
Nature of the content* Open ended factoid questions.
Format* Comma-separated, four columns:
text -- the open ended factoid question
verbose label -- both the coarse-grained label and the fine-grained label
formatted as COARSE:fine
coarse label -- coarse-grained label
fine label -- fine-grained label
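A minimal sketch of reading this four-column layout in Python follows. The header names, the presence of a header row, and the example questions are assumptions made for illustration; only the column order and the COARSE:fine convention come from the description above.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the four-column layout; header names and questions
# are assumptions, not copied from the dataset.
sample = io.StringIO(
    "text,verbose label,coarse label,fine label\n"
    '"Vem skrev Hamlet?",HUM:ind,HUM,ind\n'
    '"Var ligger Uppsala?",LOC:city,LOC,city\n'
    '"Hur många ben har en spindel?",NUM:count,NUM,count\n'
)

rows = list(csv.DictReader(sample))

# The verbose label should agree with the two separate label columns.
for row in rows:
    coarse, fine = row["verbose label"].split(":", 1)
    assert (coarse, fine) == (row["coarse label"], row["fine label"])

# Class balance by coarse-grained label.
coarse_counts = Counter(row["coarse label"] for row in rows)
```

Checking the verbose label against the other two columns is a cheap consistency test worth running once on the full files before training or evaluation.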
Data source(s)*
Translated from the QAQC dataset
(https://cogcomp.seas.upenn.edu/Data/QA/QC/)
Annotator characteristics
IV. ETHICS AND CAVEATS
Ethical considerations
Some outdated treatment of women (e.g., "Vilka är de sexigaste kvinnorna i
världen?", in English "Who are the sexiest women in the world?").
Things to watch out for
V. ABOUT DOCUMENTATION
Data last updated* 2021-07-27
Which changes have been made, compared to the previous version* First version
Access to previous versions
This document created* 2021-07-27
This document last updated* 2023-06-08
Where to look for further details
Documentation template version*
VI. OTHER
Related projects
References
The following datasets were QA/QC-ed (Quality Assurance/Quality Control): 1. Brush Creek Confluence (BCC) discharge data (from Helen Malenda, USGS, Colorado School of Mines), which were calculated using the pressure transducer data and rating curves. The original 15 min time series data were presented as mean daily discharge. 2. Almont discharge data from the United States Geological Survey (USGS). The original data were in 15 min time intervals and were averaged to a mean daily discharge time series. 3. Pump House discharge data as mean daily discharge (downloaded from the SFA portal). 4. BCC and Pump House chemistry data from the SFA data portal and/or original spreadsheets provided by Roelof Versteeg. The following challenging QA/QC problems in the datasets were resolved: missing data, with some gaps longer than 1 month; duplicated dates; anomalies and outliers in discharge and concentrations; and misaligned time stamps of the discharge and concentration measurements (hydrogeochemical balance calculations require the time stamps to be aligned). All QA/QC-ed datasets are given as csv files. The csv files were prepared using the xts files with multiple worksheets, which are also included in the data packages. Figures of the QA/QC-ed datasets are given in jpeg and pdf formats. The QA/QC-ed datasets have been used to quantify discharge and chemical concentrations in river water in order to understand riverine exports of water and dissolved constituents in the East River watershed. These datasets served as a basis for the presentation given by P. Fox et al. at the 2021 Goldschmidt Conference.
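Two of the steps mentioned above, averaging 15 min discharge to a daily mean and aligning discharge with concentration time stamps, can be sketched in Python as follows. The function names, units, and numbers are hypothetical, and keeping only the shared dates is one simple alignment choice, not necessarily the one used for these datasets.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean

def daily_mean(samples):
    """Average (datetime, value) samples into a {date: mean value} mapping."""
    by_day = defaultdict(list)
    for ts, value in samples:
        by_day[ts.date()].append(value)
    return {day: mean(vals) for day, vals in by_day.items()}

def align(discharge, concentration):
    """Keep only the dates present in both daily series, in date order."""
    shared = sorted(discharge.keys() & concentration.keys())
    return [(day, discharge[day], concentration[day]) for day in shared]

# Hypothetical values: discharge in m^3/s, concentration in mg/L.
q_15min = [
    (datetime(2020, 6, 1, 0, 0), 2.0),
    (datetime(2020, 6, 1, 0, 15), 4.0),
    (datetime(2020, 6, 2, 0, 0), 3.0),
]
conc_daily = {date(2020, 6, 2): 10.0, date(2020, 6, 3): 12.0}

aligned = align(daily_mean(q_15min), conc_daily)
```

Only dates with both a discharge and a concentration value survive the alignment, which is the precondition for a balance calculation on paired values.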
Data quality assurance and quality control are critical to the effective conduct of a clinical trial. In the present commentary, we discuss our experience in a large, multicenter stroke trial. In addition to standard data quality control techniques, we have developed novel methods to enhance the entire process. Central to our methods is the use of clinical monitors who are trained in the techniques of data monitoring.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
In June and July of 2020, 45 groundwater wells in McHenry County, Illinois, were sampled for water quality (field properties, major ions, nutrients, and trace metals) and 12 wells were sampled for contaminants of emerging concern (pharmaceuticals, pesticides, and wastewater indicator compounds). Quality-assurance and quality-control samples, including equipment blanks, field blanks, and replicates, were collected during the June and July 2020 sampling. The results of these samples were used to understand the sources of bias and variability associated with sample collection, processing, storage, and shipping. This data release contains one comma-separated values (CSV) file with the results of the quality-control sample collection for general water quality (metals, nutrients, and major ions) and contaminants of emerging concern (wastewater indicator compounds and pharmaceuticals). Water-quality data from the associated groundwater monitoring well data are available at the Nationa ...
Del Mar Nearshore Mooring Salinity Data *** PRELIMINARY, No QA/QC info ***
Del Mar Nearshore Mooring Temperature Data *** PRELIMINARY, No QA/QC info ***
This document briefly describes the quality assurance and quality control procedures followed to generate IMOS Ships of Opportunity Bioacoustics sub-Facility data. Ships of Opportunity is a facility under Australia’s Integrated Marine Observing System (IMOS). Bioacoustic data files are available for public download through the Australian Ocean Data Network (AODN) portal (https://portal.aodn.org.au/search?uuid=8edf509b-1481-48fd-b9c5-b95b42247f82).
This dataset contains quality-assurance and quality-control (QA/QC) data not publicly available in the online National Water Information System (NWIS) for the Pennsylvania Groundwater Monitoring Network, collected by the U.S. Geological Survey (USGS) in Pennsylvania, 2015-2019. The quality-control data (such as blanks and replicates) were collected at a subset of sites to ensure that the data meet the specific data-quality objectives of this program. Also included in this data release are the supplemental tables for the accompanying USGS publication “Characterization of Ambient Groundwater Quality within a State-wide, Fixed Station Monitoring Network in Pennsylvania, 2015-2019.” The supplemental tables consist of: 1) a correlation matrix using results from principal components analysis (PCA) and 2) the individual contributions of each sample to the PCA results.
This data collection contains all the data used in our learning question classification experiments, which has question class definitions, the training and testing question sets, examples of preprocessing the questions, feature definition scripts and examples of semantically related word features.
ABBR - 'abbreviation': expression abbreviated, etc.
DESC - 'description and abstract concepts': manner of an action, description of sth., etc.
ENTY - 'entities': animals, colors, events, food, etc.
HUM - 'human beings': a group or organization of persons, an individual, etc.
LOC - 'locations': cities, countries, etc.
NUM - 'numeric values': postcodes, dates, speed, temperature, etc.
https://cogcomp.seas.upenn.edu/Data/QA/QC/ https://github.com/Tony607/Keras-Text-Transfer-Learning/blob/master/README.md
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains quality control level 1 (QC1) data for all of the variables measured for the iUTAH GAMUT Network Soapstone Climate (PR_ST_C). Each file contains all available QC1 data for a specific variable. Files will be updated as new data become available, but no more than once daily. These data have passed QA/QC procedures such as sensor calibration and visual inspection and removal of obvious errors. These data are approved by Technicians as the best available version of the data. See published script for correction steps specific to this data series. Each file header contains detailed metadata for site information, variable and method information, source information, and qualifiers referenced in the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This resource was created for the 2024 New Zealand Hydrological Society Data Workshop in Queenstown, NZ. This resource contains Jupyter Notebooks with examples for conducting quality control post-processing for in situ aquatic sensor data. The code uses the Python pyhydroqc package to detect anomalies. This resource consists of 4 example notebooks and associated data files. For more information, see the original resource from which this was derived: http://www.hydroshare.org/resource/451c4f9697654b1682d87ee619cd7924.
Notebooks: 1. Example 1: Import and plot data 2. Example 2: Perform rules-based quality control 3. Example 3: Perform model-based quality control (ARIMA) 4. Example 4: Model-based quality control (ARIMA) with user data
Data files: Data files are available for 6 aquatic sites in the Logan River Observatory. Each file contains data for one site for a single year. The files are named according to monitoring site (FranklinBasin, TonyGrove, WaterLab, MainStreet, Mendon, BlackSmithFork) and year. The files were sourced by querying the Logan River Observatory relational database, and equivalent data could be obtained from the LRO website or on HydroShare. Additional information on sites, variables, and methods can be found on the LRO website (http://lrodata.usu.edu/tsa/) or HydroShare (https://www.hydroshare.org/search/?q=logan%20river%20observatory). Each file has the same structure: it is indexed by a datetime column (Mountain Standard Time) and has three columns for each variable. Variable abbreviations and units are:
- temp: water temperature, degrees C
- cond: specific conductance, μS/cm
- ph: pH, standard units
- do: dissolved oxygen, mg/L
- turb: turbidity, NTU
- stage: stage height, cm
For each variable, there are 3 columns: - Raw data value measured by the sensor (column header is the variable abbreviation). - Technician quality controlled (corrected) value (column header is the variable abbreviation appended with '_cor'). - Technician labels/qualifiers (column header is the variable abbreviation appended with '_qual').
There is also a file "data.csv" for use with Example 4. If any user wants to bring their own data file, they should structure it similarly to this file with a single column of datetime values and a single column of numeric observations labeled "raw".
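The column convention described above (raw value, `_cor`, `_qual`) can be explored with a short Python sketch. This does not use pyhydroqc; the helper `summarize`, the sample values, the qualifier text, and the datetime format are assumptions, and only the column-naming pattern comes from the description.

```python
import csv
import io

# Hypothetical excerpt in the described layout; values, qualifier text,
# and the datetime format are assumptions.
sample = io.StringIO(
    "datetime,temp,temp_cor,temp_qual\n"
    "2017-01-01 00:00:00,4.2,4.2,\n"
    "2017-01-01 00:15:00,-9999,4.3,sensor malfunction\n"
    "2017-01-01 00:30:00,4.4,4.4,\n"
)

def summarize(rows, var):
    """Count technician corrections and collect qualifier labels for `var`."""
    changed = 0
    labels = set()
    for row in rows:
        # A correction is any row where the '_cor' value differs from raw.
        if float(row[var]) != float(row[var + "_cor"]):
            changed += 1
        if row[var + "_qual"]:
            labels.add(row[var + "_qual"])
    return changed, labels

changed, labels = summarize(csv.DictReader(sample), "temp")
```

Comparing the raw and corrected columns in this way gives a quick view of how much of a series the technicians changed, which is also the reference that anomaly-detection results are usually compared against.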
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Extended data 1 to 4 for the software article: SentemQC - A novel and cost-efficient method for quality assurance and quality control of high-resolution frequency sensor data in fresh waters.
The extended data consist of tables and a figure containing the inputs to and outputs from the SentemQC runs relevant to the SentemQC paper.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: Data quality control programs used in the mineral industry normally define tolerance limits based on values considered good practice or on values previously applied to similar deposits, although the precision and accuracy of estimates depend on a combination of geological characteristics, estimation parameters, sample spacing, and data quality. This study investigates how sample quality limits affect the estimation results. The proposed methodology is based on a series of metrics that compare the impact on the estimates using a synthetic database with an increasing amount of error added to the original sample grades or positions, emulating different levels of precision. The results of the proposed approach lead to tolerance limits for the grades similar to those recommended in the literature. The influence of positional uncertainty on model estimates is minimal, because current surveying methods have deviations on the order of millimeters, so its impact can be considered negligible.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains quality control level 1 (QC1) data for all of the variables measured for the iUTAH GAMUT Network Trial Lake Climate (PR_TL_C). Each file contains all available QC1 data for a specific variable. Files will be updated as new data become available, but no more than once daily. These data have passed QA/QC procedures such as sensor calibration and visual inspection and removal of obvious errors. These data are approved by Technicians as the best available version of the data. See published script for correction steps specific to this data series. Each file header contains detailed metadata for site information, variable and method information, source information, and qualifiers referenced in the data.
According to our latest research, the 3D Model QA/QC for Machine Control market size reached USD 1.13 billion in 2024, driven by the rising adoption of advanced automation and precision technologies across construction, mining, and agriculture sectors. The market is experiencing robust growth, with a CAGR of 9.8% projected from 2025 to 2033. By the end of 2033, the 3D Model QA/QC for Machine Control market is forecasted to achieve a value of USD 2.66 billion. This expansion is underpinned by increasing industry demand for enhanced accuracy, reduced operational costs, and improved safety across diverse end-user industries, as per our comprehensive analysis.
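The relationship between the three quoted figures can be checked with a small compounding calculation. The sketch below assumes 2024 as the base year and nine annual compounding periods to 2033; the report does not state its exact convention, so both the period count and the helper names are assumptions.

```python
def project(value, rate, years):
    """Compound `value` at annual growth `rate` for `years` years."""
    return value * (1 + rate) ** years

def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1

# Figures from the paragraph above (USD billion); the 9 compounding
# periods (2024 base year to 2033) are an assumption.
projected_2033 = project(1.13, 0.098, 9)
rate = implied_cagr(1.13, 2.66, 9)
```

Under these assumptions the projection comes out at roughly USD 2.6 billion and the implied rate at close to 10%, in line with the stated values; the small differences may reflect rounding or a different base-year convention.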
One of the primary growth factors for the 3D Model QA/QC for Machine Control market is the escalating need for precision and efficiency in large-scale infrastructure projects. As construction and mining operations become more complex and require higher levels of accuracy, the integration of advanced 3D modeling solutions for Quality Assurance (QA) and Quality Control (QC) has become indispensable. These solutions ensure that machine control systems operate with minimal errors, significantly reducing the risk of costly rework and project delays. Additionally, the widespread adoption of Building Information Modeling (BIM) and digital twin technologies is further fueling demand for sophisticated QA/QC tools, as stakeholders seek to maintain stringent quality standards and comply with regulatory frameworks worldwide.
Another significant driver propelling the 3D Model QA/QC for Machine Control market is the rapid advancement in sensor technologies, artificial intelligence, and cloud computing. Modern machine control systems now leverage high-resolution sensors, LiDAR, and GNSS to capture detailed site data, which is then processed and validated using advanced QA/QC software. The integration of AI-powered analytics allows for real-time detection of discrepancies and proactive correction, ensuring that construction and mining activities adhere to design specifications. Furthermore, the shift toward cloud-based platforms enables seamless collaboration among project stakeholders, enhancing data accessibility, version control, and workflow automation. These technological advancements are making QA/QC processes more efficient, scalable, and cost-effective, thereby attracting a broader user base across various industry verticals.
The growing emphasis on sustainability and safety is also contributing to the market’s expansion. Regulatory bodies and industry associations are increasingly mandating rigorous QA/QC protocols to minimize environmental impact, optimize resource utilization, and ensure worker safety. In sectors such as oil and gas, agriculture, and infrastructure development, the deployment of 3D model QA/QC solutions helps organizations achieve compliance with environmental and safety standards, while also supporting initiatives aimed at reducing carbon footprints. As governments worldwide continue to invest in smart city and infrastructure modernization projects, the demand for reliable QA/QC tools for machine control systems is expected to surge, creating new growth opportunities for market participants.
From a regional perspective, North America currently dominates the 3D Model QA/QC for Machine Control market, thanks to its advanced construction ecosystem, high adoption rates of digital technologies, and presence of leading industry players. However, the Asia Pacific region is witnessing the fastest growth, driven by rapid urbanization, infrastructure development, and increasing investments in automation across China, India, and Southeast Asia. Europe also represents a significant market, supported by stringent regulatory standards and a strong focus on sustainable construction practices. The Middle East & Africa and Latin America are gradually emerging as promising markets, fueled by large-scale infrastructure projects and the growing adoption of precision agriculture and mining technologies.
This data provides results from the California Environmental Data Exchange Network (CEDEN) for field and lab chemistry analyses. The data set contains two provisionally assigned values (“DataQuality” and “DataQualityIndicator”) to help users interpret the data quality metadata provided with the associated result.
Due to file size limitations, the data has been split into individual resources by year. The entire dataset can also be downloaded in bulk using the zip files on this page (in csv format or parquet format), and developers can also use the API associated with each year's dataset to access the data.
Users who want to manually download more specific subsets of the data can also use the CEDEN Query Tool, which provides access to the same data presented here, but allows for interactive data filtering.
NOTE: Some of the field and lab chemistry data that has been submitted to CEDEN since 2020 has not been loaded into the CEDEN database. That data is not included in this data set (and is also not available via the CEDEN query tool described above), but is available as a supplemental data set: Surface Water - Chemistry Results - CEDEN Augmentation. For consistency, many of the conditions applied to the data in this dataset and in the CEDEN query tool are also applied to that supplemental dataset (e.g., no rejected data or replicates are included), but that supplemental data is provisional and may not reflect all of the QA/QC controls applied to the regular CEDEN data available here.