44 datasets found
  1. Protein Cross-Linking Database

    • neuinfo.org
    Updated Jan 29, 2022
    Cite
    (2022). Protein Cross-Linking Database [Dataset]. http://identifiers.org/RRID:SCR_021027
    Dataset updated
    Jan 29, 2022
    Description

    Web application and database designed for sharing, visualizing, and analyzing protein cross-linking mass spectrometry data, with an emphasis on structural analysis and quality control. Includes public and private data sharing capabilities and a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. Used for private collaboration and public data dissemination.

  2. CLM - Bore assignments QLD

    • data.gov.au
    • researchdata.edu.au
    zip
    Updated Nov 19, 2019
    Cite
    Bioregional Assessment Program (2019). CLM - Bore assignments QLD [Dataset]. https://data.gov.au/data/dataset/f8937dd8-b3a0-490e-a452-9dc56fe03914
    Available download formats: zip (158505)
    Dataset updated
    Nov 19, 2019
    Dataset provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Queensland
    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset contains the aquifer assignment results for the Queensland part of the Clarence-Moreton Basin. The data were organized by hydrostratigraphic units. Assigning a bore to a specific aquifer is underpinned by the screened interval data and the aquifer boundaries. In many cases, it is impossible to assign the screened interval of a bore to a single aquifer as bores are either screened across different aquifers or there is insufficient information on stratigraphy and screened intervals. Bores were assigned to aquifers by comparing their screen intervals and depth with aquifer boundary data. The required information was extracted from the "Casing", "Aquifer", and "Stratigraphy" tables of the DNRM database.

    Dataset History

    The following steps were followed during the aquifer assignment:

    1. Determine the boundary of the aquifer of interest. The 'Aquifer' table in the DNRM database registers aquifers that a bore intersects when it is drilled and records the upper and lower extents of aquifers. This information was used to identify the aquifer boundary at any specific location. When boundary information was missing the 'Stratigraphy' table was used to identify aquifer boundaries instead.

    2. Determine the screen interval of bores. The 'Casing' table contains the screen information for most bores in the database. The codes 'PERF', 'SCRN' and 'ENDD' in the column 'MATERIAL' indicate water entry locations. The code 'OPEN' indicates that a bore is uncased at some depths; where such a bore intersects an aquifer, it is considered a water supply point. These codes were used to find the screen interval of a bore. When multiple screens exist, the bore is assumed to be screened across the entire length of the individual screens.

    3. Determine the screen code. A bore may tap into an aquifer in four ways depending on the location of its screen relative to the aquifer. Four codes (I, T, B and E) were used to indicate the different spatial relationships of a bore with its targeted aquifer (a code sketch of this interval comparison follows the list). When screen information is lacking, bores with their lower ends located in an aquifer are assumed to tap that aquifer and were assigned the screen code 'BOI'.

    4. Filter bores for a specific area using a shapefile or coordinates. If only a part of the aquifer is of interest, the output bores can be filtered based on their locations.

    5. Cross-check the final datasets against expert knowledge and the spatial context of aquifers. Errors are common in such databases, and some will persist despite extensive data quality checks. However, such errors are often highlighted during data interpretation and visual representation and can subsequently be corrected through an iterative process.
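    The interval comparison behind steps 2 and 3 can be sketched in a few lines. This is a minimal illustration only: the function and field names are assumptions for this write-up, not part of the source dataset or the DNRM schema, and depths are treated as metres below ground surface.

```python
# Hedged sketch: derive a spatial-relationship code (I/T/B/E) for a bore's
# screened interval against one aquifer's upper and lower boundaries.
# Depths increase downwards; all names here are illustrative assumptions.

def screen_code(screen_top, screen_bottom, aq_top, aq_bottom):
    """Return 'I' (inside), 'T' (straddles the aquifer top), 'B' (straddles the
    bottom), 'E' (extends beyond both boundaries), or None if there is no overlap."""
    if screen_bottom <= aq_top or screen_top >= aq_bottom:
        return None                      # screen does not intersect this aquifer
    above = screen_top < aq_top          # screen starts above the aquifer
    below = screen_bottom > aq_bottom    # screen ends below the aquifer
    if above and below:
        return "E"
    if above:
        return "T"
    if below:
        return "B"
    return "I"

# Example: a bore screened 40-55 m in an aquifer spanning 35-60 m lies inside it.
print(screen_code(40, 55, 35, 60))  # -> "I"
```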

    Dataset Citation

    Bioregional Assessment Programme (2014) CLM - Bore assignments QLD. Bioregional Assessment Derived Dataset. Viewed 28 September 2017, http://data.bioregionalassessments.gov.au/dataset/f8937dd8-b3a0-490e-a452-9dc56fe03914.

    Dataset Ancestors

  3. Pokemon TCG Pocket Dataset

    • kaggle.com
    zip
    Updated Oct 5, 2025
    Cite
    JoaoCoelho03 (2025). Pokemon TCG Pocket Dataset [Dataset]. https://www.kaggle.com/datasets/joaocoelho03/pocket-tcg-dataset/data
    Available download formats: zip (17767 bytes)
    Dataset updated
    Oct 5, 2025
    Authors
    JoaoCoelho03
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Pokémon TCG Pocket Card Dataset

    This dataset contains detailed information about all cards available in the Pokémon Trading Card Game Pocket mobile app. The data has been carefully curated and cleaned to provide Pokémon enthusiasts and developers with accurate and comprehensive card information.

    Dataset Contents

    • 8+ Complete Sets: All major card sets including latest expansions
    • 1000+ Cards: Every card with detailed metadata and classifications
    • Clean Format: CSV format optimized for analysis, machine learning, and research

    Key Features

    🃏 Complete Card Data

    • Card names and numbers with proper formatting
    • Complete set and pack organization structure
    • Release dates for all sets and expansions
    • Total card counts per set for completion tracking

    💎 Rarity Classifications

    • 7+ Rarity Types including:
      • Common, Uncommon, Rare
      • Ultra Rare, Secret Rare, Special Art Rare
      • Crown Rare and other premium classifications
    • Includes shiny and special variant cards
    • Standardized rarity naming conventions

    Use Cases

    📊 Data Analysis & Research

    • Card rarity distribution analysis across sets
    • Set completion and collection tracking

    🤖 Machine Learning & AI

    • Card classification models
    • Recommendation systems for collectors
    • Rarity prediction algorithms
    • Collection optimization models

    📈 Visualization & Dashboards

    • Interactive card browsers
    • Collection progress tracking
    • Rarity distribution charts
    • Set release timeline visualizations

    Data Quality

    • Manually Verified: All card information cross-checked for accuracy
    • Standardized Format: Consistent naming and classification across all entries
    • Complete Coverage: All available cards from the mobile game
    • Clean Structure: Optimized for both human readability and machine processing

    Technical Specifications

    📋 File Format

    • Format: CSV (Comma Separated Values)
    • Encoding: UTF-8 with full international character support
    • Delimiter: Comma (,)
    • Headers: Included in first row

    🗂️ Column Structure (9 columns)

    Column              Description                    Example
    set_name            Full name of the card set      "Eevee Grove"
    set_code            Official set identifier        "a3b"
    set_release_date    Set release date               "June 26, 2025"
    set_total_cards     Total cards in the set         107
    pack_name           Name of the specific pack      "Eevee Grove"
    card_name           Full card name                 "Leafeon"
    card_number         Card number within set         "2"
    card_rarity         Rarity classification          "Rare"
    card_type           Card type category             "Pokémon"
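    As a quick illustration of the analysis use cases above, the CSV can be loaded with pandas to compute a per-set rarity distribution and a simple completion metric. This is a minimal sketch; the file name is an assumption, so substitute the file shipped with the dataset.

```python
# Minimal sketch: load the card table and summarise rarity per set.
# "pocket_tcg_cards.csv" is an assumed file name, not part of the dataset description.
import pandas as pd

cards = pd.read_csv("pocket_tcg_cards.csv", encoding="utf-8")

# Rarity distribution per set: counts of each card_rarity within each set_name.
rarity_by_set = (
    cards.groupby(["set_name", "card_rarity"])
         .size()
         .unstack(fill_value=0)
)
print(rarity_by_set)

# Completion tracking: cards listed per set versus the declared set_total_cards.
listed = cards.groupby("set_name")["card_number"].nunique()
declared = cards.groupby("set_name")["set_total_cards"].first()
print((listed / declared).rename("coverage"))
```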

    If you find this dataset useful, consider giving it an upvote — it really helps others discover it too! 🔼😊

    Happy analyzing! 🎯📊

  4. Convenience Stores, USA, Top 10 | 32k+ PoIs with 15+ Attributes | monthly updates | API & Datasets

    • datarade.ai
    .json, .xml, .csv
    Updated Apr 16, 2025
    Cite
    xavvy (2025). Convenience Stores, USA, Top 10 | 32k+ PoIs with 15+ Attributes | monthly updates | API & Datasets [Dataset]. https://datarade.ai/data-products/convenience-stores-usa-top-10-32k-pois-with-15-attribut-xavvy
    Available download formats: .json, .xml, .csv
    Dataset updated
    Apr 16, 2025
    Dataset authored and provided by
    xavvy
    Area covered
    United States of America
    Description

    Xavvy fuel is the leading source for location data and market insights worldwide. We specialize in data quality and enrichment, providing high-quality POI data for convenience stores in the United States.

    Base data
    • Name/Brand
    • Address
    • Geocoordinates
    • Opening Hours
    • Phone
    • ...

    15+ Services
    • Fuel
    • Wifi
    • ChargePoints
    • ...

    10+ Payment options
    • Visa
    • MasterCard
    • Google Pay
    • individual Apps
    • ...

    Our data offering is highly customizable and flexible in delivery – whether one-time or regular data delivery, push or pull services, and various data formats – we adapt to our customers' needs.

    Brands included:
    • 7-Eleven
    • Circle K
    • Alimentation Couche-Tard
    • Speedway
    • Casey's
    • ...

    The total number of convenience stores per region, market share distribution among competitors, or the ideal location for new branches – our convenience store data provides valuable insights into the market and serves as the perfect foundation for in-depth analyses and statistics. Our data helps businesses across various industries make informed decisions regarding market development, expansion, and competitive strategies. Additionally, our data contributes to the consistency and quality of existing datasets. A simple data mapping allows for accuracy verification and correction of erroneous entries.
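    The "simple data mapping" mentioned above can be approximated by matching an existing POI table against the feed on brand plus proximity, then reviewing the unmatched records. The sketch below is only an assumption about what such a mapping could look like (field names and the 100 m threshold are illustrative), not xavvy's actual matching logic.

```python
# Hedged sketch: match own POI records against a reference feed by brand and
# distance, flagging records with no nearby counterpart for manual review.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_poi(record, reference_pois, max_dist_m=100.0):
    """Return the closest reference POI of the same brand within max_dist_m, else None."""
    candidates = [
        (haversine_m(record["lat"], record["lon"], ref["lat"], ref["lon"]), ref)
        for ref in reference_pois
        if ref["brand"].lower() == record["brand"].lower()
    ]
    in_range = [(d, ref) for d, ref in candidates if d <= max_dist_m]
    return min(in_range, key=lambda pair: pair[0])[1] if in_range else None
```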

    Especially when displaying information about restaurants and fast-food chains on maps or in applications, high data quality is crucial for an optimal customer experience. Therefore, we continuously optimize our data processing procedures:
    • Regular quality controls
    • Geocoding systems to refine location data
    • Cleaning and standardization of datasets
    • Consideration of current developments and mergers
    • Continuous expansion and cross-checking of various data sources

    Integrate the most comprehensive database of convenience store locations in the USA into your business. Explore our additional data offerings and gain valuable market insights directly from the experts!

  5. Good Growth Plan 2014-2019 - Indonesia

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Jan 27, 2023
    Cite
    Syngenta (2023). Good Growth Plan 2014-2019 - Indonesia [Dataset]. https://microdata.worldbank.org/index.php/catalog/5630
    Dataset updated
    Jan 27, 2023
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Indonesia
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.
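    A minimal sketch of the "quota based random sampling" step described above: growers are drawn at random, kept only while their quota cell is open, and interviews are capped at 5 per village. Field and quota names are assumptions for illustration, not Kynetec's sampling tool.

```python
# Hedged sketch of quota-based random sampling with a per-village interview cap.
import random

def select_respondents(growers, quotas, per_village_cap=5, seed=0):
    """growers: list of dicts with 'crop', 'region', 'farm_size_class', 'village'.
    quotas: dict mapping (crop, region, farm_size_class) -> remaining interviews."""
    rng = random.Random(seed)
    pool = growers[:]
    rng.shuffle(pool)                          # random selection first ...
    village_counts, selected = {}, []
    for g in pool:
        cell = (g["crop"], g["region"], g["farm_size_class"])
        if quotas.get(cell, 0) <= 0:           # ... then check the quota cell
            continue
        if village_counts.get(g["village"], 0) >= per_village_cap:
            continue                           # avoid clustering interviews in one village
        quotas[cell] -= 1
        village_counts[g["village"]] = village_counts.get(g["village"], 0) + 1
        selected.append(g)
    return selected
```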

    BF (benchmark farms) screened from Indonesia were selected based on the following criteria:

    (a) Corn growers in East Java
    - Location: East Java (Kediri and Probolinggo) and Aceh
    - Innovative (early adopters); progressive (keen to learn about agronomy and pests; willing to try new technology); loyal (loyal to technology that can help them)
    - Have technical drains (an irrigation system)
    - Marketing network for corn: post-harvest access to market (they generally sell 80% of their harvest)
    - Mid-tier (sub-optimal CP/SE use)
    - Influenced by fellow farmers and retailers
    - May need longer credit

    (b) Rice growers in West and East Java
    - Location: West Java (Tasikmalaya), East Java (Kediri), Central Java (Blora, Cilacap, Kebumen), South Lampung
    - Progressive (keen to learn about agronomy and pests; willing to try new technology)
    - Accustomed to using farming equipment and pesticides
    - Long experience cultivating rice in their area
    - Willing to move forward in order to increase productivity
    - Land broad enough for the upcoming project
    - Influential within their grower group (ability to influence others)
    - Mid-tier (sub-optimal CP/SE use)
    - May need longer credit

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size).
    o Kynetec cross-validates the answers of the growers in three different ways:
      1. Within the grower (check if growers respond consistently during the interview)
      2. Across years (check if growers respond consistently throughout the years)
      3. Within cluster (compare a grower's responses with those of others in the group)
    o All the above-mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked. A code sketch of such checks follows this list.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.
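    The sketch below illustrates the kind of automated checks referred to in the cross-validation bullet: a within-record logical check (crop area versus farm size, taken from the example in the text) and a simple across-years comparison. Field names and thresholds are assumptions, not Kynetec's actual rules.

```python
# Hedged sketch of within-record and across-years consistency checks.

def check_record(rec):
    """Within-record checks; returns a list of flagged issues."""
    issues = []
    if rec["total_crop_area_ha"] > rec["farm_size_ha"]:
        issues.append("total crop area exceeds farm size")   # logical check from the text
    if not 0 < rec["yield_t_per_ha"] < 30:
        issues.append("implausible yield")                    # assumed range check
    return issues

def check_across_years(records_by_year, max_rel_change=0.5):
    """Across-years check: flag farm-size jumps of more than 50% between waves."""
    issues = []
    years = sorted(records_by_year)
    for prev, cur in zip(years, years[1:]):
        a = records_by_year[prev]["farm_size_ha"]
        b = records_by_year[cur]["farm_size_ha"]
        if a and abs(b - a) / a > max_rel_change:
            issues.append(f"farm size changed by more than 50% between {prev} and {cur}")
    return issues
```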

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.
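    For illustration only, the 2014 re-estimate amounts to a weighted average of the NPK rates reported in later waves. The rates and weights below are made-up placeholders, since the description does not publish the actual values or weighting scheme.

```python
# Hedged illustration of re-estimating the 2014 NPK input from later waves.
rates_kg_per_ha = {2015: 182.0, 2016: 175.0, 2017: 168.0}   # example values only
weights         = {2015: 0.5,   2016: 0.3,   2017: 0.2}     # assumed weights

estimate_2014 = sum(rates_kg_per_ha[y] * weights[y] for y in rates_kg_per_ha) / sum(weights.values())
print(round(estimate_2014, 1))  # -> 177.1
```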

  6. Multi Country Study Survey 2000-2001 - United States

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 23, 2014
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001 - United States [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/148
    Dataset updated
    Jan 23, 2014
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    United States
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals on their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right to standardize indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful to improve the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A sample of 5,000 households across the US was purchased from Survey Sampling, Inc. located in Connecticut. This sample is based on Random Digit samples.

    This sample was stratified by state to match the percentage of U.S. residents living in each of the fifty states.

    The 5,000 sampled households were randomly assigned to one of three different experimental treatments (normal, personalized, and personalized plus a $2 incentive).

    The experiment was done for purposes of evaluating response rate effects of alternative means of contacting US residents.

    Mode of data collection

    Mail Questionnaire [mail]

    Cleaning operations

    Data Coding

    At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate it was entered locally.

    Data Entry Program

    A data entry program was developed in WHO specifically for the survey study and provided to the sites. It was developed using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there existed a range or a list of possible values that the program checked for.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.
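    The two mechanisms described above (field validation against allowed ranges and double data entry with conflict resolution) can be sketched as follows. The age ceiling of 120 comes from the text; everything else is an illustrative assumption, not the I-Shell implementation.

```python
# Hedged sketch: range validation plus comparison of first and second keying.

def validate_field(name, value, valid_ranges):
    """True if the value falls inside the allowed range for that field."""
    lo, hi = valid_ranges[name]
    return lo <= value <= hi

def compare_double_entry(first_entry, second_entry):
    """Return fields whose second keying disagrees with the first, so the operator
    can resolve each conflict before the case is marked as complete."""
    return {k: (first_entry[k], second_entry.get(k))
            for k in first_entry
            if second_entry.get(k) != first_entry[k]}

ranges = {"age": (0, 120)}
print(validate_field("age", 135, ranges))                                   # -> False
print(compare_double_entry({"age": 42, "sex": 1}, {"age": 44, "sex": 1}))   # -> {'age': (42, 44)}
```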

    Data Transfer

    The data entry program was capable of exporting the data that was entered into one compressed database file which could be easily sent to WHO using email attachments or a file transfer program onto a secure server, no matter how many cases were in the file. The sites were allowed the use of as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file and they were merged once they were delivered to WHO with the help of other programs that were built for automating the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks

    Once the data was received it was analyzed for missing information, invalid responses and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback

    After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, and missing age, sex, education and some other important variables.

  7. Good Growth Plan, 2014-2019 - Paraguay

    • microdata.fao.org
    Updated Feb 17, 2021
    Cite
    Syngenta (2021). Good Growth Plan, 2014-2019 - Paraguay [Dataset]. https://microdata.fao.org/index.php/catalog/1813
    Dataset updated
    Feb 17, 2021
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Paraguay
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National Coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.

    BF (benchmark farms) screened from Paraguay were selected based on the following criteria:

    (a) Smallholder soybean growers
    - Medium to high technology farms
    - Regions: Hohenau (Itapúa), Edelira (Itapúa), Pirapó (Itapúa), La Paz (Itapúa), Naranjal (Alto Paraná), San Cristóbal (Alto Paraná)
    - Corn and soybean grown in rotation (corn first, then soybean)

    (b) Smallholder maize growers
    - Medium to high technology farms
    - Regions: Hohenau (Itapúa), Edelira (Itapúa), Pirapó (Itapúa), La Paz (Itapúa), Naranjal (Alto Paraná), San Cristóbal (Alto Paraná)
    - Corn and soybean grown in rotation (corn first, then soybean)

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size) 
    o Kynetec cross validates the answers of the growers in three different ways: 
      1. Within the grower (check if growers respond consistently during the interview) 
      2. Across years (check if growers respond consistently throughout the years) 
      3. Within cluster (compare a grower's responses with those of others in the group) 
    

    o All the above mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.

  8. Multi Country Study Survey 2000-2001 - Syrian Arab Republic

    • catalog.ihsn.org
    • apps.who.int
    Updated Mar 29, 2019
    Cite
    World Health Organization (WHO) (2019). Multi Country Study Survey 2000-2001 - Syrian Arab Republic [Dataset]. https://catalog.ihsn.org/index.php/catalog/3882
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Syria
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals on their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right to standardize indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful to improve the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding

    At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate it was entered locally.

    Data Entry Program

    A data entry program was developed in WHO specifically for the survey study and provided to the sites. It was developed using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there existed a range or a list of possible values that the program checked for.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.

    Data Transfer

    The data entry program was capable of exporting the data that was entered into one compressed database file which could be easily sent to WHO using email attachments or a file transfer program onto a secure server, no matter how many cases were in the file. The sites were allowed the use of as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file and they were merged once they were delivered to WHO with the help of other programs that were built for automating the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks

    Once the data was received it was analyzed for missing information, invalid responses and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback

    After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, and missing age, sex, education and some other important variables.

  9. Good Growth Plan 2014-2019 - Japan

    • microdata.worldbank.org
    • datacatalog.ihsn.org
    Updated Jan 27, 2023
    Cite
    Syngenta (2023). Good Growth Plan 2014-2019 - Japan [Dataset]. https://microdata.worldbank.org/index.php/catalog/5634
    Dataset updated
    Jan 27, 2023
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Japan
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.

    BF (benchmark farms) screened from Japan were selected based on the following criteria:

    - Location: Hokkaido, Tokachi (JA Memuro, JA Otofuke, JA Tokachi Shimizu, JA Obihiro Taisho); initial focus on Memuro, Otofuke, Tokachi Shimizu and Obihiro Taisho. Locations added in GGP 2015 due to a change of reference farms: Obihiro, Kamikawa, Abashiri
    - BF: no use of in-furrow application (Amigo); no use of Amistar
    - Contract farmers of snack and other food companies. Screening question: "Do you have quality contracts in place with snack and food companies for your potato production?" (Y/N); if no, screen out
    - Interested in increasing marketable yield. Screening question: "Are you interested in growing branded potatoes (premium potatoes for the processing industry)?" (Y/N); if no, screen out
    - Potato growers for processing use

    Background info: No mention of Syngenta. Labor cost is a very serious issue: in general, labor cost in Japan is very high, and growers try to reduce it through mechanization; they want to manage the share of labor cost in total production cost. Quality and yield driven.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size).
    o Kynetec cross-validates the answers of the growers in three different ways:
      1. Within the grower (check if growers respond consistently during the interview)
      2. Across years (check if growers respond consistently throughout the years)
      3. Within cluster (compare a grower's responses with those of others in the group)
    o All the above-mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.

  10. Multi Country Study Survey 2000-2001, Long version - Lebanon

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 16, 2014
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001, Long version - Lebanon [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/192
    Dataset updated
    Jan 16, 2014
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Lebanon
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries completing 71 surveys because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes of in- person household 90 minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups ( i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the selfreports of individuals of their own health were calibrated against well-known performance tests (i.e. self-report vision was measured against standard Snellen's visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability . The same method was also used for self-reports of individuals assessing responsiveness of their health systems where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right for standardizing indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful for improving the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding: At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate, it was entered locally.

    Data Entry Program: A data entry program was developed at WHO specifically for the survey study and provided to the sites. It was built using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and applying range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there was a range or a list of possible values that the program checked against.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.
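    As a rough illustration of the validation logic described above, the following sketch (in Python, not the I-Shell program itself; field names and ranges are hypothetical) shows a range check and a double-entry comparison:

    # Minimal sketch of range validation and double-entry comparison.
    def validate_record(record, valid_ranges):
        """Flag values outside their allowed range, e.g. an age greater than 120."""
        problems = {}
        for field, (lo, hi) in valid_ranges.items():
            value = record.get(field)
            if value is None or not (lo <= value <= hi):
                problems[field] = value
        return problems

    def compare_entries(first_entry, second_entry):
        """Return fields where the second data entry disagrees with the first."""
        return {f: (first_entry.get(f), second_entry.get(f))
                for f in set(first_entry) | set(second_entry)
                if first_entry.get(f) != second_entry.get(f)}

    valid_ranges = {"age": (0, 120), "sex": (1, 2)}
    print(validate_record({"age": 150, "sex": 1}, valid_ranges))            # {'age': 150}
    print(compare_entries({"age": 34, "sex": 1}, {"age": 43, "sex": 1}))    # {'age': (34, 43)}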

    Data Transfer: The data entry program could export the entered data into one compressed database file, which could easily be sent to WHO as an email attachment or via a file transfer program to a secure server, no matter how many cases were in the file. The sites were allowed to use as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file, and the files were merged once they were delivered to WHO with the help of other programs built to automate the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks: Once the data was received, it was analyzed for missing information, invalid responses, and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback: After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, missing age, sex, education and some other important variables.

  11. D

    Database Automation Industry Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 27, 2025
    Cite
    Market Report Analytics (2025). Database Automation Industry Report [Dataset]. https://www.marketreportanalytics.com/reports/database-automation-industry-90626
    Explore at:
    ppt, doc, pdf (available download formats)
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Database Automation market is experiencing robust growth, projected to reach $2.35 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 24.38% from 2025 to 2033. This expansion is fueled by several key factors. The increasing complexity of database environments, coupled with the rising demand for faster and more reliable application deployments, is driving the adoption of automation solutions. Organizations across diverse sectors, including Banking, Financial Services and Insurance (BFSI), IT and Telecom, and E-commerce, are increasingly leveraging database automation to streamline operations, reduce manual errors, and improve overall efficiency. The shift towards cloud-based deployments further contributes to market growth, as cloud platforms offer scalability and flexibility well-suited to automated database management. While on-premises solutions still hold a significant share, the cloud segment is expected to witness faster growth in the coming years due to its cost-effectiveness and accessibility. Large enterprises are currently the primary adopters of database automation, but growing awareness and the availability of tailored solutions are expanding the market among Small and Medium-Sized Enterprises (SMEs). Competitive offerings from major players like Oracle, IBM, and Amazon Web Services, coupled with the emergence of specialized vendors, are shaping a dynamic and innovative market landscape. The market segmentation reveals significant opportunities across various components, including Database Patch and Release Automation, Application Release Automation, and Database Test Automation. Services related to implementation, integration, and support form a crucial segment, contributing significantly to the overall market value. While North America currently dominates the market, regions like Asia-Pacific are projected to exhibit strong growth owing to rapid digitalization and increasing IT spending. However, factors such as the high initial investment costs associated with implementing automation solutions and the need for skilled personnel to manage these systems could potentially restrain market growth to some extent. The overall outlook for the Database Automation market remains positive, driven by the persistent need for enhanced operational efficiency and improved application delivery cycles in a rapidly evolving technological landscape. Recent developments include: June 2023: Aquatic Informatics launched a new automated data validation tool, HydroCorrect, that can accelerate proactive monitoring and management of flooding, groundwater, and water quality in the Aquarius platform. With machine-learning technology, HydroCorrect will transform the QA/QC process with automation and standardized workflows that save time and improve data quality., May 2023: data.world, the data catalog platform, acquired the Mighty Canary technology and its incorporation into a new DataOps application. The application uses automation to surface contextual insights and real-time data quality updates directly to the BI, communications, and collaboration tools data consumers use.. Key drivers for this market are: Continuously Growing Volumes of Data Across Verticals, Increasing Demand for Automating Repetitive Database Management Processes. Potential restraints include: Continuously Growing Volumes of Data Across Verticals, Increasing Demand for Automating Repetitive Database Management Processes. 
Notable trends are: IT and Telecommunication industry is Expected to Witness Significant Growth.
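    As a plausibility check on how the quoted figures compound, the short sketch below applies the standard CAGR formula to the reported 2025 base and 24.38% rate; the implied 2033 value is derived here for illustration and is not a figure taken from the report.

    # Compound the 2025 base at the reported CAGR to see the implied 2033 size.
    base_2025 = 2.35            # USD billion (reported)
    cagr = 0.2438               # 24.38% (reported)
    years = 2033 - 2025         # eight compounding periods
    implied_2033 = base_2025 * (1 + cagr) ** years
    print(f"Implied 2033 market size: ~${implied_2033:.1f} billion")   # ~ $13.5 billion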

  12. f

    Data from: Spatiotemporal Characteristics of Global Building Material...

    • acs.figshare.com
    xlsx
    Updated Nov 13, 2025
    Cite
    Qiance Liu; Xin Ouyang; Wensong Zhu; Kun Sun; Jinchao Song; Xiang Li; Yunyun Li; Wu Chen; Gang Liu (2025). Spatiotemporal Characteristics of Global Building Material Intensity Revealed for Circular and Low-Carbon Construction [Dataset]. http://doi.org/10.1021/acs.est.5c05684.s002
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    ACS Publications
    Authors
    Qiance Liu; Xin Ouyang; Wensong Zhu; Kun Sun; Jinchao Song; Xiang Li; Yunyun Li; Wu Chen; Gang Liu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Quantifying the material intensity of buildings (MIB) is fundamental for built environment stock accounting, construction resource and waste management, and embodied carbon assessment. However, existing MIB data reported in the literature are often sparse, heterogeneous, and scattered across archetypes, which hinders comparability, quality checks, and harmonization. Here, we compiled a global MIB database containing 3051 MIB records in a unified form measured in kg/m2 for 31 types of construction materials, based on both secondary and primary data from multiple sources. Applying a mean-absolute-deviation (MAD) rule to generate archetype-specific general MIBs, we revealed that the upward pressure on MIB from increases in floor area and building height has been partly offset by the use of light-weight materials, yielding a current aggregate MIB of 1464.3 kg/m2 that is comparable to the pre-1920 levels. Global building material composition shifted markedly away from brick and wood and toward higher shares of steel, cement, sand, and stone, alongside sizable heterogeneity across archetypes, regions, and periods. This expanded, standardized, and harmonized global MIB database can help inform material efficiency targets, embodied carbon baselines, and stock-aware planning for selective demolition, procurement, and renovation in a circular and low-carbon construction transition.
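    The mean-absolute-deviation (MAD) rule mentioned above can be sketched as follows; the threshold k and the example values are assumptions for illustration, not the paper's exact parameters.

    import statistics

    def general_mib(values, k=1.5):
        """Average an archetype's MIB records after screening values that lie
        more than k mean-absolute-deviations from the mean (k is assumed)."""
        mean = statistics.fmean(values)
        mad = statistics.fmean(abs(v - mean) for v in values)
        kept = [v for v in values if abs(v - mean) <= k * mad] if mad else list(values)
        return statistics.fmean(kept)

    # Hypothetical concrete-intensity records (kg/m2) for one archetype:
    print(general_mib([310, 295, 330, 305, 900]))   # 310.0 after the 900 record is screened out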

  13. A two-dimensional PCA plot obtained from a multiple factor analysis (MFA)...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Aug 8, 2023
    Cite
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta (2023). A two-dimensional PCA plot obtained from a multiple factor analysis (MFA) performed on all 22 populations using 142 bioclimatic variables retrieved from the WorldClim database. [Dataset]. http://doi.org/10.1371/journal.pntd.0011486.s001
    Explore at:
    bin (available download formats)
    Dataset updated
    Aug 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A two-dimensional PCA plot obtained from a multiple factor analysis (MFA) performed on all 22 populations using 142 bioclimatic variables retrieved from the WorldClim database.

  14. d

    Data from TropiRoot 1.0 database: tropical root characteristics across...

    • search.dataone.org
    • osti.gov
    Updated Mar 10, 2025
    Cite
    Amanda L. Cordeiro; Daniela F. Cusack; Nathaly Guerrero-Ramírez; Richard J. Norby; Laura Toro; Michelle Y. Wong; S. Joseph Wright; Kristine Grace M. Cabugao; Kelly M. Andersen; Lucia Fuchslueger; Colleen M. Iversen; Fiona Soper; Om Prakash Ghimire; Laynara F. Lugli; Ana Caroline Miron; Oscar Valverde-Barrantes; Marie Arnaud; Sarah Batterman; Lee H. Dietterich; Ming Yang Lee; Monique Weemstra; Daniela Yaffar; Shalom D. Addo-Danso; Kerstin Pierick; Ryan Bridges; Carina Easton; Isabella Felsing; Nathan B. Gonçalves; Riley Krudop; Mason R. McKinzie; Julia Perbohner; Alejandra N. Pozzoli-Oropeza; Mirna Samaniego; Alex W. Smilor; Ilana S. Vargas; Layna Webb; Teddy Nikitin; Jennifer S. Powers; M. Luke McCormack (2025). Data from TropiRoot 1.0 database: tropical root characteristics across environments [Dataset]. http://doi.org/10.15485/2507279
    Explore at:
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    ESS-DIVE
    Authors
    Amanda L. Cordeiro; Daniela F. Cusack; Nathaly Guerrero-Ramírez; Richard J. Norby; Laura Toro; Michelle Y. Wong; S. Joseph Wright; Kristine Grace M. Cabugao; Kelly M. Andersen; Lucia Fuchslueger; Colleen M. Iversen; Fiona Soper; Om Prakash Ghimire; Laynara F. Lugli; Ana Caroline Miron; Oscar Valverde-Barrantes; Marie Arnaud; Sarah Batterman; Lee H. Dietterich; Ming Yang Lee; Monique Weemstra; Daniela Yaffar; Shalom D. Addo-Danso; Kerstin Pierick; Ryan Bridges; Carina Easton; Isabella Felsing; Nathan B. Gonçalves; Riley Krudop; Mason R. McKinzie; Julia Perbohner; Alejandra N. Pozzoli-Oropeza; Mirna Samaniego; Alex W. Smilor; Ilana S. Vargas; Layna Webb; Teddy Nikitin; Jennifer S. Powers; M. Luke McCormack
    Time period covered
    Jan 1, 1986 - Jan 1, 2019
    Area covered
    Description

    TropiRoot 1.0 is a new tropical root database with root characteristics across environmental gradients. It has data extracted from 107 new sources, resulting in more than 8000 rows of data (either species or community data). Most of the data in TropiRoot 1.0 includes root characteristics such as root biomass, morphology, root dynamics, mass fraction, architecture, anatomy, physiology, and root chemistry. This initiative represents an approximately 30% increase in the currently available data for tropical roots in the Fine Root Ecology Database (FRED). TropiRoot 1.0 contains root characteristics from 25 different countries, of which seven are in Asia, six in South America, five in Central America and the Caribbean, four in Africa, two in North America, and one in Oceania. Due to the volume of data, when ancillary data (including soil data) were available, they were either extracted and included in the database or their availability was recorded in an additional column. Multiple contributors checked the entries for outliers during the collation process to ensure data quality. For text-based observations, we examined all cells to ensure that their content relates to their specific categories. For numerical observations, we ordered each numerical value from least to greatest and plotted the values, checking apparent outliers against the data in their respective sources and correcting or removing incorrect or impossible values. Some data (soil and aboveground) have different columns for the same variable presented in different units, including originally published units, but root characteristics data had units converted to match the ones reported in FRED. By filling a gap in global databases, TropiRoot 1.0 expands our knowledge of regions that have so far been underrepresented and our ability to assess global trends. This advancement can be used to improve tropical forest representation in vegetation models.
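    The rank-and-inspect outlier screen described above can be sketched roughly as below; the column names, example values, and tail fraction are placeholders, not TropiRoot code.

    import pandas as pd

    def rank_for_review(df, column, tail_fraction=0.02):
        """Sort a numeric column and return its extreme tails for manual checking
        against the original sources."""
        ordered = df.sort_values(column).reset_index(drop=True)
        n_tail = max(1, int(len(ordered) * tail_fraction))
        return pd.concat([ordered.head(n_tail), ordered.tail(n_tail)])

    demo = pd.DataFrame({"source_id": range(6),
                         "root_biomass": [120, 95, 110, 130, 105, 9800]})
    print(rank_for_review(demo, "root_biomass"))   # flags the 95 and 9800 rows for review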

  15. Nineteen bioclimatic variables retrieved from WorldClim database using...

    • plos.figshare.com
    bin
    Updated Aug 8, 2023
    + more versions
    Cite
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta (2023). Nineteen bioclimatic variables retrieved from WorldClim database using principal coordinates for each sampling site. [Dataset]. http://doi.org/10.1371/journal.pntd.0011486.s003
    Explore at:
    bin (available download formats)
    Dataset updated
    Aug 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Nineteen bioclimatic variables retrieved from WorldClim database using principal coordinates for each sampling site.

  16. d

    Data from: Probability distribution grids of dissolved oxygen and dissolved...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 29, 2025
    Cite
    U.S. Geological Survey (2025). Probability distribution grids of dissolved oxygen and dissolved manganese concentrations at selected thresholds in drinking water depth zones, Central Valley, California [Dataset]. https://catalog.data.gov/dataset/probability-distribution-grids-of-dissolved-oxygen-and-dissolved-manganese-concentrations-
    Explore at:
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    California, Central Valley
    Description

    The ascii grids represent regional probabilities that groundwater in a particular location will have dissolved oxygen (DO) concentrations less than selected threshold values representing anoxic groundwater conditions or will have dissolved manganese (Mn) concentrations greater than selected threshold values representing secondary drinking water-quality contaminant levels (SMCL) and health-based screening levels (HBSL) for water quality. The probability models were constrained by the alluvial boundary of the Central Valley to a depth of approximately 300 meters (m). We utilized prediction modeling methods, specifically boosted regression trees (BRT) with a Bernoulli error distribution within a statistical learning framework within R's computing framework (http://www.r-project.org/) to produce two-dimensional probability grids at selected depths throughout the modeling domain. The statistical learning framework seeks to maximize the predictive performance of machine learning methods through model tuning by cross validation. Models were constructed using measured dissolved oxygen and manganese concentrations sampled from 2,767 wells within the alluvial boundary of the Central Valley and over 60 predictor variables from 7 sources (see metadata) and were assembled to develop a model that incorporates regional-scale soil properties, soil chemistry, land use, aquifer textures, and aquifer hydrology. Previously developed Central Valley model outputs of textures (Central Valley Textural Model, CVTM; Faunt and others, 2010) and MODFLOW-simulated vertical water fluxes and predicted depth to water table (Central Valley Hydrologic Model, CVHM; Faunt, 2009) were used to represent aquifer textures and groundwater hydraulics, respectively. The wells used in the BRT models described above were attributed to predictor variable values in ArcGIS using a 500-m buffer. The response variable data consisted of measured DO and Mn concentrations from 2,767 wells within the alluvial boundary of the Central Valley. The data were compiled from two sources: U.S. Geological Survey (USGS) National Water Information System (NWIS) database (all data are publicly available from the USGS at http://waterdata.usgs.gov/ca/nwis/nwis) and the California State Water Resources Control Board Division of Drinking Water (SWRCB-DDW) database (water-quality data are publicly available from the SWRCB at http://geotracker.waterboards.ca.gov/gama/). Only wells with well depth data were selected, and for wells with multiple records, only the most recent sample in the period 1993–2014 that had the required water-quality data was used. Data were available for 932 wells for the NWIS dataset and 1,835 wells for the SWRCB-DDW dataset. Models were trained on a USGS NWIS dataset of 932 wells and evaluated on an independent hold-out dataset of 1,835 wells from the SWRCB-DDW. We used cross-validation to assess the predictive performance of models of varying complexity as a basis for selecting the final models used to create the prediction grids. Trained models were applied to cross-validation testing data and a separate hold-out dataset to evaluate model predictive performance by emphasizing three model metrics of fit: Kappa, accuracy, and the area under the receiver operator characteristic (ROC) curve. The final trained models were used for mapping predictions at discrete depths to a depth of approximately 300 m. Trained DO and Mn models had accuracies of 86–100 percent, Kappa values of 0.69–0.99, and ROC values of 0.92–1.0. 
Model accuracies for cross-validation testing datasets were 82–95 percent, and ROC values were 0.87–0.91, indicating good predictive performance. Kappa values for the cross-validation testing dataset were 0.30–0.69, indicating fair to substantial agreement between testing observations and model predictions. Hold-out data were available for the manganese model only and indicated accuracies of 89–97 percent, ROC values of 0.73–0.75, and Kappa values of 0.06–0.30. The predictive performance of both the DO and Mn models was reasonable, considering all three of these fit metrics and the low percentages of low-DO and high-Mn events in the data. See associated journal article (Rosecrans and others, 2017) for complete summary of BRT modeling methods, model fit metrics, and relative influence of predictor variables for a given DO or Mn BRT model. The modeled response variables for the DO BRT models were based on measured DO values from wells at the following thresholds: <0.5 milligrams per liter (mg/L), <1.0 mg/L, and <2.0 mg/L, and these thresholds values were considered anoxic based on literature reviews. The modeled response variables for the Mn BRT models were based on measured Mn values from wells at the following exceedance thresholds: >50 micrograms per liter (µg/L), >150 µg/L, and >300 µg/L. (The 150 µg/L manganese threshold represents one-half the USGS HBSL.) The prediction grid discretization below land surface was in 15-m intervals to a depth of 122 m, followed by intervals of 30 m to a depth of 300 m, resulting in 14 two-dimensional probability grids for each constituent (DO and Mn) and threshold. Probability grid maps were also created for the shallow aquifer and deep aquifer represented by the median domestic and public-supply well depths, respectively. A depth of 46 m was used to stratify wells from the training dataset into the shallow and deep aquifer and was derived from depth percentiles associated with domestic and public supply in previous work by Burow and others (2013). In this work, the median well depth categorized as domestic was 30 m below land surface (bls), and the median well depth categorized as public-supply wells was 100 m bls. Therefore, datasets contained in the folders named "DO BRT prediction grids.zip" and "Mn BRT prediction grids.zip" each have 42 probability grids representing specific depths for each of the selected thresholds of DO and Mn BRT threshold models described above. The dataset contained in the folder named "PublicSupply&DomesticGrids.zip" contains probability grids represented by the domestic and public-supply drinking water depths for each of the six BRT models described above (12 grids total).
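    The boosted-tree workflow and fit metrics described above (accuracy, Kappa, ROC) can be approximated in outline as below; the study itself used BRT in R, so this Python sketch with placeholder predictors is only an analogue, not the USGS code.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))                 # stand-ins for depth, texture, flux, ...
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0.8).astype(int)

    model = GradientBoostingClassifier(random_state=0)        # Bernoulli (log-loss) boosting
    prob = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    pred = (prob >= 0.5).astype(int)

    print("accuracy:", accuracy_score(y, pred))
    print("kappa:   ", cohen_kappa_score(y, pred))
    print("ROC AUC: ", roc_auc_score(y, prob))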

  17. d

    Restaurants, Fast Food, USA, Top 25 | 200k+ PoIs with 30+ Attributes |...

    • datarade.ai
    .json, .xml, .csv
    Updated Feb 20, 2025
    Cite
    xavvy (2025). Restaurants, Fast Food, USA, Top 25 | 200k+ PoIs with 30+ Attributes | monthly updates | API & Datasets [Dataset]. https://datarade.ai/data-products/restaurants-fast-food-usa-top-25-200k-pois-with-30-att-xavvy
    Explore at:
    .json, .xml, .csv (available download formats)
    Dataset updated
    Feb 20, 2025
    Dataset authored and provided by
    xavvy
    Area covered
    United States of America
    Description

    Xavvy fuel is the leading source for location data and market insights worldwide. We specialize in data quality and enrichment, providing high-quality POI data for restaurants and quick-service establishments in the United States.

    Base data • Name/Brand • Address • Geocoordinates • Opening Hours • Phone • ...

    30+ Services • Delivery • Wifi • ChargePoints • …

    10+ Payment options • Visa • MasterCard • Google Pay • individual Apps • ...

    Our data offering is highly customizable and flexible in delivery – whether one-time or regular data delivery, push or pull services, and various data formats – we adapt to our customers' needs.

    Brands included: • McDonalds • Burger King • Subway • KFC • Wendy's • ...

    The total number of restaurants per region, market share distribution among competitors, or the ideal location for new branches – our restaurant data provides valuable insights into the food service market and serves as the perfect foundation for in-depth analyses and statistics. Our data helps businesses across various industries make informed decisions regarding market development, expansion, and competitive strategies. Additionally, our data contributes to the consistency and quality of existing datasets. A simple data mapping allows for accuracy verification and correction of erroneous entries.

    Especially when displaying information about restaurants and fast-food chains on maps or in applications, high data quality is crucial for an optimal customer experience. Therefore, we continuously optimize our data processing procedures: • Regular quality controls • Geocoding systems to refine location data • Cleaning and standardization of datasets • Consideration of current developments and mergers • Continuous expansion and cross-checking of various data sources

    Integrate the most comprehensive database of restaurant locations in the USA into your business. Explore our additional data offerings and gain valuable market insights directly from the experts!

  18. Validated temperature and salinity data, and reconstructed nutrient...

    • zenodo.org
    zip
    Updated Sep 18, 2025
    Cite
    Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li; Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li (2025). Validated temperature and salinity data, and reconstructed nutrient concentrations in the North Pacific (1895–2024) [Dataset]. http://doi.org/10.5281/zenodo.17140658
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 18, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li; Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The original hydrographic and nutrient data were compiled from the CLIVAR and Carbon Hydrographic Data Office (CCHDO; providing both hydrographic and nutrient measurements) and the World Ocean Database (WOD; supplying hydrographic data only) across the North Pacific. Rigorous quality control procedures—including range, spike, gradient, inversion, and outlier checks—were applied to remove low-quality temperature, salinity, and nutrient records (NO₃⁻, NO₂⁻, DIP, and Si(OH)₄) from both databases. A machine learning model (Random Forest) was trained on the quality-controlled CCHDO data to reconstruct nutrient concentrations, using spatial, temporal, and water mass predictors derived from the validated WOD hydrographic dataset. This process generated approximately 435 million reconstructed nutrient data points across 1.9 million stations for each nutrient species within the WOD, covering the period from 1895 to 2024 in the North Pacific (118.6ºE to 280.3ºE; -2.0 to 60.6ºN). The final dataset offers validated temperature and salinity values along with reconstructed nutrient concentrations, providing a valuable resource for studying ocean biogeochemistry and climate-related changes in the North Pacific.
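    The reconstruction step described above can be sketched in outline as below; the predictors and synthetic data are placeholders, and the actual model, feature engineering, and quality-control steps are more involved.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    n = 2000
    # Stand-ins for spatial, temporal, and water-mass predictors
    # (e.g. longitude, latitude, depth, year, temperature, salinity):
    X_train = rng.normal(size=(n, 6))
    y_train = 20 + 3 * X_train[:, 2] - 2 * X_train[:, 4] + rng.normal(scale=1.0, size=n)

    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X_train, y_train)                              # trained on QC'd CCHDO-style data
    nutrient_hat = rf.predict(rng.normal(size=(5, 6)))    # applied to WOD-style casts
    print(np.round(nutrient_hat, 2))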

  19. d

    Macroecological database of mammalian body mass

    • dataone.org
    • data.esa.org
    • +2more
    Updated Aug 14, 2015
    + more versions
    Cite
    NCEAS 2182: Smith: Body size in ecology and paleoecology: Linking pattern and process across spatial, temporal and taxonomic scales; National Center for Ecological Analysis and Synthesis; Felisa Smith (2015). Macroecological database of mammalian body mass [Dataset]. http://doi.org/10.5063/AA/nceas.196.3
    Explore at:
    Dataset updated
    Aug 14, 2015
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    NCEAS 2182: Smith: Body size in ecology and paleoecology: Linking pattern and process across spatial, temporal and taxonomic scales; National Center for Ecological Analysis and Synthesis; Felisa Smith
    Time period covered
    Jan 3, 1
    Area covered
    Earth
    Variables measured
    Genus, Order, Family, Status, Species, Citation, Log Mass, Continent, Reference, Combined mass, and 2 more
    Description

    The purpose of this data set was to compile body mass information for all mammals on Earth so that we could investigate the patterns of body mass seen across geographic and taxonomic space and evolutionary time. We were interested in the heritability of body size across taxonomic groups (How conserved is body mass within a genus, family, and order?), in the overall pattern of body mass across continents (Do the moments and other descriptive statistics remain the same across geographic space?), and over evolutionary time (How quickly did body mass patterns iterate on the patterns seen today? Were the Pleistocene extinctions size-specific on each continent, and did these events coincide with the arrival of man?). These data are also part of a larger project that seeks to integrate body mass patterns across very diverse taxa (NCEAS Working Group on Body size in ecology and paleoecology: linking pattern and process across space, time and taxonomic scales). We began with the updated version of Wilson and Reeder's (1993) taxonomic list of all known Recent mammals of the world (N = 4629 species), to which we added status, distribution, and body mass estimates compiled from the primary and secondary literature. Whenever possible, we used an average of male and female body mass, which was in turn averaged over multiple localities to arrive at our species body mass values. The sources are line-referenced in the main data set, with the actual references appearing in a table within the metadata. Mammals have individual records for each continent they occur on. Please note that our data set is more than an amalgamation of smaller compilations. Although we relied heavily on a data set for Chiroptera by K. E. Jones (N = 905), the CRC handbook of Mammalian Body Mass (N = 688), and a data set compiled for South America by P. Marquet (N = 505), these total less than half the records in the current database. The remainder are derived from more than 150 other sources (see reference table). Furthermore, we include a comprehensive late Pleistocene species assemblage for Africa, North and South America, and Australia (an additional 230 species). "Late Pleistocene" is defined as approximately 11 ka for Africa, North and South America, and as 50 ka for Australia, because these times predate anthropogenic impacts on mammalian fauna. Overall, the temporal coverage is from the late Pleistocene to present (ca. 45,000 ybp to present). Estimates contained within this data set represent a generalized species value, averaged across gender and geographic space. Consequently, these data are not appropriate for asking population-level questions where the integration of body mass with specific environmental conditions is important. All extant orders of mammals are included, as well as several archaic groups (N = 4859 species). Because some species are found on more than one continent (particularly Chiroptera), there are 5731 entries.
We have body masses for the following: Artiodactyla (280 records), Bibymalagasia (2 records), Carnivora (393 records), Cetacea (75 records), Chiroptera (1071 records), Dasyuromorphia (67 records), Dermoptera (3 records), Didelphimorphia (68 records), Diprotodontia (127 records), Hydracoidea (5 records), Insectivora (234 records), Lagomorpha (53 records), Litopterna (2 records), Macroscelidea (14 records), Microbiotheria (1 record), Monotremata (7 records), Notoryctemorphia (1 record), Notoungulata (5 records), Paucituberculata (5 records), Peramelemorphia (24 records), Perissodactyla (47 records), Pholidota (8 records), Primates (276 records), Proboscidea (14 records), Rodentia (1425 records), Scandentia (15 records), Sirenia (6 records), Tubulidentata (1 record), and Xenarthra (75 records).

    Data has undergone substantial data quality and assurance checking, though this is an on-going process. Histograms of the body masses of each order were produced, and values at the tails were double-checked for accuracy. When multiple sources of information were available for a species, or new sources encountered, we used those with higher sample sizes and gender-specific information.

    Headers are given here as header name followed by more information such as measurement units or other basic descriptor. More information on the variable definitions can be found in Section B, variable information (at http://www.esapubs.org/archive/ecol/E084/094/metadata.htm). Continent (SA, NA, EA, insular, oceanic, AUS, AF), Status (extinct, historical, introduction, or extant), Order, Family, Genus, Species, Log Mass (grams), Combined Mass (grams), Reference.

  20. w

    Multi Country Study Survey 2000-2001, Long version - Mexico

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 16, 2014
    + more versions
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001, Long version - Mexico [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/201
    Explore at:
    Dataset updated
    Jan 16, 2014
    Dataset authored and provided by
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Mexico
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals about their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right for standardizing indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful for improving the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Geographic coverage

    15 federal states: Distrito Federal, Guanajuato, Jalisco, Estado de México, Michoacán, Querétaro, Guerrero, Oaxaca, Puebla, Veracruz, Yucatán, Chihuahua, Nuevo León, San Luis Potosí, Sonora

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sample was probabilistic, multistage, stratified, and clustered, and represented urban and rural strata.

    Mexico has 32 Federal States, which were divided into 3 regions: Centre, South, and North. Out of these, 15 states were selected as follows: Centre: Distrito Federal, Guanajuato, Jalisco, Estado de México, Michoacán, Querétaro; South: Guerrero, Oaxaca, Puebla, Veracruz, Yucatán; North: Chihuahua, Nuevo León, San Luis Potosí, Sonora

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding: At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate, it was entered locally.

    Data Entry Program: A data entry program was developed at WHO specifically for the survey study and provided to the sites. It was built using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and applying range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there was a range or a list of possible values that the program checked against.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.

    Data Transfer: The data entry program could export the entered data into one compressed database file, which could easily be sent to WHO as an email attachment or via a file transfer program to a secure server, no matter how many cases were in the file. The sites were allowed to use as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file, and the files were merged once they were delivered to WHO with the help of other programs built to automate the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks: Once the data was received, it was analyzed for missing information, invalid responses, and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback: After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, missing age, sex, education and some other important variables.
