44 datasets found
  1. Protein Cross-Linking Database

    • neuinfo.org
    Updated Jan 29, 2022
    Cite
    (2022). Protein Cross-Linking Database [Dataset]. http://identifiers.org/RRID:SCR_021027
    Dataset updated
    Jan 29, 2022
    Description

    Web application and database designed for sharing, visualizing, and analyzing protein cross-linking mass spectrometry data, with an emphasis on structural analysis and quality control. Includes public and private data sharing capabilities and a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. Used for private collaboration and public data dissemination.

  2. CLM - Bore assignments QLD

    • data.gov.au
    • researchdata.edu.au
    zip
    Updated Nov 19, 2019
    Cite
    Bioregional Assessment Program (2019). CLM - Bore assignments QLD [Dataset]. https://data.gov.au/data/dataset/f8937dd8-b3a0-490e-a452-9dc56fe03914
    Available download formats: zip (158505)
    Dataset updated
    Nov 19, 2019
    Dataset provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Queensland
    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset contains the aquifer assignment results for the Queensland part of the Clarence-Moreton Basin. The data were organized by hydrostratigraphic units. Assigning a bore to a specific aquifer is underpinned by the screened interval data and the aquifer boundaries. In many cases, it is impossible to assign the screened interval of a bore to a single aquifer as bores are either screened across different aquifers or there is insufficient information on stratigraphy and screened intervals. Bores were assigned to aquifers by comparing their screen intervals and depth with aquifer boundary data. The required information was extracted from the "Casing", "Aquifer", and "Stratigraphy" tables of the DNRM database.

    Dataset History

    The following steps were followed during the aquifer assignment:

    1. Determine the boundary of the aquifer of interest. The 'Aquifer' table in the DNRM database registers aquifers that a bore intersects when it is drilled and records the upper and lower extents of aquifers. This information was used to identify the aquifer boundary at any specific location. When boundary information was missing the 'Stratigraphy' table was used to identify aquifer boundaries instead.

    2. Determine the screen interval of bores. The 'Casing' table contains the screen information for most bores in the database. The codes 'PERF', 'SCRN' and 'ENDD' in the column 'MATERIAL' indicate water entry locations. The code 'OPEN' indicates that a bore is uncased at some depths; where such a bore intersects an aquifer, it is considered a water supply point. These codes were used to find the screen interval of a bore. When multiple screens exist, the bore is assumed to be screened across the entire length of the individual screens.

    3. Determine the screen code. A bore may tap into an aquifer in four ways depending on the location of its screen relative to the aquifer. Four codes (I, T, B and E) were used to indicate the different spatial relationships of a bore with its targeted aquifer (a code sketch of this interval comparison follows the list). When screen information is lacking, bores with their lower ends located in an aquifer are assumed to tap that aquifer and were assigned the screen code 'BOI'.

    4. Filter bores for a specific area using a shapefile or coordinates. If only a part of the aquifer is of interest, the output bores can be filtered based on their locations.

    5. Cross-check the final datasets against expert knowledge and the spatial context of aquifers. Errors are common in such databases, and some will persist despite extensive data quality checks. However, such errors are often highlighted during data interpretation and visual representation and can subsequently be corrected through an iterative process.
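    The interval comparison behind steps 2 and 3 can be sketched in a few lines. This is a minimal illustration only: the function and field names are assumptions for this write-up, not part of the source dataset or the DNRM schema, and depths are treated as metres below ground surface.

```python
# Hedged sketch: derive a spatial-relationship code (I/T/B/E) for a bore's
# screened interval against one aquifer's upper and lower boundaries.
# Depths increase downwards; all names here are illustrative assumptions.

def screen_code(screen_top, screen_bottom, aq_top, aq_bottom):
    """Return 'I' (inside), 'T' (straddles the aquifer top), 'B' (straddles the
    bottom), 'E' (extends beyond both boundaries), or None if there is no overlap."""
    if screen_bottom <= aq_top or screen_top >= aq_bottom:
        return None                      # screen does not intersect this aquifer
    above = screen_top < aq_top          # screen starts above the aquifer
    below = screen_bottom > aq_bottom    # screen ends below the aquifer
    if above and below:
        return "E"
    if above:
        return "T"
    if below:
        return "B"
    return "I"

# Example: a bore screened 40-55 m in an aquifer spanning 35-60 m lies inside it.
print(screen_code(40, 55, 35, 60))  # -> "I"
```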

    Dataset Citation

    Bioregional Assessment Programme (2014) CLM - Bore assignments QLD. Bioregional Assessment Derived Dataset. Viewed 28 September 2017, http://data.bioregionalassessments.gov.au/dataset/f8937dd8-b3a0-490e-a452-9dc56fe03914.

    Dataset Ancestors

  3. Pokemon TCG Pocket Dataset

    • kaggle.com
    zip
    Updated Oct 5, 2025
    Cite
    JoaoCoelho03 (2025). Pokemon TCG Pocket Dataset [Dataset]. https://www.kaggle.com/datasets/joaocoelho03/pocket-tcg-dataset/data
    Available download formats: zip (17767 bytes)
    Dataset updated
    Oct 5, 2025
    Authors
    JoaoCoelho03
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Pokémon TCG Pocket Card Dataset

    This dataset contains detailed information about all cards available in the Pokémon Trading Card Game Pocket mobile app. The data has been carefully curated and cleaned to provide Pokémon enthusiasts and developers with accurate and comprehensive card information.

    Dataset Contents

    • 8+ Complete Sets: All major card sets including latest expansions
    • 1000+ Cards: Every card with detailed metadata and classifications
    • Clean Format: CSV format optimized for analysis, machine learning, and research

    Key Features

    🃏 Complete Card Data

    • Card names and numbers with proper formatting
    • Complete set and pack organization structure
    • Release dates for all sets and expansions
    • Total card counts per set for completion tracking

    💎 Rarity Classifications

    • 7+ Rarity Types including:
      • Common, Uncommon, Rare
      • Ultra Rare, Secret Rare, Special Art Rare
      • Crown Rare and other premium classifications
    • Includes shiny and special variant cards
    • Standardized rarity naming conventions

    Use Cases

    📊 Data Analysis & Research

    • Card rarity distribution analysis across sets
    • Set completion and collection tracking

    🤖 Machine Learning & AI

    • Card classification models
    • Recommendation systems for collectors
    • Rarity prediction algorithms
    • Collection optimization models

    📈 Visualization & Dashboards

    • Interactive card browsers
    • Collection progress tracking
    • Rarity distribution charts
    • Set release timeline visualizations

    Data Quality

    • Manually Verified: All card information cross-checked for accuracy
    • Standardized Format: Consistent naming and classification across all entries
    • Complete Coverage: All available cards from the mobile game
    • Clean Structure: Optimized for both human readability and machine processing

    Technical Specifications

    📋 File Format

    • Format: CSV (Comma Separated Values)
    • Encoding: UTF-8 with full international character support
    • Delimiter: Comma (,)
    • Headers: Included in first row

    🗂️ Column Structure (9 columns)

    Column              Description                    Example
    set_name            Full name of the card set      "Eevee Grove"
    set_code            Official set identifier        "a3b"
    set_release_date    Set release date               "June 26, 2025"
    set_total_cards     Total cards in the set         107
    pack_name           Name of the specific pack      "Eevee Grove"
    card_name           Full card name                 "Leafeon"
    card_number         Card number within set         "2"
    card_rarity         Rarity classification          "Rare"
    card_type           Card type category             "Pokémon"
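    As a quick illustration of the analysis use cases above, the CSV can be loaded with pandas to compute a per-set rarity distribution and a simple completion metric. This is a minimal sketch; the file name is an assumption, so substitute the file shipped with the dataset.

```python
# Minimal sketch: load the card table and summarise rarity per set.
# "pocket_tcg_cards.csv" is an assumed file name, not part of the dataset description.
import pandas as pd

cards = pd.read_csv("pocket_tcg_cards.csv", encoding="utf-8")

# Rarity distribution per set: counts of each card_rarity within each set_name.
rarity_by_set = (
    cards.groupby(["set_name", "card_rarity"])
         .size()
         .unstack(fill_value=0)
)
print(rarity_by_set)

# Completion tracking: cards listed per set versus the declared set_total_cards.
listed = cards.groupby("set_name")["card_number"].nunique()
declared = cards.groupby("set_name")["set_total_cards"].first()
print((listed / declared).rename("coverage"))
```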

    If you find this dataset useful, consider giving it an upvote — it really helps others discover it too! 🔼😊

    Happy analyzing! 🎯📊

  4. Convenience Stores, USA, Top 10 | 32k+ PoIs with 15+ Attributes | monthly updates | API & Datasets

    • datarade.ai
    .json, .xml, .csv
    Updated Apr 16, 2025
    Cite
    xavvy (2025). Convenience Stores, USA, Top 10 | 32k+ PoIs with 15+ Attributes | monthly updates | API & Datasets [Dataset]. https://datarade.ai/data-products/convenience-stores-usa-top-10-32k-pois-with-15-attribut-xavvy
    Available download formats: .json, .xml, .csv
    Dataset updated
    Apr 16, 2025
    Dataset authored and provided by
    xavvy
    Area covered
    United States of America
    Description

    Xavvy fuel is the leading source for location data and market insights worldwide. We specialize in data quality and enrichment, providing high-quality POI data for convenience stores in the United States.

    Base data
    • Name/Brand
    • Address
    • Geocoordinates
    • Opening Hours
    • Phone
    • ...

    15+ Services
    • Fuel
    • Wifi
    • ChargePoints
    • ...

    10+ Payment options
    • Visa
    • MasterCard
    • Google Pay
    • individual Apps
    • ...

    Our data offering is highly customizable and flexible in delivery – whether one-time or regular data delivery, push or pull services, and various data formats – we adapt to our customers' needs.

    Brands included:
    • 7-Eleven
    • Circle K
    • Alimentation Couche-Tard
    • Speedway
    • Casey's
    • ...

    The total number of convenience stores per region, market share distribution among competitors, or the ideal location for new branches – our convenience store data provides valuable insights into the market and serves as the perfect foundation for in-depth analyses and statistics. Our data helps businesses across various industries make informed decisions regarding market development, expansion, and competitive strategies. Additionally, our data contributes to the consistency and quality of existing datasets. A simple data mapping allows for accuracy verification and correction of erroneous entries.
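    The "simple data mapping" mentioned above can be approximated by matching an existing POI table against the feed on brand plus proximity, then reviewing the unmatched records. The sketch below is only an assumption about what such a mapping could look like (field names and the 100 m threshold are illustrative), not xavvy's actual matching logic.

```python
# Hedged sketch: match own POI records against a reference feed by brand and
# distance, flagging records with no nearby counterpart for manual review.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_poi(record, reference_pois, max_dist_m=100.0):
    """Return the closest reference POI of the same brand within max_dist_m, else None."""
    candidates = [
        (haversine_m(record["lat"], record["lon"], ref["lat"], ref["lon"]), ref)
        for ref in reference_pois
        if ref["brand"].lower() == record["brand"].lower()
    ]
    in_range = [(d, ref) for d, ref in candidates if d <= max_dist_m]
    return min(in_range, key=lambda pair: pair[0])[1] if in_range else None
```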

    Especially when displaying information about restaurants and fast-food chains on maps or in applications, high data quality is crucial for an optimal customer experience. Therefore, we continuously optimize our data processing procedures:
    • Regular quality controls
    • Geocoding systems to refine location data
    • Cleaning and standardization of datasets
    • Consideration of current developments and mergers
    • Continuous expansion and cross-checking of various data sources

    Integrate the most comprehensive database of convenience store locations in the USA into your business. Explore our additional data offerings and gain valuable market insights directly from the experts!

  5. Good Growth Plan 2014-2019 - Indonesia

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Jan 27, 2023
    Cite
    Syngenta (2023). Good Growth Plan 2014-2019 - Indonesia [Dataset]. https://microdata.worldbank.org/index.php/catalog/5630
    Dataset updated
    Jan 27, 2023
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Indonesia
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.
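    A minimal sketch of the "quota based random sampling" step described above: growers are drawn at random, kept only while their quota cell is open, and interviews are capped at 5 per village. Field and quota names are assumptions for illustration, not Kynetec's sampling tool.

```python
# Hedged sketch of quota-based random sampling with a per-village interview cap.
import random

def select_respondents(growers, quotas, per_village_cap=5, seed=0):
    """growers: list of dicts with 'crop', 'region', 'farm_size_class', 'village'.
    quotas: dict mapping (crop, region, farm_size_class) -> remaining interviews."""
    rng = random.Random(seed)
    pool = growers[:]
    rng.shuffle(pool)                          # random selection first ...
    village_counts, selected = {}, []
    for g in pool:
        cell = (g["crop"], g["region"], g["farm_size_class"])
        if quotas.get(cell, 0) <= 0:           # ... then check the quota cell
            continue
        if village_counts.get(g["village"], 0) >= per_village_cap:
            continue                           # avoid clustering interviews in one village
        quotas[cell] -= 1
        village_counts[g["village"]] = village_counts.get(g["village"], 0) + 1
        selected.append(g)
    return selected
```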

    BF (benchmark farms) screened from Indonesia were selected based on the following criteria:

    (a) Corn growers in East Java
    - Location: East Java (Kediri and Probolinggo) and Aceh
    - Innovative (early adopters); progressive (keen to learn about agronomy and pests; willing to try new technology); loyal (loyal to technology that can help them)
    - Have technical drains (an irrigation system)
    - Marketing network for corn: post-harvest access to market (they generally sell 80% of their harvest)
    - Mid-tier (sub-optimal CP/SE use)
    - Influenced by fellow farmers and retailers
    - May need longer credit

    (b) Rice growers in West and East Java
    - Location: West Java (Tasikmalaya), East Java (Kediri), Central Java (Blora, Cilacap, Kebumen), South Lampung
    - Progressive (keen to learn about agronomy and pests; willing to try new technology)
    - Accustomed to using farming equipment and pesticides
    - Long experience cultivating rice in their area
    - Willing to move forward in order to increase productivity
    - Land broad enough for the upcoming project
    - Influential within their grower group (ability to influence others)
    - Mid-tier (sub-optimal CP/SE use)
    - May need longer credit

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size).
    o Kynetec cross-validates the answers of the growers in three different ways:
      1. Within the grower (check if growers respond consistently during the interview)
      2. Across years (check if growers respond consistently throughout the years)
      3. Within cluster (compare a grower's responses with those of others in the group)
    o All the above-mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked. A code sketch of such checks follows this list.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.
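    The sketch below illustrates the kind of automated checks referred to in the cross-validation bullet: a within-record logical check (crop area versus farm size, taken from the example in the text) and a simple across-years comparison. Field names and thresholds are assumptions, not Kynetec's actual rules.

```python
# Hedged sketch of within-record and across-years consistency checks.

def check_record(rec):
    """Within-record checks; returns a list of flagged issues."""
    issues = []
    if rec["total_crop_area_ha"] > rec["farm_size_ha"]:
        issues.append("total crop area exceeds farm size")   # logical check from the text
    if not 0 < rec["yield_t_per_ha"] < 30:
        issues.append("implausible yield")                    # assumed range check
    return issues

def check_across_years(records_by_year, max_rel_change=0.5):
    """Across-years check: flag farm-size jumps of more than 50% between waves."""
    issues = []
    years = sorted(records_by_year)
    for prev, cur in zip(years, years[1:]):
        a = records_by_year[prev]["farm_size_ha"]
        b = records_by_year[cur]["farm_size_ha"]
        if a and abs(b - a) / a > max_rel_change:
            issues.append(f"farm size changed by more than 50% between {prev} and {cur}")
    return issues
```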

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.
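    For illustration only, the 2014 re-estimate amounts to a weighted average of the NPK rates reported in later waves. The rates and weights below are made-up placeholders, since the description does not publish the actual values or weighting scheme.

```python
# Hedged illustration of re-estimating the 2014 NPK input from later waves.
rates_kg_per_ha = {2015: 182.0, 2016: 175.0, 2017: 168.0}   # example values only
weights         = {2015: 0.5,   2016: 0.3,   2017: 0.2}     # assumed weights

estimate_2014 = sum(rates_kg_per_ha[y] * weights[y] for y in rates_kg_per_ha) / sum(weights.values())
print(round(estimate_2014, 1))  # -> 177.1
```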

  6. Multi Country Study Survey 2000-2001 - United States

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 23, 2014
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001 - United States [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/148
    Dataset updated
    Jan 23, 2014
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    United States
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals on their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right to standardize indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful to improve the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A sample of 5,000 households across the US was purchased from Survey Sampling, Inc. located in Connecticut. This sample is based on Random Digit samples.

    This sample was stratified by state to match the percentage of U.S. residents living in each of the fifty states.

    The 5,000 sampled households were randomly assigned to one of three different experimental treatments (normal, personalized, and personalized plus a $2 incentive).

    The experiment was done for purposes of evaluating response rate effects of alternative means of contacting US residents.

    Mode of data collection

    Mail Questionnaire [mail]

    Cleaning operations

    Data Coding

    At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate it was entered locally.

    Data Entry Program

    A data entry program was developed in WHO specifically for the survey study and provided to the sites. It was developed using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there existed a range or a list of possible values that the program checked for.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.
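    The two mechanisms described above (field validation against allowed ranges and double data entry with conflict resolution) can be sketched as follows. The age ceiling of 120 comes from the text; everything else is an illustrative assumption, not the I-Shell implementation.

```python
# Hedged sketch: range validation plus comparison of first and second keying.

def validate_field(name, value, valid_ranges):
    """True if the value falls inside the allowed range for that field."""
    lo, hi = valid_ranges[name]
    return lo <= value <= hi

def compare_double_entry(first_entry, second_entry):
    """Return fields whose second keying disagrees with the first, so the operator
    can resolve each conflict before the case is marked as complete."""
    return {k: (first_entry[k], second_entry.get(k))
            for k in first_entry
            if second_entry.get(k) != first_entry[k]}

ranges = {"age": (0, 120)}
print(validate_field("age", 135, ranges))                                   # -> False
print(compare_double_entry({"age": 42, "sex": 1}, {"age": 44, "sex": 1}))   # -> {'age': (42, 44)}
```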

    Data Transfer

    The data entry program was capable of exporting the data that was entered into one compressed database file which could be easily sent to WHO using email attachments or a file transfer program onto a secure server, no matter how many cases were in the file. The sites were allowed the use of as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file and they were merged once they were delivered to WHO with the help of other programs that were built for automating the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks

    Once the data was received it was analyzed for missing information, invalid responses and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback

    After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, and missing age, sex, education and some other important variables.

  7. Good Growth Plan, 2014-2019 - Paraguay

    • microdata.fao.org
    Updated Feb 17, 2021
    Cite
    Syngenta (2021). Good Growth Plan, 2014-2019 - Paraguay [Dataset]. https://microdata.fao.org/index.php/catalog/1813
    Dataset updated
    Feb 17, 2021
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Paraguay
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National Coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.

    BF (benchmark farms) screened from Paraguay were selected based on the following criteria:

    (a) Smallholder soybean growers
    - Medium to high technology farms
    - Regions: Hohenau (Itapúa), Edelira (Itapúa), Pirapó (Itapúa), La Paz (Itapúa), Naranjal (Alto Paraná), San Cristóbal (Alto Paraná)
    - Corn and soybean grown in rotation (corn first, then soybean)

    (b) Smallholder maize growers
    - Medium to high technology farms
    - Regions: Hohenau (Itapúa), Edelira (Itapúa), Pirapó (Itapúa), La Paz (Itapúa), Naranjal (Alto Paraná), San Cristóbal (Alto Paraná)
    - Corn and soybean grown in rotation (corn first, then soybean)

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size) 
    o Kynetec cross validates the answers of the growers in three different ways: 
      1. Within the grower (check if growers respond consistently during the interview) 
      2. Across years (check if growers respond consistently throughout the years) 
      3. Within cluster (compare a grower's responses with those of others in the group) 
    

    o All the above mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.

  8. Multi Country Study Survey 2000-2001 - Syrian Arab Republic

    • catalog.ihsn.org
    • apps.who.int
    Updated Mar 29, 2019
    Cite
    World Health Organization (WHO) (2019). Multi Country Study Survey 2000-2001 - Syrian Arab Republic [Dataset]. https://catalog.ihsn.org/index.php/catalog/3882
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Syria
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals on their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right to standardize indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful to improve the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding

    At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate it was entered locally.

    Data Entry Program

    A data entry program was developed in WHO specifically for the survey study and provided to the sites. It was developed using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there existed a range or a list of possible values that the program checked for.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.

    Data Transfer

    The data entry program was capable of exporting the data that was entered into one compressed database file which could be easily sent to WHO using email attachments or a file transfer program onto a secure server, no matter how many cases were in the file. The sites were allowed the use of as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file and they were merged once they were delivered to WHO with the help of other programs that were built for automating the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks

    Once the data was received it was analyzed for missing information, invalid responses and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback

    After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, and missing age, sex, education and some other important variables.

  9. Good Growth Plan 2014-2019 - Japan

    • microdata.worldbank.org
    • datacatalog.ihsn.org
    Updated Jan 27, 2023
    Cite
    Syngenta (2023). Good Growth Plan 2014-2019 - Japan [Dataset]. https://microdata.worldbank.org/index.php/catalog/5634
    Dataset updated
    Jan 27, 2023
    Dataset authored and provided by
    Syngenta
    Time period covered
    2014 - 2019
    Area covered
    Japan
    Description

    Abstract

    Syngenta is committed to increasing crop productivity and to using limited resources such as land, water and inputs more efficiently. Since 2014, Syngenta has been measuring trends in agricultural input efficiency on a global network of real farms. The Good Growth Plan dataset shows aggregated productivity and resource efficiency indicators by harvest year. The data has been collected from more than 4,000 farms and covers more than 20 different crops in 46 countries. The data (except USA data and for Barley in UK, Germany, Poland, Czech Republic, France and Spain) was collected, consolidated and reported by Kynetec (previously Market Probe), an independent market research agency. It can be used as benchmarks for crop yield and input efficiency.

    Geographic coverage

    National coverage

    Analysis unit

    Agricultural holdings

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. Sample design: Farms are grouped in clusters, which represent a crop grown in an area with homogeneous agro-ecological conditions and include comparable types of farms. The sample includes reference and benchmark farms. The reference farms were selected by Syngenta and the benchmark farms were randomly selected by Kynetec within the same cluster.

    B. Sample size: Sample sizes for each cluster are determined with the aim to measure statistically significant increases in crop efficiency over time. This is done by Kynetec based on target productivity increases and assumptions regarding the variability of farm metrics in each cluster. The smaller the expected increase, the larger the sample size needed to measure significant differences over time. Variability within clusters is assumed based on public research and expert opinion. In addition, growers are also grouped in clusters as a means of keeping variances under control, as well as distinguishing between growers in terms of crop size, region and technological level. A minimum sample size of 20 interviews per cluster is needed. The minimum number of reference farms is 5 of 20. The optimal number of reference farms is 10 of 20 (balanced sample).

    C. Selection procedure: The respondents were picked randomly using a "quota based random sampling" procedure. Growers were first randomly selected and then checked if they complied with the quotas for crops, region, farm size etc. To avoid clustering a high number of interviews at one sampling point, interviewers were instructed to do a maximum of 5 interviews in one village.

    BF (benchmark farms) screened from Japan were selected based on the following criteria:

    - Location: Hokkaido, Tokachi (JA Memuro, JA Otofuke, JA Tokachi Shimizu, JA Obihiro Taisho); initial focus on Memuro, Otofuke, Tokachi Shimizu and Obihiro Taisho. Locations added in GGP 2015 due to a change of reference farms: Obihiro, Kamikawa, Abashiri
    - BF: no use of in-furrow application (Amigo); no use of Amistar
    - Contract farmers of snack and other food companies. Screening question: "Do you have quality contracts in place with snack and food companies for your potato production?" (Y/N); if no, screen out
    - Interested in increasing marketable yield. Screening question: "Are you interested in growing branded potatoes (premium potatoes for the processing industry)?" (Y/N); if no, screen out
    - Potato growers for processing use

    Background info: No mention of Syngenta. Labor cost is a very serious issue: in general, labor cost in Japan is very high, and growers try to reduce it through mechanization; they want to manage the share of labor cost in total production cost. Quality and yield driven.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Data collection tool for 2019 covered the following information:

    (A) PRE- HARVEST INFORMATION

    PART I: Screening
    PART II: Contact Information
    PART III: Farm Characteristics
      a. Biodiversity conservation
      b. Soil conservation
      c. Soil erosion
      d. Description of growing area
      e. Training on crop cultivation and safety measures
    PART IV: Farming Practices - Before Harvest
      a. Planting and fruit development - Field crops
      b. Planting and fruit development - Tree crops
      c. Planting and fruit development - Sugarcane
      d. Planting and fruit development - Cauliflower
      e. Seed treatment

    (B) HARVEST INFORMATION

    PART V: Farming Practices - After Harvest
      a. Fertilizer usage
      b. Crop protection products
      c. Harvest timing & quality per crop - Field crops
      d. Harvest timing & quality per crop - Tree crops
      e. Harvest timing & quality per crop - Sugarcane
      f. Harvest timing & quality per crop - Banana
      g. After harvest
    PART VI: Other inputs - After Harvest
      a. Input costs
      b. Abiotic stress
      c. Irrigation

    See all questionnaires in external materials tab

    Cleaning operations

    Data processing:

    Kynetec uses SPSS (Statistical Package for the Social Sciences) for data entry, cleaning, analysis, and reporting. After collection, the farm data is entered into a local database, reviewed, and quality-checked by the local Kynetec agency. In the case of missing values or inconsistencies, farmers are re-contacted. In some cases, grower data is verified with local experts (e.g. retailers) to ensure data accuracy and validity. After country-level cleaning, the farm-level data is submitted to the global Kynetec headquarters for processing. In the case of missing values or inconsistencies, the local Kynetec office is re-contacted to clarify and resolve issues.

    Quality assurance

    Various consistency checks and internal controls are implemented throughout the entire data collection and reporting process in order to ensure unbiased, high quality data.

    • Screening: Each grower is screened and selected by Kynetec based on cluster-specific criteria to ensure a comparable group of growers within each cluster. This helps keep variability low.

    • Evaluation of the questionnaire: The questionnaire aligns with the global objective of the project and is adapted to the local context (e.g. interviewers and growers should understand what is asked). Each year the questionnaire is evaluated based on several criteria, and updated where needed.

    • Briefing of interviewers: Each year, local interviewers - familiar with the local context of farming - are thoroughly briefed to fully comprehend the questionnaire to obtain unbiased, accurate answers from respondents.

    • Cross-validation of the answers:

    o Kynetec captures all growers' responses through a digital data-entry tool. Various logical and consistency checks are automated in this tool (e.g. total crop size in hectares cannot be larger than farm size).
    o Kynetec cross-validates the answers of the growers in three different ways:
      1. Within the grower (check if growers respond consistently during the interview)
      2. Across years (check if growers respond consistently throughout the years)
      3. Within cluster (compare a grower's responses with those of others in the group)
    o All the above-mentioned inconsistencies are followed up by contacting the growers and asking them to verify their answers. The data is updated after verification. All updates are tracked.

    • Check and discuss evolutions and patterns: Global evolutions are calculated, discussed and reviewed on a monthly basis jointly by Kynetec and Syngenta.

    • Sensitivity analysis: sensitivity analysis is conducted to evaluate the global results in terms of outliers, retention rates and overall statistical robustness. The results of the sensitivity analysis are discussed jointly by Kynetec and Syngenta.

    • It is recommended that users interested in using the administrative level 1 variable in the location dataset use this variable with care and crosscheck it with the postal code variable.

    Data appraisal

    Due to the above mentioned checks, irregularities in fertilizer usage data were discovered which had to be corrected:

    For data collection wave 2014, respondents were asked to give a total estimate of the fertilizer NPK rates that were applied in the fields. From 2015 onwards, the questionnaire was redesigned to be more precise and obtain data by individual fertilizer products. The new method of measuring fertilizer inputs leads to more accurate results, but also makes a year-on-year comparison difficult. After evaluating several solutions to this problem, 2014 fertilizer usage (NPK input) was re-estimated by calculating a weighted average of fertilizer usage in the following years.

  10. Multi Country Study Survey 2000-2001, Long version - Lebanon

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 16, 2014
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001, Long version - Lebanon [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/192
    Dataset updated
    Jan 16, 2014
    Dataset provided by
    World Health Organization (https://who.int/)
    Authors
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Lebanon
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations, with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries completing 71 surveys because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes of in- person household 90 minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups ( i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the selfreports of individuals of their own health were calibrated against well-known performance tests (i.e. self-report vision was measured against standard Snellen's visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability . The same method was also used for self-reports of individuals assessing responsiveness of their health systems where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right for standardizing indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful for improving the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Kind of data

    Sample survey data [ssd]

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding: At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate, it was entered locally.

    Data Entry Program: A data entry program was developed at WHO specifically for the survey study and provided to the sites. It was built using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and applying range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there was a range or a list of possible values that the program checked against.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.
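    As a rough illustration of the validation logic described above, the following sketch (in Python, not the I-Shell program itself; field names and ranges are hypothetical) shows a range check and a double-entry comparison:

    # Minimal sketch of range validation and double-entry comparison.
    def validate_record(record, valid_ranges):
        """Flag values outside their allowed range, e.g. an age greater than 120."""
        problems = {}
        for field, (lo, hi) in valid_ranges.items():
            value = record.get(field)
            if value is None or not (lo <= value <= hi):
                problems[field] = value
        return problems

    def compare_entries(first_entry, second_entry):
        """Return fields where the second data entry disagrees with the first."""
        return {f: (first_entry.get(f), second_entry.get(f))
                for f in set(first_entry) | set(second_entry)
                if first_entry.get(f) != second_entry.get(f)}

    valid_ranges = {"age": (0, 120), "sex": (1, 2)}
    print(validate_record({"age": 150, "sex": 1}, valid_ranges))            # {'age': 150}
    print(compare_entries({"age": 34, "sex": 1}, {"age": 43, "sex": 1}))    # {'age': (34, 43)}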

    Data Transfer: The data entry program could export the entered data into one compressed database file, which could easily be sent to WHO as an email attachment or via a file transfer program to a secure server, no matter how many cases were in the file. The sites were allowed to use as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file, and the files were merged once they were delivered to WHO with the help of other programs built to automate the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks: Once the data was received, it was analyzed for missing information, invalid responses, and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback: After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, missing age, sex, education and some other important variables.

  11. D

    Database Automation Industry Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 27, 2025
    Cite
    Market Report Analytics (2025). Database Automation Industry Report [Dataset]. https://www.marketreportanalytics.com/reports/database-automation-industry-90626
    Explore at:
    ppt, doc, pdf (available download formats)
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Database Automation market is experiencing robust growth, projected to reach $2.35 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 24.38% from 2025 to 2033. This expansion is fueled by several key factors. The increasing complexity of database environments, coupled with the rising demand for faster and more reliable application deployments, is driving the adoption of automation solutions. Organizations across diverse sectors, including Banking, Financial Services and Insurance (BFSI), IT and Telecom, and E-commerce, are increasingly leveraging database automation to streamline operations, reduce manual errors, and improve overall efficiency. The shift towards cloud-based deployments further contributes to market growth, as cloud platforms offer scalability and flexibility well-suited to automated database management. While on-premises solutions still hold a significant share, the cloud segment is expected to witness faster growth in the coming years due to its cost-effectiveness and accessibility. Large enterprises are currently the primary adopters of database automation, but growing awareness and the availability of tailored solutions are expanding the market among Small and Medium-Sized Enterprises (SMEs). Competitive offerings from major players like Oracle, IBM, and Amazon Web Services, coupled with the emergence of specialized vendors, are shaping a dynamic and innovative market landscape. The market segmentation reveals significant opportunities across various components, including Database Patch and Release Automation, Application Release Automation, and Database Test Automation. Services related to implementation, integration, and support form a crucial segment, contributing significantly to the overall market value. While North America currently dominates the market, regions like Asia-Pacific are projected to exhibit strong growth owing to rapid digitalization and increasing IT spending. However, factors such as the high initial investment costs associated with implementing automation solutions and the need for skilled personnel to manage these systems could potentially restrain market growth to some extent. The overall outlook for the Database Automation market remains positive, driven by the persistent need for enhanced operational efficiency and improved application delivery cycles in a rapidly evolving technological landscape. Recent developments include: June 2023: Aquatic Informatics launched a new automated data validation tool, HydroCorrect, that can accelerate proactive monitoring and management of flooding, groundwater, and water quality in the Aquarius platform. With machine-learning technology, HydroCorrect will transform the QA/QC process with automation and standardized workflows that save time and improve data quality., May 2023: data.world, the data catalog platform, acquired the Mighty Canary technology and its incorporation into a new DataOps application. The application uses automation to surface contextual insights and real-time data quality updates directly to the BI, communications, and collaboration tools data consumers use.. Key drivers for this market are: Continuously Growing Volumes of Data Across Verticals, Increasing Demand for Automating Repetitive Database Management Processes. Potential restraints include: Continuously Growing Volumes of Data Across Verticals, Increasing Demand for Automating Repetitive Database Management Processes. 
Notable trends are: IT and Telecommunication industry is Expected to Witness Significant Growth.
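    As a plausibility check on how the quoted figures compound, the short sketch below applies the standard CAGR formula to the reported 2025 base and 24.38% rate; the implied 2033 value is derived here for illustration and is not a figure taken from the report.

    # Compound the 2025 base at the reported CAGR to see the implied 2033 size.
    base_2025 = 2.35            # USD billion (reported)
    cagr = 0.2438               # 24.38% (reported)
    years = 2033 - 2025         # eight compounding periods
    implied_2033 = base_2025 * (1 + cagr) ** years
    print(f"Implied 2033 market size: ~${implied_2033:.1f} billion")   # ~ $13.5 billion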

  12. f

    Data from: Spatiotemporal Characteristics of Global Building Material...

    • acs.figshare.com
    xlsx
    Updated Nov 13, 2025
    Cite
    Qiance Liu; Xin Ouyang; Wensong Zhu; Kun Sun; Jinchao Song; Xiang Li; Yunyun Li; Wu Chen; Gang Liu (2025). Spatiotemporal Characteristics of Global Building Material Intensity Revealed for Circular and Low-Carbon Construction [Dataset]. http://doi.org/10.1021/acs.est.5c05684.s002
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    ACS Publications
    Authors
    Qiance Liu; Xin Ouyang; Wensong Zhu; Kun Sun; Jinchao Song; Xiang Li; Yunyun Li; Wu Chen; Gang Liu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Quantifying the material intensity of buildings (MIB) is fundamental for built environment stock accounting, construction resource and waste management, and embodied carbon assessment. However, existing MIB data reported in the literature are often sparse, heterogeneous, and scattered across archetypes, which hinders comparability, quality checks, and harmonization. Here, we compiled a global MIB database containing 3051 MIB records in a unified form measured in kg/m2 for 31 types of construction materials, based on both secondary and primary data from multiple sources. Applying a mean-absolute-deviation (MAD) rule to generate archetype-specific general MIBs, we revealed that the upward pressure on MIB from increases in floor area and building height has been partly offset by the use of light-weight materials, yielding a current aggregate MIB of 1464.3 kg/m2 that is comparable to the pre-1920 levels. Global building material composition shifted markedly away from brick and wood and toward higher shares of steel, cement, sand, and stone, alongside sizable heterogeneity across archetypes, regions, and periods. This expanded, standardized, and harmonized global MIB database can help inform material efficiency targets, embodied carbon baselines, and stock-aware planning for selective demolition, procurement, and renovation in a circular and low-carbon construction transition.
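    The mean-absolute-deviation (MAD) rule mentioned above can be sketched as follows; the threshold k and the example values are assumptions for illustration, not the paper's exact parameters.

    import statistics

    def general_mib(values, k=1.5):
        """Average an archetype's MIB records after screening values that lie
        more than k mean-absolute-deviations from the mean (k is assumed)."""
        mean = statistics.fmean(values)
        mad = statistics.fmean(abs(v - mean) for v in values)
        kept = [v for v in values if abs(v - mean) <= k * mad] if mad else list(values)
        return statistics.fmean(kept)

    # Hypothetical concrete-intensity records (kg/m2) for one archetype:
    print(general_mib([310, 295, 330, 305, 900]))   # 310.0 after the 900 record is screened out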

  13. A two-dimensional PCA plot obtained from a multiple factor analysis (MFA)...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Aug 8, 2023
    Cite
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta (2023). A two-dimensional PCA plot obtained from a multiple factor analysis (MFA) performed on all 22 populations using 142 bioclimatic variables retrieved from the WorldClim database. [Dataset]. http://doi.org/10.1371/journal.pntd.0011486.s001
    Explore at:
    bin (available download formats)
    Dataset updated
    Aug 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A two-dimensional PCA plot obtained from a multiple factor analysis (MFA) performed on all 22 populations using 142 bioclimatic variables retrieved from the WorldClim database.

  14. d

    Data from TropiRoot 1.0 database: tropical root characteristics across...

    • search.dataone.org
    • osti.gov
    Updated Mar 10, 2025
    Cite
    Amanda L. Cordeiro; Daniela F. Cusack; Nathaly Guerrero-Ramírez; Richard J. Norby; Laura Toro; Michelle Y. Wong; S. Joseph Wright; Kristine Grace M. Cabugao; Kelly M. Andersen; Lucia Fuchslueger; Colleen M. Iversen; Fiona Soper; Om Prakash Ghimire; Laynara F. Lugli; Ana Caroline Miron; Oscar Valverde-Barrantes; Marie Arnaud; Sarah Batterman; Lee H. Dietterich; Ming Yang Lee; Monique Weemstra; Daniela Yaffar; Shalom D. Addo-Danso; Kerstin Pierick; Ryan Bridges; Carina Easton; Isabella Felsing; Nathan B. Gonçalves; Riley Krudop; Mason R. McKinzie; Julia Perbohner; Alejandra N. Pozzoli-Oropeza; Mirna Samaniego; Alex W. Smilor; Ilana S. Vargas; Layna Webb; Teddy Nikitin; Jennifer S. Powers; M. Luke McCormack (2025). Data from TropiRoot 1.0 database: tropical root characteristics across environments [Dataset]. http://doi.org/10.15485/2507279
    Explore at:
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    ESS-DIVE
    Authors
    Amanda L. Cordeiro; Daniela F. Cusack; Nathaly Guerrero-Ramírez; Richard J. Norby; Laura Toro; Michelle Y. Wong; S. Joseph Wright; Kristine Grace M. Cabugao; Kelly M. Andersen; Lucia Fuchslueger; Colleen M. Iversen; Fiona Soper; Om Prakash Ghimire; Laynara F. Lugli; Ana Caroline Miron; Oscar Valverde-Barrantes; Marie Arnaud; Sarah Batterman; Lee H. Dietterich; Ming Yang Lee; Monique Weemstra; Daniela Yaffar; Shalom D. Addo-Danso; Kerstin Pierick; Ryan Bridges; Carina Easton; Isabella Felsing; Nathan B. Gonçalves; Riley Krudop; Mason R. McKinzie; Julia Perbohner; Alejandra N. Pozzoli-Oropeza; Mirna Samaniego; Alex W. Smilor; Ilana S. Vargas; Layna Webb; Teddy Nikitin; Jennifer S. Powers; M. Luke McCormack
    Time period covered
    Jan 1, 1986 - Jan 1, 2019
    Area covered
    Description

    TropiRoot 1.0 is a new tropical root database with root characteristics across environmental gradients. It has data extracted from 107 new sources, resulting in more than 8000 rows of data (either species or community data). Most of the data in TropiRoot 1.0 includes root characteristics such as root biomass, morphology, root dynamics, mass fraction, architecture, anatomy, physiology, and root chemistry. This initiative represents an approximately 30% increase in the currently available data for tropical roots in the Fine Root Ecology Database (FRED). TropiRoot 1.0 contains root characteristics from 25 different countries, of which seven are in Asia, six in South America, five in Central America and the Caribbean, four in Africa, two in North America, and one in Oceania. Due to the volume of data, when ancillary data (including soil data) were available, they were either extracted and included in the database or their availability was recorded in an additional column. Multiple contributors checked the entries for outliers during the collation process to ensure data quality. For text-based observations, we examined all cells to ensure that their content relates to their specific categories. For numerical observations, we ordered each numerical value from least to greatest and plotted the values, checking apparent outliers against the data in their respective sources and correcting or removing incorrect or impossible values. Some data (soil and aboveground) have different columns for the same variable presented in different units, including originally published units, but root characteristics data had units converted to match the ones reported in FRED. By filling a gap in global databases, TropiRoot 1.0 expands our knowledge of regions that have so far been underrepresented and our ability to assess global trends. This advancement can be used to improve tropical forest representation in vegetation models.
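    The rank-and-inspect outlier screen described above can be sketched roughly as below; the column names, example values, and tail fraction are placeholders, not TropiRoot code.

    import pandas as pd

    def rank_for_review(df, column, tail_fraction=0.02):
        """Sort a numeric column and return its extreme tails for manual checking
        against the original sources."""
        ordered = df.sort_values(column).reset_index(drop=True)
        n_tail = max(1, int(len(ordered) * tail_fraction))
        return pd.concat([ordered.head(n_tail), ordered.tail(n_tail)])

    demo = pd.DataFrame({"source_id": range(6),
                         "root_biomass": [120, 95, 110, 130, 105, 9800]})
    print(rank_for_review(demo, "root_biomass"))   # flags the 95 and 9800 rows for review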

  15. Nineteen bioclimatic variables retrieved from WorldClim database using...

    • plos.figshare.com
    bin
    Updated Aug 8, 2023
    + more versions
    Cite
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta (2023). Nineteen bioclimatic variables retrieved from WorldClim database using principal coordinates for each sampling site. [Dataset]. http://doi.org/10.1371/journal.pntd.0011486.s003
    Explore at:
    bin (available download formats)
    Dataset updated
    Aug 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Melveettil Kishor Sumitha; Mariapillai Kalimuthu; Mayandi Senthil Kumar; Rajaiah Paramasivan; Narendran Pradeep Kumar; Ittoop Pulikkottil Sunish; Thiruppathi Balaji; Devojit Kumar Sarma; Devendra Kumar; Devi Shankar Suman; Hemlata Srivastava; Ipsita Pal Bhowmick; Keshav Vaishnav; Om P. Singh; Prabhakargouda B. Patil; Suchi Tyagi; Suman S. Mohanty; Tapan Kumar Barik; Sreehari Uragayala; Ashwani Kumar; Bhavna Gupta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Nineteen bioclimatic variables retrieved from WorldClim database using principal coordinates for each sampling site.

  16. d

    Data from: Probability distribution grids of dissolved oxygen and dissolved...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 29, 2025
    Cite
    U.S. Geological Survey (2025). Probability distribution grids of dissolved oxygen and dissolved manganese concentrations at selected thresholds in drinking water depth zones, Central Valley, California [Dataset]. https://catalog.data.gov/dataset/probability-distribution-grids-of-dissolved-oxygen-and-dissolved-manganese-concentrations-
    Explore at:
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    California, Central Valley
    Description

    The ascii grids represent regional probabilities that groundwater in a particular location will have dissolved oxygen (DO) concentrations less than selected threshold values representing anoxic groundwater conditions or will have dissolved manganese (Mn) concentrations greater than selected threshold values representing secondary drinking water-quality contaminant levels (SMCL) and health-based screening levels (HBSL) for water quality. The probability models were constrained by the alluvial boundary of the Central Valley to a depth of approximately 300 meters (m). We utilized prediction modeling methods, specifically boosted regression trees (BRT) with a Bernoulli error distribution within a statistical learning framework within R's computing framework (http://www.r-project.org/) to produce two-dimensional probability grids at selected depths throughout the modeling domain. The statistical learning framework seeks to maximize the predictive performance of machine learning methods through model tuning by cross validation. Models were constructed using measured dissolved oxygen and manganese concentrations sampled from 2,767 wells within the alluvial boundary of the Central Valley and over 60 predictor variables from 7 sources (see metadata) and were assembled to develop a model that incorporates regional-scale soil properties, soil chemistry, land use, aquifer textures, and aquifer hydrology. Previously developed Central Valley model outputs of textures (Central Valley Textural Model, CVTM; Faunt and others, 2010) and MODFLOW-simulated vertical water fluxes and predicted depth to water table (Central Valley Hydrologic Model, CVHM; Faunt, 2009) were used to represent aquifer textures and groundwater hydraulics, respectively. The wells used in the BRT models described above were attributed to predictor variable values in ArcGIS using a 500-m buffer. The response variable data consisted of measured DO and Mn concentrations from 2,767 wells within the alluvial boundary of the Central Valley. The data were compiled from two sources: U.S. Geological Survey (USGS) National Water Information System (NWIS) database (all data are publicly available from the USGS at http://waterdata.usgs.gov/ca/nwis/nwis) and the California State Water Resources Control Board Division of Drinking Water (SWRCB-DDW) database (water-quality data are publicly available from the SWRCB at http://geotracker.waterboards.ca.gov/gama/). Only wells with well depth data were selected, and for wells with multiple records, only the most recent sample in the period 1993–2014 that had the required water-quality data was used. Data were available for 932 wells for the NWIS dataset and 1,835 wells for the SWRCB-DDW dataset. Models were trained on a USGS NWIS dataset of 932 wells and evaluated on an independent hold-out dataset of 1,835 wells from the SWRCB-DDW. We used cross-validation to assess the predictive performance of models of varying complexity as a basis for selecting the final models used to create the prediction grids. Trained models were applied to cross-validation testing data and a separate hold-out dataset to evaluate model predictive performance by emphasizing three model metrics of fit: Kappa, accuracy, and the area under the receiver operator characteristic (ROC) curve. The final trained models were used for mapping predictions at discrete depths to a depth of approximately 300 m. Trained DO and Mn models had accuracies of 86–100 percent, Kappa values of 0.69–0.99, and ROC values of 0.92–1.0. 
Model accuracies for cross-validation testing datasets were 82–95 percent, and ROC values were 0.87–0.91, indicating good predictive performance. Kappa values for the cross-validation testing dataset were 0.30–0.69, indicating fair to substantial agreement between testing observations and model predictions. Hold-out data were available for the manganese model only and indicated accuracies of 89–97 percent, ROC values of 0.73–0.75, and Kappa values of 0.06–0.30. The predictive performance of both the DO and Mn models was reasonable, considering all three of these fit metrics and the low percentages of low-DO and high-Mn events in the data. See associated journal article (Rosecrans and others, 2017) for complete summary of BRT modeling methods, model fit metrics, and relative influence of predictor variables for a given DO or Mn BRT model. The modeled response variables for the DO BRT models were based on measured DO values from wells at the following thresholds: <0.5 milligrams per liter (mg/L), <1.0 mg/L, and <2.0 mg/L, and these thresholds values were considered anoxic based on literature reviews. The modeled response variables for the Mn BRT models were based on measured Mn values from wells at the following exceedance thresholds: >50 micrograms per liter (µg/L), >150 µg/L, and >300 µg/L. (The 150 µg/L manganese threshold represents one-half the USGS HBSL.) The prediction grid discretization below land surface was in 15-m intervals to a depth of 122 m, followed by intervals of 30 m to a depth of 300 m, resulting in 14 two-dimensional probability grids for each constituent (DO and Mn) and threshold. Probability grid maps were also created for the shallow aquifer and deep aquifer represented by the median domestic and public-supply well depths, respectively. A depth of 46 m was used to stratify wells from the training dataset into the shallow and deep aquifer and was derived from depth percentiles associated with domestic and public supply in previous work by Burow and others (2013). In this work, the median well depth categorized as domestic was 30 m below land surface (bls), and the median well depth categorized as public-supply wells was 100 m bls. Therefore, datasets contained in the folders named "DO BRT prediction grids.zip" and "Mn BRT prediction grids.zip" each have 42 probability grids representing specific depths for each of the selected thresholds of DO and Mn BRT threshold models described above. The dataset contained in the folder named "PublicSupply&DomesticGrids.zip" contains probability grids represented by the domestic and public-supply drinking water depths for each of the six BRT models described above (12 grids total).
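    The boosted-tree workflow and fit metrics described above (accuracy, Kappa, ROC) can be approximated in outline as below; the study itself used BRT in R, so this Python sketch with placeholder predictors is only an analogue, not the USGS code.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))                 # stand-ins for depth, texture, flux, ...
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0.8).astype(int)

    model = GradientBoostingClassifier(random_state=0)        # Bernoulli (log-loss) boosting
    prob = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    pred = (prob >= 0.5).astype(int)

    print("accuracy:", accuracy_score(y, pred))
    print("kappa:   ", cohen_kappa_score(y, pred))
    print("ROC AUC: ", roc_auc_score(y, prob))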

  17. d

    Restaurants, Fast Food, USA, Top 25 | 200k+ PoIs with 30+ Attributes |...

    • datarade.ai
    .json, .xml, .csv
    Updated Feb 20, 2025
    Cite
    xavvy (2025). Restaurants, Fast Food, USA, Top 25 | 200k+ PoIs with 30+ Attributes | monthly updates | API & Datasets [Dataset]. https://datarade.ai/data-products/restaurants-fast-food-usa-top-25-200k-pois-with-30-att-xavvy
    Explore at:
    .json, .xml, .csv (available download formats)
    Dataset updated
    Feb 20, 2025
    Dataset authored and provided by
    xavvy
    Area covered
    United States of America
    Description

    Xavvy fuel is the leading source for location data and market insights worldwide. We specialize in data quality and enrichment, providing high-quality POI data for restaurants and quick-service establishments in the United States.

    Base data • Name/Brand • Address • Geocoordinates • Opening Hours • Phone • ...

    30+ Services • Delivery • Wifi • ChargePoints • …

    10+ Payment options • Visa • MasterCard • Google Pay • individual Apps • ...

    Our data offering is highly customizable and flexible in delivery – whether one-time or regular data delivery, push or pull services, and various data formats – we adapt to our customers' needs.

    Brands included: • McDonalds • Burger King • Subway • KFC • Wendy's • ...

    The total number of restaurants per region, market share distribution among competitors, or the ideal location for new branches – our restaurant data provides valuable insights into the food service market and serves as the perfect foundation for in-depth analyses and statistics. Our data helps businesses across various industries make informed decisions regarding market development, expansion, and competitive strategies. Additionally, our data contributes to the consistency and quality of existing datasets. A simple data mapping allows for accuracy verification and correction of erroneous entries.

    Especially when displaying information about restaurants and fast-food chains on maps or in applications, high data quality is crucial for an optimal customer experience. Therefore, we continuously optimize our data processing procedures: • Regular quality controls • Geocoding systems to refine location data • Cleaning and standardization of datasets • Consideration of current developments and mergers • Continuous expansion and cross-checking of various data sources

    Integrate the most comprehensive database of restaurant locations in the USA into your business. Explore our additional data offerings and gain valuable market insights directly from the experts!

  18. Validated temperature and salinity data, and reconstructed nutrient...

    • zenodo.org
    zip
    Updated Sep 18, 2025
    Cite
    Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li; Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li (2025). Validated temperature and salinity data, and reconstructed nutrient concentrations in the North Pacific (1895–2024) [Dataset]. http://doi.org/10.5281/zenodo.17140658
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 18, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li; Chuanjun Du; Naiwen Zheng; Shuh-Ji Kao; Minhan Dai; Zhimian Cao; Dalin Shi; Qiancheng Li; Hao Wang; Xiaolin Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The original hydrographic and nutrient data were compiled from the CLIVAR and Carbon Hydrographic Data Office (CCHDO; providing both hydrographic and nutrient measurements) and the World Ocean Database (WOD; supplying hydrographic data only) across the North Pacific. Rigorous quality control procedures—including range, spike, gradient, inversion, and outlier checks—were applied to remove low-quality temperature, salinity, and nutrient records (NO₃⁻, NO₂⁻, DIP, and Si(OH)₄) from both databases. A machine learning model (Random Forest) was trained on the quality-controlled CCHDO data to reconstruct nutrient concentrations, using spatial, temporal, and water mass predictors derived from the validated WOD hydrographic dataset. This process generated approximately 435 million reconstructed nutrient data points across 1.9 million stations for each nutrient species within the WOD, covering the period from 1895 to 2024 in the North Pacific (118.6ºE to 280.3ºE; -2.0 to 60.6ºN). The final dataset offers validated temperature and salinity values along with reconstructed nutrient concentrations, providing a valuable resource for studying ocean biogeochemistry and climate-related changes in the North Pacific.
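    The reconstruction step described above can be sketched in outline as below; the predictors and synthetic data are placeholders, and the actual model, feature engineering, and quality-control steps are more involved.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    n = 2000
    # Stand-ins for spatial, temporal, and water-mass predictors
    # (e.g. longitude, latitude, depth, year, temperature, salinity):
    X_train = rng.normal(size=(n, 6))
    y_train = 20 + 3 * X_train[:, 2] - 2 * X_train[:, 4] + rng.normal(scale=1.0, size=n)

    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X_train, y_train)                              # trained on QC'd CCHDO-style data
    nutrient_hat = rf.predict(rng.normal(size=(5, 6)))    # applied to WOD-style casts
    print(np.round(nutrient_hat, 2))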

  19. d

    Macroecological database of mammalian body mass

    • dataone.org
    • data.esa.org
    • +2more
    Updated Aug 14, 2015
    + more versions
    Cite
    NCEAS 2182: Smith: Body size in ecology and paleoecology: Linking pattern and process across spatial, temporal and taxonomic scales; National Center for Ecological Analysis and Synthesis; Felisa Smith (2015). Macroecological database of mammalian body mass [Dataset]. http://doi.org/10.5063/AA/nceas.196.3
    Explore at:
    Dataset updated
    Aug 14, 2015
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    NCEAS 2182: Smith: Body size in ecology and paleoecology: Linking pattern and process across spatial, temporal and taxonomic scales; National Center for Ecological Analysis and Synthesis; Felisa Smith
    Time period covered
    Jan 3, 1
    Area covered
    Earth
    Variables measured
    Genus, Order, Family, Status, Species, Citation, Log Mass, Continent, Reference, Combined mass, and 2 more
    Description

    The purpose of this data set was to compile body mass information for all mammals on Earth so that we could investigate the patterns of body mass seen across geographic and taxonomic space and evolutionary time. We were interested in the heritability of body size across taxonomic groups (How conserved is body mass within a genus, family, and order?), in the overall pattern of body mass across continents (Do the moments and other descriptive statistics remain the same across geographic space?), and over evolutionary time (How quickly did body mass patterns iterate on the patterns seen today? Were the Pleistocene extinctions size-specific on each continent, and did these events coincide with the arrival of man?). These data are also part of a larger project that seeks to integrate body mass patterns across very diverse taxa (NCEAS Working Group on Body size in ecology and paleoecology: linking pattern and process across space, time and taxonomic scales). We began with the updated version of Wilson and Reeder's (1993) taxonomic list of all known Recent mammals of the world (N = 4629 species), to which we added status, distribution, and body mass estimates compiled from the primary and secondary literature. Whenever possible, we used an average of male and female body mass, which was in turn averaged over multiple localities to arrive at our species body mass values. The sources are line-referenced in the main data set, with the actual references appearing in a table within the metadata. Mammals have individual records for each continent they occur on. Please note that our data set is more than an amalgamation of smaller compilations. Although we relied heavily on a data set for Chiroptera by K. E. Jones (N = 905), the CRC handbook of Mammalian Body Mass (N = 688), and a data set compiled for South America by P. Marquet (N = 505), these total less than half the records in the current database. The remainder are derived from more than 150 other sources (see reference table). Furthermore, we include a comprehensive late Pleistocene species assemblage for Africa, North and South America, and Australia (an additional 230 species). "Late Pleistocene" is defined as approximately 11 ka for Africa, North and South America, and as 50 ka for Australia, because these times predate anthropogenic impacts on mammalian fauna. Overall, the temporal coverage is from the late Pleistocene to present (ca. 45,000 ybp to present). Estimates contained within this data set represent a generalized species value, averaged across gender and geographic space. Consequently, these data are not appropriate for asking population-level questions where the integration of body mass with specific environmental conditions is important. All extant orders of mammals are included, as well as several archaic groups (N = 4859 species). Because some species are found on more than one continent (particularly Chiroptera), there are 5731 entries.
We have body masses for the following: Artiodactyla (280 records), Bibymalagasia (2 records), Carnivora (393 records), Cetacea (75 records), Chiroptera (1071 records), Dasyuromorphia (67 records), Dermoptera (3 records), Didelphimorphia (68 records), Diprotodontia (127 records), Hydracoidea (5 records), Insectivora (234 records), Lagomorpha (53 records), Litopterna (2 records), Macroscelidea (14 records), Microbiotheria (1 record), Monotremata (7 records), Notoryctemorphia (1 record), Notoungulata (5 records), Paucituberculata (5 records), Peramelemorphia (24 records), Perissodactyla (47 records), Pholidota (8 records), Primates (276 records), Proboscidea (14 records), Rodentia (1425 records), Scandentia (15 records), Sirenia (6 records), Tubulidentata (1 record), and Xenarthra (75 records).

    Data has undergone substantial data quality and assurance checking, though this is an on-going process. Histograms of the body masses of each order were produced, and values at the tails were double-checked for accuracy. When multiple sources of information were available for a species, or new sources encountered, we used those with higher sample sizes and gender-specific information.

    Headers are given here as header name followed by more information such as measurement units or other basic descriptor. More information on the variable definitions can be found in Section B, variable information (at http://www.esapubs.org/archive/ecol/E084/094/metadata.htm). Continent (SA, NA, EA, insular, oceanic, AUS, AF), Status (extinct, historical, introduction, or extant), Order, Family, Genus, Species, Log Mass (grams), Combined Mass (grams), Reference.

  20. w

    Multi Country Study Survey 2000-2001, Long version - Mexico

    • apps.who.int
    • catalog.ihsn.org
    Updated Jan 16, 2014
    + more versions
    Cite
    World Health Organization (WHO) (2014). Multi Country Study Survey 2000-2001, Long version - Mexico [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/201
    Explore at:
    Dataset updated
    Jan 16, 2014
    Dataset authored and provided by
    World Health Organization (WHO)
    Time period covered
    2000 - 2001
    Area covered
    Mexico
    Description

    Abstract

    In order to develop various methods of comparable data collection on health and health system responsiveness, WHO started a scientific survey study in 2000-2001. This study used a common survey instrument in nationally representative populations with a modular structure for assessing the health of individuals in various domains, health system responsiveness, household health care expenditures, and additional modules in other areas such as adult mortality and health state valuations.

    The health module of the survey instrument was based on selected domains of the International Classification of Functioning, Disability and Health (ICF) and was developed after a rigorous scientific review of various existing assessment instruments. The responsiveness module has been the result of ongoing work over the last 2 years that has involved international consultations with experts and key informants and has been informed by the scientific literature and pilot studies.

    Questions on household expenditure and proportionate expenditure on health have been borrowed from existing surveys. The survey instrument has been developed in multiple languages using cognitive interviews and cultural applicability tests, stringent psychometric tests for reliability (i.e. test-retest reliability to demonstrate the stability of application) and most importantly, utilizing novel psychometric techniques for cross-population comparability.

    The study was carried out in 61 countries, completing 71 surveys, because two different modes were intentionally used for comparison purposes in 10 countries. Surveys were conducted in different modes: in-person household 90-minute interviews in 14 countries; brief face-to-face interviews in 27 countries and computerized telephone interviews in 2 countries; and postal surveys in 28 countries. All samples were selected from nationally representative sampling frames with a known probability so as to make estimates based on general population parameters.

    The survey study tested novel techniques to control the reporting bias between different groups of people in different cultures or demographic groups (i.e. differential item functioning) so as to produce comparable estimates across cultures and groups. To achieve comparability, the self-reports of individuals about their own health were calibrated against well-known performance tests (i.e. self-reported vision was measured against the standard Snellen visual acuity test) or against short descriptions in vignettes that marked known anchor points of difficulty (e.g. people with different levels of mobility, such as a paraplegic person or an athlete who runs 4 km each day) so as to adjust the responses for comparability. The same method was also used for self-reports of individuals assessing the responsiveness of their health systems, where vignettes on different responsiveness domains describing different levels of responsiveness were used to calibrate the individual responses.

    These data are useful in their own right for standardizing indicators for different domains of health (such as cognition, mobility, self-care, affect, usual activities, pain, social participation, etc.), but they also provide a better measurement basis for assessing the health of populations in a comparable manner. The data from the surveys can be fed into composite measures such as "Healthy Life Expectancy" and improve the empirical data input for health information systems in different regions of the world. Data from the surveys were also useful for improving the measurement of the responsiveness of different health systems to the legitimate expectations of the population.

    Geographic coverage

    15 federal states: Distrito Federal, Guanajuato, Jalisco, Estado de México, Michoacán, Querétaro, Guerrero, Oaxaca, Puebla, Veracruz, Yucatán, Chihuahua, Nuevo León, San Luis Potosí, Sonora

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sample was probabilistic, multistage, stratified, and clustered, and represented urban and rural strata.

    Mexico has 32 Federal States, which were divided into 3 regions: Centre, South, and North. Out of these, 15 states were selected as follows: Centre: Distrito Federal, Guanajuato, Jalisco, Estado de México, Michoacán, Querétaro; South: Guerrero, Oaxaca, Puebla, Veracruz, Yucatán; North: Chihuahua, Nuevo León, San Luis Potosí, Sonora

    Mode of data collection

    Face-to-face [f2f]

    Cleaning operations

    Data Coding: At each site the data was coded by investigators to indicate the respondent status and the selection of the modules for each respondent within the survey design. After the interview was edited by the supervisor and considered adequate, it was entered locally.

    Data Entry Program: A data entry program was developed at WHO specifically for the survey study and provided to the sites. It was built using a database program called the I-Shell (short for Interview Shell), a tool designed for easy development of computerized questionnaires and data entry (34). This program allows for easy data cleaning and processing.

    The data entry program checked for inconsistencies and validated the entries in each field by checking for valid response categories and applying range checks. For example, the program didn’t accept an age greater than 120. For almost all of the variables there was a range or a list of possible values that the program checked against.

    In addition, the data was entered twice to capture other data entry errors. The data entry program was able to warn the user whenever a value that did not match the first entry was entered at the second data entry. In this case the program asked the user to resolve the conflict by choosing either the 1st or the 2nd data entry value to be able to continue. After the second data entry was completed successfully, the data entry program placed a mark in the database in order to enable the checking of whether this process had been completed for each and every case.

    Data Transfer: The data entry program could export the entered data into one compressed database file, which could easily be sent to WHO as an email attachment or via a file transfer program to a secure server, no matter how many cases were in the file. The sites were allowed to use as many computers and as many data entry personnel as they wanted. Each computer used for this purpose produced one file, and the files were merged once they were delivered to WHO with the help of other programs built to automate the process. The sites sent the data periodically as they collected it, enabling the checking procedures and preliminary analyses in the early stages of the data collection.

    Data quality checks: Once the data was received, it was analyzed for missing information, invalid responses, and representativeness. Inconsistencies were also noted and reported back to sites.

    Data Cleaning and Feedback: After receipt of cleaned data from sites, another program was run to check for missing information, incorrect information (e.g. wrong use of center codes), duplicated data, etc. The output of this program was fed back to sites regularly. Mainly, this consisted of cases with duplicate IDs, duplicate cases (where the data for two respondents with different IDs were identical), wrong country codes, missing age, sex, education and some other important variables.
