Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Experimental statistical procedures used in almost all scientific papers are fundamental for clearer interpretation of the results of experiments conducted in agrarian sciences. However, incorrect use of these procedures can lead the researcher to incorrect or incomplete conclusions. Therefore, the aim of this study was to evaluate the characteristics of the experiments and quality of the use of statistical procedures in soil science in order to promote better use of statistical procedures. For that purpose, 200 articles, published between 2010 and 2014, involving only experimentation and studies by sampling in the soil areas of fertility, chemistry, physics, biology, use and management were randomly selected. A questionnaire containing 28 questions was used to assess the characteristics of the experiments, the statistical procedures used, and the quality of selection and use of these procedures. Most of the articles evaluated presented data from studies conducted under field conditions and 27 % of all papers involved studies by sampling. Most studies did not mention testing to verify normality and homoscedasticity, and most used the Tukey test for mean comparisons. Among studies with a factorial structure of the treatments, many had ignored this structure, and data were compared assuming the absence of factorial structure, or the decomposition of interaction was performed without showing or mentioning the significance of the interaction. Almost none of the papers that had split-block factorial designs considered the factorial structure, or they considered it as a split-plot design. Among the articles that performed regression analysis, only a few of them tested non-polynomial fit models, and none reported verification of the lack of fit in the regressions. The articles evaluated thus reflected poor generalization and, in some cases, wrong generalization in experimental design and selection of procedures for statistical analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The datasets and code for Big Meaning: Qualitative Analysis on Large Bodies of Data Using AI
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We include the course syllabus used to teach quantitative research design and analysis methods to graduate Linguistics students using a blended teaching and learning approach. The blended course took place over two weeks and built on a face-to-face course presented over two days in 2019. Students worked through the topics in preparation for a live interactive video session each Friday to go through the activities. Additional communication took place on Slack for two hours each week. A survey was conducted at the start and end of the course to ascertain participants' perceptions of the usefulness of the course. The links to online elements and the evaluations have been removed from the uploaded course guide.
Participants who complete this workshop will be able to:
- outline the steps and decisions involved in quantitative data analysis of linguistic data
- explain common statistical terminology (sample, mean, standard deviation, correlation, nominal, ordinal and scale data)
- perform common statistical tests using jamovi (e.g. t-test, correlation, ANOVA, regression)
- interpret and report common statistical tests
- describe and choose from the various graphing options used to display data
- use jamovi to perform common statistical tests and graph results
Evaluation
Participants who complete the course will use these skills and knowledge to complete the following activities for evaluation:
- analyse the data for a project and/or assignment (in part or in whole)
- plan the results section of an Honours research project (where applicable)
Feedback and suggestions can be directed to M Schaefer schaemn@unisa.ac.za
This dataset is an ATLAS.ti copy bundle that contains the analysis of 86 articles that appeared between March 2011 and March 2013 in the Dutch quality newspaper NRC Handelsblad in the weekly article series 'the last word' [Dutch: 'het laatste woord'], written by NRC editor Gijsbert van Es. Newspaper texts have been retrieved from LexisNexis (http://academic.lexisnexis.nl/). These articles describe the experience of the last phase of life of people who were confronted with approaching death due to cancer or other life-threatening diseases, or due to old age and age-related health losses. The analysis focuses on the meanings concerning death and dying that were expressed by these people in their last phase of life. The dataset was analysed with ATLAS.ti and contains a codebook. In the memo manager a memo is included that provides information concerning the analysed data. Culturally embedded meanings concerning death and dying have been interpreted as 'death-related cultural affordances': possibilities for perception and action in the face of death that are offered by the cultural environment. These have been grouped into three different ‘cultural niches’ (sets of mutually supporting cultural affordances) that are grounded in different mechanisms for determining meaning: a canonical niche (grounding meaning in established (religious) authority and tradition), a utilitarian niche (grounding meaning in rationality and utilitarian function) and an expressive niche (grounding meaning in authentic (and often aesthetic) self-expression). Interviews are in Dutch; codes, analysis and metadata are in English.
The modeled data in these archives are in the NetCDF format (https://www.unidata.ucar.edu/software/netcdf/). NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data. The Unidata Program Center supports and maintains netCDF programming interfaces for C, C++, Java, and Fortran. Programming interfaces are also available for Python, IDL, MATLAB, R, Ruby, and Perl. Data in netCDF format is:
• Self-Describing. A netCDF file includes information about the data it contains.
• Portable. A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.
• Scalable. Small subsets of large datasets in various formats may be accessed efficiently through netCDF interfaces, even from remote servers.
• Appendable. Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.
• Sharable. One writer and multiple readers may simultaneously access the same netCDF file.
• Archivable. Access to all earlier forms of netCDF data will be supported by current and future versions of the software.
Pub_figures.tar.zip contains the NCL scripts for figures 1-5 and the Chesapeake Bay Airshed shapefile. The directory structure of the archive is ./Pub_figures/Fig#_data, where # is the figure number from 1-5.
EMISS.data.tar.zip contains two NetCDF files with the emission totals for the 2011ec and 2040ei emission inventories. The file names contain the year of the inventory and the file header contains a description of each variable and the variable units.
EPIC.data.tar.zip contains the monthly mean EPIC data in NetCDF format for ammonium fertilizer application (files with ANH3 in the name) and soil ammonium concentration (files with NH3 in the name) for the historical (Hist directory) and future (RCP-4.5 directory) simulations.
WRF.data.tar.zip contains mean monthly and seasonal data from the 36 km downscaled WRF simulations in NetCDF format for the historical (Hist directory) and future (RCP-4.5 directory) simulations.
CMAQ.data.tar.zip contains the mean monthly and seasonal data in NetCDF format from the 36 km CMAQ simulations for the historical (Hist directory), future (RCP-4.5 directory) and future with historical emissions (RCP-4.5-hist-emiss directory) simulations.
This dataset is associated with the following publication: Campbell, P., J. Bash, C. Nolte, T. Spero, E. Cooter, K. Hinson, and L. Linker. Projections of Atmospheric Nitrogen Deposition to the Chesapeake Bay Watershed. Journal of Geophysical Research - Biogeosciences. American Geophysical Union, Washington, DC, USA, 12(11): 3307-3326, (2019).
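Since the archives above are all NetCDF, a quick way to explore their contents is through one of the programming interfaces listed above. The following is a minimal sketch using the Python netCDF4 library; the file name is a placeholder for illustration only and the variable names in the actual archives will differ.

```python
# Minimal sketch: open a NetCDF file and list its self-describing metadata.
# "example_epic_monthly.nc" is a placeholder name, not an actual file in the archives.
from netCDF4 import Dataset

with Dataset("example_epic_monthly.nc") as nc:
    print(nc)  # global attributes: the file describes itself
    for name, var in nc.variables.items():
        # dimensions, shape and units are stored alongside each variable
        print(name, var.dimensions, var.shape, getattr(var, "units", "n/a"))
```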
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
This dataset contains ensemble means of ERA5 surface level analysis parameter data (see the linked dataset for the corresponding spreads). ERA5 is the 5th generation reanalysis project from the European Centre for Medium-Range Weather Forecasts (ECMWF) - see the linked documentation for further details. The ensemble means and spreads are calculated from the 10-member ERA5 ensemble, which is run at a reduced resolution compared with the single high-resolution 'HRES' realisation (hourly output at 31 km grid spacing), for which these data provide an uncertainty estimate. This dataset contains a limited selection of the available variables, which have been converted to netCDF from the original GRIB files held on the ECMWF system. They have also been translated onto a regular latitude-longitude grid during the extraction process from the ECMWF holdings. For a fuller set of variables please see the Copernicus Data Store (CDS) data tool linked to from this record.
Note, ensemble standard deviation is often referred to as ensemble spread and is calculated as the standard deviation of the 10 members in the ensemble (i.e., including the control). It is not the sample standard deviation, and was thus calculated by dividing by 10 (N) rather than 9 (N-1). See the linked datasets for ensemble member and ensemble mean data.
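As a concrete illustration of the note above, the spread uses the population form of the standard deviation (divide by N = 10), not the sample form (divide by N - 1 = 9). A small numpy sketch with made-up ensemble values:

```python
import numpy as np

# Ten made-up ensemble-member values for a single grid point (illustrative only).
members = np.array([281.2, 280.9, 281.5, 281.1, 280.7,
                    281.3, 281.0, 281.4, 280.8, 281.2])

spread = members.std(ddof=0)      # population SD: divide by N = 10, as used for ensemble spread
sample_sd = members.std(ddof=1)   # sample SD: divide by N - 1 = 9 (not what is used here)
print(spread, sample_sd)
```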
The ERA5 global atmospheric reanalysis covers 1979 to 2 months behind the present month. This follows on from the ERA-15, ERA-40 and ERA-Interim re-analysis projects.
An initial release of ERA5 data (ERA5t) is made roughly 5 days behind the present date. These data are subsequently reviewed before being released by ECMWF as quality-assured data within 3 months. CEDA holds a 6 month rolling copy of the latest ERA5t data; see the related datasets linked to from this record. However, for the period 2000-2006 the initial ERA5 release was found to suffer from stratospheric temperature biases, so new runs addressing this issue were performed, resulting in the ERA5.1 release (see linked datasets). Note, though, that Simmons et al. 2020 (technical memo 859) report that "ERA5.1 is very close to ERA5 in the lower and middle troposphere", but users of data from this period should read technical memo 859 for further details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Excel-based tool was developed to analyze means-end chain data. The tool consists of a user manual, a data input file to correctly organise your MEC data, a calculator file to analyse your data, and instructional videos. The purpose of this tool is to aggregate laddering data into hierarchical value maps showing means-end chains. The summarized results consist of (1) a summary overview, (2) a matrix, and (3) output for copy/pasting into NodeXL to generate hierarchical value maps (HVMs). To use this tool, you must have collected data via laddering interviews. Ladders are codes linked together consisting of attributes, consequences and values (ACVs).
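For readers unfamiliar with the aggregation step, the sketch below shows, in Python and with invented ladders, how coded ladders can be collapsed into direct-link counts of the kind that feed an implication matrix and, ultimately, an HVM. It is an illustration of the idea only, not the Excel tool's implementation.

```python
from collections import Counter

# Invented ladders: each is a coded attribute -> consequence -> value (ACV) chain.
ladders = [
    ["low price", "save money", "financial security"],
    ["low price", "save money", "freedom"],
    ["organic", "better health", "well-being"],
]

# Aggregate direct links between adjacent codes; these counts form an implication matrix.
links = Counter()
for ladder in ladders:
    for source, target in zip(ladder, ladder[1:]):
        links[(source, target)] += 1

for (source, target), count in sorted(links.items()):
    print(f"{source} -> {target}: {count}")
```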
https://spdx.org/licenses/CC0-1.0.html
Animal ecologists often collect hierarchically-structured data and analyze these with linear mixed-effects models. Specific complications arise when the effect sizes of covariates vary on multiple levels (e.g., within vs among subjects). Mean-centering of covariates within subjects offers a useful approach in such situations, but is not without problems. A statistical model represents a hypothesis about the underlying biological process. Mean-centering within clusters assumes that the lower-level responses (e.g. within subjects) depend on the deviation from the subject mean (relative) rather than on absolute values of the covariate. This may or may not be biologically realistic. We show that mismatch between the nature of the generating (i.e., biological) process and the form of the statistical analysis produces major conceptual and operational challenges for empiricists. We explored the consequences of mismatches by simulating data with three response-generating processes differing in the source of correlation between a covariate and the response. These data were then analyzed by three different analysis equations. We asked how robustly different analysis equations estimate key parameters of interest and under which circumstances biases arise. Mismatches between generating and analytical equations created several intractable problems for estimating key parameters. The most widely misestimated parameter was the among-subject variance in response. We found that no single analysis equation was robust in estimating all parameters generated by all equations. Importantly, even when response-generating and analysis equations matched mathematically, bias in some parameters arose when sampling across the range of the covariate was limited. Our results have general implications for how we collect and analyze data. They also remind us more generally that conclusions from statistical analysis of data are conditional on a hypothesis, sometimes implicit, for the process(es) that generated the attributes we measure. We discuss strategies for real data analysis in the face of uncertainty about the underlying biological process.
Methods: All data were generated through simulations. Included with this submission are a Read Me file containing general descriptions of the data files, a code file containing the R code for the simulations and analyses (which will generate new datasets with the same parameters), and the analyzed results in the data files archived here. These data files form the basis for all results presented in the published paper. The code file (in R markdown) has more detailed descriptions of each file of analyzed results.
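To make the centering step concrete, here is a minimal sketch (in Python/pandas rather than the R code archived with this dataset) of splitting a covariate into its between-subject component (the subject mean) and its within-subject component (the deviation from that mean); the data are simulated purely for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Simulated data: 20 subjects, 10 observations each (illustrative only).
df = pd.DataFrame({
    "subject": np.repeat(np.arange(20), 10),
    "x": rng.normal(size=200),
})

# Between-subject component: each subject's mean of the covariate.
df["x_between"] = df.groupby("subject")["x"].transform("mean")
# Within-subject component: deviation from the subject's own mean.
df["x_within"] = df["x"] - df["x_between"]
```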
This dataset includes monthly means of 2.5 degree surface and flux analysis data from the ECMWF ERA-40 reanalysis project.
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
ERA-Interim is the latest European Centre for Medium-Range Weather Forecasts (ECMWF) global atmospheric reanalysis of the period 1979 to August 2019. This follows on from the ERA-15 and ERA-40 re-analysis projects.
The dataset includes monthly means of daily mean pressure level data on a reduced N256 Gaussian grid.
https://www.datainsightsmarket.com/privacy-policy
The size of the Data Analytics Outsourcing Industry market was valued at USD XXX Million in 2023 and is projected to reach USD XXX Million by 2032, with an expected CAGR of 34.33% during the forecast period. Data analytics outsourcing is a strategic approach in which an organization delegates its data analytics duties and responsibilities to third-party service providers. It is also called data analytics as a service, whereby businesses tap into professional skills and cutting-edge technologies without significant internal investment. Commonly outsourced elements include data collection, cleaning, preparation, analysis, and visualization. Providers extract valuable insights from large datasets by applying artificial intelligence, big data, Internet of Things (IoT), business intelligence, and automation. This expertise supports better decision-making, improves operations, identifies emerging trends, and helps organizations compete more effectively. Data analytics outsourcing benefits a wide range of businesses, including finance, healthcare, retail, and technology companies, allowing organizations to draw meaningful, relevant insight from data for sustainable growth. Recent developments include: February 2024 - Wipro and IBM expanded their partnership to offer new AI services and support to clients. Wipro launched an Enterprise AI-Ready Platform, leveraging IBM Watsonx, to advance enterprise adoption of generative AI. As part of the expanded partnership, IBM and Wipro will establish a centralized tech hub to support joint clients in their AI pursuits, and Wipro associates will be trained in IBM hybrid cloud, AI, and data analytics technologies to help accelerate the development of joint solutions. September 2023 - IBM announced plans for new generative AI foundation models and enhancements coming to watsonx. These enhancements include a technical preview for watsonx.governance, new generative AI data services coming to watsonx.data, and the planned integration of watsonx.ai foundation models across select software and infrastructure products. Key drivers for this market are: Increasing Volume and Variety of Data being Generated are the Major Driving Factors for this Industry, Increasing Adoption of Data Analytics Outsourcing in BFSI. Potential restraints include: Significant Infrastructure Requires Huge Capital Investment. Notable trends are: Increasing Adoption of Data Analytics Outsourcing in BFSI is Driving the Market.
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
ERA-Interim is the latest European Centre for Medium-Range Weather Forecasts (ECMWF) global atmospheric reanalysis of the period 1979 to August 2019. This follows on from the ERA-15 and ERA-40 re-analysis projects.
The dataset includes synoptic monthly mean analysed potential vorticity level data on a reduced N256 Gaussian grid. Data are available at the 00, 06, 12 and 18 UT analysis times.
https://artefacts.ceda.ac.uk/licences/specific_licences/ecmwf-era-products.pdf
ERA-Interim is the latest European Centre for Medium-Range Weather Forecasts (ECMWF) global atmospheric reanalysis of the period 1979 to August 2019. This follows on from the ERA-15 and ERA-40 re-analysis projects.
The dataset includes synoptic monthly mean analysed vertical integral data on a reduced N256 Gaussian grid. Data are available at the 00, 06, 12 and 18 UT analysis times.
Background
Gene expression profiling among different tissues is of paramount interest in various areas of biomedical research. We have developed a novel method (DADA, Digital Analysis of cDNA Abundance) that calculates the relative abundance of genes in cDNA libraries.
Results
DADA is based upon multiple restriction fragment length analysis of pools of clones from cDNA libraries and the identification of gene-specific restriction fingerprints in the resulting complex fragment mixtures. A specific cDNA cloning vector had to be constructed to address missing or incomplete cDNA inserts, which would generate misleading fingerprints in standard cloning vectors. Double-stranded cDNA was synthesized using an anchored oligo dT primer and unidirectionally inserted into the DADA vector, and cDNA libraries were constructed in E. coli. The cDNA fingerprints were generated in a PCR-free procedure that allows for parallel plasmid preparation, labeling, restriction digest and fragment separation of pools of 96 colonies each. This multiplexing significantly enhanced the throughput in comparison to sequence-based methods (e.g. the EST approach). The data of the fragment mixtures were integrated into a relational database system and queried with fingerprints experimentally produced by analyzing single colonies. Due to the limited predictability of the position of DNA fragments of a given size on the polyacrylamide gels, fingerprints derived solely from cDNA sequences were not accurate enough to be used for the analysis. We applied DADA to the analysis of gene expression profiles in a model for impaired wound healing (treatment of mice with dexamethasone).
Conclusions
The method proved to be capable of identifying pharmacologically relevant target genes that had not been identified by other standard methods routinely used to find differentially expressed genes. Due to the above-mentioned limited predictability of the fingerprints, the method has so far been tested only with a limited number of experimentally determined fingerprints and was able to detect differences in gene expression of transcripts representing 0.05% of the total mRNA population (e.g. medium abundant gene transcripts).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data analysis tools help companies draw insights from customer data and uncover trends and patterns to make better business decisions. There is a wide range of online data analysis tools that can be used to perform basic or more advanced data analysis. Because of the development of no-code machine learning software, advanced data analysis is now easier than ever, allowing businesses to reap the benefits of huge amounts of unstructured data.
This paper aims at pointing out the meaning of data analysis and its benefits, the types of data analysis, the data analysis tools available, and how to choose among them.
This dataset contains data tables of Global, Hemispheric, and Zonal Temperature Anomalies. Anomalies are relative to the 1951-1980 base period means. Data from NASA Goddard Institute for Space Studies. The NASA GISS Surface Temperature (GISTEMP) analysis provides a measure of the changing global surface temperature with monthly resolution for the period since 1880, when a reasonably global distribution of meteorological stations was established. Follow datasource.kapsarc.org for timely data to advance energy economics research.
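As a worked illustration of the anomaly definition used above (departure from the 1951-1980 base-period mean for each calendar month), here is a hedged Python/pandas sketch on a synthetic monthly series, not on the GISTEMP files themselves.

```python
import numpy as np
import pandas as pd

# Synthetic monthly temperature series from 1880 onward (illustrative values only).
idx = pd.date_range("1880-01-01", "2020-12-01", freq="MS")
temps = pd.Series(np.random.default_rng(0).normal(14.0, 0.5, len(idx)), index=idx)

# Anomaly = observation minus the 1951-1980 mean for that calendar month.
anomalies = temps.groupby(temps.index.month).transform(
    lambda month: month - month["1951":"1980"].mean()
)
print(anomalies.tail())
```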
Probabilistic association discovery aims at identifying the association between random vectors, regardless of the number of variables involved or of linear/nonlinear functional forms. Recently, applications in high-dimensional data have generated rising interest in probabilistic association discovery. We developed a framework based on functions on the observation graph, named MeDiA (Mean Distance Association). We generalize its property to a group of functions on the observation graph. The group of functions encapsulates major existing methods in association discovery, e.g. mutual information and Brownian covariance, and can be expanded to more complicated forms. We conducted a numerical comparison of the statistical power of related methods under multiple scenarios. We further demonstrated the application of MeDiA as a method of gene set analysis that captures a broader range of responses than traditional gene set analysis methods.
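The abstract notes that this family of statistics includes Brownian covariance (distance covariance). As background, here is a minimal numpy sketch of the squared sample distance covariance for two 1-D samples; it is a textbook illustration of that particular statistic, not the authors' MeDiA implementation.

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance (Brownian covariance) of two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])   # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    # Double-centre each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x ** 2 + 0.1 * rng.normal(size=200)   # nonlinear dependence, near-zero Pearson correlation
print(dcov2(x, y), dcov2(x, rng.permutation(y)))  # dependent pair vs permuted (independent) pair
```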
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project summary: Survey and interview data were collected from relevant stakeholders to investigate the effectiveness of physical activity referral pathways. The research questions explored the views of the participants on key determinants of physical activity (PA) and physical activity referral schemes (PARS) promotion. The factors explored included participants’ knowledge, beliefs, behaviours, perceptions and recommendations about PA and PARS. The research was conducted in three stages: The first stage involved two systematic reviews that investigated the global views of patients and healthcare professionals (HCPs) regarding the promotion of PA and PARS. The findings from this stage informed the need for the second (mixed methods studies) and third (qualitative study) stages of the research, which involved in-depth investigations of the perspectives of PARS stakeholders on their experiences of the functionality of PARS within an Australian context. For these two stages of the research, participants included Australian general practitioners (GPs), exercise physiologists (EPs) and patients with chronic disease(s), aged 18 years and above. A sequential explanatory mixed methods research design that included quantitative online surveys and qualitative telephone interviews was adopted for the two mixed methods studies conducted in stage two. The first mixed methods study explored patients’ views on the efficacy of PARS programmes. The second mixed methods study investigated the perspectives of HCPs (GPs and EPs) on the coordination of care for PARS users. Descriptive statistics including frequencies, percentages, means and standard deviations were used to analyse the demographic characteristics of participants. The Shapiro-Wilk test, inspection of histograms and Q-Q plots were used to test for normality. Non-parametric statistical tests including the Mann-Whitney U and Kruskal-Wallis tests were used to compare the relationships between variables. The data were presented as frequencies and means ± SD, with an alpha value of 0.05. Framework analysis was employed for the synthesis of the stage two qualitative data. To increase the credibility and validity of the findings in stage two, the results from both strands of each of the two mixed methods studies were triangulated. In stage three, a qualitative pluralistic evaluation approach was utilised to explore and synthesise the recommendations of all stakeholders (GPs, EPs and patients) on how to enhance the effectiveness of the PARS programme.
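For orientation, the sequence of tests described above (normality check followed by non-parametric group comparisons) can be reproduced on made-up numbers with scipy; this is an illustrative sketch only and does not use the survey data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
gp_scores = rng.integers(1, 6, size=40)   # hypothetical Likert-style responses from GPs
ep_scores = rng.integers(1, 6, size=35)   # hypothetical responses from EPs

print(stats.shapiro(gp_scores))                    # Shapiro-Wilk test of normality
print(stats.mannwhitneyu(gp_scores, ep_scores))    # Mann-Whitney U: two independent groups
print(stats.kruskal(gp_scores[:13], gp_scores[13:26], gp_scores[26:]))  # Kruskal-Wallis: 3+ groups
```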
This dataset consists of the survey data for general practitioners (GPs) and exercise physiologists (EPs).
Software/equipment used to create/collect the data: Survey data was analysed using SPSS version 27.0 (IBM Inc, Chicago IL).
Variable labels and data coding are explained in the variable view of the attached SPSS file and in the Codebook (PDF) provided.
The full methodology is available in the Open Access publication (PLoS) from the Related publications link below.
The systematic reviews and other publications relating to the patient surveys are also available from the links provided.
The U.S. Geological Survey (USGS) maintains shoreline positions for the United States coasts from both older sources, such as aerial photos or topographic surveys, and contemporary sources such as lidar point clouds and digital elevation models (DEMs). These shorelines are compiled and analyzed in the Digital Shoreline Analysis System (DSAS) software to compute rates of change. It is useful to keep a record of historical shoreline positions as a method of monitoring change over time to identify areas most susceptible to erosion or accretion. These data can help coastal managers understand which areas of the coast are vulnerable. This data release and other associated products represent an expansion of the USGS national-scale shoreline database to include Puerto Rico and its islands, Vieques and Culebra. The United States Geological Survey (USGS), in cooperation with the Coastal Research and Planning Institute of Puerto Rico (CoRePI, part of the Graduate School of Planning at the University of Puerto Rico, Rio Piedras Campus), has derived and compiled a database of historical shoreline positions using a variety of methods. These shorelines are used to measure the rate of shoreline change over time.
Title: Face-to-face Peer Dialogue: Students Talking about Feedback (submitted March 2021)
A short description of the study set-up: 35 second-year university students were split into 12 groups. Students wrote a scientific report and gave written peer feedback. This was followed by face-to-face peer dialogue on the feedback without teacher facilitation. Dialogues were coded and analysed at the utterance level.
Analysis
For data analysis, we used the coding scheme by Visschers-Pleijers et al. (2006), which focuses on the analysis of verbal interactions in tutorial groups. To assess the dialogue, the verbal interactions in the discourses were scored at the utterance level as ‘Learning-oriented interaction’, ‘Procedural interaction’ or ‘Irrelevant interaction’ (Visschers-Pleijers et al. 2006). The learning-oriented interactions were further subdivided into five subcategories: Opening statement, Question (open, critical or verification question), Cumulative reasoning (elaboration, offering suggestion, confirmation or intention to improve), Disagreement (counter argument, doubt, disagreement or no intention to improve) and Lessons learned (an adapted version of the coding scheme used by Visschers-Pleijers et al. 2006). The first and second authors and a research assistant coded the first four transcripts and discussed their codes in three rounds until they reached consensus. See Appendix A for a description of the coding scheme. After reaching consensus on the coding, the first author and the research assistant individually coded four new transcripts. For these four transcripts, interrater reliability analysis was performed using percent agreement according to Gisev, Bell, and Chen (2013). The percent agreement between the first author and the research assistant ranged from 80 to 92. The first author then coded the remaining eight transcripts individually. Eventually, all transcripts were analysed according to the first author’s classification. For each single group session, the codes for each (sub)category of verbal interaction were counted and percentages were calculated for the number of utterances. The median (Mdn) and interquartile range (IQR) of the percentage of utterances for each (sub)category of code were computed per coding category for all groups together.
Explanation of all the instruments used in the data collection (including phrasing of items in surveys): This was a discourse analysis (see final coding scheme: separate file).
Explanation of the data files: what data is stored in what file?
• Final coding scheme (in Word).
• Audiotapes (in MP3) and transcripts of 12 groups (in Word).
• Data study 4 (in Excel).
• Resulting data in table (in Word).
In case of quantitative data: meaning and ranges or codings of all columns:
• Data study 4 (in Excel): numbers and percentages of interactions.
• Resulting data (Table in Word): per group (n=12) in percentages and medians.
In case of qualitative data: description of the structure of the data files: Not applicable
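The percent-agreement figure reported above is simply the proportion of matching codes between the two coders. A minimal sketch with invented codes (not the study's transcripts):

```python
# Invented utterance codes assigned by two coders to the same eight utterances.
coder_1 = ["CR", "Q", "CR", "D", "LL", "PI", "CR", "Q"]
coder_2 = ["CR", "Q", "D",  "D", "LL", "PI", "CR", "OS"]

matches = sum(a == b for a, b in zip(coder_1, coder_2))
percent_agreement = 100 * matches / len(coder_1)
print(f"Percent agreement: {percent_agreement:.0f}%")   # 75% in this invented example
```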