Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of data measured on different scales is a relevant challenge. Biomedical studies often focus on high-throughput datasets of, e.g., quantitative measurements. However, the need for integration of other features possibly measured on different scales, e.g. clinical or cytogenetic factors, becomes increasingly important. The analysis results (e.g. a selection of relevant genes) are then visualized, while adding further information, like clinical factors, on top. However, a more integrative approach is desirable, where all available data are analyzed jointly, and where also in the visualization different data sources are combined in a more natural way. Here we specifically target integrative visualization and present a heatmap-style graphic display. To this end, we develop and explore methods for clustering mixed-type data, with special focus on clustering variables. Clustering of variables does not receive as much attention in the literature as does clustering of samples. We extend the variables clustering methodology by two new approaches, one based on the combination of different association measures and the other on distance correlation. With simulation studies we evaluate and compare different clustering strategies. Applying specific methods for mixed-type data proves to be comparable and in many cases beneficial as compared to standard approaches applied to corresponding quantitative or binarized data. Our two novel approaches for mixed-type variables show similar or better performance than the existing methods ClustOfVar and bias-corrected mutual information. Further, in contrast to ClustOfVar, our methods provide dissimilarity matrices, which is an advantage, especially for the purpose of visualization. Real data examples aim to give an impression of various kinds of potential applications for the integrative heatmap and other graphical displays based on dissimilarity matrices. 
We demonstrate that the presented integrative heatmap provides more information than common data displays about the relationship among variables and samples. The described clustering and visualization methods are implemented in our R package CluMix available from https://cran.r-project.org/web/packages/CluMix.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 2.07 (USD Billion) |
MARKET SIZE 2024 | 2.17 (USD Billion) |
MARKET SIZE 2032 | 3.2 (USD Billion) |
SEGMENTS COVERED | Deployment Type, Organization Size, Industry Vertical, Data Type, Analysis Type, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Cloud Deployment, Machine Learning Integration, Big Data Analytics, Predictive Analytics, Prescriptive Analytics |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | KNIME, DAX Analytics, Minitab, Alteryx, MVSP, XLSTAT, RapidMiner, Statistica, IBM, TIBCO Software, SPSS, SAS Institute, Oracle, JMP |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | Healthcare analytics, Financial risk assessment, Customer segmentation, Fraud detection, Anomaly detection |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 4.99% (2025 - 2032) |
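The growth rate in the table above can be sanity-checked from the two market-size endpoints. The sketch below assumes the standard compound-growth formula and treats the 2024 figure as the start value and the 2032 figure as the end value (eight compounding periods). The report does not state how its CAGR was derived, so a small difference from rounding of the published figures is to be expected.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Market sizes from the table above (USD billion).
# Assumption: 2024 is the start year and 2032 the end year, i.e. 8 periods.
implied = cagr(2.17, 3.2, 2032 - 2024)
print(f"implied CAGR: {implied:.2%}")
```

The same function can be applied to the other market tables in this listing by substituting their 2024 and 2032 values.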
In 2022, online surveys were by far the most used traditional quantitative methodology in the market research industry worldwide. During the survey, 85 percent of respondents stated that they regularly used online surveys as one of their three most used methods. Moreover, nine percent of respondents stated that they used online surveys only occasionally.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data supporting the Springer Nature Data Availability Statement (DAS) analysis in the State of Open Data 2024. SOOD_2024_special_analysis_DAS_SN.xlsx contains the DAS, DOI, publication date, DAS categories, and related country by institution of any author. SOOD 2024_DAS_analysis_sharing.xlsx contains the summary data by country and data sharing type.
Utilizing the Dimensions database, we identified articles containing key DAS identifiers such as “Data Availability Statement” or “Availability of Data and Materials” within their full text. Digital Object Identifiers (DOIs) of these articles were collected and matched against Springer Nature’s XML database to extract the DAS for each article. The extracted DAS were categorized into specific sharing types using text and data matching terms. For statements indicating that data are publicly available in a repository, we matched against a predefined list of repository identifiers, names, and URLs. The DAS were classified into the following categories:
1. Data are available from the author on request.
2. Data are included in the manuscript or its supplementary material.
3. Some or all of the data are publicly available, for example in a repository.
4. Figure source data are included with the manuscript.
5. Data availability is not applicable.
6. Data are declared as not available by the author.
7. Data available online but not in a repository.
These categories are non-exclusive: more than one can apply to any one article. Publications outside the 2019–2023 range and non-article publication types (e.g., book chapters) that were initially included in the Dimensions search results were excluded from the final dataset. Articles were included in the final analysis after applying the exclusion criteria. Upon processing, it was found that only 370 results were returned for Botswana across the five-year period; due to this low number, Botswana was not included in the DAS-focused country-level analysis.
This analysis does not assess the accuracy of the DAS in the context of each individual article. There was no manual verification of the categories applied; as a result, terms used out of context could have led to misclassification. Approximately 5% of articles remained unclassified following text and data matching due to these limitations.
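The kind of non-exclusive text matching described above can be sketched as follows. This is an illustration only: the category keys and the regular expressions are hypothetical, since the matching terms actually used in the analysis are not given in the description, and only four of the seven categories are covered.

```python
import re

# Hypothetical matching terms, chosen for illustration only; the terms
# used in the actual analysis are not listed in the dataset description.
CATEGORY_PATTERNS = {
    "on_request": r"\b(?:on|upon)\s+(?:reasonable\s+)?request\b",
    "in_manuscript": r"supplementary|included in (?:the|this) (?:article|manuscript|paper)",
    "repository": r"\b(?:zenodo|figshare|dryad|repository)\b",
    "not_applicable": r"\bnot applicable\b",
}


def classify_das(statement):
    """Return the set of categories whose pattern matches the statement.

    Categories are non-exclusive, so several may apply to one statement.
    An empty set means the statement stays unclassified.
    """
    text = statement.lower()
    return {
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if re.search(pattern, text)
    }


example = (
    "The datasets are available in the Zenodo repository; additional data "
    "are available from the corresponding author on reasonable request."
)
print(sorted(classify_das(example)))
```

Because nothing is verified manually, a term used out of context (for example "repository" in a sentence stating that no repository was used) would still match, which is the misclassification risk noted above.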
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The primary data types collected.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 0.06 (USD Billion) |
MARKET SIZE 2024 | 0.08 (USD Billion) |
MARKET SIZE 2032 | 0.34 (USD Billion) |
SEGMENTS COVERED | Modality, Animal Type, Application, Automated Features, Output Type, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Rising pet ownership, Technological advancements, Increasing focus on animal welfare, Growing demand for remote monitoring, Veterinary industry expansion |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | eScription, OptiTrack, VICON, BTS Bioengineering, Genovation, ZEBRIS, Xsens, SMART, Gait Up, Qualisys, Motion Analysis Corporation, Noraxon, IMV imaging, Phoenix Controls, VASG |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | Veterinary diagnostics, Precision animal farming, Animal health monitoring, Livestock management, Disease prevention |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 20.53% (2025 - 2032) |
https://dataintelo.com/privacy-and-policy
In 2023, the global qualitative data analysis software market size was valued at approximately USD 1.2 billion. With an impressive compound annual growth rate (CAGR) of 15%, the market is projected to reach USD 3.3 billion by 2032. This growth is driven by an increasing demand for data-driven decision-making processes across various industries, as well as advancements in artificial intelligence and machine learning technologies that are enhancing the capabilities of qualitative data analysis tools. Organizations increasingly recognize the value of qualitative insights, which complement quantitative data with a deeper, context-rich understanding of phenomena; this recognition is a significant growth factor in the market.
The demand for qualitative data analysis software is expanding due to the growing need for holistic research methods that incorporate diverse data types. In academic research, qualitative data analysis plays a critical role in understanding complex social phenomena by analyzing text, audio, video, and images. The rise of interdisciplinary studies that demand robust qualitative analysis solutions is propelling software adoption. Additionally, the business and enterprise sector has increasingly leveraged these tools to extract consumer insights from unstructured data sources like social media, reviews, and customer feedback. These insights are crucial for developing marketing strategies and enhancing customer engagement, thus driving market growth.
Healthcare is another sector significantly contributing to the market's expansion. Qualitative data analysis is crucial for understanding patient narratives and improving patient-centered care models. With the shift towards personalized medicine, healthcare providers are utilizing qualitative insights to better comprehend patient experiences and treatment outcomes. Moreover, the integration of qualitative data analysis tools with other healthcare systems is enhancing clinical research and operational efficiency. The continuous development in healthcare analytics and the increasing volume of healthcare data are expected to further boost demand in this sector.
Government and public sector organizations are also adopting qualitative data analysis software to improve policy formulation and public services. By analyzing feedback from citizens and stakeholders, governments can make informed decisions that address public needs more effectively. The growing emphasis on transparency and accountability in governance is driving the adoption of these tools. Additionally, the ongoing digital transformation across public sectors globally is facilitating the integration of advanced data analysis tools in government operations, thus contributing to the market's growth.
Regionally, North America dominates the market due to its advanced technological infrastructure and high adoption rate of data-driven decision-making processes across various sectors. Europe follows, with a strong presence of academic research institutions and enterprises investing in qualitative data analysis tools. The Asia Pacific region is expected to witness the fastest growth, driven by rapid digitalization and increasing research activities in countries like China, India, and Japan. Latin America and the Middle East & Africa regions are also beginning to explore the potential of qualitative data analysis, although they currently constitute a smaller portion of the market.
The qualitative data analysis software market is segmented by component into software and services. The software segment is the backbone of the market, offering a variety of tools that allow users to code, categorize, and analyze qualitative data. The demand for sophisticated software solutions is rising as organizations seek tools that offer enhanced features such as data visualization, collaboration capabilities, and integration with other data sources. The push towards comprehensive data analysis platforms that can manage large datasets and provide intuitive interfaces is driving innovation in software development. Furthermore, the integration of artificial intelligence into these software solutions is significantly enhancing their capabilities, making them more efficient and reducing the time required for data analysis.
In contrast, the services segment encompasses a range of offerings including consulting, implementation, training, and support services. As organizations increasingly adopt sophisticated qualitative data analysis tools, there is a growing need for professional services to ensure
Vision and Change in Undergraduate Biology Education encouraged faculty to focus on core concepts and competencies in the undergraduate curriculum. We created a sophomore-level course, Biologists' Toolkit, to focus on the competencies of quantitative reasoning and scientific communication. In the first two-thirds of the course, we introduce students to the statistical analysis of data using the open source statistical language and environment, R and RStudio. During this time the students learn to write basic computer commands to input data and conduct common statistical analyses. The students also learn to graphically represent their data using R. In a final project, we assign students unique data sets that require them to develop a hypothesis that can be explored with the data, analyze and graph the data, search literature related to their data set, and write a report that emulates a scientific paper. The final report includes publication-quality graphs and proper reporting of data and statistical results. At the end of the course students reported greater confidence in their ability to read and make graphs, analyze data, and develop hypotheses. Although programming in R has a steep learning curve, we found that students who learned programming in R developed a robust strategy for data analyses, and they retained and successfully applied those skills in other courses during their junior and senior years.
The FDA Device Dataset by Dataplex provides comprehensive access to over 24 million rows of detailed information, covering 9 key data types essential for anyone involved in the medical device industry. Sourced directly from the U.S. Food and Drug Administration (FDA), this dataset is a critical resource for regulatory compliance, market analysis, and product safety assessment.
Dataset Overview:
This dataset includes data on medical device registrations, approvals, recalls, and adverse events, among other crucial aspects. The dataset is meticulously cleaned and structured to ensure that it meets the needs of researchers, regulatory professionals, and market analysts.
24 Million Rows of Data:
With over 24 million rows, this dataset offers an extensive view of the regulatory landscape for medical devices. It includes data types such as classification, event, enforcement, 510k, registration listings, recall, PMA, UDI, and covid19 serology. This wide range of data types allows users to perform granular analysis on a broad spectrum of device-related topics.
Sourced from the FDA:
All data in this dataset is sourced directly from the FDA, ensuring that it is accurate, up-to-date, and reliable. Regular updates ensure that the dataset remains current, reflecting the latest in device approvals, clearances, and safety reports.
Key Features:
Comprehensive Coverage: Includes 9 key device data types, such as 510(k) clearances, premarket approvals, device classifications, and adverse event reports.
Regulatory Compliance: Provides detailed information necessary for tracking compliance with FDA regulations, including device recalls and enforcement actions.
Market Analysis: Analysts can utilize the dataset to assess market trends, monitor competitor activities, and track the introduction of new devices.
Product Safety Analysis: Researchers can analyze adverse event reports and device recalls to evaluate the safety and performance of medical devices.
Use Cases:
Regulatory Compliance: Ensure your devices meet FDA standards, monitor compliance trends, and stay informed about regulatory changes.
Market Research: Identify trends in the medical device market, track new device approvals, and analyze competitive landscapes with up-to-date and historical data.
Product Safety: Assess the safety and performance of medical devices by examining detailed adverse event reports and recall data.
Data Quality and Reliability:
The FDA Device Dataset prioritizes data quality and reliability. Each record is meticulously sourced from the FDA's official databases, ensuring that the information is both accurate and up-to-date. This makes the dataset a trusted resource for critical applications, where data accuracy is vital.
Integration and Usability:
The dataset is provided in CSV format, making it compatible with most data analysis tools and platforms. Users can easily import, analyze, and utilize the data for various applications, from regulatory reporting to market analysis.
User-Friendly Structure and Metadata:
The data is organized for easy navigation, with clear metadata files included to help users identify relevant records. The dataset is structured by device type, approval and clearance processes, and adverse event reports, allowing for efficient data retrieval and analysis.
Ideal For:
Regulatory Professionals: Monitor FDA compliance, track regulatory changes, and prepare for audits with comprehensive and up-to-date product data.
Market Analysts: Conduct detailed research on market trends, assess new device entries, and analyze competitive dynamics with extensive FDA data.
Healthcare Researchers: Evaluate the safety and efficacy of medical devices using product data, identify potential risks, and contribute to improved patient outcomes through detailed analysis.
This dataset is an indispensable resource for anyone involved in the medical device industry, providing the data and insights necessary to drive informed decisions and ensure compliance with FDA regulations.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
With the recent attention and focus on quantitative methods for species delimitation, an overlooked but equally important issue regards what has actually been delimited. This study investigates the apparent arbitrariness of some taxonomic distinctions, and in particular how species and subspecies are assigned. Specifically, we use a recently developed Bayesian model-based approach to show that in the Hercules beetles (genus Dynastes) there is no statistical difference in the probability that putative taxa represent different species, irrespective of whether they were given species or subspecies designations. By considering multiple data types, as opposed to relying exclusively on genetic data, we also show that both previously recognized species and subspecies represent a variety of points along the speciation spectrum (i.e., previously recognized species are not systematically further along the continuum than subspecies). For example, based on evolutionary models of divergence, some taxa are statistically distinguishable on more than one axis of differentiation (e.g., along both phenotypic and genetic dimensions), whereas other taxa can only be delimited statistically from a single data type. Because both phenotypic and genetic data are analyzed in a common Bayesian framework, our study provides a framework for investigating whether disagreements in species boundaries among data types reflect (i) actual discordance with the history of lineage splitting, or instead (ii) differences among data types in the amount of time required for differentiation to become apparent among the delimited taxa. We discuss what the answers to these questions imply about what characters are used to delimit species, as well as the diverse processes involved in the origin and maintenance of species boundaries. With this in mind, we then reflect more generally on how quantitative methods for species delimitation are used to assign taxonomic status.
https://creativecommons.org/publicdomain/zero/1.0/
Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. Independent variables were derived from data originally obtained from US Geological Survey (USGS) and USFS data. Data is in raw form (not scaled) and contains binary (0 or 1) columns of data for qualitative independent variables (wilderness areas and soil types).
This study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so that existing forest cover types are more a result of ecological processes rather than forest management practices.
Some background information for these four wilderness areas: Neota (area 2) probably has the highest mean elevational value of the 4 wilderness areas. Rawah (area 1) and Comanche Peak (area 3) would have a lower mean elevational value, while Cache la Poudre (area 4) would have the lowest mean elevational value.
As for primary major tree species in these areas, Neota would have spruce/fir (type 1), while Rawah and Comanche Peak would probably have lodgepole pine (type 2) as their primary species, followed by spruce/fir and aspen (type 5). Cache la Poudre would tend to have Ponderosa pine (type 3), Douglas-fir (type 6), and cottonwood/willow (type 4).
The Rawah and Comanche Peak areas would tend to be more typical of the overall dataset than either Neota or Cache la Poudre, due to their assortment of tree species and range of predictive variable values (elevation, etc.). Cache la Poudre would probably be more distinctive than the others, due to its relatively low elevation range and species composition.
Given is the attribute name, attribute type, the measurement unit and a brief description. The forest cover type is the classification problem. The order of this listing corresponds to the order of numerals along the rows of the database.
Name / Data Type / Measurement / Description
Elevation / quantitative / meters / Elevation in meters
Aspect / quantitative / azimuth / Aspect in degrees azimuth
Slope / quantitative / degrees / Slope in degrees
Horizontal_Distance_To_Hydrology / quantitative / meters / Horz Dist to nearest surface water features
Vertical_Distance_To_Hydrology / quantitative / meters / Vert Dist to nearest surface water features
Horizontal_Distance_To_Roadways / quantitative / meters / Horz Dist to nearest roadway
Hillshade_9am / quantitative / 0 to 255 index / Hillshade index at 9am, summer solstice
Hillshade_Noon / quantitative / 0 to 255 index / Hillshade index at noon, summer solstice
Hillshade_3pm / quantitative / 0 to 255 index / Hillshade index at 3pm, summer solstice
Horizontal_Distance_To_Fire_Points / quantitative / meters / Horz Dist to nearest wildfire ignition points
Wilderness_Area (4 binary columns) / qualitative / 0 (absence) or 1 (presence) / Wilderness area designation
Soil_Type (40 binary columns) / qualitative / 0 (absence) or 1 (presence) / Soil Type designation
Cover_Type (7 types) / integer / 1 to 7 / Forest Cover Type designation
Class Labels
Spruce/Fir, Lodgepole Pine, Ponderosa Pine, Cottonwood/Willow, Aspen, Douglas-fir, Krummholz
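Because the qualitative variables are stored as binary columns, a common first step is to collapse them back into single labels. The sketch below assumes each row has 55 values in the order of the attribute listing (10 quantitative, 4 wilderness, 40 soil, 1 cover type) and that cover type codes 1 to 7 follow the order of the class labels above. The helper names and the sample row are made up for illustration.

```python
QUANTITATIVE = [
    "Elevation", "Aspect", "Slope",
    "Horizontal_Distance_To_Hydrology", "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways",
    "Hillshade_9am", "Hillshade_Noon", "Hillshade_3pm",
    "Horizontal_Distance_To_Fire_Points",
]
# Area numbering as given in the background text (areas 1 to 4).
WILDERNESS = ["Rawah", "Neota", "Comanche Peak", "Cache la Poudre"]
# Assumption: codes 1 to 7 follow the order of the class label list.
COVER_TYPES = [
    "Spruce/Fir", "Lodgepole Pine", "Ponderosa Pine", "Cottonwood/Willow",
    "Aspen", "Douglas-fir", "Krummholz",
]


def one_hot_index(flags):
    """Return the position of the single 1 in a list of 0/1 flags."""
    if sum(flags) != 1:
        raise ValueError("expected exactly one flag set to 1")
    return list(flags).index(1)


def decode_row(values):
    """Turn one raw 55-value row into a dict with readable labels."""
    if len(values) != 55:
        raise ValueError("expected 55 values per row")
    row = dict(zip(QUANTITATIVE, values[:10]))
    row["Wilderness_Area"] = WILDERNESS[one_hot_index(values[10:14])]
    row["Soil_Type"] = one_hot_index(values[14:54]) + 1  # soil types 1 to 40
    row["Cover_Type"] = COVER_TYPES[values[54] - 1]
    return row


# Made-up sample row: wilderness area 1, soil type 29, cover type 5.
soil_flags = [0] * 40
soil_flags[28] = 1
sample = [2600, 50, 3, 250, 0, 500, 220, 230, 150, 6000] + [1, 0, 0, 0] + soil_flags + [5]
decoded = decode_row(sample)
print(decoded["Wilderness_Area"], decoded["Soil_Type"], decoded["Cover_Type"])
```

Keeping the quantitative values unscaled, as in the raw data, leaves the choice of normalization to whatever model is trained afterwards.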
The NOAA NEXRAD Quantitative Precipitation Estimation (QPE) Climate Data Record (CDR) is created from the Radar Multi-Radar/Multi-Sensor (MRMS) Reanalysis, which produces severe weather and precipitation products to improve decision-making for severe weather forecasts and warnings, hydrology, aviation, and numerical weather prediction. The data cover the period from 2002-01-01 to 2011-12-31. NOAA's NEXRAD reanalysis consists of two primary components: (1) severe weather and radar-reflectivity data generation, and (2) the Quantitative Precipitation Estimate (including associated precipitation variables and merged rain gauge and radar estimation). This document focuses on the second component, the Quantitative Precipitation Estimate (QPE). The primary files generated within this data set are radar-only and radar-gauge (ROQPE, GCQPE, and MOS2D) merged precipitation products as well as ancillary information on precipitation type (PRATE and PFLAG) and radar quality (RQIND). The initial data set covers the period from January 2002 to December 2011. Radar-only reflectivity, gauge, precipitation flag, and radar quality index data are provided at 5-minute intervals on a 1 km regular grid over CONUS. Radar-only and radar-gauge quantitative precipitation estimates are provided at hourly scale on a 1 km regular grid over CONUS. MRMS QPE uses the most advanced radar technologies and provides high-resolution information about precipitation types and amounts for the nation. The data are stored in netCDF version 4.0 files that include the necessary metadata and supplementary data fields. The data set provides information that can be useful for identifying various types of precipitation, estimating radar reflectivity, recognizing storm patterns, developing forecasting technologies for rainfall estimation, and associating different phases of precipitation, such as hail, freezing rain, and snow, with radar observations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Transparency in data visualization is an essential ingredient for scientific communication. The traditional approach of visualizing continuous quantitative data solely in the form of summary statistics (i.e., measures of central tendency and dispersion) has repeatedly been criticized for not revealing the underlying raw data distribution. Remarkably, however, systematic and easy-to-use solutions for raw data visualization using the most commonly reported statistical software package for data analysis, IBM SPSS Statistics, are missing. Here, a comprehensive collection of more than 100 SPSS syntax files and an SPSS dataset template is presented and made freely available that allow the creation of transparent graphs for one-sample designs, for one- and two-factorial between-subject designs, for selected one- and two-factorial within-subject designs as well as for selected two-factorial mixed designs and, with some creativity, even beyond (e.g., three-factorial mixed designs). Depending on graph type (e.g., pure dot plot, box plot, and line plot), raw data can be displayed along with standard measures of central tendency (arithmetic mean and median) and dispersion (95% CI and SD). The free-to-use syntax can also be modified to match individual needs. A variety of example applications of the syntax are illustrated in a tutorial-like fashion along with fictitious datasets accompanying this contribution. The syntax collection is hoped to provide researchers, students, teachers, and others working with SPSS with a valuable tool to move towards more transparency in data visualization.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 4.06 (USD Billion) |
MARKET SIZE 2024 | 4.38 (USD Billion) |
MARKET SIZE 2032 | 8.0 (USD Billion) |
SEGMENTS COVERED | Application, Deployment Mode, End User, Data Type, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Growing demand for real-time data, Increasing adoption of cloud technology, Focus on regulatory compliance, Rising need for data accuracy, Expansion of clinical trials globally |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Clinical Ink, Castor EDC, Veeva Systems, Oracle Corporation, Parexel International, Qualcomm Life, MediData, RealTime Software Solutions, Medidata Solutions, CRF Health, Medrio, Datacision, PRA Health Sciences, IBM Corporation |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | Rising demand for clinical trials, Adoption of mobile EDC solutions, Increased focus on patient-centric data, Growth in regulatory compliance needs, Integration with artificial intelligence technologies |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 7.84% (2025 - 2032) |
atWeb, 1.6
Data are structured around:
- Symbolic concepts including studied objects and qualitative data: material type, molecule type, etc.
- Quantity concepts including quantitative data: temperature, packaging thickness, relative humidity, etc.
- Relation concepts describing relationships between studied objects, qualitative and quantitative data: O2 permeability relation, volatil solubility Young's modulus relation, etc.
The ontology is used by the @Web tool.
Quantitative data from community observations are stored and managed using SPSS social survey software. The sampling unit used is a harvest event, typically a hunting or fishing event in a particular season. As of 5 September, 2008 we have received and encoded data for 56 harvest events as follows:
Harvest type: Mammal (10), Fish (45), Shellfish (1)
Community: Gambell (10), Kanchalan (22), Nikolskoye (6), Sandpoint (18)
Preliminary SPSS Data structure:
Name | Label | Type | Width
ID | Respondent's Identification Number | String | 10
INTERNO | Interview Number | String | 2
DATE | Date On Which the Interview Took Place | Date | 8
SEX | Gender | Numeric | 1
YEARBO | Year of Birth | Numeric | 11
VILLAGE | Village Where Respndent Resides | String | 6
LOCATI | Respondent Resides in Russia or Alaska | Numeric | 8
LIVED | How Long Respondent Lived in the Area | String | 100
LANGUAG | Language in Which Interiew Conducted | Numeric | 7
HARVEST | Level of Harvester | Numeric | 4
YEARHU | How Many Years Respondent Has Hunted/Fished in the Area | Numeric | 8
EMPLOY | Is the Respondent Employed in a Non-Harvesting Field | Numeric | 3
TIMEWOR | Time Per Week/Month Is Spent in Non-Harvest Work | Numeric | 8
YEARWOR | How Many Years Spent in Non-Harvest Work CATEGORIES | Numeric | 8
Q1FISHM | Is Respondent Hunting Fish or Mammals On Next Trip | Numeric | 4
SPECIES | Species of Fish/Mammal Being Hunted/Fished | Numeric | 8
Q2RECA | Does Respondent Recall When Last Hunt/Fish Trip Occurre | Numeric | 3
Q2WHEN | Date of Last Hunt/Fish Trip | String | 50
Q2AAGO | How Long Ago Was Last Hunt/Fish Trip | Numeric | 16
Q3FAR | How Far Respondent Travelled On Last Hunt/Fish Trip | Numeric |
Q4OFTEN | How Often Respondent Hunted/Fished in the Location of Last Trip | Numeric | 6
Q5AGE | Age When Respondent First Went to Location of Last Trip | Numeric | 18
Q6PROX | Prefers Loc. of Last Trip Due to Proximity to Village | Numeric | 11
Q6ACCES | Prefers Location of Last Trip Due to Ease of Access | Numeric | 11
Q6CATCH | Prefers Location of Last Trip Due to Ease of Catching | Numeric | 11
Q6OTHER | Prefers Location of Last Trip Due to Some Other Reason | Numeric | 11
Q6SPECI | Other Reason Prefers Locatin of Last Trip | String | 200
Q6DONT | Respondent Does Not Like Location of Last Trip | Numeric | 11
Q7RELY | Is Location of Last Trip Reliable for Fishing/Hunting | Numeric | 3
Q8NOTIC | In Previous 5-10 Years Has Respondent Noticed Changes at Last Hunt/Fish Location | Numeric | 3
Q9OTHER | Do Others From the Village Also Hunt/Fish at Location of Last Trip | Numeric | 3
Q10GETA | On Last Trip, Was it Easier or More Difficult to Get to Location | Numeric | 3
Q10GETR | On Last Trip Did Respondent Encounter Difficulties Getting to Hunt/Fish Location | Numeric | 8
Q10ATRA | More Difficult to Get to Location of Last Trip Due to Lack of Transportation | Numeric | 11
Q10AROA | More Difficult to Get to Location of Last Trip Due to Poor Road Conditions | Numeric | 11
Q10AENV | More Difficult to Get to Location of Last Trip Due to Poor Environ Conditions | Numeric | 11
Q10AECO | More Diff. to Get to Location of Last Trip Due to Economics | Numeric | 11
Q10AHEA | More Difficult to Get to Location of Last Trip Due to Personal Health Condition | Numeric | 11
Q10AOTHE | More Difficult to Get to Location of Last Trip Due to Other Reasons | Numeric | 23
Q11TRAD | Last Harvest Used for Traditional/Personal Use | Numeric | 11
Q11CASH | Last Harvest Used for Generating Cash or Bartering | Numeric | 11
Q11REC | Last Harvest Used for Recreational Hunting/Fishing | Numeric | 11
Q11COM | Last Harvest Used for Commercial or Business Activity | Numeric | 11
Q11DOG | Last Harvest Used for Feeding Dogs | Numeric | 11
Q11SHAR | Last Harvest Used for Sharing with Friends/Family | Numeric | 11
Q11OTHE | Last Harvest Used for Something Else | Numeric | 20
Q12QUAN | Quantity of XXX Caught on Last Hunt/Fish Trip | Numeric | 21
This is the third lab in an Introductory Physical Geography/Environmental Studies course. It introduces students to different data types (qualitative vs. quantitative), basic statistical analyses (correlation analysis, t-test), and graphing techniques.
We established a new protocol for negative immunomagnetic isolation of murine primary type II alveolar epithelial cells (AEC II), which yields untouched cells. AEC II were collected from mice 24 h after Aspergillus fumigatus or mock infection (9 replicates per experimental group) and analyzed by label-free quantitative proteomics.
Attribution-ShareAlike 2.0 (CC BY-SA 2.0) https://creativecommons.org/licenses/by-sa/2.0/
License information was derived automatically
The CLARISSA Cash Plus intervention represented an innovative social protection scheme for tackling social ills, including the worst forms of child labour (WFCL). A universal and unconditional ‘cash plus’ programme, it combined community mobilisation, case work, and cash transfers (CTs). It was implemented in a high-density, low-income neighbourhood in Dhaka to build individual, family, and group capacities to meet needs. This, in turn, was expected to lead to a corresponding decrease in deprivation and community-identified social issues that negatively affect wellbeing, including WFCL. Four principles underpinned the intervention: Unconditionality, Universality, Needs-centred and people-led, and Emergent and open-ended.

The intervention took place in North Gojmohol, Dhaka, over a 27-month period between October 2021 and December 2023, to test and study the impact of providing unconditional and people-led support to everyone in a community. Cash transfers were provided between January and June 2023 in monthly instalments, plus one investment transfer in September 2023. A total of 1,573 households received cash through the Upay mobile financial service. Cash was complemented by a ‘plus’ component, implemented between October 2021 and December 2023. In this component, referred to as relational needs-based community organising (NBCO), a team of 20 community mobilisers (CMs) delivered case work at the individual and family level and community mobilisation at the group level. The intervention was part of the wider CLARISSA programme, led by the Institute of Development Studies (IDS) and funded by the UK’s Foreign, Commonwealth & Development Office (FCDO).
The intervention was implemented by Terre des hommes (Tdh) in Bangladesh and evaluated in collaboration with the BRAC Institute of Governance and Development (BIGD) and researchers from the University of Bath and the Open University, UK.

The evaluation of the CLARISSA Social Protection pilot was rooted in contribution analysis that combined multiple methods over more than three years, in line with emerging best practice guidelines for mixed methods research on children, work, and wellbeing. Quantitative research included bi-monthly monitoring surveys administered by the project’s community mobilisers (CMs), including basic questions about wellbeing, perceived economic resilience, school attendance, etc. This was complemented by baseline, midline, and endline surveys, which collected information about key outcome indicators within the sphere of influence of the intervention, such as children’s engagement with different forms of work and working conditions, with schooling and other activities, household living conditions and sources of income, and respondents’ perceptions of change. Qualitative tools were used to probe topics and results of interest, as well as impact pathways. These included reflective diaries written by the community mobilisers; three rounds of focus group discussions (FGDs) with community members; three rounds of key informant interviews (KIIs) with members of case study households; and long-term ethnographic observation.

Quantitative Data

The quantitative evaluation of the CLARISSA Cash Plus intervention involved several data collection methods to gather information about household living standards, children’s education and work, and social dynamics. The data collection included a pre-intervention census, four periodic surveys, and 13 rounds of bi-monthly monitoring surveys, all conducted between late 2020 and late 2023.
Details of each instrument are as follows:

· Census: Conducted in October/November 2020 in the target neighbourhood of North Gojmohol (n=1,832) and the comparison neighbourhood of Balurmath (n=2,365)
· Periodic surveys: Baseline (February 2021, n=752 in North Gojmohol), Midline 1 (before cash) (October 2022, n=771 in North Gojmohol), Midline 2 (after 6 rounds of cash) (July 2023, n=769 in North Gojmohol), and Endline (December 2023, n=750 in North Gojmohol and n=773 in Balurmath)
· Bi-monthly monitoring data (13 rounds): Conducted between December 2021 and December 2023 in North Gojmohol (average of 1,400 households per round)

The present repository summarizes this information, organized as follows:

· 1.1 Bimonthly survey (household): Panel dataset comprising 13 rounds of bi-monthly monitoring data at the household level (average of 1,400 households per round, total of 18,379 observations)
· 1.2 Bimonthly survey (child): Panel dataset comprising 13 rounds of bi-monthly monitoring data at the child level (aged 5 to 16 at census) (average of 940 children per round, total of 12,213 observations)
· 2.1 Periodic survey (household): Panel dataset comprising 5 periodic surveys (census, baseline, midline 1, midline 2, endline) at the household level (average of 750 households per period, total of 3,762 observations)
· 2.2 Periodic survey (child): Panel dataset comprising 4 periodic surveys (baseline, midline 1, midline 2, endline) at the child level (average of 3,100 children per period, total of 12,417 observations)
· 3.0 Balurmath - North Gojmohol panel: Balanced panel dataset comprising 558 households in North Gojmohol and 773 households in Balurmath, observed both at the 2020 census and the 2023 endline (total of 2,662 observations)
· 4.0 Questionnaires: Original questionnaires for all datasets

All datasets are provided in Stata format (.dta) and Excel format (.xlsx) and are accompanied by their respective dictionary in Excel format (.xlsx).

Qualitative Data

The qualitative study was conducted in three rounds: the first round of IDIs and FGDs took place between December 2022 and January 2023; the second round took place from April to May 2023; and the third round took place from November to December 2023. KIIs were conducted during the second round of the study, in May 2023.

The sample size by round and instrument type is shown below:

Round | IDIs with children | IDIs with parents | IDIs with CMs | FGDs | KIIs
1st Round (12/2022 – 01/2023) | 30 | 26 | - | 06 | -
2nd Round (04/2023 – 05/2023) | 30 | 23 | - | 06 | 05
3rd Round (11/2023 – 12/2023) | 26 | 25 | 03 | 07 | -

The files in this archive contain the qualitative data and include six types of transcripts:

· 1.1 Interviews with children in case study households (IDI): 30 families in round 1, 30 in round 2, and 26 in round 3
· 1.2 Interviews with parents in case study households (IDI): 26 families in round 1, 23 in round 2, and 25 in round 3
· 1.3 Interviews with community mobilisers (IDI): 3 CMs in round 3
· 2.0 Key informant interviews (KII): 5 in round 2
· 3.0 Focus group discussions (FGD): 6 in round 1, 6 in round 2, and 7 in round 3
· 4.0 Community mobiliser micro-narratives (556 cases)

Additionally, this repository includes a comprehensive list of all qualitative data files ("List of all qualitative data+MC.xlsx").
Definition of abbreviations: NV = numerical density, NA nucleus = nuclear profile per unit area, NA cell = cell profile per unit area, = coefficient of error of the mean estimate, N. A. = not analysed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of data measured on different scales is a relevant challenge. Biomedical studies often focus on high-throughput datasets of, e.g., quantitative measurements. However, the need for integration of other features possibly measured on different scales, e.g. clinical or cytogenetic factors, becomes increasingly important. The analysis results (e.g. a selection of relevant genes) are then visualized, while adding further information, like clinical factors, on top. However, a more integrative approach is desirable, where all available data are analyzed jointly, and where also in the visualization different data sources are combined in a more natural way. Here we specifically target integrative visualization and present a heatmap-style graphic display. To this end, we develop and explore methods for clustering mixed-type data, with special focus on clustering variables. Clustering of variables does not receive as much attention in the literature as does clustering of samples. We extend the variables clustering methodology by two new approaches, one based on the combination of different association measures and the other on distance correlation. With simulation studies we evaluate and compare different clustering strategies. Applying specific methods for mixed-type data proves to be comparable and in many cases beneficial as compared to standard approaches applied to corresponding quantitative or binarized data. Our two novel approaches for mixed-type variables show similar or better performance than the existing methods ClustOfVar and bias-corrected mutual information. Further, in contrast to ClustOfVar, our methods provide dissimilarity matrices, which is an advantage, especially for the purpose of visualization. Real data examples aim to give an impression of various kinds of potential applications for the integrative heatmap and other graphical displays based on dissimilarity matrices. 
We demonstrate that the presented integrative heatmap provides more information than common data displays about the relationship among variables and samples. The described clustering and visualization methods are implemented in our R package CluMix available from https://cran.r-project.org/web/packages/CluMix.
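As a rough illustration of the distance-correlation idea mentioned in the description above, the following Python sketch builds a variable-by-variable dissimilarity matrix for mixed-type data. It is not the CluMix implementation (CluMix is an R package) and may differ from it in detail. The function names `one_hot`, `distance_correlation`, and `dissimilarity_matrix`, the one-hot encoding of categorical variables, the choice of 1 minus the distance correlation as dissimilarity, and the toy data are all assumptions made for this example.

```python
import numpy as np


def one_hot(labels):
    """Encode a categorical variable as 0/1 indicator columns.

    Assumption of this sketch: with this encoding, any two samples in
    different categories are the same Euclidean distance apart.
    """
    labels = np.asarray(labels)
    categories = np.unique(labels)
    return (labels[:, None] == categories[None, :]).astype(float)


def _centered_distances(x):
    """Pairwise Euclidean distances between samples, double-centered."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)  # samples in rows, 1+ columns
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between two variables (value in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                       # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:                             # a constant variable
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def dissimilarity_matrix(variables):
    """1 - distance correlation for every pair of (already encoded) variables."""
    p = len(variables)
    dissim = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            value = 1.0 - distance_correlation(variables[i], variables[j])
            dissim[i, j] = dissim[j, i] = value
    return dissim


# Toy data: one quantitative, one binary, and one three-level categorical variable.
expression = np.array([0.2, 0.4, 0.5, 2.1, 2.4, 2.2])
mutation = one_hot(["no", "no", "no", "yes", "yes", "yes"])
centre = one_hot(["A", "B", "C", "A", "B", "C"])

D = dissimilarity_matrix([expression, mutation, centre])
print(np.round(D, 2))
```

A matrix such as `D` can be passed to any hierarchical clustering routine that accepts precomputed dissimilarities, which is the property the description highlights as useful for ordering the rows of a heatmap.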