Hospital Financial Quarterly Aggregate Report
According to our latest research, the global emissions data aggregation for financial services market size reached USD 1.85 billion in 2024, with a robust CAGR of 17.2% projected through the forecast period. By 2033, the market is anticipated to achieve a value of USD 7.15 billion, reflecting the sector's rapid expansion. Growth in this market is primarily driven by tightening regulatory frameworks, rising investor scrutiny on ESG (Environmental, Social, and Governance) factors, and the increasing adoption of digital tools for sustainability management within financial institutions.
The growth of the emissions data aggregation market in financial services is strongly influenced by the evolving regulatory landscape. Governments and regulatory bodies worldwide are implementing stricter disclosure requirements around carbon emissions and climate-related financial risks. The introduction of frameworks such as the Task Force on Climate-related Financial Disclosures (TCFD) and the European Union's Sustainable Finance Disclosure Regulation (SFDR) has mandated that banks, asset managers, and insurers report not only their direct and indirect emissions but also those embedded across their value chains. As a result, financial institutions are seeking sophisticated data aggregation solutions to ensure compliance, minimize reputational risk, and enhance transparency for stakeholders. This regulatory momentum is expected to persist, further fueling the demand for emissions data aggregation platforms and services.
Another significant growth factor is the increasing integration of ESG criteria into investment and lending decisions. Institutional investors, asset managers, and private equity firms are under mounting pressure from clients, shareholders, and advocacy groups to align portfolios with sustainability goals and net-zero commitments. Accurate, timely, and granular emissions data has become a critical input for risk assessment, portfolio analysis, and sustainability reporting. This trend is prompting financial institutions to invest in advanced software and services capable of aggregating emissions data from diverse sources, including direct operations, energy procurement, and value chain activities. The adoption of artificial intelligence and machine learning within these solutions is further enhancing data accuracy, predictive analytics, and automated reporting capabilities, thereby driving market expansion.
Technological innovation is also playing a pivotal role in the growth of the emissions data aggregation market for financial services. Cloud-based platforms, API integrations, and blockchain technology are being leveraged to streamline data collection, validation, and reporting processes. These advancements enable financial institutions to efficiently aggregate emissions data from multiple internal and external sources, ensuring scalability and interoperability with existing IT infrastructure. Furthermore, partnerships between financial institutions and technology vendors are accelerating the development of customized solutions tailored to sector-specific needs. As digital transformation continues to reshape the financial services industry, the adoption of emissions data aggregation solutions is expected to accelerate, supporting the transition to a more sustainable and transparent financial ecosystem.
From a regional perspective, Europe currently leads the global market, driven by progressive regulatory policies and a mature ESG investment landscape. North America follows closely, with significant adoption among large banks and asset managers. The Asia Pacific region is rapidly emerging as a high-growth market, propelled by increasing regulatory alignment, investor demand for green finance, and expanding digital infrastructure. Latin America and the Middle East & Africa, while smaller in market share, are witnessing growing interest as local regulators and financial institutions begin to prioritize climate risk management and sustainability reporting. This regional diversification underscores the global relevance and growth potential of emissions data aggregation solutions in financial services.
The integration of ESG Data Feeds for Capitals is becoming increasingly vital for financial institutions aiming to enhance th
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
Dataset provided by: Björn Holzhauer
Dataset Description
Meta-analyses of clinical trials often treat the number of patients experiencing a medical event as binomially distributed when individual patient data for fitting standard time-to-event models are unavailable. Assuming identical drop-out time distributions across arms, random censorship and low proportions of patients with an event, a binomial approach results in a valid test of the null hypothesis of no treatment effect with minimal loss in efficiency compared to time-to-event methods. To deal with differences in follow-up - at the cost of assuming specific distributions for event and drop-out times - we propose a hierarchical multivariate meta-analysis model using the aggregate data likelihood based on the number of cases, fatal cases and discontinuations in each group, as well as the planned trial duration and group sizes. Such a model also enables exchangeability assumptions about parameters of survival distributions, for which they are more appropriate than for the expected proportion of patients with an event across trials of substantially different length. Borrowing information from other trials within a meta-analysis or from historical data is particularly useful for rare-events data. Prior information or exchangeability assumptions also avoid the parameter identifiability problems that arise when using more flexible event and drop-out time distributions than the exponential one. We discuss the derivation of robust historical priors and illustrate the discussed methods using an example. We also compare the proposed approach against other aggregate data meta-analysis methods in a simulation study.
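To make the aggregate data likelihood concrete, here is a minimal sketch under simplified assumptions (an illustration only, not code from the dataset): exponential event and drop-out times with rates `lam` and `eta` (names chosen here for illustration), a common planned follow-up `T`, and a single arm summarised by counts of events, drop-outs and completers. The full model described above additionally distinguishes fatal cases and places hierarchical priors on the rate parameters, which this sketch omits.

```python
import numpy as np
from scipy.stats import multinomial

def cell_probabilities(lam, eta, T):
    """Competing-risks cell probabilities with exponential event (rate lam)
    and drop-out (rate eta) times over a planned follow-up of length T."""
    total = lam + eta
    p_any = 1.0 - np.exp(-total * T)      # event or drop-out occurs before T
    p_event = (lam / total) * p_any       # event observed first
    p_drop = (eta / total) * p_any        # drop-out observed first
    p_complete = np.exp(-total * T)       # completes follow-up event-free
    return [p_event, p_drop, p_complete]

def arm_log_likelihood(n_event, n_drop, n_complete, lam, eta, T):
    """Aggregate-data log-likelihood contribution of one trial arm."""
    counts = [n_event, n_drop, n_complete]
    return multinomial.logpmf(counts, n=sum(counts), p=cell_probabilities(lam, eta, T))

# Example: 500 patients followed for a planned 2 years, 12 events, 40 drop-outs.
print(arm_log_likelihood(12, 40, 448, lam=0.012, eta=0.042, T=2.0))
```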
According to our latest research, the global API Data Aggregation Platform market size reached USD 3.62 billion in 2024. The industry is experiencing robust momentum, propelled by rising demand for seamless data integration and real-time analytics. The market is projected to grow at a CAGR of 17.4% during the forecast period, with the market size expected to reach USD 14.99 billion by 2033. This growth is primarily fueled by the escalating adoption of cloud technologies, digital transformation initiatives, and the increasing need for unified data access across various sectors.
One of the foremost growth drivers in the API Data Aggregation Platform market is the exponential increase in data generation across industries. Enterprises are leveraging multiple digital channels, IoT devices, and cloud-based services, resulting in vast volumes of structured and unstructured data. To derive actionable insights, organizations are increasingly relying on API data aggregation platforms that can seamlessly collect, normalize, and consolidate data from disparate sources. This capability not only streamlines business intelligence processes but also enhances decision-making speed and accuracy. The surge in demand for real-time analytics and the necessity for organizations to remain agile in a highly competitive environment are further catalyzing market expansion.
Another significant factor contributing to the growth of the API Data Aggregation Platform market is the rapid proliferation of financial technology (fintech) and digital banking services. The BFSI sector, in particular, is witnessing a paradigm shift towards open banking, which mandates the secure sharing of customer data via APIs. API aggregation platforms play a pivotal role in this ecosystem by enabling seamless integration between banks, third-party providers, and customers. This not only enhances customer experience through personalized offerings but also ensures regulatory compliance and security. Moreover, the healthcare sector is increasingly adopting these platforms to integrate patient data from various electronic health records (EHRs), wearables, and telemedicine applications, thereby improving care coordination and patient outcomes.
The ongoing digital transformation initiatives across enterprises of all sizes are further propelling the adoption of API data aggregation platforms. Small and medium enterprises (SMEs) are leveraging these solutions to level the playing field with larger organizations by gaining access to unified data views and advanced analytics capabilities. Large enterprises, on the other hand, are utilizing API aggregation to streamline complex data ecosystems and support large-scale digital projects. The growing trend of cloud migration and the increasing importance of data-driven business models are expected to sustain this growth trajectory over the forecast period. Additionally, the rise in remote work and the need for seamless data access across distributed teams are further strengthening market demand.
The emergence of a Unified API Platform is revolutionizing the way organizations approach data integration and management. By providing a cohesive framework that consolidates various API functionalities into a single platform, businesses can streamline their operations and enhance productivity. This unified approach not only simplifies the development and deployment of APIs but also ensures consistent security, governance, and monitoring across all API interactions. As enterprises increasingly adopt digital transformation strategies, the demand for such integrated solutions is on the rise, enabling them to respond swiftly to market changes and customer demands. The Unified API Platform thus represents a significant advancement in the API ecosystem, offering a holistic solution that addresses the complexities of modern data environments.
From a regional perspective, North America currently dominates the API Data Aggregation Platform market, followed by Europe and Asia Pacific. The region's leadership can be attributed to the early adoption of advanced technologies, a mature digital infrastructure, and a strong presence of key market players. Asia Pacific, however, is anticipated to exhibit the highest growth rate over the forecast period, driven by rapid digitalization, increasing investments in IT
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
F00 - Information about the telecommunications entrepreneur
F01 - Telephone services provided on the fixed public telecommunications network
F02 - Interoperator cooperation in the fixed public telecommunications network
F03 - Interoperator cooperation in the mobile public telecommunications network
F04 - Retail services provided to end users on the mobile public telecommunications network
F05 - Internet access services provided to end users
F06 - Bundled services
F07 - VoIP telephony services provided on the public telecommunications network
F08 - Television services provided to end users
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
In the context of REDD+, Measurement, Reporting and Verification (MRV) is one way to manage forest change information. A national carbon and non-carbon database will be used in REDD+ to negotiate compensation schemes with the international community. Much of this data will be collected at the local level, thus a reporting system that can integrate these locally collected data into the national database is crucial. In this paper we compare and draw lessons from three existing local to national reporting systems that include the participation of local communities: 1) the government extension services, 2) the government owned forestry company, and 3) a private logging company in Indonesia, and provide recommendations for REDD+ reporting systems. The results suggest that the main desired conditions for effective data flow are: benefits to motivate local participation, based on contributions to reporting activities; simple data format and reporting procedures to allow local participation in the reporting process, and to support data aggregation at the national level; a facilitator to mediate data aggregation at the village level to ensure data consistency, completeness and accuracy; and a transparent and clear data flow. Under these conditions, continuous, accountable and consistent data flow from the local level will reach the national level where it can be fully utilized.
As part of its mandate under Title VII of the Civil Rights Act of 1964, as amended, the Equal Employment Opportunity Commission requires periodic reports from public and private employers, and unions and labor organizations
https://www.datainsightsmarket.com/privacy-policy
The size of the Cement and Aggregate market was valued at USD 204,170 million in 2023 and is projected to reach USD 244,356.25 million by 2032, with an expected CAGR of 2.6% during the forecast period.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Regulatory Reporting Data Hub for Banks market size reached USD 4.2 billion in 2024 and is expected to grow at a robust CAGR of 12.1% from 2025 to 2033, reaching a projected value of USD 11.7 billion by 2033. This impressive growth is primarily driven by the increasing regulatory complexity, the need for real-time data management, and the adoption of advanced digital solutions by banks worldwide. The market is witnessing a transformation as financial institutions strive to enhance transparency, streamline compliance processes, and mitigate risks associated with regulatory reporting.
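As a quick, illustrative cross-check of the headline figures (a calculation added here, not taken from the report), compounding the stated 2024 base at the stated CAGR over the nine years to 2033 reproduces the projected market size:

```python
base_2024 = 4.2          # USD billion, stated 2024 market size
cagr = 0.121             # stated compound annual growth rate
years = 2033 - 2024      # nine compounding periods

projected_2033 = base_2024 * (1 + cagr) ** years
print(f"Projected 2033 size: USD {projected_2033:.1f} billion")  # ~11.7
```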
One of the primary growth factors for the Regulatory Reporting Data Hub for Banks market is the rapidly evolving regulatory landscape across global banking sectors. Governments and regulatory bodies are continuously introducing new compliance standards and reporting requirements, compelling banks to upgrade their data infrastructure. The introduction of stringent regulations such as Basel III, Dodd-Frank, MiFID II, and GDPR has necessitated the deployment of robust data hubs capable of aggregating, validating, and reporting vast volumes of financial data in real time. As a result, banks are increasingly investing in advanced regulatory reporting solutions to ensure timely and accurate submission of regulatory reports, avoid penalties, and maintain their reputational integrity.
Another significant driver is the growing adoption of digital transformation strategies within the banking industry. As banks digitize their operations, there is a heightened need for centralized data management platforms that can seamlessly integrate with multiple banking systems and deliver actionable insights for regulatory compliance. Regulatory reporting data hubs offer automated data aggregation, validation, and analytics functionalities, enabling banks to reduce manual intervention and minimize errors. The integration of artificial intelligence, machine learning, and big data analytics further enhances the capabilities of these platforms, allowing banks to proactively identify compliance gaps and streamline reporting workflows. This digital shift not only improves operational efficiency but also supports banks in adapting to rapidly changing regulatory demands.
Furthermore, the increasing focus on risk management and data governance is fueling the demand for regulatory reporting data hubs. Banks are under immense pressure to maintain data accuracy, consistency, and security, especially in the face of cross-border operations and complex financial products. Regulatory reporting data hubs facilitate comprehensive data lineage, audit trails, and secure data storage, ensuring that banks can demonstrate compliance during regulatory audits. The ability to aggregate data from disparate sources and generate unified reports is becoming a strategic advantage for banks seeking to enhance their risk management frameworks and achieve regulatory alignment across multiple jurisdictions.
Regionally, North America dominates the Regulatory Reporting Data Hub for Banks market, accounting for the largest share in 2024, followed by Europe and Asia Pacific. The region's leadership is attributed to the presence of major global banks, advanced IT infrastructure, and proactive regulatory frameworks. Europe is also witnessing significant growth due to the implementation of new regulatory directives and the increasing adoption of cloud-based reporting solutions. Meanwhile, Asia Pacific is emerging as a lucrative market, driven by rapid digitalization in banking and the expansion of cross-border financial activities. Latin America and the Middle East & Africa are gradually catching up, as local banks modernize their compliance processes and embrace technology-driven reporting solutions.
The Regulatory Reporting Data Hub for Banks market by component is segmented into Software and Services. The software segment holds the dominant share, as banks prioritize the deployment of comprehensive platforms capable of automating the entire regulatory reporting lifecycle. These software solutions encompass data aggregation, validation, analytics, and report generation functionalities, offering end-to-end compliance management. The evolution of cloud-native and AI-powered platforms is further enhancing the software segmen
Aggregated data attached to Diversity in the High Tech industry report
As part of its mandate under Title VII of the Civil Rights Act of 1964, as amended, the Equal Employment Opportunity Commission requires periodic reports from public and private employers, and unions and labor organizations
As part of its mandate under Title VII of the Civil Rights Act of 1964, as amended, the Equal Employment Opportunity Commission requires periodic reports from public and private employers, and unions and labor organizations which indicate the composition of their work forces by sex and by race/ethnic category. Key among these reports is the EEO-1, which is collected annually from private employers with 100 or more employees or federal contractors with 50 or more employees.
https://www.datainsightsmarket.com/privacy-policy
The size of the Platelet Aggregation Devices market was valued at USD 312.5 million in 2023 and is projected to reach USD 686.53 million by 2032, with an expected CAGR of 11.9% during the forecast period.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
View the column descriptions here. The Office of Emergency and Resilience (OER) of the Food and Agriculture Organization (FAO) is piloting a monitoring system to better understand the impacts of COVID-19 and other shocks on food supply, agricultural livelihoods and food security in a number of food crisis countries. This project is supported by the United States Agency for International Development (USAID). The monitoring system consists of primary data collected from households and key informants (including agricultural inputs vendors, food traders and agriculture extension officers) on a periodic basis (more or less every 3 months). Data are mainly collected through Computer-Assisted Telephone Interviews (CATI). In-person surveys are conducted where the circumstances allow for field access. During each round of the system, more than 40,000 interviews have been completed in more than 20 countries. In order to associate each round of data collection with the dates it was performed, refer to the calendar available here. Data are used to guide strategic decisions, to design programmes and to inform analytical processes such as the IPC. The present layer contains data aggregated on Admin1 level, from Afghanistan, Colombia, DRC, Liberia, Mali, Niger, Sierra Leone, Somalia, Yemen and Zimbabwe. Indicator: Percentage of households reporting shocks directly or indirectly related to COVID-19
https://www.usa.gov/government-works
Reporting of new Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. This dataset will receive a final update on June 1, 2023, to reconcile historical data through May 10, 2023, and will remain publicly available.
Aggregate Data Collection Process
Since the start of the COVID-19 pandemic, data have been gathered through a robust process with the following steps:
Methodology Changes
Several differences exist between the current, weekly-updated dataset and the archived version:
Confirmed and Probable Counts
In this dataset, counts by jurisdiction are not displayed by confirmed or probable status. Instead, confirmed and probable cases and deaths are included in the Total Cases and Total Deaths columns, when available. Not all jurisdictions report probable cases and deaths to CDC.* Confirmed and probable case definition criteria are described here:
Council of State and Territorial Epidemiologists (ymaws.com).
Deaths
CDC reports death data on other sections of the website: CDC COVID Data Tracker: Home, CDC COVID Data Tracker: Cases, Deaths, and Testing, and NCHS Provisional Death Counts. Information presented on the COVID Data Tracker pages is based on the same source (total case counts) as the present dataset; however, NCHS Death Counts are based on death certificates that use information reported by physicians, medical examiners, or coroners in the cause-of-death section of each certificate. Data from each of these pages are considered provisional (not complete and pending verification) and are therefore subject to change. Counts from previous weeks are continually revised as more records are received and processed.
Number of Jurisdictions Reporting
There are currently 60 public health jurisdictions reporting cases of COVID-19. This includes the 50 states, the District of Columbia, New York City, the U.S. territories of American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands, as well as three independent countries in compacts of free association with the United States: the Federated States of Micronesia, the Republic of the Marshall Islands, and the Republic of Palau. New York State's reported case and death counts do not include New York City's counts as they separately report nationally notifiable conditions to CDC.
CDC COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths, available by state and by county. These and other data on COVID-19 are available from multiple public locations, such as:
https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html
https://www.cdc.gov/covid-data-tracker/index.html
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html
https://www.cdc.gov/coronavirus/2019-ncov/php/open-america/surveillance-data-analytics.html
Additional COVID-19 public use datasets, including line-level (patient-level) data, are available at: https://data.cdc.gov/browse?tags=covid-19.
Archived Data Notes:
November 3, 2022: Due to a reporting cadence issue, case rates for Missouri counties are calculated based on 11 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 3, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Due to a reporting cadence change, case rates for Alabama counties are calculated based on 13 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 10, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Per the request of the jurisdiction, cases and deaths among non-residents have been removed from all Hawaii county totals throughout the entire time series. Cumulative case and death counts reported by CDC will no longer match Hawaii’s COVID-19 Dashboard, which still includes non-resident cases and deaths.
November 17, 2022: Two new columns, weekly historic cases and weekly historic deaths, were added to this dataset on November 17, 2022. These columns reflect case and death counts that were reported that week but were historical in nature and not reflective of the current burden within the jurisdiction. These historical cases and deaths are not included in the new weekly case and new weekly death columns; however, they are reflected in the cumulative totals provided for each jurisdiction. These data are used to account for artificial increases in case and death totals due to batched reporting of historical data.
December 1, 2022: Due to cadence changes over the Thanksgiving holiday, case rates for all Ohio counties are reported as 0 in the data released on December 1, 2022.
January 5, 2023: Due to North Carolina’s holiday reporting cadence, aggregate case and death data will contain 14 days’ worth of data instead of the customary 7 days. As a result, case and death metrics will appear higher than expected in the January 5, 2023, weekly release.
January 12, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0. As a result, case and death metrics will appear lower than expected in the January 12, 2023, weekly release.
January 19, 2023: Due to a reporting cadence issue, Mississippi’s aggregate case and death data will be calculated based on 14 days’ worth of data instead of the customary 7 days in the January 19, 2023, weekly release.
January 26, 2023: Due to a reporting backlog of historic COVID-19 cases, case rates for two Michigan counties (Livingston and Washtenaw) were higher than expected in the January 19, 2023 weekly release.
January 26, 2023: Due to a backlog of historic COVID-19 cases being reported this week, aggregate case and death counts in Charlotte County and Sarasota County, Florida, will appear higher than expected in the January 26, 2023 weekly release.
January 26, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0 in the weekly release posted on January 26, 2023.
February 2, 2023: As of the data collection deadline, CDC observed an abnormally large increase in aggregate COVID-19 cases and deaths reported for Washington State. In response, totals for new cases and new deaths released on February 2, 2023, have been displayed as zero at the state level until the issue is addressed with state officials. CDC is working with state officials to address the issue.
February 2, 2023: Due to a decrease reported in cumulative case counts by Wyoming, case rates will be reported as 0 in the February 2, 2023, weekly release. CDC is working with state officials to verify the data submitted.
February 16, 2023: Due to data processing delays, Utah’s aggregate case and death data will be reported as 0 in the weekly release posted on February 16, 2023. As a result, case and death metrics will appear lower than expected and should be interpreted with caution.
February 16, 2023: Due to a reporting cadence change, Maine’s
Energy consumption data for town-owned infrastructure to meet the Green Energy Act 2009 (O. Reg. 397/11) annual reporting requirements. Aggregate data provides overall energy consumption at the facilities level.
https://www.datainsightsmarket.com/privacy-policy
The size of the Agricultural Digital Intelligence Supply Chain Aggregation Service market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, with an expected CAGR of XX% during the forecast period.
The NFJP QPR collects aggregate data on a quarterly, annual, and program-to-date basis on number of participants, characteristics of participants, and interim and long-term performance outcomes.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
This Excel-based tool was developed to analyze means-end chain data. The tool consists of a user manual, a data input file to correctly organise your MEC data, a calculator file to analyse your data, and instructional videos. The purpose of this tool is to aggregate laddering data into hierarchical value maps showing means-end chains. The summarized results consist of (1) a summary overview, (2) a matrix, and (3) output for copy/pasting into NodeXL to generate hierarchical value maps (HVMs). To use this tool, you must have collected data via laddering interviews. Ladders are codes linked together consisting of attributes, consequences and values (ACVs).
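For readers unfamiliar with laddering data, the sketch below (an illustration with made-up ladders, not part of the Excel tool itself) shows the basic aggregation step: each ladder is an ordered chain of attribute, consequence and value (ACV) codes, and counting how often one code leads directly to another yields the implication matrix from which a hierarchical value map is drawn.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# Each ladder is an ordered chain of codes: attribute -> consequence(s) -> value.
ladders = [
    ["low price", "save money", "financial security"],
    ["low price", "buy more", "enjoyment"],
    ["organic", "feel healthy", "long life"],
    ["organic", "feel healthy", "responsibility"],
]

# Count direct links across all ladders: this is the aggregated implication matrix.
links = Counter(link for ladder in ladders for link in pairwise(ladder))

for (source, target), count in links.most_common():
    print(f"{source} -> {target}: {count}")
```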
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This artifact accompanies the SEET@ICSE article "Assessing the impact of hints in learning formal specification", which reports on a user study investigating the impact of different types of automated hints while learning a formal specification language, both in terms of immediate performance and learning retention and in terms of the students' emotional response. This research artifact provides all the material required to replicate this study (except for the proprietary questionnaires used to assess the emotional response and user experience), as well as the collected data and the data analysis scripts used for the discussion in the paper.
Dataset
The artifact contains the resources described below.
Experiment resources
The resources needed for replicating the experiment, namely in directory experiment:
alloy_sheet_pt.pdf: the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment. The sheet was provided in Portuguese due to the population of the experiment.
alloy_sheet_en.pdf: a version of the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment, translated into English.
docker-compose.yml: a Docker Compose configuration file to launch Alloy4Fun populated with the tasks in directory data/experiment for the 2 sessions of the experiment.
api and meteor: directories with source files for building and launching the Alloy4Fun platform for the study.
Experiment data
The task database used in our application of the experiment, namely in directory data/experiment:
Model.json, Instance.json, and Link.json: JSON files used to populate Alloy4Fun with the tasks for the 2 sessions of the experiment.
identifiers.txt: the list of all (104) available participant identifiers that can participate in the experiment.
Collected data
Data collected in the application of the experiment as a simple one-factor randomised experiment in 2 sessions involving 85 undergraduate students majoring in CSE. The experiment was validated by the Ethics Committee for Research in Social and Human Sciences of the Ethics Council of the University of Minho, where the experiment took place. Data is shared in the shape of JSON and CSV files with a header row, namely in directory data/results:
data_sessions.json: data collected from task-solving in the 2 sessions of the experiment, used to calculate variables productivity (PROD1 and PROD2, between 0 and 12 solved tasks) and efficiency (EFF1 and EFF2, between 0 and 1).
data_socio.csv: data collected from socio-demographic questionnaire in the 1st session of the experiment, namely:
participant identification: participant's unique identifier (ID);
socio-demographic information: participant's age (AGE), sex (SEX, 1 through 4 for female, male, prefer not to disclose, and other, respectively), and average academic grade (GRADE, from 0 to 20; NA denotes preference not to disclose).
data_emo.csv: detailed data collected from the emotional questionnaire in the 2 sessions of the experiment, namely:
participant identification: participant's unique identifier (ID) and the assigned treatment (column HINT, either N, L, E or D);
detailed emotional response data: the differential in the 5-point Likert scale for each of the 14 measured emotions in the 2 sessions, ranging from -5 to -1 if decreased, 0 if maintained, from 1 to 5 if increased, or NA denoting failure to submit the questionnaire. Half of the emotions are positive (Admiration1 and Admiration2, Desire1 and Desire2, Hope1 and Hope2, Fascination1 and Fascination2, Joy1 and Joy2, Satisfaction1 and Satisfaction2, and Pride1 and Pride2), and half are negative (Anger1 and Anger2, Boredom1 and Boredom2, Contempt1 and Contempt2, Disgust1 and Disgust2, Fear1 and Fear2, Sadness1 and Sadness2, and Shame1 and Shame2). This detailed data was used to compute the aggregate data in data_emo_aggregate.csv and in the detailed discussion in Section 6 of the paper. One plausible way to compute such a per-participant aggregate is sketched after this list.
data_umux.csv: data collected from the user experience questionnaires in the 2 sessions of the experiment, namely:
participant identification: participant's unique identifier (ID);
user experience data: summarised user experience data from the UMUX surveys (UMUX1 and UMUX2, as a usability metric ranging from 0 to 100).
participants.txt: the list of participant identifiers that have registered for the experiment.
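As referenced in the description of data_emo.csv above, the detailed differentials were aggregated into data_emo_aggregate.csv. One plausible aggregation, shown below purely for illustration, averages the differentials over the positive and over the negative emotions per participant and session, skipping NA answers; the authoritative computation is the one implemented in normalize_emo.r.

```python
import csv
from statistics import mean

POSITIVE = ["Admiration", "Desire", "Hope", "Fascination", "Joy", "Satisfaction", "Pride"]
NEGATIVE = ["Anger", "Boredom", "Contempt", "Disgust", "Fear", "Sadness", "Shame"]

def aggregate(row, names, session):
    """Mean emotional differential over the given emotions for one session, ignoring NA."""
    values = [row[f"{name}{session}"] for name in names]
    numeric = [float(v) for v in values if v != "NA"]
    return mean(numeric) if numeric else None

with open("data/results/data_emo.csv", newline="") as f:
    for row in csv.DictReader(f):
        for session in (1, 2):
            pos = aggregate(row, POSITIVE, session)
            neg = aggregate(row, NEGATIVE, session)
            print(row["ID"], row["HINT"], session, pos, neg)
```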
Analysis scripts
The analysis scripts required to replicate the analysis of the results of the experiment as reported in the paper, namely in directory analysis:
analysis.r: An R script to analyse the data in the provided CSV files; each performed analysis is documented within the file itself.
requirements.r: An R script to install the required libraries for the analysis script.
normalize_task.r: A Python script to normalize the task JSON data from file data_sessions.json into the CSV format required by the analysis script.
normalize_emo.r: A Python script to compute the aggregate emotional response in the CSV format required by the analysis script from the detailed emotional response data in the CSV format of data_emo.csv.
Dockerfile: Docker script to automate the analysis script from the collected data.
Setup
To replicate the experiment and the analysis of the results, only Docker is required.
If you wish to manually replicate the experiment and collect your own data, you'll need to install:
A modified version of the Alloy4Fun platform, which is built in the Meteor web framework. This version of Alloy4Fun is publicly available in branch study of its repository at https://github.com/haslab/Alloy4Fun/tree/study.
If you wish to manually replicate the analysis of the data collected in our experiment, you'll need to install:
Python to manipulate the JSON data collected in the experiment. Python is freely available for download at https://www.python.org/downloads/, with distributions for most platforms.
R software for the analysis scripts. R is freely available for download at https://cran.r-project.org/mirrors.html, with binary distributions available for Windows, Linux and Mac.
Usage
Experiment replication
This section describes how to replicate our user study experiment, and collect data about how different hints impact the performance of participants.
To launch the Alloy4Fun platform populated with tasks for each session, just run the following commands from the root directory of the artifact. The Meteor server may take a few minutes to launch; wait for the "Started your app" message to show.
cd experiment
docker-compose up
This will launch Alloy4Fun at http://localhost:3000. The tasks are accessed through permalinks assigned to each participant. The experiment allows for up to 104 participants, and the list of available identifiers is given in file identifiers.txt. The group of each participant is determined by the last character of the identifier, either N, L, E or D. The task database can be consulted in directory data/experiment, in Alloy4Fun JSON files.
In the 1st session, each participant was given one permalink that gives access to 12 sequential tasks. The permalink is simply the participant's identifier, so participant 0CAN would just access http://localhost:3000/0CAN. The next task is available after a correct submission to the current task or when a time-out occurs (5mins). Each participant was assigned to a different treatment group, so depending on the permalink different kinds of hints are provided. Below are 4 permalinks, each for each hint group:
Group N (no hints): http://localhost:3000/0CAN
Group L (error locations): http://localhost:3000/CA0L
Group E (counter-example): http://localhost:3000/350E
Group D (error description): http://localhost:3000/27AD
In the 2nd session, as in the 1st, each permalink gave access to 12 sequential tasks, and the next task is available after a correct submission or a time-out (5mins). The permalink is constructed by prepending the participant's identifier with P-, so participant 0CAN would access http://localhost:3000/P-0CAN. In the 2nd session all participants were expected to solve the tasks without any hints provided, so the permalinks from different groups are undifferentiated.
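The mapping from a participant identifier to its treatment group and session permalinks, as described above, can be summarised in a small helper (illustrative only, not part of the artifact; the base URL assumes the local docker-compose deployment):

```python
GROUPS = {"N": "no hints", "L": "error locations", "E": "counter-example", "D": "error description"}
BASE_URL = "http://localhost:3000"  # local Alloy4Fun instance started via docker-compose

def participant_links(identifier):
    """Treatment group and the two session permalinks for a participant identifier."""
    group = GROUPS[identifier[-1]]            # group is encoded in the identifier's last character
    session1 = f"{BASE_URL}/{identifier}"     # 1st session permalink: the identifier itself
    session2 = f"{BASE_URL}/P-{identifier}"   # 2nd session permalink: identifier prefixed with P-
    return group, session1, session2

print(participant_links("0CAN"))
# ('no hints', 'http://localhost:3000/0CAN', 'http://localhost:3000/P-0CAN')
```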
Before the 1st session the participants should answer the socio-demographic questionnaire, which should ask for the following information: unique identifier, age, sex, familiarity with the Alloy language, and average academic grade.
Before and after both sessions the participants should answer the standard PrEmo 2 questionnaire. PrEmo 2 is published under an Attribution-NonCommercial-NoDerivatives 4.0 International Creative Commons licence (CC BY-NC-ND 4.0). This means that you are free to use the tool for non-commercial purposes as long as you give appropriate credit, provide a link to the license, and do not modify the original material. The original material, namely the depictions of the different emotions, can be downloaded from https://diopd.org/premo/. The questionnaire should ask for the unique user identifier, and for the attachment to each of the 14 depicted emotions, expressed in a 5-point Likert scale.
After both sessions the participants should also answer the standard UMUX questionnaire. This questionnaire can be used freely, and should ask for the user unique identifier and answers for the standard 4 questions in a 7-point Likert scale. For information about the questions, how to implement the questionnaire, and how to compute the usability metric ranging from 0 to 100 score from the answers, please see the original paper:
Kraig Finstad. 2010. The usability metric for user experience. Interacting with computers 22, 5 (2010), 323–327.
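For convenience, the sketch below applies the UMUX scoring rule as commonly stated for Finstad's questionnaire (odd items worded positively, even items negatively); it is an illustration, and the original paper remains the authoritative reference for implementing the questionnaire.

```python
def umux_score(q1, q2, q3, q4):
    """UMUX usability score (0-100) from the four 7-point Likert answers.
    Odd items are positively worded, even items are negatively worded."""
    for answer in (q1, q2, q3, q4):
        assert 1 <= answer <= 7, "answers must be on a 1-7 Likert scale"
    item_scores = [q1 - 1, 7 - q2, q3 - 1, 7 - q4]  # each item contributes 0-6
    return sum(item_scores) / 24 * 100

print(umux_score(6, 2, 7, 1))  # ~91.7 for a very positive response
```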
Analysis of other applications of the experiment
This section describes how to replicate the analysis of the data collected in an application of the experiment described in Experiment replication.
The analysis script expects data in 4 CSV files,