100+ datasets found
  1. Data from: Quality control and data-handling in multicentre studies: the...

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Jul 24, 2025
    Cite
    National Institutes of Health (2025). Quality control and data-handling in multicentre studies: the case of the Multicentre Project for Tuberculosis Research [Dataset]. https://catalog.data.gov/dataset/quality-control-and-data-handling-in-multicentre-studies-the-case-of-the-multicentre-proje
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background The Multicentre Project for Tuberculosis Research (MPTR) was a clinical-epidemiological study on tuberculosis carried out in Spain from 1996 to 1998. In total, 96 centres scattered all over the country participated in the project; 19935 "possible cases" of tuberculosis were examined and 10053 were finally included. The data-handling and quality control procedures implemented in the MPTR are described. Methods The study was divided into three phases: 1) preliminary phase, 2) field work, 3) final phase. Quality control procedures during the three phases are described. Results: Preliminary phase: a) organisation of the research team; b) design of epidemiological tools; c) training of researchers. Field work: a) data collection; b) data computerisation; c) data transmission; d) data cleaning; e) quality control audits; f) confidentiality. Final phase: a) final data cleaning; b) final analysis. Conclusion The undertaking of a multicentre project implies the need to work with a heterogeneous research team and yet at the same time attain a common goal by following a homogeneous methodology. This demands an additional effort on quality control.

  2. Indigenous data analysis methods for research

    • osf.io
    url
    Updated Jun 12, 2024
    Cite
    Nina Sivertsen; Tahlia Johnson; Annette Briley; Shanamae Davies; Tara Struck; Larissa Taylor; Susan Smith; Megan Cooper; Jaclyn Davey (2024). Indigenous data analysis methods for research [Dataset]. http://doi.org/10.17605/OSF.IO/VNZD9
    Available download formats: url
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Center For Open Science
    Authors
    Nina Sivertsen; Tahlia Johnson; Annette Briley; Shanamae Davies; Tara Struck; Larissa Taylor; Susan Smith; Megan Cooper; Jaclyn Davey
    Description

    Objective: The objective of this review is to identify what is known about Indigenous data analysis methods for research. Introduction: Understanding Indigenous data analysis methods for research is crucial in health research with Indigenous participants, to support culturally appropriate interpretation of research data and culturally inclusive analyses in cross-cultural research teams. Inclusion Criteria: This review will consider primary research studies that report on Indigenous data analysis methods for research. Method: Medline (via Ovid SP), PsycINFO (via Ovid SP), Web of Science (Clarivate Analytics), Scopus (Elsevier), the Cumulative Index to Nursing and Allied Health Literature (CINAHL, via EBSCOhost), ProQuest Central, and ProQuest Social Sciences Premium (Clarivate) will be searched. ProQuest (Theses and Dissertations) will be searched for unpublished material. Studies published from inception onwards and written in English will be assessed for inclusion. Studies meeting the inclusion criteria will be assessed for methodological quality and data will be extracted.

  3. Replication data for: Big Data: New Tricks for Econometrics

    • openicpsr.org
    Updated May 1, 2014
    Cite
    Hal R. Varian (2014). Replication data for: Big Data: New Tricks for Econometrics [Dataset]. http://doi.org/10.3886/E113925V1
    Dataset updated
    May 1, 2014
    Dataset provided by
    American Economic Association
    Authors
    Hal R. Varian
    Time period covered
    May 1, 2014
    Description

    Computers are now involved in many economic transactions and can capture data associated with these transactions, which can then be manipulated and analyzed. Conventional statistical and econometric techniques such as regression often work well, but there are issues unique to big datasets that may require different tools. First, the sheer size of the data involved may require more powerful data manipulation tools. Second, we may have more potential predictors than appropriate for estimation, so we need to do some kind of variable selection. Third, large datasets may allow for more flexible relationships than simple linear models. Machine learning techniques such as decision trees, support vector machines, neural nets, deep learning, and so on may allow for more effective ways to model complex relationships. In this essay, I will describe a few of these tools for manipulating and analyzing big data. I believe that these methods have a lot to offer and should be more widely known and used by economists.

  4. 2025 Green Card Report for International Studies With Emphasis In...

    • myvisajobs.com
    Updated Jan 16, 2025
    Cite
    MyVisaJobs (2025). 2025 Green Card Report for International Studies With Emphasis In Quantitative Spatial Data Analysis [Dataset]. https://www.myvisajobs.com/reports/green-card/major/international-studies-with-emphasis-in-quantitative-spatial-data-analysis
    Dataset updated
    Jan 16, 2025
    Dataset authored and provided by
    MyVisaJobs
    License

    https://www.myvisajobs.com/terms-of-service/

    Variables measured
    Major, Salary, Petitions Filed
    Description

    A dataset that explores Green Card sponsorship trends, salary data, and employer insights for international studies with emphasis in quantitative spatial data analysis in the U.S.

  5. Scoping Review protocol for the study "Current practices in missing data...

    • b2find.eudat.eu
    Updated Apr 29, 2023
    Cite
    (2023). Scoping Review protocol for the study "Current practices in missing data handling for interrupted time series studies performed on individual-level data: a scoping review in health research" [Dataset]. https://b2find.eudat.eu/dataset/951f79ea-6ff4-5ec3-bc27-7016d1949e0c
    Dataset updated
    Apr 29, 2023
    Description

    Protocol for the study entitled "Current practices in missing data handling for interrupted time series studies performed on individual-level data: a scoping review in health research".

  6. Assessing the impact of hints in learning formal specification: Research...

    • data.niaid.nih.gov
    Updated Jan 29, 2024
    Cite
    Macedo, Nuno (2024). Assessing the impact of hints in learning formal specification: Research artifact [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10450608
    Dataset updated
    Jan 29, 2024
    Dataset provided by
    Campos, José Creissac
    Margolis, Iara
    Macedo, Nuno
    Cunha, Alcino
    Sousa, Emanuel
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This artifact accompanies the SEET@ICSE article "Assessing the impact of hints in learning formal specification", which reports on a user study investigating the impact of different types of automated hints while learning a formal specification language, in terms of both immediate performance and learning retention, as well as the students' emotional response. This research artifact provides all the material required to replicate the study (except for the proprietary questionnaires used to assess emotional response and user experience), as well as the collected data and the data analysis scripts used for the discussion in the paper.

    Dataset

    The artifact contains the resources described below.

    Experiment resources

    The resources needed for replicating the experiment, namely in directory experiment:

    alloy_sheet_pt.pdf: the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment. The sheet was passed in Portuguese due to the population of the experiment.

    alloy_sheet_en.pdf: a version of the 1-page Alloy sheet that participants had access to during the 2 sessions of the experiment, translated into English.

    docker-compose.yml: a Docker Compose configuration file to launch Alloy4Fun populated with the tasks in directory data/experiment for the 2 sessions of the experiment.

    api and meteor: directories with source files for building and launching the Alloy4Fun platform for the study.

    Experiment data

    The task database used in our application of the experiment, namely in directory data/experiment:

    Model.json, Instance.json, and Link.json: JSON files used to populate Alloy4Fun with the tasks for the 2 sessions of the experiment.

    identifiers.txt: the list of all (104) available participant identifiers that can participate in the experiment.

    Collected data

    Data collected in the application of the experiment as a simple one-factor randomised experiment in 2 sessions involving 85 undergraduate students majoring in CSE. The experiment was validated by the Ethics Committee for Research in Social and Human Sciences of the Ethics Council of the University of Minho, where the experiment took place. Data is shared in the shape of JSON and CSV files with a header row, namely in directory data/results:

    data_sessions.json: data collected from task-solving in the 2 sessions of the experiment, used to calculate variables productivity (PROD1 and PROD2, between 0 and 12 solved tasks) and efficiency (EFF1 and EFF2, between 0 and 1).

    data_socio.csv: data collected from socio-demographic questionnaire in the 1st session of the experiment, namely:

    participant identification: participant's unique identifier (ID);

    socio-demographic information: participant's age (AGE), sex (SEX, 1 through 4 for female, male, prefer not to disclose, and other, respectively), and average academic grade (GRADE, from 0 to 20, where NA denotes a preference not to disclose).

    data_emo.csv: detailed data collected from the emotional questionnaire in the 2 sessions of the experiment, namely:

    participant identification: participant's unique identifier (ID) and the assigned treatment (column HINT, either N, L, E or D);

    detailed emotional response data: the differential in the 5-point Likert scale for each of the 14 measured emotions in the 2 sessions, ranging from -5 to -1 if decreased, 0 if maintained, from 1 to 5 if increased, or NA denoting failure to submit the questionnaire. Half of the emotions are positive (Admiration1 and Admiration2, Desire1 and Desire2, Hope1 and Hope2, Fascination1 and Fascination2, Joy1 and Joy2, Satisfaction1 and Satisfaction2, and Pride1 and Pride2), and half are negative (Anger1 and Anger2, Boredom1 and Boredom2, Contempt1 and Contempt2, Disgust1 and Disgust2, Fear1 and Fear2, Sadness1 and Sadness2, and Shame1 and Shame2). This detailed data was used to compute the aggregate data in data_emo_aggregate.csv and in the detailed discussion in Section 6 of the paper.

    data_umux.csv: data collected from the user experience questionnaires in the 2 sessions of the experiment, namely:

    participant identification: participant's unique identifier (ID);

    user experience data: summarised user experience data from the UMUX surveys (UMUX1 and UMUX2, as a usability metric ranging from 0 to 100).

    participants.txt: the list of participant identifiers that have registered for the experiment.

    Analysis scripts

    The analysis scripts required to replicate the analysis of the results of the experiment as reported in the paper, namely in directory analysis:

    analysis.r: An R script to analyse the data in the provided CSV files; each performed analysis is documented within the file itself.

    requirements.r: An R script to install the required libraries for the analysis script.

    normalize_task.r: A Python script to normalize the task JSON data from file data_sessions.json into the CSV format required by the analysis script.

    normalize_emo.r: A Python script to compute the aggregate emotional response in the CSV format required by the analysis script from the detailed emotional response data in the CSV format of data_emo.csv.

    Dockerfile: Docker script to automate the analysis script from the collected data.

    Setup

    To replicate the experiment and the analysis of the results, only Docker is required.

    If you wish to manually replicate the experiment and collect your own data, you'll need to install:

    A modified version of the Alloy4Fun platform, which is built in the Meteor web framework. This version of Alloy4Fun is publicly available in branch study of its repository at https://github.com/haslab/Alloy4Fun/tree/study.

    If you wish to manually replicate the analysis of the data collected in our experiment, you'll need to install:

    Python to manipulate the JSON data collected in the experiment. Python is freely available for download at https://www.python.org/downloads/, with distributions for most platforms.

    R software for the analysis scripts. R is freely available for download at https://cran.r-project.org/mirrors.html, with binary distributions available for Windows, Linux and Mac.

    Usage

    Experiment replication

    This section describes how to replicate our user study experiment, and collect data about how different hints impact the performance of participants.

    To launch the Alloy4Fun platform populated with tasks for each session, just run the following commands from the root directory of the artifact. The Meteor server may take a few minutes to launch, wait for the "Started your app" message to show.

    cd experiment
    docker-compose up

    This will launch Alloy4Fun at http://localhost:3000. The tasks are accessed through permalinks assigned to each participant. The experiment allows for up to 104 participants, and the list of available identifiers is given in file identifiers.txt. The group of each participant is determined by the last character of the identifier, either N, L, E or D. The task database can be consulted in directory data/experiment, in Alloy4Fun JSON files.

    In the 1st session, each participant was given one permalink that gives access to 12 sequential tasks. The permalink is simply the participant's identifier, so participant 0CAN would just access http://localhost:3000/0CAN. The next task is available after a correct submission to the current task or when a time-out occurs (5mins). Each participant was assigned to a different treatment group, so depending on the permalink different kinds of hints are provided. Below are 4 permalinks, each for each hint group:

    Group N (no hints): http://localhost:3000/0CAN

    Group L (error locations): http://localhost:3000/CA0L

    Group E (counter-example): http://localhost:3000/350E

    Group D (error description): http://localhost:3000/27AD

    In the 2nd session, as in the 1st, each permalink gave access to 12 sequential tasks, and the next task is available after a correct submission or a time-out (5mins). The permalink is constructed by prepending the participant's identifier with P-. So participant 0CAN would just access http://localhost:3000/P-0CAN. In the 2nd session all participants were expected to solve the tasks without any hints provided, so the permalinks from different groups are undifferentiated.
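The permalink scheme described above can be sketched as a small helper (an illustration only, not part of the artifact; the base URL and group letters follow the examples in this section):

```python
BASE = "http://localhost:3000"
GROUPS = {"N": "no hints", "L": "error locations",
          "E": "counter-example", "D": "error description"}

def permalinks(identifier: str):
    """Return the hint group plus the session-1 and session-2 permalinks
    for a participant identifier (group is the identifier's last letter;
    session 2 prepends 'P-' to the identifier)."""
    group = GROUPS[identifier[-1]]
    session1 = f"{BASE}/{identifier}"
    session2 = f"{BASE}/P-{identifier}"
    return group, session1, session2

print(permalinks("0CAN"))
# → ('no hints', 'http://localhost:3000/0CAN', 'http://localhost:3000/P-0CAN')
```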

    Before the 1st session the participants should answer the socio-demographic questionnaire, which should ask for the following information: unique identifier, age, sex, familiarity with the Alloy language, and average academic grade.

    Before and after both sessions the participants should answer the standard PrEmo 2 questionnaire. PrEmo 2 is published under an Attribution-NonCommercial-NoDerivatives 4.0 International Creative Commons licence (CC BY-NC-ND 4.0). This means that you are free to use the tool for non-commercial purposes as long as you give appropriate credit, provide a link to the license, and do not modify the original material. The original material, namely the depictions of the different emotions, can be downloaded from https://diopd.org/premo/. The questionnaire should ask for the unique user identifier and for the attachment to each of the 14 depicted emotions, expressed on a 5-point Likert scale.

    After both sessions the participants should also answer the standard UMUX questionnaire. This questionnaire can be used freely, and should ask for the user's unique identifier and answers to the standard 4 questions on a 7-point Likert scale. For information about the questions, how to implement the questionnaire, and how to compute the usability metric (a score ranging from 0 to 100) from the answers, please see the original paper:

    Kraig Finstad. 2010. The usability metric for user experience. Interacting with computers 22, 5 (2010), 323–327.
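The UMUX scoring can be sketched as follows (a hedged illustration, not part of the artifact; it assumes the standard item polarity from Finstad's paper, with items 1 and 3 positively worded and items 2 and 4 negatively worded):

```python
def umux_score(responses):
    """Compute the UMUX usability metric (0-100) from the four 7-point
    Likert answers: positively worded items contribute (answer - 1),
    negatively worded items contribute (7 - answer); the sum (max 24)
    is scaled to a 0-100 score."""
    assert len(responses) == 4 and all(1 <= r <= 7 for r in responses)
    contributions = [(r - 1) if i % 2 == 0 else (7 - r)
                     for i, r in enumerate(responses)]
    return sum(contributions) / 24 * 100

print(umux_score([7, 1, 7, 1]))  # best-case answers → 100.0
print(umux_score([1, 7, 1, 7]))  # worst-case answers → 0.0
```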

    Analysis of other applications of the experiment

    This section describes how to replicate the analysis of the data collected in an application of the experiment described in Experiment replication.

    The analysis script expects data in 4 CSV files,

  7. Data Analysis Services Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 26, 2025
    Cite
    Data Insights Market (2025). Data Analysis Services Report [Dataset]. https://www.datainsightsmarket.com/reports/data-analysis-services-1989313
    Available download formats: pdf, doc, ppt
    Dataset updated
    May 26, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Analysis Services market is experiencing robust growth, driven by the exponential increase in data volume and the rising demand for data-driven decision-making across various industries. The market, estimated at $150 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching an impressive $450 billion by 2033. This expansion is fueled by several key factors, including the increasing adoption of cloud-based analytics platforms, the growing need for advanced analytics techniques like machine learning and AI, and the rising focus on data security and compliance. The market is segmented by service type (e.g., predictive analytics, descriptive analytics, prescriptive analytics), industry vertical (e.g., healthcare, finance, retail), and deployment model (cloud, on-premise). Key players like IBM, Accenture, Microsoft, and SAS Institute are investing heavily in research and development, expanding their service portfolios, and pursuing strategic partnerships to maintain their market leadership. The competitive landscape is characterized by both large established players and emerging niche providers offering specialized solutions. The market's growth trajectory is influenced by various trends, including the increasing adoption of big data technologies, the growing prevalence of self-service analytics tools empowering business users, and the rise of specialized data analysis service providers catering to specific industry needs. However, certain restraints, such as the lack of skilled data analysts, data security concerns, and the high cost of implementation and maintenance of advanced analytics solutions, could potentially hinder market growth. Addressing these challenges through investments in data literacy programs, enhanced security measures, and flexible pricing models will be crucial for sustaining the market's momentum and unlocking its full potential. 
Overall, the Data Analysis Services market presents a significant opportunity for companies offering innovative solutions and expertise in this rapidly evolving landscape.
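The headline projection can be sanity-checked with simple compound-growth arithmetic (the figures below are the ones quoted in the summary):

```python
# Compound annual growth: value_n = value_0 * (1 + CAGR) ** years.
start_usd = 150e9    # estimated 2025 market size
cagr = 0.15          # 15% CAGR
years = 2033 - 2025  # 8-year forecast window

projected = start_usd * (1 + cagr) ** years
print(f"${projected / 1e9:.0f}B")  # ≈ $459B, in line with the ~$450B cited
```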

  8. Data from: Research Document: Jaouad Karfali Economic Cycle Analysis with...

    • data.mendeley.com
    Updated Feb 26, 2025
    Cite
    Karfali Jaouad (2025). Research Document: Jaouad Karfali Economic Cycle Analysis with Numerical Time Cycles [Dataset]. http://doi.org/10.17632/wv7dcm5834.1
    Dataset updated
    Feb 26, 2025
    Authors
    Karfali Jaouad
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Description: This dataset contains historical economic data spanning from 1871 to 2024, used in Jaouad Karfali’s research on Economic Cycle Analysis with Numerical Time Cycles. The study aims to improve economic forecasting accuracy through the 9-year cycle model, which demonstrates superior predictive capabilities compared to traditional economic indicators.

    Dataset Contents: The dataset includes a comprehensive range of economic indicators used in the research, such as:

    USGDP_1871-2024.csv – U.S. Gross Domestic Product (GDP) data.
    USCPI_cleaned.csv – U.S. Consumer Price Index (CPI), cleaned and processed.
    USWAGE_1871-2024.csv – U.S. average wages data.
    EXCHANGEGLOBAL_cleaned.csv – Global exchange rates for the U.S. dollar.
    EXCHANGEPOUND_cleaned.csv – U.S. dollar to British pound exchange rates.
    INTERESTRATE_1871-2024.csv – U.S. interest rate data.
    UNRATE.csv – U.S. unemployment rate statistics.
    POPTOTUSA647NWDB.csv – U.S. total population data.

    Significance of the Data: This dataset serves as a foundation for a robust economic analysis of the U.S. economy over multiple decades. It was instrumental in testing the 9-year economic cycle model, which demonstrated an 85% accuracy rate in economic forecasting when compared to traditional models such as ARIMA and VAR.

    Applications:

    Economic Forecasting: Predicts a 1.5% decline in GDP in 2025, followed by a gradual recovery between 2026-2034.
    Economic Stability Analysis: Used for comparing forecasts with estimates from institutions like the IMF and World Bank.
    Academic and Institutional Research: Supports studies in economic cycles and long-term forecasting.

    Source & Further Information: For more details on the methodology and research findings, refer to the full paper published on SSRN:

    https://ssrn.com/author=7429208 https://orcid.org/0009-0002-9626-7289

  9. Data Mining Tools Market Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Feb 3, 2025
    Cite
    Market Research Forecast (2025). Data Mining Tools Market Report [Dataset]. https://www.marketresearchforecast.com/reports/data-mining-tools-market-1722
    Available download formats: pdf, ppt, doc
    Dataset updated
    Feb 3, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Mining Tools Market was valued at USD 1.01 billion in 2023 and is projected to reach USD 1.99 billion by 2032, exhibiting a CAGR of 10.2% during the forecast period. The growing adoption of data-driven decision-making and the increasing need for business intelligence are major factors driving market growth. Data mining refers to filtering, sorting, and classifying data from larger datasets to reveal subtle patterns and relationships, which helps enterprises identify and solve complex business problems through data analysis. Data mining software tools and techniques allow organizations to foresee future market trends and make business-critical decisions at crucial times. Data mining is an essential component of data science that employs advanced data analytics to derive insightful information from large volumes of data. Businesses rely heavily on data mining to undertake analytics initiatives in the organizational setup. The analyzed data sourced from data mining is used for varied analytics and business intelligence (BI) applications, which consider real-time data analysis along with some historical pieces of information.

    Recent developments include:

    May 2023 – WiMi Hologram Cloud Inc. introduced a new data interaction system developed by combining neural network technology and data mining. Using real-time interaction, the system can offer reliable and safe information transmission.

    May 2023 – U.S. Data Mining Group, Inc., operating in the bitcoin mining space, announced a hosting contract to deploy 150,000 bitcoin miners in partnership with major companies such as TeslaWatt, Sphere 3D, Marathon Digital, and more. The company offers industry turn-key solutions for curtailment, accounting, and customer relations.

    April 2023 – Artificial intelligence and single-cell biotech analytics firm One Biosciences launched a single-cell data mining algorithm called 'MAYA'. The algorithm helps detect therapeutic vulnerabilities in cancer patients.

    May 2022 – Europe-based Solarisbank, a banking-as-a-service provider, announced its partnership with Snowflake to boost its cloud data strategy. Using the advanced cloud infrastructure, the company can enhance data mining efficiency and strengthen its banking position.

    Key drivers for this market are: Increasing Focus on Customer Satisfaction to Drive Market Growth. Potential restraints include: Requirement of Skilled Technical Resources Likely to Hamper Market Growth. Notable trends are: Incorporation of Data Mining and Machine Learning Solutions to Propel Market Growth.

  10. Sensitive data: legal, ethical and secure storage issues

    • figshare.com
    pdf
    Updated Oct 10, 2016
    Cite
    Australian National Data Service; Kate LeMay (2016). Sensitive data: legal, ethical and secure storage issues [Dataset]. http://doi.org/10.6084/m9.figshare.4003485.v2
    Available download formats: pdf
    Dataset updated
    Oct 10, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Australian National Data Service; Kate LeMay
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Slides from the introduction to a panel session at eResearch Australasia (Melbourne, October 2016). Panellists: Kate LeMay (Australian National Data Service), Gabrielle Hirsch (Walter and Eliza Hall Institute of Medical Research), Gordon McGurk (National Health and Medical Research Council) and Jeff Christiansen (Intersect).

    Short abstract: Human medical, health and personal data are a major category of sensitive data. These data need particular care, both during the management of a research project and when planning to publish them. The Australian National Data Service (ANDS) has developed guides around the management and sharing of sensitive data. ANDS is convening this panel to consider legal, ethical and secure storage issues around sensitive data, in the stages of the research life cycle: research conception and planning, commencement of research, data collection and processing, data analysis storage and management, and dissemination of results and data access.

    The legal framework around privacy in Australia is complex and differs between states. Many Acts regulate the collection, use, disclosure and handling of private data. There are also many ethical considerations around the management and sharing of sensitive data. The National Health and Medical Research Council (NHMRC) has developed the Human Research Ethics Application (HREA) as a replacement for the National Ethics Application Form (NEAF). The aim of the HREA is to be a concise streamlined application to facilitate efficient and effective ethics review for research involving humans. The application will assist researchers to consider the ethical principles of the National Statement of Ethical Conduct in Human Research (2007) in relation to their research.

    National security standard guidelines and health and medical research policy drivers underpin the need for a national fit-for-purpose health and medical research data storage facility to store, access and use health and medical research data. med.data.edu.au is an NCRIS-funded facility that underpins the Australian health and medical research sector by providing secure data storage and compute services that adhere to privacy and confidentiality requirements of data custodians who are responsible for human-derived research datasets.

  11. Data for Example II.

    • plos.figshare.com
    application/csv
    Updated Jul 3, 2024
    Cite
    Jularat Chumnaul; Mohammad Sepehrifar (2024). Data for Example II. [Dataset]. http://doi.org/10.1371/journal.pone.0297930.s003
    Available download formats: application/csv
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jularat Chumnaul; Mohammad Sepehrifar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data analysis can be accurate and reliable only if the underlying assumptions of the used statistical method are validated. Any violations of these assumptions can change the outcomes and conclusions of the analysis. In this study, we developed Smart Data Analysis V2 (SDA-V2), an interactive and user-friendly web application, to assist users with limited statistical knowledge in data analysis, and it can be freely accessed at https://jularatchumnaul.shinyapps.io/SDA-V2/. SDA-V2 automatically explores and visualizes data, examines the underlying assumptions associated with the parametric test, and selects an appropriate statistical method for the given data. Furthermore, SDA-V2 can assess the quality of research instruments and determine the minimum sample size required for a meaningful study. However, while SDA-V2 is a valuable tool for simplifying statistical analysis, it does not replace the need for a fundamental understanding of statistical principles. Researchers are encouraged to combine their expertise with the software’s capabilities to achieve the most accurate and credible results.

  12. Big Data Analysis Platform Market Report | Global Forecast From 2025 To 2033...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Big Data Analysis Platform Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-big-data-analysis-platform-market
    Explore at:
    pptx, csv, pdf
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Big Data Analysis Platform Market Outlook



    The global market size for Big Data Analysis Platforms is projected to grow from USD 35.5 billion in 2023 to an impressive USD 110.7 billion by 2032, reflecting a CAGR of 13.5%. This substantial growth can be attributed to the increasing adoption of data-driven decision-making processes across various industries, the rapid proliferation of IoT devices, and the ever-growing volumes of data generated globally.
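    As a quick sanity check on the figures above, the implied growth rate can be recomputed from the endpoints (a back-of-envelope sketch, not part of the report):

    ```python
    # Back-of-envelope check of the projection: USD 35.5B (2023) to
    # USD 110.7B (2032) over nine years.
    start, end, years = 35.5, 110.7, 2032 - 2023

    # Implied compound annual growth rate
    cagr = (end / start) ** (1 / years) - 1

    # Compounding the start value forward reproduces the end value
    projected = start * (1 + cagr) ** years

    print(f"implied CAGR: {cagr:.1%}")  # close to the stated 13.5%
    ```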



    One of the primary growth factors for the Big Data Analysis Platform market is the escalating need for businesses to derive actionable insights from complex and voluminous datasets. With the advent of technologies such as artificial intelligence and machine learning, organizations are increasingly leveraging big data analytics to enhance their operational efficiency, customer experience, and competitiveness. The ability to process vast amounts of data quickly and accurately is proving to be a game-changer, enabling businesses to make more informed decisions, predict market trends, and optimize their supply chains.



    Another significant driver is the rise of digital transformation initiatives across various sectors. Companies are increasingly adopting digital technologies to improve their business processes and meet changing customer expectations. Big Data Analysis Platforms are central to these initiatives, providing the necessary tools to analyze and interpret data from diverse sources, including social media, customer transactions, and sensor data. This trend is particularly pronounced in sectors such as retail, healthcare, and BFSI (banking, financial services, and insurance), where data analytics is crucial for personalizing customer experiences, managing risks, and improving operational efficiencies.



    Moreover, the growing adoption of cloud computing is significantly influencing the market. Cloud-based Big Data Analysis Platforms offer several advantages over traditional on-premises solutions, including scalability, flexibility, and cost-effectiveness. Businesses of all sizes are increasingly turning to cloud-based analytics solutions to handle their data processing needs. The ability to scale up or down based on demand, coupled with reduced infrastructure costs, makes cloud-based solutions particularly appealing to small and medium-sized enterprises (SMEs) that may not have the resources to invest in extensive on-premises infrastructure.



    Data Science and Machine-Learning Platforms play a pivotal role in the evolution of Big Data Analysis Platforms. These platforms provide the necessary tools and frameworks for processing and analyzing vast datasets, enabling organizations to uncover hidden patterns and insights. By integrating data science techniques with machine learning algorithms, businesses can automate the analysis process, leading to more accurate predictions and efficient decision-making. This integration is particularly beneficial in sectors such as finance and healthcare, where the ability to quickly analyze complex data can lead to significant competitive advantages. As the demand for data-driven insights continues to grow, the role of data science and machine-learning platforms in enhancing big data analytics capabilities is becoming increasingly critical.



    From a regional perspective, North America currently holds the largest market share, driven by the presence of major technology companies, high adoption rates of advanced technologies, and substantial investments in data analytics infrastructure. Europe and the Asia Pacific regions are also experiencing significant growth, fueled by increasing digitalization efforts and the rising importance of data analytics in business strategy. The Asia Pacific region, in particular, is expected to witness the highest CAGR during the forecast period, propelled by rapid economic growth, a burgeoning middle class, and increasing internet and smartphone penetration.



    Component Analysis



    The Big Data Analysis Platform market can be broadly categorized into three components: Software, Hardware, and Services. The software segment includes analytics software, data management software, and visualization tools, which are crucial for analyzing and interpreting large datasets. This segment is expected to dominate the market due to the continuous advancements in analytics software and the increasing need for sophisticated data analysis tools. Analytics software enables organizations to process and analyze data from multiple sources,

  13. Data from: Bringing Data into the Classroom: Hang On - You Can Do It!

    • search.dataone.org
    Updated Dec 28, 2023
    Cite
    Elizabeth Hamilton; Mary MacLeod (2023). Bringing Data into the Classroom: Hang On - You Can Do It! [Dataset]. http://doi.org/10.5683/SP3/VNMOZS
    Explore at:
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Elizabeth Hamilton; Mary MacLeod
    Description

    Emerging issues your director needs to know about, including funding issues and how to bring data into the classroom.

  14. Data from: Secondary Data Analysis of the Socio-Economic Panel Study and the Cross-National Equivalent File, 2016-2020

    • beta.ukdataservice.ac.uk
    Updated 2021
    Cite
    Laura Langner (2021). Secondary Data Analysis of the Socio-Economic Panel Study and the Cross-National Equivalent File, 2016-2020 [Dataset]. http://doi.org/10.5255/ukda-sn-854591
    Explore at:
    Dataset updated
    2021
    Dataset provided by
    DataCite (https://www.datacite.org/)
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Laura Langner
    Description

    The data comprise three of the Cross-National Equivalent Files: the Panel Study of Income Dynamics (1970-2013); the German Socio-Economic Panel Study (1984-2015); and the UKHLS (2009-2014) together with the British Household Panel Study (1991-2009). The following variables were extracted: personal identifier (x11101LL), household identifier (x11102), survey year (year), sex (d11101LL), marital status (d11104), income (i11110), employment status (e11101), hours worked (e11101), education (d11108/9), partner identifier (d11105), household size (d11106) and number of children (d11107). The data came in a harmonized form from the data providers. For the papers on Germany, in addition to the variables described above, life satisfaction, work hour flexibility, caregiving, housework hours, widowhood status and carer ID were further extracted from the original German Socio-Economic Panel Study.
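    For readers working with such extracts, selecting the listed harmonized variables from a CNEF-style panel file is a short pandas operation. The helper below is a hypothetical sketch (`extract_cnef` and the column list are reconstructed from the description above, not part of the deposited data):

    ```python
    import pandas as pd

    # Harmonized CNEF variable codes named in the dataset description
    CNEF_VARS = [
        "x11101LL",  # personal identifier
        "x11102",    # household identifier
        "year",      # survey year
        "d11101LL",  # sex
        "d11104",    # marital status
        "i11110",    # income
        "e11101",    # employment status / hours worked (as coded above)
        "d11108",    # education
        "d11105",    # partner identifier
        "d11106",    # household size
        "d11107",    # number of children
    ]

    def extract_cnef(df: pd.DataFrame) -> pd.DataFrame:
        """Keep only the harmonized CNEF columns present in a panel file."""
        present = [c for c in CNEF_VARS if c in df.columns]
        return df[present]
    ```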

  15. Data analysis Protocol for a Joint Study into the Impacts of AI on professional Competencies of IT Professionals and Implications for Computing Students

    • zenodo.org
    pdf
    Updated Nov 18, 2024
    Cite
    Tony Clear; Tony Clear (2024). Data analysis Protocol for a Joint Study into the Impacts of AI on professional Competencies of IT Professionals and Implications for Computing Students. ITiCSE 2024 Working Group 02. [Dataset]. http://doi.org/10.5281/zenodo.14176957
    Explore at:
    pdf
    Dataset updated
    Nov 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Tony Clear; Tony Clear
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    Overview

    This document defines a common protocol for sharing and analysing data for the ITiCSE 2024 working group “WG02: A Multi-Institutional-Multi-National Study into the Impacts of AI on Work Practices of IT Professionals and Implications for Computing Students”. Excerpts from the working group plan (Clear et al., 2024) are given below to place the protocol in context.

    Background and Related Work

    As Artificial Intelligence (AI) continues to make its presence felt in transforming workplaces around the world [1,10], and the Information Technology industry in particular, it is essential to understand its impact on the work practices of IT professionals, and the implications for computing students and curricula. This research project builds on work initiated jointly, in Sweden, New Zealand and Scotland, investigating concerns about the increasing impacts of Artificial Intelligence in IT Sector workplaces for employee work engagement [11,13,1] and the implications for tertiary study, assessment and curricula in computing [4, 8, 10, 9].

    “Work engagement” has been defined as the positive inner state where employees are fully present and engaged in their work; it is closely linked to motivation, learning, productivity, and accountability [11, 13]. Within the context of (Generative) AI at work, IT professionals have been noted as early adopters of AI [10, 1]. Their involvement in implementing and utilising AI technologies can provide valuable insights into the interplay between AI and work engagement. The implications are significant for students as future IT professionals, who must acquire and enhance competencies to adapt and thrive in digital workplaces.

    2 Goals of the Working Group

    By exploring the relationship between work engagement and learning, this study aims to shed light on the dynamics that drive employee engagement and its connection to the professional development of competencies. The previous study interviewed IT professionals using the following research questions (RQ):

    RQ1: How does AI influence work engagement for IT professionals?

    RQ2: How does AI affect the socio-technical work dynamics for IT professionals?

    RQ3: What are the implications of integrating AI on the acquisition and enhancement of professional competencies and the learning processes of IT professionals?

    3 Methodology

    This working group aims to analyse the corpus of interview data collected from multiple countries to better understand the implications for computing students, tertiary computing education curricula and assessment of the new professional competencies emerging from this work. This study, informed by the literature on work engagement, automation and motivation for IT professionals [11, 13], will use a combination of multi-vocal literature review [7] and qualitative research methods [2, 5], including thematic analysis of the interviews, to investigate the state of the practice and the challenges IT professionals face within their local/global work contexts. The literature on professional competencies in computing [4, 3, 6] will be drawn upon to characterise the new needs identified in this analysis. Further implications for computing curricula design and assessment will be developed from this analysis.

    REFERENCES

    [1] ACM Technology Policy Council. 2023. Principles for the development, deployment, and use of generative AI technologies, ACM New York.

    [2] Braun, V. and Clarke, V. 2021. One size fits all? What counts as quality practice in (reflexive) thematic analysis? Qualitative research in psychology, 18 (3). 328-352.

    [3] Clear, A., Clear, T., Vichare, A., Charles, T., Frezza, S., Gutica, M., Lunt, B., Maiorana, F., Pears, A. and Pitt, F. 2020. Designing Computer Science Competency Statements: A Process and Curriculum Model for the 21st Century in Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education, ACM, New York.

    [4] Clear, A., Parrish, A. and CC2020 Task Force. 2020. Computing Curricula 2020 - CC2020 - Paradigms for Future Computing Curricula ACM and IEEE-CS eds. A Computing Curricula Series Report ACM, New York.

    [5] Cruzes, D.S. and Dyba, T. 2011. Recommended steps for thematic synthesis in software engineering. in 2011 international symposium on empirical software engineering and measurement, IEEE, 2011, 275-284.

    [6] Frezza, S., Clear, T. and Clear, A. 2020. Unpacking Dispositions in the CC2020 Computing Curriculum Overview Report in 2020 IEEE Frontiers in Education Conference (FIE), IEEE, Uppsala, Sweden.

    [7] Garousi, V., Felderer, M., & Mäntylä, M. V. 2019. Guidelines for including grey literature and conducting multivocal literature reviews in software engineering. Information and Software Technology, 106. 101-121

    [8] Jacques, L. 2023. Teaching CS-101 at the Dawn of ChatGPT. ACM Inroads, 14 (2). 40-46.

    [9] Liffiton, M., Sheese, B., Savelka, J. and Denny, P. 2023. CodeHelp: Using Large Language Models with Guardrails for Scalable Support in Programming Classes. arXiv preprint arXiv:2308.06921.

    [10] Prather, J., Denny, P., Leinonen, J., Becker, B.A., Albluwi, I., Craig, M., Keuning, H., Kiesler, N., Kohn, T. and Luxton-Reilly, A. 2023. The robots are here: Navigating the generative ai revolution in computing education. arXiv preprint arXiv:2310.00658.

    [11] Roto, V., Palanque, P. and Karvonen, H., 2019. Engaging automation at work–a literature review. in Human Work Interaction Design. Designing Engaging Automation: 5th IFIP WG 13.6 Working Conference, HWID 2018, Espoo, Finland, August 20-21, 2018, Revised Selected Papers 5, Springer, 158-172.

    [12] SFIA Foundation. 2023. SFIA skills aligned to EU ICT Profiles, SFIA Institute, London.

    [13] Sharp, H., Baddoo, N., Beecham, S., Hall, T. and Robinson, H. 2009. Models of motivation in software engineering. Information and software technology, 51 (1). 219-233.

  16. Erroneous Payments In Childcare Centers Study (EPICCS)

    • agdatacommons.nal.usda.gov
    bin
    Updated Jan 22, 2025
    Cite
    USDA FNS Office of Policy Support (2025). Erroneous Payments In Childcare Centers Study (EPICCS) [Dataset]. http://doi.org/10.15482/USDA.ADC/28256033.v1
    Explore at:
    bin
    Dataset updated
    Jan 22, 2025
    Dataset provided by
    Food and Nutrition Service (https://www.fns.usda.gov/)
    United States Department of Agriculture (http://usda.gov/)
    Authors
    USDA FNS Office of Policy Support
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Title: Erroneous Payments In Childcare Centers Study (EPICCS)
    URL: These data are not published elsewhere. The public webpage that provides information about the study is https://www.fns.usda.gov/research/cacfp/epiccs.
    Description of the experiment setting (location, influential climatic conditions, controlled conditions, e.g. temperature, light cycle): These data are from the Erroneous Payments In Childcare Centers Study (EPICCS) and the EPICCS Adjustment study. The study methods are summarized at https://www.fns.usda.gov/research/cacfp/epiccs. These studies examined payment errors and compliance with USDA meal pattern requirements for breakfast and lunch meals in a nationally representative sample of CACFP child care centers serving 2- to 5-year-old children in 2017. The studies included sponsored centers, independent centers, and Head Start centers. The studies did not include infant care centers, family day care homes, at-risk afterschool centers, outside school hours centers, adult care centers, or emergency and homeless shelters.
    Processing methods and equipment used: N/A
    Study date(s) and duration: EPICCS data were collected between March 2017 and March 2018. EPICCS Adjustment data were collected between April and June 2023 but are about 2017.
    Study spatial scale (size of replicates and spatial scale of study area): The intended study universe was CACFP child care centers serving 2- to 5-year-old children. The study data (for the study sample) are accurate. However, due to a weighting error, the national estimates are inaccurate. Please read our disclaimer about the accuracy of some of these data, and a summary of the study methods, at https://www.fns.usda.gov/research/cacfp/epiccs.
    Level of true replication: N/A
    Sampling precision (within-replicate sampling or pseudoreplication): EPICCS used a multistage clustered sample design as follows: (1) a representative sample of 25 States in the contiguous 48 States and the District of Columbia; (2) a representative sample of 450 childcare centers (and their sponsors) within Primary Sampling Units (PSUs); and (3) a random sample of 5,400 households (with children ages 2-5 years) enrolled in the sampled childcare centers who applied for free and reduced-price meals or were categorically eligible for free meals.
    Level of subsampling (number and repeat or within-replicate sampling):
    Study design (before–after, control–impacts, time series, before–after-control–impacts): Cross-sectional
    Description of any data manipulation, modeling, or statistical analysis undertaken: The EPICCS data were analyzed to produce sample-level descriptive statistics about sources of payment error, sources of noncompliance with USDA meal pattern requirements, and error rates. EPICCS Adjustment data (about the sampled centers’ meal counting methods) were used to re-analyze certification error (a source of payment error). Please read a summary of the study methods at https://www.fns.usda.gov/research/cacfp/epiccs. The sample-level data were weighted to produce national-level estimates, but due to a weighting error these estimates are inaccurate. To preserve the confidentiality of respondents, variables that could be used to identify sponsors, centers, or households were suppressed in the public-use files.
    Description of any gaps in the data or other limiting factors: Please read our disclaimer about the accuracy of some of these data at https://www.fns.usda.gov/research/cacfp/epiccs. EPICCS was sampled on accurate criteria. However, the EPICCS and EPICCS Adjustment study final post-stratification weights were raked to inaccurate population parameters using FNS keydata. FNS keydata for CACFP child care centers include emergency shelters and at-risk afterschool programs in addition to child care centers (see the instructions of form FNS-44, https://www.fns.usda.gov/cacfp/fns-44). As a result, the population parameters that the weights were raked to were significantly different than the actual universe being analyzed in EPICCS. Because of the way CACFP data are reported, at-risk afterschool programs were all included in the population as “non-Head Start, not-for-profit outlets”, leading to a severe over-weighting of these types of outlets (and children attending them) and under-weighting of Head Start and for-profit outlets. This is particularly notable because certification error is not a source of payment error in Head Start centers; children are categorically eligible for free meals, so FNS reimburses at the free rate without the need for Head Start centers to certify applications. As a result, the calculations using the incorrect weights very likely overestimated certification error and resulting payment errors. FNS chose not to publish these estimates in keeping with USDA scientific integrity guidelines. These weights are left in these data files for transparency. In the EPICCS Adjustment data, significant imputation of claiming methods was required due to non-response to the survey. More information is provided in the technical manual accompanying the data files.
    Outcome measurement methods and equipment used: Data collection instruments included surveys of sampled sponsors and centers, meal observation booklets, abstraction of applications and meal counts, and household interviews.
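    The raking problem described above can be illustrated with a toy one-dimensional post-stratification: base weights are scaled so weighted counts match control totals per stratum, and if the control totals describe the wrong universe, every weight in the affected strata is scaled by the wrong factor. The function and all numbers below are invented for illustration and are unrelated to the actual EPICCS weights:

    ```python
    import pandas as pd

    def poststratify(sample: pd.DataFrame, pop_totals: dict) -> pd.Series:
        """Scale base weights so the weighted count in each stratum
        matches its control total (one-dimensional post-stratification)."""
        weighted = sample.groupby("stratum")["weight"].sum()
        factors = {s: pop_totals[s] / weighted[s] for s in weighted.index}
        return sample["weight"] * sample["stratum"].map(factors)

    # Hypothetical sample: two strata, base weight 10 per center
    sample = pd.DataFrame({
        "stratum": ["head_start"] * 3 + ["non_profit"] * 3,
        "weight": [10.0] * 6,
    })

    # Correct control totals vs. totals that (like the FNS keydata issue)
    # inflate the non-profit stratum at the expense of Head Start
    correct = poststratify(sample, {"head_start": 60, "non_profit": 60})
    inflated = poststratify(sample, {"head_start": 30, "non_profit": 90})
    ```

    Both weight sets sum to the same population size, but the second shifts all the mass toward the over-counted stratum, which is exactly how a raking error biases downstream estimates.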

  17. Data Analysis Application Solution Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 23, 2025
    Cite
    Data Insights Market (2025). Data Analysis Application Solution Report [Dataset]. https://www.datainsightsmarket.com/reports/data-analysis-application-solution-1439900
    Explore at:
    pdf, ppt, doc
    Dataset updated
    May 23, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Analysis Application Solution market is experiencing robust growth, driven by the increasing volume and complexity of data generated across industries. The market, estimated at $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching an estimated $45 billion by 2033. This expansion is fueled by several key factors, including the rising adoption of cloud-based solutions offering scalability and cost-effectiveness, the growing need for real-time data analytics to support faster decision-making, and the increasing demand for advanced analytics techniques like machine learning and AI to extract deeper insights from data.

    Furthermore, the market is segmented by deployment (cloud, on-premise), application (business intelligence, data visualization, predictive analytics), and industry (BFSI, healthcare, retail, manufacturing). The competitive landscape is dynamic, with established players like SAP, Microsoft, and Qlik alongside emerging innovative companies like BigID and Collibra vying for market share through continuous product development and strategic partnerships.

    The major restraints on market growth include the high initial investment costs associated with implementing data analysis solutions, the need for skilled professionals to manage and interpret the data, and concerns around data security and privacy. However, these challenges are being addressed by the development of user-friendly interfaces, affordable cloud-based options, and enhanced data security measures. The market is also witnessing several trends, such as the increasing adoption of self-service analytics tools, empowering business users to perform their own data analysis, and the growing integration of data analysis solutions with other business applications to streamline workflows.

    The geographical distribution of the market reflects a strong presence in North America and Europe, with significant growth potential in emerging markets like Asia-Pacific. The presence of companies like Sterlite Technologies and Aparavi indicates a growing focus on the development of specialized data analytics applications targeting niche market segments.

  18. WIC Infant and Toddler Feeding Practices Study-2 (WIC ITFPS-2): Prenatal, Infant Year 5 Year Datasets

    • agdatacommons.nal.usda.gov
    txt
    Updated Oct 28, 2024
    Cite
    USDA FNS Office of Policy Support (2024). WIC Infant and Toddler Feeding Practices Study-2 (WIC ITFPS-2): Prenatal, Infant Year 5 Year Datasets [Dataset]. http://doi.org/10.15482/USDA.ADC/1528196
    Explore at:
    txt
    Dataset updated
    Oct 28, 2024
    Dataset provided by
    Food and Nutrition Service (https://www.fns.usda.gov/)
    United States Department of Agriculture (http://usda.gov/)
    Authors
    USDA FNS Office of Policy Support
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    The WIC Infant and Toddler Feeding Practices Study–2 (WIC ITFPS-2) (also known as the “Feeding My Baby Study”) is a national, longitudinal study that captures data on caregivers and their children who participated in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) around the time of the child’s birth. The study addresses a series of research questions regarding feeding practices, the effect of WIC services on those practices, and the health and nutrition outcomes of children on WIC. Additionally, the study assesses changes in behaviors and trends that may have occurred over the past 20 years by comparing findings to the WIC Infant Feeding Practices Study–1 (WIC IFPS-1), the last major study of the diets of infants on WIC. This longitudinal cohort study has generated a series of reports. These datasets include data from caregivers and their children during the prenatal period and during the children’s first five years of life (child ages 1 to 60 months). A full description of the study design and data collection methods can be found in Chapter 1 of the Second Year Report (https://www.fns.usda.gov/wic/wic-infant-and-toddler-feeding-practices-st...). A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Processing methods and equipment used: Data in this dataset were primarily collected via telephone interview with caregivers. Children’s length/height and weight data were objectively collected while at the WIC clinic or during visits with healthcare providers. The study team cleaned the raw data to ensure the data were as correct, complete, and consistent as possible.
    Study date(s) and duration: Data collection occurred between 2013 and 2019.
    Study spatial scale (size of replicates and spatial scale of study area): Respondents were primarily the caregivers of children who received WIC services around the time of the child’s birth. Data were collected from 80 WIC sites across 27 State agencies.
    Level of true replication: Unknown
    Sampling precision (within-replicate sampling or pseudoreplication): This dataset includes sampling weights that can be applied to produce national estimates. A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Level of subsampling (number and repeat or within-replicate sampling): A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Study design (before–after, control–impacts, time series, before–after-control–impacts): Longitudinal cohort study.
    Description of any data manipulation, modeling, or statistical analysis undertaken: Each entry in the dataset contains caregiver-level responses to telephone interviews. Also available in the dataset are children’s length/height and weight data, which were objectively collected while at the WIC clinic or during visits with healthcare providers. In addition, the file contains derived variables used for analytic purposes. The file also includes weights created to produce national estimates. The dataset does not include any personally-identifiable information for the study children and/or for individuals who completed the telephone interviews.
    Description of any gaps in the data or other limiting factors: Please refer to the series of annual WIC ITFPS-2 reports (https://www.fns.usda.gov/wic/infant-and-toddler-feeding-practices-study-2-fourth-year-report) for detailed explanations of the study’s limitations.
    Outcome measurement methods and equipment used: The majority of outcomes were measured via telephone interviews with children’s caregivers. Dietary intake was assessed using the USDA Automated Multiple Pass Method (https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-h...). Children’s length/height and weight data were objectively collected while at the WIC clinic or during visits with healthcare providers.

    Resources in this dataset:
    •  ITFP2 Year 5 Enroll to 60 Months Public Use Data CSV (file: itfps2_enrollto60m_publicuse.csv)
    •  ITFP2 Year 5 Enroll to 60 Months Public Use Data Codebook (file: ITFPS2_EnrollTo60m_PUF_Codebook.pdf)
    •  ITFP2 Year 5 Enroll to 60 Months Public Use Data SAS SPSS STATA R Data (file: ITFP@_Year5_Enroll60_SAS_SPSS_STATA_R.zip)
    •  ITFP2 Year 5 Ana to 60 Months Public Use Data CSV (file: ampm_1to60_ana_publicuse.csv)
    •  ITFP2 Year 5 Tot to 60 Months Public Use Data Codebook (file: AMPM_1to60_Tot Codebook.pdf)
    •  ITFP2 Year 5 Ana to 60 Months Public Use Data Codebook (file: AMPM_1to60_Ana Codebook.pdf)
    •  ITFP2 Year 5 Ana to 60 Months Public Use Data SAS SPSS STATA R Data (file: ITFP@_Year5_Ana_60_SAS_SPSS_STATA_R.zip)
    •  ITFP2 Year 5 Tot to 60 Months Public Use Data CSV (file: ampm_1to60_tot_publicuse.csv)
    •  ITFP2 Year 5 Tot to 60 Months Public Use SAS SPSS STATA R Data (file: ITFP@_Year5_Tot_60_SAS_SPSS_STATA_R.zip)
    •  ITFP2 Year 5 Food Group to 60 Months Public Use Data CSV (file: ampm_foodgroup_1to60m_publicuse.csv)
    •  ITFP2 Year 5 Food Group to 60 Months Public Use Data Codebook (file: AMPM_FoodGroup_1to60m_Codebook.pdf)
    •  ITFP2 Year 5 Food Group to 60 Months Public Use SAS SPSS STATA R Data (file: ITFP@_Year5_Foodgroup_60_SAS_SPSS_STATA_R.zip)
    •  WIC Infant and Toddler Feeding Practices Study-2 Data File Training Manual (file: WIC_ITFPS-2_DataFileTrainingManual.pdf)

  19. sohamphanseiitb/BIG_Data_5MSEC: BIG Data Analysis of NASA's 5 Millennium Solar Eclipse Database

    • zenodo.org
    bin, pdf
    Updated Jul 15, 2024
    Cite
    Soham Phanse; Soham Phanse (2024). sohamphanseiitb/BIG_Data_5MSEC: BIG Data Analysis of NASA's 5 Millennium Solar Eclipse Database [Dataset]. http://doi.org/10.5281/zenodo.7409106
    Explore at:
    bin, pdf
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Soham Phanse; Soham Phanse
    Description

    Solar eclipses are a topic of interest among astronomers, astrologers, and the general public alike. There were, and will be, about 11,898 eclipses in the five millennia from 2000 BC to 3000 AD. Data visualization and regression techniques offer a deep insight into how various parameters of a solar eclipse are related to each other. Physical models can be verified and updated based on the insights gained from the analysis.

    The study covers the major aspects of data analysis, including data cleaning, pre-processing, exploratory data analysis (EDA), distribution fitting, regression, and machine-learning-based analytics. We provide a cleaned, usable database ready for EDA and statistical analysis.
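    The cleaning-and-regression workflow described above can be sketched roughly as follows. The field names ("gamma", "magnitude") and the sample records are hypothetical placeholders for illustration only, not columns or values from the actual 5MSEC database:

    ```python
    # Rough sketch of the cleaning + regression steps described above.
    # Field names ("gamma", "magnitude") and the sample records are
    # illustrative placeholders, not actual 5MSEC columns or values.

    def clean(rows):
        """Drop incomplete records and coerce the numeric fields."""
        out = []
        for r in rows:
            if r.get("gamma") is None or r.get("magnitude") is None:
                continue  # incomplete record: skip it
            out.append((float(r["gamma"]), float(r["magnitude"])))
        return out

    def fit_line(pairs):
        """Ordinary least-squares fit of y = a + b*x."""
        n = len(pairs)
        sx = sum(x for x, _ in pairs)
        sy = sum(y for _, y in pairs)
        sxx = sum(x * x for x, _ in pairs)
        sxy = sum(x * y for x, y in pairs)
        b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        a = (sy - b * sx) / n
        return a, b

    records = [
        {"gamma": 0.1, "magnitude": 1.05},
        {"gamma": 0.5, "magnitude": 0.95},
        {"gamma": None, "magnitude": 0.90},  # missing value, dropped by clean()
        {"gamma": 0.9, "magnitude": 0.80},
    ]
    a, b = fit_line(clean(records))
    ```

    The same pattern (filter incomplete rows, then fit) carries over directly to pandas or Mathematica on the full database.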

  20. m

    Data for "Direct and indirect Rod and Frame effect: A virtual reality study"...

    • data.mendeley.com
    Updated Feb 12, 2025
    Michał Adamski (2025). Data for "Direct and indirect Rod and Frame effect: A virtual reality study" [Dataset]. http://doi.org/10.17632/pcf2n8b4rd.1
    Explore at:
    Dataset updated
    Feb 12, 2025
    Authors
    Michał Adamski
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the raw experimental data and supplementary materials for the "Asymmetry Effects in Virtual Reality Rod and Frame Test". The materials included are:

    •  Raw Experimental Data: older.csv and young.csv
    •  Mathematica Notebooks: a collection of Mathematica notebooks used for data analysis and visualization. These notebooks provide scripts for processing the experimental data, performing statistical analyses, and generating the figures used in the project.
    •  Unity Package: a Unity package featuring a sample scene related to the project. The scene was built using Unity's Universal Render Pipeline (URP). To use this package, ensure that URP is enabled in your Unity project. Instructions for enabling URP can be found in the Unity URP Documentation.
    

    Requirements:

    •  For Data Files: software capable of opening CSV files (e.g., Microsoft Excel, Google Sheets, or any programming language that can read CSV formats).
    •  For Mathematica Notebooks: Wolfram Mathematica software to run and modify the notebooks.
    •  For Unity Package: Unity Editor version compatible with URP (2019.3 or later recommended). URP must be installed and enabled in your Unity project.
    

    Usage Notes:

    •  The dataset facilitates comparative studies between different age groups based on the collected variables.
    •  Users can modify the Mathematica notebooks to perform additional analyses.
    •  The Unity scene serves as a reference to the project setup and can be expanded or integrated into larger projects.
    
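    As a minimal sketch of such a comparative analysis between the two age-group files, one might compute a per-group mean of a response variable. The column name "tilt_error_deg" and the inline sample rows are hypothetical; the real headers are those found in older.csv and young.csv:

    ```python
    # Minimal sketch of a per-age-group comparison. In practice the CSV
    # text would come from open("older.csv") / open("young.csv"); tiny
    # inline samples are used here so the snippet is self-contained.
    # The column name "tilt_error_deg" is hypothetical.
    import csv
    import io
    import statistics

    def mean_of_column(csv_text, column):
        """Parse CSV text and return the mean of one numeric column."""
        reader = csv.DictReader(io.StringIO(csv_text))
        return statistics.mean(float(row[column]) for row in reader)

    older_sample = "participant,tilt_error_deg\n1,4.2\n2,5.0\n3,4.6\n"
    young_sample = "participant,tilt_error_deg\n1,2.1\n2,2.7\n3,2.4\n"

    older_mean = mean_of_column(older_sample, "tilt_error_deg")
    young_mean = mean_of_column(young_sample, "tilt_error_deg")
    ```

    The same per-column aggregation extends to any of the collected variables before moving on to formal statistical tests in the Mathematica notebooks.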

    Citation: Please cite this dataset when using it in your research or publications.
