Background: In Brazil, secondary data for epidemiology are widely available. However, they are often insufficiently prepared for research use, even when structured, because they were typically collected for other purposes. To date, few publications focus on the process of preparing secondary data. The present findings can help orient future research projects based on secondary data. Objective: To describe the steps in the process of ensuring the adequacy of a secondary data set for a specific use and to identify the challenges of this process. Methods: This qualitative study reports methodological issues in secondary data use. The study material comprised 6,059,454 live birth records and 73,735 infant death records from 2004 to 2013 of children whose mothers resided in the State of São Paulo, Brazil. The procedures to ensure data adequacy, and the challenges involved, were organized into 6 steps: (1) problem understanding, (2) resource planning, (3) data understanding, (4) data preparation, (5) data validation and (6) data distribution. For each step, the procedures, the challenges encountered, the actions taken to cope with them and the partial results were described. To identify the most labor-intensive tasks in this process, the steps were assessed by summing the number of procedures, challenges, and coping actions; the highest totals were taken to indicate the most critical steps. Results: In total, 22 procedures and 23 actions were needed to deal with the 27 challenges encountered in ensuring the adequacy of the study material for the intended use. The final product was an organized database suitable for a historical cohort study. Data understanding and data preparation were identified as the most critical steps, accounting for about 70% of the challenges observed. Conclusion: Significant challenges were encountered in ensuring the adequacy of secondary health data for research use, mainly in the data understanding and data preparation steps. Using the described steps to approach structured secondary data, and knowing the potential challenges along the way, may contribute to planning health research.
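To illustrate the step assessment described in the Methods above (summing procedures, challenges, and coping actions per step and treating the largest totals as the most critical steps), a minimal Python sketch follows. The per-step counts are hypothetical placeholders, not figures from the study; only the six step names and the idea of ranking by summed counts come from the abstract.

```python
# Illustrative only: tally procedures, challenges, and coping actions per step
# and rank steps by the total, mirroring the assessment described above.
# The counts below are hypothetical placeholders, not values from the study.
from collections import namedtuple

Step = namedtuple("Step", ["name", "procedures", "challenges", "actions"])

steps = [
    Step("problem understanding", 3, 2, 2),   # hypothetical counts
    Step("resource planning",     2, 2, 2),
    Step("data understanding",    6, 10, 8),
    Step("data preparation",      7, 9, 8),
    Step("data validation",       2, 2, 2),
    Step("data distribution",     2, 2, 1),
]

# Sort steps by their summed workload; the largest totals flag the critical steps.
ranked = sorted(steps, key=lambda s: s.procedures + s.challenges + s.actions, reverse=True)
for s in ranked:
    total = s.procedures + s.challenges + s.actions
    print(f"{s.name}: {total} (procedures={s.procedures}, challenges={s.challenges}, actions={s.actions})")
```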
https://data.aussda.at/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11587/RWXEAD
Full edition for scientific use. The survey study “Mapping the Field II” was conducted in 2024 and 2025 by AUSSDA – The Austrian Social Science Data Archive. The aim of the survey was to provide a detailed insight into the social sciences at Austrian universities and to capture attitudes towards data sharing and data management.
We analysed the Understanding Society data from Waves 1 and 2 to explore the uses of paradata in cross-sectional and longitudinal surveys, with the aim of gaining knowledge that leads to improvements in field process management and responsive survey designs. The research was organised into three sub-projects, which: 1. investigate the use of call record data and interviewer observations to study nonresponse in longitudinal surveys; 2. provide insights into the effects of interviewing strategies and other interviewer attributes on response in longitudinal surveys; and 3. gain knowledge about the measurement error properties of paradata, in particular interviewer observations. Analysis techniques included multilevel, discrete-time event history and longitudinal data analysis methods. Dissemination included a short course and an international workshop on paradata.
Only secondary data were used for this study on the impact of Hurricane Katrina on Southern Louisiana. The data sets include land-cover data for Louisiana, social and economic variables for New Orleans, and avian species abundance data, gathered from the NOAA Coastal Change Analysis Program, the US Census and the USGS North American Breeding Bird Survey, respectively. This dataset is associated with the following publication: Chuang, W., T. Eason, A. Garmestani, and C. Roberts. Impact of Hurricane Katrina on the Coastal Systems of Southern Louisiana. Frontiers in Environmental Science. Frontiers, Lausanne, SWITZERLAND, 7(68): 01-15, (2019).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The rapidly expanding direct-to-consumer genetic testing (DTC GT) market is one area where narratives of underrepresented populations have not been explored extensively. This study describes African-American consumers’ personal experiences with and perceptions about DTC GT and explores similarities and differences between African-Americans and an earlier cohort of mostly European American consumers. Methods: Twenty semi-structured, qualitative interviews were held with individuals who self-identified as Black/African-American and completed DTC GT between February 2017 and February 2020. Interviews were transcribed and consensus-coded, using inductive content analysis. Results: Participants generally had positive regard for DTC GT. When considering secondary uses of their results or samples, most participants were aware this was a possibility but had little concrete knowledge about company practices. When prompted about potential uses, participants were generally comfortable with research uses but had mixed outlooks on other nonresearch uses such as law enforcement, cloning, and product development. Most participants expressed that consent should be required for any secondary use, with the option to opt out. The most common suggestion for companies was to improve transparency. Compared to European American participants, African-American participants expressed more trust in DTC GT companies compared to healthcare providers, more concerns about law enforcement uses of data, and a stronger expression of community considerations. Discussion/Conclusion: This study found that African-American consumers of DTC GT had a positive outlook about genetic testing and were open to research and some nonresearch uses, provided that they were able to give informed consent. Participants in this study had little knowledge of company practices regarding secondary uses. Compared to an earlier cohort of European American participants, African-American participants expressed more concerns about medical and law enforcement communities’ use of data and more reference to community engagement.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw usage data of the software "Datenhotel" as presented in "Unlocking the potential of secondary data for public health research: A retrospective study with a novel clinical platform"
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Per Erik Strandberg [1], Philipp Peterseil [2], Julian Karoliny [3], Johanna Kallio [4], and Johannes Peltola [4].
[1] Westermo Network Technologies AB (Sweden).
[2] Johannes Kepler University Linz (Austria).
[3] Silicon Austria Labs GmbH (Austria).
[4] VTT Technical Research Centre of Finland Ltd. (Finland).
This data accompanies a paper submitted to Elsevier's Data in Brief in 2024, titled "Insights from Publishing Open Data in Industry-Academia Collaboration".
Tentative Abstract: Effective data management and sharing are critical success factors in industry-academia collaboration. This paper explores the motivations and lessons learned from publishing open data sets in such collaborations. Through a survey of participants in a European research project that published 13 data sets, and an analysis of metadata from almost 281 thousand datasets in Zenodo, we collected qualitative and quantitative results on motivations, achievements, research questions, licences and file types. Through inductive reasoning and statistical analysis we found that planning the data collection is essential, and that only a few datasets (2.4%) had accompanying scripts for improved reuse. We also found that authors are not well aware of the importance of licences or which licence to choose. Finally, we found that data with a synthetic origin, collected with simulations and potentially mixed with real measurements, can be very meaningful, as predicted by Gartner and illustrated by many datasets collected in our research project.
The file survey.txt contains secondary data from a survey of participants who published open data sets in the 3-year European research project InSecTT.
The file secondary_data_zenodo.json contains secondary data from an analysis of data sets published in Zenodo. It is accompanied by a .py file and an .ipynb file that serve as examples.
This data is licenced with the Creative Commons Attribution 4.0 International license. You are free to use the data if you attribute the authors. Read the license text for details.
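The entry above names two data files (survey.txt and secondary_data_zenodo.json) plus accompanying .py and .ipynb examples. As a hedged illustration only, the sketch below shows one way such files might be loaded; the files' internal structure is not documented here, so the assumed formats (a JSON document and a plain-text survey export) and all variable names are assumptions, and the authors' own example scripts should be preferred.

```python
# Illustrative loading sketch; file structures are assumed, not documented here.
import json

# secondary_data_zenodo.json: assumed to be a JSON document of Zenodo metadata records
with open("secondary_data_zenodo.json", encoding="utf-8") as f:
    zenodo_metadata = json.load(f)
print(type(zenodo_metadata), len(zenodo_metadata))

# survey.txt: assumed to be a plain-text export, one response line per record
with open("survey.txt", encoding="utf-8") as f:
    survey_lines = [line.rstrip("\n") for line in f if line.strip()]
print(f"{len(survey_lines)} non-empty survey lines")
```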
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the research data from the publication "Physicians' Attitudes toward Secondary Use of Clinical Data for Biomedical Research Purposes. Results of a Quantitative Survey." The research data includes the questionnaire, the data set used, and a description of the data preparation.
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Yearly citation counts for the publication titled "Restraint use as a quality indicator for the hospital setting: a secondary data analysis".
This study contains script files to create teaching versions of Understanding Society: Waves 1-3, the new UK household panel survey. Specifically, the user can focus on individual waves, or can create a panel survey dataset for use in teaching undergraduates and postgraduates. Core areas of focus are attitudes to voting and political parties, to the environment, and to ethnicity and migration. Script files are available for SPSS, STATA and R. Individuals wishing to make use of this resource will need to apply separately to the UK Data Archive for access to the original datasets: http://discover.ukdataservice.ac.uk/catalogue/?sn=6614&type=Data%20catalogue
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Regulation for the European Health Data Space (EHDS) aims to address the fragmented health data landscape across Europe by promoting ethical and responsible reuse of data, seeking to balance the opportunities for data reuse with the risks it entails. However, the techno-legal aspects of navigating this balance remain poorly understood. This study adopts a qualitative and inductive approach, using semi-structured interviews to explore the risks, challenges, and gaps in the implementation of privacy-enhancing technologies (PETs) within the EHDS, particularly in the context of its governance structure and data permits for secondary data use. The findings identify five distinct categories of concerns, based on fourteen risks, and highlight seven governance and technological solutions, illustrating how these solutions address multiple, often correlated risks. The interdependence between concerns and solutions emphasises the need for a strategic and integrated approach to both governance and technology. This mapping between the risks and solutions also highlights the central role of certain solutions, such as public engagement and awareness, in addressing multiple risks. Furthermore, it introduces a new dimension to the concerns by focusing on the structural imbalances in access to the health data economy. We conclude by proposing a research agenda to advance the integration of PETs into the EHDS framework, ensuring that data permits can effectively facilitate secure, ethical, and innovative health data use.
Title of data: CTE & SAE Data Inventory. Description of data: List of common data elements in clinical trials with domain, availability/completeness, occurrence in trials, semantic codes and definition. (XLSX 34 kb)
The Secondary Uses Service (SUS+) is a collection of health care data required by hospitals and used for planning health care, supporting payments, commissioning, policy development and research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BACKGROUND
An understanding of the resources which engineering students use to write their academic papers provides information about student behaviour as well as the effectiveness of information literacy programs designed for engineering students. One of the most informative sources of information which can be used to determine the nature of the material that students use is the bibliography at the end of the students' papers. While reference list analysis has been utilised in other disciplines, few studies have focussed on engineering students or used the results to improve the effectiveness of information literacy programs. Gadd, Baldwin and Norris (2010) found that civil engineering students undertaking a final-year research project cited journal articles more than other types of material, followed by books and reports, with web sites ranked fourth. Several studies, however, have shown that in their first year at least, most students prefer to use Internet search engines (Ellis & Salisbury, 2004; Wilkes & Gurney, 2009).

PURPOSE
The aim of this study was to find out exactly what resources undergraduate students studying civil engineering at La Trobe University were using and, in particular, the extent to which students were utilising the scholarly resources paid for by the Library. A secondary purpose of the research was to ascertain whether information literacy sessions delivered to those students had any influence on the resources used, and to investigate ways in which the information literacy component of the unit can be improved to encourage students to make better use of the resources purchased by the Library to support their research.

DESIGN/METHOD
The study examined student bibliographies for three civil engineering group projects at the Bendigo Campus of La Trobe University over a two-year period, including two first-year units (CIV1EP – Engineering Practice) and one second-year unit (CIV2GR – Engineering Group Research). All units included a mandatory library session at the start of the project where student groups were required to meet with the relevant faculty librarian for guidance. In each case, the Faculty Librarian highlighted specific resources relevant to the topic, including books, e-books, video recordings, websites and internet documents. The students were also shown tips for searching the Library catalogue, Google Scholar, LibSearch (the LTU Library's research and discovery tool) and ProQuest Central. Subject-specific databases for civil engineering and science were also referred to. After the final reports for each project had been submitted and assessed, the Faculty Librarian contacted the lecturer responsible for the unit, requesting copies of the student bibliographies for each group. References from each bibliography were then entered into EndNote. The Faculty Librarian grouped them according to various facets, including the name of the unit and the group within the unit; the material type of the item being referenced; and whether the item required a Library subscription to access it. A total of 58 references were collated for the 2010 CIV1EP unit, 237 references for the 2010 CIV2GR unit, and 225 references for the 2011 CIV1EP unit.

INTERIM FINDINGS
The initial findings showed that student bibliographies for the three group projects were primarily made up of freely available internet resources which required no library subscription. For the 2010 CIV1EP unit, all 58 resources used were freely available on the Internet. For the 2011 CIV1EP unit, 28 of the 225 resources used (12.44%) required a Library subscription or purchase for access, while the second-year students (CIV2GR) used a greater variety of resources, with 71 of the 237 resources used (29.96%) requiring a Library subscription or purchase for access. The results suggest that the library sessions had little or no influence on the 2010 CIV1EP group, but the sessions may have assisted students in the 2011 CIV1EP and 2010 CIV2GR groups to find books, journal articles and conference papers, which were all represented in their bibliographies.

FURTHER RESEARCH
The next step in the research is to investigate ways to increase the representation of scholarly references (found by resources other than Google) in student bibliographies. It is anticipated that such a change would lead to an overall improvement in the quality of the student papers. One way of achieving this would be to make it mandatory for students to include a specified number of journal articles, conference papers, or scholarly books in their bibliographies. It is also anticipated that embedding La Trobe University's Inquiry/Research Quiz (IRQ) using a constructively aligned approach will further enhance the students' research skills and increase their ability to find suitable scholarly material which relates to their topic. This has already been done successfully (Salisbury, Yager, & Kirkman, 2012).

CONCLUSIONS & CHALLENGES
The study shows that most students rely heavily on the free Internet for information. Students do not naturally use Library databases or scholarly resources such as Google Scholar to find information without encouragement from their teachers, tutors and/or librarians. It is acknowledged that the use of scholarly resources does not automatically lead to a high-quality paper. Resources must be used appropriately, and students also need the skills to identify and synthesise key findings in the existing literature and relate these to their own paper. Ideally, students should be able to see the benefit of using scholarly resources in their papers and continue to seek these out even when it is not a specific assessment requirement, though it cannot be assumed that this will be the outcome.

REFERENCES
Ellis, J., & Salisbury, F. (2004). Information literacy milestones: building upon the prior knowledge of first-year students. Australian Library Journal, 53(4), 383-396.
Gadd, E., Baldwin, A., & Norris, M. (2010). The citation behaviour of civil engineering students. Journal of Information Literacy, 4(2), 37-49.
Salisbury, F., Yager, Z., & Kirkman, L. (2012). Embedding Inquiry/Research: Moving from a minimalist model to constructive alignment. Paper presented at the 15th International First Year in Higher Education Conference, Brisbane. Retrieved from http://www.fyhe.com.au/past_papers/papers12/Papers/11A.pdf
Wilkes, J., & Gurney, L. J. (2009). Perceptions and applications of information literacy by first year applied science students. Australian Academic & Research Libraries, 40(3), 159-171.
https://www.gesis.org/en/institute/data-usage-terms
Data from online survey among authors of the social sciences using social media data for their research and having published journal articles based on social media data between 2018 and 2021. The questionnaire consists of several closed and open-ended questions in seven main sections: a) data acquisition and use of secondary data, b) past data sharing behaviour, c) data sharing intentions, d) data documentation, e) use of other forms of data, f) personality and g) demography. The questions to measure factors that influence researchers’ data sharing decisions were designed using the Theory of Planned Behavior (Icek Ajzen).
This data dictionary describes relevant fields from secondary data sources that can assist with modeling the conditions of use for a chemical when performing a chemical assessment. Information on how to access the secondary data sources is included. This dataset is associated with the following publication: Chea, J.D., D.E. Meyer, R.L. Smith, S. Takkellapati, and G.J. Ruiz-Mercado. Exploring automated tracking of chemicals through their conditions of use to support life cycle chemical assessment. JOURNAL OF INDUSTRIAL ECOLOGY. Berkeley Electronic Press, Berkeley, CA, USA, 29(2): 413-616, (2025).
i. .\File_Mapping.csv: This file relates historical reconstructed hydrology streamflow from the U.S. Army Corps of Engineers (2020) to the appropriate stochastic streamflow file for disaggregation of streamflow. Column A is an assigned ID; column B is named "Stochastic" and is the stochastic streamflow file needed for disaggregation; column C is named "RH_Ratio_Col" and is the name of the column in the reconstructed hydrology dataset associated with a stochastic streamflow file; and column D is named "Col_Num" and is the column number in the reconstructed hydrology dataset with the name given in column C.
ii. .\Original_Draw_YearDat.csv: This file contains the historical year from 1930 to 2017 with the closest total streamflow for the Souris River Basin to each year in the stochastic streamflow dataset. Column A is an index number; column B is named "V1" and is the year in a simulation; column C is named "V2" and is the stochastic simulation number; column D is an integer that can be related to historical years by adding 1929; and column E is named "year" and is the historical year with the closest total Souris River Basin streamflow volume to the associated year in the stochastic traces.
iii. .\revdrawyr.csv: This file is set up the same way as .\Original_Draw_YearDat.csv, except that, when a year had over 400 occurrences, it was randomly replaced with one of the 20 other closest years. The replacement process was repeated until there were fewer than 400 occurrences of each reconstructed hydrology year associated with stochastic simulation years. Column A is an index number; column B is named "V1" and is the year in a simulation; column C is named "V2" and is the stochastic simulation number; column D is named "V3" and is the historical year whose streamflow ratios will be multiplied by stochastic streamflow; and column E is named "Stoch_yr" and is the total of 2999 and the year in column B.
iv. .\RH_1930_2017.csv: This file contains the daily streamflow from the U.S. Army Corps of Engineers (2020) reconstructed hydrology for the Souris River Basin for the period 1930 to 2017. Column A is the date, and columns B through AA are the daily streamflow in cubic feet per second.
v. .\rhmoflow_1930Present.csv: This file was created from .\RH_1930_2017.csv and provides streamflow for each site in cubic meters for a given month. Column A is an unnamed index column; column B is the historical year; column C is the historical month associated with the historical year; column D provides a day equal to 1 but has no particular significance; and columns E through AD are monthly streamflow volume for each site location.
vi. .\Stoch_Annual_TotVol_CubicDecameters.csv: This file contains the total volume of streamflow for each of the 26 sites for each month in the stochastic streamflow timeseries and provides a total streamflow volume divided by 100,000 on a monthly basis for the entire Souris River Basin. Column A is unnamed and contains an index number; column B is named "V1" and is the month; column C is the year in a simulation; column D is the simulation number; columns E through AD (V4 through V29) are streamflow volume in cubic meters; and column AE (V30) is total Souris River Basin monthly streamflow volume in cubic decameters/1,000.
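As a brief reuse illustration, the pandas sketch below shows how File_Mapping.csv could be used to pull, for each stochastic streamflow file, the matching reconstructed-hydrology column from RH_1930_2017.csv. The column names "Stochastic" and "RH_Ratio_Col" are taken from the description above; the assumption that both CSVs carry header rows matching those names is ours, not stated in the dataset documentation.

```python
# Minimal sketch (assumptions noted above): relate each stochastic streamflow
# file to its reconstructed-hydrology column via File_Mapping.csv, then pull
# that column from RH_1930_2017.csv.
import pandas as pd

mapping = pd.read_csv("File_Mapping.csv")               # expected columns: ID, Stochastic, RH_Ratio_Col, Col_Num
rh = pd.read_csv("RH_1930_2017.csv", parse_dates=[0])   # column A: date; B..AA: daily flow (cfs)

for _, row in mapping.iterrows():
    stochastic_file = row["Stochastic"]
    rh_column = row["RH_Ratio_Col"]
    if rh_column in rh.columns:
        flows = rh[rh_column]                            # daily reconstructed flows for this site
        print(stochastic_file, rh_column, round(flows.mean(), 1))
```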
The 2010 NEDS is similar to the 2004 Nigeria DHS EdData Survey (NDES) in that it was designed to provide information on education for children age 4–16, focusing on factors influencing household decisions about children's schooling. The survey gathers information on adult educational attainment, children's characteristics and rates of school attendance, absenteeism among primary school pupils and secondary school students, household expenditures on schooling and other contributions to schooling, and parents'/guardians' perceptions of schooling, among other topics. The 2010 NEDS was linked to the 2008 Nigeria Demographic and Health Survey (NDHS) in order to collect additional education data on a subset of the households (those with children age 2–14) surveyed in the 2008 NDHS. The 2008 NDHS, for which data collection was carried out from June to October 2008, was the fourth DHS conducted in Nigeria (previous surveys were implemented in 1990, 1999, and 2003).
The goal of the 2010 NEDS was to follow up with a subset of approximately 30,000 households from the 2008 NDHS survey. However, the 2008 NDHS sample shows that of the 34,070 households interviewed, only 20,823 had eligible children age 2–14. To make statistically significant observations at the State level, 1,700 children per State and the Federal Capital Territory (FCT) were needed. It was estimated that an additional 7,300 households would be required to meet the total number of eligible children needed. To bring the sample size up to the required target, additional households were screened and added to the overall sample. However, these households did not have the NDHS questionnaire administered. Thus, the two surveys were statistically linked to create some data used to produce the results presented in this report, but for some households, data were imputed or not included.
National
Households; Individuals
Sample survey data [ssd]
The eligible households for the 2010 NEDS are the same as those households in the 2008 NDHS sample for which interviews were completed and in which there is at least one child age 2-14, inclusive. In the 2008 NDHS, 34,070 households were successfully interviewed, and the goal here was to perform a follow-up NEDS on a subset of approximately 30,000 households. However, records from the 2008 NDHS sample showed that only 20,823 had children age 4-16. Therefore, to bring the sample size up to the required number of children, additional households were screened from the NDHS clusters.
The first step was to use the NDHS data to determine eligibility based on the presence of a child age 2-14. Second, based on a series of precision and power calculations, RTI determined that the final sample size should yield approximately 790 households per State to allow statistical significance for reporting at the State level, resulting in a total completed sample size of 790 × 37 = 29,230. This calculation was driven by desired estimates of precision, analytic goals, and available resources. To achieve the target number of households with completed interviews, we increased the final number of desired interviews to accommodate expected attrition factors such as unlocatable addresses, eligibility issues, and non-response or refusal. Third, to reach the target sample size, we selected additional samples from households that had been listed by NDHS but had not been sampled and visited for interviews. The final number of households with completed interviews was 26,934, slightly lower than the original target but sufficient to yield interview data for 71,567 children, well above the targeted number of 1,700 children per State.
Face-to-face [f2f]
The four questionnaires used in the 2004 Nigeria DHS EdData Survey (NDES)—1. Household Questionnaire, 2. Parent/Guardian Questionnaire, 3. Eligible Child Questionnaire, and 4. Independent Child Questionnaire—formed the basis for the 2010 NEDS questionnaires. These are all available in Appendix D of the survey report available under External Resources.
More than 90 percent of the questionnaires remained the same; where there was a clear justification or a need for a change in item formulation or a specific requirement for additional items, these were updated accordingly. A one-day workshop was convened with the NEDS Implementation Team and the NDES Advisory Committee to review the instruments and identify any needed revisions, additions, or deletions. Efforts were made to collect data to ease integration of the 2010 NEDS data into the FMOE's national education management information system. Instrument issues that were identified as being problematic in the 2004 NDES, as well as items identified as potentially confusing or difficult, were proposed for revision. Issues that USAID, DFID, FMOE, and other stakeholders identified as being essential but not included in the 2004 NDES questionnaires were proposed for incorporation into the 2010 NEDS instruments, with USAID serving as the final arbiter regarding questionnaire revisions and content.
General revisions accepted into the questionnaires included the following:
- A separation of all questions related to secondary education into junior secondary and senior secondary to reflect the UBE policy
- Administration of school-based questions for children identified as attending pre-school
- Inclusion of questions on disabilities of children and parents
- Additional questions on Islamic schooling
- Revision to the literacy question administration to assess English literacy for children attending school
- Some additional questions on delivery of UBE under the financial questions section
Upon completion of revisions to the English-language questionnaires, the instruments were translated and adapted by local translators into three languages—Hausa, Igbo, and Yoruba—and then back-translated into English to ensure accuracy of the translation. After the questionnaires were finalized, training materials used in the 2004 NDES and developed by Macro International, which included training guides, data collection manuals, and field observation materials, were reviewed. The materials were updated to reflect changes in the questionnaires. In addition, the procedures as described in the manuals and guides were carefully reviewed. Adjustments were made, where needed, based on experience with large-scale surveys and lessons learned from the 2004 NDES and the 2008 NDHS, to ensure the highest quality data capture.
Data processing for the 2010 NEDS occurred concurrently with data collection. Completed questionnaires were retrieved by the field coordinators/trainers and delivered to NPC in standard envelopes, labeled with the sample identification, team, and State name. The shipment also contained a written summary of any issues detected during the data collection process. The questionnaire administrators logged the receipt of the questionnaires, acknowledged the list of issues, and acted upon them if required. The editors performed an initial check on the questionnaires, performed any coding of open-ended questions (with possible assistance from the data entry operators), and left them available to be assigned to the data entry operators. The data entry operators entered the data into the system, with the support of the editors for erroneous or unclear data.
Experienced data entry personnel were recruited from those who had performed data entry activities for NPC on previous studies. The data entry teams were composed of data entry coordinators, supervisors and operators. Data entry coordinators oversaw the entire data entry process from programming and training to final data cleaning, made assignments, tracked progress, and ensured the quality and timeliness of the data entry process. Data entry supervisors were on hand at all times to ensure that proper procedures were followed and to help editors resolve any uncovered inconsistencies. The supervisors controlled incoming questionnaires, assigned batches of questionnaires to the data entry operators, and managed their progress. Approximately 30 clerks were recruited and trained as data entry operators to enter all completed questionnaires and to perform the secondary entry for data verification. Editors worked with the data entry operators to review information flagged as "erroneous" or "dubious" in the data entry process and provided follow up and resolution for those anomalies.
The data entry program developed for the 2004 NDES was revised to reflect the revisions in the 2010 NEDS questionnaire. The electronic data entry and reporting system enforced internal consistency checks and flagged inconsistencies.
A very high overall response rate of 97.9 percent was achieved with interviews completed in 26,934 households out of a total of 27,512 occupied households from the original sample of 28,624 households. The response rates did not vary significantly by urban–rural (98.5 percent versus 97.6 percent, respectively). The response rates for parent/guardians and children were even higher, and the rate for independent children was slightly lower than the overall sample rate, 97.4 percent. In all these cases, the urban/rural differences were negligible.
Estimates derived from a sample survey are affected by two types of errors: (1) non-sampling errors and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The use of routinely collected health data for secondary research purposes is increasingly recognised as a methodology that advances medical research, improves patient outcomes, and guides policy. This secondary data, as found in electronic medical records (EMRs), can be optimised through conversion into a uniform data structure to enable analysis alongside other comparable health metric datasets. This can be achieved with the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM), which employs a standardised vocabulary to facilitate systematic analysis across various observational databases. The concept behind the OMOP-CDM is the conversion of data into a common format through the harmonisation of terminologies, vocabularies, and coding schemes within a unique repository. The OMOP model enhances research capacity through the development of shared analytic and prediction techniques; pharmacovigilance for the active surveillance of drug safety; and 'validation' analyses across multiple institutions across Australia, the United States, Europe, and the Asia Pacific. In this research, we aim to investigate the use of the open-source OMOP-CDM in the PATRON primary care data repository.
Methods: We used standard structured query language (SQL) to construct extract, transform, and load scripts to convert the data to the OMOP-CDM. The process of mapping distinct free-text terms extracted from various EMRs presented a substantial challenge, as many terms could not be automatically matched to standard vocabularies through direct text comparison. This resulted in a number of terms that required manual assignment. To address this issue, we implemented a strategy where our clinical mappers were instructed to focus only on terms that appeared with sufficient frequency. We established a specific threshold value for each domain, ensuring that more than 95% of all records were linked to an approved vocabulary like SNOMED once appropriate mapping was completed. To assess the data quality of the resultant OMOP dataset, we utilised the OHDSI Data Quality Dashboard (DQD) to evaluate the plausibility, conformity, and comprehensiveness of the data in the PATRON repository according to the Kahn framework.
Results: Across three primary care EMR systems we converted data on 2.03 million active patients to version 5.4 of the OMOP common data model. The DQD assessment involved a total of 3,570 individual evaluations. Each evaluation compared the outcome against a predefined threshold. A 'FAIL' occurred when the percentage of non-compliant rows exceeded the specified threshold value. In this assessment of the primary care OMOP database described here, we achieved an overall pass rate of 97%.
Conclusion: The OMOP-CDM's widespread international use, support, and training provides a well-established pathway for data standardisation in collaborative research. Its compatibility allows the sharing of analysis packages across local and international research groups, which facilitates rapid and reproducible data comparisons. A suite of open-source tools, including the OHDSI Data Quality Dashboard (Version 1.4.1), supports the model. Its simplicity and standards-based approach facilitate adoption and integration into existing data processes.
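The Methods above describe concentrating manual mapping effort on free-text terms frequent enough that the mapped terms cover more than 95% of records in each domain. The sketch below illustrates that frequency-threshold idea in Python; it is not the PATRON ETL or OHDSI tooling, and the example term counts are invented.

```python
# Minimal sketch of the frequency-threshold idea described above: select the
# most frequent free-text terms until they cover 95% of records in a domain,
# so manual mapping effort concentrates where it matters. Term counts are
# hypothetical; this is not the PATRON ETL code.
from collections import Counter

def terms_to_map(term_counts: Counter, coverage: float = 0.95) -> list[str]:
    total = sum(term_counts.values())
    selected, covered = [], 0
    for term, n in term_counts.most_common():
        if covered / total >= coverage:
            break                      # remaining rare terms fall below the threshold
        selected.append(term)
        covered += n
    return selected

# Hypothetical condition-domain term frequencies
condition_terms = Counter({"hypertension": 50_000, "T2DM": 30_000,
                           "htn - borderline": 40, "old note, ignore": 5})
print(terms_to_map(condition_terms))   # frequent terms that together reach 95% coverage
```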
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The aim of this study was to examine the pattern of medication use in a tertiary hospital in the Sultanate of Oman, including prescribed medications, over-the-counter (OTC) products, vitamins and minerals, as well as herbal and complementary treatments. The study was conducted at the Sultan Qaboos University Hospital (SQUH) and the SQUH Family and Community Medicine Clinic (FAMCO), Muscat, Sultanate of Oman, in 2008. Women were interviewed at different gestational ages using a structured questionnaire. The Electronic Patient Record (EPR) was reviewed to acquire additional information on medication use. The overall mean age of the cohort was 28 years, ranging from 19 to 45 years. The majority of women included were unemployed, with secondary school or higher education.