This survey provides information on household income and expenditure in order to measure the levels of, and changes in, the living conditions of the people and to observe consumption patterns.
Key objectives of the survey:
- To identify the income patterns in Urban, Rural and Estate Sectors and provinces.
- To identify the income patterns by income levels.
- Average consumption of food items and non-food items.
- Expenditure patterns by sector and by income level.
National coverage.
Household, Individuals
For this survey, a sample of buildings and the occupants therein was drawn from the whole island.
Sample survey data [ssd]
A two stage stratified random sample design was used in the survey. Urban, Rural and Estate sectors of the Districts were the domains for stratification. The sample frame was the list of buildings that were prepared for the Census of Population and Housing 2001.
Selection of Primary Sampling Units (PSU's) Primary sampling units are the census blocks prepared for the Census of Population and Housing - 2001. The sample frame, which is a collection of all census blocks in the domain, was used for the selection of primary sampling units. A sample of 500 primary sampling units was selected from the sampling frame for the survey.
Selection of Secondary Sampling Units (SSU's) Secondary Sampling Units are the housing units in the selected 500 primary sampling units (census blocks). From each primary sampling unit 10 housing units (SSU) were selected for the survey. The total sample size of 5000 housing units was selected and distributed among Districts in Sri Lanka.
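The two selection stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it draws simple random samples inside a single domain, whereas the survey stratified by sector within district, and the frame below (2,000 blocks of 40 housing units each) is entirely hypothetical.

```python
import random

def two_stage_sample(blocks, n_psu=500, n_ssu=10, seed=0):
    """blocks maps a census-block id to the list of its housing-unit ids."""
    rng = random.Random(seed)
    # Stage 1: draw the primary sampling units (census blocks).
    psus = rng.sample(sorted(blocks), n_psu)
    # Stage 2: draw housing units within each selected block.
    return {block: rng.sample(blocks[block], n_ssu) for block in psus}

# Hypothetical sampling frame, not the Census 2001 list of buildings.
frame = {
    f"block-{i}": [f"block-{i}/unit-{j}" for j in range(40)]
    for i in range(2000)
}
sample = two_stage_sample(frame)
total_units = sum(len(units) for units in sample.values())
print(len(sample), total_units)  # 500 5000
```

With 500 blocks and 10 housing units per block, the sketch reproduces the stated total of 5,000 housing units.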
Face-to-face [f2f]
Questionnaires
The survey schedule was designed to collect data by household, and a separate schedule was used for each household identified, according to the definition of the household, within the housing units selected for the survey. The survey schedule consists of three main sections:
1. Demographic section
2. Expenditure
3. Income
The demographic characteristics and usual activities of the occupants belonging to the household were reported in the Demographic section of the schedule (close relatives temporarily living away are also listed in this section). The Expenditure section has two sub sections that report food and non-food consumption data separately. Expenditure that boarders and servants incurred on their own decisions is recorded in a sub section under the main Expenditure section. The Income section has seven sub sections categorized according to the main sources of income.
The exact difference, or sampling error, varies depending on the particular sample selected, and this variability is measured by the standard error of the estimate. There is about a 95% chance (level of confidence) that an estimate based on a sample will differ from the true population value by no more than 1.96 standard errors because of sampling error. Analyses relating to the HIES are generally conducted at the 95% level of confidence.
Confidence interval = estimate ± 1.96 × (standard error)
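As a minimal sketch of the interval formula above, the function below returns the lower and upper bounds for a given estimate and standard error. The input values (an estimate of 100.0 with a standard error of 5.0) are hypothetical and are not taken from the HIES.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds; z = 1.96 corresponds to the 95% level."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Hypothetical estimate and standard error, for illustration only.
low, high = confidence_interval(100.0, 5.0)
print(f"95% CI: {low:.1f} to {high:.1f}")
```

A larger standard error widens the interval symmetrically around the estimate.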
http://www.statistics.gov.lk/HIES/HIES%202007/introduction%20%20HIES.pdf
A description of the adjustments for non-response can be found in section 1.2 of the Final report, available at the above website.
The 2010 NEDS is similar to the 2004 Nigeria DHS EdData Survey (NDES) in that it was designed to provide information on education for children age 4–16, focusing on factors influencing household decisions about children’s schooling. The survey gathers information on adult educational attainment, children’s characteristics and rates of school attendance, absenteeism among primary school pupils and secondary school students, household expenditures on schooling and other contributions to schooling, and parents’/guardians’ perceptions of schooling, among other topics. The 2010 NEDS was linked to the 2008 Nigeria Demographic and Health Survey (NDHS) in order to collect additional education data on a subset of the households (those with children age 2–14) surveyed in the 2008 Nigeria DHS survey. The 2008 NDHS, for which data collection was carried out from June to October 2008, was the fourth DHS conducted in Nigeria (previous surveys were implemented in 1990, 1999, and 2003).
The goal of the 2010 NEDS was to follow up with a subset of approximately 30,000 households from the 2008 NDHS survey. However, the 2008 NDHS sample shows that of the 34,070 households interviewed, only 20,823 had eligible children age 2–14. To make statistically significant observations at the State level, 1,700 children per State and the Federal Capital Territory (FCT) were needed. It was estimated that an additional 7,300 households would be required to meet the total number of eligible children needed. To bring the sample size up to the required target, additional households were screened and added to the overall sample. However, these households did not have the NDHS questionnaire administered. Thus, the two surveys were statistically linked to create some data used to produce the results presented in this report, but for some households, data were imputed or not included.
National
Households, Individuals
Sample survey data [ssd]
The eligible households for the 2010 NEDS are the same as those households in the 2008 NDHS sample for which interviews were completed and in which there is at least one child age 2-14, inclusive. In the 2008 NDHS, 34,070 households were successfully interviewed, and the goal here was to perform a follow-up NEDS on a subset of approximately 30,000 households. However, records from the 2008 NDHS sample showed that only 20,823 had children age 4-16. Therefore, to bring the sample size up to the required number of children, additional households were screened from the NDHS clusters.
The first step was to use the NDHS data to determine eligibility based on the presence of a child age 2-14. Second, based on a series of precision and power calculations, RTI determined that the final sample size should yield approximately 790 households per State to allow statistical significance for reporting at the State level, resulting in a total completed sample size of 790 × 37 = 29,230. This calculation was driven by desired estimates of precision, analytic goals, and available resources. To achieve the target number of households with completed interviews, the final number of desired interviews was increased to accommodate expected attrition factors such as unlocatable addresses, eligibility issues, and non-response or refusal. Third, to reach the target sample size, additional samples were selected from households that had been listed by NDHS but had not been sampled and visited for interviews. The final number of households with completed interviews was 26,934, slightly lower than the original target, but sufficient to yield interview data for 71,567 children, well above the targeted number of 1,700 children per State.
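The sample-size arithmetic in this paragraph can be checked with a short calculation. The total child target of 1,700 × 37 is an implied figure (the text states only the per-State target), and the variable names are illustrative.

```python
STATES_AND_FCT = 37          # 36 States plus the FCT, as in "790 × 37"
HOUSEHOLDS_PER_STATE = 790
CHILDREN_PER_STATE = 1_700

target_households = HOUSEHOLDS_PER_STATE * STATES_AND_FCT
target_children = CHILDREN_PER_STATE * STATES_AND_FCT  # implied total target

completed_households = 26_934
children_with_data = 71_567

print(target_households)                         # 29230
print(target_households - completed_households)  # 2296 households short
print(target_children)                           # 62900
print(children_with_data - target_children)      # 8667 children above target
```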
Face-to-face [f2f]
The four questionnaires used in the 2004 Nigeria DHS EdData Survey (NDES) formed the basis for the 2010 NEDS questionnaires:
1. Household Questionnaire
2. Parent/Guardian Questionnaire
3. Eligible Child Questionnaire
4. Independent Child Questionnaire
These are all available in Appendix D of the survey report, which is available under External Resources.
More than 90 percent of the questionnaires remained the same; where there was a clear justification or need for a change in item formulation, or a specific requirement for additional items, the questionnaires were updated accordingly. A one-day workshop was convened with the NEDS Implementation Team and the NDES Advisory Committee to review the instruments and identify any needed revisions, additions, or deletions. Efforts were made to collect data in a way that eases integration of the 2010 NEDS data into the FMOE’s national education management information system. Instrument issues that were identified as being problematic in the 2004 NDES, as well as items identified as potentially confusing or difficult, were proposed for revision. Issues that USAID, DFID, FMOE, and other stakeholders identified as being essential but not included in the 2004 NDES questionnaires were proposed for incorporation into the 2010 NEDS instruments, with USAID serving as the final arbiter regarding questionnaire revisions and content.
General revisions accepted into the questionnaires included the following:
- A separation of all questions related to secondary education into junior secondary and senior secondary to reflect the UBE policy
- Administration of school-based questions for children identified as attending pre-school
- Inclusion of questions on disabilities of children and parents
- Additional questions on Islamic schooling
- Revision to the literacy question administration to assess English literacy for children attending school
- Some additional questions on delivery of UBE under the financial questions section
Upon completion of revisions to the English-language questionnaires, the instruments were translated and adapted by local translators into three languages (Hausa, Igbo, and Yoruba) and then back-translated into English to ensure accuracy of the translation. After the questionnaires were finalized, training materials used in the 2004 NDES and developed by Macro International, which included training guides, data collection manuals, and field observation materials, were reviewed. The materials were updated to reflect changes in the questionnaires. In addition, the procedures as described in the manuals and guides were carefully reviewed. Adjustments were made, where needed, based on experience with large-scale surveys and lessons learned from the 2004 NDES and the 2008 NDHS, to ensure the highest quality data capture.
Data processing for the 2010 NEDS occurred concurrently with data collection. Completed questionnaires were retrieved by the field coordinators/trainers and delivered to NPC in standard envelopes, labeled with the sample identification, team, and State name. The shipment also contained a written summary of any issues detected during the data collection process. The questionnaire administrators logged the receipt of the questionnaires, acknowledged the list of issues, and acted upon them if required. The editors performed an initial check on the questionnaires, performed any coding of open-ended questions (with possible assistance from the data entry operators), and left them available to be assigned to the data entry operators. The data entry operators entered the data into the system, with the support of the editors for erroneous or unclear data.
Experienced data entry personnel were recruited from those who had performed data entry activities for NPC on previous studies. The data entry teams comprised a data entry coordinator, supervisors, and operators. Data entry coordinators oversaw the entire data entry process from programming and training to final data cleaning, made assignments, tracked progress, and ensured the quality and timeliness of the data entry process. Data entry supervisors were on hand at all times to ensure that proper procedures were followed and to help editors resolve any uncovered inconsistencies. The supervisors controlled incoming questionnaires, assigned batches of questionnaires to the data entry operators, and managed their progress. Approximately 30 clerks were recruited and trained as data entry operators to enter all completed questionnaires and to perform the secondary entry for data verification. Editors worked with the data entry operators to review information flagged as “erroneous” or “dubious” in the data entry process and provided follow-up and resolution for those anomalies.
The data entry program developed for the 2004 NDES was revised to reflect the revisions in the 2010 NEDS questionnaire. The electronic data entry and reporting system performed internal consistency checks.
A very high overall response rate of 97.9 percent was achieved, with interviews completed in 26,934 households out of a total of 27,512 occupied households from the original sample of 28,624 households. The response rates did not vary significantly between urban and rural areas (98.5 percent versus 97.6 percent, respectively). The response rates for parents/guardians and children were even higher, and the rate for independent children, 97.4 percent, was slightly lower than the overall sample rate. In all these cases, the urban/rural differences were negligible.
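The overall response rate quoted above follows from the household counts, assuming it is computed as completed interviews divided by occupied households (which matches the wording of the paragraph). A short check:

```python
original_sample = 28_624   # households in the original sample
occupied = 27_512          # occupied households
completed = 26_934         # households with completed interviews

# Assumption: response rate = completed interviews / occupied households.
response_rate = completed / occupied
print(f"{response_rate:.1%}")  # 97.9%
```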
Estimates derived from a sample survey are affected by two types of errors: (1) non-sampling errors and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as
https://www.icpsr.umich.edu/web/ICPSR/studies/22180/terms
The purpose of the study was to evaluate the extent to which deterrence or cooperative strategies motivated firms and their facilities to comply with environmental regulations. The project collected administrative data (secondary data) for a sample of publicly owned, United States companies in the pulp and paper, steel, and oil refining industries from 1995 to 2000 to track each firm's economic, environmental, and enforcement compliance history. Company Economic and Size Data (Part 1) from 1993 to 2000 were gathered from the Standard and Poor's Industrial Compustat, Mergent Online, and Securities and Exchange Commission, resulting in 512 company/year observations. Next, the research team used the Directory of Corporate Affiliations, the Environmental Protection Agency's (EPA) Toxic Release Inventory (TRI), and the EPA's Permit Compliance System (PCS) to identify all facilities owned by the sample of firms between 1995 and 2000. Researchers then gathered Facility Ownership Data (Part 2), resulting in 15,408 facility/year observations. The research team gathered various types of PCS data from the EPA for facilities in the sample. Permit Compliance System Facility Data (Part 3) were gathered on the 214 unique major National Pollutant Discharge Elimination System (NPDES) permits issued to facilities in the sample. Although permits were given to facilities, facilities could have one or more discharge points (e.g., pipes) that released polluted water directly into surface waters. Thus, Permit Compliance System Discharge Points (Pipe Layout) Data (Part 4) were also collected on 1,995 pipes. The EPA determined compliance using two methods: inspections and evaluations/assessments. Permit Compliance System Inspections Data (Part 5) were collected on a total of 1,943 inspections. Permit Compliance System Compliance Schedule Data (Part 6) were collected on a total of 3,336 compliance schedule events. 
Permit Compliance System Compliance Schedule Violation Data (Part 7) were obtained for a total of 246 compliance schedule violations. Permit Compliance System Single Event Violations Data (Part 8) were collected on 75 single event violations. Permit Compliance System Measurement/Effluent and Reporting Violations Data (Part 9) were collected for 396,479 violations. Permit Compliance System Enforcement Actions Data (Part 10) were collected on 1,730 enforcement actions. Occupational Safety and Health Administration Data (Part 11) were collected on a total of 2,243 inspections. The OSHA data were collected by company name and include multiple facilities owned by each company and were not limited to facilities in the Permit Compliance System. Additional information about firm noncompliance was drawn from EPA Docket and CrimDoc systems. Administrative and Judicial Docket Case Data (Part 12) were collected on 40 administrative and civil cases. Administrative and Judicial Docket Case Settlement Data (Part 13) were collected on 36 administrative and civil cases. Criminal Case Data (Part 14) were collected on three criminal cases. For secondary data analysis purposes, the research team created the Yearly Final Report Data (Part 15) and the Quarterly Final Report Data (Part 16). The yearly data contain a total of 378 company/year observations; the quarterly data contain a total of 1,486 company/quarter observations. The research team also conducted a vignette survey of the same set of companies that are in the secondary data to measure compliance and managerial decision-making. Concerning the Vignette Data (Part 17), a factorial survey was developed and administered to company managers tapping into perceptions of the costs and benefits of pro-social and anti-social conduct for themselves and their companies. 
A total of 114 respondents from 2 of the sampled corporations read and responded to a total of 384 vignettes representing 4 scenario types: technical noncompliance, significant noncompliance, over-compliance, and response to counter-terrorism. Part 1 contains 19 economic and size variables. Part 2 contains a total of eight variables relating to ownership. Part 3 contains 67 variables with regard to facility characteristics. Part 4 contains 31 variables relating to discharge points and pipe layout information. Part 5 contains 13 inspections characteristics variables. Part 6 contains 13 compliance schedule event characteristics variabl
The research team collected data from statewide datasets on 268 stalking cases, comprising a population of 108 police-identified stalking cases across Rhode Island between 2001 and 2005 and a sample of 160 researcher-identified stalking incidents (incidents that met statutory criteria for stalking but were cited by police for other domestic violence offenses) during the same period. The secondary data used for this study came from the Rhode Island Supreme Court Domestic Violence Training and Monitoring Unit's (DVU) statewide database of domestic violence incidents reported to Rhode Island law enforcement. Prior criminal history data were obtained from records of all court cases entered into the automated Rhode Island court file, CourtConnect. The data contain a total of 121 variables including suspect characteristics, victim characteristics, incident characteristics, police response characteristics, and prosecutor response characteristics.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Secondary data on social indicators and public expenditure at the district and regional level in Tanzania (1996-2010), for example:
THINV: Logarithm of deflated public per capita spending on health in the short and long term (total spending of the current and the last five budget years)
SANI: Latrines per 100 pupils
INFRA: Percentage of women and men age 15-49 who reported serious problems in accessing health care due to the distance to the next health facility
URB: Percentage of people living in urban areas
TAINV: Logarithm of deflated public per capita spending on agriculture (current and previous budget year)*
BREASTF: Percentage who started breastfeeding within 1 hour of birth, among the last children born in the five years preceding the survey
IODINE: Percentage of households with adequate iodine content of salt (15+ ppm)
MEDU: Percentage of women age 15-49 who completed grade 6 at the secondary level
VACC: Percentage of children age 12-23 months with a vaccination card
TWINV: Logarithm of deflated public per capita spending on water in the short and long term (total spending of the current and the last five budget years)*
TEINV: Logarithm of deflated public per capita spending on education in the short and long term (total spending of the current and the last five budget years)*
LABOUR: Percentage of women and men employed in the 12 months preceding the survey
LAND: Per capita farmland in ha (including the area under temporary mono/mixed crops, permanent mono/mixed crops and the area under pasture)
RAIN: Yearly rainfall in mm
etc.
Purpose: The uploaded data were the basis for the following PhD thesis: The optimal allocation of scarce resources for health improvement is a crucial factor to lower the burden of disease and to strengthen the productive capacities of people living in developing countries.
This research project aims to devise tools for narrowing the gap between the actual allocation and a more efficient allocation of resources for health in the case of Tanzania. Firstly, the returns from alternative government spending across sectors such as agriculture, water etc. are analysed. Maximisation of the number of Disability Adjusted Life Years (DALYs) averted per dollar invested is used as the criterion. A Simultaneous Equation Model (SEM) is developed to estimate the required elasticities. The results of the quantitative analysis show that the highest returns on DALYs are obtained by investments in improved nutrition and access to safe water sources, followed by spending on sanitation. Secondly, focusing on the health sector itself, scarce resources for health improvement create the incentive to prioritise certain health interventions. Using the example of malaria, the objective of the second stage is to evaluate whether interventions are prioritised in such a way that the marginal dollar goes to where it has the highest effect on averting DALYs. PopMod, a longitudinal population model, is used to estimate the cost-effectiveness of six isolated and combined malaria intervention approaches. The results of the longitudinal population model show that preventive interventions such as insecticide-treated bed nets (ITNs) and intermittent presumptive treatment with Sulphadoxine-Pyrimethamine (SP) during pregnancy had the highest health returns (both US$ 41 per DALY averted). The third part of this dissertation focuses on the political economy aspect of the allocation of scarce resources for health improvement. The objective here is to positively assess how political party competition and the access to mass media directly affect the distribution of district resources for health improvement.
Estimates from cross-sectional and panel data regression analysis imply that a one-percentage-point smaller margin between the winning party and the second-place party (that is, stronger competition) leads to a 0.151 percentage point increase in public health spending, which is significant at the five percent level. In conclusion, cross-sectoral effects, the cost-effectiveness of health interventions and the political environment are important factors at play in the country’s resource allocation decisions. In absolute terms, current financial resources to lower the burden of disease in Tanzania are substantial. However, there is a huge potential in optimizing the allocation of these resources for a better health return.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Risk factors for opioid use disorder (weighted).
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This document contains text excerpts captured from the literature as secondary data to develop the qualitative system dynamics model, as well as two example coding tables. Table 1 shows the final list of research works selected for model development through a systematic paper selection procedure as described in chapter 3 of the thesis. Table 2 shows the initial causal links created based on the identified causal relationships. Table 3 shows an intermediate merging step (3rd iteration), where causal links are combined into more general links. For a detailed explanation of the model development process, refer to chapter 3 of the thesis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Weighted % of prescription opioid recipients according to OUD severity.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains PDF-to-text conversions of scientific research articles, prepared for the task of data citation mining. The goal is to identify references to research datasets within full-text scientific papers and classify them as Primary (data generated in the study) or Secondary (data reused from external sources).
The PDF articles were processed using MinerU, which converts scientific PDFs into structured machine-readable formats (JSON, Markdown, images). This ensures participants can access both the raw text and layout information needed for fine-grained information extraction.
Each paper directory contains the following files:
*_origin.pdf
The original PDF file of the scientific article.
*_content_list.json
Structured extraction of the PDF content, where each object represents a text or figure element with metadata.
Example entry:
{
"type": "text",
"text": "10.1002/2017JC013030",
"text_level": 1,
"page_idx": 0
}
full.md
The complete article content in Markdown format (linearized for easier reading).
images/
Folder containing figures and extracted images from the article.
layout.json
Page layout metadata, including positions of text blocks and images.
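A minimal sketch of how the text elements in a `*_content_list.json` file might be pulled out for downstream processing. It assumes the file is a JSON array of objects shaped like the example entry above; the helper name `text_blocks` and the second (image) object in the sample are hypothetical.

```python
import json

def text_blocks(entries):
    """Return (page_idx, text) pairs for the non-empty text elements."""
    return [
        (entry.get("page_idx"), entry["text"])
        for entry in entries
        if entry.get("type") == "text" and entry.get("text")
    ]

# Inline sample standing in for a real *_content_list.json file. The first
# object is the example entry above; the second is a hypothetical figure.
sample = json.loads("""
[
  {"type": "text", "text": "10.1002/2017JC013030", "text_level": 1, "page_idx": 0},
  {"type": "image", "page_idx": 1}
]
""")
blocks = text_blocks(sample)
print(blocks)  # [(0, '10.1002/2017JC013030')]
```

For a file on disk, the same function could be applied to the result of `json.load` on the opened file.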
The aim is to detect dataset references in the article text and classify them:
DOIs (Digital Object Identifiers):
https://doi.org/[prefix]/[suffix]
Example: https://doi.org/10.5061/dryad.r6nq870
Accession IDs: Used by data repositories. Format varies by repository. Examples:
GSE12345 (NCBI GEO)
PDB 1Y2T (Protein Data Bank)
E-MEXP-568 (ArrayExpress)
Each dataset mention must be labeled as Primary or Secondary.
train_labels.csv → Ground truth with:
article_id: Research paper DOI.
dataset_id: Extracted dataset identifier.
type: Citation type (Primary / Secondary).
sample_submission.csv → Example submission format.
Paper: https://doi.org/10.1098/rspb.2016.1151
Data: https://doi.org/10.5061/dryad.6m3n9
In-text span: "The data we used in this publication can be accessed from Dryad at doi:10.5061/dryad.6m3n9."
Citation type: Primary
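As a rough sketch of the detection step only (not the Primary/Secondary classification), the regular expressions below pick out DOIs and the three accession-ID shapes listed above. The patterns are illustrative assumptions: real repositories use many more identifier formats, and the function name `find_dataset_ids` is hypothetical.

```python
import re

# DOI with an optional "https://doi.org/" or "doi:" prefix.
DOI_RE = re.compile(
    r"(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?(10\.\d{4,9}/[^\s\"<>]+)",
    re.IGNORECASE,
)
# Covers only the three example formats above (GEO, ArrayExpress, PDB).
ACCESSION_RE = re.compile(
    r"\b(GSE\d+|E-[A-Z]{4}-\d+|PDB\s+[0-9][A-Za-z0-9]{3})\b"
)

def find_dataset_ids(text):
    """Return candidate dataset identifiers found in text."""
    ids = []
    for match in DOI_RE.finditer(text):
        # Drop sentence punctuation that trails the DOI.
        doi = match.group(1).rstrip(".,;)")
        ids.append("https://doi.org/" + doi)
    ids.extend(match.group(1) for match in ACCESSION_RE.finditer(text))
    return ids

span = ("The data we used in this publication can be accessed "
        "from Dryad at doi:10.5061/dryad.6m3n9.")
print(find_dataset_ids(span))  # ['https://doi.org/10.5061/dryad.6m3n9']
```

Note that this also matches the DOI of the paper itself when it appears in the text, so a real system would need to filter those out and then decide the citation type from context.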
This dataset enables participants to develop and test NLP systems for:
This table contains some of the science results from the Nuclear Spectroscopic Telescope Array (NuSTAR) Serendipitous Survey. The catalog incorporates data taken during the first 40 months of NuSTAR operation, which provide ~20 Ms of effective exposure time over 331 fields, with an areal coverage of 13 deg^2. The primary catalog (available as the HEASARC NUSTARSSC table) contains 498 sources (the abstract of the reference paper states that there are 497 sources) detected in total over the 3-24 keV energy range. There are 276 sources with spectroscopic redshifts and classifications, largely resulting from the authors' extensive campaign of ground-based spectroscopic follow-up. The authors characterize the overall sample in terms of the X-ray, optical, and infrared source properties. The sample is primarily composed of active galactic nuclei (AGN), detected over a large range in redshift from z = 0.002 to 3.4 (median redshift z of 0.56), but also includes 16 spectroscopically confirmed Galactic sources. There is a large range in X-ray flux, from log (f_3-24keV) ~ -14 to -11 (in units of erg s^-1 cm^-2), and in rest-frame 10-40 keV luminosity, from log (L_10-40keV) ~ 39 to 46 (in units of erg s^-1), with a median of 44.1. Approximately 79% of the NuSTAR sources have lower-energy (<10 keV) X-ray counterparts from XMM-Newton, Chandra, and Swift XRT observations. The mid-infrared (MIR) analysis, using WISE all-sky survey data, shows that MIR AGN color selections miss a large fraction of the NuSTAR-selected AGN population, from ~15% at the highest luminosities (L_X > 10^44 erg s^-1) to ~80% at the lowest luminosities (L_X < 10^43 erg s^-1). The authors' optical spectroscopic analysis finds that the observed fraction of optically obscured AGN (i.e., the type 2 fraction) is F_Type2 = 53 (+14, -15) per cent, for a well-defined subset of the 8-24 keV selected sample.
This is higher, albeit at a low significance level, than the type 2 fraction measured for redshift- and luminosity-matched AGNs selected by <10 keV X-ray missions. This table contains the Secondary NuSTAR Serendipitous Source Catalog of 64 sources found using wavdetect to search for significant emission peaks in the FPMA and FPMB data separately (see Section 2.1.1 of Alexander et al. 2013, ApJ, 773, 125) and in the combined A+B data. These sources are listed in Table 7 of the reference paper. This method was developed alongside the primary one (Section 2.3 of the reference paper) in order to investigate the optimum source detection methodologies for NuSTAR and to identify sources in regions of the NuSTAR coverage that are automatically excluded in the primary source detection. The authors emphasize that these secondary sources are not used in any of the science analyses presented in their paper. Nevertheless, these secondary sources are robust NuSTAR detections, some of which will be incorporated in future NuSTAR studies, and for many of them (35 out of the 43 sources with spectroscopic identifications) the authors have obtained new spectroscopic redshifts and classifications through their follow-up program. The X-ray photometric parameters for 4 sources are left blank, as in these cases the A+B data prohibit reliable photometric constraints. Additional information on these Secondary Catalog sources that the authors obtained using optical spectroscopy is available in Table 8 of the reference paper (q.v.). This table does NOT contain the 498 sources in the Primary NuSTAR Serendipitous Source Catalog that were found using the source detection procedure described in Section 2.3 of the reference paper, and that are listed in Table 5 (op. cit.). This table was created by the HEASARC in July 2017 based on the machine-readable version of Table 7 from the reference paper, the Secondary NuSTAR Serendipitous Source Catalog, that was obtained from the ApJ web site.
This is a service provided by NASA HEASARC.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As COVID-19 swept across the globe, increased ventilation and implementation of air cleaning were emphasized by the US CDC and WHO as important strategies to reduce the risk of inhalation exposure to the virus. To assess whether higher ventilation and air cleaning rates lead to lower exposure risk to SARS-CoV-2, 1274 manuscripts published between April 2020 and September 2022 were screened using the key words “airborne SARS-CoV-2” or “SARS-CoV-2 aerosol.” Ninety-three studies that involved air sampling at locations with known sources (hospitals and residences) were selected, and associated data were compiled. Two metrics were used to assess exposure risk: SARS-CoV-2 concentration and SARS-CoV-2 detection rate in air samples. Locations were categorized by type (hospital or residence) and proximity to the location housing the isolated/quarantined patient (primary or secondary). The results showed that hospital wards had lower airborne virus concentrations than residential isolation rooms. A negative correlation was found between airborne virus concentrations in primary-occupancy areas and air changes per hour (ACH). In hospital settings, sample positivity rates were significantly reduced in secondary-occupancy areas compared to primary-occupancy areas, but they were similar across sampling locations in residential settings. ACH and sample positivity rates were negatively correlated, though the effect was diminished when ACH values exceeded 8. While limitations associated with diverse sampling protocols exist, data considered by this meta-analysis support the notion that higher ACH may reduce exposure risks to the virus in ambient air. Copyright © 2024 American Association for Aerosol Research
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset from Aguilar-Latorre, A., Serrano-Ripoll, M. J., Oliván-Blázquez, B., Gervilla, E., & Navarro, C. (2022). Associations Between Severity of Depression, Lifestyle Patterns, and Personal Factors Related to Health Behavior: Secondary Data Analysis From a Randomized Controlled Trial. Frontiers in psychology, 13, 856139. https://doi.org/10.3389/fpsyg.2022.856139
Background: Depression is a prevalent condition that has a significant impact on psychosocial functioning and quality of life. The onset and persistence of depression have been linked to a variety of biological and psychosocial variables. Many of these variables are associated with specific lifestyle characteristics, such as physical activity, diet, and sleep patterns. Some psychosocial determinants have an impact on people’s health-related behavior change. These include personal factors such as sense of coherence, patient activation, health literacy, self-efficacy, and procrastination. This study aims to analyze the association between the severity of depression, lifestyle patterns, and personal factors related to health behavior. It also aims to analyze whether personal factors moderate the relationship between lifestyles and depression.
Methods: This study is a secondary data analysis (SDA) of baseline data collected at the start of a randomized controlled trial (RCT). A sample of 226 patients with subclinical, mild, or moderate depression from primary healthcare centers in two sites in Spain (Zaragoza and Mallorca) was used, and descriptive, bivariate, multivariate, and moderation analyses were performed. Depression was the primary outcome, measured by Beck II Self-Applied Depression Inventory. Lifestyle variables such as physical exercise, adherence to Mediterranean diet and sleep quality, social support, and personal factors such as self-efficacy, patient activation in their own health, sense of coherence, health literacy, and procrastination were considered secondary outcomes.
Results: Low sense of coherence (β = −0.172; p < 0.001), poor sleep quality (β = 0.179; p = 0.008), low patient activation (β = −0.119; p = 0.019), and sedentarism (more minutes seated per day; β = 0.003; p = 0.025) are predictors of having more depressive symptoms. Moderation analyses were not significant.
Discussion: Lifestyle and personal factors are related to depressive symptomatology. Our findings reveal that sense of coherence, patient’s activation level, sedentarism, and sleep quality are associated with depression. Further research is needed regarding adherence to Mediterranean diet, minutes walking per week and the interrelationship between lifestyles, personal factors, and depression.
https://creativecommons.org/publicdomain/zero/1.0/
If this dataset is useful, an upvote is appreciated. These data concern student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and they were collected using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd-period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).
As significant progress continues to be made by the Rwandan economy following various recovery and growth strategies, certain elements remain crucial. The food and nutrition security of the population remains a key building block in not only consolidating the gains already made thus far but also further accelerating the rate of growth towards the realization of the Millennium Development Goals (MDGs). Thus, the 2009 Comprehensive Food Security and Vulnerability Analysis and Nutrition survey (CFSVANS) was undertaken with the objective of analyzing trends over time in comparison with the 2006 CFSVA and the 2005 RDHS, as well as with other more recent secondary data, measuring the extent and depth of food and nutrition insecurity and vulnerability, and identifying the underlying causes.
The five key questions of a CFSVANS are: who are the people currently facing food insecurity and malnutrition; how many are they; where do they live; why are they food insecure and/or malnourished; and how can food assistance and interventions make a difference in reducing poverty and hunger and supporting livelihoods? To answer these questions, the assessment specifically sought to:
- Identify geographic and socio-economic groups that are food insecure or vulnerable to food insecurity;
- Highlight the nature and causes of food insecurity among each group;
- Identify the major risks and constraints to improving food security;
- Evaluate assistance needs at the short, medium and long range;
- Support the development of an appropriate targeting system;
- Better define the role of GoR's development partners including WFP in promoting food security strengthening programs;
- Determine the prevalence of nutritional status of vulnerable groups (children aged 6-59 months and non-pregnant women of reproductive age (15-49 years old));
- Determine the prevalence of exclusive breastfeeding as a key Infant and Young Child Feeding strategy;
- Establish the linkage between household food security and nutritional status of children in Rwanda.
National coverage
Households
Rural household members
Sample survey data [ssd]
Rwanda is administratively divided into four provinces (Northern Province, Southern Province, Eastern Province and Western Province) plus Kigali City and a total of 30 districts. Districts are further divided into sectors and cells. The 2009 Comprehensive Food Security Vulnerability Analysis and Nutrition Survey (CFSVANS) was designed to provide statistically representative information at the sub-provincial level. To facilitate comparison with existing studies, it was decided to define strata using administrative limits rather than food economy zones (as in 2006). Because of the large number of districts, it was decided to define strata that would be either single districts or a group of districts. Districts that were identified as similar with regard to their socio-economic and agroenvironmental characteristics were grouped together. A total of 16 strata were defined, including 8 districts and 8 groups of districts. Kigali City was not included in the sample. Selected strata include Nyagatare-Gatsibo-Kayonza, Kirehe-Ngoma-Rwamagana, and Bugesera (Eastern Province); Musanze-Burera, Gakenke, and Rulindo-Gicumbi (Northern Province); Rubavu, Nyabihu, Ngororero, Rutsiro-Karongi, and Nyamasheke-Rusizi (Western Province); and Kamonyi-Muhanga-Ruhango, Nyanza, Huye, Gisagara, and Nyamagabe-Nyaruguru (Southern Province).
Within each stratum, NISR implemented a two-stage sampling procedure to select households using an approach that is standardized for statistical studies in Rwanda. Zones de Dénombrement (ZD, enumeration areas) were selected first, followed by households, using 2007 population estimates based on the 2002 census. The ZD is a sampling unit that is smaller than a sector. A total of 450 ZDs were selected. In each stratum, ZDs were randomly selected, so the probability of a ZD being selected was equal to the number of ZDs sampled in the stratum divided by the total number of ZDs in the stratum. Within each sampled ZD, a total of 12 households were interviewed, resulting in a total expected sample size of 5,400 households.
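The two-stage arithmetic above can be illustrated with a minimal Python sketch. It assumes that ZDs within a stratum were drawn with equal probability and without replacement, which the description suggests but does not spell out. The stratum frame size (200) and the number of ZDs drawn from it (28) are hypothetical values chosen only for illustration; the totals of 450 ZDs and 12 households per ZD come from the text.

```python
import random

def select_zds(zd_ids, n_select, seed=0):
    """Draw n_select ZDs from one stratum by simple random sampling
    without replacement (an assumed selection scheme)."""
    rng = random.Random(seed)
    return sorted(rng.sample(zd_ids, n_select))

def inclusion_probability(n_select, n_total):
    """Probability that any given ZD in the stratum is drawn."""
    return n_select / n_total

# Figures stated in the survey description
TOTAL_ZDS = 450
HOUSEHOLDS_PER_ZD = 12
expected_households = TOTAL_ZDS * HOUSEHOLDS_PER_ZD

# Hypothetical stratum: 200 ZDs in the frame, 28 of them drawn
stratum_frame = list(range(1, 201))
sampled = select_zds(stratum_frame, 28)
p = inclusion_probability(len(sampled), len(stratum_frame))

print(expected_households)  # 450 * 12
print(p)                    # 28 / 200
```

Under this assumption every ZD in a stratum has the same chance of selection, so design weights would vary between strata but not within them.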
All of the households were interviewed. Enumerators were provided with clear instructions on which households to interview, and how to find them. Supervisors were provided with a list of over-sampled households in the event that a household had to be replaced.
Because this study also focuses on the relation between nutrition and food security, it was decided during the study design that only households with children aged below 5 years would be included in the sample. This imposed some limitations on the ability to draw conclusions about all the households in Rwanda, as explained in the limitations section.
Face-to-face paper [f2f]
Household survey: To allow for comparison over time, the 2009 CFSVA and Nutrition Survey used a standard questionnaire similar to the one used for the 2006 CFSVA. In 2006, face validity of the questionnaire was examined by local and food security experts and the questionnaire was piloted among a random sample of people not included in the study. It was a structured questionnaire using mainly close-ended questions with response options provided to the enumerators. For several questions, respondents were allowed to provide more than one response. The survey instrument sought to collect quantitative data on 13 components: (1) demographics; (2) housing and facilities; (3) household and productive assets; (4) inputs to livelihoods; (5) migration and remittances; (6) sources of credit; (7) agricultural production; (8) expenditure; (9) food sources and consumption; (10) shocks and food security; (11) programme participation; (12) maternal health and nutrition; and (13) child health and nutrition.
Community questionnaire: In addition to the household survey, a community questionnaire was administered to a key informant, who was an official representative of the area, such as the Executive Secretary of the Cell or any individual responsible for administrative services at Cell level. The community questionnaire was developed using an approach similar to that of the household questionnaire. Questions were open-ended and the questionnaires covered four main aspects: migration and seasonal movement of population, health, external assistance (food aid), and market prices.
The questionnaires were developed in English and administered in Kinyarwanda. Careful training was conducted to reduce individual variations on how enumerators interpreted the questionnaire and understood the questions.
Data entry was conducted by NISR using CSPro. The database was then exported to SPSS for analysis. Statistical analysis was conducted by WFP in Rwanda and Rome, with the support of NISR. SPSS and ADDAWIN were used to conduct PCA and cluster analysis. Z-scores for wasting, stunting and underweight were calculated using WHO Anthro. All other analyses were done using SPSS.
A series of data quality tables and graphs are available to review the quality of the data and include the following:
- Food Items, Groups and Weights for Calculation of the FCS
- Household characteristics associated with food consumption
- Child nutrition by livelihood, wealth index and FCS
- The people facing food insecurity and vulnerability
- Sample and Demographic Characteristics by Strata (CFSVA 2009)....
The 2022 Nepal Demographic and Health Survey (NDHS) is the sixth survey of its kind implemented in the country as part of the worldwide Demographic and Health Surveys (DHS) Program. It was implemented by New ERA under the aegis of the Ministry of Health and Population (MoHP) of the Government of Nepal with the objective of providing reliable, accurate, and up-to-date data for the country.
The primary objective of the 2022 NDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the 2022 NDHS collected information on fertility, marriage, family planning, breastfeeding practices, nutrition, food insecurity, maternal and child health, childhood mortality, awareness and behavior regarding HIV/AIDS and other sexually transmitted infections (STIs), women’s empowerment, domestic violence, fistula, mental health, accident and injury, disability, and other health-related issues such as smoking, knowledge of tuberculosis, and prevalence of hypertension.
The information collected through the 2022 NDHS is intended to assist policymakers and program managers in evaluating and designing programs and strategies for improving the health of Nepal’s population. The survey also provides indicators relevant to the Sustainable Development Goals (SDGs) for Nepal.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, men aged 15-49, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2022 NDHS is an updated version of the frame from the 2011 Nepal Population and Housing Census (NPHC) provided by the National Statistical Office. The 2022 NDHS considered wards from the 2011 census as sub-wards, the smallest administrative unit for the survey. The census frame includes a complete list of Nepal’s 36,020 sub-wards. Each sub-ward has a residence type (urban or rural), and the measure of size is the number of households.
In September 2015, Nepal’s Constituent Assembly declared changes in the administrative units and reclassified urban and rural areas in the country. Nepal is divided into seven provinces: Koshi Province, Madhesh Province, Bagmati Province, Gandaki Province, Lumbini Province, Karnali Province, and Sudurpashchim Province. Provinces are divided into districts, districts into municipalities, and municipalities into wards. Nepal has 77 districts comprising a total of 753 (local-level) municipalities. Of the municipalities, 293 are urban and 460 are rural.
Originally, the 2011 NPHC included 58 urban municipalities. This number increased to 217 as of 2015. On March 10, 2017, structural changes were made in the classification system for urban (Nagarpalika) and rural (Gaonpalika) locations. Nepal currently has 293 Nagarpalika, with 65% of the population living in these urban areas. The 2022 NDHS used this updated urban-rural classification system.
The survey sample is a stratified sample selected in two stages. Stratification was achieved by dividing each of the seven provinces into urban and rural areas that together formed the sampling stratum for that province. A total of 14 sampling strata were created in this way. Implicit stratification with proportional allocation was achieved at each of the lower administrative levels by sorting the sampling frame within each sampling stratum before sample selection, according to administrative units at the different levels, and by using a probability-proportional-to-size selection at the first stage of sampling.
In the first stage of sampling, 476 primary sampling units (PSUs) were selected with probability proportional to PSU size and with independent selection in each sampling stratum within the sample allocation. Among the 476 PSUs, 248 were from urban areas and 228 from rural areas. A household listing operation was carried out in all of the selected PSUs before the main survey. The resulting list of households served as the sampling frame for the selection of sample households in the second stage. Thirty households were selected from each cluster, for a total sample size of 14,280 households. Of these households, 7,440 were in urban areas and 6,840 were in rural areas. Some of the selected sub-wards were found to be overly large during the household listing operation. Selected sub-wards with an estimated number of households greater than 300 were segmented. Only one segment was selected for the survey with probability proportional to segment size.
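As an illustration of probability-proportional-to-size (PPS) selection, the following Python sketch uses systematic PPS over a sorted frame, which is one common way to implement first-stage selection of this kind. It is not taken from the NDHS documentation: the function name `pps_systematic`, the seven-unit frame, and its household counts are hypothetical. Only the sample-size arithmetic at the end uses figures from the text.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n, seed=0):
    """Select n units with probability proportional to size using
    systematic sampling along the cumulated sizes.

    sizes: positive measures of size (e.g. household counts per sub-ward),
           listed in the frame's sort order.
    Returns the indices of the selected units. A unit larger than the
    sampling interval could be hit more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = random.Random(seed).uniform(0, interval)
    cumulative = list(accumulate(sizes))
    selected = []
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative range contains the point
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected

# Hypothetical frame of seven sub-wards with household counts
frame_sizes = [120, 80, 300, 50, 150, 100, 200]
chosen = pps_systematic(frame_sizes, n=3)

# Sample-size arithmetic stated in the text
total_households = 476 * 30
urban_households = 248 * 30
rural_households = 228 * 30
print(chosen, total_households, urban_households, rural_households)
```

Sorting the frame by administrative unit before running a selection like this is what produces the implicit stratification described above, because the evenly spaced selection points spread the sample across the sort order.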
For further details on sample design, see APPENDIX A of the final report.
Computer Assisted Personal Interview [capi]
Four questionnaires were used in the 2022 NDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, and the Biomarker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Nepal. In addition, a self-administered Fieldworker Questionnaire collected information about the survey’s fieldworkers.
Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organizations, and international donors. After all questionnaires were finalized in English, they were translated into Nepali, Maithili, and Bhojpuri. The Household, Woman’s, and Man’s Questionnaires were programmed into tablet computers to facilitate computer-assisted personal interviewing (CAPI) for data collection purposes, with the capability to choose any of the three languages for each questionnaire. The Biomarker Questionnaire was completed on paper during data collection and then entered in the CAPI system.
Data capture for the 2022 NDHS was carried out with Microsoft Surface Go 2 tablets running Windows 10.1. Software was prepared for the survey using CSPro. The processing of the 2022 NDHS data began shortly after the fieldwork started. When data collection was completed in each cluster, the electronic data files were transferred via the Internet File Streaming System (IFSS) to the New ERA central office in Kathmandu. The data files were registered and checked for inconsistencies, incompleteness, and outliers. Errors and inconsistencies were immediately communicated to the field teams for review so that problems would be mitigated going forward. Secondary editing, carried out in the central office at New ERA, involved resolving inconsistencies and coding the open-ended questions. The New ERA senior data processor coordinated the exercise at the central office. The NDHS core team members assisted with the secondary editing. The paper Biomarker Questionnaires were compared with the electronic data file to check for any inconsistencies in data entry. The pictures of vaccination cards that were captured during data collection were verified with the data entered. Data processing and editing were carried out using the CSPro software package. The concurrent data collection and processing offered a distinct advantage because it maximized the likelihood of the data being error-free and accurate. Timely generation of field check tables allowed for effective monitoring. The secondary editing of the data was completed by July 2022, and the final cleaning of the data set was completed by the end of August.
A total of 14,243 households were selected for the sample, of which 13,833 were found to be occupied. Of the occupied households, 13,786 were successfully interviewed, yielding a response rate of more than 99%. In the interviewed households, 15,238 women age 15-49 were identified as eligible for individual interviews. Interviews were completed with 14,845 women, yielding a response rate of 97%. In the subsample of households selected for the men’s survey, 5,185 men age 15-49 were identified as eligible for individual interviews and 4,913 were successfully interviewed, yielding a response rate of 95%.
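The response rates quoted above follow directly from the counts in the paragraph. A short Python check, using only those counts (the helper name `response_rate` is an illustrative choice):

```python
def response_rate(completed, eligible):
    """Response rate in percent: completed interviews over eligible units."""
    return 100.0 * completed / eligible

household_rate = response_rate(13_786, 13_833)  # interviewed / occupied households
women_rate = response_rate(14_845, 15_238)      # interviewed / eligible women
men_rate = response_rate(4_913, 5_185)          # interviewed / eligible men

for label, rate in [("households", household_rate),
                    ("women", women_rate),
                    ("men", men_rate)]:
    print(f"{label}: {rate:.1f}%")
```

Note that the household rate uses occupied households as the denominator, not the 14,243 households originally selected.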
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors result from mistakes made in implementing data collection and in data processing, such as failing to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and entering the data incorrectly. Although numerous efforts were made during the implementation of the 2022 Nepal Demographic and Health Survey (2022 NDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 NDHS is only one of many samples that could have been selected from the same population, using the same design and expected sample size. Each of these samples would yield results that differ somewhat from the results of the selected sample. Sampling errors are a measure of the variability among all possible samples. Although the exact degree of variability is unknown, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, and so on), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ECLS-B is a longitudinal study that followed a nationally representative sample of approximately 10,700 participating children from birth through kindergarten entry. The children participating in the study were born in the United States in 2001, and came from diverse socioeconomic and racial/ethnic backgrounds, with over-samples of Chinese children, other Asian and Pacific Islander children, American Indian and Alaska Native children, twins, and children born with low and very low birth weight.
The primary objective of the 2012 Indonesia Demographic and Health Survey (IDHS) is to provide policymakers and program managers with national- and provincial-level data on representative samples of all women age 15-49 and currently-married men age 15-54.
The 2012 IDHS was specifically designed to meet the following objectives:
• Provide data on fertility, family planning, maternal and child health, adult mortality (including maternal mortality), and awareness of AIDS/STIs to program managers, policymakers, and researchers to help them evaluate and improve existing programs;
• Measure trends in fertility and contraceptive prevalence rates, and analyze factors that affect such changes, such as marital status and patterns, residence, education, breastfeeding habits, and knowledge, use, and availability of contraception;
• Evaluate the achievement of goals previously set by national health programs, with special focus on maternal and child health;
• Assess married men’s knowledge of utilization of health services for their family’s health, as well as participation in the health care of their families;
• Participate in creating an international database that allows cross-country comparisons that can be used by the program managers, policymakers, and researchers in the areas of family planning, fertility, and health in general
National coverage
Sample survey data [ssd]
Indonesia is divided into 33 provinces. Each province is subdivided into districts (regency in areas mostly rural and municipality in urban areas). Districts are subdivided into subdistricts, and each subdistrict is divided into villages. The entire village is classified as urban or rural.
The 2012 IDHS sample is aimed at providing reliable estimates of key characteristics for women age 15-49 and currently-married men age 15-54 in Indonesia as a whole, in urban and rural areas, and in each of the 33 provinces included in the survey. To achieve this objective, a total of 1,840 census blocks (CBs), 874 in urban areas and 966 in rural areas, were selected from the list of CBs in the selected primary sampling units formed during the 2010 population census.
Because the sample was designed to provide reliable indicators for each province, the number of CBs in each province was not allocated in proportion to the population of the province or its urban-rural classification. Therefore, a final weighting adjustment procedure was done to obtain estimates for all domains. A minimum of 43 CBs per province was imposed in the 2012 IDHS design.
Refer to Appendix B in the final report for details of sample design and implementation.
Face-to-face [f2f]
The 2012 IDHS used four questionnaires: the Household Questionnaire, the Woman’s Questionnaire, the Currently Married Man’s Questionnaire, and the Never-Married Man’s Questionnaire. Because of the change in survey coverage from ever-married women age 15-49 in the 2007 IDHS to all women age 15-49 in the 2012 IDHS, the Woman’s Questionnaire now has questions for never-married women age 15-24. These questions were part of the 2007 Indonesia Young Adult Reproductive Survey questionnaire.
The Household and Woman’s Questionnaires are largely based on standard DHS phase VI questionnaires (March 2011 version). The model questionnaires were adapted for use in Indonesia. Not all questions in the DHS model were adopted in the IDHS. In addition, the response categories were modified to reflect the local situation.
The Household Questionnaire was used to list all the usual members and visitors who spent the previous night in the selected households. Basic information collected on each person listed includes age, sex, education, marital status, and relationship to the head of the household. Information on characteristics of the housing unit, such as the source of drinking water, type of toilet facilities, construction materials used for the floor, roof, and outer walls of the house, and ownership of various durable goods, was also recorded in the Household Questionnaire. These items reflect the household’s socioeconomic status and are used to calculate the household wealth index. The main purpose of the Household Questionnaire was to identify women and men who were eligible for an individual interview.
The Woman’s Questionnaire was used to collect information from all women age 15-49. These women were asked questions on the following topics:
• Background characteristics (marital status, education, media exposure, etc.)
• Reproductive history and fertility preferences
• Knowledge and use of family planning methods
• Antenatal, delivery, and postnatal care
• Breastfeeding and infant and young children feeding practices
• Childhood mortality
• Vaccinations and childhood illnesses
• Marriage and sexual activity
• Fertility preferences
• Woman’s work and husband’s background characteristics
• Awareness and behavior regarding HIV-AIDS and other sexually transmitted infections (STIs)
• Sibling mortality, including maternal mortality
• Other health issues
Questions asked to never-married women age 15-24 addressed the following:
• Additional background characteristics
• Knowledge of the human reproduction system
• Attitudes toward marriage and children
• Role of family, school, the community, and exposure to mass media
• Use of tobacco, alcohol, and drugs
• Dating and sexual activity
The Man’s Questionnaire was administered to all currently married men age 15-54 living in every third household in the 2012 IDHS sample. This questionnaire includes much of the same information included in the Woman’s Questionnaire, but is shorter because it did not contain questions on reproductive history or maternal and child health. Instead, men were asked about their knowledge of and participation in health-care-seeking practices for their children.
The questionnaire for never-married men age 15-24 includes the same questions asked to never-married women age 15-24.
All completed questionnaires, along with the control forms, were returned to the BPS central office in Jakarta for data processing. The questionnaires were logged and edited, and all open-ended questions were coded. Responses were entered in the computer twice for verification, and they were corrected for computer-identified errors. Data processing activities were carried out by a team of 58 data entry operators, 42 data editors, 14 secondary data editors, and 14 data entry supervisors. A computer package program called Census and Survey Processing System (CSPro), which was specifically designed to process DHS-type survey data, was used in the processing of the 2012 IDHS.
The response rates for both the household and individual interviews in the 2012 IDHS are high. A total of 46,024 households were selected in the sample, of which 44,302 were occupied. Of these households, 43,852 were successfully interviewed, yielding a household response rate of 99 percent.
Refer to Table 1.2 in the final report for more detailed summarized results of the 2012 IDHS fieldwork for both the household and individual interviews, by urban-rural residence.
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors, and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2012 Indonesia Demographic and Health Survey (2012 IDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2012 IDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling error is a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
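The "plus or minus two standard errors" rule described above can be sketched in a few lines of Python. The estimate (0.62) and standard error (0.012) below are hypothetical values chosen for illustration and do not come from the survey; the multiplier of 2 is the approximation used in the text for a 95 percent interval.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Approximate confidence interval: estimate +/- multiplier * SE."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

# Hypothetical survey proportion and its standard error
estimate = 0.62
standard_error = 0.012

low, high = confidence_interval(estimate, standard_error)
print(f"{low:.3f} to {high:.3f}")
```

For a complex multi-stage design the standard error itself has to come from a design-based method, so a function like this only covers the last step of the calculation.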
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2012 IDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the 2012 IDHS is a SAS program. This program used the Taylor linearization method
According to our latest research, the global on-instrument secondary analysis acceleration market size reached USD 1.12 billion in 2024, reflecting robust growth driven by technological advancements and the rising adoption of high-throughput sequencing platforms. The market is projected to expand at a CAGR of 13.7% from 2025 to 2033, reaching a forecasted value of USD 3.48 billion by 2033. The primary growth factor is the increasing demand for rapid, accurate, and scalable data analysis solutions in genomics and related life sciences fields, as laboratories and research institutions prioritize efficiency and precision in large-scale omics studies.
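The relationship between a compound annual growth rate (CAGR) and the start and end values can be checked with a short Python sketch. The market figures are the ones quoted above; how the base year is treated is an assumption here, since the passage does not state whether compounding starts from the 2024 value or from a 2025 value, so the number of compounding periods (9) is a guess.

```python
def project(value, rate, years):
    """Compound a starting value forward at a constant annual rate."""
    return value * (1 + rate) ** years

def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1 / years) - 1

base_2024 = 1.12     # USD billion, as quoted
target_2033 = 3.48   # USD billion, as quoted
periods = 9          # assumed: 2024 -> 2033

rate = implied_cagr(base_2024, target_2033, periods)
projected = project(base_2024, rate, periods)
print(f"implied CAGR: {rate:.2%}, projected 2033 value: {projected:.2f}")
```

The implied rate can then be compared with the quoted CAGR; small differences may reflect rounding or a different base-year convention in the original report.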
The growth trajectory of the on-instrument secondary analysis acceleration market is strongly influenced by the surging volume of next-generation sequencing (NGS) data generated worldwide. As sequencing costs continue to decrease and throughput increases, the bottleneck has shifted from data generation to data interpretation and analysis. This shift has created an urgent need for advanced hardware accelerators and software solutions capable of performing real-time or near real-time secondary data analysis directly on sequencing instruments. The integration of these accelerators reduces turnaround times, minimizes data transfer bottlenecks, and enables researchers and clinicians to make faster, more informed decisions, thereby driving widespread adoption across both clinical and research applications.
Another significant growth factor is the ongoing convergence of artificial intelligence, machine learning, and high-performance computing with omics data analysis. The implementation of AI-driven algorithms and parallel processing architectures within on-instrument acceleration platforms allows for the efficient handling of increasingly complex and voluminous datasets. This convergence not only enhances the accuracy of variant calling, alignment, and quantification tasks but also supports the development of new analytical pipelines tailored to emerging applications such as single-cell genomics, spatial transcriptomics, and multi-omics integration. As a result, both established institutions and emerging biotech startups are investing heavily in upgrading their analytical infrastructure, further fueling market expansion.
Additionally, the growing emphasis on personalized medicine and precision healthcare is catalyzing demand for on-instrument secondary analysis acceleration solutions. Healthcare providers, pharmaceutical companies, and diagnostic laboratories are seeking to leverage genomic and proteomic insights to guide treatment decisions, drug development, and patient stratification. The ability to accelerate secondary analysis workflows directly on sequencing or mass spectrometry instruments is critical for achieving rapid turnaround times in clinical settings, particularly for applications such as oncology, rare disease diagnosis, and infectious disease surveillance. Regulatory support for clinical genomics and increasing investments in translational research are expected to further boost market growth in the coming years.
From a regional perspective, North America currently dominates the on-instrument secondary analysis acceleration market, accounting for the largest share in 2024, followed closely by Europe and the Asia Pacific. The robust presence of leading sequencing technology providers, high R&D expenditure, and favorable reimbursement frameworks in the United States and Canada have contributed to rapid market adoption. Europe is also witnessing substantial growth, supported by strong government initiatives and collaborative research networks. Meanwhile, the Asia Pacific region is emerging as a lucrative market, driven by expanding genomics research programs, increasing healthcare investments, and a growing focus on precision medicine in countries such as China, Japan, and India.
The product landscape of the on-instrument secondary analysis a
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT
The Albero study analyzes the personal transitions of a cohort of high school students at the end of their studies. The data consist of (a) the longitudinal social network of the students, before (n = 69) and after (n = 57) finishing their studies; and (b) the longitudinal study of the personal networks of each of the participants in the research. The two observations of the complete social network are presented in two matrices in Excel format. For each respondent, two square matrices of 45 alters of their personal networks are provided, also in Excel format. For each respondent, both psychological sense of community and frequency of commuting are provided in a SAV file (SPSS). The database allows the combined analysis of social networks and personal networks of the same set of individuals.
INTRODUCTION
Ecological transitions are key moments in the life of an individual that occur as a result of a change of role or context. This is the case, for example, of the completion of high school studies, when young people start their university studies or try to enter the labor market. These transitions are turning points that carry a risk or an opportunity (Seidman & French, 2004). That is why they have received special attention in research and psychological practice, both from a developmental point of view and in the situational analysis of stress or in the implementation of preventive strategies.
The data we present in this article describe the ecological transition of a group of young people from Alcala de Guadaira, a town located about 16 kilometers from Seville. Specifically, in the “Albero” study we monitored the transition of a cohort of secondary school students at the end of the last pre-university academic year. It is a turning point in which most of them began a metropolitan lifestyle, with more displacements to the capital and a slight decrease in identification with the place of residence (Maya-Jariego, Holgado & Lubbers, 2018).
Normative transitions, such as the completion of studies, affect a group of individuals simultaneously, so they can be analyzed both individually and collectively. From an individual point of view, each student stops attending the institute, which is replaced by new interaction contexts. Consequently, the structure and composition of their personal networks are transformed. From a collective point of view, the network of friendships of the cohort of high school students enters into a gradual process of disintegration and fragmentation into subgroups (Maya-Jariego, Lubbers & Molina, 2019).
These two levels, individual and collective, were evaluated in the “Albero” study. One of the peculiarities of this database is that we combine the analysis of a complete social network with a survey of personal networks in the same set of individuals, with a longitudinal design before and after finishing high school. This allows combining the study of the multiple contexts in which each individual participates, assessed through the analysis of a sample of personal networks (Maya-Jariego, 2018), with the in-depth analysis of a specific context (the relationships between a promotion of students in the institute), through the analysis of the complete network of interactions. This potentially allows us to examine the covariation of the social network with the individual differences in the structure of personal networks.
PARTICIPANTS
The social network and personal networks of the students of the last two years of high school of an institute of Alcala de Guadaira (Seville) were analyzed. The longitudinal follow-up covered approximately a year and a half. The first wave was composed of 31 men (44.9%) and 38 women (55.1%) who live in Alcala de Guadaira, and who mostly expect to live in Alcala (36.2%) or in Seville (37.7%) in the future. In the second wave, information was obtained from 27 men (47.4%) and 30 women (52.6%).
DATA STRUCTURE AND FILE FORMATS
The data is organized in two longitudinal observations, with information on the complete social network of the cohort of students of the last year, the personal networks of each individual and complementary information on the sense of community and frequency of metropolitan movements, among other variables.
Social network
The file “Red_Social_t1.xlsx” is a valued matrix of 69 actors that gathers the relations of knowledge and friendship between the cohort of students of the last year of high school in the first observation. The file “Red_Social_t2.xlsx” is a valued matrix of 57 actors obtained 17 months after the first observation.
To generate each complete social network, the list of 77 students enrolled in the last year of high school was given to the respondents, who were asked to indicate the type of relationship with each student according to the following values: 1, "his/her name sounds familiar"; 2, "I know him/her"; 3, "we talk from time to time"; 4, "we have a good relationship"; and 5, "we are friends." The two resulting complete networks are represented in Figure 2. The network in the second observation is comparatively less dense, reflecting the gradual disintegration process that the student group has begun.
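As an illustration of how the drop in density between the two observations could be measured, the following minimal sketch binarizes a valued matrix at a chosen relationship level and computes the density of the resulting directed network. The toy matrix and the threshold are hypothetical; the real matrices would first have to be read from the files described above.

```python
def binarize(matrix, threshold):
    """Keep a tie only where the valued relation reaches the threshold (diagonal ignored)."""
    n = len(matrix)
    return [
        [1 if i != j and matrix[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def density(adjacency):
    """Share of possible directed ties that are present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    ties = sum(adjacency[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


# Hypothetical 4-actor valued matrix using the 1-5 scale from the text (0 = no relation reported).
toy = [
    [0, 5, 3, 0],
    [5, 0, 1, 0],
    [2, 1, 0, 4],
    [0, 0, 4, 0],
]
print(density(binarize(toy, threshold=4)))  # ties rated "good relationship" or "friends"
print(density(binarize(toy, threshold=1)))  # any reported relation
```

Applying the same threshold to both observations keeps the two density values comparable.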
Personal networks
Also in this case the information is organized in two observations. The compressed file “Redes_Personales_t1.csv” includes 69 folders, corresponding to personal networks. Each folder includes a valued matrix of 45 alters in CSV format. Likewise, in each case a graphic representation of the network obtained with Visone (Brandes and Wagner, 2004) is included. Relationship values range from 0 (do not know each other) to 2 (know each other very well).
Second, the compressed file “Redes_Personales_t2.csv” includes 57 folders, with the equivalent information for each respondent in the second observation, that is, 17 months after the first interview. The structure of the data is the same as in the first observation.
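A personal network matrix in CSV form could be read and summarized as in the sketch below. The layout assumed here (a plain numeric square matrix with no header row or label column, symmetric ties) is an assumption for illustration and should be checked against the actual files; the two toy matrices are invented.

```python
import csv
import io


def load_matrix(csv_text):
    """Parse a headerless numeric CSV into a list of rows (assumed layout)."""
    return [[float(cell) for cell in row] for row in csv.reader(io.StringIO(csv_text)) if row]


def mean_tie_strength(matrix):
    """Average alter-alter value over the upper triangle, assuming symmetric ties."""
    n = len(matrix)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


# Toy 3-alter networks on the 0-2 scale described in the text.
wave1 = load_matrix("0,2,1\n2,0,0\n1,0,0\n")
wave2 = load_matrix("0,1,1\n1,0,0\n1,0,0\n")
change = mean_tie_strength(wave2) - mean_tie_strength(wave1)
print(mean_tie_strength(wave1), mean_tie_strength(wave2), change)
```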
Sense of community and metropolitan displacements
The SPSS file “Albero.sav” collects the survey data, together with some summary information on the network data related to each respondent. The 69 rows correspond to the 69 individuals interviewed, and the 118 columns to the variables related to each of them in T1 and T2, according to the following list:
• Socio-economic data.
• Data on habitual residence.
• Information on intercity journeys.
• Identity and sense of community.
• Personal network indicators.
• Social network indicators.
DATA ACCESS
Social networks and personal networks are available in CSV format. This allows them to be used directly with UCINET, Visone, Pajek or Gephi, among others, and they can be exported as Excel or text files for use with other programs.
The visual representation of the personal networks of the respondents in both waves is available in the following album of the Graphic Gallery of Personal Networks on Flickr: <https://www.flickr.com/photos/25906481@N07/albums/72157667029974755>.
In previous work we analyzed the effects of personal networks on the longitudinal evolution of the socio-centric network. That work also includes additional details about the instruments applied. If you use the data, please cite the following reference:
The English version of this article can be downloaded from: https://tinyurl.com/yy9s2byl
CONCLUSION
The database of the “Albero” study allows us to explore the co-evolution of social networks and personal networks. In this way, we can examine the mutual dependence of individual trajectories and the structure of the relationships of the cohort of students as a whole. The complete social network corresponds to the same context of interaction: the secondary school. However, personal networks collect information from the different contexts in which the individual participates. The structural properties of personal networks may partly explain individual differences in the position of each student in the entire social network. In turn, the properties of the entire social network partly determine the structure of opportunities in which individual trajectories are displayed.
The longitudinal character and the combination of the personal networks of individuals with a common complete social network, make this database have unique characteristics. It may be of interest both for multi-level analysis and for the study of individual differences.
ACKNOWLEDGEMENTS
The fieldwork for this study was supported by the Complementary Actions of the Ministry of Education and Science (SEJ2005-25683), and was part of the project “Dynamics of actors and networks across levels: individuals,
https://spdx.org/licenses/CC0-1.0.html
Background: Malaria continues to pose a major public health challenge in tropical regions. Despite significant efforts to control malaria in Tanzania, there are still residual transmission cases. Unfortunately, little is known about where these residual malaria transmission cases occur and how they spread. In Tanzania, for example, the transmission is heterogeneously distributed. In order to effectively control and prevent the spread of malaria, it is essential to understand the spatial distribution and transmission patterns of the disease. This study seeks to predict areas that are at high risk of malaria transmission so that intervention measures can be developed to accelerate malaria elimination efforts.
Methods: This study employs a geospatial-based model to predict and map malaria risk areas in Kilombero Valley. Environmental factors related to malaria transmission were considered and assigned weights using the Analytic Hierarchy Process (AHP), implemented in an online system based on a pairwise comparison technique. The malaria hazard map was generated by a weighted overlay of the altitude, slope, curvature, aspect, rainfall distribution, and distance to streams in Geographic Information Systems (GIS). Finally, the risk map was created by overlaying components of malaria risk including hazards, elements at risk, and vulnerability.
Results: The study demonstrates that the majority of the study area falls under the moderate-risk level (61%), followed by the low-risk level (31%), while the high-malaria risk area covers a small area, which occupies only 8% of the total area.
Conclusion: The findings of this study are crucial for developing spatially targeted interventions against malaria transmission in residual transmission settings. Predicted areas prone to malaria risk provide information that will inform decision-makers and policymakers for proper planning, monitoring, and deployment of interventions.
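The AHP weighting and weighted overlay summarized in the Methods paragraph above can be sketched as follows. This is a minimal illustration, not the online system used in the study: it derives weights with the row geometric mean approximation, checks consistency with Saaty's random index, and combines class scores for a single cell. The 3-factor comparison matrix and the cell scores are hypothetical; the study itself weights six hazard factors.

```python
import math

# Saaty's random consistency index for comparison matrices of order 1 to 6.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}


def ahp_weights(pairwise):
    """Approximate the priority vector by normalized row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A*w / w."""
    n = len(pairwise)
    if RANDOM_INDEX[n] == 0.0:
        return 0.0
    ratios = [
        sum(pairwise[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ]
    lambda_max = sum(ratios) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


def weighted_overlay(class_scores, weights):
    """Weighted sum of the reclassified factor scores for one raster cell."""
    return sum(w * s for w, s in zip(weights, class_scores))


# Hypothetical judgments: factor A is 2x as important as B and 6x as important as C.
pairwise = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
cell_score = weighted_overlay([5, 3, 1], weights)
print(weights, cr, cell_score)
```

A consistency ratio below about 0.10 is the conventional threshold for accepting a set of pairwise judgments.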
Methods
Data acquisition and description
The study employed both primary and secondary data, which were collected from numerous sources based on the input required for the implementation of the predictive model. Data collected includes the locations of all public and private health centers that were downloaded free from the health portal of the United Republic of Tanzania, Ministry of Health, Community Development, Gender, Elderly, and Children, through the universal resource locator (URL) (http://moh.go.tz/hfrportal/). Human population data was collected from the 2012 population housing census (PHC) for the United Republic of Tanzania report.
Rainfall data were obtained from two local offices: Kilombero Agricultural Training and Research Institute (KATRIN) and Kilombero Valley Teak Company (KVTC). These offices collect meteorological data for agricultural purposes. Monthly data from 2012 to 2017 were provided from thirteen (13) weather stations. Road and stream network shapefiles were downloaded free from the MapCruzin website via URL (https://mapcruzin.com/free-tanzania-arcgis-maps-shapefiles.htm).
To cover the full extent of the study area, five neighboring scenes of the Landsat 8 OLI/TIRS images (path/row: 167/65, 167/66, 167/67, 168/66 and 168/67) were downloaded free from the United States Geological Survey (USGS) website via URL: http://earthexplorer.usgs.gov. The images, acquired from July to November 2017, were selected from the USGS Earth Explorer archive based on the lowest cloud cover as previewed in the archive before downloading. Finally, the digital elevation data with a spatial resolution of three arc-seconds (90m by 90m) using WGS 84 datum and the Geographic Coordinate System were downloaded free from the Shuttle Radar Topography Mission (SRTM) via URL (https://dds.cr.usgs.gov/srtm/version2_1/SRTM3/Africa/). Only six tiles that fall in the study area were downloaded, coded as S08E035, S09E035, S10E035, S08E036, S09E036, S10E036, S08E037, S09E037 and S10E037.
Preparation and Creation of Model Factor Parameters
Creation of Elevation Factor
All six coded tiles were imported into the GIS environment for further analysis. The tiles were joined into a single elevation map layer using the data management tools (raster / raster dataset / mosaic to new raster). Using the spatial analyst reclassify feature, the elevation map was then classified into five classes, 109–358, 359–530, 531–747, 748–1017 and ≥1018 m.a.s.l., which were assigned the new values 1, 2, 3, 4 and 5, respectively, according to their relationship with mosquito distribution and malaria risk. Finally, the classes of the elevation map were labelled as very high, high, moderate, low and very low malaria risk, respectively.
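The reclassification step can be illustrated outside a GIS with a small lookup. The class boundaries and labels below are the ones given in the text; the function name is illustrative, and values below the lowest stated elevation simply fall into class 1 in this sketch.

```python
from bisect import bisect_left

# Upper bounds (m.a.s.l.) of the first four elevation classes from the text.
ELEVATION_UPPER_BOUNDS = [358, 530, 747, 1017]
RISK_LABELS = {1: "very high", 2: "high", 3: "moderate", 4: "low", 5: "very low"}


def reclassify_elevation(elevation_m):
    """Map an elevation in metres to its class value (1-5)."""
    return bisect_left(ELEVATION_UPPER_BOUNDS, elevation_m) + 1


for elevation in (200, 358, 359, 900, 1500):
    value = reclassify_elevation(elevation)
    print(elevation, value, RISK_LABELS[value])
```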
Creation of Slope Factor
A slope map was created from the generated elevation map layer, using the spatial analyst surface/slope feature. The slope raster layer was then reclassified into five subgroups using a standard classification scheme, namely quantiles, giving the classes 0–0.58, 0.59–2.90, 2.91–6.40, 6.41–14.54 and >14.54. This classification scheme assigns an equal number of values to each class, so the user specifies the number of classes while the system determines where the breaks should be. The reclassified slope subgroups were ranked 1, 2, 3, 4 and 5 according to the degree of suitability for malaria incidence in the locality. That is, steeper slopes are associated with lower malaria hazard, and gentler slopes are highly susceptible to malaria incidence. Finally, the classes of the slope map were labelled as very high, high, moderate, low and very low malaria risk, respectively.
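A quantile classification, the scheme named in the text, chooses break values so that each class receives roughly the same number of cells. The sketch below uses Python's standard library on synthetic values; it illustrates the idea only and is not the GIS tool used in the study.

```python
from bisect import bisect_left
from statistics import quantiles


def quantile_breaks(values, n_classes=5):
    """Return the n_classes - 1 cut points that split the values into equal-count classes."""
    return quantiles(values, n=n_classes)


def classify(value, breaks):
    """Class number (1-based) of a value given ascending break points."""
    return bisect_left(breaks, value) + 1


# Synthetic stand-in for slope values; real input would be the slope raster cells.
synthetic_values = list(range(1, 101))
breaks = quantile_breaks(synthetic_values)
counts = [0] * 5
for v in synthetic_values:
    counts[classify(v, breaks) - 1] += 1
print(breaks, counts)
```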
Creation of Curvature Factor
Curvature is another topographical factor that was created from the generated elevation map using the spatial analyst surface/curvature feature. The curvature raster layer was further reclassified into five subgroups based on predefined curvature classes. The reclassified subgroups were ranked 1, 2, 3, 4 and 5 according to their degree of suitability for malaria occurrence. Curvature affects the acceleration and deceleration of flow across the surface. A negative value indicates that the surface is upwardly convex and flow will be decelerated, which is associated with high susceptibility to malaria incidence. A positive value indicates that the surface is upwardly concave and flow will be accelerated, which is associated with a lower malaria hazard, while a value of zero indicates that the surface is linear and is associated with a moderate malaria hazard. Lastly, the classes of the curvature map were labelled as very high, high, moderate, low, and very low malaria risk, respectively.
Creation of Aspect Factor
As a topographic factor associated with mosquito larval habitat formation, aspect determines the amount of sunlight an area receives. The more sunlight received, the stronger the influence on temperature, which may affect mosquito larval survival. The aspect of the study area was also generated from the elevation map using the spatial analyst raster/surface/aspect feature. The aspect raster layer was further reclassified into five subgroups based on predefined aspect classes. The reclassified subgroups were ranked 1, 2, 3, 4 and 5 according to the degree of suitability for malaria incidence, and new values were re-assigned in order of malaria hazard rating. Finally, the classes of the aspect map were labelled as very high, high, moderate, low, and very low malaria risk, respectively.
Creation of Human Population Distribution Factor
Human population data was used to generate a population distribution map related to malaria occurrence. Kilombero Valley has a total of 42 wards. The data was organized in Microsoft Excel 2016 and imported into the GIS environment for analysis, where Inverse Distance Weighted (IDW) interpolation in the spatial analyst tool was applied to produce the population distribution map. The map was then reclassified into five subgroups based on potential malaria risk. The subgroups were ranked according to vulnerability to malaria incidence in the locality, with the most populated areas being the most vulnerable and the least populated areas the least vulnerable. New values of 1, 2, 3, 4 and 5 were assigned and labelled as very high, high, moderate, low and very low malaria risk levels, respectively.
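Inverse Distance Weighted interpolation estimates a value at an unsampled location as a weighted average of nearby samples, with weights that shrink with distance. The following is a minimal sketch; the power of 2 is a common default and an assumption here, since the text does not state the parameters used, and the sample points are invented.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate the value at (x, y) from (x, y, value) samples by inverse distance weighting."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The target coincides with a sample point, so return its value directly.
            return float(value)
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical ward centroids with population counts (coordinates in arbitrary units).
ward_samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
print(idw(ward_samples, 1.0, 0.0))  # midway between the two samples
print(idw(ward_samples, 0.5, 0.0))  # closer to the first sample
```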
Creation of Proximity to Health Facilities Factor
The distribution of health facilities has a significant impact on the malaria vulnerability of the population living in the Kilombero Valley. The health facility layer was created by distance analysis using the multiple ring buffer feature. The map layer was then reclassified into five sub-layers: 0–5 km, 5.1–10 km, 10.1–20 km, 20.1–50 km and >50 km. According to a WHO report, people who live near health facilities or can reach them easily are less vulnerable to malaria incidence than those who live very far away, because distance limits access to health services. New values of 1, 2, 3, 4 and 5 were then assigned and labelled as very high, high, moderate, low and very low malaria risk levels, respectively.
Creation of Proximity to Road Network Factor
The distance to the road network is also a significant factor, as it can be used as an estimation of the access to present healthcare facilities in the area. Buffer zones were calculated on the path of the road to determine the effect of the road on malaria prevalence. The road shapefile of the study area was inputted into GIS environment and spatial analyst tools / multiple ring buffer feature were used to generate five buffer zones with the
This survey provides information on household income and expenditure in order to measure the levels of and changes in the living conditions of the people and to observe consumption patterns.
Key objectives of the survey - To identify the income patterns in Urban, Rural and Estate Sectors & provinces. - To identify the income patterns by income levels. - Average consumption of food items and non food items - Expenditure patterns by sector and by income level.
National coverage.
Household, Individuals
For this survey a sample of buildings and the occupants therein was drawn from the whole island
Sample survey data [ssd]
A two stage stratified random sample design was used in the survey. Urban, Rural and Estate sectors of the Districts were the domains for stratification. The sample frame was the list of buildings that were prepared for the Census of Population and Housing 2001.
Selection of Primary Sampling Units (PSU's) Primary sampling units are the census blocks prepared for the Census of Population and Housing - 2001. The sample frame, which is a collection of all census blocks in the domain, was used for the selection of primary sampling units. A sample of 500 primary sampling units was selected from the sampling frame for the survey.
Selection of Secondary Sampling Units (SSU's) Secondary Sampling Units are the housing units in the selected 500 primary sampling units (census blocks). From each primary sampling unit 10 housing units (SSU) were selected for the survey. The total sample size of 5000 housing units was selected and distributed among Districts in Sri Lanka.
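The two-stage selection described above (census blocks first, then housing units within each selected block) can be sketched as follows. This is an illustration only: it draws simple random samples within a single stratum, whereas the text does not specify the selection probabilities used at each stage, and the sampling frame here is a small invented one.

```python
import random


def two_stage_sample(frame, n_psu, n_ssu, seed=None):
    """Select n_psu blocks, then n_ssu housing units within each selected block."""
    rng = random.Random(seed)
    selected_blocks = rng.sample(sorted(frame), n_psu)
    return {block: rng.sample(frame[block], n_ssu) for block in selected_blocks}


# Hypothetical frame: 50 census blocks with 30 housing units each.
toy_frame = {
    f"block_{b:03d}": [f"block_{b:03d}_unit_{u:02d}" for u in range(30)]
    for b in range(50)
}
sample = two_stage_sample(toy_frame, n_psu=5, n_ssu=10, seed=1)
total_units = sum(len(units) for units in sample.values())
print(len(sample), total_units)
```

With the survey's figures, 500 blocks and 10 housing units per block give the stated total of 5000 housing units.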
Face-to-face [f2f]
Questionnaires
The survey schedule was designed to collect data by household, and separate schedules were used for each household identified according to the definition of the household within the housing units selected for the survey. The survey schedule consists of three main sections.
1. Demographic section
2. Expenditure
3. Income
The demographic characteristics and usual activities of the inmates belonging to the household were reported in the Demographic section of the schedule (close relatives temporarily living away are also listed in this section). The Expenditure section has two sub sections to report food and non-food consumption data separately. Expenditure incurred by boarders and servants on their own decisions is recorded in a sub section under the main Expenditure section. The Income section has seven sub sections categorized according to the main sources of income.
The exact difference, or sampling error, varies depending on the particular sample selected, and this variability is measured by the standard error of the estimate. There is about a 95% chance, or level of confidence, that an estimate based on a sample will differ by no more than 1.96 standard errors from the true population value because of sampling error. Analyses relating to the HIES are generally conducted at the 95% level of confidence.
Confidence interval = estimate ± 1.96 × (standard error)
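The formula above can be applied directly, as in this small sketch. The estimate and standard error used in the example are made-up numbers, not HIES results.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return the (lower, upper) bounds of the interval estimate +/- z * standard error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical estimate of 100 with a standard error of 5.
lower, upper = confidence_interval(100.0, 5.0)
print(lower, upper)
```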
http://www.statistics.gov.lk/HIES/HIES%202007/introduction%20%20HIES.pdf
A description of the adjustments for non-response can be found in section 1.2 of the Final report, available at the above website.