50 datasets found
  1. High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon...

    • microdata.pacificdata.org
    Updated Mar 19, 2025
    Cite
    Darian Naidoo and William Seitz (2025). High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon Islands [Dataset]. https://microdata.pacificdata.org/index.php/catalog/875
    Dataset updated
    Mar 19, 2025
    Dataset authored and provided by
    Darian Naidoo and William Seitz
    Time period covered
    2023 - 2024
    Area covered
    Solomon Islands
    Description

    Abstract

    Access to up-to-date socio-economic data is a widespread challenge in Solomon Islands and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.

    For Solomon Islands, after five rounds of data collection from 2020-2020, a monthly HFPS data collection commenced in April 2023 and continued for 18 months (ending September 2024) on topics including employment, income, food security, health, food prices, assets and well-being. Fieldwork took place in two non-consecutive weeks of each month. Data for April 2023-December 2023 were a repeated cross section, while January 2024 established the first month of a panel, which was continued to September 2024. Each month has approximately 550 households in the sample and is representative of urban and rural areas, but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in Solomon Islands. There is one data file for household-level data, with a unique household ID, and a separate file for individual-level data. The individual file can be matched to the household file using the household ID. It also has an individual ID, unique within each household, which can be used to track individuals over time within households where the data are panel data.

    Geographic coverage

    Urban and rural areas of Solomon Islands.

    Analysis unit

    Household, individual.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The initial sample was drawn from information provided by a major phone service provider in Solomon Islands, covering all the provinces in the country. It had a probability-based weighted design, with a proportionate stratification to achieve geographical representation. The geographical distribution compared to the 2019 Census is listed below for the first month of the HFPS monthly survey:

    Province       Census    HFPS
    Choiseul         4.3%    5.2%
    Western         14.4%   13.7%
    Isabel           4.8%    4.7%
    Central          3.6%    5.2%
    Ren Bell         0.6%    1.4%
    Guadalcanal     19.8%   21.1%
    Malaita         23.1%   18.7%
    Makira           5.6%    5.6%
    Temotu           3.0%      3%
    Honiara         20.7%   21.3%

    Source: Census of Population and Housing 2019

    Note: The values in the HFPS column represent the proportion of survey participants residing in each province, based on the raw HFPS data from April.

    In April 2023, the geographic distribution of World Bank HFPS participants was generally similar to that of the census data at the province level, though within provinces, areas with less mobile phone connectivity are likely to be underrepresented. One indication of this is that urban areas constituted 38.2 percent of the survey sample, which is a slight overrepresentation, compared to 32.5 percent in the Census 2019.
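
    The comparison between Census and HFPS shares can be checked with a few lines of arithmetic. The sketch below is illustrative only: the province shares are copied from the figures listed above, while the function names are hypothetical and not part of the dataset documentation.

```python
# Province shares in percent, copied from the listing above:
# (2019 Census share, April 2023 HFPS share).
SHARES = {
    "Choiseul": (4.3, 5.2),
    "Western": (14.4, 13.7),
    "Isabel": (4.8, 4.7),
    "Central": (3.6, 5.2),
    "Ren Bell": (0.6, 1.4),
    "Guadalcanal": (19.8, 21.1),
    "Malaita": (23.1, 18.7),
    "Makira": (5.6, 5.6),
    "Temotu": (3.0, 3.0),
    "Honiara": (20.7, 21.3),
}


def share_gaps(shares):
    """Return HFPS minus Census share per province, in percentage points."""
    return {name: round(hfps - census, 1) for name, (census, hfps) in shares.items()}


def largest_gap(shares):
    """Return the (province, gap) pair with the largest absolute gap."""
    return max(share_gaps(shares).items(), key=lambda item: abs(item[1]))


if __name__ == "__main__":
    # Print provinces from most under-represented to most over-represented.
    for province, gap in sorted(share_gaps(SHARES).items(), key=lambda kv: kv[1]):
        print(f"{province:<12} {gap:+.1f} pp")
```

    A negative gap means the province is under-represented in the raw April sample relative to the Census.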

    A monthly panel was established in January 2024 and is ongoing as of March 2025. In each subsequent month after January 2024, the survey firm would first attempt to contact all households from the previous month and then attempt to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households. Across all months of the survey, a total of 9,926 interviews were completed.

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire, which can be found in the External Resources of this documentation, is available in English, with Solomons Pijin translation. There were few changes to the questionnaire across the survey months, but some sections were only introduced in 2024, namely energy access questions and questions to inform the baseline data of the Solomon Islands Government Integrated Economic Development and Climate Resilience (IEDCR) project.

    Cleaning operations

    The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The total number of observations is 9,926 in the household dataset and 62,054 in the individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household data set contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.
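
    Matching the individual file to the household file could look like the sketch below. It is a minimal illustration, not the World Bank team's procedure: `hhid` and `id_member` are the identifiers named above, but the `month` column, the toy rows, and the function names are assumptions made for the example. Because households can be interviewed in several months, the sketch assumes the join key is the pair (`hhid`, `month`).

```python
from collections import defaultdict

# Toy rows standing in for the two files; all values are invented.
HOUSEHOLDS = [
    {"hhid": "H1", "month": "2024-01", "urban": 1},
    {"hhid": "H1", "month": "2024-02", "urban": 1},
    {"hhid": "H2", "month": "2024-01", "urban": 0},
]
INDIVIDUALS = [
    {"hhid": "H1", "id_member": 1, "month": "2024-01", "age": 34},
    {"hhid": "H1", "id_member": 1, "month": "2024-02", "age": 34},
    {"hhid": "H2", "id_member": 1, "month": "2024-01", "age": 51},
]


def attach_household(individual_rows, household_rows, keys=("hhid", "month")):
    """Left-join household fields onto each individual row (many-to-one)."""
    lookup = {}
    for row in household_rows:
        key = tuple(row[k] for k in keys)
        if key in lookup:
            raise ValueError(f"duplicate household key: {key}")
        lookup[key] = row
    merged = []
    for row in individual_rows:
        household = lookup.get(tuple(row[k] for k in keys), {})
        # Individual fields win if a column name appears in both files.
        merged.append({**household, **row})
    return merged


def track_members(individual_rows):
    """List the months in which each (hhid, id_member) pair was observed."""
    history = defaultdict(list)
    for row in individual_rows:
        history[(row["hhid"], row["id_member"])].append(row["month"])
    return {person: sorted(months) for person, months in history.items()}
```

    The duplicate-key check guards against accidentally multiplying individual rows when the household key is not unique.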

  2. Time-series water level and water quality data to accompany Scientific...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Nov 20, 2025
    Cite
    U.S. Geological Survey (2025). Time-series water level and water quality data to accompany Scientific Investigations Report 2018-5040 [Dataset]. https://catalog.data.gov/dataset/time-series-water-level-and-water-quality-data-to-accompany-scientific-investigations-2018
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This Data Release serves as a repository for a set of time-series data used in Scientific Investigations Report 2018-5040. The data represent continuous measurements of specific conductance, water temperature, and/or water level (stage), recorded by a variety of types of data loggers during three multi-day interference tests conducted on the Virgin River at Pah Tempe Springs during November 2013, February 2014, and November 2014. The data presented are the raw data downloaded from the data loggers and are organized according to the date of the test and the type and name of the observation site. The Data Release contains 3 items:

    1. An explanatory table, "PahTempe_table1.xlsx", which indicates which parameters were collected and on what instrument at each site during a given test.
    2. The data, "PahTempe_data.zip"; this zipped file contains the raw data logger files in comma-separated values (CSV) format, organized into folders according to the date of the interference pumping test.
    3. The metadata document, "PahTempe_metadata.xml".

    Because these data were collected during multi-day interference pumping tests, they do not represent natural hydrologic conditions in the river, springs, or shallow groundwater system. Users of these data are advised to refer to the larger work citation for proper use and interpretation of the data.
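
    A zipped folder of CSV logger files like the one described can be inspected without unpacking it. The sketch below is a hedged illustration using only Python's standard library: the helper names are hypothetical, the internal folder layout of "PahTempe_data.zip" is assumed rather than known, and UTF-8 encoding is an assumption about the logger files.

```python
import csv
import io
import zipfile
from collections import defaultdict


def csv_members_by_folder(archive):
    """Group the CSV members of an open ZipFile by their top-level folder."""
    groups = defaultdict(list)
    for name in archive.namelist():
        if name.lower().endswith(".csv"):
            # Zip member paths always use "/" as the separator.
            folder = name.split("/", 1)[0] if "/" in name else ""
            groups[folder].append(name)
    return dict(groups)


def read_rows(archive, member, encoding="utf-8"):
    """Read one CSV member into a list of rows (each row a list of strings)."""
    with archive.open(member) as raw:
        text = io.TextIOWrapper(raw, encoding=encoding, newline="")
        return list(csv.reader(text))


if __name__ == "__main__":
    # Assumes the archive has been downloaded to the working directory.
    with zipfile.ZipFile("PahTempe_data.zip") as archive:
        for folder, members in csv_members_by_folder(archive).items():
            print(folder, len(members))
```

    Raw logger exports often carry header lines before the data, so the rows returned here would still need per-instrument parsing.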

  3. Overview _data

    • opendata.utah.gov
    Updated Apr 30, 2020
    Cite
    (2020). Overview _data [Dataset]. https://opendata.utah.gov/dataset/Overview-_data/wp8y-fbbm
    Available download formats: csv, xml, application/geo+json, xlsx, kmz, kml
    Dataset updated
    Apr 30, 2020
    Description

    This data set depicts the Public Land Survey System (PLSS) for the state of Utah and is based on Geographic Coordinate Database (GCDB) coordinate data. It was created to provide continuous cadastre data for the state of Utah. This is Version 2.1 (2019) of the Utah PLSS Fabric and represents the GIS version of the Public Land Survey System. Updates are expected annually as horizontal control positions from published sources and global positioning system (GPS) observations are added. The primary source for the data is cadastral survey records housed by the BLM, supplemented with local records and geographic control coordinates from states and counties as well as other federal agencies such as the USGS and USFS. This data was originally published on 1/3/2017 and updated on 1/18/2019. These are the corner points of the PLSS. This data set contains summary information about the coordinate location and reliability of corner coordinate information. The information in the corner feature has been collected by the identified data steward. For more information about corner locations, credits and use limitations, contact the data steward identified in the corner feature.

  4. NACP Greenhouse Gases Multi-Source Data Compilation, 2000-2009 - Dataset -...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). NACP Greenhouse Gases Multi-Source Data Compilation, 2000-2009 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/nacp-greenhouse-gases-multi-source-data-compilation-2000-2009-67f9b
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This data set is a collection of measurements of carbon dioxide (CO2) and non-CO2 greenhouse gases made across North America by nine independent atmospheric monitoring networks from 2000 - 2009. During this North American Carbon Program (NACP) sponsored activity, data were compiled from the following networks: AGAGE, COBRA, CSIRO, INTEX-A, INTEX-B, Irvine Latitude Network, NOAA CMDL, SCRIPPS, and Stanley Tyler-UC Irvine. The files presented here are the products of merging multiple original measurement results files for selected sites across North America from each monitoring network. The primary focus of this effort was the compilation of non-CO2 greenhouse gases over North America, but numerous CO2 observations are also included. The data files for each network are accompanied by detailed readme documentation files prepared by the respective network investigators. Project descriptions, objectives, references, sampling and analysis methods, and data file descriptions are included in these READMEs. Table 1 in the documentation displays the monitoring network sites, sample types, analytes, and links to the detailed network README files. Network- and laboratory-specific data citations are included in the README documentation and should be used to acknowledge the use of these data as appropriate. The data files for each monitoring network and each sampling type (continuous or flasks) have been combined into one compressed (*.zip) file along with the detailed README document. There are 17 compressed files that, when expanded, contain data files which represent one year's data for that specific campaign and sampling method. The number of annual files that were compiled from a network into this collection varies.

  5. Spatial Prioritisation of Above Ground Carbon Storage 2023 (England)

    • environment.data.gov.uk
    • data.europa.eu
    zip
    Updated May 15, 2025
    Cite
    Natural England (2025). Spatial Prioritisation of Above Ground Carbon Storage 2023 (England) [Dataset]. https://environment.data.gov.uk/dataset/7de63441-20c3-487c-9326-5ac2c8a9e0ef
    Available download formats: zip
    Dataset updated
    May 15, 2025
    Dataset authored and provided by
    Natural England (http://www.gov.uk/natural-england)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Area covered
    England
    Description

    These spatial datasets consider the land's contribution to preventing and mitigating climate change through storage of carbon in vegetation (above ground). The Above Ground Carbon spatial datasets represent a strategic resource for England, indicating the range of carbon storage values in tonnes of carbon per hectare (t C ha-1) at a local scale (e.g. 1:50 000). They are presented as a series of raster datasets for use in GIS systems at a resolution of 25m2. These maps will help users find out where the most important carbon stores in vegetation are in their areas. They are not suitable for field-scale carbon mitigation, as this would require field-scale carbon assessment.

    Three data component layers were collated to form a continuous habitat data layer for England using the best freely available information on habitat types. These were: the National Forest Inventory (2016); the single layer priority habitat dataset (various dates); and the Living England habitat map from satellite imagery (2020). From the collation, each habitat type was scored in terms of the carbon it would likely store above ground (t carbon/ha). These data were taken from a very wide range of scientific studies but largely built on Carbon Storage and Sequestration by Habitat 2021 (NERR094). Where slopes are very steep (greater than 18°), the habitat classes identified by their tree species were scored slightly lower, because they tend to have thinner soils and support less above-ground tree growth. Where woodland is long established or managed for nature, the score was slightly enhanced, with locations taken from ancient woodland data (10% uplift) and the protected site data (10% uplift), including SSSI, SAV, LNR and NNR.
    NE PHI - OGL
    NE Living England - OGL
    NE Peat Map [2008] - Non-commercial licence
    Soilscapes - Cranfield University - NE Bespoke Licence
    SRTM - NASA Shuttle Radar Topography - Open Topography

  6. Printing Unit Condition Monitoring

    • zenodo.org
    • data.niaid.nih.gov
    csv, txt
    Updated Jan 24, 2020
    Cite
    Uwe Mönks (2020). Printing Unit Condition Monitoring [Dataset]. http://doi.org/10.5281/zenodo.55227
    Available download formats: txt, csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Uwe Mönks
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This data set contains raw sensor signals of four analogue sensors and five features derived from them. These are used to monitor the condition of a printing unit in a demonstrator application and to detect a sensor defect, which is simulated in the data set.

    The demonstrator is used to simulate the wiping process of an Intaglio printing process. Intaglio is the major printing process used to produce security prints such as banknotes. Engraved structures in the printing plates, which are mounted on a rotating plate cylinder, are filled with ink, which is transferred onto the printing substrate under high pressure. A second cylinder in the printing unit, called the wiping cylinder, is lubricated with a solvent and wipes surplus ink off the printing plates by rotating in the direction opposite to the plate cylinder. This process is crucial, as wiping errors immediately lead to print errors.

    The printing unit demonstrator contains models of the two cylinders, which are turned by electric drives. Pressure between the wiping cylinder having a rubber surface and the steel-surfaced plate cylinder is freely adjustable.
    A set of four analogue sensors (contact force, solid-borne sound, electric current of wiping and plate cylinder drives) continuously acquires data during operation to monitor the process.
    The sensors each output a continuous voltage signal in the range of [-10,10] V, which is proportional to the respective quantity the sensor is observing. Each signal's physical unit is therefore irrelevant and is omitted, as changes in the original quantity of interest are also reflected in the respective voltage signal.
    All output time-domain signals are synchronously and equidistantly sampled at a frequency of 20 kHz and quantised with a resolution of 16 bit.

    The acquired data is then split into non-overlapping batches of 50000 samples each (corresponding to 2.5 sec of operation). The length of the time frame was chosen to ensure that 3 revolutions of the plate cylinder are captured in each signal data batch. The solid-borne sound signal is transformed by the FFT to determine its frequency spectrum per signal batch. Altogether, 5 features per plate cylinder revolution are extracted. This results in 15 feature values per signal data batch. The extracted features are:

    • contact force mean: arithmetic mean of the contact force,
    • solid-borne sound intensity: root mean square of the solid-borne sound,
    • solid-borne sound maxPowFreqInd: index of the frequency component with largest power,
    • motor current wiping cylinder mean: arithmetic mean of the wiping cylinder motor current,
    • motor current plate cylinder mean: arithmetic mean of the plate cylinder motor current.

    Each plate cylinder revolution is represented by one instance in the feature data sets. That is, every instance in the data set is described by a vector of 5 feature values.
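
    The batching arithmetic and the five features can be sketched in a few lines. This is an illustrative reconstruction, not the author's code: the sampling rate, batch size, and feature definitions come from the description above, while the function names, the dictionary keys, the exclusion of the DC bin, and the use of a naive DFT in place of an FFT routine are assumptions made to keep the example dependency-free. How each revolution's samples are delimited within a batch is not stated, so the caller is assumed to supply one segment per revolution.

```python
import cmath
import math
from statistics import fmean

SAMPLE_RATE_HZ = 20_000   # sampling frequency stated in the description
BATCH_SIZE = 50_000       # samples per batch stated in the description
REVS_PER_BATCH = 3        # plate cylinder revolutions captured per batch
FEATURES_PER_REV = 5      # feature values extracted per revolution


def expected_counts(n_samples):
    """Return (batches, instances, feature values) for a raw signal length."""
    batches = n_samples // BATCH_SIZE
    instances = batches * REVS_PER_BATCH
    return batches, instances, instances * FEATURES_PER_REV


def rms(values):
    """Root mean square of a sequence of samples."""
    return math.sqrt(fmean(v * v for v in values))


def max_power_freq_index(values):
    """Index of the non-DC frequency bin with the largest power.

    Uses a naive O(n^2) DFT for clarity; an FFT would be used in practice.
    """
    n = len(values)
    powers = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        powers[k] = abs(coeff) ** 2
    return max(powers, key=powers.get)


def extract_features(force, sound, wiping_current, plate_current):
    """Compute the five features for one segment of the four sensor signals."""
    return {
        "contact_force_mean": fmean(force),
        "sound_intensity": rms(sound),
        "sound_max_pow_freq_ind": max_power_freq_index(sound),
        "wiping_current_mean": fmean(wiping_current),
        "plate_current_mean": fmean(plate_current),
    }
```

    With these constants, a batch lasts 50000 / 20000 = 2.5 s, and the counts reported for the two experiments follow directly from `expected_counts`.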

    The raw data and feature data sets are divided into two parts, each containing data of one of the two experiments under different operation conditions:

    • Static printing unit demonstrator operation:
      The static experiment observes the printing unit demonstrator during 20:13 min of operation. The printing unit demonstrator was started immediately before the data acquisition began. No additional manipulations or events occurred during the experiment. Therefore, only data representing the demonstrator's normal condition is contained in the data set. It contains 10,000,000 raw signal samples resulting in 600 instances (plate cylinder revolutions), which are in summary described by 3,000 feature values.
    • Manipulated printing unit demonstrator operation:
      The printing unit demonstrator was started ca. 23:00 min before the data acquisition began. During this 10:31 min long experiment, the demonstrator application was intentionally manipulated. In addition, the solid-borne sound sensor signal was manipulated through low-pass filtering in order to simulate a defect of this sensor. An unintended incident also occurred during this experiment. Therefore, data representing both the demonstrator's normal and abnormal conditions are contained in the data set. The sequence of events along with an objective classification of the demonstrator condition by the human experimenter is summarised in the file PrintingUnit_manip_events.txt. The data set contains 5,950,000 raw signal samples, which are in summary described by 1,785 feature values.

    File name conventions and contents

    The files in the data set are organised such that each row represents a data set instance and each column represents the respective sensor or feature:

    • PrintingUnitData*.csv: These files contain raw sensor signals.
    • PrintingUnitFeatures*.csv: These files contain the features extracted from the sensor signals.

    Additional files contain information about:

    • *_condition.csv: The condition of the printing unit is indicated by 'n' (normal condition) or 'a' (abnormal condition). These are the labels of the data set instances.
    • *_filter.csv: The solid-borne sound filter status is indicated by '1' (filter activated) or '0' (filter deactivated).
    • *_time.csv: The relative time at which the respective instance of the data set was determined. It is represented as the number of days from January 0, 0000, as returned by MATLAB's datenum function (cf. http://www.mathworks.com/help/matlab/ref/datenum.html for details).
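
    MATLAB datenum values can be converted to ordinary timestamps outside MATLAB. The sketch below shows one common conversion in Python; the function name is hypothetical, and the example assumes the values in the *_time.csv files are plain datenum floats as described.

```python
from datetime import datetime, timedelta

# MATLAB's datenum counts days from "January 0, 0000", whereas Python's
# proleptic Gregorian ordinal counts from January 1 of year 1 (ordinal 1).
# The two day counts differ by 366.
MATLAB_TO_PYTHON_ORDINAL_OFFSET = 366


def datenum_to_datetime(datenum):
    """Convert a MATLAB datenum (possibly fractional) to a naive datetime."""
    whole_days = int(datenum)
    fraction = datenum - whole_days
    return (
        datetime.fromordinal(whole_days)
        + timedelta(days=fraction)
        - timedelta(days=MATLAB_TO_PYTHON_ORDINAL_OFFSET)
    )


if __name__ == "__main__":
    print(datenum_to_datetime(730486.0))  # datenum of 1 January 2000
```

    Subtracting two converted values gives the elapsed time between instances as a `timedelta`.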

    The operation conditions (with respect to the experiment, cf. above) are distinguished by

    • *_static*.csv: Static printing unit demonstrator operation.
    • *_manip*.csv: Manipulated printing unit demonstrator operation.

  7. UK Environmental Change Network (ECN) frog data: 1994-2015

    • catalogue.ceh.ac.uk
    • gimi9.com
    • +3more
    text/directory
    Updated Nov 9, 2017
    Cite
    S. Rennie; J. Adamson; R. Anderson; C. Andrews; J. Bater; N. Bayfield; K. Beaton; D. Beaumont; S. Benham; V. Bowmaker; C. Britt; R. Brooker; D. Brooks; J. Brunt; G. Common; R. Cooper; S. Corbett; N. Critchley; P. Dennis; J. Dick; B. Dodd; N. Dodd; N. Donovan; J. Easter; M. Flexen; A. Gardiner; D. Hamilton; P. Hargreaves; M. Hatton-Ellis; M. Howe; J. Kahl; M. Lane; S. Langan; D. Lloyd; B. McCarney; Y. McElarney; C. McKenna; S. McMillan; F. Milne; L. Milne; M. Morecroft; M. Murphy; A. Nelson; H. Nicholson; D. Pallett; D. Parry; I. Pearce; G. Pozsgai; R. Rose; S. Schafer; T. Scott; L. Sherrin; C. Shortall; R. Smith; P. Smith; R. Tait; C. Taylor; M. Taylor; M. Thurlow; A. Turner; K. Tyson; H. Watson; M. Whittaker; C. Wood (2017). UK Environmental Change Network (ECN) frog data: 1994-2015 [Dataset]. http://doi.org/10.5285/4d8c7dd9-8248-46ca-b988-c1fc38e51581
    Available download formats: text/directory
    Dataset updated
    Nov 9, 2017
    Dataset provided by
    NERC EDS Environmental Information Data Centre
    Authors
    S. Rennie; J. Adamson; R. Anderson; C. Andrews; J. Bater; N. Bayfield; K. Beaton; D. Beaumont; S. Benham; V. Bowmaker; C. Britt; R. Brooker; D. Brooks; J. Brunt; G. Common; R. Cooper; S. Corbett; N. Critchley; P. Dennis; J. Dick; B. Dodd; N. Dodd; N. Donovan; J. Easter; M. Flexen; A. Gardiner; D. Hamilton; P. Hargreaves; M. Hatton-Ellis; M. Howe; J. Kahl; M. Lane; S. Langan; D. Lloyd; B. McCarney; Y. McElarney; C. McKenna; S. McMillan; F. Milne; L. Milne; M. Morecroft; M. Murphy; A. Nelson; H. Nicholson; D. Pallett; D. Parry; I. Pearce; G. Pozsgai; R. Rose; S. Schafer; T. Scott; L. Sherrin; C. Shortall; R. Smith; P. Smith; R. Tait; C. Taylor; M. Taylor; M. Thurlow; A. Turner; K. Tyson; H. Watson; M. Whittaker; C. Wood
    License

    https://eidc.ac.uk/licences/ogl/plain

    Description

    Frog data from the UK Environmental Change Network (ECN) terrestrial sites. Variables measured include phenology (i.e. the dates when frogs start congregating, spawning, when hatching occurs and when the frogs leave), number of spawn masses, total surface area covered by spawn, percentage of dead spawn, depth, pH, conductivity, alkalinity, aluminium, calcium, chloride, ammonium, nitrate nitrogen, phosphate phosphorus, potassium, sulphate sulphur, sodium, total nitrogen and total dissolved phosphorus. These data are collected at ECN's terrestrial sites using a standard protocol. They represent continuous records from 1994 to 2015. ECN is the UK's long-term environmental monitoring programme. It is a multi-agency programme sponsored by a consortium of fourteen government departments and agencies. These organisations contribute to the programme by funding site monitoring and/or network co-ordination activities. They are: Agri-Food and Biosciences Institute, Biotechnology and Biological Sciences Research Council, Cyfoeth Naturiol Cymru - Natural Resources Wales, Defence Science & Technology Laboratory, Department for Environment, Food and Rural Affairs, Environment Agency, Forestry Commission, Llywodraeth Cymru - Welsh Government, Natural England, Natural Environment Research Council, Northern Ireland Environment Agency, Scottish Environment Protection Agency, Scottish Government and Scottish Natural Heritage.

  8. Iris Flower Species

    • kaggle.com
    zip
    Updated Oct 5, 2024
    Cite
    Tavish (2024). Iris Flower Species [Dataset]. https://www.kaggle.com/datasets/tavish7/iris-uci/code
    Available download formats: zip (3904 bytes)
    Dataset updated
    Oct 5, 2024
    Authors
    Tavish
    Description

    The Iris dataset is a commonly used dataset in machine learning and statistics and is among the first datasets used in the literature on classification techniques. It was used in R.A. Fisher's classic 1936 paper.

    The data set has three classes, each with 50 occurrences for a total of 150 instances, and each class represents a certain kind of iris flower species. The dataset is accessible and taken from the UCI Machine Learning Repository.

    There are a total of four features in the dataset: sepal length, sepal width, petal length, and petal width, each with a continuous data type and units in cm, and one target variable (class) that is categorical in nature. For the target variable ‘class’ we have three categories namely, Iris Setosa, Iris Versicolour and Iris Virginica.

  9. Labor Force Survey, LFS 2017 - Palestine

    • erfdataportal.com
    Updated Mar 22, 2021
    Cite
    Palestinian Central Bureau of Statistics (2021). Labor Force Survey, LFS 2017 - Palestine [Dataset]. https://www.erfdataportal.com/index.php/catalog/170
    Explore at:
    Dataset updated
    Mar 22, 2021
    Dataset provided by
    Palestinian Central Bureau of Statistics (https://pcbs.gov/)
    Economic Research Forum
    Time period covered
    2017
    Area covered
    Palestine
    Description

    Abstract

    THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE PALESTINIAN CENTRAL BUREAU OF STATISTICS

    The Palestinian Central Bureau of Statistics (PCBS) carried out four rounds of the Labor Force Survey 2017 (LFS). The survey rounds covered a total sample of about 23,120 households (5,780 households per quarter).

    The main objective of collecting data on the labour force and its components, including employment, unemployment and underemployment, is to provide basic information on the size and structure of the Palestinian labour force. Data collected at different points in time provide a basis for monitoring current trends and changes in the labour market and in the employment situation. These data, supported with information on other aspects of the economy, provide a basis for the evaluation and analysis of macro-economic policies.

    The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.

    Geographic coverage

    Covering a representative sample on the region level (West Bank, Gaza Strip), the locality type (urban, rural, camp) and the governorates.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey covered all Palestinian households whose usual residence is in the Palestinian Territory.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The methodology was designed according to the context of the survey, international standards, data processing requirements and comparability of outputs with other related surveys.

    ---> Target Population: It consists of all individuals aged 10 years and above who normally reside with their households in the State of Palestine during 2017.

    ---> Sampling Frame: The sampling frame consists of the master sample, which was updated in 2011: each enumeration area consists of buildings and housing units with an average of about 124 households. The master sample consists of 596 enumeration areas; we used 494 enumeration areas as a framework for the labor force survey sample in 2017 and these units were used as primary sampling units (PSUs).

    ---> Sample Size: The estimated sample size is 5,780 households in each quarter of 2017.

    ---> Sample Design: The sample is a two-stage stratified cluster sample. First stage: a systematic random sample of 494 enumeration areas was selected for the whole round, excluding enumeration areas with fewer than 40 households. Second stage: a systematic random sample of households was selected from each enumeration area chosen in the first stage: 16 households from enumeration areas with 80 or more households, and 8 households from enumeration areas with fewer than 80 households.
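
    The second-stage selection can be illustrated with a small sketch of systematic random sampling. It is not PCBS's implementation: the size thresholds (40 and 80 households) and the take sizes (8 and 16) come from the description above, while the function names and the fractional-interval variant of systematic sampling are assumptions.

```python
import random

MIN_EA_SIZE = 40     # enumeration areas smaller than this are excluded
LARGE_EA_SIZE = 80   # areas at or above this size contribute 16 households


def households_to_select(ea_size):
    """Number of households to draw from an enumeration area of a given size."""
    if ea_size < MIN_EA_SIZE:
        return 0
    return 16 if ea_size >= LARGE_EA_SIZE else 8


def systematic_sample(units, n, rng):
    """Draw n units at a fixed interval after a random start."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    step = len(units) / n
    start = rng.random() * step  # random start in [0, step)
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(2017)
    households = list(range(1, 125))  # an area with 124 listed households
    take = households_to_select(len(households))
    print(systematic_sample(households, take, rng))
```

    Because the interval is at least one unit whenever n does not exceed the list length, the selected households are distinct and spread evenly across the listing order.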

    ---> Sample strata: The population was divided by: 1- Governorate (16 governorates) 2- Type of Locality (urban, rural, refugee camps).

    ---> Sample Rotation: Each round of the Labor Force Survey covers all of the 494 master sample enumeration areas. The areas remain fixed over time, but households in 50% of the EAs are replaced in each round. The same households remain in the sample for two consecutive rounds, are left out for the next two rounds, and are then selected for another two consecutive rounds before being dropped from the sample. An overlap of 50% is thus achieved both between consecutive rounds and between consecutive years (making the sample efficient for monitoring purposes).
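
    The two-in, two-out, two-in rotation can be checked with a short sketch. It assumes, for illustration only, that one equal-sized cohort of households enters the sample in every round; the function names are hypothetical.

```python
# A household cohort is interviewed in the round it enters and in the rounds
# 1, 4 and 5 rounds later: two rounds in, two rounds out, two rounds in.
IN_SAMPLE_OFFSETS = (0, 1, 4, 5)


def is_interviewed(entry_round, current_round):
    """True if a cohort that entered in entry_round is in the sample now."""
    return (current_round - entry_round) in IN_SAMPLE_OFFSETS


def active_cohorts(current_round):
    """Entry rounds of all cohorts interviewed in current_round."""
    return {current_round - offset for offset in IN_SAMPLE_OFFSETS}


def overlap_share(round_a, round_b):
    """Share of round_a's cohorts that are also interviewed in round_b."""
    a, b = active_cohorts(round_a), active_cohorts(round_b)
    return len(a & b) / len(a)


if __name__ == "__main__":
    print([r for r in range(1, 9) if is_interviewed(1, r)])
    print(overlap_share(10, 11), overlap_share(10, 14))
```

    Under that assumption, half of the cohorts are shared between consecutive rounds and between the same quarter of consecutive years (four rounds apart), which is consistent with the 50% overlap described above.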

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The survey questionnaire was designed according to the International Labour Organization (ILO) recommendations. The questionnaire includes four main parts:

    ---> 1. Identification Data: The main objective for this part is to record the necessary information to identify the household, such as, cluster code, sector, type of locality, cell, housing number and the cell code.

    ---> 2. Quality Control: This part involves groups of controlling standards to monitor the field and office operation and to keep in order the sequence of questionnaire stages (data collection, field and office coding, data entry, editing after entry, and storing the data).

    ---> 3. Household Roster: This part involves demographic characteristics about the household, like number of persons in the household, date of birth, sex, educational level…etc.

    ---> 4. Employment Part: This part involves the major research indicators, where one questionnaire was answered by every household member aged 15 years and over, to explore their labour force status and identify their major characteristics with respect to employment status, economic activity, occupation, place of work, and other employment indicators.

    Cleaning operations

    ---> Raw Data: PCBS started collecting data in the 1st quarter of 2017 using hand-held devices (HHD) in Palestine, excluding Jerusalem inside the borders (J1) and Gaza Strip. The HHD program was built with SQL Server and Microsoft .NET and was developed by the General Directorate of Information Systems. Using HHD reduced the data processing stages: the fieldworkers collect data and send it directly to the server, and the project manager can then withdraw the data at any time. In order to work in parallel with Gaza Strip and Jerusalem inside the borders (J1), an office program was developed using the same techniques and the same database as the HHD.

    ---> Harmonized Data
    - The SPSS package is used to clean and harmonize the datasets.
    - The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.
    - All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.
    - A country-specific program is generated for each dataset to generate/compute/recode/rename/format/label harmonized variables.
    - A post-harmonization cleaning process is then conducted on the data.
    - Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.

    Response rate

    The survey sample consists of about 30,230 households, of which 23,120 households completed the interview: 14,682 households in the West Bank and 8,438 households in Gaza Strip. Weights were modified to account for the non-response rate. The response rate in the West Bank reached 82.4% while in the Gaza Strip it reached 92.7%.

    Sampling error estimates

    ---> Sampling Errors: Data of this survey may be affected by sampling errors due to use of a sample and not a complete enumeration. Therefore, certain differences can be expected in comparison with the real values obtained through censuses. Variances were calculated for the most important indicators: the variance table is attached with the final report. There is no problem in disseminating results at national or governorate level for the West Bank and Gaza Strip.

    ---> Non-Sampling Errors: Non-statistical errors are probable in all stages of the project, during data collection or processing. These include non-response errors, response errors, interviewing errors, and data entry errors. To avoid errors and reduce their effects, great efforts were made to train the fieldworkers intensively. Training covered how to carry out the interview, what to discuss and what to avoid, and included a pilot survey as well as practical and theoretical sessions during the training course. Data entry staff were also trained on the data entry program, which was examined before starting the data entry process. To stay in contact with progress of fieldwork activities and to limit obstacles, there was continuous contact with the fieldwork team through regular visits to the field and regular meetings with them during the different field visits. Problems faced by fieldworkers were discussed to clarify any issues. Non-sampling errors can occur at the various stages of survey implementation, whether in data collection or in data processing. They are generally difficult to evaluate statistically.

    They cover a wide range of errors, including errors resulting from non-response, sampling frame coverage, coding and classification, data processing, and survey response (both respondent and interviewer-related). The use of effective training and supervision and the careful design of questions have direct bearing on limiting the magnitude of non-sampling errors, and hence enhancing the quality of the resulting data. The survey encountered non-response; the cases "household was not present at home during the fieldwork visit" and "housing unit is vacant" made up the highest percentage of non-response cases. The total non-response rate reached 14.2%, which is very low compared to the household surveys conducted by PCBS. The refusal rate reached 3.0%, which is a very low percentage compared to the

  10. NARSTO EPA Supersite (SS) Baltimore, Johns Hopkins University Meteorolgical...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). NARSTO EPA Supersite (SS) Baltimore, Johns Hopkins University Meteorolgical Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/narsto-epa-supersite-ss-baltimore-johns-hopkins-university-meteorolgical-data-b7da3
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The NARSTO_EPA_SS_BALTIMORE_JHU_MET_DATA is the North American Research Strategy for Tropospheric Ozone (NARSTO) Environmental Protection Agency (EPA) Supersite (SS) Baltimore, Johns Hopkins University Meteorolgical Data product. This product contains meteorological and turbulence measurements that were recorded using a diverse array of instruments by the Parlange Environmental Fluid Mechanics Group, Department of Geography and Environmental Engineering, JHU at the EPA Baltimore Supersite. Measurements were made at three Baltimore locations over the indicated time intervals: FMC Corporation (May 26 - June 15, 2001), Clifton Park (July 1 - September 14, 2001), and Ponca Street (February 13, 2002 - March 15, 2003). The instruments were mounted on an 11m tall meteorological tower on the site. The instrumentation consisted of a 3D sonic anemometer-thermometer, pyranometer, wind vane, tipping bucket rain collector, 2 cup anemometers, temperature and relative humidity probe and pressure sensor. The data were collected on a continuous basis and were subsequently subjected to multiple cycles of data validation to ensure correctness and accuracy. The validated data were then averaged over a 5 minute interval to create the final data set. The data set is organized to provide a unique data file for any given day within the operating time duration. Each file contains the variables temperature, relative humidity, mean horizontal wind speed (at 10.39m), horizontal resultant vector mean wind speed, mean horizontal wind speed (at 5.87m), mean horizontal wind angle, std deviation of the wind angle, precipitation, friction velocity, Obukhov length, sensible vertical heat flux, solar radiation, atmospheric pressure, virtual potential temperature, specific humidity and wind angle from sonic anemometer.
In addition to usual meteorological variables, this data set also provides information on turbulent mixing (parameterized by the friction velocity) and atmospheric stability (parameterized by the Obukhov length). The Baltimore Supersite collected high-quality ambient air quality measurements with unprecedented temporal resolution at an industrially influenced urban site and two intensive measurement campaigns. A data set of project results was constructed to take advantage of advanced multivariate statistical techniques. Data were collected on the sources and nature of organic aerosol for the region, and large quantities of urban particulate matter (PM) were collected for retrospective chemical, physical, and biological analyses and for toxicological testing. These data provided important information on the potential health effects of particles to support exposure and epidemiological studies for enhanced evaluation of health outcome, pollutant, and source relationships. The EPA PM Supersites Program was an ambient air monitoring research program designed to provide information of value to the atmospheric sciences, and human health and exposure research communities. Eight geographically diverse projects were chosen to specifically address the following EPA research priorities: (1) to characterize PM, its constituents, precursors, co-pollutants, atmospheric transport, and its source categories that affect the PM in any region; (2) to address the research questions and scientific uncertainties about PM source-receptor and exposure-health effects relationships; and (3) to compare and evaluate different methods of characterizing PM including testing new and emerging measurement methods. NARSTO, which has since disbanded, was a public/private partnership, whose membership spanned across government, utilities, industry, and academe throughout Mexico, the United States, and Canada.
The primary mission was to coordinate and enhance policy-relevant scientific research and assessment of tropospheric pollution behavior; activities provide input for science-based decision-making and determination of workable, efficient, and effective strategies for local and regional air-pollution management. Data products from local, regional, and international monitoring and research programs are still available.
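
    The 5-minute averaging step mentioned in the description can be illustrated with a short sketch. The function name and the sample readings below are hypothetical, and the sketch applies a plain arithmetic mean, which suits scalar variables such as temperature; angular quantities such as wind direction would need a vector mean instead, and the averaging procedure actually used for this product is not documented here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def five_minute_means(samples):
    """Average (timestamp, value) pairs into 5-minute bins keyed by bin start."""
    bins = defaultdict(list)
    for ts, value in samples:
        # Floor the timestamp to the start of its 5-minute interval
        start = ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
        bins[start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(bins.items())}

t0 = datetime(2001, 7, 1, 12, 0, 0)
# Hypothetical 1-minute temperature readings, not values from the data set
readings = [(t0 + timedelta(minutes=m), 20.0 + m) for m in range(10)]
means = five_minute_means(readings)
```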

  11. Data from: Consumer Expenditure Survey, 2004: Diary Survey

    • icpsr.umich.edu
    Updated Aug 1, 2013
    Cite
    United States Department of Labor. Bureau of Labor Statistics (2013). Consumer Expenditure Survey, 2004: Diary Survey [Dataset]. http://doi.org/10.3886/ICPSR04415.v2
    Explore at:
    Dataset updated
    Aug 1, 2013
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States Department of Labor. Bureau of Labor Statistics
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/4415/terms

    Time period covered
    2004
    Area covered
    United States
    Description

    The Consumer Expenditure Survey (CE) program provides a continuous and comprehensive flow of data on the buying habits of American consumers including data on their expenditures, income, and consumer unit (families and single consumers) characteristics. These data are used widely in economic research and analysis, and in support of revisions of the Consumer Price Index. The Consumer Expenditure Survey (CE) program is comprised of two separate components (each with its own survey questionnaire and independent sample), the Diary Survey and the quarterly Interview Survey (ICPSR 4416). This data collection contains the Diary Survey data, which was designed to obtain data on frequently purchased smaller items, including food and beverages (both at home and in food establishments), gasoline, housekeeping supplies, tobacco, nonprescription drugs, and personal care products and services. Each consumer unit (CU) recorded its expenditures in a diary for two consecutive 1-week periods. Although the diary was designed to collect information on expenditures that could not be easily recalled over time, respondents were asked to report all expenses (except overnight travel) that the CU incurred during the survey week. The microdata in this collection are available as SAS, SPSS, and STATA datasets or ASCII comma-delimited files. The 2004 Diary release contains five sets of data files (FMLY, MEMB, EXPN, DTAB, DTAB_IMPUTE) and three processing files. The FMLY, MEMB, EXPN, DTAB, and DTAB_IMPUTE files are organized by the quarter of the calendar year in which the data were collected. There are four quarterly datasets for each of these files. 
The FMLY files contain CU characteristics, income, and summary level expenditures; the MEMB files contain member characteristics and income data; the EXPN files contain detailed weekly expenditures at the Universal Classification Code (UCC) level; the DTAB files contain the CU's reported income values or the mean of the five imputed income values in the multiple imputation method; and the DTAB_IMPUTE files contain the five imputed income values. Please note that the summary level expenditure and income information on the FMLY files permits the data user to link consumer spending, by general expenditure category, and household characteristics and demographics on one set of files. The three processing files enhance computer processing and tabulation of data, and provide descriptive information on item codes. The three processing files are: (1) an aggregation scheme file used in the published consumer expenditure tables (DSTUB), (2) a UCC file that contains UCCs and their abbreviated titles, identifying the expenditure, income, or demographic item represented by each UCC, and (3) a sample program file that contains the computer program used in Section VII "MICRODATA VERIFICATION AND ESTIMATION METHODOLOGY" of the Diary User Guide. The processing files are further explained in Section III.E.5. "PROCESSING FILES" of the same User Guide documentation. There is also a second user guide, User's Guide to Income Imputation in the CE, which includes information on how to appropriately use the imputed income data. Demographic and family characteristics data include age, sex, race, marital status, and CU relationships for each CU member. Income information, such as wage, salary, unemployment compensation, child support, and alimony, as well as information on the employment of each CU member age 14 and over was also collected.

  12. Continuous MODIS land surface temperature dataset over the Eastern...

    • data.niaid.nih.gov
    Updated Feb 11, 2021
    Cite
    Shilo Shiff; Lensky, M Itamar; Helman, David (2021). Continuous MODIS land surface temperature dataset over the Eastern Mediterranean [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3583123
    Explore at:
    Dataset updated
    Feb 11, 2021
    Dataset provided by
    Department of Geography and Environment, Bar-Ilan University, Ramat Gan, Israel
    Department of Soil and Water Sciences, The Roberth H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
    Authors
    Shilo Shiff; Lensky, M Itamar; Helman, David
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Eastern Mediterranean
    Description

    A continuous dataset of Land Surface Temperature (LST) is vital for climatological and environmental studies. LST can be regarded as a combination of seasonal mean temperature (climatology) and daily anomaly, which is attributed mainly to the synoptic-scale atmospheric circulation (weather). To reproduce LST in cloudy pixels, time series (2002-2019) of cloud-free 1km MODIS Aqua LST images were generated and the pixel-based seasonality (climatology) was calculated using temporal Fourier analysis. To add the anomaly, we used the NCEP Climate Forecast System Version 2 (CFSv2) model, which provides air surface temperature under both cloudy and clear sky conditions. The combination of the two sources of data enables the estimation of LST in cloudy pixels.

    Data structure

    The dataset consists of geo-located continuous LST (Day, Night and Daily), with LST values estimated for cloudy pixels. The spatial domain of the data is the Eastern Mediterranean, at the resolution of the MYD11A1 product (~1 km). Data are stored in GeoTIFF format as signed 16-bit integers using a scale factor of 0.02, with one file per day, each defined by 4 dimensions (Night LST Cont., Day LST Cont., Daily Average LST Cont., QA). The QA band stores information about the presence of cloud in the original pixel. If both original files, Day LST and Night LST, had NoData due to clouds, then the QA value is 0. A QA value of 1 indicates NoData in the original Day LST, 2 indicates NoData in the original Night LST, and 3 indicates valid data in both day and night. File names follow this naming convention: LST_  .tif, where  represents the year, represents the month and represents the day. Files of each year (2002-2019) are compressed in a ZIP file. The same data is also provided in NetCDF format; each file represents a whole year and consists of 4 bands (Night LST Cont., Day LST Cont., Daily Average LST Cont., QA) for each day.
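
    As a rough illustration of the storage scheme above, the sketch below applies the stated scale factor of 0.02 to raw 16-bit values and interprets the QA codes. The function name and the raw pixel values are hypothetical, the band order follows the description, and the physical unit of the scaled values is not stated in this description, so none is assumed here.

```python
SCALE_FACTOR = 0.02  # scale factor stated in the dataset description

# QA codes as described for the QA band
QA_MEANING = {
    0: "NoData in both original Day and Night LST",
    1: "NoData in original Day LST",
    2: "NoData in original Night LST",
    3: "valid data in both original Day and Night LST",
}

def decode_pixel(night_raw, day_raw, daily_raw, qa):
    """Scale the three raw LST bands of one pixel and describe its QA code."""
    return {
        "night": night_raw * SCALE_FACTOR,
        "day": day_raw * SCALE_FACTOR,
        "daily_mean": daily_raw * SCALE_FACTOR,
        "qa": QA_MEANING[qa],
    }

# Hypothetical raw values for a single pixel, not taken from the dataset
pixel = decode_pixel(night_raw=1000, day_raw=1500, daily_raw=1250, qa=1)
```

    In practice the bands would be read from the GeoTIFF or NetCDF files with a raster library; that step is left out because the choice of library is outside what the description specifies.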

    The file LSTcont_validation.tif contains the validation dataset in which the MAE, RMSE, and Pearson (r) of the validation with true LST are provided. Data are stored in GeoTIFF format as signed 32-bit floats, with the same spatial extent and resolution as the LSTcont dataset. These data are stored with one file containing three bands (MAE, RMSE, and Perarson_r). The same data with the same structure is also provided in NetCDF format.

    How to use

    The data can be read in various programming languages such as Python, IDL, Matlab, etc., and can be visualized in a GIS program such as ArcGIS or QGIS. A short animation demonstrating how to visualize the data using the open-source QGIS program is available in the project GitHub code repository.

    Web application

    The LSTcont web application (https://shilosh.users.earthengine.app/view/continuous-lst) is an Earth Engine app. The interface includes a map and a date picker. The user can select a date (July 2002 – present) and visualize LSTcont for that day anywhere on the globe. The web app calculates LSTcont on the fly based on ready-made global climatological files. The LSTcont can be downloaded as a GeoTIFF with 5 bands in this order: Mean daily LSTcont, Night original LST, Night LSTcont, Day original LST, Day LSTcont.

    Code availability

    Datasets for other regions can be easily produced on the GEE platform with the code provided in the project GitHub code repository.

  13. CALY-SWE: Discrete choice experiment and time trade-off data for a...

    • researchdata.se
    • data.europa.eu
    Updated Sep 24, 2024
    Cite
    Kaspar Walter Meili; Lars Lindholm (2024). CALY-SWE: Discrete choice experiment and time trade-off data for a representative Swedish value set [Dataset]. http://doi.org/10.5878/asxy-3p37
    Explore at:
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Umeå University
    Authors
    Kaspar Walter Meili; Lars Lindholm
    Time period covered
    Jan 8, 2022 - Apr 18, 2022
    Area covered
    Sweden
    Description

    The data consist of two parts: Time trade-off (TTO) data with one row per TTO question (5 questions), and discrete choice experiment (DCE) data with one row per question (6 questions). The purpose of the data is the calculation of a Swedish value set for the capability-adjusted life years (CALY-SWE) instrument. To protect the privacy of the study participants and to comply with GDPR, access to the data is given upon request.

    The data is provided in 4 .csv files with the names:

    • tto.csv (252 kB)
    • dce.csv (282 kB)
    • weights_final_model.csv (30 kB)
    • coefs_final_model.csv (1 kB)

    The first two files (tto.csv, dce.csv) contain the time trade-off (TTO) answers and discrete choice experiment (DCE) answers of participants. The latter two files (weights_final_model.csv, coefs_final_model.csv) contain the generated value set of CALY-SWE weights, and the pertaining coefficients of the main effects additive model.

    Background:

    CALY-SWE is a capability-based instrument for studying Quality of Life (QoL). It consists of 6 attributes (health, social relations, financial situation & housing, occupation, security, political & civil rights) and offers 3 answer levels for each attribute (Agree, Agree partially, Do not agree). A configuration or state is one of the 3^6 = 729 possible situations that the instrument describes. Here, a config is denoted in the form xxxxxx, with one x for each attribute in the order above, where x is a digit corresponding to the level of the respective attribute, with 3 being the highest (Agree), and 1 being the lowest (Do not agree). For example, 222222 encodes a configuration with all attributes on level 2 (Partially agree). The purpose of this dataset is to support the publication of the CALY-SWE value set and to enable reproduction of the calculations (due to privacy concerns we abstain from publishing individual level characteristics). A value set consists of values on the 0 to 1 scale for all 729 configurations, each of which represents a quality weighting where 1 is the highest capability-related QoL, and 0 the lowest capability-related QoL.
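
    To make the configuration notation concrete, the short sketch below enumerates every six-digit configuration. The attribute abbreviations are the ones used in coefs_final_model.csv; the variable names are otherwise arbitrary.

```python
from itertools import product

# Attribute abbreviations in the order used by the instrument
ATTRIBUTES = ["health", "socrel", "finhou", "occu", "secu", "polciv"]

# Every configuration is a string of six digits, each 1 (lowest) to 3 (highest)
configs = ["".join(levels) for levels in product("123", repeat=len(ATTRIBUTES))]
```

    The list runs from 111111 to 333333 and contains 3^6 = 729 entries, one per possible state.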

    The data contains answers to two types of questions: TTO and DCE.

    In TTO questions, participants iteratively chose a number of years between 1 and 10. A choice of x years means that living x years with full capability (state configuration 333333) is regarded as equivalent to living 10 years in the capability state that the TTO question describes. The answer on the 0 to 1 scale is then calculated as x/10. In the DCE questions, participants were given two states and chose the state that they found to be better. We used a hybrid model with a linear regression and a logit model component, where the coefficients were linked through a multiplicative factor, to obtain the weights (weights_final_model.csv). Each weight is calculated as the constant plus the coefficients for the respective configuration. Coefficients for level 3 encode the difference to level 2, and coefficients for level 2 the difference to the constant. For example, the weight for 123112 is calculated as constant + socrel2 + finhou2 + finhou3 + polciv2 (no coefficients for health, occupation, and security are involved, as they are on level 1, which is captured in the constant/intercept).
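
    The additive rule for the weights can be sketched as follows. The coefficient values and the key "constant" are hypothetical placeholders chosen for illustration (the real values are in coefs_final_model.csv, whose exact name for the intercept is not given here); only the structure of the sum follows the description.

```python
ATTRIBUTES = ["health", "socrel", "finhou", "occu", "secu", "polciv"]

def caly_weight(config, coefs):
    """Sum the constant and the incremental coefficients up to each attribute's level."""
    total = coefs["constant"]
    for attr, level in zip(ATTRIBUTES, config):
        # Level 2 adds attr2; level 3 adds attr2 and attr3; level 1 adds nothing
        for step in range(2, int(level) + 1):
            total += coefs[f"{attr}{step}"]
    return total

def tto_answer(years):
    """Convert a TTO choice of x years into the 0 to 1 scale (x/10)."""
    return years / 10

# Hypothetical coefficients, NOT the published CALY-SWE estimates
coefs = {"constant": 0.1}
for attr in ATTRIBUTES:
    coefs[f"{attr}2"] = 0.05
    coefs[f"{attr}3"] = 0.10

w = caly_weight("123112", coefs)
expected = (coefs["constant"] + coefs["socrel2"] + coefs["finhou2"]
            + coefs["finhou3"] + coefs["polciv2"])
```

    With this encoding the configuration 111111 returns the constant alone, which matches the statement that level 1 is captured in the intercept.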

    To assess the quality of TTO answers, we calculated a score per participant that takes into account inconsistencies in answering the TTO question. We then excluded 20% of participants with the worst score to improve the TTO data quality and signal strength for the model (this is indicated by the 'included' variable in the TTO dataset). Details of the entire survey are described in the preprint “CALY-SWE value set: An integrated approach for a valuation study based on an online-administered TTO and DCE survey” by Meili et al. (2023). Please check this document for updated versions.

    Ids have been randomized with preserved linkage between the DCE and TTO dataset.

    Data files and variables:

    Below is a description of the variables in each CSV file.

    • tto.csv:

    config: 6 numbers representing the attribute levels.
    position: The number of the asked TTO question.
    tto_block: The design block of the TTO question.
    answer: The equivalence value indicated by the participant, ranging from 0.1 to 1 in steps of 0.1.
    included: If the answer was included in the data for the model to generate the value set.
    id: Randomized id of the participant.

    • dce.csv:

    config1: Configuration of the first state in the question.
    config2: Configuration of the second state in the question.
    position: The number of the asked TTO question.
    answer: Whether state 1 or 2 was preferred.
    id: Randomized id of the participant.

    • weights_final_model.csv:

    config: 6 numbers representing the attribute levels.
    weight: The weight calculated with the final model.
    ciu: The upper 95% credible interval.
    cil: The lower 95% credible interval.

    • coefs_final_model.csv:

    name: Name of the coefficient, composed of an abbreviation for the attribute and a level number (abbreviations in the same order as above: health, socrel, finhou, occu, secu, polciv).
    value: Continuous, weight on the 0 to 1 scale.
    ciu: The upper 95% credible interval.
    cil: The lower 95% credible interval.

  14. Depth to the Water Raster on Long Island, New York, 2013

    • datasets.ai
    • data.usgs.gov
    • +2more
    Updated Jun 1, 2023
    Cite
    Department of the Interior (2023). Depth to the Water Raster on Long Island, New York, 2013 [Dataset]. https://datasets.ai/datasets/depth-to-the-water-raster-on-long-island-new-york-2013
    Explore at:
    55 available download formats
    Dataset updated
    Jun 1, 2023
    Dataset authored and provided by
    Department of the Interior
    Area covered
    Long Island, New York
    Description

    The depth to water table was measured at 335 groundwater monitoring wells (observation and supply) screened in the upper glacial and Magothy aquifers during April and May of 2013. This raster data set was interpolated from the water level data collected at those sites and represents a continuous surface of the estimated depth to water for hydrologic conditions on Long Island, New York. These data are presented in Sheet 4 of Scientific Investigations Map 3326.

  15. Data from: BOREAS TE-08 Aspen Bark Spectral Reflectance Data

    • gimi9.com
    • search.dataone.org
    • +9more
    Updated Feb 1, 2001
    Cite
    (2001). BOREAS TE-08 Aspen Bark Spectral Reflectance Data [Dataset]. https://gimi9.com/dataset/data-gov_boreas-te-08-aspen-bark-spectral-reflectance-data-87ac3/
    Explore at:
    Dataset updated
    Feb 1, 2001
    Description

    The BOREAS TE-08 team collected in-lab spectral reflectance data for aspen bark and leaves from three sites within the BOREAS SSA from 24-May-1994 to 16-Jun-1994 (IFC 1), 19-Jul-1994 to 08-Aug-1994 (IFC 2), and 30-Aug-1994 to 19-Sep-1994 (IFC 3). One to nine trees from each site were sampled during the three IFCs. Each tree was sampled in five different locations for bark spectral properties: BS, US, BR, BT, and BO. Additionally, a limited number of LV were collected. Bark samples were removed from the stem of the tree and placed in ziplock bags for transport to UNH, where they were scanned with a spectroradiometer in a controlled environment. Each sample was scanned twice: the first set of measurements was made with the bark surface moistened, and the second set was made with the bark surface air-dried for a period of 30 minutes. These data represent continuous spectra of bark reflectance. Each sample was scanned three times, rotating the sample when possible. The reported values for each sample are an average over the three scans.

  16. Long-term dataset of mean surface and upper air meteorological measurements...

    • data-search.nerc.ac.uk
    • hosted-metadata.bgs.ac.uk
    http
    Updated Oct 1, 2002
    Cite
    British Antarctic Survey (2002). Long-term dataset of mean surface and upper air meteorological measurements from a selection of Antarctic stations and automatic weather stations - READER (REference Antarctic Data for Environmental Research) project [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/api/records/GB_NERC_BAS_PDC_00248
    Explore at:
    Available download formats: http
    Dataset updated
    Oct 1, 2002
    Dataset authored and provided by
    British Antarctic Survey (http://www.bas.ac.uk/)
    Area covered
    Antarctica
    Description

    READER (REference Antarctic Data for Environmental Research) is a project of the Scientific Committee on Antarctic Research (SCAR http://www.scar.org/) and has the goal of creating a high quality, long term dataset of mean surface and upper air meteorological measurements from in-situ Antarctic observing systems. These data will be of value in climate research and climate change investigations.

    The primary sources of data are the Antarctic research stations and automatic weather stations. Data from mobile platforms, such as ships and drifting buoys are not being collected since our goal is to derive time series of data at fixed locations.

    Surface and upper air data are being collected and the principal statistics derived are monthly and annual means. Daily data will not be provided in order to keep the data set to a manageable size. With the resources available to the project, it is clearly not possible to collect all the information that could be required by the whole range of investigations into change in the Antarctic. Instead a key set of meteorological variables (surface temperature, mean sea level pressure and surface wind speed, and upper air temperature, geopotential height and wind speed at standard levels) are being assembled and a definitive set of measurements presented for use by researchers.

    A lot of stations have been operated in the Antarctic over the years; many for quite short periods. However, our goal here is to provide information on the long time series that can provide insight into change in the Antarctic. So to be included, the record from a station must extend for 25 years, although not necessarily in a continuous period, or be currently in operation and have operated for the last 10 years. In READER we have chosen to use only data from year-round stations.
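
    The inclusion rule above can be expressed as a small check. This is one possible reading rather than the project's own procedure: it assumes that "extend for 25 years" means at least 25 distinct years with data, and that "operated for the last 10 years" means data in each of the 10 most recent years. The function name, the reference year, and the station records are hypothetical.

```python
def qualifies_for_reader(years_with_data, current_year):
    """Return True if a station record meets either inclusion criterion."""
    years = set(years_with_data)
    # Criterion 1: 25 years of record, not necessarily continuous
    long_record = len(years) >= 25
    # Criterion 2: currently operating with data in each of the last 10 years
    last_ten = set(range(current_year - 9, current_year + 1))
    recent_and_active = last_ten <= years
    return long_record or recent_and_active

# Hypothetical station records, evaluated against a hypothetical reference year
old_station = list(range(1957, 1975)) + list(range(1980, 1990))  # 28 years, with a gap
new_station = list(range(1993, 2003))                            # last 10 years only
short_station = list(range(1960, 1968))                          # 8 years, long closed
```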

  17. Thickness of the upper Fort Union aquifer in the Williston structural basin

    • datasets.ai
    • data.usgs.gov
    • +2more
    Updated Jun 1, 2023
    Cite
    Department of the Interior (2023). Thickness of the upper Fort Union aquifer in the Williston structural basin [Dataset]. https://datasets.ai/datasets/thickness-of-the-upper-fort-union-aquifer-in-the-williston-structural-basin
    Explore at:
    55 available download formats
    Dataset updated
    Jun 1, 2023
    Dataset authored and provided by
    Department of the Interior
    Description

    These data represent the thickness, in feet, of the upper Fort Union aquifer in the Williston structural basin. The data are presented as ASCII text files that can be converted to continuous raster format.

  18. Accident and Emergency Quality Indicators

    • digital.nhs.uk
    pdf, xls
    Updated May 25, 2012
    Cite
    (2012). Accident and Emergency Quality Indicators [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/accident-and-emergency-quality-indicators
    Explore at:
    Available download formats: pdf (144.9 kB), xls (2.0 MB), pdf (41.4 kB)
    Dataset updated
    May 25, 2012
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Jan 1, 2012 - Jan 31, 2012
    Area covered
    England
    Description

    In April 2011 a new set of clinical quality indicators was introduced to replace the previous four hour waiting time standard, and measure the quality of care delivered in A&E departments in England. Further details on the background and management of the quality indicators are available from the Department of Health (DH) website. This is the tenth publication of data on the Accident and Emergency (A&E) clinical quality indicators, drawn from A&E data within provisional Hospital Episode Statistics (HES). These data relate to A&E attendances in January 2012 and draw on 1.37 million detailed records of attendances at major A&E departments, single speciality A&E departments (e.g. dental A&Es), minor injury units and walk-in centres in England. This report sets out data coverage, data quality and performance information for the following five A&E indicators: left department before being seen for treatment rate; re-attendance rate; time to initial assessment; time to treatment; and total time in A&E. Publishing these data will help share information on the quality of care of A&E services to stimulate the discussion and debate between patients, clinicians, providers and commissioners, which is needed in a culture of continuous improvement. These A&E HES data are published as experimental statistics to note the shortfalls in the quality and coverage of records submitted via the A&E commissioning data set. The data used in these reports are sourced from Provisional A&E HES data, and as such these data may differ to information extracted directly from Secondary Uses Service (SUS) data, or data extracted directly from local patient administration systems. Provisional HES data may be revised throughout the year (for example, activity data for April 2011 may differ depending on whether they are extracted in August 2011, or later in the year). Indicator data published for earlier months have not been revised using updated HES data extracted in subsequent months.
The data presented here represent the output of the existing A&E Commissioning Dataset (CDS V6 Type 010). It must be recognised that these data will not exactly match the data definitions for the A&E clinical quality indicators set out in the guidance document A&E clinical quality indicators: Implementation guidance and data definitions (external link). The DH is currently working with Information Standards Board to amend the existing CDS Type 10 Accident and Emergency to collect the data required to monitor the A&E indicators. A&E HES data are collected and published by the NHS Information Centre for Health and Social Care. The data in this report are secondary analyses of HES data produced by the Urgent & Emergency Care team, Department of Health. A&E HES data are published as experimental statistics to note the known shortfalls in the quality of some A&E HES data elements. The published information sets out where data quality for the indicators may be improved by, for example, reducing the number of unknown values (e.g. unknown times to initial assessment) and default values (e.g. the number of attendances that are automatically given a time to initial assessment of midnight 00:00). The quality and coverage of A&E HES data have considerably improved over the years, and the Department and the NHS Information Centre are working with NHS Performance and Information directors to further improve the data.

  19. Classifier Model

    • kaggle.com
    zip
    Updated Feb 4, 2025
    Cite
    Jeriann L Rhymer (2025). Classifier Model [Dataset]. https://www.kaggle.com/datasets/jeriannlrhymer/regression-model/discussion
    Explore at:
    zip (2163 bytes)
    Dataset updated
    Feb 4, 2025
    Authors
    Jeriann L Rhymer
    License

    https://cdla.io/sharing-1-0/

    Description

    This data is intended for linear regression.

    The exercise covers handling categorical features in a scikit-learn model, carrying out a train/test split, training a model, and evaluating that model on the testing data.

    The mpg data set represents the fuel economy (in miles per gallon) for 38 popular models of car, measured between 1999 and 2008.

    Factor        Type                   Description
    manufacturer  multi-valued discrete  Vehicle manufacturer
    model         multi-valued discrete  Model of the vehicle
    displ         continuous             Size of engine [litres]
    year          multi-valued discrete  Year of vehicle manufacture
    cyl           multi-valued discrete  Number of ignition cylinders
    trans         multi-valued discrete  Transmission type (manual or automatic)
    drv           multi-valued discrete  Driven wheels (f=front, 4=4-wheel, r=rear wheel drive)
    city          continuous             Miles per gallon, city driving conditions (fuel economy)
    hwy           continuous             Miles per gallon, highway driving conditions (fuel economy)
    fl            multi-valued discrete  Vehicle type
    class         multi-valued discrete  Vehicle class (suv, compact, etc)
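
    The workflow described above (encode a categorical factor, split, train, evaluate) can be sketched with the standard library alone. This is a minimal illustration, not the dataset author's notebook: the rows are made up for the example, only the displ, drv and hwy factors are used, and the helper names are hypothetical. In scikit-learn the same steps would typically use OneHotEncoder, train_test_split, LinearRegression and mean_squared_error.

```python
import random

# Made-up illustrative rows (NOT taken from the dataset): (displ, drv, hwy).
ROWS = [
    (1.8, "f", 29.0), (2.0, "f", 31.0), (2.8, "f", 26.0), (3.1, "f", 27.0),
    (2.8, "4", 25.0), (4.0, "4", 19.0), (4.7, "4", 17.0), (5.3, "4", 15.0),
    (3.8, "r", 26.0), (5.7, "r", 24.0), (6.2, "r", 23.0), (7.0, "r", 24.0),
]
DRV_LEVELS = ["f", "4", "r"]


def encode(displ, drv):
    """Feature row: intercept, displ, and dummy columns for drv.

    The first level ("f") is the baseline and gets no column, which
    avoids perfect collinearity with the intercept.
    """
    dummies = [1.0 if drv == level else 0.0 for level in DRV_LEVELS[1:]]
    return [1.0, displ] + dummies


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and cut off a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in X) for j in range(k)]
        row.append(sum(r[i] * t for r, t in zip(X, y)))
        A.append(row)
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, features):
    return sum(c * f for c, f in zip(coefs, features))


train, test = train_test_split(ROWS, test_fraction=0.25, seed=0)
coefs = fit_least_squares([encode(d, v) for d, v, _ in train],
                          [h for _, _, h in train])
# Evaluate on the held-out rows only.
mse = sum((predict(coefs, encode(d, v)) - h) ** 2
          for d, v, h in test) / len(test)
print(f"test MSE: {mse:.2f}")
```

    Dropping one dummy level is a design choice for a model with an intercept; a library encoder may keep all levels by default.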

  20. Consumer Expenditure Diary Survey 2003 - United States

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    United States Census Bureau (2019). Consumer Expenditure Diary Survey 2003 - United States [Dataset]. https://catalog.ihsn.org/index.php/catalog/6805
    Explore at:
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    United States Census Bureau (http://census.gov/)
    Time period covered
    2003
    Area covered
    United States
    Description

    Abstract

    The Consumer Expenditure Survey (CE) program provides a continuous and comprehensive flow of data on the buying habits of American consumers. These data are used widely in economic research and analysis, and in support of revisions of the Consumer Price Index. To meet the needs of users, the Bureau of Labor Statistics (BLS) produces population estimates for consumer units (CUs) of average expenditures in news releases, reports, issues, and articles in the Monthly Labor Review. Tabulated CE data are also available on the Internet and by facsimile transmission (See Section XV. APPENDIX 4). The microdata are available online at http://www.bls.gov/cex/pumdhome.htm.

    These microdata files present detailed expenditure and income data for the Diary component of the CE for 2003. They include weekly expenditure (EXPD) and annual income (DTBD) files. The data in EXPD and DTBD files are categorized by a Universal Classification Code (UCC). The advantage of the EXPD and DTBD files is that with the data classified in a standardized format, the user may perform comparative expenditure (or income) analysis with relative ease. The FMLD and MEMD files present data on the characteristics and demographics of CUs and CU members. The summary level expenditure and income information on the FMLD files permits the data user to link consumer spending, by general expenditure category, and household characteristics and demographics on one set of files.

    Estimates of average expenditures in 2003 from the Diary survey, integrated with data from the Interview survey, are published in Consumer Expenditures in 2003. A list of recent publications containing data from the CE appears at the end of this documentation.

    The microdata files are in the public domain and with appropriate credit, may be reproduced without permission. A suggested citation is: "U.S. Department of Labor, Bureau of Labor Statistics, Consumer Expenditure Survey, Diary Survey, 2003".

    STATE IDENTIFIER

    Since the CE is not designed to produce state-level estimates, summing the consumer unit weights by state will not yield state population totals. A CU's basic weight reflects its probability of selection among a group of primary sampling units of similar characteristics. For example, sample units in an urban nonmetropolitan area in California may represent similar areas in Wyoming and Nevada. Among other adjustments, CUs are post-stratified nationally by sex-age-race. For example, the weights of consumer units containing a black male, age 16-24 in Alabama, Colorado, or New York, are all adjusted equivalently. Therefore, weighted population state totals will not match population totals calculated from other surveys that are designed to represent state data.

    To summarize, the CE sample was not designed to produce precise estimates for individual states. Although state-level estimates that are unbiased in a repeated sampling sense can be calculated for various statistical measures, such as means and aggregates, their estimates will generally be subject to large variances. Additionally, a particular state-population estimate from the CE sample may be far from the true state-population estimate.

    INTERPRETING THE DATA

    Several factors should be considered when interpreting the expenditure data. The average expenditure for an item may be considerably lower than the expenditure by those CUs that purchased the item. The less frequently an item is purchased, the greater the difference between the average for all consumer units and the average of those purchasing. (See Section V.B. for ESTIMATION OF TOTAL AND MEAN EXPENDITURES). Also, an individual CU may spend more or less than the average, depending on its particular characteristics. Factors such as income, age of family members, geographic location, taste and personal preference also influence expenditures. Furthermore, even within groups with similar characteristics, the distribution of expenditures varies substantially.
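
    The gap between the two averages can be seen in a small worked example. The spending values below are hypothetical and the function names are illustrative; the point is only that the average over all CUs equals the share of CUs purchasing times the average among purchasers.

```python
# Hypothetical weekly spending on one item by ten consumer units (CUs);
# zero means the CU did not buy the item that week.
spending = [0, 0, 0, 0, 0, 0, 0, 0, 40.0, 60.0]


def mean_all(values):
    """Average expenditure over every CU, purchasers or not."""
    return sum(values) / len(values)


def mean_purchasers(values):
    """Average expenditure over only the CUs that bought the item."""
    bought = [v for v in values if v > 0]
    return sum(bought) / len(bought) if bought else 0.0


share_purchasing = sum(1 for v in spending if v > 0) / len(spending)

print(mean_all(spending))         # 10.0
print(mean_purchasers(spending))  # 50.0
print(share_purchasing)           # 0.2
```

    Here only two of ten CUs buy the item, so the all-CU average (10.0) is one fifth of the purchaser average (50.0).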

    Expenditures reported are the direct out-of-pocket expenditures. Indirect expenditures, which may be significant, may be reflected elsewhere. For example, rental contracts often include utilities. Renters with such contracts would record no direct expense for utilities, and therefore, appear to have no utility expenses. Employers or insurance companies frequently pay other costs. CUs with members whose employers pay for all or part of their health insurance or life insurance would have lower direct expenses for these items than those who pay the entire amount themselves. These points should be considered when relating reported averages to individual circumstances.

    The Diary survey PUMD are organized into five major data files for each quarter:

    1. FMLD - a file with characteristics, income, and summary level expenditures for the household
    2. MEMD - a file with characteristics and income for each member in the household
    3. EXPD - a detailed weekly expenditure file categorized by UCC
    4. DTBD - a detailed annual income file categorized by UCC
    5. DTID - a household imputed income file categorized by UCC

    Analysis unit

    Consumer Unit

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A. SURVEY SAMPLE DESIGN

    Samples for the CE are national probability samples of households designed to be representative of the total U.S. civilian population. The eligible population includes all civilian noninstitutional persons.

    The first step in sampling is the selection of primary sampling units (PSUs), which consist of counties (or parts thereof) or groups of counties. The set of sample PSUs used for the 2003 sample is composed of 105 areas. The design classifies the PSUs into four categories:

    • 31 "A" certainty PSUs are Metropolitan Statistical Areas (MSA's) with a population greater than 1.5 million.
    • 46 "B" PSUs are medium-sized MSA's.
    • 10 "C" PSUs are nonmetropolitan areas that are included in the CPI.
    • 18 "D" PSUs are nonmetropolitan areas where only the urban population data will be included in the CPI.

    The sampling frame (that is, the list from which housing units were chosen) for the 2003 survey is generated from the 1990 Population Census 100-percent-detail file. The sampling frame is augmented by new construction permits and by techniques used to eliminate recognized deficiencies in census coverage. All Enumeration Districts (ED's) from the Census that fail to meet the criterion for good addresses for new construction, and all ED's in nonpermit-issuing areas are grouped into the area segment frame.

    To the extent possible, an unclustered sample of units is selected within each PSU. This lack of clustering is desirable because the sample size of the Diary Survey is small relative to other surveys, while the intraclass correlations for expenditure characteristics are relatively large. This suggests that any clustering of the sample units could result in an unacceptable increase in the within-PSU variance and, as a result, the total variance.

    Each selected sample unit is requested to keep two 1-week diaries of expenditures over consecutive weeks. The earliest possible day for placing a diary with a household is predesignated with each day of the week having an equal chance to be the first of the reference week. The diaries are evenly spaced throughout the year. During the last 6 weeks of the year, however, the Diary Survey sample is supplemented to twice its normal size to increase the reporting of types of expenditures unique to the holidays.

    B. COOPERATION LEVELS

    The annual target sample size at the United States level for the Diary Survey is 7,800 participating sample units. To achieve this target the total estimated work load is 11,275 sample units. This allows for refusals, vacancies, or nonexistent sample unit addresses.

    Each participating sample unit selected is asked to keep two 1-week diaries. Each diary is treated independently, so response rates are based on twice the number of housing units sampled.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Response rate

    The response rate for the 2003 Diary Survey is 73.4%. This response rate refers to all diaries in the year.

Cite
Darian Naidoo and William Seitz (2025). High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon Islands [Dataset]. https://microdata.pacificdata.org/index.php/catalog/875

High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon Islands

Explore at:
Dataset updated
Mar 19, 2025
Dataset authored and provided by
Darian Naidoo and William Seitz
Time period covered
2023 - 2024
Area covered
Solomon Islands
Description

Abstract

Access to up-to-date socio-economic data is a widespread challenge in Solomon Islands and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.

For Solomon Islands, after five rounds of data collection from 2020-2020, in April 2023 a monthly HFPS data collection commenced and continued for 18 months (ending September 2024), covering topics including employment, income, food security, health, food prices, assets and well-being. Fieldwork took place in two non-consecutive weeks of each month. Data for April 2023-December 2023 were a repeated cross section, while January 2024 established the first month of a panel, which was continued to September 2024. Each month has approximately 550 households in the sample and is representative of urban and rural areas, but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in Solomon Islands. There is one data file for household-level data with a unique household ID, and a separate file for individual-level data within each household. The individual file can be matched to the household file using the household ID. It also has a unique individual ID within the household, which can be used to track individuals over time within households where the data are panel data.

Geographic coverage

Urban and rural areas of Solomon Islands.

Analysis unit

Household, individual.

Kind of data

Sample survey data [ssd]

Sampling procedure

The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The initial sample was drawn from information provided by a major phone service provider in Solomon Islands, covering all the provinces in the country. It had a probability-based weighted design, with a proportionate stratification to achieve geographical representation. The geographical distribution compared to the 2019 Census is listed below for the first month of the HFPS monthly survey:

Province      Census   HFPS
Choiseul       4.3%    5.2%
Western       14.4%   13.7%
Isabel         4.8%    4.7%
Central        3.6%    5.2%
Ren Bell       0.6%    1.4%
Guadalcanal   19.8%   21.1%
Malaita       23.1%   18.7%
Makira         5.6%    5.6%
Temotu         3.0%    3.0%
Honiara       20.7%   21.3%

Source: Census of Population and Housing 2019

Note: The values in the HFPS column represent the proportion of survey participants residing in each province, based on the raw HFPS data from April.

In April 2023, the geographic distribution of World Bank HFPS participants was generally similar to that of the census data at the province level, though within provinces, areas with less mobile phone connectivity are likely to be underrepresented. One indication of this is that urban areas constituted 38.2 percent of the survey sample, which is a slight overrepresentation, compared to 32.5 percent in the Census 2019.
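
The comparison above can be made concrete by computing the gap between the two distributions in percentage points. The figures are the ones listed for April 2023; the function and variable names are illustrative only.

```python
# Province shares in percent as (Census 2019, HFPS April 2023),
# copied from the comparison listed above.
SHARES = {
    "Choiseul": (4.3, 5.2),
    "Western": (14.4, 13.7),
    "Isabel": (4.8, 4.7),
    "Central": (3.6, 5.2),
    "Ren Bell": (0.6, 1.4),
    "Guadalcanal": (19.8, 21.1),
    "Malaita": (23.1, 18.7),
    "Makira": (5.6, 5.6),
    "Temotu": (3.0, 3.0),
    "Honiara": (20.7, 21.3),
}


def differences(shares):
    """HFPS share minus Census share, in percentage points."""
    return {name: round(hfps - census, 1)
            for name, (census, hfps) in shares.items()}


diffs = differences(SHARES)
largest = max(diffs, key=lambda name: abs(diffs[name]))
print(largest, diffs[largest])  # Malaita -4.4

# Urban share: 38.2 percent in the sample versus 32.5 percent in the census.
urban_gap = round(38.2 - 32.5, 1)
print(urban_gap)  # 5.7
```

Most provinces are within two percentage points of the census share; Malaita is the main exception in these figures.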

A monthly panel was established in January 2024 and is ongoing as of March 2025. In each subsequent month after January 2024, the survey firm first attempted to contact all households from the previous month and then attempted to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households. Across all months of the survey, a total of 9,926 interviews were completed.

Mode of data collection

Computer Assisted Telephone Interview [cati]

Research instrument

The questionnaire, which can be found in the External Resources of this documentation, is available in English, with Solomons Pijin translation. There were few changes to the questionnaire across the survey months, but some sections were only introduced in 2024, namely energy access questions and questions to inform the baseline data of the Solomon Islands Government Integrated Economic Development and Climate Resilience (IEDCR) project.

Cleaning operations

The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The total number of observations is 9,926 in the household dataset and 62,054 in the individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household dataset contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.
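
Linking the two datasets can be sketched as a many-to-one join. This is a minimal illustration under stated assumptions: only hhid and id_member are named in the documentation, while the month, urban and age columns and all values below are hypothetical. Because the files combine several survey months, the sketch assumes a survey-month column is needed alongside hhid to pick the right household record.

```python
from collections import defaultdict

# Hypothetical rows; only the hhid and id_member names come from the
# documentation. "month", "urban" and "age" are assumed for illustration.
households = [
    {"hhid": "H001", "month": "2024-01", "urban": 1},
    {"hhid": "H001", "month": "2024-02", "urban": 1},
    {"hhid": "H002", "month": "2024-01", "urban": 0},
]
individuals = [
    {"hhid": "H001", "id_member": 1, "month": "2024-01", "age": 34},
    {"hhid": "H001", "id_member": 1, "month": "2024-02", "age": 34},
    {"hhid": "H001", "id_member": 2, "month": "2024-01", "age": 17},
    {"hhid": "H002", "id_member": 1, "month": "2024-01", "age": 52},
]


def merge_individuals(households, individuals, keys=("hhid", "month")):
    """Attach household fields to each individual row (many-to-one join)."""
    index = {tuple(h[k] for k in keys): h for h in households}
    merged = []
    for person in individuals:
        household = index.get(tuple(person[k] for k in keys))
        if household is not None:
            merged.append({**household, **person})
    return merged


def track(individuals):
    """Group months by (hhid, id_member) to follow a person over time."""
    history = defaultdict(list)
    for person in individuals:
        history[(person["hhid"], person["id_member"])].append(person["month"])
    return dict(history)


merged = merge_individuals(households, individuals)
print(len(merged))                         # 4
print(track(individuals)[("H001", 1)])     # ['2024-01', '2024-02']
```

The pair (hhid, id_member) acts as the person key because id_member is described as unique only within a household.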
