Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper explores a unique dataset of all the SET ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1. The dataset is unique for several reasons. It covers all SET surveys filled in by students in all fields and levels of study offered by the university. In the period analysed, the university operated entirely online amid the Covid-19 pandemic. While the expected learning outcomes formally were not changed, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by econometric models and discussed in the paper.

The average SET scores were matched with the characteristics of the teacher (degree, seniority, gender, and SET scores in the past six semesters); the course characteristics (time of day, day of the week, course type, course breadth, class duration, and class size); the attributes of the SET survey responses (the percentage of students providing SET feedback); and the grades of the course (mean, standard deviation, and percentage failed). Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section.

The unit of observation, or a single row in the data set, is identified by three parameters: teacher unique id (j), course unique id (k) and the question number in the SET questionnaire (n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}). This means that for each pair (j, k) we have nine rows, one for each SET survey question, or sometimes fewer when students did not answer one of the SET questions at all. For example, the dependent variable SET_score_avg(j,k,n) for the triplet (j = John Smith, k = Calculus, n = 2) is calculated as the average of all Likert-scale answers to question no. 2 in the SET survey distributed to all students that took the Calculus course taught by John Smith. The data set has 8,015 such observations or rows.

The full list of variables, or columns, in the data set included in the analysis is presented in the attached files. Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last value of the triplet (n) is dropped, the variable takes the same values for all n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}.

Two attachments:
- Word file with the variable descriptions
- Rdata file with the data set (for the R language)

Appendix 1. The SET questionnaire used for this paper.

Evaluation survey of the teaching staff of [university name]

Please complete the following evaluation form, which aims to assess the lecturer's performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5 - I strongly agree; 4 - I agree; 3 - Neutral; 2 - I don't agree; 1 - I strongly don't agree.

Questions (each answered on the 1-5 scale above):
1. I learnt a lot during the course.
2. I think that the knowledge acquired during the course is very useful.
3. The professor used activities to make the class more engaging.
4. If it was possible, I would enroll for the course conducted by this lecturer again.
5. The classes started on time.
6. The lecturer always used time efficiently.
7. The lecturer delivered the class content in an understandable and efficient way.
8. The lecturer was available when we had doubts.
9. The lecturer treated all students equally regardless of their race, background and ethnicity.
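For a quick look at the structure, the following is a minimal Python sketch of loading the Rdata attachment and summarising SET_score_avg by question number. The file name and the column names other than SET_score_avg are assumptions for illustration; the authoritative names are given in the attached variable description.

import pyreadr  # one possible way to read .Rdata files into pandas data frames

result = pyreadr.read_r("set_ratings.Rdata")   # hypothetical file name
df = next(iter(result.values()))               # the data frame stored in the file

# One row per (teacher id j, course id k, question number n); nine rows per (j, k) pair.
per_question = df.groupby("question_number")["SET_score_avg"].agg(["mean", "std", "count"])  # column names assumed
print(per_question)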
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comes as an SQL-importable file and is compatible with the widely available MariaDB and MySQL databases.
It is based on (and incorporates/extends) the dataset "1151 commits with software maintenance activity labels (corrective,perfective,adaptive)" by Levin and Yehudai (https://doi.org/10.5281/zenodo.835534).
The extensions to this dataset were obtained using Git-Tools, a tool that is included in the Git-Density (https://doi.org/10.5281/zenodo.2565238) suite. For each of the projects in the original dataset, Git-Tools was run in extended mode.
The dataset contains these tables:
x1151: The original dataset from Levin and Yehudai.
Despite its name, this dataset has only 1,149 commits, as two commits were duplicates in the original dataset.
This dataset spanned 11 projects, each of which had between 99 and 114 commits.
This dataset has 71 features and spans the projects RxJava, hbase, elasticsearch, intellij-community, hadoop, drools, Kotlin, restlet-framework-java, orientdb, camel and spring-framework.
gtools_ex (short for Git-Tools, extended)
Contains 359,569 commits, analyzed using Git-Tools in extended mode
It spans all commits and projects from the x1151 dataset as well.
All 11 projects were analyzed, from the initial commit until the end of January 2019. For the projects Intellij and Kotlin, only the first 35,000 and 30,000 commits, respectively, were analyzed.
This dataset introduces 35 new features (see list below), 22 of which are size- or density-related.
The dataset contains these views:
geX_L (short for Git-tools, extended, with labels)
Joins the commits' labels from x1151 with the extended attributes from gtools_ex, using the commits' hashes.
jeX_L (short for joined, extended, with labels)
Joins the datasets x1151 and gtools_ex entirely, based on the commits' hashes.
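As a minimal illustration, once the dump has been imported into a local MariaDB/MySQL instance, the labelled view can be queried from Python. The connection string, database name, and the exact name of the label column are placeholders, not values taken from the dataset.

import pandas as pd
from sqlalchemy import create_engine

# Placeholder credentials/database name; adjust to wherever the SQL dump was imported.
engine = create_engine("mysql+pymysql://user:password@localhost/commits_db")

# geX_L joins the x1151 labels with the extended gtools_ex attributes via commit hashes.
df = pd.read_sql("SELECT SHA1, label, Density FROM geX_L", engine)  # "label" column name assumed
print(df.groupby("label")["Density"].describe())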
Features of the gtools_ex dataset:
SHA1
RepoPathOrUrl
AuthorName
CommitterName
AuthorTime (UTC)
CommitterTime (UTC)
MinutesSincePreviousCommit: Double, describing the number of minutes that passed since the previous commit. "Previous" refers to the parent commit, not the previous commit in time.
Message: The commit's message/comment
AuthorEmail
CommitterEmail
AuthorNominalLabel: All authors of a repository are analyzed and merged by Git-Density using a heuristic, even if they do not always use the same email address or name. This label is a unique string that helps identify the same author across commits, even if the author did not always use the exact same identity.
CommitterNominalLabel: The same as AuthorNominalLabel, but for the committer this time.
IsInitialCommit: A boolean indicating whether a commit is preceded by a parent or not.
IsMergeCommit: A boolean indicating whether a commit has more than one parent.
NumberOfParentCommits
ParentCommitSHA1s: A comma-concatenated string of the parents' SHA1 IDs
NumberOfFilesAdded
NumberOfFilesAddedNet: Like the previous property, but if the net-size of all changes of an added file is zero (i.e. when adding a file that is empty/whitespace or does not contain code), then this property does not count the file.
NumberOfLinesAddedByAddedFiles
NumberOfLinesAddedByAddedFilesNet: Like the previous property, but counts the net-lines
NumberOfFilesDeleted
NumberOfFilesDeletedNet: Like the previous property, but considers only files that had net-changes
NumberOfLinesDeletedByDeletedFiles
NumberOfLinesDeletedByDeletedFilesNet: Like the previous property, but counts the net-lines
NumberOfFilesModified
NumberOfFilesModifiedNet: Like the previous property, but considers only files that had net-changes
NumberOfFilesRenamed
NumberOfFilesRenamedNet: Like the previous property, but considers only files that had net-changes
NumberOfLinesAddedByModifiedFiles
NumberOfLinesAddedByModifiedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesDeletedByModifiedFiles
NumberOfLinesDeletedByModifiedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesAddedByRenamedFiles
NumberOfLinesAddedByRenamedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesDeletedByRenamedFiles
NumberOfLinesDeletedByRenamedFilesNet: Like the previous property, but counts the net-lines
Density: The ratio between the sum of all net-lines added+deleted+modified+renamed and the corresponding gross sum. A density of zero means that the sum of net-lines is zero (i.e. all changed lines were just whitespace, comments etc.). A density of 1 means that all changed net-lines contribute to the gross-size of the commit (i.e. no useless lines with e.g. only comments or whitespace).
AffectedFilesRatioNet: The ratio between the sums of NumberOfFilesXXX and NumberOfFilesXXXNet
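As a sketch of how the Density feature relates to the line counts above, the following recomputes it for a pandas DataFrame holding gtools_ex rows; it illustrates the stated definition and is not the original Git-Tools computation.

import pandas as pd

gross_cols = [
    "NumberOfLinesAddedByAddedFiles", "NumberOfLinesDeletedByDeletedFiles",
    "NumberOfLinesAddedByModifiedFiles", "NumberOfLinesDeletedByModifiedFiles",
    "NumberOfLinesAddedByRenamedFiles", "NumberOfLinesDeletedByRenamedFiles",
]
net_cols = [c + "Net" for c in gross_cols]

def density(df: pd.DataFrame) -> pd.Series:
    """Ratio of summed net-lines to summed gross-lines; 0 when no gross lines changed."""
    gross = df[gross_cols].sum(axis=1)
    net = df[net_cols].sum(axis=1)
    return (net / gross).where(gross > 0, 0.0)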
This dataset supports the paper "Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities", as submitted to the QRS2019 conference (The 19th IEEE International Conference on Software Quality, Reliability, and Security). Citation: Hönel, S., Ericsson, M., Löwe, W. and Wingkvist, A., 2019. Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities. In The 19th IEEE International Conference on Software Quality, Reliability, and Security.
In F1 2020 (a game by Codemasters), there is an option to stream the telemetry data of the race over UDP. This is mainly used for race analysis or live information. This dataset contains aggregated results of races done with the following rules:
Each race has 4 files:
- the pilot information, to store very little information but save some space on the main dataframe
- the race information (simply: race, weather, air/track temperatures)
- the result at the end of the race, with each lap time and sector time for each pilot
- the complete telemetry data
Only the files with the session ID 3335673977098133433 will have the description. They are all identical.
N/A
Some ideas for questions to answer:
For each race:
- What is the time lost in the pit?
- What is the time gained by having the DRS in a specific DRS zone?
- What is the tyre degradation / time lost per tyre type? What is the best strategy to use (gain time with the tyres but make more pit stops)?
- What is the impact of Rich/Normal/Lean mixes on lap time? Is it better to use the rich mix on a high-speed straight or for accelerations?
- When is the best option to use overtake mode (except to overtake, of course)? A high-speed straight or accelerations?
- Can you predict a lap time based on tyre information, mixes and overtake mode used?
- Where are the best overtake opportunities?
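As one possible starting point for the first question, the sketch below estimates the time lost in the pit by comparing pit laps against each pilot's median clean lap. The file name and the column names (pilot_id, lap_time, pit_status) are hypothetical; the actual schema is documented in the files carrying the session ID mentioned above.

import pandas as pd

laps = pd.read_csv("race_result.csv")  # hypothetical export of the per-race result file

clean_median = laps[laps["pit_status"] == 0].groupby("pilot_id")["lap_time"].median()
pit_laps = laps[laps["pit_status"] != 0].set_index("pilot_id")["lap_time"]

# Time lost in the pit ~ pit-lap time minus the same pilot's median clean-lap time.
time_lost = (pit_laps - clean_median.reindex(pit_laps.index)).rename("pit_time_lost_s")
print(time_lost.sort_values(ascending=False))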
**SurfaceType:**
0: Tarmac, 1: Rumble strip, 2: Concrete, 3: Rock, 4: Gravel, 5: Mud, 6: Sand, 7: Grass, 8: Water, 9: Cobblestone, 10: Metal, 11: Ridged
The ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) mission measures the temperature of plants to better understand how much water plants need and how they respond to stress. ECOSTRESS is attached to the International Space Station (ISS) and collects data globally between 52° N and 52° S latitudes. A map of the acquisition coverage can be found on the ECOSTRESS website.

The ECOSTRESS Gridded Water Use Efficiency Instantaneous L4 Global 70 m (ECO_L4G_WUE) Version 2 data product provides Water Use Efficiency (WUE) data generated by dividing the Breathing Earth System Simulator (BESS) Gross Primary Production (GPP) by the Priestley-Taylor Jet Propulsion Laboratory Soil Moisture (PT-JPL-SM) transpiration to estimate WUE, the ratio of grams of carbon that plants absorb to kilograms of water that plants release. The product provides a BESS GPP estimate that represents the amount of carbon surrounding the plants. The ECO_L4G_WUE Version 2 data product is available globally, projected to a globally snapped 0.0006° grid with a 70 meter spatial resolution, and is distributed in HDF5. Each granule contains layers of Water Use Efficiency (WUE), Gross Primary Production (GPP), cloud mask, and water mask. A low-resolution browse image is also available showing daily WUE as a stretched image with a color ramp in JPEG format.

Known Issues

Data acquisition gap: ECOSTRESS was launched on June 29, 2018, and moved to autonomous science operations on August 20, 2018, following a successful in-orbit checkout period. On September 29, 2018, ECOSTRESS experienced an anomaly with its primary mass storage unit (MSU). ECOSTRESS has a primary and a secondary MSU (A and B). On December 5, 2018, the instrument was switched to the secondary MSU, and science operations resumed. On March 14, 2019, the secondary MSU experienced a similar anomaly, temporarily halting science acquisitions. On May 15, 2019, a new data acquisition approach was implemented, and science acquisitions resumed. To optimize the new acquisition approach, only Thermal Infrared (TIR) bands 2, 4, and 5 are being downloaded. The data products are the same as before, but the bands not downloaded contain fill values (L1 radiance and L2 emissivity). This approach was implemented from May 15, 2019, through April 28, 2023.

Data acquisition gap: From February 8 to February 16, 2020, an ECOSTRESS instrument issue resulted in a data anomaly that created striping in band 4 (10.5 micron). These data products have been reprocessed and are available for download. No ECOSTRESS data were acquired on February 17, 2020, due to the instrument being in SAFEHOLD. Data acquired following the anomaly have not been affected.

Data acquisition: ECOSTRESS has now successfully returned to 5-band mode after being in 3-band mode since 2019. This feature was successfully enabled following a Data Processing Unit firmware update (version 4.1) to the payload on April 28, 2023. To better balance contiguous science data scene variables, 3-band collection is currently being interleaved with 5-band acquisitions over the orbital day/night periods.

Solar Array Obstruction: Some ECOSTRESS scenes may be affected by solar array obstructions from the International Space Station (ISS), potentially impacting data quality of obstructed pixels.
The 'FieldOfViewObstruction' metadata field is included in all Version 2 products to indicate possible obstructions:
* Before October 24, 2024 (orbits prior to 35724): The field is present but was not populated and does not reliably identify affected scenes.
* On or after October 24, 2024 (starting with orbit 35724): The field is populated and generally accurate, except for late December 2024, when a temporary processing error may have caused false positives.
* A list of scenes confirmed to be affected by obstructions is available and is recommended for verifying historical data (before October 24, 2024) and scenes from late December 2024.

The ISS native pointing information is coarse relative to ECOSTRESS pixels, so ECOSTRESS geolocation is improved through image matching with a basemap. Metadata in the L1B_GEO file shows the success of this geolocation improvement, using the categorizations "best", "good", "suspect", and "poor". We recommend that users use only "best" and "good" scenes for evaluations where geolocation is important (e.g., comparison to field sites). For some scenes, this metadata is not reflected in the higher-level products (e.g., land surface temperature, evapotranspiration, etc.). While this metadata is always available in the geolocation product, to save users additional download, we have produced a summary text file that includes the geolocation quality flags for all scenes from launch to present. At a later date, all higher-level products will reflect the geolocation quality flag correctly (the field name is GeolocationAccuracyQA).
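Since each granule is an HDF5 file, it can be inspected directly with h5py. The file name below is hypothetical, and the internal dataset names (e.g. "WUE") are assumptions that should be checked by listing the granule's contents first.

import h5py
import numpy as np

with h5py.File("ECO_L4G_WUE_example.h5", "r") as f:  # hypothetical granule name
    f.visit(print)                                   # print every group/dataset path
    wue = f["WUE"][...] if "WUE" in f else None      # read the WUE layer if present

if wue is not None:
    print("finite WUE pixels:", int(np.isfinite(wue).sum()))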
https://cdla.io/permissive-1-0/
The Cluster Ion Spectrometer (CIS) instrument is a comprehensive ionic plasma spectrometry package onboard the four Cluster spacecraft, capable of obtaining full three-dimensional ion distributions with good time resolution (one spacecraft spin) and with mass-per-charge composition determination. Since the scientific objectives cannot be met with a single detector, the CIS package consists of two different instruments, a Hot Ion Analyser (HIA) and a time-of-flight ion Composition Distribution Function (CODIF), plus a sophisticated dual-processor based instrument control and data processing system (DPS), which permits extensive onboard data processing. Both analysers use symmetric optics resulting in continuous, uniform, and well-characterised phase space coverage.
The CODIF instrument is a high-sensitivity mass-resolving spectrometer with an instantaneous 360° × 8° field-of-view to measure full three-dimensional distribution functions of the major ion species (in as much as they contribute significantly to the total mass density of the plasma), within one spin period of the spacecraft. Typically these include H+, He+, He++ and O+, with energies from ~0 to 40 keV/e and with medium (22.5°) angular resolution. The CODIF instrument combines ion energy-per-charge selection, by deflection in a rotationally symmetric toroidal electrostatic analyser, with a subsequent time-of-flight analysis after post-acceleration to ~15 keV/e. The energy-per-charge analyser is of a rotationally symmetric toroidal type, which is basically similar to the quadrispheric top-hat analysers and has a uniform response over 360° of polar angle. In the time-of-flight section the velocity of the incoming ions is measured. Microchannel plates (MCPs) are used to detect both the ions and the secondary electrons, which are emitted from the carbon foil during the passage of the ions and give the start signal, for the time-of-flight measurement, and the positional information (22.5° resolution).
In order to cover populations ranging from magnetosheath/magnetopause protons to tail lobe ions (consisting of protons and heavier ions), a dynamic range of more than 10^5 is required. CODIF therefore consists of two sections, each with a 180° field of view, with geometry factors differing by a factor of ~100. This way, one section will always have counting rates which are statistically meaningful and which at the same time can be handled by the time-of-flight electronics. However, intense ion fluxes can in some cases saturate the CODIF instrument (particularly if data are acquired from the high sensitivity side), but these fluxes are measured with HIA.
The sensor primarily covers the energy range between 0.015 and 40 keV/e. With an additional RPA device in the aperture system of the sensor, and with pre-acceleration for the energies below 25 eV/e, the range is extended to energies as low as the spacecraft potential. The RPA operates only in the RPA mode.
The analyser has a characteristic energy response of about 7.3, and an intrinsic energy resolution of ΔE/E ~ 0.14. The deflection voltage is varied in an exponential sweep. The full energy sweep with 31 contiguous energy channels is performed 32 times per spin. Thus a partial two-dimensional cut through the distribution function in polar angle is obtained every 1/32 of the spacecraft spin (125 ms). The full 4π ion distributions are obtained in one spacecraft spin period. Including the effects of grid transparencies and support posts in the collimator, each 22.5° sector has a respective geometry factor of 2.4 × 10^-3 cm^2 sr keV keV^-1 in the high sensitivity side, and 2.6 × 10^-5 cm^2 sr keV keV^-1 in the low sensitivity side, depending on the flight model.
The HIA instrument does not offer mass resolution but, also having two different sensitivities, increases the dynamic range, and has an angular resolution capability (5.6° × 5.6°) adequate for ion-beam and solar-wind measurements. HIA combines the selection of incoming ions, according to the ion energy-per-charge ratio by deflection in an electrostatic analyser, with a fast imaging particle detection system. This particle imaging is based on MCP electron multipliers and position-encoding discrete anodes.
Basically the analyser design is a symmetrical quadrispherical electrostatic analyser which has a uniform 360° disc-shaped field-of-view and narrow angular resolution capability. The HIA instrument has two 180° field-of-view sections with two different sensitivities, with a 20-30 ratio (depending on the flight model but precisely known from calibrations), corresponding respectively to the high G and low g sections. The low g section allows detection of the solar wind and the required high angular resolution is achieved through the use of 8 sectors, 5.625° each, the remaining 8 sectors having 11.25° resolution. The 180° high G section is divided into 16 sectors, 11.25° each. For each sensitivity section a full 4π steradian scan, consisting of 32 energy sweeps, is completed every spin of the spacecraft, i.e., 4 s, giving a full three-dimensional distribution of ions in the energy range ~5 eV/e to 32 keV/e. The geometry factor is ~8.0 * 10^-3 cm^2*sr*keV*keV^-1 for the high G half (over 180°), and ~3.5 * 10^-4 cm^2*sr*keV*keV^-1 for the low g half, depending on the flight model.
Caveats
The user of the CIS CSDS parameters needs to be cautious. These parameters are only moments of the distribution functions, that result from summing counting rates. Thus they do not convey information on the detailed structure of the three-dimensional distributions.
Counting statistics are essential for obtaining reliable results. Preliminary information on inadequate counting rates, dead or saturated detectors, is given in the Caveats attribute. Besides instrument sensitivity and calibration, the accuracy of computed moments is mainly affected by the finite energy and angle resolution, and by the finite energy range of the instruments.
An inappropriate choice of operational mode is not without consequences for the accuracy of the parameters. Solar wind modes in the magnetosphere exclude a large portion of the ion distribution. This is particularly important for HIA moments obtained in the magnetosheath while the instrument is in a solar wind mode. The moments then come from the 45° × 45° angular range centered on the solar wind direction, resulting in largely under-sampled distributions.
This data set contains Cluster 4 CIS Prime Parameters.
CODIF energy sweeping during solar wind modes has a reduced energy range when the high sensitivity side faces the solar wind (45° in azimuth over 360°). This implies that if the data come from the high sensitivity side, the solar wind is then not detected.
Magnetospheric modes in the solar wind result in a probable detector saturation.
The He++ data can be contaminated by some H+ ions, resulting in over-estimated He++ densities.
The CIS calibration values are regularly updated to take into account the detector efficiency evolution. However, as the evaluation of the detector efficiency requires some time history, necessary for a statistical analysis, there is a hysteresis between the detector efficiency drift and the calibration updates.
Furthermore, an inhomogeneous evolution of the detection efficiency between the different anode sectors can result in a bias in the calculated direction of the bulk plasma flow. This phenomenon has been observed on the CODIF data obtained onboard Spacecraft 3 (Samba), resulting in a degraded accuracy of the Vz component (corrected in September 2001 with onboard software patches).
The CIS instrument is not operational on Spacecraft 2 (Salsa).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This package contains the complete experimental data explained in: Karakurt, A., Şentürk, S., & Serra, X. (In Press). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop. Please cite the paper above if you are using the data in your work.

The zip file includes the folds, features, training and testing data, results and evaluation files. It is part of the experiments hosted on github (https://github.com/sertansenturk/makam_recognition_experiments/tree/dlfm2016) in the folder called "./data". We host the experimental data in Zenodo (http://dx.doi.org/10.5281/zenodo.57999) separately due to the file size limitations in github. The files generated from audio recordings are labeled with 16 character long MusicBrainz IDs (in short "MBID"s). Please check http://musicbrainz.org/ for more information about the unique identifiers.

The structure of the data in the zip file is explained below. In the paths given below, task is the computational task ("tonic," "mode" or "joint"), training_type is either "single" (single-distribution per mode) or "multi" (multi-distribution per mode), distribution is either "pcd" (pitch class distribution) or "pd" (pitch distribution), bin_size is the bin size of the distribution in cents, kernel_width is the standard deviation of the Gaussian kernel used in smoothing the distribution, distance is either the distance or the dissimilarity metric, num_neighbors is the number of neighbors checked in k-nearest neighbor classification, and min_peak is the minimum peak ratio. A kernel_width of 0 implies no smoothing. min_peak always takes the value 0.15. For a thorough explanation please refer to the companion page (http://compmusic.upf.edu/node/319) and the paper itself.

folds.json: Divides the test dataset (https://github.com/MTG/otmm_makam_recognition_dataset/releases) into training and testing sets according to a stratified 10-fold scheme. The annotations are also distributed to the sets accordingly. The file is generated by the Jupyter notebook setup_feature_training.ipynb (4th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).

Features: The path is data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json. "pdf" stands for probability density function, which is used to obtain the multi-distribution models in the training step, and "hist" stands for the histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook setup_feature_training.ipynb (5th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).

Training: The path is data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json. There are 10 folds in each folder, each of which stores the training model (file paths of the distributions for the "multi" training_type, or the distributions themselves for the "single" training_type) trained for the fold using the parameter set. The training files are generated by the Jupyter notebook setup_feature_training.ipynb (6th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).

Testing: The path is data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]. Each path has the folders fold(0:9), which hold the evaluation and results files obtained from each fold. The path also has the overall_eval.json file, which stores the overall evaluation of the experiment. The optimal value of min_peak is selected in the 4th code block, testing is carried out in the 6th code block and the evaluation is done in the 7th code block of the Jupyter notebook testing_evaluation.ipynb in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/testing_evaluation.ipynb). The data/testing/ folder also contains a summary of all the experiments in the files data/testing/evaluation_overall.json and data/testing/evaluation_perfold.json. These files are created in MATLAB while running the statistical significance scripts. data/testing/evaluation_perfold.mat is the same as the json file of the same filename, stored for fast reading.

For additional information please contact the authors. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
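As a small illustration of the folder layout described above, the snippet below builds the path of one feature file and loads the fold definitions. The MBID and the parameter values are placeholders, not entries guaranteed to exist in the zip file.

import json
from pathlib import Path

distribution, bin_size, kernel_width = "pcd", 25, 7.5        # placeholder parameter set
mbid = "00000000-0000-0000-0000-000000000000"                # placeholder MusicBrainz ID

# Follows the [distribution--bin_size--kernel_width]/[MBID--pdf].json pattern described above.
feature_file = Path("data/features") / f"{distribution}--{bin_size}--{kernel_width}" / f"{mbid}--pdf.json"
print(feature_file)

with open("data/folds.json") as f:                           # the stratified 10-fold split
    folds = json.load(f)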
The World Values Survey (WVS) is an international research program devoted to the scientific and academic study of social, political, economic, religious and cultural values of people in the world. The project's goal is to assess the impact that the stability or change of values over time has on the social, political and economic development of countries and societies. The project grew out of the European Values Study and was started in 1981 by its founder and first President (1981-2013), Professor Ronald Inglehart of the University of Michigan (USA), and his team, and since then has been operating in more than 120 world societies. The main research instrument of the project is a representative comparative social survey which is conducted globally every 5 years. Extensive geographical and thematic scope and free availability of survey data and project findings for the broad public have turned the WVS into one of the most authoritative and widely used cross-national surveys in the social sciences. At the moment, the WVS is the largest non-commercial cross-national empirical time-series investigation of human beliefs and values ever executed.

Mode of collection: interview (mixed mode). Face-to-face interview: CAPI (Computer Assisted Personal Interview); face-to-face interview: PAPI (Paper and Pencil Interview); telephone interview: CATI (Computer Assisted Telephone Interview); self-administered questionnaire: CAWI (Computer-Assisted Web Interview); self-administered questionnaire: paper.

In all countries, fieldwork was conducted on the basis of detailed and uniform instructions prepared by the WVS Scientific Committee and the WVSA secretariat. The main data collection mode in 1981-2012 was a face-to-face (interviewer-administered) interview with a printed questionnaire. Postal surveys (respondent-administered) have been used in Canada, New Zealand, Japan and Australia. CAPI and online data collection modes were first introduced in WVS-6 in 2012-2014. The main data collection mode in WVS 2017-2022 is face-to-face (interviewer-administered). Several countries employed a mixed-mode approach to data collection: USA (CAWI; CATI); Australia and Japan (CAWI; postal survey); Hong Kong SAR (PAPI; CAWI); Malaysia (CAWI; PAPI). The WVS Master Questionnaire is always provided in English, and each national survey team has to ensure that the questionnaire is translated into all the languages spoken by 15% or more of the population in the country. A central team monitors the translation process.

The target population is defined as: individuals aged 18 (16/17 is acceptable in countries with such a voting age) or older (with no upper age limit), regardless of their nationality, citizenship or language, who have been residing in the [country] within private households for the 6 months prior to the date of the beginning of fieldwork (or the date of the first visit to the household, in the case of random-route selection).

The sampling procedures differ from country to country: probability sample (multistage sample), probability sample (simple random sample). Representative single-stage or multi-stage sampling of the adult population of the country, 18 (16) years old and older, was used for the WVS 1981-2022. In 1981-2012, the required sample size for each country was N=1000 or above. In 2017-2022, the sample size was set as an effective sample size: 1200 for countries with a population over 2 million, 1000 for countries with a population of less than 2 million. As an exception, a few surveys with smaller sample sizes have been accepted into the WVS 1981-2020 over the WVSA's history. Sample design and other relevant information about sampling are reviewed by the WVS Scientific Advisory Committee and approved prior to the contracting of the fieldwork agency or the start of data collection. The sampling is documented using the Survey Design Form delivered by the national teams, which includes the description of the sampling frame and each sampling stage as well as the calculation of the planned gross and net sample size needed to achieve the required effective sample. Additionally, it includes the analytical description of the inclusion probabilities of the sampling design that are used to calculate design weights.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This is a synthetic volcano deformation dataset accompanying the publication of Graph Neural Network based elastic deformation emulators for magmatic reservoirs of complex geometries, in the journal Volcanica. Synthetic, quasi-static deformation is computed for magma chambers of various geometries, parameterized as spheroids or superpositions of spherical harmonics. Surface deformation is computed using the boundary element method (BEM) of Nikkhoo & Walter (2015). Please refer to our paper for details of the computational methods.
The dataset contains 50,000 realizations of magma chamber geometries/orientations/centroid depths and the associated deformation fields. Surface deformation fields are sampled at discrete locations, with a uniform random distribution within [Lh x Lh] and with a distribution that concentrates near the chamber (at radial distances r = 10^(-3u) * Lh/2, where u is a uniform random number). Note that this dataset contains only a small fraction of the total dataset: in total, 824,393 realizations of magma chambers were used to train our emulators. For access to the complete training data set, please contact the authors.
Each .mat file contains the deformation field associated with a single chamber geometry. Use visData.m to visualize chamber geometry and associated surface displacement. Each file contains two MATLAB structures, "input" and "output".
Naming of each zip file
The numbers after the underscore, N:M, indicate that this file contains N of the M total chamber realizations for this particular setup.
sph_20AspRatios_1e4:151211.zip: deformation corresponding to spheroidal magma chambers parameterized by aspect ratios.
sh_complex_1e4:152283.zip: deformation corresponding to chamber geometry produced by superposition of spherical harmonic modes.
sh_mode_approx_1e4:138380.zip: deformation corresponding to chamber geometries corresponding to individual spherical harmonic modes, combined with a spherical mode (the spherical mode prevents chamber surfaces from having zero radii locally)
sh_spheroid_approx1e4:202272.zip: deformation corresponding to chambers approximating spheroids, but parameterized by spherical harmonics.
sh_spheroid_perturb_1e4:180247.zip: same as above, but with additional random perturbations parameterized in spherical harmonics.
Variables in each file
Input contains the following fields:
dp2mu: pressure change to shear modulus ratio.
dx, dy, dz: the coordinates of chamber centroid [meters]
mu: dimensionless crustal shear modulus (always set to 1)
nu: crustal Poisson's ratio (always set to 0.25)
Ns: number of points on the surface where displacements are computed
Lh, Lv: horizontal and vertical dimensions of the model domain [meters]. Lh is determined such that at the edge of the model domain, the displacement magnitude is below 10 percent of the maximum. Lv = Lh/2 + abs(dz)
For the spheroids, the input files additionally contain:
asp: aspect ratio of chamber (length of the semi-major axis divided by that of the semi-minor axis)
ra, rb: semi-major, -minor, axis length [meters]
thetax, thetay, thetaz: counterclockwise rotation angles with regard to the x, y, z axes [degrees]. thetax ranges over [0, 90] degrees, thetay = 0 degrees, and thetaz ranges over [0, 360] degrees.
For the general geometries, the input files additionally contain:
ls, ms, fs: degree, order, coefficients of spherical harmonic modes. Spherical harmonics are sampled up to degree 5. fs is a complex vector of coefficients such that the resulting shape is real.
normF: normalization factor applied to the shape parameterized by ls, ms, fs, such that the shape has a maximum radius of unity.
rmax: scale factor to scale the spherical harmonics parameterized shape to real dimensions [meters].
Output contains the following fields:
X, Y, Z: coordinates of points where displacement vectors are computed [meters]
Ux, Uy, Uz: displacements in x, y, z directions [meters]
P, T: coordinates [meters] of vertices for the triangular mesh used in BEM calculation, and the connectivity matrix
C: coordinates [meters] of the center of each triangular element
that, dhat, nhat: unit vectors for orthogonal coordinate systems local to each triangular element. that ("t-hat") extends from vertex one to vertex two, nhat is the outward normal, and dhat = cross(nhat, that).
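A minimal Python alternative to visData.m for peeking at one realization is sketched below; it assumes the .mat files are stored in a pre-v7.3 format readable by scipy and that the structures are named "input" and "output" as described above.

from scipy.io import loadmat
import numpy as np

data = loadmat("example_chamber.mat", squeeze_me=True, struct_as_record=False)  # hypothetical file name
inp, out = data["input"], data["output"]

print("chamber centroid depth dz [m]:", inp.dz)
print("max |Uz| at the surface [m]:", float(np.max(np.abs(out.Uz))))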
Reference:
The World Values Survey (WVS) is an international research program devoted to the scientific and academic study of social, political, economic, religious and cultural values of people in the world. The project's goal is to assess the impact that the stability or change of values over time has on the social, political and economic development of countries and societies. The project grew out of the European Values Study and was started in 1981 by its founder and first President (1981-2013), Professor Ronald Inglehart of the University of Michigan (USA), and his team, and since then has been operating in more than 120 world societies. The main research instrument of the project is a representative comparative social survey which is conducted globally every 5 years. Extensive geographical and thematic scope and free availability of survey data and project findings for the broad public have turned the WVS into one of the most authoritative and widely used cross-national surveys in the social sciences. At the moment, the WVS is the largest non-commercial cross-national empirical time-series investigation of human beliefs and values ever executed.

Mode of collection: interview (mixed mode). Face-to-face interview: CAPI (Computer Assisted Personal Interview); face-to-face interview: PAPI (Paper and Pencil Interview); telephone interview: CATI (Computer Assisted Telephone Interview); self-administered questionnaire: CAWI (Computer-Assisted Web Interview); self-administered questionnaire: paper; web-based interview.

In all countries, fieldwork was conducted on the basis of detailed and uniform instructions prepared by the WVS Scientific Committee and the WVSA secretariat. The main data collection mode in 1981-2012 was a face-to-face (interviewer-administered) interview with a printed questionnaire. Postal surveys (respondent-administered) have been used in Canada, New Zealand, Japan and Australia. CAPI and online data collection modes were first introduced in WVS-6 in 2012-2014. The main data collection mode in WVS 2017-2022 is a face-to-face (interviewer-administered) interview with a printed or electronic questionnaire (CAPI). Several countries employed a mixed-mode approach to data collection: USA (CAWI; CATI); Australia and Japan (CAWI; postal survey); Hong Kong SAR (PAPI; CAWI); Malaysia (CAWI; PAPI). The WVS Master Questionnaire is always provided in English, and each national survey team has to ensure that the questionnaire is translated into all the languages spoken by 15% or more of the population in the country. A central team monitors the translation process.

The target population is defined as: individuals aged 18 (16/17 is acceptable in countries with such a voting age) or older (with no upper age limit), regardless of their nationality, citizenship or language, who have been residing in the [country] within private households for the 6 months prior to the date of the beginning of fieldwork (or the date of the first visit to the household, in the case of random-route selection).

The sampling procedures differ from country to country: probability sample (multistage sample), probability sample (simple random sample). Representative single-stage or multi-stage sampling of the adult population of the country, 18 (16) years old and older, was used for the WVS 1981-2022. In 1981-2012, the required sample size for each country was N=1000 or above. In 2017-2022, the sample size was set as an effective sample size: 1200 for countries with a population over 2 million, 1000 for countries with a population of less than 2 million. As an exception, a few surveys with smaller sample sizes have been accepted into the WVS 1981-2022 over the WVSA's history. Sample design and other relevant information about sampling are reviewed by the WVS Scientific Advisory Committee and approved prior to the contracting of the fieldwork agency or the start of data collection. The sampling is documented using the Survey Design Form delivered by the national teams, which includes the description of the sampling frame and each sampling stage as well as the calculation of the planned gross and net sample size needed to achieve the required effective sample. Additionally, it includes the analytical description of the inclusion probabilities of the sampling design that are used to calculate design weights.
The Influencing Travel Behaviour (ITB) Team provides road safety education, training and publicity to schools, communities, businesses and Leeds residents. We promote sustainable travel throughout Leeds, along with helping schools and businesses to develop and implement their travel plans (which promote safe, sustainable and less car-dependent patterns of travel). Each year we request mode of travel data from schools in Leeds via a SIMS report or Excel spreadsheet. The 10 modes of travel specified in the data collection are: Bus (type not known), Car Share (children travelling together from different households), Car/Van, Cycle, Dedicated School Bus, Other, Public Bus Service, Taxi, Train, Walk (including scooting). This collection forms part of the statutory duty local authorities have to monitor the success of promoting sustainable travel, and in some cases is linked to a school's planning-obligated travel plan. It is an important part of improving road safety and promoting healthy lifestyles among children in Leeds, but since the council declared a climate emergency in March of this year the data is even more valuable. The data helps us understand the environmental context in Leeds and work to effectively limit carbon emissions wherever possible. We strongly encourage all schools to provide the data, but not all of them respond to the request and we do not always receive a response for every pupil/student, so some school response rates may be low.
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Overview: This dataset contains survey responses collected from students in a college located in Satara, Maharashtra, India. The survey was conducted to gather information about students' library usage, reading habits, learning preferences, and other related factors.
Columns: The dataset consists of 29 columns representing different survey questions and responses. The columns include information such as gender, faculty, location, preferred study materials, library visit frequency, average time spent in college, preferred learning language, reading preferences, COVID-19 pandemic impact, book purchasing behavior, parents' occupation and education, and more.
Data Collection: The survey was shared with students in the college library, and their responses were collected using a Google Form. Approximately 10-15k students studying in various courses, ranging from 11th grade to master's degree, participated in the survey.
Data Format: The dataset is provided in CSV format, with each row representing a student's survey response and each column representing a specific survey question.
Data Usage: This dataset can be used to gain insights into students' library usage patterns, reading habits, and learning preferences. It can be used for exploratory data analysis, statistical analysis, and building predictive models related to student behavior, library services, or educational interventions.
Data Quality: The dataset has been cleaned and preprocessed to remove any identifiable personal information and ensure data privacy. However, it is always advisable to handle the data responsibly and in accordance with applicable data protection regulations.
Here's a column-wise description of the dataset:
gender: Gender of the student.
faculty: Faculty or department of the student.
Enter Your Location: Location of the student.
kind of books preferred for study: Preferred type of books for studying.
How Frequently do you visit library: Frequency of visiting the library.
For what Purposes do you visit library: Purposes for visiting the library.
Average Time spent in college: Average time spent in college.
What is general Purposes: General purposes of the student.
Which one is your Preferred location: Preferred location.
What is your preferred time?: Preferred time for activities.
Preferred language for Learning: Preferred language for learning.
Preferred type for reading: Preferred type of reading material.
Do you enjoy the Reading: Enjoyment of reading.
Which mode of learning: Preferred mode of learning.
Dose Covid Pandemic Ch: Impact of the Covid pandemic on learning.
How do you study before collage: Study habits before college.
How do you study after Collage: Study habits after college.
Do you aware about Nati: Awareness about the National Digital Library.
Do you Using National di: Usage of the National Digital Library.
Dose Covid 19 Pandemic Affected Your Reading Habits: Impact of the Covid-19 pandemic on reading habits.
Do you purchase Books from store: Book purchasing behavior from physical stores.
Average Expenditure on books: Average expenditure on books.
Occupation Of Father: Occupation of the student's father.
Parents Education: Education level of the student's parents.
Select your Faculty: Select faculty or department.
Enter your Location: Enter location.
Preferred Language for Learning: Preferred language for learning.
Do you Using National dig: Usage of the National Digital Library.
Occupation of Father: Occupation of the student's father.
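A minimal loading sketch is given below; the CSV file name is hypothetical, and the header strings should be checked against the actual file (they are reproduced from the list above and may differ in spacing or capitalization).

import pandas as pd

df = pd.read_csv("library_survey.csv")   # hypothetical file name
print(df.shape)                          # expected: (number of responses, 29)

# First look at usage patterns: library visit frequency broken down by faculty.
print(pd.crosstab(df["faculty"], df["How Frequently do you visit library"]))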
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset is based on Facial Expressions Training Data.
Images remain 96x96 in size and their improved labels were used. The source dataset was split into two subsets - train and test - and its classes were balanced. The train.csv and test.csv files contain label to file name mappings for train and test subsets respectively.
Classes are as follows: anger, contempt, disgust, fear, happy, neutral, sad, and surprise. Classes were balanced by augmentation relative to the size of the largest class using the Albumentations library for Python.
The following pipeline was used for augmentation:
import albumentations as A
import cv2

transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(
            always_apply=True, contrast_limit=0.2, brightness_limit=0.2
        ),
        A.OneOf(
            [
                A.MotionBlur(always_apply=True),
                A.GaussNoise(always_apply=True),
                A.GaussianBlur(always_apply=True),
            ],
            p=0.5,
        ),
        A.PixelDropout(p=0.25),
        A.Rotate(always_apply=True, limit=20, border_mode=cv2.BORDER_REPLICATE),
    ]
)
Horizontal flip is applied with a probability of 50%. Random brightness and contrast are always applied with a contrast limit of ±20% and a brightness limit of ±20%. One of motion blur, Gaussian noise, or Gaussian blur is applied with a probability of 50%. Pixel dropout is applied with a probability of 25%. Rotation by a random angle is always applied with a limit of ±20 degrees and border mode set to replicate colors at the borders of the image being rotated to avoid black borders.
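A minimal usage sketch for the pipeline above (assigned to transform in the listing) is shown below; the image file name is a placeholder, and the pipeline expects a NumPy image array as produced by cv2.imread.

import cv2

image = cv2.imread("face.png")                 # placeholder 96x96 input image (BGR)
augmented = transform(image=image)["image"]    # apply the Albumentations pipeline above
cv2.imwrite("face_augmented.png", augmented)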
Three variants of the original dataset were created:
- data_relabeled_balanced_1x
- data_relabeled_balanced_2x
- data_relabeled_balanced_3x
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was generated for the purpose of developing unfolding methods that leverage generative machine learning models. It consists of two pieces: one piece contains events with the Standard Model (SM) production of a top-quark pair in the semi-leptonic decay mode, and the other contains events with top-quark pair production modified by a non-zero EFT operator. The SM dataset contains 15,015,000 events, and the EFT dataset contains 30,000,000. Both datasets store the following event configurations:
Each of these configurations is stored in a dedicated group as described below. Throughout, the units of energy and transverse momentum are GeV. For more details on the generation of this dataset, see Ref. [1].
Parton level data:
Particle level data:
Detector level data:
Citations:
[1] - https://arxiv.org/abs/2404.14332
[2] - https://twiki.cern.ch/twiki/bin/view/LHCPhysics/ParticleLevelTopDefinitions
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wind Spacecraft:
The Wind spacecraft (https://wind.nasa.gov) was launched on November 1, 1994 and currently orbits the first Lagrange point between the Earth and sun. A comprehensive review can be found in Wilson et al. [2021]. It holds a suite of instruments from gamma ray detectors to quasi-static magnetic field instruments, Bo. The instruments used for this data product are the fluxgate magnetometer (MFI) [Lepping et al., 1995] and the radio receivers (WAVES) [Bougeret et al., 1995]. The MFI measures 3-vector Bo at ~11 samples per second (sps); WAVES observes electromagnetic radiation from ~4 kHz to >12 MHz which provides an observation of the upper hybrid line (also called the plasma line) used to define the total electron density and also takes time series snapshot/waveform captures of electric and magnetic field fluctuations, called TDS bursts herein.
WAVES Instrument:
The WAVES experiment [Bougeret et al., 1995] on the Wind spacecraft is composed of three orthogonal electric field antennas and three orthogonal search coil magnetometers. The electric fields are measured through five different receivers: the Low Frequency FFT receiver called FFT (0.3 Hz to 11 kHz), the Thermal Noise Receiver called TNR (4-256 kHz), Radio receiver band 1 called RAD1 (20-1040 kHz), Radio receiver band 2 called RAD2 (1.075-13.825 MHz), and the Time Domain Sampler (TDS). The electric field antennas are dipole antennas, with two orthogonal antennas in the spin plane and one spin-axis stacer antenna.
The TDS receiver allows one to examine the electromagnetic waves observed by Wind as time series waveform captures. There are two modes of operation, TDS Fast (TDSF) and TDS Slow (TDSS). TDSF returns 2048 data points for two channels of the electric field, typically Ex and Ey (i.e. spin plane components), with little to no gain below ~120 Hz (the data herein has been high pass filtered above ~150 Hz for this reason). TDSS returns four channels with three electric(magnetic) field components and one magnetic(electric) component. The search coils show a gain roll off ~3.3 Hz [e.g., see Wilson et al., 2010; Wilson et al., 2012; Wilson et al., 2013 and references therein for more details].
The original calibration of the electric field antenna found that the effective antenna lengths are roughly 41.1 m, 3.79 m, and 2.17 m for the X, Y, and Z antenna, respectively. The +Ex antenna was broken twice during the mission as of June 26, 2020. The first break occurred on August 3, 2000 around ~21:00 UTC and the second on September 24, 2002 around ~23:00 UTC. These breaks reduced the effective antenna length of Ex from ~41 m to 27 m after the first break and ~25 m after the second break [e.g., see Malaspina et al., 2014; Malaspina & Wilson, 2016].
TDS Bursts:
TDS bursts are waveform captures/snapshots of electric and magnetic field data. The data is triggered by the largest amplitude waves which exceed a specific threshold and are then stored in a memory buffer. The bursts are ranked according to a quality filter which mostly depends upon amplitude. Due to the age of the spacecraft and ubiquity of large amplitude electromagnetic and electrostatic waves, the memory buffer often fills up before dumping onto the magnetic tape drive. If the memory buffer is full, then the bottom ranked TDS burst is erased every time a new TDS burst is sampled. That is, the newest TDS burst sampled by the instrument is always stored and if it ranks higher than any other in the list, it will be kept. This results in the bottom ranked burst always being erased. Earlier in the mission, there were also so called honesty bursts, which were taken periodically to test whether the triggers were working properly. It was found that the TDSF triggered properly, but not the TDSS. So the TDSS was set to trigger off of the Ex signals.
A TDS burst from the Wind/WAVES instrument is always 2048 time steps for each channel. The sample rate for TDSF bursts ranges from 1875 samples/second (sps) to 120,000 sps. Every TDS burst is marked with a unique set of numbers (unique on any given date) to help distinguish it from others and to ensure any set of channels is appropriately connected to the others. For instance, during one spacecraft downlink interval there may be 95% of the TDS bursts with a complete set of channels (i.e., TDSF has two channels, TDSS has four) while the remaining 5% can be missing channels (just example numbers, not quantitatively accurate). During another downlink interval, those missing channels may be returned if they are not overwritten. During every downlink, the flight operations team at NASA Goddard Space Flight Center (GSFC) generates level zero binary files from the raw telemetry data. Those files are filled with data received on that date and the file name is labeled with that date. There is no attempt to sort the data within chronologically, so any given level zero file can have data from multiple dates within. Thus, it is often necessary to load upwards of five days of level zero files to find as many full channel sets as possible. The remaining unmatched channel sets comprise a much smaller fraction of the total.
All data provided here are from TDSF, so only two channels. Most of the time channel 1 will be associated with the Ex antenna and channel 2 with the Ey antenna. The data are provided in the spinning instrument coordinate basis with associated angles necessary to rotate into a physically meaningful basis (e.g., GSE).
TDS Time Stamps:
Each TDS burst is tagged with a time stamp called a spacecraft event time or SCET. The TDS datation time is sampled after the burst is acquired which requires a delay buffer. The datation time requires two corrections. The first correction arises from tagging the TDS datation with an associated spacecraft major frame in house keeping (HK) data. The second correction removes the delay buffer duration. Both inaccuracies are essentially artifacts of on ground derived values in the archives created by the WINDlib software (K. Goetz, Personal Communication, 2008) found at https://github.com/lynnbwilsoniii/Wind_Decom_Code.
The WAVES instrument's HK mode sends relevant low rate science back to ground once every spacecraft major frame. If multiple TDS bursts occur in the same major frame, it is possible for the WINDlib software to assign them the same SCETs, because this top-level SCET is only accurate to within +300 ms (in 120,000 sps mode) due to the issues described above (at lower sample rates, the error can be slightly larger). The time stamp uncertainty is a positive definite value because it results from digitization rounding errors. One can correct these issues to within +10 ms if using the proper HK data.
*** The data stored here have not corrected the SCETs! ***
The 300 ms uncertainty, due to the HK corrections mentioned above, results from WINDlib trying to recreate the time stamp after it has been telemetered back to ground. If a burst stays in the TDS buffer for extended periods of time (i.e., >2 days), the interpolation done by WINDlib can make mistakes in the 11th significant digit. The positive definite nature of this uncertainty is due to rounding errors associated with the onboard DPU (digital processing unit) clock rollover. The DPU clock is a 24 bit integer clock sampling at ∼50,018.8 Hz. The clock rolls over at ∼5366.691244092221 seconds, i.e., (16*2^24)/50,018.8. The sample rate is a temperature sensitive issue and thus subject to change over time. From a sample of 384 different points on 14 different days, a statistical estimate of the rollover time is 5366.691124061162 ± 0.000478370049 seconds (calculated by Lynn B. Wilson III, 2008). Note that the WAVES instrument team used UR8 times, which are the number of 86,400 second days from 1982-01-01/00:00:00.000 UTC.
The method to correct the SCETs to within +10 ms, were one to do so, is as follows (a schematic sketch of the arithmetic is given after the steps):
1. Retrieve the DPU clock times, SCETs, UR8 times, and DPU Major Frame Numbers from the WINDlib libraries on the VAX/ALPHA systems for the TDSS(F) data of interest.
2. Retrieve the same quantities from the HK data.
3. Match the HK event number with the same DPU Major Frame Number as the TDSS(F) burst of interest.
4. Find the difference in DPU clock times between the TDSS(F) burst of interest and the HK event with the matching major frame number (Note: the TDSS(F) DPU clock time will always be greater than the HK DPU clock time if they share the same DPU Major Frame Number and the DPU clock has not rolled over).
5. Convert the difference to a UR8 time and add it to the HK UR8 time. The new UR8 time is the corrected UR8 time, accurate to within +10 ms.
6. Find the difference between the new UR8 time and the UR8 time WINDlib associates with the TDSS(F) burst. Add this difference to the DPU clock time assigned by WINDlib to get the corrected DPU clock time (Note: watch for the DPU clock rollover).
7. Convert the new UR8 time to a SCET using available functions in either the IDL WINDlib libraries or the TMLib (STEREO S/WAVES software) libraries. This new SCET is accurate to within +10 ms.
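A minimal sketch of the arithmetic behind steps 3-5 follows, assuming hypothetical in-memory records for the burst and HK quantities retrieved in steps 1-2 (real work would pull these values from WINDlib or TMLib, which are not reproduced here); the rollover period uses the nominal value quoted above.

```python
from dataclasses import dataclass

DPU_CLOCK_HZ = 50_018.8                      # nominal DPU clock rate
DPU_ROLLOVER_S = 16 * 2**24 / DPU_CLOCK_HZ   # ~ 5366.69 s
SECONDS_PER_DAY = 86_400.0

@dataclass
class Record:
    """Hypothetical container for the quantities retrieved in steps 1-2."""
    dpu_clock_s: float   # DPU clock time, converted to seconds
    ur8: float           # UR8 time (days since 1982-01-01/00:00:00.000 UTC)
    major_frame: int     # DPU major frame number

def corrected_ur8(burst: Record, hk: Record) -> float:
    """Steps 3-5: correct the burst UR8 time using the matching HK event."""
    if burst.major_frame != hk.major_frame:
        raise ValueError("burst and HK event must share a DPU major frame")
    # Step 4: difference in DPU clock times; the burst time should exceed the
    # HK time unless the DPU clock rolled over in between.
    dt = burst.dpu_clock_s - hk.dpu_clock_s
    if dt < 0:
        dt += DPU_ROLLOVER_S
    # Step 5: convert the difference to days and add it to the HK UR8 time.
    return hk.ur8 + dt / SECONDS_PER_DAY
```

Steps 6-7 (propagating the corrected UR8 time back into a DPU clock time and a SCET) rely on the WINDlib or TMLib conversion routines and are not sketched here.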
One can find a UR8 to UTC conversion routine at https://github.com/lynnbwilsoniii/wind_3dp_pros in the ~/LYNN_PRO/Wind_WAVES_routines/ folder.
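For the epoch arithmetic alone, a UR8 value can be converted to a UTC time with the standard library; this is a minimal sketch (it ignores leap seconds, which matter at the ±10 ms level) and is not the routine linked above.

```python
from datetime import datetime, timedelta, timezone

UR8_EPOCH = datetime(1982, 1, 1, tzinfo=timezone.utc)

def ur8_to_utc(ur8_days: float) -> datetime:
    """Convert a UR8 time (days since 1982-01-01/00:00 UTC) to a UTC datetime."""
    return UR8_EPOCH + timedelta(days=ur8_days)

print(ur8_to_utc(0.5))   # 1982-01-01 12:00:00+00:00
```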
Examples of good waveforms can be found in the notes PDF at https://wind.nasa.gov/docs/wind_waves.pdf.
Data Set Description
Each Zip file contains 300+ IDL save files, one for each day of the year with available data. This data set is not complete because the software used to retrieve and calibrate these TDS bursts did not have sufficient error handling for some of the more nuanced bit errors or major frame errors in some of the level zero files. There is currently (as of June 27, 2020) an effort (by Keith Goetz et al.) to generate the entire TDSF and TDSS data set in one repository to be put on SPDF/CDAWeb as CDF files. Once that data set is available, it will supersede this one.
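Because each daily file is an IDL save file, its contents can also be inspected outside IDL; a minimal sketch using scipy.io.readsav (the file name is hypothetical, and the variable names stored inside the files are not listed here, so the example simply prints them):

```python
from scipy.io import readsav

# Hypothetical file name; substitute a daily save file extracted from one of the Zip archives.
burst_day = readsav("tdsf_burst_day.sav")   # returns a dict-like object of saved variables

for name, value in burst_day.items():
    print(name, getattr(value, "shape", type(value)))
```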
This data is derived from the Maternity Indicators dataset, which is provided to the Welsh Government by Digital Health and Care Wales (DHCW). The Maternity Indicators dataset was established in 2016. It combines records from a mother’s initial assessment with a child’s birth record and enabled the Welsh Government to monitor its initial set of outcome indicators and performance measures (Maternity Indicators), which were established to measure the effectiveness and quality of Welsh maternity services. The Maternity Indicators dataset allows us to analyse characteristics of the mother’s pregnancy and birth process. The process for producing this data extract is complex, largely because there can be multiple initial assessment records, and records for both initial assessments and births are not always complete. Full details of every data item available in the Maternity Indicators dataset are available through the NHS Wales Data Dictionary: http://www.datadictionary.wales.nhs.uk/#!WordDocuments/datasetstructure20.htm The mode of birth relates to how the baby was delivered and is often different from the mode of onset of labour. There are three modes of birth recorded in the Maternity Indicators dataset, defined as: caesarean section: elective and emergency caesarean section deliveries; instrumental: forceps cephalic deliveries and ventouse (vacuum) deliveries; and spontaneous vaginal: baby born by maternal effort. The data dictionary also defines how ethnic groups are classified, namely: White (any white background); Asian (Pakistani, Bangladeshi, Chinese, Indian, any other Asian background); Mixed/multiple (white and Asian, white and black African, white and black Caribbean, any other mixed background); Other (any other ethnic group); Black (African, Caribbean, any other black background).
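As an illustration of how recorded delivery methods roll up into the three modes of birth defined above, here is a minimal pandas sketch; the column name delivery_method and its values are hypothetical, as the real field names and codes are defined in the NHS Wales Data Dictionary linked above.

```python
import pandas as pd

# Hypothetical delivery-method labels; the actual codes are defined in the
# NHS Wales Data Dictionary for the Maternity Indicators dataset.
mode_of_birth = {
    "elective caesarean": "caesarean section",
    "emergency caesarean": "caesarean section",
    "forceps cephalic": "instrumental",
    "ventouse (vacuum)": "instrumental",
    "spontaneous vaginal": "spontaneous vaginal",
}

births = pd.DataFrame({"delivery_method": ["ventouse (vacuum)", "emergency caesarean"]})
births["mode_of_birth"] = births["delivery_method"].map(mode_of_birth)
print(births)
```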
Equipment Used: NOAA Ship Okeanos Explorer is equipped with a 26 kilohertz (kHz) Kongsberg EM 304 MKII multibeam sonar. The nominal transmit (TX) alongtrack beamwidth is 0.5°, and the nominal receive (RX) acrosstrack beamwidth is 1.0°. The system generates a 150° beam fan containing 512 beams, with up to 800 soundings per ping cycle when in high-density mode. In waters shallower than approximately 3,300 m the system is able to operate in dual-swath mode, where one nominal ping cycle includes two swaths, resulting in up to 1,600 soundings. Data are recorded using Kongsberg's Seafloor Information System (SIS) software. Collocated with the bathymetric data, bottom backscatter data were collected and stored within the raw files, both as beam-averaged backscatter values and as full time-series values (snippets) within each beam. During standard data acquisition, the EM 304 multibeam sonar is synchronized with the other active sonars using the Kongsberg Synchronization Unit, with the EM 304 multibeam sonar set as the master. Any changes in equipment setup for the year or expedition are detailed in the annual Readiness Report or associated Expedition Report, respectively. For general information about sub-bottom operations, please refer to the NOAA Ocean Exploration Mapping Procedures Manual. Calibrations: At the beginning of each field season, a multibeam geometric calibration (patch test) is conducted to resolve any angular misalignments of the EM 304 multibeam equipment. A patch test is also conducted if any multibeam equipment (e.g., transducers, IMU, antennas) is installed or disturbed. The patch test determines if there are any residual biases or errors in navigation timing, pitch, roll, and heading/yaw (and resolves each bias individually in that order). Whenever possible (and assuming reasonable values), the results of each test are applied in SIS prior to data collection for the following test. Calibration Reports are archived as supplemental documents to the annual Readiness Report throughout the year. A relative backscatter correction was performed in 2021, and the resulting gain values were uploaded to the processing unit. This procedure helps to normalize differences in backscatter values resulting from the variable frequencies and pulse durations employed within sectors and among ping modes used during multibeam data acquisition. Acquisition Corrections: Real-time corrections to the data upon acquisition include the continuous application of surface sound speed obtained with a hull-mounted Reson SV-70 probe, and the application of water column sound speed profiles obtained with Sippican Deep Blue Expendable Bathythermographs (XBTs) and/or a Seabird CTD 9/11. Sound speed profiles are conducted every four hours, or more frequently as dictated by local oceanographic conditions (typically every two hours when operating in more dynamic areas). Reson sound speed values are constantly compared against secondarily derived sound speed values from the ship’s onboard thermosalinograph flow-through system as a quality assurance measure. Roll, pitch, and heave motion corrections are applied in real time via a POS MV 320 version 5 or a Seapath-380, using Marine Star DGPS correctors. The motion and positioning unit used will be noted in the processing logs. No tidal corrections are applied to the raw or processed data. Multibeam data quality is monitored in real time by acquisition watchstanders. Ship speed is adjusted to maintain data quality and sounding density as necessary.
Line spacing is planned to ensure one-quarter to one-third swath-width overlap between lines, depending on the environmental conditions and their impact on the quality of the outer swath regions. Angles are generally left open (70°/70°) during transits to maximize data collection and are adjusted on both the port and starboard sides to ensure the best data quality and coverage. If outer beams return obviously spurious soundings (e.g., due to attenuation or low grazing angle), beam angles are gradually reduced and monitored closely until a high-quality swath is obtained. Processing Steps: The full-resolution multibeam .kmall files are imported into QPS Qimera, and then processed and cleaned of noise and artifacts. Outlier soundings are removed using multiple methods including automatic filtering and/or manual cleaning with the swath and subset editing tools. The default sound speed scheduling method is “Nearest-in-Time; SVP Crossfade 60 sec.” If another method was implemented, it will be noted in the associated log. Data Product Creation Steps: Gridded digital terrain models were created using the weighted moving average algorithm and were exported in multiple formats using QPS Fledermaus software. Some expeditions have several final multibeam grid files in order to keep file sizes manageable and to focus on particular survey areas of interest. The final surfaces are re-projected to the geographic WGS84 reference frame in QPS Fledermaus software, saved as a .sd file, and then exported to multiple formats (ASCII XYZ text file (.xyz), color image .tif, floating point .tif, and Google Earth .kmz file formats). The .gsf files are used to create daily backscatter mosaics using QPS FMGT. Horizontal Datum: WGS84. Vertical Datum: Data are referenced to the waterline using surveyed vessel offsets and static draft measurements. Software Versions: The version of any software used is noted in the associated Expedition Report. Data Format: Raw data (Level-00) are archived in .kmall format. Processed files (Level-01) are archived as .gsf files. Bathymetry grids (Level-02) are archived as .xyz, color .tif, floating point .tif, .kmz, and .sd. Backscatter mosaics are archived in .sd and .tif formats. Weather, Watchstanding, Processing, and Data Package Logs (.xls) are archived. There is a complete accounting of each individually archived multibeam data file and of each bathymetric surface product in the multibeam data acquisition and processing logs archived with the dataset. Contact: Please do not hesitate to contact NOAA Ocean Exploration (oar.oer.exmappingteam@noaa.gov) with any questions regarding these files. If you are interested in downloading the raw data (Level-00) or cleaned/edited data (Level-01), you can access those data from the NOAA National Centers for Environmental Information for geophysical data: https://www.ncei.noaa.gov/products/seafloor-mapping. For questions or assistance, NOAA Ocean Exploration’s Data Management Team can be reached at oer.info.mgmt@noaa.gov.
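As a rough illustration of the line-spacing planning described above, the flat-seafloor swath width implied by the beam fan, and the spacing for one-quarter to one-third overlap, can be estimated as follows. This is a simplified geometric sketch only; operational planning also accounts for refraction, outer-beam quality, and terrain, and the fan angle is often reduced from the nominal 150°.

```python
import math

def swath_width(depth_m: float, fan_angle_deg: float = 150.0) -> float:
    """Approximate flat-seafloor swath width for a symmetric beam fan."""
    half_angle = math.radians(fan_angle_deg / 2.0)
    return 2.0 * depth_m * math.tan(half_angle)

depth = 3000.0                               # metres (example depth)
width = swath_width(depth)                   # ~ 22.4 km at a 150° fan
spacing_third = width * (1.0 - 1.0 / 3.0)    # one-third swath-width overlap
spacing_quarter = width * (1.0 - 1.0 / 4.0)  # one-quarter swath-width overlap
print(f"swath ~ {width/1000:.1f} km, line spacing ~ {spacing_third/1000:.1f} to {spacing_quarter/1000:.1f} km")
```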
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table contains information on the extent to which Dutch people participate in traffic, broken down by region, mode of transport, and motive.
The traffic participation of persons is expressed in five quantities: — Average number of trips per person per day. A trip is a journey, or part of a journey, made for a single motive. For example, the distance travelled from home to work is one trip, regardless of whether one or more means of transport are used. — Average number of kilometres travelled per person per day. — Average travel time per person per day, based on the departure and arrival times of trips. — Average distance per trip. — Average travel time per trip.
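The five quantities above can be reproduced from trip-level records in a straightforward way; a minimal sketch with hypothetical column names (person_id, distance_km, minutes), assuming one row per trip and a single surveyed day:

```python
import pandas as pd

# Hypothetical trip-level records: one row per trip on one surveyed day.
trips = pd.DataFrame({
    "person_id":   [1, 1, 2, 3, 3, 3],
    "distance_km": [5.0, 5.0, 12.0, 1.5, 0.8, 1.5],
    "minutes":     [15, 15, 25, 20, 10, 20],
})

n_persons = trips["person_id"].nunique()
trips_per_person_day = len(trips) / n_persons           # average trips per person per day
km_per_person_day = trips["distance_km"].sum() / n_persons
minutes_per_person_day = trips["minutes"].sum() / n_persons
km_per_trip = trips["distance_km"].mean()               # average distance per trip
minutes_per_trip = trips["minutes"].mean()              # average travel time per trip
```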
The mobility data for the years 1985-2003 were obtained from the annual Travel Behaviour Survey (OVG) carried out by Statistics Netherlands (CBS). Since 2004, mobility data have come from the Netherlands Mobility Survey (MON) of the Transport and Shipping Service (DVS), part of the Ministry of Transport and Water Management.
Data available from: 1985.
Status of the figures: Figures based on the OVG/MON are always final.
When are new figures coming? This table was discontinued on 20 March 2012 and continued as ‘Mobility in the Netherlands; modes of transport and motives, regions’. See also paragraph 3.
Data Set Overview: This dataset contains the plasma electron density derived from electric field mutual impedance spectra measured by RPCMIP in active modes for the ESC1 (COMET ESCORT 1) mission phase, between 2014/11/19 and 2015/03/10. Derived data collected in this dataset are referred to as CODMAC Level 5 products. Data: RPCMIP plasma electron density is obtained by on-ground analysis of RPCMIP spectra acquired when RPCMIP operated in active modes (SDL and LDL). Automatic algorithms have been developed to extract characteristic features from the spectra. These features are related to plasma signatures and, under certain conditions, allow the plasma frequency and therefore the plasma electron density to be derived. Different algorithms have been developed to adapt to the different operational modes of the instrument and to account for instrumental limitations (specific to each operation mode). While the derived electron density is believed to be of sufficient quality for science use, false detections or misinterpretations cannot be avoided. The user is invited to use these data with caution and to always consider the quality values associated with the density values. A more detailed description of the plasma electron density dataset can be found in the documentation enclosed in the dataset. Processing: Only CODMAC Level 5 data are present in this data set. [truncated; please see actual data for full text]
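The final step from plasma frequency to electron density uses the standard cold-plasma relation n_e = (2π f_p)² ε₀ m_e / e², equivalently f_p ≈ 8.98 kHz × √(n_e [cm⁻³]); a minimal sketch of this conversion (it reproduces the physics only, not the RPCMIP detection algorithms or their quality flags):

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge, C
E_MASS = 9.1093837015e-31       # electron mass, kg
EPS0 = 8.8541878128e-12         # vacuum permittivity, F/m

def electron_density_cm3(f_plasma_hz: float) -> float:
    """Electron density (cm^-3) from the electron plasma frequency (Hz)."""
    # n_e = (2*pi*f_p)^2 * eps0 * m_e / e^2   [m^-3]
    n_m3 = (2.0 * math.pi * f_plasma_hz) ** 2 * EPS0 * E_MASS / E_CHARGE**2
    return n_m3 * 1e-6

print(electron_density_cm3(8.98e3))   # ~ 1 cm^-3 for f_p ~ 8.98 kHz
```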
National
Sample survey data [ssd]
Sample size is 2,155 households
LSMS Sample Design
The LSMS design consisted of an equal-probability sample of housing units (HUs) within each of 16 explicit strata. These were selected in two stages. The first was to select - within strata - an agreed number of enumeration areas (EAs) with probability proportional to the number of HUs in the EA (according to 2001 Census data). The second stage was to select 8 HUs systematically from each selected EA. (Substitutes were used where necessary to ensure that 8 households were successfully interviewed in each EA, but that is ignored for current purposes.) Although probabilities within strata were (approximately) equal, probabilities varied greatly between the strata. Notably, the mountain region was heavily over-represented and the Central Rural region was under-represented in the sample.
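For reference, the two-stage selection described above implies a within-stratum inclusion probability that is the product of the two stages; a minimal sketch of the resulting design weight (the symbols are illustrative and not taken from the survey documentation):

```python
def design_weight(n_ea_selected: int, ea_hu_count: int,
                  stratum_hu_total: int, hu_per_ea: int = 8) -> float:
    """Inverse inclusion probability for a household under the two-stage design.

    Stage 1: EA selected with probability proportional to its housing-unit count.
    Stage 2: a fixed number of housing units selected systematically within the EA.
    """
    p_stage1 = n_ea_selected * ea_hu_count / stratum_hu_total
    p_stage2 = hu_per_ea / ea_hu_count
    return 1.0 / (p_stage1 * p_stage2)

# Because ea_hu_count cancels, weights are (approximately) equal within a stratum,
# but differ between strata when sampling fractions differ.
print(design_weight(n_ea_selected=20, ea_hu_count=350, stratum_hu_total=50_000))  # 312.5
```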
Panel Survey Sample Design
The LSMS was designed in this way partly to enable separate analysis by broad strata (e.g. separate estimates for the mountain region). Regional analysis is much less important for the panel. The sample size will in any case be considerably smaller, so some regional sample sizes would inevitably be too small to permit robust estimation. The prime objective for the panel is to enable national-level estimates with the highest possible precision. To achieve this, the sample was structured in a way that minimises the overall variation in households' selection probabilities. In other words, the sample distribution over strata matched as closely as possible the population distribution.
Panel design
The Albanian panel survey sample was selected from households interviewed in the 2002 LSMS conducted by INSTAT with support from the World Bank. The panel sample comprises approximately half of the LSMS households, which were re-interviewed annually in 2003 and 2004. The LSMS data collected in 2002 therefore constitute 'Wave 1' of the panel survey, giving three waves of panel data altogether. The fieldwork for Wave 3 was carried out in the spring of 2004.
The sample selected from the LSMS for the panel was designed to provide a nationally representative sample of households and individuals within Albania (see Appendix B for a full description of the sample design and selection procedure). This differs from the LSMS, where the sample was designed to be representative of each stratum; the strata broadly represented the main regions in Albania (Mountain, Central, Coastal, Tirana) so that regional-level statistics could be generated.
The panel also has no over-sampling as in the LSMS. This design was adopted because the smaller sample size for the panel would have made it more difficult to produce regionally representative samples and would have increased sampling error, while over-sampling can introduce additional complications for analysis in the context of a panel. The panel data can be used for analysis broken down by strata to assess any differences between areas but should not be used to produce cross-sectional estimates at the regional level. The relatively small sample size for the panel must always be considered, as small cell sizes have higher levels of error and can produce less reliable estimates. Panel surveys have a number of elements of which data users need to be aware when carrying out their analysis. The main features of the panel design are as follows:
- All members of Wave 1 households were designated as original sample members (OSMs), including children aged under 15 years.
- New members living with an OSM become eligible for inclusion in the sample.
- All sample members are followed as they move address, and any new members found to be living in their household are included.
- Sample members moving out of Albania are considered to be out of scope for that year of the survey (note that they remain potentially eligible for interview and may return to a sample household at a future wave).
- From Wave 2, only household members aged 15 years and over are eligible for interview. As children turn 15, they become eligible for interview. (This differs from the LSMS, where the individual questionnaire collected some data on children under 15 from the mother or main carer.)
The panel is essentially an individual level survey as individuals are followed over time regardless of the household they are living in at a given interview point. This is the key element of the panel design. Households change in composition over time as members move in and out, children are born and others die. New households are formed as people marry or children leave the parental home and households can disappear if all members die or all members move in different directions. The fact that households do not remain constant over time means that it is only possible to follow individuals over time, observing them in their household context at each interview point.
It should also be noted that a 'household' is not equivalent to a current address. A household may move to a new address but maintain the same composition. Similarly, an individual sample member may move between several addresses during the life of the survey. In this design, there is no substitution or recruitment of new households moving into addresses vacated by sample members.
Face-to-face [f2f]
Panel questionnaire content
The data for Wave 1 of the panel survey are the LSMS data and so contain all the modules carried in the LSMS. To minimise respondent burden and help maintain response rates in the panel survey, it was necessary to reduce the length and complexity of the LSMS questionnaire. However, it was also important to maintain comparability in question wording and response categories wherever possible, as only variables which are comparable over time can be used for longitudinal analysis. The Wave 2 questionnaire is therefore a reduced version of the LSMS questionnaire with some additional elements that were required for the panel (e.g. collecting details of people moving into and out of the household) and some new elements that had not been included in the LSMS. A cross-wave list of variables for Waves 1 and 2 shows which variables have been carried at both waves, which were carried at Wave 1 only and which at Wave 2 only (see ‘Variable Reconciliation LSMS_PANEL_final’). The most notable changes were that the LSMS detailed consumption module was not collected at Wave 2 and the agriculture module was a reduced form compared to the LSMS.
The Wave 2 individual questionnaire contains some routing depending on whether the person is an original sample member interviewed in the LSMS or a new person who had joined the household since Wave 1. This is because some information only needs to be collected once (e.g. place of birth), while other information only needs to be updated on an annual basis. For example, all qualifications were collected in the LSMS, so for original members we only need to know whether they have gained any new qualifications in the past year, whereas for new members we need to ask about all qualifications. Users of the data need to be aware of this routing and in some cases may need to get information from an earlier wave if it was not collected at the current wave. Users are recommended to use the data in conjunction with the questionnaires so they are aware of the routing for different sample members.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Global cloud dataset from combined spaceborne radar and lidar.
This repository contains the 3S-GEOPROF-COMB product, a globally-gridded dataset for cloud vertical structure retrieved from hybrid active remote sensing (CloudSat radar and CALIPSO lidar) reported at 240 m vertical resolution. Science variables include vertical cloud fraction and vertically-integrated cloud cover for various geometrical criteria (i.e. high, middle, low, and thick clouds, along with unique high, middle, and low cloud cover variants).
A Python notebook showing how to work with the dataset is available on GitHub, as is the source code used to produce the data product.
Our product is calculated from the latest release (R05) of per-orbit (level 2) combined cloud mask profiles in 2B-GEOPROF-LIDAR with additional data from 2B-GEOPROF. Validation and a complete description of the data product are given in the paper "A Global Gridded Dataset for Cloud Vertical Structure from Combined CloudSat and CALIPSO Observations" (Earth System Science Data).
Please cite "Bertrand, L., Kay, J. E., Haynes, J., and de Boer, G.: A global gridded dataset for cloud vertical structure from combined CloudSat and CALIPSO observations, Earth Syst. Sci. Data, 16, 1301–1316, https://doi.org/10.5194/essd-16-1301-2024, 2024."
The files contained in each folder are given via the following format:
instruments_frequency_resolution.zip
instruments:
radarlidar: the standard product, computed from merged geometrical profiles of hydrometeor occurrence
radaronly: computed solely from CloudSat radar profiles, otherwise processing is identical. For when users need to determine which instrument is responsible for observations of interest.
lidaronly: computed solely from CALIPSO lidar profiles, otherwise processing is identical. For when users need to determine which instrument is responsible for observations of interest.
frequency:
monthly: data files report fields aggregated over a 1-month period
seasonal: data files report fields aggregated over a 3-month period (DJF, MAM, JJA, SON)
resolution:
2.5x2.5: each grid box spans 2.5 degrees latitude and 2.5 degrees longitude
5x5: each grid box spans 5 degrees latitude and 5 degrees longitude
10x10: each grid box spans 10 degrees latitude and 10 degrees longitude
Each folder contains a netCDF data file and a cloud cover quicklook plot image file for each time period over the 2006-2019 data record. Individual files are named according to the following format (a short parsing sketch follows the list):
timeperiod_instruments_datastream_version.nc (or .png)
timeperiod: the time step at the given frequency, e.g. 2006-08 (August 2006) or 2012-DJF (December 2012 to February 2013).
instruments: the instruments used in the data product as a whole, always CSCAL (CloudSat and CALIPSO).
datastream: either 3S-GEOPROF-COMB (COMBined radar and lidar), 3S-GEOPROF-COMB-RO (the auxiliary Radar Only variant of the product), or 3S-GEOPROF-COMB-LO (the auxiliary Lidar Only variant of the product)
version: current release is v8.4
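Given the naming convention above, file names can be split into their parts programmatically; a minimal sketch (the example file name is constructed from the convention rather than copied from an archive listing):

```python
def parse_product_filename(name: str) -> dict:
    """Split 'timeperiod_instruments_datastream_version.ext' into its parts."""
    stem, _, _ext = name.rpartition(".")
    timeperiod, instruments, rest = stem.split("_", 2)
    datastream, _, version = rest.rpartition("_")
    return {"timeperiod": timeperiod, "instruments": instruments,
            "datastream": datastream, "version": version}

print(parse_product_filename("2006-08_CSCAL_3S-GEOPROF-COMB_v8.4.nc"))
# {'timeperiod': '2006-08', 'instruments': 'CSCAL',
#  'datastream': '3S-GEOPROF-COMB', 'version': 'v8.4'}
```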
The product handles the 2011 CloudSat battery anomaly, after which the satellite only collects data in the sunlit portion of its orbit (Daylight-Only Operations, DO-OP), by allowing users to subsample the pre-anomaly period to mimic the post-anomaly collection patterns. This allows users to estimate the effect of the reduced sampling on their analyses or to apply a consistent sampling mode to the entire dataset. This option is provided to users via the "doop" dimension. The dimension coordinate value "All cases" reports variables computed using all observations, while "DO-OP observable" reports variables using only input data that either were or would have been collected in DO-OP mode (i.e. the pre-DO-OP period is subsampled to DO-OP collection patterns).
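To work with the "doop" dimension, the coordinate labels quoted above can be used to select a consistent sampling mode; a minimal xarray sketch (the file name follows the naming convention above, and the commented variable name is hypothetical; list the file's data variables or consult the product documentation for the actual names):

```python
import xarray as xr

# File name constructed from the naming convention described above.
ds = xr.open_dataset("2006-08_CSCAL_3S-GEOPROF-COMB_v8.4.nc")

# Restrict to observations that were, or would have been, collected in
# DO-OP (daylight-only) mode, giving a consistent sampling mode across the record.
doop_only = ds.sel(doop="DO-OP observable")
all_cases = ds.sel(doop="All cases")

print(list(ds.data_vars))          # inspect available science variables
# cover = doop_only["cloud_cover"] # hypothetical variable name; check data_vars first
```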