The primary article (cited below under "Related works") introduces social work researchers to discrete choice experiments (DCEs) for studying stakeholder preferences. The article includes an online supplement with a worked example demonstrating DCE design and analysis with realistic simulated data. The worked example focuses on caregivers' priorities in choosing treatment for children with attention deficit hyperactivity disorder. This dataset includes the scripts (and, in some cases, Excel files) that we used to identify appropriate experimental designs, simulate population and sample data, estimate sample size requirements for the multinomial logit (MNL, also known as conditional logit) and random parameter logit (RPL) models, estimate parameters using the MNL and RPL models, and analyze attribute importance, willingness to pay, and predicted uptake. It also includes the associated data files (experimental designs, data generation parameters, simulated population data and parameters, ...).

In the worked example, we used simulated data to examine caregiver preferences for 7 treatment attributes (medication administration, therapy location, school accommodation, caregiver behavior training, provider communication, provider specialty, and monthly out-of-pocket costs) identified by dosReis and colleagues in a previous DCE. We employed an orthogonal design with 1 continuous variable (cost) and 12 dummy-coded variables (representing the levels of the remaining attributes, which were categorical). Using the parameter estimates published by dosReis et al., with slight adaptations, we simulated utility values for a population of 100,000 people, then selected a sample of 500 for analysis. Relying on random utility theory, we used the mlogit package in R to estimate the MNL and RPL models, using 5,000 Halton draws for simulated maximum likelihood estimation of the RPL model. In addition to estimating the utility parameters, we measured the relative importance of each attribute, esti...

# Data from: How to Use Discrete Choice Experiments to Capture Stakeholder Preferences in Social Work Research
This dataset supports the worked example in:
Ellis, A. R., Cryer-Coupet, Q. R., Weller, B. E., Howard, K., Raghunandan, R., & Thomas, K. C. (2024). How to use discrete choice experiments to capture stakeholder preferences in social work research. Journal of the Society for Social Work and Research. Advance online publication. https://doi.org/10.1086/731310
The referenced article introduces social work researchers to discrete choice experiments (DCEs) for studying stakeholder preferences. In a DCE, researchers ask participants to complete a series of choice tasks: hypothetical situations in which each participant is presented with alternative scenarios and selects one or more. For example, social work researchers may want to know how parents and other caregivers pr...
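To illustrate the random-utility logic underlying the simulation and MNL estimation summarized above, here is a minimal Python sketch. It is illustrative only: the repository's own scripts use R and the mlogit package, and the attribute names and coefficient values below are hypothetical stand-ins, not the dosReis et al. estimates.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical part-worth utilities: one continuous cost coefficient and
# dummy-coded attribute levels (stand-ins, not the published estimates).
beta = {"cost": -0.02, "therapy_at_school": 0.50, "weekly_provider_contact": 0.35}

def utility(alt, beta):
    """Deterministic utility of one alternative under a linear-in-attributes model."""
    return sum(beta[k] * alt[k] for k in beta)

# One choice task with two hypothetical treatment alternatives.
alternatives = [
    {"cost": 100, "therapy_at_school": 1, "weekly_provider_contact": 0},
    {"cost": 40,  "therapy_at_school": 0, "weekly_provider_contact": 1},
]

v = np.array([utility(a, beta) for a in alternatives])

# Under random utility theory with i.i.d. extreme-value (Gumbel) errors,
# MNL choice probabilities are the softmax of the deterministic utilities.
p = np.exp(v) / np.exp(v).sum()

# Simulate one respondent's choice by adding Gumbel noise and taking the argmax.
choice = np.argmax(v + rng.gumbel(size=len(v)))
print(p, choice)
```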
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Identifying the attributes that matter in the consumer decision-making process is a difficult task for marketing professionals. In this context, this paper presents the attributes that consumers consider most important when deciding on a notebook purchase. For this purpose, an exploratory study was conducted in two parts, one qualitative and one quantitative. In the qualitative part, 42 attributes considered at the moment of notebook purchase were identified. In the quantitative part, a questionnaire was administered to a sample of 131 people. The results, obtained using exploratory factor analysis, showed that 24 of the attributes correspond to five dimensions. The factors were named pleasure and benefits, notebook features, performance, attention, and operational. Finally, the attributes were classified according to theory. Final considerations and suggestions for future research are also presented.
Our dataset offers a unique blend of attributes from YouTube and Google Maps, empowering users with comprehensive insights into online content and geographical reach. Let's delve into what makes our data stand out:
Unique Attributes:
- From YouTube: Detailed video information including title, description, upload date, video ID, and channel URL. Video metrics such as views, likes, comments, and duration are also provided.
- Creator Info: Access author details like name and channel URL.
- Channel Information: Gain insights into channel title, description, location, join date, and visual branding elements like logo and banner URLs.
- Channel Metrics: Understand a channel's performance with metrics like total views, subscribers, and video count.
- Google Maps Integration: Explore business ratings from Google My Business and location data from Google Maps.

Data Sourcing:
- Our data is meticulously sourced from publicly available information on YouTube and Google Maps, ensuring accuracy and reliability.

Primary Use-Cases:
- Marketing: Analyze video performance metrics to optimize content strategies.
- Research: Explore trends in creator behavior and audience engagement.
- Location-Based Insights: Utilize Google Maps data for market research, competitor analysis, and location-based targeting.

Fit within Broader Offering:
- This dataset complements our broader data offering by providing rich insights into online content consumption and geographical presence. It enhances decision-making processes across various industries, including marketing, advertising, research, and business intelligence.

Usage Examples:
- Marketers can identify popular video topics and optimize advertising campaigns accordingly.
- Researchers can analyze audience engagement patterns to understand viewer preferences.
- Businesses can assess their Google My Business ratings and geographical distribution for strategic planning.
With scalable solutions and high-quality data, our dataset offers unparalleled depth for extracting actionable insights and driving informed decisions in the digital landscape.
McGRAW’s US B2B Data: Accurate, Reliable, and Market-Ready
Our B2B database delivers over 80 million verified contacts with 95%+ accuracy. Supported by in-house call centers, social media validation, and market research teams, we ensure that every record is fresh, reliable, and optimized for B2B outreach, lead generation, and advanced market insights.
Our B2B database is one of the most accurate and extensive datasets available, covering over 91 million business executives with a 95%+ accuracy guarantee. Designed for businesses that require the highest quality data, this database provides detailed, validated, and continuously updated information on decision-makers and industry influencers worldwide.
The B2B Database is meticulously curated to meet the needs of businesses seeking precise and actionable data. Our datasets are not only extensive but also rigorously validated and updated to ensure the highest level of accuracy and reliability.
Key Data Attributes:
Unlike many providers that rely solely on third-party vendor files, McGRAW takes a hands-on approach to data validation. Our dedicated nearshore and offshore call centers engage directly with data before each delivery to ensure every record meets our high standards of accuracy and relevance.
In addition, our teams of social media validators, market researchers, and digital marketing specialists continuously refine and update records to maintain data freshness. Each dataset undergoes multiple verification checks using internal validation processes and third-party tools such as Fresh Address, BriteVerify, and Impressionwise to guarantee the highest data quality.
Additional Data Solutions and Services
Data Enhancement: Email and LinkedIn appends, contact discovery across global roles and functions
Business Verification: Real-time validation through call centers, social media, and market research
Technology Insights: Detailed IT infrastructure reports, spending trends, and executive insights
Healthcare Database: Access to over 80 million healthcare professionals and industry leaders
Global Reach: US and international GDPR-compliant datasets, complete with email, postal, and phone contacts
Email Broadcast Services: Full-service campaign execution, from testing to live deployment, with tracking of key engagement metrics such as opens and clicks
Many B2B data providers rely on vendor-contributed files without conducting the rigorous validation necessary to ensure accuracy. This often results in outdated and unreliable data that fails to meet the demands of a fast-moving business environment.
McGRAW takes a different approach. By owning and operating dedicated call centers, we directly verify and validate our data before delivery, ensuring that every record is up-to-date and ready to drive business success.
Through continuous validation, social media verification, and real-time updates, McGRAW provides a high-quality, dependable database for businesses that prioritize data integrity and performance. Our Global Business Executives database is the ideal solution for companies that need accurate, relevant, and market-ready data to fuel their strategies.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a minimal example of Data Subject Access Request Packages (SARPs), as they can be retrieved under data protection laws, specifically the GDPR. It includes data from two data subjects, each with accounts for five major services, namely Amazon, Apple, Facebook, Google, and LinkedIn.
This dataset is meant to be an initial dataset that allows for manual exploration of the structures and contents found in SARPs. Hence, the number of controllers and user profiles is kept minimal but sufficient to allow cross-subject and cross-controller analysis. The dataset can be used to explore the structures, formats, and data types found in real-world SARPs, thereby facilitating the planning of future SARP-based research projects and studies.
We invite other researchers to use this dataset to explore the structure of SARPs. The envisioned primary usage includes the development of user-centric privacy interfaces and other technical contributions in the area of data access rights. Moreover, the packages can also be used for exemplified data analyses, although no substantive research questions can be answered using this data. In particular, the data does not reflect how data subjects behave in the real world. However, it is representative enough to give a first impression of the types of data analysis possible with real-world data.
In order to allow cross-subject analysis, while keeping the re-identification risk minimal, we used research-only accounts for the data generation. A detailed explanation of the data generation method can be found in the paper corresponding to the dataset, accepted for the Annual Privacy Forum 2024.
In short, two user profiles were designed and corresponding accounts were created for each of the five services. Those accounts were then used for two to four months. During the usage period, we minimized the amount of identifying data and avoided interactions with data subjects not part of this research. Afterwards, we performed a data access request via each controller's web interface. Finally, the data was cleansed as described in detail in the accompanying paper and in brief in the following section.
Before publication, both potentially identifying information and security-relevant attributes need to be obfuscated or deleted. Moreover, multi-party data (especially messages with external entities) must be deleted. Where data is obfuscated, we made sure to substitute multiple occurrences of the same information with the same replacement.
We provide a list of deleted and obfuscated items, the obfuscation scheme and, if applicable, the replacement.
The list of obfuscated items looks like the following example:
| path | filetype | filename | attribute | scheme | replacement |
|---|---|---|---|---|---|
| linkedin\Linkedin_Basic | csv | messages.csv | TO | semantic description | Firstname Lastname |
| gooogle\Meine Aktivitäten\Datenexport | html | MeineAktivitäten.html | IP Address | loopback | 127.142.201.194 |
| facebook\personal_information | json | profile_information.json | emails | semantic description | firstname.lastname@gmail.com |
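The consistent-substitution rule described above (the same original value always receives the same replacement) can be sketched as follows; the function and example values are hypothetical, not the cleansing tool actually used for this dataset:

```python
# Minimal sketch of consistent pseudonymization: every occurrence of the same
# original value maps to the same replacement.
replacements = {}

def obfuscate(value, make_replacement):
    """Return a stable replacement for `value`, creating one on first use."""
    if value not in replacements:
        replacements[value] = make_replacement(len(replacements))
    return replacements[value]

# Example: replace e-mail addresses with numbered placeholder addresses.
make_email = lambda i: f"firstname.lastname{i}@example.com"
print(obfuscate("real.person@gmail.com", make_email))  # firstname.lastname0@example.com
print(obfuscate("real.person@gmail.com", make_email))  # same replacement again
```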
To give an overview of the dataset, we publicly provide some metadata about the usage time and SARP characteristics of the exports from subject A and subject B (reported below as A/B).
| provider | usage time (in months) | export options | file types | # subfolders | # files | export size |
|---|---|---|---|---|---|---|
| Amazon | 2/4 | all categories | CSV (32/49), EML (2/5), JPEG (1/2), JSON (3/3), PDF (9/10), TXT (4/4) | 41/49 | 51/73 | 1.2 MB / 1.4 MB |
| Apple | 2/4 | all data, max. 1 GB / max. 4 GB | CSV (8/3) | 20/1 | 8/3 | 71.8 KB / 294.8 KB |
| Facebook | 2/4 | all data, JSON/HTML, on my computer | JSON (39/0), HTML (0/63), TXT (29/28), JPG (0/4), PNG (1/15), GIF (7/7) | 45/76 | 76/117 | 12.3 MB / 13.5 MB |
| Google | 2/4 | all data, frequency once, ZIP, max. 4 GB | HTML (8/11), CSV (10/13), JSON (27/28), TXT (14/14), PDF (1/1), MBOX (1/1), VCF (1/0), ICS (1/0), README (1/1), JPG (0/2) | 44/51 | 64/71 | 1.54 MB / 1.2 MB |
| LinkedIn | 2/4 | all data | CSV (18/21) | 0/0 (part 1), 0/0 (part 2) | 13/18 (part 1), 19/21 (part 2) | 3.9 KB / 6.0 KB (part 1), 6.2 KB / 9.2 KB (part 2) |
This data collection was performed by Daniela Pöhn (Universität der Bundeswehr München, Germany), Frank Pallas and Nicola Leschke (Paris Lodron Universität Salzburg, Austria). For questions, please contact nicola.leschke@plus.ac.at.
The dataset was collected according to the method presented in:
Leschke, Pöhn, and Pallas (2024). "How to Drill Into Silos: Creating a Free-to-Use Dataset of Data Subject Access Packages". Accepted for Annual Privacy Forum 2024.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Stage tasks:

Task 1: Development of algorithms for the statistical analysis of attribute values for data purification. The aim of the task was to develop an algorithm that can identify the type of an attribute (scalar or discrete) and, depending on its type (text, number, date, text label, etc.), deduce which values can be considered correct and which are incorrect and introduce noise into the dataset, which in turn affects the quality of the ML model.

Task 2: Development of algorithms for the statistical analysis of data attributes with respect to the optimal coding of learning vectors. The aim of the task was to develop an algorithm that can propose the optimal coding of the learning vector to be used in the ML process and perform the appropriate conversion, depending on the type (text, number, date, text label, etc.) of each attribute (scalar or discrete), e.g., converting text to a word-instance matrix. Several conversion scenarios most often used in practice, drawn from the heuristic knowledge of experts, had to be anticipated.

Task 3: Development of a prototype of an automatic data cleaning and coding environment and testing of the solution on samples of production data.

Industrial research: Task No. 2. Research on the meta-learning algorithm

Task 1: Review of existing meta-learning concepts and selection of algorithms for further development. The aim of the task was to analyze the state of knowledge on meta-learning with respect to using existing research results in the project; this task was carried out as subcontracting by a scientific unit.

Task 2: Review and development of the most commonly used ML algorithms in terms of their suitability for hyperparameter meta-learning and the practical usefulness of the resulting models. The aim of the task was to develop a pool of base algorithms to be used as production algorithms, i.e., those performing the actual predictions, whose hyperparameters are tuned by meta-learning. It was therefore necessary to develop a model of the interaction between the main algorithm and the individual production algorithms; this task was carried out as subcontracting by a scientific unit.

Task 3: Development of a meta-learning algorithm for selected types of ML models. The aim of the task was to develop the main algorithm implementing the optimization of the production models' hyperparameters. Note that the hyperparameters have a different structure depending on the specific production model, so in practice a separate optimization algorithm was used for each model.

Task 4: Development of a prototype of the algorithm and testing of the resulting models on production data.

Experimental development work: Task No. 3. Research on the prototype architecture of the platform implementation environment

Task 1: Development of the architecture of the data acquisition and storage module. The aim of the task was to develop an architecture for a scalable ETL (Extract, Transform, Load) solution for efficient implementation of the source data acquisition (data ingest) process. Appropriate parsing algorithms and standardization of the encoding of data of various types (e.g., dates, numbers) were considered with a view to efficient further processing.

Task 2: Development of a module for configuring and executing data processing pipelines in a distributed architecture. Due to the high complexity of the implemented algorithms, it was necessary to develop an architecture allowing pipeline processing of subsequent data processing steps on different machines, with the option of using a distributed architecture in a cloud and/or virtual environment. Existing concepts of distributed architectures, such as MapReduce, were considered.

Task 3: Development of a user interface enabling intuitive control of data processing.
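To make the first two stage tasks more concrete, here is a minimal Python sketch of attribute-type-dependent cleaning and coding, including conversion of text to a word-instance matrix. It uses pandas and scikit-learn as assumed tools; it is not the project's actual prototype.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical raw attributes of mixed types.
df = pd.DataFrame({
    "price": ["10.5", "11.0", "bad_value", "9.9"],
    "category": ["A", "B", "A", "C"],
    "comment": ["fast delivery", "broken on arrival", "fast delivery", "ok"],
})

# Task-1 style purification: treat the column as numeric and flag values that
# do not parse as noise ('bad_value' becomes NaN).
price = pd.to_numeric(df["price"], errors="coerce")

# Task-2 style coding: one-hot encode the discrete label, vectorize the free text
# into a word-instance matrix.
onehot = OneHotEncoder().fit_transform(df[["category"]]).toarray()
word_matrix = CountVectorizer().fit_transform(df["comment"])

print(price.isna().sum(), onehot.shape, word_matrix.shape)
```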
From the site: "This dataset has two purposes: to provide users with a comprehensive set of geospatial characteristics for a large number of gaged watersheds, particularly for gages with long flow record; and to provide a determination of which of those watersheds represent hydrologic conditions which are least disturbed by human influences ("reference gages"), compared to other watersheds within 12 major ecoregions. Identifying reference gages serves important research goals: for example identifying conditions or goals for stream restoration, climate change studies, and more."
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Chicken bouillon samples containing diverse YP were chemically and sensorially characterized, and the results were analyzed using multivariate statistics. Untargeted profiles were obtained using RPLC-MS. The study used a straightforward, data-driven approach to studying foods with added YP in order to identify flavor-impacting correlations between molecular composition and sensory perception. It also highlights the limitations of and preconditions for good prediction models. Overall, the study emphasises a matrix-based approach to the prediction of food taste, which can be used to analyse foods for targeted flavor design or quality control.
For more information, use the DOI of the linked publication or the text file uploaded here.
The BFE annually prepares the federal Energy Research Statistics (EFstat), which provide information on publicly funded energy research expenditure as well as a detailed compilation of the flows of funds. The information is compiled according to both the IEA classification and the CH classification and is based on the federal databases as well as the declarations of the sponsoring bodies. ### Resources description (see below for download links) This dataset consists of different kinds of resources: DATA — Energy Research Statistics (EFstat): raw data file(s) containing the complete dataset. Some attributes are coded (e.g., the attribute "class" may contain the value "ri1" instead of the full name "Other uses"); the exact meanings and translations of those codes are in a separate CATALOG resource. SPARQL — Energy Research Statistics (EFstat): example request on the LINDAS SPARQL endpoint for Linked Data access. CATALOG — Energy Research Statistics (EFstat): catalog file(s) containing the human-readable translations of the coded attributes in the DATA resources.
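As an illustration of how coded DATA values can be joined with their CATALOG translations, here is a minimal pandas sketch; the column names and the "ri2" entry are hypothetical, and only "ri1" → "Other uses" is taken from the description above.

```python
import pandas as pd

# Hypothetical excerpts of a DATA resource (coded) and its CATALOG resource (translations).
data = pd.DataFrame({"year": [2021, 2021], "class": ["ri1", "ri2"], "amount": [1200, 3400]})
catalog = pd.DataFrame({"code": ["ri1", "ri2"], "label_en": ["Other uses", "Example label"]})

# Replace coded attribute values with their human-readable labels.
decoded = data.merge(catalog, left_on="class", right_on="code", how="left")
print(decoded[["year", "label_en", "amount"]])
```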
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Chicken bouillon samples containing diverse YP were chemically and sensorially characterized, and the results were analyzed using multivariate statistics. Profiles were obtained using targeted HILIC-MS. The study used a straightforward, data-driven approach to studying foods with added YP in order to identify flavor-impacting correlations between molecular composition and sensory perception. It also highlights the limitations of and preconditions for good prediction models. Overall, the study emphasises a matrix-based approach to the prediction of food taste, which can be used to analyse foods for targeted flavor design or quality control.
For more information, use the DOI of the linked publication or the text file uploaded here.
CC0 1.0: https://spdx.org/licenses/CC0-1.0.html
Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from the data available in the healthcare sector, making it possible to train models that predict patient health faster than clinical trials. Various machine learning algorithms such as K-Nearest Neighbors, Bayes' theorem, Logistic Regression, Support Vector Machines, and Multinomial Naïve Bayes (MNB) have been applied to malaria datasets in public hospitals, but there are still limitations in modeling with the Multinomial Naïve Bayes algorithm. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation of the relationships between attributes, with the ability to predict new situations. The MNB model achieved 97% accuracy, compared with 100% for the GNB classifier and 100% for the RF classifier.
Methods
Prior to data collection, the researcher was guided by all ethical training certifications on data collection and the right to confidentiality and privacy, as required by the Institutional Review Board (IRB). Data were collected from the manual archives of the hospitals, which were purposively selected using a stratified sampling technique, then transformed to electronic form and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria, and the laboratory confirmation result from diagnosis was then checked. The data were divided into two tables: the first table, called data1, contains the data used in phase 1 of the classification, while the second table, data2, contains the data used in phase 2 of the classification.
Data Source Collection
The malaria incidence dataset was obtained from public hospitals and covers 2017 to 2021. These are the data used for modeling and analysis. Geographical location and the socio-economic factors available for patients inhabiting those areas were also taken into account. Naïve Bayes (Multinomial) is the model used to analyze the collected data for malaria disease prediction and grading.
Data Preprocessing:
Data preprocessing shall be done to remove noise and outliers.
Transformation:
The data shall be transformed from analog to electronic records.
Data Partitioning
The collected data shall be divided into two portions: one portion shall be used as a training set, while the other portion will be used for testing. One training portion shall be taken from a table stored in the database and called training set 1, while the other training portion shall be taken from another table stored in the database and called training set 2.
The dataset was split into two parts: 70% of the data was used for training and the remaining 30% was held out for testing. Using MNB classification algorithms implemented in Python, the models were trained on the training sample. The resulting models were then evaluated on the remaining 30% of the data, and the results were compared with other machine learning models using standard metrics.
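As a hedged illustration of this split-and-evaluate step, the following Python sketch trains a Multinomial Naïve Bayes classifier on a 70/30 split. The feature matrix here is randomly generated stand-in data, not the hospital records used in the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

# Hypothetical feature matrix of 15 non-negative attributes (e.g., coded signs/symptoms)
# and a binary malaria label; stand-in data only.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(500, 15))
y = rng.integers(0, 2, size=500)

# 70% of records for training, 30% held out for testing, as described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

model = MultinomialNB().fit(X_train, y_train)
y_pred = model.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```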
Classification and prediction:
Based on the nature of the variables in the dataset, this study uses Naïve Bayes (Multinomial) classification in two phases: classification phase 1 and classification phase 2. The operation of the framework is as follows:
i. Data collection and preprocessing shall be done.
ii. Preprocessed data shall be stored in training set 1 and training set 2. These datasets shall be used during classification.
iii. The test data set shall be stored in a database table called test data set.
iv. Part of the test data set shall be classified using classifier 1 and the remaining part shall be classified with classifier 2, as follows:
Classifier phase 1: classifies records into positive or negative classes. If the patient has malaria, the patient is classified as positive (P); if the patient does not have malaria, the patient is classified as negative (N).
Classifier phase 2: classifies only the records that were classified as positive by classifier 1, further assigning them the class labels complicated or uncomplicated. The classifier will also capture data on environmental factors, genetics, gender and age, and cultural and socio-economic variables. The system will be designed such that values must be supplied for the core parameters that serve as determining factors.
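A minimal sketch of this two-phase cascade follows, again on hypothetical stand-in data; the real classifiers are trained on training set 1 and training set 2 from the hospital records.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(2)

# Phase 1: positive/negative malaria classifier (stand-in for training set 1).
X1, y1 = rng.integers(0, 4, (300, 15)), rng.integers(0, 2, 300)
clf1 = MultinomialNB().fit(X1, y1)

# Phase 2: complicated/uncomplicated classifier, trained only on positive cases
# (stand-in for training set 2).
X2, y2 = rng.integers(0, 4, (150, 15)), rng.integers(0, 2, 150)
clf2 = MultinomialNB().fit(X2, y2)

def classify(record):
    """Cascade: phase 1 decides P/N; only positives go on to phase 2 severity grading."""
    if clf1.predict(record.reshape(1, -1))[0] == 0:
        return "negative"
    severity = clf2.predict(record.reshape(1, -1))[0]
    return "positive, complicated" if severity == 1 else "positive, uncomplicated"

print(classify(rng.integers(0, 4, 15)))
```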
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This Sentinel-1 InSAR dataset contains surface deformation that occurred between Nov. 2014 and Jan. 2019 associated with oil and gas production in the Permian Basin. For details of the processing method and uncertainty analysis, please see the associated paper of Staniewicz et al., 2020. Note: click the "tree" viewing option to see the proper organizational layout of files, rather than the "table" layout.

When using this data for research, please cite: Staniewicz, S., Chen, J., Lee, H., Olson, J., Savvaidis, A., Reedy, R., et al. (2020). InSAR reveals complex surface deformation patterns over an 80,000 square kilometer oil-producing region in the Permian Basin. Geophysical Research Letters, 47, e2020GL090151.

Details of generation and data attributes

Two paths of Sentinel-1 data were used in the analysis: the ascending path 78 and the descending path 85. For each path, the cumulative radar line-of-sight (LOS) deformation between (1) Nov. 2014 and Jan. 2017, (2) Nov. 2014 and Jan. 2018, and (3) Nov. 2014 and Jan. 2019 is included. The pixel spacing for all InSAR grids is 120 meters. All deformation data are in centimeters. The cumulative results in each map have an uncertainty of ~1 cm or less. All maps using ascending (or descending) Sentinel data are coregistered to the same latitude/longitude grid as the digital elevation model (DEM) covering the ascending (or descending) path. Note: we used SRTM DEM data to generate the interferograms. These DEM data can be found in geotiffs/ascending_path78/dem.tif and geotiffs/descending_path85/dem.tif. The DEM units are meters.

For each path, we provide the names and locations of the GPS stations with continuous coverage between Nov. 2014 and Jan. 2019 as CSV files. The GPS east, north, and vertical daily time series are available through the Nevada Geodetic Laboratory (http://geodesy.unr.edu/). For example, the NMHB station's NA plate-fixed solutions are available at http://geodesy.unr.edu/NGLStationPages/stations/NMHB.sta. The file geotiffs/ascending_path78/gps_locations.csv contains the name, latitude, and longitude of the stations within the ascending path, as well as the row and column of each location within the ascending latitude/longitude grid. The GPS station TXKM was used as the spatial reference point to calibrate all LOS InSAR maps; the remaining GPS stations were used as independent validation for the InSAR results.

In addition to providing the data in GeoTIFF format, we have also provided MATLAB .mat files (located in the matlab_version/ folder). The .mat files are divided into data coregistered on the ascending grid, data coregistered on the descending grid, and the vertical/east deformation solutions in the region where the ascending and descending paths overlap.

The definition of the radar LOS direction

Note that InSAR measures surface deformation along the radar LOS direction. In the region where the ascending and descending paths overlap, we decomposed the two LOS deformation solutions (Nov. 2014 to Jan. 2019), using the ascending and descending LOS maps (unitless, as in geotiffs/ascending_path78/los_enu.tif and geotiffs/descending_path85/los_enu.tif), into their horizontal and vertical components. These vertical/horizontal solutions are contained in geotiffs/vertical_horizontal_decomposition/. Further details of the LOS decomposition can be found in the associated paper and supplement of Staniewicz et al., 2020.
Converting GPS ENU data to the radar LOS

Here we show an example of how to convert GPS east, north, up (ENU) time series data into measurements comparable to the ascending LOS InSAR measurements using the LOS unit vector coefficients. We use station NMHB as an example, whose metadata is contained in the geotiffs/ascending_path78/gps_locations.csv file. The LOS vector coefficients are in the geotiffs/ascending_path78/los_enu.tif image (or path78_data.mat), which is a 3-band GeoTIFF containing the look vector coefficients. To extract the 3 LOS coefficients from the matrix los_enu, we could do the following in MATLAB:

load path78_data.mat
% Row and column of the first station (NMHB) in the latitude/longitude grid
r = gps_locations.row(1);
c = gps_locations.col(1);
% East, north, and up look-vector coefficients at that pixel
enu_coeffs = los_enu(r, c, :);
alpha_east = enu_coeffs(1);
alpha_north = enu_coeffs(2);
alpha_up = enu_coeffs(3);

Calling the east, north, and up time series ts_east, ts_north, and ts_up respectively, we can convert them to ts_LOS as follows:

ts_LOS = alpha_east * ts_east + alpha_north * ts_north + alpha_up * ts_up
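For users working in Python rather than MATLAB, an equivalent sketch of the same projection might look like the following. It assumes the rasterio and pandas packages are available (they are not part of this dataset), uses the row/col columns referenced in the MATLAB snippet, and substitutes dummy ENU time series for the real GPS data.

```python
import numpy as np
import pandas as pd
import rasterio

# Read the three LOS unit-vector bands (east, north, up) for the ascending path.
with rasterio.open("geotiffs/ascending_path78/los_enu.tif") as src:
    los_enu = src.read()  # shape: (3 bands, rows, cols)

# Row/column of the first station, as in the MATLAB example above.
gps = pd.read_csv("geotiffs/ascending_path78/gps_locations.csv")
r, c = int(gps.loc[0, "row"]), int(gps.loc[0, "col"])
alpha_east, alpha_north, alpha_up = los_enu[:, r, c]

# Project dummy daily ENU displacement series (cm) onto the radar line of sight.
ts_east, ts_north, ts_up = np.zeros(10), np.zeros(10), np.linspace(0, -1.0, 10)
ts_los = alpha_east * ts_east + alpha_north * ts_north + alpha_up * ts_up
print(ts_los)
```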
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The data that support the findings of this study are available from the corresponding author upon reasonable request and approval of the HR consultancy firm the data were obtained from. The Mplus code for the CFA and multilevel analyses is available at: https://osf.io/6f47s/
This study draws from brand positioning research to introduce the notions of points-of-relevance and points-of-difference to employer image research. Similar to prior research, this means that we start by investigating the relevant image attributes (points-of-relevance) that potential applicants use for judging organizations' attractiveness as an employer. However, we go beyond past research by examining whether the same points-of-relevance are used within and across industries. Next, we further extend current research by identifying which of the relevant image attributes also serve as points-of-difference for distinguishing between organizations and industries. The sample consisted of 24 organizations from 6 industries (total N = 7171). As a first key result, across industries and organizations, individuals attached similar importance to the same instrumental (job content, working conditions, and compensation) and symbolic (innovativeness, gentleness, and competence) image attributes in judging organizational attractiveness. Second, organizations and industries varied significantly on both instrumental and symbolic image attributes, with job content and innovativeness emerging as the strongest points-of-difference. Third, most image attributes showed greater variation between industries than between organizations, pointing at the importance of studying employer image at the industry level. Implications for recruitment research, employer branding, and best employer competitions are discussed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A collection of curated and standardized values used by the HuBMAP metadata records to ensure uniformity in the description of samples and single-cell data produced by the consortium (Bueckle et al. 2025).
Bibliography:
Research Ship Oceanus Underway Meteorological Data (delayed ~10 days for quality control) are from the Shipboard Automated Meteorological and Oceanographic System (SAMOS) program. IMPORTANT: ALWAYS USE THE QUALITY FLAG DATA! Each data variable's metadata includes a qcindex attribute which indicates a character number in the flag data. ALWAYS check the flag data for each row of data to see which data is good (flag='Z') and which data isn't. For example, to extract just data where time (qcindex=1), latitude (qcindex=2), longitude (qcindex=3), and airTemperature (qcindex=12) are 'good' data, include this constraint in your ERDDAP query: flag=~'ZZZ........Z.*'. '=~' indicates this is a regular expression constraint. The 'Z's are literal characters. In this dataset, 'Z' indicates 'good' data. The '.'s say to match any character. The '*' says to match the previous character 0 or more times. (Don't include backslashes in your query.) See the tutorial for regular expressions at http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
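For convenience, a small helper like the following (a hypothetical Python function, not part of ERDDAP or SAMOS) can build such a flag constraint from the qcindex values listed in the variable metadata:

```python
def flag_regex(good_positions, max_index):
    """Build an ERDDAP flag constraint like flag=~'ZZZ........Z.*':
    'Z' at each 1-based qcindex that must be good, '.' elsewhere, then '.*'."""
    chars = ["Z" if i in good_positions else "." for i in range(1, max_index + 1)]
    return "flag=~'" + "".join(chars) + ".*'"

# Require good time (1), latitude (2), longitude (3), and airTemperature (12).
print(flag_regex({1, 2, 3, 12}, 12))  # flag=~'ZZZ........Z.*'
```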
Xverum’s Point of Interest (POI) Data is a comprehensive dataset containing 230M+ verified locations across 5000 business categories. Our dataset delivers structured geographic data, business attributes, location intelligence, and mapping insights, making it an essential tool for GIS applications, market research, urban planning, and competitive analysis.
With regular updates and continuous POI discovery, Xverum ensures accurate, up-to-date information on businesses, landmarks, retail stores, and more. Delivered in bulk to S3 Bucket and cloud storage, our dataset integrates seamlessly into mapping, geographic information systems, and analytics platforms.
🔥 Key Features:
Extensive POI Coverage: ✅ 230M+ Points of Interest worldwide, covering 5000 business categories. ✅ Includes retail stores, restaurants, corporate offices, landmarks, and service providers.
Geographic & Location Intelligence Data: ✅ Latitude & longitude coordinates for mapping and navigation applications. ✅ Geographic classification, including country, state, city, and postal code. ✅ Business status tracking – Open, temporarily closed, or permanently closed.
Continuous Discovery & Regular Updates: ✅ New POIs continuously added through discovery processes. ✅ Regular updates ensure data accuracy, reflecting new openings and closures.
Rich Business Insights: ✅ Detailed business attributes, including company name, category, and subcategories. ✅ Contact details, including phone number and website (if available). ✅ Consumer review insights, including rating distribution and total number of reviews (additional feature). ✅ Operating hours where available.
Ideal for Mapping & Location Analytics: ✅ Supports geospatial analysis & GIS applications. ✅ Enhances mapping & navigation solutions with structured POI data. ✅ Provides location intelligence for site selection & business expansion strategies.
Bulk Data Delivery (NO API): ✅ Delivered in bulk via S3 Bucket or cloud storage. ✅ Available in structured format (.json) for seamless integration.
🏆Primary Use Cases:
Mapping & Geographic Analysis: 🔹 Power GIS platforms & navigation systems with precise POI data. 🔹 Enhance digital maps with accurate business locations & categories.
Retail Expansion & Market Research: 🔹 Identify key business locations & competitors for market analysis. 🔹 Assess brand presence across different industries & geographies.
Business Intelligence & Competitive Analysis: 🔹 Benchmark competitor locations & regional business density. 🔹 Analyze market trends through POI growth & closure tracking.
Smart City & Urban Planning: 🔹 Support public infrastructure projects with accurate POI data. 🔹 Improve accessibility & zoning decisions for government & businesses.
💡 Why Choose Xverum’s POI Data?
Access Xverum’s 230M+ POI dataset for mapping, geographic analysis, and location intelligence. Request a free sample or contact us to customize your dataset today!
Research Ship Atlantis Underway Meteorological Data (delayed ~10 days for quality control) are from the Shipboard Automated Meteorological and Oceanographic System (SAMOS) program. IMPORTANT: ALWAYS USE THE QUALITY FLAG DATA! Each data variable's metadata includes a qcindex attribute which indicates a character number in the flag data. ALWAYS check the flag data for each row of data to see which data is good (flag='Z') and which data isn't. For example, to extract just data where time (qcindex=1), latitude (qcindex=2), longitude (qcindex=3), and airTemperature (qcindex=12) are 'good' data, include this constraint in your ERDDAP query: flag=~'ZZZ........Z.*'. '=~' indicates this is a regular expression constraint. The 'Z's are literal characters. In this dataset, 'Z' indicates 'good' data. The '.'s say to match any character. The '*' says to match the previous character 0 or more times. (Don't include backslashes in your query.) See the tutorial for regular expressions at http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
Research Ship Nathaniel B. Palmer Underway Meteorological Data (delayed ~10 days for quality control) are from the Shipboard Automated Meteorological and Oceanographic System (SAMOS) program. IMPORTANT: ALWAYS USE THE QUALITY FLAG DATA! Each data variable's metadata includes a qcindex attribute which indicates a character number in the flag data. ALWAYS check the flag data for each row of data to see which data is good (flag='Z') and which data isn't. For example, to extract just data where time (qcindex=1), latitude (qcindex=2), longitude (qcindex=3), and airTemperature (qcindex=12) are 'good' data, include this constraint in your ERDDAP query: flag=~'ZZZ........Z.*'. '=~' indicates this is a regular expression constraint. The 'Z's are literal characters. In this dataset, 'Z' indicates 'good' data. The '.'s say to match any character. The '*' says to match the previous character 0 or more times. (Don't include backslashes in your query.) See the tutorial for regular expressions at http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
https://www.frdr-dfdr.ca/docs/en/depositing_data/#data-usage-licenses
This resource contains the CAMELS-SPAT data set. CAMELS-SPAT provides data that can support hydrologic modeling and analysis for 1426 streamflow measurement stations located across the United States and Canada.
The area upstream of each station has been divided into various subbasins. The provided data include: (1) shapefiles outlining the location of each basin and its subbasins, (2) streamflow observations at daily and hourly resolution at the outlet of each basin, (3) meteorological data from 4 different data sets (RDRS, EM-Earth, ERA5, Daymet), at their native gridded resolution as well as averaged to the basin and subbasin level, (4) geospatial data from 11 different data sets at their native gridded resolution, and (5) statistical summaries (i.e., catchment attributes) calculated from the streamflow, meteorological, and geospatial data at the basin and subbasin level.
Data set structure is described in the README found in this repository. Data set development is described in Knoben et al (under review; https://doi.org/10.5194/egusphere-2025-893). When using the CAMELS-SPAT data, please follow the attribution guidelines provided in Section 6 in this paper (briefly, individual attribution of any data set included in CAMELS-SPAT is requested if this data is used). BibTeX entries for the individual data sources aggregated in CAMELS-SPAT are provided in the citation.bib file found in this repository.
Temporary reference: Knoben, W. J. M., Keshavarz, K., Torres-Rojas, L., Thébault, C., Chaney, N. W., Pietroniro, A., and Clark, M. P.: Catchment Attributes and MEteorology for Large-Sample SPATially distributed analysis (CAMELS-SPAT): Streamflow observations, forcing data and geospatial data for hydrologic studies across North America, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2025-893, 2025
Xverum’s B2B Contact Data provides access to 750M+ verified individual profiles and 50M+ company records, enabling B2B lead generation, talent sourcing, and labor market research. Our 100+ data attributes per profile, combined with regular updates, ensure your business always has the most accurate, actionable, and compliant insights for sales, HR, and workforce analytics.
Key Features: 💠 Comprehensive Global Coverage: Access 750M+ individual profiles and 50M+ company profiles across 200+ countries. 💠 Data Enrichment & Personalization: Leverage 100+ attributes such as job title, industry, experience level, and company size to enhance CRM systems, build talent pipelines, and personalize marketing campaigns. 💠 Profile Change Detection: Stay ahead with detailed tracking of individual job changes, employer transitions, and hiring trends. 💠 High-Frequency Data Updates: Enjoy 3x faster refresh rates, with 300M+ records updated every month to keep your data fresh and reliable. 💠 HR & Workforce Analytics Ready: Identify hiring patterns and track talent movement across industries with fresh labor market insights. 💠 GDPR-Compliant and Secure: Fully compliant with global data privacy laws, ensuring secure and ethical data usage for all business applications.
Primary Use Cases:
🚀 B2B & Sales Enablement: ▪️Lead Generation & Prospecting: Identify high-value B2B leads with regular profile updates. ▪️B2B Marketing Campaigns: Create highly targeted outreach using firmographic and profile change insights. ▪️CRM Data Enrichment: Enrich existing customer records with detailed business intelligence.
💼 HR Tech & Talent Sourcing: ▪️Candidate & Talent Intelligence: Find top professionals and executives based on experience, industry, and location. ▪️Workforce Planning & Recruitment: Track career transitions and hiring trends to identify top employers and high-growth sectors. ▪️HR Tech & ATS Integration: Seamlessly integrate B2B professional insights into recruitment platforms and applicant tracking systems (ATS).
📊 Labor Market Research & Analysis: ▪️Job Market Trends: Monitor industry hiring trends, job role demand, and regional workforce dynamics. ▪️Competitive Talent Benchmarking: Compare hiring strategies and track workforce movements across competitors. ▪️Diversity & Inclusion Research: Analyze workforce composition and track diversity trends in hiring and employment.
Why Choose Xverum’s B2B Contact Data? ✅ 750M+ Profiles & 50M+ Companies – One of the largest verified B2B databases globally. ✅ 300M+ Monthly Updates – Ensuring access to the latest data. ✅ 100+ Data Attributes – Including job titles, full experience histories, industries, hiring trends, and more. ✅ 100% GDPR & CCPA Compliant – Secure and ethical data sourcing for global applications.
Contact us for a customized B2B, talent intelligence, or labor market research data solution. Stay ahead with Xverum’s fresh, accurate, and actionable workforce insights.