Most people in industrialized countries regularly purchase products online. Consumers often rely on previous customers’ reviews to make purchasing decisions. The current research investigates whether potential online customers select these reviews in a biased way and whether typical interface properties of product evaluation portals foster biased selection. Based on selective exposure research, potential online customers should have a bias towards selecting positive reviews when they have an initial preference for a product. We tested this prediction across five studies (total N = 1376) while manipulating several typical properties of the review selection interface that should – according to earlier findings – facilitate biased selection. Across all studies, we found some evidence for a bias in favor of selecting positive reviews, but the aggregated effect was non-significant in an internal meta-analysis. Contrary to our hypothesis and not replicating previous research, none of the interface properties that were assumed to increase biased selection led to the predicted effects. Overall, the current research suggests that biased information selection, which has regularly been found in many other contexts, only plays a minor role in online review selection. Thus, there is no need to fear that product evaluation portals elicit biased impressions about products among consumers due to selective exposure.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:
Context:
Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.
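As an illustration of the associations that market basket analysis looks for, the sketch below computes support, confidence and lift for item pairs. The baskets and item names are hypothetical and are not taken from this dataset; `pair_rules` is an illustrative helper, not part of any library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets; item names are illustrative, not from the dataset.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]


def pair_rules(baskets):
    """Return {(x, y): (support, confidence, lift)} for each rule x -> y."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), both in pair_counts.items():
        for x, y in ((a, b), (b, a)):
            support = both / n                        # share of baskets with x and y
            confidence = both / item_counts[x]        # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules[(x, y)] = (support, confidence, lift)
    return rules


rules = pair_rules(baskets)
```

A lift above 1 suggests the two items appear together more often than independence would predict; with real data, rules are usually filtered by a minimum support first.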
Inspiration:
The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.
Dataset Information:
The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:
Use Cases:
Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. It stores the datasets used in some of the studies that served as research material for the thesis, as well as the datasets used in the experimental part of the work.
Below are the datasets specified, along with the details of their references, authors, and download sources.
----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains the ratings and reviews of 1K+ Amazon products, together with their details as listed on the official Amazon website. The data was scraped from that website in January 2023.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.
The data was collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
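Because the rows are ordered by label, a reproducible shuffle is a sensible first step. The following is a minimal sketch using only the standard library. The in-memory sample stands in for data_rt.csv, and `load_shuffled` and the seed value are illustrative choices.

```python
import csv
import io
import random

# Tiny stand-in for data_rt.csv: negatives first, then positives.
sample_csv = (
    "reviews,labels\n"
    "bad film,0\n"
    "dull plot,0\n"
    "great cast,1\n"
    "fun ride,1\n"
)


def load_shuffled(handle, seed=0):
    """Read all rows as dicts and shuffle them with a fixed seed."""
    rows = list(csv.DictReader(handle))
    random.Random(seed).shuffle(rows)  # seeded so the order is reproducible
    return rows


rows = load_shuffled(io.StringIO(sample_csv))
```

For the real file, the same function can be called with `open("data_rt.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` object, assuming the file is UTF-8 encoded.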
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
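The description does not state which cut-offs turn a polarity score into the division label. The sketch below shows one plausible mapping, assuming the score lies in TextBlob's polarity range of -1.0 to 1.0. The label names, the `neutral_band` parameter and the function name are all assumptions made for illustration.

```python
def polarity_to_division(polarity, neutral_band=0.0):
    """Map a polarity score in [-1.0, 1.0] to a categorical label.

    The thresholds and label names are assumptions; the dataset
    description does not say which cut-offs were actually used.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


labels = [polarity_to_division(p) for p in (0.4, -0.2, 0.0)]
```

Widening `neutral_band` (for example to 0.1) treats weakly polar reviews as neutral, which can reduce label noise at the cost of fewer positive and negative examples.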
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9,930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train machine learning models for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This users dataset is a preview of a much bigger dataset, with lots of related data (product listings of sellers, comments on listed products, etc...).
My Telegram bot will answer your queries and allow you to contact me.
There are a lot of unknowns when running an E-commerce store, even when you have analytics to guide your decisions.
Users are an important factor in an e-commerce business. This is especially true in a C2C-oriented store, since they are both the suppliers (by uploading their products) AND the customers (by purchasing other user's articles).
This dataset aims to serve as a benchmark for an e-commerce fashion store. Using this dataset, you may want to try to understand what you can expect of your users and determine in advance what your growth may be.
If you think this kind of dataset may be useful or if you liked it, don't forget to show your support or appreciation with an upvote/comment. You may even include how you think this dataset might be of use to you. This way, I will be more aware of specific needs and be able to adapt my datasets to better suit your needs.
This dataset is part of a preview of a much larger dataset. Please contact me for more.
The data was scraped from a successful online C2C fashion store with over 10M registered users. The store was first launched in Europe around 2009 then expanded worldwide.
Visitors vs Users: Visitors do not appear in this dataset. Only registered users are included. "Visitors" cannot purchase an article but can view the catalog.
Questions you might want to answer using this dataset:
Example works:
For other licensing options, contact me.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We propose the Safe Human dataset, consisting of 17 different objects and referred to as the SH17 dataset. We scraped images from the Pexels website, which offers clear usage rights for all its images, showcasing a range of human activities across diverse industrial operations.
To extract relevant images, we used multiple queries such as manufacturing worker, industrial worker, human worker, labor, etc. The tags associated with Pexels images proved reasonably accurate. After removing duplicate samples, we obtained a dataset of 8,099 images. The dataset exhibits significant diversity, representing manufacturing environments globally, thus minimizing potential regional or racial biases. Samples of the dataset are shown below.
The data consists of three folders,
This dataset, scraped from the Pexels website, is intended for educational, research, and analysis purposes only. You may be able to use the data for training machine learning models only. Users are urged to use this data responsibly, ethically, and within the bounds of legal stipulations.
Legal Simplicity: All photos and videos on Pexels can be downloaded and used for free.
The dataset is provided "as is," without warranty, and the creator disclaims any legal liability for its use by others.
Users are encouraged to consider the ethical implications of their analyses and the potential impact on the broader community.
1. Sales Analysis:
Sales data forms the backbone of this dataset, and it allows users to delve into various aspects of sales performance.
2. Product Analysis:
Each product in this dataset comes with its unique identifier (StockCode) and its name (Description).
3. Customer Segmentation:
If you attach specific business logic to the transactions (such as calculating total amounts), you could apply standard machine learning methods or RFM (Recency, Frequency, Monetary) segmentation, keyed on 'CustomerID', to better understand the behavior of your customer base.
4. Geographical Analysis:
The Country column enables analysts to study purchase patterns across different geographical locations.
5. Sales Performance Dashboard:
To track the sales performance of the online retail company, a sales performance dashboard can be created. This dashboard can include key metrics such as total sales, sales by product category, sales by customer segment, and sales by geographical location. By visualizing the sales data in an interactive dashboard, it becomes easier to identify trends, patterns, and areas for improvement.
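The RFM segmentation mentioned under Customer Segmentation can be sketched as follows. Only 'CustomerID' is named in the description; the invoice date and line total fields, the sample rows and the `rfm_table` helper are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (CustomerID, invoice date, line total).
transactions = [
    ("C1", date(2011, 12, 1), 50.0),
    ("C1", date(2011, 12, 8), 30.0),
    ("C2", date(2011, 10, 15), 200.0),
]


def rfm_table(transactions, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Keep the most recent purchase date per customer.
        last_seen[customer] = max(last_seen.get(customer, day), day)
        frequency[customer] += 1      # counts rows, not distinct invoices
        monetary[customer] += amount  # total spend
    return {
        c: ((as_of - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }


rfm = rfm_table(transactions, as_of=date(2011, 12, 10))
```

In practice each of the three values is usually binned into quantile scores before customers are grouped; that step is omitted here to keep the sketch short.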
Dataset and associated material from an intervention study to test whether online video resources integrated into an action plan will result in greater engagement with the resources than when the resources are not part of a plan. Data were collected in partnership with a wellbeing company specialising in the provision of online wellbeing resources. The data collected were quantitative data about participants’ engagement with the wellbeing resources, including demographic and self-reported variables (N = 67), qualitative text data about participants' perceived barriers to the intervention implementation (N = 67) and anonymised transcripts of qualitative follow up interviews with study participants (N = 10).
Proper nutrition and healthy diets are a key aspect of health, which mandatory food labelling in the UK tries to address by empowering people with the information to help them make healthier choices. The format of this information (e.g., verbal quantifiers like 'low fat' or numerical quantifiers like '5% fat') affects whether people can easily understand and use food labels. Examining how people's judgements and decisions with respect to food differ depending on food label format therefore has wide-reaching impact for health policy decisions, consumer behaviour, and food industry practice. This project will use computational methods to identify different strategies people use to decide what foods are healthiest (e.g., less fat, or less sugar, etc.). I will evaluate which strategies produce the healthiest choices, use these insights to inform policy and conduct knowledge exchange with my industry partner. The project will consolidate my PhD, which investigated differences in people's decision-making strategies when using verbal and numerical quantifiers on food labels. 
Using a mixture of behavioural tasks, surveys, and eye-tracking methodology, I identified that different ways of presenting quantities can lead to people relying on different pieces of information to judge food. I intend to extend this research and maximise its impact in four ways. First, I will apply new and advanced statistical modelling to my research. To classify and predict food choice strategies in my data, I will learn two modelling techniques: multinomial processing trees, a probability-based method to classify choices, and machine learning, which makes predictions based on patterns in data. For example, I would expect the models to identify cues on food labelling that predict the choices people will make. Using the results of these analyses, I will submit a planned research protocol (a 'Registered Report') to test my model on real-life products. Registered Reports receive peer review prior to data collection, so submitting it during the Fellowship supports my future academic research beyond the Fellowship. Second, I will extend the impact of my work through knowledge exchange with the start-up company Keep Fit Eat Fit Wellbeing Ltd (KFEF). As part of a holistic wellness package, KFEF produces healthy eating advice and recipes with nutritional information for their clients. My research will inform the design of their content for clients. In turn, working with them gives me access to usage metrics from their customer portal that I will analyse to determine if the communication formats are effective. These real-world data will reinforce the lab studies from my PhD and help KFEF improve their product offering. Third, I will disseminate my research findings to academic and non-academic audiences. For academic audiences, I will produce three new journal articles and present my work at one local and one international academic conference. 
I will also engage with non-academic audiences through preparing press releases, submitting a policy brief to present at the All-Party Parliamentary Food and Health Forum, and attending a Westminster Food and Nutrition Forum conference. Engaging with policy-makers through these channels will help me lobby for positive change to food labelling guidelines. Finally, I will prepare a proposal for funding from the Wellcome Trust to create and test a technological system that supports informed food choices. This future proposal will be informed by my PhD data, computational modelling research, and collaborations with: industry (Keep Fit Eat Fit), experts in shaping behavioural policy (at the University of Reading), and experts in technological health interventions (at the University of Konstanz). Ultimately, my research seeks to improve the food choice environment for consumers and empower them to make informed, healthy choices. Online survey with experimental component (condition randomisation), behavioural (resource usage) outcomes and qualitative (online one-on-one interviews) data collection.
As of early 2023, approximately ** percent of consumers in the United States said they would prefer to shop mostly online rather than in-store, making it the country with the highest online shopping preference. In contrast, more shoppers preferred visiting physical stores in countries such as Austria, Finland, and New Zealand.
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Unit (PSU)s and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. 
This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. 
Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used
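The description says PERSONID links the State Identifier files to the questionnaire data. A minimal sketch of such a link is shown below. The rows and every field name other than PERSONID are hypothetical, and `attach_state` is an illustrative helper. The last two lines check counts quoted in the description.

```python
# Hypothetical rows; only PERSONID is named in the description.
questionnaire = [
    {"PERSONID": "P001", "answer": 1},
    {"PERSONID": "P002", "answer": 0},
]
state_ids = [
    {"PERSONID": "P001", "state_abbr": "NY"},
]


def attach_state(questionnaire, state_ids):
    """Left-join state identifier fields onto questionnaire rows."""
    by_person = {row["PERSONID"]: row for row in state_ids}
    merged = []
    for row in questionnaire:
        extra = by_person.get(row["PERSONID"], {})
        merged.append({**extra, **row})  # questionnaire values win on key clashes
    return merged


merged = attach_state(questionnaire, state_ids)

# Counts quoted in the description above.
wave1_respondents = 32_320 + 13_651  # adult cases plus youth cases
wave1_cohort = 45_971 + 7_207        # respondents plus "shadow youth"
```

The two sums come to 45,971 and 53,178, matching the Wave 1 respondent and Wave 1 Cohort totals stated in the text.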
RTB Maps is a cloud-based electronic Atlas. We used ArcGIS 10 for Desktop with Spatial Analysis Extension, ArcGIS 10 for Server on-premise, ArcGIS API for Javascript, IIS web services based on .NET, and ArcGIS Online, combining data on the cloud with data and applications on our local server, to develop an Atlas that brings together many of the map themes related to development of roots, tubers and banana crops. The Atlas is structured to allow our participating scientists to understand the distribution of the crops and observe the spatial distribution of many of the obstacles to production of these crops. The Atlas also includes an application to allow our partners to evaluate the importance of different factors when setting priorities for research and development. The application uses weighted overlay analysis within a multi-criteria decision analysis framework to rate the importance of factors when establishing geographic priorities for research and development.
Datasets of crop distribution maps, agroecology maps, biotic and abiotic constraints to crop production, poverty maps and other demographic indicators are used as key inputs to multi-objective criteria analysis.
Further metadata/references can be found here: http://gisweb.ciat.cgiar.org/RTBmaps/DataAvailability_RTBMaps.html
DISCLAIMER, ACKNOWLEDGMENTS AND PERMISSIONS:
This service is provided by Roots, Tubers and Bananas CGIAR Research Program as a public service. Use of this service to retrieve information constitutes your awareness and agreement to the following conditions of use.
This online resource displays GIS data and query tools subject to continuous updates and adjustments.
The GIS data has been taken from various, mostly public, sources and is supplied in good faith.
RTBMaps GIS Data Disclaimer
• The data used to show the Base Maps is supplied by ESRI.
• The data used to show the photos over the map is supplied by Flickr.
• The data used to show the videos over the map is supplied by Youtube.
• The population map is supplied to us by CIESIN, Columbia University and CIAT.
• The Accessibility map is provided by Global Environment Monitoring Unit - Joint Research Centre of the European Commission. Accessibility maps are made for a specific purpose and they cannot be used as a generic dataset to represent "the accessibility" for a given study area.
• Harvested area and yield for banana, cassava, potato, sweet potato and yam for the year 2000 are provided by EarthStat (University of Minnesota’s Institute on the Environment-Global Landscapes initiative and McGill University’s Land Use and the Global Environment lab). Dataset from Monfreda C., Ramankutty N., and Foley J.A. 2008.
• Agroecology dataset: global edapho-climatic zones for cassava based on mean growing season, temperature, number of dry season months, daily temperature range and seasonality. Dataset from CIAT (Carter et al. 1992)
• Demography indicators: Total and Rural Population from Center for International Earth Science Information Network (CIESIN) and CIAT 2004.
• The FGGD prevalence of stunting map is a global raster datalayer with a resolution of 5 arc-minutes. The percentage of stunted children under five years old is reported according to the lowest available sub-national administrative units: all pixels within the unit boundaries will have the same value. Data have been compiled by FAO from different sources: Demographic and Health Surveys (DHS), UNICEF MICS, WHO Global Database on Child Growth and Malnutrition, and national surveys. Data provided by FAO – GIS Unit 2007.
• Poverty dataset: Global poverty headcount and absolute number of poor. 
Number of people living on less than $1.25 or $2.00 per day. Dataset from IFPRI and CIAT.
THE RTBMAPS GROUP MAKES NO WARRANTIES OR GUARANTEES, EITHER EXPRESSED OR IMPLIED AS TO THE COMPLETENESS, ACCURACY, OR CORRECTNESS OF THE DATA PORTRAYED IN THIS PRODUCT NOR ACCEPTS ANY LIABILITY, ARISING FROM ANY INCORRECT, INCOMPLETE OR MISLEADING INFORMATION CONTAINED THEREIN. ALL INFORMATION, DATA AND DATABASES ARE PROVIDED "AS IS" WITH NO WARRANTY, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, FITNESS FOR A PARTICULAR PURPOSE. By accessing this website and/or data contained within the databases, you hereby release the RTB group and CGCenters, its employees, agents, contractors, sponsors and suppliers from any and all responsibility and liability associated with its use. In no event shall the RTB Group or its officers or employees be liable for any damages arising in any way out of the use of the website, or use of the information contained in the databases herein including, but not limited to the RTBMaps online Atlas product.
APPLICATION DEVELOPMENT:
• Desktop and web development - Ernesto Giron E. (GeoSpatial Consultant) e.giron.e@gmail.com
• GIS Analyst - Elizabeth Barona. (Independent Consultant) barona.elizabeth@gmail.com
Collaborators:
Glenn Hyman, Bernardo Creamer, Jesus David Hoyos, Diana Carolina Giraldo, Soroush Parsa, Jagath Shanthalal, Herlin Rodolfo Espinosa, Carlos Navarro, Jorge Cardona and Beatriz Vanessa Herrera at CIAT, Tunrayo Alabi and Joseph Rusike from IITA, Guy Hareau, Reinhard Simon, Henry Juarez, Ulrich Kleinwechter, Greg Forbes, Adam Sparks from CIP, and David Brown and Charles Staver from Bioversity International.
Please note these services may be unavailable at times due to maintenance work.
Please feel free to contact us with any questions or problems you may be having with RTBMaps.
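The weighted overlay analysis mentioned for the Atlas's priority-setting application can be illustrated with a small sketch. This is not the Atlas's own implementation: the layer names, the 1 to 5 suitability scores, the weights and the `weighted_overlay` helper are all hypothetical, and the layers are assumed to be already rescaled to a common scale on the same grid.

```python
def weighted_overlay(layers, weights):
    """Combine equally sized score grids into one weighted-average grid.

    `layers` maps a layer name to a 2-D list of scores on a common scale;
    `weights` maps the same names to their relative importance.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [
        [
            sum(layers[name][r][c] * weights[name] for name in names) / total
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Hypothetical layers, already rescaled to a 1-5 priority score.
layers = {
    "poverty": [[1, 5], [3, 3]],
    "crop_area": [[5, 1], [3, 5]],
}
weights = {"poverty": 0.75, "crop_area": 0.25}
priority = weighted_overlay(layers, weights)
```

Changing the weights is how a user would express that one factor matters more than another when ranking locations; cells with higher combined scores would be candidates for higher priority.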
Upvote! The database contains +40,000 records on US Gross Rent & Geo Locations. The field description of the database is documented in the attached pdf file. To access all 325,272 records on a scale roughly equivalent to a neighborhood (census tract), see the link below and make sure to upvote. Upvote right now, please. Enjoy!
Get the full free database with coupon code: FreeDatabase, See directions at the bottom of the description... And make sure to upvote :) coupon ends at 2:00 pm 8-23-2017
The data set was originally developed for real estate and business investment research. Income is a vital element when determining both quality and socioeconomic features of a given geographic location. The following data was derived from over +36,000 files and covers 348,893 location records.
Only proper citation is required; please see the documentation for details. Have fun!
Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.
For any questions, you may reach us at research_development@goldenoakresearch.com. For immediate assistance, you may reach me at 585-626-2965.
Please note: this is my personal number, and email is preferred.
Check our data's accuracy: Census Fact Checker
Don't settle. Go big and win big. Optimize your potential. To access all gross rent records and more on a scale roughly equivalent to a neighborhood, see the link below:
A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.
National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020. Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022. The data products in this publication that represent where people live, reflect 2020 estimates of housing units and 2021 estimates of population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.
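To make the change of cell size concrete: going from 270 m to 30 m cells means each coarse cell covers a 9 x 9 block of fine cells. The sketch below uses plain nearest-neighbor replication, which is an assumption; the description does not state which resampling method was used, and the spreading of values into non-burnable developed areas is not modeled here. The grid values and `upsample_nearest` are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Replicate every cell of a 2-D grid into a factor x factor block."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2 x 2 burn-probability grid at 270 m resolution.
coarse = [
    [0.01, 0.02],
    [0.00, 0.04],
]
fine = upsample_nearest(coarse, 270 // 30)  # factor 9 gives an 18 x 18 grid
```

Replication preserves each coarse value exactly; smoother alternatives such as bilinear interpolation would blend values across coarse cell boundaries.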
The specific raster datasets included in this publication include:
Building Count: Building Count is a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
Building Density: Building Density is a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
Building Coverage: Building Coverage is a 30-m raster depicting the percentage of habitable land area covered by building footprints.
Population Count (PopCount): PopCount is a 30-m raster with pixel values representing residential population count (persons) in each pixel.
Population Density (PopDen): PopDen is a 30-m raster of residential population density (people/km²).
Housing Unit Count (HUCount): HUCount is a 30-m raster representing the number of housing units in each pixel.
Housing Unit Density (HUDen): HUDen is a 30-m raster of housing-unit density (housing units/km²).
Housing Unit Exposure (HUExposure): HUExposure is a 30-m raster that represents the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and not intended to represent the actual number of housing units exposed in any specific year.
Housing Unit Impact (HUImpact): HUImpact is a 30-m raster that represents the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity and uses flame length probabilities from wildfire modeling to capture likely intensity of fire.
Housing Unit Risk (HURisk): HURisk is a 30-m raster that integrates all four primary elements of wildfire risk - likelihood, intensity, susceptibility, and exposure - on pixels where housing unit density is greater than zero.
The geospatial data products described and distributed here are part of the Wildfire Risk to Communities project. This project was directed by Congress in the 2018 Consolidated Appropriations Act (i.e., 2018 Omnibus Act, H.R. 1625, Section 210: Wildfire Hazard Severity Mapping) to help U.S. communities understand components of their relative wildfire risk profile, the nature and effects of wildfire risk, and actions communities can take to mitigate risk. The first edition of these data represented the first time wildfire risk to communities had been mapped nationally with consistent methodology. They provided foundational information for comparing the relative wildfire risk among populated communities in the United States. In this version, the 2nd edition, we use improved modeling and mapping methodology and updated input data to generate the current suite of products.
See the Wildfire Risk to Communities website at https://www.wildfirerisk.org for complete project information and an interactive web application for exploring some of the datasets published here. We deliver the data here as zip files by U.S. state (including AK and HI), and for the full extent of the continental U.S.
This data publication is a second edition and represents an update to any previous versions of Wildfire Risk to Communities risk datasets published by the USDA Forest Service. This second edition was originally published on 06/03/2024. On 09/10/2024, a minor correction was made to the abstract in this overall metadata document as well as the individual metadata documents associated with each raster dataset. The supplemental file containing data product descriptions was also updated. In addition, we separated the large CONUS download into a series of smaller zip files (one for each layer).
There are two companion data publications that are part of the WRC 2.0 data update: one that characterizes landscape-wide wildfire hazard and risk for the nation (Scott et al. 2024, https://doi.org/10.2737/RDS-2020-0016-2), and one that delineates wildfire risk reduction zones and provides tabular summaries of wildfire hazard and risk raster datasets (Dillon et al. 2024, https://doi.org/10.2737/RDS-2024-0030).
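The Housing Unit Exposure layer described above is an expected annual value, which can be sketched numerically. The sketch below assumes that exposure in a pixel is the housing-unit count multiplied by the annual burn probability; the source does not give the formula, so this is an illustrative assumption. The function name and all values are hypothetical.

```python
def housing_unit_exposure(hu_count, burn_prob):
    """Per-pixel expected housing units exposed per year (count x probability)."""
    return [
        [count * prob for count, prob in zip(count_row, prob_row)]
        for count_row, prob_row in zip(hu_count, burn_prob)
    ]


# Hypothetical 2 x 2 rasters: housing units per pixel and annual burn probability.
hu_count = [
    [0.0, 2.0],
    [1.5, 4.0],
]
burn_prob = [
    [0.010, 0.002],
    [0.000, 0.005],
]

hu_exposure = housing_unit_exposure(hu_count, burn_prob)

# Summing over pixels gives the long-term annual average for the whole area.
total_exposure = sum(sum(row) for row in hu_exposure)
print(round(total_exposure, 3))
```

Under this assumption, a pixel with no housing units or zero burn probability contributes nothing, and the total is a long-term average rather than a count for any specific year.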
https://brightdata.com/license
Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.
Dataset Features
User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.
Customizable Subsets for Specific Needs
Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.
Popular Use Cases
Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.
Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The growth of the global organic food industry over the last two decades has been dramatic. This is evident from rising consumer spending and growing attention to health, environmental concerns, and food safety. As reported by FiBL and IFOAM (2023), the global market for organic food reached a cumulative value of 187 billion USD in 2022, growing annually at close to 10% [1]. The demand for organic vegetables has increased as consumers shift to healthier and more sustainable eating patterns.
This trend aligns with SDG 3 (Good Health and Well-being) and SDG 12 (Responsible Consumption and Production), which promote healthy lifestyles and sustainable food systems.
Countries such as Germany, France, and the United States lead the market, and Europe and North America have become the most developed regions for organic food. By 2022, the European organic market was worth 52 billion Euros. Moreover, organic food expenditure per capita reached 135 Euros in Denmark and 102 Euros in Switzerland [1]. Consumers in these regions prioritize certified organic products and products grown without pesticides and with cruelty-free farming methods.
On the other hand, Indonesia and some other emerging economies are slowly but steadily increasing their organic food consumption. Indonesia's organic food market is still in an early phase of development. However, because of the increased focus on healthy living among middle-class consumers, it has considerable growth potential. The organic food sector in Indonesia is reported to have grown by 12.3% in 2021, with organic vegetables accounting for around 60% of all organic product sales [2]. However, with per capita spending on organic food below $5 a year, Indonesia faces more challenges than opportunities in promoting its organic market compared with developed countries.
This shows a gap in achieving SDG 10 (Reduced Inequalities) in terms of access to healthy and sustainable food.
Even with the increasing demand for organic vegetables, multiple difficulties inhibit widespread acceptance by Indonesians:
High price premiums: Organic vegetables in Indonesia are often 40-80% more expensive than conventional alternatives [2].
Limited Availability and Accessibility: Rural consumers have limited access to organic products primarily sold in urban supermarkets or online.
Consumer trust in certification: Regarding the credibility and legibility of organic labels, 42% of Indonesian consumers expressed doubts in a BPS survey in 2021 [3].
Food waste remains one of Indonesia's biggest challenges, and vegetables top the list. A study showed that vegetables constitute 31% of the most discarded food items, followed by rice (20%), meat (11%), dairy (10%), and fish (10%). This enormous waste represents a loss of resources and adds to environmental problems, because organic waste gives off greenhouse gases when it decomposes.
This directly contradicts the targets of SDG 12 (Responsible Consumption and Production) and SDG 13 (Climate Action), which call for waste reduction and the mitigation of greenhouse gas emissions from food systems.
Timestamp | The entrepreneurship learning material I receive is relevant to today's business world. (X1) | The teaching methods in the entrepreneurship course make it easy for me to understand how to start a business. (X1) | The learning objectives of this course are clear and support me in becoming an entrepreneur. (X1) |
---|---|---|---|
2025-05-28 17:40:53.301000 | Strongly Agree | Strongly Agree | Strongly Agree |
2025-05-28 17:40:54.696000 | Strongly Agree | Strongly Agree | Strongly Agree |
2025-05-28 17:41:05.493000 | Strongly Agree | Strongly Agree | |
2025-05-28 17:41:09.955000 | Strongly Agree | Strongly Agree | Strongly Agree |
2025-05-28 17:41:24.107000 | Strongly Agree | Strongly Agree | Strongly Agree |
This dataset and research article explore how entrepreneurship education influences the use of Artificial Intelligence (AI) among digital-native university students, particularly in its effect on performance expectancy, effort expectancy, and entrepreneurial intention. Using the Unified Theory of Acceptance and Use of Technology (UTAUT) as the theoretical foundation, the study employs Structural Equation Modeling – PLS-SEM to assess seven mediation pathways.
The dataset (.xlsx) includes anonymized responses from survey participants measuring constructs such as entrepreneurial education, AI usage, performance expectancy, effort expectancy, and entrepreneurial intention. The accompanying article (.docx) provides a comprehensive review of the literature, hypothesis development, methodology, and interpretation of results, emphasizing the significant role of performance expectancy in mediating the relationship between AI usage and entrepreneurial intention.
The findings contribute to expanding the UTAUT framework within entrepreneurship education and provide practical implications for curriculum design in the digital era.
The Influence of Entrepreneurship Education, Artificial Intelligence..xlsx: Raw data and computed constructs used in SmartPLS analysis.
The Influence of Entrepreneurial Education on AI en-GB.docx: Full research article including theoretical background, model development, hypothesis testing results, and discussion.
Entrepreneurship Education, Artificial Intelligence, Entrepreneurial Intention, Performance Expectancy, Effort Expectancy, UTAUT, Digital Natives, PLS-SEM, Higher Education, EdTech
This research was supported by Bina Nusantara University (BINUS International Research - Basic) under contract number 081/VRRTT/IV/2025, dated 25 April 2025.
Understanding Society is a large representative household panel study for the UK. The Study follows the same 40,000 households over time, beginning in 2009 and providing a detailed picture of how people’s lives are changing. One of the many innovative features of Understanding Society is that a great deal of information about neighbourhoods can be used alongside the individual and household-level information collected in the Study, making it a useful study for neighbourhood effects analyses. In a recent paper (Knies, 2017) we explored four Understanding Society data products, based on four different types of rural-urban/neighbourhood classifications, to throw light on how much heterogeneity in neighbourhood contexts is captured in the first waves of Understanding Society, including change in neighbourhood contexts. This Online Supplement presents additional tables to Knies (2017). We used data from the first five waves of Understanding Society (University of Essex. Institute for Social and Economic Research, 2015a) and linked them with information from four related data products that provide qualitative information about the types of neighbourhood people live in. The four neighbourhood classifications used are:
• 2001 Census Rural-urban classification
• 2001 Census Output Area Classification (OAC)
• ACORN 2013 classification
• MOSAIC UK 2009 classification
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The Australian Census Longitudinal Dataset (ACLD) brings together a 5% sample from the 2006 Census with records from the 2011 Census to create a research tool for exploring how Australian society is changing over time. In taking a longitudinal view of Australians, the ACLD may uncover new insights into the dynamics and transitions that drive social and economic change over time, conveying how these vary for diverse population groups and geographies. It is envisaged that the 2016 and successive Censuses will be added in the future, as well as administrative data sets. The ACLD is released in ABS TableBuilder and as a microdata product in the ABS Data Laboratory. The Census of Population and Housing is conducted every five years and aims to measure accurately the number of people and dwellings in Australia on Census Night. Microdata products are the most detailed information available from a Census or survey and are generally the responses to individual questions on the questionnaire. They also include derived data from answers to two or more questions and are released with the approval of the Australian Statistician. The following microdata products are available for this longitudinal dataset:
• ACLD in TableBuilder - an online tool for creating tables and graphs.
• ACLD in ABS Data Laboratory (ABSDL) - for in-depth analysis using a range of statistical software packages.
In 2024, global retail e-commerce sales reached an estimated ************ U.S. dollars. Projections indicate a ** percent growth in this figure over the coming years, with sales expected to come close to ************** dollars by 2028.
World players
Among the key players on the world stage, the American marketplace giant Amazon holds the title of the largest e-commerce player globally, with a gross merchandise value of nearly *********** U.S. dollars in 2024. Amazon was also the most valuable retail brand globally, followed by mostly American competitors such as Walmart and the Home Depot.
Leading e-tailing regions
E-commerce is a dominant channel globally, but nowhere has it been as successful as in Asia. In 2024, the e-commerce revenue in that continent alone was measured at nearly ************ U.S. dollars, outperforming the Americas and Europe. That year, the up-and-coming e-commerce markets were also centered in Asia. The Philippines and India stood out as the fastest-growing e-commerce markets based on online sales, with anticipated growth rates surpassing ** percent.
https://creativecommons.org/publicdomain/zero/1.0/
Sentiment analysis remains one of the key problems that has seen extensive application of natural language processing. This time around, given tweets from customers about various tech firms that manufacture and sell mobiles, computers, laptops, etc., the task is to identify whether the tweets express a negative sentiment towards such companies or products.
Sentiment analysis is the contextual mining of text to identify and extract subjective information in source material. It helps a business understand the social sentiment of its brand, product, or service while monitoring online conversations. Brands can use this data to measure the success of their products in an objective manner. In this challenge, you are provided with tweet data to predict netizens' sentiment towards electronic products.
Data Science Resources
Get started with NLP and text classification with our latest offering, the ‘Natural Language Processing (NLP) using Python’ course. Refer to this comprehensive guide, which covers text classification techniques using different libraries and their implementation in Python. You can also refer to this guide, which covers multiple techniques, including TF-IDF and Word2Vec, for tackling sentiment analysis problems.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The virtual display of products in e-commerce brings new problems of information asymmetry, and the overload of digital information also increases the difficulty of consumers' purchasing decisions. The real-time interaction between the streamer and the consumer during live streaming e-commerce promotes consumers' understanding of the product, reduces information asymmetry, and increases consumers' purchase intention. However, why do people trust untouchable and unfamiliar streamers in live streaming e-commerce enough to purchase online? To understand this phenomenon, and drawing on information asymmetry theory and parasocial relationship theory, this research identified how social capital affected purchase intention in live streaming e-commerce. The constructed purchase intention model was tested empirically through a questionnaire survey of live-stream viewers. The findings showed that the streamer's professionalism, the reciprocal expectation of live streaming, and the viewer's parasocial relationship could effectively increase the viewer's purchase intention. The occurrence of a streamer's negative public events could significantly reduce the viewer's purchase intention. The scale of live streaming and the streamer's commitment had no significant impact on the viewer's purchase intention. Trust played a mediating role between the streamer's professionalism and parasocial relationship and the viewer's purchase intention.
Which country has the most Facebook users?
There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
Facebook – the most used social media
Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
Facebook usage by device
As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is available not only through mobile browsers; the company has also published several mobile apps through which users can access its products and services. As of the third quarter of 2021, the four core Meta products led the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.