40 datasets found
  1. Amazon employees 2007-2024

    • statista.com
    • gruabehub.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Amazon employees 2007-2024 [Dataset]. https://www.statista.com/statistics/234488/number-of-amazon-employees/
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, United States
    Description

    The combined number of full- and part-time employees of Amazon.com has increased significantly since 2017. Amazon’s headcount peaked in 2021 when the American multinational e-commerce company employed ********* full- and part-time employees, not counting external contractors. However, in 2024, the number dropped to *********.

    E-commerce crunch

    The workforce reduction of Amazon follows the mass layoffs hitting the entire e-commerce sector. With the full reopening of physical stores after the COVID-19 pandemic, online shopping demand decreased, leading online retailers to restructure their businesses, including personnel costs.

    Diversifying business

    With online retail sales growing slower due to recession and inflation, Amazon can still leverage other profitable revenue segments, from media subscriptions to server hosting and cloud services. On top of that, in 2023 Amazon monitored small enterprises operating in different fields and strategically invested in them, as disclosed startup acquisitions indicate.

  2. Amazon Employee Access Challenge

    • kaggle.com
    Updated Aug 11, 2021
    Cite
    Luca Massaron (2021). Amazon Employee Access Challenge [Dataset]. https://www.kaggle.com/lucamassaron/amazon-employee-access-challenge/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 11, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Luca Massaron
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    When an employee at any company starts work, they first need to obtain the computer access necessary to fulfill their role. This access may allow an employee to read/manipulate resources through various applications or web portals. It is assumed that employees fulfilling the functions of a given role will access the same or similar resources. It is often the case that employees figure out the access they need as they encounter roadblocks during their daily work (e.g. not able to log into a reporting portal). A knowledgeable supervisor then takes time to manually grant the needed access in order to overcome access obstacles. As employees move throughout a company, this access discovery/recovery cycle wastes a nontrivial amount of time and money.

    There is a considerable amount of data regarding an employee’s role within an organization and the resources to which they have access. Given the data related to current employees and their provisioned access, models can be built that automatically determine access privileges as employees enter and leave roles within a company. These auto-access models seek to minimize the human involvement required to grant or revoke employee access.

    Part of the competition "Amazon.com - Employee Access Challenge" (https://www.kaggle.com/c/amazon-employee-access-challenge), the data consists of real historical data collected from 2010 & 2011. Employees are manually allowed or denied access to resources over time. Your task is to create an algorithm capable of learning from this historical data to predict approval/denial for an unseen set of employees.

    Content

    The data comes from Amazon Inc. and was collected from 2010-2011 (published on the Kaggle platform). The training set consists of 32769 samples and the test set of 58922 samples. The training set has one label attribute named “ACTION”, whose value “1” indicates an application is approved whereas “0” indicates rejection. As predictors of this state, there are eight features, indicating characteristics of the required resource and the role and work group of the employee at Amazon requesting access.

    train.csv - The training set. Each row has the ACTION (ground truth), RESOURCE, and information about the employee's role at the time of approval

    test.csv - The test set for which predictions should be made. Each row asks whether an employee having the listed characteristics should have access to the listed resource.

    Column Name | Description
    ACTION | ACTION is 1 if the resource was approved, 0 if the resource was not
    RESOURCE | An ID for each resource
    MGR_ID | The EMPLOYEE ID of the manager of the current EMPLOYEE ID record; an employee may have only one manager at a time
    ROLE_ROLLUP_1 | Company role grouping category id 1 (e.g. US Engineering)
    ROLE_ROLLUP_2 | Company role grouping category id 2 (e.g. US Retail)
    ROLE_DEPTNAME | Company role department description (e.g. Retail)
    ROLE_TITLE | Company role business title description (e.g. Senior Engineering Retail Manager)
    ROLE_FAMILY_DESC | Company role family extended description (e.g. Retail Manager, Software Engineering)
    ROLE_FAMILY | Company role family description (e.g. Retail Manager)
    ROLE_CODE | Company role code; this code is unique to each role (e.g. Manager)
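
    As a rough illustration of the column layout listed above, the sketch below parses rows with Python's standard csv module and computes the share of approved requests. The three sample rows are invented for illustration, and the column order in the real train.csv is an assumption; only the column names come from the listing.

```python
import csv
import io

# Hypothetical sample in the train.csv layout; all values are invented.
SAMPLE = """ACTION,RESOURCE,MGR_ID,ROLE_ROLLUP_1,ROLE_ROLLUP_2,ROLE_DEPTNAME,ROLE_TITLE,ROLE_FAMILY_DESC,ROLE_FAMILY,ROLE_CODE
1,1001,501,11,21,31,41,51,61,71
0,1002,502,11,22,32,42,52,62,72
1,1003,501,12,21,31,41,51,61,71
"""


def approval_rate(csv_file):
    """Return the fraction of rows whose ACTION column is 1."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("no data rows found")
    approved = sum(1 for row in rows if row["ACTION"] == "1")
    return approved / len(rows)


rate = approval_rate(io.StringIO(SAMPLE))
print(f"approval rate: {rate:.2f}")
```

    To run this against the real file, an open file handle such as open("train.csv", newline="") can be passed in place of the in-memory sample.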

    Models are judged on area under the ROC curve (https://en.wikipedia.org/wiki/Receiver_operating_characteristic)
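
    For readers unfamiliar with the metric, here is a minimal sketch of how the area under the ROC curve can be computed from ACTION labels and model scores. It uses the pairwise formulation (the probability that a randomly chosen approved request is scored above a randomly chosen denied one, with ties counted as half). The labels and scores are hypothetical, and this is not the competition's official scoring code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    example has the higher score; ties contribute one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical ACTION labels and predicted approval scores.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC: {auc:.3f}")
```

    The double loop is quadratic in the number of rows, so for data of the size described above a sort-based or library implementation would be a more practical choice.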

    Acknowledgements

    The data has been donated by Amazon and the original competition has been hosted in collaboration with the IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013)

  3. Dataset of business metrics of companies called Amazon

    • workwithdata.com
    Updated May 6, 2025
    + more versions
    Cite
    Work With Data (2025). Dataset of business metrics of companies called Amazon [Dataset]. https://www.workwithdata.com/datasets/companies?col=ceo%2Cceo_approval%2Cceo_gender%2Ccity%2Cemployees&f=1&fcol0=company&fop0=%3D&fval0=Amazon
    Explore at:
    Dataset updated
    May 6, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about companies. It has 19 rows and is filtered where the company is Amazon. It features 5 columns: employees, CEO, CEO gender, CEO approval, and city.

  4. Amazon Question and Answer Data

    • cseweb.ucsd.edu
    json
    Cite
    UCSD CSE Research Project, Amazon Question and Answer Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    Available download formats: json
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain 1.48 million question and answer pairs about products from Amazon.

    Metadata includes

    • question and answer text

    • is the question binary (yes/no), and if so does it have a yes/no answer?

    • timestamps

    • product ID (to reference the review dataset)

    Basic Statistics:

    • Questions: 1.48 million

    • Answers: 4,019,744

    • Labeled yes/no questions: 309,419

    • Number of unique products with questions: 191,185

  5. Amazon review data 2018

    • nijianmo.github.io
    • cseweb.ucsd.edu
    • +1 more
    Cite
    UCSD CSE Research Project, Amazon review data 2018 [Dataset]. https://nijianmo.github.io/amazon/
    Explore at:
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    Context

    This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

    • More reviews:

      • The total number of reviews is 233.1 million (142.8 million in 2014).
    • New reviews:

      • Current data includes reviews in the range May 1996 - Oct 2018.
    • Metadata:

      • We have added transaction metadata for each review shown on the review page.
      • Added more detailed metadata of the product landing page.

    Acknowledgements

    If you publish articles based on this dataset, please cite the following paper:

    • Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.
  6. Amazon reviews Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Mar 21, 2023
    Cite
    Bright Data (2023). Amazon reviews Dataset [Dataset]. https://brightdata.com/products/datasets/amazon/reviews
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Mar 21, 2023
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements.

    Popular use cases include:

    • Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.
    • Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.
    • Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.

    Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.

  7. Amazon Web Services - Public Data Sets

    • data.wu.ac.at
    Updated Oct 10, 2013
    + more versions
    Cite
    Global (2013). Amazon Web Services - Public Data Sets [Dataset]. https://data.wu.ac.at/schema/datahub_io/NTYxNjkxNmYtNmZlNS00N2EwLWJkYTktZjFjZWJkNTM2MTNm
    Explore at:
    Dataset updated
    Oct 10, 2013
    Dataset provided by
    Global
    Description

    About

    From website:

    Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.

    Previously, large data sets such as the mapping of the Human Genome and the US Census data required hours or days to locate, download, customize, and analyze. Now, anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. For example, users can produce or use prebuilt server images with tools and applications to analyze the data sets. By hosting this important and useful data with cost-efficient services such as Amazon EC2, AWS hopes to provide researchers across a variety of disciplines and industries with tools to enable more innovation, more quickly.

  8. Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) +...

    • datarade.ai
    .csv
    Updated Jan 18, 2023
    Cite
    Space Know (2023). Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) + Research Report Available [Dataset]. https://datarade.ai/data-products/satellite-us-supply-chain-dataset-package-amazon-fedex-wal-space-know
    Explore at:
    Available download formats: .csv
    Dataset updated
    Jan 18, 2023
    Dataset authored and provided by
    Space Know
    Area covered
    United States
    Description

    SpaceKnow USA Supply Chain Premium Dataset gives you data (by locations and company) of US Supply Chain choke points in near-real-time as seen from satellite images. The uniqueness of this dataset lies in its granularity.

    About dataset: We apply proprietary algorithms to SAR satellite imagery of key industrial, transportation, storage, and logistics locations to create daily indices of industry activity. Data was collected from more than 5,000 locations across the USA. Thanks to the use of SAR satellite technology, the quality of the SpaceKnow dataset is not influenced by weather fluctuations.

    In total SpaceKnow USA Supply Chain dataset offers +50 specific indices with real-time insights. The premium dataset includes company-focused indices. This type of data can be used by investors to get insight on important KPIs such as revenue.

    This dataset is:

    • Daily frequency
    • History from Jan 2017 - present

    Within one package we provide you with real-time insights into:

    • Port Container country-level indices (a container port or container terminal is a facility where cargo containers are transshipped between different transport vehicles, for onward transportation)
    • Port Container indices for the major ports in the US: Port of Los Angeles, Port of Long Beach, Port of New York & New Jersey, Port of Savannah, Port of Houston, Port of Virginia, Port of Oakland in California, Port of South Carolina, Port of Miami
    • Trucking Stop indices for the most important locations in the supply chain, like Iowa, Nevada, South Carolina, Oregon, North Carolina
    • Inland Containers index on a country level
    • Logistics Center index on a country level (logistics centers are distribution hubs for finished goods that need to be transported to another location; we include logistics centers from companies like Amazon, Walmart, Fedex and others)
    • Logistics Center indices for states like California, New York, Illinois, Indiana, South Carolina, and many more
    • Logistics Center indices for companies: Amazon, Walmart, Fedex

    Research Reports

    Don't have the capacity to analyze the data? Let SpaceKnow's in-house economists do the heavy lifting so that you can focus on what's important. SpaceKnow writes research reports based on what the data from the US Supply Chain dataset package is showing. The document includes a detailed explanation of what is happening with supporting charts and tables. The reports are published on a monthly basis.

    Delivery Mechanisms

    All of the delivery mechanisms detailed below are available as part of this package. Data is distributed only in the flat-table CSV format. Methods to access the data:

    • Dashboard - option that also offers data visualization within the webpage
    • Automatic email delivery
    • API access to our dataset
    • Research reports - provided via email in PDF format

    Client Support

    Each client is assigned an account representative who will reach out periodically to make sure that the data packages are meeting your needs. Here are some other ways to contact SpaceKnow in case you have a specific question.

    For delivery questions and issues: Please reach out to support@spaceknow.com

    For data questions: Please reach out to product@spaceknow.com

    For pricing/sales support: Please reach out to info@spaceknow.com or sales@spaceknow.com

  9. Data from: LBA-ECO LC-03 SAR Images, Land Cover, and Biomass, Four Areas...

    • catalog.data.gov
    • cmr.earthdata.nasa.gov
    • +4 more
    Updated Aug 30, 2025
    + more versions
    Cite
    ORNL_DAAC (2025). LBA-ECO LC-03 SAR Images, Land Cover, and Biomass, Four Areas across Brazilian Amazon [Dataset]. https://catalog.data.gov/dataset/lba-eco-lc-03-sar-images-land-cover-and-biomass-four-areas-across-brazilian-amazon-72c32
    Explore at:
    Dataset updated
    Aug 30, 2025
    Dataset provided by
    Oak Ridge National Laboratory Distributed Active Archive Center
    Area covered
    Amazon Rainforest, Brazil
    Description

    This data set provides three related land cover products for four study areas across the Brazilian Amazon: Manaus, Amazonas; Tapajos National Forest, Para Western (Santarem); Rio Branco, Acre; and Rondonia, Rondonia. Products include (1) orthorectified JERS-1 and RadarSat images, (2) land cover classifications derived from the SAR data, and (3) biomass estimates in tons per hectare based on the land cover classification. There are 12 image files (.tif) with this data set.

    Orthorectified JERS-1 and RadarSat images are provided as GeoTIFF images - one file for each study area. For the Manaus and Tapajos sites: The images are orthorectified at 12.5-meter resolution and then re-sampled at 25-meter resolution. For the Rondonia and Rio Branco sites: The images from 1978 are orthorectified at 25-meter resolution and then re-sampled at 90-meter resolution. Each GeoTIFF file contains 3 image channels: 2 L-band JERS-1 data in Fall and Spring seasons and 1 C-band RadarSat data.

    Land cover classifications are based on two JERS-1 images and one RadarSat image and provided as GeoTIFFs - one file for each study area. Four major land cover classes are distinguished: (1) Flat surface; (2) Regrowth area; (3) Short vegetation; and (4) Tall vegetation. The biomass estimates in tons per hectare are based on the land cover classification results and are reported in one GeoTIFF file for each study area.

    DATA QUALITY STATEMENT: The Data Center has determined that there are questions about the quality of the data reported in this data set. The data set has missing or incomplete data, metadata, or other documentation that diminishes the usability of the products.

    KNOWN PROBLEMS: The data providers note that due to limited resources, these data have been neither validated nor quality-assured for general use. For that reason, extreme caution is advised when considering the use of these data. Any use of the derived data is not recommended because the results have not been validated. However, the DEM and vectors (related data set), and orthorectified SAR data can be used if the user understands how these were produced and accepts the limitations.

  10. Amazon Products Database contains data on keywords and product listings...

    • datarade.ai
    .json
    Updated Sep 27, 2023
    Cite
    DataForSEO (2023). Amazon Products Database contains data on keywords and product listings ranking for them [Dataset]. https://datarade.ai/data-products/amazon-products-database-contains-data-on-keywords-and-produc-dataforseo
    Explore at:
    Available download formats: .json
    Dataset updated
    Sep 27, 2023
    Dataset authored and provided by
    DataForSEO
    Area covered
    United Arab Emirates, Egypt, United States of America, Saudi Arabia
    Description

    First of all, Amazon product datasets are indispensable for reverse engineering your rivals. For example, you can collect a list of keywords you already rank for or want to, and go through DataForSEO Amazon Products Database to find other sellers appearing as the top results for these terms.

    Next, you can narrow down the scope of your contenders to those performing the best. To do so, you can filter out sellers who won the “Amazon’s Choice” and those whose products got listed multiple times on the first page.

    Once you’ve compiled the final list of your challengers, Amazon Products Database will help you to quickly examine product titles, descriptions, prices, images, and other details that will let you grasp the main contributors to your competitors’ success. Once you’ve figured that out, you can start optimizing your product listings and pricing strategies to increase conversions.

    However, the number of use cases for Amazon product data isn’t limited to competitor analysis. It can be applied to monitoring product rankings, running price comparisons, and more.

  11. Data from: LBA-ECO LC-03 Hypsography, Rivers, Roads, and DEM, Four Areas...

    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • search.dataone.org
    • +6 more
    Updated Jul 10, 2025
    + more versions
    Cite
    ORNL_DAAC (2025). LBA-ECO LC-03 Hypsography, Rivers, Roads, and DEM, Four Areas across Brazilian Amazon [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/lba-eco-lc-03-hypsography-rivers-roads-and-dem-four-areas-across-brazilian-amazon-76907
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    ORNL_DAAC
    Area covered
    Amazon Rainforest, Brazil
    Description

    This data set provides four related spatial data products for four study areas across the Brazilian Amazon: Manaus, Amazonas; Tapajos National Forest, Para Western (Santarem); Rio Branco, Acre; and Rondonia, Rondonia. Products include vector data showing (1) roads, (2) rivers, and (3) hypsography and (4) digital elevation model (DEM) images that were encoded from the hypsography vectors. There are 15 data files with this data set which includes 12 compressed *.zip files containing ArcInfo shape files and 3 GeoTIFFS.

    This data set contains vector data showing roads, rivers, and hypsography for each study area in ESRI ArcGIS shapefile format. The vectors were hand-digitized by the Images Company in Brazil from paper maps produced by the Brazilian government. Depending on the scale of the original maps, the digitization errors vary. For some maps, some vectors are missing. Data were manually checked for duplicate or extra vectors. These data sets were derived from several map sheets produced from aerial coverages dating from 1974 to 1978.

    The DEM images were encoded from the hypsography vectors and are provided in GeoTIFF format. The attribute value associated with each line and point in the vector segment is encoded into the image channel; the image channel is then filled in by interpolating image data between encoded vector data. For each DEM: 1 image channel with pixel resolution = 25m x 25m. DEM images are provided for Manaus, Tapajos National Forest, and Rondonia. The files for Rio Branco were unusable due to a documentation error.

    DATA QUALITY STATEMENT: The Data Center has determined that there are questions about the quality of the data reported in this data set. The data set has missing or incomplete data, metadata, or other documentation that diminishes the usability of the products.

    KNOWN PROBLEMS: The data providers note that due to limited resources, these data have been neither validated nor quality-assured for general use. For that reason, extreme caution is advised when considering the use of these data. Any use of the derived data is not recommended because the results have not been validated. However, the DEM, vectors, and orthorectified SAR data (related data set) can be used if the user understands how these were produced and accepts the limitations.
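
    The description above says the DEM channel is filled in by interpolating between encoded vector values but does not name the interpolation method. As a purely illustrative sketch, the code below assumes linear interpolation along a single raster row; the row contents and the function name fill_row are hypothetical and are not taken from the data set documentation.

```python
def fill_row(row):
    """Linearly interpolate None gaps between known values in one raster row.

    Cells before the first or after the last known value stay None,
    because there is nothing on one side to interpolate from.
    """
    filled = list(row)
    known = [i for i, value in enumerate(row) if value is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            t = (i - left) / span  # fractional position between the two known cells
            filled[i] = row[left] + t * (row[right] - row[left])
    return filled


# Hypothetical row: contour elevations (in meters) encoded at three pixels.
row = [None, 100.0, None, None, 130.0, None, 120.0, None]
print(fill_row(row))
```

    A real DEM would be interpolated in two dimensions, so this one-row version only conveys the idea of filling gaps between encoded contour values.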

  12. Data from: Spatial and temporal contrasts in the distribution of crops and...

    • dataverse.harvard.edu
    • datasetcatalog.nlm.nih.gov
    Updated Jan 24, 2024
    + more versions
    Cite
    Pablo Andres Imbach Bartol; M Manrow; Elizabeth Barona Adarve; Alberto G O P Barretto; Glenn Graham Hyman (2024). Spatial and temporal contrasts in the distribution of crops and pastures across Amazonia: A new agricultural land use data set from census data since 1950: Crops and pastures across Amazonia [Dataset]. http://doi.org/10.7910/DVN/7J9WVY
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Pablo Andres Imbach Bartol; M Manrow; Elizabeth Barona Adarve; Alberto G O P Barretto; Glenn Graham Hyman
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.2/customlicense?persistentId=doi:10.7910/DVN/7J9WVY

    Area covered
    South America, Peru, Brazil, Guyana, Venezuela (Bolivarian Republic of), Bolivia (Plurinational State of), Ecuador, Colombia
    Description

    Amazonia holds the largest continuous area of tropical forests with intense land use change dynamics inducing water, carbon, and energy feedbacks with regional and global impacts. Much of our knowledge of land use change in Amazonia comes from studies of the Brazilian Amazon, which accounts for two thirds of the region. Amazonia outside of Brazil has received less attention because of the difficulty of acquiring consistent data across countries. We present here an agricultural statistics database of the entire Amazonia region, with a harmonized description of crops and pastures in geospatial format, based on administrative boundary data at the municipality level. The spatial coverage includes countries within Amazonia and spans censuses and surveys from 1950 to 2012. Harmonized crop and pasture types are explored by grouping annual and perennial cropping systems, C3 and C4 photosynthetic pathways, planted and natural pastures, and main crops. Our analysis examined the spatial pattern of ratios between classes of the groups and their correlation with the agricultural extent of crops and pastures within administrative units of the Amazon, by country, and census/survey dates. Significant correlations were found between all ratios and the fraction of agricultural lands of each administrative unit, with the exception of planted to natural pastures ratio and pasture lands extent. Brazil and Peru in most cases have significant correlations for all ratios analyzed even for specific census and survey dates. Results suggested improvements, and potential applications of the database for carbon, water, climate, and land use change studies are discussed. The database presented here provides an Amazon-wide improved data set on agricultural dynamics with expanded temporal and spatial coverage

  13. Amazon Reviews Dataset

    • kaggle.com
    Updated Jan 2, 2023
    Cite
    Daniel Ihenacho (2023). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/danielihenacho/amazon-reviews-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Daniel Ihenacho
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset was created from scraped reviews of products on Amazon for the purpose of text classification. There are three classes:

    • Negative Reviews
    • Neutral Reviews
    • Positive Reviews

    Data columns include:

    • Sentiments
    • Cleaned Review
    • Cleaned Review Length
    • Review Score

    This dataset presents a multiclass classification problem for ML algorithms and also deep learning algorithms. Moreover, there is a class imbalance; negative reviews are the least numerous class compared to positive and neutral reviews.

    For ML algorithms, use a mapping of: negative --> -1, neutral --> 0, positive --> 1

    For deep learning algorithms, use a mapping of: negative --> 0, neutral --> 1, positive --> 2
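
    The two label mappings described above can be written as plain dictionaries. In this minimal sketch the sample sentiment values and the helper name encode are hypothetical; the zero-based variant is commonly used because many deep learning loss functions expect class indices starting at 0, which is presumably why the description suggests it.

```python
from collections import Counter

# Mappings as given in the dataset description.
ML_MAPPING = {"negative": -1, "neutral": 0, "positive": 1}
DL_MAPPING = {"negative": 0, "neutral": 1, "positive": 2}


def encode(sentiments, mapping):
    """Translate sentiment strings into integer class labels."""
    return [mapping[s.strip().lower()] for s in sentiments]


# Hypothetical values from the Sentiments column.
sample = ["positive", "neutral", "negative", "positive"]
ml_labels = encode(sample, ML_MAPPING)
dl_labels = encode(sample, DL_MAPPING)

# A quick look at class balance, relevant given the imbalance noted above.
class_counts = Counter(sample)
print(ml_labels, dl_labels, class_counts)
```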

    Looking forward to your model discoveries on this dataset.

    Please leave an upvote if you find this relevant 😀.

  14. 10m Annual Land Use Land Cover (9-class)

    • registry.opendata.aws
    • collections.sentinel-hub.com
    Updated Jul 6, 2023
    Cite
    Impact Observatory (2023). 10m Annual Land Use Land Cover (9-class) [Dataset]. https://registry.opendata.aws/io-lulc/
    Explore at:
    Dataset updated
    Jul 6, 2023
    Dataset provided by
    Impact Observatory (https://www.impactobservatory.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset, produced by Impact Observatory, Microsoft, and Esri, displays a global map of land use and land cover (LULC) derived from ESA Sentinel-2 imagery at 10 meter resolution for the years 2017 - 2023. Each map is a composite of LULC predictions for 9 classes throughout the year in order to generate a representative snapshot of each year. This dataset was generated by Impact Observatory, which used billions of human-labeled pixels (curated by the National Geographic Society) to train a deep learning model for land classification. Each global map was produced by applying this model to the Sentinel-2 annual scene collections from the Microsoft Planetary Computer. Each of the maps has an assessed average accuracy of over 75%. These maps have been improved from Impact Observatory’s previous release and provide a relative reduction in the amount of anomalous change between classes, particularly between “Bare” and any of the vegetative classes “Trees,” “Crops,” “Flooded Vegetation,” and “Rangeland”. This updated time series of annual global maps is also re-aligned to match the ESA UTM tiling grid for Sentinel-2 imagery. Data can be accessed directly from the Registry of Open Data on AWS, from the STAC 1.0.0 endpoint, or from the IO Store for a specific Area of Interest (AOI).

  15. Amazon Reviews Dataset

    • kaggle.com
    Updated Sep 20, 2024
    Cite
    Dongre Laxman (2024). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/dongrelaxman/amazon-reviews-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dongre Laxman
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset comprises customer reviews for Amazon, an online retail giant, featuring insights into customer experiences, including ratings, review titles, texts, and metadata. It is valuable for analyzing customer satisfaction, sentiment, and trends.

    Column Descriptions:

    • Reviewer Name: Identifies the reviewer.
    • Profile Link: Links to the reviewer's profile for additional insights.
    • Country: Indicates the reviewer's location.
    • Review Count: Number of reviews by the same user, showing engagement level.
    • Review Date: When the review was posted, useful for time analysis.
    • Rating: Numerical satisfaction measure.
    • Review Title: Summarizes the review sentiment.
    • Review Text: Detailed customer feedback.
    • Date of Experience: When the service/product was experienced.

    Prospective applications:

    • Sentiment Analysis: Analyze review texts and titles to assess overall customer sentiment toward products, enabling the identification of strengths and weaknesses.
    • Customer Satisfaction Tracking: Track and visualize rating trends over time to understand fluctuations in customer satisfaction.
    • Product Improvement: Identify common themes in reviews to highlight areas for product enhancement or development.
    • Market Segmentation: Use country and demographic information to customize marketing strategies and gain insights into regional preferences.
    • Competitor Analysis: Evaluate customer feedback on Amazon products in comparison to competitors to determine market positioning.
    • Recommendation Systems: Leverage review data to enhance recommendation algorithms, improving personalized shopping experiences.
    • Trend Analysis: Investigate temporal patterns in reviews to link sentiment changes with marketing efforts or product launches.

    This extensive dataset serves as a valuable asset for various analyses focused on enhancing customer engagement and refining business strategies.

  16. LBA-ECO LC-15 Vegetation Cover Types from MODIS, 1-km, Amazon Basin:...

    • earthdata.nasa.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +6 more
    Updated Aug 22, 2011
    + more versions
    Cite
    ORNL_CLOUD (2011). LBA-ECO LC-15 Vegetation Cover Types from MODIS, 1-km, Amazon Basin: 2000-2001 [Dataset]. http://doi.org/10.3334/ORNLDAAC/1035
    Explore at:
    Dataset updated
    Aug 22, 2011
    Dataset authored and provided by
    ORNL_CLOUD
    Description

    This data set contains proportional estimates for the vegetative cover types of woody vegetation, herbaceous vegetation, and bare ground over the Amazon Basin for the period 2000-2001. These products were derived from all seven bands of the Moderate-resolution Imaging Spectroradiometer (MODIS) sensor onboard NASA's Terra satellite. A set of MODIS 32-day composites was used to create the vegetation cover types using the Vegetation Continuous Fields (VCF) approach (Hansen et al., 2002), which shows how much of a land cover such as "forest" or "grassland" exists anywhere on the land surface. The VCF product may depict areas of heterogeneous land cover better than traditional discrete classification schemes, which show where land cover types are concentrated.

    The original MODIS products have 500-m spatial resolution and are derived from 2000-2001 data products. For the regional study under this project, the data were resampled to 1-km resolution and are provided as 3 separate cover type files, in ENVI and GeoTIFF file formats, packaged in six zipped files. These products are registered to the rest of the regional data sets over the Amazon basin.

    These data are also available for download from the Global Land Cover Facility Website (http://modis.umiacs.umd.edu/).

  17. Archaeological Sites in the Amazon Biome

    • kaggle.com
    Updated Jun 22, 2025
    Cite
    JP (2025). Archaeological Sites in the Amazon Biome [Dataset]. https://www.kaggle.com/datasets/josepart/archeological-sites-in-the-amazon-biome
    Explore at:
    Dataset updated
    Jun 22, 2025
    Dataset provided by
    Kaggle
    Authors
    JP
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Amazon Rainforest
    Description

    Description

    This dataset compiles geolocation information of archaeological sites in the Amazon Biome and surrounding area. It includes a total of 5442 sites. However, please note that the dataset contains duplicates. This was a deliberate choice to allow the user to select the subset of data points that they want to use. For example, the user could choose to deduplicate the dataset using their own chosen strategy, or use a subset from a particular source. The locations of the sites are illustrated in the figure below.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F358347%2Ff8916b3723b9297a397ba1732e3afa74%2Famazon_biome_sites.png?generation=1750548569960405&alt=media

    Fig. 1: Archaeological sites spread across the Amazon Biome (black outline). Each color corresponds to a different source (see References) as follows: de Souza (red), Coomes (green), Kalliola (blue), Walker (orange) and Jacobs (purple).

    Warning: amazon_biome_sites.csv contains duplicates. This is because some of the sources are compilations that contain other sources included in this dataset. For example, Jacobs' compilation contains some of the sites present in Kalliola's list. Likewise, there is a lot of overlap between Walker's list and all other lists (as can be seen in Fig. 1). In many cases, site naming is not consistent, and coordinates may also vary. For example, Walker didn't provide site names and seems to have rounded location coordinates, and Jacobs updated coordinates based on his verification of the sites on platforms like Google Earth. Additionally, some authors consider different neighboring structures as different sites, whereas others consider them as part of the same site. Therefore, deduplication is not trivial. However, depending on the intended use, there are several strategies that could be employed. For example, the user could choose to use the sites from a particular source, or remove duplicates based on approximate coordinates. You can have a look at this notebook for an example of how to deduplicate data.
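    The two strategies mentioned above (keeping a single source, or removing duplicates by approximate coordinates) could look roughly like the sketch below. It is not the dataset author's notebook. The field names "name", "source", "latitude" and "longitude" and the sample rows are assumptions made for illustration, and rounding to a grid is only one possible notion of "approximate": two near-identical points that fall on opposite sides of a grid boundary will both be kept.

```python
def dedupe_by_rounded_coords(sites, decimals=2):
    """Keep the first site seen in each rounded (latitude, longitude) cell."""
    seen = set()
    kept = []
    for site in sites:
        key = (round(site["latitude"], decimals), round(site["longitude"], decimals))
        if key not in seen:
            seen.add(key)
            kept.append(site)
    return kept


def only_source(sites, source):
    """Alternative strategy: keep the sites from a single source."""
    return [site for site in sites if site["source"] == source]


# Hypothetical rows; the real CSV columns may be named differently.
sites = [
    {"name": "A", "source": "kalliola", "latitude": -9.8761, "longitude": -67.5342},
    {"name": "A (rounded)", "source": "walker", "latitude": -9.88, "longitude": -67.53},
    {"name": "B", "source": "jacobs", "latitude": -10.2504, "longitude": -67.9011},
]

unique_sites = dedupe_by_rounded_coords(sites, decimals=2)
jacobs_sites = only_source(sites, "jacobs")
print([site["name"] for site in unique_sites])
```

    With two decimals, each grid cell is on the order of a kilometre across near the equator, so the choice of decimals controls how aggressively neighboring structures are merged into one site.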

    Data Collection

    • The Excel sheet with a list of archaeological sites in the Department of Loreto in Peru (sources/original/coomes/Coomes et al_Table of archaeological sites in Department of Loreto.xlsx) was processed manually and converted into a CSV file (sources/processed/coomes_loreto_peruvian_amazon_sites.csv).
    • The PDF file containing archaeological sites in the Upper Tapajós Basin in Brazil (sources/original/desouza/41467_2018_3510_MOESM1_ESM.pdf) was processed by first extracting the pages containing the relevant table (sources/processed/desouza_upper_tapajos_basin_sites.pdf), and then asking Gemini 2.5 Flash to convert the PDF table into a CSV file. The final file was manually verified to correct for any mistakes, including swapping the latitude and longitude values, which were inverted in the source file.
    • The PDF file containing archaeological sites in the southwestern Amazon (sources/original/kalliola/List of Southwestern Amazonian Earthworks 25.08.2024b.pdf) was processed by asking Gemini 2.5 Flash to convert the PDF table into a CSV file. The final file was verified manually by checking some entries and swapping the latitude and longitude values, which were inverted in the source file. Additionally, typos existing in the original file and some introduced by Gemini were fixed, and the order of some entries was corrected (e.g., the structures corresponding to the same site were sometimes out of sequence). When in doubt, things were left as in the original file. Despite efforts to fix all mistakes and make the naming of the entries consistent, there are no guarantees that all errors have been fixed.
    • The file with the sites compiled by Walker et al. (sources/original/walker/submit.csv) was copied with the coordinate columns renamed. Additionally, the file sources/original/walker/variables.xlsx was converted into a CSV file, and a column was added to indicate which variables correspond to which columns in the sites file.
    • The file sources/original/jacobs/amazon_geoglyphs.xls, compiled by Jacobs, was turned into a CSV file by concatenating the sheets corresponding to geoglyphs, mound villages and earthworks in Mato Grosso. Additionally, some entries were fixed, where the coordinates had the wrong signs. In particular, ronq18 had a positive latitude, and mgro5 had a positive longitude.
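    Several of the fixes listed above are coordinate problems: latitude and longitude swapped, or a coordinate with the wrong sign. The sketch below shows one way such entries might be flagged automatically. It is an illustration, not the procedure the author used (the fixes above were made manually). The bounding box is a rough assumption, and because the biome extends north of the equator, a box check cannot catch a small positive latitude that should have been negative.

```python
# Rough bounding box in decimal degrees (an assumption for illustration only).
LAT_RANGE = (-20.0, 6.0)
LON_RANGE = (-80.0, -44.0)


def in_box(lat, lon):
    """True if the point lies inside the assumed bounding box."""
    return LAT_RANGE[0] <= lat <= LAT_RANGE[1] and LON_RANGE[0] <= lon <= LON_RANGE[1]


def suggest_fix(lat, lon):
    """Return (label, (lat, lon)); label is 'ok', a suggested repair, or 'review'."""
    if in_box(lat, lon):
        return "ok", (lat, lon)
    candidates = {
        "swapped": (lon, lat),
        "latitude sign": (-lat, lon),
        "longitude sign": (lat, -lon),
    }
    matches = [(label, coords) for label, coords in candidates.items() if in_box(*coords)]
    # Only suggest a repair when exactly one simple transformation fits.
    if len(matches) == 1:
        return matches[0]
    return "review", (lat, lon)


print(suggest_fix(-67.0, -10.0))  # latitude and longitude inverted
print(suggest_fix(10.0, -67.0))   # positive latitude
print(suggest_fix(-12.0, 55.0))   # positive longitude
```

    Any suggestion produced this way is only a candidate; it would still need checking against the original source or satellite imagery.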

    Data Processing

    Once all the source files were compiled and processed, in order to generate the amazon_biome_sites.csv, some further processing was performed. In particular, these are the steps that were followed:

    • For each source file, site names were modified (when necessary) by combining information from m...
  18. Data from: Habitat use of Amazonian birds varies by age and foraging guild...

    • search.dataone.org
    • datasetcatalog.nlm.nih.gov
    • +1 more
    Updated May 10, 2024
    Cite
    David Luther (2024). Habitat use of Amazonian birds varies by age and foraging guild along a disturbance gradient [Dataset]. http://doi.org/10.5061/dryad.8kprr4xvw
    Explore at:
    Dataset updated
    May 10, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    David Luther
    Description

    Patterns of habitat use directly influence a species’ fitness, yet for many species an individual’s age can influence patterns of habitat use. However, in tropical rainforests, which host the greatest terrestrial species diversity, little is known about how age classes of different species use different adjacent habitats of varying quality. We use long-term mist-net data from the Amazon rainforest to assess patterns of habitat use among adult, adolescent (teenage), and young understory birds in forest fragments, primary, and secondary forest at the Biological Dynamics of Forest Fragments Project in Brazil. Insectivore adults were most common in primary forest, adolescents were equally likely in primary and secondary forest, and all ages were the least common in forest fragments. In contrast to insectivores, frugivores and omnivores showed no differences among all three habitat types. Our results illustrate potential ideal despotic distributions among breeding populations of some guilds o...


  19. Amazon reviews

    • kaggle.com
    Updated Oct 16, 2023
    Cite
    Abdallah Wagih Ibrahim (2023). Amazon reviews [Dataset]. https://www.kaggle.com/datasets/abdallahwagih/amazon-reviews/discussion?sort=undefined
    Explore at:
    Dataset updated
    Oct 16, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Abdallah Wagih Ibrahim
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Overview: This dataset contains a subset of Amazon customer reviews from the "Cell Phones & Accessories" category. The dataset provides valuable insights into customer sentiment and opinions related to various cell phone and accessory products available on Amazon. Whether you're interested in natural language processing, sentiment analysis, product recommendations, or market research, this dataset can be a valuable resource.

    Context: With the ever-increasing variety of cell phones and accessories available online, understanding customer feedback and preferences is crucial for businesses, researchers, and data enthusiasts. This dataset offers a glimpse into customer sentiments regarding different products, allowing for a wide range of analytical and research applications.

    License: Please note that this dataset is for research and analysis purposes only and may be subject to copyright and terms of use from Amazon. Make sure to comply with Amazon's policies when using this data.

    Dataset Source: The original dataset was scraped from Amazon's website.

  20. UK Optimal Product Price Prediction Dataset

    • kaggle.com
    Updated Nov 7, 2023
    Cite
    asaniczka (2023). UK Optimal Product Price Prediction Dataset [Dataset]. http://doi.org/10.34740/kaggle/ds/3893120
    Explore at:
    Dataset updated
    Nov 7, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    asaniczka
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    This dataset contains product prices from Amazon UK, with a focus on price prediction. With a good amount of data on what price points sell the most, you can train machine learning models to predict the optimal price for a product based on its features and product name.

    If you find this dataset useful, make sure to show your appreciation by upvoting! ❤️✨

    Inspirations

    This dataset is a superset of my Amazon UK product price dataset. Another inspiration is this competition that awarded 100K in prize money.

    What To Do?

    • Your objective is to create a prediction model that will assist sellers in pricing their products within the optimal price range to generate the most sales.
    • The dataset includes various data points, such as the number of reviews, rating, best seller status, and items sold last month.
    • You can select specific factors (e.g., over 100 reviews = optimal price for the product) and then divide the dataset into products priced optimally vs. products priced suboptimally.
    • By utilizing techniques like vectorizing product names and features, you can train a model to provide the optimal price for a product, which sellers or businesses might find valuable.
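    The labeling step described above (splitting products into optimally and suboptimally priced using a chosen factor) might be sketched as follows. The field names "title", "price" and "reviews" and the sample rows are assumptions for illustration; the threshold of 100 reviews is the example given in the list, not a validated cutoff.

```python
def label_by_reviews(products, min_reviews=100):
    """Add a boolean 'priced_optimally' flag: True when reviews exceed the threshold."""
    return [
        {**product, "priced_optimally": product["reviews"] > min_reviews}
        for product in products
    ]


# Hypothetical rows; the real dataset's column names may differ.
products = [
    {"title": "USB-C cable 2m", "price": 6.99, "reviews": 1520},
    {"title": "Desk lamp", "price": 24.50, "reviews": 37},
    {"title": "Phone stand", "price": 9.99, "reviews": 100},
]

labeled = label_by_reviews(products)
optimal = [p for p in labeled if p["priced_optimally"]]
suboptimal = [p for p in labeled if not p["priced_optimally"]]
print(len(optimal), len(suboptimal))
```

    The resulting flag could then serve as the target when training a classifier, or the prices of the "optimal" group could serve as the target for a regression model.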

    How to know if a product sells?

    • I would use the number of reviews as a proxy for whether a product sells: more reviews should mean more sales.
    • According to one source, only 1-2% of buyers leave a review.
    • So multiplying a product's review count by 50 (which assumes a 2% review rate) gives a rough estimate of how many units have sold.
    • Multiplying the product price by that number of units then gives the total revenue generated by the product.
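    The arithmetic in the list above can be written out as a small sketch. The function and parameter names are hypothetical. The multiplier of 50 corresponds to the 2% end of the quoted 1-2% range; a 1% review rate would imply a multiplier of 100, so these figures are order-of-magnitude estimates at best.

```python
def estimate_units_sold(reviews, buyers_per_review=50):
    """Estimate units sold, assuming one review per `buyers_per_review` purchases."""
    return reviews * buyers_per_review


def estimate_revenue(price, reviews, buyers_per_review=50):
    """Estimate total revenue as price times estimated units sold."""
    return price * estimate_units_sold(reviews, buyers_per_review)


units = estimate_units_sold(120)           # 120 reviews at a 2% review rate
revenue = estimate_revenue(12.50, 120)     # price of 12.50 per unit
revenue_at_1_percent = estimate_revenue(12.50, 120, buyers_per_review=100)
print(units, revenue, revenue_at_1_percent)
```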

    How is this useful?

    • Sellers and businesses can leverage your model to determine the optimal price for their products, thereby maximizing sales.
    • Businesses can assess the profitability of a product and plan their supply chain accordingly.
Cite
Statista (2025). Amazon employees 2007-2024 [Dataset]. https://www.statista.com/statistics/234488/number-of-amazon-employees/

Amazon employees 2007-2024

Explore at:
43 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide, United States
Description

The combined number of full- and part-time employees of Amazon.com has increased significantly since 2017. Amazon’s headcount peaked in 2021 when the American multinational e-commerce company employed ********* full- and part-time employees, not counting external contractors. However, in 2024, the number dropped to *********.

E-commerce crunch

The workforce reduction of Amazon follows the mass layoffs hitting the entire e-commerce sector. With the full reopening of physical stores after the COVID-19 pandemic, online shopping demand decreased, leading online retailers to restructure their businesses, including personnel costs.

Diversifying business

With online retail sales growing slower due to recession and inflation, Amazon can still leverage other profitable revenue segments, from media subscriptions to server hosting and cloud services. On top of that, in 2023 Amazon monitored small enterprises operating in different fields and strategically invested in them, as disclosed startup acquisitions indicate.
