40 datasets found

Amazon employees 2007-2024
statista.com
gruabehub.com
Updated Jun 25, 2025
Cite
Statista (2025). Amazon employees 2007-2024 [Dataset]. https://www.statista.com/statistics/234488/number-of-amazon-employees/
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide, United States
Description
The combined number of full- and part-time employees of Amazon.com has increased significantly since 2017. Amazon’s headcount peaked in 2021, when the American multinational e-commerce company employed ********* full- and part-time employees, not counting external contractors. However, in 2024, the number dropped to *********.

E-commerce crunch

The workforce reduction of Amazon follows the mass layoffs hitting the entire e-commerce sector. With the full reopening of physical stores after the COVID-19 pandemic, online shopping demand decreased, leading online retailers to restructure their businesses, including personnel costs.

Diversifying business

With online retail sales growing slower due to recession and inflation, Amazon can still leverage other profitable revenue segments, from media subscriptions to server hosting and cloud services. On top of that, in 2023 Amazon monitored small enterprises operating in different fields and strategically invested in them, as disclosed startup acquisitions indicate.

Amazon Employee Access Challenge

kaggle.com

Updated Aug 11, 2021

Cite

Luca Massaron (2021). Amazon Employee Access Challenge [Dataset]. https://www.kaggle.com/lucamassaron/amazon-employee-access-challenge/code

Dataset updated

Aug 11, 2021

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Luca Massaron

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

When an employee at any company starts work, they first need to obtain the computer access necessary to fulfill their role. This access may allow an employee to read/manipulate resources through various applications or web portals. It is assumed that employees fulfilling the functions of a given role will access the same or similar resources. It is often the case that employees figure out the access they need as they encounter roadblocks during their daily work (e.g. not able to log into a reporting portal). A knowledgeable supervisor then takes time to manually grant the needed access in order to overcome access obstacles. As employees move throughout a company, this access discovery/recovery cycle wastes a nontrivial amount of time and money.

There is a considerable amount of data regarding an employee’s role within an organization and the resources to which they have access. Given the data related to current employees and their provisioned access, models can be built that automatically determine access privileges as employees enter and leave roles within a company. These auto-access models seek to minimize the human involvement required to grant or revoke employee access.

Part of the competition "Amazon.com - Employee Access Challenge" (https://www.kaggle.com/c/amazon-employee-access-challenge), the data consists of real historical data collected from 2010 & 2011. Employees are manually allowed or denied access to resources over time. Your task is to create an algorithm capable of learning from this historical data to predict approval/denial for an unseen set of employees.

Content

The data comes from Amazon Inc. and was collected from 2010-2011 (published on the Kaggle platform). The training set consists of 32769 samples and the testing one of 58922 samples. The training set has one label attribute named “ACTION”, whose value “1” indicates an application is approved whereas “0” indicates rejection. As predictors of this state, there are eight features, indicating characteristics of the required resource and the role and work group of the employee at Amazon requesting access.

train.csv - The training set. Each row has the ACTION (ground truth), RESOURCE, and information about the employee's role at the time of approval

test.csv - The test set for which predictions should be made. Each row asks whether an employee having the listed characteristics should have access to the listed resource.

Column Name	Description
ACTION	ACTION is 1 if the resource was approved, 0 if the resource was not
RESOURCE	An ID for each resource
MGR_ID	The EMPLOYEE ID of the manager of the current EMPLOYEE ID record; an employee may have only one manager at a time
ROLE_ROLLUP_1	Company role grouping category id 1 (e.g. US Engineering)
ROLE_ROLLUP_2	Company role grouping category id 2 (e.g. US Retail)
ROLE_DEPTNAME	Company role department description (e.g. Retail)
ROLE_TITLE	Company role business title description (e.g. Senior Engineering Retail Manager)
ROLE_FAMILY_DESC	Company role family extended description (e.g. Retail Manager, Software Engineering)
ROLE_FAMILY	Company role family description (e.g. Retail Manager)
ROLE_CODE	Company role code; this code is unique to each role (e.g. Manager)
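
As a minimal sketch of how rows with the columns above might be split into a label and categorical features, assuming a comma-separated file whose header uses exactly these column names. The two data rows are made-up placeholders, not real records from train.csv, and `split_rows` is a hypothetical helper name.

```python
import csv
import io

# Header follows the column table; the data rows are invented placeholders.
SAMPLE = """ACTION,RESOURCE,MGR_ID,ROLE_ROLLUP_1,ROLE_ROLLUP_2,ROLE_DEPTNAME,ROLE_TITLE,ROLE_FAMILY_DESC,ROLE_FAMILY,ROLE_CODE
1,101,2001,3001,4001,5001,6001,7001,8001,9001
0,102,2002,3001,4002,5002,6002,7002,8002,9002
"""

def split_rows(handle):
    """Return (features, labels); features is a list of dicts of ID strings."""
    features, labels = [], []
    for row in csv.DictReader(handle):
        # ACTION is the ground truth: 1 = approved, 0 = not approved.
        labels.append(int(row.pop("ACTION")))
        # Keep the remaining IDs as strings: they name categories, not quantities.
        features.append(row)
    return features, labels

features, labels = split_rows(io.StringIO(SAMPLE))
print(labels)             # [1, 0]
print(sorted(features[0]))
```

Keeping the IDs as strings is a deliberate choice in this sketch: treating them as numbers would imply an ordering between roles or resources that the column descriptions do not suggest.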

Models are judged on area under the ROC curve (https://en.wikipedia.org/wiki/Receiver_operating_characteristic)
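
For readers unfamiliar with the metric, here is a small sketch of area under the ROC curve computed directly from its pairwise definition: the fraction of (approved, denied) pairs in which the approved example receives the higher score, with ties counted as one half. The function name and the sample labels and scores are illustrative assumptions, not competition data or the competition's scoring code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical ACTION labels and predicted approval scores.
y_true = [1, 0, 1, 1, 0]
y_score = [0.9, 0.4, 0.35, 0.8, 0.1]
print(round(roc_auc(y_true, y_score), 3))  # 0.833
```

The double loop is quadratic in the number of rows, so it is only meant to make the definition concrete; for data of the size described above, a library routine such as scikit-learn's `roc_auc_score` is the more practical choice.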

Acknowledgements

The data has been donated by Amazon and the original competition has been hosted in collaboration with the IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013)

Dataset of business metrics of companies called Amazon
workwithdata.com
Updated May 6, 2025
Cite
Work With Data (2025). Dataset of business metrics of companies called Amazon [Dataset]. https://www.workwithdata.com/datasets/companies?col=ceo%2Cceo_approval%2Cceo_gender%2Ccity%2Cemployees&f=1&fcol0=company&fop0=%3D&fval0=Amazon
Dataset updated
May 6, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about companies. It has 19 rows and is filtered where the company is Amazon. It features 5 columns: employees, city, CEO, CEO gender, and CEO approval.
Amazon Question and Answer Data
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, Amazon Question and Answer Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
These datasets contain 1.48 million question and answer pairs about products from Amazon.

Metadata includes

question and answer text

is the question binary (yes/no), and if so does it have a yes/no answer?

timestamps

product ID (to reference the review dataset)

Basic Statistics:

Questions: 1.48 million

Answers: 4,019,744

Labeled yes/no questions: 309,419

Number of unique products with questions: 191,185
Amazon review data 2018
nijianmo.github.io
cseweb.ucsd.edu
Cite
UCSD CSE Research Project, Amazon review data 2018 [Dataset]. https://nijianmo.github.io/amazon/
Dataset authored and provided by
UCSD CSE Research Project
Description
Context

This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

More reviews: The total number of reviews is 233.1 million (142.8 million in 2014).

New reviews: Current data includes reviews in the range May 1996 - Oct 2018.

Metadata:
- We have added transaction metadata for each review shown on the review page.
- Added more detailed metadata of the product landing page.

Acknowledgements

If you publish articles based on this dataset, please cite the following paper:

Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.
Amazon reviews Dataset
brightdata.com
.json, .csv, .xlsx
Updated Mar 21, 2023
Cite
Bright Data (2023). Amazon reviews Dataset [Dataset]. https://brightdata.com/products/datasets/amazon/reviews
Available download formats: .json, .csv, .xlsx
Dataset updated
Mar 21, 2023
Dataset authored and provided by
Bright Data
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements.

Popular use cases include:

Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.

Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.

Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.

Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.
Amazon Web Services - Public Data Sets
data.wu.ac.at
Updated Oct 10, 2013
Cite
Global (2013). Amazon Web Services - Public Data Sets [Dataset]. https://data.wu.ac.at/schema/datahub_io/NTYxNjkxNmYtNmZlNS00N2EwLWJkYTktZjFjZWJkNTM2MTNm
Dataset updated
Oct 10, 2013
Dataset provided by
Global
Description
About

From website:

Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.

Previously, large data sets such as the mapping of the Human Genome and the US Census data required hours or days to locate, download, customize, and analyze. Now, anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. For example, users can produce or use prebuilt server images with tools and applications to analyze the data sets. By hosting this important and useful data with cost-efficient services such as Amazon EC2, AWS hopes to provide researchers across a variety of disciplines and industries with tools to enable more innovation, more quickly.
Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) +...
datarade.ai
.csv
Updated Jan 18, 2023
Cite
Space Know (2023). Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) + Research Report Available [Dataset]. https://datarade.ai/data-products/satellite-us-supply-chain-dataset-package-amazon-fedex-wal-space-know
Available download formats: .csv
Dataset updated
Jan 18, 2023
Dataset authored and provided by
Space Know
Area covered
United States
Description
SpaceKnow USA Supply Chain Premium Dataset gives you data (by locations and company) of US Supply Chain choke points in near-real-time as seen from satellite images. The uniqueness of this dataset lies in its granularity.

About dataset: We apply proprietary algorithms to SAR satellite imagery of key industrial, transportation, storage, and logistics locations to create daily indices of industry activity. Data was collected from more than 5,000 locations across the USA. Thanks to the use of SAR satellite technology, the quality of the SpaceKnow dataset is not influenced by weather fluctuations.

In total SpaceKnow USA Supply Chain dataset offers +50 specific indices with real-time insights. The premium dataset includes company-focused indices. This type of data can be used by investors to get insight on important KPIs such as revenue.

This dataset has:

- Daily frequency
- History from Jan 2017 - present

Within one package we provide you with real-time insights into:

- Port Container country-level indices (a container port or container terminal is a facility where cargo containers are transshipped between different transport vehicles, for onward transportation)
- Port Container indices for the major ports in the US: Port of Los Angeles, Port of Long Beach, Port of New York & New Jersey, Port of Savannah, Port of Houston, Port of Virginia, Port of Oakland in California, Port of South Carolina, Port of Miami
- Trucking Stop indices for the most important locations in the supply chain, like: Iowa, Nevada, South Carolina, Oregon, North Carolina
- Inland Containers index on a country level
- Logistics Center index on a country level (logistics centers are distribution hubs for finished goods that need to be transported to another location; we include logistics centers from companies like Amazon, Walmart, Fedex and others)
- Logistics Center indices for states like: California, New York, Illinois, Indiana, South Carolina, and many more
- Logistics Center indices for companies: Amazon, Walmart, Fedex

Research Reports

Don't have the capacity to analyze the data? Let SpaceKnow's in-house economists do the heavy lifting so that you can focus on what's important. SpaceKnow writes research reports based on what the data from the US Supply Chain dataset package is showing. The document includes a detailed explanation of what is happening with supporting charts and tables. The reports are published on a monthly basis.

Delivery Mechanisms

All of the delivery mechanisms detailed below are available as part of this package. Data is distributed only in the flat-table CSV format. Methods to access the data:

- Dashboard: an option that also offers data visualization within the webpage
- Automatic email delivery
- API access to our dataset
- Research reports: provided via email in PDF format

Client Support

Each client is assigned an account representative who will reach out periodically to make sure that the data packages are meeting your needs. Here are some other ways to contact SpaceKnow in case you have a specific question.

For delivery questions and issues: Please reach out to support@spaceknow.com

For data questions: Please reach out to product@spaceknow.com

For pricing/sales support: Please reach out to info@spaceknow.com or sales@spaceknow.com
Data from: LBA-ECO LC-03 SAR Images, Land Cover, and Biomass, Four Areas...
catalog.data.gov
cmr.earthdata.nasa.gov
Updated Aug 30, 2025
Cite
ORNL_DAAC (2025). LBA-ECO LC-03 SAR Images, Land Cover, and Biomass, Four Areas across Brazilian Amazon [Dataset]. https://catalog.data.gov/dataset/lba-eco-lc-03-sar-images-land-cover-and-biomass-four-areas-across-brazilian-amazon-72c32
Dataset updated
Aug 30, 2025
Dataset provided by
Oak Ridge National Laboratory Distributed Active Archive Center
Area covered
Amazon Rainforest, Brazil
Description
This data set provides three related land cover products for four study areas across the Brazilian Amazon: Manaus, Amazonas; Tapajos National Forest, Para Western (Santarem); Rio Branco, Acre; and Rondonia, Rondonia. Products include (1) orthorectified JERS-1 and RadarSat images, (2) land cover classifications derived from the SAR data, and (3) biomass estimates in tons per hectare based on the land cover classification. There are 12 image files (.tif) with this data set.

Orthorectified JERS-1 and RadarSat images are provided as GeoTIFF images - one file for each study area. For the Manaus and Tapajos sites: The images are orthorectified at 12.5-meter resolution and then re-sampled at 25-meter resolution. For the Rondonia and Rio Branco sites: The images from 1978 are orthorectified at 25-meter resolution and then re-sampled at 90-meter resolution. Each GeoTIFF file contains 3 image channels: 2 L-band JERS-1 data in Fall and Spring seasons and 1 C-band RadarSat data.

Land cover classifications are based on two JERS-1 images and one RadarSat image and provided as GeoTIFFs - one file for each study area. Four major land cover classes are distinguished: (1) Flat surface; (2) Regrowth area; (3) Short vegetation; and (4) Tall vegetation. The biomass estimates in tons per hectare are based on the land cover classification results and are reported in one GeoTIFF file for each study area.

DATA QUALITY STATEMENT: The Data Center has determined that there are questions about the quality of the data reported in this data set. The data set has missing or incomplete data, metadata, or other documentation that diminishes the usability of the products.

KNOWN PROBLEMS: The data providers note that due to limited resources, these data have been neither validated nor quality-assured for general use. For that reason, extreme caution is advised when considering the use of these data. Any use of the derived data is not recommended because the results have not been validated. However, the DEM and vectors (related data set), and orthorectified SAR data can be used if the user understands how these were produced and accepts the limitations.
Amazon Products Database contains data on keywords and product listings...
datarade.ai
.json
Updated Sep 27, 2023
Cite
DataForSEO (2023). Amazon Products Database contains data on keywords and product listings ranking for them [Dataset]. https://datarade.ai/data-products/amazon-products-database-contains-data-on-keywords-and-produc-dataforseo
Available download formats: .json
Dataset updated
Sep 27, 2023
Dataset authored and provided by
DataForSEO
Area covered
United Arab Emirates, Egypt, United States of America, Saudi Arabia
Description
First of all, Amazon product datasets are indispensable for reverse engineering your rivals. For example, you can collect a list of keywords you already rank for or want to, and go through DataForSEO Amazon Products Database to find other sellers appearing as the top results for these terms.

Next, you can narrow down the scope of your contenders to those performing the best. To do so, you can filter out sellers who won the “Amazon’s Choice” and those whose products got listed multiple times on the first page.

Once you’ve compiled the final list of your challengers, Amazon Products Database will help you to quickly examine product titles, descriptions, prices, images, and other details that will let you grasp the main contributors to your competitors’ success. Once you’ve figured that out, you can start optimizing your product listings and pricing strategies to increase conversions.

However, the number of use cases for Amazon product data isn’t limited to competitor analysis. It can be applied to monitoring product rankings, running price comparisons, and more.
Data from: LBA-ECO LC-03 Hypsography, Rivers, Roads, and DEM, Four Areas...
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
search.dataone.org
Updated Jul 10, 2025
Cite
ORNL_DAAC (2025). LBA-ECO LC-03 Hypsography, Rivers, Roads, and DEM, Four Areas across Brazilian Amazon [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/lba-eco-lc-03-hypsography-rivers-roads-and-dem-four-areas-across-brazilian-amazon-76907
Dataset updated
Jul 10, 2025
Dataset provided by
ORNL_DAAC
Area covered
Amazon Rainforest, Brazil
Description
This data set provides four related spatial data products for four study areas across the Brazilian Amazon: Manaus, Amazonas; Tapajos National Forest, Para Western (Santarem); Rio Branco, Acre; and Rondonia, Rondonia. Products include vector data showing (1) roads, (2) rivers, and (3) hypsography and (4) digital elevation model (DEM) images that were encoded from the hypsography vectors. There are 15 data files with this data set, which includes 12 compressed *.zip files containing ArcInfo shape files and 3 GeoTIFFs.

This data set contains vector data showing roads, rivers, and hypsography for each study area in ESRI ArcGIS shapefile format. The vectors were hand-digitized by the Images Company in Brazil from paper maps produced by the Brazilian government. Depending on the scale of the original maps, the digitization errors vary. For some maps, some vectors are missing. Data were manually checked for duplicate or extra vectors. These data sets were derived from several map sheets produced from aerial coverages dating from 1974 to 1978.

The DEM images were encoded from the hypsography vectors and are provided in GeoTIFF format. The attribute value associated with each line and point in the vector segment is encoded into the image channel; the image channel is then filled in by interpolating image data between encoded vector data. For each DEM: 1 image channel with pixel resolution = 25m x 25m. DEM images are provided for Manaus, Tapajos National Forest, and Rondonia. The files for Rio Branco were unusable due to a documentation error.

DATA QUALITY STATEMENT: The Data Center has determined that there are questions about the quality of the data reported in this data set. The data set has missing or incomplete data, metadata, or other documentation that diminishes the usability of the products.

KNOWN PROBLEMS: The data providers note that due to limited resources, these data have been neither validated nor quality-assured for general use. For that reason, extreme caution is advised when considering the use of these data. Any use of the derived data is not recommended because the results have not been validated. However, the DEM, vectors, and orthorectified SAR data (related data set) can be used if the user understands how these were produced and accepts the limitations.
Data from: Spatial and temporal contrasts in the distribution of crops and...
dataverse.harvard.edu
datasetcatalog.nlm.nih.gov
Updated Jan 24, 2024
Cite
Pablo Andres Imbach Bartol; M Manrow; Elizabeth Barona Adarve; Alberto G O P Barretto; Glenn Graham Hyman (2024). Spatial and temporal contrasts in the distribution of crops and pastures across Amazonia: A new agricultural land use data set from census data since 1950: Crops and pastures across Amazonia [Dataset]. http://doi.org/10.7910/DVN/7J9WVY
Unique identifier
https://doi.org/10.7910/DVN/7J9WVY
Dataset updated
Jan 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Pablo Andres Imbach Bartol; M Manrow; Elizabeth Barona Adarve; Alberto G O P Barretto; Glenn Graham Hyman
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.2/customlicense?persistentId=doi:10.7910/DVN/7J9WVY
Area covered
South America, Peru, Brazil, Guyana, Venezuela (Bolivarian Republic of), Bolivia (Plurinational State of), Ecuador, Colombia
Description
Amazonia holds the largest continuous area of tropical forests with intense land use change dynamics inducing water, carbon, and energy feedbacks with regional and global impacts. Much of our knowledge of land use change in Amazonia comes from studies of the Brazilian Amazon, which accounts for two thirds of the region. Amazonia outside of Brazil has received less attention because of the difficulty of acquiring consistent data across countries. We present here an agricultural statistics database of the entire Amazonia region, with a harmonized description of crops and pastures in geospatial format, based on administrative boundary data at the municipality level. The spatial coverage includes countries within Amazonia and spans censuses and surveys from 1950 to 2012. Harmonized crop and pasture types are explored by grouping annual and perennial cropping systems, C3 and C4 photosynthetic pathways, planted and natural pastures, and main crops. Our analysis examined the spatial pattern of ratios between classes of the groups and their correlation with the agricultural extent of crops and pastures within administrative units of the Amazon, by country, and census/survey dates. Significant correlations were found between all ratios and the fraction of agricultural lands of each administrative unit, with the exception of planted to natural pastures ratio and pasture lands extent. Brazil and Peru in most cases have significant correlations for all ratios analyzed even for specific census and survey dates. Results suggested improvements, and potential applications of the database for carbon, water, climate, and land use change studies are discussed. The database presented here provides an Amazon-wide improved data set on agricultural dynamics with expanded temporal and spatial coverage.
Amazon Reviews Dataset
kaggle.com
Updated Jan 2, 2023
Cite
Daniel Ihenacho (2023). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/danielihenacho/amazon-reviews-dataset
Dataset updated
Jan 2, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Daniel Ihenacho
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset was created from scraped reviews of products on Amazon for the purpose of text classification. There are three classes:
- Negative Reviews
- Neutral Reviews
- Positive Reviews

Data columns include:
- Sentiments
- Cleaned Review
- Cleaned Review Length
- Review Score

This dataset presents a multiclass classification problem that can be approached with ML algorithms and also with deep learning algorithms. Moreover, there is a class imbalance: negative reviews are the smallest class compared to positive and neutral reviews.

For ML algorithms, use the mapping: negative --> -1, neutral --> 0, positive --> 1

For deep learning algorithms, use the mapping: negative --> 0, neutral --> 1, positive --> 2
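
The two label mappings stated above can be sketched as plain dictionaries. The integer values come from the description; the exact strings stored in the Sentiments column are an assumption here (lowercase "negative", "neutral", "positive" after normalisation), and `encode` is a hypothetical helper name.

```python
# Integer codes exactly as given in the dataset description.
ML_LABELS = {"negative": -1, "neutral": 0, "positive": 1}
DL_LABELS = {"negative": 0, "neutral": 1, "positive": 2}

def encode(sentiments, mapping):
    """Map sentiment strings to integer labels, normalising case and whitespace."""
    return [mapping[s.strip().lower()] for s in sentiments]

# Hypothetical values of the Sentiments column.
sample = ["Positive", "neutral", " negative "]
print(encode(sample, ML_LABELS))  # [1, 0, -1]
print(encode(sample, DL_LABELS))  # [2, 1, 0]
```

A likely reason for the second mapping is that many deep learning loss functions expect class indices that start at 0 and run contiguously, which the -1/0/1 scheme does not satisfy.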

Looking forward to your model discoveries on this dataset.

Please leave an upvote if you find this relevant 😀.
10m Annual Land Use Land Cover (9-class)
registry.opendata.aws
collections.sentinel-hub.com
Updated Jul 6, 2023
Cite
Impact Observatory (2023). 10m Annual Land Use Land Cover (9-class) [Dataset]. https://registry.opendata.aws/io-lulc/
Dataset updated
Jul 6, 2023
Dataset provided by
Impact Observatory (https://www.impactobservatory.com/)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset, produced by Impact Observatory, Microsoft, and Esri, displays a global map of land use and land cover (LULC) derived from ESA Sentinel-2 imagery at 10 meter resolution for the years 2017 - 2023. Each map is a composite of LULC predictions for 9 classes throughout the year in order to generate a representative snapshot of each year. This dataset was generated by Impact Observatory, which used billions of human-labeled pixels (curated by the National Geographic Society) to train a deep learning model for land classification. Each global map was produced by applying this model to the Sentinel-2 annual scene collections from the Microsoft Planetary Computer. Each of the maps has an assessed average accuracy of over 75%. These maps have been improved from Impact Observatory’s previous release and provide a relative reduction in the amount of anomalous change between classes, particularly between “Bare” and any of the vegetative classes “Trees,” “Crops,” “Flooded Vegetation,” and “Rangeland”. This updated time series of annual global maps is also re-aligned to match the ESA UTM tiling grid for Sentinel-2 imagery. Data can be accessed directly from the Registry of Open Data on AWS, from the STAC 1.0.0 endpoint, or from the IO Store for a specific Area of Interest (AOI).
Amazon Reviews Dataset
kaggle.com
Updated Sep 20, 2024
Cite
Dongre Laxman (2024). Amazon Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/dongrelaxman/amazon-reviews-dataset
Dataset updated
Sep 20, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dongre Laxman
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset comprises customer reviews for Amazon, an online retail giant, featuring insights into customer experiences, including ratings, review titles, texts, and metadata. It is valuable for analyzing customer satisfaction, sentiment, and trends.

Column Descriptions:

- Reviewer Name: Identifies the reviewer.
- Profile Link: Links to the reviewer's profile for additional insights.
- Country: Indicates the reviewer's location.
- Review Count: Number of reviews by the same user, showing engagement level.
- Review Date: When the review was posted, useful for time analysis.
- Rating: Numerical satisfaction measure.
- Review Title: Summarizes the review sentiment.
- Review Text: Detailed customer feedback.
- Date of Experience: When the service/product was experienced.

Prospective applications:

- Sentiment Analysis: Analyze review texts and titles to assess overall customer sentiment toward products, enabling the identification of strengths and weaknesses.
- Customer Satisfaction Tracking: Track and visualize rating trends over time to understand fluctuations in customer satisfaction.
- Product Improvement: Identify common themes in reviews to highlight areas for product enhancement or development.
- Market Segmentation: Use country and demographic information to customize marketing strategies and gain insights into regional preferences.
- Competitor Analysis: Evaluate customer feedback on Amazon products in comparison to competitors to determine market positioning.
- Recommendation Systems: Leverage review data to enhance recommendation algorithms, improving personalized shopping experiences.
- Trend Analysis: Investigate temporal patterns in reviews to link sentiment changes with marketing efforts or product launches.

This extensive dataset serves as a valuable asset for various analyses focused on enhancing customer engagement and refining business strategies.
LBA-ECO LC-15 Vegetation Cover Types from MODIS, 1-km, Amazon Basin:...
earthdata.nasa.gov
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
Updated Aug 22, 2011
Cite
ORNL_CLOUD (2011). LBA-ECO LC-15 Vegetation Cover Types from MODIS, 1-km, Amazon Basin: 2000-2001 [Dataset]. http://doi.org/10.3334/ORNLDAAC/1035
Explore at:
Unique identifier
https://doi.org/10.3334/ORNLDAAC/1035
Dataset updated
Aug 22, 2011
Dataset authored and provided by
ORNL_CLOUD
Description
This data set contains proportional estimates for the vegetative cover types of woody vegetation, herbaceous vegetation, and bare ground over the Amazon Basin for the period 2000-2001. These products were derived from all seven bands of the Moderate-resolution Imaging Spectroradiometer (MODIS) sensor onboard NASA's Terra satellite. A set of MODIS 32-day composites was used to create the vegetation cover types using the Vegetation Continuous Fields (VCF) approach (Hansen et al., 2002), which shows how much of a land cover type such as "forest" or "grassland" exists anywhere on the land surface. The VCF product may depict areas of heterogeneous land cover better than traditional discrete classification schemes, which show where land cover types are concentrated.

The original MODIS products have 500-m spatial resolution and are derived from 2000-2001 data products. The data were resampled to 1-km resolution for the regional study under this project and are supplied as 3 separate cover type files in ENVI and GeoTIFF formats, packaged in six zipped files. These products are registered to the rest of the regional data sets over the Amazon Basin.

These data are also available for download from the Global Land Cover Facility Website (http://modis.umiacs.umd.edu/).
Archaeological Sites in the Amazon Biome
kaggle.com
Updated Jun 22, 2025
JP (2025). Archaeological Sites in the Amazon Biome [Dataset]. https://www.kaggle.com/datasets/josepart/archeological-sites-in-the-amazon-biome
Explore at:
Dataset updated
Jun 22, 2025
Dataset provided by
Kaggle
Authors
JP
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Amazon Rainforest
Description

This dataset compiles geolocation information of archaeological sites in the Amazon Biome and surrounding area. It includes a total of 5442 sites. However, please note that the dataset contains duplicates. This was a deliberate choice to allow the user to select the subset of data points that they want to use. For example, the user could choose to deduplicate the dataset using their own chosen strategy, or use a subset from a particular source. The locations of the sites are illustrated in the figure below.

Figure image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F358347%2Ff8916b3723b9297a397ba1732e3afa74%2Famazon_biome_sites.png?generation=1750548569960405&alt=media

Fig. 1: Archaeological sites spread across the Amazon Biome (black outline). Each color corresponds to a different source (see References) as follows: de Souza (red), Coomes (green), Kalliola (blue), Walker (orange) and Jacobs (purple).

Warning: the amazon_biome_sites.csv file contains duplicates. This is because some of the sources are compilations that contain other sources included in this dataset. For example, Jacobs' compilation contains some of the sites present in Kalliola's list. Likewise, there is a lot of overlap between Walker's list and all other lists (as can be appreciated in Fig. 1). In many cases, site naming is not consistent, and coordinates may also vary. For example, Walker didn't provide site names and seems to have rounded location coordinates, and Jacobs updated coordinates based on his verification of the sites on platforms like Google Earth. Additionally, some authors consider different neighboring structures as different sites, whereas others consider them as part of the same site. Therefore, deduplication is not trivial. However, depending on the intended use, there are several strategies that could be employed. For example, the user could choose to use the sites from a particular source, or remove duplicates based on approximate coordinates. You can have a look at this notebook for an example of how to deduplicate data.
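One of the strategies mentioned above, removing duplicates based on approximate coordinates, can be sketched as follows. This is only an illustrative approach, not the method used in the referenced notebook. The field names (`name`, `source`, `latitude`, `longitude`) and the sample rows are hypothetical; check them against the actual CSV header before use.

```python
def dedupe_by_rounded_coords(sites, decimals=2):
    """Keep the first site seen in each rounded (latitude, longitude) cell.

    With decimals=2 the cells are 0.01 degrees wide, roughly 1 km near the
    equator. Field names are assumptions, not the dataset's confirmed schema.
    """
    seen = set()
    unique = []
    for site in sites:
        key = (round(site["latitude"], decimals), round(site["longitude"], decimals))
        if key not in seen:
            seen.add(key)
            unique.append(site)
    return unique


# Invented sample rows: the second repeats the first with rounded coordinates.
sample_sites = [
    {"name": "Site A", "source": "kalliola", "latitude": -9.8712, "longitude": -67.5331},
    {"name": "", "source": "walker", "latitude": -9.87, "longitude": -67.53},
    {"name": "Site B", "source": "jacobs", "latitude": -10.2504, "longitude": -67.9012},
]
unique_sites = dedupe_by_rounded_coords(sample_sites)
```

Rounding to a grid is simple but imperfect: two nearby points that fall on opposite sides of a cell boundary are kept as separate sites, and the order of the input decides which source "wins" for each cell. A distance threshold would be a stricter alternative.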

Data Collection

The Excel sheet with a list of archaeological sites in the Department of Loreto in Peru (sources/original/coomes/Coomes et al_Table of archaeological sites in Department of Loreto.xlsx) was processed manually and converted into a CSV file (sources/processed/coomes_loreto_peruvian_amazon_sites.csv).

The PDF file containing archaeological sites in the Upper Tapajós Basin in Brazil (sources/original/desouza/41467_2018_3510_MOESM1_ESM.pdf) was processed by first extracting the pages containing the relevant table (sources/processed/desouza_upper_tapajos_basin_sites.pdf), and then asking Gemini 2.5 Flash to convert the PDF table into a CSV file. The final file was manually verified to correct for any mistakes, including swapping the latitude and longitude values, which were inverted in the source file.

The PDF file containing archaeological sites in the southwestern Amazon (sources/original/kalliola/List of Southwestern Amazonian Earthworks 25.08.2024b.pdf) was processed by asking Gemini 2.5 Flash to convert the PDF table into a CSV file. The final file was verified manually by spot-checking entries and swapping the latitude and longitude values, which were inverted in the source file. Typos in the original file, as well as some introduced by Gemini, were fixed, and some entries were reordered (e.g., the structures belonging to the same site were sometimes out of sequence). When in doubt, entries were left as in the original file. Despite efforts to fix all mistakes and keep the naming of entries consistent, there is no guarantee that all errors have been fixed.
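A coordinate swap like the one described above can be sanity-checked programmatically. The sketch below flags a pair as swapped when it only fits a rough bounding box after exchanging the two values. The bounding box is an assumption chosen loosely around the Amazon region for illustration, not a boundary taken from the dataset, and the sample pairs are invented.

```python
def looks_swapped(lat, lon, lat_range=(-25.0, 15.0), lon_range=(-85.0, -40.0)):
    """Return True when (lat, lon) fits the assumed box only after swapping.

    The default ranges are a rough, hypothetical box around the Amazon region.
    They do not overlap, so a pair cannot be valid in both orders.
    """
    def in_box(a, b):
        return lat_range[0] <= a <= lat_range[1] and lon_range[0] <= b <= lon_range[1]

    return not in_box(lat, lon) and in_box(lon, lat)


def fix_swapped(pairs):
    """Swap the values of every pair that looks inverted; leave others as-is."""
    return [(lon, lat) if looks_swapped(lat, lon) else (lat, lon) for lat, lon in pairs]


# Invented pairs: the first is inverted, the second is already correct.
pairs = [(-67.53, -9.87), (-9.87, -67.53)]
corrected = fix_swapped(pairs)
```

A check like this only catches swaps that push a value outside the assumed box, so manual verification of a sample of entries remains useful.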

The file with the sites compiled by Walker et al. (sources/original/walker/submit.csv) was copied with the coordinate columns renamed. Additionally, the file sources/original/walker/variables.xlsx was converted into a CSV file, and a column was added to indicate which variables correspond to which columns in the sites file.

The file sources/original/jacobs/amazon_geoglyphs.xls, compiled by Jacobs, was turned into a CSV file by concatenating the sheets corresponding to geoglyphs, mound villages and earthworks in Mato Grosso. Additionally, some entries were fixed, where the coordinates had the wrong signs. In particular, ronq18 had a positive latitude, and mgro5 had a positive longitude.

Data Processing

Once all the source files were compiled and processed, in order to generate the amazon_biome_sites.csv, some further processing was performed. In particular, these are the steps that were followed:

For each source file, site names were modified (when necessary) by combining information from m...
Data from: Habitat use of Amazonian birds varies by age and foraging guild...
search.dataone.org
datasetcatalog.nlm.nih.gov
+1 more
Updated May 10, 2024
David Luther (2024). Habitat use of Amazonian birds varies by age and foraging guild along a disturbance gradient [Dataset]. http://doi.org/10.5061/dryad.8kprr4xvw
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.8kprr4xvw
Dataset updated
May 10, 2024
Dataset provided by
Dryad Digital Repository
Authors
David Luther
Description
Patterns of habitat use directly influence a species' fitness, yet for many species an individual's age can influence patterns of habitat use. However, in tropical rainforests, which host the greatest terrestrial species diversity, little is known about how age classes of different species use different adjacent habitats of varying quality. We use long term mistnet data from the Amazon rainforest to assess patterns of habitat use among adult, adolescent (teenage), and young understory birds in forest fragments, primary, and secondary forest at the Biological Dynamics of Forest Fragments Project in Brazil. Insectivore adults were most common in primary forest, adolescents were equally likely in primary and secondary forest, and all ages were the least common in forest fragments. In contrast to insectivores, frugivores and omnivores showed no differences among all three habitat types. Our results illustrate potential ideal despotic distributions among breeding populations of some guilds o...

Cite this dataset: Luther, David 2024. Habitat use of Amazonian birds varies by age and foraging guild along a disturbance gradient [Dataset]. Dryad. https://doi.org/10.5061/dryad.8kprr4xvw

Title: Habitat use of Amazonian birds varies by age and foraging guild along a disturbance gradient

Abstract: Patterns of habitat use directly influence a species' fitness, yet for many species an individual's age can influence patterns of habitat use. However, in tropical rainforests, which host the greatest terrestrial species diversity, little is known about how age classes of different species use different adjacent habitats of varying quality. We use long term mistnet data from the Amazon rainforest to assess patterns of habitat use among adult, adolescent (teenage), and young understory birds in forest fragments, primary, and secondary forest a...
Amazon reviews
kaggle.com
Updated Oct 16, 2023
Abdallah Wagih Ibrahim (2023). Amazon reviews [Dataset]. https://www.kaggle.com/datasets/abdallahwagih/amazon-reviews/discussion?sort=undefined
Explore at:
Dataset updated
Oct 16, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abdallah Wagih Ibrahim
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Overview: This dataset contains a subset of Amazon customer reviews from the "Cell Phones & Accessories" category. The dataset provides valuable insights into customer sentiment and opinions related to various cell phone and accessory products available on Amazon. Whether you're interested in natural language processing, sentiment analysis, product recommendations, or market research, this dataset can be a valuable resource.

Context: With the ever-increasing variety of cell phones and accessories available online, understanding customer feedback and preferences is crucial for businesses, researchers, and data enthusiasts. This dataset offers a glimpse into customer sentiments regarding different products, allowing for a wide range of analytical and research applications.

License: Please note that this dataset is for research and analysis purposes only and may be subject to copyright and terms of use from Amazon. Make sure to comply with Amazon's policies when using this data.

Dataset Source: The original dataset was scraped from Amazon's website.
UK Optimal Product Price Prediction Dataset
kaggle.com
Updated Nov 7, 2023
asaniczka (2023). UK Optimal Product Price Prediction Dataset [Dataset]. http://doi.org/10.34740/kaggle/ds/3893120
Explore at:
Unique identifier
https://doi.org/10.34740/kaggle/ds/3893120
Dataset updated
Nov 7, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
asaniczka
License
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Area covered
United Kingdom
Description
This dataset contains product prices from Amazon UK, with a focus on price prediction. With a good amount of data on what price points sell the most, you can train machine learning models to predict the optimal price for a product based on its features and product name.

If you find this dataset useful, make sure to show your appreciation by upvoting! ❤️✨

Inspirations

This dataset is a superset of my Amazon UK product price dataset. Another inspiration is this competition, which awarded 100K in prize money.

What To Do?

Your objective is to create a prediction model that will assist sellers in pricing their products within the optimal price range to generate the most sales.

The dataset includes various data points, such as the number of reviews, rating, best seller status, and items sold last month.

You can select specific factors (e.g., over 100 reviews = optimal price for the product) and then divide the dataset into products priced optimally vs. products priced suboptimally.
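The split described above can be sketched in a few lines of Python. The 100-review threshold comes from the example in the text; the field names (`title`, `price`, `reviews`) and the sample products are hypothetical and should be mapped to the real column names.

```python
def label_by_reviews(products, min_reviews=100):
    """Add a boolean 'priced_optimally' flag: True when reviews exceed the threshold.

    The threshold is a heuristic taken from the example in the text, and the
    'reviews' field name is an assumption about the dataset's schema.
    """
    return [
        {**product, "priced_optimally": product["reviews"] > min_reviews}
        for product in products
    ]


# Invented sample products, not taken from the dataset.
sample_products = [
    {"title": "USB cable", "price": 6.99, "reviews": 2540},
    {"title": "Desk lamp", "price": 24.50, "reviews": 100},
    {"title": "Phone stand", "price": 9.99, "reviews": 12},
]
labeled = label_by_reviews(sample_products)
optimal = [p["title"] for p in labeled if p["priced_optimally"]]
```

Because the text says "over 100 reviews", a product with exactly 100 reviews falls on the suboptimal side in this sketch.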

By utilizing techniques like vectorizing product names and features, you can train a model to provide the optimal price for a product, which sellers or businesses might find valuable.

How to know if a product sells?

I would prefer to use the number of reviews as a metric to determine if a product sells. More reviews = more sales, right?

According to one source, only 1-2% of buyers leave a review.

So if we multiply the number of reviews for a product by 50, we get a rough estimate of how many units have sold.

If we then multiply the product price by the number of units sold, we get the total revenue generated by the product.
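The arithmetic above can be written down as a small sketch. Note that a 1-2% review rate implies a multiplier somewhere between 50 (at 2%) and 100 (at 1%); the default of 50 below follows the text and gives the conservative end of that range. The function names and the sample figures are made up for illustration.

```python
def estimate_units_sold(review_count, reviews_multiplier=50):
    """Estimate units sold as reviews times a multiplier (50 assumes a 2% review rate)."""
    return review_count * reviews_multiplier


def estimate_revenue(price, review_count, reviews_multiplier=50):
    """Estimate total revenue as price times estimated units sold."""
    return price * estimate_units_sold(review_count, reviews_multiplier)


# Invented example: a product priced at 12.50 with 120 reviews.
units = estimate_units_sold(120)
revenue = estimate_revenue(12.5, 120)
```

Passing `reviews_multiplier=100` gives the upper estimate corresponding to a 1% review rate, which is a quick way to see how sensitive the revenue figure is to this assumption.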

How is this useful?

Sellers and businesses can leverage your model to determine the optimal price for their products, thereby maximizing sales.

Businesses can assess the profitability of a product and plan their supply chain accordingly.


Statista (2025). Amazon employees 2007-2024 [Dataset]. https://www.statista.com/statistics/234488/number-of-amazon-employees/

Amazon employees 2007-2024

Explore at:

43 scholarly articles cite this dataset (View in Google Scholar)


