We present a collection of Amazon reviews specifically designed to aid research in multilingual text classification. The dataset contains reviews in English, Japanese, German, French, Chinese and Spanish, collected between November 1, 2015 and November 1, 2019. Each record in the dataset contains the review text, the review title, the star rating, an anonymized reviewer ID, an anonymized product ID and the coarse-grained product category (e.g. 'books', 'appliances').
Amazon Web Services (AWS) global cloud data centers operate in 34 geographic regions, each containing several availability zones (AZs). As of 2024, Europe/Middle East/Africa and Asia Pacific and China had 77 zones combined, which is over 70 percent of all AWS' AZs.
A multidisciplinary repository of public data sets, such as the Human Genome and US Census data, that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community. Anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. If you have a public domain or non-proprietary data set that you think is useful and interesting to the AWS community, please submit a request and the AWS team will review your submission and get back to you. Typically the data sets in the repository are between 1 GB and 1 TB in size (based on the Amazon EBS volume limit), but the AWS team can work with you to host larger data sets as well. You must have the right to make the data freely available.
Test Private AWS S3 data. This is for TEST PURPOSES ONLY
This dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: - We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
A centralized repository of up-to-date and curated datasets on or related to the spread and characteristics of the novel coronavirus (SARS-CoV-2) and its associated illness, COVID-19. Globally, there are several efforts underway to gather this data, and we are working with partners to make this crucial data freely available and keep it up-to-date. Hosted on the AWS cloud, we have seeded our curated data lake with COVID-19 case tracking data from Johns Hopkins and The New York Times, hospital bed availability from Definitive Healthcare, and over 45,000 research articles about COVID-19 and related coronaviruses from the Allen Institute for AI.
In the past, the U.S. Geological Survey (USGS) and NASA collaborated on the creation of four global land data sets from Landsat images: one from the 1970s, and one each from circa 1990, 2000, and 2005. Each of these global data sets was created from the primary Landsat sensor in use at the time: the Multispectral Scanner (MSS) in the 1970s, the Thematic Mapper (TM) in 1990, Enhanced Thematic Mapper Plus (ETM+) in 2000, and a combination of TM and ETM+ in 2005.
Global MODIS vegetation indices are designed to provide consistent spatial and temporal comparisons of vegetation conditions. Blue, red, and near-infrared reflectances, centered at 469 nanometers, 645 nanometers, and 858 nanometers, respectively, are used to determine the MODIS daily vegetation indices. The MODIS Normalized Difference Vegetation Index (NDVI) complements NOAA's Advanced Very High Resolution Radiometer (AVHRR) NDVI products and provides continuity for time series historical applications. MODIS also includes a new Enhanced Vegetation Index (EVI) that minimizes canopy background variations and maintains sensitivity over dense vegetation conditions. The EVI also uses the blue band to remove residual atmosphere contamination caused by smoke and sub-pixel thin clouds. The MODIS NDVI and EVI products are computed from atmospherically corrected bi-directional surface reflectances that have been masked for water, clouds, heavy aerosols, and cloud shadows. Global MOD13Q1 data are provided every 16 days at 250-meter spatial resolution as a gridded level-3 product in the Sinusoidal projection. Lacking a 250m blue band, the EVI algorithm uses the 500m blue band to correct for residual atmospheric effects, with negligible spatial artifacts. Vegetation indices are used for global monitoring of vegetation conditions and are used in products displaying land cover and land cover changes. These data may be used as input for modeling global biogeochemical and hydrologic processes and global and regional climate. These data also may be used for characterizing land surface biophysical properties and processes, including primary production and land cover conversion.
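As a rough illustration of how the two indices relate to the band reflectances described above, the following sketch computes NDVI and EVI for a single pixel. The description does not state the EVI coefficients; the defaults used here (gain 2.5, aerosol terms 6 and 7.5, canopy background adjustment 1) are the values commonly cited for the MODIS EVI and should be treated as an assumption, as should the function names and the sample reflectances.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return math.nan  # undefined when both reflectances are zero
    return (nir - red) / denom


def evi(nir: float, red: float, blue: float,
        gain: float = 2.5, c1: float = 6.0, c2: float = 7.5,
        background: float = 1.0) -> float:
    """Enhanced Vegetation Index.

    The default coefficients are assumed (commonly cited MODIS values);
    they are not given in the dataset description.
    """
    denom = nir + c1 * red - c2 * blue + background
    if denom == 0:
        return math.nan
    return gain * (nir - red) / denom


if __name__ == "__main__":
    # Hypothetical reflectances for one vegetated pixel (unitless, 0..1).
    nir, red, blue = 0.5, 0.1, 0.05
    print(f"NDVI = {ndvi(nir, red):.4f}")
    print(f"EVI  = {evi(nir, red, blue):.4f}")
```

The blue term in the EVI denominator is what lets the index compensate for residual atmospheric contamination, which NDVI, using only the red and near-infrared bands, cannot do.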
We present the AWS documentation corpus, an open-book QA dataset, which contains 25,175 documents along with 100 matched questions and answers. These questions are inspired by the authors' interactions with real AWS customers and the questions they asked about AWS services. The data was anonymized and aggregated. All questions in the dataset have a valid, factual and unambiguous answer within the accompanying documents; we deliberately avoided questions that are ambiguous, incomprehensible, opinion-seeking, or not clearly a request for factual information. All questions, answers and accompanying documents in the dataset are annotated by the authors. There are two types of answers: text and yes-no-none (YNN) answers. Text answers range from a few words to a full paragraph sourced from a continuous block of words in a document or from different locations within the same document. Every question in the dataset has a matched text answer. Yes-no-none (YNN) answers can be yes, no, or none depending on the type of question. For example, the question “Can I stop a DB instance that has a read replica?” has a clear yes or no answer, but the question “What is the maximum number of rows in a dataset in Amazon Forecast?” is not a yes or no question and therefore has “None” as the YNN answer. 23 questions have ‘Yes’ YNN answers, 10 questions have ‘No’ YNN answers and 67 questions have ‘None’ YNN answers.
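The answer scheme described above can be sketched as a small record type. This is only an illustration: the class name, field names, and the placeholder answer text are hypothetical and do not reflect the dataset's actual file layout; only the three YNN labels and their counts come from the description.

```python
from dataclasses import dataclass

# The three YNN labels named in the description.
VALID_YNN = ("Yes", "No", "None")

# Label counts as reported in the description.
YNN_COUNTS = {"Yes": 23, "No": 10, "None": 67}


@dataclass(frozen=True)
class QAItem:
    """Hypothetical record: every question has a text answer and a YNN label."""
    question: str
    text_answer: str
    ynn_answer: str

    def __post_init__(self) -> None:
        if self.ynn_answer not in VALID_YNN:
            raise ValueError(f"unexpected YNN label: {self.ynn_answer!r}")

    @property
    def is_yes_no_question(self) -> bool:
        # "None" marks questions that are not yes/no questions.
        return self.ynn_answer != "None"


if __name__ == "__main__":
    item = QAItem(
        question="What is the maximum number of rows in a dataset in Amazon Forecast?",
        text_answer="<text span taken from the accompanying documents>",  # placeholder
        ynn_answer="None",
    )
    print(item.is_yes_no_question)
    print(sum(YNN_COUNTS.values()))  # the three label counts cover all questions
```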
This data set represents the automatic weather station (AWS) data from the 16 stations of the Desert Research Institute network for the period 00 PST March 1 to 00 PST May 1, 2006 during the Terrain-induced Rotor Experiment (T-REX) field campaign. The data have a temporal resolution of 30 seconds, and are in netCDF format files.
As of April 2025, Amazon Web Services (AWS) cloud data centers operated in 14 markets in the Asia-Pacific region, with 44 availability zones in total. An availability zone (AZ) is one or more separate data centers located within specific regions within which cloud services originate and operate. Each AZ has independent power, cooling, and physical security.
https://creativecommons.org/publicdomain/zero/1.0/
Context:- Amazon.com, Inc. is an American multinational technology company specializing in e-commerce, cloud computing, digital streaming, and artificial intelligence. Founded by Jeff Bezos in 1994, Amazon has grown into one of the world’s most valuable companies, revolutionizing online retail and cloud services through its Amazon Web Services (AWS) division.
As of March 2025 Amazon has a market cap of $2.249 Trillion USD. This makes Amazon the world's 4th most valuable company by market cap according to our data. The market capitalization, commonly called market cap, is the total market value of a publicly traded company's outstanding shares and is commonly used to measure how much a company is worth.
Content:-
This dataset covers Amazon’s daily stock price data from 2000 to 2025. It includes information on:
[Image: Stock dataset variables]
Time-period: 2000–2025
Acknowledgements: This dataset belongs to me. I'm sharing it here for free. You may do with it as you wish.
Amazon AWS - Cloud Platforms & Services
Companies using Amazon AWS
We have data on 1,070,574 companies that use Amazon AWS. The companies using Amazon AWS are most often found in the United States and in the Computer Software industry. Amazon AWS is most often used by companies with 10-50 employees and 1M-10M dollars in revenue. Our data for Amazon AWS usage goes back as far as 2 years and 1 month.
What is Amazon AWS?
Amazon Web Services (AWS) is a collection of remote computing services, also called web services, that make up a cloud computing platform offered by Amazon.com.
Top Industries that use Amazon AWS
Looking at Amazon AWS customers by industry, we find that Computer Software (6%) is the largest segment.
Distribution of companies using Amazon AWS by Industry
• Computer Software - 67,537 companies
• Hospitals & Healthcare - 54,293 companies
• Retail - 39,543 companies
• Information Technology and Services - 35,382 companies
• Real Estate - 31,676 companies
• Restaurants - 30,302 companies
• Construction - 29,207 companies
• Automotive - 28,469 companies
• Financial Services - 23,680 companies
• Education Management - 21,548 companies
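The percentage quoted for the largest segment can be checked against these counts with a short calculation. The sketch below uses the total of 1,070,574 companies stated earlier in the listing and the first three industry counts; the function and variable names are illustrative only.

```python
# Total number of companies reported in the listing.
TOTAL_COMPANIES = 1_070_574

# A few of the industry counts from the distribution above.
INDUSTRY_COUNTS = {
    "Computer Software": 67_537,
    "Hospitals & Healthcare": 54_293,
    "Retail": 39_543,
}


def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


if __name__ == "__main__":
    for industry, count in INDUSTRY_COUNTS.items():
        print(f"{industry}: {share_percent(count, TOTAL_COMPANIES):.1f}%")
```

Rounded to a whole number, the Computer Software share is consistent with the 6% stated in the text.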
Top Countries that use Amazon AWS
49% of Amazon AWS customers are in the United States and 7% are in the United Kingdom.
Distribution of companies using Amazon AWS by country
• United States – 616 2275 companies
• United Kingdom – 68 219 companies
• Australia – 44 601 companies
• Canada – 42 770 companies
• Germany – 31 541 companies
• India – 30 949 companies
• Netherlands – 19 543 companies
• Brazil – 17 165 companies
• Italy – 14 876 companies
• Spain – 14 675 companies
Contact Information Fields Include:-
• Company Name
• Business contact number
• Title
• Name
• Email Address
• Country, State, City, Zip Code
• Phone, Mobile and Fax
• Website
• Industry
• SIC & NAICS Code
• Employees Size
• Revenue Size
• And more…
Why Buy AWS Users List from DataCaptive?
• More than 1,070,574 companies
• Responsive database
• Customizable as per your requirements
• Email and Tele-verified list
• Team of 100+ market researchers
• Authentic data sources
What’s in it for you?
By choosing us, you gain the following advantages:
• Locate, target, and prospect leads from 170+ countries
• Design and execute ABM and multi-channel campaigns
• Seamless and smooth pre-and post-sale customer service
• Connect with old leads and build a fruitful customer relationship
• Analyze the market for product development and sales campaigns
• Boost sales and ROI with increased customer acquisition and retention
Our security compliance
We follow globally recognized data laws such as GDPR, CCPA, ACMA, EDPS, CAN-SPAM and ANTI CAN-SPAM to ensure the privacy and security of our database. We engage certified auditors to validate our security and privacy by providing us with certificates to represent our security compliance.
Our USPs- what makes us your ideal choice?
At DataCaptive™, we strive consistently to improve our services and cater to the needs of businesses around the world while keeping up with industry trends.
• Elaborate data mining from credible sources
• 7-tier verification, including manual quality check
• Strict adherence to global and local data policies
• Guaranteed 95% accuracy or cash-back
• Free sample database available on request
Guaranteed benefits of our Amazon AWS users email database!
85% email deliverability and 95% accuracy on other data fields
We understand the importance of data accuracy and employ every avenue to keep our database fresh and updated. We execute a multi-step QC process backed by our patented AI and machine learning tools to prevent anomalies in consistency and data precision. This cycle repeats every 45 days. Although maintaining 100% accuracy is quite impractical, since data such as email addresses, physical addresses, and phone numbers are subject to change, we guarantee 85% email deliverability and 95% accuracy on other data points.
100% replacement in case of hard bounces
Every data point is meticulously verified and then re-verified to ensure you get the best. Data Accuracy is paramount in successfully penetrating a new market or working within a familiar one. We are committed to precision. However, in an unlikely event where hard bounces or inaccuracies exceed the guaranteed percentage, we offer replacement with immediate effect. If need be, we even offer credits and/or refunds for inaccurate contacts.
Other promised benefits
• Contacts are for perpetual usage
• The database comprises consent-based opt-in contacts only
• The list is free of duplicate contacts and generic emails
• Round-the-clock customer service assistance
• 360-degree database solutions
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Card for "amazon-product-data-filter"
Dataset Summary
The Amazon Product Dataset contains product listing data from the Amazon US website. It can be used for various NLP and classification tasks, such as text generation, product type classification, attribute extraction, image recognition and more.
Languages
The text in the dataset is in English.
Dataset Structure
Data Instances
Each data point provides product information, such… See the full description on the dataset page: https://huggingface.co/datasets/iarbel/amazon-product-data-filter.
The Sentinel-2 mission is a land monitoring constellation of two satellites that provide high resolution optical imagery and continuity for the current SPOT and Landsat missions. The mission provides global coverage of the Earth's land surface every 5 days, making the data of great use in ongoing studies. L1C data are available from June 2015 globally. L2A data are available from November 2016 over the Europe region and globally since January 2017.
https://brightdata.com/license
Buy Amazon datasets and get access to over 300 million records from any Amazon domain. Get insights on Amazon products, sellers, and reviews.
OSM is a free, editable map of the world, created and maintained by volunteers. Regular OSM data archives are made available in Amazon S3 in both standard formats (OSM PBF, XML) and cloud-native formats optimized for analytics workloads.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "Amazon-QA"
Dataset Summary
This dataset contains Question and Answer data from Amazon. Disclaimer: The team releasing Amazon-QA did not upload the dataset to the Hub and did not write a dataset card. These steps were done by the Hugging Face team.
Supported Tasks
Sentence Transformers training; useful for semantic search and sentence similarity.
Languages
English.
Dataset Structure
Each example in the dataset… See the full description on the dataset page: https://huggingface.co/datasets/embedding-data/Amazon-QA.
This data set contains 1-minute resolution surface meteorological data from the Atmospheric Boundary Layer Experiment (ABLE) operated by the Argonne National Laboratory in the Walnut River Watershed in Butler County Kansas. The ABLE Automated Weather Station (AWS) Network consists of five stations. Data cover the period from 13 May to 25 June 2002. The data are in columnar ASCII format. Consult the README for more information.
These data were collected during the field phase of the T-REX experiment, 1st March - 30th April 2006. This data set contains AWS data collected at 3 second intervals from 14 sites - one for each station. With the exception of the "Notch" site, all AWS consisted of a 10m instrumented mast; the "Notch" site being 2m high (to minimise visual impact at this exposed site).