23 datasets found

Global net revenue of Amazon 2014-2024, by product group
statista.com
ai-chatbox.pro
Updated Feb 24, 2025
Cite
Statista (2025). Global net revenue of Amazon 2014-2024, by product group [Dataset]. https://www.statista.com/statistics/672747/amazons-consolidated-net-revenue-by-segment/
Explore at:
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In 2024, Amazon's net revenue from its subscription services segment amounted to 44.37 billion U.S. dollars. Subscription services include Amazon Prime, for which Amazon reported 200 million paying members worldwide at the end of 2020. The AWS category generated 107.56 billion U.S. dollars in annual sales. During the most recently reported fiscal year, the company’s net revenue amounted to 638 billion U.S. dollars.

Amazon revenue segments

Amazon is one of the biggest online companies worldwide. In 2019, the company’s revenue increased by 21 percent, compared to Google’s revenue growth during the same fiscal period, which was just 18 percent. The majority of Amazon’s net sales are generated through its North American business segment, which accounted for 236.3 billion U.S. dollars in 2020. The United States is the company’s leading market, followed by Germany and the United Kingdom.

Business segment: Amazon Web Services

Amazon Web Services, commonly referred to as AWS, is one of the strongest-growing business segments of Amazon. AWS is a cloud computing service that provides individuals, companies and governments with a wide range of computing, networking, storage, database, analytics and application services, among many others. As of the third quarter of 2020, AWS accounted for approximately 32 percent of the global cloud infrastructure services vendor market.
Amazon revenue 2004-2024
statista.com
Updated Jun 25, 2025
Cite
Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
Explore at:
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide, United States
Description
From 2004 to 2024, the net revenue of Amazon e-commerce and service sales has increased tremendously. In the fiscal year ending December 31, the multinational e-commerce company's net revenue was almost *** billion U.S. dollars, up from *** billion U.S. dollars in 2023.

Amazon.com, a U.S. e-commerce company originally founded in 1994, is the world’s largest online retailer of books, clothing, electronics, music, and many more goods. As of 2024, the company generates the majority of its net revenues through online retail product sales, followed by third-party retail seller services, cloud computing services, and retail subscription services including Amazon Prime.

From seller to digital environment

Through Amazon, consumers are able to purchase goods at a discounted price from both small and large companies as well as from other users. Both new and used goods are sold on the website. Due to the wide variety of goods available at prices which often undercut local brick-and-mortar retail offerings, Amazon has dominated the retail market. As of 2024, Amazon’s brand value amounts to over *** billion U.S. dollars, topping the likes of companies such as Walmart, Ikea, as well as digital competitors Alibaba and eBay. One of Amazon's first forays into the world of hardware was its e-reader Kindle, one of the most popular e-book readers worldwide. More recently, Amazon has also released several series of own-branded products and a voice-controlled virtual assistant, Alexa.

Headquartered in North America

Due to its location, Amazon offers more services in North America than elsewhere in the world. As a result, the majority of the company’s net revenue in 2023 was earned in the United States, Canada, and Mexico. In 2023, approximately *** billion U.S. dollars was earned in North America compared to only roughly *** billion U.S. dollars internationally.
Amazon Web Services: year-on-year growth 2014-2025
statista.com
Updated May 13, 2025
Cite
Statista (2025). Amazon Web Services: year-on-year growth 2014-2025 [Dataset]. https://www.statista.com/statistics/422273/yoy-quarterly-growth-aws-revenues/
Explore at:
Dataset updated
May 13, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In the first quarter of 2025, the revenue of Amazon Web Services (AWS) grew by 17 percent year on year, a lower growth rate than in the previous three quarters. AWS is one of Amazon’s strongest revenue segments, generating over 115 billion U.S. dollars in 2024 net sales, up from 105 billion U.S. dollars in 2023.

Amazon Web Services

Amazon Web Services (AWS) provides on-demand cloud platforms and APIs to customers through a pay-as-you-go model. AWS launched in 2002 providing general services and tools and produced its first cloud products in 2006. Today, more than 175 different cloud services for a variety of technologies and industries have been released. In 2020, AWS ranked as one of the most popular public cloud infrastructure and platform services for running applications worldwide, ahead of Microsoft Azure and Google cloud services.

Cloud computing

Cloud computing is essentially the delivery of online computing services to customers. As enterprises continually migrate their applications and data to the cloud instead of storing them on local machines, it becomes possible to access resources from different locations. Some of the key services of the AWS ecosystem for cloud applications include storage, database, security tools, and management tools.

AWS is among the most popular cloud providers

Some of the largest globally operating enterprises use AWS for their cloud services, including Netflix, BBC, and Baidu. Accordingly, AWS is one of the leading cloud providers in the global cloud market. Due to its continuously expanding portfolio of services and deepening of expertise, the company continues to be not only an important cloud service provider but also a business partner.
Amazon Prime TV Shows
kaggle.com
Updated Oct 13, 2020
Cite
Neelima Jauhari (2020). Amazon Prime TV Shows [Dataset]. https://www.kaggle.com/nilimajauhari/amazon-prime-tv-shows/discussion
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 13, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Neelima Jauhari
Description
Context

This data set was created so as to analyze the latest shows available on Amazon Prime as well as the shows with a high rating.

Content

The data set contains the following fields:

- Name of the show (title).
- Year of release: the year in which the show was released or went on air.
- No. of seasons: the number of seasons of the show that are available on Prime.
- Language: the audio language of the show; the language of the subtitles is not taken into consideration.
- Genre: the genre of the show, such as Kids, Drama, Action and so on.
- IMDb rating: the IMDb rating of the show; for many TV shows and kids' shows the rating was not available.
- Age of viewers: the age of the target audience; "All" means that the content is not restricted to any particular age group and all audiences can view it.

Acknowledgements

I have collected this data from Amazon Prime's Website.

Inspiration

Many TV shows have high IMDb ratings but are not viewed much, because audiences are not aware of them or they are not widely advertised. I created this data set to find the highest-rated shows in each category or in a particular genre.
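
As a rough illustration of the stated goal (the highest-rated show per genre), here is a minimal Python sketch. The column names `name`, `genre`, and `imdb_rating` and the sample rows are hypothetical; the real file's headers and values may differ. Rows with a missing rating are skipped, since the description notes that ratings are not always available.

```python
import csv
import io

# Hypothetical sample; the real file's column names and values may differ.
SAMPLE = """name,genre,imdb_rating
Show A,Drama,8.7
Show B,Drama,9.1
Show C,Kids,
Show D,Kids,7.4
"""


def top_rated_by_genre(rows):
    """Return {genre: (name, rating)} for the best-rated show in each genre."""
    best = {}
    for row in rows:
        raw = row["imdb_rating"].strip()
        if not raw:
            # The rating is missing for some shows, so skip those rows.
            continue
        rating = float(raw)
        genre = row["genre"]
        if genre not in best or rating > best[genre][1]:
            best[genre] = (row["name"], rating)
    return best


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
top = top_rated_by_genre(rows)
for genre, (name, rating) in sorted(top.items()):
    print(f"{genre}: {name} ({rating})")
```

To run this against the downloaded file, replace `io.StringIO(SAMPLE)` with an opened file handle and adjust the column names to match the actual header row.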
Amazon Product Reviews
kaggle.com
Updated Nov 26, 2023
Cite
The Devastator (2023). Amazon Product Reviews [Dataset]. https://www.kaggle.com/datasets/thedevastator/amazon-product-reviews/discussion
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 26, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Amazon Product Reviews

18 Years of Customer Ratings and Experiences

By Huggingface Hub [source]

About this dataset

The Amazon Reviews Polarity Dataset covers eighteen years of customers' ratings and reviews from Amazon.com. Drawing from a pool of over 35 million customer reviews, this dataset presents a broad spectrum of customer opinions on products they have bought or used. It is useful for improving products and services, as it contains comprehensive information regarding customers' experiences with a product, including ratings, titles, and plaintext content. The dataset contains both customer-specific data and product information, which supports deeper analytics aimed at providing tailored solutions for customers. Has your product been favored by the majority? Are there any aspects that need extra care? Use Amazon Reviews Polarity to gain deeper insights into what your customers want.

How to use the dataset

1. Analyze customer ratings to identify trends: Look at how many customers have rated the same product or service with the same score (e.g., 4 stars). Examining the common sentiment across those reviews shows what customers like or don’t like about it. Identifying these patterns can help you decide which features of your products or services to emphasize in order to boost sales and satisfaction rates.

2. Review content analysis: Analyzing review content is one of the best ways to gauge customer sentiment toward specific features or aspects of a product or service. Natural language processing tools such as Word2Vec, Latent Dirichlet Allocation (LDA), or even simple keyword search algorithms can quickly reveal the general topics discussed in relation to your product or service across multiple reviews, allowing you to quickly pinpoint areas that may need improvement for particular items within your lines of business.

3. Track associated scores over time: By tracking customer ratings over time, you may be able to better understand when there has been an issue with something specific related to your product or service, such as a negative response toward a feature that was introduced, proved unpopular among customers, and was removed shortly afterwards. This can save time and money by identifying issues before they become widespread concerns among larger groups of consumers who spend money on your company's items.

4. Visualize sentiment data over time: Visualizations such as bar graphs can help identify trends across different categories more quickly than raw numbers alone. Combining numeric values with color differences between scores makes anomalies easier to spot, which speeds up working out why certain spikes occurred while others stayed stable (or vice versa) when comparing similar data points in time-series visualizations.

Research Ideas

Developing a customer sentiment analysis system that can be used to quickly analyze the sentiment of reviews and identify any potential areas of improvement.

Building a product recommendation service that takes into account the ratings and reviews of customers when recommending similar products they may be interested in purchasing.

Training a machine learning model to accurately predict customers’ ratings on new products they have not yet tried, and leveraging this for further product development and optimization initiatives.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: train.csv

| Column name | Description                                                        |
|:------------|:-------------------------------------------------------------------|
| label       | The sentiment of the review, either positive or negative. (String) |
| title       | The title of the review. (String)                                  |
| ...         | ...                                                                |
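
A minimal sketch of reading such a file and counting reviews per sentiment label, using only the two columns visible in the truncated listing above (`label` and `title`). The sample rows and the exact label strings are assumptions; check the real `train.csv` before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; the label strings used in the real train.csv may differ.
SAMPLE = """label,title
positive,Great value
negative,Stopped working
positive,Would buy again
"""


def label_counts(handle):
    """Count how many reviews carry each sentiment label."""
    reader = csv.DictReader(handle)
    return Counter(row["label"] for row in reader)


counts = label_counts(io.StringIO(SAMPLE))
for label, count in sorted(counts.items()):
    print(label, count)
```

For the actual file, pass `open("train.csv", newline="", encoding="utf-8")` instead of the in-memory sample.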
Amazon Business Research Analyst Dataset
kaggle.com
Updated Nov 4, 2022
Cite
Vikramjeet Singh (2022). Amazon Business Research Analyst Dataset [Dataset]. https://www.kaggle.com/vikramxd/amazon-business-research-analyst-dataset/discussion
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 4, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vikramjeet Singh
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Amazon.com strives to be Earth's most customer-centric company where people can find and discover virtually anything they want to buy online. By giving customers more of what they want - low prices, vast selection, and convenience - Amazon.com continues to grow and evolve as a world-class e-commerce platform. Amazon's evolution from Web site to an e-commerce partner to a development platform is driven by the spirit of innovation that is part of the company's DNA. The world's brightest technology minds come to Amazon.com to research and develop technology that improves the lives of shoppers and sellers around the world.

The Business Research Analyst (RA) will be responsible for continuous improvement projects across the RBS teams leading to each of its delivery levers. The long-term goal of the RA role is to eliminate defects that impact selling partners or customer experience, and the secondary goal is to enhance GMS (general merchandise sales) and FCF (free cash flow). This will require collaboration with local and global teams that have process and technical expertise. The RA is an Independent Contributor (IC) who performs big data analysis to identify defect patterns and process gaps for complex problems, creates long-term solutions to eliminate defects and issues, and reports impact and results. The RA should work across VP teams and regional and/or cross-regional organizations to drive improvements, and leads projects and opportunities across Operations (FCs, CS, Supply Chain, Transportation, Engineering) that are business critical and may be global in nature.

Evaluation metric: 100 * metric.r2_score(y_pred, y)
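
To make the metric concrete, here is a small pure-Python sketch of the coefficient of determination, R² = 1 - SS_res / SS_tot, scaled by 100. The function names and sample values are illustrative, and it is an assumption that `metric` in the line above refers to scikit-learn's `metrics` module. Note that scikit-learn's `r2_score` takes the ground truth first, `r2_score(y_true, y_pred)`, and that R² is not symmetric, so swapping the arguments changes the score.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluation_metric(y_true, y_pred):
    """R^2 scaled to a 0-100 range for a perfect-to-baseline model."""
    return 100.0 * r2_score(y_true, y_pred)


# Illustrative values only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

score = evaluation_metric(y_true, y_pred)
swapped = evaluation_metric(y_pred, y_true)  # arguments reversed
print(round(score, 2), round(swapped, 2))
```

The two printed values differ because the total sum of squares is computed from whichever list is passed as the ground truth.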
AWS Documentation Dataset
paperswithcode.com
Cite
Sia Gholami; Mehdi Noori, AWS Documentation Dataset [Dataset]. https://paperswithcode.com/dataset/aws-documentation
Explore at:
Authors
Sia Gholami; Mehdi Noori
Description
We present the AWS documentation corpus, an open-book QA dataset, which contains 25,175 documents along with 100 matched questions and answers. These questions are inspired by the authors' interactions with real AWS customers and the questions they asked about AWS services. The data was anonymized and aggregated. All questions in the dataset have a valid, factual and unambiguous answer within the accompanying documents; we deliberately avoided questions that are ambiguous, incomprehensible, opinion-seeking, or not clearly a request for factual information. All questions, answers and accompanying documents in the dataset are annotated by the authors. There are two types of answers: text and yes-no-none (YNN) answers. Text answers range from a few words to a full paragraph sourced from a continuous block of words in a document or from different locations within the same document. Every question in the dataset has a matched text answer. Yes-no-none (YNN) answers can be yes, no, or none depending on the type of question. For example, the question “Can I stop a DB instance that has a read replica?” has a clear yes or no answer, but the question “What is the maximum number of rows in a dataset in Amazon Forecast?” is not a yes or no question and therefore has “None” as the YNN answer. 23 questions have ‘Yes’ YNN answers, 10 questions have ‘No’ YNN answers and 67 questions have ‘None’ YNN answers.
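
One possible way to model the two answer types in code is sketched below. The `QAPair` class, its field names, and the placeholder answer text are hypothetical and are not the dataset's actual schema; only the question wording and the reported Yes/No/None counts come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QAPair:
    """Hypothetical record layout; the dataset's real schema may differ."""
    question: str
    text_answer: str            # every question has a matched text answer
    ynn_answer: Optional[bool]  # True = yes, False = no, None = not a yes/no question


def ynn_label(answer: Optional[bool]) -> str:
    """Map the three-valued YNN answer to its label."""
    if answer is None:
        return "None"
    return "Yes" if answer else "No"


# Distribution reported in the description above.
REPORTED_YNN_COUNTS = {"Yes": 23, "No": 10, "None": 67}

example = QAPair(
    question="What is the maximum number of rows in a dataset in Amazon Forecast?",
    text_answer="<answer text taken from the documentation>",  # placeholder
    ynn_answer=None,
)

print(ynn_label(example.ynn_answer))
print(sum(REPORTED_YNN_COUNTS.values()))
```

The reported counts add up to the 100 matched questions, which is a quick consistency check when loading the data.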
Amazon Email Receipt Data | Consumer Transaction Data | Asia, EMEA, LATAM,...
datarade.ai
.json, .xml, .csv
Updated Oct 12, 2023
Cite
Measurable AI (2023). Amazon Email Receipt Data | Consumer Transaction Data | Asia, EMEA, LATAM, MENA, India | Granular & Aggregate Data available [Dataset]. https://datarade.ai/data-products/amazon-email-receipt-data-consumer-transaction-data-asia-measurable-ai
Explore at:
Available download formats: .json, .xml, .csv
Dataset updated
Oct 12, 2023
Dataset authored and provided by
Measurable AI
Area covered
Asia, Latin America, Brazil, Chile, Colombia, Pakistan, United States of America, Thailand, Japan, Argentina, Malaysia, Mexico
Description
The Measurable AI Amazon Consumer Transaction Dataset is a leading source of email receipts and consumer transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.

We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.

Use Cases
Our clients leverage our datasets to produce actionable consumer insights such as:
- Market share analysis
- User behavioral traits (e.g. retention rates)
- Average order values
- Promotional strategies used by the key players
Several of our clients also use our datasets for forecasting and understanding industry trends better.

Coverage
- Asia (Japan)
- EMEA (Spain, United Arab Emirates)

Granular Data
Itemized, high-definition data per transaction level with metrics such as:
- Order value
- Items ordered
- No. of orders per user
- Delivery fee
- Service fee
- Promotions used
- Geolocation data and more

Aggregate Data
- Weekly/monthly order volume
- Revenue delivered in aggregate form, with historical data dating back to 2018
All the transactional e-receipts are sent from app to users’ registered accounts.

Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.

Our dataset is GDPR compliant, contains no PII, and is aggregated and anonymized with user consent. Contact business@measurable.ai for a data dictionary and to find out our volume in each country.
PIPr: A Dataset of Public Infrastructure as Code Programs
data.niaid.nih.gov
zenodo.org
Updated Nov 28, 2023
Cite
Salvaneschi, Guido (2023). PIPr: A Dataset of Public Infrastructure as Code Programs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8262770
Explore at:
Dataset updated
Nov 28, 2023
Dataset provided by
Salvaneschi, Guido
Sokolowski, Daniel
Spielmann, David
License
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
License information was derived automatically
Description
Programming Languages Infrastructure as Code (PL-IaC) enables IaC programs written in general-purpose programming languages like Python and TypeScript. The currently available PL-IaC solutions are Pulumi and the Cloud Development Kits (CDKs) of Amazon Web Services (AWS) and Terraform. This dataset provides metadata and initial analyses of all public GitHub repositories in August 2022 with an IaC program, including their programming languages, applied testing techniques, and licenses. Further, we provide a shallow copy of the head state of those 7104 repositories whose licenses permit redistribution. The dataset is available under the Open Data Commons Attribution License (ODC-By) v1.0. Contents:

- metadata.zip: The dataset metadata and analysis results as CSV files.
- scripts-and-logs.zip: Scripts and logs of the dataset creation.
- LICENSE: The Open Data Commons Attribution License (ODC-By) v1.0 text.
- README.md: This document.
- redistributable-repositiories.zip: Shallow copies of the head state of all redistributable repositories with an IaC program.

This artifact is part of the ProTI Infrastructure as Code testing project: https://proti-iac.github.io.

Metadata

The dataset's metadata comprises three tabular CSV files containing metadata about all analyzed repositories, IaC programs, and testing source code files.

repositories.csv:

- ID (integer): GitHub repository ID
- url (string): GitHub repository URL
- downloaded (boolean): Whether cloning the repository succeeded
- name (string): Repository name
- description (string): Repository description
- licenses (string, list of strings): Repository licenses
- redistributable (boolean): Whether the repository's licenses permit redistribution
- created (string, date & time): Time of the repository's creation
- updated (string, date & time): Time of the last update to the repository
- pushed (string, date & time): Time of the last push to the repository
- fork (boolean): Whether the repository is a fork
- forks (integer): Number of forks
- archive (boolean): Whether the repository is archived
- programs (string, list of strings): Project file path of each IaC program in the repository

programs.csv:

- ID (string): Project file path of the IaC program
- repository (integer): GitHub repository ID of the repository containing the IaC program
- directory (string): Path of the directory containing the IaC program's project file
- solution (string, enum): PL-IaC solution of the IaC program ("AWS CDK", "CDKTF", "Pulumi")
- language (string, enum): Programming language of the IaC program (enum values: "csharp", "go", "haskell", "java", "javascript", "python", "typescript", "yaml")
- name (string): IaC program name
- description (string): IaC program description
- runtime (string): Runtime string of the IaC program
- testing (string, list of enum): Testing techniques of the IaC program (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
- tests (string, list of strings): File paths of IaC program's tests

testing-files.csv:

- file (string): Testing file path
- language (string, enum): Programming language of the testing file (enum values: "csharp", "go", "java", "javascript", "python", "typescript")
- techniques (string, list of enum): Testing techniques used in the testing file (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
- keywords (string, list of enum): Keywords found in the testing file (enum values: "/go/auto", "/testing/integration", "@AfterAll", "@BeforeAll", "@Test", "@aws-cdk", "@aws-cdk/assert", "@pulumi.runtime.test", "@pulumi/", "@pulumi/policy", "@pulumi/pulumi/automation", "Amazon.CDK", "Amazon.CDK.Assertions", "Assertions_", "HashiCorp.Cdktf", "IMocks", "Moq", "NUnit", "PolicyPack(", "ProgramTest", "Pulumi", "Pulumi.Automation", "PulumiTest", "ResourceValidationArgs", "ResourceValidationPolicy", "SnapshotTest()", "StackValidationPolicy", "Testing", "Testing_ToBeValidTerraform(", "ToBeValidTerraform(", "Verifier.Verify(", "WithMocks(", "[Fact]", "[TestClass]", "[TestFixture]", "[TestMethod]", "[Test]", "afterAll(", "assertions", "automation", "aws-cdk-lib", "aws-cdk-lib/assert", "aws_cdk", "aws_cdk.assertions", "awscdk", "beforeAll(", "cdktf", "com.pulumi", "def test_", "describe(", "github.com/aws/aws-cdk-go/awscdk", "github.com/hashicorp/terraform-cdk-go/cdktf", "github.com/pulumi/pulumi", "integration", "junit", "pulumi", "pulumi.runtime.setMocks(", "pulumi.runtime.set_mocks(", "pulumi_policy", "pytest", "setMocks(", "set_mocks(", "snapshot", "software.amazon.awscdk.assertions", "stretchr", "test(", "testing", "toBeValidTerraform(", "toMatchInlineSnapshot(", "toMatchSnapshot(", "to_be_valid_terraform(", "unittest", "withMocks(")
- program (string): Project file path of the testing file's IaC program

Dataset Creation

scripts-and-logs.zip contains all scripts and logs of the creation of this dataset. In it, executions/executions.log documents the commands that generated this dataset in detail. On a high level, the dataset was created as follows:

1. A list of all repositories with a PL-IaC program configuration file was created using search-repositories.py (documented below). The execution took two weeks due to the non-deterministic nature of GitHub's REST API, causing excessive retries.
2. A shallow copy of the head of all repositories was downloaded using download-repositories.py (documented below).
3. Using analysis.ipynb, the repositories were analyzed for the programs' metadata, including the used programming languages and licenses.
4. Based on the analysis, all repositories with at least one IaC program and a redistributable license were packaged into redistributable-repositiories.zip, excluding any node_modules and .git directories.

Searching Repositories

The repositories are searched through search-repositories.py and saved in a CSV file. The script takes these arguments in the following order:

1. GitHub access token.
2. Name of the CSV output file.
3. Filename to search for.
4. File extensions to search for, separated by commas.
5. Min file size for the search (for all files: 0).
6. Max file size for the search or * for unlimited (for all files: *).

Pulumi projects have a Pulumi.yaml or Pulumi.yml (case-sensitive file name) file in their root folder, i.e., (3) is Pulumi and (4) is yml,yaml. https://www.pulumi.com/docs/intro/concepts/project/

AWS CDK projects have a cdk.json (case-sensitive file name) file in their root folder, i.e., (3) is cdk and (4) is json. https://docs.aws.amazon.com/cdk/v2/guide/cli.html

CDK for Terraform (CDKTF) projects have a cdktf.json (case-sensitive file name) file in their root folder, i.e., (3) is cdktf and (4) is json. https://www.terraform.io/cdktf/create-and-deploy/project-setup

Limitations

The script uses the GitHub code search API and inherits its limitations:

- Only forks with more stars than the parent repository are included.
- Only the repositories' default branches are considered.
- Only files smaller than 384 KB are searchable.
- Only repositories with fewer than 500,000 files are considered.
- Only repositories that have had activity or have been returned in search results in the last year are considered.

More details: https://docs.github.com/en/search-github/searching-on-github/searching-code

The results of the GitHub code search API are not stable. However, the generally more robust GraphQL API does not support searching for files in repositories: https://stackoverflow.com/questions/45382069/search-for-code-in-github-using-graphql-v4-api

Downloading Repositories

download-repositories.py downloads all repositories in CSV files generated through search-repositories.py and generates an overview CSV file of the downloads. The script takes these arguments in the following order:

1. Name of the repositories CSV files generated through search-repositories.py, separated by commas.
2. Output directory to download the repositories to.
3. Name of the CSV output file.

The script only downloads a shallow recursive copy of the HEAD of the repo, i.e., only the main branch's most recent state, including submodules, without the rest of the git history. Each repository is downloaded to a subfolder named by the repository's ID.
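
The documented argument orders of the two scripts can be sketched as Python helpers that assemble the command lines. The filename and extension values per solution come from the description above; invoking the scripts through a `python` interpreter, the helper names, and the output file and directory names are assumptions made for illustration only.

```python
import shlex

# Arguments (3) filename and (4) extensions per PL-IaC solution, as described above.
SEARCH_TARGETS = {
    "Pulumi": ("Pulumi", "yml,yaml"),
    "AWS CDK": ("cdk", "json"),
    "CDKTF": ("cdktf", "json"),
}


def search_args(token, output_csv, solution, min_size="0", max_size="*"):
    """Build the argument list for search-repositories.py in the documented order."""
    filename, extensions = SEARCH_TARGETS[solution]
    return [
        "python", "search-repositories.py",  # interpreter invocation is an assumption
        token, output_csv, filename, extensions, min_size, max_size,
    ]


def download_args(repo_csvs, output_dir, overview_csv):
    """Build the argument list for download-repositories.py in the documented order."""
    return [
        "python", "download-repositories.py",  # interpreter invocation is an assumption
        ",".join(repo_csvs), output_dir, overview_csv,
    ]


# Hypothetical token placeholder and file names.
for solution, csv_name in [("Pulumi", "pulumi.csv"), ("AWS CDK", "cdk.csv"), ("CDKTF", "cdktf.csv")]:
    print(shlex.join(search_args("<TOKEN>", csv_name, solution)))

print(shlex.join(download_args(["pulumi.csv", "cdk.csv", "cdktf.csv"], "repos", "downloads.csv")))
```

`shlex.join` quotes arguments such as `*` and `<TOKEN>` so the printed lines are safe to paste into a shell; the lists themselves could be passed to `subprocess.run` unchanged.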
amazon_us_reviews
tensorflow.org
huggingface.co
Updated Dec 6, 2022
Cite
(2022). amazon_us_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/amazon_us_reviews
Explore at:
Dataset updated
Dec 6, 2022
Description
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

Over 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

Each dataset contains the following columns:

- marketplace - 2 letter country code of the marketplace where the review was written.
- customer_id - Random identifier that can be used to aggregate reviews written by a single author.
- review_id - The unique ID of the review.
- product_id - The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
- product_parent - Random identifier that can be used to aggregate reviews for the same product.
- product_title - Title of the product.
- product_category - Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
- star_rating - The 1-5 star rating of the review.
- helpful_votes - Number of helpful votes.
- total_votes - Number of total votes the review received.
- vine - Review was written as part of the Vine program.
- verified_purchase - The review is on a verified purchase.
- review_headline - The title of the review.
- review_body - The review text.
- review_date - The date the review was written.
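
Because the files are tab delimited with no quote or escape characters, a parser should be told not to treat quotation marks specially. The sketch below shows one way to do that with Python's standard `csv` module; the two sample rows use only a subset of the documented columns and are made up for illustration.

```python
import csv
import io

# Hypothetical two-row sample using a subset of the documented columns.
SAMPLE_TSV = (
    "marketplace\treview_id\tstar_rating\thelpful_votes\ttotal_votes\treview_headline\n"
    'US\tR1\t5\t3\t4\tSays "great" on the box\n'
    "US\tR2\t2\t0\t1\tNot for me\n"
)


def read_reviews(handle):
    """Yield one dict per review, converting the numeric columns to int."""
    # QUOTE_NONE keeps quotation marks as literal text, matching the
    # "no quote and escape characters" description of the files.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for column in ("star_rating", "helpful_votes", "total_votes"):
            row[column] = int(row[column])
        yield row


reviews = list(read_reviews(io.StringIO(SAMPLE_TSV)))
average_rating = sum(r["star_rating"] for r in reviews) / len(reviews)
print(reviews[0]["review_headline"])
print(average_rating)
```

With the default quoting mode, a stray quotation mark inside a review could cause several lines to be merged into one field, which is why the quoting mode is set explicitly here.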

To use this dataset:

import tensorflow_datasets as tfds

# Load the training split and print the first four examples.
ds = tfds.load('amazon_us_reviews', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Amazon Seller Directory 2025 | Amazon Seller Database USA, FR, Germany, ESP,...
datarade.ai
.csv, .xls
Updated Feb 21, 2022
Cite
Lead for Business (2022). Amazon Seller Directory 2025 | Amazon Seller Database USA, FR, Germany, ESP, UK, Italy, CA | List of Amazon Sellers | 200K+ Amazon Seller Leads| [Dataset]. https://datarade.ai/data-products/amazon-seller-directory-amazon-fba-seller-database-with-sto-lead-for-business
Explore at:
Available download formats: .csv, .xls
Dataset updated
Feb 21, 2022
Dataset authored and provided by
Lead for Business
Area covered
United Kingdom, United States, Italy
Description
• 500K+ Active Amazon Stores • 200K+ Seller Leads • Platforms USA, Germany, UK, Italy, France, Spain, CA • C-Suite/Marketing/Sales Contacts • FBA/Non-FBA Sellers • 15+ data points available for each prospect • Filter your leads by store size, niche, location, and many more • 100% manually researched and verified.

For over a decade, we have been manually collecting Amazon seller data from various data sources such as Amazon, LinkedIn, Google, and others. We specialize in gathering valid, high-potential data so that you can run ad campaigns and begin selling without hesitation.

We designed our data packages for all types of organizations, thus they are reasonably priced. We are always trying to reduce our prices to better suit all of your requirements.

So, if you’re looking to reach out to your targeted Amazon sellers, now is the greatest time to do so and offer your goods, services, and promotions. You can get your targeted Amazon Sellers List with seller contact information.

Alternatively, if you provide Amazon Seller Names or IDs, we will conduct Custom Research and deliver the customized list to you.

Data Points Available:

Full Name, LinkedIn URL, Direct Email, Generic Phone Number, Business Name and Address, Company Website, Seller IDs and URLs, Revenue, Seller Review Count, Niche, FBA/Non-FBA, Country, and more
1000 Genomes Project and AWS
rrid.site
neuinfo.org
+2 more
Updated Jun 28, 2025
+ more versions
Cite
(2025). 1000 Genomes Project and AWS [Dataset]. http://identifiers.org/RRID:SCR_008801
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008801
Dataset updated
Jun 28, 2025
Description
A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, which researchers can now access on AWS free of charge for use in disease research. The dataset is available to all via Amazon S3 at: http://s3.amazonaws.com/1000genomes
The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year.
Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be seamlessly accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the highly scalable compute resources needed to take advantage of these large data collections. AWS is storing the public data sets at no charge to the community. Researchers pay only for the additional AWS resources they need for further processing or analysis of the data.
All 200 TB of the latest 1000 Genomes Project data is available in a public Amazon S3 bucket. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to dive into this data without the usual capital investment required to work with data at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can crunch the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
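As an illustration of the "simple HTTP requests" mentioned above, the sketch below builds an S3 list request for the bucket URL given in the description and parses the XML listing that S3 returns. It is a sketch under stated assumptions rather than an official client: the function names are illustrative, the prefix used in the comment is a placeholder, and fetch_keys needs network access and assumes the bucket still allows anonymous listing.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

BUCKET_URL = "http://s3.amazonaws.com/1000genomes"  # URL from the description
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}  # S3 XML namespace


def build_list_url(prefix="", max_keys=10):
    """Return the URL for an S3 ListObjectsV2 request on the bucket."""
    query = urlencode({"list-type": "2", "prefix": prefix, "max-keys": max_keys})
    return f"{BUCKET_URL}?{query}"


def parse_keys(xml_text):
    """Extract object keys from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]


def fetch_keys(prefix="", max_keys=10):
    """Perform the HTTP request (requires network access)."""
    with urlopen(build_list_url(prefix, max_keys), timeout=30) as resp:
        return parse_keys(resp.read())

# Example (placeholder prefix, not taken from the description):
# fetch_keys(prefix="some/prefix/", max_keys=5)
```

For heavier use, the AWS SDKs named in the description offer the same listing and download operations with retries and pagination handled for you.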
Data from: Amazon Rainforest Wildfires Rumor Detection
data.mendeley.com
Updated Dec 6, 2022
Cite
Bram Janssens (2022). Amazon Rainforest Wildfires Rumor Detection [Dataset]. http://doi.org/10.17632/m7k4gsffry.1
Explore at:
Unique identifier
https://doi.org/10.17632/m7k4gsffry.1
Dataset updated
Dec 6, 2022
Authors
Bram Janssens
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Amazon Rainforest
Description
The data set contains information about the Amazon rainforest wildfires that took place in 2019. Twitter data has been collected between August 21, 2019 and September 27, 2019 based on the following hashtags: #PrayforAmazonas, #AmazonRainforest, and #AmazonFire.

The goal of this data set is to detect whether a tweet is a rumor or not (given by the 'label' column). A tweet that is identified as a rumor is labeled 1, and 0 otherwise. The tweets were labeled by two independent annotators using the following guidelines. Whether a tweet is a rumor or not depends on 3 important aspects: (1) A rumor is a piece of information that is unverified or not confirmed by official instances. In other words, it does not matter whether the information turns out to be true or false in the future. (2) More specifically, a tweet is a rumor if the information is unverified at the time of posting. (3) For a tweet to be a rumor, it should contain an assertion, meaning the author of the tweet commits to the truth of the message.

In sum, the annotators indicated that a tweet is a rumor if it consisted of an assertion giving information that is unverifiable at the time of posting. Practically, to check whether the information in a tweet was verified or confirmed by official instances at the moment of tweeting, the annotators used BBC News and Reuters. After all the tweets were labeled, the annotators re-iterated over the tweets they disagreed on to produce the final tweet label.

Besides the label indicating whether a tweet is a rumor or not (i.e., ‘label’), the data set contains the tweet itself (i.e., ‘full_text’), and additional metadata (e.g., ‘created_at’, ‘favorite_count’). In total, the data set contains 1,392 observations of which 184 (13%) are identified as rumors.

This data set can be used by researchers to make rumor detection models (i.e., statistical, machine learning and deep learning models) using both unstructured (i.e., textual) and structured data.
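Because only 184 of the 1,392 tweets are rumors, models trained on this data face a clear class imbalance. The short sketch below reproduces the reported share and derives per-class weights with the common "total / (number of classes x class count)" heuristic. The weighting scheme is an assumption for illustration, not something the dataset authors prescribe.

```python
TOTAL_TWEETS = 1392   # observations reported in the description
RUMOR_TWEETS = 184    # tweets labeled 1 (rumor)


def class_shares(total, positives):
    """Return the fraction of rumor and non-rumor tweets."""
    negatives = total - positives
    return {"rumor": positives / total, "non_rumor": negatives / total}


def balanced_class_weights(total, positives):
    """Weight each class by total / (2 * class count), keyed by label value."""
    negatives = total - positives
    return {1: total / (2 * positives), 0: total / (2 * negatives)}


shares = class_shares(TOTAL_TWEETS, RUMOR_TWEETS)
weights = balanced_class_weights(TOTAL_TWEETS, RUMOR_TWEETS)
print(f"rumor share: {shares['rumor']:.1%}")
print(f"class weights: {weights}")
```

Weights of this kind can be passed to most classifiers so that errors on the rare rumor class count more heavily during training.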
More than 1,070,574 Verified Contacts of companies that use Amazon AWS
datarade.ai
Updated Aug 20, 2021
Cite
DataCaptive (2021). More than 1,070,574 Verified Contacts of companies that use Amazon AWS [Dataset]. https://datarade.ai/data-providers/datacaptive/data-products/more-than-1-070-574-verified-contacts-of-companies-that-use-a-datacaptive
Explore at:
Available download formats: .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Aug 20, 2021
Dataset authored and provided by
DataCaptive
Area covered
Rwanda, Singapore, Virgin Islands (British), Niger, Saint Helena, Iceland, Tonga, British Indian Ocean Territory, Kyrgyzstan, Tunisia
Description
Amazon AWS - Cloud Platforms & Services

Companies using Amazon AWS

We have data on 1,070,574 companies that use Amazon AWS. The companies using Amazon AWS are most often found in the United States and in the Computer Software industry. Amazon AWS is most often used by companies with 10-50 employees and 1M-10M dollars in revenue. Our data for Amazon AWS usage goes back as far as 2 years and 1 month.

What is Amazon AWS?

Amazon Web Services (AWS) is a collection of remote computing services, also called web services that make up a cloud computing platform offered by Amazon.com.

Top Industries that use Amazon AWS

Looking at Amazon AWS customers by industry, we find that Computer Software (6%) is the largest segment.

Distribution of companies using Amazon AWS by Industry

• Computer Software - 67,537 companies
• Hospitals & Healthcare - 54,293 companies
• Retail - 39,543 companies
• Information Technology and Services - 35,382 companies
• Real Estate - 31,676 companies
• Restaurants - 30,302 companies
• Construction - 29,207 companies
• Automotive - 28,469 companies
• Financial Services - 23,680 companies
• Education Management - 21,548 companies

Top Countries that use Amazon AWS

49% of Amazon AWS customers are in the United States and 7% are in the United Kingdom.

Distribution of companies using Amazon AWS by country

• United States – 616 2275 companies
• United Kingdom – 68 219 companies
• Australia – 44 601 companies
• Canada – 42 770 companies
• Germany – 31 541 companies
• India – 30 949 companies
• Netherlands – 19 543 companies
• Brazil – 17 165 companies
• Italy – 14 876 companies
• Spain – 14 675 companies

Contact information fields include:

• Company Name
• Business contact number
• Title
• Name
• Email Address
• Country, State, City, Zip Code
• Phone, Mobile and Fax
• Website
• Industry
• SIC & NAICS Code
• Employee Size
• Revenue Size
• And more…

Why Buy AWS Users List from DataCaptive?

• More than 1,070,574 companies
• Responsive database • Customizable as per your requirements • Email and Tele-verified list • Team of 100+ market researchers • Authentic data sources

What’s in for you?

By choosing us, you gain the following advantages:

• Locate, target, and prospect leads from 170+ countries • Design and execute ABM and multi-channel campaigns • Seamless and smooth pre-and post-sale customer service • Connect with old leads and build a fruitful customer relationship • Analyze the market for product development and sales campaigns • Boost sales and ROI with increased customer acquisition and retention

Our security compliance

We comply with globally recognized data laws such as GDPR, CCPA, ACMA, EDPS, CAN-SPAM and ANTI CAN-SPAM to ensure the privacy and security of our database. We engage certified auditors to validate our security and privacy by providing us with certificates to represent our security compliance.

Our USPs- what makes us your ideal choice?

At DataCaptive™, we strive consistently to improve our services and cater to the needs of businesses around the world while keeping up with industry trends.

• Elaborate data mining from credible sources • 7-tier verification, including manual quality check • Strict adherence to global and local data policies • Guaranteed 95% accuracy or cash-back • Free sample database available on request

Guaranteed benefits of our Amazon AWS users email database!

85% email deliverability and 95% accuracy on other data fields

We understand the importance of data accuracy and employ every avenue to keep our database fresh and updated. We execute a multi-step QC process backed by our patented AI and machine learning tools to prevent anomalies in consistency and data precision. This cycle repeats every 45 days. Although maintaining 100% accuracy is impractical, since data such as email addresses, physical addresses, and phone numbers are subject to change, we guarantee 85% email deliverability and 95% accuracy on other data points.

100% replacement in case of hard bounces

Every data point is meticulously verified and then re-verified to ensure you get the best. Data Accuracy is paramount in successfully penetrating a new market or working within a familiar one. We are committed to precision. However, in an unlikely event where hard bounces or inaccuracies exceed the guaranteed percentage, we offer replacement with immediate effect. If need be, we even offer credits and/or refunds for inaccurate contacts.

Other promised benefits

• Contacts are for perpetual use
• The database comprises consent-based opt-in contacts only
• The list is free of duplicate contacts and generic emails
• Round-the-clock customer service assistance
• 360-degree database solutions
Data from: RB - Rio de Janeiro Botanical Garden Herbarium Collection
gbif.org
es.bionomia.net
+2 more
Updated Apr 1, 2025
Cite
Rafaela Forzza; Luís Alexandre Silva (2025). RB - Rio de Janeiro Botanical Garden Herbarium Collection [Dataset]. http://doi.org/10.15468/7ep9i2
Explore at:
Unique identifier
https://doi.org/10.15468/7ep9i2
Dataset updated
Apr 1, 2025
Dataset provided by
Global Biodiversity Information Facility (https://www.gbif.org/)
Instituto de Pesquisas Jardim Botanico do Rio de Janeiro
Authors
Rafaela Forzza; Luís Alexandre Silva
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered

Description
Created over a century ago, the RB currently comprises ca. 750,000 mounted specimens, with a strong representation of Brazilian flora, mainly from the Atlantic and Amazon forests. Nearly 100% of these specimens have been entered into the database and imaged and, at present, about 17% have been geo-referenced. This data paper is focused exclusively on RB's exsiccatae collection of land plants and algae, which is currently increasing by about twenty to thirty thousand specimens per year thanks to fieldwork, exchange and donations. Since 2005, many national and international projects have been implemented, improving the quality and accessibility of the collection. The most important facilitating factor in this process was the creation of the institutional system for plants collection and management, named JABOT. Since the RB is continuously growing, the dataset is updated weekly on SiBBr and GBIF portals.
The most represented environments are the Atlantic and Amazon forests, a biodiversity hotspot and the world's largest rain forest, respectively. The dataset described in this article contains the data and metadata of plants and algae specimens in the RB collection and the link to access the respective images. Currently, the RB data is publicly available online at several biodiversity portals, such as our institutional database JABOT, the Reflora Virtual Herbarium, the SiBBr and the GBIF portal. However, a description of the RB dataset as a whole is not available in the literature.
Database, Storage & Backup Software Publishing in the US - Market Research...
ibisworld.com
Updated Apr 15, 2025
Cite
IBISWorld (2025). Database, Storage & Backup Software Publishing in the US - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/united-states/market-research-reports/database-storage-backup-software-publishing-industry/
Explore at:
Dataset updated
Apr 15, 2025
Dataset authored and provided by
IBISWorld
License
https://www.ibisworld.com/about/termsofuse/
Time period covered
2015 - 2030
Area covered
United States
Description
The rise in remote work and digital transformation initiatives has accelerated the demand for robust and scalable solutions offered by the database, storage and backup software publishing industry. Cloud adoption has surged, with downstream businesses in finance and healthcare increasingly relying on cloud-based databases and storage systems to ensure accessibility and resilience. To capture demand, publishers have grown revenue through subscription-based offerings, which have expanded the industry's reach and provided recurring revenue over the past five years. Driven by a 47.9% surge in 2021, industry revenue has increased at a CAGR of 10.2% to reach $98.9 billion, including growth of 2.5% in 2025. Advancements in cloud and digital technology have paved the way for new freemium substitutes, reshaping industry competition and introducing operational challenges. As new, cost-effective solutions emerge, traditional publishers have faced the challenge of differentiating their offerings while maintaining profitability. Leading companies such as Microsoft and Oracle have responded with investments in compatibility capabilities and AI features that have been designed to retain users as more options become available. Combined with the emerging threat of cyber attacks, however, these investments have weighed on industry profitability as greater resources are now needed to support different initiatives. With freemium models here to stay, industry revenue growth will decelerate moving forward. Users are expected to demand free tiers among leading publishers, who have already deployed these subscription models at the cost of revenue growth. Despite these trends, however, publishers are expected to benefit from data center expansions and upgrades, which will provide them with the necessary infrastructure to develop next-generation AI and edge computing offerings. 
With billions of dollars being invested in these areas, industry revenue will be sustained and rise at a CAGR of 2.5% over the next five years to reach $112.0 billion in 2030.
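The forecast figures above can be sanity-checked with the standard compound annual growth rate (CAGR) formula. The sketch below uses only the numbers quoted in the description ($98.9 billion in 2025, 2.5% CAGR, $112.0 billion in 2030); the function names are illustrative.

```python
def project_revenue(current, cagr, years):
    """Compound `current` forward by `cagr` (as a fraction) for `years` years."""
    return current * (1 + cagr) ** years


def implied_cagr(start, end, years):
    """Return the constant annual growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


projected_2030 = project_revenue(98.9, 0.025, 5)   # $ billions
rate = implied_cagr(98.9, 112.0, 5)

print(f"projected 2030 revenue: ${projected_2030:.1f} billion")
print(f"implied CAGR 2025-2030: {rate:.2%}")
```

Both directions agree with the report to within rounding: compounding $98.9 billion at 2.5% for five years lands just under $112 billion.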
Wok - Dataset - B2FIND
b2find.eudat.eu
Updated Jul 19, 2024
Cite
The citation is currently not available for this dataset.
Explore at:
Dataset updated
Jul 19, 2024
Description
Wok is a workflow management system implemented in Python that makes it very easy to structure workflows, parallelize their execution, and monitor their progress, among other things. It is designed in a modular way, allowing it to be adapted to different infrastructures.
For the time being it is strongly focused on clusters implementing any DRMAA-compatible resource manager (e.g. Oracle Grid Engine) whose working nodes share a common folder. Other, more flexible infrastructures (such as Amazon EC2) are considered for future implementations. Workflows in Wok are defined in an XML file with the .flow extension. This definition includes:
- the different modules (or pieces of processing)
- the interconnections between modules (e.g. the input of module B links with the output of module A)
- explicit dependencies (e.g. module A cannot be executed until module B has finished)
- descriptions that can be used to generate documentation automatically or to create web forms
Each module corresponds to a piece of software that has to be run in order to process some input and generate an output. For now, only Python scripts are allowed, but they can be used to execute software written in other languages.
Workflows in Wok can be treated as any software project and managed with version control tools and the IDE of your choice.
Wok can be used as a terminal script or run in server mode.
A workflow is executed in the terminal with the wok-run script, which accepts a few options:
- An instance name (-n name), which allows the same workflow to be run many times simultaneously and independently
- Configuration files (-c file.conf); the configuration can be split into as many files as desired
- Configuration parameters (-D param=value), which override any previous setting from the configuration files
The workflow definition file (e.g. myworkflow.flow) is passed as the first argument.
To monitor the execution of the workflow, different resources are available:
- The web server, which allows you to interact with the engine in a very straightforward way. Recommended!
- The logs emitted by wok-run through the standard output
- The intermediate files generated by Wok (e.g. the tasks' output files)
It has been designed for workflow developers who feel more comfortable programming than doing hundreds of clicks and drag-and-drops, and also for those who want infrastructure flexibility and full control and monitoring of the execution.
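The wok-run invocation described above can be assembled programmatically. The sketch below only builds the argument list from the options named in the description (-n, -c, -D, with the .flow file as the first argument); the helper name build_wok_command and the example values are illustrative, and the command is not executed here.

```python
def build_wok_command(flow_file, instance=None, conf_files=(), params=None):
    """Return the argv list for a wok-run call, without running it."""
    argv = ["wok-run", flow_file]          # flow definition is the first argument
    if instance is not None:
        argv += ["-n", instance]           # instance name
    for conf in conf_files:
        argv += ["-c", conf]               # one -c per configuration file
    for key, value in (params or {}).items():
        argv += ["-D", f"{key}={value}"]   # parameters override the conf files
    return argv


command = build_wok_command(
    "myworkflow.flow",
    instance="run1",                        # hypothetical instance name
    conf_files=["base.conf", "site.conf"],  # hypothetical configuration files
    params={"threads": 4},                  # hypothetical parameter
)
print(" ".join(command))
```

On a machine where Wok is installed, the resulting list could be handed to subprocess.run(command, check=True) to launch the workflow.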
Wetland timing, Amazon Basin, Reis et al. 2019
search.dataone.org
hydroshare.org
Updated Dec 5, 2021
Cite
Claire Beveridge (2021). Wetland timing, Amazon Basin, Reis et al. 2019 [Dataset]. https://search.dataone.org/view/sha256%3A5f22fe1bc6f9d0fd1f39f1695ffe2e198ccf450a430b685da1195ec3fb94c1ca
Explore at:
Dataset updated
Dec 5, 2021
Dataset provided by
Hydroshare
Authors
Claire Beveridge
Time period covered
Jan 1, 1993 - Dec 31, 2004
Area covered

Description
This resource contains a map (image map only, not a database) that characterizes wetland inundation regimes in the Amazon basin. The map is from a study by Reis et al. (2019) titled "Characterizing seasonal dynamics of Amazonian wetlands for conservation and decision making." The map shows the distribution of wetland inundation clusters. Each cluster represents a distinct seasonal inundation regime, ranging from the most ephemeral wetlands (cluster 1) to permanently inundated wetlands (cluster 18). The study uses the Global Inundation Extent from Multi‐Satellites database (GIEMS) D15 dataset by Fluet‐Chouinard et al., 2015. The GIEMS dataset is available upon request (see http://www.estellus.fr/index.php?static13/giems-d15). The Reis et al. 2019 map data may be available upon request from the author.

Abstract from source: In many wetlands the timing and duration of inundation determine ecological characteristics and the provision of ecosystem services; however, wetland conservation decisions often rely on static maps of wetland boundaries that do not capture their dynamic hydrological variability and connectivity. The Amazon River basin contains some of the world's most extensive wetlands, many of which are floodplains where seasonal flood pulses result in a temporally varying inundation area and hydrological connectivity with river systems. This study classified Amazon wetlands according to the timing and duration (months per year) of inundation detected by remote sensing, and also investigated the contribution of precipitation regimes in affecting wetland distribution and hydrological dynamics. Permanently inundated wetlands account for the largest area and are mainly floodplains located in the lowlands of the catchment. Seasonally inundated wetlands varied greatly in the duration of inundation over the course of the year, ranging from 1 to 9 months. Distinct seasonal timing was detected among the large wetland complexes, reflecting rainfall regimes as well as time lags for drainage and drying. For example, inundation in the extensive Llanos de Moxos region of the southern Amazon was protracted and lasted well after the rainy season, compared with the Roraima region of the northern Amazon, where inundation was shorter and tracked the rainy season. The integration of inundation dynamics into wetland classification captures regional differences in timing and duration of inundation in the major wetlands of the basin that should be considered for conservation planning and other ecological applications. This information can aid regional wetland management and planning, especially with regards to minimizing the effects of dam and waterway construction that can directly affect the natural wetland dynamics. 
The use of global remotely sensed inundation data makes this approach easily transferable to other large tropical wetlands.

Contents: "Reis_2019_AmazonWetlandsSeasonalDynamics.pdf" is the manuscript describing the data analysis.

"Reis_2019_Supplemental1_AmazonWetlandsSeasonalDynamics.pdf" is a silhouette plot of CLARA classification showing the silhouette width of each of the 18 wetland clusters. The bars represent the samples grouped in each cluster and the silhouette width (SI) is a measure of the performance of the classification. The SI ranges from −1 to +1, where a high positive value indicates that the object is well matched to its own cluster and poorly matched to neighbouring clusters.

"Reis_2019_Supplemental2_AmazonWetlandsSeasonalDynamics.pdf" is the main data product, a geographic distribution of inundation clusters across the Amazon basin. Each cluster represents a distinct seasonal inundation regime, ranging from the most ephemeral wetlands (cluster 1) to permanently inundated wetlands (cluster 18). Deep open waters of river, lakes, and reservoirs depicted in GWD‐LR are shown in black.
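The silhouette width used to assess the clusters above follows the standard definition s = (b - a) / max(a, b), where a is the mean distance from a sample to the other members of its own cluster and b is the smallest mean distance to any other cluster. The sketch below applies that definition to one-dimensional toy values; the numbers are invented for illustration and are not taken from the study, which used CLARA on inundation data.

```python
def mean_distance(point, others):
    """Mean absolute distance from `point` to each value in `others`."""
    return sum(abs(point - o) for o in others) / len(others)


def silhouette_width(point, own_cluster_others, other_clusters):
    """Silhouette width of `point`; the result lies between -1 and +1."""
    if not own_cluster_others:
        return 0.0  # convention for a cluster with a single member
    a = mean_distance(point, own_cluster_others)            # cohesion
    b = min(mean_distance(point, c) for c in other_clusters)  # separation
    if max(a, b) == 0:
        return 0.0
    return (b - a) / max(a, b)


# Toy example: a sample at 1.0 whose cluster also contains 2.0,
# with one neighbouring cluster at 10.0 and 11.0.
s = silhouette_width(1.0, [2.0], [[10.0, 11.0]])
print(f"silhouette width: {s:.3f}")
```

A value close to +1, as in this toy case, corresponds to the description's "well matched to its own cluster and poorly matched to neighbouring clusters".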
Data from: A large-scale assessment of ant diversity across the Brazilian...
data.niaid.nih.gov
search.dataone.org
+2 more
zip
Updated Jun 16, 2022
Cite
Joudellys Andrade-Silva; Fabricio Baccaro; Lívia Prado; Benoit Guenard; Dan Warren; Jamie Kass; Evan Economo; Rogerio Silva (2022). A large-scale assessment of ant diversity across the Brazilian Amazon Basin: integrating geographic, ecological, and morphological drivers of sampling bias [Dataset]. http://doi.org/10.5061/dryad.ht76hdrj8
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.ht76hdrj8
Dataset updated
Jun 16, 2022
Dataset provided by
University of Hong Kong
Museu Paraense Emílio Goeldi
Universidade de São Paulo
Okinawa Institute of Science and Technology Graduate University
Universidade Federal do Amazonas
Authors
Joudellys Andrade-Silva; Fabricio Baccaro; Lívia Prado; Benoit Guenard; Dan Warren; Jamie Kass; Evan Economo; Rogerio Silva
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Brazil
Description
Tropical ecosystems are often biodiversity hotspots, and invertebrates represent the main underrepresented component of diversity in large-scale analyses. This problem is partly related to the scarcity of data widely available to conduct these studies and the lack of systematic organization of knowledge about invertebrates’ distributions in biodiversity hotspots. Here, we introduce and analyze a comprehensive data compilation of Amazonian ant diversity. Using records from 1817 to 2020 from both published and unpublished sources, we describe the diversity and distribution of ant species in the Brazilian Amazon Basin. Further, using high-definition images and data from taxonomic publications, we build a comprehensive database of morphological traits for the ant species that occur in the region. In total, we recorded 1,067 nominal species in the Brazilian Amazon Basin, with sampling locations strongly biased by access routes, urban centers, research institutions, and major infrastructure projects. Large areas where ant sampling is non-existent represent about 52% of the basin and are concentrated mainly in the North, Southeastern, and Western Brazilian Amazon. We found that distance to roads is the main driver of ant sampling in the Amazon. Contrary to our expectations, morphological traits had lower predictive power in predicting sample bias than purely geographic variables. However, when geographic predictors were controlled, habitat stratum and traits contribute to explain the remaining variance. More species were recorded in better-sampled areas, but species richness estimation models suggest that areas in South Amazonian edge forests are associated with especially high species richness. Our results represent the first trait-based, large-scale study for insects in Amazonian forests and a starting point for macroecological studies focusing on insect diversity in the Amazon Basin. 
Methods We obtained all records available in the literature for the Brazilian Amazon (from 1817 to 2020) through the Global Ant Biodiversity Informatics (GABI - Guénard et al. 2017) project. Then, we compiled additional data on ant occurrences in the Brazilian Amazon from online databases and scientific repositories in Brazil. We also included checklists from non-published sources, mainly dissertations, master’s theses, field expeditions, and environmental assessment reports, to compile the most comprehensive information on ant occurrences in the Brazilian Amazon. We obtained these checklists from Brazil’s leading research centers on taxonomy, systematics, and ant biology. We constructed the database of morphological traits based on five continuous measurements for all ant species recorded in the Brazilian Amazon Basin. These traits were selected because they are classified as priority information in functional aspects of ant ecology. Our database was based on more than 3,000 high-definition images, including lateral, frontal, and dorsal views. For species without high-definition images available, we obtained morphological traits from the taxonomic literature when possible, leading to data extracted from over 40 publications. We employed ImageJ software to record the measurements (http://imagej.nih.gov/ij). Whenever possible, we used the minor workers to standardize the measurements, as is routinely done in studies of the morphological diversity of ants. However, when these were not available, we used major workers to obtain morphological measurements. Further, some ant species have vestigial or absent eyes, making it impossible to measure some morphological traits. We assigned the following rule for these species: when the species did not show eyes, we assigned a value equal to 0 (zero) for “maximum eye size”. The same procedure was adopted for morphological traits related to eyes, such as “interocular distance”. 
This protocol allows keeping such species in the analyses and maintains their unique morphological characteristics.
Sentinel-2 Cloud-Optimized GeoTIFFs
registry.opendata.aws
Updated Oct 5, 2020
Cite
Element 84 (2020). Sentinel-2 Cloud-Optimized GeoTIFFs [Dataset]. https://registry.opendata.aws/sentinel-2-l2a-cogs/
Explore at:
Dataset updated
Oct 5, 2020
Dataset provided by
Element 84 (https://www.element84.com/)
Description
The Sentinel-2 mission is a land monitoring constellation of two satellites that provide high-resolution optical imagery and continuity for the current SPOT and Landsat missions. The mission provides global coverage of the Earth's land surface every 5 days, making the data of great use in ongoing studies. This dataset is the same as the Sentinel-2 dataset, except the JP2K files were converted into Cloud-Optimized GeoTIFFs (COGs). Additionally, SpatioTemporal Asset Catalog (STAC) metadata is stored in a JSON file alongside the data, and a STAC API called Earth-search is freely available to search the archive. This dataset contains all of the scenes in the original Sentinel-2 Public Dataset and will grow as that does. L2A data are available from April 2017 over the wider Europe region and globally since December 2018.
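Searching the archive through a STAC API generally means sending a JSON body with a collection, a bounding box, and a time range to the API's search endpoint. The sketch below only constructs such a body using the standard STAC Item Search fields; the collection identifier and coordinates are placeholders, and the Earth-search endpoint URL is not given in this description, so no request is sent.

```python
import json


def build_search_body(collection, bbox, start, end, limit=10):
    """Build a STAC Item Search request body.

    bbox is [west, south, east, north] in degrees; start and end are
    RFC 3339 timestamps joined into a closed interval.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must be [west, south, east, north]")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


body = build_search_body(
    "sentinel-2-l2a-cogs",           # placeholder collection id (assumption)
    [-60.5, -3.5, -59.5, -2.5],      # illustrative box in the Amazon basin
    "2020-01-01T00:00:00Z",
    "2020-01-31T23:59:59Z",
    limit=5,
)
print(json.dumps(body, indent=2))
```

The printed JSON is what a client would POST to the search endpoint; each returned item then links to the COG assets for one scene.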