28 datasets found
  1. Amazon revenue 2004-2024

    • statista.com
    • gruabehub.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States, Worldwide
    Description

    From 2004 to 2024, the net revenue of Amazon e-commerce and service sales has increased tremendously. In the fiscal year ending December 31, the multinational e-commerce company's net revenue was almost *** billion U.S. dollars, up from *** billion U.S. dollars in 2023.

    Amazon.com, a U.S. e-commerce company originally founded in 1994, is the world’s largest online retailer of books, clothing, electronics, music, and many more goods. As of 2024, the company generates the majority of its net revenues through online retail product sales, followed by third-party retail seller services, cloud computing services, and retail subscription services including Amazon Prime.

    From seller to digital environment

    Through Amazon, consumers are able to purchase goods at a discounted price from both small and large companies as well as from other users. Both new and used goods are sold on the website. Due to the wide variety of goods available at prices which often undercut local brick-and-mortar retail offerings, Amazon has dominated the retail market. As of 2024, Amazon’s brand worth amounts to over *** billion U.S. dollars, topping the likes of companies such as Walmart and Ikea, as well as digital competitors Alibaba and eBay. One of Amazon's first forays into the world of hardware was its e-reader Kindle, one of the most popular e-book readers worldwide. More recently, Amazon has also released several series of own-branded products and a voice-controlled virtual assistant, Alexa.

    Headquartered in North America

    Due to its location, Amazon offers more services in North America than worldwide. As a result, the majority of the company’s net revenue in 2023 was earned in the United States, Canada, and Mexico. In 2023, approximately *** billion U.S. dollars was earned in North America compared to only roughly *** billion U.S. dollars internationally.

  2. Amazon Question and Answer Data

    • cseweb.ucsd.edu
    json
    Cite
    UCSD CSE Research Project, Amazon Question and Answer Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Available download formats: json
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain 1.48 million question and answer pairs about products from Amazon.

    Metadata includes

    • question and answer text

    • is the question binary (yes/no), and if so does it have a yes/no answer?

    • timestamps

    • product ID (to reference the review dataset)

    Basic Statistics:

    • Questions: 1.48 million

    • Answers: 4,019,744

    • Labeled yes/no questions: 309,419

    • Number of unique products with questions: 191,185
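
    A minimal sketch of how summary statistics like those above might be recomputed from the records. The one-JSON-object-per-line layout and the field names (`asin`, `questionType`, `question`, `answer`) are assumptions for illustration and are not confirmed by this listing; the sample records are invented.

```python
import json

# Hypothetical sample records; field names are assumptions, not taken from the listing.
SAMPLE_LINES = [
    '{"asin": "B000001", "questionType": "yes/no", "question": "Does it fit?", "answer": "Yes."}',
    '{"asin": "B000001", "questionType": "open-ended", "question": "How long is the cable?", "answer": "About 2 m."}',
    '{"asin": "B000002", "questionType": "yes/no", "question": "Is it waterproof?", "answer": "No."}',
]


def summarize(lines):
    """Count questions, yes/no questions, and unique products."""
    records = [json.loads(line) for line in lines if line.strip()]
    return {
        "questions": len(records),
        "yes_no_questions": sum(r.get("questionType") == "yes/no" for r in records),
        "unique_products": len({r["asin"] for r in records}),
    }


print(summarize(SAMPLE_LINES))
```

    The product ID is the key that would let these records be joined to the review dataset mentioned in the metadata list.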

  3. Global net revenue of Amazon 2014-2024, by product group

    • statista.com
    Updated Feb 24, 2025
    Cite
    Statista (2025). Global net revenue of Amazon 2014-2024, by product group [Dataset]. https://www.statista.com/statistics/672747/amazons-consolidated-net-revenue-by-segment/
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, Amazon's net revenue from the subscription services segment amounted to 44.37 billion U.S. dollars. Subscription services include Amazon Prime, for which Amazon reported 200 million paying members worldwide at the end of 2020. The AWS category generated 107.56 billion U.S. dollars in annual sales. During the most recently reported fiscal year, the company’s net revenue amounted to 638 billion U.S. dollars.

    Amazon revenue segments

    Amazon is one of the biggest online companies worldwide. In 2019, the company’s revenue increased by 21 percent, compared to Google’s revenue growth during the same fiscal period, which was just 18 percent. The majority of Amazon’s net sales are generated through its North American business segment, which accounted for 236.3 billion U.S. dollars in 2020. The United States is the company’s leading market, followed by Germany and the United Kingdom.

    Business segment: Amazon Web Services

    Amazon Web Services, commonly referred to as AWS, is one of the strongest-growing business segments of Amazon. AWS is a cloud computing service that provides individuals, companies and governments with a wide range of computing, networking, storage, database, analytics and application services, among many others. As of the third quarter of 2020, AWS accounted for approximately 32 percent of the global cloud infrastructure services vendor market.
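
    As a rough illustration, the 2024 figures quoted in the description can be turned into segment shares of total net revenue. This is a sketch only: the total of 638 billion U.S. dollars is a rounded figure, so the resulting percentages are approximate, and the function name is made up for this example.

```python
# Figures in billions of U.S. dollars, as quoted in the description.
TOTAL_NET_REVENUE = 638.0
SEGMENTS = {"Subscription services": 44.37, "AWS": 107.56}


def share_of_total(segment_revenue, total):
    """Return a segment's revenue as a percentage of total revenue."""
    return 100.0 * segment_revenue / total


for name, revenue in SEGMENTS.items():
    print(f"{name}: {share_of_total(revenue, TOTAL_NET_REVENUE):.1f}% of net revenue")
```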

  4. UK Optimal Product Price Prediction Dataset

    • kaggle.com
    Updated Nov 7, 2023
    Cite
    asaniczka (2023). UK Optimal Product Price Prediction Dataset [Dataset]. http://doi.org/10.34740/kaggle/ds/3893120
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 7, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    asaniczka
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    This dataset contains product prices from Amazon UK, with a focus on price prediction. With a good amount of data on what price points sell the most, you can train machine learning models to predict the optimal price for a product based on its features and product name.

    If you find this dataset useful, make sure to show your appreciation by upvoting! ❤️✨

    Inspirations

    This dataset is a superset of my Amazon UK product price dataset. Another inspiration is this competition that awarded 100K in prize money.

    What To Do?

    • Your objective is to create a prediction model that will assist sellers in pricing their products within the optimal price range to generate the most sales.
    • The dataset includes various data points, such as the number of reviews, rating, best seller status, and items sold last month.
    • You can select specific factors (e.g., over 100 reviews = optimal price for the product) and then divide the dataset into products priced optimally vs. products priced suboptimally.
    • By utilizing techniques like vectorizing product names and features, you can train a model to provide the optimal price for a product, which sellers or businesses might find valuable.
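
    The labeling step suggested in the list above could be sketched as follows. The 100-review threshold comes from the example in the description; the field names (`title`, `price`, `reviews`) and the sample rows are hypothetical, not the dataset's actual columns.

```python
# Hypothetical sample rows; the field names are assumptions for illustration.
PRODUCTS = [
    {"title": "Kettle", "price": 24.99, "reviews": 350},
    {"title": "Desk lamp", "price": 18.50, "reviews": 40},
    {"title": "Notebook", "price": 3.99, "reviews": 100},
]


def label_by_reviews(products, min_reviews=100):
    """Label a product 'optimal' if it has more than min_reviews reviews."""
    return [
        {**p, "label": "optimal" if p["reviews"] > min_reviews else "suboptimal"}
        for p in products
    ]


for product in label_by_reviews(PRODUCTS):
    print(product["title"], product["label"])
```

    The resulting labels could then serve as the target for a classifier trained on vectorized product names and features.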

    How to know if a product sells?

    • I would prefer to use the number of reviews as a metric to determine if a product sells. More reviews = more sales, right?
    • According to one source, only 1-2% of buyers leave a review.
    • So if we multiply the number of reviews for a product by 50, we get a rough estimate of how many units have sold.
    • If we then multiply the product price by the number of units sold, we get the total revenue generated by the product.
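
    The estimate above amounts to two multiplications. A minimal sketch, assuming the 50x multiplier from the description (which corresponds to a 2% review rate; a 1% rate would imply 100x). The function names are made up for this example.

```python
# 50 buyers per review corresponds to a 2% review rate (assumption from the description).
REVIEW_MULTIPLIER = 50


def estimate_units_sold(reviews, multiplier=REVIEW_MULTIPLIER):
    """Estimate units sold from the review count."""
    return reviews * multiplier


def estimate_revenue(price, reviews, multiplier=REVIEW_MULTIPLIER):
    """Estimate total revenue as price times estimated units sold."""
    return price * estimate_units_sold(reviews, multiplier)


units = estimate_units_sold(120)
revenue = estimate_revenue(9.99, 120)
print(f"Estimated units: {units}, estimated revenue: {revenue:.2f}")
```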

    How is this useful?

    • Sellers and businesses can leverage your model to determine the optimal price for their products, thereby maximizing sales.
    • Businesses can assess the profitability of a product and plan their supply chain accordingly.
  5. Amazon Web Services: year-on-year growth 2014-2025

    • statista.com
    Updated May 13, 2025
    Cite
    Statista (2025). Amazon Web Services: year-on-year growth 2014-2025 [Dataset]. https://www.statista.com/statistics/422273/yoy-quarterly-growth-aws-revenues/
    Dataset updated
    May 13, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In the first quarter of 2025, revenues of Amazon Web Services (AWS) grew by 17 percent year on year, a lower growth rate than in the previous three quarters. AWS is one of Amazon’s strongest revenue segments, generating over 115 billion U.S. dollars in 2024 net sales, up from 105 billion U.S. dollars in 2023.

    Amazon Web Services

    Amazon Web Services (AWS) provides on-demand cloud platforms and APIs to customers through a pay-as-you-go model. AWS launched in 2002 providing general services and tools and produced its first cloud products in 2006. Today, more than 175 different cloud services for a variety of technologies and industries have been released. AWS ranked as one of the most popular public cloud infrastructure and platform services running applications worldwide in 2020, ahead of Microsoft Azure and Google cloud services.

    Cloud computing

    Cloud computing is essentially the delivery of online computing services to customers. As enterprises continually migrate their applications and data to the cloud instead of storing them on local machines, it becomes possible to access resources from different locations. Some of the key services of the AWS ecosystem for cloud applications include storage, database, security tools, and management tools.

    AWS is among the most popular cloud providers

    Some of the largest globally operating enterprises use AWS for their cloud services, including Netflix, BBC, and Baidu. Accordingly, AWS is one of the leading cloud providers in the global cloud market. Due to its continuously expanding portfolio of services and deepening expertise, the company continues to be not only an important cloud service provider but also a business partner.

  6. Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) +...

    • datarade.ai
    .csv
    Updated Jan 18, 2023
    Cite
    Space Know (2023). Satellite US Supply Chain Dataset Package (Amazon, Fedex, Walmart) + Research Report Available [Dataset]. https://datarade.ai/data-products/satellite-us-supply-chain-dataset-package-amazon-fedex-wal-space-know
    Available download formats: .csv
    Dataset updated
    Jan 18, 2023
    Dataset authored and provided by
    Space Know
    Area covered
    United States
    Description

    SpaceKnow USA Supply Chain Premium Dataset gives you data (by locations and company) of US Supply Chain choke points in near-real-time as seen from satellite images. The uniqueness of this dataset lies in its granularity.

    About dataset: We apply proprietary algorithms to SAR satellite imagery of key industrial, transportation, storage, and logistics locations to create daily indices of industry activity. Data was collected from more than 5,000 locations across the USA. Thanks to the use of SAR satellite technology, the quality of the SpaceKnow dataset is not influenced by weather fluctuations.

    In total SpaceKnow USA Supply Chain dataset offers +50 specific indices with real-time insights. The premium dataset includes company-focused indices. This type of data can be used by investors to get insight on important KPIs such as revenue.

    This dataset is:

    • Daily frequency
    • History from Jan 2017 to present

    Within one package we provide you with real-time insights into:

    • Port Container country-level indices (a container port or container terminal is a facility where cargo containers are transshipped between different transport vehicles for onward transportation)
    • Port Container indices for the major ports in the US: Port of Los Angeles, Port of Long Beach, Port of New York & New Jersey, Port of Savannah, Port of Houston, Port of Virginia, Port of Oakland in California, Port of South Carolina, Port of Miami
    • Trucking Stop indices for the most important locations in the supply chain, such as Iowa, Nevada, South Carolina, Oregon, North Carolina
    • Inland Containers index on a country level
    • Logistics Center index on a country level (logistics centers are distribution hubs for finished goods that need to be transported to another location; we include logistics centers from companies like Amazon, Walmart, Fedex and others)
    • Logistics Center indices for states such as California, New York, Illinois, Indiana, South Carolina, and many more
    • Logistics Center indices for companies: Amazon, Walmart, Fedex

    Research Reports

    Don't have the capacity to analyze the data? Let SpaceKnow's in-house economists do the heavy lifting so that you can focus on what's important. SpaceKnow writes research reports based on what the data from the US Supply Chain dataset package is showing. The document includes a detailed explanation of what is happening, with supporting charts and tables. The reports are published on a monthly basis.

    Delivery Mechanisms

    All of the delivery mechanisms detailed below are available as part of this package. Data is distributed only in the flat-table CSV format. Methods for accessing the data:

    • Dashboard: an option that also offers data visualization within the webpage
    • Automatic email delivery
    • API access to our dataset
    • Research reports, provided via email in PDF format

    Client Support

    Each client is assigned an account representative who will reach out periodically to make sure that the data packages are meeting your needs. Here are some other ways to contact SpaceKnow in case you have a specific question.

    For delivery questions and issues: Please reach out to support@spaceknow.com

    For data questions: Please reach out to product@spaceknow.com

    For pricing/sales support: Please reach out to info@spaceknow.com or sales@spaceknow.com

  7. Amazon Machine Learning book dataset[11 Countries]

    • kaggle.com
    Updated Nov 1, 2023
    Cite
    Jitesh Kumar Sahoo (2023). Amazon Machine Learning book dataset[11 Countries] [Dataset]. https://www.kaggle.com/datasets/jiteshkumarsahoo/amazon-machine-learning-book-dataset11-countries
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jitesh Kumar Sahoo
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Amazon Machine Learning Book Data from Multiple Countries

    Overview

    This dataset is a valuable resource for exploring the world of literature and e-commerce by providing information about books available on Amazon websites in 14 different countries. It offers a comprehensive collection of book details, enabling data analysts, researchers, and machine learning enthusiasts to investigate trends in book titles, authors, pricing, ratings, reviews, publishing dates, and the diverse landscape of the Amazon marketplace across various countries.

    Columns

    The dataset is structured with the following columns:

    1. Title: The title of the book.
    2. Authors: The authors of the book.
    3. Price: The listed price of the book.
    4. Stars: The average rating given to the book.
    5. Number of Reviews: The count of reviews for the book.
    6. Published Date: The date when the book was published.
    7. Country: The specific country of the Amazon website where the book is listed.

    Data Details

    This dataset consists of 4,268 records, representing books from a wide array of genres and categories. While it provides valuable insights into the world of books available on Amazon, it's important to note that not all columns contain complete data, as indicated by the "Non-Null Count" for each column. This aspect can be a valuable opportunity for data cleaning and analysis.
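
    A short sketch of how the per-column completeness mentioned above could be checked with the standard library. The header names mirror the column list in the description, but the actual file's headers may differ, and the sample rows are invented for illustration.

```python
import csv
import io

# Invented sample rows; headers follow the column list in the description (assumption).
SAMPLE_CSV = """Title,Authors,Price,Stars,Number of Reviews,Published Date,Country
Book A,Author X,29.99,4.5,120,2021-03-01,US
Book B,,15.00,,3,,IN
Book C,Author Z,,4.0,,2019-07-15,UK
"""


def non_null_counts(csv_text):
    """Count non-empty values per column, similar to a 'Non-Null Count' report."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in counts:
            value = row.get(name)
            if value is not None and value.strip():
                counts[name] += 1
    return counts


for column, count in non_null_counts(SAMPLE_CSV).items():
    print(f"{column}: {count}")
```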

    Use Cases

    The dataset is rich with potential use cases, including but not limited to:

    • Market Analysis: Investigate the book pricing and availability trends in different countries, enabling businesses and entrepreneurs to make informed decisions when entering or expanding in these markets.
    • Author and Genre Profiling: Identify popular authors and genres in various countries, aiding publishers and authors in tailoring their content for specific markets.
    • Predictive Modeling: Create models to predict book ratings or reviews based on various attributes, facilitating recommendations for readers.
    • Publishing Trends: Analyze the distribution of published dates to understand the evolution of literary trends in different regions.

    Data Sources

    The dataset was meticulously curated by scraping data from Amazon websites in 14 countries. This process ensures that the data accurately represents the books available on these websites and opens up a wealth of possibilities for cross-country comparisons and insights.

    Feel free to explore this dataset, conduct thorough analysis, and share your findings with the community. Your insights can contribute to a deeper understanding of the global literary and e-commerce landscape.

  8. amazon_us_reviews

    • tensorflow.org
    • huggingface.co
    Updated Dec 6, 2022
    Cite
    (2022). amazon_us_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/amazon_us_reviews
    Dataset updated
    Dec 6, 2022
    Description

    Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

    Over 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

    Each dataset contains the following columns:

    • marketplace - 2 letter country code of the marketplace where the review was written.
    • customer_id - Random identifier that can be used to aggregate reviews written by a single author.
    • review_id - The unique ID of the review.
    • product_id - The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
    • product_parent - Random identifier that can be used to aggregate reviews for the same product.
    • product_title - Title of the product.
    • product_category - Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
    • star_rating - The 1-5 star rating of the review.
    • helpful_votes - Number of helpful votes.
    • total_votes - Number of total votes the review received.
    • vine - Review was written as part of the Vine program.
    • verified_purchase - The review is on a verified purchase.
    • review_headline - The title of the review.
    • review_body - The review text.
    • review_date - The date the review was written.
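
    Because the files are tab delimited with no quote or escape characters, a reader should treat quote marks as literal text. A minimal sketch using the standard library, assuming each file starts with a header row naming the columns described above; the sample record and its values (including the "Y"/"N" flags) are invented.

```python
import csv
import io

COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]

# Invented sample record; note the literal quote characters in review_body.
SAMPLE_ROW = [
    "US", "12345", "R1EXAMPLE", "B00EXAMPLE", "67890", "USB cable",
    "Electronics", "5", "3", "4", "N", "Y", "Works",
    '"Great" cable, works well', "2015-08-31",
]
SAMPLE_TSV = "\t".join(COLUMNS) + "\n" + "\t".join(SAMPLE_ROW) + "\n"


def read_reviews(tsv_text):
    """Yield one dict per review, converting the numeric columns to int."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # quotes are ordinary characters in these files
    )
    for row in reader:
        for name in ("star_rating", "helpful_votes", "total_votes"):
            row[name] = int(row[name])
        yield row


for review in read_reviews(SAMPLE_TSV):
    print(review["product_title"], review["star_rating"], review["review_body"])
```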

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('amazon_us_reviews', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  9. Paypal Email Receipt Data | Consumer Transaction Data | Payment Data | Asia,...

    • datarade.ai
    .json, .xml, .csv
    Updated Feb 23, 2024
    Cite
    Measurable AI (2024). Paypal Email Receipt Data | Consumer Transaction Data | Payment Data | Asia, EMEA, LATAM, MENA, India | Granular & Aggregate Data available [Dataset]. https://datarade.ai/data-products/paypal-email-receipt-data-consumer-transaction-data-payme-measurable-ai
    Available download formats: .json, .xml, .csv
    Dataset updated
    Feb 23, 2024
    Dataset authored and provided by
    Measurable AI
    Area covered
    Latin America, Argentina, Brazil, Chile, Colombia, Mexico, United States of America, Japan
    Description

    The Measurable AI Amazon Consumer Transaction Dataset is a leading source of email receipts and consumer transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.

    We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.

    Use Cases

    Our clients leverage our datasets to produce actionable consumer insights such as:

    • Market share analysis
    • User behavioral traits (e.g. retention rates)
    • Average order values
    • Promotional strategies used by the key players

    Several of our clients also use our datasets for forecasting and understanding industry trends better.

    Granular Data

    Itemized, high-definition data per transaction level with metrics such as:

    • Order value
    • Items ordered
    • No. of orders per user
    • Delivery fee
    • Service fee
    • Promotions used
    • Geolocation data and more

    Aggregate Data

    • Weekly/monthly order volume
    • Revenue delivered in aggregate form, with historical data dating back to 2018

    All the transactional e-receipts are sent from app to users’ registered accounts.

    Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.

    Our dataset is GDPR compliant, contains no PII, and is aggregated and anonymized with user consent. Contact michelle@measurable.ai for a data dictionary and to find out our volume in each country.

  10. MuMu: Multimodal Music Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 6, 2022
    Cite
    Oramas, Sergio (2022). MuMu: Multimodal Music Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_831188
    Dataset updated
    Dec 6, 2022
    Dataset authored and provided by
    Oramas, Sergio
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.

    To map the information from both datasets we use MusicBrainz. This process yields the final set of 147,295 songs, which belong to 31,471 albums. For the mapped set of albums, there are 447,583 customer reviews from the Amazon Dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, this dataset provides further information about each album, such as average rating, selling rank, similar products, and cover image URL. For every text review it also provides the helpfulness score of the review, the average rating, and a summary of the review.

    The mapping between the three datasets (Amazon, MusicBrainz and MSD), genre annotations, metadata, data splits, text reviews and links to images are available here. Images and audio files cannot be released due to copyright issues.

    MuMu dataset (mapping, metadata, annotations and text reviews)

    Data splits and multimodal feature embeddings for ISMIR multi-label classification experiments

    These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.

    NOTE: This version provides simplified files with metadata and splits.

    Scientific References

    Please cite the following papers if using MuMu dataset or Tartarus library.

    Oramas, S., Barbieri, F., Nieto, O., and Serra, X (2018). Multimodal Deep Learning for Music Genre Classification, Transactions of the International Society for Music Information Retrieval, V(1).

    Oramas S., Nieto O., Barbieri F., & Serra X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916

  11. Dataset and code for "What make readers love a fiction book: a stat analysis...

    • zenodo.org
    Updated Dec 16, 2024
    Cite
    AISDL Team; AISDL Team (2024). Dataset and code for "What make readers love a fiction book: a stat analysis on Wild Wise Weird using real-world data from Amazon readers' reviews" [Dataset]. http://doi.org/10.5281/zenodo.14498846
    Dataset updated
    Dec 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    AISDL Team; AISDL Team
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    For the sake of research transparency and reducing research and reproducibility costs, we have stored all data and computer code of the project "What make readers love a fiction book: a stat analysis on Wild Wise Weird using real-world data from Amazon readers' reviews" on Zenodo.

  12. 1000 Genomes Project and AWS

    • rrid.site
    • neuinfo.org
    • +2more
    Updated Jul 27, 2025
    Cite
    (2025). 1000 Genomes Project and AWS [Dataset]. http://identifiers.org/RRID:SCR_008801
    Dataset updated
    Jul 27, 2025
    Description

    A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, which researchers can now access on AWS free of charge for use in disease research. The dataset is available to all via Amazon S3 at: http://s3.amazonaws.com/1000genomes

    The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year.

    Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be seamlessly accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the highly scalable compute resources needed to take advantage of these large data collections. AWS is storing the public data sets at no charge to the community. Researchers pay only for the additional AWS resources they need for further processing or analysis of the data.

    All 200 TB of the latest 1000 Genomes Project data is available in a public Amazon S3 bucket. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to dive into this data without the usual capital investment required to work with data at this scale.

    AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can crunch the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
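
    As a sketch of the "simple HTTP requests" route, the snippet below parses an S3-style bucket listing and builds object URLs from the bucket address given in the description. It works offline on an invented sample listing: the object keys and sizes are hypothetical, and whether the bucket still answers anonymous requests at this address has not been verified here.

```python
import xml.etree.ElementTree as ET

BUCKET_URL = "http://s3.amazonaws.com/1000genomes"  # address quoted in the description
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Invented sample of an S3 bucket-listing response; keys and sizes are hypothetical.
SAMPLE_LISTING = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>1000genomes</Name>
  <Contents><Key>example/README.txt</Key><Size>1024</Size></Contents>
  <Contents><Key>example/sample.vcf.gz</Key><Size>2048</Size></Contents>
</ListBucketResult>"""


def parse_listing(xml_text):
    """Return (key, size_in_bytes) pairs from a bucket-listing XML document."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]


def object_url(key):
    """Build the HTTP URL for an object key in the bucket."""
    return f"{BUCKET_URL}/{key}"


for key, size in parse_listing(SAMPLE_LISTING):
    print(object_url(key), size)
```

    In practice the listing text would come from an HTTP GET on the bucket URL, and an AWS SDK would handle pagination and authentication details that this sketch leaves out.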

  13. Data from: Coarse datasets for the 2002-2010 Tsimane' Amazonian Panel...

    • scholarworks.brandeis.edu
    docx, pdf, xls
    Updated Mar 15, 2022
    Cite
    Ricardo Godoy; William R. Leonard; Victoria Reyes-Garcia; Tomas Huanca (2022). Coarse datasets for the 2002-2010 Tsimane' Amazonian Panel Study(TAPS) - Introduction and authorization [Dataset]. https://scholarworks.brandeis.edu/esploro/outputs/dataset/Coarse-datasets-for-the-2002-2010-Tsimane/9924097301801921
    Available download formats: xls (1472000 bytes), pdf (140365 bytes), docx (32618 bytes)
    Dataset updated
    Mar 15, 2022
    Authors
    Ricardo Godoy; William R. Leonard; Victoria Reyes-Garcia; Tomas Huanca
    Time period covered
    Mar 2022
    Measurement technique
    See Chapter 4 of "Too little, too late" for general methods, and different chapters for methods on different topics
    Description

    Introduction. This document provides an overview of an archive composed of four sections.

    [1] An introduction (this document) which describes the scope of the project

    [2] Yearly folder, from 2002 until 2010, of the coarse Microsoft Access datasets + the surveys used to collect information for each year. The word coarse does not mean the information in the Microsoft Access dataset was not corrected for mistakes; it was, but some mistakes and inconsistencies remain, such as with data on age or education. Furthermore, the coarse dataset provides disaggregated information for selected topics, which appear in summary statistics in the clean dataset. For example, in the coarse dataset one can find the different illnesses afflicting a person during the past 14 days whereas in the clean dataset only the total number of illnesses appears.

    [3] A letter from the Gran Consejo Tsimane’ authorizing the public use of de-identified data collected in our studies among Tsimane’.

    [4] A Microsoft Excel document with the unique identification number for each person in the panel study.


    Background. During 2002-2010, a team of international researchers, surveyors, and translators gathered longitudinal (panel) data on the demography, economy, social relations, health, nutritional status, local ecological knowledge, and emotions of about 1400 native Amazonians known as Tsimane’ who lived in thirteen villages near and far from towns in the department of Beni in the Bolivian Amazon. A report titled “Too little, too late” summarizes selected findings from the study and is available to the public at the electronic library of Brandeis University:

    https://scholarworks.brandeis.edu/permalink/01BRAND_INST/1bo2f6t/alma9923926194001921


    A copy of the clean, merged, and appended Stata (V17) dataset is available to the public at the following two web addresses:

    [a] Brandeis University:

    https://scholarworks.brandeis.edu/permalink/01BRAND_INST/1bo2f6t/alma9923926193901921

    [b] Inter-university Consortium for Political and Social Research (ICPSR), University of Michigan (only available to users affiliated with institutions belonging to ICPSR)

    http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/37671/utilization

    Chapter 4 of the report “Too little, too late” mentioned above describes the motivation and history of the study, the difference between the coarse and clean datasets, and topics which can be examined only with coarse data.


    Aims. The aims of this archive are to:

    · Make available in Microsoft Access the coarse de-identified data: [1] one Access dataset for each of the seven yearly surveys (2004-2010) and [2] one Access dataset based on quarterly surveys done during 2002 and 2003. Together, these form one longitudinal dataset of individuals, households, and villages.

    · Provide guidance on how to link files within and across years, and

    · Make available a Microsoft Excel file with a unique identification number to link individuals across years

    The datasets in the archive.

    · Eight Microsoft Access datasets with data on a wide range of variables. Except for the Access file for 2002-2003, each Access file contains information for a single year. Within any Access dataset, users will find two types of files:

    o Thematic files. The name of a thematic file contains the prefix tbl (e.g., 29_tbl_Demography or tbl_29_Demography). The file name (sometimes in Spanish, sometimes in English) indicates the content of the file. For example, in the Access dataset for one year, the micro file tbl_30_Ventas has all the information on sales for that year. Within each micro file, columns contain information on a variable and the name of the column indicates the content of the variable. For instance, the column heading item in the Sales file would indicate the type of good sold. The exac…

  14. Amazon Prime TV Shows

    • kaggle.com
    Updated Oct 13, 2020
    Cite
    Neelima Jauhari (2020). Amazon Prime TV Shows [Dataset]. https://www.kaggle.com/nilimajauhari/amazon-prime-tv-shows/code
    Dataset updated
    Oct 13, 2020
    Dataset provided by
    Kaggle
    Authors
    Neelima Jauhari
    Description

    Context

    This data set was created so as to analyze the latest shows available on Amazon Prime as well as the shows with a high rating.

    Content

    The data set contains the following fields: the name (title) of the show; the year of release, i.e. the year in which the show was released or went on air; the number of seasons of the show available on Prime; the audio language of the show (subtitle languages are not considered); the genre of the show, such as Kids, Drama, or Action; the IMDB rating of the show, which was not available for many TV shows and kids' shows; and the age of viewers, which specifies the age of the target audience. An age value of "All" means that the content is not restricted to any particular age group and all audiences can view it.

    Acknowledgements

    I have collected this data from Amazon Prime's Website.

    Inspiration

    Many TV shows have high IMDB ratings but are not viewed much because the audience is not aware of them or they are not advertised much. I created this data set to find the highest-rated shows in each category or in a particular genre.

  15. Data from: A Neural Approach for Text Extraction from Scholarly Figures

    • data.uni-hannover.de
    zip
    Updated Jan 20, 2022
    Cite
    TIB (2022). A Neural Approach for Text Extraction from Scholarly Figures [Dataset]. https://data.uni-hannover.de/dataset/a-neural-approach-for-text-extraction-from-scholarly-figures
    Available download formats: zip
    Dataset updated
    Jan 20, 2022
    Dataset authored and provided by
    TIB
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    A Neural Approach for Text Extraction from Scholarly Figures

    This is the readme for the supplemental data for our ICDAR 2019 paper.

    You can read our paper via IEEE here: https://ieeexplore.ieee.org/document/8978202

    If you found this dataset useful, please consider citing our paper:

    @inproceedings{DBLP:conf/icdar/MorrisTE19,
     author  = {David Morris and
            Peichen Tang and
            Ralph Ewerth},
     title   = {A Neural Approach for Text Extraction from Scholarly Figures},
     booktitle = {2019 International Conference on Document Analysis and Recognition,
            {ICDAR} 2019, Sydney, Australia, September 20-25, 2019},
     pages   = {1438--1443},
     publisher = {{IEEE}},
     year   = {2019},
     url    = {https://doi.org/10.1109/ICDAR.2019.00231},
     doi    = {10.1109/ICDAR.2019.00231},
     timestamp = {Tue, 04 Feb 2020 13:28:39 +0100},
     biburl  = {https://dblp.org/rec/conf/icdar/MorrisTE19.bib},
     bibsource = {dblp computer science bibliography, https://dblp.org}
    }
    

    This work was financially supported by the German Federal Ministry of Education and Research (BMBF) and European Social Fund (ESF) (InclusiveOCW project, no. 01PE17004).

    Datasets

    We used different sources of data for testing, validation, and training. Our testing set was assembled by Böschen et al. in the work we cited. We excluded the DeGruyter dataset and used it as our validation dataset.

    Testing

    These datasets contain a readme with license information. Further information about the associated project can be found in the authors' published work we cited: https://doi.org/10.1007/978-3-319-51811-4_2

    Validation

    The DeGruyter dataset does not include the labeled images due to license restrictions. As of writing, the images can still be downloaded from DeGruyter via the links in the readme. Note that depending on what program you use to strip the images out of the PDF they are provided in, you may have to re-number the images.

    Training

    We used label_generator's generated dataset, which the author made available on a requester-pays Amazon S3 bucket. We also used the Multi-Type Web Images dataset, which is mirrored here.

    Code

    We have made our code available in code.zip. We will upload code, announce further news, and field questions via the github repo.

    Our text detection network is adapted from Argman's EAST implementation. The EAST/checkpoints/ours subdirectory contains the trained weights we used in the paper.

    We used a Tesseract script to run text extraction on detected text rows. It is included in code.tar as text_recognition_multipro.py.

    We used a Java program provided by Falk Böschen, adapted to our file structure. We included this as evaluator.jar.

    Parameter sweeps are automated by param_sweep.rb. This file also shows how to invoke all of these components.

  16. 70,000 Active buyer Email list ( from Amazon & ebay ) for Market

    • dataandsons.com
    csv, zip
    Updated Dec 12, 2020
    + more versions
    Cite
    boobxff.blogspot.com (2020). 70,000 Active buyer Email list ( from Amazon & ebay ) for Market [Dataset]. https://www.dataandsons.com/categories/product-lists/70-000-active-buyer-email-list-from-amazon-and-ebay-for-market
    Available download formats: zip, csv
    Dataset updated
    Dec 12, 2020
    Dataset provided by
    Authors
    boobxff.blogspot.com
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    About this Dataset

    You will get an active email list of real buyers who make regular purchases through Amazon and other e-commerce sites. This email list contains 100% original email addresses. You can also use these emails to increase visits to your website, blog, or YouTube channel. I now offer you a great treasure to use whenever you want.

    So don't waste your time and start boosting your ecommerce business online.

    The buyers will be from:

    United States of America, Canada, European Union

    • There are no duplicate emails
    • No fake IDs
    • Audiences ready to buy

    Category

    Product Lists

    Keywords

    email marketing,emails,Email List,buyer

    Row Count

    70150

    Price

    $90.00

  17. Data for Implementing Deep Soil and Dynamic Root Uptake in Noah-MP (v4.5):...

    • databank.illinois.edu
    • investigacion.usc.gal
    • +1more
    Updated Mar 19, 2025
    Cite
    Carolina A. Bieri; Francina Dominguez; Gonzalo Miguez-Macho; Ying Fan (2025). Data for Implementing Deep Soil and Dynamic Root Uptake in Noah-MP (v4.5): Impact on Amazon Dry-Season Transpiration [Dataset]. http://doi.org/10.13012/B2IDB-8777292_V1
    Dataset updated
    Mar 19, 2025
    Authors
    Carolina A. Bieri; Francina Dominguez; Gonzalo Miguez-Macho; Ying Fan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    U.S. National Science Foundation (NSF)
    Description

    This repository includes HRLDAS Noah-MP model output generated as part of Bieri et al. (2025) - Implementing deep soil and dynamic root uptake in Noah-MP (v4.5): Impact on Amazon dry-season transpiration. These data are distributed in two different formats: Raw model output files and subsetted files that include data for a specific variable. All files are .nc format (NetCDF) and aggregated into .tar files to facilitate download. Given the size of these datasets, Globus transfer is the best way to download them.

    Raw model output for four model experiments is available: FD (control), GW, SOIL, and ROOT. See the associated publication for information on the different experiments. These data span an approximately 20 year period from 01 Jun 2000 to 31 Dec 2019. The data have a spatial resolution of 4 km and a temporal frequency of 3 hours. These data are for a domain in the southern Amazon basin (see Figure 1 in the associated publication). Data for each experiment is available as a .tar file which includes 3-hourly NetCDF files. All default Noah-MP output variables are included in each file. As a result, the .tar files are quite large and may take many hours or even days to transfer depending on your network speed and local configurations. These files are named 'noahmp_output_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).

    Subsetted model output at a daily temporal resolution for all four model experiments is also available. These .tar files include the following variables: water table depth (ZWT), latent heat flux (LH), sensible heat flux (HFX), soil moisture (SOIL_M), canopy evaporation (ECAN), ground evaporation (EDIR), transpiration (ETRAN), rainfall rate at the surface (QRAIN), and two variables that are specific to the ROOT experiment: ROOTACTIVITY (root activity function) and GWRD (active root water uptake depth). There is one file for each variable within the tarred files. These files are named 'noahmp_output_subset_2000_2019_EXP.tar', where EXP is the name of the experiment (FD, GW, SOIL, or ROOT).

    Finally, there is a sample dataset with raw 3-hourly output from the ROOT experiment for one day. The purpose of this sample dataset is to allow users to confirm if these data meet their needs before initiating a full transfer via Globus. This file is named 'noahmp_output_sample_ROOT.tar'.

    The README.txt file provides information on the Noah-MP output variables in these datasets, among other specifications. Information on HRLDAS Noah-MP and names/definitions of model output variables that are useful in working with these data are available here: http://dx.doi.org/10.5065/ew8g-yr95. Note that some output variables may be listed in this document under a different variable name, so searching for the long name (e.g. 'baseflow' instead of 'QRF') is recommended. Information on additional output variables that were added to the model as part of this study is available here: https://github.com/bieri2/bieri-et-al-2025-EGU-GMD/tree/DynaRoot. Model code, configuration files, and forcing data used to carry out the model simulations are linked in the related resources section.
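The file-naming scheme described above can be sketched in Python. This is a minimal sketch, assuming the .tar files have already been downloaded to a local directory; the helper names `archive_name` and `list_netcdf_members` are hypothetical and are not part of the dataset or of Noah-MP.

```python
import tarfile

# Experiment names as given in the dataset description.
EXPERIMENTS = ("FD", "GW", "SOIL", "ROOT")


def archive_name(experiment: str, subset: bool = False) -> str:
    """Build the .tar file name for one experiment (hypothetical helper)."""
    if experiment not in EXPERIMENTS:
        raise ValueError(f"unknown experiment: {experiment!r}")
    middle = "subset_" if subset else ""
    return f"noahmp_output_{middle}2000_2019_{experiment}.tar"


def list_netcdf_members(tar_path: str) -> list[str]:
    """Return the names of the .nc files inside a tar archive, without extracting."""
    with tarfile.open(tar_path) as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".nc")
        )
```

Listing the members of the small sample archive first (for example `list_netcdf_members("noahmp_output_sample_ROOT.tar")`, once it has been downloaded) is one way to check the file layout before starting a large transfer.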

  18. Data from: SAFARI 2000 Global Historical Climatology Network, V. 1,...

    • data.nasa.gov
    • search.dataone.org
    • +6more
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). SAFARI 2000 Global Historical Climatology Network, V. 1, 1874-1990 [Dataset]. https://data.nasa.gov/dataset/safari-2000-global-historical-climatology-network-v-1-1874-1990-6639a
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This data set consists of a southern African subset of the Global Historical Climatology Network (GHCN) Version 1 database. All stations with the following bounding coordinates are included in this subset: 5W - 60E and 5N - 35S. There are three files available, one each for precipitation, temperature, and pressure data. Within this subset the oldest data date from 1874 and the most recent from 1990.

    The GHCN V1 database contains monthly temperature, precipitation, sea-level pressure, and station-pressure data for thousands of meteorological stations worldwide. The database was compiled from pre-existing national, regional, and global collections of data as part of the Global Historical Climatology Network (GHCN) project, the goal of which is to produce, maintain and make available a comprehensive global surface baseline climate data set for monitoring climate and detecting climate change. It contains data from roughly 6000 temperature stations, 7500 precipitation stations, 1800 sea-level pressure stations, and 1800 station-pressure stations. Each station has at least 10 years of data; 40% have more than 50 years of data. Spatial coverage is good over most of the globe, particularly for the United States and Europe. Data gaps are evident over the Amazon rainforest, the Sahara desert, Greenland, and Antarctica. The earliest station data are from 1697; the most recent are from 1990.

    The database was created from 15 source data sets including: The National Climatic Data Center's (NCDC's) World Weather Records, CAC's Climate Anomaly Monitoring System (CAMS), NCAR's World Monthly Surface Station Climatology, CIRES' (Eischeid/Diaz) Global precipitation data set, P. Jones' Temperature data base for the world, and S. Nicholson's African precipitation database.

    Quality Control of the database included visual inspection of graphs of all station time series, tests for precipitation digitized 6 months out of phase, tests for different stations having identical data, and other tests. This detailed analysis has revealed that most stations (95% for temperature and precipitation, 75% for pressure) contain high-quality data. However, gross data-processing errors (e.g., keypunch problems) and discontinuous inhomogeneities (e.g., station relocations and instrumentation changes) do characterize a small number of stations. All major data processing problems have been flagged (or corrected, when possible). Similarly, all major inhomogeneities have been flagged, although no homogeneity corrections were applied. More information can be found at: ftp://daac.ornl.gov/data/safari2k/climate_meteorology/ghcn/comp/ghcn_v1_readme.pdf.
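The bounding coordinates quoted in the description (5W - 60E and 5N - 35S) can be expressed as a simple membership test. This is a minimal sketch, assuming station coordinates in signed decimal degrees (west and south negative); that convention and the function name `in_subset` are assumptions, and the GHCN files themselves may encode coordinates differently.

```python
# Bounding box of the southern African subset, from the description:
# 5W - 60E and 5N - 35S, written here in signed decimal degrees (assumed).
WEST, EAST = -5.0, 60.0
SOUTH, NORTH = -35.0, 5.0


def in_subset(lat: float, lon: float) -> bool:
    """Return True if a station lies inside the subset's bounding box."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


# Illustrative coordinates only, not taken from the dataset.
print(in_subset(-25.7, 28.2))  # a point in southern Africa -> True
print(in_subset(51.5, -0.1))   # a point in western Europe -> False
```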

  19. Goodreads Book Reviews

    • cseweb.ucsd.edu
    json
    Cite
    UCSD CSE Research Project, Goodreads Book Reviews [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Available download formats: json
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user interaction, ranging from adding to a shelf to rating and reading.

    Metadata includes

    • reviews

    • add-to-shelf, read, review actions

    • book attributes: title, isbn

    • graph of similar books

    Basic Statistics:

    • Items: 1,561,465

    • Users: 808,749

    • Interactions: 225,394,930
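The basic statistics above imply some simple derived figures, such as the average number of interactions per user and the fraction of all possible user-item pairs that were observed. The arithmetic can be sketched as follows; the counts are taken from the listing, while the variable names and the choice of derived quantities are illustrative assumptions.

```python
# Counts quoted in the "Basic Statistics" list above.
items = 1_561_465
users = 808_749
interactions = 225_394_930

# Derived quantities (illustrative; not reported by the dataset authors).
per_user = interactions / users            # mean interactions per user
per_item = interactions / items            # mean interactions per item
density = interactions / (users * items)   # share of user-item pairs observed

print(f"interactions per user: {per_user:.1f}")
print(f"interactions per item: {per_item:.1f}")
print(f"interaction matrix density: {density:.6f}")
```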

  20. 70,000 Active buyer email list from Amazon & ebay for #Email_marketing

    • dataandsons.com
    csv, zip
    Updated Dec 12, 2020
    + more versions
    Cite
    boobxff.blogspot.com (2020). 70,000 Active buyer email list from Amazon & ebay for #Email_marketing [Dataset]. https://www.dataandsons.com/categories/markets/70-000-active-buyer-email-list-from-amazon-and-ebay-for-email-marketing
    Available download formats: zip, csv
    Dataset updated
    Dec 12, 2020
    Dataset provided by
    Authors
    boobxff.blogspot.com
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    About this Dataset

    You will get an active email list of real buyers who make regular purchases through Amazon and other e-commerce sites. This email list contains 100% original email addresses. You can also use these emails to increase visits to your website, blog, or YouTube channel. I now offer you a great treasure to use whenever you want.

    So don't waste your time and start boosting your ecommerce business online.

    The buyers will be from:

    United States of America, Canada, European Union

    • There are no duplicate emails
    • No fake IDs
    • Audiences ready to buy

    Category

    Markets

    Keywords

    market,emails,email ma,list,buyer

    Row Count

    70150

    Price

    $90.00

Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/

Amazon revenue 2004-2024

83 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States, Worldwide