100+ datasets found
  1. Amazon Statistics (2025)

    • businessofapps.com
    Updated Jul 20, 2025
    Cite
    Business of Apps (2025). Amazon Statistics (2025) [Dataset]. https://www.businessofapps.com/data/amazon-statistics/
    Dataset updated
    Jul 20, 2025
    Dataset authored and provided by
    Business of Apps
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Amazon is one of the most recognisable brands in the world, and the third largest by revenue. It was the fourth tech company to reach a $1 trillion market cap, and a market leader in e-commerce,...

  2. Amazon Sales Report 2022 Data Set

    • kaggle.com
    Updated Jul 11, 2025
    Cite
    Hasan23 (2025). Amazon Sales Report 2022 Data Set [Dataset]. https://www.kaggle.com/datasets/hasan23/amazon-sales-report-2022-data-set
    Dataset updated
    Jul 11, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hasan23
    Description

    Dataset

    This dataset was created by Hasan23

  3. amazon_revenue

    • kaggle.com
    Updated Mar 22, 2024
    Cite
    Phan Nguyễn Hữu Phong (2024). amazon_revenue [Dataset]. https://www.kaggle.com/datasets/phannguynhuphong/amazon-revenue/code
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Phan Nguyễn Hữu Phong
    Description

    Dataset

    This dataset was created by Phan Nguyễn Hữu Phong

  4. Datasets for Sentiment Analysis

    • zenodo.org
    csv
    Updated Dec 10, 2023
    Cite
    Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
    Available download formats: csv
    Dataset updated
    Dec 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Julie R. Campos Arias (repository creator)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.

    The datasets are listed below, along with their references, authors, and download sources.

    ----------- STS-Gold Dataset ----------------

    The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.

    Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.

    File name: sts_gold_tweet.csv

    ----------- Amazon Sales Dataset ----------------

    This dataset contains ratings and reviews for 1K+ Amazon products, as listed on the official Amazon website. The data was scraped from that website in January 2023.

    Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)

    Features:

    • product_id - Product ID
    • product_name - Name of the Product
    • category - Category of the Product
    • discounted_price - Discounted Price of the Product
    • actual_price - Actual Price of the Product
    • discount_percentage - Percentage of Discount for the Product
    • rating - Rating of the Product
    • rating_count - Number of people who voted for the Amazon rating
    • about_product - Description of the Product
    • user_id - ID of the user who wrote the review for the Product
    • user_name - Name of the user who wrote the review for the Product
    • review_id - ID of the user review
    • review_title - Short review
    • review_content - Long review
    • img_link - Image Link of the Product
    • product_link - Official Website Link of the Product

    License: CC BY-NC-SA 4.0

    File name: amazon.csv

    ----------- Rotten Tomatoes Reviews Dataset ----------------

    This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.

    This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).

    Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics

    File name: data_rt.csv
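
    Because the rows are sorted by class, the shuffling step mentioned above can be sketched as follows. This is a minimal illustration using only the Python standard library; the column names reviews and labels come from the description, while the helper name shuffle_rows, the seed, and the tiny in-memory sample standing in for data_rt.csv are assumptions made for the example.

```python
import csv
import io
import random


def shuffle_rows(csv_text: str, seed: int = 0) -> list[dict]:
    """Read CSV text with a header row and return its rows in shuffled order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # A seeded Random instance keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(rows)
    return rows


# Hypothetical stand-in for data_rt.csv: negatives (0) first, positives (1) last.
sample = (
    "reviews,labels\n"
    + "".join(f"negative sentence {i},0\n" for i in range(5))
    + "".join(f"positive sentence {i},1\n" for i in range(5))
)

shuffled = shuffle_rows(sample, seed=42)
for row in shuffled[:3]:
    print(row["labels"], row["reviews"])
```

    Shuffling before any train/test split matters here: an unshuffled split by position would place only one class in each partition.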

    ----------- Preprocessed Dataset Sentiment Analysis ----------------

    Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in
    Stemmed and lemmatized using nltk.
    Sentiment labels are generated using TextBlob polarity scores.

    The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).

    DOI: 10.34740/kaggle/dsv/3877817

    Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }

    This dataset was used in the experimental phase of my research.

    File name: EcoPreprocessed.csv

    ----------- Amazon Earphones Reviews ----------------

    This dataset consists of 9930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)

    License: U.S. Government Works

    Source: www.amazon.in

    File name (original): AllProductReviews.csv (contains 14337 reviews)

    File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
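
    The description says the division column is a categorical label derived from ReviewStar, but it does not state the mapping. The sketch below shows one common way such a label could be derived. The thresholds, the label names, and the function name division_from_stars are all assumptions for illustration, not the author's actual rule.

```python
def division_from_stars(stars: int) -> str:
    """Map a 1-5 star rating to a coarse sentiment label.

    Hypothetical thresholds: 4-5 -> positive, 3 -> neutral, 1-2 -> negative.
    The source dataset does not document the rule it actually used.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


# Example rows using the column names listed in the description.
reviews = [
    {"ReviewTitle": "Great sound", "ReviewStar": 5},
    {"ReviewTitle": "Average", "ReviewStar": 3},
    {"ReviewTitle": "Stopped working", "ReviewStar": 1},
]
for review in reviews:
    review["division"] = division_from_stars(review["ReviewStar"])
```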

    ----------- Amazon Musical Instruments Reviews ----------------

    This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).

    Source: http://jmcauley.ucsd.edu/data/amazon/

    File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)

    File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)

  5. amazon best selling

    • kaggle.com
    Updated Apr 8, 2022
    Cite
    Raneem Oqaily (2022). amazon best selling [Dataset]. https://www.kaggle.com/datasets/raneemoqaily/amazon-best-selling/data
    Dataset updated
    Apr 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Raneem Oqaily
    Description

    Dataset

    This dataset was created by Raneem Oqaily

  6. U.S. trust in tech companies with personal data 2021

    • statista.com
    Updated Jun 23, 2025
    Cite
    Statista (2025). U.S. trust in tech companies with personal data 2021 [Dataset]. https://www.statista.com/statistics/800764/trust-tech-companies-keep-personal-data-secure-private/
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2021
    Area covered
    United States
    Description

    As of November 2021 in the United States, ** percent of surveyed participants said that they trusted Amazon to handle their personal data, whereas ** percent said they distrusted the service with their information. Overall, ** percent of respondents said that they did not trust Facebook with their private data, and ** percent said they did not trust TikTok with such information. Just under half of all respondents stated that they trusted Google and ** percent trusted Microsoft.

  7. amazon

    • kaggle.com
    Updated Sep 12, 2021
    Cite
    Abhilash Datta (2021). amazon [Dataset]. https://www.kaggle.com/datasets/abhilashdatta/amazon/data
    Dataset updated
    Sep 12, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Abhilash Datta
    Description

    Dataset

    This dataset was created by Abhilash Datta

  8. Amazon Prime Member Annual Spending Data 2019-2024

    • redstagfulfillment.com
    html
    Updated May 19, 2025
    Cite
    Red Stag Fulfillment (2025). Amazon Prime Member Annual Spending Data 2019-2024 [Dataset]. https://redstagfulfillment.com/average-annual-spend-of-an-amazon-prime-member/
    Available download formats: html
    Dataset updated
    May 19, 2025
    Dataset authored and provided by
    Red Stag Fulfillment
    Time period covered
    2019 - 2024
    Area covered
    United States
    Variables measured
    Prime Day spending averages, Annual Prime member spending, Demographic spending patterns, Annual non-Prime customer spending, Prime membership penetration rates
    Description

    Comprehensive dataset tracking Amazon Prime member spending patterns from 2019-2024, including comparison with non-Prime customers and demographic breakdowns

  9. Amazon mobile audience share in the UK 2024, by age group

    • statista.com
    Updated Jul 18, 2025
    Cite
    Statista (2025). Amazon mobile audience share in the UK 2024, by age group [Dataset]. https://www.statista.com/statistics/1308619/uk-amazon-mobile-audience-by-age-group/
    Dataset updated
    Jul 18, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2024
    Area covered
    United Kingdom
    Description

    In the United Kingdom (UK), data from November 2024 based on geolocalized users showed that more than half of Amazon's mobile audience comprises people aged between 25 and 34. Younger adults from 18 to 24 years old account for the second-biggest group of mobile users in the UK, with nearly ** percent.

  10. Amazon revenue 2004-2024

    • statista.com
    Updated Jun 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, United States
    Description

    From 2004 to 2024, the net revenue of Amazon e-commerce and service sales has increased tremendously. In the fiscal year ending December 31, the multinational e-commerce company's net revenue was almost *** billion U.S. dollars, up from *** billion U.S. dollars in 2023.

    Amazon.com, a U.S. e-commerce company originally founded in 1994, is the world’s largest online retailer of books, clothing, electronics, music, and many more goods. As of 2024, the company generates the majority of its net revenues through online retail product sales, followed by third-party retail seller services, cloud computing services, and retail subscription services including Amazon Prime.

    From seller to digital environment

    Through Amazon, consumers are able to purchase goods at a rather discounted price from both small and large companies as well as from other users. Both new and used goods are sold on the website. Due to the wide variety of goods available at prices which often undercut local brick-and-mortar retail offerings, Amazon has dominated the retailer market. As of 2024, Amazon’s brand worth amounts to over *** billion U.S. dollars, topping the likes of companies such as Walmart, Ikea, as well as digital competitors Alibaba and eBay. One of Amazon's first forays into the world of hardware was its e-reader Kindle, one of the most popular e-book readers worldwide. More recently, Amazon has also released several series of own-branded products and a voice-controlled virtual assistant, Alexa.

    Headquartered in North America

    Due to its location, Amazon offers more services in North America than worldwide. As a result, the majority of the company’s net revenue in 2023 was actually earned in the United States, Canada, and Mexico. In 2023, approximately *** billion U.S. dollars was earned in North America compared to only roughly *** billion U.S. dollars internationally.

  11. amazon

    • kaggle.com
    Updated Apr 11, 2023
    Cite
    sharegpt (2023). amazon [Dataset]. https://www.kaggle.com/datasets/sharegpt/amazon
    Dataset updated
    Apr 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    sharegpt
    Description

    Dataset

    This dataset was created by sharegpt

  12. Amazon employees 2007-2024

    • statista.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Amazon employees 2007-2024 [Dataset]. https://www.statista.com/statistics/234488/number-of-amazon-employees/
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, United States
    Description

    The combined number of full- and part-time employees of Amazon.com has increased significantly since 2017. Amazon’s headcount peaked in 2021 when the American multinational e-commerce company employed ********* full- and part-time employees, not counting external contractors. However, in 2024, the number dropped to *********.

    E-commerce crunch

    The workforce reduction of Amazon follows the mass layoffs hitting the entire e-commerce sector. With the full reopening of physical stores after the COVID-19 pandemic, online shopping demand decreased, leading online retailers to restructure their businesses, including personnel costs.

    Diversifying business

    With online retail sales growing slower due to recession and inflation, Amazon can still leverage other profitable revenue segments, from media subscriptions to server hosting and cloud services. On top of that, in 2023 Amazon monitored small enterprises operating in different fields and strategically invested in them, as disclosed startup acquisitions indicate.

  13. Static analysis of Yelp and Amazon data sets.

    • plos.figshare.com
    xls
    Updated May 30, 2025
    Cite
    An Tong; Bochao Chen; Zhe Wang; Jiawei Gao; Chi Kin Lam (2025). Static analysis of Yelp and Amazon data sets. [Dataset]. http://doi.org/10.1371/journal.pone.0322004.t005
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    An Tong; Bochao Chen; Zhe Wang; Jiawei Gao; Chi Kin Lam
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the number of telecom frauds has increased significantly, causing substantial losses to people’s daily lives. With technological advancements, telecom fraud methods have also become more sophisticated, making fraudsters harder to detect as they often imitate normal users and exhibit highly similar features. Traditional graph neural network (GNN) methods aggregate the features of neighboring nodes, which makes it difficult to distinguish between fraudsters and normal users when their features are highly similar. To address this issue, we proposed a spatio-temporal graph attention network (GDFGAT) with feature difference-based weight updates. We conducted comprehensive experiments on our method on a real telecom fraud dataset. Our method obtained an accuracy of 93.28%, f1 score of 92.08%, precision rate of 93.51%, recall rate of 90.97%, and AUC value of 94.53%. The results showed that our method (GDFGAT) is better than the classical method, the latest methods and the baseline model in many metrics; each metric improved by nearly 2%. In addition, we also conducted experiments on the imbalanced datasets: Amazon and YelpChi. The results showed that our model GDFGAT performed better than the baseline model in some metrics.

  14. ‘FAANG- Complete Stock Data’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘FAANG- Complete Stock Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-faang-complete-stock-data-36c1/9110ef3b/?iid=011-763&v=presentation
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘FAANG- Complete Stock Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/aayushmishra1512/faang-complete-stock-data on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    There are a few companies that are considered to be revolutionary. These companies also happen to be a dream place to work for many, many people across the world. These companies include Facebook, Amazon, Apple, Netflix and Google, also known as FAANG! These companies make a ton of money, and they help others too by giving them a chance to invest in the companies via stocks and shares. This data was made targeting these stock prices.

    Content

    The data contains information such as opening price of a stock, closing price, how much of these stocks were sold and many more things. There are 5 different CSV files in the data for each company.

    --- Original source retains full ownership of the source dataset ---

  15. The Global Information Services market size was USD 140.9 billion in 2022!

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Feb 20, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cognitive Market Research (2024). The Global Information Services market size was USD 140.9 billion in 2022! [Dataset]. https://www.cognitivemarketresearch.com/information-services-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Feb 20, 2024
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global market for Information Services was USD 140.9 billion in 2022 and will grow at a 7.80% CAGR from 2023 to 2030.

    Market Dynamics of Information Services Market

    Key Drivers for Information Services Market

    • Data generation is expanding exponentially: The digital transformation across industries has produced massive quantities of structured and unstructured data, which has increased the need for data processing and analytics services. Information services are essential for organizations to extract practical knowledge from huge datasets. Cloud computing supports real-time analysis and scalable data storage.
    • Risk management and regulatory compliance needs: Businesses are now compelled to use specialized information services due to increased data privacy legislation (GDPR, CCPA) and financial reporting standards. Demand for compliance is driven by industries such as healthcare, finance, and the law. Third-party providers are knowledgeable about how regulations are changing.
    • Integration of AI and automation: The speed and correctness of information services are increased by the integration of sophisticated analytics, machine learning, and natural language processing. Automated data curation and predictive modeling lessen manual labor while enhancing decision-making.

    Key Restraints for Information Services Market

    • Worries about data security and privacy: High-profile breaches and misuse of personal data undermine consumer trust in information service companies. High operational costs result from stringent cybersecurity safeguards and encryption protocols. Cross-border data transfer limitations make it harder to provide services globally.
    • Market fragmentation and strong competition: Low entry barriers for simple data services result in oversaturation in some areas. As suppliers compete on price rather than value-added features, differentiation becomes more difficult.
    • Reliance on third-party data sources: The dependability of services is impacted by the inconsistent data quality from outside vendors. Proprietary datasets' licensing fees lower the profit margins of information service companies.

    Key Trends for Information Services Market

    • Industry-specific solutions: Targeted niche information services for sectors like healthcare (clinical trial data) or supply chain (IoT sensor analytics) are gaining popularity. Higher-value knowledge is produced by combining domain expertise with data science.
    • Real-time data delivery: Providers are switching from static reports to dynamic dashboards and streaming analytics. Edge computing allows for quicker processing for time-sensitive applications like financial trading or fraud detection.
    • Ethical AI and open data sourcing: Increasingly, socially conscious firms are asking for auditable algorithms and unbiased datasets. Providers are implementing fair data acquisition strategies and explainable AI frameworks.

    Introduction of Information Services

    Information systems are a collection of interconnected components that are used to capture, process, save, and disseminate various sorts of data for people to view and utilize. Businesses and consumers can choose from a variety of services offered by the information services market. These services might range from analytics tools and cloud-based storage to data management services and cybersecurity solutions. The market is being driven by an increase in the demand for these services as businesses search for fresh ways to use technology to spur development and innovation.

    For instance, Amazon Web Services (AWS) offers a variety of cloud-based services, such as data storage and analysis tools. AWS provides a number of storage solutions, such as object storage, block storage, and file storage, as well as data analysis and machine learning capabilities. These services enable businesses to store and analyze massive volumes of data in the cloud, making it more accessible and usable for a wide range of applications.

    (Source: docs.aws.amazon.com/whitepapers/latest/aws-overview/storage-services.html)

  16. first-impressions-v2

    • huggingface.co
    Cite
    Yeray, first-impressions-v2 [Dataset]. https://huggingface.co/datasets/yeray142/first-impressions-v2
    Authors
    Yeray
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for First Impressions V2

    The first impressions data set comprises 10000 clips (average duration 15s) extracted from more than 3,000 different YouTube high-definition (HD) videos of people facing and speaking in English to a camera. The videos are split into training, validation and test sets with a 3:1:1 ratio. People in the videos vary in gender, age, nationality, and ethnicity. Videos are labeled with personality trait variables. Amazon Mechanical Turk (AMT) was… See the full description on the dataset page: https://huggingface.co/datasets/yeray142/first-impressions-v2.

  17. DASCH DR7 Digital Inventory

    • zenodo.org
    zip
    Updated Dec 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Peter K. G. Williams (2024). DASCH DR7 Digital Inventory [Dataset]. http://doi.org/10.5281/zenodo.14563521
    Available download formats: zip
    Dataset updated
    Dec 27, 2024
    Dataset provided by
    Harvard College Observatory (http://www.cfa.harvard.edu/hco)
    Authors
    Peter K. G. Williams
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These files define a "digital inventory" of all of the files archived as part of DASCH Data Release 7 (DR7). DASCH (Digital Access to a Sky Century @ Harvard) was the project to digitize the Harvard College Observatory’s Astronomical Photographic Glass Plate Collection for scientific applications. This irreplaceable resource provides a means for systematic study of the sky on 100-year time scales.

    This inventory does not contain the actual DASCH data. Rather, it contains an exhaustive index of all of the DASCH data — virtually all aspects of DASCH's digital existence throughout the project's entire history, up through the DR7 release date (December, 2024). The complete inventory documents 33,791,530 files totaling 745,627,062,858,355 bytes (around 678 TiB) of data. The inventory itself is about 10 GiB in size (decompressed), spread across 3,946 files.
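
    The size figures quoted above can be cross-checked with a little arithmetic. The sketch below converts the stated byte total to TiB (1 TiB = 2^40 bytes) and derives a mean file size from the stated file count; the variable names are illustrative only, and the mean is a derived quantity, not a figure given in the description.

```python
# Figures quoted in the description.
TOTAL_BYTES = 745_627_062_858_355
FILE_COUNT = 33_791_530

BYTES_PER_MIB = 2 ** 20
BYTES_PER_TIB = 2 ** 40

# Total archive size in binary terabytes (TiB).
total_tib = TOTAL_BYTES / BYTES_PER_TIB

# Mean file size in MiB, derived from the two quoted figures.
mean_file_mib = TOTAL_BYTES / FILE_COUNT / BYTES_PER_MIB

print(f"total: {total_tib:.1f} TiB")
print(f"mean file size: {mean_file_mib:.1f} MiB")
```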

    The actual underlying data are currently archived in a set of Amazon AWS S3 buckets and magnetic tapes held by Harvard College Observatory. Most DASCH users are encouraged to access DASCH data via the project's data access services; this inventory should only be of interest to those interested in large-scale duplication of the DASCH data.

    The DASCH archive, which is indexed by this inventory, includes:

    • Full-plate "mosaic" FITS images of more than 428,000 plates, as well as photographs of the plates and their jackets
    • Astrometric solution data for about 97% of the plates
    • Photometric calibration data for about 89% of the plates
    • Lightcurves for all sources extracted from the plates, matched to two separate reference catalogs:
      • 23,574,404,199 measurements calibrated to the APASS DR8 catalog
      • 27,966,413,880 measurements calibrated to the ATLAS-refcat2 catalog
    • About 166,000 photographs of observing logbooks documenting the plates, and a selection of historical astronomer notebooks discussing them
    • Derived products, generated from the above, needed to operate the DASCH data access services
    • Raw "tile" data from two decades of DASCH scanning, as well as supporting calibration and telemetry files
    • All of the source code behind the DASCH software systems, from scanning to pipeline processing to data access services to end-user analysis
    • Logs relating to all modern DASCH pipeline processing, data management, and other operations tasks
    • All available project documentation
    • All other data files supporting DASCH operations

    See the README.md file within the collection for more information about the structure and contents of this inventory. In summary, it organizes the DASCH data files into a virtual hierarchy of names. Associated with each name is a size (in bytes), MD5 digest, and one or more "data URLs" recording locations where that file is archived as of DR7. Every single file has a data URL indicating a location on Amazon's AWS S3 storage service; many files also have one or more copies on magnetic backup tapes held by Harvard College Observatory.

    The inventory is expressed as a collection of plain-text (UTF-8) files using Markdown syntax. There is approximately one such file for each "folder" or "subtree" of the virtual name hierarchy. Each file contains a human-readable preamble describing the folder contents, an optional Markdown table listing any direct-descendant subfolders, and an optional Markdown table documenting any files contained directly within that folder. The intention is that it should be fairly straightforward both for humans to navigate these files and for software to process them. While most files are human-scale in size, the largest (Inventory.pipeline_astrometry.md) is about 280 MiB and contains about 1.5 million records.
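
    As an illustration of the kind of software the description has in mind, the sketch below parses a simple pipe-delimited Markdown table into records. The exact column layout of the real inventory files is documented in the collection's README.md and is not reproduced here, so the column names (name, size, md5), the sample rows, and the function names are hypothetical. The parser also assumes no escaped pipe characters inside cells.

```python
def split_row(line: str) -> list[str]:
    """Split one Markdown table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_markdown_table(lines: list[str]) -> list[dict]:
    """Parse the first pipe-delimited Markdown table found in `lines`.

    Assumes the usual layout: header row, |---| separator row, data rows.
    Lines that do not start with "|" (such as a preamble) are ignored.
    """
    rows = [ln for ln in lines if ln.strip().startswith("|")]
    if len(rows) < 2:
        return []
    header = split_row(rows[0])
    # rows[1] is the separator line and carries no data.
    return [dict(zip(header, split_row(ln))) for ln in rows[2:]]


# Hypothetical inventory fragment; the real column names may differ.
sample = """
Preamble describing this folder.

| name | size | md5 |
|------|------|-----|
| a.fits | 1024 | d41d8cd98f00b204e9800998ecf8427e |
| b.fits | 2048 | 0cc175b9c0f1b6a831c399e269772661 |
""".splitlines()

records = parse_markdown_table(sample)
total_size = sum(int(r["size"]) for r in records)
```

    For the largest inventory file (about 280 MiB), reading line by line from an open file handle rather than loading the whole text would likely be preferable.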

    As of the DR7 release, only some DASCH archive files are directly accessible by third parties. The Starglass website (https://starglass.cfa.harvard.edu/) makes many photographs and "mosaics" (full-plate FITS images) available, and the web APIs supporting this site and the DASCH data access services (see the DASCH site, https://dasch.cfa.harvard.edu/) provide access to additional resources. To duplicate other portions of the archive, you may need to contact Harvard College Observatory. It is hoped that over time, more and more of the DASCH archive will become available for direct download. It is also hoped that additional copies of the DASCH archive will be created and publicized; the best way to ensure the long-term preservation of this dataset is to duplicate it. A major goal of this inventory is to make such duplication tractable.

    To the greatest extent possible, it is believed that all of the files documented as part of this archive can be duplicated free of legal encumbrances. Unless documented otherwise, the copyright owner of all copyrightable elements is the President and Fellows of Harvard College. Please see the DASCH website for the most up-to-date guidance regarding image credits and any legal topics relating to this dataset.

    Acknowledgments

    The DASCH scanning project was the work of literally hundreds of people over multiple decades. Out of the many people who have devoted their time and energy to the project, the essential contributions of a few deserve special recognition: Prof. Jonathan (Josh) Grindlay; Bob Simcoe; Edward Los; Lindsay Smith Zrull; and Alison Doane.

    The DASCH project at Harvard is grateful for partial support from NSF grants AST-0407380, AST-0909073, and AST-1313370, which should be acknowledged in all papers making use of DASCH data.

    We acknowledge the one-time gift of the Cornel and Cynthia K. Sarosdy Fund for DASCH, and thank Grzegorz Pojmanski of the ASAS project for providing some of the source code on which the DASCH scientific data access portal was based.

    The ongoing AAVSO Photometric All-Sky Survey (APASS) has improved DASCH photometric calibration and is funded by the Robert Martin Ayers Sciences Fund.

    This inventory and DASCH Data Release 7 were prepared by Peter K. G. Williams in December 2024.

  18. E-commerce revenue worldwide 2017-2030, by segment

    • statista.com
    Updated Jun 3, 2025
    Cite
    Statista Research Department (2025). E-commerce revenue worldwide 2017-2030, by segment [Dataset]. https://www.statista.com/topics/871/online-shopping/
    Explore at:
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    Revenue in all segments is estimated to fluctuate significantly over the forecast period. In general, the indicator appears to exhibit a positive trend, with more segments increasing than decreasing in value through 2030. Among them, the Food segment attains the highest value throughout the entire period, reaching 1.23 trillion U.S. dollars. The Statista Market Insights cover a broad range of additional markets.

  19. Dataset for: The More Competent, the Better? The Effects of Perceived...

    • b2find.eudat.eu
    Updated Nov 29, 2022
    Cite
    (2022). Dataset for: The More Competent, the Better? The Effects of Perceived Competencies on Disclosure Towards Conversational Artificial Intelligence - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/3193477c-2599-5888-8276-cf7773a5d01b
    Explore at:
    Dataset updated
    Nov 29, 2022
    Description

    Conversational AI (e.g., Google Assistant or Amazon Alexa) is present in many people’s everyday life and, at the same time, becomes more and more capable of solving more complex tasks. However, it is unclear how the growing capabilities of conversational AI affect people’s disclosure towards the system as previous research has revealed mixed effects of technology competence. To address this research question, we propose a framework systematically disentangling conversational AI competencies along the lines of the dimensions of human competencies suggested by the action regulation theory. Across two correlational studies and three experiments (N total = 1453), we investigated how these competencies differentially affect users’ and non-users’ disclosure towards conversational AI. Results indicate that intellectual competencies (e.g., planning actions and anticipating problems) in a conversational AI heighten users’ willingness to disclose and reduce their privacy concerns. In contrast, meta-cognitive heuristics (e.g., deriving universal strategies based on previous interactions) raise privacy concerns for users and, even more so, for non-users but reduce willingness to disclose only for non-users. Thus, the present research suggests that not all competencies of a conversational AI are seen as merely positive, and the proposed differentiation of competencies is informative to explain effects on disclosure.

  20. Number of U.S. Amazon Prime subscribers 2013-2019

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Number of U.S. Amazon Prime subscribers 2013-2019 [Dataset]. https://www.statista.com/statistics/546894/number-of-amazon-prime-paying-members/
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 2013 - Dec 2019
    Area covered
    United States
    Description

    Amazon Prime is constantly growing in the United States: as of December 2019, there were an estimated *** million U.S. Amazon Prime subscribers, up from ** million in June 2018. On average, Amazon Prime members spent ***** U.S. dollars on the e-retail platform per year. March 2019 data also states that non-Prime members spent only *** U.S. dollars annually.

    Amazon Prime

    Amazon Prime is a paid subscription service offered by online retail platform Amazon. The subscription includes services such as music and video streaming, free two-day (or faster) shipping, as well as many other benefits. The program was launched in 2005 and is available internationally. In 2019, Amazon generated ***** billion U.S. dollars in revenues through its subscription services segment. Subscription services include not only Amazon Prime revenues, but also audiobook, e-book, digital video, digital music and other non-AWS subscription services.

    Prime shoppers

    The most popular product categories purchased by Amazon Prime shoppers in the United States were electronics, apparel, and home and kitchen goods. Amazon Prime shoppers are more engaged than non-members: during a February 2019 survey, 20 percent of Amazon Prime members stated that they shopped on Amazon a few times per week, with ***** percent saying that they did so on an (almost) daily basis.

Cite
Business of Apps (2025). Amazon Statistics (2025) [Dataset]. https://www.businessofapps.com/data/amazon-statistics/

Amazon Statistics (2025)

Explore at:
12 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jul 20, 2025
Dataset authored and provided by
Business of Apps
License

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically

Description

Amazon is one of the most recognisable brands in the world, and the third largest by revenue. It was the fourth tech company to reach a $1 trillion market cap, and a market leader in e-commerce,...
