MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is a Top 1000 Companies dataset, web-scraped from a website. It has 4 columns and 1,000 unique rows. The columns are the company name, the rating out of 5, the reviews given by employees, and overall data. Each row is a different company.
The dataset is good for beginner-level data analysis projects, company recommendation projects, and similar work.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset provided includes information about various companies, their stock symbols, financial metrics such as price-to-book ratio and share price, as well as details about their origin countries. Additionally, the dataset contains frequency distribution information for certain ranges of price-to-book ratios and share prices.
The dataset appears to be a compilation of financial data for different companies, likely for investment analysis or comparison purposes. It includes the following key components:
This dataset can be utilized for various financial analyses such as company valuation, comparison of financial metrics across companies, and investment decision-making.
mochi-skz/twt-kaggle-data dataset hosted on Hugging Face and contributed by the HF Datasets community
https://creativecommons.org/publicdomain/zero/1.0/
This synthetic dataset simulates procurement transactions for a mid-sized organization over 2024. It includes purchases across multiple categories (electronics, furniture, stationery, etc.) from various suppliers and buyers. Ideal for practicing descriptive analytics, spend analysis, and supplier performance evaluation.
Gholamreza/test-dataset-kaggle dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/odbl/
Date: 2022-07-10
Files: ner_dataset.csv
Source: Kaggle entity annotated corpus
Notes: The dataset only contains the tokens and NER tag labels. Labels are uppercase.
About Dataset
from Kaggle Datasets
Context
Annotated Corpus for Named Entity Recognition using GMB(Groningen Meaning Bank) corpus for entity classification with enhanced and popular features by Natural Language Processing applied to the data set. Tip: Use Pandas Dataframe to load dataset if using Python for… See the full description on the dataset page: https://huggingface.co/datasets/rjac/kaggle-entity-annotated-corpus-ner-dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Indian Company Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yogeshrampariya/indian-company-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset covers Indian recruiting companies that hire the best minds in India across software, banking, manufacturing, analysis, and education backgrounds.
Columns:
Name: Name of the company
Rating: Overall rating given by the rating agency
Reviews: Overall Google reviews
Company: Company type, such as Public or Private
Head_Quarters: Location of the company's headquarters
Company_age: How old the company is
No_of_Employee: Number of employees the company has
Your data will be in front of the world's largest data science community. What questions do you want to see answered?
This dataset will test your coding skills in handling categorical variables; it is a classification problem.
--- Original source retains full ownership of the source dataset ---
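The description of the Indian Company Dataset mentions handling categorical variables. A minimal sketch of one common approach, one-hot encoding the `Company` column, is shown below using only the standard library. The sample rows are invented for illustration and are not taken from the dataset; only the column names `Name` and `Company` come from the description.

```python
# Hypothetical sample rows: the values are invented, only the column
# names (Name, Company) come from the dataset description.
rows = [
    {"Name": "Alpha Ltd", "Company": "Public"},
    {"Name": "Beta Pvt", "Company": "Private"},
    {"Name": "Gamma Corp", "Company": "Public"},
]


def one_hot(records, column):
    """Return (categories, encoded), where encoded holds one 0/1 vector per record."""
    categories = sorted({r[column] for r in records})
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for r in records:
        vec = [0] * len(categories)
        vec[index[r[column]]] = 1  # mark the position of this record's category
        encoded.append(vec)
    return categories, encoded


categories, encoded = one_hot(rows, "Company")
print(categories)
print(encoded)
```

Each vector has a single 1 in the position of its category, which is the form most classifiers expect for unordered categories.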
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview:
This dataset provides a comprehensive look into the financial expenditures and profits of 50 startups based in the United States. It is an invaluable resource for analysts, economists, and business strategists seeking to understand the correlation between different types of spending and profitability in startup ventures.
Attributes:
1. R&D Spend:
   - Description: The amount of money each company has invested in Research and Development activities.
   - Data Type: Numeric (US dollars)
   - Importance: Indicates the company's commitment to innovation and technological advancement.
2. Administration:
   - Description: Expenditure on administrative functions and operations.
   - Data Type: Numeric (US dollars)
   - Relevance: Reflects the overhead costs associated with managing the company.
3. Marketing Spend:
   - Description: Investment in marketing and promotional activities.
   - Data Type: Numeric (US dollars)
   - Significance: A key factor in revenue generation and market penetration.
4. State:
   - Description: The U.S. state where the company is operating.
   - Data Type: Categorical (California, New York, or Florida)
   - Purpose: Provides geographical context and allows for regional analysis.
5. Profit:
   - Description: The net profit earned by the company.
   - Data Type: Numeric (US dollars)
   - Utility: A direct measure of the company’s financial success.
Potential Uses:
- Business Analysis: Understanding how different types of spending (R&D, administration, marketing) affect profitability.
- Regional Studies: Examining the impact of geographical location on business success.
- Startup Growth: Insights into the financial practices of successful startups.
- Economic Research: Data-driven study of the startup ecosystem in the U.S.
Target Audience:
- Business Analysts and Economists
- Marketing Strategists
- Startup Consultants
- Data Science Enthusiasts
- Academic Researchers
Conclusion: This dataset is a rich resource for anyone looking to delve into the financial dynamics of startups in the U.S. It offers a unique perspective on how different types of investments correlate with company success across various states.
Please note that the data is anonymized and does not include any confidential information about the companies listed. The dataset is intended for educational and research purposes.
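The startup description frames the dataset around how each type of spending correlates with profit. A minimal sketch of that analysis, computing the Pearson correlation of each spend column with `Profit`, might look like the following. The four sample rows are invented for illustration; the real file has 50 rows, and only the column names come from the attribute list.

```python
import math

# Invented sample values for illustration only; column names follow the
# attribute list (R&D Spend, Administration, Marketing Spend, Profit).
startups = [
    {"R&D Spend": 100.0, "Administration": 50.0, "Marketing Spend": 200.0, "Profit": 150.0},
    {"R&D Spend": 200.0, "Administration": 40.0, "Marketing Spend": 100.0, "Profit": 250.0},
    {"R&D Spend": 300.0, "Administration": 60.0, "Marketing Spend": 300.0, "Profit": 350.0},
    {"R&D Spend": 400.0, "Administration": 30.0, "Marketing Spend": 250.0, "Profit": 450.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


profit = [s["Profit"] for s in startups]
correlations = {
    col: pearson([s[col] for s in startups], profit)
    for col in ("R&D Spend", "Administration", "Marketing Spend")
}
for col, r in correlations.items():
    print(f"{col}: {r:+.3f}")
```

A correlation close to +1 or -1 indicates a strong linear relationship, but it does not by itself show that the spending causes the profit.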
http://opendatacommons.org/licenses/dbcl/1.0/
The given dataset appears to be a sales dataset containing information about different orders. Here is a description of the data:
The dataset provides detailed information about each order, including customer details, product details, sales information, and shipping information. It can be used to analyze various aspects of the sales data, such as profitability, customer segments, product categories, and regional sales performance.
https://choosealicense.com/licenses/pddl/
Dataset Card for MergedDataset
Dataset Summary
Supported Tasks and Leaderboards
[More Information Needed]
Languages
[More Information Needed]
Dataset Structure
Data Instances
[More Information Needed]
Data Fields
[More Information Needed]
Data Splits
[More Information Needed]
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
Initial Data… See the full description on the dataset page: https://huggingface.co/datasets/ahmadkhan1022/kaggle.
https://choosealicense.com/licenses/cc0-1.0/
Apple Stock Data 2025
This is a dataset copied from Kaggle. You can see the original dataset here: https://www.kaggle.com/datasets/umerhaddii/apple-stock-data-2025
The following is the original readme of this dataset:
About Dataset
Context
Apple Inc. is an American hardware and software developer and technology company that develops and sells computers, smartphones and consumer electronics as well as operating systems and application software. Apple also… See the full description on the dataset page: https://huggingface.co/datasets/tablegpt/AppleStockData2025.
Schmitz005/kaggle-recipe-categorized-chunk-8 dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Top 100 Biggest Restaurant Chains 2021’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/johnharshith/top-100-biggest-restaurant-chains-2021 on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains data compiled by Technomic and reported by Restaurant Business magazine on the top 100 most popular restaurant chains in the United States, ranked by their latest 2020 sales. These chains were responsible for three-fourths of the total industry sales growth last year.
The data was obtained from the Restaurant Business magazine website. The columns contain stats such as position of restaurant chains, 2020 U.S. sales, YOY sales change, 2020 U.S. units, YOY unit change, segment and menu types. This data can be found from the website https://www.restaurantbusinessonline.com/top-500-chains with detailed analysis.
While 2016 was a rough year for chain restaurants, more than half of the industry wealth of $521.9 billion still comes from the Top 500 chains and nearly 94% of those dollars and 93% of those units are represented in the Top 250. These stats have made me curious to find out interesting profit patterns from this dataset.
This dataset can be used to study interesting patterns using various classification techniques and arrive at some exciting conclusions. One can create visualisations using the different columns of the dataset. One can also design an effective business model from the given dataset and take one step closer to a successful restaurant chain startup.
--- Original source retains full ownership of the source dataset ---
https://creativecommons.org/publicdomain/zero/1.0/
E-commerce has become a new channel to support business development. Through e-commerce, businesses can gain access to and establish a wider market presence by providing cheaper and more efficient distribution channels for their products or services. E-commerce has also changed the way people shop and consume products and services. Many people are turning to their computers or smart devices to order goods, which can easily be delivered to their homes.
This is a sales transaction data set of UK-based e-commerce (online retail) for one year. This London-based shop has been selling gifts and homewares for adults and children through the website since 2007. Their customers come from all over the world and usually make direct purchases for themselves. There are also small businesses that buy in bulk and sell to other customers through retail outlet channels.
The data set contains 500K rows and 8 columns. The following is the description of each column.
1. TransactionNo (categorical): a six-digit unique number that defines each transaction. The letter “C” in the code indicates a cancellation.
2. Date (numeric): the date when each transaction was generated.
3. ProductNo (categorical): a five- or six-digit unique character string used to identify a specific product.
4. Product (categorical): product/item name.
5. Price (numeric): the price of each product per unit in pound sterling (£).
6. Quantity (numeric): the quantity of each product per transaction. Negative values relate to cancelled transactions.
7. CustomerNo (categorical): a five-digit unique number that defines each customer.
8. Country (categorical): name of the country where the customer resides.
There is a small percentage of order cancellation in the data set. Most of these cancellations were due to out-of-stock conditions on some products. Under this situation, customers tend to cancel an order as they want all products delivered all at once.
Information is a main asset of businesses nowadays. The success of a business in a competitive environment depends on its ability to acquire, store, and utilize information. Data is one of the main sources of information. Therefore, data analysis is an important activity for acquiring new and useful information. Analyze this dataset and try to answer the following questions.
1. How was the sales trend over the months?
2. What are the most frequently purchased products?
3. How many products does the customer purchase in each transaction?
4. What are the most profitable segment customers?
5. Based on your findings, what strategy could you recommend to the business to gain more profit?
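As a starting point for the first question (the monthly sales trend), the sketch below flags cancellations by the letter "C" in `TransactionNo`, as the column description states, and sums `Price * Quantity` per month for the remaining rows. The sample rows are invented, and the `MM/DD/YYYY` date format is an assumption; adjust the format string to whatever the actual file uses.

```python
from collections import defaultdict
from datetime import datetime

# Invented sample rows for illustration; the column names follow the
# column description. The MM/DD/YYYY date format is an assumption.
transactions = [
    {"TransactionNo": "581482", "Date": "12/09/2019", "Price": 2.0, "Quantity": 10},
    {"TransactionNo": "581475", "Date": "12/09/2019", "Price": 5.0, "Quantity": 3},
    {"TransactionNo": "C581484", "Date": "12/09/2019", "Price": 2.0, "Quantity": -10},
    {"TransactionNo": "579001", "Date": "11/27/2019", "Price": 5.0, "Quantity": 4},
]


def is_cancelled(transaction):
    """A 'C' in the transaction number marks a cancellation."""
    return "C" in transaction["TransactionNo"]


# Sum revenue per calendar month, skipping cancelled transactions.
monthly_sales = defaultdict(float)
for t in transactions:
    if is_cancelled(t):
        continue
    month = datetime.strptime(t["Date"], "%m/%d/%Y").strftime("%Y-%m")
    monthly_sales[month] += t["Price"] * t["Quantity"]

# Share of rows that are cancellations.
cancellation_share = sum(is_cancelled(t) for t in transactions) / len(transactions)

for month in sorted(monthly_sales):
    print(month, monthly_sales[month])
print("cancellation share:", cancellation_share)
```

Whether cancelled rows should be dropped or netted against the original order depends on the analysis; dropping them is only one reasonable choice.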
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Overview
This dataset presents a meticulously compiled collection of 387 academic publications that explore various aspects of social media and business intelligence. The dataset includes detailed metadata about each publication, such as titles, authorship, abstracts, publication years, article types, and the journals or conferences where they were published. Citations and research areas are also included, making this dataset a valuable resource for bibliometric analysis, trend detection, and literature reviews in the fields of social media analytics, sentiment analysis, business intelligence, and related disciplines.
Content
The dataset comprises 15 columns, each capturing specific attributes of the research papers. Below is a description of each column:
Applications
This dataset can be utilized for a variety of purposes, including but not limited to:
Data Collection and Preprocessing
The dataset was curated by extracting bibliometric data from Web of Science (WOS), ensuring the inclusion of comprehensive and high-quality metadata. All records have been standardized for consistency and completeness to facilitate easier analysis.
This dataset was created by Shital Gaikwad
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 1,000 job postings for Machine Learning-related roles across the United States, scraped between late 2024 and early 2025. The data was collected directly from company career pages and job boards, focusing on full job descriptions and associated company information.
Column | Description |
---|---|
job_posted_date | The date the job was posted (format: YYYY-MM-DD). |
company_address_locality | The city or locality of the job or company. |
company_address_region | The U.S. state or region where the job is located. |
company_name | The name of the company posting the job. |
company_website | The official website of the company. |
company_description | A short description or mission statement of the company. |
job_description_text | The full job description text as listed in the original posting. |
seniority_level | The required seniority level (e.g., Internship, Entry level, Mid-Senior). |
job_title | The full job title listed in the posting. |
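
Given the columns in the table above, a first pass over the job postings could count them by seniority level and by posting month. The sketch below relies on the documented `YYYY-MM-DD` format of `job_posted_date`; the sample rows are invented for illustration.

```python
from collections import Counter
from datetime import date

# Invented sample rows; only the column names and the YYYY-MM-DD date
# format come from the column table.
postings = [
    {"job_posted_date": "2024-11-15", "seniority_level": "Entry level"},
    {"job_posted_date": "2025-01-03", "seniority_level": "Mid-Senior"},
    {"job_posted_date": "2025-01-20", "seniority_level": "Entry level"},
]

# Number of postings per seniority level.
seniority_counts = Counter(p["seniority_level"] for p in postings)

# Number of postings per calendar month, parsed from the ISO date string.
postings_per_month = Counter(
    date.fromisoformat(p["job_posted_date"]).strftime("%Y-%m") for p in postings
)

print(seniority_counts.most_common())
print(sorted(postings_per_month.items()))
```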
The provided dataset appears to be a sales dataset from a company called "T-Mart." The dataset contains various columns with information about the sales transactions, including the date of the transaction, product details, quantity, sales type, location, payment mode, product category, unit of measurement (UOM), purchase price, and some additional labels and counts.
Based on the given information, here's a brief description of the dataset:
The "T-Mart" sales dataset captures sales transactions with details such as the transaction date, unique product identifier (PRODUCT ID), quantity sold, sales type (Direct Sales, Online, etc.), sales location (e.g., California, Alabama), payment mode (Cash, Online), product details (PRODUCT, CATEGORY, UOM), purchase price, and some additional label-based information.
This dataset provides insights into various aspects of the company's sales operations, including the distribution of sales across different categories, products, and locations, as well as information about the payment modes used for transactions.
Analyzing this dataset can help identify trends, popular products, sales performance by location, and preferred payment methods. It's essential for understanding the company's sales dynamics and making informed business decisions.
This dataset appears to be rich in information, and with the right data visualization techniques, we can uncover valuable insights that can be used for strategic planning and optimizing sales strategies.
http://opendatacommons.org/licenses/dbcl/1.0/
It is not often that one can find company fundamentals detailed enough to accurately assess the value of a company.
So I decided to use the yahoo_fin API to collect some fundamentals of 48 companies from the S&P 500 index.
The content of indicators in each table:
- total assets.
- cash.
- stockholder equity.
- profit.
- revenue.
- return on equity, return on assets, profit margin.
- trailing P/E, P/S, P/B, PEG, forward P/E.
In addition, the dataset has prices for all stocks for four years.
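Several of the listed indicators can be derived from the others using textbook definitions. The sketch below shows those definitions; it is not necessarily how the values in this dataset were computed (data providers often use averaged or trailing figures). The input numbers are invented, and `shares_outstanding` is a hypothetical input that the description does not list.

```python
def profitability_ratios(profit, revenue, total_assets, stockholder_equity):
    """Textbook single-period ratios; providers may use averaged balances instead."""
    return {
        "return_on_equity": profit / stockholder_equity,
        "return_on_assets": profit / total_assets,
        "profit_margin": profit / revenue,
    }


def price_to_book(share_price, stockholder_equity, shares_outstanding):
    """P/B as share price over book value per share.

    shares_outstanding is a hypothetical input not listed among the indicators.
    """
    book_value_per_share = stockholder_equity / shares_outstanding
    return share_price / book_value_per_share


# Invented figures for illustration only.
ratios = profitability_ratios(
    profit=20.0, revenue=100.0, total_assets=400.0, stockholder_equity=80.0
)
pb = price_to_book(share_price=50.0, stockholder_equity=80.0, shares_outstanding=4.0)

print(ratios)
print("P/B:", pb)
```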
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
📊 Dataset Features
This dataset includes 5,000 startups from 10 countries and contains 15 key features:
- Startup Name: Name of the startup
- Founded Year: Year the startup was founded
- Country: Country where the startup is based
- Industry: Industry category (Tech, FinTech, AI, etc.)
- Funding Stage: Stage of investment (Seed, Series A, etc.)
- Total Funding ($M): Total funding received (in million $)
- Number of Employees: Number of employees in the startup
- Annual Revenue ($M): Annual revenue in million dollars
- Valuation ($B): Startup's valuation in billion dollars
- Success Score: Score from 1 to 10 based on growth
- Acquired?: Whether the startup was acquired (Yes/No)
- IPO?: Did the startup go public? (Yes/No)
- Customer Base (Millions): Number of active customers
- Tech Stack: Technologies used by the startup
- Social Media Followers: Total followers on social platforms
Analysis Ideas
📈 What Can You Do with This Dataset? Here are some exciting analyses you can perform:
- Predict Startup Success: Train a machine learning model to predict the success score.
- Industry Trends: Analyze which industries get the most funding.
- Valuation vs. Funding: Explore the correlation between funding and valuation.
- Acquisition Analysis: Investigate the factors that contribute to startups being acquired.
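The "Industry Trends" idea can be sketched as a simple aggregation: total funding per industry, ranked from highest to lowest. The rows below are invented for illustration; only the column names `Industry` and `Total Funding ($M)` come from the feature list.

```python
from collections import defaultdict

# Invented sample rows; column names follow the feature list.
startups = [
    {"Startup Name": "A", "Industry": "AI", "Total Funding ($M)": 10.0},
    {"Startup Name": "B", "Industry": "FinTech", "Total Funding ($M)": 5.0},
    {"Startup Name": "C", "Industry": "AI", "Total Funding ($M)": 2.5},
    {"Startup Name": "D", "Industry": "Tech", "Total Funding ($M)": 4.0},
]

# Sum funding per industry.
funding_by_industry = defaultdict(float)
for s in startups:
    funding_by_industry[s["Industry"]] += s["Total Funding ($M)"]

# Rank industries by total funding, highest first.
ranked = sorted(funding_by_industry.items(), key=lambda kv: kv[1], reverse=True)

for industry, total in ranked:
    print(f"{industry}: {total} $M")
```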