100+ datasets found
  1. Orange dataset table

    • figshare.com
    xlsx
    Updated Mar 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    Figsharehttp://figshare.com/
    figshare
    Authors
    Rui Simões
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, it does not contain any missing values and data was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

    Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.

  2. [Data Analysis/2021] Mix Dataset

    • kaggle.com
    zip
    Updated Apr 16, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pham Le Tu Nhi (2024). [Data Analysis/2021] Mix Dataset [Dataset]. https://www.kaggle.com/datasets/phamletunhi/data-analysis2021-mix-dataset
    Explore at:
    zip(1086824007 bytes)Available download formats
    Dataset updated
    Apr 16, 2024
    Authors
    Pham Le Tu Nhi
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Pham Le Tu Nhi

    Released under MIT

    Contents

  3. c

    Walmart Products Dataset – Free Product Data CSV

    • crawlfeeds.com
    csv, zip
    Updated Dec 2, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Crawl Feeds (2025). Walmart Products Dataset – Free Product Data CSV [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy

    Description

    Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.

    Key Features

    Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.

    CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.

    Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.

    Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.

    Who Benefits?

    • Data analysts & researchers exploring e-commerce trends or product catalog data.
    • Developers & data scientists building price-comparison tools, recommendation engines or ML models.
    • E-commerce strategists/marketers need product metadata for competitive analysis or market research.
    • Students/hobbyists needing a free dataset for learning or demo projects.

    Why Use This Dataset Instead of Manual Scraping?

    • Time-saving: No need to write scrapers or deal with rate limits.
    • Clean, structured data: All records are verified and already formatted in CSV, saving hours of cleaning.
    • Risk-free: Avoid Terms-of-Service issues or IP blocks that come with manual scraping.
      Instant access: Free and immediately downloadable.
  4. Comprehensive Supply Chain Analysis

    • kaggle.com
    Updated Sep 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dorothy Joel (2023). Comprehensive Supply Chain Analysis [Dataset]. https://www.kaggle.com/datasets/dorothyjoel/us-regional-sales
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Dorothy Joel
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This supply chain analysis provides a comprehensive view of the company's order and distribution processes, allowing for in-depth analysis and optimization of various aspects of the supply chain, from procurement and inventory management to sales and customer satisfaction. It empowers the company to make data-driven decisions to improve efficiency, reduce costs, and enhance customer experiences. The provided supply chain analysis dataset contains various columns that capture important information related to the company's order and distribution processes:

    • OrderNumber • Sales Channel • WarehouseCode • ProcuredDate • CurrencyCode • OrderDate • ShipDate • DeliveryDate • SalesTeamID • CustomerID • StoreID • ProductID • Order Quantity • Discount Applied • Unit Cost • Unit Price

  5. Google Analytics data of an E-commerce Company

    • kaggle.com
    zip
    Updated Oct 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    fehu.zone (2024). Google Analytics data of an E-commerce Company [Dataset]. https://www.kaggle.com/datasets/fehu94/google-analytics-data-of-an-e-commerce-company
    Explore at:
    zip(3156 bytes)Available download formats
    Dataset updated
    Oct 19, 2024
    Authors
    fehu.zone
    Description

    📊 Dataset Title: Daily Active Users Dataset

    📝 Description

    This dataset provides detailed insights into daily active users (DAU) of a platform or service, captured over a defined period of time. The dataset includes information such as the number of active users per day, allowing data analysts and business intelligence teams to track usage trends, monitor platform engagement, and identify patterns in user activity over time.

    The data is ideal for performing time series analysis, statistical analysis, and trend forecasting. You can utilize this dataset to measure the success of platform initiatives, evaluate user behavior, or predict future trends in engagement. It is also suitable for training machine learning models that focus on user activity prediction or anomaly detection.

    📂 Dataset Structure

    The dataset is structured in a simple and easy-to-use format, containing the following columns:

    • Date: The date on which the data was recorded, formatted as YYYYMMDD.
    • Number of Active Users: The number of users who were active on the platform on the corresponding date.

    Each row in the dataset represents a unique date and its corresponding number of active users. This allows for time-based analysis, such as calculating the moving average of active users, detecting seasonality, or spotting sudden spikes or drops in engagement.

    🧐 Key Use Cases

    This dataset can be used for a wide range of purposes, including:

    1. Time Series Analysis: Analyze trends and seasonality of user engagement.
    2. Trend Detection: Discover peaks and valleys in user activity.
    3. Anomaly Detection: Use statistical methods or machine learning algorithms to detect anomalies in user behavior.
    4. Forecasting User Growth: Build forecasting models to predict future platform usage.
    5. Seasonality Insights: Identify patterns like increased activity on weekends or holidays.

    📈 Potential Analysis

    Here are some specific analyses you can perform using this dataset:

    • Moving Average and Smoothing: Calculate the moving average over a 7-day or 30-day period.
    • Correlation with External Factors: Correlate daily active users with other datasets.
    • Statistical Hypothesis Testing: Perform t-tests or ANOVA to determine significant differences in user activity.
    • Machine Learning for Prediction: Train machine learning models to predict user engagement.

    🚀 Getting Started

    To get started with this dataset, you can load it into your preferred analysis tool. Here's how to do it using Python's pandas library:

    import pandas as pd
    
    # Load the dataset
    data = pd.read_csv('path_to_dataset.csv')
    
    # Display the first few rows
    print(data.head())
    
    # Basic statistics
    print(data.describe())
    
  6. Data from: Social Media Data Analysis

    • kaggle.com
    zip
    Updated Apr 16, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nafe Muhtasim (2021). Social Media Data Analysis [Dataset]. https://www.kaggle.com/datasets/nafemuhtasim/social-media-data-analysis
    Explore at:
    zip(29081 bytes)Available download formats
    Dataset updated
    Apr 16, 2021
    Authors
    Nafe Muhtasim
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Nafe Muhtasim

    Released under CC0: Public Domain

    Contents

  7. m

    Student Skill Gap Analysis

    • data.mendeley.com
    • kaggle.com
    Updated Apr 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bindu Garg (2025). Student Skill Gap Analysis [Dataset]. http://doi.org/10.17632/rv6scbpd7v.1
    Explore at:
    Dataset updated
    Apr 28, 2025
    Authors
    Bindu Garg
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is designed for skill gap analysis, focusing on evaluating the skill gap between students’ current skills and industry requirements. It provides insights into technical skills, soft skills, career interests, and challenges, helping in skill gap analysis to identify areas for improvement.

    By leveraging this dataset, educators, recruiters, and researchers can conduct skill gap analysis to assess students’ job readiness and tailor training programs accordingly. It serves as a valuable resource for identifying skill deficiencies and skill gaps improving career guidance, and enhancing curriculum design through targeted skill gap analysis.

    Following is the column descriptors: Name - Student's full name. email_id - Student's email address. Year - The academic year the student is currently in (e.g., 1st Year, 2nd Year, etc.). Current Course - The course the student is currently pursuing (e.g., B.Tech CSE, MBA, etc.). Technical Skills - List of technical skills possessed by the student (e.g., Python, Data Analysis, Cloud Computing). Programming Languages - Programming languages known by the student (e.g., Python, Java, C++). Rating - Self-assessed rating of technical skills on a scale of 1 to 5. Soft Skills - List of soft skills (e.g., Communication, Leadership, Teamwork). Rating - Self-assessed rating of soft skills on a scale of 1 to 5. Projects - Indicates whether the student has worked on any projects (Yes/No). Career Interest - The student's preferred career path (e.g., Data Scientist, Software Engineer). Challenges - Challenges faced while applying for jobs/internships (e.g., Lack of experience, Resume building issues).

  8. f

    Descriptive statistics.

    • plos.figshare.com
    xls
    Updated Oct 31, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha (2023). Descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pgph.0002475.t003
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Oct 31, 2023
    Dataset provided by
    PLOS Global Public Health
    Authors
    Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Vitamin D insufficiency appears to be prevalent in SLE patients. Multiple factors potentially contribute to lower vitamin D levels, including limited sun exposure, the use of sunscreen, darker skin complexion, aging, obesity, specific medical conditions, and certain medications. The study aims to assess the risk factors associated with low vitamin D levels in SLE patients in the southern part of Bangladesh, a region noted for a high prevalence of SLE. The research additionally investigates the possible correlation between vitamin D and the SLEDAI score, seeking to understand the potential benefits of vitamin D in enhancing disease outcomes for SLE patients. The study incorporates a dataset consisting of 50 patients from the southern part of Bangladesh and evaluates their clinical and demographic data. An initial exploratory data analysis is conducted to gain insights into the data, which includes calculating means and standard deviations, performing correlation analysis, and generating heat maps. Relevant inferential statistical tests, such as the Student’s t-test, are also employed. In the machine learning part of the analysis, this study utilizes supervised learning algorithms, specifically Linear Regression (LR) and Random Forest (RF). To optimize the hyperparameters of the RF model and mitigate the risk of overfitting given the small dataset, a 3-Fold cross-validation strategy is implemented. The study also calculates bootstrapped confidence intervals to provide robust uncertainty estimates and further validate the approach. A comprehensive feature importance analysis is carried out using RF feature importance, permutation-based feature importance, and SHAP values. The LR model yields an RMSE of 4.83 (CI: 2.70, 6.76) and MAE of 3.86 (CI: 2.06, 5.86), whereas the RF model achieves better results, with an RMSE of 2.98 (CI: 2.16, 3.76) and MAE of 2.68 (CI: 1.83,3.52). Both models identify Hb, CRP, ESR, and age as significant contributors to vitamin D level predictions. Despite the lack of a significant association between SLEDAI and vitamin D in the statistical analysis, the machine learning models suggest a potential nonlinear dependency of vitamin D on SLEDAI. These findings highlight the importance of these factors in managing vitamin D levels in SLE patients. The study concludes that there is a high prevalence of vitamin D insufficiency in SLE patients. Although a direct linear correlation between the SLEDAI score and vitamin D levels is not observed, machine learning models suggest the possibility of a nonlinear relationship. Furthermore, factors such as Hb, CRP, ESR, and age are identified as more significant in predicting vitamin D levels. Thus, the study suggests that monitoring these factors may be advantageous in managing vitamin D levels in SLE patients. Given the immunological nature of SLE, the potential role of vitamin D in SLE disease activity could be substantial. Therefore, it underscores the need for further large-scale studies to corroborate this hypothesis.

  9. g

    Insurance Dataset

    • gts.ai
    json
    Updated Oct 16, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    GTS (2022). Insurance Dataset [Dataset]. https://gts.ai/case-study/insurance-dataset-annotation-services-for-precision-data-analysis/
    Explore at:
    jsonAvailable download formats
    Dataset updated
    Oct 16, 2022
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Insurance Dataset project is an extensive initiative focused on collecting and analyzing insurance-related data from various sources.

  10. H

    Python Codes for Data Analysis of The Impact of COVID-19 on Technical...

    • dataverse.harvard.edu
    • figshare.com
    Updated Mar 21, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Elizabeth Szkirpan (2022). Python Codes for Data Analysis of The Impact of COVID-19 on Technical Services Units Survey Results [Dataset]. http://doi.org/10.7910/DVN/SXMSDZ
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 21, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Elizabeth Szkirpan
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Copies of Anaconda 3 Jupyter Notebooks and Python script for holistic and clustered analysis of "The Impact of COVID-19 on Technical Services Units" survey results. Data was analyzed holistically using cleaned and standardized survey results and by library type clusters. To streamline data analysis in certain locations, an off-shoot CSV file was created so data could be standardized without compromising the integrity of the parent clean file. Three Jupyter Notebooks/Python scripts are available in relation to this project: COVID_Impact_TechnicalServices_HolisticAnalysis (a holistic analysis of all survey data) and COVID_Impact_TechnicalServices_LibraryTypeAnalysis (a clustered analysis of impact by library type, clustered files available as part of the Dataverse for this project).

  11. m

    Dataset

    • data.mendeley.com
    Updated Jun 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gunarto Wibisono (2025). Dataset [Dataset]. http://doi.org/10.17632/298hjrz9cm.1
    Explore at:
    Dataset updated
    Jun 10, 2025
    Authors
    Gunarto Wibisono
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains supporting data collected during the research titled "Design of a Modular and Scalable IoT-Based System for Utility Management: A Case Study at 3 Towers Residences."

    The data includes: - Field observation photos of utility control rooms (electrical only), to show the electrical panel room is very tight and the MCB meter box as well. - Screenshots of email correspondence from finance team (masked the sender), to show inconsistency of delivery dates. - Visual evidence of Excel utility billing data which manually processed. - Supplementary documentation used in the thesis development.

  12. Exploring E-commerce Trends⭐️⭐️⭐️

    • kaggle.com
    zip
    Updated Jul 8, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Muhammad Roshan Riaz (2024). Exploring E-commerce Trends⭐️⭐️⭐️ [Dataset]. https://www.kaggle.com/datasets/muhammadroshaanriaz/e-commerce-trends-a-guide-to-leveraging-dataset
    Explore at:
    zip(51169 bytes)Available download formats
    Dataset updated
    Jul 8, 2024
    Authors
    Muhammad Roshan Riaz
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Exploring E-commerce Trends: A Guide to Leveraging Dummy Dataset

    Introduction: In the world of e-commerce, data is a powerful asset that can be leveraged to understand customer behavior, improve sales strategies, and enhance overall business performance. This guide explores how to effectively utilize a dummy dataset generated to simulate various aspects of an e-commerce platform. By analyzing this dataset, businesses can gain valuable insights into product trends, customer preferences, and market dynamics.

    1. Dataset Overview: The dummy dataset contains information on 1000 products across different categories such as electronics, clothing, home & kitchen, books, toys & games, and more. Each product is associated with attributes such as price, rating, number of reviews, stock quantity, discounts, sales, and date added to inventory. This comprehensive dataset provides a rich source of information for analysis and exploration.

    2. Data Analysis: Using tools like Pandas, NumPy, and visualization libraries like Matplotlib or Seaborn, businesses can perform in-depth analysis of the dataset. Key insights such as top-selling products, popular product categories, pricing trends, and seasonal variations can be extracted through exploratory data analysis (EDA). Visualization techniques can be employed to create intuitive graphs and charts for better understanding and communication of findings.

    3. Machine Learning Applications: The dataset can be used to train machine learning models for various e-commerce tasks such as product recommendation, sales prediction, customer segmentation, and sentiment analysis. By applying algorithms like linear regression, decision trees, or neural networks, businesses can develop predictive models to optimize inventory management, personalize customer experiences, and drive sales growth.

    4. Testing and Prototyping: Businesses can utilize the dummy dataset to test new algorithms, prototype new features, or conduct A/B testing experiments without impacting real user data. This enables rapid iteration and experimentation to validate hypotheses and refine strategies before implementation in a live environment.

    5. Educational Resources: The dummy dataset serves as an invaluable educational resource for students, researchers, and professionals interested in learning about e-commerce data analysis and machine learning. Tutorials, workshops, and online courses can be developed using the dataset to teach concepts such as data manipulation, statistical analysis, and model training in the context of e-commerce.

    6. Decision Support and Strategy Development: Insights derived from the dataset can inform strategic decision-making processes and guide business strategy development. By understanding customer preferences, market trends, and competitor behavior, businesses can make informed decisions regarding product assortment, pricing strategies, marketing campaigns, and resource allocation.

    Conclusion: In conclusion, the dummy dataset provides a versatile and valuable resource for exploring e-commerce trends, understanding customer behavior, and driving business growth. By leveraging this dataset effectively, businesses can unlock actionable insights, optimize operations, and stay ahead in today's competitive e-commerce landscape

  13. R

    Data Analytics 2 Dataset

    • universe.roboflow.com
    zip
    Updated Jan 1, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRAFFIC SIGNS DATA ANALYTICS (2025). Data Analytics 2 Dataset [Dataset]. https://universe.roboflow.com/traffic-signs-data-analytics/data-analytics-2
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jan 1, 2025
    Dataset authored and provided by
    TRAFFIC SIGNS DATA ANALYTICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    TRAFFIC LIGHTS Gztl Bounding Boxes
    Description

    DATA ANALYTICS 2

    ## Overview
    
    DATA ANALYTICS 2 is a dataset for object detection tasks - it contains TRAFFIC LIGHTS Gztl annotations for 8,579 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  14. New 1000 Sales Records Data 2

    • kaggle.com
    zip
    Updated Jan 12, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Calvin Oko Mensah (2023). New 1000 Sales Records Data 2 [Dataset]. https://www.kaggle.com/datasets/calvinokomensah/new-1000-sales-records-data-2
    Explore at:
    zip(49305 bytes)Available download formats
    Dataset updated
    Jan 12, 2023
    Authors
    Calvin Oko Mensah
    Description

    This is a dataset downloaded off excelbianalytics.com created off of random VBA logic. I recently performed an extensive exploratory data analysis on it and I included new columns to it, namely: Unit margin, Order year, Order month, Order weekday and Order_Ship_Days which I think can help with analysis on the data. I shared it because I thought it was a great dataset to practice analytical processes on for newbies like myself.

  15. m

    lipidomics dataset-Lipid droplet plasticity and dispersion dictates...

    • data.mendeley.com
    Updated Jun 9, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    sahar motamedi (2023). lipidomics dataset-Lipid droplet plasticity and dispersion dictates differential sensitivity of therapy-resistant melanoma cells to PUFA and iron-induced ferroptosis [Dataset]. http://doi.org/10.17632/3c755wtx8w.1
    Explore at:
    Dataset updated
    Jun 9, 2023
    Authors
    sahar motamedi
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Lipidmics data generated for the manuscript (Lipid droplet plasticity and dispersion dictates differential sensitivity of therapy-resistant melanoma cells to PUFA and iron-induced ferroptosis)

  16. d

    Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Sep 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Explore at:
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced. Results A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method. Conclusions The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.

  17. D

    UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1more
    txt, zip
    Updated Apr 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anurag Dalal; Anurag Dalal (2025). UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View Synthesis [Dataset]. http://doi.org/10.18710/WSU7I6
    Explore at:
    txt(7447), zip(960339536)Available download formats
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    DataverseNO
    Authors
    Anurag Dalal; Anurag Dalal
    License

    https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6

    Description

    The dataset comprises three dynamic scenes characterized by both simple and complex lighting conditions. The quantity of cameras ranges from 4 to 512, including 4, 6, 8, 10, 12, 14, 16, 32, 64, 128, 256, and 512. The point clouds are randomly generated.

  18. N

    Allison, IA Annual Population and Growth Analysis Dataset: A Comprehensive...

    • neilsberg.com
    csv, json
    Updated Jul 30, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2024). Allison, IA Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in Allison from 2000 to 2023 // 2024 Edition [Dataset]. https://www.neilsberg.com/insights/allison-ia-population-by-year/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Jul 30, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Allison, Iowa
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2023, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2023. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2023. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Allison population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of Allison across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

    Key observations

    In 2023, the population of Allison was 942, a 1.15% decrease year-by-year from 2022. Previously, in 2022, Allison population was 953, a decline of 1.04% compared to a population of 963 in 2021. Over the last 20 plus years, between 2000 and 2023, population of Allison decreased by 58. In this period, the peak population was 1,026 in the year 2010. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2023

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2023)
    • Population: The population for the specific year for the Allison is shown in this column.
    • Year on Year Change: This column displays the change in Allison population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Allison Population by Year. You can refer the same here

  19. Dataset for Computational Fluid Dynamics Analysis of a Micro-scale Chamber...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Dec 15, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). Dataset for Computational Fluid Dynamics Analysis of a Micro-scale Chamber for Measuring Organic Chemical Emission Parameters [Dataset]. https://catalog.data.gov/dataset/dataset-for-computational-fluid-dynamics-analysis-of-a-micro-scale-chamber-for-measuring-o
    Explore at:
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    The data presented in this data file is a product of journal publication. The dataset contains air velocity of testing chambers, and convective mass transfer coefficient, diffusion coefficient, and material / air partition coefficient of formaldehyde with different building materials. Portions of this dataset are inaccessible because: CFD data was generated and stored by North Carolina State University because the files are too big and need specific software to open. They can be accessed through the following means: Please contact Jack Edwards at North Carolina State University. Format: CFD modeling data files. This dataset is associated with the following publication: Edwards, J., C. Huang, and X. Liu. Computational Fluid Dynamics Analysis of a Micro-scale Chamber for Measuring Organic Chemical Emission Parameters. JOURNAL OF HAZARDOUS MATERIALS. Elsevier Science Ltd, New York, NY, USA, 0, (2024).

  20. Digital_Payments_2025_Dataset

    • figshare.com
    csv
    Updated Apr 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    shreyash tiwari (2025). Digital_Payments_2025_Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28873229.v1
    Explore at:
    csvAvailable download formats
    Dataset updated
    Apr 25, 2025
    Dataset provided by
    Figsharehttp://figshare.com/
    figshare
    Authors
    shreyash tiwari
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The "Digital Payments 2025 Dataset" is a synthetic dataset representing digital payment transactions across various payment applications in India for the year 2025. It captures monthly transaction data for multiple payment apps, including banks, UPI platforms, and mobile payment services, reflecting the growing adoption of digital payments in India. The dataset was created as part of a college project to simulate realistic transaction patterns for research, education, and analysis in data science, economics, and fintech studies. It includes metrics such as customer transaction counts and values, total transaction counts and values, and temporal data (month and year). The data is synthetic, generated using Python libraries to mimic real-world digital payment trends, and is suitable for academic research, teaching, and exploratory data analysis.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
Organization logoOrganization logo

Orange dataset table

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
xlsxAvailable download formats
Dataset updated
Mar 4, 2022
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Rui Simões
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, it does not contain any missing values and data was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.

Search
Clear search
Close search
Google apps
Main menu