100+ datasets found
  1. Data from: PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM...

    • catalog.data.gov
    • data.nasa.gov
    • +2 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM AND A CASE STUDY [Dataset]. https://catalog.data.gov/dataset/padmini-a-peer-to-peer-distributed-astronomy-data-mining-system-and-a-case-study
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM AND A CASE STUDY. TUSHAR MAHULE, KIRK BORNE, SANDIPAN DEY, SUGANDHA ARORA, AND HILLOL KARGUPTA. Abstract. Peer-to-Peer (P2P) networks are appealing for astronomy data mining from virtual observatories because of the large volume of the data, compute-intensive tasks, potentially large number of users, and distributed nature of the data analysis process. This paper offers a brief overview of PADMINI, a Peer-to-Peer Astronomy Data MINIng system. It also presents a case study on PADMINI for distributed outlier detection using astronomy data. PADMINI is a web-based system powered by Google Sky and distributed data mining algorithms that run on a collection of computing nodes. This paper offers a case study of PADMINI that evaluates the architecture and the performance of the overall system. Detailed experimental results are presented in order to document the utility and scalability of the system.

  2. Data from: Data Mining at NASA: From Theory to Applications

    • catalog.data.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +3 more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). Data Mining at NASA: From Theory to Applications [Dataset]. https://catalog.data.gov/dataset/data-mining-at-nasa-from-theory-to-applications
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    NASA has some of the largest and most complex data sources in the world, ranging from the earth sciences and space sciences to massive distributed engineering data sets from commercial aircraft and spacecraft. This talk will discuss some of the issues and algorithms developed to analyze and discover patterns in these data sets. We will also provide an overview of a large research program in Integrated Vehicle Health Management. The goal of this program is to develop advanced technologies to automatically detect, diagnose, predict, and mitigate adverse events during the flight of an aircraft. A case study will be presented on a recent data mining analysis performed to support the Flight Readiness Review of the Space Shuttle Mission STS-119.

  3. Results of running KHC on our case study.

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb (2023). Results of running KHC on our case study. [Dataset]. http://doi.org/10.1371/journal.pone.0202629.t011
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of running KHC on our case study.

  4. Data Mining in Systems Health Management

    • catalog.data.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +2 more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). Data Mining in Systems Health Management [Dataset]. https://catalog.data.gov/dataset/data-mining-in-systems-health-management
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    This chapter presents theoretical and practical aspects associated with the implementation of a combined model-based/data-driven approach for failure prognostics based on particle filtering algorithms, in which the current estimate of the state PDF is used to determine the operating condition of the system and predict the progression of a fault indicator, given a dynamic state model and a set of process measurements. In this approach, the task of estimating the current value of the fault indicator, as well as other important changing parameters in the environment, involves two basic steps: the prediction step, based on the process model, and an update step, which incorporates the new measurement into the a priori state estimate. This framework allows the probability of failure at future time instants (RUL PDF) to be estimated in real time, providing information about time-to-failure (TTF) expectations, statistical confidence intervals, and long-term predictions, using for this purpose empirical knowledge about critical conditions for the system (also referred to as the hazard zones). This information is of paramount significance for improving system reliability and the cost-effective operation of critical assets, as has been shown in a case study where feedback correction strategies (based on uncertainty measures) were implemented to lengthen the RUL of a rotorcraft transmission system with propagating fatigue cracks on a critical component. Although the feedback loop is implemented using simple linear relationships, it provides quick insight into how the system reacts to changes in its input signals, in terms of its predicted RUL. The method is able to manage non-Gaussian PDFs since it includes concepts such as nonlinear state estimation and confidence intervals in its formulation.

    Real data from a fault-seeded test showed that the proposed framework was able to anticipate modifications to the system input to lengthen its RUL. Results of this test indicate that the method was able to successfully suggest the correction that the system required. Future work will focus on the development and testing of similar strategies using different input-output uncertainty metrics.
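
The two steps named in this description (prediction from a process model, then an update that folds in a new measurement) can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the chapter's method: the linear growth model, the Gaussian noise levels, and the names `predict`, `update`, and `run_filter` are all hypothetical choices made for illustration.

```python
import math
import random


def predict(particles, rng, growth=0.05, noise=0.02):
    """Prediction step: push each particle through an assumed process model
    (here: the fault indicator grows by a fixed amount plus Gaussian noise)."""
    return [x + growth + rng.gauss(0.0, noise) for x in particles]


def update(particles, measurement, rng, meas_noise=0.1):
    """Update step: weight particles by a Gaussian measurement likelihood,
    then resample so the particle set approximates the posterior state PDF."""
    weights = [
        math.exp(-0.5 * ((measurement - x) / meas_noise) ** 2) for x in particles
    ]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(particles)
    return rng.choices(particles, weights=weights, k=len(particles))


def run_filter(measurements, n_particles=500, seed=0):
    """Run predict/update over a measurement sequence and return the
    posterior mean of the fault indicator after each measurement."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles = predict(particles, rng)
        particles = update(particles, z, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates


# Hypothetical noiseless measurements of a steadily growing fault indicator.
measurements = [0.5 + 0.05 * k for k in range(1, 21)]
estimates = run_filter(measurements)
print(round(estimates[-1], 3), "vs measured", round(measurements[-1], 3))
```

Because the posterior is carried as a set of samples rather than a mean and variance, the same loop works when the state PDF is non-Gaussian, which is the property the description highlights.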

  5. The coupling values of the classes in our case study.

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb (2023). The coupling values of the classes in our case study. [Dataset]. http://doi.org/10.1371/journal.pone.0202629.t005
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The coupling values of the classes in our case study.

  6. Results of running KSA for Extract Message Refactoring on our case study.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb (2023). Results of running KSA for Extract Message Refactoring on our case study. [Dataset]. http://doi.org/10.1371/journal.pone.0202629.t012
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Abdulrahman Ahmed Bobakr Baqais; Mohammad Alshayeb
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Saudi Arabia
    Description

    Results of running KSA for Extract Message Refactoring on our case study.

  7. Retrospective_Mining_of_Tox_Data_Anemia_Case_Study_RegToxPharm Data

    • datasets.ai
    • catalog.data.gov
    Updated Sep 18, 2024
    + more versions
    Cite
    U.S. Environmental Protection Agency (2024). Retrospective_Mining_of_Tox_Data_Anemia_Case_Study_RegToxPharm Data [Dataset]. https://datasets.ai/datasets/retrospective-mining-of-tox-data-anemia-case-study-regtoxpharm-data
    Dataset updated
    Sep 18, 2024
    Dataset authored and provided by
    U.S. Environmental Protection Agency
    Description

    Data from a study to critically examine some of the issues of using data from ToxRefDB, a database largely composed of guideline studies for pesticidal active ingredients, using a case study focusing on chemically-induced anemia.

    This dataset is associated with the following publication: Judson, R.S., M. Martin, G. Patlewicz, and C.E. Wood. (Reg. Tox. Pharm.) Retrospective Mining of Toxicology Data to Discover Multispecies and Chemical Class Effects: Anemia as a Case Study. REGULATORY TOXICOLOGY AND PHARMACOLOGY. Elsevier Science Ltd, New York, NY, USA, 86: 74-92, (2017).

  8. Data from: The use of project portfolios in effective strategy execution to...

    • researchdata.up.ac.za
    zip
    Updated May 31, 2023
    + more versions
    Cite
    Palesa Agnes Ramashala (2023). The use of project portfolios in effective strategy execution to improve business value [Dataset]. http://doi.org/10.25403/UPresearchdata.13280141.v3
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    University of Pretoria
    Authors
    Palesa Agnes Ramashala
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Qualitative data gathered from interviews conducted with case organisations. The data was analysed using a qualitative data analysis tool (Atlas.ti) to code the responses and generate network diagrams; software such as Atlas.ti 8 Windows is useful for viewing these results. Interviews were conducted with four case organisations, and the details of the respondents' answers are captured. The data gathered during the interview sessions is captured in tabular form, and graphs were also created to identify trends. The study also includes a desktop review of the case organisations, done using published annual reports covering a period of more than seven years. The analysis was done given the scope of the project and its constructs.

  9. A summary of related hashtags (top group) and related place mentions (bottom...

    • plos.figshare.com
    xls
    Updated Jun 17, 2023
    Cite
    Alexander Savelyev; Alan M. MacEachren (2023). A summary of related hashtags (top group) and related place mentions (bottom group) identified with each particular meta-path. [Dataset]. http://doi.org/10.1371/journal.pone.0206906.t001
    Available download formats: xls
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Alexander Savelyev; Alan M. MacEachren
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A summary of related hashtags (top group) and related place mentions (bottom group) identified with each particular meta-path.

  10. Data from: THE USE OF DIGITAL PROJECT MANAGEMENT SOLUTIONS BY PROJECT...

    • esango.cput.ac.za
    xlsx
    Updated May 31, 2023
    Cite
    Kgothatso Mahlo (2023). THE USE OF DIGITAL PROJECT MANAGEMENT SOLUTIONS BY PROJECT BUSINESSES: A CASE STUDY OF A SELECTED PLATINUM MINE IN THE LIMPOPO PROVINCE [Dataset]. http://doi.org/10.25381/cput.19481114.v1
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Cape Peninsula University of Technology
    Authors
    Kgothatso Mahlo
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Ethical Clearance reference number: 2020FOBREC758

    The purpose of this study was to investigate the awareness and adoption levels of digital project management solutions by project businesses at a selected platinum mine in the Limpopo Province. The study further investigated their project management challenges. It examined the impediments of adoption and the readiness of project businesses that had not yet adopted digital project management solutions, to adopt these solutions. The quantitative research survey was distributed and emailed to project businesses at a selected mine in the Limpopo Province, where 110 project businesses participated in this study. SPSS v26.0 software was utilised to process data.

    The study revealed that the level of awareness of digital project management tools is higher than the level of adoption. It was found that project businesses that have not employed digital project management tools experience several internal challenges, while external challenges were faced by various project businesses. This study identified several barriers which affected project businesses that had not employed digital project management tools, for example, lack of knowledge and the high costs associated with the use of digital project management solutions. These barriers were also identified as factors that affected the readiness of project businesses to adopt digital project management tools.

    The study proposed several recommendations, including that software developers should consider integrating their software packages with external systems that could enhance project management processes. It was further recommended that the South African mining sector should set a minimum standard requirement of project management training and the South African business development agencies should consider offering ongoing training programmes that are focused on digitalisation and knowledge building in the discipline of project management.

  11. International Journal of Engineering and Advanced Technology FAQ -...

    • researchhelpdesk.org
    Updated May 28, 2022
    + more versions
    Cite
    Research Help Desk (2022). International Journal of Engineering and Advanced Technology FAQ - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/faq/552/international-journal-of-engineering-and-advanced-technology
    Dataset updated
    May 28, 2022
    Dataset authored and provided by
    Research Help Desk
    Description

    International Journal of Engineering and Advanced Technology FAQ - ResearchHelpDesk - International Journal of Engineering and Advanced Technology (IJEAT) is having Online-ISSN 2249-8958, bi-monthly international journal, being published in the months of February, April, June, August, October, and December by Blue Eyes Intelligence Engineering & Sciences Publication (BEIESP) Bhopal (M.P.), India since the year 2011. It is academic, online, open access, double-blind, peer-reviewed international journal. It aims to publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. All submitted papers will be reviewed by the board of committee of IJEAT.

    Aim of IJEAT Journal

    • disseminate original, scientific, theoretical or applied research in the field of Engineering and allied fields.
    • dispense a platform for publishing results and research with a strong empirical component.
    • aqueduct the significant gap between research and practice by promoting the publication of original, novel, industry-relevant research.
    • seek original and unpublished research papers based on theoretical or experimental works for the publication globally.
    • publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences.
    • impart a platform for publishing results and research with a strong empirical component.
    • create a bridge for a significant gap between research and practice by promoting the publication of original, novel, industry-relevant research.
    • solicit original and unpublished research papers, based on theoretical or experimental works.

    Scope of IJEAT

    International Journal of Engineering and Advanced Technology (IJEAT) covers all topics of all engineering branches. Some of them are Computer Science & Engineering, Information Technology, Electronics & Communication, Electrical and Electronics, Electronics and Telecommunication, Civil Engineering, Mechanical Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. The main topic includes but not limited to:

    1. Smart Computing and Information Processing Signal and Speech Processing Image Processing and Pattern Recognition WSN Artificial Intelligence and machine learning Data mining and warehousing Data Analytics Deep learning Bioinformatics High Performance computing Advanced Computer networking Cloud Computing IoT Parallel Computing on GPU Human Computer Interactions

    2. Recent Trends in Microelectronics and VLSI Design Process & Device Technologies Low-power design Nanometer-scale integrated circuits Application specific ICs (ASICs) FPGAs Nanotechnology Nano electronics and Quantum Computing

    3. Challenges of Industry and their Solutions, Communications Advanced Manufacturing Technologies Artificial Intelligence Autonomous Robots Augmented Reality Big Data Analytics and Business Intelligence Cyber Physical Systems (CPS) Digital Clone or Simulation Industrial Internet of Things (IIoT) Manufacturing IOT Plant Cyber security Smart Solutions – Wearable Sensors and Smart Glasses System Integration Small Batch Manufacturing Visual Analytics Virtual Reality 3D Printing

    4. Internet of Things (IoT) Internet of Things (IoT) & IoE & Edge Computing Distributed Mobile Applications Utilizing IoT Security, Privacy and Trust in IoT & IoE Standards for IoT Applications Ubiquitous Computing Block Chain-enabled IoT Device and Data Security and Privacy Application of WSN in IoT Cloud Resources Utilization in IoT Wireless Access Technologies for IoT Mobile Applications and Services for IoT Machine/ Deep Learning with IoT & IoE Smart Sensors and Internet of Things for Smart City Logic, Functional programming and Microcontrollers for IoT Sensor Networks, Actuators for Internet of Things Data Visualization using IoT IoT Application and Communication Protocol Big Data Analytics for Social Networking using IoT IoT Applications for Smart Cities Emulation and Simulation Methodologies for IoT IoT Applied for Digital Contents

    5. Microwaves and Photonics Microwave filter Micro Strip antenna Microwave Link design Microwave oscillator Frequency selective surface Microwave Antenna Microwave Photonics Radio over fiber Optical communication Optical oscillator Optical Link design Optical phase lock loop Optical devices

    6. Computation Intelligence and Analytics Soft Computing Advance Ubiquitous Computing Parallel Computing Distributed Computing Machine Learning Information Retrieval Expert Systems Data Mining Text Mining Data Warehousing Predictive Analysis Data Management Big Data Analytics Big Data Security

    7. Energy Harvesting and Wireless Power Transmission Energy harvesting and transfer for wireless sensor networks Economics of energy harvesting communications Waveform optimization for wireless power transfer RF Energy Harvesting Wireless Power Transmission Microstrip Antenna design and application Wearable Textile Antenna Luminescence Rectenna

    8. Advance Concept of Networking and Database Computer Network Mobile Adhoc Network Image Security Application Artificial Intelligence and machine learning in the Field of Network and Database Data Analytic High performance computing Pattern Recognition

    9. Machine Learning (ML) and Knowledge Mining (KM) Regression and prediction Problem solving and planning Clustering Classification Neural information processing Vision and speech perception Heterogeneous and streaming data Natural language processing Probabilistic Models and Methods Reasoning and inference Marketing and social sciences Data mining Knowledge Discovery Web mining Information retrieval Design and diagnosis Game playing Streaming data Music Modelling and Analysis Robotics and control Multi-agent systems Bioinformatics Social sciences Industrial, financial and scientific applications of all kind

    10. Advanced Computer networking Computational Intelligence Data Management, Exploration, and Mining Robotics Artificial Intelligence and Machine Learning Computer Architecture and VLSI Computer Graphics, Simulation, and Modelling Digital System and Logic Design Natural Language Processing and Machine Translation Parallel and Distributed Algorithms Pattern Recognition and Analysis Systems and Software Engineering Nature Inspired Computing Signal and Image Processing Reconfigurable Computing Cloud, Cluster, Grid and P2P Computing Biomedical Computing Advanced Bioinformatics Green Computing Mobile Computing Nano Ubiquitous Computing Context Awareness and Personalization, Autonomic and Trusted Computing Cryptography and Applied Mathematics Security, Trust and Privacy Digital Rights Management Networked-Driven Multicourse Chips Internet Computing Agricultural Informatics and Communication Community Information Systems Computational Economics, Digital Photogrammetric Remote Sensing, GIS and GPS Disaster Management e-governance, e-Commerce, e-business, e-Learning Forest Genomics and Informatics Healthcare Informatics Information Ecology and Knowledge Management Irrigation Informatics Neuro-Informatics Open Source: Challenges and opportunities Web-Based Learning: Innovation and Challenges Soft computing Signal and Speech Processing Natural Language Processing

    11. Communications Microstrip Antenna Microwave Radar and Satellite Smart Antenna MIMO Antenna Wireless Communication RFID Network and Applications 5G Communication 6G Communication

    12. Algorithms and Complexity Sequential, Parallel And Distributed Algorithms And Data Structures Approximation And Randomized Algorithms Graph Algorithms And Graph Drawing On-Line And Streaming Algorithms Analysis Of Algorithms And Computational Complexity Algorithm Engineering Web Algorithms Exact And Parameterized Computation Algorithmic Game Theory Computational Biology Foundations Of Communication Networks Computational Geometry Discrete Optimization

    13. Software Engineering and Knowledge Engineering Software Engineering Methodologies Agent-based software engineering Artificial intelligence approaches to software engineering Component-based software engineering Embedded and ubiquitous software engineering Aspect-based software engineering Empirical software engineering Search-Based Software engineering Automated software design and synthesis Computer-supported cooperative work Automated software specification Reverse engineering Software Engineering Techniques and Production Perspectives Requirements engineering Software analysis, design and modelling Software maintenance and evolution Software engineering tools and environments Software engineering decision support Software design patterns Software product lines Process and workflow management Reflection and metadata approaches Program understanding and system maintenance Software domain modelling and analysis Software economics Multimedia and hypermedia software engineering Software engineering case study and experience reports Enterprise software, middleware, and tools Artificial intelligent methods, models, techniques Artificial life and societies Swarm intelligence Smart Spaces Autonomic computing and agent-based systems Autonomic computing Adaptive Systems Agent architectures, ontologies, languages and protocols Multi-agent systems Agent-based learning and knowledge discovery Interface agents Agent-based auctions and marketplaces Secure mobile and multi-agent systems Mobile agents SOA and Service-Oriented Systems Service-centric software engineering Service oriented requirements engineering Service oriented architectures Middleware for service based systems Service discovery and composition Service level agreements (drafting,

  12. Mining Blockchain Processes: CryptoKitties Case Study

    • researchdata.edu.au
    datadownload
    Updated Jul 9, 2019
    Cite
    Wil van der Aalst; Ingo Weber; An Binh Tran; Alex Ponomarev; Christopher Klinkmueller (2019). Mining Blockchain Processes: CryptoKitties Case Study [Dataset]. http://doi.org/10.25919/5D242B0BE3384
    Available download formats: datadownload
    Dataset updated
    Jul 9, 2019
    Dataset provided by
    CSIRO http://www.csiro.au/
    Authors
    Wil van der Aalst; Ingo Weber; An Binh Tran; Alex Ponomarev; Christopher Klinkmueller
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This collection comprises the manifest and the two generated XES logs from the case study in [1]. The manifest specifies which data related to the execution of the CryptoKitties smart contracts was retrieved from the transaction log of the public Ethereum blockchain. The manifest also specifies how this data was transformed into the two XES logs. Details regarding the approach to extracting XES logs from the Ethereum blockchain and the case study are provided in [1]. For more information on the XES standard see http://www.xes-standard.org.

    [1] Klinkmüller, C., Ponomarev, A., Tran, A., Weber, I., and van der Aalst, W.: "Mining Blockchain Processes: Extracting Process Mining Data from Blockchain Applications", Blockchain Forum at BPM 2019.
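
XES logs are XML documents in which a log contains traces and each trace contains events with keyed attributes. As a rough illustration of how such a log could be read, here is a minimal sketch using only Python's standard library. The inline sample, its attribute values, and the helper name `read_traces` are made up for illustration and are not taken from this collection; real XES files may also declare an XML namespace, which this sketch assumes is absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written XES fragment (not from the dataset).
SAMPLE_XES = """
<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="kitty-1"/>
    <event>
      <string key="concept:name" value="Birth"/>
      <date key="time:timestamp" value="2018-01-01T00:00:00.000+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="Transfer"/>
      <date key="time:timestamp" value="2018-01-02T00:00:00.000+00:00"/>
    </event>
  </trace>
</log>
"""


def read_traces(xes_text):
    """Return a list of traces; each trace is a list of event attribute dicts
    mapping the XES attribute key to its string value."""
    root = ET.fromstring(xes_text)
    traces = []
    for trace in root.iter("trace"):
        events = []
        for event in trace.iter("event"):
            # Each child element (<string>, <date>, ...) carries key/value.
            events.append({child.get("key"): child.get("value") for child in event})
        traces.append(events)
    return traces


traces = read_traces(SAMPLE_XES)
for events in traces:
    print([e["concept:name"] for e in events])
```

For the actual logs, a dedicated process mining library would normally handle namespaces, extensions, and typed attributes; the sketch only shows the trace and event nesting.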

  13. Data from: Case Study: Bike Sharing

    • kaggle.com
    Updated Apr 26, 2023
    Cite
    Ibrahim Ahmadov (2023). Case Study: Bike Sharing [Dataset]. https://www.kaggle.com/datasets/ibrahimahmadov/case-study-bike-sharing
    Dataset updated
    Apr 26, 2023
    Dataset provided by
    Kaggle
    Authors
    Ibrahim Ahmadov
    Description

    About Dataset

    This case study is part of the Google Data Analytics course. Cyclistic is a fictional bike-sharing company; however, the data is real. It encompasses information about bike-sharing stations in Chicago and total rides with rented bikes over a period of more than 10 years, from 2013 until February 2023.

    The business task is to help design the marketing strategy. The project owner aims at converting casual riders into annual members. To achieve that goal the marketing team needs to better understand how annual members and casual riders differ in using rented bikes.

    My specific task was to analyze the available ride data and provide three main recommendations for the marketing strategy, based on the data analysis.

    The requirement was to analyze the data for the last 12 months. However, I decided to use the whole dataset, since it was openly available for the whole period of operations.

    Data License Agreement

    Lyft Bikes and Scooters, LLC (“Bikeshare”) operates the City of Chicago’s (“City”) Divvy bicycle sharing service. Bikeshare and the City are committed to supporting bicycling as an alternative transportation option. As part of that commitment, the City permits Bikeshare to make certain Divvy system data owned by the City (“Data”) available to the public, subject to the terms and conditions of this License Agreement (“Agreement”). By accessing or using any of the Data, you agree to all of the terms and conditions of this Agreement.

    License. Bikeshare hereby grants to you a non-exclusive, royalty-free, limited, perpetual license to access, reproduce, analyze, copy, modify, distribute in your product or service and use the Data for any lawful purpose (“License”).

    Prohibited Conduct. The License does not authorize you to do, and you will not do or assist others in doing, any of the following:

    • Use the Data in any unlawful manner or for any unlawful purpose;
    • Host, stream, publish, distribute, sublicense, or sell the Data as a stand-alone dataset; provided, however, you may include the Data as source material, as applicable, in analyses, reports, or studies published or distributed for non-commercial purposes;
    • Access the Data by means other than the interface Bikeshare provides or authorizes for that purpose;
    • Circumvent any access restrictions relating to the Data;
    • Use data mining or other extraction methods in connection with Bikeshare's website or the Data;
    • Attempt to correlate the Data with names, addresses, or other information of customers or Members of Bikeshare; and
    • State or imply that you are affiliated, approved, endorsed, or sponsored by Bikeshare.
    • Use or authorize others to use, without the written permission of the applicable owners, the trademarks or trade names of Lyft Bikes and Scooters, LLC, the City of Chicago or any sponsor of the Divvy service. These marks include, but are not limited to DIVVY, and the DIVVY logo, which are owned by the City of Chicago.

    No Warranty. THE DATA IS PROVIDED “AS IS,” AS AVAILABLE (AT BIKESHARE’S SOLE DISCRETION) AND AT YOUR SOLE RISK. TO THE MAXIMUM EXTENT PROVIDED BY LAW BIKESHARE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. BIKESHARE FURTHER DISCLAIMS ANY WARRANTY THAT THE DATA WILL MEET YOUR NEEDS OR WILL BE OR CONTINUE TO BE AVAILABLE, COMPLETE, ACCURATE, TIMELY, SECURE, OR ERROR FREE.

    Limitation of Liability and Covenant Not to Sue. Bikeshare, its parent, affiliates and sponsors, and their respective directors, officers, employees, or agents will not be liable to you or anyone else for any loss or damage, including any direct, indirect, incidental, and consequential damages, whether foreseeable or not, based on any theory of liability, resulting in whole or in part from your access to or use of the Data. You will not bring any claim for damages against any of those persons or entities in any court or otherwise arising out of or relating to this Agreement, the Data, or your use of the Data. In any event, if you were to bring and prevail on such a claim, your maximum recovery is limited to $100 in the aggregate even if you or they had been advised of the possibility of liability exceeding that amount.

    Ownership and Provision of Data. The City of Chicago owns all right, title, and interest in the Data. Bikeshare may modify or cease providing any or all of the Data at any time, without notice, in its sole discretion.

    No Waiver. Nothing in this Agreement is or implies a waiver of any rights Bikeshare or the City of Chicago has in the Data or in any copyrights, patents, or trademarks owned or licensed by Bikeshare, its parent, affiliates or sponsors. The DIVVY trademarks are owned by the City of Chicago.

    Termination of Agreement. Bikeshare may terminate this Agreement at any time and for any reason in its sole discretion. Termination will be effective ...

  14. Data from: Automatic composition of descriptive music: A case study of the...

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Lucía Martín-Gómez (2023). Automatic composition of descriptive music: A case study of the relationship between image and sound [Dataset]. http://doi.org/10.6084/m9.figshare.6682998.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Lucía Martín-Gómez
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    FANTASIA

    This repository contains the data related to image descriptors and sound associated with a selection of frames of the films Fantasia and Fantasia 2000 produced by Disney.

    About

    This repository contains the data used in the article "Automatic composition of descriptive music: A case study of the relationship between image and sound", published in the 6th International Workshop on Computational Creativity, Concept Invention, and General Intelligence (C3GI). The data structure is explained in detail in the article.

    Abstract

    Human beings establish relationships with the environment mainly through sight and hearing. This work focuses on the concept of descriptive music, which makes use of sound resources to narrate a story. The Fantasia film, produced by Walt Disney, was used in the case study. One of its musical pieces is analyzed in order to obtain the relationship between image and music. This connection is subsequently used to create a descriptive musical composition from a new video. Naive Bayes, Support Vector Machine and Random Forest are the three classifiers studied for the model induction process. After an analysis of their performance, it was concluded that Random Forest provided the best solution; the produced musical composition had a considerably high descriptive quality.

    Data

    Nutcracker_data.arff: Image descriptors and the most important sound of each frame from the fragment "The Nutcracker Suite" in the film Fantasia. Data stored in ARFF format.
    Firebird_data.arff: Image descriptors of each frame from the fragment "The Firebird" in the film Fantasia 2000. Data stored in ARFF format.
    Firebird_midi_prediction.csv: Frame number of the fragment "The Firebird" in the film Fantasia 2000 and the sound predicted by the system, encoded in MIDI. Data stored in CSV format.
    Firebird_prediction.mp3: Audio file synthesized from the prediction data for the fragment "The Firebird" of the film Fantasia 2000.

    License

    Data is available under the MIT License. To make use of the data, the article must be cited.

  15. Journal of Big Data Impact Factor 2024-2025 - ResearchHelpDesk

    • researchhelpdesk.org
    Updated Feb 23, 2022
    Cite
    Research Help Desk (2022). Journal of Big Data Impact Factor 2024-2025 - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/impact-factor-if/289/journal-of-big-data
    Explore at:
    Dataset updated
    Feb 23, 2022
    Dataset authored and provided by
    Research Help Desk
    Description

    Journal of Big Data Impact Factor 2024-2025 - ResearchHelpDesk - The Journal of Big Data publishes high-quality, scholarly research papers, methodologies and case studies covering a broad range of topics, from big data analytics to data-intensive computing and all applications of big data research. The journal examines the challenges facing big data today and going forward including, but not limited to: data capture and storage; search, sharing, and analytics; big data technologies; data visualization; architectures for massively parallel processing; data mining tools and techniques; machine learning algorithms for big data; cloud computing platforms; distributed file systems and databases; and scalable storage systems. Academic researchers and practitioners will find the Journal of Big Data to be a seminal source of innovative material. All articles published by the Journal of Big Data are made freely and permanently accessible online immediately upon publication, without subscription charges or registration barriers. As authors of articles published in the Journal of Big Data you are the copyright holders of your article and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate your article, according to the SpringerOpen copyright and license agreement. For those of you who are US government employees or are prevented from being copyright holders for similar reasons, SpringerOpen can accommodate non-standard copyright lines.

  16. Data for: Corrosive Sulphur effect in power and distribution transformers...

    • data.mendeley.com
    Updated Dec 18, 2018
    Cite
    Ricardo Arias (2018). Data for: Corrosive Sulphur effect in power and distribution transformers failures and treatments [Dataset]. http://doi.org/10.17632/6ng342f32t.1
    Explore at:
    Dataset updated
    Dec 18, 2018
    Authors
    Ricardo Arias
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0) (https://creativecommons.org/licenses/by-nc/3.0/)
    License information was derived automatically

    Description

    Parameters for case study: Corrosive Sulphur in power and distribution transformers

  17. Optimum Layout of Sublevel Stoping Mines (Case Study)

    • data.mendeley.com
    Updated Aug 31, 2017
    Cite
    Juan Yarmuch (2017). Optimum Layout of Sublevel Stoping Mines (Case Study) [Dataset]. http://doi.org/10.17632/zvtw8zc2p6.1
    Explore at:
    Dataset updated
    Aug 31, 2017
    Authors
    Juan Yarmuch
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset provides the ".lp" and ".sol" files for the case study published in: J.L Yarmuch, Grigaliunas M.C., Munizaga-Rosas J.C. "Optimum Layout of Sublevel Stoping Mines"

  18. Exploring Customer Journey Mining and RPA: Case Study Data

    • figshare.com
    txt
    Updated Jul 15, 2023
    Cite
    Jost Wiethölter (2023). Exploring Customer Journey Mining and RPA: Case Study Data [Dataset]. http://doi.org/10.6084/m9.figshare.23690811.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jul 15, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jost Wiethölter
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This document includes the underlying data of the case study conducted and presented in the conference paper titled "Exploring Customer Journey Mining and RPA: Prediction of Customers’ Next Touchpoint".

    Data Source: https://www.kaggle.com/datasets/kishlaya18/customer-purchase-journey-netherlands

  19. Data for: A methodology for building a data-enclosing tunnel for automated...

    • data.mendeley.com
    • search.datacite.org
    Updated Nov 4, 2019
    Cite
    Laura Marcano (2019). Data for: A methodology for building a data-enclosing tunnel for automated online-feedback in simulator training [Dataset]. http://doi.org/10.17632/bpczvpr5np.1
    Explore at:
    Dataset updated
    Nov 4, 2019
    Authors
    Laura Marcano
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0) (https://creativecommons.org/licenses/by-nc/3.0/)
    License information was derived automatically

    Description

    The data was generated randomly, based on different combinations of possible actions in the dynamic simulator K-Spice from Kongsberg Digital. In case study 1, the aim was to increase oil production by 10% relative to its initial-condition value. In case study 2, the aim was to decrease gas production by 10% relative to its initial-condition value. The data show examples of possible correct and incorrect paths that a trainee could follow when trying to solve the scenarios.

  20. A dataset for temporal analysis of files related to the JFK case

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Luczak-Roesch, Markus (2020). A dataset for temporal analysis of files related to the JFK case [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1042153
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Luczak-Roesch, Markus
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains the content of the subset of all files with a correct publication date from the 2017 release of files related to the JFK case (retrieved from https://www.archives.gov/research/jfk/2017-release). This content was extracted from the source PDF files using the R OCR libraries tesseract and pdftools.

    The code to derive the dataset is given as follows:

    BEGIN R DATA PROCESSING SCRIPT

    library(tesseract)
    library(pdftools)

    pdfs <- list.files("[path to your output directory containing all PDF files]")

    # The meta file containing all metadata for the PDF files (e.g. publication date)
    meta <- read.csv2("[path to your input directory]/jfkrelease-2017-dce65d0ec70a54d5744de17d280f3ad2.csv",header = T,sep = ',')

    meta$Doc.Date <- as.character(meta$Doc.Date)

    # Drop rows whose date is empty or has a "/0000" year
    meta.clean <- meta[-which(meta$Doc.Date=="" | grepl("/0000",meta$Doc.Date)),]
    for(i in 1:nrow(meta.clean)){
      # Replace "00" placeholders (e.g. an unknown day or month) with "01"
      meta.clean$Doc.Date[i] <- gsub("00","01",meta.clean$Doc.Date[i])

      # Dates shorter than 10 characters are read as d/m/y and rewritten as m/d/Y
      if(nchar(meta.clean$Doc.Date[i])<10){
        meta.clean$Doc.Date[i] <- format(strptime(meta.clean$Doc.Date[i],format = "%d/%m/%y"),"%m/%d/%Y")
      }
    }

    meta.clean$Doc.Date <- strptime(meta.clean$Doc.Date,format = "%m/%d/%Y")

    # Sort the documents by publication date
    meta.clean <- meta.clean[order(meta.clean$Doc.Date),]

    docs <- data.frame(content=character(0),dpub=character(0),stringsAsFactors = F)
    for(i in 1:nrow(meta.clean)){
    #for(i in 1:3){

      # Build one temporary image file name per PDF page
      pdf_prop <- pdftools::pdf_info(paste0("[path to your output directory]/",tolower(meta.clean$File.Name[i])))
      tmp_files <- c()
      for(k in 1:pdf_prop$pages){
        tmp_files <- c(tmp_files,paste0("/home/STAFF/luczakma/RProjects/JFK/data/tmp/",k))
      }

      # Render every page of the PDF to a TIFF image
      img_file <- pdftools::pdf_convert(paste0("[path to your output directory]/",tolower(meta.clean$File.Name[i])), format = 'tiff', pages = NULL, dpi = 700,filenames = tmp_files)

      txt <- ""

      # OCR each page image and append the extracted text
      for(j in 1:length(img_file)){
        extract <- ocr(img_file[j], engine = tesseract("eng"))
        #unlink(img_file)
        txt <- paste(txt,extract,collapse = " ")
      }

      # Strip punctuation, collapse whitespace, lower-case, and store with the date
      docs <- rbind(docs,data.frame(content=iconv(tolower(gsub("\\s+"," ",gsub("[[:punct:]]|[ ]"," ",txt))),to="UTF-8"),dpub=format(meta.clean$Doc.Date[i],"%Y/%m/%d"),stringsAsFactors = F),stringsAsFactors = F)
    }

    write.table(docs,"[path to your output directory]/documents.csv", row.names = F)

    END R DATA PROCESSING SCRIPT
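
The date-cleaning step of the script above can be hard to follow in its flattened form, so here is a minimal Python sketch of the same rules applied to a single date string. It is a port for illustration only, not part of the dataset: the function name `normalize_doc_date` is hypothetical, and the sketch assumes the `Doc.Date` values are slash-separated strings as the R code implies.

```python
from datetime import datetime
from typing import Optional


def normalize_doc_date(raw: str) -> Optional[str]:
    """Apply the R script's date rules to one string.

    Returns the date as MM/DD/YYYY, or None when the R script would
    drop the row (empty date or a "/0000" year).
    """
    # Rows with an empty date or a "/0000" year are discarded.
    if raw == "" or "/0000" in raw:
        return None

    # Equivalent of gsub("00", "01", ...): every "00" becomes "01",
    # so an unknown day or month falls back to 01.
    cleaned = raw.replace("00", "01")

    # Strings shorter than 10 characters are read as day/month/2-digit year
    # and rewritten as MM/DD/YYYY.
    if len(cleaned) < 10:
        cleaned = datetime.strptime(cleaned, "%d/%m/%y").strftime("%m/%d/%Y")

    # Final parse as MM/DD/YYYY. Unlike R's strptime, which yields NA,
    # Python raises ValueError on a string that does not match.
    return datetime.strptime(cleaned, "%m/%d/%Y").strftime("%m/%d/%Y")
```

Two properties of these rules are worth keeping in mind when reusing the data. The blanket `"00"` replacement also touches years that contain `00`, so `01/15/1900` comes out as `01/15/1901`. In Python, `%y` maps two-digit years 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, so short dates from the early 1960s would need separate handling in a port like this one.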
