100+ datasets found
  1. A collection of sport activity files for data analysis and data mining

    • academictorrents.com
    bittorrent
    Updated Feb 16, 2015
    Cite
    Samo Rauter et al. (2015). A collection of sport activity files for data analysis and data mining [Dataset]. https://academictorrents.com/details/aac04fca4cd3b4dcd580e9018d68fa0647b7d908
    Available download formats: bittorrent
    Dataset updated
    Feb 16, 2015
    Dataset authored and provided by
    Samo Rauter et al.
    License

    https://academictorrents.com/nolicensespecified

    Description

    The dataset consists of data produced by nine cyclists. The data were exported directly from their Strava or Garmin Connect accounts. The sport activities are stored in GPX or TCX form, both of which are XML formats adapted to specific purposes. From each dataset, the following information can be obtained: GPS location, elevation, duration, distance, and average and maximal heart rate; some workouts also include data obtained from power meters.
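
Because GPX is plain XML, the track points described above can be read with a standard XML parser. The following is a minimal sketch, assuming GPX 1.1 files with the usual `trkpt`, `ele`, and `time` elements; the sample fragment and the function name `read_trackpoints` are hypothetical and not taken from the dataset. Heart-rate and power values usually sit in vendor-specific extension elements and are not handled here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written GPX 1.1 fragment used only for illustration;
# it is not taken from the dataset.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <trkseg>
      <trkpt lat="46.5547" lon="15.6459"><ele>275.0</ele><time>2015-01-01T10:00:00Z</time></trkpt>
      <trkpt lat="46.5550" lon="15.6465"><ele>277.5</ele><time>2015-01-01T10:00:10Z</time></trkpt>
    </trkseg>
  </trk>
</gpx>
"""

# GPX 1.1 elements live in this XML namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_trackpoints(gpx_text):
    """Return a list of (lat, lon, elevation, time) tuples from GPX text."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.findall(".//gpx:trkpt", GPX_NS):
        ele = pt.find("gpx:ele", GPX_NS)
        time = pt.find("gpx:time", GPX_NS)
        points.append((
            float(pt.get("lat")),
            float(pt.get("lon")),
            float(ele.text) if ele is not None else None,  # elevation is optional
            time.text if time is not None else None,       # timestamp is optional
        ))
    return points


if __name__ == "__main__":
    for point in read_trackpoints(SAMPLE_GPX):
        print(point)
```

TCX files use a different schema and namespace, so they would need their own element names; the overall approach stays the same.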

  2. Data from: Multi-objective optimization based privacy preserving distributed...

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Dec 7, 2023
    Cite
    Dashlink (2023). Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks [Dataset]. https://catalog.data.gov/dataset/multi-objective-optimization-based-privacy-preserving-distributed-data-mining-in-peer-to-p
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    Dashlink
    Description

    This paper proposes a scalable, local privacy preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and it is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.
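
To make the idea of a privacy-preserving distributed sum concrete, here is a minimal sketch of additive secret sharing, a standard textbook technique. It is not the asynchronous, optimization-based algorithm the paper proposes, and it assumes every peer can exchange one message with every other peer. The names `MODULUS`, `make_shares`, and `secure_sum` are hypothetical.

```python
import random

# Public modulus known to all peers; the true sum must stay below it.
MODULUS = 2**61 - 1


def make_shares(value, n_peers, rng):
    """Split value into n_peers random shares that add up to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_peers - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(values, seed=None):
    """Simulate all peers in one process and return the sum of their values.

    Assumption: values are non-negative integers whose sum is below MODULUS.
    A real deployment would draw shares from the `secrets` module rather than
    `random`, and each peer would run on its own machine.
    """
    rng = random.Random(seed)
    n = len(values)
    # shares[i][j] is the share that peer i sends to peer j.
    shares = [make_shares(v, n, rng) for v in values]
    # Each peer j publishes only the sum of the shares it received,
    # which on its own reveals nothing about any single input.
    partial = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial) % MODULUS


if __name__ == "__main__":
    print(secure_sum([3, 10, 25, 7], seed=0))
```

Unlike the approach in the description, this sketch uses one global privacy setting and a synchronous exchange; it only illustrates why a sum can be computed without any peer revealing its own number.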

  3. Privacy Preserving Distributed Data Mining

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • datadiscoverystudio.org
    • +2more
    Updated Feb 18, 2025
    Cite
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). Privacy Preserving Distributed Data Mining [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/privacy-preserving-distributed-data-mining
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Distributed data mining from privacy-sensitive multi-party data is likely to play an important role in the next generation of integrated vehicle health monitoring systems. For example, consider an aircraft manufacturer $\mathcal{C}$ manufacturing an aircraft model $A$ and selling it to five different airline operating companies $\mathcal{V}_1 \dots \mathcal{V}_5$. These aircraft, during their operation, generate huge amounts of data. Mining this data can reveal useful information regarding the health and operability of the aircraft, which can be useful for disaster management and prediction of efficient operating regimes. Now if the manufacturer $\mathcal{C}$ wants to analyze the performance data collected from different aircraft of model type $A$ belonging to different airlines, then central collection of data for subsequent analysis may not be an option. It should be noted that the result of this analysis may be statistically more significant if the data for aircraft model $A$ across all companies were available to $\mathcal{C}$. The potential problems arising out of such a data mining scenario are:

  4. Data from: A collection of sport activity datasets for data analysis and...

    • academictorrents.com
    bittorrent
    Updated Apr 6, 2017
    Cite
    Iztok Fister Jr. and Samo Rauter and Dusan Fister and Iztok Fister (2017). A collection of sport activity datasets for data analysis and data mining 2017a [Dataset]. https://academictorrents.com/details/f2221a292540ff3e6c85025754f775361c7cd886
    Available download formats: bittorrent
    Dataset updated
    Apr 6, 2017
    Dataset authored and provided by
    Iztok Fister Jr. and Samo Rauter and Dusan Fister and Iztok Fister
    License

    https://academictorrents.com/nolicensespecified

    Description

    A collection of sport activity datasets for data analysis and data mining 2017a

  5. Educational data collected from teachers - for the analysis of online...

    • zenodo.org
    Updated Sep 17, 2024
    Cite
    Corina Simionescu (2024). Educational data collected from teachers - for the analysis of online activities in schools in Romania (during the Covid-19 pandemic, March 2020 - April 2020) [Dataset]. http://doi.org/10.5281/zenodo.13772111
    Dataset updated
    Sep 17, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Corina Simionescu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset comes from a questionnaire of 24 questions, which can be accessed at https://forms.gle/bUgYMfoNHh7r6ebs6. The questionnaire was distributed to teachers and completed by 956 respondents; it aims to analyze the online activities carried out during March and April 2020.
    Each question is designed to reveal different aspects of the experiences, skills, and perspectives of teaching staff regarding online teaching and learning.

    To protect the identity of the respondents and to obtain accurate responses, all data collected from teachers was anonymous. We did not collect any personal information whatsoever. This aspect was made clear to the respondents in the description of the questionnaire.

  6. The fee principle for exploration data collection of CPC Corporation, Taiwan...

    • data.gov.tw
    csv
    Updated Mar 10, 2025
    Cite
    CPC Corporation, Taiwan (2025). The fee principle for exploration data collection of CPC Corporation, Taiwan [Dataset]. https://data.gov.tw/en/datasets/30135
    Available download formats: csv
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    CPC Corporation (http://en.cpc.com.tw/)
    Authors
    CPC Corporation, Taiwan
    License

    https://data.gov.tw/license

    Description

    The Exploration and Mining Division provides a fee schedule for the provision of exploration data to external parties.

  7. pinterest_dataset

    • data.mendeley.com
    Updated Oct 27, 2017
    Cite
    pinterest_dataset [Dataset]. https://data.mendeley.com/datasets/fs4k2zc5j5/2
    Dataset updated
    Oct 27, 2017
    Authors
    Juan Carlos Gomez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset with 72000 pins from 117 users in Pinterest. Each pin contains a short raw text and an image. The images are processed using a pretrained Convolutional Neural Network and transformed into a vector of 4096 features.

    This dataset was used in the paper "User Identification in Pinterest Through the Refinement of a Cascade Fusion of Text and Images" to identify specific users given their comments. The paper is published in the Research in Computing Science Journal, as part of the LKE 2017 conference. The dataset includes the splits used in the paper.

    There are nine files. text_test, text_train and text_val contain the raw text of each pin in the corresponding split of the data. imag_test, imag_train and imag_val contain the image features of each pin in the corresponding split of the data. train_user and val_test_users contain the index of the user of each pin (between 0 and 116). There is a one-to-one correspondence among the test, train and validation files for images, text and users. There are 400 pins per user in the train set, and 100 pins per user in each of the validation and test sets.

    If you have questions regarding the data, write to: jc dot gomez at ugto dot mx

  8. Data from: Peer-to-Peer Data Mining, Privacy Issues, and Games

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • data.nasa.gov
    • +2more
    Updated Feb 18, 2025
    Cite
    nasa.gov (2025). Peer-to-Peer Data Mining, Privacy Issues, and Games [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/peer-to-peer-data-mining-privacy-issues-and-games
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Peer-to-Peer (P2P) networks are gaining increasing popularity in many distributed applications such as file-sharing, network storage, web caching, searching and indexing of relevant documents and P2P network-threat analysis. Many of these applications require scalable analysis of data over a P2P network. This paper starts by offering a brief overview of distributed data mining applications and algorithms for P2P environments. Next it discusses some of the privacy concerns with P2P data mining and points out the problems of existing privacy-preserving multi-party data mining techniques. It further points out that most of the nice assumptions of these existing privacy preserving techniques fall apart in real-life applications of privacy-preserving distributed data mining (PPDM). The paper offers a more realistic formulation of the PPDM problem as a multi-party game and points out some recent results.

  9. Data from: Social Media Data Mining Becomes Ordinary

    • orda.shef.ac.uk
    docx
    Updated May 30, 2023
    Cite
    Helen Kennedy (2023). Social Media Data Mining Becomes Ordinary [Dataset]. http://doi.org/10.15131/shef.data.5195032.v1
    Available download formats: docx
    Dataset updated
    May 30, 2023
    Dataset provided by
    The University of Sheffield
    Authors
    Helen Kennedy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This research explored what happens when social media data mining becomes ordinary and is carried out by organisations that might be seen as the pillars of everyday life. The interviews on which the transcripts are based are discussed in Chapter 6 of the book. The referenced book contains a description of the methods. No other publications resulted from working with these transcripts.

  10. Educational Attainment in North Carolina Public Schools: Use of statistical...

    • data.mendeley.com
    Updated Nov 14, 2018
    Cite
    Scott Herford (2018). Educational Attainment in North Carolina Public Schools: Use of statistical modeling, data mining techniques, and machine learning algorithms to explore 2014-2017 North Carolina Public School datasets. [Dataset]. http://doi.org/10.17632/6cm9wyd5g5.1
    Dataset updated
    Nov 14, 2018
    Authors
    Scott Herford
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    North Carolina
    Description

    The purpose of data mining analysis is to find patterns in the data using techniques such as classification or regression. It is not always feasible to apply classification algorithms directly to a dataset. Before doing any work on the data, the data has to be pre-processed, and this process normally involves feature selection and dimensionality reduction. We tried to use clustering as a way to reduce the dimension of the data and create new features. In our project, using clustering prior to classification did not improve performance much. The reason could be that the features we selected for clustering are not well suited to it. Because of the nature of the data, classification tasks are going to provide more information to work with in terms of improving knowledge and overall performance metrics.

    From the dimensionality reduction perspective: clustering is different from Principal Component Analysis, which guarantees finding the best linear transformation that reduces the number of dimensions with a minimum loss of information. Using clusters to reduce the data dimension loses a lot of information, since clustering techniques are based on a 'distance' metric, and at high dimensions Euclidean distance loses pretty much all meaning. Therefore, "reducing" dimensionality by mapping data points to cluster numbers is not always good, since you may lose almost all the information.

    From the creating new features perspective: clustering analysis creates labels based on the patterns of the data, which brings uncertainty into the data. When clustering is used prior to classification, the choice of the number of clusters strongly affects the performance of the clustering and, in turn, the performance of classification. If the subset of features we cluster on is well suited to it, clustering might increase the overall classification performance. For example, if the features we use k-means on are numerical and the dimension is small, the overall classification performance may be better. We did not fix the clustering outputs with a random_state, in order to see whether they were stable. Our assumption was that if the results vary highly from run to run, which they definitely did, the data may simply not cluster well with the methods selected. Basically, the ramification we saw was that our results are not much better than random when applying clustering to the data preprocessing.

    Finally, it is important to ensure a feedback loop is in place to continuously collect the same data in the same format from which the models were created. This feedback loop can be used to measure the models' real-world effectiveness and to revise the models from time to time as things change.
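
The remark that Euclidean distance loses meaning at high dimensions can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not part of the authors' project: it draws uniformly random points (not the North Carolina school data) and measures how much farther the farthest point is than the nearest one from a random query point. The function name `relative_contrast` is hypothetical.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for random points in [0, 1]^dim.

    A value near zero means all points are almost equally far from the
    query, so "nearest cluster centre" carries little information.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

As the dimension grows, the printed contrast shrinks, which is consistent with the description's observation that distance-based clustering on many features can produce unstable, near-random labels.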

  11. Data mining-Mathematics

    • workwithdata.com
    Updated Apr 15, 2024
    Cite
    Work With Data (2024). Data mining-Mathematics [Dataset]. https://www.workwithdata.com/topic/data-mining-mathematics
    Dataset updated
    Apr 15, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data mining-Mathematics is a book subject. It includes 13 books, written by 11 different authors.

  12. Data from: Integrating Data Mining and Natural Language Processing to...

    • figshare.com
    zip
    Updated Oct 1, 2024
    Cite
    Jinyoung Jeong; Taehyun Park; JunHo Song; Seungpyo Kang; Joonghee Won; Jungim Han; Kyoungmin Min (2024). Integrating Data Mining and Natural Language Processing to Construct a Melting Point Database for Organometallic Compounds [Dataset]. http://doi.org/10.1021/acs.jcim.4c01254.s004
    Available download formats: zip
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    ACS Publications
    Authors
    Jinyoung Jeong; Taehyun Park; JunHo Song; Seungpyo Kang; Joonghee Won; Jungim Han; Kyoungmin Min
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    As semiconductor devices are miniaturized, the importance of atomic layer deposition (ALD) technology is growing. When designing ALD precursors, it is important to consider the melting point, because the precursors should have melting points lower than the process temperature. However, obtaining melting point data is challenging due to experimental sensitivity and high computational costs. As a result, a comprehensive and well-organized database for the melting points of organometallic compounds (OMCs) has not been fully reported yet. Therefore, in this study, we constructed a database of melting points for 1,845 OMCs, including 58 metal and 6 metalloid elements. The database contains CAS numbers, molecular formulas, and structural information and was constructed through automatic extraction and systematic curation. The melting point information was extracted using two methods: 1) 1,434 materials from 11 chemical vendor databases and 2) 411 materials identified through natural language processing (NLP) techniques with an accuracy of 86.3%, based on 2,096 scientific papers published over the past 29 years. In our database, the OMCs contain up to around 250 atoms and have melting points that range from −170 to 1610 °C. The main source is the Chemsrc database, accounting for 607 materials (32.9%), and Fe is the most common central metal or metalloid element (15.0%), followed by Si (11.6%) and B (6.7%). To validate the utilization of the constructed database, a multimodal neural network model was developed, integrating graph-based and feature-based information as descriptors to predict the melting points of the OMCs, but it achieved only moderate performance. We believe the current approach reduces the time and cost associated with manual data collection and processing, contributing to effective screening of potentially promising ALD precursors and providing crucial information for the advancement of the semiconductor industry.

  13. Data from: MusicOSet: An Enhanced Open Dataset for Music Data Mining

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Jun 7, 2021
    Cite
    Mariana O. Silva; Laís Mota; Mirella M. Moro (2021). MusicOSet: An Enhanced Open Dataset for Music Data Mining [Dataset]. http://doi.org/10.5281/zenodo.4904639
    Available download formats: zip, bin
    Dataset updated
    Jun 7, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mariana O. Silva; Laís Mota; Mirella M. Moro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MusicOSet is an open and enhanced dataset of musical elements (artists, songs and albums) based on musical popularity classification. It provides a directly accessible collection of data suitable for numerous tasks in music data mining (e.g., data visualization, classification, clustering, similarity search, MIR, HSS and so forth). To create MusicOSet, the potential information sources were divided into three main categories: music popularity sources, metadata sources, and acoustic and lyrical features sources. Data from all three categories were initially collected between January and May 2019; the data were then updated and enhanced in June 2019.

    The attractive features of MusicOSet include:

    • Integration and centralization of different musical data sources
    • Calculation of popularity scores and classification of hit and non-hit musical elements, ranging from 1962 to 2018
    • Enriched metadata for music, artists, and albums from the US popular music industry
    • Availability of acoustic and lyrical resources
    • Unrestricted access in two formats: SQL database and compressed .csv files

    | Data              | # Records |
    |:-----------------:|:---------:|
    | Songs             | 20,405    |
    | Artists           | 11,518    |
    | Albums            | 26,522    |
    | Lyrics            | 19,664    |
    | Acoustic Features | 20,405    |
    | Genres            | 1,561     |

  14. Data Mining at NASA: From Theory to Applications - Dataset - NASA Open Data...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    Updated Feb 18, 2025
    Cite
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). Data Mining at NASA: From Theory to Applications - Dataset - NASA Open Data Portal [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/data-mining-at-nasa-from-theory-to-applications
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    NASA has some of the largest and most complex data sources in the world, ranging from the earth sciences and space sciences to massive distributed engineering data sets from commercial aircraft and spacecraft. This talk will discuss some of the issues and algorithms developed to analyze and discover patterns in these data sets. We will also provide an overview of a large research program in Integrated Vehicle Health Management. The goal of this program is to develop advanced technologies to automatically detect, diagnose, predict, and mitigate adverse events during the flight of an aircraft. A case study will be presented on a recent data mining analysis performed to support the Flight Readiness Review of the Space Shuttle Mission STS-119.

  15. Video-to-Model Data Set

    • figshare.com
    • commons.datacite.org
    xml
    Updated Mar 24, 2020
    Cite
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
    Available download formats: xml
    Dataset updated
    Mar 24, 2020
    Dataset provided by
    figshare
    Authors
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).

    Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of manual activities is required. Thus, an approach is presented on how execution-level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable, consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from the top. A neural-network-based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm that generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.

    The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the Petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.
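
The description mentions dynamic time warping (DTW) as the similarity measure used when clustering hand trajectories. The following is a minimal sketch of the classic DTW recurrence on one-dimensional sequences. It is the textbook algorithm, not the authors' implementation; their trajectories are hand positions in an image, which would need a point-to-point distance in place of `abs`. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Return the dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j]; each
    step may advance in a, in b, or in both, so sequences of different
    speed or length can still be matched.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # local distance between samples
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    slow = [1, 2, 2, 3]
    fast = [1, 2, 3]
    print(dtw_distance(fast, slow))
```

A matrix of pairwise DTW distances between trajectories could then be passed to a hierarchical clustering routine, which matches the pipeline outlined in the abstract only in broad strokes.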

  16. Data from: Community-Scale Attic Retrofit and Home Energy Upgrade Data...

    • catalog.data.gov
    • data.openei.org
    • +3more
    Updated Nov 2, 2023
    Cite
    Davis Energy (2023). Community-Scale Attic Retrofit and Home Energy Upgrade Data Mining - Hot Dry Climate [Dataset]. https://catalog.data.gov/dataset/community-scale-attic-retrofit-and-home-energy-upgrade-data-mining-hot-dry-climate
    Dataset updated
    Nov 2, 2023
    Dataset provided by
    Davis Energy
    Description

    Retrofitting is an essential element of any comprehensive strategy for improving residential energy efficiency. The residential retrofit market is still developing, and program managers must develop innovative strategies to increase uptake and promote economies of scale. Residential retrofitting remains a challenging proposition to sell to homeowners, because awareness levels are low and financial incentives are lacking. The U.S. Department of Energy's Building America research team, Alliance for Residential Building Innovation (ARBI), implemented a project to increase residential retrofits in Davis, California. The project used a neighborhood-focused strategy for implementation and a low-cost retrofit program that focused on upgraded attic insulation and duct sealing. ARBI worked with a community partner, the not-for-profit Cool Davis Initiative, as well as selected area contractors to implement a strategy that sought to capitalize on the strong local expertise of partners and the unique aspects of the Davis, California, community. Working with community partners also allowed ARBI to collect and analyze data about effective messaging tactics for community-based retrofit programs. ARBI expected this project, called Retrofit Your Attic, to achieve higher uptake than other retrofit projects, because it emphasized a low-cost, one-measure retrofit program. However, this was not the case. The program used a strategy that focused on attics (including air sealing, duct sealing, and attic insulation) as a low-cost entry for homeowners to complete home retrofits. The price was kept below $4,000 after incentives; both contractors in the program offered the same price. The program completed only five retrofits. Interestingly, none of those homeowners used the one-measure strategy. All five homeowners were concerned about cost, comfort, and energy savings and included additional measures in their retrofits. The low-cost, one-measure strategy did not increase the uptake among homeowners, even in a well-educated, affluent community such as Davis.

    This project has two primary components. One is to complete attic retrofits on a community scale in the hot-dry climate of Davis, CA. Sufficient data will be collected on these projects to include them in the BAFDR. Additionally, ARBI is working with contractors to obtain building and utility data from a large set of retrofit projects in CA (hot-dry). These projects are to be uploaded into the BAFDR.

  17. Linked Data Mining Challenge RM Set

    • da-ra.de
    • search.gesis.org
    • +2more
    Updated 2015
    Cite
    Johann Schaible (2015). Linked Data Mining Challenge RM Set [Dataset]. http://doi.org/10.7802/78
    Dataset updated
    2015
    Dataset provided by
    GESIS Data Archive
    da|ra
    Authors
    Johann Schaible
    Description

    RapidMiner process files and XML test set including the predicted labels for the Linked Data Mining Challenge 2015.

  18. Books about Data mining-Social aspects

    • workwithdata.com
    Updated Aug 2, 2024
    Cite
    The citation is currently not available for this dataset.
    Dataset updated
    Aug 2, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books, filtered to those whose book subject is Data mining-Social aspects. It features 9 columns, including author, BNB id, book, book publisher, and book subjects. The preview is ordered by publication date (descending).

  19. Data from: PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • data.nasa.gov
    • +2more
    Updated Feb 19, 2025
    Cite
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM AND A CASE STUDY [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/padmini-a-peer-to-peer-distributed-astronomy-data-mining-system-and-a-case-study
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    PADMINI: A PEER-TO-PEER DISTRIBUTED ASTRONOMY DATA MINING SYSTEM AND A CASE STUDY
    TUSHAR MAHULE, KIRK BORNE, SANDIPAN DEY, SUGANDHA ARORA, AND HILLOL KARGUPTA

    Abstract. Peer-to-Peer (P2P) networks are appealing for astronomy data mining from virtual observatories because of the large volume of the data, compute-intensive tasks, potentially large number of users, and distributed nature of the data analysis process. This paper offers a brief overview of PADMINI, a Peer-to-Peer Astronomy Data MINIng system. It also presents a case study on PADMINI for distributed outlier detection using astronomy data. PADMINI is a web-based system powered by Google Sky and distributed data mining algorithms that run on a collection of computing nodes. This paper offers a case study of PADMINI, evaluating the architecture and the performance of the overall system. Detailed experimental results are presented in order to document the utility and scalability of the system.

  20. Application of data analytics and mining across procurement process globally...

    • statista.com
    Updated Jul 7, 2023
    Cite
    Statista (2023). Application of data analytics and mining across procurement process globally 2017 [Dataset]. https://www.statista.com/statistics/728137/worldwide-application-of-data-analytics-and-mining-across-procurement-process/
    Dataset updated
    Jul 7, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017
    Area covered
    Worldwide
    Description

    This statistic displays the various applications of data analytics and mining across procurement processes, according to chief procurement officers (CPOs) worldwide, as of 2017. Fifty-seven percent of the CPOs asked agreed that data analytics and mining had been applied to intelligent and advanced analytics for negotiations, and 40 percent of them indicated data analytics and mining had been applied to supplier portfolio optimization processes.

Two scholarly articles cite the dataset "A collection of sport activity files for data analysis and data mining" (view in Google Scholar).