52 datasets found
  1. Data Warehousing and Data Mining (Old), 7th Semester, Computer Science and...

    • paper.erudition.co.in
    html
    Updated Nov 23, 2025
    Cite
    Einetic (2025). Data Warehousing and Data Mining (Old), 7th Semester, Computer Science and Engineering, MAKAUT | Erudition Paper [Dataset]. https://paper.erudition.co.in/makaut/btech-in-computer-science-and-engineering/7/data-warehousing-and-data-mining
    Available download formats: html
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of Data Warehousing and Data Mining (Old), 7th Semester, Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology

  2. Module II

    • paper.erudition.co.in
    html
    Updated Nov 23, 2025
    Cite
    Einetic (2025). Module II [Dataset]. https://paper.erudition.co.in/makaut/btech-in-computer-science-and-engineering/7/data-warehousing-and-data-mining
    Available download formats: html
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Module II of Data Warehousing and Data Mining, 7th Semester, Computer Science and Engineering

  3. Musical Chords and Image Descriptors from Film Fantasia (Disney)

    • figshare.com
    txt
    Updated Apr 10, 2020
    Cite
    Lucía Martín-Gómez (2020). Musical Chords and Image Descriptors from Film Fantasia (Disney) [Dataset]. http://doi.org/10.6084/m9.figshare.12110712.v2
    Available download formats: txt
    Dataset updated
    Apr 10, 2020
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Lucía Martín-Gómez
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    FANTASIA

    This repository contains the data related to image descriptors and sounds associated with a selection of frames of the film Fantasia, produced by Disney.

    About

    This repository contains data used in a doctoral thesis for the automatic composition of descriptive music. The information is extracted from the fragment of The Nutcracker from the film Fantasia (Disney, 1940) using SIFT and BoVW, color quantization and CENS.

    Data

    • Attributes 1-50: weighted vector of visual words.
    • Attributes 51-59: red, green and blue values for three RGB colors.
    • Note1, note2 and note3: MIDI notes related to each frame of the film.

    License

    Data is available under the MIT License. To make use of the data, the article must be cited.

  4. Data from: CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2...

    • zenodo.org
    bin, png, zip
    Updated Jul 12, 2024
    Cite
    Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado (2024). CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2 DIABETES [Dataset]. http://doi.org/10.5281/zenodo.7778291
    Available download formats: bin, png, zip
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Technical notes and documentation on the common data model of the project CONCEPT-DM2.

    This publication corresponds to the Common Data Model (CDM) specification of the CONCEPT-DM2 project for the implementation of a federated network analysis of the healthcare pathway of type 2 diabetes.

    Aims of the CONCEPT-DM2 project:

    General aim: To analyse chronic care effectiveness and the efficiency of care pathways in diabetes, assuming the relevance of care pathways as independent factors of health outcomes, using real-world data (RWD) from five Spanish Regional Health Systems.

    Main specific aims:

    • To characterize the care pathways in patients with diabetes through the whole care system in terms of process indicators and pharmacologic recommendations
    • To compare these observed care pathways with the theoretical clinical pathways derived from the clinical practice guidelines
    • To assess whether adherence to clinical guidelines influences important health outcomes, such as cardiovascular hospitalizations.
    • To compare the traditional analytical methods with process mining methods in terms of modeling quality, prediction performance and information provided.

    Study Design: It is a population-based retrospective observational study centered on all T2D patients diagnosed in five Regional Health Services within the Spanish National Health Service. We will include all the contacts of these patients with the health services using the electronic medical record systems including Primary Care data, Specialized Care data, Hospitalizations, Urgent Care data, Pharmacy Claims, and also other registers such as the mortality and the population register.

    Cohort definition: All patients with a Type 2 Diabetes code in the clinical health records

    • Inclusion criteria: patients who, at 01/01/2017 or during the follow-up from 01/01/2017 to 31/12/2022, had an active health card (active TIS - tarjeta sanitaria activa) and a type 2 diabetes code (T2D, DM2 in Spanish) in the clinical records of primary care (CIAP2 T90 when the CIAP code system is used)
    • Exclusion criteria:
      • patients with no contact with the health system from 01/01/2017 to 31/12/2022
      • patients that had a T1D (DM1) code opened after the T2D code during the follow-up.
    • Study period. From 01/01/2017 to 31/12/2022
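
    The inclusion and exclusion criteria above can be read as a simple filter over patient records. The following Python sketch is only an illustration under stated assumptions: the `Patient` class and all of its field names are hypothetical and are not taken from the CONCEPT-DM2 data model specification, and "active health card" is reduced to a single boolean.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

STUDY_START = date(2017, 1, 1)
STUDY_END = date(2022, 12, 31)


@dataclass
class Patient:
    # Hypothetical fields, not part of the published data model.
    active_health_card: bool              # active TIS at study start or during follow-up
    t2d_code_date: Optional[date]         # date the T2D code was opened, if any
    t1d_code_date: Optional[date] = None  # date a T1D code was opened, if any
    contact_dates: List[date] = field(default_factory=list)


def in_study(d: date) -> bool:
    """True when the date falls inside the study period (inclusive)."""
    return STUDY_START <= d <= STUDY_END


def in_cohort(p: Patient) -> bool:
    # Inclusion: active card and a T2D code present at study start
    # or opened during follow-up.
    if not p.active_health_card or p.t2d_code_date is None:
        return False
    if p.t2d_code_date > STUDY_END:
        return False
    # Exclusion 1: no contact with the health system in the study period.
    if not any(in_study(d) for d in p.contact_dates):
        return False
    # Exclusion 2: a T1D code opened after the T2D code, during follow-up.
    t1d = p.t1d_code_date
    if t1d is not None and t1d > p.t2d_code_date and in_study(t1d):
        return False
    return True
```

    A patient with a T1D code that predates the T2D code is kept by this sketch, since the exclusion only targets T1D codes opened after the T2D code.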

    Files included in this publication:

    • Datamodel_CONCEPT_DM2_diagram.png
    • Common data model specification (Datamodel_CONCEPT_DM2_v.0.1.0.xlsx)
    • Synthetic datasets (Datamodel_CONCEPT_DM2_sample_data)
      • sample_data1_dm_patient.csv
      • sample_data2_dm_param.csv
      • sample_data3_dm_patient.csv
      • sample_data4_dm_param.csv
      • sample_data5_dm_patient.csv
      • sample_data6_dm_param.csv
      • sample_data7_dm_param.csv
      • sample_data8_dm_param.csv
    • Datamodel_CONCEPT_DM2_explanation.pptx
  5. Data for: Epidemiological landscape of Batrachochytrium dendrobatidis and...

    • datadryad.org
    • search.dataone.org
    • +1 more
    zip
    Updated Dec 13, 2023
    Cite
    M. Delia Basanta; Julián A. Velasco; Constantino González-Salazar (2023). Data for: Epidemiological landscape of Batrachochytrium dendrobatidis and its impact on amphibian diversity at global scale [Dataset]. http://doi.org/10.5061/dryad.83bk3j9zv
    Available download formats: zip
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    Dryad
    Authors
    M. Delia Basanta; Julián A. Velasco; Constantino González-Salazar
    Time period covered
    Nov 26, 2023
    Description

    1. Title of Dataset: Epidemiological landscape of Batrachochytrium dendrobatidis and its impact on amphibian diversity at global scale

    https://doi.org/10.5061/dryad.83bk3j9zv

    2. Authors Information

    M. Delia Basanta Department of Biology, University of Nevada Reno. Reno, Nevada, USA. delibasanta@gmail.com

    Julián A. Velasco Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México. javelasco@atmosfera.unam.mx

    Constantino González-Salazar. Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México. cgsalazar@atmosfera.unam.mx

    3. Date of data collection (single date, range, approximate date): 2019-2022

    4. Geographic location of data collection: Global

    DATA & FILE OVERVIEW

    1. File List:

    1. Table S1.xls
    2. Supplementary information S1.docx
    3. Table S2.xlsx
    4. Table S3.xlsx
    5. Table S4.xlsx

    DATA-SPECIFIC INFORMATION F...

  6. Data from: Characterizing and classifying neuroendocrine neoplasms through...

    • datacatalog.mskcc.org
    • search.dataone.org
    • +1 more
    Updated Sep 19, 2023
    Cite
    Nanayakkara, Jina; Yang, Xiaojing; Tyryshkin, Kathrin; Wong, Justin J.M.; Vanderbeck, Kaitlin; Ginter, Paula S.; Scognamiglio, Theresa; Chen, Yao-Tseng; Panarelli, Nicole; Cheung, Nai-Kong; Dijk, Frederike; Ben-Dov, Iddo Z.; Kim, Michelle Kang; Singh, Simron; Morozov, Pavel; Max, Klaas E. A.; Tuschl, Thomas; Renwick, Neil (2023). Characterizing and classifying neuroendocrine neoplasms through microRNA sequencing and data mining [Dataset]. http://doi.org/10.5061/dryad.fn2z34tqj
    Dataset updated
    Sep 19, 2023
    Dataset provided by
    MSK Library
    Authors
    Nanayakkara, Jina; Yang, Xiaojing; Tyryshkin, Kathrin; Wong, Justin J.M.; Vanderbeck, Kaitlin; Ginter, Paula S.; Scognamiglio, Theresa; Chen, Yao-Tseng; Panarelli, Nicole; Cheung, Nai-Kong; Dijk, Frederike; Ben-Dov, Iddo Z.; Kim, Michelle Kang; Singh, Simron; Morozov, Pavel; Max, Klaas E. A.; Tuschl, Thomas; Renwick, Neil
    Description

    From Dryad entry:

    "Abstract
    Neuroendocrine neoplasms (NENs) are clinically diverse and incompletely characterized cancers that are challenging to classify. MicroRNAs (miRNAs) are small regulatory RNAs that can be used to classify cancers. Recently, a morphology-based classification framework for evaluating NENs from different anatomic sites was proposed by experts, with the requirement of improved molecular data integration. Here, we compiled 378 miRNA expression profiles to examine NEN classification through comprehensive miRNA profiling and data mining. Following data preprocessing, our final study cohort included 221 NEN and 114 non-NEN samples, representing 15 NEN pathological types and five site-matched non-NEN control groups. Unsupervised hierarchical clustering of miRNA expression profiles clearly separated NENs from non-NENs. Comparative analyses showed that miR-375 and miR-7 expression is substantially higher in NEN cases than non-NEN controls. Correlation analyses showed that NENs from diverse anatomic sites have convergent miRNA expression programs, likely reflecting morphologic and functional similarities. Using machine learning approaches, we identified 17 miRNAs to discriminate 15 NEN pathological types and subsequently constructed a multi-layer classifier, correctly identifying 217 (98%) of 221 samples and overturning one histologic diagnosis. Through our research, we have identified common and type-specific miRNA tissue markers and constructed an accurate miRNA-based classifier, advancing our understanding of NEN diversity.

    Methods
    Sequencing-based miRNA expression profiles from 378 clinical samples, comprising 239 neuroendocrine neoplasm (NEN) cases and 139 site-matched non-NEN controls, were used in this study. Expression profiles were either compiled from published studies (n=149) or generated through small RNA sequencing (n=229). Prior to sequencing, total RNA was isolated from formalin-fixed paraffin-embedded (FFPE) tissue blocks or fresh-frozen (FF) tissue samples. Small RNA cDNA libraries were sequenced on HiSeq 2500 Illumina platforms using an established small RNA sequencing (Hafner et al., 2012 Methods) and sequence annotation pipeline (Brown et al., 2013 Front Genet) to generate miRNA expression profiles. Scaling our existing approach to miRNA-based NEN classification (Panarelli et al., 2019 Endocr Relat Cancer; Ren et al., 2017 Oncotarget), we constructed and cross-validated a multi-layer classifier for discriminating NEN pathological types based on selected miRNAs.

    Usage notes
    Diagnostic histopathology and small RNA cDNA library preparation information for all samples are presented in Table S1 of the associated manuscript."

  7. Module IV

    • paper.erudition.co.in
    html
    Updated Nov 23, 2025
    Cite
    Einetic (2025). Module IV [Dataset]. https://paper.erudition.co.in/makaut/btech-in-computer-science-and-engineering/7/data-warehousing-and-data-mining
    Available download formats: html
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question Paper Solutions of chapter Module IV of Data Warehousing and Data Mining, 7th Semester, Computer Science and Engineering

  8. Synthetic temporal dataset for temporal trend analysis and retrieval

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 31, 2025
    Cite
    Jing Ao; Kara Schatz; Rada Chirkova (2025). Synthetic temporal dataset for temporal trend analysis and retrieval [Dataset]. http://doi.org/10.5061/dryad.q573n5trf
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Jing Ao; Kara Schatz; Rada Chirkova
    Time period covered
    May 7, 2024
    Description

    This repository contains a synthetic, temporal data set that was generated by the authors by sampling values from the Gaussian distribution. The dataset contains eight nontemporal dimensions, a temporal dimension, and a numerical measure attribute. The data set was generated according to the scheme and procedure detailed in this source paper: Kaufmann, M., Fischer, P.M., May, N., Tonder, A., Kossmann, D. (2014). TPC-BiH: A Benchmark for Bitemporal Databases. In: Performance Characterization and Benchmarking. TPCTC 2013. Lecture Notes in Computer Science, vol 8391. Springer, Cham. The data set can be used for analyzing and locating temporal trends of interest, where a temporal trend is generated by selecting the desired values of the nontemporal dimensions, and then selecting the corresponding values of the temporal dimension and the numerical measure attribute. Locating temporal trends of interest, e.g., unusual trends, is a common task in many applications and domains. It can also be o...

    Synthetic temporal dataset for temporal trend analysis and retrieval

    https://doi.org/10.5061/dryad.q573n5trf

    The data set can be used for analyzing and locating temporal trends of interest, where a temporal trend is generated by selecting the desired values of the nontemporal dimensions, and then selecting the corresponding values of the temporal dimension and the numerical measure attribute. Locating temporal trends of interest, e.g., unusual trends, is a common task in many applications and domains. It can also be of interest to understand which nontemporal dimensions are associated with the temporal trends of interest. To this end, the data set can be used for analyzing and locating temporal trends in the data cube induced by the data set, e.g., retrieving outlier temporal trends using an outlier detector.

    We generated the synthetic temporal data set [1], which contains up to 8 nontemporal dimensions, one temporal dimension, and a nume...
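
    The way a temporal trend is obtained (fix values of the nontemporal dimensions, then read off the temporal dimension and the measure) can be sketched in a few lines of Python. This is only an illustration: the column names `d1`, `d2`, `t` and `measure` are hypothetical, and the rows are made-up sample values, not data from the published files.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def select_trend(
    rows: Iterable[Mapping[str, Any]],
    fixed: Mapping[str, Any],
    time_key: str = "t",
    measure_key: str = "measure",
) -> List[Tuple[Any, Any]]:
    """Return (time, measure) pairs for rows matching the fixed dimension values."""
    points = [
        (row[time_key], row[measure_key])
        for row in rows
        if all(row.get(dim) == value for dim, value in fixed.items())
    ]
    # Order by the temporal dimension so the pairs read as a trend.
    return sorted(points, key=lambda point: point[0])


# Made-up rows with two nontemporal dimensions (hypothetical names).
rows = [
    {"d1": "a", "d2": "x", "t": 2, "measure": 5.0},
    {"d1": "a", "d2": "x", "t": 1, "measure": 3.5},
    {"d1": "b", "d2": "x", "t": 1, "measure": 9.0},
]
trend = select_trend(rows, {"d1": "a", "d2": "x"})
```

    An outlier detector, as mentioned in the description, would then operate on the list of trends produced by calling such a function once per combination of dimension values.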

  9. Examples of natural language in (a) mortality dataset, (b) aeromedical...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Neva J. Bull; Bridget Honan; Neil J. Spratt; Simon Quilty (2023). Examples of natural language in (a) mortality dataset, (b) aeromedical retrieval dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0284965.t001
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Neva J. Bull; Bridget Honan; Neil J. Spratt; Simon Quilty
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Examples of natural language in (a) mortality dataset, (b) aeromedical retrieval dataset.

  10. Zenodo Open Metadata snapshot - Training dataset for records and communities...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Dec 15, 2022
    Cite
    Zenodo team (2022). Zenodo Open Metadata snapshot - Training dataset for records and communities classifier building [Dataset]. http://doi.org/10.5281/zenodo.7438358
    Available download formats: bin, application/gzip
    Dataset updated
    Dec 15, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Zenodo team
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains Zenodo's published open access records and communities metadata, including entries marked by the Zenodo staff as spam and deleted.

    The datasets are gzip-compressed JSON-lines files, where each line is a JSON object representation of a Zenodo record or community.

    Records dataset

    Filename: zenodo_open_metadata_{ date of export }.jsonl.gz

    Each object contains the terms: part_of, thesis, description, doi, meeting, imprint, references, recid, alternate_identifiers, resource_type, journal, related_identifiers, title, subjects, notes, creators, communities, access_right, keywords, contributors, publication_date

    which correspond to the fields with the same name available in Zenodo's record JSON Schema at https://zenodo.org/schemas/records/record-v1.0.0.json.

    In addition, some terms have been altered:

    • The term files contains a list of dictionaries containing filetype, size, and filename only.
    • The term license contains a short Zenodo ID of the license (e.g. "cc-by").

    Communities dataset

    Filename: zenodo_community_metadata_{ date of export }.jsonl.gz

    Each object contains the terms: id, title, description, curation_policy, page

    which correspond to the fields with the same name available in Zenodo's community creation form.

    Notes for all datasets

    For each object the term spam contains a boolean value, determining whether a given record/community was marked as spam content by Zenodo staff.

    Top-level terms whose values were missing in the metadata may contain a null value.

    A smaller uncompressed random sample of 200 JSON lines is also included for each dataset to test and get familiar with the format without having to download the entire dataset.
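
    Since each file is gzip-compressed JSON lines with a boolean `spam` term per object, it can be streamed line by line without loading the whole dataset. The Python sketch below uses only the standard library; the function names are illustrative, and the path passed in is whatever export file has been downloaded.

```python
import gzip
import json
from collections import Counter
from typing import Iterator


def iter_jsonl_gz(path: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a gzip-compressed JSON-lines file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_spam(path: str) -> Counter:
    """Count objects by the value of their `spam` term (None when absent)."""
    return Counter(obj.get("spam") for obj in iter_jsonl_gz(path))
```

    For the uncompressed 200-line samples, the same loop works with the built-in `open` in place of `gzip.open`.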

  11. Wind Turbine Accident News (1980-2013)

    • data.mendeley.com
    Updated Nov 27, 2017
    Cite
    Gurdal Ertek (2017). Wind Turbine Accident News (1980-2013) [Dataset]. http://doi.org/10.17632/jkjvmn9tz3.1
    Dataset updated
    Nov 27, 2017
    Authors
    Gurdal Ertek
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set includes 216 news items on 240 wind turbine accidents between the years 1980 and 2013. The analysis of this data set and the insights obtained are reported in the following research paper:

    Asian, S., Ertek, G., Haksoz, C., Pakter, S. and Ulun, S., 2017. Wind turbine accidents: A data mining study. IEEE Systems Journal, 11(3), pp.1567-1578.

    As of now, the most extensive data available on the Internet on wind turbine accidents is published by the Caithness Windfarm Information Forum (CWIF), a UK-based grassroots organization opposing wind turbine installations.

    While the Caithness list is impressive in magnitude, the quality and reliability of the list is open to discussion because of the following reason:

    • Many of the web links to the news sources are not valid, and some of the accidents appear in multiple lines of the data.

    Although they contain much more data, other online sources exhibit similar deficiencies.

    So, there are problems when it comes to using the Caithness data or other data in research studies. To this end, we collected data on wind turbine accidents ourselves, also using the data from Caithness and we share our collected data on this page (please click the link at the top of the page to download the data).

    The data we collected consists of three folders, and a MS Excel file.

    The folder News.txt contains the accident news, with each news item in a separate text file.

    The folder News.doc contains the news, with each news item in a separate MS Word file.

    Finally, the folder News.doc.with.notes contains the news, with each news item in a separate MS Word file, but with extensive comments explaining how the database in the MS Excel file was constructed.

    The MS Excel file News.Database.xlsx contains the structured data created from a detailed reading of the accident news text.

    The MS Excel file is the file that was analyzed in our research paper.

  12. ATP 312, NOTES ON THE PETROLEUM PROSPECTS, FOR DOMINION MINING AND OIL NL

    • data.qld.gov.au
    Updated May 10, 2023
    Cite
    Geological Survey of Queensland (2023). ATP 312, NOTES ON THE PETROLEUM PROSPECTS, FOR DOMINION MINING AND OIL NL [Dataset]. https://www.data.qld.gov.au/dataset/cr011048
    Dataset updated
    May 10, 2023
    Dataset authored and provided by
    Geological Survey of Queensland
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    URL: https://geoscience.data.qld.gov.au/dataset/cr011048

    ATP 312, NOTES ON THE PETROLEUM PROSPECTS, FOR DOMINION MINING AND OIL NL

  13. Vale: A Leader in the Mining Industry (Forecast)

    • kappasignal.com
    Updated Jun 3, 2023
    Cite
    KappaSignal (2023). Vale: A Leader in the Mining Industry (Forecast) [Dataset]. https://www.kappasignal.com/2023/06/vale-leader-in-mining-industry.html
    Dataset updated
    Jun 3, 2023
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Vale: A Leader in the Mining Industry

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  14. Practice-Based Evidence: Profiling the Safety of Cilostazol by Text-Mining...

    • plos.figshare.com
    xlsx
    Updated Jun 3, 2023
    Cite
    Nicholas J. Leeper; Anna Bauer-Mehren; Srinivasan V. Iyer; Paea LePendu; Cliff Olson; Nigam H. Shah (2023). Practice-Based Evidence: Profiling the Safety of Cilostazol by Text-Mining of Clinical Notes [Dataset]. http://doi.org/10.1371/journal.pone.0063499
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Nicholas J. Leeper; Anna Bauer-Mehren; Srinivasan V. Iyer; Paea LePendu; Cliff Olson; Nigam H. Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Peripheral arterial disease (PAD) is a growing problem with few available therapies. Cilostazol is the only FDA-approved medication with a class I indication for intermittent claudication, but carries a black box warning due to concerns for increased cardiovascular mortality. To assess the validity of this black box warning, we employed a novel text-analytics pipeline to quantify the adverse events associated with Cilostazol use in a clinical setting, including patients with congestive heart failure (CHF).

    Methods and Results

    We analyzed the electronic medical records of 1.8 million subjects from the Stanford clinical data warehouse spanning 18 years using a novel text-mining/statistical analytics pipeline. We identified 232 PAD patients taking Cilostazol and created a control group of 1,160 PAD patients not taking this drug using 1:5 propensity-score matching. Over a mean follow up of 4.2 years, we observed no association between Cilostazol use and any major adverse cardiovascular event including stroke (OR = 1.13, CI [0.82, 1.55]), myocardial infarction (OR = 1.00, CI [0.71, 1.39]), or death (OR = 0.86, CI [0.63, 1.18]). Cilostazol was not associated with an increase in any arrhythmic complication. We also identified a subset of CHF patients who were prescribed Cilostazol despite its black box warning, and found that it did not increase mortality in this high-risk group of patients.

    Conclusions

    This proof of principle study shows the potential of text-analytics to mine clinical data warehouses to uncover ‘natural experiments’ such as the use of Cilostazol in CHF patients. We envision this method will have broad applications for examining difficult to test clinical hypotheses and to aid in post-marketing drug safety surveillance. Moreover, our observations argue for a prospective study to examine the validity of a drug safety warning that may be unnecessarily limiting the use of an efficacious therapy.

  15. Anglo American: A Mining Titan's Next Chapter (AAL) (Forecast)

    • kappasignal.com
    Updated Nov 18, 2024
    Cite
    KappaSignal (2024). Anglo American: A Mining Titan's Next Chapter (AAL) (Forecast) [Dataset]. https://www.kappasignal.com/2024/11/anglo-american-mining-titans-next.html
    Dataset updated
    Nov 18, 2024
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Anglo American: A Mining Titan's Next Chapter (AAL)

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price to earnings P/E ratio, dividend yield, earnings per share EPS, price to earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price to sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution A/D line, parabolic SAR indicator, bollinger bands indicators, fibonacci, williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative trading Buy/Sell strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  16. Data from: BZ:TSXV Benz Mining Corp. (Forecast)

    • kappasignal.com
    Updated May 21, 2023
    Cite
    KappaSignal (2023). BZ:TSXV Benz Mining Corp. (Forecast) [Dataset]. https://www.kappasignal.com/2023/05/bztsxv-benz-mining-corp.html
    Explore at:
    Dataset updated
    May 21, 2023
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    BZ:TSXV Benz Mining Corp.

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price-to-earnings (P/E) ratio, dividend yield, earnings per share (EPS), price-to-earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price-to-sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, Aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution (A/D) line, parabolic SAR, Bollinger Bands, Fibonacci levels, Williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative buy/sell trading strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  17. Harmony Mining: Digging for Gold or Digging a Hole? (HMY) (Forecast)

    • kappasignal.com
    Updated Nov 25, 2025
    Cite
    KappaSignal (2025). Harmony Mining: Digging for Gold or Digging a Hole? (HMY) (Forecast) [Dataset]. https://www.kappasignal.com/2024/02/harmony-mining-digging-for-gold-or.html
    Explore at:
    Dataset updated
    Nov 25, 2025
    Dataset authored and provided by
    KappaSignal
    License

    https://www.kappasignal.com/p/legal-disclaimer.html

    Description

    This analysis presents a rigorous exploration of financial data, incorporating a diverse range of statistical features. By providing a robust foundation, it facilitates advanced research and innovative modeling techniques within the field of finance.

    Harmony Mining: Digging for Gold or Digging a Hole? (HMY)

    Financial data:

    • Historical daily stock prices (open, high, low, close, volume)

    • Fundamental data (e.g., market capitalization, price-to-earnings (P/E) ratio, dividend yield, earnings per share (EPS), price-to-earnings growth, debt-to-equity ratio, price-to-book ratio, current ratio, free cash flow, projected earnings growth, return on equity, dividend payout ratio, price-to-sales ratio, credit rating)

    • Technical indicators (e.g., moving averages, RSI, MACD, average directional index, Aroon oscillator, stochastic oscillator, on-balance volume, accumulation/distribution (A/D) line, parabolic SAR, Bollinger Bands, Fibonacci levels, Williams percent range, commodity channel index)

    Machine learning features:

    • Feature engineering based on financial data and technical indicators

    • Sentiment analysis data from social media and news articles

    • Macroeconomic data (e.g., GDP, unemployment rate, interest rates, consumer spending, building permits, consumer confidence, inflation, producer price index, money supply, home sales, retail sales, bond yields)

    Potential Applications:

    • Stock price prediction

    • Portfolio optimization

    • Algorithmic trading

    • Market sentiment analysis

    • Risk management

    Use Cases:

    • Researchers investigating the effectiveness of machine learning in stock market prediction

    • Analysts developing quantitative buy/sell trading strategies

    • Individuals interested in building their own stock market prediction models

    • Students learning about machine learning and financial applications

    Additional Notes:

    • The dataset may include different levels of granularity (e.g., daily, hourly)

    • Data cleaning and preprocessing are essential before model training

    • Regular updates are recommended to maintain the accuracy and relevance of the data

  18. Data from: Stillwater Complex, Montana—logs of core drilled by Stillwater...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 26, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Stillwater Complex, Montana—logs of core drilled by Stillwater Mining Company and Anaconda Copper Corp. in the Stillwater Mine area, 1983 to 1989 [Dataset]. https://catalog.data.gov/dataset/stillwater-complex-montanalogs-of-core-drilled-by-stillwater-mining-company-and-anaconda-c
    Explore at:
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This dataset includes TIFF (Tagged Image File Format) images of graphic drill core logs showing associated drill core information, a TIFF image of the explanation for the lithology and structure sections of the logs, an Esri shapefile of the locations of the drill holes, and 12 .csv files of tabular data that were compiled from handwritten drill core logs. The drill core is from the Stillwater Mine area of the Stillwater Complex, Montana, and was drilled from 1983 to 1989 by the Stillwater Mining Company and Anaconda Copper Corp. The data shown in the graphic drill logs and contained within the .csv files include lithologic, structure, percent recovery, grain size, sulfide, nickel, copper, platinum, and palladium mineralization information. The graphic drill logs were created using Golden Software's Strater 5 drill core visualization software and are provided with both logarithmic and linear scales where applicable. The graphic drill logs are plotted using the depth recorded in the drill logs and do not reflect stratigraphic true thickness. All instances of question marks ("?") represent original data as written by the geologist. In areas where the handwritten notes were unreadable, the notation "[unreadable]" was used. See USGS SIR 2014-5183 (https://pubs.usgs.gov/sir/2014/5183/) for the report and spatial data relating to the Stillwater Complex.

  19. Data from: The extent and consequences of p-hacking in science

    • figshare.mq.edu.au
    • search.dataone.org
    • +3 more
    bin
    Updated Jun 13, 2023
    Cite
    Megan L. Head; Luke Holman; Rob Lanfear; Andrew T. Kahn; Michael D. Jennions (2023). Data from: The extent and consequences of p-hacking in science [Dataset]. http://doi.org/10.5061/dryad.79d43
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    Macquarie University
    Authors
    Megan L. Head; Luke Holman; Rob Lanfear; Andrew T. Kahn; Michael D. Jennions
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.

    Usage Notes

    Data from: The extent and consequences of p-hacking in science

    This zip file consists of three parts. 1. Data obtained from text-mining and associated analysis files. 2. Data obtained from previously published meta-analyses and associated analysis files. 3. Analysis files used to conduct meta-analyses of the data. Read me files are contained within this zip file.

    FILES_FOR_DRYAD.zip

  20. BMP Areas Unsuitable for Mining - Petitioned

    • pa-geo-data-pennmap.hub.arcgis.com
    Updated Mar 11, 2025
    Cite
    PA Department of Environmental Protection (2025). BMP Areas Unsuitable for Mining - Petitioned [Dataset]. https://pa-geo-data-pennmap.hub.arcgis.com/datasets/PADEP-1::bmp-areas-unsuitable-for-mining-petitioned
    Explore at:
    Dataset updated
    Mar 11, 2025
    Dataset authored and provided by
    PA Department of Environmental Protection
    Description

    • OBJECTID: ObjectID
    • SHAPE: ESRI Geometry Field
    • NAME_PETITION: Name assigned to the area petitioned to be designated as unsuitable for mining.
    • PETITIONER: The entity that submitted the petition.
    • PETITIONID: An identification number assigned to the area petitioned to be designated as unsuitable for mining.
    • DATE_RECEIVED: The date the Department received the petition to designate the area as unsuitable for mining.
    • COUNTY: The county the area is located in.
    • PETITIONSTATUS: Current status of the petition review.
    • DATE_FINAL: The date the Department made a final action on the petition.
    • ACRES_PETITIONED: Acreage of area petitioned to be designated as unsuitable for mining.
    • ACRES_DESIGNATED: Acreage of area designated as unsuitable for mining during review.
    • ACRES_COAL: Acreage of coal field extents inside the area designated as unsuitable for mining.
    • ACRES_GIS: Acreage of area calculated in GIS using PA Albers Equal Area Conic projection.
    • SQMILE_GIS: Square miles of area calculated in GIS using PA Albers Equal Area Conic projection.
    • NOTES: Additional notes.
    • SHAPE.AREA: GIS Area in native map units.
    • SHAPE.LEN: Length/Perimeter in native map units.
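
The ACRES_GIS and SQMILE_GIS fields describe the same area in two units, so one can be checked against the other using the standard conversion of 640 acres per square mile. The sketch below assumes a record is available as a plain dictionary keyed by the field names listed above; the helper names, the tolerance, and the sample values are hypothetical and not part of the dataset.

```python
import math

ACRES_PER_SQUARE_MILE = 640.0  # standard definition of the square mile


def acres_to_square_miles(acres):
    """Convert an area in acres to square miles."""
    return acres / ACRES_PER_SQUARE_MILE


def area_fields_consistent(record, rel_tol=0.01):
    """Check that ACRES_GIS and SQMILE_GIS agree within `rel_tol`.

    `record` is assumed to be a dict keyed by the attribute names in the
    field list; the 1% default tolerance is an arbitrary illustrative choice.
    """
    expected = acres_to_square_miles(record["ACRES_GIS"])
    return math.isclose(expected, record["SQMILE_GIS"], rel_tol=rel_tol)


# Hypothetical record, for illustration only.
sample_record = {"ACRES_GIS": 1280.0, "SQMILE_GIS": 2.0}
print(area_fields_consistent(sample_record))
```

A check like this can flag rows where one of the two area fields was rounded heavily or left stale after a geometry edit.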
