7 datasets found
  1. Raw data of the ships of priority II in 2017

    • figshare.com
    • data.4tu.nl
    xlsx
    Updated Jul 28, 2020
    Cite
    Jiaqi Mu (2020). Raw data of the ships of priority II in 2017 [Dataset]. http://doi.org/10.4121/uuid:92a6ce98-5001-4768-bec1-49881a367d36
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    4TU.ResearchData
    Authors
    Jiaqi Mu
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    See the article "Targeting model based on principal component analysis and extreme learning machine" for the meaning of the data.

  2. Data_Sheet_1_Bioinformatics Analyses Determined the Distinct CNS and...

    • figshare.com
    • frontiersin.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Seiichi Omura; Fumitaka Sato; Nicholas E. Martinez; Ah-Mee Park; Mitsugu Fujita; Nikki J. Kennett; Urška Cvek; Alireza Minagar; J. Steven Alexander; Ikuo Tsunoda (2023). Data_Sheet_1_Bioinformatics Analyses Determined the Distinct CNS and Peripheral Surrogate Biomarker Candidates Between Two Mouse Models for Progressive Multiple Sclerosis.docx [Dataset]. http://doi.org/10.3389/fimmu.2019.00516.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Seiichi Omura; Fumitaka Sato; Nicholas E. Martinez; Ah-Mee Park; Mitsugu Fujita; Nikki J. Kennett; Urška Cvek; Alireza Minagar; J. Steven Alexander; Ikuo Tsunoda
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Previously, we have established two distinct progressive multiple sclerosis (MS) models by induction of experimental autoimmune encephalomyelitis (EAE) with myelin oligodendrocyte glycoprotein (MOG) in two mouse strains. A.SW mice develop ataxia with antibody deposition, but no T cell infiltration, in the central nervous system (CNS), while SJL/J mice develop paralysis with CNS T cell infiltration. In this study, we determined biomarkers contributing to the homogeneity and heterogeneity of two models. Using the CNS and spleen microarray transcriptome and cytokine data, we conducted computational analyses. We identified up-regulation of immune-related genes, including immunoglobulins, in the CNS of both models. Pro-inflammatory cytokines, interferon (IFN)-γ and interleukin (IL)-17, were associated with the disease progression in SJL/J mice, while the expression of both cytokines was detected only at the EAE onset in A.SW mice. Principal component analysis (PCA) of CNS transcriptome data demonstrated that down-regulation of prolactin may reflect disease progression. Pattern matching analysis of spleen transcriptome with CNS PCA identified 333 splenic surrogate markers, including Stfa2l1, which reflected the changes in the CNS. Among them, we found that two genes (PER1/MIR6883 and FKBP5) and one gene (SLC16A1/MCT1) were also significantly up-regulated and down-regulated, respectively, in human MS peripheral blood, using data mining.

  3. Credit Card Fraud Detection

    • kaggle.com
    zip
    Updated Mar 23, 2018
    Cite
    Machine Learning Group - ULB (2018). Credit Card Fraud Detection [Dataset]. https://www.kaggle.com/mlg-ulb/creditcardfraud
    Explore at:
    Available download formats: zip (69155672 bytes)
    Dataset updated
    Mar 23, 2018
    Dataset authored and provided by
    Machine Learning Group - ULB
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.

    Content

    The dataset contains transactions made with credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
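
    The imbalance figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the two counts come from the description, while the function and variable names are illustrative. The computed rate is roughly 0.17%, in line with the quoted percentage.

```python
# Counts quoted in the dataset description.
N_TRANSACTIONS = 284_807
N_FRAUDS = 492


def positive_rate(n_positive: int, n_total: int) -> float:
    """Return the fraction of samples that belong to the positive class."""
    return n_positive / n_total


fraud_rate = positive_rate(N_FRAUDS, N_TRANSACTIONS)

# Accuracy of a trivial classifier that labels every transaction as legitimate.
majority_baseline_accuracy = 1.0 - fraud_rate

print(f"fraud rate: {fraud_rate:.4%}")
print(f"always-legitimate accuracy: {majority_baseline_accuracy:.4%}")
```

    The second number shows why the imbalance matters: a model that never flags fraud is still right on more than 99.8% of transactions.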

    It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features or more background information about the data. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; it can be used, for example, for example-dependent cost-sensitive learning. Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.
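
    As a rough illustration of the layout described above, the sketch below builds the list of column names and splits one record into features and label. The column names come from the description; the column order, the helper names, and the all-zero example row are assumptions made for illustration only.

```python
def expected_columns(n_components: int = 28) -> list[str]:
    """Column names from the description: Time, V1..V28, Amount, Class.

    The order used here is an assumption, not something the description states.
    """
    components = [f"V{i}" for i in range(1, n_components + 1)]
    return ["Time"] + components + ["Amount", "Class"]


def split_row(row: dict[str, float]) -> tuple[list[float], int]:
    """Split one record into its 30 input features and its 0/1 class label."""
    feature_names = expected_columns()[:-1]  # everything except 'Class'
    features = [float(row[name]) for name in feature_names]
    label = int(row["Class"])
    return features, label


# Hypothetical record: every value is 0, i.e. a legitimate transaction.
example_row = {name: 0.0 for name in expected_columns()}
features, label = split_row(example_row)
print(len(expected_columns()), len(features), label)
```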

    Given the class imbalance ratio, we recommend measuring the accuracy using the Area Under the Precision-Recall Curve (AUPRC). Confusion matrix accuracy is not meaningful for unbalanced classification.
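
    The point about accuracy can be made concrete with a small pure-Python sketch. It uses average precision, one common step-wise estimate of the area under the precision-recall curve; this is an assumption about how to compute the metric, not necessarily the procedure the dataset authors used. The toy labels and scores are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive sample.

    Samples are ranked by descending score. Ties between a positive and a
    negative sample are not handled specially in this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive samples")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_pos


# Hypothetical toy data: 100 transactions, 2 of them fraudulent.
y_true = [0] * 100
y_true[49] = 1
y_true[99] = 1

always_legit = [0] * 100                      # never predicts fraud
poor_scores = [100 - i for i in range(100)]   # ranks the frauds 50th and 100th
good_scores = [float(label) for label in y_true]  # ranks both frauds first

acc_trivial = accuracy(y_true, always_legit)
ap_poor = average_precision(y_true, poor_scores)
ap_good = average_precision(y_true, good_scores)

print(f"accuracy of always-legitimate: {acc_trivial:.2f}")
print(f"average precision, poor ranking: {ap_poor:.2f}")
print(f"average precision, good ranking: {ap_good:.2f}")
```

    In this toy setting the classifier that never detects fraud still scores high on accuracy, while average precision clearly separates the poor ranking from the good one.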

    Update (03/05/2021)

    A simulator for transaction data has been released as part of the practical handbook on Machine Learning for Credit Card Fraud Detection - https://fraud-detection-handbook.github.io/fraud-detection-handbook/Chapter_3_GettingStarted/SimulatedDataset.html. We invite all practitioners interested in fraud detection datasets to also check out this data simulator, and the methodologies for credit card fraud detection presented in the book.

    Acknowledgements

    The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project.

    Please cite the following works:

    Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015

    Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon

    Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE

    Dal Pozzolo, Andrea. Adaptive Machine learning for credit card fraud detection, ULB MLG PhD thesis (supervised by G. Bontempi)

    Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier

    Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing

    Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019

    Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection, Information Sciences, 2019

    Yann-Aël Le Borgne, Gianluca Bontempi. Reproducible machine Learning for Credit Card Fraud Detection - Practical Handbook

    Bertrand Lebichot, Gianmarco Paldino, Wissam Siblini, Liyun He, Frederic Oblé, Gianluca Bontempi. Incremental learning strategies for credit cards fraud detection, International Journal of Data Science and Analytics

  4. MOESM9 of Feature optimization in high dimensional chemical space:...

    • springernature.figshare.com
    xlsx
    Updated May 31, 2023
    Cite
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C. (2023). MOESM9 of Feature optimization in high dimensional chemical space: statistical and data mining solutions [Dataset]. http://doi.org/10.6084/m9.figshare.6814010.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Additional file 9: Table S9. PubChem high throughput screen results of 3-(1H-1,3-Benzadiol-2-yl)quinoline and 2-(4-Methoxyphenyl)-7-methylimidazo[1,2-a]pyridine.

  5. Data Sheet 3_Statistical approach for the evaluation of household solid...

    • frontiersin.figshare.com
    csv
    Updated Feb 27, 2025
    Cite
    Jorge L. Padilla-Vento; Juan J. Soria (2025). Data Sheet 3_Statistical approach for the evaluation of household solid waste generation in Peruvian households: 2014–2021.csv [Dataset]. http://doi.org/10.3389/fams.2025.1498513.s003
    Explore at:
    Available download formats: csv
    Dataset updated
    Feb 27, 2025
    Dataset provided by
    Frontiers
    Authors
    Jorge L. Padilla-Vento; Juan J. Soria
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This study addresses the evaluation of the generation of domestic solid waste in Peruvian households using statistical techniques and the SEMMA and PCA data mining methodology. The objective is to explore how waste management, population, and the per capita generation (PCG) index influence the production of this waste in Peruvian departments. The sample was obtained from the database of annual reports submitted by district and provincial municipalities to MINAM through the Information System for Solid Waste Management (SIGERSOL), including data from the 24 departments of Peru, with a total of 14,852 records organized in 196 registration forms. Statistical techniques and an adaptation of the SEMMA methodology were applied together with Principal Component Analysis (PCA) to examine the impacts of the accumulation of household solid waste in Peru. This study showed that the first component accounts for 80.2% of the inertia. Combining the first two components accounts for 99.8% of the total variation, suggesting that most of the meaningful information can be maintained using only two dimensions. Welch’s ANOVA showed significant differences in domestic solid waste generation among Peruvian departments [F (6, 94.310) = 790.444; p = 0.0, p 

  6. MOESM11 of Feature optimization in high dimensional chemical space:...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C. (2023). MOESM11 of Feature optimization in high dimensional chemical space: statistical and data mining solutions [Dataset]. http://doi.org/10.6084/m9.figshare.6813827.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Additional file 11. The 14 training sets used for the study, which are derived from AID 1721, a high-throughput screened, confirmatory bioassay dataset on the pyruvate kinase protein target of Leishmania mexicana. The training sets are given as ARFF files and have 179 molecular descriptors generated using PowerMV.

  7. MOESM7 of Feature optimization in high dimensional chemical space:...

    • springernature.figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C. (2023). MOESM7 of Feature optimization in high dimensional chemical space: statistical and data mining solutions [Dataset]. http://doi.org/10.6084/m9.figshare.6813983.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jinuraj K. R.; Rakhila M.; Dhanalakshmi M.; Sajeev R.; Akshata Gad; Jayan K.; Muhammed Iqbal P.; Andrew Manuel; Abdul Jaleel U. C.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Additional file 7: Table S10. Weighted burden number descriptor values (PCAD) of FDA approved drugs and that of PubChem molecules which were enlisted in Table 3.
