100+ datasets found
  1. MOVIE CORRELATION ANALYSIS-2ND PROJECT

    • kaggle.com
    zip
    Updated Oct 8, 2023
    Cite
    srijanrawat86 (2023). MOVIE CORRELATION ANALYSIS-2ND PROJECT [Dataset]. https://www.kaggle.com/datasets/srijanrawat86/movie-correlation-analysis-2nd-project
    Explore at:
    Available download formats: zip (433664 bytes)
    Dataset updated
    Oct 8, 2023
    Authors
    srijanrawat86
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by srijanrawat86

    Released under CC0: Public Domain

    Contents

  2. Statistical analysis of co-occurrence patterns in microbial presence-absence...

    • plos.figshare.com
    html
    Updated May 30, 2023
    Cite
    Kumar P. Mainali; Sharon Bewick; Peter Thielen; Thomas Mehoke; Florian P. Breitwieser; Shishir Paudel; Arjun Adhikari; Joshua Wolfe; Eric V. Slud; David Karig; William F. Fagan (2023). Statistical analysis of co-occurrence patterns in microbial presence-absence datasets [Dataset]. http://doi.org/10.1371/journal.pone.0187132
    Explore at:
    Available download formats: html
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kumar P. Mainali; Sharon Bewick; Peter Thielen; Thomas Mehoke; Florian P. Breitwieser; Shishir Paudel; Arjun Adhikari; Joshua Wolfe; Eric V. Slud; David Karig; William F. Fagan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Drawing on a long history in macroecology, correlation analysis of microbiome datasets is becoming a common practice for identifying relationships or shared ecological niches among bacterial taxa. However, many of the statistical issues that plague such analyses in macroscale communities remain unresolved for microbial communities. Here, we discuss problems in the analysis of microbial species correlations based on presence-absence data. We focus on presence-absence data because this information is more readily obtainable from sequencing studies, especially for whole-genome sequencing, where abundance estimation is still in its infancy. First, we show how Pearson’s correlation coefficient (r) and Jaccard’s index (J)–two of the most common metrics for correlation analysis of presence-absence data–can contradict each other when applied to a typical microbiome dataset. In our dataset, for example, 14% of species-pairs predicted to be significantly correlated by r were not predicted to be significantly correlated using J, while 37.4% of species-pairs predicted to be significantly correlated by J were not predicted to be significantly correlated using r. Mismatch was particularly common among species-pairs with at least one rare species (
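    The disagreement between Pearson's r and Jaccard's J described above can be illustrated on a toy example. The sketch below is not the authors' analysis (which concerns significance testing on a real microbiome dataset); the two presence-absence vectors are hypothetical and only show that the raw metrics can point in different directions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def jaccard_index(x, y):
    """Shared presences divided by samples where at least one species is present."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 0.0


# Hypothetical presence (1) / absence (0) of two common species in 10 samples.
species_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
species_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1]

r = pearson_r(species_a, species_b)
j = jaccard_index(species_a, species_b)
print(f"Pearson r = {r:.3f}, Jaccard J = {j:.3f}")
```

    Here J is high (0.8) because shared presences dominate, while r is slightly negative because the only variation in each vector is an absence that the other species does not share.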

  3. Data from: WiBB: An integrated method for quantifying the relative...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    zip
    Updated Aug 20, 2021
    Cite
    Qin Li; Xiaojun Kou (2021). WiBB: An integrated method for quantifying the relative importance of predictive variables [Dataset]. http://doi.org/10.5061/dryad.xsj3tx9g1
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 20, 2021
    Dataset provided by
    Field Museum of Natural History
    Beijing Normal University
    Authors
    Qin Li; Xiaojun Kou
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    This dataset contains simulated datasets, empirical data, and R scripts described in the paper: “Li, Q. and Kou, X. (2021) WiBB: An integrated method for quantifying the relative importance of predictive variables. Ecography (DOI: 10.1111/ecog.05651)”.

    A fundamental goal of scientific research is to identify the underlying variables that govern crucial processes of a system. Here we proposed a new index, WiBB, which integrates the merits of several existing methods: a model-weighting method from information theory (Wi), a standardized regression coefficient method measured by β* (B), and a bootstrap resampling technique (B). We applied WiBB to simulated datasets with known correlation structures, for both linear models (LM) and generalized linear models (GLM), to evaluate its performance. We also applied two other methods, relative sum of weight (SWi) and standardized beta (β*), to compare their performance with that of the WiBB method in ranking predictor importance under various scenarios. We also applied WiBB to an empirical dataset in the plant genus Mimulus to select bioclimatic predictors of species’ presence across the landscape. Results in the simulated datasets showed that the WiBB method outperformed the β* and SWi methods in scenarios with small and large sample sizes, respectively, and that the bootstrap resampling technique significantly improved the discriminant ability. When testing WiBB in the empirical dataset with GLM, it sensibly identified four important predictors with high credibility out of six candidates in modeling geographical distributions of 71 Mimulus species. This integrated index has great advantages in evaluating predictor importance and hence reducing the dimensionality of data, without losing interpretive power. The simplicity of calculation of the new metric over more sophisticated statistical procedures makes it a handy method in the statistical toolbox.

    Methods: To simulate independent datasets (size = 1000), we adopted Galipaud et al.’s approach (2014) with custom modifications of the data.simulation function, which used the multivariate normal distribution function rmvnorm in the R package mvtnorm (v1.0-5, Genz et al. 2016). Each dataset was simulated with a preset correlation structure between a response variable (y) and four predictors (x1, x2, x3, x4). The first three (genuine) predictors were set to be strongly, moderately, and weakly correlated with the response variable, respectively (denoted by large, medium, small Pearson correlation coefficients, r), while the correlation between the response and the last (spurious) predictor was set to be zero. We simulated datasets with three levels of differences of correlation coefficients of consecutive predictors, where ∆r = 0.1, 0.2, 0.3, respectively. These three levels of ∆r resulted in three correlation structures between the response and four predictors: (0.3, 0.2, 0.1, 0.0), (0.6, 0.4, 0.2, 0.0), and (0.8, 0.6, 0.3, 0.0), respectively. We repeated the simulation procedure 200 times for each of three preset correlation structures (600 datasets in total), for LM fitting later. For GLM fitting, we modified the simulation procedures with additional steps, in which we converted the continuous response into binary data O (e.g., occurrence data having 0 for absence and 1 for presence). We tested the WiBB method, along with two other methods, relative sum of weight (SWi) and standardized beta (β*), to evaluate the ability to correctly rank predictor importance under various scenarios. The empirical dataset of 71 Mimulus species was compiled from their occurrence coordinates and the corresponding values extracted from climatic layers of the WorldClim dataset (www.worldclim.org), and we applied the WiBB method to infer important predictors for their geographical distributions.
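    As a rough illustration of simulating a response with a preset correlation structure, the sketch below builds one dataset for the structure (0.6, 0.4, 0.2, 0.0). It is not the authors' data.simulation function and does not use rmvnorm: it swaps in a simple linear construction using only the Python standard library, and it assumes the four predictors are mutually independent, which the description does not state. The function name simulate_dataset and the seed are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def simulate_dataset(target_r, n=1000, seed=0):
    """Draw n rows (y, [x1..xk]) where corr(y, xi) is target_r[i] in expectation.

    Assumption: predictors are independent standard normals, so
    y = sum(r_i * x_i) + sqrt(1 - sum(r_i^2)) * noise has unit variance.
    """
    residual_var = 1.0 - sum(r * r for r in target_r)
    if residual_var <= 0:
        raise ValueError("target correlations are infeasible with independent predictors")
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xs = [rng.gauss(0.0, 1.0) for _ in target_r]
        noise = rng.gauss(0.0, 1.0)
        y = sum(r * x for r, x in zip(target_r, xs)) + sqrt(residual_var) * noise
        rows.append((y, xs))
    return rows


target = (0.6, 0.4, 0.2, 0.0)
rows = simulate_dataset(target)
ys = [y for y, _ in rows]
sample_r = [pearson_r(ys, [xs[j] for _, xs in rows]) for j in range(len(target))]
print([round(r, 2) for r in sample_r])
```

    Under the independence assumption this construction cannot reproduce the structure (0.8, 0.6, 0.3, 0.0), because the squared correlations sum to more than 1; that case requires correlated predictors, as a full multivariate normal draw allows.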

  4. Evaluating Correlation Between Measurement Samples in Reverberation Chambers...

    • nist.gov
    • datasets.ai
    • +2more
    Updated Apr 6, 2023
    Cite
    National Institute of Standards and Technology (2023). Evaluating Correlation Between Measurement Samples in Reverberation Chambers Using Clustering [Dataset]. http://doi.org/10.18434/mds2-2986
    Explore at:
    Dataset updated
    Apr 6, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    License

    https://www.nist.gov/open/license

    Description

    Evaluating Correlation Between Measurement Samples in Reverberation Chambers Using Clustering

    Abstract: Traditionally, in reverberation chambers (RC), measurement autocorrelation or correlation-matrix methods have been applied to evaluate measurement correlation. In this article, we introduce the use of clustering based on correlative distance to group correlated measurements. We apply the method to measurements taken in an RC using one and two paddles to stir the electromagnetic fields and applying decreasing angular steps between consecutive paddle positions. The results using varying correlation threshold values demonstrate that the method calculates the number of effective samples and allows discerning outliers, i.e., uncorrelated measurements, and clusters of correlated measurements. This calculation method, if verified, will allow non-sequential stir sequence design and, thereby, reduce testing time.

    Keywords: Correlation, Pearson correlation coefficient (PCC), reverberation chambers (RC), mode-stirring samples, correlative distance, clustering analysis, adjacency matrix.
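    One simple way to group correlated measurements with a threshold can be sketched as follows. This is not the NIST procedure: it assumes a distance of 1 - |r| between measurements, thresholds it into an adjacency matrix, and treats connected components as clusters. The measurement vectors, the threshold, and the function name cluster_by_correlation are all hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cluster_by_correlation(measurements, max_distance=0.3):
    """Group measurement indices whose distance 1 - |r| is within max_distance.

    Builds a boolean adjacency matrix, then collects connected components
    with a depth-first search. Singleton clusters are uncorrelated outliers.
    """
    n = len(measurements)
    adjacency = [
        [
            i != j
            and 1.0 - abs(pearson_r(measurements[i], measurements[j])) <= max_distance
            for j in range(n)
        ]
        for i in range(n)
    ]
    clusters, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in range(n):
                if adjacency[node][other] and other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(component))
    return clusters


# Hypothetical measurement vectors (e.g. one per stirrer position).
measurements = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [1, -1, 0, -1, 1],
    [2, -2, 0, -2, 2],
    [0, 1, -2, 1, 0],
]
clusters = cluster_by_correlation(measurements)
print(clusters)
```

    In this toy input the first two vectors form one cluster, the next two form another, and the last is left as a singleton.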

  5. Correlation data

    • kaggle.com
    zip
    Updated Oct 26, 2021
    Cite
    Le Thi Diem Chau (2021). Correlation data [Dataset]. https://www.kaggle.com/lethidiemchau/correlation-data
    Explore at:
    Available download formats: zip (9417 bytes)
    Dataset updated
    Oct 26, 2021
    Authors
    Le Thi Diem Chau
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Le Thi Diem Chau

    Released under CC0: Public Domain

    Contents

  6. Music & Affect 2020 Dataset Study 2.csv

    • psycharchives.org
    Updated Sep 17, 2020
    Cite
    (2020). Music & Affect 2020 Dataset Study 2.csv [Dataset]. https://www.psycharchives.org/handle/20.500.12034/3089
    Explore at:
    Dataset updated
    Sep 17, 2020
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset for: Leipold, B. & Loepthien, T. (2021). Attentive and emotional listening to music: The role of positive and negative affect. Jahrbuch Musikpsychologie, 30. https://doi.org/10.5964/jbdgm.78

    In a cross-sectional study, associations of global affect with two ways of listening to music – attentive–analytical listening (AL) and emotional listening (EL) – were examined. More specifically, the degrees to which AL and EL are differentially correlated with positive and negative affect were examined. In Study 1, a sample of 1,291 individuals responded to questionnaires on listening to music, positive affect (PA), and negative affect (NA). We used the PANAS, which measures PA and NA as high arousal dimensions. AL was positively correlated with PA, EL with NA. Moderation analyses showed stronger associations between PA and AL when NA was low. Study 2 (499 participants) differentiated between three facets of affect and focused, in addition to PA and NA, on the role of relaxation. Similar to the findings of Study 1, AL was correlated with PA, EL with NA and PA. Moderation analyses indicated that the degree to which PA is associated with an individual's tendency to listen to music attentively depends on their degree of relaxation. In addition, the correlation between pleasant activation and EL was stronger for individuals who were more relaxed; for individuals who were less relaxed, the correlation between unpleasant activation and EL was stronger. In sum, the results demonstrate not only simple bivariate correlations, but also that the expected associations vary, depending on the different affective states. We argue that the results reflect a dual function of listening to music, which includes emotional regulation and information processing.

    Dataset Study 2

  7. A Bayesian method for detecting pairwise associations in compositional data

    • plos.figshare.com
    docx
    Updated May 30, 2023
    Cite
    Emma Schwager; Himel Mallick; Steffen Ventz; Curtis Huttenhower (2023). A Bayesian method for detecting pairwise associations in compositional data [Dataset]. http://doi.org/10.1371/journal.pcbi.1005852
    Explore at:
    Available download formats: docx
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Emma Schwager; Himel Mallick; Steffen Ventz; Curtis Huttenhower
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Compositional data consist of vectors of proportions normalized to a constant sum from a basis of unobserved counts. The sum constraint makes inference on correlations between unconstrained features challenging due to the information loss from normalization. However, such correlations are of long-standing interest in fields including ecology. We propose a novel Bayesian framework (BAnOCC: Bayesian Analysis of Compositional Covariance) to estimate a sparse precision matrix through a LASSO prior. The resulting posterior, generated by MCMC sampling, allows uncertainty quantification of any function of the precision matrix, including the correlation matrix. We also use a first-order Taylor expansion to approximate the transformation from the unobserved counts to the composition in order to investigate what characteristics of the unobserved counts can make the correlations more or less difficult to infer. On simulated datasets, we show that BAnOCC infers the true network as well as previous methods while offering the advantage of posterior inference. Larger and more realistic simulated datasets further showed that BAnOCC performs well as measured by type I and type II error rates. Finally, we apply BAnOCC to a microbial ecology dataset from the Human Microbiome Project, which in addition to reproducing established ecological results revealed unique, competition-based roles for Proteobacteria in multiple distinct habitats.
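    The effect of the sum constraint on correlations can be seen in the smallest possible case. The sketch below is not BAnOCC; it only shows, with hypothetical counts for two features, that normalizing to proportions forces a correlation of -1 between them regardless of how the underlying counts are related.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical unobserved counts of two features across five samples.
counts_a = [10, 20, 30, 40, 50]
counts_b = [5, 50, 10, 40, 20]

# Closure: divide each sample by its total so the two proportions sum to 1.
totals = [a + b for a, b in zip(counts_a, counts_b)]
prop_a = [a / t for a, t in zip(counts_a, totals)]
prop_b = [b / t for b, t in zip(counts_b, totals)]

r_counts = pearson_r(counts_a, counts_b)
r_props = pearson_r(prop_a, prop_b)
print(f"counts: r = {r_counts:.3f}, proportions: r = {r_props:.3f}")
```

    The counts are only weakly correlated, but because the second proportion is always one minus the first, the proportions are perfectly negatively correlated. With more features the distortion is less extreme but still present, which is why dedicated methods are needed.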

  8. Correlation analysis.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1more
    Updated Oct 29, 2020
    Cite
    Bendszus, Martin; Burger, Astrid; Hoffmann, Jürgen; Möhlenbruch, Markus A.; Günther, Patrick; Kargus, Steffen; Gebhart, Philipp; Kühle, Reinald; Vollherbst, Dominik F. (2020). Correlation analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000451487
    Explore at:
    Dataset updated
    Oct 29, 2020
    Authors
    Bendszus, Martin; Burger, Astrid; Hoffmann, Jürgen; Möhlenbruch, Markus A.; Günther, Patrick; Kargus, Steffen; Gebhart, Philipp; Kühle, Reinald; Vollherbst, Dominik F.
    Description

    Correlation analysis.

  9. Partner expectations survey dataset

    • kaggle.com
    zip
    Updated Mar 20, 2024
    Cite
    Tharun (2024). Partner expectations survey dataset [Dataset]. https://www.kaggle.com/datasets/tharunprabu/partner-preference
    Explore at:
    Available download formats: zip (1328 bytes)
    Dataset updated
    Mar 20, 2024
    Authors
    Tharun
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    📊 Partner's Expectations Survey Dataset: Insights from Ongoing Exploration

    Embark on an insightful journey into relationship dynamics with this evolving dataset! With a small initial set of responses, the "Relationship Predictor Survey" offers a snapshot of the diverse factors influencing romantic connections. As I continue to collect data, this dataset will grow, providing a unique opportunity for ongoing analysis and exploration.

    Key Features:

    🌐 Ongoing Updates: I'm committed to regularly updating this dataset with new responses, expanding the scope of my exploration.

    🧩 Initial Insights: While currently modest, the dataset already encompasses a variety of factors, including social skills, personality traits, interests, and more.

    🚀 Community Collaboration: Join me in unraveling the nuances of relationships by contributing to and engaging with this evolving dataset.

    How to Use:

    📈 Track Changes: Stay tuned for updates as I will add more responses over time.

    🤝 Collaborate: Share your own insights and analyses to enrich the collective understanding.

    📑 Flexible Research: Use the dataset for ongoing research projects or personal exploration.

    Acknowledgments: A sincere thank you to the initial participants who kick started this project. Your input lays the foundation for a growing resource that benefits the community.

    Please help me gather more data: https://forms.gle/xJ7W6SRH917HLMsaA

    Join me in this continuous exploration of relationships. As I gather more responses, the dataset will become a dynamic resource for insights and discussions. Happy analyzing! 🌱

  10. Pearson correlation matrix.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1more
    Updated Jul 29, 2015
    Cite
    Beh, Loo-See; Masum, Abdul Kadar Muhammad; Hoque, Kazi Enamul; Azad, Abul Kalam (2015). Pearson correlation matrix. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001878400
    Explore at:
    Dataset updated
    Jul 29, 2015
    Authors
    Beh, Loo-See; Masum, Abdul Kadar Muhammad; Hoque, Kazi Enamul; Azad, Abul Kalam
    Description

    Pearson correlation matrix.

  11. Data from: Correlation table.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1more
    Updated Feb 23, 2022
    Cite
    Parks-Stamm, Elizabeth J.; Damanskyy, Yevhen; Martiny-Huenger, Torsten (2022). Correlation table. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000439797
    Explore at:
    Dataset updated
    Feb 23, 2022
    Authors
    Parks-Stamm, Elizabeth J.; Damanskyy, Yevhen; Martiny-Huenger, Torsten
    Description

    Correlation table.

  12. Event-correlated Outage Dataset in America

    • data.openei.org
    • s.cnmilf.com
    • +1more
    archive +2
    Updated Oct 1, 2024
    Cite
    Buxin She; Veronica Adetola; Ji Young Yun (2024). Event-correlated Outage Dataset in America [Dataset]. https://data.openei.org/submissions/6458
    Explore at:
    Available download formats: archive, text_document, website
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    United States Department of Energy (http://energy.gov/)
    Open Energy Data Initiative (OEDI)
    Pacific Northwest National Laboratory
    Authors
    Buxin She; Veronica Adetola; Ji Young Yun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset includes an aggregated and event-correlated analysis of power outages in the United States, synthesized by integrating three data sources: the Environment for the Analysis of Geo-Located Energy Information (EAGLE-I), the Electric Emergency Incident Disturbance Report (DOE-417), and Annual Estimates of the Resident Population for Counties 2024 (CO-EST2024-POP). The EAGLE-I dataset, spanning from 2014 to 2023, encompasses over 146 million customers and offers county-level outage information at 15-minute intervals. The data has been processed, filtered, and aggregated to deliver an enhanced perspective on power outages, which are then correlated with DOE-417 data based on geographic location as well as the start and end times of events. For each major disturbance documented in DOE-417, essential metrics are defined to quantify the outages associated with the event. This dataset supports researchers in examining outages triggered by major disturbances like extreme weather and physical disruptions, thereby aiding studies on power system resilience.

    Links to the raw data for generating the correlated dataset are included below as "DOE-417", "EAGLE-I", and "CO-EST2024-POP" resources.

    Acknowledgement: This work is funded by the Laboratory Directed Research and Development (LDRD) at the Pacific Northwest National Laboratory (PNNL) as part of the Resilience Through Data-Driven, Intelligently Designed Control (RD2C) Initiative.

  13. Correlation Analysis of TLRA with Species-Discovery Coverage (Ground Truth...

    • figshare.unimelb.edu.au
    • figshare.com
    xlsx
    Updated Aug 4, 2025
    Cite
    Anandi Karunaratne; Artem Polyvyanyy (2025). Correlation Analysis of TLRA with Species-Discovery Coverage (Ground Truth and Estimation) and Log-System Recall: A Comparative Evaluation Dataset [Dataset]. http://doi.org/10.26188/26410747.v3
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 4, 2025
    Dataset provided by
    The University of Melbourne
    Authors
    Anandi Karunaratne; Artem Polyvyanyy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports the study of correlations between Trace-based Log Representativeness Approximation (TLRA) and three measures: species-discovery-based coverage (both estimated and ground truth) and log-system recall. The analysis was conducted across event logs of 60 generative systems and varying log sizes and noise levels.

    Version 1: Focuses on the correlation analysis between TLRA and species-discovery-based coverage estimation (as presented in ieeexplore.ieee.org/document/10680679).

    Version 2: Extends the analysis by incorporating log-system recall.

    Version 3: Includes a ground truth analysis using sample coverage.

    The systems and logs used for this analysis are available for download in our GitHub repository.

    We kindly request that you cite our work if you use this dataset in your research: A. Karunaratne, A. Polyvyanyy, and A. Moffat, “The role of log representativeness in estimating generalization in process mining,” in Int. Conf. Process Mining. IEEE, 2024, pp. 33-40.

  14. Data from: Example Groundwater-Level Datasets and Benchmarking Results for...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 27, 2025
    Cite
    U.S. Geological Survey (2025). Example Groundwater-Level Datasets and Benchmarking Results for the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) Software Package [Dataset]. https://catalog.data.gov/dataset/example-groundwater-level-datasets-and-benchmarking-results-for-the-automated-regional-cor
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This data release provides two example groundwater-level datasets used to benchmark the Automated Regional Correlation Analysis for Hydrologic Record Imputation (ARCHI) software package (Levy and others, 2024). The first dataset contains groundwater-level records and site metadata for wells located on Long Island, New York (NY) and some surrounding mainland sites in New York and Connecticut. The second dataset contains groundwater-level records and site metadata for wells located in the southeastern San Joaquin Valley of the Central Valley, California (CA). For ease of exposition these are referred to as NY and CA datasets, respectively. Both datasets are formatted with column headers that can be read by the ARCHI software package within the R computing environment. These datasets were used to benchmark the imputation accuracy of three ARCHI model settings (OLS, ridge, and MOVE.1) against the widely used imputation program missForest (Stekhoven and Bühlmann, 2012). The ARCHI program was used to process the NY and CA datasets on monthly and annual timesteps, respectively, filter out sites with insufficient data for imputation, and create 200 test datasets from each of the example datasets with 5 percent of observations removed at random (herein, referred to as "holdouts"). Imputation accuracy for test datasets was assessed using normalized root mean square error (NRMSE), which is the root mean square error divided by the standard deviation of the observed holdout values. ARCHI produces prediction intervals (PIs) using a non-parametric bootstrapping routine, which were assessed by computing a coverage rate (CR) defined as the proportion of holdout observations falling within the estimated PI. 
The multiple regression models included with the ARCHI package (OLS and ridge) were further tested on all test datasets at eleven different levels of the p_per_n input parameter, which limits the maximum ratio of regression model predictors (p) per observations (n) as a decimal fraction greater than zero and less than or equal to one. This data release contains ten tables formatted as tab-delimited text files. The “CA_data.txt” and “NY_data.txt” tables contain 243,094 and 89,997 depth-to-groundwater measurement values (value, in feet below land surface) indexed by site identifier (site_no) and measurement date (date) for CA and NY datasets, respectively. The “CA_sites.txt” and “NY_sites.txt” tables contain site metadata for the 4,380 and 476 unique sites included in the CA and NY datasets, respectively. The “CA_NRMSE.txt” and “NY_NRMSE.txt” tables contain NRMSE values computed by imputing 200 test datasets with 5 percent random holdouts to assess imputation accuracy for three different ARCHI model settings and missForest using CA and NY datasets, respectively. The “CA_CR.txt” and “NY_CR.txt” tables contain CR values used to evaluate non-parametric PIs generated by bootstrapping regressions with three different ARCHI model settings using the CA and NY test datasets, respectively. The “CA_p_per_n.txt” and “NY_p_per_n.txt” tables contain mean NRMSE values computed for 200 test datasets with 5 percent random holdouts at 11 different levels of p_per_n for OLS and ridge models compared to training error for the same models on the entire CA and NY datasets, respectively. References Cited Levy, Z.F., Stagnitta, T.J., and Glas, R.L., 2024, ARCHI: Automated Regional Correlation Analysis for Hydrologic Record Imputation, v1.0.0: U.S. Geological Survey software release, https://doi.org/10.5066/P1VVHWKE. Stekhoven, D.J., and Bühlmann, P., 2012, MissForest—non-parametric missing value imputation for mixed-type data: Bioinformatics 28(1), 112-118. 
https://doi.org/10.1093/bioinformatics/btr597.
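    The two accuracy measures defined above (NRMSE and coverage rate) can be sketched in a few lines. This is not ARCHI code: the function names and the holdout values are hypothetical, the sample standard deviation is an assumption (the description does not say whether sample or population SD is used), and interval bounds are treated as inclusive.

```python
from math import sqrt
from statistics import stdev


def nrmse(observed, imputed):
    """Root mean square error divided by the standard deviation of the observed values."""
    rmse = sqrt(sum((o - p) ** 2 for o, p in zip(observed, imputed)) / len(observed))
    return rmse / stdev(observed)  # assumption: sample SD (n - 1 denominator)


def coverage_rate(observed, lower, upper):
    """Proportion of observed holdout values that fall within their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical holdout observations, imputed values, and prediction-interval bounds.
observed = [10.0, 12.0, 14.0, 16.0]
imputed = [11.0, 12.0, 13.0, 16.0]
lower = [9.0, 12.5, 12.0, 15.0]
upper = [12.0, 14.0, 15.0, 17.0]

score = nrmse(observed, imputed)
cr = coverage_rate(observed, lower, upper)
print(f"NRMSE = {score:.3f}, CR = {cr:.2f}")
```

    In this toy input three of the four holdout values fall inside their intervals, so the coverage rate is 0.75.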

  15. Dataset of correlation coefficients among UN each member state’s SDGs...

    • scidb.cn
    Updated Jan 31, 2021
    Cite
    Gao Tian; Zhang Lili; Li Jianhui (2021). Dataset of correlation coefficients among UN each member state’s SDGs indicator pairs during 2000 – 2017 [Dataset]. http://doi.org/10.11922/sciencedb.j00001.00217
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jan 31, 2021
    Dataset provided by
    Science Data Bank
    Authors
    Gao Tian; Zhang Lili; Li Jianhui
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United Nations
    Description

    This dataset mainly includes the correlation coefficient table (.csv) of the SDGs indicator pairs for all 193 member states of the United Nations from 2000 to 2017, and data visualization images (.png) for 20 selected countries in the southern hemisphere. These data are saved as a cab format file (.cab). The dataset can be used by the United Nations as analysis data for assessing the future realization of the Sustainable Development Goals, and as a reference for countries to monitor the completion of indicators and formulate relevant policies.

  16. Correlation matrix of the study variables.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Sep 6, 2023
    Cite
    Bianco, Antonino; Palma, Antonio; Tabacchi, Garden; Thomas, Ewan; Bellafiore, Marianna; Scardina, Antonino; Agnese, Massimiliano; Navarra, Giovanni Angelo (2023). Correlation matrix of the study variables. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000994220
    Explore at:
    Dataset updated
    Sep 6, 2023
    Authors
    Bianco, Antonino; Palma, Antonio; Tabacchi, Garden; Thomas, Ewan; Bellafiore, Marianna; Scardina, Antonino; Agnese, Massimiliano; Navarra, Giovanni Angelo
    Description

    Pearson’s correlation coefficients were used for quantitative variables and normally distributed variables. Spearman’s correlation coefficients were used for categorical variables and not normally distributed variables.
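    For readers comparing the two coefficients named above, Spearman's coefficient can be computed as Pearson's coefficient applied to ranks. The sketch below uses hypothetical data and helper names (average_ranks, spearman_rho); it says nothing about the values in this dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but strongly non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

    Because y increases whenever x does, the rank correlation is 1 even though the Pearson coefficient is pulled below 1 by the outlying last value.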

  17. Symmetric diffeomorphic image registration with cross-correlation - Dataset...

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). Symmetric diffeomorphic image registration with cross-correlation - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/symmetric-diffeomorphic-image-registration-with-cross-correlation
    Explore at:
    Dataset updated
    Dec 2, 2024
    Description

    Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain.

  18. Social Media and Mental Health

    • kaggle.com
    zip
    Updated Jul 18, 2023
    Cite
    SouvikAhmed071 (2023). Social Media and Mental Health [Dataset]. https://www.kaggle.com/datasets/souvikahmed071/social-media-and-mental-health
    Explore at:
    Available download formats: zip (10944 bytes)
    Dataset updated
    Jul 18, 2023
    Authors
    SouvikAhmed071
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    This dataset was originally collected for a data science and machine learning project that aimed at investigating the potential correlation between the amount of time an individual spends on social media and the impact it has on their mental health.

    The project involves conducting a survey to collect data, organizing the data, and using machine learning techniques to create a predictive model that can determine whether a person should seek professional help based on their answers to the survey questions.

    This project was completed as part of a Statistics course at a university, and the team is currently in the process of writing a report and completing a paper that summarizes and discusses the findings in relation to other research on the topic.

    The following is the Google Colab link to the project, done on Jupyter Notebook -

    https://colab.research.google.com/drive/1p7P6lL1QUw1TtyUD1odNR4M6TVJK7IYN

    The following is the GitHub Repository of the project -

    https://github.com/daerkns/social-media-and-mental-health

    Libraries used for the Project -

    pandas
    NumPy
    Matplotlib
    seaborn
    scikit-learn
    
  19. Generalized Correlation Tables

    • catalog.data.gov
    • data.virginia.gov
    • +1more
    Updated Sep 6, 2025
    Cite
    Substance Abuse and Mental Health Services Administration (2025). Generalized Correlation Tables [Dataset]. https://catalog.data.gov/dataset/generalized-correlation-tables
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
    Description

    Zipped file and Excel correlation tables

  20. SiMES dataset: Pearson correlation coefficients (r) and p-values for the...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Matt Silver; Peng Chen; Ruoying Li; Ching-Yu Cheng; Tien-Yin Wong; E-Shyong Tai; Yik-Ying Teo; Giovanni Montana (2023). SiMES dataset: Pearson correlation coefficients (r) and p-values for the data plotted in Figure 15. [Dataset]. http://doi.org/10.1371/journal.pgen.1003939.t009
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Matt Silver; Peng Chen; Ruoying Li; Ching-Yu Cheng; Tien-Yin Wong; E-Shyong Tai; Yik-Ying Teo; Giovanni Montana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Refer to Table 6 for details.
