87 datasets found
  1. Statistical analysis of co-occurrence patterns in microbial presence-absence...

    • plos.figshare.com
    html
    Updated May 30, 2023
    Cite
    Kumar P. Mainali; Sharon Bewick; Peter Thielen; Thomas Mehoke; Florian P. Breitwieser; Shishir Paudel; Arjun Adhikari; Joshua Wolfe; Eric V. Slud; David Karig; William F. Fagan (2023). Statistical analysis of co-occurrence patterns in microbial presence-absence datasets [Dataset]. http://doi.org/10.1371/journal.pone.0187132
    Available download formats
    html
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kumar P. Mainali; Sharon Bewick; Peter Thielen; Thomas Mehoke; Florian P. Breitwieser; Shishir Paudel; Arjun Adhikari; Joshua Wolfe; Eric V. Slud; David Karig; William F. Fagan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Drawing on a long history in macroecology, correlation analysis of microbiome datasets is becoming a common practice for identifying relationships or shared ecological niches among bacterial taxa. However, many of the statistical issues that plague such analyses in macroscale communities remain unresolved for microbial communities. Here, we discuss problems in the analysis of microbial species correlations based on presence-absence data. We focus on presence-absence data because this information is more readily obtainable from sequencing studies, especially for whole-genome sequencing, where abundance estimation is still in its infancy. First, we show how Pearson’s correlation coefficient (r) and Jaccard’s index (J), two of the most common metrics for correlation analysis of presence-absence data, can contradict each other when applied to a typical microbiome dataset. In our dataset, for example, 14% of species-pairs predicted to be significantly correlated by r were not predicted to be significantly correlated using J, while 37.4% of species-pairs predicted to be significantly correlated by J were not predicted to be significantly correlated using r. Mismatch was particularly common among species-pairs with at least one rare species (
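    The disagreement between r and J described above can be illustrated with a minimal sketch. The presence-absence vectors below are hypothetical and are not taken from the dataset; the function names are illustrative only. On 0/1 data, Pearson's r counts shared absences as agreement, whereas Jaccard's index ignores them.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length 0/1 vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def jaccard(x, y):
    """Shared presences divided by samples where at least one species is present."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either

# Hypothetical pair of rare species (each present in 2 of 10 samples).
rare_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rare_b = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical pair of common species (each present in 8 of 10 samples).
common_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
common_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

for label, a, b in [("rare", rare_a, rare_b), ("common", common_a, common_b)]:
    print(label, round(pearson_r(a, b), 3), round(jaccard(a, b), 3))
```

    Both hypothetical pairs give r = 0.375, while J is 1/3 for the rare pair and 7/9 for the common pair. The sketch covers only the two metrics themselves, not the significance tests used in the study.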

  2. Data from: Climate Prediction Center (CPC)Ensemble Canonical Correlation...

    • data.cnra.ca.gov
    • data.wu.ac.at
    Updated Mar 1, 2023
    Cite
    National Oceanic and Atmospheric Administration (2023). Climate Prediction Center (CPC)Ensemble Canonical Correlation Analysis 90-Day Seasonal Forecast of Precipitation [Dataset]. https://data.cnra.ca.gov/dataset/climate-prediction-center-cpcensemble-canonical-correlation-analysis-90-day-seasonal-forecast-o
    Dataset updated
    Mar 1, 2023
    Dataset authored and provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    The Ensemble Canonical Correlation Analysis (ECCA) precipitation forecast is a 90-day (seasonal) outlook of US surface precipitation anomalies. The ECCA uses Canonical Correlation Analysis (CCA), an empirical statistical method that finds patterns of predictors (variables used to make the prediction) and predictands (variables to be predicted) that maximize the correlation between them. The most recent available predictor data for different atmospheric/oceanic variables are projected onto the loading patterns to create forecasts. The ensemble refers to forecasts produced by using each predictor separately to create a forecast. The final forecast is an equally weighted average of the ensemble of forecasts. The model is trained from 1953 to the year before the present year to create the loading patterns. The available forecasts are rotated, meaning that only the most recently created forecasts are available. Previously made forecasts are not archived. For each produced forecast, 13 different leads are created (0.5 months, with an increment of 1 month for each subsequent lead), each with a different valid date.
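    Two pieces of the description above lend themselves to a short sketch: the 13 lead times and the equally weighted ensemble average. The member forecasts below are hypothetical numbers, and the function name is illustrative; the CCA step itself is not reproduced here.

```python
# Lead times as described: 13 leads, starting at 0.5 months, stepping by 1 month.
leads_months = [0.5 + i for i in range(13)]

def ensemble_mean(member_forecasts):
    """Equally weighted average across ensemble members, grid point by grid point."""
    n_members = len(member_forecasts)
    return [sum(values) / n_members for values in zip(*member_forecasts)]

# Hypothetical anomaly forecasts from three single-predictor members
# at four grid points (values are made up for illustration).
members = [
    [0.2, -0.1, 0.4, 0.0],
    [0.0, -0.3, 0.2, 0.3],
    [0.4, 0.1, 0.0, 0.3],
]

forecast = ensemble_mean(members)
print(leads_months)
print([round(v, 3) for v in forecast])
```

    Each member stands in for a forecast built from one predictor; the final forecast gives every member the same weight.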

  3. Data from: Clinical correlation of multiple sclerosis immunopathological...

    • datadryad.org
    zip
    Updated Nov 15, 2021
    Cite
    W. Oliver Tobin (2021). Clinical correlation of multiple sclerosis immunopathological subtypes [Dataset]. http://doi.org/10.5061/dryad.5x69p8d2t
    Available download formats
    zip
    Dataset updated
    Nov 15, 2021
    Dataset provided by
    Dryad
    Authors
    W. Oliver Tobin
    Time period covered
    Apr 22, 2021
    Description

    Objective: To compare clinical characteristics across immunopathological subtypes of patients with multiple sclerosis.

    Methods: Immunopathological subtyping was performed on specimens from 547 patients with biopsy and/or autopsy confirmed CNS demyelination.

    Results: The frequencies of immunopathological subtypes were pattern I (23%), II (56%), and III (22%). Immunopatterns were similar in terms of age at autopsy/biopsy (median age 41 years, range 4-83 years, p=0.16) and proportion female (54%, p=0.71). Median follow-up after symptom onset was 2.3 years (range 0-38y). In addition to being overrepresented among autopsy cases (45% vs. 19% in biopsy cohort, p<0.001), index attack-related disability was higher in pattern III vs. pattern II (median EDSS 4 vs. 3, p=0.02). Monophasic clinical course was more common in patients with pattern III than pattern I or II (59% vs. 33% vs. 32%, p<0.001). Similarly, patients with pattern III pathology were likely to have progressive disease compare...

  4. Data from: Multivariate estimate of eating patterns: is the whole different...

    • scielo.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Iolanda Karla Santana dos Santos; Wolney Lisbôa Conde; Alicia Matijasevich Manitto (2023). Multivariate estimate of eating patterns: is the whole different from the parts? [Dataset]. http://doi.org/10.6084/m9.figshare.14321486.v1
    Available download formats
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Iolanda Karla Santana dos Santos; Wolney Lisbôa Conde; Alicia Matijasevich Manitto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT: Objective: To describe the correlations between eating patterns for the years 2007 to 2012, and for each year of the period from 2007 to 2012. Method: Cross-sectional study with data from the System of Surveillance of Risk and Protection Factors to Chronic Diseases by Telephone Survey with the selection of 167,761 individuals aged 18 to 44 years old. Eating patterns were identified with a Principal Component Analysis. To compare the effects of the extraction and the estimate of eating patterns among different surveys we conducted the following analyses: in the first, we used the total data set for the years from 2007 to 2012; in the second, the patterns were estimated in each annual set of data for the period from 2007 to 2012. Steps 1 and 2 were performed with no rotation, with Varimax rotation and with Promax rotation. After extracting the patterns, standardized scores with zero mean were generated for each pattern. The association between the patterns generated in the analyses was estimated by the Pearson correlation coefficient (r). Results: In the non-rotated analyses, the components retained in the set presented correlations that were higher than 0.90, with the retained patterns in each year. In the rotated analyses, only the first component had correlations that were higher than 0.90. Conclusion: Estimates of eating patterns either segmented - year by year - or in general - all of the years - showed high correlation and consistency between the patterns identified when in the same data pool.

  5. Correlation between sequence divergence and polymorphism reveals similar...

    • datasetcatalog.nlm.nih.gov
    • data.niaid.nih.gov
    • +1 more
    Updated Dec 31, 2014
    Cite
    Sloan, Daniel B.; Galloway, Laura F.; Barnard-Kubow, Karen B. (2014). Correlation between sequence divergence and polymorphism reveals similar evolutionary mechanisms acting across multiple timescales in a rapidly evolving plastid genome [Dataset]. http://doi.org/10.5061/dryad.d143r
    Dataset updated
    Dec 31, 2014
    Authors
    Sloan, Daniel B.; Galloway, Laura F.; Barnard-Kubow, Karen B.
    Description

    Background: Although the plastid genome is highly conserved across most angiosperms, multiple lineages have increased rates of structural rearrangement and nucleotide substitution. These lineages exhibit an excess of nonsynonymous substitutions (i.e., elevated dN/dS ratios) in similar subsets of plastid genes, suggesting that similar mechanisms may be leading to relaxed and/or positive selection on these genes. However, little is known regarding whether these mechanisms continue to shape sequence diversity at the intraspecific level. Results: We examined patterns of interspecific divergence and intraspecific polymorphism in the plastid genome of Campanulastrum americanum, and across plastid genes found a significant correlation between dN/dS and pN/pS (i.e., the within-species equivalent of dN/dS). A number of genes including ycf1, ycf2, clpP, and ribosomal protein genes exhibited high dN/dS ratios. McDonald-Kreitman tests detected little evidence for positive selection acting on these genes, likely due to the presence of substantial intraspecific divergence. Large-scale structural variation was also observed between populations. Conclusions: These results suggest that mechanisms leading to structural rearrangements and increased nucleotide substitution rates in the plastid genome are continuing to act at the intraspecific level. Accelerated plastid genome evolution may increase the likelihood of intraspecific cytonuclear genetic incompatibilities, and thereby contribute to the early stages of the speciation process.

  6. Data from: Large-Scale Analysis of Gene Expression and Connectivity in the...

    • borealisdata.ca
    • search.dataone.org
    Updated Mar 11, 2019
    Cite
    Leon French; Powell Patrick Cheng Tan; Paul Pavlidis (2019). Large-Scale Analysis of Gene Expression and Connectivity in the Rodent Brain: Insights through Data Integration [Dataset]. http://doi.org/10.5683/SP2/TB4AMV
    Dataset updated
    Mar 11, 2019
    Dataset provided by
    Borealis
    Authors
    Leon French; Powell Patrick Cheng Tan; Paul Pavlidis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    NSERC, NIH, MSFHR, CIHR
    Description

    Recent research in C. elegans and the rodent has identified correlations between gene expression and connectivity. Here we extend this type of approach to examine complex patterns of gene expression in the rodent brain in the context of regional brain connectivity and differences in cellular populations. Using multiple large-scale data sets obtained from public sources, we identified two novel patterns of mouse brain gene expression showing a strong degree of anti-correlation, and relate this to multiple data modalities including macroscale connectivity. We found that these signatures are associated with differences in expression of neuronal and oligodendrocyte markers, suggesting they reflect regional differences in cellular populations. We also find that the expression level of these genes is correlated with connectivity degree, with regions expressing the neuron-enriched pattern having more incoming and outgoing connections with other regions. Our results exemplify what is possible when increasingly detailed large-scale cell- and gene-level data sets are integrated with connectivity data.

  7. E-Commerce Summer Product Ratings and Sales

    • kaggle.com
    zip
    Updated Jan 15, 2023
    Cite
    The Devastator (2023). E-Commerce Summer Product Ratings and Sales [Dataset]. https://www.kaggle.com/datasets/thedevastator/summer-product-ratings-and-sales-performance-in/discussion
    Available download formats
    zip (436244 bytes)
    Dataset updated
    Jan 15, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    E-Commerce Summer Product Ratings and Sales

    Identifying Correlations and Patterns for Optimal Results

    By Jeffrey Mvutu Mabilama [source]

    About this dataset

    This dataset brings you closer than ever to the reality of top-selling products and their performance in e-commerce platforms. It gives you detailed lists of each product's features, ratings, sales, reviews and other metrics so that you can understand what makes a successful summer product on Wish. With this data at hand, you have access to not only a curated list of top summer products but also to the power of analytics for boosting your business operations.

    How to use the dataset

    This dataset contains information about summer product listings, ratings, and sales performance data on the Wish e-commerce platform. Using this information you will be able to understand how well certain products sell, the average price of products in the summer season and many more interesting insights that can be gained from this dataset.

    Research Ideas

    • Estimating the optimal pricing strategy for a product based on its ratings, merchant ratings count, mean discount and other metrics. This would help businesses to determine which pricing strategy would produce the most profits while still keeping customers interested in their products.
    • Analyzing the performance of seasonal summer products by studying correlations between them, and their ratings, units sold and prices etc., allowing businesses to identify trends more accurately and improve sales strategies accordingly.
    • Tracking sellers’ fame across different countries through analysis of customer reviews for each product listed by them in order to understand better how location affects sales performance as well as evaluate customer satisfaction with particular sellers regarding shipping times or quality of products supplied from aforesaid seller’s inventory

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: summer-products-with-rating-and-performance_2020-08.csv

    | Column name | Description |
    |:---|:---|
    | title | The title of the product. (String) |
    | title_orig | The original title of the product. (String) |
    | price | The price of the product. (Float) |
    | retail_price | The original retail price of the product. (Float) |
    | currency_buyer | The currency of the buyer. (String) |
    | units_sold | The number of units sold. (Integer) |
    | uses_ad_boosts | A flag indicating if the product has been boosted using ads. (Boolean) |
    | rating | The rating of the product. (Float) |
    | rating_count | The total number of ratings for the product. (Integer) |
    | rating_five_count | The number of five star ratings for the product. (Integer) |
    | rating_four_count | The number of four star ratings for the product. (Integer) |
    | rating_three_count | The number of three star ratings for the product. (Integer) |
    | rating_two_count | The number of two star ratings for the product. (Integer) |
    | rating_one_count | The number of one star ratings for the product. (Integer) |
    | badges_count | The number of badges associated with the product. (Integer) |
    | badge_local_product | A flag indicating if the product is a local product. (Boolean) |
    | badge_product_quality | A flag indicating if the product has a quality badge. (Boolean) |
    | badge_fast_shipping | A flag in...

  8. 20230327-XPCS experimental data set

    • scidb.cn
    Updated Nov 27, 2023
    Cite
    Cui Chenhui; Zhou Zimu; Wei Linfeng; Li Songlin; Tian Feng; Li Xiuhong; Guo Zhi; Xu Yihui; Jiang Huaidong; Tai Renzhong (2023). 20230327-XPCS experimental data set [Dataset]. http://doi.org/10.57760/sciencedb.13559
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Cui Chenhui; Zhou Zimu; Wei Linfeng; Li Songlin; Tian Feng; Li Xiuhong; Guo Zhi; Xu Yihui; Jiang Huaidong; Tai Renzhong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of two parts: XPCS and DLS experimental data. The XPCS experimental data were collected on SSRF 10U USAXS beam lines in March 2023, with the following experimental parameters: the X-ray energy is 10 keV; the distance between the sample and detector is 27.6 meters; the detector is an EIGER X 4M, with a single pixel size of 75 microns; the experimental sample was a colloidal glycerol suspension. The water in the original solution (McLean, M814153) was replaced with glycerol using a rotary evaporator, resulting in a volume fraction of 1%. Using partially coherent X-rays, speckle patterns were collected at different exposure periods. The correlation function of the sample at different exposure periods and q values can be obtained through the autocorrelation function. Data file name explanation: SiO2_500_N1000_100ms_GL_ICT1_4165 is colloid (SiO2)_particle size in nm_number of frames collected (N+number)_exposure period_medium (glycerol)_pinhole (100 microns)_data sequence number. The DLS experimental data were collected in the SSRF ancillary laboratories in July 2023 using the DLS device, with the following experimental parameters: laser wavelength: 633 nm; scattering angle: 90°; experimental temperature: 23.7 ℃. The original solution was diluted with pure water to different concentrations and then placed in a quartz colorimetric dish for detection. The scattered signal is received by a PMT and correlated with the correlation instrument (ALV-7004/USB-FAST) to obtain the results. We collected correlation signals at different concentrations and analyzed the impact of multiple scattering of colloids on particle size. Data file name explanation: SiO2_500_X10_10sX10_2 is colloid (SiO2)_particle size in nm_dilution ratio (concentration of 1%/10)_collection cycle X times_data sequence number.
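    The XPCS file naming scheme described above can be decoded with a small parser. This is a sketch under assumptions: the field names and the `parse_xpcs_name` helper are hypothetical, the seven-field layout is inferred from the single example name in the description, and whitespace after underscores is tolerated in case it appears in actual file names.

```python
from typing import NamedTuple

class XpcsFileName(NamedTuple):
    colloid: str            # e.g. "SiO2"
    particle_size_nm: int   # particle size in nm
    n_frames: int           # number of frames collected ("N" + number)
    exposure_period: str    # kept as text, e.g. "100ms"
    medium: str             # e.g. "GL" for glycerol
    pinhole: str            # pinhole code, e.g. "ICT1"
    sequence_number: int    # data sequence number

def parse_xpcs_name(name: str) -> XpcsFileName:
    """Split an XPCS file name into its seven underscore-separated fields."""
    parts = [part.strip() for part in name.split("_")]
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}: {name!r}")
    colloid, size, frames, exposure, medium, pinhole, seq = parts
    if not frames.startswith("N"):
        raise ValueError(f"frame field should start with 'N': {frames!r}")
    return XpcsFileName(
        colloid, int(size), int(frames[1:]), exposure, medium, pinhole, int(seq)
    )

parsed = parse_xpcs_name("SiO2_500_N1000_100ms_GL_ICT1_4165")
print(parsed)
```

    The DLS names follow a different five-field layout and would need a separate parser.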

  9. Data from: Climate Prediction Center(CPC)Ensemble Canonical Correlation...

    • data.wu.ac.at
    html
    Updated Jan 29, 2016
    Cite
    Department of Commerce (2016). Climate Prediction Center(CPC)Ensemble Canonical Correlation Analysis Forecast of Temperature [Dataset]. https://data.wu.ac.at/schema/data_gov/YTBiMGQ0OGItNmM0Yy00M2FiLWEwNGQtZmNiY2IwM2ZlY2Zm
    Available download formats
    html
    Dataset updated
    Jan 29, 2016
    Dataset provided by
    Department of Commerce
    Description

    The Ensemble Canonical Correlation Analysis (ECCA) temperature forecast is a 90-day (seasonal) outlook of US surface temperature anomalies. The ECCA uses Canonical Correlation Analysis (CCA), an empirical statistical method that finds patterns of predictors (variables used to make the prediction) and predictands (variables to be predicted) that maximize the correlation between them. The most recent available predictor data for different atmospheric/oceanic variables are projected onto the loading patterns to create forecasts. The ensemble refers to forecasts produced by using each predictor separately to create a forecast. The final forecast is an equally weighted average of the ensemble of forecasts. The model is trained from 1953 to the year before the present year to create the loading patterns. The available forecasts are rotated, meaning that only the most recently created forecasts are available. Previously made forecasts are not archived. For each produced forecast, 13 different leads are created (0.5 months, with an increment of 1 month for each subsequent lead), each with a different valid date.

  10. Data_Sheet_1_Dental microwear texture analysis correlations in guinea pigs...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Aug 25, 2022
    Cite
    Hatt, Jean-Michel; Müller, Jaqueline; Tütken, Thomas; Codron, Daryl; Schulz-Kornas, Ellen; Clauss, Marcus; Martin, Louise Françoise; Kaiser, Thomas; Ackermans, Nicole Lauren; Winkler, Daniela Eileen (2022). Data_Sheet_1_Dental microwear texture analysis correlations in guinea pigs (Cavia porcellus) and sheep (Ovis aries) suggest that dental microwear texture signal consistency is species-specific.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000439092
    Dataset updated
    Aug 25, 2022
    Authors
    Hatt, Jean-Michel; Müller, Jaqueline; Tütken, Thomas; Codron, Daryl; Schulz-Kornas, Ellen; Clauss, Marcus; Martin, Louise Françoise; Kaiser, Thomas; Ackermans, Nicole Lauren; Winkler, Daniela Eileen
    Description

    Dental microwear texture (DMT) analysis is used to differentiate abrasive dental wear patterns in many species fed different diets. Because DMT parameters all describe the same surface, they are expected to correlate with each other distinctively. Here, we explore the data range of, and correlations between, DMT parameters to increase the understanding of how this group of proxies records wear within and across species. The analysis was based on subsets of previously published DMT analyses in guinea pigs, sheep, and rabbits fed either a natural whole plant diet (lucerne, grass, bamboo) or pelleted diets with or without added quartz abrasives (guinea pigs and rabbits: up to 45 days, sheep: 17 months). The normalized DMT parameter range (P4: 0.69 ± 0.25; M2: 0.83 ± 0.16) and correlation coefficients (P4: 0.50 ± 0.31; M2: 0.63 ± 0.31) increased along the tooth row in guinea pigs, suggesting that strong correlations may be partially explained by data range. A comparison between sheep and guinea pigs revealed a higher DMT data range in sheep (0.93 ± 0.16; guinea pigs: 0.47 ± 0.29), but this did not translate into more substantial correlation coefficients (sheep: 0.35 ± 0.28; guinea pigs: 0.55 ± 0.32). Adding rabbits to an interspecies comparison of low abrasive dental wear (pelleted lucerne diet), the softer enamel of the hypselodont species showed a smaller data range for DMT parameters (guinea pigs 0.49 ± 0.32, rabbit 0.19 ± 0.18, sheep 0.78 ± 0.22) but again slightly higher correlation coefficients compared to the hypsodont teeth (guinea pigs 0.55 ± 0.31, rabbits 0.56 ± 0.30, sheep 0.42 ± 0.27). The findings suggest that the softer enamel of fast-replaced ever-growing hypselodont cheek teeth shows a greater inherent wear trace consistency, whereas the harder enamel of permanent and non-replaced enamel of hypsodont ruminant teeth records less coherent wear patterns. Because consistent diets were used across taxa, this effect cannot be ascribed to the random overwriting of individual wear traces on the more durable hypsodont teeth. This matches literature reports on reduced DMT pattern consistency on harder materials; possibly, individual wear events become more random in nature on harder material. Given the species-specific differences in enamel characteristics, the findings suggest a certain species-specificity of DMT patterns.

  11. Event Correlation For Vehicle Data Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Event Correlation For Vehicle Data Market Research Report 2033 [Dataset]. https://dataintelo.com/report/event-correlation-for-vehicle-data-market
    Available download formats
    pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Event Correlation for Vehicle Data Market Outlook

    As per our latest research, the global Event Correlation for Vehicle Data market size reached USD 1.68 billion in 2024, driven by the rapid adoption of connected vehicle technologies and the exponential growth in vehicle-generated data. The market is expected to grow at a robust CAGR of 16.2% from 2025 to 2033, reaching a forecasted value of USD 5.05 billion by 2033. This impressive expansion is fueled by the increasing demand for advanced analytics in automotive ecosystems, where real-time data correlation is transforming fleet management, predictive maintenance, and overall vehicle safety.
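    The compound-growth arithmetic behind figures of this kind can be sketched as follows. The function names are illustrative, and the number of compounding periods is an assumption supplied by the caller: the excerpt does not state which base year its growth rate is measured from, so the result depends on that choice.

```python
def project(start_value, annual_rate, years):
    """Value after compounding `annual_rate` for `years` periods."""
    return start_value * (1 + annual_rate) ** years

def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted in the excerpt (USD billion); the period counts are assumptions.
size_2024, size_2033 = 1.68, 5.05
for periods in (8, 9):
    rate = implied_cagr(size_2024, size_2033, periods)
    print(f"{periods} periods: implied CAGR {rate:.1%}")
```

    `project` runs the calculation in the other direction, from a starting value and a rate to an end value.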

    The primary growth factor for the Event Correlation for Vehicle Data market is the proliferation of connected vehicles and the Internet of Things (IoT) in the automotive sector. Modern vehicles generate massive volumes of data from sensors, onboard diagnostics, telematics systems, and infotainment platforms. The need to aggregate, analyze, and correlate these diverse data streams in real-time is critical for deriving actionable insights. Event correlation technologies enable automotive stakeholders to interpret complex event patterns, identify anomalies, and predict potential failures, thereby facilitating proactive maintenance, enhanced operational efficiency, and improved driver safety. The ongoing evolution of 5G networks and edge computing further accelerates this trend by enabling faster data transmission and decentralized processing, making real-time event correlation more practical and scalable.

    Another significant driver is the rising focus on safety and regulatory compliance in the automotive industry. Governments and regulatory bodies across major economies are implementing stringent data-driven safety standards, particularly for commercial fleets and autonomous vehicles. Event correlation for vehicle data enables organizations to monitor compliance in real-time, detect non-conformities, and respond swiftly to potential safety threats. Insurance companies are leveraging these capabilities to develop usage-based insurance models, assess driver behavior, and reduce fraudulent claims. The integration of event correlation with advanced driver-assistance systems (ADAS) and vehicle-to-everything (V2X) communication is also creating new opportunities for innovation and differentiation among automotive OEMs and fleet operators.

    The surge in demand for predictive maintenance solutions is another key factor propelling market growth. By correlating historical and real-time data from various vehicle components, event correlation platforms can accurately forecast component failures and schedule maintenance before breakdowns occur. This not only reduces downtime and maintenance costs but also extends vehicle lifespan and enhances customer satisfaction. The growing adoption of electric vehicles (EVs) and the complexity of their powertrains further amplify the need for sophisticated data correlation tools to optimize battery performance, monitor charging cycles, and ensure reliability. Automotive OEMs and fleet operators are increasingly investing in advanced event correlation solutions to stay competitive in a data-driven market landscape.

    From a regional perspective, North America currently dominates the Event Correlation for Vehicle Data market, supported by a mature automotive sector, high penetration of connected vehicles, and strong investments in automotive IoT infrastructure. Europe follows closely, driven by strict regulatory frameworks and a strong focus on sustainability and safety. The Asia Pacific region is expected to witness the fastest growth over the forecast period, fueled by rapid urbanization, increasing vehicle production, and government initiatives promoting smart mobility. Latin America and the Middle East & Africa are also showing promising growth, albeit from a smaller base, as digital transformation accelerates across emerging markets.

    Component Analysis

    The Event Correlation for Vehicle Data market is segmented by component into software, hardware, and services, each playing a pivotal role in the ecosystem. Software solutions are at the core of event correlation, enabling real-time data aggregation, pattern recognition, and advanced analytics. These platforms leverage artificial intelligence and machine learning algorithms to automate the identification of correlated events and anomalies across vast datasets. The increasing sophistication of sof

  12. Stock Market: Historical Data of Top 10 Companies

    • kaggle.com
    zip
    Updated Jul 18, 2023
    Cite
    Khushi Pitroda (2023). Stock Market: Historical Data of Top 10 Companies [Dataset]. https://www.kaggle.com/datasets/khushipitroda/stock-market-historical-data-of-top-10-companies
    Available download formats: zip (486977 bytes)
    Dataset updated
    Jul 18, 2023
    Authors
    Khushi Pitroda
    Description

    The dataset contains a total of 25,161 rows, each row representing the stock market data for a specific company on a given date. The information collected through web scraping from www.nasdaq.com includes the stock prices and trading volumes for the companies listed, such as Apple, Starbucks, Microsoft, Cisco Systems, Qualcomm, Meta, Amazon.com, Tesla, Advanced Micro Devices, and Netflix.

    Data Analysis Tasks:

    1) Exploratory Data Analysis (EDA): Analyze the distribution of stock prices and volumes for each company over time. Visualize trends, seasonality, and patterns in the stock market data using line charts, bar plots, and heatmaps.

    2) Correlation Analysis: Investigate the correlations between the closing prices of different companies to identify potential relationships. Calculate correlation coefficients and visualize correlation matrices.

    3) Top Performers Identification: Identify the top-performing companies based on their stock price growth and trading volumes over a specific time period.

    4) Market Sentiment Analysis: Perform sentiment analysis using Natural Language Processing (NLP) techniques on news headlines related to each company. Determine whether positive or negative news impacts the stock prices and volumes.

    5) Volatility Analysis: Calculate the volatility of each company's stock prices using metrics like Standard Deviation or Bollinger Bands. Analyze how volatile stocks are in comparison to others.
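    As a rough illustration of tasks 2 and 5, the sketch below computes a correlation matrix of closing prices and a simple volatility figure. The tickers and prices are hypothetical placeholders, not values from the dataset, and taking volatility as the sample standard deviation of daily returns is only one common choice.

```python
from math import sqrt
from statistics import mean, stdev


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical closing prices, for illustration only (not taken from the dataset).
closes = {
    "AAA": [100.0, 102.0, 101.0, 105.0, 107.0],
    "BBB": [50.0, 51.0, 50.5, 52.5, 53.5],
    "CCC": [80.0, 78.0, 79.0, 75.0, 74.0],
}

# Task 2: correlation matrix of closing prices, stored as a nested dict.
names = sorted(closes)
corr = {a: {b: pearson(closes[a], closes[b]) for b in names} for a in names}

# Task 5: volatility as the sample standard deviation of daily returns.
volatility = {name: stdev(daily_returns(prices)) for name, prices in closes.items()}

for a in names:
    print(a, [round(corr[a][b], 3) for b in names], round(volatility[a], 4))
```

    With the real data, the same functions could be applied after grouping rows by company and sorting them by date.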

    Machine Learning Tasks:

    1) Stock Price Prediction: Use time-series forecasting models like ARIMA, SARIMA, or Prophet to predict future stock prices for a particular company. Evaluate the models' performance using metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).

    2) Classification of Stock Movements: Create a binary classification model to predict whether a stock will rise or fall on the next trading day. Utilize features like historical price changes, volumes, and technical indicators for the predictions. Implement classifiers such as Logistic Regression, Random Forest, or Support Vector Machines (SVM).

    3) Clustering Analysis: Cluster companies based on their historical stock performance using unsupervised learning algorithms like K-means clustering. Explore if companies with similar stock price patterns belong to specific industry sectors.

    4) Anomaly Detection: Detect anomalies in stock prices or trading volumes that deviate significantly from the historical trends. Use techniques like Isolation Forest or One-Class SVM for anomaly detection.

    5) Reinforcement Learning for Portfolio Optimization: Formulate the stock market data as a reinforcement learning problem to optimize a portfolio's performance. Apply algorithms like Q-Learning or Deep Q-Networks (DQN) to learn the optimal trading strategy.

    The dataset provided on Kaggle, titled "Stock Market Stars: Historical Data of Top 10 Companies," is intended for learning purposes only. The data has been gathered from public sources, specifically from web scraping www.nasdaq.com, and is presented in good faith to facilitate educational and research endeavors related to stock market analysis and data science.

    It is essential to acknowledge that while we have taken reasonable measures to ensure the accuracy and reliability of the data, we do not guarantee its completeness or correctness. The information provided in this dataset may contain errors, inaccuracies, or omissions. Users are advised to use this dataset at their own risk and are responsible for verifying the data's integrity for their specific applications.

    This dataset is not intended for any commercial or legal use, and any reliance on the data for financial or investment decisions is not recommended. We disclaim any responsibility or liability for any damages, losses, or consequences arising from the use of this dataset.

    By accessing and utilizing this dataset on Kaggle, you agree to abide by these terms and conditions and understand that it is solely intended for educational and research purposes.

    Please note that the dataset's contents, including the stock market data and company names, are subject to copyright and other proprietary rights of the respective sources. Users are advised to adhere to all applicable laws and regulations related to data usage, intellectual property, and any other relevant legal obligations.

    In summary, this dataset is provided "as is" for learning purposes, without any warranties or guarantees, and users should exercise due diligence and judgment when using the data for any purpose.

  13. Improving Contact Prediction along Three Dimensions

    • plos.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Christoph Feinauer; Marcin J. Skwark; Andrea Pagnani; Erik Aurell (2023). Improving Contact Prediction along Three Dimensions [Dataset]. http://doi.org/10.1371/journal.pcbi.1003847
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Christoph Feinauer; Marcin J. Skwark; Andrea Pagnani; Erik Aurell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Correlation patterns in multiple sequence alignments of homologous proteins can be exploited to infer information on the three-dimensional structure of their members. The typical pipeline to address this task, which we in this paper refer to as the three dimensions of contact prediction, is to (i) filter and align the raw sequence data representing the evolutionarily related proteins; (ii) choose a predictive model to describe a sequence alignment; (iii) infer the model parameters and interpret them in terms of structural properties, such as an accurate contact map. We show here that all three dimensions are important for overall prediction success. In particular, we show that it is possible to improve significantly along the second dimension by going beyond the pair-wise Potts models from statistical physics, which have hitherto been the focus of the field. These (simple) extensions are motivated by multiple sequence alignments often containing long stretches of gaps which, as a data feature, would be rather untypical for independent samples drawn from a Potts model. Using a large test set of proteins we show that the combined improvements along the three dimensions are as large as any reported to date.

  14. DataSheet1_Repeated Measures Correlation.pdf

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Jonathan Z. Bakdash; Laura R. Marusich (2023). DataSheet1_Repeated Measures Correlation.pdf [Dataset]. http://doi.org/10.3389/fpsyg.2017.00456.s001
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Jonathan Z. Bakdash; Laura R. Marusich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Repeated measures correlation (rmcorr) is a statistical technique for determining the common within-individual association for paired measures assessed on two or more occasions for multiple individuals. Simple regression/correlation is often applied to non-independent observations or aggregated data; this may produce biased, specious results due to violation of independence and/or differing patterns between-participants versus within-participants. Unlike simple regression/correlation, rmcorr does not violate the assumption of independence of observations. Also, rmcorr tends to have much greater statistical power because neither averaging nor aggregation is necessary for an intra-individual research question. Rmcorr estimates the common regression slope, the association shared among individuals. To make rmcorr accessible, we provide background information for its assumptions and equations, visualization, power, and tradeoffs with rmcorr compared to multilevel modeling. We introduce the R package (rmcorr) and demonstrate its use for inferential statistics and visualization with two example datasets. The examples are used to illustrate research questions at different levels of analysis, intra-individual, and inter-individual. Rmcorr is well-suited for research questions regarding the common linear association in paired repeated measures data. All results are fully reproducible.
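    To make the within-individual idea concrete, the sketch below computes a repeated measures correlation by removing each subject's own mean from both measures and correlating what remains. This is not the rmcorr R package's code. It rests on the assumption that the common-slope model described above yields the same coefficient as the correlation of within-subject deviations, and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean


def rmcorr_coefficient(subjects, xs, ys):
    """Correlation of x and y after centering each measure within each subject."""
    groups = {}
    for s, x, y in zip(subjects, xs, ys):
        groups.setdefault(s, []).append((x, y))

    sxy = sxx = syy = 0.0
    for pairs in groups.values():
        mx = mean([p[0] for p in pairs])
        my = mean([p[1] for p in pairs])
        for x, y in pairs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
            syy += (y - my) ** 2
    return sxy / sqrt(sxx * syy)


# Made-up data: within each subject y falls as x rises, but subjects with
# higher x also have higher y on average.
subjects = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [3, 2, 1, 6, 5, 4, 9, 8, 7]

within = rmcorr_coefficient(subjects, xs, ys)

# Treating all observations as one group reduces to the ordinary pooled correlation.
pooled = rmcorr_coefficient(["all"] * len(xs), xs, ys)

print(within, pooled)
```

    The two numbers differ in sign here, which is the kind of between-participants versus within-participants discrepancy the description warns about.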

  15. Data_Sheet_1_Interpretive JIVE: Connections with CCA and an application to...

    • frontiersin.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Raphiel J. Murden; Zhengwu Zhang; Ying Guo; Benjamin B. Risk (2023). Data_Sheet_1_Interpretive JIVE: Connections with CCA and an application to brain connectivity.PDF [Dataset]. http://doi.org/10.3389/fnins.2022.969510.s001
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Raphiel J. Murden; Zhengwu Zhang; Ying Guo; Benjamin B. Risk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Joint and Individual Variation Explained (JIVE) is a model that decomposes multiple datasets obtained on the same subjects into shared structure, structure unique to each dataset, and noise. JIVE is an important tool for multimodal data integration in neuroimaging. The two most common algorithms are R.JIVE, an iterative approach, and AJIVE, which uses principal angle analysis. The joint structure in JIVE is defined by shared subspaces, but interpreting these subspaces can be challenging. In this paper, we reinterpret AJIVE as a canonical correlation analysis of principal component scores. This reformulation, which we call CJIVE, (1) provides an intuitive view of AJIVE; (2) uses a permutation test for the number of joint components; (3) can be used to predict subject scores for out-of-sample observations; and (4) is computationally fast. We conduct simulation studies that show CJIVE and AJIVE are accurate when the total signal ranks are correctly specified but generally inaccurate when the total ranks are too large. CJIVE and AJIVE can still extract joint signal even when the joint signal variance is relatively small. JIVE methods are applied to integrate functional connectivity (resting-state fMRI) and structural connectivity (diffusion MRI) from the Human Connectome Project. Surprisingly, the edges with largest loadings in the joint component in functional connectivity do not coincide with the same edges in the structural connectivity, indicating more complex patterns than assumed in spatial priors. Using these loadings, we accurately predict joint subject scores in new participants. We also find joint scores are associated with fluid intelligence, highlighting the potential for JIVE to reveal important shared structure.
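    The link between principal angles and canonical correlations of principal component scores can be sketched numerically. The code below is not the authors' CJIVE implementation. It assumes the signal ranks are already known, skips the permutation test, and uses synthetic data. The function names `pc_basis` and `canonical_correlations` are hypothetical.

```python
import numpy as np


def pc_basis(X, rank):
    """Orthonormal basis spanning the top `rank` principal component scores of X."""
    Xc = X - X.mean(axis=0)  # center each feature (column) across subjects
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :rank]


def canonical_correlations(X1, X2, rank1, rank2):
    """Canonical correlations between the PC scores of two subject-by-feature blocks."""
    U1 = pc_basis(X1, rank1)
    U2 = pc_basis(X2, rank2)
    # Singular values of U1^T U2 are the cosines of the principal angles between
    # the two score subspaces, which equal the canonical correlations.
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic data, for illustration only: one shared subject score plus one
# block-specific score per dataset, with a little noise.
rng = np.random.default_rng(0)
n = 60
joint, indiv1, indiv2 = rng.standard_normal((3, n))
X1 = (5 * np.outer(joint, rng.standard_normal(8))
      + 3 * np.outer(indiv1, rng.standard_normal(8))
      + 0.01 * rng.standard_normal((n, 8)))
X2 = (5 * np.outer(joint, rng.standard_normal(6))
      + 3 * np.outer(indiv2, rng.standard_normal(6))
      + 0.01 * rng.standard_normal((n, 6)))

cc = canonical_correlations(X1, X2, rank1=2, rank2=2)
print(cc)  # the leading value should sit near 1, reflecting the shared score
```

    A canonical correlation near 1 suggests a joint component, while smaller values point to structure that is specific to one dataset.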

  16. Urban Odor Complaint Correlation Analytics Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Urban Odor Complaint Correlation Analytics Market Research Report 2033 [Dataset]. https://dataintelo.com/report/urban-odor-complaint-correlation-analytics-market
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Urban Odor Complaint Correlation Analytics Market Outlook



    According to our latest research, the global Urban Odor Complaint Correlation Analytics market size reached USD 1.25 billion in 2024, reflecting a dynamic and rapidly evolving sector. The market is projected to grow at a robust CAGR of 11.7% from 2025 to 2033, reaching an estimated USD 3.44 billion by 2033. This growth is primarily driven by increasing urbanization, heightened public awareness regarding air quality, and stricter environmental regulations worldwide. As per our findings, the integration of advanced data analytics and IoT-enabled sensor networks is significantly enhancing the ability of municipalities and industries to monitor, analyze, and respond to urban odor complaints more effectively.
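    The relationship between a base-year value, a CAGR, and a projected value can be checked with a few lines of arithmetic. The sketch below assumes annual compounding over nine periods (2024 to 2033), which is an assumption about the report's convention, so the result need not match the report's rounded figure exactly. The function names are hypothetical.

```python
def project(start_value, cagr, years):
    """Value after compounding `start_value` at rate `cagr` for `years` periods."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that takes `start_value` to `end_value`."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted above: USD 1.25 billion in 2024, 11.7% CAGR, USD 3.44 billion by 2033.
# Nine compounding periods is an assumption, not something the report states.
projected_2033 = project(1.25, 0.117, 9)
rate_needed = implied_cagr(1.25, 3.44, 9)

print(f"projected 2033 value: USD {projected_2033:.2f} billion")
print(f"CAGR implied by the two endpoints: {rate_needed:.1%}")
```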




    A major growth factor for the Urban Odor Complaint Correlation Analytics market is the increasing prevalence of urbanization and population density in cities around the world. As urban areas expand, the sources of odor complaints, ranging from industrial emissions and waste treatment facilities to transportation systems, become more concentrated and complex. This has led to a surge in demand for sophisticated analytics platforms that can correlate odor complaints with real-time environmental data, enabling authorities to pinpoint sources and develop targeted mitigation strategies. Furthermore, the growing importance of urban livability indices and the need to ensure a high quality of life for city dwellers are pushing municipal governments to invest in innovative odor monitoring and analytics solutions.




    Another significant driver is the advancement of sensor technology and the proliferation of IoT devices. Modern urban odor analytics platforms leverage networks of low-cost, high-precision sensors that continuously collect data on air quality, meteorological conditions, and odor intensity. This data is then processed using advanced machine learning algorithms and big data analytics to identify patterns and correlations between odor complaints and potential sources. The ability to provide actionable insights in near real-time is transforming how cities manage environmental nuisances, leading to more responsive and transparent governance. Additionally, the integration of citizen-generated data through mobile applications and social media platforms is further enriching the datasets available for analysis, making odor correlation analytics more accurate and community-driven.




    Regulatory pressures and the heightened focus on public health are also fueling market expansion. Environmental protection agencies and local governments are enacting stricter regulations regarding air quality and odor emissions, compelling industries and municipalities to adopt comprehensive monitoring and analytics solutions. Public health concerns, such as the impact of malodors on respiratory health and overall well-being, are prompting increased investment in correlation analytics to proactively address odor-related issues. Enhanced reporting requirements and the need for transparent communication with the public have made urban odor analytics an essential tool for regulatory compliance and community engagement.




    From a regional perspective, North America and Europe currently lead the Urban Odor Complaint Correlation Analytics market, driven by well-established regulatory frameworks, high levels of technological adoption, and significant investments in smart city initiatives. However, the Asia Pacific region is expected to witness the fastest growth over the forecast period, fueled by rapid urbanization, increasing environmental awareness, and government-led efforts to improve urban air quality. Latin America and the Middle East & Africa are also emerging as promising markets, supported by infrastructure development and rising public demand for cleaner, healthier urban environments. The global market landscape is thus characterized by a mix of mature and emerging regions, each with unique drivers and challenges shaping the adoption of odor analytics solutions.



    Component Analysis



    The Urban Odor Complaint Correlation Analytics market is segmented by component into Software, Hardware, and Services, each playing a pivotal role in the overall ecosystem. Software solutions form the core of the market, enabling the collection, integration, and analysis of vast datasets from diverse sources including sensors, citizen complaints, and meteorological systems. These platforms employ adva

  17. Event Correlation For Physical Security Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Event Correlation For Physical Security Market Research Report 2033 [Dataset]. https://dataintelo.com/report/event-correlation-for-physical-security-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Event Correlation for Physical Security Market Outlook



    According to our latest research, the global Event Correlation for Physical Security market size reached USD 2.9 billion in 2024, with robust year-on-year growth. The market is expected to grow at a CAGR of 12.7% from 2025 to 2033, projecting the market size to reach approximately USD 8.6 billion by 2033. This growth is primarily driven by the increasing integration of advanced analytics and artificial intelligence into physical security infrastructures, as organizations across the globe seek to enhance real-time threat detection and response capabilities.




    One of the key growth factors fueling the Event Correlation for Physical Security market is the rapid rise in sophisticated security threats and the corresponding need for more proactive security postures. Traditional physical security systems are often siloed, resulting in delayed or missed responses to complex incidents. Event correlation platforms address this by aggregating and analyzing data from multiple sources—such as access control, video surveillance, and intrusion detection systems—to deliver actionable insights in real time. This integrated approach not only improves situational awareness but also enables security teams to prioritize threats, reduce false positives, and respond more efficiently to genuine incidents. As organizations increasingly recognize the value of unified security management, demand for event correlation solutions is expected to continue its upward trajectory.




    Another significant driver for market expansion is the growing adoption of IoT devices and smart infrastructure across commercial, industrial, and government sectors. The proliferation of connected sensors and cameras generates vast volumes of data that can overwhelm traditional monitoring systems. Event correlation tools leverage advanced algorithms and machine learning to sift through this data, identifying patterns and anomalies that may indicate potential security breaches. This capability is particularly crucial in high-risk environments such as airports, critical infrastructure, and financial institutions, where timely detection and response can prevent significant losses. Furthermore, regulatory mandates requiring enhanced security protocols in sectors like BFSI and transportation are compelling organizations to invest in comprehensive event correlation platforms.




    The shift towards cloud-based deployment models and the increasing emphasis on remote monitoring are also shaping the future of the Event Correlation for Physical Security market. Cloud-based solutions offer scalability, flexibility, and cost efficiencies, allowing organizations to centralize security operations and leverage advanced analytics without significant upfront investments in hardware. This trend is particularly pronounced among small and medium enterprises, which often lack the resources for extensive on-premises deployments. Moreover, the integration of AI-driven analytics within event correlation platforms is enabling predictive threat detection and automated incident response, further enhancing the value proposition for end-users. As digital transformation initiatives accelerate across industries, the demand for intelligent, cloud-enabled security solutions is expected to see sustained growth.




    Regionally, North America continues to dominate the Event Correlation for Physical Security market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The high adoption rate in North America is attributed to stringent security regulations, widespread deployment of advanced surveillance systems, and significant investments in smart city projects. Meanwhile, Asia Pacific is emerging as the fastest-growing region, driven by rapid urbanization, increased security concerns, and government-led infrastructure modernization efforts. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a lower base, as organizations in these regions increasingly recognize the importance of integrated security management. The evolving threat landscape and ongoing technological advancements are expected to further accelerate regional adoption rates in the coming years.



    Component Analysis



    The Event Correlation for Physical Security market is segmented by component into software, hardware, and services, each playing a distinct yet interconnected role in the overall ecosystem. Software remains the cornerstone

  18. Data from: Brain Ages Derived from Different MRI Modalities are Associated...

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Apr 24, 2025
    Cite
    Andrei-Claudiu Roibu; Stanislaw Adaszewski; Torsten Schindler; Stephen M. Smith; Ana I.L. Namburete; Frederik J. Lange (2025). Brain Ages Derived from Different MRI Modalities are Associated with Distinct Biological Phenotypes [Dataset]. http://doi.org/10.5281/zenodo.8110876
    Available download formats: csv
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrei-Claudiu Roibu; Stanislaw Adaszewski; Torsten Schindler; Stephen M. Smith; Ana I.L. Namburete; Frederik J. Lange
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Brain ageing is a highly variable, spatially and temporally heterogeneous process, marked by numerous structural and functional changes. These can cause discrepancies between individuals’ chronological age and the apparent age of their brain, as inferred from neuroimaging data. Machine learning models, and particularly Convolutional Neural Networks (CNNs), have proven adept in capturing patterns relating to ageing-induced changes in the brain. The differences between the predicted and chronological ages, referred to as brain age deltas, have emerged as useful biomarkers for exploring those factors which promote accelerated ageing or resilience, such as pathologies or lifestyle factors. However, previous studies rely only on structural neuroimaging for predictions, overlooking potentially informative functional and microstructural changes. Here we show that multiple contrasts derived from different MRI modalities can predict brain age, each encoding bespoke brain ageing information. By using 3D CNNs and UK Biobank data, we found that 57 contrasts derived from structural, susceptibility-weighted, diffusion, and functional MRI can successfully predict brain age. For each contrast, different patterns of association with non-imaging phenotypes were found, resulting in a total of 191 unique, statistically significant associations. Furthermore, we found that ensembling data from multiple contrasts results in both higher prediction accuracies and stronger correlations to non-imaging measurements. Our results demonstrate that other 3D contrasts and modalities, which have not been considered so far for the task of brain age prediction, encode different information about the ageing brain. We envision our work as being the starting point for future investigations into the causal links underpinning the observed brain age deltas and non-imaging measurement associations. For instance, drug effects can be monitored, given that certain medications correlated with accelerated brain ageing. Furthermore, continued development of brain age models could facilitate their deployment in clinical trials for recruitment and monitoring, and hospitals for diagnostic and screening tasks.

    Data Description

    This dataset contains the full correlation results with all nIDPs in the UK Biobank. These are presented in datasets split by sex into Female and Male subjects. For easier data manipulation, two smaller datasets have also been made available, containing just those correlations that pass the False Discovery Rate (FDR) threshold.
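    The description does not say which FDR procedure was used. As an assumption, the sketch below shows the widely used Benjamini-Hochberg step-up rule applied to a list of p-values. The p-values and the 0.05 level are hypothetical and are not taken from these datasets.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which p-values pass the Benjamini-Hochberg threshold."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) whose sorted p-value is <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Every hypothesis ranked at or below k_max passes.
    passed = [False] * m
    for idx in order[:k_max]:
        passed[idx] = True
    return passed


# Hypothetical p-values for ten correlations, for illustration only.
p_values = [0.039, 0.001, 0.041, 0.008, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
passed = benjamini_hochberg(p_values, alpha=0.05)
print([p for p, keep in zip(p_values, passed) if keep])
```

    Filtering a full results table with a mask like `passed` would give the kind of reduced dataset described above, under the stated assumption about the procedure.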

    As experiments were also conducted for ensembles using multiple contrasts, similar datasets are provided for those.

    Finally, global datasets are also provided. These are the concatenation of the associations contained in the Male and Female datasets.

    Paper & Code

    The original paper for this article can be accessed here:

    To access the codes relevant for this project, please access the project GitHub Repos:

    If using this work, please cite it based on the above paper, or using the following BibTeX:

    @inproceedings{roibu2023brain,
     title={Brain Ages Derived from Different MRI Modalities are Associated with Distinct Biological Phenotypes},
     author={Roibu, Andrei-Claudiu and Adaszewski, Stanislaw and Schindler, Torsten and Smith, Stephen M and Namburete, Ana IL and Lange, Frederik J},
     booktitle={2023 10th IEEE Swiss Conference on Data Science (SDS)},
     pages={17--25},
     year={2023},
     organization={IEEE},
     doi={10.1109/SDS57534.2023.00010}
    }

    Data Access

    The data for this project is freely available upon application at the UK Biobank. For more information regarding the individual nIDPs, please access the UK Biobank Showcase website at: https://biobank.ctsu.ox.ac.uk/showcase/search.cgi

    Funding

    ACR is supported by EPSRC Grant EP/S024093/1, F. Hoffmann-La Roche AG and a 2021 Industrial Fellowship offered by the Royal Commission for the Exhibition of 1851. SMS is supported by a Wellcome Trust Collaborative Award 215573/Z/19/Z. AILN is grateful for support from the Academy of Medical Sciences under the Springboard Awards scheme (SBF005/1136), and the Bill and Melinda Gates Foundation. FJL is supported by a Wellcome Trust Collaborative Award (215573/Z/19/Z). The WIN is supported by core funding from the Wellcome Trust (203139/Z/16/Z). The computational aspects were supported by the Wellcome Trust (203141/Z/16/Z) and the NIHR Oxford BRC. Corresponding authors: ACR (andreiroibu@icloud.com), SA (stanislaw.adaszewski@roche.com) and AILN (ana.namburete@cs.ox.ac.uk).

  19. US Mail Statistics

    • kaggle.com
    zip
    Updated Dec 19, 2023
    Cite
    The Devastator (2023). US Mail Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-mail-statistics
    Available download formats: zip (13500 bytes)
    Dataset updated
    Dec 19, 2023
    Authors
    The Devastator
    Description

    US Mail Statistics

    US Mail History: Mail Volume, Post Offices, Income, Expenses (1790-2017)

    By Throwback Thursday [source]

    About this dataset

    The dataset contains multiple columns that provide specific information for each year recorded. The column labeled Year indicates the specific year in which the data was recorded. The Pieces of Mail Handled column shows the total number of mail items that were processed or handled in a given year.

    Another important metric is represented in the Number of Post Offices column, revealing the total count of post offices that were operational during a specific year. This information helps understand how postal services and infrastructure have evolved over time.

    Examining financial aspects, there are two columns: Income and Expenses. The former represents the total revenue generated by the US Mail service in a particular year, while the latter showcases the expenses incurred by this service during that same period.

    The dataset titled Week 22 - US Mail - 1790 to 2017.csv serves as an invaluable resource for researchers, historians, and analysts interested in studying trends and patterns within the US Mail system throughout its extensive history. By utilizing this dataset's wide range of valuable metrics, users can gain insights into how mail volume has changed over time alongside fluctuations in post office numbers and financial performance.

    How to use the dataset

    • Familiarize yourself with the columns:

      • Year: This column represents the specific year in which data was recorded. It is represented by numeric values.
      • Pieces of Mail Handled: This column indicates the number of mail items processed or handled in a given year. It is also represented by numeric values.
      • Number of Post Offices: Here, you will find information on the total count of post offices in operation during a specific year. Like other columns, it consists of numeric values.
      • Income: The Income column displays the total revenue generated by the US Mail service in a particular year. Numeric values are used to represent this data.
      • Expenses: This column shows the total expenses incurred by the US Mail service for a particular year. Similar to other columns, it uses numeric values.
    • Understand data relationships: By exploring and analyzing different combinations of columns, you can uncover interesting patterns and relationships within mail statistics over time. For example:

      • Relationship between Year and Pieces of Mail Handled/Number of Post Offices/Income/Expenses: Analyzing these variables over years will allow you to observe trends such as increasing mail volume alongside changes in post office numbers or income and expenses patterns.

      • Relationship between Pieces of Mail Handled and Number of Post Offices: By comparing these two variables across different years, you can assess if there is any correlation between mail volume growth and changes in post office counts.

    • Visualization:

      To explore the data visually rather than through numerical analysis alone, consider using graphs or plots. You can use tools like Matplotlib, Seaborn, or Plotly to create various types of visualizations:

      • Time-series line plots: Visualize the change in Pieces of Mail Handled, Number of Post Offices, Income, and Expenses over time.
      • Scatter plots: Identify potential correlations between different variables such as Year and Pieces of Mail Handled/Number of Post Offices/Income/Expenses.
    • Drawing conclusions:

      This dataset presents an extraordinary opportunity to learn about the history and evolution of the US Mail service. By examining various factors together or individually throughout time, you can draw conclusions about
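
As a rough illustration of the relationship check described above, the following sketch parses CSV text with the listed column names and computes a Pearson correlation between mail volume and post office counts. The sample rows are made-up placeholders that only mimic the column layout, not real figures from the dataset, and the helper names `load_rows` and `pearson` are hypothetical. It uses only the Python standard library; the resulting lists could equally be passed to Matplotlib for a line or scatter plot.

```python
import csv
import io
import math

# Made-up placeholder rows that only mimic the column layout described above;
# these are NOT real figures from the dataset.
SAMPLE_CSV = """Year,Pieces of Mail Handled,Number of Post Offices,Income,Expenses
2000,100,50,1000,900
2001,120,48,1100,1000
2002,150,45,1300,1250
2003,170,44,1400,1450
"""


def load_rows(text):
    """Parse CSV text into a list of dicts whose values are floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: float(value) for name, value in row.items()} for row in reader]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


rows = load_rows(SAMPLE_CSV)
mail = [r["Pieces of Mail Handled"] for r in rows]
offices = [r["Number of Post Offices"] for r in rows]
r_value = pearson(mail, offices)
print(f"Pearson r (mail volume vs. post offices): {r_value:.3f}")
```

To run this against the real file, one would replace `SAMPLE_CSV` with the file's contents, assuming its headers match the column names listed above and its values parse as plain numbers.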

    Research Ideas

    • Trend Analysis: The dataset can be used to analyze the trends and patterns in mail volume, post office numbers, income, and expenses over time. This can help identify any significant changes or fluctuations in these variables and understand the factors that may have influenced them.
    • Benchmarking: By comparing the performance of different years or periods, this dataset can be used for benchmarking purposes. For example, it can help assess how efficiently post offices have been handling mail items by comparing the number of pieces of mail handled with the corresponding expenses incurred.
    • Forecasting: Based on historical data on mail volume and revenue generation, this dataset can be used for forecasting future trends. This could be valuable for planning purposes, such as determining resource allocation or projecting financial o...
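
The benchmarking and forecasting ideas above could be sketched as follows. The description does not prescribe a method, so this assumes two simple choices: expenses per piece of mail as the efficiency measure, and an ordinary least-squares straight line as the forecast. The numbers are made-up placeholders rather than values from the dataset, and `cost_per_piece` and `fit_line` are hypothetical helper names.

```python
def cost_per_piece(expenses, pieces):
    """Expenses divided by pieces of mail handled for one year."""
    if pieces <= 0:
        raise ValueError("pieces must be positive")
    return expenses / pieces


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder values for illustration only; not taken from the dataset.
years = [2000, 2001, 2002, 2003]
pieces = [100, 120, 140, 160]
expenses = [900, 1080, 1260, 1440]

# Benchmarking: compare cost per piece across years.
costs = [cost_per_piece(e, p) for e, p in zip(expenses, pieces)]

# Forecasting: extrapolate the linear trend in mail volume one year ahead.
slope, intercept = fit_line(years, pieces)
forecast_2004 = slope * 2004 + intercept

print("cost per piece by year:", costs)
print(f"linear forecast of pieces handled in 2004: {forecast_2004:.1f}")
```

A straight line is only a starting point; a series spanning more than two centuries is unlikely to follow a single linear trend, so any real forecast would need a model chosen after inspecting the data.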
  20. Data from: Empirical Study of the Relationship between Design Patterns and...

    • zenodo.org
    Updated Apr 2, 2020
    Cite
    Mahmoud Alfadel; Khalid Al-Jasser; Mohammad Alshayeb (2020). Empirical Study of the Relationship between Design Patterns and Code Smells [Dataset]. http://doi.org/10.5281/zenodo.3633081
    Dataset updated
    Apr 2, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mahmoud Alfadel; Khalid Al-Jasser; Mohammad Alshayeb
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Software systems are often developed in such a way that good practices in the object-oriented paradigm are not met, causing the occurrence of specific disharmonies, which are sometimes called code smells. Design patterns catalogue best practices for developing object-oriented software systems. Although code smells and design patterns are widely divergent, there might be a co-occurrence relation between them. The objective of this paper is to empirically evaluate whether the presence of design patterns is related to the presence of code smells at different granularity levels. We performed an empirical replication study using 20 design patterns and 13 code smells in ten small-size to medium-size, open-source Java-based systems. We applied statistical analysis and association rules. Results confirm that classes participating in design patterns have less smell-proneness and smell frequency than classes not participating in design patterns. We also noticed that every design pattern category acts in the same way in terms of smell-proneness in the subject systems. However, we observed, based on association rule learning and the proposed validation technique, that some patterns may be associated with certain smells in some cases. For instance, Command patterns can co-occur with the God Class, Blob, and External Duplication smells.

    The published data set contains the following:

    1. List of the selected systems (source code files)
    2. The P-MARt: the design pattern repository as XML for the selected systems.
    3. Data of design patterns and code smells: We processed this data by parsing the design pattern XML file and running the smell detection tool (inFusion).
    4. The data of the data mining analysis.
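
To make the association-rule idea in the description concrete, the sketch below computes the three standard rule metrics (support, confidence, lift) for a rule of the form "pattern implies smell". It is not the authors' pipeline: the records are hypothetical toy data, each set standing for the pattern roles and smells attached to one class, and `rule_metrics` is an illustrative helper name.

```python
# Hypothetical toy records: each set lists the design-pattern roles and
# code smells attached to one class. Not taken from the published data.
CLASSES = [
    {"Command", "God Class"},
    {"Command", "God Class"},
    {"Command"},
    {"Observer"},
    {"God Class"},
    {"Observer"},
    set(),
    set(),
]


def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent in r)
    n_c = sum(1 for r in records if consequent in r)
    n_both = sum(1 for r in records if antecedent in r and consequent in r)
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_both / n              # share of all classes with both items
    confidence = n_both / n_a         # share of antecedent classes that also have the consequent
    lift = confidence / (n_c / n)     # confidence relative to the consequent's base rate
    return support, confidence, lift


support, confidence, lift = rule_metrics(CLASSES, "Command", "God Class")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 means the smell appears more often among classes with the pattern than across all classes; whether such an association is meaningful still depends on a validation step like the one the authors mention.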
Statistical analysis of co-occurrence patterns in microbial presence-absence datasets

39 scholarly articles cite this dataset (View in Google Scholar)
