Abstract
The dataset provided here contains the efforts of independent data aggregation, quality control, and visualization of the University of Arizona (UofA) COVID-19 testing programs for the 2019 novel Coronavirus pandemic. The dataset is provided in the form of machine-readable tables in comma-separated value (.csv) and Microsoft Excel (.xlsx) formats.

Additional Information
As part of the UofA response to the 2019-20 Coronavirus pandemic, testing was conducted on students, staff, and faculty prior to the start of the academic year and throughout the school year. Testing was done at the UofA Campus Health Center and through their instance program called "Test All Test Smart" (TATS). These tests identify active cases of SARS-CoV-2 infection using the reverse transcription polymerase chain reaction (RT-PCR) test and the Antigen test. Because the Antigen test provided more rapid diagnosis, it was used heavily in the three weeks prior to the start of the Fall semester and throughout the academic year.

As these tests were occurring, results were provided on the COVID-19 websites. First, beginning in early March, the Campus Health Alerts website reported the total number of positive cases. Later, numbers were provided for the total number of tests (March 12 and thereafter). According to the website, these numbers were updated daily for positive cases and weekly for total tests. These numbers were reported until early September, when they were included in the reporting for the TATS program.

For the TATS program, numbers were provided through the UofA COVID-19 Update website. Initially, on August 21, the numbers provided were the total number (July 31 and thereafter) of tests and positive cases. Later (August 25), additional information was provided, with both PCR and Antigen testing numbers available. The daily numbers were also included at that point. On September 3, this website began to provide both the Campus Health and TATS data. Here, PCR and Antigen were combined and referred to as "Total", and daily and cumulative numbers were provided.

No official data dashboard was available until September 16, and aside from the information provided on these websites, the full dataset was not made publicly available. As such, the authors of this dataset independently aggregated data from multiple sources. These data were made publicly available through a Google Sheet, with graphical illustrations provided through the spreadsheet and on social media. The goal of providing the data and illustrations publicly was to provide factual information and to understand the infection rate of SARS-CoV-2 in the UofA community.

Because of differences in reported data between Campus Health and the TATS program, the dataset provides Campus Health numbers on September 3 and thereafter. TATS numbers are provided beginning on August 14, 2020.

Description of Dataset Content
The following terms are used in describing the dataset.
1. "Report Date" is the date and time at which the website was updated to reflect the new numbers.
2. "Test Date" is the date of testing/sample collection.
3. "Total" is the combination of Campus Health and TATS numbers.
4. "Daily" is the new data associated with the Test Date.
5. "To Date (07/31--)" provides the cumulative numbers from 07/31 and thereafter.
6. "Sources" provides the source of information. The number prior to the colon refers to the number of sources. Here, "UACU" refers to the UA COVID-19 Update page, and "UARB" refers to the UA Weekly Re-Entry Briefing. "SS" and "WBM" refer to screenshot (manually acquired) and "Wayback Machine" (see Reference section for links), with initials provided to indicate which author recorded the values. These screenshots are available in the records.zip file.

The dataset is distinguished, where available, by the testing program and the methods of testing. Where data are not available, calculations are made to fill in missing data (e.g., extrapolating backwards on the total number of tests based on daily numbers that are deemed reliable). Where errors are found (by comparing to previous numbers), those are reported on the above Google Sheet with specifics noted.

For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests, trouble downloading, etc.) can be directed to data-management@arizona.edu
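The backward extrapolation mentioned above can be sketched as follows. This is only an illustration, assuming a cumulative total that is known for one date and daily counts that are trusted for that date and the days before it. The function name and the sample numbers are hypothetical and are not taken from the dataset.

```python
from datetime import date


def backfill_cumulative(daily, anchor_date, anchor_total):
    """Reconstruct cumulative totals for every date up to anchor_date.

    daily        -- dict mapping a date to the new tests recorded on that date
                    (assumed complete, with no missing days)
    anchor_date  -- date for which the cumulative total is known
    anchor_total -- cumulative total at the end of anchor_date
    """
    totals = {}
    running = anchor_total
    # Walk backwards in time: the total at the end of the previous day is
    # today's total minus today's daily count.
    for day in sorted((d for d in daily if d <= anchor_date), reverse=True):
        totals[day] = running
        running -= daily[day]
    return dict(sorted(totals.items()))


# Hypothetical numbers, for illustration only.
daily_tests = {
    date(2020, 8, 14): 100,
    date(2020, 8, 15): 150,
    date(2020, 8, 16): 120,
}
cumulative = backfill_cumulative(daily_tests, date(2020, 8, 16), 1000)
```

With these made-up inputs, the reconstructed totals are 730, 880 and 1000 for August 14, 15 and 16.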
https://creativecommons.org/publicdomain/zero/1.0/
Dataset provided by: Björn Holzhauer
Dataset Description: Meta-analyses of clinical trials often treat the number of patients experiencing a medical event as binomially distributed when individual patient data for fitting standard time-to-event models are unavailable. Assuming identical drop-out time distributions across arms, random censorship and low proportions of patients with an event, a binomial approach results in a valid test of the null hypothesis of no treatment effect with minimal loss in efficiency compared to time-to-event methods. To deal with differences in follow-up - at the cost of assuming specific distributions for event and drop-out times - we propose a hierarchical multivariate meta-analysis model using the aggregate data likelihood based on the number of cases, fatal cases and discontinuations in each group, as well as the planned trial duration and group sizes. Such a model also enables exchangeability assumptions about parameters of survival distributions, for which they are more appropriate than for the expected proportion of patients with an event across trials of substantially different length. Borrowing information from other trials within a meta-analysis or from historical data is particularly useful for rare events data. Prior information or exchangeability assumptions also avoid the parameter identifiability problems that arise when using more flexible event and drop-out time distributions than the exponential one. We discuss the derivation of robust historical priors and illustrate the discussed methods using an example. We also compare the proposed approach against other aggregate data meta-analysis methods in a simulation study.
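To make the idea of an aggregate data likelihood more concrete, here is a deliberately simplified sketch. It is not the authors' model: it assumes independent exponential event and drop-out times with constant rates, ignores fatal cases and the hierarchical structure, and uses hypothetical function names. It only shows how the expected split of a trial arm into "event observed", "discontinued" and "completed" depends on the planned duration.

```python
import math


def outcome_probabilities(event_rate, dropout_rate, duration):
    """Probabilities of (event first, drop-out first, neither) by `duration`.

    Assumes independent exponential event and drop-out times, so whichever
    happens first is observed and the other is censored.
    """
    total = event_rate + dropout_rate
    p_any = 1.0 - math.exp(-total * duration)  # something happens before the end
    p_event = event_rate / total * p_any       # ... and it is the event
    p_dropout = dropout_rate / total * p_any   # ... and it is the drop-out
    p_complete = math.exp(-total * duration)   # patient completes event-free
    return p_event, p_dropout, p_complete


def multinomial_log_likelihood(counts, probs):
    """Log-likelihood of aggregate counts, up to an additive constant."""
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


# Same rates, different planned durations (all values are made up).
short_trial = outcome_probabilities(0.1, 0.05, 1.0)
long_trial = outcome_probabilities(0.1, 0.05, 3.0)
loglik = multinomial_log_likelihood((17, 9, 74), outcome_probabilities(0.1, 0.05, 2.0))
```

Under these assumptions the same event rate yields a different expected proportion of patients with an event in the short and the long trial, which illustrates why exchangeability is more natural for rates than for proportions.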
As per our latest research, the global map data aggregation platform market size reached USD 4.92 billion in 2024, demonstrating robust growth dynamics. The market is projected to expand at a CAGR of 13.8% over the forecast period, resulting in a forecasted value of USD 15.13 billion by 2033. This remarkable growth is driven by the increasing integration of geospatial intelligence across industries, the proliferation of IoT devices, and the rising demand for real-time, accurate mapping solutions. The market's evolution is underpinned by rapid technological advancements, particularly in cloud computing and artificial intelligence, which are revolutionizing how map data is aggregated, processed, and utilized for diverse applications.
The primary growth factor for the map data aggregation platform market is the surging demand for precise geospatial data to power navigation systems, location-based services, and urban infrastructure planning. As smart cities initiatives gain momentum worldwide, governments and municipal authorities are increasingly relying on map data aggregation platforms to optimize traffic management, resource allocation, and public safety. The integration of advanced sensors, IoT devices, and real-time data feeds into these platforms enables dynamic mapping and analytics, which are essential for supporting autonomous vehicles, drone delivery systems, and next-generation mobility solutions. Furthermore, the expansion of e-commerce and on-demand services is fueling the need for accurate, up-to-date mapping data to enhance last-mile delivery efficiency and customer experience.
Another significant driver is the widespread adoption of cloud-based map data aggregation solutions, which offer scalability, flexibility, and cost efficiency. Enterprises across transportation, logistics, and real estate sectors are leveraging these platforms to streamline operations, improve asset tracking, and gain actionable insights from spatial data. The integration of artificial intelligence and machine learning algorithms into map data aggregation platforms is enabling automated data cleansing, anomaly detection, and predictive analytics, further enhancing the value proposition for end users. Additionally, the growing emphasis on environmental sustainability and disaster management is prompting governments and NGOs to utilize map data aggregation platforms for monitoring land use, tracking deforestation, and coordinating emergency response efforts.
The map data aggregation platform market is also witnessing growth due to the increasing need for interoperability and data standardization across diverse mapping applications. As organizations seek to consolidate disparate geospatial datasets and facilitate seamless data exchange between systems, the role of aggregation platforms becomes critical. These platforms are evolving to support open standards, APIs, and cross-platform compatibility, enabling integration with GIS tools, enterprise resource planning (ERP) systems, and customer relationship management (CRM) solutions. This trend is particularly evident in sectors such as utilities and retail, where organizations require comprehensive spatial intelligence to optimize asset management, site selection, and market analysis.
Regionally, North America continues to dominate the map data aggregation platform market, owing to the presence of major technology providers, robust digital infrastructure, and early adoption of advanced mapping technologies. However, the Asia Pacific region is emerging as the fastest-growing market, driven by rapid urbanization, government investments in smart city projects, and the proliferation of mobile and connected devices. Europe also holds a significant share, supported by stringent regulatory frameworks for data privacy and the growing adoption of location-based services in transportation and logistics. The Middle East & Africa and Latin America are gradually catching up, fueled by infrastructure development and increasing digital transformation initiatives.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and Inferred Networks accompanying the manuscript entitled - “Aggregation of recount3 RNA-seq data improves the inference of consensus and context-specific gene co-expression networks”
Authors: Prashanthi Ravichandran, Princy Parsana, Rebecca Keener, Kaspar Hansen, Alexis Battle
Affiliations: Johns Hopkins University School of Medicine, Johns Hopkins University Department of Computer Science, Johns Hopkins University Bloomberg School of Public Health
Description:
This folder includes data produced in the analysis contained in the manuscript and inferred consensus and context-specific networks from graphical lasso and WGCNA with varying numbers of edges. Contents include:
all_metadata.rds: File including meta-data columns of study accession ID, sample ID, assigned tissue category, cancer status and disease status obtained through manual curation for the 95,484 RNA-seq samples used in the study.
all_counts.rds: log2 transformed RPKM normalized read counts for 5999 genes and 95,484 RNA-seq samples, which were used for dimensionality reduction and data exploration.
precision_matrices.zip: Zipped folder including networks inferred by graphical lasso for different experiments presented in the paper using weighted covariance aggregation following PC correction.
The networks can be found as follows. First, select the folder corresponding to the network of interest, for example, Blood. This folder includes two or more subfolders that indicate the data aggregation used. Select the folder corresponding to the appropriate level of data aggregation (for blood-specific networks, either all samples or GTEx); it contains precision matrices inferred across a range of penalization parameters. To view the precision matrix inferred for a particular value of the penalization parameter X, select the file labeled lambda_X.rds.
For select networks, we have included the computed centrality measures which can be accessed at centrality_X.rds for a particular value of the penalization parameter X.
We have also included .rds files that list the hub genes from the consensus networks inferred from non-cancerous samples at “normal_hubs.rds”, and the consensus networks inferred from cancerous samples at “cancer_hubs.rds”
The file “context_specific_selected_networks.csv” includes the networks that were selected for downstream biological interpretation based on the scale-free criterion which is also summarized in the Supplementary Tables.
WGCNA.zip: A zipped folder containing gene modules inferred from WGCNA for sequentially aggregated GTEx, SRA, and blood studies. Select the data aggregated and the number of studies based on folder names. For example, blood networks inferred from 20 studies can be accessed at blood/consensus/net_20. The individual networks correspond to distinct cut heights, and include information on the cut height used, the genes that the network was inferred over, merged module labels, and merged module colors.
A practical aggregation method for heterogeneous log-linear functions is presented. Inequality measures are employed in the construction of a simple but exact aggregate representation of an economy. Three macroeconomic applications are discussed: the aggregation of the Lucas supply function, the time-inconsistent behaviour of an egalitarian social planner facing heterogeneous discount rates, and the case of a simple heterogeneous growth model. In the latter application, aggregate CPS data is used to show that the slowdown that followed the first oil shock is worse than usually thought, and that the new economy growth resurgence is not as strong as it appears.
"The reaction of one man could be forecast by no known mathematics; the reaction of a billion is something else again." (Foundation and Empire, Isaac Asimov, 1952)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The file set is a freely downloadable aggregation of information about Australian schools. The individual files represent a series of tables which, when considered together, form a relational database. The records cover the years 2008-2014 and include information on approximately 9500 primary and secondary school main-campuses and around 500 subcampuses. The records all relate to school-level data; no data about individuals is included. All the information has previously been published and is publicly available but it has not previously been released as a documented, useful aggregation. The information includes:
(a) the names of schools
(b) staffing levels, including full-time and part-time teaching and non-teaching staff
(c) student enrolments, including the number of boys and girls
(d) school financial information, including Commonwealth government, state government, and private funding
(e) test data, potentially for school years 3, 5, 7 and 9, relating to an Australian national testing programme known by the trademark 'NAPLAN'
Documentation of this Edition 2016.1 is incomplete but the organization of the data should be readily understandable to most people. If you are a researcher, the simplest way to study the data is to make use of the SQLite3 database called 'school-data-2016-1.db'. If you are unsure how to use an SQLite database, ask a guru.
The database was constructed directly from the other included files by running the following command at a command-line prompt:

    sqlite3 school-data-2016-1.db < school-data-2016-1.sql

Note that a few inconsequential errors will be reported if you run this command yourself. The reason for the errors is that the SQLite database is created by importing a series of '.csv' files. Each of the .csv files contains a header line with the names of the variables relevant to each column. The information is useful for many statistical packages but it is not what SQLite expects, so it complains about the header. Despite the complaint, the database will be created correctly.
Briefly, the data are organized as follows.
(1) The .csv files ('comma separated values') do not actually use a comma as the field delimiter. Instead, the vertical bar character '|' (ASCII Octal 174 Decimal 124 Hex 7C) is used. If you read the .csv files using Microsoft Excel, Open Office, or Libre Office, you will need to set the field-separator to be '|'. Check your software documentation to understand how to do this.
(2) Each school-related record is indexed by an identifier called 'ageid'. The ageid uniquely identifies each school and consequently serves as the appropriate variable for JOIN-ing records in different data files. For example, the first school-related record after the header line in file 'students-headed-bar.csv' shows the ageid of the school as 40000. The relevant school name can be found by looking in the file 'ageidtoname-headed-bar.csv' to discover that the ageid of 40000 corresponds to a school called 'Corpus Christi Catholic School'.
(3) In addition to the variable 'ageid', each record is also identified by one or two 'year' variables. The most important purpose of a year identifier is to indicate the year that is relevant to the record. For example, if one turns again to file 'students-headed-bar.csv', one sees that the first seven school-related records after the header line all relate to the school Corpus Christi Catholic School with ageid of 40000. The variable that identifies the important differences between these seven records is the variable 'studentyear'. 'studentyear' shows the year to which the student data refer. One can see, for example, that in 2008, there were a total of 410 students enrolled, of whom 185 were girls and 225 were boys (look at the variable names in the header line).
(4) The variables relating to years are given different names in each of the different files ('studentsyear' in the file 'students-headed-bar.csv', 'financesummaryyear' in the file 'financesummary-headed-bar.csv'). Despite the different names, the year variables provide the second-level means for joining information across files. For example, if you wanted to relate the enrolments at a school in each year to its financial state, you might wish to JOIN records using 'ageid' in the two files and, secondarily, matching 'studentsyear' with 'financialsummaryyear'.
(5) The manipulation of the data is most readily done using the SQL language with the SQLite database, but it can also be done in a variety of statistical packages.
(6) It is our intention for Edition 2016-2 to create large 'flat' files suitable for use by non-researchers who want to view the data with spreadsheet software. The disadvantage of such 'flat' files is that they contain vast amounts of redundant information and might not display the data in the form that the user most wants.
(7) Geocoding of the schools is not available in this edition.
(8) Some files, such as 'sector-headed-bar.csv', are not used in the creation of the database but are provided as a convenience for researchers who might wish to recode some of the data to remove redundancy.
(9) A detailed example of a suitable SQLite query can be found in the file 'school-data-sqlite-example.sql'. The same query, used in the context of analyses done with the excellent, freely available R statistical package (http://www.r-project.org), can be seen in the file 'school-data-with-sqlite.R'.
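The two-level join described in points (2) to (4) can be sketched with Python's built-in sqlite3 and csv modules. This is a minimal illustration rather than the query shipped in 'school-data-sqlite-example.sql': the in-memory tables stand in for the real files, the column names other than 'ageid' are assumptions (the year column spellings in particular should be checked against the actual header lines), and the finance figure is invented. The student figures are the 2008 values quoted in point (3).

```python
import csv
import io
import sqlite3

# Tiny stand-ins for two of the '|'-delimited files. Column names other than
# 'ageid' are assumed, and the finance value is made up for illustration.
STUDENTS = "ageid|studentsyear|girls|boys\n40000|2008|185|225\n"
FINANCE = "ageid|financesummaryyear|totalincome\n40000|2008|1000000\n"


def load(conn, table, text):
    """Create `table` from '|'-delimited text whose first line is a header."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    header = next(reader)  # consume the header instead of importing it as data
    columns = ", ".join(f"{name} INTEGER" for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", reader)


conn = sqlite3.connect(":memory:")
load(conn, "students", STUDENTS)
load(conn, "financesummary", FINANCE)

# First-level key: ageid. Second-level key: the two differently named years.
rows = conn.execute(
    """
    SELECT s.ageid, s.studentsyear, s.girls + s.boys AS enrolled, f.totalincome
    FROM students AS s
    JOIN financesummary AS f
      ON f.ageid = s.ageid
     AND f.financesummaryyear = s.studentsyear
    """
).fetchall()
```

Reading the header with the csv module before inserting rows avoids the header complaint that the sqlite3 command-line import produces.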
By data.world's Admin
This dataset contains an aggregation of birth data from the United States between 1985 and 2015. It consists of information on mothers' locations by state (including District of Columbia) and county, as well as information such as the month they gave birth, and aggregates giving the sum of births during that month. This data has been provided by both the National Bureau for Economic Research and National Center for Health Statistics, whose shared mission is to understand how life works in order to aid individuals in making decisions about their health and wellbeing. This dataset provides valuable insight into population trends across time and location - for example, which states have higher or lower birthrates than others? Which counties experience dramatic fluctuations over time? Given its scope, this dataset could be used in a number of contexts, from epidemiology research to population forecasting. Be sure to check out our other datasets related to births while you're here!
This dataset could be used to examine local trends in birth rates over time or analyze births at different geographical locations. In order to maximize your use of this dataset, it is important that you understand what information the various columns contain.
The main columns are: State (including District of Columbia), County (coded using the FIPS county code number), Month (numbering from 1 for January through 12 for December), Year (4-digit year), countyBirths (calculated sum of births that occurred to mothers living in a county for a given month) and stateBirths (calculated sum of births that occurred to mothers living in a state for a given month). These fields should provide enough information for you to analyze trends across geographic locations at both monthly and yearly levels. You could also consider combining variables such as Year with State or Year with Month, or any other grouping combination, depending on your analysis goal.
In addition, while all data were downloaded on April 5th 2017, it is worth noting that all sources used followed privacy guidelines as laid out by NCHS, so individual births occurring after 2005 are not included due to geolocation concerns.
We hope you find this dataset useful and can benefit from its content! With proper understanding of what each field contains, we are confident you will gain valuable insights on birth rates across counties within the United States during this period
- Establishing county-level trends in birth rates for the US over time.
- Analyzing the relationship between month of birth and health outcomes for US babies after they are born (e.g., infant mortality, neurological development, etc.).
- Comparing state/county-level differences in average numbers of twins born each year
If you use this dataset in your research, please credit the original authors.
File: allBirthData.csv

| Column name | Description |
|:-------------|:-------------|
| State | The numerical order of the state where the mother lives. (Integer) |
| Month | The month in which the birth took place. (Integer) |
| Year | The year of the birth. (Integer) |
| countyBirths | The calculated sum of births that occurred to mothers living in that county for that particular month. (Integer) |
| stateBirths | The aggregate number at the level of entire states for any given month-year combination. (Integer) |
| County | The county where the mother lives, coded using FIPS County Code. (Integer) |
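A grouping such as Year with State could look like the sketch below. It rests on two assumptions that should be checked against the real file: that allBirthData.csv is comma-delimited with the column names listed above, and that the stateBirths value is repeated on every county row of the same state and month. The sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Invented rows in the assumed layout of allBirthData.csv.
SAMPLE = """State,County,Month,Year,countyBirths,stateBirths
1,1,1,1985,100,5000
1,3,1,1985,200,5000
1,1,2,1985,110,4800
1,3,2,1985,190,4800
"""


def yearly_state_births(rows):
    """Sum stateBirths per (State, Year), counting each month only once."""
    monthly = {}
    for row in rows:
        key = (int(row["State"]), int(row["Year"]), int(row["Month"]))
        # If the state total repeats on each county row, keeping one value per
        # state-month avoids counting it once per county.
        monthly[key] = int(row["stateBirths"])
    totals = defaultdict(int)
    for (state, year, _month), births in monthly.items():
        totals[(state, year)] += births
    return dict(totals)


totals = yearly_state_births(csv.DictReader(io.StringIO(SAMPLE)))
```

Summing stateBirths directly over all rows would inflate the result by the number of counties per state, which is why the sketch first reduces the data to one value per state and month.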
If you use this dataset in your research, please credit data.world's Admin.
Monthly report including total dispatched trips, total dispatched shared trips, and unique dispatched vehicles aggregated by FHV (For-Hire Vehicle) base. These have been tabulated from raw trip record submissions made by bases to the NYC Taxi and Limousine Commission (TLC). This dataset is typically updated monthly on a two-month lag, as bases have until the conclusion of the following month to submit a month of trip records to the TLC. For example, a base has until Feb 28 to submit complete trip records for January. Therefore, the January base aggregates will appear in March at the earliest. The TLC may elect to defer updates to the FHV Base Aggregate Report if a large number of bases have failed to submit trip records by the due date. Note: The TLC publishes base trip record data as submitted by the bases, and we cannot guarantee or confirm their accuracy or completeness. Therefore, this may not represent the total number of trips dispatched by all TLC-licensed bases. The TLC performs routine reviews of the records and takes enforcement actions when necessary to ensure, to the extent possible, complete and accurate information.
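The two-month lag described above can be expressed as simple date arithmetic. The sketch below is only an illustration of the rule as stated in the text; the function names are hypothetical and the TLC may defer an update beyond the earliest month computed here.

```python
import calendar
from datetime import date


def submission_deadline(year, month):
    """Last day of the month following the trip month."""
    y, m = (year + 1, 1) if month == 12 else (year, month + 1)
    return date(y, m, calendar.monthrange(y, m)[1])


def earliest_publication_month(year, month):
    """(year, month) right after the submission deadline."""
    deadline = submission_deadline(year, month)
    if deadline.month == 12:
        return (deadline.year + 1, 1)
    return (deadline.year, deadline.month + 1)


january_deadline = submission_deadline(2023, 1)
january_published = earliest_publication_month(2023, 1)
```

For trips made in January 2023, this gives a deadline of February 28, 2023 and an earliest publication month of March 2023, matching the example in the description.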
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Concrete is one of the most used building materials worldwide. Up to 80% of its volume consists of fine and coarse aggregate particles (normally 0.1 mm to 32 mm in size) which are dispersed in a cement paste matrix. The size distribution of the aggregates (i.e. the grading curve) substantially affects the properties and quality characteristics of concrete, such as its workability in the fresh state and its mechanical properties in the hardened state. In practice, the size distribution of small samples of the aggregate is usually determined by manual mechanical sieving and is considered representative of a large amount of aggregate. However, the size distribution of the actual aggregate used for individual production batches of concrete varies, especially when, for example, recycled material is used as aggregate. As a consequence, the unknown variations of the particle size distribution have a negative effect on the robustness and the quality of the final concrete produced from the raw material.
Towards the goal of deriving precise knowledge about the actual particle size distribution of the aggregate, thus eliminating the unknown variations in the material’s properties, we propose a data set for the image based prediction of the size distribution of concrete aggregates. Incorporating such an approach into the production chain of concrete makes it possible to react to detected variations in the size distribution of the aggregate in real time by adapting the composition, i.e. the mixture design of the concrete, accordingly, so that the desired concrete properties are reached.
Figure: Classical vs. image based granulometry (https://data.uni-hannover.de/dataset/f00bdcc4-8b27-4dc4-b48d-a84d75694e18/resource/042abf8d-e87a-4940-8195-2459627f57b6/download/overview.png)
In the classification data, nine different grading curves are distinguished. In this context, the normative regulations of DIN 1045 are considered. The nine grading curves differ in their maximum particle size (8, 16, or 32 mm) and in the distribution of the particle size fractions allowing a categorisation of the curves to coarse-grained (A), medium-grained (B) and fine-grained (C) curves, respectively. A quantitative description of the grain size distribution of the nine curves distinguished is shown in the following figure, where the left side shows a histogram of the particle size fractions 0-2, 2-8, 8-16, and 16-32 mm and the right side shows the cumulative histograms of the grading curves (the vertical axes represent the mass-percentages of the material).
For each of the grading curves, two samples (S1 and S2) of aggregate particles were created. Each sample consists of a total mass of 5 kg of aggregate material and is carefully designed according to the grain size distribution shown in the figure: the raw material is first sieved in order to separate the different grain size fractions, and the samples are subsequently composed according to the dedicated mass-percentages of the size distributions.
Figure: Particle size distribution of the classification data (https://data.uni-hannover.de/dataset/f00bdcc4-8b27-4dc4-b48d-a84d75694e18/resource/17eb2a46-eb23-4ec2-9311-0f339e0330b4/download/statistics_classification-data.png)
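How a 5 kg sample is composed from mass-percentages, and how the cumulative histogram follows from them, can be sketched in a few lines. The percentages used here are invented for illustration and are not the DIN 1045 values of any of the nine grading curves; the function name is hypothetical.

```python
def compose_sample(mass_percent, total_mass_kg=5.0):
    """Return per-fraction masses (kg) and the cumulative mass-percentages.

    mass_percent -- dict mapping a size fraction label to its mass-percentage,
                    listed from the finest to the coarsest fraction
    """
    if abs(sum(mass_percent.values()) - 100.0) > 1e-9:
        raise ValueError("mass-percentages must add up to 100")
    masses = {label: total_mass_kg * pct / 100.0
              for label, pct in mass_percent.items()}
    cumulative = {}
    running = 0.0
    for label, pct in mass_percent.items():  # dicts keep insertion order
        running += pct
        cumulative[label] = running
    return masses, cumulative


# Invented distribution for a curve with a maximum particle size of 16 mm.
fractions = {"0-2": 30.0, "2-8": 30.0, "8-16": 40.0}
masses, cumulative = compose_sample(fractions)
```

With these made-up percentages the sample would contain 1.5 kg, 1.5 kg and 2.0 kg of the three fractions, and the cumulative curve would read 30%, 60% and 100%.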
For data acquisition, a static setup was used in which the samples are placed in a measurement vessel equipped with a set of calibrated reference markers whose object coordinates are known and which are assembled so that they form a common plane with the surface of the aggregate sample. We acquired the data by taking images of the aggregate samples (and the reference markers) in the measurement vessel. The arrangement of the sample within the vessel is perturbed between the acquisition of each image in order to obtain variations in the sample’s visual appearance. This acquisition strategy makes it possible to record multiple different images for the individual grading curves by reusing the same sample, consequently reducing the labour-intensive part of material sieving and sample generation. In this way, we acquired a data set of 900 images in total, consisting of 50 images of each of the two samples (S1 and S2) which were created for each of the nine grading curve definitions (50 x 2 x 9 = 900). For each image, we automatically detect the reference markers, thus obtaining the image coordinates of each marker in addition to its known object coordinates. We use these correspondences to compute the homography which describes the perspective transformation of the reference markers’ plane in object space (which corresponds to the surface plane of the aggregate sample) to the image plane. Using the computed homography, we transform the image in order to obtain a perspectively rectified representation of the aggregate sample with a known ground sampling distance (GSD) of 8 px/mm that is consistent across the entire image. In the following figure, example images of our data set showing aggregate samples of each of the distinguished grading curve classes are depicted.
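The homography step can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes exactly four marker correspondences, fixes the last homography entry to 1, and solves the resulting linear system directly, whereas a real pipeline would typically use more markers and a least-squares or robust estimate. The marker coordinates below are invented, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate marker configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_homography(src, dst):
    """Homography mapping four src points (x, y) to four dst points (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(hm, x, y):
    """Map a point through the homography (with perspective division)."""
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)


def source_pixel_for(hm, col, row, px_per_mm=8.0):
    """Image position to sample for pixel (col, row) of the rectified image."""
    return apply_homography(hm, col / px_per_mm, row / px_per_mm)


# Invented marker positions: object plane in mm, detected image points in px.
MARKERS_MM = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
MARKERS_PX = [(120.0, 80.0), (900.0, 110.0), (860.0, 700.0), (150.0, 650.0)]
H = estimate_homography(MARKERS_MM, MARKERS_PX)
```

Looking up the source position for every pixel of the output grid (inverse warping) yields a rectified image in which 8 pixels correspond to 1 mm on the sample surface; interpolation of the sampled intensities is left out of the sketch.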
Figure: Example images of the classification data (https://data.uni-hannover.de/dataset/f00bdcc4-8b27-4dc4-b48d-a84d75694e18/resource/59925f1d-3eef-4b50-986a-e8d2b0e14beb/download/examples_classification_data.png)
If you make use of the proposed data, please cite the publication listed below.
According to our latest research, the global risk data aggregation and reporting for banks market size reached USD 7.9 billion in 2024, driven by the increasing regulatory requirements and the growing complexity of banking operations. The market is expected to expand at a robust CAGR of 14.2% from 2025 to 2033, reaching a projected value of USD 22.3 billion by 2033. This impressive growth is primarily fueled by the ongoing digital transformation initiatives within the banking sector, as well as the heightened focus on risk management and compliance. As per our latest research, banks globally are investing in advanced data aggregation and reporting solutions to meet evolving regulatory mandates and enhance operational efficiency.
One of the principal growth factors for the risk data aggregation and reporting for banks market is the tightening regulatory landscape. Financial authorities such as the Basel Committee on Banking Supervision (BCBS) have established stringent guidelines, notably BCBS 239, which require banks to improve their risk data aggregation capabilities and reporting practices. This has led to a surge in demand for robust solutions that can ensure data accuracy, consistency, and timeliness. Banks are compelled to invest in advanced software and services that facilitate real-time data integration, risk assessment, and regulatory reporting. The growing volume and complexity of banking transactions further underscore the need for comprehensive risk data aggregation and reporting frameworks, as traditional manual processes are no longer sufficient to meet regulatory expectations.
Another significant driver is the rapid digitalization of the banking sector. As banks embrace digital transformation, they are generating massive amounts of data from various sources, including online transactions, customer interactions, and third-party integrations. Efficient risk data aggregation and reporting solutions enable banks to harness this data, providing actionable insights for risk management and strategic decision-making. The adoption of technologies such as artificial intelligence, machine learning, and big data analytics is enhancing the capabilities of these solutions, allowing banks to identify emerging risks, optimize capital allocation, and improve overall governance. This digital shift is not just a response to regulatory pressure but also a strategic move to gain competitive advantage in a fast-evolving financial landscape.
Furthermore, the increasing focus on operational resilience and business continuity is propelling the adoption of risk data aggregation and reporting solutions. Banks are recognizing the need to quickly aggregate and analyze data from multiple sources to detect vulnerabilities, prevent fraud, and ensure compliance with internal and external policies. The COVID-19 pandemic has further highlighted the importance of real-time risk management and agile reporting, as financial institutions faced unprecedented disruptions and market volatility. As a result, investments in risk data infrastructure are becoming a top priority for banks of all sizes, paving the way for sustained market growth over the forecast period.
From a regional perspective, North America currently dominates the risk data aggregation and reporting for banks market, followed closely by Europe and Asia Pacific. The United States, in particular, has a mature banking sector with stringent regulatory requirements, driving early adoption of advanced risk data solutions. Meanwhile, the Asia Pacific region is witnessing the fastest growth, fueled by rapid digitalization, expanding banking networks, and increasing regulatory oversight in emerging economies such as China and India. Europe remains a key market due to the implementation of comprehensive financial regulations and the presence of major global banks. Latin America and the Middle East & Africa are also showing steady progress, albeit at a slower pace, as banks in these regions gradually upgrade their risk management capabilities.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Modality-agnostic files were copied over and the CHANGES file was updated. Data was aggregated using:
python phenotype.py aggregate subject -i segregated_subject -o aggregated_subject
phenotype.py came from the GitHub repository: https://github.com/ericearl/bids-phenotype
A comprehensive clinical, MRI, and MEG collection characterizing healthy research volunteers collected at the National Institute of Mental Health (NIMH) Intramural Research Program (IRP) in Bethesda, Maryland using medical and mental health assessments, diagnostic and dimensional measures of mental health, cognitive and neuropsychological functioning, structural and functional magnetic resonance imaging (MRI), along with diffusion tensor imaging (DTI), and a comprehensive magnetoencephalography battery (MEG).
In addition, blood samples of healthy volunteers are currently banked for future genetic analysis. All data collected in this protocol are broadly shared in the OpenNeuro repository, in the Brain Imaging Data Structure (BIDS) format. Task paradigms and basic pre-processing scripts are shared on GitHub. This dataset is unique in its depth of characterization of a healthy population in terms of brain health and will contribute to a wide array of secondary investigations of non-clinical and clinical research questions.
This dataset is licensed under the Creative Commons Zero (CC0) v1.0 License.
Inclusion criteria for the study require that participants are adults at or over 18 years of age in good health with the ability to read, speak, understand, and provide consent in English. All participants provided electronic informed consent for online screening and written informed consent for all other procedures. Exclusion criteria include:
Study participants are recruited through direct mailings, bulletin boards and listservs, outreach exhibits, print advertisements, and electronic media.
All potential volunteers first visit the study website (https://nimhresearchvolunteer.ctss.nih.gov), check a box indicating consent, and complete preliminary self-report screening questionnaires. The study website is HIPAA compliant and therefore does not collect PII; instead, participants are instructed to contact the study team to provide their identity and contact information. The questionnaires include demographics, clinical history including medications, disability status (WHODAS 2.0), mental health symptoms (modified DSM-5 Self-Rated Level 1 Cross-Cutting Symptom Measure), substance use survey (DSM-5 Level 2), alcohol use (AUDIT), handedness (Edinburgh Handedness Inventory), and perceived health ratings. At the conclusion of the questionnaires, participants are again prompted to send an email to the study team. Survey results, supplemented by NIH medical records review (if present), are reviewed by the study team, who determine if the participant is likely eligible for the protocol. These participants are then scheduled for an in-person assessment. Follow-up phone screenings were also used to determine if participants were eligible for in-person screening.
At this visit, participants undergo a comprehensive clinical evaluation to determine final eligibility to be included as a healthy research volunteer. The mental health evaluation consists of a psychiatric diagnostic interview (Structured Clinical Interview for DSM-5 Disorders, SCID-5), along with self-report surveys of mood (Beck Depression Inventory-II, BDI-II) and anxiety (Beck Anxiety Inventory, BAI) symptoms. An intelligence quotient (IQ) estimation is determined with the Kaufman Brief Intelligence Test, Second Edition (KBIT-2). The KBIT-2 is a brief (20-30 minute) assessment of intellectual functioning administered by a trained examiner. There are three subtests: verbal knowledge, riddles, and matrices.
Medical evaluation includes medical history elicitation and systematic review of systems. Biological and physiological measures include vital signs (blood pressure, pulse), as well as weight, height, and BMI. Blood and urine samples are taken and a complete blood count, acute care panel, hepatic panel, thyroid stimulating hormone, viral markers (HCV, HBV, HIV), C-reactive protein, creatine kinase, urine drug screen and urine pregnancy tests are performed. In addition, blood samples that can be used for future genomic analysis, development of lymphoblastic cell lines or other biomarker measures are collected and banked with the NIMH Repository and Genomics Resource (Infinity BiologiX). The Family Interview for Genetic Studies (FIGS) was later added to the assessment in order to provide better pedigree information; the Adverse Childhood Events (ACEs) survey was also added to better characterize potential risk factors for psychopathology. The entirety of the in-person assessment not only collects information relevant for eligibility determination, but it also provides a comprehensive set of standardized clinical measures of volunteer health that can be used for secondary research.
Participants are given the option to consent for a magnetic resonance imaging (MRI) scan, which can serve as a baseline clinical scan to determine normative brain structure, and also as a research scan with the addition of functional sequences (resting state and diffusion tensor imaging). The MR protocol used was initially based on the ADNI-3 basic protocol, but was later modified to include portions of the ABCD protocol in the following manner:
At the time of the MRI scan, volunteers are administered a subset of tasks from the NIH Toolbox Cognition Battery. The four tasks include:
An optional MEG study was added to the protocol approximately one year after the study was initiated, so there are relatively fewer MEG recordings in comparison to the MRI dataset. MEG studies are performed on a 275-channel CTF MEG system (CTF MEG, Coquitlam BC, Canada). The position of the head was localized at the beginning and end of each recording using three fiducial coils. These coils were placed 1.5 cm above the nasion, and at each ear, 1.5 cm from the tragus on a line between the tragus and the outer canthus of the eye. For 48 participants (as of 2/1/2022), photographs were taken of the three coils and used to mark the points on the T1-weighted structural MRI scan for co-registration. For the remainder of the participants (n=16 as of 2/1/2022), a Brainsight neuronavigation system (Rogue Research, Montréal, Québec, Canada) was used to co-register the MRI and fiducial localizer coils in real time prior to MEG data acquisition.
Online and In-person behavioral and clinical measures, along with the corresponding phenotype file name, sorted first by measurement location and then by file name.
| Location | Measure | File Name |
|---|---|---|
| Online | Alcohol Use Disorders Identification Test (AUDIT) | audit |
| | Demographics | demographics |
| | DSM-5 Level 2 Substance Use - Adult | drug_use |
| | Edinburgh Handedness Inventory (EHI) | ehi |
| | Health History Form | health_history_questions |
| | Perceived Health Rating - self | health_rating |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Aggregated data by county and year of registration. NIRVAR statistics on persons declared incompetent by court procedure. Statistical data of the Register of Inactive and Limitedly Active Persons (NIRVAR). The geographical sample of the data is the entire territory of the country.
As per our latest research, the global Pet Wearable Data Aggregation Platforms market size in 2024 is valued at USD 2.37 billion, reflecting the rapid adoption of connected pet devices worldwide. The market is experiencing robust expansion with a compound annual growth rate (CAGR) of 16.5% from 2025 to 2033. By leveraging this CAGR, the market is forecasted to reach USD 9.44 billion by 2033. This impressive growth trajectory is primarily driven by rising pet ownership, increasing awareness about pet health, and the technological advancements in IoT-enabled pet wearables that facilitate real-time data aggregation and analysis.
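As a rough check on how such a forecast follows from a growth rate, the compound-growth arithmetic can be sketched as below. The choice of nine compounding periods (2024 base year to 2033) is an assumption; the report does not state how it counts periods, so the result only approximates the quoted figure. The function names are illustrative.

```python
def project_value(base_value: float, cagr: float, periods: int) -> float:
    """Compound a base value forward: base * (1 + cagr) ** periods."""
    return base_value * (1.0 + cagr) ** periods


def implied_cagr(start_value: float, end_value: float, periods: int) -> float:
    """Growth rate implied by a start value, an end value, and a number of periods."""
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Figures quoted in the text (USD billion); nine periods is an assumption.
BASE_2024 = 2.37
QUOTED_2033 = 9.44
QUOTED_CAGR = 0.165
PERIODS = 9

projected_2033 = project_value(BASE_2024, QUOTED_CAGR, PERIODS)
rate_from_endpoints = implied_cagr(BASE_2024, QUOTED_2033, PERIODS)

print(f"projected 2033 value: {projected_2033:.2f} (quoted: {QUOTED_2033})")
print(f"CAGR implied by endpoints: {rate_from_endpoints:.1%} (quoted: {QUOTED_CAGR:.1%})")
```

Running the same two functions on other base values, rates, and horizons gives a quick consistency check between a quoted CAGR and its endpoint figures.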
One of the most significant growth factors for the Pet Wearable Data Aggregation Platforms market is the increasing humanization of pets, particularly in developed economies. Pet owners are now more inclined to invest in advanced technology that ensures the well-being, safety, and health monitoring of their animals. The proliferation of smart devices tailored for pets, such as collars, harnesses, and vests embedded with sensors, allows for continuous health and activity tracking. These devices collect large volumes of data, which can be aggregated and analyzed on centralized platforms to provide actionable insights for both pet owners and veterinarians. The trend is further supported by growing disposable incomes and the willingness of consumers to spend more on premium pet care solutions, fueling the demand for sophisticated wearable solutions.
Another key driver is the integration of artificial intelligence and machine learning algorithms into pet wearable data aggregation platforms. These advancements enable predictive analytics, early diagnosis of health issues, and personalized recommendations for pet care. The ability to remotely monitor pets' vital signs, activity levels, and behavioral patterns has proven invaluable during the post-pandemic era, where remote veterinary consultations and telemedicine services have gained traction. Additionally, the increasing prevalence of chronic diseases among pets, such as obesity and diabetes, is prompting pet owners to seek continuous health monitoring solutions, further propelling the adoption of pet wearable data aggregation platforms.
The growing ecosystem of partnerships between device manufacturers, software developers, and veterinary service providers is also contributing to market expansion. Collaborative efforts are leading to the creation of interoperable platforms that can integrate data from multiple devices, streamline information flow, and enhance the overall user experience. Moreover, data security and privacy have become paramount, prompting vendors to invest in secure cloud-based solutions that comply with global data protection regulations. These factors collectively are fostering an environment conducive to rapid market growth, as stakeholders across the value chain recognize the transformative potential of pet wearable data aggregation platforms.
Regionally, North America dominates the Pet Wearable Data Aggregation Platforms market, accounting for the largest share due to high pet ownership rates, advanced veterinary infrastructure, and the presence of leading market players. Europe follows closely, driven by similar trends and increasing awareness about animal welfare. The Asia Pacific region is emerging as a lucrative market, supported by rising disposable incomes, urbanization, and a burgeoning pet population. Latin America and the Middle East & Africa are also witnessing gradual adoption, albeit at a slower pace, as awareness and infrastructure continue to develop. Each region presents unique growth opportunities and challenges, shaping the global landscape of the pet wearable data aggregation platforms market.
The Pet Wearable Data Aggregation Platforms market by component is segmented into hardware, software, and services. Hardware forms the backbone of this ecosystem, comprising smart collars, harnesse
The Measurable AI Amazon Consumer Transaction Dataset is a leading source of email receipts and consumer transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.
We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.
Use Cases
Our clients leverage our datasets to produce actionable consumer insights such as:
- Market share analysis
- User behavioral traits (e.g. retention rates)
- Average order values
- Promotional strategies used by the key players

Several of our clients also use our datasets for forecasting and understanding industry trends better.

Coverage
- Asia (Japan)
- EMEA (Spain, United Arab Emirates)

Granular Data
Itemized, high-definition data per transaction level with metrics such as:
- Order value
- Items ordered
- No. of orders per user
- Delivery fee
- Service fee
- Promotions used
- Geolocation data and more

Aggregate Data
- Weekly/monthly order volume
- Revenue delivered in aggregate form, with historical data dating back to 2018

All the transactional e-receipts are sent from app to users’ registered accounts.
Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.
Our dataset is GDPR compliant, contains no PII information and is aggregated & anonymized with user consent. Contact business@measurable.ai for a data dictionary and to find out our volume in each country.
This dataset contains aggregate data on violent index victimizations at the quarter level of each year (i.e., January – March, April – June, July – September, October – December), from 2001 to the present (1991 to present for Homicides), with a focus on those related to gun violence. Index crimes are 10 crime types selected by the FBI (codes 1-4) for special focus due to their seriousness and frequency. This dataset includes only those index crimes that involve bodily harm or the threat of bodily harm and are reported to the Chicago Police Department (CPD). Each row is aggregated up to victimization type, age group, sex, race, and whether the victimization was domestic-related. Aggregating at the quarter level provides large enough blocks of incidents to protect anonymity while allowing the end user to observe inter-year and intra-year variation. Any row where there were fewer than three incidents during a given quarter has been deleted to help prevent re-identification of victims. For example, if there were three domestic criminal sexual assaults during January to March 2020, all victims associated with those incidents have been removed from this dataset. Human trafficking victimizations have been aggregated separately due to the extremely small number of victimizations. This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the "shooting" column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010.
As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as “Y.” Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized dataset, but with "UNKNOWN" in the shooting column. The dataset is refreshed daily, but excludes the most recent complete day to allow CPD time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information. How does this dataset classify victims? The methodology by which this dataset classifies victims of violent crime differs by victimization type: Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table. To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal). If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset. 
For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR code in CPD's Crimes and Vi
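The quarter-level aggregation and small-cell suppression described in this entry can be sketched as follows. This is a minimal illustration, not the City's actual pipeline: the record fields, the sample values, and the function names are hypothetical, and the threshold follows the stated rule ("fewer than three"). Because the description's own example suggests the cutoff might be applied differently, the threshold is left as a parameter.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple

# (quarter label, victimization type, age group, domestic-related?)
GroupKey = Tuple[str, str, str, bool]

MIN_INCIDENTS = 3  # rows with fewer incidents than this are dropped (stated rule)


def quarter_label(d: date) -> str:
    """Map a date to its calendar quarter, e.g. 2020-02-14 -> '2020-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def aggregate_and_suppress(
    records: Iterable[Tuple[date, str, str, bool]],
    min_incidents: int = MIN_INCIDENTS,
) -> Dict[GroupKey, int]:
    """Count victimizations per group and drop groups below the threshold."""
    counts: Counter = Counter(
        (quarter_label(d), vtype, age_group, domestic)
        for d, vtype, age_group, domestic in records
    )
    return {key: n for key, n in counts.items() if n >= min_incidents}


# Hypothetical records: (date, victimization type, age group, domestic-related?)
sample_records = [
    (date(2020, 1, 5), "BATTERY", "20-29", False),
    (date(2020, 2, 14), "BATTERY", "20-29", False),
    (date(2020, 3, 30), "BATTERY", "20-29", False),
    (date(2020, 3, 2), "ROBBERY", "30-39", False),
    (date(2020, 4, 1), "BATTERY", "20-29", False),
]
published = aggregate_and_suppress(sample_records)
```

In this sample, only the group with three records in the first quarter survives; the single robbery and the lone second-quarter battery fall below the threshold and are dropped.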
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fictitious example data about breathlessness in British coal miners taken from table 14.2 b in [7] with life tables for England and Wales in the general population and in the population with breathlessness stratified by 5-year age groups from 20 to 64.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fictitious example data about breathlessness in British coal miners taken from table 14.2 a in [7] stratified by age groups with numbers of persons observed, number of persons with breathlessness and age-specific prevalence for 5-year age groups from 20 to 64.
The data underlying this published work have been made publicly available in this repository as part of the IMASC Data Management Plan. This work was supported as part of the Integrated Mesoscale Architectures for Sustainable Catalysis (IMASC), an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award # DE-SC0012573.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset encompasses an Excel datasheet containing aggregate data from thin section analyses of ice samples from the Fimbule Ice Shelf, carried out during the 2021/2022 Antarctic expedition. The research forms a crucial part of the Master's research project: "Development of a Multi-tier System for the Analysis of Ice Crystallography of Antarctic Shelf Ice", conducted by Steven McEwen. Each entry in the datasheet corresponds to a specific thin section or ice grain and includes the following parameters: Grid Number, A1 Axis Reading, A4 Reading, Corrected A4 values, number of readings, Mean C-axis Orientation, Grain Size, Date, Sample number, x-coordinate, y-coordinate, Degree of Orientation, and Spherical Aperture. These data points collectively facilitate a comprehensive understanding of the crystallography of the Fimbule Ice Shelf's ice samples. Data was collected and analyzed during the 2021/2022 Antarctic summer expedition, with additional analysis being performed in the Polar engineering Research Group's laboratory.
Abstract
The dataset provided here contains the efforts of independent data aggregation, quality control, and visualization of the University of Arizona (UofA) COVID-19 testing programs for the 2019 novel Coronavirus pandemic. The dataset is provided in the form of machine-readable tables in comma-separated value (.csv) and Microsoft Excel (.xlsx) formats.

Additional Information
As part of the UofA response to the 2019-20 Coronavirus pandemic, testing was conducted on students, staff, and faculty prior to the start of the academic year and throughout the school year. Testing was done at the UofA Campus Health Center and through their instance program called "Test All Test Smart" (TATS). These tests identify active cases of SARS-CoV-2 infection using the reverse transcription polymerase chain reaction (RT-PCR) test and the Antigen test. Because the Antigen test provided more rapid diagnosis, it was used heavily in the three weeks prior to the start of the Fall semester and throughout the academic year.

As these tests were occurring, results were provided on the COVID-19 websites. First, beginning in early March, the Campus Health Alerts website reported the total number of positive cases. Later, numbers were provided for the total number of tests (March 12 and thereafter). According to the website, these numbers were updated daily for positive cases and weekly for total tests. These numbers were reported until early September, when they were included in the reporting for the TATS program.

For the TATS program, numbers were provided through the UofA COVID-19 Update website. Initially, on August 21, the numbers provided were the total number (July 31 and thereafter) of tests and positive cases. Later (August 25), additional information was provided where both PCR and Antigen testing were available. Here, the daily numbers were also included. On September 3, this website then provided both the Campus Health and TATS data.
Here, PCR and Antigen were combined and referred to as "Total", and daily and cumulative numbers were provided.

No official data dashboard was available until September 16, and aside from the information provided on these websites, the full dataset was not made publicly available. As such, the authors of this dataset independently aggregated data from multiple sources. These data were made publicly available through a Google Sheet, with graphical illustration provided through the spreadsheet and on social media. The goal of providing the data and illustrations publicly was to provide factual information and to understand the infection rate of SARS-CoV-2 in the UofA community.

Because of differences in reported data between Campus Health and the TATS program, the dataset provides Campus Health numbers on September 3 and thereafter. TATS numbers are provided beginning on August 14, 2020.

Description of Dataset Content
The following terms are used in describing the dataset.
1. "Report Date" is the date and time at which the website was updated to reflect the new numbers
2. "Test Date" is the date of testing/sample collection
3. "Total" is the combination of Campus Health and TATS numbers
4. "Daily" is the new data associated with the Test Date
5. "To Date (07/31--)" provides the cumulative numbers from 07/31 and thereafter
6. "Sources" provides the source of information. The number prior to the colon refers to the number of sources. Here, "UACU" refers to the UA COVID-19 Update page, and "UARB" refers to the UA Weekly Re-Entry Briefing. "SS" and "WBM" refer to screenshot (manually acquired) and "Wayback Machine" (see Reference section for links), with initials provided to indicate which author recorded the values. These screenshots are available in the records.zip file.

The dataset is distinguished where available by the testing program and the methods of testing.
Where data are not available, calculations are made to fill in missing data (e.g., extrapolating backwards on the total number of tests based on daily numbers that are deemed reliable). Where errors are found (by comparing to previous numbers), those are reported on the above Google Sheet with specifics noted.

For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests, trouble downloading, etc.) can be directed to data-management@arizona.edu
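The backward extrapolation mentioned in the description (recovering earlier cumulative totals from a later reported total and reliable daily numbers) can be sketched as below. This is only an illustration of the arithmetic under stated assumptions: the function name and the numbers are hypothetical, and the authors' actual spreadsheet calculations may differ.

```python
from typing import List


def backfill_cumulative(known_total: int, daily_counts: List[int]) -> List[int]:
    """Reconstruct end-of-day cumulative totals by walking backwards.

    daily_counts is ordered oldest to newest, and known_total is the
    cumulative total reported at the end of the last day in the list.
    """
    totals = [0] * len(daily_counts)
    running = known_total
    for i in range(len(daily_counts) - 1, -1, -1):
        totals[i] = running
        running -= daily_counts[i]  # step back to the end of the previous day
    if running < 0:
        raise ValueError("daily counts exceed the known cumulative total")
    return totals


# Hypothetical example: 1,000 tests reported to date on the last of three days.
example_daily = [100, 150, 250]
example_totals = backfill_cumulative(1000, example_daily)
```

The negative-total check is a simple safeguard of the kind the description alludes to when it mentions finding errors by comparing against previous numbers.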