Collect and combine data from multiple internal and external data sources for exposure to consumers. Data for any individual is made available via a standard set of hierarchical HTTP resources through the Read Service. The VRS calls the ISIC external Producer endpoints to fetch and aggregate Care Coordinator Profiles VLER document type data and convert it to an XML Atom feed format for the Consumer.
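As a rough illustration of this aggregation pattern, the sketch below pulls profile records from a set of producer endpoints and renders them as an Atom feed. The endpoint URLs, JSON field names, and feed metadata are illustrative assumptions, not the actual ISIC/VRS interface.

```python
# Minimal sketch (Python): aggregate records from several producer endpoints and
# expose them as an Atom feed. Endpoint URLs, field names, and the JSON shape are
# illustrative assumptions, not the actual ISIC/VRS contract.
import json
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

PRODUCER_ENDPOINTS = [
    "https://producer-a.example/profiles",   # hypothetical endpoint
    "https://producer-b.example/profiles",   # hypothetical endpoint
]
ATOM_NS = "http://www.w3.org/2005/Atom"

def fetch_profiles(url):
    """Fetch a list of profile records (assumed to be a JSON array) from one producer."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def build_atom_feed(records):
    """Render the aggregated records as an Atom feed document."""
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = "Care Coordinator Profiles"
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = datetime.now(timezone.utc).isoformat()
    for rec in records:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = str(rec.get("id", ""))
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = rec.get("name", "profile")
        ET.SubElement(entry, f"{{{ATOM_NS}}}content").text = json.dumps(rec)
    return ET.tostring(feed, encoding="unicode")

if __name__ == "__main__":
    aggregated = []
    for url in PRODUCER_ENDPOINTS:
        aggregated.extend(fetch_profiles(url))
    print(build_atom_feed(aggregated))
```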
Envestnet® | Yodlee®'s Retail Transaction Data (Aggregate/Row) Panels consist of de-identified, near-real-time (T+1) USA credit/debit/ACH transaction-level data, offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet® | Yodlee® financial technology platform.
Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.
We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.
Investors, corporate researchers, and corporates can use our data to answer key business questions such as:
- How much are consumers spending with specific merchants/brands, and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company's market share in a region for similar customers?
- Is the company's loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?
Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation via metrics such as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.
Use Case Categories (our data supports countless use cases, and we look forward to working with new ones): 1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking
2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)
3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence
4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Protein aggregation occurs when misfolded or unfolded proteins physically bind together and can promote the development of various amyloid diseases. This study aimed to construct surrogate models for predicting protein aggregation via data-driven methods using two types of databases. First, an aggregation propensity score database was constructed by calculating scores for protein structures in the Protein Data Bank using Aggrescan3D 2.0. Feature- and graph-based models for predicting protein aggregation were then developed using this database. The graph-based model outperformed the feature-based model, achieving an R² of 0.95, although it intrinsically requires protein structures. Second, for the experimental data, a feature-based model was built using the Curated Protein Aggregation Database 2.0 to predict the aggregated intensity curves. In summary, this study suggests approaches that are more effective in predicting protein aggregation, depending on the type of descriptor and the database.
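As a rough illustration of the feature-based approach, the sketch below fits a regression model mapping per-protein descriptors to an aggregation propensity score and reports a test R². The features and scores are synthetic stand-ins; the study's actual descriptors, Aggrescan3D scores, and model choices are not reproduced here.

```python
# Minimal sketch (Python/scikit-learn) of a feature-based surrogate model that maps
# per-protein descriptors to an aggregation propensity score. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_proteins, n_features = 500, 20                      # e.g. composition/hydrophobicity descriptors
X = rng.normal(size=(n_proteins, n_features))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=n_proteins)  # synthetic scores

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print("test R2:", r2_score(y_test, model.predict(X_test)))
```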
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains interpolated and aggregated soil and climate data for the region of North Rhine-Westphalia (Germany). The data is provided on grids of 1, 10, 25, 50 and 100 km resolution. These grids are spatial aggregations of climate data at approximately 1 km resolution and soil data at approximately 300 m resolution. The data is intended as input for crop models and therefore contains the key soil and climate variables needed to run them. Additionally, the data is specifically designed to analyze effects of scale and resolution in crop models, e.g. data aggregation effects. It has been used for several studies on spatial scales with regard to different scaling approaches, crops, crop models, model output variables, production situations and crop management, among others.
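A minimal sketch of the kind of spatial aggregation involved, assuming a simple block average from a fine grid to coarser grids; real inputs would be read from raster files rather than generated.

```python
# Minimal sketch (Python/NumPy): aggregate a fine-resolution raster to a coarser grid
# by block averaging, analogous to aggregating ~1 km climate rasters to 10/25/50/100 km
# grids. The array here is synthetic; real data would come from a GeoTIFF/NetCDF reader.
import numpy as np

def block_aggregate(raster, factor, reducer=np.nanmean):
    """Aggregate a 2-D raster by an integer factor using the given reducer."""
    rows, cols = raster.shape
    rows -= rows % factor                 # trim edges that do not fill a full block
    cols -= cols % factor
    trimmed = raster[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return reducer(blocks, axis=(1, 3))

fine = np.random.default_rng(1).normal(loc=10.0, scale=2.0, size=(1000, 1200))  # ~1 km cells
coarse_10km = block_aggregate(fine, 10)   # mean over 10x10 blocks
coarse_100km = block_aggregate(fine, 100)
print(fine.shape, coarse_10km.shape, coarse_100km.shape)
```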
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this study, we merge the results of two recent directions in efficiency analysis research (aggregation and bootstrap), applied, as an example, to one of the most popular point estimators of individual efficiency: the data envelopment analysis (DEA) estimator. A natural context for the methodology developed here is a study of the efficiency of a particular economic system (e.g., an industry) as a whole, or a comparison of the efficiencies of distinct groups within such a system (e.g., regulated vs. non-regulated firms or private vs. public firms). Our methodology is justified by (neoclassical) economic theory and is supported by carefully adapted statistical methods.
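For illustration, the sketch below implements the standard input-oriented, constant-returns DEA estimator of individual efficiency with a linear-programming solver and combines the scores into an output-share-weighted aggregate. It uses synthetic data and does not reproduce the paper's bootstrap procedure for inference on the aggregate.

```python
# Minimal sketch (Python/SciPy) of the input-oriented CRS DEA estimator of individual
# efficiency plus a simple output-weighted aggregate over firms. Data are synthetic;
# the bootstrap inference developed in the paper is not reproduced here.
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, o):
    """Input-oriented CRS DEA score for unit o, given inputs X (n x m) and outputs Y (n x s)."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.r_[1.0, np.zeros(n)]                          # minimize theta
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])         # sum_j lam_j x_ij - theta x_io <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])          # -sum_j lam_j y_rj <= -y_ro
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[o]]
    bounds = [(0, None)] * (n + 1)                       # theta >= 0, lambda >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]

rng = np.random.default_rng(2)
X = rng.uniform(1, 10, size=(30, 2))                     # 30 firms, 2 inputs
Y = (X.sum(axis=1) * rng.uniform(0.5, 1.0, 30)).reshape(-1, 1)  # 1 output
scores = np.array([dea_efficiency(X, Y, o) for o in range(len(X))])
weights = Y[:, 0] / Y[:, 0].sum()                        # output-share weights
print("aggregate (output-weighted) efficiency:", float(scores @ weights))
```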
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
MATLAB code to reproduce results presented in the paper "Privacy-Preserving Data Aggregation with Probabilistic Range Validation".
The source code is available as a git repository.
The source code was published by the paper's authors several years after the paper was published.
Git repository
Relevant code is stored in the src directory.
The scripts measure and visualise the various metrics shown in the paper. The settings in the scripts correspond exactly to those used to produce the results in the paper. The code is fully deterministic and gives exactly the same results each time.
Unfortunately, Figure 6 in the paper was generated with a version of this code in which the seed for the random number generator was not configured correctly, and as a result Figure 6 cannot be recreated exactly. However, the outputs of the scripts in the repository are not significantly different from the published Figure 6, and do not undermine or alter the conclusions in any significant way.
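The reproducibility point is general: stochastic scripts only give identical outputs when the random number generator seed is fixed. A tiny illustration in Python (not the repository's MATLAB code):

```python
# Minimal illustration: fixing the RNG seed makes stochastic scripts reproducible.
import numpy as np

def simulate(seed):
    rng = np.random.default_rng(seed)
    return rng.normal(size=5).round(3).tolist()

print(simulate(42))   # identical on every run
print(simulate(42))   # same output: the seed pins down the random stream
print(simulate(7))    # a different seed gives a different, but still reproducible, stream
```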
See ARTIFACT-EVALUATION.md in the root folder for detailed end-user instructions.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/OOIEAO
Most measures of social conflict processes are derived from primary and secondary source reports. In many cases, reports are used to create event-level data sets by aggregating information from multiple, and often conflicting, reports to single event observations. We argue this pre-aggregation is less innocuous than it seems, costing applied researchers opportunities for improved inference. First, researchers cannot evaluate the consequences of different methods of report aggregation. Second, aggregation discards report-level information (i.e., variation across reports) that is useful in addressing measurement error inherent in event data. Therefore, we advocate that data should be supplied and analyzed at the report level. We demonstrate the consequences of using aggregated event data as a predictor or outcome variable, and how analysis can be improved using report-level information directly. These gains are demonstrated with simulated-data experiments and in the analysis of real-world data, using the newly available Mass Mobilization in Autocracies Database (MMAD).
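A small simulated illustration of the argument: an event-level covariate built by averaging conflicting reports carries measurement error that attenuates regression slopes, and the within-event spread of reports (lost after pre-aggregation) supports a standard errors-in-variables correction. All numbers are simulated; this is not the MMAD analysis itself.

```python
# Minimal sketch (Python/NumPy): why keeping report-level data helps. The within-event
# variance of conflicting reports estimates the measurement-error variance of the
# pre-aggregated (averaged) predictor, allowing a classical attenuation correction.
import numpy as np

rng = np.random.default_rng(3)
n_events, n_reports = 500, 3
true_x = rng.normal(size=n_events)                     # true event characteristic
reports = true_x[:, None] + rng.normal(scale=1.0, size=(n_events, n_reports))
y = 2.0 * true_x + rng.normal(scale=0.5, size=n_events)

x_bar = reports.mean(axis=1)                           # pre-aggregated predictor
var_xbar = np.var(x_bar, ddof=1)
naive_slope = np.cov(x_bar, y)[0, 1] / var_xbar        # attenuated by measurement error

# Report-level information: within-event variance estimates the error variance of x_bar.
err_var = reports.var(axis=1, ddof=1).mean() / n_reports
reliability = (var_xbar - err_var) / var_xbar
corrected_slope = naive_slope / reliability

print(f"naive slope: {naive_slope:.2f}, corrected: {corrected_slope:.2f} (true 2.0)")
```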
This table includes platform data for Facebook participants in the Deactivation experiment. Each row of the dataset corresponds to data from a participant’s Facebook user account. Each column contains a value, or set of values, that aggregates log data for this specific participant over a certain period of time.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data used in the forthcoming “The modifiable areal unit problem in geospatial least-cost electrification modelling” publication.
The work describes how different methods of aggregating population data affect the results produced by the Open Source Spatial Electrification Tool (OnSSET, https://github.com/OnSSET). In the initial study, three countries were assessed: Benin, Malawi and Namibia. These countries were chosen for their different national population densities and starting electrification rates. This repository includes three zipped files, one for each country, containing the 26 input files used in the study. These input files were generated with the QGIS tools published in the OnSSET repository (https://github.com/onsset). The repository also contains a file describing the naming conventions for the results and the summary files generated with OnSSET.
For more information on how to generate these datasets, please refer to the following GitHub repository https://github.com/babakkhavari/MAUP and the corresponding publication (To Be Added)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Main soil types in North Rhine-Westphalia as influenced by aggregation.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Recent changes in institutional cyberinfrastructure and collections data storage methods have dramatically improved accessibility of specimen-based data through the use of digital databases and data aggregators. This analysis of digitized fish collections in the U.S. demonstrates how information from data aggregators, in this case iDigBio, can be extracted and analyzed. Data from U.S. institutional fish collections in iDigBio were explored through a strictly programmatic approach using the ridigbio package and fishfindR web application. iDigBio facilitates the aggregation of collections data on a purely voluntary basis that requires collection staff to consent to sharing of their data. Not all collections are sharing their data with iDigBio, but the data harvested from 38 of the 143 known fish collections in the U.S. that are in iDigBio account for the majority of fish specimens housed in U.S. collections. In the 22 years since publication of the last survey providing information on these 38 collections, 1,219,168 specimen records (lots), 15,225,744 specimens, 3,192 primary types, and 32,868 records of secondary types have been added. This is an increase of 65.1% in the number of cataloged records and an increase of 56.1% in the number of specimens. In addition to providing specimen-based data for research, education, and various outreach activities, data that are accessible via data aggregators can be used to develop accurate, up-to-date reports of information on institutional collections. Such reports present collections data in an organized and accessible fashion and can guide targeted efforts by collections personnel to meet discipline-specific needs and make data more transparent to downstream users. Data from this survey will be updated and published regularly in a dynamic web application that will aid collections staff in communicating collections value while simultaneously giving stakeholders a way to explore collections holdings as they relate to the institutions in which they are housed. It is through this resource that collections will be able to leverage their data against those of similar collections to aid in the procurement of financial and institutional support.
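For readers who want to reproduce this kind of harvest outside R, the sketch below queries what I understand to be iDigBio's public search API (the same service the ridigbio package wraps); the endpoint path, query fields, and institution code are assumptions to check against the API documentation.

```python
# Minimal sketch (Python): pull specimen records from the iDigBio search API, roughly
# what ridigbio does in R. The endpoint and query field names below are assumptions;
# the institution code is a placeholder.
import json
import urllib.parse
import urllib.request

BASE = "https://search.idigbio.org/v2/search/records/"   # assumed public endpoint

def search_records(rq, limit=10):
    """Run a record query (rq) against the iDigBio search API and return parsed JSON."""
    params = urllib.parse.urlencode({"rq": json.dumps(rq), "limit": limit})
    with urllib.request.urlopen(f"{BASE}?{params}", timeout=30) as resp:
        return json.load(resp)

# Example: ray-finned fish records from one (placeholder) institution code.
result = search_records({"class": "actinopterygii", "institutioncode": "EXAMPLE"})
print(result.get("itemCount"), "matching records")
```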
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Health data and environmental data are commonly collected at different levels of aggregation. A persistent challenge of using a spatial regression model to link these data is that their associations can vary as a function of aggregation. This results in ecological fallacy if an association at one aggregation level is used for inference at another level. We address this challenge by presenting a hierarchically adaptable spatial regression model. In essence, the model extends the spatially varying coefficient model to allow the response to be count data at larger aggregation levels than that of the covariates. A Bayesian hierarchical approach is used to infer the model parameters. Robust inference and optimal prediction over geographical space and at different spatial aggregation levels are studied using simulated data sets. The spatial associations at different spatial supports are largely different, but can be efficiently inferred when prior knowledge of the associations is available. The model is applied to study hand, foot and mouth disease (HFMD) in Da Nang city, Viet Nam. A decrease in vegetated areas corresponds with elevated HFMD risk. A study of the identifiability of the parameters shows a strong need for a highly informative prior distribution. We conclude that the model is robust to the underlying aggregation levels of the calibrating data for association inference and it is ready for application in health geography.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The DIAMAS project investigates Institutional Publishing Service Providers (IPSP) in the broadest sense, with a special focus on those publishing initiatives that do not charge fees to authors or readers. To collect information on Institutional Publishing in the ERA, a survey was conducted among IPSPs between March and May 2024. This dataset contains aggregated data from the 685 valid responses to the DIAMAS survey on Institutional Publishing.
The dataset supplements D2.3 Final IPSP landscape Report Institutional Publishing in the ERA: results from the DIAMAS survey.
The data
Basic aggregate tabular data
Full individual survey responses are not being shared, to prevent the easy identification of respondents (in line with conditions set out in the survey questionnaire). This dataset contains full tables with aggregate data for all questions from the survey, with the exception of free-text responses, from all 685 survey respondents. This includes, per question, overall totals and percentages for the answers given, as well as the breakdown by both IPSP types: institutional publishers (IPs) and service providers (SPs). Tables at country level have not been shared, as cell values were often so low that respondents could potentially be identified. The data is available in csv and docx formats, with csv files grouped and packaged into ZIP files. Metadata describing data type, question type, as well as question response rate, is available in csv format. The R code used to generate the aggregate tables is made available as well.
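For illustration, the sketch below shows the kind of per-question aggregation these tables contain (counts and percentages overall and by IPSP type), written in Python with made-up column names; the code that actually produced the released tables is the R scripts listed below.

```python
# Minimal sketch (Python/pandas) of per-question aggregation: counts and percentages
# per answer option, overall and broken down by respondent type (IP vs SP).
# Column names and responses are illustrative, not the DIAMAS survey schema.
import pandas as pd

responses = pd.DataFrame({
    "respondent_type": ["IP", "IP", "SP", "IP", "SP"],
    "Q1_charges_fees": ["No", "No", "Yes", "No", "No"],
})

def aggregate_question(df, question):
    """Counts and percentages for one question, overall and per respondent type."""
    overall = df[question].value_counts().rename("n").to_frame()
    overall["pct"] = (100 * overall["n"] / overall["n"].sum()).round(1)
    by_type = (
        df.groupby("respondent_type")[question]
          .value_counts()
          .unstack(fill_value=0)
    )
    return overall, by_type

overall, by_type = aggregate_question(responses, "Q1_charges_fees")
print(overall)
print(by_type)
```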
Files included in this dataset
survey_questions_data_description.csv - metadata describing data type, question type, as well as question response rate per survey question.
tables_raw_all.zip - raw tables (csv format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option. Zip file contains 180 csv files.
tables_raw_IP.zip - as tables_raw_all.zip, for responses from institutional publishers (IP) only. Zip file contains 180 csv files.
tables_raw_SP.zip - as tables_raw_all.zip, for responses from service providers (SP) only. Zip file contains 170 csv files.
tables_formatted_all.docx - formatted tables (docx format) with aggregated data per question for all respondents, with the exception of free-text responses. Questions with multiple answers have a table for each answer option.
tables_formatted_IP.docx - as tables_formatted_all.docx, for responses from institutional publishers (IP) only.
tables_formatted_SP.docx - as tables_formatted_all.docx, for responses from service providers (SP) only.
DIAMAS_Tables_single.R - R script used to generate raw tables with aggregated data for all single response questions
DIAMAS_Tables_multiple.R - R script used to generate raw tables with aggregated data for all multiple response questions
DIAMAS_Tables_layout.R - R script used to generate document with formatted tables from raw tables with aggregated data
DIAMAS Survey on Institutional Publishing - data availability statement (pdf)
All data are made available under a CC0 license.
Biomass, Biodiversity Effects, and Diversity for all three years: SPACEdata_Dryad.xlsx
We investigate the potential of transparency to influence committee decision-making. We present a model in which career-concerned committee members receive private information of different type-dependent accuracy, deliberate, and vote. We study three levels of transparency under which career concerns are predicted to affect behavior differently and test the model's key predictions in a laboratory experiment. The model's predictions are largely borne out: transparency negatively affects information aggregation at the deliberation and voting stages, leading to sharply different committee error rates than under secrecy. This occurs despite subjects revealing more information under transparency than theory predicts.
Aggregation of the generic tables describing the noise zones for an infrastructure, with infrastructure type ROUTE (road, R), map type C, and the LD (Lden) index.
Road infrastructure concerned: A68, C1_albi, C1_castres, D100, D1012, D13, D612, D622, D630, D631, D69, D800, D81, D84, D87, D88, D912, D926, D968, D988, D999A, D999, N112, N126, N88
Limit value exceedance maps (or "type c" maps) are the maps to be produced within the framework of the strategic noise maps (CBS) pursuant to Article 3-II-1°-c of the Decree of 24 March 2006. These are two maps representing the areas where the Lden limit values are exceeded for the year in which the maps are drawn up.
The Lden sound level indicator stands for Level day-evening-night. It corresponds to an equivalent sound level over 24 hours in which evening and night noise levels are increased by 5 and 10 dB(A), respectively, to reflect the greater discomfort caused during these periods.
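For reference, a small sketch of the standard Lden calculation described above (12 h day, 4 h evening, 8 h night, with the 5 and 10 dB(A) penalties), as defined in the EU Environmental Noise Directive; the input levels are illustrative.

```python
# Minimal sketch (Python): the standard Lden formula combining day, evening, and night
# levels with the evening/night penalties. Input values are illustrative.
import math

def lden(l_day, l_evening, l_night):
    """Day-evening-night level in dB(A) from the three period levels."""
    return 10 * math.log10(
        (12 * 10 ** (l_day / 10)
         + 4 * 10 ** ((l_evening + 5) / 10)
         + 8 * 10 ** ((l_night + 10) / 10)) / 24
    )

print(round(lden(65, 60, 55), 1))  # 65.0: all three penalized period levels happen to equal 65
```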
Aggregation obtained by the QGIS MIZOGEO plugin made available by CEREMA.
Data source by infrastructure: CEREMA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and inferred networks accompanying the manuscript entitled “Aggregation of recount3 RNA-seq data improves the inference of consensus and context-specific gene co-expression networks”.
Authors: Prashanthi Ravichandran, Princy Parsana, Rebecca Keener, Kaspar Hansen, Alexis Battle
Affiliations: Johns Hopkins University School of Medicine, Johns Hopkins University Department of Computer Science, Johns Hopkins University Bloomberg School of Public Health
Description:
This folder includes data produced in the analysis contained in the manuscript and inferred consensus and context-specific networks from graphical lasso and WGCNA with varying numbers of edges. Contents include:
all_metadata.rds: File including meta-data columns of study accession ID, sample ID, assigned tissue category, cancer status and disease status obtained through manual curation for the 95,484 RNA-seq samples used in the study.
all_counts.rds: log2-transformed, RPKM-normalized read counts for 5,999 genes and 95,484 RNA-seq samples, which were used for dimensionality reduction and data exploration.
precision_matrices.zip: Zipped folder including networks inferred by graphical lasso for different experiments presented in the paper using weighted covariance aggregation following PC correction.
The networks are organised as follows. First, select the folder corresponding to the network of interest (for example, Blood). This folder contains two or more subfolders indicating the level of data aggregation used (for blood-specific networks, either all samples or GTEx), each of which holds precision matrices inferred across a range of penalization parameters. To view the precision matrix inferred for a particular value of the penalization parameter X, open the file labeled lambda_X.rds.
For select networks, we have included the computed centrality measures which can be accessed at centrality_X.rds for a particular value of the penalization parameter X.
We have also included .rds files that list the hub genes from the consensus networks inferred from non-cancerous samples (“normal_hubs.rds”) and from the consensus networks inferred from cancerous samples (“cancer_hubs.rds”).
The file “context_specific_selected_networks.csv” includes the networks that were selected for downstream biological interpretation based on the scale-free criterion which is also summarized in the Supplementary Tables.
WGCNA.zip: A zipped folder containing gene modules inferred from WGCNA for sequentially aggregated GTEx, SRA, and blood studies. Select the aggregated data source and the number of studies based on the folder names. For example, blood networks inferred from 20 studies can be accessed at blood/consensus/net_20. The individual networks correspond to distinct cut heights and include information on the cut height used, the genes the network was inferred over, merged module labels, and merged module colors.
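As a generic illustration of the graphical-lasso side of this workflow, the sketch below estimates a sparse precision matrix from expression-like data at a single penalty and reads edges off its non-zero off-diagonal entries. The data are synthetic, and the paper's aggregation and PC-correction steps are not reproduced.

```python
# Minimal sketch (Python/scikit-learn) of graphical-lasso network inference: estimate a
# sparse precision matrix at one penalty, then treat non-zero off-diagonal entries as edges.
# Data are synthetic stand-ins for aggregated expression.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(4)
n_samples, n_genes = 300, 40
expr = rng.normal(size=(n_samples, n_genes))          # stand-in for aggregated expression

model = GraphicalLasso(alpha=0.3).fit(expr)           # alpha plays the role of the penalization parameter
precision = model.precision_

edges = np.argwhere(np.triu(np.abs(precision) > 1e-6, k=1))
print(f"{len(edges)} edges at this penalty")
```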
This dataset collects the slides that were presented at the Data Collaborations Across Boundaries session in SciDataCon 2022, part of the International Data Week.
The following session proposal was prepared by Tyng-Ruey Chuang and submitted to SciDataCon 2022 organizers for consideration on 2022-02-28. The proposal was accepted on 2022-03-28. Six abstracts were submitted and accepted to this session. Five presentations were delivered online in a virtual session on 2022-06-21.
Data Collaborations Across Boundaries
There are many good stories about data collaborations across boundaries. We need more. We also need to share the lessons each of us has learned from collaborating with parties and communities not in our familiar circles.
By boundaries, we mean not just the regulatory borders in between the nation states about data sharing but the various barriers, readily conceivable or not, that hinder collaboration in aggregating, sharing, and reusing data for social good. These barriers to collaboration exist between the academic disciplines, between the economic players, and between the many user communities, just to name a few. There are also cross-domain barriers, for example those that lie among data practitioners, public administrators, and policy makers when they are articulating the why, what, and how of "open data" and debating its economic significance and fair distribution. This session aims to bring together experiences and thoughts on good data practices in facilitating collaborations across boundaries and domains.
The success of Wikipedia proves that collaborative content production and service, by ways of copyleft licenses, can be sustainable when coordinated by a non-profit and funded by the general public. Collaborative code repositories like GitHub and GitLab demonstrate the enormous value and mass scale of systems-facilitated integration of user contributions that run across multiple programming languages and developer communities. Research data aggregators and repositories such as GBIF, GISAID, and Zenodo have served numerous researchers across academic disciplines. Citizen science projects and platforms, for instance eBird, Galaxy Zoo, and Taiwan Roadkill Observation Network (TaiRON), not only collect data from diverse communities but also manage and release datasets for research use and public benefit (e.g. TaiRON datasets being used to improve road design and reduce animal mortality). At the same time large scale data collaborations depend on standards, protocols, and tools for building registries (e.g. Archival Resource Key), ontologies (e.g. Wikidata and schema.org), repositories (e.g. CKAN and Omeka), and computing services (e.g. Jupyter Notebook). There are many types of data collaborations. The above lists only a few.
This session proposal calls for contributions to bring forward lessons learned from collaborative data projects and platforms, especially about those that involve multiple communities and/or across organizational boundaries. Presentations focusing on the following (non-exclusive) topics are sought after:
Support mechanisms and governance structures for data collaborations across organizations/communities.
Data policies --- such as data sharing agreements, memorandum of understanding, terms of use, privacy policies, etc. --- for facilitating collaborations across organizations/communities.
Traditional and non-traditional funding sources for data collaborations across multiple parties; sustainability of data collaboration projects, platforms, and communities.
Data workflows --- collection, processing, aggregation, archiving, and publishing, etc. --- designed with considerations of (external) collaboration.
Collaborative web platforms for data acquisition, curation, analysis, visualization, and education.
Examples and insights from data trusts, data coops, as well as other formal and informal forms of data stewardship.
Debates on the pros and cons of centralized, distributed, and/or federated data services.
Practical lessons learned from data collaboration stories: failure, success, incidence, unexpected turn of event, aftermath, etc. (no story is too small!).
On the 8th of September 2022 we carried out a search in the Web of Science with the search string “(Ripley's K function) AND (forest)”. The search yielded 356 hits. We screened those 356 studies for eligibility, first based on the suitability of their article titles and second based on their abstracts (Figure S1). The 240 eligible studies were subsequently screened manually upon reading the entire article based on the following inclusion criteria: (1) The study reported on univariate Ripley's K or L statistics or else it was possible to extract those from figures or maps. (2) The study had been carried out in a woody ecosystem or a rangeland. (3) The univariate Ripley’s K statistics described the distribution of individuals from a single plant species. (4) &...
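For orientation, a minimal sketch of the univariate Ripley's K estimator these studies report, without edge correction and on simulated points; published analyses typically apply edge-corrected estimators.

```python
# Minimal sketch (Python/NumPy) of the univariate Ripley's K estimator:
# K(r) = (A / (n(n-1))) * number of ordered pairs within distance r.
# No edge correction is applied; points are simulated.
import numpy as np

def ripley_k(points, radii, area):
    """Naive (uncorrected) Ripley's K for a 2-D point pattern over the given radii."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)                   # exclude self-pairs
    return np.array([area * (dists <= r).sum() / (n * (n - 1)) for r in radii])

rng = np.random.default_rng(5)
pts = rng.uniform(0, 100, size=(200, 2))              # e.g. trees in a 100 m x 100 m plot
radii = np.array([5.0, 10.0, 20.0])
k_hat = ripley_k(pts, radii, area=100 * 100)
print(k_hat, "vs CSR expectation", np.pi * radii ** 2)
```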
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 1140 series, with data for years 1961 - 2008 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (1 items: Canada ...) Final demand categories (42 items: Total final demand: final expenditure on gross domestic product (GDP); Personal expenditures; furniture and household appliances; Personal expenditures; motor vehicles; parts and repairs; Personal expenditures; other durable goods ...) Commodity (104 items: Total; final demand; Grains; Live animals; Other agricultural products ...).