Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The Registry of Open Data on AWS contains publicly available datasets that are available for access from AWS resources. Note that datasets in this registry are available via AWS resources, but they are not provided by AWS; these datasets are owned and maintained by a variety of government organizations, researchers, businesses, and individuals. This dataset contains derived forms of the data in https://github.com/awslabs/open-data-registry that have been transformed for ease of use with machine interfaces. Currently, only the ndjson form of the registry is populated here.
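As a minimal sketch of consuming the ndjson form: each line holds one JSON object, so the file can be parsed line by line. The field names (`Name`, `Tags`) and the sample records below are illustrative assumptions, not the registry's confirmed schema.

```python
import json

def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON object per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines between records
            records.append(json.loads(line))
    return records

# Hypothetical sample; the real registry's fields may differ.
sample = (
    '{"Name": "example-dataset-a", "Tags": ["satellite"]}\n'
    '{"Name": "example-dataset-b", "Tags": ["genomics"]}\n'
)
records = parse_ndjson(sample)
names = [r["Name"] for r in records]
```

Parsing one line at a time keeps memory use flat if the text is instead streamed from a file handle.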
The purpose of the Fiscal Service Data Registry is to promote the common identification, use and sharing of data/information across the federal government.
https://data.csiro.au/dap/ws/v2/licences/1161
The CSIRO Linked Data Registry provides a service for management of, and public access to, codes, codelists, vocabularies, ontologies and other reference resources authorized or adopted by CSIRO.
It is based on the UK Government Linked Data design for a Linked Data registry developed by Epimorphics.
The Sentinel-2 mission is a land monitoring constellation of two satellites that provide high resolution optical imagery and continuity for the current SPOT and Landsat missions. The mission provides global coverage of the Earth's land surface every 5 days, making the data of great use in ongoing studies. L1C data are available globally from June 2015. L2A data are available from November 2016 over the Europe region and globally since January 2017.
https://spdx.org/licenses/CC0-1.0.html
Objective: To develop a clinical informatics pipeline designed to capture large-scale structured EHR data for a national patient registry.
Materials and Methods: The EHR-R-REDCap pipeline is implemented using R-statistical software to remap and import structured EHR data into the REDCap-based multi-institutional Merkel Cell Carcinoma (MCC) Patient Registry using an adaptable data dictionary.
Results: Clinical laboratory data were extracted from EPIC Clarity across several participating institutions. Labs were transformed, remapped and imported into the MCC registry using the EHR labs abstraction (eLAB) pipeline. Forty-nine clinical tests encompassing 482,450 results were imported into the registry for 1,109 enrolled MCC patients. Data-quality assessment revealed highly accurate, valid labs. Univariate modeling was performed for labs at baseline on overall survival (N=176) using this clinical informatics pipeline.
Conclusion: We demonstrate feasibility of the facile eLAB workflow. EHR data is successfully transformed, and bulk-loaded/imported into a REDCap-based national registry to execute real-world data analysis and interoperability.
Methods
eLAB Development and Source Code (R statistical software):
eLAB is written in R (version 4.0.3), and utilizes the following packages for processing: DescTools, REDCapR, reshape2, splitstackshape, readxl, survival, survminer, and tidyverse. Source code for eLAB can be downloaded directly (https://github.com/TheMillerLab/eLAB).
eLAB reformats EHR data abstracted for an identified population of patients (e.g., a list of medical record numbers (MRNs) and names) under an Institutional Review Board (IRB)-approved protocol. The MCC Patient Registry (MCCPR) does not host MRNs or names; for de-identification, eLAB converts these to MCCPR-assigned record identification numbers (record_id) before import.
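The MRN-to-record_id conversion can be sketched as follows. This is an illustrative Python sketch, not eLAB's R implementation; the column names (`mrn`, `name`) and the choice to skip patients without a registry mapping are assumptions.

```python
def deidentify(rows, mrn_to_record_id):
    """Swap each row's MRN for its registry record_id and drop identifiers.

    Rows whose MRN has no mapping are skipped (an assumption of this sketch).
    """
    cleaned = []
    for row in rows:
        mrn = row["mrn"]
        if mrn not in mrn_to_record_id:
            continue  # patient is not enrolled in the registry
        # Copy every field except the direct identifiers.
        out = {k: v for k, v in row.items() if k not in ("mrn", "name")}
        out["record_id"] = mrn_to_record_id[mrn]
        cleaned.append(out)
    return cleaned

# Hypothetical mock data for demonstration only.
mapping = {"0001234": 17}
rows = [
    {"mrn": "0001234", "name": "Doe, Jane", "lab": "potassium", "value": "4.1"},
    {"mrn": "0009999", "name": "Roe, Rick", "lab": "potassium", "value": "3.9"},
]
deidentified = deidentify(rows, mapping)
```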
Functions were written to remap EHR bulk lab data pulls/queries from several sources, including Clarity/Crystal reports or an institutional EDW such as the Research Patient Data Registry (RPDR) at MGB. The input, a csv/delimited file of labs for user-defined patients, may vary. Thus, users may need to adapt the initial data wrangling script based on the data input format. However, the downstream transformation, code-lab lookup tables, outcomes analysis, and LOINC remapping are standard for use with the provided REDCap Data Dictionary, DataDictionary_eLAB.csv. The available R Markdown file (https://github.com/TheMillerLab/eLAB) provides suggestions and instructions on where or when upfront script modifications may be necessary to accommodate input variability.
The eLAB pipeline takes several inputs. For example, the input for use with the ‘ehr_format(dt)’ single-line command is non-tabular data assigned as R object ‘dt’ with 4 columns: 1) Patient Name (MRN), 2) Collection Date, 3) Collection Time, and 4) Lab Results wherein several lab panels are in one data frame cell. A mock dataset in this ‘untidy-format’ is provided for demonstration purposes (https://github.com/TheMillerLab/eLAB).
Bulk lab data pulls often result in subtypes of the same lab. For example, potassium labs are reported as “Potassium,” “Potassium-External,” “Potassium(POC),” “Potassium,whole-bld,” “Potassium-Level-External,” “Potassium,venous,” and “Potassium-whole-bld/plasma.” eLAB utilizes a key-value lookup table with ~300 lab subtypes for remapping labs to the Data Dictionary (DD) code. eLAB reformats/accepts only those lab units pre-defined by the registry DD. The lab lookup table is provided for direct use or may be re-configured/updated to meet end-user specifications. eLAB is designed to remap, transform, and filter/adjust value units of semi-structured/structured bulk laboratory values data pulls from the EHR to align with the pre-defined code of the DD.
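The key-value remapping described above can be sketched as follows. This is an illustrative Python sketch rather than eLAB's R code: the potassium subtype names come from the text, while the DD code `potassium`, the unit `mmol/L`, and the row field names are assumptions.

```python
# Lookup table: EHR lab subtype name -> Data Dictionary code (code is assumed).
LAB_LOOKUP = {
    "Potassium": "potassium",
    "Potassium-External": "potassium",
    "Potassium(POC)": "potassium",
    "Potassium,whole-bld": "potassium",
    "Potassium-Level-External": "potassium",
    "Potassium,venous": "potassium",
    "Potassium-whole-bld/plasma": "potassium",
}

# Units accepted per DD code (hypothetical value for illustration).
ALLOWED_UNITS = {"potassium": "mmol/L"}

def remap_labs(rows):
    """Keep only labs found in the lookup table whose unit matches the DD."""
    remapped = []
    for row in rows:
        code = LAB_LOOKUP.get(row["lab_name"].strip())
        if code is None:
            continue  # lab subtype is not of interest to the registry
        if row["unit"] != ALLOWED_UNITS[code]:
            continue  # unit is not the one pre-defined by the DD
        remapped.append({"lab": code, "value": float(row["value"]), "unit": row["unit"]})
    return remapped

raw = [
    {"lab_name": "Potassium(POC)", "value": "4.2", "unit": "mmol/L"},
    {"lab_name": "Potassium,venous", "value": "16.4", "unit": "mg/dL"},
    {"lab_name": "Unknown-Panel", "value": "1.0", "unit": "mmol/L"},
]
clean = remap_labs(raw)
```

A plain dictionary keeps the table easy to extend: adding a lab subtype is a one-line change, which matches the text's note that the table may be re-configured by end users.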
Data Dictionary (DD)
EHR clinical laboratory data is captured in REDCap using the ‘Labs’ repeating instrument (Supplemental Figures 1-2). The DD is provided for use by researchers at REDCap-participating institutions and is optimized to accommodate the same lab-type captured more than once on the same day for the same patient. The instrument captures 35 clinical lab types. The DD serves several major purposes in the eLAB pipeline. First, it defines every lab type of interest and associated lab unit of interest with a set field/variable name. It also restricts/defines the type of data allowed for entry for each data field, such as a string or numerics. The DD is uploaded into REDCap by every participating site/collaborator and ensures each site collects and codes the data the same way. Automation pipelines, such as eLAB, are designed to remap/clean and reformat data/units utilizing key-value look-up tables that filter and select only the labs/units of interest. eLAB ensures the data pulled from the EHR contains the correct unit and format pre-configured by the DD. The use of the same DD at every participating site ensures that the data field code, format, and relationships in the database are uniform across each site to allow for the simple aggregation of the multi-site data. For example, since every site in the MCCPR uses the same DD, aggregation is efficient and different site csv files are simply combined.
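Because every site shares one DD, multi-site aggregation reduces to concatenating exports with identical columns. A minimal Python sketch under that assumption follows; the column names and sample rows are hypothetical, and the header check is an added safeguard rather than a documented eLAB step.

```python
import csv
import io

def combine_site_csvs(csv_texts):
    """Concatenate site exports, requiring every file to share one header."""
    header = None
    combined = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError("site exports do not share the same data dictionary columns")
        combined.extend(reader)
    return header, combined

# Hypothetical exports from two sites using the same data dictionary.
site_a = "record_id,lab,value\n1,potassium,4.1\n2,potassium,3.8\n"
site_b = "record_id,lab,value\n3,potassium,4.4\n"
header, rows = combine_site_csvs([site_a, site_b])
```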
Study Cohort
This study was approved by the MGB IRB. A search of the EHR was performed to identify patients diagnosed with MCC between 1975 and 2021 (N=1,109) for inclusion in the MCCPR. Subjects diagnosed with primary cutaneous MCC between 2016 and 2019 (N=176) were included in the test cohort for exploratory studies of lab result associations with overall survival (OS) using eLAB.
Statistical Analysis
OS is defined as the time from date of MCC diagnosis to date of death. Data was censored at the date of the last follow-up visit if no death event occurred. Univariable Cox proportional hazard modeling was performed among all lab predictors. Due to the hypothesis-generating nature of the work, p-values were exploratory and Bonferroni corrections were not applied.
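The OS definition and censoring rule above can be sketched as a small helper that returns a follow-up time and an event indicator, the two inputs a Cox model needs. This is an illustrative Python sketch; the function name and the use of days as the time unit are assumptions.

```python
from datetime import date

def overall_survival(diagnosis, death=None, last_follow_up=None):
    """Return (days, event): event is 1 for death, 0 if censored at last follow-up."""
    if death is not None:
        return (death - diagnosis).days, 1
    if last_follow_up is None:
        raise ValueError("a censored patient needs a last follow-up date")
    return (last_follow_up - diagnosis).days, 0

# Hypothetical dates for demonstration only.
died = overall_survival(date(2016, 1, 1), death=date(2016, 1, 31))
censored = overall_survival(date(2016, 1, 1), last_follow_up=date(2017, 1, 1))
```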
https://woudc.org/en/data/data-use-policy
Connection to contributors in the WOUDC Data Registry Search Index.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The SpaceEye-T satellite collects optical imagery at 25 cm resolution, the highest among commercial satellites. The Open Data collection features various satellite images from around the world so that end users can experience the power of VVHR optical data.
City of Austin Open Data Terms of Use https://data.austintexas.gov/stories/s/ranj-cccq
This dataset is a monthly upload of the Community Registry (www.AustinTexas.gov/CR), where community organizations such as neighborhood associations may register with the City of Austin to receive notices of land development permit applications within 500 feet of the organization's specified boundaries. This dataset can be used to contact multiple registered organizations at once by filtering/sorting, for example, by Association Type or by Association ZipCode. The organizations' boundaries can be viewed in the City's interactive map at www.AustinTexas.gov/GIS/PropertyProfile/ - the Community Registry layer is under the Boundaries/Grids folder.
Austin Development Services Data Disclaimer: The data provided are for informational use only and may differ from official department data. Austin Development Services’ database is continuously updated, so reports run at different times may produce different results. Care should be taken when comparing against other reports as different data collection methods and different data sources may have been used. Austin Development Services does not assume any liability for any decision made or action taken or not taken by the recipient in reliance upon any information or data provided.
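The filtering described above can be sketched in a few lines of Python. The column headers (`Association Name`, `Association Type`, `Association ZipCode`) and the sample rows are assumptions based on the description, not the dataset's confirmed schema.

```python
import csv
import io

def filter_registry(csv_text, column, value):
    """Return rows whose `column` equals `value` (case-insensitive, trimmed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = value.strip().lower()
    return [row for row in reader if (row.get(column) or "").strip().lower() == wanted]

# Hypothetical export; real column names and values may differ.
sample = (
    "Association Name,Association Type,Association ZipCode\n"
    "Example Neighbors,Neighborhood Association,78701\n"
    "Example Business Group,Business Association,78702\n"
    "Example Creek Alliance,Neighborhood Association,78702\n"
)
by_zip = filter_registry(sample, "Association ZipCode", "78702")
by_type = filter_registry(sample, "Association Type", "neighborhood association")
```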
READ is EPA's authoritative source for information about Agency information resources, including applications/systems, datasets and models. READ is one component of the System of Registries (SoR).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An application profile of DCAT combining it with other metadata vocabularies (e.g. VoID, DCTERMS, LIME) to meet requirements elicited in various use cases of the Semantic Web platform Semantic Turkey
The Clinical Case Registries (CCR) replaced the former Immunology Case Registry and the Hepatitis C Case Registry with local and national databases. The CCR:HIV and CCR:HCV are administrative and clinical databases designed to provide population-based data on VA patients infected with Human Immunodeficiency Virus (HIV) and/or Hepatitis C virus (HCV). Each Veterans Health Information Systems and Technology Architecture (VistA) system contains a local CCR where patients who are potentially HIV or HCV infected are identified based on International Classification of Diseases (ICD-9) codes and/or positive antibody test results. The local HIV or HCV coordinator must review these cases to determine which patients are truly infected and should be added to the local registry. The local CCR provides extensive reporting capabilities to the local HIV and HCV clinicians to monitor their patient population. The local CCR software also extracts data elements from multiple VistA packages and transmits Health Level Seven (HL7) messages to the national database at VA Austin Information Technology Center. The national database is used for monitoring clinical outcomes, assessing resource utilization and quality assurance.
https://www.datainsightsmarket.com/privacy-policy
Discover the booming market for Cancer Registry Data Management Software. This in-depth analysis reveals key trends, growth drivers, and leading companies shaping this $2.5B+ market, projected for significant expansion through 2033. Learn about market segmentation, regional insights, and competitive analysis.
https://iknl.nl/nkr/cijfers-op-maat/gegevensaanvraag
The data from the Dutch Cancer Registry (NKR) provide insights into improving care for people with cancer.
The NKR includes information about diagnostics, diagnosis, tumor characteristics and initial treatment, regardless of the treatment location. For an increasing number of cancer types, follow-up data are also available for subsequent treatments. The data are collected by specially trained IKNL data managers in hospitals based on information in the medical file. Since 1989, the database has contained data on patients from all over the Netherlands.
https://woudc.org/en/data/data-use-policy
Connection to datasets in the WOUDC Data Registry Search Index.
FAA Data Registry
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The number and quality of existing patient registries are not known in Portugal. In order to improve the knowledge regarding this issue, an interactive tool (RegisPt) was developed to identify and characterize all available patient registries in our country. This article aims to describe the RegisPt design, data model, and due functionalities. Methods: RegisPt was developed in Microsoft Office Access 2010 and all variables definitions were done according to standardized international classifications. A review of the available literature and web resources was performed in order to identify all relevant patient registries. Results: The RegisPt platform is divided into 5 core modules containing comprehensive information from each patient registry (general data, registry design, characterization, effectiveness, and publications). Effectiveness is of utmost importance for health technology assessment and is divided into 2 sections: the exposure (health care service, pharmaceutical drugs, and medical devices) and the outcomes (safety, clinical, and economic). The RegisPt platform allows adding and editing registries as well as consulting all available registries in an interactive and user-friendly way, using a preferred query (e.g., registry name, institution, and therapeutic area). About 50 patient registries were identified and characterized in Portugal in accordance with the patient registry definition of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Discussion: The RegisPt is a first step to better understand and use the available resources for health research in Portugal. RegisPt can promote collaboration between researchers and registry owners and encourage the efficient use of resources and may improve the access to registries. 
This project may also be useful to identify therapeutic areas still lacking patient registries and to optimize existing ones in the country, by comparing registry design, quality, and functional strategies. In the future, this valuable platform may be particularly relevant for researchers and authorities aiming to carry out health technology assessment.
The AWS Public Blockchain Data initiative provides free access to blockchain datasets through collaboration with data providers. The data is optimized for analytics by being transformed into compressed Parquet files, partitioned by date for efficient querying.
s3://aws-public-blockchain/v1.0/btc/
s3://aws-public-blockchain/v1.0/eth/
s3://aws-public-blockchain/v1.1/sonarx/arbitrum/
s3://aws-public-blockchain/v1.1/sonarx/aptos/
s3://aws-public-blockchain/v1.1/sonarx/base/
s3://aws-public-blockchain/v1.1/sonarx/provenance/
s3://aws-public-blockchain/v1.1/sonarx/xrp/
s3://aws-public-blockchain/v1.1/stellar/
s3://aws-public-blockchain/v1.1/ton/
s3://aws-public-blockchain/v1.1/cronos/
We welcome additional blockchain data providers to join this initiative. If you're interested in contributing datasets to the AWS Public Blockchain Data program, please contact our team at aws-public-blockchain@amazon.com.
Find information and data on cancer in Massachusetts, managed by the Massachusetts Cancer Registry.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents an overview of platforms and registries that store, harmonize, and, in some cases, share COVID-19-related participant-level data, including clinical-epidemiological, human and pathogen OMICs, and high dimensional imaging data. The dataset provides an in-depth review of adherence to the FAIR principles, governance, benefit sharing and other ethical concerns related to resources that share harmonized, participant-level data for the research response to COVID-19. Systematic searches were conducted between April 2020 and June 2021 to identify relevant platforms and registries. We applied natural language processing to the CORD-19 dataset in March of 2021 and consulted with COVID-19 focused researchers in Asia, Africa, and Latin America to identify non-English language COVID-19-related data sharing resources.
The Province of Ontario Neurodevelopmental Disorders (POND) Network is an Integrated Discovery Program funded by the Ontario Brain Institute. It aims to understand the neurobiology of neurodevelopmental disorders and translate the findings into effective new treatments. Neurodevelopmental disorders investigated as part of this program include attention deficit/hyperactivity disorder, autism spectrum disorder, intellectual disability, obsessive compulsive disorder, Tourette syndrome, Rett syndrome, Down syndrome, Fragile X syndrome and any other genetic differences associated with neurodevelopmental difficulties.