The goal of the Open Source Indicators (OSI) Program was to make automated predictions of significant societal events through the continuous and automated analysis of publicly available data such as news media, social media, informational websites, and satellite imagery. Societal events of interest included civil unrest, disease outbreaks, and election results. Geographic areas of interest included countries in Latin America (LA) and the Middle East and North Africa (MENA). The handbook is intended to serve as a reference document for the OSI Program and a companion to the ground truth event data used for test and evaluation. The handbook provides guidance regarding the types of events considered; the submission of automated predictions or “warnings;” the development of ground truth; the test and evaluation of submitted warnings; performance measures; and other programmatic information. IARPA initiated a solicitation for OSI Research Teams in late summer 2011 for one base year and two option years of research. MITRE was selected as the Test and Evaluation (T&E) Team in November 2011. Following a review of proposals, three teams (BBN, HRL, and Virginia Tech (VT)) were selected. The OSI Program officially began in April 2012; manual event encoding and formal T&E ended in March 2015.
https://www.promarketreports.com/privacy-policy
The Open Source Intelligence market was valued at USD 9.74 billion in 2023 and is projected to reach USD 36.25 billion by 2032, with an expected CAGR of 20.65% during the forecast period. The Open Source Intelligence (OSINT) market is experiencing rapid growth, driven by the increasing need for data-driven insights across industries including defense, security, finance, and law enforcement. OSINT involves collecting, analyzing, and utilizing publicly available information from sources such as social media, websites, forums, and public records to gain actionable intelligence. The market's expansion is fueled by advances in data analytics, machine learning, and artificial intelligence, which enable organizations to efficiently process vast amounts of unstructured data. OSINT tools and platforms provide critical capabilities for identifying threats, monitoring public sentiment, tracking geopolitical events, and detecting fraud. With cyber threats on the rise, government agencies and private enterprises are investing heavily in OSINT technologies to enhance decision-making and improve security. The rise of digital platforms and the growing volume of open-source data are expected to further drive market growth, although increasingly complex regulations and ethical concerns regarding privacy remain challenges to be addressed. Overall, the OSINT market presents a promising landscape for innovation and strategic advancement in intelligence gathering. A notable trend is the use of OSINT by marketers to create tailored customer strategies.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve both as a current landscape analysis and as a baseline for future studies of ag research data.
Purpose
As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:
1. establish where agricultural researchers in the United States -- land grant and USDA researchers, primarily ARS, NRCS, USFS, and other agencies -- currently publish their data, including general research data repositories, domain-specific databases, and the top journals;
2. compare how much data is held in institutional vs. domain-specific vs. federal platforms;
3. determine which repositories are recommended by top journals that require or recommend the publication of supporting data; and
4. ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish their data.
Approach
The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and hold some amount of ag data, resources including re3data, LibGuides, and ARS lists were analysed. Databases that are primarily environmental or public health oriented were not included, but places where ag grantees would publish data were considered.
Search methods
We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” /“ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects.
We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.
Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo.
Finally, we compared scholarly journals' suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. We compiled extensive lists of journals in which USDA researchers published in 2012 and 2016, combining search results from ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites of the Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The author instructions of the top 50 journals were consulted to see whether they (a) ask or require submitters to provide supplemental data, or (b) require submitters to deposit data in open repositories.
Data are provided for journals based on a 2012 and 2016 study of where USDA employees publish their research, ranked by number of articles. For each of the top 50 journals, the data include the 2015/2016 Impact Factor, author guidelines, whether supplemental data are requested, whether supplemental data are reviewed, whether open data (supplemental or in a repository) are required, and the recommended data repositories, as provided in each journal's online author guidelines.
Evaluation
We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository; the type of resource searched (datasets, data, images, components, etc.); the percentage of the total database that each term comprised; any search term whose results comprised at least 1% or at least 5% of the total collection; and any search term that returned more than 100 or more than 500 results.
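For illustration, the following minimal Python sketch shows the kind of threshold bookkeeping described above; the repository size and per-term result counts are hypothetical values, not figures from this study.

# Minimal sketch (hypothetical numbers) of the evaluation thresholds described above:
# for each search term, record its share of the total collection and whether it
# clears the 1% / 5% and >100 / >500 cutoffs.

total_datasets = 25000  # hypothetical total number of datasets in a repository

term_counts = {          # hypothetical result counts per ag search term
    "agriculture": 1400,
    "soil": 600,
    "crop": 90,
}

for term, count in term_counts.items():
    share = count / total_datasets
    print(
        f"{term}: {count} results ({share:.1%} of collection), "
        f">=1%: {share >= 0.01}, >=5%: {share >= 0.05}, "
        f">100: {count > 100}, >500: {count > 500}"
    )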
We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.
Results
A summary of the major findings from our data review:
Over half of the top 50 ag-related journals we profiled require or encourage open data for their published authors.
There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.
See included README file for descriptions of each individual data file in this dataset. Resources in this dataset:
Resource Title: Journals. File Name: Journals.csv
Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
https://www.fnfresearch.com/privacy-policy
[205+ Pages Report] Global Open Source Intelligence Market is estimated to reach a value of USD 28.34 Billion in the year 2026 with a growth rate of 19.9% CAGR during 2021-2026
https://www.zionmarketresearch.com/privacy-policy
The global open source intelligence market size was valued at USD 7.74 billion in 2023 and is expected to increase to USD 42.08 billion by 2032 at a CAGR of 20.70%.
Phylogenetic information inferred from the study of homologous genes helps us to understand the evolution of genes and gene families, including the identification of ancestral gene duplication events as well as regions under positive or purifying selection within lineages. Gene family and orthogroup characterization enables the identification of syntenic blocks, which can then be visualized with various tools. Unfortunately, currently available tools display only an overview of syntenic regions as a whole, limited to the gene level, and none provide further details about structural changes within genes, such as the conservation of ancestral exon boundaries amongst multiple genomes. We present Aequatus, an open-source web-based tool that provides an in-depth view of gene structure across gene families, with various options to render and filter visualizations. It relies on precalculated alignment and gene feature information typically held in, but not limited to, the Ensembl Compara and Core databases. We also offer Aequatus.js, a reusable JavaScript module that fulfills the visualization aspects of Aequatus, available within the Galaxy web platform as a visualization plug-in, which can be used to visualize gene trees generated by the GeneSeqToFamily workflow.
https://www.verifiedmarketresearch.com/privacy-policy/
Open Source Intelligence (OSINT) Market size was valued at USD 9.96 Billion in 2024 and is projected to reach USD 52.40 Billion by 2031, growing at a CAGR of 25.44% from 2024 to 2031.
Global Open Source Intelligence (OSINT) Market Drivers
The market drivers for the Open Source Intelligence (OSINT) market may include:
Increasing Digitalization: A large amount of information is now accessible to the public due to the growth of digital platforms, social media networks, and online forums. The sheer quantity of digital data fuels demand for OSINT solutions that can effectively gather, analyze, and interpret open-source information.
Growing Cyber Threats: As cyber threats such as fraud, terrorism, cyberattacks, and information warfare become more common, organizations and governments are increasingly using OSINT tools and services to monitor online activity, spot potential threats, and evaluate risks to their interests and assets.
Increasing Social Media Usage: Individuals, companies, and government organizations use social media platforms extensively, producing a wealth of user-generated content that is useful for gathering and analyzing intelligence. Organizations can use social media data for sentiment research, brand monitoring, and threat detection by applying OSINT technologies.
Need for Real-Time Situational Awareness: Organizations need real-time situational awareness in order to make informed decisions and reduce risks in an era marked by rapid technological breakthroughs, geopolitical conflicts, and global interconnection. By delivering pertinent and timely information from a variety of sources, OSINT tools help organizations keep up with emerging risks and opportunities.
Regulatory Compliance Requirements: To reduce legal, financial, and reputational risks, companies must comply with regulatory requirements and industry standards by conducting comprehensive due diligence, risk assessments, and compliance monitoring. By offering thorough insights into pertinent entities, activities, and events, OSINT solutions help enterprises meet their compliance obligations.
Geopolitical Instability: Regional geopolitical tensions, conflicts, and disputes seriously threaten global security and stability. OSINT capabilities enable government agencies, defense companies, and intelligence agencies to watch adversary actions, evaluate strategic risks, and monitor geopolitical developments in real time.
Business Intelligence and Competitive Analysis: To obtain competitive intelligence, track market trends, and examine customer behavior, corporations rely on OSINT tools and methodologies. OSINT solutions help businesses make data-driven decisions, spot market opportunities, and gain a competitive edge in their respective industries.
Technological Developments: The capabilities of OSINT solutions have been improved by continuous advances in artificial intelligence (AI), machine learning (ML), natural language processing (NLP), and data analytics. Advanced algorithms and automation technologies allow organizations to rapidly process enormous volumes of open-source data, extract relevant insights, and derive useful intelligence from heterogeneous sources.
OPEN DATA CATALOG. The new Open Data catalog of the Municipality of Matera is part of the Entity's technological innovation strategy and the Strategic Plan desired by Councilor Vincenzo Acito. The catalog was created by Francesco Piersoft Paolicelli, assisted by Arch. Fedele Congedo on behalf of Ricerca e Sviluppo scarl, according to the AGID Guidelines for the enhancement of public information assets and the Guidelines for the Design of Public Administration websites. It complies with DCAT-AP_IT and uses the tools made available by the Digital Transformation Team of Commissioner Piacentini. CKAN, the world's leading platform for open-source data portals, promoted by the Open Knowledge Foundation, was chosen to create the open data catalog. CKAN is a complete and ready-to-use software solution that makes data accessible and usable – providing tools to optimize publication, search, and use (including data storage and the availability of robust APIs), and is used by numerous public bodies: from the European Union (europeandataporta.eu) to some states such as Great Britain (data.gov.uk) or Brazil (dados.gov.br), or by public administrations in Italy such as the Province of Trento (dati.trentino.it), the Province of Rome (opendata.provincia.roma.it/), the Municipality of Bari (opendata.comune.bari.it), the Municipality of Lecce (dati.comune.lecce.it) and many others. The logos in the THEMES section are taken from the national portal Dati.Gov.it. The image of the Sassi in the background on the home page is under CC0 License.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Geostrat Report – The Sequence Stratigraphy and Sandstone Play Fairways of the Late Jurassic Humber Group of the UK Central Graben
This non-exclusive report was purchased by the OGA from Geostrat as part of the Data Purchase tender process (TRN097012017) that was carried out during Q1 2017. The contents do not necessarily reflect the technical view of the OGA but the report is being published in the interests of making additional sources of data and interpretation available for use by the wider industry and academic communities.
The Geostrat report provides stratigraphic analyses and interpretations of data from the Late Jurassic to Early Cretaceous Humber Group across the UK Central Graben and includes a series of depositional sequence maps for eight stratigraphic intervals. Stratigraphic interpretations and tops from 189 wells (up to Release 91) are also included in the report.
The outputs as published here include a full PDF report, ODM/IC .dat format sequence maps, and all stratigraphic tops (lithostratigraphy, ages, sequence stratigraphy) in .csv format for import into different interpretation platforms.
In addition, the OGA has undertaken to provide the well tops, stratigraphic interpretations and sequence maps in shapefile format, which is intended to facilitate the integration of these data into projects and data storage systems held by organisations using GIS software other than ESRI ArcGIS. As part of this process, the Geostrat well names have been matched as far as possible to the OGA well names from the OGA Offshore Wells shapefile (as provided on the OGA’s Open Data website) and the original polygon files have been incorporated into an ArcGIS project. All the files within the GIS folder of this delivery have been created by the OGA.
An ESRI ArcGIS version of this delivery, including geodatabases, layer files and map documents for well tops, stratigraphic interpretations and sequence maps is available on the OGA’s Open Data website and is recommended for use with ArcGIS.
All releases included in the Data Purchase tender process that have been made openly available are summarised in a mapping application available from the OGA website. The application includes an area of interest outline for each of the products and an overview of which wellbores have been included in the products.
The Carbon Storage Open Database is a collection of spatial data obtained from publicly available sources published by several NATCARB Partnerships and other organizations. The carbon storage open database was collected from open-source data on ArcREST servers and websites in 2018, 2019, 2021, and 2022. The original database was published on the former GeoCube, which is now EDX Spatial, in July 2020, and has since been updated with additional data resources from the Energy Data eXchange (EDX) and external public data resources. The shapefile geodatabase is available in total, and has also been split up into multiple databases based on the maps produced for EDX spatial. These are topical map categories that describe the type of data, and sometimes the region for which the data relates. The data is separated in case there is only a specific area or data type that is of interest for download. In addition to the geodatabases, this submission contains: 1. A ReadMe file describing the processing steps completed to collect and curate the data. 2. A data catalog of all feature layers within the database. Additional published resources are available that describe the work done to produce the geodatabase: Morkner, P., Bauer, J., Creason, C., Sabbatino, M., Wingo, P., Greenburg, R., Walker, S., Yeates, D., Rose, K. 2022. Distilling Data to Drive Carbon Storage Insights. Computers & Geosciences. https://doi.org/10.1016/j.cageo.2021.104945 Morkner, P., Bauer, J., Shay, J., Sabbatino, M., and Rose, K. An Updated Carbon Storage Open Database - Geospatial Data Aggregation to Support Scaling -Up Carbon Capture and Storage. United States: N. p., 2022. Web. https://www.osti.gov/biblio/1890730 Morkner, P., Rose, K., Bauer, J., Rowan, C., Barkhurst, A., Baker, D.V., Sabbatino, M., Bean, A., Creason, C.G., Wingo, P., and Greenburg, R. Tools for Data Collection, Curation, and Discovery to Support Carbon Sequestration Insights. United States: N. p., 2020. Web. https://www.osti.gov/biblio/1777195 Disclaimer: This project was funded by the United States Department of Energy, National Energy Technology Laboratory, in part, through a site support contract. Neither the United States Government nor any agency thereof, nor any of their employees, nor the support contractor, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
This feature class/shapefile contains child care centers derived from various sources (refer to the SOURCE field) for the Homeland Infrastructure Foundation-Level Data (HIFLD) database (https://gii.dhs.gov/HIFLD). This feature class/shapefile contains locations of child day care centers for Texas. The dataset only includes center-based child day care locations (including those located at schools and religious institutes) and does not include group, home, and family based child day cares. The SOURCEDATE is an indicator of when the source data was last acquired or was publicly available. All the data was acquired from the respective state departments or their open source websites and only contains data provided by these sources. Information on the source of data for each state is available in the SOURCE field of the feature class/shapefile. The TYPE attribute is a common categorization of child day care centers for all states, which categorizes every child day care into Center Based, School Based, Head Start, or Religious Facility solely based on the type of facility where the child day care center is geographically located.
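As an illustration only, the following minimal Python sketch reads a shapefile like this one with GeoPandas and inspects the SOURCE, SOURCEDATE, and TYPE attributes described above; the file name is a hypothetical placeholder for the downloaded data.

# Minimal sketch: load the shapefile and summarize it by the attributes named above.
import geopandas as gpd

gdf = gpd.read_file("Child_Care_Centers_TX.shp")  # hypothetical file name

# Count records by facility category (Center Based, School Based, Head Start, Religious Facility)
print(gdf["TYPE"].value_counts())

# Inspect which source each record came from and when it was last acquired
print(gdf[["SOURCE", "SOURCEDATE"]].drop_duplicates().head())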
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This software introduces HydroLang, an open-source and integrated community-driven computational web framework for hydrology and water resources research and education. HydroLang employs client-side web technologies and standards to carry out various routines aimed at acquiring, managing, transforming, analyzing, and visualizing hydrological datasets. HydroLang consists of four major high-cohesion low-coupling modules: (1) retrieving, manipulating, and transforming raw hydrological data, (2) statistical operations, hydrological analysis, and model creation, (3) generating graphical and tabular data representations, and (4) mapping and geospatial data visualization. HydroLang's unique modular architecture and open-source nature allow it to be easily tailored into any use case and web framework, and it encourages iterative enhancements with community involvement to establish the comprehensive next-generation hydrological software toolkit. Case studies can be found in the repositories linked to the software.
Open Source Application Development Portal (OSADP). The system provides a place for programmers to share software code and solutions.
What was the average price of a house in the United Kingdom in 1935? When will India's population surpass that of China? Where can you admire publicly funded works of art in Seattle? The data to answer many, many questions like these exists somewhere on the Internet - but it's not always easy to find. The Open Data platform, created as part of the actions foreseen by the Tuscan Digital Agenda, makes reusable public data available in open format, thus maximizing transparency and ease of access to the many pieces of information available to the Tuscany Region. The goal is to publish, through a gradual process, the many datasets whose ownership belongs to the Tuscany Region and other Public Administrations of the regional territory adhering to the Tuscan Regional Telematics Network (RTRT), creating an infrastructure that will allow public and private entities and civil society to create new services and applications capable of improving access to information, transparency, and therefore the social, cultural, and economic life of the entire Tuscan territory. This site is based on a powerful open-source data cataloging software, called CKAN, developed by the Open Knowledge Foundation. Each 'dataset' entry on CKAN contains a description of the data and other useful information, such as the available formats, the holder, the freedom of access and reuse, and the topics that the data address. Other users can improve or modify this information (CKAN keeps a history of all these changes). CKAN is used for several data catalogs on the Internet. The Data Hub is a freely editable and reusable catalog, in the style of Wikipedia. The British government uses CKAN for the data.gov.uk portal, which currently has about 8000 government datasets. The official public data of most European countries are collected in a CKAN catalog on publicdata.eu. There is also a list of these catalogs from all over the world on datacatalogs.org, which is in turn based on CKAN. Most of the data on the Tuscany Region Open Data portal is freely accessible and reusable: anyone has the right to use and reuse the data in any way they prefer. Maybe someone will take that nice dataset on the city's works of art that you found, and add it to a tourist map - or develop a new app for your smartphone, which will help you find monuments when you visit the city. Open data means more enterprise, collaborative scientific research, and transparent public administration. You can learn more about this topic in the Open Data Handbook. The Open Knowledge Foundation is a non-profit organization that promotes free knowledge: the development and constant improvement of CKAN is one of the ways to achieve this goal. If you want to participate in the design or development, join the public discussion or development lists, or check out the OKFN website to discover the other ongoing projects. CKAN is the world's leading platform for open-source data portals. CKAN is a complete and ready-to-use software solution that makes data accessible and usable – providing tools to optimize its publication, search and use (including data storage and the availability of robust APIs). CKAN is aimed at organizations that publish data (national and local governments, companies and institutions) and want to make it open and accessible to all. 
CKAN is used by governments and user groups around the world to manage a wide range of data portals for official and community bodies, including portals for local, national and international governments, such as data.gov.uk in the UK and publicdata.eu of the European Union, dados.gov.br in Brazil, and government portals of the Netherlands, as well as city and municipal administration sites in the USA, the United Kingdom, Argentina, Finland and other countries. CKAN: http://ckan.org/ Tour of CKAN: http://ckan.org/tour/ Overview of functions: http://ckan.org/features/
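As a brief illustration of the CKAN API mentioned above, the following minimal Python sketch queries a CKAN portal's package_search action; the portal URL and search term are example assumptions, and any CKAN instance exposes the same endpoint.

# Minimal sketch of querying a CKAN portal through its action API.
import requests

portal = "https://demo.ckan.org"          # example CKAN instance; substitute any CKAN portal
resp = requests.get(
    f"{portal}/api/3/action/package_search",
    params={"q": "art", "rows": 5},       # example search term and page size
    timeout=30,
)
resp.raise_for_status()
result = resp.json()["result"]

print(f"{result['count']} matching datasets")
for pkg in result["results"]:
    print("-", pkg["name"], "|", pkg.get("title", ""))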
GRIN-Global (GG) is a database application that enables genebanks to store and manage information associated with plant genetic resources (germplasm) and deliver that information globally. The GRIN-Global project's mission is to provide a scalable version of the Germplasm Resource Information Network (GRIN) suitable for use by any interested genebank in the world. The GRIN-Global database platform has been and is being implemented at various genebanks around the world. The first version, 1.0.7, was released in December 2011 in a joint effort by the Global Crop Diversity Trust, Bioversity International, and the Agricultural Research Service of the USDA. The U.S. National Plant Germplasm System version (1.9.4.2) entered into production on November 30, 2015.
Typically set up in a networked environment, GG can also run stand-alone on a single personal computer. GG has been developed with open source software and its source code is available, so genebanks can tailor GG to meet their specific requirements. GG comprises a suite of programs, including a Curator Tool, Updater, Search Tool, Admin Tool, and Public Website with Shopping Cart. Through the Public Website, researchers can access germplasm information; search the entire GG database and download results; and order germplasm from the genebank. Data are also associated with Google Maps.
Current installations include Bolivia (INIAF), Chile (INIA), CIMMYT (CGIAR), Czech Republic (Crop Research Institute), Portugal (INIAV), USDA (NPGS), Tunisia (BNG), CIP (CGIAR), Genetic Resources of Madeira Island (Portugal), and CIAT (CGIAR), with many others under evaluation.
https://spdx.org/licenses/CC0-1.0.html
Management of the COVID-19 pandemic has proven to be a significant challenge to policy makers. This is in large part due to uneven reporting and the absence of open-access visualization tools to present and analyze local trends as well as infer healthcare needs. Here we report the development of CovidCounties.org, an interactive web application that depicts daily disease trends at the level of US counties using time series plots and maps. This application is accompanied by a manually curated dataset that catalogs all major public policy actions made at the state-level, as well as technical validation of the primary data. Finally, the underlying code for the site is also provided as open source, enabling others to validate and learn from this work.
Methods
Data related to state-wide implementation of social-distancing policies were manually curated by web search and independently reviewed by a second author; disagreements were rare and resolved by discussion. Government websites were prioritized as sources of truth where feasible; otherwise, news reports covering state-wide proclamations were used. All citations are captured in the data file.
Ground truth data used in the validation were manually curated from states’ Department of Public Health websites. Citations of the validation data are included in the data file.
To confirm global accessibility of covidcounties.org, we used dareboost.com to perform loading speed tests from 14 cities across the globe using three different devices: Google Chrome via desktop, iPhone 6s/7/8, and Samsung Galaxy S6.
In view of the ubiquitous mobile-app concept that has taken hold over the past decade, whereby distinct, single purpose, modular applications are developed and deployed in a shared user interface (i.e. the phone in your pocket), we have created open source cyberinfrastructure that mimics this paradigm for developing and deploying environmental web applications using open source tools and cloud computing services. This cyberinfrastructure integrates HydroShare for cloud-based data storage and app cataloging, together with Tethys Platform for Python/Django based app development. HydroShare is an open source web-based data management system for climate and water data that includes a web-services application programmer interface (API) to allow third party programmers to access and use its data resources. We have created a metadata management structure within HydroShare for cataloging, discovering, and sharing web apps. Tethys Platform is an open source software package based on the Django framework, Python programming language, Geoserver, PostgreSQL, OpenLayers and other open source technologies. The Tethys software development kit allows users to create web apps that are presented in a common portal for visualizing, analyzing and modelling environmental data. We will introduce this new cyberinfrastructure through a combination of architecture design and demonstration, and will provide attendees the essential concepts for building their own web apps using these tools.
TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
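As an example of running a query against this public dataset from Python, the following minimal sketch uses the BigQuery client library. It assumes Google Cloud credentials are configured, and the dataset path, table, and column names (bigquery-public-data.thelook_ecommerce.orders, status) are assumptions based on the description above.

# Minimal sketch: count orders by status in the TheLook public dataset.
from google.cloud import bigquery

client = bigquery.Client()  # uses your configured Google Cloud credentials/project

query = """
    SELECT status, COUNT(*) AS n_orders
    FROM `bigquery-public-data.thelook_ecommerce.orders`  -- assumed dataset/table path
    GROUP BY status
    ORDER BY n_orders DESC
"""

for row in client.query(query).result():
    print(row.status, row.n_orders)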
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The open data portal catalogue is a downloadable dataset containing some key metadata for the general datasets available on the Government of Canada's Open Data portal. Resource 1 is generated using the ckanapi tool (external link); Resources 2 - 8 are generated using the Flatterer (external link) utility.
Description of resources:
1. Dataset is a JSON Lines (external link) file where the metadata of each Dataset/Open Information Record is one line of JSON. The file is compressed with GZip. The file is heavily nested and recommended for users familiar with working with nested JSON.
2. Catalogue is an XLSX workbook where the nested metadata of each Dataset/Open Information Record is flattened into worksheets for each type of metadata.
3. Datasets Metadata contains metadata at the dataset level. This is also referred to as the "package" in some CKAN documentation. This is the main table/worksheet in the SQLite database and XLSX output.
4. Resources Metadata contains the metadata for the resources contained within each dataset.
5. Resource Views Metadata contains the metadata for the views applied to each resource, if a resource has a view configured.
6. Datastore Fields Metadata contains the DataStore information for CSV datasets that have been loaded into the DataStore. This information is displayed in the Data Dictionary for DataStore-enabled CSVs.
7. Data Package Fields contains a description of the fields available in each of the tables within the Catalogue, as well as the count of the number of records each table contains.
8. Data Package Entity Relation Diagram displays the title and format of each column, in each table in the Data Package, in the form of an ERD diagram. The Data Package resource offers a text-based version.
9. SQLite Database is a .db database, similar in structure to Catalogue. This can be queried with database or analytical software tools for doing analysis.
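For illustration, the following minimal Python sketch reads the GZip-compressed JSON Lines "Dataset" resource described above; the local file name is an assumption (download the resource from the portal first), and each line is one Dataset/Open Information Record.

# Minimal sketch: stream the first few records out of the gzipped JSON Lines file.
import gzip
import json

with gzip.open("od-do-canada.jsonl.gz", "rt", encoding="utf-8") as fh:  # hypothetical file name
    for i, line in enumerate(fh):
        record = json.loads(line)
        # Print a couple of common CKAN package fields; keys may vary by record.
        print(record.get("id"), "-", record.get("title"))
        if i >= 4:  # stop after the first five records
            break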
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Electronic health records (EHRs) are a rich source of information for medical research and public health monitoring. Information systems based on EHR data could also assist in patient care and hospital management. However, much of the data in EHRs is in the form of unstructured text, which is difficult to process for analysis. Natural language processing (NLP), a form of artificial intelligence, has the potential to enable automatic extraction of information from EHRs and several NLP tools adapted to the style of clinical writing have been developed for English and other major languages. In contrast, the development of NLP tools for less widely spoken languages such as Swedish has lagged behind. A major bottleneck in the development of NLP tools is the restricted access to EHRs due to legitimate patient privacy concerns. To overcome this issue we have generated a citizen science platform for collecting artificial Swedish EHRs with the help of Swedish physicians and medical students. These artificial EHRs describe imagined but plausible emergency care patients in a style that closely resembles EHRs used in emergency departments in Sweden. In the pilot phase, we collected a first batch of 50 artificial EHRs, which has passed review by an experienced Swedish emergency care physician. We make this dataset publicly available as OpenChart-SE corpus (version 1) under an open-source license for the NLP research community. The project is now open for general participation and Swedish physicians and medical students are invited to submit EHRs on the project website (https://github.com/Aitslab/openchart-se), where additional batches of quality-controlled EHRs will be released periodically.
Dataset content
OpenChart-SE, version 1 corpus (txt files and dataset.csv)
The OpenChart-SE corpus, version 1, contains 50 artificial EHRs (note that the numbering starts with 5, as 1-4 were test cases that were not suitable for publication). The EHRs are available in two formats: structured as a .csv file and as separate text files for annotation. Note that flaws in the data were not cleaned up, so the corpus simulates what could be encountered when working with data from different EHR systems. All charts have been checked for medical validity by a resident in Emergency Medicine at a Swedish hospital before publication.
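For a quick start, the following minimal Python sketch loads the structured form of the corpus (dataset.csv, described above) and inspects it; column names are defined in Codebook.xlsx and are not assumed here.

# Minimal sketch: load and inspect the structured corpus file.
import pandas as pd

charts = pd.read_csv("dataset.csv")
print(charts.shape)             # expected: 50 rows, one per artificial EHR
print(charts.columns.tolist())  # variable names as documented in the codebook
print(charts.head())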
Codebook.xlsx
The codebook contains information about each variable used. It is in XLSForm format, which can be re-used in several different applications for data collection.
suppl_data_1_openchart-se_form.pdf
OpenChart-SE mock emergency care EHR form.
suppl_data_3_openchart-se_dataexploration.ipynb
This Jupyter notebook contains the code and results from the analysis of the OpenChart-SE corpus.
More details about the project and information on the upcoming preprint accompanying the dataset can be found on the project website (https://github.com/Aitslab/openchart-se).