Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calls in favour of Open Data in research are becoming overwhelming, at both national [@RCKUOpen] and international levels [@Moedas2015, @RSOpen, @ams2016]. I will set out a working definition of Open Data, discuss the key challenges preventing the publication of Open Data from becoming standard practice, and attempt to draw some general solutions to those challenges from field-specific examples.
This dataset shows the organizational structure of the Development and Employment Fund.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This file contains the list of publications and filtering decisions of the systematic literature review conducted for the article "Towards a common definition of open data intermediaries" published in the Digital Government: Research and Practice (DGOV) journal (https://doi.org/10.1145/3585537). The literature search was done on 1 June 2022 and there was no start date set (i.e. all relevant literature up to 1 June 2022 was included).
There are 4 documents in this folder (apart from the README text describing the data in each document):
The authors acknowledge the financial support from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 955569, "Towards a sustainable Open Data ECOsystem" (ODECO).
Licence Ouverte / Open Licence 2.0: https://www.etalab.gouv.fr/wp-content/uploads/2018/11/open-licence.pdf
License information was derived automatically
This dataset, refreshed once a day, presents the consolidated regional data from January 2021 onwards and the definitive data (from January 2013 to December 2020) produced by the éCO2mix application. The data are compiled from meter readings and completed with estimates. Data are described as consolidated once they have been checked and completed (delivered in the middle of month M+1). They become definitive once all partners have transmitted and verified all of the meter readings (delivered in the second quarter of year Y+1).
You will find, at half-hourly resolution:
Actual consumption. Production broken down by the sectors making up the energy mix. Consumption by the pumps in pumped-storage stations (Stations de Transfert d'Energie par Pompage, STEP). The balance of exchanges with neighbouring regions.
For reference, the definitions of TCO and TCH are given below:
TCO: the coverage rate (Taux de COuverture) of a production sector within a region represents the share of that sector in the region's consumption.
TCH: the load rate (Taux de CHarge) or load factor (FC) of a sector represents its production volume relative to the installed and in-service production capacity of that sector.
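As a rough illustration of these two definitions (a minimal sketch only, with hypothetical figures; it is not part of the éCO2mix data or tooling):

def tco(production_mwh, regional_consumption_mwh):
    # Coverage rate: share of a production sector in the region's consumption, in %.
    return 100.0 * production_mwh / regional_consumption_mwh

def tch(production_mwh, installed_capacity_mw, hours):
    # Load factor: production relative to installed, in-service capacity over a period, in %.
    return 100.0 * production_mwh / (installed_capacity_mw * hours)

# Hypothetical example: a wind fleet producing 1200 MWh over 24 h in a region
# consuming 10000 MWh, with 400 MW of installed capacity.
print(tco(1200, 10000))    # 12.0  -> the sector covers 12% of regional consumption
print(tch(1200, 400, 24))  # 12.5  -> the fleet ran at 12.5% of its installed capacity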
If you would like to consult the "real-time" regional data beyond the latest consolidation, you can follow this link: https://opendata.reseaux-energies.fr/explore/dataset/eco2mix-regional-tr
To find out more, see the éCO2mix website at: http://www.rte-france.com/fr/eco2mix/eco2mix
This dataset is updated automatically once a day. Because robots were downloading the consolidated and definitive regional éCO2mix data at rates out of all proportion to the dataset's update frequency, a quota of 50,000 API calls per user per month has been put in place. If you run into data-access problems because of this quota, please contact us at rte-opendata@rte-france.com
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Slovenian definition extraction training dataset DF_NDF_wiki_slo contains 38613 sentences extracted from the Slovenian Wikipedia. The first sentence of a term's description on Wikipedia is considered a definition, and all other sentences are considered non-definitions.
The corpus consists of the following files, each containing one definition or non-definition sentence per line:
The dataset is described in more detail in Fišer et al. 2010. If you use this resource, please cite:
Fišer, D., Pollak, S., Vintar, Š. (2010). Learning to Mine Definitions from Slovene Structured and Unstructured Knowledge-Rich Resources. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10). https://aclanthology.org/L10-1089/
Reference for training Transformer-based definition extraction models using this dataset: Tran, T. H. H., Podpečan, V., Jemec Tomazin, M., Pollak, S. (2023). Definition Extraction for Slovene: Patterns, Transformer Classifiers and ChatGPT. Proceedings of eLex 2023: Electronic lexicography in the 21st century. Invisible lexicography: everywhere lexical data is used without users realizing they make use of a “dictionary”.
Related resources: Jemec Tomazin, M. et al. (2023). Slovenian Definition Extraction evaluation datasets RSDO-def 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1841
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The New York City Airbnb 2019 Open Data is a dataset containing various details about listed units, where the goal is to predict the rental price of a unit.
This dataset contains the details of units listed in NYC during 2019 and was adapted from the following open Kaggle dataset: https://www.kaggle.com/datasets/dgomonov/new-york-city-airbnb-open-data. That dataset, in turn, was downloaded from the Airbnb data repository: http://insideairbnb.com/get-the-data.
This dataset is licensed under the CC0 1.0 Universal License (https://creativecommons.org/publicdomain/zero/1.0/).
The typical ML task for this dataset is to build a model that predicts the rental price of a unit.
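As a pointer for getting started with that task, the sketch below fits a simple baseline regressor. It is an illustration only: the file name (AB_NYC_2019.csv) and the column names ('price', 'room_type', etc.) are assumptions carried over from the Kaggle source and may need adjusting for this copy of the data.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("AB_NYC_2019.csv")  # file and column names assumed from the Kaggle source

# A few numeric features plus one categorical feature, one-hot encoded.
features = ["latitude", "longitude", "minimum_nights", "number_of_reviews", "availability_365"]
X = pd.get_dummies(df[features + ["room_type"]], columns=["room_type"])
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))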
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The primary objective of this project was to acquire historical shoreline information for the whole of the Northern Ireland coastline. A detailed understanding of the coast's shoreline position and geometry over annual to decadal time periods is essential for any management of the coast.
The historical shoreline analysis was based on all available Ordnance Survey maps and aerial imagery. The analysis looked at position and geometry over annual to decadal time periods, providing a dynamic picture of how the coastline has changed since the early 1800s. Once all datasets were collated, the data were interrogated using the ArcGIS package Digital Shoreline Analysis System (DSAS). DSAS is a software package that enables a user to calculate rate-of-change statistics from multiple historical shoreline positions. Rate-of-change was calculated at 25m intervals and displayed both statistically and spatially, allowing areas of retreat/accretion to be identified on any given stretch of coastline.
The DSAS software produces the following rate-of-change statistics:
Net Shoreline Movement (NSM) – the distance between the oldest and the youngest shorelines.
Shoreline Change Envelope (SCE) – a measure of the total change in shoreline movement considering all available shoreline positions and reporting their distances, without reference to their specific dates.
End Point Rate (EPR) – derived by dividing the distance of shoreline movement by the time elapsed between the oldest and the youngest shoreline positions.
Linear Regression Rate (LRR) – determines a rate-of-change statistic by fitting a least-squares regression to all shorelines at a specific transect.
Weighted Linear Regression Rate (WLR) – calculates a weighted linear regression of shoreline change on each transect, taking shoreline uncertainty into account and giving more emphasis to shorelines with a smaller error.
The end product provided by Ulster University is an invaluable tool and digital asset that has helped to visualise shoreline change and assess approximate rates of historical change on any given coastal stretch of the Northern Ireland coast.
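To make these statistics concrete, the short sketch below computes them for a single transect from dated shoreline positions. It is an illustration only and not the DSAS implementation used in the project; the example values are hypothetical.

import numpy as np

# Hypothetical shorelines on one transect: decimal years, positions (m along the
# transect) and position uncertainties (m). Negative rates indicate retreat.
years = np.array([1834.0, 1905.0, 1968.0, 2010.0, 2021.0])
positions = np.array([12.0, 10.5, 8.0, 6.5, 5.0])
uncert = np.array([5.0, 4.0, 2.0, 1.0, 0.5])

nsm = positions[np.argmax(years)] - positions[np.argmin(years)]  # Net Shoreline Movement (m)
sce = positions.max() - positions.min()                          # Shoreline Change Envelope (m)
epr = nsm / (years.max() - years.min())                          # End Point Rate (m/yr)
lrr = np.polyfit(years, positions, 1)[0]                         # Linear Regression Rate (m/yr)
wlr = np.polyfit(years, positions, 1, w=1.0 / uncert)[0]         # Weighted LRR (m/yr)

print(f"NSM={nsm:.1f} m, SCE={sce:.1f} m, EPR={epr:.3f} m/yr, LRR={lrr:.3f} m/yr, WLR={wlr:.3f} m/yr")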
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Frequently used terms and phrases in various Program Guidelines and Applications. For additional information, visit the department Funding page: https://www.austintexas.gov/department/economic-development/funding
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These three datasets support the findings of the paper "Open Data Intermediaries in Developing Countries" published in the Journal of Community Informatics. The paper explores the concept of open data intermediaries using the theoretical framework of Bourdieu’s social model, particularly his species of capital. Secondary data on intermediaries from Emerging Impacts of Open Data in Developing Countries research was analysed according to a working definition of an open data intermediary presented in this paper, and with a focus on how intermediaries are able to link agents in an open data supply chain, including to grassroots communities. The study found that open data supply chains may comprise multiple intermediaries and that multiple forms of capital may be required to connect the supply and use of open data. The effectiveness of intermediaries can be attributed to their proximity to data suppliers or users, and proximity can be expressed as a function of the type of capital that an intermediary possesses. However, because no single intermediary necessarily has all the capital available to link effectively to all sources of power in a field, multiple intermediaries with complementary configurations of capital are more likely to connect between power nexuses. This study concludes that consideration needs to be given to the presence of multiple intermediaries in an open data ecosystem, each of whom may possess different forms of capital to enable the use of open data.
Data:
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
You must agree to the license and terms of use before using the dataset in this repo.
DORE: Definition MOdelling in PoRtuguEse
This repository introduces DORE, a comprehensive corpus of over 100,000 definitions from Portuguese dictionaries. Alongside DORE, we also introduce the models used to perform Portuguese DM. The release of DORE aims to fill the gap in resources for Automatic Definition Generation, or Definition Modelling (DM), in Portuguese. DORE is the first dataset… See the full description on the dataset page: https://huggingface.co/datasets/multidefmod/dore.
All the information about a specific agricultural program offered by FSA, including the rules for eligibility, disbursement, possible repayment options, and continuing service activity.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset is built for time-series Sentinel-2 cloud detection and is stored in the TensorFlow TFRecord format (see https://www.tensorflow.org/tutorials/load_data/tfrecord).
Each file is compressed in 7z format and can be decompressed using Bandizip or 7-Zip.
Dataset Structure:
Each filename can be split into three parts using underscores. The first part indicates whether it is designated for training or validation ('train' or 'val'); the second part indicates the Sentinel-2 tile name, and the last part indicates the number of samples in this file.
Each sample includes:
Sample ID;
Array of time-series 4-band image patches at 10m resolution, shaped as (n_timestamps, 4, 42, 42);
Label list indicating cloud cover status for the center 6×6 pixels of each timestamp;
Ordinal list for each timestamp;
Sample weight list (reserved);
Here is a demonstration function for parsing the TFRecord file:
import tensorflow as tf
def parseRecordDirect(fname):
    # Recover the tile name and sample count from the file name, then pair
    # every raw record in the file with its tile name.
    sep = '/'
    parts = tf.strings.split(fname, sep)
    tn = tf.strings.split(parts[-1], sep='_')[-2]
    nn = tf.strings.to_number(tf.strings.split(parts[-1], sep='_')[-1], tf.dtypes.int64)
    t = tf.data.Dataset.from_tensors(tn).repeat().take(nn)
    t1 = tf.data.TFRecordDataset(fname)
    ds = tf.data.Dataset.zip((t, t1))
    return ds
keys_to_features_direct = {
    'localid': tf.io.FixedLenFeature([], tf.int64, -1),
    'image_raw_ldseries': tf.io.FixedLenFeature((), tf.string, ''),
    'labels': tf.io.FixedLenFeature((), tf.string, ''),
    'dates': tf.io.FixedLenFeature((), tf.string, ''),
    'weights': tf.io.FixedLenFeature((), tf.string, '')
}
class SeriesClassificationDirectDecorder(decoder.Decoder):
    """A tf.Example decoder for tfds classification datasets."""
    # `decoder` is assumed to provide the tfds Decoder base class
    # (e.g. from tensorflow_datasets).

    def __init__(self) -> None:
        super().__init__()

    def decode(self, tid, ds):
        parsed = tf.io.parse_single_example(ds, keys_to_features_direct)
        encoded = parsed['image_raw_ldseries']
        labels_encoded = parsed['labels']
        decoded = tf.io.decode_raw(encoded, tf.uint16)
        label = tf.io.decode_raw(labels_encoded, tf.int8)
        dates = tf.io.decode_raw(parsed['dates'], tf.int64)
        weight = tf.io.decode_raw(parsed['weights'], tf.float32)
        decoded = tf.reshape(decoded, [-1, 4, 42, 42])
        sample_dict = {
            'tid': tid,                    # tile ID
            'dates': dates,                # date list
            'localid': parsed['localid'],  # sample ID
            'imgs': decoded,               # image array
            'labels': label,               # label list
            'weights': weight              # sample weight list
        }
        return sample_dict
def preprocessDirect(tid, record):
    # Same parsing logic as the decoder above, but returning a flat tuple.
    parsed = tf.io.parse_single_example(record, keys_to_features_direct)
    encoded = parsed['image_raw_ldseries']
    labels_encoded = parsed['labels']
    decoded = tf.io.decode_raw(encoded, tf.uint16)
    label = tf.io.decode_raw(labels_encoded, tf.int8)
    dates = tf.io.decode_raw(parsed['dates'], tf.int64)
    weight = tf.io.decode_raw(parsed['weights'], tf.float32)
    decoded = tf.reshape(decoded, [-1, 4, 42, 42])
    return tid, dates, parsed['localid'], decoded, label, weight
t1 = parseRecordDirect('filename here')
dataset = t1.map(preprocessDirect, num_parallel_calls=tf.data.experimental.AUTOTUNE)
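For reference, the sketch below shows one way a single sample with the fields listed above could be serialized into a matching tf.train.Example. It is inferred from keys_to_features_direct and is not the original writer code used to build the dataset.

import numpy as np

def serialize_sample(localid, imgs, labels, dates, weights):
    # imgs: uint16 array (n_timestamps, 4, 42, 42); labels: int8; dates: int64; weights: float32.
    feature = {
        'localid': tf.train.Feature(int64_list=tf.train.Int64List(value=[localid])),
        'image_raw_ldseries': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[imgs.astype(np.uint16).tobytes()])),
        'labels': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[labels.astype(np.int8).tobytes()])),
        'dates': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[dates.astype(np.int64).tobytes()])),
        'weights': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[weights.astype(np.float32).tobytes()])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()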
Class Definition:
0: clear
1: opaque cloud
2: thin cloud
3: haze
4: cloud shadow
5: snow
Dataset Construction:
First, we randomly generate 500 points for each tile, and all these points are aligned to the pixel-grid centers of the 60m-resolution sub-datasets (e.g. B10) for consistency when comparing with other products, because other cloud detection methods may use the cirrus band, which is at 60m resolution, as a feature.
Then, time-series image patches of two shapes are cropped with each point as the center. The patches of shape 42×42 are cropped from the bands at 10m resolution (B2, B3, B4, B8) and are used to construct this dataset, while the patches of shape 348×348 are cropped from the True Colour Image (TCI; see the Sentinel-2 User Guide for details) file and are used for interpreting class labels.
Samples with a large number of timestamps can be time-consuming at the I/O stage, so the time-series patches are divided into groups of no more than 100 timestamps each.
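A minimal sketch of that grouping step is given below; it is illustrative only, since the original splitting code is not distributed with the dataset.

def split_into_groups(timestamps, max_len=100):
    # Split a per-sample time series into chunks of at most `max_len` timestamps.
    return [timestamps[i:i + max_len] for i in range(0, len(timestamps), max_len)]

# e.g. 250 acquisition dates -> groups of 100, 100 and 50 timestamps
groups = split_into_groups(list(range(250)))
print([len(g) for g in groups])  # [100, 100, 50]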
Programmatically generated Data Dictionary document detailing the TxDOT Street Definition service.
The PDF contains service metadata and a complete list of data fields.
For any questions or issues related to the document, please contact the data owner of the service identified in the PDF and in the Credits section of this portal item.
Related Links
TxDOT Street Definition Service URL
TxDOT Street Definition Portal Item
This data set includes rankings of competing definitions of open government by a sample of Canadian journalists, parliamentarians and bloggers.
This dataset represents ridgelines as defined by the Department of Interior (DOI): "Areas within 660 feet of the top of the ridgeline, where a ridgeline has at least 150 feet of vertical elevation gain with a minimum average slope of 10 percent between the ridgeline and the base." The dataset was created using the Geomorphons package from the University of Guelph, which can be found here: Geomorphons Package, and the 3DEP 1/3 arc-second digital elevation model. A TIF data file and a PNG map of the data are provided.
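As a plain-language reading of the DOI criteria above, the short sketch below checks whether a candidate ridge qualifies. It is a rough illustration only, not the Geomorphons/3DEP workflow used to produce the dataset.

def qualifies_as_ridgeline(vertical_gain_ft, horizontal_run_ft):
    # True if the ridge rises at least 150 ft from its base at an average slope of at least 10%.
    avg_slope_pct = 100.0 * vertical_gain_ft / horizontal_run_ft
    return vertical_gain_ft >= 150.0 and avg_slope_pct >= 10.0

RIDGELINE_BUFFER_FT = 660.0  # mapped area: within 660 ft of the top of a qualifying ridgeline

print(qualifies_as_ridgeline(200.0, 1500.0))  # True: 200 ft gain at ~13.3% average slope
print(qualifies_as_ridgeline(120.0, 800.0))   # False: gain is below the 150 ft threshold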
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data collected during the study "Towards High-Value Datasets determination for data-driven development: a systematic literature review" conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun and Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb). It is being made public both to act as supplementary data for the "Towards High-Value Datasets determination for data-driven development: a systematic literature review" paper (the pre-print is available in Open Access here: https://arxiv.org/abs/2305.10234) and so that other researchers can use these data in their own work.
The protocol is intended for the systematic literature review on the topic of high-value datasets, with the aim of gathering information on how the topic of high-value datasets (HVD) and their determination has been reflected in the literature over the years and what has been found by these studies to date, incl. the indicators used in them, involved stakeholders, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and the Digital Government Research Library (DGRL) in 2023.
Methodology
To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic was studied. To this end, the SLR was carried out by searching the digital libraries covered by Scopus, Web of Science (WoS), and the Digital Government Research Library (DGRL).
These databases were queried for the keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract to limit the number of papers to those where these objects were primary research objects rather than merely mentioned in the body, e.g., as future work. After deduplication, 11 articles were found to be unique and were further checked for relevance. As a result, a total of 9 articles were examined further. Each study was independently examined by at least two authors.
To attain the objective of our study, we developed the protocol, where the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information.
Test procedure: Each study was independently examined by at least two authors; after an in-depth examination of the full text of the article, the structured protocol was filled in for each study. The structure of the survey is available in the supplementary files (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx). The data collected for each study by two researchers were then synthesized into one final version by a third researcher.
Description of the data in this data set
Protocol_HVD_SLR provides the structure of the protocol. Spreadsheet #1 provides the filled protocol for relevant studies. Spreadsheet #2 provides the list of results after the search over the three indexing databases, i.e. before filtering out irrelevant studies.
The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information
Descriptive information
1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper -{journal article, conference paper, book chapter}
5) DOI / Website - a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}
Approach- and research design-related information
10) Objective / RQ - the research objective / aim, established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed-methods approach
14) Availability of the underlying research data - whether there is a reference to publicly available underlying research data, e.g., transcriptions of interviews, collected data, or an explanation of why these data are not shared
15) Period under investigation - period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is it used in the study?
Quality- and relevance- related information
17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused on HVD determination; secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))
HVD determination-related information
19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, “input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)
Format of the files: .xls, .csv (for the first spreadsheet only), .odt, .docx
Licenses or restrictions: CC-BY
For more info, see README.txt
The canton is a territorial subdivision of the borough (arrondissement). It is the electoral district within which a general councillor is elected. The cantons were created, like the departments, by the Act of 22 December 1789. In most cases, a canton comprises several municipalities, but cantons do not always respect municipal boundaries: the most populated municipalities belong to several cantons. A canton belongs to one and only one borough. This layer differs from the pseudo-cantons of the BD Carto layer (#58, N_CANTON_BDC_ddd), because the latter contains the canton to which each municipality is attached (finer definition to be provided).
Big Data and Society Abstract & Indexing - ResearchHelpDesk - Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work, principally in the social sciences, humanities and computing and their intersections with the arts and natural sciences, about the implications of Big Data for societies. The Journal's key purpose is to provide a space for connecting debates about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business, and government relations, expertise, methods, concepts, and knowledge.
BD&S moves beyond usual notions of Big Data and treats it as an emerging field of practice that is not defined by, but generative of, (sometimes) novel data qualities such as high volume and granularity and complex analytics such as data linking and mining. It thus attends to digital content generated through online and offline practices in social, commercial, scientific, and government domains. This includes, for instance, the content generated on the Internet through social media and search engines but also that which is generated in closed networks (commercial or government transactions) and open networks such as digital archives, open government, and crowdsourced data. Critically, rather than settling on a definition, the Journal makes this an object of interdisciplinary inquiries and debates explored through studies of a variety of topics and themes. BD&S seeks contributions that analyze Big Data practices and/or involve empirical engagements and experiments with innovative methods while also reflecting on the consequences for how societies are represented (epistemologies), realized (ontologies) and governed (politics).
Article processing charge (APC): The APC for this journal is currently 1500 USD. Authors who do not have funding for open access publishing can request a waiver from the publisher, SAGE, once their Original Research Article is accepted after peer review. For all other content (Commentaries, Editorials, Demos) and Original Research Articles commissioned by the Editor, the APC will be waived.
Abstract & Indexing: Clarivate Analytics: Social Sciences Citation Index (SSCI); Directory of Open Access Journals (DOAJ); Google Scholar; Scopus
https://opendata.vancouver.ca/pages/licence/
Significant changes to the open data catalogue, including new datasets added, datasets renamed or retired, quarterly or annual updates to high-impact datasets, and changes to data structure or definition. Smaller changes, such as adding or editing records or renaming a field in an existing dataset, are not included.
Note: This log is published in the interest of transparency into the work of the open data program. You can subscribe to updates for a specific dataset by creating an account on the portal and then clicking on the Follow button on the Information tab of any dataset. You can also get updates by subscribing to our email newsletter.
Data currency: New records will be added whenever a significant change is made to the open data catalogue.
Data.ed.gov is the U.S. Department of Education's solution for publishing, finding, and accessing our public data profiles. This open data catalog brings together the Department's data assets in a single location, making them available with their metadata, documentation, and APIs for use by the public. The federal government's Foundations for Evidence-Based Policymaking Act of 2018 (Evidence Act) requires government agencies to make data assets open and machine-readable by default. Data.ed.gov is the U.S. Department of Education's comprehensive data inventory satisfying these requirements while also providing privacy and security. As defined by the Open Definition, open data is data that can be freely used, re-used and redistributed by anyone, subject only, at most, to the requirement to attribute and share-alike. Put simply, open data is data anyone can access, download, and use. Individuals, businesses, and governments can use open education data to bring about social, economic and educational benefits and drive innovation. Share your original analyses, products, and innovations on the Showcase tabs within Data.ed.gov. Browse the data, download it, analyze it, or build apps or other tools using our APIs, and share what you do with our data using our Showcase feature. If you are new to open data, you can learn more and get started with our How-Tos. If you are preparing an article or organizing a data event and would like information or support from the Data.ed.gov team, contact us at odp@ed.gov.