The Ckanext-OpenRefine extension brings the power of OpenRefine data analysis directly into CKAN for resources within datasets. This extension allows users to leverage OpenRefine's data cleaning and transformation capabilities on their CKAN resources. By integrating OpenRefine, users can improve data quality and consistency.
Key Features:
OpenRefine Integration: Enables users to analyze and clean dataset resources using OpenRefine's functionalities.
Dependency on Requests: Requires the 'requests' Python library, indicating that it likely interacts with APIs or external services.
Technical Integration: The extension activates via the config.ini file, integrating directly into CKAN's resource handling process. It enhances available options for dataset resources, likely adding an "Analyze with OpenRefine" or similar action.
Benefits & Impact: Ckanext-OpenRefine enhances data quality within CKAN datasets by providing users with a convenient way to use OpenRefine's cleaning and transformation tools. This leads to improved data reliability and usability for consumers of the data.
This workshop will introduce OpenRefine, a powerful open source tool for exploring, cleaning and manipulating "messy" data. Through hands-on activities, using a variety of datasets, participants will learn how to: Explore and identify patterns in data; Normalize data using facets and clusters; Manipulate and generate new textual and numeric data; Transform and reshape datasets; Use the General Refine Expression Language (GREL) to undertake manipulations, such as concatenating strings.
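To make "normalize data using clusters" concrete, here is a minimal Python sketch of a key-collision approach, modeled loosely on the fingerprint method described in OpenRefine's clustering documentation. It is an illustration only, not OpenRefine's actual implementation; the function names `fingerprint` and `cluster` and the sample values are assumptions made for this example.

```python
import string
import unicodedata


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-duplicate spellings collide."""
    # Trim surrounding whitespace and lowercase the value.
    text = value.strip().lower()
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, drop duplicate tokens, sort, and rejoin.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a fingerprint; keep groups with 2+ spellings."""
    groups = {}
    for value in values:
        groups.setdefault(fingerprint(value), []).append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to show the grouping.
    print(cluster(["Cornell University", "university, Cornell", "MIT"]))
```

Values that end up in the same group are candidates for merging into one spelling; a reviewer would still confirm each merge, as in OpenRefine's cluster dialog.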
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file is part of a tutorial for the Gesellschaft für Ökologie and the NFDI4Biodiversity.
Formerly known as "Google Refine", "OpenRefine" is a free and open-source tool for cleaning large datasets that can create links between metadata sets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset was created for a training workshop for the OpenRefine software. It is a subset of an old snapshot of the herbarium database of the Botanic Garden and Botanical Museum (BGBM) Berlin, Germany.
This is NOT a complete, accurate or up-to-date dataset.
DO NOT USE FOR SCIENTIFIC PURPOSES.
To get the complete and current version of the Herbarium dataset from the BGBM, please use: https://www.gbif.org/dataset/85714c48-f762-11e1-a439-00145eb45e9a
This dataset contains 1136 records with 9 different columns: Barcode (string), Family (string), FullScientificName (string), KindOfUnit (string, 4 different values), CollectionDate (string), CollectorName (string), LocationDetails (string), Country (string, 127 different values), CountryCode (string, 127 different values).
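A column summary like the one above can be checked with a short script. The following is a minimal sketch using only Python's standard library; the function name `profile_columns` and the three sample rows (including the barcode values) are made-up placeholders, not taken from the BGBM data.

```python
import csv
import io


def profile_columns(csv_text: str):
    """Return (record_count, {column_name: number_of_distinct_values})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # fieldnames is None for empty input, so fall back to an empty list.
    names = reader.fieldnames or []
    distinct = {name: len({row[name] for row in rows}) for name in names}
    return len(rows), distinct


if __name__ == "__main__":
    # Hypothetical sample with three of the nine column names listed above.
    sample = (
        "Barcode,Country,CountryCode\n"
        "X1,Germany,DE\n"
        "X2,Germany,DE\n"
        "X3,France,FR\n"
    )
    count, distinct = profile_columns(sample)
    print(count, distinct)
```

Run against the real export, the record count and the distinct counts for KindOfUnit, Country and CountryCode could be compared with the figures stated in the description.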
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
MECP2 genetic variant data from DECIPHER described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Teaching dataset used for an OpenRefine workshop offered at Cornell University's Mann Library, adapted from the Enipedia OpenRefine tutorial. It uses the "universities.csv" dataset, which represents data scraped from Wikipedia. This is a slightly cleaner subset of the university data, intended to avoid unnecessary distractions caused by the extreme messiness of the original dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
library carpentry openrefine data
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Dataset CSVs and JSON recipe files for OpenRefine for a project to align Name Authority records in the Schoenberg Dataset of Manuscripts (SDBM) with Wikidata Items
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
MECP2 genetic variant data from Rettbase described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
MECP2 genetic variant data from ClinVar described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Ogham Stones Wikidata Import
more at: https://github.com/ogi-ogham/ogham-wikidata/tree/master/OgamStones
This hands-on workshop has two parts. The first part covers working with SAS and the Postal Code Conversion File Plus. You'll start with Postal Codes, and leave with Census geography that can be linked to Census demographics. The second part introduces OpenRefine, an open source software platform for cleaning up messy data files. Initially developed by Google, OpenRefine will open your eyes to the beauty of clean data! No previous experience required.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
MECP2 genetic variant data from EVA described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
This vocabulary defines the terms needed to describe the provenance of RDF data exported using the RDF Extension for Google Refine. It uses the Open Provenance Model Vocabulary (OPMV).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Ogham Townlands Wikidata Import
more at: https://github.com/ogi-ogham/ogham-wikidata/tree/master/Townlands
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
MECP2 genetic variant data from the Maastricht Rett database (MRD) described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
https://spdx.org/licenses/etalab-2.0.html#licenseText
OpenRefine project for the tutorial https://bylg.hypotheses.org/543
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Web scraping, carried out from 25 to 27 June 2020, of public data on vessel records from a sample of 11 European ports: Aarhus (DK), Amsterdam (NL), Bordeaux (FR), Copenhagen (DK), Dunkirk (FR), Fredericia (DK), Hamburg (DE), Klaipeda (LT), Le Havre (FR), Niedersachsen (DE) and Rotterdam (NL).