25 datasets found

ckanext-openrefine
catalog.civicdataecosystem.org
Updated Jun 4, 2025
Cite
(2025). ckanext-openrefine [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-openrefine
Explore at:
Dataset updated
Jun 4, 2025
Description
The Ckanext-OpenRefine extension brings the power of OpenRefine data analysis directly into CKAN for resources within datasets. This extension allows users to leverage OpenRefine's data cleaning and transformation capabilities on their CKAN resources. By integrating OpenRefine, users can improve data quality and consistency.
Key Features:
OpenRefine Integration: Enables users to analyze and clean dataset resources using OpenRefine's functionalities.
Dependency on Requests: Requires the 'requests' Python library, indicating that it likely interacts with APIs or external services.
Technical Integration: The extension activates via the config.ini file, integrating directly into CKAN's resource handling process. It enhances available options for dataset resources, likely adding an "Analyze with OpenRefine" or similar action.
Benefits & Impact: Ckanext-OpenRefine enhances data quality within CKAN datasets by providing users with a convenient way to use OpenRefine's cleaning and transformation tools. This leads to improved data reliability and usability for consumers of the data.
Working With Messy Data in OpenRefine Workshop
search.dataone.org
Updated Dec 28, 2023
Cite
Kelly Schultz (2023). Working With Messy Data in OpenRefine Workshop [Dataset]. http://doi.org/10.5683/SP3/YSM3JM
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/YSM3JM
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Kelly Schultz
Description
This workshop will introduce OpenRefine, a powerful open source tool for exploring, cleaning and manipulating "messy" data. Through hands-on activities, using a variety of datasets, participants will learn how to: explore and identify patterns in data; normalize data using facets and clusters; manipulate and generate new textual and numeric data; transform and reshape datasets; and use the General Refine Expression Language (GREL) to undertake manipulations, such as concatenating strings.
Open Refine example file
fairdomhub.org
xlsx
Updated Dec 7, 2022
Cite
(2022). Open Refine example file [Dataset]. https://fairdomhub.org/data_files/6270
Explore at:
Available download formats: xlsx (8.93 KB)
Dataset updated
Dec 7, 2022
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This file is part of a tutorial for the Gesellschaft für Ökologie and the NFDI4Biodiversity.
Data from: OpenRefine
dataone.org
Updated Dec 28, 2023
Cite
Jeff Moon (2023). OpenRefine [Dataset]. http://doi.org/10.5683/SP3/KRWHBX
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/KRWHBX
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Jeff Moon
Description
Formerly known as "Google Refine", "OpenRefine" is a free and open-source tool for cleaning large datasets that can create links between metadata sets.
OpenRefine Training Dataset based on a subset of the BGBM Herbarium
zenodo.org
csv
Updated Mar 11, 2025
Cite
Botanic Garden and Botanical Museum, Berlin (2025). OpenRefine Training Dataset based on a subset of the BGBM Herbarium [Dataset]. http://doi.org/10.5281/zenodo.14918375
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.14918375
Dataset updated
Mar 11, 2025
Dataset provided by
Botanic Garden and Botanical Museum, Berlin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was created for a training workshop for the OpenRefine software. It is a subset of an old snapshot of the herbarium database of the Botanic Garden and Botanical Museum (BGBM) Berlin, Germany.

This is NOT a complete, accurate or up-to-date dataset.

DO NOT USE FOR SCIENTIFIC PURPOSES.

To get the complete and current version of the Herbarium dataset from the BGBM, please use: https://www.gbif.org/dataset/85714c48-f762-11e1-a439-00145eb45e9a

This dataset contains 1136 records with 9 different columns: Barcode (string), Family (string), FullScientificName (string), KindOfUnit (string, 4 different values), CollectionDate (string), CollectorName (string), LocationDetails (string), Country (string, 127 different values), CountryCode (string, 127 different values).
#MLA14 Twitter Archive, 9-12 January 2014
academiccommons.columbia.edu
city.figshare.com
Updated 2014
Cite
Priego, Ernesto; Zarate, Chris (2014). #MLA14 Twitter Archive, 9-12 January 2014 [Dataset]. http://doi.org/10.7916/D86M34TN
Explore at:
Unique identifier
https://doi.org/10.7916/D86M34TN
Dataset updated
2014
Authors
Priego, Ernesto; Zarate, Chris
Description
MLA14 is the hashtag which corresponded to the 2014 Modern Language Association Annual Convention. The Convention was held in Chicago from Monday 9 to Sunday 12 January 2014. A combination of Twitter Archiving Google Spreadsheets (Martin Hawksey's TAGS v5; available at http://mashe.hawksey.info/2013/02/twitter-archive-tagsv5/) was used to harvest this collection. OpenRefine (http://openrefine.org/) was used for deduplicating the data. An initial analysis of the data was posted as a series of blog posts by Ernesto Priego published between 16 January and 22 January 2014 at MLA Commons (http://remoteparticipation.commons.mla.org/2014/01/16/mla14-a-first-look/) (accessed 4 February 2014).
MECP2 DECIPHER genetic variant data in RDF
figshare.com
txt
Updated Nov 28, 2019
Cite
Annika Jacobsen (2019). MECP2 DECIPHER genetic variant data in RDF [Dataset]. http://doi.org/10.6084/m9.figshare.11295464.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11295464.v1
Dataset updated
Nov 28, 2019
Dataset provided by
Figshare http://figshare.com/
Authors
Annika Jacobsen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MECP2 genetic variant data from DECIPHER described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Slightly Cleaner Version of Enipedia's OpenRefine Tutorial universities...
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Cite
Jenkins, Keith (2023). Slightly Cleaner Version of Enipedia's OpenRefine Tutorial universities dataset [Dataset]. http://doi.org/10.7910/DVN/DC0E63
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/DC0E63
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Jenkins, Keith
Description
Teaching dataset used for an OpenRefine workshop offered at Cornell University's Mann Library, adapted from the Enipedia OpenRefine tutorial, which uses the "universities.csv" dataset of data scraped from Wikipedia. This is a slightly cleaner subset of the university data, just to avoid unnecessary distractions caused by the extreme messiness of the original dataset.
doaj article sample
figshare.com
txt
Updated Mar 10, 2021
Cite
Clarke Iakovakis (2021). doaj article sample [Dataset]. http://doi.org/10.6084/m9.figshare.14195345.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.14195345.v1
Dataset updated
Mar 10, 2021
Dataset provided by
Figshare http://figshare.com/
Authors
Clarke Iakovakis
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Library Carpentry OpenRefine data
CSV Dataset Files and JSON OpenRefine Recipes for Alignment of the...
data.niaid.nih.gov
Updated Jun 14, 2023
Cite
Coladangelo, L.P. (2023). CSV Dataset Files and JSON OpenRefine Recipes for Alignment of the Schoenberg Dataset of Manuscripts (SDBM) Name Authority with Wikidata [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_8033927
Explore at:
Dataset updated
Jun 14, 2023
Dataset provided by
Coladangelo, L.P.
Ransom, Lynn
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset CSVs and JSON recipe files for OpenRefine for a project to align Name Authority records in the Schoenberg Dataset of Manuscripts (SDBM) with Wikidata Items
MECP2 Rettbase genetic variant data in RDF
figshare.com
txt
Updated Nov 29, 2019
Cite
Annika Jacobsen (2019). MECP2 Rettbase genetic variant data in RDF [Dataset]. http://doi.org/10.6084/m9.figshare.11297861.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11297861.v1
Dataset updated
Nov 29, 2019
Dataset provided by
Figshare http://figshare.com/
Authors
Annika Jacobsen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MECP2 genetic variant data from Rettbase described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
MECP2 KMD genetic variant data in RDF
figshare.com
txt
Updated Nov 28, 2019
+ more versions
Cite
Annika Jacobsen (2019). MECP2 KMD genetic variant data in RDF [Dataset]. http://doi.org/10.6084/m9.figshare.11295512.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11295512.v1
Dataset updated
Nov 28, 2019
Dataset provided by
Figshare http://figshare.com/
Authors
Annika Jacobsen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MECP2 genetic variant data from ClinVar described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Ogham Stones Wikidata Import
zenodo.org
data.niaid.nih.gov
txt
Updated Jan 24, 2020
Cite
Florian Thiery; Sophie Charlotte Schmidt (2020). Ogham Stones Wikidata Import [Dataset]. http://doi.org/10.5281/zenodo.3612655
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.5281/zenodo.3612655
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Florian Thiery; Sophie Charlotte Schmidt
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ogham Stones Wikidata Import

more at: https://github.com/ogi-ogham/ogham-wikidata/tree/master/OgamStones
PCCF+: Postal Code Conversion File Plus
dataone.org
Updated Dec 28, 2023
Cite
Jeff Moon (2023). PCCF+: Postal Code Conversion File Plus [Dataset]. http://doi.org/10.5683/SP3/SXIQPW
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/SXIQPW
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Jeff Moon
Description
This hands-on workshop has two parts. The first part covers working with SAS and the Postal Code Conversion File Plus. You'll start with Postal Codes, and leave with Census geography that can be linked to Census demographics. The second part introduces OpenRefine, an open source software platform for cleaning up messy data files. Initially developed by Google, OpenRefine will open your eyes to the beauty of clean data! No previous experience required.
MECP2 EVA genetic variant data in RDF
figshare.com
txt
Updated Nov 28, 2019
Cite
Annika Jacobsen (2019). MECP2 EVA genetic variant data in RDF [Dataset]. http://doi.org/10.6084/m9.figshare.11295473.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11295473.v1
Dataset updated
Nov 28, 2019
Dataset provided by
Figshare http://figshare.com/
Authors
Annika Jacobsen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MECP2 genetic variant data from EVA described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Google Refine provenance vocabulary
datadiscoverystudio.org
Updated Apr 30, 2015
Cite
(2015). Google Refine provenance vocabulary [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/6224ae9504164a03b9f85fc26a822183/html
Explore at:
Dataset updated
Apr 30, 2015
Description
This vocabulary defines the terms needed to describe the provenance of RDF data exported using the RDF Extension for Google Refine. It uses the Open Provenance Model Vocabulary (OPMV).
Ogham Townlands Wikidata Import
zenodo.org
data.niaid.nih.gov
txt
Updated Jan 24, 2020
Cite
Florian Thiery; Sophie Charlotte Schmidt (2020). Ogham Townlands Wikidata Import [Dataset]. http://doi.org/10.5281/zenodo.3612650
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.5281/zenodo.3612650
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Florian Thiery; Sophie Charlotte Schmidt
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ogham Townlands Wikidata Import

more at: https://github.com/ogi-ogham/ogham-wikidata/tree/master/Townlands
MECP2 MRD genetic variant data in RDF
figshare.com
txt
Updated Dec 4, 2019
Cite
Annika Jacobsen (2019). MECP2 MRD genetic variant data in RDF [Dataset]. http://doi.org/10.6084/m9.figshare.11295617.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.11295617.v1
Dataset updated
Dec 4, 2019
Dataset provided by
Figshare http://figshare.com/
Authors
Annika Jacobsen
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MECP2 genetic variant data from the Maastricht Rett database (MRD) described using the Resource Description Framework (RDF) format following the HGVS standard. This resource is FAIR and has been analyzed in combination with other MECP2 genetic variant data enabled by their interoperability. The data was described in RDF following a genetic variant semantic data model (https://github.com/LUMC-BioSemantics/rett-variant) using a general-purpose FAIRifier tool (https://doi.org/10.1162/dint_a_00031) based on the OpenRefine data cleaning and wrangling tool (http://openrefine.org/). The FAIR machine-readable metadata is available at: http://purl.org/biosemantics-lumc/rettbase/fdp
Projet OpenRefine pour un tutoriel de création d'une carte avec Wikidata,...
nakala.fr
Updated Dec 4, 2024
Cite
Jean-Baptiste Pressac (2024). Projet OpenRefine pour un tutoriel de création d'une carte avec Wikidata, OpenRefine et uMap [Dataset]. http://doi.org/10.34847/nkl.e6a03ldd
Explore at:
Unique identifier
https://doi.org/10.34847/nkl.e6a03ldd
Dataset updated
Dec 4, 2024
Dataset provided by
Huma-Num https://www.huma-num.fr/
Authors
Jean-Baptiste Pressac
License
https://spdx.org/licenses/etalab-2.0.html#licenseText
Description
OpenRefine project for the tutorial https://bylg.hypotheses.org/543
Web scraping de données publiques d'un échantillon de 11 ports européens
zenodo.org
data.niaid.nih.gov
zip
Updated Aug 12, 2020
Cite
Druey Guy (2020). Web scraping de données publiques d'un échantillon de 11 ports européens [Dataset]. http://doi.org/10.5281/zenodo.3980515
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3980515
Dataset updated
Aug 12, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Druey Guy
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Web scraping, carried out from 25 to 27 June 2020, of public data on vessel records for a sample of 11 European ports: Aarhus (DK), Amsterdam (NL), Bordeaux (FR), Copenhagen (DK), Dunkirk (FR), Fredericia (DK), Hamburg (DE), Klaipeda (LT), Le Havre (FR), Niedersachsen (DE) and Rotterdam (NL).
