2 datasets found

Tweets at the 2014 Jacathon by Aragon Open Data
data.europa.eu
json
Updated Oct 14, 2020
Gobierno de Aragón (2020). Tweets at the 2014 Jacathon by Aragon Open Data [Dataset]. https://data.europa.eu/data/datasets/https-opendata-aragon-es-datos-catalogo-dataset-tweets-en-el-jacathon-2014-de-aragon-open-data
Available download formats: json
Dataset updated
Oct 14, 2020
Dataset authored and provided by
Gobierno de Aragón
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This file collects the tweets generated before and after the Jacathon Aragón Open Data. The collection was built by monitoring a set of hashtags related to the Jacathon, specifically: jacathon, maddata, medialab, jacaton, datathon, aragonopendata, hackathon, hackaton, opendata, jaca
Literary Authors from Europe and Eurasia Web Archive collection derivatives
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Jan 31, 2020
Nick Ruest; Anna Rakityanskaya; Thomas Keenan; Robert Davis; Anna Arays; Samantha Abrams (2020). Literary Authors from Europe and Eurasia Web Archive collection derivatives [Dataset]. http://doi.org/10.5281/zenodo.3632728
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.3632728
Dataset updated
Jan 31, 2020
Dataset provided by
Zenodo, http://zenodo.org/
Authors
Nick Ruest; Anna Rakityanskaya; Thomas Keenan; Robert Davis; Anna Arays; Samantha Abrams
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Eurasia, Europe
Description
Web archive derivatives of the Literary Authors from Europe and Eurasia Web Archive collection from the Ivy Plus Libraries Confederation. The derivatives were created with the Archives Unleashed Toolkit and Archives Unleashed Cloud.

The ivy-12172-parquet.tar.gz derivatives are in the Apache Parquet format, which is a columnar storage format. These derivatives are generally small enough to work with on your local machine, and can be easily converted to Pandas DataFrames. See this notebook for examples.
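
As a minimal sketch of the local workflow described above, the following unpacks a .tar.gz archive and reads Parquet data into a Pandas DataFrame. The layout inside ivy-12172-parquet.tar.gz is not documented in this description, so the function names and the parquet_dir argument are assumptions for illustration, and pandas.read_parquet additionally needs a Parquet engine such as pyarrow to be installed.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Refuse members that would land outside dest_dir.
            target = (dest / member.name).resolve()
            if dest != target and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return sorted(p for p in dest.rglob("*") if p.is_file())


def load_parquet(parquet_dir):
    """Read a Parquet file or directory into a DataFrame (needs pandas and pyarrow)."""
    import pandas as pd  # imported lazily so extraction works without pandas

    return pd.read_parquet(parquet_dir)
```

A call such as extract_archive("ivy-12172-parquet.tar.gz", "derivatives") would list the extracted files, after which load_parquet can be pointed at whichever directory holds the derivative of interest.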

Domains

.webpages().groupBy(ExtractDomainDF($"url").alias("url")).count().sort($"count".desc)

Produces a DataFrame with the following columns:

domain

count
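
For readers without a Spark setup, the grouping above can be approximated in plain Python. This is only a rough analogue under stated assumptions: count_domains is a hypothetical helper, it takes an in-memory list of URLs, and it uses the URL's host name as the domain, which may not match how the toolkit's own domain extraction normalizes values.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(urls):
    """Return (domain, count) pairs sorted by descending count."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname  # None when the URL has no network location
        if host:
            counts[host] += 1
    return counts.most_common()


urls = ["http://example.org/a", "http://example.org/b", "https://example.com/"]
print(count_domains(urls))  # [('example.org', 2), ('example.com', 1)]
```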

Web Pages

.webpages().select($"crawl_date", $"url", $"mime_type_web_server", $"mime_type_tika", RemoveHTMLDF(RemoveHTTPHeaderDF(($"content"))).alias("content"))

Produces a DataFrame with the following columns:

crawl_date

url

mime_type_web_server

mime_type_tika

content

Web Graph

.webgraph()

Produces a DataFrame with the following columns:

crawl_date

src

dest

anchor
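
One common use of a table with these columns is to collapse repeated links into weighted edges before loading them into a network tool. The sketch below assumes each row is available as a dict keyed by the column names listed above; edge_weights is a hypothetical helper, not part of the Archives Unleashed Toolkit.

```python
from collections import Counter


def edge_weights(rows):
    """Count how often each (src, dest) pair occurs across web graph rows."""
    weights = Counter()
    for row in rows:
        weights[(row["src"], row["dest"])] += 1
    return weights
```

If the web graph derivative has been loaded into a Pandas DataFrame, df.to_dict("records") yields rows in this shape.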

Image Links

.imageLinks()

Produces a DataFrame with the following columns:

src

image_url

Binary Analysis

Audio

Images

PDFs

Presentation program files

Spreadsheets

Text files

Word processor files

The ivy-12172-auk.tar.gz derivatives are the standard set of web archive derivatives produced by the Archives Unleashed Cloud.

Gephi file, which can be loaded into Gephi. It will have basic characteristics already computed and a basic layout.

Raw Network file, which can also be loaded into Gephi. You will have to use that network program to lay it out yourself.

Full text file. In it, each website within the web archive collection will have its full text presented on one line, along with information around when it was crawled, the name of the domain, and the full URL of the content.

Domains count file. A text file containing the frequency count of domains captured within your web archive.