Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This has been copied from the README.md file
bris-lib-checkout
This repository provides tidied-up data from the Brisbane library checkouts.
Retrieving and cleaning the data
The script for retrieving and cleaning the data is made available in scrape-library.R.
The data
data/
This directory contains four tidied-up data frames:
tidy-brisbane-library-checkout.csv contains the following columns; the metadata file metadata_heading.csv describes each of them.
knitr::kable(readr::read_csv("data/metadata_heading.csv"))
#> Parsed with column specification:
#> cols(
#> heading = col_character(),
#> heading_explanation = col_character()
#> )
| heading | heading_explanation |
|:--|:--|
| Title | Title of Item |
| Author | Author of Item |
| Call Number | Call Number of Item |
| Item id | Unique Item Identifier |
| Item Type | Type of Item (see next column) |
| Status | Current Status of Item |
| Language | Published language of item (if not English) |
| Age | Suggested audience |
| Checkout Library | Checkout branch |
| Date | Checkout date |
We also added year, month, and day columns.
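The README does not show how these columns were derived. As a minimal sketch, assuming they come from the checkout datetime and that `day` holds a weekday name (the column specification further down reads it as character), the derivation could look like this in base R. The timestamp and the `date_parts` name are made up for illustration.

```r
# Hypothetical checkout timestamp, used only for illustration
checkout_time <- as.POSIXct("2018-03-05 10:30:00", tz = "UTC")

# Derive the extra columns from the timestamp
date_parts <- data.frame(
  datetime = checkout_time,
  year = as.numeric(format(checkout_time, "%Y")),   # numeric year
  month = as.numeric(format(checkout_time, "%m")),  # numeric month
  day = weekdays(checkout_time),                    # weekday name (locale-dependent)
  stringsAsFactors = FALSE
)

date_parts
```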
The remaining files are metadata files that describe the codes used in the columns of the checkout data:
library(tidyverse)
#> ── Attaching packages ────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
#> ✔ tibble 1.4.99.9006 ✔ dplyr 0.7.8
#> ✔ tidyr 0.8.2 ✔ stringr 1.3.1
#> ✔ readr 1.3.0 ✔ forcats 0.3.0
#> ── Conflicts ───────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
knitr::kable(readr::read_csv("data/metadata_branch.csv"))
#> Parsed with column specification:
#> cols(
#> branch_code = col_character(),
#> branch_heading = col_character()
#> )
| branch_code | branch_heading |
|:--|:--|
| ANN | Annerley |
| ASH | Ashgrove |
| BNO | Banyo |
| BRR | BrackenRidge |
| BSQ | Brisbane Square Library |
| BUL | Bulimba |
| CDA | Corinda |
| CDE | Chermside |
| CNL | Carindale |
| CPL | Coopers Plains |
| CRA | Carina |
| EPK | Everton Park |
| FAI | Fairfield |
| GCY | Garden City |
| GNG | Grange |
| HAM | Hamilton |
| HPK | Holland Park |
| INA | Inala |
| IPY | Indooroopilly |
| MBG | Mt. Coot-tha |
| MIT | Mitchelton |
| MTG | Mt. Gravatt |
| MTO | Mt. Ommaney |
| NDH | Nundah |
| NFM | New Farm |
| SBK | Sunnybank Hills |
| SCR | Stones Corner |
| SGT | Sandgate |
| VAN | Mobile Library |
| TWG | Toowong |
| WND | West End |
| WYN | Wynnum |
| ZIL | Zillmere |
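One way to use this lookup is to join it onto the checkout data so that branch codes become readable names. The sketch below assumes the `library` column of the checkout data holds these branch codes; the toy rows and the `toy_checkouts`, `toy_branches`, and `decoded` names are made up for illustration.

```r
library(dplyr)

# Toy checkout rows; `library` is assumed to hold branch codes
toy_checkouts <- tibble(
  title = c("Book A", "Book B", "Book C"),
  library = c("ANN", "BSQ", "ANN")
)

# A two-row subset of the branch metadata shown above
toy_branches <- tibble(
  branch_code = c("ANN", "BSQ"),
  branch_heading = c("Annerley", "Brisbane Square Library")
)

# Attach the branch name to every checkout row
decoded <- left_join(toy_checkouts, toy_branches, by = c("library" = "branch_code"))

decoded
```

The same pattern should work for the item type metadata, joining on `item_type` and `item_type_code` instead.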
knitr::kable(readr::read_csv("data/metadata_item_type.csv"))
#> Parsed with column specification:
#> cols(
#> item_type_code = col_character(),
#> item_type_explanation = col_character()
#> )
| item_type_code | item_type_explanation |
|:--|:--|
| AD-FICTION | Adult Fiction |
| AD-MAGS | Adult Magazines |
| AD-PBK | Adult Paperback |
| BIOGRAPHY | Biography |
| BSQCDMUSIC | Brisbane Square CD Music |
| BSQCD-ROM | Brisbane Square CD Rom |
| BSQ-DVD | Brisbane Square DVD |
| CD-BOOK | Compact Disc Book |
| CD-MUSIC | Compact Disc Music |
| CD-ROM | CD Rom |
| DVD | DVD |
| DVD_R18+ | DVD Restricted - 18+ |
| FASTBACK | Fastback |
| GAYLESBIAN | Gay and Lesbian Collection |
| GRAPHICNOV | Graphic Novel |
| ILL | InterLibrary Loan |
| JU-FICTION | Junior Fiction |
| JU-MAGS | Junior Magazines |
| JU-PBK | Junior Paperback |
| KITS | Kits |
| LARGEPRINT | Large Print |
| LGPRINTMAG | Large Print Magazine |
| LITERACY | Literacy |
| LITERACYAV | Literacy Audio Visual |
| LOCSTUDIES | Local Studies |
| LOTE-BIO | Languages Other than English Biography |
| LOTE-BOOK | Languages Other than English Book |
| LOTE-CDMUS | Languages Other than English CD Music |
| LOTE-DVD | Languages Other than English DVD |
| LOTE-MAG | Languages Other than English Magazine |
| LOTE-TB | Languages Other than English Taped Book |
| MBG-DVD | Mt Coot-tha Botanical Gardens DVD |
| MBG-MAG | Mt Coot-tha Botanical Gardens Magazine |
| MBG-NF | Mt Coot-tha Botanical Gardens Non Fiction |
| MP3-BOOK | MP3 Audio Book |
| NONFIC-SET | Non Fiction Set |
| NONFICTION | Non Fiction |
| PICTURE-BK | Picture Book |
| PICTURE-NF | Picture Book Non Fiction |
| PLD-BOOK | Public Libraries Division Book |
| YA-FICTION | Young Adult Fiction |
| YA-MAGS | Young Adult Magazine |
| YA-PBK | Young Adult Paperback |
Example usage
Let’s explore the data:
bris_libs <- readr::read_csv("data/bris-lib-checkout.csv")
#> Parsed with column specification:
#> cols(
#> title = col_character(),
#> author = col_character(),
#> call_number = col_character(),
#> item_id = col_double(),
#> item_type = col_character(),
#> status = col_character(),
#> language = col_character(),
#> age = col_character(),
#> library = col_character(),
#> date = col_double(),
#> datetime = col_datetime(format = ""),
#> year = col_double(),
#> month = col_double(),
#> day = col_character()
#> )
#> Warning: 20 parsing failures.
#> row col expected actual file
#> 587795 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 590579 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 590597 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 595774 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 597567 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> ...... ....... ........ ....... ............................
#> See problems(...) for more details.
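The parsing failures above arise because `item_id` is guessed to be numeric while a few rows contain the text REFRESH. One possible way to avoid the warning, sketched here on a small made-up file rather than the real CSV, is to read `item_id` as character with `col_types`. The file contents, `toy_path`, and `toy` are illustrative assumptions.

```r
library(readr)

# Small stand-in for data/bris-lib-checkout.csv (values are made up)
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "title,item_id,item_type",
  "Book A,34567890123456,AD-FICTION",
  "Book B,REFRESH,DVD"
), toy_path)

# Read item_id as character so entries such as "REFRESH" are kept as-is;
# the remaining columns are still guessed
toy <- read_csv(toy_path, col_types = cols(item_id = col_character()))

# Count the rows that carry the non-numeric placeholder
sum(toy$item_id == "REFRESH")
```

Keeping the identifier as character also avoids any loss of precision for long numeric ids; whether the REFRESH rows should then be dropped or kept depends on the analysis.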
We can count the number of checkouts by title, item type, suggested age, and library:
library(dplyr)
count(bris_libs, title, sort = TRUE)
#> # A tibble: 121,046 x 2
#> title n
#>
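Only the count by title is shown above. As a sketch of the remaining counts, the same `count()` call can be applied to the other columns. This uses a toy tibble in place of `bris_libs`; its values (including the age labels) are made up for illustration.

```r
library(dplyr)

# Toy stand-in for bris_libs with the columns being counted
toy_checkouts <- tibble(
  title = c("Book A", "Book A", "Book B"),
  item_type = c("AD-FICTION", "AD-FICTION", "DVD"),
  age = c("ADULT", "ADULT", "JUVENILE"),
  library = c("ANN", "BSQ", "ANN")
)

# One count per column of interest, most frequent value first
type_counts <- count(toy_checkouts, item_type, sort = TRUE)
age_counts <- count(toy_checkouts, age, sort = TRUE)
library_counts <- count(toy_checkouts, library, sort = TRUE)

type_counts
```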
License
This data is provided under a CC BY 4.0 license.
It has been downloaded from Brisbane library checkouts and tidied up using the code in data-raw.
The data this week comes from Adam Vagnar, who also blogged about this dataset. There's a LOT of data here: match-level results, player details, and match-level statistics for some matches. All matches in this dataset are played 2 vs 2, so there are columns for 2 winners (1 team) and 2 losers (1 team). The data is clean and relatively ready for analysis, although there are some duplicated columns and the data is wide because each team has 2 players.
Check out the data dictionary, or Wikipedia for some longer-form details around what the various match statistics mean.
Most of the data is from the international FIVB tournaments but about 1/3 is from the US-centric AVP.
The FIVB Beach Volleyball World Tour (known between 2003 and 2012 as the FIVB Beach Volleyball Swatch World Tour for sponsorship reasons) is the worldwide professional beach volleyball tour for both men and women organized by the Fédération Internationale de Volleyball (FIVB). The World Tour was introduced for men in 1989 while the women first competed in 1992.
Winning the World Tour is considered to be one of the highest honours in international beach volleyball, being surpassed only by the World Championships, and the Beach Volleyball tournament at the Summer Olympic Games.
FiveThirtyEight examined the disadvantage of serving in beach volleyball, although they used Olympic-level data. Again, Adam Vagnar also covered this data on his blog.
TidyTuesday
A weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome - just please remember to share the code used to generate the results.
Join the R4DS Online Learning Community in the weekly #TidyTuesday event! Every week we post a raw dataset, a chart or article related to that dataset, and ask you to explore the data. While the dataset will be “tamed”, it will not always be tidy!
We will have many sources of data and want to emphasize that no causation is implied. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our guidelines are to use the data provided to practice your data tidying and plotting techniques. Participants are invited to consider for themselves what nuancing factors might underlie these relationships.
The intent of Tidy Tuesday is to provide a safe and supportive forum for individuals to practice their wrangling and data visualization skills independent of drawing conclusions. While we understand that the two are related, the focus of this practice is purely on building skills with real-world data.