#TidyTuesday is a weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome - just please remember to share the code used to generate the results.
This data comes from Playbill. Weekly box office grosses comprise data on revenue and attendance figures for theatres that are part of The Broadway League, an industry association for, you guessed it, Broadway theatre.
CPI data is from the U.S. Bureau of Labor Statistics. There are many, many measures of CPI, so the one used here is "All items less food and energy in U.S. city average, all urban consumers, seasonally adjusted" (table CUSR0000SA0L1E).
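The source does not spell out how the CPI series should be applied to the grosses, so the following is only a minimal sketch of the usual approach: scale a nominal amount by the ratio of a reference-period index to the index for the period the amount comes from. The function name, the dictionary layout, and every number below are hypothetical and are not BLS or Playbill figures.

```python
def adjust_for_inflation(nominal, cpi_then, cpi_ref):
    """Express a nominal dollar amount in reference-period dollars."""
    if cpi_then <= 0:
        raise ValueError("CPI values must be positive")
    return nominal * cpi_ref / cpi_then


# Hypothetical index values for illustration only (not real BLS data).
cpi_by_month = {"2000-01": 100.0, "2020-01": 150.0}

# Hypothetical weekly gross, recorded in 2000-01 dollars.
weekly_gross_nominal = 500_000.0

adjusted = adjust_for_inflation(
    weekly_gross_nominal,
    cpi_then=cpi_by_month["2000-01"],
    cpi_ref=cpi_by_month["2020-01"],
)
print(adjusted)
```

Choosing a single reference month for the whole series keeps every adjusted figure in the same dollars, which is what makes grosses from different decades comparable on one chart.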
Huge thanks to Alex Cookson who provided ALL of this week's data, cleaning script, and readme! You can check out his recent blog post on the same data here, and explore all of the raw data and other details on Alex's GitHub.
The data this week comes from Adam Vagnar, who also blogged about this dataset. There's a LOT of data here: match-level results, player details, and match-level statistics for some matches. All matches in this dataset are played 2 vs 2, so there are columns for 2 winners (1 team) and 2 losers (1 team). The data is relatively clean and ready for analysis, although there are some duplicated columns and the data is wide because each team has 2 players.
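Because each match row carries columns for two winners and two losers, many analyses are easier after reshaping to one row per player per match. The sketch below shows that wide-to-long step in plain Python. The column names (`match_id`, `w_player1`, `w_player2`, `l_player1`, `l_player2`) and the sample row are assumptions made for illustration, so check them against the data dictionary before reusing this.

```python
def players_long(matches):
    """Turn one wide row per match into one row per player per match."""
    rows = []
    for match in matches:
        # Assumed prefixes: "w" for the winning team, "l" for the losing team.
        for outcome, prefix in (("winner", "w"), ("loser", "l")):
            for slot in (1, 2):
                rows.append({
                    "match_id": match["match_id"],
                    "outcome": outcome,
                    "player": match[f"{prefix}_player{slot}"],
                })
    return rows


# Hypothetical match row; the player names are placeholders.
wide = [
    {"match_id": 1,
     "w_player1": "Player A", "w_player2": "Player B",
     "l_player1": "Player C", "l_player2": "Player D"},
]

long_rows = players_long(wide)
for row in long_rows:
    print(row)
```

Each match yields four rows in the long form, so per-player summaries (matches played, wins) become simple group-and-count operations.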
Check out the data dictionary, or Wikipedia for some longer-form details around what the various match statistics mean.
Most of the data is from the international FIVB tournaments but about 1/3 is from the US-centric AVP.
The FIVB Beach Volleyball World Tour (known between 2003 and 2012 as the FIVB Beach Volleyball Swatch World Tour for sponsorship reasons) is the worldwide professional beach volleyball tour for both men and women organized by the Fédération Internationale de Volleyball (FIVB). The World Tour was introduced for men in 1989 while the women first competed in 1992.
Winning the World Tour is considered to be one of the highest honours in international beach volleyball, being surpassed only by the World Championships, and the Beach Volleyball tournament at the Summer Olympic Games.
FiveThirtyEight examined the disadvantage of serving in beach volleyball, although they used Olympic-level data. Again, Adam Vagnar also covered this data on his blog.
TidyTuesday
A weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome - just please remember to share the code used to generate the results.
Join the R4DS Online Learning Community in the weekly #TidyTuesday event! Every week we post a raw dataset, a chart or article related to that dataset, and ask you to explore the data. While the dataset will be “tamed”, it will not always be tidy!
We will have many sources of data and want to emphasize that no causation is implied. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our guidelines are to use the data provided to practice your data tidying and plotting techniques. Participants are invited to consider for themselves what nuancing factors might underlie these relationships.
The intent of Tidy Tuesday is to provide a safe and supportive forum for individuals to practice their wrangling and data visualization skills independent of drawing conclusions. While we understand that the two are related, the focus of this practice is purely on building skills with real-world data.
This week's dataset is all about meteorites: where they fell and when they fell! The data comes from the Meteoritical Society by way of NASA. H/t to #TidyTuesday community member Malin Axelsson for sharing this data as an issue on GitHub!
If you want to find out more about meteorite classifications, Malin was kind enough to share a Wikipedia article as well!
meteorites.csv
variable | class | description |
---|---|---|
name | character | Meteorite name |
id | double | Meteorite numerical ID |
name_type | character | Name type either valid or relict, where relict = a meteorite that cannot be assigned easily to a class |
class | character | Class of the meteorite, please see Wikipedia for full context |
mass | double | Mass in grams |
fall | character | Fell or Found meteorite |
year | integer | Year found |
lat | double | Latitude |
long | double | Longitude |
geolocation | character | Geolocation |
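As a rough illustration of how the data dictionary above maps onto typed records, here is a minimal sketch that reads a CSV with these columns using only the Python standard library. The two sample rows are made up, and treating empty strings and `NA` as missing values is an assumption about how the file encodes gaps.

```python
import csv
import io

# Made-up rows that follow the column layout in the data dictionary.
SAMPLE_CSV = """\
name,id,name_type,class,mass,fall,year,lat,long,geolocation
Sample A,1,Valid,L5,21,Fell,1880,10.5,20.25,"(10.5, 20.25)"
Sample B,2,Valid,H6,NA,Found,NA,NA,NA,NA
"""

# Assumption: missing values appear as empty strings or "NA".
MISSING = {"", "NA"}


def parse_value(text, cast):
    """Cast a CSV field, returning None for missing values."""
    return None if text in MISSING else cast(text)


def read_meteorites(handle):
    """Read meteorite rows into dicts typed per the data dictionary."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "name": row["name"],
            "id": parse_value(row["id"], float),      # listed as double
            "name_type": row["name_type"],
            "class": row["class"],
            "mass": parse_value(row["mass"], float),  # grams
            "fall": row["fall"],                      # "Fell" or "Found"
            "year": parse_value(row["year"], int),
            "lat": parse_value(row["lat"], float),
            "long": parse_value(row["long"], float),
            "geolocation": parse_value(row["geolocation"], str),
        })
    return records


records = read_meteorites(io.StringIO(SAMPLE_CSV))

# Total mass in grams of meteorites seen falling, skipping missing masses.
fell_mass = sum(
    r["mass"] for r in records
    if r["fall"] == "Fell" and r["mass"] is not None
)
print(len(records), fell_mass)
```

To run it against the real file, replace `io.StringIO(SAMPLE_CSV)` with `open("meteorites.csv", newline="", encoding="utf-8")`, assuming the file has been downloaded locally.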
@misc{tidytuesday,
  title = {Tidy Tuesday: A weekly social data project},
  author = {R4DS Online Learning Community},
  url = {https://github.com/rfordatascience/tidytuesday},
  year = {2023}
}