3 datasets found
  1. Broadway Weekly Grosses

    • kaggle.com
    Updated Apr 29, 2020
    Cite
    Jesse Mostipak (2020). Broadway Weekly Grosses [Dataset]. https://www.kaggle.com/jessemostipak/broadway-weekly-grosses/code
    Dataset updated
    Apr 29, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jesse Mostipak
    Description

    Context

    #TidyTuesday is a weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome; just please remember to share the code used to generate the results.

    Content

    This data comes from Playbill. Weekly box office grosses comprise data on revenue and attendance figures for theatres that are part of The Broadway League, an industry association for, you guessed it, Broadway theatre.

    CPI data is from the U.S. Bureau of Labor Statistics. There are many, many measures of CPI, so the one used here is "All items less food and energy in U.S. city average, all urban consumers, seasonally adjusted" (table CUSR0000SA0L1E).
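    The description does not say how the CPI series is meant to be combined with the grosses. One common use, assumed here, is to rescale nominal dollars to a reference period by the ratio of two index values. The function name and all numbers below are illustrative and are not taken from the dataset.

```python
def adjust_for_inflation(nominal: float, cpi_at_time: float, cpi_reference: float) -> float:
    """Rescale a nominal dollar amount to reference-period dollars using a CPI ratio."""
    if cpi_at_time <= 0 or cpi_reference <= 0:
        raise ValueError("CPI values must be positive")
    return nominal * cpi_reference / cpi_at_time


# Illustrative values only (assumptions, not real figures from the data).
weekly_gross = 1_000_000.0   # nominal gross for some week
cpi_that_week = 200.0        # hypothetical index value for that week
cpi_reference = 250.0        # hypothetical index value for the reference period

real_gross = adjust_for_inflation(weekly_gross, cpi_that_week, cpi_reference)
print(f"{real_gross:,.0f}")
```

    Dividing by the index value for the week in question and multiplying by the reference value keeps the units in dollars, so adjusted grosses from different years become comparable.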

    Acknowledgements

    Huge thanks to Alex Cookson who provided ALL of this week's data, cleaning script, and readme! You can check out his recent blog post on the same data here, and explore all of the raw data and other details on Alex's GitHub.

  2. Beach Volleyball

    • kaggle.com
    Updated May 18, 2020
    Cite
    Jesse Mostipak (2020). Beach Volleyball [Dataset]. https://www.kaggle.com/jessemostipak/beach-volleyball/metadata
    Dataset updated
    May 18, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jesse Mostipak
    Description

    Beach Volleyball

    The data this week comes from Adam Vagnar, who also blogged about this dataset. There's a LOT of data here: match-level results, player details, and match-level statistics for some matches. All matches in this dataset are played 2 vs 2, so there are columns for 2 winners (1 team) and 2 losers (1 team). The data is relatively clean and ready for analysis, although there are some duplicated columns and the data is wide because each team has 2 players.
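    Because each match row carries two winners and two losers, a per-player analysis usually starts by reshaping the wide row into one row per player. The sketch below shows that reshaping with plain dictionaries. The column names (`match_id`, `w_player1`, `w_player2`, `l_player1`, `l_player2`) and the player names are assumptions for illustration; check the data dictionary for the real ones.

```python
from typing import Dict, List

# Hypothetical column names: two winner columns and two loser columns per match.
PLAYER_COLUMNS = {
    "w_player1": "winner",
    "w_player2": "winner",
    "l_player1": "loser",
    "l_player2": "loser",
}


def match_to_player_rows(match: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn one wide match record into four long records, one per player."""
    return [
        {"match_id": match["match_id"], "player": match[column], "outcome": outcome}
        for column, outcome in PLAYER_COLUMNS.items()
    ]


# Made-up match record, used only to show the shape of the result.
wide_match = {
    "match_id": "m1",
    "w_player1": "Player A",
    "w_player2": "Player B",
    "l_player1": "Player C",
    "l_player2": "Player D",
}

long_rows = match_to_player_rows(wide_match)
for row in long_rows:
    print(row)
```

    In the long form, questions such as "how many matches did each player win" become a simple count over the `player` and `outcome` fields.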

    Check out the data dictionary, or Wikipedia for some longer-form details around what the various match statistics mean.

    Most of the data is from the international FIVB tournaments but about 1/3 is from the US-centric AVP.

    The FIVB Beach Volleyball World Tour (known between 2003 and 2012 as the FIVB Beach Volleyball Swatch World Tour for sponsorship reasons) is the worldwide professional beach volleyball tour for both men and women organized by the Fédération Internationale de Volleyball (FIVB). The World Tour was introduced for men in 1989 while the women first competed in 1992.

    Winning the World Tour is considered to be one of the highest honours in international beach volleyball, being surpassed only by the World Championships, and the Beach Volleyball tournament at the Summer Olympic Games.

    FiveThirtyEight examined the disadvantage of serving in beach volleyball, although they used Olympic-level data. Again, Adam Vagnar also covered this data on his blog.

    What is Tidy Tuesday?

    TidyTuesday is a weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome; just please remember to share the code used to generate the results.

    Join the R4DS Online Learning Community in the weekly #TidyTuesday event! Every week we post a raw dataset, a chart or article related to that dataset, and ask you to explore the data. While the dataset will be “tamed”, it will not always be tidy!

    We will have many sources of data and want to emphasize that no causation is implied. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our guidelines are to use the data provided to practice your data tidying and plotting techniques. Participants are invited to consider for themselves what nuancing factors might underlie these relationships.

    The intent of Tidy Tuesday is to provide a safe and supportive forum for individuals to practice their wrangling and data visualization skills independent of drawing conclusions. While we understand that the two are related, the focus of this practice is purely on building skills with real-world data.

  3. NASA Meteorites Dataset

    • kaggle.com
    Updated Oct 12, 2023
    Cite
    Sujay Kapadnis (2023). NASA Meteorites Dataset [Dataset]. https://www.kaggle.com/datasets/sujaykapadnis/meteorites-dataset/discussion?sort=undefined
    Dataset updated
    Oct 12, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sujay Kapadnis
    Description

    This week's dataset is all about meteorites: where they fell and when they fell! Data comes from the Meteoritical Society by way of NASA. H/t to #TidyTuesday community member Malin Axelsson for sharing this data as an issue on GitHub!

    If you want to find out more about meteorite classifications, Malin was kind enough to share a Wikipedia article as well!

    Data Dictionary

    meteorites.csv

    variable     class      description
    name         character  Meteorite name
    id           double     Meteorite numerical ID
    name_type    character  Name type either valid or relict, where relict = a meteorite that cannot be assigned easily to a class
    class        character  Class of the meteorite, please see Wikipedia for full context
    mass         double     Mass in grams
    fall         character  Fell or Found meteorite
    year         integer    Year found
    lat          double     Latitude
    long         double     Longitude
    geolocation  character  Geolocation
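
    As a minimal sketch of how the data dictionary maps onto code, the snippet below reads rows with these ten columns using Python's standard `csv` module and converts the numeric fields. The two sample rows are made up for illustration, and treating blank cells as missing values is an assumption about the file, not something the description states.

```python
import csv
import io
from collections import Counter
from typing import Optional

# Made-up rows that follow the column layout of meteorites.csv.
SAMPLE_CSV = """name,id,name_type,class,mass,fall,year,lat,long,geolocation
Example A,1,Valid,L5,21.0,Fell,1880,50.0,6.0,"(50.0, 6.0)"
Example B,2,Valid,H6,,Found,1951,,,
"""


def to_float(text: str) -> Optional[float]:
    """Parse a double column; blank cells become None (assumed missing-value convention)."""
    return float(text) if text else None


def to_int(text: str) -> Optional[int]:
    """Parse an integer column; blank cells become None."""
    return int(text) if text else None


def load_meteorites(handle):
    """Read meteorite rows and apply the types listed in the data dictionary."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "name": row["name"],
            "id": to_float(row["id"]),
            "name_type": row["name_type"],
            "class": row["class"],
            "mass": to_float(row["mass"]),   # grams
            "fall": row["fall"],             # "Fell" or "Found"
            "year": to_int(row["year"]),
            "lat": to_float(row["lat"]),
            "long": to_float(row["long"]),
            "geolocation": row["geolocation"],
        })
    return records


records = load_meteorites(io.StringIO(SAMPLE_CSV))
fall_counts = Counter(record["fall"] for record in records)
print(fall_counts)
```

    To read the real file, pass `open("meteorites.csv", newline="", encoding="utf-8")` instead of the in-memory sample.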

    @misc{tidytuesday, title = {Tidy Tuesday: A weekly social data project}, author = {R4DS Online Learning Community}, url = {https://github.com/rfordatascience/tidytuesday}, year = {2023} }



Broadway Weekly Grosses

#TidyTuesday Week 18
