100+ datasets found
  1. Friends - R Package Dataset

    • kaggle.com
    zip
    Updated Nov 11, 2024
    Cite
    Lucas Yukio Imafuko (2024). Friends - R Package Dataset [Dataset]. https://www.kaggle.com/datasets/lucasyukioimafuko/friends-r-package-dataset
    Explore at:
    Available download formats: zip (2018791 bytes)
    Dataset updated
    Nov 11, 2024
    Authors
    Lucas Yukio Imafuko
    Description

    The whole data and source can be found at https://emilhvitfeldt.github.io/friends/

    "The goal of friends [is] to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of json files."

    Content

    • friends.csv - Contains the scenes and lines for each character, including season and episodes.
    • friends_emotions.csv - Contains sentiments for each scene - for the first four seasons only.
    • friends_info.csv - Contains information regarding each episode, such as imdb_rating, views, episode title and directors.

    Uses

    • Text mining, sentiment analysis and word statistics.
    • Data visualizations.
  2. Political Analysis Using R: Example Code and Data, Plus Data for Practice...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 28, 2020
    Cite
    Jamie Monogan (2020). Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems [Dataset]. http://doi.org/10.7910/DVN/ARKOTI
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 28, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Jamie Monogan
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.

  3. Reddit Data Science Posts (500k+)

    • kaggle.com
    zip
    Updated May 8, 2022
    Cite
    Maksym Shkliarevskyi (2022). Reddit Data Science Posts (500k+) [Dataset]. https://www.kaggle.com/maksymshkliarevskyi/reddit-data-science-posts
    Explore at:
    Available download formats: zip (119034997 bytes)
    Dataset updated
    May 8, 2022
    Authors
    Maksym Shkliarevskyi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Reddit Data Science Posts

    The Data Science community on Reddit is growing every year. Today, the network is a platform for many professionals and enthusiasts who share valuable materials and experiences. An interesting task is the analysis of posts dedicated to Data Science:

    • finding interesting topics,
    • studying changes in trends over time,
    • predicting the potential popularity of posts on Reddit from their title and text, etc.

    Over time, I will increase the size of this dataset by adding posts from other subreddits, so that the quality of the analysis and modeling will improve.

    Feel free to leave your comments on this dataset and starter notebook. I will try to make the dataset better and much larger. The current goal is 750k + posts (spoiler: after that, there will be a million!)

    This dataset includes over 500,000 posts from 19 Data Science subreddits:

    r/analytics, r/deeplearning, r/datascience, r/datasets, r/kaggle, r/learnmachinelearning, r/MachineLearning, r/statistics, r/artificial, r/AskStatistics, r/computerscience, r/computervision, r/dataanalysis, r/dataengineering, r/DataScienceJobs, r/datascienceproject, r/data, r/MLQuestions, r/rstats

    Data were collected from pushshift.io API (maintained by Jason Baumgartner).

    If you know of any interesting Data Science subreddits, please, let me know in discussions.

    The 19 datasets (one per subreddit) include the following data:

    column                  description
    #                       row index
    created_date            post publication date
    created_timestamp       post publication timestamp
    subreddit               subreddit name
    title                   post title
    id                      unique operation id
    author                  post author
    author_created_utc      author registration date
    full_link               hyperlink to post
    score                   ratio of likes and dislikes
    num_comments            the number of comments
    num_crossposts          the number of crossposts
    subreddit_subscribers   the number of subreddit subscribers at the time the post was published
    post                    post text
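    The column list above can be checked when loading one of the per-subreddit files. The following is a minimal Python sketch, not the author's starter notebook: the function name load_posts and the sample row are invented for illustration, the header of the row-index column is unknown and therefore not checked, and it assumes the files are comma-separated with the column names exactly as listed.

```python
import csv
import io

# Column names as listed in the dataset description (row-index column omitted,
# because its header in the real files is not stated).
EXPECTED_COLUMNS = [
    "created_date", "created_timestamp", "subreddit", "title", "id", "author",
    "author_created_utc", "full_link", "score", "num_comments",
    "num_crossposts", "subreddit_subscribers", "post",
]


def load_posts(handle):
    """Read posts from an open CSV handle and convert the count columns to int."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    posts = []
    for row in reader:
        for key in ("num_comments", "num_crossposts"):
            # Treat empty cells as zero; this is an assumption, not a documented rule.
            row[key] = int(float(row[key])) if row[key] else 0
        posts.append(row)
    return posts


# Hypothetical one-row sample standing in for a real per-subreddit file.
sample = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "2022-01-01,1640995200,rstats,Hello,abc123,someone,1500000000,"
    "https://example.invalid/post,5,2,0,1000,Body text\n"
)
posts = load_posts(sample)
print(posts[0]["subreddit"], posts[0]["num_comments"])
```

    With a real file, the same function could be called on a handle opened with open(path, newline="", encoding="utf-8").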
  4. R codes and dataset for Visualisation of Diachronic Constructional Change...

    • researchdata.edu.au
    • bridges.monash.edu
    Updated Apr 1, 2019
    Cite
    Gede Primahadi Wijaya Rajeg (2019). R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart [Dataset]. http://doi.org/10.26180/5c844c7a81768
    Explore at:
    Dataset updated
    Apr 1, 2019
    Dataset provided by
    Monash University
    Authors
    Gede Primahadi Wijaya Rajeg
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    Publication


    Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387

    Description of R codes and data files in the repository

    This repository is imported from its GitHub repo. Versioning of this figshare repository is associated with the GitHub repo's Release. So, check the Releases page for updates (the next version is to include the unified version of the codes in the first release with the tidyverse).

    The raw input data consists of two files (i.e. will_INF.txt and go_INF.txt). They represent the co-occurrence frequency of top-200 infinitival collocates for will and be going to respectively across the twenty decades of Corpus of Historical American English (from the 1810s to the 2000s).

    These two input files are used in the R code file 1-script-create-input-data-raw.r. The codes preprocess and combine the two files into a long format data frame consisting of the following columns: (i) decade, (ii) coll (for "collocate"), (iii) BE going to (for frequency of the collocates with be going to) and (iv) will (for frequency of the collocates with will); it is available in the input_data_raw.txt.

    Then, the script 2-script-create-motion-chart-input-data.R processes the input_data_raw.txt for normalising the co-occurrence frequency of the collocates per million words (the COHA size and normalising base frequency are available in coha_size.txt). The output from the second script is input_data_futurate.txt.
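    The normalisation step described above can be sketched in a few lines. This Python example is only an illustration of per-million-words scaling, not a port of the repository's R scripts: the function names, the sample row, and the decade size are invented, and the real corpus sizes would come from coha_size.txt.

```python
def per_million(raw_count, corpus_size):
    """Scale a raw co-occurrence count to a frequency per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


def normalise(rows, decade_sizes):
    """Return rows in the same long format, with both frequencies normalised."""
    out = []
    for row in rows:
        size = decade_sizes[row["decade"]]
        out.append({
            "decade": row["decade"],
            "coll": row["coll"],
            "BE going to": per_million(row["BE going to"], size),
            "will": per_million(row["will"], size),
        })
    return out


# Hypothetical input row and decade size, for illustration only.
rows = [{"decade": "1810s", "coll": "be", "BE going to": 3, "will": 120}]
decade_sizes = {"1810s": 1_200_000}
normalised = normalise(rows, decade_sizes)
print(normalised[0])
```

    Normalising by decade size makes frequencies comparable across decades whose corpus sections differ in size.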

    Next, input_data_futurate.txt contains the relevant input data for generating (i) the static motion chart as an image plot in the publication (using the script 3-script-create-motion-chart-plot.R), and (ii) the dynamic motion chart (using the script 4-script-motion-chart-dynamic.R).

    The repository adopts the project-oriented workflow in RStudio; double-click on the Future Constructions.Rproj file to open an RStudio session whose working directory is associated with the contents of this repository.

  5. Bank Loan Approval - LR, DT, RF and AUC

    • kaggle.com
    zip
    Updated Nov 7, 2023
    Cite
    vikram amin (2023). Bank Loan Approval - LR, DT, RF and AUC [Dataset]. https://www.kaggle.com/datasets/vikramamin/bank-loan-approval-lr-dt-rf-and-auc
    Explore at:
    Available download formats: zip (61437 bytes)
    Dataset updated
    Nov 7, 2023
    Authors
    vikram amin
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description
    • DATASET: The dependent variable is 'Personal.Loan'; 0 indicates that the loan was not approved and 1 indicates that it was approved.
    • OBJECTIVE: We will do Exploratory Data Analysis and use Logistic Regression, Decision Tree and Random Forest, comparing the models by AUC to find out which is best. Steps:
    • Set the working directory and read the data
    • Check the data types of all the variables
    • DATA CLEANING
    • We need to change the data types of certain variables to factor
    • Check for missing data and duplicate records, and remove insignificant variables
    • A new data frame called 'bank1' is created after dropping the 'ID' column.
    • EXPLORATORY DATA ANALYSIS
    • We will try to get some insights by digging into the data through bar charts and box plots, which can help the bank management in decision making
    • Load the required libraries
    • Out of the total 5000 customers, 4520 have not been approved for a loan while 480 have been
    • THIS INDICATES THAT INCOME IS HIGHER WHEN THERE ARE FEWER FAMILY MEMBERS
    • THIS INDICATES PERSONAL LOAN HAS BEEN APPROVED FOR CUSTOMERS HAVING HIGHER INCOME
    • THIS INDICATES THAT THE INCOME IS PRETTY SIMILAR FOR CUSTOMERS OWNING AND NOT OWNING A CREDIT CARD
    • CUSTOMERS BELONGING TO THE RICH CLASS (INCOME GROUP : 150-200) HAVE THE HIGHEST MORTGAGE
    • CC AVG IS PRETTY SIMILAR FOR THOSE WHO OPTED FOR ONLINE SERVICES AND THOSE WHO DID NOT
    • MORE EDUCATED CUSTOMERS HAVE A HIGHER CREDIT AVERAGE
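    Because only 480 of the 5000 customers were approved, accuracy alone would flatter a model that always predicts "not approved", which is one reason to compare models by AUC. The sketch below is a plain-Python illustration of how AUC can be computed from predicted scores, not the author's R code: the function name auc and the toy labels and scores are invented for the example.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = loan approved, 0 = not approved; scores are hypothetical
# predicted probabilities from some fitted model.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

    The pairwise loop is quadratic in the number of cases, which is fine for a 5000-row dataset but would usually be replaced by a rank-based computation on larger data.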
  6. R package WallomicsData

    • entrepot.recherche.data.gouv.fr
    Updated May 15, 2023
    Cite
    Harold Durufle (2023). R package WallomicsData [Dataset]. http://doi.org/10.57745/3J1XPO
    Explore at:
    Dataset updated
    May 15, 2023
    Dataset provided by
    Recherche Data Gouv
    Authors
    Harold Durufle
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    Datasets from the WallOmics project. Contains phenomics, metabolomics, proteomics and transcriptomics data collected from two organs of five ecotypes of the model plant Arabidopsis thaliana exposed to two temperature growth conditions. Exploratory and integrative analyses of these data are presented in Durufle et al (2020) (doi:10.1093/bib/bbaa166) and Durufle et al (2020) (doi:10.3390/cells9102249).

  7. R-Factor for the Island of Kauai

    • catalog.data.gov
    • gimi9.com
    • +2 more
    Updated Oct 31, 2024
    + more versions
    Cite
    NOAA Office for Coastal Management (Point of Contact, Custodian) (2024). R-Factor for the Island of Kauai [Dataset]. https://catalog.data.gov/dataset/r-factor-for-the-island-of-kauai1
    Explore at:
    Dataset updated
    Oct 31, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Area covered
    Kauai
    Description

    The rainfall-runoff erosivity factor (R-Factor) quantifies the effects of raindrop impacts and reflects the amount and rate of runoff associated with the rain. The R-Factor is one of the parameters used by the Revised Universal Soil Loss Equation (RUSLE) to estimate annual rates of erosion. This product is a raster representation of R-Factor derived from isoerodent maps published in the Agriculture Handbook Number 703 (Renard et al., 1997). Lines connecting points of equal rainfall erosivity are called isoerodents. The isoerodents plotted on a map of the Island of Kauai were digitized, then values between these lines were obtained by linear interpolation. The final R-Factor data are in raster GeoTiff format at 30 meter resolution in UTM, Zone 4, GRS80, NAD83.
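    The interpolation step can be illustrated with a one-dimensional sketch. The description only says that values between isoerodents were obtained by linear interpolation, so the Python function below is an assumption about the general idea, not the procedure actually used to build the raster: the function name, the distances, and the R values are invented.

```python
def interpolate_r(distance_to_low, distance_to_high, r_low, r_high):
    """Linearly interpolate an R-Factor value for a cell between two isoerodents.

    distance_to_low / distance_to_high are the cell's distances to the
    isoerodents carrying the values r_low and r_high respectively.
    """
    total = distance_to_low + distance_to_high
    if total <= 0:
        raise ValueError("the two isoerodents must be a positive distance apart")
    weight = distance_to_low / total  # 0 on the low line, 1 on the high line
    return r_low + weight * (r_high - r_low)


# Hypothetical cell 30 m from the R = 200 line and 90 m from the R = 300 line.
cell_value = interpolate_r(30.0, 90.0, 200.0, 300.0)
print(cell_value)
```

    A cell lying exactly on an isoerodent takes that line's value, and the value changes at a constant rate along the path between the two lines.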

  8. Scripts to run R-QWTREND models and produce results.

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Nov 20, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Scripts to run R-QWTREND models and produce results. [Dataset]. https://catalog.data.gov/dataset/scripts-to-run-r-qwtrend-models-and-produce-results
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This child page contains a zipped folder which contains all files necessary to run trend models and produce results published in U.S. Geological Survey Scientific Investigations Report 2020–5079 [Nustad, R.A., and Vecchia, A.V., 2020, Water-quality trends for selected sites and constituents in the international Red River of the North Basin, Minnesota and North Dakota, United States, and Manitoba, Canada, 1970–2017: U.S. Geological Survey Scientific Investigations Report 2020–5079, 75 p., https://doi.org/10.3133/sir20205079]. The folder contains: six files required to run the R–QWTREND trend analysis tool; a readme.txt file; an alldata.RData file; a siteinfo_appendix.txt file; and a folder called "scripts". R–QWTREND is a software package for analyzing trends in stream-water quality. The package is a collection of functions written in R (R Development Core Team, 2019), an open source language and a general environment for statistical computing and graphics.

    The following system requirements are necessary for using R–QWTREND:

    • Windows 10 operating system
    • R (version 3.4 or later; 64 bit recommended)
    • RStudio (version 1.1.456 or later)

    An accompanying report (Vecchia and Nustad, 2020) serves as the formal documentation for R–QWTREND.

    Vecchia, A.V., and Nustad, R.A., 2020, Time-series model, statistical methods, and software documentation for R–QWTREND—An R package for analyzing trends in stream-water quality: U.S. Geological Survey Open-File Report 2020–1014, 51 p., https://doi.org/10.3133/ofr20201014

    R Development Core Team, 2019, R—A language and environment for statistical computing: Vienna, Austria, R Foundation for Statistical Computing, accessed June 12, 2019, at https://www.r-project.org.

  9. Data_Sheet_1_NeuroDecodeR: a package for neural decoding in R.docx

    • frontiersin.figshare.com
    docx
    Updated Jan 3, 2024
    Cite
    Ethan M. Meyers (2024). Data_Sheet_1_NeuroDecodeR: a package for neural decoding in R.docx [Dataset]. http://doi.org/10.3389/fninf.2023.1275903.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jan 3, 2024
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Ethan M. Meyers
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Neural decoding is a powerful method to analyze neural activity. However, the code needed to run a decoding analysis can be complex, which can present a barrier to using the method. In this paper we introduce a package that makes it easy to perform decoding analyses in the R programming language. We describe how the package is designed in a modular fashion which allows researchers to easily implement a range of different analyses. We also discuss how to format data to be able to use the package, and we give two examples of how to use the package to analyze real data. We believe that this package, combined with the rich data analysis ecosystem in R, will make it significantly easier for researchers to create reproducible decoding analyses, which should help increase the pace of neuroscience discoveries.

  10. R+s_3k Dataset

    • universe.roboflow.com
    zip
    Updated Dec 23, 2023
    Cite
    Masterarbeit (2023). R+s_3k Dataset [Dataset]. https://universe.roboflow.com/masterarbeit-d3frz/r-s_3k
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 23, 2023
    Dataset authored and provided by
    Masterarbeit
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    CAR Bounding Boxes
    Description

    R+S_3k

    ## Overview
    
    R+S_3k is a dataset for object detection tasks - it contains CAR annotations for 4,000 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  11. Electric Vehicle Charging Impacts of R-Energy

    • kaggle.com
    Updated Apr 27, 2024
    + more versions
    Cite
    Afroz (2024). Electric Vehicle Charging Impacts of R-Energy [Dataset]. https://www.kaggle.com/datasets/pythonafroz/electric-vehicle-charging-impacts-of-r-energy
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 27, 2024
    Dataset provided by
    Kaggle
    Authors
    Afroz
    Description

    California policy is incentivizing rapid adoption of zero emission electric vehicles for light duty and freight applications. In this project, we explored how locating charging facilities at California's highway rest stops, might impact electricity demand, grid operation, and integration of renewables like solar and wind into California's energy mix. Assuming a growing population of electric vehicles to meet state goals, we estimated state-wide growth of electricity demand, and identified the most attractive rest stop locations for siting chargers. Using a California-specific electricity dispatch model developed at ITS, we estimated how charging vehicles at these stations would impact renewable energy curtailment in California. We estimated the impacts of charging infrastructures on California's electricity system and how they can be utilized to decrease the duck curve effect resulting from a large amount of solar energy penetration by 2050.

    https://zenodo.org/records/4941504

  12. Data from: Working with a linguistic corpus using R: An introductory note...

    • researchdata.edu.au
    • bridges.monash.edu
    Updated May 5, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gede Primahadi Wijaya Rajeg; I Made Rajeg; Karlina Denistia (2022). Working with a linguistic corpus using R: An introductory note with Indonesian Negating Construction [Dataset]. http://doi.org/10.4225/03/5a7ee2ac84303
    Explore at:
    Dataset updated
    May 5, 2022
    Dataset provided by
    Monash University
    Authors
    Gede Primahadi Wijaya Rajeg; I Made Rajeg; Karlina Denistia
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    This is a repository for codes and datasets for the open-access paper in Linguistik Indonesia, the flagship journal for the Linguistic Society of Indonesia (Masyarakat Linguistik Indonesia [MLI]) (cf. the link in the references below).


    To cite the paper (in APA 6th style):

    Rajeg, G. P. W., Denistia, K., & Rajeg, I. M. (2018). Working with a linguistic corpus using R: An introductory note with Indonesian negating construction. Linguistik Indonesia, 36(1), 1–36. doi: 10.26499/li.v36i1.71


    To cite this repository:
    Click the Cite button (dark pink, top left) and select the citation style from the dropdown button (the default style is the Datacite option, on the right-hand side).

    This repository consists of the following files:
    1. Source R Markdown Notebook (.Rmd file) used to write the paper and containing the R codes to generate the analyses in the paper.
    2. Tutorial to download the Leipzig Corpus file used in the paper. It is freely available on the Leipzig Corpora Collection Download page.
    3. Accompanying datasets as images and .rds format so that all code-chunks in the R Markdown file can be run.
    4. BibLaTeX and .csl files for the referencing and bibliography (with APA 6th style).
    5. A snippet of the R session info after running all codes in the R Markdown file.
    6. RStudio project file (.Rproj). Double click on this file to open an RStudio session associated with the content of this repository. See here and here for details on Project-based workflow in RStudio.
    7. A .docx template file following the basic stylesheet for Linguistik Indonesia

    Put all these files in the same folder (including the downloaded Leipzig corpus file)!

    To render the R Markdown file into an MS Word document, we use the bookdown R package (Xie, 2018). Make sure this package is installed in R.

    Yihui Xie (2018). bookdown: Authoring Books and Technical Documents with R Markdown. R package version 0.6.


  13. GOES-R PLT Mission Reports V1 - Dataset - NASA Open Data Portal

    • data.nasa.gov
    + more versions
    Cite
    nasa.gov, GOES-R PLT Mission Reports V1 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/goes-r-plt-mission-reports-v1
    Explore at:
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The GOES-R PLT Mission Reports dataset consists of various reports filed by the scientists during the GOES-R Post Launch Test (PLT) field campaign including flight reports, weather forecasts, mission scientist reports, and plan-of-day reports. The campaign took place from March to May of 2017 in support of post-launch L1B and L2+ product validation of the Advanced Baseline Imager (ABI) and the Geostationary Lightning Mapper (GLM). The GOES-R PLT Mission Reports dataset contains reports from March 13 through May 17, 2017 in PDF, PNG, Microsoft Excel and Word (.xlsx and .docx) format, and KMZ format for display in Google Earth.

  14. Geographically Weighted Regression in R

    • kaggle.com
    zip
    Updated Jul 1, 2023
    Cite
    Fajar Krisna (2023). Geographically Weighted Regression in R [Dataset]. https://www.kaggle.com/datasets/fajarkrisnajaya/geographically-weighted-regression-in-r
    Explore at:
    Available download formats: zip (26311 bytes)
    Dataset updated
    Jul 1, 2023
    Authors
    Fajar Krisna
    Description

    Implementation of GWR in R

    This repository contains code and files related to my project on Geographically Weighted Regression (GWR) in R. The dataset is from Badan Pusat Statistik.

    Files

    Dataset.xlsx: This file contains the dataset used in the analysis.
    
    GWLR.R: This script implements Geographically Weighted Logistic Regression in R.
    
    GWPR.r: This script implements Geographically Weighted Poisson Regression in R.
    
    GWR.R: This script implements Geographically Weighted Regression in R.
    

    License

    This project is licensed under the MIT License. See the LICENSE file for more details.

    Contact

    If you have any questions or suggestions, feel free to contact me.

  15. Data and R code used in: Plant geographic distribution influences chemical...

    • repository.soilwise-he.eu
    • search.dataone.org
    • +1 more
    Updated Jan 22, 2024
    Cite
    (2024). Data and R code used in: Plant geographic distribution influences chemical defenses in native and introduced Plantago lanceolata populations [Dataset]. http://doi.org/10.5061/dryad.5dv41nsd1
    Explore at:
    Dataset updated
    Jan 22, 2024
    Description

    Open Access

    # Data and R code used in: Plant geographic distribution influences chemical defenses in native and introduced Plantago lanceolata populations

    ## Description of the data and file structure

    * 00_ReadMe_DescriptonVariables.csv: A list describing the variables in each file used.
    * 00_Metadata_Coordinates.csv: A dataset with the coordinates of each Plantago lanceolata population used.
    * 00_Metadata_Climate.csv: A dataset with coordinates, bioclimatic parameters, and the results of the PCA. The dataset was created with the script '1_Environmental variables.qmd'.
    * 00_Metadata_Individuals.csv: A dataset with general information about each plant individual. Information about root traits and chemistry is missing for four samples because those samples were lost.
    * 01_Datset_PlantTraits.csv: Size-related and resource allocation traits measured in Plantago lanceolata, plus herbivore damage.
    * 02_Dataset_TargetedCompounds.csv: Quantification of phytohormones, iridoid glycosides, verbascoside and flavonoids in the leaves and roots of Plantago lanceolata. Data generated from HPLC.
    * 03_Dataset_Volatiles_Area.csv: Area of identified volatile compounds. Data generated from GC-FID.
    * 03_Dataset_Volatiles_Compounds.csv: Information on identified volatile compounds. Data generated from GC-MS.
    * 04_Dataset_Metabolome_Negative_Metadata.txt: Metadata for files in negative mode.
    * 04_Dataset_Metabolome_Negative_Intensity.xlsx: File with the intensity of the metabolite features in negative mode. The file was generated from Metaboscape and adapted as required for the Notame package.
    * 04_Dataset_Metabolome_Negative_Intensity_filtered.xlsx: File generated after preprocessing of features in negative mode. During preprocessing with the Notame package, 0 values were converted to NA.
    * 04_Dataset_Metabolome_Negative.msmsonly.csv: File with the intensity of the metabolite features in negative mode that have MS/MS data. File generated from Metaboscape.
    * 04_Results_Metabolome_Negative_canopus_compound_summary.tsv: Feature classification. Results generated from Sirius software.
    * 04_Results_Metabolome_Negative_compound_identifications.tsv: Feature identification. Results generated from Sirius software.
    * 05_Dataset_Metabolome_Positive_Metadata.txt: Metadata for files in positive mode.
    * 05_DatasetMetabolome_Positive_Intensity.xlsx: File with the intensity of the metabolite features in positive mode. File generated from Metaboscape and adapted as required for the Notame package.
    * 05_Dataset_Metabolome_Positive_Intensity_filtered: File generated after preprocessing of features in positive mode. During preprocessing with the Notame package, 0 values were converted to NA.

    ## Code/Software

    * 1_Environmental vairables.qmd: R script that retrieves bioclimatic variables based on the coordinates of each population and then performs a principal components analysis to reduce the axes of variation. The first principal component was included as an explanatory variable in the model to estimate trait differences between native and introduced populations. Figures 1b and 1d.
    * 2_PlantTraits_and_Herbivory: R script for statistical analysis of size-related traits, resource allocation traits and herbivore damage. Figure 2. It needs to source: Model_1_Fucntion.R, Model_2_Fucntion.R, Plot_Function.R.
    * 3_Metabolome: R script for statistical analysis of the Plantago lanceolata metabolome. Figure 3. It needs to source: Metabolome_preprocessing_R, Model_1_Fucntion.R, Model_2_Fucntion.R, Plot_Function.R.
    * 4_TargetedCompounds: R script for statistical analysis of Plantago lanceolata targeted compounds. Figure 4. It needs to source: Model_1_Fucntion.R, Model_2_Fucntion.R, Plot_Function.R.
    * 5_Volatilome: R script for statistical analysis of Plantago lanceolata metabolome. Figure 5. It needs to source: Model_1_Fucntion.R, Model_2_Fucntion.R, Plot_Function.R.
    * Model_1_Function.R: Function to run statistical models.
    * Model_2_Function.R: Function to run statistical models.
    * Plots_Function.R: Function to plot graphs.
    * Metabolome_prepocessing.R: Script to preprocess features.
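
The file descriptions above note that zeros were converted to NA during preprocessing of the metabolite features. A minimal sketch of that kind of conversion follows, written in Python for illustration rather than taken from the authors' R scripts. The function name `zeros_to_missing` and the intensity values are hypothetical, and `None` stands in for R's `NA`.

```python
def zeros_to_missing(rows):
    """Return a copy of `rows` with every numeric 0 replaced by None.

    `rows` is a list of samples; each sample is a list of feature
    intensities. None plays the role of NA (an assumption of this sketch).
    """
    return [[None if value == 0 else value for value in row] for row in rows]


# Hypothetical intensity table: 3 samples (rows) x 3 features (columns).
intensities = [
    [1520.0, 0, 310.5],
    [0, 0, 98.2],
    [2044.1, 15.0, 0],
]

cleaned = zeros_to_missing(intensities)

# Count, per feature (column), how many samples have a recorded value.
detected = [sum(v is not None for v in column) for column in zip(*cleaned)]

print(cleaned)
print(detected)
```

Marking zeros as missing keeps a feature that was simply not detected from being averaged in as a true intensity of 0, which is presumably why the filtered files store them as NA.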

  16. PyX-R

    • huggingface.co
    Updated Nov 11, 2024
    + more versions
    Cite
    SemCoder (2024). PyX-R [Dataset]. https://huggingface.co/datasets/semcoder/PyX-R
    Format: Croissant, a format for machine-learning datasets (learn more at mlcommons.org/croissant).
    Dataset updated
    Nov 11, 2024
    Dataset authored and provided by
    SemCoder
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    semcoder/PyX-R dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. AGD-R (Analysis of Genetic Designs with R for Windows) Version 5.0

    • data.moa.gov.et
    html
    Updated Jan 20, 2025
    Cite
    CIMMYT Ethiopia (2025). AGD-R (Analysis of Genetic Designs with R for Windows) Version 5.0 [Dataset]. https://data.moa.gov.et/dataset/hdl-11529-10202
    Available download formats: html
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    CIMMYT Ethiopia
    Description

    A major objective of biometrical genetics is to explore the nature of gene action in determining quantitative traits. This includes determining the number of major genetic factors or genes responsible for the traits. Diallel mating designs were developed for genetic experiments that help assess variability in observed quantitative traits arising from genetic factors, environmental factors, and their interactions. Examples include North Carolina designs, Line by Tester designs, and Diallel designs. AGD-R is a set of R programs that performs statistical analyses for Diallel, Line by Tester, and North Carolina designs. AGD-R includes a graphical Java interface that helps the user choose input files, the analysis to run, and the variables to analyze.

  18. Dxl_ver2_phase3.r Dataset

    • universe.roboflow.com
    zip
    Updated Feb 21, 2023
    Cite
    fpt (2023). Dxl_ver2_phase3.r Dataset [Dataset]. https://universe.roboflow.com/fpt-jtshy/dxl_ver2_phase3.r-nhzjl
    Available download formats: zip
    Dataset updated
    Feb 21, 2023
    Dataset authored and provided by
    fpt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Food And Drink 3 Bounding Boxes
    Description

    DxL_ver2_phase3.r

    ## Overview
    
    DxL_ver2_phase3.r is a dataset for object detection tasks - it contains Food And Drink 3 annotations for 6,206 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  19. GOES-R PLT Surface Radiance Ivanpah V1

    • earthdata.nasa.gov
    • datasets.ai
    • +5 more
    Updated Sep 6, 2019
    Cite
    GHRC_DAAC (2019). GOES-R PLT Surface Radiance Ivanpah V1 [Dataset]. http://doi.org/10.5067/GOESRPLT/RAD/DATA201
    Dataset updated
    Sep 6, 2019
    Dataset authored and provided by
    GHRC_DAAC
    Description

    The GOES-R PLT Field Campaign Ivanpah dataset consists of surface reflectance and total optical depth data measured at Ivanpah Playa, Nevada during the GOES-R Post Launch Test (PLT) field campaign. The atmospheric measurements were made using an Automated Solar Radiometer (ASR), which tracks the sun throughout the day. Surface reflectance measurements were made using an ASD portable spectroradiometer and a Spectralon reference panel. The GOES-R PLT field campaign took place from March to May of 2017 in support of post-launch L1b and L2+ product validation of the Advanced Baseline Imager (ABI) and the Geostationary Lightning Mapper (GLM). The main goal of this dataset is to provide an independent validation of the AVIRIS-NG airborne instrument calibration. Data files in Excel format and browse imagery files in JPEG and PNG formats are available only for March 23 and March 28, 2017.

  20. R Street Cross Street Data in Wilmington, CA

    • ownerly.com
    Updated Dec 5, 2021
    Cite
    Ownerly (2021). R Street Cross Street Data in Wilmington, CA [Dataset]. https://www.ownerly.com/ca/wilmington/r-st-home-details
    Dataset updated
    Dec 5, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Wilmington, East R Street, California
    Description

    This dataset provides information about the number of properties, residents, and average property values for R Street cross streets in Wilmington, CA.

Cite
Lucas Yukio Imafuko (2024). Friends - R Package Dataset [Dataset]. https://www.kaggle.com/datasets/lucasyukioimafuko/friends-r-package-dataset

Friends - R Package Dataset

The One with the Transcript

Available download formats: zip (2018791 bytes)
Dataset updated
Nov 11, 2024
Authors
Lucas Yukio Imafuko
Description

The whole data and source can be found at https://emilhvitfeldt.github.io/friends/

"The goal of friends to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of json files."

Content

  • friends.csv - Contains the scenes and lines for each character, including season and episodes.
  • friends_emotions.csv - Contains sentiments for each scene - for the first four seasons only.
  • friends_info.csv - Contains information regarding each episode, such as imdb_rating, views, episode title and directors.

Uses

  • Text mining, sentiment analysis and word statistics.
  • Data visualizations.
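
As a minimal sketch of the word-statistics use case, the snippet below counts words per speaker with the Python standard library. The column names `speaker` and `text` are assumptions about the layout of friends.csv, not confirmed by this listing. The inline sample rows are made-up placeholders (not lines from the show) so the example runs without downloading the dataset.

```python
import csv
import io
import re
from collections import Counter

# Placeholder rows standing in for friends.csv; the header is an assumption.
SAMPLE_CSV = """speaker,season,episode,text
Speaker A,1,1,Hello there hello
Speaker B,1,1,Hello again
Speaker A,1,2,There we go again
"""


def word_counts_by_speaker(csv_file):
    """Return {speaker: Counter of lowercase words} for an open CSV file."""
    counts = {}
    for row in csv.DictReader(csv_file):
        # Lowercase the line and keep runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", row["text"].lower())
        counts.setdefault(row["speaker"], Counter()).update(words)
    return counts


counts = word_counts_by_speaker(io.StringIO(SAMPLE_CSV))
for speaker, counter in counts.items():
    print(speaker, counter.most_common(2))
```

To run this against the real file, pass `open("friends.csv", newline="", encoding="utf-8")` instead of the `io.StringIO` sample, after checking the actual column names in the header row.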