6 datasets found
  1. Flights data

    • kaggle.com
    zip
    Updated Dec 5, 2023
    Cite
    Eugeniy Osetrov (2023). Flights data [Dataset]. https://www.kaggle.com/datasets/eugeniyosetrov/flights-data
    Available download formats: zip (1498790 bytes)
    Dataset updated
    Dec 5, 2023
    Authors
    Eugeniy Osetrov
    Description

    On-time data for a random sample of flights that departed NYC (i.e. JFK, LGA or EWR) in 2013.

    year, month, day: Date of departure.

    dep_time, arr_time: Departure and arrival times, local tz.

    dep_delay, arr_delay: Departure and arrival delays, in minutes. Negative values represent early departures/arrivals.

    hour, minute: Time of departure broken into hour and minutes.

    carrier: Two-letter carrier abbreviation. See airlines in the nycflights13 package for more information, or google the airline code.

    tailnum: Plane tail number.

    flight: Flight number.

    origin, dest: Origin and destination. See airports in the nycflights13 package for more information, or google the airport code.

    air_time: Amount of time spent in the air.

    distance: Distance flown.

    Source: Hadley Wickham (2014). nycflights13: Data about flights departing NYC in 2013. R package version 0.1.

    Formats: CSV file, tab-delimited text file.

    Format: A tbl_df with 32,735 rows and 16 variables.
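As a rough illustration of how the delay columns described above might be read, the following Python sketch parses a CSV with the listed column names and summarizes `dep_delay`. The sample rows are invented for the example and are not taken from the dataset; the column order and the treatment of empty or `NA` cells as missing values are assumptions about the file layout.

```python
import csv
import io

# Invented rows for illustration only; they are NOT taken from the dataset.
# Column names follow the variable list above; their order is an assumption.
SAMPLE_CSV = """year,month,day,dep_time,dep_delay,arr_time,arr_delay,carrier,tailnum,flight,origin,dest,air_time,distance,hour,minute
2013,1,1,517,2,830,11,UA,N00000,1,EWR,IAH,227,1400,5,17
2013,1,1,554,-6,812,-25,DL,N00001,2,LGA,ATL,116,762,5,54
2013,1,2,NA,NA,NA,NA,AA,N00002,3,JFK,MIA,NA,1089,NA,NA
"""


def summarize_delays(csv_text):
    """Return (mean departure delay, number of early departures).

    Rows whose dep_delay is empty or "NA" are skipped (assumed to mean missing).
    """
    delays = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row["dep_delay"] or "").strip()
        if value and value != "NA":
            delays.append(float(value))
    mean_delay = sum(delays) / len(delays) if delays else None
    # Negative delays represent early departures.
    early = sum(1 for d in delays if d < 0)
    return mean_delay, early


mean_delay, early = summarize_delays(SAMPLE_CSV)
print(mean_delay, early)
```

Reading by column name with `csv.DictReader` keeps the sketch independent of the actual column order in the downloaded file.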

    Photo by Phil Mosley on Unsplash

  2. UAD Appraisal-Level Public Use File

    • catalog.data.gov
    • gimi9.com
    Updated Feb 10, 2025
    Cite
    Federal Housing Finance Agency (2025). UAD Appraisal-Level Public Use File [Dataset]. https://catalog.data.gov/dataset/uad-appraisal-level-public-use-file
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    Federal Housing Finance Agency (https://www.fhfa.gov/)
    Description

    The Uniform Appraisal Dataset (UAD) Appraisal-Level Public Use File (PUF) is the nation’s first publicly available appraisal-level dataset, giving the public new access to a selected set of data fields found in appraisal reports. The UAD Appraisal-Level PUF is based on a five percent nationally representative random sample of appraisals for single-family mortgages acquired by the Enterprises (Fannie Mae and Freddie Mac). The current release includes appraisals from 2013 through 2021. The UAD Appraisal-Level PUF is a resource for users capable of using statistical software to extract and analyze data. Users can download annual or combined files in CSV, R, SAS and Stata formats. All files are zipped for ease of download.

  3. Data repository for Light-dark conditions drive variability in phosphorus...

    • figshare.com
    csv
    Updated Jul 2, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    David Pineda-Morante (2025). Data repository for Light-dark conditions drive variability in phosphorus and ammonium uptake by epilithic biofilms along the main stem of a Mediterranean river [Dataset]. http://doi.org/10.6084/m9.figshare.28741994.v5
    Available download formats: csv
    Dataset updated
    Jul 2, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    David Pineda-Morante
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data repository for: Light-dark conditions drive variability in phosphorus and ammonium uptake by epilithic biofilms along the main stem of a Mediterranean river

    This repository accompanies the manuscript submitted to Limnology and Oceanography and contains all datasets and R scripts used to perform the analyses described in the study.

    Contents of this repository:

    1. R code biofilm uptake.R: R script used to perform all statistical analyses described in the study, including GLMM fitting, model selection, and model averaging.
    2. Biofilm nutrient uptake.csv: CSV file containing nutrient uptake rates (SRP and NH₄⁺) for all incubations conducted under light and dark conditions.
    3. Model selection.csv: CSV file used to analyze spatial and temporal variability in physicochemical and biofilm structural variables, and to explore their influence on biofilm nutrient uptake using model selection.
    4. Variable legend.csv: CSV file listing and describing all variables in "Model_selection.csv", including full variable names, units, and acronyms.
    5. Light-Dark uptake ratio.csv: CSV file containing the light/dark uptake ratios for NH₄⁺ and SRP, averaged per site and sampling date.
    6. NH4-SRP uptake ratio.csv: CSV file containing the NH₄⁺:SRP molar uptake ratios calculated for incubations with dual nutrient additions.
    7. Random structure selection.csv: CSV file used to explore random-effect structures (Site, Time, Site + Time, Site × Time) using global models prior to model selection.

    Please cite this repository if you use these data or scripts in your own research.

  4. Data Sheet 1_Feasibility of predicting next-day fatigue levels using heart...

    • frontiersin.figshare.com
    csv
    Updated Nov 18, 2025
    Cite
    Nana Yaw Aboagye; Maria Germann; Kenneth F. Baker; Mark R. Baker; Silvia Del Din (2025). Data Sheet 1_Feasibility of predicting next-day fatigue levels using heart rate variability and activity-sleep metrics in people with post-COVID fatigue.csv [Dataset]. http://doi.org/10.3389/fdgth.2025.1689846.s001
    Available download formats: csv
    Dataset updated
    Nov 18, 2025
    Dataset provided by
    Frontiers
    Authors
    Nana Yaw Aboagye; Maria Germann; Kenneth F. Baker; Mark R. Baker; Silvia Del Din
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Post-COVID fatigue (pCF) represents a significant burden for many individuals following SARS-CoV-2 infection. The unpredictable nature of fatigue fluctuations impairs daily functioning and quality of life, creating challenges for effective symptom management.

    Objective

    This study investigated the feasibility of developing predictive models to forecast next-day fatigue levels in individuals with pCF, utilizing objective physiological and behavioral features derived from wearable device data.

    Methods

    We analyzed data from 68 participants with pCF who wore an Axivity AX6 device on their non-dominant wrist and a VitalPatch electrocardiogram (ECG) sensor on their chest for up to 21 days while completing fatigue questionnaires every other day. HRV features were extracted from the VitalPatch single-lead ECG signal using the NeuroKit Python package, while activity and sleep features were derived from the Axivity wrist-worn device using the GGIR package. Using a 5-fold cross-validation approach, we trained and evaluated the performance of two machine learning models, Random Forest and XGBoost, to predict next-day fatigue levels using Visual Analogue Scale (VAS) fatigue scores.

    Results

    Using five-fold cross-validation, XGBoost outperformed Random Forest in predicting next-day fatigue levels (mean R² = 0.79 ± 0.04 vs. 0.69 ± 0.02; MAE = 3.18 ± 0.63 vs. 6.14 ± 0.96). Predicted and observed fatigue scores were strongly correlated for both models (XGBoost: r = 0.89 ± 0.02; Random Forest: r = 0.86 ± 0.01). Key predictors included heart rate variability features (sample entropy, low-frequency power, and approximate entropy) along with demographic (age, sex) and activity-related (moderate and vigorous duration) factors. These findings underscore the importance of integrating physiological, demographic, and activity data for accurate fatigue prediction.

    Conclusions

    This study demonstrates the feasibility of combining heart rate variability with activity and sleep features to predict next-day fatigue levels in individuals with pCF. Integrating physiological and behavioral data shows promising predictive accuracy and provides insights that could inform future personalized fatigue management strategies.

  5. Supplement 1. R scripts and an example data set for conducting the power...

    • wiley.figshare.com
    html
    Updated Jun 8, 2023
    Cite
    Kathryn M. Miller; Brian R. Mitchell (2023). Supplement 1. R scripts and an example data set for conducting the power analysis simulations described in the main text. [Dataset]. http://doi.org/10.6084/m9.figshare.3563871.v1
    Available download formats: html
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    Wiley (https://www.wiley.com/)
    Authors
    Kathryn M. Miller; Brian R. Mitchell
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File List

        Miller_and_Mitchell_Power_Analysis_Code.r (md5: e0161858aaaeac3e81a2c755640b9feb)
        Tree_BA.csv (md5: e7e52c80094e578260cc7d4bade00208)

    Description

        Miller_and_Mitchell_Power_Analysis_Code.R: This script runs a bootstrap power analysis based on a mixed-effects model of sample data (plot is the random effect; time and site (park) are fixed effects). The simulation determines the power to detect a uniform percentage change per sampling cycle in the value of a metric, modeled as a linear trend in a mixed-effects model. The sample sizes tested by the script do not have to match the number of samples in the data file; any desired number of samples will be bootstrapped from the actual data.

        The script reports power for a uniform trend across all parks (model with no interaction), as well as power for a trend that occurs at only one park (model with an interaction effect, where the simulated effect occurs and power is tested for each park in turn).

        This script requires a comma-delimited (.csv) file with the following headings:

        ID (unique alphanumeric value for each row of data; does not need to be called "ID")
        Plot (text, not numeric only, e.g.: "ACAD1" not "1")
        Park (text)
        Year (year of sample, numeric)
        Metric (metric to be evaluated, numeric)

        The data in the file must have two measurements for each plot, with the initial measurement of all plots collected prior to any second measurements (separate data collection cycles).

        Tree_BA.csv: This is an example of the data sets used for the power analysis in this article. Data sets must be formatted as demonstrated in this data set for the simulation to work properly.
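The input requirements listed above could be checked before running the R script. The Python sketch below is one possible pre-flight check, not part of the supplement: the function name `check_power_input` is hypothetical, the sample rows are invented, and "collected prior to any second measurements" is interpreted here as every plot's first year being strictly earlier than every plot's second year.

```python
import csv
import io

# Required headings from the description; the ID column may have any name.
REQUIRED = ["Plot", "Park", "Year", "Metric"]


def check_power_input(csv_text):
    """Return a list of problems found; an empty list means all checks passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    years_by_plot = {}
    for n, row in enumerate(rows, start=2):  # line 1 is the header
        plot = (row["Plot"] or "").strip()
        if plot.isdigit():
            problems.append(f"line {n}: Plot must be text, not numeric only")
        try:
            year = float(row["Year"])
            float(row["Metric"])
        except (TypeError, ValueError):
            problems.append(f"line {n}: Year and Metric must be numeric")
            continue
        years_by_plot.setdefault(plot, []).append(year)

    # Each plot needs exactly two measurements.
    for plot, years in years_by_plot.items():
        if len(years) != 2:
            problems.append(f"plot {plot}: expected 2 measurements, found {len(years)}")

    # Assumed reading of "separate data collection cycles": all first
    # measurements precede all second measurements.
    pairs = [sorted(y) for y in years_by_plot.values() if len(y) == 2]
    if pairs and max(p[0] for p in pairs) >= min(p[1] for p in pairs):
        problems.append("first-cycle measurements overlap second-cycle measurements")
    return problems


# Invented example rows; only the "ACAD1" plot-name style comes from the description.
GOOD = """ID,Plot,Park,Year,Metric
1,ACAD1,ACAD,2007,20.5
2,ACAD2,ACAD,2008,18.0
3,ACAD1,ACAD,2011,21.0
4,ACAD2,ACAD,2012,19.5
"""
BAD = "ID,Plot,Park,Year,Metric\n1,1,ACAD,2007,20.5\n"

print(check_power_input(GOOD))
print(check_power_input(BAD))
```

The check returns messages rather than raising, so all problems in a file can be reported in one pass.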
    
  6. A CSV file containing both explanatory and response variables used in the...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    txt
    Updated Aug 15, 2024
    Cite
    Lucie Tamisier; Frédéric Fabre; Marion Szadkowski; Lola Chateau; Ghislaine Nemouchi; Grégory Girardot; Pauline Millot; Alain Palloix; Benoît Moury (2024). A CSV file containing both explanatory and response variables used in the GLMs. [Dataset]. http://doi.org/10.1371/journal.ppat.1012424.s007
    Available download formats: txt
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lucie Tamisier; Frédéric Fabre; Marion Szadkowski; Lola Chateau; Ghislaine Nemouchi; Grégory Girardot; Pauline Millot; Alain Palloix; Benoît Moury
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file is one of three input files for the R Markdown script analyzing the data (S1 File). (CSV)

