On-time data for a random sample of flights that departed NYC (i.e. JFK, LGA or EWR) in 2013.
year,month,day Date of departure.
dep_time,arr_time Departure and arrival times, local tz.
dep_delay,arr_delay Departure and arrival delays, in minutes. Negative times represent early departures/arrivals.
hour,minute Time of departure broken into hour and minutes.
carrier Two-letter carrier abbreviation. See airlines in the nycflights13 package for more information, or google the airline code.
tailnum Plane tail number.
flight Flight number.
origin,dest Origin and destination. See airports in the nycflights13 package for more information, or google the airport code.
air_time Amount of time spent in the air.
distance Distance flown.
Source Hadley Wickham (2014). nycflights13: Data about flights departing NYC in 2013. R package version 0.1.
Formats CSV file Tab-delimited text file
Format A tbl_df with 32,735 rows and 16 variables:
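The relationship between dep_time and the hour,minute pair, and the sign convention for delays, can be illustrated with a small sketch. It assumes that clock times are stored as HHMM-style integers (for example, 517 for 5:17), which the description above does not state. The helper names and the sample record are hypothetical and are not taken from the dataset.

```python
def split_hhmm(clock_value):
    """Split an HHMM-style integer such as 517 into (hour, minute).

    Assumption: times are encoded as hour * 100 + minute.
    """
    hour, minute = divmod(clock_value, 100)
    if not (0 <= hour <= 24 and 0 <= minute < 60):
        raise ValueError(f"not a valid HHMM value: {clock_value}")
    return hour, minute


def describe_delay(delay_minutes):
    """Turn a signed delay in minutes into words; negative means early."""
    if delay_minutes < 0:
        return f"{-delay_minutes} min early"
    if delay_minutes == 0:
        return "on time"
    return f"{delay_minutes} min late"


# Hypothetical record, invented for illustration only.
flight = {"dep_time": 517, "dep_delay": 2, "arr_delay": -11}

hour, minute = split_hhmm(flight["dep_time"])
print(f"departed at {hour:02d}:{minute:02d}, {describe_delay(flight['dep_delay'])}")
print(f"arrival: {describe_delay(flight['arr_delay'])}")
```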
The Uniform Appraisal Dataset (UAD) Appraisal-Level Public Use File (PUF) is the nation’s first publicly available appraisal-level dataset of appraisal records, giving the public new access to a selected set of data fields found in appraisal reports. The UAD Appraisal-Level PUF is based on a five percent nationally representative random sample of appraisals for single-family mortgages acquired by the Enterprises. The current release includes appraisals from 2013 through 2021. The UAD Appraisal-Level PUF is a resource for users capable of using statistical software to extract and analyze data. Users can download annual or combined files in CSV, R, SAS and Stata formats. All files are zipped for ease of download.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data repository for: Light-dark conditions drive variability in phosphorus and ammonium uptake by epilithic biofilms along the main stem of a Mediterranean river

This repository accompanies the manuscript submitted to Limnology and Oceanography and contains all datasets and R scripts used to perform the analyses described in the study.

📁 Contents of this repository:
1. R code biofilm uptake.R: R script used to perform all statistical analyses described in the study, including GLMM fitting, model selection, and model averaging.
2. Biofilm nutrient uptake.csv: CSV file containing nutrient uptake rates (SRP and NH₄⁺) for all incubations conducted under light and dark conditions.
3. Model selection.csv: CSV file used to analyze spatial and temporal variability in physicochemical and biofilm structural variables, and to explore their influence on biofilm nutrient uptake using model selection.
4. Variable legend.csv: CSV file listing and describing all variables in "Model_selection.csv", including full variable names, units, and acronyms.
5. Light-Dark uptake ratio.csv: CSV file containing the light/dark uptake ratios for NH₄⁺ and SRP, averaged per site and sampling date.
6. NH4-SRP uptake ratio.csv: CSV file containing the NH₄⁺:SRP molar uptake ratios calculated for incubations with dual nutrient additions.
7. Random structure selection.csv: CSV file used to explore random-effect structures (Site, Time, Site + Time, Site × Time) using global models prior to model selection.

Please cite this repository if you use these data or scripts in your own research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
Post-COVID fatigue (pCF) represents a significant burden for many individuals following SARS-CoV-2 infection. The unpredictable nature of fatigue fluctuations impairs daily functioning and quality of life, creating challenges for effective symptom management.

Objective
This study investigated the feasibility of developing predictive models to forecast next-day fatigue levels in individuals with pCF, utilizing objective physiological and behavioral features derived from wearable device data.

Methods
We analyzed data from 68 participants with pCF who wore an Axivity AX6 device on their non-dominant wrist and a VitalPatch electrocardiogram (ECG) sensor on their chest for up to 21 days while completing fatigue questionnaires every other day. Heart rate variability (HRV) features were extracted from the VitalPatch single-lead ECG signal using the NeuroKit Python package, while activity and sleep features were derived from the Axivity wrist-worn device using the GGIR package. Using a 5-fold cross-validation approach, we trained and evaluated two machine learning models, Random Forest and XGBoost, to predict next-day fatigue levels measured as Visual Analogue Scale (VAS) fatigue scores.

Results
Using five-fold cross-validation, XGBoost outperformed Random Forest in predicting next-day fatigue levels (mean R² = 0.79 ± 0.04 vs. 0.69 ± 0.02; MAE = 3.18 ± 0.63 vs. 6.14 ± 0.96). Predicted and observed fatigue scores were strongly correlated for both models (XGBoost: r = 0.89 ± 0.02; Random Forest: r = 0.86 ± 0.01). Key predictors included heart rate variability features (sample entropy, low-frequency power, and approximate entropy), along with demographic (age, sex) and activity-related (moderate and vigorous duration) factors. These findings underscore the importance of integrating physiological, demographic, and activity data for accurate fatigue prediction.

Conclusions
This study demonstrates the feasibility of combining heart rate variability with activity and sleep features to predict next-day fatigue levels in individuals with pCF. Integrating physiological and behavioral data shows promising predictive accuracy and provides insights that could inform future personalized fatigue management strategies.
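For readers unfamiliar with how per-fold R² and MAE figures of this kind are produced, the following is a minimal sketch of 5-fold cross-validation. It is not the study's pipeline: a one-feature least-squares line stands in for Random Forest and XGBoost, the data are synthetic, and every function name is hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mae(truth, preds):
    """Mean absolute error."""
    return statistics.fmean([abs(t - p) for t, p in zip(truth, preds)])


def r2(truth, preds):
    """Coefficient of determination on the held-out fold."""
    mean_t = statistics.fmean(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1 - ss_res / ss_tot


def cross_validate(xs, ys, k=5, seed=0):
    """Return one (R2, MAE) pair per fold, each scored on held-out data."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        preds = [slope * xs[i] + intercept for i in test_idx]
        truth = [ys[i] for i in test_idx]
        scores.append((r2(truth, preds), mae(truth, preds)))
    return scores


# Synthetic data, generated only for this illustration.
rng = random.Random(1)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 10.0 + rng.gauss(0, 3) for x in xs]

scores = cross_validate(xs, ys)
r2_values = [s[0] for s in scores]
mae_values = [s[1] for s in scores]
print(f"R2  = {statistics.fmean(r2_values):.2f} ± {statistics.stdev(r2_values):.2f}")
print(f"MAE = {statistics.fmean(mae_values):.2f} ± {statistics.stdev(mae_values):.2f}")
```

The "mean ± spread" summary across folds mirrors the way the results above are reported. Whether the study's ± values are standard deviations is an assumption here.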
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
File List
Miller_and_Mitchell_Power_Analysis_Code.r (md5: e0161858aaaeac3e81a2c755640b9feb)
Tree_BA.csv (md5: e7e52c80094e578260cc7d4bade00208)
Description
Miller_and_Mitchell_Power_Analysis_Code.R - This script runs a bootstrap power analysis based on a mixed effects model of sample data (plot is the random effect, time and site (park) are fixed effects). The simulation determines power to detect a uniform percentage per sampling cycle change in the value of a metric as a linear trend in a mixed-effects model. The sample sizes tested by the script do not have to be the same as the number of samples in the data file; any desired number of samples will be bootstrapped from the actual data.
The script will report power for a uniform trend across all parks (model with no interaction), as well as power for a trend that occurs only at one park (model with an interaction effect, where simulated effect occurs and power is tested for each park in turn).
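The general idea of a bootstrap power analysis can be sketched in a few lines. This is a deliberately simplified illustration, not the Miller and Mitchell script: a paired test on per-plot differences with a normal approximation stands in for the mixed-effects model, there is a single park with no interaction, and the plot values and function names are hypothetical.

```python
import math
import random
import statistics

# Hypothetical plots: name -> (cycle 1 value, cycle 2 value). Invented numbers.
PLOTS = {
    "P01": (20.0, 20.5), "P02": (25.0, 24.2), "P03": (31.0, 31.4),
    "P04": (28.0, 27.5), "P05": (35.0, 35.9), "P06": (22.0, 21.6),
    "P07": (40.0, 40.3), "P08": (27.0, 26.4),
}


def rejects_no_change(diffs, alpha=0.05):
    """Paired test on per-plot differences using a normal approximation."""
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)
    if sd_d == 0:
        # Degenerate resample: every difference is identical.
        return mean_d != 0
    z = mean_d / (sd_d / math.sqrt(len(diffs)))
    critical = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def bootstrap_power(plots, n_plots, pct_change, n_sims=1000, alpha=0.05, seed=0):
    """Fraction of bootstrap samples in which the simulated change is detected.

    n_plots may differ from len(plots); plots are drawn with replacement.
    """
    rng = random.Random(seed)
    pairs = list(plots.values())
    hits = 0
    for _ in range(n_sims):
        sample = rng.choices(pairs, k=n_plots)
        # Impose a uniform percentage change on the second sampling cycle.
        diffs = [second * (1 + pct_change) - first for first, second in sample]
        hits += rejects_no_change(diffs, alpha)
    return hits / n_sims


for pct in (0.0, 0.02, 0.20):
    print(f"{pct:.0%} change, 20 plots: power = {bootstrap_power(PLOTS, 20, pct):.2f}")
```

Because plots are resampled with replacement, the tested sample size (20 here) can exceed the eight plots in the example data, which matches the behaviour described above.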
This script requires a comma delimited (.csv) file with the following headings:
ID (unique alphanumeric value for each row of data; does not need to be called "ID")
Plot (text, not numeric only, e.g.: "ACAD1" not "1")
Park (text)
Year (year of sample, numeric)
Metric (metric to be evaluated, numeric)
The data in the file must have two measurements for each plot, with the initial measurement of all plots collected prior to any second measurements (separate data collection cycles).
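Before running a simulation of this kind, it can help to check an input file against the requirements listed above. The following is a minimal sketch under stated assumptions: the function name and the sample rows are hypothetical, and the cycle check works only at year resolution, so it cannot detect an overlap within a single year.

```python
import csv
import io
from collections import defaultdict

REQUIRED = ("Plot", "Park", "Year", "Metric")


def check_power_input(csv_text):
    """Return a list of problems found in the CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in REQUIRED if name not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    years_by_plot = defaultdict(list)
    for line_no, row in enumerate(reader, start=2):
        plot = (row["Plot"] or "").strip()
        if plot.isdigit():
            problems.append(f"line {line_no}: Plot '{plot}' is numeric only")
        try:
            year = int(row["Year"])
            float(row["Metric"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: Year or Metric is not numeric")
            continue
        years_by_plot[plot].append(year)

    for plot, years in years_by_plot.items():
        if len(years) != 2:
            problems.append(f"plot {plot}: expected 2 measurements, found {len(years)}")

    # All first measurements should precede every second measurement.
    complete = [sorted(years) for years in years_by_plot.values() if len(years) == 2]
    if complete:
        last_first = max(years[0] for years in complete)
        earliest_second = min(years[1] for years in complete)
        if last_first > earliest_second:
            problems.append("sampling cycles overlap: a first measurement "
                            "is later than some second measurement")
    return problems


# Hypothetical rows, invented for illustration.
example = """ID,Plot,Park,Year,Metric
1,ACAD1,ACAD,2007,21.5
2,ACAD2,ACAD,2008,30.1
3,ACAD1,ACAD,2011,22.0
4,ACAD2,ACAD,2012,29.4
"""
print(check_power_input(example) or "no problems found")
```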
Tree_BA.csv – This is an example of the data sets used for the power analysis in this article. Data sets must be formatted as demonstrated in this data set for the simulation to work properly.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file is one of three input files for the R Markdown script analyzing the data (S1 File). (CSV)