Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Descriptive statistics, mean ± SD, range, median and interquartile range (IQR).
These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code

Abstract
We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description
“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)
Required R packages:
• For running “CWVS_LMC.txt”:
  • msm: Sampling from the truncated normal distribution
  • mnormt: Sampling from the multivariate normal distribution
  • BayesLogit: Sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
  • plotrix: Plotting the posterior means and credible intervals

Instructions for Use

Reproducibility (Mandatory)
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.
How to use the information:
• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.
Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
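The week-by-week median/IQR standardization described for the exposure matrix can be sketched as follows. This is a minimal Python illustration, not the authors' R code; the function name `standardize_by_week` and the small `raw` matrix are hypothetical, and the layout (rows are individuals, columns are weeks) is an assumption.

```python
import numpy as np

def standardize_by_week(exposures):
    """Center each column (week) at its median and scale it by its IQR."""
    exposures = np.asarray(exposures, dtype=float)
    q25, q50, q75 = np.percentile(exposures, [25, 50, 75], axis=0)
    return (exposures - q50) / (q75 - q25)

# Hypothetical raw exposures: 5 individuals (rows) by 2 weeks (columns).
raw = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0],
                [4.0, 40.0],
                [5.0, 50.0]])
z = standardize_by_week(raw)
```

After this step every week has median 0 and IQR 1, so the original medians and IQRs cannot be recovered from `z` alone.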
The Precipitation Estimation from Remotely Sensed Information using an Artificial Neural Network-Climate Data Record (PERSIANN-CDR) is a new, retrospective satellite-based precipitation dataset, constructed as a climate data record for hydrological and climate studies. The PERSIANN-CDR is available from 1983 to present, making the dataset the longest satellite-based precipitation data record available. The precipitation maps are available at daily temporal resolution for the latitude band 60°S–60°N at 0.25 degrees. The maps shown here represent 30-year annual and seasonal median and interquartile range (IQR) of the PERSIANN-CDR dataset from 1984 – 2014. In the median precipitation maps, the mid-point value (or 50th percentile) for each pixel is computed and plotted for the study area. The range of the data about the median is represented by the interquartile range (IQR), and shows the variability of the dataset. For these maps, winter = December – February, spring = March – May, summer = June – August, fall = September – November.
The Precipitation Estimation from Remotely Sensed Information using an Artificial Neural Network-Climate Data Record (PERSIANN-CDR) is a satellite-based precipitation dataset for hydrological and climate studies, spanning from 1983 to present. It is the longest satellite-based precipitation record available, with daily data at 0.25° resolution for the 60°S–60°N latitude band.
PERSIANN rain rate estimates are generated at 0.25° resolution and calibrated to a monthly merged in-situ and satellite product from the Global Precipitation Climatology Project (GPCP). The model uses Gridded Satellite (GridSat-B1) infrared data at 3-hourly time steps, with the raw output (PERSIANN-B1) bias-corrected and accumulated to produce the daily PERSIANN-CDR.
The maps show 31 years (1984–2014) of annual and seasonal median and interquartile range (IQR) data. The median represents the 50th percentile of precipitation, and the IQR reflects the range between the 75th and 25th percentiles, showing data variability. Median and IQR are preferred over mean and standard deviation as they are less influenced by extreme values and better represent non-normally distributed data, such as precipitation, which is skewed and zero-limited.
Data and Metadata: NCEI
This is a component of the Gulf Data Atlas (V1.0) for the Physical topic area.
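A per-pixel median and IQR map of the kind described above could be computed along these lines. This is only a sketch: the array layout (year, latitude, longitude), the function name `pixelwise_median_iqr`, and the synthetic gamma-distributed values are assumptions for illustration, not PERSIANN-CDR data.

```python
import numpy as np

def pixelwise_median_iqr(stack):
    """stack has shape (years, lat, lon); returns (median_map, iqr_map)."""
    q25, q50, q75 = np.nanpercentile(stack, [25, 50, 75], axis=0)
    return q50, q75 - q25

# Synthetic, skewed, non-negative "annual totals" for 31 years on a 4 x 5 grid.
rng = np.random.default_rng(0)
annual = rng.gamma(shape=2.0, scale=300.0, size=(31, 4, 5))
median_map, iqr_map = pixelwise_median_iqr(annual)
```

`np.nanpercentile` is used so that pixels with occasional missing years (stored as NaN) still get a value.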
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*Those loading most heavily (component load ≥|0.5|) in principal component analyses are identified in bold.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Geoscience Australia's GEOMACS model was utilised to produce hindcast hourly time series of continental shelf (~20 to 300 m depth) bed shear stress (unit of measure: Pascal, Pa) on a 0.1 degree grid covering the period March 1997 to February 2008 (inclusive). The hindcast data represent the combined contribution to the bed shear stress by waves, tides, wind and density-driven circulation. Included in the parameters that will be calculated to represent the magnitude of the bulk of the data are the quartiles of the distribution: Q25, Q50 and Q75 (i.e. the values below which 25, 50 and 75 percent of the observations fall). The interquartile range of the GEOMACS output takes the observations from between Q25 and Q75 to provide an accurate representation of the spread of observations. The interquartile range was shown to provide a more robust representation of the observations than the standard deviation, which produced highly skewed observations (Hughes and Harris 2008). This dataset is a contribution to the CERF Marine Biodiversity Hub and is hosted temporarily by CMAR on behalf of Geoscience Australia.
Our goal was to predict gender, age and emotion from audio. We collected labeled audio files from "Mozilla Common Voice" and the “Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)”. Using the R programming language, we extracted 20 statistical features from each file and then added the labels to form these datasets.
Each dataset contains 20 feature columns and 1 label column. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:
1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
2) sd - The standard deviation of frequency is a statistical measure that describes a dataset’s dispersion relative to its mean and is calculated as the square root of the variance.
3) median - The median frequency (in kHz) is the middle number in the sorted (ascending or descending) list of numbers.
4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. This means that about 25 percent of the data set numbers are below Q1, and about 75 percent are above Q1.
5) Q75 - The third quartile (in kHz), referred to as Q3, is the median of the upper half of the data set. This means that about 75 percent of the data set numbers are below Q3, and about 25 percent are above Q3.
6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, or between the upper and lower quartiles.
7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
8) kurt - The kurtosis is a statistical measure that determines how much the tails of a distribution differ from the tails of a normal distribution. It reflects the presence of outliers in the data distribution.
9) sp.ent - The spectral entropy is a measure of signal irregularity that sums up the normalized signal’s spectral power.
10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. Spectral flatness is usually measured in decibels and quantifies how tone-like, as opposed to noise-like, a sound is.
11) mode - The mode frequency is the most frequently observed value in a data set.
12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum’s center of mass is located.
13) meanfun - The meanfun is the average of the fundamental frequency measured across the acoustic signal.
14) minfun - The minfun is the minimum fundamental frequency measured across the acoustic signal.
15) maxfun - The maxfun is the maximum fundamental frequency measured across the acoustic signal.
16) meandom - The meandom is the average of the dominant frequency measured across the acoustic signal.
17) mindom - The mindom is the minimum of the dominant frequency measured across the acoustic signal.
18) maxdom - The maxdom is the maximum of the dominant frequency measured across the acoustic signal.
19) dfrange - The dfrange is the range of the dominant frequency measured across the acoustic signal.
20) modindx - The modindx is the modulation index, which expresses the degree of frequency modulation numerically as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.
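To make the first six features concrete, the sketch below treats the normalized amplitude spectrum as a probability distribution over frequency and reads off its mean, standard deviation and quartiles. It is a Python illustration under stated assumptions, not the R extraction used to build the datasets: the function name `spectral_summary`, the use of a single FFT over the whole signal, and the test tone are all hypothetical choices, and the original feature values may have been computed with different windowing or frequency limits.

```python
import numpy as np

def spectral_summary(signal, sample_rate):
    """Summarize the amplitude spectrum of a mono signal, with frequencies in kHz."""
    amplitude = np.abs(np.fft.rfft(signal))
    freq_khz = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate) / 1000.0
    weights = amplitude / amplitude.sum()   # spectrum as a probability mass function
    cumulative = np.cumsum(weights)

    mean = float(np.sum(freq_khz * weights))
    sd = float(np.sqrt(np.sum(weights * (freq_khz - mean) ** 2)))

    def quantile(q):
        # First frequency bin at which the cumulative spectrum reaches q.
        return float(freq_khz[np.searchsorted(cumulative, q)])

    q25, median, q75 = quantile(0.25), quantile(0.50), quantile(0.75)
    return {"meanfreq": mean, "sd": sd, "median": median,
            "Q25": q25, "Q75": q75, "IQR": q75 - q25}

# Hypothetical input: one second of a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
features = spectral_summary(np.sin(2 * np.pi * 1000 * t), sample_rate)
```

For a pure tone nearly all spectral weight sits in one bin, so the mean and median coincide and the IQR collapses to zero; real speech spreads the weight across many bins.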
Gender and Age Audio Data Source: Link: https://commonvoice.mozilla.org/en
Emotion Audio Data Source: Link: https://smartlaboratory.org/ravdess/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DH represents 100% for the relative measure. Differences between medians and distributions were significant between all disciplines if indicated with *, significantly different between GS and SG if marked with 1, between GS and DH if marked with 2, and between SG and DH if marked with 3. If no parameter was significantly different, the column is empty. Columns marked with "—" indicate that the measure was not calculated. Median, interquartile range (IQR) and significance level of the difference between discipline medians and distributions for all parameters, and percentage of DH for GS and SG.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
eGFR: estimated glomerular filtration rate, IQR: interquartile range.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1. Introduction
Datasets are used to evaluate the performance of a Kalman filter approach to estimate daily discharge. This is a perturbed version of synthetic SWOT datasets consisting of 15 river sections, which are commonly agreed datasets for evaluating the performance of SWOT discharge algorithms (Frasson et al., 2020, 2021). The benchmarking manuscript entitled “A Kalman Filter Approach for Estimating Daily Discharge Using Space-based Discharge Estimates” is currently under review at Water Resources Research. Once the manuscript is accepted, its DOI will be included here.
2. File description
The datasets are divided into two categories: river information (River_Info) and time series data (Timeseries_Data). River information provides general river characteristics, whereas time series data provides daily reach-averaged data for each reach. The time series data contain three main components: true data, perturbed measurements, and true and perturbed flow law parameters (A0, na, and b). For each reach, there are 10000 realizations of perturbed measurements per time step and 100 realizations of time-invariant perturbed flow law parameters, generated through a Monte Carlo simulation (Frasson et al., 2023). Moreover, to support our proposed Kalman filter approach to estimate daily discharge, the datasets provide the median of the perturbed discharge, river width, water surface slope, and change in the cross-sectional area, as well as the uncertainty of the perturbed discharge and change in the cross-sectional area based on the interquartile range (Fox, 2015).
To support reproducibility and facilitate example usage, we now include a MATLAB code package (KalmanFilter_Code.zip) that demonstrates how to run the Kalman filter approach using the Missouri Downstream case as an example.
Datasets are contained in one .mat file per river. The groups and variables are as follows:
River_Info
Name: River name, data type: char
QWBM: Mean annual discharge from the water balance model WBMsed (Cohen et al., 2014)
rch_bnd: Reach boundaries measured in meters from the upstream end of the model
gdrch: Good reaches in the study. They were used to exclude small reaches defined around low-head dams and other obstacles where Manning’s equation should not be applied.
Timeseries_Data
t: Time measured in days since the first day or “0-January-0000” for cases when specific dates were available. Dimension: 1, time step.
A: Reach-averaged cross-sectional area of flow in m2. Dimension: Reach, time step.
Q_true: True reach-averaged discharge (m3/s). Dimension: Reach, time step.
Q_ptb: Perturbed discharge (m3/s), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.
med_Q_ptb: Median perturbed discharge (m3/s) across the 10000 realizations. Dimension: Good reach, time step.
sigma_Q_ptb: Uncertainty of the perturbed discharge (m3/s), calculated based on the interquartile range. Dimension: Good reach, time step.
W_true: True reach-averaged river width (m). Dimension: Reach, time step.
W_ptb: Perturbed river width (m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.
med_W_ptb: Median perturbed river width (m) across the 10000 realizations. Dimension: Good reach, time step.
H_true: True reach-averaged water surface elevation (m). Dimension: Reach, time step.
H_ptb: Perturbed water surface elevation (m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.
S_true: True reach-averaged water surface slope (m/m). Dimension: Reach, time step.
S_ptb: Perturbed water surface slope (m/m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.
med_S_ptb: Median perturbed water surface slope (m/m) across the 10000 realizations. Dimension: Good reach, time step.
dA_true: True reach-averaged change in the cross-sectional area (m2). Dimension: Good reach, time step.
dA_ptb: Perturbed change in the cross-sectional area (m2), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.
med_dA_ptb: Median perturbed change in the cross-sectional area (m2) across the 10000 realizations. Dimension: Good reach, time step.
sigma_dA_ptb: Uncertainty of the perturbed change in the cross-sectional area (m2), calculated based on the interquartile range. Dimension: Good reach, time step.
A0_true: True baseline cross-sectional area (m2). Dimension: Good reach, 1.
A0: Perturbed baseline cross-sectional area (m2), including 100 realizations for each parameter. Dimension: Good reach, 100.
na_true: True friction coefficient. Dimension: Good reach, 1.
na: Perturbed friction coefficient, including 100 realizations for each parameter. Dimension: Good reach, 100.
b_true: True exponent coefficient. Dimension: Good reach, 1.
b: Perturbed exponent coefficient, including 100 realizations for each parameter. Dimension: Good reach, 100.
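The relation between the realization arrays (e.g., Q_ptb) and their summaries (med_Q_ptb, sigma_Q_ptb) can be illustrated as below. This is a Python sketch rather than the MATLAB package mentioned above, and the scaling is an assumption: the description only says the uncertainty is "based on the interquartile range", so dividing the IQR by 1.349 (the IQR of a standard normal distribution) is one common convention, not necessarily the one used for these files. The function name and the synthetic input are hypothetical.

```python
import numpy as np

IQR_OF_STANDARD_NORMAL = 1.349  # assumed scaling; not confirmed by the dataset description

def robust_median_sigma(realizations):
    """realizations has shape (reach, time step, realization)."""
    q25, q50, q75 = np.percentile(realizations, [25, 50, 75], axis=-1)
    return q50, (q75 - q25) / IQR_OF_STANDARD_NORMAL

# Synthetic stand-in for Q_ptb: 2 reaches, 3 time steps, 10000 realizations.
rng = np.random.default_rng(1)
q_ptb = rng.normal(loc=100.0, scale=10.0, size=(2, 3, 10000))
med_q, sigma_q = robust_median_sigma(q_ptb)
```

With normally distributed perturbations this estimate approaches the standard deviation while staying insensitive to a few extreme realizations.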
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A live version of the data record, which will be kept up-to-date with new estimates, can be downloaded from the Humanitarian Data Exchange: https://data.humdata.org/dataset/covid-19-mobility-italy.
If you find the data helpful or you use the data for your research, please cite our work:
Pepe, E., Bajardi, P., Gauvin, L., Privitera, F., Lake, B., Cattuto, C., & Tizzoni, M. (2020). COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Scientific Data 7, 230 (2020).
The data record is structured into 4 comma-separated value (CSV) files, as follows:
id_provinces_IT.csv. Table of the administrative codes of the 107 Italian provinces. The fields of the table are:
COD_PROV is an integer field that is used to identify a province in all other data records;
SIGLA is a two-letter code that identifies the province according to the ISO_3166-2 standard (https://en.wikipedia.org/wiki/ISO_3166-2:IT);
DEN_PCM is the full name of the province.
OD_Matrix_daily_flows_norm_full_2020_01_18_2020_04_17.csv. The file contains the daily fraction of users moving between Italian provinces. Each line corresponds to an entry (i, j) of the matrix. The fields of the table are:
p1: COD_PROV of origin,
p2: COD_PROV of destination,
day: in the format yyyy-mm-dd.
median_q1_q3_rog_2020_01_18_2020_04_17.csv. The file contains the median and interquartile range (IQR) of users’ radius of gyration in a province by week. The fields of the table are:
COD_PROV of the province;
SIGLA of the province;
DEN_PCM of the province;
week: median value of the radius of gyration in the given week, where the column name week has the format dd/mm-DD/MM, with dd/mm and DD/MM the first and the last day of the week, respectively;
week Q1: first quartile (Q1) of the distribution of the radius of gyration in that week;
week Q3: third quartile (Q3) of the distribution of the radius of gyration in that week.
average_network_degree_2020_01_18_2020_04_17.csv. The file contains daily time-series of the average degree 〈k〉 of the proximity network. Each entry of the table is a value of 〈k〉 on a given day. The fields of the table are:
COD_PROV of the province;
SIGLA of the province;
DEN_PCM of the province;
day in the format yyyy-mm-dd.
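As an illustration of how the weekly quartile columns of the radius-of-gyration file could be turned into an IQR per province and week, here is a small sketch using only the Python standard library. The exact column headers are an assumption based on the description (a column named after the week for the median, plus "<week> Q1" and "<week> Q3"), and the numbers in `SAMPLE` are made up for illustration; they are not values from the dataset.

```python
import csv
import io

# Made-up rows that mimic the assumed column layout of the file.
SAMPLE = "\n".join([
    "COD_PROV,SIGLA,DEN_PCM,18/01-24/01,18/01-24/01 Q1,18/01-24/01 Q3",
    "1,TO,Torino,5.0,2.0,9.5",
    "15,MI,Milano,6.0,2.5,11.0",
])

def weekly_iqr(csv_text):
    """Return {(SIGLA, week): Q3 - Q1} for every '<week> Q1' / '<week> Q3' column pair."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in row:
            if column.endswith(" Q1"):
                week = column[:-len(" Q1")]
                q1 = float(row[column])
                q3 = float(row[week + " Q3"])
                result[(row["SIGLA"], week)] = q3 - q1
    return result

iqr_by_province_week = weekly_iqr(SAMPLE)
```

To use it on the real file, the text of the CSV would be passed in place of `SAMPLE`, after checking that the headers match the assumed pattern.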
ESRI shapefiles of the Italian provinces updated to the most recent definition are available from the website of the Italian National Office of Statistics (ISTAT): https://www.istat.it/it/archivio/222527.
https://spdx.org/licenses/CC0-1.0.html
Objectives: This study aimed to ascertain utility and vision-related quality of life in patients awaiting access to specialist eye care. A secondary aim was to evaluate the association of utility indices with demographic profile and waiting time. Methods: Consecutive patients who had been waiting for ophthalmology care answered the 25-item National Eye Institute Visual Function Questionnaire (NEI VFQ-25). The questionnaire was administered when patients arrived at the clinics for their first visit. We derived a utility index (VFQ-UI) from the patients’ responses, then calculated the correlation between this index and waiting time and compared utility across demographic subgroups stratified by age, sex, and care setting. Results: 536 individuals participated in the study (mean age 52.9±16.6 years; 370 women, 69%). The median utility index was 0.85 (interquartile range [IQR] 0.70–0.92; minimum 0.40, maximum 0.97). The mean VFQ-25 score was 70.88±14.59. Utility correlated weakly and nonsignificantly with waiting time (-0.05, P = 0.24). It did not vary across age groups (P = 0.85) or care settings (P = 0.77). Utility was significantly lower for women (0.84, IQR 0.70–0.92) than men (0.87, IQR 0.73–0.93, P = 0.03), but the magnitude of this difference was small (Cohen’s d = 0.13). Conclusion: Patients awaiting access to ophthalmology care had a utility index of 0.85 on a scale of 0 to 1. This measurement was not previously reported in the literature. Utility measures can provide insight into patients’ perspectives, support health economic analyses, and inform health policies.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset gathers the trajectories of 161 Lagrangian surface drifters that were deployed in the western Mediterranean Sea in 2023 by three campaigns of the SWOT Adopt-A-Crossover consortium: C-SWOT-2023, BioSWOT-Med and FaST-SWOT. Drifter trajectories are available between March 27th 2023 and January 22nd 2024. The deployment strategy involved releasing drifters to target specific mesoscale and submesoscale structures in the vicinity of selected SWOT passes. These structures were identified using the SPASSO software, which combined near-real-time remote data from Copernicus (DUACS) and early SWOT data provided by CLS/CNES.

Several drifter designs are used in these experiments: SVP drifters drogued at 15 m, 50 m, and 100 m; SVP-B drifters at 15 m depth; a customized BGC-SVP drifter drogued at 15 m and equipped with additional sensors such as a CTD (for temperature and salinity) and an optical triplet measuring biochemical properties of the sea; surface drifters such as CODE, CARTHE and Hereon types, with a drogue within the first meter of depth; and Spotter and MELODI-eOdyn devices as wave drifters. The original nominal sampling rates range from 5 minutes to 1 hour. Drifters were deployed in passes 3 and 16 of the SWOT orbit during its fast-sampling (Cal-Val) phase (1-day revisit until July 10th), and some of the drifters further crossed the satellite ground tracks afterwards, when the satellite science orbit was set to 21 days.

This dataset is a collaborative effort between the SWOT-AdAC consortium and the FaST-SWOT, BioSWOT-Med and C-SWOT cruises. To provide a single interoperable dataset, all drifter trajectories from the different campaigns were processed with the same scripts in a similar manner, resulting in three distinct levels of processing.

L0 – Harmonised and preprocessed trajectories
All initial trajectories are merged into a single dataset with variables renamed to match database standards. The following steps are applied: removing rows with missing date/time, ordering by ascending time, trimming to valid deployment/recovery periods, dropping rows with missing values, eliminating duplicates, removing rows with repeated times but different positions, and excluding rows with erroneous latitude/longitude (e.g., outliers outside the Mediterranean Sea).

L1 – Processed trajectories
L1 trajectories are filtered based on acceleration. Velocity and acceleration are calculated at each timestep, and positions with accelerations exceeding 4 times the interquartile range (IQR) are removed. This results in irregularly spaced trajectories that retain the original GPS positions, and therefore the overall current dynamics signal with its multiscale components, but exclude GPS fix outliers as defined above.

L2 – Smoothed and regularly interpolated trajectories
L2 trajectories are obtained from the L1 trajectories, which are regularly interpolated and smoothed in order to reduce noise, especially on acceleration. Two methods are used: the LOWESS method (inspired by Elipot et al. 2016) and a variational method developed by M. Demol and A. Ponte (inspired by Yaremchuk and Coelho, 2014). L2 trajectories are available with time steps of 10 minutes, 30 minutes, or 1 hour. For more details on the smoothing and interpolating processing, please refer to the attached PDF.

Data export in NetCDF format
Each drifter trajectory is stored in eight separate NetCDF files, organised into eight distinct folders based on the processing stage and temporal resolution. For a given drifter, the following files are available:
l0_data/bioswot_carthe_4388553.nc
l1_data/bioswot_carthe_4388553.nc
l2_data_variational_10min/bioswot_carthe_4388553.nc
l2_data_variational_30min/bioswot_carthe_4388553.nc
l2_data_variational_1hour/bioswot_carthe_4388553.nc
l2_data_lowess_10min/bioswot_carthe_4388553.nc
l2_data_lowess_30min/bioswot_carthe_4388553.nc
l2_data_lowess_1hour/bioswot_carthe_4388553.nc

Contact list: Maristella Berta (maristella.berta@sp.ismar.cnr.it), Margot Demol (margot.demol@ifremer.fr), Laura Gómez Navarro (laura.gomez@uib.es) and Lloyd Izard (lloyd.izard@locean.ipsl.fr)
PI contacts for the different involved projects: BioSWOT-Med: Andrea Doglioli (andrea.doglioli@univ-amu.fr); C-SWOT: Pierre Garreau (pierre.garreau@ifremer.fr), Franck Dumas (franck.dumas@shom.fr) and Aurélien Ponte (aurelien.ponte@ifremer.fr); FaST-SWOT: Ananda Pascual (ananda.pascual@imedea.uib-csic.es) and Baptiste Mourre (bmourre@imedea.uib-csic.es).

References
Davis, Russ E. “Drifter observations of coastal surface currents during CODE: the method and descriptive view.” Journal of Geophysical Research: Oceans 90, no. C3 (1985): 4741–55. https://doi.org/10.1029/jc090ic03p04741.
Elipot, Shane, Rick Lumpkin, Renellys C Perez, Jonathan M Lilly, Jeffrey J Early, and Adam M Sykulski. “A global surface drifter data set at hourly resolution.” Journal of Geophysical Research: Oceans 121, no. 5 (2016):[...]
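The L1 acceleration filter described above (drop positions whose acceleration exceeds 4 times the IQR) can be sketched as follows. This is a minimal Python illustration under several assumptions: positions are already projected to metres, derivatives are taken with `np.gradient`, and the threshold is applied to the acceleration magnitude. The actual processing scripts may differ on each of these points, and the function name and the synthetic track are hypothetical.

```python
import numpy as np

def acceleration_keep_mask(t, x, y, factor=4.0):
    """t in seconds, x and y in metres; returns True where a position is kept."""
    u = np.gradient(x, t)            # eastward velocity
    v = np.gradient(y, t)            # northward velocity
    acc = np.hypot(np.gradient(u, t), np.gradient(v, t))
    q25, q75 = np.percentile(acc, [25, 75])
    return acc <= factor * (q75 - q25)

# Synthetic track sampled every 10 minutes: steady drift plus a slow oscillation.
t = np.arange(200) * 600.0
x = 0.3 * t + 300.0 * np.sin(2 * np.pi * t / 86400.0)
y = 0.1 * t
x[100] += 5000.0                     # one artificial GPS fix error of 5 km
keep = acceleration_keep_mask(t, x, y)
t_l1, x_l1, y_l1 = t[keep], x[keep], y[keep]
```

Because the derivatives use neighbouring fixes, a single bad position also inflates the acceleration estimated a couple of steps away, so a few adjacent fixes can be dropped together with the outlier.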
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Small description of the Reddit Politics 10 years dataset (starting from 2011-01).
node_month_leaning_full_headings.zip
The zip file contains a csv in which each row identifies a piece of user content (either a post or a comment), with the following structure:
node_id,month_progressive_id,leaning_lable,leaning_score,post_comment
where
node_id: the user's unique identifier
month_progressive_id: a numeric value from 0 to 100 identifying the month in which the post/comment was published
leaning_label: a discrete variable identifying left/right/moderates (it is based on leaning_score and can be re-binned if needed)
leaning_score: the continuous score describing the political leaning (range [0,1])
post_comment: a flag P/C to differentiate the submission type
monthly_scores_json.zip
This archive contains 3 json files:
monthly_scores.json: a dictionary month->node_id->{post: [list political leanings], comments: [list political leanings]};
monthly_scores_post_agg.json: a dictionary month->node_id->political_leaning, where the aggregated score is the average of the interquartile range of the political leaning of the user's posts only;
monthly_scores_agg.json: a dictionary month->node_id->political_leaning, where the aggregated score is the weighted(*) average of (i) the mean value of the interquartile range of the political leaning of the user's posts and (ii) the mean value of the interquartile range of the political leaning of the user's comments;
(*) because the annotation of posts is more reliable than that of comments, we weight the former 10 times the latter when aggregating.
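One way to read the aggregation above is sketched below: the "mean of the interquartile range" is taken to be the mean of the scores that remain after discarding the lowest and highest quarter of the values, and posts are weighted 10 to 1 against comments. Both the trimming convention and the handling of users with only posts or only comments are assumptions made for this Python illustration; the function names are hypothetical and the authors' implementation may differ.

```python
import numpy as np

POST_WEIGHT = 10.0     # posts weighted 10 times comments, as stated in the description
COMMENT_WEIGHT = 1.0

def interquartile_mean(scores):
    """Mean of the scores left after dropping the lowest and highest quarter (by rank)."""
    ordered = np.sort(np.asarray(scores, dtype=float))
    k = len(ordered) // 4
    return float(ordered[k:len(ordered) - k].mean())

def aggregate_leaning(post_scores, comment_scores):
    """Weighted average of the post and comment summaries; None if there is no content."""
    parts = []
    if len(post_scores) > 0:
        parts.append((POST_WEIGHT, interquartile_mean(post_scores)))
    if len(comment_scores) > 0:
        parts.append((COMMENT_WEIGHT, interquartile_mean(comment_scores)))
    if not parts:
        return None
    total_weight = sum(weight for weight, _ in parts)
    return sum(weight * value for weight, value in parts) / total_weight

# Hypothetical user with two posts and one comment in a given month.
example = aggregate_leaning([0.2, 0.2], [0.9])
```

If a user has only one kind of content, this sketch simply returns that summary unweighted.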
monthly_networks_full.zip
This archive contains all the monthly undirected, unweighted interaction networks (each row identifies an edge between two node ids). The networks cover all users having a political leaning computed (using *both* posts and comments).
monthly_networks_posts.zip
This archive contains all the monthly undirected, unweighted interaction networks (each row identifies an edge between two node ids). The networks cover all users having a political leaning computed considering *only* posts.
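A minimal Python sketch of loading one monthly edge list into an undirected, unweighted adjacency structure is given below. The file layout is an assumption: two node ids per row separated by a comma, with no header row. The sample edges and the name `load_undirected` are made up for illustration.

```python
import csv
import io
from collections import defaultdict

def load_undirected(handle, delimiter=","):
    """Build an adjacency dictionary (node id -> set of neighbours).

    Assumes one edge per row with two node ids; the delimiter is a guess.
    Duplicate rows and reversed pairs collapse into a single edge.
    """
    adjacency = defaultdict(set)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        a, b = row[0].strip(), row[1].strip()
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

# Made-up edges; "u2,u1" repeats "u1,u2" in reverse order.
SAMPLE_EDGES = "u1,u2\nu2,u3\nu2,u1\n"

graph = load_undirected(io.StringIO(SAMPLE_EDGES))
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
```

Storing neighbours in sets keeps the network unweighted, since repeated interactions between the same two users count once.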
Larval life history traits and geographic distribution for each thoracican barnacle species used in the study
The table "finalmergeddata.csv" contains life history and environmental data as well as the calculated variance (IQR = interquartile range, se = standard error) summarized per species. The table "lifehistory.xls" contains the species-specific larval life history data we extracted from the literature. The first tab, "Taxonomy + larval mode", has one row per species. The taxonomy is taken from WoRMS (www.marinespecies.org). The following two tabs contain information on other larval traits and the known geographic distribution of the barnacle species. In these tabs, each species can occur several times, as we chose to give each reference a separate row. The references are detailed in the datatable_references file. The meaning of all columns is explained in the last tab "METADATA". Detailed references for the data sources are available in the last tab "Data sourc...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*n = 1041 (35 missing data). BMI = body mass index (kg/m2); SD = standard deviation; IQR = interquartile range; EI = energy intake (MJ/d); BMR = basal metabolic rate (MJ/d).
OBJECTIVE:
To evaluate safety and efficacy of image guided-hypofractionated radiation therapy (IG-HRT) in patients with thoracic nodes oligometastases.
METHODS:
The present study is a multicenter analysis. Oligometastatic patients with a maximum of five active lesions in three or fewer different organs, treated with IG-HRT to thoracic nodes metastases between 2012 and 2017, were included in the analysis. The primary end point was local control (LC); secondary end points were overall survival (OS), progression-free survival, and acute and late toxicity. Univariate and multivariate analyses were performed to identify possible prognostic factors for the survival end points.
RESULTS:
76 patients were included in the analysis. Different RT dose and fractionation schedules were prescribed according to site, number, size of the lymph node(s) and to respect dose constraints for relevant organs at risk. Median biologically effective dose delivered was 75 Gy (interquartile range: 59-86 Gy). Treatment was optimal; one G1 acute toxicity and seven G1 late toxicities of any grade were recorded. Median follow-up time was 23.16 months. 16 patients (21.05%) had a local progression, while 52 patients (68.42%) progressed in distant sites. Median local relapse-free survival was not reached; LC at 6, 12 and 24 months was 96.05% [confidence interval (CI) 88.26-98.71%], 86.68% (CI 75.86-92.87%) and 68.21% (CI 51.89-80.00%), respectively. Median OS was 28.3 months (interquartile range 16.1-47.2). Median progression-free survival was 9.2 months (interquartile range 4.1-17.93). At multivariate analysis, RT dose, colorectal histology and systemic therapies were correlated with LC. Performance status and the presence of metastatic sites other than the thoracic nodes were correlated with OS. Local response was a predictor of OS.
CONCLUSION:
IG-HRT for thoracic nodes was safe and feasible. Higher RT doses were correlated with better LC and should be taken into consideration, at least in patients with isolated nodal metastases and colorectal histology.
ADVANCES IN KNOWLEDGE:
Radiotherapy is a safe and effective treatment for thoracic nodes metastases, and higher radiotherapy doses are correlated with better LC. Oligometastatic patients can also receive IG-HRT for thoracic nodes metastases.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Median and interquartile range (IQR) for each numeric variable of the dataset, stratified by Survival (S: Survived, NS: Not survived, T: Total cohort), and for the SIRS and SEPSIS cohorts.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abbreviations: GFR, glomerular filtration rate; MDRD, Modification of Diet in Renal Disease; CKD-EPI, Chronic Kidney Disease Epidemiology Collaboration; CI, confidence interval; IQR, interquartile range. Performance of bias, precision and accuracy between measured GFR and estimated GFR in the validation data set.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Descriptive statistics, mean ± SD, range, median and interquartile range (IQR).