This dataset consists of mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.
## Example questions
Question: Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.
Answer: 4
Question: Calculate -841880142.544 + 411127.
Answer: -841469015.544
Question: Let x(g) = 9*g + 1. Let q(c) = 2*c + 1. Let f(i) = 3*i - 39. Let w(j) = q(x(j)). Calculate f(w(a)).
Answer: 54*a - 30
It contains 2 million (question, answer) pairs per module, with questions limited to 160 characters in length, and answers to 30 characters in length. Note the training data for each question type is split into "train-easy", "train-medium", and "train-hard". This allows training models via a curriculum. The data can also be mixed together uniformly from these training datasets to obtain the results reported in the paper. Categories:
The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
GLAH05 Level-1B waveform parameterization data include output parameters from the waveform characterization procedure and other parameters required to calculate surface slope and relief characteristics. GLAH05 contains parameterizations of both the transmitted and received pulses and other characteristics from which elevation and footprint-scale roughness and slope are calculated. The received pulse characterization uses two implementations of the retracking algorithms: one tuned for ice sheets, called the standard parameterization, used to calculate surface elevation for ice sheets, oceans, and sea ice; and another for land (the alternative parameterization). Each data granule has an associated browse product.
The databases ESTAR, PSTAR, and ASTAR calculate stopping-power and range tables for electrons, protons, or helium ions. Stopping-power and range tables can be calculated for electrons in any user-specified material and for protons and helium ions in 74 materials.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Overview The Human Vital Signs Dataset is a comprehensive collection of key physiological parameters recorded from patients. This dataset is designed to support research in medical diagnostics, patient monitoring, and predictive analytics. It includes both original attributes and derived features to provide a holistic view of patient health.
Attributes Patient ID
Description: A unique identifier assigned to each patient. Type: Integer Example: 1, 2, 3, ... Heart Rate
Description: The number of heartbeats per minute. Type: Integer Range: 60-100 bpm (for this dataset) Example: 72, 85, 90 Respiratory Rate
Description: The number of breaths taken per minute. Type: Integer Range: 12-20 breaths per minute (for this dataset) Example: 16, 18, 15 Timestamp
Description: The exact time at which the vital signs were recorded. Type: Datetime Format: YYYY-MM-DD HH:MM Example: 2023-07-19 10:15:30 Body Temperature
Description: The body temperature measured in degrees Celsius. Type: Float Range: 36.0-37.5°C (for this dataset) Example: 36.7, 37.0, 36.5 Oxygen Saturation
Description: The percentage of oxygen-bound hemoglobin in the blood. Type: Float Range: 95-100% (for this dataset) Example: 98.5, 97.2, 99.1 Systolic Blood Pressure
Description: The pressure in the arteries when the heart beats (systolic pressure). Type: Integer Range: 110-140 mmHg (for this dataset) Example: 120, 130, 115 Diastolic Blood Pressure
Description: The pressure in the arteries when the heart rests between beats (diastolic pressure). Type: Integer Range: 70-90 mmHg (for this dataset) Example: 80, 75, 85 Age
Description: The age of the patient. Type: Integer Range: 18-90 years (for this dataset) Example: 25, 45, 60 Gender
Description: The gender of the patient. Type: Categorical Categories: Male, Female Example: Male, Female Weight (kg)
Description: The weight of the patient in kilograms. Type: Float Range: 50-100 kg (for this dataset) Example: 70.5, 80.3, 65.2 Height (m)
Description: The height of the patient in meters. Type: Float Range: 1.5-2.0 m (for this dataset) Example: 1.75, 1.68, 1.82 Derived Features Derived_HRV (Heart Rate Variability)
Description: A measure of the variation in time between heartbeats. Type: Float Formula: 𝐻 𝑅
Standard Deviation of Heart Rate over a Period Mean Heart Rate over the Same Period HRV= Mean Heart Rate over the Same Period Standard Deviation of Heart Rate over a Period
Example: 0.10, 0.12, 0.08 Derived_Pulse_Pressure (Pulse Pressure)
Description: The difference between systolic and diastolic blood pressure. Type: Integer Formula: 𝑃
Systolic Blood Pressure − Diastolic Blood Pressure PP=Systolic Blood Pressure−Diastolic Blood Pressure Example: 40, 45, 30 Derived_BMI (Body Mass Index)
Description: A measure of body fat based on weight and height. Type: Float Formula: 𝐵 𝑀
Weight (kg) ( Height (m) ) 2 BMI= (Height (m)) 2
Weight (kg)
Example: 22.8, 25.4, 20.3 Derived_MAP (Mean Arterial Pressure)
Description: An average blood pressure in an individual during a single cardiac cycle. Type: Float Formula: 𝑀 𝐴
Diastolic Blood Pressure + 1 3 ( Systolic Blood Pressure − Diastolic Blood Pressure ) MAP=Diastolic Blood Pressure+ 3 1 (Systolic Blood Pressure−Diastolic Blood Pressure) Example: 93.3, 100.0, 88.7 Target Feature Risk Category Description: Classification of patients into "High Risk" or "Low Risk" based on their vital signs. Type: Categorical Categories: High Risk, Low Risk Criteria: High Risk: Any of the following conditions Heart Rate: > 90 bpm or < 60 bpm Respiratory Rate: > 20 breaths per minute or < 12 breaths per minute Body Temperature: > 37.5°C or < 36.0°C Oxygen Saturation: < 95% Systolic Blood Pressure: > 140 mmHg or < 110 mmHg Diastolic Blood Pressure: > 90 mmHg or < 70 mmHg BMI: > 30 or < 18.5 Low Risk: None of the above conditions Example: High Risk, Low Risk This dataset, with a total of 200,000 samples, provides a robust foundation for various machine learning and statistical analysis tasks aimed at understanding and predicting patient health outcomes based on vital signs. The inclusion of both original attributes and derived features enhances the richness and utility of the dataset.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset and codes for "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023 "
The MATLAB codes and related datasets are used for generating the figures for the paper "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023".
Files and variables
File 1: Data_and_Code.zip
Directory: Main_function
**Description:****Include MATLAB scripts and functions. Each script include discriptions that guide the user how to used it and how to find the dataset that used for processing.
MATLAB Main Scripts: Include the whole steps to process the data, output figures, and output videos.
Script_1_Ice_velocity_process_flow.m
Script_2_strain_rate_process_flow.m
Script_3_DROT_grounding_line_extraction.m
Script_4_Read_ICESat2_h5_files.m
Script_5_Extraction_results.m
MATLAB functions: Five Files that includes MATLAB functions that support the main script:
1_Ice_velocity_code: Include MATLAB functions related to ice velocity post-processing, includes remove outliers, filter, correct for atmospheric and tidal effect, inverse weited averaged, and error estimate.
2_strain_rate: Include MATLAB functions related to strain rate calculation.
3_DROT_extract_grounding_line_code: Include MATLAB functions related to convert range offset results output from GAMMA to differential vertical displacement and used the result extract grounding line.
4_Extract_data_from_2D_result: Include MATLAB functions that used for extract profiles from 2D data.
5_NeRD_Damage_detection: Modified code fom Izeboud et al. 2023. When apply this code please also cite Izeboud et al. 2023 (https://www.sciencedirect.com/science/article/pii/S0034425722004655).
6_Figure_plotting_code:Include MATLAB functions related to Figures in the paper and support information.
Director: data_and_result
Description:**Include directories that store the results output from MATLAB. user only neeed to modify the path in MATLAB script to their own path.
1_origin : Sample data ("PS-20180323-20180329", “PS-20180329-20180404”, “PS-20180404-20180410”) output from GAMMA software in Geotiff format that can be used to calculate DROT and velocity. Includes displacment, theta, phi, and ccp.
2_maskccpN: Remove outliers by ccp < 0.05 and change displacement to velocity (m/day).
3_rockpoint: Extract velocities at non-moving region
4_constant_detrend: removed orbit error
5_Tidal_correction: remove atmospheric and tidal induced error
6_rockpoint: Extract non-aggregated velocities at non-moving region
6_vx_vy_v: trasform velocities from va/vr to vx/vy
7_rockpoint: Extract aggregated velocities at non-moving region
7_vx_vy_v_aggregate_and_error_estimate: inverse weighted average of three ice velocity maps and calculate the error maps
8_strain_rate: calculated strain rate from aggregate ice velocity
9_compare: store the results before and after tidal correction and aggregation.
10_Block_result: times series results that extrac from 2D data.
11_MALAB_output_png_result: Store .png files and time serties result
12_DROT: Differential Range Offset Tracking results
13_ICESat_2: ICESat_2 .h5 files and .mat files can put here (in this file only include the samples from tracks 0965 and 1094)
14_MODIS_images: you can store MODIS images here
shp: grounding line, rock region, ice front, and other shape files.
File 2 : PIG_front_1947_2023.zip
Includes Ice front positions shape files from 1947 to 2023, which used for plotting figure.1 in the paper.
File 3 : PIG_DROT_GL_2016_2021.zip
Includes grounding line positions shape files from 1947 to 2023, which used for plotting figure.1 in the paper.
Data was derived from the following sources:
Those links can be found in MATLAB scripts or in the paper "**Open Research" **section.
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Aim: Despite the wide distribution of many parasites around the globe, the range of individual species varies significantly even among phylogenetically related taxa. Since parasites need suitable hosts to complete their development, parasite geographical and environmental ranges should be limited to communities where their hosts are found. Parasites may also suffer from a trade-off between being locally abundant or widely dispersed. We hypothesize that the geographical and environmental ranges of parasites are negatively associated to their host specificity and their local abundance. Location: Worldwide Time period: 2009 to 2021 Major taxa studied: Avian haemosporidian parasites Methods: We tested these hypotheses using a global database which comprises data on avian haemosporidian parasites from across the world. For each parasite lineage, we computed five metrics: phylogenetic host-range, environmental range, geographical range, and their mean local and total number of observations in the database. Phylogenetic generalized least squares models were ran to evaluate the influence of phylogenetic host-range and total and local abundances on geographical and environmental range. In addition, we analysed separately the two regions with the largest amount of available data: Europe and South America. Results: We evaluated 401 lineages from 757 localities and observed that generalism (i.e. phylogenetic host range) associates positively to both the parasites’ geographical and environmental ranges at global and Europe scales. For South America, generalism only associates with geographical range. Finally, mean local abundance (mean local number of parasite occurrences) was negatively related to geographical and environmental range. This pattern was detected worldwide and in South America, but not in Europe. Main Conclusions: We demonstrate that parasite specificity is linked to both their geographical and environmental ranges. The fact that locally abundant parasites present restricted ranges, indicates a trade-off between these two traits. This trade-off, however, only becomes evident when sufficient heterogeneous host communities are considered. Methods We compiled data on haemosporidian lineages from the MalAvi database (http://130.235.244.92/Malavi/ , Bensch et al. 2009) including all the data available from the “Grand Lineage Summary” representing Plasmodium and Haemoproteus genera from wild birds and that contained information regarding location. After checking for duplicated sequences, this dataset comprised a total of ~6200 sequenced parasites representing 1602 distinct lineages (775 Plasmodium and 827 Haemoproteus) collected from 1139 different host species and 757 localities from all continents except Antarctica (Supplementary figure 1, Supplementary Table 1). The parasite lineages deposited in MalAvi are based on a cyt b fragment of 478 bp. This dataset was used to calculate the parasites’ geographical, environmental and phylogenetic ranges. Geographical range All analyses in this study were performed using R version 4.02. In order to estimate the geographical range of each parasite lineage, we applied the R package “GeoRange” (Boyle, 2017) and chose the variable minimum spanning tree distance (i.e., shortest total distance of all lines connecting each locality where a particular lineage has been found). Using the function “create.matrix” from the “fossil” package, we created a matrix of lineages and coordinates and employed the function “GeoRange_MultiTaxa” to calculate the minimum spanning tree distance for each parasite lineage distance (i.e. shortest total distance in kilometers of all lines connecting each locality). Therefore, as at least two distinct sites are necessary to calculate this distance, parasites observed in a single locality could not have their geographical range estimated. For this reason, only parasites observed in two or more localities were considered in our phylogenetically controlled least squares (PGLS) models. Host and Environmental diversity Traditionally, ecologists use Shannon entropy to measure diversity in ecological assemblages (Pielou, 1966). The Shannon entropy of a set of elements is related to the degree of uncertainty someone would have about the identity of a random selected element of that set (Jost, 2006). Thus, Shannon entropy matches our intuitive notion of biodiversity, as the more diverse an assemblage is, the more uncertainty regarding to which species a randomly selected individual belongs. Shannon diversity increases with both the assemblage richness (e.g., the number of species) and evenness (e.g., uniformity in abundance among species). To compare the diversity of assemblages that vary in richness and evenness in a more intuitive manner, we can normalize diversities by Hill numbers (Chao et al., 2014b). The Hill number of an assemblage represents the effective number of species in the assemblage, i.e., the number of equally abundant species that are needed to give the same value of the diversity metric in that assemblage. Hill numbers can be extended to incorporate phylogenetic information. In such case, instead of species, we are measuring the effective number of phylogenetic entities in the assemblage. Here, we computed phylogenetic host-range as the phylogenetic Hill number associated with the assemblage of hosts found infected by a given parasite. Analyses were performed using the function “hill_phylo” from the “hillr” package (Chao et al., 2014a). Hill numbers are parameterized by a parameter “q” that determines the sensitivity of the metric to relative species abundance. Different “q” values produce Hill numbers associated with different diversity metrics. We set q = 1 to compute the Hill number associated with Shannon diversity. Here, low Hill numbers indicate specialization on a narrow phylogenetic range of hosts, whereas a higher Hill number indicates generalism across a broader phylogenetic spectrum of hosts. We also used Hill numbers to compute the environmental range of sites occupied by each parasite lineage. Firstly, we collected the 19 bioclimatic variables from WorldClim version 2 (http://www.worldclim.com/version2) for all sites used in this study (N = 713). Then, we standardized the 19 variables by centering and scaling them by their respective mean and standard deviation. Thereafter, we computed the pairwise Euclidian environmental distance among all sites and used this distance to compute a dissimilarity cluster. Finally, as for the phylogenetic Hill number, we used this dissimilarity cluster to compute the environmental Hill number of the assemblage of sites occupied by each parasite lineage. The environmental Hill number for each parasite can be interpreted as the effective number of environmental conditions in which a parasite lineage occurs. Thus, the higher the environmental Hill number, the more generalist the parasite is regarding the environmental conditions in which it can occur. Parasite phylogenetic tree A Bayesian phylogenetic reconstruction was performed. We built a tree for all parasite sequences for which we were able to estimate the parasite’s geographical, environmental and phylogenetic ranges (see above); this represented 401 distinct parasite lineages. This inference was produced using MrBayes 3.2.2 (Ronquist & Huelsenbeck, 2003) with the GTR + I + G model of nucleotide evolution, as recommended by ModelTest (Posada & Crandall, 1998), which selects the best-fit nucleotide substitution model for a set of genetic sequences. We ran four Markov chains simultaneously for a total of 7.5 million generations that were sampled every 1000 generations. The first 1250 million trees (25%) were discarded as a burn-in step and the remaining trees were used to calculate the posterior probabilities of each estimated node in the final consensus tree. Our final tree obtained a cumulative posterior probability of 0.999. Leucocytozoon caulleryi was used as the outgroup to root the phylogenetic tree as Leucocytozoon spp. represents a basal group within avian haemosporidians (Pacheco et al., 2020).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary and methods used to calculate the physical characteristics used to compare the home range estimators.
Raw data to calculate rate of adaptationRaw dataset for rate of adaptation calculations (Figure 1) and related statistics.dataall.csvR code to analyze raw data for rate of adaptationCompetition Analysis.RRaw data to calculate effective population sizesdatacount.csvR code to analayze effective population sizesR code used to analyze effective population sizes; Figure 2Cell Count Ne.RR code to determine our best estimate of the dominance coefficient in each environmentR code to produce figures 3, S4, S5 -- what is the best estimate of dominance? Note, competition and effective population size R code must be run first in the same session.what is h.R
The share of the population aged 4 to the age when the compulsory primary education starts who is participating in early education. This indicator measures the Education and Training 2020 strategy's headline target to increase the share of children participating in pre-primary education (measured as those between 4 years old and the age for starting compulsory primary education) to at least 95% in 2020. The next table shows the entrance age to the primary education and the age range of the indicator by country: CountryBEBGCZDKDEEEIEELESFRHRITCYLVLTLUHUMTNLATPLPTROSISKFISEUKMKTRISLINOCHEntrance age*67666746667667766566766667756-766766-8Age range**4-54-64-54-54-54-64-54-54-54-54-64-54-54-64-64-54-544-54-54-64-54-54-54-54-64-644-54-54-54-64-54-6* Usual entrance age to primary education (Note: this can be earlier than the age of compulsory primary education) ** Used age range to calculate the participation rate in early childhood education, i.e. age 4 up to the age of compulsory primary education.
This dataset provides the equation of state data for lead in the temperature and pressure range from room temperature to 10 MK, and from atmospheric pressure to 107GPa. The thermodynamic properties of the shock Hugoniot line, 300 K isotherm, melting line, and temperature dense transition zone were calculated.
We calculate effective recombination coefficients for the formation of the 5g-4f lines of O III in the intermediate coupling scheme. Photoionization data for the 5g levels calculated using the R-matrix method are used to derive their recombination coefficients. Cascading from higher states is included, allowing for the effects of finite electron density in a hydrogenic approximation. We explicitly include the distribution of population between the two ground levels of O^3+^ in the calculation of the line intensities. The results are presented as a simple programmable formula allowing the calculation of recombination line intensities for electron temperatures, T_e_ in the range 5000-20000K and electron densities, N_e_ in the range 10^2^-10^6^cm^-3^.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This dataset contains the hydraulic conductivity and transmissivity data for the NSW part of the Clarence-Moreton Basin. The data were organized by geological formations. The data were sourced from the NSW state groundwater databases. Most records in the pumping test database do not have a hydraulic conductivity entry. The hydraulic conductivity was derived from the pumping test data by a two-step method to support the groundwater modelling: (i) transmissivity was estimated from the original test readings using the TGUESS approach; and (ii) hydraulic conductivity was calculated based on the estimated transmissivity and the screen information. The estimated hydraulic conductivity was mainly available for the alluvium and volcanics and varies in a range of six orders of magnitude.
This dataset was created through the following process:
Filter the data using the spatial extent of the Clarence-Moreton Basin;
Data quality check;
Calculate transmissivity using the TGUESS approach;
Compute hydraulic conductivity using the calculated transmissivity and the screen information in the state database;
Aquifer assignment.
Bioregional Assessment Programme (2014) CLM - Hydraulic conductivity NSW. Bioregional Assessment Derived Dataset. Viewed 28 September 2017, http://data.bioregionalassessments.gov.au/dataset/cecfa372-52ca-49f8-b61c-6506279eafb5.
Derived From Bioregional Assessment areas v02
Derived From Natural Resource Management (NRM) Regions 2010
Derived From Bioregional Assessment areas v03
Derived From NSW Office of Water Pump Test dataset
Derived From Bioregional Assessment areas v01
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From GEODATA TOPO 250K Series 3
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Geological Provinces - Full Extent
📈 Daily Historical Stock Price Data for Formula One Group (2014–2025)
A clean, ready-to-use dataset containing daily stock prices for Formula One Group from 2014-07-08 to 2025-05-28. This dataset is ideal for use in financial analysis, algorithmic trading, machine learning, and academic research.
🗂️ Dataset Overview
Company: Formula One Group Ticker Symbol: FWONK Date Range: 2014-07-08 to 2025-05-28 Frequency: Daily Total Records: 2740 rows (one per trading day)… See the full description on the dataset page: https://huggingface.co/datasets/khaledxbenali/daily-historical-stock-price-data-for-formula-one-group-20142025.
New flux data are presented for nine non-variable stars and 14 evolved variable stars with spectral types M and C. The data are from measurements of 21 passbands in the wavelength range from 7440{AA} to 10834{AA}, and they are comparable to measurements made by Wing some 40 years ago. Because the extinction algorithm applied to the new data is based partly on up-to-date calculations of telluric water-vapor effects, those calculations are tested for accuracy. In addition, methods used to calibrate standard stars both outside and inside the Paschen confluence are explained. After reddening corrections are applied to the flux data for the variable stars, those data are used to calculate color temperatures. In turn, those temperatures are used to derive blanketing corrections to color temperatures measured in the Wing filter system. Indices of absorption strength are calculated by comparing the flux data to blackbody colors derived from the color temperatures. It is found that the standard errors of those temperatures range from 3% to less than 1%. For the variable stars, the standard errors for the flux data range from 6.8mmag to 11.6mmag. For the non-variable stars, the corresponding standard error is about 6.0mmag. Cone search capability for table J/PASP/120/1183/stdmags (Flux data for standard stars) Cone search capability for table J/PASP/120/1183/stars (Cool variable stars positions (from Simbad))
The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
[Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 0.26°C.]What does the data show? This dataset shows the change in summer maximum air temperature for a range of global warming levels, including the recent past (2001-2020), compared to the 1981-2000 baseline period. Here, summer is defined as June-July-August. The dataset uses projections of daily maximum air temperature from UKCP18. For each year, the highest daily maximum temperature from the summer period is found. These are then averaged to give values for the 1981-2000 baseline, recent past (2001-2020) and global warming levels. The warming levels available are 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. The recent past value and global warming level values are stated as a change (in °C) relative to the 1981-2000 value. This enables users to compare summer maximum temperature trends for the different periods. In addition to the change values, values for the 1981-2000 baseline (corresponding to 0.51°C warming) and recent past (2001-2020, corresponding to 0.87°C warming) are also provided. This is summarised in the table below.PeriodDescription1981-2000 baselineAverage temperature (°C) for the period2001-2020 (recent past)Average temperature (°C) for the period2001-2020 (recent past) changeTemperature change (°C) relative to 1981-20001.5°C global warming level changeTemperature change (°C) relative to 1981-20002°C global warming level changeTemperature change (°C) relative to 1981-20002.5°C global warming level changeTemperature change (°C) relative to 1981-20003°C global warming level changeTemperature change (°C) relative to 1981-20004°C global warming level changeTemperature change (°C) relative to 1981-2000What is a global warming level?The Summer Maximum Temperature Change is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Summer Maximum Temperature Change an average is taken across the 21 year period.We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.What are the naming conventions and how do I explore the data?These data contain a field for each warming level and the 1981-2000 baseline. They are named 'tasmax summer change' (change in air 'temperature at surface'), the warming level or baseline, and 'upper' 'median' or 'lower' as per the description below. e.g. 'tasmax summer change 2.0 median' is the median value for summer for the 2.0°C warming level. Decimal points are included in field aliases but not in field names, e.g. 'tasmax summer change 2.0 median' is named 'tasmax_summer_change_20_median'. To understand how to explore the data, refer to the New Users ESRI Storymap. Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘tasmax summer change 2.0°C median’ values.What do the 'median', 'upper', and 'lower' values mean?Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models are run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Summer Maximum Temperature Change was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location.The ‘lower’ fields are the second lowest ranked ensemble member. The ‘higher’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and higher fields, the greater the uncertainty.‘Lower’, ‘median’ and ‘upper’ are also given for the baseline period as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past. Useful linksFor further information on the UK Climate Projections (UKCP).Further information on understanding climate data within the Met Office Climate Data Portal.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.7910/DVN/2HMEHBhttps://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.7910/DVN/2HMEHB
In order to uncover the best kept secret in today’s commercial aviation, this project deals with the calculation of fuel consumption of aircraft. With only the reference of the aircraft manufacturer’s information, given within the airport planning documents, a method is established that allows to computing values for the fuel consumption of every aircraft in question. The aircraft's fuel consumption per passenger and 100 flown kilometers decreases rapidly with range, until a near constant level is reached around the aircraft’s average range. At longer range, where payload reduction becomes necessary, fuel consumption increases significantly. Numerical results are visualized, explained, and discussed. With regard to today’s increasing number of long-haul flights, the results are investigated in terms of efficiency and viability. The environmental impact of burning fuel is not considered in this report. The presented method allows calculating aircraft type specific fuel consumption based on publicly available information. In this way, the fuel consumption of every aircraft can be investigated and can be discussed openly.
Studies utilizing Global Positioning System (GPS) telemetry rarely result in 100% fix success rates (FSR). Many assessments of wildlife resource use do not account for missing data, either assuming data loss is random or because a lack of practical treatment for systematic data loss. Several studies have explored how the environment, technological features, and animal behavior influence rates of missing data in GPS telemetry, but previous spatially explicit models developed to correct for sampling bias have been specified to small study areas, on a small range of data loss, or to be species-specific, limiting their general utility. Here we explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates for bias correction of terrestrial GPS-derived, large mammal habitat use. We also evaluate patterns in missing data that relate to potential animal activities that change the orientation of the antennae and characterize home-range probability of GPS detection for 4 focal species; cougars (Puma concolor), desert bighorn sheep (Ovis canadensis nelsoni), Rocky Mountain elk (Cervus elaphus ssp. nelsoni) and mule deer (Odocoileus hemionus). Part 1, Positive Openness Raster (raster dataset): Openness is an angular measure of the relationship between surface relief and horizontal distance. For angles less than 90 degrees it is equivalent to the internal angle of a cone with its apex at a DEM location, and is constrained by neighboring elevations within a specified radial distance. 480 meter search radius was used for this calculation of positive openness. Openness incorporates the terrain line-of-sight or viewshed concept and is calculated from multiple zenith and nadir angles-here along eight azimuths. Positive openness measures openness above the surface, with high values for convex forms and low values for concave forms (Yokoyama et al. 2002). We calculated positive openness using a custom python script, following the methods of Yokoyama et. al (2002) using a USGS National Elevation Dataset as input. Part 2, Northern Arizona GPS Test Collar (csv): Bias correction in GPS telemetry data-sets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix acquisition. We found terrain exposure and tall over-story vegetation are the primary environmental features that affect GPS performance. Model evaluation showed a strong correlation (0.924) between observed and predicted fix success rates (FSR) and showed little bias in predictions. The model's predictive ability was evaluated using two independent data-sets from stationary test collars of different make/model, fix interval programming, and placed at different study sites. No statistically significant differences (95% CI) between predicted and observed FSRs, suggest changes in technological factors have minor influence on the models ability to predict FSR in new study areas in the southwestern US. The model training data are provided here for fix attempts by hour. This table can be linked with the site location shapefile using the site field. Part 3, Probability Raster (raster dataset): Bias correction in GPS telemetry datasets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix aquistion. We found terrain exposure and tall overstory vegetation are the primary environmental features that affect GPS performance. Model evaluation showed a strong correlation (0.924) between observed and predicted fix success rates (FSR) and showed little bias in predictions. The models predictive ability was evaluated using two independent datasets from stationary test collars of different make/model, fix interval programing, and placed at different study sites. No statistically significant differences (95% CI) between predicted and observed FSRs, suggest changes in technological factors have minor influence on the models ability to predict FSR in new study areas in the southwestern US. We evaluated GPS telemetry datasets by comparing the mean probability of a successful GPS fix across study animals home-ranges, to the actual observed FSR of GPS downloaded deployed collars on cougars (Puma concolor), desert bighorn sheep (Ovis canadensis nelsoni), Rocky Mountain elk (Cervus elaphus ssp. nelsoni) and mule deer (Odocoileus hemionus). Comparing the mean probability of acquisition within study animals home-ranges and observed FSRs of GPS downloaded collars resulted in a approximatly 1:1 linear relationship with an r-sq= 0.68. Part 4, GPS Test Collar Sites (shapefile): Bias correction in GPS telemetry data-sets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix acquisition. We found terrain exposure and tall over-story vegetation are the primary environmental features that affect GPS performance. Model evaluation showed a strong correlation (0.924) between observed and predicted fix success rates (FSR) and showed little bias in predictions. The model's predictive ability was evaluated using two independent data-sets from stationary test collars of different make/model, fix interval programming, and placed at different study sites. No statistically significant differences (95% CI) between predicted and observed FSRs, suggest changes in technological factors have minor influence on the models ability to predict FSR in new study areas in the southwestern US. Part 5, Cougar Home Ranges (shapefile): Cougar home-ranges were calculated to compare the mean probability of a GPS fix acquisition across the home-range to the actual fix success rate (FSR) of the collar as a means for evaluating if characteristics of an animal’s home-range have an effect on observed FSR. We estimated home-ranges using the Local Convex Hull (LoCoH) method using the 90th isopleth. Data obtained from GPS download of retrieved units were only used. Satellite delivered data was omitted from the analysis for animals where the collar was lost or damaged because satellite delivery tends to lose as additional 10% of data. Comparisons with home-range mean probability of fix were also used as a reference for assessing if the frequency animals use areas of low GPS acquisition rates may play a role in observed FSRs. Part 6, Cougar Fix Success Rate by Hour (csv): Cougar GPS collar fix success varied by hour-of-day suggesting circadian rhythms with bouts of rest during daylight hours may change the orientation of the GPS receiver affecting the ability to acquire fixes. Raw data of overall fix success rates (FSR) and FSR by hour were used to predict relative reductions in FSR. Data only includes direct GPS download datasets. Satellite delivered data was omitted from the analysis for animals where the collar was lost or damaged because satellite delivery tends to lose approximately an additional 10% of data. Part 7, Openness Python Script version 2.0: This python script was used to calculate positive openness using a 30 meter digital elevation model for a large geographic area in Arizona, California, Nevada and Utah. A scientific research project used the script to explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates for bias correction of terrestrial GPS-derived, large mammal habitat use.
[Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 0.13°C.]What does the data show? This dataset shows the change in annual temperature for a range of global warming levels, including the recent past (2001-2020), compared to the 1981-2000 baseline period. Note, as the values in this dataset are averaged over a year they do not represent possible extreme conditions.The dataset uses projections of daily average air temperature from UKCP18 which are averaged to give values for the 1981-2000 baseline, the recent past (2001-2020) and global warming levels. The warming levels available are 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. The recent past value and global warming level values are stated as a change (in °C) relative to the 1981-2000 value. This enables users to compare annual average temperature trends for the different periods. In addition to the change values, values for the 1981-2000 baseline (corresponding to 0.51°C warming) and recent past (2001-2020, corresponding to 0.87°C warming) are also provided. This is summarised in the table below.
PeriodDescription 1981-2000 baselineAverage temperature (°C) for the period 2001-2020 (recent past)Average temperature (°C) for the period 2001-2020 (recent past) changeTemperature change (°C) relative to 1981-2000 1.5°C global warming level changeTemperature change (°C) relative to 1981-2000 2°C global warming level changeTemperature change (°C) relative to 1981-20002.5°C global warming level changeTemperature change (°C) relative to 1981-2000 3°C global warming level changeTemperature change (°C) relative to 1981-2000 4°C global warming level changeTemperature change (°C) relative to 1981-2000What is a global warming level?The Annual Average Temperature Change is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Average Temperature Change, an average is taken across the 21 year period.We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.What are the naming conventions and how do I explore the data?This data contains a field for the 1981-2000 baseline, 2001-2020 period and each warming level. They are named 'tas annual change' (change in air 'temperature at surface'), the warming level or historic time period, and 'upper' 'median' or 'lower' as per the description below. e.g. 'tas annual change 2.0 median' is the median value for the 2.0°C warming level. Decimal points are included in field aliases but not in field names, e.g. 'tas annual change 2.0 median' is named 'tas_annual_change_20_median'. To understand how to explore the data, refer to the New Users ESRI Storymap. Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘tas annual change 2.0°C median’ values.What do the 'median', 'upper', and 'lower' values mean?Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models are run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Average Temperature Change was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location.The ‘lower’ fields are the second lowest ranked ensemble member. The ‘higher’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble.This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and higher fields, the greater the uncertainty.‘Lower’, ‘median’ and ‘upper’ are also given for the baseline period as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past. Useful linksFor further information on the UK Climate Projections (UKCP).Further information on understanding climate data within the Met Office Climate Data Portal.
This dataset consists of mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.
## Example questions
Question: Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.
Answer: 4
Question: Calculate -841880142.544 + 411127.
Answer: -841469015.544
Question: Let x(g) = 9*g + 1. Let q(c) = 2*c + 1. Let f(i) = 3*i - 39. Let w(j) = q(x(j)). Calculate f(w(a)).
Answer: 54*a - 30
It contains 2 million (question, answer) pairs per module, with questions limited to 160 characters in length, and answers to 30 characters in length. Note the training data for each question type is split into "train-easy", "train-medium", and "train-hard". This allows training models via a curriculum. The data can also be mixed together uniformly from these training datasets to obtain the results reported in the paper. Categories: