Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To construct CFA, MCFA, and maximum MCFA with LISREL v.8 and below, we provide iMCFA (integrated Multilevel Confirmatory Analysis) to examine the potential multilevel factorial structure in complex survey data. Modeling multilevel structure for complex survey data is complicated because building a multilevel model is not an infallible statistical strategy unless the hypothesized model is close to the real data structure. Methodologists have suggested using different modeling techniques to investigate the potential multilevel structure of survey data. Using iMCFA, researchers can visually set the between- and within-level factorial structure to fit MCFA, CFA and/or MAX MCFA models for complex survey data. iMCFA can then yield between- and within-level variance-covariance matrices, calculate intraclass correlations, perform the analyses, and generate the outputs for the respective models. The summary of the analytical outputs from LISREL is gathered and tabulated for further model comparison and interpretation. iMCFA also provides the LISREL syntax of the different models for researchers' future use. An empirical and a simulated multilevel dataset with complex and simple structures at the within or between level were used to illustrate the usability and effectiveness of the iMCFA procedure for analyzing complex survey data. The analytic results of iMCFA using Muthén's limited-information estimator were compared with those of Mplus using full-information maximum likelihood regarding the effectiveness of the different estimation methods.
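As background for the intraclass correlations mentioned above, a generic one-way ANOVA estimate of the ICC can be sketched as follows (this is the textbook ICC(1) formula applied to clustered data, not iMCFA's own implementation):

# Sketch: one-way ANOVA estimate of the intraclass correlation ICC(1) for clustered data.
import numpy as np

def icc1(values, groups):
    # values: 1D array of observations; groups: same-length array of cluster ids
    values, groups = np.asarray(values, float), np.asarray(groups)
    labels = np.unique(groups)
    k = values.size / labels.size                                # average cluster size
    grand = values.mean()
    msb = sum(np.sum(groups == g) * (values[groups == g].mean() - grand) ** 2
              for g in labels) / (labels.size - 1)               # between-cluster mean square
    msw = sum(np.sum((values[groups == g] - values[groups == g].mean()) ** 2)
              for g in labels) / (values.size - labels.size)     # within-cluster mean square
    return (msb - msw) / (msb + (k - 1) * msw)

rng = np.random.default_rng(0)
groups = np.repeat(np.arange(50), 10)                            # 50 clusters of size 10
y = rng.normal(size=50).repeat(10) + rng.normal(size=500)        # between + within variance
print(round(icc1(y, groups), 3))                                 # roughly 0.5 for this setup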
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set contains all data used in the paper "Bi-level identification of complex dynamical systems through reinforcement learning". In the BILLIE (Bi-level Identification of Equations) algorithm proposed in the paper, two sets of orthogonal data (denoted s1 and s2) were used in each identification case. A "_s1" (or "_s2") suffix in a dataset's name means that s1 (or s2) was sampled from that dataset. Datasets without "_s1" (or "_s2") in the name indicate that both s1 and s2 were sampled from that dataset.
Each of the four folders is detailed below.
1. The folder "Navier-Stokes equation" contains the simulated data of the Navier-Stokes equation for the three fluid-dynamics identification cases. Naming: "NS_(2D or 3D)_(Reynolds number)_(s1/s2 if any)".
Case 1: 2D flow with Reynolds number 100. The two sets of data are structured on a 256x256 grid within the 2pi x 2pi spatial domain, i.e., dx = dy = 2pi/256. The time step is dt = 0.0015. The data are organized as [T, X, Y, C], where T is the temporal dimension, X and Y are the spatial dimensions, and C = [V, U, P], where U and V are the two fluid velocity components along the two spatial dimensions, respectively, and P is the pressure scalar.
Case 2: 2D flow with Reynolds number 1000. The two sets of data are structured on a 2048x2048 grid within the 2pi x 2pi spatial domain, i.e., dx = dy = 2pi/2048. The time step is dt = 0.00005. The data are organized as [T, X, Y, C], identically to Case 1.
Case 3: 3D flow with Reynolds number 100. This is the data of a flow around a cylinder published by Raissi in "Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations" (DOI: 10.1126/science.aaw4741).
2. The folder "Burgers' equation" contains the simulated data of Burgers' equation for the experiments on small-coefficient terms, noise, and sparsity. Ground-truth equation: u_t = lambda*u_xx - u*u_x. Naming: "Burgers_coef_(lambda)_(grid setting: spatial x temporal)_(s1/s2)". The datasets were simulated on a [-8, 8] spatial domain and a [0, 10] temporal domain with structured grids of different levels of sparsity. The noise tests were performed on "Burgers_coef_1e-1_257x101_s1.mat" and "Burgers_coef_1e-1_257x101_s2.mat" by manually adding Gaussian noise.
3. The folder "Three body" contains the simulated data of the three-body system for the experiments on small-coefficient terms, noise, and sparsity. Naming: "three_body_coef_(lambda, see paper for meaning)_(s1/s2)". Each dataset contains a dictionary of 7 keys: x, y, z, u, v, w, dt. The time step is dt = 0.005. The noise tests were performed on "three_body_coef_1e0_s1.mat" and "three_body_coef_1e0_s2.mat" by manually adding Gaussian noise.
4. The folder "Single-cell sequencing data" contains the two sets of preprocessed multi-omics single-cell sequencing datasets used in identifying RNA and protein velocity. The original datasets GSM2695381 and GSM2695382 are publicly available in the Gene Expression Omnibus ("Large-scale simultaneous measurement of epitopes and transcriptomes in single cells").
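As a rough illustration of how the array layouts described above can be accessed (the three-body file and key names follow the description; the Navier-Stokes array below is only a placeholder, since the NS file format is not specified here):

# Sketch: accessing the array layouts described above with NumPy/SciPy.
import numpy as np
from scipy.io import loadmat

# Three-body data: a dictionary with keys x, y, z, u, v, w, dt (per the description).
tb = loadmat("three_body_coef_1e0_s1.mat")
x, y, z = tb["x"], tb["y"], tb["z"]       # positions
u, v, w = tb["u"], tb["v"], tb["w"]       # velocities
dt = tb["dt"].item()                      # time step (0.005)

# Navier-Stokes data: organized as [T, X, Y, C] with C = [V, U, P].
ns = np.zeros((10, 256, 256, 3))          # placeholder array shaped like Case 1
V, U, P = ns[..., 0], ns[..., 1], ns[..., 2]
print(x.shape, dt, U.shape)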
https://creativecommons.org/publicdomain/zero/1.0/
By math_dataset (From Huggingface) [source]
This dataset comprises a collection of mathematical problems and their solutions designed for training and testing purposes. Each problem is presented in the form of a question, followed by its corresponding answer. The dataset covers various mathematical topics such as arithmetic, polynomials, and prime numbers. For instance, the arithmetic_nearest_integer_root_test.csv file focuses on problems involving finding the nearest integer root of a given number. Similarly, the polynomials_simplify_power_test.csv file deals with problems related to simplifying polynomials with powers. Additionally, the dataset includes the numbers_is_prime_train.csv file containing math problems that require determining whether a specific number is prime or not. The questions and answers are provided in text format to facilitate analysis and experimentation with mathematical problem-solving algorithms or models.
Introduction: The Mathematical Problems Dataset contains a collection of various mathematical problems and their corresponding solutions or answers. This guide will provide you with all the necessary information on how to utilize this dataset effectively.
Understanding the columns: The dataset consists of several columns, each representing a different aspect of the mathematical problem and its solution. The key columns are:
- question: This column contains the text representation of the mathematical problem or equation.
- answer: This column contains the text representation of the solution or answer to the corresponding problem.
Exploring specific problem categories: To focus on specific types of mathematical problems, you can filter or search within the dataset using relevant keywords or terms related to your area of interest. For example, if you are interested in prime numbers, you can search for "prime" in the question column.
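For example, a minimal pandas sketch of such a keyword filter, using the file and column names mentioned above:

# Sketch: filter one of the CSV files for prime-number questions.
import pandas as pd

df = pd.read_csv("numbers_is_prime_train.csv")             # columns: question, answer
prime_problems = df[df["question"].str.contains("prime", case=False, na=False)]
print(prime_problems.head())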
Applying machine learning techniques: This dataset can be used for training machine learning models related to natural language understanding and mathematics. You can explore various techniques such as text classification, sentiment analysis, or even sequence-to-sequence models for solving mathematical problems based on their textual representations.
Generating new questions and solutions: By analyzing patterns in this dataset, you can generate new questions and solutions programmatically using techniques like data augmentation or rule-based methods.
Validation and evaluation: As with any other machine learning task, it is essential to properly validate your models on separate validation sets not included in this dataset. You can also evaluate model performance by comparing predictions against the known answers provided in this dataset's answer column.
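As an illustration, a minimal exact-match evaluation against the answer column might look like this (the predict function is a placeholder for a trained model):

# Sketch: exact-match accuracy of model predictions against the answer column.
import pandas as pd

def predict(question):
    return "2"                                              # placeholder prediction

df = pd.read_csv("arithmetic_nearest_integer_root_test.csv")    # columns: question, answer
predictions = df["question"].apply(predict).astype(str).str.strip()
accuracy = (predictions == df["answer"].astype(str).str.strip()).mean()
print(f"Exact-match accuracy: {accuracy:.3f}")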
Sharing insights and findings: After working with this dataset, researchers and educators are encouraged to share their insights and the approaches taken during analysis and modelling as Kaggle notebooks, discussions, blogs, or tutorials, so that others can benefit from these shared resources.
Note: the dataset does not include dates.
By following these guidelines, you can effectively explore and utilize the Mathematical Problems Dataset for various mathematical problem-solving tasks. Happy exploring!
- Developing machine learning algorithms for solving mathematical problems: This dataset can be used to train and test models that can accurately predict the solution or answer to different mathematical problems.
- Creating educational resources: The dataset can be used to create a wide variety of educational materials such as problem sets, worksheets, and quizzes for students studying mathematics.
- Research in mathematical problem-solving strategies: Researchers and educators can analyze the dataset to identify common patterns or strategies employed in solving different types of mathematical problems. This analysis can help improve teaching methodologies and develop effective problem-solving techniques.
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purpos...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ordinary differential equations (ODEs) are widely used to model the dynamic behavior of a complex system. Parameter estimation and variable selection for a "Big System" with linear ODEs are very challenging due to the need for nonlinear optimization in an ultra-high dimensional parameter space. In this article, we develop a parameter estimation and variable selection method based on the ideas of similarity transformation and separable least squares (SLS). Simulation studies demonstrate that the proposed matrix-based SLS method could be used to estimate the coefficient matrix more accurately and perform variable selection for a linear ODE system with thousands of dimensions and millions of parameters much better than the direct least squares method and the vector-based two-stage method that are currently available. We applied this new method to two real datasets (a yeast cell cycle gene expression dataset with 30 dimensions and 930 unknown parameters, and the Standard & Poor's 1500 index stock price data with 1250 dimensions and 1,563,750 unknown parameters) to illustrate the utility and numerical performance of the proposed parameter estimation and variable selection method for big systems in practice. Supplementary materials for this article are available online.
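For context, the direct least squares baseline mentioned above can be sketched for a linear ODE system x'(t) = A x(t): approximate the derivatives by finite differences and regress them on the states (a toy illustration, not the proposed SLS method):

# Sketch: direct least squares estimate of the coefficient matrix A in x'(t) = A x(t).
import numpy as np

rng = np.random.default_rng(0)
d, n, dt = 5, 400, 0.01
A_true = rng.normal(scale=0.3, size=(d, d))

X = np.zeros((n, d))                       # simulate a trajectory with forward Euler
X[0] = rng.normal(size=d)
for t in range(n - 1):
    X[t + 1] = X[t] + dt * A_true @ X[t]

dX = (X[1:] - X[:-1]) / dt                 # finite-difference derivatives
A_hat = np.linalg.lstsq(X[:-1], dX, rcond=None)[0].T   # solve dX ~ X A^T
print("max abs error:", np.abs(A_hat - A_true).max())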
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was gathered by us from three software houses. It is a real-life dataset. The Use Case Points (UCP) method, as originated by Karner, was used for counting the number of steps and the number of actors. The data are based on different programming languages and various problem domains. The ISBSG style for language, domain, and application type was adopted.
Attributes are used as follows:
Project_No - only project ID for identification purposes
Simple Actors - Number of actors classified as simple according to UCP.
Average Actors - Number of actors classified as average according to UCP.
Complex Actors - Number of actors classified as complex according to UCP.
UAW - Unadjusted Actor Weight, computed by using the UCP equation.
Simple UC - Number of use cases classified as simple - UCP number of steps is used.
Average UC - Number of use cases classified as average - UCP number of steps is used.
Complex UC - Number of use cases classified as complex - UCP number of steps is used.
UUCW - Unadjusted Use Case Weight, computed by using the UCP equation.
TCF - Technical Complexity Factor.
ECF - Environmental Complexity Factor.
Real_P20 - Real Effort in person-hours, derived using a productivity factor (PF = 20).
Real_Effort_Person_Hours - Real Effort (development time) in person-hours.
Sector - Problem domain of the project.
Language - Programming language used for the project.
Methodology - Development methodology used for project development.
ApplicationType - Classification of project type - provided by donator.
DataDonator - anonymized acronym for data donator.
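For reference, a minimal sketch of how these attributes combine in Karner's UCP effort estimate, with made-up example values (UCP = (UAW + UUCW) * TCF * ECF; Effort = UCP * PF):

# Sketch: Use Case Points effort estimate from the attributes listed above.
def ucp_effort(uaw, uucw, tcf, ecf, pf=20.0):
    uucp = uaw + uucw            # Unadjusted Use Case Points
    ucp = uucp * tcf * ecf       # adjusted Use Case Points
    return ucp * pf              # effort in person-hours (PF = 20, as in Real_P20)

print(ucp_effort(uaw=9, uucw=150, tcf=0.95, ecf=0.85))   # made-up example values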
Description
This sound field image dataset contains clean-noisy pairs of complex-valued sound-field images generated by 2D acoustic simulations. The dataset was initially prepared for the deep sound-field denoiser (https://github.com/nttcslab/deep-sound-field-denoiser), a DNN-based denoising method for optically measured sound fields. Since the data are two-dimensional sound fields based on the Helmholtz equation, the dataset can also be used for other acoustic applications. Please check our GitHub repository and paper for details.
Directory structure
The dataset contains three directories: training, validation, and evaluation. Each directory contains "soundsource#" sub-directories (# represents the number of sound sources used in the acoustic simulation). Each sub-directory has three h5 files for data (clean, white noise, and speckle noise) and three CSV files listing random parameter values used in the simulation.
/training
/soundsource#
constants.csv
random_variable_ranges.csv
random_variables.csv
sf_true.h5
sf_noise_white.h5
sf_noise_speckle.h5
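A minimal sketch for inspecting one of the HDF5 files (the sub-directory name and the dataset keys inside the files are assumptions, so the example simply lists the keys first):

# Sketch: inspect the clean sound-field file of one sound-source subset.
import h5py

with h5py.File("training/soundsource1/sf_true.h5", "r") as f:
    print(list(f.keys()))                  # discover the dataset names
    # data = f[list(f.keys())[0]][...]     # load an array once the key is known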
Condition of use
This dataset is available under the attached license file. Read the terms and conditions in NTTSoftwareLicenseAgreement.pdf carefully.
Citation
If you use this dataset, please cite the following paper.
K. Ishikawa, D. Takeuchi, N. Harada, and T. Moriya, "Deep sound-field denoiser: optically-measured sound-field denoising using deep neural network," arXiv:2304.14923 (2023).
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Datasets:
Figure data:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MATLAB code and the generated datasets for the Lyapunov exponent spectra of the Kuramoto-Sivashinsky PDE, as published in the paper 'Lyapunov Exponents of the Kuramoto-Sivashinsky PDE' in 2019. The files are organised as follows:
Code
code/lyapunovexpts.m contains a MATLAB function implementation of Algorithm 1 from the paper, which is the classic algorithm for finding Lyapunov exponents introduced by Benettin et al. (1980) and Shimada and Nagashima (1979).
code/dudt_ksperiodic_spectral.m contains a vectorised ODE-discretisation of the Kuramoto-Sivashinsky PDE on a periodic domain using a spectral scheme, which can be used with the standard MATLAB ODE solvers to simulate the dynamics.
code/dudt_ksoddperiodic_finitediff.m contains a similar vectorised ODE-discretisation of the Kuramoto-Sivashinsky PDE, but for the "odd-periodic" domain (u = u_xx = 0 on x = 0, L) and using a finite-difference scheme with error O(dx^2).
code/research_kslyaps{.m,.sh} contain the code that ran the computational experiments (using the above Lyapunov exponents code and the Kuramoto-Sivashinsky ODE-discretisations) to generate the Lyapunov spectra data using MATLAB 2016a on the School of Mathematical Sciences' maths1 Linux server in 2017.
Data
lyapexpts_ksperiodic.zip contains the Lyapunov spectra computed for the Kuramoto-Sivashinsky PDE on the periodic domain. Each file has a filename of the form LXYZpW.txt and contains the 24 most positive Lyapunov exponents computed on the periodic domain [0, L] where L = XYZ.W. (E.g., L097p4.txt contains the exponents for L = 97.4.)
lyapexpts_ksoddperiodic.zip contains the Lyapunov spectra computed for the Kuramoto-Sivashinsky PDE on the "odd-periodic" domain. Each file has a filename of the form LXYZpW.txt and contains the 24 most positive Lyapunov exponents computed on the odd-periodic domain [0, L] where L = XYZ.W.
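A small sketch for reading one of the spectra files after unzipping, assuming plain whitespace-separated text and using the LXYZpW.txt naming convention described above:

# Sketch: load the 24 Lyapunov exponents for one domain length L from a spectra file.
import numpy as np

fname = "L097p4.txt"                       # LXYZpW.txt -> L = XYZ.W (here L = 97.4)
L = float(fname[1:-4].replace("p", "."))   # parse the domain length from the name
exponents = np.loadtxt(fname)              # the 24 most positive Lyapunov exponents
print(L, exponents.shape)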
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Training and testing data to identify the QCD transition using deep learning and traditional machine learning.
1. training_data.csv, testing_iebevishnu.csv, testing_ipglasma.csv
There are 723 entries in each row. The first row is the description of the data. The 0th entry in each row is the event id, entries 1 to 720 are the pion density distribution at mid-rapidity -- rho(pt, phi) at 15 different pt bins and 48 different azimuthal-angle phi bins, with phi as the inner loop. The 721st entry is the equation-of-state type (0 or 1). The 722nd entry is extra information for each event.
2. training_observables.csv, test_iebe_observables.csv, test_ipglasma_observables.csv
There are 87 entries in each row. The first row is the header which describes the data in the following rows. In the following rows, the first entry is the event id, the second entry is the equation-of-state type (0 or 1), and the remaining entries are 85 observables computed from the raw spectra, which are used in various traditional classifiers from machine-learning toolboxes.
The meaning of these entries and the Monte Carlo model used to generate these data can be found in the paper: http://inspirehep.net/record/1503189
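A minimal sketch of reading the raw spectra and restoring the (pt, phi) shape, following the layout described above (15 pt bins x 48 phi bins, phi as the inner loop):

# Sketch: load training_data.csv and reshape the spectra to (events, 15 pt bins, 48 phi bins).
import pandas as pd

df = pd.read_csv("training_data.csv")                     # first row is the description/header
event_id = df.iloc[:, 0].to_numpy()
rho = df.iloc[:, 1:721].to_numpy().reshape(-1, 15, 48)    # phi is the inner (fastest) index
eos_type = df.iloc[:, 721].to_numpy()                     # equation-of-state label, 0 or 1
print(rho.shape, eos_type.shape)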
Dataset Card for Dataset Name
Contains text reasoning chains from Qwen2.5-MATH-7B-Instruct, generated from the standard prompt: [ { "role": "user", "content": "Solve the following math problem efficiently and clearly: - For simple problems (2 steps or fewer): Provide a concise solution with minimal equation. - For complex problems (3 steps or more): Use this step-by-step format:
... Regardless of the approach… See the full description on the dataset page: https://huggingface.co/datasets/smoorsmith/gsm8k_txt_Qwen2.5-MATH-7B-Instruct.
Spatially varied resources and threats govern the persistence of African lions across dynamic protected areas. An important precursor to effective conservation for lions requires assessing tradeoffs in space use due to heterogeneity in habitat, resources, and human presence between national parks and hunting areas, the dominant land-use classifications across their range.
We conducted the largest camera survey in West Africa, encompassing 3 national parks and 11 hunting concessions in Burkina Faso and Niger, equating to half of the 26,500-km2 transboundary W-Arly-Pendjari (WAP) protected area complex. We combined occupancy and structural equation modeling to disentangle the relative effects of environmental, ecological, and anthropogenic variables influencing space use of Critically Endangered lions across 21,430 trap-nights from 2016-2018.
National parks are intended to serve as refuges from human pressures, and thus we expect higher lion occupancy in national parks (NPs) comp...
The Bushland Reference ET Calculator was developed at the USDA-ARS Conservation and Production Research Laboratory, Bushland, Texas. Although it was designed and developed for use mainly by producers and crop consultants to manage irrigation scheduling, it can also be used in educational training, research, and other practical applications. It uses the ASCE Standardized Reference Evapotranspiration (ET) Equation for calculating grass and alfalfa reference ET at hourly and daily time steps. The program uses the more complex equation for estimating clear-sky solar radiation provided in Appendix D of the ASCE-EWRI ET Manual. Users have the option of using a single set or time series of weather data to calculate reference ET. Daily reference ET can be calculated either by summing the hourly ET values for a given day or by using averages of the climatic data.
Resources in this dataset:
Resource Title: Bushland ET Calculator download page. File Name: Web Page, url: https://www.ars.usda.gov/research/software/download/?softwareid=Bushland+ET+Calculator&modecode=30-90-05-00
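For orientation, the daily form of the ASCE Standardized Reference ET equation that the calculator is built around can be sketched as follows (Cn and Cd are the standard daily constants for the grass and alfalfa references; the clear-sky radiation routine from Appendix D is not reproduced here, and this is a generic illustration rather than the calculator's own code):

# Sketch: daily ASCE Standardized Reference ET in mm/day.
import math

def asce_ref_et_daily(Rn, G, T, u2, es, ea, P, surface="grass"):
    # Rn, G in MJ m-2 d-1; T in deg C; u2 in m/s at 2 m; es, ea, P in kPa
    Cn, Cd = (900.0, 0.34) if surface == "grass" else (1600.0, 0.38)
    delta = 4098.0 * (0.6108 * math.exp(17.27 * T / (T + 237.3))) / (T + 237.3) ** 2
    gamma = 0.000665 * P                   # psychrometric constant
    num = 0.408 * delta * (Rn - G) + gamma * (Cn / (T + 273.0)) * u2 * (es - ea)
    return num / (delta + gamma * (1.0 + Cd * u2))

# Example with plausible summer values (illustrative only).
print(asce_ref_et_daily(Rn=20.0, G=0.0, T=28.0, u2=2.5, es=3.78, ea=1.8, P=90.0))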
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This data set was generated for the spin-boson model using the hierarchical equations of motion (HEOM) method for 1,000 model parameters. The data set contains the reduced density matrices of the two-level system.
See related materials in Collection at: https://doi.org/10.25452/figshare.plus.c.6389553
Collection description: Simulations of the dynamics of dissipative quantum systems utilize many methods such as physics-based quantum, semiclassical, and quantum-classical as well as machine learning-based approximations, development and testing of which requires diverse data sets. Here we present a new database QD3SET-1 containing eight data sets of quantum dynamical data for two systems of broad interest, spin-boson (SB) model and the Fenna–Matthews–Olson (FMO) complex, generated with two different methods solving the dynamics, approximate local thermalizing Lindblad master equation (LTLME) and highly accurate hierarchy equations of motion (HEOM). One data set was generated with the SB model which is a two-level quantum system coupled to a harmonic environment using HEOM for 1,000 model parameters. Seven data sets were collected for the FMO complex of different sizes (7- and 8-site monomer and 24-site trimer with LTLME and 8-site monomer with HEOM) for 500–879 model parameters. Our QD3SET-1 database contains both population and coherence dynamics data and part of it has been already used for machine learning-based quantum dynamics studies.
When complex mechanisms are involved, pinpointing high-performance materials within large databases is a major challenge in materials discovery. We focus here on phonon-limited conductivities, and study 2D semiconductors doped by field effects. Using state-of-the-art density-functional perturbation theory and the Boltzmann transport equation, we discuss 11 monolayers with outstanding transport properties. These materials are selected from a computational database of exfoliable materials providing monolayers that are dynamically stable and that do not have more than 6 atoms per unit cell. We first analyze electron-phonon scattering in two well-known systems: electron-doped InSe and hole-doped phosphorene. Both are single-valley systems with weak electron-phonon interactions, but they represent two distinct pathways to fast transport: a steep and deep isotropic valley for the former and strongly anisotropic electron-phonon physics for the latter. We identify similar features in the database and compute the conductivities of the relevant monolayers. This process yields several high-conductivity materials, some of them only very recently emerging in the literature (GaSe, Bi₂SeTe₂, Bi₂Se₃, Sb₂SeTe₂), others never discussed in this context (AlLiTe₂, BiClTe, ClGaTe, AuI). Comparing these 11 monolayers in detail, we discuss how the strength and angular dependency of the electron-phonon scattering drives key differences in the transport performance of materials despite similar valley structures. We also discuss the high conductivity of hole-doped WSe₂, and how this case study shows the limitations of a selection process based on band properties alone. In this entry we provide the AiiDA database with the calculations of phonons and electron-phonon interactions for the 11 materials, along with the Python library to collect and visualise the data, solve the Boltzmann transport equation, and launch the same workflows for other 2D materials. To guide the reader, we include a Jupyter notebook showing how to extract the data, use the basic functionalities of the library, and regenerate the plots included in the associated paper.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Biological processes exhibit complex temporal dependencies due to the sequential nature of allocation decisions in organisms' life-cycles, feedback loops, and two-way causality. Consequently, longitudinal data often contain cross-lags: the predictor variable depends on the response variable of the previous time-step. Although statisticians have warned that regression models that ignore such covariate endogeneity in time series are likely to be inappropriate, this has received relatively little attention in biology. Furthermore, the resulting degree of estimation bias remains largely unexplored.
We use a graphical model and numerical simulations to understand why and how regression models that ignore cross-lags can be biased, and how this bias depends on the length and number of time series. Ecological and evolutionary examples are provided to illustrate that cross-lags may be more common than is typically appreciated and that they occur in functionally different ways.
We show that routinely used regression models that ignore cross-lags are asymptotically unbiased. However, this offers little relief, as for most realistically feasible lengths of time series conventional methods are biased. Furthermore, collecting time series on multiple subjects (such as populations, groups or individuals) does not help to overcome this bias when the analysis focusses on within-subject patterns (often the pattern of interest). Simulations (R tutorials 1 & 2), a literature search and a real-world empirical example on fairy wrens (data archived here with analyses presented in R tutorial 3) together suggest that approaches that ignore cross-lags are likely biased in the direction opposite to the sign of the cross-lag (e.g. towards detecting density-dependence of vital rates and against detecting life-history trade-offs and benefits of group living). Next, we show that multivariate (e.g. structural equation) models can dynamically account for cross-lags, and simultaneously address additional bias induced by measurement error, but only if the analysis considers multiple time series.
We provide guidance on how to identify a cross-lag and subsequently specify it in a multivariate model, which can be far from trivial. Our tutorials with data and R code of the worked examples provide step‐by‐step instructions on how to perform such analyses.
Our study offers insights into situations in which cross-lags can bias analysis of ecological and evolutionary time series and suggests that adopting dynamical models can be important, as this directly affects our understanding of population regulation, the evolution of life histories and cooperation, and possibly many other topics. Determining how strong estimation bias due to ignoring covariate endogeneity has been in the ecological literature requires further study, also because it may interact with other sources of bias.
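As a minimal numerical illustration of the within-subject bias described above (a toy Python simulation, not the authors' R tutorials): the predictor responds to the previous response through a positive cross-lag, and within-subject centering of short series biases the pooled regression slope downward, with the bias vanishing only for long series.

# Sketch: bias of naive within-subject regression when a cross-lag is ignored.
import numpy as np

rng = np.random.default_rng(1)

def naive_slope(n_subjects=500, T=5, b=0.5, c=0.7):
    # y_t = b*x_t + u_t, with cross-lag x_t = c*y_{t-1} + e_t
    xs, ys = [], []
    for _ in range(n_subjects):
        x, y = np.zeros(T), np.zeros(T)
        y_prev = rng.normal()
        for t in range(T):
            x[t] = c * y_prev + rng.normal()
            y[t] = b * x[t] + rng.normal()
            y_prev = y[t]
        xs.append(x - x.mean())            # within-subject centering
        ys.append(y - y.mean())
    xs, ys = np.concatenate(xs), np.concatenate(ys)
    return np.polyfit(xs, ys, 1)[0]        # pooled OLS slope

for T in (3, 5, 10, 50):
    print(T, round(naive_slope(T=T), 3))   # approaches the true b = 0.5 only as T grows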
Thermodynamic models and, in particular, Statistical Associating Fluid Theory (SAFT)-type equations are vital in characterizing complex systems. This paper presents a framework for sampling parameter distributions in the PC-SAFT and SAFT-VR Mie equations of state to examine parameter confidence intervals and correlations. Comparing the equations of state, we find that the additional parameters introduced in the SAFT-VR Mie equation increase relative uncertainties (from 1%–2% to 3%–4%) and introduce more correlations. These correlations can be attributed to conserved quantities such as particle volume and interaction energy. When incorporating association through additional parameters, relative uncertainties increase further while correlations between parameters are slightly reduced. We also investigate how uncertainties in the parameters propagate to the properties predicted by these equations of state. While the uncertainties for the regressed properties remain small, uncertainties can become significant when extrapolating to new properties. This is particularly true near the critical point, where we observe that properties dependent on the isothermal compressibility exhibit massive divergences in uncertainty. We find that these divergences are intrinsic to these equations of state and, as a result, will always be present regardless of how small the parameter uncertainties are.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chemists continuously harvest the power of non-covalent interactions to control phenomena in both the micro- and macroscopic worlds. From the quantum chemical perspective, the strategies essentially rely upon an in-depth understanding of the physical origin of these interactions, the quantification of their magnitude and their visualization in real-space. The total electron density ρ(r) represents the simplest yet most comprehensive piece of information available for fully characterizing bonding patterns and non-covalent interactions. The charge density of a molecule can be computed by solving the Schrödinger equation, but this approach becomes rapidly demanding if the electron density has to be evaluated for thousands of different molecules or very large chemical systems, such as peptides and proteins. Here we present a transferable and scalable machine-learning model capable of predicting the total electron density directly from the atomic coordinates. The regression model is used to access qualitative and quantitative insights beyond the underlying ρ(r) in a diverse ensemble of sidechain–sidechain dimers extracted from the BioFragment database (BFDb). The transferability of the model to more complex chemical systems is demonstrated by predicting and analyzing the electron density of a collection of 8 polypeptides.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
While all the approximate methods mentioned, or others that exist, give some specific solutions of generalized transcendental equations (or even polynomials), they cannot resolve them completely. What we ask when we solve a generalized transcendental or polynomial equation is to find the total number of roots, not separate sets of roots in some random or specified intervals, mainly because many categories of transcendental equations have an infinite number of solutions in the complex set. There are particular equations (logarithmic, trigonometric, power, or other functions) that solve specific problems in physics and for the most part need the generalized solution in C. The theory G.R.E-L addresses this: by working with hypergeometric functions, or their interlocking with other simple functions, and by using inverse functions, it gives a very satisfactory answer to the whole of this complex problem.
Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.7910/DVN/UW6FAP
Purpose – Simple equations and more extended models are developed to determine characteristic engine parameters: Specific fuel consumption (SFC), engine mass, and engine size characterized by engine length and diameter. SFC (c) is considered a linear function of speed: c = c_a * V + c_b. --- Methodology – Data from 718 engines are collected from various open sources into an Excel spreadsheet. The characteristic engine parameters are plotted as functions of bypass ratio (BPR), date of entry into service (EIS), take-off thrust, and typical cruise thrust. Engine length and diameter are plotted versus engine mass. Linear and nonlinear regression functions are investigated. Moreover, Singular Value Decomposition (SVD) is used to establish relations between parameters. SVD is used with Excel and MATLAB. The accuracy of all equations and models is compared. --- Findings – SFC should be calculated as a linear function of speed. This is especially important when SFC is extrapolated to unconventional (low) cruise speeds for jet engines. The two parameters c_a and c_b are best estimated from a logarithmic or power function of bypass ratio (BPR). SFC and c_b clearly improved over the years. Engine mass, diameter, and length are proportional to take-off thrust. Characteristic engine parameters can also be obtained from SVD with comparable accuracy. However, SVD is more complicated to set up than using a simple equation. --- Practical implications – Engine characteristics need to be estimated based on only a few known parameters for aircraft preliminary sizing, conceptual design, and aircraft optimization as well as for practical quick calculations in flight mechanics. This thesis provides the tools. --- Social implications – Most engine characteristics like SFC are considered company secrets. The availability of open access engine data is the first step, but wisdom is retrieved only with careful analysis of the data as done here. Openly available aircraft engineering knowledge helps to democratize the discussion about the ecological footprint of aviation. --- Originality/value – Simple equations for jet engine SFC, mass, and size, deduced from a large engine database, are offered. This approach delivered equations as a function of BPR with an error of only 6%, which is the same accuracy as more complex equations from the literature.
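A small sketch of the kind of regression described above: fit the linear-in-speed SFC model c = c_a * V + c_b for one engine, then relate c_b to the bypass ratio with a power-law (log-log) fit. All numbers below are made-up placeholders, not values from the thesis database:

# Sketch: fit c = c_a * V + c_b, then c_b as a power function of bypass ratio (BPR).
import numpy as np

V = np.array([180.0, 200.0, 220.0, 240.0])                # cruise speeds, m/s (made up)
c = np.array([1.45e-5, 1.52e-5, 1.60e-5, 1.67e-5])        # SFC values (made up)
c_a, c_b = np.polyfit(V, c, 1)                            # linear model c = c_a*V + c_b

BPR = np.array([1.0, 3.0, 5.0, 8.0, 10.0])                # bypass ratios of several engines
c_b_all = np.array([2.3e-5, 1.7e-5, 1.5e-5, 1.3e-5, 1.2e-5])
k, a = np.polyfit(np.log(BPR), np.log(c_b_all), 1)        # log-log fit: c_b = exp(a) * BPR**k
print(c_a, c_b, np.exp(a), k)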