71 datasets found
  1. Data from: DataSheet1.docx

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated May 31, 2023
    Cite
    Jiun-Yu Wu; Yuan-Hsuan Lee; John J. H. Lin (2023). DataSheet1.docx [Dataset]. http://doi.org/10.3389/fpsyg.2018.00251.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Jiun-Yu Wu; Yuan-Hsuan Lee; John J. H. Lin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To construct CFA, MCFA, and maximum MCFA models with LISREL v.8 and below, we provide iMCFA (integrated Multilevel Confirmatory Factor Analysis) to examine the potential multilevel factorial structure in complex survey data. Modeling multilevel structure for complex survey data is complicated because building a multilevel model is not an infallible statistical strategy unless the hypothesized model is close to the real data structure. Methodologists have suggested using different modeling techniques to investigate the potential multilevel structure of survey data. Using iMCFA, researchers can visually set the between- and within-level factorial structure to fit MCFA, CFA, and/or MAX MCFA models for complex survey data. iMCFA can then yield between- and within-level variance-covariance matrices, calculate intraclass correlations, perform the analyses, and generate the outputs for the respective models. The summary of the analytical outputs from LISREL is gathered and tabulated for further model comparison and interpretation. iMCFA also provides LISREL syntax for the different models for researchers' future use. An empirical and a simulated multilevel dataset with complex and simple structures in the within or between level were used to illustrate the usability and effectiveness of the iMCFA procedure for analyzing complex survey data. The analytic results of iMCFA using Muthen's limited-information estimator were compared with those of Mplus using full-information maximum likelihood to assess the effectiveness of the different estimation methods.
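    As background, the intraclass correlation that iMCFA calculates is conventionally estimated from a one-way ANOVA decomposition, ICC(1) = (MSB - MSW) / (MSB + (n - 1) * MSW). The sketch below is a generic illustration of that estimator, not code from iMCFA itself:

```python
# Generic one-way ANOVA estimator of the intraclass correlation ICC(1);
# an illustrative sketch only, not code taken from iMCFA.
def icc1(groups):
    k = len(groups)                      # number of clusters
    n = len(groups[0])                   # cluster size (balanced design assumed)
    grand = sum(sum(g) for g in groups) / (k * n)
    msb = n * sum((sum(g) / n - grand) ** 2 for g in groups) / (k - 1)
    msw = sum((x - sum(g) / n) ** 2 for g in groups for x in g) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)

# Two clusters whose means differ much more than their members do -> high ICC.
icc = icc1([[1, 2, 3], [4, 5, 6]])
```

    A high ICC is exactly the situation in which modeling the between level separately (as MCFA does) matters.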

  2. Bi-level Identification of Governing Equations for Nonlinear Physical...

    • zenodo.org
    txt
    Updated Apr 5, 2025
    Cite
    Zeyu Li (2025). Bi-level Identification of Governing Equations for Nonlinear Physical Systems [Dataset]. http://doi.org/10.5281/zenodo.15140828
    Explore at:
    Available download formats: txt
    Dataset updated
    Apr 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Zeyu Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    This data set contains all data used in the paper "Bi-level identification of complex dynamical systems through reinforcement learning".
    
    In the BILLIE (Bi-level Identification of Equations) algorithm proposed in the paper, two sets of orthogonal data (denoted s1 and s2) were used in each identification case.
    The suffix "_s1" (or "_s2") in a dataset's name means that s1 (or s2) was sampled from that dataset; datasets without "_s1" or "_s2" in the name mean that both s1 and s2 were sampled from that dataset.
    Each of the four folders is detailed below.

    1. The folder "Navier-Stokes equation" contains the simulated data of the Navier-Stokes equation for the three fluid-dynamics identification cases. Naming: "NS_(2D or 3D)_(Reynolds number)_(s1/s2 if any)".
       • Case 1: 2D flow with Reynolds number 100. The two sets of data are structured on a 256x256 grid within the 2pi x 2pi spatial domain, meaning dx = dy = 2pi/256. The time step is dt = 0.0015. The data are organized as [T, X, Y, C], where T is the temporal dimension, X and Y are the spatial dimensions, and C = [V, U, P], where U and V are the two fluid velocity components along the two spatial dimensions and P is the pressure scalar.
       • Case 2: 2D flow with Reynolds number 1000. The two sets of data are structured on a 2048x2048 grid within the 2pi x 2pi spatial domain, meaning dx = dy = 2pi/2048. The time step is dt = 0.00005. The data are organized as [T, X, Y, C], identically to Case 1.
       • Case 3: 3D flow with Reynolds number 100. This is the data of a flow around a cylinder published by Raissi in "Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations" (DOI: 10.1126/science.aaw4741).
    2. The folder "Burgers' equation" contains the simulated data of Burgers' equation for the experiments on small-coefficient terms, noise, and sparsity. Ground-truth equation: u_t = lambda*u_xx - u*u_x. Naming: "Burgers_coef_(lambda)_(grid setting: spatial x temporal)_(s1/s2)". The datasets were simulated on a [-8, 8] spatial domain and a [0, 10] temporal domain with structured grids of different levels of sparsity. The noise tests were performed on "Burgers_coef_1e-1_257x101_s1.mat" and "Burgers_coef_1e-1_257x101_s2.mat" by manually adding Gaussian noise.
    3. The folder "Three body" contains the simulated data of the three-body system for the experiments on small-coefficient terms, noise, and sparsity. Naming: "three_body_coef_(lambda, see paper for meaning)_(s1/s2)". Each dataset contains a dictionary of 7 keys: x, y, z, u, v, w, dt. The time step is dt = 0.005. The noise tests were performed on "three_body_coef_1e0_s1.mat" and "three_body_coef_1e0_s2.mat" by manually adding Gaussian noise.
    4. The folder "Single-cell sequencing data" contains the two sets of preprocessed multi-omics single-cell sequencing datasets used in identifying RNA and protein velocity. The original datasets GSM2695381 and GSM2695382 are publicly available in the Gene Expression Omnibus ("Large-scale simultaneous measurement of epitopes and transcriptomes in single cells").
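    As a quick orientation, the [T, X, Y, C] layout described for the Navier-Stokes cases can be sliced as below. This is a sketch with placeholder zeros standing in for the actual files (the listing does not specify a loader), using the Case 1 grid:

```python
import numpy as np

# Placeholder array mimicking Case 1's layout: [T, X, Y, C] with C = [V, U, P].
# Grid size and dx follow the description above; the values here are zeros.
n = 256
dx = 2 * np.pi / n              # spacing on the 2pi x 2pi domain
data = np.zeros((10, n, n, 3))  # 10 time steps chosen arbitrarily for the demo

v = data[..., 0]  # velocity component V
u = data[..., 1]  # velocity component U
p = data[..., 2]  # pressure scalar
```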
  3. Mathematical Problems Dataset: Various

    • kaggle.com
    Updated Dec 2, 2023
    Cite
    The Devastator (2023). Mathematical Problems Dataset: Various [Dataset]. https://www.kaggle.com/datasets/thedevastator/mathematical-problems-dataset-various-mathematic
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Mathematical Problems Dataset: Various Mathematical Problems and Solutions

    Mathematical Problems Dataset: Questions and Answers

    By math_dataset (From Huggingface) [source]

    About this dataset

    This dataset comprises a collection of mathematical problems and their solutions, designed for training and testing purposes. Each problem is presented as a question, followed by its corresponding answer. The dataset covers various mathematical topics such as arithmetic, polynomials, and prime numbers. For instance, the arithmetic_nearest_integer_root_test.csv file focuses on problems involving finding the nearest integer root of a given number. Similarly, the polynomials_simplify_power_test.csv file deals with problems related to simplifying polynomials with powers. Additionally, the dataset includes the numbers_is_prime_train.csv file, containing math problems that require determining whether a specific number is prime. The questions and answers are provided in text format to facilitate analysis and experimentation with mathematical problem-solving algorithms and models.

    How to use the dataset

    • Introduction: The Mathematical Problems Dataset contains a collection of various mathematical problems and their corresponding solutions or answers. This guide will provide you with all the necessary information on how to utilize this dataset effectively.

    • Understanding the columns: The dataset consists of several columns, each representing a different aspect of the mathematical problem and its solution. The key columns are:

      • question: This column contains the text representation of the mathematical problem or equation.
      • answer: This column contains the text representation of the solution or answer to the corresponding problem.
    • Exploring specific problem categories: To focus on specific types of mathematical problems, you can filter or search within the dataset using relevant keywords or terms related to your area of interest. For example, if you are interested in prime numbers, you can search for prime in the question column.

    • Applying machine learning techniques: This dataset can be used for training machine learning models related to natural language understanding and mathematics. You can explore various techniques such as text classification, sentiment analysis, or even sequence-to-sequence models for solving mathematical problems based on their textual representations.

    • Generating new questions and solutions: By analyzing patterns in this dataset, you can generate new questions and solutions programmatically using techniques like data augmentation or rule-based methods.

    • Validation and evaluation: As with any other machine learning task, it is essential to properly validate your models on separate validation sets not included in this dataset. You can also evaluate model performance by comparing predictions against the known answers provided in this dataset's answer column.

    • Sharing insights and findings: After working with this dataset, researchers and educators are encouraged to share their insights and the approaches taken during analysis and modelling as Kaggle notebooks, discussions, blogs, or tutorials, so that others can benefit from these shared resources.

    Note: Please note that the dataset does not include dates.

    By following these guidelines, you can effectively explore and utilize the Mathematical Problems Dataset for various mathematical problem-solving tasks. Happy exploring!
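    The keyword filtering suggested above can be done in plain Python. The sample rows below are made up for illustration; the real files use the same question/answer columns:

```python
# Hypothetical sample rows mimicking the dataset's question/answer columns.
rows = [
    {"question": "Is 179 a prime number?", "answer": "True"},
    {"question": "Simplify (x**2)**3.", "answer": "x**6"},
    {"question": "What is the integer closest to the square root of 80?", "answer": "9"},
]

# Keep only problems whose question mentions "prime".
prime_rows = [r for r in rows if "prime" in r["question"]]
```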

    Research Ideas

    • Developing machine learning algorithms for solving mathematical problems: This dataset can be used to train and test models that can accurately predict the solution or answer to different mathematical problems.
    • Creating educational resources: The dataset can be used to create a wide variety of educational materials such as problem sets, worksheets, and quizzes for students studying mathematics.
    • Research in mathematical problem-solving strategies: Researchers and educators can analyze the dataset to identify common patterns or strategies employed in solving different types of mathematical problems. This analysis can help improve teaching methodologies and develop effective problem-solving techniques.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purpos...

  4. Data from: Parameter Estimation and Variable Selection for Big Systems of...

    • tandf.figshare.com
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated Aug 7, 2024
    Cite
    Leqin Wu; Xing Qiu; Ya-xiang Yuan; Hulin Wu (2024). Parameter Estimation and Variable Selection for Big Systems of Linear Ordinary Differential Equations: A Matrix-Based Approach [Dataset]. http://doi.org/10.6084/m9.figshare.5813787.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Aug 7, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Leqin Wu; Xing Qiu; Ya-xiang Yuan; Hulin Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ordinary differential equations (ODEs) are widely used to model the dynamic behavior of a complex system. Parameter estimation and variable selection for a “Big System” with linear ODEs are very challenging due to the need of nonlinear optimization in an ultra-high dimensional parameter space. In this article, we develop a parameter estimation and variable selection method based on the ideas of similarity transformation and separable least squares (SLS). Simulation studies demonstrate that the proposed matrix-based SLS method could be used to estimate the coefficient matrix more accurately and perform variable selection for a linear ODE system with thousands of dimensions and millions of parameters much better than the direct least squares method and the vector-based two-stage method that are currently available. We applied this new method to two real datasets—a yeast cell cycle gene expression dataset with 30 dimensions and 930 unknown parameters and the Standard & Poor 1500 index stock price data with 1250 dimensions and 1,563,750 unknown parameters—to illustrate the utility and numerical performance of the proposed parameter estimation and variable selection method for big systems in practice. Supplementary materials for this article are available online.

  5. Use Case Points Benchmark Dataset

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Jan 24, 2020
    Cite
    Radek Silhavy; Petr Silhavy; Zdenka Prokopova (2020). Use Case Points Benchmark Dataset [Dataset]. http://doi.org/10.5281/zenodo.344959
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Radek Silhavy; Petr Silhavy; Zdenka Prokopova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was gathered by us from three software houses; it is a real-life dataset. The Use Case Points (UCP) method, as originated by Karner, was used for counting the steps and the number of actors. The data cover different programming languages and various problem domains. The ISBSG style for language, domain, and application type was adopted.


    Attributes are used as follows:
    Project_No - project ID, for identification purposes only.
    Simple Actors - number of actors classified according to UCP as simple.
    Average Actors - number of actors classified according to UCP as average.
    Complex Actors - number of actors classified according to UCP as complex.
    UAW - Unadjusted Actor Weight, computed using the UCP equation.
    Simple UC - number of use cases classified as simple (UCP number of steps is used).
    Average UC - number of use cases classified as average (UCP number of steps is used).
    Complex UC - number of use cases classified as complex (UCP number of steps is used).
    UUCW - Unadjusted Use Case Weight, computed using the UCP equation.
    TCF - Technical Complexity Factor.
    ECF - Environmental Complexity Factor.
    Real_P20 - real effort in person-hours, derived using a productivity factor (PF = 20).
    Real_Effort_Person_Hours - real effort (development time) in person-hours.
    Sector - problem domain of the project.
    Language - programming language used for the project.
    Methodology - development methodology used for project development.
    ApplicationType - classification of project type, provided by the donator.
    DataDonator - anonymized acronym for the data donator.
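    The UAW and UUCW columns follow the standard UCP equations. A sketch using Karner's customary weights (1/2/3 for simple/average/complex actors, 5/10/15 for simple/average/complex use cases) is given below; these weights are the textbook defaults, not values stated in this listing, so check them against the dataset before reuse:

```python
# Standard UCP building blocks (Karner's customary weights assumed).
def uaw(simple, average, complex_):
    """Unadjusted Actor Weight from actor counts."""
    return 1 * simple + 2 * average + 3 * complex_

def uucw(simple, average, complex_):
    """Unadjusted Use Case Weight from use-case counts."""
    return 5 * simple + 10 * average + 15 * complex_

def effort_person_hours(uaw_v, uucw_v, tcf, ecf, pf=20.0):
    """UCP = (UAW + UUCW) * TCF * ECF; effort = UCP * productivity factor."""
    ucp = (uaw_v + uucw_v) * tcf * ecf
    return ucp * pf

# Example: 2 simple + 2 average + 1 complex actors, 10/5/2 use cases,
# neutral adjustment factors, PF = 20 as in the Real_P20 column.
total = effort_person_hours(uaw(2, 2, 1), uucw(10, 5, 2), tcf=1.0, ecf=1.0)
```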

  6. Sound field image dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Takehiro (2024). Sound field image dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8357752
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Kenji
    Takehiro
    Daiki
    Noboru
    Description

    This sound field image dataset contains clean-noisy pairs of complex-valued sound-field images generated by 2D acoustic simulations. The dataset was initially prepared for deep sound-field denoiser (https://github.com/nttcslab/deep-sound-field-denoiser), a DNN-based denoising method for optically measured sound fields. Since the data is a two-dimensional sound field based on the Helmholtz equation, one can use this dataset for any acoustic application. Please check our GitHub repository and paper for details.

    Directory structure

    The dataset contains three directories: training, validation, and evaluation. Each directory contains "soundsource#" sub-directories (# represents the number of sound sources used in the acoustic simulation). Each sub-directory has three h5 files for data (clean, white noise, and speckle noise) and three CSV files listing random parameter values used in the simulation.

    • /training

      • /soundsource#

        • constants.csv

        • random_variable_ranges.csv

        • random_variables.csv

        • sf_true.h5

        • sf_noise_white.h5

        • sf_noise_speckle.h5

    Condition of use

    This dataset is available under the attached license file. Read the terms and conditions in NTTSoftwareLicenseAgreement.pdf carefully.

    Citation

    If you use this dataset, please cite the following paper.

    K. Ishikawa, D. Takeuchi, N. Harada, and T. Moriya, "Deep sound-field denoiser: optically-measured sound-field denoising using deep neural network," arXiv:2304.14923 (2023).

  7. PEDS datasets and figure data

    • zenodo.org
    csv
    Updated Oct 17, 2023
    Cite
    Raphaël Pestourie; Payel Das (2023). PEDS datasets and figure data [Dataset]. http://doi.org/10.5281/zenodo.10011958
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 17, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Raphaël Pestourie; Payel Das
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Datasets:

    • y_fisher25.csv: reaction-diffusion equation; the thermal flux corresponding to structures with 25 holes
    • X_fisher25.csv: reaction-diffusion equation; the side lengths of the 25 holes in the structures
    • y_fisher16.csv: reaction-diffusion equation; the thermal flux corresponding to structures with 16 holes
    • X_fisher16.csv: reaction-diffusion equation; the side lengths of the 16 holes in the structures
    • y_fourier25.csv: diffusion equation; the thermal flux corresponding to structures with 25 holes
    • X_fourier25.csv: diffusion equation; the side lengths of the 25 holes in the structures
    • y_fourier16.csv: diffusion equation; the thermal flux corresponding to structures with 16 holes
    • X_fourier16.csv: diffusion equation; the side lengths of the 16 holes in the structures
    • y_maxwell10.csv: Helmholtz equation; the complex transmission through the 10-layered structure
    • X_maxwell10.csv: Helmholtz equation; the side lengths of the 10 holes in each layer of the structure followed by a one-hot encoding of the frequency [0.5, 0.75, 1]

    Figure data:

    • nb_trainingpoints_Fig1.csv: number of training points in the dataset; x-coordinates (Fig 1, S1, and S2)
    • baseline_alFig1.csv: error of the baseline ensemble using a dataset that was generated using active learning (Fig 1, S1, and S2)
    • baseline_noalFig1.csv: error of the baseline ensemble using a dataset that was sampled uniformly at random (Fig 1, S1, and S2)
    • baseline_single_noalFig1.csv: error of the baseline (single model) using a dataset that was sampled uniformly at random (Fig 1, S1, and S2)
    • PEDS_alFig1.csv: error of the PEDS ensemble using a dataset that was generated using active learning (Fig 1, S1, and S2)
    • PEDS_noalFig1.csv: error of the PEDS ensemble using a dataset that was sampled uniformly at random (Fig 1, S1, and S2)
    • PEDS_single_noalFig1.csv: error of the PEDS (single model) using a dataset that was sampled uniformly at random (Fig 1, S1, and S2)
    • SM10_ALFigS1.csv: error of the space mapping ensemble with a resolution of 10 using a dataset that was generated using active learning (Fig S1)
    • SM10_noALFigS1.csv: error of the space mapping ensemble with a resolution of 10 using a dataset that was sampled uniformly at random (Fig S1)
    • SM10_single_noALFigS1.csv: error of the space mapping (single model) with a resolution of 10 using a dataset that was sampled uniformly at random (Fig S1)
    • SM20_single_noalFigS2.csv: error of the space mapping (single model) with a resolution of 20 using a dataset that was sampled uniformly at random (Fig S2)
    • SM20_ALFigS2.csv: error of the space mapping ensemble with a resolution of 20 using a dataset that was generated using active learning (Fig S2)
    • SM20_noALFigS2.csv: error of the space mapping ensemble with a resolution of 20 using a dataset that was sampled uniformly at random (Fig S2)
    • resolutionFigS4.csv: resolution of the middle fidelity model; x-coordinate (Fig. S4)
    • error_midfidFigS4.csv: error of the middle fidelity model (Fig. S4)
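    Each y_*.csv pairs row-wise with its X_*.csv (inputs: hole side lengths; targets: flux or transmission), so a surrogate can be fit directly. A self-contained sketch with synthetic stand-ins for the 25-hole geometry data follows; the real files would be loaded with something like numpy.loadtxt, which is an assumption about their plain-CSV layout:

```python
import numpy as np

# Synthetic stand-ins: rows of X are 25 hole side-lengths, y the matching flux.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(100, 25))
w_true = rng.normal(size=25)
y = X @ w_true  # noiseless linear toy response, for illustration only

# Fit a linear surrogate; recovers w_true exactly in this noiseless toy case.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
```

    A linear fit is only a baseline; the PEDS paper's point is that a physics-enhanced surrogate does much better, but the X/y pairing is the same.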

  8. Kuramoto-Sivashinsky PDE Lyapunov Exponents: Code & Data

    • adelaide.figshare.com
    • researchdata.edu.au
    txt
    Updated Jul 12, 2024
    Cite
    Russell Edson; Judith Bunder; Trent Mattner; Anthony Roberts (2024). Kuramoto-Sivashinsky PDE Lyapunov Exponents: Code & Data [Dataset]. http://doi.org/10.25909/26184566.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    The University of Adelaide
    Authors
    Russell Edson; Judith Bunder; Trent Mattner; Anthony Roberts
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MATLAB code and the generated datasets for the Lyapunov exponent spectra of the Kuramoto-Sivashinsky PDE, as published in the paper 'Lyapunov Exponents of the Kuramoto-Sivashinsky PDE' in 2019. The files are organised as follows:

    Code
    • code/lyapunovexpts.m contains a MATLAB function implementation of Algorithm 1 from the paper, which is the classic algorithm for finding Lyapunov exponents introduced by Benettin et al. (1980) and Shimada and Nagashima (1979).
    • code/dudt_ksperiodic_spectral.m contains a vectorised ODE-discretisation of the Kuramoto-Sivashinsky PDE on a periodic domain using a spectral scheme, which can be used with the standard MATLAB ODE solvers to simulate the dynamics.
    • code/dudt_ksoddperiodic_finitediff.m contains a similar vectorised ODE-discretisation of the Kuramoto-Sivashinsky PDE, but for the "odd-periodic" domain (u = u_xx = 0 at x = 0, L) and using a finite-difference scheme with error O(dx^2).
    • code/research_kslyaps{.m,.sh} contain the code that ran the computational experiments (using the above Lyapunov exponents code and the Kuramoto-Sivashinsky ODE-discretisations) to generate the Lyapunov spectra data, using MATLAB 2016a on the School of Mathematical Sciences' maths1 Linux server in 2017.

    Data
    • lyapexpts_ksperiodic.zip contains the Lyapunov spectra computed for the Kuramoto-Sivashinsky PDE on the periodic domain. Each file has a filename of the form LXYZpW.txt and contains the 24 most positive Lyapunov exponents computed on the periodic domain [0, L] where L = XYZ.W. (E.g., L097p4.txt contains the exponents for L = 97.4.)
    • lyapexpts_ksoddperiodic.zip contains the Lyapunov spectra computed for the Kuramoto-Sivashinsky PDE on the "odd-periodic" domain, with the same LXYZpW.txt naming convention.
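    The LXYZpW.txt naming convention can be decoded mechanically; a small sketch (the helper name is my own, not part of the dataset):

```python
import re

def domain_length(filename):
    """Decode 'LXYZpW.txt' to the domain length L = XYZ.W."""
    m = re.fullmatch(r"L(\d+)p(\d+)\.txt", filename)
    if m is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    return float(f"{m.group(1)}.{m.group(2)}")
```

    For example, domain_length("L097p4.txt") gives the L = 97.4 case mentioned above.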

  9. Training and testing data used in the paper "An equation-of-state-meter of...

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Long-Gang Pang; Kai Zhou; Nan Su; Hannah Petersen; Horst Stocker; Xin-Nian Wang (2023). Training and testing data used in the paper "An equation-of-state-meter of QCD transition from deep learning" [Dataset]. http://doi.org/10.6084/m9.figshare.5457220.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Long-Gang Pang; Kai Zhou; Nan Su; Hannah Petersen; Horst Stocker; Xin-Nian Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Training and testing data to identify the QCD transition using deep learning and traditional machine learning.

    1. training_data.csv, testing_iebevishnu.csv, testing_ipglasma.csv: There are 723 entries in each row. The first row is the description of the data. The 0th entry in each row is the event id; entries 1 to 720 are the pion density distribution at mid-rapidity, rho(pt, phi), over 15 pt bins and 48 azimuthal-angle phi bins, with phi as the inner loop. The 721st entry is the equation-of-state type (0 or 1). The 722nd entry is extra information for the event.
    2. training_observables.csv, test_iebe_observables.csv, test_ipglasma_observables.csv: There are 87 entries in each row. The first row is the header describing the data in the following rows. In those rows, the first entry is the event id, the second entry is the equation-of-state type (0 or 1), and the remaining 85 entries are observables computed from the raw spectra, to be used with various traditional classifiers from machine-learning toolboxes.

    The meaning of these entries and the Monte Carlo model used to generate these data can be found in the paper: http://inspirehep.net/record/1503189
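    Given that row layout, the 720 density entries unpack into a 15 x 48 (pt, phi) grid with phi as the inner loop. A sketch on a synthetic row (the real rows come from the CSV files above):

```python
import numpy as np

# Synthetic stand-in for one data row: [event_id, 720 density values, eos, extra].
row = np.arange(723.0)

event_id = int(row[0])
rho = row[1:721].reshape(15, 48)  # pt bins outer, phi bins inner
eos_type = int(row[721])
extra = row[722]
```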

  10. gsm8k_txt_Qwen2.5-MATH-7B-Instruct

    • huggingface.co
    Cite
    Samuel Moor-Smith, gsm8k_txt_Qwen2.5-MATH-7B-Instruct [Dataset]. https://huggingface.co/datasets/smoorsmith/gsm8k_txt_Qwen2.5-MATH-7B-Instruct
    Explore at:
    Authors
    Samuel Moor-Smith
    Description

    Dataset Card for Dataset Name

    Contains text reasoning chain from Qwen2.5-MATH-7B-Instruct from standard prompt: [ { "role": "user", "content": "Solve the following math problem efficiently and clearly: - For simple problems (2 steps or fewer): Provide a concise solution with minimal equation. - For complex problems (3 steps or more): Use this step-by-step format:

    Step 1: [Brief calculations]

    Step 2: [Brief calculations]

    ... Regardless of the approach… See the full description on the dataset page: https://huggingface.co/datasets/smoorsmith/gsm8k_txt_Qwen2.5-MATH-7B-Instruct.

  11. Data from: Comparable space use by lions between hunting concessions and...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Feb 18, 2020
    Cite
    Kirby Mills; Yahou Harissou; Isaac Gnoumou; Yaye Abdel-Nasser; Benoit Doamba; Nyeema Harris (2020). Comparable space use by lions between hunting concessions and national parks in West Africa [Dataset]. http://doi.org/10.5061/dryad.r4xgxd28g
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 18, 2020
    Dataset provided by
    Dryad
    Authors
    Kirby Mills; Yahou Harissou; Isaac Gnoumou; Yaye Abdel-Nasser; Benoit Doamba; Nyeema Harris
    Time period covered
    Feb 11, 2020
    Description

    Spatially varied resources and threats govern the persistence of African lions across dynamic protected areas. An important precursor to effective conservation for lions requires assessing tradeoffs in space use due to heterogeneity in habitat, resources, and human presence between national parks and hunting areas, the dominant land-use classifications across their range.

    We conducted the largest camera survey in West Africa, encompassing 3 national parks and 11 hunting concessions in Burkina Faso and Niger, equating to half of the 26,500-km2 transboundary W-Arly-Pendjari (WAP) protected area complex. We combined occupancy and structural equation modeling to disentangle the relative effects of environmental, ecological, and anthropogenic variables influencing space use of Critically Endangered lions across 21,430 trap-nights from 2016-2018.

    National parks are intended to serve as refuges from human pressures, and thus we expect higher lion occupancy in national parks (NPs) comp...

  12. Data from: Bushland ET Calculator

    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • data.amerigeoss.org
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Bushland ET Calculator [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/bushland-et-calculator-d94e3
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Area covered
    Bushland
    Description

    The Bushland Reference ET Calculator was developed at the USDA-ARS Conservation and Production Research Laboratory, Bushland, Texas. Although it was designed and developed mainly for use by producers and crop consultants to manage irrigation scheduling, it can also be used in educational training, research, and other practical applications. It uses the ASCE Standardized Reference Evapotranspiration (ET) Equation for calculating grass and alfalfa reference ET at hourly and daily time steps. The program uses the more complex equation for estimating clear-sky solar radiation provided in Appendix D of the ASCE-EWRI ET Manual. Users have the option of using a single set or a time series of weather data to calculate reference ET. Daily reference ET can be calculated either by summing the hourly ET values for a given day or by using averages of the climatic data.

    Resources in this dataset:
    Resource Title: Bushland ET Calculator download page. File Name: Web Page, url: https://res1wwwd-o-tarsd-o-tusdad-o-tgov.vcapture.xyz/research/software/download/?softwareid=Bushland+ET+Calculator&modecode=30-90-05-00
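    The ASCE standardized reference ET equation that the calculator implements has a closed form; a minimal sketch follows, with illustrative inputs and the daily grass-reference constants Cn = 900, Cd = 0.34 assumed from the ASCE-EWRI standard (this is my own generic implementation, not the calculator's code, so verify constants and units against the manual):

```python
# Sketch of the ASCE standardized reference ET equation (daily, grass reference
# assumed: Cn = 900, Cd = 0.34). All inputs are illustrative placeholders.
def asce_reference_et(delta, rn, g, gamma, t_mean, u2, es, ea,
                      cn=900.0, cd=0.34):
    """delta: slope of vapor-pressure curve (kPa/C); rn, g: net radiation and
    soil heat flux (MJ/m2/day); gamma: psychrometric constant (kPa/C);
    t_mean: mean air temperature (C); u2: 2-m wind speed (m/s);
    es, ea: saturation and actual vapor pressure (kPa). Returns ET (mm/day)."""
    num = 0.408 * delta * (rn - g) + gamma * cn * u2 * (es - ea) / (t_mean + 273.0)
    den = delta + gamma * (1.0 + cd * u2)
    return num / den

# Placeholder mid-season values, chosen only to exercise the function.
et = asce_reference_et(delta=0.145, rn=15.0, g=0.0, gamma=0.0677,
                       t_mean=20.0, u2=2.0, es=2.34, ea=1.40)
```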

  13. Spin-Boson dataset

    • plus.figshare.com
    zip
    Updated Jul 22, 2023
    Cite
    Arif Ullah; Luis Herrera; Pavlo O. Dral; Alexei Kananenka (2023). Spin-Boson dataset [Dataset]. http://doi.org/10.25452/figshare.plus.21913062.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 22, 2023
    Dataset provided by
    Figshare+
    Authors
    Arif Ullah; Luis Herrera; Pavlo O. Dral; Alexei Kananenka
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This data set was generated for the spin-boson model using the hierarchical equations of motion (HEOM) method for 1,000 model parameters. The data set contains the reduced density matrix of the two-level system for each parameter set.

    See related materials in Collection at: https://doi.org/10.25452/figshare.plus.c.6389553

    Collection description: Simulations of the dynamics of dissipative quantum systems utilize many methods such as physics-based quantum, semiclassical, and quantum-classical as well as machine learning-based approximations, development and testing of which requires diverse data sets. Here we present a new database QD3SET-1 containing eight data sets of quantum dynamical data for two systems of broad interest, the spin-boson (SB) model and the Fenna–Matthews–Olson (FMO) complex, generated with two different methods solving the dynamics, the approximate local thermalizing Lindblad master equation (LTLME) and the highly accurate hierarchical equations of motion (HEOM). One data set was generated for the SB model, a two-level quantum system coupled to a harmonic environment, using HEOM for 1,000 model parameters. Seven data sets were collected for the FMO complex of different sizes (7- and 8-site monomer and 24-site trimer with LTLME and 8-site monomer with HEOM) for 500–879 model parameters. Our QD3SET-1 database contains both population and coherence dynamics data, and part of it has already been used for machine learning-based quantum dynamics studies.
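Working with such a data set typically means turning each stored 2×2 reduced density matrix into population and coherence curves. The snippet below is a generic NumPy sketch; the `(n_times, 2, 2)` array layout and the toy damped-oscillation trajectory are assumptions for illustration, not the actual QD3SET-1 file format.

```python
import numpy as np

def summarize_trajectory(rho):
    """Return population difference <sigma_z> and |coherence| versus time.

    rho is assumed to have shape (n_times, 2, 2): one reduced density
    matrix of the two-level system per time step.
    """
    # Basic sanity checks on each stored density matrix
    assert np.allclose(np.trace(rho, axis1=1, axis2=2), 1.0), "trace must be 1"
    assert np.allclose(rho, np.conj(np.swapaxes(rho, 1, 2))), "must be Hermitian"
    pop_diff = np.real(rho[:, 0, 0] - rho[:, 1, 1])  # population dynamics
    coherence = np.abs(rho[:, 0, 1])                 # coherence dynamics
    return pop_diff, coherence

# Toy trajectory: exponentially damped Rabi-like oscillation
t = np.linspace(0.0, 10.0, 101)
p_up = 0.5 * (1.0 + np.exp(-0.3 * t) * np.cos(2.0 * t))  # upper-state population
rho = np.zeros((t.size, 2, 2), dtype=complex)
rho[:, 0, 0] = p_up
rho[:, 1, 1] = 1.0 - p_up
rho[:, 0, 1] = 0.25 * np.exp(-0.3 * t) * np.exp(1j * 2.0 * t)
rho[:, 1, 0] = np.conj(rho[:, 0, 1])

pd, coh = summarize_trajectory(rho)
```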

  14. e

    Profiling novel high-conductivity 2D semiconductors - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Aug 1, 2020
    + more versions
    Cite
    (2020). Profiling novel high-conductivity 2D semiconductors - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/a1ed3619-b70e-5e74-b527-e72162f3f603
    Explore at:
    Dataset updated
    Aug 1, 2020
    Description

    When complex mechanisms are involved, pinpointing high-performance materials within large databases is a major challenge in materials discovery. We focus here on phonon-limited conductivities, and study 2D semiconductors doped by field effects. Using state-of-the-art density-functional perturbation theory and the Boltzmann transport equation, we discuss 11 monolayers with outstanding transport properties. These materials are selected from a computational database of exfoliable materials providing monolayers that are dynamically stable and that do not have more than 6 atoms per unit cell. We first analyze electron-phonon scattering in two well-known systems: electron-doped InSe and hole-doped phosphorene. Both are single-valley systems with weak electron-phonon interactions, but they represent two distinct pathways to fast transport: a steep and deep isotropic valley for the former and strongly anisotropic electron-phonon physics for the latter. We identify similar features in the database and compute the conductivities of the relevant monolayers. This process yields several high-conductivity materials, some of them only very recently emerging in the literature (GaSe, Bi₂SeTe₂, Bi₂Se₃, Sb₂SeTe₂), others never discussed in this context (AlLiTe₂, BiClTe, ClGaTe, AuI). Comparing these 11 monolayers in detail, we discuss how the strength and angular dependency of the electron-phonon scattering drives key differences in the transport performance of materials despite similar valley structure. We also discuss the high conductivity of hole-doped WSe₂, and how this case study shows the limitations of a selection process that would be based on band properties alone. In this entry we provide the AiiDA database with the calculations of phonons and electron-phonon interactions for the 11 materials, along with the Python library to collect and visualise the data, solve the Boltzmann transport equation, and launch the same workflows for other 2D materials.
To guide the reader, we include a Jupyter notebook showing how to extract the data, use the basic functionalities of the library, and regenerate the plots included in the associated paper.
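As a rough companion to the description above, and emphatically not the ab initio DFPT/Boltzmann workflow shipped with the dataset, a Drude-level formula already shows why a steep, light-mass valley favours conduction; all numbers below are illustrative assumptions.

```python
# Toy single-valley, constant relaxation-time estimate of 2D sheet conductivity,
# sigma = n e^2 tau / m*  (a Drude-level stand-in for the full Boltzmann
# transport solution used in the dataset; values are illustrative only).
E_CHARGE = 1.602176634e-19      # elementary charge (C)
M_ELECTRON = 9.1093837015e-31   # electron mass (kg)

def sheet_conductivity(n_2d, tau, m_eff):
    """n_2d in m^-2, tau in s, m_eff in units of the electron mass -> S per square."""
    return n_2d * E_CHARGE**2 * tau / (m_eff * M_ELECTRON)

# A steep, light-mass valley (InSe-like) versus a heavier one at the same
# doping level and scattering time:
n = 1e17                                   # 1e13 cm^-2 expressed in m^-2
light = sheet_conductivity(n, 1e-13, 0.15)
heavy = sheet_conductivity(n, 1e-13, 0.60)
```

At fixed carrier density and scattering time, the conductivity ratio is simply the inverse ratio of the effective masses, which is why valley steepness matters so much in the screening described above.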

  15. Data from: Source code for R tutorials and dataset for empirical case study...

    • zenodo.org
    • data.niaid.nih.gov
    • +2more
    txt
    Updated Jun 4, 2022
    Cite
    Martijn van de Pol; Martijn van de Pol; Lyanne Brouwer; Lyanne Brouwer (2022). Source code for R tutorials and dataset for empirical case study on Malurus elegans (red-winged fairy wren) [Dataset]. http://doi.org/10.5061/dryad.7h44j0ztw
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 4, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Martijn van de Pol; Martijn van de Pol; Lyanne Brouwer; Lyanne Brouwer
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Biological processes exhibit complex temporal dependencies due to the sequential nature of allocation decisions in organisms' life-cycles, feedback loops, and two-way causality. Consequently, longitudinal data often contain cross-lags: the predictor variable depends on the response variable of the previous time-step. Although statisticians have warned that regression models that ignore such covariate endogeneity in time series are likely to be inappropriate, this has received relatively little attention in biology. Furthermore, the resulting degree of estimation bias remains largely unexplored.

    We use a graphical model and numerical simulations to understand why and how regression models that ignore cross-lags can be biased, and how this bias depends on the length and number of time series. Ecological and evolutionary examples are provided to illustrate that cross-lags may be more common than is typically appreciated and that they occur in functionally different ways.

    We show that routinely used regression models that ignore cross-lags are asymptotically unbiased. However, this offers little relief, as for most realistically feasible lengths of time series conventional methods are biased. Furthermore, collecting time series on multiple subjects (such as populations, groups or individuals) does not help to overcome this bias when the analysis focuses on within-subject patterns (often the pattern of interest). Simulations (R tutorial 1 & 2), a literature search and a real-world empirical example on fairy wrens (data archived here with analyses presented in R-tutorial 3) together suggest that approaches that ignore cross-lags are likely biased in the direction opposite to the sign of the cross-lag (e.g. towards detecting density-dependence of vital rates and against detecting life history trade-offs and benefits of group living). Next, we show that multivariate (e.g. structural equation) models can dynamically account for cross-lags, and simultaneously address additional bias induced by measurement error, but only if the analysis considers multiple time series.

    We provide guidance on how to identify a cross-lag and subsequently specify it in a multivariate model, which can be far from trivial. Our tutorials with data and R code of the worked examples provide step‐by‐step instructions on how to perform such analyses.

    Our study offers insights into situations in which cross-lags can bias analysis of ecological and evolutionary time series and suggests that adopting dynamical models can be important, as this directly affects our understanding of population regulation, the evolution of life histories and cooperation, and possibly many other topics. Determining how strong estimation bias due to ignoring covariate endogeneity has been in the ecological literature requires further study, also because it may interact with other sources of bias.
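The bias mechanism can be reproduced in a few lines of simulation. This Python sketch (the archived tutorials themselves are in R) uses an invented toy model: the response y has no true dependence on the predictor x (beta = 0), but x carries a positive cross-lag on the previous y; within-subject centering of short series then yields a spurious slope opposite in sign to the cross-lag.

```python
import numpy as np

rng = np.random.default_rng(0)

# Many subjects, short time series: the regime where the bias bites.
n_subjects, T, c = 2000, 5, 0.8   # c is the (positive) cross-lag strength

X = np.empty((n_subjects, T))
Y = np.empty((n_subjects, T))
y_prev = rng.normal(size=n_subjects)
for t in range(T):
    x = c * y_prev + rng.normal(size=n_subjects)  # cross-lag: x_t depends on y_{t-1}
    y = rng.normal(size=n_subjects)               # beta = 0: y ignores x entirely
    X[:, t], Y[:, t] = x, y
    y_prev = y

# Within-subject centering, then pooled OLS slope of y on x.
Xc = (X - X.mean(axis=1, keepdims=True)).ravel()
Yc = (Y - Y.mean(axis=1, keepdims=True)).ravel()
slope = (Xc @ Yc) / (Xc @ Xc)   # spurious negative slope despite beta = 0
```

Despite the true effect being exactly zero, the estimated within-subject slope comes out reliably negative (of order -c/T), matching the direction-of-bias result described above.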

  16. f

    Data from: Confidence-Interval and Uncertainty-Propagation Analysis of...

    • datasetcatalog.nlm.nih.gov
    • acs.figshare.com
    Updated Aug 31, 2023
    Cite
    Smirnova, Irina; Mueller, Simon; Walker, Pierre J. (2023). Confidence-Interval and Uncertainty-Propagation Analysis of SAFT-type Equations of State [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000936902
    Explore at:
    Dataset updated
    Aug 31, 2023
    Authors
    Smirnova, Irina; Mueller, Simon; Walker, Pierre J.
    Description

    Thermodynamic models and, in particular, Statistical Associating Fluid Theory (SAFT)-type equations, are vital in characterizing complex systems. This paper presents a framework for sampling parameter distributions in PC-SAFT and SAFT-VR Mie equations of state to examine parameter confidence intervals and correlations. Comparing the equations of state, we find that additional parameters introduced in the SAFT-VR Mie equation increase relative uncertainties (1%–2% to 3%–4%) and introduce more correlations. These correlations can be attributed to conserved quantities such as particle volume and interaction energy. When incorporating association through additional parameters, relative uncertainties increase further while slightly reducing correlations between parameters. We also investigate how uncertainties in parameters propagate to the predicted properties from these equations of state. While the uncertainties for the regressed properties remain small, when extrapolating to new properties, uncertainties can become significant. This is particularly true near the critical point where we observe that properties dependent on the isothermal compressibility exhibit massive divergences in the uncertainty. We find that these divergences are intrinsic to these equations of state and, as a result, will always be present regardless of how small the parameter uncertainties are.
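The sampling-and-propagation idea can be sketched generically: draw correlated parameters from their estimated covariance and push each draw through the model. The toy property below (eps·sigma³, standing in for an actual SAFT-type output) and all numerical values are my own illustrative assumptions, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fitted parameters: segment diameter (Angstrom) and
# dispersion energy over k_B (K), with 1% and 2% relative uncertainty
# and a strong negative correlation, as is typical of SAFT-type fits.
mean = np.array([3.7, 250.0])
rel_unc = np.array([0.01, 0.02])
corr = np.array([[1.0, -0.9],
                 [-0.9, 1.0]])
std = rel_unc * mean
cov = corr * np.outer(std, std)

# Monte-Carlo propagation: sample correlated parameter draws...
draws = rng.multivariate_normal(mean, cov, size=20000)

def toy_property(sigma, eps):
    # hypothetical property ~ eps * sigma**3 (stands in for an EoS output)
    return eps * sigma**3

# ...and read the propagated uncertainty off the resulting samples.
prop = toy_property(draws[:, 0], draws[:, 1])
rel = prop.std() / prop.mean()
```

Note how the negative parameter correlation makes the propagated uncertainty smaller than a naive uncorrelated combination would suggest: the correlations encode the conserved quantities mentioned above.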

  17. e

    Electron density learning of non-covalent systems - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Sep 4, 2022
    + more versions
    Cite
    (2022). Electron density learning of non-covalent systems - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/e8b89b1a-f8eb-5b83-b2a4-d4979e227adf
    Explore at:
    Dataset updated
    Sep 4, 2022
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Chemists continuously harvest the power of non-covalent interactions to control phenomena in both the micro- and macroscopic worlds. From the quantum chemical perspective, the strategies essentially rely upon an in-depth understanding of the physical origin of these interactions, the quantification of their magnitude and their visualization in real-space. The total electron density ρ(r) represents the simplest yet most comprehensive piece of information available for fully characterizing bonding patterns and non-covalent interactions. The charge density of a molecule can be computed by solving the Schrödinger equation, but this approach becomes rapidly demanding if the electron density has to be evaluated for thousands of different molecules or very large chemical systems, such as peptides and proteins. Here we present a transferable and scalable machine-learning model capable of predicting the total electron density directly from the atomic coordinates. The regression model is used to access qualitative and quantitative insights beyond the underlying ρ(r) in a diverse ensemble of sidechain–sidechain dimers extracted from the BioFragment database (BFDb). The transferability of the model to more complex chemical systems is demonstrated by predicting and analyzing the electron density of a collection of 8 polypeptides.

  18. m

    Solutions of Polynomials(Method GRLN)

    • data.mendeley.com
    Updated Jul 9, 2021
    Cite
    nikos mantzakouras (2021). Solutions of Polynomials(Method GRLN) [Dataset]. http://doi.org/10.17632/c2vf5znxk6.3
    Explore at:
    Dataset updated
    Jul 9, 2021
    Authors
    nikos mantzakouras
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    While the approximate methods mentioned above, and others that exist, give particular solutions to generalized transcendental or polynomial equations, they cannot resolve them completely. What we ask when we solve a generalized transcendental or polynomial equation is to find the total set of roots, not separate sets of roots in random or specified intervals, mainly because many categories of transcendental equations have an infinite number of solutions in the complex set. Certain equations (logarithmic, trigonometric, or power functions, among others) solve particular problems in Physics and for the most part need the generalized solution in C. The theory G.R.E-L, which deals with hypergeometric functions or their interplay with other simple functions, gives a very satisfactory answer to the whole of this complex problem by using inverse functions.

  19. s

    Criteria for a sediment data set

    • cinergi.sdsc.edu
    Updated Jun 26, 2018
    Cite
    (2018). Criteria for a sediment data set [Dataset]. http://cinergi.sdsc.edu/geoportal/rest/metadata/item/e4b66a7f55954cfb91bbdcdf0d42663c/html
    Explore at:
    Dataset updated
    Jun 26, 2018
    Description

    Link to the ScienceBase Item Summary page for the item described by this metadata record. Service Protocol: Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information

  20. H

    Data from: Turbofan Specific Fuel Consumption, Size, and Mass from...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Nov 28, 2024
    Cite
    Mohamed Oussama Hammami (2024). Turbofan Specific Fuel Consumption, Size, and Mass from Correlated Engine Parameters [Dataset]. http://doi.org/10.7910/DVN/UW6FAP
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 28, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Mohamed Oussama Hammami
    License

    Custom license: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.7910/DVN/UW6FAP

    Description

    Purpose – Simple equations and more extended models are developed to determine characteristic engine parameters: Specific fuel consumption (SFC), engine mass, and engine size characterized by engine length and diameter. SFC (c) is considered a linear function of speed: c = c_a * V + c_b. --- Methodology – Data from 718 engines is collected from various open sources into an Excel spreadsheet. The characteristic engine parameters are plotted as function of bypass ratio (BPR), date of entry into service (EIS), take-off thrust, and typical cruise thrust. Engine length and diameter are plotted versus engine mass. Linear and nonlinear regression functions are investigated. Moreover, Singular Value Decomposition (SVD) is used to establish relations between parameters. SVD is used with Excel and MATLAB. The accuracy of all equations and models is compared. --- Findings – SFC should be calculated as a linear function of speed. This is especially important, when SFC is extrapolated to unconventional (low) cruise speeds for jet engines. The two parameters c_a and c_b are best estimated from a logarithmic or power function of bypass ratio (BPR). SFC and c_b clearly improved over the years. Engine mass, diameter, and length are proportional to take-off thrust. Characteristic engine parameters can also be obtained from SVD with comparable accuracy. However, SVD is more complicated to set up than using a simple equation. --- Practical implications – Engine characteristics need to be estimated based on only a few known parameters for aircraft preliminary sizing, conceptual design, and aircraft optimization as well as for practical quick calculations in flight mechanics. This thesis provides the tools. --- Social implications – Most engine characteristics like SFC are considered company secrets. The availability of open access engine data is the first step, but wisdom is retrieved only with careful analysis of the data as done here. 
    Openly available aircraft engineering knowledge helps to democratize the discussion about the ecological footprint of aviation. --- Originality/value – Simple equations for jet engine SFC, mass, and size deduced from a large engine database are offered. This approach delivered equations as a function of BPR with an error of only 6%, which is the same accuracy as more complex equations from the literature.
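The two regression steps described above can be sketched on synthetic numbers (the real 718-engine spreadsheet lives in the dataset; the coefficient values here are invented for illustration): express SFC as c = c_a·V + c_b, and fit c_b as a power function of bypass ratio by least squares in log-log space.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the engine database: c_b follows a power law of
# bypass ratio, c_b = k * BPR**m, plus multiplicative noise.
bpr = np.linspace(1.0, 12.0, 40)
k_true, m_true = 2.0e-5, -0.2            # illustrative, not the thesis values
c_b = k_true * bpr**m_true * np.exp(rng.normal(0.0, 0.02, bpr.size))

# Power-law fit via ordinary least squares in log-log space:
# log(c_b) = log(k) + m * log(BPR)
A = np.column_stack([np.ones_like(bpr), np.log(bpr)])
coef, *_ = np.linalg.lstsq(A, np.log(c_b), rcond=None)
k_fit, m_fit = np.exp(coef[0]), coef[1]

def sfc(v, c_a, c_b):
    """Thrust-specific fuel consumption as a linear function of true airspeed."""
    return c_a * v + c_b
```

The linear-in-speed form is what makes extrapolation to unconventional (low) cruise speeds safe, which is the central practical point of the description above.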
