100+ datasets found
  1. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting the median exposure amount for that week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: File format: R workspace file, “Simulated_Dataset.RData”.

    Metadata (including data dictionary)
    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract
    We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description
    • “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
    • “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)
    Required R packages:
    • For running “CWVS_LMC.txt”:
        • msm: Sampling from the truncated normal distribution
        • mnormt: Sampling from the multivariate normal distribution
        • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
        • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)
    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:
    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data
    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability
    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Permissions: These are simulated data without any identifying information or informative birth-level covariates; the exposures are standardized by weekly median and IQR as described at the top of this description, and the medians and IQRs are not given.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
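    The median/IQR standardization described in this entry can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the toy exposure values, and the choice of the "inclusive" quartile method are assumptions made only for illustration (the entry does not say which quartile definition was used).

```python
from statistics import median, quantiles

def standardize_week(exposures):
    """Subtract the weekly median and divide by the weekly IQR."""
    # Quartile definition is an assumption; other definitions give other IQRs.
    q1, _, q3 = quantiles(exposures, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; cannot standardize this week")
    med = median(exposures)
    return [(value - med) / iqr for value in exposures]

def standardize_matrix(z):
    """z holds one row per individual and one column per exposure week.

    Each column (week) is standardized on its own.
    """
    columns = [standardize_week(list(col)) for col in zip(*z)]
    return [list(row) for row in zip(*columns)]

# Hypothetical raw exposures: 5 individuals, 2 weeks.
z_raw = [
    [10.0, 8.0],
    [12.0, 9.0],
    [14.0, 10.0],
    [16.0, 11.0],
    [18.0, 12.0],
]
z_std = standardize_matrix(z_raw)
```

    Because only the standardized values are released, the original exposure scale cannot be recovered without the withheld medians and IQRs.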

  2. Simulation data and code

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Feb 24, 2022
    Cite
    Charlotte de Vries; E Yagmur Erten (2022). Simulation data and code [Dataset]. http://doi.org/10.6084/m9.figshare.19232535.v1
    Available download formats: zip
    Dataset updated
    Feb 24, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Charlotte de Vries; E Yagmur Erten
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description
    • PF_simulation_data.zip contains simulation data to create figure 2 of de Vries, Erten and Kokko.
    • Code_PF.zip contains C++ code to create the data used to create figure 2 (see PF_simulation_data.zip for the data files produced), and it also contains the R script to create figure 2 from the data (Figure2_cloud_25.R).

    All code files were created by Pen, I., & Flatt, T. (2021). Asymmetry, division of labour and the evolution of ageing in multicellular organisms. Philosophical Transactions of the Royal Society B, 376(1823), 20190729. C++ code is slightly adjusted to change output. Note that the R script takes a long time to run (multiple days on our laptops) and uses a lot of swap memory; we advise running it on a server. Alternatively, you can edit the code to use less than the last 25 days by changing this line: ddead% filter(t>4975) to, for example, ddead% filter(t>4998) to use the last 2 time steps only. However, note that there will be insufficient data at high ages to estimate mortality rates.

  3. Call Centre Queue Simulation

    • kaggle.com
    zip
    Updated Sep 20, 2022
    Cite
    Donovan Bangs (2022). Call Centre Queue Simulation [Dataset]. https://www.kaggle.com/datasets/donovanbangs/call-centre-queue-simulation
    Available download formats: zip (841475 bytes)
    Dataset updated
    Sep 20, 2022
    Authors
    Donovan Bangs
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Call Centre Queue Simulation

    A simulated call centre dataset and notebook, designed to be used as a classroom / tutorial dataset for Business and Operations Analytics.

    This notebook details the creation of simulated call centre logs over the course of one year. For this dataset we are imagining a business whose lines are open from 8:00am to 6:00pm, Monday to Friday. Four agents are on duty at any given time and each call takes an average of 5 minutes to resolve.

    The call centre manager is required to meet a performance target: 90% of calls must be answered within 1 minute. Lately, the performance has slipped. As the data analytics expert, you have been brought in to analyze their performance and make recommendations to bring the centre back to its target.

    The dataset records timestamps for when a call was placed, when it was answered, and when the call was completed. The total waiting and service times are calculated, as well as a logical for whether the call was answered within the performance standard.
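    As a rough illustration of how the waiting time, service time, and performance flag follow from the three timestamps, here is a minimal Python sketch. The function names, the example timestamps, and the reading of "within 1 minute" as "at most 60 seconds" are assumptions; they are not taken from the dataset or its notebook.

```python
from datetime import datetime, timedelta

STANDARD = timedelta(minutes=1)  # target from the description: answered within 1 minute

def summarize_call(placed, answered, completed):
    """Derive waiting time, service time, and the performance flag for one call."""
    wait = answered - placed
    service = completed - answered
    return {
        "wait_s": wait.total_seconds(),
        "service_s": service.total_seconds(),
        "meets_standard": wait <= STANDARD,  # "<=" is an assumption
    }

def service_level(calls):
    """Share of calls answered within the standard."""
    flags = [summarize_call(*call)["meets_standard"] for call in calls]
    return sum(flags) / len(flags)

# Two hypothetical calls: (placed, answered, completed).
fmt = "%Y-%m-%d %H:%M:%S"
rows = [
    ("2021-01-04 08:00:00", "2021-01-04 08:00:20", "2021-01-04 08:05:20"),
    ("2021-01-04 08:01:00", "2021-01-04 08:03:00", "2021-01-04 08:07:00"),
]
calls = [tuple(datetime.strptime(s, fmt) for s in row) for row in rows]
level = service_level(calls)
```

    Comparing `level` with the 90% target is the check the scenario asks the analyst to make.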

    Discrete-Event Simulation

    Discrete-Event Simulation allows us to model real calling behaviour with a few simple variables.

    • Arrival Rate
    • Service Rate
    • Number of Agents
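    To show how these three variables drive a queue, here is a minimal pure-Python sketch of a first-come, first-served multi-agent queue. It is not the simmer model used for this dataset: exponential inter-arrival and service times are an assumption, and the arrival rate of 0.6 calls per minute is a hypothetical value (the description gives only the 5-minute average call time and the four agents).

```python
import heapq
import random

def simulate_queue(arrival_rate, service_rate, n_agents, n_calls, seed=0):
    """Simulate a FIFO queue with several agents; rates are per minute.

    Returns the waiting time (in minutes) of each call, in arrival order.
    """
    rng = random.Random(seed)
    free_at = [0.0] * n_agents      # min-heap: time at which each agent is next free
    heapq.heapify(free_at)
    now = 0.0
    waits = []
    for _ in range(n_calls):
        now += rng.expovariate(arrival_rate)   # next call arrives
        earliest = heapq.heappop(free_at)      # agent who frees up first
        start = max(now, earliest)             # call waits if that agent is still busy
        waits.append(start - now)
        heapq.heappush(free_at, start + rng.expovariate(service_rate))
    return waits

def share_within(waits, limit):
    """Fraction of calls whose wait is at most `limit` minutes."""
    return sum(w <= limit for w in waits) / len(waits)

# Four agents, 5-minute average call; the arrival rate is hypothetical.
waits = simulate_queue(arrival_rate=0.6, service_rate=1 / 5, n_agents=4, n_calls=2000, seed=1)
level = share_within(waits, 1.0)
```

    Rerunning with a different arrival rate or number of agents shows how sensitive the share of calls answered within 1 minute is to load.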

    The simulations in this dataset are performed using the package simmer (Ucar et al., 2019). I encourage you to visit their website for complete details and fantastic tutorials on Discrete-Event Simulation.

    Ucar I, Smeets B, Azcorra A (2019). “simmer: Discrete-Event Simulation for R.” Journal of Statistical Software, 90(2), 1–30.

    For source code and simulation details, view the cross-posted GitHub notebook and Shiny app.

  4. Data from: Car simulator Dataset

    • kaggle.com
    zip
    Updated Aug 30, 2022
    Cite
    pratham saraf (2022). Car simulator Dataset [Dataset]. https://www.kaggle.com/datasets/prathamsaraf1389/car-simulator-dataset
    Available download formats: zip (2157210752 bytes)
    Dataset updated
    Aug 30, 2022
    Authors
    pratham saraf
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    "Some call it a marvel of technology. Some call it a fad. Self-driving cars are constantly making the headlines. These vehicles, designed to carry passengers from point A to B without a human manoeuvre, are promised to bring greater mobility, reduce street congestion and fuel consumption, and create safer roads."

    The data set contains images captured by the simulation using three cameras: front (mounted over the windshield), right (showing the view from the right side), and left (showing the view from the left side).

    It also contains data about the brake (whether it was used or not), throttle speed, and steering angle.

  5. Crash Data Simulator Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Crash Data Simulator Market Research Report 2033 [Dataset]. https://dataintelo.com/report/crash-data-simulator-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Crash Data Simulator Market Outlook



    According to our latest research, the global crash data simulator market size reached USD 1.42 billion in 2024, reflecting a robust demand for advanced simulation technologies across industries. The market is projected to grow at a CAGR of 8.7% from 2025 to 2033, reaching a forecasted value of USD 3.01 billion by 2033. This impressive growth is primarily driven by the increasing emphasis on safety standards and regulatory compliance in sectors such as automotive, aerospace, and defense, as well as the rapid integration of digital technologies into crash analysis processes. The ongoing advancements in simulation software and hardware, coupled with the rising need for cost-effective and accurate crash testing solutions, are further propelling the expansion of the crash data simulator market worldwide.
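    The growth figures quoted above can be sanity-checked with the usual compound-growth formula. The sketch below is only an arithmetic illustration; treating 2024 as the base year with nine compounding years to 2033 is an assumption about how the report counts periods.

```python
def project(value, cagr, years):
    """Compound `value` forward at annual rate `cagr` for `years` years."""
    return value * (1 + cagr) ** years

def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1

# Figures from the paragraph above: USD 1.42 billion in 2024, CAGR of 8.7%.
projected_2033 = project(1.42, 0.087, 9)   # close to the quoted USD 3.01 billion
rate = implied_cagr(1.42, 3.01, 9)         # close to the quoted 8.7%
```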




    A significant growth factor for the crash data simulator market is the automotive industry's relentless pursuit of enhanced vehicle safety. With the advent of autonomous vehicles, electric mobility, and stringent crashworthiness regulations, automotive manufacturers and OEMs are increasingly relying on simulation tools to optimize design and validate safety features before physical prototyping. Crash data simulators enable engineers to model diverse crash scenarios, analyze occupant safety, and predict structural deformations with high accuracy. This not only accelerates the product development lifecycle but also reduces costs associated with physical crash tests. As global road safety initiatives intensify and consumer awareness regarding vehicle safety rises, the adoption of crash data simulators in automotive R&D is expected to surge, fueling market growth throughout the forecast period.




    Another critical driver is the technological evolution in simulation software and hardware. Modern crash data simulators leverage advanced computational techniques, such as finite element analysis (FEA), machine learning, and high-performance computing (HPC), to deliver detailed insights into crash dynamics. The integration of cloud-based platforms and digital twins further enhances the scalability and flexibility of simulation environments, allowing for real-time collaboration and data sharing among stakeholders. These technological advancements not only improve the accuracy and reliability of crash simulations but also enable organizations to conduct virtual testing for a broader range of scenarios, including those that are difficult or costly to replicate physically. As industries increasingly embrace digital transformation, the demand for sophisticated and user-friendly crash data simulation solutions is poised to escalate.




    The expansion of crash data simulator applications beyond automotive, particularly in aerospace, defense, and railways, is also contributing to market growth. In aerospace and defense, crash data simulators are utilized to assess the structural integrity of aircraft and military vehicles under various impact conditions, ensuring compliance with rigorous safety standards. Similarly, the railway sector employs simulation tools to evaluate crashworthiness and passenger safety in train collisions and derailments. The versatility of crash data simulators in addressing diverse safety challenges across multiple industries underscores their growing significance as essential tools for risk assessment and regulatory compliance. As emerging economies invest in transportation infrastructure and safety modernization, the adoption of crash data simulators is anticipated to rise across new verticals.




    Regionally, North America continues to dominate the crash data simulator market, attributed to the presence of leading automotive and aerospace manufacturers, stringent safety regulations, and significant investments in R&D. However, the Asia Pacific region is witnessing the fastest growth, driven by rapid industrialization, expanding automotive production, and increasing focus on transportation safety. Europe also maintains a strong market position, supported by robust regulatory frameworks and technological innovation. The Middle East & Africa and Latin America are gradually emerging as promising markets, as governments and industries in these regions prioritize safety and infrastructure development. Overall, the global crash data simulator market is characterized by dynamic regional trends and a growing emphasis on digital simulation as a cornerstone of safety engineering.




  6. Simantha: Simulation for Manufacturing

    • catalog.data.gov
    • data.nist.gov
    • +1 more
    Updated Jul 29, 2022
    + more versions
    Cite
    National Institute of Standards and Technology (2022). Simantha: Simulation for Manufacturing [Dataset]. https://catalog.data.gov/dataset/simantha-simulation-for-manufacturing
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Simantha is a discrete event simulation package written in Python that is designed to model the behavior of discrete manufacturing systems. Specifically, it focuses on asynchronous production lines with finite buffers. It also provides functionality for modeling the degradation and maintenance of machines in these systems. Classes for five basic manufacturing objects are included: source, machine, buffer, sink, and maintainer. These objects can be defined by the user and configured in different ways to model various real-world manufacturing systems. The object classes are also designed to be extensible so that they can be used to model more complex processes.

    In addition to modeling the behavior of existing systems, Simantha is also intended for use with simulation-based optimization and planning applications. For instance, users may be interested in evaluating alternative maintenance policies for a particular system. Estimating the expected system performance under each candidate policy will require a large number of simulation replications when the system is subject to a high degree of stochasticity. Simantha therefore supports parallel simulation replications to make this procedure more efficient.

    GitHub repository: https://github.com/usnistgov/simantha

  7. Crash Data Simulator Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 6, 2025
    Cite
    Growth Market Reports (2025). Crash Data Simulator Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/crash-data-simulator-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 6, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Crash Data Simulator Market Outlook



    According to our latest research, the global Crash Data Simulator market size reached USD 1.28 billion in 2024, reflecting a robust and expanding industry. The market is expected to grow at a CAGR of 11.7% from 2025 to 2033, with the projected market size anticipated to reach USD 3.16 billion by 2033. This growth is primarily driven by the rising demand for advanced safety measures, stringent regulatory environments, and the increasing integration of simulation technologies across various industries. The adoption of crash data simulators is accelerating as organizations seek to enhance product safety, reduce development cycles, and comply with evolving global standards.




    One of the primary growth factors for the Crash Data Simulator market is the automotive industry’s relentless pursuit of vehicle safety and innovation. As automotive manufacturers face mounting pressure to meet rigorous crashworthiness standards and consumer expectations, they are increasingly leveraging crash data simulation tools to optimize vehicle design and validate safety features. These simulators enable comprehensive virtual testing, reducing the need for costly and time-consuming physical crash tests. Furthermore, the rise of electric and autonomous vehicles has intensified the necessity for sophisticated simulation environments, as these vehicles present unique safety challenges and require extensive validation before market release. The integration of artificial intelligence and machine learning into crash data simulators further enhances their predictive accuracy, creating additional impetus for market expansion.




    The aerospace and defense sectors are also significant contributors to the growth of the Crash Data Simulator market. In aerospace, the demand for lightweight materials and complex structural designs necessitates advanced simulation tools to assess crashworthiness and survivability under extreme conditions. Defense organizations utilize crash data simulators to evaluate the safety of military vehicles, aircraft, and personnel equipment, ensuring compliance with strict regulatory and operational requirements. The increasing frequency of joint research initiatives between government agencies and private enterprises is fostering technological advancements in simulation software and hardware, further propelling market growth. Additionally, the adoption of crash data simulators in industrial and research and development applications is expanding, as organizations across sectors recognize the value of predictive analytics in product development and quality assurance.




    Another critical growth driver is the proliferation of cloud-based deployment models and scalable simulation services. As organizations strive for flexibility, collaboration, and cost efficiency, cloud-based crash data simulators are gaining traction. These platforms enable remote access, real-time data sharing, and integration with other enterprise systems, facilitating seamless workflow management and cross-functional collaboration. The shift towards cloud-based solutions is particularly pronounced among research institutes and testing laboratories, which benefit from reduced infrastructure costs and the ability to rapidly scale simulation capabilities based on project requirements. Moreover, advancements in high-performance computing and big data analytics are empowering organizations to conduct more complex simulations, analyze vast datasets, and derive actionable insights to enhance safety and performance outcomes.




    From a regional perspective, North America continues to dominate the Crash Data Simulator market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The United States, in particular, is a hub for automotive innovation, aerospace engineering, and regulatory oversight, driving substantial investments in crash simulation technologies. Europe’s strong emphasis on vehicle safety standards and sustainability initiatives is fueling demand for advanced simulators, while Asia Pacific is emerging as a high-growth region due to rapid industrialization, expanding automotive manufacturing, and increasing government initiatives to enhance transportation safety. Latin America and the Middle East & Africa are also witnessing steady adoption, supported by growing investments in infrastructure and regulatory reforms aimed at improving safety standards across industries.



  8. ARC Code TI: Multi-Fidelity Simulator (MFSim) - Dataset - NASA Open Data...

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). ARC Code TI: Multi-Fidelity Simulator (MFSim) - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/arc-code-ti-multi-fidelity-simulator-mfsim
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Multi-Fidelity Simulator (MFSim) is a pluggable framework for creating an air traffic flow simulator at multiple levels of fidelity. The framework is designed to allow low-fidelity simulations of the entire US airspace to be completed very quickly (on the order of seconds). It also allows higher-fidelity plugins to be added, so that higher-fidelity simulations can run in certain regions of the airspace concurrently with the low-fidelity simulation of the full airspace.

  9. C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Sep 22, 2010
    + more versions
    Cite
    nasa.gov (2010). C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/c-mapss-aircraft-engine-simulator-data
    Dataset updated
    Sep 22, 2010
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    SPECIAL NOTE: C-MAPSS and C-MAPSS40K ARE CURRENTLY UNAVAILABLE FOR DOWNLOAD. Glenn Research Center management is reviewing the availability requirements for these software packages. We are working with Center management to get the review completed and issues resolved in a timely manner. We will post updates on this website when the issues are resolved. We apologize for any inconvenience. Please contact Jonathan Litt, jonathan.s.litt@nasa.gov, if you have any questions in the meantime.

    Subject Area: Engine Health

    Description: This data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation' and it is a tool for the simulation of realistic large commercial turbofan engine data. Each flight is a combination of a series of flight conditions with a reasonable linear transition period to allow the engine to change from one flight condition to the next. The flight conditions are arranged to cover a typical ascent from sea level to 35K ft and descent back down to sea level. The fault was injected at a given time in one of the flights and persists throughout the remaining flights, effectively increasing the age of the engine. The intent is to identify which flight and when in the flight the fault occurred.

    How Data Was Acquired: The data provided is from a high fidelity system level engine simulation designed to simulate nominal and fault engine degradation over a series of flights. The simulated data was created with a Matlab Simulink tool called C-MAPSS.

    Sample Rates and Parameter Description: The flights are full flight recordings sampled at 1 Hz and consist of 30 engine and flight condition parameters. Each flight contains 7 unique flight conditions for an approximately 90 min flight including ascent to cruise at 35K ft and descent back to sea level. The parameters for each flight are the flight conditions, health indicators, measurement temperatures and pressure measurements.

    Faults/Anomalies: Faults arose from the inlet engine fan, the low pressure compressor, the high pressure compressor, the high pressure turbine and the low pressure turbine.

  10. Data and code for: Generation and applications of simulated datasets to...

    • data.niaid.nih.gov
    • datadryad.org
    • +1 more
    zip
    Updated Mar 10, 2023
    Cite
    Matthew Silk; Olivier Gimenez (2023). Data and code for: Generation and applications of simulated datasets to integrate social network and demographic analyses [Dataset]. http://doi.org/10.5061/dryad.m0cfxpp7s
    Available download formats: zip
    Dataset updated
    Mar 10, 2023
    Dataset provided by
    Centre d'Écologie Fonctionnelle et Évolutive
    Authors
    Matthew Silk; Olivier Gimenez
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies.

    Methods
    The dataset and code stored here are for Case Studies 1 and 2 in the paper. Datasets were generated using simulations in R. Here we provide 1) the R code used for the simulations; 2) the simulation outputs (as .RDS files); and 3) the R code to analyse simulation outputs and generate the tables and figures in the paper.

  11. data-drift-simulation-dataset

    • huggingface.co
    Cite
    Sara Han Díaz, data-drift-simulation-dataset [Dataset]. https://huggingface.co/datasets/sdiazlor/data-drift-simulation-dataset
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Authors
    Sara Han Díaz
    Description

    sdiazlor/data-drift-simulation-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. Simulation Data & R scripts for: "Introducing recurrent events analyses to...

    • data.niaid.nih.gov
    • doi.org
    • +1 more
    Updated Apr 29, 2024
    Cite
    Ferry, Nicolas (2024). Simulation Data & R scripts for: "Introducing recurrent events analyses to assess species interactions based on camera trap data: a comparison with time-to-first-event approaches" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11085005
    Dataset updated
    Apr 29, 2024
    Dataset provided by
    Department of National Park Monitoring and Animal Management, Bavarian Forest National Park
    Authors
    Ferry, Nicolas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Files descriptions:

    All csv files contain results from the different models (PAMM, AARs, linear models, MRPPs) for each iteration of the simulation, with one row per iteration. "results_perfect_detection.csv" refers to the results from the first simulation part with all the observations. "results_imperfect_detection.csv" refers to the results from the first simulation part with randomly thinned observations to mimic imperfect detection.

    • ID_run: identifier of the iteration (N: number of sites, D_AB: duration of the effect of A on B, D_BA: duration of the effect of B on A, AB: effect of A on B, BA: effect of B on A, Se: seed number of the iteration).
    • PAMM30: p-value of the PAMM run on the 30-day survey.
    • PAMM7: p-value of the PAMM run on the 7-day survey.
    • AAR1: ratio value for the Avoidance-Attraction-Ratio calculating AB/BA.
    • AAR2: ratio value for the Avoidance-Attraction-Ratio calculating BAB/BB.
    • Harmsen_P: p-value from the linear model with interaction Species1*Species2 from Harmsen et al. (2009).
    • Niedballa_P: p-value from the linear model comparing AB to BA (Niedballa et al. 2021).
    • Karanth_permA: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species A (Karanth et al. 2017).
    • MurphyAB_permA: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
    • MurphyBA_permA: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
    • Karanth_permB: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species B (Karanth et al. 2017).
    • MurphyAB_permB: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).
    • MurphyBA_permB: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).

    "results_int_dir_perf_det.csv" refers to the results from the second simulation part, with all the observations. "results_int_dir_imperf_det.csv" refers to the results from the second simulation part, with randomly thinned observations to mimic imperfect detection.

    • ID_run: identifier of the iteration (N: number of sites, D_AB: duration of the effect of A on B, D_BA: duration of the effect of B on A, AB: effect of A on B, BA: effect of B on A, Se: seed number of the iteration).
    • p_pamm7_AB: p-value of the PAMM run on the 7-day survey testing for the effect of A on B.
    • p_pamm7_AB: p-value of the PAMM run on the 7-day survey testing for the effect of B on A.
    • AAR1: ratio value for the Avoidance-Attraction-Ratio calculating AB/BA.
    • AAR2_BAB: ratio value for the Avoidance-Attraction-Ratio calculating BAB/BB.
    • AAR2_ABA: ratio value for the Avoidance-Attraction-Ratio calculating ABA/AA.
    • Harmsen_P: p-value from the linear model with interaction Species1*Species2 from Harmsen et al. (2009).
    • Niedballa_P: p-value from the linear model comparing AB to BA (Niedballa et al. 2021).
    • Karanth_permA: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species A (Karanth et al. 2017).
    • MurphyAB_permA: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
    • MurphyBA_permA: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
    • Karanth_permB: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species B (Karanth et al. 2017).
    • MurphyAB_permB: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).
    • MurphyBA_permB: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).

    Script files description:

    • 1_Functions: R script containing the functions:
        - MRPP from Karanth et al. (2017), adapted here for time efficiency.
        - MRPP from Murphy et al. (2021), adapted here for time efficiency.
        - Version of the ct_to_recurrent() function from the recurrent package, adapted to run in parallel on the simulation datasets.
        - The simulation() function used to simulate observations of two species with reciprocal effects on each other.
    • 2_Simulations: R script containing the parameter definitions for all iterations (for the two parts of the simulations), the simulation parallelization and the random thinning mimicking imperfect detection.
    • 3_Approaches comparison: R script containing the fit of the different models tested on the simulated data.
    • 3_1_Real data comparison: R script containing the fit of the different models tested on the real data example from Murphy et al. 2021.
    • 4_Graphs: R script containing the code for plotting results from the simulation part and appendices.
    • 5_1_Appendix - Check for similarity between codes for Karanth et al 2017 method: R script containing the Karanth et al. (2017) and Murphy et al. (2021) code lines, the version adapted for time efficiency, and a comparison to verify similarity of results.
    • 5_2_Appendix - Multi-response procedure permutation difference: R script containing R code to test for differences between the MRPP approaches according to the species on which permutations are done.

  13. Data from development and evaluation of SASCA-s: Scalable Agent-based...

    • databank.illinois.edu
    Updated Aug 16, 2025
    Minhyuk Park; João AC Lamy; Esther CC Rodrigues; Felipe Mariano Ferreira; The-Anh Vu-Le; Tandy Warnow; George Chacko (2025). Data from development and evaluation of SASCA-s: Scalable Agent-based Simulator for Citation Analysis with simulation [Dataset]. http://doi.org/10.13012/B2IDB-3926377_V1
    Explore at:
    Dataset updated
    Aug 16, 2025
    Authors
    Minhyuk Park; João AC Lamy; Esther CC Rodrigues; Felipe Mariano Ferreira; The-Anh Vu-Le; Tandy Warnow; George Chacko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    U.S. National Science Foundation (NSF)
    Illinois:Insper Partnership
    Description

    The data within consist of compressed output files in the form of edgelists (.edgelist.gz) and nodelists (.aux.parquet) from large citation network simulations using an agent-based model. The code and instructions are available at: https://github.com/illinois-or-research-analytics/SASCA. In addition, we provide a distribution of citation frequencies drawn from a random sample of PubMed journal articles (pooled_50k_pubmed_unique.csv) and a table of recencies, that is, the frequency with which citations are made to the previous year, the year before that, and so on (recency_probs_percent_stahl_filled.csv). A manuscript describing the SASCA-s simulator has been submitted for review and will be referenced in a future version of this data repository if it is accepted. The prefixes sj and er refer to the real-world network and the Erdos-Renyi random graph, respectively, that were used to initiate simulations. These 'seed' networks are available from the GitHub site referenced above.

  14. Simulation and Test Data Management Market Analysis - Size, Share, and...

    • futuremarketinsights.com
    html, pdf
    Updated Jun 3, 2025
    Sudip Saha (2025). Simulation and Test Data Management Market Analysis - Size, Share, and Forecast 2025 to 2035 [Dataset]. https://www.futuremarketinsights.com/reports/simulation-and-test-data-management-market
    Explore at:
    Available download formats: pdf, html
    Dataset updated
    Jun 3, 2025
    Authors
    Sudip Saha
    License

    https://www.futuremarketinsights.com/privacy-policy

    Time period covered
    2025 - 2035
    Area covered
    Worldwide
    Description

    The global simulation and test data management market is expected to witness substantial growth, with its valuation projected to increase from approximately USD 905.2 million in 2025 to about USD 3.24 billion by 2035. This corresponds to a CAGR of 12.1% over the forecast period.

    Attributes                                  Description
    Industry Size (2025E)                       USD 905.2 million
    Industry Size (2035F)                       USD 3.24 billion
    CAGR (2025 to 2035)                         12.1%

    Category-wise Insights

    Segment                                     CAGR (2025 to 2035)
    Aerospace & Defense (Industry)              14.8%

    Segment                                     Value Share (2025)
    Test Data Simulation Software (Solution)    42.3%

  15. Health Data Simulator (patients)

    • kaggle.com
    zip
    Updated Sep 19, 2024
    + more versions
    Farahnaz Amini (2024). Health Data Simulator (patients) [Dataset]. https://www.kaggle.com/datasets/farahnazamini/health-data-simulator-patients/data
    Explore at:
    Available download formats: zip (33583 bytes)
    Dataset updated
    Sep 19, 2024
    Authors
    Farahnaz Amini
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Farahnaz Amini

    Released under CC0: Public Domain


  16. Dataset for: An integrated simulator and data set that combines grasping and...

    • dataverse.scholarsportal.info
    • search.dataone.org
    bin, pdf, txt
    Updated Oct 22, 2021
    + more versions
    Scholars Portal Dataverse (2021). Dataset for: An integrated simulator and data set that combines grasping and vision for deep learning [Dataset]. http://doi.org/10.5683/SP/KL5P5S
    Explore at:
    Available download formats: bin (2088507492), txt (15933), pdf (32039), bin (1048221878), txt (10061)
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    Scholars Portal Dataverse
    Dataset funded by
    Canada Foundation for Innovation (CFI)
    Natural Sciences and Engineering Research Council of Canada (NSERC)
    Description

    The aim is to develop a simulation that collects both visual information and grasp information about different objects using a multi-fingered hand. These sources of data can be used in the future to learn integrated object-action grasp representations.

  17. Data from: Simulated dataset

    • figshare.le.ac.uk
    zip
    Updated Feb 20, 2024
    Rodrigo Quian Quiroga (2024). Simulated dataset [Dataset]. http://doi.org/10.25392/leicester.data.11897595.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    University of Leicester
    Authors
    Rodrigo Quian Quiroga
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A simulated dataset that has been widely used in the evaluation of spike-sorting algorithms. Synthetic datasets are generated by adding spike waveform templates to background noise of various levels; this download contains several datasets, generated using different spike templates.

    Use wave_clus (see www2.le.ac.uk/centres/csn/software/wave-clus) for spike detection and sorting of this data. Wave_clus is a fast and unsupervised algorithm for spike detection and sorting, compatible with Windows, Mac or Linux operating systems.

  18. Data from: Simulated Well Production Data using a Transient Well Model and a...

    • data.niaid.nih.gov
    Updated Nov 17, 2023
    AlHammad, Yousef K. (2023). Simulated Well Production Data using a Transient Well Model and a Developed Simulator [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8128888
    Explore at:
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    King Abdullah University of Science and Technology
    Authors
    AlHammad, Yousef K.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a simulated dataset of transient well production data. This dataset was used in my Master's thesis at King Abdullah University of Science and Technology (KAUST), and it is shared for academic use and research work.

    The dataset has 100 wells simulated at time steps of 0.2 hours for an entire year. This gives 43,800 observations per well, and a grand total of 4,380,000 observations in the entire dataset. The resulting production data is then perturbed with systemic and random gauge errors to better simulate real-world gauge readings.
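
    The observation counts quoted above follow from simple arithmetic. The sketch below checks them in Python, assuming a 365-day (non-leap) year, which the description does not state explicitly; the variable names are illustrative and do not come from the dataset or its simulator.

```python
# Sanity check of the observation counts quoted in the description.
HOURS_PER_YEAR = 365 * 24   # assumption: a non-leap year
STEP_HOURS = 0.2            # time step stated in the description
N_WELLS = 100               # number of wells stated in the description

# round() guards against floating-point error in the division by 0.2
obs_per_well = round(HOURS_PER_YEAR / STEP_HOURS)
total_obs = N_WELLS * obs_per_well

print(obs_per_well)  # 43800
print(total_obs)     # 4380000
```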

    The simulator code used to generate this dataset can be found at: https://github.com/ykh-1992/TransientNodalAnalysis.jl

    The data consists of three files:
    • "wells.csv": This file details the input parameters for each simulated well.
    • "data.zip": This file houses an 850 MB "data.csv" that includes the simulated well production data.
    • "auxiliary.csv": This file includes information related to the simulation run.

  19. Third Generation Simulation Data (TGSIM) I-395 Trajectories

    • catalog.data.gov
    • data.virginia.gov
    • +2more
    Updated Aug 18, 2025
    Federal Highway Administration (2025). Third Generation Simulation Data (TGSIM) I-395 Trajectories [Dataset]. https://catalog.data.gov/dataset/third-generation-simulation-data-tgsim-i-395-trajectories
    Explore at:
    Dataset updated
    Aug 18, 2025
    Dataset provided by
    Federal Highway Administration (https://highways.dot.gov/)
    Description

    The main dataset is a 232 MB file of trajectory data (I395-final.csv) that contains position, speed, and acceleration data for non-automated passenger cars, trucks, buses, and automated vehicles on an expressway within an urban environment. Supporting files include an aerial reference image (I395_ref_image.png) and a list of polygon boundaries (I395_boundaries.csv) and associated images (I395_lane-1, I395_lane-2, …, I395_lane-6) stored in a folder titled “Annotation on Regions.zip” to map physical roadway segments to the numerical lane IDs referenced in the trajectory dataset.

    In the boundary file, columns “x1” to “x5” represent the horizontal pixel values in the reference image, with “x1” being the leftmost boundary line and “x5” being the rightmost boundary line, while the column "y" represents corresponding vertical pixel values. The origin point of the reference image is located at the top left corner. The dataset defines five lanes with five boundaries. Lane -6 corresponds to the area to the left of “x1”. Lane -5 corresponds to the area between “x1” and “x2”, and so forth to the rightmost lane, which is defined by the area to the right of “x5” (Lane -2). Lane -1 refers to vehicles that go onto the shoulder of the merging lane (Lane -2), which are manually separated by watching the videos.

    This dataset was collected as part of the Third Generation Simulation Data (TGSIM): A Closer Look at the Impacts of Automated Driving Systems on Human Behavior project. During the project, six trajectory datasets capable of characterizing human-automated vehicle interactions under a diverse set of scenarios in highway and city environments were collected and processed. For more information, see the project report found here: https://rosap.ntl.bts.gov/view/dot/74647.

    This dataset, which was one of the six collected as part of the TGSIM project, contains data collected from six 4K cameras mounted on tripods, positioned on three overpasses along I-395 in Washington, D.C. The cameras captured distinct segments of the highway, and their combined overlapping and non-overlapping footage resulted in a continuous trajectory for the entire section covering 0.5 km. This section covers a major weaving/mandatory lane-changing between L'Enfant Plaza and 4th Street SW, with three lanes in the eastbound direction and a major on-ramp on the left side. In addition to the on-ramp, the section covers an off-ramp on the right side. The expressway includes one diverging lane at the beginning of the section on the right side and one merging lane in the middle of the section on the left side. For the purposes of data extraction, the shoulder of the merging lane is also considered a travel lane since some vehicles illegally use it as an extended on-ramp to pass other drivers (see I395_ref_image.png for details).

    The cameras captured continuous footage during the morning rush hour (8:30 AM-10:30 AM ET) on a sunny day. During this period, vehicles equipped with SAE Level 2 automation were deployed to travel through the designated section to capture the impact of SAE Level 2-equipped vehicles on adjacent vehicles and their behavior in congested areas, particularly in complex merging sections. These vehicles are indicated in the dataset.

    As part of this dataset, the following files were provided:
    • I395-final.csv contains the numerical data to be used for analysis, which includes vehicle-level trajectory data at every 0.1 second. Vehicle type, width, and length are provided with instantaneous location, speed, and acceleration data. All distance measurements (width, length, location) were converted from pixels to meters using the conversion factor 1 pixel = 0.3 meter.
    • I395_ref_image.png is the aerial reference image that defines the geographic region and the associated roadway segments.
    • I395_boundaries.csv contains the coordinates that define the roadway segments (n=X). The columns "x1" to "x5" represent the horizontal pi

  20. Call Center Simulated Data

    • kaggle.com
    zip
    Updated Mar 28, 2023
    Pablo Sebastián Campos Ortiz (2023). Call Center Simulated Data [Dataset]. https://www.kaggle.com/datasets/scss17/call-center-simulated-data
    Explore at:
    Available download formats: zip (3098 bytes)
    Dataset updated
    Mar 28, 2023
    Authors
    Pablo Sebastián Campos Ortiz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The aim of this data set is to be used along with my notebook Linear Regression Notes which provides a guideline for applying correlation analysis and linear regression models from a statistical approach.

    A fictional call center is interested in knowing the relationship between the number of personnel and some variables that measure their performance such as average answer time, average calls per hour, and average time per call. Data were simulated to represent 200 shifts.

U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set

Simulation Data Set

Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description

These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code Abstract
We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description
“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)
Required R packages:
• For running “CWVS_LMC.txt”:
  • msm: Sampling from the truncated normal distribution
  • mnormt: Sampling from the multivariate normal distribution
  • BayesLogit: Sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
  • plotrix: Plotting the posterior means and credible intervals

Instructions for Use
Reproducibility (Mandatory)
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.
How to use the information:
• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

Description
Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
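
The week-by-week standardization described above (subtract that week's median, divide by that week's IQR) can be illustrated with a short sketch. This is not the authors' code, which is written in R and not reproduced here; it is a minimal Python illustration under stated assumptions. The function name `standardize_by_week`, the toy exposure values, and the choice of quantile definition (`method="inclusive"`) are all assumptions, since the description does not say how the quartiles were computed.

```python
import statistics

def standardize_by_week(exposures):
    """Standardize each week (column) as (value - median) / IQR.

    exposures: list of rows, one per individual, each a list of weekly values.
    Hypothetical helper for illustration; not part of the released code.
    """
    n_weeks = len(exposures[0])
    result = [[0.0] * n_weeks for _ in exposures]
    for w in range(n_weeks):
        column = [row[w] for row in exposures]
        median = statistics.median(column)
        # Quartile definition is an assumption; the source does not specify one.
        q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1  # would be zero for a constant week; not handled here
        for i, value in enumerate(column):
            result[i][w] = (value - median) / iqr
    return result

# Made-up toy values: 5 individuals, 2 exposure weeks.
toy = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
z = standardize_by_week(toy)
print([row[0] for row in z])  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Because the medians and IQRs are withheld, the released matrix z cannot be mapped back to raw exposure levels, which is the privacy property the description relies on.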
