License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Introduction: A required step for presenting the results of clinical studies is the declaration of participants' demographic and baseline characteristics, as required by FDAAA 801. The common workflow for this task is to export the clinical data from the electronic data capture system in use and import it into statistical software such as SAS or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item. These expenditures may become an obstacle for small studies. The objective of this work is to design, implement, and evaluate an open source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data.

Methods: The system requires clinical data in the CDISC Operational Data Model (ODM) format. After the file is uploaded, its syntax and the data type conformity of the collected data are validated. The completeness of the study data is determined, and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies were used to evaluate the application's performance and functionality.

Results: The system is implemented as an open source web application (available at https://odmanalysis.uni-muenster.de) and is also provided as a Docker image, which enables easy distribution and installation on local systems. Study data is stored in the application only while the calculations are performed, which is compliant with data protection requirements. Analysis times are below half an hour, even for larger studies with over 6,000 subjects.

Discussion: Medical experts have confirmed the usefulness of this application for gaining an overview of their collected study data for monitoring purposes and for generating descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analyses of statisticians, but it can serve as a starting point for their examination and reporting.
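To illustrate the kind of completeness check the Methods section describes, here is a minimal Python sketch that counts, for each item, how many subjects have a recorded value in a CDISC ODM file. The element and attribute names follow the public ODM 1.3 schema; the file name and the exact traversal are assumptions for the example, not the application's actual code.

```python
# Minimal sketch: per-item completeness in a CDISC ODM file.
# Assumes the standard ODM 1.3 namespace; the file name is illustrative.
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

tree = ET.parse("study.odm.xml")  # hypothetical input file
root = tree.getroot()

subjects = root.findall(".//odm:SubjectData", NS)
counts = Counter()
for subject in subjects:
    # Count each item at most once per subject.
    item_oids = {item.get("ItemOID")
                 for item in subject.findall(".//odm:ItemData", NS)
                 if item.get("Value") not in (None, "")}
    counts.update(item_oids)

n = len(subjects)
for oid, k in sorted(counts.items()):
    print(f"{oid}: {k}/{n} subjects ({100 * k / n:.1f}% complete)")
```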
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This submission includes raster datasets for each layer of evidence used in the weights-of-evidence analysis, as well as for the deterministic play fairway analysis (PFA). The raster datasets include data representative of heat, permeability, and groundwater. Additionally, the final deterministic PFA model is provided along with a certainty model. All of these datasets are best used with an ArcGIS software package, specifically the Spatial Data Modeler.
This dataset was created by Sanskruti Panda
Released under Other (specified in description)
All data associated with this entry are simulations of storm surge at three case study locations. The simulated water heights, winds, and other physical parameters were used in the analysis to construct all the figures presented herein. This dataset is associated with the following publication: Liang, M., Z. Dong, S. Julius, J. Neal, and J. Yang. Storm Surge Projection for Objective-based Risk Management for Climate Change Adaptation along the US Atlantic Coast. Journal of Water Resources Planning and Management, American Society of Civil Engineers (ASCE), Reston, VA, USA, 150(6): e04024014-1 (2024).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Dataset accompanying the Synthetic Daisies post "Are the Worst Performers the Best Predictors?" and the technical paper (on viXra) "From Worst to Most Variable? Only the worst performers may be the most informative".
The Local Analysis and Prediction System (LAPS), run by NOAA's Forecast Systems Laboratory (FSL), combines numerous observed meteorological data sets into a collection of atmospheric analyses.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
Empirical researchers studying party systems often struggle with the question of how to count parties. Indexes of party system fragmentation used to address this problem (e.g., the effective number of parties) have a fundamental shortcoming: since the same index value may represent very different party systems, they are impossible to interpret and may lead to erroneous inference. We offer a novel approach to this problem: instead of focusing on index measures, we develop a model that predicts the entire distribution of party vote-shares and thus does not require any index measure. First, a model of party counts predicts the number of parties. Second, a set of multivariate t models predicts party vote-shares. Compared to the standard index-based approach, our approach helps to avoid inferential errors and yields a much richer set of insights into the variation of party systems. For illustration, we apply the model to two datasets. Our analyses call into question the conclusions one would arrive at with the index-based approach. Publicly available software is provided to implement the proposed model.
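A toy generative sketch of the two-stage idea follows: a count model draws the number of parties, and a multivariate t draw is mapped to vote-shares on the simplex. Every distributional choice here (Poisson counts, softmax mapping, parameter values) is an illustrative assumption, not the authors' actual specification.

```python
# Toy two-stage sketch (illustrative assumptions, not the authors'
# exact model): a count model for the number of parties, then a
# multivariate t draw mapped to vote-shares that sum to one.
import numpy as np

rng = np.random.default_rng(42)

def sample_party_system(mean_parties=4.0, df=5.0, scale=1.0):
    # Stage 1: number of parties from a count model (Poisson here,
    # truncated so at least two parties compete).
    n_parties = max(2, rng.poisson(mean_parties))
    # Stage 2: latent support from a multivariate t (a normal draw
    # divided by a chi-square mixing variable), then softmax to
    # obtain vote-shares on the simplex.
    z = rng.normal(0.0, scale, size=n_parties)
    w = rng.chisquare(df) / df
    latent = z / np.sqrt(w)
    shares = np.exp(latent) / np.exp(latent).sum()
    return np.sort(shares)[::-1]  # largest party first

print(sample_party_system())
```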
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains tabular files with information about the usage preferences of speakers of Maltese English with regard to 63 pairs of lexical expressions. These pairs (e.g. truck-lorry or realization-realisation) are known to differ in usage between BrE and AmE (cf. Algeo 2006). The data were elicited with a questionnaire that asks informants to indicate whether they always use one of the two variants, prefer one over the other, have no preference, or do not use either expression (see Krug and Sell 2013 for methodological details). Usage preferences were therefore measured on a symmetric 5-point ordinal scale. Data were collected between 2008 and 2018, as part of a larger research project on lexical and grammatical variation in settings where English is spoken as a native, second, or foreign language. The current dataset, which we use for our methodological study on ordinal data modeling strategies, consists of a subset of 500 speakers that is roughly balanced on year of birth.

Abstract of the related publication: In empirical work, ordinal variables are typically analyzed using means based on numeric scores assigned to categories. While this strategy has met with justified criticism in the methodological literature, it also generates simple and informative data summaries, a standard often not met by statistically more adequate procedures. Motivated by a survey of how ordered variables are dealt with in language research, we draw attention to an un(der)used latent-variable approach to ordinal data modeling, which constitutes an alternative perspective on the most widely used form of ordered regression, the cumulative model. Since the latent-variable approach does not feature in any of the studies in our survey, we believe it is worthwhile to promote its benefits. To this end, we draw on questionnaire-based preference ratings by speakers of Maltese English, who indicated on a 5-point scale which of two synonymous expressions (e.g. package-parcel) they (tend to) use. We demonstrate that a latent-variable formulation of the cumulative model affords nuanced and interpretable data summaries that can be visualized effectively, while at the same time avoiding limitations inherent in mean response models (e.g. distortions induced by floor and ceiling effects). The online supplementary materials include a tutorial for its implementation in R.
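To make the latent-variable formulation of the cumulative model concrete: a continuous latent preference eta is cut into K ordered categories by K-1 thresholds, with P(Y <= k) = F(tau_k - eta). The Python sketch below (illustrative only; the publication's own tutorial is in R) converts a latent value and a set of thresholds into the probabilities of the five response categories, assuming a standard logistic latent distribution.

```python
# Sketch of the latent-variable view of the cumulative model:
# a latent preference eta is cut into K ordered categories by
# K-1 thresholds. Threshold values below are illustrative.
import numpy as np

def category_probs(eta, thresholds):
    """P(Y = k) for each of the K = len(thresholds) + 1 categories,
    assuming a standard logistic latent distribution."""
    cdf = 1.0 / (1.0 + np.exp(-(np.asarray(thresholds) - eta)))
    cum = np.concatenate(([0.0], cdf, [1.0]))
    return np.diff(cum)  # P(Y = k) = F(tau_k - eta) - F(tau_{k-1} - eta)

# Five response categories ("always A" ... "always B") need four
# thresholds; eta > 0 shifts mass toward the higher categories.
thresholds = [-2.0, -0.7, 0.7, 2.0]
print(category_probs(eta=0.0, thresholds=thresholds))
print(category_probs(eta=1.5, thresholds=thresholds))
```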
POLARIS_Analysis_ER2_Data contains the modeled trajectories and meteorological data along the flight path of the ER-2 aircraft collected during the Photochemistry of Ozone Loss in the Arctic Region in Summer (POLARIS) campaign. Data collection for this product is complete.

The POLARIS mission was a joint effort of NASA and NOAA that occurred in 1997 and was designed to expand on the photochemical and transport processes that cause the summer polar decreases in stratospheric ozone. The POLARIS campaign had the overarching goal of better understanding the change of stratospheric ozone levels from very high concentrations in the spring to very low concentrations in the autumn. The NASA ER-2 high-altitude aircraft was the primary platform deployed, along with balloons, satellites, and ground sites. The POLARIS campaign was based in Fairbanks, Alaska, with some flights conducted from California and Hawaii. Flights were conducted between the summer solstice and the fall equinox at mid- to high latitudes. The data collected included meteorological variables; long-lived tracers relevant to summertime transport questions; selected species in the reactive nitrogen (NOy), halogen (Cly), and hydrogen (HOx) reservoirs; and aerosols. More specifically, the ER-2 utilized various techniques/instruments, including laser absorption, gas chromatography, non-dispersive IR, UV photometry, catalysis, and IR absorption. These techniques/instruments were used to collect data on N2O, CH4, CH3CCl3, CO2, O3, H2O, and NOy. Ground stations were responsible for collecting SO2 and O3, while balloons recorded pressure, temperature, wind speed, and wind direction. Satellites partnered with these platforms collected meteorological data and lidar imagery. The observations were used to constrain stratospheric computer models to evaluate ozone changes due to chemistry and transport.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This is the replication file for 'Bayesian Sensitivity Analysis for Unmeasured Confounding in Causal Panel Data Models', including the package that implements the proposed method as well as replication code for the Monte Carlo studies, a simulated example, and the empirical analysis.
STRAT_Analysis_ER2_Data contains the modeled trajectories and meteorological data along the flight path of the ER-2 aircraft collected during the Stratospheric Tracers of Atmospheric Transport (STRAT) campaign. Data collection for this product is complete.

The STRAT campaign was a field campaign conducted by NASA from May 1995 to February 1996. The primary goal of STRAT was to measure long-lived tracers and their changes as functions of altitude, latitude, and season. These measurements were taken to help determine rates of global-scale transport and future distributions of high-speed civil transport (HSCT) exhaust emitted into the lower atmosphere. STRAT had four main objectives: defining the rate of transport of trace gases between the stratosphere and the troposphere (i.e., HSCT exhaust emissions); improving the understanding of dynamical coupling rates for the transport of trace gases between tropical regions and the higher latitudes and lower altitudes where most ozone resides; improving the understanding of chemistry in the upper troposphere and lower stratosphere; and providing data sets for testing the two-dimensional and three-dimensional models used in assessments of the impacts of stratospheric aviation. To accomplish these objectives, the STRAT Science Team conducted various surface-based remote sensing and in-situ measurements. NASA flew the ER-2 aircraft, along with balloons such as ozonesondes and radiosondes, just below the tropopause in the Northern Hemisphere to collect data. NASA also utilized satellite imagery, theoretical models, and ground sites. The ER-2 collected data on HOx, NOy, CO2, ozone, water vapor, and temperature. The ER-2 also collected in-situ stratospheric measurements of N2O, CH4, CO, HCl, and NO using the Aircraft Laser Infrared Absorption Spectrometer (ALIAS). Ozonesondes and radiosondes were deployed to collect data on CO2, NO/NOy, air temperature, pressure, and 3D wind. These balloons also took in-situ measurements of N2O, CFC-11, CH4, CO, HCl, and NO2 using ALIAS. Ground stations were responsible for measuring O3, the ozone mixing ratio, pressure, and temperature. Satellites took infrared images of the atmosphere to aid in completing the STRAT objectives. Pressure and temperature models were created to help plan the mission.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
A fully synthetic dataset simulating real-world medical billing scenarios, including claim status, denials, team allocation, and AR follow-up logic.
This dataset represents a synthetic Accounts Receivable (AR) data model for medical billing, created using realistic healthcare revenue cycle management (RCM) workflows. It is designed for data analysis, machine learning modeling, automation testing, and process simulation in the healthcare billing domain.
The dataset includes realistic business logic, mimicking the actual process of claim submission, denial management, follow-ups, and payment tracking. This is especially useful for:
✔ Medical billing training
✔ Predictive modeling (claim outcomes, denial prediction, payment forecasting)
✔ RCM process automation and AI research
✔ Data visualization and dashboard creation
✅ Patient & Claim Information:
Visit ID#: unique alphanumeric ID (XXXXXZXXXXXX format)
Aging Days: Today - DOS
Aging Buckets: 0-30, 31-60, 61-90, 91-120, 120+

✅ Claim Status & Denial Logic:
Status Code: actual denial reason (e.g. Dx inconsistent with CPT)
Action Code: next step (e.g. Need Coding Assistance)
Team Allocation: based on denial type (Coding Team, Billing Team, Payment Team)

✅ Realistic Denial Scenarios Covered:
✅ Other Important Columns:
| Column Name | Description |
|---|---|
| Client | Name of the client/provider |
| State | US State where service provided |
| Visit ID# | Unique alphanumeric ID (XXXXXZXXXXXX) |
| Patient Name | Patient’s full name |
| DOS | Date of Service (MM/DD/YYYY) |
| Aging Days | Days from DOS to today |
| Aging Bucket | Aging category |
| Claim Amount | Original claim billed |
| Paid Amount | Amount paid so far |
| Balance | Remaining balance |
| Status | Initial claim status (No Response, Paid, etc.) |
| Status Code | Actual reason (e.g., Dx inconsistent with CPT) |
| Action Code | Next step (e.g., Need Coding Assistance) |
| Team Allocation | Responsible team (Coding, Billing, Payment) |
| Notes | Follow-up notes |
Visit ID#: XXXXXZXXXXXX format
Denial Workflow: each denial's status code determines the action code and the team allocation (see the sketch below)
Payments: Realistic logic where payment may be partial, full, or none
Insurance Flow: Balance moves from primary → secondary → tertiary → patient responsibility
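As a quick illustration of the aging and routing logic described above, here is a small Python sketch that derives the aging bucket from the DOS and assigns a team from the denial status code. The specific code-to-team mapping is a simplified assumption for the example; the dataset's own Status Code, Action Code, and Team Allocation columns are authoritative.

```python
# Sketch of the dataset's aging and team-allocation logic
# (simplified assumptions; see the column table above).
from datetime import date

def aging_bucket(dos: date, today: date | None = None) -> str:
    """Aging Days = Today - DOS, mapped to the dataset's buckets."""
    today = today or date.today()
    days = (today - dos).days
    if days <= 30:
        return "0-30"
    if days <= 60:
        return "31-60"
    if days <= 90:
        return "61-90"
    if days <= 120:
        return "91-120"
    return "120+"

# Hypothetical status-code-to-team routing, based on denial type.
TEAM_BY_STATUS = {
    "Dx inconsistent with CPT": "Coding Team",
    "No Response": "Billing Team",
    "Paid": "Payment Team",
}

print(aging_bucket(date(2024, 1, 15), today=date(2024, 4, 1)))  # "61-90"
print(TEAM_BY_STATUS.get("Dx inconsistent with CPT"))           # Coding Team
```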
CC BY 4.0 – Free to use, modify, and share with attribution.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Our model is examined on an extensive data set. Due to the unavailability of real data sets, the values of the model parameters are generated randomly using a discrete uniform distribution.
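A minimal sketch of this kind of parameter generation, with purely illustrative bounds and sizes, might look as follows.

```python
# Illustrative sketch: random model parameters drawn from a
# discrete uniform distribution (bounds and shapes are assumptions).
import numpy as np

rng = np.random.default_rng(7)

# Integer parameters uniformly distributed on [low, high].
demand = rng.integers(low=10, high=100, endpoint=True, size=20)
cost = rng.integers(low=1, high=50, endpoint=True, size=20)

print(demand[:5], cost[:5])
```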
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Technical notes and documentation on the common data model of the CONCEPT-DM2 project.
This publication corresponds to the Common Data Model (CDM) specification of the CONCEPT-DM2 project for the implementation of a federated network analysis of the healthcare pathway of type 2 diabetes.
Aims of the CONCEPT-DM2 project:
General aim: To analyse the effectiveness and efficiency of chronic care pathways in diabetes, assuming the relevance of care pathways as independent factors of health outcomes, using real-world data (RWD) from five Spanish Regional Health Systems.
Main specific aims:
Study Design: A population-based retrospective observational study centered on all T2D patients diagnosed in five Regional Health Services within the Spanish National Health Service. We will include all contacts of these patients with the health services, drawing on electronic medical record systems covering Primary Care data, Specialized Care data, Hospitalizations, Urgent Care data, and Pharmacy Claims, as well as other registers such as the mortality register and the population register.
Cohort definition: All patients with a Type 2 Diabetes code in their clinical health records.
Files included in this publication:
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
While fixed effects (FE) models are often employed to address potential omitted variables, we argue that these models' real utility is in isolating a particular dimension of variance from panel data for analysis. In addition, we show through a novel mathematical decomposition and simulation that only one-way FE models cleanly capture either the over-time or the cross-sectional dimension of panel data, while the two-way FE model unhelpfully combines within-unit and cross-sectional variation in a way that produces uninterpretable answers. In fact, as we show in this paper, under the interpretation that many researchers wrongly assign to the two-way FE model, namely that it represents a single estimate of the effect of X on Y while accounting for unit-level heterogeneity and time shocks, the two-way FE specification is statistically unidentified, a fact that statistical software packages like R and Stata obscure through internal matrix processing.
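To make the dimension-isolating role of one-way FE concrete, here is a short Python sketch (toy data, not the paper's replication code) showing that a unit-FE estimate uses only within-unit, over-time variation, while a time-FE estimate uses only cross-sectional variation; each one-way within transformation removes exactly one dimension of the panel.

```python
# Toy sketch of one-way fixed effects as variance isolation
# (illustrative data, not the paper's replication code).
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 10
unit = np.repeat(np.arange(n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)

x = rng.normal(size=n_units * n_periods) + 0.5 * unit / n_units
y = 2.0 * x + rng.normal(size=x.size) + unit / n_units  # unit heterogeneity

def demean_by(group, v):
    """Subtract group means: the one-way within transformation."""
    means = np.bincount(group, weights=v) / np.bincount(group)
    return v - means[group]

def ols_slope(x, y):
    return np.dot(x, y) / np.dot(x, x)

# Unit FE keeps only within-unit (over-time) variation;
# time FE keeps only cross-sectional variation.
print("unit FE:", ols_slope(demean_by(unit, x), demean_by(unit, y)))
print("time FE:", ols_slope(demean_by(time, x), demean_by(time, y)))
```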
This data set contains the European Centre for Medium-Range Weather Forecasts (ECMWF) deterministic forecast model analysis data on hybrid levels in GRIB2 format. The forecast files and the analysis files on pressure levels are in the companion data sets (see below). For each day there are global 0.25 degree resolution analysis files at 00, 06, 12, and 18 UTC. This data set is password protected; contact Steve Williams (see below) for password information. This is a large data set, with each day containing ~6.1 GB in 4 files. Each order can contain a maximum of 16 GB of data (~2.5 days). For very large orders, it may be preferable to access these data via a loaner external hard drive; please contact Steve Williams (see below) for these very large orders.
License: Database Contents License (DbCL) v1.0, http://opendatacommons.org/licenses/dbcl/1.0/
This dataset was created by Timo Bozsolik
Released under Database: Open Database, Contents: Database Contents
Data for sentiment analysis
This dataset includes the analysis year inputs and outputs from the Air Quality Conformity Analysis approved in June 2025. The horizon year is 2050 and reflects the policies and projects adopted in the ON TO 2050 Regional Comprehensive Plan. The air quality analysis is completed twice annually, in the second quarter and the fourth quarter. The data associated with the analysis are named for the year and quarter in which the analysis was completed; the files in this dataset are therefore referred to as c25q2 data.

The analysis years for this conformity cycle are 2019, 2025, 2030, 2035, 2040, and 2050. We associate scenario numbers with the analysis years as shown below. You will notice the scenario numbers 100-700 referenced in many of the filenames or in headers within the files.

Analysis year scenario numbers:
2019 | 100
2025 | 200
2030 | 300
2035 | 400
2040 | 500
2050 | 700

Links to download data files:
Travel Demand Model Data (c25q2) - 2019 Base
Travel Demand Model Data (c25q2) - 2025 Forecast
Travel Demand Model Data (c25q2) - 2030 Forecast
Travel Demand Model Data (c25q2) - 2035 Forecast
Travel Demand Model Data (c25q2) - 2040 Forecast
Travel Demand Model Data (c25q2) - 2050 Forecast

For additional information, see the travel demand model documentation.
This dataset archives the 3-hourly SNACS Polar MM5 atmospheric model data for simulations run with the following forcing data:

Model years 1957-1958 to 1978-1979: ERA40
Model years 1979-1980 to 2000-2001: ERA40 + SSMR/SSMI sea ice from the National Snow and Ice Data Center (NSIDC)
Model years 2001-2002 to 2006-2007: TOGA + SSMR/SSMI sea ice from the National Snow and Ice Data Center (NSIDC) V2 + NNRP for soil moisture

There are time series data from 28 model grid points near Barrow, AK.