The largest all-payer ambulatory surgery database in the United States, the Healthcare Cost and Utilization Project (HCUP) Nationwide Ambulatory Surgery Sample (NASS) produces national estimates of major ambulatory surgery encounters in hospital-owned facilities. Major ambulatory surgeries are defined as selected major therapeutic procedures that require the use of an operating room, penetrate or break the skin, and involve regional anesthesia, general anesthesia, or sedation to control pain (i.e., surgeries flagged as "narrow" in the HCUP Surgery Flag Software). Unweighted, the NASS contains approximately 9.0 million ambulatory surgery encounters each year and approximately 11.8 million ambulatory surgery procedures. Weighted, it estimates approximately 11.9 million ambulatory surgery encounters and 15.7 million ambulatory surgery procedures. Sampled from the HCUP State Ambulatory Surgery and Services Databases (SASD) and State Emergency Department Databases (SEDD) in order to capture both planned and emergent major ambulatory surgeries, the NASS can be used to examine selected ambulatory surgery utilization patterns. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality, HCUP data inform decision making at the national, State, and community levels. The NASS contains clinical and resource-use information that is included in a typical hospital-owned facility record, including patient characteristics, clinical diagnostic and surgical procedure codes, disposition of patients, total charges, facility characteristics, and expected source of payment, regardless of payer, including patients covered by Medicaid, private insurance, and the uninsured. The NASS excludes data elements that could directly or indirectly identify individuals, hospitals, or states. The NASS is limited to encounters with at least one in-scope major ambulatory surgery on the record, performed at hospital-owned facilities. Procedures intended primarily for diagnostic purposes are not considered in-scope. Restricted access data files are available with a data use agreement and brief online security training.
The NIS is the largest publicly available all-payer inpatient healthcare database designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from around 7 million hospital stays each year. Weighted, it estimates around 35 million hospitalizations nationally. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels.
Its large sample size is ideal for developing national and regional estimates and enables analyses of rare conditions, uncommon treatments, and special populations.
DO NOT use this data without referring to the NIS Database Documentation.
All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
You must also %3Cu%3E%3Cstrong%3Emake sure that your work meets all of the AHRQ (data owner) requirements for publishing%3C/strong%3E%3C/u%3E
with HCUP data--listed at https://hcup-us.ahrq.gov/db/nation/nis/nischecklist.jsp
For additional assistance, AHRQ has created the HCUP Online Tutorial Series, a series of free, interactive courses which provide training on technical methods for conducting research with HCUP data. Topics include an HCUP Overview Course and these tutorials:
• The HCUP Sampling Design tutorial is designed to help users learn how to account for sample design in their work with HCUP national (nationwide) databases.
• The Producing National HCUP Estimates tutorial is designed to help users understand how the three national (nationwide) databases – the NIS, Nationwide Emergency Department Sample (NEDS), and Kids' Inpatient Database (KID) – can be used to produce national and regional estimates.
• The Calculating Standard Errors tutorial shows how to accurately determine the precision of the estimates produced from the HCUP nationwide databases. Users will learn two methods for calculating standard errors for estimates produced from the HCUP national (nationwide) databases.
• The HCUP Multi-year Analysis tutorial presents solutions that may be necessary when conducting analyses that span multiple years of HCUP data.
• The HCUP Software Tools tutorial provides instructions on how to apply the AHRQ software tools to HCUP or other administrative databases.
New tutorials are added periodically, and existing tutorials are updated when necessary. The Online Tutorial Series is located on the HCUP-US website at https://hcup-us.ahrq.gov/tech_assist/tutorials.jsp
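For orientation, here is a minimal sketch of the kind of design-aware estimation those tutorials cover. It assumes the NIS discharge weight and design variables are named DISCWT, NIS_STRATUM, and HOSP_NIS (consult the NIS documentation above for the authoritative names) and uses an illustrative analysis flag has_condition; it is a sketch, not the documented HCUP method.

```python
import pandas as pd
import numpy as np

# nis: one row per discharge, with an illustrative indicator column "has_condition"
# plus the design variables (names assumed; see the NIS documentation).
nis = pd.read_csv("nis_core.csv")

# Weighted national estimate: sum of discharge weights over flagged records.
estimate = (nis["DISCWT"] * nis["has_condition"]).sum()

# Design-based variance: between-hospital (cluster) variation within sampling strata.
def stratum_var(g):
    totals = g.groupby("HOSP_NIS").apply(lambda h: (h["DISCWT"] * h["has_condition"]).sum())
    n = len(totals)
    return 0.0 if n < 2 else n * totals.var(ddof=1)

variance = nis.groupby("NIS_STRATUM").apply(stratum_var).sum()
print(f"national estimate: {estimate:,.0f}  SE: {np.sqrt(variance):,.0f}")
```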
In 2015, AHRQ restructured the data as described here:
https://hcup-us.ahrq.gov/db/nation/nis/2015HCUPNationalInpatientSample.pdf
Some key points of the 2015 redesign are described in that document.
The Healthcare Cost and Utilization Project (HCUP) Nationwide Readmissions Database (NRD) is a unique and powerful database designed to support various types of analyses of national readmission rates for all payers and the uninsured. The NRD includes discharges for patients with and without repeat hospital visits in a year and those who have died in the hospital. Repeat stays may or may not be related. The criteria used to determine the relationship between hospital admissions are left to the analyst using the NRD. This database addresses a large gap in health care data: the lack of nationally representative information on hospital readmissions for all ages. Outcomes of interest include national readmission rates, reasons for returning to the hospital for care, and the hospital costs for discharges with and without readmissions. Unweighted, the NRD contains data from approximately 18 million discharges each year. Weighted, it estimates roughly 35 million discharges. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality, HCUP data inform decision making at the national, State, and community levels. The NRD is drawn from HCUP State Inpatient Databases (SID) containing verified patient linkage numbers that can be used to track a person across hospitals within a State, while adhering to strict privacy guidelines. The NRD is not designed to support regional, State-, or hospital-specific readmission analyses. The NRD contains more than 100 clinical and non-clinical data elements provided in a hospital discharge abstract. Data elements include but are not limited to: diagnoses, procedures, patient demographics (e.g., sex, age), expected payer (regardless of payer, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as 'no charge'), discharge month, quarter, and year, total charges, length of stay, and data elements essential to readmission analyses. The NRD excludes data elements that could directly or indirectly identify individuals. Restricted access data files are available with a data use agreement and brief online security training.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the accumulation of large amounts of health-related data, predictive analytics could stimulate the transformation of reactive medicine towards Predictive, Preventive and Personalized Medicine (PPPM), ultimately affecting both cost and quality of care. However, the high dimensionality and high complexity of the data involved prevent data-driven methods from being easily translated into clinically relevant models. Additionally, applying cutting-edge predictive methods and manipulating the data require substantial programming skills, limiting their direct exploitation by medical domain experts. This leaves a gap between potential and actual data usage. In this study, the authors address this problem by focusing on open, visual environments suited to be applied by the medical community. Moreover, we review code-free applications of big data technologies. As a showcase, a framework was developed for the meaningful use of data from critical care patients by integrating the MIMIC-II database in a data mining environment (RapidMiner) supporting scalable predictive analytics using visual tools (RapidMiner's Radoop extension). Guided by the CRoss-Industry Standard Process for Data Mining (CRISP-DM), the ETL process (Extract, Transform, Load) was initiated by retrieving data from the MIMIC-II tables of interest. As a use case, the correlation of platelet count and ICU survival was quantitatively assessed. Using visual tools for ETL on Hadoop and predictive modeling in RapidMiner, we developed robust processes for automatic building, parameter optimization and evaluation of various predictive models, under different feature selection schemes. Because these processes can be easily adopted in other projects, this environment is attractive for scalable predictive analytics in health research.
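As a rough, code-based illustration of the showcased use case (outside the visual RapidMiner environment described above), a minimal sketch assuming a flat extract of MIMIC-II with hypothetical column names platelet_count and survived might look like this:

```python
import pandas as pd
from scipy import stats

# Hypothetical flat extract of MIMIC-II ICU stays; the file and column names are
# illustrative, not the actual MIMIC-II schema.
df = pd.read_csv("mimic2_platelets_extract.csv")  # columns: platelet_count, survived (0/1)
df = df.dropna(subset=["platelet_count", "survived"])

# Point-biserial correlation quantifies the association between a continuous variable
# (platelet count) and a binary outcome (ICU survival).
r, p = stats.pointbiserialr(df["survived"], df["platelet_count"])
print(f"point-biserial r = {r:.3f}, p = {p:.3g}")
```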
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data repository provides the Food and Agriculture Biomass Input Output (FABIO) database, a global set of multi-regional physical supply-use and input-output tables covering global agriculture and forestry.
The work is based on mostly freely available data from FAOSTAT, IEA, EIA, and UN Comtrade/BACI. FABIO currently covers 191 countries + RoW, 118 processes and 125 commodities (raw and processed agricultural and food products) for 1986-2013. All R code and auxiliary data are available on GitHub. For more information please refer to https://fabio.fineprint.global.
The database consists of the following main components, in compressed .rds format:
Z: the inter-commodity input-output matrix, displaying the relationships of intermediate use of each commodity in the production of each commodity, in physical units (tons). The matrix has 24000 rows and columns (125 commodities x 192 regions), and is available in two versions, based on the method to allocate inputs to outputs in production processes: Z_mass (mass allocation) and Z_value (value allocation). Note that the row sums of the Z matrix (= total intermediate use by commodity) are identical in both versions.
Y: the final demand matrix, denoting the consumption of all 24000 commodities by destination country and final use category. There are six final use categories (yielding 192 x 6 = 1152 columns): 1) food use, 2) other use (non-food), 3) losses, 4) stock addition, 5) balancing, and 6) unspecified.
X: the total output vector of all 24000 commodities. Total output is equal to the sum of intermediate and final use by commodity.
L: the Leontief inverse, computed as (I – A)⁻¹, where A is the matrix of input coefficients derived from Z and X. Again, there are two versions, depending on the underlying version of Z (L_mass and L_value).
E: environmental extensions for each of the 24000 commodities, including four resource categories: 1) primary biomass extraction (in tons), 2) land use (in hectares), 3) blue water use (in m³), and 4) green water use (in m³).
mr_sup_mass/mr_sup_value: For each allocation method (mass/value), the supply table gives the physical supply quantity of each commodity by producing process, with processes in the rows (118 processes x 192 regions = 22656 rows) and commodities in columns (24000 columns).
mr_use: the use table captures the quantities of each commodity (rows) used as an input in each process (columns).
A description of the included countries and commodities (i.e. the rows and columns of the Z matrix) can be found in the auxiliary file io_codes.csv. Separate lists of the country sample (including ISO3 codes and continental grouping) and commodities (including moisture content) are given in the files regions.csv and items.csv, respectively. For information on the individual processes, see auxiliary file su_codes.csv. RDS files can be opened in R. Information on how to read these files can be obtained here: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/readRDS
Except for X.rds, which contains a matrix, all variables are organized as lists, where each element contains a sparse matrix. Please note that values are always given in physical units, i.e. tonnes or head, as specified in items.csv. The suffixes value and mass only indicate the form of allocation chosen for the construction of the symmetric IO tables (for more details see Bruckner et al. 2019). Product, process and country classifications can be found in the file fabio_classifications.xlsx.
Footprint results are not contained in the database but can be calculated, e.g. by using this script: https://github.com/martinbruckner/fabio_comparison/blob/master/R/fabio_footprints.R
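For orientation, here is a minimal sketch of the standard environmentally extended input-output footprint calculation using the objects described above. It assumes Z, Y, X and one extension row E have already been loaded into dense NumPy arrays; variable names are illustrative, and the linked R script is the reference implementation.

```python
import numpy as np

# Assumed already loaded from the .rds files:
# Z: (24000, 24000) intermediate use, Y: (24000, 1152) final demand,
# X: (24000,) total output, E: (24000,) one extension row (e.g. land use per commodity).

x_inv = np.where(X > 0, 1.0 / X, 0.0)          # guard against zero-output commodities
A = Z * x_inv                                   # input coefficients A = Z diag(X)^-1 (scales each column j by 1/X[j])
L = np.linalg.inv(np.eye(len(X)) - A)           # Leontief inverse (I - A)^-1
# Note: inverting a dense 24000 x 24000 matrix is impractical; in practice use sparse solves.
e = E * x_inv                                   # extension intensity per unit of output

# Footprints by final-demand column (destination country x final-use category):
footprints = e @ L @ Y                          # shape: (1152,)
```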
How to cite:
To cite FABIO work please refer to this paper:
Bruckner, M., Wood, R., Moran, D., Kuschnig, N., Wieland, H., Maus, V., Börner, J. 2019. FABIO – The Construction of the Food and Agriculture Input–Output Model. Environmental Science & Technology 53(19), 11302–11312. DOI: 10.1021/acs.est.9b03554
License:
This data repository is distributed under the CC BY-NC-SA 4.0 License. You are free to share and adapt the material for non-commercial purposes using proper citation. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. In case you are interested in a collaboration, I am happy to receive enquiries at martin.bruckner@wu.ac.at.
Known issues:
The underlying FAO data have been manipulated to the minimum extent necessary. Data filling and supply-use balancing, however, required some adaptations. These are documented in the code and are also reflected in the balancing item in the final demand matrices. For a proper use of the database, I recommend distributing the balancing item over all other uses proportionally and doing analyses with and without balancing to illustrate uncertainties.
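As a rough illustration of that recommendation, a minimal sketch of the proportional redistribution of the balancing item, assuming Y is a NumPy array whose six final-use categories per country are ordered as listed above (the ordering is an assumption), could look like this:

```python
import numpy as np

# Y: (24000, 192 * 6) final demand; per country, columns assumed ordered as
# [food, other, losses, stock addition, balancing, unspecified].
Y = Y.reshape(24000, 192, 6)
balancing = Y[:, :, 4].copy()
Y[:, :, 4] = 0.0

other = Y.sum(axis=2)                            # total of the remaining use categories
share = np.divide(Y, other[:, :, None],
                  out=np.zeros_like(Y), where=other[:, :, None] > 0)
Y += share * balancing[:, :, None]               # spread balancing proportionally
# Rows/countries with no other use simply drop the balancing mass in this sketch.
Y = Y.reshape(24000, 192 * 6)
```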
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset is a comprehensive collection of free tech books available on the web, specifically sourced from the FreeTechBooks platform. It includes details such as the names and URLs of various free textbooks, covering a wide range of topics including computer science, programming, data science, artificial intelligence, and more. The dataset is designed for educational purposes, providing easy access to high-quality, freely available technical resources.
The dataset consists of two columns:
Name: The title of the book.
URL: A direct link to the page where the book can be accessed or downloaded for free.
The dataset contains 1200+ free tech books. It was scraped from the FreeTechBooks website, a platform that aggregates freely available textbooks on various technical topics, by crawling 82 pages of the site and extracting the names and URLs of books listed under different topics.
The National Cooperative Soil Survey - Soil Characterization Database (NCSS-SCD) contains laboratory data for more than 65,000 locations (i.e., x-y coordinates) throughout the United States and its Territories, and about 2,100 locations from other countries. It is a compilation of data from the Kellogg Soil Survey Laboratory (KSSL) and several cooperating laboratories. The data steward and distributor is the National Soil Survey Center (NSSC). Information contained within the database includes physical, chemical, biological, mineralogical, morphological, and mid-infrared reflectance (MIR) soil measurements, as well as a collection of calculated values. The intended use of the data is to support interpretations related to soil use and management.
Data Usage: Access to the data is provided via the following user interfaces: 1. Interactive Web Map; 2. Lab Data Mart (LDM) for querying data and generating reports; 3. Soil Data Access (SDA) web services for querying data; 4. Direct download of the entire database in several formats.
Data at each location include measurements at multiple depths (e.g., soil horizons). However, not all analyses have been conducted for each location and depth. Typically, a suite of measurements was collected based upon assumed or known conditions regarding the soil being analyzed. For example, soils of arid environments are routinely analyzed for salts and carbonates as part of the standard analysis suite. Standard morphological soil descriptions are available for about 60,000 of these locations. Mid-infrared (MIR) spectroscopy is available for about 7,000 locations. Soil fertility measurements, such as those made by Agricultural Experiment Stations, were not made. Most of the data were obtained over the last 40 years, with about 4,000 locations before 1960, 25,000 from 1960-1990, 27,000 from 1990-2010, and 13,000 from 2010 to 2021. Generally, the number of measurements recorded per location has increased over time. Typically, the data were collected to represent a soil series or map unit component concept. They may also have been sampled to determine the range of variation within a given landscape.
Although strict quality-control measures are applied, the NSSC does not warrant that the data are error free. In some cases the measurements are not within the applicability range of the laboratory methods. For example, dispersion of clay is incomplete in some soils by the standard method used for determining particle-size distribution. Soils producing incomplete dispersion include those that are derived from volcanic materials or that have a high content of iron oxides, gypsum, carbonates, or other cementing materials. Also note that determination of clay minerals by x-ray diffraction is relative. Measurements of very high or very low quantities by any method are not very precise. Other measurements have other limitations in some kinds of soils. Such data are retained in the database for research purposes. Some of the data were obtained from cooperating laboratories within the NCSS. The accuracy of the location coordinates has not been quantified but can be inferred from the precision of their decimal degrees and the presence of a map datum. Some older records may correspond to a county centroid. When the map datum is missing, it can be assumed that data recorded prior to 1990 used NAD27 and data recorded after 1995 used WGS84. For detailed information about methods used in the KSSL and other laboratories, refer to "Soil Survey Investigation Report No. 42".
For information on the application of laboratory data, refer to "Soil Survey Investigation Report No. 45". If you are unfamiliar with any terms or methods, feel free to consult your NRCS State Soil Scientist.
Terms of Use: This dataset is not designed for use as a primary regulatory tool in permitting or siting decisions but may be used as a reference source. This is public information and may be interpreted by organizations, agencies, units of government, or others based on their needs; however, they are responsible for the appropriate application. Federal, State, or local regulatory bodies are not to reassign to the Natural Resources Conservation Service or the National Cooperative Soil Survey any authority for the decisions that they make. The Natural Resources Conservation Service will not perform any evaluations of these data for purposes related solely to State or local regulatory programs.
This database automatically captures metadata, the source of which is the GOVERNMENT OF THE REPUBLIC OF SLOVENIA, STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA, and corresponds to the source database entitled “Use of electronic identification procedures in the last 12 months for private purposes by individuals, by status of activity, Slovenia, 2018”.
Actual data are available in PC-Axis format (.px). With the additional links, you can access the source portal page for viewing and selecting the data, as well as the PX-Win program, which can be downloaded free of charge. Both allow you to select data for display, change the format of the printout, store it in different formats, view and print tables of unlimited size, and perform some basic statistical analyses and graphics.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The AneuX morphology database is an open-access, multi-centric database containing 3D geometries of 750 intracranial aneurysms curated in the context of the AneuX project (2015-2020). The database combines data from three different projects (AneuX, @neurIST and Aneurisk) standardized using a single processing pipeline. The code to process and view the 3D geometries is provided under this public repository: https://github.com/hirsch-lab/aneuxdb
The database at a glance:
750 aneurysm domes (surface meshes)
668 vessel trees (surface meshes)
3 different data sources (AneuX, @neurIST, Aneurisk)
3 different mesh resolutions (original resolution, 0.01mm² and 0.05mm² target cell area)
4 different cut configurations (including planar and free-form cuts)
5 clinical parameters (aneurysm rupture status, location and side; patient age and sex)
170 pre-computed morphometric indices for each of the aneurysm domes
Terms of use / License:
The data is provided "as is", without any warranties of any kind. It is provided under the CC BY-NC 4.0 license, with the additional requirements (A) that the use of the database is declared using the sentence below (you can omit the URLs), and (B) that the peer-reviewed journal article below is cited.
[This project] uses data from the AneuX morphology database, an open-access, multi-centric database combining data from three European projects: AneuX project (www.aneux.ch), @neurIST project (www.aneurist.org) and Aneurisk (http://ecm2.mathcs.emory.edu/aneuriskweb/index).
In accordance with the terms of use, please cite the following journal article when referring to our dataset.
Juchler, Schilling, Bijlenga, Kurtcuoglu, Hirsch. Shape trumps size: Image-based morphological analysis reveals that the 3D shape discriminates intracranial aneurysm disease status better than aneurysm size. Frontiers in Neurology (2022), DOI: 10.3389/fneur.2022.809391
The AneuX morphology database contains parts (geometric models, clinical data) of the publicly available Aneurisk dataset released under the CC BY-NC 3.0 license (which is compatible with the license used here). Like all geometric models in this database, the Aneurisk models were preprocessed using the same procedure. See here for a description.
Funding and authorizations
The AneuX project
Data collection in accordance with @neurIST protocol v5
Ethics authorisations: Geneva BASEC PB_2018-00073
Supported by a grant from the Swiss SystemsX.ch initiative, evaluated by the Swiss National Science Foundation
@neurIST project
Data collection in accordance with @neurIST protocol v1
Ethics authorisations: Amsterdam MEC 07-159, Barcelona 2007-3507, Geneva CER 07-056, Oxfordshire REC AQ05/Q1604/162, Pècs RREC MC P 06 Jul 2007
Supported by the 6th framework program of the European Commission FP6-IST-2004–027703
Acknowledgments:
The AneuX project was supported by SystemsX.ch, and evaluated by the Swiss National Science Foundation (SNSF). This database would not be possible without the support of the Zurich University of Applied Sciences (ZHAW) and University Hospitals Geneva (HUG).
We thank the following people for their support and contributions to the AneuX morphology database.
From the AneuX project (in alphabetical order):
Daniel Rüfenacht
Diana Sapina
Isabel Wanke
Karl Lovblad
Karl Schaller
Olivier Brina
Paolo Machi
Rafik Ouared
Sabine Schilling
Sandrine Morel
Ueli Ebnöther
Vartan Kurtucuoglu
Vitor Mendes Pereira
Zolt Kuscàr
From the @neurIST project (in alphabetical order)
Alan Waterworth
Alberto Marzo
Alejandro Frangi
Alison Clarke
Ana Marcos Gonzalez
Ana Paula Narata
Antonio Arbona
Bawarjan Schatlo
Daniel Rüfenacht
Elio Vivas
Ferenc Kover
Gulam Zilani
Guntram Berti
Guy Lonsdale
Istvan Hudak
James Byrne
Jimison Iavindrasana
Jordi Blasco
Juan Macho
Julia Yarnold
Mari Cruz Villa Uriol
Martin Hofmann-Apitius
Max Jägersberg
Miriam CJM Sturkenboom
Nicolas Roduit
Pankaj Singh
Patricia Lawford
Paul Summers
Peer Hasselmeyer
Peter Bukovics
Rod Hose
Roelof Risselada
Stuart Coley
Tamas Doczi
Teresa Sola
Umang Patel
From the Aneurisk project (list from AneuriskWeb, in alphabetical order):
Alessandro Veneziani
Andrea Remuzzi
Edoardo Boccardi
Francesco Migliavacca
Gabriele Dubini
Laura Sangalli
Luca Antiga
Maria Piccinelli
Piercesare Secchi
Simone Vantini
Susanna Bacigaluppi
Tiziano Passerini
PyPSA-Eur is an open model dataset of the European power system at the transmission network level that covers the full ENTSO-E area. It can be built using the code provided at https://github.com/PyPSA/PyPSA-eur.
It contains alternating current lines at and above 220 kV voltage level and all high voltage direct current lines, substations, an open database of conventional power plants, time series for electrical demand and variable renewable generator availability, and geographic potentials for the expansion of wind and solar power.
Not all data dependencies are shipped with the code repository, since git is not suited for handling large changing files. Instead we provide separate data bundles to be downloaded and extracted as noted in the documentation.
This is the full data bundle to be used for rigorous research. It includes large bathymetry and natural protection area datasets.
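To give a sense of how a built PyPSA-Eur network is typically inspected, here is a minimal sketch using the pypsa Python package; the file name elec.nc is illustrative and depends on how you configure and run the workflow.

```python
import pypsa

# Load a built PyPSA-Eur network from netCDF (file name is illustrative).
n = pypsa.Network("elec.nc")

# Basic inventory of the transmission-level model.
print(f"{len(n.buses)} buses, {len(n.lines)} AC lines, {len(n.links)} DC links")
print(f"{len(n.generators)} generators")

# Installed generator capacity by carrier (e.g. wind, solar, gas).
print(n.generators.groupby("carrier").p_nom.sum())
```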
While the code in PyPSA-Eur is released as free software under the MIT License, different licenses and terms of use apply to the various input data, which are summarised below:
corine/*
Access to data is based on a principle of full, open and free access as established by the Copernicus data and information policy Regulation (EU) No 1159/2013 of 12 July 2013. This regulation establishes registration and licensing conditions for GMES/Copernicus users and can be found here. Free, full and open access to this data set is made on the conditions that:
When distributing or communicating Copernicus dedicated data and Copernicus service information to the public, users shall inform the public of the source of that data and information.
Users shall make sure not to convey the impression to the public that the user's activities are officially endorsed by the Union.
Where that data or information has been adapted or modified, the user shall clearly state this.
The data remain the sole property of the European Union. Any information and data produced in the framework of the action shall be the sole property of the European Union. Any communication and publication by the beneficiary shall acknowledge that the data were produced “with funding by the European Union”.
eez/*
Marine Regions’ products are licensed under CC-BY-NC-SA. Please contact us for other uses of the Licensed Material beyond license terms. We kindly request our users not to make our products available for download elsewhere and to always refer to marineregions.org for the most up-to-date products and services.
natura/*
EEA standard re-use policy: unless otherwise indicated, re-use of content on the EEA website for commercial or non-commercial purposes is permitted free of charge, provided that the source is acknowledged (https://www.eea.europa.eu/legal/copyright). Copyright holder: Directorate-General for Environment (DG ENV).
naturalearth/*
All versions of Natural Earth raster + vector map data found on this website are in the public domain. You may use the maps in any manner, including modifying the content and design, electronic dissemination, and offset printing. The primary authors, Tom Patterson and Nathaniel Vaughn Kelso, and all other contributors renounce all financial claim to the maps and invite you to use them for personal, educational, and commercial purposes.
No permission is needed to use Natural Earth. Crediting the authors is unnecessary.
NUTS_2013_60M_SH/*
In addition to the general copyright and licence policy applicable to the whole Eurostat website, the following specific provisions apply to the datasets you are downloading. The download and usage of these data is subject to the acceptance of the following clauses:
The Commission agrees to grant the non-exclusive and not transferable right to use and process the Eurostat/GISCO geographical data downloaded from this page (the "data").
The permission to use the data is granted on condition that: the data will not be used for commercial purposes; the source will be acknowledged. A copyright notice, as specified below, will have to be visible on any printed or electronic publication using the data downloaded from this page.
gebco/GEBCO_2014_2D.nc
The GEBCO Grid is placed in the public domain and may be used free of charge. Use of the GEBCO Grid indicates that the user accepts the conditions of use and disclaimer information given below.
Users are free to:
Copy, publish, distribute and transmit The GEBCO Grid
Adapt The GEBCO Grid
Commercially exploit The GEBCO Grid, by, for example, combining it with other information, or by including it in their own product or application
Users must:
Acknowledge the source of The GEBCO Grid. A suitable form of attribution is given in the documentation that accompanies The GEBCO Grid.
Not use The GEBCO Grid in a way that suggests any official status or that GEBCO, or the IHO or IOC, endorses any particular application of The GEBCO Grid.
Not mislead others or misrepresent The GEBCO Grid or its source.
je-e-21.03.02.xls
Information on the websites of the Federal Authorities is accessible to the public. Downloading, copying or integrating content (texts, tables, graphics, maps, photos or any other data) does not entail any transfer of rights to the content.
Copyright and any other rights relating to content available on the websites of the Federal Authorities are the exclusive property of the Federal Authorities or of any other expressly mentioned owners.
Any reproduction requires the prior written consent of the copyright holder. The source of the content (statistical results) should always be given.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.
Data content areas include:
According to our latest research, the global Time-Series Database for OT Data market size reached USD 1.84 billion in 2024, driven by increasing adoption of IoT and Industry 4.0 initiatives across operational technology (OT) environments. The market is expanding at a robust CAGR of 15.2%, and is forecasted to reach USD 5.18 billion by 2033. This growth is primarily propelled by the escalating need for real-time data analytics and process optimization in critical industries such as manufacturing, energy, and transportation, which are leveraging time-series databases to efficiently store, process, and analyze massive volumes of time-stamped data generated by OT systems.
A significant growth factor in the Time-Series Database for OT Data market is the rapid digital transformation occurring across traditional industrial sectors. As organizations strive to modernize their operations, there is a marked increase in the deployment of smart sensors, connected devices, and automation solutions. These advancements generate vast streams of time-stamped data, necessitating robust, scalable, and high-performance time-series databases capable of handling the unique requirements of OT environments. The integration of advanced analytics and artificial intelligence (AI) with time-series databases further enhances their value proposition, enabling predictive maintenance, anomaly detection, and real-time decision-making, which are critical for maximizing operational efficiency and minimizing downtime.
Another critical driver is the growing emphasis on predictive maintenance and asset management. Industrial companies are shifting from reactive to proactive maintenance strategies to reduce unplanned outages and extend asset lifecycles. Time-series databases play a pivotal role in this transition by enabling the continuous collection, storage, and analysis of sensor data from machinery, equipment, and infrastructure. The ability to detect patterns, trends, and anomalies in real-time empowers organizations to schedule maintenance activities precisely when needed, thereby reducing costs and improving overall productivity. This trend is particularly pronounced in sectors such as energy & utilities, oil & gas, and transportation, where equipment reliability and uptime are paramount.
Furthermore, the increasing adoption of cloud-based solutions is accelerating the growth of the Time-Series Database for OT Data market. Cloud deployment offers enhanced scalability, flexibility, and cost-efficiency, making it an attractive option for organizations seeking to manage large volumes of time-series data without the burden of maintaining on-premises infrastructure. Cloud-based time-series databases facilitate seamless integration with other cloud-native analytics tools and platforms, supporting advanced use cases such as remote monitoring, process optimization, and cross-site data aggregation. This shift is also fostering greater adoption among small and medium enterprises (SMEs), which can now leverage enterprise-grade time-series data management capabilities without significant upfront investment.
From a regional perspective, North America continues to dominate the global Time-Series Database for OT Data market, accounting for the largest share in 2024. The region benefits from a high concentration of technologically advanced industries, robust IT infrastructure, and early adoption of IoT and digitalization initiatives. Europe follows closely, driven by stringent regulatory requirements and a strong focus on industrial automation. The Asia Pacific region, meanwhile, is witnessing the fastest growth, fueled by rapid industrialization, expanding manufacturing sectors, and increasing investments in smart infrastructure projects across countries such as China, India, and Japan. As the adoption of time-series databases for OT data accelerates globally, regional markets are expected to experience differentiated growth trajectories based on industry maturity, technological readiness, and regulatory landscapes.
This paper describes the process of creating VMLA, a language test meant to be used during awake craniotomies. It focuses on the step-by-step process and aims to help other developers build their own assessment. This project was designed as a prospective study and registered with the Ethics Committee of the Educational and Research Institute of Sirio Libanês Hospital. Ethics committee approval number: HSL 2018-37 / CAEE 90603318.9.0000.5461. Images were purchased from Shutterstock.com and generated the following receipts: SSTK-0CA8F-1358 and SSTK-0235F-6FC2. VMLA is a neuropsychological assessment of language function, comprising object naming (ON) and semantic tasks. Originally composed of 420 slides, validation among Brazilian native speakers left 368 figures plus fifteen other elements, like numbers, sentences and counting. Validation focused on educational level (EL), gender and age. Volunteers were tested in fourteen different states of Brazil. Cultural differences resulted in improvements to the final Answer Template. EL and age were identified as factors that influenced VMLA assessment results. Highly educated volunteers performed better on both ON and semantic tasks. People over 50 and 35 years old had better performance for ON and semantic, respectively. Further validation in unevaluated regions of Brazil, including a more balanced number of males and females and a more even distribution of age and EL, could confirm our statistical analysis. After validation, ON-VMLA was framed in batteries of 100 slides each, mixing images of six different complexity categories. Semantic-VMLA kept all the original seventy verbal and non-verbal combinations. The validation process resulted in increased confidence during intraoperative test application. We are now able to score and evaluate patients' language deficits. Currently, VMLA fits its purpose of dynamic application and accuracy during language area mapping. It is the first test targeted to Brazilians, representing much of our culture and collective imagery. Our experience may be of value to clinicians and researchers working with awake craniotomy who seek to develop their own language test.
The test is available for free use at www.vemotests.com (beginning in February, 2021)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
At a fundamental level, most genes, signaling pathways, biological functions and organ systems are highly conserved between man and all vertebrate species. Leveraging this conservation, researchers are increasingly using the experimental advantages of the amphibian Xenopus to model human disease. The online Xenopus resource, Xenbase, enables human disease modeling by curating the Xenopus literature published in PubMed and integrating these Xenopus data with orthologous human genes, anatomy, and more recently with links to the Online Mendelian Inheritance in Man resource (OMIM) and the Human Disease Ontology (DO). Here we review how Xenbase supports disease modeling and report on a meta-analysis of the published Xenopus research, providing an overview of the different types of diseases being modeled in Xenopus and the variety of experimental approaches being used. Text mining of over 50,000 Xenopus research articles imported into Xenbase from PubMed identified approximately 1,000 putative disease-modeling articles. These articles were manually assessed and annotated with disease ontologies, which were then used to classify papers based on disease type. We found that Xenopus is being used to study a diverse array of diseases with three main experimental approaches: cell-free egg extracts to study fundamental aspects of cellular and molecular biology, oocytes to study ion transport and channel physiology, and embryo experiments focused on congenital diseases. We integrated these data into Xenbase Disease Pages to allow easy navigation to disease information on external databases. Results of this analysis will equip Xenopus researchers with a suite of experimental approaches available to model or dissect a pathological process. Ideally, clinicians and basic researchers will use this information to foster collaborations necessary to interrogate the development and treatment of human diseases.
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Digitization of healthcare data, along with algorithmic breakthroughs in AI, will have a major impact on healthcare delivery in the coming years. It is interesting to see AI applied to assist clinicians during patient treatment in a privacy-preserving way. While scientific knowledge can help guide interventions, there remains a key need to quickly cut through the space of decision policies to find effective strategies to support patients during the care process.
Offline reinforcement learning (also referred to as safe or batch reinforcement learning) is a promising sub-field of RL which provides a mechanism for solving real-world sequential decision-making problems where access to a simulator is not available. Here we learn a policy from a fixed dataset of trajectories without further interaction with the environment (the agent doesn't receive reward or punishment signals from the environment). It has been shown that such an approach can leverage vast amounts of existing logged data (in the form of previous interactions with the environment) and can outperform supervised learning approaches or heuristic-based policies for solving real-world decision-making problems. Offline RL algorithms, when trained on sufficiently large and diverse offline datasets, can produce close-to-optimal policies (the ability to generalize beyond the training data).
As part of my PhD research, I investigated the problem of developing a Clinical Decision Support System for Sepsis Management using Offline Deep Reinforcement Learning.
MIMIC-III ('Medical Information Mart for Intensive Care') is a large, open-access, anonymized, single-center database consisting of comprehensive clinical data from 61,532 critical care admissions from 2001–2012 collected at a Boston teaching hospital. The dataset consists of 47 features (including demographics, vitals, and lab test results) on a cohort of sepsis patients who meet the Sepsis-3 definition criteria.
We try to answer the following question:
Given a particular patient's characteristics and physiological information at each time step as input, can our deep RL approach learn an optimal treatment policy that prescribes the right intervention (e.g., use of a ventilator) at each stage of the treatment process, in order to improve the final outcome (e.g., patient mortality)?
We can use popular state-of-the-art algorithms such as Deep Q-Learning (DQN), Double Deep Q-Learning (DDQN), DDQN combined with BNC, Mixed Monte Carlo (MMC), and Persistent Advantage Learning (PAL). Using these methods, we can train an RL policy to recommend an optimum treatment path for a given patient.
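As a rough illustration of the value-based updates these algorithms share, here is a minimal PyTorch sketch of a (Double) DQN update on a batch of logged transitions; the state and action dimensions are illustrative, and this is not the actual code from the repository linked below.

```python
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 47, 25, 0.99  # illustrative sizes (47 features; discretized actions)

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)

def dqn_update(states, actions, rewards, next_states, dones):
    """One Q-learning update on a batch of logged (offline) transitions."""
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double DQN-style target: the online net selects the action, the target net evaluates it.
        next_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + GAMMA * (1.0 - dones) * next_q
    loss = nn.functional.smooth_l1_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```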
Data acquisition, standard pre-processing, and modelling details can be found in the GitHub repo: https://github.com/asjad99/MIMIC_RL_COACH
Free and publicly accessible literature database for peer-reviewed primary and review articles in the field of human Biospecimen Science. Each entry has been created by a Ph.D. level scientist to capture relevant parameters, pre-analytical factors, and original summaries of relevant results.
This database automatically captures metadata, the source of which is the GOVERNMENT OF THE REPUBLIC OF SLOVENIA, STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA, and corresponds to the source database entitled “Use of electronic identification procedures in the last 12 months for private purposes by individuals, by degree of urbanisation of the area in which these individuals live, Slovenia, 2018”.
Actual data are available in PC-Axis format (.px). With the additional links, you can access the source portal page for viewing and selecting the data, as well as the PX-Win program, which can be downloaded free of charge. Both allow you to select data for display, change the format of the printout, store it in different formats, view and print tables of unlimited size, and perform some basic statistical analyses and graphics.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset has been uploaded to Kaggle on the occasion of solving questions of the 365 Data Science • Practice Exams: SQL curriculum, a set of free resources designed to help test and elevate data science skills. The dataset consists of a synthetic, relational collection of data structured to simulate common employee and organizational data scenarios, ideal for practicing SQL queries and data analysis skills in a People Analytics context.
The dataset contains the following tables:
departments.csv: List of all company departments.
dept_emp.csv: Historical and current assignments of employees to departments.
dept_manager.csv: Historical and current assignments of employees as department managers.
employees.csv: Core employee demographic information.
employees.db: A SQLite database containing all the relational tables from the CSV files.
salaries.csv: Historical salary records for employees.
titles.csv: Historical job titles held by employees.
The dataset is ideal for practicing SQL queries and data analysis skills in a People Analytics context. It serves applications in both general data analytics and time series analysis.
A practical application is presented in the 🎓 365DS Practice Exams • SQL notebook, which covers in detail answers to the questions of SQL Practice Exams 1, 2, and 3 on the 365DS platform, especially illustrating the usage and the value of SQL procedures and functions.
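As a small example of working with the bundled SQLite file from Python (the table and column names follow the classic employees sample database and are assumptions here), one could run:

```python
import sqlite3

# Connect to the bundled SQLite database listed above.
conn = sqlite3.connect("employees.db")

# Example: highest recorded salary per employee, joined with core demographics.
query = """
SELECT e.emp_no, e.first_name, e.last_name, MAX(s.salary) AS max_salary
FROM employees AS e
JOIN salaries  AS s ON s.emp_no = e.emp_no
GROUP BY e.emp_no, e.first_name, e.last_name
ORDER BY max_salary DESC
LIMIT 10;
"""
for row in conn.execute(query):
    print(row)
conn.close()
```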
This dataset has a rich lineage, originating from academic research and evolving through various formats to its current relational structure:
The foundational dataset was authored by Prof. Dr. Fusheng Wang (then a PhD student at the University of California, Los Angeles - UCLA) and his advisor, Prof. Dr. Carlo Zaniolo (UCLA). This work is primarily described in their paper:
It was originally distributed as an .xml file. Giuseppe Maxia (known as @datacharmer on GitHub and LinkedIn, as well as here on Kaggle) converted it into its relational form and subsequently distributed it as a .sql file, making it accessible for relational database use.
This .sql version was then loaded to Kaggle as the « Employees Dataset » by Mirza Huzaifa on February 5th, 2023.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder includes the shapefiles for the 10 validation countries included in the manuscript. Abstract: The study of population health through network science holds high promise, but data sources that allow complete representation of populations are limited in low- and middle-income settings. Large national health surveys designed to gather nationally representative health and development data in low- and middle-income countries are promising sources of such data. Although they provide researchers, healthcare providers, and policymakers with valuable information, they are not designed to produce small-area estimates of health indicators, and the methods for producing these tend to rely on diverse and imperfect covariate data sources, have high data input requirements, and are computationally demanding, limiting their use for network representations of populations. To reduce the sources of measurement error and allow efficient multi-country representation of populations as networks of human settlements, here we present a covariate-free multi-country method to estimate small-area health indicators using standardized georeferenced surveys. The approach utilizes interpolation via local inverse distance weighting. The estimates are compared to those obtained using a Bayesian geostatistical model and have been cross-validated. The estimates are aggregated into population settlements identified using the Global Human Settlement Layer database. The method is fully automated, requiring a single standard georeferenced survey data source for mapping populations, eliminating the need for indicator- or country-specific covariate selection by investigators. Efficient estimation is achieved by only computing values for human-occupied areas and adopting a logical aggregation of estimates into the complete range of settlement sizes. An open-access library of standardized georeferenced settlement-level datasets for 15 indicators and 10 countries was validated in this paper, as well as the code used to identify settlements and estimate indicators. The datasets are intended to be used as the basis for population health studies, and the library will continue to be expanded. The novel aspects include using harmonized input sources and estimation procedures across countries and the adoption of real-world units for population data aggregation, creating a specialized library of nodes that serve as a basis for network representations of population health in low- and middle-income countries.
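For intuition, here is a minimal sketch of the local inverse distance weighting interpolation at the core of the method; it is purely illustrative, and the published code accompanying the library is the reference implementation.

```python
import numpy as np

def idw_estimate(query_xy, sample_xy, sample_values, power=2, k=10):
    """Local inverse-distance-weighted estimate at one query point.

    query_xy: (2,) coordinates; sample_xy: (n, 2) survey cluster coordinates;
    sample_values: (n,) indicator values observed at those clusters.
    """
    d = np.linalg.norm(sample_xy - query_xy, axis=1)
    nearest = np.argsort(d)[:k]              # "local": use only the k nearest clusters
    d, v = d[nearest], sample_values[nearest]
    if d[0] == 0:                            # query coincides with a sample point
        return float(v[0])
    w = 1.0 / d ** power                     # weights decay with distance
    return float(np.sum(w * v) / np.sum(w))
```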
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains anime images for 231 different anime, with approximately 380 images for each of those anime. Please note that you might need to clean the image directories a bit, since the images might contain merchandise and live-action photos in addition to the actual anime itself.
If you'd like to take a look at the scripts used to make this dataset, you can find them on this GitHub repo.
Feel free to extend it, scrape your own images, etc. etc.
As a big anime fan, I found a lot of anime related datasets on Kaggle. I was however disappointed to find no dataset containing anime specific images for popular anime. Some other great datasets that I've been inspired by include: - Top 250 Anime 2023 - Anime Recommendations Database - Anime Recommendation Database 2020 - Anime Face Dataset - Safebooru - Anime Image Metadata