CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data organization for the figures in the document: Figure 3A LineOutWithSun_SSAzi_135to225_green_Correct_ROI5_INFO.xls Figure 3b LineOutWithSun_SSAzi_m45to45_green_Correct_ROI5_INFO.xls Figure 4 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Sim_Correct_ROI5_INFO.xls Figure 5a LineOut_Camera_Elevation_SqAzi_m180to0_green_Sim_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls Figure 5b LineOut_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_0to180_green_Sim_Correct_ROI5_INFO.xls Figure 6a LineOutColor_SqAzi_m180to0_CP_20to50_Correct_ROI5_INFO.xls Figure 6b LineOutROI_SqAzi_m180to0_CP_20to50_green_Correct_INFO.xls Figure 7 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls
This Excel file contains a comprehensive list of fields in the DIEM dataset along with detailed descriptions for each. The information is organized across multiple sheets within the workbook. You can navigate between sheets by selecting them from the list at the bottom of the Excel window. The first sheet contains a readme text. The second sheet provides the list of fields and detailed descriptions for the DIEM Microdata, which includes data at the household level. The subsequent four sheets contain the list of fields and their descriptions for the four DIEM aggregated datasets, each corresponding to a specific DIEM thematic area:
Income, Shocks and Needs thematic area Crop Production thematic area Livestock Production thematic area Food Security thematic area
ISO is an independent advisory organization that collects information on a community's building-code adoption and enforcement services in order to provide a ranking for insurance companies. ISO assigns a Building Code Effectiveness Classification from 1 to 10 based on the data collected. Class 1 represents exemplary commitment to building-code enforcement.Municipalities with better rankings are lower risk, and their residents' insurance rates can reflect that. The prospect of minimizing catastrophe-related damage and ultimately lowering insurance costs gives communities an incentive to enforce their building codes rigorously.This page provides data for the Insurance Services Organization (ISO) performance measure. This data includes residential and commercial building code enforcement ratings for the City of Tempe.The performance measure dashboard is available at 1.15 Insurance Services Organization (ISO) RatingAdditional InformationSource: Insurance Service Organization RatingContact: Chris ThompsonContact E-Mail: Christopher_Thompson@tempe.govData Source Type: ExcelPreparation Method: Information added to Excel spreadsheet from rating reportPublish Frequency: Every 5 YearsPublish Method: ManualData Dictionary
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘1.15 Insurance Services Organization (summary)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/0905a875-5513-4591-aeaf-370103dc476a on 11 February 2022.
--- Dataset description provided by original source is as follows ---
ISO is an independent advisory organization that collects information on a community's building-code adoption and enforcement services in order to provide a ranking for insurance companies. ISO assigns a Building Code Effectiveness Classification from 1 to 10 based on the data collected. Class 1 represents exemplary commitment to building-code enforcement.
Municipalities with better rankings are lower risk, and their residents' insurance rates can reflect that. The prospect of minimizing catastrophe-related damage and ultimately lowering insurance costs gives communities an incentive to enforce their building codes rigorously.
This page provides data for the Insurance Services Organization (ISO) performance measure.
This data includes residential and commercial building code enforcement ratings for the City of Tempe.
The performance measure dashboard is available at 1.15 Insurance Services Organization (ISO) Rating
Additional Information
Source: Insurance Service Organization Rating
Contact: Chris Thompson
Contact E-Mail: Christopher_Thompson@tempe.gov
Data Source Type: Excel
Preparation Method: Information added to Excel spreadsheet from rating report
Publish Frequency: Every 5 Years
Publish Method: Manual
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The GAPs Data Repository provides a comprehensive overview of available qualitative and quantitative data on national return regimes, now accessible through an advanced web interface at https://data.returnmigration.eu/.
This updated guideline outlines the complete process, starting from the initial data collection for the return migration data repository to the development of a comprehensive web-based platform. Through iterative development, participatory approaches, and rigorous quality checks, we have ensured a systematic representation of return migration data at both national and comparative levels.
The Repository organizes data into five main categories, covering diverse aspects and offering a holistic view of return regimes: country profiles, legislation, infrastructure, international cooperation, and descriptive statistics. These categories, further divided into subcategories, are based on insights from a literature review, existing datasets, and empirical data collection from 14 countries. The selection of categories prioritizes relevance for understanding return and readmission policies and practices, data accessibility, reliability, clarity, and comparability. Raw data is meticulously collected by the national experts.
The transition to a web-based interface builds upon the Repository’s original structure, which was initially developed using REDCap (Research Electronic Data Capture). It is a secure web application for building and managing online surveys and databases.The REDCAP ensures systematic data entries and store them on Uppsala University’s servers while significantly improving accessibility and usability as well as data security. It also enables users to export any or all data from the Project when granted full data export privileges. Data can be exported in various ways and formats, including Microsoft Excel, SAS, Stata, R, or SPSS for analysis. At this stage, the Data Repository design team also converted tailored records of available data into public reports accessible to anyone with a unique URL, without the need to log in to REDCap or obtain permission to access the GAPs Project Data Repository. Public reports can be used to share information with stakeholders or external partners without granting them access to the Project or requiring them to set up a personal account. Currently, all public report links inserted in this report are also available on the Repository’s webpage, allowing users to export original data.
This report also includes a detailed codebook to help users understand the structure, variables, and methodologies used in data collection and organization. This addition ensures transparency and provides a comprehensive framework for researchers and practitioners to effectively interpret the data.
The GAPs Data Repository is committed to providing accessible, well-organized, and reliable data by moving to a centralized web platform and incorporating advanced visuals. This Repository aims to contribute inputs for research, policy analysis, and evidence-based decision-making in the return and readmission field.
Explore the GAPs Data Repository at https://data.returnmigration.eu/.
The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.
----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:
Iraq is considered a leader in household expenditure and income surveys where the first was conducted in 1946 followed by surveys in 1954 and 1961. After the establishment of Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years in (1971/ 1972, 1976, 1979, 1984/ 1985, 1988, 1993, 2002 / 2007). Implementing the cooperation between CSO and WB, Central Statistical Organization (CSO) and Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
The survey has six main objectives. These objectives are:
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.
1- Household/family. 2- Individual/person.
The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
Sample survey data [ssd]
----> Design:
Sample size was (25488) household for the whole Iraq, 216 households for each district of 118 districts, 2832 clusters each of which includes 9 households distributed on districts and governorates for rural and urban.
----> Sample frame:
Listing and numbering results of 2009-2010 Population and Housing Survey were adopted in all the governorates including Kurdistan Region as a frame to select households, the sample was selected in two stages: Stage 1: Primary sampling unit (blocks) within each stratum (district) for urban and rural were systematically selected with probability proportional to size to reach 2832 units (cluster). Stage two: 9 households from each primary sampling unit were selected to create a cluster, thus the sample size of total survey clusters was 25488 households distributed on the governorates, 216 households in each district.
----> Sampling Stages:
In each district, the sample was selected in two stages: Stage 1: based on 2010 listing and numbering frame 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown urban and rural and geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames of each stages can be developed based on 2010 building listing and numbering without updating household lists. In some small districts, random selection processes of primary sampling may lead to select less than 24 units therefore a sampling unit is selected more than once , the selection may reach two cluster or more from the same enumeration unit when it is necessary.
Face-to-face [f2f]
----> Preparation:
The questionnaire of 2006 survey was adopted in designing the questionnaire of 2012 survey on which many revisions were made. Two rounds of pre-test were carried out. Revision were made based on the feedback of field work team, World Bank consultants and others, other revisions were made before final version was implemented in a pilot survey in September 2011. After the pilot survey implemented, other revisions were made in based on the challenges and feedbacks emerged during the implementation to implement the final version in the actual survey.
----> Questionnaire Parts:
The questionnaire consists of four parts each with several sections: Part 1: Socio – Economic Data: - Section 1: Household Roster - Section 2: Emigration - Section 3: Food Rations - Section 4: housing - Section 5: education - Section 6: health - Section 7: Physical measurements - Section 8: job seeking and previous job
Part 2: Monthly, Quarterly and Annual Expenditures: - Section 9: Expenditures on Non – Food Commodities and Services (past 30 days). - Section 10 : Expenditures on Non – Food Commodities and Services (past 90 days). - Section 11: Expenditures on Non – Food Commodities and Services (past 12 months). - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days). - Section 12, Table 1: Meals Had Within the Residential Unit. - Section 12, table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.
Part 3: Income and Other Data: - Section 13: Job - Section 14: paid jobs - Section 15: Agriculture, forestry and fishing - Section 16: Household non – agricultural projects - Section 17: Income from ownership and transfers - Section 18: Durable goods - Section 19: Loans, advances and subsidies - Section 20: Shocks and strategy of dealing in the households - Section 21: Time use - Section 22: Justice - Section 23: Satisfaction in life - Section 24: Food consumption during past 7 days
Part 4: Diary of Daily Expenditures: Diary of expenditure is an essential component of this survey. It is left at the household to record all the daily purchases such as expenditures on food and frequent non-food items such as gasoline, newspapers…etc. during 7 days. Two pages were allocated for recording the expenditures of each day, thus the roster will be consists of 14 pages.
----> Raw Data:
Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages: 1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct. 2. Local Supervisor: Checks to make sure that questions has been correctly completed. 3. Statistical analysis: After exporting data files from excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables. 4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.
----> Harmonized Data:
Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. Number of households refused to response was 305, response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%) while the lowest rates were in Sulaimaniya (92%).
The Ontario government, generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it. Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels: If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us. The Ontario Data Catalogue endeavors to publish open data in a machine readable format. For machine readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (Application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: All Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields. You can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API. You can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: Read the complete documentation for CKAN's Datastore API. The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data. Others will not be available to you. This is because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk. You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left hand side of the page to limit your search for datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier. You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas. When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available, and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data. A non-open dataset may be not available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests. The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue. Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets or the full catalogue. You don’t have to provide and personal information – just subscribe to our feeds using any feed reader you like using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated. The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc). Learn about each format and how you can access and use the data each file contains. A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable, it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand, these elements make a file more human-readable and less machine-readable. A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad). A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features. A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbt) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS). A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbt file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program). A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application. A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: Open with Microsoft Office Access (a database management system used to develop application software). A file that keeps the original layout and
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Macros for organizing raw data from LabView software that logs timestamps for every rotation of a flight mill arms and then summarizes the time of individual flight bouts.
Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, avoiding overestimation in protected area statistics and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations, and easements). The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ) was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries. Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip") and Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allow for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ). Note, the PAD-US inventory is now considered functionally complete with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improving the completeness and accuracy of the spatial data more than actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data. While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.
WHOSIS, the WHO Statistical Information System, is an interactive database bringing together core health statistics for the 193 WHO Member States. It comprises more than 100 indicators, which can be accessed by way of a quick search, by major categories, or through user-defined tables. The data can be further filtered, tabulated, charted and downloaded. The data are also published annually in the World Health Statistics Report released in May. The 2009 data is available for download in Excel format.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General description
This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
Is the research article an Open Access publication?
Does the research article have a Creative Common license or a similar license?
Does the research article contain a data availability statement?
Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
Does the research article contain supplementary data?
Do the supplementary data have a persistent identifier that makes them citable as a defined research output?
Variables
The data were compiled in a Microsoft Excel 365 document that includes the following variables.
DOI URL of research article
Year of publication
Research article published with Open Access
License for research article
Data availability statement in article
Supplementary data added to article
Persistent identifier for supplementary data
Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC
Visualization
Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications during a year, the number of publications that is published with open access and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publication sper year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).
File formats and software
The file formats used in this dataset are:
.csv (Text file) .docx (Microsoft Word 365 file) .jpg (JPEG image file) .pdf/A (Portable Document Format for archiving) .png (Portable Network Graphics image file) .pptx (Microsoft Power Point 365 file) .txt (Text file) .xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365 and work likely also with the older versions Office 2019 and 2016.
MD5 checksums
Here is a list of all files of this dataset and of their MD5 checksums.
GSA, the nation's largest public real estate organization, provides workspace for over one million federal workers. These employees, along with government property, are housed in space owned by the federal government and in leased properties including buildings, land, antenna sites, etc. across the country.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
iEEG and EEG data from 5 centers is organized in our study with a total of 100 subjects. We publish 4 centers' dataset here due to data sharing issues.
Acquisitions include ECoG and SEEG. Each run specifies a different snapshot of EEG data from that specific subject's session. For seizure sessions, this means that each run is a EEG snapshot around a different seizure event.
For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.
NIH, JHH, UMMC, and UMF agreed to share. Cleveland Clinic did not, so requires an additional DUA.
All data, except for Cleveland Clinic was approved by their centers to be de-identified and shared. All data in this dataset have no PHI, or other identifiers associated with patient. In order to access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:
Amber Sours, MPH Research Supervisor | Epilepsy Center Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195 (216) 444-8638
You will need to sign a data use agreement (DUA).
For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids
.
Each subject with SEEG implantation, also has an Excel table, called electrode_layout.xlsx
, which outlines where the clinicians marked each electrode anatomically. Note that there is no rigorous atlas applied, so the main points of interest are: WM
, GM
, VENTRICLE
, CSF
, and OUT
, which represent white-matter, gray-matter, ventricle, cerebrospinal fluid and outside the brain. WM, Ventricle, CSF and OUT were removed channels from further analysis. These were labeled in the corresponding BIDS channels.tsv
sidecar file as status=bad
.
The dataset uploaded to openneuro.org
does not contain the sourcedata
since there was an extra
anonymization step that occurred when fully converting to BIDS.
Derivatives include: * fragility analysis * frequency analysis * graph metrics analysis * figures
These can be computed by following the following paper: Neural Fragility as an EEG Marker for the Seizure Onset Zone
Within each EDF file, there contain event markers that are annotated by clinicians, which may inform you of specific clinical events that are occuring in time, or of when they saw seizures onset and offset (clinical and electrographic).
During a seizure event, specifically event markers may follow this time course:
* eeg onset, or clinical onset - the onset of a seizure that is either marked electrographically, or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.
* Marker/Mark On - these are usually annotations within some cases, where a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs. This is commonly done to see which portions of the brain are active metabolically.
* Marker/Mark Off - This is when the ICTAL SPECT stops imaging.
* eeg offset, or clinical offset - this is the offset of the seizure, as determined either electrographically, or by clinical symptoms.
Other events included may be beneficial for you to understand the time-course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Note that seizure markers are not consistent in their description naming, so one might encode some specific regular-expression rules to consistently capture seizure onset/offset markers across all dataset. In the case of UMMC data, all onset and offset markers were provided by the clinicians on an Excel sheet instead of via the EDF file. So we went in and added the annotations manually to each EDF file.
For various datasets, there are seizures present within the dataset. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically if present) via standard approaches in the epilepsy clinical workflow.
Clinical onset are just manifestation of the seizures with clinical syndromes. Sometimes the maker may not be present.
What is actually important in the evaluation of datasets is the clinical annotations of their localization hypotheses of the seizure onset zone.
These generally include:
* early onset: the earliest onset electrodes participating in the seizure that clinicians saw
* early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures has spread contacts annotated.
For patients with the post-surgical MRI available, then the segmentation process outlined above tells us which electrodes were within the surgical removed brain region.
Otherwise, clinicians give us their best estimate, of which electrodes were resected/ablated based on their surgical notes.
For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually and the corresponding slice was navigated to on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was navigated to. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans, if brain tissue was not present in the corresponding location of the contact, then the contact was marked as resected/ablated. This process was repeated for each contact of interest.
Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797
Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896
Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D'Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7
Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract The article aims to discuss the consequences of social distancing measures on the availability of blood and organization of blood therapy services at the beginning of the Covid-19 pandemic in Brazil. News published in April 2020 on the websites of the country’s state Blood Service Networks were consulted and organized in an Excel spreadsheet, presented in summary charts, and descriptions of results were prepared. A critical situation of blood supply, especially of some blood types, has been observed in many states. This situation is influenced by the circulation of the new coronavirus. The adoption of social distancing measures associated with unchanged transfusion demands for outpatient, urgency and emergency care required the implementation of strategies and actions for the reorganization of the services. Protection measures were incorporated, flows were changed and new routines were established. This study shows the extent to which the epidemiological situation of Covid-19 and the necessary measures for its control influenced the stocks and availability of blood. Changes in the organization of blood therapy services were fundamental in order to ensure protection, mitigate the risks of spreading the virus, and ensure the blood supply to meet the needs of the health system.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
UCS-JSpOC-soy-panel-22.csv
. This dataset combines and cleans data from the Union of Concerned Scientists and Space-Track.org to create a panel of satellites, operators, and years. This dataset is used in the paper "Oligopoly competition between satellite constellations will reduce economic welfare from orbit use". The final dataset can also be downloaded from the replication files for that paper: https://doi.org/10.57968/Middlebury.23816994.v1A "living" version of this repository can be found at: https://github.com/akhilrao/orbital-ownership-data# Repository structure* /UCS data
contains Excel and CSV data files from the Union of Concerned Scientists, as well as output files generated from data cleaning. You can find the UCS Satellite Database here: https://www.ucsusa.org/resources/satellite-database . Historical data was obtained from Dr. Teri Grimwood.* /Space-Track data
contains JSON data from Space-Track.org, files to help identify operator names for harmonization in UCS_text_cleaner.R
, and output generated from cleaning and merging data. * API queries to generate the JSON files can be found in json_cleaned_script.R. They are restated below for convenience. These queries were run on January 1, 2023 to produce the data used in "Oligopoly competition between satellite constellations will reduce economic welfare from orbit use". * 33999/OBJECT_TYPE/PAYLOAD/orderby/INTLDES asc/emptyresult/show* /Current R scripts
contains R scripts to process the data. * combined_scripts.R
loads and cleans UCS data. It takes the raw CSV files from /UCS data
as input and produces UCS_Combined_Data.csv
as output. * UCS_text_cleaner.R
harmonizes various text fields in the UCS data, including operator and owner names. Best efforts were made to ensure correctness and completeness, but some gaps may remain. * json_cleaned_script.R
loads and cleans Space-Track data, and merges it with the cleaned and combined UCS data. * panel_builder.R
uses the cleaned and merged files to construct the satellite-operator-year panel dataset with annual satellite histories and operator information. The logic behind the dataset construction approach is described in this blog post: https://akhilrao.github.io/blog//data/2020/08/20/build_stencil_cut/* /Output_figures
contains figures produced by the scripts. Some are diagnostic, some are just interesting.* /Output_data
contains the final data outputs.* /data-cleaning-notes
contains Excel and CSV files used to assist in harmonizing text fields in UCS_text_cleaner.R
. They are included here for completeness.# Creating the datasetTo reproduce the UCS-JSpOC-soy-panel-22.csv
dataset:1. Ensure R
is installed along with the required packages2. Run the scripts in /Current R scripts
in the following order: * combined_scripts.R
(this will call UCS_text_cleaner.R
) * json_cleaned_script.R
* panel_builder.R
3. The output file UCS-JSpOC-soy-panel-22.csv
, along with several intermediate files used to create it, will be generated in /Output data
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Users can find data on a range of global health topics like mortality, the burden of disease, infectious diseases, risk factors and health expenditures. Background The Global Health Observatory (GHO) database is the World Health Organization's main health statistics repository. Data is available for 193 World Health Organization member states on topics including but not limited to: Health related millennium goals, mortality, immunization, nutrition, infectious disease, non- communicable disease, tobacco control, violence, injuries, alcohol, HIV/AIDS, tuberculosis, malaria, water and sanitation, maternal and reproductive health, cho lera, child health, child nutrition, and road safety. User FunctionalityUsers can generate tables and charts according to country or region, health indicator, and time period. Data can also be compared across countries. Data can be filtered, tabulated, charted, and downloaded into Excel statistical software. These data are also published in statistical reports covering topics including: Alcohol and health, Child health, Cholera, HIV/AIDS, Malaria, Maternal and reproductive heal th, Non-communicable diseases, Public health and environment, Road safety, Tuberculosis, Tobacco control. Data Notes Data are derived from surveillance and household surveys. Years in which data were collected is indicated with these health statistics. Information is available for each WHO member country and international region. The most recent data is available from 2009.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Excel-based SYNTAX score calculator for up to 1500 casesThis SYNTAX score calculator is a Microsoft Excel-based tool designed for rapid data entry and automatic calculation. It summarizes various parameters, including scores for individual coronary lesions. Its accuracy has been validated in a cohort from the FAME3 trial and published in Catheterization and Cardiovascular Interventions.Key FeaturesEfficient Data Entry: Streamlined for quick and easy input.Automatic Calculations: Instantly calculates and summarizes scores.Validated Accuracy: Proven accurate in clinical trials.Enhanced Data ManagementOur calculator doubles as a data management platform. If you have results exported as PDFs from conventional online calculators, consider migrating your data to this Excel-based platform for better organization and management.How to CiteWhen using our calculator, please cite the following article:Clinical Validation of a Novel Simplified Offline Tool for SYNTAX Score CalculationNote: The tool supporting a smaller number of cases is more lightweight.[Contact/Bug report]Tatsunori Takahashi, MDSmidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, UStatsunori.takahashi@protonmail.comryutoku0221@gmail.com
https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy
The global big data technology market size was valued at approximately $162 billion in 2023 and is projected to reach around $471 billion by 2032, growing at a Compound Annual Growth Rate (CAGR) of 12.6% during the forecast period. The growth of this market is primarily driven by the increasing demand for data analytics and insights to enhance business operations, coupled with advancements in AI and machine learning technologies.
One of the principal growth factors of the big data technology market is the rapid digital transformation across various industries. Businesses are increasingly recognizing the value of data-driven decision-making processes, leading to the widespread adoption of big data analytics. Additionally, the proliferation of smart devices and the Internet of Things (IoT) has led to an exponential increase in data generation, necessitating robust big data solutions to analyze and extract meaningful insights. Organizations are leveraging big data to streamline operations, improve customer engagement, and gain a competitive edge.
Another significant growth driver is the advent of advanced technologies like artificial intelligence (AI) and machine learning (ML). These technologies are being integrated into big data platforms to enhance predictive analytics and real-time decision-making capabilities. AI and ML algorithms excel at identifying patterns within large datasets, which can be invaluable for predictive maintenance in manufacturing, fraud detection in banking, and personalized marketing in retail. The combination of big data with AI and ML is enabling organizations to unlock new revenue streams, optimize resource utilization, and improve operational efficiency.
Moreover, regulatory requirements and data privacy concerns are pushing organizations to adopt big data technologies. Governments worldwide are implementing stringent data protection regulations, like the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) in the United States. These regulations necessitate robust data management and analytics solutions to ensure compliance and avoid hefty fines. As a result, organizations are investing heavily in big data platforms that offer secure and compliant data handling capabilities.
As organizations continue to navigate the complexities of data management, the role of Big Data Professional Services becomes increasingly critical. These services offer specialized expertise in implementing and managing big data solutions, ensuring that businesses can effectively harness the power of their data. Professional services encompass a range of offerings, including consulting, system integration, and managed services, tailored to meet the unique needs of each organization. By leveraging the knowledge and experience of big data professionals, companies can optimize their data strategies, streamline operations, and achieve their business objectives more efficiently. The demand for these services is driven by the growing complexity of big data ecosystems and the need for seamless integration with existing IT infrastructure.
Regionally, North America holds a dominant position in the big data technology market, primarily due to the early adoption of advanced technologies and the presence of key market players. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, driven by increasing digitalization, the rapid growth of industries such as e-commerce and telecommunications, and supportive government initiatives aimed at fostering technological innovation.
The big data technology market is segmented into software, hardware, and services. The software segment encompasses data management software, analytics software, and data visualization tools, among others. This segment is expected to witness substantial growth due to the increasing demand for data analytics solutions that can handle vast amounts of data. Advanced analytics software, in particular, is gaining traction as organizations seek to gain deeper insights and make data-driven decisions. Companies are increasingly adopting sophisticated data visualization tools to present complex data in an easily understandable format, thereby enhancing decision-making processes.
This dataset was generated from a set of Excel spreadsheets from an Information and Communication Technology Services (ICTS) administrative database on student applications to the University of Cape Town (UCT). This database contains information on applications to UCT between the January 2006 and December 2014. In the original form received by DataFirst the data were ill suited to research purposes. This dataset represents an attempt at cleaning and organizing these data into a more tractable format. To ensure data confidentiality direct identifiers have been removed from the data and the data is only made available to accredited researchers through DataFirst's Secure Data Service.
The dataset was separated into the following data files:
Applications, individuals
Administrative records [adm]
Other [oth]
The data files were made available to DataFirst as a group of Excel spreadsheet documents from an SQL database managed by the University of Cape Town's Information and Communication Technology Services . The process of combining these original data files to create a research-ready dataset is summarised in a document entitled "Notes on preparing the UCT Student Application Data 2006-2014" accompanying the data.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data organization for the figures in the document: Figure 3A LineOutWithSun_SSAzi_135to225_green_Correct_ROI5_INFO.xls Figure 3b LineOutWithSun_SSAzi_m45to45_green_Correct_ROI5_INFO.xls Figure 4 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Sim_Correct_ROI5_INFO.xls Figure 5a LineOut_Camera_Elevation_SqAzi_m180to0_green_Sim_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls Figure 5b LineOut_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_0to180_green_Sim_Correct_ROI5_INFO.xls Figure 6a LineOutColor_SqAzi_m180to0_CP_20to50_Correct_ROI5_INFO.xls Figure 6b LineOutROI_SqAzi_m180to0_CP_20to50_green_Correct_INFO.xls Figure 7 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls