This Data Service Standard outlines rules and best practices for managing data in the City. Its goal is to maintain data accuracy, security, accessibility, and usability. By ensuring these aspects, the standard supports well-informed decision-making and enhances public services. The aspects covered in this Data Service Standard are:
Create Purposeful Data Services
Build Data Excellence through Multidisciplinary Collaboration
Provide User-Centric Solutions
Provide Holistic Solutions for Seamless User Experiences
Foster User Engagement and Inspiration
Create a Secure Service That Protects Sensitive Data
Define What Success Looks Like and Publish Performance Data
Be Strategic When Choosing Technology
Sea surface temperature (SST) plays an important role in a number of ecological processes and can vary over a wide range of time scales, from daily to decadal changes. SST influences primary production, species migration patterns, and coral health. If temperatures are anomalously warm for extended periods, drastic changes in the surrounding ecosystem can result, including harmful effects such as coral bleaching. This layer represents the standard deviation of SST (degrees Celsius) of the weekly time series from 2000 to 2013. Three SST datasets were combined to provide continuous coverage from 1985 to 2013. The concatenation applies a bias adjustment, derived from linear regression over the periods where the datasets overlap, with the final representation matching the 0.05-degree (~5-km) near-real-time SST product. First, a weekly composite, gap-filled SST dataset was produced from the NOAA Pathfinder v5.2 SST 1/24-degree (~4-km) daily dataset (a NOAA Climate Data Record) for each location, following Heron et al. (2010), for January 1985 to December 2012. Next, weekly composite SST data were produced from the NOAA/NESDIS/STAR Blended SST 0.1-degree (~11-km) daily dataset for February 2009 to October 2013. Finally, a weekly composite SST dataset was produced from the NOAA/NESDIS/STAR Blended SST 0.05-degree (~5-km) daily dataset for March 2012 to December 2013. The standard deviation around the long-term mean SST was calculated by taking the standard deviation over all weekly data from 2000 to 2013 for each pixel.
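As a rough illustration of that final step, the per-pixel statistic amounts to a standard deviation along the time axis of the stacked weekly composites. The sketch below uses synthetic NumPy data; the array name, grid size, and gap fraction are hypothetical, not the actual product:

```python
# Hypothetical sketch: per-pixel standard deviation of a weekly SST time series.
# Assumes an array `sst` of shape (n_weeks, n_lat, n_lon) holding the
# 2000-2013 weekly composites, with NaN where a pixel has no data.
import numpy as np

rng = np.random.default_rng(0)
n_weeks, n_lat, n_lon = 730, 4, 4           # ~14 years of weekly composites (toy grid)
sst = 26.0 + rng.normal(0.0, 1.5, size=(n_weeks, n_lat, n_lon))
sst[rng.random(sst.shape) < 0.05] = np.nan  # simulate gaps in coverage

# Standard deviation over the time axis, ignoring missing weeks per pixel.
sst_std = np.nanstd(sst, axis=0)
print(sst_std.shape)  # (4, 4): one value per pixel, in degrees Celsius
```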
Unlock the power of ready-to-use data sourced from developer communities and repositories with Developer Community and Code Datasets.
Data Sources:
GitHub: Access comprehensive data about GitHub repositories, developer profiles, contributions, issues, social interactions, and more.
StackShare: Receive information about companies, their technology stacks, reviews, tools, services, trends, and more.
DockerHub: Dive into data from container images, repositories, developer profiles, contributions, usage statistics, and more.
Developer Community and Code Datasets are a treasure trove of public data points gathered from tech communities and code repositories across the web.
With our datasets, you'll receive:
Choose from various output formats, storage options, and delivery frequencies:
Why choose our Datasets?
Fresh and accurate data: Access complete, clean, and structured data collected by scraping professionals, ensuring the highest quality.
Time and resource savings: Let us handle data extraction and processing cost-effectively, freeing your resources for strategic tasks.
Customized solutions: Share your unique data needs, and we'll tailor our data harvesting approach to fit your requirements perfectly.
Legal compliance: Partner with a trusted leader in ethical data collection. Oxylabs is trusted by Fortune 500 companies and adheres to GDPR and CCPA standards.
Pricing Options:
Standard Datasets: Choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.
Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.
Experience a seamless journey with Oxylabs:
Empower your data-driven decisions with Oxylabs Developer Community and Code Datasets!
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Here, we present FLiPPR, or FragPipe LiP (limited proteolysis) Processor, a tool that facilitates the analysis of data from limited proteolysis mass spectrometry (LiP-MS) experiments following primary search and quantification in FragPipe. LiP-MS has emerged as a method that can provide proteome-wide information on protein structure and has been applied to a range of biological and biophysical questions. Although LiP-MS can be carried out with standard laboratory reagents and mass spectrometers, analyzing the data can be slow and poses unique challenges compared to typical quantitative proteomics workflows. To address this, we leverage FragPipe and then process its output in FLiPPR. FLiPPR formalizes a specific data imputation heuristic that carefully uses missing data in LiP-MS experiments to report on the most significant structural changes. Moreover, FLiPPR introduces a data merging scheme and a protein-centric multiple hypothesis correction scheme, enabling processed LiP-MS data sets to be more robust and less redundant. These improvements strengthen statistical trends when previously published data are reanalyzed with the FragPipe/FLiPPR workflow. We hope that FLiPPR will lower the barrier for more users to adopt LiP-MS, standardize statistical procedures for LiP-MS data analysis, and systematize output to facilitate eventual larger-scale integration of LiP-MS data.
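FLiPPR's actual procedures are defined in the paper; purely as a hedged illustration of what a protein-centric multiple hypothesis correction can look like, the sketch below collapses peptide-level p-values to one test per protein before applying Benjamini-Hochberg across proteins. The collapsing rule and all values here are hypothetical, not FLiPPR's implementation:

```python
# Illustrative sketch (not FLiPPR's code): protein-centric FDR control.
# Peptide-level p-values are collapsed to one test per protein before
# Benjamini-Hochberg correction, reducing redundancy across peptides.
from collections import defaultdict
from statsmodels.stats.multitest import multipletests

peptide_pvals = {               # hypothetical (protein, peptide) -> p-value
    ("P1", "pepA"): 0.001, ("P1", "pepB"): 0.20,
    ("P2", "pepC"): 0.03,  ("P3", "pepD"): 0.48,
}

by_protein = defaultdict(list)
for (protein, _), p in peptide_pvals.items():
    by_protein[protein].append(p)

proteins = sorted(by_protein)
# One p-value per protein (here: the minimum, with a Bonferroni factor
# for the number of peptides tested within that protein).
collapsed = [min(1.0, min(ps) * len(ps)) for ps in (by_protein[pr] for pr in proteins)]

reject, qvals, _, _ = multipletests(collapsed, alpha=0.05, method="fdr_bh")
for pr, q, r in zip(proteins, qvals, reject):
    print(pr, round(q, 4), "significant" if r else "n.s.")
```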
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Standard City population by gender and age. The dataset can be utilized to understand the gender distribution and demographics of Standard City.
The dataset comprises the following two datasets across these two themes
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This data reference standard provides a standard list of values to categorize data on person(s). The list reflects the classifications of gender and is designed to provide common data variables of the reported gender of person(s) or an individual’s personal and social gender identity. This data reference standard is to be read in conjunction with the Policy Direction to Modernize the Government of Canada’s Sex and Gender Information Practices and the Disaggregated Data Action Plan. This list of values is intended to standardize the way gender classifications are described in datasets to enable data interoperability and improve data quality. The appendix lists a data reference table that includes one-digit codes for designating gender. Not included in this data reference standard is an additional two-digit code for further classification. This data reference standard will be reviewed as required by the data reference steward in consultation with the data reference standard custodian. For support or advice on the measurement of “gender of person” or related data variables, contact statcan.csds-cnsd.statcan@statcan.gc.ca
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Standard City population by age. The dataset can be utilized to understand the age distribution and demographics of Standard City.
The dataset comprises the following three datasets
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Effective data management plays a key role in oceanographic research as cruise-based data, collected from different laboratories and expeditions, are commonly compiled to investigate regional to global oceanographic processes. Here we describe new and updated best practice data standards for discrete chemical oceanographic observations, specifically those dealing with column header abbreviations, quality control flags, missing value indicators, and standardized calculation of certain properties. These data standards have been developed with the goals of improving the current practices of the scientific community and promoting their international usage. These guidelines are intended to standardize data files for data sharing and submission into permanent archives. They will facilitate future quality control and synthesis efforts and lead to better data interpretation. In turn, this will promote research in ocean biogeochemistry, such as studies of carbon cycling and ocean acidification, on regional to global scales. These best practice standards are not mandatory. Agencies, institutes, universities, or research vessels can continue using different data standards if it is important for them to maintain historical consistency. However, it is hoped that they will be adopted as widely as possible to facilitate consistency and to achieve the goals stated above.
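As a minimal sketch of how such conventions are applied in practice, the snippet below assumes WOCE-style quality flags (2 = acceptable) and -999 as the missing-value indicator; the column names are hypothetical, not the standard's actual abbreviations:

```python
# Minimal sketch: applying a missing-value indicator and quality flags,
# assuming WOCE-style flags (2 = good) and -999 for missing data.
# Column names below are hypothetical examples.
import pandas as pd

df = pd.DataFrame({
    "CTDTMP":      [12.31, 11.98, -999.0, 13.05],  # temperature, deg C
    "CTDTMP_FLAG": [2,     3,      9,     2],      # 2 = good, 3 = questionable, 9 = missing
})

# Replace the missing-value indicator with NaN, then keep only good data.
df["CTDTMP"] = df["CTDTMP"].where(df["CTDTMP"] != -999.0)
good = df[df["CTDTMP_FLAG"] == 2]
print(good)
```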
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the data for the Standard, IL population pyramid, which represents the Standard population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.
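For reference, the ratios mentioned above follow the standard definitions, with the working-age population (ages 15 to 64) in the denominator. A worked example with made-up counts:

```python
# Worked example of the ratios the pyramid supports, using the standard
# definitions (working age = 15-64); the counts below are made up.
pop_0_14  = 30      # youth
pop_15_64 = 110     # working age
pop_65_up = 47      # older population

youth_dep   = 100 * pop_0_14  / pop_15_64   # ~27.3
old_age_dep = 100 * pop_65_up / pop_15_64   # ~42.7
total_dep   = youth_dep + old_age_dep       # ~70.0
support     = pop_15_64 / pop_65_up         # ~2.3 workers per older person

print(round(youth_dep, 1), round(old_age_dep, 1),
      round(total_dep, 1), round(support, 1))
```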
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Standard Population by Age. You can refer to it here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Standard City population distribution across 18 age groups. It lists the population in each age group along with the percentage of the total population of Standard City that each group represents. The dataset can be utilized to understand the population distribution of Standard City by age. For example, using this dataset, we can identify the largest age group in Standard City.
Key observations
The largest age group in Standard City, IL was the 65 to 69 years age group, with a population of 35 (18.72%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in Standard City, IL was the 40 to 44 years age group, with a population of 1 (0.53%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Standard City Population by Age. You can refer to it here.
This part of the data release contains a grid of standard deviations of bathymetric soundings within each 0.5 m x 0.5 m grid cell. The bathymetry was collected on February 1, 2011, in the Sacramento River from the confluence of the Feather River to Knights Landing. The standard deviations represent one component of bathymetric uncertainty in the final digital elevation model (DEM), which is also available in this data release. The bathymetry data were collected by the USGS Pacific Coastal and Marine Science Center (PCMSC) team with collaboration and funding from the U.S. Army Corps of Engineers. This project used interferometric sidescan sonar to characterize the riverbed and channel banks along a 12-mile reach of the Sacramento River near the town of Knights Landing, California (River Mile 79 through River Mile 91). The goal was to aid understanding of fish response to the creation of safe habitat associated with levee restoration efforts in two 1.5-mile reaches of the Sacramento River between River Mile 80 and 86.
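Conceptually, producing such a grid amounts to binning soundings into 0.5 m cells and taking the standard deviation within each occupied cell. The sketch below illustrates that idea with synthetic data; it is not the USGS processing code, and the coordinates are toy values:

```python
# Illustrative sketch (not the USGS workflow): bin soundings into
# 0.5 m x 0.5 m cells and take the standard deviation within each cell.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 1000)            # easting, m (toy coordinates)
y = rng.uniform(0, 10, 1000)            # northing, m
z = -5.0 + 0.3 * rng.normal(size=1000)  # sounding depth, m

cell = 0.5
df = pd.DataFrame({
    "ix": np.floor(x / cell).astype(int),   # cell index along x
    "iy": np.floor(y / cell).astype(int),   # cell index along y
    "z": z,
})
# Sample standard deviation (ddof=1) per occupied cell.
std_grid = df.groupby(["ix", "iy"])["z"].std(ddof=1)
print(std_grid.head())
```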
Overview: The purpose of this dataset is to provide preliminary filtered, averaged lidar data and to standardize the format of the buoy's various datastreams into NetCDF.
Data Quality: Standard filtering thresholds were applied to the averaged data, and several format issues in the raw data were streamlined to create standardized NetCDF data.
Uncertainty: The uncertainty of the lidar data has not been analyzed, but it is not expected to deviate from the instrument's technical specifications.
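A minimal sketch of this kind of standardization, assuming xarray is used; the variable name, filtering threshold, attributes, and output file name are all hypothetical, not the dataset's actual conventions:

```python
# Minimal sketch: apply a filtering threshold to averaged lidar data and
# write a standardized NetCDF file. Names and threshold are hypothetical.
import numpy as np
import xarray as xr

time = np.arange(6)
wind_speed = np.array([7.2, 7.5, np.nan, 45.0, 7.9, 8.1])  # 10-min averages, m/s

ds = xr.Dataset(
    {"wind_speed": ("time", wind_speed)},
    coords={"time": time},
)
# Standard filtering threshold on the averaged data (drop implausible values).
ds["wind_speed"] = ds["wind_speed"].where(ds["wind_speed"] < 40.0)
ds["wind_speed"].attrs = {"units": "m s-1", "long_name": "lidar wind speed"}
ds.to_netcdf("buoy_lidar_standardized.nc")
```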
Introducing Job Posting Datasets: Uncover labor market insights!
Elevate your recruitment strategies, forecast future labor industry trends, and unearth investment opportunities with Job Posting Datasets.
Job Posting Datasets Source:
Indeed: Access datasets from Indeed, a leading employment website known for its comprehensive job listings.
Glassdoor: Receive ready-to-use employee reviews, salary ranges, and job openings from Glassdoor.
StackShare: Access StackShare datasets to make data-driven technology decisions.
Job Posting Datasets provide meticulously acquired and parsed data, freeing you to focus on analysis. You'll receive clean, structured, ready-to-use job posting data, including job titles, company names, seniority levels, industries, locations, salaries, and employment types.
Choose your preferred dataset delivery options for convenience:
Receive datasets in various formats, including CSV, JSON, and more. Opt for storage solutions such as AWS S3, Google Cloud Storage, and more. Customize data delivery frequencies, whether one-time or per your agreed schedule.
Why Choose Oxylabs Job Posting Datasets:
Fresh and accurate data: Access clean and structured job posting datasets collected by our seasoned web scraping professionals, enabling you to dive into analysis.
Time and resource savings: Focus on data analysis and your core business objectives while we efficiently handle the data extraction process cost-effectively.
Customized solutions: Tailor our approach to your business needs, ensuring your goals are met.
Legal compliance: Partner with a trusted leader in ethical data collection. Oxylabs is a founding member of the Ethical Web Data Collection Initiative, aligning with GDPR and CCPA best practices.
Pricing Options:
Standard Datasets: Choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.
Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.
Experience a seamless journey with Oxylabs:
Effortlessly access fresh job posting data with Oxylabs Job Posting Datasets.
Data consists of worldwide ocean water color/transparency derived from NODC's ocean station data file from April 1901 to December 1985. The NODC Environmental Information Bulletin 87-1 (EIB 87-1, March 1987) contains a complete explanation of the data and their format.
https://spdx.org/licenses/CC0-1.0.html
Background: Critical care units (CCUs), with their wide use of various monitoring devices, generate massive data. To utilize the valuable information from these devices, data are collected and stored using systems like the Clinical Information System (CIS), the Laboratory Information Management System (LIMS), etc. These systems are proprietary in nature, allow limited access to their databases, and have vendor-specific clinical implementations. In this study, we focus on developing an open source, web-based metadata repository for the CCU, representing a patient's stay with relevant details.
Methods: After developing the web-based open source repository, we analyzed four months of prospective data from two sites for data quality dimensions (completeness, timeliness, validity, accuracy, and consistency), morbidity, and clinical outcomes. We used a regression model to highlight the significance of practice variations linked with various quality indicators. Results: A data dictionary (DD) with 1447 fields (90.39% categorical and 9.6% text fields) is presented to cover the clinical workflow of the NICU. The overall quality of 1795 patient-days of data with respect to standard quality dimensions is 87%. The data exhibit 82% completeness, 97% accuracy, 91% timeliness, and 94% validity in terms of representing CCU processes. The data score only 67% in terms of consistency. Furthermore, quality indicators and practice variations are strongly correlated (p-value < 0.05).
Conclusion: This study documents a DD for standardized data collection in the CCU. This provides robust data and insights for audit purposes, and gives CCUs pathways to target practice improvements leading to specific quality gains.
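As a hedged illustration of how one of the reported quality dimensions can be scored, the sketch below computes completeness as the share of required fields that are populated; the field names and values are made up, not the study's DD:

```python
# Hypothetical illustration of the completeness dimension reported above:
# the share of required fields that are populated across patient records.
import pandas as pd

records = pd.DataFrame({
    "admission_weight": [1.2, None, 1.5, 1.1],
    "gestational_age":  [30,  32,   None, 29],
    "apgar_5min":       [7,   8,    9,    None],
})

completeness = records.notna().mean().mean()  # fraction of populated cells
print(f"completeness: {completeness:.0%}")    # 75% for this toy record set
```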
This is the current version of Oregon's Open Data Coordinator's Handbook. The handbook provides instructions to help appointed data coordinators complete the deliverables of the Open Data Standard, including an agency data inventory, an open data plan, and processes to publish open data.
Transform Unstructured Financial Docs into Actionable Insights
Harness proprietary AI models to extract, validate, and standardize financial data from any document format, including scanned images, handwritten notes, or multi-language PDFs. Unlike basic OCR tools, our solution handles complex layouts, merged cells, and poor-quality PDFs and scans with industry-leading precision.
Key Features
Universal Format Support: Extract data from scanned PDFs, images (JPEG/PNG), Excel, Word, and handwritten documents.
AI-Driven OCR & LLM Standardization:
Convert unstructured text into standardized fields (e.g., "Net Profit" → ISO 20022-compliant tags).
Resolve inconsistencies (e.g., "$1M" vs. "1,000,000 USD") using context-aware LLMs (see the sketch after this list).
100+ Language Coverage: Process financial docs in Arabic, Bulgarian, and more with automated translation.
Up to 99% Accuracy: Triple-validation via AI cross-checks, rule-based audits, and human-in-the-loop reviews.
Prebuilt Templates: Auto-detect formats for common documents (e.g., IFRS-compliant P&L statements, IRS tax forms).
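The product attributes this normalization to context-aware LLMs; purely for illustration, here is a rule-based sketch of the same idea. The function name and the patterns it supports are hypothetical:

```python
# Illustrative sketch only (the product uses LLMs for this): normalizing
# amount strings such as "$1M" and "1,000,000 USD" to a common form.
MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def normalize_amount(text: str) -> tuple[float, str]:
    """Return (value, currency) for simple USD amount strings."""
    t = text.strip().upper().replace(",", "")
    currency = "USD" if "$" in t or "USD" in t else "UNKNOWN"
    t = t.replace("$", "").replace("USD", "").strip()
    mult = 1
    if t and t[-1] in MULTIPLIERS:          # suffix like K/M/B
        mult = MULTIPLIERS[t[-1]]
        t = t[:-1]
    return float(t) * mult, currency

print(normalize_amount("$1M"))            # (1000000.0, 'USD')
print(normalize_amount("1,000,000 USD"))  # (1000000.0, 'USD')
```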
Data Sourcing & Output
Supported Documents: Balance sheets, invoices, tax filings, bank statements, receipts, and more.
Export Formats: Excel, CSV, JSON, API, PostgreSQL, or direct integration with tools like QuickBooks and SAP.
Use Cases
1. Credit Risk Analysis: Automate financial health assessments for loan approvals and vendor analysis.
2. Audit Compliance: Streamline data aggregation for GAAP/IFRS audits.
3. Due Diligence: Verify company legitimacy for mergers, investments, acquisitions, or partnerships.
4. Compliance: Streamline KYC/AML workflows with automated financial checks.
5. Invoice Processing: Extract vendor payment terms, due dates, and amounts.
Technical Edge
1. AI Architecture: Leverages a proprietary algorithm that combines vision transformers and OCR pipelines for layout detection, LLM models for context analysis, and rule-based validation.
2. Security: SOC 2 compliance and on-premise storage options.
3. Latency: Process as many as 10,000 pages/hour with sub-60-second extractions.
Pricing & Trials
Pay-as-you-go (min 1,000 docs/month).
Enterprise: Custom pricing for volume discounts, SLA guarantees, and white-glove onboarding.
Free Trial Available
https://snd.se/en/search-and-order-data/using-data
The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. In total, 30 researchers conduct and promote research on the causes, consequences, and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted, and competent government institutions.
The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.
The QoG Standard Dataset is the largest of the QoG datasets, consisting of more than 2,000 variables from sources related to the Quality of Government. The data exist in both time-series (1946 and onwards) and cross-section (2020) versions. Many of the variables are available in both datasets, but some are not. The datasets draw on a number of freely available data sources related to QoG and its correlates.
In the QoG Standard CS dataset, data from and around 2020 is included. Data from 2020 is prioritized; however, if no data is available for a country for 2020, data for 2021 is included. If no data exists for 2021, data for 2019 is included, and so on up to a maximum of +/- 3 years.
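A sketch of that selection rule, assuming data keyed by (country, year); the function and sample values are illustrative, not QoG's code:

```python
# Sketch of the nearest-year selection rule described above:
# prefer the target year, then +1, -1, +2, -2, +3, -3.
def nearest_year_value(data, country, target=2020, max_offset=3):
    """Return the value closest in time to `target`, within +/- max_offset years."""
    for offset in range(max_offset + 1):
        for year in (target + offset, target - offset):
            if (country, year) in data:
                return data[(country, year)]
    return None  # no data within +/- 3 years

data = {("SWE", 2021): 0.93, ("NOR", 2018): 0.95}
print(nearest_year_value(data, "SWE"))  # 0.93 (falls back to 2021)
print(nearest_year_value(data, "NOR"))  # 0.95 (falls back to 2018)
```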
In the QoG Standard TS dataset, data from 1946 and onwards is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This reference data provides a standard list of values for all Countries, Territories and Geographic areas. This list is intended to standardize the way Countries, Territories and Geographic areas are described in datasets to enable data interoperability and improve data quality. The data dictionary explains what each column in the list means.
This ongoing dataset contains monthly precipitation measurements from a network of standard can rain gauges at the Jornada Experimental Range in Dona Ana County, New Mexico, USA. Precipitation physically collects within gauges during the month and is manually measured with a graduated cylinder at the end of each month. This network is maintained by USDA Agricultural Research Service personnel. This dataset includes 39 different locations, but only 29 of them are current. Other precipitation data exist for this area, including event-based tipping bucket data with timestamps, but they do not go as far back in time as this dataset. Resources in this dataset: website pointer (web page with information and links to data files for download): https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-jrn&identifier=210380001