Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data are the foundation of science, and there is an increasing focus on how data can be reused and enhanced to drive scientific discoveries. However, most seemingly "open data" do not provide legal permissions for reuse and redistribution. The inability to integrate and redistribute our collective data resources blocks innovation and stymies the creation of life-improving diagnostic and drug selection tools. To help the biomedical research and research support communities (e.g., libraries, funders, and repositories) understand and navigate the data licensing landscape, the (Re)usable Data Project (RDP) (http://reusabledata.org) assesses the licensing characteristics of data resources and how licensing behaviors impact reuse. We have created a ruleset to determine the reusability of data resources and have applied it to 56 scientific data resources (e.g., databases) to date. The results show significant reuse and interoperability barriers. Inspired by game-changing projects like Creative Commons, the Wikimedia Foundation, and the Free Software movement, we hope to engage the scientific community in the discussion regarding the legal use and reuse of scientific data, including the balance of openness and how to create sustainable data resources in an increasingly competitive environment.
Data from the State of California. From website:
Access raw State data files, databases, geographic data, and other data sources. Raw State data files can be reused by citizens and organizations for their own web applications and mashups.
Open. Effectively in the public domain. Terms of use page says:
In general, information presented on this web site, unless otherwise indicated, is considered in the public domain. It may be distributed or copied as permitted by law. However, the State does make use of copyrighted data (e.g., photographs) which may require additional permissions prior to your use. In order to use any information on this web site not owned or created by the State, you must seek permission directly from the owning (or holding) sources. The State shall have the unlimited right to use for any purpose, free of any charge, all information submitted via this site except those submissions made under separate legal contract. The State shall be free to use, for any purpose, any ideas, concepts, or techniques contained in information provided through this site.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Gratis by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Gratis across both sexes and to determine which sex constitutes the majority.
Key observations
There is a slight female majority, with 50.0% of the total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data on biological sex, not gender. Respondents are asked to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Gratis Population by Race & Ethnicity. You can refer to it here.
Official statistics are produced impartially and free from political influence.
The data sets provide the text and detailed numeric information in all financial statements and their notes extracted from exhibits to corporate financial reports filed with the Commission using eXtensible Business Reporting Language (XBRL).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Regression ranks among the most popular statistical analysis methods across many research areas, including psychology. Typically, regression coefficients are displayed in tables. While this mode of presentation is information-dense, extensive tables can be cumbersome to read and difficult to interpret. Here, we introduce three novel visualizations for reporting regression results. Our methods allow researchers to arrange large numbers of regression models in a single plot. Using regression results from real-world as well as simulated data, we demonstrate the transformations which are necessary to produce the required data structure and how to subsequently plot the results. The proposed methods provide visually appealing ways to report regression results efficiently and intuitively. Potential applications range from visual screening in the model selection stage to formal reporting in research papers. The procedure is fully reproducible using the provided code and can be executed via free-of-charge, open-source software routines in R.
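The paper's routines are in R; purely as an illustration in Python (with invented data and names), the "long" data structure such plots require is one row per (model, term) pair. Here two outcomes are regressed on the same predictors via ordinary least squares:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two outcomes regressed on the same two predictors.
df = pd.DataFrame({
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "y1": [1.1, 2.1, 2.9, 4.2, 5.0, 6.1],
    "y2": [0.9, 1.8, 3.2, 3.9, 5.2, 5.8],
})

# Design matrix with an intercept column.
X = np.column_stack([np.ones(len(df)), df["x1"], df["x2"]])
terms = ["intercept", "x1", "x2"]

rows = []
for outcome in ["y1", "y2"]:
    beta, *_ = np.linalg.lstsq(X, df[outcome].to_numpy(), rcond=None)
    rows += [{"model": outcome, "term": t, "estimate": b} for t, b in zip(terms, beta)]

# One row per (model, term): the long format used for coefficient plots.
tidy = pd.DataFrame(rows)
print(tidy)
```

A coefficient plot then maps `term` to one axis, `estimate` to the other, and distinguishes the many models by color or facet.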
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we collected from the Software Heritage archive (the largest publicly available archive of FOSS source code with accompanying development history) all versions of files whose names are commonly used to convey licensing terms to software users and developers. The dataset consists of 6.5 million unique license files that can be used for empirical studies of open source licensing, training automated license classifiers, natural language processing (NLP) analyses of legal texts, and historical and phylogenetic studies of FOSS licensing. Additional metadata about the shipped license files is also provided, making the dataset ready to use in various contexts; it includes: file length measures, detected MIME type, detected SPDX license (using ScanCode), example origin (e.g., GitHub repository), and the oldest public commit in which the license appeared. The dataset is released as open data as an archive file containing all deduplicated license blobs, plus several portable CSV files for metadata, referencing blobs via cryptographic checksums.
For more details see the included README file and companion paper:
Stefano Zacchiroli. A Large-scale Dataset of (Open Source) License Text Variants. In Proceedings of the 2022 Mining Software Repositories Conference (MSR 2022), 23-24 May 2022, Pittsburgh, Pennsylvania, United States. ACM, 2022.
If you use this dataset for research purposes, please acknowledge its use by citing the above paper.
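As a sketch of how checksum-based blob referencing can be used in practice: given a license file's bytes and the checksum recorded in a metadata row, integrity can be verified before analysis. The hash algorithm (SHA-256) and the metadata layout here are assumptions for illustration; consult the dataset's README for the actual scheme.

```python
import hashlib

def verify_blob(blob_bytes: bytes, expected_sha256: str) -> bool:
    """Return True if the blob's SHA-256 digest matches the recorded checksum."""
    return hashlib.sha256(blob_bytes).hexdigest() == expected_sha256

# A stand-in license blob and its recorded checksum.
blob = b"Permission is hereby granted, free of charge, ..."
checksum = hashlib.sha256(blob).hexdigest()

assert verify_blob(blob, checksum)            # intact blob passes
assert not verify_blob(blob + b"x", checksum) # any modification fails
```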
Public Domain Dedication (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
BioTIME is an invaluable open source biodiversity database, brought to life by an international research collective. Comprised of species abundance and diversity data from different ecological sites around the world, BioTIME provides a comprehensive global perspective on species richness in the Anthropocene. This extensive dataset can help us understand and comprehend trends and insights about the history of global biodiversity for many years to come.
From current to past records, this dataset offers detailed information about species composition, abundance levels and diversity throughout time. Through such analysis, researchers can better recognize the intricate connections between global ecosystems over time - providing insight into changes in climate and habitats due to human activity or natural causes. With its global scope and unparalleled depth of data points, this dataset sets itself apart as a unique resource for future ecological studies - available free to all!
Look through each column provided: DAY, MONTH, YEAR, SAMPLE_DESC, PLOT, LATITUDE, LONGITUDE, sum.allrawdata.ABUNDANCE, sum.allrawdata.BIOMASS, GENUS, SPECIES, GENUS_SPECIES, REALM, CLIMATE, GENERAL_TREAT, TREATMENT, TREAT_COMMENTS, TREAT_DATE, HABITAT, PROTECTED_AREA, BIOME_MAP, TAXA, ORGANISMS, TITLE, AB_BIO, HAS_PLOT, DATA_POINTS, START_YEAR, END_YEAR, CENT_LAT, CENT_LONG, NUMBER_OF_SPECIES, NUMBER_OF_SAMPLES, NUMBER_…
For more datasets, click here.
First, it is important to understand the columns included in this dataset:
- DAY, MONTH, YEAR
- SAMPLE_DESC: description of the sample
- PLOT: where the sample was taken
- LATITUDE, LONGITUDE: coordinates
- sum.allrawdata.ABUNDANCE, sum.allrawdata.BIOMASS: total abundance/biomass of species observed in samples
- GENUS, SPECIES: genus/species observed in samples
- REALM: geographic realm where samples were taken
- CLIMATE: climate type of the study area
- GENERAL_TREAT, TREATMENT: general/specific treatments applied to the study area
- TREAT_COMMENTS: additional comments on the treatment
- HABITAT: habitat type of the study area
- PROTECTED_AREA: whether or not it is a protected area
- BIOME_MAP: biome map
- TAXA: taxonomic group
- ORGANISMS: organisms studied
- TITLE: title description
- AB_BIO: abundance or biomass
- HAS_PLOT: whether or not the study has a plot
- DATA_POINTS: number of data points
- START_YEAR, END_YEAR: start and end year
- CENT_LAT, CENT_LONG: central latitude and longitude
- NUMBER_OF_SPECIES: number of species studied
- NUMBER_OF_SAMPLES: number of samples taken
- NUMBER_LAT_LONG: number of latitude/longitude points
- GRAIN_SIZE_TEXT, GRAIN_SQ_KM: grain size (as text / in square kilometers)
- AREA_SQ_KM: area in square kilometers
- CONTACT_1, CONTACT_2: primary and secondary contacts
- CONT_1_MAIL, CONT_2_MAIL: primary and secondary contact email addresses
- LICENSE: license associated with the study
- WEB_LINK: web link
- DATA_SOURCE: source of the data
- METHODS, SUMMARY_METHODS: methods used and their summary
- COMMENTS: additional comments
- DATE_STUDY_ADDED: date the study was added to the database
- ABUNDANCE_TYPE: type of abundance data collected
- BIOMASS_TYPE: type of biomass collected
- SAMPLE_DESC_NAME: name of the sample description
The second step towards understanding this dataset is exploring how each column can be used in your research project; depending on your research topic, usage will vary according to the information you need.
- Investigating historical patterns of species distribution – By leveraging the temporal data in this dataset, researchers can observe changes in species abundance and diversity over a given period of time and compare it to environmental factors. This could shed light on current distributions of species as well as inform conservation efforts by providing information about formerly healthy ecosystems or unsustainable management practices.
- Determining the impact of human actions on biodiversity – Through analysis of BioTIME data, land development and subsequent changes to habitat loss may be identified, allowing researchers to understand the impact human action has had upon a species population size or geographic range over time.
- Analysing climate change effects on biodiversity – By examining changes in abundance, diversity and geographic range across different study sites captured over several years within this dataset, researchers may detect correlations between climatic conditions, such as temperature increases and precipitation levels, and changes in species diversity.
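The first use case above, tracking abundance and richness over time, can be sketched with a few pandas lines. The records are invented but follow the column names listed earlier:

```python
import pandas as pd

# Made-up records shaped like BioTIME's columns, for one hypothetical site.
records = pd.DataFrame({
    "YEAR": [2000, 2000, 2001, 2001, 2002, 2002],
    "GENUS": ["Parus", "Turdus", "Parus", "Turdus", "Parus", "Turdus"],
    "SPECIES": ["major", "merula", "major", "merula", "major", "merula"],
    "sum.allrawdata.ABUNDANCE": [12, 5, 9, 7, 6, 11],
})

# Total abundance per year, and species richness per year.
trend = records.groupby("YEAR")["sum.allrawdata.ABUNDANCE"].sum()
richness = records.groupby("YEAR")["SPECIES"].nunique()
print(trend.to_dict())     # {2000: 17, 2001: 16, 2002: 17}
print(richness.to_dict())  # {2000: 2, 2001: 2, 2002: 2}
```

The same grouping, applied per STUDY or per PLOT, is the starting point for the temporal comparisons described above.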
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.
Data content areas include:
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This is a dataset that tracks several figures regarding US debt (to the penny) since 1993.
All data are official figures from the U.S. Treasury that I have compiled and structured. Weekend dates (Saturday and Sunday) and federal holidays are excluded from the debt tracker because the Treasury's fiscal data do not account for those days. Recent political debates in the US over raising the debt ceiling inspired me to create this dataset. Personally, I believe the issue will continue to dominate political discourse due to the increasing polarization between Democrats and Republicans.
2023-02-17 - Dataset is created (10,914 days after temporal coverage start date).
GitHub Repository - The same data but on GitHub.
Link to Notebook
Important: each new record is accumulated data from previous days, i.e., a running total.
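Because each record is a running total, the day-over-day change is recovered by differencing consecutive records. A minimal sketch, with invented figures:

```python
import pandas as pd

# Hypothetical running totals (the real data is to-the-penny, weekdays only).
debt = pd.DataFrame({
    "date": pd.to_datetime(["2023-02-13", "2023-02-14", "2023-02-15", "2023-02-16"]),
    "total_debt": [31_450_000_000_000.0, 31_452_500_000_000.0,
                   31_451_900_000_000.0, 31_455_000_000_000.0],
}).set_index("date")

# First difference: how much the debt moved each day (NaN for the first row).
debt["daily_change"] = debt["total_debt"].diff()
print(debt["daily_change"])
```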
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The DOE and Berkeley Lab have partnered across the national laboratory complex and with the research community to curate, validate, and publish the world's largest set of labeled time-series data representing commercial HVAC systems operating in faulted and fault-free states. The data sets currently cover rooftop units, single-duct air handler units, dual-duct air handler units, variable air volume boxes, fan coil units, chiller plants, and boiler plants. Each data set includes from 20 to more than 100 data points that are commonly monitored in today's buildings. This operational data is paired with ground truth information indicating which faults are present during which time periods.
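Pairing operational points with the ground-truth fault labels might look like the following sketch. The column names (`supply_air_temp`, `fault_present`) are placeholders, not the data sets' actual point names:

```python
import pandas as pd

# Invented operational data with a ground-truth fault flag per timestamp.
ops = pd.DataFrame({
    "timestamp": pd.date_range("2022-07-01", periods=6, freq="h"),
    "supply_air_temp": [13.0, 13.2, 16.8, 17.1, 13.1, 13.0],
    "fault_present": [False, False, True, True, False, False],
})

# Split the series into faulted and fault-free periods for comparison.
faulted = ops[ops["fault_present"]]
fault_free = ops[~ops["fault_present"]]
print(faulted["supply_air_temp"].mean(), fault_free["supply_air_temp"].mean())
```

The same split is the basis for training and evaluating fault detection and diagnostics models on these data.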
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Excel, Alabama, across 18 age groups. It lists the population in each age group along with the percentage each group represents of the total population of Excel. The dataset can be utilized to understand the population distribution of Excel by age. For example, using this dataset, we can identify the largest age group in Excel.
Key observations
The largest age group in Excel, AL was 45 to 49 years, with a population of 74 (15.64%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in Excel, AL was 85 years and over, with a population of 2 (0.42%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
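The "largest age group" observation can be reproduced mechanically from the counts. Below, the counts other than 74 and 2 are a hypothetical subset of the 18 groups, and the total (473) is implied by the stated 15.64% share:

```python
# Hypothetical subset of the 18 age-group counts; 74 and 2 come from the
# observation above, the rest are invented for illustration.
counts = {
    "Under 5 years": 21,
    "45 to 49 years": 74,
    "50 to 54 years": 60,
    "85 years and over": 2,
}
total = 473  # total population implied by 74 being 15.64%

largest = max(counts, key=counts.get)
share = round(100 * counts[largest] / total, 2)
print(largest, counts[largest], f"{share}%")  # 45 to 49 years 74 15.64%
```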
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel Population by Age. You can refer to it here.
The USGS Governmental Unit Boundaries dataset from The National Map (TNM) represents major civil areas for the Nation, including States or Territories, counties (or equivalents), Federal and Native American areas, congressional districts, minor civil divisions, incorporated places (such as cities and towns), and unincorporated places. Boundaries data are useful for understanding the extent of jurisdictional or administrative areas for a wide range of applications, including mapping or managing resources, and responding to natural disasters. Boundaries data also include extents of forest, grassland, park, wilderness, wildlife, and other reserve areas useful for recreational activities, such as hiking and backpacking. Boundaries data are acquired from a variety of government sources. The data represents the source data with minimal editing or review by USGS. Please refer to the feature-level metadata for information on the data source. The National Map boundaries data is commonly combined with other data themes, such as elevation, hydrography, structures, and transportation, to produce general reference base maps. The National Map viewer allows free downloads of public domain boundaries data in either ESRI File Geodatabase or Shapefile formats. For additional information on the boundaries data model, go to https://www.usgs.gov/core-science-systems/national-geospatial-program/national-map.
Premium B2C Consumer Database - 269+ Million US Records
Supercharge your B2C marketing campaigns with our comprehensive consumer database, featuring over 269 million verified US consumer records. Our 20+ years of data expertise deliver higher quality and more extensive coverage than competitors.
Core Database Statistics
Consumer Records: Over 269 million
Email Addresses: Over 160 million (verified and deliverable)
Phone Numbers: Over 76 million (mobile and landline)
Mailing Addresses: Over 116 million (NCOA processed)
Geographic Coverage: Complete US (all 50 states)
Compliance Status: CCPA compliant with consent management
Targeting Categories Available
Demographics: Age ranges, education levels, occupation types, household composition, marital status, presence of children, income brackets, and gender (where legally permitted)
Geographic: Nationwide, state-level, MSA (Metropolitan Statistical Area), zip code radius, city, county, and SCF range targeting options
Property & Dwelling: Home ownership status, estimated home value, years in residence, property type (single-family, condo, apartment), and dwelling characteristics
Financial Indicators: Income levels, investment activity, mortgage information, credit indicators, and wealth markers for premium audience targeting
Lifestyle & Interests: Purchase history, donation patterns, political preferences, health interests, recreational activities, and hobby-based targeting
Behavioral Data: Shopping preferences, brand affinities, online activity patterns, and purchase timing behaviors
Multi-Channel Campaign Applications
Deploy across all major marketing channels:
Email marketing and automation
Social media advertising
Search and display advertising (Google, YouTube)
Direct mail and print campaigns
Telemarketing and SMS campaigns
Programmatic advertising platforms
Data Quality & Sources
Our consumer data aggregates from multiple verified sources:
Public records and government databases
Opt-in subscription services and registrations
Purchase transaction data from retail partners
Survey participation and research studies
Online behavioral data (privacy compliant)
Technical Delivery Options
File Formats: CSV, Excel, JSON, XML formats available
Delivery Methods: Secure FTP, API integration, direct download
Processing: Real-time NCOA, email validation, phone verification
Custom Selections: 1,000+ selectable demographic and behavioral attributes
Minimum Orders: Flexible based on targeting complexity
Unique Value Propositions
Dual Spouse Targeting: Reach both household decision-makers for maximum impact
Cross-Platform Integration: Seamless deployment to major ad platforms
Real-Time Updates: Monthly data refreshes ensure maximum accuracy
Advanced Segmentation: Combine multiple targeting criteria for precision campaigns
Compliance Management: Built-in opt-out and suppression list management
Ideal Customer Profiles
E-commerce retailers seeking customer acquisition
Financial services companies targeting specific demographics
Healthcare organizations with compliant marketing needs
Automotive dealers and service providers
Home improvement and real estate professionals
Insurance companies and agents
Subscription services and SaaS providers
Performance Optimization Features
Lookalike Modeling: Create audiences similar to your best customers
Predictive Scoring: Identify high-value prospects using AI algorithms
Campaign Attribution: Track performance across multiple touchpoints
A/B Testing Support: Split audiences for campaign optimization
Suppression Management: Automatic opt-out and DNC compliance
Pricing & Volume Options
Flexible pricing structures accommodate businesses of all sizes:
Pay-per-record for small campaigns
Volume discounts for large deployments
Subscription models for ongoing campaigns
Custom enterprise pricing for high-volume users
Data Compliance & Privacy
VIA.tools maintains industry-leading compliance standards:
CCPA (California Consumer Privacy Act) compliant
CAN-SPAM Act adherence for email marketing
TCPA compliance for phone and SMS campaigns
Regular privacy audits and data governance reviews
Transparent opt-out and data deletion processes
Getting Started
Our data specialists work with you to:
Define your target audience criteria
Recommend optimal data selections
Provide sample data for testing
Configure delivery methods and formats
Implement ongoing campaign optimization
Why We Lead the Industry
With over two decades of data industry experience, we combine extensive database coverage with advanced targeting capabilities. Our commitment to data quality, compliance, and customer success has made us the preferred choice for businesses seeking superior B2C marketing performance.
Contact our team to discuss your specific targeting requirements and receive custom pricing for your marketing objectives.
In the rapidly moving proteomics field, a diverse patchwork of data analysis pipelines and algorithms for data normalization and differential expression analysis is used by the community. First, we generated a mass spectrometry downstream analysis pipeline (MS-DAP) that integrates both popular and recently developed algorithms for normalization and statistical analyses; additional algorithms can easily be added in the future as plugins. MS-DAP is open source and facilitates transparent and reproducible proteome science by generating extensive data visualizations and quality reporting, provided as standardized PDF reports. Second, we performed a systematic evaluation of methods for normalization and statistical analysis on a large variety of data sets, including additional data generated in this study, which revealed key differences. Commonly used approaches for differential testing based on moderated t-statistics were consistently outperformed by more recent statistical models, all integrated in MS-DAP. Third, we introduced a novel normalization algorithm that rescues deficiencies observed in commonly used normalization methods. Finally, we used the MS-DAP platform to reanalyze a recently published large-scale proteomics data set of CSF from AD patients. This revealed increased sensitivity, resulting in additional significant target proteins that improved overlap with results reported in related studies, including a large set of new potential AD biomarkers in addition to those previously reported.
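For readers unfamiliar with normalization in this context, a common baseline (log2 transform plus per-sample median centering) can be sketched as follows. This is an illustration only, not MS-DAP's novel algorithm, and the intensities are invented:

```python
import numpy as np

# Invented intensity matrix: rows = proteins, columns = samples.
intensities = np.array([
    [1000.0, 2000.0, 500.0],
    [4000.0, 8000.0, 2000.0],
    [ 250.0,  500.0, 125.0],
])

# Log2-transform, then subtract each sample's median to remove
# sample-wide offsets (e.g. differences in total loaded material).
log2 = np.log2(intensities)
normalized = log2 - np.median(log2, axis=0)
print(np.median(normalized, axis=0))  # each sample's median is now 0
```

In this toy example the three samples differ only by a constant scaling factor, so after centering their normalized profiles coincide exactly; real data of course deviate from this ideal, which is what the paper's systematic evaluation probes.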
Presentation service (WMS) for a freely usable, worldwide, uniform web map based on free and official data sources. The product presents, among other things, free official geodata of the federal government and of the open-data states Berlin, Brandenburg, Hamburg, North Rhine-Westphalia, Saxony, and Thuringia. In addition, Mecklenburg-Western Pomerania and Rhineland-Palatinate provide their official spatial data for the TopPlusOpen within the framework of a cooperation agreement, so these states are likewise represented exclusively by official data. In the other federal states and abroad, OSM data is mainly used at the corresponding zoom levels; from the BKG's point of view it meets all quality requirements and can be combined almost seamlessly with the official data. The TopPlusOpen web services are offered via the standardized WMS and WMTS interfaces and are high-performance.
Four variants are offered:
- TopPlusOpen: very detailed map display in solid colors
- TopPlusOpen Grayscale: content identical to the full-color version; automatically generated grayscale rendering
- TopPlusOpen Light: content reduced compared to the full-color version; subtle color scheme
- TopPlusOpen Light Grey: content identical to TopPlusOpen Light; presentation in shades of grey and individual discreet colors (waters, borders)
The TopPlusOpen web map is produced in two projections:
- Pseudo-Mercator (EPSG:3857): 19 scale levels, divided into three display areas (worldwide representation at small scales, Europe-wide representation at medium scales, and detailed representation for Germany and the adjacent foreign countries)
- UTM32 (EPSG:25832): 14 scale levels, divided into two display areas (Europe-wide representation at medium scales, and detailed representation for Germany and neighbouring countries)
The TopPlusOpen Light Grey layer is well suited for use as a background map: it has reduced content compared to the solid-color TopPlusOpen and is rendered in gray tones with individual pale colors (borders, waters).
Licence: https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/insitu-gridded-observations-global-and-regional/insitu-gridded-observations-global-and-regional_15437b363f02bf5e6f41fc2995e3d19a590eb4daff5a7ce67d1ef6c269d81d68.pdf
This dataset provides high-resolution gridded temperature and precipitation observations from a selection of sources. Additionally, the dataset contains daily global average near-surface temperature anomalies. All fields are defined on either daily or monthly frequency. The datasets are regularly updated to incorporate recent observations. The included data sources are commonly known as GISTEMP, Berkeley Earth, CPC and CPC-CONUS, CHIRPS, IMERG, CMORPH, GPCC and CRU, where the abbreviations are explained below. These data have been constructed from high-quality analyses of meteorological station series and rain gauges around the world, and as such provide a reliable source for the analysis of weather extremes and climate trends. The regular update cycle makes these data suitable for rapid studies of recent phenomena or events. The NASA Goddard Institute for Space Studies temperature analysis dataset (GISTEMP-v4) combines station data of the Global Historical Climatology Network (GHCN) with the Extended Reconstructed Sea Surface Temperature (ERSST) to construct a global temperature change estimate. The Berkeley Earth Foundation dataset (BERKEARTH) merges temperature records from 16 archives into a single coherent dataset. The NOAA Climate Prediction Center datasets (CPC and CPC-CONUS) define a suite of unified precipitation products with consistent quantity and improved quality by combining all information sources available at CPC and by taking advantage of the optimal interpolation (OI) objective analysis technique. The Climate Hazards Group InfraRed Precipitation with Station dataset (CHIRPS-v2) incorporates 0.05° resolution satellite imagery and in-situ station data to create gridded rainfall time series over the African continent, suitable for trend analysis and seasonal drought monitoring.
The Integrated Multi-satellitE Retrievals dataset (IMERG) by NASA uses an algorithm to intercalibrate, merge, and interpolate "all" satellite microwave precipitation estimates, together with microwave-calibrated infrared (IR) satellite estimates, precipitation gauge analyses, and potentially other precipitation estimators over the entire globe at fine time and space scales for the Tropical Rainfall Measuring Mission (TRMM) and its successor, Global Precipitation Measurement (GPM) satellite-based precipitation products. The Climate Prediction Center morphing technique dataset (CMORPH) by NOAA has been created using precipitation estimates that have been derived from low-orbiter satellite microwave observations exclusively. Geostationary IR data are then used as a means to transport the microwave-derived precipitation features during periods when microwave data are not available at a location. The Global Precipitation Climatology Centre dataset (GPCC) is a centennial product of monthly global land-surface precipitation based on the ~80,000 stations worldwide that feature record durations of 10 years or longer. The data coverage per month varies from ~6,000 (before 1900) to more than 50,000 stations. The Climatic Research Unit dataset (CRU v4) features an improved interpolation process, which delivers full traceability back to station measurements. The station measurements of temperature and precipitation are public, as are the gridded dataset and national averages for each country. Cross-validation was performed at a station level, and the results have been published as a guide to the accuracy of the interpolation. This catalogue entry complements the E-OBS record in many aspects, as it intends to provide high-resolution gridded meteorological observations at a global rather than continental scale. These data may be suitable as a baseline for model comparisons or extreme event analysis in the CMIP5 and CMIP6 datasets.
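The anomaly concept used by products such as GISTEMP can be sketched numerically: subtract each calendar month's mean over a fixed baseline period from the observations. All numbers below are invented:

```python
import numpy as np

# Invented monthly temperatures: rows = years, columns = Jan, Feb, Mar (deg C).
temps = np.array([
    [2.0, 5.0, 10.0],
    [2.4, 5.2, 10.1],
    [2.2, 5.4, 10.3],
    [2.6, 5.8, 10.7],
])

# Baseline climatology: per-month mean over the first three years.
climatology = temps[:3].mean(axis=0)

# Anomalies: deviation of every observation from its month's baseline mean.
anomalies = temps - climatology  # broadcasts over the year axis
print(anomalies[-1])             # most recent year relative to the baseline
```

Real products compute the climatology per grid cell over a standard reference period (e.g. 30 years), but the arithmetic is the same subtraction.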
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Free Soil population by gender and age. The dataset can be utilized to understand the gender distribution and demographics of Free Soil.
The dataset comprises the following two datasets across these two themes
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com for a feasibility assessment of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Adapted from Wikipedia: OpenStreetMap (OSM) is a collaborative project to create a free, editable map of the world. Created in 2004, it was inspired by the success of Wikipedia and has grown to more than two million registered users, who can add data by manual survey, GPS devices, aerial photography, and other free sources.

We've made available a number of tables (explained in detail below):

history_* tables: full history of OSM objects
planet_* tables: snapshot of current OSM objects as of Nov 2019

The history_* and planet_* table groups are composed of node, way, relation, and changeset tables. These contain the primary OSM data types plus an additional changeset table corresponding to OSM edits, for convenient access. These objects are encoded using the BigQuery GEOGRAPHY data type so that they can be operated upon with the built-in geography functions to perform geometry and feature selection and additional processing. Example analyses are given below.

This dataset is part of a larger effort to make data available in BigQuery through the Google Cloud Public Datasets program. OSM itself is produced as a public good by volunteers, and there are no guarantees about data quality. This public dataset is hosted in Google BigQuery and is included in BigQuery's free tier, meaning each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
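As an illustration of operating on the GEOGRAPHY-encoded OSM objects, the sketch below builds a GoogleSQL query that filters planet nodes by tag and proximity using the built-in geography functions. The table path and column names (`geometry`, `all_tags`) follow the public OSM dataset layout described above but should be verified in the BigQuery console; running the query requires the `google-cloud-bigquery` client and a GCP project.

```python
# Find nodes tagged amenity=cafe within 1 km of a point, using BigQuery's
# geography functions (ST_DWITHIN, ST_GEOGPOINT) on the GEOGRAPHY column.
QUERY = """
SELECT id, geometry
FROM `bigquery-public-data.geo_openstreetmap.planet_nodes`
WHERE EXISTS (
  SELECT 1 FROM UNNEST(all_tags) AS t
  WHERE t.key = 'amenity' AND t.value = 'cafe'
)
AND ST_DWITHIN(geometry, ST_GEOGPOINT(-122.4194, 37.7749), 1000)
"""

# To execute (not run here, since it needs credentials and a billing project):
#   from google.cloud import bigquery
#   rows = bigquery.Client().query(QUERY).result()
print(QUERY.strip())
```

A query like this scans only the columns it references, which is what makes the monthly free-tier processing quota practical for exploratory work.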
The State Contract and Procurement Registration System (SCPRS) was established in 2003 as a centralized database of information on State contracts and purchases over $5,000. eSCPRS represents the data captured in the State's eProcurement (eP) system, Bidsync, as of March 16, 2009. The data provided is an extract from that system for fiscal years 2012-2013, 2013-2014, and 2014-2015.
Data Limitations:
Some purchase orders have multiple UNSPSC numbers; however, only the first was used to identify the purchase order. Multiple UNSPSC numbers were included to provide additional data for a DGS special event; however, this affects the formatting of the file. The source system, Bidsync, is being deprecated, and these issues will be resolved in the future as state systems transition to Fi$cal.
Data Collection Methodology:
The data collection process starts with a data file from eSCPRS that is scrubbed and standardized prior to being uploaded into a SQL Server database. There are four primary tables. The Supplier, Department, and United Nations Standard Products and Services Code (UNSPSC) tables are reference tables. The Supplier and Department tables are updated and mapped to the appropriate numbering schema and naming conventions. The UNSPSC table is used to categorize line-item information and requires no further manipulation. The Purchase Order table contains raw data that requires conversion to the correct data format and mapping to the corresponding data fields. A stacking method is applied to the table to eliminate blanks where needed, and extraneous characters are removed from fields. The four tables are joined together and queries are executed to update the final Purchase Order Dataset table. Once the scrubbing and standardization process is complete, the data are uploaded into the SQL Server database.
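The scrub-and-standardize step described above can be sketched as follows. The field names (`unspsc_codes`, `supplier_id`, etc.) and the tiny in-memory reference tables are purely illustrative, not the actual eSCPRS schema; the sketch just shows the pattern of keeping the first UNSPSC code, stripping extraneous characters, and mapping values through reference tables before loading.

```python
import re

# Illustrative reference tables (in the real pipeline these are SQL Server tables).
UNSPSC = {"43211500": "Computers"}     # code -> category
SUPPLIERS = {"S001": "Acme Corp"}      # supplier id -> standardized name

def scrub(raw: dict) -> dict:
    """Scrub one raw purchase-order record: first UNSPSC code only,
    extraneous characters removed, reference-table values mapped."""
    first_code = raw["unspsc_codes"].split(";")[0].strip()
    clean_desc = re.sub(r"[^\w $.,-]", "", raw["description"])  # drop stray chars
    return {
        "po_number": raw["po_number"].strip(),
        "unspsc": first_code,
        "category": UNSPSC.get(first_code, "Unknown"),
        "supplier": SUPPLIERS.get(raw["supplier_id"], "Unknown"),
        "description": clean_desc,
    }

row = scrub({
    "po_number": " PO-1001 ",
    "unspsc_codes": "43211500; 43211600",  # multi-valued: only the first is kept
    "supplier_id": "S001",
    "description": "Laptop\x07 purchase",  # contains an extraneous control char
})
```

This mirrors why multi-UNSPSC purchase orders are identified by a single code in the published extract, as noted under the data limitations.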
Secondary/Related Resources: