This dataset is a compilation of address point data for the City of Tempe. The dataset contains a point location, the official address (as defined by The Building Safety Division of Community Development) for all occupiable units and any other official addresses in the City. There are several additional attributes that may be populated for an address, but they may not be populated for every address.
Contact: Lynn Flaaen-Hanna, Development Services Specialist
Contact E-mail
Link: Map that Lets You Explore and Export Address Data
Data Source: The initial dataset was created by combining several datasets and then reviewing the information to remove duplicates and identify errors. This published dataset is the system of record for Tempe addresses going forward, with the address information being created and maintained by The Building Safety Division of Community Development.
Data Source Type: ESRI ArcGIS Enterprise Geodatabase
Preparation Method: N/A
Publish Frequency: Weekly
Publish Method: Automatic
Data Dictionary
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we have collected from the Software Heritage archive—the largest publicly available archive of FOSS source code with accompanying development history—all versions of files whose names are commonly used to convey licensing terms to software users and developers. The dataset consists of 6.5 million unique license files that can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. Additional metadata about shipped license files are also provided, making the dataset ready to use in various contexts; they include: file length measures, detected MIME type, detected SPDX license (using ScanCode), example origin (e.g., GitHub repository), oldest public commit in which the license appeared. The dataset is released as open data as an archive file containing all deduplicated license blobs, plus several portable CSV files for metadata, referencing blobs via cryptographic checksums.
For more details see the included README file and companion paper:
Stefano Zacchiroli. A Large-scale Dataset of (Open Source) License Text Variants. In proceedings of the 2022 Mining Software Repositories Conference (MSR 2022). 23-24 May 2022 Pittsburgh, Pennsylvania, United States. ACM 2022.
If you use this dataset for research purposes, please acknowledge its use by citing the above paper.
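The checksum-based link between metadata rows and license blobs described above can be sketched as follows. This is only an illustration: the column names (`sha1`, `spdx`), the choice of SHA-1 as the checksum, and the sample blob are assumptions, not the dataset's documented schema; the included README is the authority on the actual layout.

```python
import csv
import hashlib
import io


def blob_checksum(content: bytes) -> str:
    """Return the hex SHA-1 digest of a blob's raw bytes (SHA-1 is assumed here)."""
    return hashlib.sha1(content).hexdigest()


def load_metadata(csv_text: str) -> dict[str, str]:
    """Map each checksum to its detected license identifier."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["sha1"]: row["spdx"] for row in reader}


# Hypothetical blob and metadata excerpt, built in memory for illustration.
blob = b"example license text\n"
metadata_csv = "sha1,spdx\n" + blob_checksum(blob) + ",MIT\n"

metadata = load_metadata(metadata_csv)
license_id = metadata.get(blob_checksum(blob))
print(license_id)
```

Keying the lookup on the checksum rather than on a file name is what allows deduplicated blobs to be shared by many origins.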
This dataset lists all software in use by NASA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Data Sources and Resources
This is a PDF document created by the Department of Information Technology (DoIT) and the Governor's Office of Performance Improvement to assist in training Maryland state employees on use of the Open Data Portal, https://opendata.maryland.gov. This document covers direct data entry, uploading Excel spreadsheets, connecting source databases, and transposing data. Please note that this tutorial is intended for use by state employees, as non-state users cannot upload datasets to the Open Data Portal.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘2019 NYC Open Data Plan: Removed Datasets’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/80355d19-52a3-435d-bc73-2dfb2770c3c4 on 13 November 2021.
--- Dataset description provided by original source is as follows ---
Datasets removed from the Open Data Plan, and an explanation why they were removed.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Machine learning (ML) has gained much attention and has been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts at filtering those projects to curate ML projects of high quality. The limited availability of such high-quality datasets poses an obstacle to understanding ML projects. To help clear this obstacle, we present NICHE, a manually labelled dataset consisting of 572 ML projects. Based on evidence of good software engineering practices, we label 441 of these projects as engineered and 131 as non-engineered. In this repository we provide the "NICHE.csv" file, which contains the list of project names along with their labels, descriptive information for every dimension, and several basic statistics, such as the number of stars and commits. This dataset can help researchers understand the practices that are followed in high-quality ML projects. It can also be used as a benchmark for classifiers designed to identify engineered ML projects.
GitHub page: https://github.com/soarsmu/NICHE
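As a rough sketch of how the label column of "NICHE.csv" might be tallied, the snippet below counts labels in a small in-memory sample. The column names (`project`, `label`, `stars`) and the sample rows are hypothetical; the real file's header should be checked before adapting this.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt; the real column names in NICHE.csv may differ.
SAMPLE = """project,label,stars
org/a,engineered,120
org/b,non-engineered,15
org/c,engineered,300
"""


def count_labels(csv_text: str) -> Counter:
    """Count how many projects carry each label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["label"] for row in reader)


counts = count_labels(SAMPLE)
print(counts)
```

Run against the full file, the same tally would be expected to reproduce the 441 engineered and 131 non-engineered projects reported above.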
https://creativecommons.org/publicdomain/zero/1.0/
The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank
This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.
For more information, see the World Bank website.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population
http://data.worldbank.org/data-catalog/ed-stats
https://cloud.google.com/bigquery/public-data/world-bank-education
Citation: The World Bank: Education Statistics
Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @till_indeman from Unsplash.
Of total government spending, what percentage is spent on education?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘2021 Open Data Plan: Future Releases’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/44d9a5f1-1555-44b4-a002-8f89b9bb986b on 26 January 2022.
--- Dataset description provided by original source is as follows ---
The collection of datasets set to be released on the Open Data Portal, according to the 2021 Open Data Plan.
--- Original source retains full ownership of the source dataset ---
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
DECD's listing of direct financial assistance to businesses from July 1, 2009 through June 30, 2024. New projects are usually added quarterly, but updates may be made on an ongoing basis.
Small Business Boost loan recipients can be found here: https://data.ct.gov/d/yk65-8y82
https://louisville-metro-opendata-lojic.hub.arcgis.com/pages/terms-of-use-and-license
On October 15, 2013, Louisville Mayor Greg Fischer announced the signing of an open data policy executive order in conjunction with his compelling talk at the 2013 Code for America Summit. In nonchalant cadence, the mayor announced his support for complete information disclosure by declaring, "It's data, man."
Sunlight Foundation - New Louisville Open Data Policy Insists Open By Default is the Future
Open Data Annual Reports
Section 5.A. Within one year of the effective date of this Executive Order, and thereafter no later than September 1 of each year, the Open Data Management Team shall submit to the Mayor an annual Open Data Report.
The Open Data Management Team (also known as the Data Governance Team) is currently led by the city's Data Officer Andrew McKinney in the Office of Civic Innovation and Technology. Previously (2014-16) it was led by the Director of IT.
Full Executive Order
EXECUTIVE ORDER NO. 1, SERIES 2013
AN EXECUTIVE ORDER CREATING AN OPEN DATA PLAN.
WHEREAS, Metro Government is the catalyst for creating a world-class city that provides its citizens with safe and vibrant neighborhoods, great jobs, a strong system of education and innovation, and a high quality of life; and
WHEREAS, it should be easy to do business with Metro Government. Online government interactions mean more convenient services for citizens and businesses and online government interactions improve the cost effectiveness and accuracy of government operations; and
WHEREAS, an open government also makes certain that every aspect of the built environment also has reliable digital descriptions available to citizens and entrepreneurs for deep engagement mediated by smart devices; and
WHEREAS, every citizen has the right to prompt, efficient service from Metro Government; and
WHEREAS, the adoption of open standards improves transparency, access to public information and improved coordination and efficiencies among Departments and partner organizations across the public, nonprofit and private sectors; and
WHEREAS, by publishing structured standardized data in machine readable formats the Louisville Metro Government seeks to encourage the local software community to develop software applications and tools to collect, organize, and share public record data in new and innovative ways; and
WHEREAS, in commitment to the spirit of Open Government, Louisville Metro Government will consider public information to be open by default and will proactively publish data and data containing information, consistent with the Kentucky Open Meetings and Open Records Act; and
NOW, THEREFORE, BE IT PROMULGATED BY EXECUTIVE ORDER OF THE HONORABLE GREG FISCHER, MAYOR OF LOUISVILLE/JEFFERSON COUNTY METRO GOVERNMENT AS FOLLOWS:
Section 1. Definitions. As used in this Executive Order, the terms below shall have the following definitions:
(A) “Open Data” means any public record as defined by the Kentucky Open Records Act, which could be made available online using Open Format data, as well as best practice Open Data structures and formats when possible. Open Data is not information that is treated exempt under KRS 61.878 by Metro Government.
(B) “Open Data Report” is the annual report of the Open Data Management Team, which shall (i) summarize and comment on the state of Open Data availability in Metro Government Departments from the previous year; (ii) provide a plan for the next year to improve online public access to Open Data and maintain data quality. The Open Data Management Team shall present an initial Open Data Report to the Mayor within 180 days of this Executive Order.
(C) “Open Format” is any widely accepted, nonproprietary, platform-independent, machine-readable method for formatting data, which permits automated processing of such data and is accessible to external search capabilities.
(D) “Open Data Portal” means the Internet site established and maintained by or on behalf of Metro Government, located at portal.louisvilleky.gov/service/data or its successor website.
(E) “Open Data Management Team” means a group consisting of representatives from each Department within Metro Government and chaired by the Chief Information Officer (CIO) that is responsible for coordinating implementation of an Open Data Policy and creating the Open Data Report.
(F) “Department” means any Metro Government department, office, administrative unit, commission, board, advisory committee, or other division of Metro Government within the official jurisdiction of the executive branch.
Section 2. Open Data Portal.
(A) The Open Data Portal shall serve as the authoritative source for Open Data provided by Metro Government.
(B) Any Open Data made accessible on Metro Government’s Open Data Portal shall use an Open Format.
Section 3. Open Data Management Team.
(A) The Chief Information Officer (CIO) of Louisville Metro Government will work with the head of each Department to identify a Data Coordinator in each Department. Data Coordinators will serve as members of an Open Data Management Team facilitated by the CIO and Metro Technology Services. The Open Data Management Team will work to establish a robust, nationally recognized, platform that addresses digital infrastructure and Open Data.
(B) The Open Data Management Team will develop an Open Data management policy that will adopt prevailing Open Format standards for Open Data, and develop agreements with regional partners to publish and maintain Open Data that is open and freely available while respecting exemptions allowed by the Kentucky Open Records Act or other federal or state law.
Section 4. Department Open Data Catalogue.
(A) Each Department shall be responsible for creating an Open Data catalogue, which will include comprehensive inventories of information possessed and/or managed by the Department.
(B) Each Department’s Open Data catalogue will classify information holdings as currently “public” or “not yet public”; Departments will work with Metro Technology Services to develop strategies and timelines for publishing open data containing information in a way that is complete, reliable, and has a high level of detail.
Section 5. Open Data Report and Policy Review.
(A) Within one year of the effective date of this Executive Order, and thereafter no later than September 1 of each year, the Open Data Management Team shall submit to the Mayor an annual Open Data Report.
(B) In acknowledgment that technology changes rapidly, in the future, the Open Data Policy should be reviewed and considered for revisions or additions that will continue to position Metro Government as a leader on issues of openness, efficiency, and technical best practices.
Section 6. This Executive Order shall take effect as of October 11, 2013.
Signed this 11th day of October, 2013, by Greg Fischer, Mayor of Louisville/Jefferson County Metro Government.
GREG FISCHER, MAYOR
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘2020 Open Data Plan: Future Releases’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/827034cd-2098-48d1-991f-8080886c66a4 on 27 January 2022.
--- Dataset description provided by original source is as follows ---
The collection of datasets set to be released on the Open Data Portal, according to the 2020 Open Data Plan.
--- Original source retains full ownership of the source dataset ---
Under the Freedom of Information Act 2000, I was wondering if you would be able to develop on top of the FOI Request FOI 24442 and FOI 27689. https://opendata.nhsbsa.net/dataset/foi-24442 https://opendata.nhsbsa.net/dataset/foi-27689 The data in this request relates to April 2020 to March 2022 and April 2022 to June 2022 from the data source ‘NHSBSA Information Services Data Warehouse’ with the Columns YEAR_MONTH, PRACTICE_CODE, DISPENSER_CODE, BNF_CODE, PRODUCT_ORDER_NUMBER, PACK_ORDER_NUMBER and NIC_GBP. Would it be possible to have the data in the same format from July 2022 to December 2022 or from July 2022 to the latest possible month please?
This asset is a derived view based on the system dataset 'Site Analytics: Asset Inventory' which is automatically generated by the data management platform and provides a comprehensive inventory of all assets on this site. This asset has been filtered to present an overview of the various types of data that are classified as public and have been published on the City of Austin Open Data Portal (data.austintexas.gov) by departmental data owners.
The columns of the Asset Inventory dataset contain information about every asset. These include metadata fields (e.g., Name, Description, and Category), as well as statistics, such as the number of visits, row count, column count, and downloads. This asset is updated at least once per day to sync any changes, additional assets, or removed assets.
Data provided by: Tyler Technologies
Creation date of data source: November 1, 2022
*City of Austin Open Data Terms of Use – https://data.austintexas.gov/stories/s/ranj-cccq
Data Source
The data source was the NHSBSA Information Services Data Warehouse.
Time period
Contractor list contains open pharmacies indicated as 100-hours on 15th March 2024.
Pharmacy Code
The pharmacy code, also known as the 'f-code', of any pharmacy indicated as a 100-hour pharmacy.
Pharmacy Name
The trading name associated with that pharmacy's f-code.
Please note that this request and our response is published on our Freedom of Information disclosure log at: https://opendata.nhsbsa.net/dataset/foi-01859
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the accompanying dataset to the following paper https://www.nature.com/articles/s41597-023-01975-w
Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.
If you use Caravan in your research, it would be appreciated to not only cite Caravan itself, but also the source datasets, to pay respect to the amount of work that was put into the creation of these datasets and that made Caravan possible in the first place.
All current development and additional community extensions can be found at https://github.com/kratzert/Caravan
Change log:
23 May 2022: Version 0.2 - Resolved a bug when renaming the LamaH gauge ids from the LamaH ids to the official gauge ids provided as "govnr" in the LamaH dataset attribute files.
24 May 2022: Version 0.3 - Fixed gaps in forcing data in some "camels" (US) basins.
15 June 2022: Version 0.4 - Fixed replacing negative CAMELS US values with NaN (-999 in CAMELS indicates missing observation).
1 December 2022: Version 0.4 - Added 4298 basins in the US, Canada and Mexico (part of HYSETS), now totalling 6830 basins. Fixed a bug in the computation of catchment attributes that are defined as pour point properties, where sometimes the wrong HydroATLAS polygon was picked. Restructured the attribute files and added some more metadata (station name and country).
16 January 2023: Version 1.0 - Version of the official paper release. No changes in the data but added a static copy of the accompanying code of the paper. For the most up to date version, please check https://github.com/kratzert/Caravan
10 May 2023: Version 1.1 - No data change, only an updated data description.
17 May 2023: Version 1.2 - Updated a handful of attribute values that were affected by a bug in their derivation. See https://github.com/kratzert/Caravan/issues/22 for details.
16 April 2024: Version 1.4 - Added 9130 gauges from the original source dataset that were initially not included because of the area thresholds (i.e. basins smaller than 100 sqkm or larger than 2000 sqkm). Also extended the forcing period for all gauges (including the original ones) to 1950-2023. Added two different download options that include timeseries data only as either csv files (Caravan-csv.tar.xz) or netcdf files (Caravan-nc.tar.xz). Including the large basins also required an update in the Earth Engine code.
16 Jan 2025: Version 1.5 - Added FAO Penman-Monteith PET (potential_evaporation_sum_FAO_PENMAN_MONTEITH) and renamed the ERA5-LAND potential_evaporation band to potential_evaporation_sum_ERA5_LAND. Also added all PET-related climate indices derived with the Penman-Monteith PET band (suffix "_FAO_PM") and renamed the old PET-related indices accordingly (suffix "_ERA5_LAND").
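The 15 June 2022 entry mentions replacing negative CAMELS US values with NaN because -999 marks a missing observation. A minimal sketch of that kind of masking is shown below; it is one reading of the entry, not the Caravan code itself, and the function name and sample series are made up for illustration.

```python
import math


def mask_negative(values):
    """Replace negative entries (such as the -999 missing-value marker) with NaN.

    Assumes the series holds a quantity that cannot be negative, e.g. discharge.
    """
    return [math.nan if v < 0 else v for v in values]


# Hypothetical discharge series containing one missing-value marker.
series = [1.2, -999.0, 0.0, 3.4]
cleaned = mask_negative(series)
print(cleaned)
```

Masking with NaN keeps the time axis intact, so downstream statistics can skip the gap instead of averaging in a large negative number.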
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data supporting the Springer Nature Data Availability Statement (DAS) analysis in the State of Open Data 2024.
SOOD_2024_special_analysis_DAS_SN.xlsx contains the DAS, DOI, publication date, DAS categories and related country by institution of any author.
SOOD 2024_DAS_analysis_sharing.xlsx contains the summary data by country and data sharing type.
Utilizing the Dimensions database, we identified articles containing key DAS identifiers such as “Data Availability Statement” or “Availability of Data and Materials” within their full text. Digital Object Identifiers (DOIs) of these articles were collected and matched against Springer Nature’s XML database to extract the DAS for each article. The extracted DAS were categorized into specific sharing types using text and data matching terms. For statements indicating that data are publicly available in a repository, we matched against a predefined list of repository identifiers, names, and URLs. The DAS were classified into the following categories:
1. Data are available from the author on request.
2. Data are included in the manuscript or its supplementary material.
3. Some or all of the data are publicly available, for example in a repository.
4. Figure source data are included with the manuscript.
5. Data availability is not applicable.
6. Data are declared as not available by the author.
7. Data available online but not in a repository.
These categories are non-exclusive: more than one can apply to any one article. Publications outside the 2019–2023 range and non-article publication types (e.g., book chapters) that were initially included in the Dimensions search results were excluded from the final dataset. Articles were included in the final analysis after applying the exclusion criteria. Upon processing, it was found that only 370 results were returned for Botswana across the five-year period; due to this low number, Botswana was not included in the DAS focused country-level analysis.
This analysis does not assess the accuracy of the DAS in the context of each individual article. There was no manual verification of the categories applied; as a result, terms used out of context could have led to misclassification. Approximately 5% of articles remained unclassified following text and data matching due to these limitations.
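To make the text-matching step concrete, the sketch below assigns non-exclusive categories to a statement using regular expressions. The category keys and the patterns are hypothetical stand-ins chosen for illustration; the actual matching terms and repository list used in the analysis are not given in this description.

```python
import re

# Hypothetical matching terms; the analysis's real term lists are not published here.
CATEGORY_PATTERNS = {
    "available_on_request": r"\b(on|upon) (reasonable )?request\b",
    "in_manuscript_or_supplement": r"\bsupplementary (material|information)\b",
    "public_repository": r"\b(zenodo|figshare|dryad|github)\b",
    "not_applicable": r"\bnot applicable\b",
}


def classify_das(statement: str) -> set[str]:
    """Return every category whose pattern matches the statement (non-exclusive)."""
    text = statement.lower()
    return {
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if re.search(pattern, text)
    }


example = (
    "Data are deposited in Zenodo and further data are available "
    "from the author upon reasonable request."
)
labels = classify_das(example)
print(sorted(labels))
```

Returning a set reflects the non-exclusive design: one statement can fall into several categories, and an empty set corresponds to the unclassified remainder. As the limitations above note, pattern matching of this kind can misfire when a term appears out of context.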
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We downloaded open access papers via PubMed from 10 systems and computational biology journals. We provide in this repository raw data in XML format. Our approach to extract software links from the downloaded papers and verify the archival stability of links is described in the Methods section of the paper. Timeout links were manually verified. Links extracted from the abstracts and the body of the surveyed papers (n=48,393) are available in CSV format here. For more information, please visit our main repository: https://github.com/smangul1/good.software
Under Section 21 of the Act, we are not required to provide information in response to a request if it is already reasonably accessible to you. The information you requested is available from web link: https://opendata.nhsbsa.net/dataset/foi-23358
Data for January 2022 and February 2022
A copy of the information is attached.
NHS Prescription Services process prescriptions for Pharmacy Contractors, Appliance Contractors, Dispensing Doctors and Personal Administration with information then used to make payments to pharmacists and appliance contractors in England for prescriptions dispensed in primary care settings (other arrangements are in place for making payments to Dispensing Doctors and Personal Administration). This involves processing over 1 billion prescription items and payments totalling over £9 billion each year. The information gathered from this process is then used to provide information on costs and trends in prescribing in England and Wales to over 25,000 registered NHS and Department of Health and Social Care users.
Data source
Source System - ISP (National MIS Files)
Time period
January 2022 and February 2022. The month refers to the month of the report.
Please note Appliance Contractors data within the MIS Report shows data for the following month (e.g. the MIS Report for January 2022 will show February 2022 Appliance Contractor data).
This dataset FOI25450 has 4 files: January and February 2022 MIS Pharmacy and January and February 2022 MIS Appliance Contractor.
This report consists of a management information file detailing monthly Community Pharmacy and Appliance Payments by type of payment and contractor account. Payments include all drug costs, fees, patient charges, locally authorised payments, etc. Other details such as the numbers of items dispensed and patient charges collected are also included.
The management information file reflects the contractor's payment and prescription data associated with the sustainability and transformation partnerships (STPs) structure at the relevant payment date. The data contained within the files can be interpreted correctly by using the ‘MIS Glossary’ available under ‘Management Information Spreadsheet (MIS) Report’ at https://www.nhsbsa.nhs.uk/information-services-portal-isp/isp-report-information .
Disclosure Control
The data in column METHADONE PAYMT and ADD FEE-2E within the Pharmacy dataset have been removed following Information Governance policy.
February 2022 is the latest MIS report that is available.
Please note that this request and our response is published on our Freedom of Information disclosure log at:
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
SEPAL (https://sepal.io/) is a free and open source cloud computing platform for geo-spatial data access and processing. It empowers users to quickly process large amounts of data on their computer or mobile device. Users can create custom analysis ready data using freely available satellite imagery, generate and improve land use maps, analyze time series, run change detection and perform accuracy assessment and area estimation, among many other functionalities in the platform. Data can be created and analyzed for any place on Earth using SEPAL.
Figure 1: Best pixel mosaic of Landsat 8 data for 2020 over Cambodia (https://data.apps.fao.org/catalog/dataset/9c4d7c45-7620-44c4-b653-fbe13eb34b65/resource/63a3efa0-08ab-4ad6-9d4a-96af7b6a99ec/download/cambodia_mosaic_2020.png)
SEPAL reaches over 5000 users in 180 countries for the creation of custom data products from freely available satellite data. SEPAL was developed as a part of the Open Foris suite, a set of free and open source software platforms and tools that facilitate flexible and efficient data collection, analysis and reporting. SEPAL combines and integrates modern geospatial data infrastructures and supercomputing power available through Google Earth Engine and Amazon Web Services with powerful open-source data processing software, such as R, ORFEO, GDAL, Python and Jupyter Notebooks. Users can easily access the archive of satellite imagery from NASA, the European Space Agency (ESA) as well as high spatial and temporal resolution data from Planet Labs and turn such images into data that can be used for reporting and better decision making.
National Forest Monitoring Systems in many countries have been strengthened by SEPAL, which provides technical government staff with computing resources and cutting edge technology to accurately map and monitor their forests. The platform was originally developed for monitoring forest carbon stock and stock changes for reducing emissions from deforestation and forest degradation (REDD+). The application of the tools on the platform now reaches far beyond forest monitoring by providing different stakeholders access to cloud based image processing tools, remote sensing and machine learning for any application. Presently, users work on SEPAL for various applications related to land monitoring, land cover/use, land productivity, ecological zoning, ecosystem restoration monitoring, forest monitoring, near real time alerts for forest disturbances and fire, flood mapping, mapping impact of disasters, peatland rewetting status, and many others.
The Hand-in-Hand initiative enables countries that generate data through SEPAL to disseminate their data widely through the platform and to combine their data with the numerous other datasets available through Hand-in-Hand.
Figure 2: Image classification module for land monitoring and mapping. Probability classification over Zambia (https://data.apps.fao.org/catalog/dataset/9c4d7c45-7620-44c4-b653-fbe13eb34b65/resource/868e59da-47b9-4736-93a9-f8d83f5731aa/download/probability_classification_over_zambia.png)