A. SUMMARY Mechanical street sweeping and street cleaning schedule managed by San Francisco Public Works.
B. HOW THE DATASET IS CREATED This dataset is created by extracting all street sweeping schedule data from a Department of Public Works database. It is then geocoded to add common identifiers such as the Centerline Network Number ("CNN") and published to the open data portal.
C. UPDATE PROCESS This dataset is updated on an as-needed basis, when sweeping schedules change.
D. HOW TO USE THIS DATASET Use this dataset to understand, track, or analyze street sweeping in San Francisco.
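As a sketch of one common workflow, the schedule can be joined to the Street Centerlines dataset (described further below) on the shared CNN identifier. The file and column names here are assumptions; check each dataset's data dictionary before use.

```python
import pandas as pd

# Hypothetical file names; export both datasets from the open data portal as CSV.
sweeping = pd.read_csv("street_sweeping_schedule.csv")
centerlines = pd.read_csv("street_centerlines.csv")

# Join sweeping schedules to street geometry via the shared CNN identifier.
# The exact column names ("CNN" vs "cnn") are assumptions; adjust to the export.
merged = sweeping.merge(centerlines, left_on="CNN", right_on="cnn", how="left")
print(merged.head())
```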
[Updated 28/01/25 to fix an issue in the 'Lower' values, which were not fully representing the range of uncertainty. 'Median' and 'Higher' values remain unchanged. The size of the change varies by grid cell and fixed period/global warming level, but the average difference between the 'lower' values before and after this update is 0.6.]

What does the data show?
The Annual Count of Summer Days is the number of days per year where the maximum daily temperature (the hottest point in the day) is above 25°C. It measures how many times the threshold is exceeded (not by how much) in a year. Note that the term 'summer days' refers only to the threshold: temperatures above 25°C outside the summer months also contribute to the annual count. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded, as there are many factors, such as natural variability and local-scale processes, that the climate model is unable to represent.

The Annual Count of Summer Days is calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of summer days to previous values.

What are the possible societal impacts?
An increase in the Annual Count of Summer Days indicates increased health risks from high temperatures. Impacts include:
- Increased heat-related illnesses, hospital admissions or deaths among vulnerable people.
- Transport disruption due to overheating of railway infrastructure.
- Periods of increased water demand.
Other metrics such as the Annual Count of Hot Summer Days (days above 30°C), the Annual Count of Extreme Summer Days (days above 35°C) and the Annual Count of Tropical Nights (where the minimum temperature does not fall below 20°C) also indicate impacts from high temperatures, but they use different temperature thresholds.

What is a global warming level?
The Annual Count of Summer Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5), in which greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850-1900 and 2011-2020), while this dataset allows for the exploration of greater levels of warming.

The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level were calculated over a 21-year period, taken as 10 years either side of the first year at which the global warming level is reached; this year differs between model ensemble members. To calculate the value for the Annual Count of Summer Days, an average is taken across the 21-year period. The Annual Count of Summer Days therefore shows the number of summer days that could occur each year at each given level of warming. We cannot provide a precise likelihood for particular emission scenarios being followed in the real world. However, we note that RCP8.5 corresponds to emissions considerably above those expected under current international policy agreements.
The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate: it will depend on future greenhouse gas emission choices and on the sensitivity of the climate system, which is uncertain. Estimates based on current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could be either higher or lower than this.

What are the naming conventions and how do I explore the data?
This data contains a field for each global warming level and the two baselines. Fields are named with 'Summer Days', the warming level or baseline, and 'upper', 'median' or 'lower', e.g. 'Summer Days 2.5 median' is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. 'Summer Days 2.5 median' is 'SummerDays_25_median'. To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578
Please note that if viewing in ArcGIS Map Viewer, the map will default to 'Summer Days 2.0°C median' values.

What do the 'median', 'upper', and 'lower' values mean?
Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run, each member having slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future. For this dataset, the model projections consist of 12 separate ensemble members. The Annual Count of Summer Days was calculated for each ensemble member, and the members were then ranked from lowest to highest for each location. The 'lower' fields are the second lowest ranked ensemble member, the 'upper' fields are the second highest ranked ensemble member, and the 'median' field is the central value of the ensemble. This gives a median value and a spread of ensemble members indicating the range of possible outcomes in the projections. The spread can be used to infer the uncertainty in the projections: the larger the difference between the lower and upper fields, the greater the uncertainty. 'Lower', 'median' and 'upper' are also given for the baseline periods, as these values also come from the model used to produce the projections. This allows a fair comparison between the model projections and the recent past.

Useful links
This dataset was calculated following the methodology in the 'Future Changes to high impact weather in the UK' report and uses the same temperature thresholds as the 'State of the UK Climate' report.
Further information on the UK Climate Projections (UKCP).
Further information on understanding climate data within the Met Office Climate Data Portal.
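For illustration, a minimal sketch of how the underlying metric is computed, using synthetic daily maximum temperatures rather than actual UKCP18 model output:

```python
import pandas as pd
import numpy as np

# Illustrative only: synthetic daily maximum temperatures for one location.
dates = pd.date_range("2001-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
seasonal = 14 + 9 * np.sin(2 * np.pi * (dates.dayofyear - 105) / 365.25)
tmax = pd.Series(seasonal + rng.normal(0, 4, len(dates)), index=dates)

# A "summer day" is any day with tmax above 25 degC, in any month of the year.
summer_days_per_year = (tmax > 25.0).groupby(tmax.index.year).sum()

# The published value is a multi-year average of these annual counts
# (the real product averages over a 21-year window per warming level).
print(summer_days_per_year.mean())
```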
Open Database License (ODbL) v1.0 - https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
A. SUMMARY The San Francisco Controller's Office maintains a database of the salary and benefits paid to City employees since fiscal year 2013.
B. HOW THE DATASET IS CREATED This data is summarized and presented on the Employee Compensation report hosted at http://openbook.sfgov.org, and is also available in this dataset in CSV format.
C. UPDATE PROCESS New data is added on a bi-annual basis when available for each fiscal and calendar year.
D. HOW TO USE THIS DATASET Before using, please first review the following two resources: Data Dictionary - found in the 'About this dataset' section after clicking 'Show More'; Employee Compensation FAQ - https://support.datasf.org/help/employee-compensation-faq
This is a dataset hosted by the city of San Francisco. The organization has an open data platform found here, and they update their information according to the amount of data that is brought in. Explore San Francisco's data using Kaggle and all of the data sources available through the San Francisco organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by rawpixel on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
For any .bz2 file, it is recommended to use a parallel bzip2 decompressor (https://github.com/mxmlnkn/indexed_bzip2) for speed.
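A minimal usage sketch, assuming a hypothetical train.csv.bz2 and the open()/parallelization interface documented in the indexed_bzip2 README:

```python
import os
import indexed_bzip2 as ibz2

# Parallel decompression of a .bz2 file; 'parallelization' controls the
# number of decoder threads (interface per the indexed_bzip2 README).
# "train.csv.bz2" is a placeholder file name.
with ibz2.open("train.csv.bz2", parallelization=os.cpu_count()) as f:
    header = f.readline()  # returns raw bytes
    print(header)
```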
In summary:
See forum discussion for details of [1],[2]: https://www.kaggle.com/competitions/leash-BELKA/discussion/492846
This is somewhat obsolete as the competition progresses: ecfp6 gives better results and can be extracted quickly with scikit-fingerprints.
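A short sketch of extracting ECFP6 with scikit-fingerprints; it assumes the ECFPFingerprint transformer (radius 3 corresponds to ECFP6) and uses a few illustrative SMILES:

```python
from skfp.fingerprints import ECFPFingerprint

# Illustrative SMILES strings; in the competition these would come from the data.
smiles = ["CCO", "c1ccccc1", "CC(=O)Nc1ccc(O)cc1"]

# ECFP6 corresponds to radius 3 (diameter 6). scikit-fingerprints exposes a
# scikit-learn style transformer that accepts SMILES strings directly;
# parameter names here are assumptions based on the library's documented API.
fp = ECFPFingerprint(radius=3, fp_size=2048, n_jobs=-1)
X = fp.transform(smiles)  # shape: (n_molecules, fp_size)
print(X.shape)
```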
See forum discussion for details of [3]: https://www.kaggle.com/competitions/leash-BELKA/discussion/498858 https://www.kaggle.com/code/hengck23/lb6-02-graph-nn-example
See forum discussion for details of [4]: https://www.kaggle.com/competitions/leash-BELKA/discussion/505985 https://www.kaggle.com/code/hengck23/conforge-open-source-conformer-generator
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) - https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Citation metrics are widely used and misused. We have created a publicly available database of over 100,000 top scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator. Separate data are shown for career-long and single-year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given. Scientists are classified into 22 scientific fields and 176 subfields. Field- and subfield-specific percentiles are also provided for all scientists who have published at least 5 papers. Career-long data are updated to end-of-2020. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above.
The dataset and code provide an update to the previously released version 1 data at https://doi.org/10.17632/btchxktzyw.1. The version 2 dataset is based on the May 06, 2020 snapshot from Scopus, is updated to citation year 2019, and is available at https://doi.org/10.17632/btchxktzyw.2.
This version (3) is based on the Aug 01, 2021 snapshot from Scopus and is updated to citation year 2020.
[Updated 28/01/25 to fix an issue in the 'Lower' values, which were not fully representing the range of uncertainty. 'Median' and 'Higher' values remain unchanged. The size of the change varies by grid cell and fixed period/global warming level, but the average percentage change between the 'lower' values before and after this update is -1%.]

What does the data show?
A Growing Degree Day (GDD) is a day on which the average temperature is above 5.5°C, and it is the number of degrees above this threshold that counts towards the annual total. For example, if the average temperature for a specific day is 6°C, it contributes 0.5 Growing Degree Days to the annual sum, while an average temperature of 10.5°C contributes 5 Growing Degree Days. Because the data shows the annual sum of Growing Degree Days, the value can exceed 365 in some parts of the UK.

Annual Growing Degree Days are calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of GDD to previous values.

What are the possible societal impacts?
Annual Growing Degree Days indicate whether conditions are suitable for plant growth. An increase in GDD can indicate larger crop yields due to increased crop growth in warm temperatures, but crop growth also depends on other factors. For example, GDD does not include any measure of rainfall/drought, sunlight, day length or wind, species vulnerability, or plant dieback in extremely high temperatures. GDD can indicate increased crop growth until temperatures reach a critical level above which there are detrimental impacts on plant physiology. GDD does not estimate the growth of specific species and is not a measure of season length.

What is a global warming level?
Annual Growing Degree Days are calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5), in which greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850-1900 and 2011-2020), while this dataset allows for the exploration of greater levels of warming.

The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level were calculated over a 21-year period, taken as 10 years either side of the first year at which the global warming level is reached; this year differs between model ensemble members. To calculate the value for the Annual Growing Degree Days, an average is taken across the 21-year period. The Annual Growing Degree Days therefore show the number of growing degree days that could occur each year at each given level of warming. We cannot provide a precise likelihood for particular emission scenarios being followed in the real world. However, we note that RCP8.5 corresponds to emissions considerably above those expected under current international policy agreements.
The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate: it will depend on future greenhouse gas emission choices and on the sensitivity of the climate system, which is uncertain. Estimates based on current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could be either higher or lower than this.

What are the naming conventions and how do I explore the data?
This data contains a field for each global warming level and the two baselines. Fields are named with 'GDD' (Growing Degree Days), the warming level or baseline, and 'upper', 'median' or 'lower', e.g. 'GDD 2.5 median' is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. 'GDD 2.5 median' is 'GDD_25_median'. To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578
Please note that if viewing in ArcGIS Map Viewer, the map will default to 'GDD 2.0°C median' values.

What do the 'median', 'upper', and 'lower' values mean?
Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run, each member having slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future. For this dataset, the model projections consist of 12 separate ensemble members. Annual Growing Degree Days were calculated for each ensemble member, and the members were then ranked from lowest to highest for each location. The 'lower' fields are the second lowest ranked ensemble member, the 'upper' fields are the second highest ranked ensemble member, and the 'median' field is the central value of the ensemble. This gives a median value and a spread of ensemble members indicating the range of possible outcomes in the projections. The spread can be used to infer the uncertainty in the projections: the larger the difference between the lower and upper fields, the greater the uncertainty. 'Lower', 'median' and 'upper' are also given for the baseline periods, as these values also come from the model used to produce the projections. This allows a fair comparison between the model projections and the recent past.

Useful links
This dataset was calculated following the methodology in the 'Future Changes to high impact weather in the UK' report and uses the same temperature thresholds as the 'State of the UK Climate' report.
Further information on the UK Climate Projections (UKCP).
Further information on understanding climate data within the Met Office Climate Data Portal.
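For illustration, a minimal sketch of the GDD calculation described above, using synthetic daily mean temperatures rather than actual UKCP18 model output:

```python
import pandas as pd
import numpy as np

# Illustrative only: synthetic daily mean temperatures for one location.
dates = pd.date_range("2001-01-01", "2001-12-31", freq="D")
rng = np.random.default_rng(1)
tavg = pd.Series(
    10 + 7 * np.sin(2 * np.pi * (dates.dayofyear - 105) / 365.25)
    + rng.normal(0, 3, len(dates)),
    index=dates,
)

# Each day contributes max(0, tavg - 5.5) degree days, so a 6.0 degC day
# adds 0.5 and a 10.5 degC day adds 5.0, exactly as in the description.
daily_gdd = (tavg - 5.5).clip(lower=0)
annual_gdd = daily_gdd.groupby(daily_gdd.index.year).sum()
print(annual_gdd)
```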
ODC Public Domain Dedication and Licence (PDDL) v1.0 - http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY Transactions from FPPC Forms 460, 461, 496, 497, and 450. This dataset combines all schedules and pages, and includes unitemized totals. Only transactions from the "most recent" version of a filing (original/amendment) appear here.
B. HOW THE DATASET IS CREATED Committees file campaign statements with the Ethics Commission on a periodic basis. Those statements are stored with the Commission's data provider. Data is generally presented as-filed by committees.
If a committee files an amendment, the data from that filing completely replaces the original and any prior amendments in the filing sequence.
C. UPDATE PROCESS Each night starting at midnight Pacific time, a script checks the Commission's database for new filings and updates this dataset with transactions from those filings. The update can take a variable amount of time to complete. Viewing or downloading this dataset while the update is running may result in incomplete data, so it is highly recommended to view or download this data before midnight or after 8am.
During the update, some fields are copied from the Filings dataset into this dataset for viewing convenience. The copy process may occasionally fail for some transactions due to timing issues but should self-correct the following day. Transactions with a blank 'Filing Id Number' or 'Filing Date' field are such transactions, but they can be joined with the appropriate record using the 'Filing Activity Nid' field shared between the Filings and Transactions datasets.
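A sketch of that repair join in pandas, assuming hypothetical CSV exports of the two datasets; the key and field names follow the description above:

```python
import pandas as pd

# Hypothetical CSV exports of the Transactions and Filings datasets.
transactions = pd.read_csv("campaign_transactions.csv")
filings = pd.read_csv("campaign_filings.csv")

# Recover 'Filing Id Number' / 'Filing Date' for rows where the nightly copy
# failed, by joining on the shared 'Filing Activity Nid' key.
# Exact column names in the Filings export are assumptions; verify before use.
missing = transactions[transactions["Filing Id Number"].isna()]
repaired = missing.merge(
    filings[["Filing Activity Nid", "Filing Id Number", "Filing Date"]],
    on="Filing Activity Nid",
    how="left",
    suffixes=("", "_from_filing"),
)
print(len(repaired))
```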
D. HOW TO USE THIS DATASET
Transactions from rejected filings are not included in this dataset. Transactions from many different FPPC forms and schedules are combined in this dataset; refer to the "Form Type" column to differentiate transaction types.
Properties suffixed with "-nid" can be used to join the data between Filers, Filings, and Transaction datasets.
Refer to the Ethics Commission's webpage for more information.
FPPC Form 460 is organized into Schedules as follows:
RELATED DATASETS
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data from the Council's Annual Budget. The budget is comprised of Tables A to F and Appendix 1; each table is represented by a separate data file.

Table C is the Calculation of the Annual Rate on Valuation for the Financial Year for Balbriggan Town Council. It contains:
- Estimate of 'Money Demanded'
- Adopted 'Money Demanded'
- Estimated 'Irrecoverable rates and cost of collection'
- Adopted 'Irrecoverable rates and cost of collection'
- Total Sum to be Raised, which is the sum of 'Money Demanded' and 'Irrecoverable rates and cost of collection'
- 'Annual Rate on Valuation to meet Total Sum to be Raised'

This dataset is used to create Table C in the published Annual Budget document, which can be found at www.fingal.ie. The data is best understood by comparing it to Table C.

Data fields for Table C are as follows:
Doc : Table Reference
Heading : Indicates sections in the Table - Table C is comprised of one section, therefore the Heading value for all records = 1
Ref : Town Reference
Desc : Town Description
MD_Est : Money Demanded Estimated
MD_Adopt : Money Demanded Adopted
IR_Est : Irrecoverable rates and cost of collection Estimated
IR_Adopt : Irrecoverable rates and cost of collection Adopted
NEV : Annual Rate on Valuation to meet Total Sum to be Raised
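As a small illustrative sketch (with made-up figures), the 'Total Sum to be Raised' relationship can be checked directly from the fields:

```python
import pandas as pd

# Hypothetical one-row extract of the Table C file; figures are illustrative.
table_c = pd.DataFrame([{
    "Ref": 1, "Desc": "Balbriggan",
    "MD_Adopt": 1_000_000.0,  # Money Demanded (Adopted)
    "IR_Adopt": 50_000.0,     # Irrecoverable rates and cost of collection (Adopted)
    "NEV": 2.5,               # Annual Rate on Valuation to meet Total Sum to be Raised
}])

# Total Sum to be Raised = Money Demanded + Irrecoverable rates and cost of collection.
table_c["TotalSum_Adopt"] = table_c["MD_Adopt"] + table_c["IR_Adopt"]
print(table_c[["Desc", "TotalSum_Adopt"]])
```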
SPECIAL NOTE: C-MAPSS and C-MAPSS40K ARE CURRENTLY UNAVAILABLE FOR DOWNLOAD. Glenn Research Center management is reviewing the availability requirements for these software packages. We are working with Center management to get the review completed and issues resolved in a timely manner. We will post updates on this website when the issues are resolved. We apologize for any inconvenience. Please contact Jonathan Litt, jonathan.s.litt@nasa.gov, if you have any questions in the meantime.

Subject Area: Engine Health

Description: This data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation' and it is a tool for simulating realistic large commercial turbofan engine data. Each flight is a combination of a series of flight conditions with a reasonable linear transition period to allow the engine to change from one flight condition to the next. The flight conditions are arranged to cover a typical ascent from sea level to 35K ft and descent back down to sea level. The fault was injected at a given time in one of the flights and persists throughout the remaining flights, effectively increasing the age of the engine. The intent is to identify which flight, and when in that flight, the fault occurred.

How Data Was Acquired: The data provided is from a high-fidelity, system-level engine simulation designed to simulate nominal and faulted engine degradation over a series of flights. The simulated data was created with a Matlab Simulink tool called C-MAPSS.

Sample Rates and Parameter Description: The flights are full flight recordings sampled at 1 Hz and consist of 30 engine and flight-condition parameters. Each flight contains 7 unique flight conditions over an approximately 90 min flight, including ascent to cruise at 35K ft and descent back to sea level. The parameters for each flight are the flight conditions, health indicators, measured temperatures and pressure measurements.

Faults/Anomalies: Faults arose from the inlet engine fan, the low pressure compressor, the high pressure compressor, the high pressure turbine and the low pressure turbine.
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context: We share a large database containing electroencephalographic signals from 87 human participants, with more than 20,800 trials in total, representing about 70 hours of recording. It was collected during brain-computer interface (BCI) experiments and organized into 3 datasets (A, B, and C) that were all recorded following the same protocol: right- and left-hand motor imagery (MI) tasks during a single-day session. It includes the performance of the associated BCI users, detailed information about the users' demographic, personality and cognitive profiles, and the experimental instructions and code (executed in the open-source platform OpenViBE). Such a database could prove useful for various studies, including but not limited to: 1) studying the relationships between BCI users' profiles and their BCI performance, 2) studying how EEG signal properties vary across users' profiles and MI tasks, 3) using the large number of participants to design cross-user BCI machine learning algorithms, or 4) incorporating users' profile information into the design of EEG signal classification algorithms.

Sixty participants (Dataset A) performed the first experiment, designed to investigate the impact of experimenters' and users' gender on MI-BCI user training outcomes, i.e., users' performance and experience (Pillette et al.). Twenty-one participants (Dataset B) performed the second one, designed to examine the relationship between users' online performance (i.e., classification accuracy) and the characteristics of the chosen user-specific Most Discriminant Frequency Band (MDFB) (Benaroch et al.). The only difference between the two experiments lies in the algorithm used to select the MDFB. Dataset C contains 6 additional participants who completed one of the two experiments described above. Physiological signals were measured using a g.USBAmp (g.tec, Austria), sampled at 512 Hz, and processed online using OpenViBE 2.1.0 (Dataset A) and OpenViBE 2.2.0 (Dataset B). For Dataset C, participants C83 and C85 were recorded with OpenViBE 2.1.0 and the remaining 4 participants with OpenViBE 2.2.0. Experiments were recorded at Inria Bordeaux Sud-Ouest, France.

Duration: Each participant's folder contains approximately 48 minutes of EEG recording: six 7-minute runs and a 6-minute baseline.

Documents
Instructions: checklist read by experimenters during the experiments.
Questionnaires: the Mental Rotation test used, plus translations of 4 questionnaires, notably the Demographic and Social information, the Pre- and Post-session questionnaires, and the Index of Learning Styles. English and French versions.
Performance: the online OpenViBE BCI classification performance obtained by each participant is provided for each run, as well as answers to all questionnaires.
Scenarios/scripts: set of OpenViBE scenarios used to perform each of the steps of the MI-BCI protocol, e.g., acquire training data, calibrate the classifier or run the online MI-BCI.
Database: raw signals. Dataset A: N=60 participants. Dataset B: N=21 participants. Dataset C: N=6 participants.
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Change Counter is a dataset for object detection tasks - it contains Coins annotations for 2,440 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://eidc.ac.uk/licences/ogl/plain
The Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN) dataset is a global hydrological dataset containing publicly available daily flow data for 2,386 gauging stations across the globe which have natural or near-natural catchments. Metadata is also provided for the Full ROBIN Dataset, which consists of 3,060 gauging stations. Data were quality controlled by the central ROBIN team before being added to the dataset, and two levels of data quality are flagged to guide users towards appropriate usage. Most records span at least 40 years with minimal missing data, with some records starting in the late 19th century and running through to 2022. ROBIN represents a significant advance in global-scale, accessible streamflow data. The project was funded by the UK Natural Environment Research Council Global Partnership Seedcorn Fund (NE/W004038/1) and the NC-International programme (NE/X006247/1), delivering National Capability.
The NOAA Coastal Change Analysis Program (C-CAP) produces national standardized land cover and change products for the coastal regions of the U.S. C-CAP products inventory coastal intertidal areas, wetlands, and adjacent uplands, with the goal of monitoring changes in these habitats on a one-to-five-year repeat cycle. The timeframe for this metadata is reported as 1985 to 2010-era, but the actual dates of the Landsat imagery used to create the land cover may be a few years before or after each era. These maps are developed using Landsat Thematic Mapper imagery and can be used to track changes in the landscape through time. This trend information gives important feedback to managers on the success or failure of management policies and programs, and aids in developing a scientific understanding of the Earth system and its response to natural and human-induced changes. This understanding allows for the prediction of impacts due to these changes and the assessment of their cumulative effects, helping coastal resource managers make more informed regional decisions. NOAA C-CAP is a contributing member of the Multi-Resolution Land Characteristics consortium, and C-CAP products are included as the coastal expression of land cover within the National Land Cover Database.
This dataset covers vocational qualifications from 2012 to the present for England.
The dataset is updated every quarter. Data for previous quarters may be revised to insert late data or to correct an error. Updates also reflect where qualifications were re-categorised to a different type, level, sector subject area or awarding organisation. Where a quarterly update includes revisions to data for previous quarters, a table of revisions is published in the vocational and other qualifications quarterly release.
In the dataset, the number of certificates issued is rounded to the nearest 5, and values less than 5 appear as 'Fewer than 5' to preserve confidentiality (a 0 represents no certificates).
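A minimal sketch for handling the suppressed cells when loading the CSV; the file and column names are assumptions:

```python
import pandas as pd

# Hypothetical file/column names for the quarterly certificates CSV.
df = pd.read_csv("vocational_qualifications.csv", dtype={"Certificates": str})

# Counts are rounded to the nearest 5; suppressed cells read 'Fewer than 5'.
# One simple convention is to treat suppressed cells as missing (NaN).
df["Certificates_num"] = pd.to_numeric(
    df["Certificates"].replace("Fewer than 5", pd.NA), errors="coerce"
)
print(df["Certificates_num"].describe())
```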
Where a qualification has been owned by more than one awarding organisation at different points in time, a separate row is given for each organisation.
Background information and key headlines for every quarter are published in the vocational and other qualifications quarterly release.
For any queries contact us at data.analytics@ofqual.gov.uk.
ODC Public Domain Dedication and Licence (PDDL) v1.0 - http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY The dataset inventory provides a list of data maintained by departments that are candidates for open data publishing or have already been published; it is collected in accordance with Chapter 22D of the Administrative Code. The inventory will be used in conjunction with department publishing plans to track progress toward meeting plan goals for each department.
B. HOW THE DATASET IS CREATED This dataset is collated in two ways: 1. Ongoing updates made throughout the year to reflect new datasets; this process involves DataSF staff reconciling publishing records after datasets are published. 2. Annual bulk updates: departments review their inventories, identify changes and updates, and submit those to DataSF for a once-a-year bulk update. Not all departments will have changes, or their changes will already have been captured over the course of the prior year as ongoing updates.
C. UPDATE PROCESS The dataset is synced automatically each day, but the underlying data changes manually throughout the year as needed.
D. HOW TO USE THIS DATASET Interpreting dates in this dataset: this dataset has 2 dates: 1. Date Added - when the dataset was added to the inventory itself. 2. First Published - the open data portal automatically captures the date the dataset was first created; this is that system-generated date.
Note that in certain cases we may have published a dataset prior to it being added to the inventory. We do our best to have an accurate accounting of when something was added to this inventory and when it was published. In most cases the inventory addition will happen prior to publishing, but in certain cases it will be published and we will have missed updating the inventory as this is a manual process.
First Published gives an accounting of when the dataset actually became available on the open data catalog, and Date Added of when it was added to this list.
E. RELATED DATASETS
ODC Public Domain Dedication and Licence (PDDL) v1.0 - http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY A list of street centerlines, including both active and retired streets. These centerlines are identified by their Centerline Network Number ("CNN").
B. HOW THE DATASET IS CREATED This data is extracted from the Department of Public Works Basemap. Supervisor District and Analysis Neighborhood are added during the loading process; these boundaries use the centroid (middle) of the line to determine the district or neighborhood.
C. UPDATE PROCESS This dataset refreshes daily, though the data may not change every day.
D. HOW TO USE THIS DATASET Note 1: The Class Code field is used for symbolization:
1 = Freeway
2 = Major street/Highway
3 = Arterial street
4 = Collector street
5 = Residential street
6 = Freeway ramp
0 = Other (private streets, paper streets, etc.)
E. RELATED DATASETS Understanding street-level data.
Data pushed to ArcGIS Online on November 10, 2025 at 3:25 AM by SFGIS. Data from: https://data.sfgov.org/d/3psu-pn9h
Description of dataset columns:
cnn
Centerline Network Number - unique identifier for dataset
lf_fadd
From address number on left side of street, the lowest number in the address range
lf_toadd
To address number on left side of street, the highest number in the address range
rt_fadd
From address number on right side of street, the lowest number in the address range
rt_toadd
To address number on right side of street, the highest number in the address range
street
Street name without street type
st_type
Street Type (AVE, ST, BLVD, etc.)
f_st
The street name that the segment intersects at its beginning.
t_st
The street name that the segment intersects at its end.
f_node_cnn
Centerline Network Number for the node/intersection that the street segment begins from.
t_node_cnn
Centerline Network Number for the node/intersection that the street segment ends on.
accepted
Accepted by City and County of San Francisco for maintenance.
active
Active street segment, i.e., not retired.
classcode
Classification code for the street segment, used for symbolization: 1 = Freeway; 2 = Major street/Highway; 3 = Arterial street; 4 = Collector street; 5 = Residential street; 6 = Freeway ramp; 0 = Other (private streets, paper streets, etc.)
date_added
Date added to dataset by Public Works.
date_altered
Date the segment was altered in the dataset by Public Works.
date_dropped
Date the segment was dropped from the dataset by Public Works.
gds_chg_id_add
The internal change transaction id when the segment was added.
gds_chg_id_altered
The internal change transaction id when the segment was altered.
gds_chg_id_dropped
The internal change transaction id when the segment was dropped/retired.
jurisdiction
Agency with jurisdiction over the segment, if any.
layer
Derived from the source AutoCAD drawing, this field indicates the category of segment. Definitions for each of the values:
Freeways - freeways such as 80, 280 and 101.
Paper - the centerline segment is present on an Assessor and/or Public Works map, but is not an actual street in reality.
Paper_fwys - as Paper, but under or near a freeway.
Paper_water - as Paper, but under water in the Bay.
PARKS - street segment maintained by the Recreation and Park Department, e.g., in Golden Gate Park.
Parks_NPS_FtMaso - street segment maintained by the National Park Service within Fort Mason.
Parks_NPS_Presid - street segment maintained by the National Park Service within the Presidio.
Private - street segment is not maintained by the City and is not on an Assessor or Public Works map.
Private_parking - as Private, and is a parking lot.
PSEUDO - street segment created for use in addressing.
Streets - standard street centerline segment.
Streets_HuntersP - standard street centerline segment within the Hunters Point Shipyard area.
Streets_Pedestri - standard street centerline segment, but pedestrian access only.
Streets_TI - standard street centerline segment within Treasure Island.
Streets_YBI - standard street centerline segment within Yerba Buena Island.
UPROW - Unpaved Right of Way street centerline segment.
nhood
SFRealtor-defined neighborhood that the segment primarily intersects.
oneway
Indicates if the street segment is a one-way street; possible values are F (the segment is one way beginning at the "from" street), T (the segment is one way beginning at the "to" street), or B (traffic is legal in "both" directions).
street_gc
Street name without street type, with leading zeroes dropped from numbered streets to facilitate geocoding.
streetname
Full street name and street type
streetname_gc
Full street name and street type, with leading zeroes dropped from numbered streets to facilitate geocoding.
zip_code
ZIP Code that street segment falls in.
analysis_neighborhood
current analysis neighborhood
supervisor_district
current supervisor district
line
Geometry
data_as_of
Timestamp the data was updated in the source system
data_loaded_at
Timestamp the data was loaded to the open data portal
Note: If no description was provided by DataSF, the cell is left blank. See the source data for more information.
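A short sketch of working with the class codes documented above, assuming a hypothetical CSV export of this dataset:

```python
import pandas as pd

# Hypothetical CSV export of this dataset from the open data portal.
streets = pd.read_csv("street_centerlines.csv")

# Map the documented classcode values to labels for symbolization.
CLASS_LABELS = {
    1: "Freeway", 2: "Major street/Highway", 3: "Arterial street",
    4: "Collector street", 5: "Residential street", 6: "Freeway ramp",
    0: "Other",
}
streets["class_label"] = streets["classcode"].map(CLASS_LABELS)

# Keep only active (non-retired) segments. 'active' may be stored as a
# boolean or as text depending on the export; adjust the test as needed.
active = streets[streets["active"].astype(str).str.lower() == "true"]
print(active["class_label"].value_counts())
```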
The SWOT Level 2 Lake Single-Pass Vector Data Product from the Surface Water Ocean Topography (SWOT) mission provides water surface elevation, area, and storage change derived from the high-rate (HR) data stream of the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025. Water surface elevation, area, and storage change are provided in three feature datasets covering the full swath for each continent-pass: 1) an observation-oriented feature dataset of lakes identified in the prior lake database (PLD), 2) a PLD-oriented feature dataset of lakes identified in the PLD, and 3) a feature dataset containing unassigned features (i.e., identified in neither the PLD nor the prior river database (PRD)). These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format.
Please note that this collection contains SWOT Version C science data products. This dataset is the parent collection to the following sub-collections:
https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_obs_2.0
https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_prior_2.0
https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_unassigned_2.0
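Since the product is distributed as Shapefiles, a granule can be inspected with geopandas; the file name below is a placeholder, not a real granule name:

```python
import geopandas as gpd

# Placeholder granule name; LakeSP granules are distributed as ESRI Shapefiles.
gdf = gpd.read_file("SWOT_L2_HR_LakeSP_Prior_example.shp")

# Inspect the water surface elevation and area attributes (exact field names
# vary by product version; check the product description document).
print(gdf.columns.tolist())
print(gdf.head())
```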
This dataset contains 1,000,000 rows of realistic student performance data, designed for beginners in Machine Learning to practice Linear Regression, model training, and evaluation techniques.
Each row represents one student with features like study hours, attendance, class participation, and final score.
The dataset is clean, simple, and structured to be beginner-friendly.
Random noise simulates differences in learning ability, motivation, etc.
Regression tasks: predict total_score from weekly_self_study_hours, attendance_percentage and class_participation.
Classification tasks: predict grade (A-F) using study hours, attendance, and participation.
Model evaluation practice.
✅ This dataset is intentionally kept simple, so that new ML learners can clearly see the relationship between input features (study, attendance, participation) and output (score/grade).
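A minimal regression sketch using the columns named above; the file name is a placeholder:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Placeholder file name; column names follow the task description above.
df = pd.read_csv("student_performance.csv")
features = ["weekly_self_study_hours", "attendance_percentage", "class_participation"]

# Hold out 20% of rows to evaluate the fitted model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["total_score"], test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out data:", r2_score(y_test, model.predict(X_test)))
```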
A. SUMMARY This dataset stores upcoming and current street closures occurring as a result of the Shared Spaces program, certain special events, and some construction work. This dataset only includes street closures permitted by the San Francisco Municipal Transportation Agency (SFMTA). It doesn't include street closures managed by other City departments such as Public Works or the Police Department.
B. HOW THE DATASET IS CREATED The data is exported from the Street Closure Salesforce database, which is maintained by the SFMTA ISCOTT unit, and converted to the geometry of streets and intersections based on closed street extent descriptions.
C. UPDATE PROCESS The database is updated constantly. This report is issued daily.
D. HOW TO USE THIS DATASET Please be aware that this dataset only contains temporary street closure events that are permitted (status = permitted). This dataset contains various types of street closures; if you are looking for a particular type, use the type field to select the appropriate closure type (e.g., type = Shared Space).
E. RELATED DATASETS For intersection points affected by temporary street closures (rather than the street segments provided here), please see the Temporary Street Closure Intersections dataset. For street closures in the WZDx standard format, see the Temporary Street Closures in the Work Zone Data Exchange (WZDx) Format dataset.
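A sketch of filtering by closure type through the portal's SODA API, assuming a placeholder resource id (the real id is on the dataset page); field values may differ in case:

```python
import requests

# Placeholder resource id; every Socrata dataset exposes a SODA endpoint of
# the form https://data.sfgov.org/resource/<dataset-id>.json
URL = "https://data.sfgov.org/resource/EXAMPLE-ID.json"

# Select only permitted Shared Spaces closures, per the guidance above.
# The exact casing of 'status' and 'type' values is an assumption.
params = {"status": "permitted", "type": "Shared Space", "$limit": 1000}
rows = requests.get(URL, params=params, timeout=30).json()
print(len(rows))
```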