https://creativecommons.org/publicdomain/zero/1.0/
DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.
This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']
This dataset is deprecated and not being updated.
Fork this kernel to get started with this dataset.
Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @meric from Unsplash.
Which neighborhoods have the highest proportion of offensive graffiti?
Which complaint is most likely to be made using Twitter and in which neighborhood?
What are the most complained about Muni stops in San Francisco?
What are the top 10 incident types that the San Francisco Fire Department responds to?
How many medical incidents and structure fires are there in each neighborhood?
What’s the average response time for each type of dispatched vehicle?
Which category of police incidents have historically been the most common in San Francisco?
What were the most common police incidents in the category of LARCENY/THEFT in 2016?
Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?
What is the average tree diameter?
What is the highest number of a particular species of tree planted in a single year?
Which San Francisco locations feature the largest number of trees?
VITAL SIGNS INDICATOR Population (LU1)
FULL MEASURE NAME Population estimates
LAST UPDATED October 2019
DESCRIPTION Population is a measurement of the number of residents that live in a given geographical area, be it a neighborhood, city, county or region.
DATA SOURCES U.S. Census Bureau: Decennial Census No link available (1960-1990) http://factfinder.census.gov (2000-2010)
California Department of Finance: Population and Housing Estimates Table E-6: County Population Estimates (1961-1969) Table E-4: Population Estimates for Counties and State (1971-1989) Table E-8: Historical Population and Housing Estimates (2001-2018) Table E-5: Population and Housing Estimates (2011-2019) http://www.dof.ca.gov/Forecasting/Demographics/Estimates/
U.S. Census Bureau: Decennial Census - via Longitudinal Tract Database Spatial Structures in the Social Sciences, Brown University Population Estimates (1970 - 2010) http://www.s4.brown.edu/us2010/index.htm
U.S. Census Bureau: American Community Survey 5-Year Population Estimates (2011-2017) http://factfinder.census.gov
U.S. Census Bureau: Intercensal Estimates Estimates of the Intercensal Population of Counties (1970-1979) Intercensal Estimates of the Resident Population (1980-1989) Population Estimates (1990-1999) Annual Estimates of the Population (2000-2009) Annual Estimates of the Population (2010-2017) No link available (1970-1989) http://www.census.gov/popest/data/metro/totals/1990s/tables/MA-99-03b.txt http://www.census.gov/popest/data/historical/2000s/vintage_2009/metro.html https://www.census.gov/data/datasets/time-series/demo/popest/2010s-total-metro-and-micro-statistical-areas.html
CONTACT INFORMATION vitalsigns.info@bayareametro.gov
METHODOLOGY NOTES (across all datasets for this indicator) All legal boundaries and names for Census geography (metropolitan statistical area, county, city, and tract) are as of January 1, 2010, released beginning November 30, 2010, by the U.S. Census Bureau. A Priority Development Area (PDA) is a locally-designated area with frequent transit service, where a jurisdiction has decided to concentrate most of its housing and jobs growth for development in the foreseeable future. PDA boundaries are current as of August 2019. For more information on PDA designation see http://gis.abag.ca.gov/website/PDAShowcase/.
Population estimates for Bay Area counties and cities are from the California Department of Finance, which are as of January 1st of each year. Population estimates for non-Bay Area regions are from the U.S. Census Bureau. Decennial Census years reflect population as of April 1st of each year whereas population estimates for intercensal estimates are as of July 1st of each year. Population estimates for Bay Area tracts are from the decennial Census (1970 -2010) and the American Community Survey (2008-2012 5-year rolling average; 2010-2014 5-year rolling average; 2013-2017 5-year rolling average). Estimates of population density for tracts use gross acres as the denominator.
Population estimates for Bay Area PDAs are from the decennial Census (1970-2010) and the American Community Survey (2006-2010 5-year rolling average; 2010-2014 5-year rolling average; 2013-2017 5-year rolling average). Population estimates for PDAs are derived from Census population counts at the tract level for 1970-1990 and at the block group level for 2000-2017. Population from either tracts or block groups is allocated to a PDA using an area ratio. For example, if a quarter of a Census block group lies within a PDA, a quarter of its population will be allocated to that PDA. Tract-to-PDA and block group-to-PDA area ratios are calculated using gross acres. Estimates of population density for PDAs use gross acres as the denominator.
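The area-ratio allocation described above can be sketched as follows. This is a minimal illustration, not the Vital Signs implementation: the field names (`population`, `gross_acres`, `acres_in_pda`) and all figures are hypothetical, and the sketch assumes the PDA's gross acreage equals the sum of the overlapping acres.

```python
def allocate_population(block_groups):
    """Sum each block group's population weighted by the share of its area inside the PDA."""
    total = 0.0
    for bg in block_groups:
        # Area ratio: gross acres inside the PDA divided by the block group's total gross acres.
        ratio = bg["acres_in_pda"] / bg["gross_acres"]
        total += bg["population"] * ratio
    return total


# Hypothetical block groups; the figures are made up for illustration.
block_groups = [
    {"population": 1200, "gross_acres": 80.0, "acres_in_pda": 20.0},  # a quarter inside
    {"population": 900, "gross_acres": 50.0, "acres_in_pda": 50.0},   # fully inside
    {"population": 1500, "gross_acres": 120.0, "acres_in_pda": 0.0},  # fully outside
]

pda_population = allocate_population(block_groups)
# Assumption: the PDA's gross acreage is the sum of the overlapping acres.
pda_acres = sum(bg["acres_in_pda"] for bg in block_groups)
pda_density = pda_population / pda_acres  # residents per gross acre
```

The first block group contributes a quarter of its 1,200 residents, matching the quarter-of-a-block-group example in the methodology note.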
Annual population estimates for metropolitan areas outside the Bay Area are from the Census and are benchmarked to each decennial Census. The annual estimates in the 1990s were not updated to match the 2000 benchmark.
The following is a list of cities and towns by geographical area:
Big Three: San Jose, San Francisco, Oakland
Bayside: Alameda, Albany, Atherton, Belmont, Belvedere, Berkeley, Brisbane, Burlingame, Campbell, Colma, Corte Madera, Cupertino, Daly City, East Palo Alto, El Cerrito, Emeryville, Fairfax, Foster City, Fremont, Hayward, Hercules, Hillsborough, Larkspur, Los Altos, Los Altos Hills, Los Gatos, Menlo Park, Mill Valley, Millbrae, Milpitas, Monte Sereno, Mountain View, Newark, Pacifica, Palo Alto, Piedmont, Pinole, Portola Valley, Redwood City, Richmond, Ross, San Anselmo, San Bruno, San Carlos, San Leandro, San Mateo, San Pablo, San Rafael, Santa Clara, Saratoga, Sausalito, South San Francisco, Sunnyvale, Tiburon, Union City, Vallejo, Woodside
Inland, Delta and Coastal: American Canyon, Antioch, Benicia, Brentwood, Calistoga, Clayton, Cloverdale, Concord, Cotati, Danville, Dixon, Dublin, Fairfield, Gilroy, Half Moon Bay, Healdsburg, Lafayette, Livermore, Martinez, Moraga, Morgan Hill, Napa, Novato, Oakley, Orinda, Petaluma, Pittsburg, Pleasant Hill, Pleasanton, Rio Vista, Rohnert Park, San Ramon, Santa Rosa, Sebastopol, Sonoma, St. Helena, Suisun City, Vacaville, Walnut Creek, Windsor, Yountville
Unincorporated: all unincorporated towns
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the South San Francisco population over the last 20 plus years. It lists the population for each year, along with the year-on-year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of South San Francisco across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if it has changed, when it peaked or whether it is still growing and has not yet reached its peak. We can also compare the trend with the overall trend of the United States population over the same period of time.
Key observations
In 2023, the population of South San Francisco was 63,123, a 0.08% year-over-year increase from 2022. Previously, in 2022, the population of South San Francisco was 63,073, a decline of 1.16% compared to a population of 63,816 in 2021. Over the last 20 plus years, between 2000 and 2023, the population of South San Francisco increased by 2,480. In this period, the peak population was 67,147 in the year 2016. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
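The year-on-year figures quoted above can be reproduced with a short calculation. The sketch below uses only the three population values stated in the key observations; the function name and the dictionary layout are illustrative assumptions, not the dataset's actual column structure.

```python
# Population figures quoted in the key observations above.
population = {2021: 63816, 2022: 63073, 2023: 63123}


def year_over_year(series):
    """Return {year: (absolute change, percent change)} for consecutive years."""
    changes = {}
    years = sorted(series)
    for prev, curr in zip(years, years[1:]):
        delta = series[curr] - series[prev]
        # Percent change is measured against the earlier year's population.
        changes[curr] = (delta, round(100 * delta / series[prev], 2))
    return changes


changes = year_over_year(population)
# changes[2022] is (-743, -1.16) and changes[2023] is (50, 0.08)
```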
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for South San Francisco Population by Year. You can refer to the same here
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A log of dataset alerts open, monitored or resolved on the open data portal. Alerts can include issues as well as deprecation or discontinuation notices.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of South San Francisco by race. It includes the population of South San Francisco across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of South San Francisco across relevant racial categories.
Key observations
The percent distribution of South San Francisco population by race (across all racial categories recognized by the U.S. Census Bureau): 26.58% are white, 1.73% are Black or African American, 0.69% are American Indian and Alaska Native, 43% are Asian, 1.12% are Native Hawaiian and other Pacific Islander, 14.04% are some other race and 12.86% are multiracial.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for South San Francisco Population by Race & Ethnicity. You can refer to the same here
A. SUMMARY This dataset contains population and demographic estimates and associated margins of error obtained and derived from the US Census. The data is presented over multiple years and geographies. The data is sourced primarily from the American Community Survey.
B. HOW THE DATASET IS CREATED The raw data is obtained from the census API. Some estimates are published as-is and some are derived.
C. UPDATE PROCESS New estimates and years of data are appended to this dataset. To request additional census data for San Francisco, email support@datasf.org
D. HOW TO USE THIS DATASET The dataset is long and contains multiple estimates, years and geographies. To use this dataset, you can filter by the overall segment which contains information about the source, years, geography, demographic category and reporting segment. For census data used in specific reports, you can filter to the reporting segment. To use a subset of the data, you can create a filtered view. More information on how to filter data and create a view can be found here
By City of San Francisco [source]
This dataset provides a composite index that captures the relative vulnerability of San Francisco communities to the health impacts of flooding and extreme storms. Predominantly sourced from local governmental health, housing, and public data sources, this index is constructed from an array of socio-economic factors, exposure indices, health indicators, and housing attributes. Used as a planning tool for both health and climate adaptation initiatives throughout San Francisco, this dataset helps to identify vulnerable populations within the city, such as areas with high concentrations of children or elderly individuals. Data points included in this index include: census block group numbers; the percentage of population under 18 years old; percentage of population above 65; percentage non-white; poverty levels; education level; yearly precipitation estimates; diabetes prevalence rate; mental health issues reported in the area; asthma cases by geographic location; disability rates within each block group; and housing quality metrics. Together, these components provide a broader understanding of how best to address the issues San Francisco faces from climate-change-related weather events such as floods or extreme storms.
This dataset can be used to analyze the vulnerability of the population in San Francisco to the health impacts of floods and storms. This dataset includes a number of important indicators such as poverty, education, demographic, exposure and health-related information. These indicators can be useful for developing effective strategies for health and climate adaptation in an urban area.
To get started with this dataset: First, review the data dictionary provided in the attachments section of this metadata to understand each variable that you plan to use in your analysis. Second, check for null or missing values in your columns by consulting the ‘Null Value’ column provided in this metadata sheet, consider how they will affect your analysis, and handle them with methods appropriate to your goals. Third, begin exploring relationships between variables using tools such as pandas.plotting.scatter_matrix() and DataFrame.corr(). These tools can help you identify potentially strong correlations between variables that you might not see through simple inspection of the data.
Lastly, if needed, use modelling techniques such as regression analysis, or other quantitative methods such as ANOVA, to further examine the relationships between the parameters involved.
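As a dependency-free illustration of the correlation step, the sketch below computes a Pearson coefficient between two columns while skipping rows where either value is missing, which is also what pandas `DataFrame.corr()` does by default. The column names `Children` and `Elderly` come from the data dictionary; the values are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Made-up percentages for two columns; None marks a missing value.
children = [10.0, 20.0, None, 30.0, 40.0]
elderly = [40.0, 30.0, 25.0, 20.0, 10.0]

# The third row is dropped because "children" is missing there.
r = pearson(children, elderly)
```

In this toy data the two columns move in exactly opposite directions, so the coefficient is at its lower bound of -1; real block group data would fall somewhere in between.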
- Developing targeted public health interventions focused on high-risk areas/populations as identified in the vulnerability index.
- Establishing criteria for insurance premiums and policies within high-risk areas/populations to incentivize adaptation to climate change.
- Visual mapping of individual indicators in order to identify trends and correlations between flood risk and socioeconomic indicators, resource availability, and/or healthcare provision levels at a granular level.
If you use this dataset in your research, please credit the original authors. Data Source
See the dataset description for more information.
File: san-francisco-flood-health-vulnerability-1.csv
| Column name | Description |
|:---------------------------|:----------------------------------------------------------------------------------------|
| Census Blockgroup | Unique numerical identifier for each block in the city. (Integer) |
| Children | Percentage of population under 18 years of age. (Float) |
| Children_wNULLvalues | Percentage of population under 18 years of age with null values. (Float) |
| Elderly | Percentage of population over 65 years of age. (Float) |
| Elderly_wNULLvalues | Percentage of population over 65 years of age with null values. (Float) |
| NonWhite | Percentage of non-white population. (Float) |
...
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This filtered view contains the population estimates for San Francisco demographic groups from the U.S. Census Bureau’s American Community Survey that are used in the Department of Public Health’s public reporting. Details on the underlying demographic data from the American Community Survey are available below. The demographics included are race/ethnicity and age groups. Different age groups are used for reporting on cases versus vaccinations. The specific groups used in each of these reports can be found by using the "reporting_segment" column. We are using 2016-2020 ACS estimates in our public reporting, but additional years are included in this view as well for historical purposes.
The COVID-19 reports which use this data are available on SF.gov.
San Francisco Population and Demographic Census data dataset filtered on:
B. HOW THE DATASET IS CREATED The raw data is obtained from the census API. Some estimates are published as-is and some are derived.
C. UPDATE PROCESS New estimates and years of data are appended to this dataset. To request additional census data for San Francisco, email support@datasf.org
D. HOW TO USE THIS DATASET The dataset is long and contains multiple estimates, years and geographies. To use this dataset, you can filter by the overall segment which contains information about the source, years, geography, demographic category and reporting segment. For census data used in specific reports, you can filter to the reporting segment. To use a subset of the data, you can create a filtered view. More information on how to filter data and create a view can be found here
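Filtering to a reporting segment can be sketched with the standard library. Only the `reporting_segment` column is named in the description; the other column names (`demographic_category`, `estimate`) and all values below are placeholders chosen for illustration.

```python
import csv
import io

# Hypothetical extract: only "reporting_segment" is named in the description;
# the other column names and all values are placeholders.
sample = io.StringIO(
    "reporting_segment,demographic_category,estimate\n"
    "segment_a,age_group_1,100\n"
    "segment_b,age_group_1,250\n"
    "segment_a,age_group_2,175\n"
)


def filter_segment(rows, segment):
    """Keep only the rows that belong to one reporting segment."""
    return [row for row in rows if row["reporting_segment"] == segment]


rows = list(csv.DictReader(sample))
segment_a = filter_segment(rows, "segment_a")
total_a = sum(int(row["estimate"]) for row in segment_a)
```

Because the dataset is in long format, summing estimates only makes sense after filtering down to a single source, year, and geography; the toy rows above assume that has already been done.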
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable. Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.
B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health.
Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates
Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths. To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.
Data notes on each population characteristic type are listed below.
Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.
Gender * The City collects information on gender identity using these guidelines.
C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week. Dataset will not update on the business day following any federal holiday.
D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS). This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date. New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed. This data may not be immediately available for more recent deaths. Data updates as more information becomes available. To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.
E. CHANGE LOG 9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department o
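The relationship between new deaths and cumulative deaths within a characteristic group can be sketched as a running total. The group labels and daily counts below are made up for illustration and are not taken from the dataset.

```python
from itertools import accumulate


def running_totals(new_by_group):
    """Map each characteristic group to the running total of its daily new deaths."""
    return {group: list(accumulate(counts)) for group, counts in new_by_group.items()}


# Made-up daily counts of new deaths, ordered by date, for two hypothetical groups.
new_deaths = {
    "group_a": [0, 2, 1, 0, 3],
    "group_b": [1, 0, 0, 2, 0],
}

cumulative_deaths = running_totals(new_deaths)
```

Because earlier days can be revised as more information arrives, the running total has to be recomputed from the full series after each update rather than incremented from the previous total.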
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY This dataset includes data on a variety of substance use services funded by the San Francisco Department of Public Health (SFDPH). This dataset only includes Drug Medi-Cal-certified residential treatment, withdrawal management, and methadone treatment. Other private non-Drug Medi-Cal treatment providers may operate in the city. Withdrawal management discharges are inclusive of anyone who left withdrawal management after admission and may include someone who left before completing withdrawal management.
This dataset also includes naloxone distribution from the SFDPH Behavioral Health Services Naloxone Clearinghouse and the SFDPH-funded Drug Overdose Prevention and Education program. Both programs distribute naloxone to various community-based organizations who then distribute naloxone to their program participants. Programs may also receive naloxone from other sources. Data from these other sources is not included in this dataset.
Finally, this dataset includes the number of clients on medications for opioid use disorder (MOUD).
The number of people who were treated with methadone at a Drug Medi-Cal-certified Opioid Treatment Program (OTP) by year is populated by the San Francisco Department of Public Health (SFDPH) Behavioral Health Services Quality Management (BHSQM) program. OTPs in San Francisco are required to submit patient billing data in an electronic medical record system called Avatar. BHSQM calculates the number of people who received methadone annually based on Avatar data. Only data from Drug Medi-Cal-certified OTPs were included in this dataset.
The number of people who receive buprenorphine by year is populated from the Controlled Substance Utilization Review and Evaluation System (CURES), administered by the California Department of Justice. All licensed prescribers in California are required to document controlled substance prescriptions in CURES. The Center on Substance Use and Health calculates the total number of people who received a buprenorphine prescription annually based on CURES data. Formulations of buprenorphine that are prescribed only for pain management are excluded.
People may receive buprenorphine and methadone in the same year, so you cannot add the Buprenorphine Clients by Year and Methadone Clients by Year data together to get the total number of unique people receiving medications for opioid use disorder.
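A small sketch shows why the two yearly counts cannot simply be added. The client identifiers are hypothetical: the dataset publishes only yearly counts, so the overlap between the two groups cannot be recovered from it.

```python
# Hypothetical client identifiers; the dataset itself publishes only yearly counts.
buprenorphine_clients = {"c1", "c2", "c3", "c4"}
methadone_clients = {"c3", "c4", "c5"}

# Adding the two counts treats people who received both medications as two people.
naive_sum = len(buprenorphine_clients) + len(methadone_clients)

# The number of unique people is the size of the union of the two groups.
unique_clients = len(buprenorphine_clients | methadone_clients)

# The overcount equals the number of people who appear in both groups.
overlap = len(buprenorphine_clients & methadone_clients)
```

Here the naive sum overstates the number of unique people by exactly the overlap, which is the quantity the published counts do not reveal.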
For more information on where to find treatment in San Francisco, visit findtreatment-sf.org.
B. HOW THE DATASET IS CREATED This dataset is created by copying the data into this dataset from the SFDPH Behavioral Health Services Quality Management Program, the California Controlled Substance Utilization Review and Evaluation System (CURES), and the Office of Overdose Prevention.
C. UPDATE PROCESS Residential Substance Use Treatment, Withdrawal Management, Methadone, and Naloxone data are updated quarterly with a 45-day delay. Buprenorphine data are updated quarterly and when the state makes this data available, usually at a 5-month delay.
D. HOW TO USE THIS DATASET Throughout the year this dataset may include partial year data for methadone and buprenorphine treatment. As both methadone and buprenorphine are used as long-term treatments for opioid use disorder, many people on treatment at the end of one calendar year will continue into the next. For this reason, doubling (methadone), or quadrupling (buprenorphine) partial year data will not accurately project year-end totals.
E. RELATED DATASETS Overdose-Related 911 Responses by Emergency Medical Services; Unintentional Overdose Death Rates by Race/Ethnicity; Preliminary Unintentional Drug Overdose Deaths
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset includes the locations of businesses that pay taxes to the City and County of San Francisco. Each registered business may have multiple locations and each location is a single row. The Treasurer & Tax Collector’s Office collects this data through business registration applications, account update/closure forms, and taxpayer filings. The data is collected to help enforce the Business and Tax Regulations Code including, but not limited to: Article 6, Article 12, Article 12-A, and Article 12-A-1. http://sftreasurer.org/registration
This is a dataset hosted by the City of San Francisco. The organization has an open data platform, and it updates its information according to the amount of data that is brought in. Explore San Francisco's data using Kaggle and all of the data sources available through the San Francisco organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by Rezaul Karim on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY This dataset includes COVID-19 tests by resident neighborhood and specimen collection date (the day the test was collected). Specifically, this dataset includes tests of San Francisco residents who listed a San Francisco home address at the time of testing. These resident addresses were then geo-located and mapped to neighborhoods. The resident address associated with each test is hand-entered and susceptible to errors; therefore, neighborhood data should be interpreted as an approximation, not a precise or comprehensive total.
In recent months, about 5% of tests are missing addresses and therefore cannot be included in any neighborhood totals. In earlier months, more tests were missing address data. Because of this high percentage of tests missing resident address data, the neighborhood testing data for March, April, and May should be interpreted with caution (see below).
Percentage of tests missing address information, by month in 2020:
Mar - 33.6%
Apr - 25.9%
May - 11.1%
Jun - 7.2%
Jul - 5.8%
Aug - 5.4%
Sep - 5.1%
Oct (Oct 1-12) - 5.1%
To protect the privacy of residents, the City does not disclose the number of tests in neighborhoods with resident populations of fewer than 1,000 people. These neighborhoods are omitted from the data (they include Golden Gate Park, John McLaren Park, and Lands End).
Tests for residents that listed a Skilled Nursing Facility as their home address are not included in this neighborhood-level testing data. Skilled Nursing Facilities have required and repeated testing of residents, which would change neighborhood trends and not reflect the broader neighborhood's testing data.
This data was de-duplicated by individual and date, so if a person gets tested multiple times on different dates, all tests will be included in this dataset (on the day each test was collected).
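De-duplicating by individual and date can be sketched as keeping one record per (person, date) pair. The identifiers and dates below are hypothetical, and the real pipeline's matching rules are not described in the source.

```python
# Hypothetical test records: (person identifier, specimen collection date).
records = [
    ("p1", "2020-06-01"),
    ("p1", "2020-06-01"),  # same person, same day: counted once
    ("p1", "2020-06-08"),  # same person, later date: counted again
    ("p2", "2020-06-01"),
]

# A set keeps one copy of each (person, date) pair; sorting makes the order stable.
deduplicated = sorted(set(records))

# Count the remaining tests per collection date.
tests_per_date = {}
for _, date in deduplicated:
    tests_per_date[date] = tests_per_date.get(date, 0) + 1
```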
The total number of positive test results is not equal to the total number of COVID-19 cases in San Francisco. During this investigation, some test results are found to be for persons living outside of San Francisco and some people in San Francisco may be tested multiple times (which is common). To see the number of new confirmed cases by neighborhood, reference this map: https://sf.gov/data/covid-19-case-maps#new-cases-maps
B. HOW THE DATASET IS CREATED COVID-19 laboratory test data is based on electronic laboratory test reports. Deduplication, quality assurance measures and other data verification processes maximize accuracy of laboratory test information. All testing data is then geo-coded by resident address. Then data is aggregated by analysis neighborhood and specimen collection date.
Data are prepared by close of business Monday through Saturday for public display.
C. UPDATE PROCESS Updates automatically at 05:00 Pacific Time each day. Redundant runs are scheduled at 07:00 and 09:00 in case of pipeline failure.
D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
Due to the high degree of variation in the time needed to complete tests by different labs there is a delay in this reporting. On March 24 the Health Officer ordered all labs in the City to report complete COVID-19 testing information to the local and state health departments.
In order to track trends over time, a data user can analyze this data by "specimen_collection_date".
Calculating Percent Positivity: The positivity rate is the percentage of tests that return a positive result for COVID-19 (positive tests divided by the sum of positive and negative tests). Indeterminate results, which could not conclusively determine whether COVID-19 virus was present, are not included in the calculation of percent positive. Percent positivity indicates how widespread COVID-19 is in San Francisco and it helps public health officials determine if we are testing enough given the number of people who are testing positive. When there are fewer than 20 positive tests for a given neighborhood and time period, the positivity rate is not calculated for the public tracker because rates of small test counts are less reliable.
Calculating Testing Rates: To calculate the testing rate per 10,000 residents, divide the total number of tests collected (positive, negative, and indeterminate results) for neighborhood by the total number of residents who live in that neighborhood (included in the dataset), then multiply by 10,000. When there are fewer than 20 total tests for a given neighborhood and time period, the testing rate is not calculated for the public tracker because rates of small test counts are less reliable.
Read more about how this data is updated and validated daily: https://sf.gov/information/covid-19-data-questions
E. CHANGE LOG
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.
B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from:
* Case interviews
* Laboratories
* Medical providers
These multiple streams of data are merged, deduplicated, and verified.
Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.
Gender * The City collects information on gender identity using these guidelines.
Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives. * This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.
Sexual orientation * The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable. * The City doesn’t collect or report information about sexual orientation for persons under 12 years of age. * Case investigation interviews transitioned to the California Department of Public Health's Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old. Learn more about our data collection guidelines pertaining to sexual orientation: https://www.sfdph.org/dph/files/PoliciesProcedures/COM9_SexualOrientationGuidelines.pdf
Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.
Homelessness Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases
Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.
Single Room Occupancy (SRO) tenancy * SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces. * The details of a person's living arrangements are verified during case interviews.
Transmission Type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
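The categorization rule described above can be sketched as a small Python function. This is only an illustration of the stated logic: the function name, its parameters, and the exact label strings are hypothetical and may not match the values stored in the dataset.

```python
from typing import Optional


def transmission_category(interviewed: bool,
                          known_contact: Optional[bool]) -> str:
    """Map a case-interview outcome to a transmission category.

    known_contact is True/False for the answer to the close-contact
    question, or None if the question was not asked (assumed encoding).
    """
    # Not interviewed, or interviewed but never asked: counted as unknown.
    if not interviewed or known_contact is None:
        return "Unknown"
    # Reported close contact with a known COVID-19 case.
    if known_contact:
        return "Contact with a known case"
    # Reported no contact with a known case.
    return "Community transmission"
```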
C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023.
D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cases on each date.
New cases are the count of cases within that characteristic group where the positive tests were collected on that specific specimen collection date. Cumulative cases are the running total of all San Francisco cases in that characteristic group up to the specimen collection date listed.
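The relationship between new and cumulative cases can be sketched in Python. This is a minimal example with made-up counts for a single characteristic group; the function name and the (date, count) tuple layout are assumptions, not the dataset's schema.

```python
from itertools import accumulate


def cumulative_cases(rows):
    """Return (date, new_cases, cumulative_cases) tuples in date order.

    rows is a list of (specimen_collection_date, new_cases) pairs, with
    dates as ISO strings so that sorting is chronological.
    """
    ordered = sorted(rows)
    # Running total of new cases up to and including each date.
    totals = accumulate(count for _, count in ordered)
    return [(date, new, total) for (date, new), total in zip(ordered, totals)]


# Illustrative values only, not real data.
example = [("2020-03-03", 2), ("2020-03-02", 3), ("2020-03-04", 0)]
result = cumulative_cases(example)
```

A date with zero new cases keeps the previous cumulative value, which is why cumulative counts never decrease within a group.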
This data may not be immediately available for recently reported cases. Data updates as more information becomes available.
To explore data on the total number of cases, use the ARCHIVED: COVID-19 Cases Over Time dataset.
E. CHANGE LOG
Draft dataset for Bay Area Census website prototype. Includes census 2010 population breakdown by age, sex and race.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
INTRODUCTION: As California’s homeless population continues to grow at an alarming rate, large metropolitan regions like the San Francisco Bay Area face unique challenges in coordinating efforts to track and improve homelessness. As an interconnected region of nine counties with diverse community needs, identifying homeless population trends across San Francisco Bay Area counties can help direct efforts more effectively throughout the region, and inform initiatives to improve homelessness at the city, county, and metropolitan level.
OBJECTIVES: The primary objective of this research is to compare the annual Point-in-Time (PIT) counts of homelessness across San Francisco Bay Area counties between the years 2018-2022. The secondary objective of this research is to compare the annual Point-in-Time (PIT) counts of homelessness among different age groups in each of the nine San Francisco Bay Area counties between the years 2018-2022.
METHODS: Two datasets were used to conduct research. The first dataset (Dataset 1) contains Point-in-Time (PIT) homeless counts published by the U.S. Department of Housing and Urban Development. Dataset 1 was cleaned using Microsoft Excel and uploaded to Tableau Desktop Public Edition 2022.4.1 as a CSV file. The second dataset (Dataset 2) was published by Data SF and contains shapefiles of geographic boundaries of San Francisco Bay Area counties. Both datasets were joined in Tableau Desktop Public Edition 2022.4 and all data analysis was conducted using Tableau visualizations in the form of bar charts, highlight tables, and maps.
RESULTS: Alameda, San Francisco, and Santa Clara counties consistently reported the highest annual count of people experiencing homelessness across all 5 years between 2018-2022. Alameda, Napa, and San Mateo counties showed the largest increase in homelessness between 2018 and 2022. Alameda County showed a significant increase in homeless individuals under the age of 18.
CONCLUSIONS: Results from this research reveal both stark and fluctuating differences in homeless counts among San Francisco Bay Area Counties over time, suggesting that a regional approach that focuses on collaboration across counties and coordination of services could prove beneficial for improving homelessness throughout the region. Results suggest that more immediate efforts to improve homelessness should focus on the counties of Alameda, San Francisco, Santa Clara, and San Mateo. Changes in homelessness during the COVID-19 pandemic years of 2020-2022 point to an urgent need to support Contra Costa County.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The Health Department has developed an inspection report and scoring system. After conducting an inspection of the facility, the Health Inspector calculates a score based on the violations observed. Violations can fall into:
* high risk category: records specific violations that directly relate to the transmission of food borne illnesses, the adulteration of food products and the contamination of food-contact surfaces.
* moderate risk category: records specific violations that are of a moderate risk to the public health and safety.
* low risk category: records violations that are low risk or have no immediate risk to the public health and safety.
The score card that will be issued by the inspector is maintained at the food establishment and is available to the public in this dataset. San Francisco's LIVES restaurant inspection data leverages the LIVES Flattened Schema (https://goo.gl/c3nNvr), which is based on LIVES version 2.0, cited on Yelp's website (http://www.yelp.com/healthscores).
This is a dataset hosted by the city of San Francisco. The organization has an open data platform found here and they update their information according to the amount of data that is brought in. Explore San Francisco's Data using Kaggle and all of the data sources available through the San Francisco organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by Autumn Goodman on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This filtered view contains the population estimates for San Francisco demographic groups from the U.S. Census Bureau’s American Community Survey that are used by Controller's Office - City Performance Unit for reporting on Police Stops
San Francisco Population and Demographic Census data dataset filtered on: "reporting_segment" = 'Police Reporting Demographic Categories'
A. SUMMARY This dataset contains population and demographic estimates and associated margins of error obtained and derived from the US Census. The data is presented over multiple years and geographies. The data is sourced primarily from the American Community Survey.
B. HOW THE DATASET IS CREATED The raw data is obtained from the census API. Some estimates are published as-is and some are derived.
C. UPDATE PROCESS New estimates and years of data are appended to this dataset. To request additional census data for San Francisco, email support@datasf.org
D. HOW TO USE THIS DATASET The dataset is long and contains multiple estimates, years and geographies. To use this dataset, you can filter by the overall segment, which contains information about the source, years, geography, demographic category and reporting segment. For census data used in specific reports, you can filter to the reporting segment. To use a subset of the data, you can create a filtered view. More information on how to filter data and create a view can be found here.
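Filtering to a reporting segment can be sketched in plain Python once rows have been exported from the dataset. This is a hedged illustration: the "reporting_segment" field and the 'Police Reporting Demographic Categories' value come from the filtered view described earlier, while the function name, the "estimate" field, the second segment value, and the sample rows are hypothetical.

```python
def filter_by_segment(rows, segment):
    """Keep only rows whose reporting_segment equals the given value."""
    return [row for row in rows if row.get("reporting_segment") == segment]


# Made-up rows for illustration; real rows carry more columns.
sample_rows = [
    {"reporting_segment": "Police Reporting Demographic Categories", "estimate": 100},
    {"reporting_segment": "Some Other Segment", "estimate": 200},
    {"estimate": 300},  # a row with no reporting segment at all
]

police_rows = filter_by_segment(
    sample_rows, "Police Reporting Demographic Categories"
)
```

Using row.get rather than row[...] means rows missing the field are skipped instead of raising an error, which is one reasonable choice for loosely structured exports.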
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY Medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA aggregated by Census ZIP Code Tabulation Areas and normalized by 2018 American Community Survey (ACS) 5-year estimates for population data to calculate rate per 10,000 residents.
Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, if a person was infected at work in San Francisco but lives in the East Bay, that case is not counted as an SF case; likewise, if a person dies at Zuckerberg San Francisco General but is from another county, that death is not counted in this dataset.
Dataset is cumulative and covers cases going back to March 2nd, 2020 when testing began. It is updated daily.
B. HOW THE DATASET IS CREATED Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2018 ACS estimates for population provided by the Census are used to create a rate equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.
C. UPDATE PROCESS Geographic analysis is scripted by SFDPH staff and synced to this dataset each day.
D. HOW TO USE THIS DATASET
Privacy rules in effect
To protect privacy, certain rules are in effect:
1. Case counts greater than 0 and less than 10 are dropped; these will be null (blank) values.
2. Cases are dropped altogether for areas where acs_population < 1000.
Rate suppression in effect where counts are lower than 20
Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach as this is best practice in epidemiology.
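The privacy and rate suppression rules above can be combined in a short Python sketch. This is one possible reading of the rules, not SFDPH's actual script: the function name and the returned dictionary shape are hypothetical, and using None for a dropped area or a blank value is an assumption.

```python
def summarize_area(count, acs_population):
    """Apply the privacy and rate suppression rules to one geographic area.

    Returns None if the area is dropped altogether, otherwise a dict
    with a possibly blank (None) count and rate.
    """
    # Rule 2: areas with fewer than 1,000 residents are dropped altogether.
    if acs_population < 1000:
        return None
    # Rule 1: counts greater than 0 and less than 10 become null.
    if 0 < count < 10:
        return {"count": None, "rate": None}
    # Rate suppression: only calculate a rate when the count is at least 20.
    # count * 10000 / population equals (count / population) * 10000.
    rate = count * 10000 / acs_population if count >= 20 else None
    return {"count": count, "rate": rate}
```

Note that counts from 10 through 19 are still published but carry no rate, so a blank rate does not by itself mean the count was suppressed.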
A note on Census ZIP Code Tabulation Areas (ZCTAs)
ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are polygonal representations of USPS ZIP Code service area routes. Read how the Census develops ZCTAs on their website.
This dataset is a filtered view of another dataset
You can find a full dataset of cases and deaths summarized by this and other geographic areas.
E. CHANGE LOG
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This filtered view contains the population estimates for San Francisco geographic units from the U.S. Census Bureau’s American Community Survey that are used in the Department of Public Health’s public reporting. Details on the underlying geographic unit data from the American Community Survey are available below. The geographies included are census tracts, analysis neighborhoods, and zip codes (ZCTA). We are using 2016-2020 ACS estimates in our public reporting, but additional years are included in this view as well for historical purposes.
The COVID-19 reports which use this data are available on SF.gov by clicking here.
San Francisco Population and Demographic Census data dataset filtered on:
B. HOW THE DATASET IS CREATED The raw data is obtained from the census API. Some estimates are published as-is and some are derived.
C. UPDATE PROCESS New estimates and years of data are appended to this dataset. To request additional census data for San Francisco, email support@datasf.org
D. HOW TO USE THIS DATASET The dataset is long and contains multiple estimates, years and geographies. To use this dataset, you can filter by the overall segment, which contains information about the source, years, geography, demographic category and reporting segment. For census data used in specific reports, you can filter to the reporting segment. To use a subset of the data, you can create a filtered view. More information on how to filter data and create a view can be found here.
More details about each file are in the individual file descriptions.
This is a dataset hosted by the city of San Francisco. The organization has an open data platform found here and they update their information according to the amount of data that is brought in. Explore San Francisco's Data using Kaggle and all of the data sources available through the San Francisco organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.
Cover photo by Emanuel Haas on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
This dataset is distributed under the following licenses: Open Data Commons Public Domain Dedication and License