https://creativecommons.org/publicdomain/zero/1.0/
DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.
This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']
This dataset is deprecated and not being updated.
Fork this kernel to get started with this dataset.
Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @meric from Unsplash.
Which neighborhoods have the highest proportion of offensive graffiti?
Which complaint is most likely to be made using Twitter and in which neighborhood?
What are the most complained about Muni stops in San Francisco?
What are the top 10 incident types that the San Francisco Fire Department responds to?
How many medical incidents and structure fires are there in each neighborhood?
What’s the average response time for each type of dispatched vehicle?
Which category of police incidents has historically been the most common in San Francisco?
What were the most common police incidents in the category of LARCENY/THEFT in 2016?
Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?
What is the average tree diameter?
What is the highest number of a particular species of tree planted in a single year?
Which San Francisco locations feature the largest number of trees?
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 2 rows and is filtered where the book is Selected : why some people lead, why others follow, and why it matters. It features 7 columns including author, publication date, language, and book publisher.
People Data Labs is an aggregator of B2B person and company data. We source our globally compliant person dataset via our "Data Union".
The "Data Union" is our proprietary data sharing co-op. Customers opt-in to sharing their data and warrant that their data is fully compliant with global data privacy regulations. Some data sources are provided as a one time dump, others are refreshed every time we do a new data build. Our data sources come from a variety of verticals including HR Tech, Real Estate Tech, Identity/Anti-Fraud, Martech, and others. People Data Labs works with customers on compliance based topics. If a customer wishes to ensure anonymity, we work with them to anonymize the data.
Our person data has over 100 fields including resume data (work history, education), contact information (email, phone), demographic info (name, gender, birth date) and social profile information (linkedin, github, twitter, facebook, etc...).
This data contains information about people involved in a crash and if any injuries were sustained. This dataset should be used in combination with the traffic Crash and Vehicle dataset. Each record corresponds to an occupant in a vehicle listed in the Crash dataset. Some people involved in a crash may not have been an occupant in a motor vehicle, but may have been a pedestrian, bicyclist, or using another non-motor vehicle mode of transportation. Injuries reported are reported by the responding police officer. Fatalities that occur after the initial reports are typically updated in these records up to 30 days after the date of the crash. Person data can be linked with the Crash and Vehicle dataset using the “CRASH_RECORD_ID” field. A vehicle can have multiple occupants and hence has a one-to-many relationship between the Vehicle and Person datasets. However, a pedestrian is a “unit” by itself and has a one-to-one relationship between the Vehicle and Person tables. The Chicago Police Department reports crashes on IL Traffic Crash Reporting form SR1050. The crash data published on the Chicago data portal mostly follows the data elements in the SR1050 form. The current version of the SR1050 instructions manual, with detailed information on each data element, is available here. Change 11/21/2023: We have removed the RD_NO (Chicago Police Department report number) for privacy reasons.
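As a rough illustration of the linkage described above, the sketch below groups person records by CRASH_RECORD_ID and attaches them to their crash record. Only the CRASH_RECORD_ID field name comes from the description; every other field name and all sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; only CRASH_RECORD_ID is a field named in the description.
crashes = [
    {"CRASH_RECORD_ID": "c1", "crash_date": "2023-01-05"},
    {"CRASH_RECORD_ID": "c2", "crash_date": "2023-01-06"},
]
people = [
    {"CRASH_RECORD_ID": "c1", "person_type": "DRIVER", "injured": False},
    {"CRASH_RECORD_ID": "c1", "person_type": "PASSENGER", "injured": True},
    {"CRASH_RECORD_ID": "c2", "person_type": "PEDESTRIAN", "injured": True},
]


def link_people_to_crashes(crashes, people):
    """One-to-many join: return {crash id: crash record plus a 'people' list}."""
    by_crash = defaultdict(list)
    for person in people:
        by_crash[person["CRASH_RECORD_ID"]].append(person)
    return {
        crash["CRASH_RECORD_ID"]: {
            **crash,
            "people": by_crash.get(crash["CRASH_RECORD_ID"], []),
        }
        for crash in crashes
    }


linked = link_people_to_crashes(crashes, people)

# Count injured people per crash from the joined records.
injured_per_crash = {
    crash_id: sum(1 for p in record["people"] if p["injured"])
    for crash_id, record in linked.items()
}
```

Grouping the person rows once and then looking them up by key keeps the join linear in the number of records, which matters when the person table is much larger than the crash table.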
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of adm div (ca 80.000)
Sources and Contributions
Sources: GeoNames is aggregating over a hundred different data sources. Ambassadors: GeoNames Ambassadors help in many countries. Wiki: A wiki allows users to view the data and quickly fix errors and add missing places. Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This dataset provides an in-depth look at the dynamics of social interaction, particularly in Hong Kong. It contains comprehensive information regarding individuals, households and interactions between individuals such as their ages, frequency and duration of contact, and genders. This data can be utilized to evaluate various social and economic trends, behaviors, as well as dynamics observed at different levels. For example, this data set is an ideal tool to recognize population-level trends such as age and gender diversification of contacts or investigate the structure of social networks in addition to the implications of contact patterns on health and economic outcomes. Additionally, it offers valuable insights into dissimilar groups of people including their permanent residence activities related to work or leisure by enabling one to understand their interactions along with contact dynamics within their respective populations. Ultimately this dataset is key for attaining a comprehensive understanding of social contact dynamics which are fundamental for grasping why these interactions are crucial in Hong Kong's society today
This dataset provides detailed information about the social contact dynamics in Hong Kong. With this dataset, it is possible to gain a comprehensive understanding of the patterns of various forms of social contact - from permanent residence and work contacts to leisure contacts. This guide will provide an overview and guidelines on how to use this dataset for analysis.
Exploring Trends and Dynamics:
To begin exploring the trends and dynamics of social contact in Hong Kong, start by looking at demographic factors such as age, gender, ethnicity, and educational attainment associated with different types of contacts (permanent residence/work/leisure). Consider the frequency and duration of contacts within these segments to identify any potential differences between them. Additionally, look at how these factors interact with each other – observe which segments have higher levels of interaction with each other or if there are any differences between different population groups based on their demographic characteristics. This can be done through visualizations such as line graphs or bar charts which can illustrate trends across timeframes or population demographics more clearly than raw numbers would alone.
Investigating Social Networks:
The data collected through this dataset also allows for investigation into social networks: understanding who connects with whom in real-life interactions as well as through digital channels (if applicable). Focus on analyzing individual or family networks rather than larger groups in order to get a clearer picture without adding too much complexity to the analysis. Analyze commonalities among individuals within a network even after controlling for factors that could affect interaction, such as age or gender, using clustering techniques for this step if appropriate. Then compare networks between individuals or families using graph theory methods such as length distributions (the average number of relationships one has), degrees (the number of links connected from one individual or family unit) and centrality measures (identifying individuals who serve an important role bridging two different parts of the network). These methods will help provide insights into varying structures between large groups rather than focusing only on small-scale personal connections among friends, colleagues or relatives, which may not always offer accurate portrayals due to their naturally limited scope.
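To make the degree and centrality measures mentioned above concrete, here is a minimal standard-library sketch. It assumes the contact data has already been reduced to an undirected, unweighted adjacency list; the node names are hypothetical, and betweenness centrality (computed with Brandes' algorithm) is only one of several possible centrality measures, chosen here because it highlights "bridging" individuals.

```python
from collections import deque


def degrees(adj):
    """Degree of each node: the number of distinct contacts it has."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def betweenness(adj):
    """Unnormalised betweenness centrality for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search from source, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair is visited from both endpoints, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


# Hypothetical contact network: "c" links a and b to d, and "d" links on to e.
contacts = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

node_degrees = degrees(contacts)
node_betweenness = betweenness(contacts)
```

In this toy network "c" has both the highest degree and the highest betweenness, but the two measures can diverge in real contact data: a person with few contacts can still be the only link between two households.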
Modeling Health Implications:
Finally, consider modeling health implications stemming from these observed patterns– particularly implications that may not be captured by simpler measures like count per contact hour (which does not differentiate based on intensity). Take into account aspects like viral transmission risk by analyzing secondary effects generated from contact events captured in the data – things like physical proximity when multiple people meet up together over multiple days
- Analyzing the age, gender and contact dynamics of different areas within Hong Kong to understand the local population trends and behavior.
- Investigating the structure of social networks to study how patterns of contact vary among socio economic backgro...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Retail Analytics: Store owners can use the model to track the number of customers visiting their stores during different times of the day or seasons, which can help in workforce and resource allocation.
Crowd Management: Event organizers or public authorities can utilize the model to monitor crowd sizes at concerts, festivals, public gatherings or protests, aiding in security and emergency planning.
Smart Transportation: The model can be integrated into public transit systems to count the number of passengers in buses or trains, providing real-time occupancy information and assisting in transportation planning.
Health and Safety Compliance: During times of pandemics or emergencies, the model can be used to count the number of people in a location, ensuring compliance with restrictions on gathering sizes.
Building Security: The model can be adopted in security systems to track how many people enter and leave a building or a particular area, providing useful data for access control.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Title: Add title here
Project Team: Add contact information for research project team members
Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.
Relevant publications/outputs: When available, add links to the related publications/outputs from this data.
Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.
Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?
Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.
Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.
List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.
Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).
Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14
Background
The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.
Longitudinal data
The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. A full series of longitudinal data has been produced, going back to winter 1992. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
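The gross-flows idea described above can be sketched as a weighted cross-tabulation of labour force status in two consecutive quarters. The field names (`status_q1`, `status_q2`, `weight`) and the sample values below are hypothetical and are not the variable names used in the LFS files; in practice the longitudinal weight supplied with the linked dataset would take the place of `weight`.

```python
from collections import Counter

STATES = ("employed", "unemployed", "inactive")


def gross_flows(linked_records):
    """Weighted totals keyed by (status in first quarter, status in second quarter)."""
    flows = Counter()
    for record in linked_records:
        flows[(record["status_q1"], record["status_q2"])] += record["weight"]
    return flows


# Hypothetical linked two-quarter records.
records = [
    {"status_q1": "employed", "status_q2": "employed", "weight": 1200.0},
    {"status_q1": "employed", "status_q2": "unemployed", "weight": 300.0},
    {"status_q1": "unemployed", "status_q2": "employed", "weight": 450.0},
    {"status_q1": "inactive", "status_q2": "inactive", "weight": 900.0},
    {"status_q1": "unemployed", "status_q2": "employed", "weight": 50.0},
]

flows = gross_flows(records)

# Print the flow matrix: one row per origin state, one column per destination state.
for origin in STATES:
    print(origin, [flows[(origin, destination)] for destination in STATES])
```

The off-diagonal cells are the movements between categories; the diagonal cells are people whose status did not change between the two quarters.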
New reweighting policy
Following the new reweighting policy, ONS has reviewed the latest population estimates made available during 2019 and has decided not to carry out a 2019 LFS and APS reweighting exercise. Therefore, the next reweighting exercise will take place in 2020. This will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published towards the end of 2020/early 2021.
LFS Documentation
The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned. However, volumes are updated periodically by ONS, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.
Additional data derived from the QLFS
The Archive also holds further QLFS series: End User Licence (EUL) quarterly data; Secure Access datasets; household datasets; quarterly, annual and ad hoc module datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.
Variables DISEA and LNGLST
Dataset A08 (Labour market status of disabled people) which ONS suspended due to an apparent discontinuity between April to June 2017 and July to September 2017 is now available. As a result of this apparent discontinuity and the inconclusive investigations at this stage, comparisons should be made with caution between April to June 2017 and subsequent time periods. However users should note that the estimates are not seasonally adjusted, so some of the change between quarters could be due to seasonality. Further recommendations on historical comparisons of the estimates will be given in November 2018 when ONS are due to publish estimates for July to September 2018.
An article explaining the quality assurance investigations that have been conducted so far is available on the ONS Methodology webpage. For any queries about Dataset A08 please email Labour.Market@ons.gov.uk.
Occupation data for 2021 and 2022 data files
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022 (https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/articles/revisionofmiscodedoccupationaldataintheonslabourforcesurveyuk/january2021toseptember2022).
2022 Weighting
The population totals used for the latest LFS estimates use projected growth rates from Real Time Information (RTI) data for UK, EU and non-EU populations based on 2021 patterns. The total population used for the LFS therefore does not take into account any changes in migration, birth rates, death rates, and so on since June 2021, and hence levels estimates may be under- or over-estimating the true values and should be used with caution. Estimates of rates will, however, be robust.
Latest edition information
For the second edition (February 2025), the data file was resupplied with the 2024 weighting variable included (LGWT24).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Card for StereoSet
Dataset Summary
StereoSet is a dataset that measures stereotype bias in language models. StereoSet consists of 17,000 sentences that measure model preferences across gender, race, religion, and profession.
Supported Tasks and Leaderboards
multiple-choice question answering
Languages
English (en)
Dataset Structure
Data Instances
{'bias_type': 'race', 'context': 'Many people live in… See the full description on the dataset page: https://huggingface.co/datasets/McGill-NLP/stereoset.
Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.
The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.
The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6.
The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22.
The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada. To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.
COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken out by age group. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the daily COVID-19 update.
Data are reported daily, with timestamps indicated in the daily briefings posted at: portal.ct.gov/coronavirus. Data are subject to future revision as reporting changes. Starting in July 2020, this dataset will be updated every weekday. Additional notes: A delay in the data pull schedule occurred on 06/23/2020. Data from 06/22/2020 was processed on 06/23/2020 at 3:30 PM. The normal data cycle resumed with the data for 06/23/2020. A network outage on 05/19/2020 resulted in a change in the data pull schedule. Data from 5/19/2020 was processed on 05/20/2020 at 12:00 PM. Data from 5/20/2020 was processed on 5/20/2020 8:30 PM. The normal data cycle resumed on 05/20/2020 with the 8:30 PM data pull. As a result of the network outage, the timestamp on the datasets on the Open Data Portal differ from the timestamp in DPH's daily PDF reports. Starting 5/10/2021, the date field will represent the date this data was updated on data.ct.gov. Previously the date the data was pulled by DPH was listed, which typically coincided with the date before the data was published on data.ct.gov. This change was made to standardize the COVID-19 data sets on data.ct.gov.
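The small-count suppression rule mentioned for the town-level data (fewer than 5 cases or positive NAAT tests over the past 7 days) could be applied along the lines of the sketch below. This is an illustration only: the field names are hypothetical, and masking each count independently is an assumed reading of the rule, not DPH's documented procedure.

```python
SUPPRESSION_THRESHOLD = 5  # "fewer than 5", as stated in the description


def suppress_small_counts(rows, fields, threshold=SUPPRESSION_THRESHOLD):
    """Return copies of rows with any listed count below threshold set to None."""
    masked_rows = []
    for row in rows:
        masked = dict(row)
        for field in fields:
            value = masked[field]
            if value is not None and value < threshold:
                masked[field] = None  # suppressed to protect confidentiality
        masked_rows.append(masked)
    return masked_rows


# Hypothetical town-level rows; field names are illustrative only.
towns = [
    {"town": "A", "cases_7d": 3, "positive_naat_7d": 7},
    {"town": "B", "cases_7d": 12, "positive_naat_7d": 4},
    {"town": "C", "cases_7d": 25, "positive_naat_7d": 30},
]

published = suppress_small_counts(towns, fields=("cases_7d", "positive_naat_7d"))
```

Anyone analysing the published table should therefore treat missing town-level values as "between 0 and 4" rather than as zero.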
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset shows how many people visited attractions in Lincolnshire by calendar year. Visitor numbers for a wide range of attractions are shown, along with other key information such as entrance fees.
The data's source is the Annual Survey of Visits to Visitor Attractions, run by Visit England. (As usual with survey data there are some limitations, such as not all visitor attractions participating in the survey, and where visitor numbers are estimated that is indicated in the data).
This dataset is updated annually from statistics published by Visit Britain, see the Source link for more information.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books, has 2 rows and is filtered where the book is Drowning by accident : why so many people drown. It features 7 columns including book, author, publication date, language, and book publisher. The preview is ordered by publication date (descending).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The number of employed persons in The United States decreased to 163273 Thousand in May of 2025 from 163969 Thousand in April of 2025. This dataset provides - United States Employed Persons - actual values, historical data, forecast, chart, statistics, economic calendar and news.
The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.
National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020. Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022. The data products in this publication that represent where people live reflect 2021 estimates of housing unit and population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.
The specific raster datasets included in this publication include:
Building Count: Building Count is a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
Building Density: Building Density is a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
Building Coverage: Building Coverage is a 30-m raster depicting the percentage of habitable land area covered by building footprints.
Population Count (PopCount): PopCount is a 30-m raster with pixel values representing residential population count (persons) in each pixel.
Population Density (PopDen): PopDen is a 30-m raster of residential population density (people/km²).
Housing Unit Count (HUCount): HUCount is a 30-m raster representing the number of housing units in each pixel.
Housing Unit Density (HUDen): HUDen is a 30-m raster of housing-unit density (housing units/km²).
Housing Unit Exposure (HUExposure): HUExposure is a 30-m raster that represents the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and not intended to represent the actual number of housing units exposed in any specific year.
Housing Unit Impact (HUImpact): HUImpact is a 30-m raster that represents the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity and uses flame length probabilities from wildfire modeling to capture likely intensity of fire.
Housing Unit Risk (HURisk): HURisk is a 30-m raster that integrates all four primary elements of wildfire risk - likelihood, intensity, susceptibility, and exposure - on pixels where housing unit density is greater than zero.
Additional methodology documentation is provided with the data publication download. Metadata and Downloads.
Note: Pixel values in this image service have been altered from the original raster dataset due to data requirements in web services. The service is intended primarily for data visualization. Relative values and spatial patterns have been largely preserved in the service, but users are encouraged to download the source data for quantitative analysis.
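To illustrate two of the steps described in this publication, the sketch below upsamples a coarse 270 m burn probability grid to 30 m cells and then forms an expected-exposure grid. Both parts rest on assumptions: the description does not state the resampling method (nearest-neighbour is used here purely for simplicity, and the spreading of values into non-burnable developed areas is not modelled), and treating exposure as housing-unit count times annual burn probability is an assumed reading of "expected number of housing units ... exposed to wildfire in a year". All grid values are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling: repeat every cell factor x factor times."""
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


def expected_exposure(hu_count, burn_prob):
    """Cell-by-cell product of housing-unit count and annual burn probability."""
    return [
        [units * prob for units, prob in zip(unit_row, prob_row)]
        for unit_row, prob_row in zip(hu_count, burn_prob)
    ]


FACTOR = 270 // 30  # nine 30 m cells per 270 m cell along each axis

# Hypothetical annual burn probabilities for two adjacent 270 m cells.
coarse_bp = [[0.01, 0.02]]
fine_bp = upsample_nearest(coarse_bp, FACTOR)  # 9 rows x 18 columns

# Hypothetical housing-unit counts: two units in every 30 m cell.
hu_count = [[2 for _ in fine_row] for fine_row in fine_bp]

exposure = expected_exposure(hu_count, fine_bp)
total_exposed = sum(sum(row) for row in exposure)
```

Because the exposure values are long-term annual averages, summing them over an area gives an expected count per year, not a prediction for any particular fire season.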
https://creativecommons.org/publicdomain/zero/1.0/
By Jonathan Ortiz [source]
This College Completion dataset provides insight into the success and progress of college students in the United States. It contains graduation rates, race and other data to offer a comprehensive view of college completion in America. The data is drawn from two primary sources: the National Center for Education Statistics’ (NCES) Integrated Postsecondary Education Data System (IPEDS) and the Voluntary System of Accountability’s Student Success and Progress rate.
The graduation figures come from IPEDS for first-time, full-time degree-seeking students at the undergraduate level who entered college six years earlier at four-year institutions or three years earlier at two-year institutions. Furthermore, colleges report how many students completed their program within 100 percent and 150 percent of normal time, which at four-year institutions corresponds to graduation within four years or six years, respectively. Students reported as being of two or more races are included in totals but not shown separately.
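The 100 percent and 150 percent completion measures described above can be expressed as simple cohort percentages. The function, its argument names, and the sample numbers below are hypothetical; the sketch assumes the 150 percent count is cumulative, so it includes everyone who finished within 100 percent of normal time.

```python
def graduation_rates(cohort_size, completed_100, completed_150):
    """Percent of the entering cohort finishing within 100% and 150% of normal time."""
    if not 0 <= completed_100 <= completed_150 <= cohort_size:
        raise ValueError("expected 0 <= completed_100 <= completed_150 <= cohort_size")
    if cohort_size == 0:
        raise ValueError("cohort_size must be positive")
    rate_100 = 100 * completed_100 / cohort_size
    rate_150 = 100 * completed_150 / cohort_size
    return rate_100, rate_150


# Hypothetical four-year institution: 1,000 first-time, full-time entrants,
# 450 graduating within four years and 620 within six years.
rates = graduation_rates(cohort_size=1000, completed_100=450, completed_150=620)
```

For a two-year institution the same calculation applies, with the cohort defined as students who entered three years earlier.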
When analyzing race and ethnicity data, note that since 2009 NCES has classified student demographics into seven categories: White non-Hispanic; Black non-Hispanic; American Indian/Alaskan Native; Asian/Pacific Islander; unknown race or ethnicity; and nonresident, with two new categories: Native Hawaiian or Other Pacific Islander (combined with Asian) and students belonging to several races. Also worth noting is that different classifications for graduate data stemming from 2008 could be due to variations in the time frame examined and the groupings used by particular colleges; those who can’t be identified from National Student Clearinghouse records won’t be subjected to penalty by these locations.
When it comes to efficiency measures, parameters like “Awards per 100 Full-Time Undergraduate Students”, which includes all undergraduate completions reported by a particular institution, including associate degrees and certificates from programs of less than four years, will assist us here. We also take into consideration measures like expenditure categories, Pell grant percentage, endowment values, average student aid amounts and full-time faculty members contributing towards instruction, research and public service initiatives.
To put outcomes in context, the median estimated SAT score is derived from the 25th and 75th percentile scores and is further qualified by a 90 percent threshold applied to incoming students. Average student aid relates the amount granted by an institution to the total received in a given year.
Taken together, these measures give a holistic overview of performance, potential deficits &
This dataset contains data on student success, graduation rates, race and gender demographics, an efficiency measure to compare colleges across states and more. It is a great source of information to help you better understand college completion and student success in the United States.
In this guide we’ll explain how to use the data so that you can find out the best colleges for students with certain characteristics or focus on your target completion rate. We’ll also provide some useful tips for getting the most out of this dataset when seeking guidance on which institutions offer the highest graduation rates or have a good reputation for success in terms of completing programs within normal timeframes.
Before interpreting this dataset, note that each row represents one institution, with identifying information such as its state, level (two-year vs. four-year), control (public vs. private), name and website. The columns hold measures such as the rate of awarding degrees compared with other institutions in the same sector; race/ethnicity makeup; full-time faculty percentage; median SAT score among first-time students; and awards and grants compared with the national or state average, as applicable to the institution's location.
When using this dataset, our suggestion is that you begin by forming a hypothesis or research question concerning student completion at a given school based upon observable characteristics like financ...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Any work using this dataset should cite the following paper:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Los Angeles population over the last 20-plus years. It lists the population for each year, along with the year-over-year change in population, both in absolute numbers and in percentage terms. The dataset can be used to understand how the population of Los Angeles has changed across the last two decades. For example, we can identify whether the population is declining or increasing and, if it has changed, when it peaked or whether it is still growing and has not yet reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.
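The year-over-year change, the percentage change and the peak year mentioned above can be derived from a plain year-to-population mapping. The sketch below shows one way to do it; the function names and the sample figures are hypothetical and are not values from the dataset.

```python
def year_over_year(populations):
    """Return (year, absolute change, percent change) for each year after the first.

    populations maps year -> population estimate.
    """
    years = sorted(populations)
    rows = []
    for prev, curr in zip(years, years[1:]):
        change = populations[curr] - populations[prev]
        pct_change = round(100 * change / populations[prev], 2)
        rows.append((curr, change, pct_change))
    return rows


def peak_year(populations):
    """Return the year with the largest population estimate."""
    return max(populations, key=populations.get)


# Hypothetical estimates, chosen only to make the arithmetic easy to follow.
sample = {2019: 1000, 2020: 1010, 2021: 990}
for row in year_over_year(sample):
    print(row)
print("peak:", peak_year(sample))
```

If the most recent year is also the peak year, the series has not yet turned down; otherwise the population has declined since its peak.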
Key observations
In 2023, the population of Los Angeles was 3.82 million, a 0.05% decrease year-over-year from 2022. In 2022, the population was also 3.82 million, a decline of 0.26% from 3.83 million in 2021. Over the last 20-plus years, between 2000 and 2023, the population of Los Angeles increased by 118,340. In this period, the population peaked at 3.98 million in 2019. The numbers suggest that the population has already reached its peak and is now trending downward. Source: U.S. Census Bureau Population Estimates Program (PEP).
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Los Angeles Population by Year.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Ohio population distribution across 18 age groups. It lists the population in each age group along with that group's percentage of the total population of Ohio. The dataset can be used to understand the population distribution of Ohio by age. For example, we can identify the largest age group in Ohio.
Key observations
The largest age group in Ohio was 60 to 64 years, with a population of 801,454 (6.80%), according to the ACS 2019-2023 5-Year Estimates. The smallest age group in Ohio was 80 to 84 years, with a population of 237,360 (2.01%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
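The percentage shares quoted for each age group are the group's population divided by the state total. A minimal sketch of that calculation follows, using the two group counts quoted above. The total of 11,786,088 is an assumption back-computed from the quoted 6.80% share, not a figure stated in the dataset description, and the function names are hypothetical.

```python
def age_group_shares(counts, total):
    """Return each age group's share of the total population, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {group: round(100 * count / total, 2) for group, count in counts.items()}


def largest_group(counts):
    """Return the label of the age group with the largest population."""
    return max(counts, key=counts.get)


# Two of the 18 age groups, with the counts quoted in the key observations.
counts = {"60 to 64 years": 801_454, "80 to 84 years": 237_360}

# ASSUMPTION: state total implied by 801,454 being 6.80% of the population.
assumed_total = 11_786_088

print(age_group_shares(counts, assumed_total))
print(largest_group(counts))
```

With all 18 groups loaded, the total could instead be taken as the sum of the group counts.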
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
This dataset is part of the main dataset for Ohio Population by Age.