https://creativecommons.org/publicdomain/zero/1.0/
This dataset looks at the number of movies produced in the United States of America that fall into the "crime" genre between 1985 and 2017 and compares it to violent crime rates over the same period. The time frame was chosen based on the accessible data (The Movies Dataset ends with 2017 and the FBI's CDE tool starts at 1985).
The data for the movies and genres was pulled from "The Movies Dataset" on Kaggle, where columns were adjusted and the first two genres of each film were kept. The data was then filtered to include only films released in the United States of America from 1985 to 2017. Violent crime data and population data for the USA were then joined.
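The filter-and-join steps described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline (which used SQL, R, Tableau and spreadsheets): the field names, the sample rows, and the choice to express violent crime as a rate per 100,000 residents are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical rows standing in for the cleaned movie file; the field names
# are assumptions, not the actual column headers of the CSV.
movies = [
    {"title": "A", "year": 1985, "country": "United States of America", "genres": ["Crime", "Drama"]},
    {"title": "B", "year": 1985, "country": "United States of America", "genres": ["Comedy", "Romance"]},
    {"title": "C", "year": 1986, "country": "France", "genres": ["Crime", "Thriller"]},
    {"title": "D", "year": 1986, "country": "United States of America", "genres": ["Thriller", "Crime"]},
    {"title": "E", "year": 2018, "country": "United States of America", "genres": ["Crime"]},
]

# Made-up yearly figures, for illustration only.
violent_crimes = {1985: 1000, 1986: 1200}
population = {1985: 200_000, 1986: 240_000}


def crime_movies_per_year(rows, first_year=1985, last_year=2017):
    """Count US films per year whose first two genres include 'Crime'."""
    counts = Counter()
    for row in rows:
        if row["country"] != "United States of America":
            continue
        if not first_year <= row["year"] <= last_year:
            continue
        if "Crime" in row["genres"][:2]:
            counts[row["year"]] += 1
    return counts


def join_by_year(movie_counts, crimes, pop):
    """Join movie counts to a violent crime rate per 100,000 residents."""
    joined = []
    for year in sorted(crimes):
        rate = crimes[year] * 100_000 / pop[year]
        joined.append({
            "year": year,
            "crime_movies": movie_counts.get(year, 0),
            "violent_crime_rate": rate,
        })
    return joined


table = join_by_year(crime_movies_per_year(movies), violent_crimes, population)
for row in table:
    print(row)
```

Keying everything on the year keeps the join trivial; years with no crime-genre films simply get a count of zero.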
movies-to-crime_data_by_population_1985-2017_2023-03-06.csv: This file contains the filtered and sorted data joining together the rest of the included data.
movies_data_cleaned_V2.csv: This is a large movie dataset that was pulled from the aforementioned "The Movies Dataset" and adjusted for usability in this project.
population_data_1985-2017.csv: This data was pulled from the World Bank, Population, Total for United States [POPTOTUSA647NWDB], retrieved from FRED, Federal Reserve Bank of St. Louis.
violent_crime_rates_USA_1985-2017_2024-03-06.csv: This data was pulled from the Federal Bureau of Investigation's "Crime Data Explorer" tool. Data pulled includes all violent crime 1985-2017. More information concerning how violent crimes are categorized can be found on the Crime Data Explorer's website linked above.
All data was sourced via publicly available datasets and linked to above. Special thanks to Kaggle user Rounak Banik for their work creating "The Movies Dataset" which was incredibly helpful.
This project was a side project to gain further practice with tools such as SQL, R, Tableau and spreadsheets. It began with a focus on authors of crime novels vs. the number of actual criminals. The project soon morphed into its current form after a struggle to find usable datasets.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Introduction: The dataset used for this experiment is real and authentic. It was acquired from the UCI Machine Learning Repository website [13]. The title of the dataset is 'Communities and Crime Unnormalized'. It combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and crime data from the 1995 FBI UCR [13]. This dataset contains a total of 147 attributes and 2216 instances.
The per capita crimes variables were calculated using population values included in the 1995 FBI data (which differ from the 1990 Census values).
The variables included in the dataset describe the community, such as the percent of the population considered urban and the median family income, and law enforcement, such as the per capita number of police officers and the percent of officers assigned to drug units. The crime attributes (N=18) that could be predicted are the 8 crimes considered 'Index Crimes' by the FBI (Murders, Rape, Robbery, ...), per capita (actually per 100,000 population) versions of each, and Per Capita Violent Crimes and Per Capita Nonviolent Crimes.
predictive variables: 125
non-predictive variables: 4
potential goal/response variables: 18
http://archive.ics.uci.edu/ml/datasets/Communities%20and%20Crime%20Unnormalized
U. S. Department of Commerce, Bureau of the Census, Census Of Population And Housing 1990 United States: Summary Tape File 1a & 3a (Computer Files),
U.S. Department Of Commerce, Bureau Of The Census Producer, Washington, DC and Inter-university Consortium for Political and Social Research Ann Arbor, Michigan. (1992)
U.S. Department of Justice, Bureau of Justice Statistics, Law Enforcement Management And Administrative Statistics (Computer File) U.S. Department Of Commerce, Bureau Of The Census Producer, Washington, DC and Inter-university Consortium for Political and Social Research Ann Arbor, Michigan. (1992)
U.S. Department of Justice, Federal Bureau of Investigation, Crime in the United States (Computer File) (1995)
Data available in the dataset may not act as a complete source of information for identifying factors that contribute to more violent and non-violent crimes as many relevant factors may still be missing.
However, I would like to try to answer the following questions.
Did the number of vacant and occupied houses, and the length of time houses were vacant, contribute to any significant change in violent and non-violent crime rates in communities?
How has unemployment changed the crime rate (violent and non-violent) in the communities?
Were people from a particular age group more vulnerable to crime?
Does ethnicity play a role in crime rate?
Has education played a role in bringing down the crime rate?
These data examine the effects on total crime rates of changes in the demographic composition of the population and changes in criminality of specific age and race groups. The collection contains estimates from national data of annual age-by-race specific arrest rates and crime rates for murder, robbery, and burglary over the 21-year period 1965-1985. The data address the following questions:
(1) Are the crime rates reported by the Uniform Crime Reports (UCR) data series valid indicators of national crime trends?
(2) How much of the change between 1965 and 1985 in total crime rates for murder, robbery, and burglary is attributable to changes in the age and race composition of the population, and how much is accounted for by changes in crime rates within age-by-race specific subgroups?
(3) What are the effects of age and race on subgroup crime rates for murder, robbery, and burglary?
(4) What is the effect of time period on subgroup crime rates for murder, robbery, and burglary?
(5) What is the effect of birth cohort, particularly the effect of the very large (baby-boom) cohorts following World War II, on subgroup crime rates for murder, robbery, and burglary?
(6) What is the effect of interactions among age, race, time period, and cohort on subgroup crime rates for murder, robbery, and burglary?
(7) How do patterns of age-by-race specific crime rates for murder, robbery, and burglary compare for different demographic subgroups?
The variables in this study fall into four categories. The first category includes variables that define the race-age cohort of the unit of observation. The values of these variables are directly available from UCR and include year of observation (from 1965-1985), age group, and race. The second category of variables were computed using UCR data pertaining to the first category of variables. These are period, birth cohort of age group in each year, and average cohort size for each single age within each single group.
The third category includes variables that describe the annual age-by-race specific arrest rates for the different crime types. These variables were estimated for race, age group, crime type, and year using data directly available from UCR and population estimates from Census publications. The fourth category includes variables similar to the third group. Data for estimating these variables were derived from available UCR data on the total number of offenses known to the police and total arrests in combination with the age-by-race specific arrest rates for the different crime types.
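The question of how much of a change in the total crime rate comes from population composition and how much from subgroup rates can be illustrated with a standard two-part (Kitagawa-style) decomposition. This is only a sketch of one common technique; the description above does not say which method the study itself used, and the subgroup names and numbers below are invented.

```python
def total_rate(shares, rates):
    """Total rate as the population-share-weighted sum of subgroup rates."""
    return sum(shares[g] * rates[g] for g in shares)


def decompose_rate_change(shares_0, rates_0, shares_1, rates_1):
    """Split the change in the total rate into two additive parts.

    composition: effect of changing subgroup population shares,
                 weighted by each subgroup's average rate.
    within:      effect of changing subgroup rates,
                 weighted by each subgroup's average share.
    The two parts sum to the change in the total rate (up to rounding).
    """
    composition = sum(
        (shares_1[g] - shares_0[g]) * (rates_0[g] + rates_1[g]) / 2
        for g in shares_0
    )
    within = sum(
        (rates_1[g] - rates_0[g]) * (shares_0[g] + shares_1[g]) / 2
        for g in shares_0
    )
    return composition, within


# Hypothetical subgroups: population shares and rates per 100,000.
shares_1965 = {"younger": 0.2, "older": 0.8}
rates_1965 = {"younger": 50.0, "older": 10.0}
shares_1985 = {"younger": 0.3, "older": 0.7}
rates_1985 = {"younger": 60.0, "older": 10.0}

composition, within = decompose_rate_change(
    shares_1965, rates_1965, shares_1985, rates_1985
)
change = total_rate(shares_1985, rates_1985) - total_rate(shares_1965, rates_1965)
print(f"total change: {change:.2f}")
print(f"due to composition: {composition:.2f}, due to subgroup rates: {within:.2f}")
```

In this toy case most of the increase comes from the high-rate subgroup growing as a share of the population, which is the kind of baby-boom effect question (5) asks about.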
https://creativecommons.org/publicdomain/zero/1.0/
Measuring homicides across the world helps us understand violent crime and how people are affected by interpersonal violence.
But measuring homicides is challenging. Even homicide researchers do not always agree on whether the specific cause of death should be considered a homicide. Even when they agree on what counts as a homicide, it is difficult to count all of them.
In many countries, national civil registries do not certify most deaths or their cause. Registries often lack funds and personnel, and a body has to be found to establish that a death has happened. Authorities may also struggle to distinguish a homicide from a similar cause of death, such as an accident.
Law enforcement and criminal justice agencies collect more data on whether a death was unlawful — but their definition of unlawfulness may differ across countries and time.
Estimating homicides where neither of these sources is available or good enough is difficult. Estimates rely on inferences from similar countries and contextual factors that are based on strong assumptions. So how do researchers address these challenges and measure homicides?
In our work on homicides, we provide data from five main sources:
The WHO Mortality Database (WHO-MD)
The Global Study on Homicide by the UN Office on Drugs and Crime (UNODC)
The History of Homicide Database by Manuel Eisner (2003 and 2014)
The Global Burden of Disease (GBD) study by the Institute for Health Metrics and Evaluation (IHME)
The WHO Global Health Estimates (WHO-GHE)
These sources all report homicides, cover many countries and years, and are frequently used by researchers and policymakers. They are not entirely separate, as they partially build upon each other.
There has been little research on United States homicide rates from a long-term perspective, primarily because there has been no consistent data series on a particular place preceding the Uniform Crime Reports (UCR), which began its first full year in 1931. To fill this research gap, this project created a data series on homicides per capita for New York City that spans two centuries. The goal was to create a site-specific, individual-based data series that could be used to examine major social shifts related to homicide, such as mass immigration, urban growth, war, demographic changes, and changes in laws. Data were also gathered on various other sites, particularly in England, to allow for comparisons on important issues, such as the post-World War II wave of violence. The basic approach to the data collection was to obtain the best possible estimate of annual counts and the most complete information on individual homicides. The annual count data (Parts 1 and 3) were derived from multiple sources, including the Federal Bureau of Investigation's Uniform Crime Reports and Supplementary Homicide Reports, as well as other official counts from the New York City Police Department and the City Inspector in the early 19th century. The data include a combined count of murder and manslaughter because charge bargaining often blurs this legal distinction. The individual-level data (Part 2) were drawn from coroners' indictments held by the New York City Municipal Archives, and from daily newspapers. Duplication was avoided by keeping a record for each victim. The estimation technique known as "capture-recapture" was used to estimate homicides not listed in either source. Part 1 variables include counts of New York City homicides, arrests, and convictions, as well as the homicide rate, race or ethnicity and gender of victims, type of weapon used, and source of data.
Part 2 includes the date of the murder, the age, sex, and race of the offender and victim, and whether the case led to an arrest, trial, conviction, execution, or pardon. Part 3 contains annual homicide counts and rates for various comparison sites including Liverpool, London, Kent, Canada, Baltimore, Los Angeles, Seattle, and San Francisco.
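The idea behind the "capture-recapture" estimate mentioned above can be illustrated with the simplest two-source form, the Lincoln-Petersen estimator. This is a sketch only: the description does not state which capture-recapture variant the project used, and the counts below are invented.

```python
def lincoln_petersen(n_source_a, n_source_b, n_both):
    """Estimate the total number of cases from two overlapping lists.

    n_source_a: cases found in the first source (e.g. archival records)
    n_source_b: cases found in the second source (e.g. newspapers)
    n_both:     cases found in both sources
    Assumes the two sources record cases independently of each other.
    """
    if n_both <= 0:
        raise ValueError("the sources must overlap to estimate a total")
    return n_source_a * n_source_b / n_both


def estimated_missed(n_source_a, n_source_b, n_both):
    """Cases estimated to appear in neither source."""
    observed = n_source_a + n_source_b - n_both  # unique cases actually seen
    return lincoln_petersen(n_source_a, n_source_b, n_both) - observed


# Hypothetical counts for a single year.
total = lincoln_petersen(80, 60, 48)
missed = estimated_missed(80, 60, 48)
print(f"estimated total: {total:.0f}, estimated missed by both sources: {missed:.0f}")
```

The smaller the overlap between the two lists relative to their sizes, the larger the estimated number of cases that neither list captured.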
http://opendatacommons.org/licenses/dbcl/1.0/
Violent Crime Rates by US State
This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.
A data frame with 50 observations on 4 variables.
Murder: numeric, murder arrests (per 100,000)
Assault: numeric, assault arrests (per 100,000)
UrbanPop: numeric, percent of the population living in urban areas
Rape: numeric, rape arrests (per 100,000)
World Almanac and Book of Facts 1975. (Crime rates).
Statistical Abstracts of the United States 1975. (Urban rates).
McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.
THIS DATASET WAS LAST UPDATED AT 7:11 AM EASTERN ON DEC. 1
2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.
In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.
A total of 229 people died in mass killings in 2019.
The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.
One-third of the offenders died at the scene of the killing or soon after, half from suicides.
The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.
The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.
This data will be updated periodically and can be used as an ongoing resource to help cover these events.
To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:
To get these counts just for your state:
Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.
This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”
Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
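The inclusion criteria above can be read as a simple filter. The sketch below is illustrative only: the `Incident` record and its field names are hypothetical and do not reflect the database's actual schema, and the seven-day rule for spree victims is not modelled.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    year: int
    victims_killed: int       # excludes the offender(s) and unborn children
    hours_spanned: float      # time between first and last killing
    intentional: bool         # False for e.g. DUI deaths or accidental fires
    in_50_states_or_dc: bool


def is_mass_killing(incident):
    """Apply the definition: 4+ intentional killings within 24 hours in the US."""
    return (
        incident.intentional
        and incident.in_50_states_or_dc
        and incident.victims_killed >= 4
        and incident.hours_spanned <= 24
    )


def count_by_year(incidents):
    """Number of qualifying incidents per year."""
    return Counter(i.year for i in incidents if is_mass_killing(i))


# Made-up examples, not real cases.
incidents = [
    Incident(2019, 4, 1.0, True, True),     # qualifies
    Incident(2019, 3, 0.5, True, True),     # too few victims
    Incident(2019, 5, 2.0, False, True),    # no offender intent
    Incident(2018, 6, 30.0, True, True),    # spread over more than 24 hours
    Incident(2018, 4, 12.0, True, True),    # qualifies
]
print(count_by_year(incidents))
```

Note that weapon, motive, location type, and victim-offender relationship do not appear in the filter, matching the statement that the definition excludes no cases on those grounds.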
Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.
Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.
In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.
Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.
Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.
This project started at USA TODAY in 2012.
Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Intentional Homicides: Female: per 100,000 Female data was reported at 2.261 Ratio in 2016. This records an increase from the previous number of 2.062 Ratio for 2015. United States US: Intentional Homicides: Female: per 100,000 Female data is updated yearly, averaging 2.337 Ratio from Dec 2000 (Median) to 2016, with 17 observations. The data reached an all-time high of 3.086 Ratio in 2001 and a record low of 1.983 Ratio in 2014. United States US: Intentional Homicides: Female: per 100,000 Female data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s USA – Table US.World Bank: Health Statistics. Intentional homicides, female are estimates of unlawful female homicides purposely inflicted as a result of domestic disputes, interpersonal violence, violent conflicts over land resources, intergang violence over turf or control, and predatory violence and killing by armed groups. Intentional homicide does not include all intentional killing; the difference is usually in the organization of the killing. Individuals or small groups usually commit homicide, whereas killing in armed conflict is usually committed by fairly cohesive groups of up to several hundred members and is thus usually excluded. Source: UN Office on Drugs and Crime's International Homicide Statistics database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing Central America murder/homicide rate per 100K population by year.
Number and rate (per 100,000 population) of homicide victims, Canada and Census Metropolitan Areas, 1981 to 2024.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing North America murder/homicide rate per 100K population by year from 2010 to 2021.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing Virgin Islands (U.S.) murder/homicide rate per 100K population by year from 1997 to 2012.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing Latin America & Caribbean murder/homicide rate per 100K population by year from 2010 to 2021.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Intentional Homicides: Male: per 100,000 Male data was reported at 8.512 Ratio in 2016. This records an increase from the previous number of 7.929 Ratio for 2015. United States US: Intentional Homicides: Male: per 100,000 Male data is updated yearly, averaging 8.551 Ratio from Dec 2000 (Median) to 2016, with 17 observations. The data reached an all-time high of 10.383 Ratio in 2001 and a record low of 6.988 Ratio in 2014. United States US: Intentional Homicides: Male: per 100,000 Male data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s USA – Table US.World Bank: Health Statistics. Intentional homicides, male are estimates of unlawful male homicides purposely inflicted as a result of domestic disputes, interpersonal violence, violent conflicts over land resources, intergang violence over turf or control, and predatory violence and killing by armed groups. Intentional homicide does not include all intentional killing; the difference is usually in the organization of the killing. Individuals or small groups usually commit homicide, whereas killing in armed conflict is usually committed by fairly cohesive groups of up to several hundred members and is thus usually excluded. Source: UN Office on Drugs and Crime's International Homicide Statistics database.
https://choosealicense.com/licenses/cc0-1.0/
Surveillance videos are able to capture a variety of realistic anomalies. In this paper, we propose to learn anomalies by exploiting both normal and anomalous videos. To avoid annotating the anomalous segments or clips in training videos, which is very time consuming, we propose to learn anomaly through the deep multiple instance ranking framework by leveraging weakly labeled training videos, i.e. the training labels (anomalous or normal) are at video-level instead of clip-level. In our approach, we consider normal and anomalous videos as bags and video segments as instances in multiple instance learning (MIL), and automatically learn a deep anomaly ranking model that predicts high anomaly scores for anomalous video segments. Furthermore, we introduce sparsity and temporal smoothness constraints in the ranking loss function to better localize anomaly during training. We also introduce a new large-scale first of its kind dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies such as fighting, road accident, burglary, robbery, etc. as well as normal activities. This dataset can be used for two tasks. First, general anomaly detection considering all anomalies in one group and all normal activities in another group. Second, for recognizing each of 13 anomalous activities. Our experimental results show that our MIL method for anomaly detection achieves significant improvement on anomaly detection performance as compared to the state-of-the-art approaches. We provide the results of several recent deep learning baselines on anomalous activity recognition. The low recognition performance of these baselines reveals that our dataset is very challenging and opens more opportunities for future work.
One critical task in video surveillance is detecting anomalous events such as traffic accidents, crimes or illegal activities. Generally, anomalous events rarely occur as compared to normal activities. Therefore, to alleviate the waste of labor and time, developing intelligent computer vision algorithms for automatic video anomaly detection is a pressing need. The goal of a practical anomaly detection system is to signal in a timely manner an activity that deviates from normal patterns and to identify the time window of the occurring anomaly. Therefore, anomaly detection can be considered as coarse-level video understanding, which filters out anomalies from normal patterns. Once an anomaly is detected, it can further be categorized into one of the specific activities using classification techniques. In this work, we propose an anomaly detection algorithm using weakly labeled training videos. That is, we only know the video-level labels, i.e. a video is normal or contains an anomaly somewhere, but we do not know where. This is intriguing because we can easily annotate a large number of videos by only assigning video-level labels. To formulate a weakly-supervised learning approach, we resort to multiple instance learning. Specifically, we propose to learn anomaly through a deep MIL framework by treating normal and anomalous surveillance videos as bags and short segments/clips of each video as instances in a bag. Based on training videos, we automatically learn an anomaly ranking model that predicts high anomaly scores for anomalous segments in a video. During testing, a long untrimmed video is divided into segments and fed into our deep network, which assigns an anomaly score to each video segment such that an anomaly can be detected.
Our proposed approach (summarized in Figure 1) begins with dividing surveillance videos into a fixed number of segments during training. These segments make instances in a bag. Using both positive (anomalous) and negative (normal) bags, we train the anomaly detection model using the proposed deep MIL ranking loss. https://www.crcv.ucf.edu/projects/real-world/method.png
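A rough sketch of a MIL ranking loss with sparsity and temporal smoothness terms, operating on per-segment anomaly scores for one anomalous bag and one normal bag, is given below. It is an illustration of the general idea rather than the authors' implementation: the hinge margin of 1 and the two weights are placeholder assumptions, and in practice the scores would come from a trained network rather than hand-written lists.

```python
def mil_ranking_loss(anomalous_scores, normal_scores,
                     smoothness_weight=8e-5, sparsity_weight=8e-5):
    """Ranking loss over two bags of segment scores in [0, 1].

    anomalous_scores: scores for the segments of a video labeled anomalous
    normal_scores:    scores for the segments of a video labeled normal
    The weights are illustrative placeholders, not tuned values.
    """
    # Hinge ranking term: the highest-scoring segment of the anomalous bag
    # should outscore the highest-scoring segment of the normal bag.
    ranking = max(0.0, 1.0 - max(anomalous_scores) + max(normal_scores))

    # Temporal smoothness: adjacent segments should have similar scores.
    smoothness = sum(
        (a - b) ** 2 for a, b in zip(anomalous_scores, anomalous_scores[1:])
    )

    # Sparsity: only a few segments of an anomalous video should score high.
    sparsity = sum(anomalous_scores)

    return ranking + smoothness_weight * smoothness + sparsity_weight * sparsity


# Hypothetical scores for a 3-segment split of each video.
anomalous = [0.1, 0.9, 0.2]
normal = [0.2, 0.3, 0.1]
print(mil_ranking_loss(anomalous, normal))
```

Because only the maximum score in each bag enters the ranking term, the loss needs no segment-level labels, which is what makes video-level (weak) labels sufficient for training.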
We construct a new large-scale dataset, called UCF-Crime, to evaluate our method. It consists of long untrimmed surveillance videos which cover 13 real-world anomalies, including Abuse, Arrest, Arson, Assault, Road Accident, Burglary, Explosion, Fighting, Robbery, Shooting, Stealing, Shoplifting, and Vandalism. These anomalies are selected because they have a significant impact on public safety. We compare our dataset with previous anomaly detection datasets in Table 1. For more details about the UCF-Crime dataset, please refer to our paper. A short description of each anomalous event is given below.
Abuse: This event contains videos which show bad, cruel or violent behavior against children, old people, animals, and women.
Burglary: This event contains videos that show people (thieves) entering into a building or house with the intention to commit theft. It does not include use of force against people.
Robbery: This event contains videos showing thieves taking money unlawfully by force or threat of force. These videos do not include shootings.
Stealing: This event contains videos showing people taking property or money without permission. They do not include shoplifting.
Shooting: This event contains videos showing the act of shooting someone with a gun.
Shoplifting: This event contains videos showing people stealing goods from a shop while posing as a shopper.
Assault: This event contains videos showing a sudden or violent physical attack on someone. Note that in these videos the person who is assaulted does not fight back.
Fighting: This event contains videos displaying two or more people attacking one another.
Arson: This event contains videos showing people deliberately setting fire to property.
Explosion: This event contains videos showing the destructive event of something blowing apart. This event does not include videos where a person intentionally sets a fire or sets off an explosion.
Arrest: This event contains videos showing police arresting individuals.
Road Accident: This event contains videos showing traffic accidents involving vehicles, pedestrians or cyclists.
Vandalism: This event contains videos showing action involving deliberate destruction of or damage to public or private property. The term includes property damage, such as graffiti and defacement directed towards any property without permission of the owner.
Normal Event: This event contains videos where no crime occurred. These videos include both indoor (such as a shopping mall) and outdoor scenes as well as day and night-time scenes.
https://www.crcv.ucf.edu/projects/real-world/dataset_table.png
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Citizen security remains one of the most pressing development challenges facing low- and middle-income countries around the globe. In Latin America and the Caribbean (LAC), for example—a region where homicide rates have consistently remained more than three times the global average (UNODC, 2023; InSight Crime, 2024)—crime and violence are estimated to cost approximately 3.4% of the region’s GDP (IDB, 2024), whereas research suggests that bringing the crime level in Latin America down to the world average would boost LAC’s average annual growth rate by 0.5 percentage points (IMF, 2023).
This Evidence Gap Map (EGM) focuses on citizen security interventions led or supported by police agencies, and maps existing impact evaluations and systematic reviews that examine the effectiveness of a wide range of policing interventions aimed at reducing crime, violence, disorder, overall insecurity, and their associated risk factors and underlying determinants. While some previous evidence maps have focused narrowly on specific strategies or contexts, this EGM takes a broader, more inclusive approach, covering a wide range of intervention types, settings, and outcome areas.
The EGM is available at: https://developmentevidence.3ieimpact.org/egm/IDB-policing-egm
By synthesizing rigorous evidence on what works, what doesn’t, and where gaps remain, the EGM aims to inform policymakers, practitioners, and researchers on the most effective uses of policing resources. It is intended to support evidence-informed decision-making, promote more effective allocation of public security investments, and ultimately contribute to safer communities and stronger economies across the LAC region and beyond.
This EGM was developed collaboratively by multiple teams within the Inter-American Development Bank (IDB), and was peer-reviewed and visualized by the International Initiative for Impact Evaluation (3ie) in 2025 to ensure methodological rigor and transparency.
Project collaborators include Camilo Acosta, Sergio Britto Lima, Minji Kang, Indira Porto, Rodrigo Serrano Berthet, and Harold Villalba (listed alphabetically).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset showing U.S. murder/homicide rate per 100K population by year from 1990 to 2021.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
By Rajanand Ilangovan [source]
This dataset provides an up-to-date analysis of crime trends in India from 2001 to the present. It contains information about different types of crimes, such as rape, murder, and theft, that were committed across India. By analyzing this dataset we can determine the areas where crimes were most prevalent, what types of offenders were usually involved, and which year had the highest number of registered cases. We can also analyse which group experienced the most complaints and what kinds of punishments or consequences they faced, such as departmental enquiries, magisterial enquiries, or completed trials of police personnel. This dataset is well suited to further research into crime trends in India and will help us better understand why certain types of crimes take place more frequently than others.
• Area Name (state or UT) where the crime was reported.
• Year in which the crime was reported.
• Subgroup (type of crime).
• Number of cases registered, number of cases reported for departmental action, etc., related to a particular type of crime and state/UT.
• Number of complaints/cases declared false/unsubstantiated, number of police personnel convictions, etc., related to a particular type of crime and state/UT.
• Number of cases in which offenders were other persons known to the victims, neighbours or relatives of the victims, etc., related to a particular type of crime and state/UT.
By studying this dataset one might explore different angles by analysing factors like:
• Which states have the highest rates of criminal activity? Which areas are relatively safer?
• Are any states witnessing higher incidence than the national average? Alternatively, are there any regions that have recorded lower rates than the national average?
• What is the trend across crime sub-groups in India, both by region and over time? How has it changed over the period 2001-20?
• Movement among crimes on a monthly basis during the period 2001-2020.
• Comparison of crime rates among the ages, genders and professions involved.
• Timeline comparison between types of crime, crimes involving police personnel, and contractors in crimes.
• Immigration report.
• Is the absolute difference between urban and rural areas up from previous years?
• Open conversations about which government efforts need more focus, and why.
• Fundamentals behind closed doors that reduce or increase the crime rate.
• Key insights about self-defence degrees given out each year: whether the number is decreasing or increasing and, if increasing, what extra activity took place after a law was enacted compared with before its enactment, if possible.
• Outlier analysis of murders committed by paedophiles, or of sexual assault against women from minorities, if such data exists.
- Analyzing crime trends over time using the Year, Sub_group and Area_Name columns to understand different types of crimes and patterns of criminal activity in India.
- Evaluating the effectiveness of police response to different types of crimes, such as comparing the CPA_-_Cases_Registered, CPA_-_Cases_Reported_for_Dept._Action and CPB_-_Police_PersonnelAcquitted data fields across different time periods, sub-groups and areas to assess how well law enforcement is responding to crimes reported.
- Tracking changes in punishment awarded for different crimes by analyzing the CPC_-_Police_-Personnel_-Major-Punishment_-awarded data field for changes over ti...
This is the full dataset that allows replication of the study.
This dataset contains some data to try to answer the question, "What's the best city to live and work in as a Data Scientist?" I include data from the U.S. News & World Report Best Places to Live and Best States Rankings; city scores from Nomad List; rent indices from Zillow; and the number of job openings on Indeed.com. All data is publicly available online and was compiled manually by me.
The U.S. News Best Places and Best States Rankings are updated annually. They were last updated in Dec 2019, so I assume the next update will come in Dec 2020.
For data points from the U.S. News Best Places, drill down into the page for each metro area. I have to manually collect these data points, and not all of them are fully populated. If there's interest, I'll upload a new version with more data filled in.
Nomad List publishes scores for each city that update in real time every 10 minutes. These scores are affected by the current weather in each city, so they vary quite a bit seasonally as well as during the day. Learn more here. I've sampled the scores at different times of year and different times of day. Timestamps are in ISO 8601 format. Nomad Scores are not available for all metros in the dataset.
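Because the scores shift with season and time of day, one way to compare samples is to parse the ISO 8601 timestamps and average the scores per month. The sketch below assumes a simple list of (timestamp, score) pairs; that layout, the function name mean_score_by_month and the sample values are all hypothetical and not taken from the dataset.

```python
from datetime import datetime
from statistics import mean

# Hypothetical samples for one city: (ISO 8601 timestamp, score).
# The values are invented for illustration only.
samples = [
    ("2020-01-15T08:30:00+00:00", 3.0),
    ("2020-01-15T20:30:00+00:00", 3.5),
    ("2020-07-15T08:30:00+00:00", 4.0),
    ("2020-07-15T20:30:00+00:00", 4.5),
]


def mean_score_by_month(samples):
    """Average the scores of all samples that fall in the same calendar month."""
    by_month = {}
    for stamp, score in samples:
        # fromisoformat parses ISO 8601 strings with an explicit UTC offset.
        month = datetime.fromisoformat(stamp).month
        by_month.setdefault(month, []).append(score)
    return {month: mean(scores) for month, scores in by_month.items()}


monthly = mean_score_by_month(samples)
```

Grouping on the parsed hour instead of the month would show the variation during the day in the same way.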