^Note: While the goal for this analysis is to specifically extract numbers for racially motivated crime in London, some of the reported data groups racially and religiously motivated crime together. Where this is the case, we will refer to the crime category by its proper full classification. In addition, the groupings for hate crime vary over time and dataset.
The data for the following year Hate crime, England and Wales, 2014 to 2015 similarly contains an excel document, recorded under Hate crime, England and Wales, 2014 to 2015: data tables. Here "Hate crime strand: Race" is reported as:
All the numbers for the period 2011-2014 differ.
Definition of hate crime used for the purposes of hate crime recording in the UK. In particular, note:
import os
import json
import numpy as np
import pandas as pd
from bokeh.resources import INLINE
import bokeh.io
bokeh.io.output_notebook(INLINE)
from bokeh.io import output_file, show, output_notebook
from bokeh.models import ColumnDataSource, RangeTool
from bokeh.plotting import figure
from bokeh.transform import factor_cmap, cumsum
def barchart(counted_data, counts, output_here=True,
save_filename="plot", plot_height=350, plot_width=700, title="Title", bgcolor = "#2b2b2b"):
# either display in the notebook or save file
if output_here:
output_notebook()
else:
output_file(save_filename+".html")
source = ColumnDataSource(data=dict(counted_data=counted_data, counts=counts))
p = figure(x_range=counted_data, plot_height=plot_height, plot_width=plot_width, toolbar_location=None, title=title)
p.vbar(x='counted_data',
top='counts',
width=0.8,
source=source,
line_color='#2a9d8f',
fill_color='#2a9d8f')
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
p.y_range.start = 0
p.xaxis.major_label_orientation = 1.2
p.background_fill_color = bgcolor
show(p)
# table of all the crime classifications
df_crimes = pd.read_excel("data/reccrime-offence-ref.ods")
# list of offenses that fall under the category of racially and religiously motivated crime. This includes:
race_crime_codes = [
"105B", '58E', '58F', '58G', '58H', '58J', '8D', '8H', '8J', '8M', '8P', '9B'
]
df_crimes['Offence Code'] = df_crimes['Offence Code'].astype('str')
df_crimes[df_crimes['Offence Code'].isin(race_crime_codes)]
Offence Code | Offence description | Old PRC offence group | Old offence sub-group | New ONS offence group | New ONS sub-offence group | |
---|---|---|---|---|---|---|
5 | 105B | Racially or religiously aggravated assault wit... | Violence against the person | Violence against the person - without injury | Violence against the person | Violence without injury |
110 | 58E | Racially or religiously aggravated Criminal da... | Criminal damage | NaN | Criminal damage and arson | Criminal damage |
111 | 58F | Racially or religiously aggravated Criminal da... | Criminal damage | NaN | Criminal damage and arson | Criminal damage |
112 | 58G | Racially or religiously aggravated Criminal da... | Criminal damage | NaN | Criminal damage and arson | Criminal damage |
113 | 58H | Racially or religiously aggravated other Crimi... | Criminal damage | NaN | Criminal damage and arson | Criminal damage |
114 | 58J | Racially or religiously aggravated Criminal da... | Criminal damage | NaN | Criminal damage and arson | Criminal damage |
166 | 8D | Racially or religiously aggravated less seriou... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
169 | 8H | Racially or religiously aggravated inflicting ... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
170 | 8H | Racially or religiously aggravated inflicting ... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
171 | 8J | Racially or religiously aggravated actual bodi... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
172 | 8J | Racially or religiously aggravated actual bodi... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
177 | 8M | Racially or religiously aggravated harassment | Violence against the person | Violence against the person - without injury | Violence against the person | Violence without injury |
178 | 8M | Racially or religiously aggravated harassment | Violence against the person | Violence against the person - without injury | Violence against the person | Violence without injury |
181 | 8P | Racially or religiously aggravated assault wit... | Violence against the person | Violence against the person - with injury | Violence against the person | Violence with injury |
195 | 9B | Racially or religiously aggravated public fear... | Violence against the person | Violence against the person - without injury | Public order offences | Public order offences |
Corresponding FY: FY2002-2003 Q1 to FY2006-2007 Q4
df_crimes_1 = pd.read_csv("data/df_crimes_1.csv", index_col=0)
df_crimes_1.head()
Financial Year | Financial Quarter | Force Name | Offence Description | Offence Group | Offence Subgroup | Offence Code | Number of Offences | |
---|---|---|---|---|---|---|---|---|
0 | 2002/03 | 1 | Avon & Somerset | Arson | Criminal damage and arson | Arson | 56 | 304 |
1 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a dwelling | Criminal damage and arson | Criminal damage | 58A | 1309 |
2 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a building other than a dwe... | Criminal damage and arson | Criminal damage | 58B | 1046 |
3 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a vehicle | Criminal damage and arson | Criminal damage | 58C | 2642 |
4 | 2002/03 | 1 | Avon & Somerset | Other Criminal damage | Criminal damage and arson | Criminal damage | 58D | 935 |
Corresponding FY: FY2007-2008 Q1 to FY2011-2012 Q4
df_crimes_2 = pd.read_csv("data/df_crimes_2.csv", index_col=0)
df_crimes_2.head()
Financial Year | Financial Quarter | Force Name | Offence Description | Offence Group | Offence Subgroup | Offence Code | Number of Offences | |
---|---|---|---|---|---|---|---|---|
0 | 2007/08 | 1 | Avon and Somerset | Abandoning child under the age of two years | Violence against the person | Violence without injury | 12 | 0 |
1 | 2007/08 | 1 | Avon and Somerset | Abduction of female | Sexual offences | Other sexual offences | 25 | 0 |
2 | 2007/08 | 1 | Avon and Somerset | Absconding from lawful custody | Miscellaneous crimes against society | Miscellaneous crimes against society | 80 | 5 |
3 | 2007/08 | 1 | Avon and Somerset | Abuse of children through prostitution and por... | Sexual offences | Other sexual offences | 71 | 1 |
4 | 2007/08 | 1 | Avon and Somerset | Abuse of position of trust of a sexual nature | Sexual offences | Other sexual offences | 73 | 3 |
Corresponding FY: FY2012-2013 Q1 to FY2020-2021 Q3
df_crimes_3 = pd.read_csv("data/df_crimes_3.csv", index_col=0)
df_crimes_3.head()
Financial Year | Financial Quarter | Force Name | Offence Description | Offence Group | Offence Subgroup | Offence Code | Number of Offences | |
---|---|---|---|---|---|---|---|---|
0 | 2012/13 | 1 | Action Fraud | Fraud offences recorded by Action Fraud | Fraud offences | Fraud: Action Fraud | AF | 20323.0 |
1 | 2012/13 | 1 | Avon & Somerset | Abandoning child under two years (outcomes only) | Violence against the person | Violence without injury | 12 | 0.0 |
2 | 2012/13 | 1 | Avon & Somerset | Absconding from lawful custody | Miscellaneous crimes against society | Miscellaneous crimes against society | 80 | 1.0 |
3 | 2012/13 | 1 | Avon & Somerset | Abuse of children through prostitution and por... | Sexual offences | Other sexual offences | 71 | 0.0 |
4 | 2012/13 | 1 | Avon & Somerset | Abuse of position of trust of a sexual nature | Sexual offences | Other sexual offences | 73 | 3.0 |
df_crimes = pd.read_csv("data/crimes_all.csv", index_col=0)
df_crimes.head()
Financial Year | Financial Quarter | Force Name | Offence Description | Offence Group | Offence Subgroup | Offence Code | Number of Offences | |
---|---|---|---|---|---|---|---|---|
0 | 2002/03 | 1 | Avon & Somerset | Arson | Criminal damage and arson | Arson | 56 | 304.0 |
1 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a dwelling | Criminal damage and arson | Criminal damage | 58A | 1309.0 |
2 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a building other than a dwe... | Criminal damage and arson | Criminal damage | 58B | 1046.0 |
3 | 2002/03 | 1 | Avon & Somerset | Criminal damage to a vehicle | Criminal damage and arson | Criminal damage | 58C | 2642.0 |
4 | 2002/03 | 1 | Avon & Somerset | Other Criminal damage | Criminal damage and arson | Criminal damage | 58D | 935.0 |
print('total number of recorded offences: ', df_crimes['Number of Offences'].sum())
total number of recorded offences: 94583962.35102853
Convert financial year to calendar year
# if FY Quarter is 1, 2, or 3 then true, else it's 4 and return false
def fy_q(x):
if x['Financial Quarter'] == 1 or x['Financial Quarter'] == 2 or x['Financial Quarter'] == 3:
return True
else:
return False
# if the FY Quarter is true (1, 2, 3) then the year is the start year of the Financial Year value (e.g. 2019-20 Q1,2,3 is 2019)
# else it's the end year of that FY period (e.g. 2019-20 Q4 is 2020)
def fy_year(x):
fy_start = x['Financial Year'].split('/')[0]
if fy_q(x):
return fy_start
else:
return str(int(fy_start) + 1)
df_crimes['Calendar Year'] = df_crimes.apply(fy_year, axis=1)
# select the data just for the Met Police
df_crimes = df_crimes[df_crimes['Force Name'] == 'Metropolitan Police']
# select data just for racially motivated crimes
df_crimes = df_crimes[df_crimes['Offence Code'].isin(race_crime_codes)]
def crimes_per_group_per_year(year):
return df_crimes[df_crimes['Calendar Year']==year].groupby(['Offence Subgroup'])['Number of Offences'].sum()
Plot a line chart over time to observe if there are any changes per category.
offence_subgroups = df_crimes['Offence Subgroup'].unique()
x = [i for i in range(2002, 2021)]
colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']
ys = {}
for offence in offence_subgroups:
if offence != 'Stalking and harassment':
data_offence = df_crimes[df_crimes['Offence Subgroup'] == offence].groupby('Calendar Year')['Number of Offences'].sum().values
else:
data_offence = df_crimes[df_crimes['Offence Subgroup'] == offence].groupby('Calendar Year')['Number of Offences'].sum().values
# since the classification 'Stalking and harassment' only appears for the first time in 2013 calendar year
# we manually fill in the previous years (from 2002) with 0s - this should be treated as missing data
# and not as 0 cases!
data_offence = np.concatenate((np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), data_offence), axis=None)
ys[offence] = data_offence
p = figure(title="Crimes by category over time", x_axis_label="year", y_axis_label="number of recorded crimes",
plot_height=500, plot_width=900)
col = 0
for offence in offence_subgroups:
p.line(x, ys[offence], legend_label=offence, line_color=colors[col], line_width=2)
col+=1
p.legend.location = 'top_left'
show(p)
df_london = pd.read_csv("data/Monthly Crime Other Crime Borough Level Off Type.csv")
# only hate crimes
df_london[df_london['MajorText'] == 'Hate']
Borough | MajorText | MinorText | 201004 | 201005 | 201006 | 201007 | 201008 | 201009 | 201010 | ... | 202008 | 202009 | 202010 | 202011 | 202012 | 202101 | 202102 | 202103 | 202104 | 202105 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Aviation Security(SO18) | Hate | Anti-Semitic | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | Barking and Dagenham | Hate | Anti-Semitic | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
2 | Barnet | Hate | Anti-Semitic | 3 | 3 | 2 | 4 | 4 | 3 | 1 | ... | 9 | 10 | 4 | 10 | 18 | 6 | 14 | 13 | 11 | 48 |
3 | Bexley | Hate | Anti-Semitic | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | Brent | Hate | Anti-Semitic | 1 | 2 | 4 | 2 | 5 | 2 | 1 | ... | 2 | 3 | 0 | 2 | 1 | 0 | 0 | 1 | 1 | 5 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
754 | Sutton | Hate | Transgender Hate | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
755 | Tower Hamlets | Hate | Transgender Hate | 0 | 2 | 2 | 1 | 0 | 0 | 1 | ... | 2 | 3 | 2 | 0 | 1 | 1 | 0 | 1 | 11 | 3 |
756 | Waltham Forest | Hate | Transgender Hate | 0 | 0 | 0 | 0 | 0 | 1 | 2 | ... | 1 | 0 | 0 | 1 | 0 | 1 | 2 | 1 | 0 | 1 |
757 | Wandsworth | Hate | Transgender Hate | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 3 | 0 |
758 | Westminster | Hate | Transgender Hate | 0 | 3 | 0 | 0 | 0 | 0 | 0 | ... | 6 | 6 | 5 | 2 | 1 | 3 | 1 | 2 | 3 | 4 |
330 rows × 137 columns
# to view the subcategories of hate crime
print("Hate crime subcategories: ", list(df_london[df_london['MajorText'] == 'Hate']['MinorText'].unique()))
Hate crime subcategories: ['Anti-Semitic', 'Disability Hate Crime', 'Domestic Abuse', 'Domestic Abuse Homicide', 'Domestic Abuse VWI', 'Faith Hate Crime', 'Homophobic Hate Crime', 'Islamophobic', 'Racist and Religious Hate Crime', 'Transgender Hate']
Select only the categories of crime that fall under the major category of hate crime, and the subcategory 'Racist and Religious Hate Crime'.
df_london_race = df_london[(df_london['MajorText'] == 'Hate') & (df_london['MinorText'] == 'Racist and Religious Hate Crime')]
df_london_race_year = pd.DataFrame(df_london_race.iloc[:, 3:].sum()).reset_index()
df_london_race_year.columns = ['year_month', 'number']
df_london_race_year['year'] = df_london_race_year['year_month'].apply(lambda x : x[0:4]).astype(int)
df_london_race_year
year_month | number | year | |
---|---|---|---|
0 | 201004 | 796 | 2010 |
1 | 201005 | 901 | 2010 |
2 | 201006 | 919 | 2010 |
3 | 201007 | 931 | 2010 |
4 | 201008 | 770 | 2010 |
... | ... | ... | ... |
129 | 202101 | 1297 | 2021 |
130 | 202102 | 1330 | 2021 |
131 | 202103 | 1816 | 2021 |
132 | 202104 | 1901 | 2021 |
133 | 202105 | 2234 | 2021 |
134 rows × 3 columns
df_london_race_year.groupby("year")['number'].sum()
year 2010 6996 2011 7990 2012 9559 2013 9373 2014 11548 2015 14168 2016 16791 2017 17006 2018 16543 2019 18480 2020 21226 2021 8578 Name: number, dtype: int64
Re-compute the data on an yearly basis. The montly-basis data shows dips for certain months for which I don't have information of the cause. This could be because of recording timings, etc. To eliminate any strange ambiguity, we want to consider the data on a yearly basis.
# Plot all the crimes per year, over the whole range of avilable data
# note that the removal of the last datapoint is because it corresponds to the year 2021, where the data is incomplete
x = list(df_london_race_year.groupby("year")['number'].sum().index)[:-1]
y = list(df_london_race_year.groupby("year")['number'].sum().values)[:-1]
p = figure(title="Racist and Religious Hate Crime, Met Police (Greater London, minus City of London); period 2010-2020",
x_axis_label="year",
y_axis_label="number of recorded crimes",
plot_height=500, plot_width=900)
p.line(x, y, legend_label="racial crimes", line_color='maroon', line_width=3)
p.legend.location = 'top_left'
show(p)
The same data, plotted in the original montly basis interval. From the original Met Police Dashboard