Crime Analysis

From Cbcb
Revision as of 19:07, 30 September 2009 by Boliu (talk | contribs)

Analyzing Campus Crime and Incident Events Using Spotfire

Team members: Hyoungtae Cho and Bo Liu, graduate students in Computer Science at UMCP

Introduction

Campus crimes occurring around College Park pose serious threats to the daily life of students, faculty, and staff. How can these incidents be prevented as far as possible? Are there crime hot spots in terms of location or time? To try to answer these and other relevant questions, in this application project we analyzed historical crime event data released by the Department of Public Safety at the University of Maryland, College Park.

Dataset Description

All crime event data from January 2005 to August 2009 were released by the Department of Public Safety at the University of Maryland, College Park, and were downloaded from the Daily Crime and Incident Logs website (http://www.umpd.umd.edu/UMDPS-IncidentLogs.cfm). The raw data are stored in HTML pages, which we extracted and preprocessed using Perl and Python. Irrelevant events (e.g., traffic stops) were removed, leaving a total of 4849 cases in 13 categories (shown below).
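The extraction and filtering step might look roughly like the sketch below. It is not our original script: the table layout (one row per event with date, type, and location cells), the `IRRELEVANT` set, and the names `RowParser` and `extract_events` are assumptions made for illustration, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows, which contain no <td> cells
                self.rows.append(self._row)
            self._row = None


# Assumed list of event types to discard (only "traffic stop" is named in the text).
IRRELEVANT = {"traffic stop"}


def extract_events(html):
    """Return [date, type, location] rows, dropping irrelevant event types."""
    parser = RowParser()
    parser.feed(html)
    parser.close()
    return [row for row in parser.rows
            if len(row) == 3 and row[1].lower() not in IRRELEVANT]


# Invented sample page; the real log pages may be laid out differently.
SAMPLE = """
<table>
<tr><th>Date</th><th>Type</th><th>Location</th></tr>
<tr><td>2008-03-14</td><td>Theft</td><td>Lot 1</td></tr>
<tr><td>2008-03-15</td><td>Traffic Stop</td><td>Route 1</td></tr>
</table>
"""
events = extract_events(SAMPLE)
```

Only the standard library is used here, so the sketch runs without extra packages; a real scraper would need to be adapted to the actual markup of the log pages.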

Findings

The Number of Property Destructions is Increasing

To get an overview of the general trend in each type of event, we plotted the event counts, type by type, across the five years. Note that the 2009 data are incomplete (first eight months only). For most types the number of occurrences fluctuates, but the number of property destructions increases steadily from 2005 to 2008. This is a very interesting trend, but we could not find a reasonable explanation for it.

File:Jbl jhl yearly overview.jpg
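The per-year, per-type counts behind such a plot can be computed outside Spotfire as well. The following is a minimal sketch, assuming each event is a (date, type) pair with the date as a 'YYYY-MM-DD' string; the function names and the toy events are hypothetical and are not taken from the real dataset.

```python
from collections import Counter


def yearly_counts(events):
    """Map (year, type) -> number of events; dates are 'YYYY-MM-DD' strings."""
    return Counter((int(date[:4]), kind) for date, kind in events)


def is_steadily_increasing(counts, kind, years):
    """True if the count for `kind` strictly increases over consecutive `years`."""
    series = [counts[(year, kind)] for year in years]
    return all(a < b for a, b in zip(series, series[1:]))


# Toy events for illustration only (not the real incident log).
toy = [
    ("2005-02-01", "Theft"),
    ("2005-03-01", "Theft"),
    ("2005-05-09", "Property Destruction"),
    ("2006-01-20", "Property Destruction"),
    ("2006-10-03", "Property Destruction"),
    ("2006-11-11", "Theft"),
]
counts = yearly_counts(toy)
trend = is_steadily_increasing(counts, "Property Destruction", [2005, 2006])
```

When applying a check like this to the real data, the incomplete 2009 counts should be left out of the year range, since eight months cannot be compared directly with full years.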

Critique of Spotfire

Pros:

1, Spotfire is really easy to use; it took us less than 10 minutes to get used to it. Data manipulation and chart creation are very interactive and convenient, which makes it extremely efficient for exploratory data analysis, where we normally do not know in advance where the trends or patterns in the data are.

2, I like the filter panel most. In our case, most of the features of the crime events are discrete (e.g., location, date, type), so the sliding bar allows us to filter on these features easily.

Cons:

1, Spotfire requires an internet connection when logging in. When no connection is available, it crashes instead of notifying the user or showing a warning.

2, When switching to another chart, the whole graph has to be built from scratch. For example, I built a bar chart by selecting some features that I was interested in. I was not sure a bar chart was the best visualization, so I decided to try a line chart. After the switch, the line chart was built using the default data features instead of the ones I had used in the bar chart.