Recently, I spent some evenings slicing and dicing the City of Toronto's 2015 parking infractions data in Tableau 9.3. While working with Tableau's built-in maps, I also researched alternative tools for generating map visualizations. In a discussion about mapping tools, a LinkedIn friend, Alan Clark, who is an expert in this field, pointed me towards ArcGIS, a Geographic Information System for mapping and analyzing geographic data. At first glance the features were quite appealing, so I decided to try plotting the parking infractions data on a map for easy visualization.
I created an ArcGIS Online trial account to explore this further. The data set I used came from the Open Data program run by the City of Toronto. You can find parking data organized by year (2008 – 2015) here. The table below shows a few records to indicate what the raw data looks like:
|tag_number_masked||date_of_infraction||infraction_code||infraction_description||set_fine_amount||time_of_infraction||location1||location2||location3||location4||province|
|***87325||20150101||15||PARK-WITHIN 3M OF FIRE HYDRANT||100||1||OPP||11 CENTRE AVE||||||ON|
|***87326||20150101||15||PARK-WITHIN 3M OF FIRE HYDRANT||100||1||OPP||11 CENTRE AVE||||||ON|
|***09493||20150101||3||PARK ON PRIVATE PROPERTY||30||2||NR||2425 JANE ST||||||ON|
|***67358||20150101||5||PARK-SIGNED HWY-PROHIBIT DY/TM||40||3||N/S||DUNDAS SQ||E/O||YONGE ST||ON|
|***87327||20150101||5||PARK-SIGNED HWY-PROHIBIT DY/TM||40||3||N/S||ARMOURY ST||W/O||CENTRE AVE||ON|
To prepare the data set for map-based analysis and make it more readable, I made the following changes:
- Removed all columns except location2 and province
- Changed column header location2 to street
- Under the province column, I changed ON to Ontario
- Added new city and country fields
- Created a “Total Fine” field by aggregating fines for the same street
Note that the 2015 parking data contains over 2 million records, while Excel has a limit of 2^20 rows (1,048,576 to be exact). The Open Data files for the year are split into 3 CSV files, so you can either repeat the data preparation 3 times, once for each file, or use a data management tool that can process 2 million records without performance problems. Many tools have these capabilities, including SPSS, OpenRefine and RapidMiner, and if you are comfortable with programming, you can use R as well.
This is what the CSV file looked like after the data preparation.
|street||city||province||country||Total Fine|
|1090 DON MILLS RD||toronto||ontario||canada||643970|
|1 BRIMLEY RD S||toronto||ontario||canada||384740|
|410 COLLEGE ST||toronto||ontario||canada||377625|
|3401 DUFFERIN ST||toronto||ontario||canada||236690|
|45 OVERLEA BLVD||toronto||ontario||canada||217040|
|40 ORCHARD VIEW BLVD||toronto||ontario||canada||217005|
|35 BALMUTO ST||toronto||ontario||canada||215535|
|20 EDWARD ST||toronto||ontario||canada||189070|
|18 GRENVILLE ST||toronto||ontario||canada||170140|
|2075 BAYVIEW AVE||toronto||ontario||canada||157940|
|60 MURRAY ST||toronto||ontario||canada||151965|
|555 REXDALE BLVD||toronto||ontario||canada||147970|
|LA PLANTE AVE||toronto||ontario||canada||121790|
|100 KING ST W||toronto||ontario||canada||118225|
|2075 BAYVIEW AV||toronto||ontario||canada||115120|
|401 COLLEGE ST||toronto||ontario||canada||111550|
|66 WELLINGTON ST W||toronto||ontario||canada||106820|
|150 DAN LECKIE WAY||toronto||ontario||canada||104925|
|225 KING ST W||toronto||ontario||canada||101855|
ArcGIS Mapping Steps
Now I was finally ready to do the more exciting stuff in ArcGIS! A survey of data scientists found that they spend more than 60% of their time collecting, cleaning and organizing data and only about 10% mining it, so the effort spent on preparation was nothing extraordinary. After logging in to ArcGIS and opening the Map module, I was greeted with the page below.
When I clicked on the Basemap menu, I was pleased to see several map options and selected Streets.
Then I clicked on the Add menu and selected Add Layer from File to add my prepared CSV file.
I proceeded to add my CSV file and was prompted with the following dialog box. You can choose whether the mapping should be based on address or latitude/longitude. In my case, the CSV contained street addresses, so I selected Address. I was glad that the tool mapped the columns from my CSV directly to its Location Fields without any additional changes to my data.
Unfortunately, I seemed to have hit the limit of my ArcGIS public account. However, that’s OK: I could still plot the first 250 addresses on a map!
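Since only the first 250 rows ended up on the map, it matters which rows come first in the file. One way to make sure the plotted rows are the ones with the highest totals is to sort and trim the prepared CSV before uploading it. The sketch below assumes the prepared file has a header row with a `Total Fine` column holding whole numbers; the function name and the default limit of 250 are illustrative.

```python
import csv


def top_rows_by_total_fine(in_stream, out_stream, limit=250):
    """Copy the `limit` rows with the highest 'Total Fine' to out_stream.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(in_stream)
    fieldnames = reader.fieldnames or []  # reads the header row
    rows = sorted(reader, key=lambda row: int(row["Total Fine"]), reverse=True)
    kept = rows[:limit]

    writer = csv.DictWriter(out_stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(kept)
    return len(kept)
```

The trimmed file can then be added as a layer in the same way as the full one.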
Visualization – Top 250 Revenue-Generating Streets
Fortunately, the Excel sheet was already sorted by Total Fine, so what I really got was a visualization of the top 250 parking-related revenue-generating streets in Toronto! The bigger the circle, the higher the total fine for that location. Based on the data, there were 204,752 unique street addresses with parking infractions in total. The single highest revenue generator was 1090 Don Mills Road, with a grand total of $643,970; it is right next to Shops at Don Mills, an upscale mall.
This map is interactive! You can zoom in to look for specific streets.
Below is a quick summary of the high revenue generating streets.
1090 Don Mills – Next to a high-end shopping plaza
# of Tickets – 2,797
Total Fine – $643,970
Avg fine/ticket – $230
1 Brimley Road – Next to Bluffer's Park
# of Tickets – 3,184
Total Fine – $384,740
Avg fine/ticket – $120
410 College Street – Sandwiched between 2 schools
# of Tickets – 941
Total Fine – $377,625
Avg fine/ticket – $400
2075 Bayview Avenue – Next to Sunnybrook Health Sciences Centre
# of Tickets – 9,124
Total Fine – $274,600
Avg fine/ticket – $30
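The average fine per ticket in the summary above is simply the total fine divided by the number of tickets. Here is a quick sketch that recomputes it from the figures listed above; the dictionary layout and the function name are illustrative choices. The results land close to the summary's averages, which appear to be rounded.

```python
# (number of tickets, total fine in dollars), copied from the summary above
summary = {
    "1090 Don Mills": (2797, 643970),
    "1 Brimley Road": (3184, 384740),
    "410 College Street": (941, 377625),
    "2075 Bayview Avenue": (9124, 274600),
}


def average_fine(tickets, total_fine):
    """Average fine per ticket in dollars."""
    return total_fine / tickets


for location, (tickets, total_fine) in summary.items():
    print(f"{location}: ${average_fine(tickets, total_fine):,.2f} per ticket")
```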
The visualization above is interactive and hosted on the ArcGIS platform, and you can access it directly here. I encourage you to poke around the map to see if you can find any interesting patterns in how the Toronto parking authority is issuing parking infractions. There does seem to be a clear pattern of higher parking-related revenue being generated close to hospitals, malls and schools. Perhaps there is a lack of parking facilities in these areas? Or are these locations highly targeted by parking enforcement officers for one reason or another?
In the future, I plan to drill down deeper into the available data to come up with conclusions about what the map visualizations tell us. Please share your feedback and comments below!