The Storytelling with Data Challenge for July 2019 was:
identify data that makes sense to plot in a radial view and visualize it
Plotting time data on an analog clock seemed like a good use of radial graphs. I’ve been working on a Seattle Collisions analysis project so I chose its dataset and went to work creating a graph representing the trend of collision volume throughout a 24-hour day.
Working this challenge reinforced for me the importance of these principles:
- understand your data
- know the question you’re trying to answer
- displaying plots of varying scales can be misleading
Understand your data
I started off using Tableau to make my first radial graphs.
Collision time, when it’s reported by the Seattle Police Department, includes the clock hour but not always the minutes. And when minutes are included, they are often rounded to the half hour. Notably, in the case of DUI and other more serious collisions, minutes are usually noted. Because of the significant amount of missing and dirty data, plotting by minute is not going to yield good results. Plotting by hour is a better bet.
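The hour-level binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the time format (`"HH:MM"` or a bare `"HH"`), the helper names `hour_of` and `collisions_per_hour`, and the sample values are all hypothetical and are not taken from the Seattle dataset.

```python
from collections import Counter


def hour_of(time_text):
    """Return the clock hour (0-23) from text like '14:30' or '14'.

    Returns None when the hour cannot be recovered (blank or dirty value).
    The input format is an assumption made for this sketch.
    """
    text = time_text.strip()
    if not text:
        return None
    hour_part = text.split(":")[0]
    if not hour_part.isdigit():
        return None
    hour = int(hour_part)
    return hour if 0 <= hour <= 23 else None


def collisions_per_hour(times):
    """Count records per clock hour, ignoring minutes entirely."""
    counts = Counter()
    for time_text in times:
        hour = hour_of(time_text)
        if hour is not None:
            counts[hour] += 1
    # Always return 24 slots so empty hours still appear on the dial.
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up sample: some values have minutes, some do not, some are unusable.
sample = ["14:30", "14", "02:17", "", "14:00", "23:59", "n/a"]
hourly = collisions_per_hour(sample)
print(hourly)
```

Because the minutes are discarded, a record rounded to the half hour and a record with no minutes at all land in the same bin, so missing or rounded minutes no longer distort the result.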
Know the question
what is the trend of collision volume throughout a 24-hour day
I intentionally invested time experimenting with graphs and dashboards in Tableau since radials are new to me. Usually, when time is more limited, I save a lot of work and headache if I focus on answering my established question.
Although the Tableau plot (above) with the black background is sexy, I do not feel that it (or any of the others) adequately communicates a trend in collisions. I temporarily put the Tableau experiments on hold and opened RStudio to start developing a better answer.
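For readers new to radial plots, the geometry behind a clock-style layout is simple. The Python sketch below is not the R code used for this project. It only shows one way to place hourly bars around a dial, assuming a 24-hour face with midnight at the top and hours running clockwise. The function names `hour_to_angle` and `bar_tip` are hypothetical.

```python
import math


def hour_to_angle(hour, hours_on_dial=24):
    """Angle in radians, measured clockwise from the top of the dial."""
    return 2 * math.pi * (hour % hours_on_dial) / hours_on_dial


def bar_tip(hour, length):
    """(x, y) of a bar's outer end, with hour 0 pointing straight up."""
    theta = hour_to_angle(hour)
    # sin for x and cos for y turns the usual counterclockwise-from-east
    # convention into clockwise-from-north, like a clock face.
    return (length * math.sin(theta), length * math.cos(theta))


for hour in (0, 6, 12, 18):
    x, y = bar_tip(hour, 1.0)
    print(hour, round(x, 3), round(y, 3))
```

On this assumed layout, 6 a.m. sits on the right, noon at the bottom, and 6 p.m. on the left, so a full day reads as one sweep around the circle.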
Displaying plots of varying scales can be misleading
As I continued my radial training using R, I enjoyed exploring trends within various categories of collisions, for example: DUIs, collisions involving pedcycles, and collisions resulting in fatalities.
I wanted to place all these plots on the same page so we can see how the trends differ. For example, while few pedcycle collisions occur at night, most of the DUI collisions are at night.
However, the scale on each of these plots is different. The ‘Count’ is shown on the Y axis (left side) to indicate the scale of each, but at first glance the overall volume looks similar in every category, which is misleading. I decided to stick with just the ‘all’ plot. Someday it would be cool to create an infographic that makes the difference between the scales clear, but for this project a single plot meets the Challenge.
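A small Python sketch shows why independently scaled plots look alike. The counts are made up for illustration and are not the Seattle figures. `bar_heights` is a hypothetical helper that expresses each bar as a fraction of its axis maximum.

```python
def bar_heights(counts, axis_max):
    """Bar heights as a fraction of the axis maximum (1.0 fills the plot)."""
    return [count / axis_max for count in counts]


# Made-up hourly counts for two categories of very different size.
all_collisions = [100, 400, 800, 600]
dui = [8, 2, 1, 6]

# Each plot scaled to its own maximum: both fill their frames completely.
own_all = bar_heights(all_collisions, max(all_collisions))
own_dui = bar_heights(dui, max(dui))

# Both plots scaled to one shared maximum: the small category shrinks.
shared_max = max(max(all_collisions), max(dui))
shared_dui = bar_heights(dui, shared_max)

print("tallest bar, own scales:", max(own_all), max(own_dui))
print("tallest DUI bar, shared scale:", max(shared_dui))
```

With separate scales the tallest bar in each plot reaches the edge of its frame, so the eye reads the two categories as similar in volume. A shared scale keeps the comparison honest, but it flattens the smaller category until its shape is hard to see, which is the trade-off described above.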
I experimented a bit with colors, coloring the bars as well as the background. I was attempting to help the reader by colorizing the part of day (morning, afternoon, etc.). I finally decided that was nice for aesthetics but otherwise just added noise, so I dropped the colors.
I removed the Y axis tick mark text because the actual number of hourly collisions over a 14-year period is overwhelming, and it’s the general trend that I’m looking for, not actual numbers. The title clearly states what we’re looking at, and the subtitle provides a take-away (lots more accidents in the lunch-to-evening commute hours).