Below are the slides from a talk I gave at VALA 2014 on how we went about trialling Linked Data at the Victorian Parliamentary Library. There is also a longer paper that goes along with the talk.
Here’s another visualisation of some data from the 2013 Australian Federal Election.
I wanted to see how consistent voting patterns were across booths within an electorate. From handing out how-to-vote cards at my local polling booth, I had the feeling that not every booth is the same.
Fortunately, the Australian Electoral Commission publishes a live feed of all data by booth. It’s the same data that the news outlets use, so it’s pretty good. It follows the Election Markup Language standard, so extracting the data was not too hard. The AEC had added their own elements, but they used namespaces, which made it simple to process with a fairly basic Perl script.
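To give a feel for what namespace-aware extraction looks like, here is a minimal sketch in Python rather than the original Perl. The sample document, element names, attribute names and namespace URIs are all made up for illustration; they are not the real EML or AEC schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: element names, attributes and namespace URIs
# are illustrative only and do not reflect the real AEC/EML schema.
SAMPLE = """\
<eml:Results xmlns:eml="http://example.org/eml"
             xmlns:aec="http://example.org/aec-extensions">
  <aec:PollingPlace name="Town Hall" lat="-37.81" lon="144.96">
    <eml:Candidate><eml:Party>Party A</eml:Party><eml:Votes>412</eml:Votes></eml:Candidate>
    <eml:Candidate><eml:Party>Party B</eml:Party><eml:Votes>388</eml:Votes></eml:Candidate>
  </aec:PollingPlace>
</eml:Results>
"""

# Prefix-to-URI map so queries can name which vocabulary each element is from.
NS = {
    "eml": "http://example.org/eml",
    "aec": "http://example.org/aec-extensions",
}


def first_preferences(xml_text):
    """Return (booth, lat, lon, party, votes) tuples, one per candidate per booth."""
    root = ET.fromstring(xml_text)
    rows = []
    for booth in root.findall("aec:PollingPlace", NS):
        name = booth.get("name")
        lat = float(booth.get("lat"))
        lon = float(booth.get("lon"))
        for cand in booth.findall("eml:Candidate", NS):
            party = cand.findtext("eml:Party", namespaces=NS)
            votes = int(cand.findtext("eml:Votes", namespaces=NS))
            rows.append((name, lat, lon, party, votes))
    return rows


for row in first_preferences(SAMPLE):
    print(row)
```

Because the standard elements and the extension elements live in different namespaces, each query can say exactly which vocabulary it means, so the extensions never collide with the standard ones.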
Data is based on first preferences only. Because of the way heatmaps work, the intensity increases where booths are closer together, so to some extent the heatmap is determined by the layout of the booths. However, distinct patterns are discernible if you explore the data. Only parties that registered votes in at least 200 booths have been included.
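The 200-booth cut-off could be applied along these lines. This is only a sketch: the function name and the row layout are hypothetical, and it assumes "registered votes" means at least one first-preference vote at a booth.

```python
from collections import defaultdict


def parties_with_enough_booths(rows, min_booths=200):
    """Return the set of parties with votes in at least `min_booths` booths.

    `rows` is an iterable of (booth, party, votes) tuples (hypothetical layout).
    Assumption: a party "registered votes" at a booth if votes > 0 there.
    """
    booths_by_party = defaultdict(set)
    for booth, party, votes in rows:
        if votes > 0:
            booths_by_party[party].add(booth)
    return {
        party
        for party, booths in booths_by_party.items()
        if len(booths) >= min_booths
    }


# Tiny made-up example with a threshold of 2 booths instead of 200.
sample_rows = [
    ("Town Hall", "Party A", 412),
    ("Town Hall", "Party B", 388),
    ("Primary School", "Party A", 150),
    ("Primary School", "Party B", 0),
]
print(parties_with_enough_booths(sample_rows, min_booths=2))
```

In the toy data only Party A clears the threshold, because Party B received no votes at the second booth.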
Absentee votes cast at capital city booths are mapped at the booth rather than in the electorate that the vote was for, so capital cities tend to show high results for every party.
The Australian Electoral Commission has released the data for preference allocation, which makes a good subject for a visualisation.
Getting useful data is always a compromise. Some parties have multiple tickets (i.e. they split their preference flows across two or more tickets). Independents do not have parties, but work in a group, so I grouped independents under the highest preferenced candidate for that ticket. The Coalition is made up of more than one party, so I had to combine these under Coalition.
To measure how highly a party was preferenced, I took the average position on the ballot of each member of the party, and averaged these if there was more than one ticket. For some parties with split tickets, this meant that they might end up not preferencing any party particularly highly.
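As a rough sketch of that averaging (not the script actually used), assume each ticket is simply a list of party names, one entry per candidate, in preference order. The function name and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_positions(tickets):
    """Average ballot position given to each party by one ticket-lodging party.

    `tickets` is a list of orderings; each ordering lists the party of every
    candidate from first preference to last (hypothetical layout).
    Step 1: per ticket, average the positions of each party's candidates.
    Step 2: average those per-ticket figures across all tickets.
    """
    per_ticket = []
    for ordering in tickets:
        positions = defaultdict(list)
        for pos, party in enumerate(ordering, start=1):
            positions[party].append(pos)
        per_ticket.append({party: mean(p) for party, p in positions.items()})

    parties = set().union(*per_ticket)
    return {
        party: mean(t[party] for t in per_ticket if party in t)
        for party in parties
    }


# A split ticket: the lodging party puts itself first on both tickets,
# then swaps the order of parties B and C.
split = [
    ["A", "A", "B", "C"],
    ["A", "A", "C", "B"],
]
print(average_positions(split))
```

With the split ticket above, B and C both average out to position 3.5, which shows how a party that splits its flows can end up looking as if it preferences nobody particularly highly.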
Next was how to visualise the data. A chord diagram seemed the natural way to show how parties preference each other. The problem is that there is too much data to show every preference allocation (by definition, every party preferences every other candidate), so I needed to draw the line on how much data to show. I arbitrarily decided that a party counted as highly preferenced by another party if it averaged a position in the top 25% of that party’s ballot order.
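The cut-off can be sketched as a simple filter that turns average positions into the links a chord diagram would draw. The names here are hypothetical, and the sketch makes two simplifying assumptions: every ticket has the same ballot size, and "top 25%" means an average position at or below a quarter of that size.

```python
def chord_links(avg_position, ballot_size, top_fraction=0.25):
    """Return sorted (source, target) pairs where source highly preferences target.

    `avg_position` maps source party -> {target party: average position}.
    Assumptions: one shared `ballot_size`, and "top 25%" means an average
    position <= ballot_size * top_fraction. A party's position on its own
    ticket is ignored.
    """
    cutoff = ballot_size * top_fraction
    links = []
    for source, targets in avg_position.items():
        for target, pos in targets.items():
            if target != source and pos <= cutoff:
                links.append((source, target))
    return sorted(links)


# Made-up averages on an 8-candidate ballot, so the cut-off is position 2.0.
sample_avg = {
    "A": {"A": 1.0, "B": 2.0, "C": 6.5},
    "B": {"B": 1.0, "A": 7.0, "C": 2.0},
    "C": {"C": 1.0, "A": 5.0, "B": 1.5},
}
print(chord_links(sample_avg, ballot_size=8))
```

In this toy data B and C preference each other (a symmetrical relationship), while A preferences B without getting anything back.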
The visualisation shows some interesting things. Whether there is a symmetrical or asymmetrical relationship between parties can easily be seen. Also, the wider the party is around the circle, the more other parties preference it. The ALP and Coalition have fairly narrow widths, while the Bullet Train party and Family First are relatively wide.
Of course, preferences are a lot more complex than shown here. The order of preferences and how they flow once quotas are allocated can have a subtle and profound effect on the election outcome. If you are considering voting below the line, Antony Green has some good advice.
Here are the slides from another talk, from VALA 2012, where I talked about how we’ve been using OpenCalais at the Victorian Parliamentary Library to add tags and semantic data to one of our databases. You can also see the talk here or download a longer paper that goes with the talk.
There’s a lot of talk about the carbon footprint of attending conferences these days and on Friday I attended my first virtual conference. NSWsphere provided a live stream of the conference. The job they did was excellent (numerous cameras, direct mic etc.) and I was able to watch easily without getting frustrated.
The second important factor was the live Twitter stream. This allowed me to tap into some of the intangibles that you get from going to a conference: the important chit-chat on what everyone thought of the presentations. The advantage of Twitter was that it was happening as the presenters were talking, so I didn’t have to wait until the session ended to get people’s views.
The last advantage was that I could tune in to just the presenters I was interested in. I simply printed out the agenda and switched over when they were on.
So despite being in another state, I was still able to get something out of this conference without travelling, without spewing out tonnes of carbon and while keeping up with most of my real job. Obviously face-to-face meetings with colleagues are better and I would choose that if I could, but especially for conferences where you might just have a peripheral interest and can’t justify the cost of attending in person, it might be worth giving this a try.