Update: They knocked back my second request on the grounds of anonymity. The sample was so small that giving me details might risk identifying someone. That seems fair, but if nothing else the very low number means that, in the context of my original thinking, the figures are not large enough to suggest a broader story (taking as read that the individual circumstances are sad and may have warranted reporting at the time).
I always like to test out the stuff that I ask my students to do; don’t make people do something you wouldn’t try yourself (apart from maybe fitting a gas cooker or disarming a bomb). So I’ve been collecting data from various places to use in data journalism exercises, including FOI requests via Whatdotheyknow.com.
I asked for details of people who had died whilst on student and Tier 4 visas. It was playing out a hunch (just curiosity) I had about a few things, in particular the number of those that would be suicides. I thought it would make interesting data and would be something that might interest students without getting into the dangerous territory of ‘student stories’.
Where possible I would like to know the date, location of their death, gender, age, cause of death and sponsor institution. If you could provide this information in digital form, preferably in a spreadsheet format, that would be very helpful.
Here’s the data I got.
Not really what I wanted. The main reason cited was that, apart from the information above, they were “only able to report on data that is captured in certain mandatory fields on the Home Office’s Case Information Database (CID).” Most of the information I wanted would be in the ‘notes’ section of any records, which would need to be located manually.
The Home Office is not obliged under section 12 of the Freedom of Information Act 2000 to comply with any information request where the estimated costs involved in supplying the information exceed the £600 cost limit. I regret that we cannot supply you with the information that you have asked for, as to comply with your request would exceed this cost limit.
Fair enough, although I was a bit suspicious that some of the information that would seem to be pretty useful, like sponsoring institution, would not have a field. But I realised that I didn’t really know what fields were in there. In fact I didn’t really know that the Case Information Database was where that stuff would be.
Thanks to an FOI by Helen Murphy, I found out that:
All data held on the Caseworker Information Database will fall within a
minimum data set. The Caseworker Information Database contains:
• Date of birth
• Arrival details
• Temporary admission address
• Detention details
• Refusal reasons
• Diary actions
• Removal details
More surprisingly, it also reveals that “Currently there are over 75 screens on the Caseworker Information Database (CID)”. 75 screens! No wonder they can’t find anything!
7 hour days
Helen’s FOI also helped illuminate working conditions at the Home Office. In Helen’s FOI:
The £600 limit is based on work being carried out at a rate of £25 per hour, which equates to 24 hours of work per request.
In my response:
This [£600] limit applies to all central Government Departments and is based on work being carried out by one member of staff at a rate of £25 per hour, which equates to 3½ days work per request.
Taking one as a different way of expressing the other (a dangerous assumption) would suggest days of a little under 7 hours at the Home Office: 24 hours spread over 3½ days. Still, that seems fair given the number of screens you’d need to wade through. I’d give up after 2 hours!
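For anyone who wants to check my sums, here is the arithmetic as a quick sketch. It only uses the three figures quoted in the two responses, and it leans on my assumption that the 24 hours and the 3½ days are describing the same amount of work.

```python
# Figures quoted in the two FOI responses.
COST_LIMIT_GBP = 600
HOURLY_RATE_GBP = 25
DAYS_QUOTED = 3.5

# Assumption: the "24 hours" and the "3.5 days" describe the same work.
hours_allowed = COST_LIMIT_GBP / HOURLY_RATE_GBP
hours_per_day = hours_allowed / DAYS_QUOTED

print(f"{hours_allowed:.0f} hours over {DAYS_QUOTED} days "
      f"= {hours_per_day:.2f} hours per day")
# prints: 24 hours over 3.5 days = 6.86 hours per day
```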
Groups of 5
The other thing that struck me about the data was the alarmingly uniform numbers that people die in – 5 at a time. It turns out that the figures are not entirely complete *. A note on the data says:
Figures rounded to nearest 5 (- = 0, * = 1 or 2) and may not sum to totals shown because of independent rounding.
Why round them to 5? It’s not like half a person died! Update: In the comments Martin Stabe suggests “This could be an anonymisation requirement so that individual cases cannot be identified from aggregate data.”
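To see what that note does to the numbers, here is a minimal sketch of the rule as I read it. The function names are mine, the counts are made up purely for illustration (they are not Home Office data), and I am assuming "nearest 5" is applied to anything from 3 upwards.

```python
def publish(count: int) -> str:
    """Apply the rounding rule as I read it from the note (an assumption)."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    if count == 0:
        return "-"
    if count in (1, 2):
        return "*"
    # Nearest multiple of 5, using whole-number arithmetic only.
    return str(5 * ((count + 2) // 5))


def possible_counts(published: str) -> range:
    """True counts that would produce a given published figure."""
    if published == "-":
        return range(0, 1)
    if published == "*":
        return range(1, 3)
    value = int(published)  # expects a published multiple of 5
    return range(value - 2, value + 3)


# Made-up counts, purely for illustration; not Home Office data.
made_up = [3, 3, 3]
print([publish(n) for n in made_up])   # ['5', '5', '5']
print(publish(sum(made_up)))           # 10
print(list(possible_counts("5")))      # [3, 4, 5, 6, 7]
```

So a published 5 could be anything from 3 to 7, and three rows that each show 5 can sit under a total of 10, which is the "may not sum to totals shown" part of the note.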
Limits of being human
I’ve put another request in on the basis of the data I got, assuming that 10 cases would be manageable by someone in 3.5 days. Still, 75 screens’ worth of content might yet fox my demand, so I may never get what I want this way.
The truth is that, as data, what I got is next to useless, with no real context and numbers that aren’t even accurate, but it reinforced a few things for me:
- Good FOIs rely on good planning and some prior knowledge. I’d done a bit of work understanding the whole Tier 4/student thing but clearly I needed to do more on understanding who held the data, how and why. Data, in fact journalism, is all about context.
- Good FOIs rarely stand alone. Often an FOI is an enabler. It opens doors and avenues for further questions. That makes it valuable even when the data might be useless.
- Visibility helps. Helen’s FOI answered questions I had. Maybe mine won’t, but it’s in the mix.
- Open government doesn’t just rely on data. It relies on the capacity to retrieve and search that data. Government is really good at collecting it and shockingly bad at having it in a form that is usable even to themselves (but we all knew that, didn’t we?).
Not new or startling revelations, but it never hurts to be reminded of these things from time to time.
* for ‘not entirely complete’ read ‘bugger all use’