Two fundamentals that define good data journalism

Defining data journalism is a hostage to fortune, but as I start teaching a data journalism module I've boiled it down to two things: visible methodology and data.

I’m teaching a module on Data Journalism to second year undergraduates this year. It’s not the first time we’ve done that at the university. A few years ago three colleagues of mine, Francois Nel, Megan Knight and Mark Porter ran a data journalism module which worked in partnership with the local paper. I’ve also been tormenting the students with elements of data journalism and computational journalism across all four years of our journalism courses.

There are a couple of things I wanted to do specifically with this data journalism module (over and above the required aims and outcomes). The first was, right from the start, to frame data journalism as very much a 'live conversation'. It's exciting, and rare these days, that students can dive into an area of journalism and not feel they are treading on the toes of an existing conversation. The second was to try and get them thinking about the ideological underpinnings of data journalism.

Data journalism as a discourse borrows most heavily and liberally from the vocational underpinnings of journalism — the demand that journalism serve the public and hold power to account that John Snow and others have talked about. But it also draws on the rigour of science, the discipline of code, design thinking, narrative and social change; anything to bring shape, structure and identity. This is often a good thing, especially for journalism, where new ideas are few and far between and it takes a lot to challenge the orthodoxy. Perhaps that's why data journalism is seen as an indicator of prosperous media companies. But it's also a bad thing when it's done uncritically. I've written a lot about how I think data journalism borrows the concept of 'open' for its own purposes, for example. Often much of the value of data journalism seems implied.

The fluid nature of the data journalism discussion makes it difficult to identify 'schools' of data journalism thought — I don't think there's a Bloomsbury Group of data journalism yet!* — but there are attempts to codify it. Perhaps the most recent (and best) is Paul Bradshaw's look at 10 principles for data journalism in its second decade. It's a set of principles I can get behind 100% and it's a great starting point for the ideological discussion I want the students to have.

That said, and pondering this as I put together teaching materials, I think things could be a little simpler — especially as we begin to identify and analyse good data journalism. So if there was a digitaldickinson school of data journalism I think there would be a simple defining idea…

If you can't see, understand and, ideally, interact with both the method and the data in the piece, it may be good journalism but it's not good data journalism.

When good journalism becomes good data journalism

Here are two examples to make the point.

The Guardian published a piece that uses Home Office data to reveal that asylum seekers are being housed by some of the poorest councils in the UK. A story that rightly caught the eye of government and campaigners alike. Exceptional journalism. Poor data journalism.

An exceptional piece of investigation, great journalism but this would score low as a piece of data journalism

The problem with the piece is that, although it relies heavily on the data used, it is light on the method and even lighter on the underpinning data. The data it uses is all public (there is no FOI mentioned here) and there isn't even a link to the source, let alone the source data.

Contrast that with a piece from the BBC looking at the dominance of male acts at festivals. 

The BBC’s piece might be seen as frivolous, but no less a piece of journalism.

An introduction to the method ticks the boxes for me.

It's a fascinating piece, but the key bit for me is at the end, where there is a link to find out how the story was put together. That's the thing that makes this great data journalism. The link takes you to a GitHub repository for the story which includes more about the method, unpublished extras and, importantly, the raw data.

The BBC England Data Unit GitHub page is a good example of how to add value to data journalism stories.

The BBC take is a full-service, all-bases-covered example of good data journalism; it's the Blu-ray-with-special-features version of the article. To be fair to the Guardian piece, they do talk a little about the 'how', but not on the level of the BBC's effort. I also recognise that in these days of tight resources, not every newsroom can create this level of detail. But using GitHub to store the data, or even just linking to the data directly from the article, is a step in the right direction — it's often what the journalists would have done anyway as part of the process of putting the article together.

Making a point

I've picked the Guardian and BBC stories here as examples of data-driven journalism. These are two stories that put data analysis front and centre. But I recognise that I'm the one calling them 'data journalism'. I'm making a comparison to prove a point of course, but my 'method' aside, I think the point stands: beyond the motivations, aims and underpinning critical reasons, if the audience can access the piece but not the method and the data, can we really say it's data journalism?

I want my data journalism students to really think about why we see data journalism as a thing that is worthy of study not just practice. Not in a fussy academic way but in a very live way. It isn’t enough to judge what is produced by the standards of journalism alone (I’m guessing the Guardian piece would tick the ‘proper journalism’ box for many). But it isn’t ‘just journalism’ and it isn’t just a process. If the underlying principles and process aren’t obvious in the content that the readers engage with, then it’s just an internal conversation. It has to be more than that.

For me, right now, outside of the conversation, good data journalism starts with a visible method and data.

*I guess if there was they would vehemently deny there was one.

Is Data Journalism any more open?

Last year I wrote about how the 2016 Data Journalism Awards illustrated that journalism hasn't quite got to grips with the full meaning of open data. So I thought I'd take a look at this year's crop and see if things had improved.

This is last year's definition for the open data category:

Open data award [2016] Using freedom of information and/or other levers to make crucial databases open and accessible for re-use and for creating data-based stories.

This year's was the same, save for an addition at the end (my emphasis):

Open data award [2017] Using freedom of information and/or other levers to make crucial datasets open and accessible for re-use and for creating data-driven journalism projects and stories. Publishing the data behind your project is a plus.

A plus! The Open Data Handbook definition would suggest it’s a bit more than a plus…

Open data is data that can be freely used, re-used and redistributed by anyone — subject only, at most, to the requirement to attribute and sharealike

…if you want people to re-use and re-distribute then people need the data.

Let's take a look at this year's shortlisted entries and see how they do with respect to the open data definition.

So, in the order they appear on the shortlist…

Analyzing 8 million data from public speed limit detectors radars, El Confidencial, Spain

This project made use of Spain’s (relatively) new FOI laws to create “an unique PostgreSQL database” of traffic sanctions due to exceeding the speed limits. A lot of work behind the scenes then to analyse the results and a range of fascinating stories off the back of it. It’s a great way to kick the tyres of the legislation and they’ve made good use of it.

Most of the reporting takes the same form. The story is broken down into sections each accompanied by a chart. The charts are a mix of images and interactives. The interactive charts are delivered using a number of platforms including Quartz’s Atlas tool but the majority use DataWrapper. That means that the data behind the chart is usually available for download. Most of the heavy lifting for users to search for their area is done using TableauPublic which means that the data is also available for download. The interactive maps, made on Carto, are less open as there is no way to get at the data behind the story.

Verdict: Open(ish) — this makes good use of open government legislation to create the data, but is that really open data? The data in the stories is there for people to download, but only for the visualisations — that's not the whole dataset. There also isn't an indication of what you can do with the data. Is it free for you to use?

Database of Assets of Serbian Politicians, Crime and Corruption Reporting Network — KRIK, Serbia (this site won the award)

For their entry, independent investigative journalism site KRIK created "the most comprehensive online database of assets of Serbian politicians, which currently consists of property cards of all ministers of Serbian government and all Serbian presidential candidates running in 2017 Elections." Reading the submission, it's a substantial and impressive bit of work, pulling in sources as diverse as Lexis and the Facebook Graph. They even got in a certified real estate agency "which calculated the market values of every flat, house or piece of land owned by these politicians". Amazing stuff, done in a difficult environment for journalism.

Verdict: Closed — This is a phenomenal act of data journalism and would, in my view, have been a deserving winner in any of the categories. But the data, whilst searchable, accessible and certainly available, isn't open in the strict sense.

#MineAlert, Oxpeckers Investigative Environmental Journalism, South Africa

Using information access legislation and good old journalistic legwork, Oxpeckers Centre for Investigative Environmental Journalism pulled together a dataset of mine closure information that revealed the impact of a chaotic mining sector in South Africa. The data highlighted the number of derelict mines that hadn’t been officially closed and were now being illegally and dangerously mined. There’s a nice multimedia presentation to the story and the data is presented as an embedded Excel spreadsheet.

The project has been developed and supported by a number of organisations including Code for Africa. It's no surprise then that the code behind parts of the project is available via GitHub. The data itself is also available through the OpenAfrica data portal, where the licence for reuse is clear.

Verdict: Open. The use of github and the OpenAfrica data portal add to the availability of the data which is clearly accessible in the piece too.

Pajhwok Afghan News, Afghanistan

Independent news agency Pajhwok Afghan News have created a data journalism ‘sub-site’ that aims to “use data to measure the causes, impact and solutions driving news on elections, security, health, reconstruction, economic development, social issues and government in Afghanistan.”

The site itself offers a range of stories and a mix of tools. Infogr.am plays a big part in the example offered in the submission, but other stories make use of Carto and Tableau Public. The story "Afghan women have more say in money that they earned themselves than property in marriage" uses Tableau a lot, and that means the data is easy to download, including the maps. That's handy, as the report the piece is based on (which is linked) is only available as a PDF.

Verdict: Open(ish) — the use of Infogr.am as the main driver for visualisation does limit the availability of the data, but the use of Tableau and Carto does lower the barriers a little.

ProPublica Data Store, ProPublica, United States

The not-for-profit investigative journalism giant ProPublica have submitted a whole site: a portal for the data behind the stories they create. Interestingly, ProPublica also see this project as a "potential way to defray the costs of our data work by serving a market for commercial licenses." That means that, as a journalist, you could pay $200 or more to access some of the data.

Verdict: Open. Purists might argue that the paywall isn't open, and ideally it would be nice to see more of the data freely available with the service and analysis on top, rather than whole datasets being tied up. That said, it's not as if ProPublica aren't doing good work with the money.

Researchers bet on mass medication to wipe out malaria in L Victoria Region, Nation Media Group, Kenya

This piece, published by The Business Daily, looks at plans for malaria eradication in the Lake Victoria region. The piece takes data from the 2015 Kenya Malaria Indicator Survey, amongst other places, to assess the impact of plans to try and eradicate the disease.

Verdict: Closed. The work done to get the data out of the reports (lots of PDFs) and visualise it is great, and it's a massively important topic. But the data isn't really available beyond the visualisations.

What’s open?

Like last year it’s a patchy affair when it comes to surfacing data. Only two of the entries make their data open in a way that sits comfortably in the definition of open data. For the majority, the focus here is on using open government mechanisms to generate data and that’s not open data.

As noted last year, what open data journalism should be is really about where you put the pipe:

  • open| data journalism — data journalism done in an open way.
  • open data | journalism — journalism done with open data.

By either definition, this year's crop is better representative of open data use, but falls short of the 'open' ethos that sits at the heart of open data.

Does it matter?

I asked the same question last year: in the end, does the fact that the data isn't available make the journalism bad? Of course not. The winner, KRIK, is an outstanding piece of journalism and there's loads to learn from the process and thinking behind all the projects. But I do think that the quality of the journalism could be reinforced by making the data available. After all, isn't that the modern reading of data journalism? Doesn't making our working out and raw data more visible build trust as well as meaning?

Ironically perhaps, ProPublica highlight the problem in the submission for their Data Store project:

“Across the industry, the data we create as an input into our journalism has always been of great value, but after publication it typically remained locked up on the hard drives of our data journalists — of no use either to other journalists, or to anybody else who might find value in it.”

Publishing the data behind your project is what makes it open.

If you think I’m being picky, I’d point out that I’m not picking these at random. This is the shortlist for the open data category. These are what the judges (and the applicants) say are representative of open data. I think they could go further.

As I've noted before, if the practice of data journalism is to deliver on transparency and openness, then it needs to be part of that process. It needs to be open too. For next year, I'd like to see "Publishing the data behind your project is a plus" changed to an essential criterion.

Local votes for hyperlocal #DDJ

There’s a good deal of interest in my feeds in a BBC report Local voting figures shed new light on EU referendum. The work has been a bit of a labour of Hercules by all accounts.  

Since the referendum the BBC has been trying to get the most detailed, localised voting data we could from each of the counting areas. This was a major data collection exercise carried out by my colleague George Greenwood.

This was made more difficult by a number of issues including the fact that: “Electoral returning officers are not covered by the Freedom of Information Act, so releasing the information was up to the discretion of councils.”

But the data is in and the analysis is both thorough and interesting. I particularly like the fact that the data they collected is available as a spreadsheet at the end of the article. There are gaps and there have been some issues with this (but it's already being put to good use). More and more I'm seeing data stories appear with no link to the data used or created as a result of the reporting.

Getting local.

In a nice bit of serendipity, Twitter threw up a link to a story on the Reading (Katesgrove Hill) based hyperlocal The Whitley Pump. The story, 'Is east Reading's MP voting for his constituency?', starts with the MP for Reading East, Rob Wilson, questioning an accusation that he voted against his constituents in the recent Article 50 vote. His response was, in effect, prove it: "Could you provide the evidence on how my constituency voted? My understanding is that no such breakdown is available." That's just what Adam Harrington of The Whitley Pump set out to do.

The result is a nice bit of data journalism that draws on a number of sources including council data and draws the conclusion: “There is nothing to support a view that Reading East voted to leave the EU, and available data makes this position implausible.” 

If nothing else, it's a great example of how hyperlocal data journalism can work. Unlike the BBC, the Pump didn't need to deliver across the whole country, but it did follow a lot of the same methods and fall foul of many of the same issues, not least the lack of data in the first place.

Encouraging data practice at hyperlocal level. 

The BBC's recent announcement on the next steps for its local democracy reporters scheme includes mention of a local Data Journalism Hub. In a blog post officially announcing the scheme, Matthew Barraclough noted:

We hope to get the Shared Data Hub in action very soon. Based in Birmingham, BBC staff will work alongside seconded journalists from industry to produce data-driven content specifically for the local news sector.

It would be great to see that opportunity to work and learn alongside the BBC extended to hyperlocals like the Whitley Pump.

Image courtesy of The European Parliament on Flickr.

Why open data needs to be “Citizen literate”

A “data literate” citizen isn’t someone who knows how to handle a spreadsheet — it’s someone who inherently understands the value of data in decision making.

So says Adi Eyal in a piece very much worth a read, called Why publishing more open data isn’t enough to empower citizens  over on IJnet.

I’m right behind the sentiment expressed in the headline.

I’m fascinated by the tensions caused by the use of open data – or perhaps more specifically the rhetoric of its use.  I often find myself questioning the claims of the ‘usefulness’ of open data, especially when they are linked to social and community outcomes. I share Eyal’s view that  whilst there may be some big claims, “there is not yet a larger body of work describing how open data has brought about systemic, long-term change to societies around the world.”

Some might argue (me included) that it's just too early to make judgements. As idealistic and iconoclastic as the promises may be at times, I do think it is just a matter of time before we begin to see tangible and consistently replicable social benefit from the use of open data.

But the key challenge is not the destination or how long it takes to get there. It's how we do it.

In the IJNet piece Eyal makes a distinction between simply freeing the data and its effective use, especially by average citizens. He makes a strong case for the role of “infomediaries” :

These groups (data wranglers, academics, data-proficient civil society organizations, etc.) turn data into actionable information, which can then be used to lobby for tangible change.

I'm very drawn to that idea and it reflects the way the open data ecosystem is developing and needs to develop. But I do think there's an underlying conflation in the article that hides a fundamental problem with the assumption that infomediaries are effective bridges: it assumes that open data and open government data are the same thing.

It's an important distinction for me. The kind of activities and infomediaries the article highlights are driven in the most part by a fundamental connection to open government (and its data). There is a strong underpinning focus on civic innovation in this reading of the use and value of open government data. I'd argue that open data is driven more by a strong underpinning of economic innovation – from which social and civic innovation might be seen as value created from the use of the services it provides.

There is a gap between those who hold the data and use it to make decisions and those who are affected by those decisions. I don't think that open data infomediaries always make that gap smaller; they simply take up some of the space. Some do reach across the gap more effectively than others – good data journalism, for example. But others, through an economically driven service model, simply create another access point for data.

From an open data ecosystem point of view this is great, especially if you take a market view. It makes for a vibrant open data economy and a sustainable sector. From the point of view of the citizen, the end user, the gap is still there. They are either left waiting for other infomediaries to bring that data and its value closer, or required to skill up enough to set out across the gap themselves.

The space between citizens and government is often more of a market economy rather than a citizen driven supply chain.

There is a lot of the article that I agree with, but I'd support the points made with a parallel view and suggest that, as well as the data literate citizens Eyal describes, open data infomediaries need to be "citizen literate":

A citizen literate data infomediary isn't one that just knows how to use data – it's one that understands how citizens can effectively use data to be part of a decision-making process.

The BBC, Local democracy, hyperlocal and journalism.

I spent the afternoon in Birmingham at the BBC finding out more about their Local Democracy Reporters scheme.  It’s a project I’ve been keeping an eye on for a number of reasons.

The promise of 150 new jobs in journalism, especially ones that are exclusively aimed at covering local government, is clearly of interest to me as a journalism lecturer. It's more opportunities for students and journalists, for one thing. But the focus on civic reporting also begins to address an area that I think is under-resourced and under-valued (by producers and consumers alike). The scheme also includes plans for a content hub, called the News Bank, for material created by the reporters, open for anyone to apply to use. This would also include content from the BBC's fast-developing regional data journalism unit.

The combination of data, hyperlocal and civic content is too good for me to ignore.

What’s in it for hyperlocals?

One of the underpinning reasons for this scheme is to "share the load" of accountability journalism. The role of journalism in holding the powerful to account is one that many feel is being lost, especially at a local government level. People talk about a democratic deficit and news deserts; towns with no journalistic representation at all. Many see hyperlocals as an essential part of filling the gap, but it's notoriously hard to create a sustainable hyperlocal business model. So it is no surprise that hyperlocal and community media representatives have been following the development of the project with interest. When the BBC promise a pot of money to improve local democratic reporting, who better to benefit from the cash?

So how would the scheme work?

The fine detail of the plan is still being pulled together, but in principle the scheme would be something like this:

The BBC will create contracts for Local Democracy Reporters, but they won't manage the reporters. Rather than 150 separate contracts, they have packaged them up into 'bundles' containing a number of reporters per geographic patch. Local news organisations can then bid to take on these contracts on behalf of the BBC. The organisation will be responsible for the reporter both editorially and from a straight HR point of view (sick leave, appraisals etc.). The BBC have a number of criteria and requirements for anyone wanting to bid. This includes a proven track record in producing good quality content and the capacity to properly employ and manage a member of staff.

The content created by the reporters, as well as any prospects, will be made available on a shared News Bank. So as well as the 'host' organisation, other media organisations can use the content created. There would be no exclusives for host organisations; when the content drops, it drops for everyone with access to the content hub. So you don't need to employ a local democracy reporter to get access to the content on the News Bank, but you would need to apply to the BBC for access. As long as you fulfil their criteria – adherence to basic editorial standards and a track record in producing good quality content – you're in!

There is a good deal of simplification here on my part. There is a tonne more detail in the plans that were presented today but we were asked not to share too much. Which is fine by me.

But at the event today, I made a few broad notes on some issues and observations.

 

  • Defining 'bundles' – A number of the hyperlocal operators in the room noted that the bundles suggested by the BBC sometimes didn't make sense when you knew the local geography and political landscape. Others noted that they seemed to mirror the regional media orgs' patches. The BBC noted that the geography of the scheme was, in some part, driven by the location of BBC local offices, who would have a role in overseeing the project. That said, the BBC were very open to feedback on the best way to divide up the patches. A positive role for hyperlocals, and it shows the value that the focus on a patch can bring.
  • Scale and partnerships – Many of the hyperlocals in the room felt that the decision to package up reporters by patch, and the criteria set for qualifying organisations, effectively shut them out of the process. They might be able to manage one reporter but not three or four across a large patch. One solution offered was working in partnership with larger, regional media organisations to deliver contracts in an area, e.g. an established media player such as Trinity Mirror or Johnston Press could take on the contract and then work in partnership with a hyperlocal to deliver the content, whilst the larger org takes on the HR and management issues. I think the devil is in the detail, but it strikes me as a good compromise. It's fair to say, though, that the idea wasn't warmly received by many of the hyperlocals in the room. I think the best way to describe the reason is 'because trust issues'. Interestingly, the idea of collaboration between hyperlocals to create networks to bid got very little comment or, it seems, interest.
  • Value to the tax payer – The BBC are clearly caught between a rock and a hard place with initiatives like this. They have money that they want to use to 'share the load' but, at the same time, would be under huge amounts of scrutiny for what is produced and who they work with. Accountability is something they take very seriously and the BBC are masters at getting themselves in knots trying to be fair and balanced to everyone. Often they just can't win. The scheme as presented today highlighted some of those tensions. By 'outsourcing' the management of the journalists they deal with the issue of the BBC barging into a sector and skewing the market. But at the same time, the need for accountability means the scheme is run through with the 'checks and balances' the Beeb would apply to ensure licence fee payers were getting value for money. It's not quite as hands-off as it could be. It also seems that the 'value for money' test stretches to ensuring that the material collected by the reporters is also useful to the BBC and their reporting. Not quite having your cake and eating it, so much as confusing who you are baking the cake for.

But in the midst of the accountability knots and the predictable cynicism and animosity that underpins the relationship between some hyperlocals and the regional media, I think something really important slipped by that's worth keeping an eye on.

The BBC seal of approval

To get access to the News Bank, organisations will need to submit an application to the BBC. General noises around the criteria suggest these will include caveats on quality content and a track record in producing news content. Orgs will also need to show a commitment to the same editorial guidelines for balance and impartiality as the BBC. But details of the assessment process were sketchy.

But let's look at that another way. In short, the BBC will become a local media accreditation body.

I don't know how I feel about that. To be clear, I certainly don't perceive any suspicious motives. But it still makes me uneasy.

I guess you could read it in the same way as hyperlocals being recognised as publishers by Google so they could feature in Google News. Perhaps, as long as the process was transparent, it's not a bad thing that some standards are defined. But then I think the sector doesn't really have a problem in that area.

I don’t know.  But of all the issues this scheme raises, it feels like the one most likely to generate unintended consequences.

All of that said, it's worth watching and supporting. Looking beyond the implementation, which is never going to tick all the boxes, I do think the scheme, when it rolls out, will mark one of, if not the biggest, investments in civic journalism in the UK that isn't technology driven. I might go as far as to say it's the only journalism-first investment in civic innovation that I've seen in the UK.

It may not work across the board but you’ve got to admire the idea.


Making Instagram video with Powerpoint

Audio slideshows are something I’ve included in my practical teaching for a little while. The combination of images and well recorded audio is, for me, a compelling form of content and it can be an easy video win for non-broadcast shops.

When I work with the students and journalists exploring the concept, I try and look for free or cheap solutions to the production process. In the past I’ve used everything from Windows Movie Maker to Youtube’s simple editor app to put packages together. But this year when I was putting the workshops together, I wanted to focus on social platforms and go native video on Instagram.

Video on Instagram

It’s not the first time I’ve looked at Instagram video. A few years ago, having seen a presentation about the BBC’s Instafax project (in 2014!), I had a look at cheap and free tools to use to create video for Instagram. But things have moved on — like the BBC’s use of Instagram.

So I started to look at how I might use the combination of accessible tools, with a view to doing an update on that post. I found myself thinking about Powerpoint.

Why Powerpoint!

When I talk to students about video graphics, I often point them to presentation apps like Google Slides and Powerpoint as simple ways to create graphic files for their video packages. They have loads of fonts, shapes and editing tools in a format students are familiar with (more of them have made a Powerpoint presentation than have worked with a video titling tool!). The standard widescreen templates are pretty much solid for most video editing packages, and you can export single slides as images. So I took a quick look at Powerpoint to remind myself of the editing tools. Whilst I was playing around with the export tools, I discovered that it had an export to video option. So I opened up Powerpoint to see how far I could go, and about an hour of playing around later I had the video below.

I worked through the process on a Windows version of Powerpoint, but the basic steps are pretty much the same for a Mac. If you're on a Mac then Keynote is also a good alternative, which will do all of the stuff you can do with Powerpoint with the added bonus that it will also handle video.

Here's what I did. (You can download the Powerpoint file and have a look; I'm making that available as CCZero.)

You can see a video walk-through of parts of the process or scroll down for more details.

The process

  • Open Powerpoint and start with a basic template
  • Click the Design Tab and then select Slide Size > Custom Slide Size(Page Setup on Mac)
  • Set the width and height to an equal size to give us the Square aspect ratio of Instagram. Click OK. Don’t worry about the scaling warning

You can set a custom slide size for Powerpoint which means we can create custom slides that fit with Instagram and other platforms.
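As an aside, if you ever need to churn out a lot of these square templates, the setup can also be scripted. This is only a rough sketch using the python-pptx library (not part of my original process, and the file name is made up); it builds a square presentation with a single text box, which you could then open in Powerpoint to animate and export as normal.

from pptx import Presentation
from pptx.util import Inches, Pt

# Create a blank presentation and make the canvas square (Instagram-style)
prs = Presentation()
prs.slide_width = Inches(7.5)
prs.slide_height = Inches(7.5)

# Add a blank slide and drop a text box on it
slide = prs.slides.add_slide(prs.slide_layouts[6])  # layout 6 is the blank layout
box = slide.shapes.add_textbox(Inches(0.5), Inches(3), Inches(6.5), Inches(1.5))
box.text_frame.text = "Hello Instagram"
box.text_frame.paragraphs[0].font.size = Pt(40)

prs.save("square_template.pptx")  # hypothetical output file name

The animations and the export to video still have to be done in Powerpoint itself; the script just saves the clicking around in the Design tab.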

You can now play around with the editing tools to place text, images and other elements on each slide.

Animating elements

The tools to add shapes and text are pretty straightforward, but one effect that seems popular is 'typewriter'-style text, where the words animate onscreen. Luckily that's built into Powerpoint.

  1. Add a text box and enter the text. Make sure you have the text box selected, not the text
  2. Go to the Animations tab, select the text box and click on Appear.
  3. Open the Animation Pane from the toolbar
  4. In the Animation Pane, right-click on the text box animation (it will be named with any text you've added) and select Effect Options
  5. In the Animate text option select By word. You can speed the text up using the delay setting (Note: you can't do this with the Mac version).

The typewriter effect is a common one on many social videos, and one which Powerpoint makes short work of.

For the rest, it's worth experimenting with basic transitions and animations before you try anything too complex. Once you start to get separate elements moving around, you'll need to think about text as separate elements — you'll end up with 'layers' of text; but that's no different from a video editor.

Adding Audio

You can add audio to individual slides or to play as an audio 'bed' across all the slides.

A common feature of Audio Slideshows on Instagram (and other social platforms) is that the text drives the story; the audio is often music or location sound that adds a feel for the story. In this example I used sound that I recorded on the scene but you could use any audio e.g. a music track.

You can also adjust the timing of slides to match the audio or just to give you control over the way slides transition and display.

Transitions and timings give you control over how long content appears and how it appears.

Exporting your video

Once you're happy with your presentation you can create a video version:

  • Click the File tab
  • Select Export > Create a Video

You have a few choices here. The quality setting allows you to scale the video. Presentation quality exports at 1080×1080; Internet quality 720×720 and Low Quality at 480×480. I went for Internet Quality as it kept the file size down without compromising the quality too much.

You can also set the video to use the timings you set up in each slide or to automatically assign a set time to each slide. Which one you pick will depend on the type of video you want to make.

Exporting to video is one of the default options in powerpoint. PC and Mac will save to MP4

Getting video on Instagram

Instagram has no browser interface for uploading, so once the video is exported you'll need to transfer the final file to your mobile device. I didn't struggle emailing files around, but you might want to look at alternatives like WeTransfer or Google Drive as a way of moving files from desktop to mobile device.

Beyond Instagram

It's worth noting, even belatedly, that your video doesn't have to be square. Instagram is happy with standard video resolutions; you could use a standard 16×9 template and Instagram will be fine. I just wanted to be a bit more 'native video'. But there is nothing stopping you setting up templates for Twitter video (W10cm × H5.6cm — landscape video) or Snapchat (W8.4cm × H15cm — portrait video).

Conclusions

There are limitations to using Powerpoint:

  • You need Powerpoint — It's an obvious one, but I recognise that not everyone has access to Office. That said, it can also be the only thing people do have! It's a trade-off.
  • It's not happy with video — If I embed a video into the presentation, Powerpoint won't export it as part of the video. According to the help file there are codec issues. I haven't experimented with Windows native video formats, which may help, but it seems like a bit of a mess. It's a shame. It will take an MP4 from an iPhone and play it well, and it will spit out an MP4, but it won't mix the two! Those of you on a Mac, this is the point to move to Keynote, which is quite happy to include video.
  • Effects can get complicated — Once you get beyond a few layers of text, the process of animation can be tricky. In reality it's no more or less tricky than layering titles in Premiere Pro. The Animation Pane also makes this a little easier by giving you a timeline of sorts.
  • Audio can be a faff — The trick with anything other than background sound is timing. Knowing how long each slide needs to be to track with the audio can add another layer of planning that the timeline interface of an editing package makes more intuitive.
  • It’s all about timing — without a timeline, making sure your video runs to length is a pain. With the limitations of some platforms that could mean some trial and error to get the correct runtime.

But problems aside, once you've set up a presentation that works, I could see it easily being used as a template on which to build others. The slideshows are also pretty transferable, as the media is packaged up in the PPT file.

It’s not an ‘ideal’ solution but it was fun seeing just where you could take the package as an alternative platform for social video.

Don’t forget, you can download the PPT file I used and have a dig around (CCZero). Let me know if you find it useful.

Mapping Drone near misses in Google Earth*

My colleague Andrew Heaton from the Civic Drone Centre set me off on a little adventure with mapping tools when he showed me a spreadsheet of airprox reports involving drones.

In my head an airprox report describes what is often called a ‘near miss’ but more accurately, the UK Airprox board describe it as this…

An Airprox is a situation in which, in the opinion of a pilot or air traffic services personnel, the distance between aircraft as well as their relative positions and speed have been such that the safety of the aircraft involved may have been compromised.

The board produce very detailed reports (all in PDF!) on all events reported to them, not just drones, and they pack that all up in a very detailed spreadsheet each year. You can also get a sheet that has all reports from 200–2016! (h/t Owen Boswarva). If you look at those sheets and you just want drone reports look for ‘UAV’. There is also a very detailed interactive map of UK Airprox locations you can look through.

But given I'm on a bit of a spreadsheet/maps thing at the moment, I thought it would be fun to see if I could get the data from the spreadsheet into Google Earth. Why? Well, why not. But I did think it would be cool to be able to fly through the flight data!

Getting started.

The Airprox spreadsheet

At first glance the data from the Airprox board looks good. The first thing to do is tidy it up a bit. The bottom twenty or so rows are reports that have yet to go to the 'board', so the details on location are missing; I've just deleted them. Each log also has latitude and longitude data, which means mapping should be easy with things like Google Maps. But a look over it shows the default lat and long units are not in the format I'd expected.

This sheet uses a kind of shorthand for the co-ordinates: degrees and minutes north of the equator — the N you can see in the Latitude — and degrees and minutes west or east of the Greenwich meridian line — the W and the E you can see in the Longitude. To get it to work with stuff like Google Maps and other off-the-shelf tools, it would be more useful to have it in decimal co-ordinates, e.g. 51.323 and -2.134.

Converting the lat and long

This turned out to be not that straightforward. Although there are plenty of resources around to convert co-ordinate systems, the particular notation used here tripped me up a little. A bit of digging around, including a very helpful spreadsheet and guide from the Ordnance Survey, and some trial and error sorted me out with a formula I could use in a spreadsheet.

Decimal coordinates = (((secs/60)+mins)/60)+degrees

If the longitude is W then multiply by -1, e.g. ((((secs/60)+mins)/60)+degrees)*-1. So to convert 5113N 00200W to decimal:

Latitude =((((00/60)+13)/60)+51) = 51.21666667
Longitude =((((00/60)+00)/60)+2)*-1 = -2
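If you'd rather script the conversion than wrangle spreadsheet formulas, here's a rough Python sketch of the same arithmetic. It assumes the shorthand is always degrees followed by two digits of minutes (no seconds), as in the example above, so treat it as a starting point rather than a finished converter.

import re

def shorthand_to_decimal(token):
    """Convert an Airprox-style token like '5113N' or '00200W' to decimal degrees.
    Assumes the digits are whole degrees followed by two digits of minutes."""
    match = re.match(r"^(\d+)([NSEW])$", token.strip())
    digits, hemisphere = match.groups()
    degrees = int(digits[:-2])
    minutes = int(digits[-2:])
    value = degrees + minutes / 60
    # South and West become negative in decimal co-ordinates
    return -value if hemisphere in ("S", "W") else value

print(shorthand_to_decimal("5113N"))   # 51.2166...
print(shorthand_to_decimal("00200W"))  # -2.0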

Running that formula through the spreadsheet gave me a set of co-ordinates in decimal form. To test it I ran them through Google Maps.

Getting off the ground.

Google Maps is great but it's a bit flat. Literally. The Airprox data also contains altitude information, and that seems like an important part of the data to reflect in any visualisation of things that fly! That's why Google Earth sprang to mind.

To get data to display in Google Earth you need to create KML files. At their most basic these are pretty simple. You can add a point to a map with a simple text editor and a basic few lines like the one below. Just save it with a KML extension e.g. map.kml

<?xml version="1.0" encoding="UTF-8"?> 
<kml xmlns="http://earth.google.com/kml/2.0"> 
<Document>
<Placemark> 
 <name>Here is the treasure</name> 
 <Point>
  <coordinates>
    -0.1246, 51.5007
  </coordinates>
 </Point>
</Placemark>
</Document> 
</kml>

KML files usually open in Google Earth by default, and when the file opens it should settle on something a bit like the shot below.

Google Earth jumps to the point defined in the KML file.

Adding some altitude to the point is pretty straightforward. The height, measured in metres, is added as a third co-ordinate. You also need to set the altitudeMode, which "specifies a distance above the ground level, sea level, or sea floor" for the point.

<?xml version="1.0" encoding="UTF-8"?> 
<kml xmlns="http://earth.google.com/kml/2.0"> 
<Document>
<Placemark> 
 <name>Here is the treasure</name> 
 <Point>
  <coordinates>
    -0.1246, 51.5007, 96 
  </coordinates>
   <altitudeMode>relativeToGround</altitudeMode>
 </Point>
</Placemark>
</Document> 
</kml>

The result looks something like this.

Setting the altitudeMode and setting an altitude co-ordinate gives your point a lift.

But hold your horses! There’s a problem.

The Altitude column in the Airprox sheet is not in metres. It's in feet.

When it comes to distances, aviation guidance mixes its units. Take this advice from the Civil Aviation Authority's DroneCode as an example:

Make sure you can see your drone at all times and don’t fly higher than 400 feet

Always keep your drone away from aircraft, helicopters, airports and airfields

Use your common sense and fly safely; you could be prosecuted if you don’t.

Drones fitted with cameras must not be flown:

within 50 metres of people, vehicles, buildings or structures, over congested areas or large gatherings such as concerts and sports events

On the ground it's metres, but height is in feet! So the altitude data in our sheet will need converting. Luckily Google Sheets comes to the rescue with a simple formula:

=CONVERT(A1,"ft","m")

A1 = altitude in feet

Once we've sorted that out, we can look at creating a more complete KML file from a spreadsheet with more rows.

Creating a KML file from the spreadsheet

The process of creating a KML file from the Airprox data was threatening to become a mammoth session of cut-and-paste, typing in co-ordinates into a text editor. So anything that can automate the process would be great.

As a quick fix I got the spreadsheet to write the important bits of code using the =concatenate formula.

=CONCATENATE("<Placemark> <name>",A1,"</name><Point> <coordinates>",B1,",",C1,",",D1,"</coordinates> <altitudeMode>relativeToGround</altitudeMode> </Point> </Placemark>")

Where 
A1 = the text you want to appear as the marker
B1 = the longitude
C1 = the latitude
D1 = the altitude

The spreadsheet can do most of the coding for you using the =concatenate formula to build up the string (click the image to see the spreadsheet)

To finish the KML file, select all the cells with the KML code in them and paste them into a text file, between the standard text that makes up a KML header and footer.

<?xml version="1.0" encoding="UTF-8"?> 
<kml xmlns="http://earth.google.com/kml/2.0"> 
<Document>

paste the code from the cells here.

</Document> 
</kml>

Your file will look something like the code below. There’ll be a lot more of it and don’t worry about the formatting.

<?xml version="1.0" encoding="UTF-8"?> 
<kml xmlns="http://earth.google.com/kml/2.0"> 
<Document>
<Placemark> <name>Drone</name><Point> <coordinates>-2,51.2166667,91.44</coordinates> <altitudeMode>relativeToGround</altitudeMode> </Point> </Placemark><Placemark> <name>Drone</name><Point> <coordinates>-2.0166667,51.2333333,91.44</coordinates> <altitudeMode>relativeToGround</altitudeMode> </Point> </Placemark><Placemark> <name>Unknown</name><Point> <coordinates>-2.6833333,51.55,2133.6</coordinates> <altitudeMode>relativeToGround</altitudeMode> </Point> </Placemark><Placemark> <name>Model Aircraft</name><Point> <coordinates>0.25,52.2,259.08</coordinates> <altitudeMode>relativeToGround</altitudeMode> </Point> </Placemark>
</Document> 
</kml>

The result of the file above looks something like this.

With a simple file you can add lots of points with quite a bit of detail.
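If the cut-and-paste route gets unwieldy, the same file can be generated with a short script. Here's a hedged Python sketch that reads a CSV export of the cleaned sheet and writes the placemarks for you; the column names ('Aircraft', 'Latitude', 'Longitude', 'Altitude_ft') and file names are made up for the example, so adjust them to match your own sheet.

import csv

HEADER = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.0">
<Document>
"""
FOOTER = "</Document>\n</kml>\n"

PLACEMARK = ("<Placemark><name>{name}</name><Point>"
             "<coordinates>{lon},{lat},{alt:.1f}</coordinates>"
             "<altitudeMode>relativeToGround</altitudeMode>"
             "</Point></Placemark>\n")

with open("airprox.csv", newline="") as src, open("airprox.kml", "w") as out:
    out.write(HEADER)
    for row in csv.DictReader(src):
        out.write(PLACEMARK.format(
            name=row["Aircraft"],
            lon=row["Longitude"],                    # decimal degrees
            lat=row["Latitude"],
            alt=float(row["Altitude_ft"]) * 0.3048,  # feet to metres
        ))
    out.write(FOOTER)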

Is it floating?

When we zoom in to a point it can be hard to tell if the marker is off the ground or not especially if we have no reference point like Big Ben! Luckily you can set the KML file to draw a line between the ground and the point to make it clearer. You need to set the <extrude> option by adding it to the point data:

<Placemark> <name>Unknown</name><Point> <coordinates>-2.6833333,51.55,2133.6</coordinates> <altitudeMode>relativeToGround</altitudeMode> <extrude>1</extrude></Point> </Placemark>

The result looks a little like this:

Wrapping up, some conclusions (and an admission)

There is more that we can do here to get our KML file really working for us: getting more data onto the map, maybe a different icon. But for now we have pretty solid mapping of the points and a good framework from which to explore how we can tweak the file (and maybe the spreadsheet formula) to get more complex mapping.

Working it out raised some immediate points to ponder:

  • It was an interesting exercise but it started to push the limits of a spreadsheet. Ideally the conversion to KML (and some of the data work) would be better done with a script. But I’m trying to be a bit strict and keep any examples I try as simple as possible for people to have a go.
  • The data from the Airprox board is, erm, problematic. The data is good but it needs a clean, and some standard units wouldn't go amiss. It could also do with some clear licensing or terms of use on the site. I could be breaking all kinds of rules just writing this up.
  • The data doesn't tell a story yet. There needs to be more data added and it needs to be seen in context, i.e. the relationship to flight paths and other information.

And now the admission. I found a pretty immediate solution to this exercise in the shape of a website called Earth Point. It has a load of tools that make this whole process easier, including an option to batch convert the odd lat/long notation. It also has a tool that will convert a spreadsheet into a KML file (with loads of options). The snag is that it costs: you need a subscription to do batches of stuff. However, Bill Clark at Earth Point does offer free accounts for education and humanitarian use, which is very nice of him.

So I used the Earth Point tools to do a little more tweaking, with some pleasing (to me) results.

You can download the KML file and have a look yourself. Let me know what you think and if you have a go.

Thanks to Andrew Heaton for advice and helpful navigation round the quirks of all things drones and aviation. If you have any interest in that area I can really recommend him and the work the CDC do.

*Yes, I’m pretty sure ‘near misses’ isn’t the right word but forgive me a little link bait.

Don’t postmortem journalism. It’s not dead. Fix it.

In the aftermath of the Trump win many in the media are looking inward to understand what went wrong. But is it too soon to write off journalism as a failed project?

In the very short time we’ve had to get used to the idea that Donald Trump will be the 45th President of The United States, the hand wringing for journalism has already started.

‘We did this’.

‘We didn’t see this coming’.

‘We trusted the data and not the people’.

‘We’ve lost touch with proper journalism’

There is no doubt that what we know as the modern media is breaking apart. The strands of a profession that hold it together, that define it, are impossibly stretched by digital fragmentation and an economy that now sells choice over balance. More than any recent events, even post-Brexit here in the UK, the industry seems shaken to its core by its lack of foresight. The simmering existential crisis that dogs journalism now risks becoming full blown, crippling self-doubt for those that find their powerful journalistic tools and practice are ineffectual.

The knives are not just out for a postmortem. Many in journalism are taking the opportunity to cut down some tall poppies. Data journalism is already the main target for the traditional journalists championing a return to 'proper journalism' — with all the self-righteous confidence of a Trump supporter mandated by the win to call foul on liberal thinking.

But now is not the time to ‘fix journalism’.

Journalism — this election was not about you. In the next few weeks, we’ll need you to explain what’s happening. God knows what the repercussions of today will be. No one has a clue. That’s your job.

Don't fill the airwaves with conversations about the role of the media. Don't cram the pages of your papers with handwringing. This wasn't a surprise. This was the outcome we couldn't sell to ourselves.

We know what the lessons are.

Time to learn them by doing.

Why social media isn’t blogging.

I'm teaching first year journalism students at the moment and talking to them about a professional online presence. A phrase that I've been using a lot is 'blogging'. The idea of a 'blog' and its value to an aspiring journalist is one I'm really comfortable with, but I checked myself and wondered just what it might mean to the students.

As part of that, I had a look at Google Trends to see how the term 'blog' was faring. As I noted on Twitter:

If you read all of this post, the irony that I put this on Twitter before I wrote this post — before I blogged it — will not be lost. As many pointed out in the conversation around the tweet, by putting it on Twitter I was blogging. Maybe it's the terminology that's changed.

But for me there is something more about the idea of blogging; something more about what that term means.

There is a very mechanical element to the idea of a blog. At its heart is a mechanism by which anyone (with little more than the time to google their way through the set-up process) can set up a dedicated publishing platform for their content and share it with people — the press tools Jay Rosen talked about. In this context, it's easy to see how the idea of blogs can be subsumed into contemporary platforms and practice. Twitter and other social media platforms do the same thing. Don't they?

Blogger has also become a proper noun (beyond the Google platform*). It's a job title. It must be a proper job because we now differentiate between types of blogger — celebrity bloggers, fashion bloggers (it's a kind of differential journalism). And to be frank, the amount of money many of them earn certainly qualifies it as 'a living'.

But, and I realise this is where I make this quite parochial and personal, in the journalism sphere, blogging has always meant more to me than simply the process.

Blogging as critical practice.

As digital disrupts, those in the industry who innovate, explore or just honestly talk about the challenges of the day-to-day are pushed apart. Connections are lost. So the value of social media in holding together and sustaining communities of practice is immeasurable. But social media is prone to echo chambers, and it's hard for new voices to break in and disrupt the same old conversations. More fundamentally, social media has no collective memory. The mistakes, learning and context are lost in the stream of news. The echo chamber reverberates to a constant churn of the same questions popping up again and again.

Blogging, for me, was a way of setting that down — the collective wisdom of a community. A way for the community to archive its learning and insights. But more than that it was a way for us to share the working out not just the result — It was and continues to be a way for me to test my thoughts.

It also has been one of the key activities that has driven me to get enough profile that you’re reading this at all. It’s allowed me to build a presence alongside the chatter of social media. Something that underpins my transitory interactions with something more substantial (but maybe no less sensible!). An opportunity that is still there for aspiring journalists to grasp and exploit.

There isn’t the time, space or traction for that level of depth or reflection on social media.

So, as much as blogging may be becoming a bit of a legacy term, I still hold to my thought that “a blog is about the space to say why you think something in a world of people saying what they think in 140 chars or less.”

For me blogging was and still is a critical and thoughtful process.

*Just having to clarify that says something about the collective memory of social media.

Mapping street level crime in an area

A little while ago I was playing around with the API at data.police.uk, looking at a way to pull the data into a Google spreadsheet (and at some of the issues around the way policing areas are constructed).

Yesterday I found myself playing with the API again and looking at quick and easy ways to pull data out based on a particular area.

Before I go any further I’d recommend that if you’re going to do anything with crime data from data.police.uk, you read the About pages for more information on what the data means and where the limitations are. 

Back to the project…

I know that the data.police.uk API can deliver street level crime reports based on a number of criteria including multiple latitude and longitude points that describe a shape.

https://data.police.uk/api/crimes-street/all-crime?poly=52.268,0.543:52.794,0.238:52.130,0.478&date=2013-01

I wondered how easy it would be to get the points of a custom polygon, like the one below, so I could get more specific data.

So I created a basic polygon using Google MyMaps and set about seeing if I could get the data out.

Making the shape

The easiest way to get at the data used to describe the polygons is by exporting the map as a KML file. In Google My Maps:

  1. In the left panel, click Menu (it looks like three dots on top of each other)
  2. Select Export as KML.
  3. You can choose the layer you want to export, or click Entire map. I just picked the layer with the Polygon on.
  4. Click Export.

Sorting out the lat and long points

The file that is exported is a text file, so we can open it up in any text editor and it will look something like this (I've just included the first part). It's those co-ordinates that I want to get at.

<?xml version='1.0' encoding='UTF-8'?>
<kml xmlns='http://www.opengis.net/kml/2.2'>
 <Document>
  <name>Crime Layer</name>
  <Placemark>
   <name>Crime area</name>
   <styleUrl>#poly-000000-1-77-nodesc</styleUrl>
   <Polygon>
    <outerBoundaryIs>
     <LinearRing>
      <tessellate>1</tessellate>
      <coordinates>-2.7231503,53.7637821,0.0 -2.7239227,53.763021,0.0 -2.720747,53.7586067,0.0 -2.7239227,53.7518067,0.0 -2.7229786,53.7493706,0.0 -2.7213478,53.7495229,0.0 -2.7176571,53.7501319,0.0 -2.715168,53.7485078,0.0 -2.7113915,53.7475942,0.0 -2.7094174,53.7476957,0.0 -2.7033234,53.7507917,0.0 -2.6967144,53.7516544,0.0 -2.6905346,53.7486093,0.0 -2.6857281,53.7488631,0.0 -2.6790333,53.7531769,0.0 -2.6811791,53.7566277,0.0 -2.6800633,53.7606363,0.0 -2.6809216,53.7612959,0.0 -2.6774883,53.7620063,0.0 -2.6780892,53.7630717,0.0 -2.6846123,53.7693626,0.0 -2.6918221,53.7693626,0.0 -2.7057266,53.7690583,0.0 -2.7167988,53.7671305,0.0 -2.7231503,53.7637821,0.0</coordinates>
     </LinearRing>
    </outerBoundaryIs>
   </Polygon>
  </Placemark>...

Sadly the co-ordinates are in the wrong format for data.police.uk:

  1. The lat and long are reversed
  2. The data.police.uk API wants each pair (lat and long that describes a point) separated by a colon (:)

So we are going to need to clean the data up a bit. You could take the data points and use various filters, formulas and other things (regex etc.). There are plenty of ways to do this, but to be honest, with such a small set of points I did it by hand.

The biggest issue is getting each pair on a new line. If you can do that then they should cut and paste into a spreadsheet and you can use the SPLIT command in Google Sheets to break the data down. Once you’ve got the Lat and long in adjacent columns then the CONCATENATE formula will help rebuild things in the right format and then the JOIN formula will shunt them back into one line.

The SPLIT formula can be used to separate lat and long using the comma as the delimiter (the thing you split on). Adding TRUE means it will split on consecutive commas.
The CONCATENATE formula can be used to join the lat and long back together again in the right order, separated by a comma.
Finally, the JOIN formula helps shunt them all together onto one line, separated by the colon that data.police.uk wants for the API call.
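If you have more points than you fancy cleaning by hand, the reshuffle can also be scripted. Here's a rough Python sketch of the same idea: it takes the contents of the KML <coordinates> element (trimmed to three points here for space — paste in your own) and turns them into the lat,lon:lat,lon string the API wants.

# Paste the contents of the <coordinates> element from the exported KML here
coords = "-2.7231503,53.7637821,0.0 -2.7239227,53.763021,0.0 -2.720747,53.7586067,0.0"

pairs = []
for point in coords.split():
    lon, lat, _alt = point.split(",")   # KML order is lon,lat,alt
    pairs.append(lat + "," + lon)       # the API wants lat,lon

poly = ":".join(pairs)                  # pairs separated by colons, no trailing colon
print("https://data.police.uk/api/crimes-street/all-crime?poly=" + poly)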

Some final cutting and pasting and I ended up with this URL to call the API

https://data.police.uk/api/crimes-street/all-crime?poly=53.7637821,-2.7226353:53.763021,-2.7234077:53.7586067,-2.720232:53.7518067,-2.7234077:53.7493706,-2.7224636:53.7495229,-2.7208328:53.7501319,-2.7171421:53.7485078,-2.714653:53.7475942,-2.7108765:53.7476957,-2.7089024:53.7507917,-2.7028084:53.7516544,-2.6961994:53.7486093,-2.6900196:53.7488631,-2.6852131:53.7531769,-2.6785183:53.7566277,-2.6806641:53.7606363,-2.6795483:53.7612959,-2.6804066:53.7620063,-2.6769733:53.7630717,-2.6775742:53.7693626,-2.6840973:53.7693626,-2.6913071:53.7690583,-2.7052116:53.7671305,-2.7162838:53.7637821,-2.7226353

Notice that there is no trailing : and I've left the date option off. That will give me any street level crime reports in the area defined, for the last month they have. Plug that URL into a new browser tab and you get a page full of JSON data:

[{"category":"anti-social-behaviour","location_type":"Force","location":{"latitude":"53.764959","street":{"id":863936,"name":"On or near Carrol Street"},"longitude":"-2.690727"},"context":"","outcome_status":null,"persistent_id":"725ed090a9eda01c7b53e2e474005e78077bb6e9521a600d90b8a10383fbd05e","id":50943777,"location_subtype":"","month":"2016-08"},{"category":"anti-social-behaviour","location_type":"Force","location":{"latitude":"53.762666","street":{"id":862106,"name":"On or near Driscoll Street"},"longitude":"-2.690796"},"context":"","outcome_status":null,"persistent_id":"463cc6c50d3d8464a4f05d1e9f9d9e18d2138d0ba4b3d843daba7419660ddbaf","id":50939501,"location_subtype":"","month":"2016-08"},

Pulling the data into a spreadsheet

There are lots of applications and scripts that can read the JSON output from the Police API. But I wanted to go with something that required minimal coding and could output something pretty easily so I pulled the data into a google spreadsheet using the importJSON script. Making the script work is dead easy thanks to Paul Gambill’s guide to How to import JSON data into Google Spreadsheets in less than 5 minutes.

Using the importJSON script we can use the data.police.uk API call to populate a spreadsheet (you should be able to click the image and go through to the spreadsheet).
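If you'd rather skip the spreadsheet altogether, the same call can be made with a few lines of Python. This is just a sketch using the requests library; it pulls out the handful of fields visible in the JSON above and writes them to a CSV (the trimmed poly string and the file name are placeholders, not the full polygon I used).

import csv
import requests

poly = "53.7637821,-2.7226353:53.763021,-2.7234077:53.7586067,-2.720232"  # trimmed example
response = requests.get("https://data.police.uk/api/crimes-street/all-crime",
                        params={"poly": poly})
crimes = response.json()

with open("crimes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["category", "street", "latitude", "longitude", "month"])
    for crime in crimes:
        writer.writerow([
            crime["category"],
            crime["location"]["street"]["name"],
            crime["location"]["latitude"],
            crime["location"]["longitude"],
            crime["month"],
        ])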

Visualizing the data

Now that we have the data as a spreadsheet we could start to do some analysis, filtering etc. But we can get a quick win by using the spreadsheet to drive a map.

I went back to the map I used to create the polygon shape, added a new layer and then imported my crime layer spreadsheet into the map. A bit of crunching later and each crime was mapped as a point.

Conclusions

The API isn't perfect — the data isn't as fresh as I would like and the geolocation isn't always accurate (they do say this, to be fair). Google Maps also has its quirks, especially when you're dealing with lots of data points. But being able to export to KML is a nice feature, not only for pulling out polygon data. If you have Google Earth on your computer you can open the KML file and fly around the crimes in your area!

Exporting your Google Map as KML data means you can pull the data into Google Earth and fly around the crime locations.
It's clunky, and no doubt there are more elegant solutions out there (please tell me if you know of them), but, a bit of messing with the format of the data aside, it worked how I thought it would: a 'well, I can do this, so if I can do that it should work' way of piecing together the tools. As a quick and dirty visualisation tool (and an exploration of what APIs can do), I think it works well.

Let me know if you try it!

Note: The data from data.police.uk is made available under the Open Government Licence. That means you’re free to do pretty much anything with it but you must link back to the source where you can. 

Afterwards…