Hacks/Hackers London: How big data is changing financial journalism

09:49 PM on Wednesday, January 16

Interesting and exciting talk from Chris Taggart to finish off a fantastic evening of discussion. Thanks to everyone at #hhldn and Bloomberg.

Goodnight everyone - we're going to take advantage of the rest of the free nibbles/wine.

09:47 PM on Wednesday, January 16

Because OpenCorporates are based in the UK, they automatically acquire database rights. They make those rights available under a share-alike licence: people who use the data are encouraged to share their own. If users don't want to share, OpenCorporates may cut a deal with them instead, always encouraging transparency.

09:43 PM on Wednesday, January 16

Scrapers are set up to break rather than pull in bad data. There are some quirks with company registers in certain countries - the South African one duplicates data, and OpenCorporates have been working with them to find out why that is happening.
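For anyone wondering what "break rather than pull in bad data" might look like in practice, here is a minimal sketch in Python. It is not OpenCorporates' actual code: the field names, the `ScrapeError` exception and the `load_batch` helper are all assumptions made up for illustration. The idea is only that a malformed or duplicated record stops the whole run instead of being quietly stored.

```python
from __future__ import annotations

from dataclasses import dataclass


class ScrapeError(Exception):
    """Raised when a scraped record looks wrong, so the run stops."""


@dataclass(frozen=True)
class CompanyRecord:
    # Hypothetical fields; a real register will expose many more.
    jurisdiction: str
    company_number: str
    name: str


def parse_record(raw: dict) -> CompanyRecord:
    """Build a record, refusing anything with a missing or empty field."""
    for field in ("jurisdiction", "company_number", "name"):
        value = raw.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ScrapeError(f"missing or empty field: {field!r}")
    return CompanyRecord(
        jurisdiction=raw["jurisdiction"].strip(),
        company_number=raw["company_number"].strip(),
        name=raw["name"].strip(),
    )


def load_batch(raw_records: list[dict]) -> list[CompanyRecord]:
    """Parse a whole batch, failing on the first bad or duplicated record."""
    seen = set()
    batch = []
    for raw in raw_records:
        record = parse_record(raw)
        key = (record.jurisdiction, record.company_number)
        if key in seen:
            # A register that repeats entries is treated as an error, not merged.
            raise ScrapeError(f"duplicate record: {key}")
        seen.add(key)
        batch.append(record)
    return batch
```

With this approach nothing from a batch is kept unless every record in it passes, which trades some freshness for cleaner data.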

09:42 PM on Wednesday, January 16

A question from an audience member who sounds more like he wants to cut a deal with OpenCorporates, followed by a Bloomberg staffer asking how information is vetted. Finally, one about how quickly data changes.

Answer on how quickly data is pulled in:

APIs and screen scrapes account for most of the data. Masses of information are being pulled in. The lag is a bit of a difficulty. The UK register is pulled every 18 hours - meaning that new companies are added within a day. Some registers take a lot longer. That is why OpenCorporates want to work with registers to get API access - but most want to sell what is essentially public data. Got it?

09:35 PM on Wednesday, January 16

What does Starbucks have to do with Olympic Casualty Insurance? It owns it.

Information stored on OpenCorporates will help users find this out.

09:32 PM on Wednesday, January 16

OpenCorporates to reveal "underlying plumbing" of the corporate world. Having seen the networking graph that was just shown - it's a very accurate metaphor.

09:28 PM on Wednesday, January 16

Lack of transparency may have contributed to financial crisis: "The only ones with access to data were the ones with the same incentives."

09:27 PM on Wednesday, January 16

Openness extends to working with partners such as ScraperWiki and the World Bank. "Open is really important."

09:26 PM on Wednesday, January 16

They are also indexing company directors - we're being shown this with entries for Mitt Romney.

09:23 PM on Wednesday, January 16

OpenCorporates also helps with "reconciliation" - cleaning up messy company names. To PLC or not to PLC?
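A rough idea of what cleaning up messy company names can involve, sketched in Python. This is an illustration only, not how OpenCorporates does reconciliation: the `SUFFIXES` table and the `normalise_name` function are hypothetical, and real reconciliation has to deal with far more variation than trailing legal suffixes.

```python
import re

# Hypothetical mapping from suffix variants to one canonical form.
SUFFIXES = {
    "public limited company": "PLC",
    "plc": "PLC",
    "limited": "LTD",
    "ltd": "LTD",
}


def normalise_name(name: str) -> str:
    """Lower-case a company name, drop punctuation, unify the legal suffix."""
    cleaned = re.sub(r"[^\w\s]", "", name).lower()
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Try the longest variants first so "public limited company" wins
    # over the shorter "limited" it does not end with.
    for variant in sorted(SUFFIXES, key=len, reverse=True):
        if cleaned == variant or cleaned.endswith(" " + variant):
            stem = cleaned[: len(cleaned) - len(variant)].strip()
            return f"{stem} {SUFFIXES[variant]}".strip()
    return cleaned
```

Under these assumptions, "Tesco P.L.C.", "Tesco plc" and "TESCO Public Limited Company" all come out as the same string, so they can be matched against each other.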

09:21 PM on Wednesday, January 16

A Danish journalist ran a search on OpenCorporates and found out which Danish companies had entities in tax havens. He called them up and asked - most were confused about where he got the data.

09:19 PM on Wednesday, January 16

A demonstration of OpenCorporates search features shows us every country where there is a company called Tesco.

09:18 PM on Wednesday, January 16

"Being able to do good widespread searches is incredibly useful."

09:16 PM on Wednesday, January 16

OpenCorporates wants an entry for every company (legal entity) in the world. They've gone from 3 million to 49 million companies in two years. It's all about open data.

09:15 PM on Wednesday, January 16

We're on with Chris Taggart:

"Chris Taggart is a former journalist and magazine publisher, who has been working full time in the field of open data for the past 3 years. As well as co-founding OpenCorporates, he was the founder of OpenlyLocal and OpenCharities. Chris will give a short introduction to OpenCorporates - the largest openly licensed database of companies in the world - will look at some newly introduced features, and give a sneak peek at some upcoming ones too." In other words, a legend.

09:14 PM on Wednesday, January 16

Right. Wine-d up! Back to George for the last talk.

08:59 PM on Wednesday, January 16

5 minute break now. The last speaker will be Chris Taggart, from OpenCorporates.

08:58 PM on Wednesday, January 16

James Ball (@jamesbruk) asking why there is a pressure to pile as many data-journalism skills into an individual as possible.

Emily: "I don't know. I think the moment you try to do anything sophisticated, you need experts."

Martin: "On a small scale, you need a jack-of-all-trades. On a larger scale, that doesn't work."

08:56 PM on Wednesday, January 16

Emily also answers...

"It also depends who the designer is. Some like to get involved incredibly early, and may come to us with suggestions. The more sophisticated a graphic, the more time we spend drawing things on each other's desks."

08:56 PM on Wednesday, January 16

Time for questions.

Q: At what stage do you bring the designers on board during a project and how do you get the most out of them?

A: For print, we have a set of newsroom tools that will give a clean feed of data to the designers. The designers then have exactly the data they need to create the visualisation. For web-based, interactive graphics, the designer may sometimes request data in a particular format. There's not really a hard-and-fast rule, but the designer shouldn't really have to worry about the data behind a presentation.