The Evolving Newsroom: Q&A with Aron Pilhofer
The Evolving Newsroom is a series of Q&As with important names in the data journalism field, discussing how the newsroom is evolving to better incorporate data and data-driven journalism. Next, I've talked with Aron Pilhofer, Editor of Interactive News at the New York Times.
Ændrew Rininsland: Could you describe a typical workday?
Aron Pilhofer: For me? Lots and lots of meetings. Not so true for the whole group, but for me — lots and lots of meetings.
Q: So you have more of a managerial role?
A: Right.
Q: How then would you describe a work day for a typical journalist on your team?
A: Well, it depends what they’re working on. They may or may not be part of those meetings, they might be writing a heck of a lot of code — they’re obviously writing a heck of a lot more code than I am these days. But I’m not sure there’s such a thing as a typical day — our projects do tend to be very, very different. If we have a project that’s a quick turnaround kind of thing, chances are it will be a different kind of experience than if it were a long turnaround thing, like in elections, where it’s in a constant state of release, revise, try something, tweak it, release it again, see if it works — stuff like that. Conceiving of different elements of this through the arc of a story or event.
Q: Would you say the work your team does is an extension of Computer Assisted Reporting (CAR)?
A: I do think that it is; I know that reasonable people can disagree about this, but I do think that these roles fall very naturally out of the tradition of the CAR movement from the ‘80s and ‘90s. It goes back before then, but it really wasn’t until the ubiquitous personal computer had arrived that it was really cost-effective to have people who could specialize in that.
Q: In what ways has the newsroom adapted to these new methodologies in reporting?
A: Well, I don’t know I would say that — I very much think that’s a work in progress. I do think we do a lot more web-first type projects than we ever had in the past. I think that some of the projects we’re able to do or conceive of are influenced heavily by the kinds of things we’re now capable of doing, that we weren’t in the past. In the past, we were pretty much shackled to whatever our content management system provided and whatever templates our designers and IT department were able to cram into that. I think now we’ve been able to break free from those templates and do everything from a custom-built interactive for a single event or breaking news to a full-blown website like School Book, which is the most ambitious project we’ve done editorially. The latter was a really cool process, it involved a core group of reporters, editors, my folks, all together around a table, working on a single project. It was really quite interesting to see. I don’t think the ambition to build a sub-branded experimental site like that would have or could have happened if a team like this had not existed.
Q: Why do you say that?
A: I frequently am asked “How technical do journalists need to get? Do they need to code, et cetera?” I don’t think they need to, but I think they need to have an understanding of what’s possible. The problem is, it’s much easier to say than to actually do. To actually understand what’s possible, you have to have at least a basic understanding of how all this stuff works — what are good devices for telling stories, what are bad devices for telling stories? What makes sense? To know that building a sub-branded education site in the way we did is possible, wouldn’t have been possible before. It wouldn’t have even dawned on anybody to try something like that.
Q: Given this, how do other journalists perceive the work that you and your team do?
A: I think there’s a range. I think there are some people who are indifferent to what we’re doing and there are some people who are really into it. I think it depends — for the most part, digital writ large has been increasing in its importance over the years, and particularly with Jill [Abramson, NYT Executive Editor] taking over as editor, it’s obviously her signature priority. Everyone’s gotten the message that digital is important. It’s still not at the same level as the printed newspaper, but it’s certainly closing the gap in terms of how journalists feel what’s a valuable way to spend their time. So, in most cases, journalists are quite receptive when we propose projects and things that are web-first or web-only, but that wasn’t always the case in the past — I think our election coverage absolutely showed that. As an example, we have now a dashboard — think of it as a live-blog on steroids that comingles content from a wide variety of sources into one stream of news. But one thing we’re doing to make it more interactive is we’re fielding questions. For instance, during one of the debates, which might be a two hour debate, readers are asked via Twitter or via a web form to submit questions as the debate’s going on in real time. We’re then taking those questions — in some cases, as many as half a dozen — and reporters who are there just to do that are taking those questions and answering them, basically doing a fact check. That’s something that never in a million years we would have been able to pull off before. Just the commitment to it would not have been there.
Q: Has it been difficult to get buy-in for projects like these?
A: Oh, of course it’s been hard. It’s a huge newsroom, with 1100 people. There are many parts of this newsroom that are dramatically underserved — let’s put it that way. So yeah, it’s been hard at times to get buy-in, but it’s becoming much easier. Especially in the last two years it’s become quite a bit easier.
Q: Any particular idea why that might be the case?
A: Well, I think in part it’s due to the newness of this group — and I’m talking about projects we’re working on, which often break the article template and are done outside of our content management system. These are projects that aren’t necessarily the kinds of things we’ve done before in the sense of taking them outside of the familiar templates we’ve used. It’s just become quite a bit easier because the profile of this group has grown internally quite a bit — we’ve been around and people know what we do. I don’t know whether there’s a single reason; obviously the support from the top has been incredible. I mean, Jill and John [M. Geddes, Managing Editor (Production)] are both totally into what we’re doing, and so was Bill [Keller, Executive Editor 2003-2011]. All of those things combined make it easier. Plus we have many more people working in this team than we did five years ago. When we started it, it was three people including me. Now it’s 14.
Q: In an article for Idealab you wrote in 2010, you mentioned others in the newsroom viewed you differently once you were given the title “Computer Assisted Reporting Specialist”...
A: This is a little bit of an “Inside Baseball” kind of discussion about what do you call people who do what we do, who are clearly not working in the traditional technology environment but are clearly not writing inverted pyramid-style news stories — what do you call them? I think we were struggling internally to find titles for people that made sense. And ultimately we... I don’t know, I think we kind of punted on it. I mean, our deputies are all “Editors” — that’s their title, “Deputy Editor.” And that makes sense, because they’re doing things that editors do. There are folks underneath them who play what might be described in a traditional software environment as an “architect” role, but there’s really not a newsroom analogy to that. “Editor” sounds weird and forced. We’ve struggled with it, trying to find the right terminology to describe who these people are and what they do and so forth. Now, the analogy to CAR is whenever you go from being a journalist who uses data to being a data journalist, people view you differently — very differently. They don’t really see you as a reporter as much as they do as someone who is able to contribute in some very focused ways. That’s not necessarily bad or good, it just means that terminology matters. So, the argument I was making here was the debate about whether or not it’s a good idea to come up with these specialist titles like “Hacker Journalist” or “Programmer Journalist” or whether the title we should be just thinking about is just “Journalist.” It’s a debate for us, it’s a debate for the CAR community and it continues to be an issue that I don’t know whether there will be a lot of resolution to.
Q: How do you perceive the state of the open data movement, particularly in the US?
A: The open data movement here has been more focused on, as my friends at ScraperWiki are fond of saying, liberating data. Getting it out of the hands of government officials. I think that’s where the open data movement has been less successful — they’ve been largely successful, and I think you could say that data.gov is a success, and I think you can say that part of it, the transparency piece, the Sunlight Labs of the world, have been successful. But I think the problem is that you get the data — then what? Simply putting it into a database or creating a web search isn’t enough. I think the goals of the transparency movement are slightly different — related, but slightly different — from that of a journalist. It’s like a Venn diagram. We all want public officials to give up data, we want them to give up documents, we want openness and transparency. The journalist wants it as a means to an end, whereas in some cases the transparency movement has been more about the data itself being the end. I think that’s where the differences are. I think they’ve been successful here, certainly.
Q: What avenues, within the newsroom, do you see most affecting people moving forward?
A: That’s a good question, actually. Starting with when I started identifying with the CAR community back in the 1980s and early 90s, I think there was an ongoing debate about how widespread these tools and technologies could be — I think there was a bit of a naivete about it at the time. I think that’s changed, somewhat. I think we kind of believed that if we just trained enough people, if we just kept at it long enough, there wouldn’t be a need for specialists like CAR folks, and that reporters would just naturally see the value. If we could just demonstrate the value enough places, at enough times, win enough Pulitzer prizes, everyone would go “A-ha!” and have that head-slap moment where they asked “How did I ever do my job without these tools and technologies and techniques?” I think that has been largely proven to be way off-base. Right now, I think a very, very, very small subset of journalists industry-wide are even capable — I hate to say it, but even interested in many cases — of working with data in even its simplest forms. To me, I find that unbelievable and even tragic. I think, personally, I’ve wondered how you could cover a local government or a school board, how you could do your job, without some basic data skills. Particularly now, when so much government data is so available and so many public records are going electronic. Then, how widespread is what we do? Now you’ve ratcheted it up a whole other level, using data analysis for a purpose of storytelling. In some cases you’ve then added so many layers of complexity... I don’t see many journalists hacking Ruby code in the near future. I just don’t think it’s going to happen. So I’d say my short answer to your short question would be — very few reporters are going to be experiencing this in any sort of meaningful way beyond the conceptual level.
Q: So you see it as continuing more as a kind of specialist function?
A: Yeah. I don’t see that changing any time soon. And part of the reason is that it’s unfortunate but true, but even the simplest web application, you start putting a database behind something, and you put it up on the web, and if it’s not done properly, it will fall apart under even a small amount of stress and traffic. It will completely fall apart. Back in the old CAR days, it didn’t matter how terrible your SQL code was, as long as at the end of the day you got the right answer. If that sucker ran for 15 minutes because you wrote some crazy outer join, didn’t index your tables properly, hadn’t normalized your data — it didn’t matter. So long as you got the right answer. That doesn’t work on the web — there are scalability issues, a whole lot of new variables in the mix. Believe me — I got into this thinking I was very technical, and I was not technical; I just didn’t know any better. It’s a very steep climb.
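Editor's note: to make the point about unindexed tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names, index name, and data are all invented for illustration and are not taken from any New York Times project. The sketch asks SQLite how it plans to run the same lookup before and after an index exists; both versions return the same rows, which is why the difference did not matter for a one-off analysis.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical data: 1,000 schools with 20 score records each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools (id INTEGER, name TEXT);
    CREATE TABLE scores (school_id INTEGER, year INTEGER, score REAL);
""")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [(i, f"School {i}") for i in range(1000)],
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(i % 1000, 2000 + i % 10, float(i % 100)) for i in range(20000)],
)

LOOKUP = "SELECT year, score FROM scores WHERE school_id = 42"

# Without an index, SQLite has to read every row of `scores`.
plan_before = query_plan(conn, LOOKUP)
rows_before = conn.execute(LOOKUP).fetchall()

# With an index on the filtered column, it can jump to the matching rows.
conn.execute("CREATE INDEX idx_scores_school ON scores (school_id)")
plan_after = query_plan(conn, LOOKUP)
rows_after = conn.execute(LOOKUP).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first should describe a full scan of the table and the second a search using the index. On a desktop, a full scan costs one analyst a few extra seconds; behind a public web page, it is repeated for every visitor.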
Q: For people wanting to specialize in CAR, are there any technologies you’d say would be useful to have knowledge of at this stage in the game?
A: Yes and no. I guess it depends. I want to be careful that we’re talking about the same thing — for me, the most important skill to have for any reporter is having some basic data skills. Knowing your way around a spreadsheet is the most fundamentally important thing that any single reporter could learn. Then from there, you start to get slightly diminishing returns. Next on the ladder would be some basic database skills; next on the ladder would be some basic skills in statistics or mapping, GIS. When you have the opportunity to apply those skills... I think there will be fewer and fewer opportunities as you walk up this ladder. Doing some basic programming could be incredibly valuable — but like I said, it very much depends on the situation and the reporter and how motivated they are to use these tools and technologies in their day-to-day reporting. Many reporters don’t see the value, so they’re not going to do it. As a result, I usually answer this question by saying “Excel,” just leaving it at that, thinking Excel might be the gateway drug into these things.
Q: What do you think the reasons are for reporters just not having the interest in CAR?
A: At the very basic level of spreadsheets and database managers, or even Google Fusion Tables or something like that encompassing bits of both and some GIS on top of that — the barrier to entry is relatively low. I don’t think it’s a question of technology so much as a question of journalists just not seeing the value, that they don’t have to know it so why bother? Or finding some other excuse — to me, that’s what it is. Only when you get to the public facing things does the technology really pose a barrier, where even an inspired, highly-motivated journalist who is a beginning programmer is going to make some fundamental mistakes that could be fatal. Scalability isn’t an issue for a newspaper — once you’ve done your analysis, you write your news story. The printed page scales pretty well. Not so for the web.