4:48 pm Mar. 21, 2012
On March 12, Mayor Michael Bloomberg signed legislation that requires city agencies to make all the data they produce public online, and structured such that it can easily be used for almost any purpose, with only certain privacy and confidentiality requirements delimiting its release.
“If we’re going to continue leading the country in innovation and transparency,” he said in his statement on the local law, Intro. 29-A, “we’re going to have to make sure that all New Yorkers have access to the data that drives our City.”
From here on out, it is the job of Carole Post to manage that open-data system. She's the city's chief information officer, and also serves as the commissioner of her own department, the Department of Information Technology and Telecommunications.
It's more complicated than it seems.
There are scores of mundane but useful civic datasets that often get overshadowed by higher-profile ones, but the reality is that some of the most powerful databases are also the most controversial.
The Police Department’s required release of stop-and-frisk data, for example, was driven by the outrage of a city over the death of Amadou Diallo in 1999 and subsequent focus on racial profiling.
The recent release of teacher evaluation data followed a lengthy legal battle with the teachers’ union.
Citing that latter data, Bloomberg predicted at his press conference on the open data bill that “it will [be] useful in fighting back those who don’t want data out there.” He went on: “You saw that sometimes people don’t want to look at what the data says or don’t believe what the data says, [but] it’s the public’s data.”
“While the philosophy and vision of open data is critical and one we all embrace, we’d rather err on the side of overprotecting rather than letting data be published that we have an obligation as stewards to protect,” Post told me in an interview. “That’s the push and pull of this legislation.”
On one level, there’s truly sensitive data: personal data collected by the government in the course of its business, including, in some cases, names, addresses, even Social Security numbers, as well as representations and filings about personal income, marital status, race, and health status.
“If you make a mistake and end up releasing data that shouldn’t be released,” Post said, “or damaging the integrity of data sets that have confidential information or private information, you damage the program across the board. The starting place is that all data is open and should be published to the public, but from there you have to recognize a very important balance of protecting privacy, confidentiality, and public safety – the many concerns that come into play data set by data set, data point by data point.”
The challenge is differentiating between privacy concerns, which systems can be put in place to mitigate, and merely sensitive data that some public entities are reluctant to publish for fear of embarrassment or being misunderstood.
“Agencies have said, and I think fairly, ‘I don’t have a problem releasing the data per se, but there’s always a story. There’s always context,’” Post said. “When you just throw up a spreadsheet, you’re inviting myriad explanations for it, some not always accurate.”
Post said that the gradual and voluntary rollout of public data by agencies did the work of convincing agencies that they could make government data public and “the sky didn’t fall.”
Post, who rose up through the mayor’s office before taking her post in 2009, said one option for opening the data might have been to simply have the mayor or other elected officials issue a decree forcing agencies to comply. That’s arguably the route President Obama took when he issued an open government memo as his first executive action on his first full day in office.
“We look incredibly smart now,” said Post, “but one of the most powerful lessons is that this wasn’t simply legislated. We didn’t bang a club over anyone’s head.”
Soft launching the open data initiative brought out the data sets of a few willing agencies. Rather than getting slammed, they were praised by the public and press, and the feedback loop encouraged them to do more. Case in point: the city’s first BigApps competition in 2009.
The city is now in the middle of its third BigApps competition, which awards cash prizes (the top in the 3.0 round is $5,000) to applications that make use of New York City data. Last round, the winner was Roadify, an app aimed at connecting commuters to transportation data. One powerful takeaway from those apps competitions, Post said, is that the data that might get all the news attention, or that might be most critical inside government, isn’t necessarily the data that civilians hunger after.
“Some very obscure data sets,” as Post put it.
She points to a Department of Parks census of street tree demographics. Inside city government, she said, the reaction was, “Who cares?”
But Parks didn’t see any particular reason not to release it to the public. After all, they were using the data to do their work, and it was just sitting around on their computer systems.
That information formed the basis of an app called Trees Near You, a mobile tool that allows urban dwellers to click on a nearby tree and pull up details on its kind and size, along with the Wikipedia entry on the particular type of tree.
“We were able, at no cost,” Post said, “to satisfy an unknown public need.”
“By making the data open and available across the board, we, the government, can get out of the business of trying to prioritize what the public might want to see, in what form they want to see it, and what they want to do with it.”
Some of the more popular datasets on NYC.gov’s open data portal are expected: DoITT’s database of wifi hotspots, 311 service requests, geographical data on public parks. But others aren’t necessarily what a bureaucrat would focus on if he or she were charged with picking winners: maps of water fountains, locations where movies have been filmed, a Department of Transportation database of traffic signs.
And the BigApps competition has also revealed another benefit to open data for the city. A recent tech-industry-sponsored study claimed that nearly half a million jobs have been created around mobile and web apps, including those on Facebook.
City data can be the lifeblood of those apps—and cities that provide that data can become magnets for app creation.
Post pointed to developers, some in the financial industry, who were able to quit their day jobs, get some seed funding, and turn their apps into a viable commercial concern. She talked about MyCityWay, an app that helps you find food, shelter, wifi, parking spots, and more. It started in New York City and has since expanded to 70 markets.
“We didn’t necessarily see the intersection,” she said, between open data and sparking local startups, “until it produced itself.”
As a civic experiment, apps competitions began in 2008 with Washington D.C.’s Apps for Democracy, under the direction of Vivek Kundra. Kundra was the chief technology officer for D.C. before he became the chief information officer in the Obama White House.
In the nationwide open government movement, the usefulness of such competitions has been much debated. Kundra’s D.C. successor put a stop to the contest, arguing that such contests produce tools “designed for devices that aren’t necessarily used by the large populations” interested in city services.
But as Post told it, app production aside, the BigApps experience over the years proved enormously successful in New York City as a way of making an open data ideal tangible and not all that frightening.
“By the time we got to the point where we wanted to legislate it,” Post said, “we had the majority of stakeholders pretty warmed up to the idea.”
The White House chose to build out Obama’s call to open the federal government by setting standards and then leaving it to agencies to figure out how to meet them.
“We don’t have a one-size-fits-all plan,” Aneesh Chopra, the federal chief technology officer, explained as he left office in February, saying that it’s a choice with its own “risks and rewards.”
Once agencies and elected officials were on board, said Post, New York City decided to go the top-down and across-the-board route, issuing the same expectations for every agency.
“To leave it to each one’s designs,” Post said, “may mean that [they] achieve openness but not usefulness.”
Under the bill, Post's department has been charged with creating by September a manual that directs agencies on how to set up their processes to produce useable data, as well as with building an A.P.I. for use on the city’s data portal.
The next step is that, in a year, agencies must convert information that is now up on NYC.gov but in a “locked” format (PDF, for example) into machine-readable formats.
In 18 months, agencies must have inventoried what data sets they have, described their content, detailed their plans for making that information public, and chosen which datasets will get priority treatment.
By 2014, agencies will have had to start publishing yearly compliance reports. It is not until 2018—six years from now—that the NYC.gov open data portal is slated to be fleshed out with every data set that is meant to be public.
For Michael Bloomberg, who built his first career and first billion on his skill at organizing and publishing structured financial data, this open data bill stands as a legacy builder. Bloomberg thrives on information. In his autobiography, Bloomberg wrote with awe about John Aubert, a man who started Bloomberg Inc.’s first data collection facility. “He collected data better than anyone and loved the process,” writes Bloomberg in Bloomberg by Bloomberg.
Which helps to explain why New York City is, ahead of the urban open government movement generally, taking the step of elevating from administrative practice to law the status data should hold in the city’s future.
“To the extent that the successor administration has a different approach or a different philosophy,” Post explained, “Mayor Bloomberg has wanted to cement many of the advancements that we feel are very positive for New Yorkers.”
For a mayor who is paying close attention to what the city will look like after he leaves office, “legislation is more difficult to unravel,” Post said.
“These are things that you could barely imagine life without after they’ve been implemented.”
The teacher scores were released only after a protracted court battle, and even then, the usefulness of the data without any serious effort at interpretation was widely panned.
But short of the courts, I asked Post, how do you go about handling recalcitrant agencies who might claim that all the data they hold is simply too sensitive, too open to misinterpretation to be made public?
“It’s a really good question,” Post said. “You’ll have to check with me in 18 months.”