North Carolina’s public records were difficult to search through and compare, stretching the resources of shrinking community newsrooms.
Ryan Thornburg, a Knight Foundation-funded professor of journalism at the University of North Carolina (UNC), recognized the need for journalists to have faster, easier access to public record data, which can yield valuable insight and critical stories for community newspapers. With newsrooms shrinking and data sets growing ever larger, journalists faced a real challenge. According to data scientists interviewed by the New York Times, even data experts can spend 50 to 80 percent of their time collecting and preparing unruly data.
While public records were accessible to anyone, the way they were accessed was not standardized. Some records were available on the web, but only on static webpages or in formats that made data entry and comparison difficult. Other data was available only at local offices, requiring journalists to spend valuable time driving to those locations. Clerks at public offices might lack the expertise to understand exactly which records the journalists needed, which cost still more time.
Ryan received a Knight Foundation News Challenge grant to create a website that would consolidate and aggregate all of North Carolina’s public records into one database, but he needed a web application firm to build the application framework. He chose Caktus.
Caktus supports the open government movement, an initiative to make public information easily accessible. For an open source company, this is a natural extension of the work we do. Using Open Data Philly, another Django-based data repository, as a model, Caktus implemented Open-NC on a wider scale to include local, county, and state public records. The web structure Caktus built would have three main functions:
- Help people research data more effectively. To save data journalists time, the website indexes data in a searchable and shareable form and provides the source for each data set. For ease of comparison, the data can be dropped into spreadsheets so that multiple data sets can be cross-referenced. Web scrapers, some developed by Caktus, were used to collect data from static web pages.
- Enable anyone to request data that isn’t already stored. Anyone needing data not yet available at Open-NC could request it through the website. Other citizens may be looking for the same data, so requests would be approved and then made public, giving others the chance to add the requested records.
- Allow registered users to submit data to the website. In order for Open-NC to succeed as a data repository, there needed to be a way for users to submit public records on their own. Users would be able to register on the site and submit new data. The data would then be vetted by Ryan Thornburg and later added to the repository for all to use.
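To make the first function more concrete, here is a minimal sketch of how a record table on a static web page might be scraped, exported for a spreadsheet, and cross-referenced with a second data set. It is an illustration only and not Open-NC's actual code: the names `TableScraper`, `scrape_table`, `to_csv`, and `cross_reference`, the `county` key, and the sample data are all made up for this example, and it uses only the Python standard library.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Turn an HTML table with a header row into a list of dicts."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def to_csv(records):
    """Serialize records as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def cross_reference(left, right, key):
    """Merge rows from two data sets that share the same value for `key`."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


# Made-up sample data, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>county</th><th>permits</th></tr>
  <tr><td>Example A</td><td>12</td></tr>
  <tr><td>Example B</td><td>7</td></tr>
</table>
"""

permits = scrape_table(SAMPLE_HTML)
inspections = [{"county": "Example A", "inspections": "3"}]
merged = cross_reference(permits, inspections, "county")
```

In this sketch every value stays a string, as it would in a raw scrape; a real pipeline would also need to fetch pages, handle malformed markup, and record the source of each data set.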
In addition to in-house code review and unit testing, Caktus deployed on a near-weekly basis, putting distinct pieces of the website’s infrastructure in place so the client could test the functionality. This generated valuable feedback early in development and led to new features that the client requested after hands-on testing. One of Ryan Thornburg’s students designed the site’s visual aesthetic.
After launching in early 2014 with only 20 datasets, Open-NC now houses more than 100 searchable datasets from towns and counties across North Carolina. The code was also open sourced, making it free for replication in other communities.
Open-NC is a way of saving us all time… it’s just smart.
-Paula Seligson, former reporter, Smithfield Herald
Other communities can also replicate the work of Open-NC. Caktus made the website code open source, meaning it is freely and publicly available for any organization or individual to use in their own community. This supports Ryan’s ultimate goal of helping community newspapers everywhere remain financially stable during the shift from print to digital publishing. Enabling access to vast numbers of records that were, in practice, closed to the public provides a new resource to both journalists and citizens.