Introduction

What is open data

Governments all over the world are sitting on treasure troves of data waiting to be released to the public. Much of this closed data is a by-product of daily processes and can have an incredible positive effect when turned into open data. Open data is data that is freely accessible to the public, offered in machine-readable formats, and published with an open license.

Kinds of spatial data

Most data that is generated and used has a spatial component: the geotag is one critical component of a dataset with many attributes. Releasing authoritative spatial data to the public offers the opportunity for awareness and analysis of where certain phenomena are occurring. Spatial data can include discrete phenomena in the form of points (bus stops), lines (bus routes), or polygons (transit jurisdiction areas). It also includes continuous data from aerial imagery or raster data (land cover or tree canopy). Highlighting the location of features when publishing data stresses the important role that geography plays in the phenomena we interact with in our daily lives. Cities are awash with vast amounts of both discrete and continuous data that can be shared with the public to aid decision making and spur economic development and civic engagement.
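As a minimal illustration of the discrete geometry types named above, the sketch below writes one point, one line, and one polygon as GeoJSON-style Python dictionaries. The coordinates and property names are invented for illustration and do not describe real features.

```python
# Hypothetical features; coordinates are [longitude, latitude] and are made up.
bus_stop = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-77.03, 38.90]},
    "properties": {"name": "Example stop"},
}

bus_route = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[-77.03, 38.90], [-77.02, 38.91]],
    },
    "properties": {"name": "Example route"},
}

transit_area = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        # A polygon ring repeats its first position as its last to close the shape.
        "coordinates": [[
            [-77.05, 38.88], [-77.00, 38.88], [-77.00, 38.93],
            [-77.05, 38.93], [-77.05, 38.88],
        ]],
    },
    "properties": {"name": "Example jurisdiction"},
}

collection = {
    "type": "FeatureCollection",
    "features": [bus_stop, bus_route, transit_area],
}

# List the geometry type of each feature in the collection.
geometry_types = [f["geometry"]["type"] for f in collection["features"]]
print(geometry_types)
```

Continuous data such as imagery or land cover is normally stored as raster grids rather than as features like these.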

Why bother?

There is incredible value, both internal and external, in releasing government data to the public. Government data should be open for many reasons, not all of which are captured below. Some benefits are experienced within government and serve as internal incentives for openness; others are positive externalities produced by the community, which make up the external value derived from open government data.

Internal Incentives

  • Easier data sharing and resource consolidation: Governments themselves are among the biggest users of government data! Having accessible open data lowers the cost of data sharing between departments: people save time and resources by accessing a central repository of data rather than submitting and fielding requests for specific datasets. The number of Freedom of Information (FOI) requests is also lowered, leading to increased efficiency and reduced costs in government.
  • Openness agenda (transparency and accountability): Releasing open data gives constituents a clearer look into government operations. Citizens are able to access government data, manipulate and analyze it, and share it with others so broader populations can understand what their government is doing. Governments are held accountable by their citizens when citizens can both access and understand government data.
  • Upgrade technological environments: While releasing data can be done without overhauling technical infrastructure, it can be a good opportunity to upgrade an environment to ensure interoperability, ease workflows, and increase speed.

External Value

  • Economic development: It is often cited that open data is a three-billion-dollar-a-year industry. Open government data gives businesses, entrepreneurs, and innovators a key resource for economic activity and data-driven economic development. Governments enable and promote the creation of services and businesses by opening resources that they already have!
  • Civic engagement: Communities that are introduced to open data are given a multitude of entry points into participatory governance. Citizens can use data to create opportunities to be directly involved and informed in government decision making, or use the data to help make better decisions in their daily life.

There are many specific examples that can be used to illustrate the above points, and also many opportunities that we have not observed yet! Part of the beauty in opening up government data is the unknown benefits and possibilities that can arise from combining datasets, creating services, and putting them in action.

This Assessment

Getting an open data initiative up and running shouldn’t be difficult or painstaking. This guide outlines important factors to consider before you embark on opening government data, during the implementation of the site or portal, and after publication. It will ultimately provide insight into where more attention to particular factors may be needed to ensure success, and will identify best practices to help governments reach their goals.

Success Factors

Beforehand

There are several things to consider before launching an open data initiative, and many ways to assess how successful your open data initiative may be once launched.

Leadership

A successful project must have some sort of leader. While an open data specialist is not necessary for success, having senior support and someone (or a group of people) within a GIS department to spearhead the initiative will help get it off the ground, and institutional support will contribute to its sustainability and longevity.

Senior Support

Having a senior staffer vouch for the implementation of the open data initiative will help overcome any resistance that may arise. Governments may or may not experience a change in workflows, and senior support will help motivate anyone who needs to make changes to do so in a timely and effective manner.

Senior support is demonstrated when a government has issued a political promise or policy position on open data, open government, or FOI, and visible champions exist for open data at a political level. It’s beneficial if there are existing activities related to open data and open government: if the idea of open data is already generally known and supported within the government, an initiative is more likely to attract support and interest, and to succeed.

Institutional Support

Sustainable institutional support will help ensure the open data initiative is upheld beyond a single administration. Government initiatives may be pressured to produce results within an administration period, and it may be difficult to prove ROI on a well-planned open data initiative in that short time. It is essential to ensure that a change in government will not jeopardize the existence and maintenance of the program.

Open data should be supported in the broader political context of the government and have commitments across the political spectrum. An initiative is most likely to succeed if a government’s stated priorities and goals align with open data across any political divides and administrations.

Project Lead

Governments don’t need to hire people specifically to manage and oversee open data programs; open data is part of daily processes that can easily be exposed to the public by adding a couple of steps to existing workflows.

It is best to have one or more project leaders from the GIS department who can serve as a point of contact. This point of contact should understand the data and policies, and should be able to leverage existing data and services to get the project off the ground quickly.

Policies and Legal

Governments must delineate the policies and legal frameworks within which an open data initiative will operate to serve the city and its citizens. In many cases existing policies can be applied to open data; in others, new policies must be developed. It is critical that all of these policies and frameworks are documented clearly for the long-term success and sustainability of the program. This documentation also contributes to institutional support in the Leadership category.

Existing Policies

It is very likely that governments have applicable policies that already exist. These policies, where appropriate, can be applied to reduce overhead and enable the initiative to be launched more quickly.

Governments with existing Freedom of Information policies can take relevant components, such as processes for data that requires sanitization, and apply them to their open data initiatives. If governments have a track record of releasing information requested through FOI mechanisms, they are likely already well set up to release open data. Privacy policies are also necessary for data anonymization in FOI requests, and should be used when preparing open data where applicable.

Licensing

A key component of an open data initiative is establishing the license under which the data will be published. The license makes users aware of any use constraints placed on the data and lets governments set out how their data may be used.

Governments may have pre-established terms and conditions that can be used for open datasets. However, these terms can be filled with legalese and be cumbersome for the data user. It is strongly recommended to choose a clear and concise license that users can easily understand. Creative Commons has a selection of licenses that are common in the open community and have clear and accessible meanings.

Sustainability

All of the chosen policies and legal frameworks should be properly documented to ensure the longevity of the initiative as governments see employee and administrative turnover. Governments that effectively do this already are at an advantage; governments that do not include proper documentation in their processes should make the proper changes to include it.

Financial

Launching and sustaining an open data initiative should not and does not have to be a large financial undertaking. Many governments already have the required technical resources and can focus on devoting any available budget to personnel and programs.

Platform

Exposing open data in a consistent, accessible, and explorable manner through ArcGIS Open Data is a functionality included in ArcGIS Online subscriptions — for most governments that already have this subscription, launching open data is free. Publication is part of normal, daily operations and the platform does not require any extra budgeting.

Personnel and Programs

Keeping an open data program effective does take some personnel resources for data quality, support, and outreach. If governments have funds dedicated to an open data initiative, they can be directed toward staff for community support and the development of outreach programs. This focus on users and communities will help ensure that users are being properly served and have the data they need.

Institutional structures and capabilities

Governments need to be equipped to handle data management for the open data initiative. Clearly defined processes for data gathering, staging and quality control, and publication are essential. Most data is geospatial in nature and lots of data is already available within governments to immediately release. Once initiated, programs can and should grow beyond the GIS department.

GIS Department

People in the GIS department have been dealing with data management, service infrastructures, and analysis for decades. It makes sense for this department to begin the initiative with the data they are most familiar with.

Governments with a GIS department that is experienced using ArcGIS Server and ArcGIS Online can quickly get their open data program off the ground with a few button clicks. Using existing data and services is a great way to kickstart the initiative — commence with grassroots beginnings and work upwards.

Departmental Coordination Beyond GIS

Data will likely come from many sources and departments, such as transportation and planning. Coordination between the GIS department and other departments on data formats and workflows will help maintain consistency.

Departments should follow policies and frameworks that were previously delineated to maintain standards. If all departments are supportive of the program, users will benefit from the most data available. Distributed workflows can help the open data initiative grow beyond the GIS department, and other departments can work to publish their own data straight to the open data site.

Demand for Open Data

If there is a current demand for open data that a new open data program will meet, then it will likely see immediate use and its effectiveness will be jumpstarted. The demand for open data can come from the external community or from inside government.

External Community

The external community can include citizens seeking data for personal and non-profit use, as well as companies seeking data for use in businesses and services.

A city that has an already engaged community who consistently ask for data and who submit FOI requests can expect to see immediate use and success. These users will have a new resource full of accessible data in multiple open formats for use in civic projects and for personal use. If a city has companies that are using public and open data to create services, their open data site will cater to a wider demographic and see multitudes of uses.

Internal Community

Open data will satisfy the needs of internal data sharing, and will increase government efficiency. Departments will no longer have to request and wait for data, or field and fill data requests.

If a government sees a lot of data sharing amongst its departments, it is likely that these departments will see the benefits of improved workflows from accessing data in a central repository.

Implementation

Once a government has assessed its capability to launch and sustain an open data initiative, the technical details of implementation remain to be worked out. They fall into two categories: the government’s technical ecosystem, and its data management and procedures.

Technical Ecosystem and Skills

A city’s technical ecosystem refers to their current available infrastructure and data sources. Skills refer to the city’s personnel and their capacity to implement a usable and effective open data site. These particular technical considerations are mostly specific to the ArcGIS Platform and consider what is necessary and what is possible when establishing an ArcGIS Open Data site.

ArcGIS Platform

ArcGIS Open Data is a capability of ArcGIS Online, where ArcGIS Online organizations can enable open data and create their own site for free. Governments can use ArcGIS Open Data to easily provide their users with a consistent experience to access, discover, explore, and download their authoritative data. Administrators can host their data on ArcGIS for Server, on ArcGIS Online, or elsewhere.

If a city has an ArcGIS Online subscription already, they are ready to start publishing their data. Administrators wishing to house their data on their own server must have ArcGIS for Server 10.x — the more recent, the better. Data from Server must be registered in ArcGIS Online and organized in open data groups to be added to an open data site.

Multiple Sources

An ArcGIS Open Data site can contain data from multiple sources and APIs and present it in a consistent interface. Administrators can use Koop to pull in data from multiple sources, such as CKAN and Socrata. Koop will transform your data into a feature service, which can then be included in an ArcGIS Open Data site and is available in multiple open formats.

If a government would like to pull in data from outside the platform, they must deploy their own instance of Koop.

Multiple Services

Many different types of data can be shared in ArcGIS Open Data: map services, feature services, image services, and more. ArcGIS for Server has a common API across all these services, and Koop provides an option to bring other data into the ecosystem. Users can access all different kinds of data through the same interface and can download them all in the same open formats, regardless of the data type upon upload or registering.
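To make the idea of a common API concrete, the sketch below builds a request URL for the `query` operation that the ArcGIS REST API exposes on feature and map service layers. It only constructs the URL and makes no network call. The host and service path are hypothetical, and `f=geojson` is an assumption about the server: whether that output format is available depends on the service type and server version.

```python
from urllib.parse import urlencode


def build_query_url(layer_url: str, where: str = "1=1",
                    out_fields: str = "*", fmt: str = "geojson") -> str:
    """Return a query URL for a single service layer.

    `where`, `outFields`, and `f` are parameters of the ArcGIS REST API
    query operation; the defaults here request every record and field.
    """
    params = {"where": where, "outFields": out_fields, "f": fmt}
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


# Hypothetical layer URL, for illustration only.
layer_url = ("https://gis.example.gov/arcgis/rest/services/"
             "Transit/BusStops/FeatureServer/0")

query_url = build_query_url(layer_url)
print(query_url)
```

The same function works unchanged for a map service layer by swapping `FeatureServer` for `MapServer` in the URL, which is the practical benefit of a shared interface.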

Design

Designers are free to fully customize their homepage with their own HTML and CSS. This allows sites to be fully branded to an organization to meet any necessary design guidelines. However, a dedicated, experienced designer is not needed to establish an authoritative site: administrators can use the WYSIWYG editor to drag and drop text and image widgets to configure the homepage.

Designing the homepage may take some time to reach a consistent web experience across governmental sites and their open data site. It is important to create an authoritative site that guides users to find the data they are looking for.

Data Management, Procedures, and Availability

There are several ways governments can organize their departments and employees to optimize workflows for publishing open data. Cities can have federated sites with data from multiple departments, and can control how personnel publish data through user roles. When doing so, it’s important to control for data quality and to ensure that the most impactful datasets are made available to the user community.

Departmental Federation and User Roles

A government with an ArcGIS Online subscription can create as many ArcGIS Open Data sites as they’d like — there can be one central site, as well as individual sites for different departments, themes, or special events. Departments can manage their own data on their own site, and data can be federated from these groups into a common site. Establishing a workflow of staging groups and staging sites can help ensure quality assurance of the data.

Employees within an organization can have different user roles and different access to certain groups. This helps in a federated workflow where everyone across all departments can submit data to a staging group that is not open data enabled. A select few open data admins can monitor this group to ensure the datasets are ready for publication: that they have complete metadata, are properly sanitized, and so on. Once a dataset is approved, open data admins can add it to an open data-enabled group so that it appears on the ArcGIS Open Data site.
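The staging workflow described above can be sketched as a small model in which anyone may submit to a staging group but only an open data admin may promote a dataset. This is an illustrative model only, not the ArcGIS Online API: the class names, the role strings `"publisher"` and `"open_data_admin"`, and the functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    open_data_enabled: bool
    datasets: list = field(default_factory=list)


@dataclass
class User:
    name: str
    role: str  # hypothetical roles: "publisher" or "open_data_admin"


def submit(user: User, dataset: str, staging: Group) -> None:
    """Any member may place a dataset in the staging group."""
    staging.datasets.append(dataset)


def approve(user: User, dataset: str, staging: Group, public: Group) -> None:
    """Move a reviewed dataset from staging to the open data-enabled group."""
    if user.role != "open_data_admin":
        raise PermissionError(f"{user.name} is not allowed to approve datasets")
    if dataset not in staging.datasets:
        raise ValueError(f"{dataset!r} is not waiting in {staging.name}")
    staging.datasets.remove(dataset)
    public.datasets.append(dataset)


staging = Group("Staging", open_data_enabled=False)
public = Group("Open Data", open_data_enabled=True)
planner = User("planner", role="publisher")
admin = User("admin", role="open_data_admin")

submit(planner, "Bus Stops", staging)
approve(admin, "Bus Stops", staging, public)
print(staging.datasets, public.datasets)
```

Keeping submission and approval as separate steps means a dataset only becomes public after someone with the admin role has reviewed it.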

Quality Datasets

Ensuring high quality data is crucial when curating data for publication — data should go through a quality assurance process so users get access to the best data possible. This process is part of the policies and workflows for data publication established within a city and should account for complete titles, descriptions, licensing, and other appropriate metadata.

Much of the data coming from the GIS department is optimized for GIS software and used by people with a high degree of knowledge of and familiarity with the data. Part of the process is making this data accessible to the end user: including field aliases and attribute descriptions makes the data easier to understand. When possible, it is best to collect and produce data with the intent to distribute it publicly.
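A quality assurance step like the one described above could begin with an automated completeness check. The sketch below is one possible approach under stated assumptions: the dictionary layout, the set of required keys, and the sample dataset are hypothetical and are not taken from any ArcGIS schema.

```python
# Metadata keys that every dataset must fill in (an assumed policy).
REQUIRED_KEYS = ("title", "description", "license")


def metadata_problems(item: dict) -> list:
    """Return human-readable problems that should block publication."""
    problems = [
        f"missing {key}"
        for key in REQUIRED_KEYS
        if not (item.get(key) or "").strip()
    ]
    # Cryptic column names are hard for the public to read without an alias.
    for field_name, info in item.get("fields", {}).items():
        if not info.get("alias"):
            problems.append(f"field {field_name} has no alias")
    return problems


# Hypothetical dataset record with an empty description and one bare field.
parcels = {
    "title": "Parcels",
    "description": "",
    "license": "CC BY 4.0",
    "fields": {
        "PRCL_ID": {"alias": "Parcel ID"},
        "ZN_CD": {},
    },
}

problems = metadata_problems(parcels)
print(problems)
```

A check like this catches omissions, but it cannot judge whether a description is accurate or whether sensitive values remain, so human review is still needed.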

Impactful Datasets

If data is published incrementally, the first datasets released should be those most useful to users. Publishing high-impact data that is already available within the GIS department is a great way to start and generate utility. The department can determine which data sees high use; this may include parcels, crime data, census data, building footprints, and addresses.

When deciding which datasets to publish, personnel may consider which data is most often requested through FOI, since this data is likely to see a lot of use once it is open. Personnel should also communicate with their user base to find out what kind of data they are seeking.
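One simple way to act on FOI demand is to count how often each dataset is requested and rank the ones not yet published. The sketch below assumes a hypothetical request log in which each entry has already been mapped to a dataset name; the values are invented.

```python
from collections import Counter

# Hypothetical FOI log: one dataset name per request received.
foi_requests = [
    "parcels", "crime incidents", "parcels",
    "building footprints", "parcels", "crime incidents",
]

# Datasets that are already on the open data site.
already_published = {"building footprints"}


def publication_priorities(requests, published):
    """Rank unpublished datasets by how often they were requested."""
    counts = Counter(name for name in requests if name not in published)
    return counts.most_common()


priorities = publication_priorities(foi_requests, already_published)
print(priorities)
```

Request counts are only one signal, so the ranking is best treated as a starting point for conversations with the user base.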

Beyond the Portal

An open data initiative doesn’t stop with an open data site — putting the data in the open is only useful if people actually use it and generate the benefits discussed in the first section of this assessment. After the portal is implemented there are many things governments should be doing to ensure their open data initiative reaches its full potential.

Usage Analytics

Tracking usage analytics of a site enables administrators to gain more insight regarding how their data portal is being used, and offers opportunities to identify where efforts should be focused going forward.

Health

Administrators can track metrics to measure the ‘health’ of usage, monitor the growth of the initiative, and track return on investment. These numbers help quantify the utility of an open data program and help lead to its sustainability.

Metrics are built into ArcGIS Open Data: administrators can track the number of site sessions, dataset views, dataset downloads, and more.

Identification

Analytics also offer insight into where users are spending their time, which datasets they are accessing and exploring, which formats they are downloading, and what datasets they search for. Knowing the common search terms allows administrators to devote effort to creating datasets that fulfill users’ needs. Admins can also use this information to better categorize their datasets on the site homepage so users can quickly access what they are looking for.
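The kind of analysis described above can be sketched against an exported event log. The event structure below is hypothetical and is not the format of the built-in ArcGIS Open Data metrics; it only shows how search terms with no matching dataset might be surfaced.

```python
from collections import Counter

# Hypothetical usage events exported from a site.
events = [
    {"type": "search", "term": "zoning"},
    {"type": "search", "term": "bike lanes"},
    {"type": "search", "term": "Zoning"},
    {"type": "download", "dataset": "Parcels", "format": "csv"},
    {"type": "download", "dataset": "Parcels", "format": "shapefile"},
    {"type": "view", "dataset": "Bike Lanes"},
]

# Titles of the datasets currently published.
catalog = {"Parcels", "Bike Lanes"}


def summarize(events, catalog):
    """Count searches and download formats, and list unmet search terms."""
    searches = Counter(
        e["term"].lower() for e in events if e["type"] == "search"
    )
    formats = Counter(
        e["format"] for e in events if e["type"] == "download"
    )
    titles = {title.lower() for title in catalog}
    # A term is "unmet" here if it matches no dataset title exactly.
    unmet = [(t, n) for t, n in searches.most_common() if t not in titles]
    return {"searches": searches, "formats": formats, "unmet": unmet}


summary = summarize(events, catalog)
print(summary["unmet"])
print(dict(summary["formats"]))
```

Exact title matching is a deliberately crude test. A real site would compare terms against tags and descriptions as well.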

Civic Engagement

City governments must connect with their community to ensure their open data initiative is effective and sustainable. Meeting with the users of the data is the best way to ensure your data is meeting their needs. There are great examples of ways governments are creating spaces for these connections and meeting communities where they are to create a two-way conversation surrounding open data.

Hosting Events

Hosting events is a fantastic way to raise awareness of an open data initiative and the vast quantities of data that are now available for use. Cities can host hackathons, contests, and office hours to spur and support the use of open data and prompt the community to get involved.

These events are a good opportunity to get feedback from users about their experiences with the open data site and the available data.

Attending Events

Lots of cities have thriving civic technology scenes. Attending and speaking at these kinds of meetups will raise awareness of the open data available and will surface use cases for particular groups.

Getting in touch with a local Code for America brigade, if present, will put the data at the forefront of civic hacking. These groups have the technical skills and the field knowledge to turn data into services and information that the greater community can use.

Local Organizations and Schools

Connecting with local start-up incubators and schools will extend the reach of the data. Connecting with incubators will provide local start-ups with high-quality, authoritative data that can be used to stimulate economic development. Connecting with schools is a great opportunity to get open data into the classroom and into the hands of students. Universities and schools can access the data for assignments to create meaningful projects about the community they live in.