A lot has been written about “big data” lately. The rapid growth in the variety of data sources, coupled with their increasing density, is creating a huge resource for transportation operators. The proliferation of smartphones and other newly connected devices, together with advances in technologies for data collection and management, has produced a sizeable inflection point in the availability of data. So what does this mean for ITS operators and the systems they currently manage? What will be required to extract and leverage the value of “big data”?
At First Glance
Federal requirements for performance measures and real-time monitoring under MAP-21 and 23 CFR 511 have established a framework that increases the need for new, refined data and information systems. Meeting that need will require improvements to existing networks and communications systems to optimize data and metadata flows between data sources and central applications. Robust central network equipment, including Layer 3 (L3) switches, servers and storage, will also be required. Security measures for new data sources and for the value held in big data will need to be reviewed and addressed. New central data warehouse infrastructure will be required as well, including data platforms (such as Hadoop) that are capable of managing “big data” and the “Internet of Things” (IoT).
A closer look reveals additional layers of change required to begin extracting value from the new data sources. “Big data” will also require somewhat less obvious changes in the way transportation agencies do business.
Increased Data Management and Analytics Expertise – The new data paradigm will require new staff skills, most notably experience in data analytics (“quants”). Staff will need not only knowledge of the data available now or likely to become available in the near term, but also an understanding of transportation systems in order to apply the most beneficial data mining tactics. The new role must be aware of current data and information needs and values, and also cognizant of what is possible, including hidden value that the operating agency has not yet realized or recognized. The new role will also be an integral part of the development of embedded system features, and should be able to identify nuances in data meaning and establish effective predictive analytics.
Policy and Digital Governance – New data sources are also giving rise to discussion regarding privacy and liability. Data sourced from private entities will contend with privacy fears and concerns, at least for the near term, although recent analysis shows a steady lessening of those fears as “digital natives” come to represent a greater percentage of the traveling public. Data generated outside of transportation agencies, but used by them for systems operations, raises the question of who is responsible should data errors occur that affect a system.
Networks and Communications – Data sources, formats and general data management practices will need an extensive review of existing conditions. The review should determine what value is gained from real-time or near real-time collection and the subsequent analytics, and which data is less time dependent. Existing formats and protocols should also be included in the mapping exercise. For example, connected vehicle (CV) deployments will require an upgrade of IP protocols from IPv4 to IPv6. General planning regarding the use of “the cloud” needs to be weighed for benefit-cost. Third-party data brokers and other outsourcing alternatives such as cloud computing also need to be assessed.
Data Management and Analysis Tools – Operating entities also need to look at implementing data management tools (applications) that assist in extracting value from large data sets. These tools should be integrated with core systems and provide real-time metrics on collected data. The tools should also support “cloud collaboration” in order to process data stored by third parties, or general data stored in the cloud.
What to Do
Transportation budgets are as tight as ever. How can operating agencies take incremental steps toward realizing the benefits associated with “big data”? The first step is to begin now. Start by mapping existing data sources to existing data management technologies, policies and processes, from end to end. Also, widen your perspective and look at possible benefits from a wide array of new data sources. In addition, “open” it up, and benefit from the wisdom of the crowd. New analytics skill sets should be considered a condition of certain new hires in the transportation and ITS planning departments. A staff member should be designated to lead decisions regarding “big data”, manage relationships with third-party data brokers and cloud management, and be responsible for implementing an agile framework for next-gen data systems.
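As a rough illustration of the end-to-end mapping exercise, the inventory can start as nothing more than a small structured list that flags where a data source has no storage system, policy or process mapped to it. The sketch below is only one possible shape for such an inventory. The field names, the example sources and the `find_gaps` helper are all hypothetical and are not drawn from any agency system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    """One row of a hypothetical data-source inventory."""
    name: str
    owner: str                      # "agency" or "third_party"
    real_time: bool                 # needed in real time or near real time?
    storage: Optional[str] = None   # system that stores the data, if any
    policy: Optional[str] = None    # governing policy, if any
    process: Optional[str] = None   # operational process that uses the data


def find_gaps(sources):
    """Return {source name: [unmapped fields]} for incompletely mapped sources."""
    gaps = {}
    for src in sources:
        missing = [f for f in ("storage", "policy", "process")
                   if getattr(src, f) is None]
        if missing:
            gaps[src.name] = missing
    return gaps


# Illustrative entries only; a real inventory would come from the agency's review.
inventory = [
    DataSource("loop_detectors", "agency", True,
               storage="ATMS database", policy="data retention policy",
               process="incident detection"),
    DataSource("probe_speeds", "third_party", True,
               storage="ATMS database"),
    DataSource("crash_reports", "agency", False,
               policy="records policy", process="safety planning"),
]

for name, missing in find_gaps(inventory).items():
    print(f"{name}: no {', '.join(missing)} mapped")
```

Even a list this simple makes it visible which third-party feeds lack a governing policy and which time-sensitive sources have no defined consumer, which is where the privacy, liability and network questions raised earlier tend to surface.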