Big Data: The Future of Urban Mobility Systems

Today, every industry, including transportation, records an extraordinary amount of data. Big Data has emerged as a result of the rapidly falling costs of collecting, storing, processing, and distributing data. Cheaper storage has made it practical to keep data rather than discard it. To quote science historian George Dyson: “Big Data is what happened when the cost of storing information became less than the cost of throwing it away.”

In the past, data that was considered insignificant or trivial (“digital dust”) was discarded. Today, this digital dust is analyzed with sophisticated software and, when merged with other seemingly trivial contextual data, can provide valuable insights.

Let’s take a look at why Big Data is the future of urban mobility operations and how it can help travelers, transportation services, and public agencies make smarter decisions.

Big Data


Without a doubt, we have witnessed the rapid development of software that is changing how we travel. Many people now use mobile websites and apps for a variety of transportation functions, including vehicle routing, parking, trip planning, and fare payment.

Few people realize that real-time analytics and algorithms are constantly working to improve their travel experience. This work includes managing crowdsourced and flexible routing, providing predictive analytics to forecast and respond to demand accurately, and improving operational responses when natural or man-made hazards occur.

Although public transportation services already use a vast amount of data in their modeling and operations, Big Data combined with data sharing has far greater potential to improve transportation planning and traveler services.

A good example of this was the 2014 FIFA World Cup in Rio de Janeiro. The local government acquired driver navigation data from Waze, the Google-owned navigation app, and combined it with data gathered from the public transit app Moovit. Together, this data provided crucial real-time information about the transportation network. As a result, engineers and local transportation planners were able to obtain data on half a million drivers and identify operational issues.

Data Extraction

An important factor to consider when selecting a data source is its quality for analysis. Data analytics refers to how information is extracted from a data set. First, the data is categorized into relevant fields such as origin, destination, time, longitude, and latitude. Next, a series of operations is performed to clean, transform, and model the data so that meaningful conclusions can be drawn.
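To make the categorize-then-clean steps above more concrete, here is a minimal Python sketch. The field names, the sample records, and the validation rules are all assumptions made for illustration; a real feed would need its own schema and checks.

```python
from datetime import datetime

# Hypothetical raw records; field names and values are illustrative only.
RAW_TRIPS = [
    {"origin": "A", "destination": "B", "time": "2021-03-01T08:15:00", "lat": "53.46", "lon": "9.98"},
    {"origin": "A", "destination": "", "time": "2021-03-01T08:20:00", "lat": "53.47", "lon": "9.99"},   # missing destination
    {"origin": "C", "destination": "D", "time": "not-a-time", "lat": "53.48", "lon": "9.97"},           # bad timestamp
    {"origin": "E", "destination": "F", "time": "2021-03-01T09:05:00", "lat": "153.0", "lon": "9.96"},  # latitude out of range
]


def clean_trip(raw):
    """Return a typed trip record, or None if the raw record is unusable."""
    if not raw.get("origin") or not raw.get("destination"):
        return None
    try:
        when = datetime.fromisoformat(raw["time"])
        lat = float(raw["lat"])
        lon = float(raw["lon"])
    except (KeyError, ValueError):
        return None
    # Reject coordinates that cannot exist on Earth.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return {
        "origin": raw["origin"],
        "destination": raw["destination"],
        "time": when,
        "lat": lat,
        "lon": lon,
    }


def clean_trips(raw_trips):
    """Keep only the records that pass every check in clean_trip."""
    cleaned = (clean_trip(r) for r in raw_trips)
    return [trip for trip in cleaned if trip is not None]


if __name__ == "__main__":
    for trip in clean_trips(RAW_TRIPS):
        print(trip)
```

Dropping bad records is the simplest policy; depending on the analysis, it may be better to repair or flag them instead.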

At this point, a range of techniques and tools have already been developed to manipulate and visualize Big Data. Furthermore, expertise is drawn from various fields, including statistics, computer science, mathematics, and economics. There are more than a few challenges to face, which is why Big Data requires a multidisciplinary approach.

When it comes to transportation planning and urban mobility, spatial analytics are used to extract the topological, geographical, and geometric properties that are encoded inside a data set. 
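As one small example of a geometric property that can be extracted from location data, the sketch below computes the great-circle (haversine) distance between GPS fixes and sums it along a trajectory. The function names are illustrative, and the spherical-Earth radius is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def path_length_km(points):
    """Total length of a trajectory given as a list of (lat, lon) tuples."""
    return sum(haversine_km(*p, *q) for p, q in zip(points, points[1:]))


if __name__ == "__main__":
    # Hypothetical three-point trajectory.
    track = [(53.46, 9.98), (53.47, 9.99), (53.48, 9.97)]
    print(f"{path_length_km(track):.2f} km")
```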

One of the most important factors when selecting a data source is the scope and quality of the data set. It is tempting to assume that data extracted from a single source is clean and precise. In reality, data from a single source is often messy and includes incorrect, mislabeled, missing, and even spurious values. Because data sources are heterogeneous, one source is also often incompatible with another.

Data Mining and Modeling

Although traditional methods that involve statistics and optimization are still relevant, they have certain limitations when faced with high-velocity data sets. Approaches such as data mining, network analysis, visualization techniques, and pattern recognition have shown much better results when it comes to Big Data. 

Rather than assuming a model that describes relationships in the data or requiring specific queries on which to base analysis, data mining lets the data speak for itself. This means it relies on algorithms to discover patterns that are not evident in single or joined data sets.

Data mining algorithms perform different types of operations, including classification, clustering, regression, association, anomaly detection, and summarization. These approaches can be supervised, in which examples provided by human operators guide the process, or unsupervised, in which patterns are detected algorithmically.
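Anomaly detection is one of the easier operations in that list to illustrate. The sketch below flags values that sit far from the mean of a series, using a simple z-score rule. The hourly counts and the threshold of 2.0 are made up for illustration, and production systems typically use more robust methods, since a single extreme value inflates the standard deviation.

```python
import statistics


def zscore_anomalies(values, threshold=2.0):
    """Return the indices of values whose z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


if __name__ == "__main__":
    # Hypothetical vehicle counts per hour at one sensor.
    hourly_counts = [120, 130, 125, 128, 122, 127, 400, 124]
    print(zscore_anomalies(hourly_counts))
```

No labeled examples are needed here, which makes this an unsupervised operation in the sense described above.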

Building and running models is crucial for testing hypotheses about the importance of different variables in real-world systems. Models can simulate real-life scenarios and, as a result, help characterize, understand, and visualize relationships that are otherwise hard to see in complex systems.

Big Data drastically increases the scale, scope, and accessibility of modeling exercises. These exercises remain essential for ensuring that the right questions are asked and for producing high-value outputs.

What Does the Future Hold?

Urban mobility is an ongoing problem for cities worldwide. Cities face the enormous task of ensuring that travelers can get from point A to point B safely and affordably. To do so, they need a better understanding of their complex mobility systems. This is where Big Data and visualization come in, so let’s take a look at what the future holds for urban mobility systems.


As we already mentioned, models built on data can simulate real-life scenarios. Using data, cities can now understand their ecosystem better and answer crucial questions about travel demand throughout the day, public transport capacity, bottlenecks, and more.

Using heatmaps, cities can now illustrate traffic volume for individual roads, neighborhoods, or entire regions. These visualizations help traffic planners optimize infrastructure more easily.
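Behind a heatmap is usually a simple aggregation step: location fixes are binned into grid cells and counted. The sketch below shows that step only, assuming a square grid in degrees and a handful of hypothetical GPS points; the rendering itself would be left to a mapping or plotting tool.

```python
import math
from collections import Counter


def grid_counts(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell; cells are cell_deg degrees on each side."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical GPS fixes from vehicles.
    fixes = [(53.465, 9.985), (53.466, 9.984), (53.475, 9.985)]
    counts = grid_counts(fixes)
    # The busiest cells are the "hottest" areas of the heatmap.
    for cell, n in counts.most_common():
        print(cell, n)
```

A cell size of 0.01 degrees is roughly 1.1 km in the north-south direction; a finer grid gives more detail but needs more data per cell to be meaningful.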

Traffic congestion

According to INRIX and TomTom, traffic volume dropped during the global pandemic, and with fewer vehicles on the road in 2020, drivers saved money thanks to reduced congestion. However, traffic jams are still a huge problem in cities around the world. They produce large quantities of carbon dioxide and raise public health risks and medical treatment costs.

Traffic congestion can be reduced with the help of transportation data. Three modules make up the general framework for data-analytics-based traffic flow prediction: a data collection module, a data analytics module, and an application module.

Data Collection Module

Transportation data can be collected from various vehicle-mounted devices, including GPS, WiFi, Bluetooth, and RFID. For instance, shared bikes can be equipped with GPS, WiFi, and Bluetooth for tracking. As a result, we can see when the bikes are used and which routes they travel. However, micro-mobility vehicles have one big disadvantage: their small size makes them less visible on the road, which can lead to crashes.

Data Analytics-Based Module

Transportation data can now be analyzed with methods such as deep learning, classification, ranking, and regression to predict traffic flow. Techniques such as time series models, deep-learning-based predictors, Markov chain models, and combinations of neural networks can be used for data-analytics-based prediction.
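Of the techniques named above, a Markov chain model is the simplest to sketch. The example below estimates transition probabilities between traffic states from an observed sequence and predicts the most likely next state. The three states and the observation sequence are hypothetical, and a first-order chain is an assumption that ignores time of day and other context.

```python
from collections import Counter, defaultdict


def fit_transitions(states):
    """Estimate first-order transition probabilities from a sequence of states."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for current, following in counts.items()
    }


def predict_next(transitions, current):
    """Return the most probable next state, or None if the state was never observed."""
    options = transitions.get(current)
    if not options:
        return None
    return max(options, key=options.get)


if __name__ == "__main__":
    # Hypothetical traffic states observed at fixed intervals on one road segment.
    observed = ["free", "free", "free", "slow", "jam", "jam", "jam", "slow", "jam", "free"]
    transitions = fit_transitions(observed)
    print(transitions)
    print(predict_next(transitions, "slow"))
```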

Application Module

Predicted results can be used to support many applications that improve the quality of life in cities, including transportation planning, transportation management, and city management and planning.

Long-Term Traffic Planning

Cities have an ongoing problem with construction work. It constantly disrupts traffic and forces cities to change routing. Using data analytics, cities can now examine scenarios that could affect traffic flow.

A good example of this method in action is the Harburg district of Hamburg. The city had many construction projects planned for 2021, which required full or partial road closures. The authorities used simulation software to plan detours. The software showed that the alternative roads would end up clogged, so they decided to start the construction projects at different times.

Short-Term Traffic Planning

Cities can now use software to simulate incidents or events that affect the road network. This helps the authorities stabilize traffic by suggesting alternate routes and sharing them on the news. Hamburg is once again a good example of this strategy in use. In 2020, the environmental movement “Extinction Rebellion” blocked all access to Hamburg’s Köhlbrand bridge. However, Hamburg’s police had already simulated this kind of closure and were able to respond quickly.

Final Word

Big Data and predictive analytics are already having an impact on the mobility industry. At the same time, we are witnessing exponential growth in global data, driven by emerging technologies such as automated vehicles, drones, automated aerial vehicles, and robotic deliveries. Once these technologies become the new standard, we will surely see the true power of Big Data and predictive analytics.

About the author


Leandro Nesi

Data Scientist at 2hire

Less is more. I look for science and numbers in all my interests and emotions with the people I surround myself with.