Ciarán O’Mara
As a generation that relies heavily on road infrastructure as our main medium of transport, we are constantly faced with congested and inefficient road networks. Most research in this area has approached the problem in what could be described as a one-dimensional manner, attempting to route traffic more efficiently using smart traffic lights, dynamic routing algorithms, etc. Although successful in some cases, this approach has yielded little return for road networks like that of the University of Limerick, which has small-scale infrastructure but large volumes of traffic.
Data and its interpretation have become increasingly important in the last decade. Similarly, the term Machine Learning (ML) has become ubiquitous in modern data engineering. The ML space comes with a variety of challenges, including the computing power needed to run power-hungry deep learning algorithms, the question marks surrounding explainability, and the ethics of using these sophisticated technologies. Another challenge faced in every ML project is getting one’s hands on high-quality, diverse, and unbiased imagery datasets. In this project, a dataset collection pipeline was integrated with a traffic data analysis tool (T-DAT), outlined in Figure 1, to provide valuable insights on road networks to planning authorities.
The objective of the tool is to collect traffic data using novel, classical and machine vision techniques. The collected data is then used to train a ‘carpool classifier’ which indicates how many front seat passengers a vehicle has. Along with traffic counts, direction and density, the carpooling data can be used to assess proposals for carpooling lanes and right/left-turn-only lanes, or even for carpooling policing. The processing pipeline outlined in Figure 2 doubles as a data collection pipeline: it builds the dataset of cars on which the carpool classifier is trained. It uses a Mixture of Gaussians (MOG) background subtractor coupled with some basic denoising techniques to generate bounding boxes for passing vehicles. These bounding boxes can be combined with road mask overlays and the Simple Online Realtime Tracking (SORT) algorithm, which associates boxes between frames using bounding box intersection over union (IOU), to compute metrics such as traffic density, estimated speed, count and direction.
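To make the IOU-based association concrete, the following is a minimal sketch rather than the project's actual tracker. The `(x1, y1, x2, y2)` box convention, the function names and the `min_iou` threshold are assumptions made for illustration. The matching is also deliberately simplified: SORT itself predicts each track with a Kalman filter and solves the assignment with the Hungarian algorithm, whereas this sketch pairs boxes greedily by highest overlap.

```python
from typing import Dict, List, Tuple

# Assumed box convention for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned bounding boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    tracks: Dict[int, Box],
    detections: List[Box],
    min_iou: float = 0.3,  # hypothetical threshold, not taken from the project
) -> Dict[int, int]:
    """Map track id -> detection index, pairing greedily by highest IOU.

    Simplification: a stand-in for SORT's Kalman prediction plus
    Hungarian assignment, kept only to show how IOU drives association.
    """
    candidates = sorted(
        (
            (iou(box, det), track_id, det_idx)
            for track_id, box in tracks.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_detections = set()
    for score, track_id, det_idx in candidates:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if track_id in matches or det_idx in used_detections:
            continue
        matches[track_id] = det_idx
        used_detections.add(det_idx)
    return matches
```

Once a vehicle keeps the same track id across frames, counts, direction and estimated speed follow from how its box moves over time. Detections left unmatched would typically start new tracks.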
MOG works by using a set of training frames to generate a model of the background, against which foreground objects can then be identified. This method is extremely fast compared with deep learning approaches, but it relies on denoising to account for moving background objects such as trees. As seen in Figure 3, it also struggles to differentiate between cars when they are very close together.
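The idea of learning a background model from training frames can be sketched in a few lines. The example below is a deliberate simplification and not the MOG implementation used in the project: it keeps a single running Gaussian (mean and variance) per pixel instead of a mixture, and the class name, learning rate, threshold and variance values are all assumptions chosen for illustration.

```python
from typing import List

# Greyscale frame as rows of pixel intensities (assumed representation).
Frame = List[List[float]]


class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model (simplified stand-in for MOG)."""

    def __init__(
        self,
        first_frame: Frame,
        learning_rate: float = 0.05,   # hypothetical value
        threshold: float = 2.5,        # foreground if > 2.5 std devs from the mean
        initial_var: float = 225.0,    # hypothetical starting variance
        min_var: float = 4.0,          # floor so the model never becomes over-confident
    ) -> None:
        self.alpha = learning_rate
        self.k = threshold
        self.min_var = min_var
        self.mean = [[float(p) for p in row] for row in first_frame]
        self.var = [[initial_var for _ in row] for row in first_frame]

    def apply(self, frame: Frame) -> List[List[int]]:
        """Return a binary foreground mask and update the background model."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, pixel in enumerate(row):
                diff = pixel - self.mean[y][x]
                is_foreground = diff * diff > (self.k ** 2) * self.var[y][x]
                mask_row.append(1 if is_foreground else 0)
                if not is_foreground:
                    # Only background pixels update the model, so a passing
                    # vehicle is not absorbed into the background.
                    self.mean[y][x] += self.alpha * diff
                    new_var = self.var[y][x] + self.alpha * (diff * diff - self.var[y][x])
                    self.var[y][x] = max(self.min_var, new_var)
            mask.append(mask_row)
        return mask
```

Feeding the model a run of empty-road frames plays the role of the training frames. Afterwards, pixels that deviate strongly from the learned mean are flagged as foreground, and connected foreground regions (after denoising) would yield the vehicle bounding boxes. A true mixture keeps several Gaussians per pixel, which copes better with repetitive background motion such as swaying trees.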
Identifying passengers in the front of a car through the windscreen is a segmentation problem. Mask RCNN (Region-based Convolutional Neural Network) is a deep convolutional neural network which excels at complex instance segmentation. The backbone of this architecture consists of a feature pyramid network (FPN) which feeds into a region proposal network (RPN). Training an instance segmentation model from scratch would require a vast amount of diverse data, whereas the dataset built using the data collection pipeline contained only 3000 images. Transfer learning, summarised in Figure 4, is an approach whereby the fully connected layers of a model that has been trained on a large amount of data are stripped and trained using the new dataset. In this case, Mask RCNN was initially trained on the COCO dataset (81 classes including people).
Before training, the images need to be labelled (Figure 5) with segmentation masks: a polygon mask must be drawn over the passengers and vehicle and stored in a JSON annotations file. This is often the most time-consuming part of any ML project. Using transfer learning, impressive results can be achieved without the need to label a large amount of data.
An overall car detection accuracy of 92.2% was achieved across ten 15-minute test videos using 1000 MOG training frames. An accuracy of 82% was achieved for the carpool classifier using the transfer learning methodology, an increase in accuracy of 22% when compared to the model that was trained solely on the COCO dataset. Visualisations of the overall output of the analysis tool and carpool classifier are presented in Figures 6 and 7.
A C# dashboard (Figure 8) was developed to visualise the results, display the metrics and manage the imagery data captured. The implementation shows that traditional machine vision techniques can be blended with deep learning approaches to build a data collection tool capable of generating high-quality datasets. Furthermore, the design of the processing pipeline allows data to be processed, producing valuable insights, and at the same time collected and partially labelled using background subtraction.
This project was awarded ‘Best Application of AI in a Student Project’ at the AI Awards and ‘Best Oral Presentation in 10’ at the All-Ireland Conference of Undergraduate Research (AICUR).
Ciarán is a PhD candidate in the Centre for Research and Training in Foundations of Data Science at the University of Limerick, working under the supervision of Dr. John Nelson and Dr. Pepijn Van de Ven. Learn more about the CRT PhD programme that Ciarán is doing in our previous post: Not your traditional PhD journey….