For an autonomous car to drive safely, it must be able to predict the behavior of other road users. A research team at the Massachusetts Institute of Technology’s CSAIL, along with researchers at the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University in Beijing, has developed a new machine-learning system that could one day help driverless cars predict in real time the upcoming movements of nearby drivers, cyclists and pedestrians. Their study is titled “M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction.”
Qiao Sun, Junru Gu and Hang Zhao are the IIIS members who participated in the study, while Xin Huang and Brian Williams represented MIT.
Humans are unpredictable, which makes predicting road user behavior in urban environments very difficult. The AI solutions currently in use are too simplistic. Some assume, for example, that a pedestrian will stay on the same sidewalk and never try to cross. Others assume that pedestrians may cross and, to avoid them, simply leave the car parked. Still others predict the movements of only a single road user.
Divide and conquer for better prediction
Path prediction is widely used by intelligent driving systems to infer the future movements of nearby agents and to identify risky scenarios, which enables safe driving. According to the team, existing models excel at predicting the marginal trajectories of single agents but fall short in urban traffic, where many road users interact and the prediction space grows exponentially with their number.
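To make the exponential growth concrete, here is a minimal sketch. It assumes, purely for illustration, that each agent has a fixed number of candidate trajectories; the function name and the default of six candidates per agent are hypothetical and not taken from the study.

```python
def joint_space_size(num_agents, modes_per_agent=6):
    """Count the joint outcomes if every agent independently has
    `modes_per_agent` candidate trajectories (an illustrative assumption)."""
    return modes_per_agent ** num_agents


# Each added agent multiplies the number of joint outcomes to consider.
for n in (1, 2, 4, 8):
    print(n, joint_space_size(n))
```

With two agents there are 36 combinations to score; with eight there are more than 1.6 million, which is why enumerating every joint outcome quickly becomes impractical.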
Researchers at MIT have developed a seemingly simple solution to this complex problem: they divide a multi-agent behavior prediction problem into several small parts and then tackle each one individually, so that a computer can solve the task in real time. They called this approach M2I. Their behavior prediction framework first guesses the relationship between two road users: which car, cyclist, or pedestrian has the right of way and which agent will yield. It then uses these relationships to predict the future trajectories of several agents.
The trajectories estimated by M2I proved more accurate than those of other machine-learning models when compared with the actual traffic flow in a huge dataset compiled by the autonomous driving company Waymo. (MIT’s technique even outperformed Waymo’s recently published model.) In addition, dividing the problem into sub-problems allowed the technique to use less memory.
Xin “Cyrus” Huang, co-lead author of the study, is a graduate student in the Department of Aeronautics and Astronautics and a research assistant in the lab of Brian Williams, professor of aeronautics and astronautics and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). He states:
“It’s a very intuitive idea, but no one has fully explored it before, and it works pretty well. The simplicity is definitely a plus. We compare our model with other leading models in the field, including Waymo, the leading company in the field, and our model achieves the best performance on this challenging benchmark. This has a lot of potential for the future.”
The M2I method
In this work, the researchers exploited the underlying relationships between interacting agents. The M2I algorithm has two inputs: the past trajectories of cars, cyclists, and pedestrians interacting in a traffic environment such as a four-lane intersection, and a map with street locations, lane configurations, etc.
Using this information, a relationship predictor infers which of the two agents has the right to pass first, classifying one as a passer and the other as a yielder. Then, a prediction model, called the marginal predictor, guesses the path of the passing agent, since that agent behaves independently.
A second prediction model, known as a conditional predictor, then guesses what the yielding agent will do based on the actions of the passing agent. The system predicts a number of different trajectories for the yielder and the passer, calculates the probability of each individually, and then selects the six joint outcomes with the highest probability of occurring.
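The selection step described above can be sketched roughly as follows. This is not the authors’ code: the function and variable names are hypothetical, the trajectories are stand-in labels, and scoring a joint outcome as the product of the passer’s probability and the yielder’s conditional probability is an assumption made for illustration.

```python
def top_joint_predictions(passer_samples, yielder_given_passer, k=6):
    """Return the k most likely (probability, passer, yielder) outcomes.

    passer_samples: list of (trajectory, probability) from a marginal predictor.
    yielder_given_passer: function mapping a passer trajectory to a list of
        (trajectory, probability) pairs from a conditional predictor.
    Both predictors are placeholders supplied by the caller.
    """
    joint = []
    for passer_traj, passer_prob in passer_samples:
        for yielder_traj, yielder_prob in yielder_given_passer(passer_traj):
            # Assumed scoring rule: multiply marginal and conditional probabilities.
            joint.append((passer_prob * yielder_prob, passer_traj, yielder_traj))
    # Sort on the probability only, highest first, and keep the top k.
    joint.sort(key=lambda item: item[0], reverse=True)
    return joint[:k]


# Toy example with labels standing in for real trajectories.
passer = [("go_straight", 0.7), ("turn_left", 0.3)]

def yielder_model(passer_traj):
    if passer_traj == "go_straight":
        return [("wait", 0.8), ("creep_forward", 0.2)]
    return [("go", 0.6), ("wait", 0.4)]

for prob, p, y in top_joint_predictions(passer, yielder_model, k=3):
    print(f"{prob:.2f}  passer={p}  yielder={y}")
```

Because the yielder is predicted only after a passer trajectory is fixed, the sketch scores one pair at a time rather than enumerating every combination for all agents at once.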
The M2I method predicts the trajectories of these agents over the next eight seconds. In one example, it caused a vehicle to slow down so that a pedestrian could cross the street, and then speed up once the pedestrian had cleared the intersection. In another example, the vehicle waited until several cars had passed before turning from a side street onto a busy main road.
Tests on Waymo Open Motion Dataset
The researchers trained the models on the Waymo Open Motion Dataset, which contains millions of real-world traffic scenes involving vehicles, pedestrians and cyclists, recorded by lidar (light detection and ranging) sensors and cameras mounted on the company’s autonomous vehicles. They retained only scenes where multiple agents were involved.
They then compared the six prediction samples from each method, weighted by their confidence levels, to the actual paths taken by cars, cyclists and pedestrians in a scene. Their method was the most accurate. M2I also outperformed the baseline models on a metric known as overlap rate: if two predicted trajectories overlap, it indicates a collision. M2I had the lowest overlap rate.
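The idea behind an overlap rate can be illustrated with a simplified sketch. The benchmark’s exact definition is not given here, so this version makes an assumption: two trajectories “overlap” when the agents come closer than a fixed distance at the same time step. The function names and the `min_gap` threshold are hypothetical.

```python
import math


def trajectories_overlap(traj_a, traj_b, min_gap=1.0):
    """True if the two agents are closer than `min_gap` (assumed to be in
    meters) at any shared time step. Each trajectory is a list of (x, y)."""
    return any(math.dist(a, b) < min_gap for a, b in zip(traj_a, traj_b))


def overlap_rate(joint_predictions, min_gap=1.0):
    """Fraction of predicted trajectory pairs that overlap."""
    if not joint_predictions:
        return 0.0
    hits = sum(trajectories_overlap(a, b, min_gap) for a, b in joint_predictions)
    return hits / len(joint_predictions)


# Toy example: the first pair nearly meets at the second time step.
car = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cyclist = [(2.0, 2.0), (1.5, 0.5), (2.0, -2.0)]
pedestrian = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]
print(overlap_rate([(car, cyclist), (car, pedestrian)]))
```

A lower value means fewer predicted pairs would result in a collision, which is the sense in which a low overlap rate is desirable.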
Xin Huang states:
“Rather than simply building a more complex model to solve this problem, we took an approach that more closely resembles the way a human thinks when reasoning about interactions with others. A human does not reason about all the hundreds of combinations of future behaviors. We make decisions fairly quickly. Another advantage of M2I is that because it breaks the problem down into smaller pieces, it is easier for a user to understand the model’s decision making. In the long run, this could help users trust autonomous vehicles more.”
On the other hand, the framework can’t account for cases where two agents influence each other, such as when two vehicles each move forward at a four-way stop because the drivers don’t know who should yield. The team plans to address this limitation in future work. In addition, they hope to use their method to simulate realistic interactions between road users, which will help verify planning algorithms for autonomous cars or create huge amounts of synthetic driving data to improve model performance.
Article source: “M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction” by Qiao Sun, Xin Huang, Junru Gu, Brian C. Williams and Hang Zhao. 28 March 2022, Computer Science Robotics.