What Is the Most Effective Application of Large Models in Autonomous Driving?

Author | Harper ZhuoInterview | Echo TangEditor | Monica HeProduced | GOSIM

Ming Cen holds a Ph.D. in Optical Engineering from the Institute of Optics and Electronics, Chinese Academy of Sciences. He previously served as the Head of the Department of Automation and Robotics Engineering at Chongqing University of Posts and Telecommunications and as a Software Systems Analyst and Deputy Director at the Automotive Electronics and Embedded Systems R&D Center. His research spans information fusion, multi-target tracking, autonomous driving, and intelligent robotics, making him one of China's leading researchers in automated driving.

From electrical automation to information automation and now intelligent automation, Ming Cen has witnessed and driven technological leaps in automation. He emphasizes that the ultimate goal of technology is to achieve full user-friendliness. Exploring how the combination of autonomous driving and large models can enhance vehicles' perception and knowledge, he aims to break down the silos of functional autonomous vehicles and tackle the "last mile" challenges across various scenarios.

He firmly believes that modularization is the key to cost-effective development, leveraging task modeling and standardization of common functionalities. As a dedicated advocate of the open-source community, he understands that the true power of technology lies in sharing and collaboration—AI and open-source communities must work together for the future.

GOSIM is honored to have invited Professor Ming Cen to share his insights with technologists worldwide. At the GOSIM CHINA 2024 conference, he delivered a compelling keynote on "Task Modeling Methods for Functional Autonomous Vehicles." Following the event, the GOSIM Dialogue series Open AGI Forum hosted an exclusive conversation between him and Echo Tang, Executive Editor-in-Chief of CSDN’s New Programmer magazine, to discuss the evolution of automation and dive deep into topics such as large models in autonomous driving, vehicle safety, and modular programming.

Highlights of the Conversation:

The most immediate and impactful application of large models in autonomous driving is simulation.
Combining human expertise with large models allows for a hybrid approach. When models are less mature, rules contribute 60%, while models account for 40%. As models mature, rules become advisory, with decisions increasingly driven by AI.
Technological iterations provide new solutions, but the value of accumulated experience remains undeniable.
Functional autonomous vehicles can break automation barriers and solve last-mile logistics challenges.
Safety technology is not a bottleneck—safety performance in autonomous driving is positively correlated with cost.
For a technology to mature and achieve mass adoption, it must become user-friendly. Large models are now enabling everyone to become a designer.

1.A Decade of Automation: From “Electric” to “Intelligent”

Echo Tang: You majored in Optical Engineering, yet you have spent years focusing on automation and system integration. How did you embark on this path?

Ming Cen: Optical Engineering isn’t just about image processing or mirrors. A significant field within optics is beam control. For instance, when an optical device needs to aim at or track a target, and that target is moving, how can we precisely identify and extract it? This involves control and signal processing, which are fundamentally automation topics. That’s why beam control and automation are deeply interconnected.

Echo Tang: The Department of Automation and Robotics Engineering at Chongqing University of Posts and Telecommunications (CQUPT) is well known. Could you introduce its key features?

Ming Cen: The department was originally called the Department of Automation. It was the first and strongest program established under the School of Automation. As the field expanded, we introduced new disciplines like Vehicle Engineering and Robotics, leading to adjustments in the department's name.

In its early days, automation was primarily centered on traditional electrical automation, where computers played a crucial role in control systems. However, electrical-based automation systems struggled with adaptability in changing environments. To improve flexibility, we leveraged software to replace certain hardware functions, allowing real-time updates via software refreshes. This approach not only reduced costs but also significantly enhanced control system flexibility and functionality.

With the rise of artificial intelligence, information automation alone was no longer sufficient. Our department gradually shifted from traditional automation to intelligent automation. Automation has undergone three key phases: electrical automation → information automation → intelligent automation—a natural evolution. Whether in robotics or vehicles, we’ve seen a transformation from traditional to intelligent systems. Industrial robots are evolving into intelligent robots, and vehicles have progressed from electrification to intelligence, now reaching autonomous driving. Over the past decade, I have not only witnessed but also actively participated in this transformation.

Echo Tang: In autonomous driving and robotics, embodied intelligence has become a buzzword, especially with the rise of large models. Many entrepreneurs are now venturing into this field, seeing it as a crucial path toward AGI. What are your thoughts?

Ming Cen: Many fields overlap significantly. Autonomous driving, for instance, has always been a fusion of cameras, radars, inertial navigation, and control systems. When designing an autonomous vehicle, what if we add robotic arms? Could it perform transportation tasks? What if we integrate cameras? Could it function as a patrol vehicle? If we install a cleaning system, does it become an autonomous cleaning vehicle?

At this point, we start questioning: Is it still a car? Is it a vehicle or a robot? From an automotive perspective, these are functional autonomous vehicles. From a robotics standpoint, they could be considered intelligent robots or mobile robots. The same entity, when viewed from different angles, takes on different identities.

Embodied intelligence is a term that differentiates physical robots from chatbots or disembodied AI. Unlike pure conversational AI, modern robots integrate mobility, perception, and intelligence. Compared to purely virtual intelligent agents, robots with physical form, mobility, and sensors are referred to as embodied intelligence. However, fundamentally, there’s no real difference—many of these distinctions are just a matter of terminology.

2.Large Models Empower Vehicles to "See More and Know More"

Echo Tang: Have you considered how autonomous driving can be integrated with large models?

Ming Cen: In the short term, I believe the most immediately effective application of large models is simulation. Why aren’t smart vehicles intelligent enough? Because they haven’t seen enough. These cars operate in normal weather and road conditions 99% of the time. They rarely encounter unusual scenarios—such as extremely steep mountain roads, severe weather, or heavy congestion. If we rely on real-world data collection, it would take a long time to gather even a small amount of data.

Vehicles trained on existing onboard data perform well on normal roads, but if suddenly placed on a mountain road, the system fails. Since these cars are primarily trained on sunny-day data, they don’t know how to handle extreme conditions.

How can we generate data to ensure vehicles experience a complete range of scenarios? We can collect real-world scene data, but some conditions are difficult to capture due to high costs and limited opportunities. This is where intelligence comes in to generate data. Manufacturers in flatland cities lack frequent access to mountain roads, but we can record a mountain road, have the large model learn from it, and generate various unusual mountain road scenarios. This approach improves efficiency, reduces costs, and allows AI to generate data that cannot be easily collected. By filling in missing scenarios, vehicles achieve broader coverage, making them smarter than conventional ones. As of now, this is a practical and viable solution that can be implemented immediately.

Some funded projects in Chongqing have already begun exploring this area. Automakers are developing generative simulation systems to test their feasibility for vehicle training. Many datasets are repetitive, but I hope to generate data that vehicles have never encountered—data that cannot be collected through traditional means can be created through simulation.

The quality of simulated data still requires further research. While the concept is clear, there is still much work to be done for implementation. However, since the data is system-generated rather than directly controlling the vehicle, the risks and reliability concerns are relatively manageable. Using a simulation-based approach to integrate large models is both feasible and reliable.

Another approach is full end-to-end implementation. However, when can such large models be considered sufficiently trained? Once they become a complete black box, people feel uneasy—we don’t know if they truly work as expected.

When large models are proposed as a way to build reliable autonomous driving systems, the actual results do not necessarily increase trust. Therefore, we can incorporate human experience-based rules. However, we also need to consider whether these rules are too rigid. Should we integrate rules such as left and right turns into the model? These rules wouldn’t be strict constraints but weighted factors in the system. When the model is less mature, rules could account for 60%, while the model accounts for 40%. As the model matures, the rules become more advisory, relying more on the model’s decisions. The implementation should be a hybrid approach that adjusts based on maturity.

Of course, this raises further challenges. When these two sets of rules conflict, how do we resolve them? This is how technological iteration works—no technology is perfect. Most provide great problem-solving approaches, but that doesn’t mean they can replace all past accumulated knowledge.

Echo Tang: If all parameters are properly set, can the system handle general scenarios?

**Ming Cen: **We need to separate instruction parameters to form a formalized task definition. Task definitions should be editable, allowing instructions to be passed to an executor. The vehicle then reads, parses, and executes the instructions.

The goal is to abstract specific development details into descriptive information. From the user’s perspective, they only need to provide descriptive inputs, like communicating with a person. Essentially, control and perception tasks are hidden, while task requirements are shared and executed by the system, with parameters being fully adjustable.

However, another challenge is that even writing script definitions can be tedious. Can we make it as intuitive as editing a game map? For example, import a city map, mark the start and endpoints, and have the system automatically generate a route for the vehicle—this would be a much simpler process. The goal is to continuously lower the entry barrier, enabling more people to participate in and use this open tool. If route marking on a map replaces scripting, even users without coding knowledge can operate the system, further reducing complexity. For a technology to mature and be widely adopted, it must become more user-friendly—what is simplified becomes mainstream. Our ultimate goal in development is maximum accessibility. Now, large models enable everyone to become designers.

Echo Tang: Can vehicles autonomously build maps by collecting sensor data and modeling their surroundings?

Ming Cen: This approach requires high-level planning. Essentially, the key is to have a map. Each vehicle follows assigned tasks based on planned routes. This inevitably requires cloud-based coordination and management.

3.Functional Autonomous Vehicles Break Automation Barriers, Solving the "Last Mile" in Logistics

Echo Tang: There may already be discussions and research in Chongqing on this topic. The city’s robotics and intelligent connected vehicle industries are among the most advanced in China. Are the functional autonomous vehicles you mentioned frequently tested in industrial parks?

Ming Cen: We have been working on heavy trucks and commercial vehicles, not just simple mobility solutions. Our goal is not to replace human labor entirely but to reduce workload and free up manpower. We previously undertook a special-purpose autonomous vehicle project in Chongqing, focusing on a heavy truck. Since trucks are not allowed to operate on city roads, they mainly run within industrial parks, eventually evolving into functional autonomous vehicles such as logistics carriers. Their core mission is to solve the "last mile" challenge in logistics.

Smart manufacturing demands zero inventory in production workshops: customer orders must be fulfilled quickly, and finished goods must be transported immediately, eliminating the need for intermediate storage. By deploying functional autonomous vehicles, we integrate warehouse management into the logistics process, providing point-to-point short-distance logistics services for logistics providers, completing the final leg of transportation.

While warehouses and workshops have achieved automation, the connection between them still relies on manually driven vehicles, creating an inefficiency. Functional autonomous vehicles bridge this gap between these two automation "islands." When an order is placed in the backend ERP system, the production scheduling system issues a command, and the vehicle dispatch system initiates the corresponding transport task. Previously, drivers manually operated vehicles and carried paper orders. Now, in theory, drivers are no longer needed, though safety officers remain on standby to intervene in emergencies.

After several years of operation, our partners have reported that safety officers may only need to step in once every few months. This proves the core value of functional autonomous vehicles—significantly reducing labor-intensive tasks. More importantly, they successfully break down the automation barrier between warehouses and production workshops, integrating two previously isolated automated systems into a larger, more efficient automation network, achieving seamless coordination from storage to manufacturing.

Echo Tang: When it comes to automation, many people’s first concern is job displacement. Recently, the U.S. saw a large-scale port worker strike due to fears of automation-induced job losses. How do you consider this issue while advancing automation?

Ming Cen: The trend toward autonomous driving is unstoppable. The improvements in our lives today result from a series of technological revolutions. On an individual level, people should proactively adapt to these changes and embrace automation.

4.Balancing Safety and Cost in Autonomous Driving

Echo Tang: Can you share any real-world applications of autonomous driving in industrial parks?

Ming Cen: Initially, our projects aimed to help students gain technical experience. As opportunities arose, we expanded our research and eventually launched an autonomous heavy truck project.

At that time, we were confident in tackling popular industrial park tasks like patrolling and material transport. However, one proposal left a deep impression on me—a park wanted to introduce an autonomous vehicle to attract tourists. Despite strong recommendations from others, I had reservations.

In a factory setting, employees are trained professionals who follow rules. But in a park, the situation is completely different—there are many visitors, including children whose curiosity knows no bounds. We invested significant effort in communicating with manufacturers to increase the number of sensors. During trials, children often gathered around the vehicle, some even trying to touch it, reinforcing the necessity of additional sensors.

Autonomous vehicle safety technology itself is not the issue. The real problem lies in manufacturers cutting sensor costs. Many accidents occur because manufacturers reduce sensor configurations excessively to lower expenses. If sensor counts were tripled, I believe the risk of accidents would decrease significantly. With technological advancements, driving risks have been somewhat mitigated, but if manufacturers prioritize safety over cost, vehicles could theoretically achieve near-complete safety.

Echo Tang: Regarding the safety of functional autonomous vehicles, what issues can be resolved, and what critical concerns should we focus on?

Ming Cen: Safety is a broad topic, and most people first think of cybersecurity. For instance, if an autonomous system takes control of a vehicle, people worry that a system failure could send the vehicle into a dangerous area—like driving into a river—with no way for the driver to intervene. This fear stems from concerns about hacking or malicious attacks, which fall under cybersecurity.

Another major concern is functional safety, which ensures that the system operates without failures or crashes. Functional safety is categorized into different levels, including redundancy design and fail-safe mechanisms. The key is to prevent complete system failure by enabling gradual performance degradation rather than abrupt breakdowns.

Functional safety further divides into software and hardware safety. Hardware safety focuses on circuit reliability, such as ensuring the system continues functioning even if a software module crashes and stops responding. Can the system detect such failures? This is where redundancy in software functionality and control mechanisms comes into play.

Moreover, when multiple system nodes communicate, the system should detect if a module stops responding within a given timeframe. At that point, it must decide how to handle the situation—whether to notify other nodes, slow down, stop operations, or send an alert for human intervention. These are critical safety strategies in real-world applications.

Echo Tang: You've done extensive research on cost-effective development. Can you elaborate on how to achieve efficiency while reducing costs?

**Ming Cen: **Developing all relevant software is incredibly expensive. After completing the heavy truck project, we looked for ways to reuse existing work in future projects. The key is identifying reusable components and maximizing efficiency in their integration.

We adopted a task modeling approach, breaking projects into smaller modules, extracting essential functions across different vehicle types, and standardizing common features. Components that have been repeatedly tested and proven reliable can be directly reused, significantly reducing redundant coding.

A well-designed architecture is crucial for improving efficiency, reducing costs, and enhancing reliability. Once the architecture is in place, whether it's students or developers, they can focus solely on their modules without needing to understand the entire system in depth.

Echo Tang: You're referring to a modular development approach?

Ming Cen: Exactly—creating a standardized framework so that developers can concentrate on their specific tasks. Human attention is limited, and focusing on one’s structure within a defined framework is the most efficient way to develop. GOSIM focuses on development, and within development, the top priorities are efficiency, reliability, and long-term sustainability.

5.Open Source and Collaborative Intelligence Driving Innovation

Echo Tang: Is your current research open-source?

Ming Cen: Right now, our work is still in its early stages, but we plan to open-source it in the future. If our research gains recognition, sees widespread adoption, and even inspires further exploration by others, that would be incredibly meaningful. That's why we need to embrace openness—by making our tools open-source, we can attract more support and accelerate technological progress on a larger scale.

The advantage of open source is that it enables more people to contribute to a project. There may be potential issues in our development that we haven't noticed, but by open-sourcing our work, other developers can help identify and improve these areas. Additionally, open source fosters innovation, allowing new ideas to emerge and integrate into the project.

Our individual capabilities are limited, but through open source, we can bring more people into the fold, collectively advancing the project and expanding its impact. This not only benefits the technology itself but also contributes positively to the broader community and industry.

Echo Tang: For students and developers working in automation and robotics today, what core skills do you think they need to succeed in the current tech landscape?

Ming Cen: The open-source community already provides an excellent learning environment. Take platforms like CSDN, for example—when I first started as a programmer, I often encountered unfamiliar problems. Searching on CSDN would connect me with insights from experienced professionals. Later, various developer communities emerged, offering similar Q&A platforms. The programming community is full of passionate, sharing-oriented individuals.

Echo Tang: With the rise of large AI models, many people now turn to them for answers. How do you see this shift impacting the way developers work?

Ming Cen: AI-generated answers aren’t always correct, so users need some level of critical thinking to assess the responses. In a way, AI could complement developer communities like CSDN. Ideally, we can move toward a model of collaborative intelligence, where AI provides initial insights and community discussions help refine and validate those ideas.

Echo Tang: Thank you, Professor Cen, for this insightful conversation! That wraps up today’s discussion。see you next time!

What Is the Most Effective Application of Large Models in Autonomous Driving? | Open AGI Forum

1.A Decade of Automation: From “Electric” to “Intelligent”

2.Large Models Empower Vehicles to "See More and Know More"

3.Functional Autonomous Vehicles Break Automation Barriers, Solving the "Last Mile" in Logistics

4.Balancing Safety and Cost in Autonomous Driving

5.Open Source and Collaborative Intelligence Driving Innovation

Table Of Content