

1 Pedestrian behavior and intent modeling and prediction







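  • As with driving behavior, uncertainty and multimodality are the main challenges in modeling pedestrian behavior;
  • Most methods use recurrent neural networks, such as RNN/LSTM/GRU models;
  • Some use generative adversarial networks (GANs), for example Social GAN and Social Ways;
  • Some use reinforcement learning (RL); RL and GANs are themselves related;
  • Most include a model of interaction with the environment, whether local or global; this captures the social nature of pedestrians;
  • Social behavior is modeled differently for groups (grouping) and for individual pedestrians;
  • Some use attention mechanisms, for example Social Attention and SoPhie;
  • Some aim to understand pedestrian intent and the type of pedestrian activity.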

2 Vehicle behavior and intent modeling and prediction


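Driver behavior modeling (DBM) aims to predict driving maneuvers, the driver's intentions, and environmental factors. As shown in the figure below, data from various sensors and from the on-board controllers' CAN bus serve as input, preprocessing algorithms filter the data, and the resulting prediction models are then supplied to various applications.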



(1) See the survey "Deep Learning-based Vehicle Behaviour Prediction For Autonomous Driving Applications: A Review", uploaded to arXiv on December 25, 2019 by UK university researchers.




3 Motion prediction

(1) Summary of mechanisms for modeling interactions among participants

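Scientifically, motion prediction is useful for understanding human behavior and motion dynamics. One of the fundamental challenges of this task is modeling scene constraints, especially the hidden interactions among actors. For example, in driving scenarios, traffic participants (such as vehicles and pedestrians), traffic conditions, and traffic rules all influence one another, as shown in the figure below.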


Source: "Collaborative Motion Prediction via Neural Motion Message Passing"

Following the discussion in the paper "Collaborative Motion Prediction via Neural Motion Message Passing", there are three kinds of mechanisms for modeling the hidden interactions among traffic actors:

  • Spatial-centric mechanism


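Social Conv^[1]^ and MATF^[2]^ exploit the spatial structure of traffic actors to learn interactions; ChauffeurNet^[3]^ and Motion Prediction^[4]^ encode the actors' trajectories and the scene context as bird's-eye-view images; FMNet^[5]^ uses a lightweight CNN for real-time inference; IntentNet^[6]^ combines LiDAR data with images.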
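To make the bird's-eye-view encoding concrete, here is a minimal sketch that rasterizes past positions into a small occupancy grid. It is only a toy illustration, not the rendering used by any of the cited systems: the function name `rasterize_bev`, the grid size, the cell size, and the fading rule for older positions are all assumptions made for this example.

```python
def rasterize_bev(trajectories, grid_size=8, cell_m=1.0):
    """Render past positions into a square bird's-eye-view grid.

    trajectories: list of trajectories, each a list of (x, y) positions in
    metres, expressed in an ego-centred frame (ego at the grid centre).
    Returns a grid_size x grid_size list of lists with values in [0, 1].
    """
    half = grid_size * cell_m / 2.0
    grid = [[0.0] * grid_size for _ in range(grid_size)]
    for traj in trajectories:
        n = len(traj)
        for t, (x, y) in enumerate(traj):
            col = int((x + half) // cell_m)
            row = int((half - y) // cell_m)  # +y points up, row 0 is the top
            if 0 <= row < grid_size and 0 <= col < grid_size:
                # Assumed convention: older positions are drawn fainter.
                intensity = (t + 1) / n
                grid[row][col] = max(grid[row][col], intensity)
    return grid


# Two actors: one moving along +x, one stationary near the ego.
past = [[(-3.5, 0.5), (-2.5, 0.5)], [(0.5, 0.5)]]
bev = rasterize_bev(past)
```

In a real pipeline a grid like this would typically be one channel among several (map, lanes, other actors) fed to a CNN; the single channel here only shows how trajectories become an image.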

  • Social mechanism

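It aggregates the information of neighboring traffic actors into a social representation and broadcasts it to every actor, so that each actor is aware of its neighborhood. For example: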

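Social LSTM^[7]^ applies max pooling over neighboring traffic actors; to account for long-range interactions, Social GAN^[8]^ applies max pooling over all actors; CIDNN^[9]^ uses the inner product between the prior location embeddings of traffic actors. However, max pooling ignores the uniqueness of each actor, and the inner product treats all actors equally. Attention operations^[10,11]^ have therefore been introduced so that actors can focus on the key influencing factors. Inevitably, though, attention comes with increased computational complexity.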
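The contrast between max pooling and attention can be sketched in a few lines. This is a simplified illustration with hand-written feature vectors and no learned parameters, not the pooling modules of the cited papers; the helper names `max_pool` and `attention_pool` and the use of plain dot-product scores are assumptions for this example.

```python
import math


def max_pool(neighbors):
    """Element-wise max over neighbor feature vectors.

    The result does not record which neighbor contributed which value.
    """
    return [max(column) for column in zip(*neighbors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(query, neighbors):
    """Weighted sum of neighbor features.

    Weights are a softmax over query-neighbor dot products (an assumed,
    parameter-free scoring rule), so different neighbors get different weights.
    """
    scores = [sum(q * v for q, v in zip(query, n)) for n in neighbors]
    weights = softmax(scores)
    dim = len(neighbors[0])
    pooled = [sum(w * n[d] for w, n in zip(weights, neighbors))
              for d in range(dim)]
    return pooled, weights


neighbors = [[1.0, 0.0], [0.0, 2.0]]
pooled_max = max_pool(neighbors)
pooled_att, weights = attention_pool([10.0, 0.0], neighbors)
```

With the query `[10.0, 0.0]`, almost all attention weight goes to the first neighbor, whereas max pooling returns the same vector regardless of the query. The extra cost is one score per (actor, neighbor) pair plus the weighted sum.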

  • Graph-based mechanism


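Social-BiGAT^[12]^ learns a global embedding based on a graph attention network (GAT) to represent the interactions in a scene. Social Attention^[13]^ and STGAT^[14]^ capture how dynamic interactions change over time by using a spatio-temporal graph and an LSTM, respectively.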
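As a rough sketch of the graph-based idea, the snippet below builds a proximity graph for one time step and runs a single attention pass restricted to graph neighbors. It is deliberately simplified: a real GAT uses learned linear projections and a learned scoring function, while this example substitutes raw dot products. The function names `build_edges` and `graph_attention` and the distance-threshold rule for edges are assumptions for illustration.

```python
import math


def build_edges(positions, radius):
    """Connect agents whose Euclidean distance is at most `radius`.

    positions: list of (x, y) tuples for one time step.
    Returns a dict mapping each node index to its list of neighbor indices.
    """
    n = len(positions)
    return {
        i: [j for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= radius]
        for i in range(n)
    }


def graph_attention(features, edges):
    """One parameter-free attention pass over the graph.

    Each node attends to its graph neighbors and itself (self-loop); the
    weights are a softmax over dot-product scores.
    """
    updated = []
    for i, h in enumerate(features):
        nodes = edges[i] + [i]
        scores = [sum(a * b for a, b in zip(h, features[j])) for j in nodes]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        updated.append([
            sum((e / z) * features[j][d] for e, j in zip(exps, nodes))
            for d in range(len(h))
        ])
    return updated


positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
edges = build_edges(positions, radius=2.0)
new_features = graph_attention(features, edges)
```

Here the third agent is too far away to be connected, so its feature is unchanged, while the first two agents mix information. To follow interactions over time, the graph would be rebuilt at every time step and the per-step outputs passed to a sequence model such as an LSTM.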

[1] Nachiket Deo and Mohan M Trivedi. Convolutional social pooling for vehicle trajectory prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1468–1476, 2018.

[2] Tianyang Zhao, Yifei Xu, Mathew Monfort, Wongun Choi, Chris Baker, Yibiao Zhao, Yizhou Wang, and Ying Nian Wu. Multi-agent tensor fusion for contextual trajectory prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 12126–12134, 2019.

[3] Mayank Bansal, Alex Krizhevsky, and Abhijit Ogale. Chauffeurnet: Learning to drive by imitating the best and synthesizing the worst. arXiv preprint arXiv:1812.03079, 2018.

[4] Nemanja Djuric, Vladan Radosavljevic, Henggang Cui, Thi Nguyen, Fang-Chieh Chou, Tsung-Han Lin, and Jeff Schneider. Short-term motion prediction of traffic actors for autonomous driving using deep convolutional networks. arXiv preprint arXiv:1808.05819, 2018.

[5] Fang-Chieh Chou, Tsung-Han Lin, Henggang Cui, Vladan Radosavljevic, Thi Nguyen, Tzu-Kuo Huang, Matthew Niedoba, Jeff Schneider, and Nemanja Djuric. Predicting motion of vulnerable road users using high-definition maps and efficient convnets. arXiv preprint arXiv:1906.08469, 2019.

[6] Sergio Casas, Wenjie Luo, and Raquel Urtasun. Intentnet: Learning to predict intention from raw sensor data. In Conference on Robot Learning, pages 947–956, 2018.

[7] Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, and Silvio Savarese. Social lstm: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 961–971, 2016.

[8] Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. Social gan: Socially acceptable trajectories with generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2255–2264, 2018.

[9] Yanyu Xu, Zhixin Piao, and Shenghua Gao. Encoding crowd interaction with deep neural network for pedestrian trajectory prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5275–5284, 2018.

[10] Anirudh Vemula, Katharina Muelling, and Jean Oh. Social attention: Modeling attention in human crowds. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1–7. IEEE, 2018.

[11] Amir Sadeghian, Vineet Kosaraju, Ali Sadeghian, Noriaki Hirose, Hamid Rezatofighi, and Silvio Savarese. Sophie: An attentive gan for predicting paths compliant to social and physical constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1349–1358, 2019.

[12] Vineet Kosaraju, Amir Sadeghian, Roberto Martín-Martín, Ian Reid, S Hamid Rezatofighi, and Silvio Savarese. Social-bigat: Multimodal trajectory forecasting using bicycle-gan and graph attention networks. In Advances in Neural Information Processing Systems (NeurIPS) 32, 2019.

[13] Anirudh Vemula, Katharina Muelling, and Jean Oh. Social attention: Modeling attention in human crowds. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1–7. IEEE, 2018.

[14] Yingfan Huang, HuiKun Bi, Zhaoxin Li, Tianlu Mao, and Zhaoqi Wang. Stgat: Modeling spatial-temporal interactions for human trajectory prediction. In International Conference on Computer Vision (ICCV), 2019.
