CNCC | Multimedia Content Understanding and Generation: A Conversation between Technology and Applications

CNCC2024

Brief introduction of the forum:

Multimedia Content Understanding and Generation: A Conversation between Technology and Applications

Time: 13:30-17:30, October 24

Venue: Autumn Garden - Classroom Area (West 2)

Note: If there is any change, please refer to the final information on the official website (https://ccf.org.cn/cncc2024).

With the rapid development of the mobile Internet and artificial intelligence, multimedia content understanding and generation has become an active frontier of research. Continuing technical breakthroughs are driving the intelligent processing and innovative application of multimedia data, bringing new opportunities and challenges to both industry and academia.

Generative AI has shown great potential in multimedia content generation, and its interdisciplinary applications have produced notable results, while Riemannian manifold generation based on geometric deep learning has extended generative models to non-Euclidean spaces. For multimedia content understanding, knowledge-guided continual learning offers new ways to avoid catastrophic forgetting of old knowledge while learning new knowledge, and the rise of general multimodal learning opens new possibilities for a comprehensive understanding of multimodal data.

This forum will focus on the key technologies of multimedia content understanding and generation. It will discuss in depth the interdisciplinary applications of generative AI, the latest progress of geometric deep learning in manifold generation, LiDAR visual global positioning, and the breakthroughs and challenges of multimodal learning under a unified architecture. The aim is to identify new research directions and application scenarios for multimedia content generation and understanding and to promote continued technical development and innovation.

Forum Agenda

No. | Topic | Speaker | Affiliation
1 | Opening remarks | Gao Zan | Shandong Provincial Institute of Artificial Intelligence
2 | Special guest address | Huang Qingming | University of Chinese Academy of Sciences
3 | Knowledge-Guided Continual Learning Methods | Li Hongliang | University of Electronic Science and Technology of China
4 | LiDAR Vision Global Positioning | Wang Cheng | Xiamen University
5 | General Multimodal Learning | Yu Jun | Harbin Institute of Technology (Shenzhen)
6 | Generative AI and Its Multidisciplinary Applications | Yan Bo | Fudan University
7 | Riemannian Manifold Generation Based on Geometric Deep Learning | Zou Junni | Shanghai Jiao Tong University
  | Panel discussion | Huang Tiejun | Peking University
  |  | Tao Jianhua | Tsinghua University
  |  | Wang Xun | Zhejiang Gongshang University
  |  | Zhou Jie | Tsinghua University
  |  | Zhang Yanning | Northwestern Polytechnical University
  |  | Nie Liqiang | Harbin Institute of Technology (Shenzhen)

Forum Chair and Guests

Chair of the Forum

Gao Zan

Professor and Vice Dean of Shandong Institute of Artificial Intelligence/Tianjin University of Technology

He is a senior member of CCF, a selectee of the National Youth Talent Program, a Shandong Province Expert with Outstanding Contributions, and deputy director of the Ministry of Education Key Laboratory of Computer Vision and Systems. In recent years, he has led or participated in more than 20 projects at or above the provincial and ministerial level, including projects funded by the National Natural Science Foundation of China and the National Key R&D Program. He has published more than 100 papers in international conferences and journals such as TPAMI and CVPR, including 6 ESI highly cited papers and 1 hot paper. In 2021, he won the Best Student Paper Award at SIGIR, a CCF Class A conference, and he holds more than 50 granted invention patents.

Special guests

Huang Qingming

CCF member and director; Director of the Multimedia Technology Professional Committee; Professor, University of Chinese Academy of Sciences

He is a Chair Professor at the University of Chinese Academy of Sciences, a recipient of the National Science Fund for Distinguished Young Scholars, and a recipient of the State Council Special Government Allowance. His main research directions are multimedia computing, pattern recognition, and machine learning. He has led national, provincial, and ministerial projects, including the National Science and Technology Major Project on New Generation Artificial Intelligence, Key Projects and Key International Cooperation Projects of the National Natural Science Foundation of China, the 863 Program, and the 973 Program, and has published more than 600 papers in leading journals and international conferences in China and abroad. He has won many national and provincial awards, including the first prize of the Wu Wenjun Artificial Intelligence Natural Science Award, the first prize of the CSIG Natural Science Award, and the first prize of the Science and Technology Progress Award of the Ministry of Education.

Forum Speakers

Li Hongliang

Professor, University of Electronic Science and Technology of China

He is a recipient of the National Science Fund for Distinguished Young Scholars. His main research areas include multimedia understanding and analysis, visual perception models, and machine learning, and he has published more than 70 IEEE Transactions papers. He has led a major project of the Ministry of Science and Technology under the 2030 New Generation Artificial Intelligence program, as well as key projects and joint fund key projects of the National Natural Science Foundation of China. He has served on the editorial boards of Acta Electronica Sinica, IEEE TCSVT (2018-2021), JVCI, and other journals. He was chair of the IEEE ISPACS 2010 conference and technical committee chair of IEEE VCIP 2016 and PCM 2017. He has received the JVCI Outstanding Service Award and the IEEE TCSVT Best Editorial Board Award, and has been named a CCF and IEEE CAS Outstanding Speaker.

Title: Knowledge-Guided Continual Learning Methods

Report Summary: Learning new knowledge while limiting catastrophic forgetting of old knowledge is a central challenge in continual learning. This report focuses on continual learning in visual tasks and introduces recent related work, from multi-teacher continual learning mechanisms to the implementation of continual learning models in specific visual tasks. It closes with a brief discussion of open problems in continual learning for visual tasks.

Wang Cheng

Professor, Xiamen University

He is a distinguished member of CCF, a recipient of the National Talent Program Fund, a leading talent in scientific and technological innovation under the National "Ten Thousand Talents Program", and an IET Fellow. He is currently the director of the Fujian Provincial Key Laboratory of Smart City Perception and Computing. His research interests include 3D computer vision, LiDAR data processing, intelligent remote sensing processing, spatial big data analysis, and smart cities. He has published more than 300 papers in top journals and conferences such as Nature Communications, IEEE TGRS, and CVPR, and his work has been cited more than 10,000 times. He is chair of the ISPRS Multi-sensor Integration and Fusion Working Group, founding chair of the CCF YOCSEF Xiamen sub-forum, and executive director of CSIG. He has won 5 awards, including the first prize of provincial and ministerial scientific and technological progress awards.

Title: LiDAR Vision Global Positioning

Report Summary: Global positioning is at the core of the digital economy, but complex urban environments limit the use of satellite positioning. With its accurate 3D perception capability, 3D laser scanning is emerging as a promising new approach to urban positioning. This report will introduce the research progress of the ASC laboratory of Xiamen University in LiDAR visual global positioning. It will first explain the basic principle of LiDAR visual localization based on implicit representation. It will then introduce efficient positioning methods, from depth regression to geometric coding, and present the world's first large-range LiDAR global positioning results with sub-meter accuracy. Finally, it will summarize the work and look ahead to future trends.

Yu Jun

Professor, Harbin Institute of Technology (Shenzhen)

He is a recipient of the National Science Fund for Distinguished Young Scholars. His main research direction is cross-media analysis. He has published more than 100 IEEE/ACM Transactions and CCF Class A papers, with more than 10,000 Google Scholar citations, and more than 10 of his papers have been selected as ESI highly cited or hot papers. In recent years, he has led projects under the Key R&D Program of the Ministry of Science and Technology, as well as key projects and general projects of the National Natural Science Foundation of China. He won the IEEE TMM, TIP, and TCYB best paper awards in 2015, 2016, and 2017, the second prize in natural science of the Ministry of Education in 2018, and the first prize in natural science of Zhejiang Province in 2021. He has served as an associate editor for several international journals.

Title: General Multimodal Learning

Report Summary: Thanks to the rapid development of the Transformer, a deep self-attention network model, and the BERT pre-training method in natural language processing, research on multimodal deep learning has gradually evolved from a "divide and conquer" approach to a general, unified treatment of all tasks. This report first briefly reviews representative work in the development of multimodal deep learning. It then introduces in detail three representative methods in general multimodal deep learning: multimodal multi-task joint learning, multimodal neural architecture search, and multimodal pre-training. Finally, it reflects on the field and looks ahead to the future development of general multimodal deep learning.

Yan Bo

Professor, Fudan University

He is a Changjiang Scholar of the Ministry of Education, Deputy Director of the Development Planning Office of Fudan University, Deputy Director of the Academic Committee of the School of Computer Science and Technology, and Vice Chairman of the Shanghai Society of Image and Graphics. His research interests include computer vision, smart medicine, and scientific intelligence. He has published more than 70 papers as first or corresponding author in international journals and conferences such as Nature Methods. His research has been funded by a number of provincial, ministerial, and enterprise cooperation projects, such as the National Joint Key Project and the Huawei Foundation. The results have been applied in Huawei's flagship mobile phones, public security systems, and tertiary hospitals. He won the second prize of the Natural Science Award of the Ministry of Education in 2020 and the second prize of the CSIG Science and Technology Award in 2019.

Title: Generative Artificial Intelligence and Its Multidisciplinary Applications

Report Summary: With continuing breakthroughs in large model technology, generative AI has demonstrated powerful generative capabilities. This report will mainly introduce the team's ongoing work on generative AI and the research results achieved in scenarios such as intelligent terminals and smart security. It will also cover the deep integration of generative AI with materials science, where it assists in the synthesis of new materials and in turn gains new data, enabling human-machine collaborative research and accelerating scientific progress. Finally, it will describe how the results are applied in smart medical scenarios to support intelligent diagnosis and treatment across clinical departments.

Zou Junni

Professor, Shanghai Jiao Tong University

Professor Zou is a recipient of the National Science Fund for Distinguished Young Scholars and has won 4 first prizes of the Shanghai Science and Technology Award, 3 second prizes of the Science and Technology Award of the Chinese Institute of Electronics, and 1 second prize of the Wu Wenjun Artificial Intelligence Science and Technology Award. Research interests include multimedia communication, high-dimensional visual information processing, and geometric deep learning. Professor Zou has led 8 projects of the National Natural Science Foundation of China, including key projects, and has published more than 150 papers indexed by SCI and EI, including more than 50 papers in IEEE Transactions and 45 papers at international conferences such as NeurIPS and ICML. Other work includes 2 co-authored monographs, 50 invention patents granted in China and the United States, and service on the editorial board of the international journal Digital Signal Processing.

Title: Riemannian Manifold Generation Based on Geometric Deep Learning

Report Summary: In recent years, diffusion models have achieved great success in generative modeling tasks. Inspired by diffusion models based on two-dimensional images, many studies have begun to focus on diffusion models of high-dimensional manifold structures. The manifold signal is located in a non-Euclidean space, and in order to achieve accurate generation and reconstruction, it is necessary to consider the probability distribution of the original data and the geometric characteristics and topology of the manifold signal at the same time. This report will introduce the latest progress in manifold signal generation from the perspective of geometry and deep learning, and further explore the possibility of introducing Ricci curvature flow into the diffusion model to learn the inherent geometric features of manifold signals.

Guests

Huang Tiejun

He is a fellow of CCF and a professor at Peking University

He is a professor at the School of Computer Science of Peking University, Dean of the Beijing Academy of Artificial Intelligence, Vice Dean of the Peking University Institute of Artificial Intelligence, and a recipient of the National Science Fund for Distinguished Young Scholars. He has worked on intelligent visual information processing for more than 30 years, has published more than 300 academic papers and 2 monographs, and holds more than 100 granted invention patents. He participated in the proposal, drafting, and implementation of China's New Generation Artificial Intelligence Development Plan. He has served as deputy leader of the expert group for the Science and Technology Innovation 2030 major project on new generation artificial intelligence, deputy leader of the national artificial intelligence standardization general working group, and secretary general of the New Generation Artificial Intelligence Industry Technology Innovation Strategic Alliance.

Tao Jianhua

He is a fellow of CCF and a professor of Tsinghua University

He is Deputy Director of the State Key Laboratory of Pattern Recognition and a recipient of the National Science Fund for Distinguished Young Scholars. His research interests include speech synthesis and recognition, speech coding, human-computer interaction, multimedia information processing, and pattern recognition. He has led more than 20 projects funded by the National 863 Program, the National Natural Science Foundation of China, and the National Key R&D Program. He has published more than 240 papers in academic journals and conferences in China and abroad, more than 110 of which are indexed by SCI or EI, holds 15 granted domestic invention patents and 1 international patent, and has won awards at major academic conferences in China and abroad. He has served many times as a review expert for national programs such as the 863 Program and the National Natural Science Foundation of China.

Wang Xun

Professor of Zhejiang Gongshang University and Dean of the School of Computer Science

He is Director of the Zhejiang Provincial Key Laboratory of Big Data and Future E-commerce Technology and Director of the Zhejiang Engineering Center of Visual Media Big Data Technology. He was selected for the "National Millions of Talents Project", receives the State Council Special Government Allowance, and leads one of the first Huang Danian-style university teaching teams in Zhejiang Province. In recent years, he has carried out research in mobile graphics computing and computer vision and has published more than 150 high-level academic papers in major journals and international conferences in China and abroad. He has led more than 20 major and key projects at or above the provincial and ministerial level. As first completer, he has won 6 provincial and ministerial first and second prizes and 1 national second prize for teaching achievement.

Zhou Jie

Professor of Tsinghua University

He is a professor in the Department of Automation at Tsinghua University, Director of the National Key Laboratory, a recipient of the National Science Fund for Distinguished Young Scholars, academic leader of an innovative research group of the National Natural Science Foundation of China, and an IAPR Fellow. He has long worked on pattern recognition and computer vision and has published more than 100 IEEE journal papers, including 34 long papers in IEEE TPAMI. As first completer, he has won the second prize of the National Technological Invention Award, the China Patent Silver Award, and the first prize of the Chinese Institute of Electronics, among other awards. Doctoral students under his supervision have won the National Excellent Doctoral Dissertation Nomination Award, the Excellent Doctoral Dissertation Award of the Chinese Society of Artificial Intelligence (4 times), and the Excellent Doctoral Dissertation Award of the Chinese Society of Image and Graphics. He currently serves on the editorial board of IEEE TPAMI and as deputy editor-in-chief of PR.

Zhang Yanning

CCF Fellow; Executive Director of CCF; Professor, Northwestern Polytechnical University

Professor Zhang is a professor at Northwestern Polytechnical University, a member of the Standing Committee of its Party Committee, Vice President, a National Talent, and Chief of a National Defense 973 Project. Professor Zhang has long been committed to research on image processing, pattern recognition, computer vision, and intelligent information processing, applying this work to major national needs in aerospace and aviation. Professor Zhang has published more than 100 papers in international journals and conferences such as IEEE TPAMI, IEEE TIP, CVPR, and ICCV, and has undertaken more than 40 national projects. The research results have been adopted by a number of major national engineering projects and applied by more than 20 organizations in aerospace, aviation, energy, water conservancy, and other industries, winning the first prize of the Shaanxi Province Science and Technology Progress Award and the first prize of the National Defense Technology Invention Award.

Nie Liqiang

Professor, Harbin Institute of Technology (Shenzhen)

He is a second-level professor and executive dean of the School of Computer Science of Harbin Institute of Technology (Shenzhen), an IAPR Fellow, and has been selected for the National Talent Program twice. He has led two key projects of the National Natural Science Foundation of China, 1*3 foundation strengthening projects, key R&D projects of the Ministry of Science and Technology, a provincial fund for distinguished young scholars, and two industry-sponsored projects at the 10-million level. His research focuses on multimedia content analysis and search, and he has published more than 100 CCF Class A papers. He has served on the editorial boards of IEEE TKDE, ACM ToMM, and other Transactions journals, and as an area chair (AC) or senior program committee (SPC) member for NeurIPS, AAAI, KDD, and IJCAI. His awards include the 2020 DAMO Academy Young Fellow Award, the first prize of the 2021 Shandong Science and Technology Progress Award (ranked first), and the first prize of the 2023 Shandong Provincial Technological Invention Award (ranked first).

Liu Meng

Professor, Shandong Jianzhu University

Professor Liu is co-chair of the forum, a professor at Shandong Jianzhu University, and leader of the Shandong Province innovation team on "intelligent media analysis and retrieval". Research interests include multimedia content analysis, information retrieval, and cross-media analysis and reasoning. Professor Liu has led a number of projects, including sub-projects of national key projects, general funds, and youth funds, and has published more than 70 papers in CCF Class A conferences and IEEE/ACM Transactions, as well as 1 English-language monograph. Awards include the Best Student Paper Award at ACM SIGIR 2021 (a CCF Class A conference), first place in the ACM MM 2023 Challenge, first place in the CVPR 2024 Challenge, the Frontier Science Award of the First International Basic Science Conference, and the ACM SIGMM 2020 Outstanding Doctoral Dissertation Award.

About CNCC2024

CNCC2024 will be held on October 24-26 in Hengdian Town, Dongyang City, Zhejiang Province, with the theme of "Developing New Quality Productivity, Computing Leads the Future". The three-day conference will include 18 invited talks, 3 conference forums, 138 thematic forums, 34 thematic activities, and more than 100 exhibitions. More than 800 speakers, including Turing Award winners, academicians of the Chinese Academy of Sciences and the Chinese Academy of Engineering, leading scholars from China and abroad, and well-known entrepreneurs, will look ahead to cutting-edge trends and share their latest achievements. More than 10,000 people are expected to attend.
