CNCC | Exploring the Potential and Limitations of Large Language Models: Where Are the Boundaries of Their Capabilities?

CNCC2024

Forum overview:

Exploring the Potential and Limitations of Large Language Models: Where Are the Boundaries of Their Capabilities?

Time: 13:30-17:30, October 26

Location: Classroom 10, 1st Floor, Summer Garden-United Kingdom Pavilion

Note: If there is any change, please refer to the final information on the official website (https://ccf.org.cn/cncc2024).

In recent years, large language models have become one of the key driving forces behind advances in artificial intelligence. Models based on the Transformer architecture perform well across a wide range of natural language processing tasks, including text generation, sentiment analysis, question answering, and knowledge reasoning.

However, these models still face many challenges and limitations in practical applications, particularly in generalization, robustness, and interpretability. Probing the capability boundaries of large language models and designing enhancement strategies for different application domains is therefore an important challenge for deploying them in practice.

The forum will delve into the boundaries of large language models' generalization capabilities, their robustness to new domains and unseen data, the interpretability of model-generated content, and enhancement strategies such as retrieval-augmented generation, collaboration between large and small models, and parameter-efficient fine-tuning. The forum has invited front-line scholars and technical experts working on large language model research and development in China to discuss the key technologies, cutting-edge progress, and future directions of capability boundary detection and enhancement. It aims to prompt participants to think deeply about the potential and limitations of large language models and to jointly promote the development and application of the technology.


Forum Chairs and Speakers

Forum Chairs

Li Yang

Chairman of CCF YOCSEF Harbin, Vice Dean, Associate Professor, and Doctoral Supervisor of the School of Computer and Control Engineering, Northeast Forestry University

Introduction: His main research interests are natural language processing, artificial intelligence, and bioinformatics. He has led a number of national, provincial, and ministerial projects, including National Natural Science Foundation of China projects, an Outstanding Youth project, and a General Program project. He has published more than 30 papers in top international journals and conferences in the field of artificial intelligence and holds 6 national invention patents. He is an executive member of the Bioinformatics Committee of the China Computer Federation, a member of the Youth Working Committee of the Chinese Information Processing Society of China, a member of the Social Media Processing Committee, and chair of CCF YOCSEF Harbin for 2024-2025.

Wu Tianxing

Chairman of CCF YOCSEF Nanjing, Associate Professor and Doctoral Supervisor of School of Computer Science and Engineering, Southeast University

Introduction: His main research interests are knowledge graphs, large language models, and artificial intelligence applications. He has won the Outstanding Doctoral Dissertation Award of the Jiangsu Computer Federation and Best Paper Awards at CCKS 2022 and WISA 2024. He has led a number of government-funded and industry-funded projects, including a National Natural Science Foundation of China Youth Project and a Jiangsu Province "Innovation and Entrepreneurship Doctor" project. He has published more than 50 papers in international journals and conferences in the field of artificial intelligence and holds 6 granted national invention patents. He is a member of the Language and Knowledge Computing Committee of the Chinese Information Processing Society of China, an executive member of the Information System Committee of the China Computer Federation, and chair of CCF YOCSEF Nanjing for 2024-2025.

Forum Speakers

Qin Bing

Professor, Faculty of Computing, Harbin Institute of Technology

Introduction: Deputy Director of the Institute of Natural Language Processing, Harbin Institute of Technology. She has led projects under the National Key R&D Program and the National Natural Science Foundation of China. She is a member of the management expert group for the Ministry of Science and Technology's Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, an executive director of the Chinese Information Processing Society of China and director of its Affective Computing Committee, and director of the Natural Language Processing Committee of the Heilongjiang Computer Federation. She was named to the "Forbes China 2020 Women in Technology List".

Title: Exploring the Self-Perception Ability of Large Models

Abstract: In recent years, the emergence of large language models has brought a major shift in how knowledge is accessed. With their extensive capacity for storing knowledge, large language models have become a new way to access knowledge, following databases and search engines. However, the intrinsic knowledge of these models is still limited, and understanding and improving how well large models perceive their own knowledge boundaries is essential to advancing their intelligence. This talk explores the knowledge boundaries of large models around the "knowledge quadrant". The questions include: Can large models really perceive the boundaries of their own knowledge? How can this self-perception be effectively improved? How can the knowledge boundaries of large models be further expanded? Discussing these questions will help build a deeper understanding of the internal knowledge mechanisms of large models.

Qi Guilin

Professor, School of Computer and Software Engineering, Southeast University; Director of the Institute of Cognitive Intelligence, Southeast University

Introduction: One of the initiators of OpenKG. He is deputy director of the Language and Knowledge Computing Professional Committee of the Chinese Information Processing Society of China, deputy director of the Knowledge Organization Professional Committee of the Chinese Society for Scientific and Technical Information, and deputy director of the Knowledge Engineering and Intelligent Service Special Committee of the Jiangsu Artificial Intelligence Society. He is editor-in-chief of the international journal Journal of Data Intelligence and associate editor-in-chief of the international journals Journal of Web Semantics and Semantic Web Journal. He is also a member of the editorial board of the Journal of Big Data Research, an advisor to the Data Management Advisory Board of Elsevier, and an editorial board member of the Journal of Information Engineering.

Title: Integrating Knowledge Graphs and Large Language Models from the Perspective of Knowledge Engineering

Abstract: Knowledge graphs are regarded as a new generation of knowledge engineering and have been widely used in industrial applications. However, building and maintaining knowledge graphs relies on experts, which hinders the rapid replication of knowledge graph-based products. Large language models such as ChatGPT have achieved great success as general-purpose artificial intelligence and can be regarded as knowledge bases that support knowledge services such as question answering. The talk first introduces the interaction between knowledge graphs and large language models, including how language models support ontology extraction, knowledge graph construction, knowledge alignment, and knowledge reasoning, and how knowledge graphs support the training and fine-tuning, knowledge editing, and knowledge fusion of large language models. It then discusses a knowledge service platform that integrates knowledge graphs and large language models from the perspective of knowledge engineering. Finally, it introduces domain applications of this new generation of knowledge service platform.

Wang Haofen

Distinguished Research Fellow of Tongji University

Introduction: He served for many years as CTO of leading artificial intelligence companies. He is one of the initiators of OpenKG, the world's largest Chinese open knowledge graph alliance. He has led or participated in a number of national AI-related projects and has published more than 100 high-level papers in the field of AI, with more than 4,100 citations and an H-index of 30. He built the world's first interactively cultivable virtual idol, "Amber · Void", and his intelligent customer service robots have served more than 1 billion users. His current positions include: Deputy Director of the Terminology Working Committee of the China Computer Federation, Secretary-General of its Natural Language Processing Committee, standing committee member of its Information System Committee, and executive committee member of its Intelligent Robot Committee; Director of the Chinese Information Processing Society of China, member of the Steering Committee of its Large Model Committee, and Deputy Secretary-General of its Language and Knowledge Computing Committee; standing committee member of the Large Model Special Committee of the Chinese Command and Control Society; and Deputy Director of the Natural Language Processing Committee of the Shanghai Computer Federation.

Title: Convergent Innovation and Prospects of Knowledge-Augmented Models from the Perspective of World Models

Abstract: World models and knowledge-augmented models are gradually becoming core forces driving innovation in the new generation of artificial intelligence technology. This talk introduces the relevant basic concepts and analyzes cutting-edge research cases and the latest technology trends. From the perspective of world models, it explains the key role of knowledge-augmented models in understanding the mechanisms of the complex physical world, improving the quality of decision-making, and accelerating knowledge acquisition. It closes by looking ahead to future technology paradigms in light of applications in vertical domains.

Dong Yuxiao

AC member of CCF YOCSEF headquarters; Associate Professor, Department of Computer Science, Tsinghua University

Introduction: He previously worked at Facebook AI and at Microsoft Research at the company's headquarters. His research interests include data mining, graph machine learning, and foundation models. His research has won or been nominated for best paper awards at ECML'23, WWW'22/19, and WSDM'15, and has been applied to social networks and knowledge graphs with billions of users. He received the 2022 ACM SIGKDD Rising Star Award.

Title: Understanding and Exploring the Emergence of Large Model Capabilities

Abstract: Foundation models show strong generalization in intent perception, instruction following, goal planning, and related abilities, and provide a general model base for the research and application of agents. We found that pre-training loss predicts the emergent abilities of language models better than model size or training compute, and can therefore reasonably guide model training and capability improvement. Taking the GLM-4 All Tools agent model as an example, the model can autonomously understand user intent, automatically plan complex instructions, and freely call web browsers, code interpreters, and multimodal models to complete complex tasks.

Liu Wei

Head of Large Model Algorithm at Xiaomi AI Lab

Introduction: AC member of CCF YOCSEF headquarters, industry advisor for master's students at the School of Psychology and Cognitive Science, Peking University, defense advisor for the machine learning course at Tsinghua University, and founding member of Microsoft Xiaoice. His research interests are human-computer dialogue and large models. With more than 10 years of experience in human-computer dialogue, he has led and been deeply involved in the research and development of influential dialogue products in the industry, such as Microsoft Xiaoice and Xiaoai. He has won the 2023 Xiaomi Million Dollar Technology Award, the 2024 CCF Computer Application Innovation Technology First Prize, and the 2024 Data Expo Outstanding Scientific and Technological Achievement Award, among others, and has more than a dozen patents and top-venue papers.

Feng Xiaocheng

Deputy Dean of the School of Artificial Intelligence, Harbin Institute of Technology, Assistant Director of Heilongjiang Provincial Key Laboratory of Chinese Information Processing

Introduction: His research interests include natural language processing, text generation, and machine translation. He has published more than 40 papers in CCF A/B international conferences and journals. He has served as a senior or regular program committee member for NIPS, ICML, AAAI, IJCAI, ACL, and other conferences. He is also a dual-appointed scholar at Pengcheng Laboratory, deputy director of the Youth Working Committee of the Chinese Information Processing Society of China, deputy secretary-general of the Large Model and Generation Special Committee, and chair of the 2023-2024 YOCSEF Harbin sub-forum of the China Computer Federation.

Zhang Ningyu

Deputy Director of the Institute of Intelligent Science and Industrial Software, Zhejiang University

Introduction: Deputy Director of the Institute of Intelligent Science and Industrial Software, selected for a sub-provincial city's high-level talent introduction program, and Qizhen Outstanding Young Scholar. Positions held include member of the CCF Computer Terminology Validation Committee, executive member of the Information System Committee, member of the Youth Working Committee of the Chinese Information Processing Society of China, member of the Language and Knowledge Computing Committee, executive member of the Affective Computing Professional Committee, and member of the Large Model Search and Generation Committee. Also serves as associate editor of ACM Transactions on Asian and Low-Resource Language Information Processing and of Data Intelligence, area chair for ACL, EMNLP, and ICLR, ARR action editor, and IJCAI senior program committee member. Main research directions are natural language processing and knowledge graphs.

Tian Zhiliang

Assistant Researcher of National University of Defense Technology

Introduction: A core technical member of a national key laboratory in the School of Computer Science, National University of Defense Technology. His research focuses on large models, text generation, and privacy protection. He has published more than 30 papers at artificial intelligence and natural language processing conferences such as NeurIPS, ACL, WWW, and EMNLP, including more than 20 as first or corresponding author, and holds more than 20 granted national invention patents. He has been nominated as a Microsoft Scholar and has received Baidu's highest award and the Baidu Scholarship Global Top 40. He has served as area chair for ACL, EMNLP, and NAACL, as a senior program committee member for IJCAI, and as an executive member of the Youth Working Committee of the Chinese Information Processing Society of China. He was selected for the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology.

About CNCC2024

CNCC2024 will be held on October 24-26 in Hengdian Town, Dongyang City, Zhejiang Province, with the theme "Developing New Quality Productive Forces, Computing Leads the Future". The three-day conference includes 18 invited talks, 3 conference forums, 138 thematic forums, 34 thematic activities, and more than 100 exhibitions. More than 800 speakers, including Turing Award winners, academicians of the Chinese Academy of Sciences and the Chinese Academy of Engineering, top scholars from China and abroad, and well-known entrepreneurs, will discuss cutting-edge trends and share their innovative achievements. More than 10,000 people are expected to attend.
