
Issue 55 | GPTSecurity Weekly

Author: The clouds rise and fall

GPTSecurity is a community for sharing cutting-edge academic research and practical experience on generative pre-trained transformers (GPT), AI-generated content (AIGC), and large language models (LLMs). Here you can find the latest research papers, blog posts, useful tools, and prompts on GPT/AIGC/LLM. To give a clearer picture of the past week's contributions, we summarize them below.

Security Papers

1. REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space

Introduction: Large language models (LLMs) can inadvertently memorize and leak sensitive information from their training data, raising privacy concerns. To address this problem, the researchers propose a model-editing technique called REVS. REVS identifies a subset of neurons associated with sensitive information and edits them: it projects these neurons into the vocabulary space to locate the components responsible for generating sensitive content, then edits the model by computing the pseudo-inverse of the unembedding matrix so that sensitive tokens are demoted. The researchers validated REVS on two datasets, an email dataset and a synthetic Social Security number dataset. The results show that REVS removes sensitive information and defends against extraction attacks while preserving the model's overall capabilities. A minimal sketch of the rank-editing idea follows the link below.

Link:

https://arxiv.org/pdf/2406.09325
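
To make the mechanism concrete, here is a minimal, illustrative sketch of rank editing in vocabulary space. It is not the paper's implementation: the toy dimensions, the random unembedding matrix, the single stand-in neuron, and the demotion rule are all assumptions.

```python
# Minimal, illustrative sketch of rank editing in vocabulary space.
# Assumptions (not the paper's code): toy dimensions, a random unembedding
# matrix, and a single neuron weight vector standing in for the neurons
# REVS would select inside a real model.
import numpy as np

d_model, vocab_size = 64, 1000
rng = np.random.default_rng(0)

W_U = rng.normal(size=(d_model, vocab_size))   # unembedding: hidden -> logits
neuron = rng.normal(size=d_model)              # weight vector of a suspect neuron
sensitive_token = 42                           # id of the token to demote

def rank_of(token_id: int, logits: np.ndarray) -> int:
    """Rank of a token in the vocabulary projection (0 = most promoted)."""
    return int((logits > logits[token_id]).sum())

print("rank before edit:", rank_of(sensitive_token, neuron @ W_U))

# Iteratively demote the sensitive token in vocabulary space, mapping the
# desired logits back to hidden space with the pseudo-inverse of W_U. Each
# step is a least-squares edit, so the neuron stays close to the original.
W_U_pinv = np.linalg.pinv(W_U)
edited = neuron.copy()
target_rank = 900                              # push the token at least this far down
for _ in range(20):
    logits = edited @ W_U
    if rank_of(sensitive_token, logits) >= target_rank:
        break
    desired = logits.copy()
    desired[sensitive_token] = logits.min() - logits.std()
    edited = desired @ W_U_pinv

print("rank after edit: ", rank_of(sensitive_token, edited @ W_U))
```

In a real model, an edit of this kind would target only the selected neurons in specific layers and would be checked against unrelated generations to confirm the model's other behavior is preserved.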

2. Artificial Intelligence as the New Hacker: Developing Agents for Offensive Security

Introduction: In cybersecurity, artificial intelligence (AI) is being used to build ReaperAI, an autonomous offensive agent that simulates and executes cyberattacks. Leveraging the capabilities of large language models such as GPT-4, ReaperAI can autonomously identify and exploit security vulnerabilities. Tests on multiple platforms show that ReaperAI successfully exploited known vulnerabilities, demonstrating its potential in offensive security. However, using AI offensively also raises operational and ethical challenges, including command execution, error handling, and ethical constraints. The study highlights innovative applications of AI in cybersecurity and proposes future research directions, including optimizing the interaction between AI and security tools, improving learning mechanisms, and establishing ethical guidelines.

Link:

https://arxiv.org/abs/2406.07561

3. An LLM-Assisted, Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection

Introduction: Large language models (LLMs) have significantly improved the efficiency of code completion by providing context-aware suggestions in software engineering. However, fine-tuning these models for specific applications exposes them to poisoning and backdoor attacks that covertly tamper with their output. In response to this threat, the researchers propose CodeBreaker, a backdoor attack framework that uses LLMs (such as GPT-4) for sophisticated payload transformation, ensuring that both the poisoned data and the generated code evade strong vulnerability detection. By embedding malicious payloads directly into the source code, CodeBreaker challenges existing security measures and underscores the need for stronger defenses for code completion systems. Experiments and user studies confirm CodeBreaker's attack performance across different settings and its advantages over existing methods.

Link:

https://arxiv.org/abs/2406.06822

4. SecureNet: A Comparative Study of DeBERTa and Large Language Models for Phishing Detection

Introduction: In cybersecurity, phishing attacks pose a significant threat to organizations by using social engineering to trick users into revealing sensitive information. This paper explores the potential of large language models (LLMs) for detecting phishing content and compares them with the DeBERTa V3 model. Using public datasets covering email, HTML, URLs, text messages, and synthetic data, the researchers systematically evaluated the performance and limitations of these models.

The study found that the Transformer-based DeBERTa model performed best at detecting phishing content, achieving a recall of 95.17%, compared with 91.04% for GPT-4. The researchers also examined the challenges these models face when asked to generate phishing emails and evaluated their performance in that setting. The findings offer valuable insights for strengthening cybersecurity measures and for detecting and responding to phishing threats more effectively. A small classification sketch follows the link below.

Link:

https://arxiv.org/abs/2406.06663
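
For readers who want to try the encoder-side baseline, here is a hedged sketch of scoring text with a DeBERTa V3 sequence classifier via Hugging Face Transformers. It is not the paper's pipeline: the checkpoint name is an untuned placeholder, so in practice you would substitute weights fine-tuned on a labeled phishing corpus.

```python
# Hedged sketch: scoring candidate text with a DeBERTa V3 sequence classifier.
# The checkpoint below is an untuned placeholder; swap in your own
# phishing-fine-tuned weights for meaningful scores.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "microsoft/deberta-v3-base"  # placeholder, not the paper's checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

texts = [
    "Your account is locked. Verify your password at http://login-example.test",
    "Meeting moved to 3 pm, agenda attached.",
]

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)

for text, p in zip(texts, probs):
    print(f"p(phishing) = {p[1]:.3f} | {text[:60]}")
```

Recall, the metric quoted above, is then simply the fraction of actual phishing samples that the classifier flags, e.g. sklearn.metrics.recall_score(y_true, y_pred) on a held-out test split.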

5. Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering

Introduction: This study examines the capabilities of large language models (LLMs), especially GPT-4, in binary reverse engineering (RE). Using a structured experimental approach, the researchers analyzed how well LLMs interpret both human-written and decompiled code. The study had two phases: the first focused on basic code explanation, and the second involved more complex malware analysis. Key findings show that LLMs excel at general code understanding but are less consistent in detailed technical and security analysis.

The study highlights both the potential and the current limitations of LLMs in reverse engineering, offering insights for future applications and improvements. The researchers also discuss methodological issues such as evaluation methods and data constraints, providing a technical perspective for future research in this area. A sketch of the kind of query involved follows the link below.

Link:

https://arxiv.org/abs/2406.06637
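
As a rough illustration of the kind of query such experiments involve, the sketch below sends a decompiled function to an OpenAI chat model and asks for an explanation. It is not the authors' harness; the prompt, the model id, and the code snippet are illustrative placeholders.

```python
# Hedged sketch: asking an LLM to explain a decompiled function.
# Prompt, model id, and the pseudocode snippet are placeholders, not the
# paper's actual experimental setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

decompiled = """
int sub_401000(char *a1) {
    int v2 = 0;
    while (*a1) { v2 = 31 * v2 + *a1++; }
    return v2;
}
"""

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model id
    messages=[
        {"role": "system",
         "content": "You are a reverse engineer. Explain what this decompiled "
                    "function does and suggest a descriptive name for it."},
        {"role": "user", "content": decompiled},
    ],
)
print(response.choices[0].message.content)
```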

6. OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step

Introduction: To improve the accuracy of large language models (LLMs) on complex arithmetic, the researchers propose a framework that performs exact arithmetic in a single autoregressive step. By using the LLM's hidden states to control a symbolic architecture, the method achieves 100% accuracy on single arithmetic operations, comparable to GPT-4, and even surpasses Llama 3 8B Instruct and GPT-3.5 Turbo on multi-step reasoning problems. The approach improves speed and safety while preserving the LLM's original capabilities. The researchers plan to release the code soon to enable wider research and application. A toy sketch of the controller idea follows the link below.

Link:

https://arxiv.org/abs/2406.06576
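
To make the "hidden state controls a symbolic unit" idea concrete, here is a toy sketch under stated assumptions: a small router reads a hidden state and selects an exact arithmetic operation for already-parsed operands. The dimensions, the router, and the operand handling are assumptions, not the paper's design.

```python
# Toy sketch: an LLM hidden state routes to an exact symbolic arithmetic op,
# so the result is computed precisely instead of being decoded token by token.
# The router, dimensions, and operand parsing are illustrative assumptions.
import operator
import torch
import torch.nn as nn

OPS = [operator.add, operator.sub, operator.mul, operator.truediv]

class ArithmeticRouter(nn.Module):
    """Maps a hidden state to a choice among exact symbolic operations."""

    def __init__(self, d_model: int):
        super().__init__()
        self.head = nn.Linear(d_model, len(OPS))

    def forward(self, hidden: torch.Tensor, a: float, b: float) -> float:
        op_idx = int(self.head(hidden).argmax())  # pick an op from the hidden state
        return OPS[op_idx](a, b)                  # evaluated exactly in one step

# Usage with a stand-in hidden state; in practice it would come from the LLM,
# and the router would be trained so its choice matches the intended operation.
router = ArithmeticRouter(d_model=64)
hidden = torch.randn(64)
print(router(hidden, 37.0, 5.0))
```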