Using AI to Fix AI: OpenAI Launches CriticGPT

Author: The Frontier of the AI Era

In the current era of rapid AI expansion, one of the main challenges we face is that AI sometimes makes mistakes. Worse, the black-box nature of many AI tools means that catching those errors, and understanding why they happen, can be very difficult.

OpenAI recently discussed the problem, and a potential solution, in a blog post based on one of the company's research papers. There, the company introduced CriticGPT, a model built on the GPT-4 architecture that identifies and highlights inaccuracies in ChatGPT-generated responses, particularly on programming tasks.
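CriticGPT itself is not publicly available, but the general idea of asking a GPT-4-class model to critique code can be sketched with the standard OpenAI Python client. The model name, system prompt, and sample bug below are illustrative assumptions, not OpenAI's actual CriticGPT setup:

```python
# Illustrative sketch only: CriticGPT is not exposed via the public API.
# This shows the general pattern of prompting a GPT-4-class model to
# critique generated code; the model name and prompt are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def critique_code(code: str) -> str:
    """Ask a general-purpose model to point out concrete bugs in a snippet."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed stand-in; CriticGPT itself is unavailable
        messages=[
            {"role": "system",
             "content": "You are a code reviewer. List concrete bugs in the "
                        "code below, citing the offending lines. Avoid nitpicks."},
            {"role": "user", "content": code},
        ],
    )
    return response.choices[0].message.content


buggy = '''
def average(xs):
    return sum(xs) / len(xs)  # crashes on an empty list
'''
print(critique_code(buggy))
```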

OpenAI's researchers found that when human reviewers used CriticGPT to evaluate ChatGPT's code output, their reviews beat those of unassisted reviewers 60% of the time. The work goes well beyond mere error detection: it points toward changes in how we train, evaluate, and deploy AI.
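That 60% figure is a pairwise win rate: each comparison pits a CriticGPT-assisted review against an unassisted one, and a judge picks the better of the two. A toy tally of such a statistic, with fabricated outcomes chosen only to illustrate the arithmetic:

```python
# Toy tally of a pairwise win rate, as used for results like "reviewers
# with CriticGPT outperformed those without 60% of the time".
# The outcomes list is fabricated for illustration.
outcomes = ["assisted", "unassisted", "assisted", "assisted", "unassisted",
            "assisted", "unassisted", "assisted", "assisted", "unassisted"]

wins = sum(1 for o in outcomes if o == "assisted")
win_rate = wins / len(outcomes)
print(f"CriticGPT-assisted reviews preferred in {win_rate:.0%} of comparisons")
# -> CriticGPT-assisted reviews preferred in 60% of comparisons
```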

Digging into the details, CriticGPT was trained with reinforcement learning from human feedback (RLHF), the same method used to train ChatGPT itself. Here, AI trainers manually inserted errors into ChatGPT-generated code and then wrote feedback describing those inserted errors. In evaluations, OpenAI found that on 63% of naturally occurring bugs, trainers preferred CriticGPT's critiques over ChatGPT's, both because CriticGPT produced fewer nitpicks (small, unhelpful complaints) and because it hallucinated problems less often.
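A rough sketch of that data-construction step is below. The dataclass names and the example bug are invented for illustration; OpenAI's actual pipeline is not public:

```python
# Minimal sketch of building RLHF comparison data from inserted bugs.
# All names and the example are assumptions; OpenAI's pipeline is not public.
from dataclasses import dataclass


@dataclass
class TamperedExample:
    original_code: str    # ChatGPT-generated code, believed correct
    tampered_code: str    # the same code with a bug inserted by a trainer
    bug_description: str  # the trainer's reference description of the bug


@dataclass
class CritiqueComparison:
    example: TamperedExample
    critique_a: str  # one model-written critique of the tampered code
    critique_b: str  # a competing critique of the same code
    preferred: str   # "a" or "b": which critique the trainer ranked higher


# A trainer inserts a subtle bug and records what they changed:
ex = TamperedExample(
    original_code="def mean(xs): return sum(xs) / len(xs)",
    tampered_code="def mean(xs): return sum(xs) / (len(xs) - 1)",
    bug_description="Denominator changed to len(xs) - 1, biasing the mean.",
)

# Trainers then rank competing critiques; a critique that flags the
# inserted bug (and little else) should win the comparison:
comp = CritiqueComparison(
    example=ex,
    critique_a="The denominator should be len(xs), not len(xs) - 1.",
    critique_b="Consider renaming xs to values for readability.",
    preferred="a",
)
print(f"Preferred critique: {comp.preferred}")
```

Rankings of this kind become the reward signal: critiques that pinpoint the known inserted bug are reinforced over critiques that nitpick or hallucinate.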

The study also found that identifying specific, predefined bugs was more straightforward for raters than evaluating broader attributes of a critique, such as its level of detail or comprehensiveness.

The paper discusses two types of evaluation data: code with human-inserted bugs and code with naturally occurring bugs that humans had previously caught. This dual approach gives a more complete picture of CriticGPT's performance across scenarios. Notably, rater consistency improved markedly on the human-inserted subset, where each bug comes with a reference description written by the trainer who inserted it.
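One simple way to quantify that difference in consistency is a pairwise agreement rate between raters, computed separately for each subset. A toy sketch, assuming each rater marks a critique as catching (1) or missing (0) the bug; all labels here are fabricated:

```python
# Toy pairwise agreement rate between raters, computed separately for
# critiques of inserted bugs (which have reference descriptions) and of
# naturally detected bugs. All labels below are fabricated.
from itertools import combinations


def agreement_rate(ratings_by_rater):
    """Fraction of rater pairs that agree, averaged over all items."""
    n_items = len(ratings_by_rater[0])
    agree = total = 0
    for i in range(n_items):
        for a, b in combinations(ratings_by_rater, 2):
            agree += a[i] == b[i]
            total += 1
    return agree / total


# 1 = "critique catches the bug", 0 = "critique misses it"; 3 raters each
inserted = [[1, 1, 0, 1, 1], [1, 1, 0, 1, 1], [1, 1, 0, 1, 0]]
detected = [[1, 0, 0, 1, 1], [0, 1, 0, 1, 0], [1, 1, 1, 0, 1]]

print(f"inserted-bug agreement: {agreement_rate(inserted):.0%}")  # -> 87%
print(f"detected-bug agreement: {agreement_rate(detected):.0%}")  # -> 33%
```

With a reference description anchoring what counts as "the bug", raters converge; without one, judgments scatter, which is the pattern the paper reports.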

This pattern suggests that a clearly identified error gives raters a concrete anchor, allowing them to make more consistent judgments. It also highlights how difficult it is to assess AI-generated critiques consistently, especially when judging other aspects of code quality.

In addition, OpenAI points out that CriticGPT doesn't do all the work. They observed that human developers often retained or modified the AI-generated critiques, suggesting a synergistic relationship between human expertise and AI assistance.

Obviously, there's more work to be done here, but OpenAI's CriticGPT is a big step toward reducing the error rate of models like ChatGPT.
