
Facewall Intelligence CEO responds to alleged "plagiarism" by a Stanford AI team: "We deeply regret it"

Author: Frontier of Intellectual Property

Recently, news that a Stanford University AI team is suspected of plagiarizing the work of a Chinese large-model startup has drawn widespread attention in the industry.


On June 3, Li Dahai, CEO of Facewall Intelligence, and co-founder Liu Zhiyuan each published responses to the Stanford AI team's plagiarism of their open-source model. Li Dahai said: "We deeply regret this incident." On the one hand, he felt it was, in a way, a form of recognition from an international team; on the other, he called on everyone to build an open, cooperative, and trusting community environment. "We want the team's good work to be noticed and recognized by more people, but not in this way."

On May 29, an AI team from Stanford began promoting online that it had trained a SOTA multimodal model, Llama3-V, for $500; the authors claimed Llama3-V outperformed GPT-4V, Gemini Ultra, and Claude Opus. According to public information, two members of the team are Stanford undergraduates who have published several papers in machine learning, with internship experience at AWS, SpaceX, and elsewhere.


Because the team members have strong backgrounds at institutions such as Stanford and Tesla, the Llama3-V project quickly rose to the homepage of HuggingFace (a developer community and platform) and attracted the developer community's attention.

On the social platforms X and HuggingFace, a user questioned whether Llama3-V was merely a shell of MiniCPM-Llama3-V 2.5, an open-source on-device multimodal model launched by Facewall Intelligence on May 21, 2024.

The Llama3-V team responded at the time that they had only used the tokenizer (a core component of natural language processing pipelines) of MiniCPM-Llama3-V 2.5, and that they had started the work before MiniCPM-Llama3-V 2.5 was released. However, the team did not explain how they could have obtained the detailed tokenizer before MiniCPM-Llama3-V 2.5 was released.

But then, more and more voices asserted plagiarism by the aforementioned AI team. For example, the model structure and configuration files of Llama3-V are exactly the same as those of MiniCPM-Llama3-V 2.5, apart from some reformatting and renaming of variables, such as those for image slicing, the tokenizer, the resampler, and data loading. Llama3-V also uses the same tokenizer as MiniCPM-Llama3-V 2.5, including the special tokens newly defined by MiniCPM-Llama3-V 2.5.
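The article does not reproduce the two configuration files, but the kind of comparison developers ran on HuggingFace can be sketched as follows. This is a minimal illustration with hypothetical, abridged config values (the key names mimic common HuggingFace-style fields; the actual files and values are assumptions), showing how a key-by-key diff exposes that only renamed identifiers differ while every structural hyperparameter matches:

```python
def config_diff(cfg_a: dict, cfg_b: dict) -> dict:
    """Return the keys whose values differ between two model config dicts."""
    keys = set(cfg_a) | set(cfg_b)
    return {k: (cfg_a.get(k), cfg_b.get(k)) for k in sorted(keys)
            if cfg_a.get(k) != cfg_b.get(k)}

# Hypothetical, abridged configs for illustration only.
minicpm_cfg = {"hidden_size": 4096, "num_attention_heads": 32,
               "vision_encoder": "siglip", "model_type": "minicpmv"}
llama3v_cfg = {"hidden_size": 4096, "num_attention_heads": 32,
               "vision_encoder": "siglip", "model_type": "llama3v"}

# Only the renamed identifier differs; the architecture fields all match.
print(config_diff(minicpm_cfg, llama3v_cfg))
```

A diff that is empty except for cosmetic renames is exactly the pattern community members pointed to as evidence of a re-upload rather than an independent implementation.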

According to the HuggingFace page, the original author of Llama3-V directly imported the code of MiniCPM-V when uploading it, then renamed it to Llama3-V. But Mustafa Aljadery, one of the authors, does not consider the act plagiarism. He posted that Llama3-V had an inference bug and that it was not plagiarism: "I've already pointed out that the architectures are similar, but the architecture of MiniCPM comes from Idéfics, and we follow the Idéfics paper. The architecture is based on comprehensive research; how can you say it's MiniCPM's? The vision part of the MiniCPM code also looks like it was taken from Idéfics."


Recognition of the Tsinghua bamboo slips; the characters circled in red are the correct answers

In Li Dahai's view, another piece of evidence is that Llama3-V also exhibits the Facewall Intelligence team's newly built capability for recognizing the Tsinghua bamboo slips (a batch of Warring States-period bamboo slips acquired by Tsinghua University in July 2008), and the cases it presents are exactly the same as MiniCPM's, even though this training data has not been fully disclosed. Li Dahai said this work took a team of students several months: scanning the enormous corpus of Tsinghua slips character by character, annotating the data one by one, and integrating it into the model. More tellingly, the two models are highly similar in both their correct and incorrect outputs under Gaussian perturbation verification, a method used to test model similarity.
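The article names Gaussian perturbation verification but does not describe the procedure. One common reading of the idea, sketched below as an assumption rather than the team's actual method, is to add small Gaussian noise to each model's weights and compare their outputs on the same probe input: a copied model drifts in near-lockstep with the original, while an independently trained model produces outputs that differ far more. The toy "models" here are plain weight matrices, not real LLMs:

```python
import numpy as np

def perturbed_output(weights, x, sigma=0.01, seed=0):
    """Add small Gaussian noise to a model's weights and return its
    output on a fixed probe input x."""
    rng = np.random.default_rng(seed)
    noisy = weights + rng.normal(0.0, sigma, size=weights.shape)
    return noisy @ x

rng = np.random.default_rng(42)
A = rng.normal(size=(4, 8))   # original model
B = A.copy()                  # suspected re-upload: identical weights
C = rng.normal(size=(4, 8))   # independently trained model
x = rng.normal(size=8)        # shared probe input

out_a = perturbed_output(A, x, seed=1)
out_b = perturbed_output(B, x, seed=2)
out_c = perturbed_output(C, x, seed=3)

# The copy stays close to the original under perturbation;
# the independent model does not.
drift_ab = np.linalg.norm(out_a - out_b)
drift_ac = np.linalg.norm(out_a - out_c)
print(drift_ab, drift_ac)
```

Under this sketch, `drift_ab` is tiny (only the noise differs) while `drift_ac` is large, which mirrors the reported observation that Llama3-V and MiniCPM-Llama3-V 2.5 gave highly similar answers, right and wrong alike, under perturbation.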


One of the authors explains the deletion

By this point, the Stanford AI team had hidden the Llama3-V model on HuggingFace. Explaining the move, one author wrote: "I hid it to fix the model's inference problem, because the model needs a specific configuration to run."

"I'm so sorry. I deleted them because the inference code wasn't ready and everyone was hitting runtime errors. I think it's better for people not to use it right now. You have to have a special configuration. I'll put it back as soon as it's fixed," the team responded.

The reporter emailed Siddharth Sharma, one of the authors, asking for the specific reasons behind the deletion and under what conditions the model would be restored. No response had been received as of press time.

Commenting on the matter, Liu Zhiyuan said that the rapid development of artificial intelligence is inseparable from the global open-source sharing of algorithms, data, and models, which lets everyone stand on the shoulders of SOTA work and keep moving forward. The open-source MiniCPM-Llama3-V 2.5 itself uses the latest Llama3 as its language-model base. The cornerstone of open-source sharing is adherence to open-source licenses, trust in fellow contributors, and respect for and tribute to the achievements of predecessors, which the Llama3-V team has undoubtedly seriously undermined. They deleted the repository on HuggingFace after being questioned, and two of the three team members are Stanford undergraduates; they still have a long way to go.

Liu Zhiyuan added that domestic large-model teams such as Zhipu-Tsinghua GLM, Alibaba Qwen, DeepSeek, and Facewall-Tsinghua OpenBMB are gaining widespread international attention and recognition through continuous open-source sharing. "This incident can also be seen, from another angle, as a sign that our innovations have been attracting international attention," he said.

Latest responses

On June 4, Siddharth Sharma and Aksh Garg, two authors of the Stanford Llama3-V team, formally apologized on social media to the Facewall MiniCPM team for the academic misconduct and said they would take down all Llama3-V models.

Aksh Garg said: "First of all, we would like to apologize to the original authors of MiniCPM. Siddharth Sharma, Mustafa, and I released Llama3-V together. Mustafa wrote the code for the project, but I haven't been able to reach him since yesterday. Siddharth Sharma and I were mainly responsible for helping Mustafa promote the model. The two of us reviewed recent papers to verify the novelty of the work, but we were not informed of, nor aware of, any prior work from OpenBMB (an open-source community for large pre-trained language models and related tools, initiated by the Tsinghua team). We apologize to the authors and are disappointed in ourselves for not doing the work to verify the originality of this model. We take full responsibility and have taken down Llama3-V. We apologize again."

Source: Yicai

Editor: Sharon
