Reporting by XinZhiyuan
EDIT: Time
Recently, a research team at Zhejiang University demonstrated reshaping of faces in portrait videos, with an adjustable parameter that widens or slims the face.
A video that can slim your face? Let's see how it works.
This is American actress Jennifer Lawrence: on the left is the original YouTube video, and on the right is the "slimmed face" version.
Her slightly rounded chin becomes pointed, the oval face nearly turns into a sharp V-shaped one, and she seems to look a little older.
Since a face can be slimmed, can it also be widened?
No problem, and the effect is striking: the face becomes almost square.
Let's try it on Mark Zuckerberg:
On one side is the widened face, on the other the slimmed face, and in the middle is the familiar face we know from the screen.
Reshaping faces in video
In recent years, short videos have boomed on major social platforms such as WeChat Channels, Douyin (TikTok), and Kuaishou.
So, beyond the built-in beautification and filter functions, how can the faces of people in a video be digitally retouched?
Because video is dynamic, the technique is not as simple as beautifying a single flat image.
Recently, a study on portrait video editing was published that presents exactly this kind of video face reshaping technique.
The research was mainly carried out by members of the State Key Laboratory of Computer-Aided Design and Computer Graphics (CAD&CG) at Zhejiang University, with the University of Bath in the United Kingdom participating.
Thesis link: https://arxiv.org/abs/2205.02538
As the authors put it, the goal of the study is "to generate high-quality portrait video reshaping results by editing the overall shape of the face, based on natural facial deformations in the real world."
The team's experiments ran on a desktop computer with an AMD Ryzen 9 3950X CPU and 32 GB of RAM.
Within the pipeline, OpenCV's optical flow is responsible for motion mapping and is smoothed with the StructureFlow framework; the Face Alignment Network (FAN) handles facial landmark estimation; and the Ceres Solver is used to solve the optimization problems.
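To get a feel for the first two building blocks, below is a minimal sketch (our own illustration, not the authors' code) that combines OpenCV's Farneback dense optical flow with the open-source face_alignment package, a public FAN implementation; all function and variable names here are ours.

```python
import cv2
import numpy as np
import face_alignment  # public FAN implementation; not the authors' exact setup

# 2D FAN landmark detector. The enum name differs across package versions
# (LandmarksType.TWO_D in recent releases, LandmarksType._2D in older ones).
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D, device="cpu")

def flow_and_landmarks(video_path):
    """Yield (dense_flow, landmarks) for each consecutive frame pair."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Farneback dense optical flow: per-pixel (dx, dy) motion field
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # FAN landmark estimation (68 x 2 points per detected face, or None)
        lms = fa.get_landmarks(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        yield flow, lms
        prev_gray = gray
    cap.release()
```

The paper additionally smooths the flow with StructureFlow and solves its optimization problems with the Ceres Solver; neither step is attempted in this sketch.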
Comparison with the baseline approach. Given a frame (a) of a portrait video, the baseline portrait reshaping method creates artifacts near the tip of the nose (b) because the nose occludes the regions beside it, while the authors' method (c) produces a satisfactory result with the same reshaping parameters.
Comparison of a method that uses only MLS (a), a method that uses only optimization (b), and the authors' method (c).
The MLS-only approach can only guarantee face boundary consistency and video stability, whereas mesh optimization is effective at correcting background distortion.
The authors' method, however, achieves good results in background preservation, distortion correction, and face boundary consistency.
Comparing a method (a) that does not fix the contour mesh points, a method (b) that uses only sparse contour point mapping, and the authors' method (c), it can be seen that the authors' method achieves better results in terms of performance, face boundary consistency, and reshaping consistency.
The overall method consists of two stages: stable facial reconstruction and coherent video reshaping.
In the first stage, the authors estimate the rigid pose transformation of the face across the entire video.
Multiple frames are then jointly optimized to reconstruct the facial identity accurately.
In this way, the approach extends from reshaping a single monocular image to reshaping an entire image sequence.
Finally, the facial expressions are recovered throughout the video.
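To make the idea of joint multi-frame fitting concrete, here is a toy sketch in which a single shared identity vector is optimized against landmarks from all frames at once, while scale and translation are per-frame. The linear landmark model, its dimensions, and every name below are our own simplifying assumptions, not the authors' formulation.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy linear landmark model: 68 3D landmarks = mean + basis @ identity
N_LMK, N_ID = 68, 40
rng = np.random.default_rng(0)
mean_lmk = rng.normal(size=(N_LMK, 3))          # placeholder mean shape
id_basis = rng.normal(scale=0.1, size=(N_LMK, 3, N_ID))  # placeholder basis

def project(lmk3d, scale, tx, ty):
    """Orthographic projection with per-frame scale and 2D translation."""
    return scale * lmk3d[:, :2] + np.array([tx, ty])

def residuals(params, observed):
    """Shared identity + per-frame (scale, tx, ty) vs. observed 2D landmarks."""
    n_frames = len(observed)
    identity = params[:N_ID]
    poses = params[N_ID:].reshape(n_frames, 3)
    lmk3d = mean_lmk + id_basis @ identity
    res = [project(lmk3d, *poses[i]) - observed[i] for i in range(n_frames)]
    return np.concatenate(res).ravel()

def fit_identity(observed_landmarks):
    """observed_landmarks: list of (68, 2) arrays, one per video frame."""
    n_frames = len(observed_landmarks)
    x0 = np.concatenate([np.zeros(N_ID), np.tile([1.0, 0.0, 0.0], n_frames)])
    sol = least_squares(residuals, x0, args=(observed_landmarks,))
    return sol.x[:N_ID]  # identity shared across all frames
```

The point of sharing the identity across frames is to keep the reconstructed face shape from drifting over time, which is what "stable facial reconstruction" refers to.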
In the second stage, the authors first reshape the reconstructed 3D face, with a reshaping parameter controlling how strongly the face shape is changed.
The reconstructed 3D face is then used to guide the warping of each video frame.
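The sketch below illustrates the two-step idea of this stage under our own simplifying assumptions: v_target stands in for a precomputed reshaped vertex set, the warp is driven by simple scattered-data interpolation of projected point displacements rather than the paper's mesh optimization, and all names are illustrative.

```python
import cv2
import numpy as np
from scipy.interpolate import griddata

def reshape_face(v_orig, v_target, w):
    """Blend reconstructed 3D vertices toward a reshaped target.
    w > 0 moves toward the target, w < 0 away (our convention, not the paper's)."""
    return v_orig + w * (v_target - v_orig)

def warp_frame(frame, pts_src, pts_dst):
    """Warp a frame so that pixels at pts_src move to pts_dst.
    pts_*: (N, 2) projected face points before/after reshaping."""
    h, w = frame.shape[:2]
    grid_y, grid_x = np.mgrid[0:h, 0:w].astype(np.float32)
    # Backward mapping: for each destination pixel, where to sample from
    disp = pts_src - pts_dst
    dx = griddata(pts_dst, disp[:, 0], (grid_x, grid_y), method="linear", fill_value=0)
    dy = griddata(pts_dst, disp[:, 1], (grid_x, grid_y), method="linear", fill_value=0)
    map_x = (grid_x + dx).astype(np.float32)
    map_y = (grid_y + dy).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)
```

In the actual method, the warp is additionally constrained so that the background and the face boundary remain consistent across frames, which this toy interpolation does not attempt.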
The results show that the proposed method robustly generates coherent reshaped portrait videos, whereas image-based approaches easily produce obvious flicker artifacts.
A useful deployment of such a system would be real-time deformation, provided the required computing resources can meet the challenge of running the deformation in real time.
This would greatly facilitate portrait video editing and visual effects on social media.
It has broad application prospects in video presentations, e-commerce livestreaming, and similar scenarios.
Research team
Professor Jin Xiaogang of the College of Computer Science and Technology at Zhejiang University is the corresponding author of the paper.
Jin Xiaogang is a professor and doctoral supervisor; the CAD&CG (Computer-Aided Design and Computer Graphics) laboratory where he works is affiliated with the College of Computer Science and Technology of Zhejiang University.
His research focuses on computer animation, computer graphics, and computer-aided geometric design.
According to his official homepage, he also has strong research interests in film special-effects simulation, cloth animation and virtual try-on, insect swarm simulation, traffic simulation, and autonomous driving.
In addition, he has researched crowd and group animation, implicit surface modeling and applications, creative modeling, Sketch-Based modeling, texture compositing, etc.
His co-authored book, Algorithmic Foundations of Realistic Computer Graphics, won second prize in the National Science and Technology Book Award in 2001.
Resources:
http://www.cs.zju.edu.cn/csen/27059/list.htm
https://www.unite.ai/restructuring-faces-in-videos-with-machine-learning/