
Come on in! Luchen's Open-Sora freebies are up for grabs: get started with video generation for just 10 yuan

Author: Heart of the Machine Pro

Heart of the Machine Editorial Department

Recently, the video generation track has been heating up, with text-to-video and image-to-video models appearing one after another. Yet despite the number of models on the market, most people still cannot try them: without a closed-beta invitation, they can only admire the models from afar. Not long ago, we covered Luchen Technology's Open-Sora, billed as the world's first open-source Sora-like model, which not only performs well across video types but also emphasizes low cost, so that everyone can use it. How well does it work? How do you use it? Here is Heart of the Machine's hands-on evaluation.

Open-Sora recently released version 1.2, which can generate 720p HD video up to 16 seconds long. The official demo looks like this:


The results are genuinely impressive, so it's no wonder that so many readers have asked how to try it for themselves.

Compared to the many closed-source products that require long waits in queue for beta access, the fully open-source Open-Sora is clearly more accessible. However, Open-Sora's official GitHub page is dense with technical details and code. Deploying it yourself is a challenge: beyond the model's steep hardware requirements, configuring the environment also tests the user's coding skills.

So, is there a way to make Open-Sora easy for AI novices to use?

Let's start with the conclusion: yes. It can be deployed with one click, and once started, parameters such as video length, aspect ratio, and camera movement can all be controlled without writing any code.

Intrigued? Let's walk through how to deploy Open-Sora. A detailed, beginner-friendly tutorial and the relevant links are at the end of this article; no technical background is required.

Gradio-based visualization scheme

We previously published an in-depth report on Open-Sora's latest technical details, focusing on the model's core architecture and its innovative video compression network (VAE). At the end of that article, we mentioned that the Luchen Open-Sora team provides a Gradio application that can be deployed with one click. So what does this Gradio app actually look like?

Gradio is a Python package designed for quickly deploying machine learning models. By defining a model's inputs and outputs, developers can automatically generate a web interface, simplifying online demos and interaction.

Reading through Open-Sora's GitHub homepage, we found that the application combines the Open-Sora model with Gradio into an elegant, concise interactive solution.


It provides a graphical interface that makes operation much simpler. Users can freely adjust basic parameters such as the duration, aspect ratio, and resolution of the generated video, as well as more advanced settings like motion amplitude, aesthetic score, and camera movement. It also supports calling GPT-4 to optimize prompts, so both Chinese and English input are supported.

Once the application is deployed, users don't need to write any code to use the Open-Sora model: just enter a prompt, tweak the parameters, and click to generate a video with different parameter combinations. The result is displayed directly in the Gradio interface and can be downloaded from the web page without digging through file paths.


Image source: https://github.com/hpcaitech/Open-Sora/blob/main/assets/readme/gradio_basic.png

We noticed that the Luchen Open-Sora team provides a script on GitHub that adapts the model to Gradio, along with command-line deployment instructions. However, you still need to work through a complex environment configuration before the deployment code will run. And to experience Open-Sora's full capabilities, especially generating long, high-resolution videos (such as 720p at 16 seconds), you need a GPU with strong performance and large memory (the team used H800s). The Gradio scheme itself doesn't address either of these problems.

These two problems look tricky at first glance, but Luchen Cloud solves both, making truly no-code deployment possible. How do you get started? Heart of the Machine has a super simple tutorial.

Ultra-simple one-click deployment tutorial

How easy is it to deploy Open-Sora on Luchen Cloud?

First, Luchen Cloud offers many GPU types, including high-end cards such as the A800 and H800, which can be rented easily. In our testing, a single one of these 80 GB cards was enough for Open-Sora inference.

Second, Luchen Cloud provides a dedicated image for the Open-Sora project. The image is like a fully furnished apartment you can move straight into: the complete runtime environment starts with one click, with no environment configuration needed.

Finally, Luchen Cloud's pricing and service are very friendly. An A800 costs less than 10 yuan per hour, and the time spent initializing the image is not billed. In other words, for under 10 yuan an hour, you can fully enjoy everything Open-Sora has to offer! We've also put the way to claim a 100-yuan coupon at the end of the article; register an account, grab the coupon, and follow along with our tutorial!


Luchen Cloud website: https://cloud.luchentech.com/

First, go to the website and register a Luchen Cloud account. The main page shows the machines available for rent in the compute marketplace. After claiming a coupon, or topping up 10 yuan, you can follow Luchen Cloud's user guide and start creating a cloud host.

Step one: select an image. Open the public images and the first one listed is OpenSora (1.2), which is very convenient.


Step two: choose a billing method. Two are available: tidal (off-peak) billing and pay-as-you-go. We tried both and found tidal billing more cost-effective; during idle hours the A800's price drops even lower!


For Open-Sora inference a single A800 is sufficient, so we chose a one-card configuration and enabled SSH access, persistent storage, and the mounted public data (which includes the model weights). These features are free of charge and add a lot of convenience.

After making your selections, click Create. The cloud host starts quickly, coming up within tens of seconds. This period is not billed, so even a large image with a longer startup wait won't cost anything.


Step three: from the cloud host page, click JupyterLab to open the web interface. A terminal is already open when you enter.

Type ls to list the virtual machine's files; the Open-Sora folder sits in the initial working directory.


Because we're using the dedicated Open-Sora image, no additional environment needs to be installed. The most time-consuming step is thus already taken care of.


Now we can run the Gradio command directly, and Gradio starts right away: truly one-click deployment.

python gradio/app.py

It's very fast: Gradio is up and running in just over ten seconds.

However, this Gradio app listens on the server's http://0.0.0.0:7860 by default. To use it from your local browser, you first need to add your own SSH public key to the Luchen Cloud machine. This step is also simple: open the following file and paste in your local machine's public key.


Next, set up local port forwarding. Follow the instructions in the screenshot, replacing the address and port with your own cloud host's values.
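As a generic sketch of what that forwarding command looks like (the host address and SSH port below are placeholders, not real values; copy the actual ones from your cloud host's page):

```shell
# Map local port 7860 to the Gradio port on the cloud host.
# <host> and <ssh-port> are placeholders; use the values shown
# on your own instance's page in the Luchen Cloud console.
HOST="<host>"
SSH_PORT="<ssh-port>"
CMD="ssh -p ${SSH_PORT} -N -L 7860:127.0.0.1:7860 root@${HOST}"
echo "${CMD}"  # run this in a local terminal, then open http://localhost:7860
```

Here `-L` forwards your local port 7860 to port 7860 on the cloud host's loopback interface, and `-N` keeps the connection open without running a remote command.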


Then, when you open the corresponding web page, you will soon see a visual interface.


We typed in an English prompt and clicked Generate (at the default 480p, generation is faster).


a river flowing through a rich landscape of trees and mountains

Generation finished quickly, taking about 40 seconds. The result is decent: there are rivers, mountains, and trees, all matching the prompt. But what we actually wanted was an eagle's-eye view from above.


Video Links: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650924168&idx=1&sn=311c6f5738d46764db045946da9cbcb5&chksm=84e420f6b393a9e0e2031f8ceb4f9fc405355c91adc3a25e558998d2dbf01415d925bc6cb524&token=1170670493&lang=zh_CN#rd

Fair enough. We adjusted the prompt and tried again:


a bird's eye view of a river flowing through a rich landscape of trees and mountains

This time the output really is a bird's-eye view. The model follows instructions quite well.


Video Links: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650924168&idx=1&sn=311c6f5738d46764db045946da9cbcb5&chksm=84e420f6b393a9e0e2031f8ceb4f9fc405355c91adc3a25e558998d2dbf01415d925bc6cb524&token=1170670493&lang=zh_CN#rd

As mentioned earlier, the Gradio interface offers many other options, such as resolution, aspect ratio, video duration, and even the motion amplitude of the video. There is a lot to play with. We used 480p in our tests, but up to 720p is supported; try the options one by one and compare the results.

Want to take it further? Fine-tuning is easy too

Digging deeper into Open-Sora's page, we found that they also provide code and instructions for continuing to fine-tune the model. Fine-tuning on your favorite kind of video would let the model produce clips closer to your own aesthetic!


Let's verify this with the video data in Luchen Cloud's public datasets.


Since the environment is fully configured, we just copy and paste the training command:

torchrun --standalone --nproc_per_node 1 scripts/train.py configs/opensora-v1-2/train/stage1.py --data-path /root/commonData/Inter4K/meta/meta_inter4k_ready.csv

The model then prints a series of training logs.


The training has started normally, and you can train with just a single card!

(Tip: we initially hit an out-of-memory (OOM) error and found the GPU memory still occupied after our program exited. It turned out we had forgotten to shut down the Gradio inference app from the previous step. When training on a single card, remember to stop Gradio first: the model it loads sits in GPU memory waiting for user input.)


Here's how much GPU memory we used during training:


Doing some quick math: a training step takes about 20 seconds, so by the data Open-Sora provides, training 70k steps would take roughly 16 days. That matches the roughly two weeks claimed in their documentation (assuming their machines complete a step in about the same time as ours).
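A quick sanity check of that estimate (assuming, as above, 20 seconds per step on a comparable machine):

```python
SECONDS_PER_STEP = 20   # observed on our single-card run
TOTAL_STEPS = 70_000    # total steps reported by Open-Sora

total_seconds = SECONDS_PER_STEP * TOTAL_STEPS
days = total_seconds / 86_400       # seconds in a day
print(f"{days:.1f} days")           # ≈ 16.2 days, close to the claimed ~2 weeks
```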


Of those 70k steps, the first stage accounts for 30k, the second for 23k, and the third for only 17k. That third stage is fine-tuning on high-quality video to greatly improve model quality, which is exactly what we want to do here.


According to the report, they trained on 12 machines with 8 cards each. So training the same amount of data as the third stage on the Luchen Cloud platform would cost approximately:

95 hours × 8 cards × 12 machines × 10 yuan/hour = 91,200 yuan
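The same arithmetic as a quick sketch:

```python
HOURS = 95                # stage-3 training time
CARDS_PER_NODE = 8        # GPUs per machine
NODES = 12                # machines used in the report
YUAN_PER_CARD_HOUR = 10   # approximate A800 price on Luchen Cloud

cost = HOURS * CARDS_PER_NODE * NODES * YUAN_PER_CARD_HOUR
print(f"{cost:,} yuan")   # 91,200 yuan
```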

That number is still a bit of a threshold for an evaluation article, but for building your own text-to-video model it is remarkably cost-effective. For enterprises especially, there's essentially no preparation needed: follow the tutorial step by step and a fine-tuning run can be completed for under 100,000 yuan, or even less. We look forward to seeing more domain-specific enhanced versions of Open-Sora!

Finally, here's the 100-yuan coupon promotion we mentioned earlier. Although our whole evaluation cost less than 10 yuan, free credit is still free credit!

According to Luchen Cloud's official announcement, users who share their experience (tagged #潞晨云 or @潞晨科技) on social media and professional forums (such as Zhihu, Xiaohongshu, Weibo, and CSDN) can receive a 100-yuan voucher (valid for one week) for each valid share, which works out to roughly five or six hundred videos of the kind we generated in this evaluation.


We've collected the relevant resource links below so you can get started quickly. To try it now, click "Read the original article" for one-click access and start your AI video journey!

Related Resource Links:

Luchen Cloud Platform: https://cloud.luchentech.com/

Open-Sora code repository: https://github.com/hpcaitech/Open-Sora/tree/main?tab=readme-ov-file#inference

Bilibili tutorial: https://www.bilibili.com/video/BV1ow4m1e7PX/?vd_source=c6b752764cd36ff0e535a768e35d98d2
