Ming Min and Meng Chen, reporting from Aofei Temple
QbitAI | WeChat official account QbitAI
I'm usually the one going around recommending the latest AI demos to everyone else.
This time, to my surprise, a friend from the art world recommended an AI painting app to me???
Without further ado, let's look at the results:
This modern, mysterious picture struck me the moment it appeared.
Abstract lines combine beauty and imagination while conveying the feel of a city of the future.
If I hadn't been told in advance, I really wouldn't have guessed right away that it was made by AI.
Besides reworking an existing photo, this AI can also draw from a text prompt, relying on its own imagination.
For example, given the prompt "sunset car" (four characters in Chinese), this is what the AI "imagines":
It can also draw in different styles; 20 are currently supported.
It really can meet whatever you ask of it, so no wonder it has won over the art and design crowd (doge).
Once the AI finishes, the result can even be saved as a phone wallpaper with one tap.
Reply "wombo" to our WeChat official account to open a blind box and get a random wallpaper; there are 5 in total~
Over the past few days, the app has also topped the App Store's Graphics & Design category for days on end (an Android version is available too).
Keep in mind that, in the past, many people in art and design circles complained that AI-generated content had no soul...
Yet now they are posting their own creations on social platforms, and even big-name verified accounts in the design space are weighing in.
But what surprised me even more was the story of the company behind this app.
The company's founder, now just 25, dropped out of the University of Toronto to start the business.
The company's first app (WOMBO.AI) makes the faces in photos lip-sync.
That's right, it's the magic effect that blew up on Douyin and swept the internet worldwide.
On the strength of that app, the company's valuation soared to $40 million (about 250 million yuan).
Their initial start-up capital was only $60,000.
Which makes you wonder: what kind of people can use AI to build a global hit app again and again?
A 25-year-old dropout with a company valued at $40 million
Let's start with how this young founder's company, Wombo, came about.
Wombo is a Canadian company; its founder and CEO is 25-year-old Ben-Zion Benkhin.
(From here on, we'll call him "Little Bengo" for short.)
Little Bengo originally studied mathematics and philosophy at the University of Toronto.
While in school, he started an AI interest club and was also very interested in deepfakes.
One summer night in August 2020, Little Bengo and his roommate were enjoying the breeze on the roof of their apartment building when inspiration struck:
Why hasn't anyone made an app that can turn an ordinary photo into a funny video?
After four hours of discussion, the rough outline of Wombo had taken shape.
He then chose to drop out of school to build the business.
At the same time, he brought in his friend Paul Pavel, who had been a management consultant, to join the venture, and recruited some University of Toronto students.
Among them, Angad Arneja gave up a full scholarship and dropped out just as Little Bengo had; he is now Wombo's human resources supervisor.
The company initially started with $60,000, largely thanks to the generosity of the founders' parents.
Little Bengo said the money went mainly to buying computers, recruiting developers, and brand promotion.
About half a year later, on February 28, 2021, Wombo was ready for release.
Little Bengo and the other founders sent the app to about 10 people.
Within a week, Wombo had 500,000 downloads.
The next week, that number jumped to 9 million.
As a result, Wombo attracted investors' attention and raised a $6 million angel round co-led by Global Founders Capital and Sofreh Capital.
It is now valued at $40 million.
It is worth mentioning that Wombo was rejected by more than 200 VCs before it took off.
So in the end, the real winners are the parents who provided the initial start-up capital.
Paul Pavel's parents, for example, put in $20,000 and ended up holding stock worth hundreds of thousands of dollars.
At present, Wombo's two apps have been downloaded more than 84 million times in total and have more than 10 million monthly active users.
Users have made 1 billion creations with Wombo and 180 million with Dream by Wombo.
The resulting revenue is also impressive: in the four-plus months Wombo was online last year, in-app advertising and a free song library brought in hundreds of thousands of dollars.
Dream by Wombo also lets users buy prints of their own AI-generated works.
A custom poster costs $20, and framed versions start at $45.
How does this AI draw?
Having an AI draw from text is what readers familiar with AI will recognize as multimodal generation.
Modality refers to different forms of information such as text, images, and sounds.
Multimodality combines different types of information.
If you label each picture with a text description to form a pair, and train the AI on a large number of such image-text pairs, it can learn the correspondence between images and text.
OpenAI's open-source CLIP works on this principle, and a Wombo engineer revealed in an interview that CLIP is used in their algorithm.
CLIP was trained on 400 million image-text pairs collected from the web, learning to understand colors and shapes, everyday objects and buildings, and even abstract art styles such as "impressionism" or "cyberpunk."
△ CLIP training data example
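To make the idea of learning from image-text pairs more concrete, here is a minimal Python sketch of a contrastive objective of the kind CLIP is trained with: each image should be more similar to its own caption than to anyone else's. The tiny hand-written embeddings, the helper names, and the temperature of 0.07 are illustrative assumptions; this is not CLIP's real encoder and not Wombo's code.

```python
import math

def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def cross_entropy(logits, target):
    # Numerically stable -log(softmax(logits)[target]).
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Pair i is (image_embs[i], text_embs[i]); every other combination is a mismatch.
    n = len(image_embs)
    logits = [[cosine(img, txt) / temperature for txt in text_embs]
              for img in image_embs]
    image_to_text = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    text_to_image = sum(cross_entropy(columns[j], j) for j in range(n)) / n
    return (image_to_text + text_to_image) / 2

# Hypothetical embeddings: each image sits close to its own caption.
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.1, 0.9]]
mismatched = image_embs[1:] + image_embs[:1]  # captions no longer line up

matched_loss = contrastive_loss(image_embs, text_embs)
mismatched_loss = contrastive_loss(mismatched, text_embs)
print(matched_loss < mismatched_loss)  # prints True
```

Training pushes the loss down, which is the same as pulling each image toward its own description and away from the others.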
Next comes the image-generation part.
That's right, it's time for GAN to take the stage again, and this time the GAN works under CLIP's direction.
The whole process goes like this:
Start by generating an unremarkable random image as a seed.
CLIP then scores how similar the image is to the text description and feeds that score back to the GAN, which keeps iterating with the goal of raising the score.
The entire iterative process can be watched in the app.
The randomness involved means the AI will almost never generate the same image twice.
If you are not satisfied with the first result, you can tap a button to try again with the same settings.
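The loop described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: `toy_generator` stands in for the GAN, `score` stands in for CLIP, and the text embedding is made up. Public CLIP+GAN notebooks backpropagate CLIP's score through the generator; here a finite-difference gradient with step halving is swapped in so the example needs nothing beyond the standard library. Wombo has not disclosed its actual implementation.

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def toy_generator(latent):
    # Hypothetical stand-in for a GAN generator: latent vector -> "image features".
    return [math.tanh(z) for z in latent]

def score(latent, text_embedding):
    # Stand-in for CLIP: similarity between the generated image and the text.
    return cosine(toy_generator(latent), text_embedding)

def numeric_gradient(latent, text_embedding, eps=1e-5):
    # Finite-difference estimate of how the score changes with each latent value.
    base = score(latent, text_embedding)
    grad = []
    for i in range(len(latent)):
        bumped = list(latent)
        bumped[i] += eps
        grad.append((score(bumped, text_embedding) - base) / eps)
    return grad

def guided_generation(text_embedding, seed, steps=100, step_size=0.5):
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in text_embedding]  # random starting image
    history = [score(latent, text_embedding)]
    for _ in range(steps):
        grad = numeric_gradient(latent, text_embedding)
        candidate = [z + step_size * g for z, g in zip(latent, grad)]
        if score(candidate, text_embedding) > history[-1]:
            latent = candidate   # keep the step: the score went up
        else:
            step_size *= 0.5     # overshoot: try a smaller step next time
        history.append(score(latent, text_embedding))
    return latent, history

text_embedding = [0.8, -0.3, 0.5, 0.1]  # made-up embedding of a text prompt
latent_a, history_a = guided_generation(text_embedding, seed=1)
latent_b, history_b = guided_generation(text_embedding, seed=2)
print(f"seed 1: {history_a[0]:.3f} -> {history_a[-1]:.3f}")
print(f"seed 2: {history_b[0]:.3f} -> {history_b[-1]:.3f}")
```

Both runs raise their score for the same prompt, but because they start from different random seeds they end at different latent vectors, which is the toy counterpart of never getting the same picture twice.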
As for which specific GAN Wombo's algorithm uses, that has not been disclosed.
But the company's job posting for a senior machine learning engineer says DC-GAN experience is preferred.
DC-GAN was first proposed in 2015 and was the first GAN variant to use a deep convolutional network to generate images.
This suggests that Wombo's algorithm is most likely an improvement built on that basis.
Wombo's reasons for choosing convolutional networks over Transformers are also not hard to guess.
The product is a mobile app used by people around the world and it generates high-resolution images, and convolution has the advantage in efficiency.
Wombo was not the first to combine CLIP and a GAN into an AI painter.
CLIP was released in January 2021, and the very next day the user @advadnoun began experimenting with combining it with various generative models.
In the end he chose BigGAN and published the code as a Colab notebook, The Big Sleep.
The paintings The Big Sleep generated in the early days were, how to put it, always a little disturbing to look at, and the resolution was not high.
(We recommend not digging through @advadnoun's early posts; they really are unsettling.)
Later, the Spanish enthusiast Katherine Crowson released a CLIP+VQGAN version built on this work.
VQGAN comes from a paper selected as a CVPR 2021 oral; it combines the high efficiency of CNNs with the high performance of Transformers to produce higher image quality.
This Colab notebook is the one that really took off, and many people began sharing AI-created paintings and developing all kinds of techniques.
For example, adding "Unreal Engine" or "ray tracing" to a text prompt can greatly improve image quality.
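As a small illustration of that trick, the Python sketch below appends style modifiers to a base prompt before it is handed to the model. The `build_prompt` helper and the comma-separated format are assumptions made for this example; individual notebooks may expect a different prompt syntax.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject with optional style modifiers into one text prompt."""
    parts = [subject.strip()]
    parts.extend(m.strip() for m in modifiers if m.strip())
    return ", ".join(parts)

prompt = build_prompt("sunset car", ["Unreal Engine", "ray tracing"])
print(prompt)  # sunset car, Unreal Engine, ray tracing
```

The modifiers do not change the algorithm; they only steer the text side of the comparison toward images that were captioned that way in the training data.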
A community began to form around CLIP+VQGAN: the code was continually optimized and improved, and there were dedicated efforts to collect and publish AI paintings.
The earliest pioneer, @advadnoun, also went on to join Adobe as a researcher.
But this wave of AI painting was driven mainly by technology enthusiasts.
After all, queuing for a GPU on Colab, running code to train the AI, and dealing with errors from time to time still sets a fairly high bar.
It wasn't until the advent of Dream by WOMBO that everything changed.
AI painting is starting to be taken seriously
In fact, many AI painting tools have emerged in recent years.
The earliest was DeepDream, which Google launched in 2015.
Later, besides the tools mentioned above, came NVIDIA's GauGAN, OpenAI's DALL·E, the open-source Disco Diffusion, and others.
Each has its own distinctive and surprising qualities, and AI painting is reaching more and more communities, most notably the art world, art collectors, and the NFT field.
Start with the art world, where Disco Diffusion is the tool people encounter most.
This AI replaces the GAN with a diffusion model and generates higher-quality images, almost on par with original artwork.
Although the bar for running the code yourself on Colab is not low, it is still very popular, and there are even shared documents that collect prompts.
A related question has also become a hot topic recently:
What kind of impact will AI painting have on the art industry?
In this discussion, most people feel that AI's impact on the art world is still relatively limited for now.
But what about the future? Opinions differ.
Some think AI can become an auxiliary tool for creators; others think AI can replace painters outright.
Zhihu answerer @Painting Flower Choke believes that AI painting still cannot eliminate the painting profession.
Looking further back, the camera did not eliminate realistic oil painting; more recently, 3D assistance did not eliminate realistic digital painting. ...... If you're still afraid of losing your job, you might as well paint better. Because in any industry, the high-end market is the hardest to eliminate.
@Fish generally feels that in the hands of professional painters AI will be a good tool: it can provide plenty of inspiration and can also serve as a rough draft.
Although @Liuuzaki also agrees that imagination is AI's strong suit, he believes AI will one day replace practitioners who work in a similar way.
AI is not good at logic, only good at aesthetics. It is a natural artist, not an engineer.
This way of working is very similar to how some art workers operate today.
Turning to the art-collecting world, AI painting has had a clearly visible impact in recent years.
In 2018, an AI-created portrait sold for $432,000 at Christie's in New York.
That was also the highest price at the auction, exceeding even the Picasso works in the same sale.
The painting's biggest selling point was the novelty of being created with a GAN.
One More Thing
Finally, AI painting has also reached NFTs, which are themselves shaking up the art world.
A platform called Eponym launched earlier.
It uses AI to turn text into paintings, which are then minted directly on OpenSea, the largest NFT marketplace.
On this platform, each text prompt can generate only one painting.
Its first batch of 3,500 NFTs sold out overnight on OpenSea.
△ Eponym generated works
The idea that AI painting will be the next trend in the NFT field has recently become a hot topic in the community.
In fact, Wombo also has plans to enter the NFT field.
At the end of last year, a user asked them on Twitter:
Do you mint user-generated paintings as NFTs?
The official reply: there is no minting at present, but the plan is under consideration!
What do you think about this?
(Don't forget: reply "wombo" to our WeChat official account to open a blind box and get a random wallpaper~)
AI Drawing Inspiration Sharing Library: https://docs.qq.com/sheet/DWFR0VmpQa3ZtbXda
The Big Sleep: https://colab.research.google.com/drive/1NCceX2mbiKOSlAd_o7IU7nA9UskKN5WR
CLIP+VQGAN: https://ljvmiranda921.github.io/notebook/2021/08/11/vqgan-list/
Disco Diffusion: https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb
Reference links:
[1] https://www.theglobeandmail.com/business/article-making-it-by-faking-it-how-torontos-wombo-became-canadas-fastest/
[2] https://www.8btc.com/article/6722724
[3] https://artthescience.com/magazine/2022/02/16/features-wombo-dream-and-ai-art-with-salman-shahid/
[4] https://weibo.com/u/5619550614?is_hot=1
[5] https://www.zhihu.com/question/528563685/answer/2447959396
[6] https://www.zhihu.com/question/528563685/answer/2445286621
[7] https://www.zhihu.com/question/528563685/answer/2445279372