Welcome to the [AI Daily] column! This is your daily guide to the world of AI. Every day we bring you the hottest news in the field, with a focus on developers, to help you follow technology trends and discover innovative AI products.
Fresh AI products, click to learn more: https://top.aibase.com/
1. Alibaba releases FLUX.1-Turbo-Alpha: an 8-step distilled LoRA based on FLUX.1-dev
Alimama's creative team has released FLUX.1-Turbo-Alpha, an 8-step distilled LoRA built on the FLUX.1-dev model. A multi-head discriminator is used during training to improve distillation quality, and the LoRA supports a range of FLUX-related applications. The recommended guidance scale is 3.5 with a LoRA scale of 1, and a version with fewer steps is planned. The LoRA can be loaded through the Diffusers framework in a few lines of code to generate high-quality images. It was trained adversarially on more than 1 million images, each with an aesthetic score of at least 6.3 and a resolution above 800. The release is expected to help make AI image generation more widely accessible.
【AiBase Summary:】
🌟 Built on FLUX.1-dev, it uses 8-step distillation and a multi-head discriminator to improve image generation quality.
🖼️ It supports text-to-image generation and inpainting ControlNet, so users can easily create a variety of scenes.
📊 Training is adversarial and uses more than 1 million images to keep the model's output quality high.
Link: https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha
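The "few lines of code" mentioned above might look roughly like the following. This is a minimal sketch, not an official recipe: it assumes the base model is published as `black-forest-labs/FLUX.1-dev` (the item only names the model, not the repository), that the `diffusers` and `torch` packages are installed, and that a CUDA GPU is available. The `generate` helper and the constant names are hypothetical; the step count, guidance scale, and LoRA scale are the values quoted in the item.

```python
# Values quoted in the item above: 8 inference steps, guidance scale 3.5, LoRA scale 1.
BASE_MODEL = "black-forest-labs/FLUX.1-dev"       # assumed base model repository id
TURBO_LORA = "alimama-creative/FLUX.1-Turbo-Alpha"  # repository from the link above
TURBO_SETTINGS = {"num_inference_steps": 8, "guidance_scale": 3.5}
LORA_SCALE = 1.0


def generate(prompt: str, device: str = "cuda"):
    """Load FLUX.1-dev, apply the Turbo LoRA, and return one generated image."""
    # Heavy imports stay inside the function so the settings above can be
    # inspected on a machine without a GPU or the model weights.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    pipe.to(device)
    pipe.load_lora_weights(TURBO_LORA)      # attach the 8-step LoRA
    pipe.fuse_lora(lora_scale=LORA_SCALE)   # merge it into the base weights
    return pipe(prompt=prompt, **TURBO_SETTINGS).images[0]
```

Calling `generate("a cat reading a newspaper").save("turbo.png")` would download the weights on first use and write one image to disk.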
2. Say goodbye to cumbersome alignment! F5-TTS makes text-to-speech easy!
Recently, a team of researchers from Shanghai Jiao Tong University, the University of Cambridge, and Geely Automobile Research Institute released a new text-to-speech (TTS) system called F5-TTS. What sets it apart is a non-autoregressive approach that combines flow matching with a Diffusion Transformer (DiT), which removes several of the complex steps found in traditional TTS models.
【AiBase Summary:】
🌟 F5-TTS is a new non-autoregressive text-to-speech system that simplifies the complexity of traditional TTS models.
⚡ The system combines ConvNeXt and DiT to improve text-to-speech alignment and synthesis quality.
🔒 The research team is concerned about ethical issues and recommends the introduction of watermarking and detection mechanisms to prevent abuse.
Project page: https://github.com/SWivid/F5-TTS
Demo: https://huggingface.co/spaces/mrfakename/E2-F5-TTS
3. OPPO Docs AI features revealed! Apple iWork format conversion, document translation, document scanning, and more
OPPO published an official teaser today announcing the upcoming launch of a new Docs app. According to the teaser, the app will support a number of features, including "open files as you like", "AI write as you like", "format as you like", and "document search as you like".
【AiBase Summary:】
📝 OPPO is launching a new Docs app with support for several AI features.
🔄 The app can convert Apple iWork formats to improve compatibility.
📄 It provides document scanning and translation to streamline document processing.
4. Ant CodeFuse IDE 0.6 released, with AI fixes for editor diagnostics
Ant CodeFuse IDE 0.6 has been released, adding an AI fix feature and inline completion to make writing code more convenient and efficient. The IDE supports mainstream programming languages and offers code suggestions, problem fixes, and other functions.
【AiBase Summary:】
🚀 The IDE adds AI fixes for editor diagnostics: developers can resolve an error by hovering over it and clicking the Smart Fix button.
⚙️ Intelligent code completion has been improved to support drop-down completion and inline completion at the same time, and users can accept an inline completion with the Tab key.
💻 CodeFuse IDE is built on Ant's self-developed large model and the OpenSumi framework, and provides features such as an intelligent terminal and unit test generation.
Details: https://github.com/codefuse-ai/codefuse-ide
5. Apple's "multi-modal alchemy furnace" has been upgraded again! MM1.5 improves text-rich and multi-image understanding
Apple recently rolled out a major update to its multimodal AI model, MM1, upgrading it to MM1.5. This is more than a version number change: it is an across-the-board capability improvement that makes the model stronger in several areas. The core of the MM1.5 upgrade is its approach to data, including high-resolution OCR data and synthetic image captions, as well as an optimized data mixture for visual instruction fine-tuning.
【AiBase Summary:】
🚀 MM1.5 takes a data-centric approach to training, optimizing the training dataset to significantly improve text recognition, image understanding, and visual instruction following.
💡 MM1.5 comes in multiple versions from 1 billion to 30 billion parameters, including dense and mixture-of-experts (MoE) variants, so that even the smaller models achieve impressive performance.
🔍 MM1.5's capabilities center on text-rich image understanding, visual referring and grounding, multi-image reasoning, video understanding, and mobile UI understanding, which broadens its application scenarios.
Link: https://arxiv.org/pdf/2409.20566
6. Synthetic data is toxic! The Meta team confirmed that 1% of the data can make a large model completely collapse
Recently, something strange has been happening in the AI world. It is like a food blogger who starts eating only his own cooking: the more he eats, the more hooked he gets, and the worse the food becomes. It sounds scary, and the technical term for it is model collapse. Model collapse occurs when an AI model is trained on a large amount of its own generated data and falls into a vicious circle: the quality of what it generates keeps getting worse until the model breaks down completely.
【AiBase Summary:】
🔍 Model collapse: the AI model relies too heavily on synthetic data in training, so generation quality declines and the model eventually collapses.
💡 Solution: prioritize real data, use synthetic data sparingly, and control model size to avoid collapse.
📈 Experiments show that even 1% synthetic data may cause a model to collapse, and the larger the model, the more severe the collapse.
Details: https://arxiv.org/pdf/2410.04840
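The vicious circle described above can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions and not the paper's experimental setup: the "model" is just a Gaussian, and every generation is trained only on samples from the previous generation rather than on a mix of real and synthetic data. The function name `simulate_collapse` and its parameters are hypothetical.

```python
import random
import statistics


def simulate_collapse(generations: int = 200, samples_per_gen: int = 10, seed: int = 0):
    """Return the fitted standard deviation after each generation."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for the real data distribution
    history = [std]
    for _ in range(generations):
        # Sample only from the current model, never from the real distribution.
        data = [rng.gauss(mean, std) for _ in range(samples_per_gen)]
        # Refit the model on its own output.
        mean = statistics.fmean(data)
        std = statistics.pstdev(data)
        history.append(std)
    return history


history = simulate_collapse()
print(f"std at generation 0: {history[0]:.3f}")
print(f"std at generation {len(history) - 1}: {history[-1]:.3g}")
```

Because each refit slightly underestimates the spread and the errors compound, the fitted standard deviation tends to shrink toward zero: the toy model gradually forgets the diversity of the original data.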
7. The copyright application of the award-winning AI painting "Space Opera" was rejected
Recently, the U.S. Copyright Office refused to register "Space Opera", a work by synthetic media artist Jason Allen, sparking controversy. Allen has appealed the decision, arguing that the work contains substantial human creative input and should be protected by copyright. Whether AI-generated works deserve copyright protection has become the central question, prompting debate about copyright law for AI art.
【AiBase Summary:】
🌟 Allen believes that there is a lot of human creativity in the work and that it should be protected by copyright.
🤖 The Copyright Office refused to register the AI-generated work because it lacked sufficient human authorship.
📜 Allen's appeal could spur further discussion on copyright law for AI art creations.
8. TSMC expects third-quarter profit to rise 40% as the AI boom drives a surge in demand
TSMC is expected to report a 40% increase in third-quarter net profit, driven by surging demand for AI chips. Its customers include well-known companies such as Apple and Nvidia, whose products are advancing AI technology. The market is optimistic about TSMC's future performance, with both revenue and capital expenditure plans rising.
【AiBase Summary:】
💰 TSMC's third-quarter net profit is expected to reach NT$298.2 billion, up 40% from the same period last year.
📈 TSMC's customers are launching new products that are driving its performance above expectations.
🌍 TSMC has increased investment in the construction of new factories, and future capital expenditures are expected to be between $30 billion and $32 billion.
9. Anthropic CEO: AI will help humans fight disease and extend life expectancy to 150 years within 5-10 years
In a recent essay, Anthropic CEO Dario Amodei makes bold predictions about the future of artificial intelligence (AI). While acknowledging concerns about AI risks, he argues that AI's positive potential is enormous and could bring unprecedented progress to human society. Amodei stresses the urgency of addressing AI risks and warns against overhyping the technology. He envisions AI transforming biology and health, neuroscience and mental health, the economy and poverty, and governance.
【AiBase Summary:】
🔬 Biology and health: AI can accelerate medical advances, control infectious diseases, reduce cancer mortality, treat genetic diseases, and double human life expectancy to 150 years.
🧬 Biological freedom: AI gives humans more control over biological traits, including reproductive and physical choices.
🧠 Neuroscience and mental health: AI can improve the understanding and treatment of mental illness and enhance mental health.
10. Apple may launch a $2,000 Vision headset next year
Apple plans to launch a new Vision headset priced at around $2,000, using cheaper materials and a lower-performance processor. The device will not include the EyeSight feature and is part of Apple's mixed reality program. Apple also plans a second-generation Vision Pro, smart glasses, and AirPods with cameras, as well as an affordable iPad-like display and a desktop device with a robotic arm. Despite limited success in mixed reality so far, Apple continues to push ahead with related products.
【AiBase Summary:】
🔍 Apple plans to launch a new Vision headset priced at about $2,000, with cheaper materials and a lower-performance processor.
🚀 Apple aims to push the boundaries of mixed reality with a second-generation Vision Pro, smart glasses, and AirPods with cameras.
💡 Apple also plans an affordable iPad-like display and a desktop device with a robotic arm as part of its smart home strategy.
11. Google's share of U.S. search advertising may fall below 50%
Google's share of the U.S. search advertising market could fall below 50% by 2025 as new competitors emerge. Rivals such as TikTok, Amazon, and AI startup Perplexity are taking market share, and Amazon's rapidly growing search ad business is putting pressure on Google. Advances in artificial intelligence are reshaping the search advertising landscape, and Google plans to insert ads into AI search snippets. The search advertising market is undergoing a profound transformation.
【AiBase Summary:】
📉 Google's share of the search ads market is expected to fall below 50% by 2025 as it faces challenges from new competitors.
📱 TikTok and Amazon are rapidly rising to grab market share from Google.
🤖 Google plans to include ads in AI search snippets to provide brands with new channels for advertising.
12. Lenovo releases ThinkSmart Core Gen2, built for video conferencing, with AI for more efficient collaboration
Lenovo's ThinkSmart Core Gen2 is a new device for intelligent collaboration. Powered by Intel Core Ultra processors, it uses on-device AI processing to improve meeting efficiency and is positioned to reshape meeting spaces and ways of working.
【AiBase Summary:】
💡 The ThinkSmart Core Gen2 is powered by an Intel Core Ultra processor with an integrated neural processing unit (NPU) that delivers powerful AI processing and consumes 40% less energy.
💼 It supports Microsoft Teams Rooms and Zoom Rooms, with AI-enhanced features including intelligent framing, automatic speech recognition, smart gesture labeling, and more.
🔒 It offers a high degree of automation and proactive management, with pre-installed ThinkSmart Manager software and ThinkShield solutions for all-round security.