
Enterprise AI requires lean and efficient data machines


How your company handles this will determine whether it evolves with the next phase of AI or becomes a relic of the past.

Translated from Enterprise AI Requires a Lean, Mean Data Machine, by Bharti Patel.

Seven years ago, eight Google researchers introduced the Transformer at a major machine learning conference (the 2017 paper "Attention Is All You Need"), taking AI into a new stage of evolution. The Transformer is the innovative neural-network architecture that enables today's large language models (LLMs) and the generative AI applications built on them. The work built on foundations laid by many others, including AI giants such as Turing Award winner Geoffrey Hinton and Fei-Fei Li, who is recognized for her insistence that big data is central to unlocking the power of AI. While hyperscale computing and academic research remain as vibrant as ever, another hot spot for AI model innovation today is the enterprise itself.

Companies across all verticals are wisely treating this watershed moment in AI history as an opportunity to optimize LLMs in innovative ways and use them to create new value. So far, however, that value has remained largely unrealized.

Now, halfway through 2024, getting the most out of LLMs requires enterprise innovators to first understand a large number of moving parts. Having the right underlying technology, and adapting it to the unique needs of the business, helps ensure that generative AI applications produce reliable results and real-world value.

Datasets, models, and tools

Of course, data is the fuel for AI, and massive public datasets power LLMs. But these public datasets may not contain the right data for what enterprise innovators are trying to achieve. The hallucinations and biases that arise from them also conflict with the quality control that businesses require. Data lineage, traceability, explainability, reliability, and security all matter more to business users: they must be accountable for data usage or face costly lawsuits, reputational damage, customer harm, and damage to their products and solutions. This means they must determine which in-house proprietary datasets should support model customization and application development, where those datasets are located, and how best to clean and prepare them for use by the model.
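As a minimal sketch of that kind of preparation step, assuming a simple in-house pipeline (the function names and record fields here are hypothetical), cleaning might start with deduplication, light scrubbing, and lineage tagging:

```python
import hashlib
import json
import re

def prepare_records(raw_texts, source_name):
    """Deduplicate, lightly scrub, and lineage-tag internal text records."""
    seen = set()
    prepared = []
    for text in raw_texts:
        normalized = " ".join(text.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:  # drop exact duplicates
            continue
        seen.add(digest)
        # Redact simple email-address patterns; a real pipeline needs far
        # more rigorous PII handling than this illustrative regex.
        scrubbed = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", normalized)
        prepared.append({
            "text": scrubbed,
            "source": source_name,  # lineage: where the record came from
            "checksum": digest,     # traceability: detect later tampering
        })
    return prepared

docs = prepare_records(
    ["Contact jane@example.com", "Contact jane@example.com"], "crm-export")
print(json.dumps(docs, indent=2))  # one record survives, scrubbed and tagged
```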

The LLMs we hear the most about are considered foundation models: models built by companies such as OpenAI, Google, and Meta that are trained on massive amounts of internet data, some of it high quality, some so poor it qualifies as misinformation. Foundation models are built for massive parallelism, adapt to many different scenarios, and require significant safeguards. Meta's Llama 2, "a pre-trained and fine-tuned LLM family with parameter sizes ranging from 7B to 70B," is a popular starting point for many businesses. It can be fine-tuned with unique internal datasets and combined with capabilities such as knowledge graphs, vector databases, and SQL for structured data. Fortunately, the open source community is highly active in delivering new, optimized LLMs.
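To make that concrete, here is a minimal sketch of parameter-efficient fine-tuning with LoRA adapters on Llama 2 using the Hugging Face transformers/peft stack. It assumes access to the gated meta-llama weights and a capable GPU, and internal_corpus.jsonl is a hypothetical stand-in for a cleaned proprietary dataset:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # gated; requires an approved HF account
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# LoRA trains small adapter matrices instead of all 7B base parameters.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# "internal_corpus.jsonl" stands in for a cleaned internal dataset with a
# "text" field per record.
data = load_dataset("json", data_files="internal_corpus.jsonl", split="train")
data = data.map(lambda r: tokenizer(r["text"], truncation=True,
                                    max_length=512, padding="max_length"),
                batched=True)

Trainer(model=model,
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        args=TrainingArguments(output_dir="llama2-lora",
                               per_device_train_batch_size=1,
                               num_train_epochs=1)).train()
```

This is a sketch, not a production recipe; batch sizes, adapter ranks, and evaluation all need tuning to the dataset at hand.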

The open source community has also become particularly helpful in providing tools that serve as the connective tissue of the generative AI ecosystem. LangChain, for example, is a framework that simplifies the creation of AI-based applications, with an open source Python library designed specifically to optimize the use of LLMs. In addition, a Linux Foundation project is developing an open standard for retrieval-augmented generation (RAG), which is critical for bringing enterprise data into pre-trained LLMs and reducing hallucinations. Enterprise developers can access many of these tools through APIs, a paradigm shift that is helping democratize AI development.
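To show the RAG pattern itself without tying it to any particular framework's API, here is a toy, self-contained sketch; the hashed bag-of-words embed function is purely a stand-in for a real embedding model, and the corpus strings are invented:

```python
import math

def embed(text, dim=256):
    """Toy embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query, corpus, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(corpus,
                  key=lambda d: -sum(a * b for a, b in zip(q, embed(d))))[:k]

corpus = ["Refund requests are honored within 30 days of purchase.",
          "Enterprise support tickets are triaged within four hours.",
          "Our data centers run on 60% renewable energy."]

question = "How fast are support tickets handled?"
context = "\n".join(retrieve(question, corpus))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this grounded prompt is what gets sent to the pre-trained LLM
```

The essential move is the same in any RAG stack: retrieve the most relevant enterprise passages first, then let the LLM answer from that grounded context rather than from its training data alone.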

While some businesses will have a dedicated research arm investigating new algorithms and LLMs, most will not reinvent the wheel. Fine-tuning existing models and leveraging a growing ecosystem of tools will be the fastest path to value.

Supercomputing and elastic data planes

The current era of AI, especially the boom in generative AI, is driving phenomenal growth in computing usage and advancements in GPU technology. This is due to the complexity and sheer number of computations required for AI training and AI inference, although the two processes consume compute differently. It is impossible here not to mention Nvidia, whose GPUs supply about 90% of the AI chip market and which is likely to continue to dominate with its recently announced GB200 Grace Blackwell superchip, capable of real-time trillion-parameter inference and training.

In addition to this powerful compute, the combination of the right datasets, fine-tuned LLMs, and a robust ecosystem of tools is critical to enabling enterprise AI innovation. But the technical backbone that gives all of this form is data infrastructure: the storage and management systems that unify data ecosystems. The data infrastructure that underpinned cloud computing is now also the foundation on which AI lives and grows.

Today's LLMs ingest data of unprecedented volume, velocity, and variety, and that creates complexity. The data types an LLM needs cannot simply be held in a cache. The base of LLMs is high-IOPS, high-throughput storage that can scale to massive datasets across potentially millions of nodes. With super-GPUs capable of lightning-fast reads, enterprises need low-latency, massively parallel storage systems designed for such stringent requirements and able to avoid bottlenecks. For example, Hitachi Vantara's Virtual Storage Platform One offers a new way to achieve data visibility across block, file, and object storage. Different types of storage need to be readily available to meet different model requirements, including flash, on premises and in the cloud. Flash can provide a denser footprint, aggregate performance, scalability, and efficiency to accelerate AI model and application development while keeping the carbon footprint in mind. Flash also reduces power consumption, which is critical to reaping the benefits of generative AI sustainably, now and in the future.
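A back-of-envelope sketch, with every figure below an illustrative assumption rather than a measured number, shows how such requirements can be sized, and why checkpoint bursts often stress storage more than steady training reads:

```python
# Assumed training-cluster shape; all values are illustrative placeholders.
gpus = 1024                     # GPUs consuming training data in parallel
samples_per_gpu_per_sec = 10    # assumed per-GPU ingestion rate
bytes_per_sample = 32 * 1024    # assumed ~32 KB per tokenized sample

read_rate = gpus * samples_per_gpu_per_sec * bytes_per_sample
print(f"Sustained read throughput: {read_rate / 1e9:.2f} GB/s")

# Periodically checkpointing a trillion-parameter model (the scale the GB200
# targets) multiplies the burst-write requirement.
params = 1e12                   # assumed model size
checkpoint_bytes = params * 2   # 2 bytes per parameter at bf16 precision
window_sec = 60                 # assumed acceptable checkpoint stall
print(f"Checkpoint burst write: {checkpoint_bytes / window_sec / 1e9:.2f} GB/s")
```

Under these assumptions the steady read stream is modest, but the checkpoint burst demands tens of gigabytes per second of parallel write bandwidth, which is where low-latency, massively parallel storage earns its keep.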

Ultimately, data infrastructure providers can best support enterprise AI developers by providing them with a unified, resilient data plane and easy-to-deploy appliances (along with generative AI building blocks, appropriate storage, and compute). A unified, elastic data plane is a lean machine that processes data with extreme efficiency: data plane nodes sit close to where the data resides, different data sources are easy to reach, and control over data lineage, quality, and security increases. With an appliance, models can sit on top of it and be trained continuously. This approach will accelerate the development of value-generating AI applications across enterprise domains.
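As a conceptual toy only (every name here is hypothetical, not any vendor's API), the data-plane idea can be sketched as one interface over several sources, with reads routed to the node co-located with the data and every access recorded for lineage:

```python
from dataclasses import dataclass, field

@dataclass
class DataPlane:
    sources: dict = field(default_factory=dict)  # source name -> (node, reader)
    lineage: list = field(default_factory=list)  # audit trail of accesses

    def register(self, name, node, reader):
        self.sources[name] = (node, reader)

    def read(self, name, key):
        node, reader = self.sources[name]      # route to the co-located node
        self.lineage.append((name, node, key)) # record access for lineage
        return reader(key)

plane = DataPlane()
plane.register("crm-objects", node="us-east-1", reader=lambda k: f"object:{k}")
plane.register("factory-files", node="on-prem-osaka", reader=lambda k: f"file:{k}")

print(plane.read("factory-files", "sensor-logs/2024-06-01"))
print(plane.lineage)  # every read is traceable to a source, node, and key
```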

Control costs and carbon footprint

Crucially, these technology foundations of the AI age must be built with cost-effectiveness and a reduced carbon footprint as goals. We know that at a time when the world urgently needs to cut carbon emissions, the spread of trained LLMs and generative AI across industries is increasing our carbon footprint. We also know that CIOs consistently make cost cutting a top priority. A hybrid data infrastructure approach gives organizations the flexibility to choose the most cost-effective option that best suits their specific requirements.

Most importantly, AI innovators should be clear about what they want to achieve and the models and datasets needed to achieve it, and then match hardware to those requirements, whether flash, SSD, or hard drives. Renting machines from a hyperscale provider or running them on premises can each be advantageous. Generative AI requires energy-efficient, high-density storage to reduce power consumption.

A hybrid data center with a high level of automation, a resilient data plane, and appliances optimized for building AI applications will help drive AI innovation in a socially responsible and sustainable way while still respecting the bottom line. How your business handles this may determine whether it evolves with the next phase of AI or becomes a relic of the past.