Practical Lessons from AI-Driven Manufacturers

Jan. 30, 2025
Insights from manufacturers that have already transformed their operations with AI indicate a preference for moving AI inference to the edge using vector stores and retrieval-augmented generation (RAG) to comply with security standards.

The vast number of IoT devices and equipment collecting data on-premises and in the cloud presents a challenge for manufacturers looking to generate insights. The reason? Manufacturers must first transfer all the data from their silos into one centralized repository. Then, after moving the data, manufacturers need specialized AI stack components to process and analyze the data in real time.

Manufacturers that can overcome these challenges have a massive opportunity to gain a competitive advantage with AI. Use cases span predictive maintenance, enhanced quality control and improved customer experience via faster response times, personalized communications and proactive issue resolution.

We have data to back up this claim: according to a 451 Research survey report commissioned by Vultr, 83% of AI-driven manufacturers confirm AI directly contributes to core business outcomes, such as revenue growth, market share and customer satisfaction. Moreover, 66% of these organizations report performing moderately to significantly better than industry peers in 2023.
However, the window to take advantage of this opportunity is closing: Nearly two-thirds of manufacturers report they expect to reach the highest level of AI maturity within the next two years. 

So, what does it take to be AI-mature, and how can manufacturers overcome the unique challenges standing in their way? To answer that, let’s unpack the survey data to learn from manufacturers that have already transformed their operations with AI. 

How AI inference at the edge leads to AI maturity in manufacturing

Regardless of their current infrastructure, 89% of manufacturing respondents to the survey anticipate shifting AI inference to the edge. (Editor’s note: If you’re unfamiliar with the term, “AI inference” refers to using an already-trained machine learning model to make predictions or decisions on new data without any additional training. Training teaches the model; inference simply applies what the model has already learned to the live data on the network to which it’s connected.)

In manufacturing, AI must often make split-second decisions, such as identifying defects on the production line. When the safety and performance of critical processes hang in the balance, latency becomes unacceptable, demanding that AI inference move closer to manufacturing operations.
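
To make the distinction between training and inference concrete, here is a minimal sketch: a defect classifier whose weights were “learned” offline and are now frozen, so each new sensor reading gets an instant score with no retraining step. The weights, feature names and logistic form are illustrative assumptions, not a real production model.

```python
import math

# Hypothetical "trained" weights for a defect classifier --
# learned offline (e.g., on a central GPU cluster), then frozen
# and shipped to the edge for inference.
WEIGHTS = [0.8, -1.2, 2.5]   # e.g., surface roughness, temperature, vibration
BIAS = -1.0

def predict_defect(features):
    """Inference: apply fixed, pre-trained parameters to new sensor
    readings. No weights are updated -- that is what separates
    inference from training."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))   # logistic activation -> probability

# Each production-line reading gets an immediate defect score.
score = predict_defect([0.4, 0.9, 1.1])
```

Because nothing in the model changes at inference time, the latency budget is just the cost of this forward pass, which is why pushing it to the edge pays off for split-second decisions.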

While moving AI inference to the edge comes with challenges of its own, it can also alleviate many of the unique obstacles manufacturers must overcome in adopting AI. The three challenges most frequently cited by manufacturers in the survey were: scaling applications in production (34%), governance for data and AI systems (33%) and data quality (33%). With the right infrastructure in place, AI inference at the edge can help manufacturers overcome each of these.

Data quality and governance at the edge

Manufacturers typically have strict data security and governance constraints, particularly in regulated industries like automotive and pharmaceuticals. Maintaining data governance compliance can be challenging, especially as the number of models in production stretches into the hundreds across multiple regulatory jurisdictions. 

One strategy for ensuring data governance flexibility is to leverage vector stores and retrieval-augmented generation (RAG). This approach trains models on GPU (graphics processing unit) clusters in a centralized location that complies with the organization’s security standards. Then, edge-based models use RAG to access sensitive local data held in protected vector stores as needed, maintaining local governance controls without exposing this data to model providers.

Using RAG and vector stores to separate sensitive data from the training data comes with several advantages. In addition to making it easier to distribute models across geographies with different data sovereignty requirements, it also cuts operating costs by reducing the hours required for model training. Rather than retraining the entire model with each influx of refreshed local data, enterprises can simply update the vector database the model accesses at the edge.
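
The pattern above can be sketched with a toy in-memory vector store. In practice this would be a managed, access-controlled vector database, and the embeddings would come from a real embedding model; the class name, documents and vectors below are all illustrative assumptions.

```python
import math

class LocalVectorStore:
    """Minimal in-memory stand-in for a protected, edge-resident
    vector store. Sensitive documents stay here; only the retrieved
    snippets are handed to the (centrally trained) model."""

    def __init__(self):
        self._entries = []   # (embedding, document) pairs

    def upsert(self, embedding, document):
        self._entries.append((embedding, document))

    def query(self, embedding, top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        ranked = sorted(self._entries, key=lambda e: cosine(e[0], embedding), reverse=True)
        return [doc for _, doc in ranked[:top_k]]

store = LocalVectorStore()
# Refreshed local data updates the store -- no model retraining needed.
store.upsert([1.0, 0.0], "Line 3 torque spec: 45 Nm")
store.upsert([0.0, 1.0], "Coolant batch 77 passed QA")

# RAG step: retrieve local context, then prepend it to the model prompt.
context = store.query([0.9, 0.1], top_k=1)
prompt = f"Context: {context[0]}\nQuestion: What is the torque spec on line 3?"
```

Note that the sensitive records never leave the store; the model only ever sees the retrieved snippet inside the prompt, which is what keeps local governance controls intact.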

As manufacturing companies adopt a multi-cloud strategy, the RAG layer pays further dividends: with GPU acceleration, it can retrieve precisely the information needed at the exact moment it’s needed. To unlock this potential, however, manufacturers need seamless access to that data through a pipeline that efficiently provisions it to applications.

Another key consideration here is streaming data management. Apache Kafka is an ideal platform for the real-time data processing necessary to feed low-latency inference applications, capable of continually updating RAG vector stores with the latest data. This allows AI models to generate real-time, accurate outputs without the need for manual data uploads.
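
A hedged sketch of that streaming pattern follows, with a plain queue standing in for a Kafka topic and a dictionary standing in for the vector store. In production, the loop would be a Kafka consumer built with a Kafka client library, and the embedding function would be a real model; all names and events here are illustrative.

```python
from collections import deque

# Stand-in for a Kafka topic carrying plant sensor events.
topic = deque([
    {"id": "press-01", "text": "Hydraulic pressure drifting high on press 01"},
    {"id": "oven-02", "text": "Oven 02 temperature stable at 210C"},
])

vector_store = {}   # id -> (embedding, text); placeholder for a managed store

def embed(text):
    # Toy character-histogram embedding -- a hypothetical stand-in
    # for a real embedding model.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def process_stream():
    """Drain the topic and upsert each event, so RAG queries always
    see the latest plant data without any manual upload step."""
    while topic:
        event = topic.popleft()
        vector_store[event["id"]] = (embed(event["text"]), event["text"])

process_stream()
```

The key property is that the vector store is refreshed continuously as events arrive, so the model's retrieved context is never staler than the stream itself.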

The infrastructure needed to scale AI inference at the edge

Serving inference at the edge depends on having the proper infrastructure in place. While 21% of manufacturers in the survey reported relying on a single cloud platform like AWS, Azure or GCP (Google Cloud Platform), doing so can expose companies to vulnerabilities like downtime or security breaches. An on-premises approach also has drawbacks, as few companies can afford the computing resources necessary to support AI inference at scale. Even if they could, the rapid pace of innovation would quickly render their investments outdated.

Considering that manufacturing companies tend to rely on a hybrid cloud infrastructure (a combination of on-premises and cloud technologies) to connect a complex array of IoT devices and machinery, AI inference can be difficult to scale unless all data is first aggregated in one place for analysis before inference. 

In contrast, a multi-cloud approach to AI inference allows manufacturers to distribute AI workloads across different environments, ensuring continuity and the flexibility to scale rapidly as production demands increase while optimizing operational costs.
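
The continuity argument can be sketched as a simple failover loop. The provider names and endpoint callables below are hypothetical stand-ins for real cloud inference clients; the point is only the routing pattern, not any particular vendor's API.

```python
# Multi-cloud continuity sketch: try inference endpoints in order
# and fall back when one is unavailable.

def primary_endpoint(payload):
    # Simulates an outage at the first provider.
    raise ConnectionError("provider A unreachable")

def secondary_endpoint(payload):
    # Simulates a healthy second provider returning a result.
    return {"defect": False, "score": 0.12}

ENDPOINTS = [("cloud-a", primary_endpoint), ("cloud-b", secondary_endpoint)]

def infer_with_failover(payload):
    """Return the first successful result, preserving continuity when
    any single provider is down."""
    errors = []
    for name, endpoint in ENDPOINTS:
        try:
            return name, endpoint(payload)
        except ConnectionError as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"all endpoints failed: {errors}")

provider, result = infer_with_failover({"image_id": "frame-123"})
```

The same loop also gives a natural place to rank endpoints by cost or latency per workload, which is how the cost-optimization claim cashes out in practice.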

This kind of serverless inference approach outsources the management and scaling of the infrastructure layer to the cloud provider, eliminating the operational overhead of infrastructure management and ensuring each AI workload is matched to the optimal compute resource for cost and performance. Manufacturers can then focus on the AI application layer, where they can best apply their expertise to optimize operations with AI. 

Kevin Cochrane is CMO at cloud services company Vultr.
