“Edge computing”, initially developed to make big data processing faster and more secure, has now been combined with AI to offer a cloud-free solution. Everyday connected devices, from dishwashers to cars and smartphones, show how this real-time data processing technology works: machine learning models run directly on built-in sensors, cameras, or embedded systems.
Homes, offices, farms, hospitals and transportation systems are increasingly embedded with sensors, creating significant opportunities to enhance public safety and quality of life.
Indeed, connected devices, also called the Internet of Things (IoT), include temperature and air quality sensors to improve indoor comfort, wearable sensors to monitor patient health, LiDAR and radar to support traffic management, and cameras or smoke detectors to enable rapid fire detection and emergency response.
These devices generate vast volumes of data that can be used to ‘learn’ patterns from their operating environment and improve application performance through AI-driven insights.
For example, connectivity data from Wi-Fi access points or Bluetooth beacons deployed in large buildings can be analysed with AI algorithms to identify occupancy and movement patterns across different periods of the year and event types, depending on the building type (e.g. office, hospital, or university). These patterns can then be leveraged for multiple purposes, such as HVAC optimisation and evacuation planning.
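To make this concrete, here is a minimal sketch in Python of how daily occupancy profiles could be clustered from hourly device counts. The synthetic data and the two-cluster choice are illustrative assumptions; a real deployment would parse access-point logs rather than generate data:

```python
# Minimal sketch: clustering daily occupancy profiles from Wi-Fi device counts.
# `counts` holds devices seen per hour for each day (shape: days x 24); the
# data below is synthetic, standing in for parsed access-point logs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
weekday = 50 + 40 * np.exp(-((np.arange(24) - 14) ** 2) / 18)  # afternoon peak
weekend = np.full(24, 10.0)                                    # near-empty building
counts = np.vstack([weekday + rng.normal(0, 5, 24) for _ in range(5)]
                   + [weekend + rng.normal(0, 3, 24) for _ in range(2)])

# Two clusters ~ "busy weekday" vs "quiet weekend" occupancy patterns.
model = KMeans(n_clusters=2, n_init="auto", random_state=0).fit(counts)
print(model.labels_)                    # e.g. [0 0 0 0 0 1 1]
print(model.cluster_centers_.round(0))  # average hourly profile per pattern
```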
Combining the Internet of Things and artificial intelligence comes with technical challenges
Artificial Intelligence of Things (AIoT) combines AI with IoT infrastructure to enable intelligent decision-making, automation, and optimisation across interconnected systems. AIoT systems rely on large-scale, real-world data to enhance the accuracy and robustness of their predictions.
To support inference (that is, deriving insights from collected IoT data) and decision-making, IoT data must be effectively collected, processed, and managed. For example, occupancy data can be processed to infer peak usage times in a building or predict future energy needs. This is typically achieved by leveraging cloud-based platforms such as Amazon Web Services or Google Cloud Platform, which host computationally intensive AI models – including the recently introduced Foundation Models.
What are Foundation Models?
- Foundation Models (FMs) are a type of Machine Learning model trained on broad data and designed to be adaptable to a variety of downstream tasks. They encompass, but are not limited to, Large Language Models (LLMs), which primarily process textual data; FMs can also operate on other modalities, such as images, audio, video, and time series data.
- In generative AI, Foundation Models serve as the base for generating content such as text, images, audio, or code.
- Unlike conventional AI systems that rely heavily on task-specific datasets and extensive preprocessing, FMs introduce zero-shot and few-shot capabilities, allowing them to adapt to new tasks and domains with minimal customisation (see the sketch after this list).
- Although FMs are still in the early stages, they have the potential to unlock immense value for businesses across sectors. Therefore, the rise of FMs marks a paradigm shift in applied artificial intelligence.
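As a minimal illustration of zero-shot adaptation, the sketch below uses the Hugging Face transformers library; the chosen model, the sensor reading, and the candidate labels are all illustrative assumptions, and no task-specific training data is involved:

```python
# Minimal zero-shot sketch using Hugging Face transformers
# (pip install transformers). The model and labels are illustrative;
# the FM adapts to the task from the label names alone.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

reading = "Room 2.14: CO2 at 1450 ppm, temperature 27 C, 38 devices connected."
labels = ["overcrowded room", "normal occupancy", "empty room"]

result = classifier(reading, candidate_labels=labels)
print(result["labels"][0])  # the label the model ranks most likely
```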
The limits of cloud computing on IoT data
While hosting heavyweight AI or FM-based systems on cloud platforms offers the advantage of abundant computational resources, it also introduces several limitations. In particular, transmitting large volumes of IoT data to the cloud can significantly increase response times for AIoT applications, often with delays ranging from hundreds of milliseconds to several seconds, depending on network conditions and data volume.
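A back-of-envelope calculation, with assumed link speeds and payload sizes, shows where such delays come from:

```python
# Illustrative arithmetic (assumed numbers): time to ship one batch of IoT
# data to the cloud, ignoring server-side queueing and processing time.
def upload_delay_ms(payload_mb: float, uplink_mbps: float, rtt_ms: float) -> float:
    transfer_ms = payload_mb * 8 / uplink_mbps * 1000  # serialisation time
    return transfer_ms + rtt_ms                        # plus one round trip

print(upload_delay_ms(0.5, 20, 40))  # small sensor batch: ~240 ms
print(upload_delay_ms(50, 20, 40))   # camera clip: ~20 s before any inference
```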
Moreover, offloading data – particularly sensitive or confidential information – to the cloud raises privacy concerns and limits opportunities for local processing near data sources and end users.
For example, in a smart home, data from smart meters or lighting controls can reveal occupancy patterns or enable indoor localisation (for example, detecting that Helen is usually in the kitchen at 8:30 a.m. preparing breakfast). Such insights are best derived close to the data source to minimise delays from edge-to-cloud communication and reduce exposure of private information on third-party cloud platforms.
What is edge computing and edge AI?
To reduce latency and enhance data privacy, edge computing is a compelling option: it provides computational resources (i.e. devices with memory and processing capabilities) closer to IoT devices and end users, typically within the same building, on local gateways, or at nearby micro data centres.
However, these edge resources are significantly more limited in processing power, memory, and storage than centralised cloud platforms, which poses challenges for deploying complex AI models.
To address this, the emerging field of Edge AI – particularly active in Europe – investigates methods for efficiently running AI workloads at the edge.
One such method is Split Computing, which partitions deep learning models across multiple edge nodes within the same space (a building, for instance), or even across different neighbourhoods or cities. Deploying these models in distributed environments is non-trivial and requires sophisticated techniques. The complexity increases further with the integration of Foundation Models, making the design and execution of split computing strategies even more challenging.
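As a toy illustration of the idea, the PyTorch sketch below cuts a small network in two, so that the first half could run on one edge node and the second on another. The architecture and cut point are assumptions for the example; a real system would ship the intermediate tensor over the network:

```python
# Minimal split-computing sketch in PyTorch: a small CNN is cut in two, with
# the head on one edge node and the tail on another. Both halves run in one
# process here; a real deployment would transfer `intermediate` between nodes.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),   # layers 0-1
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # layers 2-3
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # layers 4-5
    nn.Linear(32, 10),                           # layer 6
)

cut = 4  # chosen to balance compute vs. size of the tensor shipped
head, tail = model[:cut], model[cut:]  # nn.Sequential supports slicing

x = torch.randn(1, 3, 64, 64)          # e.g. a frame from an edge camera
intermediate = head(x)                 # computed on edge node A
# --- network transfer of `intermediate` would happen here ---
logits = tail(intermediate)            # computed on edge node B
print(logits.shape)                    # torch.Size([1, 10])
```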
What does it change in terms of energy consumption, privacy, and speed?
Edge computing significantly improves response times by processing data closer to end users, eliminating the need to transmit information to distant cloud data centres. Beyond performance, edge computing also enhances privacy, especially with the advent of Edge AI techniques.
For instance, Federated Learning enables Machine Learning models to be trained directly on local Edge devices (or even IoT devices with sufficient processing capabilities), ensuring that raw data remain on-device while only model updates are transmitted to Edge or cloud platforms for aggregation into a global model.
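Here is a minimal sketch of the federated averaging idea, using a toy PyTorch linear model with random stand-in data; real systems add client sampling, secure aggregation, and weighting by dataset size:

```python
# Minimal FedAvg sketch: each client trains locally on private data, and only
# the model weights (never the data) are sent back for averaging.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, target, epochs=1):
    model = copy.deepcopy(global_model)          # start from global weights
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(data), target)
        loss.backward()
        opt.step()
    return model.state_dict()                    # only weights leave the device

def federated_average(states):
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in states]).mean(dim=0)
    return avg

global_model = nn.Linear(4, 1)
# Two clients, each with a private local dataset (random stand-ins here).
states = [local_update(global_model, torch.randn(8, 4), torch.randn(8, 1))
          for _ in range(2)]
global_model.load_state_dict(federated_average(states))  # new global round
```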
Privacy is further preserved during inference: once trained, AI models can be deployed at the Edge, allowing data to be processed locally without exposure to cloud infrastructure.
This is particularly valuable for industries and SMEs aiming to run Large Language Models within their own infrastructure. LLMs can be used to answer queries about system capabilities, monitoring, or task prediction where data confidentiality matters. For example, a query might concern the operational status of industrial machinery – such as predicting maintenance needs from sensor data – where sensitive usage data must remain protected.
In such cases, keeping both queries and responses internal to the organisation safeguards sensitive information and aligns with privacy and compliance requirements.
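As a sketch, assuming an LLM served on-premises through Ollama's HTTP API on its default port (the model name, pump identifier, and maintenance prompt are all illustrative), such a query stays entirely inside the local network:

```python
# Sketch of an on-premises LLM query, assuming a model served locally with
# Ollama (https://ollama.com) on its default port. The model name and the
# maintenance prompt are illustrative; query and answer never leave the LAN.
import requests

prompt = ("Sensor log: vibration 4.2 mm/s (baseline 1.1), bearing temp 78 C. "
          "Does pump P-103 need maintenance soon? Answer briefly.")

response = requests.post(
    "http://localhost:11434/api/generate",  # local endpoint, not a cloud API
    json={"model": "llama3", "prompt": prompt, "stream": False},
    timeout=60,
)
print(response.json()["response"])
```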
How does it work?
Unlike the cloud, where mature platforms such as Amazon Web Services and Google Cloud are well established, there are currently no comparable platforms to support large-scale deployment of applications and services at the Edge.
However, telecom providers are beginning to leverage existing local resources at antenna sites to offer compute capabilities closer to end users. Managing these Edge resources remains challenging due to their variability and heterogeneity – often involving many low-capacity servers and devices.
In my view, maintenance complexity is a key barrier to deploying Edge AI services. At the same time, advances in Edge AI present promising opportunities to enhance the utilisation and management of these distributed resources.
Allocating resources across the IoT-Edge-Cloud continuum for safe and efficient AIoT applications
To enable trustworthy and efficient deployment of AIoT systems in smart spaces such as homes, offices, industries, and hospitals, our research group, in collaboration with partners across Europe, is developing an AI-driven framework within the Horizon Europe project PANDORA.
PANDORA provides AI models as a Service (AIaaS) tailored to end-user requirements (e.g. latency, accuracy, energy consumption). These models can be trained either at design time or at runtime using data collected from IoT devices deployed in smart spaces. In addition, PANDORA offers Computing resources as a Service (CaaS) across the IoT–Edge–Cloud continuum to support AI model deployment. The framework manages the complete AI model lifecycle, ensuring continuous, robust, and intent-driven operation of AIoT applications for end users.
At runtime, AIoT applications are dynamically deployed across the IoT–Edge–Cloud continuum, guided by performance metrics such as energy efficiency, latency, and computational capacity. CaaS intelligently allocates workloads to resources at the most suitable layer (IoT-Edge-Cloud), maximising resource utilisation. Models are selected based on domain-specific intent requirements (e.g. minimising energy consumption or reducing inference time) and continuously monitored and updated to maintain optimal performance.
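The hypothetical sketch below (not the actual PANDORA scheduler, whose internals are not described here) illustrates the kind of intent-driven placement logic involved: each layer of the continuum is scored against the application's stated intent, and the best feasible layer wins:

```python
# Hypothetical sketch of intent-driven placement across the IoT-Edge-Cloud
# continuum (NOT the actual PANDORA scheduler). All numbers are assumptions.
layers = {
    #        per-request latency (ms), energy (J), free compute units
    "iot":   {"latency": 5,   "energy": 0.2, "capacity": 1},
    "edge":  {"latency": 20,  "energy": 1.0, "capacity": 10},
    "cloud": {"latency": 300, "energy": 5.0, "capacity": 1000},
}

def place(required_capacity, intent):
    """intent maps metric name -> weight; the lowest weighted cost wins."""
    feasible = {name: m for name, m in layers.items()
                if m["capacity"] >= required_capacity}
    return min(feasible, key=lambda n: sum(w * layers[n][k]
                                           for k, w in intent.items()))

# Latency-critical workload lands at the edge; a heavy batch job goes to cloud.
print(place(3,   {"latency": 1.0, "energy": 0.1}))   # -> edge
print(place(200, {"latency": 0.1, "energy": 1.0}))   # -> cloud
```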
The post “What is ‘Edge AI’? What does it do and what can be gained from this alternative to cloud computing?” by Georgios Bouloukakis, Assistant Professor, University of Patras; Institut Mines-Télécom (IMT) was published on 02/22/2026 by theconversation.com