The chip can run LLMs, vision-language models (VLMs), and other generative AI models entirely on-device, without relying on cloud connectivity, bringing generative AI to edge devices.
The Hailo-10H is compatible with Hailo’s software stack and supported by a global developer community with more than 10,000 users each month. It empowers developers to run state-of-the-art vision and generative AI models directly on edge devices, delivering real-time responsiveness with ultra-low latency.
By processing data locally, the Hailo-10H protects data privacy: personally identifiable information never leaves the device. Local processing also reduces overall system costs by minimizing cloud bandwidth usage and the need for expensive cloud-based AI service subscriptions. Just as importantly, the AI operates independently of cloud connectivity, ensuring consistent availability even in environments with limited or no internet access.
Specifically designed for edge devices across consumer, enterprise, and automotive markets, including media centers, home gateways, and automotive cockpit systems, the Hailo-10H enables advanced use cases like natural language human-machine interaction, visual awareness, and multi-modal AI to run seamlessly within the power and cost constraints typical of edge environments.
In performance benchmarks, the Hailo-10H has demonstrated strong results across generative workloads: it achieves a first-token latency of under one second and a throughput of more than 10 tokens per second on a variety of 2B-parameter language and vision-language models. For video analytics, the Hailo-10H runs state-of-the-art object detection (e.g., YOLOv11m) on a real-time 4K video stream. All of this comes at a typical power consumption of just 2.5 W, making it ideal for compact, efficient AI-enabled systems. The Hailo-10H is automotive-qualified to AEC-Q100 Grade 2 standards and is aimed at automotive designs with start of production in 2026.