Abhishek Bagade's blog

This is Abhishek Bagade's blog on which I try to maintain my hobby projects and other stuff. I try to add blogs related to GATE, Arduino Projects and general ML side projects.

HermesDisplay — Ambient AI Display on ESP32-S3

AI-assisted research synthesis. Verify critical claims with primary sources.

Status: Completed
Last updated: 2026-05-01T20:00:00+05:30
Mode: explanatory

Summary

Overview

HermesDisplay is a calm, always-on, glanceable screen that answers one question: “Do I need to check anything right now?” Instead of repeatedly opening Gmail, Calendar, Slack, and news apps, an AI agent summarizes important items into a bounded daily brief and pushes it to a dedicated ESP32-S3 LCD screen. The device is passive by design: no notifications, no feed, no interactivity. It works with any MCP-capable AI agent through a standard stdio MCP server, backed by a daemon that runs as a systemd service.

Background

The problem is real: people looking for an ADHD-friendly alternative to compulsive phone checking are poorly served by existing products. Smart displays (Echo Show, Nest Hub) are cloud-dependent and designed for commerce, not peace. Dedicated info screens (DakBoard, Aura) lack AI pipelines. DIY ESP32 dashboards are user-configured, polling-based, and scroll endlessly. HermesDisplay fills that gap: the AI decides what matters, pushes it to a calm screen, and nothing else happens.

The target hardware is a Waveshare ESP32-S3 7-inch LCD board (800x480, RGB565, Wi-Fi/BLE, up to 16MB flash, 8MB PSRAM). The display stack is LVGL. The server side is Python: a FastAPI daemon with a SQLite retry queue, HSDL validation, a template-based layout engine, and a stdio MCP server interface.

Core Analysis

Architecture

Any AI Agent (Hermes / Claude Code / Codex / OpenCLAIDE)
   |
   +-- [via MCP / direct shell] -> hermes-display-mcp (Python stdio MCP server)
   |                                  |
   |                                  +-- hermes-display-daemon (FastAPI + retry worker)
   |                                  |       +-- SQLite (pending payloads queue)
   |                                  |       +-- template layout engine (5 templates)
   |                                  |       +-- [optional] Claude API layout generation
   |                                  |       +-- strict HSDL validator
   |                                  |       +-- HTTP POST to ESP32
   |                                  |
   |                                  +-- CLI modes:
   |                                        hermes-display-mcp         -> MCP server mode (stdio)
   |                                        hermes-display-mcp status  -> quick terminal check
   |                                        hermes-display-mcp push    -> manual push
   |                                        hermes-display-mcp init    -> first-time setup
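To make the daemon's "SQLite pending payloads queue + retry worker" box more concrete, here is a minimal sketch of how it could work. The table schema, the function names, and the `send` callable (a stand-in for the HTTP POST to the ESP32) are all assumptions for illustration, not the project's actual code. It also assumes that, because v0 pushes full screens only, the newest pending payload supersedes older ones.

```python
import json
import sqlite3
import time
from typing import Callable


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Create the pending-payload table if needed (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " attempts INTEGER NOT NULL DEFAULT 0,"
        " created_at REAL NOT NULL)"
    )
    return conn


def enqueue(conn: sqlite3.Connection, hsdl: dict) -> None:
    """Store a validated HSDL document until the device accepts it."""
    conn.execute(
        "INSERT INTO pending (body, created_at) VALUES (?, ?)",
        (json.dumps(hsdl), time.time()),
    )
    conn.commit()


def retry_once(conn: sqlite3.Connection, send: Callable[[str], bool]) -> bool:
    """Try to deliver the newest pending payload; return True on success.

    Assumption: a full-screen push makes older payloads obsolete, so a
    successful delivery clears everything up to and including that row.
    """
    row = conn.execute(
        "SELECT id, body FROM pending ORDER BY id DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return False  # nothing to send
    row_id, body = row
    if send(body):
        conn.execute("DELETE FROM pending WHERE id <= ?", (row_id,))
        conn.commit()
        return True
    # Delivery failed: record the attempt and leave the row for the next pass.
    conn.execute(
        "UPDATE pending SET attempts = attempts + 1 WHERE id = ?", (row_id,)
    )
    conn.commit()
    return False
```

A real worker would call `retry_once` on a timer with some backoff; the `attempts` column is there so that policy has something to work with.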

HSDL — Hermes Screen Description Language

HSDL v0 is a restricted JSON layout format with exactly 3 primitives: label, line, rect. Full screen updates only (no partial push in v0). Max 30 objects, 16KB payload, coordinates within 800x480, colors as #RRGGBB, 5 Montserrat font sizes. Server validates before push; ESP32 re-validates before render.
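The limits above translate fairly directly into a server-side check. The sketch below is one way the validator might look; the JSON field names (`objects`, `type`, `x`, `y`, `x2`, `y2`, `w`, `h`, `color`, `font_size`) and the concrete set of five font sizes are assumptions, since the HSDL schema itself is not spelled out here. Bounds are treated as inclusive so a rect may end exactly at 800x480.

```python
import json
import re

SCREEN_W, SCREEN_H = 800, 480
MAX_OBJECTS = 30
MAX_PAYLOAD_BYTES = 16 * 1024
PRIMITIVES = {"label", "line", "rect"}
COLOR_RE = re.compile(r"#[0-9A-Fa-f]{6}")
# Hypothetical values: the spec says "5 Montserrat font sizes" without listing them.
FONT_SIZES = {14, 20, 28, 36, 48}


def _in_bounds(x, y) -> bool:
    """True if (x, y) is an integer point inside the 800x480 panel (inclusive)."""
    return (isinstance(x, int) and isinstance(y, int)
            and 0 <= x <= SCREEN_W and 0 <= y <= SCREEN_H)


def validate_hsdl(doc: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the document passes."""
    errors = []
    if len(json.dumps(doc).encode("utf-8")) > MAX_PAYLOAD_BYTES:
        errors.append("payload exceeds 16KB")
    objects = doc.get("objects", [])
    if len(objects) > MAX_OBJECTS:
        errors.append("more than 30 objects")
    for i, obj in enumerate(objects):
        if not isinstance(obj, dict) or obj.get("type") not in PRIMITIVES:
            errors.append(f"object {i}: unknown primitive")
            continue
        kind = obj["type"]
        if not COLOR_RE.fullmatch(str(obj.get("color", ""))):
            errors.append(f"object {i}: color must be #RRGGBB")
        # Every primitive has an origin; line and rect also have a far point.
        points = [(obj.get("x"), obj.get("y"))]
        if kind == "line":
            points.append((obj.get("x2"), obj.get("y2")))
        elif kind == "rect":
            try:
                points.append((obj["x"] + obj["w"], obj["y"] + obj["h"]))
            except (KeyError, TypeError):
                points.append((None, None))
        if not all(_in_bounds(x, y) for x, y in points):
            errors.append(f"object {i}: coordinates outside 800x480")
        if kind == "label" and obj.get("font_size") not in FONT_SIZES:
            errors.append(f"object {i}: unsupported font size")
    return errors
```

Returning a list of messages rather than raising on the first problem makes it easier to log everything wrong with a rejected layout. The ESP32 would repeat the same checks in firmware before rendering.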

DisplayPayload Format

The semantic payload an agent produces before layout generation:

{
  "timestamp": "Mon 28 Apr 07:42",
  "mode": "morning",
  "headline": "No urgent fires. Two things need attention.",
  "sections": [
    {"title": "TODAY", "items": ["10am standup", "3pm 1:1"], "priority": 1},
    {"title": "EMAIL", "items": ["HR: leave update"], "priority": 2}
  ],
  "footer": "Hermes checked email, calendar, news and transcripts."
}

Priority rules: 1 = always visible, 2 = visible if space allows, 3 = optional.
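A small sketch of how those priority rules might be applied when the screen runs out of room. The line-budget model (one line per section title plus one per item) and the function name `prune_sections` are assumptions made for illustration; the real layout engine may measure space differently.

```python
def prune_sections(sections: list[dict], max_lines: int) -> list[dict]:
    """Keep priority-1 sections always; add priority 2, then 3, while lines remain.

    Assumption: each section costs one line for its title plus one per item.
    """
    def cost(section: dict) -> int:
        return 1 + len(section["items"])

    # Priority 1 is always visible, even if it alone exceeds the budget.
    kept = [s for s in sections if s["priority"] == 1]
    used = sum(cost(s) for s in kept)

    # Fill remaining space with priority 2 first, then priority 3.
    for level in (2, 3):
        for s in sections:
            if s["priority"] == level and used + cost(s) <= max_lines:
                kept.append(s)
                used += cost(s)

    # Return the survivors in the agent's original order.
    return [s for s in sections if any(s is k for k in kept)]
```

With the example payload above and a budget of five lines, both TODAY (three lines) and EMAIL (two lines) fit; with a budget of three, only TODAY is kept.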

Template-Based Layout (v0)

Five deterministic templates (morning, afternoon, urgent, quiet, error) each map a DisplayPayload to HSDL using fixed positioning, so the display works without an LLM. Claude-generated layout is an optional layer planned for later.
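Here is roughly what one such template could look like. This is a sketch only: the coordinates, colors, font sizes, the HSDL field names, and the function name `render_morning` are all made up for illustration. The point is that the mapping is plain arithmetic over the payload, with no model call involved.

```python
FOOTER_Y = 448      # hypothetical fixed footer position
CONTENT_TOP = 120   # hypothetical first row below the headline rule


def render_morning(payload: dict) -> dict:
    """Map a DisplayPayload to an HSDL document using fixed coordinates."""
    objects = [
        # Full-screen background, then timestamp, headline, and a divider.
        {"type": "rect", "x": 0, "y": 0, "w": 800, "h": 480, "color": "#101418"},
        {"type": "label", "x": 24, "y": 16, "text": payload["timestamp"],
         "color": "#8A939E", "font_size": 20},
        {"type": "label", "x": 24, "y": 52, "text": payload["headline"],
         "color": "#FFFFFF", "font_size": 28},
        {"type": "line", "x": 24, "y": 100, "x2": 776, "y2": 100,
         "color": "#2A3038"},
    ]
    y = CONTENT_TOP
    for section in payload["sections"]:
        needed = 32 + 28 * len(section["items"])
        if y + needed > FOOTER_Y - 8:
            break  # out of vertical room; remaining sections are dropped
        objects.append({"type": "label", "x": 24, "y": y,
                        "text": section["title"],
                        "color": "#8A939E", "font_size": 20})
        y += 32
        for item in section["items"]:
            objects.append({"type": "label", "x": 48, "y": y, "text": item,
                            "color": "#FFFFFF", "font_size": 20})
            y += 28
        y += 12  # gap between sections
    objects.append({"type": "label", "x": 24, "y": FOOTER_Y,
                    "text": payload["footer"],
                    "color": "#5A636E", "font_size": 14})
    return {"objects": objects}
```

Fed the example DisplayPayload from earlier, this produces ten objects, comfortably under the 30-object limit.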

Provisioning

First-time setup on a bare ESP32:

  1. Hardcode Wi-Fi credentials in firmware (simplest for desk use)
  2. If Wi-Fi fails, the ESP32 falls back to soft AP mode with a captive portal at 192.168.4.1
  3. The user connects a phone to HermesDisplay-XXXX and enters credentials; the ESP32 saves them and reboots
  4. The daemon finds the ESP32 via a hardcoded IP in .env (v0), or the ESP32 announces itself to the daemon (v1)

Device Discovery

| Method | Viability |
|---|---|
| Hardcoded IP + DHCP reservation | v0 default - simplest |
| ESP32 announces IP to daemon | v1 - best UX, requires daemon URL in firmware |
| mDNS | Avoid - ESP32 mDNS unreliable |
| LAN scan | Avoid - slow, fragile |
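The two viable methods can coexist on the daemon side. The sketch below assumes a small in-memory registry that prefers a recently announced address (v1) and falls back to the hardcoded IP from .env (v0). The class name, the staleness window, and the idea of expiring announcements are all illustrative assumptions.

```python
import time
from typing import Optional


class DeviceRegistry:
    """Tracks where the display lives: announced address first, .env fallback second."""

    def __init__(self, fallback_ip: Optional[str] = None, max_age_s: float = 3600.0):
        self.fallback_ip = fallback_ip      # v0: hardcoded IP read from .env
        self.max_age_s = max_age_s          # hypothetical staleness window
        self._announced_ip: Optional[str] = None
        self._announced_at = 0.0

    def announce(self, ip: str, now: Optional[float] = None) -> None:
        """v1: called when the ESP32 reports its current address after boot."""
        self._announced_ip = ip
        self._announced_at = time.time() if now is None else now

    def resolve(self, now: Optional[float] = None) -> Optional[str]:
        """Return the address to push to, or None if the device is unknown."""
        now = time.time() if now is None else now
        if self._announced_ip and now - self._announced_at <= self.max_age_s:
            return self._announced_ip
        return self.fallback_ip
```

In a v1 daemon, `announce` would sit behind a small HTTP endpoint that the firmware calls once Wi-Fi is up, which is why that route needs the daemon URL baked into the firmware.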

Customer Flows (12 audited)

| Flow | Status |
|---|---|
| A: First-time setup | Mostly covered (Gap A: PlatformIO docs) |
| B: Daily morning brief | Good |
| C: Afternoon refresh | Minor (mode diff, mid-glance swap) |
| D: Device reboot/recovery | Good |
| E: Agent/daemon failures | Gap F: stale-content nudge missing |
| F: Manual push from terminal | Gap G: must fix |
| G: Firmware brick recovery | Document recovery steps |
| H: Wi-Fi network change | Dual reconfig: ESP32 + daemon |
| I: Night mode / display dim | Gap K: must fix |
| J: Health check / status | Good |
| K: Multiple users | Out of scope v0 |
| L: First-time ESP32 user | Add docs |

Gaps Summary (27 total, A-Z)

Must-fix before v0:

High-risk tech gaps:

Risk Table (15 risks)

| # | Risk | Level | Mitigation |
|---|---|---|---|
| R1 | Claude on critical path | High | Template layout first |
| R2 | HTTP + LVGL concurrency | Medium | Lightweight handler, dirty flag |
| R3 | First-time LVGL/ESP32 | Med-High | Run Waveshare demo first |
| R4 | HSDL validation duplication | Low | Correct design |
| R5 | Value vs smartphone | Medium | Physically separate screen |
| R6 | Hermes-only MCP → agent-agnostic | Resolved | Stdio MCP server |
| R7 | LAN-only security | Low | Bearer token, locked URL |
| R8 | Touch hardware unused | Medium | Single-user, acceptable |
| R9 | No scrolling beyond 800x480 | Low | 30-object limit, priority pruning |
| R10 | Board variant differences | High | Pin exact SKU |
| R11 | Flash size overflow | High | Measure binary, require 16MB |
| R12 | Backlight unknown | Medium | Verify board schematic |
| R13 | No persistent RTC | Medium | NTP every boot |
| R14 | Dual-core not pinned | Medium | Pin LVGL + WiFi to separate cores |
| R15 | Flash wear from writes | Low | Skip flash for unchanged content |

Evidence and Sources

Alternatives Landscape

Gap: No existing product combines AI-decides-what-matters, push delivery, calm content, and notification-free UX.

Hardware Feasibility

Cross-Agent Compatibility

Uncertainties and Competing Views

High-Confidence Claims

Medium-Confidence Claims

Unresolved Questions

Practical Takeaways

Build Order

  1. Daemon + fake device (Python-only, test push/retry/validation)
  2. Real ESP32 firmware (PlatformIO, LVGL, HTTP server)
  3. End-to-end integration (fake agent pushes to real ESP32)
  4. MCP server + skill documentation
  5. Real digest pipeline (Gmail + Calendar + RSS)
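For step 1, the "fake device" can be a few lines of standard-library Python that behaves like the ESP32's HTTP endpoint. The sketch below is one possible shape: the `/display` path, the token value, and the helper names are assumptions, and the bearer-token check mirrors the mitigation listed for R7 rather than a documented firmware API.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeDevice(BaseHTTPRequestHandler):
    """Stands in for the ESP32: accepts a JSON POST and records it."""

    received = []          # payloads accepted so far (shared across requests)
    token = "test-token"   # hypothetical bearer token

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        if self.headers.get("Authorization") != f"Bearer {self.token}":
            self._reply(401)
            return
        try:
            FakeDevice.received.append(json.loads(body))
        except ValueError:
            self._reply(400)
            return
        self._reply(200)

    def _reply(self, status: int) -> None:
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def start_fake_device() -> HTTPServer:
    """Serve on an ephemeral localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), FakeDevice)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def push(url: str, doc: dict, token: str) -> int:
    """POST a document to the device and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
    )
    # Bypass any system proxy so localhost requests go direct.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Pointing the daemon at this server lets push, retry, and validation be exercised end to end before any hardware arrives; stopping the server mid-run is a cheap way to simulate the device going offline.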

Before Buying Hardware

Before Writing Firmware

First Agent Setup

pip install hermes-display
hermes-display-mcp init
systemctl --user start hermes-display-daemon

For Hermes: add to config.yaml under mcp_servers with stdio transport pointing to hermes-display-mcp.

For Claude Code: claude mcp add hermes-display -- hermes-display-mcp
