Status: 150/185 features (81%)

Feature Registry

Canonical feature completeness across the optalocal stack.

150 of 185 features implemented across all apps

Opta LMX

1M — MLX Inference: 26/33

Opta LMX Features

Inference Server

  • MLX-native inference on Apple Silicon
  • OpenAI-compatible /v1/chat/completions endpoint
  • Streaming SSE responses
  • GGUF model loading (llama.cpp fallback)
  • Automatic quantization selection
  • Model hot-swap without restart
  • Concurrent request handling
  • KV cache management
  • Context length enforcement
  • vLLM backend for parallel batching
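Because the server speaks the OpenAI wire format, any OpenAI-style client can target it. A minimal sketch of building the request body for the /v1/chat/completions endpoint; the model name here is a hypothetical placeholder, and the exact supported parameters are an assumption:

```python
import json

def chat_request(model, messages, stream=True, max_tokens=256):
    """Build an OpenAI-compatible /v1/chat/completions payload.

    `stream=True` asks the server for SSE chunked responses instead
    of a single JSON body.
    """
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "max_tokens": max_tokens,
    }

payload = chat_request(
    "llama-3.1-8b-instruct",  # hypothetical model id
    [{"role": "user", "content": "Hello"}],
)
body = json.dumps(payload)  # POST this to /v1/chat/completions
```

The same payload works whether MLX or the llama.cpp fallback serves the model, since compatibility is enforced at the HTTP layer.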

Model Management

  • Model inventory API (/admin/models)
  • Dynamic load/unload API
  • Memory headroom enforcement (never crash on OOM)
  • Model health monitoring
  • HuggingFace model download integration
  • GGUF format support
  • LoRA adapter loading
  • Model benchmarking suite
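The headroom check that guards model loads can be reduced to one comparison. A sketch under assumed semantics (names and the 15% default reserve are illustrative, not the actual implementation): a load is rejected, rather than attempted and crashed, whenever it would eat into the reserved slice of unified memory.

```python
def can_load(model_bytes, total_bytes, in_use_bytes, headroom_fraction=0.15):
    """Return True only if loading the model leaves the reserved
    headroom untouched, so the server never OOMs on a load request."""
    reserved = total_bytes * headroom_fraction
    available = total_bytes - in_use_bytes - reserved
    return model_bytes <= available

# On a 64 GB machine with 30 GB in use, a 20 GB model fits
# (64 - 30 - 9.6 = 24.4 GB available); a 30 GB model does not.
fits = can_load(20e9, 64e9, 30e9)
too_big = can_load(30e9, 64e9, 30e9)
```

The dynamic load/unload API would run this check before touching the model files, returning an error response instead of exhausting memory.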

API Compatibility

  • OpenAI /v1/chat/completions
  • OpenAI /v1/models
  • Health endpoint /healthz
  • Admin events SSE /admin/events
  • Rerank endpoint /v1/rerank
  • Skills API /v1/skills
  • Agents API /v1/agents
  • Embeddings endpoint /v1/embeddings
  • Function calling (tool_use)
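Streaming responses on these endpoints arrive as OpenAI-style server-sent events: each line is `data: {json chunk}`, terminated by `data: [DONE]`. A hedged sketch of the client-side decode loop (the chunk shape follows the public OpenAI format; field names are assumptions about this server):

```python
import json

def iter_sse_deltas(lines):
    """Yield content deltas from an OpenAI-style SSE chat stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alives
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Example stream as it would arrive over /v1/chat/completions:
stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
text = "".join(iter_sse_deltas(stream))  # → "Hello"
```

The same framing is reused by the /admin/events SSE channel, only with different event payloads.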

Performance

  • ANE (Apple Neural Engine) utilization
  • Batch request coalescing
  • Throughput metrics (tokens/sec)
  • Active request tracking
  • Auto-tune quantization per model size
  • Thermal throttle detection
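The tokens/sec metric above is just generated tokens over elapsed wall time. A minimal sketch of a per-request meter (class and method names are illustrative; an injectable clock makes it testable):

```python
import time

class ThroughputMeter:
    """Count generated tokens and report tokens/sec over the elapsed window."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self._tokens = 0

    def add(self, n=1):
        """Record n newly generated tokens."""
        self._tokens += n

    def tokens_per_sec(self):
        elapsed = self._clock() - self._start
        return self._tokens / elapsed if elapsed > 0 else 0.0

# With a fake clock: 50 tokens over 2 seconds -> 25 tok/s.
now = [0.0]
meter = ThroughputMeter(clock=lambda: now[0])
meter.add(50)
now[0] = 2.0
rate = meter.tokens_per_sec()  # → 25.0
```

Aggregating one meter per active request gives both the throughput metric and the active-request count listed above.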