Infrastructure partner
onprem.ai logo

Infrastructure partner

Swiss-engineered on-premise enterprise LLM infrastructure.

onprem.ai replaces cloud AI with local enterprise LLM servers that are plug-and-play: OpenAI-compatible APIs, the latest models, and a managed platform built on proven data centre software. Member of the NVIDIA Inception Program, developed in Switzerland for air-gapped and regulated environments.

Why we partner with onprem.ai

Lunnoa Automate needs a sovereign inference layer beneath agents and workflows. onprem.ai delivers that as a turnkey on-premise LLM platform: NVIDIA-accelerated GPU servers, high-performance inference engines, and Kubernetes-based operations with zero cloud data leakage. Lunnoa sits on top as the application layer for automation, governance, and integrations.

  • LLM inference via vLLM, SGLang, LLama.cpp, and TensorRT-LLM on optimised NVIDIA and AMD GPU hardware
  • NVIDIA Blackwell-powered servers from single-GPU units to multi-node DGX-class clusters
  • OpenAI-compatible REST APIs so existing integrations swap cloud endpoints for local ones without code changes
  • Kubernetes and GitOps orchestration with API gateway, metrics, and automated model updates from onprem.ai's AI lab
  • 0% cloud data egress, GDPR/DSGVO-aligned processing fully inside your perimeter
Focus areas

What onprem.ai brings to the table

Core capabilities this partner contributes to Lunnoa Automate engagements.

  • LLM inference engine layer

    High-performance model serving through modern inference engines including vLLM, SGLang, LLama.cpp, and TensorRT-LLM. Models are tested, optimised, and delivered as updates from onprem.ai's enterprise AI lab so infrastructure stays current without manual patching cycles.

  • NVIDIA-accelerated GPU hardware

    Enterprise AI servers built for fast LLM inference at scale, from entry Blackwell RTX workstations through multi-GPU MGX configurations to DGX-class datacentre systems. Hardware and drivers are tuned for very large models, with modular sizing for teams from a single user to 150+ concurrent users per cluster.

  • Managed platform and GitOps operations

    A multi-layer stack from hardened Linux and GPU drivers through Kubernetes, Argo CD, and Helm. Containerised workloads combine API gateways, real-time metrics, and inference services into one maintainable private AI datacentre, including self-healing agents for incident detection and remediation.

  • Cloud-compatible integration APIs

    Fully OpenAI-compatible REST endpoints let teams connect chatbots, document processing, and custom AI workflows to local infrastructure. Deploy models and tools as Docker containers that scale with demand, alongside standard protocols for databases, internal services, and business applications.

Partner contacts

People at onprem.ai

Specialists from onprem.ai who collaborate with Lunnoa on client engagements.

Tino Bächtold

Tino Bächtold

Founder, onprem.ai

Founder of onprem.ai, the Swiss enterprise platform for on-premise LLM infrastructure. UZH-trained computational scientist who builds plug-and-play local AI servers for sovereign, air-gapped deployments with no cloud data leakage.

Your infrastructure. Your data.

Automate smarter. Stay in control.

Deploy Lunnoa Automate inside your own infrastructure. Every workflow, every agent run, and every data point stays in your environment, with full governance from day one.

Request a demo

Self-hosted · Deployed within days · Flat licence, unlimited usage · Full IT governance

Your Infrastructure

Lunnoa Automate

Governance layer

Govern

SSOSCIMRBACAudit trails
  1. Build

    WorkflowsAgentsKnowldge basesSkillsTools
  2. Automate

    RoutingWorkflow EngineSchedulingAgent Jobs
  3. Observe

    LogsTracesJob HistoryConversationsExecutions