
Running a GPU-Accelerated K3s Cluster at Home

Building a homelab with Proxmox, K3s, and GPU passthrough for self-hosted AI.

October 5, 2024
15 min read
Kubernetes · Homelab · AI

The Kosmos Project

I've been building a homelab infrastructure called Kosmos. The goal: run a production-grade Kubernetes cluster at home with GPU acceleration for AI workloads.

Hardware Setup

Node 1 (pve-node1)

  • Intel i9-13900K
  • 64GB RAM
  • NVIDIA RTX 4090
  • ZFS storage

Node 2 (pve-node2)

  • Intel i7-7700K
  • 32GB RAM
  • NVIDIA GTX 1070
  • LVM storage

Software Stack

Proxmox (Layer 2)

  • Virtualization layer
  • GPU passthrough via IOMMU
  • Cluster for high availability
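
The clustering bullet above comes down to Proxmox's `pvecm` tool. A sketch, with the cluster name and the peer IP as illustrative placeholders:

```shell
# On the first node: create the cluster (name is illustrative)
pvecm create kosmos

# On the second node: join, pointing at the first node's IP (placeholder)
pvecm add 192.168.1.10

# Verify quorum and membership
pvecm status
```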

K3s (Layer 4)

  • Lightweight Kubernetes
  • Installed via Pulumi
  • Server + agent architecture
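
K3s's server/agent split boils down to two invocations of the upstream installer (a sketch; the post drives this through Pulumi, and the server IP and token below are placeholders):

```shell
# On the server (control-plane) node:
curl -sfL https://get.k3s.io | sh -

# The join token lives on the server:
cat /var/lib/rancher/k3s/server/node-token

# On each agent node, join using the server URL and token:
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.10:6443 K3S_TOKEN=<node-token> sh -
```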

Workloads (Layer 5)

  • Media: Plex, *arr stack, qBittorrent
  • AI: Ollama, Open WebUI
  • Infrastructure: Traefik, Authentik, Uptime Kuma

GPU Passthrough

The key to running AI workloads is GPU passthrough:

  • Enable IOMMU in BIOS
  • Configure GRUB for IOMMU
  • Pass GPU PCI ID to VM
  • Install NVIDIA drivers in VM
  • Use nvidia-device-plugin in K8s
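
The first three steps look roughly like this on the Proxmox host (a sketch: Intel syntax is shown, AMD boards use `amd_iommu=on`, and the VM ID and PCI address are illustrative):

```shell
# /etc/default/grub on the host: enable IOMMU (Intel; AMD uses amd_iommu=on)
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# Regenerate the bootloader config, then reboot
update-grub

# After reboot, find the GPU's PCI address
lspci -nn | grep -i nvidia

# Attach the device to the VM (VM ID 100 and address 01:00.0 are illustrative)
qm set 100 -hostpci0 01:00.0,pcie=1
```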
```yaml
# GPU workload example
spec:
  containers:
  - name: ollama
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    nvidia.com/gpu.product: NVIDIA-GeForce-RTX-4090
```
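
The device-plugin step itself can be done with NVIDIA's official Helm chart (a sketch; the release name and the node name in the check are illustrative):

```shell
# Install the NVIDIA device plugin from its official chart repo
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvidia-device-plugin nvdp/nvidia-device-plugin --namespace kube-system

# The GPU should now show up as an allocatable resource on the node
kubectl get node pve-node1 -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
```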

Infrastructure as Code

Everything is managed with:

  • Pulumi (TypeScript) for K8s resources
  • Ansible for Proxmox configuration
  • GitHub Actions for CI/CD
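
As a sketch of what the Pulumi side might look like, here is an Ollama Deployment pinned to the GPU node via the same `nvidia.com/gpu.product` label used earlier (image tag, labels, and resource names are illustrative, not the post's actual code):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Sketch: an Ollama Deployment requesting one GPU and pinned to the RTX 4090 node.
const ollama = new k8s.apps.v1.Deployment("ollama", {
    spec: {
        replicas: 1,
        selector: { matchLabels: { app: "ollama" } },
        template: {
            metadata: { labels: { app: "ollama" } },
            spec: {
                containers: [{
                    name: "ollama",
                    image: "ollama/ollama:latest",
                    resources: { limits: { "nvidia.com/gpu": "1" } },
                }],
                // Schedule onto the GPU node using the product label
                // advertised by NVIDIA's GPU feature discovery.
                nodeSelector: { "nvidia.com/gpu.product": "NVIDIA-GeForce-RTX-4090" },
            },
        },
    },
});
```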

Results

I now have:

  • Self-hosted LLM inference (Ollama)
  • Hardware-accelerated media transcoding
  • SSO for all services (Authentik)
  • Monitoring and alerting (Uptime Kuma)

The project demonstrates how Infrastructure as Code can manage complex homelab setups.