Lumen

LLM inference in Rust, for Apple Silicon (Metal) and NVIDIA (CUDA). One binary: download a model, run it GPU-resident, print text.

curl -fsSL https://servelumen.com/install.sh | bash