Jörg Fuchs
Running AI video generation on consumer hardware - here is our full E2E pipeline that generates photos and videos without any cloud APIs.
Loading the pre-quantized FP8 Hunyuan model with quantization=disabled causes an OOM: HyVideoModelLoader upcasts the weights to bf16 (~24GB). Setting quantization to fp8_e4m3fn keeps them in FP8 (~12GB), leaving headroom for the VAE and sampling.
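A minimal sketch of that fix, patching a ComfyUI workflow (API format) before submitting it. The node id "1", the model filename, and the workflow shape are illustrative assumptions; only the HyVideoModelLoader class name and the quantization values come from the setup above.

```python
# Sketch: force every HyVideoModelLoader node to keep the pre-quantized
# Hunyuan weights in FP8 instead of upcasting them to bf16.
workflow = {
    "1": {
        "class_type": "HyVideoModelLoader",
        "inputs": {
            # illustrative filename, not a real path from the pipeline
            "model": "hunyuan_video_fp8_e4m3fn.safetensors",
            "quantization": "disabled",  # upcasts to bf16 (~24GB) -> OOM
        },
    },
}

def force_fp8(wf: dict) -> dict:
    """Set quantization=fp8_e4m3fn on every HyVideoModelLoader node."""
    for node in wf.values():
        if node.get("class_type") == "HyVideoModelLoader":
            node["inputs"]["quantization"] = "fp8_e4m3fn"  # ~12GB resident
    return wf

patched = force_fp8(workflow)
```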
We built a custom VRAM Guard service that coordinates GPU access between Ollama (LLM inference) and ComfyUI (media generation): before each video run it unloads Ollama's models and frees ComfyUI's cached models.
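The handoff can be sketched with the two public APIs involved: Ollama unloads a model when you send keep_alive: 0, and ComfyUI's /free endpoint drops cached models. The localhost URLs, the acquire_gpu_for_video name, and the default model name are assumptions, not the actual VRAM Guard implementation.

```python
import json
import urllib.request

# Assumed default local endpoints for both services.
OLLAMA_URL = "http://localhost:11434/api/generate"
COMFY_FREE_URL = "http://localhost:8188/free"

def ollama_unload_payload(model: str) -> dict:
    # keep_alive: 0 tells Ollama to evict the model from VRAM immediately
    return {"model": model, "keep_alive": 0}

def comfy_free_payload() -> dict:
    # ComfyUI's /free endpoint unloads cached models and frees VRAM
    return {"unload_models": True, "free_memory": True}

def post(url: str, payload: dict) -> None:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=30).read()

def acquire_gpu_for_video(llm_model: str = "llama3") -> None:
    """Clear the GPU before ComfyUI starts a video job."""
    post(OLLAMA_URL, ollama_unload_payload(llm_model))  # evict the LLM
    post(COMFY_FREE_URL, comfy_free_payload())          # clear ComfyUI cache
```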
ComfyUI API → n8n workflow orchestration → Social Poster service → auto-post to Twitter, LinkedIn, Reddit, Dev.to
All running on Docker Swarm across 6 nodes. No cloud dependencies.