Fair points, and yes, failed deploys need to be handled explicitly.
In our case, the answer is not "hope and bash". We deploy versioned images, use health checks, monitor the result, and keep rollback simple: redeploy the previous known-good image/config. Host upgrades are also treated as maintenance events, with backups and a recovery path, not as something Compose magically solves.
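For what it's worth, here is a minimal sketch of that deploy/verify/rollback flow. It is illustrative only, not our actual tooling: `deploy_with_rollback`, the `deploy` and `is_healthy` callables, and the image tags are all hypothetical names. In practice `deploy` would wrap whatever brings up a given image tag, and `is_healthy` would hit the service's health endpoint.

```python
import time
from typing import Callable


def deploy_with_rollback(
    new_tag: str,
    previous_tag: str,
    deploy: Callable[[str], None],      # hypothetical: brings up the given image tag
    is_healthy: Callable[[], bool],     # hypothetical: True once the service passes its health check
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Deploy new_tag; if it never becomes healthy, redeploy previous_tag.

    Returns the tag that is left running.
    """
    deploy(new_tag)
    for attempt in range(attempts):
        if is_healthy():
            return new_tag
        # Wait between checks, but not after the final failed check.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    # The new version never passed its health check: go back to known-good.
    deploy(previous_tag)
    return previous_tag
```

The point of the shape is that rollback is the same operation as deploy, just with the previous tag, so there is no separate recovery code path to maintain or to discover is broken during an incident.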
But I think there is an opposite mistake too: assuming every production system should be operated like a high-scale tech company.
Many production workloads are boring, predictable, and business-critical. They do not need aggressive autoscaling, multi-node orchestration, or constant traffic-spike handling. They need reliable deploys, backups, monitoring, health checks, and a clear rollback path.
That is where Compose can be a good fit: simple operational model, understood failure modes, few moving parts.
Kubernetes becomes much more compelling when you actually need automated failover, rolling deploys, autoscaling, multi-node scheduling, and stronger deployment primitives.
Not needing Kubernetes is not necessarily denial; it is just choosing the complexity budget that matches the problem.
Definitely not a one-size-fits-all choice, but Kubernetes can be so easy, and there are so many benefits that get you from one small app to a medium-sized business, that it seems like a no-brainer for someone starting out. Spinning up k3s is pretty minimal overhead, but right away you can handle storage and backups very easily, automatic certs for all your apps with cert-manager are pretty much a one-and-done, traffic management for external and internal tools is easy, and even logins for websites are just an annotation in a yaml file. You can spin up and try out any software you want without spending time configuring it or setting up additional servers, and when you do need more hardware, it's one command on a virtual server, and just about as easy with physical hardware.
2-3 miniPCs, cloudflare, tailscale, and k3s can save (possibly tens of) thousands on SaaS products, and would probably scale you to a company of dozens AND host your product.
>2-3 miniPCs, cloudflare, tailscale, and k3s can save (possibly tens of) thousands on SaaS products, and would probably scale you to a company of dozens AND host your product.
Get a few Beelink SER5s or SER9s and install Nextcloud to cover files, document editing, and communications (to save on Microsoft 365). Then you can have Gitea (and gitea actions) for your source code and building (skipping github enterprise), Harbor to host and scan your containers, frappe for HR, etc. Pretty much anything you pay enterprise rates for, you can self-host a version that will get your company from 1 to 100s with minimal extra work. If it's not on https://github.com/awesome-selfhosted/awesome-selfhosted, you can probably vibe code it in a couple hours.
I just started to run a k3s cluster with an almost enterprise grade software factory and a few (light) production workloads on a single cheap minipc.
The concept totally works but I would worry about using a beelink in a business context where I had to support it.
For up to low hundreds of users I think you're better off just with 1 vertically scalable box for all the officey / web server workloads.
You mitigate the hardware failure stuff with a vendor contract where you can get someone on-site and overnight you parts, and by keeping things super boring. Volume replication is not boring, avoid at all costs. NAS or SAN if you have to but all disks in the main box for as long as you can.
For a 20-person SME maybe a 2-bay Synology or similar, for a heavier company a low-end 2U with hardware support. Proxmox under the OS for low-worry snapshots, rollback, backup etc. Proxmox is there for operational flexibility; resist the temptation to create a network of VMs, you just need 1 CT or VM with all the workload inside it.
For container workloads on 1 host Portainer works as well as k8s IMHO, it gives you the key property you want - you can IaC everything declaratively with terraform + compose over an API.
Caveat that if CI gets heavy you might need to scale that out but you can keep it stateless.
I checked your page. Wanted to ask, are you using longhorn with k3s for replicated volumes? How beefy a box do you need for that (CPU/MEM/Disk speed)?
I have several VMs in clouds with similar k3s architecture as yours and am wondering if there are any benefits to installing longhorn vs sticking to logical (postgres, mimir, whateveritis) replication instead.