Build a 3-Node Kubernetes Home Cluster

This post contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you.

A single-node Kubernetes install teaches you the API. A three-node cluster teaches you Kubernetes. The moment there is more than one machine, the things that actually matter in production show up on your bench: a control plane that survives a node reboot, storage that replicates across hosts, a load balancer handing out real IPs, and pods that reschedule when you pull the power on a node mid-demo. None of that exists on one box, and all of it is the point of building a cluster at home.

Original content from computingforgeeks.com - post 169411

This guide lays out three spec-verified 3-node Kubernetes home cluster builds, each made of three identical nodes plus the managed switch that ties them together: a mini-PC learning cluster for k3s and a real HA control plane, a serious homelab cluster with 10GbE built into every node for Longhorn or Ceph, and a production-like cluster that scales to many stateful namespaces. Two rules run through all of them and matter more than any single part: the three nodes must be identical, and the network between them is a first-class component, not the cheap switch left over in a drawer.

Specs verified against manufacturer datasheets and live listings in June 2026. The control-plane and storage guidance draws on our own Proxmox fsync testing rather than estimates.

The three clusters at a glance

The serious homelab cluster is the one most people should build. Its nodes carry dual Intel 10GbE on the board, which is exactly what replicated storage wants, and three of them run real stateful workloads without strain. The mini-PC learning cluster is the right start if your goal is to learn HA Kubernetes, MetalLB, and ingress cheaply and quietly, and you can outgrow it gracefully. The production-like cluster is for someone who already runs Ceph and many namespaces and needs the headroom, with an honest choice between quiet new mini-workstations and loud used servers that buy you ECC.

Mini-PC learning cluster (~$650 to $2,060 for all three nodes + switch): 3x small mini-PCs + a managed 2.5GbE switch. Learn k3s, an HA control plane, MetalLB, ingress.
Serious homelab cluster (~$3,740 to $6,450): 3x Minisforum MS-01 with dual 10GbE each + a managed switch. Longhorn or Ceph, real ingress, meaningful workloads. The all-rounder.
Production-like cluster (~$1,800 used to $8,750 new): 3x Ryzen 9 mini-workstations or 3x used ECC servers + a multi-10GbE switch. Heavy Ceph, many namespaces.

How we picked, and what the lab data shows

Every spec here was checked against the manufacturer page and a live listing, and the cluster-specific choices came from two places: the hard rules of running Kubernetes on bare metal, and our own storage measurements.

The measurement that drives the node choice is fsync latency. On our Proxmox lab nodes, an NVMe device cleared about 2,024 fsync operations per second where a SATA SSD managed 642. That gap is decisive for a cluster, because etcd, the database behind the control plane, fsyncs every change to disk before it acknowledges it. The etcd maintainers are blunt about it: slow disk fsync is the most common cause of leader elections and “apply took too long” warnings that make a cluster feel unstable. So every node that runs the control plane in these builds boots from a real NVMe, and the one configuration to avoid is the bargain mini-PC that boots from eMMC or an SD card, which has terrible fsync latency and makes a flaky etcd host. The numbers behind this are in our NVMe versus SATA comparison.
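To get a rough feel for fsync throughput on a candidate node, a small probe like the one below can help. It is a minimal sketch, not the method behind the lab figures above: the function name, the 4 KiB block size, and the write count are all assumptions chosen for illustration, and a dedicated benchmarking tool will give more rigorous numbers.

```python
import os
import tempfile
import time


def fsync_ops_per_second(directory: str, writes: int = 200, block_size: int = 4096) -> float:
    """Write small blocks, fsync after each one, and return fsyncs per second."""
    payload = b"\0" * block_size
    # The temp file is created on whichever disk backs `directory`.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the block to stable storage before continuing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return writes / elapsed


if __name__ == "__main__":
    # Assumption: the current directory lives on the disk you want to test.
    rate = fsync_ops_per_second(".")
    print(f"{rate:.0f} fsync ops/s")
```

Run it from a directory on the drive that would hold etcd. Treat the result as a relative comparison between disks rather than an absolute figure, since filesystem and drive-cache settings change the outcome.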

For sizing workloads per node, we lean on the per-model density figures from our homelab mini-PC guide rather than inventing pod counts. What we do not do is quote a failover time or a Ceph rebuild rate for these exact three-node builds, because we have not benchmarked that specific hardware and a made-up number helps no one. Where a figure is ours, it is measured; the rest stays honest about being qualitative.

Cluster comparison: which build fits you

Spec (per node)	Learning cluster	Serious homelab	Production-like
Node	Beelink SER8 / EQ14	Minisforum MS-01	Minisforum MS-A2 / used server
CPU	Ryzen 7 8C / N150 4C	Core i9-13900H 14C	Ryzen 9 16C / Xeon
RAM/node	32GB / 16GB	64GB	96GB / 128GB+ ECC
Network/node	2.5GbE	2x 10GbE + 2x 2.5GbE	2x 10GbE + 2x 2.5GbE
ECC	No	No	Used-server path only (RDIMM)
Cluster total	~$650 to $2,060	~$3,740 to $6,450	~$1,800 to $8,750
Best for	Learning HA k8s	Longhorn/Ceph, real apps	Heavy Ceph, many namespaces

Mini-PC learning cluster: three small nodes to learn HA Kubernetes

The learning cluster uses three identical Beelink SER8 mini-PCs. Image: Beelink.

This is the quietest, cheapest way to run a real three-member control plane at home. The recommended node is the Beelink SER8, an eight-core Ryzen 7 box with 32GB of DDR5 and a real NVMe slot, which makes a capable control-plane or worker node that handles etcd’s fsync load without complaint. Three of them draw very little power and sit silently on a shelf, which is what you want for a cluster that runs all day while you learn k3s, MetalLB, and ingress on it.

Skip it if you intend to run serious replicated storage and real stateful apps from day one. A single 2.5GbE port per node and 32GB of RAM are fine for learning, but Longhorn and Ceph replication want the 10GbE fabric of the serious tier.

Component	Pick	Approx price (Jun 2026)
Node (recommended) x3	Beelink SER8 (Ryzen 7 8745HS, 32GB, NVMe)	$450-$650/node
Node (budget floor) x3	Beelink EQ14 (Intel N150, 16GB, dual-2.5G SKU)	$190-$260/node
Switch	MokerLink 8x 2.5G + 10G SFP+ (managed, fanless)	$80-$110

The budget floor is three Beelink EQ14 boxes on the Intel N150. They cap at 16GB of RAM each, so a three-node EQ14 cluster is a 48GB cluster, which is enough for a control plane, MetalLB, ingress, and a few light pods, and almost nothing more. It is the cheapest honest way to learn HA Kubernetes, and you will outgrow it, which is fine. One caution on the EQ14: it ships in several network variants, so buy the explicit dual-2.5G SKU and buy three of the exact same one, because mixing a Realtek-NIC node with an Intel-NIC node breaks the identical-node rule that the whole cluster depends on. The switch is the part people skimp on and should not; a managed 2.5GbE switch with VLANs covers three nodes with room for a NAS, and the managed switch guide covers the choice.
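The tier's price range and the 48GB figure are simple per-node arithmetic, which the sketch below reproduces from the table above. The helper name is hypothetical; the prices are the June 2026 ranges quoted in the table, so re-check them before ordering.

```python
def cluster_total(node_price: int, switch_price: int, nodes: int = 3) -> int:
    """Total cost of identical nodes plus one switch, in USD."""
    return nodes * node_price + switch_price


# (low, high) price ranges from the table above
eq14 = (190, 260)    # Beelink EQ14, budget floor
ser8 = (450, 650)    # Beelink SER8, recommended node
switch = (80, 110)   # managed 2.5GbE switch

floor = cluster_total(eq14[0], switch[0])    # cheapest EQ14 build
ceiling = cluster_total(ser8[1], switch[1])  # priciest SER8 build
eq14_cluster_ram_gb = 3 * 16                 # EQ14 caps at 16GB per node

print(floor, ceiling, eq14_cluster_ram_gb)
```

The same function works for any tier once RAM and NVMe are folded into the per-node price, which is why per-node lines move the total three times over.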

Serious homelab cluster: three 10GbE nodes for real workloads

The serious cluster uses three Minisforum MS-01 nodes, each with dual Intel 10GbE. Image: Minisforum.

This is the cluster most people should build, and the reason is on the back panel. The Minisforum MS-01 carries two Intel X710 10GbE SFP+ ports and two Intel 2.5GbE ports onboard, so every node has an enterprise-grade 10GbE replication fabric without an add-in card. That matters because Longhorn and Rook-Ceph keep a copy of each volume on a different node, and that replication traffic crosses the wire constantly; on a slow network it becomes the bottleneck, while on 10GbE it stays out of the way. Each node also has three M.2 slots, so one NVMe handles boot and etcd while a second holds the replicated storage pool. At 64GB per node, three of these make a 192GB cluster that runs real stateful applications.

Skip it if you are learning and the spend is hard to justify, in which case the mini-PC tier teaches the same lessons, or if you need ECC and very large per-node memory, which is the production tier’s used-server path.

Component	Pick	Approx price (Jun 2026)
Node x3	Minisforum MS-01 (i9-13900H, dual 10GbE X710 + dual 2.5G)	$680-$900/node
RAM per node (to 64GB)	2x 32GB DDR5-5600 SO-DIMM (non-ECC)	$140-$300/node
NVMe per node (x2)	Samsung 990 Pro 2TB (boot+etcd / storage)	$200-$450 each
Switch	Sodola 8x 2.5G + 10G SFP+ (managed, fanless)	$100-$150

Memory is the line that moves the total, and it moves three times because you buy it per node. At 2026 prices, 64GB of DDR5 per node is the largest single cost after the boxes themselves, so re-check it the day you order. Two NVMe drives per node is the deliberate design: keep the boot disk and etcd on one drive and the replicated storage on another, so a busy Ceph rebuild does not contend with the control-plane database. One note on the switch: the MS-01 has 10GbE NICs, so if you push Ceph or Longhorn hard you will want a switch with several 10G ports, so that replication runs at 10G rather than 2.5G; that is the production tier’s switch below. For the bare-metal load balancer and ingress these nodes need, the MetalLB guide and the Longhorn storage guide pick up where the hardware leaves off.

Production-like cluster: 10GbE nodes, or used servers for ECC

The production-like cluster runs on three Minisforum MS-A2 nodes, or on used ECC servers. Image: Minisforum.

This tier is for a cluster that runs Ceph in earnest across many namespaces, and it splits into two honest paths. The new path is three Minisforum MS-A2 nodes, each a 16-core Ryzen 9 with dual Intel X710 10GbE SFP+ onboard, which gives a quiet, warrantied, 48-core cluster with a proper 10G fabric. The used path is three identical second-hand servers, such as a Dell R640, HP DL360 Gen10, or Lenovo SR630, which is how you get ECC memory and 128GB or more per node affordably. The tradeoff is exactly what you would expect: the mini-workstations are silent and sip power but cap at 96GB of non-ECC memory, while the used servers give you registered ECC and far more RAM but are loud, rack-mounted, and draw real idle power.

Skip it if the serious homelab tier already covers your workloads, which it does for most people. This tier earns its cost only when you genuinely run heavy Ceph or need ECC across a large memory footprint.

Component	Pick	Approx price (Jun 2026)
Node, new path x3	Minisforum MS-A2 (Ryzen 9 9955HX, dual 10GbE, barebone)	$799-$871/node
RAM per node (to 96GB)	2x 48GB DDR5-5600 SO-DIMM (non-ECC)	$360-$560/node
NVMe per node (x2-3)	Samsung 990 Pro 2TB (boot+etcd / Ceph / scratch)	$200-$450 each
Node, used path x3	Dell R640 / HP DL360 Gen10 / Lenovo SR630 + RDIMM ECC + Intel X710 (used, eBay)	$400-$900/node
Switch	MokerLink 8x 2.5G + 2x 10GbE + 2x 10G SFP+ (L3, MokerLink store)	$300-$400

The memory rule is the load-bearing detail on this tier. The new MS-A2 nodes take DDR5 SO-DIMM, which is non-ECC on this platform and tops out at 96GB per node. ECC enters only on the used-server path, where the servers use registered ECC RDIMM, and registered DIMMs are not interchangeable with the SO-DIMM or unbuffered modules of the mini-PCs, so never put one type in the other platform. That single distinction is what splits this tier into two builds. The switch here has four 10G ports, enough that each of the three nodes runs its replication NIC at 10G with an uplink to spare, which is what makes Ceph viable. The exact model sells mainly through the manufacturer’s store rather than a clean retail listing, so the table points there rather than guessing a link.

Why the three nodes must be identical

This is the rule that separates a cluster from three random computers on a switch. Kubernetes spreads pods across nodes and, with Longhorn or Ceph, keeps a copy of each volume on a different node. If one node has a weaker CPU, less RAM, or a slower disk, the scheduler and the storage layer both degrade to the level of that node, and the failover drills that are the whole reason to build a cluster stop being meaningful, because draining node two no longer behaves like draining node one. So buy three of the exact same model and the exact same configuration: same CPU, same RAM, same NIC, same NVMe. This is also why the budget tier warns against mixing network variants of the same mini-PC, since a Realtek NIC on one node and an Intel NIC on another is not an identical cluster. Identical nodes cost a little more discipline at the order stage and save a great deal of confusion later.
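One way to enforce the rule at the order stage is to write the three configurations down and compare them field by field. The sketch below does that for a hand-written inventory; the node names, field names, and values are hypothetical and exist only to illustrate the Realtek-versus-Intel mix described above.

```python
def mismatched_fields(nodes: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    """Return each spec field whose value differs across nodes, with the values seen."""
    fields = set().union(*(spec.keys() for spec in nodes.values()))
    mismatches = {}
    for field in sorted(fields):
        # A field missing on one node also counts as a mismatch.
        values = {spec.get(field, "<missing>") for spec in nodes.values()}
        if len(values) > 1:
            mismatches[field] = values
    return mismatches


# Hypothetical inventory: node3 arrived as a different network variant.
inventory = {
    "node1": {"cpu": "Intel N150", "ram": "16GB", "nic": "Intel 2.5GbE"},
    "node2": {"cpu": "Intel N150", "ram": "16GB", "nic": "Intel 2.5GbE"},
    "node3": {"cpu": "Intel N150", "ram": "16GB", "nic": "Realtek 2.5GbE"},
}

print(mismatched_fields(inventory))
```

An empty result means the three nodes match on every recorded field; here the check flags only the nic field, which is the mismatch to fix before building the cluster.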

The switch is part of the cluster, not an afterthought

Cluster traffic never stops. Pods talk to pods across nodes, etcd members exchange heartbeats, and replicated storage copies every write to another host, all of it across the switch. A dumb gigabit switch throttles that replication and adds latency to etcd, which is precisely the traffic a cluster cannot afford to slow down. Every build here names a managed switch sized for three nodes plus an uplink, with VLAN and LACP support so you can separate cluster, storage, and management networks. The learning tier runs fine on a managed 2.5GbE switch; the storage tiers want a 10GbE fabric so replication is never the bottleneck. Treat the switch as a node-class purchase rather than the cheapest box that has enough ports, and the managed switch guide covers the 2.5G-versus-10G decision in detail.
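The gap between link speeds is easy to put in numbers with wire arithmetic alone. The sketch below computes the best-case time to move a given amount of data at full line rate. It is not a rebuild or failover estimate for any build here: it ignores protocol overhead, disk speed, and competing traffic, and the 100 GB volume size is an arbitrary example.

```python
def ideal_transfer_seconds(gigabytes: float, link_gbit_per_s: float) -> float:
    """Best-case seconds to move `gigabytes` (decimal GB) at full line rate."""
    bits = gigabytes * 8e9
    return bits / (link_gbit_per_s * 1e9)


# Illustrative only: 100 GB of replica data over three common link speeds.
for speed in (1.0, 2.5, 10.0):
    seconds = ideal_transfer_seconds(100, speed)
    print(f"{speed:>4} GbE: at least {seconds:.0f} s")
```

Real transfers take longer than these lower bounds, but the ratio between the links holds: the same data needs at least four times as long on 2.5GbE as on 10GbE.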

Give the control plane a real NVMe

The control plane’s database, etcd, is unusually sensitive to disk speed, and specifically to fsync latency, because it forces every change to stable storage before acknowledging it. Our lab numbers put the gap in perspective: an NVMe drive cleared roughly three times the fsync operations per second of a SATA SSD, and a slow disk shows up in a cluster as leader elections and “apply took too long” warnings that make everything feel unreliable. The practical rule is simple. A node that runs the control plane needs a real NVMe, which every recommended node here has, and you should avoid the bargain mini-PCs that boot from eMMC or an SD card for any control-plane role. Those boxes are fine as pure workers if you must, but their fsync latency makes them poor etcd hosts, and a flaky control plane undermines the entire cluster.

Storage lives in the nodes: Longhorn and Ceph

A home Kubernetes cluster does not bolt on a separate storage array the way a virtualization host might. Instead, the storage lives inside the nodes, and Longhorn or Rook-Ceph replicate each volume across them, so a node failure does not take your data with it. That design has two hardware consequences. Each node needs its own fast NVMe for the replica it holds, which is why the storage tiers give every node two drives, one for boot and etcd and one for the replicated pool. And replication runs constantly between nodes, which is why those same tiers insist on a 10GbE fabric, because on a slower network the rebuild after a node comes back online drags and the cluster feels sluggish. The lighter the storage ambition, the less this matters; the moment you run real stateful apps, per-node NVMe and 10GbE stop being optional. The Longhorn guide walks through the software side once the hardware is in place.
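Replication also determines how much of the raw NVMe you can actually use. As a rough sketch, usable capacity is the raw pool divided by the replica count. The replica counts below are assumptions for illustration (check the setting in your own Longhorn or Ceph configuration), and the calculation ignores filesystem overhead and the free space a storage layer needs for rebuilds.

```python
def usable_tb(nodes: int, pool_tb_per_node: float, replicas: int) -> float:
    """Approximate usable capacity when every volume is stored `replicas` times."""
    if replicas < 1 or replicas > nodes:
        raise ValueError("replicas must be between 1 and the node count")
    return nodes * pool_tb_per_node / replicas


# Three nodes, each with one 2TB drive dedicated to the replicated pool.
print(usable_tb(3, 2.0, replicas=3))  # one copy of every volume on every node
print(usable_tb(3, 2.0, replicas=2))  # each volume lives on two of the three nodes
```

With three replicas, the 6TB of raw pool space yields about 2TB of usable storage; with two replicas it yields about 3TB, at the cost of tolerating fewer simultaneous failures per volume.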

ECC and the cluster: when it matters, and which kind

None of the mini-PC nodes here use ECC memory, because the mobile AMD and Intel platforms they are built on take SO-DIMM modules that do not support it, and for a homelab cluster that is a reasonable tradeoff for silence, low power, and a warranty. ECC enters the picture only on the production tier’s used-server path, where enterprise servers use registered ECC memory (RDIMM modules). The rule to carry from the rest of this build series is that the memory type follows the platform and the two are never crossed: SO-DIMM and unbuffered modules in the mini-PCs, registered RDIMM in the servers, and never one in the other. If a node’s data integrity genuinely matters more than its noise and power, that is the signal to take the used-server path; otherwise the quiet non-ECC nodes are the right call for a cluster that lives in a home. The same memory-type discipline runs through the DevOps workstation build.

Growing the cluster from here

The reason to leave headroom on the switch is that a cluster is rarely finished at three nodes. The first thing most people add is a fourth node, which is why every switch here has spare ports, and a managed switch makes adding it a cable and a join command rather than a rewiring job. From there the common next steps are software, not hardware: MetalLB to hand out real load-balancer IPs on bare metal, an ingress controller in front of your services, Longhorn or Ceph for storage that survives a node loss, and GitOps with Argo CD so the cluster’s state lives in a repository instead of your shell history. Each of those has its own guide, linked through this series, and each is easier to add to a cluster that was built on identical nodes and a real network than to one cobbled together from mismatched parts.
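When you add that fourth node, it helps to know what it does to the control plane if it joins as a control-plane member rather than as a worker. The sketch below assumes etcd's majority-quorum rule, under which a cluster stays writable only while more than half of its members are up; the function names are hypothetical.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be down while a majority remains."""
    return members - quorum(members)


for n in (1, 3, 4, 5):
    print(f"{n} members: quorum {quorum(n)}, tolerates {failures_tolerated(n)} down")
```

Three members tolerate one failure, and so do four, which is why a fourth node usually joins as a worker and control planes grow in odd steps from three to five.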

If you add a node that does machine-learning inference, the GPU choice and its power budget live in the AI workstation build, and if the cluster grows past what mini-PCs comfortably hold, the Proxmox host build is the natural way to virtualize several nodes on one larger machine. Start with three identical nodes and a switch you will not have to replace, get k3s and an HA control plane working, and grow it one honest step at a time.