Help with Home Server Architecture and Hardware Selection?

libretech@reddthat.com · edit-2 3 days ago

Help with Home Server Architecture and Hardware Selection?

ikidd@lemmy.world · edit-2 3 days ago

So, I’m a rabid selfhoster because I’ve spent too many years watching rugpull tactics from every company out there. I’m just going to list what I’ve ended up with, and it’s not perfect, but it is pretty damn robust. I’m running pretty much everything you talk about except much in the way of AI stuff at this point. I wouldn’t call it particularly energy efficient since the equipment isn’t very new. But take a read and see if it provokes any thoughts on your wishlist.

My Machine 1 is a Proxmox node with ZFS storage backing and machine 2 is mirror image but is a second Proxmox node for HA. Everything, even my OPNsense router runs on Proxmox. My docker/k8s hosts are LXCs or VMs running on the nodes, and the nodes replicate nearly everything between them as a first level, fast recovery backup/high availability failover. I can then live migrate guests around very quickly if I want to upgrade and reboot or otherwise maintain a node. I can also snapshot guests before updates or maintainance that I’m scared will break stuff. Or if I’m experimenting and like to rollback when I fuck up.

Both nodes are backed up via Proxmox Backup Server for any guests I consider prod, and I take backups every hour and keep probably 200 backups at various intervals and amounts. These dedup in PBS so the space utilization for all these extra backups is quite low. I also backup via PBS to removable USB drives on a longer schedule, and swap those out offsite weekly. Because I bind mount everything in my docker compose stacks, recovering a particular folder at a point in time via folder restore lets me recover a stack quite granularly. Also, since it’s done as a ZFS snapshot backup, it’s internally consistent and I’ve never had a db-file mismatch issue that didn’t just journal out cleanly.

I also zfs-send critical datasets via syncoid to zfs.rent daily from each proxmox node.

Overall, this is highly flexible and very, very bulletproof over the last 5 or 6 years. I bought some decade old 1-U dell servers with enough drive bays and dual xeons, so I have plenty of threads and ram and upgraded to IT-mode 12G SAS RAID cards , but it isn’t a powerhouse server or anything, I might be $1000 into each of them. I have considered adding and passing through an external GPU to one node for building an ollama stack on one of the docker guests.

The PBS server is a little piece of trash i3 with a 8TB sata drive and a GB NIC in it.

libretech@reddthat.com · 3 days ago

This is super interesting, thanks so much for sharing! In my initial poking around, I’d seen a lot of people that suggested virtualizing TrueNAS within Proxmox was a bit of a headache (especially when something inevitably goes wrong and everything goes down), but I hadn’t considered cutting out TrueNAS entirely and just running directly on Proxmox and pairing that virtualization with k8s and robust backups (I am pleasantly shocked that PBS can manage that many backups without it eating up crazy amounts of space). After the other comments I was sort of aligning around starting off with a TrueNAS build and then growing into some of the LLM stuff I mentioned, but I have to admit this is really intriguing as an alternative (even if as something to work towards once I’ve got some initial prototypes; figuring out k8s would be a really fun project I think). Just out of curiosity, how noisy do you find the old Dell servers? I have been hesitant both because of power draw and noise, but would love to get feedback from someone who has them. Thanks so much again for taking the time to write all of this out, I really appreciate it!

ikidd@lemmy.world · 3 days ago

Oh, they’re noisy as hell when they wind up because they’re doing a big backup or something. I have them in my laundry room. If you had to listen to them, you’d quickly find something else. In the end, I don’t really use much processor power on these, it’s more about the memory these boards will hold. RAM was dirt cheap so having 256GB available for experimenting with kube clusters and multiple docker hosts is pretty sweet. But considering that you can overprovision both proc and ram on PM guests as long as you use your head, you can get away with a lot less. I could probably have gotten by as well or better with a Ryzen with a few cores and plenty of ram, but these were cheaper.

At times, I’ve moved all the active guests to one node (I have the PBS server set up as a qdevice for Proxmox to keep a quorum active, it gets pissy if it thinks it’s flying solo), and I’ll WoL the other one periodically to let the first node replicate to the second, then down it again when it’s done. If I’m going to be away for a while, I’ll leave both of them running so HA can take over, which has actually happened without me even noticing that the first server packed in a drive, the failover was so seamless it took me a week to notice. That can save a bit of power, but overall, it’s a kWh a day per server which in my area is about 12 cents.

I’ve never seen the point of TrueNAS for me. I run Nextcloud as a docker stack using the AIO mastercontainer for myself and 8 users. Together, we use about 1TB of space on it, and that’s a few people with years of photos etc. So I mount a separate virtualdisk on the docker host that both nextcloud and immich can access on the same docker host, so they can share photos saved in users NC folders that get backed up from their phones. The AIO also has Collabra office set up by default, so that might satisfy your document editing ask there.

As I said, I’ve thought I might get an eGPU and pass it to a docker guest for using AI. I’d prefer to get my Home Assistant setup not relying on the NabuCasa server. I don’t mind sending them money and the STT service that buys me works very well for voice commands around the house, but it rubs me the wrong way to rely on anything on someone else’s computers. But it’s brutally slow when I try to run it even on my desktop ryzen 7800 without a GPU, so until I decide to invest in a good GPU for that stuff, I’ll be sending it out. At least I trust them way more than I ever would Google or Amazon. I’d do without if that was the choice.

All of this does not need to be a jump both feet first; you can just take some old laptop and start to build a PM cluster and play with this. Your only limit will be the ram.

I’ve also seen people build PM clusters using Mac Pro 2013 trashcans, you can get a 12core xeon with 64GB of ram for like $200 and maybe a thunderbolt enclosure for additional drives. Those would be super quiet and probably low power usage.

libretech@reddthat.com · 3 days ago

(Also very curious about all of the HA stuff; it’s definitely on my list of things to experiment with, but probably down the line once I’ve gotten some basic infrastructure in place. Very excited at the prospect though)

ikidd@lemmy.world · 3 days ago

The HA stuff is as hard as prepping the cluster and making sure it’s repping fine, then enable whichever guests you want to HA. It’s seriously not difficult at all.