Checking my Homelab Deployment Plan

I am reorganizing and redeploying my homelab environment. I currently run Proxmox VE on 3 HP DL380 machines, each using a 250GB SSD for Proxmox VE itself and NAS (TrueNAS) storage over NFS for everything else (backups, virtual disks, ISOs). While those 3 nodes keep running, I have 4 other nodes offline and waiting for me to redeploy Proxmox VE with my new configuration. I wanted to post to make sure my plan is sound before I spend the better part of a day deploying it, moving everything from the old cluster to the new one, and then rebuilding the currently running 3 nodes and adding them to the new 4-node cluster for a total of 7 nodes at the end of the day.

I plan to use a 250GB SSD for the OS in each system, partitioned as follows (a rough sketch of the layout follows the list). The second SSD in each system will be used by Ceph for virtual disk storage and specific backups only. The OS disk will likely be formatted as ext4 since it is a single disk, though I might use ZFS instead; I only have 32GB of RAM in each system.

        Proxmox VE: 20 GB
        ISO Images: 15 GB
           Backups: 50 GB
     Virtual Disks: 75 GB
    Docker Storage: 90 GB
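For reference, here is a rough sketch of that layout with sgdisk; the device name /dev/sda is an assumption, and the boot/EFI partitions the Proxmox installer creates are omitted for brevity:

```bash
# Hypothetical split of the 250GB OS SSD (device name is an assumption; boot/EFI partitions omitted).
sgdisk --zap-all /dev/sda
sgdisk -n1:0:+20G -t1:8300 -c1:"pve-root"       /dev/sda   # Proxmox VE: 20 GB
sgdisk -n2:0:+15G -t2:8300 -c2:"iso-images"     /dev/sda   # ISO images: 15 GB
sgdisk -n3:0:+50G -t3:8300 -c3:"backups"        /dev/sda   # Backups: 50 GB
sgdisk -n4:0:+75G -t4:8300 -c4:"virtual-disks"  /dev/sda   # Virtual disks: 75 GB
sgdisk -n5:0:0    -t5:8300 -c5:"docker-storage" /dev/sda   # Docker storage: rest of the disk (planned 90 GB)
```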

I also plan to create an LXC container on each system, attach the non-Proxmox partitions to it, and install GlusterFS inside it so those partitions stay in sync across all the hosts. My ISO library does not change very much, and backups can run and sync at off-peak hours. The virtual disk partition will not be synced across machines, as it is only used locally by this LXC container and by any other virtual machines or containers if I need it.
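To illustrate, this is roughly how I expect the Gluster volume for the ISO partition to look; the container hostnames (gluster1 to gluster3) and brick paths are placeholders:

```bash
# Replicated volume for the ISO library across three of the GlusterFS LXC containers
# (hostnames and brick paths are placeholders). Run from one container after peering:
gluster peer probe gluster2
gluster peer probe gluster3
gluster volume create isos replica 3 \
    gluster1:/data/isos/brick \
    gluster2:/data/isos/brick \
    gluster3:/data/isos/brick
gluster volume start isos

# Each host (or container) then mounts the shared volume:
mount -t glusterfs gluster1:/isos /mnt/isos
```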

I use Docker (Docker Swarm in particular) for most of my homelab services. I plan to run an LXC container on each node as a Swarm worker, attach it to the Docker storage partition, and have that data synced across all the nodes via the GlusterFS container above to provide shared Docker Swarm storage. I also plan to deploy 3 virtual machines on 3 different nodes to act as the manager nodes for the Docker Swarm cluster.
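The Swarm bootstrap itself would be the usual init/join dance, sketched here with a placeholder address (10.0.0.11 for the first manager VM):

```bash
# On the first manager VM (address is a placeholder):
docker swarm init --advertise-addr 10.0.0.11

# Print the join commands for the remaining nodes:
docker swarm join-token manager   # run the printed command on the other two manager VMs
docker swarm join-token worker    # run the printed command inside each LXC worker container

# Example join on an LXC worker (token redacted):
docker swarm join --token <worker-token> 10.0.0.11:2377
```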

Welcome to the forum!

Should not be a problem. ZFS can run on 2GB of RAM if you are just using a single pool of mirrors / stripes / striped mirrors / RAID-Z1 / Z2.
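If you are worried about RAM, you can also cap the ARC on Proxmox/Debian; the 2 GiB figure below is just an example value, not a recommendation for your exact workload:

```bash
# Cap the ZFS ARC at 2 GiB (example value) so it stays within a small RAM budget.
echo "options zfs zfs_arc_max=2147483648" > /etc/modprobe.d/zfs.conf
update-initramfs -u   # takes effect on the next boot
```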

I don’t understand why you plan on running Ceph and Gluster combined. I suggest sticking with Ceph, since that’s what Proxmox knows. Or just run ZFS and have Gluster or Ceph inside your containers? Your choice.

Unless you really need the HA / fencing feature in Proxmox, you can get away with LXC and nested containers too. For just docker stuff, VMs are overkill. Just my $0.02.
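Running Docker inside LXC on Proxmox normally just needs nesting (and keyctl for unprivileged containers) turned on; the container ID 101 below is arbitrary:

```bash
# Enable nesting and keyctl so Docker can run inside an (unprivileged) LXC container.
# Container ID 101 is arbitrary.
pct set 101 --features nesting=1,keyctl=1
pct reboot 101
```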

I was planning on just using Ceph, but I have not found a way to use it with partitions instead of whole disks, which is why I added GlusterFS. I could use ZFS replication, though my understanding is that it can only sync every 1 minute at the quickest.
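For clarity, the replication I mean is the built-in Proxmox storage replication (pvesr), where "*/1" appears to be the fastest schedule it accepts; the guest ID and target node below are placeholders:

```bash
# Built-in ZFS storage replication; "*/1" runs every minute, which seems to be the minimum.
# Guest ID 100 and target node "pve2" are placeholders.
pvesr create-local-job 100-0 pve2 --schedule "*/1"
```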

The reason for using VMs for the managers is that, should a node fail, a manager can be moved using HA, and I also keep the ability to live-migrate it for maintenance if needed.
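In other words, the plan is simply to register the three manager VMs as HA resources, something like this (VMIDs are placeholders):

```bash
# Add the three Swarm manager VMs as Proxmox HA resources (VMIDs are placeholders).
ha-manager add vm:201 --state started
ha-manager add vm:202 --state started
ha-manager add vm:203 --state started
```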

The only other hiccup with this setup is that each disk in the server will be set up as a single-drive RAID0 on the RAID card, as I do not have an HBA and the P400 does not support passthrough.
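Concretely, that means one logical drive per physical disk on the controller, roughly like this; the slot and drive IDs are examples, and on a P400 the older hpacucli tool may be needed instead of ssacli:

```bash
# One single-drive RAID0 logical drive per physical disk (slot/drive IDs are examples;
# on a P400 the older hpacucli tool may be required instead of ssacli).
ssacli ctrl slot=0 create type=ld drives=1I:1:1 raid=0
ssacli ctrl slot=0 create type=ld drives=1I:1:2 raid=0
```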

I did manage to get Proxmox to create OSDs from a partition on both the OS install drive and a secondary drive, but I cannot figure out the best way to keep the pools split across the OSDs. I want the OSDs for each type of storage to be separated from the others. Or should I just make one partition out of the remaining space on the OS disk and use that as a single OSD? Either way, I would still need to keep the other data disk separate from that set of OSDs when creating the pools.
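From what I have read, CRUSH device classes might be the way to pin each pool to its own set of OSDs, roughly like the sketch below (OSD IDs, class names, and pool names are made up), but I am not sure whether this is the right approach:

```bash
# Tag each group of OSDs with a custom device class, then pin each pool to a matching
# CRUSH rule (OSD IDs, class names, and pool names are made up for illustration).
ceph osd crush rm-device-class osd.0 osd.1 osd.2 osd.3 osd.4 osd.5
ceph osd crush set-device-class docker osd.0 osd.1 osd.2
ceph osd crush set-device-class vmdata osd.3 osd.4 osd.5
ceph osd crush rule create-replicated docker-rule default host docker
ceph osd crush rule create-replicated vmdata-rule default host vmdata
ceph osd pool set docker-pool crush_rule docker-rule
ceph osd pool set vmdata-pool crush_rule vmdata-rule
```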