Virtualization strategies for production

hypoiodous · February 20, 2022, 7:51pm

Hi everyone,

I’ve noticed a couple of threads that have gone a little off-topic so I decided to open a new one specifically for the topic of virtualization on production, as I’m really enjoying the conversation and taking some really useful notes.

As I’m starting out with LXC (and virtualization in general), I’d like to ask what some common strategies are for deploying virtual machines / containers on production servers. This can be either a home lab or a more elaborate setup.

So, I thought using something like Docker directly on the server was quite common, I suppose it is, but from what I’ve been reading here it seems that using another layer of containerization is also possible. So on the server maybe there are a couple of LXC containers which themselves are running Docker and maybe even some form of orchestration for the running services (Swarm, Kubernetes…).

What are some advantages and disadvantages of doing this?

KI7MT · February 20, 2022, 7:53pm

This would be a good place to start: Getting started with LXD Containerization – LearnLinuxTV

hypoiodous · February 20, 2022, 7:57pm

I have just watched this and it serves as a great introduction. It left me wondering if there is a way to have all of this configuration expressed in some form of code, similar to what can be done with a Vagrantfile or docker-compose.yml?

KI7MT · February 20, 2022, 8:05pm

If you’re referring to provisioning resources, there are a number of third-party integrations folks commonly use (there are more):

For me, when you say software, or infrastructure as code, I immediately think Terraform for the resource, and Ansible for configuration. I should also add Chef in that configuration portion. It’s very handy.

hypoiodous · February 20, 2022, 8:17pm

I guess I am talking about something like this, yes. Terraform is another one of these tools that I’d love to learn more about, but one step at a time. For now, Ansible will do. Thanks for the links!

KI7MT · February 20, 2022, 8:26pm

It’s important to understand the distinction between provisioning a resource (standing up a server, networking, storage, etc), and configuration management of said resource, e.g. what services are to be run on the provisioned resource. Then there is orchestration, e.g. your k3s, k8s, and how to manage the workloads on your provisioned / configured resources.

All of these are similar, but use different tool sets to implement their respective portion of the process. So for example, you could use the HashiCorp stack (this is just one of many ways to accomplish it):

Use Terraform to provision and manage the resource state (CPU, Mem, Network, Storage, etc)
Use Chef / Ansible to configure and manage the instance (your containers)
Use Vault / Consul for secrets and discovery
Use Nomad for workload orchestration

All of this, to a degree, is infrastructure and management with code.

A few Examples:

Say you wanted to rate limit, or change the Nginx configuration of all your running containers. That would be a good job for Ansible, via Consul discovery, to perform.

Or maybe you want to run weekly os-level package updates. That could be Chef in concert with Consul.

Maybe you need more capacity due to load. Nomad and Consul can manage that for you, either manually or automatically based on parameters you set.

Say you need to rotate DB passwords, ssl-cert, and the like. That could be Vault, or a combination of Vault and Ansible.

What if you wanted to add or update service monitoring? That’s a good job for Chef or Ansible.

Say you wanted to stand up an RPI k3s cluster: you could use Terraform to provision the resources, Chef / Ansible to update and configure the resources, Vault for secrets management between them, and Nomad to deploy / manage your container instances.
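As a rough illustration of the "Ansible via Consul discovery" idea above, here is a minimal Python sketch that turns a Consul-catalog-style service listing into an Ansible INI inventory group. The sample payload, node names, addresses, and the group name `nginx` are all made up for the example; the field names (`Node`, `Address`, `ServiceAddress`) follow the shape of Consul's catalog service endpoint as I understand it, so check them against your Consul version.

```python
import json

# Sample payload shaped like a Consul catalog service listing.
# All values here are invented for illustration.
SAMPLE_CATALOG = json.loads("""
[
  {"Node": "web-1", "Address": "10.0.0.11", "ServiceAddress": "", "ServicePort": 80},
  {"Node": "web-2", "Address": "10.0.0.12", "ServiceAddress": "10.0.1.12", "ServicePort": 8080}
]
""")


def to_inventory(group, entries):
    """Render catalog entries as one Ansible INI inventory group."""
    lines = [f"[{group}]"]
    for entry in entries:
        # An empty ServiceAddress means the service is reached at the node address.
        host = entry.get("ServiceAddress") or entry["Address"]
        lines.append(f"{entry['Node']} ansible_host={host}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_inventory("nginx", SAMPLE_CATALOG))
```

The printed text could be saved to a file and passed to `ansible-playbook -i`, with a playbook of your own that pushes the Nginx change to every discovered host.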

ThatGuyB · February 21, 2022, 2:11am

Not sure what you mean by virtualization strategies. I mean, it depends on the admins to decide how that goes. In big enough infrastructures, one may not even be running VMs, but run on bare metal, due to the need of a lot of performance dedicated to a single application (like database servers).

There are a lot of things to consider, so I can’t really say what is “the best” because there is no one-size-fits all solution.

But, I’ll take the second half of your comment as a starting point.

Again, depends on the production. I’ll be talking the “what” first.

If you are a small company creating and hosting an application like an inventory system for your own store, then you would likely do it with 1 small host running a VM or 2 on local storage. If you get bigger and having the application down would mean you lose money (people don’t do anything while waiting for it to come back up), then you would likely run at least 3 hypervisors with a NAS or SAN and set the VM up in HA mode. If one host dies, another takes over its VMs and continues running them as if nothing happened; nobody will notice anything (except the admins, who see that a host is down).

You get even bigger, the platform now needs to run in parallel and load balance, so you move your stack to kubernetes and keep the db separate. You buy 2 physical servers for the DB and run them on a local, very fast storage. DBs are load balanced. Then you get 12 other servers which you use to deploy your software on. 3 servers would be lower powered, used as the control plane, the rest are worker nodes. You now have a big infrastructure to manage just for the application itself. You need to have other servers who take care of monitoring, managing, deploying and others, plus backup servers.

Those were just examples; now let’s talk theory a bit. VMs still have the advantage of live migration and high availability: they can move from one hypervisor to another without impacting the application or the OS running on them. Despite the resource efficiency of OCI and Linux containers, VMs still hold the upper hand in this department. And that is assuming you want to run your software on Linux; if you need another OS, like Windows or the BSDs, you need VMs.

LXC is not that used industry wide, or is just a stepping stone to the final goal. Docker’s popularity came from the idea that you can have the same software be packaged the same for everyone, so that dev, uat, prod or other people’s infrastructures would all be running the code the exact same way. But Docker and OCI containers in general became so popular that big players started adopting them and thus they evolved to become self-manageable for the most part.

But one thing that OCI containers and LXC can hold against VMs is that, if your software can be parallelized and doesn’t get affected by one instance dying, you can be more efficient with your resources. Instead of doing a HA environment with a VM basically “running” on 2 physical servers, you just create 2 separate containers on 2 different servers and load balance them. For example, web servers are a good example of things that can be load balanced, which is why nginx containers are so popular and why there are so many custom nginx images.
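To make the "2 separate containers on 2 different servers, load balanced" idea concrete, here is a minimal Python sketch of round-robin selection that skips backends marked as down. The class name, method names, and backend labels are hypothetical; a real setup would use a load balancer such as nginx or HAProxy with its own health checks rather than hand-written code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin picker that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call before giving up.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web-on-server-a", "web-on-server-b"])
    print([lb.next_backend() for _ in range(3)])
    lb.mark_down("web-on-server-a")  # simulate one container dying
    print([lb.next_backend() for _ in range(2)])
```

When one instance dies, requests simply keep flowing to the other, which is why this pattern does not need a VM kept in HA mode across two hosts.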

Now let’s talk the “how.” Classically, you would have admins create new VMs, deploying the software and launching it. When this is a rare job to do, that’s fine, but when you have to deploy and delete a lot of new instances depending on traffic, then OCI container orchestration comes into play. You could in theory do it with LXC and even VMs, but OCI containers have the advantage that they are not running a full OS inside them, so in theory, they should be more resource efficient, which add up when you run thousands of them.

As mentioned by KI7MT, there are tools like Terraform or MaaS (metal-as-a-service) that are used to provision servers and containers, then tools like Juju and Ansible (notice the or and and) used to manage the application automatically. I’m not familiar with the high level stuff and massive infrastructures, so I can’t speak for those scenarios, but you can use any tool you have at your disposal to provision and configure VMs and containers.

I am still “stuck” in the era of manual provisioning and configuring, because I don’t manage large infrastructures. I have automated some tasks with shell scripting for configuration, after manual provisioning, like automatic adding to the monitoring server (Zabbix autodiscovery), but haven’t used “big boy toys” yet and it doesn’t look like I’ll have the chance any time soon.

Actually, this comment became a rambly mess, I’m not sure I even want to post it, but I spent too much time writing it, so whatever.

KI7MT · February 21, 2022, 9:11pm

Homelabs are great for learning smaller pieces of the larger pie so to speak. Rarely do you find engineers that manage the entire infrastructure stack by themselves, though it does happen. Normally, it is multiple teams depending on the size of the company, etc. Generally, they (engineers) hone in on one area or another that interests them.

To manage infrastructure top to bottom, backend to frontend, and through the DevOps SDLC / CI-CD pipelines is no small task. Tackling one small piece at a time is how many that I know progress through the process.

hypoiodous · February 21, 2022, 10:34pm

What I meant by “virtualization strategies” is what factors are considered when deciding whether to use a full VM, containers or nothing at all for deploying applications. One of the “bad” things about having as many options as we have today is that people getting started can get confused, so I wanted to get some insight as to what to think about and consider. I appreciate the advice. It’s already given me something (a lot hehe) to think about.

For now I’ve been learning a bit of everything here and there: a bit of shell scripting, Docker and currently Ansible. I like to try out things the more “traditional” way to better understand why we are even using these tools. But knowing the tools is one thing; another is knowing when to use them, and that’s what I’m after. Much like knowing how to code versus how to structure an application.

Currently, I’m trying to set up my blog in WordPress (yeah, don’t laugh), which is really very simple, with Docker. Absolutely unnecessary; it’s just for the learning experience. But of course in doing that some questions come up about how to handle stuff like backups, etc.

In the other threads I understood that people sometimes use VMs in production but I thought these were mostly used during development, testing, etc. So I was curious about it. Should I try then to maximize resources in a host by setting up a VM for my blog? Split it up into a VM per service (server, database, caching, etc)? Is Docker enough for this (OCI containers?)? That sort of stuff.

But yeah, lots to think about and lots to learn about this entire topic. Thanks a ton for the replies here!

KI7MT · February 21, 2022, 10:46pm

Well, I’m not sure how others approach this, but, what I think about first and foremost is scaling. Every project and/or service can be different, so I tend to evaluate them along the following lines:

Does my application / service need to Scale-Up, Scale-Out, or not at all
What are the resource requirements: Disk IO, Net IO, CPU / Mem
What are the storage requirements (OLTP, OLAP, NoSQL, Velocity, Storage Volumes, etc)
How many concurrent users will be using the app/service
Do I need encryption at rest, in-transit, etc
Security and access constraints, etc
Can my service be co-located on a shared app server, etc
Uptime / Downtime maintenance window(s)
What is my SLA, if any
Deployment and Maintenance methods to be used (Manual or Automated)
Can my application be multi-threaded, and/or use shared file system resources
What are the challenges for dev, test, stage, prod and DR. Are they all needed
Is my application a micro-service or monolith
Is DB sharding, replication, routing, scaling needed (this can get complicated quickly)

From there, I can determine if a single VM is warranted, Bare Metal is needed, or dynamic container orchestration would be a better fit.
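As a toy illustration only, a few of the rules of thumb mentioned earlier in this thread (dedicated performance suggests bare metal, a non-Linux OS means a VM, parallelizable workloads suit containers) can be written down as a small Python function. The field names, the ordering of the checks, and the returned labels are assumptions made for this sketch; it is not a description of how anyone here actually decides, and a real evaluation would weigh the whole list above.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """A few yes/no answers about a service; all field names are illustrative."""
    needs_dedicated_performance: bool = False  # e.g. a heavily loaded database
    needs_non_linux_os: bool = False           # e.g. Windows or a BSD
    tolerates_instance_loss: bool = False      # can run as parallel copies
    scales_with_traffic: bool = False          # instances come and go often


def suggest_platform(w: Workload) -> str:
    """Return a rough starting point, not a final answer."""
    if w.needs_dedicated_performance:
        return "bare metal"
    if w.needs_non_linux_os:
        return "vm"
    if w.tolerates_instance_loss:
        if w.scales_with_traffic:
            return "orchestrated containers"
        return "load-balanced containers"
    return "vm with high availability"


if __name__ == "__main__":
    print(suggest_platform(Workload(tolerates_instance_loss=True)))
    print(suggest_platform(Workload(needs_non_linux_os=True)))
```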

EDIT: I should have noted. In my early days, putting a DB, App, Frontend on the same server (VM or BM) was a no-brainer. Nowadays, all things considered, I prefer having four subnets: Load Balancers, Frontend, Backend, and Data Storage (either shared file systems, or DBs, depending on need). Microservices tend to change that dynamic in terms of persisted storage, so it’s more complicated to implement that type of strategy. If I don’t need micro-services, I don’t deploy them just to keep up with the latest trends. Likewise, if I don’t need k8s, why use containers other than for learning purposes, as it only complicates the entire stack.

ThatGuyB · February 22, 2022, 5:05am

While KI7MT’s reply is top notch when it comes to analyzing a production environment, I’d say take note, but only keep it in mind in the future if you are going to work in the industry. It’s good advice, but since you are just getting started, I would say:

Learn what is easier for you
Be mindful of the resources you have available
Only scale up / out if absolutely required
Try securing your network (it’s basically free and you get to learn something)

Again, you are just getting started. I would say, use Linux containers as much as possible (again, LXD is easier), to learn Linux administration. If something doesn’t run in Linux containers, which is rare, but possible (I believe VaultWarden only has an OCI container package), use Docker or Podman inside Linux containers (nested containers). If you absolutely need an OS other than Linux, consider creating a VM. Once you get a lot more comfortable with Linux management and running your services, you may want to learn a bit about file systems and storage, and VMs are a good place to play with those (LUKS, LVM, ZFS etc.).

I would advise to avoid buying a ton of hardware just to learn, stick to the low-end, budget friendly and low-power consumption computers or single board computers. Try making as much use of your current resources as possible. Containers are your friend here. You don’t need high availability, you don’t need auto-scaling, you don’t need container orchestration (i.e. don’t need a kubernetes stack, just docker or podman).

See how applications work and learn. If your wordpress website has a database, try splitting that into 2 containers (application container and db container). Try running ansible in another container, and a monitoring system (grafana + prometheus, or zabbix) in another. Try setting up a samba server in another container. Then try hosting nextcloud in yet another one. Then get a reverse proxy and learn how to put your wordpress and nextcloud behind it.
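For the first step (splitting WordPress into an application container and a database container), here is a minimal Python sketch that builds a Compose-style definition as a dictionary and prints it as JSON. The image names and environment variable names follow the official `wordpress` and `mariadb` images on Docker Hub as far as I know; the function name, published port, volume names, and password are placeholders, and the whole thing should be checked against the image versions you actually run.

```python
import json


def wordpress_stack(db_password):
    """Build a two-service definition: one app container, one db container."""
    return {
        "services": {
            "db": {
                "image": "mariadb",
                "environment": {
                    "MARIADB_DATABASE": "wordpress",
                    "MARIADB_USER": "wordpress",
                    "MARIADB_PASSWORD": db_password,
                    "MARIADB_RANDOM_ROOT_PASSWORD": "1",
                },
                # Named volume so the database survives container recreation.
                "volumes": ["db_data:/var/lib/mysql"],
            },
            "wordpress": {
                "image": "wordpress",
                "depends_on": ["db"],
                "ports": ["8080:80"],
                "environment": {
                    # The service name "db" doubles as the hostname on the stack's network.
                    "WORDPRESS_DB_HOST": "db",
                    "WORDPRESS_DB_NAME": "wordpress",
                    "WORDPRESS_DB_USER": "wordpress",
                    "WORDPRESS_DB_PASSWORD": db_password,
                },
                "volumes": ["wp_data:/var/www/html"],
            },
        },
        "volumes": {"db_data": {}, "wp_data": {}},
    }


if __name__ == "__main__":
    print(json.dumps(wordpress_stack("change-me"), indent=2))
```

Since JSON is valid YAML, the output should be usable as a Compose file, though that is worth verifying with your Compose version. Keeping the data in the two named volumes also gives an obvious place to start when thinking about backups.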

Buffy · February 22, 2022, 9:00am

Totally. I think so many of the “homelab” channels get a bit carried away building out impressive server racks full of goodies and lose focus on what the purpose was supposed to be. They end up with way overkill hardware that runs up their electric bills…

A headless Linux server doesn’t need tons of resources to be able to do a lot.

I’d just add that setting up a git server to keep your configurations and notes in makes life a lot nicer.

KI7MT · February 22, 2022, 6:05pm

I have one main server, if you can call it that (it’s just a workstation MB with a fair number of cores), and a couple SBC’s. That is it for my “homelab”, aside from the networking pieces. I also run TrueNAS on Proxmox, though I do have a dedicated set of drives for the ZFS pool. Like you, I’m not a fan of paying the power company more than I need to !

Mr_McBride · February 23, 2022, 2:35pm

I also have one home-built server (Ryzen 8-core with 32GB of RAM) that I use to host a couple of VM’s.

I also have several RPI-4’s in my home-lab, which are power friendly.

hypoiodous · February 24, 2022, 5:28pm

I have a couple of old laptops that I’m using and was thinking of getting a Raspberry Pi for something a little more discreet, but I intend to use those as much as I can. For Docker, I’m using Docker Swarm at the moment as it comes already baked into it and I’ve been told there are advantages to using it even on a single host (although I’m using a couple of VMs to see how it behaves). I have not yet looked into monitoring; that will have to come after, but I’ll write it down for later.

Do you have your own git server? I’m using GitHub and Codeberg at the moment for my own stuff but I guess that’d be a great experience. I was also thinking of setting up something like Kanboard for my own tasks, in addition to Nextcloud. I imagine setting up all these services is a great experience all together.

Buffy · February 24, 2022, 5:31pm

We run Gitlab CE as a Docker container on one of our Synology NASes.

ThatGuyB · February 24, 2022, 8:20pm

You probably don’t need the whole GitLab stack. It’s pretty heavyweight. Get Gitea instead, it should do the trick and is more lightweight. I haven’t used anything other than GitLab, but I heard good things about Gitea. I am planning on using stagit (because I have autism, and muh JavaScript), but there is no reason for most people to not use Gitea if they aren’t doing anything too weird that requires the full feature set of GitLab Community.

Buffy · February 24, 2022, 9:23pm

IDK, I think Gitlab’s Docker image is really easy to run and doesn’t strain the NAS at all. It makes it super easy to add new projects and securely access them from pretty much anywhere. I can do all of it myself without having to bother my dad, even.

I never used that Gitea, so I’m not going to claim that there’s no reason for people to not use it; I don’t think it’s a good idea to recommend something that I don’t or haven’t used. People will have to try it out for themselves and see.

hypoiodous · February 24, 2022, 10:34pm

I’ve heard of Gitea before and always thought of it as the go-to option for self-hosting a git server. I think I will try this first but I’m glad to know GitLab is also easy to set up in case I don’t like Gitea that much.

ThatGuyB · February 24, 2022, 11:36pm

I agree and I usually always disclaim when I don’t. But I know a few people who used it and are happy, so no reason not to recommend it. For example, it’d be much harder for me to recommend buying a Synology as opposed to doing a TrueNAS build, because I haven’t used Synology and heard people having mixed feelings about it.

GitLab is easy to set up and to update, both as a Docker image and as a package installed from its repo with a package manager. I believe the same is true for Gitea.