
Hello all! Just curious what y'alls typical setup is when it comes to running multiple stacks which require the same "support" containers.

What I mean is, say you want to run two services that both require a connection to a database, would you run two separate DB containers, one for each service and have them connected only to their respective DB "stacks"? Or do you prefer to run a single centralized DB server/service and have your self hosted stacks all communicate with their own databases inside the server?

[-] morethanevil@lmy.mymte.de 17 points 1 year ago

One big DB will be a single point of failure. You won't gain much speed; the only thing that is "easier" is backup.

I use different DBs for different stacks. If one hangs or gets messed up completely, it won't affect the others.

[-] brennor@kbin.social 12 points 1 year ago

I take the latter approach -- a single PostgreSQL database service for all other containers to use. That allows me to concentrate memory/CPU to a single service and optimize for that. I've found that a single database service uses less total resources (especially memory) than running separate DB stacks for each service.

[-] nickwitha_k@lemmy.sdf.org 6 points 1 year ago

> I've found that a single database service uses less total resources (especially memory) than running separate DB stacks for each service.

This should indeed be the expected result. Each DB server will have a set amount of overhead from the runtime before data overhead comes in. Ex (made up numbers):

  • storage subsystem=256MB
  • config subsystem=128MB
  • auth subsystem=280MB
  • api subsystem=512MB
  • user tables=xMB

The subsystem resource usage would be incurred by every instance of the DB server. Additionally, you have platform-level overhead, especially if you are running as VMs or containers as that requires additional resources to coordinate with the kernel, etc.
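To put rough numbers on that, here is a minimal sketch using the made-up subsystem figures from the list above. The per-app data sizes and the function name `total_memory_mb` are hypothetical, and the model ignores the platform-level overhead entirely; it only illustrates that the fixed cost is paid once per running server.

```python
# Made-up per-instance overheads from the list above, in MB.
SUBSYSTEM_OVERHEAD_MB = {
    "storage": 256,
    "config": 128,
    "auth": 280,
    "api": 512,
}


def total_memory_mb(user_data_mb, shared):
    """Estimate total memory for a set of apps.

    user_data_mb: per-app data footprints in MB (hypothetical values).
    shared: True for one DB server hosting every app,
            False for one DB server per app.
    """
    fixed = sum(SUBSYSTEM_OVERHEAD_MB.values())  # paid once per running server
    servers = 1 if shared else len(user_data_mb)
    return servers * fixed + sum(user_data_mb)


apps = [100, 250, 50]  # hypothetical per-app data sizes in MB
print("separate:", total_memory_mb(apps, shared=False))
print("shared:  ", total_memory_mb(apps, shared=True))
```

The gap between the two results grows by one full set of subsystem overheads for every extra server, whatever the data sizes are.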

It's very much like micro-kernels vs monoliths. On the surface, a lean micro-kernel seems like it should be more performant since less is happening during kernel time, but the significant increase in operations needed to perform basic tasks cancels that out. For example, if storage access was in userspace, an application would need to call back to the kernel to request communication, which would need to call up to the storage driver, then back... and it becomes a death of a thousand cuts. In a monolithic kernel, the application just tells the kernel that it wants to access storage and in what mode, and provides either the input or a buffer to receive data.

[-] MaggiWuerze@feddit.de 7 points 1 year ago

I don't wanna go all 90s forum user here, but I think this is the 5th post asking this. Maybe just have a look first if this has been answered already.

On topic: just use a new db with each service.

[-] vegetaaaaaaa@lemmy.world 2 points 1 year ago

> I think this is the 5th post asking this

Agree, there are a few topics like this. Other ones being "backup software recommendation" threads, "look at my dashboard" threads, "what should I self-host" threads... I just post a comment linking to previous threads like this https://lemmy.world/comment/1761863. Maybe after a while they will learn... On the other hand Lemmy's search feature is not very pleasant to work with, so I don't blame OP. Maybe there should be a pinned post about "frequent" questions.

On topic: single DB engine/service, connect each app to this DB service, and let each one have its own database in it. Anything else degrades performance and makes backups more complex.

[-] nickwitha_k@lemmy.sdf.org 6 points 1 year ago* (last edited 1 year ago)

My recommendation, if practical, is a single, potentially containerized DB server that is backed by storage that provides high availability and redundancy. This is supposing that you are using the same sort of DB (ex SQL, NoSQL, etc) and that you are targeting a smallish number of services that are on-premise.

My reasoning here is that you can treat the DB server effectively as a storage API service and run it via some orchestration service like K8S. This lets you offload your DB stability and data integrity to the FS and/or other low-level stuff that is simple to configure once and that you only worry about when hardware fails. This in turn greatly reduces DB server configuration and deployment work, and lets you treat the servers like livestock, not pets.

Now, if you are using a public cloud provider, my view changes slightly. Generally, I'd suggest offloading the DB to a provided service that is compatible with a FOSS alternative so that you can avoid vendor lock-in. This means that you get the HA, etc without having to worry about maintenance and configuration overhead. Just be aware of cost modeling - it's easy to run up large bills.

[-] chiisana@lemmy.chiisana.net 4 points 1 year ago

Multiple. They might need different versions of the database server, they might need to be scaled differently, they might need to be backed up at different cadences, they might need to be moved to different servers, etc.

The small marginal resource reduction is just not worth it.

[-] redcalcium@lemmy.institute 4 points 1 year ago* (last edited 1 year ago)

I usually centralize them. It uses fewer resources than running them separately, and it makes backups easier.

[-] Valmond@lemmy.mindoki.com 1 points 1 year ago

Backups are harder (well, there are more of them), but restorations are easier, right?

[-] timespace@sh.itjust.works 2 points 1 year ago

Separate DB containers for each service. Easier, more secure, and less messy. The overhead impact is trivial.

[-] PriorProject@lemmy.world 2 points 1 year ago
  1. If a service supports sqlite, I often will use that option. It provides everything a self-hoster needs from a DB with basically no operational overhead.
  2. If I do need a proper RDBMS (because the software I'm using doesn't support sqlite), I'm going to use...
    1. A single Postgres container.
    2. Configured with multiple logical "databases" (the container for schemas and tables), one DB for each app connecting.

I do this because I'm always memory constrained and the rdbms is generally the most memory-hungry part of any software stack. By sharing one db-process across all the apps that need it I get the most out of my db cache memory, etc. And by using multiple logical db's, I get good separation between my apps, and they're straightforward to migrate to a truly isolated physical DB if needed... but that's never been needed.
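For anyone wondering what the "one Postgres, one logical database per app" layout looks like in practice, here is a rough sketch that just generates the SQL. The helper name `provisioning_sql` and the app names are made up for illustration, passwords are deliberately left out, and this is only one way to set it up, not necessarily what the commenter above runs.

```python
import re

# Conservative identifier pattern so names can be used unquoted in SQL.
# Assumption: app names are lowercase and are not reserved words.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def provisioning_sql(app_names):
    """Return SQL statements giving each app its own role and database."""
    statements = []
    for name in app_names:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
        # One login role per app, owning one database of the same name.
        statements.append(f"CREATE ROLE {name} LOGIN;")
        statements.append(f"CREATE DATABASE {name} OWNER {name};")
        # Stop other roles from connecting to this app's database.
        statements.append(f"REVOKE CONNECT ON DATABASE {name} FROM PUBLIC;")
    return statements


for line in provisioning_sql(["nextcloud", "gitea"]):  # hypothetical apps
    print(line)
```

Because each app only knows its own role and database, moving one of them to a separate server later is mostly a dump, a restore, and a changed connection string.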

That makes a lot of sense, and it's where I'm leaning as well.

While my homeserver still has plenty of resources to spare, I see a lot of them going towards multiple DB containers. It's nice for "segregating" the containers, but backups are also a pain, gotta plan backups/restores for multiple DBs

Same story with an s3 (well, minio) instance running. Seems like it would make more sense to centralize DB and file operations and have different services talk to them. Then if I ever needed to move them onto separate servers, it wouldn't be as big a move.

Thanks!

[-] skadden@ctrlaltelite.xyz 1 points 1 year ago

I do a separate container for each service that requires a db. It's pretty baked into my backup strategy at this point where the script I wrote references environment variables for dumps in a way that I don't have to update it for every new service I deploy.

If the container name has -dbm on the end it's MySQL, -dbp is postgres, and -dbs would be SQLite if it needed its own containers. The suffix triggers the appropriate backup command that pulls the user, password, and db name from environment variables in the container.

I'm not too concerned about system overhead. I'm debating doing a single container for each db type just to do it, but I also like not having a single point of failure for all my services (I even run different VMs to keep stable services from being impacted by me testing random stuff out).
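The suffix convention described above could be sketched roughly like this. The actual script isn't shown in the thread, so everything here is an assumption: the function name `backup_command` is made up, the environment variable names are the ones used by the official `postgres` and `mysql` Docker images, and the hypothetical `-dbs` SQLite case is left out because the file path isn't known.

```python
def backup_command(container):
    """Map a container name to a dump command based on its suffix.

    Returns an argument list for `docker exec`, or None when the name
    does not end in a recognised database suffix.
    """
    if container.endswith("-dbp"):
        # Postgres: user and database come from the container's environment.
        inner = 'pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB"'
    elif container.endswith("-dbm"):
        # MySQL: user, password and database come from the environment.
        inner = 'mysqldump -u"$MYSQL_USER" -p"$MYSQL_PASSWORD" "$MYSQL_DATABASE"'
    else:
        return None
    # `sh -c` runs inside the container, so the variables expand there.
    return ["docker", "exec", container, "sh", "-c", inner]


for name in ["wiki-dbp", "blog-dbm", "wiki-app"]:  # hypothetical containers
    print(name, "->", backup_command(name))
```

The returned list can be passed to `subprocess.run` with stdout redirected to a dump file, so a new service only needs a correctly named container to be picked up.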

this post was submitted on 17 Aug 2023