
Hello everyone!

I had a container with a DB crap itself yesterday so I'm trying to speed up my learning to back up stuff.

I came across a script that taught me how to back up a containerized postgres db at given intervals and it works. I managed to create db dumps and restore them. I've documented everything and now my whole docker-compose/env etc are under git version control.

There's one part of the script I can't decipher but I'd like to maybe change it. It is about the number of backup copies.

Here's the line from the tutorial: ls -1 /backup/*.dump | head -n -2 | xargs rm -f

Can someone explain to me what this line does? I'd like to keep maybe 3 copies just in case the auto-backup backs up a rotten one.

Thanks!

Full code below:

backup:
    image: postgres:13
    depends_on:
      - db_recipes
    volumes:
      - ./backup:/backup
    command: >
      bash -c "while true; do
        PGPASSWORD=$$POSTGRES_PASSWORD pg_dump -h db-postgresql -U $$POSTGRES_USER -Fc $$POSTGRES_DB > /backup/$$(date +%Y-%m-%d-%H-%M-%S).dump
        echo \"Backup done at $$(date +%Y-%m-%d_%H:%M:%S)\"
        ls -1 /backup/*.dump | head -n -2 | xargs rm -f
        sleep 86400
      done"
[-] doeknius_gloek@feddit.de 13 points 1 year ago* (last edited 1 year ago)

This line seems to list all dumps and then deletes all but the two most recent ones.

In detail:

  • ls -1 /backup/*.dump lists all files ending with .dump alphabetically inside the /backup directory
  • head -n -2 returns all filenames except the last two in the list, which are the two most recent ones because the timestamped names sort chronologically
  • xargs rm -f passes the filenames to rm -f to delete them
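
To see the three steps without risking real backups, here is a minimal dry-run sketch. It assumes GNU head (negative counts are not universal), and the temporary directory and file names are made up for illustration. Putting echo in front of rm prints the command instead of running it.

```shell
# Throwaway directory so nothing real is touched (path and names are illustrative).
demo_dir=$(mktemp -d)
for day in 01 02 03 04 05; do
    touch "$demo_dir/2023-10-$day-00-00-00.dump"
done

# Step 1: one filename per line, sorted by name (oldest timestamp first).
all_dumps=$(ls -1 "$demo_dir"/*.dump)

# Step 2: everything except the last two lines (needs GNU head).
to_delete=$(ls -1 "$demo_dir"/*.dump | head -n -2)

echo "All dumps:"
echo "$all_dumps"
echo "Would be deleted:"
echo "$to_delete"

# Step 3 as a dry run: echo prints the rm command rather than executing it.
ls -1 "$demo_dir"/*.dump | head -n -2 | xargs echo rm -f
```

With five dumps in the directory, the three oldest show up in the "would be deleted" list and the two newest are left alone.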

Take a look at explainshell.com.

[-] klay@lemmy.world 6 points 1 year ago

I just looked up the man page, and actually head -n -2 means "everything up to but not including the last two lines", so this should always leave two files remaining.

[-] doeknius_gloek@feddit.de 3 points 1 year ago

You're right, I edited my comment. Thanks!

[-] mnmalst@lemmy.zip 3 points 1 year ago

Your xargs comment is still wrong tho. It deletes ALL but the most recent two files.

[-] doeknius_gloek@feddit.de 2 points 1 year ago

Fixed, thanks.

[-] klay@lemmy.world 4 points 1 year ago* (last edited 1 year ago)

Ah! This is a shell pipe! It's composing several smaller commands together, cool stuff.

  • ls -1 is the grep-friendly version of ls, it prints one entry per line, like a shopping list.

  • head takes a set number of entries from the head of a list, in this case ~~2 items.~~ negative two, meaning "all but the last two."

  • xargs takes the incoming pipe and converts it into extra arguments, in this case applying those arguments to rm.

So, combined, this says "list all the .dump files, pick ~~the first two,~~ all but the last two, and delete them." Presumably the first are the oldest ones and the last are the newest, if the .dump files are named chronologically.
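
For the "keep 3 copies" part of the question, the count passed to head is the number of files that survive, so -2 would become -3. A rough sketch with the count pulled out into a parameter; prune_keep_newest is a hypothetical helper name, not something from the tutorial, and it assumes GNU head and filenames without spaces:

```shell
# prune_keep_newest DIR KEEP: hypothetical helper, not part of the tutorial.
prune_keep_newest() {
    dir=$1
    keep=$2
    # Same pipeline as the tutorial, with the number of survivors as a variable.
    ls -1 "$dir"/*.dump | head -n -"$keep" | xargs rm -f
}

# Try it on a throwaway directory with five fake dumps.
demo_dir=$(mktemp -d)
for day in 01 02 03 04 05; do
    touch "$demo_dir/2023-10-$day-00-00-00.dump"
done

prune_keep_newest "$demo_dir" 3
ls -1 "$demo_dir"
```

After the call only the dumps for the 3rd, 4th and 5th remain. In the compose file itself the equivalent change would simply be head -n -3 in place of head -n -2.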

[-] smileyhead@discuss.tchncs.de 2 points 1 year ago

Backups are created in the /backup directory and end with the .dump file extension.

ls -1 lists all those files sorted by name, which is chronological here because the names are timestamps; -1 puts one file on each line.

head -n -2 takes every line from the top except the last two at the bottom.

xargs rm -f passes the lines of its input to rm -f as arguments.

| is the pipe symbol; it takes the output of the command before it and gives it to the command after it.

So TLDR it's removing all backups except the last two.
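
One detail on the xargs step: by default it splits its input on any whitespace, not only on line breaks. The timestamped dump names contain no spaces, so the tutorial line is fine, but the difference is easy to check. A small sketch with made-up filenames (xargs -0 is available in GNU and BSD xargs, though it is not POSIX):

```shell
# Two made-up filenames, one per line; the second contains a space.
names=$(printf '%s\n' 'a.dump' 'b c.dump')

# Default xargs splits on whitespace, so "b c.dump" turns into two arguments.
split_count=$(printf '%s\n' "$names" | xargs -n 1 echo | wc -l)

# With NUL delimiters (tr plus xargs -0) each line stays a single argument.
safe_count=$(printf '%s\n' "$names" | tr '\n' '\0' | xargs -0 -n 1 echo | wc -l)

echo "default xargs: $split_count arguments, xargs -0: $safe_count arguments"
```

The first count comes out as three and the second as two, which is why some variants of this one-liner add tr '\n' '\0' and xargs -0.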

[-] Doomdoxrulz@lemmy.ml 1 points 1 year ago* (last edited 1 year ago)

The first command (ls -1 /backup/*.dump) just creates a list of files in the backup folder that have the extension .dump. The output of that command is then sent to the next command (head -n -2), which cuts the list down to everything except the last 2 items. That shortened list is then sent to the final command (xargs rm -f), which runs rm -f with the items in the list as the targets to delete.

Here's a solution based on this post: https://stackoverflow.com/questions/25785/delete-all-but-the-most-recent-x-files-in-bash

ls -tp /backup/*.dump | grep -v '/$' | tail -n +4 | tr '\n' '\0' | xargs -0 rm -f

There is an explanation on that post that covers it in better detail, but in simple terms it deletes all files in the directory with the .dump extension except the 3 most recent ones.
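
Unlike the tutorial line, this variant orders files by modification time (ls -t, newest first) rather than by name. A sketch that wraps it in a function and tries it on throwaway files; prune_by_mtime and the file names are hypothetical, and touch -t is only used here to fake the modification times:

```shell
# prune_by_mtime DIR KEEP: hypothetical wrapper around the one-liner above.
prune_by_mtime() {
    dir=$1
    keep=$2
    # ls -t: newest modification time first; -p marks directories with a trailing /
    # grep -v '/$': drop entries that end in /
    # tail -n +N: start printing at line N, i.e. skip the KEEP newest files
    # tr + xargs -0: NUL-delimit the names so spaces in filenames survive
    ls -tp "$dir"/*.dump | grep -v '/$' | tail -n +"$((keep + 1))" \
        | tr '\n' '\0' | xargs -0 rm -f
}

demo_dir=$(mktemp -d)
# touch -t sets the modification time (format: YYYYMMDDhhmm).
touch -t 202310010000 "$demo_dir/zebra.dump"   # oldest, even though it sorts last by name
touch -t 202310020000 "$demo_dir/alpha.dump"
touch -t 202310030000 "$demo_dir/mango.dump"
touch -t 202310040000 "$demo_dir/delta.dump"   # newest

prune_by_mtime "$demo_dir" 3
ls -1 "$demo_dir"
```

Here zebra.dump is the one that gets removed, because it has the oldest modification time regardless of its name.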

[-] un_ax@lemmy.sdf.org 0 points 1 year ago

If you want to get more in depth, I've been using this container:

https://github.com/jareware/docker-volume-backup

It can be set up in the same compose file or in its own, and it supports pre/post commands if you want to dump a db or stop a container before backup.

Additionally, setting a post-backup command like in their docs:

POST_BACKUP_COMMAND: "docker run --rm -e DRY_RUN=false -e DAILY=3 -e WEEKLY=1 -e MONTHLY=1 -v /backup:/archive ghcr.io/jan-brinkmann/docker-rotate-backups"

lets you specify the number of backups retained per period, e.g. 3 daily, 1 weekly, 1 monthly.

You could also mix and match.

[-] tburkhol@lemmy.world 0 points 1 year ago

Others have explained the line.

Worth noting that not all implementations of head accept negative line counts (i.e. all but the last n lines), and you might substitute tail.

i.e.: ls -1 /backup/*.dump | tail -2 | xargs rm -f

[-] klay@lemmy.world 5 points 1 year ago

Won't this delete the two newest files, as opposed to everything except the two newest files?

[-] doeknius_gloek@feddit.de 4 points 1 year ago* (last edited 1 year ago)

~~Yeah, tail would be the more obvious choice instead of negating head.~~

Fuck, I need coffee. @klay@lemmy.world is right (again).
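
For anyone whose head does not accept negative counts, one way to get the same result with tail is to reverse the listing first and then skip the newest entries. A sketch only; prune_portable is a hypothetical name, and it assumes timestamped filenames without spaces, as in the tutorial:

```shell
# prune_portable DIR KEEP: hypothetical helper; avoids head -n with a negative count.
prune_portable() {
    dir=$1
    keep=$2
    # ls -1r lists names in reverse order, so the newest timestamp comes first;
    # tail -n +N then skips the first KEEP names and passes on the rest for deletion.
    ls -1r "$dir"/*.dump | tail -n +"$((keep + 1))" | xargs rm -f
}

demo_dir=$(mktemp -d)
for day in 01 02 03 04 05; do
    touch "$demo_dir/2023-10-$day-00-00-00.dump"
done

prune_portable "$demo_dir" 2
ls -1 "$demo_dir"
```

With five dumps and a keep count of 2, only the dumps for the 4th and 5th are left, matching what head -n -2 does in the original line.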

this post was submitted on 05 Oct 2023