I'll ping @Penguincoder@beehaw.org since they would know a lot more about this.
Well thank you kindly!
Look at the VACUUM command. Also, DBeaver is a really good GUI tool. Right now there is one table that's the main cause (I think it's called activity, but don't quote me on that; there was a post about it).
That table is mainly for debugging, as it logs every ActivityPub action your instance performs. I truncate it once a week and vacuum it (which shrinks the disk space used). Caution: you must shut down Lemmy before doing so. I run containers, so I stop all of them except Postgres and clean it up.
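A minimal sketch of that weekly cleanup, to be run in psql with Lemmy stopped. It assumes the table really is named activity and lives in the default schema, which nobody in this thread has confirmed, so check the name first. If other tables reference it through foreign keys, TRUNCATE will refuse to run.

```sql
-- Assumption: the table is called "activity"; verify with \dt before running.

-- Check how much space the table uses, including its indexes and TOAST data.
SELECT pg_size_pretty(pg_total_relation_size('activity'));

-- Remove every row. TRUNCATE hands the disk space back immediately.
TRUNCATE TABLE activity;

-- Clean up and refresh planner statistics for the now-empty table.
VACUUM (VERBOSE, ANALYZE) activity;
```

Because TRUNCATE already releases the space, the VACUUM afterwards is mostly housekeeping for that one table.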
You are exactly correct about that one table. It is like 90% of the space. When I ran the vacuum I did not shut down Lemmy. Now I know.
Really ugly, but really works. Connect to psql and run:
WITH RECURSIVE pg_inherit(inhrelid, inhparent) AS (
    SELECT inhrelid, inhparent
    FROM pg_inherits
    UNION
    SELECT child.inhrelid, parent.inhparent
    FROM pg_inherit child, pg_inherits parent
    WHERE child.inhparent = parent.inhrelid
),
pg_inherit_short AS (
    SELECT *
    FROM pg_inherit
    WHERE inhparent NOT IN (SELECT inhrelid FROM pg_inherit)
)
SELECT table_schema
     , TABLE_NAME
     , row_estimate
     , pg_size_pretty(total_bytes) AS total
     , pg_size_pretty(index_bytes) AS INDEX
     , pg_size_pretty(toast_bytes) AS toast
     , pg_size_pretty(table_bytes) AS TABLE
     , total_bytes::float8 / sum(total_bytes) OVER () AS total_size_share
FROM (
    SELECT *, total_bytes - index_bytes - COALESCE(toast_bytes, 0) AS table_bytes
    FROM (
        SELECT c.oid
             , nspname AS table_schema
             , relname AS TABLE_NAME
             , SUM(c.reltuples) OVER (PARTITION BY parent) AS row_estimate
             , SUM(pg_total_relation_size(c.oid)) OVER (PARTITION BY parent) AS total_bytes
             , SUM(pg_indexes_size(c.oid)) OVER (PARTITION BY parent) AS index_bytes
             , SUM(pg_total_relation_size(reltoastrelid)) OVER (PARTITION BY parent) AS toast_bytes
             , parent
        FROM (
            SELECT pg_class.oid
                 , reltuples
                 , relname
                 , relnamespace
                 , pg_class.reltoastrelid
                 , COALESCE(inhparent, pg_class.oid) parent
            FROM pg_class
            LEFT JOIN pg_inherit_short ON inhrelid = oid
            WHERE relkind IN ('r', 'p')
        ) c
        LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
    ) a
    WHERE oid = parent
) a
ORDER BY total_bytes DESC
LIMIT 3;
That will show the three largest tables in the database. I bet you number one is activity.
Yeah, that worked magically!
You really don't want to run VACUUM FULL on the live DB; it will lock everything. It's useful and, I think, necessary, but you have to plan downtime for it. Otherwise your site is going to be inaccessible anyway.
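To make the locking difference concrete, here is a rough comparison of the two commands in psql. The table name activity is an assumption carried over from earlier in the thread; substitute whatever your largest table turns out to be.

```sql
-- Assumption: "activity" is the table you want to shrink.

-- Plain VACUUM: marks dead rows as reusable and can run alongside normal
-- reads and writes, but it usually does not return space to the OS.
VACUUM (VERBOSE) activity;

-- VACUUM FULL: rewrites the whole table into a new file and returns the
-- freed space to the OS. It holds an ACCESS EXCLUSIVE lock for the entire
-- run, so every query on the table waits. Plan downtime for this one.
VACUUM FULL activity;
```

VACUUM FULL also needs enough free disk space for the new copy of the table while it runs, which matters if you are cleaning up because the disk is nearly full.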
learn PostgreSQL administration
Definitely helpful, but administration only goes so far with the Lemmy database. Take a look at this post and let me know if it answers any of your questions; if you have more feel free to ask, or ping me on matrix @penguincoder:hive.beehaw.org
Why are you concerned with your database ballooning?
What do you think you'll be able to do to the database to prevent this?
What do you want to do?