this post was submitted on 10 Jun 2025
20 points (100.0% liked)

New to Lemmy

Not exactly new to Lemmy, but my search-fu has been noobish at best.

Lemm.ee will be shutting down at the end of this month, and while I've already moved to a new instance, I have yet to archive my posts and comments. Is there an automated way to save them?

Ideally, I'd want the archive to be:

  • a full copy of all the posts including
    • the full OP text
      • any embedded images will be saved and included in the OP
      • outward links are left as is
    • all of the comments, including any deleted ones (deleted by the user and deleted by mods/admins)
  • a full copy of the comments including
    • the full OP text of the thread in which the comment is made
    • the full tree up to the top-level comment
    • optionally including any deleted comments in this tree

Is this already a thing? I don't think I have the skills and the time to make one before June 23 (one week before the instance shuts down on June 30), so that is not an option. I doubt anyone could make one on such short notice either.

Is there any other method I can do this without resorting to manual saving? And if I have no choice other than to save each and every post and comment manually, how should I be doing it?

If anyone can suggest anywhere else I can crosspost this for better visibility, that will be welcome as well.

[-] Nils@lemmy.ca 2 points 5 days ago (1 children)

all of the comments, including any deleted ones (deleted by the user and deleted by mods/admins)

You probably will not have access to this.

Is there any other method I can do this without resorting to manual saving?

Only if you are an Admin. https://join-lemmy.org/docs/index.html

The data seems to be stored in a PostgreSQL database, and could technically be queried by someone with database access.

And if I have no choice other than to save each and every post and comment manually, how should I be doing it?

It depends on your needs. If it is for legal purposes, it depends on your jurisdiction. Maybe the Internet Archive is enough for this case: https://archive.org/.

Otherwise, you can use the API as others mentioned. Or you can use Selenium/Scrapy/BeautifulSoup to scrape the website.
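As a rough sketch of the API route: the snippet below assumes the instance exposes a `/api/v3/user` endpoint that accepts `username`, `page`, `limit`, and `sort` query parameters and returns JSON with `posts` and `comments` lists. Check your instance's API documentation before relying on those names. The helper names `user_page_url`, `fetch_json`, and `collect_all` are made up for this example.

```python
import json
import urllib.parse
import urllib.request


def user_page_url(instance, username, page, limit=50):
    """Build the URL for one page of a user's posts and comments.

    Assumes the Lemmy v3 endpoint /api/v3/user and its query parameters.
    """
    query = urllib.parse.urlencode(
        {"username": username, "page": page, "limit": limit, "sort": "New"}
    )
    return f"https://{instance}/api/v3/user?{query}"


def fetch_json(url):
    """Download a URL and decode the JSON body (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_all(instance, username, fetch=fetch_json, max_pages=1000):
    """Walk the pages until one comes back with no posts and no comments."""
    posts, comments = [], []
    for page in range(1, max_pages + 1):
        data = fetch(user_page_url(instance, username, page))
        page_posts = data.get("posts", [])
        page_comments = data.get("comments", [])
        if not page_posts and not page_comments:
            break  # an empty page is taken to mean we are past the end
        posts.extend(page_posts)
        comments.extend(page_comments)
    return posts, comments
```

Calling `collect_all("lemm.ee", "megane_kun")` would go over the network; writing the two returned lists out with `json.dump` would give you a local copy. The `fetch` argument is there so the paging logic can be tried with canned data first.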

If you would rather not write code, look into browser add-ons that scrape websites. They are mostly visual: you click on the things you want to save or navigate into. I am not familiar enough with them to recommend one, but there are plenty of videos on how to use them.

That said, depending on the security of your instance, your IP/account might get flagged.

Talk with the Admins of your instance first and express your intentions, maybe they can help with what you need.

[-] megane_kun@lemm.ee 1 points 5 days ago (1 children)

I'd rather not bother my admins (they're already burnt out), and with what you just said, maybe I'm better off doing it manually--if I do it at all. TBH, I'm scared off by the impression that saving my own posts and comments is somehow taboo.

Thanks!

[-] Nils@lemmy.ca 1 points 4 days ago (1 children)

I don't think it would be taboo to save something that is publicly available, as you can just as simply visit those pages and print them to PDF, for example.

The same goes for things you have access to with your account, as long as they are not bound by a non-disclosure agreement (in that case you can still save them, but not broadcast them, depending on the laws of your jurisdiction).

I looked into the lemm.ee default profile page, and you might have success with the tools I mentioned.

You just need to navigate these links, and have the tool open each post title.

https://lemmy.ca/u/megane_kun@lemm.ee?page=1&sort=New&view=Posts
https://lemmy.ca/u/megane_kun@lemm.ee?page=1&sort=New&view=Comments

You will need to take into account navigation with "next" buttons, and some pages need to be scrolled down to load all the comments.
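To illustrate the "next" navigation: the `page` query parameter in the profile links appears to be what the "next" button increments, so a scraper could generate the page URLs directly instead of clicking. This is only a sketch; `profile_page_url` and `profile_page_urls` are made-up names, and how many pages exist (`last_page`) is something you would have to find out yourself, for example by checking when a page comes back empty.

```python
import urllib.parse


def profile_page_url(viewer_instance, user, page, view):
    """Build a profile listing URL in the same shape as the profile links.

    view is "Posts" or "Comments"; page starts at 1.
    """
    query = urllib.parse.urlencode({"page": page, "sort": "New", "view": view})
    return f"https://{viewer_instance}/u/{user}?{query}"


def profile_page_urls(viewer_instance, user, view, last_page):
    """List the URLs for pages 1..last_page of one view."""
    return [
        profile_page_url(viewer_instance, user, page, view)
        for page in range(1, last_page + 1)
    ]
```

For example, `profile_page_urls("lemmy.ca", "megane_kun@lemm.ee", "Comments", 3)` gives the first three comment pages, which can then be handed to whichever scraping tool you pick.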

If you don't want to contact the Admins, and you are not pressed for time, put a delay between the tool's web requests so you don't overload the servers. A browser add-on might avoid the problem because it mimics a more natural way of navigating a website.
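A delay like that can be as simple as sleeping between requests. This is a minimal sketch: `fetch_politely` and its parameters are hypothetical names, and the one-second default is only an example value, not a limit published by any instance.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests.

    fetch is any callable that takes a URL and returns its content.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results[url] = fetch(url)
    return results
```

The `sleep` parameter exists only so the pacing can be checked without actually waiting; in normal use you would leave it at its default.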

[-] megane_kun@lemm.ee 1 points 4 days ago

Yeah, I had a think after my previous reply, and the questionable part isn't the archiving but the inclusion of the deleted/removed comments.

I still haven't started with the script mentioned in a different comment, but if I were to do it, I'd likely put a one-second delay on every request.

Thanks!