[-] admin@lemmit.online 3 points 1 year ago

Can't blame you for that. Personally, I still think it excels at content where communication with OP is irrelevant, like !itookapicture@lemmit.online, !todayilearned@lemmit.online or !dataisbeautiful@lemmit.online. And by far the best example of this, if you look at the subscriber count, is nsfw content.

[-] admin@lemmit.online 2 points 1 year ago

Nope. That would be very hard to implement, and probably very confusing and disliked by other lemmy users.

[-] admin@lemmit.online 4 points 1 year ago

I don’t know how the karma thresholds work behind the scenes, but might I suggest for the bot to do a “top for” sort instead? Like it will only repost top content for the past 6 hours only. This will also help get more quality content as well and avoid reposting low effort/quality posts.

This is effectively already kinda how it works. For each subreddit it periodically (anywhere between every 30 minutes and every 12 hours, based on subscriber count and posts per day) requests the "hot" content feed. It then checks whether each post has at least 20 upvotes and an 80% upvote to downvote ratio. Those numbers are configurable, but that's what they're currently set to - I believe they're a good mix between filtering out the complete garbage and still making sure it doesn't miss good content.
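Roughly, the check described above could look like the sketch below. This is not the bot's actual code: the `Post` class and the function names are made up for illustration, and it assumes the "80% ratio" means the share of all votes that are upvotes.

```python
from dataclasses import dataclass

# Thresholds quoted in the comment above; the constant names are hypothetical.
MIN_UPVOTES = 20
MIN_UPVOTE_RATIO = 0.80


@dataclass
class Post:
    """Hypothetical stand-in for one entry of a subreddit's "hot" feed."""
    title: str
    upvotes: int
    upvote_ratio: float  # assumed: upvotes / (upvotes + downvotes), 0.0 to 1.0


def passes_thresholds(post, min_upvotes=MIN_UPVOTES, min_ratio=MIN_UPVOTE_RATIO):
    """Return True if the post clears both configurable thresholds."""
    return post.upvotes >= min_upvotes and post.upvote_ratio >= min_ratio


def filter_hot_feed(posts):
    """Keep only the posts worth copying over, preserving feed order."""
    return [post for post in posts if passes_thresholds(post)]
```

Both thresholds are keyword arguments, so they stay as easy to tune as the comment suggests.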

54
submitted 1 year ago* (last edited 1 year ago) by admin@lemmit.online to c/lemmy@lemmy.ml

A few months ago, I launched the Lemmit instance and bot (@bot@lemmit.online). Primarily, this was to help me stay up to date with some of the content I'd leave behind on Reddit. Additionally, I wanted to give back to the community, so I made it possible for anyone to request the archiving of subreddits to the Lemmit instance.

However, this came with some unintended consequences. Notably, the most subscribed community on the instance has been !AmItheAsshole@lemmit.online, even though it should have been obvious that there is no way to communicate with the Original Poster, given they're on Reddit.

The pushback against the bot and the instance has increased over time. A recent post, This bot is bad for Lemmy, highlighted these concerns. I've also received similar feedback from admins of major Lemmy Instances and through direct PMs.

In response, last week I stopped accepting requests for archiving new subreddits. This weekend, I went a step further by discontinuing the archiving of a large number of "interactive subreddits": communities primarily centered around Q&A or communication with the Original Poster. This includes subs like !AskReddit@lemmit.online and !dating_advice@lemmit.online, as well as niche and support communities. Such discussions are better hosted on Reddit or Lemmy's equivalent spaces.

I've also adjusted the post karma thresholds to curb spam posts. While this probably won't appease everyone, it should reduce the bot's posting frequency.

Perhaps this might prompt some admins to rethink their choice to defederate from the Lemmit instance, or the banning of the bot. I'm not expecting anyone to, and won't take it personally if you don't, but I wanted to give the community this update nonetheless.

In !about@lemmit.online there's a sticky post of all the Actively archived communities on the server (including NSFW ones, since that is not public without logging in), as well as the list of communities for which archiving is now disabled.

Cheers!

[-] admin@lemmit.online 1 points 1 year ago

Sure thing, done.

[-] admin@lemmit.online 1 points 1 year ago

Thanks, that means a lot.

I have nothing but respect towards Ruud and the others at LemmyWorld, and I understand how some instances want to block the Lemmit bot - so no hard feelings whatsoever.

[-] admin@lemmit.online 1 points 1 year ago

I figured it out - earlier this morning Lemmit was defederated from your instance, but that seems to have been fixed now.

[-] admin@lemmit.online 2 points 1 year ago* (last edited 1 year ago)

Still not getting updates properly, but I have approved lemmy.basedcount.com, @Nerd02@lemmy.basedcount.com

[-] admin@lemmit.online 1 points 1 year ago

Yeah, wasn't properly subscribed to this community yet, so couldn't reply to them. Glad to have helped though.

[-] admin@lemmit.online 1 points 1 year ago* (last edited 1 year ago)

Hey all. Owner of a highly controversial Lemmy instance requesting a guarantee here: lemmit.online - the on-demand Reddit archiver.

While the bot that runs on it is not loved by everyone, I can assure you that I will never let any other users register, and the only content that gets posted on it will have to be vetted by Reddit first.

Edit: Guaranteed, thanks! Next admin can tag me, I'll gladly pay it forward.

1
testie mctestes (old.reddit.com)
1
submitted 1 year ago* (last edited 1 year ago) by admin@lemmit.online to c/about@lemmit.online

Okay, this one took me a bit longer than I planned (mostly due to sql fun and trying to use integers as minutes, WEEEE!).

Backdrop: Last week I disabled the mirroring of a couple of subreddits in the database, because they were initially requested but then nobody subscribed to them. At the same time, the bot was just crawling in a loop, starting at todayilearned, ending at latestsubreddit. As more subreddits were requested, this loop took longer and longer (21 minutes before I rolled out this update). This wasn't sustainable.

So here's the new situation. The more popular a community is, the more often it will be updated. In this case popular means a mixture of the number of subscribers and the number of posts it receives per day (Link to relevant snippet of source code).

In short, the most popular subs will be synced every 10 minutes, the next tiers every 30, 60 and 120 minutes, and communities with either no posts per day or no subscribers (other than the bot) will only be synced every 12 hours. I hope this will hit a good distribution of updates vs popularity, but it will most likely be refined at some point in the future.
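To give an idea of the tiering, here is a minimal sketch. It is not the linked snippet: the scoring formula (subscribers times posts per day) and the cutoff values are placeholders invented for illustration; only the interval lengths come from this post.

```python
# Sync intervals in minutes, from most to least popular tier (from the post).
INTERVALS = (10, 30, 60, 120)
DORMANT_INTERVAL = 720  # 12 hours, for subs with no posts or no subscribers


def update_interval(subscribers, posts_per_day, cutoffs=(1000, 100, 10)):
    """Pick a sync interval in minutes for one community.

    `subscribers` is assumed to exclude the bot itself. The score formula
    and the default `cutoffs` are hypothetical placeholders.
    """
    if subscribers <= 0 or posts_per_day <= 0:
        return DORMANT_INTERVAL
    score = subscribers * posts_per_day  # placeholder popularity score
    # Walk the tiers from most to least popular; the first cutoff met wins.
    for interval, cutoff in zip(INTERVALS, cutoffs):
        if score >= cutoff:
            return interval
    return INTERVALS[-1]  # below every cutoff: slowest active tier
```

Keeping the cutoffs as a parameter makes it easy to refine the distribution later without touching the tier logic.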

Speaking of distribution, we now have over 300 communities on this server 🥳, and their update intervals are spread out as such:

  • Every 10 minutes: 22
  • Every 30 minutes: 39
  • Every 60 minutes: 55
  • Every 120 minutes: 143
  • Every 720 minutes: 44

With this update running live (I started typing after I deployed it, and it has now gotten through the backlog of 'abandoned' subs), I'm going to step back from feature development for a few days. Any bugs that cause the bot to crash will of course continue to be addressed.

Have a blast!

1
submitted 1 year ago by admin@lemmit.online to c/about@lemmit.online

Before, it was running on the cheapest model (1 core / 1GB mem / 30GB storage) at $12/month. The machine was running pretty low on memory, causing it to start swapping, which in turn caused the cpu to get too busy, and everything to slow down.

Now it has a whopping 2GB of memory, and things seem to have calmed down - cpu is back to around 10-15% usage, and swap is down to 0. Happy times all around.

Because of the amount of subs being archived, it now takes about 15 minutes between updates for each sub (was 18 before I updated the VM).

I'm planning to build some kind of scoring system, based on the number of posts per subreddit (per day?) and the number of subscribers on the lemmy community. That way, communities with few subscribers or that don't see many posts per day will only be updated once per hour.

At the same time, I feel that subs like AskReddit, OutOfTheLoop and other "question-based" subreddits shouldn't be archived by Lemmit. In my opinion those kinds of posts are useless without the answers, but please let me know if you disagree.

1
Bug fixes 24-06-2023 (lemmit.online)
submitted 1 year ago by admin@lemmit.online to c/about@lemmit.online
  • Fixed a bug where posts would not be submitted because the title didn't contain long enough words.
  • Fixed a bug where posts would not be submitted because the url was too long.
  • Fixed a bug where posts would not be submitted when it was linking to a /user subreddit.
  • Fixed a bug where the bot would think Every Post Everywhere was a subreddit request, and would reply to it.
  • Fixed a bug where the bot would crash without recovering whenever something went wrong during new subreddit requests.

A fruitful day all in all, I'd say.

1
Please don't tell me (lemmit.online)
submitted 1 year ago by admin@lemmit.online to c/about@lemmit.online

That the replies-everywhere-bug was just because I forgot to include a variable in the bot deployment? 🤦

2
submitted 1 year ago* (last edited 1 year ago) by admin@lemmit.online to c/about@lemmit.online

In the short time since this instance and bot launched, I've been seeing the same questions resurface multiple times. This is totally understandable, since the concept of a Fediverse is still new to most (myself included), and this server is not like the others.

Q: What is Lemmit?

A: Lemmit is a Lemmy instance specifically designed for archiving Reddit content. Users can request new subreddits to be included in the archiving process by posting in the !requests@lemmit.online community. It is powered by an open source python bot, which periodically checks the request list, adds new requests to the queue, and continuously monitors the Hot feed of those subs for new posts to cross-post here.

Q: Does it synchronize comments?

A: No, that would be impossible. Considering there are thousands of posts already on Lemmit, many of them having at least several hundred comments on Reddit, often buried in deep layers, it simply wouldn't be feasible to index those for more than a few posts, let alone keep them up to date.

Unfortunately, this means that archiving certain subreddits, such as Ask Historians/Men/Women/Hyperintelligentshadesofthecolourblue-type subs, is going to be rather pointless.

Q: Can it send comments back to Reddit?

A: No, it cannot. The purpose is to help bootstrap the Lemmy platform, not to serve as a bridge between the two networks. Also, see the answer about synchronizing comments.

Q: Can I request any subreddit?

A: Technically, yes. However, as the list of subs grows, the time it takes to update all of them will also increase. I do not have strict guidelines in place for this, so I'm relying on your common sense (hoooo boy). At some point, I will probably have to either stop accepting new requests or disable scraping for very low-traffic communities.

Q: Does this use the API? Will it keep working after July 1st?

A: Nope, it uses a combination of the public feed and scraping old.reddit.com. So, as long as those are still available, it will continue working. And even if they close those sources, there will probably be new ways to achieve the same effect. "Content, eh, finds a way."

Q: What started this?

A: Okay, nobody asked this, but I'm going to tell you anyway. After Reddit made it clear that they are effectively killing third-party apps and implementing plenty of other anti-end user decisions, I realized that I would either have to accept not being able to access my time-wasting content or have to do so in a rather uncomfortable way (either through the official app or old.reddit.com for as long as they'll allow it to exist).

Being a stubborn developer, naturally, I chose option C: Have my own Reddit. With blackjack, and hookers. This way, I would still be able to access my beloved content without being beholden to Reddit's mood swings and abusive relationship tendencies.

Besides that, I also know that Content is King. So in order to counter the network effect (No users because no content, No content because no users), I figured it would be better to have some inorganic content to bootstrap the adoption of Lemmy.

Q: This is spam, can you stop?

A: First of all, I apologise for the inconvenience. All you have to do is block @bot@lemmit.online, and none of its posts will ever show up on your instance.

Obviously I could stop, because running this server and software is only ever going to cost me time and money. But for the reasons listed above, I still think this server is a useful addition to the lemmyverse at this time. I'm looking forward to the day when I can turn the bot off because it's no longer needed.

Q: Are NSFW subreddits allowed?

A: Absolutely. Like I said: Blackjack and hookers.

Q: My request isn't picked up by the bot!

A: That isn't a question. But yeah, the process isn't flawless yet. I'm trying to iron out all the bugs as I encounter them. In the meantime, feel free to re-request the subreddit by making a second post. No harm done.

Q: No new posts are showing up at all on Lemmit

A: If no posts are appearing on the Lemmit Frontpage (sorted by NEW), it's possible that the bot has crashed or is stuck on something. Since no software is flawless, this sometimes happens. I usually fix this as soon as I'm aware, and I'm happy to say that these kinds of fatal errors are becoming less and less frequent. However, they may still occur, and as a human with a need for sleep and other responsibilities, I'm not always able to fix them immediately.

Q: Posts aren't showing up on my instance, what's up?

A: First, check if any posts show up on the frontpage (see previous question). Other than that - I wish I had an answer to this. I'm not an expert on the inner workings of the Lemmy service. If you're an experienced Lemmy admin and think you may have a clue, please reach out.

Q: When are you updating to v0.18?

A: There have been some varying reports on its stability. For now I will be waiting until v0.18.1 rolls out.

1
submitted 1 year ago by admin@lemmit.online to c/about@lemmit.online

Long story short: I messed up with the domain registration for this instance, and never replied to a mandatory email. The domain name (lemmit.online) was suspended, causing disconnects all over the fediverse.

I fixed it as soon as I found out, but it will probably take a few more hours for the issues to be fully fixed.

So ehm. Whoops. Hope this explains and fixes the federation issues we've been having today.

1
Bug fixes 21-06-2023 (lemmit.online)
submitted 1 year ago by admin@lemmit.online to c/about@lemmit.online

Most importantly that the bot no longer crashes (and does nothing all night while I sleep 😛) when trying to create a community that has already been requested.

Furthermore mostly making the code prettier and adding tests.

1
Bug fixes 19-06-2023 (lemmit.online)
submitted 1 year ago* (last edited 1 year ago) by admin@lemmit.online to c/about@lemmit.online

Fixed a couple of bugs today:

  • Nasty one that made the bot get stuck in an infinite loop when trying to add a post by a deleted user, which kept the bot offline for most of last night.
  • Another creative one: when posting certain links, the post would actually go through, but the lemmy gateway would respond with a timeout. It only happens on certain links, but consistently. That made the bot think the post was unsuccessful, so it would try to post again the next time, causing a duplicate post each time (technically it was a cross post to itself... which is interesting in a whole new way).

TLDR: right now there is a workaround in place that assumes a timed-out post to lemmit was actually successful. This might cause it to drop posts in the future, but seeing that the server is barely breaking a sweat at this time, it should be good until a better fix is implemented.
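The workaround boils down to something like the sketch below. It is only an illustration, not the bot's real code: `post_once`, the `submit` callable and the `already_posted` set are hypothetical names, and it assumes the gateway timeout surfaces as Python's built-in `TimeoutError`.

```python
def post_once(url, submit, already_posted):
    """Submit `url` unless it was handled before; count a timeout as success.

    `submit` is any callable that posts the link and raises TimeoutError
    when the gateway times out. `already_posted` is a set of handled URLs.
    Returns True if a submission was attempted, False if it was skipped.
    """
    if url in already_posted:
        return False
    try:
        submit(url)
    except TimeoutError:
        # Workaround: assume the post was created before the gateway gave up.
        # If it really did fail, this post is silently dropped.
        pass
    already_posted.add(url)
    return True
```

Marking the URL as handled even after a timeout is what stops the duplicates, at the cost of possibly dropping a post that genuinely failed.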

Also got some great feedback from users, which I added to the TODO.

3
submitted 1 year ago by admin@lemmit.online to c/lemmy@lemmy.ml

I have created some software that is capable of synchronising posts from Reddit to Lemmy. It's still a little rough around the edges, but it works as follows:

People can request new subreddits to be mirrored on !requests@lemmit.online. A bot (open source) will monitor the threads there, and if it finds a new request for a subreddit, it will make a new community on the Lemmit server, and add it to its monitored list. It will then make periodic checks to see if any new posts (it doesn't copy any comments) have been posted on reddit, and copy those over.

Users can then subscribe to those communities from their own lemmy instance, and from there federation will pick it up. Or at least, that's the theory. At the moment, federation is not working awesomely, and that is where my lack of fediverse knowledge comes in. Maybe it needs more time, or something is not set up properly - I don't know.

Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, "original" content to all the rest of the Fediverse while it's going through a ramp-up phase. Besides, the instance is running on a pretty small vps, and I'd rather have this thing manage itself. There is an !about@lemmit.online community for further questions about the project itself though, in case people want to discuss it further.

So ehm... Let me know what you think :)

admin

joined 1 year ago