DataHoarder

1
 
 

Hey folks! First time contributor here looking for some insight into a backup need I have.

My current backup situation is a single USB SSD that stores my active projects, which I back up to a hard drive. It's not exactly a full backup at the moment, as non-active jobs are saved only on the backup drive. I'm hoping to get a second drive to RAID 1 with the main backup once I have a bit more money.

On to my issue: I'm looking for backup software on macOS that will only add new files to the backup and replace existing ones, not delete files that no longer exist on the source. That way I can keep moving files from the working SSD onto the backup drive, while still being able to clear off space on the working SSD.

I think that makes sense? Let me know if I need to clarify better!
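
For illustration, the "add and replace, never delete" behaviour described above can be sketched in a few lines of Python. This is a minimal sketch, not a recommendation of a specific product: the function name `additive_backup` and the `mtime_slack` parameter are made up for this example, and the "same size and not older" check is an assumed heuristic for deciding whether a file changed. (For comparison, `rsync -a` also leaves extra files on the destination alone unless `--delete` is passed.)

```python
import shutil
from pathlib import Path


def additive_backup(src: Path, dst: Path, mtime_slack: float = 2.0) -> list[Path]:
    """Copy new or changed files from src into dst. Never deletes anything in dst.

    mtime_slack (seconds) is an assumed tolerance for filesystems that store
    timestamps with coarse precision.
    """
    copied = []
    for src_file in sorted(src.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = dst / src_file.relative_to(src)
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            # Heuristic: same size and backup copy not older means "unchanged".
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime + mtime_slack:
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also copies the modification time
        copied.append(dst_file)
    return copied
```

Because the loop only ever walks the source, files that exist solely on the backup drive are never looked at, so clearing space on the working SSD cannot remove anything from the backup.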


Originally posted by u/PM_ME_TINY_PIANOS on Reddit.com/r/datahoarder


beep boop I'm a bot to seed discussions from Reddit. Upvote or downvote posts like normal, discuss the topics here as well!

If you see an issue with this post, such as no content or links broken or other issues, please report the post.

2
 
 

I've essentially archived a website and want to be able to view it in, say, Kiwix, but that takes ZIM files. So I want to know how I can compress all the HTML files and folder structure into a ZIM file that I can view offline, or maybe a WARC (I'm not sure how this would work).

The alternative is that I create an app that has a browser that can open html files by decompressing on the fly into ram for example but I feel like this is what a ZIM is. Can anyone help? Thanks.

The reason I'm not using a tool like ZimIT is because I have to edit the html code to eliminate cookie popups, so now it's nice and clean ready to be archived/zimmed up.


Originally posted by u/Specific-Judgment410 on Reddit.com/r/datahoarder



3
 
 

I used an extension called myfavett on Chrome, but it only grabbed about 1,000 videos and refuses to download any more. Does anyone know any workarounds?


Originally posted by u/Forsaken_Pea3464 on Reddit.com/r/datahoarder



4
 
 

Hey everyone, I shucked my Seagate Backup Plus Slim 2TB External HDD hoping that the internal SATA to USB adapter could be used for another SATA drive I have. Picture shows the opened casing, I removed the shielding tape and used the adapter but it has a motherboard which seems to restrict it to work only with the Seagate drive.

Unfortunately, when I plugged it into my PNY 2.5” drive, nothing popped up.

Hoping that someone knows how to make it work universally. I was trying not to buy a SATA to USB adapter because it would take a few days for delivery and I want to use the PNY drive today.


Originally posted by u/BruhJr on Reddit.com/r/datahoarder



5
 
 

I recently got a pCloud subscription to back up my neurotically tagged and organised music collection.

pCloud says a couple of things about backing up folders from your local drive to their cloud:

(pCloud) Sync is a feature in pCloud Drive. It allows you to connect locally-stored folders from your PC with pCloud Drive. This connection goes both ways, so if you edit or delete the files you’re syncing from your computer, this means that you'll also be editing them or deleting them from pCloud Drive.

That description and especially the bold part leaves me less than confident that pCloud will never edit files in my original local folder. Which is a guarantee I dearly want to have.

As a workaround, I've simply copied my music folder (C:\Users\\Music) to the virtual P:\ drive created by pCloud (P:\My Music). I can use TreeComp for manual one-way syncing, but that requires I remember to sync manually regularly. What I'd really like is a tool that automatically updates P:\My Music whenever something changes in C:\Users\\Music, but will 100% guaranteed never change anything in C:\Users\\Music.

Any tips? Thanks in advance!


Originally posted by u/midnightrambulador on Reddit.com/r/datahoarder



6
 
 

Today I was given an IBM 3590 tape cartridge. It came from a completely different person than the one who gave me the 3592 tape cartridge, but from the same PGS geographical company as the 3592 cartridge, so now I am very curious to see what data is on there, assuming I can decode the .TAR format into files. The person also had a few 3590 tape drives at their job, which unfortunately were signed off for recycling and are to be sent off to another country to be scrapped. That means I can’t have a single one of them :( or go to the recycling company’s place and buy one from them, which is a shame. I have a video of one operating that I took before they loaded them onto a lorry (truck, for non-UK people) and took them away. I cried a little knowing these pieces of history are wasted. I did try to offer £40 for one, but they didn’t budge on it, citing that the contract had been signed and they were not able to go back on it.

The IBM 3590 was a format that replaced the IBM 3490 tape series and was eclipsed by the IBM 3592, which had much higher storage capacities (up to 50TB), speeds and drive density. The IBM 3590 drives took up a lot more space, while the IBM 3592 was a full-height 5.25” drive, which means it could fit inside a PC bay provided you bend out the tabs inside to allow the full-height drive to fit in two 5.25” bays (these tabs are there to help guide half-height 5.25” drives into the bay, as most common consumer drives and accessories are half height). These types of drive were intended to be used in a mainframe application with rows upon rows of tapes that are picked by robots and placed into the tape drives for data backup. Humans aren’t meant to touch or see any of these tapes, with the exception of expired cleaning cartridges, which are deposited into a box to be collected and replaced with new ones. There are also calibration cartridges, which are only used when a new tape drive is put into service or in the event of a read/write error, to recalibrate the heads and tape mechanism.

The IBM 3590 tape cartridges came in 3 different generations, each further split into 2 lengths: a standard length “High Speed” data cartridge and an extended length “High Speed” data cartridge. The types are as follows:

3590-B

10GB standard length “High Speed” data cartridge (this is what I have)

20GB extended length “High Speed” data cartridge

3590-E

20GB standard length “High Speed” data cartridge

40GB extended length “High Speed” data cartridge

3590-H

30GB standard length “High Speed” data cartridge

60GB extended length “High Speed” data cartridge

Here is a video of it operating, which shows the marvel of engineering that was unfortunately scrapped (16 of them D: ). It had pneumatic tubes feeding many parts of the tape drive, both to keep the tape stuck to the walls (the tape needed to be tight on the heads to ensure good reads and writes while moving back and forth at high speeds) and to operate the arm that pulls the tape media around the mechanism and to the drive spool (you can even hear a slight hiss as the arm makes its way around the drive). The design stuck around on the 3592 and IBM LTO tape drives but was motorized instead of being pneumatic, which is why it was very loud.

The inner workings of an IBM 3590 tape drive complete with sound - GIF - Imgur

Thank you for reading this Friday’s post, and I hope you have a great day. If you have any queries, thoughts about the format or additional information, or want to point out a mistake, please put them in the comments :)

Link to previous post, post 17 (36th week): My data storage mediums, post 17 (36th week) : r/DataHoarder

Link to future post, (To be posted)

The cartridge on my wall

The cartridge up close, not shown is the very cool font used on the barcodes which I wish I could have taken a photo of before this post


Originally posted by u/LaundryMan2008 on Reddit.com/r/datahoarder



7
 
 

Hey everyone!

Apologies if this isn’t the right place to ask, but I need a little advice on the easiest way to go about backing up my old computer (which has developed some disk issues in recent months with both the boot drive and an internal HDD). To not bore everyone with the details, there have been error messages/indications that a disk failure is imminent and I would like to back up everything from both drives to avoid data loss since I have some important stuff on there.

I was thinking I could maybe back up both drives onto a single 4TB HDD. However, I am unsure how feasible that would be as one of the drives has a Windows installation and the other is additional storage. What do you all think the best solution would be? I have important project files on both drives so I’m at a bit of a loss for how to best go about this.

Thanks for reading! :)


Originally posted by u/ghostpicnic on Reddit.com/r/datahoarder



8
 
 

I'm looking for a box or case to store internal hard drives (1TB, 2TB, 4TB, 6TB) in when I'm not using them. Which models would you recommend?


Originally posted by u/Yukinoooo on Reddit.com/r/datahoarder



9
 
 

Right now my setup is an M4 desktop Mac + 2TB external hard drive (for now). I’ve saved a handful of movies and shows on it and have been watching them through Infuse on my Apple TV. I have been very satisfied with how it’s all worked out, so now I would like to begin the process of going full hoarder mode and really start loading up on shows and movies.

My immediate first use case is that I want to add all my favorite shows, mainly 30-minute sitcoms like Seinfeld, Trailer Park Boys, It’s Always Sunny, etc., to the drive. Using Seinfeld as an example, each episode is roughly between 800MB and 1GB as it stands now.

I own Apple Compressor and would like to run all these shows through it to save on space. Any recommendations for format/audio/visual settings? HEVC? H.264? H.265? MP4? Other? I really don’t need super high quality here, certainly not 4K, but was thinking 1080p.

I would also be curious to hear streaming platform recommendations. Infuse has been terrific so far, but I didn’t know if Plex, Jellyfin or Kodi were worth a look or better in any way. Thanks in advance.


Originally posted by u/SummerWhiteyFisk on Reddit.com/r/datahoarder



10
 
 

Okay, I captured MiniDV tapes with WinDV and set it to split into clips instead of one big file so I can see the time and date each clip was taken. Now I want to join them in VirtualDub without re-encoding, using direct stream copy and append clip. Problem is, I can only figure out how to do one at a time. There are about a hundred clips per tape, and I have tried highlighting all of them and dragging them into VirtualDub while holding Control, but it puts them out of order. How can I combine all of them at once and keep them in the right order by file name? Or do I need some software besides VirtualDub? I do not want to just throw them into an editor and end up re-encoding them. Thanks.
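
One possible route outside VirtualDub, sketched here under assumptions: ffmpeg's concat demuxer reads a text file of `file '...'` lines and can join them with stream copy (`-c copy`), so no re-encoding happens. The Python below only builds that list in natural file-name order (so `clip2` comes before `clip10`). The function names are made up for this example, and whether stream copy behaves well with these particular WinDV AVI files is untested.

```python
import re


def natural_key(name: str) -> list:
    """Sort key that compares digit runs as numbers, so 'clip2' < 'clip10'."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group puts the digit runs at odd indexes.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def build_concat_list(clip_names: list[str]) -> str:
    """Return the text of an ffmpeg concat-demuxer list, ordered by file name."""
    lines = []
    for name in sorted(clip_names, key=natural_key):
        escaped = name.replace("'", "'\\''")  # escape single quotes for ffmpeg
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical clip names, for illustration only.
    print(build_concat_list(["clip10.avi", "clip2.avi", "clip1.avi"]), end="")
```

Saving the output as `list.txt` next to the clips and running `ffmpeg -f concat -safe 0 -i list.txt -c copy joined.avi` would then append every clip in that order.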


Originally posted by u/Unusual_Poem_9864 on Reddit.com/r/datahoarder



11
 
 

I'm creating my first Plex server and have not purchased any drive larger than 2 TB before. Right now, Western Digital is having a deal where two 12 TB drives are going for $200 each (i.e., ~$16.7/terabyte).

Is $15-17 per terabyte good enough to buy four and take advantage of the limited-time offer, or is that "Just buy a couple" territory?

How much do you usually spend new per terabyte? Used?
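
For comparing deals like this one, the arithmetic is just price divided by capacity. A tiny sketch (the helper name `price_per_tb` is made up for this example; the $200 and 12 TB figures are the ones quoted above):

```python
def price_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Cost of one terabyte for a drive at the given price."""
    return price_usd / capacity_tb


if __name__ == "__main__":
    per_tb = price_per_tb(200, 12)
    print(f"${per_tb:.2f}/TB")  # prints "$16.67/TB"
    # Four drives at the deal price: total cost and total raw capacity.
    print(f"${4 * 200} for {4 * 12} TB")  # prints "$800 for 48 TB"
```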


Originally posted by u/Metallica93 on Reddit.com/r/datahoarder



12
 
 

I've been thinking about trying various software RAID options (TrueNAS, Unraid, FreeNAS, etc.) and I'm not sure which one to try first. Are there other major software options that I'm not listing? Which do you recommend I try first, and which would you ultimately implement as the central backup for about 5-6 PCs/laptops and three Synology 8-bay NAS boxes?

I've been building my own PCs since I was a kid and I pretty much have most of the pcs I've ever built, some 8 cores and a spare 16 core pc. Only about a year ago did I finally dive into the world of NAS and RAID and ended up getting three eight bay Synology NAS boxes. They are doing alright for what I'm using them for. I thought at first I'd not be good at learning about these things but I dedicated about three months of reading and youtubing and feel I have a good understanding of the synology ecosystem and some general raid knowledge.

Now I'm ready to take the next leap. Instead of buying a different brand NAS I would like to build my own and try some of these free software options using old hardware.

I am a tinkerer, but I've never really had to get into much of anything dealing with NAS, servers, and commercial IT stuff. Once I'm done tinkering and learning the software, I'd like to pick one and build a cheap, huge cold storage box for more tinkering and to back up the other computers and three Synology boxes to.

What do you all think? Any tips? Any suggestions?

TLDR: another newb decided to post a question instead of researching this topic ad nauseam and wants to know if he should play around with TrueNAS, Unraid, FreeNAS, or other software using older hardware: 8-16 cores, 16 to 64 GB of RAM.


Originally posted by u/itsthexypat on Reddit.com/r/datahoarder



13
 
 

I have an Orico 9958C3 with hard drives (WD Red and IronWolf drives) formatted and showing in Windows Disk Management (NTFS). However, they do not show up in Orico's proprietary RAID manager software. I have reformatted the drives, changed slots, restarted, etc. Any advice on how to set up RAID 5?


Originally posted by u/Zavad6404 on Reddit.com/r/datahoarder



14
 
 

Does anyone have a good grasp or understanding, from experience, of whether hiding USB drives (or things in general) in plain sight is more effective than concealing them from sight?

I have important data I'd like to keep backed up, but mobile and offline. I don't care if the data gets destroyed over time or corrupted, but I want to keep it safe from prying eyes. (I have backups; I just need this data offline and portable for my own convenience.)

I'm also somewhat new to using BitLocker encryption. It's easy to use, but I do find myself wondering how hackable it is, if at all (for the common attacker targeting a common person like myself). Is it even worth buying a dedicated, cheap disguised USB drive (pen style, to throw in the massive pen collection in my office)? Or can I just write the data to one or two of my old USB drives? I guess my concern is that if an attacker came through my home, they'd check for things that might be valuable, like my safe and obvious data storage/certain paperwork. But again, would that even matter if 99.9% of attackers can't fathom breaking BitLocker encryption?

Thanks for any input


Originally posted by u/0SwifTBuddY0 on Reddit.com/r/datahoarder



15
 
 

Is there a way to tell RipMe to download only images from a URL that contains both images and videos? And can I set a minimum resolution for downloaded images? I am new to all this. There doesn't seem to be a setting. Can this be done via a config file?


Originally posted by u/Famous_Assistant5390 on Reddit.com/r/datahoarder



16
 
 

https://preview.redd.it/zp9vlha0vmoe1.png?width=1200&format=png&auto=webp&s=25233afd4d8804e65b7d6dff7bab03f33fe6ef53

I want to start a personal project where I scan, OCR and index markdown for old books. This is a book with ALL of Romania's roads back in 1974. It has tables and maps and all sorts of other interesting historical data points.

I already have some idea of data engineering. I'm a software engineer and I've made a project that helps with RAG, search and indexing of markdown files (even very big ones). My problem is the OCR part. Any tips?


Originally posted by u/alexlazar98 on Reddit.com/r/datahoarder



17
 
 

Hi all,

There are a wide number of sites which offer paid access to film references, including:

  • Shotdeck
  • Film Grab
  • Eyecandy
  • Filmboard
  • Shot Cafe
  • Frame Set
  • Screenmusings

They are paid archives, rather than being true data hoarding / open access.

Is there a centralised resource for this form of data hoarding, does anyone know? A group project?


Originally posted by u/cartrouble111112 on Reddit.com/r/datahoarder



18
 
 

Hi All

First off,

Thank you for all the support while I've been building out https://pricepergig.com/ (it will be the best place to find digital storage on the internet, and is right now for Amazon imo, but I would say that right :) )

If you were to sign up for price alerts (e.g. the cheapest HDD, or the cheapest NVMe price per TB for example) or in the future alerts for your saved searches HOW would you like to be alerted?

If you could also let me know your country that would help me understand, perhaps it's different in different locations.

Backstory, you don't need to read this!

Many people asked for 'alerts', and I assumed email would be ok/good/great, perhaps I was wrong, not so many people have signed up, it could well be just the form looks scary, perhaps I need to point it out more, I can work on that, or email isn't the thing you guys wanted (I know I have plenty of emails I don't look at). So, let's find out.

Today PricePerGig 'only' does Amazon, but I will be adding other marketplaces once we've figured out the base feature set, so please do participate assuming your large marketplace is also in here.

Thanks

View Poll


Originally posted by u/PricePerGig on Reddit.com/r/datahoarder



19
 
 

Hello fellow Data Hoarders!

I've been eagerly awaiting Gitea's PR 20311 for over a year, but since it keeps getting pushed out for every release I figured I'd create something in the meantime.

This tool sets up and manages pull mirrors from GitHub repositories to Gitea repositories, including the entire codebase, issues, PRs, releases, and wikis.

It includes a nice web UI with scheduling functions, metadata mirroring, safety features to not overwrite or delete existing repos, and much more.

Take a look, and let me know what you think!

https://github.com/jonasrosland/gitmirror


Originally posted by u/jonasrosland on Reddit.com/r/datahoarder



20
 
 

Looking for a new solution to backup my raw photos that are currently about 5 TB and have a few questions:

  1. Should I use 2 separate external HDDs and sync them from time to time or is 1 enclosure with 2 mirrored HDDs better? I am leaning towards 2 separate ones as it appears to be more redundant.
  2. If I get 2 separate HDDs should I buy 2 different brands or is it safe enough to buy 2 of the same model?
  3. Anyone here who could share their experience with the G-Drive Project 12 TB?
  4. Any other suggestions?
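
On the "sync them from time to time" option in question 1: one way to check that two separate drives really hold the same files is to compare checksums. The sketch below is only an illustration under assumptions. The function names and the report layout are made up for this example, and SHA-256 is simply one reasonable hash choice.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large raw photos fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def compare_copies(a: Path, b: Path) -> dict[str, list[str]]:
    """Report files missing from either copy and files whose contents differ."""
    files_a = {p.relative_to(a).as_posix() for p in a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(b).as_posix() for p in b.rglob("*") if p.is_file()}
    return {
        "only_in_a": sorted(files_a - files_b),
        "only_in_b": sorted(files_b - files_a),
        "mismatched": sorted(
            rel for rel in files_a & files_b
            if file_digest(a / rel) != file_digest(b / rel)
        ),
    }
```

An empty report after a sync means both drives agree file for file. A non-empty `mismatched` list is worth investigating before trusting either copy.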

Thanks in advance.


Originally posted by u/Rick-Valassi on Reddit.com/r/datahoarder



21
 
 

There was someone trying to dedupe 1 million videos which got me interested in the project again. I made a bunch of improvements to the video part as a result, though there is still a lot left to do. The video search is much faster, has a tunable speed/accuracy parameter (-i.vradix) and now also supports much longer videos which was limited to 65k frames previously.

To help index all those videos (not giving up on decoding every single frame yet ;-), hardware decoding is improved and exposes most of the capabilities in ffmpeg (nvdec,vulkan,quicksync,vaapi,d3d11va...) so it should be possible to find something that works for most gpus and not just Nvidia. I've only been able to test on nvidia and quicksync however so ymmv.

New binary release and info here

If you want the best performance I recommend using a Linux system and compiling from source. The codegen for binary release does not include AVX instructions which may be helpful.


Originally posted by u/JohnDorian111 on Reddit.com/r/datahoarder



22
 
 

Forgive me for my ignorance on this, as I'm still pretty inexperienced, but is there a group or a project that makes data available from various sources, the way Kiwix does for downloading Wikipedia? I figure the last 2 months have been a real wake-up call, and I have since downloaded the .zim for Wiki, but I wonder if there is something similar that crawls .gov sites or .uni/.edu sites for archiving purposes and packages them for easy distribution/downloading?

Keep in mind, I have no idea how much effort goes into projects like that, and I can definitely appreciate it now that we have seen what happens when we take something for granted.

Just a thought that crossed my mind this morning and I wanted to post it before I forgot.


Originally posted by u/canigetahint on Reddit.com/r/DataHoarder



23
 
 

I'm trying to pull some videos and haven't found any add-on or app that can do it from Podia.com (an online course platform).

Thanks in advance for any thoughts.


Originally posted by u/magicmikela on Reddit.com/r/DataHoarder

