this post was submitted on 19 Oct 2023
204 points (95.9% liked)
From the article:
I was asking myself similar questions to these, alongside even more basic details like, "What if the future computer systems simply aren't compatible with the old filesystems, thus indicating nothing as being present on the storage media (if it's even recognized as storage media to test)?" It's the deeply fascinating problem all long-term information storage/transmission faces regarding future comprehensibility.
It's possible to reverse engineer data you have forgotten how to read.
It's impossible to read data that you know how to read, but it has become annihilated by time.
The former is far more valuable.
We've reconstructed archaic languages that no living person speaks from fragments of written records, so I find it unlikely that we'll be completely unable to reverse engineer an ancient file system architecture - especially since whoever actually reads one of these thousands of years in the future will most likely come from a more technologically advanced civilization.
Think of what modern archeologists would give to have the equivalent of a Wikipedia archive from 10,000 years ago - imagine the colossal amounts of grant funding that would be thrown at the problem if we even suspected such a thing was within reach.
Of course all the other issues about keeping the actual system safe for 10k years are totally valid, but you have to start somewhere, and getting a data storage system that can last that long even in perfect conditions is the necessary first step.
I saw another reply make a similar point, and I see where you're both coming from, but it has encouraged me to ask the question that reply inspired: what if you lack the fragments needed to reverse engineer or reconstruct a means of accessing the information?
Chances are slim, and to be clear here, I'm by no means knocking this development, as I find it really exciting, but I also enjoy thinking through some of the different potential points of failure. Not from a cynical/pessimistic perspective, but because it's a compelling challenge and puzzle. How much else alongside this specific media may need to survive so that it may remain accessible, directly or indirectly, y'know?
That's as cool and fun to consider as the new storage media itself to me! Come to think of it, maybe I really should look into some kind of archival/museum jobs considering that...
In this case the "fragment" isn't even a fragment, it would be a completely intact, start-to-finish, monstrous amount of data.
The larger and more complete the "fragment" is, the more trivial it becomes to decode it.
And since this data is being purposefully stored in a manner intended for future use, it's very likely it will be encoded in a manner to facilitate and make it as easy as possible to decode in an intuitive manner.
I'd strongly suspect every individual "glass" would have some form of "clue" or "how to" at the start of it, serving as a guide to help the consumer know they are decoding it right.
Off the top of my head, one example would be encoding a bunch of digits of the Fibonacci sequence at the start as character literals (so text form), which even in binary form, when inspected physically with a microscope, any scientist would look at and go "oh hey, that's Fibonacci!"
Then after that a large blank, followed by perhaps the entire ASCII character set in order, from 0 to 127. Or perhaps Unicode.
The whole thing is only like a megabyte or two, so it would be less than 0.1% of the storage capacity, but having those 2 items at the start of every disk would be an easy way for the consumer to sanity check they are "reading" the data right, and clue them into "yo there's data stored on here" very fast.
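To make the idea concrete, here's a rough Python sketch of such a preamble: Fibonacci numbers as text, a blank gap, then a character table in order. The names `build_preamble` and `check_preamble`, the 64-byte gap, and the choice of 7-bit ASCII for the table are all my own assumptions for illustration, not anything the actual glass media is known to do.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq = []
    a, b = 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq


def build_preamble(fib_count=20, gap_bytes=64):
    """Build a hypothetical self-describing header for a storage unit."""
    # Fibonacci numbers written out as text, separated by spaces.
    fib_text = " ".join(str(x) for x in fibonacci(fib_count)).encode("ascii")
    # A run of zero bytes acts as the "large blank" between the two clues.
    gap = bytes(gap_bytes)
    # Every 7-bit ASCII code in ascending order (assumed character table).
    charset = bytes(range(128))
    return fib_text + gap + charset


def check_preamble(data, fib_count=20, gap_bytes=64):
    """Sanity check: does the decoded data start with the expected header?"""
    return data.startswith(build_preamble(fib_count, gap_bytes))


preamble = build_preamble()
print(preamble[:30])
print(check_preamble(preamble + b"...actual payload..."))
```

A reader who has guessed the bit order or byte width wrong would see neither the digit pattern nor the ascending table, so the header doubles as a quick "am I decoding this right" test.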
Agreed, except in my crunchy post-pedal glitter punk opera they would say, "oh hey that's the numbers my screensaver uses!"
Although seriously, what would dictate the "start" of the disk - the top, left, foremost block? I think we can assume they would try to read the data contiguously, but that's about it. I guess you could have some kind of visual indicator, like it's in a different colour...
Interesting problem!
Well that's a different question, because now it sounds like you're assuming that significant data loss will occur before it's read. If the storage unit itself is damaged in the meantime to where its data is corrupted beyond recovery, then yes - that's a potential total loss scenario. Assuming however that the storage unit remains intact, I don't see how a dedicated team of smart individuals couldn't handle it, unless their technology is somehow inferior to ours.
It's also worth considering that this storage unit probably won't be their very first interaction with modern data storage systems. This may or may not be their first interaction with a data storage system that was actually written in modern times, but unless we have a total technological collapse in the intervening 10,000 years, chances are they'll have records from our time that have been copied over however many thousands of times to make it there. After all, to use a much less extreme example, I don't need to get my hands on a CD-ROM or floppy disk written in 1991 to get a copy of Linux 0.01; it's been copied over and over through the years and is now available for download online. Data will surely degrade over time, and large chunks will get lost as people stop copying things they think are no longer important, but I feel pretty confident in the idea that enough pieces will make it that far that these scientists (techno-archeologists?) won't be starting from scratch.
Right, that's what I was trying to refer to in my reply: not damage to this new storage media itself, but to the surrounding data/storage media that would help in reverse engineering it. Sorry I wasn't clearer about that! I was thinking like if you didn't have, say, a Rosetta Stone kind of artifact (or artifacts) to help in translating/reconstructing/reverse engineering.
That's why I wrote that I think it's really unlikely, like yourself, but it's interesting to consider.
I would think that you could leave a Rosetta Stone with directions on how the data is stored and read. It wouldn't take much, I think. "These glass things contain information, here's how it is encoded. Here are the requirements for reading these". You could start off simple and have a rudimentary one that can be deciphered by hand and that describes how to make a device that can quickly pull information from a few others, which in turn give directions on how to build another device to read the high capacity ones. You don't need a specific filesystem or computer to read it, you just need to know how to decipher it and that it IS data stored in a certain way, not just cool-looking glass art.
Might as well ask what's indicative of stone tablets from millennia ago being data to us now? These things aren't discovered and studied in a vacuum. They operate within context - where the items were found, their similarity to other better understood things, known history of data storage, etc etc.
Given enough time and disruption, sure, all context could be lost, but if that's the case, I'd assume figuring out what the weird glass cube thing is would be the least of their problems.
I'd expect it's something akin to an average half-life or whatnot, such that you can make multiple backups and further improve that number.
Honestly I'm curious how something could last for over a few thousand years and not be, effectively speaking, eternal.
Like at a certain point, if it hasn't failed by 5,000 years, what on earth would cause it to fail after another 5,000 years? What process is slow enough that it can't "erode" the perfectly preserved object in 5,000 years, but can get the job done in 10,000?
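One way to read a lifetime figure like that is as a half-life under a memoryless failure model, where a unit that has survived 5,000 years is no safer or weaker than a new one. Here's a small Python sketch of that reading. The constant failure rate is purely an assumption on my part (real media could wear out faster or slower), and the function names and the 10,000-year half-life are made up for illustration.

```python
def survival_probability(years, half_life):
    """Chance a single unit is still readable after `years`,
    assuming a constant (memoryless) failure rate."""
    return 0.5 ** (years / half_life)


def any_copy_survives(years, half_life, copies):
    """Chance that at least one of several independent copies survives."""
    p = survival_probability(years, half_life)
    return 1 - (1 - p) ** copies


HALF_LIFE = 10_000  # hypothetical figure, not a measured value

for copies in (1, 3, 10):
    print(copies, any_copy_survives(10_000, HALF_LIFE, copies))
```

Under this model nothing special happens between year 5,000 and year 10,000: surviving the full span is just surviving 5,000 years twice in a row, and each extra independent backup shrinks the chance that every copy is lost.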
Since I am sure error correction code is used, it is one and the same.
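For anyone unfamiliar with the idea, here's about the simplest error correction code there is, a triple repetition code with majority voting, sketched in Python. It's only meant to show how redundancy inside the data plays the same role as keeping backups. Whatever code the real media uses is not stated in this thread and is almost certainly far more efficient; `encode` and `decode` are names I picked.

```python
def encode(bits, repeat=3):
    """Write every bit `repeat` times in a row."""
    return [b for b in bits for _ in range(repeat)]


def decode(coded, repeat=3):
    """Recover each bit by majority vote over its group of copies."""
    out = []
    for i in range(0, len(coded), repeat):
        group = coded[i:i + repeat]
        out.append(1 if sum(group) * 2 > len(group) else 0)
    return out


message = [1, 0, 1, 1]
stored = encode(message)
stored[4] ^= 1  # simulate one bit damaged by time
print(decode(stored) == message)
```

With three copies per bit, any single flipped copy in a group gets outvoted by the other two.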
More important than the filesystem formats: for media, I hope they're using codecs that are as simple and as close to raw as possible, e.g. PCM and BMP. Chances are pretty high that with something like PCM data, even if nobody had any idea what it was, at some point somebody would stumble upon turning it into audio. I can't imagine ever successfully decoding HEVC data without a specification.
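To show just how raw PCM is, here's a short Python sketch that writes a tone as 16-bit signed little-endian samples and reads it back. The sample rate, bit depth, byte order, and function names are assumptions chosen for the example; the point is only that "decoding" amounts to reading a list of integers.

```python
import math
import struct

SAMPLE_RATE = 8000  # samples per second (assumed for this example)


def tone_as_pcm(freq_hz=440.0, n_samples=80, amplitude=12000):
    """Encode a sine tone as raw 16-bit signed little-endian PCM bytes."""
    samples = [
        int(amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]
    return struct.pack("<%dh" % n_samples, *samples)


def pcm_to_samples(raw):
    """Decode raw PCM: every 2 bytes is one signed integer sample."""
    n = len(raw) // 2
    return list(struct.unpack("<%dh" % n, raw[:n * 2]))


raw = tone_as_pcm()
samples = pcm_to_samples(raw)
print(len(raw), len(samples))
```

Someone who guessed the wrong sample width or byte order would still get a noisy but recognisably periodic signal to iterate on, which is not something you could say about a compressed bitstream.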