12 petabytes vs the Black Hole

This fascinating series is a combination of historical seafaring, swashbuckling adventure, and high technological science-fiction. Join us in a discussion!
Post by wingfield   » Fri Apr 12, 2019 3:41 am

wingfield
Lieutenant Commander

Posts: 110
Joined: Sat Apr 25, 2015 11:15 pm
Location: Melbourne, Australia

I never truly grasped the truly momentous size of the file in the Key until I saw the stack of discs representing the FIVE petabytes of data used to create the visual image of the Black Hole that we saw this week. It gives one pause ...
Re: 12 petabytes vs the Black Hole
Post by DMcCunney   » Sun Apr 14, 2019 2:51 pm

DMcCunney
Captain of the List

Posts: 421
Joined: Mon Jul 02, 2012 1:49 am

wingfield wrote:I never truly grasped the truly momentous size of the file in the Key until I saw the stack of discs representing the FIVE petabytes of data used to create the visual image of the Black Hole that we saw this week. It gives one pause ...
The more interesting bit is the storage technology. The Key is a tiny object, all told, yet that 12 petabyte file is just one of the things it contains. (IIRC, we aren't told its total capacity. I assume there is space left over in the Key.)

I think we can assume further advances in compression technology, as well as further advances in miniaturizing semiconductor electronics. The tablet I'm configuring at the moment has a 64GB microSD card as plug-in external storage. The largest microSD card currently available holds 512GB.

Stuffing more than 12 petabytes of data into something with the Key's form factor is likely several generations beyond current tech.
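
As a back-of-envelope check on that gap, here is a minimal sketch of how many of today's largest microSD cards it would take to hold the 12 petabyte file. Decimal units (1 PB = 1,000,000 GB) are an assumption, and it ignores compression and file system overhead entirely.

```python
# How many 512GB microSD cards does a 12 PB file need?
# Assumption: decimal units, 1 PB = 1,000,000 GB.
FILE_PB = 12
CARD_GB = 512
GB_PER_PB = 1_000_000

# Ceiling division: a partly filled card still counts as a card.
cards_needed = -(-FILE_PB * GB_PER_PB // CARD_GB)

print(f"{cards_needed:,} cards of {CARD_GB}GB for {FILE_PB} PB")
```

Whatever the exact figure, it is a crate of cards rather than one small object, which is the point about several generations of tech.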
______
Dennis
Re: 12 petabytes vs the Black Hole
Post by isaac_newton   » Sun Apr 14, 2019 3:24 pm

isaac_newton
Commodore

Posts: 931
Joined: Fri Oct 18, 2013 5:37 am
Location: Brighton, UK

wingfield wrote:I never truly grasped the truly momentous size of the file in the Key until I saw the stack of discs representing the FIVE petabytes of data used to create the visual image of the Black Hole that we saw this week. It gives one pause ...


It does indeed. One thing that surprised me was the statement that they had no backups... Now obviously backing up that much data would be extremely non-trivial, but having none must have given them a LOT of sleepless nights. I presume that the loss of a single disk would not have blown the entire project, so I wonder what failure rate would have been acceptable.
Re: 12 petabytes vs the Black Hole
Post by DMcCunney   » Tue Apr 16, 2019 6:14 pm

DMcCunney
Captain of the List

Posts: 421
Joined: Mon Jul 02, 2012 1:49 am

isaac_newton wrote:
wingfield wrote:I never truly grasped the truly momentous size of the file in the Key until I saw the stack of discs representing the FIVE petabytes of data used to create the visual image of the Black Hole that we saw this week. It gives one pause ...
It does indeed. One thing that surprised me was the statement that they had no backups... Now obviously backing up that much data would be extremely non-trivial, but having none must have given them a LOT of sleepless nights. I presume that the loss of a single disk would not have blown the entire project, so I wonder what failure rate would have been acceptable.
How do you back up that much data?

I used to be backup admin at a former employer. We backed up to a tape jukebox, and backups were sent offsite to a storage facility each day. I couldn't do a full backup - there wasn't enough time in the overnight backup window. I had to do incrementals. What should have been done was back up to disk on another server, and send backups of that to the offsite facility. My boss agreed, but the funding wasn't there to get another server for it.

Failure rate is an imponderable. I assume that there were many data sets of observational data that became part of that image. Failure of storage for some of them would have simply meant less precision in the final output. It would have taken a really serious problem to stop production of the image, like the destruction of the entire data center where the data sets were stored.

Google keeps redundant copies of its data in several different geographical locations. There was a presentation by a Google VP of Engineering years back where he showed a slide of an anonymous cinder-block building surrounded by fire equipment. He said "I can't tell you what happened, but it was bad, and it was one of our facilities." They lost an entire data center. The most any Google user might have noticed, if they noticed anything at all, was that queries took a second or so longer than usual. No actual data was lost.

But the rest of the world isn't Google, and doesn't have the resources for redundant storage of petabytes of data, nor good means of backing it up if they have it.
______
Dennis
Re: 12 petabytes vs the Black Hole
Post by phillies   » Wed Apr 17, 2019 11:42 am

phillies
Vice Admiral

Posts: 1942
Joined: Sat Jun 19, 2010 8:43 am
Location: Worcester, MA

DMcCunney wrote:
wingfield wrote:I never truly grasped the truly momentous size of the file in the Key until I saw the stack of discs representing the FIVE petabytes of data used to create the visual image of the Black Hole that we saw this week. It gives one pause ...
The more interesting bit is the storage technology. The Key is a tiny object, all told, yet that 12 petabyte file is just one of the things it contains. (IIRC, we aren't told its total capacity. I assume there is space left over in the Key.)

I think we can assume further advances in compression technology, as well as further advances in miniaturizing semiconductor electronics. The tablet I'm configuring at the moment has a 64GB microSD card as plug-in external storage. The largest microSD card currently available holds 512GB.

Stuffing more than 12 petabytes of data into something with the Key's form factor is likely several generations beyond current tech.
______
Dennis


Once upon a time, something like three decades ago, there was an episode in Byte magazine. As I vaguely recall it, someone was advertising for inquiries about their one-terabyte hard drives. At the time, this was viewed as physically or at least technically impossible. There was speculation that this might be a feeler by the CIA or the like, hoping that someone would come up with a way to build one of these 1TB devices. I now have several 1TB and 3TB disks under my desk, and I am not a computer gamer jock.
Re: 12 petabytes vs the Black Hole
Post by FriarBob   » Wed Apr 17, 2019 12:55 pm

FriarBob
Rear Admiral

Posts: 1046
Joined: Thu Jan 27, 2011 7:29 pm

phillies wrote:Once upon a time, something like three decades ago, there was an episode in Byte magazine. As I vaguely recall it, someone was advertising for inquiries about their one-terabyte hard drives. At the time, this was viewed as physically or at least technically impossible. There was speculation that this might be a feeler by the CIA or the like, hoping that someone would come up with a way to build one of these 1TB devices. I now have several 1TB and 3TB disks under my desk, and I am not a computer gamer jock.


And I am not a math nerd, but sometimes it's fun to do some anyway...

Though a 1TB SSD is obviously still a bit expensive, a 1TB old-style spinning drive is pretty much ubiquitous today. So are 2s and 4s and perhaps even 8s, but let's baseline it at 1TB.

Moore's law supposedly broke down in 2012, but let's pretend it didn't (or that it resumes "soon"), and let's stick to the simple 2-year number and apply it to HD space rather than transistors on the chip (yes I know it's a stretch, bear with me). So every 2 years from now we'll double the 'normal' size of the hard drive.

2021 - 2TB
2023 - 4TB
2025 - 8TB
[etc]

Twenty years at one doubling every 2 years is ten doublings, and 2^10 is 1024 (the same factor that makes 1KB, or 1KiB for purists using the latest standards). So after 20 years (2039) we're going to have 1024 times as much HD space, which means 1 petabyte drives are ubiquitous. Of course, a single file being 12 PB means that we're going to need just a bit more space... so let's carry it out a bit further. After 40 years (2059) we'll have 1,048,576 times as much (1024x1024). At that point the exabyte drive is as common as dirt and a 12PB file is "quite large" but no longer an impossible deal. (And go another 20 years and the zettabyte drive means it is no big deal at all.)
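
For anyone who wants to check the doubling arithmetic, here is a minimal sketch. It assumes exactly what the post assumes: a 1TB "normal" drive in 2019 and a clean doubling every 2 years. The function name and defaults are only illustrative.

```python
def projected_capacity_tb(year, base_year=2019, base_tb=1, doubling_years=2):
    """Capacity of the 'normal' drive in a given year, in TB.

    Assumption: capacity doubles once per completed doubling period.
    """
    doublings = (year - base_year) // doubling_years
    return base_tb * 2 ** doublings

for year in (2021, 2023, 2025, 2039, 2059, 2079):
    print(year, f"{projected_capacity_tb(year):,} TB")
```

2039 comes out at 1,024 TB (a petabyte), 2059 at 1,048,576 TB (an exabyte), and 2079 at 2^30 TB (a zettabyte), matching the figures above.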

In Weber's world here the first extrasolar colony was built in 2091 and the Federation didn't get annihilated until 2430. More (or Moore) than enough time to make the Key a viable bit of technology.

Even if we said that HD space took a full decade to double (which is way more than recent history suggests) we'd still have enough time. At that rate we'd need 200 years (20 doublings) to make the exabyte drive ubiquitous, but we've got over 400 to work with. Even a 20-year doubling period still fits at 400 years, and 25 (500 years) is theoretically feasible if you assume massive tech improvement during the desperation of the war against the Gbaba. Even if you postulate the zettabyte drive to be required (30 doublings), then 300/600 years is the range for the 10- and 20-year cycles, and we're only in trouble on the 20-year cycle if the massive tech boost from the Gbaba war isn't quite as large as we thought.
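
The slower-doubling scenarios can be tabulated the same way. This is only a sketch under the post's assumptions (1TB baseline in 2019, the 2430 end date quoted above); the helper name is made up for illustration.

```python
import math

def years_to_reach(target_tb, doubling_years, base_tb=1):
    """Years until the 'normal' drive reaches target_tb, given a doubling period."""
    doublings = math.ceil(math.log2(target_tb / base_tb))
    return doublings * doubling_years

EXABYTE_TB = 2 ** 20     # 1 EB expressed in TB (binary multiples)
ZETTABYTE_TB = 2 ** 30   # 1 ZB expressed in TB
AVAILABLE_YEARS = 2430 - 2019  # time until the Federation falls, per the post

for period in (2, 10, 20, 25):
    eb = years_to_reach(EXABYTE_TB, period)
    zb = years_to_reach(ZETTABYTE_TB, period)
    print(f"double every {period:>2} yrs: exabyte in {eb} yrs, "
          f"zettabyte in {zb} yrs (available: {AVAILABLE_YEARS})")
```

Anything over the available years needs the wartime tech boost to close the gap.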

12 petabytes sounds huge to us today (because it is!), but just give it a few more years...
Re: 12 petabytes vs the Black Hole
Post by FriarBob   » Wed Apr 17, 2019 4:17 pm

FriarBob
Rear Admiral

Posts: 1046
Joined: Thu Jan 27, 2011 7:29 pm

DMcCunney wrote:But the rest of the world isn't Google, and doesn't have the resources for redundant storage of petabytes of data, nor good means of backing it up if they have it.


Well, most people don't have petabytes of data to back up. At most they have terabytes, and some still have mere gigs or so. Those folks may not have redundant storage "on-site" or anything, but most of them can GET access to it if they want. Especially if they are willing to trust their backups to the cloud. I'm not willing to use any sort of system that doesn't encrypt my data (at least enough to keep the amateurs out) and I don't trust Google as far as I can throw them, but they provide options here and so do quite a few other companies. I definitely wouldn't trust one that's free (if there are any); they are probably using your data for marketing purposes of some sort. But I pay only ~$50 a year for my online backup solution and there are multiple affordable systems out there.

Part of the secret is probably RAID-5. The striping of the data and the parity blocks allow the contents of a failed disk to be reconstructed "on the fly". Hardware RAID cards even do it quite quickly, nearly invisibly. Software RAIDs are much slower but can still get the job done.
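
The reconstruction trick is XOR parity: the parity block is the XOR of the data blocks in a stripe, so any one missing block is the XOR of everything that survives. Below is a toy sketch of that idea only. Real RAID-5 also rotates parity across the disks and works on fixed-size stripes, none of which is modeled here, and the block contents are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread over three hypothetical "data disks".
data_blocks = [b"SAFE", b"HOLD", b"KEY!"]
parity = xor_blocks(data_blocks)

# Disk 1 dies. Rebuild its block from the survivors plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)
```

Lose two blocks from the same stripe and there is nothing left to XOR against, which is why single parity only covers a single failure.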

The rest is just space. And there is space available right now even before the petabyte drive. It's just a matter of how much you need and how much you can afford to pay for.
Re: 12 petabytes vs the Black Hole
Post by Joat42   » Thu Apr 18, 2019 2:43 am

Joat42
Vice Admiral

Posts: 1517
Joined: Tue Apr 16, 2013 6:01 am
Location: Sweden

FriarBob wrote:..snip..
Part of the secret is probably RAID-5. The striping of the data and the parity blocks allow the contents of a failed disk to be reconstructed "on the fly". Hardware RAID cards even do it quite quickly, nearly invisibly. Software RAIDs are much slower but can still get the job done.

The rest is just space. And there is space available right now even before the petabyte drive. It's just a matter of how much you need and how much you can afford to pay for.

RAID-5 tends to work poorly with very large disks, since the rebuild can take longer than the expected time until another read error occurs, which can then invalidate the whole array, depending on the setup.
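
A rough way to see this is to estimate the chance of hitting at least one unrecoverable read error (URE) while reading every surviving disk during a rebuild. The sketch below assumes a URE rate of 1 per 10^14 bits read, a figure often quoted for consumer drives but an assumption here, and treats errors as independent. It models data read rather than elapsed time, so it is a simplification of the point above.

```python
import math

def rebuild_failure_probability(surviving_disks, disk_tb, ure_per_bit=1e-14):
    """Chance of at least one URE while reading all surviving disks in full.

    Assumptions: decimal TB, independent bit errors, assumed URE rate.
    """
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

for size_tb in (1, 4, 8):
    p = rebuild_failure_probability(surviving_disks=3, disk_tb=size_tb)
    print(f"4-disk array of {size_tb}TB drives: {p:.0%} chance of a URE during rebuild")
```

The bigger the disks, the more bits a rebuild has to read, and the odds climb quickly.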

---
Jack of all trades and destructive tinkerer.


Anyone who has simple solutions for complex problems is a fool.
Re: 12 petabytes vs the Black Hole
Post by DMcCunney   » Sun Apr 21, 2019 5:48 pm

DMcCunney
Captain of the List

Posts: 421
Joined: Mon Jul 02, 2012 1:49 am

phillies wrote:Once upon a time, something like three decades ago, there was an episode in Byte magazine. As I vaguely recall it, someone was advertising for inquiries about their one-terabyte hard drives. At the time, this was viewed as physically or at least technically impossible. There was speculation that this might be a feeler by the CIA or the like, hoping that someone would come up with a way to build one of these 1TB devices. I now have several 1TB and 3TB disks under my desk, and I am not a computer gamer jock.
I fondly remember the old days of Byte Magazine.

I worked for a major bank in the 80s when the original IBM PC was first showing up on corporate desks as an engine to run Lotus 1-2-3. One of the officers in my area got a PC with a <gasp!> 5 megabyte hard drive. IIRC, it cost about $5K, and half of that was the cost of the hard drive.

When I finally got a PC clone at home to complement my Unix machine, I installed two Seagate ST-225 20 megabyte hard drives. Wow! Unlimited storage! :P

But semiconductor electronics get steadily smaller, faster, and cheaper. A chap elsewhere talked about converting a database server he administered from 16TB of SATA hard drives to 16TB of storage built from 2TB Samsung SSDs. He got an order of magnitude performance increase. It screamed through queries and updates. What struck me as significant was that the tech had gotten cheap enough that it was an affordable upgrade. Lots of things folks might like to do are technically possible, but just too expensive. Those barriers are falling, and we are just seeing the tip of that iceberg.
______
Dennis
Last edited by DMcCunney on Thu Apr 25, 2019 1:05 pm, edited 1 time in total.
Re: 12 petabytes vs the Black Hole
Post by DMcCunney   » Sun Apr 21, 2019 6:11 pm

DMcCunney
Captain of the List

Posts: 421
Joined: Mon Jul 02, 2012 1:49 am

FriarBob wrote:
DMcCunney wrote:But the rest of the world isn't Google, and doesn't have the resources for redundant storage of petabytes of data, nor good means of backing it up if they have it.
Well, most people don't have petabytes of data to back up. At most they have terabytes, and some still have mere gigs or so. Those folks may not have redundant storage "on-site" or anything, but most of them can GET access to it if they want. Especially if they are willing to trust their backups to the cloud. I'm not willing to use any sort of system that doesn't encrypt my data (at least enough to keep the amateurs out) and I don't trust Google as far as I can throw them, but they provide options here and so do quite a few other companies. I definitely wouldn't trust one that's free (if there are any); they are probably using your data for marketing purposes of some sort. But I pay only ~$50 a year for my online backup solution and there are multiple affordable systems out there.
There are plenty of cloud-based backup solutions available, but then your scarce resource is network bandwidth. How fast is your pipe to the cloud? How much data can you upload or download in X amount of time?

Depending on the amount of data you wish to back up, you may find yourself in a situation similar to the one I mentioned earlier, where I couldn't do a full backup of the servers I was backup admin for, because it couldn't be written to the tape jukebox we used fast enough to fit within the backup window. It's the same problem in a different guise - can't write the data to the backup medium fast enough to handle it all.
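
To put rough numbers on the backup-window problem, here is a quick sketch. The 10TB data set, the 100 MB/s sustained rate, and the 8-hour window are made-up illustrative figures, not ones from the setup described above, and decimal units are assumed.

```python
def transfer_hours(data_tb, throughput_mb_per_s):
    """Hours needed to move data_tb at a sustained throughput (decimal units)."""
    seconds = data_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600

WINDOW_HOURS = 8  # hypothetical overnight backup window

full_backup = transfer_hours(data_tb=10, throughput_mb_per_s=100)
print(f"10TB full backup: {full_backup:.1f} h (window: {WINDOW_HOURS} h)")

# The same pipe against the Key's 12 PB file (12,000 TB).
key_file_years = transfer_hours(12_000, 100) / (24 * 365)
print(f"12 PB at the same rate: about {key_file_years:.1f} years")
```

Whether the bottleneck is a tape drive or an uplink to the cloud, the arithmetic is the same: data size divided by sustained write speed has to fit inside the window.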

For really large data sets, I'd back up to other local drives, and then back them up to something else for archival storage.

I keep some stuff in the cloud on Google Drive, but that's stuff that doesn't have privacy concerns and that I want to access from anywhere, or allow others to access. I don't actually have a lot of data stored locally, and I back up selected stuff to USB thumb drives. (I have a bit over 2TB of local storage in the current desktop. Less than half of it is used.)
FriarBob wrote:Part of the secret is probably RAID-5. The striping of the data and the parity blocks allow the contents of a failed disk to be reconstructed "on the fly". Hardware RAID cards even do it quite quickly, nearly invisibly. Software RAIDs are much slower but can still get the job done.
I have seen failures that zapped both sides of a RAID 5 array. The results were not pretty.

(And there are RAID levels beyond 5, like RAID 6 (striping and double parity) and RAID 10 (striping and mirroring).)

I'm happy I no longer have to deal with that.
FriarBob wrote:The rest is just space. And there is space available right now even before the petabyte drive. It's just a matter of how much you need and how much you can afford to pay for.
I think "how much you can afford to pay for" is the critical part.
______
Dennis