Backblaze Hard Drive Stats Q3 2019 (backblaze.com)
197 points by garaetjjte on Nov 12, 2019 | 122 comments


These stats are wonderful and make me really appreciate the culture of the company. I've been considering becoming a customer because of these posts, since they reflect a lot of pride in the craft and care for the community.

But I'm stuck on one thing. Does Backblaze offer a solution for Linux backup? I've got an NFS server running that I use for home storage that I want to back up - but looks like Backblaze is only offering a Windows or Mac client.

Maybe the business version would work, since it claims to support NAS backup. But then the pricing seems lower than the personal edition ($60/computer/year = $5/month < $6/month) - unless that implies every computer that accesses the NAS counts toward the fee?

So I guess: is there a reasonable Linux offering for home users from Backblaze? If not, what service do folks suggest?


Yev here from Backblaze -> the Computer Backup service does not offer unlimited Linux backup. We do have support for Linux with Backblaze B2 Cloud Storage and our integration partners (https://www.backblaze.com/b2/integrations.html?platform=linu... - I filtered the list by Linux for you). On the business side the NAS backup is also done via our partnerships and B2. On the consumer end, we haven't found a way to make unlimited backup sustainable with NAS/Linux since those devices/platforms typically have WAY more data than the average user.


How do you suppose CrashPlan is making it sustainable?


The answer is likely "They're not".

Gotta decide between taking the hit on adding another "except for" to your unlimited marketing claim...versus taking the hit on people pushing their multi-terabyte torrent collection into your service.

It's a lesser of two evils, and the two providers picked different ones.


So then why doesn't Backblaze have a Linux client and just have, like, a 1TB limit or something? I mean, CrashPlan has both a Linux client and unlimited storage. The great thing about CrashPlan is that it can be run in a Docker container, and the data is versioned as well.


Disclaimer: I work at Backblaze on the "Personal Backup" client.

> So then why doesn't Backblaze have a Linux client

There are several choices for Linux including: Duplicacy, Duplicati, GoodSync, HashBackup. All of those will back up your Linux data to the Backblaze datacenter. Is there some feature those are missing? Are those not good solutions for Linux?

> again and just have like a 1TB limit

I believe you can set a 1 TByte limit in all of the above, and then the cost will be a very reliable $5/month or less.


> just have like a 1TB limit or something?

Marketing. You can quantify/understand a 1TB limit. Grandma can't.

That's what I meant by you want as few "except fors" on your unlimited claim as possible.

The second you move to a tiered-by-GB model, you've destroyed your company's core selling point: "don't worry".


I've been using restic with the Backblaze B2 backend for a home server backup, which seems to be as close as Backblaze will ever get to having a Linux client.

My rough numbers are 450GB stored monthly, 4GB downloaded monthly, 90k stored files, 385,000 individual transactions, which ends up costing about $2.25 in storage fees, $0.25 for transactions, and $0.10 for download bandwidth.
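For anyone pricing out a similar setup, a minimal sketch of the arithmetic (the per-GB rates are assumptions based on B2's published pricing at the time, and per-transaction charges are ignored since they depend on the API call mix):

```python
# Rough B2 monthly cost estimate; rates below are assumptions
# from B2's published pricing at the time, not authoritative.
STORAGE_RATE = 0.005   # $/GB-month of storage (assumed)
DOWNLOAD_RATE = 0.01   # $/GB downloaded (assumed)

def b2_monthly_cost(stored_gb, downloaded_gb):
    """Return (storage_fee, download_fee) in dollars, ignoring
    per-transaction charges, which vary by API call class."""
    return stored_gb * STORAGE_RATE, downloaded_gb * DOWNLOAD_RATE

storage_fee, download_fee = b2_monthly_cost(450, 4)
# 450 GB stored works out to $2.25/month of storage fees,
# matching the figure quoted above.
```

The transaction and download line items will vary with the call mix and any rate changes, so treat those as order-of-magnitude only.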


I'll add my two cents:

I've been very happily using Restic and B2 for a long time. It's cheaper than the unlimited service given that I have ~600GB stored. Plus, it's a "real" backup: an ongoing series of snapshots, where older files are not replaced when a new backup is made, I can restore from as many points in time as I want to store, and I don't have to worry about Backblaze deleting anything due to inactivity.

One underrated feature of Restic is tagging, which lets you identify a collection of snapshots. I can point multiple separate backups and devices at the same repository, tracked with tags; thus, I de-dupe across all of them.


One thing that has been keeping me from backing up more content to B2 has been finding an appropriate encryption strategy, and it looks like Restic manages encryption of the remote backup repository automatically?

If I'm reading this correctly, Restic + B2 sounds like an absolute godsend!


That's correct, restic operates under the assumption that you cannot trust where it's storing the data.

Some more (informal) analysis on restic's crypto was done here:

https://blog.filippo.io/restic-cryptography/


You can use Backblaze B2 with duplicacy. I back up my machines to my NAS and the NAS itself to B2 using duplicacy. The free version is command-line only (with text config files). It works really well and B2 prices are reasonable! They have a billing estimator somewhere on their site.

I made a script to back up nested ZFS volumes to B2 using duplicacy if anyone is interested: https://gist.github.com/icefo/07aab2789e5cfa71045343953aaf88... It takes a snapshot, backs it up, and gracefully handles unexpected network or power loss, as well as backups that run longer than the cron interval.


If you're a Linux home user and you want a second copy of your data somewhere else in the event of a system failure / flood / fire I'd consider scripting something to B2 (Backblaze's object storage), AWS Glacier, or other archive priced cloud services.

If you're looking for a backup service to handle point in time recovery, differential deduplication, or other features of a backup service (vs. a second copy of your data archive) those also exist though I don't have a clear recommendation for home users.

On Backblaze's business pricing, my understanding is they require a minimum of 5 users, so that may be where the pricing difference comes from.


Yev here -> there's no minimum for our Groups feature (which Business Backup runs on) - so you can have Groups with 1 or 2 people in them, no minimum of 5.


They probably don't want people using their "unlimited" $6/month plan for terabytes of data. They also have B2, priced by stored amount, but they don't provide backup software: https://www.backblaze.com/b2/integrations.html?use-case=back...


I also use B2 even though I am on Mac. I bought Arq to backup to B2 and I will break even within the first year. I used to pay for the personal edition before realising that my new setup is far cheaper.

I have used restic and rclone with very crappy platforms (Google Drive and Microsoft OneDrive) for testing purposes and that worked fine enough so I can imagine it works even better on something like B2.


I also run Linux, and I use rclone[1] with Backblaze B2 (cloud storage). It works really well. I have it set on a nightly cron job locally. I also trigger it manually if I have something immediate (like photos transferred via adb from android that I want to ensure get backed up).

[1]: https://rclone.org/b2/
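A minimal sketch of what such a nightly job might look like (the remote name, bucket, and paths here are hypothetical, and the `b2` remote has to be configured first with `rclone config`):

```python
# Build the nightly rclone invocation; all names are examples only.
SOURCE = "/srv/data"
DEST = "b2:my-bucket/nightly"   # hypothetical remote:bucket/path

def build_rclone_cmd(source, dest):
    # --fast-list trades memory for fewer listing calls, which
    # helps keep B2's per-transaction charges down on large trees.
    return ["rclone", "sync", source, dest, "--fast-list"]

cmd = build_rclone_cmd(SOURCE, DEST)
# A cron entry would then run this nightly, e.g.:
# subprocess.run(cmd, check=True)
```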


Excellent answers here. As can be surmised, the answer for Linux is: use B2.

But it just occurred to me that for home use, if you're a special kind of masochist, you might expose your DAS to a VM running ReactOS and use the Windows client?

I don't recommend it, and have no idea if it would work... But I'd love to see a write-up if someone wanted to try it.


Use their B2. rclone supports B2 as target


No, but CrashPlan is $10 per month and has a Linux client that supports unlimited storage.


The client is also a RAM hog, restores and backups are slow, and there are restore-problem stories online... Yes, it's an option, but it has its own issues. And recent changes make it unlimited except for a growing list of exceptions/exclusions.

I was on CrashPlan for a long time, moved to rclone with Google and AWS, and will likely move to Duplicacy with those systems.


>If not, what service do folks suggest?

O365 comes with 5TB of space that can be addressed by a diff backup tool like duplicati.

Only gotcha I can see is that it's 5x1TB.


There was a post from BackBlaze a year (?) or so back where they commented on the Toshiba low failure rate with something along the lines of "they seem really reliable, but we buy in bulk and just don't have enough offers at low price for those, otherwise we would buy a lot of them".

Well, I run a couple dozen Synology NASes in a professional setup, as well as two in personal setups (mine and my parents'), and ever since that post I've made an experiment of having almost 50% of all drives be Toshibas. I have to say they do seem to be much more reliable (on the scale of "why does every other drive from Seagate and WD keep dying first, and often its replacement dies first too").

It is still a scale of use where it's mostly anecdotal rather than verifiable data, so don't take this fun comment for more than that. But I suspect a lot of people reading these posts are not interested in some large-scale setup but rather want to know which drives to put in their home computer or NAS, and honestly I can highly recommend the Toshibas for that. They do tend to be a bit more expensive (around 10% more? I buy them from ldlc.com and grosbill.com, French IT stores, no bulk buying or anything like that).

Of course, no matter the brand, never expect zero failures - a Toshiba drive may just as easily die in the first ten minutes, so always plan for it.


Yev from Backblaze here -> Yea, the Toshiba drives are great! If the price was lower, they'd likely play a larger role in our hard drive mix!


If they need replacing less often, then presumably you’d be willing to spend more on them, so do I infer that they are sufficiently more expensive to wipe out the benefits of greater reliability?


Yev here -> Yes, that's basically the trade-off for us: is the drive available and reliable versus how much it costs. One of the reasons we have so many Seagate drives is because they are available in abundance, are affordable for us, and have failure rates in line with the other drives that we're testing.


I'm theorizing that one of the things Backblaze has optimized is the labor cost of drive replacement, yes?

So the reliability may be "worth more" to someone who pays a lot for remote-hands in a server farm somewhere, but not in this case.


Yev here -> we do have remote hands available in some cases, though we try to have our own teams in place where possible - but yes, it's one of the things we optimize on our end!


To what extent do you maintain drive diversity artificially? I.e., include a higher % of Seagates just to diversify the risk of one batch/model being bad news.


I don't buy in bulk or anything like that, but the Toshibas seem to be reliably 5-10% more expensive than the comparable WD models on the French IT stores I use, mostly because WD and Seagate always have special offers on one of their drives or another.

I suspect when buying in bulk like BB does, the difference becomes even larger? Also, they seem to build their systems so that the reliability of any single drive doesn't matter much (as long as it stays within an acceptable range), so a gain that may seem massive to us small-scale users may not have much impact for them compared to the cost benefit.


This is my favorite kind of content marketing.


Shout out to Seagate: every drive that my friends and I have bought from them has eventually failed. Good to see that they fail in non-consumer use too, not just for me! Stick to the Western Digitals.


Yev here from Backblaze -> The Western Digital drives that you've purchased will fail too. That's part of the whole point of these reports, all of the drives eventually fail out or reach a state where we have to replace them - it's not any one specific manufacturer. That's part of why having a backup is so important, even the SSDs in newer machines will eventually go wonky.


Thanks for posting this data! I love it!

I appreciate your diplomatic language, but the time to failure does matter, and consumers don’t have the same cost structure that incentivizes replacing working drives the way you do.

FWIW I share GP’s experience with Seagate. I had quite a few of them, ranging in size from 500 gigs to 2TB. Every last one of them died relatively quickly, while most of my Hitachis, Toshibas, and WDs from that era still work.

Seagate earned a permanent boycott from this customer.


This data isn't particularly useful beyond that, because there are a lot of variables that differ between running in a data center and a desktop/home NAS. I like reading these data because they're interesting to me, but I make purchasing decisions based on other sources, usually in terms of ease of dealing with support as an end customer (when I need a drive replaced), noise (it's in my house), heat sensitivity (my house is warmer than a data center), and sensitivity to things like vibration (my case is far less stable than enterprise storage clusters).

That being said, I occasionally use these data to break ties or see variations between generations of drives.


WD isn't much better, but Backblaze's stats have consistently shown Seagate to have a higher failure rate than everybody else out there.

Good riddance.


I just purchased a newer WD Red (WD80EFAX) to replace an older HGST (pre-WD if memory serves) drive. That drive (and another of the same model I purchased elsewhere to test) is not recognized by the UEFI on the motherboard. The OS probes the drive and fails to negotiate DMA transfers.

Turns out there is a publicly available KB article that mentions known bugs with transfer-rate negotiation on some of their SATA3 drives. Of course WD won't say which drives. There's even a utility referenced in the article that you can use to disable SATA3 support. But WD won't make the utility publicly available.

Meanwhile their support is stuck in a "have you tried power cycling the computer" loop.

If memory serves Western Digital / HGST / SanDisk were among the first to jack up prices after the Thailand disasters. Fuck em.

Meanwhile is the difference in failure rate between Seagate and Western Digital even statistically significant? For most comparisons you're looking at a fraction of a percent.
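It's checkable. A two-proportion z-test on the daily failure probability - treating each drive-day as a Bernoulli trial, which is roughly how the AFR is built - shows that at drive-day counts in the millions, even a fraction of a percent of difference is easily significant. The counts below are made up for illustration, not Backblaze's actual numbers:

```python
from math import sqrt

def afr_z_score(fail1, days1, fail2, days2):
    """Two-proportion z-test on daily failure probability,
    treating each drive-day as an independent Bernoulli trial."""
    p1, p2 = fail1 / days1, fail2 / days2
    p = (fail1 + fail2) / (days1 + days2)            # pooled rate
    se = sqrt(p * (1 - p) * (1 / days1 + 1 / days2))
    return (p1 - p2) / se

# Hypothetical counts: 500 vs 300 failures over 10M drive-days
# each, i.e. roughly 1.8% vs 1.1% annualized failure rate.
z = afr_z_score(500, 10_000_000, 300, 10_000_000)
# z comes out around 7, far beyond the ~1.96 cutoff for p < 0.05
```

Whether the real quarterly numbers clear that bar depends on the actual drive-day totals per model, which vary a lot.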


As noted above, all drives will fail. Though I get your point, I think saying that Seagate drives' time to failure is lower is a fairer statement.


I feel like there might be a difference - do drives fail more if they're not continuously running? When I was working in a lab, we generally saw close to 70% failure rates for drives that had been sitting in a box for more than a couple of years.


Do you think the seagate barracuda deserves to take the “death star” title from HGST? I think HGST has redeemed themselves with sub 1% failure rates for years. Seagate managed to put out a 30% failure rate drive.


The "Death Star" title really belongs only to that one specific model (the 75GXP) produced by IBM, with its spectacular failure mode (shredding the magnetic media off the platter almost entirely in some cases).


The term “Death Star” was also a play on the product’s marketing name of “Deskstar”, so it wouldn’t make sense for any other drive.


It came with an automatic audio failure alarm!


I've never gotten to hear one fail, but these were the way I learned never to use drives from a single batch in a RAID array back in the day (these days I mostly never match models within a single array and try to avoid matching manufacturers; I've gotten paranoid over the years after successive RAID failures).

Thankfully it was just stressful - we had backups, but drives in our array started failing one by one, at about a week's interval; unfortunately it took something like 4 days to rebuild the array each time, so we wasted a lot of time shifting writes elsewhere so that our last backup plus the writes going elsewhere stayed recent enough in case a second drive failed. We got off easy, but the time we spent probably cost us more than setting up a more redundant system in the first place would have.


I'm not sure about today, but 4 years ago we bought 120 HP notebooks from Verizon (they had cell modems in them) to use on a project. We had a horrendous failure rate: 100 hard drives in the first six months. They were Seagate drives. We ended up just replacing the remaining 20 hard drives as a preventative measure.


Then I stand corrected. Is there any way to avoid drive failures, or should we accept the fact that they will break eventually? Also, the data is interesting and appreciated.


Yev here -> There's no way to avoid drive failure. We circumvent drive failure issues by using our vault architecture (https://www.backblaze.com/blog/vault-cloud-storage-architect...). It's essentially like a giant RAID++. Anything mechanical will break down eventually, which is why we try to push people towards having backups - the more copies of the data you have the less chance you have of all those mediums failing simultaneously, resulting in loss.


There's no real way to avoid failure in any particular piece of hardware. Eventually, everything from the HDD to the computer it's in will fail. The only way to ensure data durability is to back up across multiple independent physical sites. Even then, there's never a guarantee - it's just that the chance of data loss decreases to levels where other issues start being more problematic, e.g. global nuclear war.
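The "more copies" point can be made concrete: if each copy fails independently with probability p over some window, the chance of losing all of them is p^n. Independence is the key assumption - copies in the same house fail together in a fire, which is why the sites need to be physically separate:

```python
def loss_probability(p_fail, copies):
    """Chance of losing every one of `copies` independent replicas,
    each failing with probability p_fail over the same window."""
    return p_fail ** copies

# Illustrative figure: a 2% chance of loss per copy per year.
one_copy = loss_probability(0.02, 1)      # 2% per year
three_copies = loss_probability(0.02, 3)  # 0.0008% per year
```

Even a modest per-copy failure rate shrinks to near nothing with three independent copies, which is the intuition behind schemes like 3-2-1.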


All things fail eventually.


Even if it physically works forever, you can have things like "bit rot", so you should continuously back up to different media and locations.


Buy a RAID enclosure or set up a home RAIDed server and do backups. You cannot avoid failure, but you can mitigate the damage.


I really wish that consumer OSs would mandate RAID (mirroring at the minimum), requiring additional steps to install on unprotected storage. This should have been implemented even back during the early days of consumer hard drives, or at the minimum have a warning on boot that "Your data is stored on temporary storage -- go [here] to configure redundancy". At least something that puts it front and center that storing data on a single hard drive will eventually lead to data loss.

Also would be nice to have a similar warning to "Your data has not been backed up in [x] days".


Ugh, god why? Why must people in our industry daydream about over-complicating things all the time? Very few use-cases benefit significantly from mirrored disks. The amount of data a user actually cares about having a backup of is often significantly smaller than the OS and related garbage on a disk that they can do without. Besides which, a local mirror isn't a good backup anyway!


My thinking is that this would serve a similar purpose to the trend for web browsers to warn users of insecure websites -- the more "in your face" the warnings are, the more incentive there is for providers to be secure.

I've had friends/relatives experience drive failure a few times in the past, and the look of horror they have when there is very little I can do to help them recover their photos etc. is something that I hate seeing.

And having the OS give a simple warning (that can be dismissed, with a "don't show me this any more" checkbox) would not over complicate things, and may end up saving some people's data.

Really no different than the current warning that Windows has, when you don't have antivirus installed. Or the fasten seatbelt signal that comes on the dashboard of your car when you start it. Or the flashing red light that gets added to some stop sign controlled intersections.

Also it would be nice if there was a standard way that backup software could inform the OS of backup status (this way it would serve as a secondary check in case the backup software's internal reporting fails to notify the user of bad backups). Just a little nice-to-have.

(I'm not advocating for this to be a legal requirement, just a nice feature if any OS vendor wants to add it).


> My thinking is that this would serve a similar purpose to the trend for web browsers to warn users of insecure websites -- the more "in your face" the warnings are, the more incentive there is for providers to be secure.

Another thing I hate about the industry today: being hostile to the user and trying to force them to use their computer how you want them to use it.

> I've had friends / relatives experience drive failure a few time in the past, and the look of horror they have when there is very little I can do to help them recover their photos etc. is something that I hate seeing.

Then teach them about proper backups. It doesn't take a rocket scientist to understand "don't keep all your eggs in one basket". Or if you're going to implement some stupid forced user-hostile scheme at least use something that actually qualifies as a backup.

> And having the OS give a simple warning (that can be dismissed, with a "don't show me this any more" checkbox) would not over complicate things, and may end up saving some people's data.

Here's what will happen: the user will dismiss the dialog without even reading it. We have decades of experience showing us this. Users have learned that "warning" dialogs are meaningless precisely because of crap like this. Oh yeah, and they're super annoying.

> Really no different than the current warning that Windows has, when you don't have antivirus installed.

Exactly my point.


I've looked into RAID at home but came to the conclusion that there are plenty of easier/cheaper ways to do backup. I just use USB drives, have a couple rotating Time Machine backups plus Backblaze. (Plus, when I think of it once a year or so, I have one more copy of my main data disk that I keep in a fire box.)

I don't really use Windows but you can do something similar--though in my experience it's not as simple.

I don't really care if I avoid any downtime. So long as I have belt and suspenders backups, I'm pretty comfortable.


RAID is for lower downtime, not data protection.

If we're talking about "protecting" consumers, I think it might be wiser for all OS vendors to provide a free tier of version-controlled cloud storage.


Backup is much more important than RAID in most situations. RAID doesn't protect from user error, virus, fire, etc.

I think Windows does remind you to create backups (though not very loudly, and it accepts local backups to another drive, which is a poor solution).

Unfortunately, almost everything is built as cheaply as possible. For example, I would prefer ECC RAM as standard. It's very cheap to produce (one additional chip per RAM stick), but Intel forces people who want reliability to pay for a Xeon CPU, and most people don't care.


I use a Mac. I get those warnings.

I use a Synology drive (using Synology's RAID-like format) for my Time Machine backups, and a Drobo for my less critical stuff.

They have WD drives. I have not liked the Seagates. However, I like the HGST stats.

One of the nice things about both Drobo and Synology, is that I can change the drive type and capacity "on the fly."


More specifically, adopt a 3-2-1 scheme for data you care about: store it in three places, two onsite but on different media, and one offsite. Backblaze is probably as good a choice as any for offsite storage; Tarsnap is also really popular here.


Yev - have you considered adding in some SSD drives just to see how well they last & what failure modes they have vs HDDs?


Good question, and nope. They're simply too expensive for us. As they get lower in price they do become more interesting, but their $-to-density ratio is just nowhere near where we'd need it to be in order to continue providing our service at the rate we do. Some of our boot drives are becoming SSDs though, and they are in the "raw" data that we publish.


As a teenager, I helped a friend revive a Seagate drive after it bricked due to faulty firmware. If I recall correctly, my friend had actually installed some firmware updates for the drive, but had not installed one recently enough to avoid the problem. We had to run wires to contacts on the board to allow us to run commands in a terminal (in Windows) on the only machine we could find with a serial port. When it worked, we felt like hackers from a movie!


Agreed. My seagate drives all died, even ones bundled in external usb drives.

I have 10 year old WD black and green drives still kicking. I still have a gen 1 or 2 intel ssd drive that’s still kicking.


I think those in external USB drives are the most likely to fail. It seems that they put their lower-quality stuff, or at least drives that didn't quite pass QC, into those, because they know the huge majority of those will have some data put on them and then never be touched again.


If you rip them apart they are the same as their laptop drives with dragster. Same with WD; I ripped out a green drive. Users in forums have recommended buying certain external drives and ripping them apart to save money. Certain high-capacity WD Red drives cost less when they were bundled into a "NAS" enclosure.


>If you rip them apart they are the same as their laptop drives with dragster.

Dragster should be seagate. Weird autocorrect.

If you rip them apart they are the same as their laptop drives with seagate.


How do you know? As I said in my comment, they may be drives that didn't quite meet the qc threshold.

All drives usually ship with a certain number of bad sectors. An excessively high number of bad sectors can mean there are quality issues with that specific drive. They may be 'binning' those more poorly testing drives for the external USB drives.

https://en.wikipedia.org/wiki/Product_binning


I always wonder if they bin those USB drives differently than the ones sold for internal laptop/desktop use.


Yes, that is what I was getting at. I almost exclusively "shuck" my drives like that for my server. But I have gotten a seemingly high number of failures over the years.


Dissenting opinion: I've owned and used a lot of Seagate drives. I have had a total of one fail on me. Given that they are usually less expensive than competing drives, I think they're fine. At the rate of disk-size growth, I end up replacing the disks for size reasons before they fail anyway.


My experience is the opposite. 100% of the Seagate drives I have purchased for personal use (total: 4) experienced failures within the warranty period, and of the drives I had a hand in purchasing professionally at least half needed replacement within the warranty period (~64).

I think Seagate is fine if you are willing to overbuild and deal with the refurb process within the warranty period. I'm not, and will not personally buy a Seagate product again unless their reputation improves (similar to what happened with IBM/HGST post-deathstar). It's not worth the time or aggravation, to me.


All I can say is my experience differs. The only Seagate drives I've had that kind of failure rate with were the 3TB ones that were out, IIRC, around the time of The Great Disk Shortage.


I've still got my 1TB Seagate up and running from around six years ago. No problems with that, but all the 3TB drives I've encountered seem to have failed quite quickly. Maybe it's the specific model that I got, but the higher capacities, in my case, tend to be more faulty.


I used to work for a long-defunct storage manufacturer. We reached the point where we skipped either even or odd (I can't remember which) firmware versions on the Seagate drives; RMA counts would always be higher on the even/odd numbers.


That would be an interesting analysis! Like buying a car built on a Friday?



Call it paranoia but I see this "WD is amazing, Seagate is shit" sentiment everywhere, but the data doesn't back it up. Is this guerilla marketing by Western Digital?


I think there's some selection bias as Seagate drives are fairly abundant in oem systems where the oems included the lowest tier drive possible to shave some cost off which may be less well built and more susceptible to damage over time from vibration/impacts than higher tier drives.

I suspect that most of the time after a hard drive failure the drive gets replaced with a higher tier drive (typically from another manufacturer like WD as the person feels burnt by Seagate for having a hdd failure) which may prove more reliable.

That's my wild speculation anyways.


Well, JFYI, in 2009 (starting in 2009 - the last user posted a few days ago) there was the great Seagate 7200.11 Bricking Season (unrelated to the quality of the actual hardware; the issues were related to firmware) that killed so many disks (most of which could be revived) that I guess everyone either had one, knew someone with the issue, or heard someone talking about it.

By any metric a thread on an all in all "niche" technical board with almost 5,000 posts and nearly 4,700,000 views should mean that a lot of people experienced the issue:

https://msfn.org/board/topic/128807-the-solution-for-seagate...


There was a certain line of Seagate 3TB drives that seemed particularly prone to failure, as reflected in consumer reports as well as Backblaze's statistics at the time:

- https://en.wikipedia.org/wiki/ST3000DM001

- https://www.backblaze.com/blog/3tb-hard-drive-failure/

Later Seagate drives don't seem to be worse than the competition, but memories of that infamous drive model still linger.


I got burned pretty badly by those, as I bought 12. I now avoid Seagate if possible because I'm pretty sure they knew their disks were faulty. Even if they didn't know, they didn't handle the whole fiasco well.


No one is safe from inevitable drive failure, it seems.


Funny how everyone chimes in with their Seagate horror stories. My experience with them isn't much better.

But another fun story: I had a PSU blow up a couple years ago in a machine with three WDs and three HGST. All WDs were dead after that, the others worked flawlessly. Probably not a large enough sample size for any definite conclusions but at least it put a failure mode on my radar that wasn't there before.


I've only ever had one drive failure. It was a 3TB Seagate. I remember it failing after about three years, so after the warranty had already expired.

My oldest drive is a 500 GB Western Digital from 2008 that's still operational today. I imagine its end is near, but I've thought about that for a couple of years now.


My Seagate that failed on me was a 3TB too! Do you remember what year you got it? I'm thinking it may be that model.


Seagate Barracuda 3 TB, ordered in January 2014.


Every failed drive I have owned was a Seagate too.


I'll never buy Seagate again after losing a drive because of a firmware bug that prevents it from coming online.


I love these articles - I'd also love an update to the cost curves: https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/ (2017)

It looks like after stalling in 2017-2018, $/GB has dropped again - https://jcmit.net/diskprice.htm - but JCM doesn't have the large sample sizes Backblaze does.


Interesting how the checksumming process sounds very much like zfs's scrubbing process. One of the reasons I trust zfs with my large data volumes is because it proactively looks for problems and fixes them. (and most filesystems really can't look/check)


> By increasing the shard integrity check rate, we potentially moved failures that were going to be found in the future into Q3. While discovering potential problems earlier is a good thing, it is possible that the hard drive failures recorded in Q3 could then be artificially high as future failures were dragged forward into the quarter. Given that our Annualized Failure Rate calculation is based on Drive Days and Drive Failures, potentially moving up some number of failures into Q3 could cause an artificial spike in the Q3 Annualized Failure Rates. This is what we will be monitoring over the coming quarters.

Wouldn't survival analysis on interval-censored data handle this problem automatically? All of your observations of failure presumably are actually interval data, where all you know is that the drive failed sometime in between the last good check and the first bad check. Then it doesn't matter if some time periods have large intervals and others have small intervals, that just affects the precision of estimates.
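To make the idea concrete, here is a toy sketch under an exponential-lifetime assumption - this is not Backblaze's methodology, just an illustration of interval censoring. A drive healthy at scrub time a and found dead at scrub time b contributes likelihood S(a) - S(b) = e^(-λa) - e^(-λb), and a still-healthy drive censored at t contributes S(t) = e^(-λt), so the scrub interval only affects the precision of the estimate:

```python
import random
from math import exp, log

def interval_loglik(lam, failures, survivors):
    """Log-likelihood of an exponential lifetime model.
    failures: (a, b) pairs - healthy at check a, dead at check b.
    survivors: censoring times for drives still alive."""
    ll = sum(log(exp(-lam * a) - exp(-lam * b)) for a, b in failures)
    ll += sum(-lam * t for t in survivors)
    return ll

def fit_rate(failures, survivors, lo=1e-4, hi=0.5, steps=2000):
    """Crude grid-search MLE for the failure rate lambda."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda lam: interval_loglik(lam, failures, survivors))

# Synthetic check: true rate 0.1, drives observed only at integer
# "scrub" times, observation window censored at t = 50.
random.seed(0)
lifetimes = [random.expovariate(0.1) for _ in range(500)]
failures = [(int(t), int(t) + 1) for t in lifetimes if t < 50]
survivors = [50] * sum(t >= 50 for t in lifetimes)
estimate = fit_rate(failures, survivors)  # recovers roughly 0.1
```

Shortening or lengthening the check interval changes the width of the (a, b) intervals but not the expected value of the estimate, which is the point: a scrub-rate change would widen error bars in some quarters, not drag failures into them.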


The only reason I use a storage provider other than Backblaze is the benchmark done by the author of one of the more modern backup tools.

https://github.com/gilbertchen/cloud-storage-comparison/blob...

Can anyone from Backblaze say anything about their performance compared to other vendors?

The pricing is certainly ahead of the others, so I would use them if the performance is comparable to the leading group tested there.


Yev here -> well, that chart hasn't been updated in a while. For starters, we're just $0.01/GB for downloads (we dropped the price last year). Our performance is generally pretty good, and we're partnered with Cloudflare (free egress) if you need more oomph. But most of the time folks don't have any issues with just our regular service.


I must have missed it somehow, but what is the difference between boot drives and data drives in a typical Backblaze server, other than that the boot drives store the OS? Obviously you don't need 8 TB of capacity solely for the OS, so I'd assume you also store user data on boot drives? In which case, why is there a distinction?


Yev here from Backblaze -> Mainly it's the OS and log files - the reason we make a distinction is that the boot drives typically do not have as much load as the data drives, so it wouldn't be a real 1:1 comparison.


Thanks! What’s the capacity and utilization rate of those boot drives, then?


Andy from Backblaze here: The capacity of the boot drives typically ranges from 80 to 500 GB. They are mostly hard drives, with some SSDs added recently; we are switching over to SSDs for boot drives. The workload is reasonable, but on the higher side, as they not only boot the systems but also store log files temporarily - so lots of reads, writes, and deletes. Since we only have a little over 2,000, any data we published would not be very accurate. If you are really interested, the boot drive data is in the data files we publish each quarter.


Thanks for the pointer! :)


I wonder if Backblaze will eventually offer some sort of consumer backup solution: an app on iOS, Android, Mac, and Windows that simplifies backup to your NAS, which backs up to B2 as well.


I've been following these since maybe 2016? I don't remember when they started. It's striking to note that for as long as I recall, HGST still holds the crown of lowest annualised failure rate across the board.


Off-topic but sort of related: is there a hard drive tower designed specifically for a pool of SSDs? The one I have with eight 3.5" bays could easily fit twice as many of the little SSDs...


The term "mobile rack" will find some of the densest ways to shove, for instance, eight 2.5" drives into a single 5.25" bay. That can get you some serious density in one of those "cdrom duplicator" tower cases that's all drive-bays, but heaven help you on the controller side.


You mean something like these? (not "tower" though)

https://www.addonics.com/category/sda.php


And is there any reason to not trust WD Red drives in my home NAS?


The plural of anecdote is not data. The point of backblaze's publication is to share statistical data with the public at large. Their dataset doesn't include any WD Red drives, therefore, the data does not provide a statement about WD Red drives' reliability in either direction.

Consider this: the least reliable drive in this dataset has a 2.7% annualized failure rate at an average age of almost 4 years.

That's low enough that many SOHO users will never see a failure, yet high enough that a rack's worth will have seen more than one failure recently. Thus, your question could be answered with complete honesty in both ways: with anecdotes by happy users and with anecdotes by unhappy users.

Therefore, neither positive nor negative answers are useful to you.

This phenomenon underlies the fundamental weakness of self-reported online reviews. You cannot actually get a useful measurement for how reliable a product is solely by self-reported sparse feedback.


You shouldn't trust any drive. Trust in probabilities to protect you. You should be backing up your NAS to either a cloud provider, an external drive (if data will fit), or another NAS. Make sure to locate one copy offsite and practice restoring from your backup.


I trust the Reds. Although, as someone else just said, it's an anecdote, not real data like the Backblaze report.

I started my NAS in 2012 with two 2 TB Red drives. Later I added two 6 TB Red drives. Some time, around 2016 I think, one of the 2 TB Reds failed and I replaced it with another 6 TB. Then the other 2 TB Red failed, like a year later and I put in another 6 TB so they all matched. I did not get replacements even though one failed during its warranty period (I am pretty sure the 2 TB Reds still had a 5-yr warranty at that time), because I wanted to replace it with a 6 TB anyway.

Currently all four 6 TB drives are still running, plus a couple of 4 TB Toshibas I grabbed at some point.

So, I don't think that drive failure after 4 or 5 years is so bad. I got my money's worth anyway, and that's why a NAS has redundant drives.


All drives will fail. You're just buying time (or rather, the likelihood of getting more time out of the drive).

I've had a couple of WD reds fail. You pop in another and rebuild the volume. It's normal as long as it's not too often.


Practically speaking there's no way to avoid them (or their white-label equivalents) if you're on a typical home budget and you buy any significant number of drives. They're just too cheap (when you get the Easystores) compared to any other way of acquiring a large amount of disk space.

I'd rather buy WD than Seagate, but that's just me. I don't really have a choice. Maybe the only people who have a choice are either in the enterprise or buying one drive every few years for a PC build or something.


Your question makes no sense. Any drive can and will fail, and no matter the brand it may fail immediately or it may last years.

The rule is to avoid having all of your drives come from the same factory date/batch. Even if they're all the same brand, order from different stores or whatever to help ensure that a single defective batch doesn't affect them all.

Also, for a home NAS, ensure you have redundancy (a proper RAID level; avoid JBOD and RAID 0) and have backups. RAID is not a backup.


I would add data integrity to the list, in addition to redundancy and backups.

ZFS is great in this regard: redundancy, data scrubbing to ensure data integrity and built-in snapshotting and data replication features (zfs send-receive). Can't believe how I got by without ZFS in my earlier years of data hoarding.
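The send/receive replication mentioned above can be as simple as the following (pool, dataset, and host names here are placeholders):

```shell
# Take a point-in-time snapshot of the dataset
zfs snapshot tank/data@2019-11-12

# Replicate it to another pool, e.g. on a backup machine over ssh
zfs send tank/data@2019-11-12 | ssh backuphost zfs receive -u backup/data

# Later, send only the delta between two snapshots
zfs send -i tank/data@2019-11-12 tank/data@2019-11-13 \
  | ssh backuphost zfs receive -u backup/data
```

The incremental `-i` form is what makes nightly offsite replication cheap once the initial full send is done.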


Absolutely right. Most home NAS devices don't have ZFS but can use Btrfs, which helps protect against bit rot and offers proper data scrubbing.


In reality, there are only two kinds of hard drive: Those which have already failed, and those which will.


Is Blackblaze used mainly as a backup solution? Can anyone give me a use case?


Yev here from Backblaze -> the company started by providing unlimited online backup, and that's a great industry for us. About 4 years ago we released Backblaze B2 Cloud Storage, which allows developers or sysadmins or enthusiasts to directly upload/retrieve data to/from our data centers. Our core competency is data storage - so while most folks do use us for a backup (either of their Mac or PC on the consumer side or servers/NAS devices with B2 Cloud storage) - what we really do is store and retrieve data.


So you guys basically wrote your own Ceph?


Sort of, but not really; we just wrote APIs that let people talk to our pods directly. You can read about our architecture here (https://www.backblaze.com/blog/vault-cloud-storage-architect...) and check out the APIs and how we built them here (backblaze.com/b2/docs/).


Assuming you mean Backblaze and not Blackblaze..

I have data that is important to me (family photos, etc.) on my hard drive. I have a backup of that data running on a Raspberry Pi with an attached drive. If my house gets broken into, or burns down, or is hit by a tornado - basically something Really Bad - Backblaze has a copy of my data offsite.

It is tempting to use Backblaze as my only backup, but like they describe on their site, their primary value is as a backup of your backups. Normally you should never have to use them, and if you do, it will be slow. Now if you are in a hurry they offer a service to ship you your data on a thumb drive or hard drive, but that gives you an idea of their primary use.


Amusingly, blackblaze.com redirects to backblaze.com. I assume people have made that mistake before.


Yes, they did and they even wrote a blog post about it:

https://www.backblaze.com/blog/why-backblaze-bought-a-porn-s...


Yea, that was a fun "should we do this" conversation around the office...



