
Trustless, distributed, scalable, redundant, anonymous cloud storage with a market-driven pricing scheme? Why _wouldn't_ you want that? For me that's pretty much the ideal cloud storage system.


"distributed, scalable, redundant" describes AWS pretty well, and if I can pay much less for a faster more reliable service but sacrifice "trustless and anonymous", I'll stick to it.


What makes you think AWS would necessarily be faster or more reliable? If done correctly, I think a decentralized system could be significantly faster and more reliable than even the best centralized storage solution.

For example, with a decentralized system files could be broken up into shards and distributed across multiple nodes, so unless a significant percentage of computers on the planet all melt down at once your data would remain accessible (i.e. zero downtime). That would also allow downloads to be parallelized across multiple connections like with BitTorrent, meaning bandwidth wouldn't be an issue either.

And that's not even considering the effect that a commoditized pricing system would have on the costs of a distributed storage solution. I imagine it'd be significantly cheaper than AWS too, as storage would be sold at a price very close to costs.
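A rough way to put numbers on the sharding claim above: if a file is split into n shards and any k of them are enough to rebuild it, availability is the chance that at least k of the n hosts are online. The sketch below assumes hosts fail independently with the same uptime p, which is an assumption of this example and not something the thread establishes; the function name and the sample parameters are likewise illustrative.

```python
from math import comb


def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n hosts are online.

    Assumes every host is up independently with probability p
    (a simplifying assumption, not a measured property of any network).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical scenarios, all with hosts that are up 90% of the time.
single_host = availability(1, 1, 0.9)        # one copy on one host
three_replicas = availability(3, 1, 0.9)     # any 1 of 3 full copies
plain_sharding = availability(10, 10, 0.9)   # 10 shards, all 10 required
erasure_coded = availability(20, 17, 0.9)    # any 17 of 20 shards
```

Under these assumptions, sharding with no spare shards (k = n) is less available than a single host, since every host has to be up at once; it is the extra shards or copies that buy the uptime.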


> files could be broken up into shards and distributed across multiple nodes

You can set up any old distributed database to do this... If there was a benefit to breaking down a file rather than replicating it, AWS would be using this efficiency to provide a better service.

> the effect that a commoditized pricing system would have on the costs of a distributed storage solution

It'll always be cheaper for someone to sell their hard drive than to sell access to their hard drive, just like it is cheaper to buy bitcoin instead of mining it.

Here is why: let's say Alice is profit-driven and greedy and is making $1 profit for 1gb of space. She can use those profits to buy more hardware, make more profit, buy more hardware, and so on and so forth until she has 100,000gb of space being offered. At that point, because of economies of scale, it will cost Alice less to maintain 100,000gb than the total cost of 100,000 people hosting 1gb each (e.g. lower total electricity cost because of bulk purchase).

Now, Alice can grow her operation so big (say, 100 petabytes) that it actually becomes no longer profitable for Bob to host 1gb. Alice has so much lower costs per gb that she was able to drive the price per gb down to a point where the price is lower than Bob's costs.

This is basic economy of scale so far, right? And it's the same reason why data centers filled with ASICs make it useless for you to mine Bitcoin on your laptop.

OK, now that Alice is hosting, say, 100 petabytes and has priced out all the individuals on the network, she will be subject to data policy rules and will have to obey law enforcement to kick out users that are using her hard drives for illegal activity.

Alice is now AWS.


> If there was a benefit to breaking down a file rather than replicating it, AWS would be using this efficiency to provide a better service.

Are you suggesting that AWS is doing every possible thing that would increase efficiency, or in other words, that AWS' services are optimally efficient? That seems very unlikely.


Yes. This is how capitalism works. If there is an inefficiency in the market, there is an opportunity for a competitor to beat you on price or quality. If AWS is not incorporating these efficiencies into their tech, then Microsoft Azure will, and will beat them by providing either a cheaper or higher quality service.

If FileCoin actually worked as a better technology, Google or some big name would say they will start using it to beat AWS in the data storage game.

I'm not saying FileCoin is useless; it is useful for storing 3D printed gun designs and illegal media, because it is outside the reach of the law. But to think it is a more efficient technical solution to data storage is a bit naive...


"Yes. This is how capitalism works. If there is an inefficiency in the market, there is an opportunity for a competitor to beat you on price or quality. If AWS is not incorporating these efficiencies into their tech, then Microsoft Azure will and beat them by providing either a cheaper or higher quality service."

Not that I'm saying Filecoin or a decentralised storage coin is in any way the way forward or even a competitor.. but saying that is how capitalism works is such rubbish.

In an ideal system you are right, but how often have you worked for a company that is raking in the cash by marketing well and being the "established" hand in the market?

There are so many companies kicking around that are the biggest player in their field that continue to make bank because accountants and upper management decide they are the proven, safe pick in the market and have signed up for long, entrenched software packages.

Also, if this upended the entrenched business model companies were using, they would not be able to quickly switch to it.

There are plenty of ways this could be a better solution and AWS still wouldn't switch to it. Saying capitalism proves otherwise is not a good argument.


It is not just AWS. The data storage market is saturated with competitors fighting hard to come out on top. Amazon, Apple, Box, Dropbox, Google, Microsoft are all in this and some of these companies have massive R&D budgets, have a history of delivering cutting edge tech and have strategies to overcome the Innovator's Dilemma (which is the problem you're describing).

It is precisely because these companies are not entrenched that my argument holds. But also consider the technical downside to FileCoin (slow connections to unreliable laptops VS fast connections to redundant data centers) and the economic downsides to FileCoin (it won't be worth hosting due to economies of scale).


Just a minor nit... Apple doesn't really belong on that list. But you're right in general. Cloud storage is a big race to the bottom, and add-ons are going to be what's monetized. I can't see Filecoin or Siacoin having any advantages here.

I don't know how big the Filecoin network is, but it's very telling that they're not sharing this information. I expect that the amount of data stored is not large. Siacoin makes this available but total contracts are currently a bit under 200TB.

For those of you keeping track at home, that's a single medium-size storage appliance.


Again, Filecoin monetizes IPFS. The guy in the same Starbucks as you may have the file you want, so it could be faster to get it from him than from some data center.

For example, see https://partysha.re


> If there is an inefficiency in the market, there is an opportunity for a competitor to beat you on price or quality.

Of course. That's why when someone proposes a potential mechanism for increasing efficiency, it doesn't make sense to say "that couldn't possibly work, because if it could work then AWS would already be doing it."


1. Filecoin is the monetization layer for IPFS; on IPFS you never store stuff you didn't agree to store. So there won't be 3D printed gun designs, porn and other illegal media on your drives unless that's what you wanted.

2. Filecoin enables a transparent market for storage with bids and asks for storage, so you'll know what you're paying compared to AWS, Azure, etc. and you can decide what to do.

3. I'm on Comcast; if some guy two towns over from me is running his Filecoin-enabled IPFS node also on Comcast, there's going to be less latency than hitting an AWS or Google data center thousands of miles away.

4. My DigitalOcean droplet is ~200 miles away from me; traceroute says it's 10 hops away. I can see the network name for Comcast two towns over from me that's just 4 hops away. The other guys may be big but they can't change the speed of light.

5. Because IPFS is content addressed, I don't need to know where the content I want is located; thanks to the distributed hash table, any node that has that data can respond to my request.

6. IPFS and therefore Filecoin still work even if you can't reach the internet backbone… you can still get stuff via your neighborhood mesh network.


BitTorrent at least has much higher latency than a protocol like HTTP with simple auth.

As for price, AWS has some of the cheapest operating costs due to scale, something which smaller players will have trouble competing with. Moving data within AWS is fast and sometimes free.

Smaller players also pay the highest cost for bandwidth. If some plans don't have bandwidth caps yet (many already do), they will get caps as soon as serving bandwidth as an individual becomes profitable.

Pretty much all distributed systems have the following in common: You pay for resilience with overhead. If it wasn't that way, everything would become as distributed as possible over time.


> Smaller players also pay the highest cost for bandwidth. If some plans don't have bandwidth caps yet (many already do), they will get caps as soon as serving bandwidth as an individual becomes profitable.

Isn't it the other way around? Small providers offer low traffic costs while the big cloud providers (Google, AWS, Azure) charge significantly for bandwidth.


Small providers pay the most for traffic to their ISPs, because they get smaller volume discounts. What Google/AWS/Azure charge is a different topic entirely, they themselves pay much less.


I had been looking at Siacoin recently, but a lot of work is still needed before it can displace something like AWS. It took nearly a day, with lots of stalls, just to sync the blockchain. Also, I am not sure how price-competitive it will ultimately be. Amazon has huge buying power, which I am sure means they pay less per TB than consumers. There are other economies of scale too. I bet the cost of labour to bring each 1TB online on S3 is very low.


Siacoin's calculator at http://sia.tech/ is claiming a price of $2/TB/month vs $23/TB/month for Amazon S3.

However it neglects to include Backblaze's B2 storage which is only $5/TB/month - https://www.backblaze.com/b2/cloud-storage-pricing.html

That is maybe not a big enough price saving to convince conservative corporations to switch. Imagine going to your accounts department to request they purchase a cryptocurrency so you can use it to pay for data storage.


I don't know how using Sia works in practice, but comparing it with regular S3, which is connected to the entire AWS ecosystem (including EC2), is absurd.

Amazon S3 also offers "Glacier" for long-term storage with few accesses, which is $4 per TB.


Costs for retrieval from Glacier are insanely complex and can be shockingly high.

It's best not to use Glacier as a price comparison because it's really misleading.

https://news.ycombinator.com/item?id=4412886


But that's the problem. S3 is cheap if you use it with other AWS services, and it's designed to be cheap only that way. Bandwidth is expensive to discourage people from picking services from a combination of Azure/AWS/Google.

If you're just looking for storage, S3 is certainly not the cheapest provider. Glacier is a bit different but clearly just for archives.


I'm comparing just storage prices, because I have no idea how Sia works in terms of latency/bandwidth/access. Being distributed, I suppose it will fare worse in most (if not all) of these respects compared to Azure/AWS/Google and maybe even Glacier.

As for bandwidth cost: I don't believe Sia can be successful and stay that cheap. Why would providers of bandwidth for Sia be able to offer it orders of magnitude cheaper than the biggest tech companies in the world? Answer: It's offered by a bunch of individuals with no caps on their data plans. If Sia takes off and lots of people start using terabytes of bandwidth, the ISPs will put an end to it.


Does Backblaze have a single enterprise client that anyone knows of? That's Sia's target market. They are targeting companies that use S3, not a company that's pretty clearly targeting personal users and SMBs.


Continuing on my thoughts on costs, with some very quick napkin maths (apologies if errors)

* From https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/ 1TB of storage cost around $25.
* I'll assume 3 year life-span for a drive.
* Add 50% costs for the infrastructure and electricity to support the storage.
* Add another 17.6% due to redundancy (17 data shards + 3 parity) - https://www.backblaze.com/blog/vault-cloud-storage-architect...

Gives a minimum cost (before employee costs, marketing etc) of $1.22/TB/month.

If we do the same with Siacoin (which I believe stores 3 copies of your data) we get $3.12/TB/month.
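For anyone who wants to check or tweak the napkin maths above, here is the same calculation as a small script. The inputs are the ones quoted in the comment ($25 per TB, 3-year drive life, 50% infrastructure overhead, 20 shards stored per 17 shards of data, and 3 full copies for Siacoin); the function name is just for illustration, and the 3-copies figure is the commenter's belief and not something verified here.

```python
DRIVE_COST_PER_TB = 25.0   # USD, from the Backblaze cost-per-gigabyte post
LIFESPAN_MONTHS = 3 * 12   # assumed 3-year drive life
INFRA_OVERHEAD = 0.50      # assumed extra for infrastructure and electricity


def monthly_cost_per_tb(redundancy_factor: float) -> float:
    """Raw hardware cost per usable TB per month, before staff, marketing, etc."""
    base = DRIVE_COST_PER_TB / LIFESPAN_MONTHS
    return base * (1 + INFRA_OVERHEAD) * redundancy_factor


# 17 data shards + 3 parity shards: 20 TB stored per 17 TB of data (about 17.6% extra).
backblaze_style = monthly_cost_per_tb(20 / 17)

# Three full copies of every byte.
three_copies = monthly_cost_per_tb(3)
```

Both results land within about a cent of the figures quoted above, so the arithmetic holds up; the gap between the two comes entirely from the redundancy factor (roughly 1.18x versus 3x).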


If the decentralised network is popular and stable and clients happen to be next to you and they happen to have parts of your files, then yes. It could be fast and reliable.

On the other hand, you could spend some money and guarantee it instead: https://aws.amazon.com/directconnect/


Also if people ever stop using the service your files could be gone forever.


Ooohh we thought that as well, until the latest S3 outage.


Ask the CTO of any Fortune 100 company what they're doing to mitigate such an outage.

What do you think they will answer?

Option A: cross-region replication, regular backups, and secondary service provider failover plan

Option B: decentralize everything on people's laptops


There's nothing I love more than a chicken or egg argument. Nobody is doing B because there isn't a company that has a scalable B option yet.

But hey, your argument is totally valid, which is why everyone is still programming in COBOL!


Sarcasm aside, option A works and option B does not, as you admit. What's your counter-argument? If it is that option B will one day be better than option A, you will need to provide evidence because that's a claim that is not addressed in the posted filecoin.pdf.


> If it is that option B will one day be better than option A, you will need to provide evidence because that's a claim that is not addressed in the posted filecoin.pdf.

Argument to the Future. You can't provide evidence of a future event and asking for it is a joke. Maybe Sia/File/Storj are the future maybe it's something else, but asking for "evidence" of a future event is a joke.


Option C: cross-region replication, regular backups and multiple service provider failover plan in a secure, decentralized storage network.

You don't have to choose between "Businesses Too Big To Fail(tm)" and "Average Joe" when you can have "Many businesses".


Totally agree with this. This might be a better option as part of risk management to de-risk the scenario of complete wipeout across providers.


The only thing I want out of that is redundancy. Why do I want those other things?

(Bear in mind I know what they are - I want you to sell me on them).


How would a market driven solution not drive centralization? Datacenters are cheaper.



