Great discussion yesterday in [Blockstack Slack](https://blockstack.slack.com/archives/storage/p1446056224000190) about incentives and decentralized storage. I’ve replicated it here so that everyone can learn from it and join in the discussion.
Bedeho of Joystream writes in the Blockstack Slack:
Happy to be refuted on this one, want someone to change my mind:
@jude writes:
the elephant in the room is that decentralized storage (in particular, across the wide-area) is actually very hard to do compared to “centralized” storage (i.e. cloud storage, NFS, etc.), since there are way more failure modes to deal with. However, there have been decades of research into such systems from the best minds in computer science, and there are scalable decentralized storage systems that do work well. The reason you don’t see them in the wild is because they’re hard to commercialize, hard to deploy and manage at scale, and hard to run with good performance compared to cloud storage.
bedeho writes:
interesting. Not sure why you say that it's hard to compare? Consumers care about cost, primarily, so if we keep other variables fixed (reliability, privacy, etc), then are these research systems actually better?
@jude writes:
“reliability” and “privacy” are not fixed, nor are they really comparable between cloud storage and fully decentralized storage. so, it’s hard to say in an absolute sense which approach is “better.” It’s really a question of whether or not it’s the right tool for the job
bedeho writes:
ok, let's say consumer dropbox is the job, and privacy is done through client side encryption with all keys with host, what is your feeling in that specific case?
@jude writes:
regardless of the storage medium, client-side encryption ensures that realistically the only person who can read the data is the client. However, the network traffic generated by uploading the data reveals information–i.e. an attacker will know that I sent something to dropbox. This problem is even worse with a decentralized system, where I might connect to many peers and send them chunks of data, and those peers would in turn replicate those chunks. This network traffic also reveals to someone watching the network who the storage peers are, and which of them store pieces of my file.
from a reliability standpoint, dropbox is an all-or-nothing proposition: either my upload works, or it doesn’t
it’s different in a decentralized system–sometimes, my upload might be a lot slower since some of the peers are slow to respond. Other times, the decentralized system might be even more available than dropbox, since dropbox might be throttled or blocked but the peers might not be.
also sometimes I can get a partial/corrupt copy of my file if some peers don’t serve back the chunks I sent them, or if they serve back the wrong chunks
so you see, reliability and privacy have totally different meanings in these two contexts
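The partial/corrupt copy case above is commonly detected by having the client keep a hash of every chunk before upload and check each chunk that comes back. The sketch below assumes SHA-256 and in-memory chunks; the function names `chunk_digests` and `verify_chunks` are hypothetical and not taken from any system mentioned in the chat.

```python
import hashlib


def chunk_digests(chunks):
    """Return the SHA-256 hex digest of each chunk (kept by the client before upload)."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]


def verify_chunks(returned, expected_digests):
    """Return the indices of chunks that are missing or do not match their digest.

    `returned` is the list of chunks served back by peers; a missing chunk is
    represented by None or by the list simply being too short.
    """
    bad = []
    for i, digest in enumerate(expected_digests):
        chunk = returned[i] if i < len(returned) else None
        if chunk is None or hashlib.sha256(chunk).hexdigest() != digest:
            bad.append(i)
    return bad


if __name__ == "__main__":
    original = [b"chunk-0", b"chunk-1", b"chunk-2"]
    digests = chunk_digests(original)
    # One peer serves the wrong bytes, another serves nothing.
    served = [b"chunk-0", b"garbage", None]
    print(verify_chunks(served, digests))
```

This only detects bad or missing chunks; recovering them still needs replication or re-fetching from another peer.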
bedeho writes:
ok, you seem even more bearish than me, I was actually willing to grant that all these issues were solved perfectly in a decentralized solution, to consider best case scenario, seems like you are saying that basic value prop is dubious, and on top of that, there are tons of extra problems.
feel like some proponents need to chime in on this and save the day
@jude writes:
I’m not bearish on this–there are cases where decentralized systems will beat the pants off of a centralized solution by orders of magnitude (i.e. a CDN or a bittorrent swarm can serve data much faster and to many more clients than a single server)
bedeho writes:
yes, this is specific to storage though, obviously bittorrent works very well, which is why it has existed for 15 yrs
@jude writes:
the point I’m trying to make is that decentralized storage systems are fundamentally different animals from centralized storage systems
I think the market indicates that a lot of money-making applications today simply do better with a centralized storage system
we don’t really have decentralized applications, for example–they might actually do better with a decentralized storage system, depending on how they use it
bedeho writes:
you could still totally make money from the system I described, it combines a server to coordinate and pay, and distributed system for actual storage… lots of rents to collect there for the entrepreneur
@jude writes:
how do you prove that your remote peers are actually serving the data?
bedeho writes:
tit for tat
@jude writes:
i.e. how do you know you’re not paying them to just sit there and not serve it
bedeho writes:
A serves B with data, and they report back to my server when a piece has been sent, at which point I do a little payment in my database from B to A
@jude writes:
how do you know B isn’t lying on behalf of A?
bedeho writes:
no benefit, B is paying
@jude writes:
sorry–the point I’m trying to get at is that it looks very easy to game your proposal. i.e. to not serve data, but still get paid
bedeho writes:
I get your point, but it is no harder to solve than in the decentralized system. tit for tat works
@jude writes:
no, it does not. not even with bittorrent. there are still leechers
bedeho writes:
hehe, I am aware of that, making this -> joystream.co, but the point is that the underlying protocol for this would work with tit for tat where payment for service happens on each step, and lack of cooperation means no payment, regardless of who cheats.
this is really a side issue, I think I get the gist of what you are saying about distributed storage… if anything I feel more sceptical now
@jude writes:
it’s hard, but not impossible
so with joystream.co, a reader only pays the seeder once they get the content?
bedeho writes:
leecher you mean?
if A downloads from B, then A pays B after each torrent piece using paychan
@jude writes:
how do you make sure A actually pays?
bedeho writes:
this is a good question, if you do one payment, it's prisoner's dilemma, it will not work out, if you do iterated payments, you get iterated prisoner's dilemma, which generally works out, in particular if the horizon is uncertain
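The difference between a one-shot and an iterated prisoner's dilemma can be illustrated with a small simulation. This is a sketch only: the payoff values (5, 3, 1, 0) are the conventional textbook ones, not numbers from the chat, and the strategy functions are hypothetical names.

```python
# Payoffs as (player A, player B); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect every round, regardless of what the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the game for a fixed number of rounds and return both total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, always_defect, 1))   # single round
    print(play(tit_for_tat, always_defect, 10))  # defector vs. tit for tat
    print(play(tit_for_tat, tit_for_tat, 10))    # mutual cooperation
```

In a single round the defector comes out ahead, but over ten rounds two tit-for-tat players each score more than the defector does, which is the sense in which the iterated game "generally works out".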
bedeho writes:
typical torrent file has 500-1000 pieces, so you verify integrity and pay for each one to get next
same arrangement would work for transmitting data which someone has stored on your behalf
@jude writes:
I’m not seeing how B is guaranteed to get money for serving data
bedeho writes:
paying for actual storage, rather than just transmission, requires another set of protocols which I am not intimately familiar with - but it is supposed to work some how
B is not guaranteed to get money, first piece is sent in good faith, but no more pieces are sent before payment is made https://en.wikipedia.org/wiki/Prisoner's_dilemma#The_iterated_prisoners.27_dilemma
so tit for tat
a few bytes of free service is not a big sacrifice
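The pay-per-piece arrangement described here can be modelled roughly as follows. This is a simplified sketch, not JoyStream's actual protocol: it ignores the payment channel itself, and `run_exchange` and its parameters are hypothetical names.

```python
def run_exchange(num_pieces, price_per_piece, buyer_pays_for):
    """Simulate a transfer where the seller sends each piece first and refuses
    to send the next one until the previous piece has been paid for.

    buyer_pays_for: how many pieces the buyer pays for before defecting.
    Returns (pieces_received, total_paid).
    """
    received = 0
    paid = 0
    for i in range(num_pieces):
        received += 1                 # seller sends piece i in good faith
        if i < buyer_pays_for:
            paid += price_per_piece   # buyer verifies the piece and pays
        else:
            break                     # buyer defects, so the seller stops serving
    return received, paid


if __name__ == "__main__":
    print(run_exchange(1000, 1, 1000))  # honest buyer
    print(run_exchange(1000, 1, 0))     # buyer never pays
    print(run_exchange(1000, 1, 999))   # buyer skips only the final payment
```

Under these assumptions the seller's exposure is capped at one unpaid piece per buyer, however the buyer behaves.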
@jude writes:
it’s not, but if A is patient, then A will never have to pay, right?
bedeho writes:
how so
@jude writes:
A downloads from B, and does not pay B. B can try to block A, but then A can just pretend to be C, and do the same thing
bedeho writes:
sure, or more likely connect to D and try the same game
even leaving aside the insane latency issues that would cause, B should not transmit first piece until it's clear that A has paid sunk cost of tx fee to set up paychan, as long as it's greater than the price per piece, A has no incentive to switch to set up new paychan with D
@jude writes:
then, what stops B from taking the money and not serving to A?
bedeho writes:
that is not possible by assumption, right? B gets payment after each unit of service
B goes first on each round of the game, so to speak
@jude writes:
either B gets paid before, or after, A gets the data. If B gets paid before, then B can simply not transmit. If B gets paid after, then A can simply not pay B. What you need is some sort of escrow service, it seems
bedeho writes:
B gets paid after giving service
@jude writes:
then a clever A never needs to pay
bedeho writes:
well that is what I described above
“even leaving aside the insane latency issues that would cause, B should not transmit first piece until it's clear that A has paid sunk cost of tx fee to set up paychan, as long as it's greater than the price per piece, A has no incentive to switch to set up new paychan with D”
@jude writes:
maybe I’m misinterpreting, but it sounds like the following:
1. A spends money to set up a pay channel for X btc
2. B sends the data to A
3. A sends value Y < X btc to B
bedeho writes:
yes
@jude writes:
then, it’s cheaper for A to skip step 3, no?
bedeho writes:
and then set up new paychan with someone else?
fee on doing that is worse than price per piece
by a large margin
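The fee argument can be made concrete with a little arithmetic. The sketch below assumes that every new seller gives away exactly one piece for free and that every new channel costs the same setup fee; the numbers in the usage example are invented for illustration.

```python
def cost_to_finish(pieces_left, price_per_piece, setup_fee, cheat):
    """Cost for the downloader to obtain the remaining pieces.

    cheat=False: pay the current seller for every remaining piece.
    cheat=True:  never pay; open a fresh channel with a new seller for each
                 piece, taking only the one piece that is sent in good faith.
    """
    if cheat:
        return pieces_left * setup_fee
    return pieces_left * price_per_piece


if __name__ == "__main__":
    # Hypothetical values: 500 pieces left, price 1 unit per piece, fee 50 units.
    print(cost_to_finish(500, 1, 50, cheat=False))
    print(cost_to_finish(500, 1, 50, cheat=True))
```

Under these assumptions, cheating only pays off when the setup fee is lower than the price of a single piece.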
@jude writes:
could B trick A into sinking money into the paychan, and then not serve data?
i.e. B stops at step 2?
bedeho writes:
yes, excellent point
bedeho writes:
this is why you need multiway paychan
@jude writes:
also, A simply doesn’t pay for the last chunk
better than not paying at all, but A can still be dishonest (but probably not a big deal if there are lots and lots of chunks)
bedeho writes:
classic paychan is 1to1, 1toN is best way to protect against this
unless the attacker can occupy all slots in paychan, A is fine
which is a serious amount of work for almost no benefit
@jude writes:
the fact that it’s possible is worrisome, though. The benefit could be nothing more than “for the lulz”
bedeho writes:
yes, defection on last round is also possible, and can lead to death spiral if players start anticipating this, but that is very unlikely to be an issue
@jude writes:
also, how do you guarantee it’s costly to set up 1-to-N? Can’t a single seeder pretend to be many different seeders?
and, doesn’t A pay for the entire paychan setup?
bedeho writes:
A pays entire cost, true
yes single seeder can try to occupy all spots (1-10) in multiway channel, but doing this with many peers for any long period of time without much payoff is unlikely to be very attractive
bottom line is however that it is an imperfect system, but still way better than no incentives or tit for tat barter
@Taek writes:
Wish I had seen this last night, have a lot to say. Obviously I think decentralized storage is fully viable. I’m busy most of today and tomorrow but hopefully soon I can jump in with some of my own points. Sia should easily be able to compete with centralized systems, the biggest challenge is managing the incentives and failure modes. I think we are well equipped to do so.
More specific technical points to come later.
Look forward to hearing more from @Taek and others.