Duplication of Code & State in Blockstack

larry · September 5, 2017, 4:20pm

This post was inspired by @jackzampolin’s post on Component Architecture and my discussions with members of the Blockstack engineering team and community:

–

A recurring theme in Blockstack is confusion over where to store state and where code should live in our decentralized system.

Since the system is decentralized, we want to store the state and run code in locations under user control or that the user can trust. For the most part, this is at the edges of the system, as close to the user as possible.

First, let’s take a look at the state as it currently (v0.15) exists in Blockstack. Then, we’ll take a brief look at some of the functionality provided by each of our main products. Finally, we’ll look at some proposals.

Current locations of state

Blockstack Browser

Wallet seed
Preordered, current name registrations
Next unused address index
Selected storage providers
Storage provider access tokens
URLs of providers such as UTXO provider, search provider, etc

Blockstack Core/CLI

Wallet information
Selected storage providers
Storage provider access tokens
URLs of providers such as UTXO provider, search provider, and other data in the directory containing client.ini

Profile/token/key file

Profile
App public keys
Device public keys

localStorage of apps using Blockstack.js

Gaia storage session tokens

Current locations of code

Below is a list of some of the functionality provided by Blockstack Core and by Blockstack Browser. The list is not exhaustive; it is meant only to demonstrate the degree of overlap in functionality between our two main deliverables.

Blockstack Core/CLI (Python)

Transaction interpretation & indexing
Transaction generation
Zone file creation
Zone file storage
Zone file retrieval
Bitcoin wallet
Identity wallet
Profile retrieval & verification
Profile generation
Social proof verification
Storage provider interface
Gaia storage

Blockstack Browser (JavaScript)

Zone file creation
Bitcoin wallet
Identity wallet
Profile retrieval & verification
Profile generation
Storage provider interface
Social proof verification

Challenges from current architecture

As you can see, state is stored in many different locations in our system. There’s also a large amount of overlap in functionality between our two main deliverables.

This results in a number of challenges for users trying to understand how the system works. It also makes it difficult for developers trying to build apps in our ecosystem and developers working on the core project.

The following is a list of some of the problems:

Confusion from different state in CLI & Browser

Users are trained by centralized systems to expect to see the same view of state no matter which client they’re using to view a service. Users using both the CLI and the Browser see completely different views of the system. A name registered with one does not appear in the wallet of the other and vice versa.

Users are confused by multiple wallets.

It isn’t clear to them which wallet has their bitcoins, which one has their names, or how to use one wallet in multiple browsers or devices.

Users can’t use a name bought on their computer on all browsers and in all apps

Users don’t understand why a name bought on their computer can’t be used in all browsers. Why does it appear in only one browser and not in the command line? (or vice versa)

Backing up state is complicated

Some state is in ~/.blockstack or ~/Library/Application Support/Blockstack/, while other state is in a browser’s localStorage. Many people, especially in our early adopter target market, always use incognito mode. The implications of using the browser in this fashion are not clear.

Configuration and user preferences are in multiple places.

We have information about user configuration and preferences stored in:

Blockstack Core’s configuration directory
Blockstack Browser’s localStorage
User’s Gaia storage device
User’s key delegation file

State is stored in apps’ `localStorage` without their knowledge

Our JavaScript libraries store state in an app’s localStorage without the app developer’s permission. This leaky abstraction makes it difficult to understand the behavior of our libraries, because behavior changes based on cached state that the app developer is not aware of.

Proposal

State

A user should be able to completely restore their account with their 12 word identity key phrase and no other information. This phrase should be the only state stored by our client apps.
Apps are responsible for managing their own state. Our libraries should not create state without app developers’ permission.
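
One way to read the second principle is that a library only persists state into a store the app explicitly hands it. The following is a minimal, language-neutral sketch in Python, not blockstack.js code; `SessionClient`, `save_token` and `load_token` are hypothetical names invented for illustration.

```python
from typing import Dict, MutableMapping, Optional


class SessionClient:
    """Hypothetical library client that persists state only into a store the app passes in."""

    def __init__(self, store: Optional[MutableMapping[str, str]] = None) -> None:
        # If the app gives us no store, we keep tokens in memory and persist nothing.
        self._store = store
        self._memory: Dict[str, str] = {}

    def _target(self) -> MutableMapping[str, str]:
        return self._store if self._store is not None else self._memory

    def save_token(self, name: str, token: str) -> None:
        self._target()[name] = token

    def load_token(self, name: str) -> Optional[str]:
        return self._target().get(name)


# The app opts in by passing its own store, so it can see exactly what was written.
app_state: Dict[str, str] = {}
opted_in = SessionClient(store=app_state)
opted_in.save_token("gaia_session", "token-123")

# Without a store, nothing outside the client object is touched.
ephemeral = SessionClient()
ephemeral.save_token("gaia_session", "token-456")
```

With this shape, any cached state that changes library behavior is visible to the app developer, because the developer owns the store.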

Possible implementation steps

Storage provider information and access tokens are stored encrypted in the key file.
Information about user’s preferred UTXO provider, search provider, etc, is also stored in the key file.
Information about registration state of a name can be generated either from possible Blockstack transactions in the mempool or state stored by clients in the key file.
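
The idea that the phrase alone is enough to restore an account can be sketched as follows, assuming the 12 word phrase is a BIP39-style mnemonic. The seed step follows the BIP39 rule (PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" plus optional passphrase). The `key_file_secret` step and its label are assumptions made up for illustration and are not an existing Blockstack derivation.

```python
import hashlib
import hmac
import unicodedata


def seed_from_phrase(phrase: str, passphrase: str = "") -> bytes:
    """BIP39-style seed derivation. Does not validate the word list or checksum."""
    password = unicodedata.normalize("NFKD", phrase).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


def key_file_secret(seed: bytes) -> bytes:
    # Hypothetical: derive a symmetric secret for encrypting key file entries.
    # The label below is invented for this sketch.
    return hmac.new(seed, b"key-file-encryption", hashlib.sha256).digest()


phrase = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"

# Two devices that only share the phrase arrive at the same secret,
# so neither needs any other local state to decrypt the key file.
laptop_secret = key_file_secret(seed_from_phrase(phrase))
phone_secret = key_file_secret(seed_from_phrase(phrase))
```

The actual encryption of provider information is left out here; the point is only that the decryption secret needs no state beyond the phrase.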

Code

All code that can run on the client should run on the client.
Business logic code should be written once and not duplicated.
The majority of functionality provided by the current REST API should be moved to blockstack.js.
The client should generate its own Blockstack transactions and send them to endpoints for broadcast.
The browser and CLI should be simply different “dumb” interfaces to system state that is retrieved and rendered by blockstack.js.
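
The "dumb interface" idea can be sketched like this: one function owns the business logic, and the CLI and browser views only format its result. This is an illustrative Python sketch; `NameRecord`, `load_names` and the two renderers are hypothetical names, not blockstack.js APIs.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class NameRecord:
    name: str
    status: str  # e.g. "preordered" or "registered"


def load_names(raw: List[Dict[str, str]]) -> List[NameRecord]:
    """The single place where raw data becomes system state."""
    records = [NameRecord(item["name"], item["status"]) for item in raw]
    return sorted(records, key=lambda record: record.name)


def render_cli(records: List[NameRecord]) -> str:
    # Formatting only: no decisions about state are made here.
    return "\n".join(f"{record.name}\t{record.status}" for record in records)


def render_browser(records: List[NameRecord]) -> str:
    # Formatting only: same records, different presentation.
    return json.dumps([asdict(record) for record in records])


raw_state = [
    {"name": "bob.id", "status": "preordered"},
    {"name": "alice.id", "status": "registered"},
]
records = load_names(raw_state)
cli_view = render_cli(records)
browser_view = render_browser(records)
```

Because both views are built from the same records, a name that shows up in one interface cannot be missing from the other.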

Blockstackd

This is what someone who wants to run a full node would run. It includes the indexer and Atlas components of the network.

No write operations except for publishing zone file to Atlas network.
Read operations limited to:
- Querying namespace information
- Querying name information: the highest-level return value should be the zone file; anything higher-level than the zone file is out of scope
- Querying information about the virtual blockchain.
Written in Python

Blockstack Gaia Storage

Each user has one or more Gaia storage nodes.

blockstack.js

All name and namespace-related write transactions are generated and sent from the library, including transactions that need to be sent out at certain times, such as name preorder and register.
Includes a registrar that monitors the blockchain, queues transactions and sends transactions at the right time. This registrar can be run locally or remotely.
Includes higher level profile resolver code, identity and authentication related tools
Usable in browser, node.js or wrapped with language specific bindings
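
A registrar that queues transactions and sends them "at the right time" could look roughly like the sketch below. It is written in Python for illustration only. The `Registrar` class, the `broadcast` callback and the confirmation threshold are all assumptions; the thread does not state the actual confirmation rule.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption for illustration only; the real rule is not stated in this thread.
CONFIRMATIONS_REQUIRED = 6


@dataclass
class PendingName:
    name: str
    preorder_height: int  # height of the block that included the preorder
    registered: bool = False


@dataclass
class Registrar:
    broadcast: Callable[[str, str], None]  # hypothetical: (operation, name) -> None
    queue: List[PendingName] = field(default_factory=list)

    def preorder(self, name: str, included_at: int) -> None:
        # A real registrar would learn the inclusion height by watching the chain.
        self.broadcast("preorder", name)
        self.queue.append(PendingName(name, included_at))

    def on_new_block(self, tip_height: int) -> None:
        # Send each register once its preorder is buried deeply enough.
        for pending in self.queue:
            confirmations = tip_height - pending.preorder_height + 1
            if not pending.registered and confirmations >= CONFIRMATIONS_REQUIRED:
                self.broadcast("register", pending.name)
                pending.registered = True


sent = []
registrar = Registrar(broadcast=lambda op, name: sent.append((op, name)))
registrar.preorder("alice.id", included_at=100)
registrar.on_new_block(104)  # 5 confirmations: too early
registrar.on_new_block(105)  # 6 confirmations: register goes out
registrar.on_new_block(106)  # already registered: nothing more is sent
```

Since the registrar only needs a stream of block heights and a broadcast endpoint, the same logic could run locally or on a remote service.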

Proposed Action Items

Today: Institute a moratorium on adding additional functionality to Blockstack Core (specifically REST API)
Today: Deprecate current Blockstack CLI. No longer encourage users to install it.
Today: Going forward, put new functionality in blockstack.js
Future: Start migrating existing Blockstack Core REST API functionality to blockstack.js
Future: Extract blockstackd functionality from Blockstack Core into separate repository
Future: Extract Gaia storage functionality from Blockstack Core into separate repository
Future: Come up with a plan for moving existing state off of the client (see above for some possible approaches)

jackzampolin · September 5, 2017, 7:10pm

Thank you for putting this together @larry. Great writeup!!! Reading through this I did have one question:

Aren’t we planning to have Gaia be a dumb drive? In that case how would there be nodes? Would the right word be more like stores?

larry · September 7, 2017, 5:06pm

That’s a good question. It seems like for collections there will need to be some sort of logic in storage nodes that enforce which apps are allowed to write or read which collections. I’ll let someone else expand on that though.

jackzampolin · September 7, 2017, 5:43pm

This sounds like an Identity Hub…

alidcastano · September 7, 2017, 9:19pm

This is a great write up. I experienced a similar confusion when experimenting with creating a dApp with Blockstack (created an issue here) and I’m glad to see the difficulties are being addressed.

Is there an exact issue open where we can track the progress of these changes? Also, will the current Blockstack.js API remain the same?

larry · September 9, 2017, 3:45am

Thanks for posting! There aren’t any issues created for this yet; we’re in the process of building consensus around these changes.

muneeb · September 11, 2017, 8:55pm

+1 this is really important. State should be in a user’s “trust zone”.

This needs to be fixed. Users should not have 2+ versions of Blockstack installed. There should be just 1 version and both the CLI and the browser should display the same info.

+1 as a product requirement.

Users may not want to make this data public. Or were you thinking encrypted?

Separating out the indexer and calling it the indexer would help a lot here.

+1 but would be hard to do in practice so need to watch out for this.

Let’s do it in our docs/instructions. The CLI is integrated with Core and would be hard to deprecate anytime soon.

+1 and we already have a repo for that! https://github.com/blockstack/gaia

RE Gaia and “dumb drives”, the backend storage providers are dumb drives but we might need a node that is on 24/7 for certain functionality. We don’t want to require a user to run a full Blockstack Core node for this and are thinking about a much more light-weight node. This node can be trusted by the user and can take actions on behalf of the user.

aaron · September 11, 2017, 10:54pm

I agree 100% with the above, and also in broad terms with the vision of separation you laid out. I have a couple of concerns though.

I am okay with storing provider information in the key file, however, I don’t think access tokens should be stored there.

Access tokens are, for the most part, intended to be used by a single client, not multiple clients (even of the same user), and their use as such can trigger account security measures. (An analogous situation would be trying to copy your browser’s cookies around between devices to keep you logged in on different websites.) Users today are already accustomed to having to re-authorize browsers on different devices; this is a similar situation. Access tokens are often treated as transient state (and usually given expirations), which means that a user is going to be experiencing “re-authorization” dialogues in any case. None of these are (even close to) unsolvable problems, but I’m just not convinced that we should be trying to solve them.

Preorder -> Register is a transition that you can’t figure out easily without local state (you can’t shove it in the keyfile, because you don’t know where the keyfile is until registration is finished).

If we do a protocol change, the preorder transaction can encrypt the name to be revealed, and then reveal it (details would need to be hashed out). Under the current protocol though, you would need to store the name in local state until the register has hit mempool.
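
The reason local state is needed between preorder and register can be shown with a simplified commit-reveal sketch. This is not Blockstack's actual preorder format; the salted SHA-256 commitment and the `PendingRegistrations` class are assumptions chosen only to illustrate the problem.

```python
import hashlib
import os
from typing import Dict, Tuple


def commitment(name: str, salt: bytes) -> str:
    """What a preorder publishes in this sketch: a hash that hides the name."""
    return hashlib.sha256(salt + name.encode("utf-8")).hexdigest()


class PendingRegistrations:
    """Local state the client must keep until the register is broadcast."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[str, bytes]] = {}

    def preorder(self, name: str) -> str:
        salt = os.urandom(16)
        digest = commitment(name, salt)
        # The chain only sees the digest; the name and salt live here.
        self._pending[digest] = (name, salt)
        return digest

    def reveal(self, digest: str) -> Tuple[str, bytes]:
        return self._pending[digest]

    def register_seen(self, digest: str) -> None:
        # Once the register has hit the mempool, the local entry can go.
        self._pending.pop(digest, None)


local_state = PendingRegistrations()
digest = local_state.preorder("alice.id")
name, salt = local_state.reveal(digest)
matches = commitment(name, salt) == digest
local_state.register_seen(digest)
```

If the client loses `local_state` after the preorder, the digest on chain is not enough to recover the name, which is the gap described above.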

How will we deal with CORS headers in profile lookups? We could have something like a “blockstack-resolver”, which is core.blockstack.org’s resolver and search. To reduce duplication of code, that could be a node server that uses blockstack.js.

I agree with this. I think we need to think through what functionality we want on core. As in, should we support the whole public API at core.blockstack.org? Because that’s search + profile resolution, which we can move out of core proper and into a node server.

I think moving users to install browser on Linux with the docker installation is maybe the way forward here, as that will hide the CLI from them a bit.

I want to add that we should hold off on starting a second ‘dumb CLI’ project before removing browser’s trust from Core, because that sounds like a good way to end up having multiple CLIs at the same time, which is worse than the situation we’re in now.

Yes!

The only thing holding me back from starting on this now (well, in my hobby-hacking-time) is that I think we need to figure out how we want to set up a regtest environment for the browser. We could have a blockstackd regtest mode, where the browser would then need to be configured to point at the local node (rather than a real resolver) and also a local bitcoind (and use it as a UTXO provider?). Does that setup make sense?

jackzampolin · September 11, 2017, 11:02pm

What about running regtest mode in a number of docker containers? This would make setup and teardown easy. I guess the only issue here is creating the browser and API configs to point to the right places.

We could also have the default path for regtest be something like $HOME/.blockstack-regtest to make switching environments really easy. The command could be ./launcher regtest or something like that. Also I’m super pumped about these changes!!! Going to make the platform MUCH easier to use.

Also just opened an issue related to this thread: https://github.com/blockstack/blockstack/issues/345

aaron · September 11, 2017, 11:04pm

Yeah, putting the regtest mode in a container is probably the simplest way to set it up.

larry · September 12, 2017, 3:06pm

Yes, encrypted.

Agree using access tokens across different devices would probably cause problems with security measures for something like dropbox and is probably not a problem we want to be solving.

I think the specifics of how this gets implemented depends on where we go with our storage service anyways.

Good point!

We can make the ability to set proper CORS headers on your profile file a requirement for being used as a storage driver in Gaia. That would make services like Dropbox unacceptable. It looks like Dropbox won’t work for us on mobile anyways.
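
Such a requirement could be checked with something like the sketch below, which inspects the response headers a storage provider returns for a profile file. `allows_cross_origin_read` is a hypothetical helper, and the origin and header values are made-up examples.

```python
from typing import Mapping


def allows_cross_origin_read(headers: Mapping[str, str], origin: str) -> bool:
    """True if the response lets a page served from `origin` read the file."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value.strip() for key, value in headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    return allowed == "*" or allowed == origin


app_origin = "https://app.example"  # hypothetical app origin

open_store = {"Access-Control-Allow-Origin": "*", "Content-Type": "application/json"}
pinned_store = {"access-control-allow-origin": "https://other.example"}
no_cors_store = {"Content-Type": "application/json"}

results = [
    allows_cross_origin_read(open_store, app_origin),
    allows_cross_origin_read(pinned_store, app_origin),
    allows_cross_origin_read(no_cors_store, app_origin),
]
```

A provider that cannot be configured to pass a check like this would fail the proposed requirement.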

+1000

Avoid this:

[image: clis]

Yes. Configuring the browser to point at the local node and local UTXO provider is pretty easy. Instead of the current bitcoind we use (compiled from source), we could use insight-api + bitcore + their forked bitcoind as the UTXO provider (assuming it works in regtest mode and has been updated to work with segwit).

jude · July 2, 2018, 4:01am

Good news everyone! We just merged feature/blockstack_client_refactor to develop in Blockstack Core! develop no longer contains blockstack_client—not even the test framework needs it anymore.

ShankarGanesh_PJ · July 3, 2018, 9:30am

Thanks @jude and this will help others.

alexc.id · January 9, 2019, 5:37am

Thanks for this writeup @larry, it’s very helpful. I agree with @aaron about the proposal–having the browser and CLI be different windows into the same core code.

At the moment, @prabhaav and I have Stealthy using a forked version of the CLI to run channels as a service (web browsers were unreliable in the environment we are using–and the login situation was severely limiting). Hopefully that continues to be functional as the CLI evolves or we are able to transition to the new blockstack.js based CLI (though I suspect that would be the case largely as many other systems would fail to operate).

willstSm · February 3, 2019, 8:17am

I’ve experienced it myself, so I fully understand all the concerns. Thanks for putting everything up.