Hey everyone! I just pushed a skeleton storage driver here. If you’re a developer and want to use your preferred storage system to host your Blockstack data, feel free to take a look and send us a PR with the driver implementation.
Background
Blockstack storage drivers are responsible for implementing a get/put/delete interface for two logical types of I/O: mutable data, and immutable data.
Mutable data is data that does NOT touch the underlying blockchain. Instead, mutable data is signed by a private key derived from the keypair listed in the user’s zone file. Most user data (profiles, application data stores) follows the mutable data I/O model, since mutable I/O can happen as fast as the storage service allows.
Immutable data is data that touches the underlying blockchain. Each ‘put’ and ‘delete’ corresponds to an on-chain transaction (specifically, a NAME_UPDATE transaction that modifies the user’s zone file). Similarly, each ‘get’ corresponds to a previously-sent transaction. Immutable data is appropriate for storing data that will only be written once, where freshness, integrity, and consistency are more important than I/O performance (examples include storing PGP keys, software releases, and certificates).
In practice, most storage drivers can implement the mutable I/O path and immutable I/O path the same way; the only difference between the two will be the interfaces. For example, the disk driver simply stores everything to disk, immutable or mutable.
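To make the "same implementation, different interfaces" point concrete, here is a minimal sketch of a disk-like driver in Python. The class name, method names, and file layout are all hypothetical and chosen for illustration; they are not the actual interface defined by the skeleton driver.

```python
import os
import tempfile


class DiskLikeDriver:
    """Hypothetical driver: one file-backed store behind both I/O paths."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    # --- shared backend used by both the mutable and immutable paths ---
    def _path(self, kind, key):
        # Hex-encode the key so any string becomes a safe file name.
        return os.path.join(self.root, kind + "-" + key.encode("utf-8").hex())

    def _put(self, kind, key, data):
        with open(self._path(kind, key), "wb") as f:
            f.write(data)
        return True

    def _get(self, kind, key):
        try:
            with open(self._path(kind, key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def _delete(self, kind, key):
        try:
            os.remove(self._path(kind, key))
        except FileNotFoundError:
            pass
        return True

    # --- the two interfaces differ only in name and key type ---
    def put_immutable(self, data_hash, data):
        return self._put("immutable", data_hash, data)

    def get_immutable(self, data_hash):
        return self._get("immutable", data_hash)

    def delete_immutable(self, data_hash):
        return self._delete("immutable", data_hash)

    def put_mutable(self, data_id, data):
        return self._put("mutable", data_id, data)

    def get_mutable(self, data_id):
        return self._get("mutable", data_id)

    def delete_mutable(self, data_id):
        return self._delete("mutable", data_id)


# Usage: both paths end up in the same directory.
driver = DiskLikeDriver(tempfile.mkdtemp())
driver.put_mutable("profile", b"v1")
```

The two kinds of data are kept apart only by a file name prefix; everything else is shared.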
Replication Strategy
Replication in Blockstack is best-effort. On a given put, some data may be successfully replicated to some storage providers, and some data may not. Blockstack automatically masks any inconsistencies that get introduced (see Responsibilities below). Blockstack uses three configuration fields in its config file to determine how to replicate data.
- blockstack-client.storage_drivers. This is the list of storage drivers to use to both read and write data. All of these drivers will be attempted on any get or put. A get or put is attempted on each driver in the order they are listed (but this may change in the future).
- blockstack-client.storage_drivers_required_write. This is the list of storage drivers that must successfully put data in order for a write to succeed. If even one of them fails, the entire write fails.
- blockstack-client.storage_drivers_local. This is the list of drivers that keep their data invisible to other clients. For example, the disk driver is listed here by default since writes to disk are invisible to other clients.
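The write rule described in the list above can be sketched roughly as follows. This is an illustration only, not Blockstack's actual code: the function name, the (name, put function) pairs, and the return shape are hypothetical, and treating a raised exception as a failed put is an assumption.

```python
def replicated_put(key, data, drivers, required_write):
    """
    Best-effort replication sketch (hypothetical helper).

    drivers:        ordered list of (name, put_fn) pairs, where
                    put_fn(key, data) returns True on success.
    required_write: names of drivers that must succeed.

    Returns (ok, succeeded_names).
    """
    succeeded = []
    for name, put_fn in drivers:
        # Every driver is attempted, in the order listed.
        try:
            ok = bool(put_fn(key, data))
        except Exception:
            # Assumption: an exception counts as a failed put.
            ok = False
        if ok:
            succeeded.append(name)

    # The write fails if even one required driver did not succeed.
    missing = set(required_write) - set(succeeded)
    return (not missing, succeeded)
```

Note that a failed write can still leave data behind on the drivers that did succeed, which is what "best-effort" means here.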
In order for put to work on mutable data, there must be at least one driver listed in blockstack-client.storage_drivers_required_write that is NOT listed in blockstack-client.storage_drivers_local.
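As a quick way to check this rule against a config, here is a small sketch. It assumes the config file is INI-style with a [blockstack-client] section and comma-separated driver lists; that layout, the helper names, and the example_cloud driver name are assumptions made for illustration.

```python
import configparser


def parse_list(value):
    # Assumption: driver lists are comma-separated.
    return [item.strip() for item in value.split(",") if item.strip()]


def can_put_mutable(config_text):
    """True if some required-write driver is not a local-only driver."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["blockstack-client"]
    required = parse_list(section.get("storage_drivers_required_write", fallback=""))
    local = set(parse_list(section.get("storage_drivers_local", fallback="")))
    return any(name not in local for name in required)


# "example_cloud" is a made-up driver name; "disk" comes from the text above.
EXAMPLE_OK = """
[blockstack-client]
storage_drivers = disk,example_cloud
storage_drivers_required_write = disk,example_cloud
storage_drivers_local = disk
"""

EXAMPLE_LOCAL_ONLY = """
[blockstack-client]
storage_drivers = disk,example_cloud
storage_drivers_required_write = disk
storage_drivers_local = disk
"""
```

With the second config, every required write lands only on local disk, so no other client could ever read the mutable data.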
There are no long-term plans for creating more sophisticated replication strategies. This is because more sophisticated strategies can be implemented as “meta drivers” that load existing drivers as modules, and forward get and put requests to them according to the desired strategy. For example, a “meta driver” could be written to turn data on put into erasure codes, replicate the erasure codes to multiple separate providers, and reconstruct the data on get even if some providers later go offline.
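To give a feel for the erasure-code idea, here is a toy meta driver sketch. It uses single-parity XOR (two data shards plus one parity shard), which is the simplest possible erasure code and only survives the loss of one provider; a real meta driver would more likely use something like Reed-Solomon. The class name is hypothetical, and plain dicts stand in for real storage drivers.

```python
import struct


def _xor(a, b):
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


class XorParityMetaDriver:
    """Hypothetical meta driver: 2 data shards + 1 XOR parity shard."""

    def __init__(self, providers):
        # providers: exactly three dict-like stores standing in for drivers.
        if len(providers) != 3:
            raise ValueError("this sketch needs exactly three providers")
        self.providers = providers

    def put(self, key, data):
        # Prefix the payload with its length so padding can be stripped on get.
        framed = struct.pack(">Q", len(data)) + data
        if len(framed) % 2:
            framed += b"\x00"
        half = len(framed) // 2
        first, second = framed[:half], framed[half:]
        shards = [first, second, _xor(first, second)]
        for provider, shard in zip(self.providers, shards):
            provider[key] = shard
        return True

    def get(self, key):
        shards = [provider.get(key) for provider in self.providers]
        if sum(shard is not None for shard in shards) < 2:
            return None  # too many providers lost this key
        first, second, parity = shards
        # Rebuild a missing data shard from the other shard and the parity.
        if first is None:
            first = _xor(second, parity)
        elif second is None:
            second = _xor(first, parity)
        framed = first + second
        (length,) = struct.unpack(">Q", framed[:8])
        return framed[8:8 + length]
```

Each provider stores roughly half of the data, so the total storage cost is about 1.5x instead of the 3x that full replication to three providers would need.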
Responsibilities
Blockstack handles a lot of higher-level storage responsibilities on its own, so the driver implementer can focus on interfacing with the storage provider and/or creating the desired replication strategy. The responsibilities are divided as follows:
- Consistency. Blockstack takes care of writing immutable data hashes to the zone file, and takes care of maintaining consistency info for mutable data. Specifically:
  - Blockstack guarantees per-key monotonic read consistency for mutable data (i.e. a get on a key returns the same or newer data as the previous get on the same key, but Blockstack does not guarantee that the get returns the data written by the last put on it).
  - A correct driver must guarantee per-key read-your-writes consistency (i.e. a put followed by a get on the same key should return the last-put data to the local client).
  - It is acceptable to rely on the storage system to enforce consistency. For example, most cloud storage providers claim to offer per-key sequential consistency already (i.e. a put followed by a get on the same key returns the data stored by the put to all clients). However, the driver must mask weak consistency by the storage provider if the provider cannot offer per-key read-your-writes consistency.
- Authenticity. Blockstack signs all data before giving it to the driver. The driver does not need to implement separate authenticity checks.
- Integrity. Similarly, Blockstack ensures that the data hasn’t been tampered with. No action is required by the driver.
- Data Confidentiality. Blockstack encrypts data before giving it to the driver, and decrypts it after loading it. However, Blockstack does not guarantee that all the data it writes will be encrypted (i.e. the user or application may specify that it is “public” data). If this is unacceptable, then the driver may take its own additional steps to ensure data confidentiality.
- Behavioral Confidentiality. Blockstack does NOT take any action to hide network-visible access patterns. Without assistance from the driver, someone watching the network can do timing analysis on the packets Blockstack sends and receives, and deduce things like the user’s network location and the application being used. If behavioral confidentiality is required, then the driver must take additional steps to implement it.
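Going back to the Consistency item: one way a driver could mask a provider that lacks read-your-writes consistency is to remember its own recent writes locally and serve them back for a bounded time. The sketch below shows that idea only; the class name, the callable-based backend, and the TTL value are all hypothetical, and it is not how any existing Blockstack driver is known to work.

```python
import time


class ReadYourWritesDriver:
    """Hypothetical wrapper that serves the local client's own recent writes."""

    def __init__(self, backend_put, backend_get, ttl=60.0, clock=time.monotonic):
        # backend_put(key, data) -> bool and backend_get(key) -> bytes or None
        # stand in for calls to the real storage provider.
        self._put = backend_put
        self._get = backend_get
        self._ttl = ttl          # how long to trust the local copy, in seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._recent = {}        # key -> (data, expiry time)

    def put(self, key, data):
        if not self._put(key, data):
            return False
        # Remember what we just wrote, in case the provider lags behind.
        self._recent[key] = (data, self._clock() + self._ttl)
        return True

    def get(self, key):
        entry = self._recent.get(key)
        if entry is not None:
            data, expires = entry
            if self._clock() < expires:
                return data      # our own write wins while it is fresh
            del self._recent[key]
        return self._get(key)
```

The trade-off is that, until the TTL runs out, the local client will not see newer writes made by other clients to the same key. The TTL should therefore be no longer than the provider's worst expected propagation delay.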