Privacy & Anonymity
Just because a Blockchain is private and permissioned does not mean that the parties involved wish to make all of their data and transactions visible to every other participant in the business network. There are many reasons why data might need to be kept private.
Competition
Blockchain-based business networks often foster coopetition: enterprises collaborate in the business network for a common benefit, while actively competing with each other. These are some of the healthiest and most dynamic ecosystems. However, knowledge of the details of a business transaction or trade, or the simple fact that a trade happened between a set of parties in the business network, might provide data that other parties could use to gain a competitive advantage.
Data Privacy
Some data is simply too sensitive to include on an immutable shared ledger. Consider the example of Sensitive Personal Information. This is often governed by strong data protection legislation such as GDPR, including controls that mean it must be deleted upon request from the owner of that data. Even if the data itself is not held on-chain, metadata leakage might still breach privacy. Consider tracing driver's license information on-chain: even if the identity is not widely available, direct knowledge of one citation might let you infer a link to the token that represents a person on-chain. From there you can see that person's full history.
Privacy Solutions
Practical solutions for privacy are available today, and there are maturing solutions that are gaining traction and evolving to become practical for Enterprise use.
Proof-only Solutions and HD Wallets
In cases where all the material actions for a transaction are coordinated off-chain (see the next topic on off-chain comms), it might only be required to put a record of agreement on-chain: a signature from each party involved, attesting that agreement has been reached. This can take the form of a simple proof readable only to the other parties involved.
- Sign some data that designates the agreement using your private key
- Encrypt that data with the public key of each party that needs to read the proof
- Put each of those encrypted proofs on-chain
- Once all parties have submitted their proofs on-chain, the agreement is immutably recorded
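The bookkeeping behind the last step can be sketched as follows. This is a minimal illustration, not a contract implementation: the AgreementRecord class and its method names are hypothetical, and the signing and encryption steps are assumed to have happened off-chain, so each proof is treated here as opaque ciphertext bytes.

```python
from dataclasses import dataclass, field


@dataclass
class AgreementRecord:
    """Toy stand-in for the on-chain record of one agreement (hypothetical)."""

    parties: frozenset[str]
    # (submitter, reader) -> signed-then-encrypted proof, opaque to everyone else
    proofs: dict[tuple[str, str], bytes] = field(default_factory=dict)

    def submit(self, submitter: str, encrypted_proofs: dict[str, bytes]) -> None:
        """Record one party's proofs: exactly one ciphertext per counterparty."""
        if submitter not in self.parties:
            raise PermissionError(f"{submitter} is not a party to this agreement")
        if set(encrypted_proofs) != self.parties - {submitter}:
            raise ValueError("exactly one proof per counterparty is required")
        for reader, ciphertext in encrypted_proofs.items():
            # setdefault keeps the first submission, mimicking an append-only ledger
            self.proofs.setdefault((submitter, reader), ciphertext)

    def is_agreed(self) -> bool:
        """True once every party has posted a proof for every other party."""
        return all(
            (a, b) in self.proofs
            for a in self.parties
            for b in self.parties
            if a != b
        )


record = AgreementRecord(parties=frozenset({"alice", "bob"}))
record.submit("alice", {"bob": b"<ciphertext readable by bob>"})
half_done = record.is_agreed()   # bob has not submitted yet
record.submit("bob", {"alice": b"<ciphertext readable by alice>"})
complete = record.is_agreed()
```

In a real deployment the equivalent checks would live in Smart Contract logic, and the submitter names would be replaced by masked addresses as described next.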
However, in order to avoid leaking information about who is involved in a transaction, you must first mask the identity that is submitting the transaction to the chain. It is possible in some cases to simply generate a completely random address to sign the Blockchain transaction. However, that does not allow for permissioning of who can submit signatures to the contract, or traceability back to who submitted the transaction. As such, a Hierarchical Deterministic (HD) Wallet is commonly used to generate a single-use Ethereum address that the owner can prove is their own if asked.
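To illustrate why such addresses are both unlinkable to outsiders and reproducible by their owner, here is a minimal sketch of hardened private key derivation in the style of BIP-32, using only the Python standard library. The derivation path and the single_use_key helper are assumptions made for illustration. Turning the resulting private key into an Ethereum address additionally requires secp256k1 point multiplication and Keccak-256, which are omitted here.

```python
import hashlib
import hmac

# Order of the secp256k1 group, the curve used for Ethereum keys.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000


def master_key(seed: bytes) -> tuple[int, bytes]:
    """Derive the master private key and chain code from a seed."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    key = int.from_bytes(digest[:32], "big")
    if key == 0 or key >= SECP256K1_N:
        raise ValueError("seed produces an invalid master key")
    return key, digest[32:]


def hardened_child(parent_key: int, chain_code: bytes, index: int) -> tuple[int, bytes]:
    """Derive the hardened child at `index` (0 <= index < 2**31)."""
    data = (
        b"\x00"
        + parent_key.to_bytes(32, "big")
        + (HARDENED + index).to_bytes(4, "big")
    )
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    tweak = int.from_bytes(digest[:32], "big")
    child = (tweak + parent_key) % SECP256K1_N
    if tweak >= SECP256K1_N or child == 0:
        raise ValueError("invalid child key, use the next index")
    return child, digest[32:]


def single_use_key(seed: bytes, tx_number: int) -> int:
    """Private key for one transaction on the hypothetical path m/44'/60'/tx_number'."""
    key, chain_code = master_key(seed)
    for index in (44, 60, tx_number):
        key, chain_code = hardened_child(key, chain_code, index)
    return key


seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
key_for_tx_1 = single_use_key(seed, 1)
key_for_tx_2 = single_use_key(seed, 2)
```

Each transaction number yields a different key that looks unrelated to the others, yet the holder of the seed can re-derive any of them on demand, which is the basis for later proving ownership.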
Private Transaction Managers & Enclaves
One of the most popular options for making transactions and data visible only to a subset of the participants is to use a Private Transaction Manager such as Tessera from Quorum (the successor to the original Constellation project in this space).
These work by sending off-chain point-to-point encrypted communications to select parties that are allowed to view the full input data for the transactions. These select parties can process the full transaction, and update their state to include the full results. Other parties in the network maintain a proof that the data was sent, a hash of the payload and the address of the sender, but cannot execute the transaction as they never receive the encrypted payload.
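The split between what goes on-chain and what stays off-chain can be sketched as below. This is a toy model under stated assumptions: the class and function names are hypothetical and are not the Tessera API, SHA-512 stands in for whatever hash the transaction manager uses, and the point-to-point encryption of the payload in transit is omitted.

```python
import hashlib


class PrivateTxManager:
    """One participant's off-chain payload store (toy model, hypothetical names)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.payloads: dict[str, bytes] = {}

    def store(self, payload: bytes) -> str:
        """Keep a payload locally and return the hash that identifies it."""
        digest = hashlib.sha512(payload).hexdigest()
        self.payloads[digest] = payload
        return digest


def send_private(sender, payload, recipients, chain):
    """Deliver the payload off-chain, then record only its hash on the shared chain."""
    digest = sender.store(payload)
    for recipient in recipients:
        recipient.store(payload)  # in practice: encrypted point-to-point delivery
    chain.append({"from": sender.name, "payload_hash": digest})
    return digest


def payload_for(manager, tx):
    """Return the private payload if this participant received it, else None."""
    return manager.payloads.get(tx["payload_hash"])


chain = []
alice = PrivateTxManager("alice")
bob = PrivateTxManager("bob")
carol = PrivateTxManager("carol")
send_private(alice, b"transfer 10 units to bob", [bob], chain)

tx = chain[0]
bob_view = payload_for(bob, tx)      # bob can execute the transaction
carol_view = payload_for(carol, tx)  # carol only sees the hash and the sender
```

Every participant sees the same entry on the chain, but only those holding the payload can execute the transaction and update their private state.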
Again, an HD Wallet can be combined with private transaction managers to submit the transactions, if knowledge of which parties are transacting is itself a privacy concern.
Key aspects of this approach that differentiate it from roll-your-own off-chain communications are:
- Agreed Smart Contract logic is used, just like any other transaction
- Installation of the Smart Contract logic is itself a private transaction
- The deterministic properties of the EVM are maintained - parties can compare Merkle roots of their contract instances to confirm they have the same state
- The state and events of the Smart Contracts can be processed just like any other Ethereum transaction
- Private Smart Contracts have read-only access to the all-participant chain
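The Merkle root comparison mentioned above can be illustrated with a simplified sketch. Ethereum itself uses Merkle Patricia tries with Keccak-256. The binary tree over sorted key/value pairs with SHA-256 shown here is a stand-in chosen so that the example runs with the standard library alone, and the state_root function is a hypothetical name.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def state_root(state: dict[str, str]) -> str:
    """Merkle root over a contract's key/value state (simplified binary tree)."""
    # Leaves are built from sorted keys so the root does not depend on insert order.
    level = [
        _h(b"\x00" + _h(key.encode()) + _h(value.encode()))
        for key, value in sorted(state.items())
    ]
    if not level:
        return _h(b"").hex()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Two parties that processed the same private transactions hold the same state.
party_a = {"owner": "0xabc", "balance": "100", "status": "open"}
party_b = {"status": "open", "balance": "100", "owner": "0xabc"}
# A party whose state has diverged produces a different root.
party_c = {"owner": "0xabc", "balance": "101", "status": "open"}

roots_match = state_root(party_a) == state_root(party_b)
roots_differ = state_root(party_a) != state_root(party_c)
```

Exchanging only the root lets parties detect divergence without revealing any of the underlying state values.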
Off-chain compute
A straightforward way to maintain privacy is to execute all logic and state transitions off-chain in environments where there is safe access to the data. After all, that's how transactional systems have been working for decades. The questions in the context of a decentralized solution are:
- How to invoke that off-chain processing of the data as part of the transaction
- The level of agreement on the outcome that is required in order for the transaction to complete
The answers to these questions are always going to be dependent on the business network. In a large number of cases, you can fully meet the requirements of the network using trusted bilateral communications (such as with App 2 App Messaging) coordinated with on-chain transactions at specific agreed and verifiable points. In fact this is extremely common in cases where the private processing is specific to the core systems of each business.
Take a simple example of a private off-chain action: "Verify the authenticity of this Passport" as part of a know your customer (KYC) use case. This might need to trigger a different business process for each participant, with different amounts of automation and human verification in each enterprise, each individually conforming to its own audited processes and regulatory requirements, but unique to the structure and systems of that business. The data for the passport itself is almost certainly not on-chain. Instead, maybe it's shared via a private documents store, or distributed as an encrypted payload over the IPFS decentralized file system.
It is possible to go a step further, and coordinate these private off-chain activities as mandatory steps initiated by on-chain Smart Contract logic. We talked about Chainlink in the Oracles section, as a robust system for triggering external processing off-chain in response to on-chain Smart Contract logic, and then bringing some data back onto the chain. In these more sophisticated cases, you need to consider the level of verification you make of the off-chain processing.
A simple example that works for many private permissioned systems is that the participant performing the action provides a signed proof that they performed it. This could be as simple as a transaction signed by their address, stating their compliance, or the answer to a simple question where the answer does not leak sensitive information (yes/no or threshold based). Here there is trust and/or accountability for answering correctly, tied back to the known organizational identity responsible in the business network.
There are some more elaborate schemes maturing in the blockchain space to provide enhanced proof that off-chain processing was executed correctly. Please contact us if you have requirements for your business network in these areas.
Trusted Execution Environments (TEEs)
A TEE provides evidence, an attestation, that proves the exact code that is being executed. The TEE is also able to lock sensitive data inside itself, such as an encryption key generated inside the TEE, so that it can perform processing on sensitive data without it being available, even in memory, to other processes on the machine. A TEE requires special hardware, setup and code to be available for the execution, so incorporating TEE-based execution in your solution involves some setup complexity. The emerging value here is the ability for shared compute to execute rich and complex off-chain processing with high throughput and verified outcomes. This means a single off-chain execution of attested code in a TEE can give equivalent assurance of the outcome of a transaction to executing it many times by the nodes of the chain in on-chain Smart Contract logic.
Multi Party Compute (MPC)
In MPC processing, the computation is distributed between multiple parties, where no individual party has access to the full data. This can allow summary calculations to be performed across a large data set, without revealing sensitive data about the individual pieces of data in that data set. This approach has seen most adoption in cases where individuals have self-sovereign control over their own data, and wish to provide limited access to that data for research, analytics, etc.
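One of the simplest MPC building blocks, additive secret sharing, shows how a summary calculation can be performed without any party seeing another's input. The sketch below simulates all parties in a single process. The modulus, function names and sample values are assumptions chosen for illustration, and a real protocol would also need secure channels between the parties.

```python
import secrets

# All arithmetic is done modulo a public prime larger than any possible total.
MODULUS = 2**61 - 1


def share(value: int, n_parties: int) -> list[int]:
    """Split a value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values: list[int]) -> int:
    """Compute the total of all inputs without any party seeing another's input."""
    n = len(private_values)
    # distributed[i][j] is the share of party i's input that is sent to party j.
    distributed = [share(value, n) for value in private_values]
    # Each party j adds up the shares it received; a partial sum reveals nothing
    # about any single input on its own.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are published and combined.
    return sum(partial_sums) % MODULUS


salaries = [52_000, 61_000, 47_500]  # each value is private to one party
total = secure_sum(salaries)
```

Any individual share is uniformly random, so the only information revealed is the final total.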
Zero Knowledge Proofs (ZKP)
You might have heard of zero-knowledge proofs such as zk-SNARKs (which powered Zcash), zk-STARKs, Range proofs, Bulletproofs and Sigma-Bullets. These are sophisticated mathematical techniques that can be used to generate mathematical proofs that computation has been performed correctly, without divulging the details or data of that computation. A simple example would be asking a question such as "Is the number between 10 and 20?", where off-chain compute that has access to a private number could generate a verifiable proof that the number is (or isn't) in that range, and submit that proof back without revealing the actual value of the number.
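A range proof is too involved for a short example, so the sketch below shows a simpler member of the same family: a non-interactive Schnorr proof, which proves knowledge of a secret exponent without revealing it. The group parameters are deliberately tiny toy values and offer no security, and the function names are assumptions for illustration only.

```python
import hashlib
import secrets

# Toy parameters: P = 2*Q + 1 is a safe prime and G generates the subgroup
# of prime order Q. These are far too small for real use.
P = 2039
Q = 1019
G = 4


def _challenge(public: int, commitment: int) -> int:
    """Fiat-Shamir challenge: a hash replaces the verifier's random challenge."""
    data = f"{G}|{public}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int) -> tuple[int, int, int]:
    """Prove knowledge of `secret` such that public = G**secret mod P."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q)
    commitment = pow(G, nonce, P)
    challenge = _challenge(public, commitment)
    response = (nonce + challenge * secret) % Q
    return public, commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check the proof using public values only; the secret is never seen."""
    challenge = _challenge(public, commitment)
    return pow(G, response, P) == (commitment * pow(public, challenge, P)) % P


public, commitment, response = prove(secret=123)
accepted = verify(public, commitment, response)
tampered = verify(public, commitment, (response + 1) % Q)
```

The verifier only ever handles the public value, the commitment and the response, and a single modular equation is enough to check the proof.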
The proofs are usually expensive to calculate, but much, much cheaper to verify. This means ZKPs are being considered as potential solutions for scaling the throughput of high-volume blockchain solutions, as well as for providing privacy. This is an exciting and fast-evolving field, and choosing and implementing the correct algorithm needs some care to ensure that the required level of security is obtained.