new blog: building fair webs of trust via ocap

2022-12-03 01:26:38 -06:00 · 41350afedf
parent 63ad030dad
commit 41350afedf
1 changed files with 176 additions and 0 deletions
--- a/content/blog/building-fair-webs-of-trust-by-leveraging-the-ocap-model.md
+++ b/content/blog/building-fair-webs-of-trust-by-leveraging-the-ocap-model.md
@@ -0,0 +1,176 @@
+---
+title: "Building fair webs of trust by leveraging the OCAP model"
+date: "2022-12-03"
+---
+
+Since the beginning of the Internet, determining the trustworthiness
+of participants and published information has been a significant
+point of contention.
+Many systems have been proposed to solve these underlying concerns,
+usually pertaining to specific niches and communities, but these
+pre-existing solutions are nebulous at best.
+How can we build infrastructure for truly democratic Webs of Trust?
+
+## Fairness in reputation-based systems
+
+When considering the design of a reputation-based system, *fairness*
+must be paramount, but what is *fairness* in this context?
+A reputation-based system can be considered *fair* if it appropriately
+balances the concerns of the data publisher, the data subject, and
+the data consumer.
+Regulatory frameworks such as the GDPR attempt to provide guidance
+concerning how this balance can be accomplished in the general sense
+of building internet services, but these frameworks are large and
+complicated, and as such make it difficult to provide a definition
+which is adequate for a reputation-based trust system.
+
+To understand how these concerns must be balanced, we must understand
+the underlying risks for each participant in a reputation-based system:
+
+- The **data subject** is at risk of harm to their professional
+  reputation due to annotations they did not consent to, and mistakes
+  in those annotations.
+  This is a problem which has already captured regulatory ire, as I
+  will explain later.
+- The **data publisher** is at risk of being sued for defamation due to
+  the annotations they publish.
+- The **data consumer** is at risk of being misled by inaccurate
+  annotations they consume.
+
+A *fair* reputation-based system must attempt to provide an adequate
+balance between these concerns through active harm reduction in its
+design:
+
+- The harm to the **data subject** from misleading annotations can be
+  reduced by blinding the identity of the data subject.
+- The harm to the **data publisher** from misleading annotations can
+  also be reduced by blinding the identity of the data subject.
+- The harm to the **data consumer** from misleading annotations can be
+  reduced by allowing them to consume annotations from multiple sources.
+
+## Shinigami Eyes, or how designing for fairness can be difficult
+
+The [Shinigami Eyes][se] browser extension was designed to help people
+establish trust in various web resources using a reputation-based system.
+In general, the author attempted to make thoughtful choices to ensure
+the system was reasonably fair in its design.
+However, the system has [a number of flaws, both technical and social][er],
+which highlight how building systems of trust requires a detailed
+understanding concerning how the underlying primitives interact and
+the consequences of those interactions.
+
+   [se]: https://shinigami-eyes.github.io/
+   [er]: https://eyereaper.evelyn.moe/
+
+### Shinigami Eyes and Blinding
+
+As already noted, a *fair* reputation-based system must blind the identity
+of the data subject to protect both the data subject and data publisher.
+The approach used by Shinigami Eyes was to use a bloom filter constructed
+with a 32-bit [`FNV-1a` hash][fnv].
+
+   [fnv]: http://www.isthe.com/chongo/tech/comp/fnv/index.html
+
+The FNV hashes are a family of non-cryptographic hashes, defined for
+sizes from 32 up to 1024 bits. The original `FNV-1` variant multiplies
+the current hash value by the designated FNV prime, then XORs the
+result with the current byte's value.
+The alternate `FNV-1a` variant swaps these steps, performing the XOR
+before the multiplication, and is the variant used by Shinigami Eyes.
+
+The use of a bloom filter is an acceptable blinding method, assuming that
+the underlying hash provides sufficient resolution, such as a 256-bit
+or 512-bit hash.
+Presumably, due to the constraints of having to run as a JavaScript extension,
+the weak 32-bit `FNV-1a` hash was used instead.
+Because of this, while the reputation lists used by Shinigami Eyes were
+acceptably blinded, there was an extremely [high risk of false positives
+caused by hash collisions][collided-account].
+
+   [collided-account]: https://twitter.com/x0s1jpnq2sk2
+
+Concerns about the technical implementation of the Shinigami Eyes extension
+led Datatilsynet, the Norwegian GDPR regulatory agency, to [ban the extension][se-ban]
+at the end of 2021, and development of the extension appears to have
+ended as a result of their initial inquiry.
+
+   [se-ban]: https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/
+
+## Can we build systems like Shinigami Eyes more robustly?
+
+Shinigami Eyes gained the attention of Datatilsynet mainly because of
+the centralized nature of its data processing.
+Can we build a system which avoids centralized data processing and promotes
+democratic participation?
+Yes, it is quite easy, but like most things, the challenge will be delivering
+a good user experience.
+
+### Leveraging the OCAP model to build a robust solution
+
+The largest problem in building this system is ensuring that the published
+reputation data is reliably blinded.
+To this end, I propose that each feed be a simple dataset containing a set of
+blinded hashes and annotations.
+The physical representation of the dataset does not matter, though keeping
+it as simple as possible will expand the number of places where the data
+can be consumed.
+
+In the Object Capability model, we can think of the physical feed as an
+*object*, and a blinding key as a *capability* to access that object in a
+useful way.
+Neither is useful alone: the feed is opaque without the key, and the key unlocks nothing without the feed.
+
+A participant can publish multiple copies of their feed, with different
+blinding keys for each friend they wish to share it with, or they can
+choose to publish a single copy and share the same key with every friend,
+or even the public at large.
+Users can then choose which feeds they want to use when making trust
+decisions from the collection of feeds and blinding keys they have been
+given.
+
+By comparison to Shinigami Eyes, this better satisfies the conditions for
+*fairness*: the risk of a false positive is negligible, the contents of the
+reputation lists remain private, and publishers can choose to consent to
+data sharing requests however they wish.
+
+### Choosing a reasonable set of primitives
+
+To build such a system, I would personally choose to use
+`HMAC-SHA3-256` as the blinding primitive.
+This provides a good balance between collision protection,
+cryptographic strength, and hash resolution.
+A scheme which provides less than 256 bits of hash resolution should
+be avoided due to the risk of collisions.
+
+I would distribute the feeds as CSV files.
+This would allow users the most flexibility in managing feeds: they
+could distribute different feeds with different meanings, and include
+extended data alongside the blinded hash as a form of annotation.
+
+On the client side, I would calculate blinded hashes for each
+possible prefix of the URI, all the way up to the parent domain.
+By doing so, it would be possible for feeds to match against a large
+number of child URIs instead of having to list them all manually.
+
+Implementations should store the learned hashes in a [radix trie][rt].
+This allows hash lookups to be done in time bounded by the fixed hash
+length (effectively constant), as well as allowing for automatic
+bucketing, which can be helpful for implementing quorum requirements.
+
+   [rt]: https://en.wikipedia.org/wiki/Radix_tree
+
+## Things we can build with this
+
+The use of friend-to-friend reputation-based systems can be powerful.
+They provide accountability (as you know who you are getting your
+data from) and collaboration (your friends can consume your data in
+exchange).
+
+They can be used in the way Shinigami Eyes was used: to allow interested
+parties to identify resources they should trust or distrust, but they can
+also be used to enable collaborative blocking amongst friends and system
+administrators.
+
+They can also be used to determine if e-mail domains or URLs inside e-mails
+are actually trustworthy.
+The possibilities are truly endless.
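
The two FNV variants discussed under "Shinigami Eyes and Blinding", and the reason a 32-bit hash collides so readily, can be illustrated with a short Python sketch. The constants are the published 32-bit FNV offset basis and prime. The list size of 100,000 entries is an assumed figure chosen purely for illustration, not a measurement of the real Shinigami Eyes lists, and the estimate covers raw hash collisions only (a Bloom filter's false-positive rate also depends on its size and number of hash functions).

```python
import math

FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193
MASK32 = 0xFFFFFFFF


def fnv1_32(data: bytes) -> int:
    """FNV-1: multiply by the prime first, then XOR in the byte."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h = (h * FNV32_PRIME) & MASK32
        h ^= byte
    return h


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR in the byte first, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & MASK32
    return h


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday approximation: chance that any two of n_items share a hash."""
    return -math.expm1(-n_items * (n_items - 1) / 2 ** (bits + 1))


# Assumed list size, for illustration only.
ASSUMED_ENTRIES = 100_000
p32 = collision_probability(ASSUMED_ENTRIES, 32)
p256 = collision_probability(ASSUMED_ENTRIES, 256)
```

Under that assumed list size, `p32` comes out well above one half while `p256` is vanishingly small, which is the gap the 256-bit recommendation is meant to close.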
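
The scheme outlined under "Choosing a reasonable set of primitives" can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the two-column CSV layout (blinded hash, annotation), the URI normalization (lowercased host, no scheme or query string, path prefixes only), the function names, and the example key are all hypothetical choices made for illustration. A plain dictionary stands in for the radix trie to keep the example short.

```python
from __future__ import annotations

import csv
import hashlib
import hmac
import io
from urllib.parse import urlsplit


def blind(key: bytes, value: str) -> str:
    """Blind a resource name with HMAC-SHA3-256 under the given key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha3_256).hexdigest()


def uri_candidates(uri: str) -> list[str]:
    """List the URI's prefixes, most specific first, ending at the host."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]
    return ["/".join([host, *segments[:n]]) for n in range(len(segments), -1, -1)]


def build_feed(key: bytes, entries: dict[str, str]) -> str:
    """Publisher side: emit CSV rows of (blinded hash, annotation)."""
    out = io.StringIO()
    writer = csv.writer(out)
    for resource, annotation in entries.items():
        writer.writerow([blind(key, resource), annotation])
    return out.getvalue()


def load_feed(text: str) -> dict[str, str]:
    """Consumer side: parse the CSV into a hash -> annotation mapping."""
    return {row[0]: row[1] for row in csv.reader(io.StringIO(text)) if row}


def lookup(uri: str, key: bytes, feed: dict[str, str]) -> str | None:
    """Return the annotation of the most specific matching prefix, if any."""
    for candidate in uri_candidates(uri):
        annotation = feed.get(blind(key, candidate))
        if annotation is not None:
            return annotation
    return None


# Hypothetical key and feed contents, for illustration only.
key = b"key-shared-with-one-friend"
feed = load_feed(build_feed(key, {"example.com/alice": "trusted"}))
match = lookup("https://Example.com/alice/posts/1?ref=x", key, feed)
no_match = lookup("https://example.com/alice/posts/1", b"some-other-key", feed)
```

Here a single feed entry for `example.com/alice` matches any URI beneath it, while a consumer holding a different key sees only opaque hashes and gets no match. That is the sense in which the key acts as a capability over the feed.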