---
title: "Building fair webs of trust by leveraging the OCAP model"
date: "2022-12-03"
---
Since the beginning of the Internet, determining the trustworthiness
of participants and published information has been a significant
point of contention.
Many systems have been proposed to solve these underlying concerns,
usually pertaining to specific niches and communities, but these
pre-existing solutions are nebulous at best.
How can we build infrastructure for truly democratic Webs of Trust?
## Fairness in reputation-based systems
When considering the design of a reputation-based system, *fairness*
must be paramount, but what is *fairness* in this context?
A reputation-based system can be considered *fair* if it appropriately
balances the concerns of the data publisher, the data subject, and
the data consumer.
Regulatory frameworks such as the GDPR attempt to provide guidance
concerning how this balance can be struck when building internet
services in general, but these frameworks are large and complicated,
which makes it difficult to derive from them a definition adequate
for a reputation-based trust system.
To understand how these concerns must be balanced, we must understand
the underlying risks for each participant in a reputation-based system:
- The **data subject** is at risk of harm to their professional
reputation due to annotations they did not consent to, and mistakes
in those annotations.
This is a problem which has already captured regulatory ire, as I
will explain later.
- The **data publisher** is at risk of being sued for defamation due to
the annotations they publish.
- The **data consumer** is at risk of being misled by inaccurate
annotations they consume.

A *fair* reputation-based system must attempt to provide an adequate
balance between these concerns through active harm reduction in its
design:
- The harm to the **data subject** from misleading annotations can be
reduced by blinding the identity of the data subject.
- The harm to the **data publisher** from misleading annotations can
also be reduced by blinding the identity of the data subject.
- The harm to the **data consumer** from misleading annotations can be
reduced by allowing them to consume annotations from multiple sources.
## Shinigami Eyes, or how designing for fairness can be difficult
The [Shinigami Eyes][se] browser extension was designed to help people
establish trust in various web resources using a reputation-based system.
In general, the author attempted to make thoughtful choices to ensure
the system was reasonably fair in its design.
However, the system has [a number of flaws, both technical and social][er],
which highlight how building systems of trust requires a detailed
understanding of how the underlying primitives interact and the
consequences of those interactions.

[se]: https://shinigami-eyes.github.io/
[er]: https://eyereaper.evelyn.moe/
### Shinigami Eyes and Blinding
As already noted, a *fair* reputation-based system must blind the identity
of the data subject to protect both the data subject and data publisher.
The approach used by Shinigami Eyes was to use a bloom filter constructed
with a 32-bit [`FNV-1a` hash][fnv].

[fnv]: http://www.isthe.com/chongo/tech/comp/fnv/index.html

The FNV family is a set of non-cryptographic hashes which scale up to
1024 bits of output. The base FNV-1 variant multiplies the current hash
value by a designated FNV prime and then XORs in the current byte;
FNV-1a, the variant used by Shinigami Eyes, swaps the multiplication
and XOR steps, XORing the byte in first and then multiplying.
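As a concrete sketch (not the extension's actual code), the 32-bit FNV-1a loop looks like this in Python:

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime."""
    h = 0x811C9DC5  # FNV-1a 32-bit offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h

# Only 32 bits of output: collisions among arbitrary URLs are inevitable.
print(hex(fnv1a_32(b"example.com")))
```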
The use of a bloom filter is an acceptable blinding method, assuming that
the underlying hash provides sufficient resolution, such as a 256-bit
or 512-bit hash.
Presumably, due to the constraints of having to run as a JavaScript extension,
the weak 32-bit `FNV-1a` hash was used instead.
Because of this, while the reputation lists used by Shinigami Eyes were
acceptably blinded, there was an extremely [high risk of false positives
caused by hash collisions][collided-account].

[collided-account]: https://twitter.com/x0s1jpnq2sk2
Concerns about the technical implementation of the Shinigami Eyes extension
led Datatilsynet, the Norwegian GDPR regulatory agency, to [ban the extension][se-ban]
at the end of 2021, and development of the extension appears to have
ended as a result of their initial inquiry.

[se-ban]: https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/
## Can we build systems like Shinigami Eyes more robustly?
The main reason Shinigami Eyes gained the attention of Datatilsynet was
the centralized nature of its data processing.
Can we build a system which avoids centralized data processing and promotes
democratic participation?
Yes, it is quite easy, but like most things, the challenge will be delivering
a good user experience.
### Leveraging the OCAP model to build a robust solution
The largest problem in building this system is ensuring that the published
reputation data is reliably blinded.
To this end, I propose that a feed be a simple dataset containing a set
of blinded hashes and annotations.
The physical representation of the dataset does not matter, though keeping
it as simple as possible will expand the number of places where the data
can be consumed.
In the Object Capability model, we can think of the physical feed as an
*object*, and a blinding key as a *capability* to access that object in a
useful way.
You have to have both in order for either to be useful.
A participant can publish multiple copies of their feed, with different
blinding keys for each friend they wish to share it with, or they can
choose to publish a single key and share the same key with every friend,
or even the public at large.
Users can then choose which feeds they want to use when making trust
decisions from the collection of feeds and blinding keys they have been
given.
Compared to Shinigami Eyes, this better satisfies the conditions for
*fairness*: the risk of false positives is negligible, the contents of
the reputation lists remain private, and publishers can choose how to
consent to data-sharing requests.
### Choosing a reasonable set of primitives
To build such a system, I would probably personally choose to use
`HMAC-SHA3-256` as the blinding primitive.
This provides a good balance between collision protection,
cryptographic strength, and hash resolution.
A scheme which provides less than 256 bits of hash resolution should
be avoided due to the risk of collisions.
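A minimal sketch of such a blinding function, using Python's standard library (the key value here is hypothetical; in practice each feed would have its own randomly generated key):

```python
import hashlib
import hmac

def blind(blinding_key: bytes, subject: str) -> str:
    """Blind a subject identifier (e.g. a normalized URI) under a feed's key.

    Without the key, the 256-bit digest reveals nothing useful about the
    subject; with it, a consumer can re-derive the digest and test membership.
    """
    return hmac.new(blinding_key, subject.encode("utf-8"),
                    hashlib.sha3_256).hexdigest()

print(blind(b"example-key", "example.com"))  # 64 hex characters
```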
I would distribute the feeds as CSV files.
This would allow users the most flexibility in managing feeds: they
could distribute different feeds with different meanings, and include
extended data alongside the blinded hash as a form of annotation.
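For example, a feed might pair each blinded hash with a free-form annotation. A sketch of the publish-and-consume round trip with Python's `csv` module (key and entries hypothetical):

```python
import csv
import hashlib
import hmac
import io

KEY = b"example-blinding-key"  # hypothetical per-feed blinding key

def blind(subject: str) -> str:
    return hmac.new(KEY, subject.encode("utf-8"),
                    hashlib.sha3_256).hexdigest()

# Publisher: write a two-column feed of (blinded hash, annotation).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([blind("example.com"), "trusted"])
writer.writerow([blind("tracker.invalid"), "distrusted"])

# Consumer: anyone holding the same key can re-blind a subject
# and look it up in the feed.
feed = {row[0]: row[1] for row in csv.reader(io.StringIO(buf.getvalue()))}
print(feed[blind("example.com")])  # -> trusted
```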
On the client side, I would calculate sets of blinded hashes for each
possible subset of the URI, all the way up to the parent domain.
By doing so, feeds could match against a large number of child URIs
without having to list them all manually.
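One way to enumerate those subsets, from the full URI down to the parent domain (a sketch; a real client would also normalize the URI first):

```python
from urllib.parse import urlparse

def uri_subsets(uri: str) -> list[str]:
    """List match candidates from most to least specific: each parent
    path under the host, the host itself, then each parent domain."""
    parts = urlparse(uri)
    host = parts.netloc
    segments = [s for s in parts.path.split("/") if s]
    # Path prefixes: host/a/b, host/a, ...
    candidates = [host + "/" + "/".join(segments[:i])
                  for i in range(len(segments), 0, -1)]
    # Domain suffixes: sub.example.com, example.com, ...
    labels = host.split(".")
    candidates += [".".join(labels[i:]) for i in range(len(labels) - 1)]
    return candidates

print(uri_subsets("https://sub.example.com/a/b"))
```

Each candidate would then be blinded and checked against the feeds, so a feed entry for `example.com` matches every page under it.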
Implementations should store the learned hashes in a [radix trie][rt].
Because the blinded hashes are fixed-length, lookups take time bounded
by the hash length rather than by the number of entries, and the trie
provides automatic bucketing, which can be helpful for implementing
quorum requirements.

[rt]: https://en.wikipedia.org/wiki/Radix_tree
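A full radix trie is beyond a short sketch, but the quorum idea itself can be illustrated with an ordinary hash map keyed by blinded hash (feed names and hashes hypothetical):

```python
from collections import defaultdict

def build_index(feeds: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each blinded hash to the set of feeds that list it."""
    index: dict[str, set[str]] = defaultdict(set)
    for feed_name, hashes in feeds.items():
        for h in hashes:
            index[h].add(feed_name)
    return index

def meets_quorum(index: dict[str, set[str]], h: str, quorum: int = 2) -> bool:
    """Only act on an annotation if enough independent feeds agree."""
    return len(index.get(h, set())) >= quorum

feeds = {"alice": {"aa11", "bb22"}, "bob": {"bb22"}}
index = build_index(feeds)
print(meets_quorum(index, "bb22"))  # -> True: two feeds agree
print(meets_quorum(index, "aa11"))  # -> False: only one feed lists it
```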
## Things we can build with this
The use of friend-to-friend reputation-based systems can be powerful.
They provide accountability (as you know who you are getting your
data from) and collaboration (your friends can consume your data in
exchange).
They can be used in the way Shinigami Eyes was used: to allow interested
parties to identify resources they should trust or distrust, but they can
also be used to enable collaborative blocking amongst friends and system
administrators.
They can also be used to determine if e-mail domains or URLs inside e-mails
are actually trustworthy.
The possibilities are truly endless.