---
title: "Building fair webs of trust by leveraging the OCAP model"
date: "2022-12-03"
---

Since the beginning of the Internet, determining the trustworthiness
of participants and published information has been a significant
point of contention.
Many systems have been proposed to solve these underlying concerns,
usually pertaining to specific niches and communities, but these
pre-existing solutions are nebulous at best.
How can we build infrastructure for truly democratic Webs of Trust?

## Fairness in reputation-based systems

When considering the design of a reputation-based system, *fairness*
must be paramount, but what is *fairness* in this context?
A reputation-based system can be considered *fair* if it appropriately
balances the concerns of the data publisher, the data subject, and
the data consumer.
Regulatory frameworks such as the GDPR attempt to provide guidance
concerning how this balance can be accomplished in the general sense
of building internet services, but these frameworks are large and
complicated, and as such make it difficult to provide a definition
which is adequate for a reputation-based trust system.

To understand how these concerns must be balanced, we must understand
the underlying risks for each participant in a reputation-based system:

- The **data subject** is at risk of harm to their professional
  reputation due to annotations they did not consent to, and mistakes
  in those annotations.
  This is a problem which has already captured regulatory ire, as I
  will explain later.
- The **data publisher** is at risk of being sued for defamation due to
  the annotations they publish.
- The **data consumer** is at risk of being misled by inaccurate
  annotations they consume.

A *fair* reputation-based system must attempt to provide an adequate
balance between these concerns through active harm reduction in its
design:

- The harm to the **data subject** from misleading annotations can be
  reduced by blinding the identity of the data subject.
- The harm to the **data publisher** from misleading annotations can
  also be reduced by blinding the identity of the data subject.
- The harm to the **data consumer** from misleading annotations can be
  reduced by allowing them to consume annotations from multiple sources.

## Shinigami Eyes, or how designing for fairness can be difficult

The [Shinigami Eyes][se] browser extension was designed to help people
establish trust in various web resources using a reputation-based system.
In general, the author attempted to make thoughtful choices to ensure
the system was reasonably fair in its design.
However, the system has [a number of flaws, both technical and social][er],
which highlight how building systems of trust requires a detailed
understanding of how the underlying primitives interact and
the consequences of those interactions.

[se]: https://shinigami-eyes.github.io/
[er]: https://eyereaper.evelyn.moe/

### Shinigami Eyes and Blinding

As already noted, a *fair* reputation-based system must blind the identity
of the data subject to protect both the data subject and data publisher.
The approach used by Shinigami Eyes was to use a bloom filter constructed
with a 32-bit [`FNV-1a` hash][fnv].

[fnv]: http://www.isthe.com/chongo/tech/comp/fnv/index.html

The FNV hashes are a family of non-cryptographic hashes, defined for
output sizes from 32 bits up to 1024 bits.
The original `FNV-1` variant processes each input byte by multiplying the
current hash value by the designated FNV prime, then XORing the result
with the byte's value.
The alternate `FNV-1a` variant swaps the two steps, performing the XOR first
and the multiplication second, and is the variant used by Shinigami Eyes.
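
To make the difference between the two variants concrete, here is a minimal sketch of both 32-bit hashes in Python.
The constants are the standard 32-bit FNV offset basis and prime; the function names are my own, and this is an illustration rather than the code Shinigami Eyes ships.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime
MASK32 = 0xFFFFFFFF              # keep the running hash within 32 bits


def fnv1_32(data: bytes) -> int:
    """FNV-1: multiply by the prime first, then XOR in the byte."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h = (h * FNV32_PRIME) & MASK32
        h ^= byte
    return h


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR in the byte first, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & MASK32
    return h
```

Either way, the output is only 32 bits wide and no secret key is involved, so anyone can recompute the hash of a candidate identifier.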

The use of a bloom filter is an acceptable blinding method, assuming that
the underlying hash provides sufficient resolution, such as a 256-bit
or 512-bit hash.
Presumably, due to the constraints of having to run as a JavaScript extension,
the weak 32-bit `FNV-1a` hash was used instead.
Because of this, while the reputation lists used by Shinigami Eyes were
acceptably blinded, there was an extremely [high risk of false positives
caused by hash collisions][collided-account].

[collided-account]: https://twitter.com/x0s1jpnq2sk2
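
To get a feel for why 32 bits is too little, the following sketch applies the usual birthday-bound approximation.
It assumes hash outputs are uniformly distributed, and it is not a model of the actual filter parameters used by Shinigami Eyes, which are not covered here; the function name and the list size of 100,000 are illustrative assumptions.

```python
import math


def collision_probability(items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of `items` hashes collide.

    Uses the birthday bound 1 - exp(-n(n-1)/2 / 2**bits), assuming
    uniformly distributed hash outputs.
    """
    pairs = items * (items - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# Hypothetical list size, chosen only for illustration.
p32 = collision_probability(100_000, 32)
p256 = collision_probability(100_000, 256)
```

Under these assumptions a collision somewhere in a 32-bit list of that size is more likely than not, while at 256 bits it is vanishingly unlikely.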

Concerns about the technical implementation of the Shinigami Eyes extension
led Datatilsynet, the Norwegian GDPR regulatory agency, to [ban the extension][se-ban]
at the end of 2021, and development of the extension appears to have
ended as a result of their initial inquiry.

[se-ban]: https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/

## Can we build systems like Shinigami Eyes more robustly?

The main reason Shinigami Eyes gained the attention of Datatilsynet was
the centralized nature of the data processing.
Can we build a system which avoids centralized data processing and promotes
democratic participation?
Yes, and it is quite easy, but like most things, the challenge will be delivering
a good user experience.

### Leveraging the OCAP model to build a robust solution

The largest problem in building this system is ensuring that the published
reputation data is reliably blinded.
To this end, I propose that a feed be a simple dataset containing a set of
blinded hashes and annotations.
The physical representation of the dataset does not matter, though keeping
it as simple as possible will expand the number of places where the data
can be consumed.

In the Object Capability model, we can think of the physical feed as an
*object*, and a blinding key as a *capability* to access that object in a
useful way.
You need both: the feed is opaque without its key, and the key is
useless without the feed.

A participant can publish multiple copies of their feed, with different
blinding keys for each friend they wish to share it with, or they can
choose to publish a single copy and share the same key with every friend,
or even the public at large.
Users can then choose which feeds they want to use when making trust
decisions from the collection of feeds and blinding keys they have been
given.

By comparison to Shinigami Eyes, this better satisfies the conditions for
*fairness*: there is no practical risk of a false positive, the contents of the
reputation lists remain private, and publishers can choose to consent to
data sharing requests however they wish.

### Choosing a reasonable set of primitives

To build such a system, I would probably choose
`HMAC-SHA3-256` as the blinding primitive.
This provides a good balance between collision protection,
cryptographic strength, and hash resolution.
A scheme which provides less than 256 bits of hash resolution should
be avoided due to the risk of collisions.
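
As a rough sketch of what this blinding could look like, the following uses Python's standard `hmac` and `hashlib` modules.
The `blind` helper, the per-friend key names, and the sample entries are all hypothetical and only serve to illustrate the idea.

```python
import hashlib
import hmac
import secrets


def blind(key: bytes, uri: str) -> str:
    """Blind a URI with HMAC-SHA3-256 and return the hex digest."""
    return hmac.new(key, uri.encode("utf-8"), hashlib.sha3_256).hexdigest()


# Hypothetical per-friend keys: each friend gets their own copy of the feed.
alice_key = secrets.token_bytes(32)
bob_key = secrets.token_bytes(32)

# Hypothetical entries the publisher wants to annotate.
entries = ["example.com/some/page", "example.org"]
feed_for_alice = {blind(alice_key, uri) for uri in entries}
feed_for_bob = {blind(bob_key, uri) for uri in entries}

# A consumer holding the right key can test a URI against the feed.
alice_sees_match = blind(alice_key, "example.org") in feed_for_alice
```

Because the key is mixed into every digest, the two copies share no values, so a copy is only useful to whoever holds the matching key.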

I would distribute the feeds as CSV files.
This would allow users the most flexibility in managing feeds: they
could distribute different feeds with different meanings, and include
extended data alongside the blinded hash as a form of annotation.
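
One possible shape for such a feed is sketched below.
The column names (`hash`, `label`, `note`) and the placeholder digests are assumptions made for the example, not a proposed standard.

```python
import csv
import io

# Hypothetical feed: a blinded hash followed by free-form annotation columns.
# The digests are placeholders, not real HMAC outputs.
SAMPLE_FEED = (
    "hash,label,note\n"
    + "aa" * 32 + ",trusted,reviewed by hand\n"
    + "bb" * 32 + ",distrusted,\n"
)


def load_feed(text: str) -> dict[str, dict[str, str]]:
    """Map each blinded hash to its remaining annotation columns."""
    feed = {}
    for row in csv.DictReader(io.StringIO(text)):
        blinded = row.pop("hash")
        feed[blinded] = row
    return feed


feed = load_feed(SAMPLE_FEED)
```

Since only the first column is significant for matching, publishers remain free to add whatever annotation columns suit their feed.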

On the client side, I would calculate blinded hashes for the URI and for
each of its parent prefixes, all the way up to the parent domain.
By doing so, it would be possible for a single feed entry to match against a large
number of child URIs instead of having to list them all manually.
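
A minimal sketch of how a client might enumerate those candidates follows.
The `candidate_keys` function and its normalization rules are assumptions; in particular, stopping at two domain labels is a naive stand-in for proper public-suffix handling.

```python
from urllib.parse import urlsplit


def candidate_keys(uri: str) -> list[str]:
    """List host+path, each shorter path prefix, then each parent domain."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]

    candidates = []
    # Path prefixes, most specific first: host/a/b, host/a, host.
    for depth in range(len(segments), -1, -1):
        candidates.append("/".join([host] + segments[:depth]))

    # Parent domains: drop one leading label at a time, but always keep
    # at least two labels (naive; ignores suffixes such as "co.uk").
    labels = host.split(".")
    for start in range(1, len(labels) - 1):
        candidates.append(".".join(labels[start:]))
    return candidates


keys = candidate_keys("https://social.example.com/users/alice")
```

Each candidate would then be blinded with the feed's key and looked up, so one entry for a domain can cover every page beneath it.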

Implementations should store the learned hashes in a [radix trie][rt].
Because every hash has the same fixed length, a lookup takes time bounded
by that length regardless of how many entries are stored, which is
effectively constant time.
The trie also allows for automatic bucketing, which can be helpful for
implementing quorum requirements.

[rt]: https://en.wikipedia.org/wiki/Radix_tree
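
The sketch below uses a plain, uncompressed trie over hex digits rather than a full radix trie, purely to keep the example short.
The class, its method names, and the reading of "bucketing" as "collect every feed that lists a hash" are my assumptions.

```python
class HashTrie:
    """Uncompressed trie keyed on the hex digits of a blinded hash."""

    def __init__(self) -> None:
        self.root: dict = {}

    def add(self, blinded_hex: str, feed_name: str) -> None:
        """Record that `feed_name` lists `blinded_hex`."""
        node = self.root
        for digit in blinded_hex:
            node = node.setdefault(digit, {})
        # The final node holds the bucket of feeds that listed this hash.
        node.setdefault("feeds", set()).add(feed_name)

    def feeds_for(self, blinded_hex: str) -> set[str]:
        """Return the feeds that list `blinded_hex` (empty if none)."""
        node = self.root
        for digit in blinded_hex:
            node = node.get(digit)
            if node is None:
                return set()
        return set(node.get("feeds", ()))

    def meets_quorum(self, blinded_hex: str, quorum: int) -> bool:
        """True when at least `quorum` distinct feeds list the hash."""
        return len(self.feeds_for(blinded_hex)) >= quorum


trie = HashTrie()
trie.add("ab12", "alice")
trie.add("ab12", "bob")
trie.add("ab34", "alice")
```

A lookup walks one node per hex digit, so its cost depends on the hash length and not on the number of stored entries.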

## Things we can build with this

The use of friend-to-friend reputation-based systems can be powerful.
They provide accountability (as you know who you are getting your
data from) and collaboration (your friends can consume your data in
exchange).

They can be used in the way Shinigami Eyes was used: to allow interested
parties to identify resources they should trust or distrust, but they can
also be used to enable collaborative blocking amongst friends and system
administrators.

They can also be used to determine if e-mail domains or URLs inside e-mails
are actually trustworthy.
The possibilities are truly endless.