new blog: building fair webs of trust via ocap
ci/woodpecker/push/woodpecker Pipeline failed
parent 63ad030dad
commit 41350afedf
@@ -0,0 +1,176 @@
---
title: "Building fair webs of trust by leveraging the OCAP model"
date: "2022-12-03"
---

Since the beginning of the Internet, determining the trustworthiness
of participants and published information has been a significant
point of contention.
Many systems have been proposed to solve these underlying concerns,
usually pertaining to specific niches and communities, but these
pre-existing solutions are nebulous at best.
How can we build infrastructure for truly democratic Webs of Trust?

## Fairness in reputation-based systems

When considering the design of a reputation-based system, *fairness*
must be paramount, but what is *fairness* in this context?
A reputation-based system can be considered *fair* if it appropriately
balances the concerns of the data publisher, the data subject, and
the data consumer.
Regulatory frameworks such as the GDPR attempt to provide guidance
concerning how this balance can be accomplished in the general sense
of building internet services, but these frameworks are large and
complicated, and as such make it difficult to provide a definition
which is adequate for a reputation-based trust system.

To understand how these concerns must be balanced, we must understand
the underlying risks for each participant in a reputation-based system:

- The **data subject** is at risk of harm to their professional
  reputation due to annotations they did not consent to, and mistakes
  in those annotations.
  This is a problem which has already captured regulatory ire, as I
  will explain later.
- The **data publisher** is at risk of being sued for defamation due to
  the annotations they publish.
- The **data consumer** is at risk of being misled by inaccurate
  annotations they consume.

A *fair* reputation-based system must attempt to provide an adequate
balance between these concerns through active harm reduction in its
design:

- The harm to the **data subject** from misleading annotations can be
  reduced by blinding the identity of the data subject.
- The harm to the **data publisher** from misleading annotations can
  also be reduced by blinding the identity of the data subject.
- The harm to the **data consumer** from misleading annotations can be
  reduced by allowing them to consume annotations from multiple sources.

## Shinigami Eyes, or how designing for fairness can be difficult

The [Shinigami Eyes][se] browser extension was designed to help people
establish trust in various web resources using a reputation-based system.
In general, the author attempted to make thoughtful choices to ensure
the system was reasonably fair in its design.
However, the system has [a number of flaws, both technical and social][er],
which highlight how building systems of trust requires a detailed
understanding of how the underlying primitives interact and
the consequences of those interactions.

[se]: https://shinigami-eyes.github.io/
[er]: https://eyereaper.evelyn.moe/

### Shinigami Eyes and Blinding

As already noted, a *fair* reputation-based system must blind the identity
of the data subject to protect both the data subject and data publisher.
The approach used by Shinigami Eyes was to use a bloom filter constructed
with a 32-bit [`FNV-1a` hash][fnv].

[fnv]: http://www.isthe.com/chongo/tech/comp/fnv/index.html

The FNV hashes are a family of non-cryptographic hashes, defined for
output sizes from 32 up to 1024 bits.
The original `FNV-1` variant processes each input byte by multiplying the
current hash value by the designated FNV prime, then performing an XOR of
the result against the byte's value.
The alternate `FNV-1a` variant swaps the multiplication and XOR steps,
and is the variant used by Shinigami Eyes.
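To make the difference between the two variants concrete, here is a minimal Python sketch of the 32-bit hashes, using the offset basis and prime published on the FNV page linked above. It is an illustration only, not the code shipped in the extension, and it assumes the input has already been encoded to bytes.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # published 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # published 32-bit FNV prime
MASK32 = 0xFFFFFFFF              # keep the running hash within 32 bits


def fnv1_32(data: bytes) -> int:
    """FNV-1: multiply by the prime first, then XOR in the byte."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h = (h * FNV32_PRIME) & MASK32
        h ^= byte
    return h


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR in the byte first, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & MASK32
    return h


if __name__ == "__main__":
    for sample in (b"", b"a", b"example.com"):
        print(sample, hex(fnv1_32(sample)), hex(fnv1a_32(sample)))
```

Either way, the output space is only 2^32 values, which is what matters for the collision discussion that follows.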

The use of a bloom filter is an acceptable blinding method, assuming that
the underlying hash provides sufficient resolution, such as a 256-bit
or 512-bit hash.
Presumably, due to the constraints of having to run as a JavaScript extension,
the weak 32-bit `FNV-1a` hash was used instead.
Because of this, while the reputation lists used by Shinigami Eyes were
acceptably blinded, there was an extremely [high risk of false positives
caused by hash collisions][collided-account].

[collided-account]: https://twitter.com/x0s1jpnq2sk2
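The collision risk described above can be put in rough numbers. The following is a back-of-the-envelope sketch rather than a measurement of Shinigami Eyes: the list size and lookup count are hypothetical, hash values are assumed to be uniformly distributed, and the bloom filter's own false-positive rate is ignored.

```python
import math


def false_match_probability(listed: int, hash_bits: int) -> float:
    """Chance that one unlisted identifier shares a hash value with any of
    `listed` entries, assuming uniformly distributed hash values."""
    per_entry = 2.0 ** -hash_bits
    # Equivalent to 1 - (1 - per_entry) ** listed, but computed in a way
    # that stays accurate when per_entry is extremely small.
    return -math.expm1(listed * math.log1p(-per_entry))


LISTED = 100_000       # hypothetical number of entries in a reputation list
LOOKUPS = 10_000_000   # hypothetical number of unlisted identifiers checked

if __name__ == "__main__":
    for bits in (32, 256):
        p = false_match_probability(LISTED, bits)
        print(f"{bits}-bit hash: p = {p:.3e}, "
              f"expected false matches = {p * LOOKUPS:.3e}")
```

Under these assumed numbers, a 32-bit hash produces false matches on the order of hundreds, while a 256-bit hash makes them vanishingly unlikely.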

Concerns about the technical implementation of the Shinigami Eyes extension
led Datatilsynet, the Norwegian GDPR regulatory agency, to [ban the extension][se-ban]
at the end of 2021, and development of the extension appears to have
ended as a result of their initial inquiry.

[se-ban]: https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/

## Can we build systems like Shinigami Eyes more robustly?

The main reason Shinigami Eyes gained the attention of Datatilsynet was
the centralized nature of its data processing.
Can we build a system which avoids centralized data processing and promotes
democratic participation?
Yes, it is quite easy, but like most things, the challenge will be delivering
a good user experience.

### Leveraging the OCAP model to build a robust solution

The largest problem in building this system is ensuring that the published
reputation data is reliably blinded.
To this end, I propose that a feed be a simple dataset containing a set of
blinded hashes and annotations.
The physical representation of the dataset does not matter, though keeping
it as simple as possible will expand the number of places where the data
can be consumed.

In the Object Capability model, we can think of the physical feed as an
*object*, and a blinding key as a *capability* to access that object in a
useful way.
You have to have both in order for either to be useful.

A participant can publish multiple copies of their feed, with a different
blinding key for each friend they wish to share it with, or they can
choose to publish a single copy and share the same key with every friend,
or even the public at large.
Users can then choose which feeds they want to use when making trust
decisions from the collection of feeds and blinding keys they have been
given.

By comparison to Shinigami Eyes, this better satisfies the conditions for
*fairness*: with a sufficiently wide hash the risk of a false positive
becomes negligible, the contents of the reputation lists remain private,
and publishers can choose to consent to data sharing requests however
they wish.
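The feed-as-object, key-as-capability idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it uses `HMAC-SHA3-256` as the keyed hash, represents a feed as an in-memory dictionary, and the function names (`blind`, `publish_feed`, `lookup`) and the sample identifier are hypothetical.

```python
import hashlib
import hmac
import secrets


def blind(key: bytes, identifier: str) -> str:
    """Keyed hash of an identifier; it cannot be matched without `key`."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha3_256).hexdigest()


def publish_feed(key: bytes, annotations: dict) -> dict:
    """Build the published object: blinded hash -> annotation."""
    return {blind(key, ident): note for ident, note in annotations.items()}


def lookup(feed: dict, key: bytes, identifier: str):
    """Consult a feed using the capability (blinding key) it was issued with."""
    return feed.get(blind(key, identifier))


# Hypothetical private annotations held by the publisher.
annotations = {"example.com/some-account": "distrust"}

# One blinding key, and therefore one copy of the feed, per friend.
alice_key = secrets.token_bytes(32)
bob_key = secrets.token_bytes(32)
feed_for_alice = publish_feed(alice_key, annotations)
feed_for_bob = publish_feed(bob_key, annotations)

if __name__ == "__main__":
    # Alice's copy is only useful together with Alice's key.
    print(lookup(feed_for_alice, alice_key, "example.com/some-account"))
    print(lookup(feed_for_alice, bob_key, "example.com/some-account"))
```

Because each copy is blinded under a different key, the two published feeds share no hash values, so holding one copy reveals nothing about whether another copy lists the same subject.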

### Choosing a reasonable set of primitives

To build such a system, I would probably choose to use
`HMAC-SHA3-256` as the blinding primitive.
This provides a good balance between collision protection,
cryptographic strength, and hash resolution.
A scheme which provides less than 256 bits of hash resolution should
be avoided due to the risk of collisions.

I would distribute the feeds as CSV files.
This would allow users the most flexibility in managing feeds: they
could distribute different feeds with different meanings, and include
extended data alongside the blinded hash as a form of annotation.
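As a rough sketch of what such a feed could look like, the following Python writes and reads a two-column CSV file of blinded hash and annotation. The column layout, the function names, and the sample entries are assumptions made for illustration; the post does not prescribe a format beyond CSV.

```python
import csv
import hashlib
import hmac
import io


def blind(key: bytes, identifier: str) -> str:
    """HMAC-SHA3-256 of the identifier, hex encoded."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha3_256).hexdigest()


def write_feed(key: bytes, entries) -> str:
    """Serialize (identifier, annotation) pairs as CSV rows of
    blinded_hash,annotation (an assumed column layout)."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    for identifier, annotation in entries:
        writer.writerow([blind(key, identifier), annotation])
    return buf.getvalue()


def read_feed(text: str) -> dict:
    """Parse a CSV feed back into blinded_hash -> annotation."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return {row[0]: row[1] for row in reader if row}


if __name__ == "__main__":
    key = b"hypothetical shared blinding key"
    feed_text = write_feed(key, [
        ("example.com", "distrust, see notes"),
        ("example.org/some-account", "trust"),
    ])
    print(feed_text)
    print(read_feed(feed_text).get(blind(key, "example.com")))
```

Using the `csv` module rather than splitting on commas means an annotation containing a comma is quoted on write and round-trips intact.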

On the client side, I would calculate sets of blinded hashes for each
possible prefix of the URI, from the full URI all the way up to the
parent domain.
By doing so, it would be possible for a single feed entry to match a large
number of child URIs instead of having to list them all manually.
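One way to enumerate those candidates is sketched below. The canonical form used here (lowercased host plus path segments, with the scheme, query, and fragment dropped) is an assumption, as is the naive rule of stopping at two domain labels; a real implementation would likely need a public suffix list and an agreed canonicalization.

```python
import hashlib
import hmac
from urllib.parse import urlsplit


def candidate_uris(uri: str) -> list:
    """List the URI and each of its parents, most specific first."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    segments = [s for s in parts.path.split("/") if s]

    candidates = []
    # Path prefixes, from the full path down to the bare host.
    for i in range(len(segments), -1, -1):
        path = "/".join(segments[:i])
        candidates.append(f"{host}/{path}" if path else host)
    # Parent domains, stopping naively once two labels remain.
    labels = host.split(".")
    for i in range(1, len(labels) - 1):
        candidates.append(".".join(labels[i:]))
    return candidates


def blinded_candidates(key: bytes, uri: str) -> list:
    """Blind every candidate with HMAC-SHA3-256 under one feed's key."""
    return [
        hmac.new(key, c.encode("utf-8"), hashlib.sha3_256).hexdigest()
        for c in candidate_uris(uri)
    ]


if __name__ == "__main__":
    for candidate in candidate_uris("https://blog.example.com/2022/12/post"):
        print(candidate)
```

A client would compute `blinded_candidates` once per feed key and treat a hit on any candidate as a match, so a feed entry for a domain also covers every page beneath it.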

Implementations should store the learned hashes in a [radix trie][rt].
Because every hash has the same fixed length, this bounds the cost of a
lookup by the hash length rather than by the number of stored hashes, as
well as allowing for automatic bucketing, which can be helpful for
implementing quorum requirements.

[rt]: https://en.wikipedia.org/wiki/Radix_tree
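The lookup and quorum idea can be illustrated with a plain byte-wise trie, which is an uncompressed simplification of a radix trie. The class and method names are hypothetical, and the interpretation of a quorum as "at least N distinct feeds list this subject" is an assumption; since each feed may use its own blinding key, the caller passes one digest per feed key.

```python
import hashlib

_FEEDS = "feeds"  # string key for leaf data; child keys are ints (0-255)


class HashTrie:
    """Byte-wise trie mapping fixed-length digests to the feeds listing them."""

    def __init__(self):
        self.root = {}

    def add(self, digest: bytes, feed: str) -> None:
        """Record that `feed` contains `digest`."""
        node = self.root
        for byte in digest:
            node = node.setdefault(byte, {})
        node.setdefault(_FEEDS, set()).add(feed)

    def feeds_for(self, digest: bytes) -> set:
        """Return the feeds containing `digest`; one step per digest byte."""
        node = self.root
        for byte in digest:
            node = node.get(byte)
            if node is None:
                return set()
        return set(node.get(_FEEDS, ()))

    def quorum_count(self, digests) -> int:
        """Count distinct feeds matching any of the given digests."""
        matched = set()
        for digest in digests:
            matched |= self.feeds_for(digest)
        return len(matched)


if __name__ == "__main__":
    # Stand-in digests; in practice these would be blinded hashes.
    d_alice = hashlib.sha3_256(b"subject under alice's key").digest()
    d_bob = hashlib.sha3_256(b"subject under bob's key").digest()

    trie = HashTrie()
    trie.add(d_alice, "alice")
    trie.add(d_bob, "bob")
    print(trie.quorum_count([d_alice, d_bob]) >= 2)
```

A compressed radix trie would collapse the long single-child chains this structure creates, reducing memory use without changing the lookup logic.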

## Things we can build with this

The use of friend-to-friend reputation-based systems can be powerful.
They provide accountability (as you know who you are getting your
data from) and collaboration (your friends can consume your data in
exchange).

They can be used in the way Shinigami Eyes was used: to allow interested
parties to identify resources they should trust or distrust.
They can also be used to enable collaborative blocking amongst friends
and system administrators.

They can also be used to determine if e-mail domains or URLs inside e-mails
are actually trustworthy.
The possibilities are truly endless.