The Tech Behind Our PolySwarm Arbiter

2018-12-17

Written by
Ben de Graaff & Menno Brendeke

Introduction

PolySwarm is a threat detection marketplace, built on Ethereum smart contracts, where antivirus companies and enterprises can expand and improve their protection coverage against new threats. In our previous blog, Making the Call: The First PolySwarm Arbiter, we described how Hatching became PolySwarm’s first arbiter and took a high-level look at arbitership in general. In this blog we dive deeper into the technical side of our arbitership and describe the whole process, from listening for bounties through to casting our vote.

This blog assumes that the reader has a basic knowledge of how a blockchain functions. If your knowledge is a bit rusty, you might want to read this article on blockchain first.

Technical Side of Arbitership

The Hatching Arbiter interacts with PolySwarm via a web service called polyswarmd. This service acts as an interface to the blockchain, so our arbiter does not need to deal with blockchain implementation details directly. It provides a REST-like API for fetching and publishing data, and it delivers blockchain events in real time over a WebSocket as simple JSON messages.
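
To make the event side of this more concrete, here is a minimal sketch of how incoming JSON messages could be routed to handlers. The message shape (an "event" name plus a "data" object), the "bounty" event name, and the "guid" field are assumptions made for this example, not a description of polyswarmd's actual message format. The WebSocket connection itself is left out; the sketch only covers what happens to each received text frame.

```python
import json
from typing import Callable, Dict

# Assumed message shape for illustration: {"event": "<name>", "data": {...}}
Handler = Callable[[dict], None]


class EventDispatcher:
    """Routes decoded JSON event messages to registered handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, event_name: str, handler: Handler) -> None:
        """Register the handler to call for a given event name."""
        self._handlers[event_name] = handler

    def dispatch(self, raw_message: str) -> bool:
        """Parse one text frame and route it; return True if it was handled."""
        try:
            message = json.loads(raw_message)
        except json.JSONDecodeError:
            return False  # ignore malformed frames instead of crashing
        if not isinstance(message, dict):
            return False
        event_name = message.get("event")
        if not isinstance(event_name, str):
            return False
        handler = self._handlers.get(event_name)
        if handler is None:
            return False  # an event type we do not care about
        handler(message.get("data", {}))
        return True


# Usage: record the identifier of every announced bounty.
seen_bounties = []
dispatcher = EventDispatcher()
dispatcher.on("bounty", lambda data: seen_bounties.append(data["guid"]))
dispatcher.dispatch('{"event": "bounty", "data": {"guid": "example-guid"}}')
```

Keeping the dispatch logic separate from the socket handling makes it easy to test with plain strings and to reconnect the WebSocket without touching the handlers.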

Event-based processing is an important part of the Hatching Arbiter implementation, since there are many asynchronous processes occurring at the same time. Many of the events described in the workflow below are both time-based (based on blockchain block progression) and dependent on the completion of a previous step. And since (network) errors are inevitable in a distributed system, retry logic is required to ensure smooth operation.
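
As an illustration of the kind of retry logic meant here, the sketch below retries a failing operation with exponential backoff. It is a generic pattern and not the Hatching Arbiter's actual code; the function name, the default number of attempts, and the delays are all assumptions chosen for the example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or the attempts are used up.

    The delay doubles after every failure: base_delay, 2 * base_delay, ...
    Exceptions not listed in retry_on are raised immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

In a block-driven workflow the total retry time has to stay well within the relevant window, so in practice the number of attempts and the delays would be tuned against the expected block time.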

The Workflow of a Bounty

A new bounty is announced via the WebSocket and is recorded by the Arbiter.
The Arbiter fetches the artifacts (samples) from polyswarmd, which in turn fetches them from IPFS.
The Arbiter submits the artifacts to its configured analysis backends. All our backends either support an HTTP-based REST-like API, or we provide a wrapper service for them. This way the Arbiter never directly interacts with the artifact or any analysis subprocesses, which allows us to run the Arbiter with minimal privileges.
Depending on how the analysis backends are configured, processing may take a while. For example, we can choose how long to examine an artifact using Cuckoo’s dynamic analysis, although this defaults to one or two minutes. Processing time is further dependent on the current workload. When completed, the analysis backend asynchronously reports its verdict to the Arbiter via an inbound HTTP call.
Once the verdicts from one or more backends (e.g., Cuckoo) have been received, the Arbiter can publish its vote. There is a specific window in which the Arbiter can vote, expressed as a number of blocks. This is the trickiest bit, since it effectively restricts how long our analyses can take. It also means that we need to scale our analysis backends to cope with a large number of incoming artifacts.
After the voting window has closed, the Arbiter can fetch the assertions made by other experts. These assertions are used to validate the Arbiter’s vote, and any voting discrepancies are tracked so that we can manually validate our verdict.
Additionally, after the voting window has closed, the bounty can be settled. This distributes the payment to all parties who have a stake in the bounty.
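
The block-based timing in these steps can be sketched as a small per-bounty record that decides what to do at each new block. This is only an illustration under assumptions: the field names, the half-open voting window, and the "malicious if any backend says so" policy are invented for the example and are not the Hatching Arbiter's actual rules.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrackedBounty:
    guid: str
    vote_opens_at: int   # assumed: first block in which a vote is accepted
    vote_closes_at: int  # assumed: first block in which voting is over
    expected_backends: List[str]
    verdicts: Dict[str, bool] = field(default_factory=dict)  # backend -> malicious?
    voted: bool = False

    def record_verdict(self, backend: str, malicious: bool) -> None:
        """Store the verdict a backend reported through its HTTP callback."""
        self.verdicts[backend] = malicious

    def next_action(self, current_block: int) -> str:
        """Return 'wait', 'vote' or 'settle' for the given block number."""
        if current_block >= self.vote_closes_at:
            return "settle"  # window closed: review assertions, then settle
        if self.voted or current_block < self.vote_opens_at:
            return "wait"
        if set(self.expected_backends) <= set(self.verdicts):
            return "vote"    # every configured backend has reported
        return "wait"        # still analysing; the window keeps ticking

    def vote(self) -> bool:
        """Illustrative policy: malicious if any backend flagged the artifact."""
        return any(self.verdicts.values())


# Usage: one backend, voting window covering blocks 100 up to (not including) 125.
bounty = TrackedBounty("example-guid", 100, 125, ["cuckoo"])
bounty.record_verdict("cuckoo", True)
action = bounty.next_action(current_block=110)  # "vote"
```

A record like this makes the scaling constraint visible: if a backend has not reported by the time the window closes, the bounty moves on to settlement without a vote from the Arbiter.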

Visually, the workflow of the arbiter looks something like this:

A new bounty is announced via the WebSocket and is recorded by the Arbiter.
The Arbiter fetches the artifacts from polyswarmd, which fetches them from IPFS.
The Arbiter submits the artifacts to its configured analysis backends.
The artifacts are processed, which typically takes one or two minutes.
Once the verdicts have been received, the Arbiter can publish its vote.
After the voting window has closed, the Arbiter can fetch assertions made by other experts.
Additionally, after the voting window has closed, the bounty can be settled.

All blockchain interactions (e.g., voting and settling) require a cryptographically signed transaction. Only the Arbiter has access to the signing key. The Arbiter uses polyswarmd to format JSON-based requests as proper blockchain transactions. Each transaction is then signed by the Arbiter and sent back to polyswarmd for publishing.
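
The split of responsibilities described above can be sketched as follows. The point of the example is the flow: a remote service builds the transaction, the Arbiter signs it locally, and only the signed result is sent out. Everything concrete in it is an assumption: the function names and the request fields are hypothetical, and the signer is a stand-in that uses HMAC-SHA256 from the Python standard library, whereas a real Ethereum transaction requires a secp256k1 signature.

```python
import hashlib
import hmac
import json
from typing import Callable, List


def sign_locally(unsigned_tx: dict, signing_key: bytes) -> dict:
    """Stand-in signer (HMAC-SHA256), NOT a real Ethereum signature."""
    payload = json.dumps(unsigned_tx, sort_keys=True).encode()
    signature = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return {"transaction": unsigned_tx, "signature": signature}


def vote_on_bounty(
    guid: str,
    malicious: bool,
    build_transactions: Callable[[dict], List[dict]],  # hypothetical remote call
    publish: Callable[[List[dict]], None],             # hypothetical remote call
    signing_key: bytes,
) -> List[dict]:
    """Ask the service for unsigned transactions, sign them here, publish them."""
    # 1. The service turns a JSON request into unsigned transactions.
    unsigned = build_transactions({"guid": guid, "vote": malicious})
    # 2. Signing happens inside the Arbiter; the key is never sent anywhere.
    signed = [sign_locally(tx, signing_key) for tx in unsigned]
    # 3. Only the signed transactions go back for publishing.
    publish(signed)
    return signed
```

Passing the two remote calls in as functions keeps the sketch independent of any particular HTTP client and makes the flow easy to exercise with fakes.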

Conclusion

The event-driven architecture of the Hatching Arbiter ensures smooth operation, as it can deal with processes that are time-based as well as processes that depend on the completion of a previous step.

Because we can rely on polyswarmd to handle the implementation details, our arbiter can be much leaner than it would otherwise have been. This also lowers the entry barrier for other arbiters that want to join PolySwarm’s movement and help out. The last blog of this series will therefore focus on how one can become an arbiter, as we believe it is important for the development of the marketplace to have multiple arbiters.
