Anonymous Voting – A Design Survey

Published: March 3^rd, 2019 (Originally at Medium)

Note: This article leads up to the design and WIP implementation of an anonymous voting protocol on Ethereum with zero-knowledge proofs for preserving anonymity.

The history of anonymous voting dates back thousands of years to the ancient Greeks and Romans. In their cultures, anonymous voting was used to exile individuals perceived as threats and also for general electoral matters. Preserving anonymity was and is important in past and present cultures. Anonymity allows individuals to make unprejudiced decisions on the basis that these decisions cannot come back to haunt the voter. In modern society, and more specifically modern democracies/republics, most public elections occur with almost anonymous voting. The almost part comes from various precautions to detect election fraud, such as numbering systems and other schemes for reverse-engineering a vote. All in all, they are mostly secret, but how did we get here?

Voting in the United States

Before we had the secret ballot or Australian ballot in the USA, we had public elections. Until the 1890s, some states even used oral ballot systems for determining elections, a process in which voters would congregate around government buildings and cast votes in a public manner. As was common in Australia and soon America around this era, local gangsters and criminals would intimidate voters in public and coerce different outcomes of elections. Soon, the consequences of public voting systems grew too large and the governments changed to their private counterparts.

The conditions of the updated secret ballot were [1]:

an official ballot being printed at public expense,
on which the names of the nominated candidates of all parties and all proposals appear,
being distributed only at the polling place and
being marked in secret.

There are some states that, according to their constitution, still use either mailed or open ballot systems, and all states allow the option of an absentee ballot. The states adopting different practices are:

West Virgina — open, sealed, or secret ballot decided by the voter
Colorado, Oregon, and Washington — entirely mailed ballots.

Consequences of a secret ballot.

As a result of improvements to privacy, anonymous voting at the level of secret ballots led to changes in voter turnout. Before the change, political parties had an incentive to pay people to vote, ideally for their party. This also closely aligns with that of a rational economic framework [2], where bribes were used to incentivize voter participation. Ultimately, the change to a secret ballot shielded the actual vote of voters from the general public, preventing political parties from the verification of an individual vote. Unable to verify a task they were paying people for, the parties have little to no incentive to continue making payments. As a result, the secret ballot caused a decrease in voter participation, though not because voters decided not to vote, but rather that a pool of voters who were incentivized to do so weren’t anymore.

Another recent study in 2012 sought to understand consequences of a secret ballot from the perspective of unmet expectations by voters [3]. The researchers designed an experiment to understand whether voters withheld participation because of their beliefs or rather lack thereof in the actual anonymity of votes. They found that nearly 3% of voters with histories of notparticipating did so because they didn’t believe the votes were truly secret. To me, this translates to a lack of incentive based on the alleged privacy of the system and a lack of trust.

Today, our election system is still not anonymous. There are mechanisms designed to reverse-engineer the voting process to uncover election fraud. This poses a risk for privacy-focused voter participation, if participation is a metric we hope to improve.

Designing a vote

There are many ways to design a vote, all with different qualities and consequences. Voting systems should be designed with an engineering mentality and with a heavy emphasis on security engineering. We should design voting systems from an adversarial perspective in hopes of creating a favorable environment. This means that we should analyze the weakest links of our system, since these weaknesses present the easiest opportunity for an attacker to exploit. In order to determine these weaknesses, we must begin the construction of the voting mechanism and analyze it, repeatedly improving the system until it is as robust as possible.

Participants

It is pertinent to first determine who should be participating in a vote because why else would we be designing one?

Citizens — Legal members of sovereign states and nations.
Organizations — And the members contained within. There could even be an interest in multi-organizational participation such as from participants in a trusted alliance like the Ethereum Enterprise Alliance.
Users of products — Sourced individuals who should have a say in the governance of future product iterations.

Threat Model

We might then ask: who wants to attack a voting system and how? This question can be answered differently at many different stages and sizes of elections. Obviously as elections grow in size, so too does the amount of resources required to attack it. We concern ourselves with the most powerful adversaries possible with the remark that as we decrease the size of the adversary, so too does the scale at which an attack occurs. That is, it is less likely for a nation-state like Russia to have large incentives in attacking your local cities election versus their incentive for attacking the presidential election. However, there are still adversaries at the local city scale with the incentives and capabilities of attacking such an election.

Adversaries — Nation-states, political parties, multi-national corporations, individuals with unbounded money and resources
Methods — Information warfare, election fraud, bribery

Tallying Strategies

Beyond threat modeling, there are other design decisions such as how to handle tallying of votes. Different strategies potentially affect the outcome of elections since knowledge of progress or lack thereof influences the beliefs voters have about the power of their vote. When coupled with a voter’s incentive, it can lead to undesirable effects on voter participation.

Public — Where progress can be viewed throughout the entire process such as on online polls (after submitting your vote of course).
Private — Where progress is hidden until the election has concluded such as in the popular vote for the next president in the United States.

Voting Strategies

Next, there is the decision to hold a public or anonymous vote. These are both quite common to us as we have most likely participated in one or the other or both. We witness, participate in, and hear news of both in the form of general popular vote elections (almost anonymous) and the electoral voting process (public), wherein our congressmen vote on new bills as well as select a new president.

Public — Where knowledge of who voted for what is public, with “who” being a strict, non-pseudonymous identity.
Pseudo-public — Where any identity and the respective identity’s vote is publicly linked. An example could be a pseudonym that proves membership in some organization and publicly votes.
Anonymous — Where knowledge of who voted and for what is unlinkable. This can also be extended from a pseudo-public setting where any individual that proves members in some organization votes anonymously.

Tooling

Finally, there are the tools at one’s disposal for designing a desirable mechanism. The tools for designing voting mechanisms exist in both physical and digital ecosystems since, after all, we have voting solutions in both manifestations: government elections and online polling respectively.

Physical — Tools such as written ballots, mailed ballots, voting booths, and actual people for conducting election.
Digital — Cryptography, websites, social media, and other software that interface with hardware for conducting elections in a digitized manner.

Most important to us for the rest of this article will be cryptography. We have laid out the groundwork for designing a voting system, and now we will build one.

A decentralized, anonymous voting mechanism.

Let’s now design a decentralized voting mechanism on Ethereum using zero-knowledge proofs to achieve true, cryptographic anonymity. If you are unaware of what zero-knowledge proofs are, I recommend reading up on thisand anything on zkp.science.

Background

Ethereum — Ethereum is a blockchain project that enables the execution of smart contracts, which can be thought of as programs running on a decentralized computer.

Zero-knowledge Proof of Knowledge — A ZKP is a cryptographic proof detailing knowledge of an input x to an output f(x) without revealing x.

Anonymous coin mixer/tumbler — An anonymous mixer or tumbler is a service that “mixes” or “tumbles” cryptocurrency to anonymize the trail of ownership. A user deposits coins into a mixer and withdraws them using different, unlinked addresses, preserving anonymity.

Merkle Tree — A merkle tree is a cryptographic binary tree data structure that is built by repeatedly hashing the children to create parent nodes, starting at the leaves or data nodes.

Mechanism Details

The voting mechanism we will build uses:

Anonymous voting strategy
Public tallying strategy

and is designed with these groups in mind:

Ethereum owners as the organization entity
Any adversary capable of finding and bribing token holders.

Mixer Protocol

The protocol that we will build is largely a derivative of an anonymous coin mixer on Ethereum, called Miximus developed by barryWhiteHat [4].

The mixer functions primarily through the novel usage of a merkle tree and non-interactive zero-knowledge proofs (ZK-SNARKs). The leaf nodes of the merkle tree contain the hash of a secret, which we will call H(x1,…,xN). The scheme works as follows:

First, a user deposits 1 ETH into the smart contract and creates a leaf node in the merkle tree with data H(x1,…,xN) with an address A.
Then, using the secret (x1,…,xN), the user creates a special zero-knowledge proof or ZK-SNARK, denoted p.
Finally, the user creates a new address B that is not linked in any way to Aand withdraws 1 ETH from the mixer, providing p as a proof that the user knows the secret contained in a leaf of the merkle tree.

Security of protocol. The protocol above is secure under the assumption that the Diffie-hellman discrete log problem is hard under the chosen hash function. This hardness prevents any adversary from finding the pre-images of the data of leaf nodes input into the tree. Moreover, the use of a ZKP allows the withdrawer to prove they deposited into the tree without disclosing which leaf is their corresponding deposit. Under the former assumption, finding out which leaf they eventually withdraw from is also hard.

Voting Protocol

We will design a voting protocol that is weighted against token holdings; that is, under a 1 coin, 1 vote paradigm. These are useful in polling the opinions/thoughts of the aggregate market of token holders of a specific token like Ether and have nice analogues to the voting capabilities of shareholders in publicly traded companies on the stock market. Our voting protocol requires a few changes but is derived from the protocol above.

First, a poller/developer deploys an Election smart contract with a future time cliff t for ending the election. The contract contains logic for voting.
Then, a user deposits 1 ETH into the smart contract and creates a leaf node in the merkle tree with data H(x1,…,xN) with an address A. This locks the user’s ETH until time t’ such that t<t’.
Then, using the secret (x1,…,xN), the user creates a special zero-knowledge proof or ZK-SNARK, denoted p.
Next, the user creates a new address B that is not linked in any way to A and submits their vote v, providing p as a proof that the user knows the secret contained in a leaf of the merkle tree.
Finally at any time t’ such that t<t’, the user withdraws the 1 ETH stuck in the contract to the same account that voted. This can be handled by storing voters on the contract and marking when various voting addresses have withdrawn or not.

Security of protocol. The protocol above is secure under the same cryptographic assumptions as the Mixer protocol. The reason the votes are not cryptographically linked to the addresses of the token holders is due to the fact that the tokens cannot be either. Following a vote, these new, unlinkable accounts can proceed in a similar fashion — mixing and voting — for as long as they desire.

However, now that we are using a token-weighted voting protocol, there is the potential for linking votes with token holders through statistical analysis. This, however, is largely unavoidable while tallying of votes remains public. In the scheme above, votes are recorded in plaintext and thus provide insight into the state of the market’s belief or side on a given issue, which can pose other issues in relation to voter incentives.

Example. There are 3 owners of tokens: Alice, Bob, and Charlie. Assume for the sake of argument that Alice, Bob, and Charlie purchased all the tokens in an ICO; that is, the tokens were sent to addresses of Alice, Bob, and Charlie. Since these fractions were disclosed through the on-chain transfer of tokens, we can potentially link the votes of a user depending on how many votes we have. Under a rational voter model, a truthful voter will vote for only their side of an issue. Thus, when and if they use all their tokens to vote, it is possible to link, with some probability, the likelihood that Alice voted a certain way AND controls the mixed tokens from the resulting protocol.

Say what? More security!

In the example described above, we’ve potentially run into an issue due to the public tally of our designated voting protocol. We can resolve this by changing a portion of the protocol, instead requiring a commitment to a vote. Our changes start at step 4, but we repeat its entirety for clarity. Recall H is our chosen hash function, we use (y1,…,yM) to denote randomness for obfuscating a committed vote.

First, a poller/developer deploys an Election smart contract with a future time cliff t for ending the election and a further time period Δ for the revealing of commitments. The contract contains logic for voting.
Then, a user deposits 1 ETH into the smart contract and creates a leaf node in the merkle tree with data H(x1,…,xN) with an address A. This locks the user’s ETH until time t’>t.
Then, using the secret (x1,…,xN), the user creates a special zero-knowledge proof or ZK-SNARK, denoted p.
Next, the user creates a new address B that is not linked in any way to A and submits a commitment to a vote H(v,y1,…,yM), providing p as a proof that the user knows the secret contained in a leaf of the merkle tree.
Finally at any time t’ such that t<t’<t + Δ, the user reveals their vote v’ along with the randomness (y’1,…,y’M) to be verified by the contract such that the following relation holds H(v,y1,…,yM)=H(v’,y’1,…,y’M). In the event this doesn’t hold, i.e. a user submits a false vote, the voter loses their deposit.
If ultimately there are unrevealed votes at time t’ such that t + Δ<t’, those deposits are also similarly slashed and kept as punishment. This provides an incentive for users to actually reveal their votes in the allotted time.

With hashed commitments, our election or poll is no longer publicly tallying. Instead, votes are revealed after the voting protocol has closed and tallied at the time t’ such that t<t’<t + Δ. The downside of this enhancement is that it requires more effort on behalf of the voters, though it provides added incentives or rather punishment against not revealing.

Applications

The protocols above are examples of a token based voting system, weighted by the number of tokens a voter has. It can be extended to allow larger casting of votes on arbitrary orders such as X coins for X votes.

We can utilize tokens as the de facto method for casting anonymous votes but alter the distribution process of those tokens. Perhaps, a trusted third party such as a basketball team can issue tokens to season ticket holders. Then using these tokens, 1 per season ticket holder, they can vote on new changes they would like to see in stadium offerings.

I leave most of this imagination up for another time. I think there are interesting applications with these protocols when combined with other identity solutions and wonder if they’re even possible given the structure of the protocols defined above.

Attacks

I will conclude the article with a discussion on where these voting systems fail. The downside of anonymous voting on blockchain based networks using the protocols defined above rests precisely on their anonymity. These problems potentially restrict the solution above to only successfully handle weighted-token voting schemes. The reason anonymity is a downside is because it opens up the possibility for bribery and election fraud.

Recall our previous ICO example with potentially many more token owners. Alice and Bob can collude off-chain and share secret information for verifying their respective ZKPs. Then, either of them can cast votes on behalf of newly anonymized accounts.

Season ticket holders can purchase votes from one another outside and away from the basketball team’s owners. The main reason is there is pure anonymity and thus no guarantee that the future owners of the tokens remain the same. After all, coin mixing services were designed to obscure the path of ownership of tokens. We simply abuse that principle to handle anonymous voting.

Furthermore, this attack vector makes it nearly impossible to handle reputation or identity within the protocol, since there should not be a way to transfer reputation or identity to anyone.

Protocol and proof extensions

On the contrary and with a more delicate design, there may be ways to build a system to allow for reputation and/or greater identity based voting solutions, beyond owning tokens. Some of these ideas lead to alternative weighted voting mechanisms where:

Users continue to submit ZKPs of ownership of a chain of accounts in the procedure, without disclosing which accounts they own.
Users submit ZKPs of reputation on accounts they own without disclosing those accounts, in a differentially-private manner.
Users submit ZKPs of identity that allows them ability to vote at all.

I will be working on an implementation here on my github and potentially in this repo.

Thanks to Saahil Jain for comments and edits.

If you have more ideas for anonymous voting protocols or interesting commentary, feedback, or questions, reach out!

References

https://en.wikipedia.org/wiki/Secret_ballot
Heckelman, Jac C. “The effect of the secret ballot on voter turnout rates.” Public Choice 82.1–2 (1995): 107–124.
Gerber, Alan S., et al. “Do perceptions of ballot secrecy influence turnout? Results from a field experiment.” American Journal of Political Science 57.3 (2013): 537–551.
https://github.com/barryWhiteHat/miximus

Drew Stone

Anonymous Voting

– A Design Survey

People

Cyberlaw Blogospace