Image credit: Wikipedia

Threshold Secret Sharing Schemes

The following is a re-upload of a post in my previous blog (which was destroyed by mistake).


This week’s topic is on secret sharing schemes, in particular threshold based ones. The topic was inspired by my recent viewing of a 7-week Coursera course, by Philip Klein called Coding the Matrix: Linear Algebra through Computer Science Applications. I’ve heard good reviews about the book (by the same name) and was thinking of checking it out. While doing so, I chanced upon the course by the author and just decided, why not?

In the video lectures, one of the videos mentioned about threshold secret sharing as an application of linear algebra. That fascinated me and reminded me of some brain teaser asked by friends some time back - “Can you design a lock-and-key system such that you need at least k out of n people to open a safe?” At that point in time, it was simply a brain teaser and I had no idea about it’s formalisation. The Coursera video gave me a bunch of keywords to look up - “threshold secret sharing”. And I did.

In this post, I will cover the following:

(k,n)-threshold secret sharing schemes

Scenarios

Scenario #1 A pirate crew found a chest full of treasure on an expedition but their ship has already reached maximum capacity. They agreed to bury the treasure and come back another day to retrieve it.

However, they are untrusting of each other - someone may come back earlier to loot all the treasure for himself/herself. Fortunately, there are several locks available. Assuming the only way to safely retrieve the chest contents is by removing the locks, how can the pirates devise a scheme to ensure that the chest can only be opened if sufficiently many crew members are present?

Scenario #2 Country ABC recently built a nuclear weapon. To prevent misuse, it requires sufficiently many generals to authorise the launch of the weapon. One possible solution is the employ the two-man rule. It ensures that at least 2 authorised personnels are present before a critical action is taken. However, can we generalise this further?

Formalisation

As Adi Shamir (one of the guys who gave a solution to the problem) states in his paper:

Threshold schemes are ideally suited to applications in which a group of mutually suspicious individuals with conflicting interests must cooperate.

There is a secret S which needs to be split across n parties. Each party is said to be given a key/share/shadow. S can be revealed if and only if at least k keys are present. In secret sharing schemes, S is a hidden integer1 and the encoding/form of keys differ from scheme to scheme.

In Scenario #1, S is the treasure, n is the number of crew members and k is the minimum number of crews required to be present. For example, consider n = 3 and k = 2. Then, 3 locks $L_1$, $L_2$, $L_3$ can be used with the following key distribution: Pirate$_1$ gets keys to $L_1$ and $L_2$, Pirate$_2$ gets keys to $L_1$ and $L_3$ and Pirate$_3$ gets keys to $L_2$ and $L_3$. Clearly, any single one of them cannot open the chest on their own while any combination of 2 or more pirates can successfully retrieve the treasure.

In Scenario #2, S is the act for launching the nuclear weapon, n is the number of army generals and k is the minimum number of generals present. One may think of each general having a launch key and the nuclear weapon only launches when enough keys are inserted and turned simultaneously. In a two-man rule situation, k = 2.

Variants, trade-offs and concerns about secret sharing

Here are some of them. Check out the Wikipedia page to learn more.

  • Reliability
    In the case where some keys are misplaced/destroyed/corrupted, can S still be recovered?

  • Convenience
    How easy is it to reveal S? Do we always need to trouble all n key holders?

  • Secrecy
    Does holding key(s) reveal information/properties about S even if S itself is not revealed?

  • Verifiability
    Is the system able to detect liars?

3 main approaches

Based on my literature review, there are 3 commonly cited ways of tackling the (k,n)-threshold secret sharing problem. Other papers often build upon these works.

  1. Blakley2
    Blakley’s approach relies on finding intersection of k-dimensional hyperplanes by solving a consistent system of linear equations.

  2. Shamir3
    Shamir stores the secret as the constant in a (k-1)-degree polynomial and uses polynomial interpolation to compute its value.

  3. Mignotte4 / Asmuth-Bloom5
    Mignotte’s and Asmuth-Bloom’s approaches are very similar. Both rely on solving a consistent system of (at least k) congruences via Chinese Remainder Theorem.

Chronologically, Blakley and Shamir came up with their approaches independently around the same time while Mignotte / Asmuth-Bloom had their results later. Blakley’s solution is less space efficient as each key is an equation describing a k-dimensional hyperplane as compared to the other 2 approaches which simply store a value as large as the secret S itself. Impact wise, Shamir’s paper has the most citations. It has many extensions in Cryptography such as secure multiparty computation.

I will review each approach in detail in the subsequent sections. Readers are assumed to have basic understanding of the following:

To make things concrete, I will go through a toy example near the end and provide a link to my implementations where I use the schemes to share a 256-bit integer secret S.

Blakley2

In the paper, Blakley studies how cryptographic keys should be handled. The front bulk of the paper discusses the motivation for threshold secret sharing schemes which can be summarised nicely in the last sentence of the first paragraph:

If too many copies are distributed one might go astray. If too few copies are made they might all be destroyed.

If one is interested purely on the technical details, skip to the last paragraph on the second page, from “The rest of the paper describes…” onwards. The approach described here is a geometric one.

Key idea

Fix a coordinate/index, say the first one. Then, the secret S is stored in a k-vector X with (k-1) other randomly generated values. Geometrically, this represents a point in a k-dimensional space.

Each key can be viewed as a k-dimensional hyperplane containing point X. To uncover the secret S, find the k-dimensional intersection point of at least k valid hyperplanes, then read off the first coordinate/index.

Splitting of secret Each of the n keys are generated as follows:

  1. Generate coefficients $a_1, … , a_k$
  2. Compute $(a_1, … , a_k) \cdot X = a_1 x_1 + … + a_k x_k = y$
    i.e. k-dimensional hyperplane equation: $a_1 x_1 + … + a_k x_k = y$
  3. Key = ( $a_1, … , a_k, y$ )

Out of all the n rows of $(a_1, … , a_k)$, at least k of them need to be linearly independent. This is to ensure unique reconstruction of the secret S later on. Generation of coefficients $a_1, … , a_k$ may be made public but the values of $y$ need to be hidden.

Combining of keys Given t keys,

  1. Stack the keys in rows to form the t-by-k matrix B and column t-vector y. i.e. B(i,j) = $a_j$ of $i^{th}$ key. y(i) = $y$ of the $i^{th}$ key.
  2. Solve for X in the system of linear equations BX = y. This is equivalent to finding the intersection of all the given hyperplanes.
  3. Extract S from the solved X at the previously fixed coordinate.

The system of linear equations will only have a unique solution if rank(B) = k. If less than k hyperplane equations are given, the intersection will result in a higher dimensional object than a point (for instance, a line). The combination phase will also fail if any key is corrupted or invalid. In either case, X cannot be recovered, hence S is safe.

One downside about this approach is that a set of less than k keys still reveal some information about the secret S. For every additional key, we know which hyperplanes the secret lies in (even though we may not be able to fully reveal S without sufficiently many hyperplane equations).

Implementation details

Construction of coefficient matrix A One common choice of coefficients for $(a_1, … , a_k)$ is the Vandermonde matrix due to it’s simplicity and nice properties. My implementation instead uses the Pascal matrix, the second suggested matrix by Hei, Du and Song6. They proved the correctness of their matrices and gave some nice properties. I chose to use the Pascal matrix because the values in the Pascal matrix grow much slower than those in the Vandermonde matrix, allowing for larger values of n and k.

Dealing with arbitrary precision In our problem setting, we are manipulating a 256-bit integer, the AES-256 key, which clearly does not fit into a typical integer or long data type. To deal with this kind of situation, we need to able to manipulate numbers up to arbitrary precision. In Java, one would rely on the BigInteger class. Python3’s native int handles arbitrary precision natively but not their float. Consider the following:

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 2 ** 3000 / 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: integer division result too large for a float
>>> 2 ** 3000 // 10
123023192216111717693155881327675251464071389573683371576611802916005880061467294877536006783859345958242964925405180490851288418089823682358508248206534833123495935035584501741302332011136066692262472823975688041643447831569367501341309075720869037679329665881066294182449348845172650530371291600534674790862370267348091935393681310573662040235274477690384047788365110032240930198348836380293054048248790976348409825394072868513204440886373475427121259247177864394948668851172105156197043278074745482377680846418069710308386181218434856552274019579668262220551184551208055201031005025580158934964592800113374547422071501368341390754277906375983387610135423518424509667004216072062941158150237124800843044718484209861032058041799220666224732872212208851364368390767036020916265367064113093699700217050067550137472399876600582757930072325347489061225013517188917489907991129151239977387217851901822998937

The latter works because // enforces integer division and precision is not lost.

Existing math packages Most programming languages have math libraries/packages that solve system of linear equations. In Python, people usually turn to NumPy and SciPy for mathematical and statistical computations. The problem is, neither of them support arbitrary precision (i.e. arithmetic operations on huge numbers will result in close but inaccurate results). Some people recommend mpmath on forums and stackoverflow but I have not had any success in using it.

Workaround During the reconstruction of S since we cannot simply make a library call to solve BX = y without loss of precision. Any form of division is delayed to the end and care is taken to ensure integer division is used to avoid loss of precision. For more details, please read the comments in my implementation.

Shamir3

Shamir’s paper is well-written — only 2 pages but direct to the point. Technical details begin on the second page.

Key idea

Consider the standard 2-dimensional plane.

  • Given a single point $(x_1, y_1)$, there is only 1 unique degree-0 (constant) polynomial that passes through it: $q(x) = y_1$
  • Given 2 points $(x_1, y_1)$ and $(x_2, y_2)$, there is only 1 unique degree-1 (linear) polynomial that passes through it: $q(x) = y_0 * \frac{x - x_1}{x_0 - x_1} + y_1 * \frac{x - x_0}{x_1 - x_0}$
  • Given k points $(x_1, y_1)$, $(x_2, y_2)$, … , and $(x_k, y_k)$, there is only 1 unique degree-(k-1) polynomial that passes through it: $q(x) = \sum_{i = 1}^k (y_i * \Pi_{j \neq i} \frac{x - x_j}{x_i - x_j})$

The secret S is stored as the constant term in a (k-1) degree polynomial $q(x) = a_0 + a_1x + … + a_{k-1}x^{k-1}$, i.e. S = q(0).

Splitting of secret Each of the n keys are generated as follows:

  1. Pick a prime p that is larger than S
  2. Generate random 256-bit coefficients $a_1, … , a_{k-1}$
  3. Evaluate $q(i) = a_0 + a_1x + … + a_{k-1}x^{k-1}$ (mod p), for $i = 1, 2, … , n$
  4. $i^{th}$ key = ($i, q(i)$)

Notice that each key only consists of 2 values - the point of evaluation and the value of $q(\cdot)$. This contrasts greatly with Blakley’s scheme where each key consists of (k+1) values. However, in Shamir’s scheme, the values of $a_1, … , a_{k-1}$ must be randomised and kept secret as the polynomial itself reveals the value of S.

Combining of keys Given t keys,

  1. Extract $(x_i, y_i)$ pairs from t keys
  2. Apply Lagrange interpolation on the t pairs to compute q(0) mod p

The evaluated value of q(0) will equal S if and only if at least k valid $(x_i, y_i)$ pairs are provided.

Implementation details

Care needs to be taken when handling coefficients and values that are 256-bit long. My choice of p is $2^{257} - 93$. I came across this here and verified it here.

Mignotte4 / Asmuth-Bloom5

Many future work commonly cite both although it seems the latter’s result is slightly stronger. Both works employ the Chinese Remainder Theorem and subject the choice of values to certain conditions to prove some properties.

Chinese Remainder Theorem

Let $m_1, … , m_k$ be positive integers that are pairwise coprime and let $M = \Pi_{i=1}^k m_i$. Then, for any given $a_1, … , a_k$, there exists a unique solution $x$ that satisfies the following system of simultaneous congruences in mod M:

  • $x \equiv a_1$ (mod $m_1$)
  • $x \equiv a_2$ (mod $m_2$)
  • $x \equiv a_k$ (mod $m_k$)

Mignotte

  • Pick a n pairwise coprime integers $m_1, … , m_n$, such that $m_1 < … < m_n$ (relabel if necessary)
  • Compute lower bound $a = \Pi_{i = n-k+2}^n m_i$
    ($a$ the product of largest $(k-1)$ $m$’s)
  • Compute upper bound $b = \Pi_{i = 1}^k m_i$
    ($b$ the product of the smallest $k$ $m$’s)
  • Ensure that $a < b$
    Otherwise, pick another set of n integers
  • Secret S must lie between $a$ and $b$. i.e. $a < S < b$

Asmuth and Bloom

  • Pick a n+1 pairwise coprime integers $m_1, … , m_n$, such that $m_1 < … < m_n$ (relabel if necessary), and an unconstrained $m_0$
  • Compute lower bound $a = \Pi_{i = n-k+2}^n m_i$
    ($a$ is the product of largest $(k-1)$ $m$’s)
  • Compute upper bound $M = b = \Pi_{i = 1}^k m_i$
    ($b$ the product of the smallest $k$ $m$’s)
  • Ensure that $m_0 * a < b$
    Otherwise, pick another set of n+1 integers
  • Randomly select a value A such that $S + A * m_0 \in [0,M)$
    i.e. $A \in [0, \frac{M - S}{p})$
  • Define $y = S + A * m_0$ and treat $y$ as the secret to be stored
  • To recover S, first recover $y$, then take modulo $m_0$

The purpose of ensuring S lies below upper bound b is to ensure a unique solution after applying the Chinese Remainder Theorem. The lower bound gives the property that having less than k congruence relations will yield multiple plausible values of S.

In Mignotte’s formulation, the number of possible values is proportional to the range of $\frac{b-a}{\text{Product of given $m_i$’s}}$. With proper selection of $m_i$’s, it is computationally infeasible to check every possible value for S. One downside about this approach is that a set of less than k keys still reveal some information about the secret S.

In Asmuth and Bloom’s formulation, having less than k congruences does not leak any information about S. This is because S could possibly lie within $[0, m_0]$ and $m_0$ was picked independently of S. This was what I meant by “the latter’s result is slightly stronger” earlier on.

Key idea

Here, we focus on the formulation due to Asmuth and Bloom.

Splitting of secret Based on S, pick n+1 positive integers $m_0, m_1, … , m_n$ such that:

  • $m_1 < … < m_n$
  • $m_i$ and $m_j$ are coprime for any $i \neq j$.
    i.e. $(m_i, m_j) = 1, \forall i \neq j$, where $i, j = 0, 1, 2, … , n$
  • $a = \Pi_{i = n-k+2}^n m_i$
  • $M = b = \Pi_{i = 1}^k m_i$
  • $m_0 * a < b$

Each of the n keys are generated as follows:

  1. Generate random A such that $A \in [0, \frac{M - S}{p})$
  2. Compute $y = S + A * m_0$
  3. $i^{th}$ key = ($m_i, y$ mod $m_i$)

There are many ways to generate infinite coprime sequences7, but one easy way to generate $m_1, … , m_n$ is to just pick n suitable primes along with a $m_0$ that works. The values of $m_i$ may be made public but the values of $A$, $y$, and $y$ mod $m_i$ need to be hidden.

Combining of keys Given t keys,

  1. Extract t congruence equations from t keys
  2. Solve them using Chinese Remainder Theorem to obtain $y$
  3. Obtain $S = y$ mod $m_0$

If less than k keys are given, the solution from Chinese Remainder Theorem is not unique and hence the secret S is safe.

Implementation details

This is by far the most tedious scheme to implement because of the conditions that need to be met. The conditions are actually pretty restrictive and are affected by the values of S, n and k.

Given a n and k (from say, a user input), it is non-trivial to choose $m_i$’s that satisfy the conditions for a random 256-bit secret S. It is really hard to satisfy $m_0 * a < b$. The worst case scenario happens when $k = \frac{n}{2}$. By that, I mean $b - m_0 * a$ is most negative when there is only 1 non-overlapping item:

One workaround is to split S up into smaller, more manageable 8-bit chunks and dictate that one key store 32 keys to smaller 8-bit secrets. Although there is only merely 256 possible secrets per chunk, the fact that each of the 32 chunks are supposedly independent (since they are part of a randomly generated AES key) makes it secure.

For my implementation, I limit n to be at most 10 and use:

  • The smallest n primes from the $91^{st}$ to $100^{th}$ primes as $m_1, … , m_n$
  • $m_0 = 2^8 = 256$ because we are dealing with 8-bit/1-byte chunks
    Since $m_0$ is a power of 2, it is clearly coprime with the other $m_i$’s

I have verified that they satisfy the conditions for any 8-bit S, n, and k as long as $2 \leq k \leq n \leq 10$.

A toy example

Let’s consider a concrete example and trace how each scheme handles it. For simplicity, let’s use a randomly generated 16-bit secret (i.e. S $\in [0, 65535]$) and reasonable values of n and k.

  • S = 9406
  • n = 5
  • k = 3

Blakley

Given: S = 9406, n = 5, k = 3

First, randomly generate 2 other 16-bit numbers: 55142, 238

$$ X = \begin{bmatrix} 9406 \\ 55142 \\ 238 \end{bmatrix} $$

Now, formulate the 5x3 Pascal matrix:

$$ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 3 & 6 \\ 1 & 4 & 10 \\ 1 & 5 & 15 \end{bmatrix} $$

Plug in and solve for y:

$$ AX = y = \begin{bmatrix} 64786 \\ 120404 \\ 176260 \\ 232354 \\ 288686 \end{bmatrix} $$

The 5 keys are then:

  • Key 1: (1, 1, 1, 64786)
  • Key 2: (1, 2, 3, 120404)
  • Key 3: (1, 3, 6, 176260)
  • Key 4: (1, 4, 10, 232354)
  • Key 5: (1, 5, 15, 288686)

Suppose now we have keys 2, 3 and 5. To reconstruct, first form B, then solve for X.

$$ B = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 6 \\ 1 & 5 & 15 \end{bmatrix} , \hspace{15pt}y = \begin{bmatrix} 120404 \\ 176260 \\ 288686 \end{bmatrix} $$

Since we are dealing with relatively small values here, we can invoke NumPy’s solver:

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> B = [[1,2,3],[1,3,6],[1,5,15]]
>>> y = [120404, 176260,288686]
>>> np.linalg.solve(B,y)
array([  9406.,  55142.,    238.])

Hence, we recover the secret S = 9406 successfully.

Shamir

Given: S = 9406, n = 5, k = 3

Pick a prime larger than S: 104729 (the $10000^{th}$ prime)

Generate 2 random 16-bit coefficients: 55142, 238 (reuse from before)

Our polynomial: $q(x) = 9406 + 55142 x + 238 x^2$

Evaluate $q(x)$ at points 1,2,3,4,5:

  • $q(1) = 64786 \equiv 64786$ (mod 104729)
  • $q(2) = 120642 \equiv 15913$ (mod 104729)
  • $q(3) = 176974 \equiv 72245$ (mod 104729)
  • $q(4) = 233782 \equiv 24324$ (mod 104729)
  • $q(5) = 291066 \equiv 81608$ (mod 104729)

The 5 keys are then:

  • Key 1: (1, 64786)
  • Key 2: (2, 15913)
  • Key 3: (3, 72245)
  • Key 4: (4, 24324)
  • Key 5: (5, 81608)

Suppose now we have keys 2, 3 and 5. To reconstruct, apply Lagrange interpolation to compute $q(0)$.

$$ \begin{array}{rcl} q(0) & = & \sum_{i = 1}^3 (y\_i * \Pi\_{j \neq i} \frac{x - x\_j}{x\_i - x\_j}) \\ & = & (y\_2 * \Pi\_{j \neq 2} \frac{x - x\_j}{x\_2 - x\_j}) + (y\_3 * \Pi\_{j \neq 3} \frac{x - x\_j}{x\_3 - x\_j}) + (y\_5 * \Pi\_{j \neq 5} \frac{x - x\_j}{x\_5 - x\_j}) \\ & = & y\_2 * \frac{(0 - x\_3)(0 - x\_5)}{(x\_2 - x\_3)(x\_2 - x\_5)} + y\_3 * \frac{(0 - x\_2)(0 - x\_5)}{(x\_3 - x\_2)(x\_3 - x\_5)} + y\_5 * \frac{(0 - x\_2)(0 - x\_3)}{(x\_5 - x\_2)(x\_5 - x\_3)} \\ & = & 15913 * \frac{(-3)(-5)}{(2 - 3)(2 - 5)} + 72245 * \frac{(-2)(-5)}{(3 - 2)(3 - 5)} + 81608 * \frac{(-2)(-3)}{(5 - 2)(5 - 3)} \\ & = & -200052 \\ & \equiv & 9406 \text{ (mod 104729)} \end{array} $$

Hence, we recover the secret S = 9406 successfully.

Asmuth-Bloom

Given: S = 9406, n = 5, k = 3

Fixed:

  • $m_0 = 2^8 = 256$
  • $m_1, … , m_{10} = 467, 479, 487, 491, 499, 503, 509, 521, 523, 541$
    (The $91^{st}$ to $100^{th}$ primes)

Split up into chunks of 8-bit: $9406 = 190 + 36 * 2^8$, so $[9406] \Rightarrow [190, 36]$

Pick $m_0, m_1, …, m_5 = 256, 467, 479, 487, 491, 499$

Verify that conditions are satisfied:

  • $a = \Pi_{i = n-k+2}^n m_i = \Pi_{i = 4}^5 m_i = 491 * 499 = 245009$
  • $M = b = \Pi_{i = 1}^k m_i = \Pi_{i = 1}^3 m_i = 467 * 479 * 487 = 108938491$
  • $62722304 = m_0 * a < b = 108938491$

For first chunk ($S_1$ = 190)

Generate a random A such that $A \in [0, \frac{M - S}{p}) = [0, 425504.23828125)$: 50481

Compute $y = S + A * m_0 = 190 + 50481 * 256 = 12923326$

Verify that $y = S + A * m_0 \in [0, M)$: $12923326 \in [0, 108938491)$

For second chunk ($S_2$ = 36)

Do likewise. Say, random A = 115520

Compute $y = S + A * m_0 = 36 + 115520 * 256 = 29573156$

Verify that $y = S + A * m_0 \in [0, M)$: $29573156 \in [0, 108938491)$

The 5 keys are then:

  • Key 1:
    (467, 12923326 mod 467, 29573156 mod 467) = (467, 35, 381)
  • Key 2:
    (479, 12923326 mod 479, 29573156 mod 479) = (479, 385, 175)
  • Key 3:
    (487, 12923326 mod 487, 29573156 mod 487) = (487, 294, 81)
  • Key 4:
    (491, 12923326 mod 491, 29573156 mod 491) = (491, 206, 226)
  • Key 5:
    (499, 12923326 mod 499, 29573156 mod 499) = (499, 224, 420)

Suppose now we have keys 2, 3 and 5. To reconstruct, we solve the following systems of congruences (1 for each chunk):

  • $S + A * m_0 \equiv 385$ (mod 479)
  • $S + A * m_0 \equiv 294$ (mod 487)
  • $S + A * m_0 \equiv 224$ (mod 499)

and

  • $S + A * m_0 \equiv 175$ (mod 479)
  • $S + A * m_0 \equiv 81$ (mod 487)
  • $S + A * m_0 \equiv 420$ (mod 499)

Apply the Chinese Remainder Theorem,

$$ S + A * m\_0 \equiv y\_2 * z\_2 * b\_2 + y\_3 * z\_3 * b\_3 + y\_5 * z\_5 * b\_5 \text{ (mod M)} $$

where

  • $M = 479 * 487 * 499 = 116403227$
  • $z_i = \frac{M}{m_i}$
    $z_2, z_3, z_5 = 243013, 239021, 233273$
  • $b_i = $Multiplicative inverse of $z_i$ in modulo $m_i$
    i.e. $b_i * z_i \equiv 1$ (mod $m_i$)
    $b_2, b_3, b_5 = 3, 279, 210$

For first chunk

$$ \begin{array}{rcl} S_1 + A * m_0 & = & 385 * 243013 * 3 + 294 * 239021 * 279 + 224 * 233273 * 210 \\\\ & = & 30859778481 \\\\ & \equiv & 12923326 \text{ (mod 116403227)} \end{array} $$

Hence, $S_1 \equiv S_1 + A * m_0$ (mod $m_0$) $\equiv 12923326$ mod 256 = 190

For second chunk

$$ \begin{array}{tcl} S_2 + A * m_0 & = & 175 * 243013 * 3 + 81 * 239021 * 279 + 420 * 233273 * 210 \\\\ & = & 26103896004 \\\\ & \equiv & 29573156 \text{ (mod 116403227)} \end{array} $$

Hence, $S_2 \equiv S_1 + A * m_0$ (mod $m_0$) $\equiv 29573156$ mod 256 = 36

Merging the chunks together

Putting the solved chunks together, we get: $$S_1 + S_2 * 2^8 = 190 + 36 * 256 = 9406$$ Hence, we recover the secret S = 9406 successfully.

In practice…

For actual usage, instead of trying to treat a file as a secret, it is better to encrypt the file using cryptography techniques such as AES and treat the decryption key as the secret. That way, the secret will be generally shorter, which allows faster sharing and re-combination.

Python3 implementation

I have implemented all 3 variants here in Python3. It encrypts and decrypts files using AES-256 via PyCrypto. AES is a symmetric-key algorithm (i.e. encryption key = decryption key). The secret S to be shared is not the original file, but the 256-bit AES key. Feel free to play around with it! :)

Encryption mode

  • Read a given input file (plaintext)
  • Encrypt it using AES-256 with a randomly generated 256-bit key
  • Store encrypted file (ciphertext)
  • Split the key into n parts with threshold set to k
  • Store keys to be given out

Decryption mode

  • Read in encrypted file (ciphertext)
  • Read in the given t keys
  • Attempt to combine the keys into the original 256-bit key
  • Decrypt ciphertext with combined key using AES-256
  • Store decrypted file (plaintext)

Obviously, decryption will succeed if and only if at least k valid keys are provided.

Sample execution:

>>> python3 main.py -scheme Blakley -encrypt -infile lenna.png -outfile cipher.png -keysfile keys.txt -n 7 -k 5

Split keys to 7 different parties
Some time passes...
Gather some keys and put in keys.txt

>>> python3 main.py -scheme Blakley -decrypt -infile cipher.png -outfile lenna_restored.png -keysfile keys.txt

Successful if and only if keys.txt contain >= 5 valid keys

>>> diff lenna.png lenna_restored.png

Files are identical!

Conclusions

We have looked at 3 different ways of tackling the (k,n)-threshold secret sharing problem. Although they may look like 3 vastly different approaches, they are actually closely related. For example, in Page 3 of Mignotte’s paper, he formulated Shamir’s solution under his setting and argued that “the Chinese Remainder Theorem is the Legendre Theorem on interpolation of polynomials”.

From a practical point of view, I would recommend Shamir’s approach. Here are a few reasons why:

  • Blakley’s approach is not space efficient and may run into numerical accuracy issues while solving linear equations. It also reveals information about the secret even when less than k valid keys are present.
  • Asmuth’s and Bloom’s approach is a tad too cumbersome to use. It is non-trivial to find a set of positive, pairwise coprimes integers that satisfy any arbitrary S, n and k. The compromise to an arbitrary S is to break it up into pre-defined chunk sizes but I still do not have a simple way to resolve the issue of arbitrary n and k.
  • Mignotte’s approach suffer the same drawback as Asmuth’s and Bloom’s. Furthermore, it reveals information about the secret even when less than k valid keys are present.

Going further

For those new to Linear Algebra or looking for a means to revise, consider the Coding the Matrix: Linear Algebra through Computer Science Applications that I mentioned at the beginning. While it covers materials in a different order as how I learnt it in NUS (still not sure which is pedagogically the best), it forces learners to embrace the idea of abstraction / “black box”, which I feel can be beneficial to some people. He also employs a few interesting examples (such as reducing the “Lights out” game to a linear algebra problem over GF2) throughout the course and relate new materials to them. The specific video that prompted this blog post is the last video of Week 5 titled “Dimension: Threshold secret sharing”. In it, he traces an example of Blakley’s algorithm in GF2 (but I didn’t know what sorcery it was at that point in time!).

Cryptography is a pretty interesting field. Tal Rabin has an interesting 2-part lecture on Secure Multiparty Computation8. Towards the end, she talks about homomorphic encryption.

References and Footnotes


  1. Every computable object can be encoded as a natural number. ^
  2. Blakley, G. R. (1899, December). Safeguarding cryptographic keys. In afips (p. 313). IEEE. ^
  3. Shamir, A. (1979). How to share a secret. Communications of the ACM, 22(11), 612-613. ^
  4. Mignotte, M. (1982). How to share a secret. In Cryptography (pp. 371-375). Springer Berlin Heidelberg. ^
  5. Asmuth, C., & Bloom, J. (1983). A modular approach to key safeguarding. IEEE transactions on information theory, 30(2), 208-210. ^
  6. Hei, X., Du, X., & Song, B. (2012, June). Two matrices for Blakley’s secret sharing scheme. In Communications (ICC), 2012 IEEE International Conference on (pp. 810-814). IEEE. ^
  7. Edwards, A. W. F. (1964). Infinite coprime sequences. The Mathematical Gazette, 48(366), 416-422. ^
  8. I especially like the “Game of Like” scheme that exploits cyclic shift. Pretty slick! ^