A fundamental component of many cryptographic protocols is the algebraic structure known as a field. Fields are sets of objects (usually numbers) with two associated binary operators and such that various field axioms hold. The real numbers are an example of a field with uncountably many elements.
Halo makes use of finite fields which have a finite number of elements. Finite fields are fully classified as follows:
- if is a finite field, it contains elements for some integer and some prime ;
- any two finite fields with the same number of elements are isomorphic. In particular, all of the arithmetic in a prime field is isomorphic to addition and multiplication of integers modulo , i.e. in . This is why we often refer to as the modulus.
We'll write a field as where . The prime is called its characteristic. In the cases where the field is a -degree extension of the field . (By analogy, the complex numbers are an extension of the real numbers.) However, in Halo we do not use extension fields. Whenever we write we are referring to what we call a prime field which has a prime number of elements, i.e. .
- There are two special elements in any field: , the additive identity, and , the multiplicative identity.
- The least significant bit of a field element, when represented as an integer in binary format, can be interpreted as its "sign" to help distinguish it from its additive inverse (negation). This is because for some nonzero element which has a least significant bit we have that has a least significant bit , and vice versa. We could also use whether or not an element is larger than to give it a "sign."
Finite fields will be useful later for constructing polynomials and elliptic curves. Elliptic curves are examples of groups, which we discuss next.
Groups are simpler and more limited than fields; they have only one binary operator and fewer axioms. They also have an identity, which we'll denote as .
Any non-zero element in a group has an inverse , which is the unique element such that .
For example, the set of nonzero elements of forms a group, where the group operation is given by multiplication on the field.
(aside) Additive vs multiplicative notation
If is written as or omitted (i.e. written as ), the identity as , and inversion as , as we did above, then we say that the group is "written multiplicatively". If is written as , the identity as or , and inversion as , then we say it is "written additively".
It's conventional to use additive notation for elliptic curve groups, and multiplicative notation when the elements come from a finite field.
When additive notation is used, we also write
for nonnegative and call this "scalar multiplication"; we also often use uppercase letters for variables denoting group elements. When multiplicative notation is used, we also write
and call this "exponentiation". In either case we call the scalar such that or the "discrete logarithm" of to base . We can extend scalars to negative integers by inversion, i.e. or .
The order of an element of a finite group is defined as the smallest positive integer such that (in multiplicative notation) or (in additive notation). The order of the group is the number of elements.
Groups always have a generating set, which is a set of elements such that we can produce any element of the group as (in multiplicative terminology) a product of powers of those elements. So if the generating set is , we can produce any element of the group as . There can be many different generating sets for a given group.
A group is called cyclic if it has a (not necessarily unique) generating set with only a single element — call it . In that case we can say that generates the group, and that the order of is the order of the group.
Any finite cyclic group of order is isomorphic to the integers modulo (denoted ), such that:
- the operation in corresponds to addition modulo ;
- the identity in corresponds to ;
- some generator corresponds to .
Given a generator , the isomorphism is always easy to compute in the direction; it is just (or in additive notation, ). It may be difficult in general to compute in the direction; we'll discuss this further when we come to elliptic curves.
If the order of a finite group is prime, then the group is cyclic, and every non-identity element is a generator.
The multiplicative group of a finite field
We use the notation for the multiplicative group (i.e. the group operation is multiplication in ) over the set .
A quick way of obtaining the inverse in is . The reason for this stems from Fermat's little theorem, which states that for any integer . If is nonzero, we can divide by twice to get
Let's assume that is a generator of , so it has order (equal to the number of elements in ). Therefore, for any element in there is a unique integer such that .
Notice that where can really be interpreted as where and . Indeed, it holds that for all . As a result the multiplication of nonzero field elements can be interpreted as addition modulo with respect to some fixed generator . The addition just happens "in the exponent."
This is another way to look at where comes from for computing inverses in the field:
Montgomery's trick, named after Peter Montgomery (RIP) is a way to compute many group inversions at the same time. It is commonly used to compute inversions in , which are quite computationally expensive compared to multiplication.
Imagine we need to compute the inverses of three nonzero elements . Instead, we'll compute the products and , and compute the inversion
We can now multiply by to obtain and multiply by to obtain , which we can then multiply by to obtain their respective inverses.
This technique generalizes to arbitrary numbers of group elements with just a single inversion necessary.
A subgroup of a group with operation , is a subset of elements of that also form a group under .
In the previous section we said that is a generator of the -order multiplicative group . This group has composite order, and so by the Chinese remainder theorem1 it has strict subgroups. As an example let's imagine that , and so factors into . Thus, there is a generator of the -order subgroup and a generator of the -order subgroup. All elements in , therefore, can be written uniquely as for some (modulo ) and some (modulo ).
If we have notice what happens when we compute
we have effectively "killed" the -order subgroup component, producing a value in the -order subgroup.
Lagrange's theorem (group theory) states that the order of any subgroup of a finite group divides the order of . Therefore, the order of any subgroup of must divide
PLONK-based proving systems like Halo 2 are more convenient to use with fields that have a large number of multiplicative subgroups with a "smooth" distribution (which makes the performance cliffs smaller and more granular as circuit sizes increase). The Pallas and Vesta curves specifically have primes of the form
with and odd (i.e. has 32 lower zero-bits). This means they have multiplicative subgroups of order for all . These 2-adic subgroups are nice for efficient FFTs, as well as enabling a wide variety of circuit sizes.
In a field exactly half of all nonzero elements are squares; the remainder are non-squares or "quadratic non-residues". In order to see why, consider an that generates the -order multiplicative subgroup of (this exists because is divisible by since is a prime greater than ) and that generates the -order multiplicative subgroup of where . Then every element can be written uniquely as with and . Half of all elements will have and the other half will have .
Let's consider the simple case where and so is odd (if is even, then would be divisible by , which contradicts being ). If is a square, then there must exist such that . But this means that
In other words, all squares in this particular field do not generate the -order multiplicative subgroup, and so since half of the elements generate the -order subgroup then at most half of the elements are square. In fact exactly half of the elements are square (since squaring each nonsquare element gives a unique square). This means we can assume all squares can be written as for some , and therefore finding the square root is a matter of exponentiating by .
In the event that then things get more complicated because does not exist. Let's write as with odd. The case is impossible, and the case is what we already described, so consider . generates a -order multiplicative subgroup and generates the odd -order multiplicative subgroup. Then every element can be written as for and . If the element is a square, then there exists some which can be written for and . This means that , therefore we have , and . would have to be even in this case because otherwise it would be impossible to have for any . In the case that is not a square, then is odd, and so half of all elements are squares.
In order to compute the square root, we can first raise the element to the power to "kill" the -order component, giving
and then raise this result to the power to undo the effect of the original exponentiation on the -order component:
(since is relatively prime to ). This leaves bare the value which we can trivially handle. We can similarly kill the -order component to obtain , and put the values together to obtain the square root.
It turns out that in the cases there are simpler algorithms that merge several of these exponentiations together for efficiency. For other values of , the only known way is to manually extract by squaring until you obtain the identity for every single bit of . This is the essence of the Tonelli-Shanks square root algorithm and describes the general strategy. (There is another square root algorithm that uses quadratic extension fields, but it doesn't pay off in efficiency until the prime becomes quite large.)
Roots of unity
In the previous sections we wrote with odd, and stated that an element generated the -order subgroup. For convenience, let's denote The elements are known as the th roots of unity.
The primitive root of unity, is an th root of unity such that except when .
If is an th root of unity, satisfies If then
Equivalently, the roots of unity are solutions to the equation
("Negation lemma"). Proof: Since the order of is , Therefore,
("Halving lemma"). Proof: In other words, if we square each element in the th roots of unity, we would get back only half the elements, (i.e. the th roots of unity). There is a two-to-one mapping between the elements and their squares.