Let be a polynomial over with formal indeterminate . As an example,
defines a degree- polynomial. is referred to as the constant term. Polynomials of degree have coefficients. We will often want to compute the result of replacing the formal indeterminate with some concrete value , which we denote by .
In mathematics this is commonly referred to as "evaluating at a point ". The word "point" here stems from the geometrical usage of polynomials in the form , where is the coordinate of a point in two-dimensional space. However, the polynomials we deal with are almost always constrained to equal zero, and will be an element of some field. This should not be confused with points on an elliptic curve, which we also make use of, but never in the context of polynomial evaluation.
- Multiplication of polynomials produces a product polynomial that is the sum of the degrees of its factors. Polynomial division subtracts from the degree.
- Given a polynomial of degree , if we obtain evaluations of the polynomial at distinct points then these evaluations perfectly define the polynomial. In other words, given these evaluations we can obtain a unique polynomial of degree via polynomial interpolation.
- is the coefficient representation of the polynomial . Equivalently, we could use its evaluation representation at distinct points. Either representation uniquely specifies the same polynomial.
(aside) Horner's rule
Horner's rule allows for efficient evaluation of a polynomial of degree , using only multiplications and additions. It is the following identity:
Fast Fourier Transform (FFT)
The FFT is an efficient way of converting between the coefficient and evaluation representations of a polynomial. It evaluates the polynomial at the th roots of unity where is a primitive th root of unity. By exploiting symmetries in the roots of unity, each round of the FFT reduces the evaluation into a problem only half the size. Most commonly we use polynomials of length some power of two, , and apply the halving reduction recursively.
Motivation: Fast polynomial multiplication
In the coefficient representation, it takes operations to multiply two polynomials :
where each of the terms in the first polynomial has to be multiplied by the terms of the second polynomial.
In the evaluation representation, however, polynomial multiplication only requires operations:
where each evaluation is multiplied pointwise.
This suggests the following strategy for fast polynomial multiplication:
- Evaluate polynomials at all points;
- Perform fast pointwise multiplication in the evaluation representation ();
- Convert back to the coefficient representation.
The challenge now is how to evaluate and interpolate the polynomials efficiently. Naively, evaluating a polynomial at points would require operations (we use the Horner's rule at each point):
For convenience, we will denote the matrices above as:
( is known as the Discrete Fourier Transform of ; is also called the Vandermonde matrix.)
The (radix-2) Cooley-Tukey algorithm
Our strategy is to divide a DFT of size into two interleaved DFTs of size . Given the polynomial we split it up into even and odd terms:
To recover the original polynomial, we do
Trying this out on points and , we start to notice some symmetries:
Notice that we are only evaluating and over half the domain (halving lemma). This gives us all the terms we need to reconstruct over the full domain : which means we have transformed a length- DFT into two length- DFTs.
We choose to be a power of two (by zero-padding if needed), and apply this divide-and-conquer strategy recursively. By the Master Theorem1, this gives us an evaluation algorithm with operations, also known as the Fast Fourier Transform (FFT).
So we've evaluated our polynomials and multiplied them pointwise. What remains is to convert the product from the evaluation representation back to coefficient representation. To do this, we simply call the FFT on the evaluation representation. However, this time we also:
- replace by in the Vandermonde matrix, and
- multiply our final result by a factor of .
In other words:
(To understand why the inverse FFT has a similar form to the FFT, refer to Slide 13-1 of 2. The below image was also taken from 2.)
The Schwartz-Zippel lemma
The Schwartz-Zippel lemma informally states that "different polynomials are different at most points." Formally, it can be written as follows:
Let be a nonzero polynomial of variables with degree . Let be a finite set of numbers with at least elements in it. If we choose random from ,
In the familiar univariate case , this reduces to saying that a nonzero polynomial of degree has at most roots.
The Schwartz-Zippel lemma is used in polynomial equality testing. Given two multi-variate polynomials and of degrees respectively, we can test if for random where the size of is at least If the two polynomials are identical, this will always be true, whereas if the two polynomials are different then the equality holds with probability at most .
Consider the order- multiplicative subgroup with primitive root of unity . For all we have In other words, every element of fulfils the equation
meaning every element is a root of We call the vanishing polynomial over because it evaluates to zero on all elements of
This comes in particularly handy when checking polynomial constraints. For instance, to check that over we simply have to check that is some multiple of . In other words, if dividing our constraint by the vanishing polynomial still yields some polynomial we are satisfied that over
Lagrange basis functions
TODO: explain what a basis is in general (briefly).
Polynomials are commonly written in the monomial basis (e.g. ). However, when working over a multiplicative subgroup of order , we find a more natural expression in the Lagrange basis.
Consider the order- multiplicative subgroup with primitive root of unity . The Lagrange basis corresponding to this subgroup is a set of functions , where
We can write this more compactly as where is the Kronecker delta function.
Now, we can write our polynomial as a linear combination of Lagrange basis functions,
which is equivalent to saying that evaluates to at , to at , to at and so on.
When working over a multiplicative subgroup, the Lagrange basis function has a convenient sparse representation of the form
where is the barycentric weight. (To understand how this form was derived, refer to 3.) For we have .
Suppose we are given a set of evaluation points . Since we cannot assume that the 's form a multiplicative subgroup, we consider also the Lagrange polynomials 's in the general case. Then we can construct:
Here, every will produce a zero numerator term causing the whole product to evaluate to zero. On the other hand, will evaluate to at every term, resulting in an overall product of one. This gives the desired Kronecker delta behaviour on the set .
Given a polynomial in its evaluation representation
we can reconstruct its coefficient form in the Lagrange basis:
Dasgupta, S., Papadimitriou, C. H., & Vazirani, U. V. (2008). "Algorithms" (ch. 2). New York: McGraw-Hill Higher Education.
Golin, M. (2016). "The Fast Fourier Transform and Polynomial Multiplication" [lecture notes], COMP 3711H Design and Analysis of Algorithms, Hong Kong University of Science and Technology.