Variable-base scalar multiplication

In the Orchard circuit we need to check where and the scalar field is with .

We have and , for .

Witness scalar

We're trying to compute for . Set and . Then we can compute

provided that , i.e. which covers the whole range we need because in fact .

Thus, given a scalar , we witness the boolean decomposition of (We use big-endian bit order for convenient input into the variable-base scalar multiplication algorithm.)

Variable-base scalar multiplication

We use an optimized double-and-add algorithm, copied from "Faster variable-base scalar multiplication in zk-SNARK circuits" with some variable name changes:

Acc := [2] T
for i from n-1 down to 0 {
    P := k_{i+1} ? T : −T
    Acc := (Acc + P) + Acc
}
return (k_0 = 0) ? (Acc - T) : Acc

It remains to check that the x-coordinates of each pair of points to be added are distinct.

When adding points in a prime-order group, we can rely on Theorem 3 from Appendix C of the Halo paper, which says that if we have two such points with nonzero indices wrt a given odd-prime order base, where the indices taken in the range are distinct disregarding sign, then they have different x-coordinates. This is helpful, because it is easier to reason about the indices of points occurring in the scalar multiplication algorithm than it is to reason about their x-coordinates directly.

So, the required check is equivalent to saying that the following "indexed version" of the above algorithm never asserts:

acc := 2
for i from n-1 down to 0 {
    p = k_{i+1} ? 1 : −1
    assert acc ≠ ± p
    assert (acc + p) ≠ acc    // X
    acc := (acc + p) + acc
    assert 0 < acc ≤ (q-1)/2
}
if k_0 = 0 {
    assert acc ≠ 1
    acc := acc - 1
}

The maximum value of acc is:

    <--- n 1s --->
  1011111...111111
= 1100000...000000 - 1

=

The assertion labelled X obviously cannot fail because . It is possible to see that acc is monotonically increasing except in the last conditional. It reaches its largest value when is maximal, i.e. .

So to entirely avoid exceptional cases, we would need . But we can use larger by if the last iterations use complete addition.

The first for which the algorithm using only incomplete addition fails is going to be , since . We need to make the wraparound technique above work.

sage: q = 0x40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001
sage: 2^253 + 2^252 - 1 < (q-1)//2
False
sage: 2^252 + 2^251 - 1 < (q-1)//2
True

So the last three iterations of the loop () need to use complete addition, as does the conditional subtraction at the end. Writing this out using ⸭ for incomplete addition (as we do in the spec), we have:

Acc := [2] T
for i from 253 down to 3 {
    P := k_{i+1} ? T : −T
    Acc := (Acc ⸭ P) ⸭ Acc
}
for i from 2 down to 0 {
    P := k_{i+1} ? T : −T
    Acc := (Acc + P) + Acc  // complete addition
}
return (k_0 = 0) ? (Acc + (-T)) : Acc  // complete addition

Constraint program for optimized double-and-add

Define a running sum , where and:

The helper . After substitution of , and , this becomes:

Here, is assigned to a cell. This is unlike previous 's, which were implicitly derived from , but never actually assigned.

The bits are used in three further steps, using complete addition:

If the least significant bit we set otherwise we set . Then we return using complete addition.

Let

Output .

(Note that represents .)

Incomplete addition

We need six advice columns to witness . However, since are the same, we can perform two incomplete additions in a single row, reusing the same . We split the scalar bits used in incomplete addition into and halves and process them in parallel. This means that we effectively have two for loops:

  • the first, covering the half for from down to , with a special case at ; and
  • the second, covering the half for the remaining from down to , with a special case at .

For each and half, we have three sets of gates. Note that is going from ; is NOT indexing the rows.

This gate is only used on the first row (before the for loop). We check that are initialized to values consistent with the initial where

This gate is used on all rows corresponding to the for loop except the last.

where

This gate is used on the final iteration of the for loop, handling the special case where we check that the output has been witnessed correctly. where

Complete addition

We reuse the complete addition constraints to implement the final rounds of double-and-add. This requires two rows per round because we need 9 advice columns for each complete addition. In the 10th advice column we stash the other cells that we need to correctly implement the double-and-add:

  • The base coordinate, so we can conditionally negate it as input to one of the complete additions.
  • The running sum, which we constrain over two rows instead of sequentially.

Layout

Constraints

In addition to the complete addition constraints, we define the following gate:

where .

LSB

Layout

Constraints

where .

Overflow check

cannot overflow for any , because it is a weighted sum of bits only up to , which is smaller than (and also ).

However, can overflow .

Note: for full-width scalar mul, it may not be possible to represent in the base field (e.g. when the base field is Pasta's and ). In that case, we need to special-case the row that would mention so that it is correct for whatever representation we use for a full-width scalar. Our representation for will be the pair . We'll use in place of for , constraining to 254 bits so that it fits in an element. Then we just have to generalize the argument below to work for (because the maximum value of is ).

Since overflow can only occur in the final step that constrains , we have . It is then sufficient to also check that (so that ) and that . These conditions together imply that as an integer, and so as required.

Note: the bits do not represent a value reduced modulo , but rather a representation of the unreduced .

Optimized check for

Since (also true if and are swapped), we have

We may assume that .

(This is true for the use of variable-base scalar mul in Orchard, where we know that . If is also true if we swap and so that we have . It is not true for a full-width scalar when .)

Therefore,

Given , we prove equivalence of and as follows:

  • shift the range by to give ;
  • observe that is guaranteed to be in and therefore cannot overflow or underflow modulo ;
  • using the fact that , observe that .

(We can see in a different way that this is correct by observing that it checks whether , so the upper bound is aligned as we would expect.)

Now, we can continue optimizing from :

Constraining to be all- or not-all- can be implemented almost "for free", as follows.

Recall that , so we have:

So are all exactly when .

Finally, we can merge the -bit decompositions for the and cases by checking that .

Overflow check constraints

Let . The constraints for the overflow check are:

Define

Witness , and decompose as .

Then the needed gates are:

where can be computed by another running sum. Note that the factor of has no effect on the constraint, since the RHS is zero.

Running sum range check

We make use of a -bit lookup range check in the circuit to subtract the low bits of . The range check subtracts the first bits of and right-shifts the result to give

Overflow check (general)

Recall that we defined , where .

cannot overflow for any , because it is a weighted sum of bits only up to and including . When and this sum can be at most , which is smaller than (and also ).

However, for full-width scalar mul, it may not be possible to represent in the base field (e.g. when the base field is Pasta's and ). In that case can overflow .

So, we need to special-case the row that would mention so that it is correct for whatever representation we use for a full-width scalar.

Our representation for will be the pair . We'll use in place of for , constraining to 254 bits so that it fits in an element.

Then we just have to generalize the overflow check used for variable-base scalar mul in the Orchard circuit to work for (because the maximum value of is ).

Note: the bits do not represent a value reduced modulo , but rather a representation of the unreduced .

Overflow can only occur in the final step that constrains , and only if has the bit with weight set (i.e. if ). If we instead set , now cannot overflow and should be equal to .

It is then sufficient to also check that as an integer where .

Represent as where we constrain and and to boolean. For this to be a canonical representation we also need .

Let .

If :

  • constrain and . This cannot overflow because in this case and so .

If :

  • we should have iff , i.e. witness as boolean and then
    • If then constrain .
      • This can be done by constraining either or . ( cannot overflow.)
    • If then constrain .
      • This can be done by constraining and .

Overflow check constraints (general)

Represent as as above.

The constraints for the overflow check are:

Note that the four 130-bit constraints marked are in two pairs that occur in disjoint cases. We can therefore combine them into two 130-bit constraints using a new witness variable ; the other constraint always being on :

( is unconstrained and can be witnessed as in the case .)

Cost

  • 25 10-bit and one 3-bit range check, to constrain to 253 bits;
  • 25 10-bit and one 3-bit range check, to constrain to 253 bits when ;
  • two times 13 10-bit range checks.