The problem

Given a non-negative integer nn and prime pp, count the number of binomial coefficients (ik)\binom{i}{k} for ini \le n that are not divisible by pp.The original problem was presented as a code golf challenge. It turns out that the same problem already exists on Project Euler.

To solve this problem, we will invoke Lucas’s Theorem which allows us to test the divisibility of every binomial coefficient in a particular row of Pascal’s Triangle. Consider the base pp expansion of non-negative integers mm and nn:

m=mipi+mi1pi1++m0n=nipi+ni1pi1++n0 \begin{aligned} m &= m_i p^i + m_{i-1} p^{i-1} + \dots + m_0 \\ n &= n_i p^i + n_{i-1} p^{i-1} + \dots + n_0 \end{aligned} \\

By Lucas’s Theorem, we know that (mn)\binom{m}{n} is not divisible by pp iff for all ii, minim_i \ge n_i.

As an example, let us count the number of binomial coefficients in the 16th row of Pascal’s Triangle that are not divisible by 3. Since 16=1×32+2×31+0×3016 = 1 \times 3^2 + 2 \times 3^1 + 0 \times 3^0, the base 3 representation of 16 is 1203120_3. We know that if 3 does not divide (16k)\binom{16}{k}, every digit in the base pp representation of kk must be no larger than the corresponding digit in 1203120_3. Thus, we can reduce this to a simple counting problem - there are (1+1)×(2+1)×(0+1)=6(1+1) \times (2+1) \times (0+1) = 6 binomial coefficients that satisfy this condition. Generally, for any row nn, we simply compute ni+1\prod n_i+1, the number of ways to choose each digit in the base pp representation of nn.

This gives us a straightforward algorithm to solve the original problem. However, it’s not fast enough under the source constraints where nn can get as large as 10910^9. If we can somehow avoid iterating over every row, we may be able to do better than O(n)\mathcal{O}(n).

An interesting regularity

Coloring Pascal’s Triangle by the divisibility of the binomial coefficients reveals a distinct pattern. Here are the first 30 rows mod 3 (zeros are replaced with spaces for clarity):

  0   0000   1
  1   0001   11
  2   0002   121
  3   0010   1  1
  4   0011   11 11
  5   0012   121121
  6   0020   1  2  1
  7   0021   11 22 11
  8   0022   121212121
  9   0100   1        1
 10   0101   11       11
 11   0102   121      121
 12   0110   1  1     1  1
 13   0111   11 11    11 11
 14   0112   121121   121121
 15   0120   1  2  1  1  2  1
 16   0121   11 22 11 11 22 11
 17   0122   121212121121212121
 18   0200   1        2        1
 19   0201   11       22       11
 20   0202   121      212      121
 21   0210   1  1     2  2     1  1
 22   0211   11 11    22 22    11 11
 23   0212   121121   212212   121121
 24   0220   1  2  1  2  1  2  1  2  1
 25   0221   11 22 11 22 11 22 11 22 11
 26   0222   121212121212121212121212121
 27   1000   1                          1
 28   1001   11                         11
 29   1002   121                        121

This self-similar pattern resembles a Sierpinski Triangle. By repeating the coloring with different primes, we can see that every prime generates a corresponding unique fractal. This strongly suggests that a nice recursive definition of G(n)G(n) exists and gives us a convenient visual framework to construct it.

Constructing the recurrence

First, some definitions:

g(n)=Number of binomial coefficients in row n of Pascal’s Triangle not divisible by pG(n)=i=0ng(i) \begin{aligned} g(n) &= \text{Number of binomial coefficients in row $n$ of Pascal's Triangle not divisible by $p$} \\ G(n) &= \textstyle{\sum_{i=0}^{n} g(i)} \\ \end{aligned}

For notational convenience, an overline is used to denote a group of digits interpreted as a base pp number. We will often express nn in terms of its base pp expansion n=baiai1a0=aipi+ai1pi1++a0n = \overline{\vphantom{b}a_i a_{i-1} \dots a_0} = a_i p^i + a_{i-1}p^{i-1} + \dots + a_0.

Now, consider the fractal for p=3p=3.

                 #
                 ##
                 ###
        #        #  #
 #  =>  ##   =>  ## ##
        ###      ######
                 #  #  #
                 ## ## ##
                 #########

We note that the index of the last row of each triangle has the form pi1p^i-1, which has the base pp representation of the digit p1p-1 repeating ii times since

pi1=(p1)(pi1+pi2++1)=(p1)pi1+(p1)pi2++p1 \begin{aligned} p^i-1 &= (p-1)(p^{i-1} + p^{i-2} + \dots + 1) \\ &= (p-1)p^{i-1} + (p-1)p^{i-2} + \dots + p-1 \end{aligned}

It is clear that at every generation, the fractal is copied and arranged into a larger triangle and G(n)G(n) grows geometrically by a factor of p(p+1)/2p(p+1)/2. Also, the number of rows grows geometrically by a factor of pp. Putting it together, we get

G(pi1)=p(p+1)2G(pi11)=(p(p+1)2)i \begin{aligned} G(p^i-1) &= \frac{p(p+1)}{2} G(p^{i-1}-1) \\ &= \left(\frac{p(p+1)}{2}\right)^{i} \end{aligned}

This gives us G(n)G(n) for values of nn where the digits are maximal. In fact, we can easily extend this result for arbitrary leading digits, or values of nn in the form aipi1a_i p^i - 1 or bai001\overline{\vphantom{b}a_i 0 \dots 0} - 1.

G(aipi1)=ai(ai+1)2G(pi1)=ai(ai+1)2(p(p+1)2)i \begin{aligned} G(a_i p^i - 1) &= \frac{a_i(a_i+1)}{2} G(p^i-1) \\ &= \frac{a_i(a_i+1)}{2} \left(\frac{p(p+1)}{2}\right)^i \end{aligned}

Our strategy then, is to find an expression for G(n)G(n) in terms of G(leading digit)G(\textit{leading digit}) and G(trailing digits)G(\textit{trailing digits}) so that we can reduce nn by extracting its leading digit. Combining these results should be simple. With a bit more work, we get the following:

G(baiai1a0)=G(bai001)+g(bai00)++g(baiai1a0)=G(aipi11)+(ai+1)(g(b00)++g(bai1a0))=ai(ai+1)2(p(p+1)2)i+(ai+1)G(bai1a0) \begin{aligned} G(\overline{\vphantom{b}a_i a_{i-1}\dots a_0}) &= G(\overline{\vphantom{b}a_i 0 \dots 0} - 1) + g(\overline{\vphantom{b}a_i 0 \dots 0}) + \dots + g(\overline{\vphantom{b}a_i a_{i-1}\dots a_0}) \\ &= G(a_i p^{i-1} - 1) + (a_i+1)(g(\overline{\vphantom{b}0 \dots 0}) + \dots + g(\overline{\vphantom{b}a_{i-1} \dots a_0})) \\ &= \frac{a_i(a_i+1)}{2} \left(\frac{p(p+1)}{2}\right)^i + (a_i+1)G(\overline{\vphantom{b}a_{i-1} \dots a_0}) \end{aligned}

This recursive definition of G(n)G(n) gives rise to an O(logn)\mathcal{O}(\log n) algorithm since we reduce the input size digit by digit. Also, it can be translated into code pretty much directly. Here is a sample implementation in Python:

def T(k):
    return k*(k+1)/2

def G(n, p):
    if n == 0:
        return 1
    i = 0
    while n >= p**(i+1):
        i += 1
    head = n/p**i # leading digit
    tail = n%p**i # trailing digits
    return T(head)*T(p)**i + (head+1)*G(tail, p)

By unrolling the recursion, we can simplify it further. Here is the final solution:

def T(k):
    return k*(k+1)/2

def G(n, p):
    total = 1
    i = 0
    while n:
        total = T(n%p)*T(p)**i + (n%p+1)*total
        n /= p
        i += 1
    return total