# The problem

Given a non-negative integer $n$ and prime $p$, count the number of binomial coefficients $\binom{i}{k}$ for $i \le n$ that are not divisible by $p$. 1

To solve this problem, we will invoke Lucas’s Theorem which allows us to test the divisibility of every binomial coefficient in a particular row of Pascal’s Triangle. Consider the base $p$ expansion of non-negative integers $m$ and $n$:

By Lucas’s Theorem, we know that $\binom{m}{n}$ is not divisibile by $p$ iff for all $i$, $m_i \ge n_i$.

As an example, let us count the number of binomial coefficients in the 16th row of Pascal’s Triangle that are not divisible by 3. Since $16 = 1 \times 3^2 + 2 \times 3^1 + 0 \times 3^0$, the base 3 representation of 16 is $120_3$. We know that if 3 does not divide $\binom{16}{k}$, every digit in the base $p$ representation of $k$ must be no larger than the corresponding digit in $120_3$. Thus, we can reduce this to a simple counting problem - there are $(1+1) \times (2+1) \times (0+1) = 6$ binomial coefficients that satisfy this condition. Generally, for any row $n$, we simply compute $\prod n_i+1$, the number of ways to choose each digit in the base $p$ representation of $n$.

This gives us a straightforward algorithm to solve the original problem. However, it’s not fast enough under the source constraints where $n$ can get as large as $10^9$. If we can somehow avoid iterating over every row, we may be able to do better than $\mathcal{O}(n)$.

# An interesting regularity

Coloring Pascal’s Triangle by the divisibility of the binomial coefficients reveals a distinct pattern. Here are the first 30 rows mod 3 (zeros are replaced with spaces for clarity):

  0   0000   1
1   0001   11
2   0002   121
3   0010   1  1
4   0011   11 11
5   0012   121121
6   0020   1  2  1
7   0021   11 22 11
8   0022   121212121
9   0100   1        1
10   0101   11       11
11   0102   121      121
12   0110   1  1     1  1
13   0111   11 11    11 11
14   0112   121121   121121
15   0120   1  2  1  1  2  1
16   0121   11 22 11 11 22 11
17   0122   121212121121212121
18   0200   1        2        1
19   0201   11       22       11
20   0202   121      212      121
21   0210   1  1     2  2     1  1
22   0211   11 11    22 22    11 11
23   0212   121121   212212   121121
24   0220   1  2  1  2  1  2  1  2  1
25   0221   11 22 11 22 11 22 11 22 11
26   0222   121212121212121212121212121
27   1000   1                          1
28   1001   11                         11
29   1002   121                        121


This self-similar pattern resembles a Sierpinski Triangle. By repeating the coloring with different primes, we can see that every prime generates a corresponding unique fractal. This strongly suggests that a nice recursive definition of $G(n)$ exists and gives us a convenient visual framework to construct it.

# Constructing the recurrence

First, some definitions:

For notational convenience, an overline is used to denote a group of digits interpreted as a base $p$ number. We will often express $n$ in terms of its base $p$ expansion $n = \overline{\vphantom{b}a_i a_{i-1} \dots a_0} = a_i p^i + a_{i-1}p^{i-1} + \dots + a_0$.

Now, consider the fractal for $p=3$.

                 #
##
###
#        #  #
#  =>  ##   =>  ## ##
###      ######
#  #  #
## ## ##
#########


We note that the index of the last row of each triangle has the form $p^i-1$, which has the base $p$ representation of the digit $p-1$ repeating $i$ times since

It is clear that at every generation, the fractal is copied and arranged into a larger triangle and $G(n)$ grows geometrically by a factor of $p(p+1)/2$. Also, the number of rows grows geometrically by a factor of $p$. Putting it together, we get

This gives us $G(n)$ for values of $n$ where the digits are maximal. In fact, we can easily extend this result for arbitrary leading digits, or values of $n$ in the form $a_i p^i - 1$ or $\overline{\vphantom{b}a_i 0 \dots 0} - 1$.

Our strategy then, is to find an expression for $G(n)$ in terms of $G(\textit{leading digit})$ and $G(\textit{trailing digits})$ so that we can reduce $n$ by extracting its leading digit. Combining these results should be simple. With a bit more work, we get the following:

This recursive definition of $G(n)$ gives rise to an $\mathcal{O}(\log n)$ algorithm since we reduce the input size digit by digit. Also, it can be translated into code pretty much directly. Here is a sample implementation in Python:

def T(k):
return k*(k+1)/2

def G(n, p):
if n == 0:
return 1
i = 0
while n >= p**(i+1):
i += 1
tail = n%p**i # trailing digits


By unrolling the recursion, we can simplify it further. Here is the final solution:

def T(k):
return k*(k+1)/2

def G(n, p):
total = 1
i = 0
while n:
total = T(n%p)*T(p)**i + (n%p+1)*total
n /= p
i += 1