Sunday, June 14, 2015

Things practiced: math, graphics programming

Tools used: WebGL, benvanik’s WebGL Inspector for Chrome, glMatrix for matrix math in Javascript

For whatever we lose(like a you or a me)
it's always ourselves we find in the sea
(from an e.e. cummings poem that autoplays in my head every time I picture the ocean, i.e. ~10,000× while coding this)

Despite being terrible at it, I find graphics programming pretty fun. Building realistic 3D environments in real time blends physics and computer science—graphics programmers have to satisfy two conflicting objectives: to simulate reality as accurately as possible, while using fewer than 16 or 33 milliseconds to process each frame.

Modeling the ocean is a good example of a situation where tradeoffs are required. Fluid dynamics is based around the Navier-Stokes equations, which are a set of nonlinear partial differential equations that describe the flow of viscous fluids. Fully solving these equations numerically provides an exact model of the ocean, but is computationally infeasible.

Instead, I tried to simulate an ocean scene using this approach:

1. Generate realistic water surface height fields to model waves, using empirical knowledge of ocean phenomena.

2. Account for optical processes acting on ocean water: reflection and refraction from a water surface, the color filtering behavior of ocean water, and maybe more-complex effects like caustics and godrays.

3. Render a realistic sky gradient using known properties of atmospheric scattering.

4. Think about computational-resource cost versus quality gained and simplify where possible.

Results so far (underwhelming but workable) (click):

The code is on GitHub; the simulation is running here.

Friday, November 28, 2014

Thing practiced: ?

Tools used: IntelliJ IDEA community edition, Applied Cryptography

Scheming humans have always faced a basic problem: how can we communicate securely in the presence of adversaries? Since ancient times, the art of secret communication, or cryptography, has been crucial for governments and the military. But today cryptography affects us all. As messages are increasingly transmitted through public channels, cryptography prevents malicious interceptors from using our information against us.

The evolution of cryptography can be split into two eras by the invention of asymmetric ciphers in the 1970s. Historically, encrypting and decrypting a message had been symmetric processes — that is, unscrambling the message required knowing the key that had been used to scramble it. This begat the problem of key distribution: before sending a secret message, the sender would have to take great precautions to transport her encryption key safely to the recipient. In asymmetric cryptography, the keys used to encrypt and to decrypt a message are different. This means that the recipient can make his encryption key public, as long as he keeps his decryption key private.

What we need for an asymmetric-cryptography protocol to work is a trapdoor one-way function. This is the mathematical equivalent of a physical lock: easy to process in one direction (snapping the lock closed), but hard to undo (opening the lock), unless we have a special secret (the key). In RSA – the first public-key cryptosystem, and still the most popular – the trapdoor function exploits mathematical features of modular exponentiation, prime numbers, and integer factorization. Let’s throw together a toy implementation in Scala.

Helpful math functions

A number is prime if it has exactly two divisors: 1 and itself. Sometimes it’s useful to have a list of small primes, so we need a Sieve of Eratosthenes:

// Sieves a stream of Ints
def sieve(s: Stream[Int]): Stream[Int] =
  s.head #:: sieve(s.tail.filter(_ % s.head != 0))

// All primes as a lazy sequence
val primes = sieve(Stream.from(2))

What we really want, though, is large primes – ones higher than 2500. To get a prime that large, we can’t sieve up from 1; instead, we find a prime by taking a random number, checking if it’s probably prime, and repeating if necessary. The primality test of choice in real systems is Miller-Rabin, but we’ll use Solovay-Strassen to keep things simple:

// Ints won't fit these
import math.BigInt

// Euclid's GCD algorithm
def gcd(a: BigInt, b: BigInt): BigInt =
  if (b == 0) a else gcd(b, a % b)

// Computes the Jacobi symbol (needed for our test)
def jacobi(a: BigInt, n: BigInt): BigInt = {
  if (a == 1) 1
  else if (n == 1) 1
  else if (gcd(a, n) != 1) 0
  else if (a == 2 && (n % 8 == 1 || n % 8 == 7)) 1
  else if (a == 2 && (n % 8 == 3 || n % 8 == 5)) -1
  else if (a > n) jacobi(a % n, n)
  else if (a % 2 == 0) jacobi(a / 2, n) * jacobi(2, n)
  else if (n % 2 == 0) jacobi(a, n / 2) * jacobi(a, 2)
  else if ((((a - 1) * (n - 1)) / 4) % 2 == 0) jacobi(n, a)
  else jacobi(n, a) * -1
}

// Runs the Solovay-Strassen test for i iterations
def isPrime(n: BigInt, i: Int): Boolean = {
  if (i <= 0) true
  else if (n % 2 == 0 && n != 2) false
  else {
    val a = (random(n.bitLength) % (n - 1)) + 1
    val j = jacobi(a, n)
    val exp = a.modPow((n - 1) / 2, n)
    if (j == 0 || j % n != exp) false
    else isPrime(n, i - 1)
  }
}

Finally, the main reason we’re interested in primes is so we can do calculations modulo our prime. To do this, we need the Extended Euclidean algorithm:

// Returns (x, y) such that a*x + b*y = gcd(a, b)
def extendedGcd(a: BigInt, b: BigInt): (BigInt, BigInt) = {
  if (a % b == 0) (0, 1)
  else {
    val (x, y) = extendedGcd(b, a % b)
    (y, x - y * (a / b))
  }
}

The RSA system

An asymmetric encryption system has two parts: a public key and a private key. In theory, encrypting a message with the public key can only be reversed by decrypting with the private key.

In RSA, we encrypt a message by computing msge mod n, using the publicly-known information n and e. This is the crux of RSA’s security: modular exponentiation is easy to do, but exceedingly hard to undo. To make it possible to retrieve the message from the ciphertext, we build a trapdoor into our encryption routine by making n the product of two large primes p and q (which we keep private). We can then reconstruct the message using a calculated decryption exponent d.

To choose the right decryption exponent, we first need a value φ based on n’s factorization such that xφ(p, q) = 1. Luckily, Euler’s totient function gives us just this:

// Euler's totient phi, where p and q are n's prime factors
def phi(p: BigInt, q: BigInt): BigInt = (p - 1) * (q - 1)

As long as we choose our public exponent e so that it doesn’t share a common factor with the totient φ, we can decrypt using the inverse of e mod φ:

// A legal (public) exponent has to
// (1) be between 1 and the totient
// (2) have no non-1 factors in common with the totient
def isLegalExponent(e: BigInt, p: BigInt, q: BigInt): Boolean =
  e > 1 && e < phi(p, q) && gcd(e, phi(p, q)) == 1

// Returns b such that a*b = 1 mod n
def modularInverse(a: BigInt, n: BigInt): BigInt = {
  val (x, _) = extendedGcd(a, n)
  x % n
}

// Our decryption exponent: the inverse of e mod phi
def d(e: BigInt, p: BigInt, q: BigInt) = modularInverse(e, phi(p, q))

Encryption and decryption are now trivial:

// Takes a message m and public information e and n
def encrypt(m: BigInt, e: BigInt, n: BigInt): BigInt =
  m.modPow(e, n)

// Takes a ciphertext c, private exponent d, and public information n
def decrypt(c: BigInt, d: BigInt, n: BigInt): BigInt =
  c.modPow(d, n)

(In reality, RSA is never used to encrypt messages — for n to be large enough to encode any reasonable length of message, the computing resources required for even the “easy” process of encryption are prohibitively high. Instead, RSA is used to safely deliver symmetric keys, which can then be used with block or stream ciphers to encrypt and decrypt large messages.)

Wednesday, April 10, 2013

Thing practiced: math

Tools used: A Friendly Introduction to Number Theory, Python

“I’m tired of seeing girls, I want to see more math.”
—No one ever

Problem:

Consider the infinite polynomial series

$$ {A_F}(x) = x{F_1} + {x^2}{F_2} + {x^3}{F_3} + \ldots \tag{1} $$

where Fk is the kth term in the Fibonacci sequence (1, 1, 2, 3, 5, 8, ...).

For every case where x is rational and AF(x) is a positive integer, we call AF(x) a golden nugget, because these are increasingly rare – for example, the 10th golden nugget is 74,049,690.

What is the 15th golden nugget?

Solution:

Infinite series are unwieldy, so first we replace (1) with a closed-form version (derived here):

$$ {A_F}(x) = \frac{x}{1 - x - {x^2}} \tag{2} $$

We need AF(x) to be a positive integer, which we’ll call n. So, rearranging (2), we get:

$$ {A_F}(x) = n = \frac{x}{1 - x - {x^2}} $$ $$ n{x^2} + (n + 1)x - n = 0 \tag{3} $$

This is a standard quadratic equation with solutions

$$ x = \frac{-(n + 1) \pm \sqrt{(n + 1)^2 + 4n^2}}{2n} \tag{4} $$

As long as the square root portion of (4) is an integer, x will be rational. We can solve for this case by calling this portion m (making the discriminant m2):

$$ m^2 = (n + 1)^2 + 4{n^2} = 5{n^2} + 2n + 1 $$

and doing some algebra we get:

$$ 5m^2 = 25n^2 + 10n + 1 + 4 = (5n + 1)^2 + 4 $$ $$ (5n + 1)^2 - 5m^2 = -4 \tag{5} $$

Now, if we substitute p = 5n + 1, then (5) becomes a Pell-like equation:

$$ p^2 - 5m^2 = -4 \tag{6} $$

This means that we can find solutions to (6) by finding solutions to the corresponding unit-form Pell equation x2 - 5y2 = 1 and transforming them.

To compute the solutions, we use the technique described here. We first identify the fundamental solutions to (6). Then, we transform these into solutions of the unit-form equation by the identity (pi2 - 5mi2)(xi2 - 5yi2) = -4. From these, we can generate subsequent solutions, then transform them back into solutions to (6) by the same identity. Finally, we re-substitute our original n = AF(x) back in to find the golden nuggets.

This gives us the answer: 1,120,149,658,760.

Thursday, February 7, 2013

Thing practiced: math

Tool used: Sublime Text 2

Problem 112:

If no digit in a number is exceeded by the digit to its left it’s called an increasing number – e.g. 134,468. Similarly if no digit is exceeded by the digit to its right it’s called a decreasing number – e.g. 66,420. We’ll call a positive integer that is neither increasing nor decreasing a “bouncy” number – e.g. 155,349.

Clearly there cannot be any bouncy numbers below 100. In fact, the least number for which the proportion of bouncy numbers reaches 50% is 538. By the time we reach 21,780 the proportion of bouncy numbers is 90%.

Find the least number for which the proportion of bouncy numbers is exactly 99%.

Solution to 112:

This seems easy enough to brute-force, so we’ll try that in Python:

# Parse the number right-to-left to find if it's bouncy.
def is_bouncy(num):
    has_incr_seq, has_decr_seq = False, False
    right_digit = num % 10
    num = num // 10

    while num > 0:
        left_digit = num % 10
        if left_digit < right_digit:
            has_incr_seq = True
        elif left_digit > right_digit:
            has_decr_seq = True
        right_digit = left_digit
        num = num // 10
        if has_incr_seq and has_decr_seq:
            return True
    return False

# Iterate through numbers until the bouncy count is 99%
# of the total count.
count = 0
i = 99
while count < 0.99 * i:
    i = i + 1
    if is_bouncy(i):
        count = count + 1
print(i)

This gives us the answer: 1,587,000. (Python took 2.8 s to run this on a MacBook Air; out of curiosity I tried it in C — 42 ms.)

Next, problem 113:

As n increases, the proportion of bouncy numbers below n increases such that there are only 12,951 numbers below 1,000,000 that are not bouncy and only 277,032 non-bouncy numbers below 1010.

How many numbers below a googol (10100) are not bouncy?

Solution to 113:

Let’s start with a simple question and expand from there – how many increasing numbers are there below 1000?

Well, there are 3 digits in any number below 1000 (if we consider e.g. 012 to be 12), and for a number to be increasing each digit must be no lower than the digit to its left. Let’s label the digits in a 3-digit number A, B, and C. The leftmost digit A can be any of the digits 0-9; the middle digit B can be any of the digits from A-9; and the rightmost digit C can be any of the digits from B-9. That means that there are 12 potential digits to choose the 3 digits of our number from – the 10 digits 0-9, plus 2 potential duplicates if A = B or B = C.

The number of increasing numbers below 1000, then, is just (12 choose 3) = 220, minus 1 because we don’t want to count 000. Likewise, the number of increasing numbers below a googol is

$$ {109 \choose 100} - 1 \tag{increasing} $$

Decreasing numbers are similar, except that we can’t just ignore leading zeroes as we’d done with increasing numbers – for example, 072 is in fact a decreasing number, but wouldn’t be caught by the (12 choose 3) we used earlier. So, how many decreasing numbers are there below 1000?

If we label the digits in a 3-digit decreasing number D, E, and F, then the leftmost digit D can be any of the digits 0-9; the middle digit E can be any of the digits 0-D, or, if D = 0, any of the digits 0-9; and the rightmost digit F can be any of the digits 0-E, or, if D = E = 0, any of the digits 0-9. That means that there are 13 potential digits to choose the 3 digits of our number from – the 10 digits 0-9, plus 2 potential duplicates if D = E or E = F, plus 1 for potential leading zero(es).

The number of decreasing numbers below 1000, then, is (13 choose 3) = 286, minus 4 because 000 will occur 4 times (one time as before, plus once for each digit). Analogously, the number of decreasing numbers below a googol is

$$ {110 \choose 100} - 101 \tag{decreasing} $$

Almost there! In order to find all the non-bouncy numbers, we just need to add the increasing numbers to the decreasing numbers. However, some numbers are increasing and decreasing – e.g. 888. How many of these monotonic numbers are there? Well, for each power of 10 there are 9 potential monotonic numbers – e.g. we have 1, 2, ..., 9, 11, 22, ..., 99, 111, 222, ..., 999, etc. Since we have 100 digits, we have:

$$ 9 \times 100 \tag{duplicates} $$

Thus, the total number of non-bouncy numbers below a googol should be:

$$ \left({109 \choose 100} - 1\right) + \left({110 \choose 100} - 101\right) - (9 \times 100) \tag{total} $$

This gives us a final solution of 51,161,058,134,250.