programming

Seaflailing: Or, how to try (and mostly fail) to draw an ocean in OpenGL

Things practiced: math, graphics programming

Tools used: WebGL, benvanik’s WebGL Inspector for Chrome, glMatrix for matrix math in Javascript

For whatever we lose(like a you or a me)
it's always ourselves we find in the sea

(from an e.e. cummings poem that autoplays in my head every time I picture the ocean, i.e. ~10,000× while coding this)

Despite being terrible at it, I find graphics programming pretty fun. Building realistic 3D environments in real time blends physics and computer science—graphics programmers have to satisfy two conflicting objectives: to simulate reality as accurately as possible, while using fewer than 16 or 33 milliseconds to process each frame.

Modeling the ocean is a good example of a situation where tradeoffs are required. Fluid dynamics is based around the Navier-Stokes equations, which are a set of nonlinear partial differential equations that describe the flow of viscous fluids. Fully solving these equations numerically provides an exact model of the ocean, but is computationally infeasible.

Instead, I tried to simulate an ocean scene using this approach:

1. Generate realistic water surface height fields to model waves, using empirical knowledge of ocean phenomena.

2. Account for optical processes acting on ocean water: reflection and refraction from a water surface, the color filtering behavior of ocean water, and maybe more-complex effects like caustics and godrays.

3. Render a realistic sky gradient using known properties of atmospheric scattering.

4. Think about computational-resource cost versus quality gained and simplify where possible.

Results so far (underwhelming but workable) (click):

The code is on GitHub; the simulation is running here.

In colour: Audio rainbowscapes

Thing practiced: drawing rainbows

Tools used: the JS Web Audio API, three.js instead of actual WebGL, Safari mobile web inspector

Jamie xx’s new album In Colour is finally out, and it’s good.

Do you ever get that feeling, when you’re up at 2 am and almost alone, of being enveloped in something with maybe someone, of melancholy becoming euphoric? This album feels like that. It’s dense and fleshed out, not downbeat and cryptic like the xx past — but still lonely, and lovely. Listen with good headphones.

And

Because I loved this album and its cover so much, I built an album-themed beat-detecting visualizer here:


(If the presets are unsatisfactory, drag and drop your own audio file onto the page.)

Secrets & spies: or, I try to understand RSA

Thing practiced: ?

Tools used: IntelliJ IDEA community edition, Applied Cryptography

Scheming humans have always faced a basic problem: how can we communicate securely in the presence of adversaries? Since ancient times, the art of secret communication, or cryptography, has been crucial for governments and the military. But today cryptography affects us all. As messages are increasingly transmitted through public channels, cryptography prevents malicious interceptors from using our information against us.

The evolution of cryptography can be split into two eras by the invention of asymmetric ciphers in the 1970s. Historically, encrypting and decrypting a message had been symmetric processes — that is, unscrambling the message required knowing the key that had been used to scramble it. This begat the problem of key distribution: before sending a secret message, the sender would have to take great precautions to transport her encryption key safely to the recipient. In asymmetric cryptography, the keys used to encrypt and to decrypt a message are different. This means that the recipient can make his encryption key public, as long as he keeps his decryption key private.

What we need for an asymmetric-cryptography protocol to work is a trapdoor one-way function. This is the mathematical equivalent of a physical lock: easy to process in one direction (snapping the lock closed), but hard to undo (opening the lock), unless we have a special secret (the key). In RSA – the first public-key cryptosystem, and still the most popular – the trapdoor function exploits mathematical features of modular exponentiation, prime numbers, and integer factorization. Let’s throw together a toy implementation in Scala.


Helpful math functions

A number is prime if it has exactly two divisors: 1 and itself. Sometimes it’s useful to have a list of small primes, so we need a Sieve of Eratosthenes:

// Sieves a stream of Ints
def sieve(s: Stream[Int]): Stream[Int] =
  s.head #:: sieve(s.tail.filter(_ % s.head != 0))

// All primes as a lazy sequence
val primes = sieve(Stream.from(2))

What we really want, though, is large primes – ones higher than 2500. To get a prime that large, we can’t sieve up from 1; instead, we find a prime by taking a random number, checking if it’s probably prime, and repeating if necessary. The primality test of choice in real systems is Miller-Rabin, but we’ll use Solovay-Strassen to keep things simple:

// Ints won't fit these
import math.BigInt

// Euclid's GCD algorithm
def gcd(a: BigInt, b: BigInt): BigInt =
  if (b == 0) a else gcd(b, a % b)

// Computes the Jacobi symbol (needed for our test)
def jacobi(a: BigInt, n: BigInt): BigInt = {
  if (a == 1) 1
  else if (n == 1) 1
  else if (gcd(a, n) != 1) 0
  else if (a == 2 && (n % 8 == 1 || n % 8 == 7)) 1
  else if (a == 2 && (n % 8 == 3 || n % 8 == 5)) -1
  else if (a > n) jacobi(a % n, n)
  else if (a % 2 == 0) jacobi(a / 2, n) * jacobi(2, n)
  else if (n % 2 == 0) jacobi(a, n / 2) * jacobi(a, 2)
  else if ((((a - 1) * (n - 1)) / 4) % 2 == 0) jacobi(n, a)
  else jacobi(n, a) * -1
}

// Runs the Solovay-Strassen test for i iterations
def isPrime(n: BigInt, i: Int): Boolean = {
  if (i <= 0) true
  else if (n % 2 == 0 && n != 2) false
  else {
    val a = (random(n.bitLength) % (n - 1)) + 1
    val j = jacobi(a, n)
    val exp = a.modPow((n - 1) / 2, n)
    if (j == 0 || j % n != exp) false
    else isPrime(n, i - 1)
  }
}

Finally, the main reason we’re interested in primes is so we can do calculations modulo our prime. To do this, we need the Extended Euclidean algorithm:

// Returns (x, y) such that a*x + b*y = gcd(a, b)
def extendedGcd(a: BigInt, b: BigInt): (BigInt, BigInt) = {
  if (a % b == 0) (0, 1)
  else {
    val (x, y) = extendedGcd(b, a % b)
    (y, x - y * (a / b))
  }
}

The RSA system

An asymmetric encryption system has two parts: a public key and a private key. In theory, encrypting a message with the public key can only be reversed by decrypting with the private key.

In RSA, we encrypt a message by computing msge mod n, using the publicly-known information n and e. This is the crux of RSA’s security: modular exponentiation is easy to do, but exceedingly hard to undo. To make it possible to retrieve the message from the ciphertext, we build a trapdoor into our encryption routine by making n the product of two large primes p and q (which we keep private). We can then reconstruct the message using a calculated decryption exponent d.

To choose the right decryption exponent, we first need a value φ based on n’s factorization such that xφ(p, q) = 1. Luckily, Euler’s totient function gives us just this:

// Euler's totient phi, where p and q are n's prime factors
def phi(p: BigInt, q: BigInt): BigInt = (p - 1) * (q - 1)

As long as we choose our public exponent e so that it doesn’t share a common factor with the totient φ, we can decrypt using the inverse of e mod φ:

// A legal (public) exponent has to
// (1) be between 1 and the totient
// (2) have no non-1 factors in common with the totient
def isLegalExponent(e: BigInt, p: BigInt, q: BigInt): Boolean =
  e > 1 && e < phi(p, q) && gcd(e, phi(p, q)) == 1

// Returns b such that a*b = 1 mod n
def modularInverse(a: BigInt, n: BigInt): BigInt = {
  val (x, _) = extendedGcd(a, n)
  x % n
}

// Our decryption exponent: the inverse of e mod phi
def d(e: BigInt, p: BigInt, q: BigInt) = modularInverse(e, phi(p, q))

Encryption and decryption are now trivial:

// Takes a message m and public information e and n
def encrypt(m: BigInt, e: BigInt, n: BigInt): BigInt =
  m.modPow(e, n)

// Takes a ciphertext c, private exponent d, and public information n
def decrypt(c: BigInt, d: BigInt, n: BigInt): BigInt =
  c.modPow(d, n)

(In reality, RSA is never used to encrypt messages — for n to be large enough to encode any reasonable length of message, the computing resources required for even the “easy” process of encryption are prohibitively high. Instead, RSA is used to safely deliver symmetric keys, which can then be used with block or stream ciphers to encrypt and decrypt large messages.)

Re: Swift

Thing practiced: overenthusing

Depending on your level of nerdiness, you may have heard Apple’s announcement today that a team of mutant programming virtuosos has been working—in secret, for years—on a new language that will replace Objective-C as the lingua franca of the iPhone, iPad, and Mac – plus presumably whatever Apple televisual, home-automating, and/or wearable products are to come. The circle is at last complete.

It’s called Swift—and, if you’re interested, Apple has released a 500-page ebook detailing the language to let developers get started today.

Technically, Swift has some nice characteristics:

It’s fast. Like C, C++, and Objective-C, Swift compiles down to native code—there’s no VM or interpreter slowing things down at run time. Coupled with some serious compiler optimization, this should make Swift performant enough for any size of application.

It’s safe. Unlike Objective-C, Swift code is proven safe at compile time—that is, no compiled code can cause an access violation at run time (unless we explicitly mark our code as unsafe).

It has first-class functions. Functions can be nested within other functions, passed as arguments to other functions, and stored as values.

It has closures. Along with the code for a function, Swift stores the relevant parts of the environment that was current when the function was created. (This makes higher-order functions like sort and fold possible and powerful.)

It supports both immutable values and variables. Languages usually choose one or the other, but Swift has both—just type let for immutable values and var for (mutable) variables.

Swift is statically-typed, but types can be inferred. If the type-checker can tell what type the value/variable should be, we don’t have to write it.

And types can be generic. Functions and methods are automatically generalized by the type-checker, so our functions can be used across compatible parameter types—without manually writing overloads, and without resorting to id and using casts everywhere.

Swift gives us namespaces, with its module system.

Memory is managed automatically, and without pointers. Like Objective-C post-ARC, Swift uses compiler magic to track and deallocate instances as needed. We don’t need to explicitly specify strong vs. weak references any more, or use * for reference types. This makes iOS programming more accessible (pro/con), though not as easy as with garbage collection. And since ARC happens at compile time, there’s no performance hit at run time.

Swift has tuples (or product types). We can group multiple values into one compound data type, without using container classes/structs.

It has very-special enums (or sum types). A Swift enum contains a value of one of various specified types. We can use these like plain C integer enums if we like, but Swift enums go further: Swift lets each enum type carry data.

It has pattern-matching to go with tuples and enums. Values can be matched against patterns to decompose them concisely.

And much more: lazy properties, property observers, class/struct extensions, buffed-up structs, option types, function currying, type aliasing, and more. And Swift manages to do all this while maintaining smooth interoperability with Objective-C classes.

(For a trivial demo of some of these features, here’s code for Minesweeper in Swift.)

With ideas drawn (in Chris Lattner’s words) from Haskell, Rust, C#, Ruby, Python, and the last twenty years of PL research and practice, Swift looks like a language even programming-languages nerds should find pleasing. It delivers what everyone wants from a language: an elegant way to interface with software.

More important than Swift’s elegance, though, is its context. Programming languages rise and fall not on language quality but on more-practical factors: which libraries and tools are available, what the industry standard is, and what’s most likely to get you a job. Swift is the first place all these factors converge: a highly relevant platform, a sole platform owner with the will to impose a standard, a human-friendly toolset, and a newly-modern language.

Have you ever wanted to start programming on iOS? Today’s the day.

Classic AI: Mancala

Thing practiced: logic programming

Tools used: Sublime Text 3 beta, SWI-Prolog, J.R. Fisher’s Prolog tutorial

Mancala is a traditional African two-player board game with many different variations. One variant of mancala, called Kalah, has a long history in AI research – see e.g. Silver 1961 (pdf), Bell 1968, Slagle and Dixon 1969 (pdf), and Irving et al 2000 (pdf).

Prolog is a logic programming language devised by Alain Colmerauer and his colleagues in the early 1970s. The idea behind the language is that you feed the computer a set of facts and rules, then ask the computer to use those constraints to answer questions about things you want to know.

For today’s practice, I tried to implement a program to play mancala in Prolog – specifically the version of mancala I learned in elementary school day care, which has three seeds per hole (game rules are here). To solve the game, we can generate a tree of possible moves for each state and use minimax with alpha-beta pruning to find the optimal one. The code is here.

Web game: Pillow Fight!

Thing practiced: programming

Tools used: Sublime Text 2, Inkscape, Google Chrome

screenshotscreenshot

I was never much of a gamer, probably because I really suck at video games. I do remember spending an entire afternoon as a kid playing a Flash game about pancakes, though. For today’s practice I tried to replicate the pancake game, but in HTML5/JavaScript instead of Flash. And with pillows.

Try it here.

The goal: to earn points by arranging pillows into piles of the same color. You earn 100 points whenever you make a pile of three same-colored pillows. You can only move the topmost pillow in each stack. To move a pillow, touch the stack it’s in, then touch the stack you’d like to move it to. New pillows are added at an increasing rate as the game level increases, and the game ends when any stack reaches the top of the screen.

Scientific computing: Fast Fourier transform

Thing practiced: programming

Tools used: Visual Studio 2012 on a Windows machine

(Quick background: The Fourier transform lets us translate any irregular signal into a combination of pure sine waves. The numerical version of this is the discrete Fourier transform (DFT), which is ubiquitous today – it’s responsible for jpegs and mp3s, among other things – thanks to the efficient fast Fourier transform (FFT) family of algorithms.)

Let’s try to implement the FFT in F#. First, here’s the naive DFT algorithm (a direct translation to code of the definition): $$ X_k = \sum_{n=0}^{N-1} {x_n}e^{\frac{-2{\pi}i}{N} kn} \tag{DFT} $$

let slow_dft xs =
    let N = Array.length xs
    [| for k in 0 .. N - 1 ->
           let mutable x_k = Complex.Zero
           for n = 0 to N - 1 do
               let a = 2.0 * Math.PI / float N *
                       float k * float n
               x_k <- x_k + xs.[n] *
                      Complex((cos a), (sin a))
           x_k |]

This gives an exact answer, but with poor performance – this has O(N2) time complexity. To improve performance, we first reduce problem scope by assuming that the number of input terms is an integral power of 2, letting us use the radix-2 DIT divide-and-conquer algorithm. This splits the DFT into two DFTs – a DFT of the even-numbered and a DFT of the odd-numbered terms:

$$ X_k = \sum_{m=0}^{\frac{N}{2} -1} {x_{2m}}e^{\frac{-2{\pi}i}{N} k(2m)} + \sum_{m=0}^{\frac{N}{2} -1} {x_{2m+1}}e^{\frac{-2{\pi}i}{N} k(2m+1)} $$

which gets rearranged as: $$ X_k = \left\{ \begin{array} {r@{\quad}l} E_k + e^{\frac{-2{\pi}i}{N} k}{O_k} & \mbox{if} & k < N/2 \\ E_{k-N/2} - e^{\frac{-2{\pi}i}{N} (k-N/2)}{O_{k-N/2}} & \mbox{if} & k \ge N/2 \end{array} \right. $$

This article explains further. Code:

let radix2_fft xs =
    let bit_rev arr =
        let N = Array.length arr
        let mutable j = 0
        let swap (a : _[]) i j =
            let tmp = a.[j]
            a.[j] <- a.[i]
            a.[i] <- tmp
        let rec helper m j =
            let m = m / 2
            let j = j ^^^ m
            if j &&& m = 0 then helper m j else j
        for i = 0 to N - 2 do
            if i < j then swap arr i j
            j <- helper N j
    let radix2_helper (xs : Complex[]) N j m =
        let t = Math.PI * float m / float j
        let a = Complex((cos t), (sin t))
        let mutable i = m
        while i < N do
            let x_i = xs.[i]
            let t = a * xs.[i+j]
            xs.[i] <- x_i + t
            xs.[i+j] <- x_i - t
            i <- i + 2 * j
    let N = Array.length xs
    bit_rev xs
    let mutable j = 1
    while j < N do
        for m = 0 to j - 1 do
            radix2_helper xs N j m
        j <- 2 * j

This is fast – O(n log n) – but only handles signals with lengths of integral powers of 2. To use the radix-2 algorithm for a signal of arbitrary length, we need to increase length to the next smallest power of 2. We can’t zero-pad the signal itself, but if we express the transform as a convolution then pad the convolution we’ll get an exact answer. (My hour’s up, but if you’re interested the convolution algorithm is explained here.)

Project Euler: #108, Diophantine reciprocals

Thing practiced: math

Tools used: pen, paper, and MonoDevelop

Problem:

In the equation
$$ \frac{1}{x} + \frac{1}{y} = \frac{1}{n} \tag{1} $$ where x, y, and n are positive integers, what is the least value of n where the number of distinct solutions (i.e. solutions where x, y does not recur as y, x) exceeds 1000?

Solution:

Since x, y, and n are positive integers, we know that x and y must each be greater than n. Thus, we can rewrite (1) as
$$ \frac{1}{n+a} + \frac{1}{n+b} = \frac{1}{n} \tag{2} $$ where a and b are again positive integers.

Doing some algebra, we get:
$$ \frac{n+b}{(n+a)(n+b)} + \frac{n+a}{(n+a)(n+b)} = \frac{1}{n} \tag{3} $$ $$ n(2n+a+b) = (n+a)(n+b) \tag{4} $$ $$ n^2 = ab \tag{5} $$

This means that each distinct pair of divisors of n2 represents a distinct solution to (1). Thus, if we can find the number of divisors that n2 has, we’ll know that half that number (well, actually divisors(n2) + 1 / 2 because one of the divisor pairs – n * n = n2 – won’t be doubled) is the number of solutions to (1).

Now we just need an algorithm to find the number of divisors of a given integer m. An easy way to do this would be to enumerate each of the numbers up to sqrt(m) and count each number that divides m evenly as a divisor. Since we’re going to use this algorithm many times, though, this brute force approach won’t work.

Instead, we’ll use the divisor function. This tells us that we can obtain the number of divisors of m from its prime factorization. Specifically, the number of divisors of m is the product over all distinct prime factors of the power of the prime factor, plus 1. (That is, if we can write
$$ m = \prod^r_{i=1} p_i^{s_i} \tag{6} $$ then we have $$ d(m) = \prod^r_{i=1} (s_i + 1) \tag{7} $$ .) Think of it this way: each distinct divisor of m is some distinct combination of m’s prime factors, and each prime factor can appear 0, 1, …, or si times, so multiplying through for each of the prime factors we get (7).)

Now we just need an algorithm to factor a number into primes. We can use a (pseudo-)sieve of Eratosthenes to generate a seed list of potential primes:

// This is not optimized, and not the real sieve
// -- see Melissa E. O'Neill, The Genuine Sieve
// of Eratosthenes -- because it doesn't matter
// here and my brain hurts.

// Returns a list of primes up to a given maximum.
let primes max =
    let rec sieve list =
        match list with
        | [] -> []
        | hd :: rest ->
            hd :: (sieve <| List.filter (fun x -> x % hd <> 0) rest)
    sieve [ 2 .. max ]

We use the list of primes to obtain the prime factorization:

// Returns the prime factorization of a number
// as a list of tuples of (prime, exponent). This
// only looks for prime factors up to a given
// maximum. Again this isn't optimized, because
// it doesn't matter for this size of problem.
// Definitely won't work for #110 though.
let primeFactors m max =
    let rec primeFactorsHelper (m, factorsSoFar) primesToTest =
        let singleFactorHelper m p factorsSoFar =
            let rec exponentHelper m p e =
                if m % p = 0 then exponentHelper (m/p) p (e+1)
                else (m, e)
            let (rem, exp) = exponentHelper m p 0
            rem, (p, exp) :: factorsSoFar
        match primesToTest with
        | [] -> factorsSoFar
        | hd :: rest when hd > m -> factorsSoFar
        | hd :: rest when m % hd = 0 ->
            primeFactorsHelper (singleFactorHelper m hd factorsSoFar) rest
        | hd :: rest -> primeFactorsHelper (m, factorsSoFar) rest
    primeFactorsHelper (m, []) (primes max)

Factoring n2 still takes some time, though, as n2 grows much more quickly than n. Luckily, we can factor n instead. From (6) we can see that $$ m^2 = (\prod^r_{i=1} p_i^{s_i})^2 = \prod^r_{i=1} p_i^{2{s_i}} \tag{8} $$ so with the same combinatorics that gave us (7) we have
$$ d(m^2) = \prod^r_{i=1}(2{s_i} + 1) \tag{9} $$

Now, finally, we can find the solution using (9):

// Returns the number of solutions for a given n.
// This still only uses prime factors up to a
// given maximum.
let solutions m maxPrime =
    primeFactors m maxPrime
    |> List.fold (fun acc (p, e) -> acc * ((2*e) + 1)) 1

And obtain the solution using an (empirically-determined-to-be) appropriate max prime seed:

// Returns the solution, finally, I think.
let problem108answer =
    { 1 .. System.Int32.MaxValue }
    |> Seq.find (fun n -> (((solutions n 17) + 1) / 2) > 1000)

This gives us a final answer of 180,180.

Meant to have a crack at #110 today also, but this problem took an embarassingly long time to solve.