When I was much younger I wrote a program that took in sentences people typed and responded with a sentence of its own, built by randomly selecting, for each position, a word that had appeared in that position in earlier input. Needless to say, talking to my program left a lot to be desired. But I thought it was really cool and that I had stumbled upon AI all on my own. Man, was I wrong.

I have many interests, including evolution and computer science. Because of those diverse interests I was totally blown away in college when I got my hands on a copy of Flake’s “The Computational Beauty of Nature,” an attempt to explain chaos, fractals, evolutionary computation (EC), game theory, and several other fields all through the Theory of Computation. It wasn’t the first time I had stumbled across Evolutionary Computation or Evolutionary Algorithms, but it was a great read nonetheless and I highly recommend it. The other day I was trying to think of ways to use my new, improved monthly benefit from work (in Korea, offices give you a stipend for self-improvement; at least ours does). I went to the bookstore and found a very gentle introduction to Genetic Programming. I don’t have a copy of TCBoN here with me in Korea but, as I recall, Genetic Programming is only briefly mentioned there. After graduating I purchased one of Koza’s books; he was the original “inventor” of GP.

Basically, Genetic Programming is a specialized form of Evolutionary Computation. The engine of EC is selection: users define a fitness function which tests each organism in a population. Initial populations are randomly generated, but over time they come to be inhabited by more and more successful organisms. Candidates are improved through slight mutations and through crossover in breeding (taking roughly half of one solution and swapping it with a partner’s). Wikipedia can explain it better than I can. Most Evolutionary Algorithms require that you define a way of encoding candidate solutions; in GP, that encoding is a program itself. Even though I understood the basic concepts, I was totally overwhelmed by the complexity of Koza’s book, since it dwells on the specific problems he was using GP to solve. I dabbled a little with writing some GP tools. For some reason I was concerned with speed, so I wrote them in C. Pointers, yuck! They solved some toy problems and I thought it was cute.
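The selection–mutation–crossover loop described above can be sketched in a few lines of Python. Everything here — the target, the operators, the parameters — is illustrative, not from any particular library; swap the list of integers for a tree of program nodes and you have, in essence, the jump from a genetic algorithm to Genetic Programming.

```python
import random

TARGET = [3, 1, 4, 1, 5, 9, 2, 6]  # an arbitrary goal, just for illustration

def fitness(organism):
    # Higher is better: negative total distance from the target.
    return -sum(abs(a - b) for a, b in zip(organism, TARGET))

def crossover(a, b):
    # Take roughly half of one parent and swap in the rest from the other.
    cut = len(a) // 2
    return a[:cut] + b[cut:]

def mutate(organism, rate=0.1):
    # Each gene has a small chance of drifting up or down by one.
    return [g + random.choice([-1, 1]) if random.random() < rate else g
            for g in organism]

def evolve(pop_size=100, generations=300):
    pop = [[random.randint(0, 9) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 0:              # a perfect organism appeared
            break
        survivors = pop[:pop_size // 2]       # truncation selection
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children            # elitism: survivors carry over
    return max(pop, key=fitness)
```

Because survivors carry over each generation, the best fitness never decreases, and a run of `evolve()` typically closes in on TARGET well before the generation budget runs out.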
I still thought that programming was something to be done by humans. I don’t think that way anymore. When I was in the bookstore staring at the textbook, it all came back to me. My Moby Dick. You see, in a former life I was a web developer, right around the time when unit testing was all the rage. While writing tests I kept looking at them and having the same idea. I’ve tried implementing it several times, but it’s meta-meta stuff, so it gets real ugly real quick. It requires a thorough knowledge of introspection in the language you are implementing it in. The implications of this idea are simple: in the future, developers will only need to write unit tests, and computers will breed the code for you. That is the first step; after that, of course, computers will write their own code from specifications supplied in some even easier form. I’m not a futurist, so my vision gets hazy over the horizon, but I’m very sure that humans will have stopped writing code within my lifetime, and I’m pretty sure that something like what I’m about to explain is the first step.

I’ve had this idea for a couple of years and I haven’t been able to produce a working implementation. It might be because I’m using scripting languages that don’t do a terribly good job of type introspection (Python), or it might be because this is a really hard problem. It could also be that I’m not as good a programmer as I think I am ;-). I believe this to be a novel idea, because currently GP is only good at solving isolated problems, and the solutions aren’t easily interfaced with existing code. By extracting the necessary information from unit tests, the appropriate terminals and functions can be seeded into the population. Done properly, this will also let the generated programs leverage existing code, and it will let developers specify interfaces to the generated code and build constraints using paradigms they are already familiar with. Need a class for an object-oriented language? No problem. Write the unit test, the appropriate information about the API will be extracted, and a population will begin breeding, trying to satisfy all of the constraints your tests have placed on it.

We’ll start with a simple example:

def really_basic_test(organism):
    assert organism(0) == 0
    assert organism(1) == 1
    assert organism(2) == 4
    assert organism(5) == 10

This is just a unit test for a function. A list of terminals and functions is extracted by analyzing the test’s abstract syntax tree. We can infer, for example, that the organism is going to receive an integer as input. A statically typed language with good introspection would be very helpful at this point. I’ve looked for such languages and… maybe I didn’t really look that hard, because I don’t like statically typed languages :-). Anyway, we also know that we will be returning an integer. Because we are dealing with integers, a little introspection yields a list of operations that can be applied to integers in Python: from operator import mul, div, add, sub, etc. Behind the scenes, this function is converted into a fitness function and used to evaluate the population.
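One way to make that conversion (a sketch with hypothetical names, not a real framework) is to break the test into individual assertions and count how many pass, so that partially correct organisms still score better than useless ones:

```python
def fitness_cases(organism):
    # Each (input, expected) pair from the test becomes one fitness case.
    cases = [(0, 0), (1, 1), (2, 4), (5, 10)]
    score = 0
    for arg, expected in cases:
        try:
            if organism(arg) == expected:
                score += 1
        except Exception:
            pass  # a crashing organism simply earns no credit for this case
    return score  # 0..len(cases); higher is better
```

A candidate like `lambda x: x * x` scores 3 out of 4 (it misses only the last case), so selection can prefer it over a random organism long before a full solution appears.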

A slightly more complex example:

def test_string_split(organism):
    assert organism('insane/goat') == ['insane', 'goat']
    assert organism('insane/monkey') == ['insane', 'monkey']
    assert organism('jon/bob') == ['jon', 'bob']

Given the above fitness test, we know that we are given a string and must return a list of strings. So we seed the function set with some builtins that take a string and return a list, along with the string methods that return lists. This sort of inference can be done with a method finder written in Python (I just so happen to have written one of those ;-)). It looks like we’ll also need to seed with some random characters and strings (initially, I suppose, it would be a good idea to seed with some of the characters and strings provided in the tests).

Sidenote: Method finders were originally created in Smalltalk (as I recall), and there is also a Ruby method finder I’ve run across. Basically, you give the method finder an object, some input, and an expected result, and it finds all of the methods or built-in functions that yield the expected result. They are a great tool for exploring unknown APIs and discovering things you didn’t know about languages you use every day. Currently, my Python method finder is in the toy stage, but it works. Here are some examples (note that it detects side effects, which I believe are the root of all evil ;-)):

>>> methodfinder('jon/bob', '/', ['jon', 'bob'])
'jon/bob'.rsplit('/') == ['jon', 'bob']
'jon/bob'.split('/') == ['jon', 'bob']

>>> methodfinder([1, 2, 3, 4], None, 4)
len([1, 2, 3, 4]) == 4
max([1, 2, 3, 4]) == 4
[1, 2, 3, 4].__len__() == 4
[1, 2, 3, 4].pop() == 4

>>> methodfinder([1, 2, 3, 4], 5, [1, 2, 3, 4, 5])
o = [1, 2, 3, 4]
o == [1, 2, 3, 4, 5]
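A bare-bones method finder (a sketch, not the toy implementation shown above) can be written with ordinary introspection: try each method on a copy of the object and report the calls that reproduce the expected result, flagging any that mutate the receiver along the way:

```python
import copy

def methodfinder(obj, arg, expected):
    """List the methods of obj that map (obj, arg) to expected."""
    hits = []
    for name in dir(obj):
        if name.startswith('__'):
            continue                      # skip dunders for readability
        if not callable(getattr(obj, name)):
            continue
        probe = copy.deepcopy(obj)        # protect obj from side effects
        try:
            method = getattr(probe, name)
            result = method(arg) if arg is not None else method()
        except Exception:
            continue                      # wrong arity or type; not a match
        if result == expected:
            call = f'{name}({arg!r})' if arg is not None else f'{name}()'
            note = ' (mutates the receiver!)' if probe != obj else ''
            hits.append(f'{obj!r}.{call} == {expected!r}{note}')
    return hits
```

With this sketch, `methodfinder('jon/bob', '/', ['jon', 'bob'])` turns up split and rsplit, and `methodfinder([1, 2, 3, 4], None, 4)` flags pop as a mutator — the side-effect detection mentioned above.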

And now, for full blown API usage in GP:

def test_xml_xpath_use(organism):
    import lxml
    assert organism('<b>text</b>') == 'text'
    assert organism('<a>text</a>') == 'text'

The method finder isn’t ready for stuff like this, but someday it will be. The lxml library contains an XPath engine somewhere inside it, and it probably also has something that returns a child node. Since lxml was imported in the test, it should be part of the seeded functions. A run would probably hit on a solution that finds the text value of the first child node, either using XPath or some other function. Of course, the system could be modified so that speed is automatically considered part of fitness, and eventually we would get the fastest solution.
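For reference, here is the kind of organism such a run might converge on — sketched with the standard library’s ElementTree, whose interface lxml.etree mirrors (whether evolution would land on exactly this form is anyone’s guess):

```python
import xml.etree.ElementTree as etree  # lxml.etree exposes the same API

def organism(xml_string):
    # Parse the markup and return the text content of the root element.
    return etree.fromstring(xml_string).text
```

This satisfies both assertions in test_xml_xpath_use; an XPath variant (lxml only) would pass just as well, which is exactly the kind of equivalent-solution redundancy a GP population thrives on.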

One of the biggest problems with a system like this is going to be what happens when the fitness landscape is not gradual:

def not_a_good_fitness_landscape(organism):
    assert organism('<form id="login"><label for="username">Username:</label>'
                    '<input name="username" /></form>') == "Username:"
    assert organism('<form id="x"><label for="x">Username:</label>'
                    '<input name="x" /></form>') is None

'//form[@id="login"]/label[@for="username"]/text()' and equivalent strings are pretty unlikely to be hit upon by a GP search of the solution space.

When it comes to exposing an API, tests could be written to define the API so that a human, or another GP-evolved population, can make use of it:

def simple_class_test(organism):
    assert organism.name == 'Dan'
    assert organism.age == 29
    assert organism.speak() == 'hello there'
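For scale, the organism this test demands is tiny — here is a hand-written stand-in for what the population would be breeding toward (the class name is mine, not part of any framework):

```python
class Organism:
    name = 'Dan'
    age = 29

    def speak(self):
        return 'hello there'
```

An evolved individual need not look this clean, of course — it only has to expose the same attributes and method so that simple_class_test passes when handed an instance.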

Populations from previous runs can be cached so that, if a test is modified, we can quickly evolve a solution when the new tests describe an organism on the same fitness peak. In parallel, another random population can be created in case the peak lies elsewhere. For example, if I go back and change the age attribute to 30, the organisms from the previous run’s population would all do very well (most of them scoring 2/3, I imagine). Getting to the right solution is that much closer.

One last important thing to consider when evolving a solution: the organisms will have to evolve in a sandbox (otherwise they could delete files or start making network connections). Also, if anyone ever gets around to implementing this, don’t forget to build a very large negative score into the generic fitness function so this puppy doesn’t become SkyNet.