Since I am interested in both linguistics and computer science, I occasionally dabble in natural language processing. Living in South Korea and having studied Korean for several years now, I’m getting cocky enough to think that I can explain to a computer how to conjugate Korean verbs. Actually, this is something I have started over and over again in different languages, but I just wasn’t getting anywhere before. This time I started it in my latest obsession, Erlang, and I have to say pattern matching is where it’s at when it comes to processing this sort of information. Here’s a sample test that merges 보 (the stem of 보다, “to see”) with 아 to make 봐 (“see”, present simple):
<<"봐"/utf8>> = merge(<<"보"/utf8>>, <<"아"/utf8>>).
And here’s the pattern-matching clause that handles that merge once each syllable has been broken into its lead consonant, vowel and padchim:
merge({character, Lead, <<"ㅗ">>, none}, {character, _, <<"ㅏ">>, Padchim}) ->
[{character, Lead, <<"ㅘ">>, Padchim}];
Currently the program has many similar definitions. It might be hard to follow what’s going on if you can’t read Hangul or don’t understand Korean grammar, so I’ll give you a brief overview. Korean has a phonetic “alphabet” whose letters are grouped into syllable blocks: each block is composed of a lead consonant and a vowel, optionally followed by one or two final consonants (the padchim). Conjugation often requires adding vowels or consonants to the verb stem, but in Unicode every resulting block is a separate precomposed character, so you need to calculate the appropriate code point offset. A great overview of that tedious but important work can be found here.
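To make the offset calculation concrete, here is a minimal sketch, not the program described in this post. It relies on the standard Unicode layout for precomposed Hangul syllables: the block starts at U+AC00 and is ordered by 19 lead consonants, then 21 vowels, then 28 final positions (27 finals plus "no final"). The module name `hangul_sketch` and the functions `decompose/1`, `compose/3` and `merge_parts/4` are hypothetical names chosen for illustration, and the sketch works on index triples rather than the `{character, Lead, Vowel, Padchim}` tuples shown earlier.

```erlang
%% Hypothetical sketch: all names here are illustrative, not the author's code.
-module(hangul_sketch).
-export([decompose/1, compose/3, merge/2]).

-define(BASE, 16#AC00).   % code point of the first precomposed syllable
-define(LAST, 16#D7A3).   % code point of the last precomposed syllable
-define(VOWELS, 21).      % number of medial vowels
-define(TAILS, 28).       % 27 final consonants plus "no final"

%% Split a single precomposed syllable (a UTF-8 binary) into
%% {LeadIndex, VowelIndex, TailIndex}; TailIndex 0 means no padchim.
decompose(<<Code/utf8>>) when Code >= ?BASE, Code =< ?LAST ->
    Offset = Code - ?BASE,
    {Offset div (?VOWELS * ?TAILS),
     (Offset div ?TAILS) rem ?VOWELS,
     Offset rem ?TAILS}.

%% Inverse of decompose/1: rebuild the syllable as a UTF-8 binary.
compose(Lead, Vowel, Tail) ->
    <<(?BASE + (Lead * ?VOWELS + Vowel) * ?TAILS + Tail)/utf8>>.

%% Merge a one-syllable stem with a one-syllable ending.
merge(Stem, Ending) ->
    merge_parts(decompose(Stem), decompose(Ending), Stem, Ending).

%% Vowel indices used below: 0 = a, 4 = eo, 8 = o, 9 = wa, 13 = u, 14 = wo.
%% Lead index 11 is the silent ieung that starts a vowel-initial ending.
merge_parts({Lead, 8, 0}, {11, 0, Tail}, _, _) ->
    compose(Lead, 9, Tail);            % o + a  -> wa
merge_parts({Lead, 13, 0}, {11, 4, Tail}, _, _) ->
    compose(Lead, 14, Tail);           % u + eo -> wo
merge_parts(_, _, Stem, Ending) ->
    <<Stem/binary, Ending/binary>>.    % no contraction: just concatenate
```

With this sketch, `hangul_sketch:decompose(<<16#BCF4/utf8>>)` (the syllable 보) gives `{7, 8, 0}`, and merging it with 아 (U+C544) yields 봐 (U+BD10). Matching on indices keeps the arithmetic in two small functions, while the contraction rules stay as one clause each.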
I have implemented this in other languages, but I think pattern matching and functional programming have helped me solve this problem in a way that leaves a slightly fresher taste in my mouth. One thing that tends to happen, at least to me, when writing in an imperative language is that I spend a lot of time building data structures and figuring out how to iterate over them, only to produce something that ends up looking a lot like a pattern matching engine anyway! Pattern matching isn’t hard to implement, but I didn’t realize that was what I was doing until I saw a language that supported it out of the box. I have heard functional programming zealots make some outrageous claims, such as “Functional programming eliminates the need for patterns.” Yeah, right! One thing I’m really concerned with is refactoring, now that the verb stem itself needs to change for ㅂ and 르 irregular verbs. I’ll let you know how it goes!