Thursday, February 15, 2007

Nothing New Under The Sun vs. You Can't Teach An Old Dog New Tricks

I mentioned in an earlier post that I'm doing a bit of 'remedial software engineering' at my new job. This project involves taking a large bit of legacy code, and re-organizing it. After talking about it with several of my colleagues, it occurred to me that the proper way to do this was to split the functionality up into Unix-like filters that can be piped together.

Filters and pipes are just one of the reasons that programmers tend to fall in love with Unix. In general, a program that acts like a filter reads data from standard input, performs some operation on it (select part of it, change part of it, sort it, etc.), and then writes the result to standard output. A pipe (or pipeline) consists of a string of filters, each of whose output becomes the input for the next filter in the pipeline.

This is an extraordinarily powerful idea. It allows the programmer to concentrate on having a program do just one thing (e.g., select all of the lines in a file that contain a given string), yet be able to use that program in conjunction with other programs to achieve some more complicated results.

Consider a simple example. Here are some commands that work on my Linux machines here at home:
  • 'ps -e' = list all of the processes currently running, showing their process IDs and their names
  • 'grep STRING' = read all of the input coming into the program one line at a time, but only write out those lines that contain STRING
  • 'wc -l' = read all of the input coming into the program one line at a time, but only write out how many lines were input
By themselves, these programs are certainly useful, but using the pipeline notion, I can quickly combine them to do something even more useful. Suppose I wanted to know how many processes my webserver had running, ready to answer web requests from outside. Since I know that the webserver processes are named 'httpd', I can then answer my question like this:
ps -e | grep httpd | wc -l
(The vertical bar symbol is called, appropriately enough, the 'pipe' symbol.) The above command line first executes 'ps -e' to get a list of all the running processes, then puts the resulting list through 'grep httpd' to select only those lines in the list that contain the string 'httpd', and then counts those. Of course, I could have written a program to answer this question. However, since I have these filter-type programs, and pipes, I didn't need to write anything new, but could simply string together the functions that I needed to solve my immediate problem.
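For a sense of what such a filter looks like on the inside, here is a minimal sketch in Python of a grep-like line selector. It is not how grep itself is written; the names select_lines and run_filter are made up for this illustration, and the matching is a plain substring test.

```python
import sys

def select_lines(lines, needle):
    """Yield only those lines that contain the string `needle`."""
    for line in lines:
        if needle in line:
            yield line

def run_filter(needle, source=sys.stdin, sink=sys.stdout):
    """Behave like a filter: read lines from `source`, write matches to `sink`.

    Both names are hypothetical; by default the function reads standard
    input and writes standard output, which is all a filter needs to do.
    """
    for line in select_lines(source, needle):
        sink.write(line)
```

A script that simply calls run_filter('httpd') could then sit in the middle of a pipeline in place of 'grep httpd', since it touches nothing but its standard input and standard output.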

The filter/pipe idea is so powerful and useful that even competing operating systems, like Microsoft Windows, have borrowed it. Microsoft's DOS, which early versions of Windows sat on top of, implemented a very similar kind of idea years ago (but of course long after Unix had it). Neither Unix nor DOS, nor any other operating system that I know of, has come up with an approach that really matches the power and simplicity of this idea for organizing computation.

All of the above was just to explain the notion of filters and pipes so that you could understand what I wanted to do with the legacy code I inherited. My idea has two key parts. First, re-group the existing code into discrete, well-defined chunks, each of which takes some standard kind of input, performs a single operation, and then returns a standardized kind of output. Second, implement some sort of framework that allows the user to choose these well-defined chunks, and string them into a desired sequence. With this scheme, one can quickly put together a system to solve a particular problem without having to write a customized piece of code to do it.
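The two parts of that idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the design of the actual legacy system: the "standard" input and output here are plain iterables, and the names select, count, and pipeline are invented for the example, as is the sample data.

```python
from functools import reduce

def select(needle):
    """Build a stage that keeps only the items containing `needle`."""
    def stage(items):
        return (item for item in items if needle in item)
    return stage

def count():
    """Build a stage that emits a single item: how many items it received."""
    def stage(items):
        yield sum(1 for _ in items)
    return stage

def pipeline(*stages):
    """String stages together; each stage's output feeds the next one."""
    def run(items):
        return reduce(lambda data, stage: stage(data), stages, items)
    return run

# Hypothetical sample data standing in for the output of 'ps -e'.
processes = ["1 init", "812 httpd", "813 httpd", "900 sshd"]

count_httpd = pipeline(select("httpd"), count())
print(list(count_httpd(processes)))  # prints [2]
```

Because every stage takes an iterable and returns an iterable, any stage can follow any other, which is the property that lets a user assemble a new sequence without writing customized code.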

All of that seemed like the obvious way to go. Then I began having second thoughts. Perhaps one of the (dis)advantages of growing older is that the absolute certainty of youth slowly gives way to the gray-colored ambiguities of middle age. Was a pipe/filter scheme really the best way to handle this design? Would a more integrated platform be a better approach? Was I so stuck in a particular mindset that I could not see a better way to do it? Is there a better new trick that this old dog just can't learn?

I sometimes fear that is so. When I bump into new software development approaches (Java and its menagerie of associated folderol come to mind), there are times when I just can't make myself buy into them. In fact, I find myself making excuses to avoid having to use the new approach. It feels like the costs of learning the new system far outweigh any visible benefits of doing so. However, if I am honest with myself, that might simply be because of an old-dog syndrome.

I don't have an answer to this particular question with regard to the task at hand. So, I guess I'll proceed along the path that I can see. Unfortunately, it can be demotivating not to be sure of the wisdom of one's approach.

1 comment:

marylea said...

My you write well!! As an adult learner, which is how I see myself, having returned to school twice for degrees after I turned 36, I understand the dilemma you describe, even if I lack the computer engineering insights you have. I find myself resisting the cell phones that can do 15 tasks in addition to ringing if someone calls me. It's a bit overwhelming. This year I rather hit the wall on all the technology there is to learn. My grandmother had to learn to drive a car, and later, work a radio, and later, handle a television. Every year that passes I am required to learn a new skill. Today, running my errands, every store acted like I put them out by writing a check rather than swiping a card... It's a brave new world out there, Bill. And we're in it!