Mar 25, 2006

turing & decidability

Decidability. Entscheidungsproblem. Hilbert wasn't the first to call for a solution to decidability. It goes quite a ways back. Decidability can be explained as a simple yes/no question. Does there exist an algorithm that decides the validity of a theorem? I'm using algorithm here in the standard sense of the term - a finite step by step procedure for producing a solution to a problem. In this case we’re talking about math problems. A problem such as, 'find the largest common factor between these two numbers.' The key idea here is the procedure to answering these types of questions is systematic. Follow the instructions precisely and you will arrive at an answer. In essence what Hilbert wants here is an uber-algorithm; one that can be used to test whether an argument is correct. Again this is a tall order much like the demands he placed on completeness and consistency but the mathematics community had hopes.

Turing came upon the problem in a math class on topology. Topology relies on set theory and set theory was a big thorn in the side of Hilbert because of the contradictions it created. Turing solved this issue in 1936 in his paper, "On Computable Numbers, with an Application to the Entscheidungsproblem". He was 24 years old. Basically the same age that Einstein produced his four famous papers and the same age Godel produced his. There is something magical about the mid-twenties.

Turing's paper has three parts - defining the idea of a computable number, the concept of a universal machine, and employing these concepts to prove that decidability is unprovable (I guess I just gave away the answer). Turing's solution makes use of an imaginary computer. He didn't call it that. Computers back then were people (usually women) who did what computers do today by hand. This machine is now called a Turing Machine but he called it an automatic machine or an 'a-machine'. It was simply a construct to help him with his analysis.

This in itself is pretty remarkable. First there were no computers in the modern sense. Second, no one in mathematics used constructs like this to develop their proofs.

Turing asked what are the possible mechanics that can be carried out in computing a number. Remember from Godel's work that computing a number can mean two things - calculating a numerical solution and calculating a theorem or argument. Recall Godel was able to number his arithmetic symbols and theorems. So Turing is talking about both of these 'numbers'. An example of the first one is finding the tangent of an angle (mathematics) and an example of the second one is calculating whether a theorem is true (meta-mathematics).

Let's describe Turing's machine. A tape runs through the machine. The tape is sectioned into squares that contain symbols (or blanks). At any time only one square is inside the machine. When the machine scans a symbol it effectively alters the state of the machine. In this way the machine can remember some of the symbols it has previously seen. The behavior of the machine at any moment is defined as the m-configuration. Including the symbol it is immediately scanning is defined as the machine configuration. And adding the other symbols that are on the tape defines the complete configuration. Depending on the configuration, the machine will both move the tape one or more squares forward or backward (or stop) and either write or erase a symbol (or do nothing) on the square that is in the machine.

So we can imagine a tape being fed into the machine that adds 6,439 and 815. On the tape it would appear as


I'm using the symbol, |, to denote the edges of the square. This kind of computation we can do it our head. But if the numbers were enormous we couldn't. Why? Because our memory is limited. Likewise Turing's machine is limited in memory in theory and so it must cut the overall computation into smaller components much like we would do. First add the 5 and the 9 at the ends of the numbers. You get 14. Write 14 down on some blank squares on the tape. Next add 3 and 1. You get 4. Head back to where you wrote 14 and add the 4 and the remainder 1 to get 5. Erase the 1 in 14 and write a 5 to give 54. And so on. Until you get to 7254. Nothing magical here.

Let's change our notation to a unary system. You're probably familiar with a binary system (0 and 1). Unary systems just have one character (1). Imagine our machine adds numbers. So we can feed into it a tape that looks like this

|1|1| |1|1|

A space here represents a separation between the two numbers we want to add. In this case the symbols on the tape represent the arithmetical statement 2 + 2. Feeding this tape through the machine should give us the obvious answer 4. But how do we ‘code’ the machine do to this. Turing suggested creating a table which laid out the m-configuration that the machine should be in at any given point and what the machine should do in that particular configuration. In this case the adding table looks like this,

m-configSymbolActionNew m-config
AblankMove 1 square to the rightA
A1Move 1 square to the rightB
B1Move 1 square to the rightB
BblankPrint 1; Move 1 square to the rightC
C1Move 1 square to the rightD
CblankMove 1 square to the leftD
DblankNo move; machine stopsD
D1Erase 1; machine stopsD

The machine starts in the m-configuration A. If you follow through the actions of the machine you can see the tape spits out,

| 1 | 1 | 1 | 1 |

This is just the number 4. The mechanics of this are straight forward. It steps through the first number and when it hits a blank space the machine realizes it has finished traversing through the first number. It then prints a ‘1’ in the empty space. Then it moves its way through the tape until it gets to another blank space and realizes it has finished traversing through the second number. It then backs up a space and erases the last ‘1’. This doesn’t seem like much but it’s helpful in pointing out how you can code a table to instruct the machine what to do. In other words this represents an algorithm.

Turing provides a second example using a binary number system (0, 1). Before I show the table let’s create a simplified method of showing the actions. R = Move right 1 space, L = Move left 1 space, P = Print the character that follows, E = erase the character that follows, and ‘;’ separates the actions. Turing also introduces some additional symbols to simplify the number of steps needed. Specifically x and §. These symbols don’t represent anything in the final answer since they must be in the form of 0's and 1's but they do assist in the calculation by making the number of steps less.

m-configSymbolActionNew m-config
AblankP §; R; P §; R; P0; R; R; P0; L; LB
B1R; Px; L; L; LB
B0No actionC
C0 or 1R; RC
CblankP1; LD
DblankL; LD
EE, 0, 1, x, §R; RE
EblankP0; L; LB

For this machine we feed a blank tape in. This will become clear later but we’re not calculating a number based on what is on the tape. Rather we are calculating a string of numbers that have a particular characteristic. We start in the m-configuration A. As you can see we encounter the blank square and then follow the actions.

Print §; Move right 1 square, print §, move right 1 square, print 0, move right 2 squares, print 0, move left 2 squares. We end with this:

|§|§|0| |0|

The bolded ‘0’ represents which square is inside the machine. Now we are in m-configuration B and we carry out the next step. Since we are sitting on a ‘0’, we carry out the step in the third row – No action – and move into m-configuration C. The symbol we are sitting on is ‘0’ so we move two squares to the right and remain in m-configuration C. And so on. If we carry out this table’s tasks we end with something like this

|§|§|0| |0| |1| |0| |1| |1| |0| |1| |1| |1| …

Note there are some blank squares here. Turing does this to minimize the number of steps. The blank squares serve as a crib space for the computer to take ‘notes’ and then they are erased later. Reformatting so it is easier to see what we have here, we get

0, 1, 0, 1, 1, 0, 1, 1, 1

Interpreting the 0's as spaces and the remaining 1s as a unary system we have,

1, 2, 3, …

Steps in the equation y = x + 1 if we plug 0, 1, 2, 3, etc into the equation. However if we interpret the 0's as spaces and the remaining numbers as binary we get

1, 3, 7, 15, 31, …

Steps in the equation y = 2^x -1 if we plug in 1, 2, 3, 4, etc.

In both interpretations the machine is kicking out an infinite series of numbers; a sequence. The machine never stops. Both examples we've gone through where the machine stops and where the machine goes on infinitely are examples of algorithms. One of Turing’s principal points at this point in the paper is that no matter how complex, there exists a Turing machine that can step through an algorithm based on some table of actions. Turing then goes on to transform his tables into a more efficient shorthand. In many ways this mimics what Godel did with his number system to represent arithmetical symbols. Let’s start with a simple table that prints out 0 1 0 1 0 1 0 1 repeatedly:

m-configSymbolActionNew m-config
AblankPrint 0; Move 1 square to the rightB
BblankMove 1 square to the rightC
CblankPrint 1; Move 1 square to the rightD
DblankMove 1 square to the rightA

Now here is how Turing simplifies this.

We rename all the configuration states with Q’s and sub them to distinguish one m-configuration from another. For example Q1, Q2, Q3, Q4. Likewise all the symbols are renamed with S’s. For example S0, S1, S2, S3, S4. S0 = blank, S1 = 0, S2 = 1. Keep the P, R, and L notation and eliminate the E (erase) notation since we can now print a blank with P S0. The table is simplified to this

m-configSymbolActionNew m-config
Q1S0P S1 RQ2
Q2S0P S0 RQ3
Q3S0P S1 RQ4
Q4S0P S0 RQ1

Harder to read but there’s nothing magical here. We’re just using symbols to represent other symbols. Now let’s get rid of the table itself and use semi-colons to represent separations of rows and also get rid of the P since it’s just repetitive.

Q1 S0 S1 R Q2; Q2 S0 S0 R Q3; Q3 S0 S2 R Q4; Q4 S0 S0 R Q1

Even harder to read! But we still have the same information that was in the table above. Now Turing assigns a letter to each symbol in the following way. Each Qi will be replaced with a D and then the letter A repeated i times. For example Q2 = DAA. Si will be replaced with a D and then the letter C repeated I times. For example S3 = DCCC. The rest is the same. N = no move. Doing this we get


Now this is officially unreadable without a crib sheet. Don’t worry about that. Just remember our string here is just a coding of the m-configuration table we had before. Turing calls this the Standard Description or the SD of the machine. Believe it or not Turing has another transformation. A very Godellian one. 1 = A, 2 = C, 3 = D, 4 = L, 5 = R, 6 = N, 7 = ;. Making the replacements we get,


This Turing calls the Description Number of DN. You might be able to guess where we’re going with this. So let’s step back and review because we haven’t done much. We showed that for any algorithmic procedure there is a machine which can follow a set of instructions in table form to carry out that procedure. We then went through a methodology to rewrite the table instructions into a very long number. Also it should be clear that for any computable sequence (i.e., the output of an algorithm) there exists at least one description number. But for any description number there can only be one computable sequence. Think about those last two and they will become self-evident.

You can imagine a situation where we have an infinite number of machines. Each machine is numbered from 1 to infinity. The number on the machine corresponds to the description number within the machine. Naturally not all of these machines would produce a computable sequence of interest. Some wouldn’t do anything. Roger Penrose refers to these as 'duds'. In general these duds are circular machines. Imagine a machine that simply moved to the right and printed nothing, moved to the left and printed nothing and then restarted the sequence. It would go around in circles doing nothing. Circular.

A machine that is not circular in nature will produce some computable sequence regardless of whether that sequence is useful or not. Now Turing poses the question, "can a machine determine where another machine is circular or not?". In other words, is there a machine that can determine if another machine will compute something and then stop or halt or determine if it will continue spinning around in circles. This was referred to as the Halting Problem.

To address this problem Turing puts forth the concept of the Universal Machine. This Universal Machine can imitate the actions of another machine. You can see how this really is analogous to today’s computers. The Universal Machine must be ‘programmable’. If the Universal Machine can mimic any other machine it can therefore compute any computable sequence. Imagine we have a Universal Machine. We thread into it a tape which has a number on it. This number corresponds to the description number of the machine we wish to analyze. Let’s call this machine under analysis, M.

Now Turing considers a third machine. He calls this M’. You don't need to really understand how this M' works to understand what Turing eventually does. This M' really is to establish the basis for the Universal Machine. This machine M' will write down on a tape, the successive complete configurations of M. What do I mean by this? Remember the ‘complete configuration’ at any stage in M’s calculation as the number on the scanned square (the one in the machine), the complete sequence of all symbols on the tape, and the m-configuration (the descriptive table of actions). Just as the table of actions can be described by letters or numbers (as we did above), the complete configuration of M at each point during its calculation can be described in the same fashion.

For example, let's consider the machine that wrote down 00101101110111... Once again the m-configuration is written with D followed by the appropriate number of As and the symbol is D followed by the appropriate number of Cs. An easier way to think about all of this is that DA means configuration A or DAA means configuration AA (or B). And D = blank, DC = 0, DCC = 1, DCCC = §.

Using this we can print out the symbols the M' machine prints out.


What is that? Okay let’s spell out what this means. The first DA means the machine is in configuration A. The second sequence (§§ B 0 blank 0) is what M prints out as a result of being in configuration A. Except for one difference. There's a B in there. The B shows where the tape resides in the machine (on the 0) and also states that the machine then moved into m-configuration B. A very clever notation system if you ask me. The third sequence (§§ C 0 blank 0) shows the next stage of the printout and also what m-configuration it is going into (C). Note nothing has changed because m-configuration B states do nothing if it resides on a 0 and simply go into state C. And so on the sequence goes.

If you don’t understand any of this, don’t worry. Just keep in your mind that the M’ machine is cataloging at each step what the machine M is printing, the state it is in, and the point at which the tape sits in the machine. In other words it is cataloging the sequence of ‘complete configurations’ the machine M undergoes.

The one thing this sequence of D’s, A’s, C’s, and colons doesn’t print out are the numbers that the M machine would print. We have the number encoded into the letters but not the actual numbers. Turing changes his M’ machine to also add the numbers the M machine prints out. So M’ is printing the complete configurations and also the computable sequence of numbers now. I won't show what this looks like.

Using this approach Turing was able to set up the table of actions that would describe the Universal Machine. Turing’s description of this is complex. Penrose thinks the description is more confusing than the machine itself. But let’s remind ourselves what the Universal Machine does. We have some machine of interest (call it machine T). T prints out some sequence. Let’s says it’s 010101010... If we take the description number of T and feed it into U (the Universal Machine) it will spit out T:010101010... In other words it prints out that "the machine T will print this sequence - 010101010... The amazing thing here is that Turing actually manages to describe the table of actions to do this.

Now let’s go back. Let’s imagine we have an infinite number of tapes - 1 to infinity - that we can feed into our Universal Machine. One of those numbers by definition will be the description number for U since Turing showed you could write an action table for U. Apparently it’s 1,653 numbers long and I most definitely won’t replicate it here.

Also recall that some of these tapes cause U to create a sequence and some are circular. Turing calls the numbers that generate a sequence Satisfactory and those that are circular Unsatisfactory (duds).

Now Turing gets to the heart of the problem. Turing asks if there is an algorithm (or a specific Turing machine) which can act on the description number of another Turing machine and determine whether it is satisfactory or unsatisfactory. Let’s say we can build this machine. If we feed a circular (unsatisfactory) description number into it, it prints a 1. If we feed a non-circular (satisfactory) description number into it, it prints a 0.

If this machine really did exist it would be an answer to the Decidability Problem because we would have a method to determine if an algorithm or Turing machine was decidable. How so you ask? We consider if we are trying to determine the decidability (truth or falsity) of the Goldbach Conjecture (i.e., every number that is greater than 2 is the sum of two primes), we would feed the description number of the machine that broke sequential numbers down into primes and stopped only when it had found one that didn’t break down into two primes, into our ‘Decidability Machine’ and if it prints a 1 (circular) we know the Conjecture is true (because it would not stop if it could never find a number that didn’t breakdown into primes) and a 0 if the Conjecture was false (because it found a number that didn’t break down into primes and halted).

Does this Decidability Machine exist in theory? To tackle this Turing turns to reductio ad absurdum. This means he assumes something, follows the repercussions of that assumption through, and determines if an absurdity occurs. If an absurdity is reached then the original assumption must be incorrect. In this case Turing assumes the machine does exist.

Let’s have this Decidability Machine (let’s call it D) work through the list of all numbers (description numbers). If the machine prints out a 1, it’s a ‘bad’ machine and circular. If the machine prints out a 0, it’s a ‘good’ machine and circular free.

Now let’s take all the description numbers that are good. From this list we print out the computable sequence that the corresponding machines would create. So now we have a list of description numbers (which we know are good) and a list of corresponding computable sequences that U would spit out if we stuck those description numbers into it.

Take the computable sequences and list them into a table with infinite rows (there would be an infinite number of good sequences). Perhaps it might look like this,


Now take the numbers that run along the diagonal to create a new sequence - 1, 4, 66, 2211, … Now add 1 to each number to get - 2, 5, 67, 2212... Now convince yourself that this number appears nowhere in the full table of good numbers. It can’t because the first row differs in the first number, the second row differs in the second number, and so on. It's a clever tool in mathematics.

Since we created this number through a process it is by definition algorithmic. And certainly a Turing machine could produce it because Turing machines can replicate all algorithmic processes. Then how can it be that this number is not in the table above? It is algorithmic, non-circular and our list above is supposed to contain all sequences that have that description. Clearly we have a contradiction. We’ve generated the full list of computable sequences but we’ve used an algorithm to create another one that is by definition not in the list. Turing's assumption was wrong. There is no Decidability Machine and thus Decidability does not exist.

Turing also offers another proof using the Universal Machine. Imagine we can link the Decidability Machine to the Universal Machine to create a new machine, DU. Into this machine we feed the description number for some other machine M. DU now runs. First D works on it and it determines if M is circular of non-circular. If M is circular the machine stops. If M is not circular it is fed into U in which case U mimics M’s behavior. Because we check with D first there is no chance the DU is circular.

Now let’s feed DU’s description number into DU. DU determines that DU is not circular. It feeds the description number into U and U mimics DU’s behavior. This behavior means it feeds DU into D and then into U. It then feeds DU into D and then into U. Hey wait a minute. This is circular. But we just showed that DU wasn’t circular. Since Turing has shown that U is a machine that can exists, it must be that D is the one that cannot exist. And just like that Decidability is dead again.

No comments: