You'd think I'd learn. You really would.

I've posted a number of entries about issues with ZFC set theory, all of which, through the patient help of a number of readers, have failed to live up to their promise. As it stands, nothing I've said so far has shown any real problems within set theory. ZFC set theory has proven itself durable, and although I find some of the formal system's attributes non-intuitive, it has proven itself consistent. It is exactly as it was designed to be.

I am going to present another possible argument (I've learned not to say proof :-) that most likely will crash itself on the rocks of set theory, like all of the others. Still, I am too much of a stubborn fool to just turn and run ...

This argument will come in three parts:

1. I will define a data-structure that exists OUTSIDE of set theory.

2. I will use the data-structure to define a set.

3. I will apply Cantor's diagonal construction to the set, and examine the results.

THE DATA-STRUCTURE

This section will define a data-structure that will be useful later in defining a set. The data-structure itself is not intended to be within set theory, and is only really useful as a convenient method of enumerating the number values to be placed in the set.

Let's start by specifying three irrational values:

sqrt(2)/10 = 0.141421356...

e/10 = 0.271828182...

Pi/10 = 0.314159265...

Each of these values is irrational; that is, its decimal expansion is an infinitely long, non-repeating sequence of digits. The division by 10 just shifts the digits; it doesn't alter the non-repeating representation.

Let's construct a subtree P0 such that:

- it is an N-ary tree, where each and every node has exactly 10 children.

- the value placed at the root is 0.

- the value of each child node, from left to right, is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9; this extends to all of the children, all of the way down to infinity.
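Although the tree itself is meant to live outside set theory, its shape is easy to sketch in code. Here is a minimal model (the names `Node` and `follow` are mine, purely illustrative) of P0 as a lazily generated 10-ary tree:

```python
# A lazy model of the subtree P0: every node holds a digit, and its
# 10 children are generated only on demand (the tree is infinite).

class Node:
    def __init__(self, digit):
        self.digit = digit

    def children(self):
        # Each and every node has exactly 10 children, labelled 0..9.
        return [Node(d) for d in range(10)]

def follow(branches):
    """Walk from the root of P0 along a finite prefix of a path,
    returning the digits visited; the root holds the value 0."""
    node = Node(0)
    visited = [node.digit]
    for b in branches:
        node = node.children()[b]   # descend into the b-th child
        visited.append(node.digit)
    return visited

# The first four levels of the path for sqrt(2)/10:
print(follow([1, 4, 1, 4]))   # [0, 1, 4, 1, 4]
```

Of course, only a finite prefix of any path can ever be materialized; the infinite paths themselves are defined, never computed.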

We can define a path through P0 as the sequence of digits starting at the root and continuing all of the way to infinity. Thus:

- All paths are infinitely long.

- There are an infinite number of these paths in the structure.

Each path expresses either a finite rational, an infinitely repeating rational, or an infinitely long non-repeating irrational. A few examples of each are:

a) 0.000000000... is a finite rational. (commonly written as 0)

b) 0.000110100... is a finite rational. (commonly written as 0.0001101)

c) 0.333333333... is a repeating rational (commonly written as 1/3)

d) 0.151515151... is a repeating rational (commonly written as 5/33, i.e. 1/6.6)

e) 0.999999999... is a repeating rational

f) 0.141421356... is an irrational (commonly written as sqrt(2)/10)

g) 0.271828182... is an irrational (commonly written as e/10)

h) 0.314159265... is an irrational (commonly written as Pi/10)

We can see that a) is in the structure, starting at the root and continuing down each and every far-left child (0) until infinity. b) goes from the root through three 0s, then a 1, another 1, then 0, 1, and continues down the 0 branches for the rest of the way. c) follows the 3 child (the 4th branch) at every level, and d) alternates between 1 and 5. e) follows the far-right child (9) all of the way down to infinity.

f), g) and h) follow a much more complex path from the root, twisting and turning at each level in a way to guarantee that they aren't going to repeat in any way.
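The twisting path for each of these values is just the digit sequence of its decimal expansion. As a sketch (floating point only gives us the first dozen or so digits accurately, which is enough here):

```python
# Recover the branch taken at each level of P0 for a value in [0,1):
# the n-th branch below the root is simply the n-th decimal digit.
import math

def path_prefix(x, n):
    """First n decimal digits of x in [0,1), i.e. the first n
    branches taken below the root of P0."""
    digits = []
    for _ in range(n):
        x *= 10
        d = int(x)
        digits.append(d)
        x -= d
    return digits

print(path_prefix(math.sqrt(2) / 10, 9))  # [1, 4, 1, 4, 2, 1, 3, 5, 6]
print(path_prefix(math.e / 10, 9))        # [2, 7, 1, 8, 2, 8, 1, 8, 2]
print(path_prefix(math.pi / 10, 9))       # [3, 1, 4, 1, 5, 9, 2, 6, 5]
```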

By construction, this tree contains paths that specify all of the permutations of all possible real values in the range [0,1). Although each node has exactly 10 children, there are an infinite number of paths, and each path is infinitely long.

Because it is fully specified, we can say that 0.0000000..., 0.333333333... and 0.999999999... are paths in P0. We can also say that sqrt(2)/10 is a path in P0. And we can say that both e/10 and Pi/10 are paths in P0. There are an infinite number of these defined paths in P0.

We can now construct a tree called T that has an infinite number of children, stretching out to infinity on the right. For convenience we can say that the value contained at the root of T is 0.

To construct the children we can start by adding in P0, at the left-hand node. To the right, increasing by 1 each time, we can add in an infinite number of children:

P0, P1, P2, P3, P4, ...

P1 is similar to P0, in that it permutes all possible combinations of paths, but its root value is 1 instead of 0. P2 is 2, P3 is 3, etc. Each Pn subtree covers the range [n,n+1) of real numbers. Each Pn subtree permutes all possible combinations of the infinite paths. Each Pn subtree fits snugly against its siblings, so that together the children cover the range [0, infinity).

Now we need to show that T encodes all possible representations for all possible real values. We can start by noting that any infinitesimally small values are contained in P0, located somewhere on the left-hand side of the branches of P0, such as example b) from above.

Any large irrational value, such as Pi^e can be found thus:

Pi^e = 22.4591577..., which starts at the child P22, the 23rd node of T, and works its way down to infinity.

Because there are an infinite number of Pn subtrees, all extremely huge reals are contained in a subtree somewhere. Pi starts under P3, e in P2, etc.

So the structure contains all of the large numbers, the small ones, and every possible infinite variation of the digits on both sides of the decimal place. Thus the tree T contains an infinite representation for every possible value in R.
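Locating a value in T is mechanical: the integer part picks the subtree Pn, and the fractional digits pick the path below it. A small sketch (the `locate` helper is my own naming, not part of the structure's definition):

```python
# Find where a non-negative real lives in the tree T: it sits in the
# subtree P_n with n = floor(x), then descends by the decimal digits
# of the fractional part.
import math

def locate(x, n_digits):
    n = int(x)                 # which child subtree P_n of T
    frac = x - n
    digits = []
    for _ in range(n_digits):
        frac *= 10
        d = int(frac)
        digits.append(d)
        frac -= d
    return n, digits

# Pi^e = 22.4591577..., so it starts under P22 and descends 4, 5, 9, 1, ...
print(locate(math.pi ** math.e, 4))   # (22, [4, 5, 9, 1])
```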

THE SETS

By definition, any irrational or real value can be used in a set. That is we can define a finite set

A = { sqrt(2), e, Pi }

and although the actual values of each of these elements are infinitely long non-repeating values, they are completely included in the set, and the set itself is countable.

So, we can define a finite set from our 8 example numbers in P0:

B = { 0, 0.0001101, 1/3, 1/6.6, 0.999999..., sqrt(2)/10, e/10, Pi/10 }

This set is countable.

We can extend this method of defining a set to handle an infinite number of elements. Thus we can define a set P0 from the above subtree P0 as follows:

The set P0 contains exactly one element for each and every infinite path in the subtree P0. Since order doesn't matter in a set, we can express this set as follows:

P0 = { 0, 1/3, Pi/10, sqrt(2)/10, e/10, r1, r2, r3, ... }

Where r1, r2, r3, ... are all of the other reals matching the infinite number of different paths in the subtree P0. Based on the elementary properties of sets we can say that:

- The set P0 contains irrationals such as Pi/10.

- The set P0 contains rationals such as 0 and 1/3.

- The set is infinite.

We can show that:

- The set P0 is countable.

- The set P0 is countably infinite.

because we can trivially construct a function f: P0 -> N that is an injection. If we have an injection f on an infinite set, by the basics of set theory we can always find a bijection.

From the construction and because the set is countably infinite we can say that:

- The set P0 contains all of the infinite paths that exist in the subtree P0.

As well as P0, we can define the sets P1, P2, P3, ... to have the exact same properties as the set P0. That is they have unordered elements such as:

P1 = { 1, 4/3, r1_1, r1_2, r1_3, ... }

P2 = { 2, 7/3, r2_1, r2_2, r2_3, ... }

We can now define a set T, such that it is the union of all of the elements of the sets P0, P1, P2, ... Because the order of sets is unimportant, we can list the elements in the set T by the following method:

The first element in the set T is the 1st element of P0.

The second element in the set T is the 2nd element of P0.

The third element in the set T is the 1st element of P1.

The fourth element in the set T is the 3rd element of P0.

The fifth element in the set T is the 2nd element of P1.

The sixth element in the set T is the 1st element of P2.

....

This enumeration shows that there is an injective function between the elements of T and N.
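The listing above is the familiar diagonal walk over a grid whose rows are the sets Pn. It can be sketched as a generator, where each `(n, k)` pair stands for "the k-th element of Pn":

```python
# Enumerate the union of P0, P1, P2, ... by diagonals, so that every
# (subtree, position) pair is reached after finitely many steps.
from itertools import count, islice

def zigzag():
    """Yield (n, k), meaning "the k-th element of Pn", in the
    order described in the text."""
    for d in count(1):
        for n in range(d):
            yield (n, d - n)

print(list(islice(zigzag(), 6)))
# [(0, 1), (0, 2), (1, 1), (0, 3), (1, 2), (2, 1)]
```

The first six pairs match the listing: 1st of P0, 2nd of P0, 1st of P1, 3rd of P0, 2nd of P1, 1st of P2.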

The enumeration gives us the set:

T = { 0, 1/3, 1, Pi/10, 4/3, 2, ... }

Because this set is constructed from the union of countably infinite sets, and because we can trivially construct an f: T -> N, this set too is countably infinite.

Please note that although we defined the sets from the corresponding data structures, the data structures themselves are not used in the basis of this argument, other than as a syntactic way to enumerate all of the elements for the definitions of the sets.

I say this because many counter-arguments will attempt to show that the subtree P0 is uncountable. Whether it is or is not doesn't matter, because within the rules of set theory we can define the sets Pn to have an infinite number of elements, and we only use the tree as a method of enumerating these elements in a descriptive way for the definition.

At the very least, we could use an intervening sequence constructed by descending the tree, one infinite path at a time, in some fixed order. As an infinite sequence, it could contain all of the infinitely long paths. We can always construct such a sequence from the infinite number of infinite paths contained within the tree, because otherwise Cantor's diagonal argument would never be able to test such an arrangement of numbers, and that would be an inconsistency in set theory. It has been shown multiple times that Cantor's argument can test any possible arrangement of real numbers; because of that, constructing such a sequence must be possible.

THE TEST

So, now we have a set T, which we believe contains all of the possible real values. Of course we must test this with Cantor's diagonal argument. We do this by constructing a sequence from our set T, and applying the diagonal argument to that sequence.
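For concreteness, here is the diagonal construction itself, sketched on a finite prefix of such a sequence (the digit-flipping rule, 5 unless the digit was already 5, is one common choice):

```python
# Cantor's diagonal construction on a finite prefix: build a digit
# string that differs from the n-th listed expansion at digit n.

def diagonal(rows):
    """rows: a list of digit lists (decimal expansions in [0,1)).
    Returns digits differing from rows[n] at position n."""
    return [5 if row[i] != 5 else 6 for i, row in enumerate(rows)]

rows = [
    [0, 0, 0, 0],   # 0.0000...
    [3, 3, 3, 3],   # 0.3333...
    [1, 4, 1, 4],   # 0.1414...
    [9, 9, 9, 9],   # 0.9999...
]
print(diagonal(rows))   # [5, 5, 5, 5], which differs from every row
```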

Now, there are four possible outcomes from Cantor's argument:

1. It constructs an element E, found in T.

2. It constructs an element E, not found in T.

3. It never halts.

4. It constructs an element E, but not from all of the elements in T.

#3 and #4 can be shown to be trivially not possible. That is, they rely on there being structures in set theory that are larger than just sequences (aka magic sequences), but over the last few weeks there has been more than ample proof that such structures do not, and cannot, exist in set theory, and that the first principles of set theory itself keep them from existing. Thus Cantor's diagonal argument can, by definition, always construct an element E from all of the elements in the sequence T.

#2 is the normal expected behavior for Cantor's. That is, we expect to go through the construction and be able to create an element E that is a real number, but that is not contained in the set T. Since T is countably infinite, there is definitely some enumeration to which we can apply the construction. Now, since T contains all of the elements from the sets P0, P1, P2, ..., and these in turn contain all of the possible elements from their data-structure counterparts, the only way for any elements to get missed is for them to have dropped out when we went from the data structures to the sets and then to the sequence. The data structure self-evidently contains all possible permutations of all real numbers, so either Cantor's is wrong, set theory is inconsistent, or something got lost along the way (i.e. R itself is not able to fit into set theory, so it's not even uncountable; it is just outside looking in).

#1 would confirm that T, which is R, is countable. Given its underlying status, I think this may actually show a contradiction as well, because T is countably infinite and R is uncountable, yet they are isomorphic to each other.

My guess would be that #1 is the likely result, in that we could take the element E that has been constructed, and look it up in the tree T, in some subtree Pn. We will be able to find it there, and thus show that there are no missing elements.

Given everyone's faith in set theory, there is likely some problem with the construction of this argument.

## Wednesday, December 30, 2009

## Thursday, December 24, 2009

### Cantor's Lost Surjection (Revised)

Revision 2.0: Dec 28, 2009 -- I've made the changes suggested by Chad Groft, and boris in their excellent comments.

Before you go any further, PLEASE read this introductory paragraph fully and carefully. Over the last couple of weeks, I've been writing several posts about a very subtle problem in ZFC set theory, whose results wind through a great deal of the existing work. I may be right, I may be wrong, but I have been very careful to try to stay within the first principles of set theory, in order to shed some illumination onto the issue. The issue is not as simple as "I am a crank", or "Cantor's proves..." or any other quick knee-jerk reaction to what I am saying. You are welcome to comment, to express your opinions, BUT only if you've read what I've written carefully, and with an open and objective mind. Only if you're not succumbing to your emotions. Mathematics is about careful, objective thought. We must never lose sight of that.

This post is NOT about:

- Reals being (un)countable

- missing irrationals in a bi-infinite spreadsheet

- missing irrational seeds in an infinite number of bi-infinite spreadsheets

- magic sequences

First, I want to talk very explicitly about a wikipedia page:

http://en.wikipedia.org/wiki/Countable_set

In particular I want to closely examine the "Definition" section. This part of the document contains exactly 13 lines (although if your window isn't wide enough some of the lines wrap). In the first four lines, we have the following two definitions:

D1: Line 1 -- a set is "countable" if an injective function exists to N

D2: Line 4 -- a set is "countably infinite" if a bijective function exists to N

Before we go any further: A "formal system", as defined in GEB by Douglas Hofstadter, contains a finite number of axioms, an infinite number of theorems, and a finite number of rules of production. A number of people have also included "definitions" as separate and distinct pieces from the basic axioms in the system; in particular, ZFC set theory differentiates between its base definitions (such as countable) and its axioms (such as the axiom of choice).

All theorems are built on top of the many self-evident pieces below them, and on other theorems. Everything that isn't self-evident or independent is a theorem. Everything that stands on its own is an axiom or a definition.

In this way, definitions and axioms act as the basic building blocks on which all of the theorems are based. The different pieces may be interspersed with each other, but some are primitive building blocks (axioms and definitions), while the others are composite building blocks (theorems).

Rigor means the entire construction is based on a finite number of solid, self-evident pieces. Neither the axioms, the definitions, nor the theorems should contradict each other, or be inconsistent. If that happens, there is a problem with the formal system.

Lines 1 and 4 basically specify two definitions. Together we can make the following statements about the way terminology is defined for set theory:

For all sets within set theory:

a1: "countable" is not always equal to "countably infinite"

For all functions within set theory:

a2: bijective is both injective and surjective

Getting back to the wikipedia page, in the next part:

T1: line 7 -- starts a "theorem"

This is then followed by three statements, each of which is deemed equivalent to the others. They are:

T1-1: S is countable (injective) (f: S -> N)

T1-2: Either S is empty OR there exists a surjection (g: N -> S)

T1-3: Either S is finite OR there exists a bijection (h: N -> S)

The lines T1-1 and T1-3 basically say that for any infinite set, if it is countable, then there exists a bijection between the natural numbers and it. So that for any infinite set:

S = { a, b, c, ... }

there is a bijection between it and N.

PLEASE keep in mind that according to the page itself, T1 isn't an axiom or a definition, it is a theorem. The proof for the theorem isn't given in the page, it says it comes from a textbook.

By T1, the following set:

A = { a1, a2, a3, ..., b1, b2, b3, ... }

is claimed to have a bijection between it and N. The idea is that the above can be rewritten as:

B = { a1, b1, a2, b2, a3, b3, ... }

And because this is countable, there is a bijection between this set and N. If we map these two onto each other we get assignments as follows:

1 <- a1, 2 <- b1, 3 <- a2, 4 <- b2, 5 <- a3, 6 <- b3, ...

That is, it takes the first 6 elements in N to map to the first three elements for each of the ak and bk subsets of elements. This mapping is considered to be a bijective function.
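That interleaving is easy to make concrete. A minimal sketch:

```python
# Merge two infinite sequences into one, so position 2k-1 gets a_k
# and position 2k gets b_k -- the mapping described above.
from itertools import count, islice

def interleave(a, b):
    for x, y in zip(a, b):
        yield x
        yield y

a = (f"a{k}" for k in count(1))   # a1, a2, a3, ...
b = (f"b{k}" for k in count(1))   # b1, b2, b3, ...
print(list(islice(interleave(a, b), 6)))
# ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```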

However, if we have the finite set:

D = { a1, a2, a3, b1, b2, b3 }

and another

E = { 1, 2, 3 }

Since E has fewer elements than D, we can only create an injective function from E to D. The size difference means that no bijective function can exist. We only get something like:

1 <- a1, 2 <- a2, 3 <- a3

In this sense, it is accepted that in set theory, infinite sets that are countable can have a bijective function between them, even if one set appears to be somewhat "larger", with respect to infinity, than the other one. However, this does not apply to finite sets.

As a comment on my last post, delosgatos said:

"One of the cool and weird things you first encounter with infinite sets is that infinite sets are bijective with their own proper subsets."

Actually, it is far more than that. There is basically a bijective function with "any" other infinite set if that set is considered countable. If you can find an injective function, then you can claim a bijective one based on T1.

In this way set theory essentially splits all (non-empty) sets into three groups:

a) finite

b) countably infinite

c) uncountable

Finite sets differ based on size, and thus there are an infinitely large number of different possibilities. By definition there are also an infinite number of uncountable sets.

But for countably infinite sets, since there exists a bijective function between any pair of sets, they are all essentially isomorphic to the same underlying set N. A bijection can always be found.

This behavior of the system is very different for sets that are countably infinite than it is for finite sets.

As a consequence, the "definitions" of injective and surjective become watered down for countably infinite sets. They can still exist as function types, in that you can create an injective function, but by definition, you can also always create a bijective one between the same two sets.

Rephrased as I did earlier, we get the following relationships in the terminology:

For all countably infinite sets within set theory:

b1: "countable" is always equal to "countably infinite"

For all functions on countably infinite sets within set theory:

b2: a bijective function exists whenever there exists a function that is injective and surjective OR just injective

Under the theorem T1, as an example, we can find an f: N -> NxN that is considered to be a bijection, for exactly this reason. Where the set NxN is:

{ (0,0), (0,1), (1,0), (0,2), (1,2), (2,2), (2,1), ... }

Which is interesting, given that if NxN were finite, it would contain n^2 elements where a finite N contains only n, if both were limited to the base elements (like 1, 2 and 3). If f were between finite sets, then there could be no bijection, but because these are infinite sets, that is ignored.
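An enumeration of this kind is easy to generate. Here is the standard diagonal walk (its visiting order differs slightly from the back-and-forth listing above, but it covers the same pairs):

```python
# Enumerate N x N diagonal by diagonal: every pair (i, j) appears
# after finitely many steps, which is what the claimed f relies on.
from itertools import count, islice

def pairs():
    for d in count(0):
        for i in range(d + 1):
            yield (i, d - i)

print(list(islice(pairs(), 6)))
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```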

Thus f is widely believed to be bijective, and its underlying sets are considered countable, and countably infinite.

BUT, the above statement is only true because of the "theorem" T1, not because of the two basic definitions D1 and D2. By themselves the definitions are not strong enough to say that.

Just to be sure I figured I'd check a source other than wikipedia. The following:

http://plato.stanford.edu/entries/set-theory/primer.html

is a well-written introduction to set theory. It uses the term "one-to-one" often, but it is careful to define this in section 3 as being "injective". It avoids the word bijection.

Read through section 7, very carefully. It starts by defining countable as having the same cardinality as N, but then it says in the comments that the definition of "countable" is actually injective. Also it defines "at most countable" as being finite or countable (wikipedia warned of this).

Throughout the various theorems, "countable" keeps falling back on being one-to-one (injective). The authors carefully avoid using "bijective" throughout the entire document.

Later in section 9, they do something really weird, they say:

"Naturally, a question arises whether perhaps all infinite sets are countable."

But then they use Cantor's diagonal argument to show that this can't be true.

Still, note that throughout this entire reference, the authors never once go back to the idea that "countable" is "bijective" for infinite sets. Countable stays as a definition which is injective.

So the wikipedia page probably isn't wrong. It accurately describes how the definition starts with countable, and because of a theorem gets to bijection for all infinite sets that are deemed to not be "uncountable".

"That's just the way ZFC set theory is ..."

Well, maybe that is the way it is taught, but the "definitions" initially state that it is not the case. The wikipedia page clearly casts all of the blame on the unnamed theorem T1. A theorem whose proof is causing inconsistencies within set theory.

For finite and uncountable sets, there can be different sets within those groups where the relationship between the sets in terms of functions is "only-injective", or "only-surjective". That is, no bijection exists or can exist. They are clearly and consistently defined.

For all of the infinite sets in the middle (between finite and uncountable), there is always a bijection between any two sets, no matter what. That is a fundamental inconsistency in the way the formal system treats its different mathematical objects.

It starts with pretty clear and simple definitions about how there is a difference between injective and bijective functions, but for one huge segment of the sets, those differences become negligible. Meaningless.

Just for countably infinite sets, the theorem T1 essentially "hides" any differences in the size of the sets. It is just one big giant special case, one that does not match the behavior of the rest of set theory, or the behavior set forth by the definitions and the axioms.

And what's going on with counting? The wikipedia page starts with a nice intuitive definition. "We can count things, one at a time, ...". But because of the T1 theorem the terminology and usage become extremely complex. Counting the elements of finite sets, is radically different than counting the elements of infinite sets.

Why?

According to the wikipedia page, the theorem T1 is proven in Lang's text. I don't have access to that proof, but from the other text I included, we can guess that the problem is probably Cantor's diagonal argument. Everything is fine in ZFC set theory when the text is talking about "countable" as just being "injective". It's a pretty simple definition.

The existence of theorem T1, however, causes a strange ripple throughout the system that effectively treats all countably infinite sets in a special and strange way, inconsistent with the basic foundations of set theory.

That's one potential problem, but I've already shown that there are more problems lurking about. The magic sequences, for example. I provided an example of one, but because of T1, everybody believes that an infinite set that is countable automatically has a bijection to N. This means that the example can just be "bijected" away. The set:

C = { a1, a2, ..., b1, b2, ..., ... }

where the third "..." means creating elements such as c1, c2, ..., has two very different uses of "...", each of which head off to a different type of infinity.

With T1, numerous commenters have shown that this is now countably infinite, and has a bijection to N.

Thus we can pound this thing down into a sequence (even though it doesn't actually fit). This probably shows up another inconsistency because at very least I would expect C to be classed with R as being "uncountable", but it seems to fall the other way, as countably infinite.

So we have three problems now, all acting as a perfect storm with each other:

- Cantor's diagonal argument relies solely on regular sequences.

- magic sequences are believed to not exist.

- T1 claims a bijection between 'all' countably infinite sets.

The net result is that T1 declares that where there is any mapping that is injective there is automatically a bijective one too, whether or not it really is. This basically undoes the meaning of injective and surjective for all functions on infinite sets (except those that are proven to be uncountable).

It is this T1 theorem which starts a chain of problems within set theory (although I suspect that the diagonal argument is still the originator of all of this).

Ultimately, what happens is that given any possible structure for reals that could fit into set theory, forcing it to be bijective with T1 incorrectly stuffs it into a sequence. No sequence can get past Cantor's argument. And T1 reduces anything larger than a sequence to just being a sequence. Thus they all work together to ensure that set theory can't be applied to complex structures.

Thus any potential structure to hold R, no matter if it is correct or not, cannot be examined objectively. They all fail regardless of what they hold, or how they hold it.

To some, this is the essence of the proof that reals are uncountable: that there is no way to get around these behaviors. To me this appears to be circular logic, with each flaw protecting the others from detection.

But if we remove T1, and use only D1 and D2 instead, then countable and countably infinite don't blur (and we can ignore countably infinite). Then the rationals are only countable, and any structure with multiple infinities isn't incorrectly flattened into a single infinity. Then magic sequences really do exist and they get past Cantor's and then there is actually a chance (although not definite) that R is in fact countable (injective).

Removing T1 starts a landslide in set theory. And given that parts of it are inconsistent with its own initial definitions, it is a good thing to question this.

PLEASE, before you rush off and comment, accusing me of missing something elementary, please reread this post very carefully. Over the last few weeks, I have had an excellent stream of comments from which to learn. In writing this, I took many of them into account, and tried to make sure that I didn't pass over anything obvious. I reread my references, and tried to account for all of the details. I do really appreciate comments, but I hope that people can stay constructive, and really dig into the substance of the problem. The problem is subtle, deep and has wedged its way past a great number of people.

Before you go any further, PLEASE read this introductory paragraph fully and carefully. Over the last couple of weeks, I've been writing several posts about a very subtle problem in ZFC set theory, whose results wind through a great deal of the existing work. I may be right, I may be wrong, but I have been very careful to try to stay within the first-principles of set theory, in order to shed some illumination onto the issue. The issue is not as simple as "I am a crank", or "Cantor's proves..." or any other quick knee-jerk reaction to what I am saying. You are welcome to comment, to express your opinions, BUT only if you're read what I've written carefully, and with an open and objective mind. Only if you're not succumbing to your emotions. Mathematics is about careful, objective thought. We must never lose sight of that.

This post is NOT about:

- Reals being (un)countable

- missing irrationals in a bi-infinite spreadsheet

- missing irrational seeds in an infinite number of bi-infinite spreadsheets

- magic sequences

- missing irrationals in a bi-infinite spreadsheet

- missing irrational seeds in an infinite number of bi-infinite spreadsheets

- magic sequences

First, I want to talk very explicitly about a wikipedia page:

http://en.wikipedia.org/wiki/Countable_set

In particular I want to closely examine the "Definition" section. This part of the document contains exactly 13 lines (although if your window isn't wide enough some of the lines wrap). In the first four lines, we have the following two definitions:

D1: Line 1 -- a set is "countable" if an injective function exists to N

D2: Line 4 -- a set is "countably infinite" if a bijective function exists to N

Before we go any further: A "formal system", as defined in GEB by Douglas Hofstadter, contains a finite number of axioms, an infinite number of theorems, and a finite number of rules of production. A number of people have also included "definitions" as being separate and distinct pieces from the basic axioms in the system, in particular ZFC set theory differentiates between its base definitions (such as countable) and it's axioms (such as the axiom of choice).

All theorems are built on-top of the many self-evident pieces below, and other theorems. Everything that isn't self-evident or independent is a theorem. Everything that stands on its own is an axiom or a definition.

In this way, definitions and axioms act as the basic building blocks on which all of the theorems are based. The different pieces may be interspersed with each other, but some are primitive building blocks (axioms and definitions), while the others are composite building blocks (theorems).

Rigor means the entire construction is based on a finite number of solid, self-evident pieces. Neither the axioms, the definitions, nor the theorems should contradiction each other, or be inconsistent. If that happens, there is a problem with the formal system.

Lines 1, and 4 basically specify two definitions. Together we can make the following statements about the way terminology is defined for set theory:

For all sets within set theory:

a1: "countable" is not always equal to "countably infinite"

For all functions within set theory:

a2: bijective is both injective and surjective

Getting back to the wikipedia page, in the next part:

T1: line 7 -- starts a "theorem"

This is then followed by three statements, each of which are all deemed to be equal to each other. They are:

T1-1: S is countable (injective) (f: S -> N)

T1-2: Either S is empty OR there exists a surjection (g: N -> S)

T1-3: Either S is finite OR there exists a bijection (h: N -> S)

The lines T1-1 and T1-3 basically say that for any infinite set, if it is countable, then there exists a bijection between the naturals numbers and it. So that for any infinite set:

S = { a, b, c, ... }

there is a bijection between N and itself.

PLEASE keep in mind that according to the page itself, T1 isn't an axiom or a definition, it is a theorem. The proof for the theorem isn't given in the page, it says it comes from a textbook.

By T1, the following set:

A = { a1, a2, a3, ..., b1, b2, b3, .... }

is claimed to have a bijection between it and N. The idea is that the above can be rewritten as:

B = { a1, b1, a2, b2, a3, b3, ... }

And because this is countable, there is a bijection between this set and N. If we map these two onto each other we get assignments as follows:

1 <- a1, 2 <- b1, 3 <- a2, 4 <- b2, 5 <- a3, 6 <- b3, ...

That is, it takes the first 6 elements in N to map to the first three elements for each of the ak and bk subsets of elements. This mapping is considered to be a bijective function.

However, if we have the finite set:

D = { a1, a2, a3, b1, b2, b3 }

and another

E = { 1, 2, 3 }

Since E has less elements than D, we can only create an injective function from E to D. The size difference means that no bijective function can exist. We only get something like:

1 <- a1, 2 <- a2, 3 <- a3

In this sense, it is accepted that in set theory, infinite sets that are countable can have a bijective function between them, even if one set appears to be somewhat "larger", with respect to infinity, then the other one. However, this does not apply to finite sets.

As a comment on my last post, delosgatos said:

"One of the cool and weird things you first encounter with infinite sets is that infinite sets are bijective with their own proper subsets."

Actually, it is far more than that. There is basically a bijective function between it and "any" other infinite set, if that set is considered countable. If you can find an injective function, then you can claim a bijective one based on T1.

In this way set theory essentially splits all (non-empty) sets into three groups:

a) finite

b) countably infinite

c) uncountable

Finite sets differ based on size, and thus there are an infinitely large number of different possibilities. By definition there are also an infinite number of uncountable sets.

But for countably infinite sets, since there exists a bijective function between any pair of sets, they are all essentially isomorphic to the same underlying set N. A bijection can always be found.

This behavior of the system is very different for countably infinite sets than it is for finite sets.

As a consequence, the "definitions" of injective and surjective become watered down for countably infinite sets. They can still exist as function types, in that you can create an injective function, but by definition, you can also always create a bijective one between the same two sets.

Rephrased as I did earlier, we get the following relationships in the terminology:

For all countably infinite sets within set theory:

b1: "countable" is always equal to "countably infinite"

For all functions on countably infinite sets within set theory:

b2: a bijective function exists whenever there exists a function that is injective and surjective OR just injective

Under the theorem T1, as an example, we can find an f: N -> NxN that is considered to be a bijection, for exactly this reason, where the set NxN is:

{ (0,0), (0,1), (1,0), (0,2), (1,2), (2,2), (2,1), ... }

Which is interesting, given that if NxN were finite, it would contain n^2 elements versus the n elements of a finite N, if both were built from the same base elements (like 1, 2 and 3). If f were between finite sets, then there could be no bijection, but because these are infinite sets, that is ignored.

Thus f is widely believed to be bijective, and its underlying sets are considered countable, and countably infinite.
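For reference, the standard enumeration behind such an f: N -> NxN can be sketched by walking anti-diagonals (this is the textbook zig-zag order, which differs slightly from the listing above; the function name is mine):

```python
from itertools import islice

def nxn_pairs():
    """Enumerate N x N by walking the anti-diagonals i + j = d
    for d = 0, 1, 2, ...; every pair appears at a finite position."""
    d = 0
    while True:
        for i in range(d + 1):
            yield (i, d - i)
        d += 1

print(list(islice(nxn_pairs(), 6)))
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```

Any fixed pair (i, j) shows up at a computable finite index, which is exactly the property the theorem leans on.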

BUT, the above statement is only true because of the "theorem" T1, not because of the two basic definitions D1 and D2. By themselves the definitions are not strong enough to say that.

Just to be sure I figured I'd check a source other than wikipedia. The following:

http://plato.stanford.edu/entries/set-theory/primer.html

is a well-written introduction to set theory. It uses the term "one-to-one" often, but it is careful to define this in section 3 as being "injective". It avoids the word bijection.

Read through section 7 very carefully. It starts by defining countable as having the same cardinality as N, but then it says in the comments that the definition of "countable" is actually injective. It also defines "at most countable" as being finite or countable (wikipedia warned of this).

Throughout the various theorems, "countable" keeps falling back on being one-to-one (injective). The authors carefully avoid using "bijective" throughout the entire document.

Later in section 9, they do something really weird, they say:

"Naturally, a question arises whether perhaps all infinite sets are countable."

But then they use Cantor's diagonal argument to show that this can't be true.

Still, note that throughout this entire reference, the authors never once go back to the idea that "countable" is "bijective" for infinite sets. Countable stays as a definition which is injective.

So the wikipedia page probably isn't wrong. It accurately describes how the definition starts with countable, and because of a theorem gets to bijection for all infinite sets that are deemed to not be "uncountable".

"That's just the way ZFC set theory is ..."

Well, maybe this is the way it is taught, but the "definitions" initially state that it is not the case. The wikipedia page clearly casts all of the blame on the unnamed theorem T1. A theorem whose proof is causing inconsistencies within set theory.

For finite and uncountable sets, there can be different sets within those groups where the relationship between the sets in terms of functions is "only-injective" or "only-surjective". That is, no bijection exists or can exist. They are clearly and consistently defined.

For all of the infinite sets in the middle (between finite and uncountable), there is always a bijection between any two sets, no matter what. That is a fundamental inconsistency in the way the formal system treats its different mathematical objects.

It starts with pretty clear and simple definitions about how there is a difference between injective and bijective functions, but for one huge segment of the sets, those differences become negligible. Meaningless.

Just for countably infinite sets, the theorem T1 essentially "hides" any differences in the size of the sets. It is just one big giant special case that does not match the behavior of the rest of set theory, the behavior set forth by the definitions and the axioms.

And what's going on with counting? The wikipedia page starts with a nice intuitive definition. "We can count things, one at a time, ...". But because of the T1 theorem the terminology and usage become extremely complex. Counting the elements of finite sets is radically different from counting the elements of infinite sets.

Why?

According to the wikipedia page, the theorem T1 is proven in Lang's text. I don't have access to that proof, but from the other text I included, we can guess that the problem is probably Cantor's diagonal argument. Everything is fine in ZFC set theory when the text is talking about "countable" as just being "injective". It's a pretty simple definition.

The existence of theorem T1, however, causes a strange ripple throughout the system that effectively treats all countably infinite sets in a special and strange way, inconsistent with the basic foundations of set theory.

That's one potential problem, but I've already shown that there are more problems lurking about. The magic sequences, for example. I provided an example of one, but because of T1, everybody believes that an infinite set that is countable automatically has a bijection to N. This means that the example can just be "bijected" away. The set:

C = { a1, a2, ..., b1, b2, ..., ... }

where the third "..." means creating elements such as c1, c2, ..., has two very different uses of "...", each of which heads off to a different type of infinity.

With T1, numerous commenters have shown that this is now countably infinite, and has a bijection to N.

Thus we can pound this thing down into a sequence (even though it doesn't actually fit). This probably points to another inconsistency, because at the very least I would expect C to be classed with R as "uncountable", but it seems to fall the other way, as countably infinite.

So we have three problems now, all acting as a perfect storm with each other:

- Cantor's diagonal argument relies solely on regular sequences.

- magic sequences are believed to not exist.

- T1 claims a bijection between 'all' countably infinite sets.

The net result is that T1 declares that wherever there is a mapping that is injective, there is automatically a bijective one too, whether or not one really exists. This basically undoes the meaning of injective and surjective for all functions on infinite sets (except those that are proven to be uncountable).

It is this T1 theorem which starts a chain of problems within set theory (although I suspect that the diagonal argument is still the originator of all of this).

Ultimately, what happens is that given any possible structure for reals that could fit into set theory, forcing it to be bijective with T1 incorrectly stuffs it into a sequence. No sequence can get past Cantor's argument. And T1 reduces anything larger than a sequence to just being a sequence. Thus they all work together to ensure that set theory can't be applied to complex structures.

Thus any potential structure to hold R, no matter if it is correct or not, cannot be examined objectively. They all fail regardless of what they hold, or how they hold it.

To some, this is the essence of the proof that the reals are uncountable: that there is no way to get around these behaviors. To me this appears to be circular logic, with the flaws protecting each other from detection.

But if we remove T1, and use only D1 and D2 instead, then countable and countably infinite don't blur (and we can ignore countably infinite). Then the rationals are only countable, and any structure with multiple infinities isn't incorrectly flattened into a single infinity. Then magic sequences really do exist and they get past Cantor's and then there is actually a chance (although not definite) that R is in fact countable (injective).

Removing T1 starts a landslide in set theory. And given that parts of it are inconsistent with its own initial definitions, it is a good thing to question this.

PLEASE, before you rush off and comment, accusing me of missing something elementary, please reread this post very carefully. Over the last few weeks, I have had an excellent stream of comments from which to learn. In writing this, I took many of them into account, and tried to make sure that I didn't pass over anything obvious. I reread my references, and tried to account for all of the details. I do really appreciate comments, but I hope that people can stay constructive, and really dig into the substance of the problem. The problem is subtle, deep and has wedged its way past a great number of people.

## Saturday, December 19, 2009

### Crank it Up

What I am about to say, if true, will really surprise a lot of people. We all have a tendency to believe that once something is locked in stone, then by virtue of that, it is unshakable. Well, that's mostly true, but mankind has a vast amount of learning left to do...

Picture a spreadsheet, but it's no ordinary spreadsheet. To the left and to the right it extends out infinitely. Up and down it also extends out infinitely.

In each cell we place a real number. We start with one in the center and generate the rest of the spreadsheet.

Now, we start in the middle, and we make some program that increments each cell to the right by:

.1, .111..., .10101..., .1001001..., .100010001..., ...

and to the left by

..., 10001, 1001, 101, 11, 1

and we leave the center at 0.

Going upwards we shift the number to the left by 1 decimal place (divide by 10), and going downwards we shift it right by one digit (multiply by 10).
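The repeating increments above can be written down as exact rationals (.111... = 1/9, .10101... = 10/99, .1001001... = 100/999, and 1, 11, 101, 1001, ... on the left). A minimal sketch of the generation rules as I read them; the function names are my own, not from the post:

```python
from fractions import Fraction

def right_increment(k):
    """k-th increment moving right: .1, .111..., .10101..., .1001001..., ...
    expressed exactly (.111... = 1/9, .10101... = 10/99, and so on)."""
    if k == 1:
        return Fraction(1, 10)
    return Fraction(10 ** (k - 2), 10 ** (k - 1) - 1)

def left_increment(k):
    """k-th increment moving left: 1, 11, 101, 1001, ..."""
    return 1 if k == 1 else 10 ** (k - 1) + 1

# Moving up divides a cell by 10; moving down multiplies it by 10.
print(right_increment(2))  # 1/9, i.e. 0.111...
print(left_increment(3))   # 101
```

Keeping the increments as exact fractions avoids any floating-point noise when experimenting with the construction.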

Now, we place an irrational in the center, let's say Pi, and we do the generation:

...

... 141.4159... 41.4159... 31.4159... 32.4159... 32.5270... ...

... 14.1415... 4.1415... 3.1415... 3.2415... 3.2527... ...

... 1.4141... 0.4141... 0.3141... 0.3241... 0.3252... ...

...

What exactly are we looking at?

This is a spreadsheet that, when stretched out to infinity in the left, right, up and down directions, will generate irrationals. Lots of them. Possibly all of them.

Since we can circle around the spreadsheet, starting in the middle and making gradually larger and larger loops, we can map each and every cell onto N. This means that our spreadsheet is countable, and thus if this is all of the irrationals, then the irrationals themselves are countable. This would also mean that R itself, which is made up of the countable rationals and the now countable irrationals, is countable.
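The circling enumeration can be sketched as a spiral walk over cell coordinates; a minimal sketch (the coordinate scheme and helper name are mine, added for illustration):

```python
def spiral_cells(n):
    """Visit the first n cells of a grid unbounded in all four
    directions, starting at the centre (0, 0) and walking outward in
    ever-larger loops. Each cell is reached after finitely many steps,
    which is what gives the map from cells onto N."""
    cells = [(0, 0)]
    if n <= 1:
        return cells[:n]
    x = y = 0
    dx, dy = 1, 0
    steps = 1
    while True:
        for _ in range(2):              # two legs per loop size
            for _ in range(steps):
                x, y = x + dx, y + dy
                cells.append((x, y))
                if len(cells) == n:
                    return cells
            dx, dy = -dy, dx            # turn 90 degrees
        steps += 1

print(spiral_cells(5))  # [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1)]
```

The cell at position k of this list is the cell mapped to the natural number k, with no cell ever visited twice.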

How do we know this works? One interesting way is to start at Pi and generate a table large enough to get to sqrt(2). If this works, then sqrt(2) will be in the table, and we can navigate to it.

In fact, all other known irrationals are also in the table.

Why?

There are lots of reasons, which I have been circling around for a couple of weeks now. Let's start with some basic premises:

- All irrationals are a non-repeating path that is infinitely long.

- If you add a repeating (or finite) path to a non-repeating one, the result is a non-repeating one.

- If there is a path between any two irrationals, and it is countable, then the path is finite.

- Irrationals in base 10 are just a representation of the underlying number.

- the +1 or +.1 pattern permutes the number but not the distance between the permutations, so we need to space the changes apart from each other, thus +11, +.111..., +101, +.10101..., etc.

- the infinitely long permutations permute the tail.

- we have an infinitely long number, so we need infinitely long permutations (this was the most painful part to figure out).

- the shift operators, left and right, accomplish the goal of moving the permutations into infinity.

- the table itself generates all of the permutations and combinations, stretching out to infinity in four different directions.

We can start with any irrational, and get to all of the others.

But there is more:

The bi-infinite spreadsheet (because normally a spreadsheet only goes to infinity in two directions, rows and columns) is actually a binary infinite node tree, as I discussed in an earlier post.

Iterating the tree or the spreadsheet produces a magic sequence, as my other post also showed.

We can see that the spreadsheet is trivially countable by just starting in the center and circling around.

Still, Cantor's diagonalization doesn't apply to this enumeration, precisely because the two different sets of infinities make the result a magic sequence.

And just a little bit more:

In one of my comments in Good Math, Bad Math I talked about how I saw infinities. Pretty much, that perspective and what I wrote this morning in my own comments on the Magic Sequence post, confirm that my understanding is pretty solid.

So, another set of related results:

All binary trees are countable, whether or not they contain irrationals.
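The node-counting side of this claim is standard and easy to sketch: a level-order walk reaches every node of the infinite binary tree at a finite index. (Note that this enumerates the nodes; whether the set of infinite paths is countable is the separate question under dispute.) A sketch, with the encoding being my own choice:

```python
from itertools import count, islice

def tree_nodes():
    """Level-order enumeration of the infinite binary tree's nodes.
    Node n's path from the root is n written in binary with the
    leading 1 dropped (0 = left step, 1 = right step)."""
    for n in count(1):
        path = bin(n)[3:].replace("0", "L").replace("1", "R")
        yield path or "root"

print(list(islice(tree_nodes(), 7)))
# ['root', 'L', 'R', 'LL', 'LR', 'RL', 'RR']
```

Every node, however deep, corresponds to exactly one natural number under this encoding.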

A "possible" result is that there is no such this as uncounability. Everything in a formal system is countable.

There are at least two categories of infinity: calculus infinities and set theory ones. They are different, but relatable.

There are two different types of calculus infinities, the type that is "going" to infinity, and the type that is already "there". The first type is "dynamic", while the second type is "static".

As set theory infinities, this is static:

3.1415...

And this is dynamic:

{ 1, 2, ... }

Infinities are pretty much interchangeable, but they do come in axes. Thus we can have 1D, 2D and 3D infinities, and probably more beyond that.

Axes are static or dynamic.

A bi-infinite spreadsheet by itself is 2D; filled with irrationals it is 3D. Cantor's diagonal proof only works on 1D or 2D. Sequences are 1D or 2D; magic sequences are 3D.

It may be true that two dynamic axes don't map to a static/dynamic pair; I'm not sure, I'll think about it.

You can map from a dynamic infinity to a static one, as I showed in my comment about counting binary trees, but you have to be careful.

To go from dynamic to static, the concept of actually going "through" infinity with a path, and onto another infinity, is crucial. This whole piece is needed to make set theory consistent, or else the existence of "uncountable binary trees" causes it to be ill-defined.

We can frame any set theory infinity, either static or dynamic, in terms of calculus infinities. We can step through calculus infinities, one step at a time, to view the behavior.

The set:

{ 3, 3.1, 3.14, 3.1415, ..., Pi }

In calculus infinities, this starts as a dynamic infinity that "teleports" itself to the static infinity containing Pi at the end. In set theory infinities, this just is.

Another ambiguity that should be noted:

{ 3, 3.1, 3.14, 3.1415, ..., }

This "syntax" should only be used to specify all of the number in the set "approaching Pi", but not Pi itself. Doing this resolves another problem.

I think that's all of it for now.

A special thanks for all of the people who patiently put up with my questions and provided answers over the last two weeks, without your help I could never have debugged the program.


## Wednesday, December 16, 2009

### The Magic Sequence

This post is a continuation of both my discussions at:

http://scienceblogs.com/goodmath/2009/12/another_cantor_crank_represent.php

and at:

http://theprogrammersparadox.blogspot.com/2009/12/infinite-node-tree.html

To understand what I am saying you have to get through all of my comments (and other responses) at Mark's site, my post and all of the comments for my post. I know. It's a lot of stuff, and currently it is badly organized.

This post will clear up the discussion at Good Math, Bad Math about Cantor's Diagonalization Argument, and later today I'll add another one which will (hopefully) show that R -> N.

Consider the following sequence of real numbers:

0.100000000...

0.110000000...

0.111000000...

...

0.010000000...

0.011000000...

0.011100000...

...

...

Please note the second terminating ellipsis (...); it is not a typo. The first two ellipses are symbolic notation explaining how to extend the two existing sub-sequences of numbers. The third one in this case means that we extend the list of infinite sub-sequences itself.

Now, this structure and this notation will provoke two questions:

Can't you enumerate this sequence and re-write it as such?

Is this syntax even legal?

I'd like to answer both of these questions together, starting with another example:

1 | [2]. 1 0 0 0 0 0 0 0 0 ...

2 | 0 . 1 1 0 0 0 0 0 0 0 ...

4 | 0 . 1 1 1 0 0 0 0 0 0 ...

...

3 | 0 .[2]1 0 0 0 0 0 0 0 ...

5 | 0 . 0 1 1 0 0 0 0 0 0 ...

8 | 0 . 0 1 1 1 0 0 0 0 0 ...

...

6 | 0 . 0[2]1 0 0 0 0 0 0 ...

9 | 0 . 0 0 1 1 0 0 0 0 0 ...

13 | 0 . 0 0 1 1 1 0 0 0 0 ...

...

...

In this example, on the left I have added a unique number from N to each row, using a method of counting where I gradually add in each new infinite sub-sequence, one by one, and extend the count to include them. This structure is countable.
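The left-hand numbering follows the usual anti-diagonal count, and can be checked with a small formula (a sketch; `label` is my name for it, not from the post):

```python
def label(row, col):
    """Natural number assigned to element `col` of sub-sequence `row`
    (both 1-indexed) when counting along anti-diagonals; this reproduces
    the numbering 1, 2, 4 / 3, 5, 8 / 6, 9, 13 in the example above."""
    d = row + col - 2
    return d * (d + 1) // 2 + row

print([label(1, c) for c in (1, 2, 3)])  # [1, 2, 4]
print([label(2, c) for c in (1, 2, 3)])  # [3, 5, 8]
print([label(3, c) for c in (1, 2, 3)])  # [6, 9, 13]
```

Every (row, col) pair gets a distinct finite number, which is what makes the assignment a counting at all.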

On the right I have attempted to apply Cantor's Diagonalization, but because there are an infinite number of sub-sequences, I can go forward to each, but I can never get back. The structure can NOT be diagonalized.

So we can clear up the first problem I keep hearing. The above example shows explicitly that:

countable != diagonalizable.

Now the really interesting bit, the coolest part of all of this, is the enumeration on the left-hand side. It is clearly enumerable, so we can write it out as a list of elements, but I won't dare! Why?

Because once I start writing out the elements I can never quit! That is, if at any point in the writing I stop and finish with ..., then that list (the one that I've written) is no longer the same as the one above.

Yes, it's an enumerable list, but no, it cannot be written out as a finite list of elements. The ellipsis is not strong enough to describe it. We need a symbol more like [...|...], which would clearly show that the list was going off to "two" infinities at exactly the same time, not just one (although it won't explain how to construct the missing elements).

Grasping for a name, I decided that "magic sequence" adequately reflects the fact that it is a valid countable sequence that we can write, but if and only if we never stop writing it!

Is this really part of set theory? Yes, absolutely. The magic sequence is a simple sequence that can be counted. The alternative double ellipsis syntax may or may not be acceptable, but even if it isn't it is just a convenience for being able to write down (in a finite state) a magic sequence.

Again, it is a simple countable sequence that is unwritable as such. The multiple-ellipsis notation is the only syntactic way we have of describing this mathematical object. How can this happen?

If we have:

- an infinite number of infinite elements.

everybody can agree that this is both countable and diagonalizable. But if we have:

- an infinite number of an infinite number of infinite elements

then it can't be diagonalized; the second infinity clearly gums up the works. But if we group it as:

- [ an infinite number of an infinite number ] of infinite elements

The first part of this is countable, and the entire thing is a magic sequence of infinite elements. So we can count it, and use it as a sequence.

Cantor's diagonalization argument starts with a sequence. If I supply it with a magic sequence, it becomes invalid. It cannot say one way or the other that there are missing elements.

This is a very deep and wide hole.

What's really interesting though, is that if R can be counted, then the only way it can be counted is with a magic sequence. A regular infinite sequence will ALWAYS fail the diagonalization argument. There is no question about this. The diagonalization argument was originally constructed to ensure this.
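For concreteness, the diagonal construction on an ordinary sequence can be sketched like this (on a finite prefix, with binary digit strings standing in for the listed reals; names are illustrative):

```python
def diagonal_escape(rows):
    """Flip the n-th digit of the n-th row; the resulting digit string
    differs from every listed row in at least one position."""
    return "".join("1" if row[n] == "0" else "0"
                   for n, row in enumerate(rows))

rows = ["100000", "110000", "111000", "010000", "011000", "011100"]
escape = diagonal_escape(rows)
print(escape)  # 000111 -- differs from rows[n] at position n, for every n
```

On an ordinary (single-ellipsis) listing, the constructed value is guaranteed to miss every row, which is the mechanism the paragraph above is referring to.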

So, if you've not been following my last post (and all of the comments at both sites), I will summarize the current state of my argument.

I have:

a) a mathematical object (binary infinite node tree), that I think is expressible enough to contain the reals (R). This object fits in set theory.

b) a mapping of R onto this structure (for which there are known problems, this mapping is invalid!).

c) another mathematical object (magic sequence).

d) a simple example that shows that a magic sequence is countable, but not diagonalizable, thus Cantor can't be applied to a magic sequence.

e) a way of traversing the structure in a, to generate a magic sequence.

At this moment, the only real hole I have in my argument comes from the irrationals. In the instance of the tree I created in the comments (the 22nd one I think, but there are no numbers), I put ALL of them in the 2nd level, cut into two sets. That clearly doesn't work.

If we start enumerating the tree, starting at root, the first item is Pi, and the second item is the first element of the set of irrationals over the range [0,Pi).

Unfortunately, there is no first element until we get to infinity, but we're not allowed to get there.

Still, although that particular mapping doesn't work, after a bit of thinking, I believe I have one that does. But that will have to wait for another post this evening, since I have to get back to work now ...

Sorry to leave you all hanging :-)

http://scienceblogs.com/goodmath/2009/12/another_cantor_crank_represent.php

and at:

http://theprogrammersparadox.blogspot.com/2009/12/infinite-node-tree.html

To understand what I am saying you have to get through all of my comments (and other responses) at Mark's site, my post and all of the comments for my post. I know. It's a lot of stuff, and currently it is badly organized.

This post will clear up the discussion at Good Math, Bad Math about Cantor's Diagonalization Argument, and later today I'll add another one which will (hopefully) show that R -> N.

Consider the following sequence of real numbers:

0.100000000...

0.110000000...

0.111000000...

...

0.010000000...

0.011000000...

0.011100000...

...

...

Please note the second terminating ellipsis (...), it is not a typo. The first two ellipsis' are symbolic notation to explain how to extend the two existing sub-sequences of numbers. The third one in this case means that we extend the infinite sequences themselves.

Now, this structure and this notation will provoke two questions:

Can't you enumerate this sequence and re-write is as such?

Is this syntax even legal?

I'd like to answer both of these questions together. Starting with another example:

1 | [2]. 1 0 0 0 0 0 0 0 0 ...

2 | 0 . 1 1 0 0 0 0 0 0 0 ...

4 | 0 . 1 1 1 0 0 0 0 0 0 ...

...

3 | 0 .[2]1 0 0 0 0 0 0 0 ...

5 | 0 . 0 1 1 0 0 0 0 0 0 ...

8 | 0 . 0 1 1 1 0 0 0 0 0 ...

...

6 | 0 . 0[2]1 0 0 0 0 0 0 ...

9 | 0 . 0 0 1 1 0 0 0 0 0 ...

13 | 0 . 0 0 1 1 1 0 0 0 0 ...

...

...

In this example, on the left I have added a unique number from N to each row, using a method of counting where I gradually add in each new infinite sub-sequence, one-by-one and extend the count to include them. This structure is countable.

On the right I have attempted to apply Cantor's Diagonalization, but because there are an infinite number of sub-sequences, I can go forward to each, but I can never get back. The structure can NOT be diagonalized.

So we can clear up the first problem I keep hearing. The above example shows explicitly that:

countable != diagonalizable.

Now the really interesting bit, the coolest part of all of this, is the enumeration on the left-hand side. It is clearly enumerable, so we can write it out as a list of elements, but I won't dare! Why?

Because once I start writing out the elements I can never quit! That is, if at any point in the writing I stop and finish with ..., then that list (the one that I've written) is no longer the same as the one above.

Yes, it's an enumerable list, but No it cannot be written out as a finite list of elements. The ellipsis is not strong enough to describe it. We need a symbol more like [...|...] which would clearly show that the list was going off to "two" infinities at exactly the same time, not just one. (although it won't explain how to construct the missing elements).

Grasping for a name, I decided that "magic sequence" adequacy reflects the fact that it is a valid countable sequence that we can write, but if and only if we never stop writing it!

Is this really part of set theory? Yes, absolutely. The magic sequence is a simple sequence that can be counted. The alternative double ellipsis syntax may or may not be acceptable, but even if it isn't it is just a convenience for being able to write down (in a finite state) a magic sequence.

Again, it is a simple countable sequence that is unwritable as such. The multiple ellipsis is the only syntactic way we have of describing this mathematical object. How can this happen?

If we have:

- an infinite number of infinite elements.

everybody can agree that this is both countable and diagonalizable. But if we have:

- an infinite number of an infinite number of infinite elements

then it can't be diagonalized, the second infinity clearly gums up the works. But if we group it as:

- [ an infinite number of an infinite number ] of infinite elements

The first part of this is countable, and the entire thing is a magic sequence of infinite elements. So we can count it, and use it as a sequence.

Cantor's diagonalization argument starts with a sequence. If I supply it with a magic sequence, it becomes invalid. It cannot say one way or the other that there are missing elements.

This is a very deep and wide hole.

What's really interesting though, is that if R can be counted, then the only way it can be counted is with a magic sequence. A regular infinite sequence will ALWAYS fail the diagonalization argument. There is no question about this. The diagonalization argument was originally constructed to ensure this.

So, if you've not been following my last post (and all of the comments at both sites), I will summarize the current state of my argument.

I have:

a) a mathematical object (binary infinite node tree), that I think is expressible enough to contain the reals (R). This object fits in set theory.

b) a mapping of R onto this structure (for which there are known problems, this mapping is invalid!).

c) another mathematical object (magic sequence).

d) a simple example that shows that a magic sequence is countable, but not diagonalizable, thus Cantor can't be applied to a magic sequence.

e) a way of traversing the structure in a, to generate a magic sequence.

At this moment, the only real hole I have in my argument comes from the irrationals. In the instance of the tree I created in the comments (the 22nd one I think, but there are no numbers), I put ALL of them in the 2nd level, cut into two sets. That clearly doesn't work.

If we start enumerating the tree, starting at root, the first item is Pi, and the second item is the first element of the set of irrationals over the range [0,Pi).

Unfortunately, there is no first element until we get to infinity, but we're not allowed to get there.

Still although that particular mapping doesn't work, after a bit of thinking, I believe I have one that does. But, that will have to wait for another post this evening, since I have to get back to work now ...

Sorry to leave you all hanging :-)

## Saturday, December 12, 2009

### The Infinite Node Tree

I have to stop dreaming about infinities, they keep waking me up at night ...

We can define an 'infinite node tree' as a tree similar to an N-ary tree, except that at each and every node, there are an infinite number of children.

Can this tree be traversed?

Yes, if we start at the first level, and create a set of all of the children at each level we get something like:

level1 = { c1, c2, .... }

level2 = { {c11, c12, ... }, {c21, c22, ...}, ... }

level3 = { {c111, c112, ...}, {c211, c212, ...}, ...}

...

In my previous post, I suggested that if we have several infinite sets we can combine them as follows:

If we have the set:

S = { { a1, b1, c1, ...}, {a2, b2, c2, ...}, ....}

Is there an injective and surjective mapping onto

{ a, b, c, d, ....}

Hypothesis: Yes

If we define a 'pass' to be taking one element from each of the sets gathered so far, then adding in a new set, we get an algorithm like:

N=1

sets = S{N}

forever

for each set in sets

get next element in current set

sets += S{++N}

So we traverse the above as follows:

a1, b1, a2, c1, b2, a3, d1, c2, b3, a4, ...

We can see that eventually we will get through all elements of S.

Then:

a = a1, b = b1, c = a2, d = c1, e = b2, f = a3, ....

Thus there appears to be a simple mapping between a single infinite set and an infinite set of infinite sets.
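As a sanity check, the interleaved traversal above can be sketched in runnable Python. The `dovetail` and `row` names are mine, and each `row` is a finite toy stand-in for one of the infinite sets:

```python
from itertools import islice

def row(i):
    # Toy stand-in for the infinite set {a_i, b_i, c_i, ...}.
    for letter in "abcdefghij":
        yield f"{letter}{i}"

def dovetail(make_row):
    # One 'pass': add in the next set, then take the next element
    # from every set gathered so far. Every element of every set is
    # reached after finitely many passes.
    iters = []
    n = 1
    while True:
        iters.append(make_row(n))
        n += 1
        for it in iters:
            yield next(it)

print(list(islice(dovetail(row), 10)))
# pass 1: a1 | pass 2: b1, a2 | pass 3: c1, b2, a3 | pass 4: d1, c2, b3, a4
```

The first ten elements come out in exactly the order listed above, which is what makes the traversal order well defined.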

Now we can do that for our infinite node tree, by adding in each new set, one at a time, for each new level. (two iterators, instead of one).

So for each iteration, we add in the next set of each new level (one at a time) and then add in the next elements of each new set in the list of sets. Something like:

N=1

M=1

sets = S{N}

levels = L{M}

forever

for each level in levels

for each set in sets

get next element in current set

sets += S{N++}

levels += L{M++}

(or something similar, it's late :-)

So, slowly, we will eventually traverse all of the elements in the tree exactly once (as we approach infinity).
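The two-iterator idea amounts to dovetailing over three indices at once: the level, the set within the level, and the element within the set. A minimal index-only sketch (the tree's contents don't matter for the traversal order; `tree_indices` is my name):

```python
from itertools import count, islice

def tree_indices():
    # Visit every (level, set, element) index triple exactly once by
    # sweeping planes of constant index-sum, so that no single index
    # runs off to infinity before the others get their turn.
    for total in count(3):
        for level in range(1, total - 1):
            for s in range(1, total - level):
                yield (level, s, total - level - s)

print(list(islice(tree_indices(), 6)))
# starts (1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 1, 1), ...
```

Any fixed triple of positive indices appears after finitely many steps, which is the property the traversal needs.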

So what can we do with this infinite node tree?

To make it simpler, we can remove all of the finitely-represented elements of R, and replace them with infinite representations, so for instance:

1 = 1.0000000...

4 = 4.0000000...

NOTE: I'm not sure if this is really necessary, but it does mean that I am left with a consistent set of infinite representations for all of the possible numbers in R. I'd rather deal with only one type of number, like Pi or sqrt(2).

Now if we consider that between any two numbers in R, there lies an infinite number of other numbers such that as a set they look like:

{ ..., A, ..., B, ... }

Then starting somewhere arbitrarily like 0, we can build an infinite node tree containing all of the numbers in R.

UPDATE2: To clarify, each real is contained in its own node in the tree. That is, there is a node for Pi and a node for sqrt(2), as well as nodes for all of the other numbers (rational and irrational).

Another, perhaps easier way to look at it is to add all of the known numbers in R to a level in the tree (like in Cantor's first set of the diagonalization argument). Between each of these numbers is an infinite number of children, and beneath all of them are still more infinite numbers of children, etc. As we find more, we can add them to the lower levels (so we can use Cantor's construction as the way to build the tree itself).

Either way, I think the entire structure of R fits into the structure of the infinite node tree.

UPDATE: I just woke up and realized that I am slightly wrong here. The child and the parent are right next to each other. To fix this, we actually need a binary infinite node tree, that is a tree where there are two infinite sets of children, the left set and the right set. Fortunately, having this extra set at each node doesn't significantly change how we traverse the tree (instead of adding the next child, we add in the next one from the left and the next one from the right). Now between any and every parent and child is an infinite number of other children.

AND this tree is traversable.

Thus, iterating through the tree, as we visit each node we can assign it a number in N. This appears to form an injective and surjective mapping between N and R.

"You can't refute Cantor's proof using an enumeration without *addressing* Cantor's proof." - Mark C. Chu-Carroll

In the first part of Cantor's diagonalization argument, we construct a list of all of the elements from the mapping.

That's the first problem. The structure is actually a tree of infinite breadth and infinite depth, and in this case one that is not easily expressible as a simple list.

The second part of Cantor's construction is to create an element from the diagonal. As we start this, we descend down one branch of the tree, but as the tree is infinite in length, we can't ever get to the bottom, so we can't go to the next branch.

What I believe this means (although it is late), is that we can't construct the element. Ever.

So we can't construct an element that isn't already included in our tree. I take this to mean either we can't apply this construction method OR it definitely shows that all elements are now contained in the tree. Either way it doesn't invalidate the mapping.

OK, that's enough, it's late and I need my beauty sleep.

UPDATE2: I thought of another interesting bit. If we view Cantor's diagonalization construction as a number generator, since it will generate an infinite amount of numbers, then we can use it as the means to count. That is, we seed it with some finite list of initial values, assigning them all a number. Then at each iteration, a new unique number will be generated. Since this goes on for infinity, it will create an infinitely large set of countable numbers. However, the question is, what if there are gaps? Someone could prove that as the list approaches infinity, the gaps decrease in size, but another way of dealing with it is to use two of these generators at each node in the binary infinite node tree. Then we can parallelize all of them at once, and construct a tree that eventually won't have any gaps at any level (I think). That's an interesting way to create the tree.

UPDATE3: There are a couple of problems with my above arguments, that I was thinking about today.

The first is that I keep falling back to the calculus notion of infinity, instead of the set theory one. That is, I am relying on "generating" the tree, not just having it created (all at once). I keep trying to sneak up on infinity, and it keeps running away and hiding.

The second is that my tree construction is vague. I can fix both problems at once.

First we'll start with: if there are any trees that will satisfy the bijection, there are an infinite number of them.

I can envision a number of ways to create these trees, but I realized the simplest way was there all along. The tree can be constructed by using Dedekind cuts. That is, each and every cut is actually a node in the tree, with the two sets on either side of it as the left and right child sets. It's a very natural extension of the cuts into an (infinite) tree structure.

Since the cuts resolve the differences between rationals and irrationals, I'm assuming that we can cut down to an infinite number in both the left and right sets, thus getting to every point on the R number line, which is more or less what Wikipedia says the cuts accomplish. Well it seems to say that it applies to the irrational real numbers, but I think that it works for both (or at least for my altered reals where I made everything infinite in length).

Then the tree created in that manner is assured of getting to every member of R. There are no duplicates, and there are no gaps. And the whole structure can be created as a set in one shot.
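For reference, a Dedekind cut is conventionally modelled as the set of all rationals below the real it names. A minimal Python sketch, with the cut's left set given as a membership predicate (the function name is mine):

```python
from fractions import Fraction

def left_of_sqrt2(q: Fraction) -> bool:
    # Membership test for the left set of the cut defining sqrt(2):
    # all rationals q with q < 0, or with q >= 0 and q*q < 2.
    return q < 0 or q * q < 2

# Rationals on either side of the cut:
assert left_of_sqrt2(Fraction(7, 5))        # 1.4 lies below sqrt(2)
assert not left_of_sqrt2(Fraction(3, 2))    # 1.5 lies above it
```

In the standard construction a cut splits the rationals around a single point; turning each cut's two sides into left and right child sets of a tree node, as suggested above, goes beyond that standard usage.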

An issue that might bug some people is that I am using the term 'tree' when I believe there is no such thing in ZFC set theory. That's OK because in a level by level manner, sets can nicely express trees. Thus using the term tree in this context is a convenience to make the whole idea easier to understand, not a necessity.

However, the real meat of this idea is how the structure of the tree itself is used to get around Cantor's diagonalization. If we can't create the element, then it can't be missing. But we can't create it because we get lost down one branch of the tree. I don't get into the trap of having to create the element all at once precisely because Cantor's argument is based on 'constructing' the element, not just having it exist in a completed form. Where I was playing fast and loose before, with time, in this case I believe that I am allowed to rely on it (existing).

(Also, I thought about a bunch of other construction techniques, and also whether or not the tree can have duplicates (or be a DAG instead of a tree), and whether the tree actually needs to be ordered (which I was actually assuming all along, but forgot to mention). But I realized that the Dedekind cut construction was by far the cleanest and that it has already been established as being correct (although not necessarily my usage of it)).

## Thursday, December 10, 2009

### Quickie: Infinite Sets

This is just a leftover from the primes stuff, but I thought I'd explain it a bit better (in case it might be of interest to others in relation to Cantor):

If we have the set:

S = { { a1, b1, c1, ...}, {a2, b2, c2, ...}, ....}

Is it mappable onto:

{ a, b, c, d, ....}

Hypothesis: Yes

If we define a pass to be taking one element from each of the sets gathered so far, then adding in a new set, we get an algorithm like:

N=1

sets = S{N}

forever

for each set in sets

get next element in current set

sets += S{++N}

So we traverse the above as follows:

a1, b1, a2, c1, b2, a3, d1, c2, b3, a4, ...

We can see that eventually we will get through all elements of S.

Then:

a = a1, b = b1, c = a2, d = c1, e = b2, f = a3, ....

And thus there is a simple mapping between an infinite set and an infinite set of infinite sets.

This implies that any number of infinities in a set is identical to a single infinity.

Is this a problem?

UPDATE:

The number Pi can be seen as a series of "lower" precision approximations that are approaching the actual infinitely precise version of the number. Thus as a set we can see it as:

Pi ~= { 3, 3.1, 3.14, 3.141, 3.1415, ... , Pi }

And that set is equivalent to:

{ Pi, 3, 3.1, 3.14, 3.141, 3.1415, ... }

The same is true for all irrationals: they can be represented as a set of increasingly precise approximations to the infinite-precision Real representation of the number:

{ 1/3, .3, .33, .333, .3333, .... }
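These increasingly precise approximations can be generated mechanically from a digit string; a small sketch (the finite digit string stands in for an infinite expansion, and `truncations` is my name):

```python
def truncations(digit_string):
    # Yield successively longer decimal truncations, e.g. "3.1415"
    # gives 3, 3.1, 3.14, 3.141, 3.1415.
    head, _, frac = digit_string.partition(".")
    yield head
    for i in range(1, len(frac) + 1):
        yield head + "." + frac[:i]

print(list(truncations("3.1415")))
# ['3', '3.1', '3.14', '3.141', '3.1415']
```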

And any two different infinite sets can be traversed in one infinite pass.

Also, since our traversal of the sets essentially has an infinite amount of time and space, we can merge any two duplicate-plagued sets together, being sure to list each element exactly once in the new set construction. It may be unnervingly slow, but it is still computable.

## Wednesday, December 9, 2009

### Quickie: Let it be Free

I was thinking about how the "scientists" in the Climategate scandal are hiding their raw data. If the science is really objective, then by its own merits other researchers will reach the same conclusions.

If you're afraid that they won't, then you're hiding something.

As such, no scientific researcher should ever hide their data. Ever! Not only that, but their work should be out there too -- formulas, theories, methodologies, everything -- to be judged not only by their peers, but also by the public at large. If science is a quest for the truth, then the truth has nothing to fear from inspection.

But I realize that it is not just our disgraced climate researchers who have adopted policies of secrecy. You can't, for example, get most of the academic work in Computer Science unless you join the ACM. In fact you can't get most works in academia unless you join some tight-knit clique. It's all closely guarded.

Long ago, I could understand how these organizations needed to charge money in order to recoup the costs of publishing on paper. It makes sense that they shouldn't take a loss. But amazingly that changed in the Information Age, and instead of offering free downloads it appears that most organizations choose to monetize their IP (business speak for "go for the profits").

You could make some well thought-out arguments about how the funds collected go back into the organizations to help spread their good works, but oddly one can easily suspect that most of the income is really used to protect the revenue generation (business speak for "stuff that causes profits") and pay salaries.

Academia -- data, papers, methodology, everything -- should be free and done out in the public. It should be open, and visible. Approachable. Companies hide their research because they are locked in competitive battles. Universities (and other research institutes) are not companies.

Research is not competition, it should be co-operative. And it shouldn't matter who contributes (particularly when we live in a day and age where so many people are both eager to help and have the available time).

I should never be told that I can't download a paper unless I pay a big sum of money. It should be free.

## Saturday, December 5, 2009

### Transformers

First, some Administrivia. Given that I haven't sold any copies for quite a while now, I've decided to make the electronic copy of "The Programmer's Paradox" free for a while (a few months probably). It can be found at Lulu:

http://www.lulu.com/content/paperback-book/the-programmers-paradox/178872

Lack of sales is really my fault, as I was never really strong about pushing the book (as it was never edited as well as I would have liked). It was my first real act of writing, so please go easy on me :-)

TRANSFORMATIONS

While I was fiddling with prime numbers, I also had another collection of unrelated ideas invade my brain. At the high level, it involves how we have constructed Computer Science as a formal system.

Computers deal with two distinct things: code and data. This is an inherent duality in the basic concept.

However, all of our theoretical models are extremely "code-centric". That is, they focus on the boundaries of the code within a formal system, and leave the data as just a secondary player.

This is true of Turing machines, deterministic finite automata and all of the other formal-system models of computing that I am aware of. The data exists, but only as an unimportant set of "symbols" that come and go on some "tape". It's the code we are really studying, not the data.

As I've said before, that is not entirely unexpected as most people's introduction to computers comes from a code perspective, and they see the machines in terms of the user functionality and how they can apply it.

Still, the counter-approach of seeing computers in terms of machines that assemble mass piles of data is an equally correct, yet significantly less complex way of visualizing them.

Code is intrinsically complex and hard to unwind; data is structurally complex and is mostly normalized by virtue of its usage. That is, there are an infinitely large number of ways to write a program, but there is only a small (and finite) number of reasonable mappings from the internal data back to reality. The data is either correct, or it isn't (not accounting for bugs, of course).

A MODEL

So, I started thinking about some way to model the data "within" the system, without having it overshadowed by the code.

I've worked out a number of the basic elements, but these ideas are far from complete. Because of this, I'll try to include my rationale in this discussion, along with the rather sparse ideas, in the hope that collectively these ideas will get pushed farther along by anyone interested in them.

Starting at the top, if we aren't interested in the code, then it shouldn't appear in the model explicitly. With that in mind we can define a transform T that maps the data from its value A to another value B.

T(A) -> B

A transformer is some block of (unknown) code that takes an explicit set of inputs, and returns an explicit set of outputs. To make life easier, there won't be a concept of global variables. Everything necessary for the transformation must be passed in explicitly. And there is also no concept of "side effects". Everything changed in any way is passed back outwards. Transforms can return multiple values.

So we might have something more complex like:

T(a,b,c,d,e,f,g,h) -> A,B,C

A transform that takes 8 parameters, applies some type of algorithm to the incoming data and returns three other new parameters. It's a nice simple start.
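In code terms, a transform is just a pure function: explicit inputs, explicit (possibly multiple) outputs, no globals, no side effects. A hypothetical Python rendering, where the arithmetic is invented purely for illustration:

```python
def T(a, b, c, d, e, f, g, h):
    # A transform: consumes 8 explicit inputs, applies some algorithm,
    # and returns 3 explicit outputs. Nothing else is read or written,
    # so the mapping is fully determined by the arguments.
    A = a + b + c
    B = d * e - f
    C = (g, h)
    return A, B, C

A, B, C = T(1, 2, 3, 4, 5, 6, 7, 8)
```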

Now, none of this would be interesting if we couldn't use this to model something relevant. What might be interesting to know, is "given that there is some infinite number of possible algorithms for a class of transforms (or there are none), what are the bounds of these algorithms as they are running?"

SORTING

I woke up one night, a while back, with an idea about how to use transforms to calculate the bounds on algorithms. While asleep it seemed to be a very strong idea, but awake I've been unable to fully pin in down. It's mostly about continuously decomposing the algorithms, to get a better idea of how they work, but without ever having to examine the inner workings themselves.

The general idea is pretty simple (if it works). While we can't just splat in a bunch of algorithms into the transforms and calculate the bounds, we can sub-divide the transforms into smaller ones (endlessly) until we can use those atom pieces to calculate best and worst performing boundaries.

So for instance, we might be interest in studying the class of "sorting" transformers. That is, any transformer that takes N elements in one arrangement and returns them in another re-arranged "sorted" (lexically, numerically, etc.) order.

If we were going to look at the trivial case, then we find that:

Ts(A) -> A [ O(1) .. O(1) ]

In English, this says that any transform that "sorts" a single entity, requires at minimum "constant time" to sort, and at maximum, also constant time to sort (I used a range notation, but I'd guess that using little o and big O notation may be more expected (acceptable)).

What's really interesting here is to figure out the range of growth of the performance of the family of algorithms. What I'd really like to know is what is the range for:

Ts(A1,A2,...,An) -> Aw,Ac,...,Am

The various different families of algorithms that differently sort through a list of N elements.

We can deal with this by taking the two bounding cases at the same time; the best and worst case.

In the best case, the elements are sorted because the algorithm "knows" the behavior. That is the above algorithm decomposes into

Ts(A1,A2,...,An,L) -> Al,P1

Ts(A1,A2,...,An,K) -> Ak,P2

...

Ts(A1,A2,...,An,M) -> Am,Pn

That is, for each different element (L,K,M) at a time, a transformation takes the whole list, the current element, and returns that element and it's position in the new list.

Since we're still working with the best case, there are N decompositions of the original algorithm that generate N positions, which are used to re-arrange the list. Each decomposition, if we are still in the best case, could use something "intrinsic" from its input to calculate P. As such, the best possible sort algorithm which can use "intrinsic information" from it's incoming data, can only ever re-order the list in n*O(1) time which I think works out to O(n) time.

In the worst case, however, assuming that the code is all practical, and really does things, we can decompose the sort algorithm down into N transforms, each of which has to examine every N elements, in order to calculate P. As such, it works out to n*n*O(1) or O(n^2). So we get something like:

Ts(A1,A2,...,An) -> Aw,Ac,...,Am [ O(1), O(n^2) ]

THE POWER OF SEARCHING

In an analogous manner to sorting we could start examining the class of transformers that take a large number of inputs, and return a subset of "found" outputs. I won't go into too much detail, but at each level of decomposition we can reduce the number of arguments, split off a best and worst case. In that way we might start with:

Tf(A1,A2,...,An,C1,C2,...,Cm) -> (null | Ak,...,Al)

We could simplify this by adding null in as an input:

Tf(null, A1,A2,...,An,C1,C2,...,Cm) -> Ak,..,Al

Then this describes the "find" family of transforms, that takes a set of input, and a set of criteria, and then returns nothing, or some smaller subset of "found" items.

At the very bottom, as a best case, a transform would either "know" about that the data was or was not it what it was looking for. Otherwise the transform would have to look through everything, and determine which were the correct elements.

The best and worse cases are similar to the space for sorting.

SUMMARY

So why do I think this is useful? Primarily because it can be used to draw hard boundaries on a family of algorithms. From the standard Computer Science perspective, you can work with algorithmic complexity to find better algorithms, but you have no way of knowing if you've found the best one.

There is some computational bound which cannot be crossed, and it is nice to know where that bound is.

Of course in my examples, I took the best case to be a trivially not-possible case, but that was primarily because I've really haven't gone deep enough to understand that lower bound in a better fashion. It's likely, from the output, and from decompositions, that we should be able to figure a way to calculate a much more accurate best case boundary.

Another place where I think this could be useful is in studies the computational complexities themselves. We've had categories of complexity for a long time like NP, but we've been unable to make a dent in determining elementary issues like P=NP.

These problems too represent important boundaries on the things that can and would like to do with our computers. Unfortunately now, the studies in these areas have been tied to working with existing algorithmic problems, and with specific algorithms.

If we can work with the complexity classes, without having to descend into finding the actual algorithms themselves, we may be able to construct higher order proofs about our boundaries, without getting too wrapped up in the specifics of existing algorithms.

Another interesting issue about this approach is that it takes us one more step down a path I've talked about before:

http://theprogrammersparadox.blogspot.com/2009/04/end-of-coding-as-we-know-it.html

With our current code-centric perspectives on software systems, it seems as if the systems can quickly exceed everyone's threshold of understanding as they swell in complexity.

What this means in practical terms is that while the system may start under control, with enough people contributing, and enough work getting done, the system rapidly moves into a "stupid zone", where it has become so large and so complex, that nobody can fully understand it anymore.

If nobody can understand it, then nobody can use it (fully), and as such it is just a monolithic nightmare of related functionality.

Once large enough, the only way programmers can deal with adding in new functionality is by carving off little sections and ignoring the rest. This divide and conquer strategy works, but it also means the functionality itself starts becoming disjoint, and the system gets even more complex as a result.

So we get these behemoths that are so large and so disjoint that people can't make use of the available functionality. We pass over some maximum point in combined functionality, where on the other side our usage degenerates horribly.

The perfect example of this is Microsoft Word. Maybe a few decade agos, most users understood most of the functionality and could use it well. Now, the software is so convoluted that most users are forced to use less features just because the interactions between the features rarely does what is expected.

We used to heavily "type" our documents to allow the formater to make them auto-magically consistent. Now, anybody would have to be crazy to turn on the formatting, it hasn't worked (behaved consistently) in years. So we're back to working with our documents in the most crude manner possible.

The general case is that if you keep building on some underlying complexity, at some point the complexity itself will overwhelm what you are doing, and become its own issue.

If instead of building up ever-increasing mountains of complexity, we just worked at one low level abstraction and used the machines themselves to dynamically assemble the complexity, we could avoid a lot of problems.

The abilities of our computers are strong enough that they can be used to map pathways through a sea of transformations. Intellect is necessary for "how" to do the transformations, but it is not necessary to calculate a deterministic path between a large series of them.

We've based our current architectures and technologies implicitly on our existing theories. The Turing machine model drove the hardware construction and that drove the way we've built software. This in turn controls how we see our software problems.

We see every problem as a nail, because we started out with using hammers. And after a very long time of pounding nails into anything that moved, we're now seeing the intrinsic weaknesses in our approach to building things.

If we can't control and contain complexity, we can't exceed a fairly low threshold for the sophistication of our systems. We are stuck, just crudely re-write the same things over and over again, with simple variations on the same underlying technologies. After twenty years, this lack of progress is getting boring.

http://www.lulu.com/content/paperback-book/the-programmers-paradox/178872

Lack of sales is really my fault, as I was never really strong about pushing the book (as it was never edited as well as I would have liked). It was my first real act of writing, so please go easy on me :-)

TRANSFORMATIONS

While I was fiddling with prime numbers, I also had another collection of unrelated ideas invade my brain. At a high level, it involves how we have constructed Computer Science as a formal system.

Computers deal with two distinct things: code and data. This is an inherent duality in the basic concept.

However, all of our theoretical models are extremely "code-centric". That is, they focus on the boundaries of the code within a formal system, and leave the data as just a secondary player.

This is true of Turing machines, deterministic finite automata, and all of the other formal models of computing that I am aware of. The data exists, but only as an unimportant set of "symbols" that come and go on some "tape". It's the code we are really studying, not the data.

As I've said before, that is not entirely unexpected, as most people's introduction to computers comes from a code perspective, and they see the machines in terms of the user functionality and how they can apply it.

Still, the counter-approach of seeing computers in terms of machines that assemble mass piles of data is an equally correct, yet significantly less complex way of visualizing them.

Code is intrinsically complex and hard to unwind; data is structurally complex, but mostly normalized by virtue of its usage. That is, there are infinitely many ways to write a program, but only a small (and finite) number of reasonable mappings from the internal data back to reality. The data is either correct, or it isn't (not accounting for bugs, of course).

A MODEL

So, I started thinking about some way to model the data "within" the system, without having it overshadowed by the code.

I've worked out a number of the basic elements, but these ideas are far from complete. Because of this, I'll try to include my rationale in this discussion, along with the rather sparse ideas, in the hope that collectively these ideas will get pushed farther along by anyone interested in them.

Starting at the top, if we aren't interested in the code, then it shouldn't appear in the model explicitly. With that in mind, we can define a transform T that maps the data from its value A to another value B.

T(A) -> B

A transformer is some block of (unknown) code that takes an explicit set of inputs, and returns an explicit set of outputs. To make life easier, there won't be a concept of global variables. Everything necessary for the transformation must be passed in explicitly. And there is also no concept of "side effects". Everything changed in any way is passed back outwards. Transforms can return multiple values.

So we might have something more complex like:

T(a,b,c,d,e,f,g,h) -> A,B,C

A transform that takes 8 parameters, applies some type of algorithm to the incoming data, and returns three new values. It's a nice simple start.
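
The transform idea can be sketched as an ordinary pure function. This is just an illustration in Python, with made-up names and a made-up algorithm; the only rules it demonstrates are the ones above: no globals, no side effects, everything in through the arguments and out through the return values.

```python
# A minimal sketch of a transform: a pure function with no globals and no
# side effects. The function and its algorithm are made up for illustration.

def transform(a, b, c):
    """T(a,b,c) -> Total,Mean -- all inputs explicit, all outputs returned."""
    total = a + b + c       # some algorithm applied to the incoming data
    mean = total / 3
    return total, mean      # transforms can return multiple values

total, mean = transform(1, 2, 3)   # -> (6, 2.0)
```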

Now, none of this would be interesting if we couldn't use it to model something relevant. What might be interesting to know is: "given that there are infinitely many possible algorithms for a class of transforms (or none at all), what are the bounds of these algorithms as they are running?"

SORTING

I woke up one night, a while back, with an idea about how to use transforms to calculate the bounds on algorithms. While asleep it seemed to be a very strong idea, but awake I've been unable to fully pin it down. It's mostly about continuously decomposing the algorithms, to get a better idea of how they work, but without ever having to examine the inner workings themselves.

The general idea is pretty simple (if it works). While we can't just drop a bunch of algorithms into the transforms and calculate the bounds, we can sub-divide the transforms into smaller ones (endlessly) until we can use those atomic pieces to calculate best- and worst-performing boundaries.

So, for instance, we might be interested in studying the class of "sorting" transforms. That is, any transform that takes N elements in one arrangement and returns them in another, re-arranged "sorted" (lexically, numerically, etc.) order.

If we were going to look at the trivial case, then we find that:

Ts(A) -> A [ O(1) .. O(1) ]

In English, this says that any transform that "sorts" a single entity, requires at minimum "constant time" to sort, and at maximum, also constant time to sort (I used a range notation, but I'd guess that using little o and big O notation may be more expected (acceptable)).

What's really interesting here is to figure out the range of growth of the performance of the family of algorithms. What I'd really like to know is what is the range for:

Ts(A1,A2,...,An) -> Aw,Ac,...,Am

The various families of algorithms that sort a list of N elements in different ways.

We can deal with this by taking the two bounding cases at the same time: the best case and the worst case.

In the best case, the elements are sorted because the algorithm "knows" the behavior. That is, the above algorithm decomposes into:

Ts(A1,A2,...,An,L) -> Al,P1

Ts(A1,A2,...,An,K) -> Ak,P2

...

Ts(A1,A2,...,An,M) -> Am,Pn

That is, taking each different element (L, K, ..., M) one at a time, a transform takes the whole list and the current element, and returns that element and its position in the new list.

Since we're still working with the best case, there are N decompositions of the original algorithm that generate N positions, which are used to re-arrange the list. Each decomposition, if we are still in the best case, could use something "intrinsic" from its input to calculate P. As such, the best possible sort algorithm, which can use "intrinsic information" from its incoming data, can only ever re-order the list in n*O(1) time, which I think works out to O(n) time.
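
As a rough sketch of what "intrinsic information" could mean in practice, assume each element already carries its final position — a made-up setup, though it is roughly the trick behind real linear-time placements such as counting sort on a known key range. Each decomposition then reduces to a single O(1) placement:

```python
# Hypothetical best case: each element "knows" its position, so each of the
# N sub-transforms does O(1) work, for n * O(1) = O(n) overall.

def best_case_sort(items):
    """items are (position, value) pairs; the position is 'intrinsic'."""
    out = [None] * len(items)
    for pos, value in items:        # N decompositions, one placement each
        out[pos] = value            # O(1) per element -- no comparisons
    return out

best_case_sort([(2, 'c'), (0, 'a'), (1, 'b')])   # -> ['a', 'b', 'c']
```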

In the worst case, however, assuming that the code is all practical, and really does things, we can decompose the sort algorithm down into N transforms, each of which has to examine all N elements in order to calculate P. As such, it works out to n*n*O(1), or O(n^2). So we get something like:

Ts(A1,A2,...,An) -> Aw,Ac,...,Am [ O(n) .. O(n^2) ]
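
The worst-case decomposition can be sketched the same way: a sub-transform that, given the whole list and one element, scans all N elements to compute that element's position P, run once for each of the N elements. This is essentially a selection-style sort, used here only to make the n*n*O(1) counting concrete:

```python
# Worst case: each of the N sub-transforms scans all N elements to compute
# the position P of one element, giving n * n * O(1) = O(n^2).

def position(items, i):
    """Sub-transform: whole list + one element -> that element's rank."""
    return sum(1 for j, x in enumerate(items)
               if x < items[i] or (x == items[i] and j < i))

def worst_case_sort(items):
    out = [None] * len(items)
    for i in range(len(items)):              # N decompositions...
        out[position(items, i)] = items[i]   # ...each scanning N elements
    return out

worst_case_sort([3, 1, 2, 1])   # -> [1, 1, 2, 3]
```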

THE POWER OF SEARCHING

In an analogous manner to sorting, we could start examining the class of transforms that take a large number of inputs, and return a subset of "found" outputs. I won't go into too much detail, but at each level of decomposition we can reduce the number of arguments and split off a best and a worst case. In that way, we might start with:

Tf(A1,A2,...,An,C1,C2,...,Cm) -> (null | Ak,...,Al)

We could simplify this by adding null in as an input:

Tf(null, A1,A2,...,An,C1,C2,...,Cm) -> Ak,...,Al

Then this describes the "find" family of transforms, that takes a set of input, and a set of criteria, and then returns nothing, or some smaller subset of "found" items.
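
A sketch of the Tf family, with the criteria modelled as plain predicate functions — an assumption made here for illustration; the model above doesn't say what a criterion actually is:

```python
# The "find" family: inputs plus criteria in; null (None) or the matching
# subset out. Criteria are modelled as predicates, an assumption for
# illustration only.

def find(items, *criteria):
    """Tf(A1..An, C1..Cm) -> None | (Ak, ..., Al)"""
    found = tuple(a for a in items if all(c(a) for c in criteria))
    return found or None    # the 'null' result when nothing matches

find([1, 2, 3, 4, 5, 6], lambda a: a % 2 == 0, lambda a: a > 3)   # -> (4, 6)
```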

At the very bottom, as a best case, a transform would simply "know" whether or not the data was what it was looking for. Otherwise, the transform would have to look through everything and determine which were the correct elements.

The best and worst cases are similar to those for sorting.

SUMMARY

So why do I think this is useful? Primarily because it can be used to draw hard boundaries on a family of algorithms. From the standard Computer Science perspective, you can work with algorithmic complexity to find better algorithms, but you have no way of knowing if you've found the best one.

There is some computational bound which cannot be crossed, and it is nice to know where that bound is.

Of course, in my examples I took the best case to be a trivially not-possible case, but that was primarily because I really haven't gone deep enough to understand that lower bound in a better fashion. It's likely, from the output and from the decompositions, that we should be able to figure out a way to calculate a much more accurate best-case boundary.

Another place where I think this could be useful is in studying the computational complexity classes themselves. We've had categories of complexity, like NP, for a long time, but we've been unable to make a dent in determining elementary issues like P=NP.

These problems, too, represent important boundaries on the things that we can do, and would like to do, with our computers. Unfortunately, for now the studies in these areas have been tied to working with existing algorithmic problems, and with specific algorithms.

If we can work with the complexity classes, without having to descend into finding the actual algorithms themselves, we may be able to construct higher order proofs about our boundaries, without getting too wrapped up in the specifics of existing algorithms.

Another interesting issue about this approach is that it takes us one more step down a path I've talked about before:

http://theprogrammersparadox.blogspot.com/2009/04/end-of-coding-as-we-know-it.html

With our current code-centric perspectives on software systems, it seems as if the systems can quickly exceed everyone's threshold of understanding as they swell in complexity.

What this means in practical terms is that while the system may start under control, with enough people contributing, and enough work getting done, the system rapidly moves into a "stupid zone", where it has become so large and so complex, that nobody can fully understand it anymore.

If nobody can understand it, then nobody can use it (fully), and as such it is just a monolithic nightmare of related functionality.

Once large enough, the only way programmers can deal with adding in new functionality is by carving off little sections and ignoring the rest. This divide and conquer strategy works, but it also means the functionality itself starts becoming disjoint, and the system gets even more complex as a result.

So we get these behemoths that are so large and so disjoint that people can't make use of the available functionality. We pass over some maximum point in combined functionality, where on the other side our usage degenerates horribly.

The perfect example of this is Microsoft Word. Maybe a couple of decades ago, most users understood most of the functionality and could use it well. Now, the software is so convoluted that most users are forced to use fewer features, just because the interactions between the features rarely do what is expected.

We used to heavily "type" our documents to allow the formatter to make them auto-magically consistent. Now, anybody would have to be crazy to turn on the formatting; it hasn't worked (behaved consistently) in years. So we're back to working with our documents in the most crude manner possible.

The general case is that if you keep building on some underlying complexity, at some point the complexity itself will overwhelm what you are doing, and become its own issue.

If instead of building up ever-increasing mountains of complexity, we just worked at one low level abstraction and used the machines themselves to dynamically assemble the complexity, we could avoid a lot of problems.

The abilities of our computers are strong enough that they can be used to map pathways through a sea of transformations. Intellect is necessary for "how" to do the transformations, but it is not necessary to calculate a deterministic path between a large series of them.

We've based our current architectures and technologies implicitly on our existing theories. The Turing machine model drove the hardware construction and that drove the way we've built software. This in turn controls how we see our software problems.

We see every problem as a nail, because we started out with using hammers. And after a very long time of pounding nails into anything that moved, we're now seeing the intrinsic weaknesses in our approach to building things.

If we can't control and contain complexity, we can't exceed a fairly low threshold for the sophistication of our systems. We are stuck crudely re-writing the same things over and over again, with simple variations on the same underlying technologies. After twenty years, this lack of progress is getting boring.
