TeX2HTML - Converted Document

  

Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems (P. M. Pardalos, Editor), © 1999 Kluwer Academic Publishers

On Complexity and Optimization in Emergent Computation

Victor Korotkich (v.korotkich@cqu.edu.au)

Faculty of Informatics and Communication
Central Queensland University
Mackay, Queensland 4740, Australia

Abstract

New results on complexity and optimization in emergent computation are presented. The results are found by using a final mathematical structure based on integers only. The structure is a collection of hierarchical formations of integer relationships. The formations are used to suggest a concept of structural complexity. A principle to discuss the specification of natural systems in terms of hierarchical formations of the structure is given. As the principle is not subject to constraints of the Turing model of computation, a problem of how it can best be approximated is considered. The problem is explored for a special case by reducing to the problem of binary sequence prediction with structural complexity as the criterion. An optimal algorithm to the last problem is presented. The algorithm gives an experimental verification of the principle by a connection with period doubling.

Keywords: emergent computation, complexity, optimization, approximation

1  Introduction

Emergent computation can often be viewed when separate components combine and form something new, in which they act harmoniously together. Depending on the context, the components may represent cells in the human immune system, termites in a colony nest, individual atoms in a molecule, or any of a wide variety of other examples [1]. The importance of emergent computation to global optimization is clear as an increasing number of algorithms are developed to mimic it (for example, simulated annealing, genetic and evolution algorithms).

Research in emergent computation is huge and interdisciplinary. It includes an ever growing number of disciplines such as computer science, mathematics, physics, cellular and molecular biology, social and economical sciences with each of them making a valuable concrete contribution. Increasingly, separate disciplines are indicating that at some deep level there are certain universal principles of emergent computation. In particular, the possible existence of these principles is the main motive underlying systems, complexity and chaos theories.

It is becoming increasingly evident that understanding of the universal principles requires a profound revision of the most fundamental ideas. Probably the most noticeable attempt of such a rethinking is a world view in which natural systems are seen not as separate entities but as integrated parts of a unified whole (for example [2]). The supporting evidence for the world view comes mainly from quantum physics. In Cartesian world view it is assumed that natural systems are independently existing entities interacting in spatio-temporal void, and this is believed to explain everything.

To understand the world as a unified whole a mathematical structure that can disclose and specify this world view is crucially needed (for example [3]). It was gradually realised that the problem of proposing such a structure was mainly about finding proper irreducible concepts to base the structure on and how to develop it without ever once referring to something that is not irreducible. A mathematical structure that has this property is termed a final theory [4]. In particular, no physical concepts can be used as there is no final theory in physics to date.

The aim of the paper is to present some new results on complexity and optimization in emergent computation. The results are found by using a final mathematical structure based on integers only. An arsenal of tools used in the definition of the structure is given in sections 2-5. The structure, called a web of relations, is a collection of hierarchical formations with each of them producing level by level increasingly complex elements, which are nothing but integer relationships. This motivates to propose the web of relations as an universal scale to measure the complexity of things in terms of the hierarchical formations of integer relationships. These formations are used to suggest a concept of structural complexity presented in section 5.

A principle, formulated to discuss the specification of natural systems in terms of hierarchical formations of the structure, is given in section 6. It says that the specification is given by hierarchical formations having the maximum structural complexity among possible ones. To study the principle a model that captures emergent computation and provides abstractions for the development of algorithms is outlined in section 7. By using the model the principle is expressed in terms of a notion of coherent solution in section 8. The notion says that binary sequences of a coherent solution maximize structural complexity.

As the principle is not subject to constraints of the Turing model of computation, the problem of how coherent solutions can best be algorithmically approximated is presented in section 9. The approximation problem is very difficult. It is considered for a special case by reducing to the problem of binary sequence prediction with structural complexity as the criterion. An optimal algorithm to the last problem is described in section 10. The algorithm gives an experimental verification of the principle by a connection with period doubling, an observable phenomenon of self-organization in nature. The optimal algorithm with the help of a generator of the celebrated Prouhet-Thue-Morse (PTM) sequence can be formulated as a simple strategy: ``win - stay, lose - consult PTM generator''.

2  Integer code series and a global description of sequences

An integer code series [5] gives a representation for definite integrals of piecewise constant functions and plays an important role in the description of the web of relations.

Namely, let I be an integer alphabet and

In = {s = s1...sn, si Î I, i = 1,...,n }
be the set of all sequences of length n ³ 2 with symbols in I. If I = { -1,+1 } then In is the set of all binary sequences of length n, denoted Bn. Let s(i) = s1...si and |s(i) | = i denote the length of s(i) Î Ii, i = 1,...,n. Let Wde([tm, tm+n]) be a set of piecewise constant functions defined on an interval [tm,tm+n] Ì Â, where m is an integer. If a function f belongs to the set Wde([tm, tm+n]) then there is a sequence s = s1...sn Î In, called a code of the function f and denoted c(f), such that sid is the value of the function f in (ti-1,ti], i = m + 1,...,m + n and f(tm) = s1d, where ti = ie,  i = m,...,m+n. The integer code series specifies the kth k = 1,2,...   integral f[k] of a piecewise constant function f Î Wde([tm, tm+n]) in terms of its code c(f) = s1...sn by the formula [5]

f[k](tm+n) = k-1
å
i = 0 
akmi((m+n)is1 + ... + (m+1)isn)ekd+ k
å
i = 1 
bknif[i](tm)ek-i,
where akmi, i = 0,...,k-1 and bkni, i = 1,...,k are rational coefficients.

Let a map rmde take a sequence s Î In to a function f Î Wde([tm,tm+n]), denoted f = rmde(s), such that c(f) = s and whose kth integral f[k] satisfies f[k](tm) = 0, k = 1,2,...   (see Figure 1). A global description of sequences proposed in [3] is based on the integer code series and produces systems of linear equations in integers as the next step towards the definition of the web of relations. In this global description a sequence s = s1...sn is associated with numbers Jk(s),  k = 1,2,...  , called ''structural'' numbers. These numbers have a geometrical interpretation as Jk(s) is the value of the kth integral of a function f = rmde(s) at the point tm+n, i.e., Jk(s) = f[k](tm+n), k = 1,2,...  . By definition J0(s) = 0,  s Î In.

figure1

Figure 1. Graph of a function

f = r011(s),s = c(f) = s1...s4 = +1-1-1+1, d = 1, e = 1,m = 0, n = 4, tm+n = 4.

3  Systems of integer relationships and a new type of hierarchical formations

If for a pair of different sequences s = s1...sn,  s¢ = s1¢...sn¢ Î In there is an integer k ³ 2 such that their first (k-1) structural numbers are equal J1(s) = J1(s¢),...,Jk-1(s) = Jk-1(s¢), whereas the kth structural numbers are not Jk(s) ¹ Jk(s¢) then C(s,s¢,n) = k. It is shown that C(s,s¢,n) = k £ n [6] leading to a system of (k-1) linear equations in integers (si - si¢),i = 1,...,n

(m + n)0(s1 - s1¢) + ... + (m + 1)0(sn - sn¢) = 0

.                      .                      .                     .

(m + n)k-2(s1 - s1¢) + ... + (m + 1)k-2(sn - sn¢) = 0
(1)
and inequality

(m + n)k-1(s1 - s1¢) + ... + (m + 1)k-1(sn - sn¢) ¹ 0.
(2)
The system (1) can be viewed as a system of integer relationships. The key idea concerning the system (1) and inequality (2) is that they represent a new type of hierarchical formations, i.e., hierarchical formations of integer relationships and two-dimensional geometrical patterns (see Figure 2). Such systems are the main ingredient in the definition of the web of relations.

4  A New meaning for the nature of integer relationship

The radical power of the web of relations comes from a new meaning for the nature of integer relationship. Integer and integer relationship are an integral part of our mental equipment but the scope of humans to fully appreciate integer relationship is quite limited. Integer relationship is but a set of abstract symbols subject to operational rules of arithmetic. Usually, integer relationships appear as solutions to Diophantine equations and questions about their existence and character are of interest. However, integer relationships do not appear to us as rooted in firm reality and have the power to evoke in our minds images of concrete objects which form into each other.

The scope of integer relationship is extended in [7] by developing a new perception of it as a geometric object. It is shown that integer relationships have a dual existence: on one hand they are, of course, of arithmetic character, on the other hand they find an ``incarnation'' as two-dimensional geometric objects which form into each other. In particular, the geometrical interpretation of structural numbers helps to introduce specific patterns, called integer patterns, which are connected with integers and integer relationships. By using integer patterns the system (1) can be unfolded into visible images (see Figure 2).

5  The Web of relations as a final theory and structural complexity

The following results further lead to the definition of the web of relations [7]. The system (1) and inequality (2) are associated with a hierarchical formation of integers and integer relationships WR(s,s¢,n,m,In) of k = C(s,s¢,n) levels. It is shown that the hierarchical set of integer relationships WR(s,s¢,n,m,In) is isomorphic to a hierarchical set of integer patterns WPde(s,s¢,n,m,In), i.e.,

ì
í
î
System  of  Linear  Equations  (1)
        and  Inequality  (2)
ü
ý
þ

ß

ì
í
î
 Hierarchical  Formation
of  Integer  Relationships
ü
ý
þ
     Û
isomorphism
ì
í
î
 Hierarchical  Formation
   of  Integer  Patterns
ü
ý
þ
Figure 2 illustrates this result by displaying a hierarchical formation of integer relationships and a corresponding hierarchical formation of integer patterns together.

All hierarchical formations of integer relationships associated with sequences s, s¢ and their initial segments s(i), s¢(i) of length i = 1,...,n-1 are incorporated into one hierarchical formation

WR(s,s¢,m,In) = n
È
i = 1 
WR(s(i),s¢(i),i,m,Ii)
of

C(s,s¢) =
max
i = 1,...,n 
C(s(i),s¢(i),i)
levels. The isomorphism defines a hierarchical formation of integer patterns

WPde(s,s¢,m,In)
corresponding to WR(s,s¢,m,In). The web of relations is defined to incorporate all these hierarchical formations into one whole

WR(In) =
È
m Î Z 

È
s Î In 

È
s¢ Î In 
WR(s,s¢,m,In),

WR(I) =
lim
n ® ¥ 
WR(In),    WR = WR(Z),
where all possible pairs of sequences are considered and integer alphabet I is the set of all integers Z. Due to the isomorphism elements of the web of relations can be equally seen as integer relationships or integer patterns.

figure2

Figure 2. Two hierarchical formations of integer relationships and integer patterns are displayed together to underline the dual character of elements of the web of relations. The figure actually allows us to see how integer relationships form into each other.

Complexity, in spite of its important role in science and powerful intuitive meanings, does not have generally accepted definitions in rigorous mathematical terms. The web of relations is nothing but an integrated collection of hierarchical formations. This produces more and more complex elements as compositions of elements from the lower level. The nature of the hierarchical formations is very close to what is associated with complexity. In particular, the Oxford Dictionary defines something as ``complex'' if it is ``made of (usually several) closely connected parts'' or intuitively it is understood that in order to have a complex two or more components are needed, which are joined in such a way that it is difficult to separate them.

This motivates to propose the web of relations as an universal scale to measure the complexity of things in terms of the hierarchical formations of integer relationships. These formations are used to introduce a concept of complexity, called structural complexity [7]. Namely, the structural complexity of a sequence s Î In with respect to a sequence s¢ Î In, s ¹ s¢ is defined as the maximum level of elements that the hierarchical formation WR(s,s¢,m,In) produces. This level equals C(s,s¢). The concept C(s,s¢) is useful in describing the hierarchical formations and plays an important role.

The web of relations is based on integers as the single concept and is a set of integer relationships organized in a special way. The structure is a final theory with regard to the existence of the elements and their formations. It is completely specified by integers, which are considered as the most fundamental entities irreducible to something more simpler. Namely, ultimate building blocks of the web of relations are integers. They initiate hierarchical formations, level by level, producing more complex elements of the web of relations, which are nothing but integer relationships.

With the web of relations at hand, clarification is then needed as to how it is used in understanding universal principles of emergent computation. The next section describes the situation.

6  Methods to probe universal principles of emergent computation

The web of relations as a final theory is used to probe universal principles of emergent computation. In particular, a principle is formulated to discuss a variety of issues connected with the mathematical specification of natural systems in terms of hierarchical formations of the web of relations [8]. The principle cannot be derived from the web of relations only and other means are needed for its justification. However, it is possible to discern outlines of the principle as the hierarchical formations have the preferred bottom-up direction of growth. This suggests that hierarchical formations achieving the maximum level among possible ones stand out as special. This observation accords well with the recognition that the unfolding of ever-greater complexity may be a fundamental property of nature (for example [9]). This recognition is realized in defining the above principle in a concrete mathematical form by using the concept of structural complexity.

Principle.  A natural system specified in terms of the web of relations comprises hierarchical formations that have maximum possible structural complexity.

The principle figuratively says that ``nature realizes structures with maximum complexity''. Since the days of Darwin, evolution has been associated with an increase in complexity. What constitutes the principal difference is that in our case the definition of complexity is formally rigorous and, crucially, is backed up by a final mathematical structure.

7  A Model of computation relevant to the universal principles

The principle is studied within a model that captures emergent computation and provides abstractions for the development and analysis of algorithms. The model is based on the El Farol problem suggested by W.B. Arthur [10].

The problem can be described as follows. Assume there is a group of N ³ 2 people, who enjoy going to El Farol, a small bar, to listen to the music on Thursday evenings. However, none of them wants to visit the bar if it is going to be crowded. As a result each person seeks to understand and use in a proper way the situation in which he or she participates. In particular, on Thursday each person, before going to the bar, employs an algorithm to predict if it will be crowded at the bar in the evening. It is assumed that each person only has the information of how many people were at the bar over all past weeks. Each person decides independently to go to the bar only if his or her prediction is that it will be not crowded. These decisions create a result, the emergent number of people at the bar on the Thursday evening, as some turn up at the bar, while others expect it will be crowded and do not go there. As the new attendance number is become known on the next day, expectations of some of them are validated by the result, others are not. Then each person updates the accuracy of his or her algorithm for the next Thursday.

Each person is characterized by an algorithm which outputs only values crowded and not crowded. Symbols -1 and +1 can be used respectively for denotation of these values. Namely, if the algorithm predicts that it will be crowded at the bar then the value is -1. This means that the person does not go to the bar. If it predicts that it will not be crowded at the bar then the value is +1. This means that the person goes to the bar. Accordingly, the result, due to the attendance number, can be encoded by symbols -1 and +1. If an attendance number is interpreted by the person that it was crowded at the bar then the result is -1 and +1 otherwise.

An algorithm A for a period of n weeks is a set { ck,k = 1,...,n } of n rules. Each rule ck, k = 1,...,n, in general, uses all information given by the past k-1 weeks and takes sk Î {-1,+1} as the prediction of a result sk¢ Î {-1,+1} at the kth week. This information can be encoded as sequences

s(k-1) = s1...sk-1,    s¢(k-1) = s1¢...sk-1¢,
where si, si¢ Î {-1,+1}, i = 1,...,k-1 are a prediction and a result at the ith week respectively when k = 2,...,n and s(0) = Æ, s¢(0) = Æ, where Æ is the empty sequence when k = 1. Hence, for a rule ck, k = 1,...,n of an algorithm A we have

sk = ck(s(k-1), s¢(k-1)).
(3)

The behaviour of a group of N people during n weeks can be described as a set of N binary sequences of length n

S = { _
s
 

i 
= si1...sin, i = 1,...,N },
where a sequence `si = si1...sin Î S specifies the behaviour of the ith i = 1,...,N person and sij = +1 if the person goes to the bar on the jth j = 1,...,n week and sij = -1 otherwise.

The model is a version of the El Farol problem specified by the following condition. The group of N people must interact in such a way that the emergent number of people each Thursday turning up at the bar must be equal to the number of seats N¢ in it. In the version for each person of the group ''crowded'' means that there are no spare seats at the bar. As the number of seats changes the group must adapt accordingly self-organizing itself to the new condition. This emergent computation of the number of seats at the bar is a useful information processing ability of the group. It is easy to appreciate how lucky the owner of the bar could be to have this group of people as customers. Each time for whatever reasons he or she changes the number of seats, the bar without any restructuring cost has exactly as many people as it can accommodate.

8  The Principle in the El Farol problem and coherent solutions

Let P(n,N,N¢) be a set of solutions to the version of the El Farol problem when for n weeks a group of N people visits the bar with N¢ seats. If a collection of binary sequences

S = { _
s
 

i 
= si1...sin Î Bn, i = 1,...,N }.
is a solution to the problem S Î P(n,N,N¢) then for each jth week j = 1,...,n

N
å
i = 1 
(1/2)(sij + 1) = N¢,
where a sequence `si = si1...sin Î Bn describes the visitation of the ith person i = 1,...,N of the group. A solution S Î P(n,N,N¢) can be represented as a n ×N matrix

M(S) = { sij },    i = 1,...,N,    j = 1,...,n.
A solution S Î P(n,N,N¢) in the web of relations WR is associated with a collection of hierarchical formations

WR(S) = { WR( _
s
 

i 
, _
s
 

j 
, 0, Bn),   _
s
 

i 
, _
s
 

j 
Î S, i, j = 1,...,N, i ¹ j } Ì WR,
where a hierarchical formation WR(`si, `sj, 0, Bn) of the collection WR(S) is connected with the ith and jth person i, j = 1,...,N,  i ¹ j of the group. There are two related descriptions of the group in space-time and in the web of relations

ì
í
î
Space-time
Matrix  M(S)
ü
ý
þ
Þ ì
í
î
            Web  of  Relations
Collection  of  Hierarchical  Formations  WR(S)
ü
ý
þ

For a solution S Î P(n,N,N¢) by À(S,k) denote the number of all pairs (`s, `s¢) with sequences `s, `s¢ Î S corresponding to two different people of the group and having at least structural complexity k. Let

min
(S) =
min
`s, `s¢ Î S 
C( _
s
 
, _
s
 
¢),     max
(S) =
max
`s, `s¢ Î S 
C( _
s
 
, _
s
 
¢).
The model translates the principle into a notion of coherent solution.

Definition (Coherent Solution).  A solution S0 Î P(n,N, N¢) is called a coherent solution if for any solution S Î P(n,N, N¢) other than a coherent one

1. min(S0) = max(S0) then

À(S,i) < À(S0,i),    i = min
(S0)

2. min(S0) < max(S0) then there exists an integer 0 £ l £ max(S0)-min(S0)-1 such that

À(S,i) = À(S0,i),    i = min
(S0),..., min
(S0)+l
and

À(S,i) < À(S0,i),    i = min
(S0)+l+1,..., max
(S0).

We illustrate the definition by simple examples. Consider the version of the El Farol problem when the bar is open for n = 5 weeks, a group consists of N = 5 people and the number of seats at the bar is N¢ = 5,4,3. Coherent solutions are treated up to the group of permutations of the people.

When N¢ = 5 there are enough seats for all people of the group. A person of the group can visit the bar each week and enjoy the music without taking care about visitations of the other people. There is only one solution S to the problem, which is also the coherent one S = S0

_
s
 

1 
= +1+1+1+1+1,     _
s
 

2 
= +1+1+1+1+1,     _
s
 

3 
= +1+1+1+1+1,

_
s
 

4 
= +1+1+1+1+1,     _
s
 

5 
= +1+1+1+1+1,
or

M(S0) = æ
ç
ç
ç
ç
ç
ç
ç
è
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
ö
÷
÷
÷
÷
÷
÷
÷
ø
(4)
Looking at the matrix (4) it may seem that this coherent solution has a perfect order. It may be said that the people visit the bar in lines or that the ``+1''s of the matrix, interpreted as spins, look in one direction. However, this order can be viewed as the simplest one because the structural complexity of the coherent solution S0 is 0. There are no hierarchical formations in the web of relations corresponding to the coherent solution S0.

When N¢ = 4 it can be shown that there is only one coherent solution S0. The solution is

_
s
 

1 
= +1+1+1+1-1,     _
s
 

2 
= +1+1+1-1+1,     _
s
 

3 
= +1+1-1+1+1,

_
s
 

4 
= +1-1+1+1+1,     _
s
 

5 
= -1+1+1+1+1.
or

M(S0) = æ
ç
ç
ç
ç
ç
ç
ç
è
+1
+1
+1
+1
-1
+1
+1
+1
-1
+1
+1
+1
-1
+1
+1
+1
-1
+1
+1
+1
-1
+1
+1
+1
+1
ö
÷
÷
÷
÷
÷
÷
÷
ø
Sequences of the coherent solution S0 have the same first structural numbers. Among all possible solutions P(5,5,4) to the problem the coherent solution is the unique one in which all sequences have the same first structural numbers. In particular, this gives min(S0) = 2 and it can be seen that max(S0) = 2. No segments of sequences of a solution S Î P(5,5,4) have the same first structural numbers all together. This gives À(S,2) < À(S0,2).

Sequences of the coherent solution S0 Î P(5,5,4) produce in the web of relations hierarchical formations that all have structural complexity 2. According to the coherent solution S0 each person visits the bar four times and meets any person of the group at the bar the same number of times. We may say that under the coherent solution people in the group have many equal opportunities.

When N¢ = 3 it can be shown that there are three coherent solutions. We present one of them

_
s
 

1 
= +1+1-1-1+1,     _
s
 

2 
= +1-1+1+1-1,

_
s
 

3 
= +1-1-1+1+1,     _
s
 

4 
= -1+1+1-1+1,

_
s
 

5 
= -1+1+1+1-1,
or

M(S0) = æ
ç
ç
ç
ç
ç
ç
ç
è
+1
+1
-1
-1
+1
+1
-1
+1
+1
-1
+1
-1
-1
+1
+1
-1
+1
+1
-1
+1
-1
+1
+1
+1
-1
ö
÷
÷
÷
÷
÷
÷
÷
ø
A coherent solution S0 may be seen to consist of three subgroups of people (`s1, `s2), (`s3, `s4), `s5. All sequences of a coherent solution have the same first structural numbers. In particular, this means that each person visits the bar the same number of times. However, there are other solutions S Î P(5,5,3) to the problem that have this property, i.e., À(S,2) = À(S0,2). What distinguishes a coherent solution S0 is that sequences in subgroups (`s1, `s2), (`s3, `s4) have the same second structural numbers J2(`s1) = J2(`s2),  J2(`s3) = J2(`s4). It can be shown that min(S0) = 2,  max(S0) = 3 and for a solution S Î P(5,5,3) not a coherent one À(S,3) < À(S0,3), which is what the definition claims.

All pairs of sequences of a coherent solution S0 Î P(5,5,3) (except (`s1, `s2), (`s3, `s4)) produce in the web of relations hierarchical formations of structural complexity 2. Besides pairs (`s1, `s2), (`s3, `s4) give hierarchical formations of structural complexity 3. Comparing the structural complexities of hierarchical formations of the coherent solutions for N¢ = 4 and N¢ = 3 it can be said that the coherent solution in the first case is less complex than a coherent solution in the second one.

According to a coherent solution S0 Î P(5,5,3) each person in the group visits the bar the same number of times but he or she meets different people of the group a different number of times. For example, according to the coherent solution presented the first person meets the second one only once at the bar, while he or she meets the third person two times. It can be shown by computations that in a coherent solution these differences are minimal in comparison with other solutions to the problem. Figuratively, under a coherent solution all people of the group have equal opportunities as far as possible.

9  Approximation to coherent solutions and the problem of binary sequence prediction

In the previous section the principle was expressed in terms of a notion of coherent solution. The notion says that binary sequences of a coherent solution maximize structural complexity. In this section the problem of how the coherent solutions can best be algorithmically approximated is presented.

In general, this approximation problem can be formulated as follows. For each ith person i = 1,...,N find an algorithm Ai such that the group of people as a whole generates solutions to the version of the El Farol problem, which are close to coherent solutions in terms of structural complexity as much as possible. This problem is very difficult. Naturally, its consideration was started with the problem when there were only two people in the group N = 2 [8].

Without loss in generality, in this case the approximation problem is about a person, say the first one, who, by using an algorithm A = { ck,k = 1,...,n }, seeks to produce sequences having maximum structural complexity with sequences generated by the second one. At each kth week k = 1,...,n the first person uses a rule ck and produces the kth symbol of a sequence s before the kth symbol sk¢ of a sequence s¢, generated by the second person, becomes known. It is assumed that a value of sk¢ is totally indeterminate before the production of sk, i.e., symbols -1 and +1 appear with the same probability. After n steps an algorithm A for a sequence s¢ = s1¢...sn¢ Î Bn produces a sequence s = s1...sn Î Bn, denoted by s = A(s¢).

The problem just described is the problem of binary sequence prediction with structural complexity as the criterion. Usually the problem of binary sequence prediction uses a criterion based on the measure, which counts the number of the components sequences s, s¢ Î Bn have in common. When the problem of binary sequence prediction is considered in terms of structural complexity, an algorithm A is described by the total structural complexity


å
s¢ Î Bn, s = A(s¢) 
C(s,s¢),
(5)
associated with the algorithm. The interest is to find an algorithm Ac that maximizes the total structural complexity


å
s¢ Î Bn, s = Ac(s¢) 
C(s, s¢) =
max
A Î A 

å
s¢ Î Bn, s = A(s¢) 
C(s, s¢),
(6)
where A is the set of all algorithms of type (3).

An optimal algorithm Ac for the model and N = 2 produces in terms of structural complexity the closest possible approach to coherent solutions and can be seen as the best approximation to the principle given by algorithms of A. The idea concerning the optimal algorithm is to find its description which can be associated with something that is observable in nature. This can serve as an experimental verification of the principle. In this respect special attention is paid in order to associate the optimal algorithm with self-organization.

However, it is not clear how to find the optimal algorithm (6) immediately. It is proposed then to understand main features of the algorithm by using an approximation to it. In this approximation for an algorithm A and sequences s, s¢ Î Bn such that s = A(s¢) the idea is to consider not the sequences s, s¢ themselves but only their symbols associated with incorrect steps of the algorithm A. These symbols are then viewed not as separate symbols but as whole sequences. They are called mistake sequences and denoted by s / s¢,  s¢/ s. For example, let

s = +1+1-1+1+1-1,    s¢ = +1-1+1+1-1+1
then

s / s¢ = +1-1+1-1,    s¢/ s = -1+1-1+1.
By using the approximation the structural complexity of sequences s, s¢ is replaced by the structural complexity of sequences s / s¢,s¢/ s as it is suggested that

C(s, s¢) £ C(s / s¢,s¢/ s).
An algorithm A then can be characterized by


å
s¢ Î Bn, s = A(s¢) 
C(s / s¢,s¢/ s),
instead of (5). An algorithm, denoted by Ac*, which maximizes


å
s¢ Î Bn, s = Ac*(s¢) 
C(s / s¢,s¢/ s) =
max
A Î A 

å
s¢ Î Bn, s = A(s¢) 
C(s / s¢,s¢/ s),
(7)
is defined and sought as an approximation to the optimal algorithm (6).

10  Constructing optimal algorithm as experimental verifications of the principle

When many options are available a theory is needed to help us understand what might happen and why. Verifications of the theory are expected to come from real observations. This situation is the case with the principle as it needs experimental verifications. As the principle is approximated by an algorithm Ac*, its verifications can be provided by the algorithm being a mathematical description of a phenomenon that is observable in nature.

In this section two descriptions of the algorithm Ac* are given. The first description, called structural complexity, shows that rules of the algorithm Ac* are based on structural complexity. It is important that the second description, called chaotic, is connected with period doubling, an observable phenomenon of self-organization in nature. This result gives an experimental verification of the principle.

1. Structural complexity description of the algorithm

According to this description the kth rule ck*,  k = 1,...,n of the algorithm Ac* is based on the structural complexity

l = C(s(k) / s¢(k),s¢(k) / s(k),|s(k) / s(k)¢|)
of a mistake sequence s(k) / s¢(k) respect to a mistake sequence s¢(k) / s(k) and the sign of

Jl(s(k) / s¢(k)) - Jl(s¢(k) / s(k)),
where s(k) = Ac*(s¢(k)). In this sense it may be said that the algorithm learns from mistakes what decisions to make. In particular, at the first step mistake sequences are empty and the algorithm Ac* starts by definition s1 = c1*(Æ) = + 1. At the (k+1)th step k = 1,...,n-1 two situations are possible:

1. s(k) / s¢(k) = s(k) / s¢(k) = Æ, i.e., mistake sequences are empty,

2. s(k) / s¢(k) ¹ Æ,  s¢(k) / s(k) ¹ Æ, i.e., mistake sequences are not empty.

In the first situation condition s(k) / s¢(k) = Æ means that no incorrect steps have been made up to the kth step. Under this circumstance the (k+1)th rule ck+1* continues to set sk+1 = ck+1*(Æ) = +1.

The second situation is a place where structural complexity is used. Since s(k) / s¢(k) ¹ Æ then s(k) / s¢(k) ¹ s¢(k) / s(k). This means that there exists an integer

1 £ l = C(s(k) / s¢(k),s¢(k) / s(k), |s(k) / s(k)¢|)
such that

Ji(s(k) / s¢(k)) = Ji(s¢(k) / s(k)),    i = 0,...,l-1
but

Jl(s(k) / s¢(k)) ¹ Jl(s¢(k) / s(k)).
Then two cases are possible:

1. Jl(s(k) / s¢(k)) - Jl(s¢(k) / s(k)) > 0,

2. Jl(s(k) / s¢(k)) - Jl(s¢(k) / s(k)) < 0.

The rule ck+1* sets the value of sk+1 to minimize the difference between the lth structural numbers of mistake sequences s(k+1) / s¢(k+1),  s¢(k+1) / s(k+1). Namely, the rule sets sk+1 = -1, if

Jl(s(k) / s¢(k)) - Jl(s¢(k) / s(k)) > 0,
and sets sk+1 = +1, if

Jl(s(k) / s¢(k)) - Jl(s¢(k) / s(k)) < 0.
The algorithm Ac* has a remarkable property distinguishable by the construction of mistake sequences. The following observation is helpful to explain the property. Namely, at each kth step k = 1,...,n a rule ck of an algorithm A cannot always correctly determine the next symbol sk¢ of a sequence s¢. Nevertheless, a rule ck can always correctly determine the next symbol of a mistake sequences s(k) / s¢(k) if the symbol appears at the kth step. Indeed, if a rule ck at the kth step sets sk = +1, but si¢ = -1, then the next symbol of a mistake sequence s(k) / s¢(k) is +1. The mistake sequence s(k) / s¢(k) does not change if sk = sk¢ = +1. If the rule ck sets sk = -1, but si¢ = +1, then the next symbol of the mistake sequence s(k) / s¢(k) is -1. The mistake sequence s(k) / s¢(k) does not change if sk = sk¢ = -1. Therefore, each algorithm A has its own logic for the construction of mistake sequences.

The property of the algorithm Ac* is that it constructs the mistake sequence s / s¢ in such a way that its kth symbol k = 1,...,n is the kth bit of the celebrated Prouhet-Thue-Morse (PTM) sequence

h = +1-1-1+1-1+1+1-1-1+1+1-1+1-1-1+1  ...
starting with +1. The following theorem demonstrates this property of the algorithm [5].

Theorem 1 (Mistakes according to the PTM sequence).  Let s = s1...sn, s¢ = s1¢...sn¢ Î Bn be such that s = Ac*(s¢). Then the algorithm Ac* constructs the mistake sequence s(k) / s(k)¢ according to the PTM sequence

s(k) / s(k)¢ = h(l),    |s(k) / s(k)¢| = l,    1 £ l £ k,    k = 1,...,n.

2. Chaotic description of the algorithm

Theorem 1 provides another description of the algorithm Ac* and makes it possible to realize it by using a natural process. In particular, suppose there exists a generator G(h) of the PTM sequence consequently producing one symbol of the sequence per iteration. The generator can be built on the basis of a dynamical system that has period doubling. With the help of a generator G(h) the chaotic description of the algorithm can be represented by the following steps.

Step 1. Set the number of steps k = 1 and number of incorrect steps l = 1.

Step 2. Generate the lth bit hl of the PTM sequence h by the generator G(h) and for the value of the rule ck* set ck* = hl. Set sk = ck*.

Step 3. If sk = sk¢, then go to Step 4. Otherwise go to Step 5.

Step 4. If k = n, then go to Step 6. Otherwise set ck+1* = ck* and sk+1 = ck+1*. Increment the number of steps k = k+1 and go to Step 3.

Step 5. If k = n, then go to Step 6. Otherwise increment the number of steps k = k + 1 and the number of incorrect steps l = l + 1 and go to Step 2.

Step 6. Stop.

The chaotic description of the algorithm Ac* gives an experimental verification of the principle. For example, when the input sequence s¢ is the PTM sequence starting with -1 then the algorithm Ac* is nothing but demonstrates the symbolic description of period doubling, an observable phenomenon of self-organization in nature.

The algorithm Ac* can be formulated as a simple strategy. This strategy of ''win - stay, lose - consult generator G(h)'' considers a generator G(h) as a ``source of universal wisdom'', which can provide the answer to every question. It is proved that the algorithm Ac* minimizes (7) [11] and in this sense Ac* is optimal.

Theorem 2. The algorithm Ac* is the unique algorithm minimizing (7) among all algorithms of A starting with +1.

References

[1]
J. Holland, (1998), Emergence: From Chaos to Order, Helix Books.

[2]
D. Bohm, (1980), Wholeness and the Implicate Order, Routledge and Kegan Paul, London.

[3]
R. Penrose, (1995) Shadows of the Mind, Vintage.

[4]
S. Weinberg, (1993), Dreams of a Final Theory, Vintage, London.

[5]
V. Korotkich, (1993) Integer Code Series: Applications in Dynamic Systems and Complexity, The Computing Center of Russian Academy of Sciences, Moscow.

[6]
V. Korotkich, (1999), ``On a Structure of Systems of Linear Equations in a Global Description of Sequences'', in From Local to Global Optimization, edited by K. Holmqvist, S. Migdalas, P. Pardalos and Varband, Kluwer Academic Publishers, Dordrecht.

[7]
V. Korotkich, (1997), ``Symmetry in Structural Complexity and a Model of Formation'', Journal Complexity International, vol. 3.

[8]
V. Korotkich, (1998), ``A Mathematical Framework for Human Decision Making as an Integrated Part of a Whole'', in Studies in Fuzziness and Soft Computing, Physica-Verlag, New York, vol. 17, pp. 36-62.

[9]
F. Heylighen and D. Aerts, (1996), The Evolution of Complexity, Kluwer Academic Publishers, Dordrecht.

[10]
W.B. Arthur, (1994), ``Inductive Reasoning and Bounded Rationality (The El Farol Problem)'', American Economic Review, vol. 84, pp. 406-411.

[11]
V. Korotkich, (1995), ``Multicriteria Analysis in Problem Solving and Structural Complexity'', in Advances in Multicriteria Analysis, edited by P. Pardalos, Y. Siskos and C. Zopounidis, Kluwer Academic Publishers, Dordrecht, pp. 81-90.