~{%N~}ー~{%H~}
~{%9%i%$%I~} ~{%7%g~}ー
~}/~{7G1mJ>~}" id="nb_otl" style='position:absolute;top:0%;left:0%;width:100%;height:100%;cursor:default'>
~{%"%&%H%i%$%s~}
1
Mathematics based on Learning
  • Susumu Hayashi, Nov. 23, 2002
  • ALT/DS ~{!/~}02, L~{(9~}beck, Germany
2
Collaborators
  • The results presented in the talk are join work with the following  calibrators:        Y. Akama, S. Berardi, H. Ishihara, U.Kohlenbach, M. Nakata, T. Yamazaki, M. Yasugi
3
Apology
  • The paper in the proceedings contain considerable number of typos. Correction would be on my LCM web page at http://www.shayashi.jp very soon


4
Mathematics based on Learning
  • Computational Learning Theory is  mathematical investigations of learning.
  • In this talk, I will present

    LCM: Limit-Computable Mathematics

    a mathematics based on computational learning theory, more exactly, based on inductive inference.
5
An example of LCM-theorem
  • MNP (Minimal Number Principle):
    Let f be a function form Nat to Nat. Then, there is xÎNat such that
    f(x) is the smallest value among 
    f(0), f(1), f(2),~{!-~}
  • $xÎNat."yÎNat.f(x)£f(y)
  • Nat : the set of natural numbers
6
An example of LCM semantics of logical formulas
  • Let x,yÎNat and f:Nat®Nat.
  • LCM-semantics of $y.A(x,f,y) is
    • There is a machine M taking x and the first n data of f: (0,f(0)), (1,f(1)),~{!-~},
      (n-1,f(n-1)) and outputs its guess yn
    • The guesses y0, y1,~{!-~} converges to a value y¥ in Gold~{!/~}s sense
    • A(x,f,y¥) holds in LCM-semantics
7
An inductive inference machine for MNP
  • Regard the function f  as an input stream
     (0,f(0)), (1,f(1)), (2,f(2)),~{!-~} for M
  • M is the inductive inference machine taking the first n elements of the stream and outputs the minimal xn such that "y<n.f(xn)£f(y).
  • The outputs (guesses) of the machine converges to the right answer of MNP.
8
LCM semantics of logical formulas
  • LCM semantics of logical formulas are similarly defined for other cases by means of inductive inference machines or in another word limit-computation. Thus, it is called Limit-Computable Math.
  • Details of the semantics are in the proceeding paper.
9
Difference from inductive inference (1)
  • The LCM-semantics of $y.A(x,f,y) is different from the standard definition of inductive inference (from a paper by Case and Suraj)
  • Inductive inference
    • Functions representable finitely, e.g., computable functions, are infered from their finite segments.
10
Difference from inductive inference (2)
  • LCM
    • Functions such as f in $y.A(x,f,y) may be arbitrary functions without any finite representations
    • The objects to be infered may not be the functions (or languages) to be fed to the inductive inference machines


11
Learning theory?
  • LCM-semantics is much generalized from inductive inference
  • It is about limiting-recursion (D02-stuffs in recursion theory), but not all limiting-recursion stuffs are related to learning
  • Is it really useful to think the subject in the relation with learning theory?
  • My answer is YES.
12
The reasons why I say YES. (1)
  • Inductive inference is formulated as the problem of Generalization, but non-generalization problem such as Classification (Wiehagen and Smith) is also considered in COLT
13
A thought on non-computability of entities
  • Studies of classification still assume finite representability of entities, since the data are stored in a DB
  • But, Web is now a kind of DB. It might be reasonable to think an entity as a stream of data obtained from Web
  • Isn~{!/~}t it better to modelize a registered customer at an e-shop as a non-computational stream of interactions



14
The reasons why I say YES. (2)
  • Similarities between the two areas
    • The learning for MNP is ~{!0~}Learning by Erasing~{!1~} due to Jain and others.
    • Resemblance between ~{!0~}learning of algebraic concepts~{!1~} by Stephan et al. and LCM algebra.
    • A relationship of LCM to Reverse Mathematics has been found, and it can be explained a ~{!0~}game~{!1~} with a flavor of learning
15
The reasons why I say YES. (3)
  • LCM framework is very general to the existing frameworks in Learning theory
  • However, it does not seem too general, as ~{!0~}thinking in learning theory~{!1~} has been a great help for developments of LCM
  • LCM might provide some ideas to Learning Theory in future
16
Where did LCM come from?
What is LCM for?
  • The original motivation was Proof Animation technology, which aims to debug (partially) formalized proofs with proof checkers.
  • By chance, thank to A. Yamamoto, I could related it to learning theory.
  • LCM is primarily for Proof Animation.


17
Proof Animation
  • Formal methods are becoming realistic but still costs of developments of formal proofs are too huge for almost all commercial applications
  • A difficulty is ~{!0~}wrong directed proof plan~{!1~}.
  • Proof developments sometimes go into wrong directions due to ill-formalized lemmas (subgoals) and definitions.
  • An example~{!-~}..
18
An example of Proof Animation
19

Proof of the theorem
    • The theorem holds for group A and B, since they have only n marbles. All the marbles are of the same color, since they share an.
20
Necessity of some means to find bugs in proofs (1)
  • This kind of situations happen in developments of formal proofs.
  • Sometimes, we cannot even the theorem is wrong, when we are not very familiar with concepts we defined for a particular software and hardware systems. (Your formal definitions may not represent your informal concepts!)
21
Necessity of some means to find bugs in proofs (2)
  • We can see goals and definitions are wrong when we find that we cannot prove the goals.
  • However, realizing unprovability of goals after many fruitless trials is awfully time consuming and exhausting.
  • Is~{!/~}nt there any better way to find errors?
22
Yes there is.
Let~{!/~}s animate proofs!
23
Automatic generation of programs for proof animation
  • If a proof is intuitionistic, namely does not use the principle of excluded middle A or not A, a program which execute the proof is automatically obtained. Actually, the example of marbles is intuitionistic and, a skeleton of the applet could be automatically generated.
  • However, not for classical proofs using A or not A~{!-~}.
24
Inductive inference and Excluded middle
  • Restricted LEM (Law of Excluded Middle) are valid under LCM-semantics
  • Then inductive inference machine could be used as the marble applet
  • We cannot compute the final value, but we can debug LCM-proofs by inductive inference
  • They do not cover all classical mathematics, but are enough for very many theorems.


25
What kind of proofs animated by inductive inference?
  • The law of excluded middle and related principle of double negation elimination can be restricted to S0n-and P0n-formulas. They are called semi-classical principles
  • Weak semi-classical principles correspond to inductive inference
26
Semi-classical principles (LEM)
  • S0n-LEM~{!!~}(Law of Excluded Middle):
           A or not A~{!!~}
    for S0n-formula A.
  • Similarly for P0n-LEM
    • P01-LEM: "x.A or not "x.A,
  • D0n-LEM
    ~{!!~} (A ↔ B) ~{!z~} A or not  A
    for S0n-formula~{!!~}A and P0n-formula~{!!~}B
27
Semi-classical principles (DNE)
  • S0n-DNE~{!!~}(Double Negation Elimination):
       (not not A) ~{!z~} A~{!!~}
    for S0n-formula~{!!~}A.
    • S02-DNE:  not not $x."y.A~{!z~} $x."y.A
  • P0n-DNE is defined similarly


28
Hierarchy of semi-classical principles
29
LCM semi-classical principles and inductive inference
  • Principles correspond to inductive inference:
    • P01-LEM, S01-LEM, D02-LEM,
      S02-DNE
  • The others are strictly weaker or stronger than inductive inference
  • Thus these are proper LCM-principles
30
Hierarchy of semi-classical principles (2)
  • The converse of arrows in the hierarchy of semi-classical principles are conjectured not to be derivable in HA.
  • The conjecture have been solved for n=1, 2 levels, which include all principles corresponding to inductive inference and below.
  • It is still open for the higher levels.
31
What are provable by proper LCM-principles?
  • We are now classifying mathematical theorems by means of these principles.
  • There are theorems exceeding the scope of LCM, like Bolzano-Weierstrass theorem, but many theorems of standard mathematics are in the scope of LCM
  • E.g. Heine-Borel theorem, Completeness theorem of predicate logic, etc.etc~{!-~}.
32
WKL and LCM (1)
  • S. Simpson and his collaborators have shown that WKL is enough to prove many mathematical theorems, although WKL is very weak from consistency point of view.
  • Weak König Lemma (WKL): any binary branching tree with infinite nodes has an infinite path
33
4.4 of the paper is now obsolete
  • After sending the manuscript of the paper, stuffs on WKL have been much improved. Even the formulation of LCM-WKL has been changed. Now, it is just as ordinary WKL!
  • And the problem of learning theoretic model of WKL is nearly solved as I will present soon
34
WKL and LCM (2)
  • The low basis theorem: the infinite paths could be of low degree. A degree is low, if its jump is lower than the degree of halting problem.
  • This fact suggests that WKL sits in between of limiting-computation and computation
  • WKL actually sits in the expected place in our hierarchy


35
The place of WKL in LCM hierarchy
36
Transfers from
Reverse Mathematics to LCM
  • Theorems provable constructively from WKL are animated by inductive inference
  • Many theorems in Reverse Mathematics are proved without or with very weak classical reasoning, and we can transfer their proofs nearly automatically into LCM-proofs.
  • An example:  Completeness theorem of the first order predicate logic.
37
A good relationship between learning and LCM
  • A semantics of intuitionistic WKL by Lifschitz and van Oosten gives a pradigm of ~{!0~}learning by refutation~{!1~}
  • This semantics is based on multi-valued computation and developed for a very different reason in 70-90~{!/~}s.



38
Multi-valued computation
a la Lifschitz-van Oosten (1)
  • Let p1,~{!-~},pn be programs and let C1,~{!-~},Cn be corresponding P01-conditions (formulas) guaranteeing  ~{!0~}correctness~{!1~} of the programs
  • A finite set {(p1,C1),~{!-~},(pn,Cn)} represents the following set of values:
  •   {vi | vi is the value of pi and Ci holds}
39
A learning theoretic interpretation: Popperian competition
  • Assume that p1,~{!-~},pn are competing scientists computing a scientific constant based on their own theories represented by C1,~{!-~},Cn
  • Since theories C1,~{!-~},Cn are P01, they are refutable. If one guy is wrong, he will drop from the race eventually
  • If at most one remains, then we can effectively know who is correct including none.
40
Multi-valued computation
a la Lifschitz-van Oosten (2)
  • A system of this representation of  multi-values gives an intrinsic semantics of WKL
  • The proofs by WKL are easy to animate by this system of computation
  • An interpretation with a flavor of learning is possible
41
Learning? Yes!
  • Since it is strictly below the proper LCM principles, it is not real inductive inference
  • However, the metaphor to Popper style competition by refutation helps the understanding of proof of ~{!0~}the low basis theorem~{!1~} and proof animation by the computation system
42
Conclusion:
Learning theory and LCM
  • It must be fruitful to investigate relationship of  COLT and LCM-like mathematics based on ~{!0~}learning~{!1~}.
  • There are many interesting open problems. An example:
    • Is there mathematics corresponding to other paradigm of learning theory besides inductive inference? How about PAC-learning?
43
Thanks!
  • Hearty thanks to
  •      A. Yamamoto, H. Arimura, J. Case, E. Martin, F. Stephan,
  • I learned about COLT much through discussions with them
  • Yamamoto helped me to start the entire field