~{%N~}ー~{%H~}
~{%9%i%$%I~} ~{%7%g~}ー
~}/~{7G1mJ>~}" id="nb_otl" style='position:absolute;top:0%;left:0%;width:100%;height:100%;cursor:default'>
~{%"%&%H%i%$%s~}
1
Mathematics based on Learning
  • Susumu Hayashi
  • Dec, 18, 2002
  • SIGFAI, Fukuoka
2
Mathematics based on Learning
  • Computational Learning Theory is  mathematical investigations of learning.
  • In this talk, I will present

    LCM: Limit-Computable Mathematics

    a mathematics based on computational learning theory, more exactly, based on inductive inference.
3
An example of LCM-theorem
  • MNP (Minimal Number Principle):
    Let f be a function form Nat to Nat. Then, there is xÎNat such that
    f(x) is the smallest value among 
    f(0), f(1), f(2),~{!-~}
  • $xÎNat."yÎNat.f(x)£f(y)
  • Nat : the set of natural numbers
4
An example of LCM semantics of logical formulas
  • Let x,yÎNat and f:Nat®Nat.
  • LCM-semantics of $y.A(x,f,y) is
    • There is a machine M taking x and the first n data of f: (0,f(0)), (1,f(1)),~{!-~},
      (n-1,f(n-1)) and outputs its guess yn
    • The guesses y0, y1,~{!-~} converges to a value y¥ in Gold~{!/~}s sense
    • A(x,f,y¥) holds in LCM-semantics
5
An inductive inference machine for MNP
  • Regard the function f  as an input stream
     (0,f(0)), (1,f(1)), (2,f(2)),~{!-~} for M
  • M is the inductive inference machine taking the first n elements of the stream and outputs the minimal xn such that "y<n.f(xn)£f(y).
  • The outputs (guesses) of the machine converges to the right answer of MNP.
6
LCM semantics of logical formulas
  • LCM semantics of logical formulas are similarly defined for other cases by means of similar inductive inference machines.
  • Thus, it is called Limit-Computable Math.
  • Details of the semantics are in ALT/DS 02 paper and other papers at PA/LCM home page.
    •        http://www.shayashi.jp/PALCM/
7
Logic of Discovery (1)
  • Carl Smith pointed out LCM semantics is similar to ~{!0~}Logic of Discovery~{!1~} (J. Bardins, R. Freivalds and C. Smith).
  • Logic by Smith et al. is based on Tarski~{!/~}s semantics and learning is involved only at the outmost level of truth definition.
  • Ours are based on Kleene~{!/~}s semantics and learning process is involved at the every level of truth definition.
8
Logic of Discovery (2)
  • Ours are based on Kleene~{!/~}s semantics and learning process is involved at the every level of truth definition.
  • However, up to S02-level (the level classical logic holds in LCM), they are pretty much similar.
  • The notion of ~{!0~}expectation level~{!1~} might be helpful to bridge LCM and computable analysis and numerical analysis.


9
Difference from inductive inference (1)
  • The LCM-semantics of $y.A(x,f,y) is different from the standard definition of inductive inference
  • Inductive inference
    • Functions to be learned are finitely representable
    • To be learned is the whole picture (representation etc.) of the function
10
Difference from inductive inference (2)
  • LCM
    • Functions such as f in $y.A(x,f,y) may be arbitrary functions without any finite representations
    • The object to be learned (infered) is any attribute related to the function f


11
Learning?
  • According to Carl Smith~{!/~}s definition, our process is not learning but discovery.
  • Mathematics based on Discovery??
  • Probably, it is better to call such a task ~{!0~}learning attributes~{!1~}.
12
Anyway, no matter how it is called,
LCM is learning theoretic
  • LCM have been related to Learning Theory in some non-trivial ways.
    • Resemblance between ~{!0~}learning of algebraic concepts~{!1~} by Stephan et al. and LCM algebra.
    • The learning for MNP is ~{!0~}Learning by Erasing~{!1~} due to Jain and others.
    • A relationship of LCM to Reverse Mathematics can be explained by a learning theoretic ~{!0~}Popperian race~{!1~}
13
Where did LCM come from?
What is LCM for?
  • The original motivation was Proof Animation technology, which aims to debug (partially) formalized proofs with proof checkers.
  • By chance, thank to A. Yamamoto, I could related it to learning theory.
  • LCM is primarily for Proof Animation.


14
Proof Animation
  • Formal methods are becoming realistic but still costs of developments of formal proofs are too huge for almost all commercial applications
  • A difficulty is ~{!0~}wrong directed proof plan~{!1~}.
  • Proof developments sometimes go into wrong directions due to ill-formalized lemmas (subgoals) and definitions.
  • An example~{!-~}..
15
An example of Proof Animation
16

Proof of the theorem
    • The theorem holds for group A and B, since they have only n marbles. All the marbles are of the same color, since they share an.
17
Necessity of some means to find bugs in proofs
  • This kind of situations happen in developments of formal proofs.
  • We can see goals and definitions are wrong when we find that we cannot prove the goals.
  • However, realizing unprovability of goals after many fruitless trials is awfully time consuming and exhausting.
  • Isn~{!/~}t there any easier way to find errors?


18
Yes there is.
Let~{!/~}s animate proofs!
19
Automatic generation of programs for proof animation
  • If a proof is intuitionistic, namely does not use the principle of excluded middle A or not A, a program which execute the proof is automatically obtained. Actually, the example of marbles is intuitionistic and, a skeleton of the applet could be automatically generated.
  • However, not for classical proofs using A or not A~{!-~}.
20
Inductive inference and Excluded middle
  • Restricted LEM (Law of Excluded Middle) are valid under LCM-semantics
  • Then inductive inference machine could be used as the marble applet
  • We cannot compute the final value, but we can debug LCM-proofs by inductive inference
  • They do not cover all classical mathematics, but are enough for very many theorems.


21
What kind of logic hold?
  • Logical of LCM depend on realizability semantics used, e.g., modified realizability and Kleene realizability~{!-~}.
  • Thus they differ from versions to versions.
  • However, they have common characteristics:
  •  semi-classical principles hold
22
Semi-classical principles (LEM)
  • S0n-LEM~{!!~}(Law of Excluded Middle):
           A or not A~{!!~}
    for S0n-formula A.
  • Similarly for P0n-LEM
    •     P01-LEM: "x.A or not "x.A,
  • D0n-LEM
    ~{!!~} (A ↔ B) ~{!z~} A or not  A
    for S0n-formula~{!!~}A and P0n-formula~{!!~}B
23
Semi-classical principles (DNE)
  • S0n-DNE~{!!~}(Double Negation Elimination):
       (not not A) ~{!z~} A~{!!~}
    for S0n-formula~{!!~}A.
    •    S02-DNE:  not not $x."y.A~{!z~} $x."y.A
  • P0n-DNE is defined similarly


24
Semi-classical principles (LLPO)
added to the hierarchy on Sunday 15th
  • S0n-LLPO~{!!~}(Lesser Limited Principles of Omniscience):
       (not (A and B) )~{!z~} not A or not B
    for S0n-formula~{!!~}A, B.
    • S01-LLPO: P, Q are recurive
    •    not ($x.not P and $x.not Q)
                                   ~{!z~}
      "x.P or "x.Q
  • P0n-LLPO is defined similarly
25
Hierarchy of semi-classical principles (1)
26
Hierarchy of semi-classical principles (2)
  • The converse of arrows in the hierarchy of semi-classical principles are conjectured not to be derivable in HA.
  • The conjecture have been solved for n=1, 2 levels, which include all principles corresponding to inductive inference and below.
  • It is still open for the higher levels. Maybe~{!-~}.
  • The exact place of general LLPO is open. However, it is well-known in the lowest level, Sublearning Level.
27
WKL and LCM (1)
  • S. Simpson and his collaborators have shown that WKL is enough to prove many mathematical theorems, although WKL is very weak from consistency point of view.
  • Weak König Lemma (WKL): any binary branching tree with infinite nodes has an infinite path
28
WKL and LCM (2)
  • The low basis theorem: the infinite paths could be of low degree. A degree is low, if its jump is lower than the degree of halting problem.
  • This fact suggests that WKL sits in between of limiting-computation and computation
  • WKL actually sits in the expected place in our hierarchy


29
WKL and LLPO
  • Bishop~{!/~}s LLPO (=our S01-LLPO) :
     not (A and B) ~{!z~} not A or not B
    for A, B: S01-formulas
  • WKL is constructively equivalent to LLPO plus the bounded countable choice for P01-formula (decision making for effectively refutable conditions).




30
The place of WKL in LCM hierarchy
31
Three underivability proofs
  • The underivability of P01-LEM is proved by three different proofs:
    • monotone functional interpretation (Kohlenbach)
    • Standard realizability plus low degree model of WKL0 (Berardi, Hayashi, Yamazaki)
    • Lifschitz realizability (Hayashi)
32
How far can LCM prove?
Answer: pretty much many!
  • Transfers from Reverse Mathematics to LCM
    • Many theorems in Reverse Mathematics are proved without or with very weak classical reasoning, and we can transfer their proofs nearly automatically into LCM-proofs.
    • An example:  Completeness theorem of the first order predicate logic.
  • This is not an coincidence. Reverse math. and LCM are deeply related. (Next slide)
33
A good relationship between learning and LCM through WKL
  • A semantics of intuitionistic WKL by Lifschitz and van Oosten gives a pradigm of ~{!0~}learning by refutation~{!1~}
  • This semantics is based on multi-valued computation initiated for a very different reason in 70~{!/~}s, incidentally by a known AI researcher V. Lifschitz, and studied by van Oosten in 90~{!/~}s.



34
Multi-valued computation
a la Lifschitz-van Oosten (1)
  • Let p1,~{!-~},pn be programs and let C1,~{!-~},Cn be corresponding P01-conditions (formulas) guaranteeing  ~{!0~}correctness~{!1~} of the programs
  • A finite set {(p1,C1),~{!-~},(pn,Cn)} represents the following set of values:
  •   {vi | vi is the value of pi and Ci holds}
35
A learning theoretic interpretation: Popperian race
  • Assume that p1,~{!-~},pn are competing scientists computing a scientific constant based on their own theories represented by C1,~{!-~},Cn
  • Since theories C1,~{!-~},Cn are P01, they are refutable. If one guy is wrong, he will drop from the race eventually
  • If at most one remains, then we can effectively know who is correct including none.
36
Multi-valued computation
a la Lifschitz-van Oosten (2)
  • A system of this representation of  multi-values gives an intrinsic semantics of logic with WKL
  • Existence of entity is interpreted as ~{!0~}an answer is in the multi-valued output, although I do not know which one it is~{!1~}.
  • The proofs by WKL are easy to animate by this system of computation
37
Open problems (1)
  • LCM-concepts corresponding to various learning theory concepts:
    • What~{!/~}s positive data and negative data in LCM? (Question by Eric Martin)
    • Is there logic for PAC-learning?: Probability of PAC-learning seems to corresponds to basis of topological space as expectation level of logic of discovery.
  • Relationship to computable and/or numerical analysis, Kohlenbach~{!/~}s work, Arikawa et al~{!-~}
38
Open problems (2)
  • Relationship to geometry, recursion theory, L-system, fractal, etc. etc~{!-~}.
  • The thesis
      • Many scientific phenomena would be explained  in the frame work of computable generation and refutation, erasing
      • Just as development of organic system by cell generation and apotosis
39
Fractal generation by erasing
40