This book proves stable roots behind reliable software, which includes basic logic concept, prove, coq proving assistant, functional programming, operation semantics and type systems.
All contents in this book is proved by coq, and we intend to solve exercises by using coq.
Building reliable software comes to be beyond man's control, the broadness, complexity and people involving in it contributes to this. This course tries to help improve software's reliability in a mathematic criterion.
Coq enables following terms:
An important tenet of Functional Programming is the computation should be pure,that means to be free from side effects. Therefore, we find the machinery underlying coq becomes clear:
Proof are Programs.
If a procedure or method has no side effects, then (ignoring efficiency) all we need to understand about it is how it maps inputs to outputs.
Some useful features of functional programming are first-class functions, algebraic data types, pattern matching and polymorphic type system.
The core built in coq is extremely small, so you can define all types on your own.
Use Inductive to define a new type like Boolean including only two constructors : true and false. Every expression in coq has a type. Coq shall also be able to define constructors receiving parameters.
The induction types tell us that : rules are defined for computation, so all symbols representing data constructors are free to be selected. Constructors and functions have a little difference : the function has rules to compute while constructors only build data(it is only a method to write down data).
Use Fixpoint to represent recursion. Use Notation to intros symbol for simplification, which can also decide associativity and scope.
There are some basic tactics that can be used for proof.
simpl tries to simplify the terms in prop.
reflexivity unfolds some terms to prove the reflexivity of two sides in a equation.
Sometimes simpl in not required, but simpl is used in situations where we may have to read and understand the new goal that it creates, to prevent us from messing and confusions.
When is comes to implies, things become more interesting. We now have some assumptions for proof, and use rewrite we can tell coq to perform a replacement, from left to right or right to left, depending on situations.
We can also use rewrite to apply proved theorem, coq will try to instantiate the parameters in prop by matching with the goal.
Of course, not everything can be proved by simple calculation and rewriting, some hypothetical values can block simplification. To make a progress, we need to analyze by cases.
The tactic in coq for considering cases is called destruct.
The destruct tactic will generate some subgoals, you need to prove them separately. The as clause is called intros pattern, and what goes between the square brackets is a list of symbolic names, separated by | .
Using bullets between subgoals is essential to clarify the proving process. They make the structure of a proof apparent, making it more readable. Also, bullets make coq ensure the proof goals is completed and prevent from messing it up with following subgoals.
Here comes a simple notation from cases : destruct when intros. Use square brackets can help us apply this tactic. This mode is common and useful.
Pro tip: Coq's notation mechanism is not especially powerful. Don't expect too much from it!
Coq demands some arguments of fixpoint should be decreasing to ensure that every function in coq will definitely terminate. Sometimes this is sophisticated so that we need to write down some functions unnaturally.
We need induction method to solve those types which are generated by recursively defined.
Here the induction principle of natural numbers is enough.
induction will generate the induction hypothesis automatically and analyze cases separately.
Sometimes some proof are too easy to write a lemma apart, so assert commands will generate a lemma waiting to be proved in the context. Also, assert can solve some cases that rewrite doesn't work.
"Informal proofs are algorithms; formal proofs are code."
What constitutes a successful proof of a mathematical claim? A definition is that, proof is a written text to make others believe something is true, in other words, proof is an act of communication.
When readers are machines like Coq, with believe that P can be inferred from a set of mechanically derived logical rules, and the proof is a recipe guides the program in checking this fact. This is the definition of Formal Proofs.
Informal Proofs are shown to humans, so they are efficient ways to communicate ideas between human beings.
The informal proofs will tell readers about state of proof in time, but formal ones will not.
In an inductive type, each constructors can take more than one arguments,
Inductive natprod : Type :=
| pair (n1 n2 : nat).
Sometimes we need to destruct a Nat prod to show Coq the inner structures.
Generalizing the definition of pairs, the list can be defined : a list is either empty or paired with a natural number and a list.
Notations have levels and associativity which can be set. Using .. T .. for nested notations.
Here are some useful functions about lists.
Repeat : receive a n and a count return a list of n with length count .
Length : returns the length of a list.
Append : concatenates two lists.
Head and tail : head returns the first element of a list and returns default value when the list is empty. Tail returns the rest of a list.
Although some simple facts about lists can be proved by simplification and rewrite, we still need to induct lists most time.
The principle of lists is just like numbers, we can replace each cons with S, and nil is similar to zero, since the constructor of a type can only be represented as some applications to another type and element. So the fact gives rise to a way to reason about inductively defined sets.
To show the specific example of this principle, let's see some properties of the reverse function.
We have made so much proofs that can be useful in further proof, find these lemma and remember those names are so difficult then. Search tactic will tell Coq to find all lemmas about something.
The previous function hd
is not so good. Now suppose we need a function called n-th element, we still
need a default value to return, this is confusing sometimes. Changing the return types may solve this problem.
Define a new type called natoption, containing None and Some nat,
to represent errors.
Some tips, the conditionals in Coq is a bit different to others, since the Boolean function is also defined not built-in, Coq support all types containing two constructor exactly to be a role of Boolean, the guard consider true is it matches the first constructor.
An easy partial map can be constructed like lists, empty or a key, a value and another map, the record returns the first value which of key meets the search key.
In this chapter we continue to develop some basic concepts in functional programming: the polymorphism(functions on types) and the high-ordered functions(functions as data).
It is easy to infer that we can define lists for another datatypes, since we can go further to define new types, it well be endless to define lists for them.
Coq supports polymorphism on types, consider a function that takes a type as argument and return another type constructed. List is like that function.
Coq has a mechanic called the type inference, so you can erase some unessential type declarations sometimes. And type arguments can be inferred too.
We can go further to make the type arguments implicit, using curly braces around arguments to show they are implicit. But sometimes we need to give arguments of types forcedly, using @ in front of functions instead.
The generalization of pairs is called Product(Polymorphic Pairs), so the first and second can be different types.
There is also the generalization of options(Polymorphic Options).
Like many other functional programming, Coq treat the functions as first-class citizens.
Some operations on lists can be written down like filter, map and fold.
Use (fun (argument list) => (body)) to define a lambda function.
Sometimes the goal will be the same to some hypotheses or lemmas, then the apply shall work. Generally, the apply tactic will search variables automatically. symmetry will exchange the two sides of an equation. with subclasses for apply with make it more specific.
Constructors of types are all injective and disjoint.
The inversion tactic will do this job, it develops some lemma from these two basic facts. Also, if the hypothesis is false, inversion will solve the subgoal instantly, this is an application of principle of explosion.
In most cases, we will apply tactics on goal, but we can also apply tactics on hypotheses, which fits the form of forward reasoning. generally, Coq will be used of backward reasoning instead.
It is important to control the exact form of induction hypotheses, especially in those lemmas refer to multiple arguments, we need to keep the university of some quantifiers in hypotheses. thus generalize dependent tactic will take effect.
The auto simple of some tactics is often too conservatives, especially for some pattern matching functions, to make the state more clear to both Coq and human, unfold tactic is introduced.
We have seen many destructions on variables, but we can also destruct expressions and add definitiers eqn: to remember these expressions.
We have proved so many propositions in previous chapters, in this chapter, let's talk about the mechanics in Coq to deal with the form of logical reasoning.
Propositions have Type in Coq call Prop, also, props are first class object, that is, we can operate props like other functions in Coq, so it is not strange to write a prop that take arguments, which can be interpreted as Properties or Relations.
Conjunction, or logical and, is often proved by split tactic, and can be destructed in hypotheses.
Disjunction, or logical or, is often proved by left and right tactic, and can be destructed either, just like case analysis.
Not, or negation, is represented as a implication from P to False, aka, ex falso quodlibet. As we often use this lemma, Coq has a tactic called exfalso.
True is simply true and can be proved instantly, but it is too trivial instead.
Logical equivalence is a conjunction of two implications, since it is also equivalence, it can be proved by reflexivity or rewrite.
Extential quantification is also an important connectives, to prove an extential prop, we need use exists to give evidence.
Recursive functions that return a prop is sometimes useful in constructing large systems, but is strictly limited by Coq computations, so use inductive props instead in next chapter.
In Coq, Proof is also first class citizens called proof scripts, so you can apply them like functions, and the type inference system is also the same.
Coq's logical core, the Calculus of Inductive Constructions, differs in some important ways from some other systems.
For example, in ZFC theory, a math object can be members of multiple sets, but in Coq, a term can only have a type. These differences make it necessary to slightly change some descriptions of math concepts. In some cases, however, translate the standard mathematical proof into Coq scripts is nearly impossible, unless we enrich logic with additional axioms.
The first axiom needed is functional extensionality, informally, the identification of a function is decided by its action.
(∀x, f x = g x) → f = g
Use axiom command to write down a axiom that doesn't need to be proved.
We must be careful when adding new axioms into Coq's logic, because it may lead to the inconsistence of logical systems. Establishing the proof of the consistence of any combination of axioms takes hard work.
We know the equivalence of Boolean functions and props, but we still need to prove this. Coq's inner core requires the computational of functions, so propositions can not be applied, as it will lead to some non-computable functions. Using Boolean functions will make some simple automations, so it is essential to apply a technique called Proof by Reflection. The great example is the proof of Four color theorem.
Theorems like excluded middle principle can't be proved in Coq generally, for all props in Coq comes from constructions, except proving by reflection on Boolean functions.
However, Coq's props are much more stronger than classical math in extential props, in other words, any proof of existence is necessarily constructive.
Logics like Coq that not referred to excluded middle principle are named with constructive logics, otherwise classical. Excluded middle principle and contradiction will lead to non-constructive proof, though sometimes the work done for some proof in constructive logics are much more heavy instead, but the principle is compatible with Coq's logic.
The connection in CH6 is mainly about combination logic such as and, or, not.
In this chapter, we shall try to write some props by induction.
We can drawing inference rules to show connections between premises and conclusion.
We often call this proving proceed a tree for its appearance.
Use inductive to define these prop.
Since we know all constructors to build props, and they are the only way to construct. So we need to inductive these evidences to prove by case analysis.
Though destruct and inversion tactics can both be effective in some reducing cases, the destruct tactic usually fails in another case, since destruct simply analyze two case the inductive type given, the inversion tactic will detect reasonable changing directions instead.
At a glance at what we have done by inversion, it seems that we only use the infectivity of constructors to prove equality props, therefore, some inductive props should also work.
Inversion destructs all constructors of assumption P, and throws all contradictive subgoals, and finally add an assumption equation that make the constructor available.
The same things happen that we can truly inducting evidence into small pieces and raise the assumption like data types, here, we need to use induction tactic for evidence.
A proposition parameterized by an argument can be considered as a property of that type, and parameterized by two arguments can be considered as a relation of type, in another word, some subsets of type pairs.
To show the power of inductive definitions, let us use inductive props to build a model of regular expressions.
Inductive reg_exp {T : Type} : Type :=
| EmptySet
| EmptyStr
| Char (t : T)
| App (r1 r2 : reg_exp)
| Union (r1 r2 : reg_exp)
| Star (r : reg_exp).
Induction tactic will approve everything given, thus lose some important evidence for proof, and also induct some paradox. Thus induct evidence is more like destruction but inversion.
Remember tactic will add the equation to replace some complex expressions with simpler variables.
We have seen relation between Boolean computations and statements in Coq, this sometimes may lead to dull proof.
To simplify these proof, we can declare an inductive prop for destructing Boolean expressions, which is called reflect.
Inductive reflect (P : Prop) : bool → Prop :=
| ReflectT : P → reflect P true
| ReflectF : ¬ P → reflect P false.
Using reflect instead of Iff makes it clear for case analysis of P and ~P.
In this chapter we will try to build a data struct call maps (aka. Dictionaries).
There are two kinds of maps, total maps, which return a default value when missing keys, while another is called partial maps, return None instead.
First decide types of keys, we use string this time.
And we need to prove some trivial lemmas for strings.
This time, we will build a partial maps in two steps. Firstly, we need to build a total map by functions rather than key-value pairs, this shall give us a more extensional view.
Definition t_update {A : Type} (m : total_map A) (x : string) (v : A) :=
fun x' ⇒ if beq_string x x' then v else m x'.
The update function provides us with a nice example for high-order programming : takes a function m and return another.
So in this way, partial maps are just total maps with elements of type option A and default element None. We now straightforwardly lift all of the basic lemmas about total maps to partial maps.
"Algorithms are the computational content of proofs." —Robert Harper
We knows that Coq has two mechanisms. For programming and for proving properties,for many times we may take them apart but in truth they are the same.
Provability in Coq is reprensented by concrete evidence, and all evidences are shown as a data structure called evidence tree. Since evidences are all data, properties are all types.
This pun between types and properties, between computation and logic, is Curry-Howard Correspondence.
propositions ~ types
proofs ~ data values
Proof objects are no more than data generated by constructors (type scope), for example, nil is a polymorphism function from type to its empty list ( X -> list X).
When Coq executes a proof scripts, it constructs a proof object step by step, use Show Proof command can give us a scrape. Anyway, Coq knows what's the type the goal is, and follows the constructors the script given.
Tactics are useful but not essential, especially when we can construct all evidences by hand and use Definition to give a global name directly to this evidence.
Constructors and functions can both give evidences in proof.
By constructing a function that takes an evidence and an argument, we can reply an envidence that is our proving goal!
The evidence argument is called dependent types, in fact, the argument it received is also called expressions.
The arrows, aka. Lambda expressions are a short way to write ∀, a grammar sugar in fact.
In another aspect, tactics can also construct program that computing, while functions can prove statements.
By using defined instead of functions to define program, Coq will make this proof transparent in order to make application usage in another statement or usage ordinary.
Proving the functions is useful in dependent types. Moreover, it does illustrate the uniformity and orthogonality of the basic ideas in Coq.
Most of login connectives, except ∀, is defined inductively.
And is a tuple of evicence for both left and right.
Or has two constructors for both sides.
Existential Quatifiler is a package with evidence x and its proof that x matches property P.
True is just a non-argument constructor, while False is a type with no constructor, since there is no way building its proof.
Even Coq's equality relation is not built in. It is a inductive type with reflective constructors.
Inductive eq {X:Type} : X → X → Prop :=
| eq_refl : ∀x, eq x x.
If two evidence has the same normal form, or in other words, are convertible, then they are equal.
Another way to state equality is to prove that ∀ Properties that matches x, also matches y, will prove that x = y, this is called Leibniz Equality.
Finally, we are able to learn how inversion tactic works:
takes a hypothesis H whose type P is inductively defined, and
for each constructor C in P's definition,
Declaring data types will also generate the induction principle for this type.
Induction is just a simple package for this principle of applying it. But in fact, what induction has done is more than just applying it. In general, induction will match cases no matter the variable is introduced, induction will also generate variables' names in all cases.
Though some types are not recursive, Coq will also generate induction principle for them, in these cases, the principle provides with a possible method to prove that a property matches all data of this type.
The polymorphic types bring with polymorphic induction principles, when given a specific type, it will generate a specific induction principle.
A induction hypotheses on P is a statement generalized for all properties indexed by type P.
The hypotheses is a premise for implication later -- we are allowed to use the assumption that P holds for constructor argument n, when proving P holds for the result constructed.
Induction first specialize the argument that need for induction, by using a particular argument, induction keeps the prop to be proved re-generalized when change the subgoals, those arguments after it will still keep their universal identities.
Induction principles in propositions is a bit more complicated, like even, what we need is not just a statement for evidence, but a statement for common numbers……so intuitively speaking, we want a induction principle that allow us to prove properties on numbers by induction on evidence.
ev_ind_max : ∀ P : (∀ n : nat, ev n → Prop),
P O ev_0 →
(∀ (m : nat) (E : ev m), P m E → P (S (S m)) (ev_SS m E)) →
∀ (n : nat) (E : ev n), P n E
Evidence on that a nat is a even number is not needed when proving property P, because we just need the structure of inductive prop even, not the evidence that even x. So Coq will simplify the principle above by abandoning the evidence E. Another point is, the form inductive proposition is defined will influence the complexity of induction principles.
The informal proof of a proposition P is ought to teach us how to produce the formal proof of it.
The ideal proof should be simple and scratch some essential points, we still need a template for reference.
The object inductively defined is often named derivation trees, so induction on propositions is induction on derivation trees.
Relations on a set is just like property: while property is single argument, relations take two arguments and return a proposition. Though relation is a common word in general, Coq hijacks it on specific relations in a single set.
Like in discrete math, relations have many properties, reflexive, transitive, etc.
Partial Functions
For a relation R, for every x, there is at most one y such that R x y -- i.e. R x y1 = R x y2 -> y1 = y2.
Reflexivity ∀ x : X, R x x.
Transitivity ∀ a b c, R a b -> R b c -> R a c.
Symmetric ∀ a b, R a b -> R b a.
Antisymmetric ∀ a b, R a b -> R b a -> a = b.
Equivalence
A relation is an equivalence if it's reflexive, symmetric, and transitive.
Partial Orders and Preorders
A relation is a partial order when it's reflexive, anti-symmetric, and transitive. A preorder is almost like a partial order, but doesn't have to be antisymmetric.
The reflexive, transitive closure of a relation R is the smallest relation that contains R and that is both reflexive and transitive.
For example, the rtc of next_nat is le.
In fact, we often specialize the relation in transitive with containing relations.
We will give the impression language in three parts, firstly, the core definition of arithmetic and Boolean expressions, then the variables, and at last, the control stream.
In this chapter, we elide the translation from the concrete syntax to abstract syntax trees. The BNF grammar is given below:
a ::= nat
| a + a
| a - a
| a * a
b ::= true
| false
| a = a
| a ≤ a
| ¬b
| b && b
There is so many informal notations like BNF, which is convient for communication and ideas. It takes some work to translate them into formal languages.
Evaluation on numbers get numbers while on Boolean values the same.
We can do some optimation now, like optimizing the 0 + n into n, proving its correctness leads to dull script though.
It is annoying to do many replicating proof, so in this section, we will introduce some some facilities that construct the proof automatically. It allow us to scale up our efforts to more complex definitions and more interesting properties.
Tacticals, aka. High-order tactics is a term in Coq that take other tactics as arguments.
The try tactical : if T is a tactic and try T is like T except that, if T fails, try T will does nothing at all rather than failing.
The ; tactical : ; takes two tactics T and T', it first performs T and then performs T' on each subgoal generated by T.
The try and ; tactical are often applied together to simplify the proof.
The ; tactical(general form) : if T T1 T2 … Tn are all tactics, T ; [T1 | T2 | … | Tn] is a tactical that apply T1 in the first subgoal T generate, T2 in the second…
The repeat tactical : it takes another tactic and keeps applying this tactic until it fails.
The repeat never fails, also there is no limit for repeat times, though Coq doesn't allow the evaluation not terminate, but tactics are not. But this will not influence the logical consistency since it just means we have failed to construct a proof, not a wrong one.
We can also "programming" Coq's tactic scripts by methods below:
Here is an example for Tactic Notation :
Tactic Notation "simpl_and_try" tactic(c) :=
simpl; try c.
The omega tactic : it comes from a subset of first-order logic called Presburger arithmetic.
More tactics :
Another way to think about evaluation is as a relation between expressions and its values, and sometimes it comes more flexible. Importing some notations will also help us understand.
Using Reserved Notation … where … will prevent from introducing the function application form.
Writing Inference Rule makes it clear when discussing informal.
By the way, the variable defined now is often called Metavariables, to distinguish with variables that user defined.
Sometimes, represent evaluation as some relations do have its benefits. For example, if we want to extend the evaluation with division, the safe div is simpler in relations. Moreover, considering about a generator of numbers, which is non-deterministic, is easy to understand with relations.
Relation is more elegant and easier to understand though, function plays an important role in computation and is more deterministic. Functions can directly export as Ocaml or Haskell codes.
However, it is wise to define two different styles together and prove their coincidence, in order to switch from one another at will.
To simplify the problem, we assume that all variables are global and only hold numbers.
Using map to represent this state, which is a series of values at runtime. The default value of a variable is set as 0.
Add a constructor for aexp called id will simply refer to a variable.
Import some notations and coercions and make them in special tye scope to prevent conflict.
The evaluation now needs a more argument call st
, represent the state where variables lies.
It’s easy to use inductive types for grammar and commands.
Inductive com : Type :=
| CSkip
| CAss (x : string) (a : aexp)
| CSeq (c1 c2 : com)
| CIf (b : bexp) (c1 c2 : com)
| CWhile (b : bexp) (c : com).
And some notations…
Use Locate command to locate notations and identifiers.
The fact that sometimes WHILE do not terminate makes a bit tricky in Coq to define a function for evaluation.
Because of the logical consistence of Coq, the ever loop will be rejected, thus we can only abandon the WHILE temporarily.
An improvement is to rewrite the evaluation into relations.
However, we now need construct proof for commands.
We can also prove that ceval
is indeed a partial function.
We can get some distance just working with these bare definitions now;
The inversion tactic will expand one step of ceval
to generate some hypothesis.
This Chapter is simply to show it is possible to build a lexical analyzer and parser by Coq.
By passing an additional argument to evaluate function, we can break in force when the steps break the limits.
Note that this argument is not specificly to calculate steps, see in sequence evaluations.
An improvement is changing the function's type to option so that we can tell the difference between normal and abnormal termination.
This section aims to prove the equality of two evaluations.
Three languages available when extraction : Ocaml, Haskell and Scheme.
Moreover, we can tell Coq to extract certain Inductive definitions to specific OCaml types.
It is entirely your responsibility to make sure that the translations you're proving make sense.
In this chapter, we wil look through more tactics in Coq about auto proof systems.
Firstly, an example of Ltac Marco is useful to simplify and make proof more understandable.
auto will search for a series of application that can be applied to goals.
Any combination of intros and apply can be replaced by auto, auto is also safe, it will neither rewrite goals nor fail. Passing argument to auto to control the search depth. Auto relies on a hint database to work, auto using will expand the default database.
Hint Resolve T will add a lemma into global hint database, Hint constructor c will add all props inductively defined in c. Hint unfold d will expand the symbol d for more propositions.
Proof with T will apply T in the place of …
Match goal with … in Ltac will match goals with a specific form and do some tactics.
Though match goals is not convient when debuging, it is also worth adding it to proof scipts.
eapply and eauto will delay the instantiations of quantifiers.
e… tactics are much more slower than ordinary tactics, so it will be better using apply and auto ar most of time, only switching to e tactics when some jobs can not be done.