dselsam's comments | Hacker News

“Here is a Homer poem about the Singularity:”

It is not possible to say
Whether the gods set the Singularity
Upon us, or the Singularity
Caused the gods to be.


One of the founders of the IMO Grand Challenge here. FYI I presented the challenge along with a preliminary roadmap at AITP 2020 last week: https://youtu.be/GtAo8wqWHHg. There is no way to know whether gold is five years out or five hundred, but I hope the challenge spurs progress in the field either way.


That repository has rotted. A preliminary roadmap is described in a recent invited talk at AITP 2020: http://grid01.ciirc.cvut.cz/~mptp/zoomaitp/aitp_sep_15_1930....


> Its always funny to realize how "easy" beating human-intelligence is (Chess AI, Go AI, even Mathematical Proofs), but how hard beating human-simple behaviors are.

This is a baseless myth that won't die. We are nowhere near rivaling humans in mathematical proving. Even after a human has proved a theorem, it can be an astronomical amount of work just to explain the proof to a machine after the fact (i.e. to construct a machine-checkable version of the proof), even when using suites of sophisticated software tools developed solely to facilitate this process. Progress has been glacially slow since the field began in the 1950s. Deep learning has had zero impact.

We will have robots that can grasp competently way before we have machines that can rival humans in mathematical proving.


True: automated mathematical proving is strictly limited to certain domains right now. First-order logic (and higher-order logics) remain largely outside the scope of machines, although machines can somewhat solve logic puzzles correctly stated in first-order form (it's a semi-decidable problem, so "somewhat solve" is basically the best we can ever hope for).

But move down to Boolean logic and symbolic computation... the stuff that translates statements in Verilog or VHDL into pure logic: small LUTs in FPGAs, or the NAND gates of hardware synthesis?

Oh yeah, machines are way better than humans there. People pretty much rely upon automatic synthesis for circuit design and Boolean logic these days. The proof is in the pudding.

Perhaps it isn't the same kind of math you were thinking about. But Boolean algebra is still math, logic, and proofs. Of course, in practice, humans use these tools to build larger structures... but I'd argue that the machine handles a lot of the rote proof work (e.g. optimally figuring out the minimum number of NAND gates needed to represent a given truth table).
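To make the "Boolean algebra is still proofs" point concrete, here is a tiny Lean 4 sketch (my own illustrative names, not output from any synthesis tool): the classic four-NAND construction of XOR, verified by exactly the kind of exhaustive truth-table check that machines handle trivially:

    -- A NAND gate over booleans.
    def nand (a b : Bool) : Bool := !(a && b)

    -- XOR built from four NAND gates (the classic NAND-only construction).
    def xorNand (a b : Bool) : Bool :=
      let c := nand a b
      nand (nand a c) (nand b c)

    -- Correctness is a finite truth-table check: enumerate all four cases.
    theorem xorNand_correct (a b : Bool) : xorNand a b = Bool.xor a b := by
      cases a <;> cases b <;> rfl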


Author here. We weren't shooting for low-hanging fruit, and we realize that we are still very far from contributing to the state of the art. We tried to approach this project as scientists instead of as engineers. I personally found NeuroSAT's success to be extremely surprising, and I think it underscores how little we understand about the capabilities and limitations of neural networks. Our approach might prove to be a dead end, but I think there is some chance we have stumbled on something powerful.


Author here.

Building Certigrad involves replaying all tactic scripts in the entire project to reconstruct all of the formal proofs, and then checking each of the formal proof objects in Lean's small trusted kernel. Proving (and checking) the main correctness theorem for stochastic backpropagation is very fast. The vast majority of the time and memory is spent verifying that a specific machine learning model (AEVB) satisfies all the preconditions for backprop. This involves proving several technical conditions, e.g. that various large terms are (uniformly) integrable. We have not experimented much with simplification strategies, and there is probably a lot of room for improvement in bringing these numbers down. It would also be good to provide an option to build the system without reconstructing the proofs; checking the proofs is analogous to running the entire test suite, and most users do not do this for every tool they build.


Could this problem have been (partly) avoided by going for an LCF-based prover such as HOL or Isabelle/HOL, rather than a system based on the Curry-Howard correspondence?


I do not even know how I would have built Certigrad in Isabelle/HOL in the first place. In my first attempt to build Certigrad, I used a non-dependent type for Tensors (T : Type), and I accumulated so much technical debt that I eventually gave up and started again with a dependent type for tensors (T : Shape -> Type). This was my only major design error in the project, and development went smoothly once I made this change.
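For readers unfamiliar with the distinction, a minimal sketch of the two designs (hypothetical Lean 4 declarations, not actual Certigrad code):

    -- Shapes as lists of dimensions.
    def Shape := List Nat

    -- Non-dependent design: one type for all tensors, regardless of shape.
    -- Shape mismatches can only be ruled out by side conditions in proofs.
    axiom T₀ : Type
    axiom add₀ : T₀ → T₀ → T₀    -- nothing stops adding a 2x3 to a 4x5

    -- Dependent design: the shape indexes the type, so operations can
    -- demand compatible shapes statically.
    axiom T : Shape → Type
    axiom add : {s : Shape} → T s → T s → T s    -- shapes must agree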


ITTs like Lean are more expressive as logics than HOL, but I doubt that Certigrad needs even a fraction of HOL's power. So anything you express as types in Lean you should be able to express as a theorem in HOL. Presumably that is inconvenient (= technical debt), but why?


Tensors are a good example: in a proof you might want to do induction over the dimensions of a tensor. This means your type of tensors needs to contain all tensors of all dimensions. But working with the type of all tensors is not so nice anymore: a lot of algebraic properties do not hold; they only hold for tensors of a specific dimension. An example is the Tensor library in the Deep_Learning AFP entry.

Now in Isabelle most structures are encoded as type classes, but when your type is no longer in the type class, you suddenly need to prove a lot of theorems yourself, and the automation does not work as nicely as with type classes. Generally, in HOL provers you want your type to have nice algebraic properties on the entire type; if this is not the case, proofs get much more convoluted. Isabelle's simplifier supports this case using conditional rewrite rules, but it still does not work as nicely as using type inference to handle these cases.

In dependent type theory provers it isn't a problem to prove everything on tensors of a fixed dimension. When a proof requires induction on the dimension itself, the needed shape can always be constructed in the proof itself.


The memory consumption will surely be better, as HOL or Isabelle will only store the fact that a theorem was proven, but not how. Lean, on the other hand, can store the proofs and has two independent external type checkers. Isabelle has the infrastructure to do this, but it cannot cope with it after a certain size (Isabelle's HOL-Proof does not even contain all of HOL/Complex_Main). In my experience Lean feels much faster than Isabelle, so I would guess the proofs would take much longer in Isabelle than in Lean. But I don't have any concrete measurements.

In Isabelle one might want to trust the code generator to evaluate certain computations in ML.


One of my PhD students will soon have to learn an interactive prover. I was going to recommend Isabelle/HOL because it's got the best automation, but maybe I should consider Lean (and learn it along with him).

I worry slightly about Lean being immature and lacking a big library ecosystem in 2017. OTOH, it's also good to be at the cutting edge.


> The specification is a lot smaller than the code, and so it's easier to read and manually verify that it's correct.

>> How is that the case in this specific example? It looks a lot harder to check for correctness.

Author here.

Here is the specification for the stochastic backpropagation algorithm:

https://github.com/dselsam/certigrad/blob/master/src/certigr...

The preconditions do not need to be inspected carefully because we prove that the models of interest satisfy them (example: https://github.com/dselsam/certigrad/blob/master/src/certigr...)

Here is the part of the specification that needs to be inspected carefully:

https://github.com/dselsam/certigrad/blob/master/src/certigr...

The syntax may seem strange to those unfamiliar with Lean, and a few of the functions involved may not be self-explanatory without reading their definitions, but the statement is conceptually simple: the stochastic backpropagation algorithm correctly computes unbiased estimates of the gradient of the expected loss.

It is subjective, but I personally think that this specification is vastly easier to understand and to check for correctness than the actual implementation of the stochastic backpropagation algorithm.


> All you're doing is moving the bugs from the source code to the specification.

The value of doing this can vary, but there are some cases in which the gain is immense and indisputable. Suppose you are writing a compiler optimization. Your source code could be arbitrarily complicated, but your specification is simply "an optimized program always behaves the same as the original".
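As a toy illustration of how short such a specification can be (a Lean 4 sketch with invented names; real efforts like CompCert are vastly more involved): a tiny expression language, a constant-folding pass, and the one-line theorem that optimization preserves behavior:

    -- A toy expression language.
    inductive Expr where
      | const : Nat → Expr
      | var   : Nat → Expr
      | add   : Expr → Expr → Expr

    -- Reference semantics: evaluate under an environment.
    def eval (env : Nat → Nat) : Expr → Nat
      | .const n => n
      | .var i   => env i
      | .add a b => eval env a + eval env b

    -- Smart constructor that folds additions of constants.
    def mkAdd : Expr → Expr → Expr
      | .const m, .const n => .const (m + n)
      | a, b => .add a b

    theorem eval_mkAdd (env : Nat → Nat) (a b : Expr) :
        eval env (mkAdd a b) = eval env a + eval env b := by
      cases a <;> cases b <;> simp [mkAdd, eval]

    -- The constant-folding optimization itself.
    def optimize : Expr → Expr
      | .add a b => mkAdd (optimize a) (optimize b)
      | e => e

    -- The entire specification: optimized programs behave like the original.
    theorem optimize_correct (env : Nat → Nat) (e : Expr) :
        eval env (optimize e) = eval env e := by
      induction e with
      | const n => simp [optimize]
      | var i   => simp [optimize]
      | add a b iha ihb => simp [optimize, eval_mkAdd, eval, iha, ihb]

However complicated the optimizer gets, the statement of optimize_correct stays one line.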


Are things like "the output must be equal to the output of this other thing" easily expressed to a prover? If so, this seems like it'd be awesome for optimisation as well.


Yes, this is easy to express in a prover. A naive implementation can always serve as a specification for a sophisticated one.
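For instance, in Lean 4 (an illustrative sketch with made-up names): the obviously-correct quadratic list reversal below serves as the complete specification for a tail-recursive, linear-time version:

    -- Obviously correct, but quadratic: the specification.
    def reverseSlow : List Nat → List Nat
      | []      => []
      | x :: xs => reverseSlow xs ++ [x]

    -- Fast, accumulator-based, and less obviously correct.
    def reverseFast (xs : List Nat) : List Nat := go xs []
    where
      go : List Nat → List Nat → List Nat
        | [],      acc => acc
        | x :: xs, acc => go xs (x :: acc)

    -- The key lemma: the accumulator collects the reversed prefix.
    theorem go_eq (xs acc : List Nat) :
        reverseFast.go xs acc = reverseSlow xs ++ acc := by
      induction xs generalizing acc with
      | nil => simp [reverseFast.go, reverseSlow]
      | cons x xs ih => simp [reverseFast.go, reverseSlow, ih]

    -- The naive implementation is the entire spec of the fast one.
    theorem reverseFast_correct (xs : List Nat) :
        reverseFast xs = reverseSlow xs := by
      simp [reverseFast, go_eq]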


This sounds very interesting indeed. Can this be done with Coq? Do you have any tutorials or such on doing this?


It can definitely be done with Coq. The typical recommendation is to read through and do the exercises in the freely-available Software Foundations book [1], which will teach you how to do that in Coq.

That said, while it was educational and an extremely good introduction to interactive theorem proving, as I went through the book I became increasingly disappointed with how laborious Coq-style proofs were to achieve, as you often had to prove even the simplest things step by step. You also had to go through many chapters before you finally arrived at proving the correctness of optimizations.

I later found an alternative which I can't recommend highly enough. If you read through and do the exercises in the also freely-available Concrete Semantics book [2], you will also learn how to do exactly what you want, but you will learn it much sooner and in a much easier way.

In the latter book, you will be using the Isabelle/HOL prover instead of Coq. The language looks slightly more idiosyncratic (I'm referring to the weird extra quotes; you will know what I mean), but despite that, to me it was worth it a hundred times over: Isabelle/HOL has the most amazing proof automation I've seen. It will not only find proofs all by itself quite often, automatically using hundreds of theorems it already knows or that you have proved before, but sometimes it will even automatically spit out human-readable proofs, which will teach you how the proof can be done (useful when you are stumped and have no clue how to move forward).

Isabelle/HOL can also be configured to automatically find errors in your specification and in your implementation - it has built-in automatic quickcheck / nitpick functionality, which can often tell you that you're trying to prove something that is not provable even before you attempt to prove it. IMHO, all of these things contribute to a significantly better user experience (as a newbie, at least).

If you are interested, give it a try. I highly recommend it!

[1] https://softwarefoundations.cis.upenn.edu/current/index.html [2] http://concrete-semantics.org/


I second this.

Isabelle/HOL has much better proof automation than any prover based on Curry/Howard. That's primarily because HOL is a classical logic, while Curry/Howard is constructive (yes, I know we can do classical proofs inside constructive logic, e.g. with proof irrelevance). Almost all automated provers are also classical. Getting comparably powerful proof automation in Curry/Howard-based systems is an open research problem.

   weird extra quotes
That's because Isabelle is not a prover, but a 'meta-prover' used for implementing provers, e.g. Isabelle/HOL. In practice, HOL is the only logic implemented in Isabelle with enough automation to be viable, so we don't really use Isabelle's 'meta-prover' facilities in practice.


> Getting comparably powerful proof automation in Curry/Howard-based systems is an open research problem.

Isabelle/HOL is essentially isomorphic to a subset of Lean when we assume classical axioms, and any automation you can write for Isabelle, you can write for that subset of Lean. It is often easy to generalize automation techniques to support the rest of Lean (i.e. dependent types) as well, but sometimes doing so may introduce extra complexity or preclude the use of an indexing data structure. See https://arxiv.org/abs/1701.04391 for an example of generalizing a procedure (congruence closure) to support dependent types that introduces very little overhead.


This is interesting. It will take me some time to digest this paper.

As an aside: is there a terse description of Lean's type theory? Also: Lean is Curry/Howard-based, not an LCF-style architecture?


There are older papers that describe Lean's metatheory at https://leanprover.github.io/publications/ (the first paper, and the one on elaboration). Unfortunately they both describe the version present in Lean 2, which still had the HoTT support we removed in Lean 3.

The new tactic paper, set to appear at ICFP '17, also briefly describes Lean's meta-theory. https://leanprover.github.io/papers/tactic.pdf

At a high level, our metatheory is very similar to Coq's, with a few important departures. A big difference is the removal of pattern matching and fixpoints from the kernel; we instead have primitive recursors/eliminators/recursion principles.
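For example (a small illustrative sketch, not from the paper): these two Lean definitions agree; the first uses pattern matching, which the elaborator compiles away, while the second applies the primitive recursor Nat.rec directly, which is all the kernel needs to understand:

    -- Pattern matching and structural recursion, as the user writes it.
    def double : Nat → Nat
      | 0     => 0
      | n + 1 => double n + 2

    -- What the kernel ultimately deals in: the primitive recursor for Nat.
    -- (noncomputable because the compiler does not execute raw recursors;
    -- the kernel can still reduce it, as the rfl proof below shows.)
    noncomputable def double' (n : Nat) : Nat :=
      Nat.rec (motive := fun _ => Nat) 0 (fun _ ih => ih + 2) n

    example : double 3 = double' 3 := rfl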

The lead author of Lean (Leo De Moura) has built some of the world's most advanced automated reasoning tools, and his goal has always been to unify type theory and automation. Our goal is to have the benefits of dependent type theory coupled with the high level of automation present in SMT solvers and in tools like Isabelle.


Thanks for the link to the paper.

De Moura is giving a talk called "Metaprogramming with Dependent Type Theory" tomorrow in Cambridge, at a workshop on computer-aided proofs. I wanted to go, but won't be able to attend. I guess that talk will mostly be about the content of the paper. Now I don't feel so bad for being unable to attend.


Lean is LCF style in the sense that there is a small kernel, and all proofs written by the user and generated by the automation are checked by this kernel. This kernel (i.e. type checker) is much smaller than that of Coq or Agda.

It is _not_ LCF style in the sense that there is only a "theorem" datatype with proof operations. Lean proofs are type-checked expressions.

It is hard to find a terse description of the Calculus of Constructions or CIC. Lean's logic essentially consists of Martin-Löf type theory + (noncumulative) universes with impredicative Prop + inductive types (where nested and mutual induction is compiled down to simpler forms with only one eliminator) + quotient types + function and Prop extensionality, and optionally choice to gain a classical logic. Most other features, like Agda-style dependent pattern matching or mutual and nested inductive types, are compiled down to these more basic features.


    there is only a "theorem" datatype 
You mean that there is NOT only a "theorem" datatype? In contrast to Curry/Howard provers, the LCF approach forgets proofs; it guarantees soundness by giving you access to proof rules only through the "theorem" datatype (which is the key trusted computing base). To be sure, the "theorem" datatype may internally maintain a proof object (e.g. for the purposes of program extraction), but that's not necessary.


Sorry, I was unclear. I meant that most LCF style theorem provers only have one theorem datatype to carry proof information.


   Can this be done with Coq?
This can be done with any prover.

Simplifying a bit, an algorithm is a function of type A -> B. If you have two implementations, f1 :: A -> B and f2 :: A -> B, say one slow and simple (f1), one fast and complicated (f2), then you can verify that f2 does what it is supposed to do in two steps:

1. Show that for all a in A we have f1 a == f2 a.

2. Show that f1 meets the specification.

If you trust that f1 does what it is supposed to do, then you can skip (2).
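Here is a Lean 4 sketch of step 1 (f1 and f2 are illustrative: a simple recursive sum and a tail-recursive one; step 2 would then be proving whatever specification you trust against f1):

    -- f1: slow and simple; easy to audit by eye.
    def f1 : List Nat → Nat
      | []      => 0
      | x :: xs => x + f1 xs

    -- f2: faster and more complicated (tail-recursive with an accumulator).
    def f2Aux (acc : Nat) : List Nat → Nat
      | []      => acc
      | x :: xs => f2Aux (acc + x) xs

    def f2 (xs : List Nat) : Nat := f2Aux 0 xs

    -- The accumulator invariant needed for step 1.
    theorem f2Aux_eq (acc : Nat) (xs : List Nat) :
        f2Aux acc xs = acc + f1 xs := by
      induction xs generalizing acc with
      | nil => simp [f2Aux, f1]
      | cons x xs ih =>
        simp only [f2Aux, f1, ih]
        omega

    -- Step 1: for all a in A, f1 a == f2 a.
    theorem f1_eq_f2 (xs : List Nat) : f1 xs = f2 xs := by
      simp [f2, f2Aux_eq]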


> your specification is simply "an optimized program always behaves the same as the original"

I wouldn't say it is simple, since it is undecidable for general programs.


Daniel's point is that the specification itself is simple, he is saying nothing about the complexity of proving that your program matches the specification.

The decidability of the specification isn't important in this context. It is known that there is no procedure that can build a proof of this specification for an arbitrary language or optimization, but as humans we can prove that it holds for a specific configuration of compiler and optimizations, and this has been done many times, for example in CompCert and Alive.


The specification is simple in the sense that it is easy to understand what it states and to confirm that it has the intended meaning. This does not mean that all proofs of the specification will be simple, but once you do construct a machine-checkable proof of the specification, your level of confidence that the system is correct will be extremely high (commensurate with the simplicity of the specification) and will not depend on the complexity of the proof.


The point is that the specification is short and human understandable: you get confidence that bugs are unlikely to exist in the specification. Now if you also can prove that your optimization passes fulfill the specification you've obviously gained a lot of value. It doesn't matter that deciding the correctness of arbitrary program transformations is undecidable if you happen to succeed in proving that the ones actually implemented are correct.


> Doesn't TensorFlow support random variables too?

The paper doesn't explain this well, but although you can put random variables in TensorFlow programs, you cannot backpropagate through them. With stochastic computation graphs, you can differentiate the expected loss as long as the probability density/mass functions of the random variables are differentiable. One of the main benefits of stochastic computation graphs is that you can train with arbitrary, non-differentiable simulators as long as they depend on the parameters only indirectly, through random variables.

> They even compare their performance to TensorFlow on a model with random variables.

The naive variational autoencoder of Figure 2 cannot be trained in TensorFlow, since the random variable depends on the parameters. However, this particular model can be reparameterized so that it doesn't depend on them, without affecting the expected loss (Section 4.6), and the resulting model can be trained in TensorFlow.
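For the curious, this is the standard reparameterization trick from the variational autoencoder literature: instead of sampling z from a distribution whose parameters depend on θ, sample parameter-free noise and transform it deterministically, so gradients with respect to θ can flow:

    z ~ N(μ(θ), σ(θ)²)                 -- randomness depends on θ: can't backprop
    z = μ(θ) + σ(θ)·ε,  ε ~ N(0, 1)    -- same distribution over z, but the only
                                       -- randomness (ε) is independent of θ

    E_{z ~ N(μ(θ), σ(θ)²)}[f(z)] = E_{ε ~ N(0, 1)}[f(μ(θ) + σ(θ)·ε)]

Since the right-hand expectation is over a fixed distribution, the gradient with respect to θ can be moved inside the expectation and estimated by sampling ε.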


Author here.

> Also as scribu states, this doesn't allow you to prove final goals of the system like "classify images with 95% accuracy", nor does it save you from insufficient or inaccurate data.

We focused on implementation-based notions of correctness, which are both much easier to state and much more useful when actually developing software, but in principle downstream notions of correctness could be stated and proved as well. For example, for some models and under some assumptions about the data, you could formally verify that the accuracy of your trained model will probably be high on an unseen test set. However, the limiting factor is that nobody knows that much yet about the assumptions under which most useful models (such as neural networks) are guaranteed to perform well.

