Except most programmers don't spend even a second thinking about it, and we end up with "int mid = (low + high) / 2;" bugs in standard implementations of binary search in, e.g., Java. And that implementation even had a written proof accompanying it!
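For concreteness, here is a minimal Java sketch (not the actual JDK code, the class and method names are made up) of how the bug bites and what the usual fix looks like in a fixed-width-integer language:

    public class Midpoint {
        // Broken: low + high can exceed Integer.MAX_VALUE and wrap to a negative value.
        static int midBuggy(int low, int high) {
            return (low + high) / 2;
        }

        // Safe for 0 <= low <= high: the difference never overflows.
        static int midSafe(int low, int high) {
            return low + (high - low) / 2;
        }

        public static void main(String[] args) {
            int low = 1_500_000_000, high = 2_000_000_000;
            System.out.println(midBuggy(low, high)); // negative: the sum wrapped around
            System.out.println(midSafe(low, high));  // 1750000000
        }
    }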
Try that in SPARK/Ada. It'll stop you there [if it can't prove that low + high won't overflow]. Don't take a proof written with one set of assumptions (in this case how integers are expected to behave) and translate it to another language where those assumptions don't hold.
The most interesting part of SPARK for me is how all the runtime checks are implicitly and automatically generated and then need to be proved.
A whole bunch of assumptions about the language are handled behind the scenes, and the verification conditions are generated automatically for the underlying automatic or semi-automatic proof tools (via Why3) - the second-best part of SPARK.
You have to trust that the SPARK frontend authors got it right - or you can inspect/review all the discharged VCs if you want - and you still have to actually prove or help prove them all, but I'm not losing sleep over a forgotten 'normal' check.
Are writing our next program in Lean then? Where does that run?
There seems to be a fundamental difficulty here. Either we prove things in the language we want to use, which means modelling the behavior of the things we use in that language, or we prove things in Lean, but then cannot apply that to an actual implementation, because of issues like the one above.
I would be surprised if there were no standard approach for modelling bounded integers and their language-specific properties (which can differ) in a proof language like this. Surely other people have thought about this and come up with solutions.
It does not look like it, given its largest target audience (hardcore math nerds writing proofs), but Lean 4 is actually a meta-programming language, like Racket -- one single unified language for programming, meta-programming, and theorem proving, on top of a small core. The implementation of Lean is almost entirely written in Lean, including components like the LSP server, the compiler itself, many efficient data structures, etc. Writing real-world programs is a key use case, because the implementation itself is such a program.
Way back, Lean 3 had this idea that you could use one language both for writing proofs and for writing proof automation -- programs that manipulate proofs and automatically do things. The 'grind' thing-a-mabob in the article is proof automation, for example. (Rocq has two separate languages for proofs and proof automation.) But there was a problem: Lean 3 started as a theorem prover, the implementation then tried to move towards "executing programs", and it didn't work very well and was slow. The prototype compiler from Lean 3 to C++ ran out of steam before Lean 3 got canned.
Lean 4 instead did things the other way around: it started as a programming language that was executable. The Lean 4 compiler was self-hosted during development for a long time, well before anyone ported any big math proofs over from Lean 3. Why do it this way? Because the key insight is that if you want to write programs that manipulate proofs (which are themselves programs), the best thing to have at hand is a robust general-purpose programming language -- like Lisp or Racket. And so Lean is macro-based too, and like Racket it lets you build families of languages and towers of macros that expand as deep as you want: meta-programs that write programs that write programs, and so on.
So in Lean you write Lean programs that manipulate Lean ASTs, and you write "reader macros" that let you use domain-specific syntax right inside the source file for any kind of math or programming-language DSL you want. All those macros and meta-programs are compiled and executed efficiently, just as you'd expect. Finally, there is a total fragment of the language that you can actually write proofs about and reason over. And there is a big library called "mathlib" with lots of macros for writing math and math proofs in this language-fragment-we-can-reason-and-prove-things-with.
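As a tiny illustration of the program-plus-proof idea (a sketch with made-up names, assuming a recent Lean 4 toolchain where the built-in `omega` tactic is available):

    -- An ordinary executable function...
    def double (n : Nat) : Nat := n + n

    -- ...and a proof about it, in the same file and the same language.
    theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
      unfold double
      omega

    #eval double 21  -- 42, evaluated like any other program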
Lean 4 is very close in spirit to the Lisp idea, but adapted for math proving and general-purpose programming. It's unique and powerful.
> Are writing our next program in Lean then? Where does that run?
That first question is hard to parse. If you mean "Are you writing your next program in Lean then?" then: No, but in principle we could, it runs on each OS we use (Windows and Linux, standard x64 hardware). If you mean something else, I can't figure out what it would be.
> Either we prove things in the language we want to use
Sure, I mentioned SPARK/Ada. There are similar systems for C, Java, and others that understand those languages' types, so you don't have to add extra modelling.
> which means modelling the behavior of the things we use in that language
It would already be done for you; you wouldn't have to model Ada's integers in SPARK, for instance.
> we prove things in Lean, but then cannot apply that to an actual implementation, because of issues like the one above.
What's the context of your question? Did I say otherwise?
Some languages (what's being discussed here) give you arbitrary precision integers, like Lean. So the proof in the blog applies to Lean, and issues like -INT_MIN not being representable aren't a factor in the proof. That's what's being discussed. The proof for C or Ada or something else with fixed-width integers will be different (and likely the algorithm too) to account for that difference and any other relevant differences.
> If you knew your target system was using fixed-width integers, you'd use this.
Usually `system` in this context refers to the machine you run the program on, not the language itself, or at least that is how I interpreted it. I guess there used to be systems with BCD math supported by hardware, and I guess those could have variable precision. I'm not sure if it was truly arbitrary (i.e. it could still overflow), but one could presumably detect overflow and increase the memory allocation. But once you have a software system doing your math rather than a machine instruction, it seems like you need to prove the correctness of that implementation, not just the general concept.
A language can provide arbitrary width integers while nonetheless being implemented on top of fixed width integers. In fact I'd say that's pretty typical.
To the same extent that you need to prove the correctness of the compiler and the language runtime I guess. Arbitrary width integers are a core feature of quite a few languages.
Given Lean's raison d'être, it wouldn't surprise me if the relevant library code was verified. If using another language such as Python, I think it would be fairly reasonable to take the underlying implementation on faith. If we're going to start questioning whether a core part of the runtime that implements basic arithmetic operations has bugs, then why are we not also suspecting the operations provided by the CPU and other hardware?
We probably should suspect those things, but as a matter of practicality we're going to need a boundary at present.
Fair enough. I must just not be understanding what the motivation for creating proofs of such software is, then. To me, assuming the compiler or language runtime has no bugs seems to severely undermine the value of the proof. I feel like if the proof were to have value in an imperative language like C or C++, I would want the assembly proven, since the resulting computation can be drastically altered from the original source text by compiler flags and optimization levels.
Yes, Lean is executable, and the proof about natural numbers runs with arbitrary-width integers. They're stored as tagged pointers, with numbers that fit in 63 bits stored directly and larger numbers becoming GMP-backed.
Lean is actually pretty easy to learn and code in; also, if you are not interested in extreme performance, it is pretty much ready for basic applications (not sure about async and co.).
The problem is that you want to do all your coding monadically (so that it looks like the familiar imperative style), and I think there are still many open problems around doing proofs about monadic code.
> Except most programmers don't spend even a second thinking about it, and we end up with "int mid = (low + high) / 2;" bugs in standard implementations of binary search in, e.g., Java. And that implementation even had a written proof accompanying it!
I see this as far more of an indictment of Java than an indictment of programmers. If you make the symbol "+" part of your programming language's syntax (especially in a language where you don't allow custom implementations of "+") then it should do what the ordinary mathematical + operation does!
Do you realize that making + do what the ordinary mathematical + does is quite expensive? It requires arbitrary precision arithmetic and hence, potentially, memory allocations.
> Do you realize that making + do what the ordinary mathematical + does is quite expensive? It requires arbitrary precision arithmetic and hence, potentially, memory allocations.
Yes, I do. It's an acceptable cost in most cases (you will note that much of the world runs on Python these days, a language in which + does behave sensibly), and dangerous optimisations should be opt-in rather than opt-out.
In the worst case, you should at least make your language fail-stop by making it error when the result of + is too large, rather than silently continuing in an invalid state.
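Java already offers an opt-in version of this behaviour via Math.addExact, which throws instead of wrapping; a small sketch (the values are made up to force the overflow):

    public class CheckedAdd {
        public static void main(String[] args) {
            int low = 1_500_000_000, high = 2_000_000_000;
            System.out.println((low + high) / 2);   // wraps silently and prints a negative number
            try {
                System.out.println(Math.addExact(low, high) / 2);
            } catch (ArithmeticException e) {
                // fail-stop: an explicit error instead of a silently wrong answer
                System.out.println("overflow detected: " + e.getMessage());
            }
        }
    }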
> In the worst case, you should at least make your language fail-stop by making it error when the result of + is too large, rather than silently continuing in an invalid state.
That's what Swift does fwiw. But there are downsides to that too (although they're also related to Swift not having a good fault recovery mechanism).
In practice, whenever you know you're going to use large enough numbers in Java, you probably want to use BigInteger or BigDecimal (e.g. in banking). Unfortunately, Java doesn't allow you to use the + operator with them.
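A quick sketch of what that looks like (the midpoint example again, with made-up names), since everything has to go through method calls:

    import java.math.BigInteger;

    public class BigSum {
        public static void main(String[] args) {
            BigInteger low  = BigInteger.valueOf(1_500_000_000L);
            BigInteger high = BigInteger.valueOf(2_000_000_000L);
            // No operator overloading in Java, so no '+': it's add()/divide() calls instead.
            BigInteger mid = low.add(high).divide(BigInteger.valueOf(2));
            System.out.println(mid);  // 1750000000, with no possibility of overflow
        }
    }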
> whenever you know you're going to use large enough numbers in Java, you probably want to use BigInteger or BigDecimal
When you've made a conscious decision you can pick the right thing, sure. The problem usually happens when you haven't thought about it at all, which is why the default should be safe and the unsafe optimisation should be opt-in.
Maybe for LOB applications, it would be better if languages defaulted to arbitrary precision arithmetic. But scientific computing is also a huge field that often uses the same languages and there, arbitrary precision is often the completely wrong tool, e.g. it would make certain key algorithms (like Gaussian elimination) exponential.
I feel like this is just one of these things that developers should know about so they can make the correct choice, just like DB indices.
> arbitrary precision is often the completely wrong tool, e.g. it would make certain key algorithms (like Gaussian elimination) exponential.
Sure, but even in science taking a lot of time to deliver a result, or failing to deliver one at all, is much safer than silently delivering the wrong result. There's the concept of fail-stop if you want a rigorous approach to safety. There's no analogous safety model that says silent overflow is the safe option.