
> The way you've asked this it will lead people to think you're asking for very powerful static checking

That's exactly what I'm asking for :-). It should be possible to create restricted types that are checked at runtime. For example, the fabs function would return a float that is guaranteed to be >= 0, and the sqrt function would require its argument to be non-negative.

    float x;
    ...         // arbitrary code
    sqrt(x);    // warning, possible invalid argument

    float y;
    y = fabs(...);  // y is now float<positive>
    sqrt(y);    // no warning

    float z;
    z = -1.0;
    sqrt(z);    // error: invalid argument, argument z of type float
                // fails constraint z >= 0. z is constant -1.0
It is an error if the compiler can prove the call invalid. It would probably be a warning if the compiler can't prove it safe - but you could optionally make it an error (to prove critical code). There are some cases where it won't work - a la solve_halting_problem() - but you could add little proofs to help it. For example, you could declare that sin(x) is `float<positive>` if `x > 0 && x < pi`.
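A runtime-checked version of this can be sketched in Haskell today with a smart constructor (all names here - Positive, mkPositive, fabs', sqrt' - are hypothetical, not from any library):

```haskell
-- Sketch of a "float<positive>": a newtype whose only public way of
-- being built checks the invariant, so any Positive value is >= 0.
newtype Positive = Positive Double deriving (Show, Eq)

-- Runtime check: the fallback for values the compiler can't analyze.
mkPositive :: Double -> Maybe Positive
mkPositive x
  | x >= 0    = Just (Positive x)
  | otherwise = Nothing

-- abs establishes the invariant by construction, so no check is needed.
fabs' :: Double -> Positive
fabs' = Positive . abs

-- Callers must supply a Positive, i.e. prove non-negativity up front.
sqrt' :: Positive -> Double
sqrt' (Positive x) = sqrt x
```

With this, `sqrt' (fabs' (-4.0))` type-checks, while passing a raw Double to sqrt' is rejected at compile time; only code that can't be checked statically pays the mkPositive runtime test.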

This may sound crazy, but I think the parts are already out there. C#'s non-nullable types are a very weak version of this (the compiler can infer if something could possibly be null, and complains if you assign it to a non-nullable type). Many compilers also track uninitialized variables, and only let you assign to them, not from them. They are also effectively a restricted version of the original type (every time you declare a float argument, you actually mean float<initialized>).

Another ingredient is control-flow analysis. For example there is the excellent PySonar [1], which performs a whole-program analysis of Python programs and is able to infer types (including union types where it is not sure). In Python, values and types are somewhat blurred IMHO. Instead of variables, Python actually has `names`, and each name has a `type` and a `value` inside. Inferring what `type` a `name` could have is a bit like inferring what value a variable could have in other languages. So I think some of the lessons learned there could be valuable for building something like this, too.

[1] http://yinwang0.wordpress.com/2010/09/12/pysonar/



The non-nullable types are trivially implemented in Haskell. From what I can tell, the control-flow analysis you're talking about is just type inference (another thing well handled in ML-like languages).
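For instance (a minimal sketch), Haskell values are non-nullable by default, and absence is opt-in via Maybe:

```haskell
-- A plain String can never be "null"; Maybe String makes the absent
-- case explicit, and the compiler forces callers to handle it.
greet :: Maybe String -> String
greet (Just name) = "hello, " ++ name
greet Nothing     = "hello, stranger"
```

Forgetting the Nothing case is an incomplete-pattern warning, not a runtime NullReferenceException.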

Indeed, everything you've written can be fairly trivially implemented in Haskell, though inference and annotation would feel different. The real challenge comes in when you've got an arbitrary "test" expression which must be satisfied by the statics of your language.

For instance, your original example had ensure blocks

    ensure { x.left > 10 && x.right < 30 }
The problem is that it's very difficult for the compiler to take constraints like these and propagate them through increasingly complex transformations without just representing the entire language at the type level.

Which is exactly the path taken by dependently typed ("DT") languages like Agda, Coq, Idris, and ATS. The upshot is that you get arbitrarily powerful static queries. The downside is that your type system is exactly as complex as your language itself, and you have to shuffle around proofs that various invariants hold, because in general it is undecidable for the compiler to search for them by itself.

I highly recommend trying out Coq if this kind of stuff interests you. I recommend Coq over Agda/Idris because you're going to be learning new syntax anyway and Coq has better literature out there:

    http://www.cis.upenn.edu/~bcpierce/sf/
    http://adam.chlipala.net/cpdt/


Thanks, I'll definitely have a look at Coq (well, once I have free time - and I want to really understand Haskell first).

The thing that bothers me about implementing these constraints the Haskell or Idris way (or in any other DT language I've encountered) is that it feels a lot like template metaprogramming in C++. Defining natural numbers as Zero and a sequence of successors (data Nat = Z | S Nat) seems a lot like defining integers with the Peano axioms using nested C++ templates - possible, even neat, but a bit crazy. Maybe that feeling goes away when you're more familiar with the languages.
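For reference, the Peano encoding under discussion, with addition defined by structural recursion (the standard textbook definitions, spelled out):

```haskell
-- Naturals as Zero and Successor.
data Nat = Z | S Nat deriving (Show, Eq)

plus :: Nat -> Nat -> Nat
plus Z     n = n             -- 0 + n = n
plus (S m) n = S (plus m n)  -- (1 + m) + n = 1 + (m + n)

-- Convert back to a machine integer, for inspection.
toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```

So 2 + 1 is `plus (S (S Z)) (S Z)`, which unfolds to `S (S (S Z))`.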

Btw. I've just found out that Spec# (an extension of C#) almost has what I described... you can define preconditions and invariants that are checked at least at runtime, while a static checker tries to prove as much as possible at compile time. It's a pity it is no longer developed; this could have been the statically type-checked language for the rest of us:

    http://rise4fun.com/specsharp/


If you want to see a dependently typed language (technically, though the term often doesn't include this class of languages in practice) that enforces smaller sets of properties more elegantly, take a look at Liquid Haskell:

http://goto.ucsd.edu/~rjhala/liquid/haskell/blog/about/

The core idea here is called the refinement type, which is basically what you're looking for. To power this type system, though, Liquid Haskell depends on a full SMT solver, which is a rather large hammer to wield.
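A refinement type in Liquid Haskell looks roughly like this (the `{-@ ... @-}` annotation is checked by the external LH tool; to GHC it is an ordinary comment, so the code below compiles and runs as plain Haskell either way; `safeSqrt` is a made-up name):

```haskell
-- Liquid Haskell refinement: callers must pass a provably
-- non-negative Double; the SMT solver discharges the obligation
-- at check time, with no runtime cost.
{-@ safeSqrt :: {v:Double | v >= 0.0} -> Double @-}
safeSqrt :: Double -> Double
safeSqrt = sqrt
```

Under LH, `safeSqrt (-1.0)` is a compile-time refinement error rather than a NaN at runtime.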

As for Peano arithmetic in DT languages: the Peano naturals data type is awkward for doing math... but FANTASTIC for defining the properties of recursion and induction. Indeed, what you find is that the Peano nats show up everywhere as a stable foundation for complex proofs and smart recursive functions.

It's a very similar story to Haskell's lists. Competent Haskell programmers do not use lists when they want arrays or vectors; the latter types are far more convenient in that application space. But lists are not abolished! They're fantastically useful as control structures indicating sequential recursion.

Typically you think of data types as forming their meaning through construction. You use numbers by constructing them according to arithmetic properties. Another way to do it, though, especially with recursive ADTs and inductive families, is to think of a type's meaning as being derived from how you break it down.

The canonical way to break down a list is to find some way of combining the head of the list with the broken-down version of the tail of the list. In Haskell that is called foldr

    --       combiner         zero list   result
    foldr :: (a -> b -> b) -> b -> [a] -> b
and you can see its canonicity by thinking of it as simply replacing the constructors for a list by the arguments to foldr.

    let ls = Cons 1 (Cons 2 (Cons 3 Nil))
    
    foldr comb zero ls =
             comb 1 (comb 2 (comb 3 zero))
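The same constructor-replacement is runnable with Haskell's built-in lists: `foldr (+) 0` rewrites `1 : (2 : (3 : []))` into `1 + (2 + (3 + 0))`, and folding with the constructors themselves gives the list back unchanged:

```haskell
-- Replace (:) with (+) and [] with 0: the fold computes the sum.
sumList :: [Int] -> Int
sumList = foldr (+) 0

-- Replace the constructors with themselves: the fold is the identity.
copyList :: [a] -> [a]
copyList = foldr (:) []
```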


I think all of the complications with your proposal can be eliminated if you change "float y;" to "float<positive> y;". If you tell the compiler what type you actually want to use, it will check all of the constraint propagation for you. This is exactly what Haskell, ML, Idris and friends will do. In fact, in Haskell and ML you would be better off leaving out the declarations altogether; the compiler would spot that you've used "fabs", so it will infer "float<positive>", and unify this with the "float<positive>" argument of "sqrt". In Idris, you could declare that the fabs function converts "float" to "float<positive>" and it would automatically insert calls to fabs when you try passing floats to sqrt.

No need for runtime checks/errors, no need to solve the halting problem, etc.



