First up - I'll preface my reply below with a big disclaimer that I'm a relative novice with Haskell, so these are purely my opinions at this point in my learning curve.
> What's "it" - Haskell, or referential transparency? Referential transparency definitely has its victims, and debugging is one of them. Debug.Trace is quite useful, and also violates referential transparency. That Haskell provides it is an admission that strict R.T. is unworkable.
I'd disagree that this is any real attack on the merits of referential transparency, since Debug.Trace is not part of application code. It violates referential transparency in the same way an external debugger would. It's an out of band debugging tool that doesn't make it into production.
> Baloney! Haskell's laziness makes the order of execution highly counter-intuitive. Consider
I wouldn't say it makes order of execution highly counter-intuitive, and your above example is pretty intuitive to me. But expanding your point, time and space complexity can be very difficult to reason about - so I'll concede that's really a broader version of your point.
> Haskell returns a new pointer to a buffer, while other versions need to copy into a new buffer? This is nonsense.
C uses null-terminated strings, so in order to extract a substring it must be copied. C also has mutable strings, so standard library functions would need to copy even if the string were properly bounded.
Java uses bounded strings, but still doesn't share characters. If you extract a substring, you're getting another copy in memory.
Haskell, using the default ByteString implementation, can do a 'substring' in O(1) time. This alone was probably a large part of the reason Haskell came out ahead - it wasn't computing faster, it was doing less.
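To make that concrete, here's a minimal sketch (using the bytestring package that ships with GHC). `take` and `drop` on a strict ByteString just adjust an offset and a length over the same underlying buffer, so a "substring" copies nothing:

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8

main :: IO ()
main = do
  let s = C8.pack "hello, world"
      -- take/drop are O(1): each builds a new ByteString record
      -- pointing into the same underlying buffer, copying no bytes.
      sub = BS.take 5 (BS.drop 7 s)
  C8.putStrLn sub  -- prints "world"
```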
Obviously in Java and C you could write logic around byte arrays directly, but this point was for a naive implementation, not a tuned version.
> This would seem to imply that Haskell will "read ahead" from a file. Haskell does not do that
It would seem counter-intuitive that the standard library would read one byte at a time. I would put money on the standard file operations buffering more data than needed - and if they didn't, the OS absolutely would.
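This is easy to check for yourself. A small sketch (the file name here is made up, and the exact default can vary by platform and handle type) showing that GHC gives ordinary file handles block buffering out of the box:

```haskell
import System.IO

main :: IO ()
main = do
  writeFile "demo.txt" "hello"
  h <- openFile "demo.txt" ReadMode
  -- GHC defaults regular file handles to block buffering, so even a
  -- byte-at-a-time consumer is served from an in-memory chunk.
  mode <- hGetBuffering h
  print mode
  hClose h
```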
> Like laziness, immutability is almost always a performance loss.
On immutability -
In a write-heavy algorithm, absolutely. Even Haskell provides mutable data structures for this very reason.
But in a read-heavy algorithm (such as my example above) immutability allows us to make assumptions about the data - such as the fact that it'll never change. This means that the standard platform library can, for example, implement substring in O(1) time instead of having to make a defensive copy of the relevant data (lest something else modify it).
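On the write-heavy side, those mutable structures are perfectly ordinary to use - a toy sketch with Data.IORef from base, updating a counter in place instead of allocating a fresh value each step:

```haskell
import Data.IORef

main :: IO ()
main = do
  -- A mutable cell, modified in place on every iteration -
  -- the escape hatch Haskell offers for write-heavy code.
  ref <- newIORef (0 :: Int)
  mapM_ (\n -> modifyIORef' ref (+ n)) [1 .. 100]
  total <- readIORef ref
  print total  -- prints 5050
```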
On Laziness -
I'm still relatively fresh to getting my head around laziness, so take this with a grain of salt. But my understanding, from what I've been told and from some personal experience:
In completely CPU-bound code, laziness is likely going to be a slowdown. But laziness can also make it easier to write code in ways that would be difficult in strict languages, which can lead to faster algorithms with the same effort. In this particular example, it was much easier to write this code using streaming non-blocking IO than it would be in C.
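The streaming code itself isn't reproduced here, but the classic toy example shows the shape of it - with laziness you define the whole (even infinite) structure up front and let the consumer decide how much of it to evaluate:

```haskell
-- An infinite list of Fibonacci numbers. Only the demanded prefix
-- is ever computed, so this terminates despite being unbounded.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (takeWhile (< 100) fibs)  -- prints [0,1,1,2,3,5,8,13,21,34,55,89]
```

Lazy IO and streaming libraries apply the same idea to file handles: the "whole file" value exists, but chunks are only read as the consumer demands them.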
> It is not especially difficult to write a referentially transparent function in C. Haskell gives you more confidence that you have done it right, but that measures correctness, not performance.
Except that GHC can do some clever optimizations with referential transparency that a C compiler (probably) wouldn't - such as safely evaluating naively written pure code across multiple cores.
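A hedged sketch of what that buys in practice, using the parallel library (a separate package, not automatic compiler magic; the `expensive` workload is made up for illustration): because the mapped function is pure, swapping `map` for `parMap` cannot change the result, only where it gets evaluated.

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- A pure, hypothetical workload: no side effects, so its calls
-- can be evaluated in any order, on any core, with the same result.
expensive :: Int -> Integer
expensive n = sum [1 .. fromIntegral n]

main :: IO ()
main = print (sum (parMap rdeepseq expensive [1000, 2000, 3000]))  -- prints 7003000
```

Compiled with `-threaded` and run with `+RTS -N`, the three calls can be evaluated on separate cores; without those flags the same code runs sequentially and still prints the same answer. Try doing that to naively written C.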
> When it comes to parallelization, it's all about tuning. Haskell gets you part of the way there, but you need more control to achieve the maximum performance that your hardware is capable of. In that sense, Haskell is something like Matlab: a powerful prototyping tool, but you'll run into its limits.
I completely agree. If you need bare-metal performance, then carefully crafted C is likely to still be the king of the hill for a very long time. Haskell won't even come close.
But in day to day code, we tend not to micro-optimize everything. We tend to just write the most straightforward code and leave it at that. From my experience so far, for the kinds of workloads I'm giving it (IO-bound CRUD apps, mostly), Haskell tends to produce surprisingly performant code under these conditions. I'm under no illusion that it would even come close to C if it came down to finely tuning something, however.