There's a lot of good information in the article for software people to whom hardware design is foreign, but I think the author makes some strange points. For starters, he refers to Verilog as a programming language multiple times throughout the article. Anyone who has ever tried to do even minimal work on hardware will know this is the wrong way to think about things.
Verilog, VHDL, and whatever other hardware description languages should not be approached as programming languages, or you'll have a bad time. You need to think about what you are generating. HDLs let us avoid messing with crappy schematic-capture interfaces where you drag chips around and connect wires by hand, but ultimately you really need to know what's being generated. If you write a couple of lines with no idea of the hardware behind them, you'll likely make a lot of mistakes. Everyone who has tried messing with FPGAs without thinking about this has ended up with hundreds of inferred latches, among other niceties... There's already some degree of syntactic sugar in VHDL processes (or Verilog always @ blocks) that makes it really easy to shoot yourself in the foot by abstracting things away.
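To make the latch problem concrete, here's a minimal sketch (signal names are mine, purely for illustration): a combinational always block with a missing else means the output has to hold its old value, so the synthesizer has no choice but to infer a latch.

    // Incomplete combinational logic: no else branch, so q must remember
    // its previous value when en is low -> the tool infers a level-
    // sensitive latch instead of pure combinational logic.
    module latch_demo (
        input  wire en,
        input  wire d,
        output reg  q
    );
        always @(*) begin
            if (en)
                q = d;
            // missing: else q = ...; -- hence the inferred latch
        end
    endmodule

Adding a default assignment at the top of the block, or a full if/else, makes it purely combinational again.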
Ultimately you'll want to know what's going on rather than not. Hardware is hard, but when you need the performance you'll do it the right way, or you might as well save yourself a lot of trouble and just go with a beefier machine and some optimized C code.
I agree that you can write horrible Verilog by not understanding what it compiles down to; but you can just as well write horrible C by not understanding what it compiles down to, not to mention C++, not to mention languages where everything costs a ton asymptotically (like copying lists all the time: list(reversed(values)) in Python, etc.)
With hardware, some people see this as particularly preposterous because of just what it is that you're wasting; but it's not that different, really. In my view, for instance, C is overly high-level because you can write ptr - ptr2, the compiler divides by sizeof, and you have a division sitting in there that you don't see. Well, in Verilog you can write x % y and it synthesizes, and I worked on chips that went into mass production with this idiocy instead of using the fact that y was a power of 2. But it works just fine, because it's just one piece of idiocy in a big design that is not all made of idiocy.
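For what it's worth, here's roughly what that looks like (a sketch with made-up widths, not the actual design): when y is a compile-time power of 2, the modulo and a plain bit-select mean the same thing, and a tool that doesn't spot this will happily build real remainder logic for the first line.

    wire [15:0] x;
    // What we shipped, morally speaking: generic remainder logic.
    wire [15:0] r_slow = x % 16'd256;
    // What it should have been, given that 256 is a power of 2:
    wire [15:0] r_fast = {8'b0, x[7:0]};   // just wires, no logic at all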
The upshot is, as I said in there, some people will consider my perspective a tad strange, but yeah, Verilog is a programming language :) with all the usual virtues and vices of one.
The difficulty I had was not writing horrible Verilog, but wrong Verilog. My mental model of computation was so shaped by stored-program computers that it took me a long time to build the correct mental model that let me write correct Verilog. And, as the parent poster pointed out, that very much involved thinking about physical things being statically laid out.
I know that on Xilinx's FPGA toolchain, x%y cannot be synthesized unless y is both constant and a power of 2, in which case it just takes the n least significant bits as though you'd written that in the first place. Are you sure the tools you were using don't do the same thing?
HDL development is kind of weird in some ways compared to conventional coding though. For example, the most portable way to make use of the SRAM hard macros that all modern FPGAs have is like this: you write HDL that describes how the SRAM behaves, and if you describe it in the right way it's replaced with actual SRAM through some magic of the synthesis tools.
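Concretely, the pattern looks something like this (a generic sketch, not any particular vendor's template, though most synthesis tools document something very close): a registered array with a synchronous read is the shape the tools pattern-match into block RAM.

    module inferred_ram #(parameter AW = 10, DW = 8) (
        input  wire          clk,
        input  wire          we,
        input  wire [AW-1:0] addr,
        input  wire [DW-1:0] din,
        output reg  [DW-1:0] dout
    );
        reg [DW-1:0] mem [0:(1<<AW)-1];
        always @(posedge clk) begin
            if (we)
                mem[addr] <= din;
            dout <= mem[addr];   // synchronous read: the cue for RAM mapping
        end
    endmodule

Make the read asynchronous instead and many tools will fall back to building the memory out of registers and muxes.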
This is much worse in ASICs, where there's no standardized SRAM interface and you have built-in self-test (BIST) wires coming in that uglify each non-standard interface tremendously.
HDLs have their share of warts - pragmas in comments, vital information in non-portable synthesis scripts, etc.; well, C has its own pragmas and vital info in linker scripts and stuff as well... Verilog is in some ways much prettier because for instance it compiles down to itself - the output of synthesis tools, a netlist, is Verilog source. In C the closest approximation is int main[] = {non-portable hexadecimal representation of machine instructions}.
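To illustrate "compiles down to itself": synthesis output is just structural Verilog, instances of library cells wired together. A toy hand-written example in that style (the cell and pin names here are invented; real ones come from the target library):

    module and_buf_netlist (input a, b, output y);
        wire n1;
        AND2_X1 u1 (.A(a), .B(b), .ZN(n1));  // invented cell names/pins
        BUF_X1  u2 (.A(n1), .Z(y));
    endmodule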
I'm curious about the ASIC comment. Surely no one is "synthesizing" SRAM in ASIC by describing its behavior with an HDL, right? You just decide at design time to use an SRAM array of whatever size, then (for simulation) just stub it out with an appropriate HDL model that gets replaced during synthesis. The "SRAM" abstraction in that model is provided by the semiconductor fab, no?
At least, that's how I've always assumed things work.
It works as you describe, pretty much, the gnarly bit being that there's no standard interface for an SRAM (that is, SRAM is not in the standard library, unlike, say, simple gates). One reason is the many different ways to implement BIST and connect the BIST signals from all memories in the chip into a coherent whole.
Here's an article (not written by me, just found using Google) that has an example of how the VHDL/Verilog compiler infers a dual-ported SRAM from code that describes its behavior, including the bugs you'll encounter.
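The inference pattern for the dual-ported case is roughly this (my own sketch of the usual template, not the article's code): one shared array touched from two clocked processes. The read-during-write behavior between the two ports is exactly where the bugs the article mentions tend to live.

    module dp_ram #(parameter AW = 9, DW = 8) (
        input  wire          clk,
        input  wire          we_a,
        input  wire [AW-1:0] addr_a, addr_b,
        input  wire [DW-1:0] din_a,
        output reg  [DW-1:0] dout_a, dout_b
    );
        reg [DW-1:0] mem [0:(1<<AW)-1];
        always @(posedge clk) begin          // port A: write + read
            if (we_a)
                mem[addr_a] <= din_a;
            dout_a <= mem[addr_a];
        end
        always @(posedge clk)                // port B: read only
            dout_b <= mem[addr_b];           // hazard: value is ambiguous if
                                             // port A writes this address now
    endmodule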
Verilog is in some ways much prettier because for instance it compiles down to itself - the output of synthesis tools, a netlist, is Verilog source.
This is a terribly lame analogy. You could have the C compiler output C that exactly resembles what the actual CPU instructions do. And even write an assembler for that if you wanted.
I think it would be absolutely awesome if the assembly output from my C compiler were syntactically valid and semantically equivalent C. Among other things, it would enable me to compile it (with a C compiler rather than an assembler) to run on a different CPU architecture. And then I'd just need a disassembler to generate it in the first place.
Unfortunately this isn't possible because you can do a lot of things in assembly that you can't do in C. RET, say.
Yup. For a university project I worked on a simple game AI (Connect6) implemented in Verilog. Initial prototyping in Python was super easy and quickly done; translating the same program to Verilog took multiple weeks. We split the program up into multiple modules, each handling different cases. While writing it we only compiled the modules we were currently working on, and even that took long enough (remember: we were trying to more or less directly translate the Python program to Verilog, with a bit more global state). Once we finished, we tried to compile the whole program with every module enabled, and it compiled and compiled and compiled. We aborted after 20 hours with no end in sight; the design we were using was just not suited to hardware.
Now, this was the first time I ever worked with hardware, and while I really liked having something tangible and all those nice blinking lights on the FPGA board, programming one was a pain in the ass. Towards the very end of the project I slowly got the hang of how you should approach writing such software. And the most important thing is to not think like you would when programming traditionally.
Designing and then implementing an FSMD (a finite-state machine with a datapath), for example, is far more efficient.
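For anyone wondering what that means in practice, here's a minimal FSMD sketch for a toy task (summing N streamed values; all names and widths are illustrative, not from our project): the control states and the datapath registers are spelled out explicitly, instead of being translated line by line from sequential code.

    module sum_fsmd #(parameter N = 8, W = 16) (
        input  wire         clk,
        input  wire         rst,
        input  wire         start,
        input  wire [W-1:0] data_in,
        output reg  [W-1:0] sum,
        output reg          done
    );
        localparam IDLE = 2'd0, RUN = 2'd1, DONE = 2'd2;
        reg [1:0]         state;
        reg [$clog2(N):0] count;

        always @(posedge clk) begin
            if (rst) begin
                state <= IDLE; sum <= 0; count <= 0; done <= 1'b0;
            end else case (state)
                IDLE: if (start) begin          // control: wait for go
                          sum   <= 0;
                          count <= 0;
                          done  <= 1'b0;
                          state <= RUN;
                      end
                RUN:  begin                     // datapath: accumulate
                          sum   <= sum + data_in;
                          count <= count + 1'b1;
                          if (count == N-1)
                              state <= DONE;
                      end
                DONE: begin
                          done  <= 1'b1;
                          state <= IDLE;
                      end
                default: state <= IDLE;
            endcase
        end
    endmodule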