I actually did a very similar test to this recently and got similar results (though mine was just between Go and Python)... in retrospect the reason is obvious: in Python I was using python-cjson, a heavily optimized C implementation of a JSON parser, while in Go I was using a 100% Go implementation.
In a certain way this shows nothing, since we all know C is fast. But in another way it convinced me that Python is actually fine for most of what I'm doing, since it's easy to drop into C for the parts that need performance. And even more importantly, there are often already C libraries with Python bindings that do the thing I'm looking for (i.e. parsing, numpy, etc.).
It's interesting that Python has a C library for this, which indicates that someone thought Python couldn't cut it for this task. Yet the PyPy guy (in the comments) claims that with a bit of tweaking they got the pure python JSON encoder working faster than the C library (plus python/c interface overhead I assume) and that the same could probably be done for decoding too.
Python's USP seems to be the ability to drop down to C or C++, PyPy's is that it's fast enough to keep everything in high-level Python (at a cost in memory) and Go's is that it's low-level enough that you can do the C level optimisation in Go (though as yet they haven't for this particular case).
I think the time for the traditional Python approach has passed, as newer technology lets you be fast enough for many tasks without leaving the language, and even Go is aimed at a fairly low level. PyPy is cool but is somewhat chained to assumptions in Python (though this lets them build on that wide legacy). So it makes me wonder what the new Python/Ruby/Perl is going to be. Possibly something written with PyPy's toolchain to get a JIT? Is there a "here's what we can do if we get to rewrite the rules to suit our tools" language in the PyPy family? What are the other contenders? All the ones that spring to mind are rewrites of existing languages.
Languages like Julia, Rust or Clay, which are high level with little more complexity than Python yet generate LLVM code, seem powerful. That general model is the closest to a viable C replacement I have seen yet.
In addition, personally, I'd much rather invest my time learning on top of the flexible and fully open LLVM technologies than on top of Google's proprietary, little-better-than-Java constrained language.
For current practical purposes though, the Python/C combo is more than sufficient for many needs.
Does Python let you optimize easily? Not in my experience, unless you define "easily" as pushing the critical path into C libraries. (Not an unreasonable approach, just not one I'd call "easy," and then you're not using Python anymore.)
Go does give you the tools you need to profile, understand, and then fix performance issues without switching languages.
Profiling in Python isn't hard either. But yeah, you really can't get the performance of a statically typed language in pure Python (well, maybe with PyPy). However, pushing the critical path into a C library doesn't have to be as horrible as it sounds (for a Python developer at least). The Cython project can be very helpful: you just annotate critical variables/functions with types and compile the now-Cython code to C. That C code will be pure C in the parts you typed, and a bunch of ugly (but commented) calls into Python libraries in the pure-Python parts. Then you just throw it at gcc and you are done. The best thing is that you can mix the typed and untyped code and pass variables around as if it were all pure Python. Example from the docs: http://docs.cython.org/src/userguide/tutorial.html#primes
This isn't a benchmark of the languages, but of the libraries that ship with them. That's fine, but be careful not to confuse that with a language benchmark.
Given Go is a compiled language I am rather surprised at its low speed in this benchmark. I am not familiar enough with the language to take a look at the benchmark code, has anyone else done so?
I can't help wondering if the benchmark code is falling foul of Go's garbage collection which I understand from discussions here is somewhat less intelligent than Python's GC.
As seems to always be the case with these fairly naive benchmarks. (Not specifically about Go here; it seems like people often cite Python when they're really using some optimized C code.)
You could use one of the C json implementations such as json-c. However as has been pointed out you will still be just comparing libraries rather than the languages themselves.
It certainly could be more optimized.
For example, all numeric literals are converted to strings before being parsed. That's a mostly unnecessary copy/allocation per number.
Similarly, it doesn't make use of the fact that a []T can only legally contain Ts. It's all very clean and generic, which is great, because when working on the compiler they can work on making good code faster without being misled by code that tries to work around a lack of optimizations.
As an experiment, I replaced the []int in the Go code in the link with an IntList that implements json.Unmarshaler and does single-pass integer list parsing there (without overflow checking, but eh). With that quick 40-line change, the Go code was faster than the CPython version, at least on my machine.
That and similar type-specific handling could be built into the json package fairly easily, and would make those cases quite a bit faster. Then again, I bet the Go team would have a better and more general idea.
I spun the JSON decoding off into a few goroutines and managed to get it down to ~1 second (but the output was out of order, so it didn't quite solve the same problem, though this is fixable). This was ~10 lines of Go (and only using a single core).
This was my bad; I have updated the post. I should have been unmarshaling to a more comparable datatype (map[string]interface{}), and now the performance is more in line with what I would expect. I need to look into the Unmarshal code to see why this is so slow.
I appreciate why you use "golang" instead of the general "Go" or current release "Go1", but the former is the name of the open source organization, and the latter is the name of the language.
The Go program allocates a new item on every line instead of allocating a single struct and just passing a pointer into Unmarshal, which would be way faster. There's some claim of "doing async style programming" but the Go code doesn't parallelize anything, so... not really a fair Go entry there. The input file is also not supplied.
edit: not to mention there are things like adding individual sums multiple times that suggest that the author didn't check the output to verify that all programs arrived at the same output...
I wrote the article because I write a lot of web applications in which the vast majority of requests are JSON. I am about to do a lot of post-hoc analysis on JSON sent from the web browser, so it is valuable for me to know how fast a language's (library's) JSON parser is. For a highly concurrent webserver it is good to know how fast and efficient the JSON parser is, since it will be doing a lot of concurrent parsing of JSON.
When you're working with streaming JSON like this (or in a web app), it might be easier to use a decoder/encoder. See http://play.golang.org/p/TLNORK2WK9 for an example.
Also fwiw, it has never been my experience that JSON decoding/encoding becomes the bottleneck in a web app. I/O (to a database, or the filesystem) is by far the largest bottleneck in any app I've profiled. A good benchmark for a web app language is hard to write, because it tends to depend on (unreliable) I/O-bound systems.
I understand the reasoning behind this benchmark, but the Python results aren't super relevant because the library is written in C.
What is interesting is the Dart/Go comparison. Dart's VM still doesn't have a lot of the optimizations that will ultimately be in there, and it's keeping pace pretty well with Go's much more mature compilers (which are based on the Plan 9 compilers).
Can anyone talk about the state of the compiler in Go? How good/bad is the optimizer? That's not something you usually address early in language design.
Go's gc compiler is relatively young and makes few optimizations (the compiled code still performs well; this particular benchmark compares Go's unoptimized json package with a mature C library). The gccgo compiler can take advantage of all of gcc's code generation optimizations, and its performance reflects this.
I actually like Dart and Go more than JavaScript, but JavaScript still has a leg up when it comes to libraries. If Go or Dart gets a following I would be happy to look into it.