To be fair, monadic code does not necessarily have side effects. It's just a useful way of thinking about computational steps. `Maybe` is a monad that is completely pure and has no side effects: http://www.haskell.org/ghc/docs/latest/html/libraries/base/s...
I understand that this article is geared towards beginners, so this is probably just a simplification on the author's part. Other than that it's a great article.
There is a highly theoretical point of view which states that a "side effect" is anything other than a function A → B which takes an A and returns a B. This includes things like exceptions (A → err + B, which is Haskell's Either), state (A → (S → B * S), Haskell's State monad), and Maybe (A → () + B, as mentioned). In fact, Haskell is only mostly pure, because it contains the nontermination effect (A → ⊥ + B, i.e. functions might not always return a value because of infinite loops).
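To make those shapes concrete, here is a rough sketch of them as plain Haskell types. The synonym names (Pure, Failing, Stateful, Partial) and the three example functions are made up for illustration; they are not from any library.

```haskell
-- Hypothetical type synonyms, named only for this illustration.
type Pure a b       = a -> b               -- A → B
type Failing e a b  = a -> Either e b      -- A → err + B
type Stateful s a b = a -> (s -> (b, s))   -- A → (S → B * S)
type Partial a b    = a -> Maybe b         -- A → () + B

-- "Maybe" effect: the function may decline to produce a result.
safeRecip :: Partial Double Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

-- "Exception" effect: failure carries an error value.
checkAge :: Failing String Int Int
checkAge n
  | n < 0     = Left "negative age"
  | otherwise = Right n

-- "State" effect: the result is a function from old state to (value, new state).
labelWithCounter :: Stateful Int String String
labelWithCounter label = \n -> (label ++ show n, n + 1)
```

Each of these is an ordinary function once you unfold the synonym; the "effect" only exists in how you choose to read the return type.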
It's true that most of these "effects" are implemented in terms of pure functional code, e.g. the state monad—which models a very imperative construct—is also pure and has no side effects from the implementation's point of view[1].
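For anyone who hasn't seen it, a minimal hand-rolled state monad looks roughly like the sketch below. This is a simplified version written for this comment, not the definition from the transformers or mtl packages (those are built on StateT), and tick and threeTicks are made-up examples.

```haskell
-- A simplified State type: just a wrapper around s -> (a, s).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  -- Thread the state from the first step into the second.
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Reads like "n = counter; counter = n + 1; return n".
tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  pure n

threeTicks :: State Int [Int]
threeTicks = do
  a <- tick
  b <- tick
  c <- tick
  pure [a, b, c]
```

Running `runState threeTicks 10` gives `([10,11,12],13)`. Nothing is mutated anywhere; every step is a function from the old state to a pair, yet tick reads exactly like imperative code.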
The point is that from one point of view, these are 100% pure: "Of course, it's implemented in pure terms, so of course it's pure!" There's another point of view that they're entirely impure: "I'm manipulating state in this function, so of course it's impure!" All of which is to say, if you ignore the plumbing, the water appears out of nowhere. [invocation of Clarke's 3rd law excised for triteness]
The same could be said about Haskell itself: "Of course it's impure, because there's a call stack being destructively modified as it runs!" (Conal Elliott went the other way and suggested that C was a pure functional language[2].) It's just that, of all the monads, some of them (e.g. the IO monad, the X monad, &c) use some kind of "magic" to interface with external effectful functions, while other ones mimic state using pure functional constructs. Still—to a programmer, the Cont monad appears to jump throughout your code, making it "effectful" from an appropriate level of abstraction.
The problem with the simplification is that, to read a piece of monadic code, you really need to know which monad it is using.
`foo >>= bar >>= baz` behaves very differently in the IO monad (where it's like a shell pipe `foo | bar | baz`) and in the Maybe monad (where it's like a short-circuiting `foo && bar && baz`).
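A small sketch of that difference, using one pipeline definition at two different monads. All the names here (pipeline, half, maybeOk, maybeStop, ioRun) are invented for the example.

```haskell
-- The same shape of code, usable in any monad.
pipeline :: Monad m => m a -> (a -> m b) -> (b -> m c) -> m c
pipeline foo bar baz = foo >>= bar >>= baz

-- A step that can fail: only even numbers can be halved.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- In Maybe, every step succeeds: 12 -> 6 -> 3.
maybeOk :: Maybe Int
maybeOk = pipeline (Just 12) half half

-- In Maybe, the last step fails (3 is odd), so the whole result is Nothing.
maybeStop :: Maybe Int
maybeStop = pipeline (Just 6) half half

-- In IO, each step runs in order and can print along the way.
ioRun :: IO Int
ioRun = pipeline (pure 12)
                 (\n -> print n >> pure (n + 1))
                 (\n -> print n >> pure (n * 2))
```

Here maybeOk is `Just 3` and maybeStop is `Nothing`, while ioRun prints 12 and then 13 before returning 26. The line `foo >>= bar >>= baz` is identical in all three; only the monad changes what it means.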
Not really... you're perfectly capable of reading Python code:
x = foo()
y = bar(x)
baz(y)
You read that knowing that any of these functions could throw an exception. How is it any different, from the reader's perspective? (Yes, it's /very/ different from an implementation perspective, but that's not what we're dealing with here...)