It's funny...when I read that bit, all I could think was, how much nicer would i...

zedshaw · on Feb 25, 2012

Regex aren't difficult to learn, it's just nobody teaches them as a language with a base syntax and words to use. If you just sit down and memorize the names of a few symbols, then learn what each does, it becomes fairly clear.

It's my belief (totally unfounded) that learning a simple symbolic language like regular expressions teaches you how to handle other symbolic languages like mathematics, chemistry, and programming. That's one of the reasons I'm teaching it and trying to get other people to use it.

More importantly though, they are damn handy. As long as you don't abuse them in places where a lexer+parser is better, you can get a lot done with very little regex in a very short time.

jacobolus · on Feb 25, 2012

Unfortunately, the actual syntax of regexps is far from ideal. As an example, it’s completely stupid that non-capturing groups must be written as (?:…) when they are by far the common case.

Larry Wall’s writings on this general topic are fairly convincing. http://www.perl.com/pub/2002/06/04/apo5.html?page=2

zedshaw · on Feb 25, 2012

I don't consider that a failing of regular expressions (which predate perl), but a failure of implementation. I've thought that instead of syntax it should be an API option that says "this is a matcher" vs. "this is a capture". Then the same regex works for both, it's just how you run it.

eurleif · on Feb 26, 2012

So would it not be possible to mix capturing and non-capturing groups in the same regex? That's useful to do if you have code that expects specific things to be captured at specific group indices, and you need to add grouping somewhere else in the regex without messing it up.
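
A minimal sketch of the index problem described here, in JavaScript. The `key=value` pattern and the optional `--` prefix are made up for illustration:

```javascript
// Callers expect the key in group 1 and the value in group 2.
const original = /(\w+)=(\w+)/;

// Grouping the optional prefix with a capturing group shifts every index by one.
const shifted = /(--)?(\w+)=(\w+)/;

// A non-capturing group adds the same grouping without renumbering anything.
const stable = /(?:--)?(\w+)=(\w+)/;

const input = "--color=red";
const a = original.exec(input);
const b = shifted.exec(input);
const c = stable.exec(input);

console.log(a[1], a[2]); // key and value at the expected indices
console.log(b[1], b[2]); // group 1 is now the prefix, so the key moved to group 2
console.log(c[1], c[2]); // key and value stay where the calling code expects them
```

With a single API-level switch between "matcher" and "capture", the `stable` variant could not be written, since it needs both kinds of group in one pattern.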

gghh · on Feb 25, 2012

::Regex aren't difficult to learn, it's just nobody teaches them as a language with a base syntax and words to use::

I couldn't agree more. Because of the lack of a "language" approach to them, their weird syntax, and the fact that the "verbose" mode for defining them is almost unknown, they come across as a sort of voodoo that only gurus can handle. Moreover, this results in tons of broken code in production. They're simple, beautiful and handy, but carry an unfortunate historical load.
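
For readers who have not seen verbose mode: Perl's `/x` flag and Python's `re.VERBOSE` let a pattern be spread over several lines with comments. JavaScript's built-in `RegExp` has no such flag, so the sketch below emulates it with a hypothetical helper, `verboseRegex`. It is deliberately naive and assumes the pattern contains no literal `#` or significant whitespace:

```javascript
// Hypothetical helper, not a built-in: strips "#" comments and all whitespace
// from the pattern text before compiling it.
function verboseRegex(pattern, flags = "") {
  const compact = pattern
    .replace(/#.*$/gm, "") // drop everything from "#" to the end of each line
    .replace(/\s+/g, "");  // drop all remaining whitespace, including newlines
  return new RegExp(compact, flags);
}

// String.raw keeps the backslashes in \d intact inside the template literal.
const isoDate = verboseRegex(String.raw`
  ^
  (\d{4})   # year
  -
  (\d{2})   # month
  -
  (\d{2})   # day
  $
`);

const parts = isoDate.exec("2012-02-25");
console.log(parts[1], parts[2], parts[3]);
```

Each symbol gets its own line and a name next to it, which is roughly the "learn the words" approach described above.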

ufo · on Feb 26, 2012

What I miss the most about regexes (and I think this is kind of what bermanoid was hinting at) is that we don't have access to much of the expressiveness we usually have available in a programming language. For example, I have never seen a widely used regex library that takes advantage of the algebraic structure of regular expressions and that would let me do things like incrementally building regexes or creating named constants:

    var regex1 = /some_regex/,
        regex2 = /other_regex/;

    var regex3 = alternative(regex1, regex2);
    var regex4 = kleene_star( sequence(regex1, regex2) );
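
A rough sketch of how such combinators could be layered on top of JavaScript's `RegExp` by composing the `source` strings. The names `sequence`, `alternative` and `kleene_star` come from the snippet above; their implementations here are assumptions, not an existing library, and they ignore flags on the input regexes:

```javascript
// Accept either a RegExp or a pattern string and return the pattern text.
const src = (r) => (r instanceof RegExp ? r.source : String(r));

// Each operand is wrapped in a non-capturing group so that operator
// precedence survives the string concatenation.
function sequence(...parts) {
  return new RegExp(parts.map((p) => `(?:${src(p)})`).join(""));
}

function alternative(...parts) {
  return new RegExp(parts.map((p) => `(?:${src(p)})`).join("|"));
}

function kleene_star(r) {
  return new RegExp(`(?:${src(r)})*`);
}

// Named constants built up incrementally.
const digit = /[0-9]/;
const letter = /[a-z]/;
const alnum = alternative(digit, letter);
const identifier = sequence(letter, kleene_star(alnum));

// Anchor the result so it must match the whole string.
const anchored = new RegExp(`^(?:${identifier.source})$`);

console.log(anchored.test("x42"));
console.log(anchored.test("42x"));
```

Because the helpers only use non-capturing groups, composing patterns this way does not disturb the numbering of capture groups inside the operands.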

riffraff · on Feb 25, 2012

You would _love_ Perl 6 rules.