I was recently discussing something that involved knowing whether a string consi...

Merad · on Dec 22, 2022

To be honest this sounds like the very definition of "you solved the problem with regex, and now you have two problems". In essentially every language in existence the question you're asking can be solved with a simple loop and perhaps 3-4 lines of code. In many languages it's easily expressed as a one-liner, e.g. (with C#) `s.All(c => c == s[0])`.

I really don't get the love affair that some devs have with regex. In my 10 year career I don't think I've run into more than a dozen problems in a production system that _required_ regex to solve. When you're working with robust modern languages there's almost always a solution other than regex that's significantly easier to understand + maintain, and probably a lot more performant to boot. Is regex useful for other things, especially cli stuff like grep and sed? Oh yes absolutely. But generally speaking I really don't want it in my code base unless there's no other choice.

RadiozRadioz · on Dec 22, 2022

10 years and a dozen problems? Conversely, I encounter pattern matching problems and use RegEx near daily. Both these perspectives are anecdata, neither is useful.

> and probably a lot more performant to boot

I highly doubt your home-grown pattern matching functions could beat the decades of optimization that have gone into RegEx engines, in anything but the most trivial of patterns (like the one demonstrated here). Creating your own ad-hoc pattern matcher instead of using the ubiquitous one built into your language is like the junior in the article re-implementing JOINs. Sure, you may be able to beat the engine occasionally on particularly simple patterns, but I guarantee you'll lose out overall.

RegEx is not inherently slow, and it is definitely possible to maintain. See industries with serious text processing demands like bioinformatics, where Perl is still used extensively. They could not operate like they do if they shied away from RegEx like many developers seem to.

pixelbath · on Dec 22, 2022

> Creating your own ad-hoc pattern matcher instead of using the ubiquitous one built into your language

GP was using LINQ, which is a first-class language construct in C#. I'll grant that it may not be _as_ optimized as Perl's regex routines, but it's hardly ad-hoc or slow.

RadiozRadioz · on Dec 23, 2022

Yes, the linked example falls in the category of "particularly simple patterns" I mentioned. This strategy only works for such simple patterns; add in a single alternation with a common base and the naïve iteration strategy falls apart. Implementing a sensible algorithm to evaluate such a pattern would require much more code than a couple of brackets, would not benefit from RegEx caching, etc.

Merad · on Dec 22, 2022

> I highly doubt your home-grown pattern matching functions could beat the decades of optimization that have gone into RegEx engines

On the contrary, the All() method used here (which is part of the .Net standard library) is literally just a loop that evaluates each item in the collection to verify that they all match the predicate function. It'll be able to check hundreds if not thousands of characters in the time that it takes the regex engine to initialize and parse the pattern.

RadiozRadioz · on Dec 23, 2022

Interesting how you conveniently ignored the second half of my sentence. I will paste it here:

> in anything but the most trivial of patterns (like the one demonstrated here)

ttfkam · on Dec 23, 2022

In other words, you're not caching/reusing your regex patterns?

I think I see the cause of one of your recurring performance problems.
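For anyone who would rather measure than argue, here is a rough Python sketch that compiles the pattern once and times it against a plain loop. The pattern is only in the spirit of the one under discussion (an assumption, not a quote from the article), and the names `regex_check`, `loop_check` and `compare` are made up for illustration. No claim about which one wins; that depends on the engine, the input and the runtime.

```python
import re
import timeit

# Compiled once at import time, so the pattern is not re-parsed on every call.
# Assumed pattern: one captured character repeated, or the empty string.
# DOTALL lets "." match newlines so both checks agree on every input.
SAME_CHAR = re.compile(r"(.)\1*|", re.DOTALL)

def regex_check(s: str) -> bool:
    """True if s is empty or consists of a single repeated character."""
    return SAME_CHAR.fullmatch(s) is not None

def loop_check(s: str) -> bool:
    """Same check as a plain loop, like the C# All() call upthread."""
    for c in s:
        if c != s[0]:
            return False
    return True

def compare(s: str, number: int = 1000) -> dict:
    """Return the seconds each approach takes for `number` calls on s."""
    return {
        "regex": timeit.timeit(lambda: regex_check(s), number=number),
        "loop": timeit.timeit(lambda: loop_check(s), number=number),
    }

if __name__ == "__main__":
    print(compare("a" * 1000))
```

Python's `re` module also keeps its own cache of recently compiled patterns, so the gap between `re.compile` up front and calling `re.match(pattern, s)` each time is smaller there than in engines without such a cache.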

yyyk · on Dec 22, 2022

>reserve space for a 64-bit integer which I shall henceforth refer to as 'i'

In this scenario, regex processing should allocate more and be slower. The for loop is more optimal even if it takes more lines. There's probably some SIMD solution which would be the fastest.

marcosdumay · on Dec 22, 2022

It's probably slower by a constant amount. It also probably allocates a constant amount of memory.

In a compiled language, odds are that the regexp is faster and uses the same amount of memory.

cbm-vic-20 · on Dec 22, 2022

Let's see if I remember Perl regexps: '/': this is a regular expression. '^' at the beginning of a line, '(.)' match any character, and remember it for later. '\1' match the same character that you just remembered, '*' zero or more times. '$' then match the end of the line. '|' Or, '^$' match the beginning and end of the line with nothing in between.
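If it helps, here is that breakdown as a small Python sketch. I'm assuming the pattern being described is `^(.)\1*$|^$`, reconstructed from the walkthrough above rather than copied from the article, and `is_single_char` is just a name I picked.

```python
import re

# Pattern reconstructed from the description above (an assumption):
#   ^(.)   start of line, capture any one character
#   \1*    the captured character again, zero or more times
#   $      end of line
#   |^$    or: a line with nothing in it
SAME_CHAR = re.compile(r"^(.)\1*$|^$")

def is_single_char(s: str) -> bool:
    """True if the pattern accepts s (empty, or one character repeated)."""
    return SAME_CHAR.match(s) is not None

print(is_single_char("aaaa"))  # True
print(is_single_char("aab"))   # False
print(is_single_char(""))      # True
```

Two Python specifics to keep in mind: `.` does not match a newline unless `re.DOTALL` is passed, and `$` also matches just before a trailing newline, so `"aaa\n"` passes this check.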

edflsafoiewq · on Dec 22, 2022

all(c == str[0] for c in str)

tracker1 · on Dec 22, 2022

I think the only thing I might add is a guardrail for strings that are too long, and might behave really badly with regexp.
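Something like this, as a minimal Python sketch. The cap of 10,000 characters and the names `MAX_LEN` and `is_single_char_guarded` are hypothetical, and the pattern is my assumption about what is being matched; whether to raise or just return False on oversized input is a design choice.

```python
import re

# Hypothetical cap; tune it to whatever input sizes you actually expect.
MAX_LEN = 10_000

# Assumed pattern: one captured character repeated, or the empty string.
SAME_CHAR = re.compile(r"(.)\1*|", re.DOTALL)

def is_single_char_guarded(s: str, max_len: int = MAX_LEN) -> bool:
    """Refuse oversized input before handing it to the regex engine."""
    if len(s) > max_len:
        raise ValueError(f"input too long: {len(s)} > {max_len}")
    return SAME_CHAR.fullmatch(s) is not None
```

Checking the length first is cheap and keeps the worst case bounded no matter how the engine behaves on long inputs.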

wruza · on Dec 22, 2022

s/\./[\\s.]/ iirc. Or s/\/$/&s/ if the engine implements it.

adammarples · on Dec 22, 2022

len(set(x)) == 1
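Worth noting that the two Python one-liners in this thread disagree on the empty string. A quick sketch (the function names `via_all` and `via_set` are mine):

```python
def via_all(s: str) -> bool:
    # Stops at the first mismatch; s[0] is never evaluated for an empty string.
    return all(c == s[0] for c in s)

def via_set(s: str) -> bool:
    # Builds a set from the whole string before comparing its size.
    return len(set(s)) == 1

for sample in ["", "a", "aaaa", "aab"]:
    print(repr(sample), via_all(sample), via_set(sample))
# '' True False
# 'a' True True
# 'aaaa' True True
# 'aab' False False
```

Which answer is right for `""` depends on what you want; a regex with an explicit empty-string alternative sides with `via_all`.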

vsareto · on Dec 22, 2022

>because they haven't spent a few minutes learning a language

Regex is pretty far from an easy-to-learn language and you're going to need more than a few minutes with it. Like, imagine if a standard string library only had functions with a single character name and how awful that would be to use.