I was recently discussing something that involved knowing whether a string consisted of only a single repeated character. Having spent many years in the trenches with Perl, my first thought was /^(.)\1*$|^$/, which is the kind of thing people dismiss as "line noise" because they haven't spent a few minutes learning a language that can easily express what you want. We have this trend now from people who like languages like Go where answering the question "does this string consist of a single repeated character" begins with "I would now like to reserve space for a 64-bit integer which I shall henceforth refer to as 'i'...", and that's considered a virtue.
To be honest this sounds like the very definition of "you solved the problem with regex, and now you have two problems". In essentially every language in existence the question you're asking can be solved with a simple loop and perhaps 3-4 lines of code. In many languages it's easily expressed as a one-liner, e.g. (with C#) `s.All(c => c == s[0])`.
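For anyone who wants to see the two non-regex versions side by side, here is a minimal sketch in Python rather than C#. The function names are made up for illustration, and it assumes the empty string should count as a match, as the `^$` branch of the regex does.

```python
def is_single_repeated_char_loop(s: str) -> bool:
    """Explicit loop: compare every character against the first one."""
    for c in s:
        if c != s[0]:
            return False
    # Reached when every character matched, or when s is empty.
    return True


def is_single_repeated_char(s: str) -> bool:
    """One-liner in the spirit of the C# s.All(c => c == s[0])."""
    # all() over an empty iterable is True, so s[0] is never evaluated for "".
    return all(c == s[0] for c in s)
```

Both return `True` for `"aaaa"` and `""`, and `False` for `"aaab"`.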
I really don't get the love affair that some devs have with regex. In my 10-year career I don't think I've run into more than a dozen problems in a production system that _required_ regex to solve. When you're working with robust modern languages there's almost always a solution other than regex that's significantly easier to understand and maintain, and probably a lot more performant to boot. Is regex useful for other things, especially CLI stuff like grep and sed? Oh yes, absolutely. But generally speaking I really don't want it in my code base unless there's no other choice.
10 years and a dozen problems? Conversely, I encounter pattern matching problems and use RegEx nearly daily. Both these perspectives are anecdata; neither is useful.
> and probably a lot more performant to boot
I highly doubt your home-grown pattern matching functions could beat the decades of optimization that have gone into RegEx engines, in anything but the most trivial of patterns (like the one demonstrated here). Creating your own ad-hoc pattern matcher instead of using the ubiquitous one built into your language is like the junior in the article re-implementing JOINs. Sure, you may be able to beat the engine occasionally on particularly simple patterns, but I guarantee you'll lose out overall.
RegEx is not inherently slow, and it is definitely possible to maintain. See industries with serious text processing demands like bioinformatics, where Perl is still used extensively. They could not operate like they do if they shied away from RegEx like many developers seem to.
> Creating your own ad-hoc pattern matcher instead of using the ubiquitous one built into your language
GP was using LINQ, which is a first-class language construct in C#. I'll grant that it may not be _as_ optimized as Perl's regex routines, but it's hardly ad-hoc or slow.
Yes, the linked example falls in the category of "particularly simple patterns" I mentioned. This strategy only works for such simple patterns; add in a single alternation with a common base and the naïve iteration strategy falls apart. Implementing a sensible algorithm to evaluate such a pattern would require much more code than a couple of brackets, would not benefit from RegEx caching, etc.
> I highly doubt your home-grown pattern matching functions could beat the decades of optimization that have gone into RegEx engines
On the contrary, the All() method used here (which is part of the .NET standard library) is literally just a loop that evaluates each item in the collection to verify that they all match the predicate function. It'll be able to check hundreds if not thousands of characters in the time that it takes the regex engine to initialize and parse the pattern.
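If you want to test that kind of claim rather than argue about it, a rough timing harness is easy to put together. This is only a sketch in Python, so it says nothing definitive about .NET or Perl; `SAMPLE`, `measure` and the other names are hypothetical, and `re.purge()` is called on purpose so the pattern is recompiled on every call instead of being served from Python's internal cache.

```python
import re
import timeit

PATTERN = r"^(.)\1*$|^$"
SAMPLE = "a" * 1000  # hypothetical input: 1000 identical characters


def check_with_loop(s: str) -> bool:
    # Plain iteration, similar in spirit to All() with a predicate.
    return all(c == s[0] for c in s)


def check_with_fresh_regex(s: str) -> bool:
    # Clear the pattern cache so compilation cost is paid on every call.
    re.purge()
    return re.match(PATTERN, s) is not None


def measure(runs: int = 2000) -> dict:
    """Return total seconds spent by each approach over `runs` calls."""
    return {
        "loop": timeit.timeit(lambda: check_with_loop(SAMPLE), number=runs),
        "regex, recompiled": timeit.timeit(
            lambda: check_with_fresh_regex(SAMPLE), number=runs
        ),
    }
```

Calling `print(measure())` shows both totals. The outcome depends heavily on input length, on whether the compiled pattern is reused, and on the runtime, so treat any single number as anecdata too.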
>reserve space for a 64-bit integer which I shall henceforth refer to as 'i'
In this scenario, regex processing should allocate more and be slower. The for loop is more efficient even if it takes more lines. There's probably some SIMD solution which would be the fastest.
Let's see if I remember Perl regexps: '/' delimits a regular expression. '^' matches at the beginning of a line, '(.)' matches any character and remembers it for later. '\1' matches the same character that you just remembered, '*' zero or more times. '$' then matches the end of the line. '|' means "or", and '^$' matches the beginning and end of the line with nothing in between.
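That reading can be written down as an annotated pattern. Here is a minimal sketch using Python's `re` module in verbose mode instead of Perl; `SINGLE_CHAR` and `matches` are names invented for this example.

```python
import re

# The same pattern as /^(.)\1*$|^$/, spelled out with comments.
SINGLE_CHAR = re.compile(
    r"""
    ^ (.)    # start of string, then capture any one character (not a newline)
    \1*      # the captured character again, zero or more times
    $        # end of string (or just before a single trailing newline)
    |        # or
    ^ $      # a string with nothing in it
    """,
    re.VERBOSE,
)


def matches(s: str) -> bool:
    """True if s is empty or consists of one character repeated."""
    return SINGLE_CHAR.match(s) is not None
```

With this, `matches("aaaa")` and `matches("")` are true, while `matches("aab")` is false.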
>because they haven't spent a few minutes learning a language
Regex is pretty far from an easy-to-learn language and you're going to need more than a few minutes with it. Like, imagine if a standard string library only had functions with single-character names and how awful that would be to use.