You might want to add the Wolfram Language's String Patterns[1].
Perl 6 (which is now actually called Raku, for what it's worth) has BNF-style Grammars as a first-class citizen of the language (a virtually unheard-of thing AFAIK), and programmers are encouraged to use them for complex parsing tasks instead of regex. I don't know whether that falls under "alternative syntax for regex"; they are very closely related to regexes and actually much more powerful and readable, including regexes as a subset. But adding them might drag you into adding every Context-Free/Parsing Expression Grammar tool out there, things like Bison, YACC and ANTLR.
The language SNOBOL[2][3][4] is one of the earliest languages with text matching and processing as a first-class citizen (in fact, the only citizen). Being designed (1962) before the first software implementation of regular expressions* (1968), the pattern language it uses is not based on regular expressions, and in some cases actually exceeds them (e.g. matching balanced parentheses, which mathematical regexes can't do, but some variants of practical regexes can with special non-regular constructs).
This thread[5] on Retrocomputing Stack Exchange discusses which language was the earliest with string pattern matching capabilities, and finds hidden gems in the process.
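To make the balanced-parentheses point concrete, here is a minimal Python sketch (not SNOBOL, and `is_balanced` is just a name made up for illustration). The check needs a counter that can grow without bound, which is exactly the thing a finite automaton, and therefore a regex in the mathematical sense, does not have.

```python
def is_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0  # number of parentheses currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared with nothing open
                return False
    return depth == 0  # nothing may be left open at the end


if __name__ == "__main__":
    for sample in ["(a(b)c)", "(()", ")("]:
        print(repr(sample), is_balanced(sample))
```

Characters other than parentheses are simply ignored, so this only illustrates the counting argument, not a full pattern language.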
I think SNOBOL might count, but it's a bit different in that I think it's a "dead" language now? e.g. it doesn't appear to have a user base or implementations. But feel free to add it or anything else if there are good links.
There's definitely a fuzzy line between things like LPeg or Rosie and YACC/ANTLR ... it's less about the power of the language and more about what kind of tasks people use it for, I suppose. Whether it's "scripting friendly".
That's so weird. I literally just started my own exploration of alternative Regex syntax this morning. The simulation rears her head again. The Dude abides.
(Not OP, and not disregarding the issue at hand.) Have you tried practicing using the shift key opposite the key you want to press? Learning to make that change was hard for me, but it is one of, if not the, largest improvements for me over the years in both typing speed and general typing comfort.
Assuming a standard QWERTY layout, shouldn't you be using the left shift when typing < or >? It'll significantly reduce the contortion effort in your right hand. Same goes for most chords: use opposing hands for the modifier and symbol keys.
(Oddly enough, I don't bother doing this with ctrl/cmd + a/s/f/z/x/c/v, but I think that is mostly because keys to the right of the space bar vary so much between laptops and keyboards that I never bothered trying to stick with it).
Hello Yoav! In my opinion the match keyword is not needed. When the parser gets to an opening bracket that should start whatever methodology the match keyword is doing. As a heavy regex user, I understand that you want consistency with the capture keyword behaviour. But if we assume that the user is a programmer but does not know regex, it makes more sense to view {<space>;"batman";} as an array (delimited by curly brackets).
In fact, you might want to go a step further and consider using [] for match and {} for capture (thus eliminating the capture keyword as well). Using [] for match would be natural for Javascript programmers.
A bit orthogonal but something I would love to see:
A library which takes a regex and shows some examples that pass and some that fail. I would find that the easiest way of understanding a regex, rather than changing the language itself. (Though Melody looks v promising and I'm keen to see it develop).
It wouldn't be trivial to build — particularly for the "fail" examples, you'd want them fairly close to passing. For example, with `(/*\.csv\.gz)` you'd want `foo.csv.gz` rather than `aoseutn` as an example of a failure.
There's a Python library called xeger [0] that allows you to generate strings from regular expressions. I've used this at work to generate large quantities of "valid" test data.
The fail bit is harder indeed, especially for larger regexps, but not totally impossible. The easiest way towards that goal seems to be constructing the DFA first and then generating illegal single edits (insertion, deletion, substitution). Generating a positive example is possible without it.
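A rough Python sketch of the single-edit idea, with one simplification stated up front: instead of constructing the DFA, it uses the standard `re` module as the membership test and filters the edits of a known passing string. The function names, the default alphabet, and the example pattern `[a-z]+\.csv\.gz` are all assumptions for illustration, not anything from xeger or Melody.

```python
import re
import string


def single_edits(s: str, alphabet: str = string.ascii_lowercase + "./") -> set[str]:
    """All strings one insertion, deletion, or substitution away from s."""
    edits = set()
    for i in range(len(s) + 1):
        for c in alphabet:
            edits.add(s[:i] + c + s[i:])          # insertion
    for i in range(len(s)):
        edits.add(s[:i] + s[i + 1:])              # deletion
        for c in alphabet:
            if c != s[i]:
                edits.add(s[:i] + c + s[i + 1:])  # substitution
    return edits


def near_misses(pattern: str, passing: str) -> list[str]:
    """Single-edit neighbours of a passing string that no longer match."""
    regex = re.compile(pattern)
    if not regex.fullmatch(passing):
        raise ValueError("the seed example must match the pattern")
    return sorted(e for e in single_edits(passing) if not regex.fullmatch(e))


if __name__ == "__main__":
    # Print a handful of failing strings that sit close to a passing one.
    print(near_misses(r"[a-z]+\.csv\.gz", "foo.csv.gz")[:5])
```

Edits that still match (for example, inserting another letter into the name part) are dropped, so everything returned is a failure that sits one keystroke away from a pass.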
Does it (or are there plans to) reverse compile? If I could input my regex and output Melody script, one could create an excellent interactive learning tool, and also, more selfishly, help with adoption in teams with crusty old devs like me who like our magic rituals and prefer typing our regex by hand.
Also are there plans to support runtime compiling in JS? Something like...
someMelodyObject = <initialise and configure melody>
String.replace(someMelodyObject.toRegexp(), someString)
This I think would make it a compelling library for inclusion into projects assuming it were fairly efficient and lightweight. Not sure how or if you'd have to deal with performance and caching but it would probably go a ways to improving adoption among web developers at least.
Anyways good luck with the project. Regex is often considered a dark art when it's actually fairly concise and expressive, opening it up to more people at a higher level could lead to greater understanding of regex in general. Also what an interesting and challenging project to undertake, definitely a nontrivial challenge all told.
1. A reverse compiler is one of the 'maybe' features (see the table in the README). It's something I'd like, but it would essentially be an entire compiler, so it's non-trivial.
2. The plan is to make Melody available as a compile step (like e.g. SASS) with no runtime overhead or as a Rust crate. You could do the compilation at runtime but other than including variables in the pattern I'm not sure if it'd have a benefit over compile time transforming, + it'd have a performance impact.
No dramas, understood re 1 it's no doubt beyond trivial to create a bidirectional transpiler. I wouldn't even know where to begin so good work on what you've managed so far.
Re 2, shouldn't be a problem as we already have build processes in place. Most projects I work on have npm build steps. I'm not sure how that figures in with Rust (I really need to get off my butt and check it out sometime), but if it could be pulled in as an npm dependency that would work. If it could be done inline, even better (e.g. inline Melody within a JS file that compiles to the expression inline...)
Anyway good job again so far, have followed the repo, all the best once more!
I have never taken the time to learn RegEx stuff. It seems like it would be great if I could keep all the syntax in my head. So the idea of Melody seems great. I don't like that the GitHub description says it is currently unstable. I hope this project continues and flourishes.
Author here, thank you!
The reason I stated that Melody is unstable is that the project is very young (days), so some of the syntax is still being considered and may change (although the general idea and direction will remain), and not everything is implemented yet. I'm also considering changing the way the parsing works (but that wouldn't affect end users in terms of expected results for valid code).
Shameless plug: I'm working on a similar project, with a strong focus on providing a painless migration path. Author: let's chat and perhaps join forces?
Please don't submit to HN until I release the vscode plugin :-)
So, a RegEx (Melody syntax) to RegEx (unspecified syntax) compiler? I mean, the syntax is nice, but 1. please specify which kind of regular expression it compiles to, 2. are those really regular expressions or a language higher in the Chomsky hierarchy? 3. I suggest adding a graphical output of the state machine, e.g. with Graphviz.
As for 1: "The current goal is supporting the JavaScript implementation of regular expressions." Right on the readme :).
2: I couldn't tell you, but does it matter if it has a practical use? I for one never understood why regexes have the notation they have, and always struggle because I use them next to never. This looks like an attempt to make something that would suit me better.
3. What do you mean exactly?
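One possible reading of point 3, sketched in Python: take the state machine that corresponds to a pattern and emit Graphviz DOT text for it. This is only a guess at what was meant. The transition table below is written by hand for the pattern `ab*`, and `dfa_to_dot` is a hypothetical helper, not something Melody provides.

```python
def dfa_to_dot(
    transitions: dict[tuple[str, str], str],
    start: str,
    accepting: set[str],
) -> str:
    """Render a DFA as Graphviz DOT text (accepting states are double circles)."""
    states = sorted(
        {start, *accepting, *(src for src, _ in transitions), *transitions.values()}
    )
    lines = [
        "digraph dfa {",
        "  rankdir=LR;",
        "  __start__ [shape=point];",    # invisible entry marker
        f'  __start__ -> "{start}";',
    ]
    for state in states:
        shape = "doublecircle" if state in accepting else "circle"
        lines.append(f'  "{state}" [shape={shape}];')
    for (src, symbol), dst in sorted(transitions.items()):
        lines.append(f'  "{src}" -> "{dst}" [label="{symbol}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-built DFA for the pattern ab*: one 'a', then any number of 'b'.
    table = {("q0", "a"): "q1", ("q1", "b"): "q1"}
    print(dfa_to_dot(table, start="q0", accepting={"q1"}))
```

Saving the printed text to a file and running it through Graphviz's `dot -Tpng` should give a picture of the machine.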
Here's the parse rule for Batman:
And complete example for the Semantic version: