Awk is great and this is a great post. But dang, awk really shoots itself in the foot with its lack of the features it so desperately needs!
Like: printing all but one column somewhere in the middle. It turns into long, long commands that really pull you away from the spirit of quick, throwaway unix experimentation.
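For the record, here's roughly what I mean (field 5 is just a placeholder): deleting one middle field means rebuilding the line yourself, since `$5=""` leaves a doubled separator behind.

```shell
# Drop field 5 (arbitrary example) without leaving a doubled separator.
# Portable across gawk/mawk/nawk, but painfully long for a "one-liner":
printf 'a b c d e f g\n' |
  awk '{ out = ""
         for (i = 1; i <= NF; i++)
           if (i != 5) out = out (out == "" ? "" : OFS) $i
         print out }'
# -> a b c d f g
```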
Oftentimes I don't! It entirely depends on what I'm doing. The #1 thing off the top of my head is removing That One Column that's a bajillion characters long and makes exploratory analysis difficult.
I suspect the rationale for Perl is that most Linux systems will probably have it installed already. Installing something you're familiar with is great when you can, but I'm guessing the awk script linked to here was picked more for its ubiquity than elegance.
Kinda, but not really. Of the infrastructures I've worked on, not a single one has been consistent about installing perl on 100% of hosts. The ones that get close are usually like that because one high-up person really, really likes perl. And they send a lot of angry emails about perl not being installed.
Within infrastructures where perl is installed on 95% of hosts, that 5% really bites you in the ass and leads to infrastructure rot very quickly. You're kinda stuck writing and maintaining two separate scripts to do the same thing.
I dunno about that. IME, python is much, much more universally installed on the hosts I've worked on. Sure, usually it's 2.7, but it's there! I've tended to work on rhel and debian hosts, with some fedora in the mix.
(Once had a coworker reject a PR I wrote because I included a bash builtin in a deployment script. He said that python is more likely to be installed than bash, so we should not use bash. These debates are funny sometimes.)
Interesting, in my experience perl ends up pulled in as a dependency for one thing or another most of the time, but I don't have that perception about Python. Maybe there's just something I use that pulls in perl without me realizing and it's biased my experience.
>awk really shoots itself in the foot with its lack of the features it so desperately needs!
That's why I use Perl instead (aside from some short one-liners in awk, which in some cases are even shorter than the Perl version), and do my JSON parsing in Perl.
I've been using perl instead of sed because PCRE is just better, and it's the same regex flavor that PHP uses, which I've been coding in for nearly 20 years.
I still don't actually know perl, but apparently Gemini does. It wrote a particularly crazy find and replace for me.
Never got around to using or learning awk. The only time I see it come up is when you want to parse some tab-delimited output.
Just tried to use "-d" and learned that it's a GNUism that isn't available under macOS, so it's not a portable solution. And neither was it available under BSD 4.3, when I first learned about xargs.
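A workaround that I believe works on both GNU and BSD xargs (though -0 itself isn't in POSIX either) is to translate newlines to NULs first:

```shell
# GNU-only:        printf 'a b\nc d\n' | xargs -d '\n' -n1 echo
# More portable:   both GNU and BSD xargs understand -0
printf 'a b\nc d\n' | tr '\n' '\0' | xargs -0 -n1 echo
# -> a b
#    c d
```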
Sure, but my example was just that and I actually use /identical$/ as the pattern. Sorry for the typo.
And I use this "historic" one liner only when I know about the contents of both directories. As soon as I need a "safer" solution I use a Perl script and pattern matching, as I said.
Things are already like that, friend! We have mawk, gawk and nawk. But it's fun to think about how we could improve our ideal tooling if we had a time machine.
>Like: printing all but one column somewhere in the middle. It turns into long, long commands that really pull you away from the spirit of quick, throwaway unix experimentation.
jq and sql both have the same problem :)