At a previous biotech, we used Cromwell/WDL because the DSL was the most intuitive for our bioinformatics scientists. But since that doesn't work as nicely on AWS (and is maintained by an organization that is imploding), we opted for Argo on our K8s cluster to process RNAseq data en masse. Getting the scientists to use YAML has been an uphill struggle, but the same issues would apply to learning Groovy, I guess. We've found the Argo engine easier to maintain, and we only have to support one orchestrator across our bioinformatics and ML teams.
For industrial purposes, I've started to approach these pipelines as a special case of feature extraction and so I'm reusing our ML infrastructure as much as possible.
Both workflow languages are better suited to building a single reproducible workflow that can be published with an academic paper. For us, I'm looking for a workflow language that treats the pipeline as a testable, deployable piece of software. I find that with Nextflow, scientists fall into the bad pattern of mixing pipeline logic (e.g. if this sample type, then process it this way) with the bioinformatics model (e.g. use these bowtie2 parameters) throughout the pipeline, which makes it more difficult to maintain as our platform evolves (see the sketch below for the separation I mean). K8s integration is lacking in both, and they work much better on academic-style clusters.
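To make that concrete, here's roughly the separation I'm pushing for, sketched in Python. The sample types and bowtie2 flags are illustrative, not a real profile set; the point is that routing decisions live in one small, testable function and tool parameters live in a declarative table, instead of if-statements interleaved with process blocks:

    # Illustrative only: tool parameters sit in one declarative table,
    # keyed by sample type, rather than interspersed with control flow.
    BOWTIE2_PARAMS = {
        "total_rna": ["--very-sensitive", "-X", "1000"],
        "mrna": ["--sensitive-local"],
    }

    def alignment_args(sample_type: str) -> list[str]:
        # The routing decision happens once, here, and is unit-testable
        # without ever running the pipeline.
        if sample_type not in BOWTIE2_PARAMS:
            raise ValueError(f"no alignment profile for {sample_type!r}")
        return BOWTIE2_PARAMS[sample_type]

    assert alignment_args("mrna") == ["--sensitive-local"]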
YAML does leave a lot to be desired, but it also forces a degree of simplicity in architecting the pipeline, because to do otherwise is too cumbersome. I really liked WDL as a language when I used it--it seemed to strike a nice balance of readability and simplicity. I believe Dyno created a Python SDK for the Argo YAML syntax, and I need to look into that more.
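If it's the one I'm thinking of, that SDK is Hera. A minimal sketch going off its documented quick-start (the workflow name and message here are made up, and the exact API differs between Hera versions):

    # Sketch using Hera, Dyno's Python SDK for Argo Workflows, per its
    # quick-start docs; details may vary between Hera versions.
    from hera.workflows import Steps, Workflow, script

    @script()
    def echo(message: str):
        print(message)

    with Workflow(generate_name="rnaseq-demo-", entrypoint="steps") as w:
        with Steps(name="steps"):
            echo(arguments={"message": "hello from Python, not YAML"})

    # Renders the Argo YAML the scientists would otherwise write by hand.
    print(w.to_yaml())

If it works as advertised, that might be the way to keep Argo as the single orchestrator without making the scientists author raw YAML.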