Replacing Ansible with salt-ssh

richardfey · on Oct 22, 2020

I went the other way around instead, from salt(-ssh) to Ansible. I could not bear the slow speed though, so I suggest that everyone who hasn't tried it yet use it with Mitogen. I don't think I will ever go back, and in fact I expect upstream to eventually adopt it.

Major issues with Salt: lots and lots of bugs and quirks. I have had my fair share of those in Ansible over the years as well, but Salt is just not good enough for my taste. A strong NIH streak shows all over its design.

kev009 · on Oct 22, 2020

I've generally seen the opposite: salt works well multi-platform, while ansible is geared toward the LTS distros. It's extremely easy to get bug fixes into salt, and I've done a few dozen.

The biggest issue I see with ansible is the amount of discipline required to use it well. Junior staff will create non-idempotent plays. I don't really see that with salt, where shelling out is pretty rare (the benefit of NIH?).

busterarm · on Oct 22, 2020

SaltStack is famous for their NIH syndrome and various high severity vulnerabilities that have resulted from that.

I've used both Ansible and SaltStack a fair bit and would agree with the parent. In most cases, Ansible w/ Mitogen is the way to go.

SaltStack is really only sanely deployable in a Masterless setup. Also the real secret sauce of SaltStack that almost no one is using it for is its Event Bus & Reactor System to automate maintenance/incident runbooks. There's a ton of untapped potential there.
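For readers who haven't seen the Reactor, here is a minimal sketch of the wiring, assuming a setup with a master (the Reactor runs there). Both file paths and the `highstate_on_start` ID are hypothetical; the event tag is the one a minion fires when it starts.

```yaml
# /etc/salt/master.d/reactor.conf (hypothetical path)
# Maps an event tag on the bus to one or more reactor SLS files.
reactor:
  - 'salt/minion/*/start':
    - /srv/reactor/on_start.sls

# /srv/reactor/on_start.sls (hypothetical path, a separate file)
# Rendered through Jinja before YAML parsing; `data` is the event payload.
highstate_on_start:
  local.state.apply:
    - tgt: {{ data['id'] }}
```

With something like this in place, a minion that starts up would get its states applied without anyone running a command by hand. A runbook automation would follow the same pattern with a custom event tag.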

kev009 · on Oct 23, 2020

I always am curious when I see this. I've planned and deployed master setups in two separate 10^4 node count companies now. In my experience the number of hosts doesn't really matter (figuring out an architecture that will work with availability zones etc only takes a couple weeks). The real work is figuring out how to normalize and govern the repos. I haven't seen any sane ansible repo once it's beyond a few senior contributors because it's so easy to do whatever you want. Ansible is the ultimate small business tool, but it works very poorly in large companies.

busterarm · on Oct 23, 2020

We're at the 10^4 node count but the team managing it is still at the 10^1 size. That's how, basically. You have to make large investments into automation, code review process and time to make continuous improvements (because things do slip through).

There's style guides, linting, commit checks...

jaden · on Oct 22, 2020

I had never heard of Mitogen and installed it a few minutes ago based on your recommendation. The performance improvement is massive. Much obliged.

folken · on Oct 22, 2020

What people don't get about salt is that it's literally Kubernetes (as in orchestrator) for your infrastructure.

It takes the "operator" concept to a whole new level: you can tell it to react automatically and enlarge a disk in a machine due to space constraints, reschedule workloads according to load, configure a load balancer according to rules you write once, create resiliency rules, deploy new machines or containers...

It scales ridiculously; I have seen 30000 minions to a master.

Why you would use it via ssh other than bootstrapping is beyond me...

btw: you can run ansible via the salt bus: set transport = saltstack in the ansible.cfg and be amazed.

busterarm · on Oct 22, 2020

This is exactly what I was alluding to with this comment. https://news.ycombinator.com/item?id=24863836

Thank you for putting it more cleanly.

I don't necessarily always think that's the right thing to do though, but within limits, yes. It can do things you would otherwise have people respond manually to.

Putting out fires sucks.

Spivak · on Oct 22, 2020

This seems like a lot of work for such a small problem in Ansible that can actually be fixed without any patches. Here's an example.

    - name: Pretend to install stuff.
      debug: var=item
      with_items: [git, httpd, vim]
      when:
        - "'all' in ansible_run_tags or item in ansible_run_tags"
        - item not in ansible_skip_tags
      tags: always

If you run without any tags it will install everything and if you specify --tags=git it will only install git and if you specify --skip-tags=vim it will do the correct thing.

Ref: https://docs.ansible.com/ansible/latest/reference_appendices...

And in the specific case of installing packages you can speed this process up dramatically with something like the following without having to use tags.

    - name: Probe the host for installed packages.
      package_facts:

    - name: Install packages.
      package: name={{ item }}
      with_items: [git, httpd, vim]
      when: item not in ansible_facts.packages

But this itself is really inefficient too because you're still calling the module for every package instead of calling the module once with all the packages. Older versions of Ansible did this automatically but it was super magic so it's probably better that it was removed. We can do it ourselves though.

    - name: Install packages.
      package: name={{ to_install }}
      vars:
        packages: [git, httpd, vim]
        to_install: "{{ packages | difference(ansible_facts.packages) | list }}"

ofrzeta · on Oct 22, 2020

Now you've shown three different methods to achieve the goal where the third version has become sufficiently convoluted to make it hard to understand.

And how is an Ansible user supposed to know what's more efficient and what are the side effects of calling the packages with name={{item}} or a list variable? The documentation of ansible.builtin.package for one doesn't seem to supply any indication of it.

Spivak · on Oct 22, 2020

Fair, I can make the 3rd one more readable. I figured the extra vars would actually help make it clearer. Guess not.

    - name: Install packages.
      package:
        name: "{{ ['git', 'httpd', 'vim'] | difference(ansible_facts.packages) | list }}"

This isn't something super specific to Ansible, it's the same story all over when libraries have a "bulk" or "batch" api so that you don't have to iterate over operations with lots of setup. The package/yum/apt/dnf modules happen to take lists as arguments since you can run "yum install a b c" faster than "for p in a b c; do yum install $p; done".

ofrzeta · on Oct 22, 2020

Ok, that's a valid point. Naively I think a user could also expect the {{items}} to be expanded in the package module but someone who has used Ansible would probably be familiar with the "with_items" idiom.

Also I was confused by your claim that your second version was sped up dramatically although it is still using "with_items". Is this due to the use of facts instead of tags?

Spivak · on Oct 22, 2020

The problem the OP was facing was when you had a loop like

    - name: Install packages
      package: name={{ item }}
      with_items: "{{ loooong_list_of_packages }}"

and most of the packages are already installed, then you're spending most of the loop running the package module just for it to come back and say 'ok'. Hence why the author wanted to use tags to limit the run to specific packages.

Rather than doing that, you can run package_facts once to get a list of all the installed packages and entirely skip the module invocations you know won't change anything. You can't do this in general, because on a totally unknown system something external might mess with packages between your fact gathering and the install, but on your known systems this is almost always fine.

spyc · on Oct 22, 2020

Some cool hacks here, thanks for sharing!

athenot · on Oct 22, 2020

The author's example of using loop data out of loops is overlooking anchors and aliases in YAML. His example can be rewritten as:

    - hosts: all
      tasks:
      - name: Install distro packages
        package:
          name: "{{ item }}"
          state: present
        with_list: &packages
          - git
          - htop
        tags: *packages

spyc · on Oct 22, 2020

Good point, thanks for bringing it up! I have updated the article accordingly just now.

tutfbhuf · on Oct 22, 2020

Isn't it possible to create many small playbooks instead of a big one? This would allow you to just run the stuff you need. And if you need everything, e.g. for bootstrapping, then you can create one playbook that simply includes the smaller ones with the import_playbook statement. This also gives you the flexibility to create other subsets based on your needs.
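A minimal sketch of that layout; the file names are made up for illustration.

```yaml
# site.yml: pulls the smaller playbooks together for a full bootstrap run.
# packages.yml, users.yml and webserver.yml are hypothetical files; each
# can also be run on its own, e.g. `ansible-playbook users.yml`.
- import_playbook: packages.yml
- import_playbook: users.yml
- import_playbook: webserver.yml
```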

tyldum · on Oct 22, 2020

Multiple playbooks are possible, and so is joining them into a large one with imports. You also have tags to target with more precision.

PaywallBuster · on Oct 23, 2020

this is the right answer.

You can tag playbooks and either run only the playbooks with the specified tags or skip those and run everything else.
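As a rough sketch, assuming one play per concern and hypothetical tag names (`packages`, `users`):

```yaml
# site.yml: every task in a play inherits the tags set on the play.
- hosts: all
  tags: [packages]
  tasks:
    - name: Install distro packages.
      package:
        name: [git, htop]
        state: present

- hosts: all
  tags: [users]
  tasks:
    - name: Ensure the deploy user exists.
      user:
        name: deploy
        state: present
```

Running `ansible-playbook site.yml --tags packages` would then execute only the first play, while `--skip-tags packages` would run everything else.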

folken · on Oct 22, 2020

you write states per purpose. you can import, you can extend, you can define functions in one and reuse them somewhere else...

xet7 · on Oct 22, 2020

Regarding VMware buying SaltStack, it is a good thing: Tom will have more time to code, according to TheHacks podcast episode about it at https://www.saltstack.com/the-hacks/ that I usually listen to from the page https://thehacks.libsyn.com/ .

From other podcast episodes I got to know that they at SaltStack are currently merging the many separately maintained SaltStack repos into fewer repos, so progress should get faster.

Disclaimer: I do not work at SaltStack. I just happily listen to their podcasts and I have used SaltStack for managing my servers.

crmrc114 · on Oct 22, 2020

You lost me when the author failed to properly understand how to use {{ item }} by defining with_items:

https://www.educba.com/ansible-with_items/

Let me fix his code here: https://pastebin.com/PQDAnHiJ

I use with_items at least once a day in this manner.

Spivak · on Oct 22, 2020

That’s not right either. It’s super wasteful since newer versions of Ansible don’t magic loops into a single call to your package manager anymore. You have to pass the list as a single item.

    - name: Install stuff.
      package:
        name: [git, httpd]

crmrc114 · on Oct 22, 2020

Agreed, great callout for the example use case, my point was simply using the items list which could even be used for something like shell or win_shell

emmelaich · on Oct 23, 2020

Note that it was possible to install multiple packages in one call by passing the whole list to yum/apt. This was undocumented but happened to work.

You just had to supply the list as space separated. i.e.

    package:
      name: "{{ packages | join(' ') }}"

or similar

deeblering4 · on Oct 22, 2020

Is this the case too with other resources, like users? I noticed that those loops can become pretty time consuming as the user list gets larger.

geerlingguy · on Oct 22, 2020

It depends on the module; the user module works on one entity at a time, so you have to loop at the task level, which does take a little longer than trying to pass through a batch of user data at once.
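A minimal sketch of such a task-level loop, with made-up account data:

```yaml
# The user module is invoked once per list entry, so run time grows
# with the number of users. The names and shells are hypothetical.
- name: Ensure user accounts exist.
  user:
    name: "{{ item.name }}"
    shell: "{{ item.shell }}"
    state: present
  loop:
    - { name: alice, shell: /bin/bash }
    - { name: bob, shell: /bin/zsh }
```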

spyc · on Oct 22, 2020

The article uses with_list rather than with_items, but even with with_items that syntax would work and define {{ item }} just fine.

Please check the docs of with_list at https://docs.ansible.com/ansible/latest/user_guide/playbooks... .

Arnavion · on Oct 22, 2020

Their point is not about how to pass in those parameters to the `package` task. They want to reuse the list of items as the value of the tags list, which they can't without duplicating the list.

crmrc114 · on Oct 22, 2020

Or they could define the list of packages as an array once? https://stackoverflow.com/questions/23507589/is-it-possible-...

Then just point the module to the array, like in the linked example.

edit: Check out Spivak's solution in the comments below. Interrogate the host and find out what is needed, then apply the install. That saves you all the time of getting back 'ok' results.

fanf2 · on Oct 22, 2020

The main reasons I started using Ansible in 2013 were its agentless setup (which this article covers nicely) and its --check --diff options (which this article leaves out). Does salt have a do-nothing sanity check mode?

Shish2k · on Oct 22, 2020

It does, in my case my standard “check if everything is in order on a host” is:

salt-ssh $hostname state.highstate test=True

fanf2 · on Oct 22, 2020

Neat :-) does that print diffs if there are discrepancies? Including diffs of templated config files?

bproven · on Oct 22, 2020

--output-diff is what you are looking for

dpedu · on Oct 22, 2020

It has test.ping

bovermyer · on Oct 22, 2020

As it happens, I moved from Ansible to Chef not too long ago as my provisioner of choice.

It can be used with just SSH and Chef Zero (no need for a Chef Server), and it's built with Ruby and Chef's DSL and not YAML.

It took me several years to not hate Chef's approach, though, compared to the ease of use of Ansible.

jitl · on Oct 22, 2020

I much prefer the chef experience because I like Ruby - and I used Chef solo at Airbnb for years. However, the fact that chef recipes are Ruby means that there’s no static analysis or interoperability available in Chef land. I can read and write Ansible playbooks in any language. Chef I can only use from Ruby.

Migrating from Chef to XXX was an enormous pain. It’d be easier to migrate away from Ansible.

busterarm · on Oct 22, 2020

I won't provision anything unless it's with Terraform anymore.

I use Ansible to build images and to configure (as in template configuration files in) newly provisioned systems after first boot-up and that's it.

Everything's end-to-end automated after I ship my code and I don't have to worry about it.

Systems don't get changed, they get rebuilt.

bovermyer · on Oct 23, 2020

You are aware that Chef can be used in the same capacity with Terraform, right?

busterarm · on Oct 23, 2020

Well aware, but no one tool should have multiple responsibilities.

Chef is primarily a configuration management tool and I keep that separate from a distinctly provisioning tool like Terraform. Terraform is declarative rather than procedural like Chef and that's far preferred when trying to define what you have. Terraform & Ansible don't require me to run a master server or have agents on every machine either.

Use tools where they're strongest.

We used to use Chef years ago and replaced it with Ansible and Terraform and have zero regrets at 10^4 nodes. We also use Terraform for managing CloudFlare and all sorts of other providers. It has been a dream.

Ansible does less and less work as time goes on for us and as we've moved to semi-immutable infrastructure. If we need to change something about a system, we'd rather replace it.

aprdm · on Oct 22, 2020

Hm as others mentioned it is possible to use Ansible in a very "usual" way to solve this problem.

I would love it if we standardized on Packer, Ansible and Terraform. Luckily it seems like the devops world is standardizing on this stack ... there's very little you can't do in an elegant way using those 3 tools (only use all 3 if you need to; a lot of shops only use Ansible, for example)

core-questions · on Oct 22, 2020

Hahah, I finally got to the point of standardizing on that and now everyone wants to throw away a well working system to move to container orchestration.

All to get to slightly less downtime and faster deploys... idk. Hard to know if the months of work will be worth it.

hinkley · on Oct 22, 2020

I think the other leg of this stool is that containers don't have to worry about composing with each other on the filesystem. I can run three Python processes and two of NodeJS on the same host with no worries about intra-version compatibility, upgrades or child processes getting the wrong PATH.

If, that is, disk space is cheap enough that I don't have to normalize the filesystem (database-style) in order to function. That was a huge IF not so very long ago, which is why we are iterating on this again now.

aprdm · on Oct 22, 2020

I would say that you can deploy containers with the above just fine. Of course you probably mean kubernetes, and I am also struggling to decide whether to migrate to it or not. We have a very well oiled machine atm and k8s has to give us more to be worth it.

movedx · on Oct 22, 2020

I'm working on a book that focuses on Ansible, Terraform, Packer and GitLab (CI) and their integration. I'll eventually incorporate Docker and K8s too (just built a physical cluster at my office for giggles.)

I believe you're correct in that this is a fine stack of tools that can get anything done. I wish businesses would just use them.

takeda · on Oct 24, 2020

Well, I wouldn't standardize on specific tools, but I agree, that there's that trio:

- infrastructure

- configuration management

- image building

Although, frankly, the last two are solved much more elegantly through NixOS, so it ultimately could be:

- infrastructure configuration

- system configuration

aloer · on Oct 22, 2020

Slightly OT:

I have recently done some scripting work on a single machine and used Ansible because it makes it so convenient to parse output, generate config files etc., mostly shell and jinja2

Is that valid use of Ansible or should I be ashamed and use something else?

NateEag · on Oct 22, 2020

If the tool met your needs, why worry about shame?

If you eventually run into difficulties, sit down and understand why.

Once you've understood the cause of your problems, see whether the features you didn't take time to understand can alleviate them.

chousuke · on Oct 22, 2020

Ansible is okay for stuff like that.

I very much don't like that it does the YAML-as-a-programming-language thing that's way too popular nowadays but when judiciously used, Ansible can be useful.

I really want something like Ansible with a real declarative programming language. Puppet is the closest nowadays to having a sane language, but it has its own issues.

eeZah7Ux · on Oct 22, 2020

ashamed. Almost every programming language is better than a mashup of yaml and jinja

sirtaj · on Oct 22, 2020

Writing plugins in Python is a great way to reduce the size/scope of yaml you have to write, particularly things like filter and vars plugins. You can then use yaml primarily as a data source rather than a programming language.

hibbelig · on Oct 22, 2020

I fondly remember CFengine. It did require installation on the host to be managed, but that seemed rather simple: Log in, install CFengine and git, clone the repo, and off you go to run it.

Fnoord · on Oct 25, 2020

Funny thing is, CFengine predates Git. It's from 1993 and I remember looking into it at the end of the 90s after seeing it covered in a Linux magazine.

beefbroccoli · on Oct 23, 2020

I revisit these tools every so often and I still don't understand how they're any better than simple shell scripts.

noinsight · on Oct 23, 2020

There's nothing simple about shell scripts. Idempotency is the main thing - and a great thing.

beefbroccoli · on Oct 26, 2020

People talk about idempotency, as if it's complicated because it's a 5 syllable word.

    # `install sometool` stands in for the real install command.
    if ! command -v sometool >/dev/null 2>&1; then
      install sometool
    fi

That's an idempotent shell script, in all its simple glory.

bpbp-mango · on Oct 22, 2020

I like the look of the file.keyvalue. Anything like that in ansible? https://docs.saltstack.com/en/latest/ref/states/all/salt.sta...
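For context, a state using it might look roughly like this. It is a sketch, assuming a file of space-separated `key value` lines; the state ID, path and key are only illustrative.

```yaml
# Hypothetical state: ensure one key/value pair in a config file.
disable_root_login:
  file.keyvalue:
    - name: /etc/ssh/sshd_config
    - key: PermitRootLogin
    - value: 'no'
    - separator: ' '
```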