Ansible and Variables

A basic explanation of Ansible and a discussion of variable usage.

I’ve been talking about Ansible on Facebook lately and the other day a friend asked me about Ansible and variables. I gave her a quick explanation, then told her I’d do a more thorough writeup that would be easier to follow than my “stream of consciousness” explanation given in FB messages.
It occurred to me that I’m planning to do a “lunch and learn” on Ansible at work soon, and I could re-use the same material, so I’ll just post this publicly. I plan for this to be the first in a series on DevOps, integration, idempotency, configuration management and Ansible. So without further ado…

For those who have not seen my posts on Facebook, Ansible is a configuration management tool for provisioning, deploying and configuring servers and applications. It is one of a series of such tools that have come out in the last few years, such as Puppet, Chef and SaltStack. It is designed to be fast, easy to use, powerful, efficient and secure. It is serverless and agentless. It aims to be idempotent.

I can’t speak to Puppet, Chef or SaltStack as I’ve never used them.

Addressing these one at a time, not necessarily in the order presented above:

  • Secure
  • Everything is done through SSH tunnels. No passwords, no configuration files, are ever sent over the network in the clear. Set up your SSH keys and you don’t have to worry about typing passwords either.
    There is no agent software running on the managed machines, so there’s nothing to hack.

  • Easy to use
  • “I wrote Ansible because none of the existing tools fit my brain. I wanted a tool that I could not use for 6 months, come back later, and still remember how it worked.”
    Michael DeHaan
    Ansible project founder

  • Efficient
  • No agents, just SSH (or PowerShell on Windows, but I won’t get into that). The only software required on the managed machine is an SSH daemon and Python.

  • Serverless and Agentless
  • As I’ve already mentioned, there’s no agent running on the managed server. If you can ssh into it and run Python, you’re good to go.
    There is no central server full of manifests, menus, etc. You can run it from your desktop or laptop. Again, if you have Python, you’re good to go (Ansible normally drives your system’s OpenSSH client, and can use Paramiko, a Python implementation of SSH, instead). Just make sure you back up your playbooks and roles. Git is a great place for this!

  • Idempotent
  • This is one of the most important! It means you should be able to run your Ansible script against a managed host at any time, and not break it. If anything is not configured the way it is supposed to be, the Ansible script will put it back the way it should be, and anything that is already correct is left alone. Shell scripts, by contrast, have to be written very carefully to detect if something doesn’t need to be done. It’s also notoriously difficult to modify files with shell scripts (unless you’re really good with tools like sed and awk, or perhaps Perl…)
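
As a small illustration of idempotency, here is a minimal sketch of a task that is safe to run over and over. The file path and the setting are hypothetical, picked only for the example; “lineinfile” is a standard Ansible module.

```yaml
# Hypothetical example: make sure one setting is present in a config file.
# Run it once or a hundred times; the file ends up in the same state.
- name: Ensure root login over SSH is disabled
  lineinfile:
    path: /etc/ssh/sshd_config      # assumed path, adjust for your system
    regexp: '^#?PermitRootLogin'    # replace an existing (or commented-out) line
    line: 'PermitRootLogin no'      # the line we want to end up with
    state: present
```

On the first run Ansible should report this task as “changed”; on later runs, if the line is already correct, it reports “ok” and leaves the file alone. Getting the same behaviour from a shell script would mean writing the “is it already there?” check yourself.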

Some vocabulary before we begin:

  • playbook
  • A file defining which hosts you want to manipulate and what roles you want to apply to those hosts, as well as what tasks you want to run.

  • roles
  • A defined list of tasks to be run when the role is called, as well as any files to be installed, templates to be applied, dependency information, etc.

  • inventory
  • A file listing every server you will manage with Ansible, and what groups they belong to. A host can belong to any number of groups, including none at all, and groups can be members of other groups.

  • host_vars & group_vars
  • Directories with files containing variables specific to certain hosts (host_vars) and host groups (group_vars). These variables are used in your tasks and roles.
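
To make the vocabulary a little more concrete, here is a rough sketch of a playbook and an inventory. The file names, the sub-group names and the host “datanode1” are assumptions for illustration; the inventory is shown in Ansible’s YAML inventory format so that everything stays in one language.

```yaml
# file: site.yml (hypothetical playbook name)
- hosts: all_hadoop            # which inventory group to manipulate
  roles:
    - common                   # roles to apply to those hosts
  tasks:
    - name: Check that the host answers
      ping:                    # an extra task run after the roles
---
# file: inventory.yml (hypothetical)
all:
  children:
    all_hadoop:                # a group...
      children:                # ...whose members are other groups
        hadoop_namenodes:      # hypothetical sub-group
          hosts:
            namenode1:
        hadoop_datanodes:      # hypothetical sub-group
          hosts:
            datanode1:
```

These are two separate files; the “---” line only separates them in this listing. With a layout like this, a file named group_vars/all_hadoop would apply to both namenode1 and datanode1, because both sub-groups are members of “all_hadoop”.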

Now, on with the discussion of variables. Here was Kathryn’s original question:

How do variables work with dependencies in roles? Meaning, if a role is dependent on another, can it access the variables of the other at run time?

I started to answer with an example we use at work: we have a “common” role that sets up some users with specific UIDs that we want on all our servers, and an “apache” role that depends on that common role (e.g. it needs the www user created by common). Kathryn further asked:

Okay, say “application” depends on “common” and “common” has default variables… would “application” pick up “common”’s defaults?

Yes! For example, our “common” role has a task file which pushes out customized /etc/sudoers.d files, depending on what the server will do, what environment it will be in, etc. One of the tasks looks like this:

NOTE: YAML, the language used to write Ansible files, is whitespace sensitive, and blog formatting (HTML plus my WordPress config) has a habit of mangling leading whitespace in examples. Do not just cut and paste and expect it to work. Check the leading spacing on every line.

- name: Sudoers - push sudoers.d/hadoop_conf
  template: >
    src=sudoers_hadoop_conf.j2
    dest=/etc/sudoers.d/hadoop_conf
    owner=root
    group=root
    mode=0440
  when: hadoop_cluster is defined

Note the last line: “when: hadoop_cluster is defined”. “hadoop_cluster” is a variable. This variable isn’t actually defined in our role, but rather in the playbook, or in a host_vars or group_vars file. In this case we have a group_vars/all_hadoop file. Any task run on any server that is part of the “all_hadoop” group in the inventory can see the variables defined in this file, which contains:

# file: group_vars/all_hadoop

hadoop_cluster: true

In this case “hadoop_cluster” is defined, and has a value of “true”. Our task above doesn’t care about the value, only that the variable is defined at all. If I run the above task on the server “namenode1”, and “namenode1” is in a group called “all_hadoop” in my inventory file, it will inherit the variables in group_vars/all_hadoop. That means “hadoop_cluster” is defined, so the task will be run.
Another role or task, which might be part of the “common” role or in a completely different role, will be able to access the same variable and act on it. That role / task might actually care about the value of the variable, and would be able to see that value. Or it might just care that the variable is defined.
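
Coming back to Kathryn’s question about defaults, here is a minimal sketch of how a dependency and a default might fit together. The “application” and “common” role names come from her question; the variable “sudoers_dir” and its value are made up for the example, and this is not our actual role layout.

```yaml
# file: roles/application/meta/main.yml
dependencies:
  - role: common                 # "application" pulls in "common" first
---
# file: roles/common/defaults/main.yml
sudoers_dir: /etc/sudoers.d      # hypothetical default variable
---
# file: roles/application/tasks/main.yml
- name: Use a default that came from the "common" role
  debug:
    msg: "sudoers files live in {{ sudoers_dir }}"
```

These are three separate files, shown together only for convenience. Role defaults have the lowest priority of any variable source, so a value for the same name in group_vars or host_vars should win over the default.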

Another example: I built a role for a set of servers at work. In our development environment we wanted the developers who write the applications that run on those servers to be able to use sudo to gain root access. I added another task to the same file as our Hadoop example above:

- name: Sudoers - push sudoers.d/project_conf
  template: >
    src=sudoers_project_conf.j2
    dest=/etc/sudoers.d/project_conf
    owner=root
    group=root
    mode=0440
  when: allow_project_sudo is defined

In our inventory, the development servers for this project are in a “dev_project” group, and there’s a group_vars/dev_project file that defines “allow_project_sudo”. We also have a “production_project” group in our inventory which contains the production servers for this project. The “allow_project_sudo” variable is NOT defined in group_vars/production_project, so that sudoers file is not pushed out.
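
One thing to watch with “is defined”: the value really is ignored, so setting the variable to false still counts as defined. Below is a rough sketch of the two group_vars files, plus an alternative condition that looks at the value. The value “true” is my assumption for the example, and the stricter condition is only an alternative, not what our role uses.

```yaml
# file: group_vars/dev_project
allow_project_sudo: true         # assumed value; the task only checks "is defined"
---
# file: group_vars/production_project
# allow_project_sudo is deliberately left out here. Writing
# "allow_project_sudo: false" would still make it defined.
---
# Alternative task: act on the value, treating a missing variable as false.
- name: Sudoers - push sudoers.d/project_conf
  template:
    src: sudoers_project_conf.j2
    dest: /etc/sudoers.d/project_conf
    owner: root
    group: root
    mode: "0440"
  when: allow_project_sudo | default(false) | bool
```

With the stricter form you can define the variable everywhere and flip it per group, which some people find easier to audit than relying on a variable being absent.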

Directly addressing Kathryn’s question about one role being able to call variables “defined” by another role (although, as I’ve already shown, our roles don’t really “define” these variables, they just access them), I have this task:

- name: Build ssh key files
  assemble: >
    src={{ item.user }}_ssh_keys
    dest=/home/{{ item.user }}/.ssh/authorized_keys
    owner={{ item.user }}
    group={{ item.group }}
    mode=0600
    remote_src=false
    backup=yes
  with_items:
    - { user: 'projectuser', group: 'projectgroup' }
  when: allow_project_sudo is defined

Again, we look to see if “allow_project_sudo” is defined; if so, we build a .ssh/authorized_keys file for the user “projectuser”, allowing all those same devs to ssh into the server as that user. This task also includes the intriguing and useful “with_items”. This allows for a form of looping: Ansible performs the task once for each item listed in the “with_items” block, setting the “item.user” and “item.group” variables used in the src, dest, owner and group lines each time.
Each line in “with_items” is an “item”. In this case our one item holds two key/value pairs (basically an associative array), and we can reference each value by its key. “item.user” has the value “projectuser”. “item.group” has the value “projectgroup”. Thus our “assemble” becomes, on the first (and here only) iteration of “with_items”:

assemble: >
  src=projectuser_ssh_keys
  dest=/home/projectuser/.ssh/authorized_keys
  owner=projectuser
  group=projectgroup
  mode=0600
  remote_src=false
  backup=yes

This basically says: grab all the files (presumably ssh key files) in the directory “projectuser_ssh_keys” (stored inside a directory in our role) and build, on the managed host, a file called “authorized_keys” in the directory /home/projectuser/.ssh; make that file owned by projectuser:projectgroup, with -rw------- permissions. Oh, and back up the original file first, just in case.
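
The looping becomes more interesting with more than one item. Here is a sketch of the same task with a second entry, written with the newer “loop” keyword (available since Ansible 2.5) and regular YAML key: value arguments. The second user and group are hypothetical, and I’m assuming a matching “otheruser_ssh_keys” directory exists in the role.

```yaml
- name: Build ssh key files
  assemble:
    src: "{{ item.user }}_ssh_keys"
    dest: "/home/{{ item.user }}/.ssh/authorized_keys"
    owner: "{{ item.user }}"
    group: "{{ item.group }}"
    mode: "0600"
    remote_src: false            # the key fragments live on the control machine
    backup: true                 # keep a copy of the previous authorized_keys
  loop:
    - { user: 'projectuser', group: 'projectgroup' }
    - { user: 'otheruser', group: 'othergroup' }     # hypothetical second entry
  when: allow_project_sudo is defined
```

The task now runs twice, once per entry, with “item.user” and “item.group” taking the values from each line in turn. The “when” condition is checked for each item.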