Using Ansible on Linux to manage Windows VMs

Ismar Omerčević
ReversingLabs Engineering
8 min read · May 8, 2023
Source: https://unsplash.com/photos/vpOeXr5wmR4

Authors: Ismar Omerčević and Andrija Milovac

In this blog, we present how we manage a system of 850+ Windows 10 virtual machines that actively scan samples for malware with 40+ antivirus scanners, with roughly 7 million files passing through the system daily.

We accomplish this by leveraging an open source IT automation tool called Ansible, developed by Red Hat.

Ansible can be used to configure systems, deploy software and orchestrate complex IT tasks such as continuous deployments or zero downtime rolling updates.

It manages machines in an agentless manner. What this means in plain English is that you do not need to have Ansible installed on the machines you are managing. Ansible only requires the remote machine to have Python installed, and to be configured to accept SSH or some other type of remote connection. (Windows hosts are an exception: they are managed over WinRM instead, as described later in this post.)

Ansible comes with various command line tools which include:

  • ansible
    – define and run a single task playbook (ad hoc task) against a set of hosts
  • ansible-config
    – view the Ansible configuration
  • ansible-console
    – REPL console for executing ad-hoc tasks
  • ansible-doc
    – plugin documentation tool
  • ansible-galaxy
    – command to manage Ansible roles in shared repositories, the default of which is Ansible Galaxy https://galaxy.ansible.com.
  • ansible-inventory
    – used to display or dump the configured inventory
  • ansible-playbook
    – run Ansible playbooks, executing the defined tasks on the targeted hosts
  • ansible-pull
    – pulls playbooks from a VCS repo and executes them for the local host
  • ansible-vault
    – encryption/decryption utility for Ansible data files

Ansible Core Concepts

Control Node

A Control Node is the host with Ansible installed. It connects to the hosts you wish to manage, and runs ad hoc commands or Ansible playbooks by executing the tasks remotely on the managed nodes.

Managed Node

A Managed Node is a remote system or host controlled by Ansible.

Inventory

An Ansible inventory defines a list of logically organized managed nodes. You can then execute tasks against this list of hosts. The inventory is most commonly written in YAML or INI format, but there are other supported formats such as TOML, and even the possibility of dynamically loading the inventory.

Example inventory:

server.ungrouped.com

[webservers]
cookies.reversinglabs.com
shirts[01:10].reversinglabs.com
staging.cookies.reversinglabs.com http_port=8080

[webservers:vars]
http_port=80
title=hello

[databases:children]
nosqldatabases
sqldatabases

[nosqldatabases]
mongodb.reversinglabs.com
scylladb.reversinglabs.com
staging.mongodb.reversinglabs.com

[sqldatabases]
pgsql.reversinglabs.com
staging.pgsql.reversinglabs.com

[production]
cookies.reversinglabs.com
shirts.reversinglabs.com
pgsql.reversinglabs.com
mongodb.reversinglabs.com
scylladb.reversinglabs.com

[production:vars]
deployment=production

[staging]
staging.pgsql.reversinglabs.com
staging.cookies.reversinglabs.com
staging.mongodb.reversinglabs.com

[staging:vars]
deployment=staging
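The same structure can also be expressed in YAML. Here is a partial sketch of the inventory above in that format (the filename hosts.yml is just a convention, not something from our setup):

```yaml
# hosts.yml — partial YAML equivalent of the INI inventory above
all:
  children:
    webservers:
      hosts:
        cookies.reversinglabs.com:
        shirts[01:10].reversinglabs.com:
        staging.cookies.reversinglabs.com:
          http_port: 8080        # host-level variable
      vars:
        title: hello             # group-level variable
    databases:
      children:
        nosqldatabases:
          hosts:
            mongodb.reversinglabs.com:
            scylladb.reversinglabs.com:
            staging.mongodb.reversinglabs.com:
        sqldatabases:
          hosts:
            pgsql.reversinglabs.com:
            staging.pgsql.reversinglabs.com:
```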

We can analyze this inventory by using the ansible-inventory command:

ansible-inventory --graph -i hosts.ini 
@all:
|--@databases:
| |--@nosqldatabases:
| | |--mongodb.reversinglabs.com
| | |--scylladb.reversinglabs.com
| | |--staging.mongodb.reversinglabs.com
| |--@sqldatabases:
| | |--pgsql.reversinglabs.com
| | |--staging.pgsql.reversinglabs.com
|--@production:
| |--cookies.reversinglabs.com
| |--mongodb.reversinglabs.com
| |--pgsql.reversinglabs.com
| |--scylladb.reversinglabs.com
| |--shirts.reversinglabs.com
|--@staging:
| |--staging.cookies.reversinglabs.com
| |--staging.mongodb.reversinglabs.com
| |--staging.pgsql.reversinglabs.com
|--@ungrouped:
| |--server.ungrouped.com
|--@webservers:
| |--cookies.reversinglabs.com
| |--shirts01.reversinglabs.com
| |--shirts02.reversinglabs.com
| |--shirts03.reversinglabs.com
| |--shirts04.reversinglabs.com
| |--shirts05.reversinglabs.com
| |--shirts06.reversinglabs.com
| |--shirts07.reversinglabs.com
| |--shirts08.reversinglabs.com
| |--shirts09.reversinglabs.com
| |--shirts10.reversinglabs.com
| |--staging.cookies.reversinglabs.com

We can also view the variables set for each host. The host staging.cookies.reversinglabs.com receives the variables defined in both the staging and webservers groups, as well as its own host variables. Notice that http_port is 8080 and not 80, because of variable precedence: variables set on a host take precedence over variables set on a group.

ansible-inventory --host staging.cookies.reversinglabs.com -i hosts.ini
{
    "deployment": "staging",
    "http_port": 8080,
    "title": "hello"
}

Playbook

An Ansible playbook is a YAML file that defines the tasks to be executed in a declarative, repeatable manner, and can easily be shared with others via a VCS. A playbook is composed of one or more plays. Each play contains a name, the managed nodes you wish to run the play on, and one or more tasks; each task calls an Ansible module. Playbooks can contain more than just a name, target and tasks: you can use many playbook keywords at the playbook, play or task level to influence what Ansible does. You can find a list of these keywords here: https://docs.ansible.com/ansible/latest/reference_appendices/playbooks_keywords.html.

---
- name: Update web servers
  hosts: webservers
  remote_user: root

  tasks:
    - name: Ensure apache is at the latest version
      ansible.builtin.yum:
        name: httpd
        state: latest
    - name: Write the apache config file
      ansible.builtin.template:
        src: /srv/httpd.j2
        dest: /etc/httpd.conf

- name: Update sql servers
  hosts: sqldatabases
  remote_user: root

  tasks:
    - name: Ensure postgresql is at the latest version
      ansible.builtin.yum:
        name: postgresql
        state: latest
    - name: Ensure that postgresql is started
      ansible.builtin.service:
        name: postgresql
        state: started

Playbook execution:

To run a playbook, use the ansible-playbook command. For example, to run the preceding playbook with the inventory we described earlier in this blog, use the following command:

ansible-playbook playbook.yml -i hosts.ini

A playbook runs in order from top to bottom, as do the tasks within each play. By default, Ansible executes each task on all the targeted hosts before continuing to the next task; you can change this behavior if you need to by using a different strategy. Most Ansible modules are idempotent, meaning that they won’t change the state of the managed node if it is already in the desired state. When you run a playbook, Ansible returns information about connections, the executed tasks and whether each task succeeded on each node.
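As a small illustration of idempotency (a sketch; the path here is hypothetical), the first run of a task like the following reports changed, while every subsequent run reports ok, because the directory already exists in the desired state:

```yaml
- name: Ensure the working directory exists
  ansible.builtin.file:
    path: /opt/scanner/work   # hypothetical path
    state: directory
    mode: "0755"
```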

Jinja2

Ansible uses Jinja2 templating to enable dynamic expressions, access to variables and facts, and various filters for manipulating data. You can use templating with the template module. For example, you can create a template for a database configuration file, deploy that configuration to multiple environments, and simultaneously provide the correct, differing data to each database instance.
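As a small illustration (the template contents, variable names and paths here are all hypothetical), a database configuration template and the task that renders it might look like this:

```yaml
# templates/pg.conf.j2 — the Jinja2 template, shown here as comments for context:
#   max_connections = {{ max_connections }}
#   listen_address  = '{{ listen_address | default("localhost") }}'
#   # environment: {{ deployment | upper }}

- name: Render the database configuration for this environment
  ansible.builtin.template:
    src: pg.conf.j2
    dest: /etc/postgresql/pg.conf
```

Each host gets the same template rendered with its own variables, so one file serves every environment.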

Handlers

An Ansible play can also contain handlers. Handlers are defined in a similar manner as tasks within a play, and are used to execute a task at the end of the play only if a change occurred on a machine. You trigger handler execution at the task level with the notify keyword. Notifying the same handler multiple times has no additional effect, as the handler will only be executed once.

Example playbook with handlers:

---
- name: Verify apache installation
  hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
    - name: Ensure apache is at the latest version
      ansible.builtin.yum:
        name: httpd
        state: latest

    - name: Write the apache config file
      ansible.builtin.template:
        src: /srv/httpd.j2
        dest: /etc/httpd.conf
      notify:
        - Restart apache

    - name: Ensure apache is running
      ansible.builtin.service:
        name: httpd
        state: started

  handlers:
    - name: Restart apache
      ansible.builtin.service:
        name: httpd
        state: restarted

Roles

Roles allow you to automatically load related variables, files, tasks, handlers, and other Ansible artifacts based on file structure. After grouping your content in roles, you can easily reuse them and share them with other users. An Ansible role has a defined directory structure with eight main standard directories. You must include at least one of these directories in each role. You can omit any directories the role does not use.

Example role directory structure:

roles/
    common/             # this hierarchy represents a "role"
        tasks/          #
            main.yml    # <-- tasks file can include smaller files if warranted
        handlers/       #
            main.yml    # <-- handlers file
        templates/      # <-- files for use with the template resource
            ntp.conf.j2 # <------- templates end in .j2
        files/          #
            bar.txt     # <-- files for use with the copy resource
            foo.sh      # <-- script files for use with the script resource
        vars/           #
            main.yml    # <-- variables associated with this role
        defaults/       #
            main.yml    # <-- default lower priority variables for this role
        meta/           #
            main.yml    # <-- role dependencies
        library/        # roles can also include custom modules
        module_utils/   # roles can also include custom module_utils
        lookup_plugins/ # or other types of plugins, like lookup in this case

    webtier/            # same kind of structure as "common" was above, done for the webtier role
    monitoring/         # ""
    fooapp/             # ""

You can use roles in three ways:

  • At the play level with the roles option: This is the classic way of using roles in a play.
  • At the tasks level with include_role: You can reuse roles dynamically anywhere in the tasks section of a play using include_role.
  • At the tasks level with import_role: You can reuse roles statically anywhere in the tasks section of a play using import_role.

Basic example of using roles:

---
- hosts: webservers
  roles:
    - common
    - webservers
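The same roles could instead be pulled in from the tasks section of a play. A sketch of the dynamic (include_role) and static (import_role) variants:

```yaml
---
- hosts: webservers
  tasks:
    - name: Include the common role dynamically (processed at runtime)
      ansible.builtin.include_role:
        name: common

    - name: Import the webservers role statically (processed when the playbook is parsed)
      ansible.builtin.import_role:
        name: webservers
```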

Ansible on Windows

For Ansible to communicate with a Windows host, the Windows host has to meet the following requirements:

  • PowerShell 3.0 and .NET 4.0 or newer
  • a WinRM listener should be created and activated

Keep in mind that Ansible uses different modules for Windows which can be found here: https://docs.ansible.com/ansible/2.9/modules/list_of_windows_modules.html#windows-modules.
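The WinRM connection settings are typically supplied as inventory variables. A sketch of what a group of Windows hosts might look like (the group name, hostnames and port are hypothetical placeholders; the connection variables mirror the ones we use in the playbooks below):

```yaml
# Inventory vars for a hypothetical group of Windows hosts
windows:
  hosts:
    vm01.example.com:
    vm02.example.com:
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: basic
    ansible_port: 5986                          # HTTPS WinRM listener
    ansible_winrm_server_cert_validation: ignore
```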

In addition, the following core modules work on Windows as well:

  • add_host
  • assert
  • async_status
  • debug
  • fail
  • fetch
  • group_by
  • include
  • include_role
  • meta
  • pause
  • raw
  • script
  • set_fact
  • set_stats
  • setup
  • slurp
  • template
  • wait_for_connection

Here are some example playbooks that we use in production:

  • Fetch file from target node (VM)
---
- name: Fetch file from VM
  hosts: "{{ target_nodes }}"
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: basic
    ansible_python_interpreter: /usr/bin/python3
    ansible_winrm_server_cert_validation: ignore

  vars_prompt:
    - name: ansible_user
      prompt: Username
      private: no

    - name: ansible_password
      prompt: Password
      private: yes

    - name: target_nodes
      prompt: Target
      private: no

  tasks:
    - name: Fetch File
      ansible.builtin.fetch:
        src: C:\vtest2\scripts\definitions.ps1
        dest: /home/ansible/
        flat: yes
      register: cp_from
  • A quick way to get rid of SMB shared disks on VMs
---
- name: Unmap network drive M and Z
  hosts: "{{ group }}"
  serial: 5
  vars:
    ansible_user: "{{ user }}"
    ansible_password: "{{ password }}"
    ansible_connection: winrm
    ansible_winrm_transport: basic
    ansible_python_interpreter: /usr/bin/python3
    ansible_winrm_server_cert_validation: ignore

  vars_prompt:
    - name: user
      prompt: Username
      private: no

    - name: password
      prompt: Password
      private: yes

    - name: group
      prompt: Group target
      private: no

  tasks:
    - name: Delete net drive Z
      win_mapped_drive:
        letter: Z
        state: absent

Ansible at scale

Forks

Forks is an Ansible parameter used for batching task execution. The default value for forks is 5: Ansible executes a task on the first batch of five hosts, waits for the task to complete on all of them, and then continues with the next batch of five, repeating until the task has run on every targeted host.

Forks is set either in ansible.cfg, or with the --forks option when executing a playbook. While on the subject of forks, hardware requirements are worth mentioning: for an automation controller, the baseline is 2 GB of memory, plus an additional 1 GB for every 10 forks. So, to use the same example as the documentation, 400 forks require 42 GB of memory.
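A minimal sketch of raising the default in ansible.cfg (the value 50 is just an illustration, not our production setting):

```ini
# ansible.cfg — raise the default batch size from 5 to 50
[defaults]
forks = 50
```

The same effect can be achieved for a single run with ansible-playbook --forks 50 playbook.yml.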

Strategies

There are multiple built-in strategies, along with the option to write your own strategy plugin.

Common strategy types are linear (the default), debug, free and host_pinned.

As the name suggests, the linear strategy executes tasks in a linear fashion: each task runs on every host, and the next task won’t start until the previous task has completed on all hosts.

Debug is a strategy that allows the user to run the playbook interactively for troubleshooting purposes.

The free strategy, unlike the linear strategy, doesn’t wait for all hosts to complete a task before continuing to the next task.

The host_pinned strategy is similar to the free strategy, with the difference that a play won’t start on a host unless it can finish on that host without interruption by tasks for other hosts.
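A strategy is selected per play with the strategy keyword. For example, to let each host proceed through its tasks independently (a sketch reusing the web server play from earlier):

```yaml
---
- name: Update web servers without waiting on stragglers
  hosts: webservers
  strategy: free    # each host moves to its next task as soon as it is ready
  tasks:
    - name: Ensure apache is at the latest version
      ansible.builtin.yum:
        name: httpd
        state: latest
```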
