Difference between Forks and Serial in Ansible

Heshan Dharmasena
Published in DevOps SriLanka
6 min read · May 7, 2020

What We’ll Cover

  • What is Ansible?
  • Ansible Tuning Parameters
  • Ansible Official Documentation
  • What is FORKS?
  • What is SERIAL?
  • Conclusion

What is Ansible?

Ansible is, simply put, a configuration management tool, and it falls under the Infrastructure as Code (IaC) category in cloud computing. There are a lot of reasons to choose Ansible over other configuration management tools. Every story has two sides, and Ansible has its own capabilities as well.

The Ansible configuration file is generally placed at /etc/ansible/ansible.cfg. This is where most of the Ansible tuning parameters are stored by default.

In this article I use six different stories (use cases) to explain the concepts of the Forks and Serial parameters.

Ansible Tuning Parameters

Here is the official documentation about tuning performance in Ansible.

Forks and Serial are a couple of tuning parameters in Ansible. These parameters help customize the default behavior of Ansible. Such customizations become necessary in production environments, where requirements differ. This article is a small effort to give you both a theoretical idea and a practical explanation of how the forks and serial tuning parameters work in practice.

Forks is usually set in the Ansible configuration file or with the -f (--forks) command-line option, which also works for ad hoc commands, while serial is a keyword set on a play inside a playbook. My explanation here focuses entirely on ansible-playbook scenarios.

Ansible Official Documentation

Read Forks in the Ansible documentation and Serial in the Ansible documentation. Go through the official documentation first; if it is not clear, this article might help you :)

When we run an Ansible playbook, it usually processes each task on all hosts in the inventory before it moves on to the next task. This is the default behavior.

What is FORKS?

Simple Explanation: Forks decides the maximum number of simultaneous connections that Ansible makes for each task in a single run. The default forks value is 5.

Use: When you need to manage how many nodes are affected simultaneously.

Story 1: Suppose you have 4 nodes (nodeA, nodeB, nodeC, nodeD) in the inventory, 2 tasks in the playbook, and forks = 5 (the default).

The first task runs on all 4 nodes (nodeA, nodeB, nodeC, nodeD) simultaneously, then the second task runs on all 4 nodes (nodeA, nodeB, nodeC, nodeD) simultaneously. If we assume the processing time for each task is 5 seconds:

Time taken for 1st task = 5s (nodeA, nodeB, nodeC, nodeD)

Time taken for 2nd task = 5s (nodeA, nodeB, nodeC, nodeD)

Total time taken for playbook = 10s

That’s how Ansible works by default (by default Ansible processes a maximum of 5 nodes simultaneously). This is set in /etc/ansible/ansible.cfg as forks = 5. You are free to change the forks value, but remember that increasing it may put a heavy load on the Ansible control node (so make sure not to overload the control node’s resources).
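To make the location of this setting concrete, here is a minimal Python sketch that parses a sample ansible.cfg with the standard-library configparser module. The sample file content and the helper name read_forks are assumptions made for illustration; only the [defaults] section and the forks key come from Ansible itself.

```python
import configparser

# Hypothetical sample of an ansible.cfg file; a real file usually has more settings.
SAMPLE_CFG = """
[defaults]
forks = 5
"""

ANSIBLE_DEFAULT_FORKS = 5  # value Ansible uses when forks is not set


def read_forks(cfg_text: str) -> int:
    """Return the forks value from ansible.cfg text, or the default if absent."""
    # interpolation=None so that '%' characters in other settings are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.getint("defaults", "forks", fallback=ANSIBLE_DEFAULT_FORKS)


if __name__ == "__main__":
    print(read_forks(SAMPLE_CFG))      # forks as written in the sample
    print(read_forks("[defaults]\n"))  # no forks key, so the default is used
```

The same value can also be overridden for a single run with the -f (--forks) option of ansible-playbook.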

Story 2: Suppose you have 6 nodes (nodeA, nodeB, nodeC, nodeD, nodeE, nodeF) in the inventory and 2 tasks in the playbook (assuming the default forks = 5).

The first task runs on 5 nodes (nodeA, nodeB, nodeC, nodeD, nodeE) simultaneously; once those 5 nodes have completed it, the task runs on the remaining node (nodeF). Then the second task runs on 5 nodes (nodeA, nodeB, nodeC, nodeD, nodeE) simultaneously; once those 5 nodes have completed it, the task runs on the remaining node (nodeF). If we assume the processing time for each task is 5 seconds:

Time taken for 1st task = 5s (nodeA, nodeB, nodeC, nodeD, nodeE) + 5s (nodeF)

Time taken for 2nd task = 5s (nodeA, nodeB, nodeC, nodeD, nodeE) + 5s (nodeF)

Total time taken for playbook = 20s
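The arithmetic in Stories 1 and 2 can be sketched in a few lines of Python. This is only a simplified model, assuming every host takes exactly the same time per task (5 seconds here), so hosts finish in clean groups of at most `forks`; real runs will vary with per-host timing. The function names are hypothetical and not part of Ansible.

```python
import math


def task_time(num_hosts: int, forks: int = 5, seconds_per_task: int = 5) -> int:
    """Time for one task when at most `forks` hosts are handled at once."""
    groups = math.ceil(num_hosts / forks)  # how many rounds of hosts are needed
    return groups * seconds_per_task


def playbook_time(num_hosts: int, num_tasks: int, forks: int = 5,
                  seconds_per_task: int = 5) -> int:
    """Total time when each task must finish on all hosts before the next starts."""
    return num_tasks * task_time(num_hosts, forks, seconds_per_task)


if __name__ == "__main__":
    print(playbook_time(num_hosts=4, num_tasks=2))  # Story 1 -> 10
    print(playbook_time(num_hosts=6, num_tasks=2))  # Story 2 -> 20
```

Raising forks to 6 in this model brings Story 2 back down to 10 seconds, which is why the value is worth tuning when the control node has resources to spare.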

Hope you now have a CLEAR idea of how FORKS works.

What is SERIAL?

Simple Explanation: Serial decides the maximum number of nodes that go through all the tasks in a single run. If the total node count is higher than the SERIAL value, the playbook runs again for the remaining nodes. By default, Ansible runs in parallel against all the hosts in the pattern you set in the hosts: field of each play.

Use: When you need to apply changes in batches (rolling changes).

Story 3: Suppose you have 4 nodes (nodeA, nodeB, nodeC, nodeD) in the inventory, 2 tasks in the playbook, forks = 5 and serial = 2.

The first task runs on 2 nodes (nodeA, nodeB) simultaneously (without serial it would run on all 4 nodes, but because of the serial setting it runs on only 2), and then Ansible moves on to the second task. The second task runs on the same 2 nodes (nodeA, nodeB) simultaneously. Once both tasks are completed, the playbook runs again for the remaining 2 nodes.

In the next run, the first task runs on 2 nodes (nodeC, nodeD) simultaneously, then the second task runs on 2 nodes (nodeC, nodeD) simultaneously. After that, the playbook run is complete.

If we assume the processing time for each task is 5 seconds:

First run, Time taken for 1st task = 5s (nodeA, nodeB)

First run, Time taken for 2nd task = 5s (nodeA, nodeB)

Second run, Time taken for 1st task = 5s (nodeC, nodeD)

Second run, Time taken for 2nd task = 5s (nodeC, nodeD)

Total time taken for playbook = 20s

Let’s take another set of nodes to explain the serial scenario.

Story 4: Suppose you have 3 nodes (nodeA, nodeB, nodeC) in the inventory, 2 tasks in the playbook, forks = 5 and serial = 2.

The first task runs on 2 nodes (nodeA, nodeB) simultaneously, and then Ansible moves on to the second task. The second task runs on the same 2 nodes (nodeA, nodeB) simultaneously. Once both tasks are completed, the playbook runs again for the remaining node.

In the next run, the first task runs on 1 node (nodeC), then the second task runs on 1 node (nodeC). After that, the playbook run is complete.

If we assume the processing time for each task is 5 seconds:

First run, Time taken for 1st task = 5s (nodeA, nodeB)

First run, Time taken for 2nd task = 5s (nodeA, nodeB)

Second run, Time taken for 1st task = 5s (nodeC)

Second run, Time taken for 2nd task = 5s (nodeC)

Total time taken for playbook = 20s
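The batching in Stories 3 and 4 can be sketched in Python as follows. This is a rough model rather than Ansible's own code: the helper names are hypothetical, and it assumes serial is not larger than forks, so each batch finishes a task in one 5-second step.

```python
def serial_batches(hosts, serial):
    """Split hosts into consecutive batches of at most `serial` hosts."""
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


def run_order(hosts, tasks, serial):
    """List (run number, task, batch) in the order they are processed.

    Each batch goes through every task before the next batch starts.
    """
    order = []
    for run_number, batch in enumerate(serial_batches(hosts, serial), start=1):
        for task in tasks:
            order.append((run_number, task, batch))
    return order


if __name__ == "__main__":
    tasks = ["1st task", "2nd task"]
    seconds_per_task = 5

    # Story 4: 3 nodes, serial = 2
    steps = run_order(["nodeA", "nodeB", "nodeC"], tasks, serial=2)
    for run_number, task, batch in steps:
        print(f"Run {run_number}, {task}: {seconds_per_task}s {batch}")
    print("Total:", len(steps) * seconds_per_task, "s")  # 4 steps -> 20 s
```

Passing the four nodes of Story 3 gives the same four steps, which is why both stories add up to 20 seconds.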

Did you get the idea of SERIAL now?

Don’t assume that serial has no effect on execution time just because the execution times for Story 2 and Story 3 were the same. Consider 2 tasks on 10 nodes with forks = 5, compared with 2 tasks on 10 nodes with forks = 5 and serial = 4.

Story 5: 2 tasks, 10 nodes with forks = 5.

Single run,

1st task on first 5 nodes = 5s (node1, node2, node3, node4, node5)

1st task on second 5 nodes = 5s (node6, node7, node8, node9, node10)

2nd task on first 5 nodes = 5s (node1, node2, node3, node4, node5)

2nd task on second 5 nodes = 5s (node6, node7, node8, node9, node10)

Total time taken for playbook = 20s

Story 6: 2 tasks, 10 nodes with forks = 5 and serial = 4.

First run, 1st task = 5s (node1, node2, node3, node4)

First run, 2nd task = 5s (node1, node2, node3, node4)

Second run, 1st task = 5s (node5, node6, node7, node8)

Second run, 2nd task = 5s (node5, node6, node7, node8)

Third run, 1st task = 5s (node9, node10)

Third run, 2nd task = 5s (node9, node10)

Total time taken for playbook = 30s
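Stories 5 and 6 combine both settings, and the comparison can be checked with a small Python model. As before, this is a sketch under the assumption that every host takes the same 5 seconds per task; the function name and its parameters are hypothetical and only mirror the forks and serial values used in the stories.

```python
import math


def playbook_seconds(num_hosts, num_tasks, forks=5, serial=None, seconds_per_task=5):
    """Estimate total run time for a play with uniform task duration.

    serial=None means no batching: all hosts form a single batch.
    """
    batch_size = serial if serial else num_hosts
    total = 0
    remaining = num_hosts
    while remaining > 0:
        batch = min(batch_size, remaining)       # hosts in this run
        groups = math.ceil(batch / forks)        # forks still caps parallelism
        total += num_tasks * groups * seconds_per_task
        remaining -= batch
    return total


if __name__ == "__main__":
    print(playbook_seconds(10, 2, forks=5))            # Story 5 -> 20
    print(playbook_seconds(10, 2, forks=5, serial=4))  # Story 6 -> 30
```

In this model serial = 4 leaves one of the five forks idle in every run and adds a third run for the last two nodes, which accounts for the extra 10 seconds.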

Hope you now have a CLEAR idea of how SERIAL works.

Conclusion

Finally, Ansible forks decides the maximum number of parallel connections that can be initiated from the control node to the managed nodes to execute Ansible commands.

And serial decides the maximum number of hosts that are processed together, through all the tasks, in a single run (batch) of the playbook.

Please comment if there is anything that needs to be changed or corrected.

Thank you for reading.

Heshan Dharmasena
DevOps SriLanka

Passionate on Linux | DevOps | Cloud | Automation | Platform Engineering | Red Hat Certified Architect | Technical Trainer | Organizer DevOps Sri Lanka