Resource Limits in Linux | limits.conf

A Comprehensive Guide to understand Limits.conf file

Harsha Koushik
Kernel Space
May 31, 2021

After going through the Fork Bomb article, we have a pretty good idea of why we should set Resource Limits. If you have not read it yet, here it is: https://medium.com/kernel-space/linux-security-all-about-fork-bomb-75dd6741f060 . A Fork Bomb is just one example; there is much more to the limits file. We will explore all the other limit types in the ‘limits.conf’ file in this article.

Basically, ‘limits.conf’ is read by a PAM module called ‘pam_limits’. This module parses this file and all *.conf files in the ‘limits.d’ directory, and takes responsibility for applying these limits to the sessions of Users logging in. PAM stands for Pluggable Authentication Modules, a set of powerful modules that control how Users authenticate to apps/services dynamically. PAM by itself would become a series of articles if I started explaining it, so we’ll limit it to what’s needed here and explore PAM in other articles.

Let us look at where this file is located and what it looks like. It lives in the ‘/etc/security/’ directory, which contains a ‘limits.d’ directory and a ‘limits.conf’ file. Just like many other configs under /etc in Linux, you can keep multiple separate config files in the limits.d directory (only *.conf files are considered when reading the config), or maintain a single config file, limits.conf.
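To make the lookup order concrete, here is a minimal Python sketch that lists the files in the order pam_limits is documented to read them: limits.conf first, then the *.conf files from limits.d. The function name `limits_files` is made up for this illustration, and sorting by name is an assumption that approximates the "C" locale order the man page describes.

```python
from pathlib import Path


def limits_files(base="/etc/security"):
    """Return limits.conf followed by the *.conf files in limits.d, sorted by name."""
    base = Path(base)
    files = []
    main = base / "limits.conf"
    if main.is_file():
        files.append(main)
    drop_in_dir = base / "limits.d"
    if drop_in_dir.is_dir():
        # Only names ending in .conf are picked up; anything else is skipped.
        files.extend(sorted(drop_in_dir.glob("*.conf")))
    return files


for path in limits_files():
    print(path)
```

On a machine without these files the loop simply prints nothing.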

Location of limits.conf & limits.d Directory

This config file is very simple. It contains just four fields. Everything can be defined using these four fields and the options within them. These are the Four Fields —

<Domain> <Type> <Item> <Value>

Domain - the User or Group for which the limits are set.

Type - defines whether the Limit is Soft or Hard.

Item - the actual Limit Type, e.g. nproc, nofile, memlock.

Value - the number that defines the Limit for an Item.
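The four-field layout is easy to see in code. Below is a small Python sketch that splits a line into those fields. The names `LimitEntry` and `parse_limits_line` and the sample entries are invented for this illustration, and the parser is stricter than pam_limits itself because it rejects anything that does not have exactly four fields.

```python
from typing import NamedTuple, Optional


class LimitEntry(NamedTuple):
    domain: str
    type: str
    item: str
    value: str


def parse_limits_line(line: str) -> Optional[LimitEntry]:
    """Parse one limits.conf line; return None for blank or comment-only lines."""
    line = line.split("#", 1)[0].strip()  # everything after '#' is a comment
    if not line:
        return None
    fields = line.split()
    if len(fields) != 4:
        # Simplification: this sketch only accepts complete four-field entries.
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return LimitEntry(*fields)


# Hypothetical entries, made up for this example.
sample = """
# <domain>  <type>  <item>  <value>
@devs       hard    nproc   200
alice       soft    nofile  4096
"""

for raw in sample.splitlines():
    entry = parse_limits_line(raw)
    if entry is not None:
        print(entry)
```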

Domain

Domain can contain these values —

  • Single Username
  • Group Name, syntax: @<groupname>
  • UID Range, syntax: <min_uid>:<max_uid>, e.g. 1003:1009
  • GID Range, syntax: @<min_gid>:<max_gid>, e.g. @500:509
  • Wildcard * for all Users and all Groups, the default entry.
  • Wildcard % for the maxlogins item only. It can be combined with a group name, like %devs, to limit the total number of logins across all members of that group. Used alone, % behaves like * with the maxsyslogins item.
  • %:<group_id>, which limits maxlogins for the group with that specific group ID.
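As a rough illustration of how these forms differ, the Python sketch below classifies a Domain string by its shape. The function `classify_domain` and its labels are invented here. It follows the syntax described in the limits.conf man page, but it is not how pam_limits is actually implemented.

```python
import re

# "min:max", "min:" or ":max", with at least one number present.
_ID_RANGE = re.compile(r"^(\d+:\d*|:\d+)$")


def classify_domain(domain: str) -> str:
    """Return a human-readable label for a limits.conf Domain field."""
    if domain == "*":
        return "default entry"
    if domain == "%":
        return "all logins (maxlogins only)"
    if domain.startswith("%:"):
        return "maxlogins for GID " + domain[2:]
    if domain.startswith("%"):
        return "maxlogins for group " + domain[1:]
    if domain.startswith("@"):
        rest = domain[1:]
        if _ID_RANGE.match(rest):
            return "GID range " + rest
        return "group " + rest
    if _ID_RANGE.match(domain):
        return "UID range " + domain
    return "user " + domain


for d in ("alice", "@devs", "1003:1009", "@500:509", "*", "%:1000"):
    print(f"{d:12} -> {classify_domain(d)}")
```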

Type

Type can contain these values —

  • Hard - enforces a Hard Limit on the Resource. It is set by the Superuser and enforced by the Kernel. A normal user can never raise resource usage beyond this limit.
  • Soft - enforces a Soft Limit on the Resource, meaning the user can move it up or down within the range permitted by any pre-existing hard limit.
  • ‘-’ (a hyphen) - enforces both hard and soft limits together. Remember, if this hyphen is specified without the Item and Value fields, the module will never enforce any limit on that User or Group.

Note: The Soft limit must not be greater than the Hard limit, so Soft_Limit ≤ Hard_Limit. The Soft Limit is the value the Kernel actually enforces on a process at any given moment. A normal User can raise the Soft Limit of their own processes up to the Hard Limit, or lower it again, for example with the shell’s ulimit builtin. The Hard Limit is the ceiling: a normal User can lower it, but cannot raise it back.

There is no timer or grace period involved with these resource limits. The behaviour where a soft limit may be exceeded temporarily while a timer runs (seven days by default) belongs to disk quotas, which are a different mechanism.
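A process can inspect and adjust its own Soft Limit through the getrlimit/setrlimit system calls, which Python exposes in the standard `resource` module. The sketch below lowers the Soft Limit for open files and then restores it, leaving the Hard Limit untouched. The helper `set_soft_limit` and the value 256 are arbitrary choices for this example.

```python
import resource


def set_soft_limit(res, new_soft):
    """Change only the soft limit of a resource, keeping the hard limit as it is."""
    _, hard = resource.getrlimit(res)
    resource.setrlimit(res, (new_soft, hard))
    return resource.getrlimit(res)


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("nofile (soft, hard) before:", (soft, hard))

# Moving the soft limit down and back up needs no privileges,
# as long as it stays at or below the hard limit.
lowered = min(soft, 256)
print("after lowering:", set_soft_limit(resource.RLIMIT_NOFILE, lowered))
print("after restoring:", set_soft_limit(resource.RLIMIT_NOFILE, soft))
```

Asking for a Soft Limit above the Hard Limit makes `setrlimit` raise `ValueError`.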

Item

Item can contain these values —

core - This is NOT related to CPU Cores at all. This is the Core Dump File Size: the limit specifies the maximum allowed size of that file in KB. A core file is generated when a program is terminated due to a fault, e.g. the process accessed an invalid pointer, and is used for debugging.

data - This sets the Maximum size of a process’s data segment in KB. If this limit is exceeded, the malloc() function shall fail with errno set to [ENOMEM].

fsize - Largest file a User’s Process can create or extend, in KB.

memlock - This sets the maximum amount of locked-in-memory address space in KB. This is memory that will not be paged out.

nofile — Specifies the number of file descriptors a user process may have open at one time.

rss - RSS stands for Resident Set Size. It is how much memory a process currently holds in main memory (RAM), not counting memory that is swapped out. This item sets a Limit on that in KB, but it is ignored in Linux 2.4.30 and higher.

stack — Specifies the limit for largest process stack segment for a user’s process in KB.

cpu - This specifies the maximum CPU time a Process can use. In limits.conf the value is given in minutes, although the underlying RLIMIT_CPU limit is counted in seconds.

nproc — Sets limit on Number of Processes per User. We have changed this value in our Previous Fork Bomb article.

as - ‘as’ stands for Address Space. The Address Space of every process is divided into two sub-spaces: User and Kernel. User Space is the Process’s private space, and Kernel Space is shared across all Processes. This item sets a limit on the address space in KB. Address Space as a concept deserves a complete article of its own, so I will explain it in detail there.

maxlogins — This specifies Max Number of Logins allowed for that User. This doesn’t apply to the User with UID = 0 (Root by default).

maxsyslogins - This specifies the Max Number of all Logins on the System. Once the Total number of Logins exceeds this number, no User except the one with UID = 0 is allowed to log in.

priority — This specifies the priority to run user process with.

nice — Nice values are user-space values that we can use to control the priority of a process. This specifies maximum nice priority allowed to raise to.

rtprio — Stands for Real Time Priority. Real time processes have strict scheduling requirements, they must run every X micro/milli seconds and complete their Job. This specifies maximum realtime priority allowed for non-privileged processes.

Before moving forward with the other fields, we need to understand what Priority, Nice Value and RTPriority are, and for that we need the basics of how Scheduling works, because all these values are related to it. The Process Scheduler Subsystem in Linux takes care of all this. Priority clearly defines with what priority a Process runs, but what is this real time priority?

Linux supports scheduling classes so that different scheduling policies, together with their implementing algorithms, can coexist on the same platform. These are the Scheduling Classes —

  • SCHED_OTHER: the standard round-robin time-sharing policy
  • SCHED_BATCH: for “batch” style execution of processes
  • SCHED_IDLE: for running very low priority background jobs
  • SCHED_FIFO: a first-in, first-out policy
  • SCHED_RR: a round-robin policy
  • SCHED_DEADLINE: each task is assigned a deadline

The first three (SCHED_OTHER, SCHED_BATCH and SCHED_IDLE) are Normal Scheduling Classes; the last three (SCHED_FIFO, SCHED_RR and SCHED_DEADLINE) are Real Time Scheduling Classes.
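You can ask the Kernel which policy a process runs under and which static priority range each policy accepts. The Python sketch below does this with functions from the standard `os` module, which are available on Linux. The helpers `policy_name` and `priority_range` are invented for this example, and SCHED_DEADLINE is left out because the `os` module has no constant for it.

```python
import os

# Policies that Python's os module exposes as constants.
POLICIES = {
    name: getattr(os, name)
    for name in ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE", "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}


def policy_name(pid=0):
    """Return the scheduling policy name of a process (0 means this process)."""
    policy = os.sched_getscheduler(pid)
    for name, value in POLICIES.items():
        if value == policy:
            return name
    return f"unknown ({policy})"


def priority_range(name):
    """Return the (min, max) static priority the kernel accepts for a policy."""
    policy = POLICIES[name]
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)


print("this process runs under:", policy_name())
for name in POLICIES:
    print(f"{name:12} static priority range: {priority_range(name)}")
```

The Normal policies report a single static priority, while SCHED_FIFO and SCHED_RR report a real range. That range is what the rtprio item caps.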

Basically we have Real Time processes and Non Real Time (Normal) Processes. As said before, real-time processes have strict scheduling requirements. For these processes the scheduler drops everything else, to the point that if a normal process is already on the processor, the scheduler preempts it and switches in the real-time process. Non-real-time processes, as you already guessed, are the ones considered not urgent. With this said, let us explore those values.

Priority and Nice Value apply only to Non-real-time processes; they have nothing to do with Real Time Processes, which have their own Priority value called ‘rtprio’. The Nice Value is just a hint to the Kernel about the Priority a particular process should run with; actual scheduling happens based on the Priority Value. The relation between Priority and Nice Value is:

PR = 20 + NI

Nice Value ranges from -20 to 19, -20 being the highest priority and 19 the lowest. People often get confused about lowest and highest priority, but it is simple: the lower the number, the higher the priority (importance). The Nice Value is used by user-space programs. It is not, as is often misunderstood, the priority of a process; it is a number that influences the priority of a process.

Note: The Kernel can always choose the real priority (PR value) on its own depending on the situation. It need not change the nice value; it can directly change the Priority value of a Process, for example if a process has been running for a long time, or many processes have. So the formula PR = 20 + NI holds most of the time, but not always.

In Linux, a total of 140 priorities are available: 0 to 99 are real time priorities and 100 to 139 are regular priorities. The Nice value maps a process onto the last part of the range (100 to 139). The equation above therefore leaves the values 0 to 99 unreachable; they correspond to a negative PR level (from -100 to -1).

Similarly, the formula to calculate the Priority of a Real Time Process is this:

PR = -1 - real_time_priority

So the resultant Priority always stays in the range [-100, -1]. Here -100 means highest priority (max importance) and -1 the lowest (min importance). If the resulting PR is -100, that is a Real Time Priority of 99, the top command shows it as ‘rt’. Don’t get confused by numbers like -19, -100, 139 and so on. Just look at the output of the top command and you will have a clear picture of what is happening; the values shown there are after the calculation.
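The two formulas and the 0 to 139 scale fit together as in the small Python sketch below. The function names are invented here, and it assumes the 1 to 99 real-time priority range that Linux uses for SCHED_FIFO and SCHED_RR. Adding 100 to the PR value shown by top gives the position on the 140-level scale.

```python
def pr_from_nice(nice: int) -> int:
    """PR for a normal process: PR = 20 + NI."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    return 20 + nice


def pr_from_rtprio(rtprio: int) -> int:
    """PR for a real-time process: PR = -1 - real_time_priority."""
    if not 1 <= rtprio <= 99:  # assumed range for SCHED_FIFO / SCHED_RR
        raise ValueError("rtprio must be between 1 and 99")
    return -1 - rtprio


def scale_position(pr: int) -> int:
    """Map a PR value onto the 0..139 scale (0 is the most important)."""
    return pr + 100


for nice in (-20, 0, 19):
    pr = pr_from_nice(nice)
    print(f"nice {nice:>3} -> PR {pr:>4} -> level {scale_position(pr)}")

for rtprio in (1, 50, 99):
    pr = pr_from_rtprio(rtprio)
    print(f"rtprio {rtprio:>2} -> PR {pr:>4} -> level {scale_position(pr)}")
```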

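The Nice value of a Program can be set at launch using the nice command.

The Nice value of a running Process can be changed using the renice command.

Real Time Process Priority can be changed using the chrt command.

Note: By default, Normal Users cannot set negative Nice values or lower the Nice value of a process; only the Super User can. Likewise, Normal Users cannot meddle with Real Time Process Priority Values by default. The nice and rtprio items described above are what relax these restrictions for a User or Group.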
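The same operations are available programmatically. The Python sketch below wraps the system calls behind these commands using the standard `os` module on Linux. All helper names are invented for this example. To avoid changing the current process for good, the demonstration raises the Nice value in a forked child only, and `make_realtime` is defined but not called because it normally needs privileges.

```python
import os


def get_nice(pid=0):
    """Return the nice value of a process (0 means the calling process)."""
    return os.getpriority(os.PRIO_PROCESS, pid)


def renice(pid, value):
    """Set an absolute nice value, roughly what `renice -n VALUE -p PID` does."""
    os.setpriority(os.PRIO_PROCESS, pid, value)
    return get_nice(pid)


def make_realtime(pid, rtprio):
    """Switch a process to SCHED_FIFO, roughly what `chrt -f -p RTPRIO PID` does."""
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(rtprio))


def nice_in_child(increment):
    """Fork a child, add `increment` to its nice value, and report the result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: adjust its own niceness, send it back, exit
        os.close(read_fd)
        try:
            os.write(write_fd, str(os.nice(increment)).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        reported = pipe.read()
    os.waitpid(pid, 0)
    return int(reported)


print("parent nice:", get_nice())
print("child nice after +5:", nice_in_child(5))
print("parent nice afterwards:", get_nice())
```

Raising the Nice value (lowering importance) always works. Calling `renice` with a lower value or `make_realtime` as a Normal User typically fails with `PermissionError` unless the nice or rtprio limits allow it.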

Back to the Field Values…

locks — This specifies the Max number of File Locks the User can hold.

sigpending — Specifies the limit on the number of signals that may be queued for the real user ID of the calling process.

msgqueue - A message queue is an IPC mechanism, e.g. Process 1 writes into the queue and Process 2 reads from it. This item specifies the maximum memory used by POSIX message queues, in Bytes.
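Most of these Items correspond to a kernel resource limit (an RLIMIT_* constant), while maxlogins, maxsyslogins and priority are handled by pam_limits itself. The Python sketch below prints the limits of the current process for the Items that have such a counterpart. The mapping table and function names are written for this illustration, and constants missing from Python’s `resource` module on a given platform are skipped. The kernel reports sizes in bytes and CPU time in seconds, so the numbers differ from the units used in limits.conf.

```python
import resource

# limits.conf item -> name of the matching constant in the resource module.
ITEM_TO_RLIMIT = {
    "core": "RLIMIT_CORE",
    "data": "RLIMIT_DATA",
    "fsize": "RLIMIT_FSIZE",
    "memlock": "RLIMIT_MEMLOCK",
    "nofile": "RLIMIT_NOFILE",
    "rss": "RLIMIT_RSS",
    "stack": "RLIMIT_STACK",
    "cpu": "RLIMIT_CPU",
    "nproc": "RLIMIT_NPROC",
    "as": "RLIMIT_AS",
    "locks": "RLIMIT_LOCKS",
    "nice": "RLIMIT_NICE",
    "rtprio": "RLIMIT_RTPRIO",
    "sigpending": "RLIMIT_SIGPENDING",
    "msgqueue": "RLIMIT_MSGQUEUE",
}


def fmt(value):
    """Render a raw limit, showing 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {item: (soft, hard)} for every item this platform exposes."""
    limits = {}
    for item, const_name in ITEM_TO_RLIMIT.items():
        const = getattr(resource, const_name, None)
        if const is None:  # not available in this Python build or OS
            continue
        limits[item] = resource.getrlimit(const)
    return limits


for item, (soft, hard) in current_limits().items():
    print(f"{item:10} soft={fmt(soft):>12} hard={fmt(hard):>12}")
```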

Value

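Value is the actual number which specifies the limit for each item. All items except priority and nice also accept -1, unlimited or infinity to mean no limit. The limits.conf man page includes a small example.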
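Because limits.conf expresses sizes in KB and CPU time in minutes while the kernel works in bytes and seconds, a Value has to be scaled before it becomes a resource limit. The Python sketch below shows that conversion as I understand pam_limits to perform it. The function `to_rlimit_value` is invented for this illustration, and the nice and priority items are deliberately left out because they follow different rules.

```python
import resource

KB_ITEMS = {"core", "data", "fsize", "memlock", "rss", "stack", "as"}
NO_LIMIT_WORDS = {"-1", "unlimited", "infinity"}


def to_rlimit_value(item: str, value: str) -> int:
    """Convert a limits.conf Value into the unit the kernel limit uses."""
    if item in ("nice", "priority"):
        raise ValueError("nice and priority are not covered by this sketch")
    if value in NO_LIMIT_WORDS:
        return resource.RLIM_INFINITY
    number = int(value)
    if item in KB_ITEMS:
        return number * 1024  # KB -> bytes
    if item == "cpu":
        return number * 60    # minutes -> seconds
    return number             # plain counts (nproc, nofile, ...) and bytes (msgqueue)


# Hypothetical values, made up for this example.
for item, value in (("stack", "8192"), ("cpu", "10"), ("nproc", "200"), ("nofile", "unlimited")):
    print(f"{item:7} {value:>10} -> {to_rlimit_value(item, value)}")
```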

Conclusion

This theory helps you understand how these things work, but trying them out and seeing what happens in the system for each value you change will take you to another level. Do not play with these Values on Mission Critical Systems; for example, changing priority and rtprio values can make the system unstable due to starvation. So please do this in a contained environment such as a VM or Container, and make sure you absolutely know what you are doing.

Please feel free to point out mistakes if there are any. Thank you for reading. You can connect with me on LinkedIn. Happy to answer your queries.
