Raising shared memory limit of a Kubernetes container

Anuj Arora · Dive into ML/AI · Mar 8, 2021

While using PyTorch's (v1.4.0) DataLoader with multiple workers (num_workers > 0), I encountered the following error:

Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

Thus began my couple-of-hours-long struggle to increase the shared memory size. If you are running a container directly with the docker run command, the issue can be handled by adding the following command-line argument:

--shm-size=desired_memory_size
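For example, assuming a hypothetical image name, the following command starts the container with 2 GB of shared memory instead of Docker's default 64 MB:

docker run --shm-size=2g my-training-image:latest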

However, to run the job on a Kubernetes cluster, the equivalent setting needs to go into the corresponding *.yaml file. An internet search turned up suggestions (link, link, link) to add shm_size tags at various locations, but none of them seemed to help.

Finally, I happened across a solution that had worked for some users: mount an emptyDir volume at /dev/shm and set its medium to Memory.

spec:
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory
  containers:
    - name: main              # container name (placeholder)
      image: image-name       # specify your image name here
      volumeMounts:
        - mountPath: /dev/shm
          name: dshm

here,

  • volumes - declares the volume(s) available to the pod
  • volumeMounts - references the volume declared under volumes and specifies where that volume is mounted inside the container
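For completeness, here is a minimal, self-contained Pod manifest built around the same idea. The pod name, container name, image, and the optional sizeLimit below are illustrative assumptions, not required values:

apiVersion: v1
kind: Pod
metadata:
  name: shm-demo                # illustrative pod name
spec:
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory          # back the volume with RAM (tmpfs)
        sizeLimit: 2Gi          # optional cap on the volume's size
  containers:
    - name: trainer             # illustrative container name
      image: image-name         # specify your image name here
      volumeMounts:
        - mountPath: /dev/shm   # replaces the small default shm mount
          name: dshm

Once the pod is running, the new shared memory size can be checked from inside the container, for example with kubectl exec shm-demo -- df -h /dev/shm.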
