MMap’s in Python

Nishanth G
Apr 3 · 4 min read

yes you read it correct its 2 M’s

So what’s MMap ?

The extra M stands for ‘Memory’. Memory mapping is a process through which machine level constructs are used to map a file directly for use in a program from disk. It maps the entire file in the disk to a range of addresses within the computer program’s address space. The program can access the files on the disk in the same way it accesses data from random access memory.

The mmap function uses the concept of virtual memory to make it appear to the program that a large file has been loaded to main memory.

But in reality the file is only present on the disk. The operating system just maps the address of the file into the program’s address space so that program can access the data on the filesystem directly, instead of using normal I/O functions. Memory-mapping typically improves I/O performance because it does not involve a separate system call for each access and it does not require copying data between buffers — the memory is accessed directly.

Memory-mapped files can be treated as mutable strings or file-like objects, depending on your need. A mapped file supports the expected file API methods, such as close(), flush(), read(), readline(), seek(), tell(), and write(). It also supports the string API, with features such as slicing and methods like find().

Now lets see how to implement mmap function in Python

We can use mmap module for file I/O instead of simple file opeartion.

  1. we first import the mmap module
  2. then define the filepath of the file in the disk
  3. then we create the file_object using open() system call
  4. After getting the file object we create a memory mapping of file into the address space of the program using the mmap function
  5. Then we read the data from mmap object
  6. and print the data.

Description of mmap function

mmap requires a file descriptor for the first argument.

The argument length takes the size of memory in bytes to be mapped and the argument access informs the kernel how the program is going to access the memory.

The argument offset instructs the program to create a memory map of the file after certain bytes specified in the offset.

  • The file descriptor for the first argument is provided by the fileno() method of the file object.
  • The length in the second argument can be specified 0 if we want the system to automatically select a sufficient amount of memory to map the file.
  • The access argument has many options. ACCESS_READ allows the user program to only read from the mapped memory. ACCESS_COPY and ACCESS_WRITE offer write mode access. In ACCESS_WRITE mode the program can change both the mapped memory and file but in ACCESS_COPY mode only the mapped memory is changed.
  • The offset argument is often specified 0 when we want to map the file from the start address.

We saw how to read a file using MMap , now let see how to Write

To write some data to a memory-mapped file, we can use the ACCESS_WRITE option in the access argument and use mmap_object.write() function to write to the file after creating the file object by opening the file in r+ mode.

Here we have to take care of the point that mmap doesn’t allow mapping of empty files. This is due to the reason that no memory mapping is needed for an empty file because it’s just a buffer of memory.

If we will open a file using “w” mode, mmap will cause ValueError.

An important point we have to keep in mind regarding the above example is that the input should be converted into bytes before writing to mmap.

Also, mmap starts writing data from the first address of the file and overwrites the initial data. If we have to save previous data, we can do so by specifying the appropriate offset in mmap function call.

Conclusion

Simple read/write operations make many system calls during execution which causes multiple copying of data in different buffers in the process.

Using mmap provides us a significant improvement in terms of performance because it skips those function calls and buffers operations especially in programs where extensive file I/O is required.

That brings us to the End of this Article

If u have read so far ….

Nerd For Tech

From Confusion to Clarification

Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/. Don’t forget to check out Ask-NFT, a mentorship ecosystem we’ve started

Nishanth G

Written by

Avg Software Engineer , can’t write any post that takes more than 5 mins

Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/. Don’t forget to check out Ask-NFT, a mentorship ecosystem we’ve started

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store