Memory Mapped Files with C#

Today I’m going to talk about memory-mapped files in C#. A memory-mapped file maps the contents of a file into virtual memory. This mapping lets an application, or several processes at once, read and modify the file by reading and writing memory directly. Starting with .NET Framework 4, you can work with memory-mapped files from managed code, using the same mechanism that native Windows applications use. Why is this important? If you need to process large files, memory-mapped files can be a lifesaver. They let you work with a file without loading all of it into memory and running into out-of-memory exceptions, give you quick access to any part of the content, and let multiple processes and threads process that content in parallel. You can read more about memory-mapped files on MSDN.

One thing that is important to note about using a memory-mapped file is that you need to know the starting byte and the length of the content you want to work with. For a large file, it’s best to index the content before doing any processing, unless you are working with an image or another kind of file where you don’t need to know the exact positions of the content you are after. You might be thinking that indexing the file first would make things slower, but indexing is only a single sequential pass over the file and is usually quick relative to the processing that follows. Once the index is built, we can go directly to any piece of content within the large file. Here is a basic example of how to open a file for reading using a memory-mapped file:
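The original listing is not reproduced here, so what follows is a minimal sketch rather than the post's own code. The class and method names (`OpenExample`, `ReadFirstByte`) are hypothetical; `MemoryMappedFile.CreateFromFile` and `CreateViewAccessor` are the real APIs from `System.IO.MemoryMappedFiles`. It assumes the file exists and is not empty, since an empty file cannot be mapped with a capacity of 0.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class OpenExample
{
    // Hypothetical helper: maps an existing file read-only and returns its first byte.
    public static byte ReadFirstByte(string path)
    {
        // mapName: null means the mapping gets no system-wide name.
        // capacity: 0 means "use the size of the file on disk".
        using (var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        // The view must also request read-only access, because the default
        // (read/write) is not allowed on a read-only mapping.
        using (var view = mmf.CreateViewAccessor(0, 0, MemoryMappedFileAccess.Read))
        {
            return view.ReadByte(0);
        }
    }
}
```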

I’ve used a using statement to ensure that Dispose is called even if an exception occurs while we are working with the memory-mapped file. You don’t have to use a using statement, but make sure you dispose of the memory-mapped file object some other way. Otherwise the file handle and the mapping stay open, which can keep the file locked and leak resources.

Now that we have the file open, we need to create a view onto it: either a view stream (MemoryMappedViewStream), which gives you sequential, Stream-style access, or a view accessor (MemoryMappedViewAccessor), which gives you random access. This choice is separate from whether the memory-mapped file is persisted or non-persisted. A persisted memory-mapped file is backed by a file on disk (created with CreateFromFile), while a non-persisted one exists only in memory (created with CreateNew) and is typically used to share data between processes. Because we opened an existing file, this example is persisted, and I’m going to read from it through a view stream. I would recommend that if you are working with data in the gigabyte size range you do not use the non-persisted method, since that memory is backed by the system paging file rather than by your file.

Here is an example method containing the logic needed to actually get content from the open memory-mapped file:
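A method of that shape might look like the sketch below. The names `MappedReader` and `GetContent` are hypothetical, and the code assumes the content is UTF-8 text; swap in the encoding your file actually uses. It also assumes `offset` and `length` fall on character boundaries, otherwise the decoded string will contain replacement characters.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

public static class MappedReader
{
    // Hypothetical helper: returns `length` bytes starting at `offset`, decoded as UTF-8.
    public static string GetContent(MemoryMappedFile mmf, long offset, int length)
    {
        if (offset < 0) throw new ArgumentOutOfRangeException(nameof(offset));
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        // A view size of 0 would mean "map to the end of the file", so handle it here.
        if (length == 0) return string.Empty;

        // Map only the slice we need instead of the whole file.
        using (var stream = mmf.CreateViewStream(offset, length, MemoryMappedFileAccess.Read))
        {
            var buffer = new byte[length];
            int total = 0;
            // Stream.Read may return fewer bytes than requested, so loop until done.
            while (total < length)
            {
                int read = stream.Read(buffer, total, length - total);
                if (read == 0) break;
                total += read;
            }
            return Encoding.UTF8.GetString(buffer, 0, total);
        }
    }
}
```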

The key items to focus on are the parameters to the method. The first is a reference to the memory-mapped file, the second is the starting byte location of the content we want to access within the file, and the last is the length of that content in bytes. If you remember back to when I said we needed to index the content before we could work with the file, this is exactly why: to get content from the file, you need to know where it starts and how many bytes to read from there. Because memory-mapped files give us raw bytes, we also need to decode the byte array into a string, using the file’s encoding, so that we can work with the content as it appears in the file we are processing.
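The indexing step referred to above can be sketched as follows. This is only one possible index, assuming a text file whose records are lines terminated by `\n`; `LineIndexer` and `BuildIndex` are hypothetical names, and the value-tuple syntax needs C# 7 or later. Each entry records where a line starts and how many bytes it spans.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineIndexer
{
    // Hypothetical helper: one sequential pass that records (offset, length) per line.
    // The length excludes the '\n' terminator; a '\r' from "\r\n" endings is kept.
    public static List<(long Offset, int Length)> BuildIndex(string path)
    {
        var index = new List<(long Offset, int Length)>();
        var buffer = new byte[64 * 1024];
        long position = 0;   // absolute offset of the byte being examined
        long lineStart = 0;  // absolute offset where the current line began

        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            int read;
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++, position++)
                {
                    if (buffer[i] == (byte)'\n')
                    {
                        index.Add((lineStart, (int)(position - lineStart)));
                        lineStart = position + 1;
                    }
                }
            }
        }

        // Record a final line that has no trailing newline.
        if (position > lineStart)
        {
            index.Add((lineStart, (int)(position - lineStart)));
        }
        return index;
    }
}
```

Each `(Offset, Length)` pair is exactly what a content-retrieval method needs as its second and third arguments.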

Now that we have the content, we can do whatever we want with it. This is where the real power of memory-mapped files comes into play. For example, you can retrieve and process different parts of the original file in parallel, which can dramatically cut processing time on a multi-core machine. Here is a brief example of some parallel logic:
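One way the parallel logic could look is sketched below, again as an illustration rather than the post's original listing. `ParallelReader` and `ReadAll` are hypothetical names, the segments are assumed to come from an index built beforehand, and the content is assumed to be UTF-8. The documentation does not promise that instance members of `MemoryMappedFile` are thread-safe, so this sketch takes the cautious route and creates each view under a lock; the reads themselves then run in parallel.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;
using System.Threading.Tasks;

public static class ParallelReader
{
    // Hypothetical helper: reads every segment in parallel, preserving segment order.
    public static string[] ReadAll(string path, IList<(long Offset, int Length)> segments)
    {
        var results = new string[segments.Count];
        var viewLock = new object();

        using (var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        {
            Parallel.For(0, segments.Count, i =>
            {
                var (offset, length) = segments[i];
                if (length == 0)
                {
                    // A view size of 0 would map to the end of the file, so skip it.
                    results[i] = string.Empty;
                    return;
                }

                MemoryMappedViewAccessor view;
                lock (viewLock)
                {
                    // Cautious assumption: serialize view creation only.
                    view = mmf.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Read);
                }

                using (view)
                {
                    var bytes = new byte[length];
                    int read = view.ReadArray(0, bytes, 0, length);
                    // Each iteration writes to its own slot, so no further locking is needed.
                    results[i] = Encoding.UTF8.GetString(bytes, 0, read);
                }
            });
        }
        return results;
    }
}
```

Writing into a pre-sized array by index keeps the results in the same order as the index, regardless of which thread finishes first.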

Obviously the code shown above is just a set of examples, but hopefully you get the idea of the power that memory-mapped files can provide when you need to work with large files. If you need to process the contents of documents, images, or any other type of large file, I would highly recommend you look into using memory-mapped files.
