Ch11.8: File Loader (Memory Mapping)

Overview

fast_io provides two file loader types, native_file_loader and allocation_file_loader, that load an entire file into memory and expose it through a container-like interface. Both are move-only, RAII-managed objects that release the underlying memory automatically when they go out of scope.

Both loaders provide the same interface, so you can switch between them depending on platform capabilities without changing the rest of your code.

Common Interface

Both native_file_loader and allocation_file_loader provide a contiguous-container-like interface: data(), size(), begin(), end(), and operator[].

They are also directly printable: you can pass a loader to print() or println() and the file contents will be written to the output.
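Because the interface is container-like, code that consumes a loader can be written once as a template and reused with either type. The sketch below illustrates the idea in standard C++ only; count_newlines and as_view are hypothetical helper names, not part of fast_io, and the sketch assumes that data() yields a pointer to char.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: relies only on begin()/end(), so it accepts any
// container-like type, including std::string and std::vector<char>.
template <typename Container>
std::size_t count_newlines(Container const& c)
{
    return static_cast<std::size_t>(std::count(c.begin(), c.end(), '\n'));
}

// Hypothetical helper: relies only on data()/size() to build a
// non-owning view over the contiguous bytes (assumes data() points to char).
template <typename Container>
std::string_view as_view(Container const& c)
{
    return std::string_view(c.data(), c.size());
}
```

Anything that offers the same member functions can be passed to these helpers, which is what lets the rest of a program stay unchanged when the loader type is swapped.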

native_file_loader

::fast_io::native_file_loader uses the OS memory mapping facilities. On POSIX systems, it calls mmap(); on Windows NT, it uses NtCreateSection and NtMapViewOfSection. The file is not read into a separate buffer; instead, its contents are mapped directly into virtual memory. Pages are brought in on demand by the OS page-fault mechanism, giving zero-copy semantics.


#include <fast_io.h>
#include <fast_io_device.h>
#include <algorithm> // ::std::find
#include <cstddef>   // ::std::size_t

using namespace fast_io::io;

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        perr("Usage: ", ::fast_io::mnp::os_c_str(*argv), " <file>\n");
        return 1;
    }

    // Memory-map the entire file
    ::fast_io::native_file_loader loader(::fast_io::mnp::os_c_str(argv[1]));

    // Print the entire file contents directly
    print(loader);
    println("File size: ", loader.size(), " bytes");

    // Iterate over the bytes
    ::std::size_t newline_count{};
    for (auto ch : loader)
    {
        if (ch == u8'\n')
        {
            ++newline_count;
        }
    }
    println("Newline count: ", newline_count);

    // Random access via operator[]
    if (loader.size() > 0zu)
    {
        println("First byte: ", loader[0]);
    }

    // Use with algorithms via begin()/end()
    auto it = ::std::find(loader.begin(), loader.end(), u8'X');
    if (it != loader.end())
    {
        println("Found 'X' at offset: ", static_cast<::std::size_t>(it - loader.begin()));
    }
}
      

The loader is move-only. You can transfer ownership with ::std::move, but you cannot copy it. When the loader is destroyed, the memory mapping is released automatically.
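To make the mechanism concrete, here is a minimal sketch of what a move-only, RAII-managed mapping can look like on POSIX. It is not fast_io's implementation: mapped_file is a hypothetical illustration type, error handling is reduced to exceptions, and the sketch closes the descriptor right after mapping (which POSIX permits), whereas a real loader may manage the descriptor differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <utility>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

// Hypothetical illustration type; not part of fast_io.
class mapped_file
{
    char const* ptr_{};
    std::size_t len_{};

    void release() noexcept
    {
        if (ptr_) ::munmap(const_cast<char*>(ptr_), len_);
        ptr_ = nullptr;
        len_ = 0;
    }

public:
    explicit mapped_file(char const* path)
    {
        int fd = ::open(path, O_RDONLY);
        if (fd == -1) throw std::runtime_error("open failed");
        struct stat st{};
        if (::fstat(fd, &st) == -1)
        {
            ::close(fd);
            throw std::runtime_error("fstat failed");
        }
        len_ = static_cast<std::size_t>(st.st_size);
        if (len_ != 0) // mmap rejects zero-length mappings
        {
            void* p = ::mmap(nullptr, len_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED)
            {
                ::close(fd);
                throw std::runtime_error("mmap failed");
            }
            ptr_ = static_cast<char const*>(p);
        }
        ::close(fd); // POSIX keeps the mapping valid after close()
    }

    // Copying is disabled; ownership can only be moved.
    mapped_file(mapped_file const&) = delete;
    mapped_file& operator=(mapped_file const&) = delete;

    mapped_file(mapped_file&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)),
          len_(std::exchange(other.len_, 0))
    {
    }

    mapped_file& operator=(mapped_file&& other) noexcept
    {
        if (this != &other)
        {
            release();
            ptr_ = std::exchange(other.ptr_, nullptr);
            len_ = std::exchange(other.len_, 0);
        }
        return *this;
    }

    ~mapped_file() { release(); } // unmap automatically

    char const* data() const noexcept { return ptr_; }
    std::size_t size() const noexcept { return len_; }
    char const* begin() const noexcept { return ptr_; }
    char const* end() const noexcept { return ptr_ + len_; }
    char operator[](std::size_t i) const noexcept { return ptr_[i]; }
};

static_assert(!std::is_copy_constructible_v<mapped_file>);
static_assert(std::is_nothrow_move_constructible_v<mapped_file>);
```

The static_assert lines show how the move-only property can be checked at compile time; the same trait checks can be applied to any type that claims to be move-only.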

allocation_file_loader

::fast_io::allocation_file_loader provides the same interface, but uses malloc() to allocate a buffer and read() to fill it. This is used on platforms that do not support memory mapping, such as WASM (WebAssembly) or environments with newlib. It is also useful when you need the file data to reside in heap memory that you control.


#include <fast_io.h>
#include <fast_io_device.h>

using namespace fast_io::io;

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        perr("Usage: ", ::fast_io::mnp::os_c_str(*argv), " <file>\n");
        return 1;
    }

    // Load file into a heap-allocated buffer
    ::fast_io::allocation_file_loader loader(::fast_io::mnp::os_c_str(argv[1]));

    // Same container-like interface
    println("File size: ", loader.size(), " bytes");
    // Print file contents
    print(loader);

    // Access individual bytes
    for (::std::size_t i{}; i < loader.size(); ++i)
    {
        if (loader[i] == u8'\n')
        {
            println("Newline at offset: ", i);
        }
    }
}
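For comparison with the mapping approach, the following is a rough sketch of the malloc() plus read() strategy described above, written against POSIX directly. It is an illustration under stated assumptions, not fast_io's code: heap_file, free_deleter, and load_into_heap are hypothetical names, and the file size is taken from fstat().

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <stdexcept>

#include <fcntl.h>     // open
#include <sys/stat.h>  // fstat
#include <unistd.h>    // read, close

// Hypothetical deleter so unique_ptr can own malloc()'d memory.
struct free_deleter
{
    void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical illustration type; move-only because unique_ptr is move-only.
struct heap_file
{
    std::unique_ptr<char, free_deleter> buf;
    std::size_t len{};

    char* data() noexcept { return buf.get(); }
    std::size_t size() const noexcept { return len; }
    char* begin() noexcept { return buf.get(); }
    char* end() noexcept { return buf.get() + len; }
};

heap_file load_into_heap(char const* path)
{
    int fd = ::open(path, O_RDONLY);
    if (fd == -1) throw std::runtime_error("open failed");
    struct stat st{};
    if (::fstat(fd, &st) == -1)
    {
        ::close(fd);
        throw std::runtime_error("fstat failed");
    }
    heap_file out;
    out.len = static_cast<std::size_t>(st.st_size);
    // Allocate at least one byte so an empty file still yields a valid pointer.
    out.buf.reset(static_cast<char*>(std::malloc(out.len ? out.len : 1)));
    if (!out.buf)
    {
        ::close(fd);
        throw std::bad_alloc{};
    }
    std::size_t done = 0;
    while (done < out.len)
    {
        ssize_t n = ::read(fd, out.buf.get() + done, out.len - done);
        if (n < 0)
        {
            if (errno == EINTR) continue; // interrupted, retry
            ::close(fd);
            throw std::runtime_error("read failed");
        }
        if (n == 0) break; // file ended earlier than fstat reported
        done += static_cast<std::size_t>(n);
    }
    out.len = done;
    ::close(fd); // the file is closed; the heap buffer lives on
    return out;
}
```

The key difference from a mapping is that the bytes are copied once into memory the program owns, so the buffer is writable and no longer tied to the file after loading.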
      

Choosing Between the Two

Feature               | native_file_loader                            | allocation_file_loader
Underlying mechanism  | mmap() / MapViewOfFile                        | malloc() + read()
Copy into user buffer | No (zero-copy)                                | Yes
Platform support      | POSIX, Windows NT                             | All platforms
Works on WASM/newlib  | No                                            | Yes
Interface             | data(), size(), begin(), end(), operator[]    | data(), size(), begin(), end(), operator[]
Move-only, RAII       | Yes                                           | Yes

If your target platform supports memory mapping, prefer native_file_loader for best performance. If you need maximum portability (e.g. cross-compiling to WASM), use allocation_file_loader.

Important: If you need to load a file, modify its contents in memory, and then write it back to the same file, use allocation_file_loader instead of native_file_loader. The memory-mapped loader keeps the file open until it is destroyed, which prevents you from reopening the file for writing. allocation_file_loader reads the entire file into heap memory and closes it immediately, allowing you to safely reopen the file for output.


#include <fast_io.h>
#include <fast_io_device.h>

int main()
{
    using namespace ::fast_io::iomnp;

    // Load file into heap memory (file is closed after loading)
    ::fast_io::allocation_file_loader loader(u8"data.txt");

    // Modify the contents in memory
    for (auto& ch : loader)
    {
        if (ch == u8'a') ch = u8'A';
    }

    // Now we can safely write back to the same file
    ::fast_io::obuf_file obf(u8"data.txt");
    print(obf, loader);
}
      

If you had used native_file_loader in the example above, the file would still be memory-mapped when obuf_file opens it, so the obuf_file construction may fail, or the write may invalidate the mapped contents while they are still being read.

Move Semantics and RAII

Both loaders are move-only types. Their destructors release the underlying resource (unmap the memory or free the buffer). You can transfer ownership via ::std::move:


#include <fast_io.h>
#include <fast_io_device.h>
#include <utility>

using namespace fast_io::io;

::fast_io::native_file_loader load_file(char const* path)
{
    // Wrap the raw pointer with os_c_str, as done for argv[1] earlier
    ::fast_io::native_file_loader loader(::fast_io::mnp::os_c_str(path));
    println("Loaded file of size: ", loader.size());
    return loader; // NRVO or move
}

int main()
{
    auto loader = load_file("example.txt");

    // loader now owns the mapped file
    println("Size in main: ", loader.size());
    print(loader);

    // Move into another variable
    auto loader2 = ::std::move(loader);
    // loader is now in a moved-from state; loader2 owns the data

    // When loader2 goes out of scope, the mapping is released
}
      

Loading Files Relative to a Directory

You can combine ::fast_io::dir_file with a file loader to load files relative to a directory. Use drt(ent) on a directory entry to get a (fd, name) pair that the loader can open directly.


#include <fast_io.h>
#include <fast_io_device.h>

int main()
{
    using namespace ::fast_io::iomnp;

    ::fast_io::dir_file df(u8"src");

    // Iterate over all files recursively
    for (auto const& ent : recursive(at(df)))
    {
        if (type(ent) != ::fast_io::file_type::regular) continue;

        // Load only .cpp files
        ::std::u8string_view ext{u8extension(ent)};
        if (ext != u8".cpp") continue;

        // drt(ent) gives the (fd, name) pair to open this entry
        ::fast_io::native_file_loader loader(drt(ent));
        ::std::size_t lines{};
        for (auto ch : loader)
        {
            if (ch == u8'\n') ++lines;
        }
        print(::fast_io::mnp::code_cvt(u8filename(ent)), ": ", lines, " lines\n");
    }
}
      

This pattern is efficient because drt(ent) opens the entry relative to the directory's already-open file descriptor (in the style of POSIX openat()), so no full path has to be built and resolved again from the root.
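At the OS level, opening a file relative to a directory descriptor looks roughly like the POSIX sketch below. This is an assumption about the underlying system call pattern, not a description of fast_io internals; open_directory and open_relative are hypothetical helper names.

```cpp
#include <fcntl.h>   // open, openat, O_DIRECTORY, O_CLOEXEC
#include <unistd.h>  // close

// Hypothetical helper: obtain a descriptor for a directory.
// Returns -1 on failure, like open().
int open_directory(char const* path)
{
    return ::open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

// Hypothetical helper: open `name` relative to an already-open directory
// descriptor. The kernel resolves `name` starting from dir_fd, so the
// caller never has to concatenate a full path.
int open_relative(int dir_fd, char const* name)
{
    return ::openat(dir_fd, name, O_RDONLY | O_CLOEXEC);
}
```

A caller would open the directory once, call open_relative for each entry name, and close each returned descriptor (and finally the directory descriptor) with close().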

Key Takeaways