::fast_io::string

1. Creating a string

A ::fast_io::string is a container for text. You can construct one directly:


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;
    ::fast_io::string s("Hello fast_io");
    println(s);
}
      

2. Concatenation

Concatenation in fast_io reuses its I/O printing system. Any type that can be printed can be concatenated into a string.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;
    ::fast_io::string name{"Alice"};
    ::fast_io::string greeting = ::fast_io::concat_fast_io("Hello, ", name, "!");
    ::fast_io::string greeting_ln = ::fast_io::concatln_fast_io("Hello, ", name); // adds newline
    println(greeting);  // Output: Hello, Alice!
    print(greeting_ln); // Output: Hello, Alice\n
}
      

3. Reading input

Use manipulators from fast_io::iomnp to read text into strings. Place using namespace fast_io::iomnp; inside main() so it only applies locally.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string str;

    // read next space-delimited token
    scan(str);
    println("Token: ", str);

    // read one line (without newline)
    scan(line_get(str));
    println("Line: ", str);

    // read everything that remains, up to end of file
    scan(whole_get(str));
    println("Whole input size: ", str.size());
}

These work with files too:


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace fast_io::iomnp;

    ::fast_io::string str;
    ::fast_io::ibuf_file ibf("input.txt");

    scan(ibf, line_get(str));   // read one line
    scan(ibf, whole_get(str));  // read the rest of the file
}
      

Detecting End of File (EOF)

Use scan<true> to detect end of file (EOF). It returns true when a value was read and false once the input is exhausted, so it can serve directly as a loop condition.

Example 1: Reading tokens until EOF


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace fast_io::iomnp;

    ::fast_io::string str;

    // keep reading space-delimited tokens until end of the file
    for (; scan<true>(str); ) {
        println("Token: ", str);
    }
}

Example 2: Reading lines until EOF


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace fast_io::iomnp;

    ::fast_io::string str;

    // keep reading lines until end of the file
    for (; scan<true>(line_get(str)); ) {
        println("Line: ", str);
    }
}

4. Accessing characters safely

A fast_io::string lets you access characters directly:


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    char first = s.front();   // first character
    char last  = s.back();    // last character

    println("First: ", chvw(first)); // H
    println("Last: ", chvw(last));   // o

    // access by index
    char middle = s[2];       // third character (index starts at 0)
    println("Middle: ", chvw(middle)); // l
}
      

⚠️ Safety note: If you call s.front() or s.back() on an empty string, or use s[i] with an index that is out of range, fast_io's boundary check fails and calls __builtin_trap(). The program terminates immediately: no exception is thrown and the library prints no diagnostic.

Crash example


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s;   // empty string

    // Each of these lines would crash on its own (the first one ends the program):
    char c1 = s.front();   // invalid: empty string
    char c2 = s.back();    // invalid: empty string
    char c3 = s[5];        // invalid: index out of range
}
      

Safe usage

To avoid crashes, always check before accessing:


#include <cstddef>
#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    if(!s.is_empty()) {
        println("First: ", chvw(s.front()), "\n"
                "Last: ", chvw(s.back()));
    }

    std::size_t i = 2;
    if(i < s.size()) {
        println("Index ", i, ": ", chvw(s[i]));
    }
}
      

5. Iterating through characters

Preferred: Range‑based for loop

The easiest and safest way to go through every character is a range‑based for loop:


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    for(char c : s) {
        print(chvw(c), ' ');
    }
    // Output: H e l l o
}
      

Alternative: Index loop

If you need positions, use container indexing. This is safer than iterators because bounds are checked:


#include <cstddef>
#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    for(std::size_t i{}, n{s.size()}; i != n; ++i) {
        println("Index ", i, ": ", chvw(s[i]));
    }
}
      

Advanced: Iterators

Iterators act like pointers into the string’s memory. They are powerful but more difficult for novices. Unlike container indexing, iterators do not perform bounds checks. Misuse can cause undefined behavior — meaning the program may crash, corrupt data, or even open the door to security vulnerabilities.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    // Iteration uses a left-inclusive range: [begin(), end())
    // begin() points to the first element ('H')
    // end() points just past the last element (not valid to dereference)

    for(auto it = s.begin(), ed = s.end(); it != ed; ++it) {
        print(chvw(*it), ' ');
    }
    // Output: H e l l o
}
  

Left‑inclusive ranges

Iterators in C++ follow the convention of a left‑inclusive, right‑exclusive range:

  • begin() → points to the first valid element (inclusive).
  • end() → points one past the last element (exclusive, cannot be dereferenced).
  • The loop condition it != ed (where ed holds s.end()) ensures iteration stops before stepping out of bounds.

This design makes ranges easier to compose: the size of a range is simply end() - begin(), and algorithms can safely stop at end() without accessing invalid memory.

Undefined behavior and real exploits

Undefined behavior means the C++ standard imposes no rules on what happens. The program may appear to work, but it can also crash, corrupt memory, or expose vulnerabilities. Famous exploits have abused these exact categories of memory safety bugs:

  • Use‑after‑free (iterator invalidation): exploited in countless browser vulnerabilities.
  • Buffer overflows and over‑reads (out‑of‑bounds access): exploited in the Morris Worm (1988), Heartbleed (2014), and WannaCry ransomware (2017).

Both use‑after‑free and buffer overflows are examples of memory safety bugs. These flaws allowed attackers to execute arbitrary code, steal sensitive data, or spread malware. That’s why safe iteration practices are critical even in simple examples.

According to Google’s Chromium security team, around 70% of serious security bugs in Chrome are memory safety problems, with half of those being use‑after‑free issues. This analysis was based on 912 high or critical severity bugs since 2015. Their detailed breakdown is published on the Chromium Project's Memory Safety page.

Iterator invalidation

Iterators become invalid if the string reallocates (for example, after append or reserve). Using an invalidated iterator is undefined behavior — the same class of bug as a use‑after‑free.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    auto it2 = s.begin();
    s.append(" world"); // may reallocate, invalidates it2

    // ⚠️ Undefined behavior: iterator now points to freed memory
    println(chvw(*it2)); // like a use-after-free
}

Iterator arithmetic

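Iterators are contiguous (random‑access), so you can use arithmetic operations (+/-, [], comparisons). ⚠️ These operations do not check bounds. A negative offset is fine as long as the result stays inside the string; moving before begin() or past end(), or dereferencing end() itself, is undefined behavior.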


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    auto it = s.begin();          // points to 'H'
    println("First: ", chvw(*it));

    // Move forward with +n
    println("Third: ", chvw(*(it + 2))); // 'l'

    // Negative indexing relative to a valid position
    auto it2 = s.begin() + 2;     // points to 'l'
    println("Second via negative index: ", chvw(it2[-1])); // 'e'

    // Random access with operator[]
    println("Fourth via []: ", chvw(it[3])); // 'l'

    // Comparisons
    if(it < s.end()) {
        print("Iterator is before end()\n");
    }

    // ⚠️ Out-of-bounds example: holding end() is valid, dereferencing it is not
    auto bad = s.end();
    // println(chvw(*bad)); // undefined behavior, like a buffer overflow

    // ⚠️ Use-after-free example: iterator invalidation
    auto it3 = s.begin();
    s.append(" world"); // may reallocate, invalidates it3
    // println(chvw(*it3));   // undefined behavior, like use-after-free
}

Printing iterator addresses

You can inspect the memory address an iterator points to using the ::fast_io::mnp::pointervw manipulator (available directly from <fast_io.h>):


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};
    auto it = s.begin();

    println("Character: ", chvw(*it));
    println("Address: ", ::fast_io::mnp::pointervw(it));
}

Advice for beginners

Iterators are powerful but advanced. For most cases, prefer:

  1. Range‑based for loops — safest and simplest way to iterate.
  2. Container indexing (s[i]) — safer than iterators, with boundary checks.
  3. Iterators — powerful but advanced; use only when algorithms require them.

Container indexing in fast_io::string performs boundary checks and calls __builtin_trap() if out of range, stopping the program immediately. This fail‑fast behavior prevents silent memory corruption. Iterators, by contrast, do not check bounds and can lead to the same classes of memory safety bugs that have historically been exploited.

6. Modifications

Strings can be modified in two main ways: by referring to positions (indices) or by using iterators. Both approaches let you insert, erase, or replace parts of the string, but they differ in safety.

Index‑based modification

When you work with positions, the string checks that the index is valid. If you go out of range, the program will stop immediately instead of silently corrupting memory. This fail‑fast behavior makes index‑based modification safer and easier to reason about.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    // Add text at the end
    s.insert_index(5, " world");
    println(s); // Hello world

    // Remove everything from position 5 onward
    s.erase_index(5);
    println(s); // Hello

    // Replace the first 5 characters ("Hello") with "Hi"
    s.replace_index(0, 5, "Hi");
    println(s); // Hi
}

Iterator‑based modification

Iterators act like pointers into the string’s memory. They allow more flexible operations, but they do not perform bounds checks. If you use an invalid iterator or one that has been invalidated after reallocation, the behavior is undefined — the same class of bug as buffer overflows or use‑after‑free.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::string s{"Hello"};

    // Insert using a valid iterator
    auto it = s.begin() + 5;
    s.insert(it, '!');
    println(s); // Hello!

    // Replace using iterators: replace "Hello" with "Hi"
    s.replace(s.begin(), s.begin() + 5, "Hi");
    println(s); // Hi!

    // ⚠️ Danger: iterator invalidation
    auto it2 = s.begin();
    s.append(" world"); // may reallocate, invalidates it2
    // s.insert(it2, '?'); // undefined behavior
}

Summary

Position‑based modification is safer because it enforces bounds checking and fails fast on errors. Iterator‑based modification is more powerful but unchecked, requiring discipline to avoid invalidation and out‑of‑range access. Misusing iterators can lead to undefined behavior and the same memory safety bugs that have caused real‑world security exploits.

7. Other string types

In addition to fast_io::string (based on char), fast_io provides specialized string containers for different character encodings:

  • fast_io::u8string — UTF‑8 encoded text
  • fast_io::u16string — UTF‑16 encoded text
  • fast_io::u32string — UTF‑32 encoded text
  • fast_io::wstring — wide characters (wchar_t, platform‑dependent)

Automatic filename transcoding

fast_io automatically transcodes filenames to whatever encoding the operating system requires. This means you can safely use u8, u16, u32, or L string literals for filenames, and fast_io will handle the conversion internally. You don’t need to worry about platform differences — the library ensures the correct encoding is passed to the OS.

UTF‑8 string example


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::u8obuf_file u8obf(u8"a.txt");   // UTF‑8 filename, transcoded automatically
    ::fast_io::u8string u8str = ::fast_io::u8concat_fast_io(u8"blah blah", 30, u8"fassfaafs");

    println(u8obf, u8"hello: ", u8str);
}
  

UTF‑16 string example


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::u16obuf_file u16obf(u"a16.txt");   // UTF‑16 filename, transcoded automatically
    ::fast_io::u16string u16str = ::fast_io::u16concat_fast_io(u"UTF", 16, u"text");

    println(u16obf, u"Message: ", u16str);
}
  

UTF‑32 string example


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::u32obuf_file u32obf(U"a32.txt");   // UTF‑32 filename, transcoded automatically
    ::fast_io::u32string u32str = ::fast_io::u32concat_fast_io(U"UTF", 32, U"text");

    println(u32obf, U"Message: ", u32str);
}
  

Cross‑character type printing

Use ::fast_io::mnp::code_cvt to print strings of different character types together. This allows you to mix and match string, u8string, u16string, and u32string in the same output without manual conversion.


#include <fast_io.h>
#include <fast_io_dsal/string.h>

int main() {
    using namespace ::fast_io::iomnp;

    ::fast_io::u8obuf_file obf(u8"hello.txt");
    ::fast_io::u8string s(u8"ASCII text");
    ::fast_io::u16string u16s = ::fast_io::u16concat_fast_io(u"UTF", 16, u" text", ::fast_io::mnp::code_cvt(s));

    // code_cvt allows cross-character-type printing
    println(obf, s, u8" | ", code_cvt(u16s));
}
  

A note on wide strings

fast_io::wstring is based on wchar_t, which differs by platform (UTF‑16 on Windows, UTF‑32 on most Unix systems). Because of this inconsistency, beginners should avoid using wchar_t/wstring unless they have a specific platform requirement. Prefer u8string, u16string, or u32string for predictable, portable Unicode handling.

Best practice

  • Use u8string as your default choice: UTF‑8 is the most portable and widely supported encoding.
  • Rely on automatic filename transcoding: you can pass any literal type, and fast_io will adapt it to the OS.
  • Use code_cvt when mixing different string types in output.
  • Avoid wchar_t/wstring unless you are targeting a specific platform API that requires it.

8. Memory safety tools

Modern compilers and runtimes provide powerful tools to detect and prevent memory safety bugs. These can be combined with fast_io to ensure robust and secure code.

  • AddressSanitizer (ASan): enable with -fsanitize=address to catch buffer overflows, use‑after‑free, and other memory errors at runtime.
  • UndefinedBehaviorSanitizer (UBSan): enable with -fsanitize=undefined to detect undefined behaviors such as invalid shifts, signed integer overflows, or misaligned accesses.
  • Fuzzing: combine sanitizers with Clang’s -fsanitize=fuzzer (libFuzzer) to automatically generate coverage‑guided inputs and stress‑test your code paths for hidden bugs.
  • Memory Tagging: use -fsanitize=memtag on supported platforms (ARM MTE, WebAssembly memory tagging, and others) to enforce tagged memory safety. This detects spatial and temporal errors by associating tags with memory allocations. The WebAssembly memory tagging approach was developed by the same author as this fast_io library, underscoring its focus on practical memory safety. See ACM reference for details on cross‑platform memory tagging.

In practice: run your builds with sanitizers during development, integrate fuzzing into CI pipelines, and adopt memory tagging where available. These tools complement fast_io’s fail‑fast philosophy by catching subtle bugs before they reach production.

9. Summary

::fast_io::string is the core text container in fast_io, designed for safe and efficient text handling.

  • Use concat_fast_io / concatln_fast_io for efficient concatenation.
  • Manipulate text safely with push_back, append, and indexing.
  • Read input with no manipulator (token), line_get (line), or whole_get (entire stream).
  • Prefer range‑based loops; use indices for positions, iterators only for advanced cases.
  • Favor u8string (UTF‑8) for portability; avoid wchar_t/wstring unless platform‑specific.
  • Use tooling such as Clang sanitizers (-fsanitize=address,undefined), fuzzing (-fsanitize=fuzzer), and memory tagging (-fsanitize=memtag on ARM MTE, WebAssembly, and other platforms) to detect and prevent memory safety bugs.

In short: rely on safe defaults, use iterators carefully, and lean on UTF‑8 for cross‑platform text.