Ch5.8: C‑style Strings
Overview
A C‑style string is a C‑style array of characters terminated by the special
character '\0' (the null terminator). This
convention originates from the C language and remains part of modern C++ for
compatibility.
In this chapter, you will learn:
- what a C‑style string is
- how null‑terminated character arrays work
- string literals and why they are read‑only
- why C‑style strings decay to pointers
- why char const* should always be used for pointers to C‑style strings
- why fast_io prints string literals but not pointers to them
- how strlen and strnlen work
- how fast_io provides manipulators for printing C‑style strings
- how wide and Unicode C‑style strings work
1. What is a C‑style string?
A C‑style string is simply a C‑style array of char that ends with
'\0'.
char s[6]{'H', 'e', 'l', 'l', 'o', '\0'};
Memory layout:
Address: 2000 2001 2002 2003 2004 2005
Memory: [ H ][ e ][ l ][ l ][ o ][ \0 ]
2. Why C‑style strings must end with '\0'
C‑style strings do not store their length.
Instead, code that processes them scans characters until it finds '\0'.
If the null terminator is missing:
- the program reads past the end of the array
- this is undefined behavior
- the program may crash or read garbage memory
char bad[5]{'H', 'e', 'l', 'l', 'o'}; // ❌ no '\0'
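The scan described above can be sketched in a few lines. This is only an illustration of the convention, not how any particular library implements it; the name scan_length is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: counts the characters before the first '\0'.
// Precondition: s points to a null-terminated array. Without the
// terminator the loop reads out of bounds, which is undefined behavior.
constexpr std::size_t scan_length(char const *s) noexcept
{
    std::size_t n{};
    while (s[n] != '\0')
    {
        ++n;
    }
    return n;
}

// A properly terminated array: the scan stops at index 5.
constexpr char good[6]{'H', 'e', 'l', 'l', 'o', '\0'};
static_assert(scan_length(good) == 5, "five characters before the terminator");
```

Nothing in the function knows the size of the array it is reading; the '\0' is the only stopping condition, which is why a missing terminator is so dangerous.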
3. String literals
A string literal such as "Hello" is stored as a
constant C‑style array of char with a null terminator.
char const *p = "Hello";
Memory layout:
"Hello" → [ H ][ e ][ l ][ l ][ o ][ \0 ]
Important rules:
- string literals have static storage duration
- string literals are read‑only
- writing to a string literal is undefined behavior
// char *p = "Hello"; // ❌ ill-formed since C++11; some compilers still accept it with a warning
// p[0] = 'h'; // ❌ undefined behavior
Always write char const* for pointers to C‑style strings.
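The rules above can be checked at compile time. The following is a minimal sketch using only the standard library; the helper names first_char and lowercase_first_of_copy are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <type_traits>

// "Hello" is an lvalue of type char const[6]: five letters plus '\0'.
static_assert(std::is_same<decltype("Hello"), char const (&)[6]>::value,
              "a string literal is a const array of 6 char");
static_assert(sizeof("Hello") == 6, "sizeof counts the terminator");

// Hypothetical helper: reading through char const* is always fine.
constexpr char first_char(char const *s) noexcept
{
    return s[0];
}

// Hypothetical helper: to modify the text, copy the literal into
// an array you own and change the copy instead.
constexpr char lowercase_first_of_copy() noexcept
{
    char copy[]{"Hello"}; // the array holds its own 6 chars
    copy[0] = 'h';        // modifies the copy, not the literal
    return copy[0];
}
```

If you need a modifiable string, initialize a char array from the literal as lowercase_first_of_copy does; the literal itself stays untouched.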
4. C‑style strings are C‑style arrays
A C‑style string is just a C‑style array of char.
Therefore, all rules from previous chapters apply:
- arrays are contiguous
- arrays decay to pointers
- pointer arithmetic works on characters
- out‑of‑bounds access is undefined behavior
char s[6]{"Hello"};
char const *p = s; // decay
char const *q = p + 1; // points to 'e'
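One consequence of decay is that the array's size is part of its type, but not part of the pointer's type. The sketch below shows the difference; array_size is a hypothetical helper, not something from fast_io or the standard library.

```cpp
#include <cstddef>

// Hypothetical helper: taking the array by reference keeps N in the type.
template <std::size_t N>
constexpr std::size_t array_size(char const (&)[N]) noexcept
{
    return N;
}

constexpr char s[6]{"Hello"};
constexpr char const *p = s;     // decay: p points to s[0]
constexpr char const *q = p + 1; // pointer arithmetic: q points to s[1]

static_assert(array_size(s) == 6, "the array type still knows its size");
static_assert(sizeof(s) == 6, "six chars including the terminator");
static_assert(*p == 'H' && *q == 'e', "p and q point into the array");
static_assert(q - p == 1, "the two pointers are one element apart");
```

Calling array_size(p) would not compile, because a pointer no longer carries the length.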
5. Why fast_io prints string literals but not pointers to them
A string literal is fundamentally a C‑style array. fast_io recognizes C‑style arrays of characters and prints them as strings.
print("hello world\n"); // ✔ works
However, once the array decays to a pointer, fast_io cannot know whether the pointer refers to a valid C‑style string. It refuses to print raw pointers directly.
char const *ptr = "hello world\n";
// print(ptr); // ❌ does NOT work
fast_io must assume:
- a C‑style array of matching char_type is a string literal
- a pointer is not automatically a C‑style string
This is why:
print("hello"); // ✔ works (array)
// print(ptr); // ❌ does not work (pointer)
To print a pointer as a C‑style string, use:
println(os_c_str(ptr));
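To see how a library could tell the two cases apart at all, here is a minimal sketch using standard type traits. It is not fast_io's actual implementation; arg_kind and classify are hypothetical names. The key point is that the argument is taken by reference, so an array does not decay before its type is inspected.

```cpp
#include <type_traits>

enum class arg_kind { array, pointer, other };

// Hypothetical classifier: T is deduced as char[N] for an array argument
// and as char const* for a pointer argument.
template <typename T>
constexpr arg_kind classify(T const &) noexcept
{
    return std::is_array<T>::value   ? arg_kind::array
         : std::is_pointer<T>::value ? arg_kind::pointer
                                     : arg_kind::other;
}

constexpr char const *ptr = "hello world\n";

static_assert(classify("hello world\n") == arg_kind::array,
              "a string literal is still an array here");
static_assert(classify(ptr) == arg_kind::pointer,
              "after decay only a pointer is left");
```

A library built this way can print the array case directly and reject the pointer case at compile time, which matches the behavior described above.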
6. strlen and strnlen
strlen (declared in <cstring>) computes the length of a C‑style string by scanning until
'\0'. The terminator itself is not counted.
char s[6]{"Hello"};
std::size_t n = strlen(s); // returns 5
Important:
strlen assumes the string is properly null‑terminated.
If '\0' is missing, it will read past the end → undefined behavior.
strnlen
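strnlen is similar but stops after a maximum number of characters. It is a POSIX function rather than part of standard C++, so its availability depends on the platform.
std::size_t n = strnlen(s, 6); // examines at most 6 characters; returns 5 here
strnlen is safer because it never scans past the bound you give it, provided the bound does not exceed the size of the array.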
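Because strnlen is not available everywhere, the idea behind it can be sketched portably. The function bounded_length below is a hypothetical stand-in written for this chapter, not a standard or fast_io function.

```cpp
#include <cstddef>

// Hypothetical stand-in for strnlen: examines at most max_len characters.
// The bound is checked before each read, so s[max_len] is never touched.
constexpr std::size_t bounded_length(char const *s, std::size_t max_len) noexcept
{
    std::size_t n{};
    while (n != max_len && s[n] != '\0')
    {
        ++n;
    }
    return n;
}

constexpr char s[6]{"Hello"};
constexpr char bad[5]{'H', 'e', 'l', 'l', 'o'}; // no terminator

static_assert(bounded_length(s, sizeof(s)) == 5, "stops at the terminator");
static_assert(bounded_length(bad, sizeof(bad)) == 5, "stops at the bound");
static_assert(bounded_length(s, 3) == 3, "a smaller bound truncates the count");
```

A result equal to the bound means no terminator was found within the bound, so the caller should not treat the array as a complete C‑style string.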
7. fast_io manipulators for C‑style strings
fast_io provides manipulators for printing pointers and C‑style strings.
pointervw(ptr)
Prints a pointer value as an address.
char const *p = "Hello";
println(pointervw(p)); // prints the address stored in p, not the text it points to
os_c_str(ptr)
Prints a null‑terminated C‑style string.
println(os_c_str(p)); // prints "Hello"
os_c_str(ptr, n)
Prints a C‑style string but stops after at most n characters.
println(os_c_str(p, 3)); // prints "Hel"
8. Wide and Unicode C‑style strings
C++ supports several character types, each with its own C‑style string form:
- char: narrow string
- wchar_t: wide string
- char8_t: UTF‑8 code units
- char16_t: UTF‑16 code units
- char32_t: UTF‑32 code units
Each uses its own null terminator:
- char → '\0'
- wchar_t → L'\0'
- char8_t → u8'\0'
- char16_t → u'\0'
- char32_t → U'\0'
wchar_t ws[]{L"Hello"};
char16_t u16s[]{u"Hello"};
char32_t u32s[]{U"Hello"};
char8_t u8s[]{u8"Hello"}; // char8_t requires C++20
All of them follow the same rules:
- they are C‑style arrays
- they decay to pointers
- they must end with a null terminator
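Since every character type follows the same convention, a single template can scan any of them. This is a minimal sketch; generic_length is a hypothetical name, and a value-initialized Char{} is used as the null terminator for each type.

```cpp
#include <cstddef>

// Hypothetical helper: counts code units before the null terminator,
// for any character type. Char{} is the zero value of that type.
template <typename Char>
constexpr std::size_t generic_length(Char const *s) noexcept
{
    std::size_t n{};
    while (s[n] != Char{})
    {
        ++n;
    }
    return n;
}

static_assert(generic_length("Hello") == 5, "char");
static_assert(generic_length(L"Hello") == 5, "wchar_t");
static_assert(generic_length(u"Hello") == 5, "char16_t");
static_assert(generic_length(U"Hello") == 5, "char32_t");
static_assert(generic_length(u8"Hello") == 5, "char before C++20, char8_t since");

// The terminator is one full code unit of the string's own type.
static_assert(sizeof(U"Hello") == 6 * sizeof(char32_t), "six code units");
```

Note that the result is a count of code units, not of user-visible characters: a single character outside the ASCII range may occupy several code units in UTF‑8 or UTF‑16.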
Key takeaways
- A C‑style string is a C‑style array of characters ending with '\0'.
- String literals are constant arrays and cannot be modified.
- Always write char const* for pointers to C‑style strings.
- C‑style strings decay to pointers in expressions.
- strlen scans until '\0' and is unsafe if the terminator is missing.
- strnlen adds a maximum bound and is safer.
- fast_io prints C‑style arrays but not pointers unless you use os_c_str.
- fast_io provides pointervw, os_c_str, and os_c_str(ptr,n).
- Wide and Unicode C‑style strings follow the same rules but use different character types.
- C‑style strings are error‑prone because they do not store their length.