C-Style Strings
Strings are sequences of characters, e.g. "Hello, world!", traditionally represented by an array of char.
Type of C-style strings
As char[]
C-style strings (a.k.a. null-terminated strings) are arrays of char terminated by the null character (i.e. \0, or ASCII code 0) at the end of the string.
For example, a C-style string stores "Hello" as follows. (Note that because of the \0, the size of the str[] array is 6 instead of 5.)
// The statement below is the same as:
// char str[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
char str[] = "Hello";
As char*
Since a C-style string is just an array of char, you can store a pointer to the first element of that array instead.
Note that array-to-pointer decay happens here, so the length information is lost. It is up to you or a utility function to work out the actual length of the string (e.g. by stepping through successive addresses until you find the null character).
// The `str` pointer points to the first element
// of the array (the address of character 'H').
// It does not know the actual length of the array.
// (Since C++11, a string literal can only be bound to a pointer to const char.)
const char* str = "Hello";
String literals
Quote them using double quotes ("), e.g. "Hello". Note that you don't have to manually add the \0 character (it's added automatically at the end).
If the type is not an array of char, don't forget to add the correct prefix to the literal:
- u: makes it an array of char16_t (UTF-16), e.g. u"Hello"
- U: makes it an array of char32_t (UTF-32), e.g. U"Hello"
- L: makes it an array of wchar_t (wide characters; the encoding is implementation-defined), e.g. L"Hello"
For example, the following demonstrates how you would store a string.
// Stores "A book holds a house of gold."
char str_en[] = "A book holds a house of gold.";
// Stores "书中自有黄金屋"
char16_t str_zh[] = u"\u4e66\u4e2d\u81ea\u6709\u9ec4\u91d1\u5c4b";
Concatenating string literals
Adjacent string literals are concatenated by the compiler into a single string. (Whitespace between the literals is ignored.)
#include <cstdio>
char str[] =
"Hello, "
"You seem tired.";
int main() {
printf("%s\n", str);
//=> Hello, You seem tired.
}
Formatting strings
Narrow strings
The format specifier for arrays of char ("narrow strings") is %s.
char name[] = "Ferdinand";
printf("Hello, %s!\n", name);
Wide strings
"Printing Unicode to the console is surprisingly complicated." — Lospinoso
When dealing with Unicode characters (arrays of char16_t, char32_t, or wchar_t), you have to ensure that the console supports them and that the correct code page is set.
I tried writing a code snippet using wide strings, and sadly it didn't work.
Read more
- Unicode Output to the Windows Console — https://www.codeproject.com/Articles/34068/Unicode-Output-to-the-Windows-Console
References
- C++ Crash Course (Josh Lospinoso) — 2. Types
- 10.6 — C-style strings — https://www.learncpp.com/cpp-tutorial/c-style-strings/
- How to print Unicode character in C++? — https://stackoverflow.com/questions/12015571/how-to-print-unicode-character-in-c