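Abstract: commonly used functions of the string class in the standard library
In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of functions, but these functions are separate from the string data they operate on, which does not fit OOP (object-oriented programming) well. The user also has to manage the underlying storage, and a small slip can lead to out-of-bounds access.
In OJ (online judge) problems, strings almost always appear as the string class. In everyday work, too, string is the usual choice because it is simple, convenient, and fast to use; few people reach for the C library string functions.
(The string class is not part of the STL, see 【C++】-7- STL简介; it belongs to the standard library.) The sections below cover the more commonly used and important functions of the string class. The class has a large interface, so look things up in the documentation when needed: cplusplus.com/reference/string/string/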
1. Constructor
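The constructors are not covered in detail here; the documentation describes them clearly. → https://cplusplus.com/reference/string/string/string/
Note: npos is a static member constant of the string class with an unsigned integer type: static const size_t npos = -1. Converting -1 to an unsigned type yields that type's maximum value (all bits set). With a 32-bit size_t that is 1111 1111 1111 1111 1111 1111 1111 1111 → 4,294,967,295; with a 64-bit size_t it is 18,446,744,073,709,551,615.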
2. Traversing a string: Element Access
operator[] | Get character of string (public member function) |
at | Get character in string (public member function) |
back | Access last character (public member function) |
front | Access first character (public member function) |
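1) operator[]
Accessing the characters of a string object by [index], as with an ordinary array, is the most common and convenient approach. This form of access also allows modification: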
#include <iostream>
#include <string>

int main()
{
    std::string s1("Hello!");
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << s1[i] << " ";   // read access
    }
    std::cout << std::endl;
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << ++s1[i] << " "; // write access: increment each character in place
    }
    std::cout << std::endl;
    return 0;
}
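Note: indexing out of range with operator[] is undefined behavior according to the standard. Some implementations check the index with an assertion (Visual Studio debug builds, for example), in which case the program terminates immediately.
2) at
at: throws an exception (std::out_of_range) when the index is out of range.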
#include <iostream>
#include <string>

int main()
{
    std::string s1("Hello!");
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << s1.at(i) << " "; // bounds-checked access
    }
    std::cout << std::endl;
    return 0;
}
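3. Iterator
Iterators are the more general, mainstream way to traverse a container. Not every container supports operator[]: a linked list, for example, does not store its elements contiguously in memory. For intuition, an iterator can be thought of as a pointer (although the underlying implementation may or may not actually be a pointer).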
begin | Return iterator to beginning (public member function) |
end | Return iterator to end (public member function) |
rbegin | Return reverse iterator to reverse beginning (public member function) |
rend | Return reverse iterator to reverse end (public member function) |
cbegin | Return const_iterator to beginning (public member function) |
cend | Return const_iterator to end (public member function) |
crbegin | Return const_reverse_iterator to reverse beginning (public member function) |
crend | Return const_reverse_iterator to reverse end (public member function) |
#include <iostream>
#include <string>

int main()
{
    std::string s2("Hello World!");
    std::string::iterator it = s2.begin();
    while (it != s2.end())
    {
        std::cout << *it << " ";
        ++it;
    }
    std::cout << std::endl;
    return 0;
}
1) reverse_iterator
2) Range-based for
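#include <iostream>
#include <string>

int main()
{
    std::string s2("Hello World!"); // same string as in the iterator example
    for (auto e : s2)
    {
        std::cout << e << " ";
    }
    std::cout << std::endl;
    return 0;
}
A range-based for loop is essentially an iterator loop: at compile time the compiler rewrites it into a loop over begin() and end().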
3) const_iterator
#include <iostream>
#include <string>

int main()
{
    const std::string s3("RoundBottle");

    // Forward traversal of a const string
    std::string::const_iterator c_it = s3.cbegin();
    while (c_it != s3.cend())
    {
        std::cout << *c_it << " ";
        ++c_it;
    }
    std::cout << std::endl;

    // Reverse traversal of a const string
    std::string::const_reverse_iterator cr_it = s3.crbegin();
    while (cr_it != s3.crend())
    {
        std::cout << *cr_it << " ";
        ++cr_it;
    }
    std::cout << std::endl;
    return 0;
}
As the code above shows, a const object is traversed with std::string::const_iterator and std::string::const_reverse_iterator.
P.S. auto can deduce the iterator type for you: auto cr_it = s3.crbegin();
4. Capacity
size | Return length of string (public member function) |
length | Return length of string (public member function) |
max_size | Return maximum size of string (public member function) |
resize | Resize string (public member function) |
capacity | Return size of allocated storage (public member function) |
reserve | Request a change in capacity (public member function) |
clear | Clear string (public member function) |
empty | Test if string is empty (public member function) |
shrink_to_fit | Shrink to fit (public member function) |
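- clear(): erases the contents (size() becomes 0) but generally does not release the allocated storage
- size() and length(): a string object terminates its character data with '\0' for compatibility with C. Neither size() nor length() counts that terminating '\0'. length came first; size was added later for consistency with the other containers, since "size" is the more general term. The two do the same thing: return the length of the string object (excluding the trailing '\0').
- reserve: reserves space in advance. Frequent reallocation has a cost, so reserving up front can improve efficiency (reserve generally does not shrink the capacity). Implementations differ across platforms: under the VS compiler, for example, alignment rules make the space actually allocated a bit larger than what reserve requested, while g++ generally allocates the requested amount.
- resize: changes size(). A smaller value truncates the string; a larger value pads it with the given character ('\0' by default). (P.S. if the requested size exceeds the current capacity, resize also increases capacity.)
On growth rules across platforms: VS generally grows the capacity by about 1.5x and g++ generally by 2x. These are implementation details rather than guarantees of the standard.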
5. Modifiers
operator+= | Append to string (public member function) |
append | Append to string (public member function) |
push_back | Append character to string (public member function) |
assign | Assign content to string (public member function) |
insert | Insert into string (public member function) |
erase | Erase characters from string (public member function) |
replace | Replace portion of string (public member function) |
swap | Swap string values (public member function) |
pop_back | Delete last character (public member function) |
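- operator+=: commonly used and recommended; appends a character or a string
- push_back: appends a single character
- append: appends a string, part of a string, or a repeated character
- insert, erase, replace: avoid them where possible, because they shift the existing data and hurt efficiency
Usage example: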
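#include <iostream>
#include <string>

int main()
{
    std::string s2("Hello,Round Bottle");
    s2 += 'x';               // append a character
    s2 += "llllll";          // append a C string
    s2 += "321";
    s2 += '!';
    std::cout << s2 << std::endl;

    s2.push_back('7');       // append a single character
    std::cout << s2 << std::endl;

    s2.append("aaaaaa");     // append a whole C string
    s2.append(3, '0');       // append three copies of '0'
    s2.append("alison", 2);  // append only the first 2 characters: "al"
    std::cout << s2 << std::endl;
    return 0;
}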
Non-member function overloads
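- operator+: a non-member function (use it sparingly; it is costly because it returns a new string by value)
#include <iostream>
#include <string>

int main()
{
    std::string s2("Hello,Round Bottle");
    std::string s3 = s2 + "777"; // builds a new string; s2 is unchanged
    std::cout << s3 << std::endl;
    return 0;
}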
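About the swap functions:
① The std library provides a swap function template. (Implemented with copies, it costs three deep copies, one copy construction plus two assignments, which is inefficient; since C++11 the generic template moves instead of copying.)
② To avoid that cost, the std library also provides a ready-made overload for string objects (the swap listed under Non-member function overloads).
③ The string class has its own swap member function (the swap listed in the Modifiers table).
Usage example of the swap member function: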
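#include <iostream>
#include <string>

int main()
{
    std::string s1("nothing");
    std::string s2("Hello,Round Bottle");
    std::cout << "s1:" << s1 << std::endl;
    std::cout << "s2:" << s2 << std::endl;

    s1.swap(s2); // member swap: exchanges the contents of s1 and s2

    std::cout << "--------------swap---------------" << std::endl;
    std::cout << "s1:" << s1 << std::endl;
    std::cout << "s2:" << s2 << std::endl;
    return 0;
}
In summary, to swap string objects, prefer the string class's own swap member function, which avoids the deep copies of a copy-based generic swap.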
6. String operations
c_str | Get C string equivalent (public member function) |
data | Get string data (public member function) |
get_allocator | Get allocator (public member function) |
copy | Copy sequence of characters from string (public member function) |
find | Find content in string (public member function) |
rfind | Find last occurrence of content in string (public member function) |
find_first_of | Find character in string (public member function) |
find_last_of | Find character in string from the end (public member function) |
find_first_not_of | Find absence of character in string (public member function) |
find_last_not_of | Find non-matching character in string from the end (public member function) |
substr | Generate substring (public member function) |
compare | Compare strings (public member function) |
c_str: provides compatibility with C interfaces. Usage example:
#include <cstdio>
#include <string>

int main()
{
    std::string s1("nothing");
    printf("%s", s1.c_str()); // c_str() yields a '\0'-terminated const char*
    return 0;
}
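7. Why the string class is designed this way: encodings
Encoding: a one-to-one mapping between values and symbols → code table (e.g. ASCII)
Unicode: the universal character set ⇨ UTF
- UTF-8 ⇢ compatible with ASCII (Linux → UTF-8) (Windows, for users in China → GBK, which is also ASCII-compatible but is a separate encoding rather than a form of UTF)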
- UTF-16
- UTF-32
<string>
string | String class (class) ⇨ UTF-8 |
u16string | String of 16-bit characters (class) |
u32string | String of 32-bit characters (class) |
wstring | Wide string (class) |
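👆 These classes accommodate different encodings so that the world's languages can be represented properly.
- wchar_t: wide character ⇨ wstring (2 bytes on Windows; 4 bytes with g++ on Linux)
- char16_t: 16 bit → 2 byte (UTF-16)
- char32_t: 32 bit → 4 byte (UTF-32)
Garbled text (mojibake): the same numeric values, looked up in a different code table, yield different symbols; the way the data was stored does not match the way it is interpreted.
GBK: see "GBK字库" on Baidu Baike (baidu.com)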
END