Miscellaneous Topics - Be Happy Every Day

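Overview#

This lecture, Miscellaneous Topics, covers several parts of C++ that are easy to get wrong in practice:

Named casts: use static_cast, dynamic_cast, reinterpret_cast, and const_cast instead of C-style casts so that the meaning of each conversion is explicit;
Multiple inheritance: it raises questions of data layout, duplicated base classes, ambiguity, and construction order;
Virtual base classes: they remove the duplicated base in diamond inheritance, at a cost in run time and space;
Protocol / Interface classes: the comparatively safe use case for multiple inheritance;
Namespaces: they organize names and avoid clashes in the global scope.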

Named casts#

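The problem with C-style casts#

A C-style cast is written as:

(type) expression

For example:

a = (int)d;

It has two main problems:

Unclear meaning: the same (T)x may be an ordinary numeric conversion, a removal of const, a conversion between base and derived classes, or even a forced reinterpretation of memory as another type;
Hard to search for and review: in a large project it is difficult to tell quickly which conversions are the dangerous ones.

C++ recommends the named casts instead:

static_cast<T>(expr)
dynamic_cast<T>(expr)
reinterpret_cast<T>(expr)
const_cast<T>(expr)

These four casts separate the different meanings, so the programmer's intent is visible in the code.

TIP
If a conversion is unavoidable, prefer a named cast. It does not necessarily make the conversion safe, but it makes the risk explicit.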

`static_cast`#

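static_cast is for relatively ordinary conversions that the compiler can check.

Typical uses:

conversions between numeric types;
upcasts and downcasts of pointers or references within an inheritance hierarchy;
conversions between void* and pointers to concrete types;
explicitly invoking a user-defined conversion.

For example:

double d = 7.1;
int a;

a = d;                    // implicit conversion
a = (int)d;               // C-style explicit conversion
a = static_cast<int>(d);  // exact meaning

static_cast<int>(d) converts the double value to an int value. The fractional part is truncated toward zero, so 7.1 becomes 7.

Properties of static_cast:

the conversion has conventional semantics;
the compiler checks that a reasonable relationship exists between the source and target types;
no run-time type check is performed.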
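Two of the uses listed above can be sketched as small functions. This is a minimal illustration, not code from the lecture: the names truncate_to_int, add_one, and call_callback_twice are made up for the example, and the void* callback shape is an assumption modeled on typical C-style APIs.

```cpp
// Truncates toward zero, exactly like static_cast<int>(d) in the text.
// Note: the behavior is undefined if the truncated value does not fit in int.
int truncate_to_int(double d) {
    return static_cast<int>(d);
}

// Hypothetical C-style callback: the context arrives as void*.
void add_one(void* context) {
    // static_cast restores the original pointer type; this is only valid
    // because the caller really passed the address of an int.
    int* counter = static_cast<int*>(context);
    ++*counter;
}

int call_callback_twice() {
    int counter = 0;
    add_one(&counter);   // int* converts to void* implicitly
    add_one(&counter);
    return counter;
}
```

Negative values also truncate toward zero, so truncate_to_int(-7.9) yields -7 rather than -8.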

`reinterpret_cast`#

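reinterpret_cast is for low-level reinterpretation.

For example:

int a = 7;
double* p;

p = (double*)&a;                       // ok, but a is not a double
p = static_cast<double*>(&a);          // error
p = reinterpret_cast<double*>(&a);     // ok: I really mean it

Here a is still an int object. reinterpret_cast<double*>(&a) merely forces the int* address to be treated as a double*.

WARNING
reinterpret_cast usually changes only how an address is viewed; it does not change the object itself. Reading or writing int storage through a double* is undefined behavior. The cast is meant for very low-level work such as inspecting an object's byte representation, talking to hardware or system interfaces, and serialization.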
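When the goal is only to look at the bits of a value as another type, copying the bytes avoids the undefined behavior described above. The sketch below uses std::memcpy for that purpose; the function name float_bits is an assumption for this example, and the code assumes a 32-bit IEEE 754 float, which the static_asserts check.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Returns the bit pattern of f as an unsigned integer.
// std::memcpy copies the bytes, so no object is accessed through a
// pointer of the wrong type.
std::uint32_t float_bits(float f) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under these assumptions float_bits(1.0f) is 0x3F800000, the IEEE 754 encoding of 1.0.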

`const_cast`#

const_cast adds or removes const / volatile qualifiers.

For example:

int i = 7;
const int c = i;
int* q;

q = &c;                         // error
q = (int*)&c;                   // ok, but is *q = 2 really allowed?
q = static_cast<int*>(&c);      // error
q = const_cast<int*>(&c);       // I really mean it

const_cast<int*>(&c) states explicitly that the programmer wants to drop the const qualifier.

Two cases must be distinguished, however:

int x = 7;
const int* p = &x;
int* q = const_cast<int*>(p);
*q = 8;     // OK: the original object x is not const

const int x = 7;
const int* p = &x;
int* q = const_cast<int*>(p);
*q = 8;     // Undefined behavior: the original object is const

const_cast changes only the qualifiers on the type of an expression; it cannot change whether the object itself may be modified.
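One common case where removing const is known to be safe is sharing code between the const and non-const overloads of an accessor. The following is a sketch under that assumption; IntBuffer and its members are hypothetical names introduced only for this example.

```cpp
#include <cstddef>

class IntBuffer {
public:
    // The const overload holds the real logic.
    const int& at(std::size_t index) const {
        return data_[index % kSize];   // wrap around instead of overrunning
    }

    // The non-const overload reuses it. Removing const from the result is
    // safe here because *this is known to be non-const in this function.
    int& at(std::size_t index) {
        const IntBuffer& self = *this;
        return const_cast<int&>(self.at(index));
    }

private:
    static constexpr std::size_t kSize = 4;
    int data_[kSize] = {0, 0, 0, 0};
};
```

This matches the first of the two cases in the text: the object reached through the const reference was never declared const, so writing through the result is well defined.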

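The difference between `static_cast` and `dynamic_cast`#

Consider the hierarchy:

struct A {
    virtual void f() {}
};

struct B : public A {};
struct C : public A {};

Given:

A* pa = new B;

the static type of pa is A*, but the dynamic type of the object it points to is B.

`static_cast`: checks the inheritance relationship at compile time, not the real object type#

C* pc = static_cast<C*>(pa);  // compiles, but *pa is B

This compiles because C is indeed derived from A, so a downcast A* -> C* exists at the type level.

At run time, however, pa points to a B, not a C. The resulting pc does not point to a real C object; the cast itself already has undefined behavior, and dereferencing pc can fail in arbitrary ways.

`dynamic_cast`: checks the real object type at run time#

C* pc = dynamic_cast<C*>(pa);  // returns nullptr

dynamic_cast checks whether the object pa points to really is a C or a class derived from C.

If the conversion succeeds, it returns a valid pointer;
If a pointer conversion fails, it returns nullptr;
If a reference conversion fails, it throws std::bad_cast;
For a downcast, the source type must be a polymorphic type, that is, a class with at least one virtual function (declared or inherited).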
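The pointer and reference behaviors listed above can be compared side by side. This is a minimal sketch; the helper names points_to_c and refers_to_c are assumptions for the example, and A is made polymorphic through a virtual destructor rather than the f() used in the text.

```cpp
#include <typeinfo>   // std::bad_cast

struct A {
    virtual ~A() = default;   // makes A polymorphic
};
struct B : A {};
struct C : A {};

// Pointer form: failure is reported as nullptr.
bool points_to_c(A* pa) {
    return dynamic_cast<C*>(pa) != nullptr;
}

// Reference form: there is no null reference, so failure throws.
bool refers_to_c(A& ra) {
    try {
        (void)dynamic_cast<C&>(ra);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

The pointer form is usually the better fit when a failed cast is an expected outcome, since it avoids using exceptions for ordinary control flow.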

If A has no virtual function:

struct A {
    // virtual void f() {}
};

then:

C* pc = dynamic_cast<C*>(pa);  // Error: A is not polymorphic

The reason is that dynamic_cast needs run-time type information (RTTI), and implementations typically reach RTTI through the virtual function mechanism of a polymorphic type.

Demo 1: `static_cast` and the binary representation of objects#

Convert a double to an int, then use reinterpret_cast to inspect the bytes of each object in memory.

#include <cstddef>   // std::byte, std::size_t
#include <iomanip>
#include <iostream>
using namespace std;

int main()
{
    double d = 7.0;
    int i = static_cast<int>(d);

    cout << "double value: " << d << "\n";
    cout << "int value: " << i << "\n";

    cout << "Binary data of double (hex): ";
    byte* double_bytes = reinterpret_cast<byte*>(&d);
    for (size_t j = 0; j < sizeof(d); ++j) {   // size_t avoids a signed/unsigned comparison
        cout << hex << setw(2) << setfill('0')
             << static_cast<int>(double_bytes[j]) << " ";
    }
    cout << "\n";

    cout << "Binary data of int (hex): ";
    byte* int_bytes = reinterpret_cast<byte*>(&i);
    for (size_t j = 0; j < sizeof(i); ++j) {
        cout << hex << setw(2) << setfill('0')
             << static_cast<int>(int_bytes[j]) << " ";
    }
    cout << "\n";
}

Compile with:

g++ main.cpp -std=c++20

std::byte was introduced in C++17. If the compiler defaults to an older standard and none is specified, an error like the following may appear:

error: unknown type name 'byte'

A sample run prints something like:

double value: 7
int value: 7
Binary data of double (hex): 00 00 00 00 00 00 1c 40
Binary data of int (hex): 07 00 00 00

What this shows:

static_cast<int>(d) produces a new int value, 7;
the object representation of d is still the IEEE 754 double byte pattern;
the object representation of i is the byte pattern of the integer 7;
the bytes are printed from low address to high address, so on a little-endian machine the int 7 appears as 07 00 00 00.

TIP
static_cast<int>(d) is a value conversion; reinterpret_cast<byte*>(&d) views the object's storage as a sequence of bytes. The two have completely different meanings.
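The two printing loops in the demo can be folded into one reusable helper. The sketch below is an assumption-laden variation, not the lecture's code: the name hex_bytes is made up, and it views the object through unsigned char* instead of std::byte so that it also compiles before C++17.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Returns the object representation of value as space-separated hex
// bytes, from the lowest address to the highest.
template <typename T>
std::string hex_bytes(const T& value) {
    // Viewing any object through unsigned char* is explicitly allowed.
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (i != 0) out << ' ';
        out << std::setw(2) << static_cast<unsigned>(bytes[i]);
    }
    return out.str();
}
```

For multi-byte types such as int or double the result depends on the machine's byte order, exactly as in the demo output above.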

Demo 2: the run-time check of `dynamic_cast`#

A version without virtual functions:

#include <iostream>
using namespace std;

class A {};
class B : public A {};
class C : public A {};

int main()
{
    A* pa = new B;
    C* pc = pa;  // error
}

This is rejected because an A* cannot be assigned directly to a C*:

error: cannot initialize a variable of type 'C *' with an lvalue of type 'A *'

With static_cast it compiles:

#include <iostream>
using namespace std;

class A {};
class B : public A {};
class C : public A {};

int main()
{
    A* pa = new B;
    C* pc = static_cast<C*>(pa);

    cout << pc << endl;
}

The program prints an address, but that address does not mean there is a C object there. pa actually points to a B.

Next, switch to dynamic_cast:

#include <iostream>
using namespace std;

class A {};
class B : public A {};
class C : public A {};

int main()
{
    A* pa = new B;
    C* pc = dynamic_cast<C*>(pa);

    cout << pc << endl;
}

Now the compiler reports:

error: 'A' is not polymorphic

The reason is that A has no virtual function and is therefore not a polymorphic type. Once a virtual function is added, dynamic_cast can use run-time type information:

#include <iostream>
using namespace std;

class A {
public:
    virtual void foo() {}
};

class B : public A {};
class C : public A {};

int main()
{
    A* pa = new B;

    // C* pc = static_cast<C*>(pa);  // compiles, but unsafe
    C* pc = dynamic_cast<C*>(pa);
    cout << pc << endl;              // prints 0 / nullptr

    B* pb = dynamic_cast<B*>(pa);
    cout << pb << endl;              // prints a valid address
}

The output looks like:

0
0x1053099e0

Meaning:

pa actually points to a B;
the conversion to C* fails, so dynamic_cast<C*>(pa) returns nullptr;
the conversion to B* succeeds, so dynamic_cast<B*>(pa) returns a valid address.

WARNING
dynamic_cast can only check polymorphic types. A class with at least one virtual function is a polymorphic type. In real projects, if derived objects are deleted through a base class pointer, the base class should also provide virtual ~A() = default;.
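The remark about the virtual destructor can be made concrete with a small counter. This is a sketch with made-up names (Base, Derived, derived_destructor_calls, destroy_through_base); it only illustrates that deleting through a base pointer reaches the derived destructor when the base destructor is virtual.

```cpp
#include <memory>

int derived_destructor_calls = 0;   // counter used only by this sketch

struct Base {
    virtual ~Base() = default;      // virtual: delete via Base* is well defined
};

struct Derived : Base {
    ~Derived() override { ++derived_destructor_calls; }
};

// Creates a Derived, owns it through a Base pointer, and destroys it.
// Returns how many times ~Derived ran during the call.
int destroy_through_base() {
    const int before = derived_destructor_calls;
    {
        std::unique_ptr<Base> p(new Derived);
    }   // p goes out of scope: delete on a Base*, dispatched to ~Derived
    return derived_destructor_calls - before;
}
```

Without the virtual keyword on ~Base, the same delete would have undefined behavior, so the sketch deliberately does not show that variant running.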

Demo 3: the danger of `reinterpret_cast`#

Finally, the instructor demonstrated a conversion between types that have no inheritance relationship at all.

#include <iostream>
using namespace std;

class A {};
class B : public A {};
class C : public A {};
class D {};

int main()
{
    A* pa = new B;

    D* pd = static_cast<D*>(pa);  // error
    cout << pd << endl;
}

D and A are unrelated by inheritance, so the compiler rejects the static_cast:

error: static_cast from 'A *' to 'D *', which are not related by inheritance, is not allowed

If reinterpret_cast is forced instead:

#include <iostream>
using namespace std;

class A {};
class B : public A {};
class C : public A {};
class D {};

int main()
{
    A* pa = new B;

    // D* pd = static_cast<D*>(pa);  // error: A and D are unrelated
    D* pd = reinterpret_cast<D*>(pa);

    cout << pd << endl;
}

This code compiles and prints an address, but it merely treats the same address as a D*.

It does not mean that a D object lives at that address. Accessing the object through pd is generally an error.

Multiple inheritance#

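Basic form: mix and match#

Multiple inheritance means that a class inherits from more than one base class at the same time.

Example:

// String, EmpID, Degrees, and Company are placeholder types.
class Employee {
protected:
    String name;
    EmpID id;
};

class MTS : public Employee {
protected:
    Degrees degree_info;
};

class Temporary {
protected:
    Company employer;
};

class Consultant :
    public MTS,
    public Temporary {
    /* ... */
};

Consultant inherits from both MTS and Temporary, so it receives the members of both sides:

name and id from Employee, through the MTS path;
degree_info from MTS;
employer from Temporary.

The intuition is "combining several identities": a consultant can be both a member of technical staff and a temporary employee.

Data layout under multiple inheritance#

With multiple inheritance, an object contains several base class subobjects at once.

The Consultant above contains roughly:

Consultant object
├── MTS subobject
│   └── Employee subobject
│       ├── name
│       └── id
└── Temporary subobject
    └── employer

Multiple inheritance is therefore not just "copying a few function interfaces"; it has a real effect on the memory layout of the object.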
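One visible consequence of this layout is that pointers to different base subobjects of the same object hold different addresses. The sketch below uses hypothetical types Left, Right, and Both rather than the Employee hierarchy, whose member types are not defined in the notes.

```cpp
struct Left  { int left_value = 1; };
struct Right { int right_value = 2; };
struct Both : Left, Right {};

// Both base subobjects hold data, so they occupy different storage and
// the two converted pointers cannot be equal.
bool bases_live_at_different_addresses() {
    Both object;
    void* left  = static_cast<Left*>(&object);
    void* right = static_cast<Right*>(&object);
    return left != right;
}

// static_cast adjusts the pointer on the way up and undoes the
// adjustment on the way back down.
bool downcast_recovers_object() {
    Both object;
    Right* right = &object;                    // implicit upcast
    Both* back = static_cast<Both*>(right);    // checked only at compile time
    return back == &object;
}
```

This pointer adjustment is one reason a reinterpret_cast between base and derived pointers is not a substitute for static_cast: it would keep the address unchanged.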

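Take the iostream package as an example. The standard I/O library contains a structure like this:

basic_istream      basic_ostream
       \              /
        \            /
         basic_iostream

basic_iostream can both read and write, which makes it a real-world use of multiple inheritance.

Vanilla MI: duplicated base classes#

By default, a base class that is reached through more than one inheritance path is duplicated.

This is called Vanilla MI. Its characteristics:

members are duplicated;
the derived class has access to full copies of each base class;
sometimes this is useful.

For example:

a linked-list node may need several sets of links so that it can sit on several lists at once;
input and output streams may each need their own stream-buffer-related structures.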

Ambiguity caused by duplicated base classes#

Consider diamond-shaped inheritance:

struct B1 {
    int m_i;
};

struct D1 : public B1 {};
struct D2 : public B1 {};
struct M : public D1, public D2 {};

M now contains two B1 subobjects:

M
├── D1
│   └── B1
│       └── m_i
└── D2
    └── B1
        └── m_i

The code:

int main()
{
    M m;        // OK
    B1* p = &m; // ERROR: which B1???

    B1* p1 = static_cast<D1*>(&m); // OK
    B1* p2 = static_cast<D2*>(&m); // OK
}

B1* p = &m; is rejected because the compiler cannot tell whether the B1 on the D1 path or the one on the D2 path is meant.

Likewise:

M m;
m.m_i++;  // ERROR: D1::B1::m_i or D2::B1::m_i?

There are two copies of m_i as well, so the access is ambiguous.

TIP
Duplicated base classes are not necessarily a problem. If D1 and D2 each use their own B1 purely as an implementation detail and nothing outside accesses it directly, nothing need go wrong. The trouble starts when the duplicated data creates a logical ambiguity.
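When both copies really are wanted, the ambiguity can be resolved by qualifying the member with the intermediate class. A minimal sketch, reusing the class names from the text; the function name sum_of_both_copies and the default member initializer are additions for this example.

```cpp
struct B1 { int m_i = 0; };
struct D1 : B1 {};
struct D2 : B1 {};
struct M : D1, D2 {};

// Writes to the two copies of m_i through qualified names and returns
// their sum, showing that they are separate objects.
int sum_of_both_copies() {
    M m;
    m.D1::m_i = 1;    // the m_i inside the D1 subobject
    m.D2::m_i = 20;   // the m_i inside the D2 subobject
    return m.D1::m_i + m.D2::m_i;
}
```

The two writes do not overwrite each other, which is exactly the "members are duplicated" property of Vanilla MI.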

Protocol / Interface classes#

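The comparatively safe use of multiple inheritance is to inherit from several protocol / interface classes.

A protocol / interface class is usually an abstract base class that satisfies the following:

all non-static member functions are pure virtual, except the destructor;
the destructor is virtual and usually has an empty body;
there are no non-static member variables, whether declared in the class or inherited;
static members are allowed.

Example: the Unix character device interface.

class CDevice {
public:
    virtual ~CDevice() = default;

    virtual int read(...) = 0;
    virtual int write(...) = 0;
    virtual int open(...) = 0;
    virtual int close(...) = 0;
    virtual int ioctl(...) = 0;
};

An interface class like this mainly constrains behavior and carries no object state. Even when a class inherits from several interfaces, duplicated data and the ambiguity it causes are unlikely.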
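A class that implements two such interfaces might look like the sketch below. Readable, Writable, MemoryDevice, and the two free functions are hypothetical names chosen for this illustration; they are not part of the CDevice example.

```cpp
#include <string>

// Two hypothetical protocol classes: no data members, only pure virtual
// functions plus a virtual destructor.
class Readable {
public:
    virtual ~Readable() = default;
    virtual std::string read() = 0;
};

class Writable {
public:
    virtual ~Writable() = default;
    virtual void write(const std::string& text) = 0;
};

// The only state lives in the concrete class, so nothing is duplicated.
class MemoryDevice : public Readable, public Writable {
public:
    std::string read() override { return contents_; }
    void write(const std::string& text) override { contents_ += text; }

private:
    std::string contents_;
};

// Code written against one protocol does not need to know the other.
void write_greeting(Writable& out) { out.write("hello"); }
std::string read_all(Readable& in) { return in.read(); }
```

Because neither interface holds data, the questions about duplicated members from the previous section do not arise here.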

Virtual base classes#

To keep only one copy of the common base in diamond inheritance, use a virtual base class.

struct B1 {
    int m_i;
};

struct D1 : virtual public B1 {};
struct D2 : virtual public B1 {};
struct M : public D1, public D2 {};

int main()
{
    M m;       // OK
    m.m_i++;  // OK, there is only one B1 in m

    B1* p = new M; // OK
}

The object structure becomes:

M
├── D1
├── D2
└── shared virtual B1
    └── m_i

D1 and D2 share the same B1, so m.m_i is no longer ambiguous.
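The sharing can be checked by writing through one path and reading through the other. A small sketch using the class names from the text; the function name write_via_d1_read_via_d2 and the default member initializer are additions for this example.

```cpp
struct B1 { int m_i = 0; };
struct D1 : virtual B1 {};
struct D2 : virtual B1 {};
struct M : D1, D2 {};

// Writes m_i through the D1 path and reads it back through the D2 path.
// With a virtual base both paths reach the same B1 subobject.
int write_via_d1_read_via_d2(int value) {
    M m;
    D1& as_d1 = m;
    D2& as_d2 = m;
    as_d1.m_i = value;
    return as_d2.m_i;
}
```

With non-virtual inheritance the same function would return 0, because the D2 path would read its own untouched copy.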

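In C++, virtual usually implies indirection:

virtual member functions use dynamic binding, which goes through a pointer;
virtual base classes likewise represent the shared base through an indirection.

A virtual base class therefore brings:

run-time overhead;
space overhead;
a more complex object layout and construction order.

TIP
If a duplicated base is not a problem, there is no need to declare it virtual. For an abstract base class without data, an extra copy is usually harmless and avoids the complexity of virtual bases.

The complexity of multiple inheritance#

Multiple inheritance introduces many additional issues:

name conflicts;
the dominance rule;
order of construction;
who constructs a virtual base;
when a virtual base turns out to be needed but an intermediate class did not declare it virtual, a further-derived class cannot fix that after the fact;
code related to a virtual base may be affected by several inheritance paths;
the compiler implementation and the object model become more complex.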
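The question of who constructs a virtual base has a definite answer in the language: the most derived class does, and the initializers written in intermediate classes are ignored in that case. The sketch below illustrates this with the class names from the earlier examples; the constructor arguments 1, 2, and 3 and the helper functions are arbitrary choices for the demonstration.

```cpp
struct B1 {
    explicit B1(int value) : m_i(value) {}
    int m_i;
};

struct D1 : virtual B1 {
    D1() : B1(1) {}          // used only when D1 is the most derived class
};

struct D2 : virtual B1 {
    D2() : B1(2) {}          // used only when D2 is the most derived class
};

struct M : D1, D2 {
    M() : B1(3), D1(), D2() {}   // M is most derived, so M initializes B1
};

int value_seen_by_d1_alone() { return D1().m_i; }
int value_seen_by_m()        { return M().m_i; }
```

If M omitted B1(3), the code would not compile here, because B1 has no default constructor for M to fall back on. This is one way the virtual base leaks into every further-derived class.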

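Conclusion:

Use sparingly. Avoid diamond patterns. In general, SAY NO.

In other words, multiple inheritance is not forbidden, but it calls for great care. Prefer composition, single inheritance plus interfaces, or a simpler design.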

Namespaces#

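Why namespaces are needed#

If two header files both declare functions with the same names in the global scope, the names clash.

// old1.h
void f();
void g();

// old2.h
void f();
void g();

When one program uses both headers, f and g collide: each library means its own function, but both sets of declarations name the same ::f and ::g. Depending on the signatures, the result is a compile error or two definitions of one function at link time.

The fix is to wrap the names in namespaces:

// old1.h
namespace old1 {
    void f();
    void g();
}

// old2.h
namespace old2 {
    void f();
    void g();
}

The full names are now:

old1::f();
old2::f();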
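A runnable variant of the same idea is sketched below. The function bodies and the return type are invented for the illustration, since the notes only show declarations; call_both is likewise a hypothetical helper.

```cpp
#include <string>

// Two hypothetical libraries that both chose the name f.
namespace old1 {
    std::string f() { return "old1::f"; }
}

namespace old2 {
    std::string f() { return "old2::f"; }
}

// Both functions coexist because their full names differ.
std::string call_both() {
    return old1::f() + " " + old2::f();
}
```

Without the two namespace blocks, the two definitions of f would be redefinitions of the same function and the file would not compile.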

Defining a namespace#

A namespace expresses a group of logically related classes, functions, variables, and so on.

namespace Math {
    double abs(double);
    double sqrt(double);
    int trunc(double);
    // ...
}

Note that, unlike a class, the closing brace of a namespace does not need a semicolon.

namespace Math {
    // ...
}  // no semicolon required

A namespace is itself a scope, just like a class scope.

Use a namespace whenever name encapsulation is needed.

Putting namespace declarations in header files#

The declarations in a namespace are usually written in a header file.

// MyLib.h
namespace MyLib {
    void foo();

    class Cat {
    public:
        void Meow();
    };
}

After including MyLib.h, users can see the declarations of MyLib::foo and MyLib::Cat.

Implementing functions in a namespace#

Functions in a namespace can be implemented with the scope resolution operator ::.

#include <iostream>
#include "MyLib.h"
using namespace std;

void MyLib::foo()
{
    cout << "foo\n";
}

void MyLib::Cat::Meow()
{
    cout << "meow\n";
}

Here:

MyLib::foo says that foo is a function in the MyLib namespace;
MyLib::Cat::Meow says that Meow is a member function of the Cat class in the MyLib namespace.

The implementation can also be written inside a namespace block:

// assumes the same includes and using-directive as above
namespace MyLib {
    void foo()
    {
        cout << "foo\n";
    }

    void Cat::Meow()
    {
        cout << "meow\n";
    }
}

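Using names from a namespace#

The most direct way is the scope resolution operator:

#include "MyLib.h"

int main()
{
    MyLib::foo();

    MyLib::Cat c;
    c.Meow();
}

The advantage is that the origin of each name is explicit; the drawback is verbosity when the names are long.

`using` declaration#

A using declaration introduces one specific name, giving the local scope a shorthand for it.

int main()
{
    using MyLib::foo;
    using MyLib::Cat;

    foo();

    Cat c;
    c.Meow();
}

Properties:

only the named entity is introduced;
a single place states clearly where the name comes from;
it is easier to control than using namespace.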
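A well-known place where a using declaration is used deliberately is the swap idiom in generic code. The sketch below goes slightly beyond the notes in that it relies on argument-dependent lookup, which the lecture text does not cover; the namespace shapes, the type Box, the counter, and the helper exchange are all hypothetical names for this example.

```cpp
#include <utility>   // std::swap

namespace shapes {   // hypothetical library namespace
    struct Box {
        int width = 0;
    };

    int custom_swap_calls = 0;

    // A swap specific to Box, found by argument-dependent lookup.
    void swap(Box& a, Box& b) {
        ++custom_swap_calls;
        std::swap(a.width, b.width);
    }
}

// Generic helper: the using-declaration brings in exactly one name,
// std::swap, as a fallback; a swap in the argument's own namespace is
// still found and preferred when it is a better match.
template <typename T>
void exchange(T& a, T& b) {
    using std::swap;
    swap(a, b);
}
```

For int arguments exchange falls back to std::swap; for shapes::Box it picks shapes::swap, which the counter makes observable.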

`using` directive#

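A using directive (using namespace) makes all the names in a namespace visible.

int main()
{
    using namespace std;
    using namespace MyLib;

    foo();

    Cat c;
    c.Meow();

    cout << "hello" << endl;
}

It is a notational convenience, but it raises the risk of name clashes.

WARNING
Do not write using namespace std; in a header file. Headers are included by many source files, so the directive pollutes the scope of every includer with a large number of names and easily causes conflicts.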

Namespace ambiguity#

Consider:

namespace XLib {
    void x();
    void y();
}

namespace YLib {
    void y();
    void z();
}

With two using-directives in effect at once:

int main()
{
    using namespace XLib;
    using namespace YLib;

    x();        // OK
    y();        // Error: ambiguous
    XLib::y();  // OK, resolves to XLib
    z();        // OK
}

using namespace only makes names visible. The ambiguity is reported only when the ambiguous name is actually used.

The fix is to use qualified names:

XLib::y();
YLib::y();
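Both ways of resolving the clash can be tried in a compilable sketch. The function bodies and return values are invented so that the chosen overload is observable; call_qualified and call_with_using_declaration are hypothetical helper names.

```cpp
namespace XLib {
    int y() { return 1; }
}

namespace YLib {
    int y() { return 2; }
}

// Option 1: qualify each call.
int call_qualified() {
    return XLib::y() * 10 + YLib::y();
}

// Option 2: a using-declaration declares y in this block, so it is found
// before the names made visible by the using-directives.
int call_with_using_declaration() {
    using namespace XLib;
    using namespace YLib;
    using XLib::y;
    return y();
}
```

Option 2 keeps the convenience of the directives for every other name while pinning down the one name that clashes.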

Namespace aliases#

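Short namespace names clash easily, while long ones are inconvenient to use. An alias helps.

namespace supercalifragilistic {
    void f();
}

namespace short_ns = supercalifragilistic;

int main()
{
    short_ns::f();  // same as supercalifragilistic::f()
}

Aliases can also be used to manage library versions. For example:

namespace mylib_v1 {
    void f();
}

namespace mylib = mylib_v1;

If the library is later upgraded to mylib_v2, only the alias has to be redirected.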
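The version switch can be sketched as follows. The function version() and its return values are assumptions added so that the effect of the alias is observable; the notes only declare f().

```cpp
namespace mylib_v1 {
    int version() { return 1; }
}

namespace mylib_v2 {
    int version() { return 2; }
}

// Changing this one line moves every mylib:: user to the other version.
namespace mylib = mylib_v2;

int current_version() {
    return mylib::version();
}
```

Code that needs the old behavior can still name mylib_v1 explicitly, since the alias does not hide the original namespaces.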

Namespace composition#

Several namespaces can be composed into a new one.

namespace first {
    void x();
    void y();
}

namespace second {
    void y();
    void z();
}

Composing them directly makes y clash. A using-declaration can state which one to adopt:

namespace mine {
    using namespace first;
    using namespace second;

    using first::y;  // resolve clashes

    void mystuff();
    /* ... */
}

int main()
{
    mine::x();
    mine::y();       // call first::y()
    mine::mystuff();
}

Here mine combines the functionality of first and second and resolves the clash on y explicitly.

Namespace selection#

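A new namespace can also be built from selected names only.

namespace mine {
    using orig::Cat;  // use Cat class from orig

    void x();
    void y();
}

This is more precise than using namespace orig: only the parts that are needed are exposed.

If the declaration of orig::Cat changes, the name mine::Cat reflects the change as well.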

Namespaces are open#

Namespaces are open. Multiple declarations of a namespace are merged into the same namespace.

// header1.h
namespace X {
    void f();
}

// header2.h
namespace X {
    void g();  // X now has f() and g()
}

This means that a namespace can be maintained across several files.

Some extension points of the standard library rely on this as well. For example, specializing std::hash for a user-defined type:

#include <cstddef>
#include <functional>

namespace mylib {
    class Widget {
        /* ... */
    };
}

// Specialize std::hash so that Widget can be used as
// a key in std::unordered_set and std::unordered_map.
namespace std {
    template<>
    struct hash<mylib::Widget> {
        size_t operator()(const mylib::Widget& obj) const {
            return /* ... */ 0;
        }
    };
}

Note: in general, do not add new functions or classes to std. Specializing a standard library template for a user-defined type is one of the few permitted ways to extend it.
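A filled-in version of the specialization might look like the sketch below. The Widget members, the equality operator, the way the two member hashes are combined, and the helper count_distinct are all assumptions made for this illustration; the notes leave the class body and the hash computation elided.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

namespace mylib {
    // Hypothetical Widget with two fields that define its identity.
    struct Widget {
        int id;
        std::string name;
    };

    // Unordered containers need equality in addition to a hash.
    bool operator==(const Widget& a, const Widget& b) {
        return a.id == b.id && a.name == b.name;
    }
}

namespace std {
    template <>
    struct hash<mylib::Widget> {
        size_t operator()(const mylib::Widget& w) const {
            // Simple combination of the member hashes; fine for a demo,
            // not tuned for hash quality.
            size_t h1 = hash<int>()(w.id);
            size_t h2 = hash<string>()(w.name);
            return h1 ^ (h2 << 1);
        }
    };
}

// Inserts three widgets, two of which are equal, and returns the set size.
std::size_t count_distinct() {
    std::unordered_set<mylib::Widget> widgets;
    widgets.insert(mylib::Widget{1, "gear"});
    widgets.insert(mylib::Widget{2, "lever"});
    widgets.insert(mylib::Widget{1, "gear"});   // duplicate, ignored by the set
    return widgets.size();
}
```

The hash and operator== must agree: widgets that compare equal have to produce the same hash, which holds here because both use the same two fields.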
