STL pair Source Code Analysis

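pair is a simple struct provided by the STL for holding two values whose types may differ, and it is one of the most frequently used data structures. Both values are exposed as public members and can be accessed directly as first and second. This article walks through the implementation of pair, using the STL source shipped with MSVC as the reference. The source is simplified in places along the way.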

First come the data members it stores. pair also records the types of the two values:

template <class _Ty1, class _Ty2>
struct pair { // store a pair of values
    using first_type  = _Ty1;
    using second_type = _Ty2;

    _Ty1 first; // the first stored value
    _Ty2 second; // the second stored value
};
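As a small illustration of the member typedefs and the public data members, the following sketch checks them at compile time. The alias Entry and the helper sum_members are hypothetical names made up for this example.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical alias used only for this illustration.
using Entry = std::pair<int, std::string>;

// The member typedefs report the two template arguments.
static_assert(std::is_same<Entry::first_type, int>::value, "first_type is int");
static_assert(std::is_same<Entry::second_type, std::string>::value, "second_type is std::string");

// first and second are public, so they can be read directly.
constexpr int sum_members(const std::pair<int, int>& p) {
    return p.first + p.second;
}
```

Generic code can use first_type and second_type to name the element types without knowing the template arguments.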

pair provides the following constructors. Their implementations are all fairly simple, so they are collected here:

template <class _Ty1, class _Ty2>
struct pair {
    pair() : first(), second() {}

    pair(const _Ty1& _Val1, const _Ty2& _Val2) : first(_Val1), second(_Val2) {}

    template <class _Other1, class _Other2>
    pair(_Other1&& _Val1, _Other2&& _Val2)
        : first(_STD forward<_Other1>(_Val1)), second(_STD forward<_Other2>(_Val2)) {}

    pair(const pair&) = default;
    pair(pair&&)      = default;

    template <class _Other1, class _Other2>
    pair(const pair<_Other1, _Other2>& _Right) : first(_Right.first), second(_Right.second) {}

    template <class _Other1, class _Other2>
    pair(pair<_Other1, _Other2>&& _Right)
        : first(_STD forward<_Other1>(_Right.first)), second(_STD forward<_Other2>(_Right.second)) {}

    template <class _Tuple1, class _Tuple2, size_t... _Indices1, size_t... _Indices2>
    constexpr pair(_Tuple1& _Val1, _Tuple2& _Val2, index_sequence<_Indices1...>, index_sequence<_Indices2...>)
        : first(_Tuple_get<_Indices1>(_STD move(_Val1))...), second(_Tuple_get<_Indices2>(_STD move(_Val2))...) {}

    template <class... _Types1, class... _Types2>
    pair(piecewise_construct_t, tuple<_Types1...> _Val1, tuple<_Types2...> _Val2)
        : pair(_Val1, _Val2, index_sequence_for<_Types1...>{}, index_sequence_for<_Types2...>{}) {}
};

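Looking at these, pair's constructors fall into three groups. The first group takes two values, val1 and val2, and uses them to initialize first and second respectively. The templated overload takes forwarding references, so it passes its arguments through std::forward: an rvalue argument stays an rvalue and selects the move constructor of first or second, while an lvalue argument is copied. The second group takes another pair object, right, and constructs first and second from right.first and right.second, again taking rvalues into account. The third group is more involved: its parameters are tuples. tuple is C++'s heterogeneous fixed-size collection and can hold any number of elements, not just two. So what are index_sequence and piecewise_construct_t? An example with two constructions makes this clear. Note that the constructor taking two index_sequence arguments is an implementation detail of MSVC's STL rather than part of the standard interface, so the p1 line is not guaranteed to compile with other standard libraries: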

#include <iostream>
#include <utility>
#include <tuple>

using namespace std;

struct A
{
    int x;
    int y;
    int z;
    A() = default;
    A(int x, int y, int z) : x(x), y(y), z(z) {}
};

int main()
{
    tuple<int, int, int> t1(1, 2, 3);
    tuple<int, int, int> t2(6, 5, 4);
    // MSVC-internal constructor: the index sequences pick the tuple elements and their order
    pair<A, A> p1(t1, t2, index_sequence<0, 1, 2>{}, index_sequence<2, 1, 0>{});
    // standard piecewise construction: every tuple element is used, in order
    pair<A, A> p2(piecewise_construct, t1, t2);
    cout << p1.first.x << " " << p1.first.y << " " << p1.first.z << " "
         << p1.second.x << " " << p1.second.y << " " << p1.second.z << endl;
    cout << p2.first.x << " " << p2.first.y << " " << p2.first.z << " "
         << p2.second.x << " " << p2.second.y << " " << p2.second.z << endl;
    return 0;
}

The example prints:

1 2 3 4 5 6
1 2 3 6 5 4

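This makes things clear: these two constructors do not treat a tuple as a single value. Instead, they take the elements out of the tuple and use them as constructor arguments for the members of the pair. The constructor taking two index_sequence arguments uses the sequences to say which tuple elements are used, and in what order, to construct first and second. The constructor taking piecewise_construct_t is a special case of the former: it takes all elements of each tuple in order, from first to last. The source confirms this, since it is implemented by delegating to the index_sequence constructor.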
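Piecewise construction is most useful when the two members need different numbers of constructor arguments. The following is a minimal sketch using only the standard piecewise_construct interface together with std::forward_as_tuple; the function names make_piecewise and check_piecewise are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Builds both members in place from their own argument lists:
//   first  is constructed as std::string(3, 'a')     -> "aaa"
//   second is constructed as std::vector<int>(2, 7)  -> {7, 7}
inline std::pair<std::string, std::vector<int>> make_piecewise() {
    return std::pair<std::string, std::vector<int>>(
        std::piecewise_construct,
        std::forward_as_tuple(3, 'a'),
        std::forward_as_tuple(2, 7));
}

// Hypothetical usage check for the sketch above.
inline void check_piecewise() {
    const auto p = make_piecewise();
    assert(p.first == "aaa");
    assert(p.second == std::vector<int>(2, 7));
}
```

Without piecewise_construct, the two tuples would be treated as ordinary values and pair would try to construct first and second from one tuple each.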

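In the actual STL implementation, the constructor signatures are not as tidy as those listed above. For every function, the STL tries to do as much type checking as possible at compile time and to determine whether the function can throw. Taking one constructor as an example, here is its full declaration: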

template <class _Uty1 = _Ty1, class _Uty2 = _Ty2,
    enable_if_t<conjunction_v<is_default_constructible<_Uty1>, is_default_constructible<_Uty2>>, int> = 0>
constexpr explicit(!conjunction_v<_Is_implicitly_default_constructible<_Uty1>, _Is_implicitly_default_constructible<_Uty2>>)
    pair() noexcept(
        is_nothrow_default_constructible_v<_Uty1>&& is_nothrow_default_constructible_v<_Uty2>) // strengthened
    : first(), second() {}

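Start with the template declaration, which contains an enable_if_t<bool, T>. It acts as a compile-time check: only when the first template argument evaluates to true does enable_if_t name the type T. If the check fails, there is no such type, substitution into the template declaration fails, and the constructor is dropped from overload resolution (SFINAE). A compile error is reported only if no other viable constructor remains for the call.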
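The enable_if_t mechanism can be seen in isolation with a pair of function templates. This is a minimal sketch, not taken from the STL source; classify is a hypothetical function, and the return values 1 and 2 are arbitrary markers.

```cpp
#include <type_traits>

// Visible to overload resolution only when T is an integral type.
template <class T, std::enable_if_t<std::is_integral<T>::value, int> = 0>
constexpr int classify(T) {
    return 1;
}

// Visible to overload resolution only when T is not an integral type.
template <class T, std::enable_if_t<!std::is_integral<T>::value, int> = 0>
constexpr int classify(T) {
    return 2;
}

// For each call exactly one overload survives substitution, so there is no ambiguity.
static_assert(classify(42) == 1, "int selects the integral overload");
static_assert(classify(3.5) == 2, "double selects the fallback overload");
```

The overload whose condition is false does not cause an error; it simply does not take part in the call.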

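So what is being checked? It is a conjunction_v<T...>, which evaluates its arguments one by one and yields true only when all of them are true. Here it contains is_default_constructible<_Uty1> and is_default_constructible<_Uty2>; as the name suggests, this trait tells whether a type can be default-constructed. So the picture is clear: pair has a default constructor only when both of its element types have one, and otherwise default-constructing the pair fails to compile. An example verifies this: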

#include <iostream>
#include <utility>
#include <tuple>

using namespace std;

struct A
{
    A() = default;
    A(int x) {}
};

struct B
{
    B() = delete;
    B(int x) {}
};

int main()
{
    pair<A, A> p1; // ok
    pair<A, B> p2; // error
    pair<A, B> p3(1, 2); // ok
    return 0;
}

In this example B has no default constructor, so p2 fails to compile. Since both A and B have a constructor taking an int, p3 compiles successfully.
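Because the constraint removes the constructor instead of producing a hard error, the same facts can be queried with type traits without breaking the build. A minimal sketch, assuming a standard library that constrains pair's default constructor as described above; the struct names are hypothetical.

```cpp
#include <type_traits>
#include <utility>

struct WithDefault {
    WithDefault() = default;
    WithDefault(int) {}
};

struct NoDefault {
    NoDefault() = delete;
    NoDefault(int) {}
};

// Both element types are default-constructible, so the pair is too.
static_assert(std::is_default_constructible<std::pair<WithDefault, WithDefault>>::value,
              "pair<WithDefault, WithDefault> has a default constructor");

// One element type is not, so the pair's default constructor is disabled.
static_assert(!std::is_default_constructible<std::pair<WithDefault, NoDefault>>::value,
              "pair<WithDefault, NoDefault> has no default constructor");

// The two-argument constructor is unaffected.
static_assert(std::is_constructible<std::pair<WithDefault, NoDefault>, int, int>::value,
              "pair<WithDefault, NoDefault> can be built from two ints");
```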

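Next comes an explicit(bool) specifier, a C++20 feature. If the expression evaluates to true, explicit takes effect: the default constructor is explicit, so the object must be constructed explicitly and cannot be created implicitly, for example by copy-list-initialization from {}. It follows that if the default constructor of either element type is explicit, pair's default constructor is explicit as well, which makes sense. Again, an example backs this up: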

#include <iostream>
#include <utility>
#include <tuple>

using namespace std;

struct A
{
    A() = default;
};

struct B
{
    explicit B() = default;
};

int main()
{
    pair<A, A> p1 = {}; // ok
    pair<A, B> p2 = {}; // error
    return 0;
}

When compiled with MSVC, the error message helpfully points out that the constructor is explicit:

test.cpp(20): error C2512: 'std::pair<A,B>': no appropriate default constructor available
test.cpp(20): note: Constructor for struct 'std::pair<A,B>' is declared 'explicit'
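The _Is_implicitly_default_constructible trait used in the explicit(...) condition is internal to MSVC. One common way to write such a trait is sketched below, assuming C++17 for std::void_t. The names is_implicitly_default_constructible, implicit_default_probe, Relaxed, and Strict are hypothetical, and this is not claimed to be MSVC's exact code.

```cpp
#include <type_traits>
#include <utility>

// Declared but never defined: it is only named inside decltype.
template <class T>
void implicit_default_probe(const T&);

// Primary template: assume T cannot be created implicitly from {}.
template <class T, class = void>
struct is_implicitly_default_constructible : std::false_type {};

// Selected when implicit_default_probe<T>({}) is well-formed, that is,
// when copy-list-initialization of T from {} is allowed.
template <class T>
struct is_implicitly_default_constructible<
    T, std::void_t<decltype(implicit_default_probe<T>({}))>> : std::true_type {};

struct Relaxed {
    Relaxed() {}
};

struct Strict {
    explicit Strict() {}
};

static_assert(is_implicitly_default_constructible<Relaxed>::value, "Relaxed r = {}; is allowed");
static_assert(!is_implicitly_default_constructible<Strict>::value, "Strict s = {}; is rejected");

// pair follows its element types: one explicit default constructor is enough.
static_assert(is_implicitly_default_constructible<std::pair<Relaxed, Relaxed>>::value, "implicit");
static_assert(!is_implicitly_default_constructible<std::pair<Relaxed, Strict>>::value, "explicit");
```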

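Finally there is the noexcept specification. By the same logic, pair's default constructor is noexcept only when the default constructors of both element types do not throw. The "strengthened" comment in the MSVC source marks a noexcept guarantee that goes beyond what the standard requires.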
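Since the conditional noexcept on pair's default constructor is an MSVC strengthening, the pattern is easier to show on a stand-alone type. A minimal sketch with a hypothetical TwoValues struct that mirrors the same condition:

```cpp
#include <type_traits>

// Hypothetical pair-like struct whose default constructor is noexcept
// exactly when both members can be default-constructed without throwing.
template <class T1, class T2>
struct TwoValues {
    T1 a;
    T2 b;

    TwoValues() noexcept(std::is_nothrow_default_constructible<T1>::value &&
                         std::is_nothrow_default_constructible<T2>::value)
        : a(), b() {}
};

struct Quiet {
    Quiet() noexcept {}
};

struct Loud {
    Loud() {} // not marked noexcept, so it is treated as potentially throwing
};

static_assert(std::is_nothrow_default_constructible<TwoValues<Quiet, Quiet>>::value,
              "both members are nothrow, so the struct is nothrow");
static_assert(!std::is_nothrow_default_constructible<TwoValues<Quiet, Loud>>::value,
              "one potentially throwing member is enough to drop noexcept");
```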

With constructors in place, matching assignment operations are needed. pair overloads the assignment operator roughly as follows:

template <class _Ty1, class _Ty2>
struct pair {
    pair& operator=(const volatile pair&) = delete;

    template <class _Myself = pair>
    pair& operator=(_Identity_t<const _Myself&> _Right) {
        first  = _Right.first;
        second = _Right.second;
        return *this;
    }

    template <class _Myself = pair>
    pair& operator=(_Identity_t<_Myself&&> _Right) {
        first  = _STD forward<_Ty1>(_Right.first);
        second = _STD forward<_Ty2>(_Right.second);
        return *this;
    }

    template <class _Other1, class _Other2>
    pair& operator=(const pair<_Other1, _Other2>& _Right) {
        first  = _Right.first;
        second = _Right.second;
        return *this;
    }

    template <class _Other1, class _Other2>
    pair& operator=(pair<_Other1, _Other2>&& _Right) {
        first  = _STD forward<_Other1>(_Right.first);
        second = _STD forward<_Other2>(_Right.second);
        return *this;
    }
};
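The templated overloads listed above allow assignment from a pair whose element types differ, as long as each element is assignable. A minimal sketch; assign_across_types and check_assign are hypothetical helper names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Assigns a pair<int, const char*> to a pair<long, std::string>.
// Each member is assigned separately: long = int, std::string = const char*.
inline std::pair<long, std::string> assign_across_types() {
    std::pair<int, const char*> src(42, "hello");
    std::pair<long, std::string> dst;
    dst = src; // uses the operator= template taking const pair<_Other1, _Other2>&
    return dst;
}

// Hypothetical usage check for the sketch above.
inline void check_assign() {
    const auto p = assign_across_types();
    assert(p.first == 42L);
    assert(p.second == "hello");
}
```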

These largely mirror the constructors, so they are not discussed further. pair also provides swap, which exchanges the values of two pairs of the same type:

template <class _Ty1, class _Ty2>
struct pair {
    void swap(pair& _Right) {
        using _STD swap;
        if (this != _STD addressof(_Right)) {
            swap(first, _Right.first); // intentional ADL
            swap(second, _Right.second); // intentional ADL
        }
    }
};

template <class _Ty1, class _Ty2>
void swap(pair<_Ty1, _Ty2>& _Left, pair<_Ty1, _Ty2>& _Right) {
    _Left.swap(_Right);
}

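Both arguments must be pairs of the same type: swap takes its parameter by non-const lvalue reference, so a pair of a different type cannot bind to it even if it is convertible. The implementation itself is simple: it swaps first and second separately. The using _STD swap declaration followed by unqualified calls lets ADL pick a swap defined alongside the element type, with std::swap as the fallback.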
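The effect of the ADL call can be observed by giving an element type its own swap and counting how often it runs. This is a minimal sketch; the namespace demo, the Counter type, and the counter variable are all hypothetical.

```cpp
#include <cassert>
#include <utility>

namespace demo {

struct Counter {
    int value;
};

int custom_swap_calls = 0; // incremented each time demo::swap runs

// Found through ADL because it lives in the same namespace as Counter.
void swap(Counter& a, Counter& b) noexcept {
    ++custom_swap_calls;
    const int tmp = a.value;
    a.value = b.value;
    b.value = tmp;
}

} // namespace demo

// Swaps two pairs and reports how many times demo::swap was used.
inline int count_custom_swaps() {
    demo::custom_swap_calls = 0;
    std::pair<demo::Counter, int> a(demo::Counter{1}, 10);
    std::pair<demo::Counter, int> b(demo::Counter{2}, 20);
    a.swap(b);
    assert(a.first.value == 2 && a.second == 20);
    assert(b.first.value == 1 && b.second == 10);
    return demo::custom_swap_calls;
}
```

The int member has no swap of its own, so it goes through std::swap, while the Counter member is swapped by demo::swap.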

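pair also supports comparison. It compares the first elements first; if they differ, that result is the final result, and only if they are equal is the second element compared. This means two pair objects are considered equal only when both elements are equal.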

template <class _Ty1, class _Ty2, class _Uty1, class _Uty2>
constexpr bool operator==(const pair<_Ty1, _Ty2>& _Left, const pair<_Uty1, _Uty2>& _Right) {
    return _Left.first == _Right.first && _Left.second == _Right.second;
}

template <class _Ty1, class _Ty2, class _Uty1, class _Uty2>
constexpr common_comparison_category_t<_Synth_three_way_result<_Ty1, _Uty1>,
    _Synth_three_way_result<_Ty2, _Uty2>>
operator<=>(const pair<_Ty1, _Ty2>& _Left, const pair<_Uty1, _Uty2>& _Right) {
    if (auto _Result = _Synth_three_way{}(_Left.first, _Right.first); _Result != 0) {
        return _Result;
    }
    return _Synth_three_way{}(_Left.second, _Right.second);
}
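The lexicographic rule can be checked with the ordinary relational operators, which also work before C++20. A minimal sketch; IntPair and lexicographic_less are hypothetical names, and the hand-written function only restates the rule for comparison.

```cpp
#include <utility>

using IntPair = std::pair<int, int>;

// The first elements differ, so they decide the result on their own.
static_assert(IntPair(1, 9) < IntPair(2, 0), "1 < 2 decides, 9 and 0 are ignored");

// The first elements are equal, so the second elements break the tie.
static_assert(IntPair(1, 2) < IntPair(1, 3), "tie on first, 2 < 3 decides");

// Equality requires both elements to match.
static_assert(IntPair(1, 2) == IntPair(1, 2), "both elements equal");
static_assert(!(IntPair(1, 2) == IntPair(1, 3)), "second elements differ");

// Hand-written version of the same ordering rule.
constexpr bool lexicographic_less(const IntPair& a, const IntPair& b) {
    return a.first < b.first || (!(b.first < a.first) && a.second < b.second);
}
```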

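C++20 introduced the three-way comparison (spaceship) operator <=>, which removes the need to write repetitive comparison code. The object returned by <=> has the following properties: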

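When the left operand < the right operand, the result < 0;

When the left operand > the right operand, the result > 0;

When the left operand equals the right operand, the result == 0.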

Again, an example verifies this:

#include <iostream>
#include <utility>
#include <tuple>

using namespace std;

int main()
{
    pair<int, int> p1(1, 2);
    pair<int, int> p2(3, 4);
    auto comp = (p1 <=> p2);
    if (comp < 0)
    {
        cout << "p1 < p2" << endl;
    }
    else if (comp > 0)
    {
        cout << "p1 > p2" << endl;
    }
    else
    {
        cout << "p1 == p2" << endl;
    }
    return 0;
}

Remember to compile with the C++20 standard, otherwise the code will not compile. The result is:

>cl test.cpp /std:c++20
>test.exe
p1 < p2

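Finally, pair can also be created with the convenience function make_pair. Note that if an argument is a reference_wrapper, the corresponding element type of the pair is the wrapped reference type. make_pair itself is simple: it just calls pair's constructor.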

template <class _Ty>
struct _Unrefwrap_helper { // leave unchanged if not a reference_wrapper
    using type = _Ty;
};

template <class _Ty>
struct _Unrefwrap_helper<reference_wrapper<_Ty>> { // make a reference from a reference_wrapper
    using type = _Ty&;
};

// decay, then unwrap a reference_wrapper
template <class _Ty>
using _Unrefwrap_t = typename _Unrefwrap_helper<decay_t<_Ty>>::type;

_EXPORT_STD template <class _Ty1, class _Ty2>
_NODISCARD constexpr pair<_Unrefwrap_t<_Ty1>, _Unrefwrap_t<_Ty2>> make_pair(_Ty1&& _Val1, _Ty2&& _Val2) noexcept(
    is_nothrow_constructible_v<_Unrefwrap_t<_Ty1>, _Ty1>&&
        is_nothrow_constructible_v<_Unrefwrap_t<_Ty2>, _Ty2>) /* strengthened */ {
    // return pair composed from arguments
    using _Mypair = pair<_Unrefwrap_t<_Ty1>, _Unrefwrap_t<_Ty2>>;
    return _Mypair(_STD forward<_Ty1>(_Val1), _STD forward<_Ty2>(_Val2));
}

An example shows what the special handling of reference_wrapper is for:

#include <functional> // std::ref
#include <iostream>
#include <utility>

using namespace std;

int main()
{
    int x = 1;

    auto p1 = make_pair(x, x); // pair<int, int>: holds copies of x
    p1.first = 3;
    cout << "x " << x << endl;

    int& y = x;
    auto p2 = make_pair(y, y); // still pair<int, int>: the reference is decayed away
    p2.first = 5;
    cout << "x " << x << endl;

    auto p3 = make_pair(ref(x), ref(x)); // pair<int&, int&>: reference_wrapper is unwrapped
    p3.first = 7;
    cout << "x " << x << endl;

    pair<int&, int&> p4(y, y); // reference members requested explicitly
    p4.first = 9;
    cout << "x " << x << endl;

    return 0;
}

The example prints:

x 1
x 1
x 7
x 9

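The slightly surprising case is p2: the argument is declared as int&, yet the pair's element type is int. This is because the return type of make_pair is built from _Unrefwrap_t, which first applies decay_t and strips the reference. _Ty1 and _Ty2 are initially deduced as int& (as they are for p1 too, since x is also an lvalue), but after decay_t they become int, and that is what is passed to pair.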
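The resulting element types can be confirmed at compile time by inspecting the return type of make_pair. A minimal sketch using std::declval to stand in for lvalue arguments; nothing here is evaluated at run time.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Lvalue int arguments: the parameters are deduced as int&, and decay turns them into int.
static_assert(std::is_same<decltype(std::make_pair(std::declval<int&>(), std::declval<int&>())),
                           std::pair<int, int>>::value,
              "lvalue ints produce pair<int, int>");

// reference_wrapper arguments are unwrapped into real references.
static_assert(std::is_same<decltype(std::make_pair(std::ref(std::declval<int&>()),
                                                   std::ref(std::declval<int&>()))),
                           std::pair<int&, int&>>::value,
              "std::ref produces pair<int&, int&>");

// decay also converts arrays to pointers, so a string literal becomes const char*.
static_assert(std::is_same<decltype(std::make_pair("ab", 1)), std::pair<const char*, int>>::value,
              "string literal decays to const char*");
```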