I recently spent some time investigating kolla-deployed OpenStack, in particular the high-availability mechanisms of the MariaDB Galera cluster itself and how well HAProxy and ProxySQL can load-balance a Galera cluster. This article is a write-up of that investigation, which I hope will be useful to others with similar needs.
MariaDB cluster overview
MariaDB offers several cluster modes, each with its own strengths for different workloads.
| Cluster mode | Type | Typical use case | Latency | Officially supported |
|---|---|---|---|---|
| MariaDB Galera Cluster | Synchronous replication, strong consistency | High-frequency OLTP | Low | ✅ Yes |
| MariaDB Replication | Asynchronous replication, eventual consistency | Read/write splitting | Medium-high | ✅ Yes |
| MariaDB ColumnStore | Columnar distributed storage | OLAP analytics | High | ✅ Commercial |
| MariaDB Xpand | Distributed SQL database | Very-large-scale OLTP | Medium | ✅ Commercial |
MariaDB Galera cluster introduction
What is a Galera cluster
Galera is the multi-master, synchronous-replication cluster solution recommended by MariaDB, aimed mainly at OLTP (Online Transaction Processing) workloads that require strong consistency and high availability.
When kolla deploys an OpenStack environment, it uses Galera as the backend database cluster by default. The official recommendation is that a Galera cluster consist of at least three MariaDB nodes.
Galera cluster synchronization mechanism
From the moment an application sends a data request to the Galera cluster until the transaction is committed, the interaction between the application and the cluster, and the cluster's internal synchronization, can be summarized as follows:
- Clients A, B and C connect to mariadb_A, mariadb_B and mariadb_C respectively and write concurrently
- Each MariaDB node generates a write set (WriteSet) for its transaction: write set A, write set B, write set C. Write set A, for example, might look like this:
{
  "transaction_id": "a1b2c3d4-5678",
  "global_seqno": 1024,             // provisional sequence number
  "source_id": "mariadb_A",         // originating node
  "timestamp": 1630000000.123,      // timestamp
  "schema": {
    "name": "users",                // table name
    "version": 3                    // table schema version
  },
  "rows": [
    {
      "key": {"id": 1},             // primary key value
      "action": "UPDATE",           // operation type
      "before": {"name": "LiLei"},  // value before the change
      "after": {"name": "Tom"},     // value after the change
      "version": 5                  // row version (used for conflict detection), incremented by 1 on each modification
    }
  ],
  "dependencies": []                // seqnos of preceding transactions this one depends on
}
- The write set contains the key information about the transaction's write operations
- The write set is broadcast to the cluster to announce which data the SQL operation affects. It is not the raw SQL statement but a description of the rows that will be modified
- Write-set broadcast: each MariaDB node broadcasts its write set through Galera's GCS (Group Communication System)
- Global transaction ordering:
  - GCS uses a distributed consistency protocol (similar to Paxos; Galera calls it Virtually Synchronous Replication) to negotiate a cluster-wide execution order for transactions
  - Concurrent transactions (A, B, C) are assigned globally unique, strictly increasing seqno values, which establishes a global execution order for all transactions
- Certification (conflict detection) and commit:
  - If the transaction being certified does not conflict with concurrent, already committed transactions, it is committed; otherwise it is rejected and a deadlock error is returned
  - Each node commits the transactions in global seqno order
In Galera Cluster's multi-master architecture, the final outcome of a transaction is therefore determined not by when the transaction was started, but by the global Total Ordering performed by the GCS (Group Communication System).
Pros and cons of a Galera cluster
Main advantages:
| Advantage | Description |
|---|---|
| Multi-master architecture | Every node can serve both read and write requests |
| Synchronous replication | Uses write-set replication; commits are synchronized through the group communication system (GCS) |
| Strong consistency | Every transaction must pass global certification, ordering and conflict detection before commit, so the data on all nodes is equivalent |
| No single point of failure | The cluster remains available when any single node goes down |
| Automatic sync for new nodes | New nodes can join automatically and synchronize state via State Snapshot Transfer (SST) or Incremental State Transfer (IST) |
Main drawbacks:
| Drawback | Description |
|---|---|
| Write conflicts cause rollbacks | Concurrent modifications of the same row are detected as conflicts and have to be handled (typically retried) by the application |
| Commit latency is bound by the slowest node | All nodes must reach consensus (global ordering and conflict detection) before a transaction commits, which adds latency |
| Limited horizontal write scaling | Transactions are broadcast to and applied on every node, write performance is limited by the slowest node, and every node is a full replica, so adding nodes does not increase write throughput |
Concurrent updates of the same row
- Table definition
MariaDB [test]> desc tmp;
+-------------+--------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| data | varchar(255) | YES | | NULL | |
| created_at | timestamp | NO | | current_timestamp() | |
| data_origin | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------------------+----------------+
4 rows in set (0.003 sec)
- Concurrent-conflict demo

import uuid
import pymysql
import threading

DB_HOSTS = ["18.242.143.30", "18.242.143.31", "18.242.143.32"]
DB_USER = "root"
DB_PASS = "nuage#2345"
DB_NAME = "test"

def write_to_db(ip, data, data_origin, row_id):
    conn = None
    try:
        conn = pymysql.connect(host=ip, user=DB_USER, password=DB_PASS, database=DB_NAME)
        cursor = conn.cursor()
        # start the transaction
        cursor.execute("BEGIN;")
        # upsert the row
        cursor.execute("""
            INSERT INTO tmp (id, data, data_origin)
            VALUES (%s, %s, %s)
            ON DUPLICATE KEY UPDATE data = VALUES(data), data_origin = VALUES(data_origin);
        """, (row_id, data, data_origin))
        # commit the transaction
        conn.commit()
        print(f"[{ip}] ✅ Update Insert Success")
    except Exception as e:
        print(f"[{ip}] ❌ Update Insert Error: {e}")
    finally:
        if conn:
            conn.close()

# create the concurrent threads
thread1 = threading.Thread(target=write_to_db, args=(DB_HOSTS[0], str(uuid.uuid4())[:8], 'value_from_30', 3))
thread2 = threading.Thread(target=write_to_db, args=(DB_HOSTS[1], str(uuid.uuid4())[:8], 'value_from_31', 3))
thread3 = threading.Thread(target=write_to_db, args=(DB_HOSTS[2], str(uuid.uuid4())[:8], 'value_from_32', 3))

thread1.start()
thread2.start()
thread3.start()

thread1.join()
thread2.join()
thread3.join()
- Output
[18.242.143.31] ✅ Update Insert Success
[18.242.143.30] ❌ Update Insert Error: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
[18.242.143.32] ❌ Update Insert Error: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
MariaDB [test]> select * from tmp;
+----+----------+---------------------+---------------+
| id | data | created_at | data_origin |
+----+----------+---------------------+---------------+
| 3 | 04d2ec9a | 2025-08-11 13:58:07 | value_from_31 |
+----+----------+---------------------+---------------+
1 row in set (0.001 sec)
Or another possible outcome:
[18.242.143.32] ✅ Update Insert Success
[18.242.143.31] ❌ Update Insert Error: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
[18.242.143.30] ❌ Update Insert Error: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
MariaDB [test]> select * from tmp;
+----+----------+---------------------+---------------+
| id | data | created_at | data_origin |
+----+----------+---------------------+---------------+
| 3 | f0d31a50 | 2025-08-11 13:58:07 | value_from_32 |
+----+----------+---------------------+---------------+
1 row in set (0.001 sec)
- Although Galera supports concurrent writes on multiple nodes, concurrent operations on the same row are still constrained.
- Which node wins a same-row conflict is effectively random and depends on whose transaction is ordered first; in the end, however, every node applies the concurrent transactions in the negotiated global order, so all nodes remain consistent (a retry sketch follows).
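Because the losing transactions surface as error 1213 (deadlock), as seen in the output above, the usual application-side handling is simply to retry. Below is a minimal sketch of such a retry wrapper around the same upsert; the retry count and backoff are arbitrary choices, not part of the original demo.

```python
import time
import uuid
import pymysql

DB_USER, DB_PASS, DB_NAME = "root", "nuage#2345", "test"   # same credentials as the demo above

def write_with_retry(ip, data, data_origin, row_id, max_retries=3):
    """Retry the upsert when Galera certification rejects it with error 1213 (deadlock)."""
    for attempt in range(1, max_retries + 1):
        conn = None
        try:
            conn = pymysql.connect(host=ip, user=DB_USER, password=DB_PASS, database=DB_NAME)
            cursor = conn.cursor()
            cursor.execute("""
                INSERT INTO tmp (id, data, data_origin)
                VALUES (%s, %s, %s)
                ON DUPLICATE KEY UPDATE data = VALUES(data), data_origin = VALUES(data_origin);
            """, (row_id, data, data_origin))
            conn.commit()
            print(f"[{ip}] committed on attempt {attempt}")
            return True
        except pymysql.MySQLError as e:
            if e.args and e.args[0] == 1213:       # write set lost certification, try again
                print(f"[{ip}] conflict, retrying ({attempt}/{max_retries})")
                time.sleep(0.1 * attempt)          # small backoff before the next attempt
            else:
                raise
        finally:
            if conn:
                conn.close()
    return False

write_with_retry("18.242.143.30", str(uuid.uuid4())[:8], 'value_from_30', 3)
```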
Serial updates of the same row
Change the thread start from concurrent to serial:
thread1.start()
thread1.join()

thread2.start()
thread2.join()

thread3.start()
thread3.join()
- Output
[18.242.143.30] ✅ Update Insert Success
[18.242.143.31] ✅ Update Insert Success
[18.242.143.32] ✅ Update Insert Success
The same-row conflict no longer occurs.
Concurrent updates of different rows
Concurrent upserts, each targeting a different row:
DB_HOSTS = ["18.242.143.30", "18.242.143.31", "18.242.143.32"]

thread1 = threading.Thread(target=write_to_db, args=(DB_HOSTS[0], str(uuid.uuid4())[:8], 'value_from_30', 3))
thread2 = threading.Thread(target=write_to_db, args=(DB_HOSTS[1], str(uuid.uuid4())[:8], 'value_from_31', 4))
thread3 = threading.Thread(target=write_to_db, args=(DB_HOSTS[2], str(uuid.uuid4())[:8], 'value_from_32', 5))
- Output
[18.242.143.32] ✅ Update Insert Success
[18.242.143.31] ✅ Update Insert Success
[18.242.143.30] ✅ Update Insert Success
Building a three-node Galera cluster
- Example configuration file
[mysqld]
bind-address=18.242.143.30                    # listen address
log_bin=mysql-bin
binlog_format=ROW
default_storage_engine=InnoDB                 # Galera only supports the InnoDB engine
innodb_autoinc_lock_mode=2

wsrep_on=ON                                   # enable the Galera plugin
wsrep_cluster_name="galera-cluster"           # Galera cluster name
wsrep_cluster_address="gcomm://mariadb-1,mariadb-2,mariadb-3"   # Galera cluster address list
wsrep_provider=/usr/lib64/galera/libgalera_smm.so

wsrep_node_name="mariadb-1"                   # name of this node
wsrep_node_address="18.242.143.30"            # communication address of this node

wsrep_sst_method=mariabackup                  # SST (State Snapshot Transfer) method used here; no table locks, writes are not blocked
wsrep_sst_auth="sstuser:nuage#2345"           # SST credentials

wsrep_slave_threads=4
max_connections=10000
- On the other two nodes only bind-address, wsrep_node_name and wsrep_node_address need to be changed
- When the cluster starts for the very first time, run galera_new_cluster on any one node:
  - galera_new_cluster generates the cluster UUID; every node that joins later uses this ID
  - galera_new_cluster skips the cluster-membership check. During a normal start (systemctl start mariadb) the node tries to connect to the other nodes listed in wsrep_cluster_address; galera_new_cluster skips that step
  - The first node to start acts as the Donor node (data source). When the other nodes join, they trigger an SST (State Snapshot Transfer, full sync) or IST (Incremental State Transfer, incremental sync) from it
- Once the bootstrap node is up, simply run systemctl start mariadb on the remaining nodes (a quick verification sketch follows)
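Once all three nodes are running, the cluster state can be checked from any node. Below is a small sketch, reusing the root credentials from the demo above, that queries the wsrep status variables used throughout this article (wsrep_local_state_comment is the standard Galera per-node state; Synced means healthy):

```python
import pymysql

def galera_status(host, user="root", password="nuage#2345"):
    """Return the basic wsrep health indicators of one node."""
    conn = pymysql.connect(host=host, user=user, password=password)
    try:
        cur = conn.cursor()
        status = {}
        for var in ("wsrep_cluster_size", "wsrep_cluster_status", "wsrep_local_state_comment"):
            cur.execute("SHOW STATUS LIKE %s", (var,))
            row = cur.fetchone()
            status[var] = row[1] if row else None
        return status
    finally:
        conn.close()

for ip in ("18.242.143.30", "18.242.143.31", "18.242.143.32"):
    print(ip, galera_status(ip))
# a healthy 3-node cluster reports wsrep_cluster_size = 3, wsrep_cluster_status = Primary
# and wsrep_local_state_comment = Synced on every node
```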
Galera cluster size rules
MariaDB officially recommends that a Galera cluster consist of at least three nodes.
A Galera cluster follows the majority (quorum) rule:
- If the surviving nodes can form a majority (more than 50% of the original cluster size), the cluster keeps working normally
- If the surviving nodes cannot form a majority, the cluster stops accepting writes.
Suppose the cluster consists of two nodes:
- Normal: Node1 ↔ Node2 communicate normally
- Failure scenario: a network fault makes the two nodes lose contact with each other
  - Node1 believes Node2 is down
  - Node2 believes Node1 is down
Without the majority rule, each node would consider itself the only survivor and both would keep accepting writes, so the two sides would diverge, i.e. split brain. To avoid this as far as possible, Galera's quorum rule requires the remaining nodes to hold a majority of the votes in order to stay Primary and keep accepting writes; otherwise writes stop and the state switches to Non-Primary. The majority is computed as follows:
majority = floor(total_nodes / 2) + 1
| Cluster size | Minimum nodes online (majority) | Tolerated failures | Failure tolerance | Notes |
|---|---|---|---|---|
| 1 | 1 | 0 | 0% | Standalone node, not a cluster |
| 2 | 2 | 0 | 0% | If either node fails, the survivor loses the majority and the cluster becomes read-only (Non-Primary) |
| 3 | 2 | 1 | 33% | One node may fail; the remaining two can still write |
| 4 | 3 | 1 | 25% | Same as a 3-node cluster: only one node may fail |
| 5 | 3 | 2 | 40% | Two nodes may fail |
| 6 | 4 | 2 | 33% | Same as a 5-node cluster: only two nodes may fail |
The table shows a few facts:
- An odd-sized cluster tolerates the same number of failures as the even-sized cluster one node larger
- An odd-sized cluster has a higher failure-tolerance ratio than the even-sized cluster one node larger
To avoid wasting resources, the official recommendation is therefore to use an odd number of nodes, with three nodes as the minimum safe Galera configuration.
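The formula and the table above are easy to reproduce; a quick sketch:

```python
def quorum(total_nodes: int) -> dict:
    """majority = floor(total_nodes / 2) + 1, as required by Galera's quorum rule."""
    majority = total_nodes // 2 + 1
    tolerated = total_nodes - majority
    return {
        "nodes": total_nodes,
        "majority": majority,
        "tolerated_failures": tolerated,
        "tolerance": f"{tolerated / total_nodes:.0%}",
    }

for n in range(1, 7):
    print(quorum(n))
# e.g. quorum(3) -> majority 2, 1 tolerated failure (33%);
#      quorum(4) -> majority 3, still only 1 tolerated failure (25%)
```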
Scaling out a Galera cluster
Current cluster state:
MariaDB [test]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.001 sec)
Add a new MariaDB node to grow the cluster from three nodes to four. Its configuration:
[mysqld]
bind-address=18.242.143.33                    # listen address
log_bin=mysql-bin
binlog_format=ROW
default_storage_engine=InnoDB                 # Galera only supports the InnoDB engine
innodb_autoinc_lock_mode=2

wsrep_on=ON                                   # enable the Galera plugin
wsrep_cluster_name="galera-cluster"           # Galera cluster name
wsrep_cluster_address="gcomm://mariadb-1,mariadb-2,mariadb-3"   # Galera cluster address list
wsrep_provider=/usr/lib64/galera/libgalera_smm.so

wsrep_node_name="mariadb-4"                   # name of this node
wsrep_node_address="18.242.143.33"            # communication address of this node

wsrep_sst_method=mariabackup                  # SST (State Snapshot Transfer) method used here; no table locks, writes are not blocked
wsrep_sst_auth="sstuser:nuage#2345"           # SST credentials

wsrep_slave_threads=4
max_connections=10000
Run systemctl start mariadb on the new node. Key log lines from the new node:
# connect to the other nodes in the cluster
2025-08-12 14:25:25 0 [Note] WSREP: (21b01b3d-a887, 'tcp://0.0.0.0:4567') connection established to 3afca15d-bd74 tcp://18.242.143.31:4567
2025-08-12 14:25:25 0 [Note] WSREP: (21b01b3d-a887, 'tcp://0.0.0.0:4567') connection established to 3cfa0384-b2c4 tcp://18.242.143.30:4567
2025-08-12 14:25:25 0 [Note] WSREP: (21b01b3d-a887, 'tcp://0.0.0.0:4567') connection established to 134e4245-aba6 tcp://18.242.143.32:4567

# the cluster component is Primary, i.e. the majority rule is satisfied
2025-08-12 14:25:26 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 4

# SST completed successfully; the SST took about 14s
2025-08-12 14:25:26 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> ec75dcca-711c-11f0-9611-ef90f88adb5e:918
WSREP_SST: [INFO] Proceeding with SST (20250812 14:25:27.481)
2025-08-12 14:25:39 0 [Note] WSREP: 0.0 (mariadb-3): State transfer to 1.0 (mariadb-4) complete.
2025-08-12 14:25:40 3 [Note] WSREP: SST received

# IST catches up the transactions not yet synced
2025-08-12 14:25:40 0 [Note] WSREP: ####### IST current seqno initialized to 874
2025-08-12 14:25:40 0 [Note] WSREP: Receiving IST... 0.0% (0/45 events) complete.
2025-08-12 14:25:40 2 [Note] WSREP: IST received: ec75dcca-711c-11f0-9611-ef90f88adb5e:918

# the node has joined the cluster; from now on it takes part in quorum voting and serves reads and writes
2025-08-12 14:25:40 2 [Note] WSREP: Server status change joined -> synced
The new node has joined and the scale-out is complete:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 4 |
+--------------------+-------+
1 row in set (0.002 sec)
Graceful MariaDB node shutdown
Starting from the following Galera cluster state:
MariaDB [test]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.002 sec)

# on any node the cluster status is Primary, i.e. the node is readable and writable
MariaDB [test]> SHOW STATUS LIKE 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.002 sec)
Stop the nodes one by one with systemctl stop mariadb.
- Stop node 18.242.143.30
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 2 |
+--------------------+-------+
1 row in set (0.001 sec)

MariaDB [(none)]> show status like 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.001 sec)
- Stop node 18.242.143.31
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 1 |
+--------------------+-------+
1 row in set (0.001 sec)

MariaDB [(none)]>
MariaDB [(none)]> show status like 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.001 sec)
The nodes negotiate their departure one after another, so the cluster shrinks step by step until a single node remains. Because the cluster size itself has shrunk to one, that last node still satisfies the Galera majority rule and can continue to serve both reads and writes.
Abnormal MariaDB node exit
In a 3-node cluster, two MariaDB nodes exit abnormally, leaving the cluster in the following state:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 1 |
+--------------------+-------+
1 row in set (0.002 sec)

MariaDB [(none)]>
MariaDB [(none)]> show status like 'wsrep_cluster_status';
+----------------------+-------------+
| Variable_name | Value |
+----------------------+-------------+
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
1 row in set (0.002 sec)
Because the Galera nodes exited abnormally, the last surviving node still believes the cluster size is 3. Per the Galera majority rule:
- A 3-node cluster needs at least two nodes running
- A single node no longer constitutes a majority (>50%)
So the only remaining Galera node is blocked from writing and can only serve reads.
Recovering after a full cluster shutdown
First look at the file /var/lib/mysql/grastate.dat:
[root@mariadb-1 ~]# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    ec75dcca-711c-11f0-9611-ef90f88adb5e   # globally unique cluster identifier (Cluster UUID)
seqno:   -1                                     # sequence number of the last transaction this node committed
safe_to_bootstrap: 0                            # whether this node may safely bootstrap the cluster (i.e. start first); 0: bootstrap forbidden, 1: bootstrap allowed
A full cluster shutdown falls into two cases: either all nodes shut down in order, or all nodes crash.
Recovery after all nodes shut down in order
When all nodes shut down in order, each node writes its last transaction ID into seqno, and the last node to stop sets safe_to_bootstrap to 1. Recovery after a graceful shutdown:
- Find the node whose safe_to_bootstrap is 1 and run: galera_new_cluster
- If no node has safe_to_bootstrap = 1, take the node with the largest seqno, set its safe_to_bootstrap to 1, then run: galera_new_cluster
- Start the remaining nodes with systemctl start mariadb
Recovery after all nodes crash
When every node of the Galera cluster crashes, no node gets its grastate.dat updated, so no node is marked as safe to bootstrap. Simply restarting the service (systemctl restart mariadb) fails on every node, because at startup each node first tries to connect to the other Galera nodes, and at that point none of them is alive:
2025-08-12 17:34:47 0 [Note] WSREP: gcomm: connecting to group 'galera-cluster', peer 'mariadb-1:,mariadb-2:,mariadb-3:'
2025-08-12 17:34:47 0 [Note] WSREP: (d51c9e68-9e44, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://18.242.143.30:4567
2025-08-12 17:34:47 0 [Note] WSREP: Failed to establish connection: Connection refused
2025-08-12 17:34:47 0 [Note] WSREP: Failed to establish connection: Connection refused
Setting safe_to_bootstrap = 1 in grastate.dat on an arbitrary node and then running galera_new_cluster risks losing data.
This scenario therefore needs manual intervention:
- Run mariadbd --wsrep_recover --user=mysql once on every MariaDB node to obtain the last transaction ID the node committed before it stopped; the result appears in the MariaDB log file
  - One of the crashed nodes reports:
2025-08-20 11:38:01 0 [Note] Server socket created on IP: '18.242.143.30'.
2025-08-20 11:38:01 0 [Note] WSREP: Recovered position: 80d99892-7d76-11f0-9233-6b6b37ab0652:4
  - Another crashed node reports:
2025-08-20 13:44:13 0 [Note] Server socket created on IP: '18.242.143.32'.
2025-08-20 13:44:13 0 [Note] WSREP: Recovered position: 80d99892-7d76-11f0-9233-6b6b37ab0652:6
- Set safe_to_bootstrap to 1 in grastate.dat on the node with the largest recovered seqno
- Run galera_new_cluster on that node
- Once that node is up, start the other nodes one by one with systemctl start mariadb
These commands can of course be wrapped in a script to automate the recovery; a minimal sketch follows.
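A rough sketch of such a wrapper, assuming it can reach every node over SSH and that the recovered position shows up in the command output in the "Recovered position: <uuid>:<seqno>" form seen above (in practice the position may land in the MariaDB error log instead, in which case the parsed text would come from that file; host list and SSH invocation are illustrative only):

```python
import re
import subprocess

NODES = ["18.242.143.30", "18.242.143.31", "18.242.143.32"]    # illustrative host list
POSITION_RE = re.compile(r"Recovered position:\s*([0-9a-f-]+):(-?\d+)")

def recovered_seqno(host):
    """Run wsrep recovery on one node and parse the last committed seqno from its output."""
    out = subprocess.run(
        ["ssh", host, "mariadbd --wsrep_recover --user=mysql 2>&1"],
        capture_output=True, text=True,
    ).stdout
    match = POSITION_RE.search(out)
    return int(match.group(2)) if match else None

seqnos = {host: recovered_seqno(host) for host in NODES}
bootstrap_node = max(seqnos, key=lambda h: seqnos[h] if seqnos[h] is not None else -1)
print("recovered positions:", seqnos)
print("bootstrap node:", bootstrap_node)
# on the bootstrap node: set safe_to_bootstrap: 1 in grastate.dat and run galera_new_cluster,
# then run 'systemctl start mariadb' on the remaining nodes
```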
Galera clusters in OpenStack
What is HAProxy
HAProxy (High Availability Proxy) is an open-source, high-performance TCP/HTTP load balancer.
| Feature | Description |
|---|---|
| High performance | Single-process, event-driven (I/O multiplexing); handles hundreds of thousands of requests per second |
| Multi-protocol | TCP (layer 4), HTTP (layer 7), SSL/TLS, WebSocket, gRPC |
| Health checks | Actively probes backend servers and automatically removes failed nodes |
| Load-balancing algorithms | Round Robin, Least Connections, source-IP hash, and more |
Nodes in the kolla haproxy-galera environment
| Node | Role |
|---|---|
| 18.242.143.10 | control-1 |
| 18.242.143.11 | control-2 |
| 18.242.143.12 | control-3 |
| 18.242.143.100 | floating IP, currently on control-3 |
kolla haproxy-galera architecture
In a kolla-deployed OpenStack, haproxy is enabled by default, which means haproxy proxies and load-balances the various OpenStack API requests. For example:
- haproxy's proxy configuration for neutron-server: cat /etc/kolla/haproxy/services.d/neutron-server.cfg. For the OpenStack core components haproxy uses HTTP proxy mode, and requests are by default distributed to the nodes round-robin
frontend neutron_server_front
    mode http
    http-request del-header X-Forwarded-Proto
    option httplog
    option forwardfor
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    bind 18.242.143.100:9696          # floating IP
    default_backend neutron_server_back

backend neutron_server_back
    mode http
    server control-60 18.242.143.10:9696 check inter 2000 rise 2 fall 5
    server control-61 18.242.143.11:9696 check inter 2000 rise 2 fall 5
    server control-62 18.242.143.12:9696 check inter 2000 rise 2 fall 5
- neutron-server's own database configuration
[database]
connection = mysql+pymysql://neutron:aYNsc0NKcGKDvhyYEQiiibGp18r6YhJRl9722sSv@18.242.143.100:3306/neutron   # connects to 18.242.143.100:3306
connection_recycle_time = 10
max_pool_size = 1
max_retries = -1
- haproxy's proxy configuration for MariaDB
  - haproxy cannot parse SQL
  - haproxy proxies MariaDB in TCP mode
  - In kolla OpenStack, haproxy treats the other MariaDB nodes as backup servers and proxies all requests to the primary MariaDB node: an active/standby setup with no load balancing
  - In kolla OpenStack, haproxy health-checks port 4569, not MariaDB's default service port 3306
frontend mariadb_front
    mode tcp
    option clitcpka
    timeout client 3600s
    option tcplog
    bind 18.242.143.100:3306          # listens on the VIP 18.242.143.100
    default_backend mariadb_back

backend mariadb_back
    mode tcp                          # TCP proxying
    option srvtcpka
    timeout server 3600s              # idle backend connections are closed after 3600s
    option httpchk                    # HTTP health check
    server control-1 18.242.143.10:3306 check port 4569 inter 2000 rise 2 fall 5          # marked master by haproxy; checked every 2s, 2 successes = healthy, 5 failures = down
    server control-2 18.242.143.11:3306 check port 4569 inter 2000 rise 2 fall 5 backup   # marked backup by haproxy; checked every 2s, 2 successes = healthy, 5 failures = down
    server control-3 18.242.143.12:3306 check port 4569 inter 2000 rise 2 fall 5 backup   # marked backup by haproxy; checked every 2s, 2 successes = healthy, 5 failures = down
$ netstat -ntpl | grep 3306
tcp 0 0 18.242.143.100:3306 0.0.0.0:* LISTEN 513998/haproxy
tcp 0 0 18.242.143.10:3306 0.0.0.0:* LISTEN 3883762/mariadbd
- Example: a client queries VM instances through Horizon. For clarity, the two haproxy boxes in the figure are in fact the same process.
  - The client opens the dashboard via node-1's floating IP
  - Horizon sends a query to nova; the request goes to the haproxy on the primary node, which round-robins it to the nova on node-2
  - node-2's nova needs network information and queries neutron; the request again goes through the haproxy on the primary node, which round-robins it to the neutron on node-3
  - node-3's neutron queries the Galera cluster; the request goes through the haproxy on the primary node, which, per its configuration, always forwards it to the primary MariaDB node of the Galera cluster
  - node-1's MariaDB answers the query
HAProxy high availability
HAProxy health checks for MariaDB
haproxy performs its MariaDB health checks by talking to port 4569, which is served by the mariadb_clustercheck container.
# mariadb itself listens on ports 3306 and 4567
$ netstat -ntpl | grep mariadb
tcp 0 0 18.242.143.10:4567 0.0.0.0:* LISTEN 1136220/mariadbd
tcp 0 0 18.242.143.10:3306 0.0.0.0:* LISTEN 1136220/mariadbd
$
# kolla deploys both the mariadb and the mariadb_clustercheck containers
$ docker ps | grep mariadb
c0f9b3e0876a quay.io/openstack.kolla/mariadb-server:2024.1-rocky-9 "dumb-init -- kolla_…" 3 days ago Up 3 days (healthy) mariadb
8c038cb7f60e quay.io/openstack.kolla/mariadb-clustercheck:2024.1-rocky-9 "dumb-init --single-…" 5 weeks ago Up 3 days mariadb_clustercheck
$
$ netstat -ntpl | grep 4569
tcp 0 0 18.242.143.10:4569 0.0.0.0:* LISTEN 1136238/socat
$
# port 4569 is served by mariadb_clustercheck
$ docker exec -it mariadb_clustercheck lsof -i :4569 -P
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
socat 7 root 6u IPv4 183999368 0t0 TCP control-1:4569 (LISTEN)
$
A synced state means the node can serve reads and writes normally, i.e. it is a healthy MariaDB node.
When a MariaDB node goes offline, haproxy receives not synced, meaning the node is unhealthy, and stops routing traffic to it (a scripted probe follows the curl output below).
$ curl --noproxy "*" -v http://18.242.143.10:4569
* Trying 18.242.143.10:4569...
* Connected to 18.242.143.10 (18.242.143.10) port 4569 (#0)
> GET / HTTP/1.1
> Host: 18.242.143.10:4569
> User-Agent: curl/7.76.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Connection: close
< Content-Length: 32
<
Galera cluster node is not synced.
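The same check can be scripted; here is a minimal probe equivalent to the curl above, which simply reports the HTTP status and the body text returned by the clustercheck endpoint:

```python
import urllib.request

def clustercheck(host, port=4569, timeout=2):
    """Query the mariadb_clustercheck endpoint the same way the curl example does."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            return resp.status, resp.read().decode().strip()
    except Exception as exc:                     # connection refused, timeout, HTTP error ...
        return None, str(exc)

for node in ("18.242.143.10", "18.242.143.11", "18.242.143.12"):
    print(node, clustercheck(node))
# a healthy node reports "Galera cluster node is synced."
```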
haproxy also provides a web status page. When the Galera cluster is healthy, the haproxy status page looks like this:

(figure: haproxy status page with all Galera backends up)
When a MariaDB backup node is offline, the haproxy status page looks like this:

(figure: haproxy status page with one MariaDB backup node down)
If it is the MariaDB master node that goes offline, haproxy promotes one of the backup servers to be the new master.
keepalived keeps HAProxy highly available
The keepalived configuration:
$ cat keepalived/keepalived.conf
vrrp_script check_alive {
    script "/check_alive.sh"
    interval 2
    fall 2
    rise 10
}

vrrp_instance kolla_internal_vip_51 {
    state BACKUP
    nopreempt
    interface ens3
    virtual_router_id 51
    priority 1
    advert_int 1
    virtual_ipaddress {
        18.242.143.100 dev ens3
    }
    authentication {
        auth_type PASS
        auth_pass gc7dzrgn3kjpRLLVK78aqcgR6aJNeakZzLajGPIz
    }
    track_script {
        check_alive
    }
}
check_alive.sh runs every check* script it finds under the /checks directory. check_alive_haproxy.sh uses socat to query haproxy's state; /var/lib/kolla/haproxy/haproxy.sock is the unix socket haproxy listens on, used to return haproxy status information.
$ cat keepalived/checks/check_alive_haproxy.sh
#!/bin/bash

# This will return 0 when it successfully talks to the haproxy daemon via the socket
# Failures return 1

echo "show info" | socat unix-connect:/var/lib/kolla/haproxy/haproxy.sock stdio > /dev/null
When the haproxy on the node holding the floating IP goes down, the floating IP moves to another node of the cluster, and queries against the Galera cluster are not affected.
HAProxy's limitations
haproxy can load-balance traffic towards a Galera cluster, but:
- In active/standby mode all read and write traffic goes to the primary node, which may become overloaded
- In a multi-master configuration read/write traffic is spread by the load-balancing algorithm, but concurrent writes on multiple nodes may cause Galera write conflicts
So, to avoid conflicts, kolla by default configures haproxy in active/standby mode in front of the Galera cluster.
haproxy proxies layer-4 traffic and cannot parse SQL. To take read load off the primary node, scale reads horizontally, and still avoid multi-node write conflicts, a SQL-level load-balancing solution is needed.
A Galera cluster behind ProxySQL
What is ProxySQL
ProxySQL is a high-performance MySQL/MariaDB middle-layer proxy that forwards SQL traffic and provides read/write splitting and load balancing. The figure below illustrates read/write splitting of SQL traffic for a MariaDB primary/replica cluster.
| Feature | Description |
|---|---|
| Connection pooling | Applications connect to ProxySQL, which maintains connection pools towards the databases and reduces connection overhead |
| Read/write splitting | Matches SQL statements (SELECT, INSERT, UPDATE) with regex or digest (SQL fingerprint) rules and routes them to different nodes |
| Load balancing | Distributes requests across multiple backend nodes by weight or policy |
| High availability and automatic failover | Built-in health checks for the backend MariaDB nodes; offline nodes are evicted automatically |
kolla supports deploying ProxySQL, but the ProxySQL configuration is fairly involved.
Nodes in the kolla proxysql-galera environment
| Node | Role |
|---|---|
| 18.242.143.60 | control-1 |
| 18.242.143.61 | control-2 |
| 18.242.143.62 | control-3 |
| 18.242.143.69 | floating IP, currently on control-3 |
kolla proxysql-galera architecture
proxysql-galera configuration
In a kolla-deployed OpenStack, proxysql is disabled by default. After enabling it, the configuration looks like this:
- neutron-server's own database configuration
[database]
connection = mysql+pymysql://neutron:aYNsc0NKcGKDvhyYEQiiibGp18r6YhJRl9722sSv@18.242.143.69:3306/neutron
connection_recycle_time = 10
max_pool_size = 1
max_retries = -1
Port 3306 is no longer proxied by haproxy, but by proxysql.
$ netstat -ntpl | grep 3306
tcp 0 0 18.242.143.62:3306 0.0.0.0:* LISTEN 267014/mariadbd
tcp 0 0 18.242.143.69:3306 0.0.0.0:* LISTEN 3809645/proxysql
- ProxySQL user configuration
  - Every OpenStack component that needs database access is defined as a user

$ cat proxysql/users/neutron.yaml
mysql_users:
  - username: "neutron"
    password: "aYNsc0NKcGKDvhyYEQiiibGp18r6YhJRl9722sSv"   # matches the database password configured in neutron-server
    transaction_persistent: 1
    active: 1

$ cat proxysql/users/cinder.yaml
mysql_users:
  - username: "cinder"
    password: "12Nwlivw0rV7QmJPoOXKdwI5gC66B3c7vKuB6poH"   # matches the database password configured in cinder
    transaction_persistent: 1
    active: 1
......
......
......

  - Only users configured this way are allowed to access the backend databases through proxysql; everyone else is rejected. For example:

Database error: (1045, "ProxySQL Error: Access denied for user 'root')

- The user configuration files:

$ ll -lh proxysql/users/
-rw-rw----. 1 root root 719 Jul 23 10:09 cinder.yaml
-rw-rw----. 1 root root 719 Jul 23 10:09 glance.yaml
-rw-rw----. 1 root root 720 Jul 23 10:09 gnocchi.yaml
-rw-rw----. 1 root root 717 Jul 23 10:10 heat.yaml
-rw-rw----. 1 root root 720 Jul 23 10:10 horizon.yaml
-rw-rw----. 1 root root 721 Jul 23 10:10 keystone.yaml
-rw-rw----. 1 root root 719 Jul 23 10:10 manila.yaml
-rw-rw----. 1 root root 750 Jul 23 10:10 mariadb.yaml
-rw-rw----. 1 root root 720 Jul 23 10:11 neutron.yaml
-rw-rw----. 1 root root 717 Jul 23 10:11 nova-cell.yaml
-rw-rw----. 1 root root 843 Jul 23 10:11 nova.yaml
-rw-rw----. 1 root root 722 Jul 23 10:11 placement.yaml
The user configuration in effect inside proxysql:
MySQL [(none)]> SELECT * FROM mysql_users;
+--------------+------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
| username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections | attributes | comment |
+--------------+------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
| cinder | 12Nwlivw0rV7QmJPoOXKdwI5gC66B3c7vKuB6poH | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| glance | XNYstM1eU6drqnDU0tfTK5kObT6njjrISzrD6vM4 | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| gnocchi | 4dbCpgaw25qfyPbNgPP1n3eNx3U0k0XjvcpFFQ46 | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| heat | G0C78zEMgL2aDJZAqghAilNyN8UCibJa2xfzH8Cx | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| horizon | nVggeoWMlN94QKFpEqWKY6dHG11pE9uWTwUwBLY0 | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| keystone | wtwjQKmVrsRiUpOub17idW6vWiuWGPR5ghu0QYWI | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| manila | Cqb01BzrtDftsz9tScztXGN56ksTleBdSZr6ckBn | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| root_shard_0 | Msc7WhEeCLQSAstDV4IbfV8644TffapkRrcRbJ9V | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| neutron | aYNsc0NKcGKDvhyYEQiiibGp18r6YhJRl9722sSv | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| placement | kkboyBeKF3mMUCtB8Iv5rwdsvg3ANAAq8Ah165Ep | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| nova | db8uFbJtZ8QUODB1l6bV2JWUHpELHi2RPLfUuuTe | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
| nova_api | o8zpPMvglwzFkCEsZ8Q7qVXr1WSVCoTqHxg1uvzo | 1 | 0 | 0 | | 0 | 1 | 0 | 1 | 1 | 10000 | | |
+--------------+------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
12 rows in set (0.001 sec)
- proxysql.yaml main configuration
  - hostgroup 0 is the writer group; hostgroup 1 is the backup writer group; hostgroup 2 is the reader group; hostgroup 3 is the offline group
  - writer_is_also_reader: 0 means the writer node does not serve reads
$ cat proxysql/proxysql.yaml
datadir: "/var/lib/proxysql"
errorlog: "/var/log/kolla/proxysql/proxysql.log"

admin_variables:
  admin_credentials: "kolla-admin:Do65z48HMsERpGWCec1Iv3Wpwjq4jfNR35l0HTAr"
  mysql_ifaces: "18.242.143.61:6032;18.242.143.69:6032;/var/lib/kolla/proxysql/admin.sock"
  stats_credentials: "kolla-stats:KtLqNzv1Rrsahz0XIdxxbSOjzLdH62soihO5T86i"

mysql_variables:
  threads: 5
  max_connections: 40000
  interfaces: "18.242.143.69:3306"        # address proxysql listens on
  monitor_username: "monitor"
  monitor_password: "ITlG70FfcqE14RFjj14kv0PfoW5UfDguiuysncfK"

mysql_servers:
  - address: "18.242.143.62"
    port: 3306
    hostgroup: 0            # writer group
    max_connections: 10000
    max_replication_lag: 0
    weight: 100
  - address: "18.242.143.60"
    port: 3306
    hostgroup: 1            # backup writer group
    max_connections: 10000
    max_replication_lag: 0
    weight: 100
  - address: "18.242.143.61"
    port: 3306
    hostgroup: 1            # backup writer group
    max_connections: 10000
    max_replication_lag: 0
    weight: 100
  - address: "18.242.143.60"
    port: 3306
    hostgroup: 2            # reader group
    max_connections: 10000
    max_replication_lag: 0
    weight: 100
  - address: "18.242.143.61"
    port: 3306
    hostgroup: 2            # reader group
    max_connections: 10000
    max_replication_lag: 0
    weight: 100

mysql_galera_hostgroups:
  - writer_hostgroup: 0             # writer group ID
    backup_writer_hostgroup: 1      # backup writer group ID
    reader_hostgroup: 2             # reader group ID
    offline_hostgroup: 3            # offline group ID
    max_connections: 10000
    max_writers: 1
    writer_is_also_reader: 0        # the writer does not serve reads
    comment: "Galera cluster for shard 0"
- proxysql query-rule configuration
$ cat proxysql/rules.bak/neutron.yaml
mysql_query_rules:
  - schemaname: "neutron"
    destination_hostgroup: 0
    apply: 1
    active: 1

$ cat proxysql/rules.bak/cinder.yaml
mysql_query_rules:
  - schemaname: "cinder"
    destination_hostgroup: 0
    apply: 1
    active: 1
......
......
......
How the default proxysql-galera configuration works
With kolla's default configuration, all read and write SQL traffic from every component is sent to the single node of the hostgroup: 0 writer group. This is essentially no different from haproxy: all traffic still ends up on the Galera primary node.
- proxysql runtime server information
  - Only MariaDB node 18.242.143.62 is in the writer group, with status ONLINE
  - The SHUNNED status means a node is not eligible to receive requests.
MySQL [(none)]> SELECT * FROM mysql_servers;
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
5 rows in set (0.002 sec)

MySQL [(none)]>
MySQL [(none)]> SELECT * FROM runtime_mysql_servers;
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.60 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.61 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
5 rows in set (0.003 sec)
- proxysql runtime rules
  - match_digest (digest match, parameter values ignored) and match_pattern (full-text match, parameters included) are both NULL
  - destination_hostgroup is 0 for every rule, so requests for every schemaname are proxied to the writer group, i.e. the ONLINE node 18.242.143.62
MySQL [(none)]> SELECT rule_id, schemaname, match_digest, match_pattern, destination_hostgroup, apply, active, comment FROM runtime_mysql_query_rules;
+---------+------------+--------------+---------------+-----------------------+-------+--------+---------+
| rule_id | schemaname | match_digest | match_pattern | destination_hostgroup | apply | active | comment |
+---------+------------+--------------+---------------+-----------------------+-------+--------+---------+
| 1 | gnocchi | NULL | NULL | 0 | 1 | 1 | NULL |
| 2 | heat | NULL | NULL | 0 | 1 | 1 | NULL |
| 3 | nova | NULL | NULL | 0 | 1 | 1 | NULL |
| 4 | nova_cell0 | NULL | NULL | 0 | 1 | 1 | NULL |
| 5 | nova_api | NULL | NULL | 0 | 1 | 1 | NULL |
| 6 | horizon | NULL | NULL | 0 | 1 | 1 | NULL |
| 7 | keystone | NULL | NULL | 0 | 1 | 1 | NULL |
| 8 | neutron | NULL | NULL | 0 | 1 | 1 | NULL |
| 9 | cinder | NULL | NULL | 0 | 1 | 1 | NULL |
| 10 | glance | NULL | NULL | 0 | 1 | 1 | NULL |
| 11 | manila | NULL | NULL | 0 | 1 | 1 | NULL |
| 12 | placement | NULL | NULL | 0 | 1 | 1 | NULL |
+---------+------------+--------------+---------------+-----------------------+-------+--------+---------+
12 rows in set (0.002 sec)
- Confirming the SQL routing
All SQL statements, reads and writes alike, are routed to the hostgroup = 0 node 18.242.143.62:
MySQL [(none)]> SELECT hostgroup, username, count_star, SUBSTRING(digest_text, 1, 180) AS digest_text_short FROM stats_mysql_query_digest;
+-----------+-----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| hostgroup | username | count_star | digest_text_short |
+-----------+-----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0 | manila | 8 | UPDATE reservations SET updated_at=reservations.updated_at,deleted_at=?,deleted=reservations.id WHERE reservations.deleted = ? AND reservations.expire < ? |
| 0 | placement | 16 | SELECT rp.id,rp.uuid,rp.name,rp.generation,rp.updated_at,rp.created_at,root_rp.uuid AS root_provider_uuid,parent_rp.uuid AS parent_provider_uuid FROM resource_providers AS rp INNER |
| 0 | placement | 16 | SELECT me.id,me.uuid,parent.id AS parent_id,parent.uuid AS parent_uuid,root.id AS root_id,root.uuid AS root_uuid FROM resource_providers AS me INNER JOIN resource_providers AS root |
| 0 | nova | 16 | DELETE FROM console_auth_tokens WHERE console_auth_tokens.expires <= ? |
| 0 | placement | 16 | SELECT placement_aggregates.id,placement_aggregates.uuid FROM placement_aggregates INNER JOIN resource_provider_aggregates ON placement_aggregates.id = resource_provider_aggregates | | |
| 0 | neutron | 796 | SELECT agents.id AS agents_id,agents.agent_type AS agents_agent_type,agents.`binary` AS agents_binary,agents.topic AS agents_topic,agents.host AS agents_host,agents.availability_zo |
| 0 | neutron | 130 | SELECT agents.id AS agents_id,agents.agent_type AS agents_agent_type,agents.`binary` AS agents_binary,agents.topic AS agents_topic,agents.host AS agents_host,agents.availability_zo |
| 0 | neutron | 130 | SELECT networkdhcpagentbindings.network_id AS networkdhcpagentbindings_network_id,networkdhcpagentbindings.dhcp_agent_id AS networkdhcpagentbindings_dhcp_agent_id,networkdhcpagentb | |
| 0 | neutron | 54 | SELECT subnetroutes.destination AS subnetroutes_destination,subnetroutes.nexthop AS subnetroutes_nexthop,subnetroutes.subnet_id AS subnetroutes_subnet_id,subnets_1.id AS subnets_1_ |
| 0 | neutron | 54 | SELECT dnsnameservers.address AS dnsnameservers_address,dnsnameservers.subnet_id AS dnsnameservers_subnet_id,dnsnameservers.`order` AS dnsnameservers_order,subnets_1.id AS subnets_ |
| 0 | neutron | 54 | SELECT tags.standard_attr_id AS tags_standard_attr_id,tags.tag AS tags_tag,standardattributes_1.id AS standardattributes_1_id FROM (SELECT networks.id AS networks_id FROM networks |
| 0 | neutron | 54 | SELECT standardattributes.id AS standardattributes_id,standardattributes.resource_type AS standardattributes_resource_type,standardattributes.description AS standardattributes_desc |
| 0 | neutron | 54 | SELECT subnet_service_types.subnet_id AS subnet_service_types_subnet_id,subnet_service_types.service_type AS subnet_service_types_service_type,subnets_1.id AS subnets_1_id FROM (SE |
| 0 | neutron | 54 | SELECT subnets.project_id AS subnets_project_id,subnets.id AS subnets_id,subnets.name AS subnets_name,subnets.network_id AS subnets_network_id,subnets.segment_id AS subnets_segment |
| 0 | neutron | 54 | SELECT segmenthostmappings.segment_id AS segmenthostmappings_segment_id,segmenthostmappings.host AS segmenthostmappings_host,networksegments_1.id AS networksegments_1_id FROM (SELE |
| 0 | neutron | 54 | SELECT networksegments.id AS networksegments_id,networksegments.network_id AS networksegments_network_id,networksegments.network_type AS networksegments_network_type,networksegment |
| 0 | neutron | 54 | SELECT agents.id AS agents_id,agents.agent_type AS agents_agent_type,agents.`binary` AS agents_binary,agents.topic AS agents_topic,agents.host AS agents_host,agents.availability_zo |
| 0 | neutron | 27 | SELECT standardattributes.id AS standardattributes_id,standardattributes.resource_type AS standardattributes_resource_type,standardattributes.description AS standardattributes_desc |
| 0 | neutron | 27 | SELECT extradhcpopts.id AS extradhcpopts_id,extradhcpopts.port_id AS extradhcpopts_port_id,extradhcpopts.opt_name AS extradhcpopts_opt_name,extradhcpopts.opt_value AS extradhcpopts |
| 0 | neutron | 27 | SELECT ml2_port_binding_levels.port_id AS ml2_port_binding_levels_port_id,ml2_port_binding_levels.host AS ml2_port_binding_levels_host,ml2_port_binding_levels.level AS ml2_port_bin |
| 0 | keystone | 9 | SELECT federated_user.id AS federated_user_id,federated_user.user_id AS federated_user_user_id,federated_user.idp_id AS federated_user_idp_id,federated_user.protocol_id AS federate |
| 0 | neutron | 27 | SELECT subnetroutes.destination AS subnetroutes_destination,subnetroutes.nexthop AS subnetroutes_nexthop,subnetroutes.subnet_id AS subnetroutes_subnet_id,anon_1.subnets_id AS anon_ |
| 0 | keystone | 24 | SELECT project_option.project_id AS project_option_project_id,project_option.option_id AS project_option_option_id,project_option.option_value AS project_option_option_value,anon_1 |
| 0 | keystone | 24 | SELECT project_tag.project_id AS project_tag_project_id,project_tag.name AS project_tag_name,anon_1.project_id AS anon_1_project_id FROM (SELECT project.id AS project_id FROM proje |
| 0 | keystone | 37 | SELECT revocation_event.id AS revocation_event_id,revocation_event.domain_id AS revocation_event_domain_id,revocation_event.project_id AS revocation_event_project_id,revocation_eve |
| 0 | keystone | 175 | SELECT ? |
......
......
......
proxysql-galera request path
In an OpenStack proxysql-galera environment, application access to the Galera cluster becomes the following:
- Compared with haproxy-galera, only step ⑦ changes: the components now talk to proxysql
- All read and write requests are still dispatched to the Galera primary node mariadb-1, so there is no read/write splitting; the effect is essentially the same as with haproxy
The root cause is Galera's default read_only = OFF combined with proxysql's writer_is_also_reader = 0:
- By default every Galera node is both readable and writable; the read-only flag on every Galera node defaults to read_only = OFF
- Although proxysql configures 18.242.143.60 and 18.242.143.61 in the reader group, its runtime_mysql_servers table contains no reader node at all. When proxysql negotiates with the backend MariaDB nodes, a node only enters the reader group if it has read_only = ON; since Galera defaults every node to read_only = OFF, no node is admitted to the reader group
- proxysql's writer_is_also_reader = 0 says the MariaDB writer node must not serve reads, but for fault tolerance proxysql still falls back to sending all read requests to the Galera primary node (the read_only flags can be confirmed with the sketch below)
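A quick check of the read_only flag on each Galera node confirms this. The sketch below assumes direct root access to the nodes; the password is taken from the standalone demo earlier and would differ in a real kolla environment:

```python
import pymysql

for host in ("18.242.143.60", "18.242.143.61", "18.242.143.62"):
    conn = pymysql.connect(host=host, user="root", password="nuage#2345")
    try:
        cur = conn.cursor()
        cur.execute("SHOW GLOBAL VARIABLES LIKE 'read_only'")
        # every node prints ('read_only', 'OFF') by default, which is why
        # proxysql never fills the reader hostgroup
        print(host, cur.fetchone())
    finally:
        conn.close()
```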
proxysql-galera semi read/write splitting (single writer, all nodes read)
This requires writer_is_also_reader = 1 together with matching SQL routing rules.
- With the writer also acting as a reader, proxysql no longer checks the MariaDB read_only flag, so every node of the Galera cluster is added to the reader group hostgroup: 2
- proxysql still configures only one writer, so only the primary node 18.242.143.62 is in the writer group
MySQL [(none)]> SELECT * FROM runtime_mysql_servers;
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.60 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.61 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
8 rows in set (0.003 sec)
- proxysql is configured with the corresponding SQL routing rules
Read requests from any component are routed to the hostgroup = 2 reader nodes, write requests to the hostgroup = 0 writer node (a sketch of pushing such rules through the ProxySQL admin interface follows the table).
MySQL [(none)]> SELECT rule_id, schemaname, match_digest, match_pattern, destination_hostgroup, apply, active, comment FROM runtime_mysql_query_rules;
+---------+------------+------------------------------+---------------+-----------------------+-------+--------+----------------------------------+
| rule_id | schemaname | match_digest | match_pattern | destination_hostgroup | apply | active | comment |
+---------+------------+------------------------------+---------------+-----------------------+-------+--------+----------------------------------+
| 13000 | cinder | ^INSERT | NULL | 0 | 1 | 1 | cinder write INSERT |
| 13001 | cinder | ^UPDATE | NULL | 0 | 1 | 1 | cinder write UPDATE |
| 13002 | cinder | ^DELETE | NULL | 0 | 1 | 1 | cinder write DELETE |
| 13004 | cinder | ^SELECT | NULL | 2 | 1 | 1 | cinder read SELECT |
| 13070 | neutron | ^INSERT | NULL | 0 | 1 | 1 | neutron write INSERT |
| 13071 | neutron | ^UPDATE | NULL | 0 | 1 | 1 | neutron write UPDATE |
| 13072 | neutron | ^DELETE | NULL | 0 | 1 | 1 | neutron write DELETE |
| 13074 | neutron | ^SELECT | NULL | 2 | 1 | 1 | neutron read SELECT |
| 13080 | nova | ^INSERT | NULL | 0 | 1 | 1 | nova write INSERT |
| 13081 | nova | ^UPDATE | NULL | 0 | 1 | 1 | nova write UPDATE |
| 13082 | nova | ^DELETE | NULL | 0 | 1 | 1 | nova write DELETE |
| 13084 | nova | ^SELECT | NULL | 2 | 1 | 1 | nova read SELECT |
......
......
......
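Rules like the ones above live in ProxySQL's mysql_query_rules table. For reference, this is roughly how one read/write rule pair could be pushed through the ProxySQL admin interface (port 6032, using the kolla-admin credentials from proxysql.yaml); in a kolla deployment the rules are actually rendered from the yaml files at deploy time, so this is only an illustration:

```python
import pymysql

# ProxySQL admin interface, see mysql_ifaces in proxysql.yaml
admin = pymysql.connect(host="18.242.143.69", port=6032,
                        user="kolla-admin",
                        password="Do65z48HMsERpGWCec1Iv3Wpwjq4jfNR35l0HTAr")
cur = admin.cursor()

# neutron writes -> writer hostgroup 0, neutron reads -> reader hostgroup 2
cur.execute("INSERT INTO mysql_query_rules "
            "(rule_id, active, schemaname, match_digest, destination_hostgroup, apply) "
            "VALUES (13070, 1, 'neutron', '^INSERT', 0, 1)")
cur.execute("INSERT INTO mysql_query_rules "
            "(rule_id, active, schemaname, match_digest, destination_hostgroup, apply) "
            "VALUES (13074, 1, 'neutron', '^SELECT', 2, 1)")

# activate the new rules and persist them
cur.execute("LOAD MYSQL QUERY RULES TO RUNTIME")
cur.execute("SAVE MYSQL QUERY RULES TO DISK")
admin.close()
```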
- Confirming the SQL routing
MySQL [(none)]> SELECT hostgroup, username, count_star, SUBSTRING(digest_text, 1, 180) AS digest_text_short FROM stats_mysql_query_digest;
+-----------+-----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| hostgroup | username | count_star | digest_text_short |
+-----------+-----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0 | manila | 3 | UPDATE reservations SET updated_at=reservations.updated_at,deleted_at=?,deleted=reservations.id WHERE reservations.deleted = ? AND reservations.expire < ? |
| 2 | manila | 3 | SELECT reservations.created_at AS reservations_created_at,reservations.updated_at AS reservations_updated_at,reservations.deleted_at AS reservations_deleted_at,reservations.deleted |
| 2 | keystone | 3 | SELECT project_endpoint_group.endpoint_group_id AS project_endpoint_group_endpoint_group_id,project_endpoint_group.project_id AS project_endpoint_group_project_id FROM project_endp |
| 2 | keystone | 3 | SELECT assignment.type AS assignment_type,assignment.actor_id AS assignment_actor_id,assignment.target_id AS assignment_target_id,assignment.role_id AS assignment_role_id,assignmen |
| 2 | keystone | 59 | SELECT ? |
| 2 | nova | 6 | SELECT migrations.created_at AS migrations_created_at,migrations.updated_at AS migrations_updated_at,migrations.deleted_at AS migrations_deleted_at,migrations.deleted AS migrations |
| 2 | neutron | 10 | SELECT tags.standard_attr_id AS tags_standard_attr_id,tags.tag AS tags_tag,anon_1.standardattributes_id AS anon_1_standardattributes_id FROM (SELECT standardattributes.id AS standa |
| 2 | neutron | 10 | SELECT standardattributes.id AS standardattributes_id,standardattributes.resource_type AS standardattributes_resource_type,standardattributes.description AS standardattributes_desc |
| 2 | neutron | 10 | SELECT dnsnameservers.address AS dnsnameservers_address,dnsnameservers.subnet_id AS dnsnameservers_subnet_id,dnsnameservers.`order` AS dnsnameservers_order,anon_1.subnets_id AS ano |
| 2 | keystone | 3 | SELECT project_endpoint.endpoint_id AS project_endpoint_endpoint_id,project_endpoint.project_id AS project_endpoint_project_id FROM project_endpoint WHERE project_endpoint.project_ |
| 2 | neutron | 10 | SELECT subnet_service_types.subnet_id AS subnet_service_types_subnet_id,subnet_service_types.service_type AS subnet_service_types_service_type,anon_1.subnets_id AS anon_1_subnets_i |
| 2 | neutron | 10 | SELECT subnets.project_id AS subnets_project_id,subnets.id AS subnets_id,subnets.name AS subnets_name,subnets.network_id AS subnets_network_id,subnets.segment_id AS subnets_segment |
| 2 | nova | 11 | SELECT block_device_mapping.created_at AS block_device_mapping_created_at,block_device_mapping.updated_at AS block_device_mapping_updated_at,block_device_mapping.deleted_at AS bloc |
| 2 | keystone | 14 | SELECT implied_role.prior_role_id AS implied_role_prior_role_id,implied_role.implied_role_id AS implied_role_implied_role_id FROM implied_role WHERE implied_role.prior_role_id = ? |
| 0 | nova | 418 | UPDATE services SET updated_at=?,report_count=?,last_seen_up=? WHERE services.id = ? |
| 0 | cinder | 582 | UPDATE services SET updated_at=?,report_count=? WHERE services.deleted = false AND services.id = ? |
- Double-checking the read/write routing through the query log
A sample of the SELECT log entries shows that read requests are dispatched to any MariaDB node (the server IP in the entries below), because in this configuration the single writer node serves reads as well.
{"client":"18.242.143.61:46704","digest":"0x1C46AE529DD5A40E","duration_us":2100,"endtime":"2025-08-07 11:13:08.563790","endtime_timestamp_us":1754536388563790,"event":"COM_QUERY","hostgroup_id":2,"query":"SELECT 1","rows_sent":1,"schemaname":"nova","server":"18.242.143.61:3306","starttime":"2025-08-07 11:13:08.561690","starttime_timestamp_us":1754536388561690,"thread_id":654,"username":"nova"}
{"client":"18.242.143.61:46704","digest":"0xCDFFE6B0A002A823","duration_us":684,"endtime":"2025-08-07 11:13:08.565868","endtime_timestamp_us":1754536388565868,"event":"COM_QUERY","hostgroup_id":2,"query":"SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.uuid AS services_uuid, services.host AS services_host, services.`binary` AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.disabled_reason AS services_disabled_reason, services.last_seen_up AS services_last_seen_up, services.forced_down AS services_forced_down, services.version AS services_version \nFROM services \nWHERE services.deleted = 0 AND services.id = 15 \n LIMIT 1","rows_sent":1,"schemaname":"nova","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:08.565184","starttime_timestamp_us":1754536388565184,"thread_id":654,"username":"nova"}
{"client":"18.242.143.61:46636","digest":"0x1C46AE529DD5A40E","duration_us":965,"endtime":"2025-08-07 11:13:08.907899","endtime_timestamp_us":1754536388907899,"event":"COM_QUERY","hostgroup_id":0,"query":"SELECT 1","rows_sent":1,"schemaname":"nova_cell0","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:08.906934","starttime_timestamp_us":1754536388906934,"thread_id":616,"username":"nova"}
{"client":"18.242.143.69:56760","digest":"0x1C46AE529DD5A40E","duration_us":846,"endtime":"2025-08-07 11:13:08.962562","endtime_timestamp_us":1754536388962562,"event":"COM_QUERY","hostgroup_id":2,"query":"SELECT 1","rows_sent":1,"schemaname":"nova","server":"18.242.143.60:3306","starttime":"2025-08-07 11:13:08.961716","starttime_timestamp_us":1754536388961716,"thread_id":617,"username":"nova"}
{"client":"18.242.143.69:56760","digest":"0xCDFFE6B0A002A823","duration_us":1178,"endtime":"2025-08-07 11:13:08.964812","endtime_timestamp_us":1754536388964812,"event":"COM_QUERY","hostgroup_id":2,"query":"SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.uuid AS services_uuid, services.host AS services_host, services.`binary` AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.disabled_reason AS services_disabled_reason, services.last_seen_up AS services_last_seen_up, services.forced_down AS services_forced_down, services.version AS services_version \nFROM services \nWHERE services.deleted = 0 AND services.id = 3 \n LIMIT 1","rows_sent":1,"schemaname":"nova","server":"18.242.143.61:3306","starttime":"2025-08-07 11:13:08.963634","starttime_timestamp_us":1754536388963634,"thread_id":617,"username":"nova"}
{"client":"18.242.143.69:60304","digest":"0x1C46AE529DD5A40E","duration_us":872,"endtime":"2025-08-07 11:13:09.026921","endtime_timestamp_us":1754536389026921,"event":"COM_QUERY","hostgroup_id":0,"query":"SELECT 1","rows_sent":1,"schemaname":"nova_cell0","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:09.026049","starttime_timestamp_us":1754536389026049,"thread_id":660,"username":"nova"}
{"client":"18.242.143.69:60304","digest":"0xCDFFE6B0A002A823","duration_us":1326,"endtime":"2025-08-07 11:13:09.029245","endtime_timestamp_us":1754536389029245,"event":"COM_QUERY","hostgroup_id":0,"query":"SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.uuid AS services_uuid, services.host AS services_host, services.`binary` AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.disabled_reason AS services_disabled_reason, services.last_seen_up AS services_last_seen_up, services.forced_down AS services_forced_down, services.version AS services_version \nFROM services \nWHERE services.deleted = 0 AND services.id = 15 \n LIMIT 1","rows_sent":1,"schemaname":"nova_cell0","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:09.027919","starttime_timestamp_us":1754536389027919,"thread_id":660,"username":"nova"}
{"client":"18.242.143.69:45330","digest":"0xA50E27CC84BD5D30","duration_us":2223,"endtime":"2025-08-07 11:13:12.083383","endtime_timestamp_us":1754536392083383,"event":"COM_QUERY","hostgroup_id":2,"query":"SELECT agents.id AS agents_id, agents.agent_type AS agents_agent_type, agents.`binary` AS agents_binary, agents.topic AS agents_topic, agents.host AS agents_host, agents.availability_zone AS agents_availability_zone, agents.admin_state_up AS agents_admin_state_up, agents.created_at AS agents_created_at, agents.started_at AS agents_started_at, agents.heartbeat_timestamp AS agents_heartbeat_timestamp, agents.description AS agents_description, agents.configurations AS agents_configurations, agents.resource_versions AS agents_resource_versions, agents.`load` AS agents_load, agents.resources_synced AS agents_resources_synced \nFROM agents \nWHERE agents.agent_type IN ('Open vSwitch agent') AND agents.host IN ('control-60')","rows_sent":1,"schemaname":"neutron","server":"18.242.143.60:3306","starttime":"2025-08-07 11:13:12.081160","starttime_timestamp_us":1754536392081160,"thread_id":666,"username":"neutron"}
A sample of the UPDATE log entries shows that all write requests go to the Galera primary node: "server":"18.242.143.62:3306"
{"client":"18.242.143.61:46704","digest":"0xF06575E8A673326E","duration_us":1096,"endtime":"2025-08-07 11:13:08.568875","endtime_timestamp_us":1754536388568875,"event":"COM_QUERY","hostgroup_id":0,"query":"UPDATE services SET updated_at='2025-08-07 03:13:08.725545', report_count=127825, last_seen_up='2025-08-07 03:13:08.725012' WHERE services.id = 15","rows_affected":1,"rows_sent":0,"schemaname":"nova","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:08.567779","starttime_timestamp_us":1754536388567779,"thread_id":654,"username":"nova"}
{"client":"18.242.143.61:46636","digest":"0xF06575E8A673326E","duration_us":713,"endtime":"2025-08-07 11:13:08.912593","endtime_timestamp_us":1754536388912593,"event":"COM_QUERY","hostgroup_id":0,"query":"UPDATE services SET updated_at='2025-08-07 03:13:09.069605', report_count=126300, last_seen_up='2025-08-07 03:13:09.069056' WHERE services.id = 27","rows_affected":1,"rows_sent":0,"schemaname":"nova_cell0","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:08.911880","starttime_timestamp_us":1754536388911880,"thread_id":616,"username":"nova"}
{"client":"18.242.143.69:56760","digest":"0xF06575E8A673326E","duration_us":883,"endtime":"2025-08-07 11:13:08.967613","endtime_timestamp_us":1754536388967613,"event":"COM_QUERY","hostgroup_id":0,"query":"UPDATE services SET updated_at='2025-08-07 03:13:08.966502', report_count=128990, last_seen_up='2025-08-07 03:13:08.965974' WHERE services.id = 3","rows_affected":1,"rows_sent":0,"schemaname":"nova","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:08.966730","starttime_timestamp_us":1754536388966730,"thread_id":617,"username":"nova"}
{"client":"18.242.143.69:60304","digest":"0xDC8E39D73596DD46","duration_us":568,"endtime":"2025-08-07 11:13:09.031580","endtime_timestamp_us":1754536389031580,"event":"COM_QUERY","hostgroup_id":0,"query":"UPDATE services SET updated_at='2025-08-07 03:13:09.030808', report_count=129003 WHERE services.id = 15","rows_affected":1,"rows_sent":0,"schemaname":"nova_cell0","server":"18.242.143.62:3306","starttime":"2025-08-07 11:13:09.031012","starttime_timestamp_us":1754536389031012,"thread_id":660,"username":"nova"}
- With semi read/write splitting, proxysql schedules traffic to the backend Galera cluster as shown below
  - All nodes share the read load, so the resources are fully used
  - Although Galera replication is nominally synchronous, under high concurrency there can still be write-synchronization delay, so some reads may return stale data
  - The SQL routing configuration is tedious: each application needs its own set of read/write routing rules
proxysql-galera full read/write splitting (single writer, multiple readers)
Unlike a MariaDB primary/replica cluster (where the replicas are naturally read_only = ON), Galera is not well supported by proxysql. To achieve full read/write splitting of SQL traffic on top of a Galera cluster you need to:
- set writer_is_also_reader = 0 in proxysql, so the primary node only handles writes
- set read_only = ON on the remaining nodes so that they act as readers
- introduce a new mechanism that sets the Galera read/write roles dynamically (proxysql can only observe a MariaDB node's role, it cannot set it, so an extra service has to adjust node roles based on monitoring). For example, when the writer node dies it automatically promotes one reader to writer, and as long as a writer exists it keeps the other nodes in the read-only role. A minimal sketch of such a role monitor follows this list.
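A very reduced sketch of what such a role-monitor service could look like. The credentials are hypothetical, the liveness check is naive, and a real service would also need fencing, retries and coordination between monitor instances:

```python
import time
import pymysql

NODES = ["18.242.143.60", "18.242.143.61", "18.242.143.62"]    # the Galera nodes
USER, PASSWORD = "monitor_admin", "secret"                     # hypothetical admin account

def is_alive(host):
    """Consider a node alive if it accepts a connection and answers SELECT 1."""
    try:
        conn = pymysql.connect(host=host, user=USER, password=PASSWORD, connect_timeout=2)
        conn.cursor().execute("SELECT 1")
        conn.close()
        return True
    except Exception:
        return False

def is_read_only(host):
    conn = pymysql.connect(host=host, user=USER, password=PASSWORD, connect_timeout=2)
    cur = conn.cursor()
    cur.execute("SHOW GLOBAL VARIABLES LIKE 'read_only'")
    value = cur.fetchone()[1]
    conn.close()
    return value == "ON"

def set_read_only(host, flag):
    conn = pymysql.connect(host=host, user=USER, password=PASSWORD, connect_timeout=2)
    conn.cursor().execute("SET GLOBAL read_only = %s" % ("ON" if flag else "OFF"))
    conn.close()

while True:
    alive = [n for n in NODES if is_alive(n)]
    writers = [n for n in alive if not is_read_only(n)]
    if not writers and alive:
        # the writer is gone: promote one reader; proxysql sees read_only = OFF
        # on that node and moves it into the writer hostgroup
        set_read_only(alive[0], False)
    elif len(writers) > 1:
        # keep a single writer: demote the extra writers back to read-only
        for n in writers[1:]:
            set_read_only(n, True)
    time.sleep(5)
```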
With the above in place, full read/write splitting is achieved and proxysql shows the following view of the Galera backends: the primary node in the writer group, the remaining nodes in the reader group.
MySQL [(none)]> SELECT * FROM runtime_mysql_servers;
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
3 rows in set (0.002 sec)
The traffic then flows as shown in the figure below.
The MariaDB role-monitoring service must be able to both observe and set node roles. For example, when it detects that the writer node mariadb-1 (18.242.143.62) is offline, it adjusts the role of one of the remaining reader nodes (it picks 18.242.143.61 as the new writer and flips that node's read_only from ON to OFF); proxysql then notices the role change on 18.242.143.61 and elects it as the new writer node.
MySQL [(none)]> SELECT * FROM runtime_mysql_servers;
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 2 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 3 | 18.242.143.62 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
3 rows in set (0.005 sec)
In short, proxysql is not particularly friendly to Galera clusters; full read/write splitting requires building your own MariaDB role-monitoring service.
Full read/write splitting is a better fit for MariaDB primary/replica clusters (where the replicas are naturally read_only = ON). For kolla proxysql-galera, semi read/write splitting is already the option with the smallest changes and the lowest cost.
ProxySQL high availability
ProxySQL health checks for MariaDB
Like haproxy, proxysql has built-in health checks for the backend MariaDB nodes.
For example, in kolla's default (non-split) proxysql configuration, while the primary MariaDB node is healthy proxysql shows the following view:
MySQL [(none)]> SELECT * FROM runtime_mysql_servers;
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.60 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.61 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
5 rows in set (0.003 sec)
After the primary node 18.242.143.62 goes offline it is moved to the offline group, and 18.242.143.61 becomes the new read/write node:
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 0 | 18.242.143.60 | 3306 | 0 | SHUNNED | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 0 | 18.242.143.61 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 1 | 18.242.143.60 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
| 3 | 18.242.143.62 | 3306 | 0 | ONLINE | 100 | 0 | 10000 | 0 | 0 | 0 | |
+--------------+---------------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
keepalived keeps ProxySQL highly available
Also like haproxy, once proxysql is enabled keepalived gets an additional health-check script for proxysql:
$ ll keepalived/checks/
total 8
-rwxrwx---. 1 root root 211 Jul 23 10:08 check_alive_haproxy.sh
-rwxrwx---. 1 root root 210 Jul 23 10:08 check_alive_proxysql.sh
$
$ cat keepalived/checks/check_alive_proxysql.sh
#!/bin/bash

# This will return 0 when it successfully talks to the ProxySQL daemon via localhost
# Failures return 1

echo "show info" | socat unix-connect:/var/lib/kolla/proxysql/admin.sock stdio > /dev/null
/var/lib/kolla/proxysql/admin.sock is the unix socket proxysql listens on. When the proxysql on the node holding the floating IP goes down, the floating IP moves to another node of the cluster, and queries against the Galera cluster continue unaffected.
Summary
- kolla clusters multiple MariaDB nodes with Galera to get Galera's multi-node read/write capability, synchronous replication and strong consistency; but to avoid write conflicts, kolla by default sends all OpenStack application writes to a single node, whether haproxy or proxysql is in front
- A Galera cluster is not "the bigger the better": more nodes mean more consistency overhead; 3 to 5 nodes is usually a good size
- haproxy can only proxy a Galera cluster at layer 4
- proxysql can route SQL, but its Galera support is incomplete. Full SQL read/write splitting requires an additional mechanism that sets MariaDB node roles automatically, so that when the writer node goes offline proxysql can elect a new writer and keep the Galera cluster highly available