并行创建 AO/AOCO 表与刷新物化视图(引入自 v1.5.0 版本)
自 v1.5.0 起,Apache Cloudberry 支持使用 CREATE TABLE AS
语句并行创建 Append-Optimized (AO) 表和 Append-Optimized Column Oriented (AOCO) 表,同时支持并行刷新基于该表的物化视图,从而加速建表和物化视图刷新。
要使用该并发功能,你需要先将系统参数 enable_parallel
的值设为 ON
。
使用示例
并发创建 AO/AOCO 表
-
创建基表
t_p2
,在建表语句中使用WITH
指定并发算子数量。CREATE TABLE t_p2(c1 INT, c2 INT) WITH (parallel_workers=2) DISTRIBUTED BY (c1);
-
向表中插入数据,并收集表
t_p2
上的统计信息。INSERT INTO t_p2 SELECT i, i+1 FROM generate_series(1, 10000000) i;
ANALYZE t_p2; -
开启并发,并关闭 GPORCA 优化器。
SET enable_parallel = ON;
SET optimizer = OFF; -
使用
CREATE TABLE AS
基于t_p2
建表,在执行计划中出现了并行算子,说明已经并发建表。-
创建 AO 表:
EXPLAIN(COSTS OFF) CREATE TABLE ctas_ao USING ao_row AS SELECT SUM(a.c2) AS c2, AVG(b.c1) AS c1 FROM t_p2 a JOIN t_p2 b ON a.c1 = b.c1 DISTRIBUTED BY (c2);
QUERY PLAN
-----------------------------------------------------------------
Redistribute Motion 1:3 (slice1; segments: 1)
Hash Key: (sum(a.c2))
-> Finalize Aggregate
-> Gather Motion 6:1 (slice2; segments: 6)
-> Partial Aggregate
-> Parallel Hash Join
Hash Cond: (a.c1 = b.c1)
-> Parallel Seq Scan on t_p2 a
-> Parallel Hash
-> Parallel Seq Scan on t_p2 b
Optimizer: Postgres query optimizer
(11 rows) -
创建 AOCO 表:
EXPLAIN(COSTS OFF) CREATE TABLE ctas_aoco USING ao_column AS SELECT SUM(a.c2) AS c2, AVG(b.c1) AS c1 FROM t_p2 a JOIN t_p2 b ON a.c1 = b.c1 DISTRIBUTED BY (c2);
QUERY PLAN
-----------------------------------------------------------------
Redistribute Motion 1:3 (slice1; segments: 1)
Hash Key: (sum(a.c2))
-> Finalize Aggregate
-> Gather Motion 6:1 (slice2; segments: 6)
-> Partial Aggregate
-> Parallel Hash Join
Hash Cond: (a.c1 = b.c1)
-> Parallel Seq Scan on t_p2 a
-> Parallel Hash
-> Parallel Seq Scan on t_p2 b
Optimizer: Postgres query optimizer
(11 rows)
-