Dynamic Join Filter
Dynamic Join Filter通过动态收集Hash Join的右表Join键信息,在Join计算前过滤左表中无法Join匹配的数据,从而提升AnalyticDB PostgreSQL版的Hash Join的性能。使用TPC-H在本地生成1 GB测试数据,操作方法,请参见生成测试数据。创建用于测试的lineitem和part表,建表语句如下:CREATE TABLE LINEITEM (
L_ORDERKEY BIGINT NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY NUMERIC(15,2) NOT NULL,
L_EXTENDEDPRICE NUMERIC(15,2) NOT NULL,
L_DISCOUNT NUMERIC(15,2) NOT NULL,
L_TAX NUMERIC(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL
) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN)
DISTRIBUTED BY (L_ORDERKEY, L_LINENUMBER);
CREATE TABLE PART (
P_PARTKEY INTEGER NOT NULL,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR CHAR(25) NOT NULL,
P_BRAND CHAR(10) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER CHAR(10) NOT NULL,
P_RETAILPRICE NUMERIC(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL
) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN)
DISTRIBUTED BY (P_PARTKEY);使用\COPY命令导入1 GB测试数据,语句如下:\COPY LINEITEM FROM 'lineitem.tbl' with DELIMITER '|' NULL '';
\COPY PART FROM 'part.tbl' with DELIMITER '|' NULL '';查看关闭Dynamic Join Filter功能的情况下,Q17查询的执行耗时。具体步骤如下:关闭Dynamic Join Filter功能:SET adbpg_enable_dynamic_join_filter TO off;执行TPC-H Q17查询:SELECT
sum(l_extendedprice) / 7.0 as avg_yearly
FROM
lineitem,
part
WHERE
p_partkey = l_partkey
and p_brand = 'Brand#54'
and p_container = 'SM CAN'
and l_quantity < (
SELECT
0.2 * avg(l_quantity)
FROM
lineitem
WHERE
l_partkey = p_partkey
);返回结果如下,查询耗时为3468ms: avg_yearly
---------------------
336452.465714285714
(1 row)
Time: 3468.411 ms查看开启Dynamic Join Filter功能的情况下,Q17查询的执行耗时。具体步骤如下:开启Dynamic Join Filter功能:SET adbpg_enable_dynamic_join_filter TO on;执行TPC-H Q17查询:SELECT
sum(l_extendedprice) / 7.0 as avg_yearly
FROM
lineitem,
part
WHERE
p_partkey = l_partkey
and p_brand = 'Brand#54'
and p_container = 'SM CAN'
and l_quantity < (
SELECT
0.2 * avg(l_quantity)
FROM
lineitem
WHERE
l_partkey = p_partkey
);返回结果如下,查询耗时为305ms: avg_yearly
---------------------
336452.465714285714
(1 row)
Time: 305.632 ms通过以上示例可以看出,开启Dynamic Join Filter功能后,执行时间从3468ms降低到305ms,有效地提升了执行性能。
栏目分类
- VEGA中文网
- EMR中文网