當兩張表進行聯(lián)結的時候,如果表1中的數(shù)據(jù)行是否出現(xiàn)在結果集中需要根據(jù)表2中出現(xiàn)或不出現(xiàn)至少一個相匹配的數(shù)據(jù)行來判斷,這種情況就會發(fā)生半聯(lián)結;而反聯(lián)結便是半聯(lián)結的補集,它們會作為數(shù)據(jù)庫中常見的聯(lián)結方法如NESTED LOOPS,MERGE SORT JOIN,HASH JOIN的選項出現(xiàn)。
實際上半聯(lián)結和反聯(lián)結本身也可以被認同是兩種聯(lián)結方法;在CBO優(yōu)化模式下,優(yōu)化器能夠根據(jù)實際情況靈活的轉(zhuǎn)換執(zhí)行語句從而實現(xiàn)半聯(lián)結和反聯(lián)結方法,畢竟沒有什么SQL語法可以顯式的調(diào)用半聯(lián)結和反聯(lián)結,它們只是SQL語句滿足某些條件時優(yōu)化器可以選擇的選項而已,不過仍然有必要深入這兩種選項在特定情況下帶來的性能優(yōu)勢。
半聯(lián)結
半聯(lián)結通常都發(fā)生在使用含有IN和EXISTS的相關子查詢的時候,=ANY的用法與IN相同,所以也會出現(xiàn)發(fā)生半聯(lián)結的情況;不過也是有例外的,在11gR2版本中,優(yōu)化器不會為任何包含在OR分支中的子查詢選擇半聯(lián)結,這也是現(xiàn)在官檔中唯一明確標識的限制條件,來看幾種發(fā)生場景:
復制代碼 代碼如下:
-- 使用IN關鍵字的相關子查詢 => 發(fā)生NESTED LOOPS半聯(lián)結
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- 使用EXISTS關鍵字的相關子查詢 => 發(fā)生NESTED LOOPS半聯(lián)結
SQL> select department_name
2 from hr.departments dept where exists
3 (select null from hr.employees emp where emp.department_id = dept.department_id);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- 謂語中使用了OR分支中的EXISTS子查詢 => 禁用半聯(lián)結
SQL> select department_name
2 from hr.departments dept
3 where 1=2 OR exists
4 (select null from hr.employees emp where emp.department_id = dept.department_id);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 440241596
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 27 | 432 | 4 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 2 | 6 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "HR"."EMPLOYEES" "EMP" WHERE
"EMP"."DEPARTMENT_ID"=:B1))
3 - access("EMP"."DEPARTMENT_ID"=:B1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
35 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
從結果集來看,我們很容易聯(lián)想到內(nèi)聯(lián)結,那為什么半聯(lián)結通常來說會獲得更高的性能呢?這實際也是半聯(lián)結優(yōu)化的關鍵,拿NESTED LOOPS來舉例,在NESTED LOOPS聯(lián)結中,驅(qū)動表被讀取后需要逐個的進入內(nèi)層循環(huán)來進行匹配工作,并且只有當外層循環(huán)的數(shù)據(jù)行與內(nèi)層循環(huán)中的每一行數(shù)據(jù)匹配運算完成后才會結束一個結果集的獲??;而相對而言,半聯(lián)結的區(qū)別在于數(shù)據(jù)集1中的每一條記錄只返回一次,而不管數(shù)據(jù)集2中有幾條匹配的記錄,因此,半聯(lián)結會在找到子查詢中匹配到的第一條數(shù)據(jù)后立即結束處理從而提高性能。
對于某些需要利用半聯(lián)結來提高性能的場景,可以通過手動的方式控制半聯(lián)結的執(zhí)行計劃,使用SEMIJOIN和NO_SEMIJOIN提示分別可以指定優(yōu)化器使用和禁用半聯(lián)結。
復制代碼 代碼如下:
-- 使用NO_SEMIJOIN提示禁用半聯(lián)結
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select /*+ no_semijoin */department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3372191744
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 106 | 1802 | 4 (25)| 00:00:01 |
| 1 | VIEW | VM_NWVW_2 | 106 | 1802 | 4 (25)| 00:00:01 |
| 2 | HASH UNIQUE | | 106 | 2544 | 4 (25)| 00:00:01 |
| 3 | NESTED LOOPS | | 106 | 2544 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 567 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 4 | 12 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
506 recursive calls
0 db block gets
188 consistent gets
7 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
10 sorts (memory)
0 sorts (disk)
11 rows processed
除此之外,我們還可以使用_always_semi_join隱藏參數(shù)選擇半聯(lián)結的聯(lián)結類型,有關_always_semi_join參數(shù)的可選值:
復制代碼 代碼如下:
SQL> SELECT
2 PARNO_KSPVLD_VALUES pvalid_par#,
3 NAME_KSPVLD_VALUES pvalid_name,
4 VALUE_KSPVLD_VALUES pvalid_value,
5 DECODE(ISDEFAULT_KSPVLD_VALUES, 'FALSE', '', 'DEFAULT' ) pvalid_default
6 FROM
7 X$KSPVLD_VALUES
8 WHERE
9 LOWER(NAME_KSPVLD_VALUES) LIKE '%'||LOWER(nvl('pname',name_kspvld_values))||'%'
10 ORDER BY
11 pvalid_par#,
12 pvalid_default,
13 pvalid_Value
14 /
PAR# PARAMETER VALUE DEFAULT
------ -------------------------------------------------- ------------------------------ -------
1705 _always_semi_join CHOOSE
_always_semi_join HASH
_always_semi_join MERGE
_always_semi_join NESTED_LOOPS
_always_semi_join OFF
該參數(shù)的默認值為choose,表示選用半聯(lián)結的類型由優(yōu)化器來決定,下面來使用_always_semi_join參數(shù)將上面的NESTED LOOPS半聯(lián)結改變?yōu)镠ASH JOIN半聯(lián)結:
復制代碼 代碼如下:
-- 默認發(fā)生NESTED LOOPS SEMI
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- session級別修改參數(shù)
SQL> alter session set "_always_semi_join"=merge;
Session altered.
-- 發(fā)生MERGE JOIN SEMI
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 954076352
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 4 (25)| 00:00:01 |
| 1 | MERGE JOIN SEMI | | 10 | 190 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 27 | 432 | 2 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | DEPT_ID_PK | 27 | | 1 (0)| 00:00:01 |
|* 4 | SORT UNIQUE | | 107 | 321 | 2 (50)| 00:00:01 |
| 5 | INDEX FULL SCAN | EMP_DEPARTMENT_IX | 107 | 321 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
1 physical reads
0 redo size
742 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
11 rows processed
-- 從trace來看優(yōu)化器的選擇還是非??煽康?。。
反聯(lián)結
從本質(zhì)上來說,反聯(lián)結和半聯(lián)結很多相似的因素,反聯(lián)結的發(fā)生通常是在使用含有NOT IN,NOT EXISTS的相關子查詢的時候,同樣,如果子查詢謂語OR分支中,反聯(lián)結也會被禁用,它和半聯(lián)結主要的不同點還是在返回數(shù)據(jù)的匹配方式上,它是會返回在子查詢中沒有匹配到的數(shù)據(jù)行,不過其優(yōu)化的原理是一致的,通過在子查詢中找到第一條匹配記錄而立即停止處理來提高效率,一下是發(fā)生的集中場景:
復制代碼 代碼如下:
SQL> set autotrace traceonly
-- NOT IN 觸發(fā)反聯(lián)結
SQL> select department_name
2 from hr.departments
3 where department_id not in
4 (select department_id from hr.employees where department_id is not null);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3082375452
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID" IS NOT NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
6 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16 rows processed
-- EXISTS觸發(fā)反聯(lián)結
SQL> select department_name
2 from hr.departments dept
3 where not exists
4 (select null from hr.employees emp where emp.department_id = dept.department_id);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3082375452
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
3 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16 rows processed
從上面的例子可以看出minus和outer join操作都可以巧妙的實現(xiàn)同樣的結果,不過從執(zhí)行計劃來看,minus操作顯然沒有反聯(lián)結操作優(yōu)化,而使用outer join雖然發(fā)生了反聯(lián)結優(yōu)化,但是由于使用了帶空值的虛擬記錄來匹配數(shù)據(jù)行,不便于理解,因此實際還是不建議使用的。
如果想要手動控制反聯(lián)結的執(zhí)行計劃,這里也有一些hint和參數(shù)可以使用,常用的hint有:
1.ANTIJOIN-進行反聯(lián)結,優(yōu)化器決定聯(lián)結類型
2.USE_ANTI-老版本的提示,和ANTIJOIN功能一致
3.[NL_AJ] | [HASH_AJ] | [MERGE_AJ]-指定發(fā)生反聯(lián)結的類型(10g開始被棄用,不過仍然可以生效)
在參數(shù)控制方面,也有個和_always_semi_join非常相同的_always_anti_join參數(shù),用法完全一致;還有參數(shù)_optimizer_null_aware_antijoin,_optimizer_outer_to_anti_enable用于控制對含空值和外聯(lián)結的反聯(lián)結轉(zhuǎn)換。
復制代碼 代碼如下:
-- 使用hint顯式指定反聯(lián)結類型
SQL> select department_name
2 from hr.departments dept
3 where not exists (select /*+ hash_aj */ null from hr.employees emp
4 where emp.department_id = dept.department_id);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3587451639
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 17 | 323 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | EMP_DEPARTMENT_IX | 107 | 321 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
566 recursive calls
0 db block gets
193 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
16 rows processed
-- 使用_optimizer_null_antijoin參數(shù)關閉反聯(lián)結中的空值考慮選項-即返回空值的情況不使用反聯(lián)結
SQL> alter session set "_optimizer_null_aware_antijoin"=false;
Session altered.
SQL> select department_name
2 from hr.departments
3 where department_id not in (select department_id from hr.employees);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3416340233
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 416 | 30 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMPLOYEES | 2 | 6 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "HR"."EMPLOYEES" "EMPLOYEES"
WHERE LNNVL("DEPARTMENT_ID">:B1)))
3 - filter(LNNVL("DEPARTMENT_ID">:B1))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
172 consistent gets
0 physical reads
0 redo size
343 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed