Versions available for this page: CUBRID 8.4.0 | CUBRID 8.4.1 |
GROUP BY caluse optimization works on the premise that if all columns in the GROUP BY clause are included in an index, you can use the index upon executing a query, so you don't execute a separate sorting job. The columns in the GROUP BY clause must exist in front side of the column forming the index.
SELECT * FROM tab WHERE col1 > 0 GROUP BY col1,col2
You can use the index if the column condition is a constant although the column consisting of the idex doesn't exist in the GROUP BY clause.
SELECT * FROM tab WHERE col2=val GROUP BY col1,col3
If there is any index that consists of tab(col1, col2, col3) in the above example, use the index for optimizing GROUP BY.
Row sorting by GROUP BY is not required, because you can get the result as the requested order on the index scan.
If the index consisting of the GROUP BY column and the first column of the index is NOT NULL, even though there is no WHERE clause, the GROUP BY optimization will be applied.
GROUP BY optimization is applied only when MIN() or MAX() are used in an aggregate function, and to use the two aggregate functions together, an identical column must be used.
CREATE INDEX ON T(a, b, c);
SELECT a, MIN(b), c, MAX(b) FROM T WHERE a > 18 GROUP BY a, b;
CREATE TABLE tab (i INT, j INT, k INT);
CREATE INDEX ON tab (j,k);
INSERT INTO tab VALUES (1,2,3),(6,4,2),(3,4,1),(5,2,1),(1,5,5),(2,6,6),(3,5,4);
The following example executes the GROUP BY to j and k column so that the index can be used without a separate sorting process.
SELECT i,j,k FROM tab WHERE j > 0 GROUP BY j,k;
-- the selection from the query plan dump shows that the index i_tab_j_k was used and sorting was not necessary
-- (/* ---> skip GROUP BY */)
Query plan:
iscan
class: tab node[0]
index: i_tab_j_k term[0]
sort: 2 asc, 3 asc
cost: fixed 0(0.0/0.0) var 1(0.0/1.0) card 0
Query stmt:
select tab.i, tab.j, tab.k from tab tab where ((tab.j> ?:0 )) group by tab.j, tab.k
/* ---> skip GROUP BY */
i j k
5 2 1
1 2 3
3 4 1
6 4 2
3 5 4
1 5 5
2 6 6
The following example executes GROUP BY to the j and k column and there is no condition for the j but the j column has the NOT NULL attribute so that, the index consisting of tab(j,k) will be used without a separate sorting process.
ALTER TABLE tab CHANGE COLUMN j j INT NOT NULL;
SELECT * FROM tab GROUP BY j,k;
-- the selection from the query plan dump shows that the index i_tab_j_k was used (since j has the NOT NULL constraint )
-- and sorting was not necessary (/* ---> skip GROUP BY */)
Query plan:
iscan
class: tab node[0]
index: i_tab_j_k
sort: 2 asc, 3 asc
cost: fixed 0(0.0/0.0) var 1(0.0/1.0) card 0
Query stmt: select tab.i, tab.j, tab.k from tab tab group by tab.j, tab.k
/* ---> skip GROUP BY */
=== <Result of SELECT Command in Line 1> ===
i j k
=======================================
5 2 1
1 2 3
3 4 1
6 4 2
3 5 4
1 5 5
2 6 6