Open Source RDBMS - Seamless, Scalable, Stable and Free

한국어 | Login |Register

Versions available for this page: CUBRID 8.4.0 |  CUBRID 8.4.1 | 



GROUP BY Clause Optimization

Description

GROUP BY caluse optimization works on the premise that if all columns in the GROUP BY clause are included in an index, you can use the index upon executing a query, so you don't execute a separate sorting job. The columns in the GROUP BY clause must exist in front side of the column forming the index.

SELECT * FROM tab WHERE col1 > 0 GROUP BY col1,col2

  • You can use the index consisting of tab(col1, col2) for optimization.
  • The index consisting of tab(col1, col2, col3) can be used because col3 no referred by GROUP BY comes after col1 and col2.
  • You cannot use the index consisting of tab(col1) for optimization.
  • You also cannot use the index consisting of tab(col3, col1, col2) or tab(col1, col3, col2), because col3 is not located at the back of the column in the GROUP BY clause.

You can use the index if the column condition is a constant although the column consisting of the idex doesn't exist in the GROUP BY clause.

SELECT * FROM tab WHERE col2=val GROUP BY col1,col3

If there is any index that consists of tab(col1, col2, col3) in the above example, use the index for optimizing GROUP BY.

Row sorting by GROUP BY is not required, because you can get the result as the requested order on the index scan.

If the index consisting of the GROUP BY column and the first column of the index is NOT NULL, even though there is no WHERE clause, the GROUP BY optimization will be applied.

GROUP BY optimization is applied only when MIN() or MAX() are used in an aggregate function, and to use the two aggregate functions together, an identical column must be used.

CREATE INDEX ON T(a, b, c);

SELECT a, MIN(b), c, MAX(b) FROM T WHERE a > 18 GROUP BY a, b;

Example

CREATE TABLE tab (i INT, j INT, k INT);

CREATE INDEX ON tab (j,k);

INSERT INTO tab VALUES (1,2,3),(6,4,2),(3,4,1),(5,2,1),(1,5,5),(2,6,6),(3,5,4);

The following example shows that indexes consisting of tab(j,k) are used and no separate sorting process is required because GROUP BY is executed by j and k columns.

SELECT i,j,k FROM tab WHERE j > 0 GROUP BY j,k;

 

--  the  selection from the query plan dump shows that the index i_tab_j_k was used and sorting was not necessary

--  (/* ---> skip GROUP BY */)

 

Query plan:

iscan

    class: tab node[0]

    index: i_tab_j_k term[0]

    sort:  2 asc, 3 asc

    cost:  fixed 0(0.0/0.0) var 1(0.0/1.0) card 0

 

Query stmt:

select tab.i, tab.j, tab.k from tab tab where ((tab.j> ?:0 )) group by tab.j, tab.k

/* ---> skip GROUP BY */

            i            j            k

            5            2            1

            1            2            3

            3            4            1

            6            4            2

            3            5            4

            1            5            5

           2            6            6

The following example shows that an index consisting of tab(j,k) is used and no separate sorting process is required while GROUP BY is executed by j and k columns, no condition exists for j, and j column has NOT NULL attribute.

ALTER TABLE tab CHANGE COLUMN j j INT NOT NULL;

SELECT * FROM tab GROUP BY j,k;

 

--  the  selection from the query plan dump shows that the index i_tab_j_k was used (since j has the NOT NULL constraint )

--  and sorting was not necessary (/* ---> skip GROUP BY */)

Query plan:

iscan

    class: tab node[0]

    index: i_tab_j_k

    sort:  2 asc, 3 asc

    cost:  fixed 0(0.0/0.0) var 1(0.0/1.0) card 0

 

Query stmt: select tab.i, tab.j, tab.k from tab tab group by tab.j, tab.k

/* ---> skip GROUP BY */

=== <Result of SELECT Command in Line 1> ===

            i            j            k

=======================================

            5            2            1

            1            2            3

            3            4            1

            6            4            2

            3            5            4

            1            5            5

            2            6            6