Single SQL to count rows in all tables

R Andrew Cocks
5 min readJul 7, 2019

--

Oracle 18c XE

Row counts for HR schema
Single SQL to count rows in tables

Well that was easy, not a COUNT(*) in sight and the query completes in 20ms according to SQL Developer. But why are last_analyzed and stale_stats queried and what is this data dictionary table user_tab_statistics?

Let’s delve into it…

To do demos in Oracle I like to use the Human Resources (HR) sample schema which ships with every version. This is a default install of 18c XE connecting to the standard port (1521) and the XE pluggable database (XEPDB1)

C:\>sqlplus hr/hr@//localhost:1521/XEPDB1SQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:15:34 2019
Version 18.4.0.0.0
Copyright (c) 1982, 2018, Oracle. All rights reserved.ERROR:
ORA-28000: The account is locked.

Oops, that didn’t work. We’ll need to unlock the HR account first.

C:\>sqlplus sys/<password>@//localhost:1521/XEPDB1 as sysdbaSQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:12:21 2019
Version 18.4.0.0.0
Copyright (c) 1982, 2018, Oracle. All rights reserved.Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0
SQL> ALTER USER hr IDENTIFIED BY hr ACCOUNT UNLOCK;User altered.

Log into the HR account using HR as a password we IDENTIFIED BY above:

C:\>sqlplus hr/hr@//localhost:1521/XEPDB1SQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:17:28 2019
Version 18.4.0.0.0
Copyright (c) 1982, 2018, Oracle. All rights reserved.Last Successful login time: Thu Jul 04 2019 16:14:47 +08:00Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0
SQL>

Now we can run a simple query:

SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'YYYYMMDD';
SQL> COLUMN last_analyzed FORMAT 99999999 HEADING 'Last|Analyzed'
SQL> COLUMN num_rows FORMAT 99999 HEADING 'Num|Rows'
SQL> COLUMN table_name FORMAT a19
SQL> SELECT table_name, num_rows, last_analyzed
2 FROM user_tables;
Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
REGIONS
COUNTRIES
LOCATIONS
DEPARTMENTS
JOBS
EMPLOYEES
JOB_HISTORY
7 rows selected.SQL>

Unfortunately we have NULL in the num_rows column when we were expecting numbers. [1]

These row counts are calculated in real time, they come from statistics. The freshness of the statistics dictate the accuracy. It is the LAST_ANALYZED column which tell us when the stats were most recently gathered, NULL means there are no statistics at all. So let’s gather some, first for a single table and then for all the tables in the schema to see the different methods.

SQL> SET TIMING ON
SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>'HR',tabname=>'DEPARTMENTS',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');
PL/SQL procedure successfully completed.Elapsed: 00:00:00.20
SQL> SET TIMING OFF
SQL> SELECT table_name, num_rows, last_analyzed
2 FROM user_tables;
Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
REGIONS
COUNTRIES
LOCATIONS
DEPARTMENTS 27 20190704
JOBS
EMPLOYEES
JOB_HISTORY
7 rows selected.SQL> SET TIMING ON
SQL> EXECUTE dbms_stats.gather_schema_stats(ownname=>'HR',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');
PL/SQL procedure successfully completed.Elapsed: 00:00:00.51
SQL> SET TIMING OFF
SQL> SELECT table_name, num_rows, last_analyzed
2 FROM user_tables;
Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
REGIONS 4 20190704
COUNTRIES 25 20190704
LOCATIONS 23 20190704
DEPARTMENTS 27 20190704
JOBS 19 20190704
EMPLOYEES 107 20190704
JOB_HISTORY 10 20190704
7 rows selected.

Great, we got our row counts and the stats gathering took under a second but these are small tables. For larger tables generally we might want to gather sample statistics, not count every row. So let’s create a table with 10K rows:

SQL> CREATE TABLE test_stats_estimate AS
2 SELECT LEVEL id, CAST(DBMS_RANDOM.string('a',100) AS VARCHAR2(100)) txt FROM dual
3 CONNECT BY LEVEL <= 10000;
Table created

But gather stats on only 10% of the table:

SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>'HR',tabname=>'TEST_SMALL_ESTIMATE_PERCENT',estimate_percent=>10,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');PL/SQL procedure successfully completed.

What we find is that the number of rows is an approximation, not exactly 10K:

SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%';                       Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE 9979 20190704

What happens if we delete half the rows in the table?

SQL> DELETE FROM test_small_estimate_percent WHERE ROWNUM <= 5000;5000 rows deleted.Elapsed: 00:00:00.08
SQL> SELECT COUNT(*) FROM test_small_estimate_percent;
COUNT(*)
----------
5000
SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%'; Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE 9979 20190704

We can see the difference in a COUNT() but there’s no change to num_rows. What about if we add new rows?

SQL> INSERT /*+ APPEND */ INTO test_small_estimate_percent
2 SELECT LEVEL id, CAST(DBMS_RANDOM.string('a',100) AS VARCHAR2(100)) txt FROM dual
3 CONNECT BY LEVEL <= 10000;
10000 rows created.SQL> COMMIT;Commit complete.SQL> SELECT COUNT(*) FROM test_stats_estimate; COUNT(*)
----------
15000
SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%'; Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE 9979 20190704

How about if we truncate the table?

SQL> TRUNCATE TABLE test_stats_estimate;
Table truncated.
SQL> SELECT COUNT(*) FROM test_stats_estimate;
COUNT(*)
----------
0
SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%'; Num Last
TABLE_NAME Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE 9979 20190704

No matter what we do the statistics generated row count isn’t updated. Looks like this method of find a row count is useless — you not only need to have stats but they have to be current. Or is it useless? Oracle provides a method to update stale statistics. How does it know?

SQL> SELECT table_name, inserts, updates, deletes, truncated FROM all_tab_modifications WHERE table_owner = 'HR';TABLE_NAME                        INSERTS    UPDATES    DELETES TRU
------------------------------ ---------- ---------- ---------- ---
TEST_SMALL_ESTIMATE_PERCENT 10000 0 19839 YES

This information is summarized by Oracle into a single stale_stats flag:

SQL> SELECT table_name,num_rows,last_analyzed,stale_stats FROM user_tab_statistics;TABLE_NAME                       NUM_ROWS LAST_ANALYZED   STALE_S
------------------------------ ---------- --------------- -------
TEST_SMALL_ESTIMATE_PERCENT 10023 20190704 YES
LOCATIONS 23 20190704 NO
DEPARTMENTS 27 20190704 NO
EMPLOYEES 107 20190704 NO
COUNTRIES 25 20190704 NO
REGIONS 4 20190704 NO
JOBS 19 20190704 NO
JOB_HISTORY 10 20190704 NO
8 rows selected.

Here we can see that the statistics for our target table are stale. To have Oracle automatically gather stats for stale tables we can use:

EXECUTE DBMS_STATS.gather_schema_stats (ownname => 'HR', cascade => TRUE, options => 'GATHER AUTO');

Which leads us back to where this post started, with:

The automatic optimizer statistics collection job which calls DBMS_STATS package runs in predefined maintenance windows and these maintenance windows are open once a day during which various jobs including the gathering of statistics is performed. SYSTEM can see the job in the scheduler:

SELECT *
FROM dba_scheduler_jobs
WHERE job_name LIKE '%STAT%';

See my next post if you want to expand this to calculate disk usage.

Here’s the entire sqlplus script which can be run on the command line:

Using Oracle Database 18c Express Edition (XE) — Free download

[1] This won’t happen on a fresh DB install, I deleted the statistics using:

EXECUTE DBMS_STATS.DELETE_SCHEMA_STATS(ownname=>’HR’);

--

--