Single SQL to count rows in all tables

5 min readJul 7, 2019

Oracle 18c XE

Single SQL to count rows in tables

Well that was easy, not a COUNT(*) in sight and the query completes in 20ms according to SQL Developer. But why are last_analyzed and stale_stats queried and what is this data dictionary table user_tab_statistics?

Let’s delve into it…

To do demos in Oracle I like to use the Human Resources (HR) sample schema which ships with every version. This is a default install of 18c XE connecting to the standard port (1521) and the XE pluggable database (XEPDB1)

C:\>sqlplus hr/hr@//localhost:1521/XEPDB1SQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:15:34 2019
Version 18.4.0.0.0Copyright (c) 1982, 2018, Oracle.  All rights reserved.ERROR:
ORA-28000: The account is locked.

Oops, that didn’t work. We’ll need to unlock the HR account first.

C:\>sqlplus sys/<password>@//localhost:1521/XEPDB1 as sysdbaSQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:12:21 2019
Version 18.4.0.0.0Copyright (c) 1982, 2018, Oracle.  All rights reserved.Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0SQL> ALTER USER hr IDENTIFIED BY hr ACCOUNT UNLOCK;User altered.

Log into the HR account using HR as a password we IDENTIFIED BY above:

C:\>sqlplus hr/hr@//localhost:1521/XEPDB1SQL*Plus: Release 18.0.0.0.0 - Production on Thu Jul 4 16:17:28 2019
Version 18.4.0.0.0Copyright (c) 1982, 2018, Oracle.  All rights reserved.Last Successful login time: Thu Jul 04 2019 16:14:47 +08:00Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0SQL>

Now we can run a simple query:

SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'YYYYMMDD';
SQL> COLUMN last_analyzed FORMAT 99999999 HEADING 'Last|Analyzed'
SQL> COLUMN num_rows      FORMAT 99999    HEADING 'Num|Rows'
SQL> COLUMN table_name    FORMAT a19
SQL> SELECT table_name, num_rows, last_analyzed
  2    FROM user_tables;
                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
REGIONS
COUNTRIES
LOCATIONS
DEPARTMENTS
JOBS
EMPLOYEES
JOB_HISTORY7 rows selected.SQL>

Unfortunately we have NULL in the num_rows column when we were expecting numbers. [1]

These row counts are calculated in real time, they come from statistics. The freshness of the statistics dictate the accuracy. It is the LAST_ANALYZED column which tell us when the stats were most recently gathered, NULL means there are no statistics at all. So let’s gather some, first for a single table and then for all the tables in the schema to see the different methods.

SQL> SET TIMING ON
SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>'HR',tabname=>'DEPARTMENTS',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');PL/SQL procedure successfully completed.Elapsed: 00:00:00.20
SQL> SET TIMING OFF
SQL> SELECT table_name, num_rows, last_analyzed
  2    FROM user_tables;
                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
REGIONS
COUNTRIES
LOCATIONS
DEPARTMENTS             27 20190704
JOBS
EMPLOYEES
JOB_HISTORY7 rows selected.SQL> SET TIMING ON
SQL> EXECUTE dbms_stats.gather_schema_stats(ownname=>'HR',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');PL/SQL procedure successfully completed.Elapsed: 00:00:00.51
SQL> SET TIMING OFF
SQL> SELECT table_name, num_rows, last_analyzed
  2    FROM user_tables;
                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
REGIONS                  4 20190704
COUNTRIES               25 20190704
LOCATIONS               23 20190704
DEPARTMENTS             27 20190704
JOBS                    19 20190704
EMPLOYEES              107 20190704
JOB_HISTORY             10 201907047 rows selected.

Great, we got our row counts and the stats gathering took under a second but these are small tables. For larger tables generally we might want to gather sample statistics, not count every row. So let’s create a table with 10K rows:

SQL> CREATE TABLE test_stats_estimate AS
  2  SELECT LEVEL id, CAST(DBMS_RANDOM.string('a',100) AS VARCHAR2(100)) txt FROM dual
  3  CONNECT BY LEVEL <= 10000;Table created

But gather stats on only 10% of the table:

SQL> EXECUTE dbms_stats.gather_table_stats(ownname=>'HR',tabname=>'TEST_SMALL_ESTIMATE_PERCENT',estimate_percent=>10,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');PL/SQL procedure successfully completed.

What we find is that the number of rows is an approximation, not exactly 10K:

SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%';                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE   9979 20190704

What happens if we delete half the rows in the table?

SQL> DELETE FROM test_small_estimate_percent WHERE ROWNUM <= 5000;5000 rows deleted.Elapsed: 00:00:00.08
SQL> SELECT COUNT(*) FROM test_small_estimate_percent;COUNT(*)
----------
      5000SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%';                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE   9979 20190704

We can see the difference in a COUNT() but there’s no change to num_rows. What about if we add new rows?

SQL> INSERT /*+ APPEND */ INTO test_small_estimate_percent
  2  SELECT LEVEL id, CAST(DBMS_RANDOM.string('a',100) AS VARCHAR2(100)) txt FROM dual
  3  CONNECT BY LEVEL <= 10000;10000 rows created.SQL> COMMIT;Commit complete.SQL> SELECT COUNT(*) FROM test_stats_estimate;  COUNT(*)
----------
     15000SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%';                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE   9979 20190704

How about if we truncate the table?

SQL> TRUNCATE TABLE test_stats_estimate;
Table truncated.
SQL> SELECT COUNT(*) FROM test_stats_estimate;  COUNT(*)
----------
         0SQL> SELECT table_name, num_rows, last_analyzed FROM user_tables WHERE table_name LIKE 'TEST%';                       Num Last
TABLE_NAME            Rows Analyzed
------------------- ------ --------
TEST_STATS_ESTIMATE   9979 20190704

No matter what we do the statistics generated row count isn’t updated. Looks like this method of find a row count is useless — you not only need to have stats but they have to be current. Or is it useless? Oracle provides a method to update stale statistics. How does it know?

SQL> SELECT table_name, inserts, updates, deletes, truncated FROM all_tab_modifications WHERE table_owner = 'HR';TABLE_NAME                        INSERTS    UPDATES    DELETES TRU
------------------------------ ---------- ---------- ---------- ---
TEST_SMALL_ESTIMATE_PERCENT         10000          0      19839 YES

This information is summarized by Oracle into a single stale_stats flag:

SQL> SELECT table_name,num_rows,last_analyzed,stale_stats FROM user_tab_statistics;TABLE_NAME                       NUM_ROWS LAST_ANALYZED   STALE_S
------------------------------ ---------- --------------- -------
TEST_SMALL_ESTIMATE_PERCENT         10023 20190704        YES
LOCATIONS                              23 20190704        NO
DEPARTMENTS                            27 20190704        NO
EMPLOYEES                             107 20190704        NO
COUNTRIES                              25 20190704        NO
REGIONS                                 4 20190704        NO
JOBS                                   19 20190704        NO
JOB_HISTORY                            10 20190704        NO8 rows selected.

Here we can see that the statistics for our target table are stale. To have Oracle automatically gather stats for stale tables we can use:

EXECUTE DBMS_STATS.gather_schema_stats (ownname => 'HR', cascade => TRUE, options => 'GATHER AUTO');

Which leads us back to where this post started, with:

The automatic optimizer statistics collection job which calls DBMS_STATS package runs in predefined maintenance windows and these maintenance windows are open once a day during which various jobs including the gathering of statistics is performed. SYSTEM can see the job in the scheduler:

SELECT *
  FROM dba_scheduler_jobs
 WHERE job_name LIKE '%STAT%';

See my next post if you want to expand this to calculate disk usage.

Here’s the entire sqlplus script which can be run on the command line:

Using Oracle Database 18c Express Edition (XE) — Free download

[1] This won’t happen on a fresh DB install, I deleted the statistics using:

EXECUTE DBMS_STATS.DELETE_SCHEMA_STATS(ownname=>’HR’);

Single SQL to count rows in all tables

Written by R Andrew Cocks