Find columns with non-null data
Have tables with loads of columns and lots of data – mostly null – and want to know which columns across them actually has any data? Try this. Oracle specific though.
First, create a table that contains all the tables and columns you want to check – and some columns for processing output. You’ll monitor this table to check progress, and get the output from here in the end. Lets call this table COLS_TO_CHECK. You must have dba rights for this.
create table COLS_TO_CHECK as (
select
col.owner as schema_name,
col.table_name,
col.column_name,
col.nullable,
'N' as processed,
' ' as not_null
from
sys.dba_tab_columns col
where
col.owner = 'SCHEMA_OWNER_OF_TABLES_TO_CHECK'
and col.table_name in ('LIST','OF','TABLES')
);
You can check progress later in this table – it will set “processed” to “Y” when done, and “not_null” to either “Y” or “N”, depending on whether that column has any data in it (“Y”) or whether all rows are null (“N”).
To actually check the data, run this PL/SQL (its going to be slow!). If it dies or gets killed, you can just re-run – it will carry on from where it last checked.
declare
l_test_statement varchar2(32767);
l_contains_value pls_integer;
cursor c is
select schema_name, table_name, column_name, nullable
from COLS_TO_CHECK -- the table created above
where processed = 'N';
begin
for r in c
loop
if r.nullable = 'N'
then
update COLS_TO_CHECK set processed='Y', not_nul ='Y'
where schema_name=r.schema_name
and table_name=r.table_name
and column_name=r.column_name;
commit;
else
begin
l_test_statement := 'select 1 from dual where exists (select 1 from ' || r.schema_name || '.' || r.table_name || ' where ' || r.column_name || 'is not null)';
execute immediate l_test_statement
into l_contains_value;
update COLS_TO_CHECK set processed='Y', not_null='Y'
where schema_name=r.schema_name
and table_name=r.table_name
and column_name=r.column_name;
commit;
exception
when no_data_found then
update COLS_TO_CHECK set processed='Y', not_null='N'
where schema_name=r.schema_name
and table_name=r.table_name
and column_name=r.column_name;
commit;
end;
end if;
end loop;
end;