I am working in SQL Server 2008. I have been tasked with writing a stored procedure to run data validations on external data before we move it into our star-schema data warehouse environment. One type of test requested is a domain integrity / reference lookup from our external fact data tables to our dimension tables. To do this, I use the following technique:
    SELECT some_column
    FROM some_fact_table
    LEFT JOIN some_dimension_table
        ON some_fact_table.some_column = some_dimension_table.lookup_column
    WHERE some_fact_table.some_column IS NOT NULL
      AND some_dimension_table.lookup_column IS NULL

The SELECT clause must match the column definition of the errors table that I move the output to via SSIS. So the SELECT clause actually looks like:
    SELECT primary_key,
           'some_column' AS offending_column,
           'not found in lookup' AS error_message,
           some_column AS offending_value

But because the fact tables are large, I want to minimize the number of times I have to select from each one. Hence, I have one query per fact table that checks every column in question, which looks like:
    SELECT primary_key,
           'col1|col2|col3' AS potentially_offending_columns,
           'not found in lookup|not found in lookup|not found in lookup' AS error_messages,
           col1 + '|' + col2 + '|' + col3 AS potentially_offending_values
    FROM fact_table
    LEFT JOIN dim_table1 ON fact_table.col1 = dim_table1.lookup_column
    LEFT JOIN dim_table2 ON fact_table.col2 = dim_table2.lookup_column
    LEFT JOIN dim_table3 ON fact_table.col3 = dim_table3.lookup_column
    WHERE dim_table1.lookup_column IS NULL
       OR dim_table2.lookup_column IS NULL
       OR dim_table3.lookup_column IS NULL

This has some problems with it.

(1) If any of the source columns in a row is NULL, the string concatenation in potentially_offending_values results in NULL. If I wrap each column in ISNULL (and swap NULLs for an empty string), I won't be able to tell whether the test failed because of a true empty string in the source or because of my swapped-in empty string.

(2) If only one of the columns fails the lookup, the error message still reads 'not found in lookup|not found in lookup|not found in lookup', i.e., I can't tell which of the columns failed.

(3) The potentially_offending_columns column in the output is static, which means I can't tell which of the columns failed by looking at it.
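Problem (1) can be reproduced in isolation. The following is only a minimal sketch: the table variable `@fact` and its three rows are made-up sample data, and it assumes the default `CONCAT_NULL_YIELDS_NULL ON` setting.

```sql
-- Hypothetical sample rows; not taken from the real fact table.
DECLARE @fact TABLE (primary_key INT, col1 VARCHAR(10), col2 VARCHAR(10));

INSERT INTO @fact (primary_key, col1, col2)
VALUES (1, 'A',  'B'),   -- both columns populated
       (2, NULL, 'B'),   -- col1 is NULL in the source
       (3, '',   'B');   -- col1 is a true empty string

SELECT primary_key,
       -- With CONCAT_NULL_YIELDS_NULL ON, one NULL operand makes the whole string NULL (row 2).
       col1 + '|' + col2 AS raw_concat,
       -- Swapping NULL for '' makes rows 2 and 3 produce the same string.
       ISNULL(col1, '') + '|' + ISNULL(col2, '') AS isnull_concat
FROM @fact;
```

Rows 2 and 3 come back with the same `isnull_concat` value, which is exactly the ambiguity described in (1).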
So, in effect, I am having design problems with my errors table. Is there a standard way of outputting to an errors table in a situation like this? Or, if not, what do I need to fix to make the output readable and useful?
I don't know what your data looks like, but instead of using an empty string with ISNULL, couldn't you return the word FAIL or something else that's meaningful to you? Then use a CASE WHEN for the 'not found in lookup' column.
    CASE WHEN col1 IS NULL THEN 'not found in lookup' ELSE '' END + '|' +
    CASE WHEN col2 IS NULL THEN 'not found in lookup' ELSE '' END + '|' +
    CASE WHEN col3 IS NULL THEN 'not found in lookup' ELSE '' END AS error_messages,
    ISNULL(col1, 'FAIL') + '|' + ISNULL(col2, 'FAIL') + '|' + ISNULL(col3, 'FAIL') AS potentially_offending_values
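Putting the CASE idea together with the original joins might look something like the sketch below. It is an illustration under assumptions, not a tested solution for the real schema: the table variables, sample rows, aliases (`f`, `d1`, `d2`) and the `'<NULL>'` marker are all invented here, and only two columns are checked to keep it short. The CASE expressions test the dimension's lookup column (the sign of a failed lookup) rather than the fact column itself, and skip fact values that are NULL, as in the single-column query from the question.

```sql
-- Hypothetical stand-ins for the real fact and dimension tables.
DECLARE @fact_table TABLE (primary_key INT, col1 VARCHAR(10), col2 VARCHAR(10));
DECLARE @dim_table1 TABLE (lookup_column VARCHAR(10));
DECLARE @dim_table2 TABLE (lookup_column VARCHAR(10));

INSERT INTO @fact_table (primary_key, col1, col2)
VALUES (1, 'A', 'X'),    -- both lookups succeed
       (2, 'Z', 'X'),    -- col1 has no match in dim_table1
       (3, NULL, 'Q');   -- col1 is NULL (not tested), col2 has no match
INSERT INTO @dim_table1 (lookup_column) VALUES ('A'), ('B');
INSERT INTO @dim_table2 (lookup_column) VALUES ('X'), ('Y');

SELECT f.primary_key,
       -- Name only the columns that actually failed.
       CASE WHEN f.col1 IS NOT NULL AND d1.lookup_column IS NULL THEN 'col1' ELSE '' END + '|' +
       CASE WHEN f.col2 IS NOT NULL AND d2.lookup_column IS NULL THEN 'col2' ELSE '' END
           AS offending_columns,
       -- One message slot per column, empty when that column passed.
       CASE WHEN f.col1 IS NOT NULL AND d1.lookup_column IS NULL THEN 'not found in lookup' ELSE '' END + '|' +
       CASE WHEN f.col2 IS NOT NULL AND d2.lookup_column IS NULL THEN 'not found in lookup' ELSE '' END
           AS error_messages,
       -- '<NULL>' is an arbitrary marker chosen to be distinguishable from a true empty string.
       ISNULL(f.col1, '<NULL>') + '|' + ISNULL(f.col2, '<NULL>') AS offending_values
FROM @fact_table AS f
LEFT JOIN @dim_table1 AS d1 ON f.col1 = d1.lookup_column
LEFT JOIN @dim_table2 AS d2 ON f.col2 = d2.lookup_column
WHERE (f.col1 IS NOT NULL AND d1.lookup_column IS NULL)
   OR (f.col2 IS NOT NULL AND d2.lookup_column IS NULL);
```

One thing to watch: ISNULL returns the data type of its first argument, so the marker has to fit within the column's declared length, and it should be a value that cannot legitimately occur in the source data.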