Data validation across environments using HASH

Today i will discuss about data validation across environments using HASH function. There are times we need to validate the data of same table in different environments which is very monotonous and time consuming task while doing manually.

Its better to automate the process. Here , I am considering the three environments Hive,IBM DB2,Snowflake Database.
MD5_HEX (concatenated_columns)
HEX(HASH_MD5( concatenated_columns)

Here, you need to take care of below things.
1. Hexadecimal Output may in uppercase or lowercase, so handle with UPPER or LOWER function across environments.
2. You need to take care of nulls as well. Replace nulls with Blank(”) .
3. Do type casting of all columns to varchar/string before concatenation.
4. Data validation using HASH functions is successful when the number of records are equal in both the environments.
5. Most importantly, order of records plays an important role.

Related posts