Check existence and non-existence of columns in a table

In this Bog, We will prepare a query which will tell us which tables has one specific column but another column which is also important is not present.QUERY-1select distinct TABLE_NAME from INFORMATION_SCHEMA.COLUMNS c where COLUMN_NAME=’column1′ and not exists ( select 1 from INFORMATION_SCHEMA.columns cc where cc.TABLE_NAME=c.TABLE_NAMEand cc.COLUMN_NAME=’column2′) The above Query will list all tables in which column1 is present and column2 is not present. QUERY-2select distinct TABLE_NAME from INFORMATION_SCHEMA.COLUMNS c where COLUMN_NAME=’column1′ and exists ( select 1 from INFORMATION_SCHEMA.columns cc where cc.TABLE_NAME=c.TABLE_NAMEand cc.COLUMN_NAME=’column2′) This Query will list all tables in…

get all Values separated by delimiter in sql

In this blog, I will discuss about “How to get all values separated by delimiter in SQL”. In one of my previous blog, where I have explained about positions of all delimiters present in a string. We will use same logic here as well. For example I have string with value abc@gmail.com;xyz@gmail.com;123@gmail,comIn the above string, I have three email ids which are separated by delimiter ; . I need these values in below format abc@gmail.comxyz@gmail.com123@gmail,com Use the below code to fetch the data in above format . declare @var1 varchar(100)…

ROLLBACK IN SQL SERVER

Lets discuss about the rollback code or syntax in the SQL-Server. In SQL Server Settings, we generally set auto-commit on. I have worked on many projects where database is SQL-Server and most of them(team mates) were not aware of Rollback syntax in SQL-Server. In oracle, auto-commit is generally set to off. So, we always press on commit button in SQL Developer or any other tool or we use syntax “Commit;” to commit the uncommitted transactions like insert/update. In SQL Server, If you want to explicitly want to commit the transactions…

GET HIERARCHICAL VALUES PRESENT IN SAME COLUMN OF A TABLE

Lets discuss about the parent and child relationship present in the same column of a table. That means both exists in the same column and there is always a identifier which says whether a record belongs to parent value or it belongs to child value. See the below data .As it is clearly seen from the data, parent and child record can be easily identified from column c1. Here , we need to apply self join to get parent and child values in the same row. See the below Query.…

Data validation across environments using HASH

Today i will discuss about data validation across environments using HASH function. There are times we need to validate the data of same table in different environments which is very monotonous and time consuming task while doing manually. Its better to automate the process. Here , I am considering the three environments Hive,IBM DB2,Snowflake Database.HIVE:md5(concatenated_columns)SNOWFLAKE:MD5_HEX (concatenated_columns)IBM DB2:HEX(HASH_MD5( concatenated_columns) Here, you need to take care of below things.1. Hexadecimal Output may in uppercase or lowercase, so handle with UPPER or LOWER function across environments.2. You need to take care of nulls…

Count the number of nulls using left join

Today, I will discuss about How to calculate the number of nulls that gets populated with left join or %age of matched data or count of not nulls using left join. CASE 1:Calculate the number of nulls that gets populated from right hand table using left join.Assume , I have two employee table EMP_TBL1 and EMP_TBL2. See the data for the same. EMP_TBL1 EMP_TBL2 below is the code used to get number of nulls that comes from right side table using left join. select count(*)-count(y_empid) from ( select x.emp_id ,y.emp_id…

Internal and External tables in Hive

Today, I will present types of tables that can be created in hive in a different way. I created three tables in hive.create table empdtls (emp_id int ); { This is called Internal or Managed table} create external table empdtls_ext (emp_id int ); { This is External table } create external table empdtls_ext_v2 ( emp_id int ) location ‘/user/demo/hivetesting/’ { This is also called External table} Now, we will check the TABLE_TYPE of all above tables using command DESCRIBE FORMATTED table_name Internal Or Managed Table/user/hive/warehouse/ is the location where all…

get position of all delimiter values in a string

Today, I will explain about “How to get position and values separated by delimiter in a string”. Delimiter may be comma,pipe,colon etc. STEP1 : Get position of all delimiters in a string. I am explaining with SQL-server as Database. ;with T as ( select 0 as row_id, charindex(‘,’, @string) pos union all select pos + 1, charindex(‘,’, @string, pos + 1) from T where pos > 0)select * from T For example , consider a string “Harsimran,Kaur,91-123-456-7890” . So, it has first name and last name and dummy mobile number.…

Check duplicate record in Hive

Today, I will discuss about ” How to automate the process where in you can check entire row duplicate record in hive”. As I have mentioned in all Automation blogs, I will share the pseudo code.STEP1: In hive , use “desc table_name” , this command will give you column names along with datatype and data length. Store the output of this command in a file , say HIVE_TABLE_DDL.txt STEP2 : Read the file HIVE_TABLE_DDL.txt using “cat” command. cat HIVE_TABLE_DDL.txt | awk ‘{print $1}’ ORS=’,’ | sed ‘s/,$//’ * awk'{print $1}’…

create table using file having variable length for each field in hive

Today, I will discuss about to how to create Hive table using file in which fields length is different for all fields .To put in differently, it is neither fixed length file nor delimited file.Below is the example of file contentDefinition of a fileFIELD1, length 2, value are 10 ,10FIELD2, length 5 , value are 5,125FIELD3, length 6 , value are 3,12FIELD4, length 7, value are ABC,EFFIELD5, length 4, value are 12,21FIELD6, length 10, value are 15,10 In order to create table for such files, use the below syntax for…