Check duplicate record in Hive

Today, I will discuss how to automate checking for duplicate records across an entire row in Hive. As I have mentioned in all my automation blogs, I will share the pseudo code.
STEP 1: In Hive, use "desc table_name". This command gives you the column names along with their data type and data length. Store the output of this command in a file, say HIVE_TABLE_DDL.txt.
STEP 2: Read the file HIVE_TABLE_DDL.txt using the "cat" command.
cat HIVE_TABLE_DDL.txt | awk '{print $1}' ORS=',' | sed 's/,$//'
* awk '{print $1}'…
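The excerpt is cut off before the later steps, so here is only a minimal sketch of how the column list from STEP 2 could be put to use. The sample "desc" output, the temporary file location, and the GROUP BY query are my assumptions for illustration, not the steps of the original post.

```shell
#!/bin/sh
# Hypothetical "desc table_name" output for a table with three columns.
DDL_FILE="${TMPDIR:-/tmp}/HIVE_TABLE_DDL.txt"
printf 'id\tint\nname\tstring\ncity\tstring\n' > "$DDL_FILE"

# Take the first field of every line and join the values with commas;
# sed removes the trailing comma left behind by ORS.
COLS=$(awk '{print $1}' ORS=',' "$DDL_FILE" | sed 's/,$//')
echo "$COLS"

# Assumption: one common way to find whole-row duplicates is to group
# by every column and keep the groups that occur more than once.
QUERY="SELECT $COLS, COUNT(*) AS cnt FROM table_name GROUP BY $COLS HAVING COUNT(*) > 1"
echo "$QUERY"
```

On a real table, the "desc" output may contain blank lines or partition information, which would need to be filtered out before building the column list.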

equivalent of sum of columns in SAS

Today, I will discuss how to design SAS code which is equivalent to the sum of columns in a database. Consider the employee details below. In order to get the sum of Year-2019 and Year-2020, we need to use the add (+) operator. See the query and data below. Here, I applied normal addition, a sum using the ISNULL function, and a difference using the absolute (ABS) function. To achieve the same in SAS, create the same datasets. See the screenshot below. Now, create…

features of SAS Tool

Today, I will discuss the features of SAS. As I started my career as a database developer, I will share the features which are similar to database commands and which will help you all in migration projects.
1. Connect to any database like SQL Server, Netezza, or Oracle using the connect to odbc command.
2. Perform nesting of SAS scripts inside one SAS script using the macro and include commands.
3. Perform union all and all joins (left outer, right outer, full outer) in SAS using the merge and set commands.
4. Read data from any file like csv, pipe(|)…

equivalent of union all in SAS

Today, I will discuss how to write SAS code which is equivalent to union all in a database. See the data set below in SQL Server. See the output of union all in SQL Server. Perform the same steps in SAS as well. First, create the same data set in SAS. The code below performs the union all logic in SAS.
* The tricky part here is renaming the columns; otherwise the data will not look like that of SQL Server. See the SAS output for the same.

equivalent of full outer join in SAS

Today, I will discuss how to design SAS code which is equivalent to a full outer join in a database. We created two datasets in SQL Server. See the result when we apply a full outer join on the above datasets. In SAS, create the datasets the same as the TEST1 and TEST2 tables.
* Always remember to sort the data. Make a habit of this in SAS.
Merge the above datasets using the code below. See the final output for the same.

equivalent of inner join in SAS

Today, I will discuss how to design code in SAS which is equivalent to an inner join in a database where the column names are different in the two datasets (or tables, in database terms). Below is the data in the two tables. Now, apply an inner join on both tables on the condition COL1=COL4; below is the output. As the next step, first create the same datasets in SAS and then apply logic on that data to get a result equivalent to the inner join. Create the dataset same as…

Athena object properties

Today, I will discuss two things in a single blog.
1. How to generate the DDL of an Athena object
2. How to check the properties of tables
Generate DDL of Athena Object
Look at the left pane in the Athena console, select the database, and click on the three vertical dots. See the screenshot below. Click on "Generate Create Table DDL".
Table Properties of Athena Object
Go to "Show Properties". Once you click on Show Properties, you will see the screen below. As you can see, it will tell you when…

shell script to check whether logs have errors or not

Today, I will discuss how to check log files at regular intervals, identify whether they contain any errors, and if so, send an email to the concerned team members. As mentioned in my previous blogs, I will be sharing pseudo code only.
STEP 1: Create a parameterized file which has all the details of the log file names. For example, consider the filename is logs.config and below is the content of the file:
USECASE_NAME|SERVER_LOGNAME_PATTERN
RECHARGE_USSD|rechargeUssdMode
STEP 2: Create the shell script in such a way that when you trigger…
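Since the description of STEP 2 is cut off, the script below is only a rough sketch of one way to drive the check from logs.config. The function name scan_logs, the search keyword ERROR, the sample log file, and the email address in the commented line are all my assumptions for illustration.

```shell
#!/bin/sh
# scan_logs CONFIG LOG_DIR (hypothetical helper):
# prints "USECASE|logfile|error_count" for every matching log with errors.
scan_logs() {
    config=$1
    log_dir=$2
    # Skip the header line, then read each "USECASE|PATTERN" pair.
    tail -n +2 "$config" | while IFS='|' read -r usecase pattern; do
        [ -n "$pattern" ] || continue
        for log in "$log_dir"/*"$pattern"*; do
            [ -f "$log" ] || continue
            # Assumption: error lines contain the keyword ERROR.
            count=$(grep -c 'ERROR' "$log")
            if [ "$count" -gt 0 ]; then
                echo "$usecase|$log|$count"
            fi
        done
    done
}

# Demo with a throwaway directory, a config file and one sample log.
WORK=$(mktemp -d)
printf 'USECASE_NAME|SERVER_LOGNAME_PATTERN\nRECHARGE_USSD|rechargeUssdMode\n' > "$WORK/logs.config"
printf 'INFO start\nERROR timeout\nERROR retry failed\n' > "$WORK/rechargeUssdMode_1.log"

REPORT=$(scan_logs "$WORK/logs.config" "$WORK")
echo "$REPORT"

# To notify the team, the report could be piped to a mailer, for example:
# [ -n "$REPORT" ] && echo "$REPORT" | mail -s "Errors found in logs" team@example.com
```

To run it at regular intervals, the script could be scheduled through cron.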

create table using file having variable length for each field in hive

Today, I will discuss how to create a Hive table from a file in which the field length is different for every field. To put it differently, it is neither a fixed-length file nor a delimited file. Below is an example of the file content.
Definition of the file:
FIELD1, length 2, values are 10, 10
FIELD2, length 5, values are 5, 125
FIELD3, length 6, values are 3, 12
FIELD4, length 7, values are ABC, EF
FIELD5, length 4, values are 12, 21
FIELD6, length 10, values are 15, 10
In order to create a table for such files, use the below syntax for…

check whether process is running or not

Today, I will discuss how to check for the existence of a process on a Linux server through a shell script. I will share the pseudo code in the form of steps.
* Create a shell script which runs in an infinite loop using while 1==1.
* Inside this while loop, check for the existence of the process on every 15th minute. This can be achieved by dividing the minutes part of the current timestamp by 15 using the MOD function. If it returns 0, go on to the next statements; otherwise, come out of the if statement.
* …
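The first two steps can be sketched roughly as follows. The remaining steps are cut off in the excerpt, so the function names, the process name my_process, the use of pgrep, and the 60-second sleep are my assumptions and not part of the original steps.

```shell
#!/bin/sh
# is_check_minute MINUTE: succeeds when MINUTE is a multiple of 15.
is_check_minute() {
    # Strip one leading zero so that values like 08 are not read as octal.
    minute=${1#0}
    [ $(( ${minute:-0} % 15 )) -eq 0 ]
}

# process_running NAME: succeeds when a process with exactly this name exists.
process_running() {
    pgrep -x "$1" > /dev/null 2>&1
}

# check_once MINUTE NAME: run the process check only on a 15th minute.
check_once() {
    if is_check_minute "$1"; then
        if process_running "$2"; then
            echo "$2 is running"
        else
            echo "$2 is NOT running"
        fi
    fi
}

# Show which minutes would trigger the check.
for m in 00 07 15 08 45; do
    if is_check_minute "$m"; then
        echo "$m: check"
    else
        echo "$m: skip"
    fi
done

# The infinite loop from the first step, left commented out so that
# this sketch terminates:
# while true; do
#     check_once "$(date +%M)" my_process
#     sleep 60
# done
```

Sleeping for 60 seconds between iterations keeps the loop from checking the same minute more than once.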