Cassandra Input component in Pentaho

Today, I will discuss how to use the Cassandra Input component in Pentaho. The first and foremost prerequisite is that the Cassandra database should be downloaded and installed on your local machine. Once the installation is complete, you can start the Apache Cassandra service using the command below: cassandra.bat -f (this batch file is present inside the bin folder). In order to create a sample table in the Cassandra database, open another command prompt session and run the cqlsh command (run this command inside the bin folder), which helps to run…
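As an illustration of the "sample table" step, a minimal keyspace and table could be created from the cqlsh prompt like this. The keyspace, table, and column names below are hypothetical, not taken from the original post:

```sql
-- Hypothetical keyspace and table for trying out the Cassandra Input step
CREATE KEYSPACE IF NOT EXISTS demo_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

USE demo_ks;

CREATE TABLE IF NOT EXISTS customer (
  customer_id int PRIMARY KEY,
  customer_name text
);

INSERT INTO customer (customer_id, customer_name) VALUES (1, 'Shivani');
```

Once a table like this exists, the Cassandra Input step can be pointed at the keyspace and queried with CQL.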

Call stored procedure using Table Input component in Pentaho

Today, I will discuss a scenario that may come up during development, where we need to call a procedure or function using the Table Input component. Here, I have created a sample function and procedure in a SQL Server database. See the code snippets. See the data present in the TBL_DIM_CUSTOMER table. Now, we will call the function using a SELECT statement inside the Table Input component. See the screenshot below. Here, I have passed the customer_id value as 5. Click Preview and you will get the required result. In a similar fashion, we…
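The pattern of calling a scalar function from a SELECT inside the Table Input step can be sketched outside Pentaho. The sketch below uses SQLite (not SQL Server) purely to illustrate the idea; the function name and data are made up:

```python
import sqlite3

# Hypothetical stand-in for a SQL Server scalar function: it just derives
# a display code from the customer id.
def fmt_customer_code(cid):
    return f"CUST-{cid:04d}"

conn = sqlite3.connect(":memory:")
conn.create_function("fmt_customer_code", 1, fmt_customer_code)
conn.execute("CREATE TABLE TBL_DIM_CUSTOMER (customer_id INTEGER, customer_name TEXT)")
conn.executemany("INSERT INTO TBL_DIM_CUSTOMER VALUES (?, ?)",
                 [(5, "Anil"), (6, "Ashok")])

# This SELECT is the kind of statement you would paste into Table Input,
# with the literal 5 in place of the ? parameter.
row = conn.execute(
    "SELECT customer_name, fmt_customer_code(customer_id) "
    "FROM TBL_DIM_CUSTOMER WHERE customer_id = ?", (5,)
).fetchone()
print(row)  # ('Anil', 'CUST-0005')
```

In SQL Server the equivalent Table Input statement would call the user-defined function directly, e.g. a SELECT over dbo.your_function(5).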

Dimension Lookup in Pentaho

Today, I will discuss the Dimension Lookup component in Pentaho. I have seen various blogs and forums where users share the issues they encountered while implementing this component. One of the major issues is null values in the dimension tables. There are two scenarios. 1. When you use this component with “Use Auto Incremental Key” for the technical key field, it inserts a null row on every run, which is a concern. 2. When you use this component with “Use table Maximum + 1” for the technical key field, then…
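The “Use table Maximum + 1” strategy boils down to a simple query for the next technical key. A minimal sketch against SQLite, with a hypothetical dim_customer table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_tk INTEGER, customer_id INTEGER)")

def next_key(c):
    # COALESCE handles the empty-table case, so the first key generated is 1
    return c.execute(
        "SELECT COALESCE(MAX(customer_tk), 0) + 1 FROM dim_customer"
    ).fetchone()[0]

print(next_key(conn))  # 1 on an empty dimension
conn.execute("INSERT INTO dim_customer VALUES (1, 100)")
print(next_key(conn))  # 2 after one row exists
```

This is only the key-generation piece; the Dimension Lookup step layers versioning and date ranges on top of it.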

How to pass parameters in Carte

Today, I will discuss calling a job using Carte with parameters. I have already explained how to set up Carte on a Windows machine and how to execute a job or transformation using Carte. https://www.allabouttechnologies.co.in/pentaho/run-transformation-as-web-service-using-carte-in-pentaho/ Now, we will extend this by passing a parameter to a job. Here I have a job where I have set the parameter “file_name”. See the images below. This file_name parameter is used inside the transformation, as you can see below. Now, we will run the job using Carte. In order to do this, first…
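To my understanding, Carte's executeJob servlet accepts the job location and any named parameters as query-string arguments. A sketch of building such a request URL, where the host, port, job path, and parameter value are all hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical Carte host/port and job path; file_name is the job parameter
# from the post, passed as an extra query-string argument.
base = "http://localhost:8081/kettle/executeJob/"
params = {
    "job": "C:/pentaho/jobs/load_file.kjb",
    "level": "Basic",
    "file_name": "customers_2024.csv",
}
url = base + "?" + urlencode(params)
print(url)
# The actual call could then be made with any HTTP client, supplying the
# Carte credentials, e.g. requests.get(url, auth=("cluster", "cluster"))
```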

Query to get all records not matching in all tables.

Today, I will discuss a scenario/problem I faced in my current project.

Problem statement: you have two Employee tables with the data mentioned below.

Emp-1 (EmpId, EmpName): (1, Shivani), (2, Gaurav), (3, Radhe), (4, Rahul), (5, Anil)
Emp-2 (EmpId, EmpName): (1, Shivani), (6, Ashok), (3, Radhe), (7, Vikram), (4, Rahul)
Output (EmpId, EmpName): (2, Gaurav), (5, Anil), (6, Ashok), (7, Vikram)

Solution: here, you need to use EXCEPT or MINUS (depending on the database you are using) together with the UNION clause. See the below query for the same.…
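Since the query itself is cut off in the excerpt, here is one way the EXCEPT + UNION approach can be written, verified below against SQLite with the sample data (on Oracle, replace EXCEPT with MINUS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp1 (EmpId INTEGER, EmpName TEXT);
CREATE TABLE emp2 (EmpId INTEGER, EmpName TEXT);
INSERT INTO emp1 VALUES (1,'Shivani'),(2,'Gaurav'),(3,'Radhe'),(4,'Rahul'),(5,'Anil');
INSERT INTO emp2 VALUES (1,'Shivani'),(6,'Ashok'),(3,'Radhe'),(7,'Vikram'),(4,'Rahul');
""")

# Rows in emp1 but not emp2, UNIONed with rows in emp2 but not emp1.
# Subqueries keep the EXCEPT/UNION precedence unambiguous.
rows = conn.execute("""
    SELECT * FROM (SELECT EmpId, EmpName FROM emp1
                   EXCEPT
                   SELECT EmpId, EmpName FROM emp2)
    UNION
    SELECT * FROM (SELECT EmpId, EmpName FROM emp2
                   EXCEPT
                   SELECT EmpId, EmpName FROM emp1)
    ORDER BY EmpId
""").fetchall()
print(rows)  # [(2, 'Gaurav'), (5, 'Anil'), (6, 'Ashok'), (7, 'Vikram')]
```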

Avro Output in Pentaho

Today, I will discuss the Avro Output component in Pentaho. In my previous blog, I shared my experience with the Avro Input component, where data deserialization happens. In this component, data serialization happens. So, if you have data in text format, you can convert it into Avro format as well. As soon as you do this conversion, a schema file also gets generated along with the Avro file. All of this can be achieved through the Avro Output component in Pentaho. I have designed a very simple transformation wherein we have a CSV…
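Producing real Avro files needs an Avro library, but the underlying serialization idea, writing fields as bytes according to a schema, can be sketched with the standard library. This is a simplified stand-in, not the Avro wire format:

```python
import struct

# Simplified "schema": field names plus struct format codes (not real Avro).
schema = [("customer_id", "i"), ("amount", "d")]
record = {"customer_id": 5, "amount": 199.5}

# Serialize: pack each field into bytes in schema order.
fmt = "<" + "".join(code for _, code in schema)
blob = struct.pack(fmt, *(record[name] for name, _ in schema))
print(len(blob))  # 12 bytes: 4 for the int, 8 for the double
```

The Avro Output step plays a similar role: it writes the rows in binary form and emits the schema that describes the layout alongside them.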

Avro File Input In Pentaho

Today, I will discuss the Avro Input component in Pentaho. Avro uses the concepts of serialization and deserialization. Serialization means encoding the data into a binary format. Clearly, data in binary format is unreadable, which makes it an effective way to transfer data over a network. Therefore, many organizations are adopting this technique due to data-security concerns. Deserialization means converting the binary-formatted data back into a readable form. Now the question is how binary data is deserialized. Here comes the concept of the schema file.…
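The role the schema file plays, telling the reader how to interpret otherwise opaque bytes, can be sketched with the standard library. This is a simplified illustration, not Avro's actual encoding:

```python
import struct

# Simplified schema stand-in: the reader must know the field layout to
# make sense of the binary data (the job Avro's schema file does).
schema = [("customer_id", "i"), ("amount", "d")]
fmt = "<" + "".join(code for _, code in schema)

blob = struct.pack(fmt, 5, 199.5)  # the serialized (binary) data

# Deserialization: without the schema, blob is just 12 opaque bytes.
values = struct.unpack(fmt, blob)
record = dict(zip((name for name, _ in schema), values))
print(record)  # {'customer_id': 5, 'amount': 199.5}
```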

EDIT THE DATA IN HIVE TABLES

In Hive, we know that it works on a file-reading mechanism, where Hive reads data from files present in the Hadoop file system. The prerequisite here is that you should have basic knowledge of Hive. STEP-1: Copy the Hadoop files of a particular partition for that Hive object to your local server using the get command. For example: hdfs dfs -get /hadoop-server-details/path-of-the-file/partition-column-name=value /home/user1/ Here, the assumption is that the file format of the files mapped to the Hive object is not a normal text file (say, an Avro file). So, it is recommended to copy…
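The copy step above can be sketched as a thin wrapper around the hdfs CLI. The paths are the placeholders from the post, and the actual subprocess call is left commented out so the sketch does not require a Hadoop client:

```python
import shlex

# Placeholder paths from the post; replace with your cluster's values.
hdfs_path = "/hadoop-server-details/path-of-the-file/partition-column-name=value"
local_dir = "/home/user1/"

# Equivalent of: hdfs dfs -get <hdfs_path> <local_dir>
cmd = ["hdfs", "dfs", "-get", hdfs_path, local_dir]
print(shlex.join(cmd))
# On a machine with the Hadoop client installed, you would run:
# subprocess.run(cmd, check=True)
```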

BEELINE COMMAND LINE IN HIVE

Today, I will discuss the Beeline command line, which we use to run SQL queries from Linux. But what if the same thing needs to be called through shell scripting? First of all, we need to call the SQL query through the Beeline command line inside a shell script using the command below. beeline -u "jdbc:hive2://localhost:10000/default;principal=hive/localhost" -n "username" -p "password" --hivevar var1=$col1_hive --hivevar var2=$schema_name --hivevar var3=$table_name --hivevar var4=$col1_value -f sql_script.sql > text.log Here, $col1_hive is the column name of a table, $table_name is the table name, and $schema_name is the schema name where that…
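Inside sql_script.sql, the variables passed with --hivevar are referenced as ${var1} (or ${hivevar:var1}), and Beeline substitutes the values before running the query. That substitution can be sketched like this; the query text and values are made-up examples:

```python
import re

# Values the shell would pass via --hivevar var1=$col1_hive etc.
hivevars = {"var1": "customer_id", "var2": "sales", "var3": "orders", "var4": "5"}

sql = "SELECT * FROM ${var2}.${var3} WHERE ${var1} = ${var4};"

# Beeline-style substitution: replace ${name} / ${hivevar:name} with its value.
def substitute(text, variables):
    return re.sub(
        r"\$\{(?:hivevar:)?(\w+)\}",
        lambda m: variables[m.group(1)],
        text,
    )

print(substitute(sql, hivevars))
# SELECT * FROM sales.orders WHERE customer_id = 5;
```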

INTERVIEW QUESTIONS IN PENTAHO SET-4

1. How do you create ETL code that behaves the same as a CASE statement in a database? Name the component you would use for this.
2. Can we run ETL code until a condition is met, i.e. an infinite loop like a while statement with 1==1 in a shell script?
3. How do you get the number of fields in a CSV file in Pentaho without using a shell script?
4. Have you ever used the Sample Rows component? If yes, where did you actually need it?
5. What is Apache Kafka? How would you set it up in…