DERIVED VARIABLES IN PENTAHO

Today, I will discuss derived variables in Pentaho. The term “derived variables” means a variable name or variable value derived from a particular variable or its value. To demonstrate this, I have designed an ETL job. See the image below. In this transformation, I used Get System Info, where I defined one variable, “date_value”, with the system date (fixed) as its value. After this component, I added a JavaScript component, where I created more variables whose values are derived from the date_value variable.…
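
As a rough illustration of the idea only (written in Python, not the JavaScript used inside the transformation), the sketch below derives a few values from a single date_value. The derived names year_value, month_value and file_suffix are hypothetical and are not taken from the original job.

```python
from datetime import date


def derive_variables(date_value: date) -> dict:
    """Derive several values from one base date.

    All derived names here are hypothetical examples.
    """
    return {
        "date_value": date_value.isoformat(),
        "year_value": date_value.year,
        "month_value": date_value.month,
        # Compact form that could be appended to an output file name.
        "file_suffix": date_value.strftime("%Y%m%d"),
    }


# A fixed date stands in for the system date so the result is repeatable.
derived = derive_variables(date(2020, 1, 15))
print(derived)
```

Changing only date_value changes every derived value, which is the point of deriving them from one source variable.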

INTERVIEW QUESTIONS IN PENTAHO SET-2

Today, I will discuss interview questions on Pentaho. In my previous blog, I shared 15 interview questions. You can go through those as well using the link below: Interview Questions on Pentaho. Below is the new set of interview questions on Pentaho. 1. What is Carte in Pentaho, and in which scenarios can we use Carte services? 2. How can we dynamically split a value into multiple variables? 3. What is AEL (Adaptive Execution Layer) in Pentaho? 4. How can we change the user interface of Pentaho BI…

HOW TO GET FILENAMES PRESENT INSIDE THE ZIP FILE

Today, I will discuss how to get the filenames present inside a zip file without unzipping it. Here, the “Get File Names” component will be used. See the image below. In the Get File Names component, the file directory is set to zip://C:/data-integration/SALARIES.zip and the wildcard is “.*”. The catch here is the “zip:” prefix. Click on Show filenames. It will list all the file names present inside the zip file. See the image below. Run the transformation and see the Pentaho logs. Write to…
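
For comparison outside Pentaho, the same idea (reading the archive's entry names without extracting anything) can be sketched in Python with the standard zipfile module. The archive here is built in memory and its entry names are made up for the example; they are not the contents of SALARIES.zip.

```python
import io
import zipfile


def list_zip_entries(zip_source) -> list:
    """Return the entry names of a zip archive without extracting it."""
    with zipfile.ZipFile(zip_source) as archive:
        return archive.namelist()


# Build a small in-memory archive so the sketch is self-contained.
# The entry names below are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("salaries_jan.csv", "emp_id,amount\n1,100\n")
    archive.writestr("salaries_feb.csv", "emp_id,amount\n1,120\n")
buffer.seek(0)

names = list_zip_entries(buffer)
print(names)
```

list_zip_entries also accepts a path on disk in place of the in-memory buffer.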

DYNAMICALLY SPLIT FIELDS IN PENTAHO

Today, I will discuss using the Split Fields component based on the number of delimiter occurrences. In the last blog, we saw that if the number of delimiter occurrences varies, the Split Fields component does not work effectively. In other words, the Split Fields component alone is not sufficient to get the proper results. Please see the image below, which demonstrates using the Split Fields component dynamically, based on the count of delimiter occurrences. Here, a Switch/Case component and a Java expression are added. Below is the JavaScript used to get the number of delimiter occurrences.…
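
The approach can be sketched in Python as a loose analogue, not the script from the transformation: count the delimiter first, then split according to that count. The function name, the max_fields limit and the choice to pad missing fields with None are all assumptions made for this example.

```python
def split_by_delimiter_count(value: str, delimiter: str = ";", max_fields: int = 4):
    """Count the delimiter, then split into a fixed number of output fields.

    Returns (delimiter_count, fields). Missing fields are padded with None,
    which is an assumption of this sketch.
    """
    delimiter_count = value.count(delimiter)
    if delimiter_count + 1 > max_fields:
        raise ValueError("more fields than the output layout allows")
    fields = value.split(delimiter)
    # Pad so every row has the same number of output fields.
    fields += [None] * (max_fields - len(fields))
    return delimiter_count, fields


two_part = split_by_delimiter_count("a;b")
three_part = split_by_delimiter_count("a;b;c")
print(two_part)
print(three_part)
```

The returned count plays the role of the value a Switch/Case step would branch on.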

SPLIT FIELDS COMPONENT IN PENTAHO

Today, I will discuss the “Split Fields” component in Pentaho. The Split Fields component splits the value of a variable into multiple variables based on a delimiter. For example, consider a variable var1 with the value a;b;c. I need to split it into three different values: a, b and c. To achieve this, use the Split Fields component in Pentaho with “;” as the delimiter. See the image below. Use Generate Rows to generate one variable with the value “a;b;c”. Then use the Split Fields component with “;” as the delimiter. See the…

S3 FILE OUTPUT IN PENTAHO

Today I will share my experience with the S3 File Output component in Pentaho. This component is used when you want to create a file (with data in it) in an S3 bucket. Below is the ETL code for the same. In this transformation, the CSV File Input and S3 File Output components are used. In this code, I am just copying the data from a file on my local machine to the S3 bucket (s3-techie, as mentioned in my last blog). In the S3 File Output component, mention the values for S3…

S3 FILE INPUT IN PENTAHO

Today I will share my experience of using the S3 File Input component in Pentaho. Before we discuss the ETL code, you should have your S3 access key and S3 secret key handy in order to connect to the Amazon S3 bucket. Secondly, if you want to test the ETL code below and you are not using an S3 bucket for file storage, you can create your own Amazon account (free for one year with limited features and limited storage). Below is the ETL code regarding “How to…

MERGE THE MULTIPLE CSV FILES INTO ONE FILE IN PENTAHO

Today, I will discuss how to merge multiple CSV files into one CSV file in Pentaho. Below is the image of the ETL code for the same. First, create a transformation which will load the CSV file names into a variable. Below is the code for the same. In the Get File Names component, use a wildcard expression to fetch all CSV files matching a particular pattern, and click on Show filenames, which will show the file names along with their absolute paths. Here, I have considered a scenario where employee salary generated in csv…
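
Outside Pentaho, the same merge can be sketched in Python: collect the files that match a wildcard pattern, then append their rows to one output file. This is a minimal sketch that assumes every input file has the same header row; the file names and columns are invented for the example.

```python
import csv
import glob
import os
import tempfile


def merge_csv_files(pattern: str, output_path: str) -> int:
    """Merge all CSV files matching pattern into output_path.

    Assumes every file shares the same header; the header is written once.
    Returns the number of data rows written.
    """
    rows_written = 0
    header_written = False
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in sorted(glob.glob(pattern)):
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


# Demo with two hypothetical salary files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    for month, amount in (("jan", "100"), ("feb", "120")):
        file_path = os.path.join(workdir, f"salary_{month}.csv")
        with open(file_path, "w", newline="") as f:
            f.write("emp_id,amount\n1," + amount + "\n")
    merged_path = os.path.join(workdir, "merged.csv")
    total_rows = merge_csv_files(os.path.join(workdir, "salary_*.csv"), merged_path)
    with open(merged_path, newline="") as f:
        merged_rows = list(csv.reader(f))

print(total_rows, merged_rows)
```

The output file name is chosen so that it does not match the input pattern; otherwise the merged file would be picked up as an input on a later run.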

EXECUTE THE JOB USING DATA CLEANER IN PENTAHO

In my last post, I explained how to create different data sources in Data Cleaner. Today, I will use the same data source, a CSV input file, and design a job in the Data Cleaner tool. Upon completion of the job, we will use the Data Cleaner component in Pentaho and execute the same job using Pentaho. First, once you open the Data Cleaner tool from Pentaho, as described in my previous post, click on New-> Build New Job. See the screenshot below. Then, select…

CHANGE THE USER INTERFACE OF PENTAHO

Today, I will discuss how to change the UI (user interface) of Pentaho. Below is the Welcome page of Pentaho Data Integration. See the green highlighted part. To change the content and image of the Welcome page, you need to change index.html. This file is located at <path>\data-integration\docs\English\welcome. I have changed the following lines of index.html. <div class="header-navigation"> <div class="header-navigation-item">WELCOME TO PDI</div> <div class="header-navigation-item">MEET PDI FAMILY</div> <div class="header-navigation-item">CREDITS</div> <div class="header-navigation-item">WHY ENTERPRISE EDITION</div> <div class="clear"></div> </div> <div class="headerContents"> <h1 class="large lineheight45">How to get Most<br>From Pentaho</h1>…