DYNAMICALLY SPLIT FIELDS IN PENTAHO

Today, I will discuss how to use the Split Fields component when the number of delimiter occurrences varies. In the last blog, we saw that if the number of delimiter occurrences varies, the Split Fields component does not work effectively. In other words, the Split Fields component alone is not sufficient to get the proper results. Please see the image below, which demonstrates using the Split Fields component dynamically based on the count of delimiter occurrences. Here, a Switch/Case component and a Java expression are added. Below is the Java script used to get the occurrence of the delimiter.…
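As a rough illustration of the idea (not the actual expression from the post, which is not reproduced here), counting the delimiter and picking a branch from that count could look like the plain Java sketch below. The class name, method names and branch labels are all hypothetical.

```java
// Sketch only: plain Java, not the expression used in the original transformation.
// DelimiterCounter, countOccurrences, chooseBranch and the branch labels are hypothetical.
final class DelimiterCounter {

    // Counts how many times the delimiter occurs in the value.
    static int countOccurrences(String value, String delimiter) {
        if (value == null || delimiter == null || delimiter.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        // Jump from one match to the next until indexOf finds nothing.
        while ((from = value.indexOf(delimiter, from)) != -1) {
            count++;
            from += delimiter.length();
        }
        return count;
    }

    // Mimics a switch/case routing: each count goes to a differently configured split.
    static String chooseBranch(int delimiterCount) {
        switch (delimiterCount) {
            case 0:
                return "no-split";
            case 1:
                return "split-into-2";
            case 2:
                return "split-into-3";
            default:
                return "default";
        }
    }

    public static void main(String[] args) {
        int count = countOccurrences("a;b;c", ";");
        System.out.println(count + " -> " + chooseBranch(count)); // prints "2 -> split-into-3"
    }
}
```

The count, rather than the raw value, is what decides the routing, so each branch can use a split with a fixed number of output fields.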

SPLIT FIELDS COMPONENT IN PENTAHO

Today, I will discuss the “Split Fields” component in Pentaho. The Split Fields component splits the value of a variable into multiple variables based on a delimiter. For example, consider a variable var1 with the value a;b;c. I need to split it into three different values: a, b and c. To achieve this, use the Split Fields component in Pentaho with “;” as the delimiter. See the image below for the same. Use Generate Rows to generate one variable with the value “a;b;c”. Then use the Split Fields component with “;” as the delimiter. See the…
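For readers who want to see the same split outside Pentaho, here is a minimal plain Java sketch of what splitting “a;b;c” on “;” produces. It only illustrates the behaviour and is not how the component is implemented; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch only: SplitFieldsSketch and splitFields are hypothetical names.
final class SplitFieldsSketch {

    // Splits the value on a literal delimiter.
    static List<String> splitFields(String value, String delimiter) {
        // Pattern.quote makes the delimiter literal (String.split expects a regex);
        // the limit -1 keeps trailing empty fields instead of dropping them.
        return Arrays.asList(value.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        System.out.println(splitFields("a;b;c", ";")); // prints [a, b, c]
    }
}
```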

CREATE SQL SERVER INSTANCE IN AWS

Today, I will discuss how to create a SQL Server instance in AWS. First of all, log in to your AWS account. Go to Services -> RDS. The screen shown in the image below will appear. Select SQL Server as the database engine. Check the “Only enable options for RDS Free Usage Tier” checkbox if you are using a free tier account. Once you check the checkbox, it will automatically choose the option “SQL Server Express Edition”. If you are working on a project, choose the option as per the client requirements. See the below image…

HOW TO GENERATE ACCESS AND SECRET KEY IN AWS

Today, I will discuss how to generate an Access key and Secret key in AWS. First, go to “My Account credentials”. See the image below for the same. Then go to Access keys (Access key ID and secret access key). Then click on “Generate Access key”. Once you click on this button, “rootkey.csv” will be generated, which contains the Access key and Secret key.

HOW TO CREATE AN S3 BUCKET IN AWS

Today, I will discuss how to create an S3 bucket in AWS. Log in to your AWS account. Go to S3. Click on “create bucket”. See the image below for the same. Once you click on “create bucket”, it will ask for a bucket name. Enter the bucket name (I created the bucket with the name “s3-tech”) and click on Next. Then click on Next again. (As I am using a free tier account, a few of the functionalities are disabled.) Then choose the value “Do not grant public read access to this bucket” for “Manage public permission”…

S3 FILE OUTPUT IN PENTAHO

Today, I will share my experience with the S3 file output component in Pentaho. This component is used when you want to create a file (with data in it) in an S3 bucket. Below is the ETL code for the same. In the above transformation, the CSV file input and S3 file output components are used. In this code, I am just copying the data from a file on my local machine to the S3 bucket (s3-techie, as mentioned in my last blog). In the S3 file output, mention the values for S3…

S3 FILE INPUT IN PENTAHO

Today, I will share my experience of using the S3 file input component in Pentaho. Before we start discussing the ETL code, you should have your S3 Access key and S3 Secret key handy in order to connect to the Amazon S3 bucket. Secondly, if you want to test the ETL code below and you are not using an S3 bucket for file storage, you can create your own Amazon account (free for one year with limited features and limited storage). Below is the ETL code regarding “How to…

MERGE THE MULTIPLE CSV FILES INTO ONE FILE IN PENTAHO

Today, I will discuss how to merge multiple CSV files into one CSV file in Pentaho. Below is the image of the ETL code for the same. First, create a transformation which will load the CSV file names into a variable. Below is the code for the same. In the Get File Names component, use a wildcard expression to fetch all CSV files matching a particular pattern, and click on Show filenames, which will show the file names along with their absolute paths. Here, I have considered a scenario where employee salary generated in csv…
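To make the overall idea concrete outside Pentaho, the plain Java sketch below collects files by a wildcard pattern and appends them into one output file. It rests on assumptions that are not in the post: every input file starts with the same header row (kept only once), and the file names follow a hypothetical pattern `emp_salary_*.csv`. The class and method names are hypothetical too.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: all names and the file pattern below are hypothetical.
final class CsvMergeSketch {

    // Merges every file in inputDir matching the glob into output; returns the file count.
    // Assumption: each input file begins with the same header row.
    static int mergeCsvFiles(Path inputDir, String glob, Path output) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, glob)) {
            for (Path p : stream) {
                // Never read the output file itself if it already exists in the same folder.
                if (!p.toAbsolutePath().equals(output.toAbsolutePath())) {
                    files.add(p);
                }
            }
        }
        Collections.sort(files); // deterministic order, by path

        try (BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            boolean headerWritten = false;
            for (Path file : files) {
                List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
                for (int i = 0; i < lines.size(); i++) {
                    if (i == 0 && headerWritten) {
                        continue; // skip the repeated header of later files
                    }
                    writer.write(lines.get(i));
                    writer.newLine();
                }
                if (!lines.isEmpty()) {
                    headerWritten = true;
                }
            }
        }
        return files.size();
    }

    // Small self-contained demo using a temporary folder and made-up data.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("csv-merge-demo");
            Files.write(dir.resolve("emp_salary_1.csv"), Arrays.asList("id,salary", "1,100"));
            Files.write(dir.resolve("emp_salary_2.csv"), Arrays.asList("id,salary", "2,200"));
            Path merged = dir.resolve("merged.txt");
            mergeCsvFiles(dir, "emp_salary_*.csv", merged);
            return Files.readAllLines(merged, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

If the input files have no header row, the header check can simply be removed so that every line is copied.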

EXECUTE THE JOB USING DATA CLEANER IN PENTAHO

In my last post, I explained how to create different data sources in DataCleaner. Today, I will use the same data source, which is a CSV input file, and design a job in the DataCleaner tool. Once the job is complete, we will use the DataCleaner component in Pentaho and execute the same job using Pentaho. First, once you open the DataCleaner tool using Pentaho, as I mentioned in my previous post, click on New -> Build New Job. See the screenshot below for the same. Then, select…

CHANGE THE USER INTERFACE OF PENTAHO

Today, I will discuss how to change the UI (User Interface) of Pentaho. Below is the Welcome page of Pentaho Data Integration. See the green highlighted part. In order to change the content and image of the Welcome page, you need to change index.html. The location of this file is <path>\data-integration\docs\English\welcome. I have changed the below lines of index.html. <div class="header-navigation"> <div class="header-navigation-item">WELCOME TO PDI</div> <div class="header-navigation-item">MEET PDI FAMILY</div> <div class="header-navigation-item">CREDITS</div> <div class="header-navigation-item">WHY ENTERPRISE EDITION</div> <div class="clear"></div> </div> <div class="headerContents"> <h1 class="large lineheight45">How to get Most<br>From Pentaho</h1>…