
Metadata Injection component in Pentaho

Today I will discuss the Metadata Injection component in Pentaho, one of its most useful components. You may have come across a scenario where the layout of your input file changes, for example the number of columns differs from one file to the next; in other words, the file structure is dynamic. In such cases we can use the Metadata Injection component. I have created an ETL flow for such a case in PDI. See the screenshot below.
THE CRUX OF THE METADATA INJECTION COMPONENT IS THE INPUTS CONNECTED TO IT AND THE TRANSFORMATION THAT IS CALLED THROUGH IT

STEP1: Source Metadata
This sheet holds the metadata of the source data: the source column names (together with their data types) and the mapping of source columns to target columns. See the image below.

Here comes the tricky part. If you look at the fields closely, you will see that we have intentionally defined the metadata sheet so that its columns match the field settings of the Excel input component (explained in the sections below). You need to take care of this point.
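
To make the idea concrete, here is a minimal Python sketch of what such a metadata sheet carries, written as plain data instead of a spreadsheet. The key names (`source_name`, `source_type`, `target_name`), the sample rows and the `validate_metadata` helper are all assumptions made for illustration; the actual sheet used in this post may be laid out differently.

```python
# Hypothetical rows of a source metadata sheet: one row per source column.
SOURCE_METADATA = [
    {"source_name": "cust_id", "source_type": "Integer", "target_name": "customer_id"},
    {"source_name": "cust_nm", "source_type": "String", "target_name": "customer_name"},
    {"source_name": "ord_amt", "source_type": "Number", "target_name": "order_amount"},
]

# Keys that the consuming step is assumed to need for every field.
REQUIRED_KEYS = {"source_name", "source_type", "target_name"}


def validate_metadata(rows):
    """Return rows unchanged if each row has the required keys and source names are unique."""
    seen = set()
    for index, row in enumerate(rows):
        missing = REQUIRED_KEYS - set(row)
        if missing:
            raise ValueError(f"row {index} is missing {sorted(missing)}")
        if row["source_name"] in seen:
            raise ValueError(f"duplicate source column {row['source_name']!r}")
        seen.add(row["source_name"])
    return rows
```

A check like this is one way to catch a metadata sheet whose columns do not line up with what the input step expects before anything is injected.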

STEP2: Get FileNames
Here the input file is an Excel file, and we only need its filename along with its path. Hence we used the Get File Names component. See the figure below.
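
For readers who think in code, the job of this step can be sketched in Python as follows. This is only an analogy, not how PDI implements it; the function name `get_file_names` and the default `*.xlsx` pattern are assumptions.

```python
from pathlib import Path


def get_file_names(folder, pattern="*.xlsx"):
    """Return the absolute path of every file in `folder` that matches `pattern`, sorted."""
    return sorted(
        str(path.resolve())
        for path in Path(folder).glob(pattern)
        if path.is_file()
    )
```

The result is a list of full paths, which is the piece of information the later steps need in order to open the Excel file.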


STEP3: Target Metadata
This sheet holds the target fields and their data types. See the snippet below.
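
As a rough illustration of how target metadata can drive the output, the Python sketch below converts incoming values according to a list of target fields and types. The field names, the `CASTERS` table and the `cast_row` helper are hypothetical; only the idea of "field name plus data type per row" comes from the sheet described above.

```python
# Hypothetical target metadata: one entry per target field, in output order.
TARGET_METADATA = [
    {"name": "customer_id", "type": "Integer"},
    {"name": "customer_name", "type": "String"},
    {"name": "order_amount", "type": "Number"},
]

# Assumed mapping from the type names in the sheet to Python converters.
CASTERS = {"Integer": int, "Number": float, "String": str}


def cast_row(row, target_metadata):
    """Build a new row holding only the target fields, each converted to its declared type."""
    return {
        field["name"]: CASTERS[field["type"]](row[field["name"]])
        for field in target_metadata
    }
```

Because the field list is data and not code, adding or removing a target column only means editing the metadata.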

STEP4: MetaData Injection
Now comes the most important part of the flow: the metadata injection itself.
To make it easier to follow, I have again divided it into steps.
1. We have to select the ETL code (here, a transformation) that we want to run through this component.

In this example we are calling the transformation “Trigger-MI.ktr”. See the image of this ETL code below.

Here, the interesting part is that you don’t need to define any fields in the components of Trigger-MI.ktr. The Metadata Injection component and its three inputs take care of that.
2. Once you select the desired transformation in the “Browse” section, the Metadata Injection component displays the details of all the components in that transformation. The image below shows this.

Here, observe the following points.
Transformation name (Trigger-MI).
Source_Metadata: this is the name given to one of the components used in the flow. See the topmost image of the ETL flow.
The mapping of each field in the source metadata to the fields of the Excel input component, since Trigger-MI.ktr has an Excel input component.
3. Now comes the second part, where we map the filename. If you remember, we mentioned “Get-filename” earlier; this is where it comes in.
The renaming of source field names to target field names is also configured here, using the “Select values” component of Trigger-MI.ktr. The reason is that the names of the source and target fields may differ.


4. Now, the final part is mapping the target fields to the output component, which is the Text file output of Trigger-MI. See the image below.


Here, the fields of the Target Metadata sheet are mapped to the fields of the Text file output component.
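
The four steps above can be summed up in a small Python sketch. It is not PDI code and does not reflect how Pentaho works internally; it only imitates the pattern of a "template" that knows nothing about the fields until metadata is handed to it at run time. The function name `run_template`, the `;` delimiter and the sample column names are assumptions.

```python
import csv
import io


def run_template(rows, source_metadata, target_fields, out):
    """Template 'transformation': no field name is hard-coded in here.

    rows            -- iterable of dicts keyed by source column name
    source_metadata -- list of {"source_name", "target_name"} dicts (injected)
    target_fields   -- ordered list of target field names to write (injected)
    out             -- writable text stream standing in for the text file output
    """
    # Plays the role of the rename in "Select values": source name -> target name.
    rename = {m["source_name"]: m["target_name"] for m in source_metadata}
    writer = csv.DictWriter(
        out, fieldnames=target_fields, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Keep only the columns described in the metadata, under their target names.
        renamed = {rename[name]: value for name, value in row.items() if name in rename}
        # Write exactly the target fields; a field with no value is left empty.
        writer.writerow({field: renamed.get(field, "") for field in target_fields})


# Example run with hypothetical metadata and one input row.
buffer = io.StringIO()
run_template(
    rows=[{"cust_id": 1, "cust_nm": "Ann"}],
    source_metadata=[
        {"source_name": "cust_id", "target_name": "customer_id"},
        {"source_name": "cust_nm", "target_name": "customer_name"},
    ],
    target_fields=["customer_id", "customer_name"],
    out=buffer,
)
```

If tomorrow's file has an extra column, only the two metadata lists change; `run_template` stays as it is, which is the same benefit the Metadata Injection component gives you in PDI.
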
Please post your doubts or issues in the comment section.
