In my last post on DataCleaner, I covered the steps to install DataCleaner and integrate it with Pentaho. Today I will explain how to add a data source in DataCleaner and which types of data sources it offers. Below are the data sources available in DataCleaner.
- CSV file
- Excel file
- Access database
- dBase database
- Fixed-width text file
- XML file
- SQL Server
- Apache Hive
The image below shows these options.
When you click on any data source, say CSV file, the screen below appears.
Here, I have used the EMP_DETAILS.csv file. Once you fill in the details above, it will look like the image below.
Click on Register Datastore. As soon as you click it, the datastore appears under Datastore Management. See the image below.
Whatever we do at the UI level should also be reflected in some file. Here is a surprise, though: I am doing this on a Windows machine, and the file that holds all the datastore details is not getting updated. The location of the file is
Here, you need to add EMP_DETAILS.csv as a datastore in the conf.xml file yourself. Below is the content you need to add to this file.
description="Example CSV-file with representing employee’ details">
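For reference, a complete entry for EMP_DETAILS.csv might look roughly like the sketch below. Treat it as an assumption rather than the exact content of my file: the element names follow the csv-datastore form described in the DataCleaner reference documentation, while the file path, separator, quote character, encoding, and header line number are placeholder values that you should replace with the ones you entered on the registration screen.

```xml
<!-- Sketch only: goes inside the existing <datastore-catalog> element of conf.xml. -->
<datastore-catalog>
  <csv-datastore name="EMP_DETAILS.csv"
      description="Example CSV-file with representing employee’ details">
    <!-- Hypothetical path: point this at your own copy of the file. -->
    <filename>C:/data/EMP_DETAILS.csv</filename>
    <!-- Assumed values: match these to the settings chosen in the UI. -->
    <quote-char>"</quote-char>
    <separator-char>,</separator-char>
    <encoding>UTF-8</encoding>
    <fail-on-inconsistencies>true</fail-on-inconsistencies>
    <header-line-number>1</header-line-number>
  </csv-datastore>
</datastore-catalog>
```

If conf.xml already has a datastore-catalog element, add only the csv-datastore element inside it instead of creating a second catalog, and restart DataCleaner so that the file is read again.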
So, these are the steps to add a data source in DataCleaner.