Avro Output in Pentaho

Today, I will discuss the Avro Output component in Pentaho. In my previous blog, I shared my experience with the Avro Input component, where data deserialization happens.

In this component, data serialization happens. So, if you have data in a text format, you can convert it to Avro format. When you do this conversion, a schema file is also generated along with the Avro file. All of this can be achieved through the Avro Output component in Pentaho.

I have designed a very simple transformation in which we have a CSV file input and an Avro Output component. Below is the screenshot of it.

Avro-Output Transformation

For the CSV file input, consider a simple CSV file that contains employee data. See the screenshot below.

Now, in Avro Output, go to the “Fields” tab and click “Get Fields” to pull in the fields coming from the source, which is the CSV file. Also give the path and filename of the Avro file you want to generate. Then go to the “Schema” tab and fill in the Avsc filename, Namespace, RecordName and DocValue. See the screenshot below.
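To give an idea of what the “Schema” tab values end up describing, here is a minimal sketch in Python that builds an Avro record schema (the content of an .avsc file, which is plain JSON). The record name, namespace, doc text and field names are hypothetical values I made up for an employee file; they are not taken from the transformation above, and the schema Pentaho actually writes may differ, for example in how it types nullable fields.

```python
import json


def build_avsc(record_name, namespace, doc, fields):
    """Return an Avro record schema as a plain dict.

    `fields` is a list of (name, avro_type) pairs.
    """
    return {
        "type": "record",
        "name": record_name,      # RecordName on the Schema tab
        "namespace": namespace,   # Namespace on the Schema tab
        "doc": doc,               # DocValue on the Schema tab
        "fields": [{"name": n, "type": t} for n, t in fields],
    }


# Hypothetical values, for illustration only.
schema = build_avsc(
    record_name="Employee",
    namespace="com.example.hr",
    doc="Employee records exported from a CSV file",
    fields=[("emp_id", "int"), ("emp_name", "string"), ("salary", "double")],
)

# An .avsc file is just this JSON text saved to disk.
avsc_text = json.dumps(schema, indent=2)
print(avsc_text)
```

Comparing this with the .avsc file that the transformation generates is a quick way to see how the fields from “Get Fields” map to Avro types.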

Once this is done, run the transformation. The Avsc and Avro files get generated at the paths mentioned in the screenshots above.
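If you want a quick sanity check on the generated file without opening it in another tool, here is a small sketch in Python. It relies on the fact that an Avro object container file starts with the four bytes `Obj` followed by the byte 1. The helper name `looks_like_avro` and the demo file are my own assumptions for illustration; in practice you would pass the path you entered in the Avro Output step. It only checks the header, not the records inside.

```python
import os
import tempfile

# Avro object container files begin with these four bytes.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(path):
    """Return True if the file at `path` starts with the Avro magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == AVRO_MAGIC


# Demo with stand-in files; replace with the real output path.
tmp_dir = tempfile.mkdtemp()
fake_avro = os.path.join(tmp_dir, "employee.avro")
plain_csv = os.path.join(tmp_dir, "employee.csv")

with open(fake_avro, "wb") as f:
    f.write(AVRO_MAGIC + b"rest-of-file-placeholder")
with open(plain_csv, "wb") as f:
    f.write(b"emp_id,emp_name\n1,Asha\n")

avro_ok = looks_like_avro(fake_avro)
csv_ok = looks_like_avro(plain_csv)
print(avro_ok, csv_ok)
```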
