2018/07/20

Hive OpenCSVSerde - Impala error

SerDe library 'org.apache.hadoop.hive.serde2.OpenCSVSerde'


I have a Hive external table created on a CSV file using the SerDe - 'org.apache.hadoop.hive.serde2.OpenCSVSerde'.  The external table created OK and am able access from Hive.

But when I try to access the same table from Impala, it starts throwing the error - "SerDe library 'org.apache.hadoop.hive.serde2.OpenCSVSerde' is not supported."

Looks like Impala does not recognize the Hive's OpenCSVSerde.

To get around the problem, I had to follow these steps :


  1. Create Hive External Table - stored as TextFile
  2. Create Hive table ( managed ) using CTAS - stored as PARQUET
  3. Access the Parquet table from Impala

Hope if any one is facing the same issue will find this solution useful.

1 comment:

File Handling with Python

This little utility is for copying files from source to target directories.  On the way it checks whether a directory exists in the target, ...