How to create an Impala table using Parquet file format (Cloudera Impala)

Hey,

The idea behind Impala tables is to create them from impala-shell.
Because Impala uses the Hive metastore service, you will also be able to access those tables from Hive and Pig.

It is recommended to:
run INSERT statements using Hive (it is also possible via impala-shell),
run SELECT statements using Impala.

So, suppose you want to create an Impala table:
do NOT try to create the table from the Hive interface or command line.

The procedure should be:

1. Create the table from impala-shell
The general syntax of CREATE TABLE is:

CREATE TABLE table_name (
col1 type1,
col2 type2,
...
)
PARTITIONED BY (colx typex, ...)
ROW FORMAT row_format
STORED AS file_format
LOCATION 'hdfs_path';

For example:

CREATE EXTERNAL TABLE IF NOT EXISTS table_name (
col1 DOUBLE,
col2 INT
)
PARTITIONED BY (batch_id INT, date_day STRING)
STORED AS PARQUETFILE
LOCATION '/mnt/my_table';

Please make sure you are following this high level syntax.

2. Once the table has been created successfully, you will be able to access it via Hive, Impala, and Pig

hive> show tables;

impala-shell> show tables;

OR

impala-shell> show table stats table_name;
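
To double-check what was actually created, a couple of extra statements can help. This is only a sketch: it assumes the example table from step 1 and an Impala version that supports SHOW PARTITIONS (very old releases do not).

```sql
-- Show columns, partition keys, storage format and HDFS location of the table.
DESCRIBE FORMATTED table_name;

-- List the partitions Impala currently knows about.
SHOW PARTITIONS table_name;
```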

3. Insert data from Hive or impala-shell
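
For example, an insert into a single partition could look like the sketch below. It assumes the example table from step 1 and a hypothetical source table named staging_table with compatible col1 and col2 columns; the partition values are made up for illustration. This statement form should be accepted by both Hive and impala-shell.

```sql
-- staging_table is a hypothetical source table; replace it with your own.
-- The partition values (batch_id = 1, date_day = '2014-01-01') are illustrative only.
INSERT INTO TABLE table_name
PARTITION (batch_id = 1, date_day = '2014-01-01')
SELECT col1, col2
FROM staging_table;
```

Naming the partition explicitly keeps all the new files under one partition directory, which makes the later refresh and any clean-up easier to reason about.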
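4. Refresh the Impala table

refresh table_name;

OR

invalidate metadata table_name;

REFRESH is enough when new data files were added to a table Impala already knows about. INVALIDATE METADATA reloads the table's metadata completely and is needed when the table was created or altered outside Impala.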

5. Now you can enjoy running SELECT statements on your data from impala-shell.
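
As a rough illustration, a query against the example table from step 1 might look like this. The filter value is an assumption; filtering on a partition column such as batch_id lets Impala skip the partitions it does not need.

```sql
-- Count rows and average col1 per day for one batch.
-- batch_id = 1 is an illustrative value, not taken from real data.
SELECT date_day,
       COUNT(*)  AS row_count,
       AVG(col1) AS avg_col1
FROM table_name
WHERE batch_id = 1
GROUP BY date_day;
```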

Amiram.
