Impala

 ################# Impala ######### (used to process structured data; more advanced than Hive)

Offers high performance and low latency for SQL queries (used for analyzing or accessing data present in HBase or an S3 bucket) -- uses the Hive metastore
In-memory data processing
Used to process structured data (does not support transactional data) -- for analytical users only, so not suitable for real-time data (if the Impala catalog service or the Hive metastore is down, you can't access Impala)
Works with data that is already present in HDFS, HBase, or S3 storage (easy data access)
Compatible with Hive syntax
Easy integration with multiple tools such as Tableau
Supports Kerberos authentication

****** 3 daemons run in Impala -- master/slave architecture: ID, ISS, ICS
ID -- Impala daemon (impalad) -- slave -- a highly memory-intensive daemon -- coordinates the querying (analyzing, accessing, querying data) -- 128 GB RAM recommended
ISS -- Impala statestore (statestored) -- master -- monitors the service state info of all Impala daemons
ICS -- Impala catalog server (catalogd) -- master -- provides metadata info (from the Hive metastore) to the Impala daemons

Impala and Hive use a common metastore, i.e. the Hive metastore (Hive and Impala run the same queries)

PARQUET (column-oriented format)
If the files are in Parquet format, Impala gives good performance
If you want to fetch updated info, run the command REFRESH or INVALIDATE METADATA
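
As a rough illustration of the two notes above, this sketch only builds the SQL text for rewriting a table as Parquet and for picking up changed metadata. The helper names and the table names (sales_db.orders and so on) are made up for the example; the statement forms themselves (CREATE TABLE ... STORED AS PARQUET AS SELECT, REFRESH, INVALIDATE METADATA) are Impala SQL. Nothing is sent to a cluster here; the strings would be run through impala-shell, Hue, or a JDBC/ODBC client.

```python
def parquet_copy_sql(source: str, target: str) -> str:
    """Build a CTAS statement that rewrites an existing table in Parquet format."""
    return f"CREATE TABLE {target} STORED AS PARQUET AS SELECT * FROM {source}"


def refresh_sql(table: str) -> str:
    """Build a REFRESH statement: reload file/block info for a table Impala already knows."""
    return f"REFRESH {table}"


def invalidate_metadata_sql(table: str = "") -> str:
    """Build an INVALIDATE METADATA statement.

    With a table name, only that table's cached metadata is discarded.
    With no table name, all cached metadata is discarded, which is costly.
    """
    return f"INVALIDATE METADATA {table}".strip()


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(parquet_copy_sql("sales_db.orders_text", "sales_db.orders_parquet"))
    print(refresh_sql("sales_db.orders_parquet"))
    print(invalidate_metadata_sql("sales_db.orders_parquet"))
    print(invalidate_metadata_sql())
```

Rule of thumb: REFRESH is the lighter option when new data files were added to an existing table; INVALIDATE METADATA is typically needed when a table was created or changed outside Impala, for example through Hive.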

Dependencies between Hive and Impala?
--------------------------
How to access Impala?
impala-shell or Hue
Hive
JDBC/ODBC
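
For the impala-shell route, a minimal sketch of putting together a non-interactive command line is shown here. The function name, the host name, and the query are hypothetical; port 21000 is only an assumed default and depends on the protocol and cluster configuration. The flags used (-i for the impalad address, -d for the database, -k for Kerberos, -q for the query) are impala-shell options. The sketch only builds the command; it does not run it.

```python
import shlex
from typing import List, Optional


def impala_shell_command(host: str, query: str, port: int = 21000,
                         database: Optional[str] = None,
                         kerberos: bool = False) -> List[str]:
    """Build an argv list for a one-shot impala-shell query (assumed port 21000)."""
    cmd = ["impala-shell", "-i", f"{host}:{port}"]
    if database:
        cmd += ["-d", database]   # database to use for the session
    if kerberos:
        cmd.append("-k")          # authenticate with Kerberos
    cmd += ["-q", query]          # run this query and exit
    return cmd


if __name__ == "__main__":
    # Hypothetical host and table names, for illustration only.
    cmd = impala_shell_command("impalad-host.example.com",
                               "SELECT COUNT(*) FROM orders",
                               database="sales_db", kerberos=True)
    print(shlex.join(cmd))
```

The argv list can be handed to subprocess.run on a machine where impala-shell is installed; keeping it as a list avoids shell-quoting problems with the query text.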


-------------------------
Java (MR) / client -- runs a query -- an Impala daemon acts as the coordinator for the query -- it requests metadata from the ICS -- the ICS provides the metadata info to the ISS, which passes it on to the Impala daemons
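
The flow above can be pictured with a toy model. This is not Impala code: the class names, the dictionary standing in for the Hive metastore, and the round-robin split of files are all assumptions made for the sketch (real Impala scheduling is more involved). It only shows the order of events: the catalog server publishes metadata, the statestore relays it to every daemon, and the daemon that receives the query coordinates the work.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ImpalaDaemon:
    """Toy stand-in for an Impala daemon (ID) with a local metadata cache."""
    name: str
    metadata_cache: Dict[str, List[str]] = field(default_factory=dict)

    def receive_update(self, table: str, files: List[str]) -> None:
        self.metadata_cache[table] = list(files)

    def coordinate(self, table: str, daemons: List["ImpalaDaemon"]) -> Dict[str, List[str]]:
        """Act as coordinator: split the table's files across all daemons (round-robin)."""
        plan: Dict[str, List[str]] = {d.name: [] for d in daemons}
        for i, data_file in enumerate(self.metadata_cache[table]):
            plan[daemons[i % len(daemons)].name].append(data_file)
        return plan


@dataclass
class StateStore:
    """Toy stand-in for the statestore (ISS): relays updates to subscribed daemons."""
    subscribers: List[ImpalaDaemon] = field(default_factory=list)

    def broadcast(self, table: str, files: List[str]) -> None:
        for daemon in self.subscribers:
            daemon.receive_update(table, files)


@dataclass
class CatalogServer:
    """Toy stand-in for the catalog server (ICS): reads metadata and publishes it."""
    statestore: StateStore
    hive_metastore: Dict[str, List[str]]  # hypothetical: table name -> data files

    def publish(self, table: str) -> None:
        self.statestore.broadcast(table, self.hive_metastore[table])


if __name__ == "__main__":
    daemons = [ImpalaDaemon("impalad-1"), ImpalaDaemon("impalad-2")]
    catalog = CatalogServer(StateStore(daemons), {"orders": ["f1.parq", "f2.parq", "f3.parq"]})
    catalog.publish("orders")                       # ICS -> ISS -> all daemons
    print(daemons[0].coordinate("orders", daemons)) # the daemon that got the query coordinates
```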

Whatever queries run on Hive, the same queries can be run on Impala
Hive -- runs on YARN/Spark
Impala -- in-memory processing (faster than Hive but uses more hardware)
--------------------------------
Reference websites
1.dezyew.com---impala tutorial...
-------------------------------------
In the new CDP version -- Hive LLAP is available (LLAP = Live Long and Process, Hive's low-latency analytical processing)
Impala was present in CDH 5.x and 6.x and is still included in CDP (it was not removed)
Impala vs Hive LLAP -- when to use which?
Suppose you are working on:
HDFS
YARN
Hive
HBase
Kafka
ZooKeeper (ZK)
Spark
Impala
-------------------------------------------------
*****pending
Ranger
Ansible
TLS/SSL
-------------------------------
