Impala – Optimisation at partition level

We all know that these three strategies are the most common ways to optimise our queries: partitioned tables, bucketing, and collecting stats. But sometimes a simple query will run on ALL partitions instead of one: you expect it to touch a single partition, yet it scans every one. Let me show you an …

Sqoop – Handle NULL values

By default, Sqoop imports NULL as null. If you want to change this default, you can use the following arguments. While importing data: --null-string and --null-non-string. While exporting data: --input-null-string and --input-null-non-string. Check this example for more clarification: In the above example, the --null-string argument represents what should be written in HDFS whenever a NULL is identified in …
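As a rough illustration of the four arguments named in this excerpt, here is a minimal dry-run sketch. The JDBC URL, table name and HDFS path are hypothetical placeholders, and credentials and other options are left out. The `'\\N'` value is the marker commonly used so that Hive reads the field as NULL; any other string could be substituted. The script only prints the commands and does not call Sqoop.

```shell
#!/usr/bin/env bash
# Hypothetical connection details: replace with your own values.
JDBC_URL="jdbc:mysql://dbhost/sales"   # assumed source database
TABLE="customers"                      # assumed table name
TARGET_DIR="/data/customers"           # assumed HDFS directory

# Import: write \N into the HDFS files wherever a source column is NULL.
# --null-string applies to string columns, --null-non-string to all others.
import_cmd=(sqoop import
  --connect "$JDBC_URL"
  --table "$TABLE"
  --target-dir "$TARGET_DIR"
  --null-string '\\N'
  --null-non-string '\\N')

# Export: interpret \N found in the HDFS files as NULL in the database.
export_cmd=(sqoop export
  --connect "$JDBC_URL"
  --table "$TABLE"
  --export-dir "$TARGET_DIR"
  --input-null-string '\\N'
  --input-null-non-string '\\N')

# Dry run: print the commands instead of executing them.
printf '%s ' "${import_cmd[@]}"; echo
printf '%s ' "${export_cmd[@]}"; echo
```

To run the commands for real, replace the two `printf` lines with `"${import_cmd[@]}"` and `"${export_cmd[@]}"` once the placeholders point at an actual database and cluster.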