Snowflake – Performance Tuning and Best Practices

Note: This article compiles multiple performance tuning methodologies for Snowflake. Some of the text and images in the article are drawn from various articles and books, details of which are listed under “References”. Introduction to Snowflake: Snowflake is a SaaS-based data warehouse platform built on AWS (and other cloud) infrastructure. One of the …

Apache Spark – Performance Tuning and Best Practices

Note: This article compiles multiple performance tuning methodologies for Apache Spark. The text and images in the article are drawn from various articles and books, details of which are listed under “References”. Tweak Configurations: Viewing and Setting Apache Spark Configurations. There are 4 ways of doing it. Way-1: Using $SPARK_HOME directory (Configuration changes in …

Impala – Optimisation at partition level

We all know that the three most common strategies for optimising our queries are partitioned tables, bucketing, and collecting stats. But sometimes a simple query that should work on a single partition will run on ALL partitions instead. Let me show you an …