Understanding how to optimize load processes in HQL/SQL for better utilization of your cluster.

From time to time I dump some random knowledge on my colleagues, and I thought it might also be relevant for you :)

SQL is a highly structured language, which means it is important to stick to the rules. On the other hand, everything that is not forbidden is allowed.

Usually, you will write the clauses of a query in this order:

  1. SELECT
  2. FROM
  3. JOIN or OUTER JOIN with ON
  4. WHERE
  5. GROUP BY and optionally HAVING
  6. ORDER BY
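
As a rough illustration of that written order, here is a minimal sketch of a query that uses every clause from the list. The tables `orders` and `customers` and all of their columns are hypothetical and only serve as an example.

```sql
-- Hypothetical tables:
--   orders(order_id, customer_id, amount)
--   customers(customer_id, country)
SELECT c.country,                       -- 1. SELECT
       SUM(o.amount) AS total_amount
FROM orders o                           -- 2. FROM
JOIN customers c                        -- 3. JOIN with ON
  ON o.customer_id = c.customer_id
WHERE o.amount > 0                      -- 4. WHERE
GROUP BY c.country                      -- 5. GROUP BY and optionally HAVING
HAVING SUM(o.amount) > 1000
ORDER BY total_amount DESC;             -- 6. ORDER BY
```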

But the order of execution is slightly different:

  1. FROM
  2. JOIN
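
One visible consequence of FROM and JOIN being processed first: table aliases defined there can already be used in the SELECT list, even though SELECT is written above them. The sketch below reuses the hypothetical `orders` and `customers` tables. It also assumes the usual logical order of standard SQL, in which WHERE is evaluated before SELECT, so a column alias from the SELECT list is typically not available in WHERE.

```sql
-- The aliases o and c are defined in FROM/JOIN but used in SELECT,
-- which works because FROM and JOIN are processed first.
SELECT o.order_id,
       c.country,
       o.amount * 2 AS doubled_amount
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id
-- Assumption: WHERE runs before SELECT, so the expression is repeated here
-- instead of referencing the column alias doubled_amount.
WHERE o.amount * 2 > 100;
```

Engines differ in how strictly they enforce this, so treat it as a rule of thumb and check the behavior of your own engine.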

Hendrik Schultze

Data Engineer with a passion for games and beer ;D
