Counting Distinct Values Accurately in Druid SQL
When working with Druid SQL, it's easy to fall into a common trap when counting distinct values: using COUNT(DISTINCT ...) directly can return unexpected results. Recently, I hit a case where COUNT(DISTINCT) returned a different value than selecting the DISTINCT rows manually and counting them. This post explains why that happens and how to fix it.