How query engines work
Recently, I delved into Apache Ballista, a distributed computing platform. On the blog of its creator, Andy Grove, I discovered the book How Query Engines Work, which gave me deeper insight into the motivations behind the tool. It also helped me understand what Apache Arrow and DataFusion are and how they relate to each other. Overall, I came away with a solid understanding of how modern query engines are designed. While reading the book, I took notes that I now want to share. This is not an outline of the book. Instead, this article focuses on the main ideas, supplemented with information from other sources.