1 Key Design

Yellowbrick Data Warehouse delivers efficient, scalable and resilient data warehousing in public clouds and in private data centers.

2 Architecture

Storage is separated from compute and data is persisted in object storage as column-oriented, compressed files known as shards.

Microservices Architecture

Deployment Approach

3 Software Optimizations

3.1 Database Optimizations

parallel query plans, cost-based optimization, workload management and parallel query execution

compiler: query plans are turned into C++ code and then machine code, which is distributed to the workers for parallel execution

SQL parser and planner based on PostgreSQL 9.5

shared-nothing database: rows are distributed by hash, distributed randomly, or replicated
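
The three distribution modes can be illustrated with a minimal sketch. The function name `assign_workers`, the mode strings, and the use of CRC32 as the hash are assumptions made for illustration; the notes do not say which hash function or interface Yellowbrick uses.

```python
import random
import zlib


def assign_workers(row_key: str, num_workers: int, mode: str) -> list[int]:
    """Return the indices of the workers that would store a row (hypothetical helper)."""
    if mode == "hash":
        # The same key always maps to the same worker.
        # CRC32 is a stand-in here, not the hash the real system uses.
        return [zlib.crc32(row_key.encode("utf-8")) % num_workers]
    if mode == "random":
        # Any single worker, chosen uniformly; spreads rows without a key.
        return [random.randrange(num_workers)]
    if mode == "replicate":
        # Every worker keeps a full copy of the row.
        return list(range(num_workers))
    raise ValueError(f"unknown distribution mode: {mode}")
```

In shared-nothing systems generally, hash distribution on a join key tends to keep matching rows on the same worker, while replication suits small tables that every worker needs locally.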

workers: each consists of an execution engine and a storage engine, and uses a credit-based flow control framework
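
Credit-based flow control in general works by letting a sender transmit only while it holds credits, with the receiver handing a credit back for each packet it consumes. The sketch below shows that general idea only; the class `CreditedChannel` and its methods are hypothetical and are not taken from Yellowbrick's framework.

```python
from collections import deque


class CreditedChannel:
    """Toy single-sender channel with a fixed number of credits (illustrative only)."""

    def __init__(self, credits: int):
        self.credits = credits  # packets the sender may still put in flight
        self.queue = deque()    # packets waiting at the receiver

    def try_send(self, packet) -> bool:
        # Without a credit the sender must back off instead of overrunning the receiver.
        if self.credits == 0:
            return False
        self.credits -= 1
        self.queue.append(packet)
        return True

    def receive(self):
        # Consuming a packet frees a buffer slot, so one credit goes back to the sender.
        packet = self.queue.popleft()
        self.credits += 1
        return packet
```

With two credits, a third `try_send` returns `False` until the receiver calls `receive()` once, which bounds the memory the receiver must reserve for in-flight packets.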

row-oriented and column-oriented data packets

storage engine manages the column-oriented shard files

data packets are 256 KB in size, so they fit into the L3 cache

configurable workload management profile

life cycle of a query

3.2 Operating System Optimizations

Yellowbrick bypasses the Linux kernel for most system-level operations: data read from NVMe SSDs is preserved in the CPU caches.

memory management: takes over control of system memory to avoid kernel swapping; allocations are grouped by query lifetime to avoid memory fragmentation; lock-free and NUMA-aware; uses 2 MB or 1 GB huge pages.
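
One common way to group allocations by lifetime is an arena (bump) allocator that is reset in one step when its owner finishes. The sketch below assumes that technique purely for illustration; the notes do not say Yellowbrick uses an arena, and `QueryArena` is a hypothetical name. It also ignores the lock-free and NUMA-aware aspects.

```python
class QueryArena:
    """Toy per-query arena: allocations are bumped forward and freed all at once."""

    def __init__(self, capacity: int):
        self.buffer = bytearray(capacity)  # one contiguous region owned by the query
        self.offset = 0                    # next free byte in the region

    def alloc(self, size: int) -> memoryview:
        if self.offset + size > len(self.buffer):
            raise MemoryError("arena exhausted")
        view = memoryview(self.buffer)[self.offset:self.offset + size]
        self.offset += size
        return view

    def release_all(self) -> None:
        # The query is done: reclaim everything in one step, leaving no holes behind.
        self.offset = 0
```

Because nothing is freed individually, memory belonging to a finished query cannot leave scattered gaps between allocations of queries that are still running.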

task scheduler: runs in user space, context-switches between queries in ~100 nanoseconds, and executes the same stage of the query plan at the same time

3.3 Storage Optimizations

hybrid storage engine: front-end row store and back-end column store
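
A row-store front end paired with a column-store back end can be sketched as follows. The class `HybridTable`, the `flush_threshold` parameter, and the rule of flushing after a fixed number of rows are all assumptions for illustration; the notes do not describe when or how Yellowbrick moves rows into columnar shards.

```python
class HybridTable:
    """Toy hybrid store: new rows land in a row store, then move to columnar shards."""

    def __init__(self, columns, flush_threshold: int):
        self.columns = list(columns)
        self.flush_threshold = flush_threshold  # hypothetical flush policy
        self.row_store = []       # front end: recent inserts, one dict per row
        self.column_shards = []   # back end: one dict of column -> values per shard

    def insert(self, row: dict) -> None:
        self.row_store.append(row)
        if len(self.row_store) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.row_store:
            return
        # Pivot the buffered rows into one column-oriented shard.
        shard = {c: [r[c] for r in self.row_store] for c in self.columns}
        self.column_shards.append(shard)
        self.row_store = []

    def scan(self, column: str) -> list:
        # A reader must combine both tiers to see every row.
        values = []
        for shard in self.column_shards:
            values.extend(shard[column])
        values.extend(r[column] for r in self.row_store)
        return values
```

The row store keeps small inserts cheap, while scans over a single column mostly touch the compact columnar shards.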

3.4 Networking Optimization using DPDK

on top of UDP

each vCPU on a worker connects to a corresponding vCPU thread on a different worker
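
The pairing of vCPUs across workers can be written down as a small mapping. This sketch reads the note as "vCPU i on one worker talks to vCPU i on every other worker"; that reading, and the helper name `peer_endpoints`, are assumptions.

```python
def peer_endpoints(worker: int, vcpu: int, num_workers: int) -> list[tuple[int, int]]:
    """List the (worker, vcpu) endpoints that a given vCPU would connect to."""
    # Same vCPU index on every other worker; never a connection to itself.
    return [(w, vcpu) for w in range(num_workers) if w != worker]
```

Under this reading, vCPU 2 on worker 0 in a three-worker cluster would connect to `(1, 2)` and `(2, 2)`, so each thread owns its connections and does not share them with other threads on the same worker.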

4 Conclusions

Yellowbrick's adoption of Kubernetes as the orchestration layer and platform-agnostic runtime enables it to deliver a modern data warehouse that runs anywhere. The delegation of infrastructure responsibility to Kubernetes has allowed us to focus on the core business of enhancing database performance and adding new features.