High Performance Real-time Analytics Database

VizDB is the foundation of the Vizlytics platform. VizDB is SQL-based, relational, columnar and specifically developed to harness the massive parallelism of modern CPU and GPU hardware. VizDB can query up to billions of rows in milliseconds, and is capable of unprecedented ingestion speeds, making it the ideal SQL engine for the era of big, high-velocity data.

Features

Advanced Memory Management

VizDB optimizes the memory and compute layers to deliver unprecedented performance. VizDB was designed to keep hot data in GPU memory for the fastest possible access. Other GPU database systems store data in CPU memory and move it to the GPU only at query time, trading the gains they receive from GPU parallelism for transfer overhead over the PCIe bus.

VizDB avoids this transfer inefficiency by caching the most recently touched data in High Bandwidth Memory on the GPU, which offers up to 10x the bandwidth of CPU DRAM and far lower latency. VizDB is also designed to exploit efficient inter-GPU communication infrastructure such as NVIDIA NVLink when available.
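Keeping the most recently touched data resident is, in spirit, a least-recently-used (LRU) policy. The sketch below is only an illustration of that idea, not VizDB's actual memory manager: the class name `HotChunkCache`, the byte budget, and the chunk keys are all hypothetical.

```python
from collections import OrderedDict


class HotChunkCache:
    """Toy LRU cache standing in for a fixed-size GPU memory pool (hypothetical)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._chunks = OrderedDict()  # chunk key -> size in bytes, oldest first

    def touch(self, key, size_bytes):
        """Return True on a cache hit, False if the chunk had to be loaded."""
        if size_bytes > self.capacity_bytes:
            raise ValueError("chunk is larger than the whole cache")
        if key in self._chunks:
            self._chunks.move_to_end(key)  # mark as most recently touched
            return True
        # Evict the least recently touched chunks until the new one fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self._chunks.popitem(last=False)
            self.used_bytes -= evicted_size
        self._chunks[key] = size_bytes
        self.used_bytes += size_bytes
        return False

    def resident(self):
        """Chunk keys currently held, from least to most recently touched."""
        return list(self._chunks)
```

A query that touches a chunk already in the pool skips the transfer entirely; only a miss pays the cost of loading the chunk and possibly evicting colder data.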

Rapid Query Compilation

A key component of VizDB’s innovation advantage is the JIT (Just-In-Time) compilation framework built on LLVM (Low-level Virtual Machine). By pre-generating compiled code for the query, Vizlytics avoids many of the memory bandwidth and cache-space inefficiencies of traditional virtual machine or transpiler approaches.

Using LLVM, compilation times are much quicker, generally under 30 milliseconds for entirely new SQL queries. Furthermore, the system can cache templated versions of compiled query plans for reuse. This is important in situations where users are leveraging Vizlytics Horizon to cross-filter billions of rows over multiple correlated visualizations.
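One way to picture a templated plan cache is to strip the literals out of a query and use what remains as the cache key, so that cross-filter queries differing only in their filter values reuse one compiled plan. The following is a rough sketch under that assumption; `template_of`, `PlanCache`, and the `compile_fn` callback are hypothetical names, and the regular expression is a simplification rather than a real SQL parser.

```python
import re

# Matches single-quoted string literals and standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def template_of(sql):
    """Replace literals with '?' so similar queries share one cache key."""
    return _LITERAL.sub("?", sql)


class PlanCache:
    """Caches one compiled plan per query template (illustrative only)."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # stand-in for the real JIT compiler
        self._plans = {}
        self.compilations = 0

    def get(self, sql):
        key = template_of(sql)
        if key not in self._plans:
            self._plans[key] = self._compile(key)
            self.compilations += 1
        return self._plans[key]
```

With this scheme, dragging a filter from `delay > 15` to `delay > 30` maps to the same template, so only the first query pays the compilation cost.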

Native Support of Standard Geo Data Types

The VizDB SQL engine can store and query data using native Open Geospatial Consortium (OGC) types, including POINT, LINESTRING, POLYGON, and MULTIPOLYGON. With native geo type support, analysts can query geo data at scale using a growing number of special geospatial functions. This opens up a wide range of new use cases for geospatial analysts, who can use the full power of modern CPU and GPU hardware to quickly and interactively calculate distances between two points and intersections between objects. Now analysts can find all points that fall within a building footprint or search for intersections between them.
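To make the two operations mentioned above concrete, here is a small sketch of a point-in-polygon test (using the standard ray-casting method) and a planar distance calculation. It illustrates the kind of computation involved, not VizDB's geospatial functions; the function names and the sample footprint are hypothetical, and coordinates are treated as flat Cartesian values rather than longitude and latitude.

```python
import math


def point_in_polygon(point, ring):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def planar_distance(a, b):
    """Euclidean distance between two points in the same planar units."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


# Hypothetical building footprint: a 4 x 4 square.
footprint = [(0, 0), (4, 0), (4, 4), (0, 4)]
points = [(2, 2), (5, 2), (1, 3)]
points_in_building = [p for p in points if point_in_polygon(p, footprint)]
```

Each point is tested independently of the others, which is why this kind of workload maps well onto many parallel execution units.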

Hybrid Execution

A key component of the Vizlytics SQL engine performance advantage is the hybrid, or parallelized, execution of queries. Parallelized code allows a processor to compute multiple data items simultaneously. This is necessary to achieve optimal performance on GPUs, which contain thousands of execution units.

Optimizing hybrid execution also translates well to CPUs, which increasingly have “wide” execution units capable of processing multiple data items at once. VizDB parallelizes computation across multiple GPUs and CPUs, and even improves query performance on CPU-only systems.

Distributed Architecture

The Vizlytics scale-out configuration allows single queries to span more than one physical host when data is too large to fit on a single machine. Across nodes, Vizlytics uses a shared-nothing architecture between GPUs. When a query is launched, each GPU processes a slice of data independently from other GPUs. When multiple GPUs reside within a single machine, the data is fanned out from the CPU to the GPUs and then gathered back together onto the CPU.
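The fan-out and gather pattern described here can be sketched in a few lines. In this illustration, threads stand in for GPUs, each worker computes a partial aggregate over its own slice without sharing state, and the coordinator combines the partial results. The function names are hypothetical and the example computes a simple average.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(rows):
    """Work done independently on one slice: a partial (sum, count)."""
    return sum(rows), len(rows)


def fan_out_gather_avg(values, num_workers):
    """Average a column by fanning slices out to workers and gathering results."""
    if not values:
        return None
    # Fan out: split the column into contiguous slices, one per worker.
    step = -(-len(values) // num_workers)  # ceiling division
    slices = [values[i:i + step] for i in range(0, len(values), step)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_aggregate, slices))
    # Gather: combine the partial results on the coordinator.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count
```

Because each worker returns only a small partial result, the gather step moves far less data than the slices themselves.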

A distributed architecture also provides faster data load times. Import times speed up linearly with the number of nodes because loading can be done concurrently across multiple nodes. Reads from disk also benefit from similar acceleration in a scale-out configuration.

High Availability (HA)

The goal of Vizlytics's High Availability (HA) is to meet an organization's service level agreements for performance and uptime. If a Vizlytics server becomes unavailable, the load balancer redirects traffic to a different available Vizlytics server, preserving availability. High Availability configurations allow a set of Vizlytics servers that are running together in a High Availability Group to be synchronized in a guaranteed way.

As HA group members receive updates, backend synchronization orchestrates and manages replication, then updates the Vizlytics servers in the HA group using Kafka topics as a distributed resilient logging system. While multiple servers are active in an HA group, average response times also tend to improve, due to the efficient distribution of query load across the members. A load balancer distributes users across the available Vizlytics servers, improving concurrency and throughput as more servers are added.
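The redirect-on-failure behavior of the load balancer can be illustrated with a minimal round-robin sketch that skips servers marked unavailable. This is a simplified model only: the class name, the server labels, and the manual `mark_down` call are hypothetical, and a real deployment would rely on health checks in an actual load balancer.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Routes requests in turn, skipping servers that are marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def route(self):
        """Return the next available server for an incoming request."""
        if not self._healthy:
            raise RuntimeError("no servers available in the HA group")
        # At most one full pass over the ring is needed to find a healthy server.
        for _ in range(len(self._servers)):
            server = next(self._ring)
            if server in self._healthy:
                return server
```

When a server drops out, requests continue to rotate across the remaining members, and the server rejoins the rotation once it is marked available again.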