Performance Deep Dive

A practical exploration of high-performance computing concepts. This page presents a series of benchmarks, from naive implementations to hand-crafted SIMD kernels, demonstrating the real-world impact of hardware-aware optimization.

Haversine Distance Benchmark
Comparing Native (C++, Rust, Zig) and WebAssembly performance over 100,000 calls.

Operations per Millisecond

Implementation          Static C++ (Local)    All six other environments
Naive                   78.47 ns (1.00x)      N/A
Optimized Scalar        25.89 ns (3.03x)      N/A
SIMD + Multithreading   11.61 ns (6.76x)      N/A
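To make the first two rows concrete, here is a minimal C++ sketch of what a naive and a scalar-optimized haversine kernel might look like. It is not the code behind the numbers above. The function names, the Earth radius of 6371 km, and the particular optimizations (hoisted degree-to-radian factor, multiplication instead of `std::pow`) are assumptions chosen for illustration.

```cpp
#include <cmath>

// Assumed constants for this sketch: pi and the mean Earth radius in km.
constexpr double kHavPi = 3.14159265358979323846;
constexpr double kEarthRadiusKm = 6371.0;
constexpr double kDegToRad = kHavPi / 180.0;

// Naive form: converts degrees inline and squares via std::pow.
double haversine_naive(double lat1, double lon1, double lat2, double lon2) {
    double phi1 = lat1 * kHavPi / 180.0;
    double phi2 = lat2 * kHavPi / 180.0;
    double dphi = (lat2 - lat1) * kHavPi / 180.0;
    double dlam = (lon2 - lon1) * kHavPi / 180.0;
    double a = std::pow(std::sin(dphi / 2.0), 2.0) +
               std::cos(phi1) * std::cos(phi2) *
               std::pow(std::sin(dlam / 2.0), 2.0);
    return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(a));
}

// Scalar-optimized form (hypothetical): same formula, but the conversion
// factor is precomputed, squares are plain multiplications, and each
// sine is evaluated once.
double haversine_scalar(double lat1, double lon1, double lat2, double lon2) {
    const double phi1 = lat1 * kDegToRad;
    const double phi2 = lat2 * kDegToRad;
    const double s_lat = std::sin((phi2 - phi1) * 0.5);
    const double s_lon = std::sin((lon2 - lon1) * kDegToRad * 0.5);
    const double a = s_lat * s_lat +
                     std::cos(phi1) * std::cos(phi2) * s_lon * s_lon;
    return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(a));
}
```

Both functions take coordinates in degrees and return kilometres. Under these assumptions they should agree to within floating-point rounding, so any speed difference comes from how the arithmetic is expressed rather than from a change in the formula.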
Understanding the Environments
  • Static C++ (Local): The baseline, compiled and run directly on the host machine with `-march=native`, which lets the compiler use every instruction-set extension the host CPU supports. It is treated as the upper bound on performance for the other environments.
  • Live C++ (Server): C++ benchmark run inside a Docker container, compiled with `-mavx2` rather than `-march=native` so the same binary stays portable across AVX2-capable hosts.
  • Live Rust (Server): Rust benchmark run inside a Docker container with optimized compilation.
  • Live Zig (Server): Zig benchmark run inside a Docker container with optimized compilation.
  • WASM C++ (Browser): C++ compiled to WebAssembly and run directly in your browser.
  • WASM Rust (Browser): Rust compiled to WebAssembly and run directly in your browser.
  • WASM Zig (Browser): Zig compiled to WebAssembly and run directly in your browser.
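The difference between `-march=native`, `-mavx2`, and a WebAssembly target is visible to the code at compile time through predefined macros. The sketch below, which assumes a GCC- or Clang-compatible compiler, reports the widest SIMD extension the build was allowed to use. The function name `compiled_simd_level` is hypothetical and does not come from the benchmark source.

```cpp
#include <string>

// Returns a label for the widest SIMD extension enabled at compile time.
// The macros are predefined by GCC and Clang according to the target
// flags (for example -march=native, -mavx2, or -msimd128 for WASM).
std::string compiled_simd_level() {
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__wasm_simd128__)
    return "WASM SIMD128";
#else
    return "scalar or unknown";
#endif
}
```

A build with `-mavx2` would be expected to return "AVX2" even on a host that supports AVX-512, which is one reason a portable server build can trail a locally tuned one.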
Sine Function Benchmark
Compares various sine function implementations over 10,000,000 calls.
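The implementations being compared are not listed here. As one illustrative possibility, a hand-written sine is often built from range reduction followed by a short polynomial. The sketch below uses a degree-9 Taylor polynomial evaluated with Horner's scheme. The name `sin_poly` and the choice of polynomial are assumptions, not the benchmarked code.

```cpp
#include <cmath>

constexpr double kSinePi = 3.14159265358979323846;
constexpr double kSineTwoPi = 2.0 * kSinePi;

// Approximates sin(x) for moderate |x| (hypothetical sketch).
double sin_poly(double x) {
    // Reduce to [-pi, pi] by removing whole periods.
    double r = x - kSineTwoPi * std::round(x / kSineTwoPi);
    // Fold into [-pi/2, pi/2] using sin(pi - r) = sin(r).
    if (r > kSinePi / 2.0) {
        r = kSinePi - r;
    } else if (r < -kSinePi / 2.0) {
        r = -kSinePi - r;
    }
    // Degree-9 Taylor polynomial in Horner form:
    // r - r^3/3! + r^5/5! - r^7/7! + r^9/9!
    const double r2 = r * r;
    return r * (1.0 + r2 * (-1.0 / 6.0 + r2 * (1.0 / 120.0 +
           r2 * (-1.0 / 5040.0 + r2 * (1.0 / 362880.0)))));
}
```

The truncation error of this polynomial is largest near ±π/2, where it is on the order of a few millionths. A production kernel would more likely use minimax coefficients and a more careful range reduction for large arguments.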