Benchmarks | Troels Brahe

A practical exploration of high-performance computing concepts. This page presents a series of benchmarks, from naive implementations to hand-crafted SIMD kernels, demonstrating the real-world impact of hardware-aware optimization.

Haversine Distance Benchmark

Compares native (C++, Rust, Zig) and WebAssembly performance over 100,000 calls.
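For reference, a naive haversine kernel might look like the sketch below. It is not the code that was benchmarked here. The function name `haversine_km`, the degree-based inputs, and the 6371 km mean Earth radius are all assumptions made for illustration.

```cpp
#include <cmath>

// Assumed constants: mean Earth radius in kilometres and pi.
constexpr double kEarthRadiusKm = 6371.0;
constexpr double kPi = 3.14159265358979323846;

constexpr double deg_to_rad(double deg) { return deg * (kPi / 180.0); }

// Naive haversine: great-circle distance in km between two points
// given as latitude/longitude in degrees (hypothetical signature).
double haversine_km(double lat1, double lon1, double lat2, double lon2) {
    const double phi1 = deg_to_rad(lat1);
    const double phi2 = deg_to_rad(lat2);
    const double dphi = deg_to_rad(lat2 - lat1);
    const double dlambda = deg_to_rad(lon2 - lon1);

    const double s1 = std::sin(dphi / 2.0);
    const double s2 = std::sin(dlambda / 2.0);
    // a = sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    const double a = s1 * s1 + std::cos(phi1) * std::cos(phi2) * s2 * s2;
    return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(a));
}
```

Each call costs two sines, two cosines, a square root, and an arcsine, which is why a kernel of this shape tends to reward scalar tuning and vectorization.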

Operations per Millisecond

Implementation	Static C++ (Local)
Naive	78.47 ns (1.00x)
Optimized Scalar	25.89 ns (3.03x)
SIMD + Multithreading	11.61 ns (6.76x)
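The table lists times in nanoseconds, while the chart heading uses operations per millisecond. Assuming each figure is the mean time per call t (the page does not say so explicitly), the two units and the speedup factors are related as follows:

```latex
\text{ops/ms} = \frac{10^{6}}{t\,[\text{ns}]}, \qquad
\text{speedup} = \frac{t_{\text{naive}}}{t}

\text{Naive: } \frac{10^{6}}{78.47} \approx 12{,}744 \text{ ops/ms}

\text{Optimized Scalar: } \frac{10^{6}}{25.89} \approx 38{,}625 \text{ ops/ms}, \quad \frac{78.47}{25.89} \approx 3.03

\text{SIMD + Multithreading: } \frac{10^{6}}{11.61} \approx 86{,}133 \text{ ops/ms}, \quad \frac{78.47}{11.61} \approx 6.76
```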

Understanding the Environments

Static C++ (Local): This is the baseline, compiled and run directly on the host machine with `-march=native`, which lets the compiler use every instruction set extension the host CPU supports. It serves as the upper bound against which the other environments are compared.
Live C++ (Server): C++ benchmark run inside a Docker container, compiled with `-mavx2` for portability.
Live Rust (Server): Rust benchmark run inside a Docker container with optimized compilation.
Live Zig (Server): Zig benchmark run inside a Docker container with optimized compilation.
WASM C++ (Browser): C++ compiled to WebAssembly and run directly in your browser.
WASM Rust (Browser): Rust compiled to WebAssembly and run directly in your browser.
WASM Zig (Browser): Zig compiled to WebAssembly and run directly in your browser.
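One practical difference between `-march=native` and `-mavx2` is which instruction set extensions the compiler is allowed to target. The sketch below reports this at compile time using macros that GCC and Clang predefine from the `-m` flags. The function name `compiled_simd_level` is hypothetical and not part of the benchmark code.

```cpp
#include <string>

// Returns the widest x86 SIMD extension enabled at compile time,
// judged only by the predefined macros checked below.
std::string compiled_simd_level() {
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline (no x86 SIMD macro defined)";
#endif
}
```

Built with `-mavx2`, this should report AVX2 on any build host. Built with `-march=native`, the result depends on the CPU of the machine doing the compiling.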

Sine Function Benchmark

Compares various sine function implementations over 10,000,000 calls.
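The page does not list which implementations are compared. As an illustration of one kind of alternative to `std::sin`, here is a minimal sketch of a polynomial approximation: range reduction followed by a degree-11 Taylor polynomial evaluated in Horner form. The name `poly_sin` and the choice of polynomial are assumptions, not the benchmarked code.

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Approximates sin(x) with a degree-11 Taylor polynomial after range reduction.
// The truncation error on [-pi/2, pi/2] is below about 6e-8.
double poly_sin(double x) {
    // Reduce to [-pi, pi] by subtracting the nearest multiple of 2*pi.
    x -= 2.0 * kPi * std::round(x / (2.0 * kPi));
    // Fold into [-pi/2, pi/2] using sin(pi - x) = sin(x).
    if (x > kPi / 2.0) x = kPi - x;
    if (x < -kPi / 2.0) x = -kPi - x;

    const double x2 = x * x;
    // Horner form of x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - x^11/11!.
    return x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 + x2 * (-1.0 / 5040.0
             + x2 * (1.0 / 362880.0 + x2 * (-1.0 / 39916800.0))))));
}
```

This trades accuracy for a short, mostly branch-free arithmetic sequence. Whether it beats the standard library depends on the compiler, the flags, and the accuracy the caller needs.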

Performance Deep Dive

Operations per Millisecond