void-numerics

Nihilai Collective

SWAR-optimized integer conversion · C++23 · Header-only

What It Is

vn::to_chars and vn::from_chars are drop-in replacements for their standard library equivalents, engineered for performance on hot integer-conversion paths. Header-only, zero-allocation, no exceptions, no RTTI.

Conversions are unit-tested against std::to_chars and std::from_chars, exhaustively for the 8- and 16-bit types and including the full LLVM libc++ test suite, and benchmarked against std, jeaiii, fmt, and strtoll/strtoull across 8-, 16-, 32-, and 64-bit signed and unsigned integer types. Registered on the Microsoft vcpkg registry.

Why

The C++ standard library's <charconv> is already fast. void-numerics is faster — substantially so on hot integer-conversion paths — without sacrificing correctness, portability, or API compatibility.

The gains come from a combination of SWAR-width arithmetic, compile-time specialization per integer type, multiply-and-shift division replacement, and force-inlined ladder dispatch sized to digit count. The result is measurable throughput improvement across all major platforms and compilers. The biggest optimizations come from asking the right questions, not from better answers to the wrong questions.
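To make the SWAR idea concrete, here is a minimal sketch of the widely known "eight digits in one 64-bit word" parsing trick. It is an illustration of the general technique only, not void-numerics' actual code; the names load8, all_digits8, and parse8 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs 8 bytes into a 64-bit word with the first
// character in the lowest byte, independent of host endianness.
inline std::uint64_t load8(const char* p) noexcept {
    std::uint64_t w = 0;
    for (int i = 0; i < 8; ++i) {
        w |= static_cast<std::uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
    }
    return w;
}

// True when all 8 bytes are ASCII '0'..'9', checked on the whole word at once.
inline bool all_digits8(const char* p) noexcept {
    const std::uint64_t w = load8(p);
    return ((w & 0xF0F0F0F0F0F0F0F0ull) |
            (((w + 0x0606060606060606ull) & 0xF0F0F0F0F0F0F0F0ull) >> 4)) ==
           0x3333333333333333ull;
}

// Converts exactly 8 ASCII digits to their value using three multiplies.
// Precondition: p points to at least 8 readable bytes, all of them digits.
inline std::uint32_t parse8(const char* p) noexcept {
    std::uint64_t w = load8(p) & 0x0F0F0F0F0F0F0F0Full;    // one digit per byte
    w = ((w * 2561ull) >> 8) & 0x00FF00FF00FF00FFull;       // 2-digit values in 16-bit lanes
    w = ((w * 6553601ull) >> 16) & 0x0000FFFF0000FFFFull;   // 4-digit values in 32-bit lanes
    return static_cast<std::uint32_t>((w * 42949672960001ull) >> 32);
}
```

Each multiplier has the form 10^k * 2^s + 1, so a single multiply combines neighbouring lanes (tens and ones, then hundreds, then ten-thousands) without a per-digit loop.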

Features

  • API-compatible with std::to_chars / std::from_chars — returns standard result types
  • All integer types: int8_t through uint64_t
  • Header-only — single include, no build step required
  • Zero allocation, zero exceptions, zero RTTI
  • C++23 — concepts for type-correct dispatch
  • Compile-time tables for digit conversion and overflow bounds
  • Multiply-and-shift division replacement on hot paths
  • Force-inlined helpers with manual ladder dispatch per digit count
  • Cross-platform CI: Ubuntu Clang, Ubuntu GCC, macOS Clang, macOS GCC, Windows MSVC
  • AddressSanitizer / UndefinedBehaviorSanitizer support out of the box
  • Non-base-10 conversions transparently dispatch to std for full compatibility
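
Two of the techniques listed above can be sketched in a few lines. The code below is an illustrative assumption about what "multiply-and-shift division replacement" and "ladder dispatch per digit count" typically look like; div10, mod10, and digit_count are hypothetical names and are not taken from the library.

```cpp
#include <cassert>
#include <cstdint>

// n / 10 for any 32-bit unsigned n, using one multiply and one shift.
// 0xCCCCCCCD is ceil(2^35 / 10); the identity is exact over the full range.
constexpr std::uint32_t div10(std::uint32_t n) noexcept {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(n) * 0xCCCCCCCDull) >> 35);
}

// n % 10 derived from the quotient, avoiding a second division.
constexpr std::uint32_t mod10(std::uint32_t n) noexcept {
    return n - div10(n) * 10u;
}

// Comparison ladder: returns the number of decimal digits in n (1..10).
// A formatter can branch on this once and run a fixed-length writer.
constexpr int digit_count(std::uint32_t n) noexcept {
    if (n < 10u) return 1;
    if (n < 100u) return 2;
    if (n < 1000u) return 3;
    if (n < 10000u) return 4;
    if (n < 100000u) return 5;
    if (n < 1000000u) return 6;
    if (n < 10000000u) return 7;
    if (n < 100000000u) return 8;
    if (n < 1000000000u) return 9;
    return 10;
}

static_assert(div10(4294967295u) == 429496729u);
static_assert(mod10(4294967295u) == 5u);
static_assert(digit_count(1000000000u) == 10);
```

Modern compilers already apply the multiply-and-shift rewrite for division by a constant; writing it out by hand mainly matters when the divisor or the operand width is chosen per specialization.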

Quick Start

Integer → String

    #include <cstdint>
    #include <string_view>
    #include <system_error>
    #include <void-numerics>

    int main() {
        char buffer[32];
        std::int64_t value = -9223372036854775807LL;
        auto result = vn::to_chars(buffer, buffer + sizeof(buffer), value);
        if (result.ec == std::errc{}) {
            // result.ptr points one past the last written character
            std::string_view text(buffer, result.ptr - buffer);
            // text == "-9223372036854775807"
        }
    }

String → Integer

    #include <cstdint>
    #include <void-numerics>

    int main() {
        const char* input = "42abc";
        std::uint32_t value{};
        auto result = vn::from_chars(input, input + 5, value);
        // result.ptr points to 'a' (first non-digit)
        // result.ec == std::errc{}
        // value == 42
    }

Base Support

    char buf[32];
    auto r = vn::to_chars(buf, buf + 32, 0xDEADBEEF, 16);
    // r.ptr - buf == 8; the eight characters written are "deadbeef"
    // (no null terminator is added)

Non-base-10 conversions transparently dispatch to std::to_chars / std::from_chars for full standard-library compatibility.

API Reference

vn::to_chars

    template<integer_types v_type>
    std::to_chars_result to_chars(char* first, char* last, v_type value, int32_t base = 10) noexcept;

Writes the textual representation of value to [first, last). Returns {ptr, std::errc{}} on success, where ptr is one past the last character written. Returns {last, std::errc::value_too_large} if the buffer is too small.
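
The buffer-too-small case is easy to overlook, so here is a small sketch of a wrapper that turns the result into an optional view. It uses std::to_chars as a stand-in so that it compiles without the library; since the README describes vn::to_chars as a drop-in replacement, swapping the call should be possible, but that is an assumption. The helper name format_int is hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: formats value into buf and returns a view of the
// characters written, or std::nullopt when the buffer is too small.
template <std::size_t N>
std::optional<std::string_view> format_int(char (&buf)[N], std::int64_t value) noexcept {
    // std::to_chars stands in for vn::to_chars here (assumed interchangeable).
    const auto result = std::to_chars(buf, buf + N, value);
    if (result.ec != std::errc{}) {
        return std::nullopt;  // ec == std::errc::value_too_large, ptr == buf + N
    }
    return std::string_view(buf, static_cast<std::size_t>(result.ptr - buf));
}
```

No null terminator is written, so a buffer of exactly the output length is sufficient; 20 characters cover every 64-bit value including the sign.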

vn::from_chars

    template<integer_types v_type>
    std::from_chars_result from_chars(const char* first, const char* last, v_type& value, int32_t base = 10) noexcept;

Parses an integer from [first, last) into value. Returns {ptr, std::errc{}} on success, where ptr points to the first character not consumed. Returns {first, std::errc::invalid_argument} if no characters could be parsed.
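
The sketch below shows how a caller might separate the possible outcomes. It is written against std::from_chars so that it compiles stand-alone; std::from_chars additionally reports std::errc::result_out_of_range when the digits do not fit the target type, leaving value unmodified, and the assumption here is that vn::from_chars mirrors this as part of its stated API compatibility. The names parse_status, parsed, and parse_int are hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>

enum class parse_status { ok, no_digits, out_of_range };

struct parsed {
    parse_status status;
    std::size_t consumed;  // number of characters matched as part of the number
};

// Hypothetical helper: parses a base-10 integer from the front of text.
template <typename T>
parsed parse_int(std::string_view text, T& out) noexcept {
    // std::from_chars stands in for vn::from_chars (assumed interchangeable).
    const auto r = std::from_chars(text.data(), text.data() + text.size(), out);
    const auto consumed = static_cast<std::size_t>(r.ptr - text.data());
    if (r.ec == std::errc::invalid_argument) return {parse_status::no_digits, consumed};
    if (r.ec == std::errc::result_out_of_range) return {parse_status::out_of_range, consumed};
    return {parse_status::ok, consumed};
}
```

Checking consumed against text.size() is how a caller distinguishes "42" from "42abc"; both parse successfully, but only the first consumes the whole input.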

Building

void-numerics is header-only — copy include/vn-incl/ into your project, or consume via CMake.

CMake

    add_subdirectory(void-numerics)
    target_link_libraries(your_target PRIVATE void-numerics::void-numerics)

Options

Option         Default  Description
VN_TESTS       OFF      Build the unit test suite
VN_BENCHMARKS  OFF      Build the benchmark harness
VN_ASAN        OFF      Enable AddressSanitizer
VN_UBSAN       OFF      Enable UndefinedBehaviorSanitizer

Requirements

  • C++23-capable compiler — Clang ≥ 20, GCC ≥ 14, MSVC latest, AppleClang
  • CMake ≥ 3.28

Testing

The test suite covers all integer types and their edge cases:

  • Exhaustive value enumeration for int8 / uint8 / int16 / uint16
  • Digit-length sweeps (1 through max digits) for all types
  • Powers of 2, powers of 10, all-1s, all-9s patterns
  • Limits: min, min+1, min/2, max, max-1, max/2, max/3, max/7
  • Round-trip verification (to_chars → from_chars → equality)
  • Leading-zero handling, stop-at-non-digit semantics, lone-minus and empty-input edge cases
  • Full LLVM libc++ <charconv> test suite

    cmake -S . -B build -DVN_TESTS=TRUE -DCMAKE_BUILD_TYPE=Debug
    cmake --build build
    ./build/bin/vn_unit_tests
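
As an illustration of the exhaustive round-trip checks listed above, the following sketch enumerates every value of a small integer type. It is not the project's test code: both directions use the std functions so that the example compiles without the library, and in a real comparison one side would be the vn:: equivalent. The function name round_trips_for_all_values is hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <limits>
#include <system_error>

// Formats and re-parses every value of T (intended for 8- and 16-bit types)
// and returns false on the first mismatch.
template <typename T>
bool round_trips_for_all_values() {
    const long lo = static_cast<long>(std::numeric_limits<T>::min());
    const long hi = static_cast<long>(std::numeric_limits<T>::max());
    for (long v = lo; v <= hi; ++v) {
        const T in = static_cast<T>(v);
        char buf[8];  // "-32768" is the longest 16-bit output (6 characters)
        const auto w = std::to_chars(buf, buf + sizeof(buf), in);
        if (w.ec != std::errc{}) return false;
        T out{};
        const auto r = std::from_chars(buf, w.ptr, out);
        // The parse must succeed, consume everything written, and match.
        if (r.ec != std::errc{} || r.ptr != w.ptr || out != in) return false;
    }
    return true;
}
```

The loop counter is a long rather than T so that the `v <= hi` condition can terminate at the type's maximum without overflow.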

Memory Footprint

vn::to_chars uses a precomputed lookup table of approximately 40 KB for branch-free digit pair extraction. This suits any CPU with an L2 cache of 256 KB or larger, which includes all modern desktop, server, and laptop CPUs. The table fits in L2 while the streaming integer data flows through L1, so the two do not compete for the same cache lines. Benchmarks confirm that this holds under explicit cache-eviction pressure between runs.
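
For readers unfamiliar with digit-pair tables, here is a minimal sketch of the classic 200-byte variant, built at compile time. It only illustrates the principle of emitting two characters per table lookup; it is not the library's table, whose layout is not described here beyond its approximate size. The names make_pair_table, pair_table, and u32_to_string are hypothetical, and std::string is used purely to keep the demonstration short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Builds "000102...9899": entry i occupies bytes 2*i and 2*i + 1.
constexpr std::array<char, 200> make_pair_table() noexcept {
    std::array<char, 200> t{};
    for (std::size_t i = 0; i < 100; ++i) {
        t[2 * i] = static_cast<char>('0' + i / 10);
        t[2 * i + 1] = static_cast<char>('0' + i % 10);
    }
    return t;
}

inline constexpr auto pair_table = make_pair_table();

// Demonstration only: writes n right-to-left, two digits per lookup.
inline std::string u32_to_string(std::uint32_t n) {
    char buf[10];  // a 32-bit value has at most 10 decimal digits
    char* const end = buf + sizeof(buf);
    char* p = end;
    while (n >= 100u) {
        const std::uint32_t pair = n % 100u;
        n /= 100u;
        p -= 2;
        p[0] = pair_table[2 * pair];
        p[1] = pair_table[2 * pair + 1];
    }
    if (n >= 10u) {
        p -= 2;
        p[0] = pair_table[2 * n];
        p[1] = pair_table[2 * n + 1];
    } else {
        *--p = static_cast<char>('0' + n);
    }
    return std::string(p, end);
}
```

Larger tables trade memory for fewer lookups per number, which is the trade-off this section describes.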

vn::from_chars carries no significant lookup tables and is suitable for any target.

⚠ Not Recommended For

  • Microcontrollers without L2 cache (Cortex-M0/M3/M4/M7, AVR, MSP430): the table cannot fit in available cache or SRAM.
  • Embedded targets where total flash/SRAM is smaller than 40 KB.
  • Any environment where the constant data footprint is unacceptable for binary size.

For these targets, std::to_chars is more appropriate.

Benchmark Methodology

Throughput is measured in megabytes of source data processed per second (MB/s), computed as total megabytes divided by total elapsed time across the measured sample window. For each test, a randomized dataset of n integers (n ∈ {100, 1,000, 10,000, 100,000}) is generated at mixed digit lengths across all standard integer widths (int8 through int64, signed and unsigned), under positive, negative, and mixed-sign value distributions where applicable.

Benchmarks use adaptive sampling: iterations begin at 60 and double each epoch (60 → 120 → 240 → …) up to a maximum of 1,200 iterations. Each epoch evaluates a trailing window of max(iterations / 10, 10) samples, capped at 100,000. Convergence requires RSE < 2.5% AND mean shift < 1.0% epoch-over-epoch simultaneously. The first epoch satisfying both conditions is retained as the canonical result. If convergence is never reached before 10 seconds elapse or the iteration cap is hit, the result is marked non-converged and excluded from all rankings — only converged results participate in win/tie/loss tallying.
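
The convergence rule can be sketched as follows. This assumes RSE means the standard error of the mean divided by the mean, and that "mean shift" is the relative change in mean between consecutive epochs; the README does not spell out either definition, so treat both as assumptions. All names are hypothetical, and this is not the benchmark library's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct sample_stats {
    double mean;
    double variance;  // Bessel-corrected (divides by n - 1)
    std::size_t n;
};

// Summarizes one epoch's trailing window. Precondition: xs.size() >= 2.
inline sample_stats summarize(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    const double mean = sum / static_cast<double>(xs.size());
    double squares = 0.0;
    for (double x : xs) squares += (x - mean) * (x - mean);
    return {mean, squares / static_cast<double>(xs.size() - 1), xs.size()};
}

// Relative standard error in percent (assumed definition, see lead-in).
inline double rse_percent(const sample_stats& s) {
    const double std_error = std::sqrt(s.variance / static_cast<double>(s.n));
    return 100.0 * std_error / s.mean;
}

// Relative change of the mean versus the previous epoch, in percent.
inline double mean_shift_percent(const sample_stats& prev, const sample_stats& cur) {
    return 100.0 * std::abs(cur.mean - prev.mean) / prev.mean;
}

// Both thresholds from the text must hold at the same time.
inline bool converged(const sample_stats& prev, const sample_stats& cur) {
    return rse_percent(cur) < 2.5 && mean_shift_percent(prev, cur) < 1.0;
}
```

Requiring both conditions guards against two different failure modes: a noisy window (high RSE) and a stable-looking window whose mean is still drifting between epochs.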

All results use Bessel-corrected variance and Welch's t-test for statistical tie detection. CPU caches are cleared before each iteration batch to prevent cache-warmth artifacts. Each platform/compiler pair is tested independently on identical hardware.
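
For reference, Welch's statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the sketch below. The significance level used for tie detection is not stated above, so the critical value is left as a caller-supplied parameter; the names summary, welch, and is_tie are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

struct summary {
    double mean;
    double variance;  // Bessel-corrected sample variance
    std::size_t n;    // sample count, at least 2
};

struct welch_result {
    double t;   // test statistic
    double df;  // Welch-Satterthwaite degrees of freedom
};

// Precondition: at least one of the two variances is non-zero.
inline welch_result welch(const summary& a, const summary& b) {
    const double va = a.variance / static_cast<double>(a.n);
    const double vb = b.variance / static_cast<double>(b.n);
    const double t = (a.mean - b.mean) / std::sqrt(va + vb);
    const double df = (va + vb) * (va + vb) /
                      (va * va / static_cast<double>(a.n - 1) +
                       vb * vb / static_cast<double>(b.n - 1));
    return {t, df};
}

// Two results count as a statistical tie when |t| stays below the critical
// value the caller looked up for the chosen significance level and df.
inline bool is_tie(const summary& a, const summary& b, double critical_t) {
    return std::abs(welch(a, b).t) < critical_t;
}
```

Unlike Student's t-test, Welch's variant does not assume equal variances, which fits benchmark samples whose noise differs between implementations.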

Platforms: Linux, macOS, Windows — Compilers: Clang, GCC, MSVC — Benchmark library: benchmarksuite
