SIMD in Register: Doubling Hash Table Lookup Performance
Optimizing String Lookups: A Bitwise Adventure in C#
Ever found yourself staring at a performance bottleneck, knowing there must be a more efficient way? I recently dove headfirst into optimizing string lookups in C#, and let me tell you, it was a journey that led me down a rabbit hole of bitwise operations. The result? A significant performance boost that, while perhaps a tad less readable, is undeniably powerful.
The Problem: Slow String Lookups
In a recent project, I was dealing with a scenario where we needed to perform a high volume of string lookups. The initial implementation used a straightforward byte array to store and check for the presence of strings. While functional, the benchmarks revealed a clear performance issue, especially with negative lookups (when a string isn’t present). The existing method was simply too slow for our needs.
Initial Approach: The Shifting Method
My first attempt at optimization involved a technique that’s often seen in these kinds of scenarios: shifting. The idea was to XOR the input string with a known pattern and then check if any of the resulting bytes were zero.
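```csharp
public static bool shiftlookup(ReadOnlySpan<byte> input)
{
    // Fold the input into one 32-bit word: byte i is XORed into lane (i % 4).
    uint xored = 0;
    for (int i = 0; i < input.Length; i++)
    {
        xored ^= (uint)input[i] << ((i % 4) * 8);
    }
    // Tests the lowest bit of each byte lane.
    return (xored & 0x01010101U) != 0;
}
```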
This was a decent improvement over the naive approach. Positive lookups saw a modest speedup, and negative lookups were also faster. However, I felt there was still room for more aggressive optimization. The loop and the conditional shifting felt like they could be streamlined further.
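To make the XOR step concrete, here is a minimal sketch. It assumes the "known pattern" is a single target byte copied into all four byte lanes of a `uint`; the post does not say what the pattern actually was, and `XorPatternSketch`, `Broadcast`, and `XorWithPattern` are hypothetical names used only for illustration.

```csharp
using System;

public static class XorPatternSketch
{
    // Hypothetical helper: copy one byte into all four byte lanes of a uint.
    // 0x6C becomes 0x6C6C6C6C.
    public static uint Broadcast(byte b) => b * 0x01010101U;

    // XOR a packed word against the broadcast pattern. Any lane equal to
    // `target` becomes 0x00; every other lane becomes non-zero.
    public static uint XorWithPattern(uint word, byte target) => word ^ Broadcast(target);
}
```

For example, the bytes of "Hell" packed little-endian are `0x6C6C6548`. XORing with the broadcast of `0x6C` (the letter "l") gives `0x00000924`: the two upper lanes, which held "l", are now zero. Finding a match then reduces to asking whether any lane of the result is zero.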
The Breakthrough: Bitwise Magic
This is where things get interesting, and perhaps a little bit intimidating if you’re not used to bitwise operations. I stumbled upon a clever bit-twiddling hack that promised even greater performance gains. The core idea is to leverage the properties of XOR and bitwise AND to detect if any byte in a sequence is zero, without explicit looping or conditional checks on each byte.
The magic happens in this line:
```csharp
return ((xored - 0x01010101U) & ~xored & 0x80808080U) != 0;
```
Let’s break down this seemingly cryptic expression:
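- `xored`: This is the result of XORing the input bytes with a specific pattern. The goal here is to zero out bytes that match the pattern.
- `xored - 0x01010101U`: Subtracting `0x01010101U` from `xored` takes `0x01` away from every byte. If a byte in `xored` was `0x00`, the subtraction has to borrow, so the byte wraps around to `0xFF` and its most significant bit ends up set. If a byte was `0x01`, subtracting `0x01` makes it `0x00`. Bytes that already had their top bit set (`0x80` and above) can also come out of the subtraction with it set, which is why the next term is needed.
- `~xored`: This is the bitwise NOT of `xored`. It flips all the bits, so any `0x00` bytes in `xored` become `0xFF` in `~xored`. More generally, the top bit of a byte in `~xored` is set only if that byte was below `0x80` in `xored`. ANDing with this term throws away the bytes whose top bit was set to begin with.
- `& 0x80808080U`: This is a mask that isolates the most significant bit (MSB) of each byte.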
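The four pieces above can be traced one at a time. The following is a rough sketch rather than the project's actual code: `ZeroByteSketch` and `HasZeroByte` are hypothetical names, and the `unchecked` wrapper is an assumption added so the subtraction wraps even if the project is compiled with overflow checking enabled.

```csharp
using System;

public static class ZeroByteSketch
{
    // Returns true when at least one of the four bytes packed into `xored` is 0x00.
    public static bool HasZeroByte(uint xored)
    {
        // Step 1: subtract 1 from every lane. A 0x00 lane wraps to 0xFF (top bit set).
        uint borrowed = unchecked(xored - 0x01010101U);

        // Step 2: top bit is set here only for lanes that were below 0x80 originally.
        uint wasLow = ~xored;

        // Step 3: keep only the top bit of each lane where both conditions hold.
        uint flags = borrowed & wasLow & 0x80808080U;

        return flags != 0;
    }
}
```

A few inputs worked by hand: `0x00000924` (two zero lanes) and `0xFFFFFF00` (one zero lane) both return `true`, while `0x12345678`, `0x01010101`, and `0x80808080` return `false` because none of their lanes is zero.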
When you combine these operations, the expression ((xored - 0x01010101U) & ~xored & 0x80808080U) will result in a non-zero value if and only if
