For whatever reason, the Go compiler insisted on loading rm.sm[i] in
several chunks, even though it could be loaded in a single 64-bit block.
Instead, let's reorder our loads to minimize the amount of memory we're
needlessly moving around.

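
As a rough illustration of the idea (not the actual router code), the
sketch below uses a hypothetical 8-byte `state` struct and a hypothetical
`matches` loop. It reads the one-byte mode first and only loads the
literal bytes on the path that compares them. The type, field names, and
helpers are all assumptions made for this example, and whether the
compiler emits fewer or narrower loads for it depends on the Go version.

```go
package main

import "fmt"

// state is a hypothetical 8-byte state machine entry. The layout is an
// assumption for illustration, not Goji's real definition.
type state struct {
	mode uint8   // low two bits: how many bytes of bs to compare
	bs   [3]byte // up to three literal bytes
	i    int32   // unused here; pads the entry to 64 bits
}

const lengthMask = 0x03

// compile splits s into entries holding at most three bytes each.
func compile(s string) []state {
	var sm []state
	for len(s) > 0 {
		n := len(s)
		if n > 3 {
			n = 3
		}
		st := state{mode: uint8(n)}
		copy(st.bs[:], s[:n])
		sm = append(sm, st)
		s = s[n:]
	}
	return sm
}

// matches reports whether p begins with the bytes encoded in sm.
func matches(sm []state, p string) bool {
	for i := 0; i < len(sm); i++ {
		// Load only the mode byte first; it decides whether the
		// rest of the entry is needed at all.
		mode := sm[i].mode
		n := int(mode & lengthMask)
		if n > len(p) {
			return false // bs was never read on this path
		}
		// Load the literal bytes only where they are compared.
		bs := sm[i].bs
		for j := 0; j < n; j++ {
			if p[j] != bs[j] {
				return false
			}
		}
		p = p[n:]
	}
	return true
}

func main() {
	sm := compile("/hello")
	fmt.Println(matches(sm, "/hello/world")) // true
	fmt.Println(matches(sm, "/help"))        // false
}
```

Any real gain from a change like this has to be confirmed with
benchmarks, since the generated loads vary by compiler and architecture.
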
This gives us about a 15% perf boost in
github.com/julienschmidt/go-http-routing-benchmark's
BenchmarkGoji_StaticAll, and questionable benefits (i.e., not
distinguishable from noise but certainly no worse) on Goji's own
benchmarks.