Rminibwa follows the
RsimdDispatch/SIMDe downstream-package pattern
for the SIMD-sensitive minibwa KSW kernels:
configure probes compiler support;src/rmb-ksw/;src/Makevars.in links those staged objects through
PKG_LIBS;This avoids putting ISA flags such as -mavx2 into global
PKG_CPPFLAGS or PKG_CFLAGS.
Dispatch starts at the KSW functions called by minibwa:
ksw_extz2_sse();ksw_extd2_sse();ksw_ll_qinit() and the ksw_ll_*
helpers.The rest of minibwa stays baseline-compiled once:
Rminibwa currently exposes:
scalar: portable SIMDe fallback compiled with native
intrinsics disabled and no ISA flags;sse4: native SSE4.1/SIMDe-alias build of the KSW
files;avx2: native AVX2 build, including the widened dual-gap
extd2 kernel.The avx2 backend uses the wide
ksw2_extd2_wide.c patch for the dominant dual-gap
gap-filling path and uses AVX2-compiled KSW objects for the remaining
KSW entry points.
library(Rminibwa)
simd_info()
#> $dispatch_mode
#> [1] "rminibwa-ksw-staged"
#>
#> $requested_backend
#> [1] "auto"
#>
#> $selected_backend
#> [1] "avx2"
#>
#> $compiled_backends
#> [1] "scalar" "sse4" "avx2"
#>
#> $cpu_supported_backends
#> [1] "scalar" "sse4" "avx2"
#>
#> $available_backends
#> [1] "scalar" "sse4" "avx2"
#>
#> $target
#> [1] "x86_64"Select a backend explicitly for diagnostics or benchmarks:
old <- simd_backend()
simd_set_backend("scalar")
simd_backend()
#> [1] "scalar"
simd_set_backend("auto")simd_set_backend() errors if the requested backend was
not compiled or is not supported by the current CPU/runtime.
For every available backend, Rminibwa runs the same native mapping
fixture in tinytest and checks that the native alignment
batch and ALTREP column accessors remain valid. The README
make rdm path additionally benchmarks:
scalar vs sse4 vs
avx2; andBackend timing must use a workload that actually reaches KSW. Very
short exact matches can be resolved by seeding/chaining and ungapped
fast paths without a single ksw_ext* or
ksw_ll* call, so apparent
scalar/sse4/avx2 differences on
those examples are just end-to-end mapper noise. The developer README
benchmark uses an indel-mutated random read and records internal KSW
call counters before timing count-only mapping.