

Completely depends on your laptop hardware, but generally:
- TabbyAPI (exllamav2/exllamav3)
- ik_llama.cpp, and its openai server
- kobold.cpp (or kobold.cpp rocm, or croco.cpp, depends)
- An MLX host with one of the new distillation quantizations
- Text-gen-web-ui (slow, but supports a lot of samplers and some exotic quantizations)
- SGLang (extremely fast for parallel calls if thats what you want).
- Aphrodite Engine (lots of samplers, and fast at the expense of some VRAM usage).
I use text-gen-web-ui at the moment only because TabbyAPI is a little broken with exllamav3 (which is utterly awesome for Qwen3), otherwise I’d almost always stick to TabbyAPI.
Tell me (vaguely) what your system has, and I can be more specific.
I had 2x MMR. Just got a 3rd shot anyway, just in case.
EDIT: For more context, the Costco pharmacist told me (even with 2 shots) its immunity does wane over time. She said I’d probably be fine skipping in my age bracket, but I’m in Texas and I don’t like ‘probably.’