1 article tagged with moe.
Using llama.cpp to punch above your hardware class
Learn how to run Z.ai’s massive 30B-parameter mixture-of-experts (MoE) reasoning model on a modest Windows GPU using llama.cpp and tensor overrides.