• PlanterTree@discuss.tchncs.de · 5 days ago

    Intel and ARM Ampere systems.

    Does this mean they optimized for CPU instead of GPU? I doubt they target Intel GPUs tbh, so they really optimized for CPU… interesting!

    • brucethemoose@lemmy.world · 4 days ago

      All the runtimes except the Intel ones are llama.cpp Q4_K_M quants, so the Ampere ones aren’t anything special.

      …The Intel ones kinda are, though. They actually have runtimes for CPU/GPU and NPU, and AFAIK the CPU ones may be able to use AMX (Intel’s matrix extensions) if you are on a server CPU.

      It’s still not great for a lot of reasons, but one could do worse.
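
For the AMX point above, here is a rough sketch of how one might check whether a Linux box even advertises AMX. It assumes an x86 Linux system where the kernel lists the `amx_tile`, `amx_int8`, and `amx_bf16` feature flags in `/proc/cpuinfo`; the function names are made up for illustration. It only says whether the CPU exposes AMX, not whether any particular runtime actually uses it.

```python
from pathlib import Path

# AMX-related feature flags as the Linux kernel reports them in /proc/cpuinfo.
AMX_FLAGS = {"amx_tile", "amx_int8", "amx_bf16"}


def amx_flags_from_cpuinfo(text: str) -> set[str]:
    """Return the AMX flags found in the first 'flags' line of cpuinfo text."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return AMX_FLAGS & set(value.split())
    return set()


def detect_amx(path: str = "/proc/cpuinfo") -> set[str]:
    """Read cpuinfo from disk; return an empty set if it is unavailable."""
    try:
        return amx_flags_from_cpuinfo(Path(path).read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    found = detect_amx()
    print("AMX flags:", ", ".join(sorted(found)) or "none")
```

On a desktop chip or an ARM machine this should just report no AMX flags, which fits the "server CPU" caveat.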