No idea what a llama.cpp is.
It’s a portable, standard-C++, CPU-based implementation of inference (i.e. text generation) for the LLaMA language model. You get a command-line program that takes a prompt and the model weights and eventually outputs more text.
You could do the same thing and run Stable Diffusion on the CPU at some relatively slow speed, but I don’t know if anyone has code for it.
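To give a rough idea of what that looks like in practice, here’s a sketch of a typical invocation. The model path and prompt are just placeholders, and the exact flag names may differ between versions, so treat it as illustrative rather than exact:

    # generate 128 tokens from a prompt using 8 CPU threads
    # (model path is illustrative; point it at whatever converted/quantized weights you have)
    ./main -m ./models/7B/ggml-model-q4_0.bin -p "Once upon a time" -n 128 -t 8

It just streams the generated text to the terminal as it goes.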
There are a few UIs that run SD on CPU.
You can do it with Auto1111.
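If you want to try it on CPU specifically, the usual approach is to pass the CPU-related launch flags. Flag names here are from memory, so treat this as a sketch and check the webui docs:

    # from the stable-diffusion-webui checkout; forces everything onto the CPU
    python launch.py --skip-torch-cuda-test --use-cpu all --no-half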
Wow, that claims to be really fast on CPU actually. Why aren’t people using this all the time instead of the annoying services?
That guide references this version for better CPU performance. I haven’t tried using the CPU, but from my experience with raytraced rendering on a CPU, it’s probably very slow compared to a GPU. It might still be faster than online services with a queue, though.