This weekend I learned that it is possible to generate images with diffusion models without a GPU. I didn't think this would work, but it's not only possible, it's easy and actually pretty fast! (Disclaimer: you need a good amount of RAM; I have 20GB.)
Setup
To keep this simple and portable, I used Docker to run fastsdcpu.
Dockerfile
FROM ubuntu:24.04 AS base
RUN apt-get update \
&& apt-get install -y python3 python3-venv python3-pip python3-wheel ffmpeg git wget nano \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& pip install uv --break-system-packages
FROM base AS fastsd
ARG FASTSDCPU_VERSION=v1.0.0-beta.200
RUN git clone https://github.com/rupeshs/fastsdcpu /app \
&& cd /app \
&& git checkout $FASTSDCPU_VERSION \
&& wget "https://huggingface.co/rupeshs/FastSD-Flux-GGUF/resolve/main/libstable-diffusion.so?download=true" -O libstable-diffusion.so
WORKDIR /app
SHELL [ "/bin/bash", "-c" ]
RUN echo y | bash -x ./install.sh --disable-gui
VOLUME /app/models/gguf/
VOLUME /app/lora_models/
VOLUME /app/controlnet_models/
VOLUME /root/.cache/huggingface/hub/
ENV GRADIO_SERVER_NAME=0.0.0.0
EXPOSE 7860
CMD [ "/app/start-webui.sh" ]
I used Docker Compose to map the volumes to directories on my host system. This stores the downloaded models outside of the container and makes it possible to add custom models.
docker-compose.yaml
services:
  fastsdcpu:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7860:7860"
    volumes:
      - gguf:/app/models/gguf/
      - lora:/app/lora_models/
      - ctrl:/app/controlnet_models/
      - cache:/root/.cache/huggingface/hub/
    deploy:
      resources:
        limits:
          memory: 20g
    stdin_open: true
    tty: true
    environment:
      - GRADIO_SERVER_NAME=0.0.0.0

volumes:
  gguf:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./models/gguf
  lora:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./models/lora
  cache:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./models/cache
  ctrl:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./models/ctrl
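Because these volumes are declared as bind mounts through driver_opts, Docker generally won't create the host directories for you, so create them before the first run (the paths below simply match the device entries in my compose file):

mkdir -p models/gguf models/lora models/ctrl models/cache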
Then run sudo docker compose up --build to start the container. Once the Web UI service has started, you can access it at http://localhost:7860.
Usage
This app is designed to auto-download the selected model the first time you try to generate an image with it. You'll have to experiment with what works best for your use case. The default model, LCM -> stabilityai/sd-turbo, works pretty well for objects and scenery but does not do so well with realistic images of people. LCM-Lora -> Lykon/dreamshaper-8 is much better at people and surprisingly fast. Even with my modest Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz and no dedicated GPU, I can generate crisp, consistent images in ~30 seconds.

Of course, your generation settings will affect this. Higher resolution images or more inference steps will take longer. I found the best settings for dreamshaper to be 4-5 steps with a guidance scale of 1. I quickly generate 256x256 images to test prompts, and once I get roughly what I want, I increase the resolution and other settings gradually until I get exactly the image I'm looking for. Using the tiny auto encoder for SD option makes a significant difference in speed.
Testing
I tried using LCM-OpenVINO -> rupeshs/sd-turbo-openvino, which is built specifically for Intel setups, but I found it took longer and bogged my system down. If you have a newer Intel Arc based system it will probably work better for you.

Sadly, I was not able to get Flux1 working. I think it requires CPU instructions that my system does not have. If you have an i7 or better, this would be the ideal model to choose if you want highly creative images, especially in a fantasy setting. Also, Flux1 can generate coherent text, which Stable Diffusion notoriously fails at.
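I haven't verified exactly which instruction sets the Flux GGUF backend needs, but you can check which SIMD extensions your CPU advertises (SSE, AVX, AVX2, AVX-512 and so on) straight from /proc/cpuinfo:

# print the SIMD-related flags advertised by the first CPU core
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(sse|avx)' | sort -u

On my i5-7200U this shows AVX and AVX2 but nothing from the AVX-512 family, which may well be the limiting factor.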
Bonus
This app also has an API that you can hook into, but it needs to be enabled before you can access it. You can do that by adding these extra bits to the Docker config.
Add to the Dockerfile, after RUN echo y | bash -x ./install.sh --disable-gui:
RUN cat > run.sh <<EOF
#!/bin/bash
# start the API server in the background, then the web UI in the foreground
/app/start-webserver.sh &
/app/start-webui.sh
EOF
RUN chmod +x start-webserver.sh run.sh
# ... VOLUMES ...
EXPOSE 7860
EXPOSE 8000
CMD [ "/app/run.sh" ]
Add the port to docker-compose.yaml
services:
  fastsdcpu:
    ports:
      - "7860:7860"
      - "8000:8000"
Then you can access the API browser at http://localhost:8000/api/docs and do things like making POST requests to /api/generate with a JSON body like this:
{
  "diffusion_task": "text_to_image",
  "use_tiny_auto_encoder": true,
  "use_lcm_lora": true,
  "lcm_lora": {
    "base_model_id": "Lykon/dreamshaper-8",
    "lcm_lora_id": "latent-consistency/lcm-lora-sdv1-5"
  },
  "prompt": "a silly cat",
  "negative_prompt": "humans",
  "image_height": 256,
  "image_width": 256,
  "inference_steps": 1,
  "guidance_scale": 1,
  "number_of_images": 1
}
You'll get a JSON response back containing a base64-encoded JPG image that you can drop directly into a data URL. The first image you generate through the API will take a little longer while the system warms up, but after that things run pretty smoothly.
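As a rough sketch of testing this from the command line with curl, jq and base64: I'm assuming here that the request body above is saved as request.json and that the base64 image data comes back in an images array, so check http://localhost:8000/api/docs for the exact response shape of your version.

curl -s -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d @request.json \
  | jq -r '.images[0]' \
  | base64 -d > silly-cat.jpg   # adjust the .images[0] path (and strip any data: prefix) if your response differs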