Running LLMs Locally Using Podman, Ollama & OpenWebUI
Introduction
In this post, we will look at how to run local AI models using OpenWebUI, Ollama and Podman. We will start by understanding what Podman is and how it differs from Docker, and then put this knowledge into practice by creating a pod with two containers: Ollama and OpenWebUI.
Podman
Podman is a daemonless and rootless open-source container management system, which allows you to quickly create containers for services with low overhead. Here are some advantages of using podman:
Security: podman doesn't use a daemon, unlike docker, whose daemon runs as root. This gives it a smaller attack surface, as malicious actors may target vectors like a daemon because of its privileged permissions. Furthermore, each container is launched with SELinux, which uses security policies to enforce and manage processes, files and apps [1]. Moreover, each container can be managed without root, restricting it to only what each user can access.
systemd: podman has great integration with systemd via quadlets, where users can create and manage containers via systemd [2].
Ease of Use: The podman command line is very similar to docker, so podman is easy to pick up and get going. Furthermore, pods group containers together, allowing for easier management of multiple containers and services, similar to docker compose. These pods can then be re-initialised via podman generate kube <pod name>, which generates a .yml file that can be used to quickly recreate the pod with podman kube play <yml file>, as sketched below.
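For example, here is a rough sketch of that workflow, assuming a pod named localai like the one we create below:
# Export the pod's definition to a Kubernetes-style YAML file
podman generate kube localai > localai.yml
# Later (or on another machine), recreate the pod from that file
podman kube play localai.yml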
Creating our pod
So, we want to run our own local AI models using podman. How do we do this? Firstly, we know these services will be linked, so we should create a pod to make them easier to manage together.
podman pod create --name localai -p 11434:11434 -p 3000:8080
We used pod create to make a pod and give it a name: localai. We then map the ports that each service uses. We first map port 11434 on the host to the same port inside the pod; this is how we will connect to Ollama from our host machine. Secondly, we map host port 3000 to port 8080 in the OpenWebUI container.
Note: You can use any arbitrary host port you want, as long as it is not already in use. We must also declare all port mappings during pod creation, as Podman does not allow you to retroactively add port mappings to an existing pod.
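If you want to double-check which ports a pod exposes, you can inspect it after creation. A quick sketch, assuming the pod name used above (the exact layout of the output may vary between podman versions):
# Print the pod's full configuration; the port mappings appear under the infra container's settings
podman pod inspect localai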
Adding containers to our pod
We now need to add the Ollama and OpenWebUI containers to the pod; then we can launch the pod and hopefully everything should work! Let's do the Ollama container first.
Ollama
We can create a container like so and add it to our newly-created pod:
podman create --name ollama --pod=localai -v ollama:/root/.ollama --device /dev/kfd --device /dev/dri ollama/ollama:rocm
Here, we create a container, give it a name, add it to the pod, create a volume where Ollama will store all of our downloaded models, and finally provide the image name. We don't need to provide port mappings here, as we have already done so during pod creation.
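If you want to confirm that the named volume was created, podman can list and inspect volumes. For example:
# The 'ollama' volume should appear once the container has been created
podman volume ls
# Show where the volume's data lives on the host
podman volume inspect ollama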
GPU Tip: I am using an AMD GPU, so the podman create command above should work as-is, since it gives the container access to the devices dri and kfd. For users who are not using an AMD GPU, please visit this link to see the correct configuration for your graphics card.
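If you don't have a supported GPU at all and just want to try things out, one option is to drop the device flags and use the default image instead of the command above. This is only a sketch and will run models on the CPU, which is noticeably slower:
# CPU-only variant of the Ollama container (no GPU passthrough)
podman create --name ollama --pod=localai -v ollama:/root/.ollama ollama/ollama:latest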
OpenWebUI
We will follow similar steps for OpenWebUI:
podman create --name openwebui --pod=localai -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:latest
Pulling images: If you are running these commands for the first time, podman will automatically pull the images down from their respective repositories. Once the images have been downloaded, any subsequent container creation that uses those images will use the copies already stored locally on your system.
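If you would rather download the images ahead of time, you can pull them explicitly and then list what is stored locally:
podman pull ollama/ollama:rocm
podman pull ghcr.io/open-webui/open-webui:latest
# List the images stored on your system
podman images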
Once done, we should have three containers in our pod. We can verify this by running podman pod ls and looking at the last column. You may be wondering why the number of containers is three and not two.
POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
2ce6f54a43de localai Created 8 seconds ago 1cf56691c890 3
podman, by default, creates a container to manage the pod (typically called an infra container). This gets automatically added to the pod, so we don't really need to worry about it for now.
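You can see the infra container sitting alongside the two service containers by listing all containers together with the pod they belong to:
# Show all containers, including the pod each one belongs to
podman ps -a --pod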
Launching the pod
We can now launch our pod and check to see if everything works okay.
podman pod start localai
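If you want to confirm that both services came up cleanly, you can check the container logs, using the container names we chose earlier:
podman logs ollama
podman logs openwebui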
Once the pod has started, allow the containers a few seconds to boot up, then head over to localhost:3000 (or whichever host port you chose earlier). You should see the OpenWebUI sign-in page. Signing up simply creates a local account and doesn't require a valid email.

Fantastic! You can also check to see if Ollama is running:
➜ curl 127.0.0.1:11434
Ollama is running
You should get the "Ollama is running" output!
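At this point, no models have been downloaded yet. You can pull one through OpenWebUI, or directly with the Ollama CLI inside the container. The model name below is just an example; pick whichever model suits your hardware:
# Download a model into the 'ollama' volume (llama3.2 is an example model name)
podman exec -it ollama ollama pull llama3.2
# Optionally, chat with it straight from the terminal
podman exec -it ollama ollama run llama3.2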
Conclusion
We did it! We can now run local AI models on our own systems, through a ChatGPT-like interface, using Ollama and our own GPU.
Hopefully, you can see the positives of using podman over docker, and how pods can be helpful in grouping services together.