Containers are a great way to bundle code and its dependencies into a portable package. This is fantastic in a world where dependencies keep getting bigger and more complex. But we are not here to discuss their benefits. If Google brought you here, you more than likely already know what a container is and are trying to troubleshoot one. If that’s the case, I hope you find the tip I am about to share useful.

To understand the solution we must first remember that containers are designed to run a task as soon as they start. You define that task with either CMD or ENTRYPOINT. The problem is that if the task dies, the container dies too, and you can’t log in to it to find out what happened. Sometimes “docker logs” will give you a hint of what’s happening, but other times you wish you could simply log in to the container and check things out, which of course you can’t because the container is dead.
A solution I have become fond of is to override the ENTRYPOINT with a process that keeps the container alive and then open a terminal session into it. Let me give you an example.
IMPORTANT: I am using Podman, but the same approach works with Docker as well. This is the “Dockerfile”. Notice how the main task for this container is “python app.py”.
FROM python:3.10.5
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 7860
CMD ["python", "app.py"]
This is the code contained in “app.py”. It is a simple Gradio chatbot. It gets the user’s prompt and sends an API call to get a response.
import gradio as gr
import os, requests, urllib3

urllib3.disable_warnings()

# Raises KeyError at startup if APP_URL is not defined
appurl = os.environ["APP_URL"]

def give_response(query, history):
    payload = {"query": query}
    response = requests.post(appurl, json=payload)
    return response.json()["response"]

demo = gr.ChatInterface(
    give_response,
    type="messages",
    title="My first Chatbot",
    description="Ask me a question, don't be shy")
demo.launch(server_name="0.0.0.0")
As you can see the code requires us to define an environment variable “APP_URL” with the URL to send the request to. Let’s say that we don’t define the variable and attempt to run the container.
pi@piper1:~$ podman run -d -p 7860:7860 localhost/blog:v1
19dc02db0b2e06c51756c7f55a3eb861c11f831a21574efdcd2e00081c9191a1
pi@piper1:~$ podman ps -a
CONTAINER ID  IMAGE              COMMAND        CREATED        STATUS      NAMES
19dc02db0b2e  localhost/blog:v1  python app.py  3 seconds ago  Exited (1)  dry_rice
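The container exits with status 1 because `os.environ["APP_URL"]` raises a KeyError when the variable is missing. One way to make that kind of failure easier to spot in the logs is an explicit startup check. The following is only a minimal sketch: `require_env` is a hypothetical helper that is not part of the original app.py.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a descriptive error.

    Hypothetical helper for illustration; it is not part of app.py as shown above.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            f"pass it with: podman run -e {name}=<value> ..."
        )
    return value


if __name__ == "__main__":
    try:
        print(require_env("APP_URL"))
    except RuntimeError as err:
        # The message ends up in the container logs before the process exits.
        print(f"Startup check failed: {err}")
```

The optional `environ` argument is there only so the check can be exercised without touching the real environment.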
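As expected the container fails, but here is the trick: we run it again, this time using the “--entrypoint” argument to run “tail -f /dev/null”. The image itself does not change; we are simply overriding “python app.py” with this “tail” command, which stays running and in doing so keeps the container alive. Notice the single quotes around the square brackets. If you are on Docker, note that its “--entrypoint” flag expects just the executable, so the arguments go after the image name instead (“docker run -d --entrypoint tail -p 7860:7860 <image> -f /dev/null”).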
pi@piper1:~$ podman run -d --entrypoint='["tail", "-f", "/dev/null"]' -p 7860:7860 localhost/blog:v1
e1031d4c2ecbf3d4ec244aef262e71fb2f4e227154b6e22eabf634e4e053f4a0
pi@piper1:~$ podman ps
CONTAINER ID  IMAGE              STATUS        PORTS                   NAMES
e1031d4c2ecb  localhost/blog:v1  Up 6 seconds  0.0.0.0:7860->7860/tcp  sour_soup
Now we can log in to the container and run whatever checks we need. We can even run “python app.py” and see what’s going on live.
pi@piper1:~$ podman exec -it e1031d4c2ecb /bin/bash
root@e1031d4c2ecb:/app# ls -l
total 12
-rw-r--r-- 1 root root 518 Mar 24 04:35 Dockerfile
-rw-r--r-- 1 root root 566 Mar 24 04:35 app.py
-rw-r--r-- 1 root root 897 Mar 24 04:35 requirements.txt
root@e1031d4c2ecb:/app# python3 app.py
Traceback (most recent call last):
  File "/app/app.py", line 7, in <module>
    appurl = os.environ["APP_URL"]
  File "/usr/local/lib/python3.10/os.py", line 679, in __getitem__
    raise KeyError(key) from None
KeyError: 'APP_URL'
This was a simplistic example, but you get the point. There could be a file or a path missing, a typo, a permissions issue, and so on. By using this trick you can troubleshoot interactively inside the container. Once you know what the issue is, you can fix your files and rebuild the container image.
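The kinds of checks mentioned above (a missing variable, a missing file, a permissions problem) can also be scripted so you can run them in one go from the terminal session. This is only a sketch: `preflight` is a hypothetical helper, and the variable name and paths are simply the ones from this example.

```python
import os


def preflight(env_vars, paths, environ=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in env_vars:
        if not environ.get(name):
            problems.append(f"missing environment variable: {name}")
    for path in paths:
        if not os.path.exists(path):
            problems.append(f"missing path: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"path not readable by current user: {path}")
    return problems


if __name__ == "__main__":
    # These names mirror the example in this post; adjust them for your image.
    for problem in preflight(["APP_URL"], ["/app/app.py", "/app/requirements.txt"]):
        print(problem)
```

Run inside the kept-alive container, it prints one line per problem and nothing at all when everything checks out.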
The next question is, can you do this in Kubernetes as well? Yes. You add the “command” field to the container definition in the “Deployment”, not to the running “Pod”. In Kubernetes, “command” overrides the image’s ENTRYPOINT. It is a list of strings, as you can see in the last line of the simplified manifest below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
spec:
  template:
    spec:
      containers:
      - name: blog-container
        image: debian
        command: ["tail", "-f", "/dev/null"]
As soon as I save the changes to the deployment manifest, OpenShift creates a new pod and kills the old one. Then I can open a terminal in the new pod and browse around. Notice I can even launch the application from the terminal session.
$ pwd
/app
$ ls -l
total 12
-rw-r--r--. 1 root root 518 Mar 24 04:35 Dockerfile
-rw-r--r--. 1 root root 566 Mar 24 04:35 app.py
-rw-r--r--. 1 root root 897 Mar 24 04:35 requirements.txt
$ ps -ef
UID          PID  PPID  C STIME TTY          TIME CMD
1000770+       1     0  0 00:00 ?        00:00:00 tail -f /dev/null
1000770+       7     0  0 00:00 pts/0    00:00:00 sh -i -c TERM=xterm sh
1000770+      13     7  0 00:00 pts/0    00:00:00 sh
1000770+      90    13  0 00:02 pts/0    00:00:00 ps -ef
$ python3 app.py
* Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
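If you would rather not edit the manifest by hand, the same “command” override can be expressed as a patch. In a full Deployment the container list lives under spec.template.spec, so that is the path used here. This is a sketch under assumptions: `keepalive_patch` is a hypothetical helper, and “blog-container” is the container name from the manifest above.

```python
import json


def keepalive_patch(container_name, command=("tail", "-f", "/dev/null")):
    """Build a strategic-merge patch that overrides one container's command."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        # Containers are matched by name when the patch is merged.
                        {"name": container_name, "command": list(command)}
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # "blog-container" is the container name used in the manifest above.
    print(json.dumps(keepalive_patch("blog-container")))
```

The printed JSON should be usable as the value of “--patch” in “kubectl patch deployment blog” (or the “oc patch” equivalent). If the Deployment also defines “args”, those would still be passed to “tail”, so you may need to clear them as well.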
Of course, this is only intended for troubleshooting. Once you find out what’s wrong you can fix the image and manifest and deploy them again.
