Recently I had to troubleshoot a disk space problem on one of our client's servers: the hard disk was full because a Docker process was incorrectly writing its logs to the local file system instead of forwarding them to stdout.
# df on / shows 372G used and only 44G available
df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             32G     0   32G   0% /dev
tmpfs           6.3G  640M  5.7G  10% /run
/dev/md2        438G  372G   44G  90% /
tmpfs            32G  3.4M   32G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md1        488M  486M     0 100% /boot
tmpfs           6.3G     0  6.3G   0% /run/user/1001
As visible from the snippet above, the /dev/md2 partition has 372G in use.
While investigating with `du`, we realised that a single container was taking up most of the space: it had written a massive log file of about 330G under `/var/lib/docker/containers/fileID.log`.
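We no longer have the exact `du` invocation from that session, but something along these lines (plain GNU du and sort) is what narrows it down, one directory level at a time:

# stay on the / filesystem (-x) and list the largest top-level directories first
sudo du -xh --max-depth=1 / 2>/dev/null | sort -hr | head -15
# then repeat on the biggest hit, e.g.
sudo du -xh --max-depth=1 /var/lib/docker 2>/dev/null | sort -hr | head -15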
Although we removed the offending container and its whole stack from the Rancher deployment, the disk space was still allocated: the file had been deleted, but it was not being released because a process still held it open. So we went looking for deleted files that had not been released yet. You can do that with `lsof`, or, as in our case where it was not installed, with `find`/`ls` on `/proc`.
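lsof was not available on this server, but where it is installed something along these lines lists files that are deleted yet still held open (`+L1` selects open files with a link count below one):

# open files with fewer than 1 link, i.e. deleted but still open
sudo lsof +L1
# or narrow it down to the Docker log directory
sudo lsof +L1 | grep /var/lib/docker/containers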
Indeed you can simply list `/proc/*/fd` with `ls`, like this:
sudo ls -lU /proc/*/fd | grep deleted
lr-x------ 1 root root 64 Oct 13 2018 42 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log (deleted)
lr-x------ 1 root root 64 Oct 14 2018 123 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log (deleted)
lr-x------ 1 root root 64 Oct 15 2018 139 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log (deleted)
lr-x------ 1 root root 64 Apr 18 06:00 146 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log (deleted)
and see files marked as deleted but not released yet.
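To confirm how much space such a deleted file is still pinning, you can stat it through its descriptor; the PID (1233) and fd (146) used here are the ones we identify with find further below:

# -L dereferences the /proc symlink to the (deleted) file behind it
sudo stat -L -c 'size: %s bytes' /proc/1233/fd/146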
Then you have a few options: you can try to stop or restart the process holding the file, or, as in our case where that could have caused downtime for our client, simply overwrite the file with empty content. If the file still existed on disk it could be truncated with
: > /path/to/the/file.log
but since our file had already been deleted and was only being held open by a process, we had to do the overwrite through the process's file descriptor instead, looking up the process ID and the file descriptor and running
: > "/proc/$pid/fd/$fd"
or
sudo sh -c ': > /proc/1233/fd/146'
if you experience permission problems in bash (the redirection is performed by your own shell before sudo applies, so it has to be wrapped in sh -c to run as root).
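An alternative that sidesteps the redirection-under-sudo problem entirely is coreutils' truncate, again using the same PID and file descriptor as in the example above:

# shrink the still-open (deleted) file to zero bytes via its /proc path
sudo truncate -s 0 /proc/1233/fd/146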
To find the process id and the file descriptor you can run
sudo find /proc/*/fd -ls | grep deleted | grep docker
17288 0 lr-x------ 1 root root 64 Oct 13 2018 /proc/1233/fd/42 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log\ (deleted)
1106161 0 lr-x------ 1 root root 64 Oct 14 2018 /proc/1233/fd/123 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log\ (deleted)
4993234 0 lr-x------ 1 root root 64 Oct 15 2018 /proc/1233/fd/139 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log\ (deleted)
1659260673 0 lr-x------ 1 root root 64 Apr 18 06:00 /proc/1233/fd/146 -> /var/lib/docker/containers/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24/15ef8edcf7dcef2ea696fdef79e8b22150789227c86ec856570a49f086300e24-json.log\ (deleted)
As you can see above, the $pid and $fd values are part of each path in this breakdown: /proc/1233/fd/146, for example, gives $pid=1233 and $fd=146.
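If there are many such descriptors, a small loop can clear them all in one go. This is only a minimal sketch of the approach, assuming the culprits are Docker json logs and that it is run as root:

#!/bin/sh
# Truncate every deleted Docker json.log that is still held open by a process.
for link in /proc/[0-9]*/fd/*; do
  target=$(readlink "$link" 2>/dev/null) || continue
  case "$target" in
    /var/lib/docker/containers/*-json.log" (deleted)")
      echo "truncating $link"
      : > "$link"
      ;;
  esac
done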
Once the content is overwritten, the space is finally given back to the filesystem:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 32G 0 32G 0% /dev
tmpfs 6.3G 680M 5.7G 11% /run
/dev/md2 438G 38G 377G 10% /
tmpfs 32G 3.4M 32G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/md1 488M 486M 0 100% /boot
tmpfs 6.3G 0 6.3G 0% /run/user/1001
I hope you liked this article. Please feel free to share it online or contact us if you have any questions.