Running top in batch mode (-b) should get you the information you're looking for.
Here's a very messy start to what you could do (top right-aligns the PID column, so the first field is picked out with awk rather than cut):
top -b -n 1 | grep -A 1 PID | awk 'NR == 2 { print $1 }' | xargs kill
You can always kill a process from an interactive run of top
using the k key as well, since you might not like what it picks...
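Since you might not like what it picks, it can help to look at the PID before piping it into kill. A minimal sketch, assuming the usual top batch layout where the process list follows a header line whose first column is PID; first_pid is a made-up helper name, not part of top:

```shell
# Hypothetical helper: print the PID of the first process listed after
# top's header line. Reads `top -b -n 1` output on stdin and prints
# nothing if no header line is found.
first_pid() {
    awk 'found { print $1; exit } $1 == "PID" { found = 1 }'
}

# Usage (only prints the PID, it does not kill anything):
#   top -b -n 1 | first_pid
```

Once the PID looks right, the output can be handed to xargs kill as in the one-liner.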
Not sure what kernel you're running, but cgroups may also be of use to you in addition to limits.conf.
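To give an idea of what the cgroups route could look like, here is a rough sketch for cgroup v2. It assumes the unified hierarchy is mounted at /sys/fs/cgroup, that the memory controller is enabled for child groups, and that you run it as root. limit_memory and the CGROUP_ROOT override are hypothetical names used only for illustration:

```shell
# Hypothetical helper: put one process into a memory-limited cgroup (cgroup v2).
# CGROUP_ROOT is an assumed override for the mount point; it defaults to
# /sys/fs/cgroup.
limit_memory() {
    name=$1     # name of the group to create
    limit=$2    # hard memory cap, e.g. 512M
    pid=$3      # process to move into the group
    cg="${CGROUP_ROOT:-/sys/fs/cgroup}/$name"
    mkdir -p "$cg" || return 1
    echo "$limit" > "$cg/memory.max" || return 1
    echo "$pid" > "$cg/cgroup.procs"
}

# Usage (as root): cap the current shell and its future children at 512M
#   limit_memory limited 512M "$$"
```

On an older cgroup v1 setup the file names differ, so treat this as a starting point only.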
One approach could be to use PID namespaces:
Boot your system with init=/some/cmd as a kernel parameter, where /some/cmd forks a process in a new namespace (CLONE_NEWPID) and runs /sbin/init in it (it will have PID 1 in that new namespace and an ordinary, higher PID in the root namespace), then, in the parent, execute your "program".
You'll probably want a way to control your program in one way or another (TCP or ABSTRACT Unix socket for instance).
You'll probably want to mlock your program in memory and close most references to the filesystem so that it doesn't rely on anything.
Your program won't be visible from the rest of the system, which will in effect run as if in a container.
If your program dies, the kernel will panic (it is PID 1 in the root namespace), which gives you an extra guarantee.
An inconvenient side-effect though is that we won't see the kernel threads in the output of ps.
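From the parent side (the root namespace), one way to check that a given process really ended up in a separate PID namespace is to compare the /proc/PID/ns/pid links. A small sketch, assuming a kernel that exposes those links (3.8 or later) and enough privilege to read them; same_pidns is a hypothetical helper name:

```shell
# Hypothetical helper: succeed (exit status 0) only if both PIDs are in the
# same PID namespace. Returns 2 if either namespace link cannot be read.
same_pidns() {
    a=$(readlink "/proc/$1/ns/pid") || return 2
    b=$(readlink "/proc/$2/ns/pid") || return 2
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage, from the program running as PID 1 in the root namespace
# (replace PID with the process you want to check):
#   same_pidns 1 PID && echo "same namespace" || echo "different (or unreadable)"
```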
As a proof of concept (using this trick to boot a copy of your system in a qemu virtual machine):
Create a /tmp/init like:
#! /bin/sh -
echo Starting
/usr/local/bin/unshare -fmp -- sh -c '
umount /proc
mount -nt proc p /proc
exec bash <&2' &
ifconfig lo 127.1/8
exec socat tcp-listen:1234,fork,reuseaddr system:"ps -efH; echo still running"
(You need unshare from a recent version of util-linux (2.14).) Above we're using socat as the "program", which just answers TCP connections on port 1234 with the output of ps -efH.
Then boot your VM as:
kvm -kernel /boot/vmlinuz-$(uname -r) -initrd /boot/initrd.img-$(uname -r) \
-m 1024 -fsdev local,id=r,path=/,security_model=none \
-device virtio-9p-pci,fsdev=r,mount_tag=r -nographic -append \
'root=r rootfstype=9p rootflags=trans=virtio console=ttyS0 init=/tmp/init rw'
Then, we see:
Begin: Running /scripts/init-bottom ... done.
Starting
[...]
root@(none):/# ps -efH
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 14:24 ? 00:00:00 bash
root 4 1 0 14:24 ? 00:00:00 ps -efH
root@(none):/# telnet localhost 1234
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
UID PID PPID C STIME TTY TIME CMD
root 2 0 0 14:24 ? 00:00:00 [kthreadd]
root 3 2 0 14:24 ? 00:00:00 [ksoftirqd/0]
[...]
root 1 0 2 14:24 ? 00:00:00 socat tcp-listen:1234,fork,reuseaddr system:ps -efH; echo still running
root 204 1 0 14:24 ? 00:00:00 /usr/local/bin/unshare -fmp -- sh -c umount /proc mount -nt proc p /proc exec bash <&2
root 206 204 0 14:24 ? 00:00:00 bash
root 212 206 0 14:25 ? 00:00:00 telnet localhost 1234
root 213 1 0 14:25 ? 00:00:00 socat tcp-listen:1234,fork,reuseaddr system:ps -efH; echo still running
root 214 213 0 14:25 ? 00:00:00 socat tcp-listen:1234,fork,reuseaddr system:ps -efH; echo still running
root 215 214 0 14:25 ? 00:00:00 sh -c ps -efH; echo still running
root 216 215 0 14:25 ? 00:00:00 ps -efH
still running
Connection closed by foreign host.
root@(none):/# QEMU: Terminated
Best Answer
If you did not put the software there and/or if you think your cloud instance is compromised: take it offline, delete it, and rebuild it from scratch (but read the link below first). It does not belong to you anymore; you cannot trust it any longer.
See "How to deal with a compromised server" on ServerFault for further information about what to do and how to behave when getting a machine compromised.
In addition to the things to do and think about in the list(s) linked to above, be aware that depending on who you are and where you are, you may have a legal obligation to report it to either a local/central IT security team/person within your organization and/or to authorities (possibly even within a certain time frame).
In Sweden (since December 2015), for example, state agencies (e.g. universities) are obliged to report IT-related incidents within 24 hours. Your organization will have documented procedures for how to go about doing this.