Is a POSIX context switch well-defined? Is it the same thing as switching threads in C? Can a C compiler generate everything needed for a context switch, or is assembly programming still needed for the routine that switches threads or switches the "context"? Is it even defined what is meant by "context"? Isn't it the same as a thread?
Does POSIX define context switch
thread
Related Solutions
Interrupts are handled by the operating system, threads (or processes, for that matter) aren't even aware of them.
In the scenario you paint:
- Your thread issues a read() system call; the kernel gets the request, realizes that the thread won't do anything until data arrives (blocking call), so the thread is blocked.
- Kernel allocates space for buffers (if needed), and initiates the "find the block to be read, request for that block to be read into the buffer" dance.
- The scheduler selects another thread to use the just freed CPU
- All goes their merry way, until...
- ... an interrupt arrives from the disk. The kernel takes over, sees that this marks the completion of the read issued before, and marks the thread ready. Control returns to userspace.
- All goes their merry way, until...
- ... somebody yields the CPU for one of a thousand reasons, and it just so happens that the freed CPU gets assigned to the thread which was waiting for data.
Something like that, anyway. No, the CPU isn't assigned to the waiting thread when an interrupt arrives to signal completion of the transfer. The interrupt might preempt another thread, and execution will probably resume in that thread (or perhaps the scheduler selects another one).
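To make the "thread blocks in read(), CPU goes elsewhere" step observable, here is a minimal sketch in C. It assumes a system (Linux and the BSDs, for example) that fills in the ru_nvcsw field of struct rusage; POSIX itself does not require that field. The function name voluntary_switches_during_blocking_read is made up for illustration, and a forked child that sleeps briefly before writing to a pipe stands in for the disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of voluntary context switches
 * the calling (single-threaded) process accumulated while blocked in
 * read(), or -1 on error.
 * Assumption: ru_nvcsw is an extension and may stay 0 on some systems. */
long voluntary_switches_during_blocking_read(void)
{
    int fds[2];
    struct rusage before, after;
    char byte;

    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* Child plays the "disk": data shows up only after a delay,
         * so the parent has to block first. */
        struct timespec delay = { 0, 50 * 1000 * 1000 }; /* 50 ms */
        close(fds[0]);
        nanosleep(&delay, NULL);
        byte = 'x';
        _exit(write(fds[1], &byte, 1) == 1 ? 0 : 1);
    }

    close(fds[1]); /* the parent only reads */

    int ok = getrusage(RUSAGE_SELF, &before) == 0;
    /* The pipe is empty, so the kernel puts us to sleep here and is
     * free to run something else until the child writes. */
    ssize_t n = read(fds[0], &byte, 1);
    ok = ok && getrusage(RUSAGE_SELF, &after) == 0;

    close(fds[0]);
    waitpid(child, NULL, 0);

    if (!ok || n != 1)
        return -1;
    return after.ru_nvcsw - before.ru_nvcsw;
}
```

Calling this from main() and printing the result should, on Linux, typically show at least one voluntary switch, because the reader gave up the CPU while waiting. The exact number is up to the scheduler, so treat it as an illustration rather than a measurement.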
I know this question is pretty old (Feb 16), but here is a response in case it helps someone else. The problem is that you passed '-F 999', which means you want to sample the events at a frequency of 999 times a second. For 'trace' events, you generally don't want sampling. For instance, when I select sched:sched_switch, I want to see every context switch. If you enter -F 999, you get only a sampling of the context switches. If you look at the output of your 'perf record' command with something like:
perf script --verbose -I --header -i perf.dat -F comm,pid,tid,cpu,time,period,event,trace,ip,sym,dso > perf.txt
then you would see that the 'period' (the number between the timestamp and the event name) is (usually) not 1.
If you use a 'perf record' command like the one below, you'll see a period of 1 in the 'perf script' output, like:
Binder:695_5 695/2077 [000] 16231.700440: 1 sched:sched_switch: prev_comm=Binder:695_5 prev_pid=2077 prev_prio=120 prev_state=S ==> next_comm=kworker/u16:17 next_pid=7665 next_prio=120
A long-winded explanation, but basically: don't do that (where 'that' is '-F 999').
If you just do something like:
perf record -a -g -e sched:sched_switch -e sched:sched_blocked_reason -e sched:sched_stat_sleep -e sched:sched_stat_wait sleep 5
then the output would show every context switch with the call stack for each event. And you might need to do:
echo 1 > /proc/sys/kernel/sched_schedstats
to get the sched_stat events.
Best Answer
POSIX uses the term context switch for at least two different purposes, without attempting to define it rigorously (or even providing a definition):
Rather, POSIX assumes you already know what the term means. For instance,
Further reading: