Why is it quicker to switch between threads than to switch between processes and share data between them? For example, the Apache web server running on Ubuntu can use either the prefork MPM (spawning multiple child processes) or the worker MPM (creating multiple threads within a single process).
Tags: linux, process, threads
Related Solutions
I am not sure how to do it from the command line, but I wrote this in PowerShell to do some filtering of OS-related processes. Maybe it will give you an idea. It skips items owned by SYSTEM, the service accounts, and null owners.
gwmi win32_process |
    Select-Object ProcessID, ParentProcessID, Name, @{l="Username";e={$_.GetOwner().User}} |
    Where-Object { $_.Username -ne "SYSTEM" } |
    Where-Object { $_.Username -ne "LOCAL SERVICE" } |
    Where-Object { $_.Username -ne "NETWORK SERVICE" } |
    Where-Object { $_.Username -ne $null } |
    Sort-Object ProcessID |
    Format-Table -AutoSize
Output
ProcessID ParentProcessID Name            Username
--------- --------------- ----            --------
     2136            3460 notepad.exe     KNUCKLE-DRAGGER
     2504            3460 firefox.exe     KNUCKLE-DRAGGER
     2792             700 dllhost.exe     KNUCKLE-DRAGGER
     2816            4232 conhost.exe     KNUCKLE-DRAGGER
     2916            3460 powershell.exe  KNUCKLE-DRAGGER
     3128            3460 notepad.exe     KNUCKLE-DRAGGER
     3180             576 taskhost.exe    KNUCKLE-DRAGGER
     3196            4308 vmware-tray.exe KNUCKLE-DRAGGER
     3460            4392 explorer.exe    KNUCKLE-DRAGGER
     3644            4636 vmware-vmx.exe  KNUCKLE-DRAGGER
     3696            3460 mplayerc.exe    KNUCKLE-DRAGGER
     4636            3196 vmware.exe      KNUCKLE-DRAGGER
     4828            3460 notepad.exe     KNUCKLE-DRAGGER
As another user commented, it's mostly OS-dependent.
if a CPU has 2 logical cores, it can run two programs 100% concurrent, yes?
Concurrently yes, in parallel no. See: https://softwareengineering.stackexchange.com/questions/190719/the-difference-between-concurrent-and-parallel-execution
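A minimal sketch of this distinction, using Python's threading module (note that in CPython the GIL means these two threads are concurrent but not truly parallel):

```python
import threading

results = []

def worker(name, n):
    # Each iteration may be interleaved with the other thread by the scheduler.
    for i in range(n):
        results.append((name, i))

t1 = threading.Thread(target=worker, args=("a", 3))
t2 = threading.Thread(target=worker, args=("b", 3))
t1.start(); t2.start()
t1.join(); t2.join()

# Both threads finished; the interleaving order is up to the scheduler,
# but the set of work done is deterministic.
print(sorted(results))
# → [('a', 0), ('a', 1), ('a', 2), ('b', 0), ('b', 1), ('b', 2)]
```

Both threads make progress during the same time window (concurrency); whether their instructions execute at the same instant on two cores (parallelism) is up to the runtime and the OS.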
For example, say I have 100 processes running on 2 cores ... will the OS try and divide 50 on each core for load balance? Will they be randomly scattered?
Each OS has its own scheduling algorithm.
Say I launch mspaint.exe on a quad-core Intel chip ... where will it be executed from (core 1, 2, 3, 4?), and will it continue executing there until close?
We don't know where it will be executed, and it will most probably not continue executing on the same core from start to finish. Again, it depends on the OS scheduler.
Is it truly possible to pick a specific core, or program for multi-cores directly without having a transparent daemon or the OS doing it randomly for you?
Apparently yes: https://stackoverflow.com/questions/663958/how-to-control-which-core-a-process-runs-on
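For instance, on Linux a process can pin itself to a specific core via the sched_setaffinity system call, which Python exposes as os.sched_setaffinity where available. A hedged sketch (the API is Linux-specific, so it is guarded):

```python
import os

def pin_to_core(core):
    """Try to pin the calling process to one CPU core (Linux-only API)."""
    if hasattr(os, "sched_setaffinity"):   # absent on e.g. macOS and Windows
        os.sched_setaffinity(0, {core})    # pid 0 = the calling process
        return os.sched_getaffinity(0)     # the affinity set actually in effect
    return None                            # affinity control unsupported here

print(pin_to_core(0))   # {0} on Linux; None where unsupported
```

After pinning, the scheduler will only run this process on the given core; child processes inherit the affinity mask. (From the shell, `taskset` does the same job.)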
How so, if all people say is "just use threads"? Is using multi-threads mapped to cores? If so, how is using a thread tailored to a core without OS intervention if threads on a single-core do not concurrently work?
I didn't fully understand the question here, but the basic idea with threads is that you create them and the OS runs them using its scheduling algorithm; there's no need for you to control which logical or physical core they run on (there may be cases where you'd want to do that, but I'm not sure why).
Best Answer
Quickie, incomplete explanation:
In a thread switch, there's a lot less context to save and restore. Most importantly, the address space is shared: the kernel doesn't have to page out dirty pages or do VM tricks to pull in all the memory for a new process (though some specific pages may still need to be pulled in). Other per-process data structures in the kernel (say, the open file descriptor table) don't need to be swapped out either.
As a side effect, you're also much more likely to be able to use what's already in the processor's caches at that point. A new process probably starts with a cold cache.
Yes, there are tools to share memory between processes (IPC shared memory, pipes), but none are as clean or easy as the common address space within a process. You can't grow a shared memory block the way you can with realloc() and friends. Anything besides shared memory means keeping multiple copies of data structures, one in each process, with tricks to copy changes back and forth as needed.
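To illustrate the point about common memory, here is a minimal Python sketch: threads mutate one shared variable directly, needing only a lock for coordination, whereas separate processes would need an explicit shared-memory object or message passing to see each other's updates.

```python
import threading

counter = 0
lock = threading.Lock()

def bump(n):
    global counter
    for _ in range(n):
        with lock:        # the lock is all the coordination the threads need
            counter += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # → 40000: every thread updated the same memory directly
```

With processes instead of threads, each child would get its own copy of `counter` and the increments would be lost unless you routed them through something like `multiprocessing.Value` or a pipe.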
Specifically, Apache has multiple models (MPMs) for different OSes. The original model was prefork, trading some heavyweight process switching for process isolation. This worked fine on the UNIXes that were common at the time of its first writing, some of which didn't have threads at all. Windows was so bad at multi-process work that Apache had to use threading there: processes on Windows (at least at the time of Apache 1.2/2.0) were too heavyweight. Linux has very light processes, where switching cost is close to thread-switching cost, so it usually stayed with prefork. Solaris has a complex "LWP" thread model and does best with a hybrid thread/fork model.