Does the X windowing system suffer from scalability problems?


One of my professors was telling us about scalability problems, and said that the X protocol was a prime example of a protocol that doesn't scale. Why is that? Is it because it is very hardware dependent? I know that X is used in modern Unix/Linux environments. If it's not scalable, then why is it used so widely?

Best Answer

One reason he may have said this is that the traffic flowing back and forth between an X client and an X server is fairly verbose. This isn't an issue when the traffic only has to travel between the two on a single box, but when it has to cross a network connection, it becomes painfully obvious that the protocol is inefficient.

The protocol is tolerable on a LAN, but as soon as you try to span it over a WAN connection, or introduce encryption in the form of a VPN or an SSH connection between the client and the server, the protocol really starts to show its lack of scalability.

Benchmarking

You can use the tool x11perf to get a sense of the impact of running applications locally vs. running them over an SSH connection to another X system.

Here I'm running the -create test to give you a taste of what I'm talking about.

localhost

$ x11perf -create
x11perf - X11 performance program, version 1.2
Fedora Project server version 10905000 on :0.0
from grinchy
Mon Sep 16 21:08:28 2013

Sync time adjustment is 0.1340 msecs.

   2400 reps @   0.0134 msec ( 74400.0/sec): Create and map subwindows (4 kids)
   2400 reps @   0.0156 msec ( 64300.0/sec): Create and map subwindows (4 kids)
   ....
   2400 reps @   0.0119 msec ( 83800.0/sec): Create and map subwindows (100 kids)
  12000 trep @   0.0063 msec (158000.0/sec): Create and map subwindows (100 kids)
   ....
   2400 reps @   0.0029 msec (349000.0/sec): Create and map subwindows (200 kids)
  12000 trep @   0.0049 msec (205000.0/sec): Create and map subwindows (200 kids)

LAN host

$ ssh skinner "x11perf -create"
....    
Sync time adjustment is 1.5461 msecs.

   2400 reps @   0.0270 msec ( 37100.0/sec): Create and map subwindows (4 kids)
   2400 reps @   0.0219 msec ( 45700.0/sec): Create and map subwindows (4 kids)
   ....
   2400 reps @   0.0168 msec ( 59600.0/sec): Create and map subwindows (100 kids)
  12000 trep @   0.0211 msec ( 47300.0/sec): Create and map subwindows (100 kids)
   ....
   2400 reps @   0.0159 msec ( 62900.0/sec): Create and map subwindows (200 kids)
  12000 trep @   0.0196 msec ( 50900.0/sec): Create and map subwindows (200 kids)

WAN host

$ ssh catbus-o "x11perf -create"
....
Mon Sep 16 21:12:22 2013

Sync time adjustment is 27.9911 msecs.

   2400 reps @   0.0592 msec ( 16900.0/sec): Create and map subwindows (4 kids)
   2400 reps @   0.0604 msec ( 16600.0/sec): Create and map subwindows (4 kids)
   ....
   2400 reps @   0.0538 msec ( 18600.0/sec): Create and map subwindows (100 kids)
  12000 trep @   0.0558 msec ( 17900.0/sec): Create and map subwindows (100 kids)
   ....
   2400 reps @   0.0697 msec ( 14400.0/sec): Create and map subwindows (200 kids)
  12000 trep @   0.0586 msec ( 17100.0/sec): Create and map subwindows (200 kids)
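The "Sync time adjustment" line in each run is worth a second look. As a rough sketch, and assuming that figure approximates the cost of one client/server round trip (an assumption on my part, not something the output states), the snippet below estimates how many strictly synchronous request/reply exchanges per second each link could sustain. The names SYNC_MS and max_round_trips_per_sec are made up for illustration.

```python
# "Sync time adjustment" values in milliseconds, copied from the three runs above.
SYNC_MS = {
    "localhost": 0.1340,
    "LAN host": 1.5461,
    "WAN host": 27.9911,
}


def max_round_trips_per_sec(sync_ms):
    """Upper bound on back-to-back synchronous exchanges per second.

    Assumes each exchange must wait one full round trip of sync_ms
    milliseconds before the next can start (a simplifying assumption).
    """
    return 1000.0 / sync_ms


if __name__ == "__main__":
    for name, ms in SYNC_MS.items():
        print(f"{name:10s} {ms:8.4f} ms -> ~{max_round_trips_per_sec(ms):7.1f} round trips/sec")
```

Under that assumption, anything in an X session that has to wait for a reply gets dramatically more expensive as the link gets longer, regardless of how much bandwidth is available.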

Notice the extreme drop off from:

localhost:

  12000 trep @   0.0049 msec (205000.0/sec): Create and map subwindows (200 kids)

LAN host:

  12000 trep @   0.0196 msec ( 50900.0/sec): Create and map subwindows (200 kids)

WAN host:

  12000 trep @   0.0586 msec ( 17100.0/sec): Create and map subwindows (200 kids)

That's a pretty steep decline in performance. Now realize that this isn't all X's fault: the LAN test is going over a 100MB network and the WAN test over a ~20MB connection. But the point still stands. X isn't helping itself with the overly beefy communications it throws back and forth between the X server and the X client.
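To put a number on that drop-off, here is a minimal sketch that parses the three "200 kids" summary lines quoted above and expresses each rate relative to localhost. The regular expression and the helper names (parse_line, slowdowns) are my own and only assume the line layout shown in this output, not any documented x11perf format.

```python
import re

# The three "200 kids" summary lines, copied verbatim from the runs above.
RESULTS = {
    "localhost": "  12000 trep @   0.0049 msec (205000.0/sec): Create and map subwindows (200 kids)",
    "LAN host": "  12000 trep @   0.0196 msec ( 50900.0/sec): Create and map subwindows (200 kids)",
    "WAN host": "  12000 trep @   0.0586 msec ( 17100.0/sec): Create and map subwindows (200 kids)",
}

# Captures the per-operation time in msec and the operations/sec figure.
LINE_RE = re.compile(r"@\s*([\d.]+)\s*msec\s*\(\s*([\d.]+)/sec\)")


def parse_line(line):
    """Return (msec_per_op, ops_per_sec) from one x11perf result line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized x11perf line: {line!r}")
    return float(match.group(1)), float(match.group(2))


def slowdowns(results, baseline="localhost"):
    """Return how many times slower each run is than the baseline run."""
    base_rate = parse_line(results[baseline])[1]
    return {name: base_rate / parse_line(line)[1] for name, line in results.items()}


if __name__ == "__main__":
    for name, factor in slowdowns(RESULTS).items():
        print(f"{name:10s} {factor:5.1f}x slower than localhost")
```

On these figures the LAN run comes out roughly 4x slower than localhost and the WAN run roughly 12x slower.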

Communications Breakdown (couldn't resist the Led Zeppelin reference)

This is more for effect, but to give you a rough idea of the amount of data flowing back and forth during the x11perf -create test above, I ran it against my LAN host again, this time using tcpdump to capture the SSH traffic and dump it to a file.

I used this command:

$ sudo -i
$ tcpdump -lnni wlan0 -w dump.log -s 65535 host skinner and port ssh

The resulting log file:

$ ll dump.log 
-rw-r--r-- 1 root root 5768821 Sep 16 22:30 dump.log

So the resulting amount of traffic was in the ballpark of 5.5MB. Granted, this is not all X traffic, but it gives you an idea of the amount of data flowing. This is really the Achilles' heel of X, and the major reason why it can't scale.
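The file size is only a proxy: it includes the capture file's own headers plus link, IP, TCP and SSH overhead on every packet. If you want a slightly tighter count, the sketch below walks a capture with nothing but the standard library and totals the packets and their on-the-wire bytes. It assumes tcpdump wrote the classic pcap format (its usual default for -w); the function name summarize_pcap is hypothetical, and it does not try to separate X payload from the other overhead.

```python
import struct
import sys

# First four bytes of a classic pcap file, mapped to the struct byte order.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def summarize_pcap(stream):
    """Return (packet_count, total_wire_bytes) for a classic pcap stream."""
    header = stream.read(24)  # the global file header is 24 bytes
    if len(header) < 24 or header[:4] not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file")
    endian = PCAP_MAGICS[header[:4]]

    packets = 0
    wire_bytes = 0
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        _ts_sec, _ts_frac, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured packet data
        packets += 1
        wire_bytes += orig_len  # original length on the wire
    return packets, wire_bytes


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as capture:
        count, total = summarize_pcap(capture)
    print(f"{count} packets, {total} bytes on the wire ({total / 1024**2:.2f} MiB)")
```

Saved under a name of your choosing, it would be run with the capture file as its only argument, for example against dump.log from above.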
