Shell – Metaphor for the concept of shell

I'm finding myself helping out some classmates in my computer science class, because I have prior development experience, and I'm having a hard time explaining certain things like the shell. What's a good metaphor for the shell in the context of the Terminal on Mac, contrasted with a remote shell via SSH?

Best Answer

Put simply, a terminal is an I/O environment for programs to operate in, and a shell is a command processor that takes commands and carries them out, usually both interactively and non-interactively (scripted). The shell runs within the terminal as a program.
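To show classmates that the shell really is just another program, here is a minimal sketch, assuming a POSIX system with sh on the PATH. The helper name run_in_shell is made up for this illustration. It hands a command string to the shell non-interactively, with no terminal involved at all.

```python
import subprocess


def run_in_shell(command: str) -> str:
    """Start sh as an ordinary child process and let it interpret one command."""
    result = subprocess.run(
        ["sh", "-c", command],   # -c: read the command from the argument, not a terminal
        capture_output=True,     # collect stdout/stderr through pipes
        text=True,
        check=True,              # raise if the shell reports a non-zero exit status
    )
    return result.stdout


if __name__ == "__main__":
    # The shell, not Python, parses the "|" and sets up the pipeline.
    print(run_in_shell("echo hello | tr a-z A-Z"), end="")
```

The same sh binary does the work whether the command text comes from a keyboard in Terminal, a script file, or an argument like this one.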

There is little difference between a local and a remote shell beyond where the shell process runs. A remote shell is generally connected to a pty, although local shells can be too.
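The pty point can be made visible with a rough sketch, assuming a POSIX system where Python's pty module works and sh is available. The names stdin_kind and CHECK are invented for this example. The shell is asked whether its standard input is a terminal, once with a pty attached and once without.

```python
import os
import pty
import subprocess

# Shell snippet: "-t 0" is true when standard input is a terminal device.
CHECK = "if [ -t 0 ]; then echo tty; else echo not-a-tty; fi"


def stdin_kind(use_pty: bool) -> str:
    """Run CHECK in sh with stdin on a pty slave or on /dev/null."""
    if not use_pty:
        proc = subprocess.run(["sh", "-c", CHECK], stdin=subprocess.DEVNULL,
                              capture_output=True, text=True, check=True)
        return proc.stdout.strip()
    # openpty() returns (master_fd, slave_fd). The shell gets the slave end,
    # which plays the role of the device an SSH server or terminal emulator
    # would hand to it.
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.run(["sh", "-c", CHECK], stdin=slave_fd,
                              capture_output=True, text=True, check=True)
        return proc.stdout.strip()
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(stdin_kind(use_pty=True))
    print(stdin_kind(use_pty=False))
```

From the shell's point of view, the only thing that changes is what sits on the other end of its file descriptors.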

Related Solutions

Bash – Object-Oriented Shell for Unix Systems

I can think of three desirable features in a shell:

Interactive usability: common commands should be quick to type; completion; ...
Programming: data structures; concurrency (jobs, pipe, ...); ...
System access: working with files, processes, windows, databases, system configuration, ...

Unix shells tend to concentrate on the interactive aspect and subcontract most of the system access and some of the programming to external tools, such as:

bc for simple math
openssl for cryptography
sed, awk and others for text processing
nc for basic TCP/IP networking
ftp for FTP
mail, Mail, mailx, etc. for basic e-mail
cron for scheduled tasks
wmctrl for basic X window manipulation
dcop for KDE ≤3.x libraries
dbus tools (dbus-* or qdbus) for various system information and configuration tasks (including modern desktop environments such as KDE ≥4)

Many, many things can be done by invoking a command with the right arguments or piped input. This is a very powerful approach — better have one tool per task that does it well, than a single program that does everything but badly — but it does have its limitations.
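As an illustration of "one tool per task" glued together by piped input, here is a small sketch, assuming tr and sort are installed. The function name upper_sorted is hypothetical. It builds by hand the plumbing a shell sets up for tr a-z A-Z | sort.

```python
import os
import subprocess


def upper_sorted(text: str) -> str:
    """Feed text through two external tools connected by a pipe."""
    env = {**os.environ, "LC_ALL": "C"}  # fixed locale so sort order is predictable
    upper = subprocess.Popen(["tr", "a-z", "A-Z"], stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, env=env)
    # sort reads directly from tr's output; Python never sees the intermediate data.
    sorter = subprocess.Popen(["sort"], stdin=upper.stdout,
                              stdout=subprocess.PIPE, env=env)
    upper.stdout.close()          # leave sort as the only reader of that pipe
    upper.stdin.write(text.encode())
    upper.stdin.close()           # end of input lets tr, then sort, finish
    output, _ = sorter.communicate()
    upper.wait()
    return output.decode()


if __name__ == "__main__":
    print(upper_sorted("pear\napple\n"), end="")
```

This version writes all input before reading any output, which is fine for small inputs; a large input would need the writing and reading to overlap.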

A major limitation of unix shells, and I suspect this is what you're after with your “object-oriented scripting” requirement, is that they are not good at retaining information from one command to the next, or combining commands in ways fancier than a pipeline. In particular, inter-program communication is text-based, so applications can only be combined if they serialize their data in a compatible way. This is both a blessing and a curse: the everything-is-text approach makes it easy to accomplish simple tasks quickly, but raises the barrier for more complex tasks.
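The cost of text-only communication shows up as soon as the data stops being simple. The following sketch uses made-up helper names (emit_text, parse_text, emit_json, parse_json) and an invented "name size" line format. It is only meant to show how a whitespace-separated format breaks on a file name containing a space, while an agreed structured format survives.

```python
import json


def emit_text(records):
    """Producer: one 'name size' line per record, as a typical tool might print."""
    return "".join(f"{name} {size}\n" for name, size in records)


def parse_text(text):
    """Consumer: split on whitespace and hope the first two fields are name and size."""
    parsed = []
    for line in text.splitlines():
        fields = line.split()
        parsed.append((fields[0], int(fields[1])))
    return parsed


def emit_json(records):
    """Producer that serializes structure explicitly."""
    return json.dumps([{"name": name, "size": size} for name, size in records])


def parse_json(text):
    """Consumer for the JSON producer above."""
    return [(item["name"], item["size"]) for item in json.loads(text)]


if __name__ == "__main__":
    tricky = [("my file.txt", 10)]
    try:
        parse_text(emit_text(tricky))
    except ValueError as error:
        print("text format failed:", error)
    print(parse_json(emit_json(tricky)))
```

Both sides have to agree on the serialization up front, which is exactly the "compatible way" constraint described above.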

Interactive usability also runs rather against program maintainability. Interactive programs should be short, require little quoting, not bother you with variable declarations or typing, etc. Maintainable programs should be readable (so not have many abbreviations), should be unambiguous (so you don't have to wonder whether a bare word is a string, a function name, a variable name, etc.), should have consistency checks such as variable declarations and typing, etc.

In summary, a shell is a difficult compromise to reach. Ok, this ends the rant section, on to the examples.

The Perl Shell (psh) “combines the interactive nature of a Unix shell with the power of Perl”. Simple commands (even pipelines) can be entered in shell syntax; everything else is Perl. The project hasn't been in development for a long time. It's usable, but hasn't reached the point where I'd consider using it over pure Perl (for scripting) or pure shell (interactively or for scripting).
IPython is an improved interactive Python console, particularly targeted at numerical and parallel computing. This is a relatively young project.
irb (interactive ruby) is the Ruby equivalent of the Python console.
scsh is a scheme implementation (i.e. a decent programming language) with the kind of system bindings traditionally found in unix shells (strings, processes, files). It doesn't aim to be usable as an interactive shell however.
zsh is an improved interactive shell. Its strong point is interactivity (command-line editing, completion, common tasks accomplished with terse but cryptic syntax). Its programming features aren't that great (on par with ksh), but it comes with a number of libraries for terminal control, regexps, networking, etc.
fish is a clean start at a unix-style shell. It doesn't have better programming or system access features. Because it breaks compatibility with sh, it has more room to evolve better features, but that hasn't happened.

Addendum: another part of the unix toolbox is treating many things as files:

Most hardware devices are accessible as files.
Under Linux, /sys provides more hardware and system control.
On many unix variants, process control can be done through the /proc filesystem.
FUSE makes it easy to write new filesystems. There are already existing filesystems for converting file formats on the fly, accessing files over various network protocols, looking inside archives, etc.
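A quick sketch of the "many things are files" idea, assuming a Unix-like system that provides /dev/urandom and /dev/null (both Linux and macOS do). The function names are invented for the example. Nothing here uses a device-specific API, only the ordinary open, read and write calls.

```python
def random_bytes_from_device(count: int) -> bytes:
    """Read from the kernel's random number generator as if it were a file."""
    with open("/dev/urandom", "rb") as device:
        return device.read(count)


def discard(data: bytes) -> int:
    """Write to the null device; the kernel accepts the bytes and drops them."""
    with open("/dev/null", "wb") as sink:
        return sink.write(data)


if __name__ == "__main__":
    print(random_bytes_from_device(8).hex())
    print(discard(b"gone"), "bytes discarded")
```

Because these are plain file operations, the same thing works from a shell with redirection, for example reading from /dev/urandom with head or sending output to /dev/null.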

Maybe the future of unix shells is not better system access through commands (and better control structures to combine commands) but better system access through filesystems (which combine somewhat differently — I don't think we've worked out what the key idioms (like the shell pipe) are yet).

Concept of memory mapping in Unix like systems

Consider: two processes can have the same file open for reading & writing at the same time, so some kind of communication is possible between the two.

When process A writes to the file, it first populates a buffer inside its own process-specific memory with some data, then calls write, which copies that buffer into another buffer owned by the kernel (in practice, this will be a page cache entry, which the kernel will mark as dirty and eventually write back to disk).

Now process B reads from the same point in the same file; read copies the data from the same place in the page cache into a buffer in B's memory.

Note that two copies are required: first the data is copied from A into the "shared" memory, and then copied again from the "shared" memory into B.
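The two copies can be traced in a small sketch. For brevity it plays both roles, A and B, inside one Python process using two separate file descriptors, which is an assumption of this example and not how the answer describes it. The function name copy_through_kernel is also made up.

```python
import os
import tempfile


def copy_through_kernel(path: str, payload: bytes) -> bytes:
    """Move payload from 'A' to 'B' via write() and read() on the same file."""
    # Role A: payload sits in A's own memory; write() copies it into the kernel.
    fd_a = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd_a, payload)   # copy 1: user buffer -> kernel (page cache)
    finally:
        os.close(fd_a)
    # Role B: read() copies from the kernel into a fresh buffer owned by B.
    fd_b = os.open(path, os.O_RDONLY)
    try:
        received = os.read(fd_b, len(payload))   # copy 2: kernel -> user buffer
    finally:
        os.close(fd_b)
    return received


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(copy_through_kernel(os.path.join(folder, "shared.bin"), b"hello from A"))
```

A production version would loop on os.write and os.read, since either call may transfer fewer bytes than requested.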

A could use mmap to make the page cache memory available directly in its own address space. Now it can format its data directly into the same "shared" memory instead of populating an intermediate buffer, which avoids a copy.

Similarly, B could mmap the page directly into its address space. Now it can directly access whatever A put in the "shared" memory, again without having to copy it into a separate buffer.

(Obviously some kind of synchronization is required if you really want to use this scheme for IPC, but that's out of scope).
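The mmap variant can be sketched the same way. Again both "processes" are simulated by two independent mappings inside one Python process, which is an assumption made to keep the example short, and the name share_via_mmap is hypothetical. On Unix, Python's mmap.mmap creates a shared (MAP_SHARED) mapping by default, so both views are backed by the same pages.

```python
import mmap
import os
import tempfile


def share_via_mmap(path: str, message: bytes) -> bytes:
    """Store message through one mapping and observe it through another."""
    size = mmap.PAGESIZE
    if len(message) > size:
        raise ValueError("message must fit in one page for this sketch")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, size)          # the file must be as large as the mapping
        view_a = mmap.mmap(fd, size)    # role A: shared mapping of the file
        view_b = mmap.mmap(fd, size)    # role B: a second, independent mapping
    finally:
        os.close(fd)                    # the mappings stay valid after close
    try:
        view_a[:len(message)] = message     # A stores straight into the shared pages
        # B sees the data without any read() call; slicing copies it out only
        # so that it can be returned after the mapping is closed.
        return view_b[:len(message)]
    finally:
        view_a.close()
        view_b.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(share_via_mmap(os.path.join(folder, "shared.bin"), b"hello from A"))
```

No write or read system call moves the payload here; as noted above, real IPC would still need synchronization so that B knows when A has finished writing.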

Now consider the case where A is replaced by the driver for whatever device this file is stored on. By accessing the file with mmap, B still avoids a redundant copy (the DMA or whatever into the page cache is unavoidable, but it doesn't need to be copied again into B's buffer).

There are also some drawbacks, of course. For example:

if your device and OS support asynchronous file I/O, you can avoid blocking reads/writes using that ... but reading or writing a mmapped page can cause a blocking page fault which you can't handle directly (although you can try to avoid it using mincore etc.)
it won't stop you trying to read off the end of a file, or help you append to it, in a nice way (you need to check the length yourself, or explicitly grow the file with truncate before mapping the new range)
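The append drawback looks roughly like this in practice. The sketch below, with an invented function name append_via_mmap, grows the file with ftruncate first and only then maps and fills the new range, since the mapping itself cannot extend the file.

```python
import mmap
import os
import tempfile


def append_via_mmap(path: str, data: bytes) -> int:
    """Append data to an existing file through a mapping; return the new size."""
    fd = os.open(path, os.O_RDWR)
    try:
        old_size = os.fstat(fd).st_size
        if not data:
            return old_size             # nothing to do, and an empty file cannot be mapped
        new_size = old_size + len(data)
        os.ftruncate(fd, new_size)      # grow the file explicitly before mapping
        with mmap.mmap(fd, new_size) as view:
            view[old_size:new_size] = data   # fill the newly added range
            view.flush()                     # ask the kernel to write the pages back
        return new_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "log.bin")
        with open(target, "wb") as handle:
            handle.write(b"abc")
        print(append_via_mmap(target, b"def"))
```

Compare this with a plain write on a descriptor opened in append mode, where the kernel tracks the end of the file for you.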
