Run Forest, run
You know now how to run commands in the terminal and how a command is assembled. You can distinguish parameters from commands. You know how to use the built-in help system. You can easily navigate through the file tree and look at files.
Identify the command and all arguments:
- cat /proc/meminfo
- ls -l -h ~
- less /proc/uptime
- pwd
- man man
Name the four most essential components of your computer.
Explain what files are for. Give a reason why we need directories.
Name 3 responsibilities of the kernel.
You’ve already learnt about the file structure. You know that files are used to store data and are organized in directories. You also know that files are kept on the hard disc, so when you turn off the machine the files still persist (i.e. are still there when you turn the machine back on). Files are one of the most important concepts in computers, as we store everything - and I mean everything - in a file. But files don’t do stuff by themselves. They are a mere container, a box, to store some content.
You’ve surely heard about memory and the [CPU]. Let’s explain these two parts quickly. The CPU is responsible for all the calculations. This means it can add, subtract, multiply and divide like crazy. It’s insanely fast, but at the same time it can hardly remember anything. That’s because it does not have any memory [1]. Instead, the memory is a separate component, often called [RAM] or main memory. The CPU and memory work closely together. Basically, the CPU loads parts of the memory, does the computation and stores the result in the memory again.
Principally, the main memory and the hard disc are not that different. Both are storage systems. You could actually use a hard disc instead of main memory. We usually don’t do this because the hard disc is extremely slow, at least compared to main memory. Since main memory is fast, why don’t we use it for storage of files? Well, RAM only remembers stuff as long as there’s power. When you turn off your computer, all data in the main memory is lost but the hard disc persists.
So how does this information help us? Well, you’ve just discovered how computers - or any electronic device - work. Let’s now put the facts together. There’s the disc keeping all persistent data in files. This means that programs are also stored in files - surely a program is persistent, as you still have it there after a reboot. So yes, a program is nothing more than a file on your disc. It does not contain text, however, but instructions the computer can understand and execute (the code). But as stated, the file is only a container; it does not do anything on its own. When a program is to be run, it is copied from the disc to main memory. From there, the CPU can directly read it and execute the program code it contains, instruction by instruction.
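To make the idea that a program is just a file more tangible, here is a small sketch that locates the file behind the cat command and inspects it like any other file. The variable name cat_path is only chosen for this illustration, and the exact path printed depends on your system.

```shell
# Ask the shell where the file behind the command name "cat" is stored.
cat_path=$(command -v cat)
echo "cat is stored at: $cat_path"

# A program is an ordinary file: it has a size, an owner and permissions.
ls -l "$cat_path"

# The x (execute) permission is what allows this file to be run as a program.
if [ -x "$cat_path" ]; then
    echo "$cat_path is executable"
fi
```

If you tried to look at this file with cat or less, you would see gibberish instead of text, because it contains instructions for the CPU rather than something meant for humans.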
Any questions? Let’s go over some exercises to let the new knowledge sink in.
Do a little online research:
- What’s the price per megabyte for a hard disk? And for main memory?
- What’s the access speed (MB/s reading) of a hard disk? And of main memory?
When you enter text in leafpad but don’t save it - where is it stored (memory or hard disk)?
When you save the text in leafpad where is it stored (memory or hard disk)?
How is a program file different from a text file you write with leafpad?
It was just defined what a program is: A file (probably several) containing instructions. But a program is a static thing - as it’s a file it doesn’t do anything (yet). If you want to use a program, you have to start it (seems trivial, right?). Let’s check out what programs currently run on your machine. We introduce a new command to do so:
top
By pressing M and P you can sort the list by memory or CPU consumption. You can return to the terminal via q. Note that for top the case matters - you must write the characters as shown here.
On the right, you see the program name. For example, SciTE is the text editor which I use to write this tutorial right now. The time column (TIME+) to the left of the name tells you how much CPU time the program has consumed so far, which is not the same as how long ago it was started. Also interesting are the %CPU and %MEM columns - they show you which program consumes the most system resources. For example, if the %MEM value of a program is very high - let’s say 80% - you will soon run out of memory and into trouble. top also presents the overall memory statistics (the KiB Mem row). If the free value is very low, your system may soon become very slow [2]. We will not discuss the meaning of the other columns now; they will be explained later on.
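If you only want the overall memory numbers without starting top, they can also be read from /proc/meminfo, the file you already looked at with cat. The following is a minimal sketch, assuming a Linux system whose /proc/meminfo contains the MemTotal and MemAvailable lines; the variable name mem_summary is made up for this example.

```shell
# Extract total and available memory from /proc/meminfo (values are in KiB)
# and convert them to MiB for easier reading.
mem_summary=$(awk '/^MemTotal:/     { total = $2 }
                   /^MemAvailable:/ { avail = $2 }
                   END { printf "total: %d MiB, available: %d MiB", total / 1024, avail / 1024 }' /proc/meminfo)
echo "$mem_summary"
```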
Have a look at top again.
Before, you were told that the program name is denoted in the rightmost column. Well, in the column header it says COMMAND there. You’ve used the term command before, now we say it’s the program name. Confused? Actually these terms are closely related. You might have guessed - the command you type is exactly the program name. All you do on the command line is start programs. You do so by giving the program name. Consider the already well known command
ls /
When you execute the command, the computer checks if a program called ls is installed. If so, it is started. A program can produce text output. In the case of ls, that’s the list of files and directories. While in principle you could do anything with the output, the terminal simply displays it.
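As a small illustration of "doing anything with the output", the sketch below stores the output of ls in a variable instead of letting the terminal display it right away. The $(...) notation has not been covered in this tutorial so far; it simply means "run this command and hand me its output". The variable names listing and entry_count are made up for this example.

```shell
# Capture the output of ls instead of letting the terminal display it.
listing=$(ls /)

# Now the text can be processed further, for example counted...
entry_count=$(printf '%s\n' "$listing" | wc -l)
echo "The root directory contains $entry_count entries"

# ...or displayed after all, just like the terminal would have done.
printf '%s\n' "$listing"
```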
So that explains all you see on the terminal - the command line (starting programs) and text (program output). But a program is not restricted to text only. You want to start a graphical program from the terminal? No problem, just do it:
xterm
Ok, this opens another terminal, how impressive. Really, you can do whatever you want and you open a new terminal? You must really love the terminal by now. You can’t even use both terminals: you’ll notice that the original command line is ‘blocked’. Only once you exit the new terminal window can you continue in the first.
Note
The command you type in the terminal is not always what you’d expect. For example, if you want to start LibreOffice, you’d have to type soffice.
Only one step is missing until absolution. Have you ever started a program twice? Of course, this is possible. Let’s have a look at this. You’ll need 3 terminals to run the next example (the ones where you start the leafpad can’t be used until leafpad is closed - just like xterm before). The first two commands use leafpad to show you a file. The files are probably empty, if you like you can type something.
leafpad /tmp/foo.t
leafpad /tmp/bar.t
Next, let’s look at the process list, similar to what you did with top before. But this time, we use a different command.
ps u
Usually, top is used for system monitoring (memory and CPU consumption). It automatically updates the process list and you can sort easily. However, it also truncates the list so that it fits your screen. If you want the complete list of all processes and don’t need the regular updates, then ps is the better alternative.
Your screen could look somewhat like in the image below:
In the topmost terminal, you now see leafpad twice in ps’s output. That’s because you ran the program leafpad two times. Not really unexpected, is it?
There’s a slight difference between a program and a process. The program is a static thing, a collection of files lying somewhere on the hard disc. It is typically installed once. When you run a program, a process (or program instance) is created. So the process is the running program, the ‘living’ thing. It can run, do stuff, receive input, produce output and do more stuff. You can have several processes (of the same program) running at the same time. When you close the program (e.g. via the x button), you actually terminate its process. The process is then really finished, all traces of it are removed. The program itself stays installed, though - you’d be surprised if terminating a process uninstalled the program.
Why this distinction? In the example, the leafpad was opened twice, each one reading a (different) file. So even though it’s the same program, each process works with its own file. Another example: Imagine you start ls in two different terminals at two different paths at the same time. Of course you’d expect ls to report the directory contents from where it was started - even though a second ls runs simultaneously. So each process has to run independently of other instances of the same program.
Generally speaking, each process gets some resources. That’s files, CPU time and memory, but also access to the network or other hardware [3]. It also gets some information about its environment, like the working directory it was started from and the arguments you used to start it. You can also call this the context a program is started in. All this doesn’t matter (or is not known) when the program is not running and may be different at each program startup.
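The following sketch shows this context in action. It starts the same tiny program twice, from different directories and with different arguments, and lets each process report what it was given. The one-line program stored in the variable report is invented for this illustration; inside it, $$ stands for the PID of the process (a number identifying it, explained later in this chapter), $PWD for its working directory and $* for its arguments.

```shell
# A tiny program that reports the context it was started in.
report='echo "pid=$$ cwd=$PWD args=$*"'

# Start it twice: from different directories and with different arguments.
first=$(cd / && sh -c "$report" sh one two)
second=$(cd /tmp && sh -c "$report" sh three)

echo "$first"
echo "$second"
```

Both lines come from the same program, yet each process reports its own working directory and its own arguments.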
Note
For demonstration, we have to be a bit careful since some programs (like firefox) don’t actually start twice. They realize that they were already started and open a new window in the original process instead of creating a second one. Keep this in mind if you start firefox from the terminal. The same is true for LibreOffice.
You’ve already seen how a process can be started on the command line. How about stopping? Perhaps you didn’t notice but in all the examples so far you terminated the process somehow or it finished on its own. An example of the former is less: You close the process when you press q. For the latter, let’s have a look at ls: After having written something on the terminal it has fulfilled its purpose, so it terminates itself. When a process, started from the terminal, finishes or gets closed you end up at the command line again. In the graphical examples, you didn’t see the command line (and couldn’t type another command) until you’ve closed the window with the x button (terminates graphical programs).
Sometimes you want to abort a process prematurely, i.e. before it would close itself. Sometimes it’s not possible to do so from the program instance (like less). For example, you might have typed (don’t do this!) ls -lR / by mistake and don’t want to wait until it has finished (which might take quite some time).
You already know one method of aborting commands: Ctrl-c. If you press this key combination, the currently running process is asked to terminate immediately. Sometimes we cannot do this, so let’s talk alternatives.
Like in the example before, start leafpad from the terminal.
leafpad
Save and close all other running leafpads before continuing. If you fail to do so, they will be closed for you without warning. Open a second terminal and then run the next commands.
ps u
You should indeed see the leafpad process. Only a single leafpad process (if you considered the warning above). Now, let’s terminate it.
killall leafpad
This command closes all running leafpads. Like Ctrl-c, it asks a process to terminate, but it addresses every process named leafpad instead of only the one running in the current terminal.
Let’s assume you have more than one leafpad running. Wait, let’s actually do this. In two terminals, do
leafpad /tmp/foo.t
leafpad /tmp/bar.t
Now you’re set to go with two running leafpads. Say you want to close the first but not the second one. You cannot use killall, as this would close both processes. Somehow we have to select only a single one of them. Let’s again look at our process list.
ps u
On the left side, there’s a column PID (Process ID). A PID is a unique number which is automatically assigned to any new process. With the PID, we can identify any process. Now, let’s terminate the first leafpad. In the command below, replace <PID> with the actual PID of the first leafpad. Run it in a new terminal and watch your first leafpad die.
kill <PID>
In the example, the process of the first leafpad was killed. Its window disappears and we’re informed that the process was Terminated. Just as if the process had been closed normally, the prompt returns and we can use the command line again. Unlike killall, which closed all leafpads, we now only terminated one of them.
Sometimes a program ignores our kill attempt. So we have to make a second, more serious one. Let’s try this on the second leafpad, now the only running one. Again, replace the <PID> in the command below with the PID of the second leafpad process. Then run the command and again watch a leafpad being terminated.
kill -9 <PID>
If you add the -9 to the kill command, the program cannot refuse to shut down. However, you should only use this variant if it is absolutely necessary. A forced program shutdown may lead to file corruption or inconsistencies: Imagine you terminate a process while it writes a file - the first half is already written but not the second one.
Again the leafpad window gets closed, but in contrast to the first kill, now the terminal states Killed. This again reflects the slight difference between the two kill versions.
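The difference between the two kill variants can be reproduced without leafpad. The sketch below starts a deliberately stubborn stand-in program (invented for this illustration) which ignores the normal termination request. The trailing & starts it in the background and is explained later in this chapter; $! is the PID of that background process, and kill -0 sends nothing at all but merely checks whether the process still exists.

```shell
# Hypothetical stubborn program: it ignores the normal termination request.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                                # give it a moment to set itself up

kill "$pid"                            # polite request, ignored by this program
sleep 1
survived=no
if kill -0 "$pid" 2>/dev/null; then    # does the process still exist?
    survived=yes
    echo "process $pid survived the plain kill"
fi

kill -9 "$pid"                         # forced shutdown, cannot be refused
wait "$pid"                            # collect the exit status of the process
status=$?
echo "exit status: $status"            # 137 = 128 + 9, i.e. ended by signal 9
```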
You’ve only seen your own processes until now. Let’s change that. Do the following:
ps -ef
Wow, that’s a lot more processes than before. In fact, these are all the processes currently running. Most of them are unrelated to you and you certainly didn’t start them. That’s because these are Daemons. A Daemon is nothing extraordinary. It’s just a system process - started and maintained by the operating system. Daemons offer important services, like sound, networking or the graphical interface (see below).
But when you execute a command in the terminal you cannot use the terminal as long as the process is alive. You’ve noticed this already when starting the leafpad. If you only have to run a couple of windowed programs you probably don’t want to open a separate terminal for each one. Also, you didn’t start the many daemons in a terminal, so where’s their terminal? How can they live without one? You’ll find out in a second!
You can start a leafpad again. But this time we want the terminal to remain usable. Try the following:
leafpad &
Bam! Immediately after the leafpad started, the terminal shows you the command line again. What happened? The & at the end of a command makes the process run in the background. It is started normally but then separated from the terminal so that you return to the command line. You can confirm with ps that the process is actually running:
ps u
Congratulations, you’ve just created a daemon. This also gives a new purpose for kill. Imagine the program you start doesn’t have graphical output, i.e. no window. How would you close the program then? Having its PID (easily obtained through ps) you can kill it (the polite variant without the -9 is good enough). You know how this works, so let’s close the leafpad as an exercise:
kill <PID>
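The same start-in-background-then-kill routine can be sketched for a program without any window. Here sleep 300 (a command that simply waits for 300 seconds) serves as a stand-in for such a program. Instead of looking up the PID with ps, the example uses $!, which holds the PID of the most recently started background process. The variable names pid and status are chosen for this illustration.

```shell
sleep 300 &                    # stand-in for a long-running program without a window
pid=$!                         # PID of the background process just started
echo "started background process $pid"

kill "$pid"                    # ask it to terminate (no -9 needed)
wait "$pid"                    # wait until it is really gone and collect its exit status
status=$?
echo "exit status: $status"    # 143 = 128 + 15, i.e. ended by signal 15 (TERM)
```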
Before we finish this chapter, let’s briefly discuss some important system processes. You can get the full process list with ps:
ps -ef
There’s lots of processes enclosed in brackets ([, ]). These are kernel processes, meaning that you cannot start or manipulate them yourself. All others are processes started by you or the operating system. We cannot go through the whole list, so here’s a description of some commonly available services.
Process | Description |
---|---|
init | The process which starts up your system. You’ll learn more details about it very soon. |
syslog | Manages log files, used for general system monitoring. Whenever a program feels like storing possibly important notifications it uses this daemon. |
cron | Periodically executes commands. If, for example, you want to run a daily backup this daemon takes care of running it at an appropriate time. |
cupsd | The printing daemon. Is very often installed but not mandatory. |
exim | An internal (local) mail delivery agent. |
sshd | The remote login service. You’ll discover the benefits of this tool later in the tutorial. |
dhclient | The DHCP client which sets up your network. |
getty | The most basic terminal login. Offers very impressive terminal magic. |
udev | The service which manages devices (i.e. hardware). When you plug in your USB stick, that’s when udev becomes active. |
X | The X Window System deals with all graphical programs (windows and stuff). The X process is the underlying program handling these things for you, but there are many others involved. |
bash | What we usually call the terminal is actually also a program. Typically, we use bash, but others exist (sh, ksh). |
You’ve seen the terms program, command and process. You know how they are related and how these concepts are different. You understand their importance for computers. You’ve seen how to keep track of processes in the terminal and how to start and stop them. You know about the PID, how you find it and how to use it. You’ve been exposed to some daemons and know how to master them.
Command | Example | Description |
---|---|---|
ls | ls / | List directory contents. |
pwd | pwd | Show the working directory. |
cd | cd / | Change directory. |
exit | exit | Close the terminal. |
cat | cat /proc/version | Display file contents on the terminal. |
less | less /proc/cpuinfo | A simple and small text viewer. |
top | top | Live process and resource viewer. |
ps | ps -ef | Show a long list of process information. |
killall | killall firefox | Terminate a process by name. |
kill | kill <PID> | Terminate a process by PID. |
[CPU] | Central Processing Unit |
[1] | This is almost true. The CPU has some small space for storing information, the registers. However, there are only a few dozen registers at most, not nearly enough to store anything useful. Think of them more as a piece of paper on which you write intermediate results of a complex calculation. |
[RAM] | Random Access Memory. This means that you can freely access any memory location. |
[2] | Remember the part about the hard disc being much slower than main memory? If you run out of main memory, the hard disc will actually be used as memory extension at the cost that the system becomes slower. |
[3] | The resources are managed by the kernel. It’s more or less anything that you have once in your system and needs to be distributed among the running processes. Like writing to the memory or hard disk - two processes shouldn’t write to the same memory location or file at the same time. In this case, the kernel decides what memory location (and how much) is given to which program. |