
$ ls -l /dev/loop0 /dev/null
brw-rw---- 1 root disk 7, 0 /dev/loop0
crw-rw-rw- 1 root root 1, 3 /dev/null
$ stat -c "%F, Mm: %t,%T" /dev/{loop0,null}
block special file, Mm: 7,0
character special file, Mm: 1,3
Fig. 5. The type, major and minor number of a file
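The same information is available programmatically through stat(2). The following is a minimal Python sketch, not taken from the cited sources; the helper name device_info is an illustrative assumption.

```python
import os
import stat


def device_info(path):
    """Return (kind, major, minor) for a device node, similar to the stat call in Fig. 5."""
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        kind = "block special file"
    elif stat.S_ISCHR(st.st_mode):
        kind = "character special file"
    else:
        raise ValueError(f"{path} is not a device node")
    # st_rdev holds the device ID of the node itself,
    # not of the filesystem that contains the node.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    print(device_info(os.devnull))
```

Note that the stat(1) format specifiers %t and %T print the numbers in hexadecimal, whereas this sketch returns plain integers.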
While various virtual filesystems can be mounted to
interface with the kernel, the proc(5) filesystem is almost
always present. It allows configuration and introspection of
processes and kernel subsystems.
Each process has a subdirectory named after its process ID. It is
populated with information on the environment variables,
command line arguments, links to the files behind the
acquired file descriptors, current syscall, process threads,
cgroup information, the mount environment and I/O statistics.
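As a hedged illustration of the per-process entries, the sketch below reads the command line and the file descriptor links of a process. The function names split_nul and process_info are assumptions made for this example; the file names cmdline and fd are those documented in proc(5).

```python
import os


def split_nul(raw):
    """Split a NUL-separated procfs record (cmdline, environ) into strings."""
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def process_info(pid="self"):
    """Collect the argument vector and open file descriptors of a process."""
    base = f"/proc/{pid}"
    with open(f"{base}/cmdline", "rb") as f:
        argv = split_nul(f.read())
    fds = {}
    for name in os.listdir(f"{base}/fd"):
        try:
            # Each entry is a symbolic link to the file behind the descriptor.
            fds[int(name)] = os.readlink(f"{base}/fd/{name}")
        except OSError:
            # The descriptor used for the listing itself may already be closed.
            continue
    return {"argv": argv, "fds": fds}


if __name__ == "__main__":
    print(process_info())
```

Reading the entries of another user's process generally requires matching permissions.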
The sysctl(8) utility essentially accesses the files in
/proc/sys/ to e.g. configure IP packet forwarding or
the memory swapping policy. The kernel maintains tables
which specify the allowed content and a handler function
for value changes [5].
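The mapping between a dotted sysctl name and its file can be sketched as follows. This is a simplified illustration: the helper names are assumptions, and names whose components themselves contain dots (such as some network interface names) are not handled.

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")


def sysctl_path(name, root=PROC_SYS):
    """Map a dotted name such as 'net.ipv4.ip_forward' to its file below root."""
    return Path(root).joinpath(*name.split("."))


def sysctl_read(name, root=PROC_SYS):
    return sysctl_path(name, root).read_text().strip()


def sysctl_write(name, value, root=PROC_SYS):
    # On a real system this needs root privileges, and the kernel-side
    # handler decides whether the value is accepted.
    sysctl_path(name, root).write_text(f"{value}\n")
```

For instance, sysctl_read("vm.swappiness") would read /proc/sys/vm/swappiness, and sysctl_write("net.ipv4.ip_forward", 1) would enable IP packet forwarding when run with sufficient privileges.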
Read-only kernel debugging is possible in gdb through
the /proc/kcore file, and many other files such as
/proc/partitions exist. Internally there are two APIs:
the original one and the newer seq_file interface, which
overcomes the limit of single-page reads [5].
A more systematic approach, though similar to /proc/sys/,
is found in the newer sysfs, which can expose kernel
objects and their attributes more easily by making use of the
internal hierarchy [2]. It is commonly mounted at /sys and
uses directories to represent objects with their parent/child
relations. Symbolic links are used to interconnect
e.g. device classes with the devices of that kind.
The object attributes are contained as files in the directory,
and read/write operations invoke the corresponding functions
defined in the kernel. This superseded the
practice of ioctl syscalls on special devices. Since
kobject_uevent() can be used to emit an event for
a kernel object via Netlink messages to user space, the
advantages of static exposure and dynamic messaging can
be combined.
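The directory layout described above can be walked with ordinary file operations. The following sketch assumes only that attributes are regular files and relations are symbolic links; the function names read_attributes and class_devices are hypothetical.

```python
import os


def read_attributes(obj_dir):
    """Read the readable attribute files of one sysfs object directory."""
    attrs = {}
    for entry in os.scandir(obj_dir):
        if not entry.is_file(follow_symlinks=False):
            continue  # subdirectories are child objects, symlinks are relations
        try:
            with open(entry.path, errors="replace") as f:
                attrs[entry.name] = f.read().strip()
        except OSError:
            continue  # write-only or privileged attributes
    return attrs


def class_devices(class_dir):
    """Map each entry of a class directory to the device directory it links to."""
    return {
        entry.name: os.path.realpath(entry.path)
        for entry in os.scandir(class_dir)
        if entry.is_symlink()
    }
```

On a typical system, class_devices("/sys/class/net") would list the network interfaces together with their device directories, and read_attributes on one of those directories would return attributes such as the hardware address.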
Other special-purpose filesystems for interfacing with kernel
subsystems, which may be found in /sys/kernel/, are
debugfs to modify values via the seq_file API, and configfs
to create kernel objects from user space, cf. the documentation
in filesystems/ of [6].
Control cgroups(7) were mentioned in the section
on system calls and they can also be managed through
the virtual filesystem. Modern Linux distributions mount
the cgroup controllers, which are specified as mount options
for the cgroup (v1) filesystem, to subfolders in
/sys/fs/cgroup/. The cgroup2 filesystem abandons
this flexibility for a unified mount point. New groups for
each controller can be created by making new directories,
and processes are then added by writing their PID to the
cgroup.procs file. An example use case is to implement
access restrictions in the device controller by writing e.g.
a *:* rwm to devices.deny to deny reads, writes and
mknods for all types and all major/minor device IDs. CPU
sets allow pinning a process group to certain CPUs. Memory,
I/O or network restrictions are also possible.
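The file operations involved can be sketched as below for the cgroup (v1) layout. The helper names and the controller directory argument are assumptions for illustration; on a real cgroup mount the calls need root privileges and the kernel creates the control files itself.

```python
import os


def create_group(controller_dir, name):
    """Create a child group; on a cgroup mount the kernel populates it."""
    path = os.path.join(controller_dir, name)
    os.makedirs(path, exist_ok=True)
    return path


def add_process(group_dir, pid):
    # One PID per write moves that process into the group.
    with open(os.path.join(group_dir, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")


def deny_all_devices(group_dir):
    # Device controller entry: type 'a' (all), any major:minor,
    # and the access modes read, write and mknod.
    with open(os.path.join(group_dir, "devices.deny"), "w") as f:
        f.write("a *:* rwm\n")
```

A possible use, assuming the device controller is mounted at /sys/fs/cgroup/devices, would be group = create_group("/sys/fs/cgroup/devices", "sandbox") followed by deny_all_devices(group) and add_process(group, os.getpid()).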
All these user space requests need to go through the
virtual filesystem tree into the specific filesystem, which incurs
a significant latency compared to direct system calls [7].
Sometimes it is possible to use either the sysfs entry or
an (ioctl) system call depending on the needs.
6 SIGNALS
Process signals as defined in POSIX are tied to actions
(where ignoring is also an action). A process can register
a handler function to replace the default action, e.g. to
handle the SIGINT sent by Ctrl-C on the terminal. The handler is
invoked asynchronously, independently of the normal
program execution [1]. A signal mask can block signals
during handler execution. Magic SysRq keyboard commands
can issue termination and kill signals to all processes
directly from the kernel side [2].
Linux supports POSIX standard and realtime signals as
listed in SIGNAL(7). SIGTERM asks the process to terminate,
while SIGKILL cannot be handled at all and directly
terminates the process. They have the IDs 15 for TERM and 9 for
KILL, where 15 is the default value for the kill(1) utility.
SIGSTOP likewise cannot be handled and prevents the process
from being scheduled.
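The difference between signals that can and cannot be handled may be illustrated with a short sketch. It uses Python's signal module (Python 3.8 or newer, called from the main thread); the function names deliver_to_self and can_handle are assumptions made for this example.

```python
import signal


def _restore(signum, previous):
    # A handler that was not installed from Python is reported as None.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)


def deliver_to_self(signum=signal.SIGUSR1):
    """Install a handler, send the signal to the own process, report what arrived."""
    received = []

    def handler(num, frame):
        # Invoked asynchronously to the normal program flow.
        received.append(signal.Signals(num).name)

    previous = signal.signal(signum, handler)  # replaces the default action
    try:
        signal.raise_signal(signum)
    finally:
        _restore(signum, previous)
    return received


def can_handle(signum):
    """Try to register a handler; the kernel refuses this for SIGKILL and SIGSTOP."""
    try:
        previous = signal.signal(signum, lambda num, frame: None)
    except OSError:
        return False
    _restore(signum, previous)
    return True
```

Here can_handle(signal.SIGTERM) succeeds, whereas the attempt for signal.SIGKILL or signal.SIGSTOP is rejected by the kernel.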
Processor exceptions which trigger a kernel interrupt
are delivered to the process as e.g. SIGFPE or SIGSEGV signals
for invalid arithmetic operations or memory accesses.
Signals sent with sigqueue(3) can carry a data word, which might
be useful for simple IPC with the non-specific SIGUSR1.
7 SHARED MEMORY, SEMAPHORES AND QUEUES
Both the System V SVIPC(7) and POSIX API variants
for shared memory, semaphores and message queues are
offered. Shared memory regions can be created, or the same
file can be mapped into memory by several processes. This concept
is essential if copying large amounts of data is to be avoided,
but can also serve as a communication method combined with mutual
exclusion through blocking semaphores as in SEM_OVERVIEW(7). A
simple list of active resources is available through the lsipc
or ipcs utilities. The resources persist in the kernel if the
processes do not remove them. Access is gained through unique
identifiers and the creating process may set restrictions.
The POSIX shared memory as described in
SHM_OVERVIEW(7) can either be named and referenced in
/dev/shm, or be anonymous. Message queues as in POSIX
MQ_OVERVIEW(7) allow multiple reading and writing
processes.
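A named POSIX shared memory segment can be sketched with Python's multiprocessing.shared_memory module (Python 3.8 or newer), which on Linux creates the segment under /dev/shm. The function name share_bytes is an assumption, and the second attachment happens within the same process only for brevity; normally another process would attach by name.

```python
from multiprocessing import shared_memory


def share_bytes(payload):
    """Write payload into a named segment and read it back through its name."""
    writer = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        writer.buf[:len(payload)] = payload
        # Another process would attach with nothing but the segment name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # without this the segment would persist


if __name__ == "__main__":
    print(share_bytes(b"hello"))
```

The explicit unlink reflects the persistence noted above: the segment outlives its creator unless it is removed. This sketch does not synchronize access; concurrent writers would additionally need a semaphore or a similar mechanism.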
8 INTER-PROCESS COMMUNICATION SOCKETS
Messages through sockets offer flexibility and portability
compared to direct syscalls or shared memory. Pipes, IP and
Unix domain sockets are common in user space IPC and can
also be used in kernel space, but Netlink sockets are unique
to the Linux kernel and are therefore not often seen in
user space.
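As a minimal illustration of message-based IPC over Unix domain sockets, the sketch below exchanges a request and a reply over a connected socket pair. Both ends live in one process here for brevity; the function name echo_over_socketpair is an assumption.

```python
import socket


def echo_over_socketpair(message):
    """Send a message across a connected pair of Unix domain datagram sockets."""
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with parent, child:
        parent.send(message)          # datagram sockets keep message boundaries
        request = child.recv(4096)
        child.send(request.upper())   # the "server" side answers
        return parent.recv(4096)


if __name__ == "__main__":
    print(echo_over_socketpair(b"ping"))
```

In practice the two ends would typically be separated by a fork(2), or one side would bind a socket to a filesystem path that the other side connects to.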