User space

A modern computer operating system usually segregates virtual memory into kernel space and user space.[a] Primarily, this separation serves to provide memory protection and hardware protection from malicious or errant software behaviour.

Kernel space is strictly reserved for running a privileged operating system kernel, kernel extensions, and most device drivers. In contrast, user space is the memory area where application software and some drivers execute.

Overview

The term userland (or user space) refers to all code that runs outside the operating system's kernel.[1] Userland usually refers to the various programs and libraries that the operating system uses to interact with the kernel: software that performs input/output, manipulates file system objects, application software, etc.

Each user space process normally runs in its own virtual memory space and, unless explicitly allowed, cannot access the memory of other processes. This is the basis for memory protection in today's mainstream operating systems, and a building block for privilege separation. A separate user mode can also be used to build efficient virtual machines (see Popek and Goldberg virtualization requirements). With sufficient privileges, a process can ask the kernel to map part of another process's memory space into its own, as debuggers do. Programs can also request memory regions shared with other processes, although other techniques are available for inter-process communication.
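
The isolation and explicit sharing described above can be illustrated with a minimal sketch. It assumes a POSIX system where os.fork() is available; the function name isolation_demo is hypothetical. An ordinary Python object is private to each process after the fork, while an anonymous shared mapping is visible to both.

```python
import mmap
import os


def isolation_demo():
    """Return (private_value, shared_value) as seen by the parent process."""
    private = [0]                # ordinary memory: the child gets its own copy
    shared = mmap.mmap(-1, 1)    # anonymous mapping, shared across fork on Unix
    shared[0] = 0

    pid = os.fork()
    if pid == 0:
        # Child process: both writes succeed locally, but only the write
        # to the shared mapping is visible to the parent.
        private[0] = 42
        shared[0] = 42
        os._exit(0)

    os.waitpid(pid, 0)           # wait until the child has finished writing
    result = (private[0], shared[0])
    shared.close()
    return result
```

The parent still sees its own value in the private list, because the child modified a separate copy, while the byte in the shared mapping reflects the child's write.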

Various layers within Linux, also showing separation between the userland and kernel space

User mode
    User applications: bash, LibreOffice, GIMP, Blender, 0 A.D., Mozilla Firefox, etc.
    Low-level system components:
        System daemons: systemd, runit, logind, networkd, PulseAudio, ...
        Windowing system: X11, Wayland, SurfaceFlinger (Android)
        Other libraries: GTK+, Qt, EFL, SDL, SFML, FLTK, GNUstep, etc.
        Graphics: Mesa, AMD Catalyst, ...
    C standard library: open(), exec(), sbrk(), socket(), fopen(), calloc(), ... (up to 2000 subroutines)
        glibc aims to be POSIX/SUS-compatible, musl and uClibc target embedded systems, bionic written for Android, etc.
Kernel mode
    Linux kernel: stat, splice, dup, read, open, ioctl, write, mmap, close, exit, etc. (about 380 system calls)
        The Linux kernel System Call Interface (SCI, aims to be POSIX/SUS-compatible)
        Subsystems: process scheduling, IPC, memory management, virtual files, network
        Other components: ALSA, DRI, evdev, LVM, device mapper, Linux Network Scheduler, Netfilter
        Linux Security Modules: SELinux, TOMOYO, AppArmor, Smack
Hardware (CPU, main memory, data storage devices, etc.)
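
The split between library routines such as fopen() and system calls such as open() and write() can be sketched from user space. In the example below, Python's os.open, os.write and os.close are thin wrappers over the corresponding system calls, while the built-in open() adds a buffered library layer on top, loosely analogous to fopen(). The two helper function names are hypothetical.

```python
import os


def write_via_syscall_wrappers(path, data):
    """Write data using thin wrappers around open(2), write(2) and close(2)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # os.write may write fewer bytes than requested; loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)


def write_via_buffered_library(path, data):
    """Write data through the buffered file object layer."""
    with open(path, "wb") as f:
        f.write(data)  # buffered in user space, flushed to the kernel on close
```

Both paths end in the same kernel entry points; the difference is how much work the user-space library does before crossing into kernel mode.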

Implementation

The most common way of implementing a user mode separate from kernel mode involves operating system protection rings.

Another approach taken in experimental operating systems is to have a single address space for all software, and rely on a programming language's semantics to make sure that arbitrary memory cannot be accessed – applications simply cannot acquire any references to the objects that they are not allowed to access.[2][3] This approach has been implemented in JXOS, Unununium as well as Microsoft's Singularity research project.

Notes

  a. ^ Older operating systems, such as DOS and Windows 3.1x, do not use this architecture.

References

  1. ^ "userland, n." The Jargon File. Eric S. Raymond. Retrieved 2016-08-14.
  2. ^ "Unununium System Introduction". Archived from the original on 2001-12-15. Retrieved 2016-08-14.
  3. ^ "uuu/docs/system_introduction/uuu_intro.tex". UUU System Introduction Guide. 2001-06-01. Retrieved 2016-08-14.

Alpine Linux

Alpine Linux is a Linux distribution based on musl and BusyBox, primarily designed for security, simplicity, and resource efficiency. It uses a hardened kernel and compiles all user space binaries as position-independent executables with stack-smashing protection. Because of its small size, it is heavily used in containers, where it provides quick boot-up times. A fork of the distribution, postmarketOS, is designed to run on mobile devices.

Debugfs

debugfs is a special file system available in the Linux kernel since version 2.6.10-rc3. It was written by Greg Kroah-Hartman.

debugfs is a simple-to-use RAM-based file system specially designed for debugging purposes. It exists as a simple way for kernel developers to make information available to user space. Unlike /proc, which is only meant for information about a process, or sysfs, which has strict one-value-per-file rules, debugfs has no rules at all. Developers can put any information they want there.

Direct Rendering Manager

Not to be confused with Digital rights management.

In computing, the Direct Rendering Manager (DRM), a subsystem of the Linux kernel, interfaces with the GPUs of modern video cards. DRM exposes an API that user-space programs can use to send commands and data to the GPU, and to perform operations such as configuring the mode setting of the display. DRM was first developed as the kernel space component of the X Server's Direct Rendering Infrastructure, but since then it has been used by other graphics stack alternatives such as Wayland.

User-space programs can use the DRM API to command the GPU to do hardware-accelerated 3D rendering and video decoding as well as GPGPU computing.

Distributed File System (Microsoft)

Distributed File System (DFS) is a set of client and server services that allow an organization using Microsoft Windows servers to organize many distributed SMB file shares into a distributed file system. DFS has two components to its service: Location transparency (via the namespace component) and Redundancy (via the file replication component). Together, these components improve data availability in the case of failure or heavy load by allowing shares in multiple different locations to be logically grouped under one folder, the "DFS root".

Microsoft's DFS is referred to interchangeably as 'DFS' and 'Dfs' by Microsoft and is unrelated to the DCE Distributed File System, which held the 'DFS' trademark but was discontinued in 2005.

It is also called "MS-DFS" or "MSDFS" in some contexts, e.g. in the Samba user space project.

Evdev

evdev (short for 'event device') is a generic input event interface in the Linux kernel. It generalizes raw input events from device drivers and makes them available through character devices in the /dev/input/ directory.

The user-space library for the Linux kernel component evdev is called libevdev. Libevdev abstracts the evdev ioctls through type-safe interfaces and provides functions to change the appearance of the device. Libevdev shares similarities with the read system call.

It sits below the process that handles input events, in between the kernel and that process.

Linux kernel → libevdev → xf86-input-evdev → X server → X client

For Weston/Wayland, the stack would look like this:

Linux kernel → libevdev → libinput → Weston → Wayland client

Since version 1.16 the xorg-xserver obtained support for libinput:

Linux kernel → libevdev → libinput → xf86-input-libinput → X server → X client

evdev is primarily used by display servers like X.org (via xf86-input-evdev driver and libevdev) and Weston.
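
As a rough illustration of what a user-space reader receives from a character device under /dev/input/, the sketch below decodes one raw event record. It assumes the common 64-bit Linux layout of struct input_event (a timeval of two longs, then a 16-bit type, a 16-bit code and a 32-bit value); other ABIs lay the record out differently. The function name parse_event is hypothetical, and real programs would normally use libevdev rather than decoding records by hand.

```python
import struct

# Assumed layout on 64-bit Linux: struct timeval {long, long},
# __u16 type, __u16 code, __s32 value (native byte order and alignment).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01   # event type for key and button events
KEY_A = 30      # key code for the "A" key


def parse_event(raw):
    """Decode one raw input event record into a dictionary."""
    sec, usec, etype, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {
        "time": sec + usec / 1_000_000,
        "type": etype,
        "code": code,
        "value": value,
    }
```

A reader would typically open a device node in binary mode, read EVENT_SIZE bytes at a time and pass each chunk to parse_event; opening those nodes usually requires elevated privileges.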

Fiber (computer science)

In computer science, a fiber is a particularly lightweight thread of execution.

Like threads, fibers share address space. However, fibers use cooperative multitasking while threads use preemptive multitasking. Threads often depend on the kernel's thread scheduler to preempt a busy thread and resume another thread; fibers yield themselves to run another fiber while executing.
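
Cooperative scheduling of this kind can be sketched with Python generators standing in for fibers. This is an illustration of the idea only, not any particular fiber library; the names fiber and run are hypothetical. Each fiber runs until it chooses to yield, and a simple round-robin loop in user space picks the next one without involving the kernel's scheduler.

```python
from collections import deque


def fiber(name, steps, log):
    """A fiber that records its progress and yields after every step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily hand control back to the scheduler


def run(fibers):
    """Round-robin scheduler: resume each fiber until all have finished."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the fiber's next yield
        except StopIteration:
            continue               # fiber finished, do not reschedule it
        ready.append(current)
```

Because nothing preempts a fiber, one that never yields would starve all the others, which is the main trade-off against preemptive threads.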

Gazelle (web browser)

Gazelle was a research web browser project by Microsoft Research, first announced in early 2009. The central notion of the project was to apply operating system (OS) principles to browser construction. In particular, the browser had a secure kernel, modelled after an OS kernel, and various web sources ran as separate "principals" above that, similar to user space processes in an OS. The goal was to prevent bad code from one web source from affecting the rendering or processing of code from other web sources. Browser plugins were also managed as principals.

Gazelle had a predecessor project, MashupOS, but with Gazelle the emphasis was on a more secure browser.

By the July 2009 announcement of Google Chrome OS, Gazelle was seen as a possible alternative Microsoft architectural approach compared to Google's direction. That is, rather than the OS being reduced in role to that of a browser, the browser would be strengthened using OS principles.

The Gazelle project became dormant, and ServiceOS arose as a replacement project also related to browser architectures. But by 2015, the ServiceOS project was also dormant, after Microsoft decided that its new flagship browser would be Edge.

HTTP/3

HTTP/3 is the upcoming third major version of the Hypertext Transfer Protocol used to exchange binary information on the World Wide Web. HTTP/3 is based on the earlier RFC draft "Hypertext Transfer Protocol (HTTP) over QUIC". QUIC is an experimental transport layer network protocol, initially developed by Google, in which user space congestion control is used over User Datagram Protocol (UDP).

On 28 October 2018 in a mailing list discussion, Mark Nottingham, Chair of the IETF HTTP and QUIC Working Groups, made the official request to rename HTTP-over-QUIC as HTTP/3 to "clearly identify it as another binding of HTTP semantics to the wire protocol ... so people understand its separation from QUIC" and to pass its development from the QUIC Working Group to the HTTP Working Group after finalizing and publishing the draft. In the discussions that followed, which stretched over several days, Nottingham's proposal was accepted by fellow IETF members, who in November 2018 gave their official seal of approval that HTTP-over-QUIC become HTTP/3.

Klibc

In computing, klibc is a minimalistic subset of the standard C library developed by H. Peter Anvin. It was developed mainly to be used during the Linux startup process, and it is part of the early user space, i.e. components used during kernel startup, but which do not run in kernel mode. These components do not have access to the standard library (usually glibc) used by normal userspace programs.

The development of the klibc library was part of the 2002 effort to move some Linux initialization code out of the kernel. According to its documentation, the klibc library is optimized for correctness and small size. Because of its design, klibc is also technically suitable for embedded software in general, and even some full-featured programs such as the MirBSD Korn Shell. klibc is licensed under the full GPL, which (unlike the LGPL) imposes itself on any code linked with it. (This only applies to klibc as a whole due to embedding some Linux kernel derived files; most of the library source code is actually available under a BSD licence from UCB or the Historical Permission Notice and Disclaimer.) This may limit its applicability to commercial embedded software.

During the Linux startup process, klibc is loaded from within a temporary RAM file system, initramfs. It is incorporated by default into initial RAM file systems that are created by the mkinitramfs script in Debian and Ubuntu. Furthermore, it has a set of small Unix utilities that are useful in early user space: cpio, dash, fstype, mkdir, mknod, mount, nfsmount, run-init, etc., all using the klibc library. An alternate strategy is to include everything in one executable, like BusyBox, which determines the requested applet via arguments or a symlink.
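
The single-executable strategy mentioned above can be sketched as a small dispatcher. This is an illustration of the general multi-call idea, not BusyBox's actual code; the applet functions, the APPLETS table and the binary name "multicall" are all hypothetical. The applet is chosen from the name the program was invoked under (as happens through a symlink), or from the first argument when the binary is called by its own name.

```python
import os


def applet_echo(args):
    """Hypothetical applet: join the arguments with spaces."""
    return " ".join(args)


def applet_true(args):
    """Hypothetical applet: do nothing and produce no output."""
    return ""


APPLETS = {"echo": applet_echo, "true": applet_true}
MULTICALL_NAME = "multicall"  # assumed name of the combined executable


def dispatch(argv):
    """Pick an applet from argv[0] (symlink name) or from the first argument."""
    name = os.path.basename(argv[0])
    args = list(argv[1:])
    if name == MULTICALL_NAME:
        if not args:
            raise ValueError("no applet requested")
        name, args = args[0], args[1:]
    if name not in APPLETS:
        raise ValueError(f"unknown applet: {name}")
    return APPLETS[name](args)
```

Sharing one executable lets all applets reuse the same code and library once, which is why the approach suits size-constrained early user space.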

L7-filter

l7-filter is a software package which provides a classifier for Linux's Netfilter subsystem that can categorize Internet Protocol packets based on their application layer data. The major goal of this tool is to make possible the identification of peer-to-peer programs, which use unpredictable port numbers. There are two versions of this software. The first is implemented as a kernel module for Linux 2.4 and 2.6. The second, experimental version was released in December 2006; it runs as a user-space program and relies on netfilter's user-space libraries for the classification process.

Both versions of l7-filter use regular expressions (though the user-space and kernel modules use different regular expression libraries) to identify the network protocol. This technique, used in conjunction with Linux's QoS system, allows application-specific yet port-independent traffic shaping.
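
The idea of port-independent classification by regular expression can be sketched as follows. The patterns below are simplified illustrations written for this example, not l7-filter's own pattern definitions, and the names PATTERNS and classify are hypothetical.

```python
import re

# Illustrative patterns only: each matches the start of a connection's
# application layer data, regardless of which port carried it.
PATTERNS = [
    ("http", re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]")),
    ("ssh", re.compile(rb"^SSH-[12]\.[0-9]+")),
]


def classify(payload):
    """Return the name of the first protocol whose pattern matches."""
    for name, pattern in PATTERNS:
        if pattern.search(payload):
            return name
    return "unknown"
```

A traffic shaper could then apply a policy keyed on the returned protocol name instead of on the port number.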

All versions of l7-filter have been released under the GNU General Public License.

Linux kernel interfaces

The Linux kernel provides several interfaces to user-space applications that are used for different purposes and that have different properties by design. There are two types of application programming interface (API) in the Linux kernel that are not to be confused: the "kernel–user space" API and the "kernel internal" API.

Linux startup process

The Linux startup process is the multi-stage initialization process performed during booting a Linux installation. It is in many ways similar to the BSD and other Unix-style boot processes, from which it derives.

Booting a Linux installation involves multiple stages and software components, including firmware initialization, execution of a boot loader, loading and startup of a Linux kernel image, and execution of various startup scripts and daemons. For each of these stages and components there are different variations and approaches; for example, GRUB, LILO, SYSLINUX or Loadlin can be used as boot loaders, while the startup scripts can be either traditional init-style, or the system configuration can be performed through modern alternatives such as systemd or Upstart.

Mach (kernel)

Mach is a kernel developed at Carnegie Mellon University to support operating system research, primarily distributed and parallel computing. Mach is often mentioned as one of the earliest examples of a microkernel. However, not all versions of Mach are microkernels. Mach's derivatives are the basis of the modern operating system kernels in GNU Hurd and Apple's operating systems macOS, iOS, tvOS, and watchOS.

The project at Carnegie Mellon ran from 1985 to 1994, ending with Mach 3.0, which is a true microkernel. Mach was developed as a replacement for the kernel in the BSD version of Unix, so no new operating system would have to be designed around it. Mach and its derivatives exist within a number of commercial operating systems. These include all those using the XNU operating system kernel, which incorporates an earlier non-microkernel Mach as a major component. The Mach virtual memory management system was also adopted in 4.4BSD by the BSD developers at CSRG, and appears in modern BSD-derived Unix systems, such as FreeBSD.

Mach is the logical successor to Carnegie Mellon's Accent kernel. The lead developer on the Mach project, Richard Rashid, has been working at Microsoft since 1991 in various top-level positions revolving around the Microsoft Research division. Another of the original Mach developers, Avie Tevanian, was formerly head of software at NeXT, then Chief Software Technology Officer at Apple Inc. until March 2006.

Microkernel

In computer science, a microkernel (often abbreviated as μ-kernel) is the near-minimum amount of software that can provide the mechanisms needed to implement an operating system (OS). These mechanisms include low-level address space management, thread management, and inter-process communication (IPC).

If the hardware provides multiple rings or CPU modes, the microkernel may be the only software executing at the most privileged level, which is generally referred to as supervisor or kernel mode. Traditional operating system functions, such as device drivers, protocol stacks and file systems, are typically removed from the microkernel itself and are instead run in user space.

In terms of the source code size, microkernels are often smaller than monolithic kernels. The MINIX 3 microkernel, for example, has approximately 12,000 lines of code.

Plan 9 from User Space

Plan 9 from User Space (also plan9port or p9p) is a port of many Plan 9 from Bell Labs libraries and applications to Unix-like operating systems. Currently it has been tested on a variety of operating systems including: Linux, macOS, FreeBSD, NetBSD, OpenBSD, Solaris and SunOS. The project's name is a reference to the 1950s Ed Wood film Plan 9 from Outer Space.

A number of key applications have been ported, as have programs used by the system itself, along with the requisite libraries from Plan 9. All of these have been made to work on top of a Unix-like environment instead of their native Plan 9. Some of the most significant ported components are:

rc – The Plan 9 shell.

sam – A text editor.

acme – A combination text editor and graphical shell especially useful to programmers.

mk – A tool for building software, analogous to the traditional Unix make utility.

plumber – An interprocess messaging facility.

Venti – A network storage system that permanently stores data blocks.

Rump kernel

The NetBSD rump kernel is the first implementation of the "anykernel" concept, in which drivers can either be compiled into and run in the monolithic kernel, or run in user space on top of a lightweight rump kernel.

The NetBSD drivers can be used on top of the rump kernel on a wide range of POSIX operating systems, such as the Hurd, Linux, NetBSD, DragonFlyBSD, Solaris and even Cygwin, along with the file system utilities built with the rump libraries. The rump kernels can also run without POSIX directly on top of the Xen hypervisor, an L4 microkernel using the Genode OS Framework or even on "OS-less" bare metal.

Splice (system call)

splice() is a Linux-specific system call that moves data between a file descriptor and a pipe without a round trip to user space. The related system call vmsplice() moves or copies data between a pipe and user space. Ideally, splice and vmsplice work by remapping pages and do not actually copy any data, which may improve I/O performance. As linear addresses do not necessarily correspond to contiguous physical addresses, this may not be possible in all cases and on all hardware combinations.
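
A minimal sketch of moving file data into a pipe with splice() is shown below. It assumes Linux and Python 3.10 or later, where os.splice is available; on other platforms the sketch falls back to an ordinary read and write, which does copy the data through user space. The function name file_to_pipe is hypothetical, and the payload is assumed to be small enough to fit in the pipe buffer.

```python
import os
import tempfile


def file_to_pipe(data):
    """Move data from a temporary file into a pipe, then read it back out."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        read_end, write_end = os.pipe()
        try:
            if hasattr(os, "splice"):
                # The kernel moves the bytes from the file to the pipe
                # without handing them to this process in between.
                moved = os.splice(f.fileno(), write_end, len(data), offset_src=0)
            else:
                # Fallback (an assumption for portability): a plain copy.
                f.seek(0)
                moved = os.write(write_end, f.read(len(data)))
            os.close(write_end)
            write_end = None
            return os.read(read_end, moved) if moved else b""
        finally:
            if write_end is not None:
                os.close(write_end)
            os.close(read_end)
```

The final os.read is only there to check the result; in a real pipeline the pipe's read end would typically be spliced onward to a socket or another file.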

Supervisor Mode Access Prevention

Supervisor Mode Access Prevention (SMAP) is a feature of some CPU implementations such as the Intel Broadwell microarchitecture that allows supervisor mode programs to optionally set user-space memory mappings so that access to those mappings from supervisor mode will cause a trap. This makes it harder for malicious programs to "trick" the kernel into using instructions or data from a user-space program.

Sysfs

sysfs is a pseudo file system provided by the Linux kernel that exports information about various kernel subsystems, hardware devices, and associated device drivers from the kernel's device model to user space through virtual files. In addition to providing information about various devices and kernel subsystems, exported virtual files are also used for their configuration.

sysfs provides functionality similar to the sysctl mechanism found in BSD operating systems, with the difference that sysfs is implemented as a virtual file system instead of being a purpose-built kernel mechanism, and that, in Linux, sysctl configuration parameters are made available at /proc/sys/ as part of procfs, not sysfs which is mounted at /sys/.
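
Because each sysfs attribute is exposed as a virtual file, user-space code can read it with ordinary file operations. The sketch below assumes the usual /sys mount point; the helper name read_attr and its root parameter are hypothetical, with root included so the function can be pointed at a different directory tree for testing.

```python
from pathlib import Path


def read_attr(relative_path, root="/sys"):
    """Read one attribute file under root and return its value as text."""
    return (Path(root) / relative_path).read_text().strip()
```

On a typical Linux system, read_attr("class/net/lo/mtu") would return the loopback interface's MTU as a string, and read_attr("kernel/hostname", root="/proc/sys") would read a sysctl parameter through procfs instead.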

This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.