Security & Credentials · Section 2

prctl(2)

Process-control multiplexer: configures more than 60 per-process and per-thread settings, including the thread name, capabilities, no-new-privs, seccomp installation, and the dumpable flag.

Signature

#include <sys/prctl.h>

int prctl(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5);
option
Which operation to perform. See <sys/prctl.h> for the PR_* constants (>60 options).
arg2
Operation-specific. Often a value to set or a pointer to receive a value.
arg3
Operation-specific. Often a buffer length or secondary parameter.
arg4
Operation-specific. Most options do not use arg4 and arg5; pass 0 for both in that case, since several options reject nonzero unused arguments with EINVAL.
arg5
Operation-specific. Same as arg4.

Description

prctl() is a multiplexer syscall that exposes per-process and per-thread control operations under a single syscall number. The first argument, option, selects the operation; the remaining arguments are operation-specific. Important options include: PR_SET_NAME (set the calling thread's name, at most 15 characters plus a terminating NUL, shown in /proc/<pid>/comm and ps), PR_SET_DUMPABLE (control whether the process produces core dumps and whether unprivileged processes may attach to it with ptrace), PR_SET_NO_NEW_PRIVS (irrevocably disable privilege elevation via execve; required before a process without CAP_SYS_ADMIN can install a seccomp filter), PR_SET_SECCOMP (install a seccomp filter), PR_CAPBSET_READ / PR_CAPBSET_DROP (query or shrink the capability bounding set), and PR_SET_PDEATHSIG (send a signal to this process when its parent dies). Many architecture-specific options exist (PR_PAC_*, PR_SVE_*, PR_SET_TAGGED_ADDR_CTRL on aarch64; PR_MPX_* on x86, now removed). Each option has its own return-value semantics; consult the prctl(2) manual page for the specific contract.

Architecture mapping

Architecture      Number  ABI     Entry point
x86 (i386)        172     i386    sys_prctl
x64 (x86_64)      157     common  sys_prctl
ARM64 (aarch64)   167             sys_prctl

Kernel history

Introduced in Linux 2.1.57.

  1. 2.1.57

    prctl() was added in 2.1.57 to multiplex many small per-process operations behind a single syscall number, replacing the practice of adding new syscalls for each per-process tweak.

  2. 3.5

    PR_SET_NO_NEW_PRIVS was introduced as the cornerstone of unprivileged sandboxing. With no_new_privs set, execve() cannot elevate privileges, which makes it safe for an unprivileged process to install a seccomp filter on itself (and, by inheritance, on its children) without root.

  3. 3.17

    PR_SET_MM was extended with options like PR_SET_MM_ARG_START so userspace can rewrite a process's argv/envp pointers, used by setproctitle() implementations and by some live-patching frameworks.

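  4. 5.4

    PR_SET_TAGGED_ADDR_CTRL was added on aarch64 to opt the process into the tagged address ABI, under which the kernel accepts pointers that carry metadata in their top byte. The hardware Top Byte Ignore feature already lets userspace stash 8 bits in the high byte of pointers (used by HWASan, MTE); this option extends that to pointers passed in syscalls.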

seccomp & containers

Docker default profile: Allowed

Podman default profile: Allowed

prctl() is allowed by default in Docker / Podman. It's also the syscall a seccomp sandbox uses to set no_new_privs and, unless it calls seccomp(2) directly, to install its filter (PR_SET_NO_NEW_PRIVS + PR_SET_SECCOMP), so blocking it outright is self-defeating. The hardening play is argument filtering: allow only the PR_* options the workload genuinely needs (typically PR_SET_NAME and the seccomp / no-new-privs duo if it sandboxes itself). PR_SET_MM in particular is worth blocking outside of debug builds, since it's a vector for argv-spoofing.

libseccomp

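#include <seccomp.h>     // libseccomp; link with -lseccomp
#include <sys/prctl.h>   // PR_* option constants

// ctx is assumed to be a scmp_filter_ctx from seccomp_init() whose default
// action denies prctl(). Allow only the options the workload actually needs.
seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(prctl),
    1, SCMP_A0(SCMP_CMP_EQ, PR_SET_NAME));
seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(prctl),
    1, SCMP_A0(SCMP_CMP_EQ, PR_SET_NO_NEW_PRIVS));
seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(prctl),
    1, SCMP_A0(SCMP_CMP_EQ, PR_SET_SECCOMP));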

strace example

$ strace -e prctl bash -c 'true'
prctl(PR_SET_NAME, "bash")              = 0
prctl(PR_SET_DUMPABLE, SUID_DUMP_USER)  = 0
prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)  = 0

strace decodes PR_* option names and unpacks their argument structures (e.g. PR_SET_NAME shows the new name as a string). Use -e prctl when investigating a suspicious binary. The first prctl(PR_SET_NAME, …) call in a process is often a good fingerprint for the actual program identity even when the binary path has been hidden.

Security & observability

prctl() is interesting in two security directions. (1) Capability tightening: a privileged init can use PR_CAPBSET_DROP to permanently remove unneeded capabilities before forking workers, reducing post-exploitation surface. PR_SET_NO_NEW_PRIVS prevents children from regaining privilege via set-UID binaries. PR_SET_DUMPABLE(0) prevents secrets from leaking into core dumps. (2) Anti-forensics: malware uses PR_SET_NAME to rename itself to something innocuous (kthread, systemd-…) in /proc/<pid>/comm, and PR_SET_MM to rewrite /proc/<pid>/cmdline; these are easily compared against /proc/<pid>/exe (the actual binary, which prctl() cannot rewrite). eBPF tracepoint sys_enter_prctl captures the option, making PR_SET_NAME / PR_SET_MM activity highly observable.

Errors

EACCES
Permission denied — operation-specific, often related to mm fields (PR_SET_MM).
EBADF
EBUSY
EFAULT
EINVAL
option is not recognised on this kernel, or arg2..arg5 are invalid for the selected option.
ENXIO
EPERM
The option requires a capability the caller does not hold (e.g. PR_CAPBSET_DROP or PR_SET_SECUREBITS without CAP_SETPCAP).

Flags

PR_SET_PDEATHSIG
1
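Deliver the specified signal to the calling process when its parent terminates. Useful for cleaning up subprocesses whose parent has died, with two caveats: the "parent" is the thread that created this process, so the signal fires when that thread exits, and the setting is cleared in the child of a fork(2).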
PR_GET_DUMPABLE
3
PR_SET_DUMPABLE
4
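0 = no core dump, and unprivileged processes cannot attach with ptrace; 1 = normal (dumpable). The value 2 (core dumps readable by root only) can be reported by PR_GET_DUMPABLE but cannot be set with prctl(). After executing a set-UID binary the flag is reset to the value of /proc/sys/fs/suid_dumpable (0 by default). Daemons that hold secrets in memory should consider explicitly setting 0 to keep those secrets out of core dumps.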
PR_SET_KEEPCAPS
8
PR_SET_NAME
15
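Set the calling thread's name, stored in a 16-byte buffer (visible in /proc/<pid>/comm for the main thread, in /proc/<pid>/task/<tid>/comm for other threads, and in ps). Longer names are silently truncated to 15 chars + NUL.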
PR_GET_NAME
16
PR_SET_SECCOMP
22
Install a seccomp filter. SECCOMP_MODE_FILTER attaches a BPF program; SECCOMP_MODE_STRICT restricts to read/write/exit/sigreturn.
PR_CAPBSET_READ
23
PR_CAPBSET_DROP
24
Drop a capability from the calling thread's bounding set, irrevocably for that thread and any children it later creates. Requires CAP_SETPCAP. Essential when constructing a low-privilege subprocess.
PR_GET_SECUREBITS
27
PR_SET_SECUREBITS
28
PR_SET_NO_NEW_PRIVS
38
Set the no_new_privs bit irrevocably. After this, execve() ignores set-UID, set-GID, file capabilities, and AT_SECURE behaviour — a hard prerequisite for unprivileged seccomp filters.
PR_GET_NO_NEW_PRIVS
39
PR_SET_THP_DISABLE
41
PR_CAP_AMBIENT
47
PR_SVE_SET_VL
50
PR_PAC_RESET_KEYS
54
PR_SET_TAGGED_ADDR_CTRL
55
PR_SET_VMA
0x53564d41

Related syscalls