Memory · Section 2
mprotect(2)
Change the access protection of a memory region.
Signature
#include <sys/mman.h>
int mprotect(void *addr, size_t len, int prot);
- addr
- Page-aligned start address.
- len
- Length in bytes. Rounded up to a page-size multiple.
- prot
- Desired protection: bitwise OR of PROT_READ, PROT_WRITE, PROT_EXEC, or PROT_NONE (alone).
Description
mprotect() changes the access protection of the pages in the range [addr, addr+len) to prot. addr must be page-aligned; len is rounded up to a page-size multiple. The range must lie within mappings created by mmap() (or the process's stack/heap). On success it returns 0; on failure it returns -1 and sets errno.
mprotect() is the runtime mechanism behind three common patterns: the dynamic loader's RELRO step (the loader writes relocations into the GOT, then mprotect()s it read-only), every JIT (write the code, then flip the pages to PROT_READ|PROT_EXEC), and guard pages (a page set to PROT_NONE next to a stack or buffer so that an overrun faults immediately). It is also the canonical step where a compromised process flips a writable shellcode buffer to executable.
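The write-then-seal pattern described above can be sketched in a few lines of C. This is a minimal illustration, not how any particular loader or JIT does it: make_readonly_copy() is a hypothetical helper name, and the sketch assumes Linux with anonymous mmap() available. It rounds the length up to whole pages itself so that the same value can be passed to munmap() later.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy `len` bytes from `src` into fresh anonymous
 * pages, then seal those pages read-only with mprotect().
 * Returns the mapping, or NULL with errno set on failure.
 * *map_len receives the page-rounded length, needed later for munmap(). */
void *make_readonly_copy(const void *src, size_t len, size_t *map_len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0) {
        errno = EINVAL;
        return NULL;
    }
    size_t pg = (size_t)page;
    size_t rounded = (len + pg - 1) / pg * pg; /* whole pages */

    /* Step 1: map the region writable so it can be filled. */
    void *p = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Step 2: write the contents while the pages are still writable. */
    memcpy(p, src, len);

    /* Step 3: drop write access; later stores to the region will fault. */
    if (mprotect(p, rounded, PROT_READ) != 0) {
        int saved = errno;
        munmap(p, rounded);
        errno = saved;
        return NULL;
    }
    *map_len = rounded;
    return p;
}
```

A caller that needs to update the data later has to call mprotect() again with PROT_READ|PROT_WRITE first, which is the same toggle RELRO deliberately never performs.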
Architecture mapping
| Architecture | Number | ABI | Entry point |
|---|---|---|---|
| x86 (i386) | 125 | i386 | sys_mprotect |
| x64 (x86_64) | 10 | common | sys_mprotect |
| ARM64 (aarch64) | 226 | — | sys_mprotect |
Kernel history
Introduced in Linux 1.0.
1.0
mprotect() has been part of Linux since 1.0 with POSIX semantics.
4.9
pkey_mprotect() was introduced (Linux 4.9, Intel MPK on supported hardware) to attach a memory protection key to a region, letting a thread enable or disable access to a group of pages with a single register write rather than one mprotect() call per change. It is the basis for libmpk; on arm64 the analogous hardware feature is the Permission Overlay Extension (POE).
5.13
mprotect()'s behaviour at the boundary of stack VMAs (PROT_GROWSDOWN) was tightened to prevent ambiguous extension that allowed older kernels to be tricked into expanding the stack into adjacent allocations.
seccomp & containers
Docker default profile
Allowed
Podman default profile
Allowed
mprotect() is allowed by all default profiles and cannot practically be blocked outright, because every dynamically linked program calls it via the dynamic loader. The high-value hardening is argument filtering on PROT_EXEC: denying mprotect(..., PROT_EXEC) at the seccomp layer enforces W^X across the process and stops the canonical shellcode-loading step (mmap RW → write → mprotect RX). Combine it with the matching PROT_EXEC filter on mmap() for full coverage. Workloads that legitimately JIT (Node, Java HotSpot, V8) need an exemption; pure data handlers (a database, an Nginx without modules) do not.
libseccomp
// Block mprotect(..., PROT_EXEC) to enforce W^X at the seccomp layer.
// MASKED_EQ matches when (arg2 & PROT_EXEC) == PROT_EXEC, i.e. whenever the exec bit is set.
seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(mprotect),
                 1, SCMP_A2(SCMP_CMP_MASKED_EQ, PROT_EXEC, PROT_EXEC));
strace example
$ strace -e mprotect /bin/true 2>&1 | head -5
mprotect(0x7f8c2a1d0000, 16384, PROT_READ) = 0
mprotect(0x7f8c2a212000, 8192, PROT_READ) = 0
mprotect(0x55d4e8e4a000, 4096, PROT_READ) = 0
strace decodes prot symbolically (for example PROT_READ|PROT_WRITE|PROT_EXEC). A startup sequence on a normal binary shows ~10–30 mprotect() calls from ld.so for RELRO and TLS; anything beyond is application activity. To isolate JIT activity, attach to the running process once startup is over (strace -p <pid> -e trace=mprotect) so that the loader's calls are excluded.
Security & observability
mprotect() is the post-exploitation primitive that turns a writable buffer into an executable one: almost every shellcode loader on Linux ends with mprotect(PROT_READ|PROT_EXEC). The eBPF tracepoint syscalls:sys_enter_mprotect, filtered on prot containing PROT_EXEC, is a high-signal feed; pair it with a fingerprint of the process binary to allowlist known JITs. RELRO (the dynamic loader turning the GOT read-only at startup) generates mprotect() calls during process init; these are benign and predictable. /proc/<pid>/maps shows current permissions, so comparing snapshots taken before and after suspicious activity reveals which regions changed.
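The before/after comparison of the maps file can be illustrated from inside a process. The sketch below is an assumption-laden example, not a monitoring tool: region_perms() and perms_around_mprotect() are hypothetical names, it reads /proc/self/maps rather than another process's file, and it assumes each line fits in the 512-byte buffer (very long path names would be split).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: find the mapping that contains `addr` in
 * /proc/self/maps and copy its permission field (e.g. "r-xp") into `perms`.
 * Returns 0 on success, -1 if the address is unmapped or the file
 * cannot be read. */
int region_perms(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int rc = -1;
    while (fgets(line, sizeof line, f)) {
        uintmax_t start, end;
        char p[5];
        /* Each line starts with "start-end perms ...", addresses in hex. */
        if (sscanf(line, "%jx-%jx %4s", &start, &end, p) == 3 &&
            target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Map one read-write page, record its permissions, change them to `prot`
 * with mprotect(), and record them again. Returns 0 on success. */
int perms_around_mprotect(int prot, char before[5], char after[5])
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int rc = -1;
    if (region_perms(p, before) == 0 &&
        mprotect(p, (size_t)page, prot) == 0 &&
        region_perms(p, after) == 0)
        rc = 0;

    munmap(p, (size_t)page);
    return rc;
}
```

Calling perms_around_mprotect(PROT_READ, before, after) should leave "rw-p" in before and "r--p" in after. An external monitor would apply the same parsing to /proc/<pid>/maps of the target process.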
Errors
- EACCES
- The requested protection is incompatible with the underlying mapping (e.g. PROT_WRITE on a mapping of a file opened O_RDONLY with MAP_SHARED).
- EINVAL
- addr is not page-aligned, prot contains invalid bits, or both PROT_GROWSDOWN and PROT_GROWSUP are set.
- ENOMEM
- Part of the range is not mapped in the process's address space, internal kernel structures could not be allocated, or the call would split a VMA and push the mapping count past vm.max_map_count.
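Two of these failures are easy to provoke deliberately, which is useful when testing error handling. The following is a small sketch under the assumption of a Linux kernel; the function names are hypothetical, and each function returns the errno value that mprotect() produced (0 if the call unexpectedly succeeded, -1 if the setup itself failed).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `pages` anonymous read-write pages; NULL on failure.
 * *page_size receives the system page size. */
static char *map_pages(size_t pages, size_t *page_size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;
    *page_size = (size_t)page;
    void *p = mmap(NULL, pages * (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Pass an address one byte past a page boundary: expected errno is EINVAL. */
int errno_for_unaligned_addr(void)
{
    size_t pg;
    char *p = map_pages(1, &pg);
    if (!p)
        return -1;
    int err = mprotect(p + 1, pg - 1, PROT_READ) == 0 ? 0 : errno;
    munmap(p, pg);
    return err;
}

/* Unmap the second of two pages, then ask mprotect() to cover both:
 * the range now contains a hole, and the expected errno is ENOMEM. */
int errno_for_unmapped_range(void)
{
    size_t pg;
    char *p = map_pages(2, &pg);
    if (!p)
        return -1;
    if (munmap(p + pg, pg) != 0) {
        munmap(p, 2 * pg);
        return -1;
    }
    int err = mprotect(p, 2 * pg, PROT_READ) == 0 ? 0 : errno;
    munmap(p, pg);
    return err;
}
```

The second case is the one most often misread in logs: ENOMEM from mprotect() usually points at an unmapped range or the mapping-count limit, not at the system running out of RAM.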
Flags
- PROT_READ
- 0x1
- Pages may be read.
- PROT_WRITE
- 0x2
- Pages may be written.
- PROT_EXEC
- 0x4
- Pages may be executed. The single most security-relevant flag — see W^X.
- PROT_NONE
- 0x0
- No access. Generates SIGSEGV on any reference. Useful for guard pages.
- PROT_GROWSDOWN
- 0x01000000
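- Apply the protection change down to the start of a mapping that grows downward (a stack segment or a mapping created with MAP_GROWSDOWN).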
- PROT_GROWSUP
- 0x02000000
- Apply the protection change up to the end of a mapping that grows upward.
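The guard-page use of PROT_NONE mentioned in the flag list can be demonstrated with a deliberately overrunning write. This is a sketch only: guard_page_catches_overrun() is a hypothetical name, it assumes Linux delivers SIGSEGV for the access, and recovering from the fault with siglongjmp() is done here purely to observe the result, not as a recommended way to handle real overruns.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and sigsetjmp under strict -std= modes */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;

/* SIGSEGV handler: jump back to the point saved by sigsetjmp(). */
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(fault_jmp, 1);
}

/* Map two pages, turn the second into a PROT_NONE guard page, then write
 * one byte past the end of the first page.
 * Returns 1 if the guard page trapped the write, 0 if the write went
 * through, -1 if the setup failed. */
int guard_page_catches_overrun(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t pg = (size_t)page;

    char *base = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* The second page becomes the guard: any access raises SIGSEGV. */
    if (mprotect(base + pg, pg, PROT_NONE) != 0) {
        munmap(base, 2 * pg);
        return -1;
    }

    struct sigaction sa, old;
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(base, 2 * pg);
        return -1;
    }

    volatile char *buf = base;
    volatile int faulted = 0;
    if (sigsetjmp(fault_jmp, 1) == 0) {
        buf[pg - 1] = 'x'; /* last valid byte: succeeds */
        buf[pg] = 'y';     /* first byte of the guard page: faults */
    } else {
        faulted = 1;       /* reached via siglongjmp() from the handler */
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    munmap(base, 2 * pg);
    return faulted;
}
```

Passing 1 as the second argument of sigsetjmp() saves the signal mask, so SIGSEGV is unblocked again after the jump out of the handler.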