File & I/O · Section 2
open(2)
Open or create a file, returning a file descriptor.
Signature
#include <fcntl.h>
int open(const char *pathname, int flags, mode_t mode);
- pathname
- Path of the file to open. Absolute or relative to the calling process's current working directory.
- flags
- Bitwise OR of an access mode (O_RDONLY, O_WRONLY, or O_RDWR) and zero or more creation/status flags (O_CREAT, O_CLOEXEC, O_NONBLOCK, etc.).
- mode
- File permission bits applied only when O_CREAT or O_TMPFILE is in flags; otherwise ignored. Subject to the process umask.
Description
open() opens the file specified by pathname. If the file does not exist and O_CREAT is given, it is created with the permissions in mode, as modified by the process umask. The call returns a non-negative file descriptor on success, or -1 with errno set on failure. On modern Linux, open() is implemented in terms of openat() with AT_FDCWD. On aarch64 there is no open syscall at all: only openat() is exported, and the libc wrapper translates open() calls.
Architecture mapping
| Architecture | Number | ABI | Entry point |
|---|---|---|---|
| x86 (i386) | 5 | i386 | sys_open |
| x64 (x86_64) | 2 | common | sys_open |
Kernel history
Introduced in Linux 1.0.
2.6.23
O_CLOEXEC was added so that the close-on-exec flag could be set atomically at open time, closing a race window that previously required a separate fcntl(F_SETFD).
2.6.39
O_PATH was added to obtain a descriptor referring to a filesystem location without opening the underlying file — useful for fstatat-style operations and for descriptor passing.
3.11
O_TMPFILE was added to atomically create an unnamed file under a directory; the file disappears when the last reference closes, which avoids the race inherent in creating a named file and then calling unlink().
seccomp & containers
Docker default profile
Allowed
Podman default profile
Allowed
open() is on the Docker and Podman default allow-lists. A profile that blocks it must also block openat() and openat2() to have any effect, because most modern libc implementations issue openat() under the hood. Blocking all three is a heavy hammer that breaks essentially all file I/O.
libseccomp
scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(open), 0);
seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(openat), 0);
seccomp_load(ctx);
BPF filter (raw)
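/* Fragment: load the syscall number and allow it if it is open().
 * A non-matching number skips the ALLOW and falls through to
 * whatever instruction follows in the full filter. */
BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_open, 0, 1),
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
strace example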
$ strace -e openat,open ls /etc
openat(AT_FDCWD, "/etc", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
Note that strace prints openat() for most code paths even when the source calls open(), because glibc implements open() on top of the openat syscall. Use -y to annotate file descriptors with their paths, and -e trace=openat,open to filter.
Security & observability
Rootkits frequently hook open()/openat() to hide files from userspace inspection, classically through an LD_PRELOAD library or a rewritten syscall table. At the kernel level, the eBPF tracepoint sys_enter_openat and the LSM hook lsm/file_open are the canonical observation points; auditd can record the calls with a syscall rule on open and openat. Watch for processes opening /proc/*/mem, /dev/kmem, or /sys/kernel/debug from non-privileged contexts.
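A monitoring tool that receives pathnames from one of these observation points needs some way to flag the sensitive targets listed above. The following is a minimal sketch of such a check; is_sensitive_path is a hypothetical helper, and it matches on the string as given, so it assumes the caller has already normalized the path.

```c
#include <fnmatch.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `path` is one of the sensitive
 * targets named in the text, 0 otherwise. Pure string matching; no
 * symlink resolution or normalization is attempted. */
int is_sensitive_path(const char *path)
{
    static const char debugfs[] = "/sys/kernel/debug";
    size_t n = sizeof debugfs - 1;

    /* FNM_PATHNAME keeps '*' from matching across '/' separators. */
    if (fnmatch("/proc/*/mem", path, FNM_PATHNAME) == 0)
        return 1;
    if (strcmp(path, "/dev/kmem") == 0)
        return 1;
    /* Match the debugfs root itself or anything beneath it. */
    if (strncmp(path, debugfs, n) == 0 &&
        (path[n] == '\0' || path[n] == '/'))
        return 1;
    return 0;
}
```

A real detector would also need the caller's credentials to decide whether the context is non-privileged; that part is outside the scope of this sketch.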
Errors
- EACCES
- The requested access is denied, search permission is denied on a directory component of pathname, or the file does not exist and write access to the parent directory is denied.
- EEXIST
- O_CREAT|O_EXCL was specified and the file exists.
- ENOENT
- A directory component does not exist, or O_CREAT was not set and the file does not exist.
- EISDIR
- pathname refers to a directory and write access was requested (O_WRONLY or O_RDWR).
- ENFILE
- System-wide file-table limit reached.
- EMFILE
- Per-process file-descriptor limit reached.
- ENAMETOOLONG
- pathname is too long.
- ENOSPC
- The file was to be created, but the device has no room for it.
- EROFS
- pathname is on a read-only filesystem and write access was requested.
- ELOOP
- Too many symbolic links were encountered while resolving pathname.
Flags
- O_RDONLY
- 0
- Open for reading only.
- O_WRONLY
- 1
- Open for writing only.
- O_RDWR
- 2
- Open for reading and writing.
- O_CREAT
- 0100
- Create the file if it does not exist; requires mode argument.
- O_EXCL
- 0200
- When combined with O_CREAT, fail if the file already exists (atomic create).
- O_TRUNC
- 01000
- If the file exists, is a regular file, and is opened for writing, truncate it to length 0.
- O_APPEND
- 02000
- Open in append mode; each write is positioned at the end of the file.
- O_NONBLOCK
- 04000
- Open in non-blocking mode where possible; neither open() nor subsequent I/O on the descriptor will make the caller wait. It has no effect on regular files.
- O_CLOEXEC
- 02000000
- Set the close-on-exec flag on the new file descriptor — essential to prevent fd leaks across execve().
- O_DIRECTORY
- 0200000
- Fail with ENOTDIR if pathname is not a directory.
- O_PATH
- 010000000
- Obtain a descriptor that can be used only for filesystem-tree operations (fstatat, openat); the file itself is not opened.
- O_TMPFILE
- 020200000
- Create an unnamed temporary file under the given directory; the file is discarded when the last descriptor is closed unless it has been given a name with linkat().
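Because O_RDONLY has the value 0, the access mode cannot be tested with a plain bitwise AND against the individual constants. A common approach, sketched here with a hypothetical helper name, is to mask the status flags with O_ACCMODE (a standard macro not listed in the table above) and compare the result.

```c
#include <fcntl.h>

/* Hypothetical helper: returns "r", "w", or "rw" for the access mode
 * of an open descriptor, or "?" if the flags cannot be read. */
const char *access_mode_name(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl == -1)
        return "?";

    /* O_ACCMODE isolates the two low bits that hold the access mode;
     * (fl & O_RDONLY) would always be 0 and tell us nothing. */
    switch (fl & O_ACCMODE) {
    case O_RDONLY: return "r";
    case O_WRONLY: return "w";
    case O_RDWR:   return "rw";
    default:       return "?";
    }
}
```

Other status flags such as O_APPEND or O_NONBLOCK can still be tested with an ordinary bitwise AND on the F_GETFL result, since their values are non-zero.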