Help:Ctrtool/ns_open_file

ctrtool ns_open_file can be used to open namespace-specific file descriptors (such as sockets in network namespaces and paths in mount namespaces) without entering the namespace itself. This is useful if a program needs to have file descriptors open from multiple namespaces and/or needs to stay in the initial network or user namespace; see Notes about namespaces#General / Miscellaneous. ns_open_file is designed to work with both traditional namespaces and "rootless" namespaces.

ctrtool ns_open_file [-o i_offset] [-L reg,value] \
     [-n [-N NAMESPACE | -i ns_reg] [-U] [-A] [-d domain] [-t type] [-p protocol] [-l listen_backlog]
         [-4 ipv4_address,port[,options] | -6 ipv6_address,port[,options[,scope_id]]]
         [-P unix_socket_path | -P @abstract_socket_path] [-s reg[,i]] ]
     [-m [-N NAMESPACE | -d dir_fd | -i ns_reg[,[d|n]]] [-U] [-2] [-P file_path] [-O open_flags]... [-R resolve_flags]... [-s reg[,i]] ]
     [-fN FILENAME [-s reg[,i]] ]
     ...
     [PROGRAM] [ARGUMENTS]
  • -n creates a new socket with the domain, type, and protocol given by -d, -t, and -p (these currently accept only numbers corresponding to the AF_*, SOCK_*, and IPPROTO_* constants; TODO: document the "preset" string values). You can also specify a network NAMESPACE for the socket with -N; with -U, the associated user namespace is entered as well. Regardless of these options, the user/network namespace membership of PROGRAM is not changed, and with the appropriate privileges, you can create multiple sockets from network namespaces owned by different user namespaces. To bind the socket to an address, use -4 or -6 (nothing else is done to the socket if those options are not specified). options is a string of one or more of the following characters:
      • 'a': SO_REUSEADDR
      • 'p': SO_REUSEPORT
      • 'f': IP_FREEBIND
      • 't': IP_TRANSPARENT
      • 'd': TCP_DEFER_ACCEPT (currently only 1 s)
      • 'O': IPV6_V6ONLY off
      • 'o': IPV6_V6ONLY on
      • 'e': TCP_NODELAY
      • ':': No operation (should be the only option if no other options are to be set)
      • 'I' (capital i): (-6 only) Scope ID is numeric, rather than an interface name
      • If domain, type, and protocol are not specified, create an IPv6 TCP socket (domain = AF_INET6, type = SOCK_STREAM, protocol = 0)
      • -l (lowercase L) also calls listen() on the socket with the specified backlog value.
  • -m opens the file given by -P in the specified mount namespace. If -U is specified, the associated user namespace is entered as well. For compatibility with older versions, the default path is / and the default set of flags (without -P or -O) is O_RDONLY|O_PATH|O_DIRECTORY.
  • -fN opens a regular file. Although -m -P FILENAME can already perform the same operation, -fN is kept for compatibility.
  • -o specifies the index offset of the n value in the exported CTRTOOL_NS_OPEN_FILE_FD_n environment variables. This option should only be specified once per invocation (i.e. multiple uses of this option will not adjust the value individually for each -m, -n, or -f operation).
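
As a rough illustration of the -6 argument format in the synopsis (ipv6_address,port[,options[,scope_id]]), the following shell sketch assembles that string. The function name v6_bind_spec is hypothetical and not part of ctrtool. It assumes, based on the option list above, that ':' is the placeholder to use when a scope ID is wanted but no real options are.

```shell
# Hypothetical helper (not part of ctrtool): build the argument for -6.
#   $1 = IPv6 address
#   $2 = port
#   $3 = option characters (may be empty)
#   $4 = scope ID (optional; an interface name, or a number if $3 contains 'I')
v6_bind_spec() {
    v6_spec="$1,$2"
    if [ -n "${4:-}" ]; then
        # The scope ID is positional, so the options field must be present;
        # ':' is the documented no-op when no options are wanted.
        v6_spec="$v6_spec,${3:-:},$4"
    elif [ -n "${3:-}" ]; then
        v6_spec="$v6_spec,$3"
    fi
    printf '%s\n' "$v6_spec"
}
```

For example, v6_bind_spec fe80::1 80 '' eth0 prints fe80::1,80,:,eth0, which could then be passed to -6.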

The file descriptor resulting from each of these operations is exported in the environment variable $CTRTOOL_NS_OPEN_FILE_FD_n when PROGRAM is executed, where n is the index of the operation (the first socket or descriptor opened is stored in $CTRTOOL_NS_OPEN_FILE_FD_0, the second in $CTRTOOL_NS_OPEN_FILE_FD_1, etc.). If -o is specified, its value is added to n. For example, with -o 100, numbering starts at $CTRTOOL_NS_OPEN_FILE_FD_100, then $CTRTOOL_NS_OPEN_FILE_FD_101, etc.

For all operations, it is possible to specify -s reg to store the resulting file descriptor in the internal "register" reg (numeric, 0 to 7 inclusive). All eight registers are initialized to -1 and are local to a single invocation of ns_open_file. A subsequent operation can then use -i reg to load the namespace file descriptor from that register. However, after exec, the register contents are reset to -1. If -s reg,i is used, then the CTRTOOL_NS_OPEN_FILE_FD_n environment variable will just be -1 (the actual file descriptor number is still saved to the internal register); in this case, the file descriptor is marked close-on-exec so that it is not leaked into the target process.
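
If PROGRAM is a shell script, the two rules above (the -o offset and the -1 placeholder left by -s reg,i) could be handled with a small lookup function. This is only a sketch: ns_fd is a hypothetical name, not something ctrtool provides, and the offset has to be passed in by hand because nothing above says it is exported.

```shell
# Hypothetical helper (not part of ctrtool): print the descriptor number
# exported for operation number $1 (0-based), given the -o offset $2
# (defaults to 0). Returns non-zero if the variable is unset or holds -1.
ns_fd() {
    ns_fd_idx=$(( $1 + ${2:-0} ))
    # The index is the result of shell arithmetic, so it is safe to eval.
    eval "ns_fd_val=\${CTRTOOL_NS_OPEN_FILE_FD_${ns_fd_idx}:-}"
    case $ns_fd_val in
        ''|-1) return 1 ;;   # unset, or kept only in a register (-s reg,i)
    esac
    printf '%s\n' "$ns_fd_val"
}
```

A script started with -o 100 would then call ns_fd 0 100 to obtain the first descriptor, rather than assuming a fixed number such as 3.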

The file path specified by -N should be the full path to the network or mount namespace file; for example, for network namespaces created by ip netns add, use /run/netns/[NAME]. For the network namespace of a process, use /proc/PID/ns/net.
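
The two path forms can be spelled out as shell functions when the -N argument is built in a script. Both function names are hypothetical and exist only for this sketch.

```shell
# Hypothetical helpers (not part of ctrtool) for the -N argument.

# Network namespace created with "ip netns add NAME".
netns_file_by_name() {
    printf '/run/netns/%s\n' "$1"
}

# Namespace of a running process; $2 is the namespace type (default: net).
ns_file_by_pid() {
    printf '/proc/%s/ns/%s\n' "$1" "${2:-net}"
}
```

For instance, ns_file_by_pid 12345 mnt prints /proc/12345/ns/mnt, the form used with -m -N in the examples.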

Due to how user namespaces work, if you specify -U and own the user namespace of the target network namespace (see the discussion of "owner UID" in user_namespaces(7)), then you will not need any privileges to bind to ports less than 1024, open a raw or packet socket, or set certain socket options like IP_TRANSPARENT.

If -N is specified, then any addresses, ports, or scope IDs that you specify for the socket are interpreted relative to the foreign network namespace, not the current one. In other words, specify the address, port, and/or scope ID as if you had used nsenter --net=NAMESPACE PROGRAM or ip netns exec NAMESPACE PROGRAM. Other parameters (such as proxy connect or bind addresses in the target process's configuration file, or anything that does not involve the server socket) should still be relative to the current network namespace.

If -A is specified with -n, then the new socket will be in a new, anonymous network namespace. If -N is also specified with -A, then the user namespace of the namespace specified with -N (-N can specify any namespace file) will be the owning user namespace of the new network namespace. If -N specifies a user namespace, then the new network namespace will be owned by its parent user namespace. If -U is also specified with -A and -N (-U has no effect without -N), then -N must specify a user namespace, and the new network namespace will simply be owned by that user namespace. The actual file descriptor for the newly created network namespace can be later retrieved using the SIOCGSKNS ioctl on the newly created socket (or with the -N operation of pidfd_ctl).

The operation performed by ns_open_file -n is fundamentally similar to the socketat()[1] system call that was originally proposed at the same time setns() was introduced, but never made it to the mainline kernel.

Examples

Open a socket bound to port 80, drop privileges, then run FOOBAR, which accepts a file descriptor argument with -f.

ctrtool ns_open_file -n -6 ::,80,a -l 4096 setpriv --reuid=www-data --regid=www-data --init-groups sh -c 'exec FOOBAR -f "$CTRTOOL_NS_OPEN_FILE_FD_0"'

Equivalent functionality on IPv4 in case IPv6 is turned off:

ctrtool ns_open_file -n -d 2 -4 0.0.0.0,80,a -l 4096 setpriv --reuid=www-data --regid=www-data --init-groups sh -c 'exec FOOBAR -f "$CTRTOOL_NS_OPEN_FILE_FD_0"'

Open two sockets bound to ports 80 and 443 in two different network namespaces (/run/netns/my_netns1 and /run/netns/my_netns2):

ctrtool ns_open_file \
    -n -N /run/netns/my_netns1 -6 ::,80,a -l 4096 \
    -n -N /run/netns/my_netns2 -6 ::,443,a -l 4096 \
    FOOBAR ARGUMENTS

Suppose a rootless container is running as PID 12345. Create a DNS resolver listening socket on 127.0.0.10, UDP port 53, so that DNS requests within the container are forwarded to the FOOBAR host process. FOOBAR would have to be instructed to use the listening socket in $CTRTOOL_NS_OPEN_FILE_FD_0, whose value is not guaranteed to be 3 or 4.

ctrtool ns_open_file -n -N /proc/12345/ns/net -U -d 2 -t 2 -4 127.0.0.10,53,a FOOBAR ARGUMENTS

Alternatively, if FOOBAR supports systemd socket activation (note that this also closes all previously open file descriptors other than stdin/stdout/stderr):

ctrtool ns_open_file -n -N /proc/12345/ns/net -U -d 2 -t 2 -4 127.0.0.10,53,a ctrtool set_fds -s -e :CTRTOOL_NS_OPEN_FILE_FD_0 FOOBAR ARGUMENTS

Open the root of the mount namespace of the same process instead:

ctrtool ns_open_file -m -N /proc/12345/ns/mnt -U FOOBAR ARGUMENTS

To access /var/lib/foobar in the mount namespace, use

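/* Requires <stdlib.h> and <fcntl.h>. */
const char *fd_str = getenv("CTRTOOL_NS_OPEN_FILE_FD_0");
int dir_fd = fd_str ? atoi(fd_str) : -1; /* -1 if the variable is unset */
int new_fd = openat(dir_fd, "var/lib/foobar", O_RDONLY);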

or open the generic path

/proc/self/fd/$CTRTOOL_NS_OPEN_FILE_FD_0/var/lib/foobar

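or, if the foreign mount namespace is untrusted (requires Linux 5.6 or later)

/* Requires <linux/openat2.h>, <sys/syscall.h>, and <unistd.h>; called through syscall() because many glibc versions provide no openat2() wrapper. */
struct open_how o_h = {.flags = O_RDONLY, .mode = 0, .resolve = RESOLVE_IN_ROOT|RESOLVE_NO_MAGICLINKS};
int new_fd = syscall(SYS_openat2, dir_fd, "var/lib/foobar", &o_h, sizeof(o_h));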

If this command is running with the same user ID as the container itself, then no privileges or capabilities are required.
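
For shell scripts, the generic /proc/self/fd path shown above could be produced by a helper such as the following sketch. ns_generic_path is a hypothetical name. Note that this form does not confine symlink resolution to the foreign root, so for an untrusted mount namespace the openat2() variant remains the safer choice.

```shell
# Hypothetical helper (not part of ctrtool): map a path inside the foreign
# mount namespace to a /proc/self/fd path, using the directory descriptor
# exported for operation number $1.
ns_generic_path() {
    eval "ns_gp_fd=\${CTRTOOL_NS_OPEN_FILE_FD_$(( $1 )):-}"
    case $ns_gp_fd in
        ''|-1) return 1 ;;   # unset, or kept only in a register (-s reg,i)
    esac
    # Strip leading slashes so the path stays relative to the directory.
    ns_gp_rel=$2
    while [ "${ns_gp_rel#/}" != "$ns_gp_rel" ]; do
        ns_gp_rel=${ns_gp_rel#/}
    done
    printf '/proc/self/fd/%s/%s\n' "$ns_gp_fd" "$ns_gp_rel"
}
```

With CTRTOOL_NS_OPEN_FILE_FD_0 set to 7, ns_generic_path 0 /var/lib/foobar prints /proc/self/fd/7/var/lib/foobar.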

The following command demonstrates the use of "registers" in ns_open_file to open up all of the target process's namespaces using a dirfd-relative operation:

ctrtool ns_open_file \
    -m -P /proc/self/ns -O rdonly -O path -O directory -s 0,i \
    -m -i 0,d -P cgroup -O rdonly -O nonblock \
    -m -i 0,d -P ipc -O rdonly -O nonblock \
    -m -i 0,d -P mnt -O rdonly -O nonblock \
    -m -i 0,d -P net -O rdonly -O nonblock \
    -m -i 0,d -P pid -O rdonly -O nonblock \
    -m -i 0,d -P time -O rdonly -O nonblock \
    -m -i 0,d -P user -O rdonly -O nonblock \
    -m -i 0,d -P uts -O rdonly -O nonblock \
    PROGRAM ARGUMENTS

The first -m operation opens up /proc/self/ns, which is the directory containing all of the namespace files, storing that file descriptor internally in "register" 0 (using -s 0,i). The eight subsequent -m operations use that directory file descriptor to open up each of the individual namespace files relative to that directory. That directory is specified using -i 0,d, which says to use the file descriptor stored in "register" 0 as the directory file descriptor for that operation.

Environment variable         Contents
CTRTOOL_NS_OPEN_FILE_FD_0    A literal -1
CTRTOOL_NS_OPEN_FILE_FD_1    File descriptor number of the cgroup namespace
CTRTOOL_NS_OPEN_FILE_FD_2    File descriptor number of the ipc namespace
CTRTOOL_NS_OPEN_FILE_FD_3    File descriptor number of the mount namespace
CTRTOOL_NS_OPEN_FILE_FD_4    File descriptor number of the network namespace
CTRTOOL_NS_OPEN_FILE_FD_5    File descriptor number of the PID namespace
CTRTOOL_NS_OPEN_FILE_FD_6    File descriptor number of the time namespace
CTRTOOL_NS_OPEN_FILE_FD_7    File descriptor number of the user namespace
CTRTOOL_NS_OPEN_FILE_FD_8    File descriptor number of the UTS namespace
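
If PROGRAM in the command above is a shell script, it could recover the same mapping with a loop like the one sketched here. The function name list_ns_fds is hypothetical, and the sketch assumes no -o offset was given and that the -m operations appear in exactly the order shown.

```shell
# Hypothetical helper (not part of ctrtool): print "name=fd" for each of the
# eight namespace files, in the same order as the -m operations above.
# Index 0 is skipped: it is the /proc/self/ns directory and holds -1.
list_ns_fds() {
    ns_i=1
    for ns_name in cgroup ipc mnt net pid time user uts; do
        eval "ns_fdnum=\${CTRTOOL_NS_OPEN_FILE_FD_${ns_i}:-unset}"
        printf '%s=%s\n' "$ns_name" "$ns_fdnum"
        ns_i=$((ns_i + 1))
    done
}
```

Looking the descriptors up by name this way avoids hard-coding descriptor numbers in the script.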

Common values for -d and -t

  • -d 1: AF_UNIX
  • -d 2: AF_INET
  • -d 10: AF_INET6
  • -t 1: SOCK_STREAM (= TCP for INET/INET6)
  • -t 2: SOCK_DGRAM (= UDP for INET/INET6)
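
Since -d and -t take numbers, a script may read more clearly if it maps the constant names to the values listed above. The sketch below covers only the values in this list; sock_const is a hypothetical name, not a ctrtool feature.

```shell
# Hypothetical helper (not part of ctrtool): translate a constant name from
# the list above into the number expected by -d or -t.
sock_const() {
    case $1 in
        AF_UNIX|SOCK_STREAM) echo 1 ;;
        AF_INET|SOCK_DGRAM)  echo 2 ;;
        AF_INET6)            echo 10 ;;
        *) return 1 ;;       # name not covered by this sketch
    esac
}
```

The DNS example could then pass -d "$(sock_const AF_INET)" -t "$(sock_const SOCK_DGRAM)" instead of -d 2 -t 2.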

This end-user documentation is part of ctrtool. Reproduction and use of this material for any purpose is permitted, provided that a link to this page is provided as attribution.