Snippets:Creating network namespace atomically

The following shell scripts both attempt to create a new network namespace with a veth pair to the host.

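#!/bin/sh
set -eu

# Create the named namespace first; other processes can see it from here on.
ip netns add netns0
# Create the veth pair: to_netns0 stays on the host, eth0 goes into netns0.
ip link add to_netns0 type veth peer name eth0 netns netns0
# Configure the host end.
ip addr add 192.168.1.2/24 dev to_netns0
ip link set to_netns0 up
# Configure the namespace end.
ip -n netns0 link set lo up
ip -n netns0 addr add 192.168.1.3/24 dev eth0
ip -n netns0 link set eth0 up
ip -n netns0 route add 0.0.0.0/0 via 192.168.1.2 dev eth0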
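
#!/bin/sh
set -eu

# Run the setup inside a new, still unnamed network namespace.
unshare -n sh -s <<\EOF
# -C (noclobber) makes ": >" below fail if the name is already taken.
set -euC
ip link set lo up
# Create the veth pair from inside: eth0 stays here, to_netns0 goes to PID 1's namespace.
ip link add eth0 type veth peer name to_netns0 netns /proc/1/ns/net
ip addr add 192.168.1.3/24 dev eth0
ip link set eth0 up
# Name the namespace under /run/netns only after the steps above have succeeded.
umask 077
: > /run/netns/netns0
mount --bind /proc/self/ns/net /run/netns/netns0
EOF
# Configure the host end, then add the default route inside the namespace.
ip addr add 192.168.1.2/24 dev to_netns0
ip link set to_netns0 up
ip -n netns0 route add 0.0.0.0/0 via 192.168.1.2 dev eth0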

Both scripts produce the same configuration when every command succeeds, but the second one is significantly safer.

Why is that? The first script creates the network namespace in the conventional way, using only the ip command. The second is more obscure and relies on two extra programs, unshare and mount.

One reason is a race condition in the first script. Its very first command names the namespace, so the namespace is visible under /run/netns before any interface, address, or route exists in it. If another process runs ip netns exec netns0 while the script is still executing, that process sees a partially configured namespace, and its socket operations may fail or behave unexpectedly.

Another, more important reason concerns failures. The second script creates the new network namespace without bind-mounting it into the filesystem, and bind-mounts it only as the last step inside the namespace. (For comparison, ip netns add netns0 is roughly equivalent to : > /run/netns/netns0; unshare -n mount --bind /proc/self/ns/net /run/netns/netns0.) Consider what happens if one of the commands run for the new network namespace fails. The configuration shown here is a simple subnet; real scripts may have more complicated setups, such as iptables rules, and so more ways to fail. With the first script, a failure leaves a named but incompletely configured namespace behind. With the second script, if any command inside the new namespace fails, the shell running there exits, nothing refers to the namespace any longer, and it is destroyed immediately without any other change to the system. (The one exception is a failure of the final mount, which leaves the empty file /run/netns/netns0 behind.) The veth pair does not keep the namespace alive: unlike sockets, processes, and file descriptors open on a /proc/PID/ns/net file, veth pairs do not hold a reference to the network namespace in which they were created. If the network namespace containing either end of a veth pair is destroyed, the entire pair is destroyed.
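For comparison, the first script can be given explicit rollback so that a failure does not leave a half-configured namespace behind. The following is a minimal sketch, not part of the original scripts: configure_netns and create_netns_with_rollback are hypothetical helper names, and the host interface is assumed to be named to_ followed by the namespace name (which must fit within the 15-character interface name limit). It only cleans up after a failure; it does not close the window during which other processes can see the partially configured namespace.

```shell
#!/bin/sh
# Sketch: the first script's steps with explicit rollback on failure.
# configure_netns and create_netns_with_rollback are hypothetical helpers.

# Run every configuration step; the && chain stops at the first failure.
configure_netns() {
    ns=$1
    ip link add "to_$ns" type veth peer name eth0 netns "$ns" &&
    ip addr add 192.168.1.2/24 dev "to_$ns" &&
    ip link set "to_$ns" up &&
    ip -n "$ns" link set lo up &&
    ip -n "$ns" addr add 192.168.1.3/24 dev eth0 &&
    ip -n "$ns" link set eth0 up &&
    ip -n "$ns" route add 0.0.0.0/0 via 192.168.1.2 dev eth0
}

# Name the namespace first (as the first script does), then configure it,
# deleting it again if any configuration step fails.
create_netns_with_rollback() {
    ns=$1
    ip netns add "$ns" || return 1
    if ! configure_netns "$ns"; then
        # Deleting the namespace also destroys the veth pair, because
        # one end of the pair lives inside it.
        ip netns del "$ns"
        return 1
    fi
}
```

Run as root, create_netns_with_rollback netns0 would either leave a fully configured netns0 or remove it again. The second script gets the same cleanup for free, because an unnamed namespace disappears as soon as the shell inside it exits.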

In other words, the second script first prepares the new network namespace to our liking (which may include addresses, routes, and/or firewall rules) and gives it a name only once it is fully formed. A process that runs ip netns exec on the new namespace therefore either sees it in its configured state or does not see it at all. (In the script above, the host-side address and the default route are still applied after the name exists.)
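The prepare-then-name pattern can be wrapped in a small helper. This is a sketch under assumptions: publish_netns is a hypothetical function name, the setup commands are passed as a single string, the namespace name is assumed to be a plain word, and /run/netns is assumed to exist already (ip netns add creates it on first use, but this helper does not).

```shell
#!/bin/sh
# Sketch: configure a network namespace privately, then give it a name.
# publish_netns is a hypothetical helper, not part of iproute2 or util-linux.

publish_netns() {
    name=$1     # name to create under /run/netns (assumed to be a plain word)
    setup=$2    # commands to run inside the new namespace before it is named

    # The here-document is unquoted, so $setup and $name are expanded here,
    # in the calling shell, before the inner sh reads the script.
    # -e aborts on the first failing command; -C makes ": >" fail if the
    # name already exists.
    unshare -n sh -s <<EOF
set -euC
$setup
umask 077
: > /run/netns/$name
mount --bind /proc/self/ns/net /run/netns/$name
EOF
}
```

From a root shell it might be called as publish_netns netns0 'ip link set lo up'. If any setup command fails, unshare returns a non-zero status and no name is created.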