Reordering the Debian boot sequence for correctness and speed

There are subtle bugs in the Debian boot and shutdown sequences. They are hard to find, as they normally only affect rare combinations of packages. They are harder to fix, as they normally require the combined work of several maintainers and changes in several packages. This talk is about the release goal for Lenny to fix them, and to gain a few advantages on the way.

Petter Reinholdtsen - one of the sysvinit maintainers
pere@hungry.com
FOSDEM 2008, 2008-02-24

Outline

SysV init in Debian - Booting

Note, this is the stuff going on after the initrd part is done. The very early boot is done before hard drive partitions are mounted.

  1. /sbin/init starts and reads /etc/inittab to decide what to do.
  2. The scripts in /etc/rcS.d/ are executed in sequence by /etc/init.d/rc, with start as the argument.
  3. Depending on the runlevel, the scripts for the given runlevel (normally the ones in /etc/rc2.d/) are executed in sequence: first the stop scripts with stop as their argument, then the start scripts with start as their argument. The rc*.d/ directories contain symlinks to the files in /etc/init.d/.
  4. The ordering is important.
  5. Runlevel 1 is not the single-user runlevel, although it behaves like a better single-user mode. The real single-user runlevel (S) presents a login prompt after the scripts in rcS.d/ have been executed.
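
The walk through a runlevel directory described in the steps above can be sketched as a dry run. This is not the code of /etc/init.d/rc; list_runlevel_actions is a hypothetical helper that only prints what would be called, and it assumes the usual link naming of K or S followed by a two-digit sequence number.

```shell
# Dry-run sketch: list the stop (K*) links first, then the start (S*)
# links, each group in the order the shell glob sorts them (by number).
# Nothing is executed, the commands are only printed.
list_runlevel_actions() {
    rcdir=$1
    for link in "$rcdir"/K[0-9][0-9]*; do
        # Skip the literal pattern when no link matches.
        [ -e "$link" ] && echo "$link stop"
    done
    for link in "$rcdir"/S[0-9][0-9]*; do
        [ -e "$link" ] && echo "$link start"
    done
    return 0
}

# Example: list_runlevel_actions /etc/rc2.d
```

Because the order comes purely from the file names, two links with the same number fall back to alphabetical order, which is why the numbers matter so much.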

Note that because the Linux kernel is becoming more and more event based, the boot sequence is no longer sequential.

SysV init in Debian - Switching runlevels

Note that switching to runlevel S will not run the scripts in /etc/rcS.d/. To get a similar effect after boot, switch to runlevel 1. It will (should) kill all services and prepare the machine for maintenance.

SysV init in Debian - Shutting down

Shutting down is roughly equivalent to switching to runlevel 0 (halt) or 6 (reboot).

Minor exception: all scripts (both start and stop) are executed with the stop argument, ignoring their start and stop settings and confusing script writers.

Only stop scripts for services started in the previous runlevel are executed.

The ordering problem

Script ordering is vital for this to work. And how are the scripts ordered? By numbers 01-99!

And the numbers are picked using skills, knowledge and negotiation. Getting it right is often hard.

The current Debian default is wrong. The stop sequence should by default be the reverse of the start sequence. It isn't. The default uses '20' for both.

Reordering is hard and sometimes requires cooperation between maintainers of all packages involved.

The ordering problem - an example

Given two packages with two scripts inserted with the default settings in Debian:

Package A: script_a sequence 20 (start and stop)
Package B: script_b sequence 20 (start and stop)

Along comes script_c, which should run before script_a and after script_b. The current solution is to change packages A and C, or packages B and C, to get something like this:

Package A: script_a start seq. 22, stop seq. 18
Package B: script_b sequence 20 (start and stop)
Package C: script_c start seq 21, stop seq 19

If other scripts depend on the old order of script_a, they will have to change their sequence number too. The only way to discover this is by a lot of testing, or documenting script dependencies.

An ordering solution

Let each script document its dependencies, and generate sequence numbers using this dependency information. Example:

Package A: script_a depends on nothing
Package B: script_b depends on nothing
Package C: script_c depends on script_b, and is a dependency of script_a

Generated sequence:

script_b start seq 1, stop seq 3
script_c start seq 2, stop seq 2
script_a start seq 3, stop seq 1

An implementation of this system is the dependency based boot sequencing provided by the insserv package. It uses the format specified in the Linux Standard Base (LSB) to document init.d script dependencies.
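
The generated sequence in the example can be reproduced with a plain topological sort. The following is only a sketch of the idea, not how insserv works internally: generate_sequence is a hypothetical helper, it relies on the tsort tool from coreutils, and it expects "before after" pairs on standard input.

```shell
# Sketch: turn "before after" pairs into start and stop sequence numbers.
# The stop number is the mirror image of the start number, so shutdown
# runs in the reverse order of boot.
generate_sequence() {
    order=$(tsort) || return 1   # tsort fails if the pairs contain a loop
    set -- $order                # one script name per positional parameter
    total=$#
    position=0
    for name in "$@"; do
        position=$((position + 1))
        printf '%s start seq %d, stop seq %d\n' \
            "$name" "$position" "$((total - position + 1))"
    done
}

# script_c depends on script_b, and script_a depends on script_c.
printf '%s\n' 'script_b script_c' 'script_c script_a' | generate_sequence
# prints:
# script_b start seq 1, stop seq 3
# script_c start seq 2, stop seq 2
# script_a start seq 3, stop seq 1
```

With more scripts and fewer constraints, several orderings are valid and the one picked depends on the sorting tool, so only the relative order of dependent scripts should be relied on.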

Checking the current boot sequence

Two options are available with the insserv package:

  1. Static checking of the current headers, reporting mismatches in the current ordering:
     /usr/share/insserv/check-initd-order [-o] [-k]
  2. A graph of the dependencies, for review with dotty from graphviz:
     /usr/share/insserv/check-initd-order -g

Enabling dependency based boot sequencing I

The conversion will refuse to proceed when obsolete init.d scripts, loops, duplicate provides, etc. are detected.
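
To illustrate one of these checks, here is a minimal sketch of loop detection. It is not the check insserv performs; has_dependency_loop is a hypothetical helper, and it assumes the tsort from GNU coreutils, which exits with a non-zero status when its input contains a loop.

```shell
# Sketch: succeed (return 0) when the "before after" pairs on stdin
# contain a dependency loop, fail (return 1) when they can be ordered.
has_dependency_loop() {
    if tsort >/dev/null 2>&1; then
        return 1    # a valid order exists, so there is no loop
    fi
    return 0        # tsort gave up: the input contains a loop
}

# script_a before script_b and script_b before script_a cannot both hold.
if printf '%s\n' 'script_a script_b' 'script_b script_a' | has_dependency_loop
then
    echo "loop detected, refusing to reorder"
fi
```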

Enabling dependency based boot sequencing II

# aptitude install insserv
# dpkg-reconfigure insserv
info: Checking if it is safe to convert to dependency based boot.
info: Backing up existing boot scripts in \
  /var/lib/insserv/bootscripts-20080223T0742.tar.gz
info: Reordering boot system, log to \
  /var/lib/insserv/run-20080223T0742.log
info: Recording new boot sequence in \
  /var/lib/insserv/bootscripts-20080223T0742-after.list
info: Use '/usr/sbin/update-bootsystem-insserv \
  restore' to restore the old boot sequence.
Adding `diversion of /usr/sbin/update-rc.d to \
  /usr/sbin/update-rc.d.distrib by insserv'
success: Boot system successfully converted
# /var/lib/insserv/insserv-seq-changes \
  /var/lib/insserv/bootscripts-20080223T0742.tar.gz
[...]
#

Using dependency based boot sequencing

update-rc.d refuses to insert scripts which create a loop.

update-rc.d requires scripts to be inserted in dependency order.

Incorrect dependencies give a wrong, but predictable and stable (the same every time), boot and shutdown order.

Every insserv upload is checked using a test suite to make sure the generated boot sequence is correct, and that previously detected bugs do not show up again.

It is possible to enable concurrent booting, running boot scripts in parallel (CONCURRENCY=startpar in /etc/default/rcS).

It might even speed up boot and shutdown (initial benchmarks show a speedup from just reordering based on dependencies).

Disabling dependency based boot sequencing

Run dpkg-reconfigure insserv and disable it.

It is always possible to disable just after it was enabled, before any new packages are installed.

When disabling it, a backup of the old boot sequence is restored if no changes have been made to the boot sequence since it was enabled.

If restore is not possible, all postinst scripts for packages with init.d scripts will be executed again to make them call update-rc.d and add the boot scripts again.

This is guaranteed to work if no packages have been added since it was enabled, and most often works if packages have been added. So if you change your mind, do it quickly.

LSB headers for insserv

What to list as dependencies (I)

If your package used the default update-rc.d settings before, this is the header for you:

### BEGIN INIT INFO
# Provides:          scriptname
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO

$remote_fs is needed by all scripts using files in /usr/. $syslog is needed only by scripts starting services logging to syslog.
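
When reviewing headers it can help to pull a single field out of a script. The following is a sketch only; lsb_field is a hypothetical helper (it is not part of insserv), and it only looks at lines between the BEGIN and END markers shown above.

```shell
# Sketch: print the value of one LSB header field from an init.d script.
#   $1 = path to the script, $2 = field name such as Required-Start
lsb_field() {
    sed -n "/^### BEGIN INIT INFO/,/^### END INIT INFO/s/^# $2:[[:space:]]*//p" "$1"
}

# Example: lsb_field /etc/init.d/scriptname Required-Start
```

The same helper can be used with Provides to find the name other scripts should list when they depend on this one.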

Virtual facilities (I)

Linux Standard Base version 3.2 defines these virtual facilities:

$local_fs
all local file systems are mounted. (In Debian, / and /var/ are available)
$network
basic networking support is available. Example: a server program could listen on a socket. (In Debian, network interfaces are up)
$portmap
daemons providing the SunRPC/ONCRPC portmapping service as defined in RFC 1833: Binding Protocols for ONC RPC Version 2 (if present) are running.
$remote_fs
all remote file systems are available. In some configurations, file systems such as /usr may be remote. Many applications that require $local_fs will probably also require $remote_fs. (In Debian, /usr/ and NFS directories are guaranteed to be mounted)

Virtual facilities (II)

$time
the system time has been set, for example by using a network-based time program such as ntp or rdate, or via the hardware Real Time Clock.
$syslog
the system logger is operational.
$named
IP name-to-address translation, using the interfaces described in this specification, are available to the level the system normally provides them. Example: if a DNS query daemon normally provides this facility, then that daemon has been started.

All of these represent points in time during boot and shutdown. Virtual facilities are defined in /etc/insserv.conf and /etc/insserv.conf.d/

What to list as dependencies (II)

Normally, the start and stop dependencies are the same.

Virtual dependencies are preferred over specific dependencies.

When using specific dependencies, use the string listed in the provides header of the scripts you depend on.

Scripts started in rcS.d/ need extra care.

All scripts not started in rcS.d/ should depend on $remote_fs. This makes sure /usr/ is available during start, and that the service is stopped before sendsigs kills all processes during shutdown.

A sysadmin can provide overrides in /etc/insserv/overrides/scriptname if the script settings are wrong. The insserv package provides overrides in /usr/share/insserv/overrides/ for packages currently missing headers.
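
As an illustration, an override could be created like this. The sketch assumes that an override file holds just the LSB header block that should replace the one in the script; write_override and the script name passed to it are hypothetical, and the target directory is a parameter so the result can be inspected in a scratch location first.

```shell
# Sketch: write an override with the default header for one script.
#   $1 = overrides directory, $2 = name of the init.d script
# Assumption: the override file contains only the LSB header block.
write_override() {
    override_dir=$1
    script_name=$2
    mkdir -p "$override_dir" || return 1
    cat > "$override_dir/$script_name" <<EOF
### BEGIN INIT INFO
# Provides:          $script_name
# Required-Start:    \$remote_fs \$syslog
# Required-Stop:     \$remote_fs \$syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO
EOF
}

# Example (scratch directory): write_override /tmp/overrides scriptname
```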

Handling alternatives

Not quite tested yet.

Should perhaps be handled using virtual facilities in /etc/insserv.conf.d/.

Do not list identical provides in several scripts; it breaks installation.
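
A quick way to spot identical provides before they cause trouble could look like the sketch below. duplicate_provides is a hypothetical helper, not an insserv tool; it only reads the Provides lines inside the INIT INFO blocks of the files in one directory.

```shell
# Sketch: print every name that is listed in the Provides header of
# more than one script in the given directory.
#   $1 = directory holding the init.d scripts
duplicate_provides() {
    for script in "$1"/*; do
        [ -f "$script" ] || continue
        sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/s/^# Provides:[[:space:]]*//p' "$script"
    done |
        tr -s ' \t' '\n\n' |   # one provided name per line
        sort | uniq -d         # keep only names seen more than once
}

# Example: duplicate_provides /etc/init.d
```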

Status of the dependency based boot system

LSB header progress graph

Release goal for Lenny.
76% of packages have LSB headers.
Unsolved in BTS: ~85
Without BTS reports: ~150
Last package will be fixed 2008-06-15 at the current rate.
Needs better documentation for maintainers.
Should update Debian policy to reflect dependency based boot sequencing.

Two insserv bugs left to fix:

Tracking status

The current status is tracked in the svn repository for insserv. Run
debian/rules missing-overrides or
debian/rules missing-by-popcon
in the source directory to see the current status.

See also wiki pages with documentation and status:

There are also slides from my talk at Debconf 7.

More manpower is needed to file BTS reports, NMU packages and discover dependency errors. Please test the system.

Thank you very much

Questions?

http://www.hungry.com/~pere/mypapers/200802-bootsequence/200802-bootsequence.html