E pur si muove

Encrypted root on Debian with keyfile/keyscript

Thursday, December 15, 2016

I've recently set up another laptop with whole-disk encryption, including /boot. This is all fine (the debian-installer almost gets it right, you just need to edit /etc/default/grub to add GRUB_ENABLE_CRYPTODISK=y and then re-run grub-install on the target), but once you get this working you end up having to type in the password for the LUKS disk encryption twice at boot: once for GRUB and once for the booted system. You can solve this by adding a keyfile to the encrypted disk so it can be decrypted with either this keyfile or the original password. The idea is then to put this keyfile into the initrd image, where the bootup sequence uses it instead of having you type in the passphrase again. The initrd itself is decrypted by GRUB together with the kernel etc. when you type in the password. This sounds simple enough, but on Debian this still involves creating some manual scripts to get this all working.

I'm going to skip over the initial setup of getting GRUB to work and just cover the keyfile part. But I'll quickly describe my setup so you know the starting point. I have a GPT partition table with 2 partitions: one FAT partition for EFI and the second partition is used as the LUKS encrypted block device. This LUKS device is then used as a Physical Volume (PV) for LVM, with two LVM partitions: one for swap and the other for BTRFS, on which my root filesystem lives directly at the top of the main BTRFS volume.

Once you have this booting with a password, you want to create a keyfile which will also be able to unlock the LUKS volume. LUKS will only use the required number of bits out of a keyfile so you can just make a large one, or you can look up how many bits you need:

# cryptsetup luksDump /dev/nvme0n1p2
MK bits:       512

So we need a minimum of 512 bits in our keyfile, or 64 bytes. I store the keyfile in /etc/cryptroot/ but you can store it anywhere you like, just make sure only root can read it. Finally you want to add the keyfile as a way to decrypt the LUKS volume:

# dd if=/dev/urandom of=/etc/cryptroot/keyfile.bin bs=1 count=64
# chmod go-rwx /etc/cryptroot/keyfile.bin
# cryptsetup luksAddKey /dev/nvme0n1p2 /etc/cryptroot/keyfile.bin
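If you would rather script the keyfile creation, here is a minimal Python sketch of the same idea. It assumes you pass in the "MK bits" value from luksDump by hand; the function name make_keyfile and the throwaway demo directory are made up for illustration, and the luksAddKey step still has to be done with cryptsetup as above.

```python
import os
import stat
import tempfile


def make_keyfile(path, key_bits=512):
    """Write key_bits of random data to path, readable by the owner only."""
    if key_bits % 8:
        raise ValueError("key_bits must be a multiple of 8")
    # O_EXCL refuses to overwrite an existing keyfile; mode 0o600 keeps
    # group and others out from the moment the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as keyfile:
        keyfile.write(os.urandom(key_bits // 8))
    return key_bits // 8


# Demo in a throwaway directory rather than the real /etc/cryptroot.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "keyfile.bin")
size = make_keyfile(demo_path, 512)
mode = stat.S_IMODE(os.stat(demo_path).st_mode)
print(size, oct(mode))
```

Creating the file with restrictive permissions up front avoids the short window between dd and chmod in which the keyfile is readable by others.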

This is the easy part. Now you want to add this keyfile to the initramfs and make sure that the initramfs will use it to decrypt the disk early at boot, rather than prompting you for the password.

Start by adding the keyfile to the Debian-specific /etc/crypttab. The third field there is the path of the keyfile, so it should look similar to this:

# name          device                                    keyfile                    options
nvme0n1p2_crypt UUID=e5216e6f-f89c-49c4-8aca-33cde0d4648d /etc/cryptroot/keyfile.bin luks,discard,keyscript=/etc/cryptroot/

The fourth field in this file holds the options. You should have luks as one option, and I also use discard since I use a solid state disk and want the TRIM command to be issued. If you only had these options then Debian would be able to decrypt your disk on startup, *if* it could read the keyfile from the filesystem. But since the partition holding the keyfile is itself encrypted, we need to add the keyscript option to enable the keyfile to be used in the initrd. Again the keyscript can live anywhere, but I also keep it in /etc/cryptroot.
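To make the field layout concrete, here is a small Python sketch that splits a crypttab line into the four fields described above. The names CrypttabEntry and parse_crypttab_line are my own inventions for this example, treating missing trailing fields as empty is a simplification, and the sample line is copied from the entry shown above.

```python
from typing import NamedTuple


class CrypttabEntry(NamedTuple):
    name: str
    device: str
    keyfile: str
    options: dict


def parse_crypttab_line(line):
    """Split one crypttab line into its fields; None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("need at least a name and a device")
    keyfile = fields[2] if len(fields) > 2 else ""
    options = {}
    if len(fields) > 3:
        # Options are comma separated; some carry a value after '='.
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            options[key] = value or True
    return CrypttabEntry(fields[0], fields[1], keyfile, options)


entry = parse_crypttab_line(
    "nvme0n1p2_crypt UUID=e5216e6f-f89c-49c4-8aca-33cde0d4648d "
    "/etc/cryptroot/keyfile.bin luks,discard,keyscript=/etc/cryptroot/"
)
print(entry.keyfile, sorted(entry.options))
```

The keyfile field is what ends up as $1 of the keyscript, which is why both have to agree on the path.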

Needing to create this keyscript is a bit weird and I'm not sure why a keyfile in the initrd is not supported directly, probably historical reasons, but the keyscript will be copied into the initramfs and is responsible for printing the contents of the keyfile to stdout. This allows you to do lots of crazy things if you so wish, but I simply use this:

#!/bin/busybox ash

# This script is called by initramfs using busybox ash.  The script is added
# to initramfs as a result of /etc/crypttab option "keyscript=/path/to/script"
# updating the initramfs image.
# This script prints the contents of a key file to stdout using cat.  The key
# file location is supplied as $1 from the third field in /etc/crypttab, or can
# be hardcoded in this script.
# If using a key embedded in initrd.img-*, a hook script in
# /etc/initramfs-tools/hooks/ is required by update-initramfs.  The hook script
# copies the keyfile into the initramfs {DESTDIR}.

KEY="${1}"

if [ -f "${KEY}" ]; then
 cat "${KEY}"
else
 # Fall back to asking for the passphrase if the keyfile is missing.
 PASS=$(/bin/plymouth ask-for-password --prompt="Key not found.  Enter LUKS Password: ")
 # No trailing newline: it would otherwise become part of the passphrase.
 echo -n "${PASS}"
fi

As the comment in the script describes, the keyscript itself will be included in the initramfs image automatically, but our keyfile is still not included. We set the keyfile location in /etc/crypttab to /etc/cryptroot/keyfile.bin and we will get this value as $1, but this script is executed in the initramfs and the real filesystem is not yet there. So lastly we need to provide a hook for update-initramfs which will copy the keyfile to the right location in the initramfs image. We could have chosen any hardcoded location in the script instead of using the one from /etc/crypttab, but in this case I decided to use the same location in the initramfs as on the real system.

So the last thing to do is create this hook script in /etc/initramfs-tools/hooks/


#!/bin/sh

# This hook script is called by update-initramfs. The script checks for the
# existence of the key file loading script and copies it
# to initramfs if it's missing.
# This script also copies the key file keyfile.bin to the /etc/cryptroot/
# directory of the initramfs. This file is accessed by the keyscript, as
# specified in /etc/crypttab.

PREREQ=""

prereqs() {
  echo "$PREREQ"
}

case "$1" in
  prereqs)
    prereqs
    exit 0
    ;;
esac

. "${CONFDIR}/initramfs.conf"
. /usr/share/initramfs-tools/hook-functions

if [ ! -f "${DESTDIR}/lib/cryptsetup/scripts/" ]; then
 if [ ! -d "${DESTDIR}/lib/cryptsetup/scripts" ]; then
  mkdir -p "${DESTDIR}/lib/cryptsetup/scripts"
 fi
 cp /etc/cryptroot/ "${DESTDIR}/lib/cryptsetup/scripts/"
fi
if [ ! -d "${DESTDIR}/etc/cryptroot/" ]; then
 mkdir -p "${DESTDIR}/etc/cryptroot/"
fi
cp /etc/cryptroot/keyfile.bin "${DESTDIR}/etc/cryptroot/"

And now you have everything in place to build a new initramfs image:

update-initramfs -u

You could now pry apart your new initramfs image to check everything is in place. Or you could simply reboot and see if it all works.
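If you do want to check before rebooting, the following Python sketch shows one way to compare a file listing of the image against the paths you expect. On Debian such a listing could come from running lsinitramfs against the image in /boot; I leave that call out since the image name depends on your kernel version. The helper names and the sample listing are made up for illustration and are not output from a real system.

```python
def normalise(path):
    """Strip a leading './' or '/' so both spellings compare equal."""
    path = path.strip()
    if path.startswith("./"):
        path = path[2:]
    return path.lstrip("/")


def missing_from_initramfs(listing, expected):
    """Return the expected paths that do not appear in the listing."""
    present = {normalise(line) for line in listing.splitlines() if line.strip()}
    return [path for path in expected if normalise(path) not in present]


# Made-up listing for illustration only.
sample_listing = """\
.
bin/busybox
etc/cryptroot/keyfile.bin
lib/cryptsetup/scripts
"""

missing = missing_from_initramfs(
    sample_listing,
    ["/etc/cryptroot/keyfile.bin", "/sbin/cryptsetup"],
)
print(missing)
```

An empty result means every expected path was found in the listing.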

A Container is an Erlang Process

Monday, August 15, 2016

This post is a response to A Container Is A Function Call by Glyph. It is a good article and worth your time, and you might want to read it first to follow along here. On Twitter I asserted the article recommends building a monolith, while Glyph countered "On the contrary, explicit interfaces are what makes loose coupling possible". Fair enough, but Twitter is a bit awkward for responding, so I'm attempting to write my thoughts down here.

In particular I want to respond to the suggestion that the infrastructure, whether that is Docker Compose or, as I would recommend, Kubernetes or even something else, should refuse to run a container unless all its dependencies are available:

An image thusly built would refuse to run unless:

  • Somewhere else on its network, there was an etcd host/port known to it, its host and port supplied via environment variables.
  • Somewhere else on its network, there was a postgres host, listening on port 5432, with a name-resolution entry of “pgwritemaster.internal”.
  • An environment variable for the etcd configuration was supplied
  • A writable volume for /logs was supplied, owned by user-ID 4321 where it could write common log format logs.

The suggestion here is that the service, err container, would just crash if any of these were not available. However, when you're building your service it should expect network failure as well as failure of other services; that is the nature of distributed systems. Dependencies might not always be there and your service should do the most sensible thing in that case. In fact systems like Kubernetes have a nice service concept, which is a fixed (DNS) endpoint available in the cluster that gets dynamically routed to any running container which happens to have the correct tags associated with it. This emphasises that whatever provides this service might come and go, while often even multiple containers can provide it.

I compare a container with an Erlang process because I think this is how they should behave. They should be managed by a process supervisor, Kubernetes or whichever is your poison, and they should communicate using an asynchronous communication protocol based on message passing and not (remote) function calls. If they don't do this you're building a tightly coupled system which is like a monolith but with added network failures between your function calls.
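To illustrate the analogy, here is a toy Python sketch of a supervised worker that communicates through a message queue. It is neither Erlang nor Kubernetes, just asyncio from the standard library, and the names worker, supervise and the "poison" message are made up for this example.

```python
import asyncio

STOP = None  # sentinel telling the worker to shut down cleanly


async def worker(inbox, results):
    """Handle messages one at a time; a bad message crashes the worker."""
    while True:
        msg = await inbox.get()
        if msg is STOP:
            return
        if msg == "poison":
            raise RuntimeError("worker crashed on a bad message")
        results.append(msg.upper())


async def supervise(inbox, results, max_restarts=3):
    """Restart the worker when it crashes; return the number of restarts."""
    restarts = 0
    while True:
        try:
            await worker(inbox, results)
            return restarts
        except RuntimeError:
            if restarts == max_restarts:
                raise
            restarts += 1


async def demo():
    inbox = asyncio.Queue()
    results = []
    for msg in ["a", "poison", "b", STOP]:
        inbox.put_nowait(msg)
    restarts = await supervise(inbox, results)
    return results, restarts


results, restarts = asyncio.run(demo())
print(results, restarts)
```

The sender never calls the worker directly, so a crash only costs the message being processed at that moment; the supervisor brings the worker back and the rest of the queue is still handled.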

Obviously in the real world you're stuck with things like the Postgres protocol and this is ok. Sometimes your own service is also going to need a protocol which will need to explicitly respond. But the key thing is that as a user of such a service you expect failure: you expect it not to be there and do the best you can for your own users, even if that is just returning an error code. If you do this your process supervisor, err container/cluster infrastructure, can happily normalise the state of your services again by bringing up the missing service without a huge cascade of failures grinding your entire cluster to a halt. This is the opposite of the infrastructure refusing to run your container because a service which it uses is missing.
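As a rough sketch of what "expecting failure" can look like in code, here is a request handler in Python that returns an error code when its dependency is unreachable instead of crashing. The handler, the two fake backends and the 503 status are assumptions made for this example, not taken from any particular framework.

```python
def handle_request(fetch_profile, user_id):
    """Serve a request whose dependency may be missing or unreachable."""
    try:
        profile = fetch_profile(user_id)
    except (ConnectionError, TimeoutError) as exc:
        # The dependency is down: degrade instead of crashing the service.
        return 503, {"error": "profile service unavailable: %s" % exc}
    return 200, profile


def healthy_backend(user_id):
    """Stand-in for a dependency that is up."""
    return {"id": user_id, "name": "example"}


def broken_backend(user_id):
    """Stand-in for a dependency that is currently unreachable."""
    raise ConnectionError("connection refused")


ok = handle_request(healthy_backend, 7)
degraded = handle_request(broken_backend, 7)
print(ok[0], degraded[0])
```

The service stays up in both cases, so once the infrastructure has restarted the missing dependency things return to normal without anyone having to restart this service too.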

Shameless plug: I also spoke about this at EuroPython.

py.test sprint in Freiburg

Saturday, February 20, 2016

Testing is a really important part of Python development and picking the testing tool of choice is no light decision. Quite a few years ago I eventually decided py.test would be the best tool for this, a choice I have never regretted and which has rather been reinforced ever since. Py.test has been the testing tool that seamlessly scaled from small unit tests to large integration tests. Furthermore it has seen steady and continuous development over all these years: the py.test I first used lacked a lot of the features we now consider essential, as AST re-writing, fixtures and even the plugin system did not exist yet. To have seen all this work by so many people put into the tool has been great. And at some point I myself moved from user to contributing plugins and eventually doing various bits of work on the core as well.

Personally the greatest part of this all has been seeing the project grow from (mostly) a single maintainer to the team that maintains py.test now while at the same time the adoption among users has steadily kept growing as well. Py.test is now in a position where any of about half a dozen people can make a release and many plugin maintainers have now also joined the pytest-dev team. Since the team has grown in the last few years some of us have managed to meet up at various conferences. Yet, due to the range of continents we never all managed to meet. This is how the idea of a dedicated sprint for py.test first came about, would it not be great if we all managed to meet and spend some dedicated time to work on py.test together?

With this objective we have now organised a week-long sprint and created a fundraiser campaign to help us make it affordable for even those of us coming from far-flung continents (depending on your point of view!). It would be great to get your or your company's support if you think py.test is a worthwhile tool for you. The sprint is open to anyone, so if you or your company think it would be interesting for you to learn a lot about py.test while helping out or maybe working on your pet feature or bug, please come along! Just drop us a note on the mailing list and we'll accommodate you.

There is a variety of topics people are looking at working on, all together hopefully culminating in a py.test 3.0 release (which will be backwards compatible!). Personally I would like to work on a feature to elegantly fail tests from within finalisers. The problem here is that raising an exception in a finaliser is actually treated as an error, yet fixtures quite commonly do this anyway. My current plan is to add a new request.addverifier() method which would be allowed to fail the test, though exact details may change. Another subject I might be interested in is adding multiple-environment support to tox, so that you may be able to test packages in e.g. a Conda environment, though this is certainly not a simple feature.

So if you use py.test and would like to support us, it would be great if you contributed or maybe convinced your employer to contribute. And if you're keen enough to join us for the sprint that would be great too. I look forward to meeting everyone in June!

Pylint and dynamically populated packages

Thursday, December 04, 2014

Python links the module namespace directly to the layout of the source locations on the filesystem. And this is mostly fine, certainly for applications. For libraries sometimes one might want to control the toplevel namespace or API more tightly. This also is mostly fine as one can just use private modules inside a package and import the relevant objects into the file, optionally even setting __all__. As I said, this is mostly fine, if sometimes a bit ugly.

However, sometimes you have a library which may be loading a particular backend or platform support at runtime. An example of this is the Python zmq package. The apipkg module is also a very nice way of controlling your toplevel namespace more flexibly. The problem is that once you start using one of these things, Pylint no longer knows which objects your package provides in its namespace and will issue warnings about using non-existing things.
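To see why a static tool struggles here, consider this small Python sketch. It compares the names a purely syntactic scan finds in a module with the names that exist once the module has actually run. The module source and the package name dynpkg are invented for this example, and the scan is a big simplification of what Pylint really does.

```python
import ast
import types

SOURCE = '''
import math

def _load_backend(namespace):
    # Copy a few names in at import time, the way a backend loader might.
    for name in ("sqrt", "floor"):
        namespace[name] = getattr(math, name)

_load_backend(globals())
'''

# What a static scan sees: only names bound syntactically at module level.
static_names = set()
for node in ast.parse(SOURCE).body:
    if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
        static_names.add(node.name)
    elif isinstance(node, ast.Import):
        static_names.update(alias.asname or alias.name for alias in node.names)

# What actually exists after the module has run.
module = types.ModuleType("dynpkg")  # hypothetical package name
exec(SOURCE, module.__dict__)
runtime_names = {name for name in vars(module) if not name.startswith("__")}

invisible = sorted(runtime_names - static_names)
print(invisible)
```

The names in invisible are exactly the kind of attributes a linter would flag as non-existent, even though they work fine at runtime.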

Turns out it is not too hard to write a plugin for Pylint which takes care of this. One just has to build the right AST nodes in place where they would be appearing at runtime. Luckily the tools to do this easily are provided:

import importlib
import types

import astroid


def transform(mod):
    if mod.name == 'zmq':
        # Import the real package and mirror its runtime namespace.
        module = importlib.import_module(mod.name)
        for name, obj in vars(module).copy().items():
            if (name in mod.locals or
                    not hasattr(obj, '__module__') or
                    not hasattr(obj, '__name__')):
                continue
            if isinstance(obj, types.ModuleType):
                ast_node = [astroid.MANAGER.ast_from_module(obj)]
            else:
                if hasattr(astroid.MANAGER, 'extension_package_whitelist'):
                    astroid.MANAGER.extension_package_whitelist.add(
                        obj.__module__)
                real_mod = astroid.MANAGER.ast_from_module_name(obj.__module__)
                ast_node = real_mod.getattr(obj.__name__)
                for node in ast_node:
                    fix_linenos(node)
            mod.locals[name] = ast_node

As you can see the hard work of knowing what AST nodes to generate is all done in the astroid.MANAGER.ast_from_module() and astroid.MANAGER.ast_from_module_name() calls. All that is left to do is add these new AST nodes to the module's globals/locals (they are the same thing for a module).

You may also notice the fix_linenos() call. This is a small helper needed when running on Python 3 and importing C modules (like for zmq). The reason is that Pylint tries to sort by line numbers, but for C code they are None. In Python 2, None and an integer can be happily compared, but in Python 3 that is no longer the case. So this small helper simply sets all unknown line numbers to 0:

def fix_linenos(node):
    # Recursively replace unknown (None) line numbers with 0.
    if node.fromlineno is None:
        node.fromlineno = 0
    for child in node.get_children():
        fix_linenos(child)
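The underlying Python 3 behaviour is easy to reproduce without Pylint. This little sketch uses a made-up list of line numbers to show the sort failing and then succeeding once None is replaced by 0:

```python
linenos = [12, None, 3]  # made-up line numbers; None stands for "unknown"

try:
    sorted(linenos)
    comparable = True
except TypeError:
    # Python 3 refuses to order None against an int.
    comparable = False

# Same normalisation as the helper above: unknown line numbers become 0.
fixed = sorted(0 if lineno is None else lineno for lineno in linenos)
print(comparable, fixed)
```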

Lastly when writing this into a plugin for Pylint you'll want to register the transformation you just wrote:

def register(linter):
    astroid.MANAGER.register_transform(astroid.Module, transform)

And that's all that's needed to make Pylint work fine with dynamically populated package namespaces. I've tried this on zmq as well as on a package using apipkg and it seems to work fine on both Python 2 and Python 3. Writing Pylint plugins seems not too hard!

New pytest-timeout release

Thursday, August 07, 2014

At long last I have updated my pytest-timeout plugin. pytest-timeout is a plugin to py.test which will interrupt tests which are taking longer than a set time and dump the stack traces of all threads. This was initially developed in order to debug some tests which would occasionally hang on a CI server, and it can be used in a variety of similar situations where getting some output is more useful than getting a clean test run.
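To give a feel for the "dump the stack traces of all threads" part, here is a sketch using only faulthandler from the Python standard library. This is not how pytest-timeout is implemented and, unlike the plugin, it does not interrupt anything; the helper name run_with_stack_dump is made up for this example.

```python
import faulthandler
import tempfile
import time


def run_with_stack_dump(func, timeout, log):
    """Run func(); if it is still running after `timeout` seconds, dump
    the stack traces of all threads to `log` (needs a real file descriptor)."""
    faulthandler.dump_traceback_later(timeout, exit=False, file=log)
    try:
        return func()
    finally:
        # Disarm the watchdog once the call has finished.
        faulthandler.cancel_dump_traceback_later()


with tempfile.TemporaryFile(mode="w+") as log:
    # The "hanging test" here is just a sleep that outlasts the timeout.
    run_with_stack_dump(lambda: time.sleep(0.5), 0.1, log)
    log.seek(0)
    dump = log.read()

print(dump)
```

Even this crude version tells you where every thread was stuck, which is usually the first thing you want to know about a hang on a CI server.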

The main new feature of this release is that the plugin now finally works nicely with the --pdb option from py.test. When using this option the timeout plugin will now no longer interrupt the interactive pdb session after the given timeout.

Secondly this release fixes an important bug which meant that a timeout in the finaliser of a fixture at the end of the session would not be caught by the plugin. This was mainly because pytest-timeout had not been updated since py.test changed the way fixtures were cached on their scope with the introduction of @pytest.fixture(scope='...'), even though this was a long time ago.

So if you use py.test and a CI server I suggest now is as good a time as any to configure it to use pytest-timeout, using a fairly large timeout of say 300 seconds, then forget about it forever. Until maybe one day it will suddenly save you a lot of head scratching and time.
