I've recently set up another laptop with whole-disk encryption,
including /boot. This is all fine: the debian-installer almost gets it
right, you just need to edit /etc/default/grub to add
GRUB_ENABLE_CRYPTODISK=y and then re-run grub-install on the target.
But once you get this working you end up having to type in the
password for the LUKS disk encryption twice at boot: once for GRUB and
once for the booted system. You can solve this by adding a keyfile to
the encrypted disk so it can be unlocked either with this keyfile or
with the original password. The idea is then to put this keyfile into
the initrd image, where the bootup sequence uses it instead of asking
you to type the passphrase again. The initrd itself is decrypted by
GRUB together with the kernel etc. when you type in the password. This
sounds simple enough, but on Debian it still involves writing a couple
of scripts by hand to get it all working.
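For reference, the GRUB part mentioned above boils down to something
like this (a sketch only; the exact grub-install invocation depends on
your firmware and target disk). In /etc/default/grub:
GRUB_ENABLE_CRYPTODISK=y
and then, as root on the target system:
# grub-install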
I'm going to skip through the initial setup of getting GRUB to work,
instead just covering the keyfile part. But I'll quickly describe my
setup so you know the starting point. I have a GPT partition
table with 2 partitions: one FAT partition for EFI and a second
partition used as the LUKS encrypted block device. This LUKS device
is then used as a Physical Volume (PV) for LVM, with two logical
volumes: one for swap and the other for BTRFS, on which my root
filesystem lives directly at the top of the main BTRFS volume.
So once you have this booting with a password you want to create a
keyfile which will also be able to unlock the LUKS volume. LUKS will
only use the required number of bits out of a keyfile, so you can just
make a large one, or you can check how many bits you need:
# cryptsetup luksDump /dev/nvme0n1p2
...
MK bits: 512
So we need a minimum of 512 bits in our keyfile, or 64 bytes. I store
the keyfile in /etc/cryptroot/ but you can store it anywhere you
like, just make sure only root can read it. Finally you want to add
the keyfile as a way to decrypt the LUKS volume:
# dd if=/dev/urandom of=/etc/cryptroot/keyfile.bin bs=1 count=64
# chmod go-rwx /etc/cryptroot/keyfile.bin
# cryptsetup luksAddKey /dev/nvme0n1p2 /etc/cryptroot/keyfile.bin
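If you want to be sure the new key slot works before relying on it,
cryptsetup can test it without actually mapping anything (an optional
sanity check; the command should exit successfully if the keyfile
unlocks a slot):
# cryptsetup luksOpen --test-passphrase --key-file /etc/cryptroot/keyfile.bin /dev/nvme0n1p2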
This was the easy part; now you want to add this keyfile to the
initramfs and make sure that the initramfs will use it to decrypt the
disk early at boot, rather than prompting you for the password.
Start by adding the keyfile to the Debian-specific /etc/crypttab;
the third field there is the path of the keyfile, so it should look
similar to this:
# name device keyfile options
nvme0n1p2_crypt UUID=e5216e6f-f89c-49c4-8aca-33cde0d4648d /etc/cryptroot/keyfile.bin luks,discard,keyscript=/etc/cryptroot/getinitramfskey.sh
The fourth field in this file contains the options: you should have
luks as one option and I also use discard since I use a solid
state disk and want the TRIM command to be issued. With only these
options Debian would be able to decrypt your disk on startup, *if* it
could read the keyfile from the filesystem. But since the partition
holding the keyfile is itself encrypted, we need to add the keyscript
option so the keyfile can be used from the initrd. Again the keyscript
can live anywhere, but I also keep it in /etc/cryptroot.
Needing to create this keyscript is a bit weird and I'm not sure why a
keyfile in the initrd is not supported directly, probably for
historical reasons, but the keyscript will be copied into the
initramfs and is responsible for printing the contents of the keyfile
to stdout. This allows you to do lots of crazy things if you so wish,
but I simply use this:
#!/bin/busybox ash
# This script is called by the initramfs using busybox ash. The script is
# added to the initramfs as a result of the /etc/crypttab option
# "keyscript=/path/to/script" when the initramfs image is updated.
# This script prints the contents of a key file to stdout using cat. The key
# file location is supplied as $1 from the third field in /etc/crypttab, or can
# be hardcoded in this script.
# If using a key embedded in initrd.img-*, a hook script in
# /etc/initramfs-tools/hooks/ is required by update-initramfs. The hook script
# copies the keyfile into the initramfs {DESTDIR}.
KEY="${1}"
if [ -f "${KEY}" ]; then
    cat "${KEY}"
else
    PASS=$(/bin/plymouth ask-for-password --prompt="Key not found. Enter LUKS Password: ")
    echo "${PASS}"
fi
As the comment in the script describes, the keyscript itself will be
included in the initramfs image automatically, but our keyfile is
still not included. We set the keyfile location in /etc/crypttab
to /etc/cryptroot/keyfile.bin and we will get this value as $1,
but this script is executed in the initramfs where the real filesystem
is not yet available. So lastly we need to provide a hook for
update-initramfs which will copy the keyfile to the right location
in the initramfs image. We could have chosen any hardcoded location
in the script instead of using the one from /etc/crypttab, but in
this case I decided to use the same location in the initramfs as on
the real system.
So the last thing to do is create this hook script in
/etc/initramfs-tools/hooks/loadinitramfskey.sh:
#!/bin/sh
# This hook script is called by update-initramfs. It copies the keyscript
# getinitramfskey.sh into the initramfs if it is not already there.
# It also copies the key file keyfile.bin to /etc/cryptroot/ inside the
# initramfs, which is the location getinitramfskey.sh is given via
# /etc/crypttab.
PREREQ=""
prereqs() {
    echo "$PREREQ"
}
case "$1" in
    prereqs)
        prereqs
        exit 0
        ;;
esac
. "${CONFDIR}/initramfs.conf"
. /usr/share/initramfs-tools/hook-functions
if [ ! -f "${DESTDIR}/lib/cryptsetup/scripts/getinitramfskey.sh" ]; then
    if [ ! -d "${DESTDIR}/lib/cryptsetup/scripts" ]; then
        mkdir -p "${DESTDIR}/lib/cryptsetup/scripts"
    fi
    cp /etc/cryptroot/getinitramfskey.sh "${DESTDIR}/lib/cryptsetup/scripts/"
fi
if [ ! -d "${DESTDIR}/etc/cryptroot/" ]; then
    mkdir -p "${DESTDIR}/etc/cryptroot/"
fi
cp /etc/cryptroot/keyfile.bin "${DESTDIR}/etc/cryptroot/"
And now you have everything in place to build a new initramfs image:
update-initramfs -u
You could now pry apart your new initramfs image to check everything
is in place. Or you could simply reboot and see if it all works.
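A quick way to do the former, without unpacking anything by hand, is
the lsinitramfs tool that ships with initramfs-tools (the image name
here is an example, use the one for your kernel):
# lsinitramfs /boot/initrd.img-$(uname -r) | grep -e cryptroot -e getinitramfskey
This should list both the keyfile and the keyscript.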
This post is a response to A Container Is A Function Call by Glyph.
It is a good article and worth your time, and you might want to read
it first to follow along here.
On Twitter I asserted that the article recommends building a monolith,
while Glyph countered "On the contrary, explicit interfaces are what
makes loose coupling possible". Fair enough, but Twitter is a bit
awkward for a proper response, so I'm writing my thoughts down here.
In particular there is the suggestion that the infrastructure, whether
that is Docker Compose or, as I would recommend, Kubernetes or even
something else, should refuse to run a container unless all its
dependencies are available:
An image thusly built would refuse to run unless:
- Somewhere else on its network, there was an etcd host/port known
  to it, its host and port supplied via environment variables.
- Somewhere else on its network, there was a postgres host,
  listening on port 5432, with a name-resolution entry of
  “pgwritemaster.internal”.
- An environment variable for the etcd configuration was supplied.
- A writable volume for /logs was supplied, owned by user-ID 4321
  where it could write common log format logs.
The suggestion here is that the service, err container, would just
crash if any of these were not available. However when you're
building your service it should expect network failures as well as
failures of other services; that is the nature of distributed systems.
Dependencies might not always be there and your service should do the
most sensible thing in that case. In fact systems like Kubernetes
have a nice service concept: a fixed (DNS) endpoint
available in the cluster which gets dynamically routed to any
running container which happens to have the correct labels associated
with it. This emphasises that whatever provides this service might
come and go, and often even multiple containers can provide it at once.
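As an illustration, this is roughly what such a service looks like in
a Kubernetes manifest (a minimal sketch; the names and labels are made
up): clients always talk to the Service's stable DNS name while the
containers behind it can appear and disappear.
apiVersion: v1
kind: Service
metadata:
  name: pgwritemaster        # becomes a stable DNS name inside the cluster
spec:
  selector:
    app: postgres            # traffic goes to whichever pods carry these labels
    role: master
  ports:
    - port: 5432             # port clients connect to
      targetPort: 5432       # port on the selected pods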
I compare a container with an Erlang process because I think this
is how they should behave. They should be managed by a process
supervisor, Kubernetes or whichever is your poison, and they should
communicate using an asynchronous communication protocol based on
message passing and not (remote) function calls. If they don't do
this you're building a tightly coupled system which is like a
monolith but with added network failures between your function
calls.
Obviously in the real world you're stuck with things like the
Postgres protocol and this is ok. Sometimes your own service is also
going to need a protocol which has to respond explicitly. But
the key thing is that as a user of such a service you expect failure:
you expect it not to be there and you do the best you can for your own
users, even if that is just returning an error code. If you do this
your process supervisor, err container/cluster infrastructure,
can happily normalise the state of your services again by bringing up
the missing service, without a huge cascade of failures grinding your
entire cluster to a halt. This is the opposite of the infrastructure
refusing to run your container because a service which it uses is
missing.
Shameless plug: I
also spoke
about this at EuroPython.
Testing is a really important part of Python development and picking
the testing tool of choice is no light decision. Quite a few years
ago I eventually decided py.test would be the best tool for this, a
choice I have never regretted and which has rather been reinforced
ever since.
Py.test has been the testing tool that seamlessly scaled from small
unit tests to large integration tests. Furthermore it has seen
steady and continuous development over all these years; the py.test I
first used lacked a lot of the features we now consider
essential: AST re-writing, fixtures and even the plugin system did not
exist yet. To have seen all this work by so many people put into the
tool has been great. And at some point I myself moved from user to
contributing plugins and eventually doing various bits of work on the
core as well.
Personally the greatest part of all this has been seeing the project
grow from (mostly) a single maintainer to the team that maintains
py.test now, while at the same time adoption among users has
steadily kept growing as well. Py.test is now in a position where any
of about half a dozen people can make a release and many plugin
maintainers have also joined the pytest-dev team. Since the team
has grown in the last few years some of us have managed to meet up at
various conferences. Yet, spread across several continents, we never
all managed to meet. This is how the idea of a dedicated sprint for
py.test first came about: would it not be great if we all managed to
meet and spend some dedicated time working on py.test together?
With this objective we have now organised a week-long sprint and
created a fundraiser campaign to help us make it affordable even for
those of us coming from far-flung continents (depending on your point
of view!). It would be great to get your or your company's support if
you think py.test is a worthwhile tool for you. The sprint is open to
anyone, so if you or your company think it would be interesting to
learn a lot about py.test while helping out, or maybe working on your
pet feature or bug, please come along! Just drop us a note on the
mailing list and we'll accommodate you.
There is a variety of topics people are looking at working on, all
together hopefully culminating in a py.test 3.0 release (which will be
backwards compatible!). Personally I would like to work on a feature
to elegantly fail tests from within finalisers. The problem here is
that raising an exception in a finaliser is actually treated as an
error, yet it is fairly common for fixtures to want to do exactly
that. My current plan is to add a new
request.addverifier() method which would be allowed to
fail the test, though the exact details may change. Another subject I
might be interested in is adding multiple-environment support to tox,
so that you may be able to test packages in e.g. a Conda environment,
though this is certainly not a simple feature.
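To illustrate the finaliser problem, here is a small sketch (the
resource and its check are made up) of the kind of verification a
fixture might want to do today; an assertion failing in the finaliser
is currently reported as an error rather than a test failure, which is
what something like request.addverifier() would improve:
import pytest

@pytest.fixture
def tracked_resource(request):
    resource = {"leaked": False}   # stand-in for some real resource

    def verify():
        # Raising here today makes py.test report an *error* for the
        # test; a verifier would report an ordinary failure instead.
        assert not resource["leaked"], "resource was leaked"

    request.addfinalizer(verify)
    return resource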
So if you use py.test and would like to support us it would be
great if you contributed, or maybe convinced your employer to
contribute. And if you're keen enough to join us for the sprint that
would be great too. I look forward to meeting everyone in June!
Python links the module namespace directly to the layout of the
source locations on the filesystem. And this is mostly fine,
certainly for applications. For libraries sometimes one might want
to control the toplevel namespace or API more tightly. This also is
mostly fine as one can just use private modules inside a package and
import the relevant objects into the __init__.py file,
optionally even setting __all__. As I said, this is
mostly fine, if sometimes a bit ugly.
However sometimes you have a library which may be loading a
particular backend or platform support at runtime. An example of
this is the Python zmq package. The apipkg module is also a very nice
way of controlling your toplevel namespace more flexibly. The problem
is that once you start using one of these things Pylint no longer
knows which objects your package provides in its namespace and will
issue warnings about using non-existing things.
It turns out it is not too hard to write a plugin for Pylint which
takes care of this. One just has to build the right AST nodes in the
places where they would appear at runtime. Luckily the tools
to do this easily are provided:
import importlib
import types

import astroid


def transform(mod):
    if mod.name == 'zmq':
        module = importlib.import_module(mod.name)
        for name, obj in vars(module).copy().items():
            if (name in mod.locals or
                    not hasattr(obj, '__module__') or
                    not hasattr(obj, '__name__')):
                continue
            if isinstance(obj, types.ModuleType):
                ast_node = [astroid.MANAGER.ast_from_module(obj)]
            else:
                if hasattr(astroid.MANAGER, 'extension_package_whitelist'):
                    astroid.MANAGER.extension_package_whitelist.add(
                        obj.__module__)
                real_mod = astroid.MANAGER.ast_from_module_name(obj.__module__)
                ast_node = real_mod.getattr(obj.__name__)
            for node in ast_node:
                fix_linenos(node)
            mod.locals[name] = ast_node
As you can see the hard work of knowing what AST nodes to generate
is all done in the astroid.MANAGER.ast_from_module()
and astroid.MANAGER.ast_from_module_name() calls. All that
is left to do is add these new AST nodes to the module's
globals/locals (they are the same thing for a module).
You may also notice the fix_linenos() call. This is a
small helper needed when running on Python 3 and importing C modules
(like for zmq). The reason is that Pylint tries to sort nodes by
line number, but for C code the line numbers are None; in Python 2
None and an integer can be happily compared, but in Python 3 that is
no longer the case. So this small helper simply sets all unknown line
numbers to 0:
def fix_linenos(node):
    if node.fromlineno is None:
        node.fromlineno = 0
    for child in node.get_children():
        fix_linenos(child)
Lastly when writing this into a plugin for Pylint you'll want to
register the transformation you just wrote:
def register(linter):
    astroid.MANAGER.register_transform(astroid.Module, transform)
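To actually use this, Pylint needs to load the plugin. Assuming the
code above is saved as a module named, say, pylint_zmq (the name is
just an example) somewhere importable, it can be enabled with Pylint's
--load-plugins option, either on the command line or via the
load-plugins setting in your pylintrc:
pylint --load-plugins pylint_zmq yourpackage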
And that's all that's needed to make Pylint work fine with
dynamically populated package namespaces. I've tried this
on zmq as well as on a package using apipkg and
it seems to work fine on both Python 2 and Python 3. Writing
Pylint plugins seems not too hard!
At long last I have updated my pytest-timeout plugin.
pytest-timeout is a plugin for py.test which will interrupt tests
which are taking longer than a set time and dump the stack traces of
all threads. It was initially developed in order to debug some
tests which would occasionally hang on a CI server, and it can be
used in a variety of similar situations where getting some output is
more useful than getting a clean test run.
The main new feature of this release is that the plugin now finally
works nicely with the --pdb option from py.test. When using this
option the timeout plugin will now no longer interrupt the interactive
pdb session after the given timeout.
Secondly this release fixes an important bug which meant that a
timeout in the finaliser of a fixture at the end of the session would
not be caught by the plugin. This was mainly because pytest-timeout
had not been updated since py.test changed the way fixtures are cached
on their scope, with the introduction of @pytest.fixture(scope='...'),
even though that change was a long time ago.
So if you use py.test and a CI server I suggest now is as good a
time as any to configure it to use pytest-timeout, using a fairly
large timeout of say 300 seconds, and then forget about it forever.
Until maybe one day it suddenly saves you a lot of head scratching and
time.
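For example (a minimal sketch, adjust the value to your suite), after
installing the plugin you can set a global timeout once in your
py.test configuration file and never think about it again:
# pytest.ini (or the [pytest] section of setup.cfg)
[pytest]
timeout = 300
The same can be done for a single run with the --timeout=300 command
line option.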