Developing Custom Cloud-Init Modules

By Samuel Matzek posted Mon April 27, 2015 12:00 AM


By: Taylor Peoples

This article explains why you might want to develop your own custom cloud-init module and provides tips on how to develop custom modules.

Paths in This Document

The paths in this document are the normal paths cloud-init uses on Linux.  AIX uses slightly different paths.  The Linux to AIX path mapping is as follows:

Linux                                         AIX
/var/lib/cloud                                /opt/freeware/var/lib/cloud
/etc/cloud                                    /opt/freeware/etc/cloud
/usr/lib/python2.7/site-packages/cloudinit    /opt/freeware/lib/python2.7/site-packages/cloudinit
/usr/bin/cloud-init                           /opt/freeware/bin/cloud-init

Motivation for Developing Custom Modules

There are a number of reasons you might want to develop custom cloud-init modules.

  • You find yourself passing an existing script as activation input on every deploy and changing parameters in the script for each deploy.

  • You require people who use your images to pass a script as activation input on every deploy and require them to change parameters in the script.

  • You want to develop new scripts that fall into the same categories as above.

  • You want to allow YAML style cloud-config user input to facilitate passing parameters to activation tasks on deploy.

In essence, if you want to be able to (or have your users be able to) deploy an image and have it do a series of tasks on deploy without passing those tasks as scripts to activation input, then you would likely benefit from developing your own cloud-init module.

If these scripts never change from deploy to deploy and require no user input, you do not need a custom module: put a simple wrapper script in the /var/lib/cloud/scripts/per-instance directory and cloud-init will run it on the first boot of every deploy.

Note that you should review the existing base cloud-init modules to determine if a module already exists that can fulfill your use case (http://cloudinit.readthedocs.org/en/latest/topics/modules.html).

Module Structure

There are two modules that ship with cloud-init that can be used to get an idea of what a custom module needs to contain: cc_foo and cc_final_message. These modules can be found at http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_foo.py and http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_final_message.py respectively. cc_foo.py has a large comment that has good documentation on what the structure of a module needs to be.

All cloud-init modules must implement the handle() method in order for the base cloud-init code to properly use the module. The argument list of handle() should not change. The particularly useful arguments are “_cfg” and “log”. The _cfg argument provides the method with any cloud-config data that a user passes in on deploy as parameters. The log argument provides the method with a logger that can be used to log information.

All cloud-init modules should also define the frequency at which they run. The available frequencies are:

  • PER_INSTANCE: The module will be run on the first boot of each newly deployed VM, but not on later boots of that VM.

  • PER_ALWAYS: The module will be run on every boot of the VM, including the first deploy.

  • PER_ONCE: The module will be run only a single time (i.e., the first time the image is deployed, but not on any other deploys).
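
As a rough illustration of the pieces described above, a minimal module might look like the following sketch. The file name cc_my_module.py and the my_message key are hypothetical. The try/except fallback exists only so the sketch can run on a machine without cloud-init installed, and the fallback value is an assumption about what cloudinit.settings defines.

```python
# Hypothetical file: cloudinit/config/cc_my_module.py
import logging

try:
    from cloudinit.settings import PER_INSTANCE
except ImportError:
    # Stand-in so the sketch runs without cloud-init installed (assumed value).
    PER_INSTANCE = "once-per-instance"

# Module-level attribute that tells cloud-init how often to run this module.
frequency = PER_INSTANCE


def handle(name, cfg, cloud, log, args):
    """Entry point called by cloud-init; keep this argument list unchanged."""
    # cfg is the cloud-config dictionary; "my_message" is a made-up key.
    message = cfg.get("my_message", "no message supplied")
    log.info("%s: %s", name, message)
    # Returned only to make the sketch easy to check; cloud-init ignores it.
    return message


if __name__ == "__main__":
    # Local smoke test only; cloud-init itself supplies these arguments.
    logging.basicConfig(level=logging.INFO)
    handle("my_module", {"my_message": "hello"}, None,
           logging.getLogger("demo"), [])
```

Running the file directly only exercises handle() with made-up arguments; it does not involve cloud-init's own module loader.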

The cc_final_message.py module shows how cloud-config data can be obtained using the util.get_cfg_option methods. There are methods to help get boolean values, string values, and list values. The cc_final_message module specifically uses the get_cfg_option_str method to get a string that the user specifies during the deploy as part of the cloud-config data. A user who wants to supply a different final message can set the final_message key in the cloud-config data, for example: final_message: "Deployment complete"
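
For illustration, the lookup can be sketched as follows. The final_message key is the one cc_final_message reads. The fallback function is only a simplified stand-in for util.get_cfg_option_str so that the sketch runs without cloud-init installed, and the dictionary stands in for already-parsed cloud-config data.

```python
try:
    from cloudinit.util import get_cfg_option_str
except ImportError:
    def get_cfg_option_str(yobj, key, default=None):
        """Simplified stand-in: return yobj[key] as a string, else the default."""
        if key not in yobj:
            return default
        value = yobj[key]
        return value if isinstance(value, str) else str(value)


# Roughly what handle() would receive if the user supplied this cloud-config:
#   #cloud-config
#   final_message: "Deploy finished"
cfg = {"final_message": "Deploy finished"}

# Key present: the user's value is returned.
msg = get_cfg_option_str(cfg, "final_message", "default message")
# Key absent: the default is returned instead.
missing = get_cfg_option_str({}, "final_message", "default message")

print(msg)
print(missing)
```

Passing a default as the third argument keeps the module working when the user supplies no cloud-config at all.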

See Activation Input in PowerVC for more information about how to specify activation input.

Modules should be placed in the config/ directory within cloud-init's Python site-packages directory (e.g., /usr/lib/python2.7/site-packages/cloudinit/config/ for some operating systems), and the module's file name must be prefixed with “cc_”.

Since these modules are running Python, you can essentially write a module to do anything that any other Python script can do. One handy utility method that cloudinit provides in its util package is “subp”. This method is a wrapper around the Python subprocess library, and it allows you to run commands on the command line. This can be particularly useful if there is no Python library for what you want your module to do.
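
A hedged sketch of calling subp from a module follows. It assumes subp is importable from cloudinit.util, as this article describes, and that it returns the command's standard output and standard error. The fallback is a simplified stand-in built on the subprocess library so the example runs where cloud-init is not installed; the command itself is just a harmless placeholder.

```python
import subprocess
import sys

try:
    from cloudinit.util import subp
except ImportError:
    def subp(args):
        """Simplified stand-in: run a command and return (stdout, stderr)."""
        proc = subprocess.run(args, capture_output=True, text=True, check=True)
        return proc.stdout, proc.stderr


# Placeholder command; a real module might invoke a system tool here instead.
out, err = subp([sys.executable, "-c", "print('hello from subp')"])
print(out.strip())
```

Passing the command as a list of arguments avoids shell quoting problems when parameters come from user-supplied cloud-config.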

In order for the module to run, it needs to be specified in the cloud.cfg file (e.g. /etc/cloud/cloud.cfg for some operating systems) in one of the following sections:

  • cloud_init_modules: Modules that run early on in the boot sequence

  • cloud_config_modules: Modules that run after the boot sequence

  • cloud_final_modules: Modules that run after the config modules are run
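
To make the cloud.cfg layout concrete, the sketch below embeds a small sample configuration and checks which modules are listed in a given section. The sample content and the my_module entry are hypothetical (modules are listed without the “cc_” prefix), and the line-based scan is a deliberately crude assumption about simple list formatting, not a YAML parser.

```python
# Hypothetical excerpt of /etc/cloud/cloud.cfg with a custom module added.
SAMPLE_CLOUD_CFG = """\
cloud_init_modules:
 - bootcmd
 - write-files

cloud_config_modules:
 - runcmd
 - my_module

cloud_final_modules:
 - final-message
"""


def modules_in_section(cfg_text, section):
    """Return the names listed under `section:` in simple cloud.cfg-style text."""
    names = []
    in_section = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.startswith("- "):
            if in_section:
                names.append(stripped[2:].strip())
        else:
            # Any other line is treated as the start of a new top-level key.
            in_section = (stripped == section + ":")
    return names


print(modules_in_section(SAMPLE_CLOUD_CFG, "cloud_config_modules"))
```

A check like this can be handy before capturing an image, to confirm the custom module is actually enabled in the intended section.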

Debug Logging

Using the log object to debug is extremely helpful, allowing you to do printf() debugging. Depending on the operating system and the version of cloud-init, the cloud-init output can be in a number of places, including /var/log/cloud-init.log, /var/log/cloud-init-output.log, and /var/log/messages. On some distributions, it may be necessary to edit /etc/cloud/cloud.cfg.d/05_logging.cfg and change all WARNING values to DEBUG.
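
The effect of the WARNING versus DEBUG setting can be sketched with Python's standard logging library alone. The handle() function and logger names here are hypothetical stand-ins; in a real deploy cloud-init creates the logger and passes it in.

```python
import io
import logging


def handle(name, cfg, cloud, log, args):
    """Hypothetical module body that emits one debug and one warning message."""
    log.debug("%s: cfg keys seen: %s", name, sorted(cfg))
    log.warning("%s: something worth a warning", name)


def run_with_level(level):
    """Run handle() with a logger set to `level` and return what was logged."""
    stream = io.StringIO()
    log = logging.getLogger("demo.%s" % logging.getLevelName(level))
    log.setLevel(level)
    log.propagate = False  # keep output confined to our in-memory stream
    handler = logging.StreamHandler(stream)
    log.addHandler(handler)
    handle("my_module", {"b": 1, "a": 2}, None, log, [])
    log.removeHandler(handler)
    return stream.getvalue()


warning_output = run_with_level(logging.WARNING)  # debug line is dropped
debug_output = run_with_level(logging.DEBUG)      # both lines are kept
print(debug_output, end="")
```

With the level at WARNING, the log.debug() call produces nothing, which is why debug statements in a module may appear to be missing until the logging configuration is changed.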

Tips for Testing Modules

In order to efficiently develop and debug your modules, you'll want to “trick” cloud-init into re-running, even when the VM has already booted up. This way, you don’t have to wait on additional deploys, captures, etc.

To do this, you need to make cloud-init think that your module has not been run and then make cloud-init re-run it. Cloud-init has a notion of semaphores for modules to ensure that they are only run the right number of times (i.e., their frequency). Two ways to go about this are mentioned below.

  • Delete the semaphore file for your module in the /var/lib/cloud/instance/sem/ directory (e.g., /var/lib/cloud/instance/sem/config_foo). Then run `/usr/bin/cloud-init modules --mode <init,config,final>` depending on what kind of module you are writing, to have cloud-init re-execute your module. Some versions of cloud-init allow you to run a single module at a time as well.

  • Modify the instance ID found in /var/lib/cloud/data/instance-id and then rename the corresponding directory in /var/lib/cloud/instances/ to the new instance ID, then reboot the VM to have cloud-init re-execute your module.
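
The first approach above can be sketched as a small helper. The semaphore name config_foo and the cloud-init command line come from the text; the helper names are hypothetical, and the demonstration runs against a temporary directory rather than the real /var/lib/cloud/instance/sem/ directory.

```python
import os
import tempfile


def clear_semaphore(sem_dir, sem_name):
    """Delete one semaphore file; return True if it existed."""
    path = os.path.join(sem_dir, sem_name)
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


def rerun_command(mode):
    """Build the cloud-init command line for one of the three module stages."""
    if mode not in ("init", "config", "final"):
        raise ValueError("mode must be init, config, or final")
    return ["/usr/bin/cloud-init", "modules", "--mode", mode]


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not the real sem/ directory.
    with tempfile.TemporaryDirectory() as sem_dir:
        open(os.path.join(sem_dir, "config_foo"), "w").close()
        print(clear_semaphore(sem_dir, "config_foo"))
        print(" ".join(rerun_command("config")))
```

On a real VM the command would still need to be executed as root after the semaphore file is removed.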

These methods of re-running modules will run into issues when using PowerVC on PowerVM, because information (such as cloud-config data) is passed to cloud-init via an attached virtual optical device. Roughly an hour after deploy, this virtual optical device is detached from the VM on PowerVM systems. The issue does not exist on PowerKVM systems, where the virtual optical device is never detached from the VM. There are two ways to work around this on PowerVM systems. The first option is preferred:

  • Modify /etc/cloud/cloud.cfg and change the line "datasource_list: ['ConfigDrive']" to "datasource_list: ['None']". Remember to revert this change when you are finished developing the module and before you capture the VM to create a deployable image. Note that after you make this change, the semaphore file you must delete to allow your module to re-run will be in the /var/lib/cloud/instances/iid-datasource-none/sem directory.

  • You can disable the removal of virtual optical devices by setting “ibmpowervm_iso_detach_timeout” to a negative value in the /etc/nova/nova*.conf configuration file that corresponds to the host you will deploy the VM to, then restarting the nova services. Remember to revert this change and then restart nova services again once you are done developing your module.

Once you are confident in your module, you should capture your VM with the module in place (i.e., Python cc_ file in config/ directory and the module enabled in cloud.cfg), then deploy the VM and verify that your module worked as intended. Also, you should test passing different cloud-config data if your module accepts user input through that method.


It is important to note that cloud-init is an open-source project and is currently released under the GNU GPL v3. Please consult your own lawyers about developing custom modules under the GPL.
