Chef: chef_gem vs gem_package and ORDER

01-04-2014 | Remy van Elst

One of my clients uses Chef and has a cookbook that builds a PostgreSQL database node. The cookbook should install the pg Ruby gem, but it uses the chef_gem resource to do so, and that keeps failing.

They use a wrapper cookbook around the regular postgresql cookbook. The wrapper installs some extra packages required by their HA setup and replaces a few of the regular templates with its own.

First they pin on a specific postgresql version and remove any other version:

package "postgresql-client-common" do
  action :purge
  # Skip the purge when the installed client already reports version 9.1
  not_if "psql --version | grep 9.1"
end

The postgresql apt-repository is enabled:

include_recipe 'postgresql::apt_pgdg_postgresql'

The postgresql::server recipe is included:

include_recipe "postgresql::server"

Then a few templates from the postgresql cookbook are overwritten by templates from the wrapper-cookbook:

r = resources(:template => "#{node['postgresql']['dir']}/pg_hba.conf")
r.source "pg_hba.conf.erb"
r.cookbook "wrapper-postgresql"

This goes on for a few templates and files. Then the culprit comes along:

chef_gem "pg"

What we want to happen is this: first the postgresql apt repository is enabled, then the correct packages are installed, including libpq-dev, which the regular postgresql cookbook takes care of. Only then can the pg gem be built, because its native extension compiles against the libpq headers.

On a new node, however, this cookbook fails with the following error messages:

Error executing action `install` on resource 'chef_gem[pg]'

ERROR: Failed to build gem native extension.

        /usr/bin/ruby extconf.rb
checking for pg_config... no
No pg_config... trying anyway. If building fails, please try again with
checking for libpq-fe.h... no

This happens before anything else from the run list (other roles and so on) has run. Just boom, right away, it fails.

At first I thought it had something to do with chef_gem being a method instead of a resource, but that is not the case. In the chef_gem documentation we read the following:

The chef_gem and gem_package resources are both used to install Ruby gems. For any machine on which the chef-client is installed, there are two instances of Ruby. One is the standard, system-wide instance of Ruby and the other is a dedicated instance that is available only to the chef-client. Use the chef_gem resource to install gems into the instance of Ruby that is dedicated to the chef-client. Use the gem_package resource to install all other gems (i.e. install gems system-wide).

And, more important for this problem:

The chef_gem resource works with all of the same attributes and options as the gem_package resource, but does not accept the gem_binary attribute because it always uses the CurrentGemEnvironment under which the chef-client is running. In addition to performing actions similar to the gem_package resource, the chef_gem resource does the following:

    Runs its actions immediately, before convergence, allowing a gem to be used in a recipe immediately after it is installed

So this resource is executed before anything else, which is exactly the problem we have. The gem cannot be built because the packages it needs are not installed yet, and they never will be, because the run aborts on this failure.
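To make the ordering concrete, here is a toy model in plain Ruby. It is not Chef's API: the ToyRun class and its methods are hypothetical and only mimic the behaviour described in the quoted documentation, where ordinary resources are collected first and run later, while chef_gem acts the moment its line is evaluated.

```ruby
# Toy model of a Chef run. NOT Chef's real API: ToyRun and its
# methods are hypothetical names used only for illustration.
class ToyRun
  def initialize
    @collection = [] # resources queued for the converge phase
    @log = []        # order in which actions actually ran
  end

  # An ordinary resource: declared now, executed during convergence.
  def package(name)
    @collection << "package[#{name}]"
  end

  # Queued like any other resource.
  def gem_package(name)
    @collection << "gem_package[#{name}]"
  end

  # Mimics chef_gem as documented: acts as soon as the line is evaluated.
  def chef_gem(name)
    @log << "chef_gem[#{name}]"
  end

  def run(&recipe)
    instance_eval(&recipe)                 # compile phase: evaluate the recipe
    @collection.each { |res| @log << res } # converge phase: run the queue in order
    @log
  end
end

with_chef_gem = ToyRun.new.run do
  package "libpq-dev"
  chef_gem "pg"
end

with_gem_package = ToyRun.new.run do
  package "libpq-dev"
  gem_package "pg"
end

puts with_chef_gem.inspect
puts with_gem_package.inspect
```

In the first run the gem is handled before libpq-dev even though it is declared after it. In the second run the declaration order is preserved.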

I asked around, and it turns out the pg gem is not needed in a Ruby instance dedicated to chef-client, but system-wide. Their nodes have only one Ruby, and they install chef-client with that Ruby via gem install chef-client, so this has always worked for them. Therefore, changing it to gem_package should give the same result.

It also turns out the other admins just did some things manually because they did not get time from management to fix this issue...

In the end I changed the chef_gem to gem_package. The cookbook now works on a new node without this issue.
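As a sketch, the relevant part of the wrapper recipe could look like the fragment below after the change. It is a recipe fragment, so it only runs under chef-client. The explicit action :install is spelled out for clarity and is an assumption about style, not a copy of the client's cookbook.

```ruby
# Recipe fragment for the wrapper cookbook (evaluated by chef-client, not plain ruby).

# Pulls in the regular cookbook, which (per the text above) installs libpq-dev.
include_recipe "postgresql::server"

# gem_package is an ordinary resource: it is queued at compile time and
# only installed during convergence, after the resources declared before it.
gem_package "pg" do
  action :install
end
```

If the gem really were needed inside chef-client's own Ruby, newer Chef releases (12.1 and later) add a compile_time property to chef_gem. Setting it to false defers the install to convergence. That option was not available in the setup described here.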

Tags: chef, deployment, devops, gem, postgresql, ruby