Perl Module XS configuration is hard

I wrote a simple little Perl Module recently, and it reminded me how frustrating it is to get it all working.

I’m not even talking about the .xs preprocessor (xsubpp)—that’s weird, but it’s fairly straightforward. Most contingencies are accounted for and you can make it do whatever you want, and in my case, the results were pretty minimal and beautiful in their own way. No, I’m talking about actually building it and making it compile in a cross platform way.

If you’re used to standard unix C source releases, you know you can just run ./configure and then make and it’ll (probably) just work. Behind the scenes configure is madness, but the principle it works on (feature detection by actually compiling things) is sound. My module is an interface to a small third party library. I can’t count on the library being installed on the machine already, so I want to statically link against it. It includes a configure script for unix machines and some windows source that isn’t covered by ./configure.

In Perl, there are two different ways to build your module: ExtUtils::MakeMaker and Module::Build. For pure Perl modules, I will only use Module::Build, as ExtUtils::MakeMaker seems too hairy. But for an XS module, I’m not sure. ExtUtils::MakeMaker looked fairly configurable so I try that first. It works very well for compiling XS Unix stuff, but because it builds a Makefile and lets you just drop your own rules in, it encouraged me to just call my library’s ./configure and then make in its directory:

use ExtUtils::MakeMaker;

WriteMakefile(
    # ... boilerplate stuff stripped for brevity...

    INC               => '-I./monotonic_clock/include',
    MYEXTLIB          => 'monotonic_clock/.libs/libmonotonic_clock.a'
);

sub MY::postamble {
'
$(MYEXTLIB): monotonic_clock/configure
    cd monotonic_clock && CFLAGS="$(CCFLAGS)" ./configure && $(MAKE) all
monotonic_clock/configure: monotonic_clock/configure.ac
    cd monotonic_clock && ./bootstrap.sh
'
}

This of course worked just fine on my Mac.

But it utterly failed everywhere else. Ugh, there’s hardly any green there! Ok, so I focused on the Unix failures first—I know this, I should be able to get stuff working. Turns out Linux wants -fPIC on the library’s code because I’m statically linking against it and it will end up in my modules shared library “.so” file. Ok, so I can just unilaterally add -fPIC:

sub MY::postamble {
'
$(MYEXTLIB): monotonic_clock/configure Makefile.PL
    cd monotonic_clock && CFLAGS="$(CCFLAGS) -fPIC" ./configure && $(MAKE) all
monotonic_clock/configure: monotonic_clock/configure.ac
    cd monotonic_clock && ./bootstrap.sh
'
}

That works on clang on the Mac and gcc on Linux. I actually tested on my Debian machine. Everything should be great now!

Nope. Half the Linuxes are still failing, some of the Macs too, and I haven’t even addresses Windows yet. I discover that -fPIC happens to be defined by ExtUtils::MakeMaker on the appropriate platforms in the CCCDLFLAGS make variable, and switch to it thinking this should solve the remaining Unix problems.

Then I start thinking about Windows. I boot up my Windows 8 VM where I have Strawberry Perl installed and try out my module. It fails utterly. I forgot—you can’t just call ./configure in windows! Plus my library doesn’t even handle Windows in its configure script. I start thinking about writing my own Makefile rules to build it so I can drop the configure script completely. But I don’t like it. I’m going to need to detect which back-end the library should be using. I basically have to recreate what configure does, but in a Makefile. And I don’t know how much shell I can use in my rules and still work on Windows, meaning the detection is going to be a huge issue.

So I decide to drop ExtUtils::MakeMaker and use Module::Build instead. It’s actually pretty straightforward:

use strict;
use warnings FATAL => 'all';
use Module::Build;

my $builder = Module::Build->new(
    # ... boilerplate stuff stripped for brevity...

    extra_compiler_flags => '-DHAVE_GETTIMEOFDAY', # We're going to assume everyone is at least that modern
    include_dirs => 'monotonic_clock/include',
    c_source     => ['monotonic_clock/src/monotonic_common.c'],
);

# Add the appropriate platform-specific backend.
#
# This isn't as good as the configure script that comes with
# monotonic_clock, since it actually tests for the feature instead of
# assuming that non-darwin unixes support POSIX clock_gettime. On the other
# hand, this handles windows.
push(@{$builder->c_source},
     $^O                 eq 'darwin'  ? 'monotonic_clock/src/monotonic_mach.c' :
     $builder->os_type() eq 'Windows' ? 'monotonic_clock/src/monotonic_win32.c' :
     $builder->os_type() eq 'Unix'    ? 'monotonic_clock/src/monotonic_clock.c' :
                                        'monotonic_clock/src/monotonic_generic.c');

$builder->create_build_script();

I figure out that I can abuse the c_source configuration option. It’s supposed to be a directory where the source code lives and they search it recursively for “*.c” files. But it turns out if I pass a C file to it instead of a directory, the recursive search finds that one C file! So now I can add specific C files for the particular platforms. I now have something that compiles on my Mac, my Debian machine, and my Windows VM. Hooray! That covers everything. Finally!

Sigh. That’s worse than my last Makefile.PL based version! What’s going on? Ok, my c_source hack is biting me. I noticed that it was helpfully adding -I options for the sources directory to the compiler, but since I was passing in actual files to c_source I was getting compiler lines like -Imonotonic_clock/src/monotonic_clock.c. This was a warning on my Debian gcc and on Windows gcc, but my Mac’s clang just ignored it with no message at all. So I blew it off. Well, it turns out other compilers are more strict.

So I start poking around the Module::Build source code. I discover that the c_source is adding to an internal include_dirs list. So I override the compile function to strip .c files out of the include_dirs list.:

# This hacks around the fact that we are using c_source to store files, when Module::Build expects directories.
my $custom = Module::Build->subclass(
    class => 'My::Builder',
    code  => <<'CUSTOM_CODE');
sub compile_c {
  my ($self, $file, %args) = @_;
  # Adding to c_source adds to include_dirs, too. Since we're adding files, remove them.
  @{$self->include_dirs} = grep { !/.c$/ } @{$self->include_dirs};
  $self->SUPER::compile_c($file, %args);
}
CUSTOM_CODE

It seems really hacky (I hate having to write Perl code in a string) and a bit fragile (I sure hope they don’t change the API in a new version) but it actually works! I publish that and wait to see my wall of green.

I don’t get to see it yet. This is baffling to me. The worst part is the test reports themselves. They aren’t showing me the build process, and they don’t try to identify the Linux distro, leaving me to intuit it from the versions of various things. It looks like one of the failing distros is Debian Wheezy (aka the current “stable”). So I as a last ditch effort use “debootstrap” to build a minimal install that I can chroot into. I try compiling and I can actually reproduce the error. This is phenomenal because I don’t have to guess any more.

After poking around for a while I discover that in older glibcs, the clock_gettime function that my library uses requires librt and therefore a -lrt option to the linker. Ok. But I don’t want to add that unilaterally—I got bit by that earlier in this process. I’m also frustrated that my backend detection code is just hardwired to Module::Builds os_type() function. Do I really know that old FreeBSDs support clock_gettime? No, I don’t, and I don’t want to figure it out. I think back wistfully to ./configure—it just detects stuff by compiling it and seeing if it works. That’s really what I want—it’s the only way to know for sure without researching and testing on every. single. platform.

And then it hits me, I guess I could do that. Module::Build uses ExtUtils::CBuilder internally, and so I can too. It turns out to actually not be that bad. The longest part of the code is redirecting stdout and stderr so you don’t see a bunch of compilation errors while it’s figuring things out:

# autoconf style feature tester. Can't believe someone hasn't written this yet...
use ExtUtils::CBuilder;
my $cb = ExtUtils::CBuilder->new(quiet=>1);

sub test_function_lib {
    my ($function, $lib) = @_;
    my $source = 'conf_test.c';
    open my $conf_test, '>', $source or return;
    print $conf_test <<"C_CODE";
int main() {
    int $function();
    return $function();
}
C_CODE
    close $conf_test;

    my $conf_log='conf_test.log';
    my @saved_fhs = eval {
        open(my $oldout, ">&", *STDOUT) or return;
        open(my $olderr, ">&", *STDERR) or return;
        open(STDOUT, '>>', $conf_log) or return;
        open(STDERR, ">>", $conf_log) or return;
        ($oldout, $olderr)
    };

    my $worked = eval {
        my $obj = $cb->compile(source=>$source);
        my @junk = $cb->link_executable(objects => $obj, extra_linker_flags=>$lib);
        unlink $_ for (@junk, $obj, $source, $conf_log);
        return 1;
    };

    if (@saved_fhs) {
        open(STDOUT, ">&", $saved_fhs[0]) or return;
        open(STDERR, ">&", $saved_fhs[1]) or return;
        close($_) for (@saved_fhs);
    }

    $worked
}

Using it is pretty easy:

my $have_gettimeofday = test_function_lib("gettimeofday", "");
my $need_librt = test_function_lib("clock_gettime", "-lrt");

The one annoyance is that the feature detection doesn’t completely work on Windows. The technique I used (stolen unabashedly from autoconf) just declares the function with the wrong arguments and return value and tries to link with it. This works in general because C doesn’t encode the arguments or return values into the symbol name. Except Windows apparently does with its API functions: my feature detector detects gettimeofday() just fine, but not QueryPerformanceCounter(). If I declare QueryPerformanceCounter() properly then Windows links with it, otherwise I get an undefined symbol error. I don’t care enough to properly research Windows’s ABI, so I decided to just check $^O, assuming that all Windows will work. A quick check of the Windows documentation reveals that it’s has been supported since Windows 2000 and that’s good enough for me.

I upload to CPAN and finally see that green I’ve been looking for.

But it makes me wonder, why did I have to write this? It’s 2015, has nobody ever needed to build their Perl XS module differently for different platforms? I know I can’t be the first, but I had a really hard time finding any documentation or discussion about it. I didn’t see anything on CPAN that looked like it might solve my problems. Should this be a feature in Module::Build? I’ve only written two XS modules in my life, but both times I needed different sources on different platforms. Does anyone want to point me to something that does this well, or failing that, build off this idea and make something neat? I would love it, certainly.

Leave a Reply

Your email address will not be published. Required fields are marked *