Configuration for Modules

Most modules need to offer system administrators and users some means of configuring and controlling them. In some cases, this is even the primary purpose of a module.

System administrators configure Apache using httpd.conf, while end users have more limited control through .htaccess files. Modules expose that control by implementing configuration directives that can be used in these files.

In this article, we show how to implement configuration directives in a module, and how to work with directives implemented by other modules.


Configuration Basics


From the point of view of a system administrator, there are several kinds of directive. These can be broadly classified according to their scope and validity in the configuration files. That is to say, some directives are valid only for the server as a whole, while others apply within a scope such as <VirtualHost> or <Directory>.

Conflicting directives may override each other on the basis of specificity. For example, a directive inside a <Directory> overrides one outside it. In most cases this applies recursively, although this is controlled by individual modules whose behaviour may differ.

The standard contexts supported by Apache are:

Main Config
Directives appearing in httpd.conf but not inside any container apply globally, except where overridden. This is appropriate for setting system defaults such as MIME types, and for once-only setup such as loading modules. Most directives can be used here.
Virtual Host
Each virtual host has its own (virtual) server-wide configuration, set within a <VirtualHost> container. Directives that are valid in the main config are also valid in a virtual host, and vice versa.
Directory
The <Directory>, <Files> and <Location> directives define a hierarchy within which configuration can be set and overridden at any level. This is the most usual form of configuration, and is orthogonal to the virtual hosts. In the interests of brevity, we'll refer to this collectively as the Directory Hierarchy in this article.
htaccess
htaccess is an extension of the Directory Hierarchy that serves to enable users to set directives for themselves, subject to permissions (AllowOverride) set up by the server administrator.

Note that contexts are not always the same as containers. Modules may themselves implement their own containers: for example, mod_access implements <Limit>, and mod_perl implements <Perl>. These are not relevant to this discussion.


Configuration Data Structs


As noted above, there are two orthogonal hierarchies of configuration directives: (Virtual) Hosts and Directories. Internally, this is based on having two different data structs: the per-server config and the per-directory config. In fact, every module has its own pointers for implementing each of these structs, although either or both can be unused (NULL), and it is unusual for a module to use both of them.

The per-server config is kept on the server_rec, of which there is one for each virtual host, created at server startup. The per-directory config is kept on the request_rec and may be computed using the merge function for every request.


Managing a Module Configuration


No fewer than five of the six (usable) elements of the Apache module struct are concerned with configuration:

module my_module = {
  STANDARD20_MODULE_STUFF,
  my_create_dir_conf,		/* Create config rec for Directory */
  my_merge_dir_conf, 		/* Merge config rec for Directory */
  my_create_svr_conf,		/* Create config rec for Host */
  my_merge_svr_conf, 		/* Merge config rec for Host */
  my_cmds,           		/* Configuration directives */
  my_hooks
} ;

It is up to each module whether and how to define each struct. Whenever a struct is defined, the module must implement an appropriate create function to allocate and (usually) initialise it:

typedef struct {
  ... ;
} my_svr_cfg ;

static void* my_create_svr_conf(apr_pool_t* pool, server_rec* s) {
  /* The parameter is named s so that it does not clash with the local svr */
  my_svr_cfg* svr = apr_pcalloc(pool, sizeof(my_svr_cfg));
  /* Set up the default values for fields of svr */
  return svr ;
}

typedef struct {
  ... ;
} my_dir_cfg ;

static void* my_create_dir_conf(apr_pool_t* pool, char* x) {
  my_dir_cfg* dir = apr_pcalloc(pool, sizeof(my_dir_cfg));
  /* Set up the default values for fields of dir */
  return dir ;
}

At this point, just allocating and returning a struct of the right size is sufficient: Apache stores the returned pointer for us. The values can then be accessed whenever a server_rec (s below) or request_rec (r below) respectively is available:

my_svr_cfg* svr = ap_get_module_config(s->module_config, &my_module) ;
my_dir_cfg* dir = ap_get_module_config(r->per_dir_config, &my_module) ;
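
To make the lookup less magical: in the 2.0-era headers, ap_get_module_config is (as far as I recall) little more than an array lookup keyed on the module's index. The standalone model below mimics that idea with stand-in types. The names model_module, model_get_config, model_set_config, a_cfg and b_cfg are hypothetical and are not part of the Apache API; treat this as a sketch of the mechanism rather than the real implementation.

```c
#include <stddef.h>

/* Stand-in for Apache's module struct: only the index matters here. */
typedef struct {
    int module_index;
} model_module;

/* A "config vector" is modelled as a plain array of void pointers,
 * one slot per loaded module. The size is arbitrary for this sketch. */
#define MODEL_MAX_MODULES 8

static void *model_get_config(void *const *vec, const model_module *m) {
    return vec[m->module_index];
}

static void model_set_config(void **vec, const model_module *m, void *cfg) {
    vec[m->module_index] = cfg;
}

typedef struct { int timeout; } a_cfg;   /* hypothetical config of module A */
typedef struct { int verbose; } b_cfg;   /* hypothetical config of module B */

/* Code in module B reading module A's slot: it needs A's module struct
 * (to find the index) and A's config type (to interpret the pointer). */
static int b_reads_a_timeout(void *const *vec, const model_module *mod_a) {
    const a_cfg *cfg = model_get_config(vec, mod_a);
    return cfg ? cfg->timeout : -1;   /* -1 if A has no config in this vector */
}
```

Under this model, reading another module's configuration only requires that module's struct and the definition of its config type, which in practice means a shared header exported by the other module.
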

Note by anonymous, Wed Jul 27 11:05:12 2005

What about get config of another module? For example get config of module_A in module_B


Implementing Configuration Directives


my_cmds above is a null-terminated array containing the commands implemented by the module. Normally they are defined using macros defined in http_config.h. For example,

static const command_rec my_cmds[] = {
  AP_INIT_TAKE1("MyFirstDirective", my_first_cmd_func, my_ptr, OR_ALL,
	"This is My First Directive"),
  /* more directives as applicable */
  { NULL }
} ;

AP_INIT_TAKE1 is one of many such macros, all having the same prototype (more later). The arguments to it are as follows:

  1. Directive Name.
  2. Function implementing the directive.
  3. Data pointer (often NULL).
  4. Where this directive is allowed.
  5. A brief Help message for the directive.

Configuration Functions


An essential component of every directive is the function implementing it. Normally the function serves to set some data field(s) in one of the config structs. The function prototype for AP_INIT_TAKE1 is the same, regardless of whether we're setting per-server or per-directory config:
const char* my_first_cmd_func(cmd_parms* cmd, void* cfg, const char* arg)

cmd is a cmd_parms_struct comprising a number of fields used internally by Apache and available to modules. Fields likely to be of interest in modules include:

  • void* info - contains my_ptr from the command declaration
  • apr_pool_t* pool - pool for permanent resource allocation
  • apr_pool_t* temp_pool - pool for temporary resource allocation
  • server_rec* server - the server rec

cfg is the directory config rec, and arg is an argument to the directive set in the configuration file we are processing.

Thus, if we are setting per-directory configuration, we just cast the cfg argument, whereas if we are setting per-server configuration we need to retrieve it from the server_rec (available as cmd->server) using ap_get_module_config.
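
The two cases can be sketched side by side. The block below is a minimal, standalone illustration: the typedefs for module, server_rec and cmd_parms and the ap_get_module_config macro are cut-down stand-ins so that it compiles without the Apache headers, and the handlers my_set_limit and my_set_name, together with the fields limit and name, are hypothetical. In a real module you would drop the stand-ins and include http_config.h instead.

```c
#include <limits.h>
#include <stdlib.h>

/* --- Stand-ins for the Apache types, so the sketch compiles alone --- */
typedef struct { int module_index; } module;
typedef struct { void **module_config; } server_rec;
typedef struct { void *info; server_rec *server; } cmd_parms;
#define ap_get_module_config(v, m) ((v)[(m)->module_index])

static module my_module = { 0 };

typedef struct { const char *name; } my_svr_cfg;   /* hypothetical field */
typedef struct { int limit; } my_dir_cfg;          /* hypothetical field */

/* Per-directory directive: cfg already points at our my_dir_cfg. */
static const char *my_set_limit(cmd_parms *cmd, void *cfg, const char *arg) {
    my_dir_cfg *dir = cfg;
    char *end;
    long val = strtol(arg, &end, 10);
    (void)cmd;
    if (*arg == '\0' || *end != '\0' || val < 0 || val > INT_MAX) {
        return "argument must be a non-negative integer";
    }
    dir->limit = (int)val;
    return NULL;               /* NULL means the directive was accepted */
}

/* Per-server directive: ignore cfg and look up the server config. */
static const char *my_set_name(cmd_parms *cmd, void *cfg, const char *arg) {
    my_svr_cfg *svr = ap_get_module_config(cmd->server->module_config,
                                           &my_module);
    (void)cfg;
    /* A real module might copy the string into cmd->pool first. */
    svr->name = arg;
    return NULL;
}
```

Both handlers return NULL on success and a message string on failure, which is what lets the server report a configuration error against the offending line.
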


Note by anonymous, Mon Jul 2 17:42:11 2007

The configuration functions must return NULL or an error string - if you don't return anything, the server will say there is a syntax error


Configuration Function Types


The above example used the AP_INIT_TAKE1 macro, which defines a function having a single string argument. This is one of several such macros defined in http_config.h:

  • AP_INIT_NO_ARGS (no args)
  • AP_INIT_FLAG (a single On/Off arg)
  • AP_INIT_TAKE1 (a single string arg)
  • AP_INIT_TAKE2, AP_INIT_TAKE3, AP_INIT_TAKE12, etc - directives taking different numbers of string args
  • AP_INIT_ITERATE (function will be called repeatedly with each of an unspecified number of arguments)
  • AP_INIT_ITERATE2 (function will be called repeatedly with two arguments)
  • AP_INIT_RAW_ARGS (function will be called with args unprocessed)

This gives module authors a choice of simple prototypes, together with the hands-on RAW_ARGS for modules to do their own parsing. Modules using RAW_ARGS should retrieve the arguments by calling ap_getword_conf repeatedly until it returns an empty string.
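
Each macro family expects a handler of a matching shape. As a rough sketch, the block below shows handlers in the FLAG, TAKE2 and ITERATE shapes. The cmd_parms typedef is a stand-in so that the code compiles without the Apache headers, and the handler names, the my_dir_cfg fields and MY_MAX_TYPES are all hypothetical.

```c
#include <stddef.h>

/* Stand-in so the sketch compiles alone; the real cmd_parms is
 * declared in http_config.h. */
typedef struct { void *info; } cmd_parms;

#define MY_MAX_TYPES 4   /* hypothetical limit for this sketch */

typedef struct {
    int enabled;
    const char *from;
    const char *to;
    const char *types[MY_MAX_TYPES];
    int ntypes;
} my_dir_cfg;

/* AP_INIT_FLAG shape: On/Off arrives as an int (non-zero means On). */
static const char *my_set_enabled(cmd_parms *cmd, void *cfg, int on) {
    my_dir_cfg *dir = cfg;
    (void)cmd;
    dir->enabled = on;
    return NULL;
}

/* AP_INIT_TAKE2 shape: exactly two string arguments. */
static const char *my_set_map(cmd_parms *cmd, void *cfg,
                              const char *from, const char *to) {
    my_dir_cfg *dir = cfg;
    (void)cmd;
    dir->from = from;
    dir->to = to;
    return NULL;
}

/* AP_INIT_ITERATE shape: same as TAKE1, but called once per argument. */
static const char *my_add_type(cmd_parms *cmd, void *cfg, const char *arg) {
    my_dir_cfg *dir = cfg;
    (void)cmd;
    if (dir->ntypes >= MY_MAX_TYPES) {
        return "too many arguments";
    }
    dir->types[dir->ntypes++] = arg;
    return NULL;
}
```

With ITERATE the server does the looping, so the handler only ever sees one argument at a time and accumulates state in the config struct.
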


Note by anonymous, Wed Nov 15 19:23:31 2006

Apache has defined AP_INIT_TAKEXXX where XXX is in: {1, 2, 3, 12, 23, 123}. 4 or more args are not defined (as of Apache 2.0.x).


Scope of Configuration


The above example used OR_ALL, to say that MyFirstDirective can be used anywhere in httpd.conf or in any .htaccess file (provided htaccess is enabled on the server). Other options we could have used include:

  • RSRC_CONF - httpd.conf at top level or in a VirtualHost context. All directives using server config should use this, as other contexts are meaningless for a server config.
  • ACCESS_CONF - httpd.conf in a Directory context. This is appropriate to per-dir config directives for a server administrator only, and is often combined (using bitwise OR) with RSRC_CONF to allow its use anywhere within httpd.conf.
  • OR_LIMIT, OR_OPTIONS, OR_FILEINFO, OR_AUTHCFG, OR_INDEXES - extend to allow use of the directive in .htaccess according to AllowOverride setting.
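
These options are bit flags, and the permission check amounts to a bitwise AND between what the directive requires and what the current context grants. The model below is a simplified sketch of that idea. The MODEL_ constants and model_directive_allowed are hypothetical names, and the numeric values are illustrative only: the real definitions live in http_config.h and should be taken from there.

```c
/* Illustrative stand-ins for the override flags; do not rely on the values. */
enum {
    MODEL_OR_LIMIT    = 1,
    MODEL_OR_OPTIONS  = 2,
    MODEL_OR_FILEINFO = 4,
    MODEL_OR_AUTHCFG  = 8,
    MODEL_OR_INDEXES  = 16,
    MODEL_ACCESS_CONF = 64,
    MODEL_RSRC_CONF   = 128,
    MODEL_OR_ALL      = 1 | 2 | 4 | 8 | 16
};

/* A directive is accepted if its required flags share at least one bit
 * with the flags granted by the context in which it appears. */
static int model_directive_allowed(int req_override, int context) {
    return (req_override & context) != 0;
}
```

In this model a directive declared with RSRC_CONF | ACCESS_CONF is accepted both at the top level and inside a Directory section of httpd.conf, but not in an .htaccess file, whose context carries only the AllowOverride bits.
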

Pre-Packaged Configuration Functions


In general, as in the above example, we write our own function to implement a directive, but this is not always necessary. In the common case of a directive that simply sets a field in the directory config, we can use one of the pre-packaged functions ap_set_string_slot, ap_set_string_slot_lower, ap_set_int_slot, ap_set_flag_slot or ap_set_file_slot, chosen according to the type of the field to be set. To use these functions, we pass the offset of the field to be set in the data pointer, so an entry in the command table looks like:
AP_INIT_TAKE1("MySimpleDirective", ap_set_int_slot, (void*)APR_OFFSETOF(my_dir_cfg, myintvar), OR_ALL, "Set something"),
where myintvar is an integer member of the my_dir_cfg struct.
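
The offset trick is easier to follow in isolation. The sketch below shows, in plain C, roughly what a slot setter has to do with the offset it is given: add it to the address of the config struct and write through the resulting pointer. The function model_set_int_slot is a hypothetical stand-in and not the Apache implementation, the field other exists only to show that neighbouring fields are untouched, and the standard offsetof plays the role of APR_OFFSETOF.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int myintvar;
    int other;      /* hypothetical second field */
} my_dir_cfg;

/* Parse arg as an integer and store it at the given byte offset inside
 * the struct. Returns NULL on success or an error message on failure. */
static const char *model_set_int_slot(void *struct_ptr, size_t offset,
                                      const char *arg) {
    char *end;
    long val = strtol(arg, &end, 10);
    if (*arg == '\0' || *end != '\0') {
        return "argument must be an integer";
    }
    *(int *)((char *)struct_ptr + offset) = (int)val;
    return NULL;
}
```

A call such as model_set_int_slot(cfg, offsetof(my_dir_cfg, myintvar), "15") then writes only to myintvar. Because the setter works on the struct it is handed, it can only reach fields of that struct.
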


Note by anonymous, Tue Apr 5 14:59:30 2005

This functions works ONLY for per_dir config not for per server


The Configuration Hierarchy


We have now dealt with creating the configuration structures and populating them using configuration directives. The final topic we need to deal with is managing the configuration hierarchy: how directives set at different levels interact with each other. This is the purpose of the merge functions in the module struct.

A merge function is called whenever there are directives at more than one level in a hierarchy, starting at the top level of httpd.conf. In the case of the per-directory config there may be several levels and thus several calls to a merge function, incorporating htaccess files (if applicable) as well as sections in httpd.conf.

A merge function may also be NULL. In that case all directives in the less-specific container are discarded, so incremental configuration is not possible. Nevertheless, it is perfectly adequate for some modules.

More typically, we want to honour directives set in the more specific container while inheriting any values it does not set explicitly. This is where a merge function is needed. Consider the following example:

typedef struct {
  int a, b, c;
} my_dir_cfg;

with directives to set each of these, and a configuration

<Location />
	SetMyA	123
	SetMyC	321
</Location>
<Location /somewhere/>
	SetMyB	456
</Location>
<Location /somewhere/else/again/>
	SetMyC	789
</Location>

Here the most specific section is /somewhere/else/again/, so in the absence of a directory merge function, c will be set to 789 but the values of a and b are unset. We need a merge function, which takes the generic form:

static void* my_merge_dir_conf(apr_pool_t* pool, void* BASE, void* ADD) {
    my_dir_cfg* base = BASE ;
    my_dir_cfg* add = ADD ;
    my_dir_cfg* conf = apr_palloc(pool, sizeof(my_dir_cfg)) ;
    conf->a = ( add->a == UNSET ) ? base->a : add->a ;
    conf->b = ( add->b == UNSET ) ? base->b : add->b ;
    conf->c = ( add->c == UNSET ) ? base->c : add->c ;
    return conf ;
}

To make this effective, we define the value of UNSET to some value that won't be used (e.g. -1 if our integers will always be positive), and initialise them to that in our create_config function. Now our configuration is processed as follows:

  1. At the top level, a is set to 123 and c to 321 while b is unset.
  2. The first merge sets b to 456. Since a and c are not set (overridden) at this level, the previous values are inherited in the merge.
  3. There are no configuration directives at /somewhere/else/, so this level simply inherits from /somewhere/ without any need for a merge.
  4. The second merge sets the value of c overriding the previous setting, while inheriting the previous values of a and b. Now we have a=123, b=456, c=789.

This is obviously a trivial merge function. Often we may need to do something a little more interesting: for example to merge nontrivial structures, or to deal with cases where there is no meaningful UNSET value to test. When merging structures involving pointers, take care about modifying the originals: it's usually safer to make a copy unless you're using a standard APR datatype with its merge functions. You'll just have to deal with each case on its merits.
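
The four-step walk-through above can be checked with a small standalone model. It is only a sketch: model_create and model_merge are hypothetical names, structs are returned by value instead of being allocated from an APR pool, and UNSET is assumed to be -1 as suggested in the text.

```c
#define UNSET (-1)   /* assumes valid settings are never negative */

typedef struct {
    int a, b, c;
} my_dir_cfg;

/* Stand-in for the create function: every field starts out UNSET. */
static my_dir_cfg model_create(void) {
    my_dir_cfg conf = { UNSET, UNSET, UNSET };
    return conf;
}

/* Same logic as my_merge_dir_conf: a value set in add wins,
 * otherwise the value from base is inherited. */
static my_dir_cfg model_merge(const my_dir_cfg *base, const my_dir_cfg *add) {
    my_dir_cfg conf;
    conf.a = (add->a == UNSET) ? base->a : add->a;
    conf.b = (add->b == UNSET) ? base->b : add->b;
    conf.c = (add->c == UNSET) ? base->c : add->c;
    return conf;
}
```

Building one struct per Location section and merging them in the order listed reproduces the a, b and c values from the walk-through. Swapping the arguments of a merge changes which side wins, so the order in which sections are merged matters as much as the logic of the function itself.
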


Note by anonymous, Wed Apr 13 16:30:37 2005

The <Location> example is misleading in the implication that the last <Location> section is merged last because it is the most specific. Actually, it's merged last because it comes later in the config file; if the <Location> sections were in the opposite order, both matching sections would still be merged but the least specific one would be merged last, with unusual results.


Dealing with Variables


The configuration structures should normally be treated as read-only outside of the functions discussed above. A few limited exceptions may be appropriate, usually on the server config, where it is used to manage, for example, a pool or cache of resources whose contents might change at any time. This can safely be done in a post_config or child_init hook. But at any later point - when processing a Connection or Request - this gives rise to a race condition. Any such operations must therefore use an appropriate lock: usually an apr_thread_mutex (which must itself be set up during module initialisation).


Request and Connection Variables


It is not appropriate to use the configuration structs for variables used in processing a Request or Connection. However, similar structs are provided for these, and can be allocated on the request or connection pools with the lifetime of the request or connection.

typedef struct {
    ....
} my_request_vars ;

We can now set this in some hook:
my_request_vars* vars = apr_palloc(r->pool, sizeof(my_request_vars)) ;
/* store stuff in vars */
ap_set_module_config(r->request_config, &my_module, vars) ;

and retrieve what we set later in the request:
my_request_vars* vars = ap_get_module_config(r->request_config, &my_module) ;

The conn_rec has an analogous conn_config field. Apache provides other contexts that may be useful for some applications: each filter and namespace has a context field for its own data. These topics will be the subjects of separate articles.


Further Reading


This article complements other introductory developer articles at apachetutor, such as Resource Management and Request Processing.
