Union Station filter language

By default Phusion Passenger logs every dynamic request, not just slow ones, and sends it to Union Station. Sometimes, however, there are requests you don't want to log, e.g. because you are only interested in slow requests or only in requests to a certain controller.

Phusion Passenger >= 3.0.5 allows client-side filtering of data. One writes filters in the Union Station filter language and specifies them in the web server config file. The result of a filter is either true (allow the given data to be sent to Union Station) or false (don't allow the given data to be sent). After logging a request, Phusion Passenger runs all defined filters to determine whether to send the request's data to Union Station.

Filters are defined with the UnionStationFilter directive (Apache) or the union_station_filter directive (Nginx).

The Union Station filter language somewhat resembles expressions in many popular languages such as C, JavaScript and Ruby. Every filter is a combination of expressions, each of which evaluates to a boolean. An expression is either a matching expression or a function call.

Quick examples

Example 1: URI must be exactly equal to /foo/bar:

uri == "/foo/bar"

Example 2: Response time must be larger than 150 milliseconds (150000 microseconds):

response_time > 150000

Example 3: URI must match the regular expression /animals/(dog|cat) and the response time must be larger than 60 milliseconds (60000 microseconds):

uri =~ /\/animals\/(dog|cat)/ && response_time > 60000

Example 4: The response time, not taking garbage collection into consideration, must be larger than 50 milliseconds (50000 microseconds), and the response status must be unsuccessful (in the 4xx or 5xx range):

response_time_without_gc > 50000 && status_code >= 400

Values

The filter language supports literal values and identifier values. Values always have a type. The following types are supported:

Integers
Integer literals must be written in decimal format. Hexadecimal and octal forms are not supported. Examples of integer literals: `1`, `1234`.
Booleans
Two boolean literals exist: `true` and `false`.
Strings
String literals begin and end with either a single quote or a double quote character. `\` can be used as escape character. The following special escaped characters are supported:
  • `\n` - newline (byte 10)
  • `\r` - carriage return (byte 13)
  • `\t` - tab (byte 9)
  • `\\` - backslash
Examples:
"foo"
"hello world"
"Joe \"Trigger-Happy\" Dalton"
"string\nliteral"
'single-quote string'
Please note that, unlike in many programming languages, escape characters work the same way in single-quote strings and double-quote strings. In the Union Station filter language the following string literals are equivalent:
"string\nliteral"
'string\nliteral'
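The escape rules above can be sketched in Python. This is only an illustration, not Phusion Passenger's parser: `parse_string_literal` and `ESCAPES` are hypothetical names, and two behaviors are assumptions: that a literal must end with the same quote character it began with, and that an escaped character outside the list above (such as `\"`) stands for itself.

```python
# Escapes listed in this document; applied identically for both quote styles.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\"}

def parse_string_literal(literal: str) -> str:
    """Decode a filter-language string literal, including its quotes."""
    if len(literal) < 2 or literal[0] not in ("'", '"') or literal[-1] != literal[0]:
        # Assumption: opening and closing quote characters must be the same.
        raise ValueError("string literal must begin and end with the same quote")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Assumption: unknown escapes (e.g. \") yield the escaped character itself.
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# Both quote styles decode \n to a real newline character.
double_quoted = parse_string_literal(r'"string\nliteral"')
single_quoted = parse_string_literal(r"'string\nliteral'")
```

In this sketch `double_quoted` and `single_quoted` hold the same value, which is the equivalence described above.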
Regular expressions
There are two regular expression literal syntaxes. The first one begins and ends with a slash:
/regexp definition here/optional modifiers here
The second one begins with `%r{` and ends with `}`:
%r{regexp definition here}optional modifiers here
Regular expressions are case-sensitive by default. Use the `i` modifier to make a regular expression case-insensitive; at this time it is the only supported modifier. Just like with strings, `\` can be used as escape character, and all special escaped characters supported by strings are also supported by regular expressions. Examples:
/foo/       matches "foo", "foobar", etc. but NOT "Foo", "FooBar", etc.
%r{foo}     same as above

/foo/i      matches "foo", "foobar", etc. and also "Foo", "FooBar", etc.
%r{foo}i    same as above
/foo( bar)+/
%r{foo( bar)+}
/newline\n/
%r{newline\n}
/\/users\/1/
%r{/users/1}
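The matching behavior shown in these examples (a match anywhere in the subject, with optional case-insensitivity) can be approximated with Python's `re` module. This is a rough sketch: `regex_matches` is a hypothetical helper, and Python's regular expression syntax is only a stand-in, since the engine used by Phusion Passenger may differ in details.

```python
import re

def regex_matches(pattern: str, modifiers: str, subject: str) -> bool:
    """Approximate `subject =~ /pattern/modifiers` using Python's re module."""
    flags = re.IGNORECASE if "i" in modifiers else 0
    # re.search looks for a match anywhere in the subject, which is why
    # /foo/ matches "foobar" as well as "foo".
    return re.search(pattern, subject, flags) is not None

case_sensitive = regex_matches("foo", "", "FooBar")     # /foo/ against "FooBar"
case_insensitive = regex_matches("foo", "i", "FooBar")  # /foo/i against "FooBar"
```

Note that with the `%r{...}` syntax the slashes in `/users/1` need no escaping, which is the only difference between the last two examples above.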

Identifier values

Identifier values are identifiers that evaluate to a value. The following identifiers are available:

  • uri - the URL, not including the scheme, host name and port, but including the query string. Examples: /foo/bar, /users/1/edit?return_to=overview.
  • response_time - the response time in microseconds.
  • response_time_without_gc - the response time in microseconds, without taking into account the time spent on garbage collection. Logically equivalent to response_time - gc_time.

    Your Ruby interpreter must support GC statistics APIs, otherwise this identifier is always equal to response_time. Please read gc_time for details and notes.

    Supported since Phusion Passenger 3.0.8.

  • gc_time - the amount of time spent on garbage collection, in microseconds. In order for Phusion Passenger to be able to collect garbage collection statistics, it must be using Ruby Enterprise Edition or some other Ruby interpreter which supports the GC statistics API. On Ruby interpreters where such an API is not available, gc_time is always 0.

    Supported since Phusion Passenger 3.0.8.

  • controller - the controller name, including the suffix "Controller". Only available when the app is a Rails app; for all other apps, this identifier evaluates to the empty string. Examples: CustomersController, UsersController.
  • status_code - the HTTP response status code as an integer.

    Supported since Phusion Passenger 3.0.8.
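To make the relationship between these identifiers concrete, here is a small Python sketch of the data a filter sees for one request. The `RequestData` class and the sample values are hypothetical; only the identifier names and the formula `response_time - gc_time` come from the descriptions above.

```python
from dataclasses import dataclass

@dataclass
class RequestData:
    uri: str                # path plus query string, without scheme, host and port
    response_time: int      # microseconds
    gc_time: int = 0        # microseconds; 0 when no GC statistics API is available
    controller: str = ""    # empty string for non-Rails apps
    status_code: int = 200  # HTTP response status code

    @property
    def response_time_without_gc(self) -> int:
        # Logically equivalent to response_time - gc_time.
        return self.response_time - self.gc_time

# Hypothetical request: 180 ms in total, of which 30 ms was garbage collection.
request = RequestData(
    uri="/users/1/edit?return_to=overview",
    response_time=180000,
    gc_time=30000,
    controller="UsersController",
)
without_gc = request.response_time_without_gc  # 150000 microseconds
```

When `gc_time` is 0, as on interpreters without GC statistics support, `response_time_without_gc` equals `response_time`.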

Matching expressions

Matching expressions have the form:

subject operator object

and always evaluate to boolean values. subject and object are values, while operator is one of these:

==
Equality. Subject and object must be both strings or both integers.
!=
Inequality. Subject and object must be both strings or both integers.
=~
Test whether regular expression matches. Subject must be a string, object must be a regular expression.
!~
Test whether regular expression doesn't match. Subject must be a string, object must be a regular expression.
<
Less than. Subject and object must be both integers.
<=
Less than or equal to. Subject and object must be both integers.
>
Greater than. Subject and object must be both integers.
>=
Greater than or equal to. Subject and object must be both integers.
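The operator and type rules above can be summarized in a short Python sketch. `evaluate_match` is a hypothetical function, compiled Python patterns stand in for regular expression values, and raising `TypeError` on a type mismatch is an assumption: this document does not say how Phusion Passenger reports such errors.

```python
import operator
import re

COMPARISONS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def is_int(value) -> bool:
    # bool is excluded explicitly because Python treats it as a subclass of int.
    return isinstance(value, int) and not isinstance(value, bool)

def evaluate_match(subject, op: str, obj) -> bool:
    """Evaluate `subject op obj` under the type rules listed above."""
    if op in ("==", "!="):
        both_str = isinstance(subject, str) and isinstance(obj, str)
        if not (both_str or (is_int(subject) and is_int(obj))):
            raise TypeError(f"{op} needs two strings or two integers")
        return (subject == obj) == (op == "==")
    if op in ("=~", "!~"):
        if not (isinstance(subject, str) and isinstance(obj, re.Pattern)):
            raise TypeError(f"{op} needs a string and a regular expression")
        return (obj.search(subject) is not None) == (op == "=~")
    if op in COMPARISONS:
        if not (is_int(subject) and is_int(obj)):
            raise TypeError(f"{op} needs two integers")
        return COMPARISONS[op](subject, obj)
    raise ValueError(f"unknown operator: {op!r}")

# uri =~ /\/animals\/(dog|cat)/ for a hypothetical uri of "/animals/dog"
animal_match = evaluate_match("/animals/dog", "=~", re.compile(r"/animals/(dog|cat)"))
```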

Function calls

Only one function call is available at the moment:

starts_with(haystack, needle)
Returns whether the string _haystack_ starts with the string _needle_. The following example returns whether the URI starts with `/foo/bar`:
starts_with(uri, "/foo/bar")
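For readers who think in Python, `starts_with` behaves like `str.startswith`. The sketch below is only an analogy under that assumption, not the actual implementation:

```python
def starts_with(haystack: str, needle: str) -> bool:
    """Return whether haystack begins with needle (a plain prefix test)."""
    return haystack.startswith(needle)

# Hypothetical URIs checked against the prefix "/foo/bar".
prefix_hit = starts_with("/foo/bar/baz?page=2", "/foo/bar")
prefix_miss = starts_with("/api/foo/bar", "/foo/bar")
```

Unlike `uri =~ /\/foo\/bar/`, which matches `/foo/bar` anywhere in the URI, a prefix test only succeeds when the URI begins with it, so `prefix_miss` is false here.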

Combining expressions with logical operators

One can combine expressions with boolean operators:

&&
Logical AND.
||
Logical OR.
!
Negation.

Examples:

uri == "/foo" || !starts_with(uri, "/bar")
response_time < 10000 && uri == "/should_be_slow"

To avoid ambiguity, one can group expressions together with parentheses:

(uri == "/foo") || (uri == "/bar" && response_time > 10000)

Please note that the language does not currently support operator precedence! That is, && and || have the same priority and are applied in the order in which they appear, from left to right. So something like

response_time > 100000 || uri == "/foo" && response_time > 1000

is currently interpreted as

(response_time > 100000 || uri == "/foo") && response_time > 1000
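The practical effect of having no operator precedence can be sketched in Python, where `and` does bind tighter than `or`. `evaluate_without_precedence` is a hypothetical helper that folds already-evaluated expressions strictly from left to right, and the filter and request values below are made up for illustration.

```python
def evaluate_without_precedence(terms):
    """Fold an alternating list of booleans and operators strictly left to right."""
    result = terms[0]
    for i in range(1, len(terms), 2):
        op, rhs = terms[i], terms[i + 1]
        if op == "&&":
            result = result and rhs
        elif op == "||":
            result = result or rhs
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return result

# Hypothetical request: uri is "/foo" and response_time is 500 microseconds.
uri, response_time = "/foo", 500

# Filter: uri == "/foo" || uri == "/bar" && response_time > 1000
terms = [uri == "/foo", "||", uri == "/bar", "&&", response_time > 1000]

# Filter language grouping: (True || False) && False
filter_language_result = evaluate_without_precedence(terms)

# Conventional precedence, as in Python: True || (False && False)
python_result = uri == "/foo" or uri == "/bar" and response_time > 1000
```

The two results differ for this request, so it is safest to always group mixed && and || expressions explicitly with parentheses.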