Dynamic scaling of Ruby application processes vs fixed process pool
on Passenger + Nginx

Generally speaking, the more application processes you run, the more concurrent traffic you can handle and the better your CPU core utilization becomes, until your hardware is saturated. But running more processes also means higher memory consumption. So when using Passenger, you can choose between two modes of application process handling, each optimized for different use cases:

  1. Letting Passenger automatically adjust the number of application processes based on traffic.
  2. Maintaining a fixed number of application processes.

Table of contents

  1. Loading...

Dynamic application process scaling

In this mode, Passenger will automatically spin up application processes when traffic is high. It will also automatically shut down application processes when traffic is low, allowing you to conserve resources. This mode is a great fit for the following use cases:

  • You have deployed multiple web applications on Passenger, and you want resources for each application to be automatically adjusted based on traffic.
  • You are running multiple services on a machine with limited resources, so you want conserve web application memory usage as much as possible.

This is the default process handling mode in Passenger.

However, dynamic process scaling has a performance penalty. Spinning processes up and down is expensive. If performance is important to you, then we recommend that you maintain a fixed number of application processes.

How it works

To learn how exactly Passenger decides when to spin up and spin down processes, read How dynamic scaling of application processes works.

Configuration

Dynamic process scaling can be tweaked through the following configuration options:

  • passenger_min_instances – Configures the minimum number of processes for a given application. Defaults to 1.
  • passenger_max_instances – Configures the maximum number of processes for a given application. If you do not configure this, then Passenger may spin up as many processes as it wants, up to the max pool size.
  • passenger_pool_idle_time – How many seconds a process should be idle before Passenger considers shutting it down.
  • passenger_max_pool_size – Configures the maximum total number of application processes that may simultaneously exists, regardless of how many different applications you have deployed on Passenger. This is a "safety switch" that prevents Passenger from overloading your system with too many processes. Defaults to 6.

Example 1

Suppose that you have a VPS with 2 GB RAM. You deploy 8 different web applications on that single server (instead of giving each application its own server) in order to minimize cost. Performance isn't terribly important to you: saving cost is more important. You want to allocate at most 1.5 GB of RAM for your web applications, leaving 512 MB for other services. Assuming that each application process consumes about 300 MB RAM, you can set:

  • min instances to 1
  • max pool size to 5 (1500 / 300 = 5)

Example 2

Suppose that you have the same VPS with 2 GB RAM. The VPS is running an ecommerce site and a blog. The ecommerce site is more important than the blog, so whenever the blog isn't very active, you want to spin down blog processes to make more room for the ecommerce site.

You want to allocate at most 1.5 GB of RAM for your web applications, leaving 512 MB for other services. Assuming that each application process consumes about 300 MB RAM, you can configure the following:

  • Set max pool size to 5 (1500 / 300 = 5)
  • For the ecommerce site:
    • Set min instances to 3
  • For the blog:
    • Set min instances to 1
    • Set max instances to 2

This way, the ecommerce site will use 3-5 processes. The blog will 1-2 processes.

Maintaining a fixed number of application processes

Passenger can also maintain a fixed number of application processes instead of adjusting the number dynamically. This mode of process handling is ideal in the following use cases:

  1. You run only a single web application on your server. Your server is dedicated to running that web application only.
  2. Performance is important to you. Adjusting processes dynamically is expensive, so by maintaining a fixed number of application processes you will avoid this overhead.

This mode of process handling is also used by Unicorn and Puma.

Configuration

Let N be the number of processes you want. Set the following configuration in your http block:

passenger_max_pool_size N;

Set the following configuration in your server block:

passenger_min_instances N;

You should also configure passenger_pre_start in the http block so that your app is started during web server launch:

passenger_pre_start http://your-website-url.com;