We’re going to dig into the process of a working Gaggle, so you can see how it runs, and how to deal with some of the errors you might encounter as you start working with this feature of Goose. Goose does not currently have a UI; this example expects you to be familiar with the command line interface (CLI).

This example uses one Manager and two Workers, so there are three different things going on at the same time.

Leading the Gaggle: Starting the Manager

As explained in the README for Goose, Cargo is the Rust package manager. We're using it to run an example from the Goose codebase. We do this primarily because these examples are included in the Goose codebase and therefore are available to everyone for review and experimentation.

When starting a Gaggle, you must start a Goose application in Manager mode. Adding the --manager flag isn't enough; a Manager requires other configuration. The error generated explains that the --manager flag expects some number of defined workers from the --expect-workers option:


$ cargo run --example drupal_loadtest --release -- --manager
    Finished release [optimized] target(s) in 0.10s
    Running `target/release/examples/drupal_loadtest --manager`
    Error: InvalidOption { option: "--expect-workers", value: "0", detail: "The --expect-workers option must be set to at least 1." }

This example is running the test on a 1-core VM. Goose defaults to launching one user per available CPU core, so here it defaults to 1 user. To specify the number of Workers, use the --expect-workers option, with an argument for the number of Workers the Manager should wait for. Here, we expect two Workers (--expect-workers 2), but Goose needs at least as many users as there are Workers (otherwise a Worker would be totally idle), so this fails too:


$ cargo run --example drupal_loadtest --release -- --manager --expect-workers 2
    Finished release [optimized] target(s) in 0.10s
    Running `target/release/examples/drupal_loadtest --manager --expect-workers 2`
    Error: InvalidOption { option: "--expect-workers", value: "2", detail: "The --expect-workers option can not be set to a value larger than --users option." }
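
The two checks behind these errors can be modeled in a few lines of shell. This is only a sketch of the rules as the error messages describe them, not Goose's actual validation code; the function name check_expect_workers and its messages are made up for illustration.

```shell
# Hypothetical helper: mirrors the two --expect-workers rules described above.
# Usage: check_expect_workers USERS EXPECT_WORKERS
check_expect_workers() {
    users=$1
    workers=$2
    if [ "$workers" -lt 1 ]; then
        echo "expect-workers must be at least 1"
        return 1
    fi
    if [ "$workers" -gt "$users" ]; then
        echo "expect-workers can not be larger than users"
        return 1
    fi
    echo "ok"
}

check_expect_workers 1 0 || true   # Manager started without --expect-workers
check_expect_workers 1 2 || true   # 2 Workers, but only the default 1 user
check_expect_workers 8 2           # 2 Workers sharing 8 users
```

Only the last call passes, which matches the configuration the Manager is eventually started with.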

Next, add in the remaining required run-time options:

  • --host - the URL to load test. Here we’re using http://local.dev/
  • --users - the number of users to start. Here we use the short form, -u8, to start 8 users

This configuration causes another error:


$ cargo run --example drupal_loadtest --release -- --host http://local.dev/ -u8 --manager --expect-workers 2
    Finished release [optimized] target(s) in 0.10s
     Running `target/release/examples/drupal_loadtest --host 'http://local.dev/' -u8 --manager --expect-workers 2`
    Error: FeatureNotEnabled { feature: "gaggle", detail: "Load test must be recompiled with `--features gaggle` to start in manager mode." }

Gaggles are a compile-time option, because the code has additional dependencies and it can be more difficult to compile in Gaggle-mode.

On most Linux distributions you have to install cmake and openssl-dev first.
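
Before recompiling, it can save time to confirm that the build tools are on the PATH. This is a minimal sketch, assuming only that the tools are invoked as cargo and cmake; the have_tool helper is hypothetical, and the OpenSSL development package goes by different names across distributions, so it is not checked here.

```shell
# Hypothetical helper: succeeds if the named program is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in cargo cmake; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing, install it before building with --features gaggle"
    fi
done
```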

We add the extra flag (--features gaggle) as detailed in the error and finally the Manager starts correctly.


$ cargo run --example drupal_loadtest --release --features gaggle -- --host http://local.dev/ -u8 --manager --expect-workers 2
      Downloaded cmake v0.1.44
      Downloaded 1 crate (14.4 KB) in 1.90s
       Compiling cmake v0.1.44
       Compiling nng-sys v1.1.1-rc.2
       Compiling nng v0.5.1
       Compiling goose v0.10.1-dev (/home/jandrews/goose)
    Finished release [optimized] target(s) in 1m 10s
    Running `target/release/examples/drupal_loadtest --host 'http://local.dev/' -u8 --manager --expect-workers 2`

If we were building a real application, we could use `cargo build` instead of `cargo run` and then run the resulting binary directly, without using Cargo to start the load test. Say we built a binary named `loadtest`; in that case we'd instead run `./loadtest --host http://local.dev/ -u8 --manager --expect-workers 2`.
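
For the bundled example, the same build-then-run idea might look like the sketch below. The binary path is the one Cargo prints in its "Running" lines; the example_binary and run helpers and the RUN variable are hypothetical, added so the script only prints the commands unless RUN=1 is set.

```shell
EXAMPLE=drupal_loadtest

# Path Cargo uses for release example binaries, as shown in the "Running" lines.
example_binary() {
    echo "target/release/examples/$1"
}

# Dry run by default: print the command. Set RUN=1 to execute it instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Build once with the gaggle feature, then start the Manager from the binary.
run cargo build --release --features gaggle --example "$EXAMPLE"
run "$(example_binary "$EXAMPLE")" --host http://local.dev/ -u8 --manager --expect-workers 2
```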

Now that it's working, we restart the above command, but this time including the "-v" (verbose) flag so we can see what's happening. It's already compiled above, so it doesn't have to recompile just to handle a new flag:


$ cargo run --example drupal_loadtest --release --features gaggle -- --host http://local.dev/ -u8 --manager --expect-workers 2 -v
    Finished release [optimized] target(s) in 0.09s
    Running `target/release/examples/drupal_loadtest --host 'http://local.dev/' -u8 --manager --expect-workers 2 -v`
    05:43:36 [ INFO] Output verbosity level: INFO
    05:43:36 [ INFO] Logfile verbosity level: WARN
    05:43:36 [ INFO] hatch_rate = 1
    05:43:36 [ INFO] global host configured: http://local.dev/
    05:43:36 [ INFO] initializing user states...
    05:43:36 [ INFO] manager listening on tcp://0.0.0.0:5115, waiting for 2 workers
    05:43:36 [ INFO] each worker to start 4 users

The last two lines are the ones most relevant to what we're doing here. The Manager process is listening on all interfaces (0.0.0.0) on port 5115. The Manager configures each Worker process to run 4 users; we’re using two Workers (--expect-workers is set to 2), for a total of 8 users.
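
The split is simple division: 8 users across 2 Workers gives 4 each. The sketch below reproduces that arithmetic; the users_per_worker helper is hypothetical, and handing any remainder to the first Workers is an assumption made for illustration, not confirmed Goose behavior.

```shell
# Hypothetical helper: print how many users each Worker would get.
# Any remainder goes to the first Workers (an assumption for this sketch).
# Usage: users_per_worker TOTAL_USERS WORKERS
users_per_worker() {
    total=$1
    workers=$2
    base=$((total / workers))
    extra=$((total % workers))
    i=1
    while [ "$i" -le "$workers" ]; do
        if [ "$i" -le "$extra" ]; then
            echo "worker $i: $((base + 1)) users"
        else
            echo "worker $i: $base users"
        fi
        i=$((i + 1))
    done
}

users_per_worker 8 2
# prints:
# worker 1: 4 users
# worker 2: 4 users
```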

The flock follows: Start the Workers

On the same server, start one Worker by setting the --worker flag. The Manager’s job is to aggregate metrics, meaning it's mostly idle. For that reason, it can be useful to start both the Manager and one Worker on the same server. You don't want to starve either process; ensure there are enough CPU cores for the Manager and Worker to have separate cores. By default the Worker will attempt to connect to the Manager on 127.0.0.1:5115. We’re using verbose mode to watch the process.


$ cargo run --example drupal_loadtest --release --features gaggle -- --worker -v
    Finished release [optimized] target(s) in 0.10s
    Running `target/release/examples/drupal_loadtest --worker -v`
    05:49:32 [ INFO] Output verbosity level: INFO
    05:49:32 [ INFO] Logfile verbosity level: WARN
    05:49:32 [ INFO] worker connecting to manager at tcp://127.0.0.1:5115
    05:49:32 [ INFO] waiting for instructions from manager
    05:49:33 [ INFO] initializing user states...
    05:49:33 [ INFO] [1] initialized 4 user states
    05:49:33 [ INFO] [1] waiting for go-ahead from manager

The last five lines are the relevant ones. First, the Worker connects to the Manager over TCP on localhost, port 5115 (worker connecting to manager at tcp://127.0.0.1:5115). Once connected, the Worker waits for the Manager to give instructions. When it receives instructions, the Worker initializes 4 user states (initialized 4 user states) and goes into standby mode, waiting for the Manager to give it the go-ahead (waiting for go-ahead from manager). The amount of time this takes depends on how many Workers and how many users are being initialized. When all expected Workers are in standby mode, the Manager tells them all to start.

Start a second Worker

Now, on another server we start a second Worker. This one requires more configuration, because it can't connect over localhost to the Manager on a remote server. In the run command, use the --manager-host option to set the IP address of the Manager’s server. In this case, the Worker will connect to the Manager on the Manager’s default port (5115); if the Manager is on a different port, you would specify it in the command here with the --manager-port option.


$ cargo run --example drupal_loadtest --release --features gaggle -- --worker --manager-host 10.10.3.21 -v
   Finished release [optimized] target(s) in 0.11s
   Running `target/release/examples/drupal_loadtest --worker --manager-host 10.10.3.21 -v`
    06:12:42 [ INFO] Output verbosity level: INFO
    06:12:42 [ INFO] Logfile verbosity level: WARN
    06:12:42 [ INFO] worker connecting to manager at tcp://10.10.3.21:5115
    06:12:42 [ INFO] waiting for instructions from manager
    06:12:42 [ INFO] initializing user states...
    06:12:42 [ INFO] [2] initialized 4 user states
    06:12:42 [ INFO] [2] waiting for go-ahead from manager
    06:12:43 [ INFO] [2] entering gaggle mode, starting load test
    06:12:43 [ INFO] [2] prepared to start 1 user every 2.00 seconds
    06:12:43 [ INFO] [2] launching user 1 from AnonBrowsingUser...
    06:12:45 [ INFO] [2] launching user 2 from AnonBrowsingUser...
    06:12:47 [ INFO] [2] launching user 3 from AnonBrowsingUser...
    06:12:49 [ INFO] [2] launching user 4 from AnonBrowsingUser...
    06:12:51 [ INFO] [2] launched 4 users...

This log should look very familiar; it's nearly identical to what we saw on Worker 1. The only real difference is it's prefixed with [2] instead of [1] indicating this is the log from the 2nd Worker. This is the second of two workers, so the Manager is able to give both workers the go-ahead (entering gaggle mode, starting load test). Goose starts one user per second by default. Each Worker takes a turn starting a user. Each Worker operates on its own, and sends metrics to the Manager.

Worker 1 was idle waiting for the go-ahead from the Manager. When Worker 2 started, the Manager gave both workers the go-ahead.


    06:12:42 [ INFO] [1] entering gaggle mode, starting load test
    06:12:42 [ INFO] [1] prepared to start 1 user every 2.00 seconds
    06:12:42 [ INFO] [1] launching user 1 from AnonBrowsingUser...
    06:12:44 [ INFO] [1] launching user 2 from AnonBrowsingUser...
    06:12:46 [ INFO] [1] launching user 3 from AnonBrowsingUser...
    06:12:48 [ INFO] [1] launching user 4 from AuthBrowsingUser...
    06:12:50 [ INFO] [1] launched 4 users...

Just like on Worker 2, Worker 1 is told to start 1 user every 2 seconds. Looking at the time stamps you can see Goose is starting 1 user per second, but it starts 1 on Worker 1, then a second later it starts 1 on Worker 2.
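
The interleaving can be reproduced with a small model: one launch per second overall, with the Workers taking turns, so each Worker launches a user every 2 seconds. The launch_schedule helper is hypothetical and only mimics the timestamps in the logs above; it is not how Goose schedules users internally.

```shell
# Hypothetical helper: print which Worker launches a user at each second.
# Usage: launch_schedule WORKERS USERS_PER_WORKER
launch_schedule() {
    workers=$1
    per_worker=$2
    total=$((workers * per_worker))
    n=0
    while [ "$n" -lt "$total" ]; do
        worker=$((n % workers + 1))   # Workers take turns
        user=$((n / workers + 1))     # nth user on that Worker
        echo "t+${n}s worker $worker launches user $user"
        n=$((n + 1))
    done
}

launch_schedule 2 4
```

With 2 Workers and 4 users each, Worker 1 launches at 0, 2, 4 and 6 seconds and Worker 2 at 1, 3, 5 and 7 seconds, the same pattern as the 06:12:42 through 06:12:49 timestamps.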

For a complete picture, here is what is happening on the Manager at the same time as what we see above from the Workers:


    05:49:33 [ INFO] worker 1 of 2 connected
    05:49:33 [ INFO] sending 4 users to worker 1
    06:12:42 [ INFO] worker 2 of 2 connected
    06:12:42 [ INFO] sending 4 users to worker 2
    06:12:42 [ INFO] gaggle distributed load test started

The final line indicates that the Manager has learned that all the Workers are ready, and gives them the go-ahead to start the load test.

Time for a landing

To stop the load test, type Ctrl-C on the Manager or a Worker -- in this example we did it on Worker 2:


    ^C06:13:48 [ WARN] caught ctrl-c, stopping...
    06:13:49 [ INFO] [2] stopping after 58 seconds...
    06:13:49 [ INFO] [2] waiting for users to exit
    06:13:49 [ INFO] [2] exiting user 3 from AnonBrowsingUser...
    06:13:49 [ INFO] [2] exiting user 1 from AnonBrowsingUser...
    06:13:49 [ INFO] [2] exiting user 4 from AnonBrowsingUser...
    06:13:49 [ INFO] [2] exiting user 2 from AnonBrowsingUser...
    06:13:49 [ INFO] [2] manager went away

The Worker notifies the Manager that it was told to shut down. The Manager passes this message to all other Workers (in this case to Worker 1). Worker 1 notes that it has received the command to exit.


    06:13:50 [ INFO] [1] received EXIT command from manager
    06:13:50 [ INFO] [1] stopping after 59 seconds...
    06:13:50 [ INFO] [1] waiting for users to exit
    06:13:50 [ INFO] [1] exiting user 4 from AuthBrowsingUser...
    06:13:50 [ INFO] [1] exiting user 2 from AnonBrowsingUser...
    06:13:50 [ INFO] [1] exiting user 3 from AnonBrowsingUser...
    06:13:50 [ INFO] [1] exiting user 1 from AnonBrowsingUser...
    06:13:50 [ INFO] [1] received EXIT command from manager
    06:13:50 [ INFO] [1] manager went away

And finally, the Manager itself shuts down cleanly, displaying metrics before exiting:


    06:13:50 [ INFO] worker 2 exited
    06:13:50 [ INFO] worker went away, stopping gracefully after 68 seconds...
    06:13:50 [ INFO] worker 1 exited
    06:13:50 [ INFO] all workers have exited
    06:13:50 [ INFO] printing metrics after 68 seconds...

     === PER TASK METRICS ===
     ------------------------------------------------------------------------------
     Name                 	|   # times run |      # fails |  task/s |  fail/s
     ------------------------------------------------------------------------------
     1: AnonBrowsingUser  	|
       1: (Anon) front page	|         7,166 |     	0 (0%) |  	3.97 |	0.00
       2: (Anon) node page	|         4,781 |     	0 (0%) |  	2.65 |	0.00
       3: (Anon) user page	|         1,433 |     	0 (0%) | 	0.79 |	0.00
     2: AuthBrowsingUser  	|
       1: (Auth) login    	|             1 |     	0 (0%) | 	0.00 |	0.00
       2: (Auth) front page     |           631 |     	0 (0%) | 	0.35 |	0.00
       3: (Auth) node page	|           422 |     	0 (0%) | 	0.23 |	0.00
       4: (Auth) user page	|           126 |     	0 (0%) | 	0.07 |	0.00
       5: (Auth) comment form   |           126 |     	0 (0%) | 	0.07 |	0.00
     -------------------------+---------------+----------------+----------+--------
     Aggregated           	|    	14,686 |     	0 (0%) | 	8.13 |	0.00
     ------------------------------------------------------------------------------
     Name                 	|     Avg (ms) |        Min |     Max |	Median
     ------------------------------------------------------------------------------
     1: AnonBrowsingUser  	|
       1: (Anon) front page     |   	24.04 |      	1 | 	1,632 |      	3
       2: (Anon) node page	|   	51.33 |      	1 |    11,336 |     	30
       3: (Anon) user page	|   	19.47 |      	2 |   	1,468 |     	13
     2: AuthBrowsingUser  	|
       1: (Auth) login    	|   	56.00 |        56 |         6 |     	56
       2: (Auth) front page     |   	33.48 |        17 |       530 |     	24
       3: (Auth) node page	|   	53.81 |        12 |       870 |     	28
       4: (Auth) user page	|   	16.77 |        	8 |       505 |     	12
       5: (Auth) comment form   |  	122.17|        45 |       726 |     	69
     -------------------------+-------------+------------+-------------+-----------
     Aggregated           	|   	34.52 |      	1 |  	11,336 |     	18

     === PER REQUEST METRICS ===
     ------------------------------------------------------------------------------
     Name                 	|    	# reqs |     # fails   |   req/s |  fail/s
     ------------------------------------------------------------------------------
     GET (Anon) front page	|     	7,166 |     	0 (0%) | 	3.97 |	0.00
     GET (Anon) node page 	|     	4,781 |     	0 (0%) | 	2.65 |	0.00
     GET (Anon) user page 	|     	1,433 |     	0 (0%) | 	0.79 |	0.00
     GET (Auth) comment form    |         126 |     	0 (0%) | 	0.07 |	0.00
     GET (Auth) front page	|         632 |     	0 (0%) | 	0.35 |	0.00
     GET (Auth) node page 	|         422 |     	0 (0%) | 	0.23 |	0.00
     GET (Auth) user page 	|         126 |     	0 (0%) | 	0.07 |	0.00
     GET static asset     	|      46,782 |     	0 (0%) |	25.90|	0.00
     POST (Auth) comment form   |         126 |     	0 (0%) | 	0.07 |	0.00
     POST (Auth) front page     |           1 |     	0 (0%) | 	0.00 |	0.00
     -------------------------+---------------+----------------+----------+--------
     Aggregated           	|      61,595 |     	0 (0%) |   34.11 |	0.00
     ------------------------------------------------------------------------------
     Name                 	|	Avg (ms) |    Min |       Max |    Median
     ------------------------------------------------------------------------------
     GET (Anon) front page	|    	3.17 |          1 |   	1,453 |      	1
     GET (Anon) node page 	|      51.30 |          1 |    11,336 |     	30
     GET (Anon) user page 	|      19.44 |          2 |   	1,468 |     	13
     GET (Auth) comment form    |      51.13 |   	15 |      686 |     	29
     GET (Auth) front page	|      28.65 |   	12 |      517 |     	20
     GET (Auth) node page 	|      53.77 |   	12 |      870 |     	28
     GET (Auth) user page 	|      16.75 |          8 |       505 |     	12
     GET static asset     	|    	2.44 |          1 |   	1,584 |      	1
     POST (Auth) comment form   |      67.83 |    	23 |      555 |     	39
     POST (Auth) front page     |      43.00 |    	43 |       43 |     	43
     -------------------------+-------------+------------+-------------+-----------
     Aggregated             |    	7.60 |      	1 |  	11,336 |    1
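
As a quick sanity check when reading these tables, the count in each Aggregated row is the sum of the rows above it. The sketch below adds up the two count columns; the add_up helper is hypothetical, and the numbers are copied from the output above.

```shell
# Hypothetical helper: add up a list of integers given as arguments.
add_up() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# "# times run" column from PER TASK METRICS, top to bottom.
add_up 7166 4781 1433 1 631 422 126 126
# prints 14686

# "# reqs" column from PER REQUEST METRICS, top to bottom.
add_up 7166 4781 1433 126 632 422 126 46782 126 1
# prints 61595
```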

An abbreviated flight

That’s the long version. To summarize the commands:

  1. Start the Manager, expecting two Workers: cargo run --example drupal_loadtest --release --features gaggle -- --host http://local.dev/ -u8 --manager --expect-workers 2
  2. Start Worker 1 on the same server: cargo run --example drupal_loadtest --release --features gaggle -- --worker -v
  3. Start Worker 2 on a different server, pointed at the Manager server: cargo run --example drupal_loadtest --release --features gaggle -- --worker --manager-host 10.10.3.21 -v
  4. When ready, Ctrl-C the Manager or any of the Workers. Alternatively, use the --run-time option when starting the Manager process to have Goose automatically stop after a predefined amount of time.
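
The same steps can be collected into a small dry-run script that only prints the command for each role. This is a sketch: the manager_cmd and worker_cmd helpers are hypothetical, and the host, user count, and IP address are the values used in this walkthrough.

```shell
# Cargo invocation shared by every process in this walkthrough.
CARGO_RUN="cargo run --example drupal_loadtest --release --features gaggle --"

# Usage: manager_cmd HOST USERS WORKERS
manager_cmd() {
    echo "$CARGO_RUN --host $1 -u$2 --manager --expect-workers $3"
}

# Usage: worker_cmd [MANAGER_HOST]
worker_cmd() {
    if [ -n "${1:-}" ]; then
        echo "$CARGO_RUN --worker --manager-host $1 -v"
    else
        # No host given: the Worker defaults to 127.0.0.1:5115.
        echo "$CARGO_RUN --worker -v"
    fi
}

manager_cmd http://local.dev/ 8 2   # run on the Manager server
worker_cmd                          # Worker 1, same server as the Manager
worker_cmd 10.10.3.21               # Worker 2, on a different server
```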

These commands cover just the very basics of using Goose and its Gaggle feature to run a distributed load test. See the Goose documentation for more details and additional features.
