Running Optimizer Studio in Concurrent mode

Overview

Each experiment that tests a knob configuration can run for a long time, hours or even days, which makes the whole optimization process very long.

Optimizer Studio can distribute the workload to different computers in order to speed up the entire optimization process.

Limitations

Parallel processing has some limitations:

  1. As Optimizer Studio progresses with the optimization, it suggests new knob configurations for testing. Most of the time, there are several knob configurations that it tries to explore. Usually, the number of such configurations is less than 10. Only this set of knob configurations can be executed in parallel, so a larger number of computers cannot be utilized.

  2. The computers running the workload should be very similar and provide similar performance results for the same knob configuration, with a reasonable standard deviation. Otherwise, Optimizer Studio can't determine a representative performance metric for each configuration.

  3. In particular, at the beginning of the optimization, Optimizer Studio runs the workload with the baseline (initial) knob configuration, which should provide similar results on each parallel computer. If Optimizer Studio can't calculate the correct baseline performance, it can't proceed with the rest of the optimization process.
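
One way to sanity-check limitations 2 and 3 before a long run is to execute the baseline workload once on each candidate computer and compare the results. The following is a minimal sketch, assuming the metric values have already been collected by hand. The helper name rel_stddev and the sample values are hypothetical and not part of Optimizer Studio, and what counts as a "reasonable" spread is left to the user.

```shell
#!/bin/sh
# Hypothetical helper, not part of Optimizer Studio.
# Prints the population standard deviation of its arguments
# as a percentage of their mean.
rel_stddev() {
    printf '%s\n' "$@" | LC_ALL=C awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            # Guard against empty input or a zero mean.
            if (n == 0 || sum == 0) { print "n/a"; exit }
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0   # absorb floating-point rounding
            printf "%.2f\n", 100 * sqrt(var) / mean
        }'
}

# Example: baseline metric measured once on each of three computers
# (made-up values).
rel_stddev 100 102 98
```

A small percentage suggests the computers are similar enough to be used together; a large one suggests that results from different workers would not be comparable.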

Concurrent Optimization Workflow

  1. The user designates a main computer to run Optimizer Studio on, and one or more worker computers.
  2. The user installs the Optimizer Studio package on each computer, main and workers.
  3. The user installs the workload scripts and related files on each computer, in the same directories on every computer.
  4. The user sets up a password-less ssh connection from the main computer to the workers, so that logging in does not ask for credentials interactively; an interactive prompt may prevent the automated session from proceeding. The recommended way is to use SSH keys with no passphrase.
  5. The user starts optimizer-studio with the additional parameter --workload.
  6. The optimizer-studio script runs the workload starter script, using the parameters passed in the --workload command line parameter.
  7. The workload starter script runs optimizer-studio.worker scripts on the remote computers using ssh.
  8. All terminal output from the remote computers is transferred by ssh to the main computer's terminal.
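
Step 4 can be verified before starting an optimization. The sketch below assumes the standard OpenSSH client tools (ssh-keygen, ssh-copy-id, ssh) are available on the main computer. The function name check_workers, the key path, and the user and host names are illustrative assumptions and not part of Optimizer Studio.

```shell
#!/bin/sh
# One-time setup, run manually on the main computer (illustrative values):
#   ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519   # key with no passphrase
#   ssh-copy-id user@192.168.1.1                        # repeat for each worker

# Hypothetical helper: reports whether each worker accepts a
# non-interactive login. BatchMode=yes makes ssh fail instead of
# prompting for a password or passphrase.
check_workers() {
    for host in "$@"; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
        fi
    done
}

# Usage: check_workers 192.168.1.1 192.168.1.2
```

Any worker reported as FAIL would most likely also stall or break the automated session started by the workload starter script.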

Firewall Configuration

Remote Optimizer Studio workers communicate with the main Optimizer Studio engine using an HTTP REST API. Therefore, the communication port must be open in the main computer's firewall. The default HTTP port is 8421. The port can be changed with the --http-port command line parameter, as in the following example:

optimizer-studio ... --http-port=12345

Optimizer Studio Configuration in Concurrent Mode

--workload Command Line Parameter

The --workload parameter of optimizer-studio is what triggers a run in concurrent mode:

optimizer-studio --knobs=<knobs-file> --workload='<starter-script and parameters>'

Here, <knobs-file> is a regular knobs definition YAML file in which the workload_settings section is defined.

Workload Starter Script

Optimizer Studio comes with two different starter scripts: optimizer-studio.starter.ssh and optimizer-studio.starter.local, which are located in the main Optimizer Studio directory. Users can write their own scripts according to their environment, using the existing scripts as an example.

Each starter script can accept its own parameters.

The first argument of the --workload parameter is either the name of a standard starter script or the full path of a custom starter script. The remaining arguments are passed to the starter script unchanged.

For example:

--workload='ssh 192.168.1.1 192.168.1.2'

uses the optimizer-studio.starter.ssh script and passes the IP addresses of the two worker computers to it.

--workload='local 4'

uses the optimizer-studio.starter.local script, which runs 4 worker processes on the local (main) computer.
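
When the list of workers changes often, the full command line can be assembled from a knobs file name and a list of worker addresses. The following is a dry-run sketch that only prints the command. The function name build_cmd and the file name knobs.yaml are hypothetical; the --knobs and --workload parameters are the ones described above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) an optimizer-studio
# command line for the standard ssh starter script.
build_cmd() {
    knobs_file="$1"
    shift
    # "$*" joins the remaining arguments (worker addresses) with spaces.
    printf "optimizer-studio --knobs=%s --workload='ssh %s'\n" "$knobs_file" "$*"
}

# Print the command for two workers; review it, then run it by hand.
build_cmd knobs.yaml 192.168.1.1 192.168.1.2
```

Printing the command instead of running it keeps the quoting of the --workload value visible, which is where mistakes are easiest to make.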

Workload Section of the knobs.yaml File

The recommended way to define the workload for concurrent optimization is the workload_settings section of the knobs.yaml file. This way, the arguments to the workload script stay in the file and don't clutter the optimizer-studio command line. You don't have to define multiple workloads, validation rules or multi-objective optimization in this case.

Example:

workload_settings:
  workloads:
    -
      script: ./workload.sh
      metrics:
        my_target: /tmp/my_target.${WORKER_ID}

  target: my_target

When defining the .yaml file and the workload script, it is necessary to use the ${WORKER_ID} shell variable in temporary file names in order to distinguish between different worker runs.

The corresponding workload script would contain the following line:

echo ${result} > /tmp/my_target.${WORKER_ID}
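
Put together, a complete workload.sh matching the example above could look like the following minimal sketch. The measure function and its constant value are placeholders for the real benchmark, and the fallback WORKER_ID of 0 is an assumption made only so that the script can be tried standalone; in a concurrent run the variable is expected to be provided by the worker environment.

```shell
#!/bin/sh
# Hypothetical workload.sh sketch.
# Fallback value is an assumption for running the script by hand.
WORKER_ID="${WORKER_ID:-0}"

# Placeholder measurement: replace with the real benchmark command,
# which should print a single number.
measure() {
    echo 42
}

result="$(measure)"

# Write the metric to a per-worker file so that concurrent workers
# on the same computer do not overwrite each other's results.
echo "${result}" > "/tmp/my_target.${WORKER_ID}"
```

The output file name matches the my_target entry in the workload_settings example, so each worker reads back only the value it produced.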