Overview

Configuration files of Optimizer Studio are located in the installation directory /opt/concertio-optimizer/studio/. You can specify your own configuration files via command line switches.
There are two configuration files:

  1. knobs.yaml - Specifies the knobs, input metrics and target function of Optimizer Studio.
  2. settings.yaml - Specifies the system-wide settings of Optimizer Studio.

Optimization Target

Users can configure Optimizer Studio to search for the optimal parameters that maximize (or minimize) a specific Target Metric.
The target metric can be defined as follows:

domain:
  common:
    target:
      name: performance
      goal: max

or in shorthand:

domain:
  common:
    target: performance:max

The target metric name can refer to any of the sampled metrics, such as proc.diskstats.sda.sectors_written. Metric definitions are detailed further below.
If the optimization goal is omitted, it defaults to max.

Predefined Optimization Target

While the target metric can refer to any of the sampled metrics, Optimizer-Studio also supports a list of predefined targets:

Name | Default Goal | Meaning
duration | min | Total workload execution time.
performance | max | Total number of retired instructions (msr.inst_retired_all) measured in all CPU cores per second.
net.throughput | max | Total number of bytes transferred over the network per second.

Optimization Target given by an expression

It is also possible to define the optimization target as an expression (formula):

domain:
  common:
    target:
      name: optimization_target
      goal: max
      expression: "{{ metric1 / (metric2 + 1) }}"

In the example above, the target metric is given by the ratio between metric1 and (metric2 + 1) values averaged over min_samples_per_config samples. The stability of the ratio is tested against CV (coefficient of variation) given by max_config_cv.
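
As a rough illustration of the averaging and stability test described above, the sketch below evaluates the expression once per sample, averages the results, and compares their coefficient of variation with a threshold. It is not Optimizer Studio's implementation: the per-sample evaluation order, the use of the sample standard deviation, and the values chosen for min_samples_per_config (3) and max_config_cv (0.2) are assumptions made for this example.

```python
from statistics import mean, stdev

def target_and_cv(samples):
    """Return (average target, coefficient of variation).

    samples is a list of (metric1, metric2) pairs, one pair per sample.
    """
    # Evaluate metric1 / (metric2 + 1) for every sample (assumed order).
    values = [m1 / (m2 + 1) for m1, m2 in samples]
    average = mean(values)
    # CV = standard deviation divided by the mean.
    cv = stdev(values) / average if len(values) > 1 and average else 0.0
    return average, cv

# Three hypothetical samples, i.e. min_samples_per_config = 3.
samples = [(100.0, 9.0), (110.0, 9.0), (90.0, 9.0)]
average, cv = target_and_cv(samples)

MAX_CONFIG_CV = 0.2  # hypothetical max_config_cv value
is_stable = cv <= MAX_CONFIG_CV
print(average, cv, is_stable)
```

A configuration whose samples vary more than the threshold allows would presumably need more samples before its target value is trusted.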

knobs.yaml: knob definition

Knobs are tunable parameters of the system which Optimizer Studio will try to change in order to optimize the system.

Concertio Optimizer Studio ships with an embedded set of knob definitions containing many knobs that were tested and benchmarked by Concertio engineers. To use these knobs, specify the --knobs=embedded command line switch.

It is possible to provide additional knob files by using the --knobs /path/to/knob-file.yaml command line switch, where you can add custom knobs that are relevant for your system. If you don't supply your knobs file, the embedded knobs will be used automatically.

Optimizer Studio accepts any number of --knobs PATH command line switches. In that case, Optimizer Studio loads the knob files in the order in which they appear on the command line, so a knob file that appears later can override any settings from previous knob files. Note that this does not apply to the embedded knobs, which are always processed first.
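
The loading order can be pictured with a small sketch. It assumes that each knob file has already been parsed into a dictionary of knob name to settings, and that overriding happens per setting; both are assumptions for illustration, and the knob names and values are hypothetical.

```python
def merge_knob_files(embedded, user_files):
    """Merge knob definitions: embedded knobs first, then each user file in order."""
    merged = {name: dict(settings) for name, settings in embedded.items()}
    for knobs in user_files:
        for name, settings in knobs.items():
            # A later file overrides individual settings of an earlier definition.
            merged.setdefault(name, {}).update(settings)
    return merged

# Hypothetical contents of the embedded knobs and two --knobs files.
embedded = {"io.scheduler": {"kind": "script", "options": ["noop", "cfq"]}}
first_file = {"my_knob": {"options": [1, 2]}}
second_file = {"my_knob": {"options": [1, 2, 3]}, "io.scheduler": {"disable": ""}}

merged = merge_knob_files(embedded, [first_file, second_file])
print(merged)
```

Here the second file both extends my_knob and disables an embedded knob, without touching the settings it does not mention.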

knobs section: adding a Knob

It is possible to add a new knob or override an existing one by adding a knob section to the knobs list. Each knob section contains a few values and POSIX shell scripts used for defining and manipulating the knob.

Knob kinds

Each knob belongs to one of the supported knob kinds: scripted, environment, accelerator.
The main difference between the knob kinds is the way the knob is applied.
Below is a list of the supported knob kinds:

  1. scripted knobs
    The scripted knob is applied via executing a POSIX shell script with $KNOB_VAL, $KNOB_DEV environment variables.

    • kind: script
    • get_script - returns the current knob value, which is used as the baseline
    • set_script - sets a new knob value
    • pre_set_script - when provided, this script is run prior to running any of the set_script(s)
      Note that in case more than one knob refers to the same pre_set_script, identical script invocations will be merged.
    • post_set_script - when provided, this script is run after running any of the set_script(s)
      Note that in case more than one knob refers to the same post_set_script, identical script invocations will be merged.
    • device_script - when provided, returns the list of applicable devices, and the knob is split into multiple per-device knobs, e.g.
      io.scheduler → io.scheduler.sda, io.scheduler.sdb, io.scheduler.sdc
    • Environment variables observable by the knob scripts:
      $KNOB_VAL - knob value being set by set_script
      $KNOB_DEV - device name, in device-specific knobs, as returned by device_script
  2. environment knobs (occasionally referred to as memory knobs)
    The environment knobs are exported as POSIX environment variables observable by the workload. An environment knob name must comply with the POSIX variable naming rules.

    • kind: env
  3. accelerator knobs
    The accelerator knobs encompass all the knob kinds passed directly to Concertio accelerator.

    • kind: accel.mem_alloc - provides a selection of malloc implementations

General knob parameters

Below is a list of the general purpose parameters supported by all the knob kinds:

  1. kind - knob kind. When omitted, Optimizer software will attempt to identify the knob kind by presence or absence of kind-specific parameters.
  2. enable_script - when provided, an output of 0 or false indicates that the knob should be discarded; any other value enables the knob. When not provided, the knob is enabled by default.
  3. disable - when provided, the knob will be discarded and erased from internal data structures. This parameter overrides enable_script.
  4. options - a list of available knob options. Options definition is detailed below.
  5. default - when provided, sets the baseline to the value provided. This option is mutually exclusive with get_script.
    Note that when neither default nor get_script is specified, the first option in the options list is used as the baseline value.
  6. set_policy - knob setting policy:

    • on_change (default) - each time the internal algorithm needs to change a knob, its value is applied.
    • once:<value> - the value is applied at the beginning of the run, and is reverted to baseline at the end.
    • always - the current knob value is applied on each sample.

    Once and always setting policies do not participate in the optimization process, and can be considered auxiliary utilities.
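
The baseline rules from item 5 above (default, get_script, or the first option) can be summarized in a few lines. This is only a sketch of the documented precedence; the function name and the way the get_script output is passed in are hypothetical.

```python
def pick_baseline(options, default=None, get_script_output=None):
    """Choose the baseline value of a knob following the documented rules."""
    if default is not None and get_script_output is not None:
        raise ValueError("default and get_script are mutually exclusive")
    if default is not None:
        return default
    if get_script_output is not None:
        return get_script_output
    # Neither default nor get_script: fall back to the first option.
    return options[0]

print(pick_baseline([10, 15, 20], default=15))
print(pick_baseline([1, 2, 3, 4, 5]))
```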

Knob options definition

The simplest way to define knob options is to list all valid values explicitly:

options:
  values: [10, 15, 20]
default: 15

The second way is to define an options script that returns the list of option values. For example, the following script creates a list of numbers from 0 to 7 on a computer with 8 logical CPUs:

options:
  script: cat /proc/cpuinfo | awk '/processor/ {print $3}'

Yet another possibility is to define a list of options using numeric range:

options:
  range:
    min: 1
    max: 16
    step: 1
    format: --num-threads=%d
default: 8

This definition creates the list of options --num-threads=1, --num-threads=2, and so on.
The printf-style format field is optional. It contains a string with one of the supported format specifiers, as in the following examples:

Format | Numeric Value | Knob Value
"-O%d" | 2 | -O2
"--pi=%.2f" | 3.14159 | --pi=3.14
"--pi=%.2f" | 3 | --pi=3.00
"%g" | 5000000 | 5000000
"%g" | 2.718000 | 2.718
"%x" | 2047 | 7ff
"%X" | 2047 | 7FF

You can combine different ways of specification:

options:
  values: [ "" ]
  range:
    min: 1
    max: 16
    step: 1
    format: --num-threads=%d

The example above adds an empty option value in addition to the ones defined by the range.
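
A sketch of how such a combined definition might expand into a flat option list is shown below. It assumes that explicit values come first, that the range is inclusive of max, and that step defaults to 1; none of these details is confirmed by the text above. Python's printf-style % operator stands in for the formatter and may not match Optimizer Studio's output for every specifier.

```python
def expand_options(spec):
    """Expand an options mapping (values and/or range) into a list of option values."""
    options = list(spec.get("values", []))
    rng = spec.get("range")
    if rng:
        fmt = rng.get("format")
        step = rng.get("step", 1)  # assumed default
        value = rng["min"]
        while value <= rng["max"]:  # assumed inclusive upper bound
            options.append(fmt % value if fmt else value)
            value += step
    return options

spec = {
    "values": [""],
    "range": {"min": 1, "max": 4, "step": 1, "format": "--num-threads=%d"},
}
print(expand_options(spec))
```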

Knob dependency

In practice, some parameters depend on others, and this dependency has to be considered when defining and optimizing knobs. Optimizer Studio supports knob dependency; a simple example of dependency handling between knobs follows.

In this example, knob Y depends on knob X in the following manner:

  1. 1 <= X < 50: 1 <= Y <= 150
  2. 50 <= X <= 100: Y = 1

Given Z = X + Y, the maximum value of Z is achieved when {X, Y} = {49, 150}.

The dependency is maintained via shell environment variable BIGX.
BIGX is set each time X is written back.
BIGX is consulted each time Y is written back.

In order for this dependency to be properly maintained:

  1. Knob X has to be written back prior to knob Y - the knobs are ordered alphabetically
  2. Knob Y has to be written back each time knob X is written back - this is controlled by 'dependency' section in knob Y definition

The knob definition for this example is available at /opt/concertio-optimizer/studio/examples/knob-dependency.
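
The stated optimum can be checked with a short brute-force search over the allowed (X, Y) pairs. The sketch below only encodes the two dependency rules listed above; it does not model how Optimizer Studio searches or how BIGX is passed between scripts.

```python
def allowed_y(x):
    """Return the values knob Y may take for a given value of knob X."""
    if 1 <= x < 50:
        return range(1, 151)   # 1 <= Y <= 150
    if 50 <= x <= 100:
        return range(1, 2)     # Y = 1
    raise ValueError("X must be between 1 and 100")

# Enumerate every valid pair and keep the one with the largest Z = X + Y.
best_z, best_x, best_y = max(
    (x + y, x, y) for x in range(1, 101) for y in allowed_y(x)
)
print(best_z, best_x, best_y)
```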

Disabling a knob

It is possible to disable a knob, either temporarily or permanently, without actually removing its definition from the knobs file.
To temporarily disable a knob, add to the knob section the following value:

skip_tuning: ""

To permanently erase a knob from the internal data structures, either add to the knob section the following value:

disable: ""

or define an appropriate enable_script:

enable_script: "echo -n 0"

Note that this approach allows disabling embedded knobs as well. For example:

domain:
  common:
    knobs:
      kernel.io_scheduler:  ## The name of the knob you wish to disable
        disable: ""

Examples

A: vm.dirty_background_ratio

domain:
  common:
    knobs:
      sys.vm.dirty_background_ratio:
        kind: script
        description: "The number of pages at which the pdflush threads begin writeback of dirty data."
        get_script: "/sbin/sysctl -n vm.dirty_background_ratio"
        set_script: "/sbin/sysctl -w vm.dirty_background_ratio=$KNOB_VAL"
        options: [10, 15, 20]

B: io.scheduler

Below is a more complex knob example of a device-specific knob. Eventually, this knob definition produces multiple knobs: io.scheduler.sda, io.scheduler.sdb etc.

domain:
  common:
    knobs:
      io.scheduler:
        kind: script
        description: "Block device scheduling algorithm selector."
        get_script: "cat /sys/block/$KNOB_DEV/queue/scheduler | sed 's/.*\\[\\([^]]*\\)\\].*/\\1/g'"
        set_script: "echo $KNOB_VAL > /sys/block/$KNOB_DEV/queue/scheduler"
        device_script: "ls -d /sys/block/sd* | cut -d/ -f 4"
        options:
          script: cat /sys/block/$KNOB_DEV/queue/scheduler | sed 's/\[\|\]//g ; s/ $//g ; s/\s/\n/g'

Passing knob values to the workload

The easiest way of passing knob values to applications is via the shell environment using environment knobs. Below is an example of a simple knob with five values:

domain:
  common:
    knobs:
      my_knob:
        options: [1,2,3,4,5]

Notice that no set_script or get_script are defined for the environment knob. The workload scripts will receive $my_knob as an environment variable:

#!/bin/bash
executable_name ${my_knob}

In the above case, the baseline value of my_knob is 1 as it is the first value in the options list.

It is also possible to use a scripted knob to pass values of application knobs. One way to achieve this is using the filesystem. For example, a knob's set script can write a value into /tmp/knob_file:

domain:
  common:
    knobs:
      my_knob:
        get_script: "cat /tmp/knob_file"
        set_script: "echo $KNOB_VAL > /tmp/knob_file"
        options: [1,2,3,4,5]

Then, the workload script can read this value when invoked. For example:

#!/bin/bash
my_parameter=$(cat /tmp/knob_file)
executable_name ${my_parameter}

In order for the above to work, the file /tmp/knob_file should be populated, for example by:

$ echo "1" > /tmp/knob_file

The workload script can also read the knob names and values through an associative array. This is useful when experimenting with numerous knobs in the configuration files because the workload script can detect which knobs have been defined. Below is an example:

# source studio functions
. /opt/concertio-optimizer/studio/studio-functions.bash
# call memory knobs associative array function
get_memory_knobs_assoc
for K in "${!ASSOC_KNOBS_ARRAY[@]}"; do
        args+="--$K=${ASSOC_KNOBS_ARRAY[$K]} "
done
executable_name $args

The code in knobs.yaml can be tested as follows:

$ optimizer-studio --knobs=knobs.yaml --testknob=my_knob
I[3857][12:34:41.278] Concertio Optimizer, version 2.5.0
I[3857][12:34:41.278] License expiration date: January 1, 2021
I[3857][12:34:41.285] Knob my_knob: set value: 1 --> 2 
I[3857][12:34:41.287] Knob my_knob: set value: 2 --> 3 
I[3857][12:34:41.288] Knob my_knob: set value: 3 --> 4 
I[3857][12:34:41.290] Knob my_knob: set value: 4 --> 5 
I[3857][12:34:41.292] Knob my_knob: set value: 5 --> 1 [baseline]
E[3857][12:34:41.293] Knob my_knob test: success
Knob my_knob test: success

You can also test all knobs by using the option --testknob=all:

$ optimizer-studio --knobs=knobs.yaml --testknob=all

By default, Optimizer Studio performs a silent test of all knobs prior to optimization tasks. If you wish to skip this test (not recommended) you can do so by using the --testknob=none option:

$ optimizer-studio --knobs=knobs.yaml ./my_workload.sh --testknob=none

metrics section: named HW and SW metrics

Metrics are used by Optimizer Studio to learn about the system behavior and to detect different phases of execution. Optimizer Studio will then attempt to find an optimal knob configuration that maximizes a certain sampled metric for each phase. The metrics are sampled periodically.

Metrics definition

Comma separated regular expressions define which metrics are sampled, and which metrics are excluded.

domain:
  common:
    include_metrics: [msr.*, proc.*]
    exclude_metrics: [proc.diskstats.sda.sectors_written]

In the above example, all msr metrics and all proc metrics, except for proc.diskstats.sda.sectors_written, will be considered by Optimizer Studio for learning about the system behavior.
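
The effect of the include and exclude lists can be sketched as follows. The example assumes that each pattern must match the whole metric name; whether Optimizer Studio uses full or partial matching is not stated here. Apart from the two metric names taken from this page, the names are hypothetical.

```python
import re

def select_metrics(names, include, exclude):
    """Keep names matching any include pattern and no exclude pattern."""
    def matches(name, patterns):
        return any(re.fullmatch(pattern, name) for pattern in patterns)
    return [n for n in names if matches(n, include) and not matches(n, exclude)]

names = [
    "msr.inst_retired_all",
    "proc.diskstats.sda.sectors_written",
    "proc.diskstats.sda.sectors_read",   # hypothetical metric name
    "net.throughput",
]
selected = select_metrics(
    names,
    include=["msr.*", "proc.*"],
    exclude=["proc.diskstats.sda.sectors_written"],
)
print(selected)
```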

User-defined metrics

Optimizer Studio supports user-developed plugins for sampling custom metrics.

Importing Configuration Files

It is possible to import configuration files by using the import directive. For example, the default embedded knobs of Optimizer Studio can be imported as follows:

import:
  optimizer.studio:
domain:
  common:
    knobs:
...

Other embedded knob definitions will always have the optimizer. prefix. In order to import a configuration file from the filesystem, its yaml extension should be removed and slashes (/) need to be converted into dots (.). For example, my_configurations/my_software_knobs.yaml will be imported as my_configurations.my_software_knobs.
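
The naming rule for filesystem imports amounts to a simple string transformation, sketched below. The function name import_name is hypothetical, and the sketch covers only the rule as stated (drop the yaml extension, turn slashes into dots).

```python
def import_name(path):
    """Convert a relative YAML file path into the name used under import:."""
    suffix = ".yaml"
    if path.endswith(suffix):
        path = path[: -len(suffix)]
    return path.replace("/", ".")

print(import_name("my_configurations/my_software_knobs.yaml"))
```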

Embedded knob categories supported by Optimizer-Studio:

knobs category | import syntax | description | comments/limitations
mellanox | import optimizer.mellanox.connectx3 | Mellanox Connect-X 3,4 and 5 NIC cards | optional args: MELLANOX_DEVICES. works on bare metal machines only. NICs are detected automatically if not provided
solarflare | import optimizer.solarflare.knobs | SolarFlare Onload NIC cards | provides all onload tuning knobs as environment variables
intel | import optimizer.intel.msrs | Intel CPUs msr tuning parameters | works on bare metal machines only. disabled automatically on unsupported platforms
java | import optimizer.jvm.jvm-[7, 8, 9, 11] | Java Virtual Machine tunables | JRE or JDK to be installed
nginx | import optimizer.nginx.nginx | NGINX Web and Proxy server tunables | required args: NGINX_CONF_FILE - for location of nginx.conf file
mysql | import optimizer.mysql.mysql | MySQL 5.7.8 and above system and caching tunables | required args: MYSQL_CONF_FILE - for location of mysqld.cnf. currently assumes mysql client and server installed on the same machine
mongodb | import optimizer.mongodb | MongoDB 4.x and above tunables | assumes mongo client and server installed on the same machine
php | import optimizer.php.php7 | PHP 7 tuning parameters | required args: PHP_CONF_FILE - for location of php.ini
apache2 | import optimizer.apache.apache2 | Apache 2.x web server tuning parameters | required args: APACHE_CONF_FILE - for location of mpm_prefork.conf file
postgresql | import optimizer.postgresql.v11 or v10 | PostgreSQL 10 and 11 best practice tuning parameters | required args: POSTGRESQL_CONF_FILE - for location of postgresql.conf file
openmpi | import optimizer.posgresql.openmpi.mca | OpenMPI Modular Component Architecture (MCA) tuning parameters | required args: MCA_CONF_FILE - for location of mca-params.conf
network | import optimizer.network | Operating System (Linux) system level network tunables |
hhvm | import optimizer.hhvm.hack | HHVM benchmark tuning | required args: HHVM_CONF_FILE - for the server.ini file path. HHVM must be installed. supports hhvm version 4.6.0
hadoop yarn | import optimizer.hadoop.yarn | tuning parameters for Hadoop Yarn Cluster | required args: MAPRED_CONF_FILE - to point to Yarn config file path
hadoop spark | import optimizer.hadoop.spark | tuning parameters for Hadoop Spark Cluster | required args: SPARK_CONF_FILE - to point to Spark config file path
gcc | import optimizer.compilers.gcc.[4-7-0, 4-8-0, 4-9-0, 5-3-0, 7-1-0, 10-1-0] | all GCC compilation flags tuning parameters | gcc of supported version to be installed
llvm | import optimizer.compilers.llvm.4-0-0 | all LLVM compilation flags tuning parameters compatible with version 4 | llvm of supported version to be installed - tested up to version 9
icc | import optimizer.compilers.icc | all compilation flags tuning parameters | icc of supported version to be installed - tested up to version 19.1
aocc | import optimizer.compilers.aocc | all compilation flags tuning parameters | aocc of supported version to be installed - tested using AOCC version 2.0.0

Example for import with configuration file as argument:

import:
  optimizer.postgresql.v11:
    args:
      POSTGRESQL_CONF_FILE: /etc/postgres/postgresql.conf

See the examples folder under the optimizer-studio folder for many embedded knob examples with self-documented knobs files.

We are working continuously to add support for more projects.

Filtering knobs from imported files

Specific knobs can be selected from imported files using regular expressions. In the following example, all embedded knobs are imported, except those that have "net" in their names:

import:
  optimizer.studio:
    include_knobs: [ .* ]
    exclude_knobs: [ .*net.* ]

File-specific directives

Configuration files can have their own enable and onload directives, as shown in the following example:

import:
  my_example_import:
enable:
  script: echo 1
onload:
  source: my_script.sh
domain: ...

File-specific enable

The enable directive determines whether the configuration file should be loaded. It can be a scalar value (enable:), a script, or a sourced script. A script can be defined as follows:

enable:
  script: echo 1

A sourced script can be defined as follows:

enable:
  source: my_filesystem_script.sh

In all of these cases, if 1 is returned, the configuration file is loaded. Otherwise it is skipped.

Passing arguments to the enable scripts of imported files

Passing arguments as environment variables is possible using the args directive:

import:
  my_example_import:
    args:
      MY_ENV_VARIABLE: value

The parameter can then be used in my_example_import.yaml's enable script as follows:

enable:
  script: echo ${MY_ENV_VARIABLE}

File-specific onload

When a configuration file is found to be enabled, the onload script is invoked. It can either be sourced (using source:) or in-lined (using script:).

Invalid Knob Configuration Management

From time to time, a workload may reject some knob values as invalid and return an invalid (NaN) value for the target metric. Optimizer Studio does not consider such knob configurations for the final results. However, the same knobs may be part of other configurations, and an additional run of the workload with such a configuration would most probably return an invalid result as well.

Note that sometimes a single knob is invalid in every configuration in which it participates. There may also be invalid combinations of knobs, for example a pair of knobs that are incompatible with each other when they appear together in a configuration.

In order to prevent the repeated use of invalid knobs, Optimizer Studio provides a knob config validation mechanism. This mechanism can blacklist a knob and reuse the blacklist in subsequent runs of the same experiment. Knobs found to be invalid are saved in a knob config validity file, in YAML format, which is automatically used and updated in subsequent experiment runs.

Besides the knob blacklist, the configuration validity file also stores a configuration whitelist, which contains configurations that have already produced valid results, so that their knob combinations need not be validated again.

Config validation is turned off by default. In order to turn it on, include the following in your main YAML configuration file:

domain:
  common:
    config_validation: /path/to/validity_file.yaml

To switch the invalid combinations search mechanism off explicitly, use:

domain:
  common:
    config_validation: off

If the full path to the validity file is omitted, the Optimizer Studio working directory (~/.concertio by default) is assumed. Normally the file does not exist before the first run, so it is created after the first run completes.
The file is formatted as follows:

invalid_configs:
  -
    - A
    - B
  -
    - A
    - C
valid_configs:
  -
    - B
    - C
  -
    - B
    - D
  -
    - D
    - E
    - F

At the end of the Optimizer Studio session, the validity file is updated with the newly found combinations, so that it can be reused in the next run. If problems arise while saving the file, e.g. due to file permissions, the file is stored in a temporary directory, and its path is reported in the optimizer log.

If you can prepare the list of invalid knobs in advance, you can add it to the config_validation section in your main YAML file:

domain:
  common:
    config_validation:
      file: /path/to/validity_file.yaml
      invalid_configs:
        - [A, B]
        - [A, C]
      valid_configs:
        - [B, C]
        - [B, D]
        - [D, E, F]

In the case above, the invalid and valid combination lists are used as the initial seed as long as the validity file has not been generated. Once the validity file is present, it takes precedence over the initial seed.

Invalidation approach

By default, entire knob configurations are blacklisted, i.e.

invalid_configs:
  - [A, B]

means that any configuration where both knobs A and B are not at baseline is considered invalid.
If higher resolution is required, combinations of specific knob values can be invalidated instead:

invalid_configs:
  - A: 2
    B: 3

The two operating modes above can be selected via the invalidate_by setting (the default value is marked with ^^^^):

domain:
  common:
    config_validation:
      invalidate_by: knob | option
                     ^^^^
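
The difference between the two modes can be illustrated with a small membership check. In this sketch a list entry such as [A, B] is treated as knob-level invalidation (both knobs away from baseline), and a mapping such as {A: 2, B: 3} as option-level invalidation (exact values); the function name and the sample values are hypothetical, and the real matching logic may be more involved.

```python
def is_blacklisted(config, baseline, invalid_configs):
    """Return True if config matches any entry of invalid_configs."""
    for entry in invalid_configs:
        if isinstance(entry, dict):
            # invalidate_by: option - every listed knob has the listed value.
            if all(config.get(knob) == value for knob, value in entry.items()):
                return True
        else:
            # invalidate_by: knob - every listed knob differs from its baseline.
            if all(config.get(knob) != baseline.get(knob) for knob in entry):
                return True
    return False

baseline = {"A": 1, "B": 1, "C": 1}
by_knob = [["A", "B"]]
by_option = [{"A": 2, "B": 3}]

print(is_blacklisted({"A": 2, "B": 4, "C": 1}, baseline, by_knob))
print(is_blacklisted({"A": 2, "B": 4, "C": 1}, baseline, by_option))
```

The same configuration is rejected in knob mode but still allowed in option mode, which is the higher resolution mentioned above.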

Selective invocation of active search for invalid combinations

When config validation is on, upon encountering an invalid configuration, Optimizer will invoke the active search for minimal invalid knob combinations.
Empirical study shows that some kinds of invalid configurations are better excluded from this rule, in particular a workload that exited with a non-zero exit code or a workload that timed out (i.e. was killed by Optimizer Studio for not completing within the defined time frame).
This is controlled via a dedicated flag (true by default) that directs Optimizer to exclude invalid configurations that stem from a workload that returned an error code or timed out:

domain:
  common:
    config_validation:
      exclude_error: true | false
                     ^^^^

Workload declarative definition

To better control the integration of workload execution with Optimizer-Studio, it is advised to use the workload declarative syntax in the experiment definition (a.k.a. knobs.yaml). The workload definition describes, in a structured way, how Optimizer-Studio should execute, control and monitor the given workload. Many of the examples included in the Optimizer-Studio package demonstrate this approach. Here are the currently supported properties of a workload definition:

workload:
  kind: sync                    # Optional. The mechanism by which Optimizer-Studio interacts with the workload. Options: `sync`, `async`, `accel`. `sync` - run the workload to completion in each sample. `async` - run the workload in the background and sample it asynchronously. `accel` - accelerate (event-driven) mode. Default: sync
  start_command: ./workload.sh  # Required. The workload command or script that Optimizer-Studio shall execute
  stop_command: killall my_app  # Optional. A stopping command, for workloads that are not designed to quit by themselves in each execution
  on_config_change: make        # Optional. In use cases such as compiler flag mining, there is no need to re-run the compilation part of the workload when the configuration has not changed. This parameter defines a pre-workload task that runs only when the configuration has changed since the last sample.
  timeout: 10m                  # Optional. A timeout in XmYhZs format (minutes, hours, seconds) for how long to wait for the workload (`start_command`) to complete before stopping it with the `stop_command` (if `stop_command` is not provided, the process is killed with SIGKILL). Default: no timeout

Starting with Optimizer-Studio version 3.6.0, a more advanced syntax supports finer-grained control over timeout functionality and async mode (timeout per command). The advanced format is described by the following examples:

async mode:

workload:
  kind: async
  on_config_change:
    command: make
    timeout: 3s               # Optional
    stop: |
      kill -SIGTERM ${WORKLOAD_PID}
  run:
    command: ./workload.sh    # Similar to "start_command" in the basic mode syntax
    sample_after: 2s          # can also use templates here. e.g. "{{base.duration + 4}}s"
    stop: |                   # Optional. Similar to "stop_command" in the basic mode syntax
      kill -SIGTERM ${WORKLOAD_PID}    

sync mode:

workload:
  kind: sync
  on_config_change:
    command: make
    timeout: 3s
    stop: |
      kill -SIGTERM ${WORKLOAD_PID}
  run:
    command: ./workload.sh    # Similar to "start_command" in the basic mode syntax
    stop: |                   # Optional. Similar to "stop_command" in the basic mode syntax
      kill -SIGTERM ${WORKLOAD_PID}
    timeout: "{{best.duration * 3}}s"

Note the template variable for the timeout key. best is the best tuned configuration found so far, and duration is the duration of running workload with that configuration.

A few more words about templates and their usage in workload section:

  • templates can be used with timeout keys in sync mode and sample_after key in async mode. They can also be used with command Bash scripts
  • best.duration, baseline.duration or any other metric name and knob names are reserved keywords to be used in templates
  • templates syntax allows implicit and explicit keyword references

A few examples:

  • timeout: "{{best.duration * 3}}s" - uses a metric
  • timeout: "{{baseline.duration * 3}}s" - uses a metric
  • timeout: "{{A}}s" - implicit knob name A
  • timeout: "{{knob.A}}" - explicit knob name A

The Optimizer-Studio package contains several more sync mode examples that demonstrate this format (using the "run" section and a per-command timeout).
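
To show how a templated timeout might resolve to a number of seconds, the sketch below substitutes the {{ ... }} expression and then parses the resulting duration. It is a simplified stand-in for the real template engine, whose syntax and evaluation rules are not described here: the helper names, the sample durations, and the use of Python's eval (which is unsafe for untrusted input) are assumptions made for the example.

```python
import re
from types import SimpleNamespace

def render(template, context):
    """Replace every {{ expression }} in template with its evaluated value."""
    def evaluate(match):
        expression = match.group(1).strip()
        # Illustration only: eval with no builtins and a fixed context.
        return str(eval(expression, {"__builtins__": {}}, dict(context)))
    return re.sub(r"\{\{(.*?)\}\}", evaluate, template)

SECONDS_PER_UNIT = {"h": 3600, "m": 60, "s": 1}

def to_seconds(text):
    """Convert a duration such as 10m or 36.0s into seconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    return sum(float(number) * SECONDS_PER_UNIT[unit] for number, unit in parts)

# Hypothetical values: best and baseline durations in seconds, knob A set to 5.
context = {
    "best": SimpleNamespace(duration=12.0),
    "baseline": SimpleNamespace(duration=20.0),
    "knob": SimpleNamespace(A=5),
    "A": 5,
}
print(to_seconds(render("{{best.duration * 3}}s", context)))
print(to_seconds(render("{{knob.A}}s", context)))
```

Tying the timeout to best.duration lets the limit tighten as better configurations are found, so clearly slower configurations are cut short.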

Workload definition and multi-objective optimization (Deprecated)

A workload is normally defined in a script supplied to Optimizer Studio via its command line.
Optimizer Studio also supports complex workload definitions with a scalarized multi-objective target.
The complex workload is defined in the configuration file as a sequence of steps, each comprising a script, metrics and validity checks, as follows:

  1. Script: a script that runs the workload
  2. Metrics: after the script (if defined) completes, metrics are gathered from the filesystem
  3. Validity checks: after the metrics are gathered, they are validated in a sequence of tests. If all checks pass, the next workload step is executed. Otherwise, the knob configuration is deemed invalid and Optimizer Studio resumes testing a different configuration.

Upon completion and validation of all the steps, the workload target metric is calculated. The workload target is specified through scalarization of the step metrics, as shown in the example below.

Optimization target metric definition

The workload target can be set as target metric of the whole optimization.

Example

domain:
  common:

    ...

    target: workload.target:max

workload_settings:
  workloads:
    -
      script: ./workload_1.sh arg1
      metrics:
        workload1_output: /tmp/target1
        metric1: /tmp/metric1
      validity: 
        - workload1_output > 3200
        - metric1 < 200
    -
      script: ./workload_2.sh arg2
      metrics:
        workload2_output: /tmp/target2
      validity: workload2_output < 10000

  target: workload1_output * 0.12 + workload2_output * 8 - metric1

The above example demonstrates a 2-step complex workload.
Optimizer Studio comes up with a configuration to test.
It will then run ./workload_1.sh arg1. Upon completion, metrics will be sampled, and validity criteria applied - (workload1_output > 3200) && (metric1 < 200).
If both checks pass, Optimizer Studio will proceed to running the next step script (./workload_2.sh arg2). Otherwise, Optimizer Studio will come up with a different configuration, and start testing the first step again.

If all the workloads run correctly and pass their validity tests, Optimizer Studio will calculate the workload target according to the formula at the last line of the example.
The workload target is defined as the optimization target metric (and is maximized).
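
The control flow of the example can be sketched as follows. Running the scripts and reading metric files is replaced by plain functions that return hypothetical metric values, so the sketch only illustrates the step order, the validity checks, and the scalarized target formula.

```python
def run_complex_workload(steps, target):
    """Run steps in order; return the target value, or None if a check fails."""
    metrics = {}
    for step in steps:
        # Stand-in for running the step script and gathering its metrics.
        metrics.update(step["run"]())
        if not all(check(metrics) for check in step["validity"]):
            return None  # configuration is deemed invalid
    return target(metrics)

steps = [
    {
        "run": lambda: {"workload1_output": 4000.0, "metric1": 100.0},
        "validity": [lambda m: m["workload1_output"] > 3200,
                     lambda m: m["metric1"] < 200],
    },
    {
        "run": lambda: {"workload2_output": 50.0},
        "validity": [lambda m: m["workload2_output"] < 10000],
    },
]

def target(m):
    return m["workload1_output"] * 0.12 + m["workload2_output"] * 8 - m["metric1"]

result = run_complex_workload(steps, target)
print(result)
```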

Reusing Optimization Results In a New Experiment

More often than not, users perform several optimization experiments in the same system. Users can reuse results of previous experiments in succeeding ones in order to reduce total optimization time by starting optimization from best results of a previous optimization run. In addition, users can define a baseline knob configuration based on results of a previous optimization run in order to calculate incremental improvement of succeeding experiments.

At the end of optimization, an optimization report file is created in the ${HOME}/.concertio directory next to the log and CSV files. The report file is named report_<timestamp>.json, where timestamp specifies the file creation time. The file is stored in JSON format. Among other information, the report contains a list of the best knob configurations found during optimization.

Reusing best configurations

If users believe that the best knob configurations found in one experiment are good candidates for another optimization, they can reuse the previously found configurations. Doing so can save time compared to starting the optimization from the baseline. In other cases, however, previous results can be irrelevant for the experiment, so users have to apply their own judgement.

A user can specify a special directive in the knobs.yaml file which points to "top_knob_configs" item in the report file, which contains a list of best knob configurations:

domain:
  common:
    knobs:
    ...
    seed_configs: <report_file_path>#top_knob_configs

Here, # is the separator character and top_knob_configs is the JSON tag name. The default tag name can be omitted, together with its separator. If users provide their own list(s) of knob configurations, they can specify their own tag, provided that the JSON format is kept.

It is possible to provide several files with seed configurations:

domain:
  common:
    knobs:
    ...
    seed_configs:
     - <report_file_path1>#top_knob_configs
     - <report_file_path2>#top_knob_configs
     ...

Defining an alternative baseline configuration

Before running an optimization, Optimizer Studio runs the workload using the baseline knob configuration and uses the obtained target metric as the basis for improvement calculations. Users can define an alternative baseline configuration in a knobs.yaml file:

domain:
  common:
    knobs:
    ...
    baseline_config: <report_file_path>#best_knob_config

Here, the configuration specification consists of a JSON report file and a tag name inside that JSON document. The JSON item has the following format:

"tag": [
  {"name":"knob1", "value":"value1"},
  {"name":"knob2", "value":"value2"},
  ...
]
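
For users who want to supply a hand-written configuration instead of a generated report, the following Python sketch writes a JSON file with a tagged item in the format shown above. It is an illustration under assumptions: the function names are hypothetical, the tagged item is assumed to sit in a top-level JSON object, and values are written as strings because the example above shows string values.

```python
import json


def to_config_item(knobs):
    """Turn {'knob1': 'value1'} into the list of name/value objects shown above."""
    return [{"name": name, "value": str(value)} for name, value in knobs.items()]


def write_config_file(path, tag, knobs):
    """Write {tag: [...]} to path and return the '<path>#<tag>' reference string.

    Assumption: the tag lives in a top-level JSON object of the file.
    """
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({tag: to_config_item(knobs)}, handle, indent=2)
    return f"{path}#{tag}"
```

The returned string has the same <report_file_path>#<tag> shape that the baseline_config directive expects.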

Conductor - Experiment Management System

Conductor is an experiment management system with an advanced web user interface for watching experiment progress and results in a browser, and for performing other analysis operations.
To use Conductor, sign up at Concertio's portal, https://optimizer.concertio.com. After signing up, sign in and create a project either through the web interface or with the optimizer-ctl create_project command.

Connecting Optimizer Studio to Conductor

Follow the instructions on optimizer.concertio.com to connect Optimizer Studio to a Conductor project. To manage the entire process from the command line, use optimizer-ctl login and then optimizer-ctl create_project.

Once you are logged in and the project is connected to your knobs.yaml definition file, Optimizer Studio connects to Conductor at startup and starts reporting the experiment results to it. Pay attention to the diagnostic output in the Optimizer Studio console, which shows the connection progress and parameters.

As an alternative to connecting the project via the knobs.yaml definition, the <project_guid> that is injected during the connection process can be defined as an environment variable instead:

export OPTIMIZER_STUDIO_PROJECT_GUID=<project_guid>

If the GUID is defined both in knobs.yaml and the environment variable, the environment variable value is used.
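
The precedence rule can be summarized in a few lines of Python. This is a sketch of the rule as described here, not of Optimizer Studio's implementation. The function name is hypothetical, and treating an empty environment variable as unset is an assumption.

```python
import os

ENV_VAR = "OPTIMIZER_STUDIO_PROJECT_GUID"


def effective_project_guid(yaml_guid, environ=None):
    """Return the GUID that takes effect: the environment variable wins over knobs.yaml.

    Assumption: an empty environment variable is treated as if it were unset.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR) or yaml_guid
```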

Experiment Inventory

At the beginning of an experiment, Optimizer Studio sends the experiment inventory to Conductor, where it can be viewed via the web interface.
The inventory usually includes a description of the hardware and software that can be obtained from the Linux OS, such as the Linux version, amount of RAM, number of CPUs and more.

If desired, the user can supply additional inventory data by describing it in the knobs.yaml file. The inventory is a list of key/value pairs, where a value can be either an explicit string or the output of a user script. The inventory data can be hierarchical. Example:

inventory:
  experiment: A+B example                             # explicit value
  resources:
    free_disk:
      script: df . | awk '/[0-9]%/{print $(NF-2)}'    # script output
    free_memory:
      script: |                                       # several line script output
        freemem=$(cat /proc/meminfo | awk '/MemFree/ {print $2}')
        echo "${freemem} KB"

The example above results (depending on the actual resources of your computer) in the following section of the inventory object, shown in JSON format in the web UI:

  "user_defined": {
    "resources": {
      "free_disk": "177647680",
      "free_memory": "884904 KB"
    },
    "experiment": "A+B example"
  }
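
To make the mapping from the YAML description to the JSON result more concrete, here is a minimal Python sketch of how such a tree could be resolved. It is not Optimizer Studio's implementation: the function name is hypothetical, scripts are assumed to run through /bin/sh -c, and a mapping whose only key is script is assumed to mark a script entry.

```python
import subprocess


def resolve_inventory(node, shell=("/bin/sh", "-c")):
    """Resolve an inventory tree: explicit values stay as strings, scripts become their output."""
    if isinstance(node, dict):
        if set(node) == {"script"}:
            # Script entry: run it and keep the trimmed standard output.
            result = subprocess.run(
                [*shell, node["script"]], capture_output=True, text=True, check=False
            )
            return result.stdout.strip()
        # Nested section: resolve every child recursively.
        return {key: resolve_inventory(value, shell) for key, value in node.items()}
    return str(node)
```

Feeding it the parsed inventory section from the example would produce a nested dictionary with the same shape as the user_defined object above.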

The Optimizer Studio example a_plus_b_inventory demonstrates a user inventory description.
Note: The a_plus_b_inventory example does not include connectivity parameters for your Conductor project. To see the inventory data, make sure you are logged in to Conductor (using the optimizer-ctl login command) and either assign the project_guid field in knobs.yaml or set the OPTIMIZER_STUDIO_PROJECT_GUID environment variable to your project GUID.

System-wide Settings

System-wide settings can be configured in the settings.yaml file or directly in the configuration file, as follows:

global_settings:
  max_config_mean_cv: 0.02

Note that some settings, such as out_directory, metrics_csv_directory, and shell_command, must be defined in settings.yaml or an equivalent parameters file. All other settings can be defined in the regular configuration files, together with the knobs. It is recommended to start from the template in the installation directory.

Time Duration Specification

Some parameters specify a time duration. For example, the save_interval parameter defines the interval between saves of the data file to disk. Such parameters use a dedicated duration notation, for example: 2h, 1h30m, 1m15s, 500ms.

The supported units are h (hours), m (minutes), s (seconds) and ms (milliseconds).
Several units can be combined in one time specification, but no unit can appear more than once.
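
The notation can be checked with a small Python sketch that converts a duration string to milliseconds. It implements only the rules stated here (the four units, combination allowed, no repeated unit) and is not the parser used by Optimizer Studio. The function name is hypothetical, and whether units must appear in descending order is not stated, so the sketch does not enforce any order.

```python
import re

_UNIT_MS = {"h": 3_600_000, "m": 60_000, "s": 1_000, "ms": 1}
# "ms" is listed before "m" so that "500ms" is not read as 500 minutes.
_TOKEN = re.compile(r"(\d+)(ms|h|m|s)")


def parse_duration_ms(spec):
    """Convert a string such as '1h30m' to milliseconds; raise ValueError if invalid."""
    pos, total, seen = 0, 0, set()
    while pos < len(spec):
        match = _TOKEN.match(spec, pos)
        if match is None:
            raise ValueError(f"invalid duration: {spec!r}")
        value, unit = match.groups()
        if unit in seen:
            raise ValueError(f"unit {unit!r} appears more than once in {spec!r}")
        seen.add(unit)
        total += int(value) * _UNIT_MS[unit]
        pos = match.end()
    if not seen:
        raise ValueError("empty duration")
    return total
```

With this sketch, 1h30m evaluates to 5,400,000 ms, while 1h1h is rejected because the h unit is repeated.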

Earlier versions of Optimizer Studio used other names for such parameters, without time unit specifications. Those parameter names are deprecated.

Available Setting Parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| sampling_interval | 1s | The interval between samples. Relevant only for asynchronous sampling mode. |
| knob_ranking_max_num_of_samples | 10000 | Maximum number of configurations to use for knob ranking calculations. |
| max_baseline_cv | 0.04 | The maximum allowed coefficient of variation of the mean of the measurements in baseline settings. Lower values imply a stricter convergence threshold, so additional measurements might be required in order to converge. |
| max_config_mean_cv | 0.04 | The maximum allowed coefficient of variation of the mean of the measurements per knob configuration. Lower values imply a stricter convergence threshold, so additional measurements might be required in order to converge. |
| max_configs_in_report | 10 | Maximum number of knob configurations to include in the best configurations report. |
| max_invalid_samples_per_config | 0 | The maximum allowed number of invalid measurements per knob configuration, above which the configuration is considered invalid and is not tested further. |
| max_samples_per_config | 120 | Optimizer Studio does not test any knob configuration more times than this value. |
| metrics_csv_filename | - | If specified, Optimizer Studio creates a CSV file with the details of all the knob settings and metric measurements. |
| min_baseline_samples | 2 | The minimum number of baseline samples. This is used in conjunction with max_baseline_cv. |
| min_samples_per_config | 2 | The minimum number of samples per knob configuration. This is used in conjunction with max_config_mean_cv. |
| optimization_strategy | evolution | The algorithm employed for searching through the knobs. Available options are greedy and evolution. |
| optimization_strategy_settings | | Settings specific to each optimization strategy. Only the settings of the selected strategy are parsed. |
| out_directory | ${HOME}/.concertio | The location where Optimizer Studio writes its output data, such as the optimization database file and log files. |
| pending_config_timeout | 0 | Scheduling policy for configuration attempts: 0 attempts configurations sequentially; a value above 0 (in minutes) attempts configurations interleaved, with a timeout. |
| point_estimator | average: <no_value> | Point estimation function. Additional functions: percentile: <percent>, mode: <no_value> |
| save_interval | 120m | The interval between saves of the data file to disk. |
| shell_command | /bin/sh +e | The backend shell of the knobs and metrics. It is possible to run all knob and metric scripts on remote hosts by using a different shell command. |
| max_idle_time | 10m | The maximum time the optimizer service stays alive without any client connection; after this time the service exits. If the parameter is not set in settings.yaml, the default is 0, meaning no timeout. |
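
The interplay between the sample-count limits and the CV thresholds can be sketched in Python. The exact statistic that Optimizer Studio computes is not given in this document. The sketch assumes that "coefficient of variation of the mean" means the standard error of the mean divided by the absolute mean, and both function names are hypothetical.

```python
import math
import statistics


def mean_cv(samples):
    """Standard error of the mean divided by |mean| (an assumed definition)."""
    if len(samples) < 2:
        return math.inf
    mean = statistics.fmean(samples)
    if mean == 0:
        return math.inf
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return sem / abs(mean)


def needs_more_samples(samples, max_cv=0.04, min_samples=2, max_samples=120):
    """Decide whether a configuration should be measured again.

    Defaults mirror max_config_mean_cv, min_samples_per_config and
    max_samples_per_config from the table above.
    """
    if len(samples) >= max_samples:
        return False  # hard cap reached, stop regardless of the CV
    if len(samples) < min_samples:
        return True  # not enough samples to judge yet
    return mean_cv(samples) > max_cv
```

Under this reading, lowering max_cv makes needs_more_samples return True for longer, which matches the remark that stricter thresholds may require additional measurements.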

Deprecated Setting Parameters

If you get a parameter deprecation warning when running Optimizer Studio, replace the deprecated parameters with the new ones in your YAML files and adjust the parameter values accordingly.

Note that the old parameters still work as before. However, they will be dropped in future versions.

| Old parameter | New parameter | Change | Example |
| --- | --- | --- | --- |
| interval_seconds | sampling_interval | Time units | 1s |
| save_interval_minutes | save_interval | Time units | 2h |
| pending_config_timeout_minutes | pending_config_timeout | Time units | 30m |
| max_idle_time_minutes | max_idle_time | Time units | 10m |
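
The renaming in the table can be applied mechanically to an already parsed settings mapping, as in the Python sketch below. It assumes that each old value is a plain number expressed in the unit named by the old parameter's suffix (seconds or minutes). The function name is hypothetical, and when both the old and the new name are present the sketch keeps the new one.

```python
# Old name -> (new name, unit suffix implied by the old name).
RENAMES = {
    "interval_seconds": ("sampling_interval", "s"),
    "save_interval_minutes": ("save_interval", "m"),
    "pending_config_timeout_minutes": ("pending_config_timeout", "m"),
    "max_idle_time_minutes": ("max_idle_time", "m"),
}


def migrate_settings(settings):
    """Return a copy of settings with deprecated names replaced by the new ones."""
    migrated = {}
    for name, value in settings.items():
        if name in RENAMES:
            new_name, unit = RENAMES[name]
            # setdefault keeps an explicitly given new-style value if one exists.
            migrated.setdefault(new_name, f"{value}{unit}")
        else:
            migrated[name] = value
    return migrated
```

For instance, a mapping with save_interval_minutes set to 120 becomes one with save_interval set to 120m.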