Overview¶
Configuration files of Optimizer Studio are located in the installation directory /opt/concertio-optimizer/studio/. You can specify your own configuration files via command line switches.
There are two configuration files:
- knobs.yaml - Specifies the knobs, input metrics and target function of Optimizer Studio.
- settings.yaml - Specifies the system-wide settings of Optimizer Studio.
Notes on Yaml file formatting¶
YAML files require exact formatting, and syntax errors can prevent Optimizer experiments from running. Make sure the file contains no tabs at all, only spaces. In knobs.yaml, all section and option titles must be spelled correctly; subtle typos are an occasional source of problems. If you cannot tell whether a problem is a syntax issue, check the file with a YAML linter in your IDE, or upload it to an online YAML linter.
Optimization Target¶
Users can configure Optimizer Studio to search for the optimal parameters that maximize (or minimize) a specific Target Metric.
The target metric can be defined as follows:
domain:
  common:
    target:
      name: performance
      goal: max
or in shorthand:
domain:
  common:
    target: performance:max
The target metric name can refer to any of the sampled metrics, such as proc.diskstats.sda.sectors_written. The metrics definition is further detailed below.
If the optimization goal is omitted, it is set to max.
Predefined Optimization Target¶
While the target metric can refer to any of the sampled metrics, Optimizer Studio supports a list of predefined targets:
Name | Default Goal | Meaning |
---|---|---|
duration | min | Total workload execution time. |
performance | max | Total number of retired instructions (msr.inst_retired_all) measured in all CPU cores per second. |
net.throughput | max | Total number of bytes transferred over the network per second. |
Optimization Target given by an expression¶
It is also possible to define the optimization target as an expression (formula):
domain:
  common:
    target:
      name: optimization_target
      goal: max
      expression: "{{ metric1 / (metric2 + 1) }}"
In the example above, the target metric is given by the ratio between metric1 and (metric2 + 1) values averaged over min_samples_per_config samples. The stability of the ratio is tested against CV (coefficient of variation) given by max_config_cv.
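The evaluation described above can be sketched in Python. This is only an illustration under stated assumptions: the sample values are made up, the expression is evaluated per sample and then averaged (one plausible reading of the text), and the CV is taken as the sample standard deviation divided by the mean. Optimizer Studio's internal computation may differ.

```python
from statistics import mean, stdev

def target_from_samples(samples, max_cv):
    """Evaluate metric1 / (metric2 + 1) per sample, then average.

    `samples` is a list of (metric1, metric2) pairs.
    Returns (mean, cv, stable). Assumption: CV = stdev / |mean|.
    """
    values = [m1 / (m2 + 1) for m1, m2 in samples]
    avg = mean(values)
    cv = stdev(values) / abs(avg) if len(values) > 1 and avg != 0 else 0.0
    return avg, cv, cv <= max_cv

# Hypothetical measurements of (metric1, metric2)
samples = [(100.0, 9.0), (102.0, 9.0), (98.0, 9.0)]
avg, cv, stable = target_from_samples(samples, max_cv=0.04)
print(f"target={avg:.2f} cv={cv:.4f} stable={stable}")
```

A configuration whose CV stays above the threshold would need more samples before its target value can be trusted.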
knobs.yaml: knob definition¶
Knobs are tunable parameters of the system which Optimizer Studio will try to change in order to optimize the system.
Concertio Optimizer Studio ships with an embedded set of knob definitions containing many knobs that were tested and benchmarked by Concertio engineers. If you want to use these knobs, specify the --knobs=embedded command line switch.
It is possible to provide additional knob files by using the --knobs /path/to/knob-file.yaml command line switch, where you can add custom knobs that are relevant for your system. If you don't supply your own knobs file, the embedded knobs are used automatically.
Optimizer Studio accepts any number of --knobs PATH command line switches. In that case it loads the knob files in the order they appear on the command line, so a knob file appearing later can override any settings in previous knob files. Note that this doesn't include embedded knobs, which are always processed first.
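The load order can be pictured with a small Python sketch. It assumes that overriding works per knob parameter (a later file replaces only the parameters it names); the actual merge granularity is not specified here, and the knob contents below are invented for illustration.

```python
def merge_knob_files(*files):
    """Merge knob definitions in load order; later files override earlier ones."""
    merged = {}
    for knobs in files:
        for name, params in knobs.items():
            # Assumption: parameters are overridden one by one, not whole knobs.
            merged.setdefault(name, {}).update(params)
    return merged

embedded = {"io.scheduler": {"options": ["noop", "cfq"]}}  # always processed first
first = {"my_knob": {"options": [1, 2, 3]}}
second = {"my_knob": {"options": [1, 2, 3, 4, 5]}}         # appears later, so it wins
merged = merge_knob_files(embedded, first, second)
print(merged["my_knob"]["options"])
```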
knobs section: adding a Knob¶
It is possible to add a new or override an existing knob by adding a knob section to the knobs list. Each knob section contains a few values and POSIX shell scripts used for defining and manipulating the knob.
Knob kinds¶
Each knob belongs to one of the supported knob kinds: scripted, environment, accelerator.
The main difference between the knob kinds is the way the knob is applied.
Below is a list of the supported knob kinds:
- scripted knobs
  The scripted knob is applied by executing a POSIX shell script with the $KNOB_VAL and $KNOB_DEV environment variables.
  - kind: script
  - get_script - returns the current knob value, which is set as the baseline
  - set_script - sets a new knob value
  - pre_set_script - when provided, this script is run prior to running any of the set_script(s).
    Note that when more than one knob refers to the same pre_set_script, identical script invocations are merged.
  - post_set_script - when provided, this script is run after running any of the set_script(s).
    Note that when more than one knob refers to the same post_set_script, identical script invocations are merged.
  - device_script - when provided, returns the list of applicable devices, and the knob is split into multiple per-device knobs, e.g.
    io.scheduler → io.scheduler.sda, io.scheduler.sdb, io.scheduler.sdc
  - Environment variables observable by the knob scripts:
    $KNOB_VAL - knob value being set by set_script
    $KNOB_DEV - device name, in device-specific knobs, as returned by device_script
- environment knobs (occasionally referred to as memory knobs)
  The environment knobs are exported as POSIX environment variables observable by the workload. An environment knob name must comply with POSIX variable naming.
  - kind: env
- accelerator knobs
  The accelerator knobs encompass all the knob kinds passed directly to the Concertio accelerator.
  - kind: accel.mem_alloc - provides a selection of malloc implementations
General knob parameters¶
Below is a list of the general purpose parameters supported by all the knob kinds:
- kind - knob kind. When omitted, Optimizer software will attempt to identify the knob kind by presence or absence of kind-specific parameters.
- enable_script - when provided, an output of 0 / false indicates that the knob should be discarded; any other value enables the knob. When not provided, the knob is enabled by default.
- disable - when provided, the knob will be discarded and erased from internal data structures. This parameter overrides enable_script.
- options - a list of available knob options. Options definition is detailed below.
- default - when provided, sets the baseline to the value provided.
  This option is mutually exclusive with get_script.
  Note that when neither default nor get_script is specified, the first option in the options list is used as the baseline value.
- set_policy - knob setting policy:
  - on_change (default) - each time the internal algorithm needs to change a knob, its value is applied.
  - once:<value> - the value is applied at the beginning of the run, and is reverted to baseline at the end.
  - always - the current knob value is applied on each sample.
  The once and always setting policies do not participate in the optimization process, and can be considered auxiliary utilities.
Knob options definition¶
The simplest way to define knob options is to list all valid values explicitly:
options:
  values: [10, 15, 20]
default: 15
The second way is to define an options script which returns the list of option values. For example, the following script creates a list of numbers from 0 to 7 on a computer with 8 logical CPUs:
options:
script: cat /proc/cpuinfo | awk '/processor/ {print $3}'
Yet another possibility is to define a list of options using numeric range:
options:
range:
min: 1
max: 16
step: 1
format: --num-threads=%d
default: 8
This definition creates the list of options --num-threads=1, --num-threads=2, etc.
The printf-style format field is optional. It contains a string with one of the supported format specifiers, as in the following examples:
Format | Numeric Value | Knob Value |
---|---|---|
"-O%d" | 2 | -O2 |
"--pi=%.2f" | 3.14159 | --pi=3.14 |
"--pi=%.2f" | 3 | --pi=3.00 |
"%g" | 5000000 | 5000000 |
"%g" | 2.718000 | 2.718 |
"%x" | 2047 | 7ff |
"%X" | 2047 | 7FF |
You can combine different ways of specification:
options:
values: [ "" ]
range:
min: 1
max: 16
step: 1
format: --num-threads=%d
The example above adds an empty option value in addition to the ones defined by the range.
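As a rough illustration of how such a definition expands, the Python sketch below builds the option list from explicit values plus a formatted range. It assumes that the range is inclusive of max and that explicit values come before the range values; `build_options` is a hypothetical helper, not part of Optimizer Studio. Python's `%` operator follows printf conventions for the `%d`, `%.2f`, `%x` and `%X` specifiers used here.

```python
def build_options(values=(), rng=None, fmt="%d"):
    """Combine explicit values with a formatted, inclusive numeric range."""
    options = list(values)
    if rng is not None:
        lo, hi, step = rng
        options += [fmt % v for v in range(lo, hi + 1, step)]
    return options

options = build_options(values=[""], rng=(1, 16, 1), fmt="--num-threads=%d")
print(options[:3])  # the empty option, then the first formatted range values

# A few of the specifiers from the format table
print("-O%d" % 2, "--pi=%.2f" % 3.14159, "%x" % 2047, "%X" % 2047)
```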
If a range needs to be scaled by a certain factor or function, the expression option applies a given expression to every value in the knob's range. This is useful for ranges that need to grow exponentially rather than linearly. With exponential scaling it is possible to write knobs that scale from 0 bytes to 1000000000000 bytes (1TB), as in the knob below:
options:
range:
min: 0
max: 1000
step: 100
expression: "{{ (value * 100) ^ 2 }}"
format: "--%d bytes"
The expected range for this knob would be as follows:
Index | Expected Bytes | Knob Value |
---|---|---|
0 | 0B | --0 bytes |
100 | 0.1GB | --100000000 bytes |
200 | 1.6GB | --1600000000 bytes |
300 | 8.1GB | --8100000000 bytes |
400 | 25.6GB | --25600000000 bytes |
500 | 62.5GB | --62500000000 bytes |
600 | 129.6GB | --129600000000 bytes |
700 | 240.1GB | --240100000000 bytes |
800 | 409.6GB | --409600000000 bytes |
900 | 656.1GB | --656100000000 bytes |
1000 | 1TB | --1000000000000 bytes |
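As a cross-check on the table, the byte counts listed equal the fourth power of the range value (for example, 200^4 = 1600000000). The Python sketch below reproduces the rows under that reading. Note that the expression as printed, (value * 100) ^ 2, gives the same number only at value = 100, so the exact expression and the meaning of `^` should be verified against your Optimizer Studio version.

```python
def knob_values(minimum, maximum, step):
    """Map each range value to value**4, the quantity listed in the table."""
    return {v: v ** 4 for v in range(minimum, maximum + 1, step)}

for index, size in knob_values(0, 1000, 100).items():
    # Sizes are shown in decimal gigabytes, as in the table
    print(f"{index:>4}  {size / 1e9:>7.1f}GB  --{size} bytes")
```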
Knob dependency¶
In practice, there are situations where one parameter depends on another. In such cases the dependency should be taken into account when defining and optimizing knobs, and Optimizer Studio supports this. Below is a simple example of dependency handling between knobs.
In the example, knob Y depends on knob X in the following manner:
- 1 <= X < 50: 1 <= Y <= 150
- 50 <= X <= 100: Y = 1
Given Z = X + Y, the maximum value of Z is achieved when {X, Y} = {49, 150}.
The dependency is maintained via shell environment variable BIGX.
BIGX is set each time X is written back.
BIGX is consulted each time Y is written back.
In order for this dependency to be properly maintained:
- Knob X has to be written back prior to knob Y - the knobs are ordered alphabetically
- Knob Y has to be written back each time knob X is written back - this is controlled by 'dependency' section in knob Y definition
The knob definition for this example is available at /opt/concertio-optimizer/studio/examples/knob-dependency.
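The stated optimum can be confirmed with a brute-force search over the constrained space. This Python sketch only encodes the two dependency rules listed above; it does not model BIGX or the write-back order.

```python
def y_range(x):
    """Allowed values of Y for a given X, per the dependency rules above."""
    return range(1, 151) if 1 <= x < 50 else range(1, 2)

# Enumerate every valid (X, Y) pair and keep the one with the largest Z = X + Y
best = max(((x, y) for x in range(1, 101) for y in y_range(x)),
           key=lambda xy: xy[0] + xy[1])
print(best, sum(best))
```

Without the dependency, a search could propose {X, Y} = {100, 150}, which the rules forbid.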
Disabling a knob¶
It is possible to disable a knob, either temporarily or permanently, without actually removing its definition from the knobs file.
To temporarily disable a knob, add to the knob section the following value:
skip_tuning: ""
To permanently erase a knob from the internal data structures, either add to the knob section the following value:
disable: ""
or define an appropriate enable_script:
enable_script: "echo -n 0"
Note that this approach allows disabling embedded knobs as well. For example:
domain:
  common:
    knobs:
      kernel.io_scheduler: ## The name of the knob you wish to disable
        disable: ""
Examples¶
A: vm.dirty_background_ratio¶
domain:
  common:
    knobs:
      sys.vm.dirty_background_ratio:
        kind: script
        description: "The number of pages at which the pdflush threads begin writeback of dirty data."
        get_script: "/sbin/sysctl -n vm.dirty_background_ratio"
        set_script: "/sbin/sysctl -w vm.dirty_background_ratio=$KNOB_VAL"
        options: [10, 15, 20]
B: io.scheduler¶
Below is a more complex knob example of a device-specific knob. Eventually, this knob definition produces multiple knobs: io.scheduler.sda, io.scheduler.sdb etc.
domain:
  common:
    knobs:
      io.scheduler:
        kind: script
        description: "Block device scheduling algorithm selector."
        get_script: "cat /sys/block/$KNOB_DEV/queue/scheduler | sed 's/.*\\[\\([^]]*\\)\\].*/\\1/g'"
        set_script: "echo $KNOB_VAL > /sys/block/$KNOB_DEV/queue/scheduler"
        device_script: "ls -d /sys/block/sd* | cut -d/ -f 4"
        options:
          script: cat /sys/block/$KNOB_DEV/queue/scheduler | sed 's/\[\|\]//g ; s/ $//g ; s/\s/\n/g'
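To make the two sed pipelines easier to follow, here is an equivalent parsing step sketched in Python. The scheduler line is a made-up sample of what /sys/block/<device>/queue/scheduler typically contains (the active scheduler in square brackets); `parse_scheduler` is a hypothetical helper used only for illustration.

```python
import re

def parse_scheduler(line):
    """Split a scheduler line into (active scheduler, all available schedulers)."""
    # get_script equivalent: the name inside the square brackets
    active = re.search(r"\[([^\]]*)\]", line).group(1)
    # options script equivalent: drop the brackets, one name per entry
    options = line.replace("[", "").replace("]", "").split()
    return active, options

# Hypothetical content of /sys/block/sda/queue/scheduler
active, options = parse_scheduler("noop deadline [cfq]\n")
print(active, options)
```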
Passing knob values to the workload¶
The easiest way of passing knob values to applications is via the shell environment using environment knobs. Below is an example of a simple knob with five values:
domain:
  common:
    knobs:
      my_knob:
        options: [1,2,3,4,5]
Notice that no set_script or get_script is defined for the environment knob. The workload scripts will receive $my_knob as an environment variable:
#!/bin/bash
executable_name ${my_knob}
In the above case, the baseline value of my_knob is 1, as it is the first value in the options list.
It is also possible to use a scripted knob to pass values of application knobs. One way to achieve this is using the filesystem. For example, a knob's set script can write a value into /tmp/knob_file:
domain:
  common:
    knobs:
      my_knob:
        get_script: "cat /tmp/knob_file"
        set_script: "echo $KNOB_VAL > /tmp/knob_file"
        options: [1,2,3,4,5]
Then, the workload script can read this value when invoked. For example:
#!/bin/bash
my_parameter=$(cat /tmp/knob_file)
executable_name ${my_parameter}
In order for the above to work, the file /tmp/knob_file should be populated, for example by:
$ echo "1" > /tmp/knob_file
The workload script can also read the knob names and values through an associative array. This is useful when experimenting with numerous knobs in the configuration files because the workload script can detect which knobs have been defined. Below is an example:
# source studio functions
. /opt/concertio-optimizer/studio/studio-functions.bash
# call memory knobs associative array function
get_memory_knobs_assoc
# build one --name=value argument per defined knob
args=""
for K in "${!ASSOC_KNOBS_ARRAY[@]}"; do
  args+="--$K=${ASSOC_KNOBS_ARRAY[$K]} "
done
executable_name $args
The code in knobs.yaml can be tested as follows:
$ optimizer-studio --knobs=knobs.yaml --testknob=my_knob
I[3857][12:34:41.278] Concertio Optimizer, version 2.5.0
I[3857][12:34:41.278] License expiration date: January 1, 2021
I[3857][12:34:41.285] Knob my_knob: set value: 1 --> 2
I[3857][12:34:41.287] Knob my_knob: set value: 2 --> 3
I[3857][12:34:41.288] Knob my_knob: set value: 3 --> 4
I[3857][12:34:41.290] Knob my_knob: set value: 4 --> 5
I[3857][12:34:41.292] Knob my_knob: set value: 5 --> 1 [baseline]
E[3857][12:34:41.293] Knob my_knob test: success
Knob my_knob test: success
You can also test all knobs by using the option --testknob=all:
$ optimizer-studio --knobs=knobs.yaml --testknob=all
By default, Optimizer Studio performs a silent test of all knobs prior to optimization tasks. If you wish to skip this test (not recommended) you can do so by using the --testknob=none option:
$ optimizer-studio --knobs=knobs.yaml ./my_workload.sh --testknob=none
metrics section: named HW and SW metrics¶
Metrics are used by Optimizer Studio to learn about the system behavior and to detect different phases of execution. Optimizer Studio will then attempt to find an optimal knob configuration that maximizes a certain sampled metric for each phase. The metrics are sampled periodically.
Metrics definition¶
Comma-separated regular expressions define which metrics are sampled and which metrics are excluded.
domain:
  common:
    include_metrics: [msr.*, proc.*]
    exclude_metrics: [proc.diskstats.sda.sectors_written]
In the above example, all msr metrics and all proc metrics, except for proc.diskstats.sda.sectors_written, will be considered by Optimizer Studio for learning about the system behavior.
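The include/exclude logic can be sketched in Python as follows. The sketch assumes that each pattern must match the whole metric name (full-match semantics), which is not confirmed here; proc.meminfo.MemFree is a hypothetical metric name added only to show a proc metric that survives the exclusion.

```python
import re

def select_metrics(names, include, exclude):
    """Keep names matching any include pattern and no exclude pattern."""
    def matches(name, patterns):
        # Assumption: a pattern has to match the entire metric name
        return any(re.fullmatch(p, name) for p in patterns)
    return [n for n in names if matches(n, include) and not matches(n, exclude)]

names = ["msr.inst_retired_all", "proc.diskstats.sda.sectors_written",
         "proc.meminfo.MemFree", "net.throughput"]
selected = select_metrics(names, ["msr.*", "proc.*"],
                          ["proc.diskstats.sda.sectors_written"])
print(selected)
```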
User-defined metrics¶
Optimizer Studio supports user-developed plugins for sampling custom metrics.
Importing Configuration Files¶
It is possible to import configuration files by using the import directive. For example, the default embedded knobs of Optimizer Studio can be imported as follows:
import:
  optimizer.studio:
domain:
  common:
    knobs:
      ...
Other embedded knob definitions always have the optimizer. prefix. In order to import a configuration file from the filesystem, its yaml extension should be removed and slashes (/) need to be converted into dots (.). For example, my_configurations/my_software_knobs.yaml will be imported as my_configurations.my_software_knobs.
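The naming rule above amounts to a small string transformation, sketched here in Python. `import_name` is a hypothetical helper for illustration and only handles the .yaml extension mentioned in the text.

```python
def import_name(path):
    """Convert a configuration file path to its import name."""
    if path.endswith(".yaml"):
        path = path[: -len(".yaml")]  # drop the yaml extension
    return path.replace("/", ".")     # slashes become dots

print(import_name("my_configurations/my_software_knobs.yaml"))
```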
Embedded knob categories supported by Optimizer-Studio:¶
knobs category | import syntax | description | comments/limitations |
---|---|---|---|
mellanox | import optimizer.mellanox.connectx3 | Mellanox Connect-X 3,4 and 5 NIC cards | optional args: MELLANOX_DEVICES. works on bare metal machines only. NICS are detected automatically if not provided |
solarflare | import optimizer.solarflare.knobs | SolarFlare Onload NIC cards | provides all onload tuning knobs as environment variables |
intel | import optimizer.intel.msrs | Intel CPUs msr tuning parameters | works on bare metal machines only. disabled automatically on unsupported platforms |
java | import optimizer.jvm.jvm-[7, 8, 9, 11] | Java Virtual Machine tunables | JRE or JDK to be installed |
nginx | import optimizer.nginx.nginx | NGINX Web and Proxy server tunables | required args: NGINX_CONF_FILE - for location of nginx.conf file |
mysql | import optimizer.mysql.mysql | MySQL 5.7.8 and above system and caching tunables | required args: MYSQL_CONF_FILE - for location of mysqld.cnf. currently assumes mysql client and server installed on the same machine |
mongodb | import optimizer.mongodb | MongoDB 4.x and above tunables | assumes mongo client and server installed on the same machine |
php | import optimizer.php.php7 | PHP 7 tuning parameters | required args: PHP_CONF_FILE - for location of php.ini |
apache2 | import optimizer.apache.apache2 | Apache 2.x web server tuning parameters | required args: APACHE_CONF_FILE - for location of mpm_prefork.conf file |
postgresql | import optimizer.postgresql.v11 or v10 | PostgreSQL 10 and 11 best practice tuning parameters | required args: POSTGRESQL_CONF_FILE - for location of postgresql.conf file |
openmpi | import optimizer.posgresql.openmpi.mca | OpenMPI Modular Component Architecture (MCA) tuning parameters | required args: MCA_CONF_FILE - for location of mca-params.conf |
network | import optimizer.network | Operating System (Linux) system level network tunables | |
hhvm | import optimizer.hhvm.hack | HHVM benchmark tuning | required args: HHVM_CONF_FILE - for the server.ini file path. HHVM must be installed. supports hhvm version 4.6.0 |
hadoop yarn | import optimizer.hadoop.yarn | tuning parameters for Hadoop Yarn Cluster | required args: MAPRED_CONF_FILE - to point to Yarn config file path |
hadoop spark | import optimizer.hadoop.spark | tuning parameters for Hadoop Spark Cluster | required args: SPARK_CONF_FILE - to point to Spark config file path |
gcc | import optimizer.compilers.gcc.[4-7-0, 4-8-0, 4-9-0, 5-3-0, 7-1-0, 10-1-0] | all GCC compilation flags tuning parameters | gcc of supported version to be installed |
llvm | import optimizer.compilers.llvm.4-0-0 | all LLVM compilation flags tuning parameters compatible with version 4 | llvm of supported version to be installed - tested up to version 9 |
icc | import optimizer.compilers.icc | all compilation flags tuning parameters | icc of supported version to be installed - tested up to version 19.1 |
aocc | import optimizer.compilers.aocc | all compilation flags tuning parameters | aocc of supported version to be installed - tested using AOCC version 2.0.0 |
Example of an import with a configuration file as an argument:
import:
  optimizer.postgresql.v11:
    args:
      POSTGRESQL_CONF_FILE: /etc/postgres/postgresql.conf
See the examples folder under the optimizer-studio folder for many embedded knob examples with self-documented knobs files.
We are working continuously to add support for more projects.
Filtering knobs from imported files¶
Specific knobs can be selected from imported files using regular expressions. In the following example, all embedded knobs are imported, except those that have "net" in their names:
import:
  optimizer.studio:
    include_knobs: [ .* ]
    exclude_knobs: [ .*net.* ]
File-specific directives¶
Configuration files can have their own enable and onload directives, as shown in the following example:
import:
  my_example_import:
enable:
  script: echo 1
onload:
  source: my_script.sh
domain: ...
File-specific enable¶
The enable directive determines whether the configuration file should be loaded. It can either be a scalar value (enable: ), a script, or a sourced script. A script can be defined as follows:
enable:
  script: echo 1
A sourced script can be defined as follows:
enable:
  source: my_filesystem_script.sh
In all of these cases, if 1 is returned, the configuration file is loaded. Otherwise it is skipped.
Passing arguments to the enable scripts of imported files¶
Passing arguments as environment variables is possible using the args directive:
import:
  my_example_import:
    args:
      MY_ENV_VARIABLE: value
The parameter can then be used in the enable script of my_example_import.yaml as follows:
enable:
  script: echo ${MY_ENV_VARIABLE}
File-specific onload¶
When a configuration file is found to be enabled, the onload script is invoked. It can either be sourced (using source:) or in-lined (using script:).
Invalid Knob Configuration Management¶
From time to time, a workload may treat some knob values as invalid and return an invalid (NaN) value for the target metric. Optimizer Studio does not consider such knob configurations for the final results. However, the same knobs may be part of other configurations, and an additional run of the workload with such a configuration would most probably return an invalid result as well.
Sometimes a single knob is invalid in every configuration it participates in. There may also be invalid combinations of knobs, for example a pair of knobs that are incompatible with each other when they appear together in a configuration.
In order to prevent the repeated use of invalid knobs, Optimizer Studio provides a knob config validation mechanism. This mechanism can blacklist a knob and reuse the blacklist in subsequent runs of the same experiment. Knobs found to be invalid are saved in a knob config validity file, in YAML format, which is automatically used and updated in subsequent experiment runs.
In addition to the knob blacklist, the configuration validity file stores a configuration whitelist containing configurations that have already produced valid results, so their knob combinations need not be validated again.
Config validation is turned off by default. In order to turn it on, include the following in your main YAML configuration file:
domain:
  common:
    config_validation: /path/to/validity_file.yaml
To switch the invalid combinations search mechanism off explicitly, use:
domain:
  common:
    config_validation: off
If the full path to the validity file is omitted, the Optimizer Studio working directory (~/.concertio by default) is assumed.
Normally, the file does not exist before the first run and is created once that run completes.
The file is formatted as follows:
invalid_configs:
  -
    - A
    - B
  -
    - A
    - C
valid_configs:
  -
    - B
    - C
  -
    - B
    - D
  -
    - D
    - E
    - F
At the end of the Optimizer Studio session, the validity file is updated with the new found combinations, so it may be reused in the next run. If problems arise during saving the file, e.g. due to file permissions, the file is stored in a temporary directory, and the file path is mentioned in the optimizer log.
If you can prepare the list of invalid knobs in advance, you can add it to the config_validation section in your main YAML file:
domain:
  common:
    config_validation:
      file: /path/to/validity_file.yaml
      invalid_configs:
        - [A, B]
        - [A, C]
      valid_configs:
        - [B, C]
        - [B, D]
        - [D, E, F]
In the case above, the invalid and valid combination lists are used as the initial seed until the validity file is generated. When the validity file is present, it takes precedence over the initial seed.
Invalidation approach¶
By default, entire knob configurations are blacklisted, i.e.
invalid_configs:
  - [A, B]
means that any configuration where both knobs A and B are not at baseline is considered invalid.
When higher resolution is required, combinations of specific knob values can be invalidated instead:
invalid_configs:
  - A: 2
    B: 3
The operating mode is selected via:
domain:
  common:
    config_validation:
      invalidate_by: knob | option
                     ^^^^
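The difference between the two modes can be illustrated with a small Python check. This is a sketch of the matching rules as described above, not Optimizer Studio's implementation; `is_invalid` and the sample values are hypothetical.

```python
def is_invalid(config, baseline, invalid_configs, invalidate_by="knob"):
    """Check a knob configuration against a blacklist.

    knob mode:   an entry is a list of knob names; it matches when every
                 listed knob differs from its baseline value.
    option mode: an entry is a {knob: value} mapping; it matches when every
                 listed knob has exactly that value.
    """
    for entry in invalid_configs:
        if invalidate_by == "knob":
            hit = all(config.get(k) != baseline.get(k) for k in entry)
        else:
            hit = all(config.get(k) == v for k, v in entry.items())
        if hit:
            return True
    return False

baseline = {"A": 1, "B": 1, "C": 1}
# knob mode: A and B are both away from baseline, so the entry matches
print(is_invalid({"A": 2, "B": 3, "C": 1}, baseline, [["A", "B"]]))
# option mode: B is 4 rather than 3, so the entry does not match
print(is_invalid({"A": 2, "B": 4, "C": 1}, baseline, [{"A": 2, "B": 3}], "option"))
```

The knob mode rules out more of the search space per entry, while the option mode keeps other value combinations of the same knobs available.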
Selective invocation of active search for invalid combinations¶
When config validation is on, upon encountering an invalid configuration, Optimizer invokes an active search for minimal invalid knob combinations.
Empirical study shows that some kinds of invalid configurations are better excluded from this rule, in particular a workload that exited with a non-zero error code, or a workload that timed out (i.e. was killed by Optimizer Studio for not completing within the defined time frame).
This is controlled via a dedicated flag (true by default) that directs Optimizer to exclude invalid configurations that stem from a workload that returned an error code or timed out:
domain:
  common:
    config_validation:
      exclude_error: true | false
                     ^^^^
Reusing Optimization Results In a New Experiment¶
More often than not, users perform several optimization experiments in the same system. Users can reuse results of previous experiments in succeeding ones in order to reduce total optimization time by starting optimization from best results of a previous optimization run. In addition, users can define a baseline knob configuration based on results of a previous optimization run in order to calculate incremental improvement of succeeding experiments.
At the end of optimization, an optimization report file is created in the ${HOME}/.concertio directory next to log and csv file. The report file has the name report_<timestamp>.json, where timestamp specifies the file creation time. The file is stored in JSON format. Among other information, the report contains a list of best knob configurations found during optimization.
Reusing best configurations¶
If users believe that the best knob configurations found in one experiment are good candidates for another optimization, they can reuse the previously found configurations. This can save time in comparison to starting the optimization from the baseline. However, in other cases previous results can be irrelevant for the experiment, so users have to apply their own judgement.
A user can specify a special directive in the knobs.yaml file which points to "top_knob_configs" item in the report file, which contains a list of best knob configurations:
domain:
common:
knobs:
...
seed_configs: <report_file_path>#top_knob_configs
Here, # is the separator character and top_knob_configs is the JSON tag name. The default tag name can be omitted, together with its separator. If users provide their own list(s) of knob configurations, they can specify their own tag, provided that the JSON format is kept.
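The path#tag notation can be read as in the following Python sketch. It assumes the first # in the string is the separator (so the file path itself contains no #); `parse_config_ref` and the file names are hypothetical.

```python
def parse_config_ref(spec, default_tag="top_knob_configs"):
    """Split '<path>#<tag>' into (path, tag); the tag and separator are optional."""
    path, sep, tag = spec.partition("#")
    return path, (tag if sep and tag else default_tag)

print(parse_config_ref("report_example.json#top_knob_configs"))
print(parse_config_ref("report_example.json"))        # default tag is assumed
print(parse_config_ref("my_configs.json#my_tag"))     # user-defined tag
```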
It is possible to provide several files with seed configurations:
domain:
common:
knobs:
...
seed_configs:
- <report_file_path1>#top_knob_configs
- <report_file_path2>#top_knob_configs
...
Defining an alternative baseline configuration¶
Before running an optimization, Optimizer Studio runs the workload using the baseline knob configuration and uses the obtained target metric as a base for improvement calculations. Users can define an alternative baseline configuration in the knobs.yaml file:
domain:
common:
knobs:
...
baseline_config: <report_file_path>#best_knob_config
Here, the config specification includes a JSON report file and a tag name inside this JSON document. The format of this JSON item is like below:
"tag": [
{"name":"knob1", "value":"value1"},
{"name":"knob2", "value":"value2"},
...
]
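For inspecting such an item outside Optimizer Studio, the list of name/value pairs can be turned into a plain mapping. The Python sketch below uses an inline JSON string in the format shown above rather than a real report file; `load_knob_config` is a hypothetical helper.

```python
import json

def load_knob_config(report_text, tag):
    """Return {knob name: value} for the list stored under `tag`."""
    report = json.loads(report_text)
    return {item["name"]: item["value"] for item in report[tag]}

# Hypothetical report fragment in the format shown above
report_text = """
{"best_knob_config": [
    {"name": "knob1", "value": "value1"},
    {"name": "knob2", "value": "value2"}
]}
"""
print(load_knob_config(report_text, "best_knob_config"))
```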
Conductor - Experiment Management System¶
Conductor is an experiment management system with an advanced web user interface that allows watching experiment progress and results in a web browser, as well as other analysis operations.
In order to use Conductor, you need to sign up at Concertio's portal https://optimizer.concertio.com.
After signing up, sign in and create a project either through the web interface or using the optimizer-ctl create_project command.
Connecting Optimizer Studio to Conductor¶
Follow the instructions on optimizer.concertio.com for how to connect Optimizer Studio to a Conductor project.
Use optimizer-ctl login and then optimizer-ctl create_project if you wish to manage the entire process from the command line interface.
Once you are logged in and the project is connected with your knobs.yaml definition file, Optimizer Studio connects to the Conductor system at startup and starts reporting the experiment results to it.
Pay attention to the diagnostics output in the Optimizer Studio console, which shows the connection progress and parameters.
As an alternative to connecting the project via the knobs.yaml definition, the <project_guid> that is injected during the connection process can be defined as an environment variable instead:
export OPTIMIZER_STUDIO_PROJECT_GUID=<project_guid>
If the GUID is defined both in knobs.yaml and in the environment variable, the environment variable value is used.
Experiment Inventory¶
At the beginning of an experiment, Optimizer Studio sends the experiment inventory to the Conductor system, where it can be viewed via the web interface.
The inventory usually includes a description of the hardware and software system obtained from the Linux OS, such as the Linux version, amount of RAM, number of CPUs and more.
If desired, the user can supply additional inventory data by describing it in the knobs.yaml file. The inventory is a list of key/value pairs, where a value can be either an explicit string or the output of a user script. The inventory data can be hierarchical. Example:
inventory:
  experiment: A+B example # explicit value
  resources:
    free_disk:
      script: df . | awk '/[0-9]%/{print $(NF-2)}' # script output
    free_memory:
      script: | # several line script output
        freemem=$(cat /proc/meminfo | awk '/MemFree/ {print $2}')
        echo "${freemem} KB"
The example above will result (depending on the real resources of your computer) in the following section of the entire inventory object, in JSON format, in the Web UI:
"user_defined": {
  "resources": {
    "free_disk": "177647680",
    "free_memory": "884904 KB"
  },
  "experiment": "A+B example"
}
The Optimizer Studio example a_plus_b_inventory demonstrates a user inventory description.
Note: the a_plus_b_inventory example does not include connectivity parameters for your Conductor project.
In order to see the inventory data, you have to make sure you are logged in to Conductor (using the optimizer-ctl login command) and assign a project_guid field in knobs.yaml or set the OPTIMIZER_STUDIO_PROJECT_GUID environment variable to your project GUID.
System-wide Settings¶
System-wide settings can be configured in the settings.yaml file or directly in the configuration file, as follows:
global_settings:
  max_config_mean_cv: 0.02
Note that some settings need to be defined in settings.yaml or an equivalent parameters file, such as out_directory, metrics_csv_directory, and shell_command. All of the others can be defined in the regular configuration files, together with the knobs. It is recommended to use the template in the installation directory.
Time Duration Specification¶
Some parameters specify a time duration. For example, the save_interval parameter defines the interval between saving the data file to the disk. Such parameters use a dedicated notation, for example: 2h, 1h30m, 1m15s, 500ms.
The supported units are: h (hours), m (minutes), s (seconds), ms (milliseconds).
More than one unit can be combined in the same time specification, but no unit can appear more than once.
Earlier versions of Optimizer Studio used other names for such parameters, without time unit specifications. These parameters are deprecated.
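The time specification rules above can be captured in a short Python parser. This is an illustrative sketch, not the parser Optimizer Studio uses: it accepts whole-number amounts only and does not enforce any particular unit order, since neither is specified here.

```python
import re

UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
TOKEN = re.compile(r"(\d+)(ms|h|m|s)")  # "ms" is tried before "m" and "s"

def parse_duration(text):
    """Convert a specification such as '1h30m' to seconds."""
    pos, seen, total = 0, set(), 0.0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"invalid duration: {text!r}")
        amount, unit = match.groups()
        if unit in seen:  # no unit can appear more than once
            raise ValueError(f"unit {unit!r} appears more than once in {text!r}")
        seen.add(unit)
        total += int(amount) * UNIT_SECONDS[unit]
        pos = match.end()
    if not seen:
        raise ValueError("empty duration")
    return total

for spec in ("2h", "1h30m", "1m15s", "500ms"):
    print(spec, parse_duration(spec))
```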
Available Setting Parameters¶
Parameter name | Default value | Description |
---|---|---|
sampling_interval | 1s | The interval between samples. Relevant only for asynchronous sampling mode. |
knob_ranking_max_num_of_samples | 10000 | Maximum number of configurations to use for knob ranking calculations. |
max_baseline_mean_cv | 0.04 | The maximum allowed coefficient of variation of the mean of the measurements in baseline settings. Lower values imply a stricter convergence threshold, so additional measurements might be required in order to converge. |
max_config_mean_cv | 0.04 | The maximum allowed coefficient of variation of the mean of the measurements per knob configuration. Lower values imply a stricter convergence threshold, so additional measurements might be required in order to converge. |
max_configs_in_report | 10 | Maximum number of knob configurations to include into best configurations report |
max_invalid_samples_per_config | 0 | The maximum allowed invalid measurements per knob configuration, above which the configuration is considered invalid and will not be further tested. |
max_samples_per_config | 120 | Optimizer will not test any knob configuration more than the number of times specified by this parameter. |
metrics_csv_filename | - | If specified, Optimizer Studio creates a CSV file with the details of all the knob settings and metric measurements. |
min_baseline_samples | 2 | The minimum number of baseline samples. This is used in conjunction with max_baseline_mean_cv . |
min_samples_per_config | 2 | The minimum number of samples per knob configuration. This is used in conjunction with max_config_mean_cv . |
optimization_strategy | evolution | The algorithm employed for searching through the knobs. Available options are greedy and evolution. |
optimization_strategy_settings | - | Settings specific for each optimization strategy. Only settings specific for the selected strategy will be parsed. |
out_directory | ${HOME}/.concertio | Concertio Optimizer Studio generates output data such as optimization database file, log files, etc. into this location. |
pending_config_timeout | 0 | Configuration attempt scheduling policy: 0 - attempt configs sequentially, above 0 (in minutes) - attempt configs interleaved with a timeout. |
point_estimator | average: <no_value> | Point estimation function. Additional functions: percentile: <percent>, mode: <no_value> |
save_interval | 120m | The interval between saving the data file to the disk |
shell_command | /bin/sh +e | This defines the backend shell of the knobs and metrics. It is possible to run all knobs and metrics scripts on remote hosts using a different shell command |
max_idle_time | 10m | Maximum time the optimizer service stays alive without any connection from clients; after this time the service exits. If the parameter is not set in settings.yaml, the value is 0, meaning no timeout. |
Deprecated Setting Parameters¶
If you get a parameter deprecation warning when running Optimizer Studio, please replace the deprecated parameters with new ones in your YAML files, adjusting the parameter values accordingly.
Note that the old parameters still work as before. However, they will be dropped in future versions.
Old parameter | New parameter | Change | Example |
---|---|---|---|
interval_seconds | sampling_interval | Time units | 1s |
save_interval_minutes | save_interval | Time units | 2h |
pending_config_timeout_minutes | pending_config_timeout | Time units | 30m |
max_idle_time_minutes | max_idle_time | Time units | 10m |