Overview¶
This page describes the steps required to get started with Concertio Optimizer Studio.
While Optimizer Studio can run under any modern Linux distribution, the directions below have been tested on Ubuntu 20.04, Ubuntu 18.04, Ubuntu 16.04, CentOS 7 and CentOS 8.
Community Edition¶
The community edition does not require activating a license.
Instead, it requires connecting to Concertio's Experiment Management System (a software as a service).
Follow these steps to set up optimizer-studio to connect to that service:
- Sign up at optimizer.concertio.com.
- Log in to the service and create a new Project.
- Once the project is created, click the Connect button and follow the instructions in the popup.
For instructions on running experiments, skip the license activation section and go straight to Using Optimizer Studio.
Activating Your License (paid plans only)¶
Before using Optimizer Studio, you must activate your license with the optimizer-license tool:
$ optimizer-license
Concertio Optimizer License Handling Tool
Online documentation is available at https://www.concertio.com/docs/
Usage: optimizer-license ACTION [--verbose]
Required one of the following ACTIONs:
--activate=PROD_KEY Online Activation of your installation using the product key [PROD_KEY].
--deactivate Online Deactivation of your installation.
--activation-request=PROD_KEY Generate an offline activation request file named <activation.req> using
the product key [PROD_KEY].
--activation-by-file=FILE_PATH Offline activation using the activation response file [FILE_PATH].
--deactivation-request Generate an offline deactivation request file <deactivation.req>.
--show Display license information.
--help Print this help message.
Optional:
--verbose Detailed output.
If your system is connected to the internet you can use the online activation:
$ optimizer-license --activate=XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX
(XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX is the license key you received when you purchased Optimizer Studio).
Otherwise, if your system is offline, the procedure is as follows.
First, generate a request file:
$ optimizer-license --activation-request=XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX
This generates a request file named activation.req. Copy this file and send it as an email attachment to Concertio (support@concertio.com).
Concertio will reply with an activation file. Copy that file to your system and run:
$ optimizer-license --activation-by-file=FILE_PATH
(FILE_PATH is the file path of the activation file).
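When activation is scripted, it can help to check the shape of the product key before passing it to optimizer-license. The following is a minimal sketch: the helper name is_key_shaped is hypothetical, and the assumed format (six hyphen-separated groups of six letters or digits) is inferred only from the XXXXXX placeholder above, not from a documented key format.

```shell
# Hypothetical helper: succeeds if $1 looks like
# XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX.
# Assumption: each group is exactly six ASCII letters or digits.
is_key_shaped() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9]{6}(-[A-Za-z0-9]{6}){5}$'
}

# Usage sketch (PROD_KEY is expected to be set by the caller):
# if is_key_shaped "$PROD_KEY"; then
#     optimizer-license --activate="$PROD_KEY"
# else
#     echo "product key does not look right" >&2
# fi
```

A check like this only catches copy and paste mistakes; whether a key is actually valid is decided by the activation itself.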
Deactivation¶
If you wish to deactivate your license (for example, to use it on a different system) and your system is online, use the following command:
$ optimizer-license --deactivate
Alternatively, if your system is offline, you should use:
$ optimizer-license --deactivation-request
This will generate a deactivation request file named deactivation.req. You should email this file to Concertio.
Using Optimizer Studio¶
For a brief description of available command line parameters, issue:
$ optimizer-studio --help
NAME:
Concertio Optimizer Studio - version 2.10.0 (build time: 2020-05-10_17:36_[UTC])
USAGE:
optimizer-studio [OPTION] [OPTION] ... "<workload> [workload args]"
optimizer-studio [OPTION] [OPTION] ...
VERSION:
2.10.0
DESCRIPTION:
Online documentation is available at https://www.concertio.com/docs/
AUTHOR:
Concertio
COMMANDS:
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--sampling-mode MODE Select sampling mode. MODE may be either sync, async:
sync - sampling is synchronized with workload execution
async - periodic sampling with constant intervals
(default: "sync")
--settings PATH PATH to settings YAML file
(default: "settings.yaml")
--knobs PATH PATH to configuration YAML files. When no configuration file(s)
have been provided, then ./knobs.yaml is used. If no such file exists,
the embedded knobs are used.
Each file specified on a command line may override
settings defined in a previous configuration file
--knobs embedded Load embedded configuration file, even though other
configuration file(s) have been specified
--retain Retain optimization data from previous Optimizer Studio run
--replay metrics.csv file Feed all metrics from the specified metrics.csv file produced by a previous
optimizer run back to optimizer, then proceed with optimization
--workload-timeout T Maximum time allowed for a single workload run, after which it is killed
T is given in the following format: AhBmCs, for [h]ours, [m]inutes, [s]econds
--max-minutes T Maximum optimization time. When T is 0 or not specified, the optimization process
ends according to internal logic
(default: 0)
--workload PARAMS This parameter causes Optimizer Studio to run in parallel mode. The
PARAMS specify workload starter name and parameters for parallel execution.
The parameters can be different for each starter script
--http-port PORT The PORT number to use for REST HTTP server if Optimizer Studio runs in
REST server mode
(default: 8421)
--testknob KNOB-NAME Run testing procedure for a knob identified by KNOB-NAME and exit
--testknob all Run testing procedure for all knobs and exit
--testknob none Skip mandatory knob testing procedure
--settings-script PATH The PATH of generated script that applies the optimal settings discovered
throughout the run. If not specified, the name is defined automatically
--report status Output a status report of already running Optimizer Studio session.
Should be run from another terminal session
--stages list Define optimization stages involved, as a comma separated list
(default: "optimization,refinement")
--session SESSIONID Provide your own Optimizer Studio session ID.
For example, in case there are several sessions running
simultaneously which require some automation logic.
If not specified, a random SESSIONID is generated automatically
and printed out on the terminal at the beginning
--help, -h show help
--version, -v print the version
COPYRIGHT:
Copyright (c) 2016-2020, Concertio
Running Optimizer Studio¶
In order to optimize a workload, provide a workload script as a parameter, as follows:
$ sudo optimizer-studio ./my_workload
Concertio Optimizer Studio version 2.4.0
Optimizing ./my_workload
Optimization target: duration
Starting runtime (no time limit) ...
...
Optimization time: 00:06:23
Progress: 100%
Best known knob settings so far:
vm.swappiness: 60
sys.vm.overcommit_ratio: 40
Settings written to: optimizer_studio_settings.sh
Alternatively, if you have a workload.sh file in the current directory, you can run optimizer-studio without any arguments and it will look for that file automatically.
In this example, the script my_workload is optimized by Optimizer Studio. Since no configuration file is specified using --knobs, Optimizer Studio defaults to optimizing CPU and OS knobs, unless a knobs.yaml file already exists in the current directory. Optimizing CPU and OS knobs requires root access, which is why Optimizer Studio is run with sudo in the above example.
There are two main ways to use Optimizer Studio to optimize a workload:
- Synchronous Sampling (Optimizing Full Workloads) - This is the default optimization mode. Optimizer Studio will run the workload until completion several times and will attempt to minimize its runtime or maximize its reported metric using different system settings. This method is appropriate for complex workloads.
- Asynchronous Sampling (Optimizing Specific Metrics) - When --sampling-mode=async is specified, Optimizer Studio identifies phases of execution of a workload and explores the best settings for each phase. This method is appropriate for simpler or synthetic workloads, where early feedback of application performance can be used to run many more experiments.
The specifics of the optimization method can be controlled using the configuration files.
Workload Settings¶
Optimizer Studio is responsible for running the workload provided by the workload script. Several command line parameters control the way Optimizer Studio runs the workload:
--max-minutes. Limits the maximum optimization time. If it is not specified or is zero, Optimizer Studio terminates the optimization process according to its own estimations.
--settings-script. After optimization, Optimizer Studio writes the discovered optimal settings and the baseline settings to this file, which can be run as a script.
Running workloads as a non-root user while optimizing OS/CPU knobs¶
Optimizer Studio requires root access when optimizing OS and CPU knobs. It is possible to run the workload as the current user using su, as in the following example:
$ sudo optimizer-studio su - $USER -c "./workload.sh"
Configuring Knobs¶
Optimizer Studio can optimize user-defined knobs by providing a configuration file. This is explained in the Configuration section.
Running Long Workloads¶
Status Report¶
During an optimization session, Optimizer Studio prints a short line about its progress. If a user wants more detailed information about the status of the session, they can open another terminal session and run the following command:
$ optimizer-studio --report=status
If more than one optimization session is running at the same time on the same computer, the additional --session command line parameter can be used:
$ optimizer-studio --report=status --session=<SESSION_ID>
where the session ID is the value printed to the terminal at the beginning of the optimizer session.
Workload Timeout¶
Sometimes a workload script can run for too long, or even indefinitely. In such cases a user might want to limit the run time of a single workload run with the --workload-timeout command line parameter. If the workload script runs longer than the allowed time, Optimizer Studio kills it using the SIGKILL signal and invalidates the corresponding knob settings.
The time is specified in the format AhBmCs, where A, B and C are the hours, minutes and seconds respectively. For example:
$ optimizer-studio --knobs=<my_knobs.yaml> --workload-timeout=5m ./my_workload.sh
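To make the AhBmCs format concrete, the sketch below converts such a string to seconds in plain POSIX shell. The function name duration_to_seconds is hypothetical and is not part of Optimizer Studio. It also assumes that the units must appear in the order hours, minutes, seconds and that each may be omitted, which is one reading of the format description above.

```shell
# Hypothetical helper: convert a duration such as 1h30m, 15m or 2h5m10s
# to seconds. Prints the result, or returns 1 if the string is malformed.
# Note: plain POSIX sh has no local variables, so these names are global.
duration_to_seconds() {
    rest=$1
    total=0
    [ -n "$rest" ] || return 1
    for unit in h m s; do
        case $rest in
            *"$unit"*)
                value=${rest%%"$unit"*}   # text before the unit letter
                rest=${rest#*"$unit"}     # text after the unit letter
                case $value in
                    ''|*[!0-9]*) return 1 ;;   # must be digits only
                esac
                # Drop leading zeros so "08" is not read as octal.
                while [ "${#value}" -gt 1 ] && [ "${value#0}" != "$value" ]; do
                    value=${value#0}
                done
                case $unit in
                    h) total=$((total + value * 3600)) ;;
                    m) total=$((total + value * 60)) ;;
                    s) total=$((total + value)) ;;
                esac
                ;;
        esac
    done
    [ -z "$rest" ] || return 1            # leftover text means a bad format
    echo "$total"
}
```

With this helper, duration_to_seconds 1h30m prints 5400 and duration_to_seconds 5m prints 300, while a string without any unit letter, such as 90, is rejected.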
Resuming Stopped Optimization Session¶
Sometimes, when an optimization session stops in the middle, a user may want to resume it from the point at which it stopped. Retaining Optimization Data is one way to achieve this. Another way is to "replay" the data collected during the previous optimization session.
Retaining Optimization Data¶
Optimizer Studio stores the optimization data in ${HOME}/.concertio/opt.data. By default, upon invocation of Optimizer Studio, this file is removed. In order to retain the data from the previous run, the --retain command line parameter can be used:
$ optimizer-studio --retain ./my_workload
Replaying Optimization Samples¶
During the optimization, Optimizer Studio stores all metrics collected for each applied knob configuration in a CSV file. This file is stored in the ${HOME}/.concertio directory and is named metrics_<timestamp>.csv, where the timestamp is the file creation time at the beginning of the optimization session. Using the --replay command line parameter, a user can feed all this data back to the optimizer:
$ optimizer-studio --knobs=<my_knobs.yaml> --replay=${HOME}/.concertio/<my_metrics>.csv ./my_workload.sh
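Since the file name contains a timestamp, a script usually has to look the file up before replaying it. The sketch below picks the most recently modified metrics_*.csv file. The function name latest_metrics_file is hypothetical, and selecting by modification time (rather than by the timestamp in the name, whose exact format is not described here) is an assumption.

```shell
# Hypothetical helper: print the most recently modified metrics_*.csv
# in a directory (default: ${HOME}/.concertio). Returns 1 if none exists.
# Assumes file names do not contain newlines.
latest_metrics_file() {
    dir=${1:-"$HOME/.concertio"}
    # ls -t sorts by modification time, newest first.
    newest=$(ls -t "$dir"/metrics_*.csv 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
```

The result can then be passed to the option described above, for example --replay="$(latest_metrics_file)". Review the selected file first if the most recent run is not the one to resume.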
When To Use Retain And Replay¶
Each method of resuming optimization has its pros and cons.
Retaining Optimization Data restores the optimization session to exactly the state it was in when the optimization stopped. However, if there were any changes in the environment, knob definitions or product version, Optimizer Studio may fail to read this information, in which case it discards it altogether and the optimization starts from the very beginning.
In addition, the opt.data file is overwritten each time Optimizer Studio runs, so users who do not want to lose the file need to back it up.
Replaying metrics.csv uses the data in that file. Before invoking Optimizer Studio, users can review the file, modify it, drop invalid samples, etc. Replaying the CSV file is a fast operation. However, the optimizer state after a replay is not a precise copy of its state at the time the optimization stopped. As a result, the optimizer may decide to reevaluate some knob configurations, which takes additional time to run the workload. This overhead is usually relatively small.
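A backup of opt.data can be taken with a few lines of shell before the next run. This is only a sketch: the function name backup_opt_data and the naming of the copy (opt.data.<date>-<time>.bak) are hypothetical choices, not something Optimizer Studio defines.

```shell
# Hypothetical helper: copy opt.data to a timestamped sibling file and
# print the path of the copy.
# $1: directory holding opt.data (default: ${HOME}/.concertio).
backup_opt_data() {
    dir=${1:-"$HOME/.concertio"}
    src=$dir/opt.data
    [ -f "$src" ] || { echo "no opt.data in $dir" >&2; return 1; }
    dest=$src.$(date +%Y%m%d-%H%M%S).bak
    # -p keeps the original modification time and permissions.
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

To go back to a saved state, copy the backup over opt.data again and start optimizer-studio with --retain.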
In-situ Optimization¶
Note: This is an evolving new feature. Expect changes and additions to this feature.
In order to achieve best results, it is preferred to perform the optimization on a production machine rather than in a simulated laboratory environment. This is of course not always possible, but when it is (e.g., when the optimization may take place on a single production server) it may yield better results.
In-situ optimization opens up the possibility for new optimization techniques such as replacing memory allocation algorithms in the optimized services.
Configuring In-situ Optimization¶
Assuming your service start-up looks like the following:
MyService <Param1> <Param2> ... <ParamN>
Adding the wrapper command accelerate will mark the service as a target for optimization:
accelerate MyService <Param1> <Param2> ... <ParamN>
Next you have to configure a suitable knobs.yaml file. A simple knobs.yaml file may look like:
global_settings:
  min_samples_per_config: 1
  min_baseline_samples: 1
domain:
  common:
    knobs:
      memory:
        kind: accel.mem_alloc
        options: [glibc, mimalloc, tcmalloc, jemalloc]
This simple configuration file will select the best memory allocation algorithm for MyService among the default glibc, mimalloc, tcmalloc and jemalloc algorithms.
To start the optimization session, optimizer-studio needs to be started with the accelerate workload. accelerate is a reserved workload name that is used for in-situ optimization:
optimizer-studio accelerate
When optimizer-studio is running in accelerate mode, it will wait for an instance of MyService to show up in order to start each sample. This means that your timing of the service operation will not be affected by the optimization process (i.e., if the service is started once every 24 hours, then each sample will take 24 hours). If your service is not regularly restarted, it is advised to add a cron job (or some kind of script) which will force a restart of the service. This will allow optimizer-studio to evaluate different configurations.
Optimizer Studio Workflow¶
The Optimizer Studio workflow consists of several stages. They are selected with the --stages command line switch, which takes a comma-separated list of the desired stages.
The available workflow stages are:
- readiness
- optimization
- refinement
- validation
The default workflow setting is --stages='optimization,refinement'.
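When the stage list is assembled by a script, a quick check against the four stage names listed above can catch typos before a long run starts. The sketch below does only that. The function name validate_stages is hypothetical, and it does not check the order of the stages, since no ordering rule is stated here.

```shell
# Hypothetical helper: succeeds if every comma-separated name in $1 is one
# of the four stages listed above; otherwise reports the first unknown
# name on stderr and returns 1.
validate_stages() {
    [ -n "$1" ] || return 1
    old_ifs=$IFS
    IFS=,
    set -f                      # no glob expansion while splitting on commas
    for stage in $1; do
        case $stage in
            readiness|optimization|refinement|validation) ;;
            *)
                IFS=$old_ifs; set +f
                echo "unknown stage: $stage" >&2
                return 1
                ;;
        esac
    done
    IFS=$old_ifs; set +f
}
```

A wrapper script could call validate_stages "$STAGES" and pass --stages="$STAGES" to optimizer-studio only when the check succeeds.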
All stages specified in the --stages parameter are run one after another. If you want to break the optimization workflow after some stage, specify only the required stages. Then, when you are ready to resume the process, run optimizer-studio again using the --retain parameter and specifying the other stages and other parameters.
Some stages have their own parameters, which should be specified in a knobs.yaml file.
Readiness Stage¶
The goal of the Readiness Stage is to find the correct optimization parameters before running the optimization process. Optimizer Studio runs the workload several times using the baseline knob configuration to find the approximate run time of the workload, as well as the statistical characteristics of the target metric for the baseline configuration: mean value, standard deviation, coefficient of variation (CV), etc.
After this stage, the user can set optimization parameters with more confidence.
The results of the Readiness stage are reported to the Experiment Management System.
Readiness Stage Configuration¶
The following section of a knobs.yaml file is used for configuration:
stages:
  readiness:
    duration: 15m
    max_samples: 50
    min_samples: 10
duration values should have the syntax AhBmCs, where A, B, C are the values, and h, m, s mean "hours", "minutes" and "seconds" respectively. For example: 1h30m, 15m, etc.
Only one of the duration and max_samples parameters is required. min_samples is optional. Optimizer Studio runs the Readiness stage until max_samples have been acquired or duration has elapsed, whichever comes first, but not before min_samples have been acquired. The default for min_samples is 0.
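The way the three parameters interact can be written down as a small predicate. This is a sketch of one reading of the rule above, not Optimizer Studio's actual implementation. The function name stage_should_stop is hypothetical, durations are given in seconds, and passing 0 for max_samples or duration to mean "not set" is a convention chosen for this example only.

```shell
# Hypothetical helper: decide whether a stage should stop sampling.
# Arguments: samples taken, seconds elapsed, max_samples,
#            duration in seconds, min_samples (optional, default 0).
# Returns 0 (stop) or 1 (keep sampling).
stage_should_stop() {
    samples=$1 elapsed=$2 max_samples=$3 duration=$4 min_samples=${5:-0}

    # Never stop before min_samples have been acquired.
    [ "$samples" -ge "$min_samples" ] || return 1

    # Stop as soon as either configured limit has been reached.
    if [ "$max_samples" -gt 0 ] && [ "$samples" -ge "$max_samples" ]; then
        return 0
    fi
    if [ "$duration" -gt 0 ] && [ "$elapsed" -ge "$duration" ]; then
        return 0
    fi
    return 1
}
```

With the example configuration above (duration 15m, that is 900 seconds, max_samples 50, min_samples 10), stage_should_stop 5 1000 50 900 10 keeps sampling because only 5 samples exist, while stage_should_stop 12 900 50 900 10 stops because the duration has elapsed and the minimum has been met.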
Optimization Stage¶
The Optimization stage is the main optimization process: it runs the workload and tests different knob configurations. This is the main and default functionality of Optimizer Studio.
The Optimization stage does not have its own parameters. It is controlled by all the parameters defined for Optimizer Studio.
Refinement Stage¶
The Refinement Stage runs after the Optimization Stage has finished. It tries to minimize the number of knobs which should differ from their baseline configuration.
If Optimizer Studio has been stopped before, starting optimizer-studio again with --stages='refinement' will automatically try to find the results of the previous optimization session and continue with the refinement process.
Validation Stage¶
The Validation Stage is the last stage in the Optimizer Studio workflow.
After finding and refining the optimal knob configuration, Optimizer Studio runs the workload with this configuration a number of times. While doing so, it collects various statistics for the target metric, such as mean value, standard deviation and coefficient of variation (CV). The report containing these statistics is sent to the Experiment Management System.
If Optimizer Studio has been stopped before, starting optimizer-studio again with --stages='validation' will automatically try to find the results of the previous optimization session and continue with the validation process.
Validation Stage Configuration¶
The following section of a knobs.yaml file is used for configuration:
stages:
  validation:
    duration: 15m
    max_samples: 50
    min_samples: 10
The parameter settings for the Validation stage are similar to those of the Readiness stage.
Using Optimization Results¶
Optimizer Studio stores the results of the optimization in a script, as configured in settings.yaml. This script can be used to apply the discovered results:
$ /tmp/studio-settings.XYZ.sh tuned
Concertio Optimizer Studio, version 2.4.0
Auxiliary script for loading optimized settings for ./my_workload
Creation date: Thu Dec 12 13:35:58 UTC 2019
Apply tuned settings
...
To revert to baseline, use:
$ /tmp/studio-settings.XYZ.sh baseline
Concertio Optimizer Studio, version 2.4.0
Auxiliary script for loading optimized settings for ./my_workload
Creation date: Thu Dec 12 13:35:58 UTC 2019
Applying baseline settings
...
To define bash array variables holding the knob values, source the script with the valarray argument:
$ . /tmp/studio-settings.XYZ.sh valarray
Concertio Optimizer Studio, version 2.4.0
Auxiliary script for loading optimized settings for ./my_workload
Creation date: Thu Dec 12 13:35:58 UTC 2019
Exporting values as KNOB_VALUES_TUNED and KNOB_VALUES_BASELINE array variables
$ echo ${KNOB_VALUES_TUNED[@]}
...