Benchmarking Tips and Tricks
Below are a few pieces of advice to facilitate the evaluation of your solvers. Feel free to follow them if they seem relevant to you, and to adapt them to your own use cases.
Easily testing with different parameters
One can expect some solvers to have a lot of possible parameters. It would be impractical to manually define a name for each of the possible variants. What would be really nice would be to have the program automatically understand that:
- `taboo` refers to the taboo method with the default parameters
- `taboo__est-spt__20` refers to the taboo method with a taboo size of `20` and the `est-spt` solver to provide the initial solution
- `taboo__descent__10` refers to the taboo method with a taboo size of `10` and the `descent` solver to provide the initial solution
Here is a partial example of an implementation of the `Solver.getSolver()` method that leverages regular expressions to do exactly that. It should of course be adapted to match the parameters of your own solvers, for instance a value for the `maxIter` parameter of the taboo method, or the neighborhood to use.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Static factory method to create a new solver based on its name. */
static Solver getSolver(String name) {
    switch (name) {
        case "basic": return new BasicSolver();
        ...
        // taboo with some default parameters
        case "taboo": return new TabooSolver(getSolver("est_lrpt"), 20);
        default:
            // not one of the predefined solvers, try to see if we can extract
            // parameters from the solver name, using Pattern/Matcher from the
            // java.util.regex package
            // (base solver names may contain lowercase letters, '_' or '-')
            Pattern taboo = Pattern.compile("taboo__([a-z_-]+)__([0-9]+)");
            Matcher m = taboo.matcher(name);
            if (m.find()) {
                String baseSolverName = m.group(1);
                int tabooSize = Integer.parseInt(m.group(2));
                return new TabooSolver(getSolver(baseSolverName), tabooSize);
            }
            // not a taboo, try matching with descent
            Pattern descent = Pattern.compile("descent__([a-z_-]+)");
            ...
            throw new RuntimeException("Unknown solver: " + name);
    }
}
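To check that such name parsing behaves as expected, here is a small self-contained sketch (using hypothetical names, independent of the project's classes) that exercises the same Pattern/Matcher logic:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SolverNameDemo {
    public static void main(String[] args) {
        // same shape of pattern as in getSolver(): base solver name, then taboo size
        Pattern taboo = Pattern.compile("taboo__([a-z_-]+)__([0-9]+)");

        Matcher m = taboo.matcher("taboo__descent__10");
        if (m.find()) {
            System.out.println("base solver: " + m.group(1));                  // descent
            System.out.println("taboo size : " + Integer.parseInt(m.group(2))); // 10
        }

        // a plain name does not match, so it would fall through to the error case
        System.out.println(taboo.matcher("basic").find()); // false
    }
}
```

Note that `find()` must be called before `group()`: the groups are only populated once a match has been attempted.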
Using INSA's compute servers
To launch your experimental analysis, you can use one of the available compute servers (accessible with INSA's VPN): `srv-gei-gpu1`, `srv-gei-gpu2` or `srv-ens-calcul`.
Those servers are supposed to be always on. You can check how long a server has been running with the `uptime` command. Please notify us if a server has been recently restarted or if you have any problem accessing them.
# Example to connect to srv-gei-gpu2
moi@mamachine:~$ ssh srv-gei-gpu2
moi@srv-gei-gpu2's password:
...
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-72-generic x86_64)
...
moi@srv-gei-gpu2:~$ cd PATH/TO/MY/CODE
moi@srv-gei-gpu2:~/MyCode$ mvn compile
moi@srv-gei-gpu2:~/MyCode$
moi@srv-gei-gpu2:~/MyCode$ nohup mvn compile exec:java -Dexec.args="--solver method1 method2 (...) --instance ta (...)" > MyOutFile 2> MyErrFile &
Before running any benchmark, you should check that the server is not already under heavy load, which might impact the performance of your program. You can manually check that with the `top` command.
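As a non-interactive alternative (a sketch assuming a Linux server, which is the case for the machines above), you can compare the 1-minute load average to the number of CPU cores: a load well below the core count means the machine is mostly idle.

```shell
# number of CPU cores on the machine
cores=$(nproc)
# 1-minute load average: first field of /proc/loadavg (Linux-specific)
load=$(cut -d ' ' -f1 /proc/loadavg)
echo "load average: $load for $cores cores"
```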
Automating the collection and analysis of results
The `jobshop.Main` program that is provided to you is a good start to build your own benchmarking suite. However, the results that are printed on the standard output are mostly intended to be easily readable by a human being, not by a computer program.
To automate the collection of results and their analysis, you might want to modify the `jobshop.Main` program to:
- accept an additional command line argument that specifies a filename to which the collected data must be exported.
- if this argument is specified, then write the results to this file in a commonly accepted format.
A very common format for this is the CSV (Comma Separated Values) format, which can be:
- opened in spreadsheet software such as Excel or LibreOffice
- read from most programming languages (including Python with the `csv` or `pandas` modules)

You can then have automated scripts that process the result files, for instance to produce some graphs. In Python, the `matplotlib` package is useful for making graphs and the `pandas` package for performing statistical analysis. The `gnuplot` program is also commonly used to create graphs from CSV data.
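As an illustration, here is a minimal sketch of what the CSV export could look like. The `Result` class and the column names are hypothetical; they should be adapted to the data your `Main` actually collects.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

public class CsvExport {
    // hypothetical container for one benchmark measurement (adapt to your own data)
    static class Result {
        final String solver;
        final String instance;
        final long runtimeMs;
        final int makespan;
        Result(String solver, String instance, long runtimeMs, int makespan) {
            this.solver = solver;
            this.instance = instance;
            this.runtimeMs = runtimeMs;
            this.makespan = makespan;
        }
    }

    /** Writes the results to the given file in CSV format, one row per measurement. */
    static void writeCsv(String filename, List<Result> results) throws IOException {
        try (PrintWriter out = new PrintWriter(filename)) {
            out.println("solver,instance,runtime_ms,makespan"); // header row
            for (Result r : results) {
                out.println(r.solver + "," + r.instance + "," + r.runtimeMs + "," + r.makespan);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        writeCsv("results.csv", List.of(
                new Result("taboo__descent__10", "ta01", 1234, 1486)));
    }
}
```

The resulting file can be loaded directly into a spreadsheet, or with `pandas.read_csv("results.csv")` in Python.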