While functional tests (see Managing functional tests) form an excellent tool for verifying specific features and avoiding regression issues, there is also a need for validating simulation results for more realistic and complex models. Because nontrivial radiative transfer problems cannot be solved analytically or with deterministic numerical methods, the only option is to compare the results produced by different codes. This realization has led several authors to publish well-defined benchmark problems with corresponding reference solutions produced by a number of codes participating in the benchmark.
SKIRT successfully passes the relevant benchmarks available in the literature. The geometries, source spectra and medium properties needed for these benchmarks are built into the code. For an overview, see the Benchmarks section of the SKIRT web site, which shows the results for each benchmark and offers the corresponding configuration files for download, so that any interested third party can run the benchmark simulations.
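For illustration, a downloaded benchmark configuration can be launched from a short Python script. The sketch below is a minimal example, assuming that the skirt executable is on the system PATH; the ski file name is a hypothetical placeholder for one of the downloadable configuration files.

```python
# Minimal sketch: run a benchmark configuration downloaded from the SKIRT web site.
# Assumes the 'skirt' executable is on the PATH; the ski file name is hypothetical.
import subprocess
from pathlib import Path

ski_file = Path("DustyDiskBenchmark.ski")   # placeholder for a downloaded ski file
if not ski_file.exists():
    raise SystemExit(f"download {ski_file} from the Benchmarks web page first")

# Launch SKIRT in the current directory; output files are written next to the ski file.
subprocess.run(["skirt", str(ski_file)], check=True)
```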
As a side benefit, benchmarks also exercise the SKIRT multi-threading parallelization mechanisms. Because benchmarks are usually executed in a single process, however, the multi-processing implementation is not tested in this way.
It is not feasible to run benchmarks in a fully automated manner for two reasons:
Fortunately, there is no need to run the benchmarks for every update to the SKIRT code. The recommended approach is as follows:
The _ReadMe.txt text file in the Benchmark directory describes the structure of a typical benchmark directory and the Python scripts provided for running the benchmark simulations and verifying the results. Essentially, the procedure for running a benchmark has four stages: prepare, execute, visualize, and evaluate. By separating the execution stage from the other stages, the simulations can be run on another computer, such as a remote server.
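As an illustration of how the four stages fit together, the sketch below shows a hypothetical driver script; the stage functions and command-line interface are assumptions made for this example and do not correspond to the actual scripts described in _ReadMe.txt.

```python
# Hypothetical driver illustrating the four benchmark stages; the function names
# and command-line options are assumptions, not the actual SKIRT benchmark scripts.
import argparse

def prepare(benchmark):      # generate the input data and ski files for the benchmark
    print(f"preparing {benchmark} ...")

def execute(benchmark):      # run the SKIRT simulations (possibly on a remote server)
    print(f"executing {benchmark} ...")

def visualize(benchmark):    # convert the simulation output into plots
    print(f"visualizing {benchmark} ...")

def evaluate(benchmark):     # compare the plots against the published reference solutions
    print(f"evaluating {benchmark} ...")

STAGES = {"prepare": prepare, "execute": execute,
          "visualize": visualize, "evaluate": evaluate}

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="perform one stage of a benchmark")
    parser.add_argument("benchmark", help="name of the benchmark directory")
    parser.add_argument("stage", choices=STAGES, help="stage to perform")
    args = parser.parse_args()
    # Because execution is a separate stage, it can be performed on another computer
    # (such as a remote server) while the remaining stages run locally.
    STAGES[args.stage](args.benchmark)
```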
When a new area of physics is implemented in SKIRT, its operation is usually validated by comparison with known results, whether theoretical or generated by other simulation codes. These validation tests should be added to the list of benchmark specifications in the Benchmark directory. Likewise, when a new relevant benchmark effort is published, with or without involvement of the SKIRT team, a corresponding benchmark specification should be added.
When constructing a new benchmark specification, follow the structure of the other benchmarks as closely as possible. If there is a need to deviate from this structure, document the deviation in a ReadMe file. Also remember to add a corresponding benchmark description, ski file(s) and results to the Benchmarks section of the SKIRT web site.
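The following sketch scaffolds a directory for a new benchmark specification in the spirit described above; the file names are illustrative placeholders only, since the actual conventions are documented in _ReadMe.txt.

```python
# Hypothetical scaffold for a new benchmark specification; the file names below are
# illustrative placeholders -- consult _ReadMe.txt for the actual conventions.
from pathlib import Path

def scaffold_benchmark(root: Path, name: str) -> None:
    """Create an empty skeleton for a new benchmark under the Benchmark directory."""
    bench = root / name
    bench.mkdir(parents=True, exist_ok=True)
    # ski file(s) defining the benchmark simulation(s) (placeholder name)
    (bench / f"{name}.ski").touch()
    # document any deviation from the common benchmark structure here
    (bench / "ReadMe.txt").touch()
    # placeholders for the per-stage Python scripts (names are assumptions)
    for stage in ("prepare", "visualize", "evaluate"):
        (bench / f"{stage}_{name}.py").touch()

scaffold_benchmark(Path("Benchmark"), "MyNewBenchmark")
```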