Custom cost functions¶
All user-defined cost functions are stored inside the file:
$TS/exp/stan/nmr/py/user/poise_backend/costfunctions_user.py
where $TS is your TopSpin installation path.
In order to modify or add cost functions, you will need to edit this file (with your favourite text editor or IDE).
The corresponding file containing builtin cost functions is costfunctions.py.
You can edit this file directly: if you add a cost function there, it will work.
However, there are two risks with this.
Firstly, if you ever reinstall POISE, this file will be reset to the default (whereas costfunctions_user.py will not).
Secondly, any cost functions defined in costfunctions_user.py will shadow (i.e. take priority over) the cost functions defined in costfunctions.py if they have the same name.
The rules for cost functions¶
- A cost function is defined as a standard Python 3 function which takes no parameters and returns a float (the value of the cost function).
- Do write a useful docstring if possible: this docstring will be shown to the user when they type poise -l into TopSpin (which lists all available cost functions and routines).
- The spectrum under optimisation, as well as acquisition parameters, can be accessed via helper functions. These are described more fully below.
- Never print anything inside a cost function directly to stdout. This will cause the optimisation to stop. If you want to perform debugging, use the log function described below.
- To terminate the optimisation prematurely and return the best point found so far, raise CostFunctionError(). See below for more information.
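As a rough illustration of these rules, the sketch below shows the general shape of a cost function. The name max_abs_intensity and the stand-in get1d_real are hypothetical: inside costfunctions_user.py the real get1d_real helper is already imported and returns a numpy.ndarray, whereas the stub here returns a made-up list so that the snippet runs on its own.

```python
# Stand-in for POISE's get1d_real() helper (assumption: the real helper reads
# the current spectrum and returns a numpy.ndarray). The values are made up.
def get1d_real(bounds="", p_spec=None):
    return [0.5, 2.0, -1.0, 3.5]


def max_abs_intensity():
    """
    Hypothetical cost function: minus the largest absolute intensity in the
    real spectrum, so that a lower value corresponds to a taller peak.
    """
    spectrum = get1d_real()
    # Takes no parameters, prints nothing to stdout, and returns a float.
    return float(-max(abs(point) for point in spectrum))


example_value = max_abs_intensity()
```

The docstring is what a user would see when listing cost functions with poise -l.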
Accessing spectra and parameters¶
The most primitive way of accessing “outside” information is through the class _g, which is imported from shared.py and contains a series of global variables reflecting the current optimisation.
For example, _g.p_spectrum is the path to the procno folder: you can read and parse the 1r file inside this to get the real spectrum as a numpy.ndarray (for example).
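To give a feel for what "reading and parsing" involves, here is a minimal sketch using only the standard library. It assumes that 1r holds 32-bit little-endian integers; the actual data type and byte order depend on the processing parameters of your dataset, so treat this as an illustration rather than a general reader. The function name read_1r_raw and the temporary folder are made up for the demonstration, and the raw values lack the NC_PROC scaling that the get1d_real helper applies.

```python
import struct
import tempfile
from pathlib import Path


def read_1r_raw(p_spectrum, byte_order="<"):
    """Read <p_spectrum>/1r as a list of raw integers (no NC_PROC scaling)."""
    raw = (Path(p_spectrum) / "1r").read_bytes()
    n_points = len(raw) // 4  # assumption: 4 bytes per data point
    return list(struct.unpack(f"{byte_order}{n_points}i", raw[: 4 * n_points]))


# Demonstration on a fake procno folder, standing in for _g.p_spectrum.
with tempfile.TemporaryDirectory() as tmp:
    fake_procno = Path(tmp)
    (fake_procno / "1r").write_bytes(struct.pack("<4i", 10, -20, 30, 40))
    demo_points = read_1r_raw(fake_procno)
```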
Class to store the “global” variables.
- Attributes
  - optimiser : str from {‘nm’, ‘mds’, ‘bobyqa’}
    The optimiser being used.
  - routine_id : str
    The name of the routine being used.
  - p_spectrum : Path
    The path to the procno folder of the spectrum just acquired (e.g. /path/to/data/1/pdata/1).
  - p_optlog : Path
    The path to the currently active poise.log file.
  - p_errlog : Path
    The path to the currently active poise_err_backend.log file.
  - maxfev : int
    The maximum number of function evaluations specified by the user. Can be zero, indicating no limit (beyond the hard limit of 500 times the number of parameters).
  - p_poise : Path
    The path to the $TS/exp/stan/nmr/py/user/poise_backend folder.
  - spec_f1p : float or tuple of float
    The F1P parameter. For a 1D spectrum this is a float. For a 2D spectrum this is a tuple of floats (indirect, direct) corresponding to the values of F1P in both spectral dimensions.
  - spec_f2p : float or tuple of float
    The F2P parameter.
  - xvals : list of ndarray
    The points sampled during the optimisation, in chronological order. These are ndarrays which contain the values of the parameters being optimised at each spectrum acquisition. The parameters are ordered in the same way as specified in the routine.
  - fvals : ndarray
    The values of the cost functions calculated at each stage of the optimisation.
However, this is quite tedious and error-prone, so there are a number of helper methods which use these primitives.
All the existing cost functions (inside costfunctions.py) only use these helper methods.
All of these methods are stored inside cfhelpers.py and are already imported by default.
The ones you are likely to use are the following:
- nmrpoise.poise_backend.cfhelpers.make_p_spec(path=None, expno=None, procno=None)¶
Constructs a Path object corresponding to the procno folder <path>/<expno>/pdata/<procno>. If parameters are not passed, they are inherited from the currently active spectrum (_g.p_spectrum).
Thus, for example, make_p_spec(expno=1, procno=1) returns a path to the spectrum with EXPNO 1 and PROCNO 1, but with the same name as the currently active spectrum.
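The inheritance behaviour can be pictured with the following sketch. It is not POISE's implementation: make_p_spec_sketch and ACTIVE_P_SPECTRUM are hypothetical names, and the active spectrum is hard-coded to the example path /path/to/data/1/pdata/1 rather than taken from _g.p_spectrum.

```python
from pathlib import PurePosixPath

# Stand-in for _g.p_spectrum (assumption: the example procno path).
ACTIVE_P_SPECTRUM = PurePosixPath("/path/to/data/1/pdata/1")


def make_p_spec_sketch(path=None, expno=None, procno=None,
                       active=ACTIVE_P_SPECTRUM):
    """Build <path>/<expno>/pdata/<procno>, filling gaps from `active`."""
    if path is None:
        path = active.parents[2]        # folder containing the expno folders
    if expno is None:
        expno = active.parents[1].name  # expno of the active spectrum
    if procno is None:
        procno = active.name            # procno of the active spectrum
    return PurePosixPath(path) / str(expno) / "pdata" / str(procno)


other_expno = make_p_spec_sketch(expno=3)
same_as_active = make_p_spec_sketch(expno=1, procno=1)
```

Here other_expno points to expno 3 of the same dataset while keeping the active procno, which is the typical use when a cost function needs a reference spectrum stored alongside the one being optimised.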
- nmrpoise.poise_backend.cfhelpers.get1d_fid(remove_grpdly=True, p_spec=None)¶
Returns the FID as a ndarray.
- Parameters
  - remove_grpdly : bool, optional
    Whether to remove the group delay (to be precise, it is shifted to the end of the FID). Defaults to True.
  - p_spec : Path, optional
    Path to the procno folder of interest. (The FID is taken from the expno folder two levels up.) Defaults to the currently active spectrum (i.e. _g.p_spectrum).
- Returns
  - ndarray
    Complex-valued array containing the FID.
- nmrpoise.poise_backend.cfhelpers.get1d_real(bounds='', p_spec=None)¶
Return the real spectrum as a ndarray. This function accounts for TopSpin’s NC_PROC variable, scaling the spectrum intensity accordingly.
Note that this function only works for 1D spectra. It does not work for 1D projections of 2D spectra. If you want to work with projections, you can use get2d_rr to get the full 2D spectrum, then manipulate it using numpy functions as appropriate. A documented example can be found in the asaphsqc() function in costfunctions.py (commented out by default).
The bounds parameter may be specified in the following formats:
- between 5 and 8 ppm: bounds="5..8" OR bounds=(5, 8)
- greater than 9.3 ppm: bounds="9.3.." OR bounds=(9.3, None)
- less than -2 ppm: bounds="..-2" OR bounds=(None, -2)
- Parameters
  - bounds : str or tuple, optional
    String or tuple describing the region of interest. See above for examples. If no bounds are provided, uses the F1P and F2P processing parameters, which can be specified via dpl. If these are not specified, defaults to the whole spectrum.
  - p_spec : Path, optional
    Path to the procno folder of interest. Defaults to the currently active spectrum (i.e. _g.p_spectrum).
- Returns
  - ndarray
    Array containing the spectrum or the desired section of it (if bounds were specified).
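The string and tuple forms of bounds are equivalent. The sketch below shows one way the string form could be converted into a (lower, upper) tuple; parse_bounds is a hypothetical name and this is not necessarily how POISE parses the argument internally. In particular, the real helper falls back to F1P and F2P when no bounds are given, whereas this sketch simply returns (None, None).

```python
def parse_bounds(bounds):
    """Return (lower, upper) in ppm; None means unbounded on that side."""
    if isinstance(bounds, str):
        if bounds == "":
            # Assumption for this sketch only: no bounds means no limits.
            return (None, None)
        lower, separator, upper = bounds.partition("..")
        if not separator:
            raise ValueError(f"bounds string {bounds!r} does not contain '..'")
        return (float(lower) if lower else None,
                float(upper) if upper else None)
    # Otherwise assume a 2-tuple such as (5, 8) or (9.3, None).
    lower, upper = bounds
    return (lower, upper)


between = parse_bounds("5..8")
above = parse_bounds("9.3..")
below = parse_bounds("..-2")
```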
- nmrpoise.poise_backend.cfhelpers.get1d_imag(bounds='', p_spec=None)¶
Same as get1d_real, except that it reads the imaginary spectrum.
- nmrpoise.poise_backend.cfhelpers.get2d_rr(f1_bounds='', f2_bounds='', p_spec=None)¶
Return the real part of the 2D spectrum (the “RR” quadrant) as a 2D ndarray. This function takes into account the NC_PROC value in TopSpin’s processing parameters.
The f1_bounds and f2_bounds parameters may be specified in the following formats:
- between 5 and 8 ppm: f1_bounds="5..8" OR f1_bounds=(5, 8)
- greater than 9.3 ppm: f1_bounds="9.3.." OR f1_bounds=(9.3, None)
- less than -2 ppm: f1_bounds="..-2" OR f1_bounds=(None, -2)
- Parameters
  - f1_bounds : str or tuple, optional
    String or tuple describing the indirect-dimension region of interest. See above for examples. If no bounds are provided, uses the 1 F1P and 1 F2P processing parameters, which can be specified via dpl. If these are not specified, defaults to the whole spectrum.
  - f2_bounds : str or tuple, optional
    String or tuple describing the direct-dimension region of interest. See above for examples. If no bounds are provided, uses the 2 F1P and 2 F2P processing parameters, which can be specified via dpl. If these are not specified, defaults to the whole spectrum.
  - p_spec : Path, optional
    Path to the procno folder of interest. Defaults to the currently active spectrum (i.e. _g.p_spectrum).
- Returns
  - ndarray
    2D array containing the spectrum or the desired section of it (if f1_bounds or f2_bounds were specified).
- nmrpoise.poise_backend.cfhelpers.get2d_ri(f1_bounds='', f2_bounds='', p_spec=None)¶
Same as get2d_rr, except that it reads the ‘2ri’ file.
- nmrpoise.poise_backend.cfhelpers.get2d_ir(f1_bounds='', f2_bounds='', p_spec=None)¶
Same as get2d_rr, except that it reads the ‘2ir’ file.
- nmrpoise.poise_backend.cfhelpers.get2d_ii(f1_bounds='', f2_bounds='', p_spec=None)¶
Same as get2d_rr, except that it reads the ‘2ii’ file.
- nmrpoise.poise_backend.cfhelpers.getpar(par, p_spec=None)¶
Obtains the value of a numeric (acquisition or processing) parameter. Non-numeric parameters (i.e. strings) are not currently accessible! Works for both 1D and 2D spectra (see return type below), but nothing higher.
- Parameters
  - par : str
    Name of the parameter.
  - p_spec : Path, optional
    Path to the procno folder of interest. Defaults to the currently active spectrum (i.e. _g.p_spectrum).
- Returns
  - float or ndarray
    Value(s) of the requested parameter. None if the given parameter was not found.
    For parameters that exist for both dimensions of 2D spectra, getpar() returns an ndarray consisting of (f1_value, f2_value). Otherwise (for 1D spectra, or for 2D parameters which only apply to the direct dimension), getpar() returns a float.
    Note that a float is returned even for parameters which can logically only be integers (e.g. TD). If you want an integer you have to manually convert it using int().
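The following sketch illustrates the two points about the return value for a 1D spectrum: checking for None and converting to int. The getpar defined here is only a stand-in that looks values up in a made-up dictionary, and acquired_points is a hypothetical helper name; in costfunctions_user.py you would call the real getpar directly.

```python
# Made-up parameter values standing in for a real 1D dataset.
_FAKE_PARAMETERS = {"TD": 65536.0, "NS": 16.0}


def getpar(par, p_spec=None):
    """Stand-in for POISE's getpar(): returns a float, or None if not found."""
    return _FAKE_PARAMETERS.get(par)


def acquired_points():
    """Return TD as an integer, failing loudly if it cannot be read."""
    td = getpar("TD")
    if td is None:
        raise KeyError("TD was not found in the parameter files")
    return int(td)  # getpar() returns a float even for integer parameters


n_points = acquired_points()
```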
- nmrpoise.poise_backend.cfhelpers.getndim(p_spec=None)¶
Obtains the dimensionality of the spectrum, i.e. the status value of PARMODE, plus one (because PARMODE is 0 for 1D spectra, etc.)
- Parameters
  - p_spec : Path, optional
    Path to the procno folder of interest. Defaults to the currently active spectrum (i.e. _g.p_spectrum).
- Returns
  - int
    Dimensionality of the spectrum.
- nmrpoise.poise_backend.cfhelpers.getnfev()¶
Returns the number of NMR spectra evaluated so far. This will be equal to 1 (not 0) the first time the cost function is called, since the cost function is only called after the NMR spectrum has been acquired.
Logging¶
As noted above, printing anything to stdout will cause the optimisation to crash.
This is because stdout is reserved for communication between the POISE backend (the Python 3 component) and frontend (which runs in TopSpin).
Please use log() instead, which will print to the poise.log file in the expno folder.
It works in exactly the same way as the familiar print(), and accepts the same kind of arguments.
- nmrpoise.poise_backend.cfhelpers.log(*args)¶
Prints something to the poise.log file.
If this is called from inside a cost function, the text is printed before the cost function is evaluated, so it will appear above the corresponding function evaluation.
- Parameters
  - args
    The arguments that log() takes are exactly the same as those of print().
- Returns
  - None
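Conceptually, a function like log() can be thought of as print() redirected to a file. The sketch below illustrates that idea only; it is not POISE's implementation, and make_logger and the temporary log file are hypothetical (the real log() writes to the active poise.log without any setup on your part).

```python
import tempfile
from pathlib import Path


def make_logger(p_optlog):
    """Return a log(*args) function that appends to the file at p_optlog."""
    def log(*args):
        with open(p_optlog, "a") as log_file:
            # Same formatting as print(), but the text never reaches stdout.
            print(*args, file=log_file)
    return log


# Demonstration with a temporary file standing in for poise.log.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "poise.log"
    log = make_logger(log_path)
    log("cost =", 0.25)
    logged_text = log_path.read_text()
```

Inside a cost function, the equivalent debugging call would simply be log("cost =", value).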
Premature termination¶
In order to terminate an optimisation prematurely, you can raise any exception you like, for example with raise ValueError.
POISE will stop the optimisation, and the error will be displayed in TopSpin.
The drawback of this naive approach is that no information from the incomplete optimisation will be retained.
That means that even if you have found a point that is substantially better than the initial point, it will not be saved.
If you want to terminate and return the best point found so far, please raise a CostFunctionError instead of any other exception (such as ValueError).
CostFunctionError takes up to 2 arguments: the number of arguments to supply depends on whether you want to include the final point which caused the error to be raised.
If you want to discard the last point, i.e. just stop the optimisation right away, then raise a CostFunctionError with just 1 argument. The argument should be the message that you want to display to the user:
def cost_function():
    cost_fn_value = foo()   # whatever calculation you want here
    if some_bad_condition:
        raise CostFunctionError("Some bad condition occurred!")
    return cost_fn_value
It is possible, and probably advisable, to use this string to show the user helpful information (further steps to take, or the value of the cost function, for example).
Alternatively, you may want the current point (and the corresponding cost function value) to be saved as part of the optimisation. For example, it may be the case that a certain threshold is “good enough” for the cost function and any value below that is acceptable. In that situation, you would want to raise CostFunctionError once the cost function goes below that threshold, but also save that point as the best value. To do so, pass the value of the cost function as the second parameter when raising CostFunctionError:
def cost_function():
    cost_fn_value = foo()   # whatever calculation you want here
    if cost_fn_value < threshold:
        raise CostFunctionError("The cost function is below the threshold.",
                                cost_fn_value)
    # Note that we still need the return statement, because it will be used
    # if cost_fn_value is greater than the threshold.
    return cost_fn_value
The value passed as the second argument can in general be any number. If you want to make sure that the final point (which raised the CostFunctionError) is always chosen to be the optimal point, then you can do this:
raise CostFunctionError("A message here", -np.inf)
Because -np.inf is smaller than any other possible number, it will always be picked as the “best” point.
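The difference between the one-argument and two-argument forms can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration: the CostFunctionError class is a stand-in for POISE's own, run_toy_optimisation is a hypothetical loop that tries a fixed list of points (POISE's optimisers choose points adaptively), and the toy cost functions take the point as an argument, whereas real POISE cost functions take no parameters.

```python
import math


class CostFunctionError(Exception):
    """Stand-in exception: args are (message,) or (message, cost_value)."""


def run_toy_optimisation(cost_function, points):
    """Evaluate points in order; return (best_point, best_value, message)."""
    best_point, best_value, message = None, math.inf, None
    for point in points:
        try:
            value = cost_function(point)
        except CostFunctionError as error:
            message = error.args[0] if error.args else ""
            # Two arguments: the final point competes for "best" as well.
            if len(error.args) >= 2 and error.args[1] < best_value:
                best_point, best_value = point, error.args[1]
            break  # either way, the optimisation stops here
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value, message


def stop_and_discard(x):
    if x > 2:
        raise CostFunctionError("x too large")
    return (x - 5) ** 2


def stop_and_keep(x):
    if x > 2:
        raise CostFunctionError("good enough", -math.inf)
    return (x - 5) ** 2


points = [0, 1, 2, 3, 4]
discard_result = run_toy_optimisation(stop_and_discard, points)
keep_result = run_toy_optimisation(stop_and_keep, points)
```

In this toy model, discard_result reports the best point seen before the error, while keep_result reports the point that raised the error, because its supplied value of minus infinity beats every earlier value.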
Examples¶
There are a number of cost functions which ship with POISE.
These can all be found inside the costfunctions.py file referred to above.
This file also contains a number of more specialised cost functions, which were used for the examples in the POISE paper.
(Instead of opening the file, you can also find the source code on GitHub.)
A number of these are thoroughly commented with detailed explanations; do consider checking these out if you want more guidance on how to write your own cost function.