Apply baseline correction to the estimator’s data.
The algorithm applied is described in [1]. This uses an implementation
provided by pybaselines.
Parameters:
min_length – From the pybaselines docs: Any region of consecutive baseline
points less than min_length is considered to be a false
positive and all points in the region are converted to peak points.
A higher min_length ensures less points are falsely assigned as
baseline points.
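The effect of min_length can be illustrated with a small sketch. This is not the pybaselines implementation; `enforce_min_length` and the boolean-mask representation are hypothetical names chosen only to show the rule described above.

```python
import numpy as np


def enforce_min_length(is_baseline, min_length):
    """Convert runs of baseline points shorter than min_length to peak points."""
    mask = np.asarray(is_baseline, dtype=bool).copy()
    # Pad with False so that every run has a well-defined start and end.
    padded = np.concatenate(([False], mask, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)  # exclusive end indices
    for start, end in zip(starts, ends):
        if end - start < min_length:
            mask[start:end] = False  # too short: treat as a false positive
    return mask


# True marks a point classified as baseline.
mask = np.array([1, 1, 0, 1, 1, 1, 1, 0, 1], dtype=bool)
cleaned = enforce_min_length(mask, min_length=3)
```

Only the run of four consecutive baseline points survives; the runs of length 2 and 1 are reassigned as peak points.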
The parameters of new oscillators to be added. Should be of shape
(n, 2 * (1 + self.dim)), where n is the number of new
oscillators to add. Even when one oscillator is being added this
should be a 2D array, i.e.
1D data:
params = np.array([[a, φ, f, η]])
2D data:
params = np.array([[a, φ, f₁, f₂, η₁, η₂]])
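As a quick check of the shape requirement, the sketch below validates candidate arrays before they are passed on. `check_new_oscillators` is a hypothetical helper written for illustration, not part of the library, and the numerical values are arbitrary.

```python
import numpy as np


def check_new_oscillators(params, dim):
    """Return params as a float array, insisting on shape (n, 2 * (1 + dim))."""
    params = np.asarray(params, dtype=float)
    expected_cols = 2 * (1 + dim)
    if params.ndim != 2 or params.shape[1] != expected_cols:
        raise ValueError(
            f"Expected a 2D array with {expected_cols} columns, "
            f"got shape {params.shape}."
        )
    return params


# One oscillator for 1D data: [a, φ, f, η]
one_1d = check_new_oscillators([[1.0, 0.0, 500.0, 5.0]], dim=1)
# Two oscillators for 2D data: [a, φ, f₁, f₂, η₁, η₂]
two_2d = check_new_oscillators(
    [[1.0, 0.0, 500.0, -200.0, 5.0, 5.0],
     [0.5, 0.1, 480.0, -150.0, 4.0, 6.0]],
    dim=2,
)
```

A flat array such as `[1.0, 0.0, 500.0, 5.0]` is rejected, which is the mistake the note about single oscillators warns against.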
rm_oscs – An iterable of ints for the indices of oscillators to remove from
the result.
merge_oscs – An iterable of iterables. Each sub-iterable denotes the indices of
oscillators to merge together. For example, [[0,2],[6,7]]
would mean that oscillators 0 and 2 are merged, and oscillators 6
and 7 are merged. A merge involves removing all the oscillators,
and creating a new oscillator with the sum of amplitudes, and the
average of phases, frequencies and damping factors.
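The merge rule can be sketched with NumPy as follows. This is an illustration of the description above rather than the library's code; `merge_oscillators` is a hypothetical name, and appending the merged oscillators after the untouched ones is an assumption about ordering.

```python
import numpy as np


def merge_oscillators(params, merge_oscs):
    """Merge groups of rows: amplitudes are summed, everything else is averaged."""
    params = np.asarray(params, dtype=float)
    merged_rows = []
    to_remove = set()
    for group in merge_oscs:
        group = list(group)
        rows = params[group]
        new = rows.mean(axis=0)    # average of phases, frequencies, damping factors
        new[0] = rows[:, 0].sum()  # amplitude (column 0) is summed instead
        merged_rows.append(new)
        to_remove.update(group)
    kept = np.delete(params, sorted(to_remove), axis=0)
    return np.vstack([kept, *merged_rows])


# Columns for 1D data: a, φ, f, η
params = np.array([
    [1.0, 0.0, 100.0, 5.0],
    [2.0, 0.2, 104.0, 7.0],
    [1.0, 0.0, 300.0, 5.0],
])
result = merge_oscillators(params, [[0, 1]])
```

Here oscillators 0 and 1 are replaced by a single oscillator with amplitude 3, phase 0.1, frequency 102 Hz and damping factor 6.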
split_oscs –
A dictionary with ints as keys, denoting the oscillators to split.
The values should themselves be dicts, with the following permitted
key/value pairs:
"separation" - A list of length equal to self.dim.
Indicates the frequency separation of the split oscillators in Hz.
If not specified, this will be the spectral resolution in each
dimension.
"number" - An int indicating how many oscillators to split
into. If not specified, this will be 2.
"amp_ratio" - A list of floats with length equal to the number of
oscillators to be split into (see "number"). Specifies the
relative amplitudes of the oscillators. If not specified, the amplitudes
will be equal.
As an example for a 1D estimator:
split_oscs = {
    2: {
        "separation": 1.0,  # if 1D, don't need a list
    },
    5: {
        "number": 3,
        "amp_ratio": [1.0, 2.0, 1.0],
    },
}
Here, 2 oscillators will be split.
Oscillator 2 will be split into 2 (default) oscillators with
equal amplitude (default). These will be separated by 1Hz.
Oscillator 5 will be split into 3 oscillators with relative
amplitudes 1:2:1. These will be separated by self.sw()[0]/self.default_pts()[0] Hz (default).
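To make the splitting options concrete, here is a minimal 1D sketch. It is not the library's implementation: `split_oscillator` is a hypothetical helper, and two details are assumptions, namely that "separation" is the spacing between adjacent child oscillators centred on the parent frequency, and that the child amplitudes are scaled so that they sum to the parent amplitude.

```python
import numpy as np


def split_oscillator(osc, separation, number=2, amp_ratio=None):
    """Split one 1D oscillator [a, φ, f, η] into `number` child oscillators."""
    a, phi, f, eta = osc
    ratio = np.ones(number) if amp_ratio is None else np.asarray(amp_ratio, dtype=float)
    if ratio.size != number:
        raise ValueError("amp_ratio must have one entry per child oscillator.")
    amps = a * ratio / ratio.sum()  # assumption: total amplitude is preserved
    # Offsets symmetric about zero, adjacent children `separation` Hz apart.
    offsets = (np.arange(number) - (number - 1) / 2) * separation
    return np.column_stack([
        amps,
        np.full(number, phi),
        f + offsets,
        np.full(number, eta),
    ])


children = split_oscillator(
    [4.0, 0.0, 100.0, 5.0], separation=2.0, number=3, amp_ratio=[1.0, 2.0, 1.0]
)
```

With these inputs the children have amplitudes 1, 2 and 1 at 98, 100 and 102 Hz, sharing the parent's phase and damping factor.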
estimate_kwargs – Keyword arguments to provide to the call to estimate(). Note
that "initial_guess" and "region_unit" are set internally and
will be ignored if given.
(Optional, but highly advised) Generate a frequency-filtered “sub-FID”
corresponding to a specified region of interest.
(Optional) Generate an initial guess using the Minimum Description
Length (MDL) [2] and Matrix Pencil Method (MPM) [3][4][5][6].
Apply numerical optimisation to determine a final estimate of the signal
parameters. The optimisation routine employed is the Trust Newton Conjugate
Gradient (NCG) algorithm ([7], Algorithm 7.2).
Parameters:
region – The frequency range of interest. Should be of the form [left,right]
where left and right are the left and right bounds of the region
of interest in Hz or ppm (see region_unit). If None, the
full signal will be considered, though for sufficiently large and
complex signals the estimation is likely to be slow and to perform
poorly.
noise_region – If region is not None, this must be of the form [left,right]
too. This should specify a frequency range where no noticeable signals
reside, i.e. only noise exists.
region_unit – One of "hz" or "ppm". Specifies the units that region
and noise_region have been given as.
initial_guess –
If None, an initial guess will be generated using the MPM
with the MDL being used to estimate the number of oscillators
present.
If an int, the MPM will be used to compute the initial guess with
the value given being the number of oscillators.
If a NumPy array, this array will be used as the initial guess.
hessian –
Specifies how to construct the Hessian matrix.
If "exact", the exact Hessian will be used.
If "gauss-newton", the Hessian will be approximated as is
done with the Gauss-Newton method. See the “Derivation from
Newton’s method” section of this article.
mode – A string containing a subset of the characters "a" (amplitudes),
"p" (phases), "f" (frequencies), and "d" (damping factors).
Specifies which types of parameters should be considered for optimisation.
In most scenarios, you are likely to want the default value, "apfd".
amp_thold –
A value that imposes a threshold for deleting oscillators of
negligible amplitude.
If None, does nothing.
If a float, oscillators with amplitudes satisfying
$a_m < a_{\mathrm{thold}} \lVert \boldsymbol{a} \rVert_2$ will be
removed from the parameter array, where $\lVert \boldsymbol{a} \rVert_2$
is the Euclidean norm of the vector of
all the oscillator amplitudes. It is advised to set amp_thold
at least a couple of orders of magnitude below 1.
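A short sketch of amplitude thresholding follows. It assumes the criterion is "amplitude smaller than amp_thold times the Euclidean norm of all amplitudes", which is consistent with the advice to pick a value well below 1 but is stated here as an assumption; `remove_negligible` is a hypothetical helper.

```python
import numpy as np


def remove_negligible(params, amp_thold):
    """Drop rows whose amplitude (column 0) is below amp_thold * ||a||."""
    params = np.asarray(params, dtype=float)
    if amp_thold is None:
        return params  # thresholding disabled
    amps = params[:, 0]
    keep = amps >= amp_thold * np.linalg.norm(amps)
    return params[keep]


# Columns for 1D data: a, φ, f, η
params = np.array([
    [10.0, 0.0, 100.0, 5.0],
    [5.0, 0.0, 200.0, 5.0],
    [0.01, 0.0, 300.0, 5.0],
])
filtered = remove_negligible(params, amp_thold=1e-2)
```

The norm of the amplitude vector is roughly 11.18, so the cut-off is about 0.11 and only the third oscillator is removed.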
phase_variance – Whether or not to include the variance of oscillator phases in the cost
function. This should be set to True in cases where the signal being
considered is derived from well-phased data.
mpm_trim – Specifies the maximal size allowed for the filtered signal when
undergoing the Matrix Pencil. If None, no trimming is applied
to the signal. If an int, and the filtered signal has a size
greater than mpm_trim, this signal will be set as
signal[:mpm_trim].
nlp_trim – Specifies the maximal size allowed for the filtered signal when undergoing
nonlinear programming. By default (None), no trimming is applied to
the signal. If an int, and the filtered signal has a size greater than
nlp_trim, this signal will be set as signal[:nlp_trim].
max_iterations – A value specifying the number of iterations the routine may run
through before it is terminated. If None, a default number
of maximum iterations is set, based on the data dimension and
the value of hessian.
negative_amps –
Indicates how to treat oscillators which have gained negative
amplitudes during the optimisation.
"remove" will result in such oscillators being purged from
the parameter estimate. The optimisation routine will then be
re-run recursively until no oscillators have a negative
amplitude.
"flip_phase" will retain oscillators with negative
amplitudes, but the amplitudes will be multiplied by -1,
and a π radians phase shift will be applied.
"ignore" will do nothing (negative amplitude oscillators will remain).
output_mode –
Dictates what information is sent to stdout.
If None, nothing will be sent.
If 0, only a message on the outcome of the optimisation will
be sent.
If a positive int k, information on the cost function,
gradient norm, and trust region radius is sent every kth
iteration.
save_trajectory –
If True, a list of parameters at each iteration will be saved, and
accessible via the trajectory attribute.
Warning
Not implemented yet!
epsilon – Sets the convergence criterion. Convergence will occur when
the norm of the gradient falls below epsilon,
$\lVert \boldsymbol{g}_k \rVert_2 < \epsilon$.
eta –
Criterion for accepting an update. An update will be accepted if
the ratio of the actual reduction to the predicted reduction is
greater than eta.
initial_trust_radius – The initial value of the radius of the trust region.
max_trust_radius – The largest permitted radius for the trust region.
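How eta, the trust radius and max_trust_radius interact can be sketched with the usual textbook trust-region bookkeeping. The constants 0.25, 0.75 and the doubling/quartering factors are the common textbook choices and are assumptions here, as is the default value of `eta`; none of them are confirmed to match the library.

```python
import math


def trust_region_update(actual_reduction, predicted_reduction, step_norm,
                        radius, eta=0.15, max_trust_radius=4.0):
    """Return (accept, new_radius) for one trust-region iteration."""
    rho = actual_reduction / predicted_reduction
    if rho < 0.25:
        # The model predicted the cost poorly: shrink the region.
        radius = 0.25 * radius
    elif rho > 0.75 and math.isclose(step_norm, radius):
        # Good agreement and the step hit the boundary: expand, up to the cap.
        radius = min(2.0 * radius, max_trust_radius)
    accept = rho > eta  # the update is taken only if rho exceeds eta
    return accept, radius


good = trust_region_update(1.0, 1.0, step_norm=1.0, radius=1.0)
poor = trust_region_update(0.01, 1.0, step_norm=0.5, radius=1.0)
capped = trust_region_update(1.0, 1.0, step_norm=1.0, radius=1.0,
                             max_trust_radius=1.5)
```

A step whose actual reduction matches the prediction is accepted and the radius grows, while a step that achieves only 1% of the predicted reduction is rejected and the radius shrinks.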
check_neg_amps_every – For every iteration that is a multiple of this, negative amplitudes
will be checked for and dealt with if found.
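The "flip_phase" treatment of negative amplitudes leaves the modelled signal unchanged, since -a·exp(iφ) equals a·exp(i(φ + π)). The sketch below demonstrates this; `flip_negative_amps` is a hypothetical helper, and wrapping the shifted phase back into [-π, π) is an assumption made for tidiness.

```python
import numpy as np


def flip_negative_amps(params):
    """Make negative amplitudes positive and shift their phases by π."""
    params = np.array(params, dtype=float)  # work on a copy
    neg = params[:, 0] < 0
    params[neg, 0] *= -1
    # φ + π, wrapped into the interval [-π, π).
    params[neg, 1] = (params[neg, 1] + 2 * np.pi) % (2 * np.pi) - np.pi
    return params


# Columns for 1D data: a, φ, f, η
original = np.array([
    [-2.0, 0.5, 100.0, 5.0],
    [1.0, 0.0, 200.0, 5.0],
])
flipped = flip_negative_amps(original)

# The complex amplitude a * exp(iφ) of each oscillator is unchanged.
before = original[:, 0] * np.exp(1j * original[:, 1])
after = flipped[:, 0] * np.exp(1j * flipped[:, 1])
```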
“The pickle module is not secure. Only unpickle data you trust.
It is possible to construct malicious pickle data which will
execute arbitrary code during unpickling. Never unpickle data
that could have come from an untrusted source, or that could have
been tampered with.”
You should only use from_pickle on files that you are 100%
certain were generated using to_pickle(). If you load
pickled data from a .pkl file, and the resulting output is not an
estimator object, an error will be raised.
Parameters:
path – The path to the pickle file. Do not include the .pkl suffix.
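The "load, then check the type" behaviour described above can be sketched with the standard library alone. `load_checked` is a hypothetical helper, and a plain dict stands in for an estimator object. Note that the type check only runs after unpickling, so it does not make loading an untrusted file safe.

```python
import pickle
import tempfile
from pathlib import Path


def load_checked(path, expected_type):
    """Unpickle `<path>.pkl` and raise if the result has the wrong type."""
    with open(f"{path}.pkl", "rb") as fh:
        obj = pickle.load(fh)  # arbitrary code may run here for malicious files
    if not isinstance(obj, expected_type):
        raise TypeError(
            f"Expected {expected_type.__name__}, got {type(obj).__name__}."
        )
    return obj


with tempfile.TemporaryDirectory() as tmp:
    stem = Path(tmp) / "result"  # the path is given without the ".pkl" suffix
    with open(f"{stem}.pkl", "wb") as fh:
        pickle.dump({"a": 1}, fh)
    loaded = load_checked(stem, dict)  # passes the type check
    try:
        load_checked(stem, list)  # wrong type: an error is raised
        rejected = False
    except TypeError:
        rejected = True
```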
Construct chemical shifts which reflect the experiment parameters.
Parameters:
pts – The number of points to construct the shifts with in each dimension.
If None, and self.default_pts is a tuple of ints, it will be
used.
unit – Must be one of "hz" or "ppm".
flip – If True, the shifts will be returned in descending order, as is
conventional in NMR. If False, the shifts will be in ascending order.
meshgrid – If shifts are being derived for an N-dimensional signal (N > 1),
setting this argument to True will return N-dimensional arrays
corresponding to all combinations of points in each dimension. If
False, an iterable of 1D arrays will be returned.
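The options above can be illustrated with a stand-alone sketch. It is not the library's implementation: `make_shifts` is a hypothetical function, and it assumes that the shifts span offset ± sw/2 with both end points included and that ppm values are obtained by dividing Hz by the transmitter frequency in MHz (called `sfo` here).

```python
import numpy as np


def make_shifts(pts, sw, offset, sfo=None, unit="hz", flip=True, meshgrid=False):
    """Build one chemical-shift axis per dimension."""
    axes = []
    for i, (n, width, off) in enumerate(zip(pts, sw, offset)):
        axis = np.linspace(off - width / 2, off + width / 2, n)
        if unit == "ppm":
            axis = axis / sfo[i]  # Hz / MHz gives ppm
        elif unit != "hz":
            raise ValueError('unit must be "hz" or "ppm".')
        axes.append(axis[::-1] if flip else axis)
    if meshgrid and len(axes) > 1:
        # N-dimensional arrays covering every combination of points.
        return tuple(np.meshgrid(*axes, indexing="ij"))
    return tuple(axes)


hz = make_shifts((5,), (1000.0,), (500.0,))
ppm = make_shifts((5,), (1000.0,), (500.0,), sfo=(500.0,), unit="ppm", flip=False)
grid = make_shifts((4, 3), (1000.0, 200.0), (0.0, 0.0), meshgrid=True)
```

With flip=True the 1D axis runs from 1000 Hz down to 0 Hz, and the 2D meshgrid call returns two arrays of shape (4, 3).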
Construct time-points which reflect the experiment parameters.
Parameters:
pts – The number of points to construct the time-points with in each dimension.
If None, and self.default_pts is a tuple of ints, it will be
used.
start_time –
The start time in each dimension. If set to None, the initial
point in each dimension will be 0.0. To set non-zero start times,
a list of floats or strings can be used.
If floats are used, they specify the first value in each
dimension in seconds.
Strings of the form f'{N}dt', where N is an integer, may be
used, which indicates a certain multiple of the dwell time.
meshgrid – If time-points are being derived for an N-dimensional signal (N > 1),
setting this argument to True will return N-dimensional arrays
corresponding to all combinations of points in each dimension. If
False, an iterable of 1D arrays will be returned.
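A sketch of the start_time handling, including the f'{N}dt' string form, is given below. `make_timepoints` is a hypothetical function, and taking the dwell time in each dimension to be 1 / sw is an assumption.

```python
import numpy as np


def make_timepoints(pts, sw, start_time=None, meshgrid=False):
    """Build one time axis per dimension."""
    if start_time is None:
        start_time = [0.0] * len(pts)
    axes = []
    for n, width, start in zip(pts, sw, start_time):
        dt = 1.0 / width  # assumed dwell time in seconds
        if isinstance(start, str):
            if not start.endswith("dt"):
                raise ValueError("String start times must look like '3dt'.")
            start = int(start[:-2]) * dt  # integer multiple of the dwell time
        axes.append(start + dt * np.arange(n))
    if meshgrid and len(axes) > 1:
        return tuple(np.meshgrid(*axes, indexing="ij"))
    return tuple(axes)


default = make_timepoints((4,), (100.0,))
delayed = make_timepoints((4,), (100.0,), start_time=["2dt"])
mixed = make_timepoints((4, 3), (100.0, 50.0), start_time=[0.0, "-1dt"], meshgrid=True)
```

Here `delayed` starts at two dwell times (0.02 s), and the second axis of `mixed` starts one dwell time before zero.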
pts – The number of points to construct the time-points with in each dimension.
If None, and self.default_pts is a tuple of ints, it will be
used.
snr – The signal-to-noise ratio. If None then no noise will be added
to the FID.
decibels – If True, the snr is taken to be in units of decibels. If False,
it is taken to be simply the ratio of the signal power to the
noise power.
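The relationship between the two SNR conventions can be sketched as follows. `add_noise` is a hypothetical helper, and defining "power" as the mean squared magnitude of the signal is an assumption; the decibel conversion for a power ratio is 10·log10(ratio).

```python
import numpy as np


def add_noise(fid, snr, decibels=True, rng=None):
    """Add complex Gaussian noise to reach a target signal-to-noise ratio."""
    fid = np.asarray(fid, dtype=complex)
    if snr is None:
        return fid.copy()  # no noise requested
    rng = np.random.default_rng() if rng is None else rng
    ratio = 10 ** (snr / 10) if decibels else snr  # power ratio
    noise_power = np.mean(np.abs(fid) ** 2) / ratio
    sigma = np.sqrt(noise_power / 2)  # split equally between real and imaginary
    noise = rng.normal(0.0, sigma, fid.shape) + 1j * rng.normal(0.0, sigma, fid.shape)
    return fid + noise


t = np.arange(20000) / 1000.0
fid = np.exp((2j * np.pi * 50.0 - 0.1) * t)
noisy = add_noise(fid, snr=20.0, rng=np.random.default_rng(0))

# Recover the SNR in decibels from the signal and the added noise.
measured_db = 10 * np.log10(
    np.mean(np.abs(fid) ** 2) / np.mean(np.abs(noisy - fid) ** 2)
)
```

Passing snr=100.0 with decibels=False would request the same noise level as snr=20.0 with decibels=True.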
indirect_modulation –
Acquisition mode in the indirect dimension if the data is 2D.
If the data is 1D, this argument is ignored.
None - hypercomplex dataset:
"amp" - amplitude modulated pair:
"phase" - phase-modulated pair:
None will lead to an array of shape (n1,n2). "amp" and "phase"
will lead to an array of shape (2,n1,n2), with fid[0] and
fid[1] being the two components of the pair.
Manually phase the data using a Graphical User Interface.
Parameters:
max_p1 – The largest permitted first order correction (rad). Set this to a larger
value than the default (10π) if you anticipate having to apply a
very large first order correction.
convdta – If True and the data is derived from an fid file, removal of
the FID’s digital filter will be carried out.
Notes
There are certain file paths expected to be found relative to the directory
which contains the data and parameter files. Here is an extensive list of
the paths expected to exist, for different data types:
nucleus – The identity of the nucleus. Should be of the form "<mass><sym>"
where <mass> is the atomic mass and <sym> is the element symbol.
Examples: "1H", "13C", "195Pt"
snr – The signal-to-noise ratio (dB). If None then no noise will be added
to the FID.
shifts – A list or tuple of chemical shift values for each spin.
couplings – The scalar couplings present in the spin system. Given shifts is of
length n, couplings should be an iterable with entries of the form
(i1,i2,coupling), where 1<=i1,i2<=n are the indices of
the two spins involved in the coupling, and coupling is the value
of the scalar coupling in Hz. None will set all spins to be
uncoupled.
pts – The number of points the signal comprises.
sw – The sweep width of the signal (Hz).
offset – The transmitter offset (Hz).
sfo – The magnetic field strength (T).
nucleus –
The identity of the nucleus. Should be of the form "<mass><sym>"
where <mass> is the atomic mass and <sym> is the element symbol.
Examples:
"1H"
"13C"
"195Pt"
snr – The signal-to-noise ratio of the resulting signal, in decibels. None
produces a noiseless signal.
lb – Line broadening (exponential damping) to apply to the signal.
The first point will be unaffected by damping, and the final point will
be multiplied by np.exp(-lb). The default results in the final
point being decreased in value by a factor of roughly 1000.
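The end-point behaviour of lb can be checked with a short sketch. It assumes the damping is an exponential envelope running from 1 at the first point to np.exp(-lb) at the last; `apply_line_broadening` is a hypothetical helper. The exact default value is not stated above, so np.log(1000) is used here only because it gives a reduction of exactly 1000.

```python
import numpy as np


def apply_line_broadening(fid, lb):
    """Multiply the signal by an envelope decaying from 1 to exp(-lb)."""
    fid = np.asarray(fid)
    envelope = np.exp(-lb * np.linspace(0.0, 1.0, fid.shape[-1]))
    return fid * envelope


fid = np.ones(1024, dtype=complex)
damped = apply_line_broadening(fid, lb=np.log(1000.0))
reduction = abs(damped[0]) / abs(damped[-1])
```

The first point is untouched and the final point is smaller by a factor of 1000, so a default lb close to 6.9 would match the description.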
high_resolution_pts – Indicates the number of points used to generate the oscillators and model.
Should be greater than or equal to self.default_pts[0]. If None,
self.default_pts[0] will be used.
axes_left – The position of the left edge of the axes, in figure coordinates. Should be between 0. and 1..
axes_right – The position of the right edge of the axes, in figure coordinates. Should
be between 0. and 1..
axes_top – The position of the top edge of the axes, in figure coordinates. Should
be between 0. and 1..
axes_bottom – The position of the bottom edge of the axes, in figure coordinates. Should
be between 0. and 1..
axes_region_separation – The extent by which adjacent regions are separated in the figure,
in figure coordinates.
xaxis_unit – The unit to express chemical shifts in. Should be "hz" or "ppm".
xaxis_label_height – The vertical location of the x-axis label, in figure coordinates. Should
be between 0. and 1., though you are likely to want this to be
only slightly larger than 0..
xaxis_ticks – Specifies custom x-axis ticks for each region, overwriting the default
ticks. Should be of the form: [(i,(a,b,...)),(j,(c,d,...)),...]
where i and j are ints indicating the region under consideration,
and a-d are floats indicating the tick values.
oscillator_colors – Describes how to color individual oscillators. See color cycle
for details.
plot_model –
Todo
Add description
plot_residual –
Todo
Add description
model_shift – The vertical displacement of the model relative to the data.
residual_shift – The vertical displacement of the residual relative to the data.
label_peaks – If True, label peaks according to their index. The parameters of a peak
denoted with the label i in the figure can be accessed with
self.get_results(indices)[i].
denote_regions – If True, and there are regions which share a boundary, a
vertical line will be plotted to show the boundary.
spectrum_line_kwargs – Keyword arguments for the spectrum line. All keys should be valid
arguments for matplotlib.axes.Axes.plot.
oscillator_line_kwargs –
Keyword arguments for the oscillator lines. All keys should be valid
arguments for matplotlib.axes.Axes.plot.
If "color" is included, it is ignored (colors are processed
based on the oscillator_colors argument).
residual_line_kwargs –
Keyword arguments for the residual line (if included). All keys
should be valid arguments for matplotlib.axes.Axes.plot.
model_line_kwargs –
Keyword arguments for the model line (if included). All keys should
be valid arguments for matplotlib.axes.Axes.plot.
label_kwargs – Keyword arguments for oscillator labels. All keys should be valid
arguments for matplotlib.text.Text.
If "color" is included, it is ignored (colors are processed
based on the oscillator_colors argument).
"horizontalalignment", "ha", "verticalalignment", and
"va" are also ignored, as these are determined internally.
force_overwrite – If path already exists and force_overwrite is set to False,
the user will be asked to confirm whether they are happy to
overwrite the file. If True, the file will be overwritten
without prompt.
fprint – Specifies whether or not to print information to the terminal.
Perform estimation on the entire signal via estimation of
frequency-filtered sub-bands.
This method splits the signal up into nsubbands equally-sized regions
and extracts parameters from each region before finally concatenating all
the results together.
Warning
This method is a work-in-progress. It is unlikely to produce decent
results at the moment! I aim to improve the way that regions are
created in the future.
Parameters:
noise_region – Specifies a frequency range where no noticeable signals reside, i.e. only
noise exists.
noise_region_unit – One of "hz" or "ppm". Specifies the units that noise_region
has been given in.
nsubbands – The number of sub-bands to break the signal into. If None, the number
will be set as the nearest integer to the data size divided by 500.
estimate_kwargs – Keyword arguments to give to estimate(). Note that region
and initial_guess will be ignored.
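The default sub-band count and the equal-sized split can be sketched as below. `subband_bounds` is a hypothetical helper that works in array indices rather than Hz or ppm, and forcing at least one sub-band for small signals is an assumption not stated above.

```python
import numpy as np


def subband_bounds(size, nsubbands=None):
    """Return (start, stop) index pairs for equally-sized sub-bands."""
    if nsubbands is None:
        # Nearest integer to size / 500; at least one band is assumed.
        nsubbands = max(1, round(size / 500))
    edges = np.linspace(0, size, nsubbands + 1).astype(int)
    return [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:])]


default_bands = subband_bounds(2048)
three_bands = subband_bounds(3000, nsubbands=3)
```

A 2048-point signal gives round(4.096) = 4 sub-bands of 512 points each. Each sub-band would then be estimated separately and the results concatenated, as described above.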
Save the estimator to a byte stream using Python’s pickling protocol.
Parameters:
path – Path of file to save the byte stream to. Do not include the
".pkl" suffix. If None, ./estimator_<x>.pkl will be
used, where <x> is the first number that doesn’t cause a clash
with an already existent file.
force_overwrite –
Defines behaviour if the specified path already exists:
If False, the user will be prompted if they are happy
overwriting the current file.
If True, the current file will be overwritten without prompt.
fprint – Specifies whether or not to print information to the terminal.
fmt – Must be one of "txt" or "pdf". If you wish to generate a PDF, you
must have a LaTeX installation. See LaTeX (Optional).
description – Descriptive text to add to the top of the file.
sig_figs – The number of significant figures to give to parameters. If
None, the full value will be used. By default this is set to 5.
sci_lims – Given a value (-x,y) with ints x and y, any parameter p
with a value which satisfies p<10**-x or p>=10**y will be
expressed in scientific notation. If None, scientific notation
will never be used.
integral_mode –
One of "relative" or "absolute".
If "relative", the smallest integral will be set to 1,
and all other integrals will be scaled accordingly.
If "absolute", the absolute integral will be computed. This
should be used if you wish to directly compare different datasets.
force_overwrite –
Defines behaviour if the specified path already exists:
If False, the user will be prompted if they are happy
overwriting the current file.
If True, the current file will be overwritten without prompt.
fprint – Specifies whether or not to print information to the terminal.
pdflatex_exe –
The path to the system’s pdflatex executable.
Note
You are unlikely to need to set this manually. It is primarily
present to specify the path to pdflatex.exe on Windows when
the NMR-EsPy GUI has been loaded from TopSpin.
Write a signal generated with estimated parameters to Bruker format.
<path>/<expno>/ will contain the time-domain data and information
(fid, acqus, …)
<path>/<expno>/pdata/1/ will contain the processed data and
information (pdata, procs, …)
Note
There is a known problem that the spectral data has timepoints along
the x-axis rather than chemical shifts. I will try to figure out why
and fix this in due course!
Parameters:
path – The path to the root directory to store the data in.
pts – The number of points to construct the signal from.
expno – The experiment number. If None, the smallest int x for which the
directory <path>/<x>/ doesn’t exist will be used.
force_overwrite – If False and the directory <path>/<expno>/ already exists,
the user will be prompted to confirm whether they are happy to
overwrite it. If True, said directory will be overwritten.
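The default choice of expno amounts to a search for the first unused directory name. A minimal sketch follows; `next_free_expno` is a hypothetical helper, and starting the search at 1 is an assumption.

```python
import tempfile
from pathlib import Path


def next_free_expno(path, start=1):
    """Return the smallest int x >= start for which <path>/<x>/ does not exist."""
    root = Path(path)
    expno = start
    while (root / str(expno)).exists():
        expno += 1
    return expno


with tempfile.TemporaryDirectory() as tmp:
    # Pretend experiments 1, 2 and 4 have already been written.
    for used in (1, 2, 4):
        (Path(tmp) / str(used)).mkdir()
    chosen = next_free_expno(tmp)
```

With directories 1, 2 and 4 present, the gap at 3 is chosen rather than 5.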