unidec.IsoDec package¶
Subpackages¶
- unidec.IsoDec.IDGUI package
- unidec.IsoDec.IsoGen package
- Submodules
- unidec.IsoDec.IsoGen.isogen_base module
HighMassNeuralNetwork
IsoGenDatasetBase
IsoGenEngineBase
IsoGenModelBase
IsoGenModelBase.batch_predict()
IsoGenModelBase.evaluate_model()
IsoGenModelBase.get_model()
IsoGenModelBase.load_model()
IsoGenModelBase.predict()
IsoGenModelBase.run_training()
IsoGenModelBase.save_model()
IsoGenModelBase.setup_model()
IsoGenModelBase.setup_training()
IsoGenModelBase.train_model()
IsoGenNeuralNetwork
save_model_to_binary()
- unidec.IsoDec.IsoGen.isogen_tools module
- unidec.IsoDec.IsoGen.isogenatom module
- unidec.IsoDec.IsoGen.isogenatom_synthetic_training module
- unidec.IsoDec.IsoGen.isogenatom_trainingdata module
- unidec.IsoDec.IsoGen.isogenmass module
- unidec.IsoDec.IsoGen.isogenpep module
- unidec.IsoDec.IsoGen.isogenpep_synthetic_training module
- unidec.IsoDec.IsoGen.isogenpep_trainingdata module
- unidec.IsoDec.IsoGen.isogenrna module
- unidec.IsoDec.IsoGen.isogenrna_trainingdata module
- Module contents
Submodules¶
unidec.IsoDec.IsoDecGUI module¶
- class unidec.IsoDec.IsoDecGUI.IsoDecPres¶
Bases:
UniDecPres
- batch_process(path)¶
Batch process a file :param path: File name :return: None
- export_results()¶
- fix_parameters()¶
- init_config()¶
- makeplot1(e=None, intthresh=False, imfit=True)¶
Plot data and fit in self.view.plot1 and optionally in plot1fit :param e: unused event :return: None
- makeplot2(e=None)¶
Plot mass spectrum. :param e: unused event :return: None
- on_batch(e=None)¶
- on_charge_states(e=None, mass=None, plot=None, peakpanel=None, data=None)¶
Triggered by right click “plot charge states” on self.view.peakpanel. Plots a line with text listing the charge states of a specific peak. :param e: unused event :return: None
- on_dataprep_button(e=None)¶
Run data preparation :param e: unused event :return: None
- on_delete(evt=None)¶
- on_full(evt=None)¶
- on_init_config(e=None)¶
- on_label_integral(e=None, peakpanel=None, pks=None, plot=None, dataobj=None)¶
Triggered by right click “Label Masses” on self.view.peakpanel. Plots a line with text listing the mass of each specific peak. Updates the peakpanel to show the masses. :param e: unused event :return: None
- on_open(e=None)¶
Open dialog for file opening :param e: unused space for event :return: None
- on_open_file(filename, directory, skipengine=False, refresh=False, **kwargs)¶
Opens a file. Run self.eng.open_file. :param filename: File name :param directory: Directory containing file :param skipengine: Boolean, Whether to skip running the engine (used when loading state) :return: None
- on_paste_spectrum(e=None)¶
Gets spectrum from the clipboard, writes it to a new file, and then opens that new file. :param e: unused space for event :return: None
- on_plot_dists(evt=None)¶
- on_plot_peaks(evt=None)¶
- on_raw_open(evt=None, dirname=None)¶
- on_replot(evt=None)¶
- on_unidec_button(e=None)¶
Run IsoDec :param e: unused event :return: None
- plot_mass_peaks(evt=None)¶
- plot_mz_peaks(evt=None)¶
- translate_config()¶
- translate_id_config()¶
- translate_pks()¶
unidec.IsoDec.altdecon module¶
- unidec.IsoDec.altdecon.gen_thrash_arrays(centroids, startpad=10)¶
- unidec.IsoDec.altdecon.thrash_predict(centroids)¶
unidec.IsoDec.c_interface module¶
- class unidec.IsoDec.c_interface.IDConfig¶
Bases:
Structure
- dlen¶
Structure/Union member
- elen¶
Structure/Union member
- l1¶
Structure/Union member
- l2¶
Structure/Union member
- l3¶
Structure/Union member
- l4¶
Structure/Union member
- maxz¶
Structure/Union member
- pres¶
Structure/Union member
- verbose¶
Structure/Union member
- class unidec.IsoDec.c_interface.IDSettings¶
Bases:
Structure
- adductmass¶
Structure/Union member
- css_thresh¶
Structure/Union member
- datathreshold¶
Structure/Union member
- isolength¶
Structure/Union member
- isotopethreshold¶
Structure/Union member
- knockdown_rounds¶
Structure/Union member
- mass_diff_c¶
Structure/Union member
- matchtol¶
Structure/Union member
- maxshift¶
Structure/Union member
- min_score_diff¶
Structure/Union member
- minareacovered¶
Structure/Union member
- minpeaks¶
Structure/Union member
- minusoneaszero¶
Structure/Union member
- mzwindow¶
Structure/Union member
- peakthresh¶
Structure/Union member
- peakwindow¶
Structure/Union member
- phaseres¶
Structure/Union member
- plusoneintwindow¶
Structure/Union member
- verbose¶
Structure/Union member
- zscore_threshold¶
Structure/Union member
- class unidec.IsoDec.c_interface.IsoDecWrapper(dllpath=None)¶
Bases:
object
- encode(centroids, maxz=50, phaseres=8, config=None)¶
- predict_charge(centroids)¶
- process_spectrum(centroids, pks=None, config=None)¶
- class unidec.IsoDec.c_interface.MPStruct¶
Bases:
Structure
- area¶
Structure/Union member
- avgmass¶
Structure/Union member
- endindex¶
Structure/Union member
- isodist¶
Structure/Union member
- isomass¶
Structure/Union member
- isomz¶
Structure/Union member
- matchedindsexp¶
Structure/Union member
- matchedindsiso¶
Structure/Union member
- monoiso¶
Structure/Union member
- monoisos¶
Structure/Union member
- mz¶
Structure/Union member
- peakint¶
Structure/Union member
- peakmass¶
Structure/Union member
- realisolength¶
Structure/Union member
- score¶
Structure/Union member
- startindex¶
Structure/Union member
- z¶
Structure/Union member
- unidec.IsoDec.c_interface.config_to_settings(config)¶
- unidec.IsoDec.c_interface.matchedinds¶
alias of
c_long_Array_32
unidec.IsoDec.datatools module¶
- unidec.IsoDec.datatools.check_spacings(spectrum: ndarray)¶
- unidec.IsoDec.datatools.datacompsub(datatop, buff)¶
Complex background subtraction.
Taken from Massign Paper
First creates an array that matches the data but has the minimum value within a window of +/- buff. Then, smooths the minimum array with a Gaussian filter of width buff * 2 to form the background array. Finally, subtracts the background array from the data intensities.
- Parameters:
datatop – Data array
buff – Width parameter
- Returns:
Subtracted data
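The three steps described for datacompsub can be sketched in plain Python. This is only an illustration of the technique, not the library's implementation: the function names are hypothetical, and reading "width buff * 2" as the Gaussian sigma is an assumption.

```python
import math

def sliding_minimum(values, buff):
    """Minimum of the values within +/- buff points of each index."""
    n = len(values)
    return [min(values[max(0, i - buff):min(n, i + buff + 1)]) for i in range(n)]

def gaussian_smooth(values, sigma):
    """Smooth with a truncated, normalized Gaussian kernel.

    Edges are handled by clamping to the nearest valid index.
    """
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(kernel)
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)
            acc += w * values[j]
        smoothed.append(acc / total)
    return smoothed

def background_subtract(intensities, buff):
    """Local minimum -> Gaussian smoothing -> subtraction."""
    minima = sliding_minimum(intensities, buff)
    # Assumption: the Gaussian "width" is used as sigma.
    background = gaussian_smooth(minima, sigma=buff * 2)
    return [y - b for y, b in zip(intensities, background)]

# A flat baseline of 10 with a single peak of height 100 on top of it.
signal = [10.0] * 50
signal[25] = 110.0
corrected = background_subtract(signal, buff=3)
```

In this toy input the baseline is removed and the peak keeps its height above the baseline.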
- unidec.IsoDec.datatools.fastcalc_FWHM(peak, data)¶
- unidec.IsoDec.datatools.fastnearest(array, target)¶
In a sorted array, quickly find the position of the element closest to the target. :param array: Array :param target: Value :return: np.argmin(np.abs(array - target))
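Because the array is sorted, the nearest element can be found with a binary search rather than a full scan. A minimal sketch using the standard-library bisect module follows; `nearest_index` is a hypothetical name, and the library's own routine may be implemented differently.

```python
from bisect import bisect_left

def nearest_index(sorted_values, target):
    """Index of the element closest to target in an ascending sorted list."""
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    pos = bisect_left(sorted_values, target)
    if pos == 0:
        return 0
    if pos == len(sorted_values):
        return len(sorted_values) - 1
    before, after = sorted_values[pos - 1], sorted_values[pos]
    # Ties go to the lower index, like np.argmin's first-minimum rule.
    return pos - 1 if target - before <= after - target else pos

mz_axis = [100.0, 100.5, 101.0, 102.0]
closest = nearest_index(mz_axis, 100.6)
```

The search costs O(log n) per lookup, compared with O(n) for computing `np.argmin(np.abs(array - target))` directly.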
- unidec.IsoDec.datatools.fastpeakdetect(data, window: int = 10, threshold=0.0, ppm=None, norm=True)¶
- unidec.IsoDec.datatools.fastwithin_abstol(array, target, tol)¶
- unidec.IsoDec.datatools.fastwithin_abstol_withnearest(array, target, tol)¶
- unidec.IsoDec.datatools.get_all_centroids(data, window=5, threshold=0.0001, background=100, moving_average_smoothing=3)¶
- unidec.IsoDec.datatools.get_centroid(data, peakmz, fwhm=1)¶
Get the centroid of the peak. :param data: 2D numpy array of data :param peakmz: float, m/z value of the peak :return: float, centroid of the peak
- unidec.IsoDec.datatools.get_centroids(data, peakmz, mzwindow=None)¶
- unidec.IsoDec.datatools.get_fwhm_peak(data, peakmz)¶
Get the full width half max of the peak. :param data: 2D numpy array of data :param peakmz: float, m/z value of the peak :return: float, full width half max of the peak
- unidec.IsoDec.datatools.get_noise(data, n=20)¶
Get the noise level of the data. :param data: 2D numpy array of data :return: float, noise level
- unidec.IsoDec.datatools.isotope_finder(data, mzwindow=1.5)¶
- unidec.IsoDec.datatools.remove_noise_cdata(data, localmin=100, factor=1.5, mode='median')¶
Remove noise from the data. :param data: 2D numpy array of data :param localmin: int, number of data points local width to take for min calcs :return: data with noise removed
- unidec.IsoDec.datatools.simp_charge(centroids, silent=False)¶
Simple charge prediction based on the spacing between peaks. Picks the largest peak and looks at the spacing between the two nearest peaks. Takes 1/avg spacing as the charge. :param centroids: Centroid data, 2D numpy array as [m/z, intensity] :param silent: Whether to print the charge state, default False :return: Charge state as int
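The spacing rule can be illustrated with a short sketch. It assumes centroids sorted by m/z and interprets "the two nearest peaks" as the peaks adjacent to the most intense one; `spacing_charge` is a hypothetical name, not the library function.

```python
def spacing_charge(centroids):
    """Estimate charge as 1 / (average spacing) around the most intense peak.

    centroids: list of (mz, intensity) tuples sorted by m/z, at least 3 peaks.
    """
    n = len(centroids)
    if n < 3:
        raise ValueError("need at least three centroids")
    top = max(range(n), key=lambda i: centroids[i][1])
    # Three consecutive peaks centered on the top peak, clamped at the edges.
    start = min(max(top - 1, 0), n - 3)
    mzs = [centroids[start + k][0] for k in range(3)]
    avg_spacing = ((mzs[1] - mzs[0]) + (mzs[2] - mzs[1])) / 2.0
    return int(round(1.0 / avg_spacing))

# Peaks spaced 0.25 m/z apart, as expected for a 4+ ion.
cluster = [(500.00, 40.0), (500.25, 100.0), (500.50, 80.0), (500.75, 30.0)]
charge = spacing_charge(cluster)
```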
- unidec.IsoDec.datatools.subtract_matched_centroid_range(profile_data, matched_theoretical, centroids, noise_threshold=1000, tolerance=0.001, window_size=5)¶
unidec.IsoDec.encoding module¶
- unidec.IsoDec.encoding.encode_dir(pkldir, outdir=None, name='medium', maxfiles=None, plot=False, **kwargs)¶
- unidec.IsoDec.encoding.encode_double(centroid, centroid2, maxdist=1.5, minsep=0.1, intmax=0.2, phaseres=8)¶
- unidec.IsoDec.encoding.encode_harmonic(centroid, z, intmax=0.2, phaseres=8)¶
- unidec.IsoDec.encoding.encode_noise(peakmz: float, intensity: float, maxlen=16, phaseres=8)¶
- unidec.IsoDec.encoding.encode_phase(centroids, maxz=50, phaseres=8)¶
Encode the charge phases for a set of centroids :param centroids: Centroids (m/z, intensity) :param maxz: Maximum charge state to calculate :param phaseres: Resolution of phases to encode in number of bins :return: Charge phase histogram (maxz x phaseres)
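One way to picture the charge phase histogram: for each candidate charge, every centroid is assigned a fractional phase and its intensity is added to the matching phase bin. The sketch below is an assumption about the general technique, not a copy of encode_phase; `ISOTOPE_SPACING`, `phase_histogram`, and the final normalization are hypothetical choices.

```python
import math

# Assumed isotope spacing in Da (roughly the 13C - 12C mass difference).
# The value used by the library may differ.
ISOTOPE_SPACING = 1.0033

def phase_histogram(centroids, maxz=50, phaseres=8):
    """Return a maxz x phaseres nested list of binned intensities."""
    phases = [[0.0] * phaseres for _ in range(maxz)]
    for z in range(1, maxz + 1):
        for mz, intensity in centroids:
            # Fractional position of this peak on a 1/z-spaced grid.
            phase = (mz * z / ISOTOPE_SPACING) % 1.0
            # Clamp in case rounding pushes the phase up to 1.0.
            idx = min(int(math.floor(phase * phaseres)), phaseres - 1)
            phases[z - 1][idx] += intensity
    peak = max(max(row) for row in phases)
    if peak > 0:
        # Normalize so the largest bin equals 1 (an illustrative choice).
        phases = [[v / peak for v in row] for row in phases]
    return phases

# Synthetic 3+ cluster: peaks spaced ISOTOPE_SPACING / 3 apart.
base = ISOTOPE_SPACING * 1500.5625 / 3
peaks = [(base + k * ISOTOPE_SPACING / 3, inten)
         for k, inten in enumerate([40.0, 100.0, 80.0, 30.0])]
phases = phase_histogram(peaks, maxz=10, phaseres=8)
```

For the correct charge, all peaks of a cluster share one phase, so the row for z = 3 concentrates its intensity in a single bin. Multiples of the true charge behave similarly, which is one reason harmonics are a concern for charge prediction.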
- unidec.IsoDec.encoding.encode_phase_all(centroids, peaks, lowmz=-1.5, highmz=5.5, phaseres=8, minpeaks=3, datathresh=0.05)¶
Work on speeding this up :param centroids: :param peaks: :param lowmz: :param highmz: :return:
- unidec.IsoDec.encoding.encode_phase_file(file, maxlen=8, save=True, outdir='C:\\Data\\IsoNN\\multi', name='medium', onedropper=0.95)¶
- unidec.IsoDec.encoding.extract_centroids(centroids, peaks, lowmz=-1.5, highmz=5.5, minpeaks=3, datathresh=0.05)¶
- unidec.IsoDec.encoding.save_encoding(data, outfile)¶
unidec.IsoDec.engine module¶
- class unidec.IsoDec.engine.IsoDecDataset(emat, z)¶
Bases:
Dataset
Dataset class for IsoDec
- class unidec.IsoDec.engine.IsoDecEngine(phaseres=8, verbose=False, use_wrapper=False)¶
Bases:
object
Main class for IsoDec Engine
- add_doubles(double_percent)¶
Add double peaks to the training and test data :param double_percent: Percent of total data to add as double peaks :return: None
- add_harmonics(harmonic_percent=0.4)¶
Add harmonics at 2x charge to the training and test data :param harmonic_percent: Percent of total data to add as harmonics :return: None
- add_noise(noise_percent)¶
Add noise to the training and test data :param noise_percent: Percent of total data to add as noise :return: None
- batch_process_spectrum(data, window=None, threshold=None, centroided=False, refresh=False)¶
Process a spectrum and identify the peaks. It first identifies peak clusters, then predicts the charge, then checks the peaks. If all is good, it adds them to the MatchedCollection as MatchedPeak objects.
- Parameters:
data – Spectrum data, m/z in first column, intensity in second
window – Window for peak selection
threshold – Threshold for peak selection
centroided – Whether the data is already centroided. If not, it will centroid it.
- Returns:
MatchedCollection of peaks
- create_merged_dataloader(dirs, training_path, noise_percent=0.0, batchsize=None, double_percent=0.4, harmonic_percent=0.0, onedrop_percent=0.0)¶
Create a merged dataloader from multiple directories. Looks for common file names and merges them together :param dirs: Directories to look in :param training_path: File name or tag, fed to load_training_data :param noise_percent: Percent of noise to add to the training and test data :param batchsize: Batch size for training :param double_percent: Percent of double peaks to add to the training and test data :return:
- create_training_dataloader(training_path, test_path=None, noise_percent=0, batchsize=None, double_percent=0.4, harmonic_percent=0, one_drop_percent=0)¶
Create the training and test dataloaders from a single file path :param training_path: Path to the training data file or name of the file tag :param test_path: Optional path to the test data file or name of the file tag. Default is same as training :param noise_percent: Percent of noise to add to the training and test data :param batchsize: Batch size for training :param double_percent: Percent of double peaks to add to the training and test data :return:
- drop_ones(percentage=0.8)¶
Drop a fraction of the training data with charge 1 (80% by default) :param percentage: Fraction of charge 1 data to drop :return: None
- export_peaks(type='prosightlite', filename='output', reader=None, act_type='HCD', max_precursors=None)¶
- get_matches(centroids, z, peakmz, pks=None)¶
Get the matches for a peak :param centroids: Centroid data, m/z in first column, intensity in second :param z: Predicted charge :param peakmz: Peak m/z value :param pks: MatchedCollection peaks object :return: Indexes of matched peaks from the centroid data
- get_matches_multiple_z(centroids, zs, peakmz, pks=None)¶
- load_training_data(training_path, test_path=None, noise_percent=0.0, double_percent=0.4, harmonic_percent=0.0, onedrop_percent=0.0)¶
Load training data from a file :param training_path: Path to the training data file or name of the file tag :param test_path: Optional path to the test data file or name of the file tag. If omitted, defaults to training_path
- Parameters:
noise_percent – The percent of noise to add to the training and test data
double_percent – The percent of double peaks to add to the training and test data
- Returns:
None
- phase_predictor(centroids)¶
Predict the charge of a peak :param centroids: Set of centroid data for a peak with m/z in first column and intensity in second :return: Charge state, integer
- pks_to_mass(binsize=0.1)¶
Convert the MatchedCollection to mass :return: None
- process_file(file, scans=None)¶
- save_bad_data(filename='bad_data.pkl', maxbad=50)¶
Save bad data to a file. Evaluates the model, collects bad data, and saves it :param filename: Filename to save to. Default is bad_data.pkl :param maxbad: How many to save, default is 50 :return: None
- thrash_predictor(centroids)¶
- train_model(epochs=30, save=True, lossfn='crossentropy', forcenew=False)¶
Train the model :param epochs: Number of epochs :param save: Whether to save it. Default is True :param lossfn: Loss function, default is crossentropy. Options are crossentropy, weightedcrossentropy, focal :return: None
unidec.IsoDec.example_single_run module¶
unidec.IsoDec.generate module¶
unidec.IsoDec.isogenwrapper module¶
- class unidec.IsoDec.isogenwrapper.IsoGenWrapper(dllpath=None)¶
Bases:
object
- gen_isodist(mass)¶
- gen_isomike(mass)¶
- unidec.IsoDec.isogenwrapper.gen_isodist(mass, isolen=64)¶
- unidec.IsoDec.isogenwrapper.gen_isomike(mass, isolen=64)¶
- unidec.IsoDec.isogenwrapper.isodist¶
alias of
c_float_Array_64
unidec.IsoDec.match module¶
- class unidec.IsoDec.match.IsoDecConfig¶
Bases:
object
- set_scan_info(s, reader=None)¶
Sets the active scan info :param s: The current scan :param reader: The reader object :return: None
- class unidec.IsoDec.match.MatchedCollection¶
Bases:
object
Class for collecting matched peaks
- add_peak(peak)¶
Add peak to collection :param peak: Peak to add :return: None
- add_peaks(peaks)¶
- add_pk_to_masses(pk, ppmtol)¶
Checks if an existing mass matches this peak. If so, adds the peak to that mass; otherwise, creates a new mass. The list of masses is constantly kept in order of monoisotopic mass.
- add_pk_to_masses2(pk, ppmtol, rt_tol=2, maxshift=3)¶
Checks if an existing mass matches this peak. If so, adds the peak to that mass; otherwise, creates a new mass. The list of masses is constantly kept in order of monoisotopic mass.
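The match-or-insert behavior on a sorted mass list can be sketched with a binary search. This is a simplified illustration under stated assumptions: masses are plain floats, `add_to_sorted_masses` is a hypothetical helper, and the extra rt_tol and maxshift checks of add_pk_to_masses2 are not modeled.

```python
from bisect import bisect_left

def add_to_sorted_masses(masses, groups, monoiso, ppmtol):
    """Attach monoiso to a matching mass or insert it, keeping masses sorted.

    masses: ascending list of representative monoisotopic masses.
    groups: parallel list; groups[i] holds every mass assigned to masses[i].
    Returns the index of the group that received the new mass.
    """
    pos = bisect_left(masses, monoiso)
    # Only the two neighbors of the insertion point can be the closest match.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(masses)]
    if candidates:
        best = min(candidates, key=lambda i: abs(masses[i] - monoiso))
        ppm_error = abs(masses[best] - monoiso) / masses[best] * 1e6
        if ppm_error <= ppmtol:
            groups[best].append(monoiso)
            return best
    masses.insert(pos, monoiso)
    groups.insert(pos, [monoiso])
    return pos

masses, groups = [], []
for value in (10000.0, 20000.0, 10000.05, 15000.0):
    add_to_sorted_masses(masses, groups, value, ppmtol=10)
```

Here 10000.05 is 5 ppm from 10000.0, so it joins the existing group, while 15000.0 is inserted between the other two masses.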
- copy_to_string()¶
- export_msalign(config, reader, filename='export.msalign', act_type='HCD', max_precursors=None)¶
- export_prosightlite(filename='prosight.txt')¶
- export_tsv(filename='export.tsv')¶
- filter(minval, maxval, type='mz')¶
- get_z(item)¶
- get_z_dist()¶
- group_peaks(peaks)¶
- load_pks(filename='peaks.pkl')¶
- merge_missed_monoisotopics(ppm_tolerance=20, max_mm=1)¶
- save_pks(filename='peaks.pkl')¶
- to_df()¶
Convert a MatchedCollection object to a pandas dataframe :return: Pandas dataframe
- to_mass_spectrum(binsize=0.1)¶
- class unidec.IsoDec.match.MatchedMass(pk)¶
Bases:
object
Matched mass object for collecting data on MatchedPeaks with matched masses.
- class unidec.IsoDec.match.MatchedPeak(z, mz, centroids=None, isodist=None, matchedindexes=None, isomatches=None)¶
Bases:
object
Matched peak object for collecting data on peaks with matched distributions
- unidec.IsoDec.match.calculate_cosinesimilarity(cent_intensities, iso_intensities, shift: int, max_shift: int, minusoneareaszero: bool = True)¶
- unidec.IsoDec.match.compare_annotated(l1, l2, ppmtol, maxshift)¶
- unidec.IsoDec.match.compare_matched_ions(coll1, coll2, other_alg=None)¶
- unidec.IsoDec.match.compare_matchedcollections(coll1, coll2, ppmtol=50, objecttocompare='monoisos', maxshift=3, ignorescan=False)¶
- unidec.IsoDec.match.compare_matchedmasses(coll1, coll2, ppmtol=20, maxshift=3, rt_tol=2, f_shared_zs=0.5)¶
- unidec.IsoDec.match.create_isodist(peakmz, charge, data, adductmass=1.007276467)¶
Create an isotopic distribution based on the peak m/z and charge state. :param peakmz: Peak m/z value as float :param charge: Charge state :param data: Data to match to as 2D numpy array [m/z, intensity] :return: Isotopic distribution as 2D numpy array [m/z, intensity]
- unidec.IsoDec.match.create_isodist2(monoiso, charge, maxval, adductmass=1.007276467)¶
- unidec.IsoDec.match.create_isodist_full(peakmz, charge, data, adductmass=1.007276467, isotopethresh: float = 0.01)¶
Create an isotopic distribution based on the peak m/z and charge state. :param peakmz: Peak m/z value as float :param charge: Charge state :param data: Data to match to as 2D numpy array [m/z, intensity] :return: Isotopic distribution as 2D numpy array [m/z, intensity]
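The isotope distribution functions work from a peak m/z and a charge state, with adductmass defaulting to the proton mass. As a hedged illustration of how those quantities relate, the sketch below converts between neutral mass and m/z. The helper names are hypothetical, and the library may organize this arithmetic differently.

```python
PROTON_MASS = 1.007276467  # same value as the adductmass default above

def mass_to_mz(mass, z, adductmass=PROTON_MASS):
    """m/z of an ion with neutral mass `mass` carrying z adducts."""
    return (mass + z * adductmass) / z

def mz_to_mass(mz, z, adductmass=PROTON_MASS):
    """Neutral mass recovered from an observed m/z and charge z."""
    return z * (mz - adductmass)

mz = mass_to_mz(10000.0, 10)
mass = mz_to_mass(mz, 10)
```

With these definitions, a 10 kDa species at charge 10 appears near m/z 1001.0073, and neighboring isotope peaks sit roughly 1/z apart on the m/z axis.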
- unidec.IsoDec.match.df_to_matchedcollection(df, monoiso='Monoisotopic Mass', peakmz='Most Abundant m/z', peakmass='Most Abundant Mass', scan='Scan', z='Charge', intensity='Abundance', ion='Ion')¶
Convert a pandas dataframe to a MatchedCollection object :param df: Pandas dataframe with columns mz, intensity, z, scan, rt, monoiso, peakmass, avgmass :return: MatchedCollection object
- unidec.IsoDec.match.find_matched_intensities(spec1_mz: ndarray, spec1_intensity: ndarray, spec2_mz: ndarray, max_shift: int, tolerance: float, z: int, peakmz: float) List[float] ¶
Faster search for matching peaks. Makes use of the fact that spec1 and spec2 contain ordered peak m/z (from low to high m/z).
- Parameters:
spec1_mz – Spectrum peak m/z values as numpy array. Peak mz values must be ordered.
spec2_mz – Theoretical isotope peak m/z values as numpy array. Peak mz values must be ordered.
tolerance – Peaks will be considered a match when within [tolerance] ppm of the theoretical value.
- Returns:
List containing entries of type (centroid intensity)
- Return type:
matches
- unidec.IsoDec.match.find_matches(spec1: ndarray, spec2: ndarray, tolerance: float) Tuple[List[int], List[int]] ¶
Faster search for matching peaks. Makes use of the fact that spec1 and spec2 contain ordered peak m/z (from low to high m/z).
- Parameters:
spec1 – Spectrum peak m/z and int values as numpy array. Peak mz values must be ordered.
spec2 – Isotope distribution peak m/z and int values as numpy array. Peak mz values must be ordered.
tolerance – Peaks will be considered a match when <= tolerance apart in ppm.
- Returns:
List containing entries of type (idx1, idx2).
- Return type:
matches
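Both matching functions rely on the two m/z lists being sorted, which allows a single forward pass instead of comparing every pair. A minimal two-pointer sketch follows. It assumes plain lists of m/z values and returns (idx1, idx2) pairs as described in the docstring; `find_ppm_matches` is a hypothetical name, not the library function.

```python
def find_ppm_matches(mz1, mz2, tol_ppm):
    """Pair up peaks from two ascending m/z lists within a ppm tolerance.

    The tolerance is taken relative to each value in mz2 (the theoretical list).
    """
    matches = []
    start = 0
    for j, theo in enumerate(mz2):
        window = theo * tol_ppm * 1e-6
        low, high = theo - window, theo + window
        # Peaks below `low` cannot match this or any later theoretical peak.
        while start < len(mz1) and mz1[start] < low:
            start += 1
        i = start
        while i < len(mz1) and mz1[i] <= high:
            matches.append((i, j))
            i += 1
    return matches

observed = [500.000, 500.251, 500.502, 501.900]
theoretical = [500.0005, 500.2513, 500.7500]
pairs = find_ppm_matches(observed, theoretical, tol_ppm=5)
```

Because the lower bound only moves forward, `start` never needs to be reset between theoretical peaks.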
- unidec.IsoDec.match.get_accepted_shifts(cent_intensities, isodist, maxshift, min_score_diff, css_thresh, minusoneaszero=True)¶
- unidec.IsoDec.match.get_estimated_monoiso(peakmass)¶
Estimates the monoisotopic mass from the peak mass. This is a lazy approximation, but is good enough for most purposes. –JGP :param peakmass: Most abundant isotopologue mass
- Returns:
estimated monoisotopic mass
- unidec.IsoDec.match.get_unique_matchedions(coll1, coll2)¶
- unidec.IsoDec.match.is_close(mz1, mz2, tolerance)¶
- unidec.IsoDec.match.make_shifted_peak(shift: int, shiftscore: float, monoiso: float, massdist: ndarray, isodist: ndarray, peakmz: float, z: int, centroids: ndarray, matchtol: float, minpeaks: int, p1low: float, p1high: float, css_thresh: float, minareacovered: float, verbose=True)¶
- unidec.IsoDec.match.optimize_shift2(config, centroids: ndarray, z, peakmz)¶
- unidec.IsoDec.match.peak_mz_z_df_to_matchedcollection(df, data=None)¶
Convert a pandas dataframe of peak mzs (col1) and zs (col2) to a MatchedCollection object :param df: Pandas dataframe with columns mz, z :param data: Optional data to match to as 2D numpy array [m/z, intensity] :return: MatchedCollection object
- unidec.IsoDec.match.read_manual_annotations(path=None, delimiter=' ', data=None)¶
- unidec.IsoDec.match.read_msalign_to_matchedcollection(file, data=None, mz_type='monoiso')¶
Read an msalign file to a MatchedCollection object :param file: Path to msalign file :return: MatchedCollection object
- unidec.IsoDec.match.remove_noise_peaks(pks, noiselevel)¶
Remove peaks below a certain noise level :param pks: List of MatchedPeaks :param noiselevel: Noise level to remove :return: List of MatchedPeaks
unidec.IsoDec.models module¶
- class unidec.IsoDec.models.Fast4PhaseNeuralNetwork¶
Bases:
Module
Very simple neural net for classification. Inputs are 50x4 phase images. Output is len 50 array of probabilities for each charge state 0 to 49.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class unidec.IsoDec.models.Fast8PhaseNeuralNetwork¶
Bases:
Module
Very simple neural net for classification. Inputs are 50x8 phase images. Output is len 50 array of probabilities for each charge state 0 to 49.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class unidec.IsoDec.models.FocalLoss(alpha, gamma=2)¶
Bases:
Module
Class for focal loss function.
- forward(inputs, targets)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
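For orientation on the FocalLoss class, the commonly used focal loss for a single sample is -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below shows that standard formula in plain Python. It is not taken from the class itself, and treating alpha as a single scalar is a simplifying assumption.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, alpha=1.0, gamma=2.0):
    """Standard focal loss for one sample with integer class label `target`."""
    p_t = softmax(logits)[target]
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

uncertain = focal_loss([0.0, 0.0], target=0)
confident = focal_loss([5.0, 0.0], target=0)
plain_ce = focal_loss([0.0, 0.0], target=0, gamma=0.0)
```

With gamma set to 0 the expression reduces to ordinary cross entropy, and larger gamma values put more weight on hard, misclassified examples.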
- class unidec.IsoDec.models.PhaseModel(working_dir=None)¶
Bases:
object
General model class for charge state prediction based on a phase encoding.
Includes functions to train, evaluate, and predict charge states.
- batch_predict(dataloader, zscore_thresh=0.95)¶
Predict charge states for a batch of data. :param dataloader: DataLoader object with the data to predict :return: Output charge state predictions
- encode(centroids)¶
Encode the centroids into a format for the model. :param centroids: Centroid data, m/z and intensity :return: Output array of the encoded data, size dims[0] x dims[1]
- evaluate_model(dataloader, savebad=False)¶
Evaluate the model on a test set. :param dataloader: Test DataLoader object :param savebad: Whether to collect incorrect predictions :return: List of incorrect predictions if savebad is True
- get_class_weights(dataloader)¶
Calculate class weights for the data in a DataLoader object. :param dataloader: Training DataLoader object :return: None
- get_model(modelid)¶
Get the model based on the model ID. :param modelid: Model ID integer. Options are 0, 1, and 2. :return: None
- load_model()¶
Load model from savepath. :return: None
- predict(centroids)¶
Predict charge state for a single set of centroids. :param centroids: Centroid data, m/z and intensity :return: Predicted charge state, integer
- predict_returnvec(centroids)¶
Predict charge state for a single set of centroids. :param centroids: Centroid data, m/z and intensity :return: Predicted charge state, integer
- save_model()¶
Save model to savepath. :return: None
- setup_model(modelid=None, forcenew=False)¶
Setup model and load if savepath exists. Set device. :param modelid: Model ID passed to self.get_model() :param forcenew: Whether to force starting over from scratch on model parameters :return: None
- setup_training(lossfn='crossentropy', forcenew=False)¶
Setup loss function, optimizer, and scheduler. :param lossfn: Loss function to use. Options are “crossentropy”, “weightedcrossentropy”, and “focal”. :return: None
- train_model(dataloader, lossfn='crossentropy', forcenew=False)¶
Train the model on a DataLoader object. :param dataloader: Training DataLoader object :param lossfn: Loss function to use. Options are “crossentropy”, “weightedcrossentropy”, and “focal”. :return: None
- class unidec.IsoDec.models.PhaseNeuralNetwork(size=8, outsize=50)¶
Bases:
Module
Very simple neural net for classification. Generalized dimensions. Inputs are nxm phase images. Output is len outsize array of probabilities for each charge state 0 to outsize-1.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- unidec.IsoDec.models.print_model(model)¶
- unidec.IsoDec.models.save_model_to_binary(model, outfile)¶
- unidec.IsoDec.models.set_debug_apis(state: bool = False)¶
unidec.IsoDec.msalign_export module¶
- unidec.IsoDec.msalign_export.findprecursors(precursor_min, precursor_max, precursor_scanNum, ms1_scan_dict, max_precursors=None)¶
- unidec.IsoDec.msalign_export.findprecursors_noms1(precursor_min, precursor_max, pks, max_precursors=None)¶
- unidec.IsoDec.msalign_export.get_ms1_scan_num_id(ms1_scan_dict, k, v)¶
- unidec.IsoDec.msalign_export.sort_by_scan_order(matched_collection, order)¶
- unidec.IsoDec.msalign_export.write_ms1_msalign(ms1_scan_dict, ms2_scan_dict, file, config)¶
- unidec.IsoDec.msalign_export.write_ms2_msalign(ms2_scan_dict, ms1_scan_dict, reader, file, config, act_type='HCD', max_precursors=None)¶
unidec.IsoDec.oldisodecfunctions module¶
OLD ISODEC FUNCTIONS NO LONGER RELEVANT TO USE
unidec.IsoDec.plots module¶
- unidec.IsoDec.plots.cplot(centroids, color='r', factor=1, base=0, mask=None, mfactor=-1, mcolor='g', z=0, zcolor='b', zfactor=1, isodist=None)¶
Simple script to plot centroids :param centroids: Centroid array with m/z in first column and intensity in second :param color: Color :param factor: Multiplicative factor for intensity. -1 will set below the axis :param base: Base of the lines. Default is 0. Can be adjusted to shift up or down. :return: None
- unidec.IsoDec.plots.fast_vlines(centroids, color, base, factor)¶
Fast vlines plotter :param centroids: Centroids array :param color: Color :param base: Base of the lines :param factor: Factor to multiply intensity by :return: None
- unidec.IsoDec.plots.gen_plot_fig(pks, centroids, ccolor='k', forcecolor='b', tickfont=14, labfont=18)¶
- unidec.IsoDec.plots.on_scroll(event)¶
- unidec.IsoDec.plots.plot_pks(pks, data=None, centroids=None, scan=-1, show=False, labelz=False, title=None, ccolor='r', plotmass=True, zcolor=False, zcolormap='nipy_spectral', forcecolor=None, nocentroids=False, tickfont=12, labfont=14, labelpeaks=False)¶
unidec.IsoDec.runtime module¶
- class unidec.IsoDec.runtime.IsoDecRuntime(phaseres=8, verbose=False)¶
Bases:
object
Main class for IsoDec Engine
- batch_process_spectrum(data, window=5, threshold=0.0001, centroided=False, refresh=False)¶
Process a spectrum and identify the peaks. It first identifies peak clusters, then predicts the charge, then checks the peaks. If all is good, it adds them to the MatchedCollection as MatchedPeak objects.
- Parameters:
data – Spectrum data, m/z in first column, intensity in second
window – Window for peak selection
threshold – Threshold for peak selection
centroided – Whether the data is already centroided. If not, it will centroid it.
- Returns:
MatchedCollection of peaks
- export_peaks(type='prosightlite', filename=None, reader=None, act_type='HCD', max_precursors=None)¶
- phase_predictor(centroids)¶
Predict the charge of a peak :param centroids: Set of centroid data for a peak with m/z in first column and intensity in second :return: Charge state, integer
- pks_to_mass(binsize=0.1)¶
Convert the MatchedCollection to mass :return: None
- process_file(file, scans=None, check_centroided=True, assume_centroided=False)¶
- thrash_predictor(centroids)¶
unidec.IsoDec.train module¶
unidec.IsoDec.trainingdata module¶
- unidec.IsoDec.trainingdata.is_valid(file)¶
Check file to see if extension is recognized by UniDec :param file: File name :return: True/False whether file is valid UniDec data file
- unidec.IsoDec.trainingdata.match_charge(centroids, peakmz, charge_range=[1, 50], tol=0.01, threshold=0.85, nextbest=0.8, baseline=0.1, silent=True)¶
- unidec.IsoDec.trainingdata.match_peaks(centroids: array, isodist: array, tol: float = 5.0) Tuple[List[int], List[int]] ¶
matchingpeaks = matchms.similarity.spectrum_similarity_functions.find_matches(centroids[:, 0], isodist[:, 0], tol)
matchedindexes = [match[0] for match in matchingpeaks]
isomatches = [match[1] for match in matchingpeaks]
- unidec.IsoDec.trainingdata.process_dir(directory)¶
Process directory to turn raw files into centroid pkl files :param directory: Target directory :return: List of all charges found for histogram plots
- unidec.IsoDec.trainingdata.process_file(file, overwrite=False, peakdepth=10, maxpeaks=None, onedropper=0.9)¶
Process raw m/z data into centroid pkl files. :param file: File name :param overwrite: Whether to overwrite existing pkl files :param peakdepth: Maximum number of peaks to pick from each spectrum :param maxpeaks: Maximum total number of peaks to pick per file :param onedropper: Percentage of 1+ charge states to drop :return: List of charge states picked for histogram plots