ResAnalysis¶

class ResAnalysis(prodnet, solution_input, prod_to_remove_id)¶

This class provides functions that streamline the analysis of one or more solutions (all designs for a given set of parameters) and designs (a specific set of deletions within a solution from the mop).

ResAnalysis(prodnet, solution_input, prod_to_remove_id)¶

Args:

solution_input (cell/string or mop_solution structure): Indicates the solutions to be analyzed. Contains solution id(s) in the current prodnet problem path OR a mop_solution structure.
prod_to_remove_id (cell): The ids of products to be omitted from the analysis.

consistent_solutions = None¶: (structure of mop solutions) All mop_solutions have the same number of networks; if some networks are missing, the design variables and objectives are padded with zeros.

default_figure_info = None¶: (structure) Note: currently unused. Fields: .colors (vector of 5-10 colors with strong contrast), .line_type

default_plot_lines = None¶: (structure) Plot formatting. Note: currently unused.

is_wt_points_calc = None¶: (logical) Bookkeeping.

n_solutions = None¶: (integer) Number of loaded solutions.

prodnet = None¶: (Prodnet class) Note this is a copy of prodnet, not a reference.

solution_ids = None¶: (cell of strings) ids of loaded solutions, order matches the solutions structure.

solutions = None¶: (structure of mop solutions) Solutions for analysis.

wt_all_growth_rates = None¶: (matrix) Points for all wild type production network production envelopes.

wt_all_product_yields = None¶: (matrix) Points for all wild type production network production envelopes.

calc_flux_table(obj, design_ind, varargin)¶

Creates a table of flux distributions for the wild type (wt) and all mutant production networks. Two states are possible: growth and non-growth. Two flux estimation methods are also possible, pFBA and FVA; in both cases the fluxes of product and biomass are fixed to the optimal state.

Parameters:

design_ind (integer) – Index of the specific design to be applied to construct mutant.
plot_top_n (integer, optional) – Number of top fluxes to be plotted in the heatmap figure.
ng_state (logical, optional) – False (default): calculate growth state fluxes (the maximum growth rate is imposed as a constraint). True: calculate non-growth state fluxes (the growth rate is constrained to 0).
sol_ind (integer, optional) – Index of the solution to which the design_ind belongs. Defaults to 1.
add_FVA (logical, optional) – If true adds FVA results to the pFBA table. Default is false.
calc_cofactor_turnover (logical, optional) – If true, creates a new table with cofactor turnovers from pFBA flux distributions. Default is false.
min_obj (double, optional) – Minimum objective value required to include production networks in the final table. Default is 0.
write_output (logical, optional) – If true, writes an xls table (two if the cofactor turnover option is enabled) to the problem output folder. Default is true.
only_mutant_flux (logical, optional) – If true the table only contains reactions for mutant flux (no wild type, and no difference information). Default is false.
sort_by_diff (logical, optional) – If true, sorts the table by the average difference with respect to the wild type flux. Default is true.
solver (string, optional) – The solver to use. Default is ‘cplex’; if it is not available ‘gurobi’ will be used, and if neither is available ‘matlab’ will be used.

Returns:

flux.headers (cell) – headers for flux table
flux.data (double) – data for flux table
ct.headers (cell) – headers for cofactor turnover table
ct.data (double) – data for cofactor turnover table

Notes

Gurobi is the preferred QP solver for pFBA. Parameters have been tweaked to ensure convergence. While the Matlab QP solver (quadprog) is supported, it may not converge.

calc_prod_envelope(obj, model_ind, npoints)¶

Finds points for a 2d projection of the metabolic model by solving a series of LPs.

Parameters:

model_ind (int) – index of the target production network.
npoints (int) – Number of points to sample.

Returns:

growth_rates (vector)
product_yields (vector)

Notes

This function is a wrapper for the cobratoolbox function productionEnvelope().

calc_prod_envelope_s(model, npoints)¶

Finds points for a 2d projection of the metabolic model by solving a series of LPs. This is a static version of calc_prod_env

Parameters:

model – A cobra model with modcell fields.
npoints (int) – Number of points to sample.

Returns:

growth_rates (vector)
product_rates (vector)
product_yields (vector)

Notes

This function is a wrapper for the cobratoolbox function productionEnvelope().

compatibility(obj, varargin)¶

Analyze compatibility of the solutions in obj.solutions

Args: cutoff (double): Compatibility threshold, default is 0.6.

Returns:

comp(i).vals (vector) – compatibility of all solutions
comp(i).max (double) – maximum compatibility
comp(i).max_inds (vector) – indices of the most compatible solutions
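As a rough illustration of the quantities returned here, the sketch below computes compatibility on a toy matrix, using the definition given under plot_compatibility (the number of products with design objective above a threshold). The matrix PF, its values, and the use of >= rather than > are assumptions of this sketch, not the ModCell2 implementation.

```matlab
% Toy Pareto front: rows are designs, columns are products (hypothetical values).
PF = [0.9 0.7 0.1;
      0.2 0.8 0.65;
      0.7 0.6 0.95];
cutoff = 0.6;  % default compatibility threshold from the documentation

% Compatibility of each design: number of products at or above the cutoff.
% (Whether the comparison is >= or > is an assumption of this sketch.)
vals = sum(PF >= cutoff, 2);

max_val  = max(vals);              % maximum compatibility
max_inds = find(vals == max_val);  % indices of the most compatible designs
```

With these values the third design is the most compatible, reaching the cutoff for all three products.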

create_consistent_solutions(obj)¶

Create consistent solutions, i.e. all mop_solutions index to the same production networks. The first solution is used as a reference, and missing production networks are included in other solutions.

Notes

This is relevant for analysis functions like plot_yield_vs_growth and plot_design_tradeoff.
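The zero padding mentioned for consistent_solutions can be pictured with a small standalone sketch. The product ids and objective values below are hypothetical, and the actual method may organize the data differently; the point is only how a solution lacking one production network can be aligned to the reference ordering.

```matlab
% Hypothetical product ids for a reference solution and a second solution
% that lacks one production network.
ref_ids   = {'etoh', 'but', 'prop'};
other_ids = {'etoh', 'prop'};
other_obj = [0.8 0.3;     % objectives of the second solution:
             0.5 0.6];    % rows are designs, columns follow other_ids

% Locate each network of the second solution in the reference ordering.
[is_present, loc] = ismember(other_ids, ref_ids);

% Pad: networks missing from the second solution keep objective 0.
padded_obj = zeros(size(other_obj, 1), numel(ref_ids));
padded_obj(:, loc(is_present)) = other_obj(:, is_present);
```

After padding, every solution indexes the same three production networks, so column k always refers to the same product.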

escher_input_from_pFBA_table(obj, pfba_t_filename, column_id)¶

Keeps two columns from the pFBA table and writes a csv file that can be used for analysis with Escher.

Parameters:

pfba_t_filename (string) – name of the file resulting from `src.@ResAnalysis.calc_flux_table()`
column_id (string) – id indicating the production network to be kept in the output.

fcm_get_table(obj, n_clusters, sol_ind)¶

Performs fuzzy c-means clustering and returns the output in a table form

Parameters:

n_clusters (integer) – Number of clusters for fuzzy c-means.
sol_ind (integer, optional) – Index of the solution to be analyzed, defaults to 1.

Returns:

table_id (cell of strings) – contains cluster id (headers)
table_val (doubles) – table of membership values

fcm_scan_c(obj, sol_ind)¶

Plot the objective value of fuzzy c-means results versus the number of clusters

Parameters:	sol_ind (integer, optional) – Index of the solution to be plotted, defaults to 1.

get_CPF(PF, cutoff)¶: Computes categorical pareto front.

graph_deletion_frequency(obj, solution_ind)¶

Defined by a square matrix of reaction deletion frequencies, this graph helps identify clusters of reactions that are frequently deleted together. Aij = P(reaction j in design | reaction i in design).

Notes

The graph has to be directed, because the probability of needing one deletion depends on which one has been done first.
A simpler graph could use the overall probability (based on the number of designs).
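The conditional frequency Aij can be estimated from a binary design matrix, as in the minimal sketch below. The matrix D and its layout (designs by reactions) are assumptions made for illustration; the method itself may build the matrix differently.

```matlab
% Toy design matrix: rows are designs, columns are reactions,
% D(d, r) = 1 if reaction r is deleted in design d (hypothetical data).
D = [1 1 0;
     1 0 1;
     1 1 0;
     0 1 1];

% co(i, j): number of designs that delete both reaction i and reaction j.
co = D' * D;

% A(i, j) = P(reaction j in design | reaction i in design).
% Row i is divided by the number of designs containing reaction i.
n_i = diag(co);
A = co ./ repmat(n_i, 1, size(co, 2));
```

Here A(1,3) is 1/3 while A(3,1) is 1/2, which shows why the resulting graph is directed. A reaction that is never deleted would produce a division by zero and should be filtered out beforehand.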

graph_sequential_implementation(obj)¶: Wrapper for graph_sequential_implementation_s()

graph_sequential_implementation_s(mop_solutions, solution_id, prodnet, prod_id, varargin)¶

Sequential implementation directed k-partite graph. Each partition corresponds to a set of parameters (e.g. wGCP-5-0) and each node to a specific design. E.g. wGCP-5-0-1 points to wGCP-6-0-1 if the deletions in the former are contained in the latter.

Notes

The output files are meant to be analyzed with Cytoscape.
The implementation uses sets of deletion IDs instead of faster logical indices, to allow for the case where indices are not consistent (e.g. wGCP vs NGP).

Parameters:

mop_solutions (cell array of mop_solution) –
solution_id (cell) –
prodnet (Prodnet object instance) –
prod_id (cell) –
write_nstep_graph (logical, optional) – Default false.
base_path (string, optional) – Default = [inputs.prodnet.problem_path, filesep, ‘output’, filesep]

Returns:

sequential_implem_graph_edge.csv (csv file) – headers correspond to: source | target | additional_rxns |is_same.
sequential_implem_graph_edge_nstep.csv (csv file) – The same as sequential_implem_graph_edge.csv but the graph is complete in the sense that a node from one parameter set can have edges to any other downstream parameter set. ( In sequential_implem_graph_edge.csv nodes from one parameter set can only be connected with the next parameter set)
sequential_implem_graph_node.csv (csv file) – A node attribute table with: node_id | short_name | design_param(i.e. partition)
Warning – This method assumes that mop_solutions are ordered in terms of increasing deletions.
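The edge rule described for this graph (a design points to a design in the next parameter set if its deletions are contained in it) can be sketched with set operations on deletion ids. The struct layout, the reaction ids, and the ';' separator are hypothetical choices for this sketch, not the format used by the method.

```matlab
% Hypothetical designs from two consecutive parameter sets, stored as
% sets of deletion ids rather than logical indices.
src = struct('id', {'wGCP-5-0-1', 'wGCP-5-0-2'}, ...
             'del', {{'R1','R2'}, {'R3','R4'}});
tgt = struct('id', {'wGCP-6-0-1'}, ...
             'del', {{'R1','R2','R5'}});

edges = cell(0, 3);  % columns: source | target | additional_rxns
for i = 1:numel(src)
    for j = 1:numel(tgt)
        % An edge exists if every deletion of the source is in the target.
        if all(ismember(src(i).del, tgt(j).del))
            extra = setdiff(tgt(j).del, src(i).del);
            edges(end+1, :) = {src(i).id, tgt(j).id, strjoin(extra, ';')}; %#ok<AGROW>
        end
    end
end
```

Only wGCP-5-0-1 is contained in wGCP-6-0-1, so a single edge is produced, with R5 as the additional reaction.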

identify_deletion_role(model, base_deletion_set, compare_deletion_set, varargin)¶

Identifies the role of one or more reaction deletions in the model flux distribution.

Parameters:

model (cobra model) –
base_deletion_set (cell array of reaction ids) – Used as a reference.
compare_deletion_set (cell array of reaction ids) – Usually contains one deletion fewer than the base_deletion_set, to identify the role of that reaction.
growth_state (string, optional) –
- ‘all’ (default), growth rate is not constrained.
- ‘max’, growth rate is fixed to the maximum attainable by the model with the base_deletion_set applied.
- ‘none’, growth rate is fixed to zero.
difference_type (string, optional) –
- fva_range_l1 (default), computes fva for both models, the flux range, and then sorts by the absolute difference of ranges.
- fva_range_jaccard, computes fva for both models, the flux range, and then sorts by the Jaccard distance of ranges.
- sample_distance, performs sampling in both models, then sorts by the distance specified by the parameter ‘distribution_metric’.
- pfba, compares the pfba solutions when maximizing growth rate for both models, and sorts by L1 norm.
distribution_metric (string, optional) – If using the ‘sample_distance’ difference_type, determines what metric to use when comparing distributions.
- ‘kolmogorov-smirnov-p’ (default)

Warning

Only fva methods have been tested
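To make the two FVA-based difference types more concrete, the sketch below computes an absolute range difference and an interval Jaccard distance for hypothetical FVA bounds. Treating each flux range as an interval and using 1 - |intersection| / |union| is an assumption of this sketch; the method's exact formulas may differ.

```matlab
% Hypothetical FVA results for three reactions in the two models.
min_base = [0; -5; 2];   max_base = [10; 5; 2];
min_comp = [0;  0; 2];   max_comp = [ 4; 5; 8];

% fva_range_l1: absolute difference of the flux ranges.
range_base = max_base - min_base;
range_comp = max_comp - min_comp;
l1_diff = abs(range_base - range_comp);

% Jaccard distance between the two flux intervals of each reaction:
% 1 - |intersection| / |union| (interval interpretation is an assumption).
inter = max(0, min(max_base, max_comp) - max(min_base, min_comp));
uni   = range_base + range_comp - inter;
jaccard = 1 - inter ./ uni;
jaccard(uni == 0) = 0;   % two identical point intervals: distance 0

[~, sort_ind] = sort(l1_diff, 'descend');   % most affected reactions first
```

The third reaction shows how the two measures can disagree in emphasis: its range grows from a fixed value to a width of 6, and since a point interval has zero overlap length, its Jaccard distance is the maximum value of 1.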

load_solutions(obj, solution_ids)¶

Loads mop_solutions from problem output folders.

Parameters:	solution_ids (cell or string) – ids of mop_solution files.

piechart_deletion_frequency(deletions, categories_in, varargin)¶

Pie chart of deletion distribution with category information.

Args:

deletions(index).id (cell array): deletion ids
categories (containers.Map()): maps ids to their respective category (e.g. subsystem).
top (double, optional)
figure_handle (object, optional)

Notes

While containers.Map() seemed like a useful data structure, it does not integrate well with the rest of Matlab.
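A minimal sketch of the inputs described above, and of how category counts for a pie chart could be derived from them, is given below. The reaction ids and category names are hypothetical, and the counting logic is an illustration rather than the function's actual code.

```matlab
% Hypothetical deletions of two designs and a category map.
deletions(1).id = {'PFL', 'LDH', 'ACK'};
deletions(2).id = {'PFL', 'PTA'};
categories = containers.Map( ...
    {'PFL', 'LDH', 'ACK', 'PTA'}, ...
    {'Pyruvate metabolism', 'Pyruvate metabolism', ...
     'Acetate metabolism', 'Acetate metabolism'});

% Pool all deletion ids and look up their categories.
all_ids  = [deletions.id];
all_cats = cellfun(@(id) categories(id), all_ids, 'UniformOutput', false);

% Count how often each category appears.
[cat_names, ~, idx] = unique(all_cats);
counts = accumarray(idx(:), 1);
% pie(counts, cat_names) would then draw the chart.
```

The lookup has to go through cellfun because a containers.Map is indexed one key at a time in this form, which is one example of the awkward integration mentioned in the note.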

piechart_deletion_frequency_w(obj, sol_inds, varargin)¶: Wrapper for piechart_deletion_frequency()

plot_2d_pf(obj, n_clusters, solution_ind)¶

A 2 dimensional representation of the pareto front, representative solutions are determined through k-medoids clustering.

Parameters:

n_clusters (integer) – Number of clusters for k-medoids.
solution_ind (integer, optional) – Index of the solution to be plotted, defaults to 1.

plot_compatibility(obj, varargin)¶

Box plot of compatibility distributions across parameter sets. Compatibility of a solution is the number of products with design objective above a certain threshold.

Parameters:

categorical_cutoff_wGCP_NGP (-) – build the categorical pareto front of solutions using the wGCP and NGP objectives, the default is 0.6.
categorical_cutoff_sGCP (-) – build the categorical pareto front of solutions using the sGCP objective, the default is 0.36.
plot_type (-) – ‘default’ (default), ‘no-ndesigns’,’inplot-ndesigns’.
y_n_loc (-) – If the option ‘inplot-ndesigns’ is used, this controls the y position. Default is 0.4.

plot_design_tradeoff(obj, design_ind, varargin)¶

Plots the objective value of the selected design(s), specified by design_ind, with respect to the maximum objective value for each objective.

Usage:: For two wGCP solutions, wGCP-10-0 and wGCP-10-3 loaded in obj, in that order, to plot wGCP-10-0-5 and wGCP-10-3-10 run: obj.plot_design_tradeoff([5,10],[1,2]).

Parameters:

design_ind (vector) – Indices of the solutions of designs to be plotted, each design corresponds to a solution specified in inputs.solution_ind.
solution_ind (integer, optional) – Defaults to 1:length(design_ind).
sort_solution (logical, optional) – Defaults to true, the objectives are sorted to improve readability. If false, the order of ra.consistent_solutions is used.
plot_type (string, optional) – ‘overlap’(default), all solutions in one plot. ‘split’(one plot per solution); ‘split-bar’, one plot per solution using bar plot for maximum objective.

plot_fva_range(fva_results, varargin)¶

Creates a visual plot of fva ranges comparing two models

Args

fva_results.rxns (cell array): reaction ids
fva_results.maxflux_base (vector): vector of maximum fluxes for base model
fva_results.minflux_base (vector)
fva_results.maxflux_compare (vector)
fva_results.minflux_compare (vector)
sort_ind (vector, optional): A sorting vector for reactions. Default is sort by range.
top (integer, optional): plots the top reactions from sort_ind, default is the first 20.

Credits

Adapted from cobra toolbox tutorial

plot_pareto_front(obj, varargin)¶

Plots pareto front clustergram and related figures.

Parameters:

solution_ind_or_id (integer or string, optional) –
plot_cpf (logical, optional) –
save_cpf (string, optional) – ‘no’ (default), ‘heatmap’ (saves a table of 0s and 1s which can be used for a heatmap), ‘names’ (a real table version which lists the networks in each cpf).
cpf_cutoffs (vector, optional) – A vector of objective values used to generate a categorical pareto front. Note that a scalar can also be provided. Default is 0.6.
plot_pareto_set (logical, optional) – If true, the pareto set is plotted as a clustergram. Default false.
plot_hetmap (logical, optional) – If true, heatmap plots are used instead of clustergrams. Defaults to false.
save_to_emf (logical, optional) – If true, saves the output to an enhanced metafile. Defaults to true.
figure_size (vector of figure size, optional) – Used in Matlab to specify figure size and location. By default Matlab will determine this.

Notes

This instruction can be used to suppress figure display: set(0,’DefaultFigureVisible’,’off’)

plot_pca()¶: WIP

plot_yield_vs_growth(obj, design_inds, varargin)¶

Generates a multiple plot of production envelopes (aka convex hull). Many aspects of the plot can be customized.

Usage:: Note that optional parameters must be entered as a string-value pair, e.g. obj.plot_yield_vs_growth([5,1],’min_obj_val’, 0.1).

Parameters:

design_inds (vector) – The length matches the number of solutions loaded in obj.solutions. Each entry corresponds to the index of a design to be plotted for the solution in the same position. E.g. if wGCP-5-0 and wGCP-6-0 are loaded, design_inds = [5,1] plots wGCP-5-0-5 and wGCP-6-0-1.
plot_type (string, optional) – Two options, ‘matrix’(default) where each row is a design, or ‘overlap’ where designs share the same phenotypic space.
min_obj_val (double, optional) – Products below this objective value are not plotted. Default is 0.
yticks_values (vector, optional) – Default is chosen by Matlab.
n_rows (string, optional) – For overlap plot. # rows
n_cols (string, optional) – For overlap plot. # columns
use_prod_name (logical, optional) – If true, product names are used for plot titles. By default(false) product ids are used.
npoints (integer, optional) – Points used to sample the convex hull
convHullLineWidthWT (double, optional) –
convHullLineWidthMut (double, optional) –
fill_space (logical, optional) – Default true, colors the space inside the convex hull.
wt_color (rgb triplet scaled from 0-1, optional) – e.g. [0,255,0]./255
wt_line_color (rgb triplet scaled from 0-1, optional) –
mut_color (rgb triplet scaled from 0-1, optional) –
mut_line_color (rgb triplet scaled from 0-1, optional) –
set_axis_front (logical, optional) – Default true, puts the axes on top of the drawn figure content.
TODO –
- Document additional options.
- Consolidate with static method to avoid code duplication.

plot_yield_vs_growth_s(model_array, ko_array, prod_id, solution_id, varargin)¶

Generates a multiple plot of production envelopes (aka convex hull). Many aspects of the plot can be customized.

Parameters:

model_array (structure array) – A structure of cobra models.
ko_array (structure array) – The indices of this structure match those of the models. The only field is ko_array(i).designs(j).del, which is a cell array containing either reaction deletions or gene deletions. i corresponds to the model index, and j to the design. All models must have the same number of designs. If the deletion_type is ‘other’, ko_array(i).designs(j).ub and ko_array(i).designs(j).lb must be provided.
prod_id (cell of strings) – Names of the models in model_array.
solution_id (cell of strings) – Names of the solutions in model index.
deletion_type (string, optional) – Type of deletion. Default is ‘reaction’. Alternatives are ‘gene’ for gene deletions, and ‘other’, which will enforce ko_array(i).designs(j).ub and .lb.
use_rates (logical, optional) – If true, the product rate will be plotted vs growth rate; if false (default), the product yield will be plotted vs growth rate.
plot_type (string, optional) – Two options, ‘matrix’(default) where each row is a design, or ‘overlap’ where designs share the same phenotypic space.
yticks_values (vector, optional) – Default is chosen by Matlab.
n_rows (string, optional) – For overlap plot. # rows
n_cols (string, optional) – For overlap plot. # columns
npoints (integer, optional) – Points used to sample the convex hull
convHullLineWidthWT (double, optional) –
convHullLineWidthMut (double, optional) –
fill_space (logical, optional) – Default true, colors the space inside the convex hull.
wt_color (rgb triplet scaled from 0-1, optional) – e.g. [0,255,0]./255
wt_line_color (rgb triplet scaled from 0-1, optional) –
mut_color (rgb triplet scaled from 0-1, optional) –
mut_line_color (rgb triplet scaled from 0-1, optional) –
set_axis_front (logical, optional) – Default true, puts the axes on top of the drawn figure content.
Usage –
- Optional parameters must be entered as a string-value pair, e.g. obj.plot_yield_vs_growth([5,1],’min_obj_val’, 0.1).
- Module variables must be applied prior to input in the function (i.e. ko_array(i).designs(j).del should contain deletions of the original design - module variables).
Warning –
- The state of the model_array (i.e. deletions or lack thereof) will be considered as the wild type state. Thus it is
Notes –
- This function is based on plot_yield_vs_growth.m but it is meant to be more flexible.
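The nested ko_array layout can be easier to follow with a concrete instance. The sketch below builds a structure for two models with two designs each and checks the stated requirement that all models have the same number of designs. The reaction ids are hypothetical, and the check is an illustration, not part of the function.

```matlab
% Two models (index i) with two designs each (index j); ids are hypothetical.
ko_array(1).designs(1).del = {'R1', 'R2'};
ko_array(1).designs(2).del = {'R1', 'R3'};
ko_array(2).designs(1).del = {'R1', 'R2'};
ko_array(2).designs(2).del = {'R1', 'R3'};

% The documentation requires the same number of designs for every model.
n_designs = arrayfun(@(k) numel(k.designs), ko_array);
assert(all(n_designs == n_designs(1)), ...
       'All models must have the same number of designs.');
```

For deletion_type ‘other’, each ko_array(i).designs(j) would additionally carry the .ub and .lb fields described above.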

print_design(obj, design_ind, varargin)¶

Displays deleted reactions for a certain solution. Also indicates if the deletion does not apply to a certain production network (module reaction).

Parameters:

design_ind (int) – Index of the design to be printed.
sol_ind (int, optional) – Index of the solution from which to draw design, default is 1.
geneid2name (dict_path, optional) – The path to a two column csv file, where the first column contains gene ids and the second column contains gene names. Default is ‘’.
extra_rxns (cell array of reaction ids) – Additional reactions not in the design which will be included in the table. (useful for alternative solutions)
is_alternative (logical, optional) – If true, an alternative solution of design_ind, specified by alternative_ind will be considered. Default is false.
alternative_ind (double, optional) – Only relevant for alternative solutions (see is_alternative). Index of the alternative solution, default is 1.
verbose (logical, optional) – Whether or not to display the design. Default is true.

Returns:

T (table) – Design information
deleted_reactions (cell array)

Notes

Currently only supports reaction deletions.

remove_always_zero_prod(obj, solution_ind)¶

Removes products (ignore for the analysis) which are always 0 in the feasible solutions corresponding to solution_ind.

Args: solution_ind (int): Index of the solution to be deleted
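The criterion described here, products whose objective is 0 in every design, can be sketched on a toy matrix. The matrix PF, the product ids, and the exact comparison with zero are assumptions of this sketch; a tolerance may be more appropriate for numerical solver output.

```matlab
% Toy objective matrix: rows are designs, columns are products.
PF = [0.8 0 0.3;
      0.5 0 0.0;
      0.9 0 0.6];
prod_id = {'etoh', 'but', 'prop'};   % hypothetical product ids

% True for products that are 0 in every design.
always_zero = all(PF == 0, 1);
prod_to_remove_id = prod_id(always_zero);
```

Only the second product is selected; the third is zero in one design but not in all of them.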

remove_products(obj, prod_to_remove_id)¶

Deletes production networks from obj.prodnet, so that they are ignored for the analysis.

Args: prod_to_remove_id (cell of strings)

Prodnet already has a method for this.

set_solution_state(obj, sol_ind)¶

Returns a mop_solution and sets the obj.prodnet to the same state, in order to avoid side effects when analyzing that solution.

Parameters:	sol_ind (integer) – Index of the solution to be retrieved.

similarity_plots(obj, type, output_graph_name, corr_cutoff, correl_type, solution_ind_or_id)¶

Render correlation graphs for pareto front matrix.

Parameters:	type (str) – Correlation coefficient ‘pearson’ or ‘spearman’.

Notes

Because Matlab's graph rendering features are terrible, the graph can be output to a csv for better drawing with high quality free software like Cytoscape.
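As an illustration of the idea, the sketch below builds a correlation edge list from a toy Pareto front matrix using base Matlab only. Correlating products (columns), the variable names, and the edge list layout are assumptions of this sketch; only Pearson correlation is shown because corrcoef does not compute Spearman.

```matlab
% Toy Pareto front matrix: rows are designs, columns are products.
PF = [0.1 0.2 0.9;
      0.4 0.5 0.6;
      0.7 0.8 0.3;
      0.9 1.0 0.1];
prod_id = {'A', 'B', 'C'};   % hypothetical product ids
corr_cutoff = 0.8;

R = corrcoef(PF);            % Pearson correlation between columns

% Keep each pair once (upper triangle) if its correlation exceeds the cutoff.
[i, j] = find(triu(R > corr_cutoff, 1));
edges = [prod_id(i)', prod_id(j)', num2cell(R(sub2ind(size(R), i, j)))];
```

The resulting cell array (source, target, correlation) could then be written to a csv file and imported into Cytoscape as an edge table.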

stepwise_implementation(obj, design_ind, varargin)¶

Explores all possible subsets of a solution and generates a report and a tree (Cytoscape input) with the most promising candidates to achieve the target design with useful designs on the way.

Parameters:

design_ind (integer) – Index of the design in the solution given by sol_ind to be analyzed.
sol_ind (integer, optional) – Solution index, default 1.
compatibiliy_cutoff (double, optional) – Value used to determine compatibility of a design, default is 0.6.
max_level (integer, optional) – Number of levels in the implementation tree to explore. The default is up to the number of deletions in the given design minus 1 (e.g. if the given design has 5 deletions, subset designs with 1, 2, 3, and 4 deletions will be explored).
write_tables (logical, optional) – If true (default is false), a table of designs, sorted by compatibility, is written for each level to output_base_path + levelX.csv.
write_graph (logical, optional) – Default is true.
output_base_path (string, optional) – For output files, default is obj.prodnet.problem_path/output/<design-objective>-<design-deletions>-<design-ind>
min_obj_val (double, optional) – For output graph, products with objectives below min_obj_val in all designs will be removed. Default is 0.1.
alt_sol_ind (integer, optional) – Index of an alternative_solution to design_ind. Default is 0 which indicates that no alternative solution is considered.
only_nondominated (logical, optional) – If true, only non-dominated solutions at each step are kept. Default is false.

Notes

Module reactions are those of the final solution, so only one module needs to be constructed for all strains.
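The size of the search described above follows from enumerating subsets level by level. The sketch below lists all subsets of a hypothetical four-deletion design up to the documented default max_level; it only illustrates the enumeration and does not evaluate or rank the subset designs as the method does.

```matlab
design = {'R1', 'R2', 'R3', 'R4'};   % hypothetical deletions of the target design
max_level = numel(design) - 1;       % documented default

subsets = cell(max_level, 1);
for level = 1:max_level
    % Each row of idx selects one subset with `level` deletions.
    idx = nchoosek(1:numel(design), level);
    subsets{level} = arrayfun(@(r) design(idx(r, :)), ...
                              (1:size(idx, 1))', 'UniformOutput', false);
end
n_per_level = cellfun(@numel, subsets);
```

For four deletions this gives 4, 6, and 4 candidate designs at levels 1 to 3. The count grows combinatorially with the number of deletions, which is one reason to limit max_level for large designs.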

write_result_tables(obj, varargin)¶

Generates a report for a given set of solutions

Parameters:

file_name (string, optional) – Name of the output file. Defaults to problem-name-report.
skip_log – Indicates if the sheet containing a log of the solutions should be skipped. Defaults to false.

Notes

Gene deletion report not supported yet.
All designs have to be growth type (i.e. wGCP,sGCP) or non-growth (NGP).

Currently csv output does not include superheaders

write_to_xls(obj, file_name, skip_log)¶

Generates a report for a given set of solutions

Parameters:

file_name (string, optional) – Name of the output file. Defaults to problem-name-report.
skip_log (logical, optional) – Indicates if the sheet containing a log of the solutions should be skipped. Defaults to false.

Notes

Gene deletion report not supported yet.
All designs have to be growth type (i.e. wGCP,sGCP) or non-growth (NGP).

Warning

Deprecated function, use ResAnalysis.write_result_tables instead.
