figureflow.statannot.plot_and_add_stat_annotation

figureflow.statannot.plot_and_add_stat_annotation = <function plot_and_add_stat_annotation>

Plot data row and add statistical annotations on top.

Optionally computes a statistical test between pairs of data series and adds statistical annotations on top of the boxes. Uses the same arguments data, x, y, hue, order, hue_order as the seaborn boxplot function. This function works in one of the following two modes:

a) perform_stat_test is True: the statistical test given by the argument test is performed.

b) perform_stat_test is False: no statistical test is performed; the list of custom p-values given in pvalues is used for each pair of boxes. The test_short_name argument is then used as the name of the custom statistical test.

Parameters:
  • x – Column name of data to be used for x axis (already a parameter for figure_panel.show_data)

  • y – Column name or list of column names of data to be used for y axis. Multiple y values will be plotted as several rows. (already a parameter for figure_panel.show_data)

  • hue – column name of data to be used for hue (already a parameter for figure_panel.show_data)

  • col – column name of data used for plots in different columns (generating a row of plots) (already a parameter for figure_panel.show_data)

  • x_order – List of x values, after applying the changes of x_labels, determining the order of x values (already a parameter for figure_panel.show_data)

  • hue_order – List of hue values, after applying the changes of hue_labels, determining the order of hue values (already a parameter for figure_panel.show_data)

  • col_order – List of col values, after applying the changes of col_labels, determining the order of col values (already a parameter for figure_panel.show_data)

  • show_legend – Whether to show legend of plot for different values in hue column (will not be shown if there is only one hue value)

  • pair_unit_columns – list of columns that uniquely identify one set of dependent datapoints (needed to connect paired datapoints and also needed as preprocessing to allow statistics tests for paired data)

  • plot_type – String of plot-type name (e.g. “box”, “bar”, “line”, “regression”, “scatter”, “swarm”). Available values are determined by the modules in figureflow.plots. Allowed strings are either the complete module name or the module name without the “_plot” ending.
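The naming rule for plot_type can be illustrated with a small sketch. The helper normalize_plot_type is hypothetical and not part of figureflow; it only assumes, as stated above, that module names in figureflow.plots end in “_plot”.

```python
def normalize_plot_type(plot_type: str) -> str:
    """Return the module-style name for a plot type string.

    Hypothetical helper (not a figureflow function): accepts either
    the short form ("box") or the full form ("box_plot") and always
    returns the form ending in "_plot".
    """
    suffix = "_plot"
    if plot_type.endswith(suffix):
        return plot_type
    return plot_type + suffix


print(normalize_plot_type("box"))       # short form
print(normalize_plot_type("box_plot"))  # full module name
```

Both calls are expected to yield the same module-style name, which is why either spelling can be passed.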

  • data_plot_kwds – dictionary with keywords for plotting object used in function “plot_data” depending on the plot_type defined; see respective data_plot object for parameters which can be used

  • show_data_points – Whether to show individual data points as a swarm plot. This is only possible for non-continuous x values.

  • connect_paired_data_points – Whether to connect paired data points of paired data in different groups with lines

  • plot_colors – List of plot colors

  • box_pairs

    Pairs of data groups (boxes) that should be statistically compared. Can be in any of the following forms:

    For non-grouped boxplots: [[cat1, cat2], [cat3, cat4]]

    For boxplots grouped by hue: [[(cat1, hue1), (cat2, hue2)], [(cat3, hue3), (cat4, hue4)]]

    For boxplots grouped by hue and an additional column (col): [[(col1, cat1, hue1), (col2, cat2, hue2)], [(col3, cat3, hue3), (col4, cat4, hue4)]]
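As a concrete illustration of the three box_pairs forms, the lists below use made-up category, hue and col values (“control”, “drug_a”, “day1”, “exp1”, and so on are placeholders, not names from figureflow).

```python
# All category, hue and col values below are made-up placeholders.

# Non-grouped boxplot: each pair holds two x categories.
box_pairs_plain = [["control", "drug_a"], ["control", "drug_b"]]

# Grouped by hue: each group is an (x category, hue value) tuple.
box_pairs_hue = [
    [("control", "day1"), ("drug_a", "day1")],
    [("control", "day7"), ("drug_a", "day7")],
]

# Grouped by hue and col: each group is a
# (col value, x category, hue value) tuple.
box_pairs_col = [
    [("exp1", "control", "day1"), ("exp1", "drug_a", "day1")],
    [("exp2", "control", "day1"), ("exp2", "drug_a", "day1")],
]

# Every form is a list of pairs, so each entry has exactly two groups.
for pairs in (box_pairs_plain, box_pairs_hue, box_pairs_col):
    assert all(len(pair) == 2 for pair in pairs)
```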

  • perform_stat_test – Whether to perform stat tests

  • pvalues – list of p-values for each box pair comparison, if no stat test should be performed
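The two modes (running a test versus supplying custom p-values) might be set up with keyword arguments like the ones below. This is only a sketch: the group names, p-values and short name are made up, it assumes the p-values are given in the same order as box_pairs, and the dictionaries are built but not passed to the function because the data and plotting arguments are not shown here.

```python
# Made-up groups to compare (non-grouped boxplot form).
box_pairs = [["control", "drug_a"], ["control", "drug_b"]]

# Mode a): let the function run the test named in `test`.
kwargs_with_test = dict(
    box_pairs=box_pairs,
    perform_stat_test=True,
    test="stats.mannwhitneyu",
)

# Mode b): skip the test and supply one p-value per box pair.
kwargs_custom_pvalues = dict(
    box_pairs=box_pairs,
    perform_stat_test=False,
    pvalues=[0.03, 0.4],        # assumed to follow the order of box_pairs
    test_short_name="custom",   # placeholder name for the annotation
    text_format="simple",
)

# One p-value is needed for each pair in box_pairs.
assert len(kwargs_custom_pvalues["pvalues"]) == len(box_pairs)
```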

  • test – Statistical test that should be performed. For comparing two groups, this can be a list with one element or just a string. For comparing more than two groups, it has to be a list of two strings, with the first string being the group test. The test can be from either scipy.stats or scikit-posthocs: names of tests in the scikit-posthocs package should start with “posthocs.”, while names of tests in scipy.stats should start with “stats.”. scipy.stats tests work for group comparisons and for comparing two groups without control.
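The prefix convention for test names can be sketched as follows. The helpers split_test_name and resolve_test are hypothetical and do not describe how figureflow resolves tests internally; they only show how a “stats.” or “posthocs.” prefix could map to the scipy.stats and scikit_posthocs modules.

```python
import importlib

# Assumed mapping from the documented prefixes to importable modules.
PREFIX_TO_MODULE = {"stats": "scipy.stats", "posthocs": "scikit_posthocs"}


def split_test_name(test_name):
    """Split e.g. "stats.mannwhitneyu" into (module name, function name)."""
    prefix, _, func_name = test_name.partition(".")
    if prefix not in PREFIX_TO_MODULE or not func_name:
        raise ValueError(f"Unsupported test name: {test_name!r}")
    return PREFIX_TO_MODULE[prefix], func_name


def resolve_test(test_name):
    """Import the module lazily and return the test function."""
    module_name, func_name = split_test_name(test_name)
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Two groups: a single string (or a one-element list).
two_group_test = "stats.mannwhitneyu"
# More than two groups: the group test first, then the pairwise test.
multi_group_test = ["stats.kruskal", "posthocs.posthoc_dunn"]

print(split_test_name(two_group_test))
print([split_test_name(name) for name in multi_group_test])
```

resolve_test is only needed when the function object itself is required; it imports scipy or scikit-posthocs on first use, so both packages must be installed for it to succeed.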

  • stats_params – Dictionary with parameters for the used stat test

  • text_format – Type of stat annotation, “star” for “*” and “simple” for direct p values.

  • show_test_name – Whether to show the test name if text_format is “simple”

  • stat_star_annot_font_size_factor – Factor by which the fontsize for the annotation of significance with stars will be multiplied. Increase to increase the size of significance stars in plots

  • test_short_name – Short name of the stat test. If show_test_name is True, it is displayed in the annotation when text_format is “simple”

  • pvalue_format_string – Format string for p-values; defaults to “{:.3e}”
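Assuming pvalue_format_string is a standard Python str.format template, the colon form “{:.3e}” renders a p-value in scientific notation with three decimals. The p-value below is made up for illustration.

```python
# Assumed form of the default: a str.format template with a format spec.
pvalue_format_string = "{:.3e}"

pvalue = 0.000123  # made-up p-value
formatted = pvalue_format_string.format(pvalue)
print(formatted)  # 1.230e-04
```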

  • pvalue_thresholds – list of lists, or tuples. Default is: For “star” text_format: [[1e-4, “****”], [1e-3, “***”], [1e-2, “**”], [0.05, “*”], [1, “ns”]]. For “simple” text_format : [[1e-5, “1e-5”], [1e-4, “1e-4”], [1e-3, “0.001”], [1e-2, “0.01”]]
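The way a threshold list maps a p-value to an annotation can be sketched as below, using the default “star” thresholds. The helper pvalue_to_annotation is hypothetical, and the inclusive comparison (p-value less than or equal to the threshold) is an assumption rather than confirmed figureflow behavior.

```python
STAR_THRESHOLDS = [[1e-4, "****"], [1e-3, "***"], [1e-2, "**"], [0.05, "*"], [1, "ns"]]


def pvalue_to_annotation(pvalue, thresholds):
    """Return the label of the smallest threshold that the p-value does not exceed.

    Hypothetical helper; assumes an inclusive comparison (pvalue <= threshold).
    Returns None if the p-value is above every threshold.
    """
    # Check thresholds from smallest to largest so the strictest match wins.
    for threshold, label in sorted(thresholds, key=lambda pair: pair[0]):
        if pvalue <= threshold:
            return label
    return None


print(pvalue_to_annotation(0.03, STAR_THRESHOLDS))   # *
print(pvalue_to_annotation(0.005, STAR_THRESHOLDS))  # **
print(pvalue_to_annotation(0.2, STAR_THRESHOLDS))    # ns
```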

  • annotate_nonsignificant – Annotate nonsignificant values as n.s. if text_format is “star”

  • show_stats_to_control_without_lines – When annotating statistics, do not draw lines when comparing to control, instead just add annotation on top of respective bar/box/points. This is helpful if too many lines would be drawn otherwise.

  • loc – location of statistics annotation, “inside” or “outside” of plot

  • print_stat_test_results – Whether all the statistic test information should be printed

  • plot_title – String; Title of the plot that will be added above the plot

  • plot_title_across_whole_width – Whether the plot_title should be extended across the whole axis or just across the plot itself (excluding the width of the axis label, tick labels, legend width, etc.)

  • plot_boxed_title – String, Title that will be in a colored box above all other title of the plot and across the entire width of the plot

  • title_box_colors – List of colors to use for boxed title. The box will be divided into as many even parts as colors are defined. The order of the colors will also be used for the parts of the box.
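The even division of the title box by title_box_colors can be sketched as follows. The helper title_box_segments is hypothetical, and the left-to-right horizontal split is an assumption; the description only states that the box is divided into as many even parts as there are colors, in the order given.

```python
def title_box_segments(colors):
    """Return (start, end, color) spans as fractions of the box width.

    Hypothetical helper: splits the interval [0, 1] into len(colors)
    equal parts and assigns the colors in the order given.
    """
    n = len(colors)
    return [(i / n, (i + 1) / n, color) for i, color in enumerate(colors)]


# Two made-up colors split the box into two halves.
print(title_box_segments(["red", "blue"]))
# [(0.0, 0.5, 'red'), (0.5, 1.0, 'blue')]
```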

  • show_col_labels_above – Whether to show col labels above the plot, if True, show_col_labels_below will be set to False

  • show_col_labels_below – Whether to show col labels below the plot

  • always_show_col_label – Whether to always show col labels (also if there was only one col value)

  • col_label_padding – Label padding for cols

  • col_label_weight – Font weight of col labels: “normal” or “bold”

  • show_row_label – Bool; whether a label should be displayed to the right of the rightmost plot. When True, the legend will not be displayed automatically (both together is not implemented at the moment)

  • row_label_text – Text that should be used as row label instead of the value in the row column

  • row_label_orientation – Text orientation of row label, “hor” or “vert”

  • x_range – List of minimum and maximum x value

  • show_x_axis – Whether to show x axis

  • x_axis_label – Label of x axis.

  • show_x_label_in_all_columns – Whether to show x axis in all columns even if they have the same x axis.

  • center_x_axis_label_under_all_plots – Whether the x axis label should be plotted once, centered under all plots, instead of being plotted below separate plots.

  • x_tick_label_rotation – Boolean; whether the x tick labels should be rotated by 45 degrees

  • x_tick_interval – Interval between major ticks on x axis

  • leave_space_for_x_tick_overhang – For plotting multiple rows where only the last row might have an x axis with potential x tick overhangs: whether to leave the space needed for these overhangs in the other rows as well.

  • y_range – List of minimum and maximum y value

  • y_scale – Scale on y_axis (e.g. “linear” or “log”)

  • x_scale – Scale on x_axis (e.g. “linear” or “log”)

  • show_y_axis – Whether to show the y axis

  • neg_y_vals – Whether there are data points below zero. If not, the lowest y axis value is 0 (margin settings would make it lower than zero otherwise)

  • y_axis_label – Label of y axis

  • y_tick_interval – Interval between major ticks on y axis

  • show_y_minor_ticks – Whether to show minor ticks on y axis

  • y_ticks – List of specific tick values that should be used for y axis

  • x_ticks – List of specific tick values that should be used for x axis

  • zero_y_tick_as_int – Whether to show 0 tick for y axis as integer (0) and not as a float (0.0).

  • zero_x_tick_as_int – Whether to show 0 tick for x axis as integer (0) and not as a float (0.0).

  • axis_padding – padding between plot and y_axis ticks in points

  • hor_alignment – “right”, “left” or “center”; horizontal alignment of plots in the panel. Plots always fill the entire y space but not necessarily the entire x space, so alignment matters in x but not in y

  • use_fixed_offset – Whether to use a fixed distance of stat annotations (if True) or whether to plot annotations above highest y datapoint.

  • line_offset_to_box – Offset of stat annotation lines to highest point in data

  • line_offset – Offset of annotation lines to highest annotation lines.

  • text_offset – Offset of annotation text from annotation line in points

  • line_width – Width of annotation lines and grid lines in plot

  • line_width_thin – Width of minor tick grid lines in plot

  • line_height – in arbitrary units

  • legend_title – Title of legend, displayed above legend.

  • legend_handle_length – Length of colored bars used in legend

  • legend_handle_vert_alignment – Alignment of legend handle relative to text. “center” will align it in the vertical center of the whole legend label, while “top” will align it vertically centered to the first line of the legend label. For a multiline legend label with 4 lines (3 line breaks), “center” will align the legend handle between the second and third line and “top” will align it in the middle of the first line.

  • n_legend_columns – The number of columns for the legend.

  • legend_spacing – space between legend handles (color boxes) and legend text

  • leave_space_for_legend – Internal parameter to indicate whether space for legend should be kept free even if legend was removed

  • borderaxespad – in points, padding of legend from plot and also of row label from plot

  • box_width – maximum box width in inches - scaled by width of one column in inches / 2

  • group_padding – space between two cols in inches - scaled by width of one column in inches / 2

  • auto_scale_group_padding – Bool; NOT WORKING AT THE MOMENT! Whether the defined group_padding should be scaled automatically. Autoscaling scales the space between plots as much as the size of the plots, which automatically prevents boxes from becoming too small due to group padding and reduces manual work. However, it cannot be used for facet plots, since the exact size of the group padding is necessary for a proper grid-like arrangement

  • inner_padding – Padding of axis labels to plots

  • vertical_lines – Boolean, Whether thin vertical lines should be drawn for each box

  • add_background_lines – Whether to add lines between plots in different columns (different col values). Adding lines makes plots seem connected, not adding them makes them look like separate plots.

  • color – Color of lines and text

  • background_color – Background color of plot

  • outer_border – Outer border of plot (cannot be manually set, will be set by figure_panel)

  • fig – matplotlib figure object (cannot be manually set, will be set by figure_panel)

  • figure_panel – figureflow figure_panel object (cannot be manually set, will be set by figure_panel)

  • letter – Letter of figure_panel (cannot be manually set, will be set by figure_panel)

  • padding – padding between panels in inches (cannot be manually set, will be set by figure_panel)