小能豆

Customizing the p-value thresholds for the "star" text format in statannotations

py

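The statannotations package adds visual annotations of the statistical significance level between pairs of data in plots (for example, seaborn box plots or bar plots). These annotations can use a "star" text format, in which one or more asterisks appear on top of the bar drawn between a pair of data groups: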
fTi1N.png

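Is there a way to customize the thresholds for the stars? I would like the first significance threshold (one star) to be 0.0001 instead of 0.05, with two stars at 0.00001 and three stars at 0.000001.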

The example plot was generated with the example code from the statannotations GitHub page:

from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

With verbose set to 2, this also prints the thresholds used to decide how many stars appear above each bar:

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 1.00e-02 < p <= 5.00e-02
      **: 1.00e-03 < p <= 1.00e-02
     ***: 1.00e-04 < p <= 1.00e-03
    ****: p <= 1.00e-04

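I would like to give the Annotator something like a dictionary mapping p-value thresholds to numbers of stars, but I don't know which parameter it should be passed to.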


2024-11-24

1 answer

小能豆

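In their repository, specifically in the file Annotator.py, there is the line self._pvalue_format = PValueFormat(). This means the format object can be changed. The PValueFormat class has the following configurable parameters: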

CONFIGURABLE_PARAMETERS = [
    'correction_format',
    'fontsize',
    'pvalue_format_string',
    'simple_format_string',
    'text_format',
    'pvalue_thresholds',
    'show_test_name'
]

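For completeness, below is a modified version of the code together with the new results; two print calls show the p-value format configuration before and after the change. The image changes accordingly.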

# ! pip install statannotations
from smartprint import smartprint as sprint
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)

print ("Before hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])


annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot._pvalue_format.pvalue_thresholds =  [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

print ("After hardcoding pvalue thresholds ")
sprint (annot.get_configuration()["pvalue_format"])

Output:

Before hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.0001, '****'],
                       [0.001, '***'],
                       [0.01, '**'],
                       [0.05, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 2.00e-01 < p <= 6.00e-01
      **: 3.00e-02 < p <= 2.00e-01
     ***: 1.00e-02 < p <= 3.00e-02
    ****: p <= 1.00e-02

Thur vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:6.477e-01 U_stat=6.305e+02
Thur vs. Sat: Mann-Whitney-Wilcoxon test two-sided, P_val:4.690e-02 U_stat=2.180e+03
Sun vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:2.680e-02 U_stat=9.605e+02
After hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.01, '****'],
                       [0.03, '***'],
                       [0.2, '**'],
                       [0.6, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

Image:

VnQbRm.png
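To make the effect of the threshold list concrete, here is a small stand-alone sketch of how the printed legend can be read: each entry is an upper bound, and a p-value gets the label of the smallest bound it does not exceed. The function name annotation_for is made up for this illustration, and the lookup is an assumption based on the legend output above, not a copy of the library's internal code.

```python
# Threshold lists copied from the "before" and "after" output above.
DEFAULT_THRESHOLDS = [[0.0001, '****'], [0.001, '***'], [0.01, '**'], [0.05, '*'], [1, 'ns']]
CUSTOM_THRESHOLDS = [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]


def annotation_for(p_value, thresholds):
    """Return the label of the smallest upper bound that covers p_value.

    Hypothetical helper: it mirrors how the printed legend reads
    (e.g. "***: 1.00e-02 < p <= 3.00e-02"), not the library's own code.
    """
    for upper_bound, label in sorted(thresholds, key=lambda pair: pair[0]):
        if p_value <= upper_bound:
            return label
    raise ValueError(f"p-value {p_value} is above every threshold")


# p-values taken from the test results printed above.
P_VALUES = {
    "Thur vs. Fri": 6.477e-01,
    "Thur vs. Sat": 4.690e-02,
    "Sun vs. Fri": 2.680e-02,
}

for pair, p in P_VALUES.items():
    print(pair, annotation_for(p, DEFAULT_THRESHOLDS), annotation_for(p, CUSTOM_THRESHOLDS))
```

Read this way, the modified list gives Thur vs. Sat two stars and Sun vs. Fri three stars, while Thur vs. Fri stays ns; with the default list both of the smaller p-values would get a single star.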

The thresholds can also be changed simply by passing them as an extra keyword argument when calling .configure, as shown below:

annot.configure(
    test='Mann-Whitney', text_format='star', loc='outside', verbose=2,
    pvalue_thresholds=[[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']],
)
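For the thresholds asked about in the question (one star at 0.0001, two at 0.00001, three at 0.000001), the same keyword could be used roughly as follows. This is a sketch under a few assumptions: the names QUESTION_THRESHOLDS and annotate_with_question_thresholds are invented here, everything above 0.0001 is assumed to keep the 'ns' label, and the list follows the [value, label] layout shown in the output above.

```python
# Thresholds from the question, smallest bound first, in the same
# [value, label] layout as the library's printed configuration.
QUESTION_THRESHOLDS = [
    [0.000001, '***'],
    [0.00001, '**'],
    [0.0001, '*'],
    [1, 'ns'],  # assumption: anything above 0.0001 is "not significant"
]


def annotate_with_question_thresholds(ax, pairs, data, x, y, order):
    """Hypothetical wrapper around the calls used earlier in this answer."""
    # Imported here so that defining the helper does not require the package.
    from statannotations.Annotator import Annotator

    annot = Annotator(ax, pairs, data=data, x=x, y=y, order=order)
    annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
                    verbose=2, pvalue_thresholds=QUESTION_THRESHOLDS)
    annot.apply_test()
    return annot.annotate()
```

It would be called with the same ax, df, x, y and order as in the example above, for instance annotate_with_question_thresholds(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], df, x, y, order). With verbose=2 the printed legend should then show the new bounds, which is an easy way to check that the thresholds were picked up.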
2024-11-24