Customizing the p-value thresholds for the "star" text format in statannotations

小能豆

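The statannotations package adds visual annotations of statistical significance levels for pairs of data in plots (for example, seaborn box plots or bar plots). These annotations can use the "star" text format, in which one or more stars appear above the bracket drawn between a pair of groups.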

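Is there a way to customize the thresholds for the stars? I would like the first significance threshold to be 0.0001 instead of 0.05, with two stars at 0.00001 and three stars at 0.000001.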

The example plot was generated from the example code on the statannotations GitHub page:

from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

Setting verbose to 2 also prints the thresholds used to decide how many stars appear above each bracket:

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 1.00e-02 < p <= 5.00e-02
      **: 1.00e-03 < p <= 1.00e-02
     ***: 1.00e-04 < p <= 1.00e-03
    ****: p <= 1.00e-04

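I would like to give the Annotator something like a dictionary mapping p-value thresholds to numbers of stars, but I don't know which parameter should receive it.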

2024-11-24

1 Answer

小能豆

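In their repository, specifically in the file Annotator.py, we have self._pvalue_format = PValueFormat(). This means the format object can be modified after the Annotator is created. The PValueFormat class has the following configurable parameters: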

CONFIGURABLE_PARAMETERS = [
    'correction_format',
    'fontsize',
    'pvalue_format_string',
    'simple_format_string',
    'text_format',
    'pvalue_thresholds',
    'show_test_name'
]

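For completeness, below is a modified version of the code together with the new results. Two print statements show the p-value format configuration before and after the change, and the image changes accordingly.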

# ! pip install statannotations
from smartprint import smartprint as sprint
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)

print("Before hardcoding pvalue thresholds ")
sprint(annot.get_configuration()["pvalue_format"])

annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
# Override the default thresholds on the Annotator's PValueFormat instance
annot._pvalue_format.pvalue_thresholds = [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

print("After hardcoding pvalue thresholds ")
sprint(annot.get_configuration()["pvalue_format"])

Output:

Before hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.0001, '****'],
                       [0.001, '***'],
                       [0.01, '**'],
                       [0.05, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 2.00e-01 < p <= 6.00e-01
      **: 3.00e-02 < p <= 2.00e-01
     ***: 1.00e-02 < p <= 3.00e-02
    ****: p <= 1.00e-02

Thur vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:6.477e-01 U_stat=6.305e+02
Thur vs. Sat: Mann-Whitney-Wilcoxon test two-sided, P_val:4.690e-02 U_stat=2.180e+03
Sun vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:2.680e-02 U_stat=9.605e+02
After hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.01, '****'],
                       [0.03, '***'],
                       [0.2, '**'],
                       [0.6, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}
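
To make the effect of the new thresholds concrete, here is a minimal sketch of how a p-value can be mapped to a label, based only on the legend printed above (each label covers lower < p <= upper). The function star_label is a hypothetical helper written for illustration; it is not the library's own implementation.

```python
def star_label(p_value, thresholds):
    """Return the label of the smallest threshold that p_value does not exceed.

    `thresholds` is a list of [upper_bound, label] pairs, in the same shape
    as the pvalue_thresholds lists shown in the output above.
    """
    for upper_bound, label in sorted(thresholds, key=lambda pair: pair[0]):
        if p_value <= upper_bound:
            return label
    return "ns"  # assumption: anything above the largest bound is not significant


default_thresholds = [[0.0001, '****'], [0.001, '***'], [0.01, '**'], [0.05, '*'], [1, 'ns']]
custom_thresholds = [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]

# p-values copied from the test output above
p_values = {"Thur vs. Fri": 0.6477, "Thur vs. Sat": 0.0469, "Sun vs. Fri": 0.0268}

for pair, p in p_values.items():
    print(pair, "| default:", star_label(p, default_thresholds),
          "| custom:", star_label(p, custom_thresholds))
```

Under this reading of the legend, Thur vs. Fri stays at ns, Thur vs. Sat moves from * to **, and Sun vs. Fri moves from * to ***.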

Image: (the updated plot is not reproduced here)

The thresholds can also be changed by simply passing pvalue_thresholds as a keyword argument in the call to .configure, as follows:

annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
                verbose=2,
                pvalue_thresholds=[[0.01, '****'], [0.03, '***'], [0.2, '**'],
                                   [0.6, '*'], [1, 'ns']])
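
For the thresholds actually requested in the question (one star at 0.0001, two at 0.00001, three at 0.000001), the list might look like the sketch below. The helper legend_lines is hypothetical and only prints the ranges each label would cover, so the list can be checked without the library. The assumption that configure accepts a list with three star levels instead of four has not been verified here.

```python
# Thresholds from the question: * at 1e-4, ** at 1e-5, *** at 1e-6.
requested_thresholds = [[1e-6, '***'], [1e-5, '**'], [1e-4, '*'], [1, 'ns']]


def legend_lines(thresholds):
    """Describe the p-value range covered by each label (illustrative only)."""
    lines = []
    lower = None
    for upper, label in sorted(thresholds, key=lambda pair: pair[0]):
        if lower is None:
            lines.append(f"{label}: p <= {upper:.2e}")
        else:
            lines.append(f"{label}: {lower:.2e} < p <= {upper:.2e}")
        lower = upper
    return lines


for line in legend_lines(requested_thresholds):
    print(line)

# With statannotations installed, the list would be passed as shown above:
# annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
#                 pvalue_thresholds=requested_thresholds)
```

With these values, the three p-values from the tips example (all above 0.01) would all be labeled ns.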

2024-11-24