PyTorch Day by Day 9 - Loss Functions

A loss function measures the difference between the model's output and the ground-truth label.


Loss Function: the error measured on a single sample, $Loss = f(\hat{y}, y)$.

Cost Function: the average loss over the whole training set, $Cost = \frac{1}{N}\sum_{i=1}^{N} f(\hat{y}_i, y_i)$.

Objective Function: what training actually minimizes, typically $Obj = Cost + Regularization$, where the regularization term constrains model complexity.

# PyTorch source: every loss inherits from _Loss, which is itself a Module
class _Loss(Module):
    def __init__(self, size_average=None, reduce=None, reduction='mean'):
        super(_Loss, self).__init__()
        if size_average is not None or reduce is not None:
            # size_average and reduce are deprecated; map them onto reduction
            self.reduction = _Reduction.legacy_get_string(size_average, reduce)
        else:
            self.reduction = reduction

1. nn.CrossEntropyLoss

Function: combines nn.LogSoftmax() and nn.NLLLoss() to compute the cross-entropy loss

Main parameters:

  • weight: per-class weights applied to the loss
  • ignore_index: a class index whose samples are ignored
  • reduction: reduction mode, one of none/sum/mean

    • none - compute the loss element-wise
    • sum - sum all elements, returning a scalar
    • mean - weighted average, returning a scalar

Entropy: $H(P) = -\sum_i P(x_i)\log P(x_i)$, the expectation of self-information under $P$.

Self-information: $I(x) = -\log P(x)$.

Relative entropy (KL divergence): $D_{KL}(P \| Q) = \sum_i P(x_i)\log\frac{P(x_i)}{Q(x_i)}$.

Cross entropy: $H(P, Q) = -\sum_i P(x_i)\log Q(x_i) = H(P) + D_{KL}(P \| Q)$. Since the data distribution $P$ is fixed, $H(P)$ is a constant, so minimizing the cross entropy is equivalent to minimizing the KL divergence.

nn.CrossEntropyLoss(weight=None,       # per-class loss weights
                    size_average=None,
                    ignore_index=-100, # class index to ignore
                    reduce=None,
                    reduction='mean')  # reduction mode:
                                       # none - element-wise
                                       # sum  - sum over all elements, returns a scalar
                                       # mean - weighted average, returns a scalar
import torch
import torch.nn as nn
import numpy as np

# example data: 3 samples, 2 classes
inputs = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

# define loss functions, one per reduction mode
loss_f_none = nn.CrossEntropyLoss(weight=None, reduction='none')
loss_f_sum = nn.CrossEntropyLoss(weight=None, reduction='sum')
loss_f_mean = nn.CrossEntropyLoss(weight=None, reduction='mean')

# forward
loss_none = loss_f_none(inputs, target)
loss_sum = loss_f_sum(inputs, target)
loss_mean = loss_f_mean(inputs, target)

# view
print("Cross Entropy Loss:\n ", loss_none, loss_sum, loss_mean)

# manual computation for the first sample:
idx = 0
input_1 = inputs.detach().numpy()[idx]  # [1, 2]
target_1 = target.numpy()[idx]          # 0
# first term: the logit of the target class
x_class = input_1[target_1]
# second term: log of the sum of exponentials over all classes
sigma_exp_x = np.sum(list(map(np.exp, input_1)))
log_sigma_exp_x = np.log(sigma_exp_x)
# loss = -x[class] + log(sum(exp(x)))
loss_1 = -x_class + log_sigma_exp_x
print("loss of the first sample: ", loss_1)

# with per-class weights:
weights = torch.tensor([1, 2], dtype=torch.float)
loss_f_none_w = nn.CrossEntropyLoss(weight=weights, reduction='none')

2. nn.NLLLoss

Function: implements only the negative-sign step of the negative log-likelihood; the input is expected to already be log-probabilities

nn.NLLLoss(weight=None,
           size_average=None,
           ignore_index=-100,
           reduce=None,
           reduction='mean')
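
A minimal sketch (the input values are illustrative) showing that nn.LogSoftmax followed by nn.NLLLoss reproduces nn.CrossEntropyLoss:

import torch
import torch.nn as nn

inputs = torch.tensor([[1., 2.], [1., 3.], [1., 3.]])  # raw logits
target = torch.tensor([0, 1, 1])

log_probs = nn.LogSoftmax(dim=1)(inputs)
loss_nll = nn.NLLLoss(reduction='none')(log_probs, target)
loss_ce = nn.CrossEntropyLoss(reduction='none')(inputs, target)
print(loss_nll)  # identical to loss_ce
print(loss_ce)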

3. nn.BCELoss

Function: binary cross-entropy loss
Note: the input values must lie in [0, 1] (e.g. pass the raw scores through a sigmoid first)
Main parameters: weight and reduction, as above

nn.BCELoss(weight=None,
           size_average=None,
           reduce=None,
           reduction='mean')
inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
target_bce = target
# BCELoss requires inputs in [0, 1], so squash the raw scores with a sigmoid
inputs = torch.sigmoid(inputs)
weights = torch.tensor([1, 1], dtype=torch.float)
loss_f_none_w = nn.BCELoss(weight=weights, reduction='none')
# forward: one loss value per element
loss_none_w = loss_f_none_w(inputs, target_bce)

4. nn.BCEWithLogitsLoss

Function: combines a sigmoid with the binary cross-entropy loss

Note: do not add a sigmoid at the end of the network; the loss applies it internally

nn.BCEWithLogitsLoss(weight=None,
                     size_average=None,
                     reduce=None,
                     reduction='mean',
                     pos_weight=None)  # weight applied to positive samples
inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
target_bce = target
weights = torch.tensor([1], dtype=torch.float)
pos_w = torch.tensor([3], dtype=torch.float)  # each positive-sample term is weighted by 3
loss_f_none_w = nn.BCEWithLogitsLoss(weight=weights, reduction='none',
                                     pos_weight=pos_w)
loss_none_w = loss_f_none_w(inputs, target_bce)

5. nn.L1Loss

Function: computes the element-wise absolute difference between inputs and target, $\ell_n = |x_n - y_n|$ (see the combined example after nn.MSELoss below)

nn.L1Loss(size_average=None,
          reduce=None,
          reduction='mean')

6. nn.MSELoss

Function: computes the element-wise squared difference between inputs and target, $\ell_n = (x_n - y_n)^2$

nn.MSELoss(size_average=None,
           reduce=None,
           reduction='mean')
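
A minimal sketch (the tensor values are illustrative) comparing the two element-wise regression losses:

import torch
import torch.nn as nn

inputs = torch.ones((2, 2))
target = torch.ones((2, 2)) * 3

# element-wise: |1 - 3| = 2 and (1 - 3)^2 = 4
loss_l1 = nn.L1Loss(reduction='none')(inputs, target)
loss_mse = nn.MSELoss(reduction='none')(inputs, target)
print(loss_l1, loss_mse)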

7. nn.SmoothL1Loss

Function: a smoothed L1Loss, quadratic near zero and linear beyond:

$loss(x, y) = \frac{1}{n}\sum_i z_i$, where $z_i = 0.5(x_i - y_i)^2$ if $|x_i - y_i| < 1$, and $z_i = |x_i - y_i| - 0.5$ otherwise.

nn.SmoothL1Loss(size_average=None,
                reduce=None,
                reduction='mean')
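
A short sketch (illustrative values) showing the quadratic region near zero and the linear region beyond a difference of 1:

import torch
import torch.nn as nn

inputs = torch.linspace(-2, 2, steps=5)  # [-2, -1, 0, 1, 2]
target = torch.zeros(5)

loss = nn.SmoothL1Loss(reduction='none')(inputs, target)
print(loss)  # tensor([1.5000, 0.5000, 0.0000, 0.5000, 1.5000])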

8. nn.PoissonNLLLoss

Function: negative log-likelihood loss for a Poisson-distributed target

  • log_input = True: loss(input, target) = exp(input) - target * input

  • log_input = False: loss(input, target) = input - target * log(input + eps)

nn.PoissonNLLLoss(log_input=True,  # whether the input is already in log form; selects the formula above
                  full=False,      # whether to compute the full loss (adds the Stirling approximation term); default False
                  size_average=None,
                  eps=1e-08,       # small constant to avoid log(input) producing nan when log_input=False
                  reduce=None,
                  reduction='mean')
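
A minimal sketch (random illustrative values) checking the element-wise formula for log_input=True:

import torch
import torch.nn as nn

log_rate = torch.randn(2, 2)                  # network output, interpreted as log(lambda)
target = torch.randint(0, 5, (2, 2)).float()  # observed counts

loss = nn.PoissonNLLLoss(log_input=True, reduction='none')(log_rate, target)
print(loss)
print(torch.exp(log_rate) - target * log_rate)  # matches element-wise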

9. nn.KLDivLoss

Function: computes the KL divergence (relative entropy) between two distributions

Note: the input must already be log-probabilities, e.g. computed via nn.LogSoftmax()

nn.KLDivLoss(size_average=None,
             reduce=None,
             reduction='mean')
# reduction: none/sum/mean/batchmean
# batchmean - sum of the loss divided by the batch size
inputs = torch.tensor([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
inputs_log = torch.log(inputs)  # KLDivLoss expects log-probabilities as input
target = torch.tensor([[0.9, 0.05, 0.05], [0.1, 0.7, 0.2]],
                      dtype=torch.float)
loss_f_bs_mean = nn.KLDivLoss(reduction='batchmean')
loss_bs_mean = loss_f_bs_mean(inputs_log, target)  # pass the log-probabilities, not the raw probabilities

10. nn.MarginRankingLoss

Function: computes a ranking loss between two inputs; used for ranking tasks

Note: this loss compares the two groups of inputs against each other and, with reduction='none' and the broadcast shapes used below, returns an n×n loss matrix

nn.MarginRankingLoss(margin=0.0,  # margin: the required gap between x1 and x2
                     size_average=None,
                     reduce=None,
                     reduction='mean')  # reduction: none/sum/mean
# y =  1: we want x1 to rank above x2; no loss is incurred when x1 > x2
# y = -1: we want x2 to rank above x1; no loss is incurred when x2 > x1
x1 = torch.tensor([[1], [2], [3]], dtype=torch.float)
x2 = torch.tensor([[2], [2], [2]], dtype=torch.float)
target = torch.tensor([1, 1, -1], dtype=torch.float)
loss_f_none = nn.MarginRankingLoss(margin=0, reduction='none')
loss = loss_f_none(x1, x2, target)  # shapes (3,1) and (3,) broadcast to a 3x3 loss matrix
print(loss)

11. nn.MultiLabelMarginLoss

Function: multi-label margin loss

Example: in a four-class task, if sample x belongs to classes 0 and 3, the label is [0, 3, -1, -1], not [1, 0, 0, 1]

$loss(x, y) = \sum_{ij} \frac{\max(0,\ 1 - (x[y[j]] - x[i]))}{x.\mathrm{size}(0)}$, where $j$ runs over the target classes and $i$ over the non-target classes.

nn.MultiLabelMarginLoss(size_average=None,
                        reduce=None,
                        reduction='mean')
x = torch.tensor([[0.1, 0.2, 0.4, 0.8]])
y = torch.tensor([[0, 3, -1, -1]], dtype=torch.long)
loss_f = nn.MultiLabelMarginLoss(reduction='none')
loss = loss_f(x, y)
# manual computation:
x = x[0]
item_1 = (1 - (x[0] - x[1])) + (1 - (x[0] - x[2]))  # target class 0 vs non-target classes 1, 2
item_2 = (1 - (x[3] - x[1])) + (1 - (x[3] - x[2]))  # target class 3 vs non-target classes 1, 2
loss_h = (item_1 + item_2) / x.shape[0]
print(loss_h)

12. nn.SoftMarginLoss

Function: computes the two-class logistic loss, $loss(x, y) = \frac{1}{n}\sum_i \log(1 + \exp(-y_i x_i))$, with labels $y \in \{-1, 1\}$

nn.SoftMarginLoss(size_average=None,
                  reduce=None,
                  reduction='mean')
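
A minimal sketch (illustrative values; note that the labels are -1/+1, not 0/1):

import torch
import torch.nn as nn

inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])
target = torch.tensor([[-1., 1.], [1., -1.]])

loss = nn.SoftMarginLoss(reduction='none')(inputs, target)
print(loss)
# matches the element-wise formula:
print(torch.log(1 + torch.exp(-target * inputs)))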

13. nn.MultiLabelSoftMarginLoss

Function: the multi-label version of SoftMarginLoss; here the target is a 0/1 indicator per class

nn.MultiLabelSoftMarginLoss(weight=None,
                            size_average=None,
                            reduce=None,
                            reduction='mean')
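
A short sketch (illustrative values; each target entry is 0 or 1, one per class):

import torch
import torch.nn as nn

inputs = torch.tensor([[0.3, 0.7, 0.8]])
target = torch.tensor([[0., 1., 1.]])  # the sample belongs to classes 1 and 2

loss = nn.MultiLabelSoftMarginLoss(reduction='none')(inputs, target)
# averages -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))] over the classes
print(loss)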

14. nn.MultiMarginLoss

Function: computes the hinge (max-margin) loss for multi-class classification:

$loss(x, y) = \frac{\sum_{i \ne y} \max(0,\ margin - x[y] + x[i])^p}{x.\mathrm{size}(0)}$

nn.MultiMarginLoss(p=1,          # exponent, 1 or 2
                   margin=1.0,   # margin value
                   weight=None,  # per-class loss weights
                   size_average=None,
                   reduce=None,
                   reduction='mean')
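
A small sketch (illustrative values) with the manual computation alongside:

import torch
import torch.nn as nn

x = torch.tensor([[0.1, 0.2, 0.7], [0.2, 0.5, 0.3]])
y = torch.tensor([1, 2], dtype=torch.long)

loss = nn.MultiMarginLoss(reduction='none')(x, y)
print(loss)
# first sample, target class 1 (score 0.2):
# (max(0, 1 - 0.2 + 0.1) + max(0, 1 - 0.2 + 0.7)) / 3 = (0.9 + 1.5) / 3 = 0.8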

15. nn.TripletMarginLoss

Function: computes the triplet loss, commonly used in face verification:

$L(a, p, n) = \max\{d(a_i, p_i) - d(a_i, n_i) + margin,\ 0\}$, with $d(x_i, y_i) = \lVert \mathbf{x}_i - \mathbf{y}_i \rVert_p$

nn.TripletMarginLoss(margin=1.0,
                     p=2.0,
                     eps=1e-06,
                     swap=False,
                     size_average=None,
                     reduce=None,
                     reduction='mean')
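
A minimal sketch with one-dimensional (illustrative) anchor, positive, and negative embeddings:

import torch
import torch.nn as nn

anchor = torch.tensor([[1.]])
pos = torch.tensor([[2.]])
neg = torch.tensor([[0.5]])

loss = nn.TripletMarginLoss(margin=1.0, p=2)(anchor, pos, neg)
# d(a, p) = 1, d(a, n) = 0.5 -> max(1 - 0.5 + 1, 0) = 1.5
print(loss)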

16. nn.HingeEmbeddingLoss

Function: measures whether two inputs are similar or dissimilar; commonly used for nonlinear embeddings and semi-supervised learning

Note: the input x should be the absolute difference between the two inputs; the loss is $l_n = x_n$ if $y_n = 1$, and $l_n = \max(0,\ margin - x_n)$ if $y_n = -1$.

nn.HingeEmbeddingLoss(margin=1.0,
                      size_average=None,
                      reduce=None,
                      reduction='mean')
inputs = torch.tensor([[1., 0.8, 0.5]])
target = torch.tensor([[1, 1, -1]])
loss_f = nn.HingeEmbeddingLoss(margin=1, reduction='none')
loss = loss_f(inputs, target)
# y=1 entries return x itself; the y=-1 entry returns max(0, 1 - 0.5) = 0.5
print("Hinge Embedding Loss", loss)

17. nn.CosineEmbeddingLoss

Function: measures the similarity of two inputs via cosine similarity: $loss(x, y) = 1 - \cos(x_1, x_2)$ if $y = 1$, and $\max(0,\ \cos(x_1, x_2) - margin)$ if $y = -1$

nn.CosineEmbeddingLoss(margin=0.0,  # margin in [-1, 1]; [0, 0.5] is recommended
                       size_average=None,
                       reduce=None,
                       reduction='mean')
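
A minimal sketch (illustrative vectors):

import torch
import torch.nn as nn

x1 = torch.tensor([[0.3, 0.5, 0.7], [0.3, 0.5, 0.7]])
x2 = torch.tensor([[0.1, 0.3, 0.5], [0.1, 0.3, 0.5]])
target = torch.tensor([1, -1], dtype=torch.float)

loss = nn.CosineEmbeddingLoss(margin=0., reduction='none')(x1, x2, target)
print(loss)  # [1 - cos(x1, x2), max(0, cos(x1, x2) - margin)]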

18. nn.CTCLoss

Function: computes the CTC (Connectionist Temporal Classification) loss, for classifying sequential data where the input and target sequences are not aligned frame by frame

torch.nn.CTCLoss(blank=0,             # index of the blank label
                 reduction='mean',
                 zero_infinity=False) # whether to zero out infinite losses and their gradients
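
A minimal sketch following the pattern in the PyTorch documentation (all sizes illustrative):

import torch
import torch.nn as nn

T, C, N = 50, 20, 16  # input sequence length, number of classes (incl. blank), batch size
S, S_min = 30, 10     # max/min target lengths

# network output: (T, N, C) log-probabilities
log_probs = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()
targets = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long)  # 0 is reserved for blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(low=S_min, high=S, size=(N,), dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()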
