# Regression: Torch as a Tensor library
Regression is a statistical modeling technique for finding the relationship between independent and dependent variables.
Suppose an independent variable \(\boldsymbol{x}\) and a dependent variable \(y\) satisfy \(y = g(\boldsymbol{x})\). The task of regression is then to find an \(f(\cdot)\) such that \(\mathrm{dist}(f(\boldsymbol x), g(\boldsymbol x))\) is minimized.
The most commonly used, and also the simplest, model is linear regression (the analytical solution for the single-variable case is already taught in secondary school).
Linear regression assumes a linear relationship between the independent variable and the dependent variable, that is,
\[
H(x) = Wx+b
\]
From this relationship we construct the cost function:
\[
\begin{gathered}
\operatorname{cost}=\frac{1}{m} \sum_{i=1}^{m}\left(H\left(x^{(i)}\right)-y^{(i)}\right)^{2} \\
H(x)=W x+b
\end{gathered}
\]
where \(W\) and \(b\) are called the parameters; hence the cost can also be written as
\[
\operatorname{cost}(W, b)=\frac{1}{m} \sum_{i=1}^{m}\left(H\left(x^{(i)}\right)-y^{(i)}\right)^{2}
\]
The objective function is:
\[
\min_{W, b} \operatorname{cost}(W, b)
\]
This reads: minimize \(\operatorname{cost}(W, b)\) with respect to \(W\) and \(b\).
Now suppose, for example, that we are given the following ~~watermelon~~ data:

x | y |
---|---|
1 | 1 |
2 | 2 |
3 | 3 |
For each pair \((x, y)\) we can compute the \(\mathrm{cost}\).
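For instance, taking \(W = 2\) and \(b = 0\):
\[
\operatorname{cost}(2, 0)=\frac{(2 \cdot 1-1)^{2}+(2 \cdot 2-2)^{2}+(2 \cdot 3-3)^{2}}{3}=\frac{14}{3} \approx 4.6667
\]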
Here we use PyTorch to carry out the computation:
```python
import torch

def cost(W: torch.Tensor, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Turn x and y into column vectors of shape (m, 1), so that W can hold
    # one or more candidate values, broadcast along the last dimension.
    x = x.unsqueeze(-1)
    y = y.unsqueeze(-1)
    # Average the squared errors over the m data points (dim 0).
    return torch.mean(torch.square(W * x - y), dim=0)

x = torch.tensor([1., 2., 3.])
y = torch.tensor([1., 2., 3.])

W = torch.zeros(1)
print(cost(W, x, y))  # tensor([4.6667])

W = torch.ones(1)
print(cost(W, x, y))  # tensor([0.])

W = torch.full((1,), 2.)
print(cost(W, x, y))  # tensor([4.6667])
```
```python
# What does cost(W) look like? Sweep a range of candidate values and plot.
%matplotlib inline
import matplotlib.pyplot as plt

rng = torch.arange(-3., 5., 0.01)
# One row per data point, so each column holds one candidate value of W.
inp = torch.stack([rng] * 3)
output = cost(inp, x, y)
plt.plot(rng.numpy(), output.numpy())
```
(The resulting curve is a parabola with its minimum at \(W = 1\).)
The question is: how do we minimize this cost function?
Analytically, one can proceed as follows. For convenience we rescale the cost by a factor of \(\frac{1}{2}\) (this does not change the minimizer):
\[
\operatorname{cost}(W)=\frac{1}{2 m} \sum_{i=1}^{m}\left(W x^{(i)}-y^{(i)}\right)^{2}
\]
Taking the partial derivative with respect to \(W\):
\[
\frac{\partial }{\partial W}\mathrm{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left(W x^{(i)}-y^{(i)}\right)
\]
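As a sanity check, we can let PyTorch's autograd differentiate the rescaled cost and compare against the formula above. This is a minimal sketch on the toy data from earlier, with \(W = 2\) chosen arbitrarily:

```python
import torch

x = torch.tensor([1., 2., 3.])
y = torch.tensor([1., 2., 3.])
W = torch.tensor(2., requires_grad=True)

# Rescaled cost: 1/(2m) * sum((W*x - y)^2)
cost = torch.mean(torch.square(W * x - y)) / 2
cost.backward()

# Analytical gradient: 1/m * sum(x * (W*x - y))
analytical = torch.mean(x * (W.detach() * x - y))
print(W.grad, analytical)  # tensor(4.6667) tensor(4.6667)
```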
Although this simple problem has a closed-form solution (set the derivative to zero and solve for \(W\)), most optimization problems do not. In this section we will instead learn to solve the regression problem with gradient descent: repeatedly move \(W\) a small step \(\alpha\) against the gradient.
\[
\begin{gathered}
W:=W-\alpha \frac{\partial}{\partial W} \operatorname{cost}(W) \\
W:=W-\alpha \frac{1}{m} \sum_{i=1}^{m}\left(W x^{(i)}-y^{(i)}\right) x^{(i)}
\end{gathered}
\]
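As a minimal sketch, here is the update rule applied directly with tensors, using the toy data above with \(b\) fixed at 0 and an assumed learning rate \(\alpha = 0.1\):

```python
import torch

x = torch.tensor([1., 2., 3.])
y = torch.tensor([1., 2., 3.])

W = torch.tensor(0.)  # start from W = 0
alpha = 0.1           # learning rate

for step in range(100):
    # W := W - alpha * (1/m) * sum((W*x - y) * x)
    grad = torch.mean((W * x - y) * x)
    W = W - alpha * grad

print(W)  # converges to tensor(1.) for this data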
So far we have derived the gradient by hand. PyTorch's autograd can compute it for us: mark the parameters with `requires_grad=True`, call `backward()` on the loss, and read the result from `.grad`. The full training script below wraps the data in a `Dataset` and logs the loss to TensorBoard:

```python
import torch
from torch.utils.data import Dataset
from torch.utils.tensorboard import SummaryWriter
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)


class FakeDataset(Dataset):
    """Points on a random line y = k*x + b, for x = 1, 2, 3."""

    def __init__(self):
        super().__init__()
        k = np.random.random()
        b = np.random.random()
        self.x = x = np.arange(1, 4)
        self.val = k * x + b
        # print("Guess:", k, b)
        self.data = list(zip(x, self.val))

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Deliberately bad initial guess for k, random guess for b.
        self.k = torch.tensor(10000., requires_grad=True)
        self.b = torch.rand(1, requires_grad=True)

    def forward(self, input):
        return input * self.k + self.b


def loss_fn(out, target):
    return (out - target) ** 2


if __name__ == '__main__':
    dataset = FakeDataset()
    model = Net()
    tb = SummaryWriter(log_dir="runs")
    step = 1
    lr = 0.01
    EPOCHS = 1000
    for epoch in range(1, EPOCHS):
        for x, val in dataset:
            out = model(torch.as_tensor(float(x)))
            loss = loss_fn(out, torch.as_tensor(float(val)))
            # Clear stale gradients, then let autograd fill them in.
            model.k.grad = None
            model.b.grad = None
            loss.backward()
            with torch.no_grad():
                tb.add_scalar("Loss", float(loss), step)
                step += 1
                # Manual SGD step: p := p - lr * dLoss/dp
                model.k -= lr * model.k.grad
                model.b -= lr * model.b.grad
    tb.close()

    plt.scatter(dataset.x, dataset.val)
    eval_out = model(torch.arange(1., 11.))
    plt.plot(np.arange(1, 11), eval_out.detach().numpy())
    plt.show()
```
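Once training has run, the logged loss curve can be inspected with `tensorboard --logdir runs`.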