Linear Regression

2020-10-08

This is the note I used as an example of applications when I lectured Linear Algebra at Purdue University. It is slightly modified so that it is more or less self-contained.

Starting from the least-squares solution, this note gives an introductory exploration of (linear) regression.

import numpy as np
import sklearn.linear_model
import matplotlib.pyplot as plt
from IPython.display import set_matplotlib_formats

plt.rcParams["figure.figsize"] = (8, 6)
set_matplotlib_formats('png', 'pdf')

Least-squares solution

Let \(A\) be an \(m \times n\) matrix, and \(B\) be a vector in \(\mathbb{R}^m\). A least-squares solution to a linear system \(Ax = B\) is an \(\hat{x}\) such that \(|A \hat{x} - B| \le |A x - B|\) for all \(x\). Here, \(|x|\) is the length of the vector \(x\). If the system \(Ax = B\) is consistent, then a least-squares solution is just a solution.
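
As a quick illustration of the definition above, here is a minimal sketch that computes a least-squares solution for a small inconsistent system with NumPy. The matrix `A` and vector `B` are made-up values chosen for illustration. The cross-check through the normal equations \(A^T A \hat{x} = A^T B\) assumes that \(A\) has full column rank.

```python
import numpy as np

# Made-up inconsistent system: three equations, two unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
B = np.array([1.0, 1.0, 0.0])

# np.linalg.lstsq returns (solution, residuals, rank, singular values).
x_hat, residuals, rank, _ = np.linalg.lstsq(A, B, rcond=None)

# Cross-check with the normal equations A^T A x = A^T B,
# which have a unique solution because A has full column rank here.
x_normal = np.linalg.solve(A.T @ A, A.T @ B)

# |A x_hat - B| should not exceed |A x - B| for any other x.
best = np.linalg.norm(A @ x_hat - B)
other = np.linalg.norm(A @ (x_hat + np.array([0.1, -0.2])) - B)
print(x_hat, best, other)
```

Both routes should agree here, and perturbing `x_hat` in any direction should only increase the residual length.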

LeetCode Contest 209

2020-10-04

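I only managed to solve the first three problems this time, and the third one took too long, so I ended up ranked only 505.

Problem 1: Special Array With X Elements Greater Than or Equal X

Given an array, find x such that exactly x of its numbers are not less than x.

Brute-force over all possible values of x.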

class Solution:
    def specialArray(self, nums: List[int]) -> int:
        for n in range(len(nums) + 1):
            if sum(1 for v in nums if v >= n) == n:
                return n
        return -1

LeetCode Contest 208

2020-09-26

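I got one WA on the second problem and solved all the others on the first try. What slowed me down this time was, surprisingly, not typing speed but reading speed... Ranked around 150.

Problem 1: Crawler Log Folder

Given a series of cd-like operations, find the depth of the final folder path.

A simulation problem. Since only the depth matters, we don't care about folder names and only track the effect on the depth: ../ -> -1 (never going below 0), ./ -> 0, anything else -> +1.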

class Solution:
    def minOperations(self, logs: List[str]) -> int:
        depth = 0
        for path in logs:
            if path == '../':
                depth = max(0, depth - 1)
            elif path == './':
                pass
            else:
                depth += 1
        return depth

LeetCode Contest 204

2020-08-29

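I occasionally surface to take part in a LeetCode contest. This time I solved all four problems, but had one wrong submission each on the first, third, and fourth. Ranked around 250.

Problem 1: Detect Pattern of Length M Repeated K or More Times

Determine whether an array contains a subarray of length m that is repeated at least k times consecutively.

Since the input is small, a direct brute-force search is enough. My first WA was because I wrote n - m * k + 1 as n - m * k.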

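class Solution:
    def containsPattern(self, arr: List[int], m: int, k: int) -> bool:
        n = len(arr)
        # Try every start index that leaves room for k blocks of length m.
        for i in range(n - m * k + 1):
            # Each element of the first block must match the element at the
            # same offset in every one of the k consecutive blocks.
            if all(arr[i + s] == arr[i + s + r * m]
                   for s in range(m) for r in range(k)):
                return True
        return False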

SIR to fit COVID-19 data in the U.S.

2020-08-16

Life has been badly disrupted during this pandemic. It is especially stressful to see new cases rising again in the U.S. When will the pandemic reach its peak? I believe many have asked this question and given it serious thought. In this blog post, I want to answer it using the SIR model.
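
For readers who have not seen the SIR model, here is a minimal sketch of it, assuming the standard equations \(dS/dt = -\beta S I / N\), \(dI/dt = \beta S I / N - \gamma I\), \(dR/dt = \gamma I\), integrated with forward Euler. The function name `simulate_sir` and all parameter values are made up for illustration. Nothing here is fitted to real data, and this is not necessarily how the post does the fit.

```python
import numpy as np

def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the standard SIR equations."""
    n = s0 + i0 + r0
    steps = int(round(days / dt))
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # S -> I during this step
        new_recoveries = gamma * i * dt         # I -> R during this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return np.array(history)  # columns: S, I, R

# Illustrative parameters only (hypothetical, not estimated from any data).
dt = 0.1
traj = simulate_sir(beta=0.3, gamma=0.1, s0=999.0, i0=1.0, r0=0.0, days=160, dt=dt)
peak_index = int(traj[:, 1].argmax())
print("infections peak around day", peak_index * dt)
```

The peak of the I column is the "peak of the pandemic" in this toy setting. Answering the question for real data would additionally require estimating `beta` and `gamma` from reported case counts.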

LeetCode Contest 188

2020-05-10

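I hadn't taken part in LeetCode's weekly contest for almost half a year, and it feels like I can no longer keep up with the pace. The four problems took me an hour, with one wrong submission costing a 5-minute penalty, which put me at around rank 350.

Problem 1: Build an Array With Stack Operations

A simulation problem: build a given array using "Push" and "Pop" operations.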

class Solution:
    def buildArray(self, target: List[int], n: int) -> List[str]:
        ret = []
        cur = 0
        for t in target:
            ret.extend(["Push", "Pop"] * (t - cur - 1))
            ret.append("Push")
            cur = t
        return ret

[npnet] Recurrent Neural Network

2020-04-30

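Compared with the earlier linear model or the convolutional neural network (CNN), a recurrent neural network (RNN) is better at handling sequential data. Take temperature: the temperature at each point in time is related to the temperatures at several earlier points. More importantly, the temperature at a given place is somewhat periodic over a day or a year. Or take language: individual characters or words must be arranged in order to express a clear meaning. An RNN can better capture the patterns in such data. In this note we discuss the theory and implementation of a simple RNN. The main reference for this post: The Unreasonable Effectiveness of Recurrent Neural Networks.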
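
To make the idea of "carrying information along the sequence" concrete, here is a minimal sketch of the forward pass of a vanilla RNN, assuming the common update \(h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)\). The function name `rnn_forward`, the weight names, and the sizes are illustrative assumptions, not the npnet API.

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return all hidden states."""
    h = h0
    hs = []
    for x in xs:
        # The new hidden state mixes the current input with the previous state.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return np.stack(hs)  # shape: (sequence length, hidden size)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
xs = rng.normal(size=(seq_len, input_size))
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

hs = rnn_forward(xs, np.zeros(hidden_size), W_xh, W_hh, b_h)
print(hs.shape)
```

The same weights are reused at every time step, which is what lets the network pick up patterns that depend on order.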

[npnet] Convolutional Neural Network

2020-03-07

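We will use the model implemented in the previous note to build a convolutional neural network that classifies the Fashion MNIST dataset. In an earlier note, a sequential network of two linear models reached about 85% accuracy. We expect a convolutional neural network to do better.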

[npnet] Convolution Model

2020-02-13

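Convolutional neural networks are currently the most popular and effective tool for image recognition. Such a network generally consists of several convolution models followed by several linear models. The convolution part extracts features from the image, and the linear part makes predictions from those features. The idea of the convolution model is to extract local information from the image through the convolution operation. In this note we first introduce the convolution operation, then implement the convolution model following the reference CS231n Convolutional Neural Networks for Visual Recognition. In the next note we will use the convolution model to build a simple convolutional neural network for Fashion MNIST recognition.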
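
As a rough illustration of how a convolution picks up local information, here is a naive single-channel sketch with stride 1 and no padding. Strictly speaking it computes a cross-correlation (the kernel is not flipped), which is what deep learning texts usually call convolution. The function name `conv2d` and the sample data are assumptions for illustration, not the npnet implementation.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive valid cross-correlation of a 2D image with a 2D kernel."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Each output value depends only on a small local patch.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Made-up 4x4 "image" and a 2x2 kernel that takes a diagonal difference.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
result = conv2d(image, kernel)
print(result)
```

The double loop is slow but mirrors the definition. A practical implementation would vectorize it and handle channels, stride, and padding.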

[npnet] Sequential Network

2020-01-04

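In an earlier note we used a simple linear model to classify the Fashion-MNIST dataset and reached 75% to 80% accuracy. By visualizing the loss we found that this simple model was not overfitting, which suggests we had essentially reached the limit of the model. How can we improve the accuracy? A direct approach is to increase the model's complexity, for example by building a more complex neural network out of simple models. This is generally also the most effective approach. We could also squeeze more out of the original model through data preprocessing, but I am a layman there: beyond giving the data zero mean and unit variance I know nothing, so I won't go into it.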
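
The zero-mean, unit-variance preprocessing mentioned above can be sketched as follows. The function name `standardize`, the small `eps` guard against division by zero, and the random sample data are assumptions for illustration.

```python
import numpy as np

def standardize(x, eps=1e-8):
    """Shift and scale each feature (column) to zero mean and unit variance."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps), mean, std

# Made-up data: 100 samples with 4 features.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
scaled, mean, std = standardize(data)
print(scaled.mean(axis=0), scaled.std(axis=0))
```

The statistics should be computed on the training set only and then reused to transform the test set, so that no information leaks from the test data.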
