Neural Network Study Notes

Posted on 2017-10-10

Study notes on the Neural Network chapter of Andrew Ng's Machine Learning course on Coursera.

Neural Network Representation

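Consider a problem with $n=100$ features. Feeding polynomial features directly into the sigmoid function gives a hypothesis of the form $g(\theta_0+\theta_1x_1+\theta_2x_2+\theta_3x_1x_2+\theta_4x_1^2x_2+\dots)$. The second-order products alone number about 5000, and the number of third-order products grows as $O(n^3)$. In many problems $n$ is large, so computing product features explicitly and passing them through the sigmoid function is not a good approach.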

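In computer vision, suppose an image is $50\times50$ pixels. For a grayscale image, $n=50\times50=2500$, that is,

$$
x=
\begin{pmatrix}
\text{pixel 1 (0-255)}\\
\text{pixel 2 (0-255)}\\
\vdots\\
\text{pixel 2500 (0-255)}
\end{pmatrix}
$$

The pairwise products $x_i\times x_j$ then give roughly 3 million features, which is very expensive to compute.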

Neuron Model: Logistic Unit

For logistic regression with 3 features $x_1, x_2, x_3$, $h_\theta(x)=\frac{1}{1+\exp{(-\theta^Tx)}}$, where $x=\begin{pmatrix}x_0 & x_1 & x_2 & x_3\end{pmatrix}^T$ and $\theta=\begin{pmatrix}\theta_0 & \theta_1 & \theta_2 & \theta_3\end{pmatrix}^T$.

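A 3-Layer Neural Network

Figure: Neural Network

In the figure above, Layer 1 is the Input Layer, Layer 2 is the Hidden Layer, and Layer 3 is the Output Layer. $a^{(j)}_i$ is the activation of unit $i$ in layer $j$, and $\Theta^{(j)}$ is the matrix of weights controlling the mapping from layer $j$ to layer $j+1$.

In the figure above,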

$$
a^{(2)}_1=g(\Theta^{(1)}_{10}x_0 + \Theta^{(1)}_{11}x_1 + \Theta^{(1)}_{12}x_2 + \Theta^{(1)}_{13}x_3)
$$

$$
a^{(2)}_2=g(\Theta^{(1)}_{20}x_0 + \Theta^{(1)}_{21}x_1 + \Theta^{(1)}_{22}x_2 + \Theta^{(1)}_{23}x_3)
$$

$$
a^{(2)}_3=g(\Theta^{(1)}_{30}x_0 + \Theta^{(1)}_{31}x_1 + \Theta^{(1)}_{32}x_2 + \Theta^{(1)}_{33}x_3)
$$

$$
h_\Theta(x)=a^{(3)}_1=g(\Theta^{(2)}_{10}a^{(2)}_0 + \Theta^{(2)}_{11}a^{(2)}_1 + \Theta^{(2)}_{12}a^{(2)}_2 + \Theta^{(2)}_{13}a^{(2)}_3)
$$

where $x_0=a^{(2)}_0=1$.

If layer $j$ has $s_j$ units and layer $j+1$ has $s_{j+1}$ units, then $\Theta^{(j)}$ is an $s_{j+1} \times (s_j+1)$ matrix; the extra column multiplies the bias unit.

Let $z^{(2)}_1=\Theta^{(1)}_{10}x_0 + \Theta^{(1)}_{11}x_1 + \Theta^{(1)}_{12}x_2 + \Theta^{(1)}_{13}x_3$, $z^{(2)}_2=\Theta^{(1)}_{20}x_0 + \Theta^{(1)}_{21}x_1 + \Theta^{(1)}_{22}x_2 + \Theta^{(1)}_{23}x_3$, and $z^{(2)}_3=\Theta^{(1)}_{30}x_0 + \Theta^{(1)}_{31}x_1 + \Theta^{(1)}_{32}x_2 + \Theta^{(1)}_{33}x_3$.

Then $z^{(2)}=\Theta^{(1)}x=\begin{pmatrix}z^{(2)}_1 \\ z^{(2)}_2 \\ z^{(2)}_3 \end{pmatrix} \in \mathbb{R}^3$ and $a^{(2)}=g(z^{(2)})$.

Similarly, after adding the bias unit $a^{(2)}_0=1$, $z^{(3)}=\Theta^{(2)}a^{(2)}$ and $h_\Theta(x)=a^{(3)}=g(z^{(3)})$.

This computation is called forward propagation.
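The layer-by-layer computation can be sketched in C++ as follows. This is only an illustration: the names `Vec`, `Mat`, `sigmoid`, `forwardLayer`, and `forwardPropagate` are hypothetical, and each weight matrix is assumed to be stored row by row with column 0 holding the bias weights.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper types: a vector of activations and a row-major weight matrix.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Sigmoid activation g(z) = 1 / (1 + exp(-z)).
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// One forward step: given the activations a of layer j (without the bias unit)
// and a weight matrix theta with s_{j+1} rows and s_j + 1 columns,
// return the activations of layer j+1.
Vec forwardLayer(const Mat& theta, const Vec& a) {
    Vec next;
    next.reserve(theta.size());
    for (const Vec& row : theta) {
        assert(row.size() == a.size() + 1);  // column 0 multiplies the bias unit
        double z = row[0];                   // bias unit a_0 = 1
        for (std::size_t k = 0; k < a.size(); ++k) z += row[k + 1] * a[k];
        next.push_back(sigmoid(z));
    }
    return next;
}

// Full forward pass: apply forwardLayer once per weight matrix.
Vec forwardPropagate(const std::vector<Mat>& thetas, Vec a) {
    for (const Mat& theta : thetas) a = forwardLayer(theta, a);
    return a;
}
```

Calling `forwardPropagate` with the two matrices of the 3-layer network and an input $x \in \mathbb{R}^3$ would return the single output $h_\Theta(x)$.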

A Simple Example: the AND Gate

Let $x_1, x_2 \in \{0,1\}$, $x=\begin{pmatrix} 1 \\ x_1 \\ x_2\end{pmatrix}$, and $h_\Theta(x)=g(\Theta_0+\Theta_1x_1+\Theta_2x_2)$.

Take $\Theta=\begin{pmatrix} -30 \\ 20 \\ 20\end{pmatrix}$. The inputs and outputs are then related as follows:

$$
\begin{array}{c|c|c}
{x_1} & {x_2} & {h_\Theta(x)} \\
\hline
{0} & {0} & {g(-30)\approx 0} \\
{0} & {1} & {g(-10)\approx 0} \\
{1} & {0} & {g(-10)\approx 0} \\
{1} & {1} & {g(10)\approx 1}
\end{array}
$$

Hence $h_\Theta(x) \approx x_1 \land x_2$.
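The truth table can be checked numerically with a short sketch. The function names `andUnit` and `andGate` are hypothetical, and thresholding the output at 0.5 is an assumption made here to turn the activation into a Boolean.

```cpp
#include <cassert>
#include <cmath>

// Sigmoid activation g(z) = 1 / (1 + exp(-z)).
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Single logistic unit with Theta = (-30, 20, 20)^T.
double andUnit(int x1, int x2) {
    assert((x1 == 0 || x1 == 1) && (x2 == 0 || x2 == 1));
    return sigmoid(-30.0 + 20.0 * x1 + 20.0 * x2);
}

// Threshold the activation at 0.5 to obtain a Boolean prediction (assumed convention).
bool andGate(int x1, int x2) { return andUnit(x1, x2) >= 0.5; }
```

For the input $(1,1)$ the argument of $g$ is $-30+20+20=10$, so the unit outputs a value very close to 1; every other input gives an argument of $-10$ or $-30$.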

Multi-class Classification

For example, with $h_\Theta(x) \in \mathbb{R}^4$ we want $h_\Theta(x) \approx \begin{pmatrix} 1\\0\\0\\0 \end{pmatrix},\dots,\begin{pmatrix} 0\\0\\0\\1 \end{pmatrix}$. This is similar to logistic regression, except that there are now 4 classes.

Let the training set be $(x^{(1)},y^{(1)}),\dots,(x^{(m)},y^{(m)})$, where
$y^{(i)}\in
\left\{
\begin{pmatrix} 1\\0\\0\\0 \end{pmatrix},
\begin{pmatrix} 0\\1\\0\\0 \end{pmatrix},
\begin{pmatrix} 0\\0\\1\\0 \end{pmatrix},
\begin{pmatrix} 0\\0\\0\\1 \end{pmatrix}
\right\}
$

Cost Function

Let $L$ denote the number of layers and $s_l$ the number of units in layer $l$.

For logistic regression,

$$
J(\theta)=-\frac{1}{m}[\sum\limits_{i=1}\limits^{m} y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log (1-h_\theta(x^{(i)}))] + \frac{\lambda}{2m} \sum\limits_{j=1}\limits^{n} \theta_j^2
$$

For a neural network,

$$
h_\Theta(x) \in \mathbb{R}^K,\ (h_\Theta(x))_i=i^{th}\ \mathtt{output}
$$

$$
J(\Theta)=- \frac{1}{m}\left[\sum\limits_{i=1}\limits^{m} \sum\limits_{k=1}\limits^{K} y^{(i)}_k\log h_\Theta(x^{(i)})_k + (1-y^{(i)}_k)\log(1-h_\Theta(x^{(i)})_k)\right] + \frac{\lambda}{2m} \sum\limits_{l=1}\limits^{L-1} \sum\limits_{i=1}\limits^{s_l} \sum\limits_{j=1}\limits^{s_{l+1}} \left(\Theta^{(l)}_{j i}\right)^2
$$
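As a rough sketch of how this cost could be evaluated, the function below takes the network outputs as given and adds the regularization term over every weight except the bias column. The name `nnCost` and the storage layout (column 0 of each matrix holds the bias weights) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Regularized cross-entropy cost J(Theta).
// h[i][k]: k-th network output for example i, strictly between 0 and 1
// y[i][k]: k-th component of the one-hot label of example i
// thetas : weight matrices; column 0 of each holds the bias weights (assumed layout)
double nnCost(const Mat& h, const Mat& y, const std::vector<Mat>& thetas, double lambda) {
    assert(!h.empty() && h.size() == y.size());
    const double m = static_cast<double>(h.size());
    double data = 0.0;
    for (std::size_t i = 0; i < h.size(); ++i) {
        assert(h[i].size() == y[i].size());
        for (std::size_t k = 0; k < h[i].size(); ++k) {
            data += y[i][k] * std::log(h[i][k]) +
                    (1.0 - y[i][k]) * std::log(1.0 - h[i][k]);
        }
    }
    double reg = 0.0;
    for (const Mat& theta : thetas)
        for (const Vec& row : theta)
            for (std::size_t j = 1; j < row.size(); ++j)  // skip the bias column
                reg += row[j] * row[j];
    return -data / m + lambda / (2.0 * m) * reg;
}
```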

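Backpropagation

We want to find the $\Theta$ that minimizes $J(\Theta)$. To do so, we need to compute $J(\Theta)$ and $\frac{\partial}{\partial\Theta^{(l)}_{i j}}J(\Theta)$.

Computing the Partial Derivatives

Suppose there is a single training example $(x,y)$ and a 4-layer neural network in which each hidden layer has 3 units, with $x \in \mathbb{R}^3$ and $y \in \mathbb{R}^4$.

Forward propagation proceeds as follows.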

$$
\begin{aligned}
& a^{(1)}=x \\
& z^{(2)}=\Theta^{(1)}a^{(1)},\ a^{(2)}=g(z^{(2)}) \\
& z^{(3)}=\Theta^{(2)}a^{(2)},\ a^{(3)}=g(z^{(3)}) \\
& z^{(4)}=\Theta^{(3)}a^{(3)},\ a^{(4)}=h_\Theta(x)=g(z^{(4)})
\end{aligned}
$$

Define $\delta^{(l)}_j$ as the “error” of unit $j$ in layer $l$. The $\delta^{(l)}$ are computed in the opposite direction to $z^{(l)}$ and $a^{(l)}$, from the output layer back toward the input.

$$
\begin{aligned}
& \delta^{(4)}=a^{(4)}-y \in \mathbb{R}^4 \\
& \delta^{(3)}=(\Theta^{(3)})^T\delta^{(4)} .* g'(z^{(3)}) \\
& \delta^{(2)}=(\Theta^{(2)})^T\delta^{(3)} .* g'(z^{(2)})
\end{aligned}
$$

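Here $g'(z^{(l)})$ is the derivative of $g$ evaluated at $z^{(l)}$, $g'(z^{(l)})=a^{(l)}.*(1-a^{(l)})$, and $.*$ denotes element-wise multiplication.

It can be shown that, ignoring regularization ($\lambda=0$), $\frac{\partial}{\partial\Theta^{(l)}_{ij}}J(\Theta)=a^{(l)}_j\delta^{(l+1)}_i$ for a single training example.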
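One backward step for a hidden layer might look like the sketch below. The name `hiddenDelta` is hypothetical, and the code assumes that column 0 of the weight matrix belongs to the bias unit, which has no error term and is therefore skipped.

```cpp
#include <cassert>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// delta^{(l)} = (Theta^{(l)})^T delta^{(l+1)} .* a^{(l)} .* (1 - a^{(l)})
// theta    : Theta^{(l)}, with s_{l+1} rows and s_l + 1 columns (column 0 = bias, assumed)
// deltaNext: delta^{(l+1)}, with s_{l+1} entries
// a        : activations a^{(l)} of layer l without the bias unit, s_l entries
Vec hiddenDelta(const Mat& theta, const Vec& deltaNext, const Vec& a) {
    assert(theta.size() == deltaNext.size());
    Vec delta(a.size(), 0.0);
    for (std::size_t j = 0; j < a.size(); ++j) {
        double s = 0.0;
        for (std::size_t i = 0; i < theta.size(); ++i) {
            assert(theta[i].size() == a.size() + 1);
            s += theta[i][j + 1] * deltaNext[i];  // skip the bias column
        }
        delta[j] = s * a[j] * (1.0 - a[j]);       // multiply by g'(z) = a .* (1 - a)
    }
    return delta;
}
```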

Backpropagation Algorithm

Let the training set be $\{(x^{(1)},y^{(1)}),\dots,(x^{(m)},y^{(m)})\}$.

Set $\Delta^{(l)}_{i j} = 0$ for all $i$, $j$, $l$.

$$
\begin{aligned}
& \mathrm{for}\ i=1:m \\
& \quad a^{(1)}:=x^{(i)} \\
& \quad \text{use forward propagation to compute } a^{(l)},\ l=2,3,\dots,L \\
& \quad \text{use } y^{(i)} \text{ to compute } \delta^{(L)}=a^{(L)}-y^{(i)} \\
& \quad \text{compute } \delta^{(L-1)},\delta^{(L-2)},\dots,\delta^{(2)} \\
& \quad \Delta^{(l)}_{i j} := \Delta^{(l)}_{i j} + a^{(l)}_j\delta^{(l+1)}_i \\
& \mathrm{end} \\
& D^{(l)}_{i j} := \frac{1}{m}\Delta^{(l)}_{i j} + \frac{\lambda}{m}\Theta^{(l)}_{i j},\ \mathrm{if}\ j \neq 0 \\
& D^{(l)}_{i 0} := \frac{1}{m}\Delta^{(l)}_{i 0}
\end{aligned}
$$

Gradient Checking

When $\epsilon$ is small, $\frac{d}{d\theta}J(\theta) \approx \frac{J(\theta + \epsilon) - J(\theta - \epsilon)}{2\epsilon}$.

For a neural network, unroll all the weights into a single vector $\theta \in \mathbb{R}^n$.

$$
\begin{aligned}
& \mathrm{for}\ i = 1 : n \\
& \quad \theta^+ = \theta;\ \theta^+(i) = \theta(i) + \epsilon; \\
& \quad \theta^- = \theta;\ \theta^-(i) = \theta(i) - \epsilon; \\
& \quad \frac{\partial}{\partial\theta(i)}J(\theta) = \frac{J(\theta^+) - J(\theta^-)}{2\epsilon} \\
& \mathrm{check\ that\ } \frac{\partial}{\partial\theta}J(\theta) \approx \mathrm{DVec}
\end{aligned}
$$

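Once gradient checking has confirmed that the implementation is correct, turn it off, because it is computationally expensive, and use backpropagation alone for training.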
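A minimal sketch of the two-sided difference check is given below. The function name `numericalGradient` and the default step `eps = 1e-4` are assumptions chosen for illustration; the cost function is passed in as a callable.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Approximate each partial derivative of J at theta with a central difference.
Vec numericalGradient(const std::function<double(const Vec&)>& J, Vec theta,
                      double eps = 1e-4) {
    assert(eps > 0.0);
    Vec grad(theta.size());
    for (std::size_t i = 0; i < theta.size(); ++i) {
        const double saved = theta[i];
        theta[i] = saved + eps;           // theta^+
        const double plus = J(theta);
        theta[i] = saved - eps;           // theta^-
        const double minus = J(theta);
        theta[i] = saved;                 // restore before the next component
        grad[i] = (plus - minus) / (2.0 * eps);
    }
    return grad;
}
```

The result would then be compared, component by component, with the unrolled gradient produced by backpropagation. Each component costs two evaluations of $J$, which is why the check is slow for large networks.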

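Random Initialization

In a neural network, initializing $\Theta$ to all zeros makes every unit in a hidden layer compute the same function, and the units stay identical after each update. Random initialization breaks this symmetry: each $\Theta^{(l)}_{i j}$ is initialized to a random value in $[-\epsilon,\epsilon]$, where $\epsilon$ is a small positive number.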
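One possible way to do this with the C++ standard library is sketched below. The name `randomInit` is hypothetical, and note that `std::uniform_real_distribution` samples from the half-open interval $[-\epsilon,\epsilon)$, which is close enough for this purpose.

```cpp
#include <cassert>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Fill a rows x cols weight matrix with values drawn uniformly from [-eps, eps).
Mat randomInit(std::size_t rows, std::size_t cols, double eps, std::mt19937& rng) {
    assert(eps > 0.0);
    std::uniform_real_distribution<double> dist(-eps, eps);
    Mat theta(rows, Vec(cols));
    for (Vec& row : theta)
        for (double& w : row) w = dist(rng);
    return theta;
}
```

Passing the generator by reference lets the caller seed it once and reuse it for every layer.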

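Putting It Together

Choosing a Network Architecture

The number of layers and the number of units in each layer.
The dimension of the features $x^{(i)}$.
The number of classes (the dimension of $y^{(i)}$).

Training the Network

Randomly initialize $\Theta^{(l)}_{i j}$.
Implement forward propagation to compute $h_\Theta(x^{(i)})$ from $x^{(i)}$.
Compute the cost function $J(\Theta)$.
Use backpropagation to compute $\frac{\partial}{\partial\Theta^{(l)}_{i j}}J(\Theta)$.
$$
\begin{aligned}
& \mathrm{for}\ i = 1 : m \\
& \quad \text{run forward propagation and backpropagation on } (x^{(i)},y^{(i)}) \\
& \quad \text{compute } \Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}\left(a^{(l)}\right)^T \\
& \mathrm{end} \\
& \text{compute } \frac{\partial}{\partial\Theta^{(l)}_{i j}}J(\Theta)
\end{aligned}
$$
Use gradient checking to verify correctness.
Use gradient descent or another optimization method to minimize $J(\Theta)$.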

Anomaly Detection Study Notes

Posted on 2017-09-16

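Study notes on the Anomaly Detection chapter (week 9) of Andrew Ng's Machine Learning course on Coursera.

Given a training set $(x^{(1)},x^{(2)},\dots,x^{(m)})$ and a test example $x_{test}$, build a model $p(x)$. If $p(x_{test})\lt\epsilon$, flag $x_{test}$ as anomalous; otherwise treat it as normal.

Gaussian Distribution

Suppose $x\in\mathbb{R}$ follows a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Then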

$$
p(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi}\sigma}\exp(-\frac{(x-\mu)^2}{2\sigma^2})
$$

Parameter estimation:

Dataset: $(x^{(1)},\dots,x^{(m)}),x^{(i)}\in\mathbb{R}$
$\mu=\frac{1}{m}\sum\limits_{i=1}\limits^{m}x^{(i)}$
$\sigma^2=\frac{1}{m}\sum\limits_{i=1}\limits^{m}(x^{(i)}-\mu)^2$

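Anomaly Detection Algorithm

Given a training set $(x^{(1)},\dots,x^{(m)}),x^{(i)}\in\mathbb{R}^n$, if each feature $x_j$ follows a Gaussian distribution with mean $\mu_j$ and variance $\sigma_j^2$, we can build the model: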

$$
p(x)=p(x_1;\mu_1,\sigma_1^2)\times\dots\times p(x_n;\mu_n,\sigma_n^2)=\prod\limits_{j=1}\limits^np(x_j;\mu_j,\sigma_j^2)
$$

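The steps are:

Choose features $x_j$ that might indicate anomalous examples;
Estimate $\mu_1,\dots,\mu_n,\sigma_1^2,\dots,\sigma_n^2$ from the training set;
Compute $p(x)$ for a new example $x$. If $p(x)\lt\epsilon$, flag $x$ as anomalous.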
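These steps can be sketched as follows. The struct `GaussianModel` and the functions `fitGaussian`, `density`, and `isAnomaly` are hypothetical names, and the sketch assumes every feature has nonzero variance in the training set.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;

// Per-feature Gaussian parameters (hypothetical container).
struct GaussianModel {
    Vec mu;      // mean of each feature
    Vec sigma2;  // variance of each feature
};

// Estimate mu_j and sigma_j^2 for every feature from the training set X.
GaussianModel fitGaussian(const std::vector<Vec>& X) {
    assert(!X.empty());
    const double m = static_cast<double>(X.size());
    const std::size_t n = X[0].size();
    GaussianModel model{Vec(n, 0.0), Vec(n, 0.0)};
    for (const Vec& x : X) {
        assert(x.size() == n);
        for (std::size_t j = 0; j < n; ++j) model.mu[j] += x[j] / m;
    }
    for (const Vec& x : X) {
        for (std::size_t j = 0; j < n; ++j) {
            const double d = x[j] - model.mu[j];
            model.sigma2[j] += d * d / m;
        }
    }
    return model;
}

// p(x) = product over j of the univariate Gaussian densities.
double density(const GaussianModel& model, const Vec& x) {
    assert(x.size() == model.mu.size());
    const double pi = std::acos(-1.0);
    double p = 1.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
        const double d = x[j] - model.mu[j];
        p *= std::exp(-d * d / (2.0 * model.sigma2[j])) /
             std::sqrt(2.0 * pi * model.sigma2[j]);
    }
    return p;
}

// Flag x as anomalous when p(x) < epsilon.
bool isAnomaly(const GaussianModel& model, const Vec& x, double epsilon) {
    return density(model, x) < epsilon;
}
```

For many features the product can underflow, so a practical version might sum log-densities instead; that variation is not shown here.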

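Developing and Evaluating an Anomaly Detection Algorithm

Labeled data are used for the cross-validation and test sets. The evaluation proceeds as follows:

Fit the model $p(x)$ on the training set $(x^{(1)},\dots,x^{(m)})$.
On the cross-validation set, use $p(x)$ to predict $y$: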

$$
y=
\begin{cases}
1, & \text{if } p(x) \lt \epsilon \\
0, & \text{if } p(x) \ge \epsilon
\end{cases}
$$

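Compute the true positives, false positives, false negatives, and true negatives.
Compute recall and precision.
Compute the $F_1$ score.

The cross-validation set can also be used to choose the value of $\epsilon$ that maximizes the $F_1$ score.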
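The evaluation and the choice of $\epsilon$ could be sketched like this. `Metrics`, `evaluate`, and `selectThreshold` are hypothetical names, the candidate thresholds are supplied by the caller, and a metric whose denominator is zero is reported as 0 by assumption.

```cpp
#include <cassert>
#include <vector>

struct Metrics {
    double precision;
    double recall;
    double f1;
};

// Compare predictions with labels (1 = anomaly, 0 = normal).
Metrics evaluate(const std::vector<int>& predicted, const std::vector<int>& actual) {
    assert(predicted.size() == actual.size());
    double tp = 0.0, fp = 0.0, fn = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        if (predicted[i] == 1 && actual[i] == 1) tp += 1.0;
        if (predicted[i] == 1 && actual[i] == 0) fp += 1.0;
        if (predicted[i] == 0 && actual[i] == 1) fn += 1.0;
    }
    const double precision = (tp + fp > 0.0) ? tp / (tp + fp) : 0.0;
    const double recall = (tp + fn > 0.0) ? tp / (tp + fn) : 0.0;
    const double f1 = (precision + recall > 0.0)
                          ? 2.0 * precision * recall / (precision + recall)
                          : 0.0;
    return Metrics{precision, recall, f1};
}

// Return the candidate epsilon with the highest F1 on the cross-validation set.
// p[i] is the model density p(x) of cross-validation example i.
double selectThreshold(const std::vector<double>& p, const std::vector<int>& y,
                       const std::vector<double>& candidates) {
    assert(p.size() == y.size() && !candidates.empty());
    double bestEps = candidates[0];
    double bestF1 = -1.0;
    for (double eps : candidates) {
        std::vector<int> predicted;
        for (double pi : p) predicted.push_back(pi < eps ? 1 : 0);
        const double f1 = evaluate(predicted, y).f1;
        if (f1 > bestF1) {
            bestF1 = f1;
            bestEps = eps;
        }
    }
    return bestEps;
}
```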

Multivariate Gaussian Distribution

Data $x\in\mathbb{R}^n$, parameters $\mu\in\mathbb{R}^n$ and $\Sigma\in\mathbb{R}^{n\times n}$.

$$
p(x;\mu,\Sigma)=
\frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}
\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)
$$

Anomaly Detection with the Multivariate Gaussian Distribution

Steps:

Fit $\mu=\frac{1}{m}\sum\limits_{i=1}\limits^{m}x^{(i)}$ and $\Sigma=\frac{1}{m}\sum\limits_{i=1}\limits^{m}(x^{(i)}-\mu)(x^{(i)}-\mu)^T$.
For a test example $x$, compute:

$$
p(x;\mu,\Sigma)=
\frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}
\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)
$$

If $p(x)\lt\epsilon$, flag $x$ as anomalous.

Relationship Between the Multivariate Gaussian and the Earlier Model

The earlier model:

$$
p(x)=p(x_1;\mu_1,\sigma_1^2)\times\dots\times p(x_n;\mu_n,\sigma_n^2)
$$

is a special case of the multivariate Gaussian distribution in which the covariance matrix is diagonal:

$$
\Sigma=\left[
\begin{matrix}
\sigma_1^2 & & & O \\
& \sigma_2^2 & & \\
& & \ddots & \\
O & & & \sigma_n^2
\end{matrix}
\right]
$$
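As a quick check of this special case, using only the definitions above: for a diagonal $\Sigma$ with entries $\sigma_1^2,\dots,\sigma_n^2$, the determinant is $|\Sigma|=\prod_{j}\sigma_j^2$ and $\Sigma^{-1}$ is diagonal with entries $1/\sigma_j^2$, so the density factorizes:

$$
\begin{aligned}
p(x;\mu,\Sigma) &= \frac{1}{(2\pi)^{\frac{n}{2}}\prod\limits_{j=1}\limits^{n}\sigma_j}\exp\left(-\frac{1}{2}\sum\limits_{j=1}\limits^{n}\frac{(x_j-\mu_j)^2}{\sigma_j^2}\right) \\
&= \prod\limits_{j=1}\limits^{n}\frac{1}{\sqrt{2\pi}\sigma_j}\exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right) = \prod\limits_{j=1}\limits^{n}p(x_j;\mu_j,\sigma_j^2)
\end{aligned}
$$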

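The general multivariate Gaussian model is computationally more expensive, and it requires $m\gt n$ (otherwise $\Sigma$ is not invertible).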

SVM Study Notes

Posted on 2017-09-12

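Study notes on the SVM chapter (week 7) of Andrew Ng's Machine Learning course on Coursera.
A first understanding of the SVM: a binary classification model that maximizes the margin between the classes and the decision boundary in feature space.

From Logistic Regression to Support Vector Machine

For logistic regression, we have: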

$$
g(z) = h_\theta(x) = \frac{1}{1 + e^{-\theta^Tx}}
$$
$$
z = \theta^Tx
$$

Predict “1” when $h_\theta(x)\ge 0.5$, i.e. $z\ge 0$;
Predict “0” when $h_\theta(x)\lt 0.5$, i.e. $z\lt 0$.

Training:

$$
\min\limits_\theta\frac{1}{m}\left[\sum\limits_{i=1}\limits^{m}y^{(i)}\left(-\log h_\theta\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right)\left(-\log\left(1-h_\theta\left(x^{(i)}\right)\right)\right)\right]+\frac{\lambda}{2m}\sum\limits_{j=1}\limits^{n}\theta_j^2
$$

For the SVM, when $y=1$ we want the cost to be $cost_1(z)$, with $cost_1(z)=0$ when $z\ge 1$; when $y=0$ we want the cost to be $cost_0(z)$, with $cost_0(z)=0$ when $z\le -1$.

Using $cost_1$ and $cost_0$ gives a new cost function:

$$
\frac{1}{m}\left[\sum\limits_{i=1}\limits^{m}y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right]+\frac{\lambda}{2m}\sum\limits_{j=1}\limits^{n}\theta_j^2
$$

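Here $cost_1$ and $cost_0$ take the place of, and approximate, the corresponding logistic regression terms:

$$
cost_1(\theta^Tx^{(i)})\approx-\log h_\theta(x^{(i)})
$$
$$
cost_0(\theta^Tx^{(i)})\approx-\log (1-h_\theta(x^{(i)}))
$$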

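Dropping the constant factor $\frac{1}{m}$ and replacing $\lambda$ on the second term with a weight $C$ on the first term (so that $C$ plays the role of $\frac{1}{\lambda}$), training becomes: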

$$
\min\limits_\theta C\sum\limits_{i=1}\limits^{m}\left[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right] + \frac{1}{2}\sum\limits_{j=1}\limits^{n}\theta_j^2
$$

Training uses the thresholds $\pm 1$, while prediction uses 0: predict “1” when $\theta^Tx\ge 0$ and “0” when $\theta^Tx\lt 0$.
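The notes do not give explicit formulas for $cost_1$ and $cost_0$. The sketch below assumes the hinge forms $cost_1(z)=\max(0,1-z)$ and $cost_0(z)=\max(0,1+z)$, which satisfy the zero-cost conditions stated above, and evaluates the objective for a linear model. The function names and the storage layout (`theta[0]` is $\theta_0$) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Vec = std::vector<double>;

// Assumed hinge-shaped costs: zero for z >= 1 and z <= -1 respectively.
double cost1(double z) { return std::max(0.0, 1.0 - z); }
double cost0(double z) { return std::max(0.0, 1.0 + z); }

// C * sum_i [ y_i cost1(theta^T x_i) + (1 - y_i) cost0(theta^T x_i) ]
//   + 1/2 * sum_{j>=1} theta_j^2
// theta has n + 1 entries (theta[0] is the bias); each x has n entries.
double svmObjective(const Vec& theta, const std::vector<Vec>& X,
                    const std::vector<int>& y, double C) {
    assert(X.size() == y.size());
    double loss = 0.0;
    for (std::size_t i = 0; i < X.size(); ++i) {
        assert(X[i].size() + 1 == theta.size());
        double z = theta[0];
        for (std::size_t j = 0; j < X[i].size(); ++j) z += theta[j + 1] * X[i][j];
        loss += (y[i] == 1) ? cost1(z) : cost0(z);
    }
    double reg = 0.0;
    for (std::size_t j = 1; j < theta.size(); ++j) reg += theta[j] * theta[j];
    return C * loss + 0.5 * reg;
}
```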

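Large Margin Classifier

The SVM tries to find a decision boundary with the largest margin to the training examples. Consider the SVM training objective: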

$$
\min\limits_\theta C\sum\limits_{i=1}\limits^{m}\left[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right] + \frac{1}{2}\sum\limits_{j=1}\limits^{n}\theta_j^2
$$

If we find parameters $\theta$ for which the first term is 0 for both $y=0$ and $y=1$ (that is, $\theta^Tx\le -1$ when $y=0$ and $\theta^Tx\ge 1$ when $y=1$), the problem becomes:

$$
\min\limits_\theta \frac{1}{2} \sum\limits_{j=1}\limits^{n}\theta_j^2
$$
s.t. $\theta^Tx^{(i)}\ge 1$ if $y^{(i)}=1$, $\theta^Tx^{(i)}\le -1$ if $y^{(i)}=0$

Take the vectors $\vec\theta$ and $\vec{x^{(i)}}$:

$$
\vec\theta = \left(\begin{matrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{matrix}\right),\vec{x^{(i)}} = \left(\begin{matrix} 1 \\ x^{(i)}_1 \\ x^{(i)}_2 \\ \vdots \\ x^{(i)}_n \end{matrix}\right)
$$

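For the $y=1$ class, the examples on its “edge” (closest to the boundary) satisfy $\theta^Tx^{(i)}=1$ (the critical value), i.e. $\vec\theta \cdot \vec{x^{(i)}}=1$.
Also, taking $\theta_0=0$ for simplicity, $\min\limits_\theta \frac{1}{2}\sum\limits_{j=1}\limits^{n}\theta_j^2=\min\limits_\theta \frac{1}{2} \left|\left|\vec\theta\right|\right|^2$.

Since $\vec\theta \cdot \vec{x^{(i)}}=p^{(i)}\left|\left|\vec\theta\right|\right|$, where $p^{(i)}$ is the projection of $x^{(i)}$ onto $\vec\theta$, a smaller $\left|\left|\vec\theta\right|\right|$ requires a larger $p^{(i)}$. Hence the decision boundary (with normal vector $\vec\theta$) has the largest margin to $x^{(i)}$.

The same holds for the $y=0$ class.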

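Kernels

Kernels can be used to build nonlinear classifiers by computing new features.
Choose landmarks $l^{(1)},l^{(2)},l^{(3)}, \cdots$. Given an example $x$, compute the similarity between $x$ and $l^{(i)}$ as a new feature $f_i$, $i=1,2,3,\cdots$.

The SVM decision boundary is then $\theta^Tf=0$, where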

$$
f=\left( \begin{matrix} 1 \\ f_1 \\ f_2 \\ \vdots \\ f_n \end{matrix} \right)
$$

Gaussian Kernel

$$
f_i=similarity(x,l^{(i)})=\exp\left(-\frac{||x-l^{(i)}||^2}{2\sigma^2}\right)
$$

When $x$ is close to $l^{(i)}$, $f_i\approx 1$; when $x$ is far from $l^{(i)}$, $f_i\approx 0$.

Predict “1” when $\theta^Tf\ge 0$ and “0” when $\theta^Tf\lt 0$.

Choosing the Landmarks $l^{(i)}$

Let $m$ be the size of the training set.

Given $\{(x^{(i)},y^{(i)})\},i=1,2,\cdots,m$
Choose $l^{(i)}=x^{(i)},i=1,2,\cdots,m$
For an example $x$, compute $f_i=similarity(x,l^{(i)}),i=1,2,\cdots,m$
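A sketch of this feature mapping is shown below, using the Gaussian kernel in its standard form with $2\sigma^2$ in the denominator. The function names `gaussianKernel` and `kernelFeatures` are hypothetical; the landmarks would simply be the training examples.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;

// similarity(x, l) = exp(-||x - l||^2 / (2 sigma^2))
double gaussianKernel(const Vec& x, const Vec& l, double sigma) {
    assert(x.size() == l.size() && sigma > 0.0);
    double sq = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) {
        const double d = x[k] - l[k];
        sq += d * d;
    }
    return std::exp(-sq / (2.0 * sigma * sigma));
}

// Map x to f = (1, f_1, ..., f_m), one feature per landmark plus the bias entry.
Vec kernelFeatures(const Vec& x, const std::vector<Vec>& landmarks, double sigma) {
    Vec f;
    f.reserve(landmarks.size() + 1);
    f.push_back(1.0);  // bias feature f_0 = 1
    for (const Vec& l : landmarks) f.push_back(gaussianKernel(x, l, sigma));
    return f;
}
```

When $x$ is itself one of the landmarks, the matching feature equals exactly 1, because the squared distance is 0.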

SVM with Kernels

Training is:

$$
\min\limits_\theta C \sum\limits_{i=1}\limits^{m} \left[y^{(i)}cost_1(\theta^Tf^{(i)}) + (1-y^{(i)})cost_0(\theta^Tf^{(i)})\right] + \frac{1}{2}\sum\limits_{j=1}\limits^{m}\theta_j^2
$$

where

$$
f^{(i)}= \left( \begin{matrix} 1 \\ f^{(i)}_1 \\ \vdots \\ f^{(i)}_m \end{matrix} \right) \in \mathbb{R}^{m+1} , f^{(i)}_i=1
$$

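Choosing SVM Parameters and Applying the SVM

Choosing SVM parameters:

A large $C$ corresponds to a small $\lambda$ and may cause overfitting (high variance).
A small $C$ corresponds to a large $\lambda$ and may cause underfitting (high bias).
A large $\sigma$ leads to high bias.
A small $\sigma$ leads to high variance.

To use an SVM, you need to choose the parameter $C$ and a kernel. Common choices are the Gaussian kernel (which requires choosing $\sigma$) and no kernel (the “linear kernel”).

Logistic Regression vs. SVM

Let $m$ be the training set size and $n$ the number of features.
Some general guidelines:

If $n$ is large relative to $m$, use logistic regression or an SVM with a linear kernel.
If $n$ is small and $m$ is intermediate (e.g. $n$ = 1000, $m$ = 10000), use an SVM with a Gaussian kernel.
If $n$ is small and $m$ is large, an SVM with a Gaussian kernel is slow. Create or add more features, then use logistic regression or an SVM with a linear kernel.

A neural network performs well in all of these settings but is slower to train. A main reason to choose an SVM is that its cost function is convex, so there are no local minima that are not global.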

A Solution of A Written Question in An Examination (3)

Posted on 2017-09-07

Question

Given strings stored in a fixed-length representation, design an algorithm that finds a longest common substring of strings s and t.

Solution

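The fixed-length string representation can be described as follows:

#define MAXSTRLEN 255 // the user may define a maximum string length of up to 255
typedef unsigned char SString[MAXSTRLEN + 1]; // element 0 stores the string length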

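Algorithms for finding a common substring of s and t fall into two classes by space complexity:

(1) Space complexity O(length(s) + length(t))
(2) Space complexity O(length(s) * length(t))

For the first class, a pattern-matching algorithm can be used to test whether each substring of s matches somewhere in t. The rest of this post describes one implementation of the second class.

Because s and t use a fixed-length representation, it is convenient to build a two-dimensional array A that records how the characters of the two strings match: if s[i] equals t[j], set A[i][j] > 0; otherwise set A[i][j] = 0. For a common substring r of s and t, we then have:

A[i][j] > 0, for consecutive s[i] and t[j] in r

To make the longest r easy to compute, set in particular

A[i][j] = A[i - 1][j - 1] + 1, if s[i] == t[j]

so that A[i][j] is the length of the common substring ending at s[i] and t[j].

Therefore, traverse A and take the maximum A[i][j]; the corresponding substring is the longest common substring.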

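A C++ implementation of the algorithm:

#include <vector>

void lngstCmnSubStr(SString s, SString t, SString& r)
// Find the longest common substring of the fixed-length strings s and t; store it in r
{
    int maxLen = 0; // length of the longest common substring found so far
    int max_s = 0;  // index in s at which that substring ends
    // ch[i][j] is the length of the common substring ending at s[i] and t[j];
    // row 0 and column 0 stay 0 and serve as the boundary
    std::vector<std::vector<int>> ch(s[0] + 1, std::vector<int>(t[0] + 1, 0));

    for(int i = 1; i <= s[0]; i++)
    {
        for(int j = 1; j <= t[0]; j++)
        {
            if(s[i] == t[j])
            {
                ch[i][j] = ch[i - 1][j - 1] + 1;
                // Refresh maxLen
                if(ch[i][j] > maxLen)
                {
                    maxLen = ch[i][j];
                    max_s = i;
                }
            } // if
        } // for
    } // for

    // Refresh r
    r[0] = maxLen;
    for(int i = 1; i <= maxLen; i++)
    {
        r[i] = s[max_s - maxLen + i];
    }
} // lngstCmnSubStr()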

A Solution of A Written Question in An Examination (2)

Posted on 2017-08-29

Question

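Design an algorithm that cyclically shifts the elements A[0] through A[n-1] of the array A[n] to the right by k positions, using extra storage for only one element and O(n) element moves or swaps.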

Solution

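A cyclic right shift that uses extra storage for only one element can be done in 3 steps:

1 Reverse all n elements;
2 Reverse the first k elements;
3 Reverse the last n-k elements.

Each reversal needs only one element of extra storage and O(n) swaps, so the algorithm meets the requirements.

A C implementation of the algorithm: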

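void rotateRight(int* A, int n, int k)
// A is the given array, n is its size, k is the number of positions to shift
{
    int temp; // the single element of extra storage
    if(n <= 0) return; // nothing to rotate; also avoids k % 0
    k = k % n;
    if(k < 0) k += n; // treat a negative k as a left rotation
    // Reverse all n elements
    for(int i = 0; i < n / 2; i++)
    {
        temp = A[i];
        A[i] = A[n - 1 - i];
        A[n - 1 - i] = temp;
    }
    // Reverse the first k elements
    for(int i = 0; i < k / 2; i++)
    {
        temp = A[i];
        A[i] = A[k - 1 - i];
        A[k - 1 - i] = temp;
    }
    // Reverse the last n-k elements
    for(int i = 0; i < (n - k) / 2; i++)
    {
        temp = A[i + k];
        A[i + k] = A[n - 1 - i];
        A[n - 1 - i] = temp;
    }
    return;
}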

A Solution of A Written Question in An Examination (1)

Posted on 2017-07-25

Question

Assume a singly linked list L, in which each node has the following structure:

struct node
{
    int num;
    node* next;
};

Complete the following function so that it reverses the order of each consecutive sublist of n nodes and returns the new head of the list. Do not reverse the last sublist if it has fewer than n nodes.

node* f(node* head, int n)
{
    /****************************/
    /* Add your code here. */
    /****************************/
}

The parameter head is the first node of the list and n is the given number. The function returns the first node of the list.

Solution

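The code below treats head as a header node that precedes the first data node. Assume first that the list has exactly 3 data nodes; the code can then be completed as follows.

node* f(node* head)
{
    node* former(head->next);
    node* temp(former->next);
    node* latter(temp->next);
    former->next = latter->next; // the old first node becomes the last one
    temp->next = former;
    latter->next = temp;
    head->next = latter;         // the old last node becomes the first one
    return head;
}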

For a list L of length n (n >= 3), the reversal can be separated into n steps, each of which redirects one node.

node* f(node* head, int n)
{
    node* first(head->next);  // becomes the last node after the reversal
    node* former(first);
    node* temp(former->next);
    for(int i = 1; i < n; i++)
    {
        node* latter(temp->next);
        temp->next = former;
        former = temp;
        temp = latter;
    }
    first->next = temp;   // the node after the reversed part (NULL for a list of length n)
    head->next = former;  // former is now the first node
    return head;
}

Combining this algorithm with the fact that it is applied to each sublist of length n, the complete solution is as follows.

node* f(node* head, int n)
{
    if(!head || n < 2) return head;
    node* header(head); // The node before the first one of the current group
    while(true)
    {
        // Find the node after the last one of the current group
        node* tail(header->next);
        int count = 0;
        while(tail && count < n)
        {
            tail = tail->next;
            count++;
        }
        if(count < n) break; // Fewer than n nodes left: keep their order

        node* first(header->next); // Becomes the last node of the reversed group
        node* former(tail);        // So that first->next ends up pointing at tail
        node* temp(first);
        for(int i = 0; i < n; i++)
        {
            node* latter(temp->next);
            temp->next = former;
            former = temp;
            temp = latter;
        }
        header->next = former; // former is now the first node of the reversed group
        header = first;        // Refresh header
    }   // while
    return head;
}   // f

Curriculum Vitae

Posted on 2017-07-24

Qi Kehan

EDUCATION

Undergraduate, Bachelor of Engineering 2013-2017

Zhejiang University

College of Biomedical Engineering and Instrument Science
major in Measurement Control Technology and Instruments
GPA: 3.08/4.0

Graduate, Master of Engineering 2018-2021

University of Chinese Academy of Sciences

Shenzhen Institute of Advanced Technology
major in Computer Technology
Research Interest: Image Segmentation, MR Image reconstruction, Machine Learning

Coursera

Machine Learning by Stanford University

Certificate earned on September 14, 2017

WORK EXPERIENCE

Software Development Engineer 2015

Zhejiang University, Advisor: Jiquan Liu, Associate Prof.

Completed a Pac-Man game based on OpenCV. Designed and rendered the map using the APIs provided by OpenCV. Designed and implemented the path algorithm that guides the ghost. Designed the skills and optimized the game.

Software Development Engineer 2015-2016

Zhejiang University, Advisor: Chen’ge Geng, Associate Prof.

Completed a program to log in, read files, upload, get characters, and print, based on the voice recognition API, SDK and demo provided by iFlyTech.

Thesis Project 2016-2017

Zhejiang University, Advisor: Youzhao Wang, Associate Prof.

Realize a system which is able to complete the signal acquisition, data processing, data storage and displaying. Select a sensor and design a circuit based on a common low-pass filter to achieve the signal processing. Select mini2440 development board based on ARM-9 as an embedded development platform and design the program to complete data processing and storage based on embedded Linux. Achieve man-machine interface design based on Qt design application.

Student Intern 2018

Shenzhen Institute of Advanced Technology, Advisor: Shanshan Wang

Help to do research on MR Image segmentation and Machine Learning.

SKILLS

Writing: Markdown, HTML, MS Office
Programming: Matlab, C/C++, Linux Shell, Verilog HDL
OS: Windows, Ubuntu, Raspbian, Zigbee Protocol Stack
Home https://andrewsher.github.io/