Improving the Crawler with Express

The project structure is as follows:

├── data
│   └── movie.json
├── index.html
├── node_modules
│   ├── @types
│   │   ├── body-parser -> ../.pnpm/@types+body-parser@1.19.2/node_modules/@types/body-parser
│   │   ├── express -> ../.pnpm/@types+express@4.17.17/node_modules/@types/express
│   │   └── node -> ../.pnpm/@types+node@20.4.2/node_modules/@types/node
│   ├── body-parser -> .pnpm/body-parser@1.20.2/node_modules/body-parser
│   └── express -> .pnpm/express@4.18.2/node_modules/express
├── package.json
├── pnpm-lock.yaml
├── router
│   └── index.ts
├── src
│   ├── crawler.ts
│   ├── index.ts
│   └── nowPlaying.ts
├── study-ts.code-workspace
└── tsconfig.json

Getting started

Install express and body-parser:

pnpm i express
pnpm i body-parser # Node.js middleware that parses the body of POST requests

Create a router folder in the project root and add an index.ts file to it:

/*
 * @Description:
 * @Author: xiuji
 * @Date: 2023-08-11 10:37:05
 * @LastEditTime: 2023-08-12 08:58:07
 * @LastEditors: Do not edit
 */
import { Router, Request, Response } from "express";
import NowPlaying from "../src/nowPlaying";
import Crawler from "../src/crawler";

const router = Router();

// Home page: a form that posts the password to /getData
router.get('/', (req: Request, res: Response) => {
  res.send(`
    <html>
      <body>
        <form method="post" action="/getData">
          <input type="password" name="password" />
          <button type="submit">Crawl data</button>
        </form>
      </body>
    </html>
  `);
});

// Start the crawler only when the submitted password matches
router.post('/getData', (req: Request, res: Response) => {
  if (req.body.password === '123') {
    const url = 'https://movie.douban.com/cinema/nowplaying/nanjing/';
    const nowPlaying = NowPlaying.getInstance();
    new Crawler(url, nowPlaying);
    res.send('getData success');
  } else {
    res.send('password error');
  }
});

export default router;

Create index.ts under src as the Express entry file:

/*
 * @Description:
 * @Author: xiuji
 * @Date: 2023-08-11 10:34:39
 * @LastEditTime: 2023-08-12 09:20:19
 * @LastEditors: Do not edit
 */
import express from 'express';
import router from '../router/index';
import bodyParser from 'body-parser'; // Node.js middleware that parses the body of POST requests

const app = express();
// Parse urlencoded form bodies before the routes run, so that req.body is populated
app.use(bodyParser.urlencoded({ extended: false }));
app.use(router);

app.listen(7001, () => {
  console.log('server is running');
});
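To make it clearer what `bodyParser.urlencoded` contributes, the sketch below mimics its effect on a simple form body. `parseUrlEncoded` is a hypothetical helper written only for illustration and is not part of body-parser; the real middleware also deals with request streams, size limits and character sets.

```typescript
// Hypothetical helper: a simplified stand-in for what a urlencoded body parser produces.
// It ignores nested keys and repeated keys (the last value wins).
function parseUrlEncoded(raw: string): Record<string, string> {
  // In form encoding, "+" stands for a space.
  const decode = (s: string): string => decodeURIComponent(s.replace(/\+/g, " "));
  const result: Record<string, string> = {};
  for (const pair of raw.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const rawKey = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    result[decode(rawKey)] = decode(rawValue);
  }
  return result;
}

// The login form posts a body such as "password=123".
const parsedBody = parseUrlEncoded("password=123");
// parsedBody.password now holds the string the route handler compares against.
```

Every value arrives as a string, which is why the route compares against `'123'` rather than the number 123.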

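Custom Express type extension file

express.js is a library written in JavaScript. To use it from TypeScript, install the matching type declaration package; without it the compiler does not know the types of the library's API.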

pnpm i @types/express -D

Even with the type declarations installed, some types remain imprecise, for example:

/**
 * router/index.ts
 */
router.post('/getData', (req: Request, res: Response) => {
  const { password } = req.body; // password is typed as any; string | undefined is what we want
});

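Use extends to derive an interface from the library's type and redeclare the members whose types need refining. A redeclared member must stay compatible with the original type; body is typed as any in Express's declarations, so it can be narrowed freely: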

interface RequestWithBody extends Request {
  body: {
    password: string | undefined;
    [propName: string]: string | undefined;
  };
}

/**
 * router/index.ts
 */
router.post('/getData', (req: RequestWithBody, res: Response) => {
  const { password } = req.body; // password is now string | undefined
  if (password === '123') {
    const url = 'https://movie.douban.com/cinema/nowplaying/nanjing/';
    const nowPlaying = NowPlaying.getInstance();
    new Crawler(url, nowPlaying);
    res.send('getData success');
  } else {
    res.send('password error');
  }
});
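The narrowing technique can be tried without Express at all. The following is a minimal sketch under stated assumptions: `BaseRequest` is a stand-in for the relevant part of Express's Request type, and `isPasswordValid` is a hypothetical function that exists only for this example.

```typescript
// Stand-in for the part of Express's Request that matters here (an assumption for illustration):
// by default, `body` is declared as `any`.
interface BaseRequest {
  body: any;
}

// Redeclaring `body` with a stricter type is accepted because the new type is compatible with `any`.
interface RequestWithBody extends BaseRequest {
  body: {
    password: string | undefined;
    [propName: string]: string | undefined;
  };
}

// Hypothetical check used only in this example.
function isPasswordValid(req: RequestWithBody): boolean {
  const { password } = req.body; // inferred as string | undefined, no longer any
  return password === "123";
}

const accepted = isPasswordValid({ body: { password: "123" } });
const rejected = isPasswordValid({ body: { password: undefined } });
```

With the narrowed type, a mistake such as calling a number method on `password` is reported at compile time instead of slipping through as `any`.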

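Declaration merging

Scenario: a middleware adds properties to req or res at runtime, but the declared types know nothing about them, so the compiler rejects the assignment.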

import express, { Request, Response, NextFunction } from 'express';

const app = express();
app.use((req: Request, res: Response, next: NextFunction) => {
  req.userName = 'Neo'; // Property 'userName' does not exist on type 'Request<ParamsDictionary, any, any, ParsedQs, Record<string, any>>'. ts(2339)
  next();
});

Solution: define a custom type extension modeled on Express's own type file, node_modules/.pnpm/@types+express-serve-static-core@4.17.35/node_modules/@types/express-serve-static-core/index.d.ts. The relevant part of that file is:

declare global {
  namespace Express {
    // These open interfaces may be extended in an application-specific manner via declaration merging.
    // See for example method-override.d.ts (https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/method-override/index.d.ts)
    interface Request {}
    interface Response {}
    interface Locals {}
    interface Application {}
  }
}

Following the namespace above, declare the custom property in your own declaration file:

/*
 * @Description: src/expressExtend.d.ts
 * @Author: xiuji
 * @Date: 2023-08-12 11:22:38
 * @LastEditTime: 2023-08-12 11:23:58
 * @LastEditors: Do not edit
 */
declare namespace Express {
  interface Request {
    userName?: string;
  }
}

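The TypeScript compiler automatically merges declarations that share the same namespace and interface name, so the error disappears as long as the declaration file is part of the compilation.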
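Declaration merging is a general TypeScript feature and is not specific to Express. A minimal sketch, using the hypothetical names `AppRequest` and `attachUser`, shows the same mechanism on a plain interface; interfaces inside the Express namespace merge in the same way.

```typescript
// First declaration: imagine this one ships with a library.
interface AppRequest {
  path: string;
}

// Second declaration with the same name: imagine this one lives in application code.
// TypeScript merges both into a single interface with `path` and `userName`.
interface AppRequest {
  userName?: string;
}

// A middleware-like function can now assign the extra property without a type error.
function attachUser(req: AppRequest): AppRequest {
  req.userName = "Neo";
  return req;
}

const merged = attachUser({ path: "/" });
```

Declaring the added property as optional keeps existing code valid, because objects created before the middleware runs do not have it yet.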

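Adding login to the Crawler

Login state is stored with the cookie-session middleware (install the cookie-session package and, for typings, @types/cookie-session):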

/*
 * @Description:
 * @Author: xiuji
 * @Date: 2023-08-11 10:34:39
 * @LastEditTime: 2023-08-14 10:53:14
 * @LastEditors: Do not edit
 */
import express from 'express';
import router from '../router/index';
import bodyParser from 'body-parser';
import cookieSession from 'cookie-session';

const app = express();
app.use(bodyParser.urlencoded({ extended: false }));
app.use(
  cookieSession({
    name: 'session',
    keys: ['system key'],
    maxAge: 24 * 60 * 60 * 1000 // 24 hours
  })
);
app.use(router);

app.listen(7001, () => {
  console.log('server is running');
});

Login business logic

/*
 * @Description:
 * @Author: xiuji
 * @Date: 2023-08-11 10:37:05
 * @LastEditTime: 2023-08-14 15:42:08
 * @LastEditors: Do not edit
 */
import path from "path";
import fs from 'fs'; // Node.js built-in file system module
import { Router, Request, Response } from "express";
import NowPlaying from "../src/nowPlaying";
import Crawler from "../src/crawler";

interface RequestWithBody extends Request {
  body: {
    password: string | undefined;
    [propName: string]: string | undefined;
  };
}

const router = Router();

router.post('/login', (req: RequestWithBody, res: Response) => {
  const isLogin = req.session ? req.session.login : false;
  const { password } = req.body;
  if (isLogin) {
    res.send('Already logged in');
    return;
  } else {
    // Reject an empty password before comparing it
    if (password === undefined || password === '') {
      res.send(`
        <html>
          <body>
            <div style="color:red;">Password must not be empty</div>
            <form method="post" action="/login">
              <input type="password" name="password" />
              <button type="submit">Log in</button>
            </form>
          </body>
        </html>
      `);
      return;
    }
    if (password === '123' && req.session) {
      // Mark the session as logged in and go back to the home page
      req.session = { login: true };
      res.redirect('/');
    } else {
      res.send(`
        <html>
          <body>
            <div style="color:red;">Wrong password</div>
            <form method="post" action="/login">
              <input type="password" name="password" />
              <button type="submit">Log in</button>
            </form>
          </body>
        </html>
      `);
    }
  }
});

router.get('/logout', (req: Request, res: Response) => {
  // cookie-session documents setting the session to null as the way to destroy it
  req.session = null;
  res.redirect('/');
});

router.get('/', (req: Request, res: Response) => {
  const isLogin = req.session ? req.session.login : false;
  if (isLogin) {
    // Logged in: show the available actions
    res.send(`
      <html>
        <body>
          <a href="/logout">Log out</a>
          <a href="/getData">Crawl data</a>
          <a href="/showData">Show data</a>
        </body>
      </html>
    `);
  } else {
    // Not logged in: show the login form
    res.send(`
      <html>
        <body>
          <form method="post" action="/login">
            <input type="password" name="password" />
            <button type="submit">Log in</button>
          </form>
        </body>
      </html>
    `);
  }
});

/**
 * router/index.ts
 */
router.get('/getData', (req: RequestWithBody, res: Response) => {
  const isLogin = req.session ? req.session.login : false;
  if (!isLogin) {
    res.send('Please log in first');
    return;
  }
  const url = 'https://movie.douban.com/cinema/nowplaying/nanjing/';
  const nowPlaying = NowPlaying.getInstance();
  new Crawler(url, nowPlaying);
  res.send('getData success');
});

router.get('/showData', (req: RequestWithBody, res: Response) => {
  const isLogin = req.session ? req.session.login : false;
  if (!isLogin) {
    res.send('Please log in first');
    return;
  }
  // readFileSync throws if the file does not exist yet, so crawl the data first
  const filePath = path.resolve(__dirname, '../data/movie.json');
  const fileContent = fs.readFileSync(filePath, 'utf-8');
  res.json(JSON.parse(fileContent));
});

export default router;
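The login form markup appears three times in the router above. One possible way to remove the repetition is sketched here; `renderLoginForm` is a hypothetical helper and not part of the original project.

```typescript
// Hypothetical helper: builds the login page, optionally with an error message above the form.
// The messages passed in here are fixed strings; user input would need HTML escaping first.
function renderLoginForm(errorMessage?: string): string {
  const error = errorMessage
    ? `<div style="color:red;">${errorMessage}</div>`
    : "";
  return `
    <html>
      <body>
        ${error}
        <form method="post" action="/login">
          <input type="password" name="password" />
          <button type="submit">Log in</button>
        </form>
      </body>
    </html>
  `;
}

const emptyPasswordPage = renderLoginForm("Password must not be empty");
const plainPage = renderLoginForm();
```

Each `res.send` call that renders the form could then pass the result of `renderLoginForm(...)` instead of repeating the template.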

Defining a middleware to check login state

// NextFunction is imported from "express"; responseData is the helper defined in the next section
const checkLogin = (req: Request, res: Response, next: NextFunction) => {
  const isLogin = req.session ? req.session.login : false;
  if (isLogin) {
    next();
  } else {
    res.json(responseData(null, 'Please log in first'));
  }
};

// Use the middleware
router.get('/getData', checkLogin, (req: RequestWithBody, res: Response) => {
  const url = 'https://movie.douban.com/cinema/nowplaying/nanjing/';
  const nowPlaying = NowPlaying.getInstance();
  new Crawler(url, nowPlaying);
  // Pass the message as data: any non-empty second argument marks the response as failed
  res.json(responseData('Data crawled successfully'));
});
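How a guard placed before a route handler decides whether the handler runs can be shown with a small stand-alone model. This is a sketch, not Express's implementation: `MiniRequest`, `MiniResponse`, `runChain` and `createResponse` are hypothetical stand-ins defined only for this example.

```typescript
// Minimal stand-ins for the Express types (assumptions for illustration only).
interface MiniRequest {
  session?: { login?: boolean } | null;
}
interface MiniResponse {
  body?: unknown;
  json(payload: unknown): void;
}
type Next = () => void;
type Handler = (req: MiniRequest, res: MiniResponse, next: Next) => void;

// Runs handlers in order; a handler must call next() for the following one to run.
function runChain(handlers: Handler[], req: MiniRequest, res: MiniResponse): void {
  const dispatch = (index: number): void => {
    const handler = handlers[index];
    if (handler) {
      handler(req, res, () => dispatch(index + 1));
    }
  };
  dispatch(0);
}

// Same decision logic as the login guard: continue only when the session is logged in.
const checkLogin: Handler = (req, res, next) => {
  const isLogin = req.session ? req.session.login : false;
  if (isLogin) {
    next();
  } else {
    res.json({ success: false, errMsg: "Please log in first" });
  }
};

const getData: Handler = (_req, res) => {
  res.json({ success: true, data: "crawled" });
};

// Creates a response object that simply records what was sent.
function createResponse(): MiniResponse {
  const res: MiniResponse = {
    json(payload: unknown): void {
      res.body = payload;
    },
  };
  return res;
}

const anonymous = createResponse();
runChain([checkLogin, getData], {}, anonymous);

const loggedIn = createResponse();
runChain([checkLogin, getData], { session: { login: true } }, loggedIn);
```

Because the guard answers the request itself when the check fails, the route handler never needs to repeat the login check.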

Unifying the response data format of the API

/*
 * @Description:
 * @Author: xiuji
 * @Date: 2023-08-14 16:22:35
 * @LastEditTime: 2023-08-14 16:30:17
 * @LastEditors: Do not edit
 */
interface ResponseData {
  success: boolean;
  data: any;
  errMsg?: string;
}

// Wraps a payload in the common response shape; a non-empty errMsg marks the call as failed
export const responseData = (data: any, errMsg?: string): ResponseData => {
  return {
    success: !errMsg,
    data,
    errMsg: errMsg || ''
  };
};
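With the two-argument helper, any non-empty second argument produces `success: false`, so a success message passed there would be reported as an error. One possible variation, sketched under the assumption that a generic payload type is wanted, splits the helper in two; `ApiResponse`, `ok` and `fail` are hypothetical names, not part of the original code.

```typescript
// Generic variant of the response shape: the payload type is tracked instead of being `any`.
interface ApiResponse<T> {
  success: boolean;
  data: T | null;
  errMsg: string;
}

// Hypothetical helpers: one per outcome, so a success message cannot be mistaken for an error.
function ok<T>(data: T): ApiResponse<T> {
  return { success: true, data, errMsg: "" };
}

function fail(errMsg: string): ApiResponse<never> {
  return { success: false, data: null, errMsg };
}

const movies = ok([{ title: "Example movie" }]);
const denied = fail("Please log in first");
```

A route would then call `res.json(ok(...))` or `res.json(fail(...))`, and clients can rely on `success` alone to tell the two cases apart.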