
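Stage 4 Summary Report

Some scattered notes

I read three documents, among them https://os.phil-opp.com/async-await/#pinning

Newly used crates: crossbeam, conquer_once

Why use conquer_once's OnceCell instead of lazy_static!? Because with this type we can make sure
that initialization never happens inside the interrupt handler, so the handler never performs a heap allocation (which avoids a possible deadlock).

The futures crate also has a Stream trait.
I skimmed the tokio documentation (a reminder about traits: a trait is like an interface; once implemented, it applies to a concrete type).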

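Reading the Tokio documentation

chat

I started with the examples, beginning with chat.
chat is a chat room: it sets up a server that asynchronously reads messages from the clients and broadcasts them to all clients.
It is a centralized structure.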
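
The centralized structure can be sketched with the standard library alone. This is only an illustration, not tokio's code: the names `Hub`, `join` and `broadcast` are made up here, plain `std::sync::mpsc` channels stand in for network connections, and skipping the sender during broadcast is an assumption about the desired behaviour.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Central hub: the server keeps one outgoing channel per connected client.
struct Hub {
    peers: HashMap<u32, Sender<String>>,
}

impl Hub {
    fn new() -> Self {
        Hub { peers: HashMap::new() }
    }

    /// Register a client and hand back the receiving end of its mailbox.
    fn join(&mut self, id: u32) -> Receiver<String> {
        let (tx, rx) = channel();
        self.peers.insert(id, tx);
        rx
    }

    /// Forward `msg` from client `from` to every other client.
    /// Returns how many clients the message was delivered to.
    fn broadcast(&self, from: u32, msg: &str) -> usize {
        let mut delivered = 0;
        for (&id, tx) in &self.peers {
            if id != from && tx.send(format!("{}: {}", from, msg)).is_ok() {
                delivered += 1;
            }
        }
        delivered
    }
}
```

Every message passes through the hub, which is what makes the design centralized: clients never talk to each other directly.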

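io_uring

And Linux

Linux has its own native asynchronous IO interface, called aio. It has several problems:

  1. It only supports O_DIRECT access, so it does not fit many IO use cases. Ordinary (buffered) IO degrades to synchronous IO.
  2. Even when the IO is asynchronous, it can still block, for example in the hardware-related part, which forces the software to obtain information about loading those parts.
  3. The API is not ideal: every submission needs 64+8 bytes of memory, and every completion needs additional memory as well.
    Its commonly used functions are io_setup(), io_submit() and io_getevents().

io-uring and epoll.

epoll is only a notification mechanism; the actual work is still done by user code issuing syscalls directly, such as read. In scenarios with high-frequency
syscalls, the frequent switches between user mode and kernel mode consume a lot of resources. io-uring can perform syscalls asynchronously,
and even without enabling SQ_POLL it greatly reduces the number of syscalls.

The problems with io-uring are the following:

Compatibility. Cross-platform compatibility goes without saying: it is linux only (epoll has counterparts on other platforms, so seamless compatibility can be built on
the already very mature mio). On linux there are also requirements on the kernel version, and the performance of the implementation differs
between versions. Large companies usually maintain their own modified kernels, so keeping up with backports is a headache
as well. It also makes the development experience harder for Mac/Windows users.

Buffer lifetime. io-uring is fully asynchronous: once an Op has been pushed to the SQ, its buffer must not be moved and must
stay valid until the syscall completes or the Cancel Op has finished executing. Whether in C/C++ or in Rust,
this raises the problem of managing the buffer's lifetime. epoll does not have this problem, because the syscall is made by the user: while the thread is inside the
syscall it cannot touch the buffer anyway, so the buffer is guaranteed to stay valid until the syscall returns.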
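
One common way to handle the buffer lifetime problem in Rust is to hand ownership of the buffer to the in-flight operation and get it back on completion. The sketch below only simulates this idea: `InFlight`, `submit_read` and `complete` are hypothetical names, and no real io_uring call is made.

```rust
/// A submitted operation owns its buffer until it completes,
/// so the caller cannot move, free or reuse it in the meantime.
struct InFlight {
    buf: Vec<u8>,
}

/// Hypothetical submission: ownership of `buf` moves into the operation.
fn submit_read(buf: Vec<u8>) -> InFlight {
    InFlight { buf }
}

impl InFlight {
    /// Simulated completion: the "kernel" fills the buffer with `data`,
    /// then the byte count and the buffer are handed back to the caller.
    fn complete(mut self, data: &[u8]) -> (usize, Vec<u8>) {
        let n = data.len().min(self.buf.len());
        self.buf[..n].copy_from_slice(&data[..n]);
        (n, self.buf)
    }
}
```

Because `submit_read` takes the `Vec<u8>` by value, the compiler rejects any use of the buffer between submission and completion, which is exactly the window in which the kernel may still write to it.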

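smol

It wraps several libraries into a handful of final interfaces.

On Linux, efficient I/O multiplexing is built on epoll, and smol reaches it through its reactor layer (in current versions
this is the async-io crate on top of the polling crate, which plays the role that mio plays for tokio). Specifically, smol
abstracts I/O operations as asynchronous tasks and registers the file descriptors they depend on with the reactor.
The reactor watches those file descriptors through epoll, and when one becomes readable or writable it notifies smol to
run the corresponding task.

In smol's source code, task scheduling ultimately rests on this epoll-based API. For example,
when reading data, the reactor polls epoll until a file descriptor is ready to be read, and only then takes the event from the event queue
and runs the corresponding asynchronous task.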

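About my runtime

I wrapped io_uring in a proactor to implement basic asynchronous reads and writes, then wrapped the proactor to implement io and file, and that is about all.
At the beginning I underestimated how large a complete runtime is.

Then I read a lot of documents: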

https://github.com/rust-lang/futures-rs/blob/master/futures-core/src/stream.rs
https://rustmagazine.github.io/rust_magazine_2021/chapter_12/monoio.html
https://github.com/bytedance/monoio/blob/master/docs/zh/io-cancel.md
https://github.com/ihciah/mini-rust-runtime/blob/master/src/tcp.rs
https://github.com/rust-lang/futures-rs/blob/master/futures-core/src/lib.rs

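I think monoio counts as a correct and complete runtime, but its size is staggering. It could be shown right at the start of the next term, so that everyone
gets an intuitive feel for the scale of the result.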

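Preemptive scheduling

Of the two experiments, the second adds some content about wait_queue compared with the first.

The CFS (Completely Fair Scheduler) scheduling policy.

The basic guarantee behind preemptive scheduling is the timer.

Preemption in ArceOS is not unconditional:

  1. Internal condition: the time slice is used up
  2. External conditions
    1. Disabling preemption (with a lock) guarantees that no preemption happens during a stretch of execution, forming a preemption-disabled critical section
    2. Preemption only happens at specific execution points (a kind of inverted critical section?)

Both the internal and the external conditions must hold.

preempt_disable_count accumulates over multiple (nested) disables, so the count can be greater than 1. A task can be preempted only when the count is 0.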
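
The counting rule can be sketched as a nesting guard. This is not the ArceOS implementation: the `Task` struct, the `need_resched` flag and the `NoPreempt` guard are names assumed for illustration; only `preempt_disable_count` comes from the notes above.

```rust
use std::cell::Cell;

/// Hypothetical per-task state; the field layout is illustrative only.
struct Task {
    preempt_disable_count: Cell<u32>,
    need_resched: Cell<bool>, // set when the time slice runs out
}

/// RAII guard: preemption stays disabled while at least one guard is alive.
struct NoPreempt<'a>(&'a Task);

impl Task {
    fn new() -> Self {
        Task {
            preempt_disable_count: Cell::new(0),
            need_resched: Cell::new(false),
        }
    }

    /// Enter a preemption-disabled critical section (may be nested).
    fn disable_preempt<'a>(&'a self) -> NoPreempt<'a> {
        self.preempt_disable_count
            .set(self.preempt_disable_count.get() + 1);
        NoPreempt(self)
    }

    /// Internal condition (slice exhausted) AND external condition (count == 0).
    fn can_preempt(&self) -> bool {
        self.need_resched.get() && self.preempt_disable_count.get() == 0
    }
}

impl<'a> Drop for NoPreempt<'a> {
    /// Leaving the critical section decrements the nesting count.
    fn drop(&mut self) {
        let task = self.0;
        task.preempt_disable_count
            .set(task.preempt_disable_count.get() - 1);
    }
}
```

With two nested guards the count is 2, and preemption becomes possible again only after both guards have been dropped.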

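CFS differs from the earlier preemptive scheduling in that it adds a scheduling algorithm.

vruntime = init_vruntime + (delta / weight(nice)).

When the system is initialized, init_vruntime, delta and nice are all 0. We can set init_vruntime manually according to preference, but as the system keeps running, the influence of init_vruntime becomes smaller and smaller.

The task with the smallest vruntime has the highest priority, i.e. it is the current task.

delta is incremented on every timer interrupt (this does not switch tasks directly; the task with the highest priority is the one that runs). As delta grows, the task's priority eventually stops being the highest, and it gets switched out.

Even if the task that runs next is the same one, a useless switch still takes place.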
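
A minimal sketch of this rule follows. It is not the ArceOS scheduler: the three-entry `weight` table is an assumption, and the factor 1024 is added only so that the division in the formula stays meaningful with integers.

```rust
/// Hypothetical CFS task record; field names follow the formula in the text.
#[derive(Debug)]
struct CfsTask {
    name: &'static str,
    init_vruntime: u64,
    delta: u64, // timer ticks actually consumed
    nice: i8,
}

/// Assumed weight table: nice 0 maps to 1024, lower nice means a larger weight.
fn weight(nice: i8) -> u64 {
    match nice {
        n if n < 0 => 2048,
        0 => 1024,
        _ => 512,
    }
}

impl CfsTask {
    /// vruntime = init_vruntime + delta / weight(nice), scaled by 1024.
    fn vruntime(&self) -> u64 {
        self.init_vruntime + self.delta * 1024 / weight(self.nice)
    }
}

/// Index of the task with the smallest vruntime, i.e. the one to run next.
fn pick_next(tasks: &[CfsTask]) -> Option<usize> {
    (0..tasks.len()).min_by_key(|&i| tasks[i].vruntime())
}

/// One timer tick: charge the running task, then pick again.
/// The result may be the same index, in which case nothing really changes.
fn tick(tasks: &mut [CfsTask], running: usize) -> usize {
    tasks[running].delta += 1;
    pick_next(tasks).unwrap_or(running)
}
```

With two nice-0 tasks where the second starts at `init_vruntime = 3`, the first keeps being picked for several ticks and the second only takes over once the first task's vruntime exceeds 3, which matches the long initial run of a single worker seen in the experiments.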

Experimental verification

The first experiment

make run A=tour/u_6_0

The resulting log is as follows:

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

Multi-task(Preemptible) is starting ...
worker1 ... ThreadId(4)
worker1 [0]
worker1 [1]
worker1 [2]
worker1 [3]
worker1 [4]
worker1 [5]
worker1 [6]
worker1 [7]
worker1 [8]
worker1 [9]
Wait for workers to exit ...
worker2 ... ThreadId(5)
worker2 [0]
worker2 [1]
worker2 [2]
worker2 [3]
worker2 [4]
worker2 [5]
worker2 [6]
worker2 [7]
worker2 [8]
worker2: nothing to do!
worker2: nothing to do!
worker2: nothing to do!
worker1 [10]
worker2 [9]
worker2: nothing to do!
worker2: nothing to do!
worker1 [11]
worker2 [10]
worker2: nothing to do!
worker1 [12]
worker1 [13]
worker2 [11]
worker2 [12]
worker1 [14]
worker1 [15]
worker2 [13]
worker2 [14]
worker2: nothing to do!
worker1 [16]
worker2 [15]
worker2: nothing to do!
worker1 [17]
worker1 [18]
worker2 [16]
worker2 [17]
worker1 [19]
worker1 [20]
worker2 [18]
worker2 [19]
worker2: nothing to do!
worker1 [21]
worker2 [20]
worker2: nothing to do!
worker1 [22]
worker1 [23]
worker2 [21]
worker2 [22]
worker1 [24]
worker1 [25]
worker2 [23]
worker2 [24]
worker1 [26]
worker1 [27]
worker1 [28]
worker2 [25]
worker2 [26]
worker1 [29]
worker1 [30]
worker2 [27]
worker2 [28]
worker1 [31]
worker1 [32]
worker1 [33]
worker2 [29]
worker2 [30]
worker1 [34]
worker1 [35]
worker2 [31]
worker2 [32]
worker1 [36]
worker1 [37]
worker1 [38]
worker2 [33]
worker2 [34]
worker1 [39]
worker1 [40]
worker2 [35]
worker2 [36]
worker2 [37]
worker1 [41]
worker1 [42]
worker2 [38]
worker2 [39]
worker1 [43]
worker1 [44]
worker2 [40]
worker2 [41]
worker2 [42]
worker1 [45]
worker1 [46]
worker2 [43]
worker2 [44]
worker1 [47]
worker1 [48]
worker2 [45]
worker2 [46]
worker2 [47]
worker1 [49]
worker1 [50]
worker2 [48]
worker2 [49]
worker1 [51]
worker1 [52]
worker2 [50]
worker2 [51]
worker2: nothing to do!
worker1 [53]
worker2 [52]
worker2: nothing to do!
worker1 [54]
worker1 [55]
worker2 [53]
worker2 [54]
worker1 [56]
worker1 [57]
worker2 [55]
worker2 [56]
worker1 [58]
worker1 [59]
worker1 [60]
worker2 [57]
worker2 [58]
worker1 [61]
worker1 [62]
worker2 [59]
worker2 [60]
worker2 [61]
worker1 [63]
worker1 [64]
worker2 [62]
worker2 [63]
worker1 [65]
worker1 [66]
worker2 [64]
worker2 [65]
worker2: nothing to do!
worker1 [67]
worker2 [66]
worker2: nothing to do!
worker1 [68]
worker1 [69]
worker2 [67]
worker2 [68]
worker1 [70]
worker1 [71]
worker2 [69]
worker2 [70]
worker2: nothing to do!
worker1 [72]
worker2 [71]
worker2: nothing to do!
worker1 [73]
worker2 [72]
worker2: nothing to do!
worker1 [74]
worker1 [75]
worker2 [73]
worker2 [74]
worker1 [76]
worker1 [77]
worker1 [78]
worker2 [75]
worker2 [76]
worker1 [79]
worker1 [80]
worker2 [77]
worker2 [78]
worker1 [81]
worker1 [82]
worker1 [83]
worker2 [79]
worker2 [80]
worker1 [84]
worker1 [85]
worker2 [81]
worker2 [82]
worker1 [86]
worker1 [87]
worker1 [88]
worker2 [83]
worker2 [84]
worker1 [89]
worker1 [90]
worker2 [85]
worker2 [86]
worker1 [91]
worker1 [92]
worker1 [93]
worker2 [87]
worker2 [88]
worker1 [94]
worker1 [95]
worker2 [89]
worker2 [90]
worker1 [96]
worker1 [97]
worker1 [98]
worker2 [91]
worker2 [92]
worker1 [99]
worker1 [100]
worker2 [93]
worker2 [94]
worker1 [101]
worker1 [102]
worker2 [95]
worker2 [96]
worker1 [103]
worker1 [104]
worker2 [97]
worker2 [98]
worker2 [99]
worker1 [105]
worker1 [106]
worker2 [100]
worker2 [101]
worker1 [107]
worker1 [108]
worker2 [102]
worker2 [103]
worker1 [109]
worker1 [110]
worker2 [104]
worker2 [105]
worker1 [111]
worker1 [112]
worker2 [106]
worker2 [107]
worker1 [113]
worker1 [114]
worker1 [115]
worker2 [108]
worker2 [109]
worker1 [116]
worker1 [117]
worker2 [110]
worker1 [118]
worker1 [119]
worker2 [111]
worker2 [112]
worker2 [113]
worker1 [120]
worker1 [121]
worker2 [114]
worker2 [115]
worker1 [122]
worker1 [123]
worker2 [116]
worker2 [117]
worker1 [124]
worker1 [125]
worker2 [118]
worker2 [119]
worker1 [126]
worker1 [127]
worker2 [120]
worker2 [121]
worker1 [128]
worker1 [129]
worker2 [122]
worker2 [123]
worker2 [124]
worker1 [130]
worker1 [131]
worker2 [125]
worker2 [126]
worker1 [132]
worker1 [133]
worker2 [127]
worker2 [128]
worker1 [134]
worker1 [135]
worker2 [129]
worker2 [130]
worker1 [136]
worker1 [137]
worker2 [131]
worker2 [132]
worker1 [138]
worker1 [139]
worker2 [133]
worker2 [134]
worker2 [135]
worker1 [140]
worker1 [141]
worker2 [136]
worker2 [137]
worker1 [142]
worker1 [143]
worker2 [138]
worker2 [139]
worker1 [144]
worker1 [145]
worker2 [140]
worker2 [141]
worker1 [146]
worker1 [147]
worker2 [142]
worker2 [143]
worker1 [148]
worker1 [149]
worker2 [144]
worker2 [145]
worker2 [146]
worker1 [150]
worker1 [151]
worker2 [147]
worker2 [148]
worker1 [152]
worker1 [153]
worker2 [149]
worker2 [150]
worker1 [154]
worker1 [155]
worker2 [151]
worker2 [152]
worker1 [156]
worker1 [157]
worker2 [153]
worker2 [154]
worker1 [158]
worker1 [159]
worker2 [155]
worker2 [156]
worker2 [157]
worker1 [160]
worker1 [161]
worker2 [158]
worker2 [159]
worker1 [162]
worker1 [163]
worker2 [160]
worker2 [161]
worker1 [164]
worker1 [165]
worker2 [162]
worker2 [163]
worker1 [166]
worker1 [167]
worker2 [164]
worker2 [165]
worker1 [168]
worker1 [169]
worker2 [166]
worker2 [167]
worker2 [168]
worker1 [170]
worker1 [171]
worker2 [169]
worker2 [170]
worker1 [172]
worker1 [173]
worker2 [171]
worker2 [172]
worker1 [174]
worker1 [175]
worker2 [173]
worker2 [174]
worker1 [176]
worker1 [177]
worker2 [175]
worker2 [176]
worker1 [178]
worker1 [179]
worker2 [177]
worker2 [178]
worker1 [180]
worker1 [181]
worker1 [182]
worker2 [179]
worker2 [180]
worker1 [183]
worker1 [184]
worker2 [181]
worker2 [182]
worker1 [185]
worker1 [186]
worker2 [183]
worker2 [184]
worker1 [187]
worker1 [188]
worker2 [185]
worker2 [186]
worker1 [189]
worker1 [190]
worker2 [187]
worker2 [188]
worker1 [191]
worker1 [192]
worker2 [189]
worker2 [190]
worker1 [193]
worker1 [194]
worker1 [195]
worker2 [191]
worker2 [192]
worker1 [196]
worker1 [197]
worker2 [193]
worker2 [194]
worker1 [198]
worker1 [199]
worker2 [195]
worker2 [196]
worker1 [200]
worker1 [201]
worker2 [197]
worker2 [198]
worker1 [202]
worker1 [203]
worker2 [199]
worker2 [200]
worker1 [204]
worker1 [205]
worker2 [201]
worker2 [202]
worker1 [206]
worker1 [207]
worker2 [203]
worker2 [204]
worker2 [205]
worker1 [208]
worker1 [209]
worker2 [206]
worker2 [207]
worker1 [210]
worker1 [211]
worker2 [208]
worker2 [209]
worker1 [212]
worker1 [213]
worker2 [210]
worker2 [211]
worker1 [214]
worker1 [215]
worker2 [212]
worker2 [213]
worker1 [216]
worker1 [217]
worker2 [214]
worker2 [215]
worker2 [216]
worker1 [218]
worker1 [219]
worker2 [217]
worker2 [218]
worker1 [220]
worker1 [221]
worker2 [219]
worker2 [220]
worker1 [222]
worker1 [223]
worker2 [221]
worker2 [222]
worker1 [224]
worker1 [225]
worker2 [223]
worker2 [224]
worker1 [226]
worker1 [227]
worker1 [228]
worker2 [225]
worker2 [226]
worker1 [229]
worker1 [230]
worker2 [227]
worker2 [228]
worker1 [231]
worker1 [232]
worker2 [229]
worker2 [230]
worker1 [233]
worker1 [234]
worker2 [231]
worker2 [232]
worker1 [235]
worker1 [236]
worker2 [233]
worker2 [234]
worker1 [237]
worker1 [238]
worker1 [239]
worker2 [235]
worker2 [236]
worker1 [240]
worker1 [241]
worker2 [237]
worker2 [238]
worker1 [242]
worker1 [243]
worker2 [239]
worker2 [240]
worker1 [244]
worker1 [245]
worker2 [241]
worker2 [242]
worker1 [246]
worker1 [247]
worker2 [243]
worker2 [244]
worker1 [248]
worker1 [249]
worker1 [250]
worker2 [245]
worker2 [246]
worker1 [251]
worker1 [252]
worker2 [247]
worker2 [248]
worker1 [253]
worker1 [254]
worker2 [249]
worker2 [250]
worker1 [255]
worker1 [256]
worker2 [251]
worker2 [252]
worker1 ok!
worker2 [253]
worker2 [254]
worker2 [255]
worker2 [256]
worker2 ok!
Multi-task(Preemptible) ok!
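
  1. At the start, worker1 executed println 10 times ([0] to [9]) but push_back only 9 times ([0] to [8]). Each time the time slice ran out the cfs scheduling algorithm was triggered, but only at this point did worker2 have the smallest vruntime (unlike round_robin, which switches as soon as the time slice runs out).
  2. After worker2 obtained the double-ended queue, it tried to print the element at the front of the queue. Once everything had been printed, it entered the other branch, printed worker2: nothing to do!, and yielded. But since worker2 still had the smallest vruntime, it ran two more times, and only after three runs in total did execution switch back to worker1. (This shows that under the cfs algorithm a yield means "consider switching", rather than putting the current task straight at the end of the queue.)
  3. After printing worker1 [10], worker1 had not yet pushed 10 into the queue when execution switched back to worker2.
  4. At this point worker2 had yielded three times without switching to worker1. It then printed worker2 [9] once and triggered worker2: nothing to do! two more times.
  5. The rest can be analysed in the same way. We can see that under the cfs algorithm, even after a yield, execution may switch back to the same task.

The second experiment

make run A=tour/u_6_1

The resulting log is as follows: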

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

WaitQ is starting ...
worker1 ...
worker1 [0]
worker1 [1]
worker1 [2]
worker1 [3]
worker1 [4]
worker2 ...
Wait for workers to exit ...
worker2 [0]
worker2 [1]
worker2 [2]
worker2 [3]
worker2 [4]
worker1 [5]
worker2 [5]
worker1 [6]
worker2 [6]
worker1 [7]
worker2 [7]
worker1 [8]
worker2 [8]
worker1 [9]
worker2 [9]
worker1 [10]
worker2 [10]
worker1 [11]
worker2 [11]
worker1 [12]
worker2 [12]
worker1 [13]
worker2 [13]
worker1 [14]
worker2 [14]
worker1 [15]
worker2 [15]
worker1 [16]
worker2 [16]
worker1 [17]
worker2 [17]
worker1 [18]
worker2 [18]
worker1 [19]
worker2 [19]
worker1 [20]
worker2 [20]
worker1 [21]
worker2 [21]
worker1 [22]
worker2 [22]
worker1 [23]
worker2 [23]
worker1 [24]
worker2 [24]
worker1 [25]
worker2 [25]
worker1 [26]
worker2 [26]
worker1 [27]
worker2 [27]
worker1 [28]
worker2 [28]
worker1 [29]
worker2 [29]
worker1 [30]
worker2 [30]
worker1 [31]
worker2 [31]
worker1 [32]
worker2 [32]
worker1 [33]
worker2 [33]
worker1 [34]
worker2 [34]
worker1 [35]
worker2 [35]
worker1 [36]
worker2 [36]
worker1 [37]
worker2 [37]
worker1 [38]
worker2 [38]
worker1 [39]
worker2 [39]
worker1 [40]
worker2 [40]
worker1 [41]
worker2 [41]
worker1 [42]
worker2 [42]
worker1 [43]
worker2 [43]
worker1 [44]
worker2 [44]
worker1 [45]
worker2 [45]
worker1 [46]
worker2 [46]
worker1 [47]
worker2 [47]
worker1 [48]
worker2 [48]
worker1 [49]
worker2 [49]
worker1 [50]
worker2 [50]
worker1 [51]
worker2 [51]
worker1 [52]
worker2 [52]
worker1 [53]
worker2 [53]
worker1 [54]
worker2 [54]
worker1 [55]
worker2 [55]
worker1 [56]
worker2 [56]
worker1 [57]
worker2 [57]
worker1 [58]
worker2 [58]
worker1 [59]
worker2 [59]
worker1 [60]
worker2 [60]
worker1 [61]
worker2 [61]
worker1 [62]
worker2 [62]
worker1 [63]
worker2 [63]
worker1 [64]
worker2 [64]
worker1 [65]
worker2 [65]
worker1 [66]
worker2 [66]
worker1 [67]
worker2 [67]
worker1 [68]
worker2 [68]
worker1 [69]
worker2 [69]
worker1 [70]
worker2 [70]
worker1 [71]
worker2 [71]
worker1 [72]
worker2 [72]
worker1 [73]
worker2 [73]
worker1 [74]
worker2 [74]
worker1 [75]
worker2 [75]
worker1 [76]
worker2 [76]
worker1 [77]
worker2 [77]
worker1 [78]
worker2 [78]
worker1 [79]
worker2 [79]
worker1 [80]
worker2 [80]
worker1 [81]
worker2 [81]
worker1 [82]
worker2 [82]
worker1 [83]
worker2 [83]
worker1 [84]
worker2 [84]
worker1 [85]
worker2 [85]
worker1 [86]
worker2 [86]
worker1 [87]
worker2 [87]
worker1 [88]
worker2 [88]
worker1 [89]
worker2 [89]
worker1 [90]
worker2 [90]
worker1 [91]
worker2 [91]
worker1 [92]
worker2 [92]
worker1 [93]
worker2 [93]
worker1 [94]
worker2 [94]
worker1 [95]
worker2 [95]
worker1 [96]
worker2 [96]
worker1 [97]
worker2 [97]
worker1 [98]
worker2 [98]
worker1 [99]
worker2 [99]
worker1 [100]
worker2 [100]
worker1 [101]
worker2 [101]
worker1 [102]
worker2 [102]
worker1 [103]
worker2 [103]
worker1 [104]
worker2 [104]
worker1 [105]
worker2 [105]
worker1 [106]
worker2 [106]
worker1 [107]
worker2 [107]
worker1 [108]
worker2 [108]
worker1 [109]
worker2 [109]
worker1 [110]
worker2 [110]
worker1 [111]
worker2 [111]
worker1 [112]
worker2 [112]
worker1 [113]
worker2 [113]
worker1 [114]
worker2 [114]
worker1 [115]
worker2 [115]
worker1 [116]
worker2 [116]
worker1 [117]
worker2 [117]
worker1 [118]
worker2 [118]
worker1 [119]
worker2 [119]
worker1 [120]
worker2 [120]
worker1 [121]
worker2 [121]
worker1 [122]
worker2 [122]
worker1 [123]
worker2 [123]
worker1 [124]
worker2 [124]
worker1 [125]
worker2 [125]
worker1 [126]
worker2 [126]
worker1 [127]
worker2 [127]
worker1 [128]
worker2 [128]
worker1 [129]
worker2 [129]
worker1 [130]
worker2 [130]
worker1 [131]
worker2 [131]
worker1 [132]
worker2 [132]
worker1 [133]
worker2 [133]
worker1 [134]
worker2 [134]
worker1 [135]
worker2 [135]
worker1 [136]
worker2 [136]
worker1 [137]
worker2 [137]
worker1 [138]
worker2 [138]
worker1 [139]
worker2 [139]
worker1 [140]
worker2 [140]
worker1 [141]
worker2 [141]
worker1 [142]
worker2 [142]
worker1 [143]
worker2 [143]
worker1 [144]
worker2 [144]
worker1 [145]
worker2 [145]
worker1 [146]
worker2 [146]
worker1 [147]
worker2 [147]
worker1 [148]
worker2 [148]
worker1 [149]
worker2 [149]
worker1 [150]
worker2 [150]
worker1 [151]
worker2 [151]
worker1 [152]
worker2 [152]
worker1 [153]
worker2 [153]
worker1 [154]
worker2 [154]
worker1 [155]
worker2 [155]
worker1 [156]
worker2 [156]
worker1 [157]
worker2 [157]
worker1 [158]
worker2 [158]
worker1 [159]
worker2 [159]
worker1 [160]
worker2 [160]
worker1 [161]
worker2 [161]
worker1 [162]
worker2 [162]
worker1 [163]
worker2 [163]
worker1 [164]
worker2 [164]
worker1 [165]
worker2 [165]
worker1 [166]
worker2 [166]
worker1 [167]
worker2 [167]
worker1 [168]
worker2 [168]
worker1 [169]
worker2 [169]
worker1 [170]
worker2 [170]
worker1 [171]
worker2 [171]
worker1 [172]
worker2 [172]
worker1 [173]
worker2 [173]
worker1 [174]
worker2 [174]
worker1 [175]
worker2 [175]
worker1 [176]
worker2 [176]
worker1 [177]
worker2 [177]
worker1 [178]
worker2 [178]
worker1 [179]
worker2 [179]
worker1 [180]
worker2 [180]
worker1 [181]
worker2 [181]
worker1 [182]
worker2 [182]
worker1 [183]
worker2 [183]
worker1 [184]
worker2 [184]
worker1 [185]
worker2 [185]
worker1 [186]
worker2 [186]
worker1 [187]
worker2 [187]
worker1 [188]
worker2 [188]
worker1 [189]
worker2 [189]
worker1 [190]
worker2 [190]
worker1 [191]
worker2 [191]
worker1 [192]
worker2 [192]
worker1 [193]
worker2 [193]
worker1 [194]
worker2 [194]
worker1 [195]
worker2 [195]
worker1 [196]
worker2 [196]
worker1 [197]
worker2 [197]
worker1 [198]
worker2 [198]
worker1 [199]
worker2 [199]
worker1 [200]
worker2 [200]
worker1 [201]
worker2 [201]
worker1 [202]
worker2 [202]
worker1 [203]
worker2 [203]
worker1 [204]
worker2 [204]
worker1 [205]
worker2 [205]
worker1 [206]
worker2 [206]
worker1 [207]
worker2 [207]
worker1 [208]
worker2 [208]
worker1 [209]
worker2 [209]
worker1 [210]
worker2 [210]
worker1 [211]
worker2 [211]
worker1 [212]
worker2 [212]
worker1 [213]
worker2 [213]
worker1 [214]
worker2 [214]
worker1 [215]
worker2 [215]
worker1 [216]
worker2 [216]
worker1 [217]
worker2 [217]
worker1 [218]
worker2 [218]
worker1 [219]
worker2 [219]
worker1 [220]
worker2 [220]
worker1 [221]
worker2 [221]
worker1 [222]
worker2 [222]
worker1 [223]
worker2 [223]
worker1 [224]
worker2 [224]
worker1 [225]
worker2 [225]
worker1 [226]
worker2 [226]
worker1 [227]
worker2 [227]
worker1 [228]
worker2 [228]
worker1 [229]
worker2 [229]
worker1 [230]
worker2 [230]
worker1 [231]
worker2 [231]
worker1 [232]
worker2 [232]
worker1 [233]
worker2 [233]
worker1 [234]
worker2 [234]
worker1 [235]
worker2 [235]
worker1 [236]
worker2 [236]
worker1 [237]
worker2 [237]
worker1 [238]
worker2 [238]
worker1 [239]
worker2 [239]
worker1 [240]
worker2 [240]
worker1 [241]
worker2 [241]
worker1 [242]
worker2 [242]
worker1 [243]
worker2 [243]
worker1 [244]
worker2 [244]
worker1 [245]
worker2 [245]
worker1 [246]
worker2 [246]
worker1 [247]
worker2 [247]
worker1 [248]
worker2 [248]
worker1 [249]
worker2 [249]
worker1 [250]
worker2 [250]
worker1 [251]
worker2 [251]
worker1 [252]
worker2 [252]
worker1 [253]
worker2 [253]
worker1 [254]
worker2 [254]
worker1 [255]
worker2 [255]
worker1 [256]
worker2 [256]
worker2 ok!
worker1 ok!
WaitQ ok!

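It is obvious that after adding WaitQ, the scheduling of worker1 and worker2 became more even. Or rather than "even", it is better to say that worker2: nothing to do! has basically disappeared.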
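
The effect of a wait queue can be reproduced in user space. The sketch below uses the standard library's `Mutex` and `Condvar` as a stand-in for the kernel's WaitQ (an assumption made for illustration; it is not the experiment's code): the consumer sleeps until the producer notifies it, instead of looping over "nothing to do" and yielding.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Producer/consumer over a shared deque. Returns the items in the order the
/// consumer received them, plus how many times the consumer had to sleep.
fn run_with_wait_queue(n: u32) -> (Vec<u32>, u32) {
    let shared = Arc::new((Mutex::new(VecDeque::<u32>::new()), Condvar::new()));

    let producer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            for i in 0..n {
                let (queue, not_empty) = &*shared;
                queue.lock().unwrap().push_back(i);
                not_empty.notify_one(); // wake the consumer if it is waiting
            }
        })
    };

    let consumer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let (queue, not_empty) = &*shared;
            let mut received = Vec::new();
            let mut waits = 0;
            while received.len() < n as usize {
                let mut q = queue.lock().unwrap();
                while q.is_empty() {
                    waits += 1;
                    // Blocks until notified: no busy "nothing to do" loop.
                    q = not_empty.wait(q).unwrap();
                }
                received.push(q.pop_front().unwrap());
            }
            (received, waits)
        })
    };

    producer.join().unwrap();
    consumer.join().unwrap()
}
```

Calling `run_with_wait_queue(257)` delivers 0 through 256 in order, like the two workers in the log; the number of sleeps depends on thread timing.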

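Block devices

With block devices introduced, data is now loaded from disk through the block device, instead of only reading a simple device such as PFlash.

AllDevices manages all the devices in the system.

static: device types are wrapped efficiently using generics. As a result, each type can manage only one device.

dyn: the dynamic approach. It relies on a growable Vec, so each type can have multiple instances.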
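
The difference between the two models can be sketched as follows. The trait and type names (`BlockDriver`, `StaticDevices`, `DynDevices`, `RamDisk`) are hypothetical and chosen only to illustrate the idea; they are not the axdriver definitions.

```rust
/// Hypothetical driver trait, for illustration only.
trait BlockDriver {
    fn name(&self) -> &str;
}

struct VirtioBlk;
struct RamDisk;

impl BlockDriver for VirtioBlk {
    fn name(&self) -> &str {
        "virtio-blk"
    }
}

impl BlockDriver for RamDisk {
    fn name(&self) -> &str {
        "ramdisk"
    }
}

/// "static" model: the concrete type is fixed at compile time through a
/// generic parameter, and the container holds at most one such device.
struct StaticDevices<D: BlockDriver> {
    block: Option<D>,
}

/// "dyn" model: trait objects in a growable Vec, so any number of devices
/// of different concrete types can be registered at run time.
struct DynDevices {
    block: Vec<Box<dyn BlockDriver>>,
}
```

The static variant avoids heap allocation and virtual calls, while the dyn variant trades that for the ability to hold several devices per category.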

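Ways of discovering devices:

  1. Through the pcie protocol
  2. For mmio, by looking devices up in the device tree

However, the device tree is not used at the moment; the for_each_drivers! approach is used instead.

virtio devices are virtual devices set up through qemu command-line options.

There are eight slots to hold these drivers.

The virtio-mmio driver sends a query to each slot and thereby learns the type of the device in that slot.

An IPI is an interrupt used for communication between cores. It is an interrupt, but it is implemented by sending a command to another core, which triggers a hardware interrupt there.

A filesystem operation works by performing lookup step by step from the Root node down to the target node, and then operating on the target node.

Filesystem examples:

  1. Ext2
  2. The filesystem used by ArceOS is Fat32

mount can be seen as unfolding a filesystem in memory.

That is, while the directory tree is stored on the storage medium it is flat, whereas what we work with now is the unfolded, tree-shaped structure.

mount can attach the directory tree of another filesystem onto the directory tree of the current filesystem.

When handling mounts, picking the mount point with the longest matching path solves part of the problem.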
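
Longest-path matching can be sketched like this. The function names `resolve_mount` and `is_under` are hypothetical, and the sketch assumes absolute, already normalized paths.

```rust
/// Return the mount point that is the longest prefix of `path`,
/// together with the remainder of the path inside that filesystem.
fn resolve_mount<'a>(mounts: &[&'a str], path: &'a str) -> Option<(&'a str, &'a str)> {
    mounts
        .iter()
        .copied()
        .filter(|m| is_under(path, m))
        .max_by_key(|m| m.len())
        .map(|m| {
            let prefix_len = m.trim_end_matches('/').len();
            (m, path[prefix_len..].trim_start_matches('/'))
        })
}

/// True when `path` equals `mount` or lies below it on a component boundary,
/// so that "/device" is not mistaken for something under "/dev".
fn is_under(path: &str, mount: &str) -> bool {
    let m = mount.trim_end_matches('/');
    match path.strip_prefix(m) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}
```

With the mounts "/", "/dev" and "/dev/shm", the path "/dev/shm/x" resolves to the "/dev/shm" filesystem with the inner path "x", while "/device" falls through to the root filesystem.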

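Experiment: reading the block device

make run A=tour/u_7_0 BLK=y

There is a feature called dyn. Only when it is enabled does each device type get a Vec to manage its devices; otherwise there is only one device per type.

The driver used here is the pci one rather than mmio, even though the slides covered mmio at length. This seemed odd. According to modules/axdriver/Cargo.toml, default=["bus-pci"]; changing it to default = ["bus-mmio"] also runs fine, and the LOG is different.

cargo b -vv shows the output of build.rs during compilation.

Likewise, the content preloaded into the block device in this experiment is defined in scripts/make/utils.mk; see the earlier part about mk_pflash for details.

Experiment: loading data from the filesystem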

make run A=tour/u_8_0 BLK=y

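There is not much to say about this task. The main thing is to look at the Cargo file of this experiment, which turns on the fs feature for axstd.

We can dig in and read the implementations of open and read that the experiment calls.

Adding file operation commands to the shell

A problem that bothered me for a long time was rename: the fatfs rename behind ax_rename can only rename files within the current directory.
In fact, the underlying rust-fatfs rename does support renaming a file into another directory, i.e. it allows moving an inode from under one inode to under another.
So rename can actually be used to implement mv.

I was too hasty in this part and did not look at the execution flow of the other APIs.

When building and running, make good use of LOG="level".

Experiment 1

This experiment is the homework for the previous part, so it is skipped for now.

LinuxApp

Experiment commands: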

make payload
./update_disk.sh payload/hello_c/hello
make run A=tour/m_3_0 BLK=y

Looking at payload/hello_c/Makefile, we can see:

TARGET := hello

CC := riscv64-linux-musl-gcc
STRIP := riscv64-linux-musl-strip

all: $(TARGET)

%: %.c
	$(CC) -static $< -o $@
	$(STRIP) $@

clean:
	@rm -rf ./$(TARGET)

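We can see that the compiler and the strip tool (which removes symbol information) are both the specified linux (musl) versions.

This figure is rather misleading.

It shows the user stack of a linux application.

From a practical point of view, though, look at the prototype of the application's main function: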

int main(int argc, char *argv[], char *envp[]);

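We only need to store the following on the stack, in order:

  1. argc
  2. arg_ptr
  3. env_ptr
  4. auxv

That is enough, as long as the value of argc is correct and the instances that arg_ptr and env_ptr point to are correct.

This raises a question: which one is actually right?

The comment in src/user_stack.rs of kernel-elf-parser matches its actual implementation: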

//! Initialize the user stack for the application
//!
//! The structure of the user stack is described in the following figure:
//! position            content                     size (bytes) + comment
//!   ------------------------------------------------------------------------
//! stack pointer ->  [ argc = number of args ]     8
//!                   [ argv[0] (pointer) ]         8   (program name)
//!                   [ argv[1] (pointer) ]         8
//!                   [ argv[..] (pointer) ]        8 * x
//!                   [ argv[n - 1] (pointer) ]     8
//!                   [ argv[n] (pointer) ]         8   (= NULL)
//!                   [ envp[0] (pointer) ]         8
//!                   [ envp[1] (pointer) ]         8
//!                   [ envp[..] (pointer) ]        8
//!                   [ envp[term] (pointer) ]      8   (= NULL)
//!                   [ auxv[0] (Elf32_auxv_t) ]    16
//!                   [ auxv[1] (Elf32_auxv_t) ]    16
//!                   [ auxv[..] (Elf32_auxv_t) ]   16
//!                   [ auxv[term] (Elf32_auxv_t) ] 16  (= AT_NULL vector)
//!                   [ padding ]                   0 - 16
//!                   [ argument ASCIIZ strings ]   >= 0
//!                   [ environment ASCIIZ str. ]   >= 0
//!
//! (0xbffffff8)      [ end marker ]                8   (= NULL)
//!
//! (0xc0000000)      < bottom of stack >           0   (virtual)
//!
//! More details can be found in the link: <https://articles.manugarg.com/aboutelfauxiliaryvectors.html>
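
To make the layout concrete, here is a small sketch that builds such a stack image for a 64-bit target. It is not the kernel-elf-parser implementation: the function name `build_user_stack`, its parameters and the omission of the end marker are assumptions made for illustration.

```rust
/// Build a user stack image that ends at `stack_top` (a hypothetical address).
/// Returns the initial stack pointer and the bytes to copy to that address.
fn build_user_stack(
    stack_top: u64,
    args: &[&str],
    envs: &[&str],
    auxv: &[(u64, u64)],
) -> (u64, Vec<u8>) {
    // 1. The strings sit at the top of the stack, each NUL-terminated.
    let mut strings = Vec::new();
    let mut offsets = Vec::new(); // offset of each string inside `strings`
    for s in args.iter().chain(envs) {
        offsets.push(strings.len() as u64);
        strings.extend_from_slice(s.as_bytes());
        strings.push(0);
    }
    let strings_base = stack_top - strings.len() as u64;

    // 2. Word area: argc, argv[], NULL, envp[], NULL, auxv pairs, AT_NULL.
    let mut words: Vec<u64> = vec![args.len() as u64];
    words.extend(offsets[..args.len()].iter().map(|o| strings_base + o));
    words.push(0);
    words.extend(offsets[args.len()..].iter().map(|o| strings_base + o));
    words.push(0);
    for &(key, value) in auxv {
        words.extend([key, value]);
    }
    words.extend([0, 0]); // AT_NULL terminator

    // 3. Choose a 16-byte aligned stack pointer and pad up to the strings.
    let sp = (strings_base - 8 * words.len() as u64) & !0xf;
    let mut image: Vec<u8> = words.iter().flat_map(|w| w.to_le_bytes()).collect();
    image.resize((strings_base - sp) as usize, 0);
    image.extend_from_slice(&strings);
    (sp, image)
}
```

The word at the returned stack pointer is argc, followed by the argv pointers, which is exactly what the program's startup code expects to find before it calls main.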

The resulting stack: (figure not preserved)

Run log:

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = info

[ 1.746356 0 axruntime:130] Logging is enabled.
[ 1.856119 0 axruntime:131] Primary CPU 0 started, dtb = 0x87000000.
[ 1.905723 0 axruntime:133] Found physcial memory regions:
[ 1.962960 0 axruntime:135] [PA:0x80200000, PA:0x80232000) .text (READ | EXECUTE | RESERVED)
[ 2.026512 0 axruntime:135] [PA:0x80232000, PA:0x80241000) .rodata (READ | RESERVED)
[ 2.073912 0 axruntime:135] [PA:0x80241000, PA:0x80244000) .data .tdata .tbss .percpu (READ | WRITE | RESERVED)
[ 2.124278 0 axruntime:135] [PA:0x80244000, PA:0x80284000) boot stack (READ | WRITE | RESERVED)
[ 2.168556 0 axruntime:135] [PA:0x80284000, PA:0x802ad000) .bss (READ | WRITE | RESERVED)
[ 2.212764 0 axruntime:135] [PA:0x802ad000, PA:0x88000000) free memory (READ | WRITE | FREE)
[ 2.261680 0 axruntime:135] [PA:0x101000, PA:0x102000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.305544 0 axruntime:135] [PA:0xc000000, PA:0xc210000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.349843 0 axruntime:135] [PA:0x10000000, PA:0x10001000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.394978 0 axruntime:135] [PA:0x10001000, PA:0x10009000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.440055 0 axruntime:135] [PA:0x22000000, PA:0x24000000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.485718 0 axruntime:135] [PA:0x30000000, PA:0x40000000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.530990 0 axruntime:135] [PA:0x40000000, PA:0x80000000) mmio (READ | WRITE | DEVICE | RESERVED)
[ 2.583846 0 axruntime:208] Initialize global memory allocator...
[ 2.621634 0 axruntime:209] use TLSF allocator.
[ 2.816195 0 axmm:81] Initialize virtual memory management...
[ 3.188863 0 axruntime:150] Initialize platform devices...
[ 3.249907 0 axtask::api:68] Initialize scheduling...
[ 3.436552 0 axtask::api:74] use Completely Fair scheduler.
[ 3.474966 0 axdriver:152] Initialize device drivers...
[ 3.510394 0 axdriver:153] device model: static
[ 3.664938 0 virtio_drivers::device::blk:59] config: 0xffffffc040006000
[ 3.721121 0 virtio_drivers::device::blk:64] found a block device of size 65536KB
[ 3.787426 0 axdriver::bus::pci:104] registered a new Block device at 00:01.0: "virtio-blk"
[ 21.285217 0 axfs:41] Initialize filesystems...
[ 21.329601 0 axfs:44] use block device 0: "virtio-blk"
[ 22.099152 0 fatfs::dir:139] Is a directory
[ 22.277106 0 fatfs::dir:139] Is a directory
[ 22.556181 0 fatfs::dir:139] Is a directory
[ 22.683443 0 fatfs::dir:139] Is a directory
[ 22.770783 0 axruntime:176] Initialize interrupt handlers...
[ 22.932112 0 axruntime:186] Primary CPU 0 init OK.
[ 23.210370 0:2 m_3_0::loader:58] e_entry: 0x50E
phdr: offset: 0x0=>0x0 size: 0x17CC=>0x17CC
VA:0x0 - VA:0x2000
phdr: offset: 0x1E70=>0x2E70 size: 0x338=>0x9A8
VA:0x2000 - VA:0x4000
entry: 0x50e
Mapping user stack: VA:0x3fffff0000 -> VA:0x4000000000
New user address space: AddrSpace {
va_range: VA:0x0..VA:0x4000000000,
page_table_root: PA:0x8064e000,
}
[ 23.946790 0:4 m_3_0::task:56] Enter user space: entry=0x50e, ustack=0x3fffffffc0, kstack=VA:0xffffffc0806a7010
handle_syscall [96] ...
handle_syscall [29] ...
Unimplemented syscall: SYS_IOCTL
handle_syscall [66] ...
Hello, UserApp!
handle_syscall [66] ...

handle_syscall [94] ...
[SYS_EXIT_GROUP]: system is exiting ..
monolithic kernel exit [Some(0)] normally!
[ 24.504671 0:2 axhal::platform::riscv64_qemu_virt::misc:3] Shutting down...

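The run also invoked two further system calls: SYS_IOCTL and SYS_SET_TID_ADDRESS.

The course material explains why: "The example m_3_0 is statically compiled with the musl toolchain, and the parts the toolchain attaches to the application also call syscalls."

These calls come from the _start and _exit code that the toolchain adds.

set_tid_address stores the given address as the calling thread's clear_child_tid and returns the caller's thread ID; the value matters when threads are created and torn down.
Writing a new thread's tid into such an address at creation time is the job of the related set_child_tid (clone with CLONE_CHILD_SETTID).
When a thread with clear_child_tid set exits and releases its locks and other resources, the kernel writes 0 to that address and wakes any futex waiter on it.

ioctl is used here to configure the attributes of the output terminal.
Output currently goes through SBI putchar, so the call can simply be skipped.

System call numbers differ between architectures. The example follows the riscv64 system call numbering.

In summary: if we implement a reasonable set of syscalls under the correct numbers, we get a degree of compatibility.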
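To make the idea of a syscall compatibility layer concrete, here is a minimal sketch of a dispatcher keyed on the riscv64 numbers that appear in the log above (96, 29, 93, 94). The `Outcome` type, the `handle_syscall` signature, and the choice to answer ioctl with 0 are assumptions made for illustration; this is not the ArceOS implementation.

```rust
/// riscv64 Linux syscall numbers that show up in the log above.
const SYS_IOCTL: usize = 29;
const SYS_EXIT: usize = 93;
const SYS_EXIT_GROUP: usize = 94;
const SYS_SET_TID_ADDRESS: usize = 96;

/// Linux errno for "function not implemented".
const ENOSYS: isize = 38;

/// Hypothetical result type: either return a value to user space or exit.
#[derive(Debug, PartialEq)]
enum Outcome {
    Return(isize),
    Exit(i32),
}

/// Hypothetical dispatcher: `num` comes from a7, `args` from a0..a5.
fn handle_syscall(num: usize, args: [usize; 6], current_tid: isize) -> Outcome {
    match num {
        // A full kernel would also remember args[0] as clear_child_tid.
        SYS_SET_TID_ADDRESS => Outcome::Return(current_tid),
        // Assumption: terminal settings are irrelevant while output goes
        // through SBI putchar, so report success and do nothing.
        SYS_IOCTL => Outcome::Return(0),
        SYS_EXIT | SYS_EXIT_GROUP => Outcome::Exit(args[0] as i32),
        // Anything else is reported as unimplemented.
        _ => Outcome::Return(-ENOSYS),
    }
}
```

Calling `handle_syscall(SYS_EXIT_GROUP, [0; 6], 4)` yields `Outcome::Exit(0)`, which corresponds to the "exit [Some(0)] normally" line in the log. Returning `-ENOSYS` for unknown numbers lets the libc startup code carry on instead of crashing the kernel.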

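An app like this one only needs a syscall compatibility layer.

Other compatibility layers have to be implemented depending on the app.

Support for common Linux filesystems

ArceOS provides procfs and sysfs compatibility through axfs_ramfs, and devfs compatibility through axfs_devfs.
Using ramfs for this is a temporary measure.
That is, an in-memory filesystem is used: accessing a file means accessing a memory-backed node that holds data filled in by other subsystems.
In real Linux, accessing a file under proc actually invokes a callback that fetches the system state.

Implementing the mmap system call

Approach:

  1. Read the file content with sys_read.
  2. Get the current task's user space.
  3. Find a free virtual address range for the mapping.
  4. Build the flags.
  5. Allocate a frame and map the virtual address to it.
  6. Copy the file content into that memory.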
#[allow(unused_variables)]
fn sys_mmap(
    addr: *mut usize,
    length: usize,
    prot: i32,
    flags: i32,
    fd: i32,
    _offset: isize,
) -> isize {
    // The file content is staged in a small stack buffer, so this simplified
    // version only supports offset + length up to MAX_MMAP_SIZE bytes.
    const MAX_MMAP_SIZE: usize = 64;
    if _offset < 0 {
        return -1;
    }
    let offset = _offset as usize;
    let end = match offset.checked_add(length) {
        Some(end) if end <= MAX_MMAP_SIZE => end,
        _ => return -1, // would overflow the staging buffer
    };

    // Step 1: read the first `end` bytes of the file.
    let mut buf = [0u8; MAX_MMAP_SIZE];
    unsafe {
        let buf_ptr = &mut buf as *mut _ as *mut c_void;
        sys_read(fd, buf_ptr, end);
    }
    let data = &buf[offset..end];

    // Step 2: lock the current task's user address space.
    let binding = current();
    let mut uspace = &mut binding.task_ext().aspace.lock();

    // Step 3: pick the virtual address (search for a free area if no hint).
    let free_va = if addr.is_null() {
        uspace
            .find_free_area(
                (addr as usize).into(),
                length,
                VirtAddrRange::new(uspace.base(), uspace.end()),
            )
            .unwrap_or_else(|| panic!("No free area for mmap"))
    } else {
        (addr as usize).into()
    };

    // Step 4: convert prot into MappingFlags.
    let mut flags = MappingFlags::from(MmapProt::from_bits_truncate(prot));
    flags.set(MappingFlags::USER, true);

    // Step 5: allocate one populated page and map it at free_va.
    uspace
        .map_alloc(free_va, PAGE_SIZE_4K, flags, true)
        .unwrap();
    let (paddr, _, _) = uspace
        .page_table()
        .query(free_va)
        .unwrap_or_else(|_| panic!("Mapping failed for segment"));

    // Step 6: copy only the bytes actually read; copying PAGE_SIZE_4K bytes
    // from the 64-byte buffer would read past its end.
    unsafe {
        core::ptr::copy_nonoverlapping(
            data.as_ptr(),
            phys_to_virt(paddr).as_mut_ptr(),
            data.len(),
        );
    }
    free_va.as_usize() as isize
}

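The handling of flags here is still far from complete and needs to be extended later.

What I learned

The lecture covered address spaces and page tables; task scheduling is left for the next lecture.

  1. How to start a child task from a main task and have it carry out a series of jobs
  2. How to go from a single task to two tasks, and have the two tasks communicate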

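How the Makefile works

The invocation is implemented as follows.

First, in the Makefile:

run: build justrun

The run target depends on two phony targets, build and justrun.

Next, look at build:

build: $(OUT_DIR) $(OUT_BIN)

It needs two real files, OUT_DIR and OUT_BIN.

The rules that create them are in scripts/make/build.mk:

$(OUT_DIR):
	$(call run_cmd,mkdir,-p $@)

$(OUT_BIN): _cargo_build $(OUT_ELF)
	$(call run_cmd,$(OBJCOPY),$(OUT_ELF) --strip-all -O binary $@)

run_cmd, which is called here, is defined in scripts/make/utils.mk:

GREEN_C := \033[92;1m
CYAN_C := \033[96;1m
YELLOW_C := \033[93;1m
GRAY_C := \033[90m
WHITE_C := \033[37m
END_C := \033[0m

define run_cmd
	@printf '$(WHITE_C)$(1)$(END_C) $(GRAY_C)$(2)$(END_C)\n'
	@$(1) $(2)
endef

Here $(1) and $(2) are the two arguments the macro receives.

The macro does two things:

  1. Its first line prints the command about to be executed, using the colour codes
  2. Its second line executes the command formed by the two arguments
  3. $@ stands for the name of the target being built

Taken together, the $(OUT_DIR) rule simply creates a directory named after OUT_DIR.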

In the Makefile:

A ?= tour/u_1_0
APP ?= $(A)
... ...

# Paths
OUT_DIR ?= $(APP)

Now back to OUT_BIN. Its rule is in scripts/make/build.mk:

$(OUT_BIN): _cargo_build $(OUT_ELF)
	$(call run_cmd,$(OBJCOPY),$(OUT_ELF) --strip-all -O binary $@)

So building it requires the phony target _cargo_build and the real file OUT_ELF.

_cargo_build likewise prints a message first and then calls cargo_build:

_cargo_build:
	@printf " $(GREEN_C)Building$(END_C) App: $(APP_NAME), Arch: $(ARCH), Platform: $(PLATFORM_NAME), App type: $(APP_TYPE)\n"
ifeq ($(APP_TYPE), rust)
	$(call cargo_build,$(APP),$(AX_FEAT) $(LIB_FEAT) $(APP_FEAT))
	@cp $(rust_elf) $(OUT_ELF)
else ifeq ($(APP_TYPE), c)
	$(call cargo_build,ulib/axlibc,$(AX_FEAT) $(LIB_FEAT))
endif

cargo_build is defined in scripts/make/cargo.mk:

define cargo_build
	$(call run_cmd,cargo -C $(1) build,$(build_args) --features "$(strip $(2))")
endef

Since we already know how run_cmd works, this simply runs cargo to build an ELF file.

Back to OUT_BIN: once both prerequisites exist, OBJCOPY (defined in the Makefile as OBJCOPY ?= rust-objcopy --binary-architecture=$(ARCH)) strips the ELF headers and other extra information, leaving only the raw executable binary:

$(OUT_BIN): _cargo_build $(OUT_ELF)
	$(call run_cmd,$(OBJCOPY),$(OUT_ELF) --strip-all -O binary $@)

Finally, back to justrun, which calls run_qemu (in scripts/make/qemu.mk):

define run_qemu
	@printf " $(CYAN_C)Running$(END_C) on qemu...\n"
	$(call run_cmd,$(QEMU),$(qemu_args-y))
endef

This invokes QEMU, which depends on ARCH ?= riscv64 (in the Makefile):

QEMU := qemu-system-$(ARCH)

It also uses qemu_args-y, which likewise depends on ARCH; I won't go into the details:

qemu_args-x86_64 := \
  -machine q35 \
  -kernel $(OUT_ELF)

qemu_args-riscv64 := \
  -machine virt \
  -bios default \
  -kernel $(OUT_BIN)

qemu_args-aarch64 := \
  -cpu cortex-a72 \
  -machine virt \
  -kernel $(OUT_BIN)

qemu_args-y := -m 128M -smp $(SMP) $(qemu_args-$(ARCH))

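ReadPFlash

Goals of this section:

  1. Introduce the page-table management component and support device MMIO by remapping the address space
  2. Understand the address-space concept, why remapping matters, and the page-table mechanism

We want to load the application's data from PFlash, as groundwork for running the later programs.

Experiment: what happens without paging

Normal run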

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

Try to access dev region [0xFFFFFFC022000000], got 0x646C6670
Got pflash magic: pfld

Where does pfld come from?

In scripts/make/utils.mk, pfld is written at the start of the pflash image:

define mk_pflash
@RUSTFLAGS="" cargo build -p origin --target riscv64gc-unknown-none-elf --release
@rust-objcopy --binary-architecture=riscv64 --strip-all -O binary ./target/riscv64gc-unknown-none-elf/release/origin /tmp/origin.bin
@printf "pfld\00\00\00\01" > /tmp/prefix.bin
@printf "%08x" `stat -c "%s" /tmp/origin.bin` | xxd -r -ps > /tmp/size.bin
@cat /tmp/prefix.bin /tmp/size.bin > /tmp/head.bin
@dd if=/dev/zero of=./$(1) bs=1M count=32
@dd if=/tmp/head.bin of=./$(1) conv=notrunc
@dd if=/tmp/origin.bin of=./$(1) seek=16 obs=1 conv=notrunc
endef
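The layout that mk_pflash writes can be read back with a few lines of Rust. The sketch below assumes what the shell commands imply: 4 magic bytes "pfld", 4 bytes 00 00 00 01, a 4-byte big-endian size (from printf "%08x" piped through xxd -r -ps), and the payload starting at byte 16 (dd seek=16 obs=1). The names `PflashHeader`, `parse_header` and `payload`, and reading the second word as a version number, are my own assumptions.

```rust
/// Magic bytes written by `printf "pfld..."`.
const MAGIC: &[u8; 4] = b"pfld";
/// `dd ... seek=16 obs=1` places the payload at byte offset 16.
const PAYLOAD_OFFSET: usize = 16;

/// Hypothetical view of the 12-byte header built by mk_pflash.
#[derive(Debug, PartialEq)]
struct PflashHeader {
    /// Bytes 4..8 (00 00 00 01); treating them as a version is an assumption.
    version: u32,
    /// Bytes 8..12: payload size in bytes, big-endian.
    size: u32,
}

fn parse_header(image: &[u8]) -> Option<PflashHeader> {
    if image.len() < PAYLOAD_OFFSET || image[..4] != MAGIC[..] {
        return None;
    }
    let version = u32::from_be_bytes([image[4], image[5], image[6], image[7]]);
    let size = u32::from_be_bytes([image[8], image[9], image[10], image[11]]);
    Some(PflashHeader { version, size })
}

/// Returns the payload bytes, or None if the header is bad or the image is short.
fn payload(image: &[u8]) -> Option<&[u8]> {
    let header = parse_header(image)?;
    let end = PAYLOAD_OFFSET.checked_add(header.size as usize)?;
    image.get(PAYLOAD_OFFSET..end)
}
```

This also explains the value in the log: reading the first four bytes as a little-endian u32, as the riscv64 kernel does, gives `u32::from_le_bytes(*b"pfld") == 0x646C6670`.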

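So where is this macro called? During a normal run, it is not.

The image we use came with the repository when we pulled it; running make pflash_img regenerates it.

In scripts/make/qemu.mk: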

qemu_args-y := -m 128M -smp $(SMP) $(qemu_args-$(ARCH))

qemu_args-$(PFLASH) += \
-drive if=pflash,file=$(CURDIR)/$(PFLASH_IMG),format=raw,unit=1
... ...

define run_qemu
@printf " $(CYAN_C)Running$(END_C) on qemu...\n"
$(call run_cmd,$(QEMU),$(qemu_args-y))
endef

In the Makefile, PFLASH is set to y, so pflash.img is attached when QEMU is started.

Without paging enabled

Code:

#![cfg_attr(feature = "axstd", no_std)]
#![cfg_attr(feature = "axstd", no_main)]

#[macro_use]
#[cfg(feature = "axstd")]
extern crate axstd as std;

use core::{mem, str};
use std::os::arceos::modules::axhal::mem::phys_to_virt;

/// Physical address for pflash#1
const PFLASH_START: usize = 0x2200_0000;

#[cfg_attr(feature = "axstd", no_mangle)]
fn main() {
    // Make sure that we can access pflash region.
    let va = phys_to_virt(PFLASH_START.into()).as_usize();
    let ptr = va as *const u32;
    unsafe {
        println!("Try to access dev region [{:#X}], got {:#X}", va, *ptr);
        let magic = mem::transmute::<u32, [u8; 4]>(*ptr);
        println!("Got pflash magic: {}", str::from_utf8(&magic).unwrap());
    }
}

This depends on the PFlash material in the next section: the code needs to access a peripheral, but before the remap only a 1G identity mapping exists and the peripheral is not mapped into the address space, hence the fault.

Initially only the physical range 0x8000_0000 to 0xC000_0000 is mapped.

The physical address accessed here is 0x2200_0000, which lies outside that range, so applying the linear offset yields a virtual address that has no mapping behind it.

The resulting log:

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

Try to access dev region [0xFFFFFFC022000000], got [ 2.002842 0 axruntime::lang_items:5] panicked at modules/axhal/src/arch/riscv/trap.rs:55:13:
Unhandled trap Exception(LoadFault) @ 0xffffffc080203fa6:
TrapFrame {
regs: GeneralRegisters {
ra: 0xffffffc08020376c,
sp: 0xffffffc0802499f0,
gp: 0x0,
tp: 0x0,
t0: 0x3f,
t1: 0x23,
t2: 0x5,
s0: 0xffffffc080249a80,
s1: 0xffffffc080249c60,
a0: 0xffffffc022000000,
a1: 0xffffffc080249a80,
a2: 0x0,
a3: 0xffffffc080203f9e,
a4: 0x2,
a5: 0xffffffc080202b76,
a6: 0xa,
a7: 0x1,
s2: 0x2,
s3: 0xffffffc080249bc0,
s4: 0xffffffc080249bf0,
s5: 0x38,
s6: 0xffffffc080205098,
s7: 0x2,
s8: 0x1,
s9: 0x24000,
s10: 0x2,
s11: 0xffffffc08026e000,
t3: 0x23,
t4: 0x3a,
t5: 0x5000,
t6: 0x55555555,
},
sepc: 0xffffffc080203fa6,
sstatus: 0x8000000000006100,
}

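Branch name: tour_u_3_0_no_paging

MAP and REMAP

An ArceOS unikernel maps its address space in two stages.
The boot stage enables an identity mapping of a 1G region by default;
if device MMIO ranges need to be supported, remapping is enabled by specifying the feature "paging".

As the previous section showed, a remap after boot is required before the device can be accessed.

So paging has to be turned on.

The initial linear page table

Two mappings are actually created, with the identity mapping serving as a springboard, because code must still be reachable at its original physical addresses right after the MMU is turned on.

#TODO

Building the multi-level page table afterwards

#TODO

PFlash

PFlash is an emulated flash device. When QEMU starts, it automatically loads the contents of the backing image file into a fixed MMIO region.

Reads need no driver, but writes do.

For now we only need reads, so it is enough that the load succeeds.

Physical address space

Peripherals are mapped into the physical address space.

Note: linker_riscv64-qemu-virt.lds determines the section layout.

Paging

As with rCore's three-level page table, our experiments also use SV39.

Paging stage 1 - identity mapping + offset

We want both sbi and kernel to live in the high part of the address space.

  1. At the start, map virtual addresses one-to-one onto physical addresses.
  2. Add an offset to the program counter pc and the stack pointer sp; the offset here is 0xffff_ffc0_0000_0000.

Physically, sbi still sits at 0x8000_0000 and kernel at 0x8020_0000.

In this stage a virtual address is just the physical address plus a constant, so a single linear offset implements the whole virtual-to-physical mapping.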
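The linear-offset arithmetic can be written down in a few lines. This is a simplified stand-in for the real axhal helper, not its actual code: plain u64 values replace the address types, and `mapped_at_boot` is a hypothetical helper that encodes the 0x8000_0000 to 0xC000_0000 boot range mentioned earlier.

```rust
/// Offset between physical addresses and their high-half virtual addresses.
const PHYS_VIRT_OFFSET: u64 = 0xffff_ffc0_0000_0000;
/// Physical range covered by the boot-time 1G mapping.
const BOOT_MAP_START: u64 = 0x8000_0000;
const BOOT_MAP_END: u64 = 0xC000_0000;

/// Linear mapping: virtual = physical + constant.
fn phys_to_virt(paddr: u64) -> u64 {
    paddr + PHYS_VIRT_OFFSET
}

/// Inverse of `phys_to_virt`.
fn virt_to_phys(vaddr: u64) -> u64 {
    vaddr - PHYS_VIRT_OFFSET
}

/// Hypothetical helper: is this physical address reachable before the remap?
fn mapped_at_boot(paddr: u64) -> bool {
    (BOOT_MAP_START..BOOT_MAP_END).contains(&paddr)
}
```

With these definitions, `phys_to_virt(0x2200_0000)` is `0xffff_ffc0_2200_0000`, the address printed in the logs, while `mapped_at_boot(0x2200_0000)` is false. The arithmetic always produces an address, but without paging nothing backs it, which is why the earlier run ends in a LoadFault.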

If no physical device needs to be accessed, this is already enough.

Paging stage 2 - rebuilding the mapping

The remap simply leaves sbi out of the virtual address space, since the kernel should no longer access sbi.

Different sections need different permissions (READ, WRITE, EXECUTE). The code section, for example, may only be read and executed, not written, so the rebuilt mapping does not grant it write permission.

This way the device address ranges can be mapped in as well, and permissions become finer-grained.

This still appears to be a linear mapping.

Multitasking

Requirement: enable multitasking, spawn a child task that does some concrete work on behalf of the main task, then return to the main task.

Concurrency means several tasks are waiting to use the CPU at the same time, not running at the same time. Parallelism means they really do run at the same time.

A good description of scheduling: it saves and restores the execution context, and the task does not notice.

Multitasking experiment

make run A=tour/u_4_0

Walkthrough of the task code:

#![cfg_attr(feature = "axstd", no_std)]
#![cfg_attr(feature = "axstd", no_main)]

#[macro_use]
#[cfg(feature = "axstd")]
extern crate axstd as std;

use core::{mem, str};
use std::thread;
use std::os::arceos::modules::axhal::mem::phys_to_virt;

/// Physical address for pflash#1
const PFLASH_START: usize = 0x2200_0000;

#[cfg_attr(feature = "axstd", no_mangle)]
fn main() {
    println!("Multi-task is starting ...");

    let worker = thread::spawn(move || {
        println!("Spawned-thread ...");

        // Make sure that we can access pflash region.
        let va = phys_to_virt(PFLASH_START.into()).as_usize();
        let ptr = va as *const u32;
        let magic = unsafe {
            mem::transmute::<u32, [u8; 4]>(*ptr)
        };
        if let Ok(s) = str::from_utf8(&magic) {
            println!("Got pflash magic: {s}");
            0
        } else {
            -1
        }
    });

    let ret = worker.join();
    // Make sure that worker has finished its work.
    assert_eq!(ret, Ok(0));

    println!("Multi-task OK!");
}

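As in the previous section, this reads the first few characters, pfld, of the pre-loaded pflash.img.

The only difference is that the work is done in a task created with spawn and awaited with join.

Task data structure

A key point is task_ext, the task's extension attributes, which is what makes the task structure usable for monolithic kernels and hypervisors.

Generic scheduling framework

It is layered, and the schedulers all implement the same interface, so you can decide for yourself which scheduling policy to use.

Built-in system tasks

  • GC: when any task (thread) other than main exits, gc reclaims and cleans it up.
  • IDLE: runs when all other tasks are blocked. On some architectures wait_for_irqs maps to a non-busy-wait instruction: it waits for an interrupt instead of spinning, and responds immediately when a new "event" (my own name for it) occurs.

MsgQueue

Goals of this section:

  1. The task-switching mechanism and the cooperative scheduling algorithm
  2. Synchronization, and how Mutex works

Running the experiment

Run make run A=tour/u_5_0: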

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

Multi-task is starting ...
Wait for workers to exit ...
worker1 ...
worker1 [0]
worker2 ...
worker2 [0]
worker2: nothing to do!
worker1 [1]
worker2 [1]
worker2: nothing to do!
worker1 [2]
worker2 [2]
worker2: nothing to do!
worker1 [3]
worker2 [3]
worker2: nothing to do!
worker1 [4]
worker2 [4]
worker2: nothing to do!
worker1 [5]
worker2 [5]
worker2: nothing to do!
worker1 [6]
worker2 [6]
worker2: nothing to do!
worker1 [7]
worker2 [7]
worker2: nothing to do!
worker1 [8]
worker2 [8]
worker2: nothing to do!
worker1 [9]
worker2 [9]
worker2: nothing to do!
worker1 [10]
worker2 [10]
worker2: nothing to do!
worker1 [11]
worker2 [11]
worker2: nothing to do!
worker1 [12]
worker2 [12]
worker2: nothing to do!
worker1 [13]
worker2 [13]
worker2: nothing to do!
worker1 [14]
worker2 [14]
worker2: nothing to do!
worker1 [15]
worker2 [15]
worker2: nothing to do!
worker1 [16]
worker2 [16]
worker2: nothing to do!
worker1 [17]
worker2 [17]
worker2: nothing to do!
worker1 [18]
worker2 [18]
worker2: nothing to do!
worker1 [19]
worker2 [19]
worker2: nothing to do!
worker1 [20]
worker2 [20]
worker2: nothing to do!
worker1 [21]
worker2 [21]
worker2: nothing to do!
worker1 [22]
worker2 [22]
worker2: nothing to do!
worker1 [23]
worker2 [23]
worker2: nothing to do!
worker1 [24]
worker2 [24]
worker2: nothing to do!
worker1 [25]
worker2 [25]
worker2: nothing to do!
worker1 [26]
worker2 [26]
worker2: nothing to do!
worker1 [27]
worker2 [27]
worker2: nothing to do!
worker1 [28]
worker2 [28]
worker2: nothing to do!
worker1 [29]
worker2 [29]
worker2: nothing to do!
worker1 [30]
worker2 [30]
worker2: nothing to do!
worker1 [31]
worker2 [31]
worker2: nothing to do!
worker1 [32]
worker2 [32]
worker2: nothing to do!
worker1 [33]
worker2 [33]
worker2: nothing to do!
worker1 [34]
worker2 [34]
worker2: nothing to do!
worker1 [35]
worker2 [35]
worker2: nothing to do!
worker1 [36]
worker2 [36]
worker2: nothing to do!
worker1 [37]
worker2 [37]
worker2: nothing to do!
worker1 [38]
worker2 [38]
worker2: nothing to do!
worker1 [39]
worker2 [39]
worker2: nothing to do!
worker1 [40]
worker2 [40]
worker2: nothing to do!
worker1 [41]
worker2 [41]
worker2: nothing to do!
worker1 [42]
worker2 [42]
worker2: nothing to do!
worker1 [43]
worker2 [43]
worker2: nothing to do!
worker1 [44]
worker2 [44]
worker2: nothing to do!
worker1 [45]
worker2 [45]
worker2: nothing to do!
worker1 [46]
worker2 [46]
worker2: nothing to do!
worker1 [47]
worker2 [47]
worker2: nothing to do!
worker1 [48]
worker2 [48]
worker2: nothing to do!
worker1 [49]
worker2 [49]
worker2: nothing to do!
worker1 [50]
worker2 [50]
worker2: nothing to do!
worker1 [51]
worker2 [51]
worker2: nothing to do!
worker1 [52]
worker2 [52]
worker2: nothing to do!
worker1 [53]
worker2 [53]
worker2: nothing to do!
worker1 [54]
worker2 [54]
worker2: nothing to do!
worker1 [55]
worker2 [55]
worker2: nothing to do!
worker1 [56]
worker2 [56]
worker2: nothing to do!
worker1 [57]
worker2 [57]
worker2: nothing to do!
worker1 [58]
worker2 [58]
worker2: nothing to do!
worker1 [59]
worker2 [59]
worker2: nothing to do!
worker1 [60]
worker2 [60]
worker2: nothing to do!
worker1 [61]
worker2 [61]
worker2: nothing to do!
worker1 [62]
worker2 [62]
worker2: nothing to do!
worker1 [63]
worker2 [63]
worker2: nothing to do!
worker1 [64]
worker2 [64]
worker2 ok!
worker1 ok!
Multi-task OK!

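Looking at the code: worker1 takes the double-ended queue held in the Arc and pushes an item onto its back.

Because scheduling is cooperative, worker1 yields after every push, so worker2 takes over and tries to print everything in the queue. When the queue is empty it reports worker2: nothing to do!, and worker1 gets the CPU again.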
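The same producer/consumer shape can be sketched on a host machine with std threads. This is an approximation rather than the u_5_0 source: std threads are preemptive, so the exact interleaving seen in the log is not guaranteed, and `run_workers` with its return value is a hypothetical wrapper added so the result can be checked. The bound of 64 matches the last index in the log.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Last index produced by worker1 (the log runs from [0] to [64]).
const LOOP_NUM: usize = 64;

/// Returns (items produced by worker1, items consumed by worker2).
fn run_workers() -> (usize, usize) {
    let queue = Arc::new(Mutex::new(VecDeque::new()));

    let q1 = Arc::clone(&queue);
    let worker1 = thread::spawn(move || {
        for i in 0..=LOOP_NUM {
            q1.lock().unwrap().push_back(i);
            // Give the consumer a chance to run, as the cooperative task does.
            thread::yield_now();
        }
        LOOP_NUM + 1
    });

    let q2 = Arc::clone(&queue);
    let worker2 = thread::spawn(move || {
        let mut received = 0;
        loop {
            // The lock guard is dropped at the end of this statement,
            // so the lock is never held while yielding.
            let item = q2.lock().unwrap().pop_front();
            match item {
                Some(n) => {
                    received += 1;
                    if n == LOOP_NUM {
                        break; // last item seen: "worker2 ok!"
                    }
                }
                // Queue empty: "nothing to do!", hand the CPU back.
                None => thread::yield_now(),
            }
        }
        received
    });

    (worker1.join().unwrap(), worker2.join().unwrap())
}
```

Because the queue is FIFO, seeing item 64 means everything before it has been consumed too, so both counts come out as 65 regardless of how the threads interleave.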

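Cooperative scheduling

Unlike rCore, the tasks are now kept in a List rather than an array, so in principle there is no fixed limit on the number of tasks.

Mutexes and spinlocks

A spinlock is probably the lock most of us picture: every access to the resource must go through the lock, and if it cannot be acquired, the task itself has to deal with that, typically by retrying in a loop.

A mutex adds a wait queue of its own: a task waiting for the resource is put on the queue and takes no part in scheduling until the resource is free, which saves a lot of the cost of pointless task switches.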
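For comparison, a spinlock needs nothing beyond one atomic flag. The sketch below is a minimal illustration on top of std, not the kernel's own spinlock; the `SpinLock` type, its `with_lock` method and `count_with_spinlock` are hypothetical names. A mutex of the kind described above would, at the point where this code spins, park the task on a wait queue instead.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Minimal spinlock: one flag plus the protected value.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safe to share because `with_lock` hands out the value to one thread at a time.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub fn new(value: T) -> Self {
        SpinLock {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Spins until the flag is acquired, runs `f`, then releases the flag.
    /// Note: if `f` panics, the lock stays held (acceptable for a sketch).
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // busy-wait: the waiter keeps using the CPU
        }
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments a shared counter from several threads under the spinlock.
fn count_with_spinlock(threads: usize, per_thread: usize) -> usize {
    let lock = Arc::new(SpinLock::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    lock.with_lock(|n| *n += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    lock.with_lock(|n| *n)
}
```

The busy loop is the cost the text refers to: a waiting task burns CPU time, whereas a task on a mutex wait queue is simply skipped by the scheduler.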

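Bump memory allocation algorithm

#TODO

Challenge assignment

initialize global allocator at: [0xffffffc08026f000, 0xffffffc088000000)

5376*2

The allocations with align 8 are the ones for Vec memory

131,665,920

1048576

524288+8000

#TODO

Contents

The experimental kernel run inside the virtual machine is u_3_0 from week one: it reads data from the pflash device and verifies the beginning.

There are two ways to handle this:

  1. Emulation mode - emulate a pflash for the VM, backed by file1. When the guest reads the device, serve the contents of file1.
  2. Passthrough mode - pass the pflash of the host physical machine (i.e. qemu) straight through to the VM.

Trade-offs: emulation mode can give different VMs different pflash contents but is slow; passthrough mode is fast but ties the VM to the device.

Experiment

Homework

Insights and practical experience with component-based kernels

Diverse application scenarios -> the emergence of multiple kinds of kernel

Unikernel -> trades safety for efficiency -> one kernel tailored to one app

A monolithic kernel is a typical operating system such as Linux

Microkernels are mainly about safety -> safety shown by formal verification -> repeated switches to and from user mode make them slow

Virtual machine monitor -> hypervisor -> multiple kernels, each believing it has the whole machine to itself

The focus is on quickly building heterogeneous kernels in a component-based setting. Understand the concepts and the advantages.

Different requirements call for different kernels -> different components are used to build different kernels

A monolithic kernel + hypervisor architecture could also achieve this, but it creates a performance bottleneck.

By changing how a few unikernel parts are connected and adding a monolithic-kernel plugin, the system becomes a monolithic kernel.

By having the unikernel call into a hypervisor plugin, it becomes a hypervisor system.

The points above are in fact where the advantages lie.

Importance of the BACKBONE layer: put what is common into the lower layer.
Extending TASK: treat a task as a collection of kernel resources.

Future work: generic extensions, introducing different types of extension at once -> possibly even heterogeneous kernels

Review and outlook

Earlier we did a series of experiments that built up the unikernel framework.

Starting from the unikernel form, we add some components to cross this boundary and build a monolithic kernel.

From Unikernel to monolithic kernel

Cross the mode boundary and implement a simple system call.

Adding a user privilege level and privilege-level context switching is the key to becoming a monolithic kernel.

Experiment 1

rust-analyzer cannot analyse the code properly unless "rust-analyzer.cargo.target": "riscv64gc-unknown-none-elf" is added to .vscode/settings.json.

Command line for the experiment: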

make payload
./update_disk.sh ./payload/origin/origin
make run A=tour/m_1_0 BLK=y

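If make payload cannot be run, the code is too old: first git fetch origin, then git merge origin into the current branch.

If make payload fails with Error, the cause is almost certainly that the musl environment variables are not configured. Check ~/.bashrc, and remember to run source ~/.bashrc after updating it.

./update_disk.sh ./payload/origin/origin was a rather magical step for someone like me who had never done this before. What it actually does is mount disk.img into the Linux filesystem and then copy the application file into it with ordinary Linux commands.

After that, make run A=tour/m_1_0 BLK=y works just like the experiment in the previous lecture.

The output is: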

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

[ 21.794824 0 fatfs::dir:139] Is a directory
[ 22.065035 0 fatfs::dir:139] Is a directory
[ 22.359963 0 fatfs::dir:139] Is a directory
[ 22.490439 0 fatfs::dir:139] Is a directory
app: /sbin/origin
paddr: PA:0x80642000
Mapping user stack: VA:0x3fffff0000 -> VA:0x4000000000
New user address space: AddrSpace {
va_range: VA:0x0..VA:0x4000000000,
page_table_root: PA:0x80641000,
}
Enter user space: entry=0x1000, ustack=0x4000000000, kstack=VA:0xffffffc080697010
handle_syscall ...
[SYS_EXIT]: process is exiting ..
monolithic kernel exit [Some(0)] normally!

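Let's look at the origin app:

#[no_mangle]
unsafe extern "C" fn _start() -> ! {
    core::arch::asm!(
        "addi sp, sp, -4",
        "sw a0, (sp)",
        "li a7, 93",
        "ecall",
        options(noreturn)
    )
}

This is easy to follow: it pushes a0 onto the stack and then invokes syscall number 93 (SYS_EXIT on riscv64).

Exercise

The main thing is to understand the populate option of AddrSpace's map_alloc.

Drawing on what I learned in rCore, I read the source; our structure looks like this.

[Figure: image not recovered]

That is, a generic Backend has to be passed in when a MemoryArea is created.

This is presumably related to lazy loading of pages.

The call chain ends in map_alloc in modules/axmm/src/backend/alloc.rs; because of the layers of abstraction, I mapped each parameter back to the variable names originally used in tour/m_1_0/src/main.rs.

[Figure: image not recovered]

The key code is:

if populate {
    // allocate all possible physical frames for populated mapping.
    for addr in PageIter4K::new(start, start + size).unwrap() {
        if let Some(frame) = alloc_frame(true) {
            if let Ok(tlb) = pt.map(addr, frame, PageSize::Size4K, flags) {
                tlb.ignore(); // TLB flush on map is unnecessary, as there are no outdated mappings.
            } else {
                return false;
            }
        }
    }
    true
} else {
    // Map to a empty entry for on-demand mapping.
    let flags = MappingFlags::empty();
    pt.map_region(start, |_| 0.into(), size, flags, false, false)
        .map(|tlb| tlb.ignore())
        .is_ok()
}

If populate is true, memory is allocated immediately, one 4K frame at a time, and each virtual address is mapped to its freshly allocated frame in the page_table.

If populate is false, the virtual address is simply mapped to 0, an invalid physical address, with empty flags.

In that case the physical page has to be allocated later, at the moment the virtual address is actually accessed.

Accessing the address then raises a page fault. In this app the first such access is the sw in _start, which writes to the lazily mapped user stack.

Now run the application: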

make payload
./update_disk.sh payload/origin/origin
make run A=tour/m_1_0/ BLK=y

This is the corresponding log:

OpenSBI v0.9
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count : 1
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000080000000-0x000000008001ffff ()
Domain0 Region01 : 0x0000000000000000-0xffffffffffffffff (R,W,X)
Domain0 Next Address : 0x0000000080200000
Domain0 Next Arg1 : 0x0000000087000000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes

Boot HART ID : 0
Boot HART Domain : root
Boot HART ISA : rv64imafdcsu
Boot HART Features : scounteren,mcounteren,time
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109

d8888 .d88888b. .d8888b.
d88888 d88P" "Y88b d88P Y88b
d88P888 888 888 Y88b.
d88P 888 888d888 .d8888b .d88b. 888 888 "Y888b.
d88P 888 888P" d88P" d8P Y8b 888 888 "Y88b.
d88P 888 888 888 88888888 888 888 "888
d8888888888 888 Y88b. Y8b. Y88b. .d88P Y88b d88P
d88P 888 888 "Y8888P "Y8888 "Y88888P" "Y8888P"

arch = riscv64
platform = riscv64-qemu-virt
target = riscv64gc-unknown-none-elf
smp = 1
build_mode = release
log_level = warn

[ 21.690418 0 fatfs::dir:139] Is a directory
[ 21.963457 0 fatfs::dir:139] Is a directory
[ 22.252957 0 fatfs::dir:139] Is a directory
[ 22.383790 0 fatfs::dir:139] Is a directory
app: /sbin/origin
paddr: PA:0x80642000
Mapping user stack: VA:0x3fffff0000 -> VA:0x4000000000
New user address space: AddrSpace {
va_range: VA:0x0..VA:0x4000000000,
page_table_root: PA:0x80641000,
}
Enter user space: entry=0x1000, ustack=0x4000000000, kstack=VA:0xffffffc080687010
[ 23.235085 0:4 axhal::arch::riscv::trap:24] No registered handler for trap PAGE_FAULT
[ 23.319751 0:4 axruntime::lang_items:5] panicked at modules/axhal/src/arch/riscv/trap.rs:25:9:
Unhandled User Page Fault @ 0x1002, fault_vaddr=VA:0x3ffffffffc (WRITE | USER):
TrapFrame {
regs: GeneralRegisters {
ra: 0x0,
sp: 0x3ffffffffc,
gp: 0x0,
tp: 0x0,
t0: 0x0,
t1: 0x0,
t2: 0x0,
s0: 0x0,
s1: 0x0,
a0: 0x0,
a1: 0x0,
a2: 0x0,
a3: 0x0,
a4: 0x0,
a5: 0x0,
a6: 0x0,
a7: 0x0,
s2: 0x0,
s3: 0x0,
s4: 0x0,
s5: 0x0,
s6: 0x0,
s7: 0x0,
s8: 0x0,
s9: 0x0,
s10: 0x0,
s11: 0x0,
t3: 0x0,
t4: 0x0,
t5: 0x0,
t6: 0x0,
},
sepc: 0x1002,
sstatus: 0x40020,
}

How to handle it: the implementation in tour/m_2_0:

#[register_trap_handler(PAGE_FAULT)]
fn handle_page_fault(vaddr: VirtAddr, access_flags: MappingFlags, is_user: bool) -> bool {
    if is_user {
        if !axtask::current()
            .task_ext()
            .aspace
            .lock()
            .handle_page_fault(vaddr, access_flags)
        {
            ax_println!("{}: segmentation fault, exit!", axtask::current().id_name());
            axtask::exit(-1);
        } else {
            ax_println!("{}: handle page fault OK!", axtask::current().id_name());
        }
        true
    } else {
        false
    }
}

This mainly calls the page-fault handling method of aspace, i.e. the current task's address space.

Just as with the Backend map method analysed in the previous section, this ends up calling the Backend remap method.

It allocates a frame on the spot and remaps the faulting virtual address va to that frame.
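The interplay between populate and the page-fault handler can be modelled in a few lines. The sketch below is a toy model, not the axmm code: the page table is a HashMap, frames are just counters, and `AddrSpace`, `map_alloc`, `handle_page_fault` and `is_backed` are simplified stand-ins that borrow the names used in the text.

```rust
use std::collections::HashMap;

const PAGE_SIZE: u64 = 0x1000;

/// Toy address space: page-aligned vaddr -> Some(frame) if backed by memory,
/// None if the page was mapped lazily (populate == false).
struct AddrSpace {
    pages: HashMap<u64, Option<u64>>,
    next_frame: u64,
}

impl AddrSpace {
    fn new() -> Self {
        AddrSpace { pages: HashMap::new(), next_frame: 0 }
    }

    /// Stand-in for a physical frame allocator.
    fn alloc_frame(&mut self) -> u64 {
        let frame = self.next_frame;
        self.next_frame += 1;
        frame
    }

    /// Maps [start, start + size); allocates frames now only if `populate`.
    fn map_alloc(&mut self, start: u64, size: u64, populate: bool) {
        for vaddr in (start..start + size).step_by(PAGE_SIZE as usize) {
            let entry = if populate { Some(self.alloc_frame()) } else { None };
            self.pages.insert(vaddr, entry);
        }
    }

    /// Returns true if the fault hit a lazily mapped page and was repaired.
    fn handle_page_fault(&mut self, vaddr: u64) -> bool {
        let page = vaddr & !(PAGE_SIZE - 1);
        match self.pages.get(&page).copied() {
            Some(None) => {
                let frame = self.alloc_frame();
                self.pages.insert(page, Some(frame));
                true
            }
            // Unmapped address, or a page that is already backed.
            _ => false,
        }
    }

    fn is_backed(&self, vaddr: u64) -> bool {
        let page = vaddr & !(PAGE_SIZE - 1);
        matches!(self.pages.get(&page), Some(Some(_)))
    }
}
```

Using the addresses from the log: after `map_alloc(0x3f_ffff_0000, 0x1_0000, false)` the stack page containing 0x3f_ffff_fffc is not backed; `handle_page_fault(0x3f_ffff_fffc)` returns true and the page becomes backed, while a fault on an address that was never mapped returns false, which corresponds to the "segmentation fault, exit!" branch.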

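Detailed write-up

See the separate post on writing the ArceOS igb NIC driver.

Reflections

Much of the early time went into flow control and filters, because igb-drive and ixgbe-drive are abstracted differently, especially in how the ring and the DMA memory are structured.

For example, the ring initialisation has to be adapted for both Tx and Rx. I spent a long time reading documentation, yet did the actual abstraction poorly.

In the end igb-drive got stuck at the point where, after building the Descriptor, I wanted to write Tail to transmit but could not work out how to perform the send.

Solving concrete problems was hard.

Later I saw that someone in the group chat had successfully adapted ixgbe-drive. By studying that, I was also able to get a working httpserver by modifying the local ixgbe driver.

I also noticed that some people based their work on an older version of the Linux driver, while I used a newer one, which points to a flaw in my strategy.

Next steps

Whatever the outcome this time, I still want to join the next session, try to finish writing the igb driver myself, and publish it on my blog.

Stage 1 - Rustlings

This stage is mainly about learning Rust.

I had been exposed to Rust long before, but the syntax, especially ownership and lifetimes, was still unfamiliar.

For ownership I mainly watched this video: 如何一步一步推导出Rust所有权、借用规则、引用类型以及秒懂生命周期标注_哔哩哔哩_bilibili

One point from it: as a language without GC, Rust keeps most data on the stack, and only a small part on the heap.

'a indicates that the returned reference is tied to the input parameters x and y.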
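A small example shows what that annotation expresses. The function name `longest` and its body are illustrative choices of mine, not taken from the video.

```rust
/// The returned reference is valid only as long as both `x` and `y` are,
/// which is exactly what the shared lifetime parameter 'a states.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() {
        x
    } else {
        y
    }
}
```

Without 'a the compiler cannot tell which input the result borrows from, so it rejects the signature; with it, callers must keep both arguments alive for as long as they use the result.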

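This is very, very important.

Stage 2 - OS kernel labs

ch0-2

See my earlier notes on reproducing the rCore-Tutorial-Book-v3 3.6.0-alpha.1 documentation, published on 博客园 (cnblogs).

winddevil的笔记 - 博客园

This round of study mainly covered ch3~ch8, with the focus shifted to passing the test cases and resolving some questions of my own.

ch3

Mainly covers the task-switching flow; with task switching in place, preemptive scheduling is then implemented through the timer interrupt.

[Figure: image not recovered]

ch4

This chapter is about translating virtual addresses into physical addresses. I used to think "virtual address" meant the address had really been changed; in fact every access to a resource still ends up using a physical address, and the address space is the means by which every object that needs to reach a concrete physical address is handled.

[Figure: image not recovered]

ch5

This chapter builds on the previous one and is mainly about the concept of a process. Many new structs are added; I will probably update the figures when I have time.

The most ingenious part of processes is fork, which duplicates a task. With processes, programming becomes simpler: I only need to write small tasks and then combine them, rather than calling fn and designing all the branching myself.

With processes, the work of scheduling is effectively delegated to the OS.

ch6

Building on the previous chapter, this one introduces block devices and the filesystem.

My grasp of this part is quite shaky.

What impressed me, though, is that this chapter essentially answered a question I had when I first started learning: how Windows lets you simply "click open" an application.

Where is the application stored, and why can it still run after I copy it with a USB drive?

When I was learning microcontrollers I rarely thought about what I could use to manipulate something "executable".

Loading a binary file and then running it really amazed me.

ch7

This chapter is more closely related to ch4. When I ran an RTOS I always wondered: this variable could simply be passed as a global, so why use things like semaphores or mailboxes? Now it is obvious.

Because address spaces differ, communication between processes has to go through pipes, that is, through the operating system layer.

ch8

This part reminded me of critical-section protection in microcontroller programming, where the very brute-force approach was to disable all interrupts so that the read could not go wrong.

Or to use atomic operations, so that an interrupt cannot break into an operation that completes within a single clock.

Here we did not deal with hardware or interrupts; instead three mechanisms are used: locks, condition variables, and semaphores.

Deadlock detection uses the banker's algorithm. The algorithm is not hard, but calling it is tedious: the variables from the assignment have to be updated on every lock operation, and since every lock has to run detect, the locking code itself has to be modified as well.

This experience

The further along, the busier I got. If I am lucky enough to reach the next stage, I must not overreach by juggling too many things in parallel, and must leave myself enough time.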