总结
第二阶段的强度慢慢上来了,由于平时要上班和前期收不到短信验证码的问题,不能跟上每周的直播课,主要通过学习tutorial文档和课后看录频和课件的方式学习。本阶段对内存管理,进程管理和文件系统, 线程进行了系统的学习。特别是内存映射和文件系统章节,学习到很多。同时,随着代码量增大,学习的同时画UML图对理解代码和整个框架也很有帮助。
第一次写rust,被所有权干烂了,不停和编译器搏斗+不停拷问AI终于磕磕绊绊写完了,以前习惯于C++的引用和指针乱飘了,写rust确实比C和C++安全,因为有编译器的所有权机制保障。
因为以前写过NJUPA(极其推荐),所以对上下文切换有一点概念。rcore-os的上下文切换采用了xv6类似的方式,在内核内部做控制流切换来切换执行的用户程序。对于操作系统中常用的锁,rcore采用了单核中的使用rust所有权机制来实现,上锁的时候将所有权转交给函数,尝试二次获取的时候就会因为所有权引发rust错误,避免了内核的死锁的情况。cargo的编译链接功能确实比GNU那一套方便,rust的属性语法支持也比C++要好很多。
地址空间和页表中极多的用来表示地址组成的结构体,充分体现了rust面向对象的特性和into
的语法,增加了代码的可读性,以前用C系的写也就使用各种宏。rcore用了额外的BTreeMap来支持快速查询页表。作业题的实现也大多是调用现有的函数就差不多可以了。
到文件系统这里比较新鲜,因为以前没有怎么接触过,文件系统的实现层层嵌套,层层抽象,一个功能实现往往要修改多层,理解起来比较困难。
并发的作业是用Coffman提出的方法做死锁检测,开始的时候理解成了银行家算法,读作业提示的时候也没有读懂,卡了很长时间,在群u的提示下才通过。后来在上操作系统课的时候翻到书上才知道这是Coffman的死锁检测算法(书上的描述其实和作业里写得差不多)。
第三阶段换了个内核模式叫unikernel,不太习惯这样的组件化操作系统,还是习惯以前的传统宏内核结构。
print_with_color很简单,理解一下arceos提供的axstd模块,为内核应用提供了运行时环境。
support_hash_map在理解了rust可以自定义堆分配器后,唯一要解决的就是rust中如何得到一个hashable值的hash值,然后做简单的映射即可,测试不需要特别追求效率。
alt_alloc主要是实现一下需要的接口,了解如何向arceos添加自己的分配器,bump allocator
可以说是最简单的分配器了。
ramfs_rename要把cargo里的包换成arceos下面的(不然文件系统的包永远是网上拉下来的),找到对traitVfsOps
和VfsNodeOps
的实现之后,搞清楚记录文件的数据结构,实现rename
就可以了。
sys_map同样搞清楚task_ext
提供的接口就可以直接调库了,第一次实现忘记输入地址是NULL
的话要由内核自己找地址,遂发现了一个叫做find_free_area
的妙妙工具。
simple_hv最简单的一次,同样的由操作系统代为执行一些指令的手法可以用来帮用户程序加载一些未对齐的数据(如果处理器不支持)。
经过这一阶段对操作系统核心内容的系统学习和实践,我对操作系统的本质、关键机制以及与 Rust 结合的优势有了更深刻的理解。以下是我的主要体会和总结:
操作系统是管理硬件资源、为上层应用提供抽象和隔离的基础软件。它的核心任务包括:
之前对操作系统有一定理论基础,rcore 和 arceos 项目对我最大的挑战主要包括:
unikernel 架构是没有特权级切换的,应用程序也运行在 s 态。刚开始没有仔细理解 ppt,给我造成了挺大的困扰。
hashmap 的实验我并没有自己手写代码,而是直接引入了 hashbrown 库。但手撕一下代码应该能更加锻炼能力。
此外,hypervisor 给我带来了挺大的困难,参考其他同学的经验我才得以通过。
We need to place multiple app to multiple memory address to run app in cycle. Rather run once and clear for next.
First We want to place each app to each isolated addr, due to our kernel restriction, we need to load it with build.py
.
Task: a workflow process
Define every time slice of Task as Task Slice
Define the switch between app as Task switching
We need to store Task Context
Design:
We will store these register in ctx:
1 | // os/src/task/context.rs |
1 | .altmacro |
Expose to Rust
1 | // os/src/task/switch.rs |
We will design TaskManager
:
TaskContext
and TaskState
for running or exited etc…__restore
ctx to TaskContext
1 | let current_task_cx_ptr = &mut inner.tasks[current].task_cx as *mut TaskContext; |
Manually design interface yield
for App to use
1 | pub fn sys_yield() -> isize { |
But it can be inefficient for some case that app already done its work but reluctant to exit.
We will design interrupt clock in a fixed time bound to force switch between app.
ecall
You should know this as a pre-knowledge:
1 | const SBI_SET_TIMER: usize = 0; |
It corresponds to riscv
:
S
(guaranteed by Supervisor Execution Environment
of RustSBI
)U
(constructed in current chapter as Application Execution Environment
)Reason:
Workflow:
riscv
designs following CSR
(Control and Status Register) to handle this:
Begin Trap:
SPP
seg to the current level of CPU.stvec is a 64-bit CSR, with:
- MODE(Direct/Vectored)
[1:0]
(read from right to left): 2-bits- BASE
[63:2]
: 62-bits
finally, it will return by instruction sret
which will change level and jump by sepc
.
Design:
CSR
.Construct:
KernelStack
and UserStack
for separationKernelStack
, we store TrapContext
in it, by asm and rust to control dispatch and handle, then store the code to stvec
as the entry of Trap.UserStack
by push a new context refer to UserStack
.build stack and push context:
1 | // stack struct ... |
1 | // os/src/trap/context.rs |
We will design __alltrap
and __restore
for operation by asm and part of rust:
1 | .altmacro |
To handle Trap context, we will use riscv
lib:
1 | // os/Cargo.toml |
restore operation:
1 | extern "C" { fn __restore(cx_addr: usize); } |
AppManager
to maintain and storeAPP_BASE_ADDRESS
(Currently we have no ability to dynamically read address)1 | # os/src/link_app.S |
Design it!
1 | // os/src/batch.rs |
Load App:
1 | // part of code of copying to kernel |
Run each app!
1 | // os/src/batch.rs |
We don’t want a fixed physical addr for allocation, rather, we want a unified abstract interface for dynamic memory layout for app storage. We call it Virtual Address
Safety: Every app will be allocated in its own virtual memory space, so each can’t interfere others.
Efficiency: Every app can coexist in same time without demand of reading outer peripherals to switch app.(With development of memory size)
We need MMU(Memory Management Unit) to achieve Address Translation for interview from virtual to physical.
Different designs occur.
Each app exist in one fixed slot for one segment as $[0,bound)$, with a linear map by MMU.
Problem: Wasteful and inflexible
We may want a different linear map for each app, for example, its allocation for heap, data, code etc… So we can dispatch memory in more finer style, but it can’t resolve the problem because now even slot is dynamically allocated, it may still exist some free memory too small to reuse, cause the External Fragment rather than Internal Fragment which is the problem due to fixed slot.
We could set a Protection bit as r
for read, w
for write, x
for execution.
Another way is to inverse our mind, rather take a slot on memory, we take slot on MMU, it can map its slot(now we call it Page) for real physical memory layout. To adjust, we can take slice for Page to form Frame which is a smaller unit to suit physical memory layout, each app can take many Page Number for a Page Table, record the map.
SV39 only use lower 39 bits rather whole 64 bits in design even bit width is 64 bits(it’s a fairly large amount!)
In a address, $[11:0]$ called Page Offset, and $[38:12]$ is the VPN which will be used for location of page and use offset to direct in one page(with 4KiB in one page).
We should modify satp
to open Paging for S and U-level memory.
1 | const PAGE_SIZE: usize = 4096 |
Page Size and offset to slice physical addr.
1 | // os/src/mm/address.rs |
Page Entry to bundle permission and physical page.
1 | pub fn ppn(&self) -> PhysPageNum { |
Usually, a simple design for page manager would be a linear map from base addr and follow up. But actually it will take a huge amount of memory due to the storage of offset by base addr for each app.
A finer design is from Trie. We will take index slice for each 8 bits(it will fit in to 4KB just right!), it means for each slice has 512 states, and link those state up, form a tree. Usually with 3-level for Page Index. Total 27 bits.
Virtual Page will reserve 27 bits for the index slice and 12 bits for offset for certain page. Total 39 bits.
When we transform a virtual addr to physical one, we will do the following exposition reversely.
A simple allocation for page(rather management!) is stack style.
1 | pub struct StackFrameAllocator { |
Based on Allocation, we can design Page Table.
1 | // os/src/mm/frame_allocator.rs |
It means for one physical page, we will record each allocation by vector of FrameTracker as a representation of real Frame after the root.
We should design transformation from virtual addr to physical addr.
1 | // for each idxs represent index slices of virtual addr. |
Based on our abstraction, we need a design for MapArea
to given a map from continuous virtual address(no matter their corresponding page!) to physical address.
1 | // os/src/mm/memory_set.rs |
Based on continuous virtual address map, we can define discontinuous map for one app.
1 | // os/src/mm/memory_set.rs |
In Each MapArea
allocated for some key-value for virtual-physical addr, it will allocate the same for PageTable
for Frame.
To open SV39, we should write instruction for satp
:
1 | // os/src/mm/page_table.rs |
Therefore, it will fill current root of physical page number as activation.
Notice that we should make instruction contigeous for SV39
open in physical address to amend the gap of transformation of address before and after open.
Initiation for Kernel memory layout.
1 | let mut memory_set = Self::new_bare(); |
Initiation for App memory layout.
Previously, we will cut off elf
part of binary of apps, now we load it and extract useful infos, s.t. Permissions.
MemorySet
should allocate storage of execution code with its permissions, allocate user stack and trap context at top of the memory layout.
Output MemorySet
, user stack top, entry point addr.
1 | let mut max_end_vpn = VirtPageNum(0); |
A problem is that separation of Kernel and App will also isolate Trap, the process need info from App to Kernel but App can’t see it. Therefore, we need a transfer operation. We achieve this by storing related info in TrapContext
.
(Because there’s no more register to store these without breaking state like sscratch
designed for kernel stack.)
1 | // os/src/trap/context.rs |
Notice that we also need to modify below to trigger satp
and specify corresponding(U-level, S-level satp
) physical page number to change state.
1 | # __alltraps: |
To amend the problem of contigeous instructions after switch, we need to adjust memory layout for trampoline
which is in the same location in U-level and S-level(unified for all app to trap!). It will be placed at highest virtual page.
1 | # os/src/linker.ld |
We modify rather raw handler and restore, due to virtual address, we need to adjust it for trampoline
rather the address we had code!(it’s virtual!)
1 | #[no_mangle] |
Then map the virtual address for it up to the physical address for unifying.
1 | // os/src/config.rs |
We should adjust TaskControlBlock
for the same reason, record each Trap
.
1 | // os/src/task/task.rs |
It will read data getting from elf, get trap contexr ppn, its kernel stack bottom and top, and then initiate trap context.
Here the part of task control initiation:
1 | impl TaskContext { |
The execution environment is defined by the Target Triplet, which specifies the platform, CPU architecture, and library required for the build. For example: x86_64-unknown-linux-gnu
.
Components of the Target Triplet:
If the target platform contains no std
or any support syscall, such platform called bare-metal, Rust
contains a core
lib independent of any platform support.
If we change .cargo/config
s.t.:
1 | # os/.cargo/config |
it called cross compile because the running platform is different form execution platform.
The basic functionality provided by std
and start
semantic is panic_handler
and main
entry.
To toggle it off with:
1 | #![no_std] |
As for riscv, thing will be tough in here, we need to complete our own entry point, exit, and basic functionality like print/println
.
First, we need to define linker
and entry
for stack allocation.
Linker:
1 | # os/src/linker.ld |
Stack Space:
1 | # os/src/entry.asm |
For riscv, we need to call RustSBI
(a underlying specification for rust in riscv).
After implement sbi_call
, we could construct put_char
:
1 | const SBI_CONSOLE_PUTCHAR: usize = 1; |
With a formal interface for write
:
1 | struct Stdout; |
Now we construct basic functionality in println
, you could also handle panic_handler
and others…
After the demand of code execution one by one, we want to introduce Process which will be a full space-time description of execution process of binary file in OS. It means for one process, it should hold independent resources to be executed.
After Process, Thread and Coroutine are also developed in growth of OS. They are different in resources taken up, usually, Thread will be in one process and hold their own independent stack and workflow; Coroutine will be in one Thread and hold their own independent workflow.
Every process need independent memory layout, can be dispatch by cpu. It’s the functionality based on Task, after that, each process can Fork their own children processes, so there’s a workflow in time discrepancy. Its resource can’t be recycled in time due to children processes, we need to mark it as Zombie Process.
To clarify which is children, which is parent, and each isolated process, we mark each with PID-Process Identifier. Notice if we fork a process, it will be same as parent only a0
which is the register called for return will be different, parent process return new PID of child process, child process return 0 as none of fork.
sp
etc…) as its child process.We will recycle all presumed pid by PidAllocator
, No need to worry about previous pid used.
1 | pub struct PidHandle(pub usize); |
Therefore, if one pid recycled, it deallocated memory can be reused, we will define its KernelStack
addr by pid.
1 | // os/src/task/pid.rs |
We need to construct TaskControlBlock
for parent and children process.
1 | pub struct TaskControlBlockInner { |
TaskManager
manage all tasks and cpu dispatch, we will separate only tasks management for TaskManager
.
1 | // os/src/task/manager.rs |
And cpu dispatch for newly introduced Processor.
We introduce a idle process that used to call other process.
The whole workflow would be:
1 | pub struct Processor { |
Previously, we use suspend_current_and_run_next
to pause task and switch to next, now we need to adapt it to process design.
1 | // os/src/task/mod.rs |
In previous case, task won’t be created by its parent task, but process will. So, if its TrapContext
has been recycled, we need to refactor our trap_handler
for such case.
1 | // fn trap_handler() -> ! |
Now we will construct fork
, exec
, waitpid
,exit
.
We need to copy all memory layout and its task state. Then reallocate new kernel stack for it.
1 | // impl MemorySet |
Finally, implement sys_fork
1 | pub fn sys_fork() -> isize { |
We can see that if trap_handler
call sys_fork
, parent process x[10]
would be new_pid
as return value.
If we want to execute a task by its name, we need to first load string in app load.
1 | // os/build.rs |
1 | # link_app.S |
Construct APP_NAMES
as global state in OS.
1 | // os/src/loader.rs |
When execute a new binary file, we need to read it and extract all state to replace original one.
1 | // os/src/task/task.rs |
We will read input str
as a ptr and replace current task state.
1 | // os/src/syscall/process.rs |
When a task exit, it will return a exit code assigned by app if successfully or kernel if anomaly.
1 | // fn exit_current_and_run_next(exit_code:i32) |
WaitPid will return -1
if there’s no specified pid process exist, if it’s running, return -2
, finally, if it finished, recycle it and return 0
.
1 | // sys_waitpid(pid:uisze, exit_code_ptr:*mut i32) -> isize |