C vs. Rust: Which to choose for programming hardware abstractions

Using type-level programming in Rust can make hardware abstractions safer.

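Rust is an increasingly popular programming language positioned to be the best choice for hardware interfaces. It is often compared to C for its level of abstraction. This article explains how Rust can handle bitwise operations in a number of ways and offers a solution that provides both safety and ease of use.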

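| Language | Origin | Official description | Overview |
|---|---|---|---|
| C | 1972 | C is a general-purpose programming language which features economy of expression, modern control flow and data structures, and a rich set of operators. (Source: [CS Fundamentals][2]) | C is (an) imperative language and designed to compile in a relatively straightforward manner which provides low-level access to the memory. (Source: [W3schools.in][3]) |
| Rust | 2010 | A language empowering everyone to build reliable and efficient software (Source: [Rust website][4]) | Rust is a multi-paradigm system programming language focused on safety, especially safe concurrency. (Source: [Wikipedia][5]) |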

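Bitwise operation over register values in C

In the world of systems programming, where you may find yourself writing hardware drivers or interacting directly with memory-mapped devices, interaction is almost always done through memory-mapped registers provided by the hardware. You typically interact with these registers through bitwise operations on some fixed-width numeric type.

For example, imagine an 8-bit register with three fields: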

+----------+------+-----------+---------+
| (unused) | Kind | Interrupt | Enabled |
+----------+------+-----------+---------+
   5-7       2-4        1          0

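The number below each field name gives the bits that field uses in the register. To enable this register, you would write the value 1, represented in binary as 0000_0001, to set the Enabled field's bit. Often, though, the register already holds a configuration that you don't want to disturb. Say you want to enable interrupts on the device but also want to be sure the device remains enabled. To do that, you must combine the Interrupt field's value with the Enabled field's value. You would do that with bitwise operations: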

1 | (1 << 1)

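This gives you the binary value 0000_0011 by or-ing 1 with 2, which you get by shifting 1 left by 1. You can write this to your register, leaving it enabled but also enabling interrupts.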
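
The arithmetic above can be sketched in a few lines of Rust. This is only an illustration: the names `ENABLED`, `INTERRUPT`, and `enable_with_interrupt` are made up for this example, and the register is modeled as a plain `u8` value rather than a real memory-mapped address.

```rust
// Bit positions taken from the register diagram above.
const ENABLED: u8 = 1 << 0; // 0000_0001
const INTERRUPT: u8 = 1 << 1; // 0000_0010

/// Returns the current register value with Enabled and Interrupt set,
/// leaving every other bit as it was.
fn enable_with_interrupt(current: u8) -> u8 {
    current | ENABLED | INTERRUPT
}
```

Starting from an all-zero register, `enable_with_interrupt(0)` yields `0b0000_0011`, the same value as `1 | (1 << 1)`.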

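That is a lot to keep in your head, especially when you're dealing with potentially hundreds of registers for a complete system. In practice, you do this with mnemonics that track a field's position in a register and how wide the field is (that is, what is its upper bound?).

Here's an example of one of these mnemonics. They are C macros that replace their occurrences with the code on the right-hand side. This is the shorthand for the register laid out above. The left-hand side of the & puts you in position for that field, and the right-hand side limits you to only that field's bits: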

#define REG_ENABLED_FIELD(x) (((x) << 0) & 1)
#define REG_INTERRUPT_FIELD(x) (((x) << 1) & 2)
#define REG_KIND_FIELD(x) (((x) << 2) & (7 << 2))

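You would then use these to abstract over deriving a register's value with something like:

#include <stdint.h>

void set_reg_val(uint8_t *reg, uint8_t val);

void enable_reg_with_interrupt(uint8_t *reg) {
    set_reg_val(reg, REG_ENABLED_FIELD(1) | REG_INTERRUPT_FIELD(1));
}

This is the state of the art. In fact, this is how the bulk of drivers appear in the Linux kernel.

Is there a better way? Consider the boon to safety and expressibility if the type system were borne out of research on modern programming languages. That is, what could you do with a richer, more expressive type system to make this process safer and more tenable?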

Bitwise operation over register values in Rust

Continuing with the register above as an example:

+----------+------+-----------+---------+
| (unused) | Kind | Interrupt | Enabled |
+----------+------+-----------+---------+
   5-7       2-4        1          0

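How might you want to express such a thing in Rust types?

You'll start in a similar way, by defining constants for each field's offset (that is, how far it is from the least significant bit) and its mask. A mask is a value whose binary representation can be used to update or read the field from inside the register: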

const ENABLED_MASK: u8 = 1;
const ENABLED_OFFSET: u8 = 0;

const INTERRUPT_MASK: u8 = 2;
const INTERRUPT_OFFSET: u8 = 1;

const KIND_MASK: u8 = 7 << 2;
const KIND_OFFSET: u8 = 2;
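
To make the "read" half of that definition concrete, here is a small sketch of pulling a field's value back out of a raw register value. The helper name `read_field` is hypothetical; only the constants come from the listing above.

```rust
const ENABLED_MASK: u8 = 1;
const ENABLED_OFFSET: u8 = 0;

const KIND_MASK: u8 = 7 << 2;
const KIND_OFFSET: u8 = 2;

/// Extracts a field: keep only the field's bits, then shift them down
/// so the result is the field's own value.
fn read_field(reg: u8, mask: u8, offset: u8) -> u8 {
    (reg & mask) >> offset
}
```

For a register holding `0b0001_0101`, `read_field(reg, KIND_MASK, KIND_OFFSET)` gives 5 (binary 101) and `read_field(reg, ENABLED_MASK, ENABLED_OFFSET)` gives 1.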

Next, you'll declare a Field type and do your operations to convert a given value into its position-relevant value for use inside the register:

struct Field {
    value: u8,
}

impl Field {
    fn new(mask: u8, offset: u8, val: u8) -> Self {
        Field {
            value: (val << offset) & mask,
        }
    }
}

Finally, you'll use a Register type, which wraps around a numeric type that matches the width of your register. Register has an update function that updates the register with the given field:

struct Register(u8);

impl Register {
    fn update(&mut self, field: Field) {
        self.0 |= field.value;
    }
}

fn enable_register(reg: &mut Register) {
    reg.update(Field::new(ENABLED_MASK, ENABLED_OFFSET, 1));
}

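With Rust, you can use data structures to represent fields, attach them to specific registers, and provide concise and sensible ergonomics while interacting with the hardware. This example uses the most basic facilities provided by Rust; regardless, the added structure alleviates some of the density from the C example above. Now a field is a named thing, not a number derived from shadowy bitwise operators, and registers are types with state, which is one extra layer of abstraction over the hardware.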
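
One limitation worth noting: an update that only ORs bits in can set a field but never lower it. A possible variant is sketched below under one assumption that is not in the code above: `Field` also stores its mask, so that `update` can clear the field's old bits first.

```rust
struct Field {
    value: u8,
    mask: u8, // assumption: kept so update() knows which bits to clear
}

impl Field {
    fn new(mask: u8, offset: u8, val: u8) -> Self {
        Field { value: (val << offset) & mask, mask }
    }
}

struct Register(u8);

impl Register {
    /// Clears the field's bits, then writes the new value into them.
    fn update(&mut self, field: Field) {
        self.0 = (self.0 & !field.mask) | field.value;
    }
}
```

With this version, writing Kind = 2 into a register whose Kind was 7 replaces the old value instead of merging with it.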

A Rust implementation for ease of use

The first rewrite in Rust is nice, but it's not ideal. First, you have to remember to bring the mask and offset, and you're calculating them ad hoc, by hand, which is error-prone. Humans aren't great at precise and repetitive tasks: we tend to get tired or lose focus, and this leads to mistakes. Transcribing the masks and offsets by hand, one register at a time, will almost certainly end badly. This is the kind of task best left to a machine.

Second, thinking more structurally: What if there were a way to have the field's type carry the mask and offset information? What if you could catch mistakes in your implementation for how you access and interact with hardware registers at compile time instead of discovering them at runtime? Perhaps you can lean on one of the strategies commonly used to suss out issues at compile time, like types.

You can modify the earlier example by using typenum, a library that provides numbers and arithmetic at the type level. Here, you'll parameterize the Field type with its mask and offset, making it available for any instance of Field without having to include it at the call site:

#[macro_use]
extern crate typenum;

use core::marker::PhantomData;

use typenum::*;

// Now we'll add Mask and Offset to Field's type
struct Field<Mask: Unsigned, Offset: Unsigned> {
    value: u8,
    _mask: PhantomData<Mask>,
    _offset: PhantomData<Offset>,
}

// We can use type aliases to give meaningful names to
// our fields (and not have to remember their offsets and masks).
type RegEnabled = Field<U1, U0>;
type RegInterrupt = Field<U2, U1>;
type RegKind = Field<op!(U7 << U2), U2>;

Now, when revisiting Field's constructor, you can elide the mask and offset parameters because the type contains that information:

impl<Mask: Unsigned, Offset: Unsigned> Field<Mask, Offset> {
    fn new(val: u8) -> Self {
        Field {
            value: (val << Offset::U8) & Mask::U8,
            _mask: PhantomData,
            _offset: PhantomData,
        }
    }
}

// And to enable our register...
fn enable_register(reg: &mut Register) {
    reg.update(RegEnabled::new(1));
}
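
For readers who want to try the idea without an extra dependency, the same shape can be sketched with const generics from the language itself. This is a swapped-in technique, not what the typenum-based code above does; the names mirror the article's only for readability, and the generic `update` method is an assumption of this sketch.

```rust
/// A field whose mask and offset live in its type.
struct Field<const MASK: u8, const OFFSET: u8> {
    value: u8,
}

type RegEnabled = Field<1, 0>;
type RegInterrupt = Field<2, 1>;
type RegKind = Field<{ 7 << 2 }, 2>;

impl<const MASK: u8, const OFFSET: u8> Field<MASK, OFFSET> {
    fn new(val: u8) -> Self {
        // The call site passes only the value; mask and offset come
        // from the type parameters.
        Field { value: (val << OFFSET) & MASK }
    }
}

struct Register(u8);

impl Register {
    fn update<const MASK: u8, const OFFSET: u8>(&mut self, field: Field<MASK, OFFSET>) {
        self.0 |= field.value;
    }
}

fn enable_register(reg: &mut Register) {
    reg.update(RegEnabled::new(1));
}
```

As in the typenum version, nothing here yet stops a caller from passing a value that is too wide for the field.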

It looks pretty good, but… what happens when you make a mistake regarding whether a given value will fit into a field? Consider a simple typo where you put 10 instead of 1:

fn enable_register(reg: &mut Register) {
    reg.update(RegEnabled::new(10));
}

In the code above, what is the expected outcome? Well, the code will set that enabled bit to 0 because 10 & 1 = 0. That's unfortunate; it would be nice to know whether a value you're trying to write into a field will fit into the field before attempting a write. As a matter of fact, I'd consider lopping off the high bits of an errant field value undefined behavior (gasps).
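
The truncation is easy to check in isolation. The helper below, with the hypothetical name `place`, performs the same shift-then-mask arithmetic as `Field::new` on plain integers.

```rust
/// Same arithmetic as `Field::new`: shift into position, then mask.
fn place(val: u8, mask: u8, offset: u8) -> u8 {
    (val << offset) & mask
}
```

`place(10, 1, 0)` is 0 because 10 is 1010 in binary and its lowest bit is clear. An odd typo is arguably worse: `place(11, 1, 0)` is 1, so the mistake would go unnoticed entirely.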

Using Rust with safety in mind

How can you check that a field's value fits in its prescribed position in a general way? More type-level numbers!

You can add a Width parameter to Field and use it to verify that a given value can fit into the field:

struct Field<Width: Unsigned, Mask: Unsigned, Offset: Unsigned> {
    value: u8,
    _mask: PhantomData<Mask>,
    _offset: PhantomData<Offset>,
    _width: PhantomData<Width>,
}

type RegEnabled = Field<U1, U1, U0>;
type RegInterrupt = Field<U1, U2, U1>;
type RegKind = Field<U3, op!(U7 << U2), U2>;

impl<Width: Unsigned, Mask: Unsigned, Offset: Unsigned> Field<Width, Mask, Offset> {
    fn new(val: u8) -> Option<Self> {
        if val <= (1 << Width::U8) - 1 {
            Some(Field {
                value: (val << Offset::U8) & Mask::U8,
                _mask: PhantomData,
                _offset: PhantomData,
                _width: PhantomData,
            })
        } else {
            None
        }
    }
}

Now you can construct a Field only if the given value fits! Otherwise, you have None, which signals that an error has occurred, rather than lopping off the high bits of the value and silently writing an unexpected value.
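
A dependency-free sketch of the same runtime check follows, using const generics as a stand-in for typenum (a different technique from the code above). It drops the separate mask parameter, since a value that passes the width check already fits, and it computes the bound in a wider integer because `1 << 8` does not fit in a `u8`. It does not check that width plus offset stays within 8 bits.

```rust
struct Field<const WIDTH: u32, const OFFSET: u32> {
    value: u8,
}

impl<const WIDTH: u32, const OFFSET: u32> Field<WIDTH, OFFSET> {
    /// Largest value that fits in WIDTH bits. Computed in u16 so that
    /// a full 8-bit field does not overflow the shift.
    const MAX: u8 = ((1u16 << WIDTH) - 1) as u8;

    fn new(val: u8) -> Option<Self> {
        if val <= Self::MAX {
            Some(Field { value: val << OFFSET })
        } else {
            None
        }
    }
}

type RegEnabled = Field<1, 0>;
type RegKind = Field<3, 2>;
```

Here `RegEnabled::new(10)` is `None`, while `RegKind::new(7)` is `Some` and holds `0b0001_1100`.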

Note, though, this will raise an error at runtime. However, we knew the value we wanted to write beforehand, remember? Given that, we can teach the compiler to reject entirely a program which has an invalid field value; we don't have to wait until we run it!

This time, you'll add a trait bound (the where clause) to a new realization of new, called new_checked, that asks the incoming value to be less than or equal to the maximum possible value a field with the given Width can hold:

struct Field<Width: Unsigned, Mask: Unsigned, Offset: Unsigned> {
    value: u8,
    _mask: PhantomData<Mask>,
    _offset: PhantomData<Offset>,
    _width: PhantomData<Width>,
}

type RegEnabled = Field<U1, U1, U0>;
type RegInterrupt = Field<U1, U2, U1>;
type RegKind = Field<U3, op!(U7 << U2), U2>;

impl<Width: Unsigned, Mask: Unsigned, Offset: Unsigned> Field<Width, Mask, Offset> {
    const fn new_checked<V: Unsigned>() -> Self
    where
        V: IsLessOrEqual<op!((U1 << Width) - U1), Output = True>,
    {
        Field {
            value: (V::U8 << Offset::U8) & Mask::U8,
            _mask: PhantomData,
            _offset: PhantomData,
            _width: PhantomData,
        }
    }
}

Only numbers for which this property holds have an implementation of this trait, so if you use a number that does not fit, it will fail to compile. Take a look!

fn enable_register(reg: &mut Register) {
    reg.update(RegEnabled::new_checked::<U10>());
}

12 |     reg.update(RegEnabled::new_checked::<U10>());
   |                           ^^^^^^^^^^^^^^^^ expected struct `typenum::B0`, found struct `typenum::B1`
   |
   = note: expected type `typenum::B0`
           found type `typenum::B1`

new_checked will fail to produce a program that has an errant too-high value for a field. Your typo won't blow up at runtime because you could never have gotten an artifact to run.
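
A comparable build-time rejection can be sketched without typenum by pairing const generics with an assertion inside an associated constant. This is a different mechanism from the trait bound above, and the helper type `Fits` is invented for this example.

```rust
struct Field<const WIDTH: u32, const OFFSET: u32> {
    value: u8,
}

/// Helper whose associated constant fails to evaluate when V does not
/// fit in WIDTH bits.
struct Fits<const V: u8, const WIDTH: u32>;

impl<const V: u8, const WIDTH: u32> Fits<V, WIDTH> {
    const OK: () = assert!((V as u16) < (1u16 << WIDTH), "value does not fit in field");
}

impl<const WIDTH: u32, const OFFSET: u32> Field<WIDTH, OFFSET> {
    fn new_checked<const V: u8>() -> Self {
        // Referencing the constant forces the check for this V and WIDTH
        // when the function is instantiated during a build.
        let () = Fits::<V, WIDTH>::OK;
        Field { value: V << OFFSET }
    }
}

type RegEnabled = Field<1, 0>;
```

`RegEnabled::new_checked::<1>()` builds and yields a field holding 1. Writing `RegEnabled::new_checked::<10>()` instead should stop the build with the assertion message, so the typo never reaches a running program.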

You're nearing Peak Rust in terms of how safe you can make memory-mapped hardware interactions. However, what you wrote back in the first example in C was far more succinct than the type parameter salad you ended up with. Is doing such a thing even tractable when you're talking about potentially hundreds or even thousands of registers?

Just right with Rust: both safe and accessible

Earlier, I called out calculating masks by hand as being problematic, but I just did that same problematic thing—albeit at the type level. While using such an approach is nice, getting to the point when you can write any code requires quite a bit of boilerplate and manual transcription (I'm talking about the type synonyms here).

Our team wanted something like the TockOS mmio registers, but one that would generate typesafe implementations with the least amount of manual transcription possible. The result we came up with is a macro that generates the necessary boilerplate to get a Tock-like API plus type-based bounds checking. To use it, write down some information about a register, its fields, their width and offsets, and optional enum-like values (should you want to give "meaning" to the possible values a field can have):

register! {
    // The register's name
    Status,
    // The type which represents the whole register.
    u8,
    // The register's mode, ReadOnly, ReadWrite, or WriteOnly.
    RW,
    // And the fields in this register.
    Fields [
        On    WIDTH(U1) OFFSET(U0),
        Dead  WIDTH(U1) OFFSET(U1),
        Color WIDTH(U3) OFFSET(U2) [
            Red    = U1,
            Blue   = U2,
            Green  = U3,
            Yellow = U4
        ]
    ]
}

From this, you can generate register and field types like the previous example where the indices—the Width, Mask, and Offset—are derived from the values input in the WIDTH and OFFSET sections of a field's definition. Also, notice that all of these numbers are typenums; they're going to go directly into your Field definitions!
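
The article does not show how the macro derives a mask, but the arithmetic involved is small. The sketch below uses a hypothetical `mask` function on plain integers rather than typenums.

```rust
/// Mask for a field of `width` bits starting at bit `offset`:
/// `width` ones, shifted up to the field's position.
/// Bits that would land above bit 7 are silently dropped by the cast.
const fn mask(width: u32, offset: u32) -> u8 {
    (((1u16 << width) - 1) << offset) as u8
}
```

For the earlier register this reproduces the hand-written masks: `mask(1, 0)` is 1, `mask(1, 1)` is 2, and `mask(3, 2)` is `7 << 2`.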

The generated code provides namespaces for registers and their associated fields through the name given for the register and the fields. That's a mouthful; here's what it looks like:

mod Status {
    struct Register(u8);
    mod On {
        struct Field; // There is of course more to this definition
    }
    mod Dead {
        struct Field;
    }
    mod Color {
        struct Field;
        pub const Red: Field = Field::<U1>::new();
        // &c.
    }
}

The generated API contains the nominally expected read and write primitives to get at the raw register value, but it also has ways to get a single field's value, do collective actions, and find out if any (or all) of a collection of bits is set. You can read the documentation on the complete generated API.
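
To give a feel for the "any" and "all" queries mentioned above, here is a plain-integer sketch. The method names `any_set` and `all_set` are hypothetical and are not taken from the bounded-registers API; consult its documentation for the real names and signatures.

```rust
struct Register(u8);

impl Register {
    /// Raw register value.
    fn read(&self) -> u8 {
        self.0
    }

    /// True if at least one of the bits in `bits` is set.
    fn any_set(&self, bits: u8) -> bool {
        (self.0 & bits) != 0
    }

    /// True only if every bit in `bits` is set.
    fn all_set(&self, bits: u8) -> bool {
        (self.0 & bits) == bits
    }
}
```

With a register holding `0b0000_0101`, asking about bits `0b11` answers true for "any" and false for "all".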

Kicking the tires

What does it look like to use these definitions for a real device? Will the code be littered with type parameters, obscuring any real logic from view?

No! By using type synonyms and type inference, you effectively never have to think about the type-level part of the program at all. You get to interact with the hardware in a straightforward way and get those bounds-related assurances automatically.

Here's an example of a UART register block. I'll skip the declaration of the registers themselves, as that would be too much to include here. Instead, it starts with a register "block" then helps the compiler know how to look up the registers from a pointer to the head of the block. We do that by implementing Deref and DerefMut:

use core::ops::{Deref, DerefMut};

#[repr(C)]
pub struct UartBlock {
    rx: UartRX::Register,
    _padding1: [u32; 15],
    tx: UartTX::Register,
    _padding2: [u32; 15],
    control1: UartControl1::Register,
}

pub struct Regs {
    addr: usize,
}

impl Deref for Regs {
    type Target = UartBlock;

    fn deref(&self) -> &UartBlock {
        unsafe { &*(self.addr as *const UartBlock) }
    }
}

impl DerefMut for Regs {
    fn deref_mut(&mut self) -> &mut UartBlock {
        unsafe { &mut *(self.addr as *mut UartBlock) }
    }
}

Once this is in place, using these registers is as simple as read() and modify():

fn main() {
    // A pretend register block.
    let mut x = [0_u32; 33];

    let mut regs = Regs {
        // Some shenanigans to get at `x` as though it were a
        // pointer. Normally you'd be given some address like
        // `0xDEADBEEF` over which you'd instantiate a `Regs`.
        addr: &mut x as *mut [u32; 33] as usize,
    };

    assert_eq!(regs.rx.read(), 0);

    regs.control1
        .modify(UartControl1::Enable::Set + UartControl1::RecvReadyInterrupt::Set);

    // The first bit and the 10th bit should be set.
    assert_eq!(regs.control1.read(), 0b_10_0000_0001);
}
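
The pretend block is 33 `u32`s long because of how `#[repr(C)]` lays out the struct: three registers plus two runs of 15 padding words. The sketch below checks that layout with a stand-in register type. `Reg32` and `demo` are hypothetical names, and the 32-bit register width is an assumption, since the generated register types are not shown in the article.

```rust
use std::ptr;

/// Stand-in for a generated register type; assumed to be 32 bits wide.
#[repr(transparent)]
struct Reg32(u32);

impl Reg32 {
    fn read(&self) -> u32 {
        // Volatile so the compiler does not cache or drop the access.
        unsafe { ptr::read_volatile(&self.0) }
    }

    fn write(&mut self, val: u32) {
        unsafe { ptr::write_volatile(&mut self.0, val) }
    }
}

#[repr(C)]
struct UartBlock {
    rx: Reg32,            // byte offset 0
    _padding1: [u32; 15],
    tx: Reg32,            // byte offset 64
    _padding2: [u32; 15],
    control1: Reg32,      // byte offset 128
}

/// Writes to control1 through a pointer to the head of the block and
/// returns the backing array so the effect can be inspected.
fn demo() -> [u32; 33] {
    let mut backing = [0_u32; 33];
    let block = unsafe { &mut *(backing.as_mut_ptr() as *mut UartBlock) };
    block.control1.write(0b10_0000_0001);
    let _ = block.rx.read();
    backing
}
```

After `demo()` runs, the last word of the returned array holds the value written to `control1`, which is the word at byte offset 128.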

When we're working with runtime values we use Option like we saw earlier. Here I'm using unwrap, but in a real program with unknown inputs, you'd probably want to check that you got a Some back from that new call (see notes 1 and 2 at the end):

fn main() {
    // A pretend register block.
    let mut x = [0_u32; 33];

    let mut regs = Regs {
        // Some shenanigans to get at `x` as though it were a
        // pointer. Normally you'd be given some address like
        // `0xDEADBEEF` over which you'd instantiate a `Regs`.
        addr: &mut x as *mut [u32; 33] as usize,
    };

    let input = regs.rx.get_field(UartRX::Data::Field::Read).unwrap();
    regs.tx.modify(UartTX::Data::Field::new(input).unwrap());
}

Decoding failure conditions

Depending on your personal pain threshold, you may have noticed that the errors are nearly unintelligible. Take a look at a not-so-subtle reminder of what I'm talking about:

error[E0271]: type mismatch resolving `<typenum::UInt<typenum::UInt<typenum::UInt<typenum::UInt<typenum::UInt<typenum::UTerm, typenum::B1>, typenum::B0>, typenum::B1>, typenum::B0>, typenum::B0> as typenum::IsLessOrEqual<typenum::UInt<typenum::UInt<typenum::UInt<typenum::UInt<typenum::UTerm, typenum::B1>, typenum::B0>, typenum::B1>, typenum::B0>>>::Output == typenum::B1`
  --> src/main.rs:12:5
   |
12 |     less_than_ten::<U20>();
   |     ^^^^^^^^^^^^^^^^^^^^ expected struct `typenum::B0`, found struct `typenum::B1`
   |
   = note: expected type `typenum::B0`
       found type `typenum::B1`

The expected typenum::B0 found typenum::B1 part kind of makes sense, but what on earth is the typenum::UInt<typenum::UInt, typenum::UInt… nonsense? Well, typenum represents numbers as binary cons cells! Errors like this make it hard, especially when you have several of these type-level numbers confined to tight quarters, to know which number it's talking about. Unless, of course, it's second nature for you to translate baroque binary representations to decimal ones.
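
Reading such a type by hand amounts to collecting its `B0` and `B1` markers from left to right as binary digits, most significant first. The function below sketches that for a single printed number. The name `decode_typenum` is made up, the sketch assumes the input contains exactly one number, and it is not how any existing tool is implemented.

```rust
/// Reads the `B0`/`B1` markers in the order they appear in the printed
/// type (most significant bit first) and folds them into a number.
fn decode_typenum(printed: &str) -> u64 {
    let mut value = 0u64;
    for pair in printed.as_bytes().windows(2) {
        match pair {
            [b'B', b'0'] => value *= 2,
            [b'B', b'1'] => value = value * 2 + 1,
            _ => {}
        }
    }
    value
}
```

Applied to the two numbers in the error above, the first decodes to 20 (binary 10100) and the second to 10 (binary 1010), which matches the `less_than_ten::<U20>()` call that triggered the error.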

After the U100th time attempting to decipher any meaning from this mess, a teammate got Mad As Hell And Wasn't Going To Take It Anymore and made a little utility, tnfilt, to parse the meaning out from the misery that is namespaced binary cons cells. tnfilt takes the cons cell-style notation and replaces it with sensible decimal numbers. We imagine that others will face similar difficulties, so we shared tnfilt. You can use it like this:

`$ cargo build 2>&1 | tnfilt`

It transforms the output above into something like this:

`` error[E0271]: type mismatch resolving `<U20 as typenum::IsLessOrEqual<U10>>::Output == typenum::B1` ``

Now that makes sense!

In conclusion

Memory-mapped registers are used ubiquitously when interacting with hardware from software, and there are myriad ways to portray those interactions, each of which has a different place on the spectra of ease-of-use and safety. We found that the use of type-level programming to get compile-time checking on memory-mapped register interactions gave us the necessary information to make safer software. That code is available in the [bounded-registers][15] crate (Rust package).

Our team started out right at the edge of the more-safe side of that safety spectrum and then tried to figure out how to move the ease-of-use slider closer to the easy end. From those ambitions, bounded-registers was born, and we use it anytime we encounter memory-mapped devices in our adventures at Auxon.


  1. Technically, a read from a register field, by definition, will only give a value within the prescribed bounds, but none of us lives in a pure world, and you never know what's going to happen when external systems come into play. You're at the behest of the Hardware Gods here, so instead of forcing you into a "might panic" situation, it gives you the Option to handle a "This Should Never Happen" case.

  2. get_field looks a little weird. I'm looking at the Field::Read part, specifically. Field is a type, and you need an instance of that type to pass to get_field. A cleaner API might be something like:

`regs.rx.get_field::<UartRX::Data::Field>();`

But remember that Field is a type synonym that has fixed indices for width, offset, etc. To be able to parameterize get_field like this, you'd need higher-kinded types.


This originally appeared on the Auxon Engineering blog and is edited and republished with permission.


via: https://opensource.com/article/20/1/c-vs-rust-abstractions

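Author: Dan Pittman. Topic selection: lujun9972. Translator: (translator ID). Proofreader: (proofreader ID).

This article was originally compiled by LCTT and is proudly presented by Linux China.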