[Rust] lifetime annotation

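1. Background

A few "official" resources on Rust lifetimes that make the concept easier to understand.

2. Resources

(1) Learning Rust: The Tough Part - Lifetimes

Note: I do not recommend reading this one. Much of it leans heavily on metaphor and is easy to misread.

(2) Rust By Example: Lifetimes

Note: this resource introduces the concept of the borrow checker.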

A lifetime is a construct the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid (note: a lifetime is the compiler's mechanism for verifying that borrows are valid). Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.

Take, for example, the case where we borrow a variable via &. The borrow (note: a borrow is always a reference) has a lifetime that is determined by where it is declared. As a result, the borrow is valid as long as it ends before the lender is destroyed. (note: a borrow ends once the reference is no longer used, which can happen before the reference leaves its lexical scope) However, the scope of the borrow is determined by where the reference is used.

(3) rust-lang / rfcs

Note: this resource distinguishes scope from lifetime and mentions the CFG (control-flow graph).

Extend Rust's borrow system to support non-lexical lifetimes -- these are lifetimes that are based on the control-flow graph (note: the range in which a reference is valid is derived from the control-flow graph), rather than lexical scopes.

The basic idea of the borrow checker is that values may not be mutated or moved while they are borrowed, but how do we know whether a value is borrowed? The idea is quite simple: whenever you create a borrow, the compiler assigns the resulting reference a lifetime. This lifetime corresponds to the span of the code where the reference may be used (note: a lifetime is the range in which a reference is valid). The compiler will infer this lifetime to be the smallest lifetime (note: the lifetime ends as soon as the reference is no longer used) that it can have that still encompasses all the uses of the reference.

Note that Rust uses the term lifetime in a very particular way. In everyday speech, the word lifetime can be used in two distinct -- but similar -- ways:

  • The lifetime of a reference, corresponding to the span of time in which that reference is used.

  • The lifetime of a value, corresponding to the span of time before that value gets freed (or, put another way, before the destructor for the value runs).

This second span of time, which describes how long a value is valid, is very important. To distinguish the two, we refer to that second span of time as the value's scope. Naturally, lifetimes and scopes are linked to one another. Specifically, if you make a reference to a value, the lifetime of that reference cannot outlive the scope of that value. (note: a reference's lifetime cannot be longer than the scope of the value it refers to) Otherwise, your reference would be pointing into freed memory.
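The distinction between a reference's lifetime and a value's scope can be seen in a small sketch. The function name `push_after_last_use` is made up for illustration and does not come from the RFC; the sketch assumes a compiler with non-lexical lifetimes enabled, which is the default in current Rust.

```rust
/// Hypothetical helper, named here only for illustration.
fn push_after_last_use() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    let first = &data[0]; // shared borrow of `data` begins
    let seen = *first;    // last use of `first`, so its lifetime ends here

    // `first` is still in lexical scope, but it is never used again,
    // so `data` is no longer borrowed and may be mutated.
    data.push(4);

    (seen, data)
}
```

If `first` were used again after `data.push(4)`, its lifetime would have to span the mutation and the borrow checker would reject the function.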

(4) The Rust Programming Language: Validating References with Lifetimes

Note: this resource clarifies many concepts and is worth reading carefully.

Every reference in Rust has a lifetime, which is the scope for which that reference is valid (note: the lifetime mechanism exists to check that references are valid). Most of the time, lifetimes are implicit and inferred, just like most of the time, types are inferred. We must annotate types when multiple types are possible. In a similar way, we must annotate lifetimes when the lifetimes of references could be related in a few different ways. Rust requires us to annotate the relationships using generic lifetime parameters to ensure the actual references used at runtime will definitely be valid. (note: the purpose of lifetime annotations is to resolve ambiguity)

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller (note: the intersection, i.e. the shorter one) of the lifetimes of the references passed in. These constraints are what we want Rust to enforce. Remember, when we specify the lifetime parameters in this function signature, we’re not changing the lifetimes of any values passed in or returned. Rather, we’re specifying that the borrow checker should reject any values that don’t adhere to these constraints (note: adding lifetime annotations only adds constraints). Note that the longest function doesn’t need to know exactly how long x and y will live, only that some scope can be substituted for 'a that will satisfy this signature.

When annotating lifetimes in functions, the annotations go in the function signature, not in the function body. Rust can analyze the code within the function without any help. However, when a function has references to or from code outside that function, it becomes almost impossible for Rust to figure out the lifetimes of the parameters or return values on its own. The lifetimes might be different each time the function is called. This is why we need to annotate the lifetimes manually. (note: on each call of the function, the lifetime parameter is instantiated with a different concrete lifetime)

Note: the two paragraphs above are very well written and get straight to the essence of lifetime annotations.

When we pass concrete references to longest, the concrete lifetime that is substituted for 'a is the part of the scope of x that overlaps with the scope of y (note: when longest is called, 'a is instantiated as the intersection of the lifetimes of x and y). In other words, the generic lifetime 'a will get the concrete lifetime that is equal to the smaller (note: the intersection, i.e. the shorter one) of the lifetimes of x and y. Because we’ve annotated the returned reference with the same lifetime parameter 'a, the returned reference will also be valid for the length of the smaller of the lifetimes of x and y (note: as soon as either x or y becomes invalid, the returned reference becomes invalid too).

Note: the above is the convention for how a lifetime parameter is instantiated.
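A minimal sketch of this instantiation follows. The two caller functions are hypothetical and exist only to show that each call site picks its own concrete lifetime for 'a; longest itself is repeated from above so that the block stands alone.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

/// Hypothetical caller, for illustration only.
fn longest_len_in_inner_scope() -> usize {
    let outer = String::from("long string is long");
    let len;
    {
        let inner = String::from("xyz");
        // For this call, 'a is a region in which both `outer` and `inner`
        // are valid, so `result` may only be used inside this block.
        let result = longest(outer.as_str(), inner.as_str());
        len = result.len();
    } // `inner` is dropped here
    len
}

/// Hypothetical second call site: both arguments are string literals,
/// so 'a can be 'static for this call.
fn longest_of_literals() -> &'static str {
    longest("ab", "c")
}
```

Moving the use of `result` below the inner block should be rejected, because `inner` does not live long enough for the lifetime that `result` would then need.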

You’ve learned that every reference has a lifetime and that you need to specify lifetime parameters for functions or structs that use references. However, in Chapter 4 we had a function in Listing 4-9, which is shown again in Listing 10-26, that compiled without lifetime annotations.

fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

Note: the following covers some important history.

The reason this function compiles without lifetime annotations is historical: in early versions (pre-1.0) of Rust, this code wouldn’t have compiled because every reference needed an explicit lifetime. (note: in early versions of Rust, every reference had to carry a lifetime annotation) At that time, the function signature would have been written like this:

fn first_word<'a>(s: &'a str) -> &'a str {

After writing a lot of Rust code, the Rust team found that Rust programmers were entering the same lifetime annotations over and over in particular situations. These situations were predictable and followed a few deterministic patterns. The developers programmed these patterns into the compiler’s code so the borrow checker could infer the lifetimes in these situations and wouldn’t need explicit annotations. (note: the compiler team built the common lifetime annotation patterns into the language)

This piece of Rust history is relevant because it’s possible that more deterministic patterns will emerge and be added to the compiler. In the future, even fewer lifetime annotations might be required. (note: in the future, more situations may no longer need explicit lifetime annotations)

The patterns programmed into Rust’s analysis of references are called the lifetime elision rules. These aren’t rules for programmers to follow; they’re a set of particular cases that the compiler will consider, and if your code fits these cases, you don’t need to write the lifetimes explicitly.

The elision rules don’t provide full inference. If Rust deterministically applies the rules but there is still ambiguity as to what lifetimes the references have, the compiler won’t guess what the lifetime of the remaining references should be. In this case, instead of guessing, the compiler will give you an error that you can resolve by adding the lifetime annotations that specify how the references relate to each other. (note: when lifetimes are still ambiguous, the annotations must be written by hand)

Note: the next few paragraphs are important. They describe how the compiler infers lifetime annotations automatically.

Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.

The compiler uses three rules to figure out what lifetimes references have when there aren’t explicit annotations. (note: the compiler uses three rules to infer lifetime annotations) The first rule applies to input lifetimes, and the second and third rules apply to output lifetimes. If the compiler gets to the end of the three rules and there are still references for which it can’t figure out lifetimes, the compiler will stop with an error. (note: if the three rules still do not yield a complete set of lifetime annotations, the compiler reports an error) These rules apply to fn definitions as well as impl blocks.

Rule 1: each input parameter that is a reference is given its own, distinct lifetime annotation.

  • The first rule is that each parameter that is a reference gets its own lifetime parameter. In other words, a function with one parameter gets one lifetime parameter: fn foo<'a>(x: &'a i32); a function with two parameters gets two separate lifetime parameters: fn foo<'a, 'b>(x: &'a i32, y: &'b i32); and so on.

Rule 2: if there is exactly one input lifetime, that lifetime is used for every output lifetime.

  • The second rule is if there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32.

Rule 3: if there are several input lifetimes and one of them belongs to &self or &mut self, the lifetime of self is used for every output lifetime.

  • The third rule is if there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, the lifetime of self is assigned to all output lifetime parameters. This third rule makes methods much nicer to read and write because fewer symbols are necessary.

Note: if, after all three rules have been applied, some reference in the signature still has no lifetime, the compiler reports an error.
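The three rules can be sketched by writing each elided signature next to a form with the lifetimes spelled out. The functions `head` and `head_explicit`, the type `Doc`, and its methods are hypothetical names chosen for this illustration.

```rust
// Rules 1 and 2: the single reference parameter gets its own lifetime,
// and that lifetime is used for the returned reference.
fn head(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the elided lifetime written out.
fn head_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Hypothetical type holding a borrowed string.
struct Doc<'t> {
    text: &'t str,
}

impl<'t> Doc<'t> {
    // Rule 3: `&self` and `sep` each get a lifetime (rule 1), and the
    // returned reference gets the lifetime of `&self`.
    fn before(&self, sep: &str) -> &str {
        match self.text.find(sep) {
            Some(i) => &self.text[..i],
            None => self.text,
        }
    }

    // The same method with the elided lifetimes written out.
    fn before_explicit<'s, 'p>(&'s self, sep: &'p str) -> &'s str {
        self.before(sep)
    }
}
```

For example, `Doc { text: "key=value" }.before("=")` yields `"key"`. Because of rule 3, the result is tied to the borrow of the `Doc` and not to `sep`.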

Let’s pretend we’re the compiler. We’ll apply these rules to figure out what the lifetimes of the references in the signature of the first_word function in Listing 10-26 are. The signature starts without any lifetimes associated with the references:

fn first_word(s: &str) -> &str {

Then the compiler applies the first rule, which specifies that each parameter gets its own lifetime. We’ll call it 'a as usual, so now the signature is this:

fn first_word<'a>(s: &'a str) -> &str {

The second rule applies because there is exactly one input lifetime. The second rule specifies that the lifetime of the one input parameter gets assigned to the output lifetime, so the signature is now this:

fn first_word<'a>(s: &'a str) -> &'a str {

Now all the references in this function signature have lifetimes, and the compiler can continue its analysis without needing the programmer to annotate the lifetimes in this function signature.

Let’s look at another example, this time using the longest function that had no lifetime parameters when we started working with it in Listing 10-21:

fn longest(x: &str, y: &str) -> &str {

Let’s apply the first rule: each parameter gets its own lifetime. This time we have two parameters instead of one, so we have two lifetimes:

fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &str {

You can see that the second rule doesn’t apply because there is more than one input lifetime. The third rule doesn’t apply either, because longest is a function rather than a method, so none of the parameters are self. After working through all three rules, we still haven’t figured out what the return type’s lifetime is. (note: after all three rules have been applied, the return value still has no lifetime) This is why we got an error trying to compile the code in Listing 10-21: the compiler worked through the lifetime elision rules but still couldn’t figure out all the lifetimes of the references in the signature.
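The compiler refuses to guess here because a two-reference signature admits more than one valid annotation, and each one states a different relationship. As a contrast to longest, the sketch below ties the result to the first argument only. The names `first` and `result_outlives_second_argument` are hypothetical and used only for illustration.

```rust
// Hypothetical function: only `x` can be returned, so only `x` shares
// the lifetime 'a with the result; `_y` keeps its own (elided) lifetime.
fn first<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

/// Hypothetical caller, for illustration only.
fn result_outlives_second_argument() -> String {
    let kept = String::from("kept");
    let result;
    {
        let temporary = String::from("temporary");
        result = first(&kept, &temporary);
    } // `temporary` is dropped here, but `result` borrows only from `kept`
    result.to_string()
}
```

With the signature of longest, where both parameters share 'a, the same caller should be rejected, since the result would then also be constrained by `temporary`.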

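3. Summary

(1) Validity of references

  • Lifetimes are the mechanism the compiler uses to check that references are valid; the check itself is called borrow checking.
  • The compiler statically analyzes the range in which a reference is valid using the CFG (control-flow graph).
  • A reference stops being live after its last use. At that point the owner of the referenced value may still be in scope, which means the value has not been freed yet.

(2) Inferring lifetime annotations

  • The compiler applies three rules to add lifetime annotations to a function's parameters and return values automatically.
  • If the lifetime of the return value still cannot be determined after the three rules have been applied, the compiler reports an error.

(3) How lifetime annotations are instantiated

  • An explicit lifetime annotation is only an extra marker that resolves ambiguity for the compiler.
  • On each call of a function, its lifetime parameters are "instantiated" with different concrete lifetimes.
  • A lifetime parameter shared by several arguments is always "instantiated" as the shortest of the lifetimes of the arguments in that call, that is, their overlap.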

4. Source code

(1) borrow check

github: rust v1.45.1 src/librustc_mir/borrow_check

src/librustc_mir/borrow_check
├── borrow_set.rs
├── constraint_generation.rs
├── constraints
│   ├── graph.rs
│   └── mod.rs
├── def_use.rs
├── diagnostics
│   ├── conflict_errors.rs
│   ├── explain_borrow.rs
│   ├── find_use.rs
│   ├── mod.rs
│   ├── move_errors.rs
│   ├── mutability_errors.rs
│   ├── outlives_suggestion.rs
│   ├── region_errors.rs
│   ├── region_name.rs
│   └── var_name.rs
├── facts.rs
├── invalidation.rs
├── location.rs
├── member_constraints.rs
├── mod.rs
├── nll.rs
├── path_utils.rs
├── place_ext.rs
├── places_conflict.rs
├── prefixes.rs
├── region_infer
│   ├── dump_mir.rs
│   ├── graphviz.rs
│   ├── mod.rs
│   ├── opaque_types.rs
│   ├── reverse_sccs.rs
│   └── values.rs
├── renumber.rs
├── type_check
│   ├── constraint_conversion.rs
│   ├── free_region_relations.rs
│   ├── input_output.rs
│   ├── liveness
│   │   ├── local_use_map.rs
│   │   ├── mod.rs
│   │   ├── polonius.rs
│   │   └── trace.rs
│   ├── mod.rs
│   └── relate_tys.rs
├── universal_regions.rs
└── used_muts.rs

(2) Where it is invoked

github: rust v1.45.1 src/librustc_interface/passes.rs

sess.time("MIR_borrow_checking", || {
    tcx.par_body_owners(|def_id| tcx.ensure().mir_borrowck(def_id));
});

References

Rust By Example: Lifetimes
rust-lang / rfcs
The Rust Programming Language: Validating References with Lifetimes
github: rust v1.45.1 src/librustc_mir/borrow_check
