Is It Really That Hard To Learn A RUST?

Is It Really That Hard To Learn A RUST Main Logo

Is It Really That Hard To Learn A RUST?

Some time ago, I began to understand the need to diversify my programming experience exclusively in C#. After some study of various options, such as Haskell, Scala, Rust, and some others, the choice fell on the last one.

Over time, I began to notice that Rust is increasingly being advertised solely as a “system language” that is needed for extremely complex compilers and super-loaded systems, with special security and multithreading requirements, and for simpler options there is Go/Python/Java/…, while I gladly and quite successfully used it as a replacement for my workhorse C#.

In this article I wanted to tell why I consider this trend as a whole harmful, and why Rust is a good general-purpose language in which you can do any projects, starting with any microservices, and ending with the scripting of the daily routine.

Introduction

Why, in fact, learn a new language, the more difficult? It seems to me that the answer to the article “Winning Mediocrity” is the closest to the truth, namely:

Everyone knows that writing the entire program manually in machine language is wrong. But much less often they understand that there is a more general principle: if there is a choice of several languages, it is wrong to program on something other than the most powerful one, unless other reasons influence the choice.

The more complex the language, the richer the phrases composed with its help, and the better it can express the desired subject area. Since concepts are usually studied once, but are applied many times, much more profitable from the point of view of investing their own time to study all kinds of scary words like “monadic transformers” (and, preferably, their meaning), in order to save their mental forces and spend them nice And therefore, it is very sad to see the trend of some companies to make especially “simplified” languages. As a result, the vocabulary of these languages is much smaller, and it is not difficult to learn it, but then reading the program “my yours to buy onions” is very difficult, not to mention possible ambiguous interpretations.

The basics

As usual, a newbie gets to know a programming language? He Google the most popular book on the language pulls it and begins to read. As a rule, it contains Hello World, a compiler installation instruction, and the basic information on the language with a gradual complication. In the case of Rust, this is a handpiece, and the first example is reading a number from the console and displaying it on the screen. How would we do this in the same C#? Well, probably something like that:


var number = int.Parse(Console.ReadLine());
Console.WriteLine($"You guessed: {number}");

Here is how it looks on RUST:


let mut guess = String::new();

io::stdin().read_line(&mut guess)
.expect("Failed to read line");

let guess: u32 = guess.trim().parse()
.expect("Please type a number!");

println!("You guessed: {}", guess

Triple the amount of code, all these curtsies with the creation of a variable before use (hello Pascal!), Calling a heap of auxiliary code, etc. “What a horror” the average developer will think and once again be convinced of the “consistency” of the language.

But in fact it can be written much easier:


let mut guess = String::new();
io::stdin().read_line(&mut guess)?;
let guess: u32 = guess.trim().parse()?;
println!("You guessed: {}", guess);

There is still a separate creation of the variable and reading into it from the stream, but here the ideology of the rasta imposes an imprint that it transfers the allocation of the buffer to the user. At first, it is unusual, but then you realize that with responsibility comes the corresponding force. Well, for those who do not soar, there is always the option to write a three-line function and forget this question once and for all.

Why is the example in the book composed in this way? Most likely due to the fact that the explanation of error handling occurs much later, and ignoring them as in the case of C # does not allow the rasta paradigm, which controls all possible paths where something can go wrong.

Lifetime and Borrow checker

Oh, those terrible beasts. People rush with obscure spells like:


fn search<F>(self, hash: u64, is_match: F, compare_hashes: bool)
-> RawEntryMut<'a, K, V, S>
where for<'b> F: FnMut(&'b K) -> bool

Newcomers run in panic, the guys from going say “we warned you about unnecessary complexity,” Haskell-ists say “so much difficulty for a language that even has no effects,” and the twist their fingers at the temple “did it cost a fence for only GC refuse”.

In fact, when writing an application program, you most likely do not need to specify a single lifetime at all. The reason for this is that in RUST there are rules for outputting Lifestyle, which is almost always the compiler can set itself. Here they are:

  • Each position has become a distinct lifetime parameter.
  • If there is a lifespan, it is assigned to all elite output lifetimes.
  • It has been assigned that it has been assigned a lifespan.
  • Otherwise, it is an error to elide an output lifetime.

Or, if in a nutshell, in the case of static functions, the lifetime of all arguments is assumed to be equal, in the case of instance methods, the lifetime of all resulting references is equal to the lifetime of the instance on which we call the method. And in practice, this is almost always respected in the case of application code. Therefore, instead of horror above, you will usually write something like.


struct Point(i32, i32);

impl Point {
pub fn get_x(&self) -> &i32 {
&self.0
}

pub fn get_y(&self) -> &i32 {
&self.1
}
}

And the compiler himself will gladly output everything that is needed for this to work.

Personally, I see the beauty of the concept of automatic control in several aspects.

  • From the point of view of a person with programming experience in a language with GC, memory is not a separate type of resource. In C # there is a whole story with the IDisposable interface, which is used for deterministic cleaning of resources, precisely because the GC deletes the object “when it is there”, and we may need to release the resource immediately. As a result, there is a whole heap of consequences: one should not forget about the correct implementation of finalizers, and the whole keyword was introduced for this (as well as Java’s try-with-resources), and the compiler should be cracked to generate foreach with this in mind … Unification of all types of resources that are released automatically, and as quickly as possible after the last use it is very nice. I opened the file for myself, and work with it, it will close when needed without any scoping. I’ll immediately answer the potential objection that DI containers make life a little easier, but they don’t solve all the problems.
  • From the point of view of a person with programming experience in a manual language, in 99% of cases, it is not necessary to use smart pointers; it is enough to use ordinary links.

As a result, the code turns out to be clean (as in the language with GC), but at the same time, all resources are released as quickly as possible (as in the manual language). And lifetime: a declarative description of the expected lifetime of the object. And the declarative description is always better than the imperative “free the object here.”

Cruel compiler

Some consequence of the previous paragraph. There is a good picture, it generally describes many languages, but drawn specifically for the case of RUST:

RUST

In fact, the compiler is really pretty picky. This was especially true until the appearance of Rust 2018 when the compiler in some cases did not miss the obvious code. But now there are problems, especially from a misunderstanding of the concept of ownership. For example, when a person tries to implement a doubly linked list. With a naive implementation, he first tries to do:


pub struct Node {
value: u64,
next: Option<Box<Node>>,
prev: Option<Box<Node>>,
}

The compiler will compile the declaration of this structure, but it will not work, because Box <Node> is a proprietary link, or unique_ptr in C ++ terms. A unique link, of course, can only be one

A person’s next attempt might look like this:


pub struct Node {
value: u64,
next: Option<&Box<Node>>,
prev: Option<&Box<Node>>,
}

Now we have non-owned links (they are shared_ptr), and there can be as many of them as one object. But here there are two problems: first, the owner must be somewhere. So, we will most likely get a bunch of compilation errors “the owner died when someone referred to his data,” because dangling pointers are not allowed to grow. And secondly, what is more important, we will not be able to change these values, because of the rules of rasta, “either one is a mutable link, or an arbitrary number of immutable ones, and nothing else.”

  • After that, the person usually starts to beat the keyboard, and write articles that “even in a growing room, a coherent list will not work out normally.” Of course, you can implement it, but it is a bit more difficult than in other languages, you have to add reference counting (Rc/Arc/Cell/RefCell primitives) to run the number of these links because the compiler is powerless in this situation.

The reasons for this are: this data structure does not fit well with the concept of ownership of rasta, the whole language and the ecosystem as a whole are built around. Any data structures where multiple owners are required will require some squats, such as the implementation of various forest/trees/graphs or the same linked lists. But this is true for all programming languages: attempts to implement memory management in languages with GC lead to terrible monsters working through WeakReferences with giant byte [] arrays, resurrecting objects in destructors to bring them back to the pool, and another terrible necromancy. Trying to get away from the dynamic nature of JS to write productive code leads to even stranger things.

Thus, in any programming language there is a “pain point”, and in the case of RUST, these are data structures with many owners. But, if we look from the application point of view of high-level programmers, our programs are arranged just in this way. For example, in my environment, a typical application looks like a certain layer of controllers that are linked between services. Each service has a link to some repositories that return some objects. All this fits perfectly into the concept of ownership. And if we consider that in practice the main data structures are lists, arrays, and hashmaps, it turns out that everything is not so bad.

What to do with this beast

In fact, the compiler struggles to help. Error messages in Rast, probably the most pleasant of all the programming languages with which I worked.

For example, if you try to use the first version of our linked list, get the message:


error[E0382]: assign to part of moved value: `head`
--> src\main.rs:23:5
|
19 | prev: Some(Box::new(head)),
| ---- value moved here
...
23 | head.next = Some(Box::new(next));
| ^^^^^^^^^ value partially assigned here after move
|
= note: move occurs because `head` has type `Node`, which does not implement the `Copy` trait

It says just that we have transferred ownership of the link to one element, and can no longer reuse it. He also tells us that there is a certain Copy trait that allows you to copy it instead of moving an object, which is why you can use it after the “move” because we moved the copy. If you didn’t know about its existence, then a compilation error will provide you with information for thinking “.

In general, the first language for compiler-driven development is for me. You just run the compilation, if something doesn’t work, the language will simply tell you “hmm, something doesn’t fit. I think the problem is in X. Try adding this code here and it will work.” A typical example, let’s say we wrote two functions, and forgot to add a generic constraint:


fn foo<T: Copy>() {

}

fn bar<T>() {
foo::<T>();
}

Compile, we get the error:


error[E0277]: the trait bound `T: std::marker::Copy` is not satisfied
--> src\main.rs:6:5
|
6 | foo::<T>();
| ^^^^^^^^ the trait `std::marker::Copy` is not implemented for `T`
|
= help: consider adding a `where T: std::marker::Copy` bound
note: required by `foo`
--> src\main.rs:1:1
|
1 | fn foo<T: Copy>() {
| ^^^^^^^^^^^^^^^^^

error: aborting due to previous error

Copy-Paste where T: std :: marker :: Copy from the error message, compile, everything is ready, let’s go to the prod!

Yes, the IDE of all modern languages can do this through all sorts of snippets, but firstly the benefit is that you can see from which crate/namespace the restriction has flown, and secondly it is support from the compiler, not the IDE. This is very helpful in cross-platform development when you have everything going locally, and on some matrix on the CI server somewhere, something falls due to conditional compilation. There is no IDE on the build server, and so the log looked, framed, and everything gathered. Conveniently.

  • I wrote some time ago a telegram-bot on a plant, as a language training. And I had a moment where I decided to refactor the entire application. I replaced everything I wanted, and then for half an hour, I tried to build a project and inserted suggestions from the compiler here and there. After this time, everything gathered and earned the first time.

Well, I can say that after a year from the moment I first started writing, I learned how to write simple snippets without errors from the first time. It sounds ridiculous, especially for people with dynamic PL, but for me, it was serious progress. And for all the time of working with the growth, I debug included exactly two times. And in both cases, I debugged FFI with C++ code that was segmented. Raster code either worked correctly for me or was not going to. In the case of C#, I have much less confidence, all the time I’m thinking, “willn’t null here”, “will there be a KeyNotFoundException here”, “have I synchronized access to these variables from many streams correctly”, etc. Well, in the case of JS (when I full stepped and wrote the front as well), after each change, a mandatory check in the browser followed what had changed.

Confidence in what has gathered == it really works. This does not mean that there are no bugs in the code, it means that all the bugs are related to the application logic. You have no unexpected nulls, non-synchronized access, buffer overflow, and so on. And they are much easier to catch, and sometimes you can bring to the level of types (a good article on the topic).

Summary

RUST is an excellent language for writing absolutely any application, and not just heavily loaded exchanges, blockchains, and trading bots. Always instead of sending a link, you can simply copy the value. Yes, yes, it is possible that times off will shower me with stones, but instead of trying to explain to the compiler in a couple of places of my bot, I could easily share the variable, I cloned it, and passed a copy. Yes, this is not so cool, but I do not have a goal to write the most productive application, as there is no such goal for people using C#/Java/Go/… I want to be as productive as possible and get the application at an acceptable speed. Implementing an application on Rastus according to all canons, eliminating all unnecessary copying is a very difficult task. But writing an application at the same time as in your favorite language, and getting a free performance increase is very real.

  • Try to write an application on the grow. If you don’t manage to pass, check your data structures and their interconnections, because I gradually began to understand that borrow checker is not just a mechanism responsible for the possibility of freeing memory, but also an excellent detector of the application architecture, from why this object X depends on Y, I did not expect this! “. If you understand everything, but explaining the correct answer to borrow checker is too difficult, just copy the value. Most likely, you will still get an application that works much faster (if you write in Java/C#/…, like me), or much more stable (if you write in C/C++), at the same time, which you would normally spend.

RUST concepts are very powerful and work fine at the application level, which does not reflect on performance, but rather only on developer productivity, the speed of implementation of new features and ease of support. It is very sad to observe that such an excellent in all respects language is increasingly receiving the stigma of “strange and complex language for low-level geek tasks.” I hope I have shaken this harmful myth a bit, and there will be more productive and happy developers in the world.