In the [previous post][2] about the Rust programming language, we looked at variables, constants and shadowing.
It is only natural to cover data types now.
### What are data types?
Change the order of these words and you get your answer; "data types" -> "type of data".
The computer stores data as `0`s and `1`s but to make sense of it when reading, we use data type to say what those `0`s and `1`s mean.
Rust has two types of data types:
- **Scalar data type**: Types that store only a single value.
- **Compound data type**: Types that store multiple values, even values of different types.
In this article, I shall cover scalar data types. I will go through the second category in the next article.
Following is a brief overview of the four main categories of Scalar data types in Rust:
- **Integers**: Stores whole numbers. Has sub-types for each specific use case.
- **Floats**: Stores numbers with a fractional value. Has two sub-types based on size.
- **Characters**: Stores a single character of UTF-8 encoding. (Yes, you can store an emoji* in a character.)
- **Booleans**: Stores either a `true` or a `false`. (For developers who can't agree if `0` is `true` or if `0` means `false`.)
### Integers
An integer in the context of a programming language refers to whole numbers. Integers in Rust are either **Signed** or **Unsigned**. Unsigned integers store only 0 and positive numbers, while Signed integers can store negative numbers, 0 and positive numbers.
> 💡 The range of Signed integers begins from `-(2n-1)` and this range ends with `(2n-1)-1`. Likewise, the range for Unsigned integers starts at `0` and ends with `(2n)-1`.
Following are the available Integer types based on the sign and length:
![Integer data types in Rust][3]
As you can see, Rust has Signed and Unsigned integers of length 8, 16, 32, 64 and even 128!
The integers with `*size` vary based on the architecture of the computer. On 8-bit micro-controllers, it is `*8`, on 32-bit legacy computers, it is `*32` and on modern 64-bit systems, it is `*64`.
The use of `*size` is to store data that is mostly related to memory (which is machine dependent), like pointers, offsets, etc.
> 💡 When you do not explicitly specify a subset of the Integer type, the Rust compiler will infer it's type to be `i32` by default. Obviously, if the value is bigger or smaller than what `i32` can hold, the Rust compiler will politely error out and ask you to manually annotate the type.
Rust not only allows you to store integers in their decimal form but also in the binary, octal and hex forms too.
For better readability, you can use underscore `_` as a replacement for commas in writing/reading big numbers.
```
fn main() {
let bin_value = 0b100_0101; // use prefix '0b' for Binary representation
let oct_value = 0o105; // use prefix '0o' for Octals
let hex_value = 0x45; // use prefix '0x' for Hexadecimals
let dec_value = 1_00_00_000; // same as writing 1 Crore (1,00,00,000)
println!("bin_value: {bin_value}");
println!("oct_value: {oct_value}");
println!("hex_value: {hex_value}");
println!("dec_value: {dec_value}");
}
```
I have stored the decimal number 69 in binary form, octal form and hexadecimal form in the variables `bin_value`, `oct_value` and `hex_value` respectively. In the variable `dec_value`, I have stored the number [1 Crore][4] (10 million) and have commas with underscores, as per the Indian numbering system. For those more familiar with the International numbering system, you may write this as `10_000_000`.
Upon compiling and running this binary, I get the following output:
```
bin_value: 69
oct_value: 69
hex_value: 69
dec_value: 10000000
```
### Floating point numbers
Floating point numbers, or more commonly known as "float(s)" is a data type that holds numbers that have a fractional value (something after the decimal point).
Unlike the Integer type in Rust, Floating point numbers have only two subset types:
-`f32`: Single precision floating point type
-`f64`: Double precision floating point type
Like the Integer type in Rust, when Rust infers the type of a variable that seems like a float, it is assigned the `f64` type. This is because the `f64` type has more precision than the `f32` type and is almost as fast as the `f32` type in most computational operations. Please note that _both the floating point data types (`f32` and `f64`) are **Signed**_.
> 📋 The Rust programming language stores the floating point numbers as per the [IEEE 754][5] standard of floating point number representation and arithmetic.
```
fn main() {
let pi: f32 = 3.1400; // f32
let golden_ratio = 1.610000; // f64
let five = 5.00; // decimal point indicates that it must be inferred as a float
let six: f64 = 6.; // even the though type is annotated, a decimal point is still
// **necessary**
println!("pi: {pi}");
println!("golden_ratio: {golden_ratio}");
println!("five: {five}");
println!("six: {six}");
}
```
Look closely at the 5th line. Even though I have annotated the type for the variable `six`, I **need** to at least use the decimal point. If you have something _after_ the decimal point is up to you.
The output of this program is pretty predictable... Or is it?
```
pi: 3.14
golden_ratio: 1.61
five: 5
six: 6
```
In the above output, you might have noticed that while displaying the value stored inside variables `pi`, `golden_ratio` and `five`, the trailing zeros that I specified at the time of variable declaration, are missing.
While those zeros are not _removed_, they are omitted while outputting the values via the `println` macro. So no, Rust did not tamper with your variable's values.
### Characters
You can store a single character in a variable and the type is simply `char`. Like traditional programming languages of the '80s, you can store an [ASCII][6] character. But Rust also extends the character type to store a valid UTF-8 character. This means that you can store an emoji in a single character 😉
> 💡 Some emojis are a mix of two existing emojis. A good example is the 'Fiery Heart' emoji: ❤️🔥. This emoji is constructed by combining two emojis using a [zero width joiner][7]: ❤️ + 🔥 = ❤️🔥
>
> Storing such emojis in a single Rust variable of the character type is not possible.
```
fn main() {
let a = 'a';
let p: char = 'p'; // with explicit type annotation
let crab = '🦀';
println!("Oh look, {} {}! :{}", a, crab, p);
}
```
As you can see, I have stored the ASCII characters 'a' and 'p' inside variables `a` and `p`. I also store a valid UTF-8 character, the crab emoji, in the variable `crab`. I then print the characters stored in each of these variables.
Following is the output:
```
Oh look, a 🦀! :p
```
### Booleans
The boolean type in Rust stores only one of two possible values: either `true` or `false`. If you wish to annotate the type, use `bool` to indicate the type.
```
fn main() {
let val_t: bool = true;
let val_f = false;
println!("val_t: {val_t}");
println!("val_f: {val_f}");
}
```
The above code, when compiled and executed results in the following output:
```
val_t: true
val_f: false
```
### Bonus: Explicit typecasting
In the previous article about Variables in the Rust programming language, I showed a very basic [temperature conversion program][8]. In there, I mentioned that Rust does not allow implicit typecasting.
But that doesn't mean that Rust does not allow _explicit_ typecasting either ;)
To perform explicit type casting, the `as` keyword is used and followed by the data type to which the value should be cast in.
Following is a demo program:
```
fn main() {
let a = 3 as f64; // f64
let b = 3.14159265359 as i32; // i32
println!("a: {a}");
println!("b: {b}");
}
```
On line 2, instead of using '3.0', I follow the '3' with `as f64` to denote that I want the compiler to handle type casting of '3' (an Integer) into a 64-bit float. Same with the 3rd line. But here, the type casting is **lossy**. Meaning, that the fractional element is _completely gone_. Instead of storing `3.14159265359`, it is stored as simply `3`.
This can be verified from the program's output:
```
a: 3
b: 3
```
### Conclusion
This article covers the Primitive/Scalar data types in Rust. There are primarily four such data types: Integers, Floating point numbers, Characters and Booleans.
Integers are used to store whole numbers and they have several sub-types based on either they are signed or unsigned and the length. Floating point numbers are used to store numbers with some fractional values and have two sub-types based on length. The character data type is used to store a single, valid UTF-8 encoded character. Finally, booleans are used to store either a `true` or `false` value.
In the next chapter, I'll discuss compound data types like arrays and tuples. Stay tuned.