Value and Reference Types in C#
Value and reference types can seem complex to newcomers to C#, but once the logic behind them is understood, they become a solid foundation for progress in software development.
As the names suggest, a “Value” type variable holds its “Value” directly; for a local variable, that value sits on the stack. A “Reference” type variable, on the other hand, holds a reference, which is essentially the address of an object in memory.
A reference type variable also lives on the stack, but only the reference is stored there; the object it refers to is stored on the heap.
The stack and the heap are two different areas of memory used to store data while a program runs. Since the topic of this article is value and reference types, we will explain them only as briefly as needed.
Data on the stack is laid out in a stacked, Last In First Out (LIFO) structure, which allows fast access. The heap, on the other hand, has a fragmented structure, and accessing data on it is relatively slower. Memory on the heap is not allocated in any fixed order.
Let’s try to explain value and reference types by writing just a few lines of code.
Assume we have a Book class and some variables as follows:
class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
}
int result = 30;
Book book1;
bool success;
string input = "Lorem ipsum dolor";
In memory, this will look approximately like this:
Observations
- We declared the variable “success” but have not initialized it yet; nevertheless, it clearly holds false. (This applies to fields; the compiler requires a local variable to be assigned before it is read.)
- We declared “book1” the same way as “success”; nevertheless, it clearly holds nothing (null).
- We initialized the string “input” with a lorem ipsum text. We can observe that its content is stored on the heap, so by our initial definition it is reasonable to consider “string” a reference type, right? Yes, but the story behind the type “string” is a little more complicated.
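The first two observations can be checked with a small sketch. The holder class `Defaults` is a hypothetical name introduced only for this illustration; it uses fields because fields, unlike local variables, receive their default values automatically.

```csharp
using System;

class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
}

// Hypothetical holder class (not part of the article's code).
// Fields are initialized to their type's default value automatically.
class Defaults
{
    public int Result;    // value type     -> 0
    public bool Success;  // value type     -> false
    public Book Book1;    // reference type -> null
    public string Input;  // reference type -> null
}

static class DefaultsDemo
{
    static void Main()
    {
        var d = new Defaults();
        Console.WriteLine(d.Result);         // 0
        Console.WriteLine(d.Success);        // False
        Console.WriteLine(d.Book1 == null);  // True
        Console.WriteLine(d.Input == null);  // True

        // A local variable, by contrast, must be assigned before it is read:
        // bool success;
        // Console.WriteLine(success); // compile-time error CS0165
    }
}
```

The compiler may warn that the fields are never assigned; that is expected here, since leaving them unassigned is the point of the sketch.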
Let’s Start with the First Premise
The reason is that the most commonly used value types, such as
- int
- byte
- bool
- double
- char
take on a default value if no value is assigned to them.
C# provides both a default operator and a default literal:
// with the default literal
int number = default; // => 0
bool active = default; // => false
char c = default; // => '\0'
// with the default operator, the same results are obtained
int number2 = default(int); // => 0
// ...
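The same mechanism applies to any type, not only the ones listed above. The following is a minimal sketch, assuming a hypothetical generic helper named `Of<T>` that we introduce only for illustration; it shows that value types default to a zeroed value while reference types default to null.

```csharp
using System;

static class DefaultDemo
{
    // Hypothetical helper (not from the article): returns the default value of T.
    public static T Of<T>() => default(T);

    static void Main()
    {
        Console.WriteLine(Of<int>());                           // 0
        Console.WriteLine(Of<bool>());                          // False
        Console.WriteLine(Of<string>() == null);                // True (reference type)
        Console.WriteLine(Of<DateTime>() == DateTime.MinValue); // True (struct, a value type)
    }
}
```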
Second Premise
There is nothing to be concerned about so far: the code compiles and runs without any error.
However, when we try to access a property through a reference type variable that has not been initialized and therefore holds null, the code still compiles, but the program throws a “NullReferenceException” at runtime.
Microsoft defines the exception as follows:
“The exception that is thrown when there is an attempt to dereference a null object reference.”
We declared the object “book1” but did not initialize it in our code.
So the same scenario applies to book1: if we were to access a property of the “Book” class through “book1”, the exception would be thrown.
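To see the exception in action, here is a minimal sketch. The helper `DescribeTitle` and the class name `NullDemo` are hypothetical names chosen for this example, and book1 is set to null explicitly so that the snippet compiles as a local variable.

```csharp
using System;

class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
}

static class NullDemo
{
    // Hypothetical helper: reads Title and reports whether the access failed.
    public static string DescribeTitle(Book book)
    {
        try
        {
            return book.Title; // dereferences the reference held by "book"
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException caught";
        }
    }

    static void Main()
    {
        Book book1 = null; // declared, but no instance has been created yet
        Console.WriteLine(DescribeTitle(book1)); // NullReferenceException caught
    }
}
```

In real code we would normally check for null before the access rather than catch the exception; the try/catch here only makes the failure visible.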
Let’s create an instance of “Book” and assign it to “book1”. After that, accessing book1 no longer throws a NullReferenceException.
Creating an instance is what the new keyword does: it allocates a new instance of the Book class on the heap.
Book book1 = new Book { Id = 1, Title = "First Book" };
From now on, book1 holds the reference to this instance.
Now let’s focus only on “book1” from the first picture.
We can explore the second premise further to deepen our understanding of value and reference types.
Let’s consider a new scenario: there is an object named “book2”, which is almost the same book as “book1”, except that “book2” is published by another publisher.
We can distinguish the two books from each other by their Publisher and Id properties, so let’s extend the “Book” class:
class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Publisher { get; set; } // new property "Publisher"
}
// ...
Book book1 = new Book { Id = 1, Title = "First Book", Publisher = "Publisher1" };
// ...
Suppose we take the following approach when creating “book2”, which is based on “book1” but is published by a different publisher, in order to avoid retyping every field (usually, there are more fields involved):
Book book2 = book1;
book2.Id = 2;
book2.Title = "Second Book";
book2.Publisher = "Publisher2";
At this point, let’s retrieve the object book1 and print its properties.
Console.WriteLine(book1.Id + " , " + book1.Title);
// => 2 , Second Book
It seems that something went wrong with respect to our aim.
We set the properties of book2, yet book1 was affected too. How is this possible?
After all, we know that the code below works as expected, right?
int x = 5;
int y = 10;
x = y; // x is now 10
x = x + 5; // x is now 15
Console.WriteLine(x);
// => 15
Console.WriteLine(y);
// => 10
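The same copy behavior holds for value types we define ourselves. As a minimal sketch, assume a hypothetical struct named `Point` (it is not part of the article's Book example): assigning one struct variable to another copies the whole value, so the two variables stay independent.

```csharp
using System;

// Hypothetical value type, used only for this illustration.
struct Point
{
    public int X;
    public int Y;
}

static class StructCopyDemo
{
    static void Main()
    {
        Point a = new Point { X = 1, Y = 2 };
        Point b = a;  // copies the value itself, not a reference
        b.X = 99;     // changes only b

        Console.WriteLine(a.X); // 1
        Console.WriteLine(b.X); // 99
    }
}
```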
To understand why our intended goal was not successful, let’s examine the memory view.
First, we assigned book1 to book2:
Book book2 = book1;
Then we set its properties:
book2.Id = 2;
book2.Title = "Second Book";
book2.Publisher = "Publisher2";
The images show how the code is represented in memory. We assigned the address pointed to by “book1” to “book2” as well. From this point on, any change made to the properties through “book2” will be observed through “book1” too, since “book1” and “book2” point to the same address.
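We can also confirm this in code rather than in pictures. The sketch below uses object.ReferenceEquals, which returns true when two variables refer to the very same instance; the class name `AliasDemo` is just a label for this example.

```csharp
using System;

class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Publisher { get; set; }
}

static class AliasDemo
{
    static void Main()
    {
        Book book1 = new Book { Id = 1, Title = "First Book", Publisher = "Publisher1" };
        Book book2 = book1; // copies the reference, not the object

        Console.WriteLine(object.ReferenceEquals(book1, book2)); // True

        book2.Id = 2; // written through book2 ...
        Console.WriteLine(book1.Id); // 2 (... and visible through book1)
    }
}
```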
To actually create a copy of an object’s content, which was our goal, we can use the ICloneable interface and the concept of a deep copy.
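One possible way to do this is sketched below; it is an illustration under stated assumptions, not the only approach. Here Book implements ICloneable through MemberwiseClone, which makes a shallow copy. That is enough for this class because its members are an int and immutable strings; a class holding other mutable objects would need to copy those as well to get a true deep copy.

```csharp
using System;

class Book : ICloneable
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Publisher { get; set; }

    // Shallow copy; sufficient here because all members are ints or immutable strings.
    public object Clone() => MemberwiseClone();
}

static class CloneDemo
{
    static void Main()
    {
        Book book1 = new Book { Id = 1, Title = "First Book", Publisher = "Publisher1" };

        Book book2 = (Book)book1.Clone(); // a separate instance on the heap
        book2.Id = 2;
        book2.Publisher = "Publisher2";

        Console.WriteLine(book1.Id + " , " + book1.Title + " , " + book1.Publisher);
        // => 1 , First Book , Publisher1
        Console.WriteLine(book2.Id + " , " + book2.Title + " , " + book2.Publisher);
        // => 2 , First Book , Publisher2
    }
}
```

This time book1 keeps its own values, and the title did not have to be retyped for book2.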
Third Premise
We have explained that a string is a reference type. However, when we initialize a string variable, we do not need to use the “new” keyword explicitly. Instead, the initialization syntax, such as assigning a string literal to the variable, makes the .NET CLR create the string object for us.
Although a string is an immutable type, we can still “change” the content of the lorem string and read the modified text through the “input” variable. The reason is that a string object is never modified in place: for every change, the .NET CLR creates a new string instance, and the variable then refers to that new instance.
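A short sketch makes this visible. The variable name `copy` and the class name `StringDemo` are introduced only for this example: after the concatenation, `copy` refers to a new string instance, while `input` still refers to the original one.

```csharp
using System;

static class StringDemo
{
    static void Main()
    {
        string input = "Lorem ipsum dolor";
        string copy = input; // both variables refer to the same string instance

        Console.WriteLine(object.ReferenceEquals(input, copy)); // True

        copy += " sit amet"; // creates a new string; the original is untouched

        Console.WriteLine(input); // Lorem ipsum dolor
        Console.WriteLine(copy);  // Lorem ipsum dolor sit amet
        Console.WriteLine(object.ReferenceEquals(input, copy)); // False
    }
}
```

This is why the book2 problem from the second premise does not occur with strings, even though string is a reference type.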