 Pavneet Singh

# Deep Dive into Array Basics Part 1

• Aug 28, 2018

## Introduction

Data management is the backbone of all applications. The performance of an application is significantly dependent upon the choice of data structure to access and manipulate data. Every data structure has its own implementation and implications.

In earlier times, CPUs were specifically optimized to work with instruction sets that operate on one-dimensional data, known as vector processing. Modern GPUs are a modified form of vector processor that supports rapid computation.

Arrays (AKA vectors) are among the most important and widely used data structures, and this guide covers the array data structure in depth.

## Introduction to Arrays

An array is defined as a collection of similar data with a fixed length, stored in a linear fashion. Every element in an array is accessed by an index (a numerical value), and the location of every element can be computed from its index by applying a simple mathematical operation.

In the example below:

• `book` is an array
• Chapters are the data stored inside the `book` array
• Every chapter in the `book` array is accessible with an index

```csharp
char[] book = {'A','B','C','D'};
```

## Array Declarations and Initialization

Declaration is the process of defining a placeholder name preceded by its type. Arrays are declared with square brackets:

```csharp
// General form: type[] name_of_placeholder;
char[] book;
```

Initialization means allocating memory for the array, either with the `new` keyword or with an array initializer.

```csharp
// General form: name_of_placeholder = new data_type[size_of_array];
book = new char[4];
```

Array initializers: An array can also be created from an initialization list, `{array elements separated by commas}`, which is suitable when the data is already known at the point of declaration. For example, each of the following creates an array of length 4 with the values A, B, C, D:

```csharp
// Three equivalent forms; use only one per variable name.
char[] array = new char[] {'A','B','C','D'};
char[] array = {'A','B','C','D'};
char[] array = new[] {'A','B','C','D'};
```

Implicitly typed arrays: `var` (introduced in C# 3) is a type placeholder whose actual type is inferred from the right side of the expression, which lets us omit the explicit data type. The above examples can be declared using `var` as:

```csharp
// Equivalent alternatives; use only one per variable name.
var array = new char[4]; // creates array of length 4
var array = new char[4] {'A','B','C','D'};
var array = new char[] {'A','B','C','D'};
var array = new[] {'A','B','C','D'};
```

An array initializer on its own cannot be used with the `var` keyword, so `var array = { 'A', 'B' , 'C', 'D' };` is invalid. The design team made this decision to avoid the side effects a bare initializer would have on the parser, which would otherwise have to parse nested `{}` blocks. The solution is to use the `new` keyword with `var` for implicitly typed arrays.

### Memory Allocation

A process's memory is divided into separate logical sections, and the heap is the section known as dynamic memory, meaning that memory requested at runtime is allocated on the heap.

Memory for arrays is allocated contiguously. If you create an array of four `char` elements, each `char` takes 2 bytes and the application allocates roughly:

• On a 32-bit OS, 4 bytes for the reference variable, so (2 * 4) + 4 = 12 bytes
• On a 64-bit OS, 8 bytes for the reference variable, so (2 * 4) + 8 = 16 bytes

These figures ignore the bookkeeping data (such as the array's length) that the runtime stores alongside the elements. The elements may be stored at memory addresses 4000 to 4008 (4008 is exclusive), like this:

```text
index    Data      Memory addresses
        _____
  0    |  A  |     4000  <=  name of the array
       |_____|
  1    |  B  |     4002
       |_____|
  2    |  C  |     4004
       |_____|
  3    |  D  |     4006
       |_____|
```

This is a general scenario; the actual allocation might differ slightly depending on the OS and runtime implementation.

The value of an array element is read by using the array name followed by the index within square brackets:

```csharp
// General form: array_name[numeric_index]
char value = book[2]; // 'C'
```

The memory address of `C` can be computed using initial address of memory, index and size:

```text
initial address + (index * size of data type)
4000            + (2 * 2)
4000            +    4
=> 4004
```
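The same arithmetic can be written as a small helper. This is only a sketch: `AddressDemo` and `ElementAddress` are hypothetical names introduced here, and the base address 4000 is the illustrative value from the diagram, not an address the runtime would actually report.

```csharp
using System;

public static class AddressDemo
{
    // Hypothetical helper: base address + (index * element size in bytes).
    public static long ElementAddress(long baseAddress, int index, int elementSize)
    {
        return baseAddress + (long)index * elementSize;
    }

    public static void Main()
    {
        char[] book = { 'A', 'B', 'C', 'D' };
        const long baseAddress = 4000;     // illustrative value from the diagram
        const int charSize = sizeof(char); // a char occupies 2 bytes in C#

        for (int i = 0; i < book.Length; i++)
        {
            // Prints A -> 4000, B -> 4002, C -> 4004, D -> 4006
            Console.WriteLine("{0} -> {1}", book[i], ElementAddress(baseAddress, i, charSize));
        }
    }
}
```

Because the computation is a single multiplication and addition, reading any element takes the same time regardless of its position in the array.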

To modify an array element, the assignment operator (`=`) is used along with the array name and index:

```csharp
// General form: array_name[numeric_index] = new_element;
book[0] = 'P'; // Replace A with P
```

The modified `book` array will be:

```text
['P','B','C','D']
```

Important: There are two kinds of types in C#:

1. Value types: Elements of a value type always have a fixed default value when an array is created, e.g. the default value of an `int` element is `0` and of a `bool` element is `false`.

2. Reference types: The default value of a reference type element is `null`.

### Array fundamentals

• Zero-based numbering: Array index always starts at 0 and the value of the last index will be `length_of_array - 1`.

• Fixed length: Once an array is created, its length cannot be changed. To store more elements, a larger array must be created and the values of the smaller array copied into it.

• By default, every element in an array is initialized to its type's default value.

• Every array object is derived from the `Array` class and inherits its properties (`Length`, `Rank`) and methods (`GetLowerBound`, `GetUpperBound`).

• An array can also represent matrix data as a two dimensional, multi-dimensional, or jagged array. In a jagged array, the size of each row can be different and can be defined at run time as per requirement.

• By default, the maximum size of an array is two gigabytes (GB). In a 64-bit environment, the `gcAllowVeryLargeObjects` configuration setting can be enabled to allow larger arrays, with up to about four billion elements.

• Arrays are not thread safe; multiple threads can modify the data at the same time, which can cause inconsistency issues.
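Several of the points above can be checked with a short program. This is a minimal sketch; `ArrayFundamentalsDemo` and `Grow` are hypothetical names introduced for illustration, and `Grow` assumes the requested length is at least the source length.

```csharp
using System;

public static class ArrayFundamentalsDemo
{
    // Hypothetical helper: copies a fixed-length array into a new, larger one.
    // Assumes newLength >= source.Length.
    public static char[] Grow(char[] source, int newLength)
    {
        char[] larger = new char[newLength];
        Array.Copy(source, larger, source.Length);
        return larger;
    }

    public static void Main()
    {
        char[] book = { 'A', 'B', 'C', 'D' };

        Console.WriteLine(book.Length);            // 4
        Console.WriteLine(book.Rank);              // 1 (one dimension)
        Console.WriteLine(book.GetLowerBound(0));  // 0
        Console.WriteLine(book.GetUpperBound(0));  // 3, i.e. Length - 1

        // Default values: 0 for int, null for reference types.
        int[] numbers = new int[2];
        string[] names = new string[2];
        Console.WriteLine(numbers[0]);             // 0
        Console.WriteLine(names[0] == null);       // True

        // The length is fixed, so "growing" means copying into a larger array.
        char[] biggerBook = Grow(book, 6);
        Console.WriteLine(biggerBook.Length);      // 6

        // Jagged array: each row can have a different length.
        int[][] jagged = new int[2][];
        jagged[0] = new int[] { 1, 2, 3 };
        jagged[1] = new int[] { 4 };
        Console.WriteLine(jagged[0].Length + "," + jagged[1].Length); // 3,1
    }
}
```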

## Array Class and Structure of an Array

Every array variable has the `Array` class as its base type and can access its properties and methods. In C or C++, a raw array passed to a function does not carry its length with it; because a C# array is an object of a class, it can store additional information like `Length` and `IsReadOnly`, and offer helpful methods like `Clone` and `Equals`.

`Clone` provides a shallow copy, which is sufficient for a one-dimensional array of value types. When the elements are themselves references, as in a jagged array, the copy shares those elements with the source, so you need to copy each and every element from the source array to the new array. This is known as a deep copy.
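To see the difference between a shallow and a deep copy, the sketch below uses a jagged array, where each row is a separate array object. `CloneDemo` and `DeepCopy` are hypothetical names introduced here, and `DeepCopy` assumes that no row is `null`.

```csharp
using System;

public static class CloneDemo
{
    // Hypothetical helper: deep copy of a jagged array, cloning row by row.
    // Assumes every row is non-null.
    public static int[][] DeepCopy(int[][] source)
    {
        int[][] copy = new int[source.Length][];
        for (int i = 0; i < source.Length; i++)
        {
            copy[i] = (int[])source[i].Clone();
        }
        return copy;
    }

    public static void Main()
    {
        int[][] original = { new[] { 1, 2 }, new[] { 3 } };

        int[][] shallow = (int[][])original.Clone(); // copies only the row references
        int[][] deep = DeepCopy(original);           // copies the rows themselves

        original[0][0] = 99;

        Console.WriteLine(shallow[0][0]); // 99: the shallow copy shares the row
        Console.WriteLine(deep[0][0]);    // 1: the deep copy has its own row
    }
}
```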

• Array class: The `Array` class implements several interfaces to provide a fixed set of features consistently across every supported framework and platform.

• IList: Lets an array be used wherever an indexable collection is expected. With LINQ (`using System.Linq`), which can convert data from one collection type to another, an array can be turned into a dynamically sized `List<T>`.

```csharp
using System.Collections.Generic;
using System.Linq;

int[] intArr = new [] { 11, 2, 0, 14, 112 };
List<int> intList = intArr.ToList(); // convert array to List
```
• ICloneable: To support the `Clone` method.
```csharp
int[] arr = new int[3]; // the length 3 is chosen only for illustration
int[] temp = (int[])arr.Clone();
```
• IStructuralEquatable: To compare arrays element by element, index by index.
```csharp
// Requires: using System.Collections;
StructuralComparisons.StructuralEqualityComparer.Equals(new int[]{1,2}, new int[]{2,1}); // False

StructuralComparisons.StructuralEqualityComparer.Equals(new int[]{1,2}, new int[]{1,2}); // True
```
• IStructuralComparable: To support methods like `Sort` by defining an order or creating a customized ordering.
```csharp
StructuralComparisons.StructuralComparer.Compare(new int[]{1,2}, new int[]{2,1}); // -1, first comes before second

StructuralComparisons.StructuralComparer.Compare(new int[]{1,2}, new int[]{1,2}); // 0, both are equal

StructuralComparisons.StructuralComparer.Compare(new int[]{2,1}, new int[]{1,2}); // 1, first comes after second
```

## Array Traversal

Traversal is the process of accessing each element of an array to perform a read or update operation. To print the `book` array, you can write:

```csharp
Console.WriteLine("["+book[0]+","+book[1]+","+book[2]+","+book[3]+"]"); // direct access
Console.WriteLine("[{0},{1},{2},{3}]",book[0],book[1],book[2],book[3]); // composite format string
Console.WriteLine("[{0}]", string.Join(",", book)); // join the elements with a comma as separator
// Output for all 3 statements:
// [A,B,C,D]
```

As you can observe, the only part that changes when accessing the array is the index value. So, we just need to execute the `book[index_value]` statement with the index value in increasing order. This can be achieved with a simple `for` loop.

Loops allow us to execute the same set of instructions several times in a defined sequence. The syntax of a simple `for` loop is:

```csharp
for(int i = initial_value ; exit_condition ; increment/decrement){
    // statement
}
```
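As a concrete instance of that syntax, the sketch below visits every index of `book` in increasing order and builds the same `[A,B,C,D]` text as the earlier print statements. `TraversalDemo` and `Format` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text;

public static class TraversalDemo
{
    // Hypothetical helper: builds "[A,B,C,D]" by visiting each index in order.
    public static string Format(char[] items)
    {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < items.Length; i++)
        {
            if (i > 0) sb.Append(','); // separator between elements only
            sb.Append(items[i]);
        }
        return sb.Append(']').ToString();
    }

    public static void Main()
    {
        char[] book = { 'A', 'B', 'C', 'D' };
        Console.WriteLine(Format(book)); // [A,B,C,D]

        // foreach visits the same elements when the index itself is not needed.
        foreach (char chapter in book)
        {
            Console.Write(chapter);
        }
        Console.WriteLine(); // ABCD
    }
}
```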

Below is a simple program to replace each value of the `book` array with a random letter:

```csharp
Random rand = new Random();
string strA_Z = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
for(int i = 0 ; i < book.Length ; i++){
    // Note: the same composite format pattern is used to display the old and the new value
    Console.Write("book[{0}] = {1}",i,book[i]);
    // random number between 0 - 25
    int randomIndex = rand.Next(strA_Z.Length);
    // replace element with a random letter
    book[i] = strA_Z[randomIndex];
    Console.WriteLine(" : book[{0}] = {1}",i,book[i]);
}
// Output (random, so yours may differ):
// book[0] = A : book[0] = H
// book[1] = B : book[1] = U
// book[2] = C : book[2] = G
// book[3] = D : book[3] = O
```

Note: A `string` can be read by index like an array because the `string` class has a built-in indexer.

## Use of Out, Ref, and Params Keywords

Arrays can be passed from one method to another as a parameter to process array data. When an array is passed as an ordinary parameter, the callee method can access or manipulate the array data, but assigning a new array object to the parameter does not affect the caller's array reference.

Note: Caller is the method that calls another method - e.g. Main is a caller and ChangeReference is a callee method, invoked by Main method.

For example:

```csharp
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    ChangeReference(book); // no change after this executes
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
}
static void ChangeReference(char[] book){
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    book = new[]{'W','X','Y','Z'};
}
```

In the above example, the `book` array is passed to another method where a new array object is assigned to the `book` reference. Though this will not change the original `book` reference pointing to `A,B,C,D`, the `ChangeReference` method can access or modify the elements inside of the `book` array.
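The other half of that statement, that the callee can modify the elements, can be sketched as follows. `PassArrayDemo` and `ChangeElement` are hypothetical names introduced here, and the helper assumes a non-empty array.

```csharp
using System;

public static class PassArrayDemo
{
    // Hypothetical helper: modifies an element in place. The change is visible
    // to the caller because both variables refer to the same array object.
    // Assumes the array has at least one element.
    public static void ChangeElement(char[] book)
    {
        book[0] = 'P';
    }

    public static void Main()
    {
        char[] book = { 'A', 'B', 'C', 'D' };
        ChangeElement(book);
        Console.WriteLine("[{0}]", string.Join(",", book)); // [P,B,C,D]
    }
}
```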

There is the possibility that an uninitialized array could be passed, or that a method needs to initialize the array reference before using it and make changes that are reflected in the caller method. Fortunately, C# provides the `ref` and `out` keywords to apply constraints on references at compilation time (C# 7 extended them, for example with inline `out` variable declarations).

### Out

`out` indicates to the compiler that the received array reference must be initialized in the callee method before use; trying to use the array reference before initialization causes a compilation error.

```csharp
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    ChangeReference(out book);
    Console.WriteLine("[{0}]",string.Join(",",book)); // [W,X,Y,Z]
    // ChangeReference(out null); // compile time error, un-assignable reference
    ChangeReferenceErr(out book);
}
static void ChangeReference(out char[] book){
    book = new[]{'W','X','Y','Z'};
}
static void ChangeReferenceErr(out char[] book){
    // Console.WriteLine("[{0}]",string.Join(",",book)); // compilation error, cannot use before initialization
    book = new[]{'W','X','Y','Z'}; // do this before using book
}
```

Note:

• The passed reference should be an assignable reference, meaning you cannot pass `null` as a parameter with `out` although you can assign `null` value to reference inside callee method.

• It doesn't matter whether the array reference was initialized beforehand or not; the callee method has to initialize the reference before using it.
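A common use of `out` is a method that reports success through its return value and hands back the data through the `out` parameter. The sketch below illustrates that pattern; `OutDemo` and `TrySplitChapters` are hypothetical names introduced here, not part of any library.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical helper: returns false and an empty array for empty input.
    public static bool TrySplitChapters(string text, out char[] chapters)
    {
        if (string.IsNullOrEmpty(text))
        {
            chapters = new char[0]; // an out parameter must be assigned on every path
            return false;
        }
        chapters = text.ToCharArray();
        return true;
    }

    public static void Main()
    {
        char[] book; // no initial value is required for an out argument
        if (TrySplitChapters("ABCD", out book))
        {
            Console.WriteLine("[{0}]", string.Join(",", book)); // [A,B,C,D]
        }
    }
}
```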

### Ref

`ref` acts as a constraint that the array reference must be initialized before it is passed to another method.

```csharp
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    PrintArrayValues(ref book); // [A,B,C,D]
    char[] bookNull;
    // PrintArrayValues(ref bookNull); // compile time error, use of unassigned local variable
    char[] upComingBooks = null;
    PrintArrayValues(ref upComingBooks); // runtime error, ArgumentNullException, because upComingBooks is null
}
static void PrintArrayValues(ref char[] book){
    Console.WriteLine("[{0}]",string.Join(",",book));
}
```

Note:

• The caller method can initialize the reference as `null`, which will crash the application if you perform any operation that requires a non-null reference

• `ref` is useful when one method wants to send data and receive modified data; it's two-way communication. `out`, by contrast, is one-way communication: the callee must always initialize the reference, and any value the caller assigned beforehand cannot be read

• `ref` and `out` are also applicable to value type data

• A method cannot be overloaded on the basis of `ref` and `out` alone

```csharp
static void add (ref int a){}
// static void add (out int a){} // compilation error, ref and out are not enough to overload a method
```
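The two-way nature of `ref` can be sketched with a method that reads the caller's array and then replaces it with a larger one. `RefDemo` and `Append` are hypothetical names introduced for illustration, and the helper assumes the array passed in is not `null`.

```csharp
using System;

public static class RefDemo
{
    // Hypothetical helper: reads the caller's array and replaces it with a
    // larger one that has the new chapter at the end. Assumes book is non-null.
    public static void Append(ref char[] book, char chapter)
    {
        char[] larger = new char[book.Length + 1];
        Array.Copy(book, larger, book.Length);
        larger[book.Length] = chapter;
        book = larger; // the caller's variable now refers to the new array
    }

    public static void Main()
    {
        char[] book = { 'A', 'B', 'C', 'D' };
        Append(ref book, 'E');
        Console.WriteLine("[{0}]", string.Join(",", book)); // [A,B,C,D,E]
    }
}
```

Without `ref`, the assignment on the last line of `Append` would only change the callee's local copy of the reference, as in the earlier `ChangeReference` example.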

### Params

The `params` keyword allows us to receive any arbitrary number of parameters as an array.

```csharp
static void Main(string[] args){
    TotalAnyLengthArray(1,2,3);          // 6
    TotalAnyLengthArray(1,2,3,3,4,5,5);  // 23
    TotalAnyLengthArray();               // 0
}
static void TotalAnyLengthArray(params int[] ints){
    long sum = 0;
    for(int i = 0 ; i < ints.Length ; i++){
        sum += ints[i];
    }
    Console.WriteLine("Total is {0}",sum);
}
```

Other parameters can be declared before it, as long as the `params` parameter is kept as the last parameter:

```csharp
static void Main(string[] args){
    TotalAnyLengthArrayAndVerify(6,1,2,3);           // True
    TotalAnyLengthArrayAndVerify(20,1,2,3,3,4,5,5);  // False, total is 23 not 20
}
static void TotalAnyLengthArrayAndVerify(int expectedTot, params int[] ints){
    long sum = 0;
    for(int i = 0 ; i < ints.Length ; i++){
        sum += ints[i];
    }
    Console.WriteLine("Match is {0}", sum == expectedTot);
}
```
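A `params` parameter also accepts an existing array directly, which can be sketched as follows. `ParamsDemo` and `Total` are hypothetical names introduced here; `Total` returns the sum instead of printing it so the result can be reused.

```csharp
using System;

public static class ParamsDemo
{
    // Hypothetical helper: sums any number of int arguments.
    public static long Total(params int[] ints)
    {
        long sum = 0;
        foreach (int value in ints)
        {
            sum += value;
        }
        return sum;
    }

    public static void Main()
    {
        int[] existing = { 1, 2, 3 };
        Console.WriteLine(Total(existing)); // 6, the array is passed as-is
        Console.WriteLine(Total(1, 2, 3));  // 6, the compiler builds the array
        Console.WriteLine(Total());         // 0, an empty array is passed
    }
}
```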

## Key Points

• Array indexes start from `0`, and the type of data an array can hold is restricted by the type of the array.
```csharp
object[] all = new object[]{1,true,""}; // object can store any type
// object[] listInt = new int[] {1, 2}; // not possible, does not compile
```
• Looping over an array is generally faster than looping over a list, in some cases almost twice as fast.

• Never compare the contents of arrays with `==`, which compares references. Instead, use `SequenceEqual` (from `System.Linq`) with arrays as:

```csharp
char[] nName = new[]{'P', 'A', 'V' };
Console.WriteLine(nName == new char[]{'P','A','V'}); // False
Console.WriteLine(nName.SequenceEqual(new char[]{'P','A','V'})); // True
```
• It's good practice to return an empty array instead of `null` (to avoid null checks) in case there is an issue with data availability or consistency.
```csharp
char[] a = new char[]{}; // create empty array
```
• You can convert a char array to a string to perform string manipulations.
```csharp
char[] chars = {'P','A','V','N','E','E','T'};
string name = new string(chars); //PAVNEET
```
