Data management is the backbone of all applications. The performance of an application is significantly dependent upon the choice of data structure to access and manipulate data. Every data structure has its own implementation and implications.
In earlier times, CPUs were specifically optimized to work with instruction sets that operate on one-dimensional data, known as vector processing. Modern GPUs are a modified version of a vector processor built to support rapid computation.
Arrays (also known as vectors) are among the most important and widely used data structures, and this guide covers the array data structure in depth.
An array is defined as a collection of similar data with fixed length, stored in a linear fashion. Every element in an array is accessed by an index (a numerical value) and every index can be computed by applying a mathematical operation.
Consider the example below:
char[] book = {'A','B','C','D'};
Declaration is the process of defining a placeholder name preceded by its type. Arrays are declared with square brackets after the type:
type[] name_of_placeholder;
char[] book;
Initialization means allocating memory for the object with the help of the new keyword or an array initializer.
name_of_placeholder = new data_type[size_of_array];
book = new char[4];
Array initializers: Arrays can also be created using an initialization list {array elements separated by commas}, which is suitable when you already have the data while declaring the array reference, e.g. to create an array of length 4 with the values A, B, C, D, use any one of these equivalent forms:
char[] array = new char[] {'A','B','C','D'};
char[] array = {'A','B','C','D'};
char[] array = new[] {'A','B','C','D'};
Implicitly typed array: var is the variable type placeholder (introduced in C# 3) whose type is obtained from the right side of the expression using type inference, which lets us omit the explicit data type. The above examples can be declared using var as:
var array = new char[4]; // creates array of length 4
var array = new char[4] {'A','B','C','D'};
var array = new char[] {'A','B','C','D'};
var array = new[] {'A','B','C','D' };
An array initializer cannot be used on its own with the var keyword, so var array = { 'A', 'B' , 'C', 'D' }; is invalid. The design team decided against allowing this form to avoid its side effects on the parser, which would otherwise have to parse nested blocks of {}. The solution is to use the new keyword with var for the implicitly typed syntax.
The main memory is divided into separate logical sections, and the heap is known as dynamic memory, meaning that memory requested at runtime is allocated on the heap.
Memory for arrays is allocated contiguously. This means that if you create an array of four char elements, the application will allocate:
On a 32-bit OS, a char is 2 bytes and the reference variable is 4 bytes, so (2 * 4) + 4 = 12 bytes
On a 64-bit OS, a char is 2 bytes and the reference variable is 8 bytes, so (2 * 4) + 8 = 16 bytes
The elements may be stored at memory addresses 4000 to 4008 (4008 is exclusive), which looks like this:
index   Data   Memory addresses
        _____
  0    |  A  |  4000  <= name of the array
       |_____|           pointing to initial address
  1    |  B  |  4002
       |_____|
  2    |  C  |  4004
       |_____|
  3    |  D  |  4006
       |_____|
This is a simplified scenario; the actual allocation differs slightly by runtime and OS implementation (for example, the runtime also stores some bookkeeping data, such as the array length, alongside the elements).
The value of an array element is read by using the array name followed by the index within square brackets:
array_name[numeric_index]
e.g.       output
book[2]    C
The memory address of C can be computed using the initial memory address, the index, and the element size:
initial address + (index * size of data type)
4000 + (2 * 2)
4000 + 4
=> 4004
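The address arithmetic above can be sketched in code. This is only an illustration of the simplified model used in this guide (a base address of 4000 and 2-byte char elements); the AddressMath class and ElementAddress method are hypothetical names, and ordinary C# code never computes element addresses by hand.

```csharp
using System;

public static class AddressMath
{
    // Simplified model: address = base + index * element size.
    // baseAddress is an illustrative number, not a real pointer.
    public static long ElementAddress(long baseAddress, int index, int elementSize)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        return baseAddress + (long)index * elementSize;
    }
}
```

With this sketch, AddressMath.ElementAddress(4000, 2, sizeof(char)) evaluates to 4004, matching the calculation for book[2].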
To modify an array element, the assignment (=) operator is used along with the array name and index:
array_name[numeric_index] = new_element
book[0] = 'P'; // Replace A with P
The modified book array will be:
['P','B','C','D'];
Important: There are two kinds of types in C#:
Value type: Elements of a value type always have a fixed default value when initialized, e.g. the default value of an int element is 0 and of a bool element is false.
Reference type: The default value of a reference type is null.
Zero-based numbering: The array index always starts at 0, and the last index is length_of_array - 1.
Fixed length: Once an array is created, its length cannot be changed. To store more values, a larger array must be created and the values from the smaller array copied into it.
By default, every element in an array is initialized to its default value.
Every array object is derived from the Array class and inherits its properties (Length, Rank) and methods (GetLowerBound, GetUpperBound).
An array can also represent matrix data as a two-dimensional, multi-dimensional, or jagged array. In a jagged array, each row can have a different size, which can be defined at run time as required.
By default, an array cannot exceed two gigabytes (GB) in total size. In a 64-bit environment, the gcAllowVeryLargeObjects configuration setting (a runtime configuration element, not a method) can be enabled to lift that limit, allowing up to roughly four billion elements.
Arrays are not thread safe; multiple threads can modify the data at the same time, which can cause inconsistency issues.
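Two of the points above, fixed length and jagged arrays, can be illustrated with a short sketch. The ArrayNotes class and its Grow and BuildTriangle methods are hypothetical names chosen for this example; the built-in Array.Resize method performs the same allocate-and-copy step as Grow.

```csharp
using System;

public static class ArrayNotes
{
    // Arrays cannot grow in place: allocate a larger array and copy into it.
    public static char[] Grow(char[] source, int newLength)
    {
        if (newLength < source.Length)
            throw new ArgumentException("newLength must not be smaller than the source length.");
        char[] larger = new char[newLength];        // new slots hold '\0', the default char
        Array.Copy(source, larger, source.Length);  // copy the existing elements
        return larger;
    }

    // A jagged array: each row is its own array and may have a different length.
    public static int[][] BuildTriangle(int rows)
    {
        int[][] triangle = new int[rows][];         // the rows themselves start out as null
        for (int r = 0; r < rows; r++)
        {
            triangle[r] = new int[r + 1];           // row r has r + 1 zero-initialized elements
        }
        return triangle;
    }
}
```

Calling ArrayNotes.Grow(new[] {'A','B'}, 4) returns a new four-element array whose first two elements are A and B, while the original array is left untouched. In the result of ArrayNotes.BuildTriangle(3), the rows have lengths 1, 2, and 3.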
Every array variable has the Array class as its base type and can access its properties and methods. In C or C++, you cannot get the array length from the array object, so being an object of a class allows array objects to carry additional information like Length and IsReadOnly, and helpful methods like Clone and Equals.
Clone provides a shallow copy, which is sufficient for one-dimensional arrays of value types. For a jagged array or an array of reference types, the shallow copy shares the same inner objects with the source, so you need to copy each and every element from the source array to the new array. This is known as a deep copy.
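The difference between a shallow and a deep copy becomes visible with an array of arrays. The following is a minimal sketch; CopyDemo and DeepCopy are hypothetical names, and the approach shown (cloning each inner array) is one straightforward way to deep-copy a jagged int array, not the only one.

```csharp
using System;

public static class CopyDemo
{
    // Deep copy of a jagged array: clone every inner array as well.
    public static int[][] DeepCopy(int[][] source)
    {
        int[][] copy = new int[source.Length][];
        for (int i = 0; i < source.Length; i++)
        {
            // Clone() on an int[] copies the values themselves; null rows stay null.
            copy[i] = source[i] == null ? null : (int[])source[i].Clone();
        }
        return copy;
    }
}
```

Suppose original is a jagged array holding the rows {1, 2} and {3}, shallow is (int[][])original.Clone(), and deep is CopyDemo.DeepCopy(original). After original[0][0] = 99, shallow[0][0] is also 99 because both outer arrays point to the same inner row, whereas deep[0][0] is still 1.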
Array class: The Array class implements methods from several interfaces to provide a fixed set of features for consistency across any supported framework and platform.
IList: To support collections like a dynamic list with no fixed size. The conversion is supported by System.Linq, which is used to convert data from one type to another.
using System.Collections.Generic;
using System.Linq;
int[] intArr = new [] { 11, 2, 0, 14, 112 };
List<int> intList = intArr.ToList(); // convert array to List
ICloneable: To create a shallow copy of an array with Clone.
int[] arr = new int[2];
int[] temp = (int[])arr.Clone(); // shallow copy of arr
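IStructuralEquatable: To compare two arrays element by element for equality.
// requires: using System.Collections;
StructuralComparisons.StructuralEqualityComparer.Equals(new int[]{1,2}, new int[]{2,1}); // False
StructuralComparisons.StructuralEqualityComparer.Equals(new int[]{1,2}, new int[]{1,2}); // True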
IStructuralComparable: To compare two arrays element by element; this can be used with sort to define the order or create customized ordering.
// requires: using System.Collections;
StructuralComparisons.StructuralComparer.Compare(new int[]{1,2}, new int[]{2,1}); // -1, first comes before second
StructuralComparisons.StructuralComparer.Compare(new int[]{1,2}, new int[]{1,2}); // 0, both are equal
StructuralComparisons.StructuralComparer.Compare(new int[]{2,1}, new int[]{1,2}); // 1, first comes after second
Traversal is the process of accessing each element of an array to perform a read or update operation. To print the book array, you can write:
Console.WriteLine("["+book[0]+","+book[1]+","+book[2]+","+book[3]+"]"); // direct access
Console.WriteLine("[{0},{1},{2},{3}]",book[0],book[1],book[2],book[3]); // composite format string
Console.WriteLine("[{0}]", string.Join(",", book)); // join the elements using comma as separator
Output: [A,B,C,D] // for all 3 statements
As you can observe, the only dynamic part of accessing the array is the index value. So, we just need to execute the book[index_value] statement with the index value in increasing order. This can be achieved by simply using a for loop.
Loops allow us to execute the same set of instructions several times in a defined sequence. A simple for loop syntax is:
for(int i = initial_value ; exit_condition ; increment/decrement){
    // statement
}
Below is a simple program that replaces each value of the book array with a random letter:
Random rand = new Random();
string strA_Z = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
for(int i = 0 ; i < book.Length ; i++){
    // Note: the same format string is reused to display the new values
    Console.Write("book[{0}] = {1}",i,book[i]);
    // random number between 0 and 25
    int randomIndex = rand.Next(strA_Z.Length);
    // replace element with a random letter
    book[i] = strA_Z[randomIndex];
    Console.WriteLine(" : book[{0}] = {1}",i,book[i]);
}
Output: It's random, so output may differ
book[0] = A : book[0] = H
book[1] = B : book[1] = U
book[2] = C : book[2] = G
book[3] = D : book[3] = O
Note: String objects can be indexed like an array (for reading only) because of the built-in indexer support in the string class.
Arrays can be passed from one method to another as a parameter to process array data. When an array is passed as an ordinary parameter, the callee receives a copy of the reference: it can read or modify the array's elements, but assigning a new array object to its parameter does not change the reference held by the caller.
Note: The caller is the method that calls another method, e.g. Main is the caller and ChangeReference is the callee, invoked by Main.
For example:
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    ChangeReference(book); // no change after this executes
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
}
static void ChangeReference(char[] book){
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    book = new[]{'W','X','Y','Z'};
}
In the above example, the book array is passed to another method where a new array object is assigned to the book reference. This does not change the original book reference in Main, which still points to A,B,C,D. The ChangeReference method can, however, access or modify the elements inside the book array it receives.
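To make the distinction concrete, the sketch below contrasts writing to an element with reassigning the parameter. PassingDemo, ReplaceFirst, and Reassign are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class PassingDemo
{
    // Writes through the shared reference: the caller sees this change.
    public static void ReplaceFirst(char[] book, char replacement)
    {
        book[0] = replacement;
    }

    // Reassigns only the callee's local copy of the reference: the caller is unaffected.
    public static void Reassign(char[] book)
    {
        book = new[] { 'W', 'X', 'Y', 'Z' };
        book[0] = '?';   // modifies the new array, not the caller's array
    }
}
```

If the caller holds book = {'A','B','C','D'}, then after PassingDemo.ReplaceFirst(book, 'P') the caller's array is P,B,C,D, and a following call to PassingDemo.Reassign(book) leaves it as P,B,C,D.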
There is the possibility that an uninitialized array could be passed, or that a method needs to initialize the array reference before using it and make changes that are reflected in the caller. Fortunately, C# provides the ref and out keywords to apply constraints on references at compile time. Both keywords have been part of the language since its first version; C# 7 added conveniences such as declaring out variables inline.
out indicates to the compiler that the received array reference (in the callee method) must be initialized before use; if you try to use the array reference before initialization, it causes a compilation error.
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    Console.WriteLine("[{0}]",string.Join(",",book)); // [A,B,C,D]
    ChangeReference(out book);
    Console.WriteLine("[{0}]",string.Join(",",book)); // [W,X,Y,Z]
    ChangeReference(out null); // compile time error, un-assignable reference
    ChangeReferenceErr(out book);
}
static void ChangeReference(out char[] book){
    book = new[]{'W','X','Y','Z'};
}
static void ChangeReferenceErr(out char[] book){
    Console.WriteLine("[{0}]",string.Join(",",book)); // compilation error, cannot use before initialization
    book = new[]{'W','X','Y','Z'}; // do this before using book
}
Note:
The passed reference should be an assignable reference, meaning you cannot pass null as a parameter with out, although you can assign a null value to the reference inside the callee method.
It doesn't matter whether the array reference was initialized before or not; the callee method has to initialize the reference before using it.
ref acts as a constraint that the array reference must be assigned before it is passed to another method.
static void Main(string[] args){
    char[] book = new[]{'A','B','C','D'};
    PrintArrayValues(ref book); // [A,B,C,D]
    char[] bookNull;
    PrintArrayValues(ref bookNull); // compile time error, use of unassigned local variable
    char[] upComingBooks = null;
    PrintArrayValues(ref upComingBooks); // runtime error, ArgumentNullException, because upComingBooks is null
}
static void PrintArrayValues(ref char[] book){
    Console.WriteLine("[{0}]",string.Join(",",book));
}
Note:
The caller method can initialize the reference as null, which will crash the application if you perform any operation that requires a non-null reference.
ref is useful when one method wants to send data and receive modified data; it's two-way communication. out, by contrast, is one-way communication: the callee must initialize the reference every time.
ref and out are also applicable to value type data.
Methods cannot be overloaded when they differ only by ref and out:
static void add (ref int a){}
static void add (out int a){} // compilation error, ref and out are not enough to overload a method
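The two-way nature of ref and the one-way nature of out can be seen side by side in a small sketch. RefOutDemo, Append, and TryGetLetters are hypothetical names; the bodies are illustrative only.

```csharp
using System;

public static class RefOutDemo
{
    // ref: the caller passes an assigned array; the callee may read it and replace it.
    public static void Append(ref char[] book, char extra)
    {
        char[] larger = new char[book.Length + 1];
        Array.Copy(book, larger, book.Length);
        larger[book.Length] = extra;
        book = larger;                 // the caller's variable now points to the larger array
    }

    // out: the callee must assign the array on every path before returning.
    public static bool TryGetLetters(string text, out char[] letters)
    {
        if (string.IsNullOrEmpty(text))
        {
            letters = new char[0];     // assignment is still required on the failure path
            return false;
        }
        letters = text.ToCharArray();
        return true;
    }
}
```

With book = {'A','B'}, calling RefOutDemo.Append(ref book, 'C') leaves the caller's book variable pointing to a three-element array A,B,C. RefOutDemo.TryGetLetters("PAV", out letters) returns true and fills letters with P,A,V, regardless of what letters held before the call.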
The params keyword allows a method to receive an arbitrary number of arguments as an array.
static void Main(string[] args){
    TotalAnyLengthArray(1,2,3); // 6
    TotalAnyLengthArray(1,2,3,3,4,5,5); // 23
    TotalAnyLengthArray(); // 0
}
static void TotalAnyLengthArray(params int[] ints){
    long sum = 0;
    for(int i = 0 ; i < ints.Length ; i++){
        sum += ints[i];
    }
    Console.WriteLine("Total is {0}",sum);
}
You can also pass separate int parameters, as long as params is kept as the last parameter:
static void Main(string[] args){
    TotalAnyLengthArrayAndVerify(6,1,2,3); // True
    TotalAnyLengthArrayAndVerify(20,1,2,3,3,4,5,5); // False, total is 23 not 20
}
static void TotalAnyLengthArrayAndVerify(int expectedTot, params int[] ints){
    long sum = 0;
    for(int i = 0 ; i < ints.Length ; i++){
        sum += ints[i];
    }
    Console.WriteLine("Match is {0}", sum == expectedTot);
}
Array indexes start at 0, and the type of data can be restricted by the type of the array.
object[] all = new object[]{1,true,""}; // object can store any type
object[] listInt = new int[] {1, 2}; // not possible: an int[] cannot be converted to object[]
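Looping over an array is generally faster than looping over a list, by up to roughly a factor of two, although the actual difference depends on the runtime and the workload.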
Never compare the contents of arrays with ==, because for reference types it compares references, not elements. Instead, use SequenceEqual (from System.Linq) with arrays:
char[] nName = new[]{'P', 'A', 'V' };
Console.WriteLine(nName == new char[]{'P','A','V'}); // False
Console.WriteLine(nName.SequenceEqual(new char[]{'P','A','V'})); // True
Return an empty array instead of null (to avoid null checks) in case there is an issue with data availability or consistency.
char[] a = new char[]{}; // create empty array
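As a sketch of this tip, the method below returns an empty array rather than null when it has nothing to return. BookStore and GetInitials are hypothetical names, and the '?' placeholder is an assumption made for this example; Array.Empty<T>() is the framework method that returns a shared zero-length array.

```csharp
using System;

public static class BookStore
{
    // Returns the first letter of each title, or an empty array when there is no data.
    public static char[] GetInitials(string[] titles)
    {
        if (titles == null || titles.Length == 0)
        {
            return Array.Empty<char>();   // shared zero-length array, never null
        }
        char[] initials = new char[titles.Length];
        for (int i = 0; i < titles.Length; i++)
        {
            // Assumption for this sketch: use '?' for a null or empty title.
            initials[i] = string.IsNullOrEmpty(titles[i]) ? '?' : titles[i][0];
        }
        return initials;
    }
}
```

Callers can then use the Length property or loop over the result directly, without checking for null first.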
A char array can be converted to a string with the string constructor:
char[] chars = {'P','A','V','N','E','E','T'};
string name = new string(chars); // PAVNEET