Data Structures(Subject Code: BTCOC303) Unit 2 Arrays and Hash Tables

Question 1. What is Array? How data is stored sequentially in computer?

Answer: Sequential access means a group of elements accessed in a predetermined, ordered sequence. Array uses sequential data storage for storing and accessing its elements. The array marks[10] defines the marks of the student in 10 different subjects where each subject marks are located at a particular subscript in the array i.e. marks[0] denotes the marks in first subject, marks[1] denotes the marks in 2nd subject and so on. Following are the important terms to understand the concept of Array.

      Element − Each item stored in an array is called an element.

      Index − Each location of an element in an array has a numerical index, which is used to identify the element.

An array is a variable which can store multiple values of same data type at a time. Consider the following example declaration.

       int a[3];

Total Size allocated     = No. of variables in array * Data type storage size

= 3 * 4 = 12 bytes

Here, compiler allocates total 12 bytes of continuous memory locations with single name ‘a’. But allows to store three different integer values (each in 4 bytes of memory) at a time.  

Array Representation: Arrays can be declared in various ways in different languages. For illustration, let's take C array declaration.

Arrays can be declared in various ways in different languages. For illustration, let's take C array declaration.

Memory Allocation of the array

 int arr [5]

 Properties of the Array

  Each element is of same data type and carries a same size i.e. int = 4 bytes.

  Elements of the array are stored at contiguous memory locations where the first element is stored at the smallest memory location.

  Elements of the array can be randomly accessed since we can calculate the address of each element of the array with the given base address and the size of data element.

Program without array

#include <stdio.h>  

void main ()  

{  

   int marks_1 = 56, marks_2 = 78, marks_3 = 88, marks_4 = 76,        

   marks_5 = 56, marks_6 = 89;   

    float avg = (marks_1 + marks_2 + marks_3 + marks_4 +

     marks_5 +marks_6) / 6 ;   

    printf(avg);   

}  

 Program by using array

#include <stdio.h>  

void main ()  

{   int marks[6] = {56, 78, 88, 76, 56, 89);  

    int i;    

    float avg;  

    for ( i = 0; i < 6; i ++ )   

    {    avg = avg + marks[ i ];  }    

    printf(avg);   

}   

Advantages of Array

   Array provides the single name for the group of variables of the same type therefore, it is easy to remember the name of all the elements of an array.

   Traversing an array is a very simple process, we just need to increment the base address of the array in order to visit each element one by one.

   Any element in the array can be directly accessed by using the index.

Basic Operations on Array:

Following are the basic operations supported by an array.

       Traverse − print all the array elements one by one.

       Insertion − Adds an element at the given index.

       Deletion − Deletes an element at the given index.

       Search − Searches an element using the given index or by the value.

1. Traverse Operation: This operation is to traverse through the elements of an array.

#include <stdio.h>

main() {

   int arr [ ] = { 1, 3, 5, 7, 8 };

   int  n = 5;

   int i = 0;  

   printf ( “ The original array elements are:\n “ ) ;

   for ( i = 0; I < n ;  i++ ) {

      printf ( “ arr [ %d ] = %d \n ", i, arr [ I ] );

   }

}

 Output

The original array elements are:

arr [0] = 1

arr [1] = 3

arr [2] = 5

arr [3] = 7

arr [4] = 8

2. Insert an element at the end

#include<stdio.h>

int insertSorted(int arr[], int n, int key, int capacity)

{ //Cannot insert more elements if n is already more than or equal to capcity

    if (n >= capacity)

       return n;

      arr[n] = key;

    return (n + 1);

}

void main()

{   int arr[20] = {12, 16, 20, 40, 50, 70};

    int capacity = sizeof(arr) / sizeof(arr[0]);

    int n = 6;

    int i, key = 26;

  

    printf("\n Before Insertion: ");

    for (i = 0; i < n; i++)

        printf("%d  ", arr[i]);

  

    // Inserting key

    n = insertSorted(arr, n, key, capacity);

  

    printf("\n After Insertion: ");

    for (i = 0; i < n; i++)

        printf("%d  ",arr[i]);

}

Output:

Before Insertion: 12 16 20 40 50 70

After Insertion: 12 16 20 40 50 70 26

 

3. Delete Operation: In delete operation, the element to be deleted is searched using the linear search and then delete operation is performed followed by shifting the elements.

#include<stdio.h>

// To search a key to be deleted

int findElement(int arr[], int n, int key);

// Function to delete an element

int deleteElement(int arr[], int n, int key)

{   // Find position of element to be deleted

    int pos = findElement(arr, n, key);

    if (pos == - 1)

    {   printf("Element not found");

        return n;

    }

    // Deleting element

    int i;

    for (i = pos; i < n - 1; i++)

        arr[i] = arr[i + 1];

     return n - 1;

}

// Function to implement search operation 

int findElement(int arr[], int n, int key)

{   int i;

    for (i = 0; i < n; i++)

        if (arr[i] == key)

            return i;

    return - 1;

}

int main()

 {  int i;

    int arr[] = {10, 50, 30, 40, 20};

    int n = sizeof(arr) / sizeof(arr[0]);

    int key = 30;

  

    printf("Array before deletion\n");

    for (i = 0; i < n; i++)

      printf("%d  ", arr[i]);


    n = deleteElement(arr, n, key);

    printf("\nArray after deletion\n");

    for (i = 0; i < n; i++)

      printf("%d  ", arr[i]);

    return 0;

}


Output:

Array before deletion

10 50 30 40 20

Array after deletion

10 50 40 20

 4. Search − Searches an element using the given index or by the value.

#include<stdio.h>

// Function to implement search operation 

int findElement(int arr[], int n, int key)

{   int i;

    for (i = 0; i < n; i++)

        if (arr[i] == key)

            return i;

    return -1;

}

int main()

{   int arr[] = {12, 34, 10, 6, 40};

    int n = sizeof(arr) / sizeof(arr[0]);

  

    // Using a last element as search element

    int key = 40;

    int position = findElement(arr, n, key);

    if (position == - 1)

        printf("Element not found");

    else

        printf("Element Found at Position: %d", position + 1 );

      return 0; 

}

Output

Element Found at Position: 5

==============================================================================

Question 2: Difference between linear and non-linear data structure

Answer:

Linear Data Structure

Non Linear Data Structure

The elements are inserted adjacent to each other and can also be retrieved similarly.

Elements which are stored in a non-linear data structure have certain relationship among them while being stored or retrieved. There is a certain definite pattern which always govern the addition of a new element to the structure

Data elements are easy to be retrieved as they can be just accessed in one run.

Data elements are not easy to be retrieved or stored as they follow a strict relationship among the various elements.

The Linear Data Structures are comparatively simpler and provide a certain ease of working with.

The Non Linear Data Structures are complex data structures which can prove to be tricky to an extent.

These do not provide us with efficient memory utilization.

Efficient memory utilization is experienced while working with the non-linear data structures.

Examples: Linked List, Stack, Queue etc.

Examples: Trees, graphs etc.

Q. 3 Sparse Matrices

A matrix is a two-dimensional data object made of m rows and n columns, therefore having total m x n values. If most of the elements of the matrix have 0 value, then it is called a sparse matrix.

Why to use Sparse Matrix instead of simple matrix?

  • Storage: There are lesser non-zero elements than zeros and thus lesser memory can be used to store only those elements.
  • Computing time: Computing time can be saved by logically designing a data structure traversing only non-zero elements..

·         Example:

·         0 0 3 0 4           

·         0 0 5 7 0

·         0 0 0 0 0

·         0 2 6 0 0

·         Representing a sparse matrix by a 2D array leads to wastage of lots of memory as zeroes in the matrix are of no use in most of the cases. So, instead of storing zeroes with non-zero elements, we only store non-zero elements. This means storing non-zero elements with triples- (Row, Column, value).

·         Sparse Matrix Representations can be done in many ways following are two common representations:

  1. Array representation
  2. Linked list representation
Method 1: Using Arrays

2D array is used to represent a sparse matrix in which there are three rows named as

  • Row: Index of row, where non-zero element is located
  • Column: Index of column, where non-zero element is located
  • Value: Value of the non zero element located at index – (row,column)

Method 2: Using Linked Lists

In linked list, each node has four fields. These four fields are defined as:

  • Row: Index of row, where non-zero element is located
  • Column: Index of column, where non-zero element is located
  • Value: Value of the non zero element located at index – (row,column)
  • Next node: Address of the next node

Q. 4 Transpose of sparse matrices

What is transpose of sparse matrix?

Transpose of a matrix is obtained by interchanging rows and columns. ... In another way, we can say that element in the i, j position gets put in the j, i position.

 Q. 5 Hashing

Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects. Hashing is a technique to convert a range of key values into a range of indexes of an array. For example, to use modulo operator to get a range of key values.

Some examples of how hashing is used in our lives include:

  • In universities, each student is assigned a unique roll number that can be used to retrieve information about them.
  • In libraries, each book is assigned a unique number that can be used to determine information about the book, such as its exact position in the library or the users it has been issued to etc.
  • Contact List

In first 2 examples the students and books were hashed to a unique number.

  • Hash Table is a data structure which stores data in an associative manner.
  • In a hash table, data is stored in an array format, where each data value has its own unique index value.
  • Access of data becomes very fast if we know the index of the desired data.
  • Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data.
  • Hash Table uses an array as a storage medium and uses hash technique to generate an index where an element is to be inserted or is to be located from.

Hashing is implemented in two steps:
  • A key is converted into index by using a hash function. This index is used to store the original element, which falls into the hash table.
  • The element is stored in the hash table where it can be quickly retrieved using hashed key.

hash = hashfunc(key)
index = hash % array_size

  In this method, the hash is independent of the array size and it is then reduced to an index (a number between 0 and array_size − 1) by using the modulo operator (%).

Hash function

  • A hash function is any function that can be used to map a data set of an arbitrary size to a data set of a fixed size, which falls into the hash table.
  • The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.

What is Collision?
Since a hash function gets us a small number for a key which is a big integer or string, there is possibility that two keys result in same value.

  The situation where a newly inserted key maps to an already occupied slot in hash table is called collision and must be handled using some collision handling technique.

How to handle Collisions?
There are mainly two methods to handle collision:
1) Separate Chaining
2) Open Addressing

  1. Separate Chaining

  The idea is to make each cell of hash table point to a linked list of records that have same hash function value.

  Let us consider a simple hash function as “key mod 7” and sequence of keys as 50, 700, 76, 85, 92, 73, 101.

Advantages:
1) Simple to implement.
2) Hash table never fills up, we can always add more elements to chain.
3) Less sensitive to the hash function or load factors.
4) It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted.

Disadvantages:
1) Cache performance of chaining is not good as keys are stored using linked list. Open addressing provides better cache performance as everything is stored in same table.
2) Wastage of Space (Some Parts of hash table are never used)
3) If the chain becomes long, then search time can become O(n) in worst case.
4) Uses extra space for links.

  1. Open Addressing (Linear Probing)
  • Like separate chaining, open addressing is a method for handling collisions.
  • In Open Addressing, all elements are stored in the hash table itself.
  • So at any point, size of the table must be greater than or equal to the total number of keys
  • In such a case, we can search the next empty location in the array by looking into the next cell until we find an empty cell.
  • This technique is called linear probing.

Let us consider a simple hash function as “key mod 7” and sequence of keys as 50, 700, 76, 85, 92, 73, 101.

hash table

Tables which can be searched for an item in O(1) time using a hash function to form an address from the key.

hash function

Function which, when applied to the key, produces a integer which can be used as an address in a hash table.

collision

When a hash function maps two different keys to the same table address, a collision is said to occur.

linear probing

A simple re-hashing scheme in which the next slot in the table is checked on a collision.

 

Comments