
Unit II β€” Complete Solutions to All Practice Questions


Practice Questions β€” Arrays


Q1. Define an array as an ADT. List all operations with their time complexities.

Array as an Abstract Data Type (ADT):

An Array ADT is a fixed-size, ordered collection of elements of the same data type, where each element is identified by a numerical index.

ADT Array:
    Data:
        A fixed-size collection of n elements of identical type
        Each element is accessed by an index (0 to n-1)

    Operations:
        create(size)            β†’ Allocate array of given size           β€” O(1)
        get(index)              β†’ Return element at given index          β€” O(1)
        set(index, value)       β†’ Update element at given index          β€” O(1)
        insert(index, value)    β†’ Insert value at index (shift right)    β€” O(n)
        delete(index)           β†’ Remove element at index (shift left)   β€” O(n)
        search(value)           β†’ Find index of given value              β€” O(n)  [linear]
        length()                β†’ Return current number of elements      β€” O(1)
        display()               β†’ Print all elements                     β€” O(n)

    Error Conditions:
        get/set with invalid index β†’ "Index out of bounds"
        insert on full array       β†’ "Array overflow"
        delete on empty array      β†’ "Array underflow"

Key time complexities explained:

Operation            Time     Reason
get(i) / set(i, v)   $O(1)$   Address computed directly: $B + i \times w$
insert(i, v)         $O(n)$   Must shift elements right to create a gap
delete(i)            $O(n)$   Must shift elements left to fill the gap
search(v)            $O(n)$   Worst case: scan entire array (linear search)

The defining characteristic of an array is random access β€” any element can be accessed in constant time using its index, because elements are stored in contiguous memory.
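A minimal sketch of this ADT in C (the `Array` struct and the `arr_` function names are illustrative, not a standard API):

```c
#include <assert.h>

#define CAPACITY 100

// Hypothetical Array ADT: fixed capacity, current length n
typedef struct {
    int data[CAPACITY];
    int n;                 // current number of elements
} Array;

// get/set are O(1): direct index into contiguous storage
int arr_get(const Array *a, int i)   { assert(i >= 0 && i < a->n); return a->data[i]; }
void arr_set(Array *a, int i, int v) { assert(i >= 0 && i < a->n); a->data[i] = v; }

// search is O(n): linear scan, returns index or -1
int arr_search(const Array *a, int v) {
    for (int i = 0; i < a->n; i++)
        if (a->data[i] == v) return i;
    return -1;
}
```

The `assert` calls stand in for the "Index out of bounds" error condition listed above.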


Q2. An array A[10] of integers starts at address 500. Each integer takes 4 bytes. Find the address of A[7].

Given:

  • Base address $B = 500$
  • Element size $w = 4$ bytes (integer)
  • Index $i = 7$
  • Lower bound $l = 0$ (C-style, 0-based indexing)

Formula (1D array):

\[\text{Address of } A[i] = B + (i - l) \times w\]

Calculation:

\[\text{Address of } A[7] = 500 + (7 - 0) \times 4 = 500 + 28 = \boxed{528}\]

Verification by enumeration:

Element Address
A[0] 500
A[1] 504
A[2] 508
A[3] 512
A[4] 516
A[5] 520
A[6] 524
A[7] 528 βœ”
A[8] 532
A[9] 536

Q3. A 2D array B[5][8] starts at base address 1000, element size 2 bytes. Find the address of B[3][5] in (a) Row-major (b) Column-major order.

Given:

  • Array: $B[5][8]$ β†’ $m = 5$ rows, $n = 8$ columns
  • Base address $B_0 = 1000$
  • Element size $w = 2$ bytes
  • Target: $B[3][5]$ β†’ $i = 3$, $j = 5$
  • Indexing starts at 0 (so $l_1 = 0, l_2 = 0$)

(a) Row-Major Order

In row-major, elements are stored row by row.

Formula:

\[\text{Address} = B_0 + (i \times n + j) \times w\]

Calculation:

\[= 1000 + (3 \times 8 + 5) \times 2\]
\[= 1000 + (24 + 5) \times 2\]
\[= 1000 + 29 \times 2\]
\[= 1000 + 58\]
\[= \boxed{1058}\]

Explanation: Before row 3, there are $3 \times 8 = 24$ elements (rows 0, 1, 2 each have 8 elements). Within row 3, element at column 5 is the 6th element (index 5). Total offset = $(24 + 5) \times 2 = 58$ bytes.


(b) Column-Major Order

In column-major, elements are stored column by column.

Formula:

\[\text{Address} = B_0 + (j \times m + i) \times w\]

Calculation:

\[= 1000 + (5 \times 5 + 3) \times 2\]
\[= 1000 + (25 + 3) \times 2\]
\[= 1000 + 28 \times 2\]
\[= 1000 + 56\]
\[= \boxed{1056}\]

Explanation: Before column 5, there are $5 \times 5 = 25$ elements (columns 0–4 each have 5 elements). Within column 5, element at row 3 is the 4th element (index 3). Total offset = $(25 + 3) \times 2 = 56$ bytes.
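Both formulas can be checked with two small helpers (the function names are my own, not part of the question):

```c
// Address of A[i][j] for an m x n array at base b0, element size w bytes.

// Row-major: skip i full rows of n elements, then j more.
int rowMajorAddr(int b0, int n, int i, int j, int w) {
    return b0 + (i * n + j) * w;
}

// Column-major: skip j full columns of m elements, then i more.
int colMajorAddr(int b0, int m, int i, int j, int w) {
    return b0 + (j * m + i) * w;
}
```

For the values above, `rowMajorAddr(1000, 8, 3, 5, 2)` gives 1058 and `colMajorAddr(1000, 5, 3, 5, 2)` gives 1056, matching both calculations.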


Q4. Write a C function to insert an element at a given position in an array. Analyze its time and space complexity.

C Code:

#include <stdio.h>

#define MAX_SIZE 100

// Insert 'value' at position 'pos' in array A of current size n
// Returns 1 on success, 0 on failure
int insertElement(int A[], int *n, int pos, int value) {
    // Validate position
    if (pos < 0 || pos > *n || *n >= MAX_SIZE) {
        printf("Invalid position or array is full.\n");
        return 0;
    }

    // Step 1: Shift elements from position pos to end, one position to the right
    for (int i = *n - 1; i >= pos; i--) {
        A[i + 1] = A[i];
    }

    // Step 2: Insert the new element at position pos
    A[pos] = value;

    // Step 3: Increment the size
    (*n)++;

    return 1;
}

// Driver
int main() {
    int A[MAX_SIZE] = {10, 20, 30, 40, 50};
    int n = 5;

    printf("Before insertion: ");
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");

    insertElement(A, &n, 2, 25);  // Insert 25 at position 2

    printf("After insertion:  ");
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");

    return 0;
}

Output:

Before insertion: 10 20 30 40 50
After insertion:  10 20 25 30 40 50

Step-by-step trace (inserting 25 at position 2):

Original: [10, 20, 30, 40, 50]     n = 5
                   ↑ pos = 2

i=4: A[5] = A[4] β†’ [10, 20, 30, 40, 50, 50]
i=3: A[4] = A[3] β†’ [10, 20, 30, 40, 40, 50]
i=2: A[3] = A[2] β†’ [10, 20, 30, 30, 40, 50]

Insert:   A[2] = 25 β†’ [10, 20, 25, 30, 40, 50]     n = 6

Complexity Analysis:

Aspect             Complexity   Explanation
Best Case Time     $O(1)$       Insert at the end (pos = n): no shifting needed
Worst Case Time    $O(n)$       Insert at position 0: all $n$ elements must shift
Average Case Time  $O(n)$       On average, $n/2$ elements shift
Space              $O(1)$       Only a fixed number of extra variables (the loop index i)

Why shifting must go right-to-left: If we shifted left-to-right, A[pos+1] = A[pos] would overwrite A[pos+1] before we moved it, losing data. Going from the end backwards ensures each element is safely copied before being overwritten.
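For symmetry, here is a hedged sketch of the matching delete operation (not required by the question), which shifts left-to-right, the safe direction when closing a gap:

```c
#include <stdio.h>

// Remove the element at position 'pos' from A of current size *n.
// Returns 1 on success, 0 on failure.
int deleteElement(int A[], int *n, int pos) {
    if (pos < 0 || pos >= *n) {
        printf("Invalid position or array is empty.\n");
        return 0;
    }
    // Left-to-right is safe here: each A[i] is overwritten only
    // after its old value is no longer needed.
    for (int i = pos; i < *n - 1; i++)
        A[i] = A[i + 1];
    (*n)--;
    return 1;
}
```

Note the mirror image of insertion: insert shifts right starting from the end, delete shifts left starting from the gap.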


Q5. Write C code to multiply two matrices. What are the conditions for multiplication to be valid?

Condition for valid multiplication:

For matrices $A_{m \times p}$ and $B_{p \times n}$, multiplication is valid only if the number of columns of A equals the number of rows of B (i.e., both are $p$). The result is a matrix $C_{m \times n}$.

\[C[i][j] = \sum_{k=0}^{p-1} A[i][k] \times B[k][j]\]

C Code:

#include <stdio.h>

#define MAX 10

void matrixMultiply(int A[][MAX], int B[][MAX], int C[][MAX],
                    int m, int p, int n) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            C[i][j] = 0;  // Initialize accumulator
            for (int k = 0; k < p; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
}

void printMatrix(int M[][MAX], int rows, int cols) {
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++)
            printf("%4d ", M[i][j]);
        printf("\n");
    }
}

int main() {
    int A[MAX][MAX] = { {1, 2, 3},
                        {4, 5, 6} };       // 2Γ—3

    int B[MAX][MAX] = { {7, 8},
                        {9, 10},
                        {11, 12} };        // 3Γ—2

    int C[MAX][MAX];
    int m = 2, p = 3, n = 2;              // A is 2Γ—3, B is 3Γ—2

    matrixMultiply(A, B, C, m, p, n);

    printf("Result (2x2):\n");
    printMatrix(C, m, n);

    return 0;
}

Output:

Result (2x2):
  58   64
 139  154

Step-by-step for C[0][0]:

\[C[0][0] = A[0][0] \times B[0][0] + A[0][1] \times B[1][0] + A[0][2] \times B[2][0]\]
\[= 1 \times 7 + 2 \times 9 + 3 \times 11 = 7 + 18 + 33 = 58 \quad βœ”\]

Step-by-step for C[1][1]:

\[C[1][1] = A[1][0] \times B[0][1] + A[1][1] \times B[1][1] + A[1][2] \times B[2][1]\]
\[= 4 \times 8 + 5 \times 10 + 6 \times 12 = 32 + 50 + 72 = 154 \quad βœ”\]

Time Complexity: $O(m \times n \times p)$. For square matrices ($m = n = p$): $O(n^3)$.

Space Complexity: $O(m \times n)$ for the result matrix.


Q6. Explain the difference between Row-major and Column-major storage with diagrams.

Row-Major Order: Elements are stored row by row in memory. After all elements of row 0, row 1 follows, and so on. Used by C, C++, Java, Python (NumPy default).

Column-Major Order: Elements are stored column by column in memory. After all elements of column 0, column 1 follows, and so on. Used by FORTRAN, MATLAB, R.

Example: Consider a $3 \times 4$ matrix:

Logical View:
        Col 0   Col 1   Col 2   Col 3
Row 0 [  a00     a01     a02     a03  ]
Row 1 [  a10     a11     a12     a13  ]
Row 2 [  a20     a21     a22     a23  ]

Row-Major Layout in Memory:

Address:  100  104  108  112  116  120  124  128  132  136  140  144
         β”Œβ”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”
         β”‚a00 β”‚a01 β”‚a02 β”‚a03 β”‚a10 β”‚a11 β”‚a12 β”‚a13 β”‚a20 β”‚a21 β”‚a22 β”‚a23 β”‚
         β””β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”˜
         β”œβ”€β”€β”€ Row 0 β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”œβ”€β”€β”€ Row 1 β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”œβ”€β”€β”€ Row 2 ──────────

Column-Major Layout in Memory:

Address:  100  104  108  112  116  120  124  128  132  136  140  144
         β”Œβ”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”
         β”‚a00 β”‚a10 β”‚a20 β”‚a01 β”‚a11 β”‚a21 β”‚a02 β”‚a12 β”‚a22 β”‚a03 β”‚a13 β”‚a23 β”‚
         β””β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”˜
         β”œβ”€β”€ Col 0 β”€β”€β”€β”œβ”€β”€ Col 1 β”€β”€β”€β”œβ”€β”€ Col 2 β”€β”€β”€β”œβ”€β”€ Col 3 ───

Address Formulas (0-based, $m$ rows, $n$ cols, element size $w$):

Order         Formula for $A[i][j]$
Row-Major     $B + (i \times n + j) \times w$
Column-Major  $B + (j \times m + i) \times w$

Key Differences:

Aspect               Row-Major                     Column-Major
Storage order        Row by row                    Column by column
Efficient traversal  Row-wise (i varies slowly)    Column-wise (j varies slowly)
Languages            C, C++, Java, Python          FORTRAN, MATLAB, R
Cache performance    Better for row-wise access    Better for column-wise access

Why it matters: Accessing elements in the order they’re stored in memory (row-wise in C, column-wise in FORTRAN) gives much better cache performance because consecutive memory locations are loaded into cache together. Accessing in the wrong order causes frequent cache misses, slowing down the program significantly.
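The effect can be demonstrated by summing the same matrix in both orders. The sketch below keeps N small so it stays self-contained; the cache penalty only becomes visible for large N (e.g. 4096), and actual timings vary by hardware:

```c
#define N 4   // small here; use a large N (e.g. 4096) to observe the cache effect

// Row-wise: the inner loop walks consecutive memory locations (cache-friendly in C)
long sumRowWise(int a[N][N]) {
    long s = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

// Column-wise: the inner loop strides N elements apart (cache-hostile in C)
long sumColWise(int a[N][N]) {
    long s = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}
```

Both functions compute the same sum; only the memory access pattern differs, which is exactly where the performance gap comes from.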


Practice Questions β€” Strings


All seven string functions implemented in C:

#include <stdio.h>

// 1. String Length β€” O(n)
int stringLength(char str[]) {
    int len = 0;
    while (str[len] != '\0') {
        len++;
    }
    return len;
}

// 2. String Copy β€” O(n)
void stringCopy(char dest[], char src[]) {
    int i = 0;
    while (src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';  // Null-terminate destination
}

// 3. String Concatenation β€” O(m + n)
void stringConcat(char dest[], char src[]) {
    int i = 0, j = 0;
    while (dest[i] != '\0') i++;      // Find end of dest
    while (src[j] != '\0') {          // Append src
        dest[i] = src[j];
        i++;
        j++;
    }
    dest[i] = '\0';                   // Null-terminate result
}

// 4. String Reverse (in-place) β€” O(n)
void stringReverse(char str[]) {
    int len = 0;
    while (str[len] != '\0') len++;

    int start = 0, end = len - 1;
    while (start < end) {
        char temp = str[start];
        str[start] = str[end];
        str[end] = temp;
        start++;
        end--;
    }
}

// 5. String Compare β€” O(min(m, n))
// Returns: -1 if s1 < s2, 0 if equal, 1 if s1 > s2
int stringCompare(char s1[], char s2[]) {
    int i = 0;
    while (s1[i] != '\0' && s2[i] != '\0') {
        if (s1[i] < s2[i]) return -1;
        if (s1[i] > s2[i]) return 1;
        i++;
    }
    if (s1[i] == '\0' && s2[i] == '\0') return 0;
    if (s1[i] == '\0') return -1;
    return 1;
}

// 6. Palindrome Check β€” O(n)
// Returns: 1 if palindrome, 0 otherwise
int isPalindrome(char str[]) {
    int len = 0;
    while (str[len] != '\0') len++;

    int start = 0, end = len - 1;
    while (start < end) {
        if (str[start] != str[end])
            return 0;
        start++;
        end--;
    }
    return 1;
}

// 7. Substring Search (Brute Force) β€” O(n Γ— m)
// Returns: index of first occurrence, or -1 if not found
int substringSearch(char text[], char pattern[]) {
    int n = 0, m = 0;
    while (text[n] != '\0') n++;
    while (pattern[m] != '\0') m++;

    for (int i = 0; i <= n - m; i++) {
        int j;
        for (j = 0; j < m; j++) {
            if (text[i + j] != pattern[j])
                break;
        }
        if (j == m)
            return i;    // Found at index i
    }
    return -1;           // Not found
}

Logic summary for each function:

Function    Technique                   Key Idea
Length      Linear scan                 Count characters until '\0'
Copy        Index-by-index              Copy each char; manually add '\0'
Concat      Two-phase scan              Find end of dest, then append src
Reverse     Two-pointer swap            Swap from both ends, moving inward
Compare     Character-by-character      Lexicographic comparison via ASCII
Palindrome  Two-pointer compare         Same as reverse but compare instead of swap
Substring   Sliding-window brute force  Try every starting position in text

Q2. What is the null character '\0'? Why is it important in C strings?

The Null Character '\0':

  • It is the character with ASCII value 0 (not the digit β€˜0’ which has ASCII value 48).
  • It is a sentinel (end marker) for C strings.
  • It occupies 1 byte in memory.

Why it is important:

  1. C has no built-in string length field. Unlike Java or Python, C strings are just arrays of characters with no metadata. The only way to know where a string ends is the '\0' marker.

  2. All standard string functions depend on it. Functions like strlen(), printf("%s"), strcpy(), strcmp() all scan for '\0' to know when to stop. Without it, they would read past the string into garbage memory (buffer overrun).

  3. Memory allocation must account for it. A string of $n$ characters requires $n + 1$ bytes β€” the extra byte is for '\0'.

Example:

char str[] = "Hello";
// Memory: ['H', 'e', 'l', 'l', 'o', '\0']
// sizeof(str) = 6 bytes (5 chars + 1 null)
// strlen(str) = 5 (counts characters, not null)

What happens without '\0':

char str[5] = {'H', 'e', 'l', 'l', 'o'};  // No null terminator!
printf("%s", str);  // UNDEFINED BEHAVIOR β€” prints "Hello" followed by garbage
                    // until it happens to find a 0 byte in memory

Key point: The null character is what makes an array of characters a string in C. Without it, it’s just a character array.


Q3. Write a function to count the number of vowels in a string.

#include <stdio.h>

int countVowels(char str[]) {
    int count = 0;
    for (int i = 0; str[i] != '\0'; i++) {
        char ch = str[i];
        // Check for both lowercase and uppercase vowels
        if (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u' ||
            ch == 'A' || ch == 'E' || ch == 'I' || ch == 'O' || ch == 'U') {
            count++;
        }
    }
    return count;
}

int main() {
    char str[] = "Hello World";
    printf("Vowels in \"%s\": %d\n", str, countVowels(str));
    return 0;
}

Output:

Vowels in "Hello World": 3

Trace:

H β†’ not a vowel
e β†’ vowel βœ” (count = 1)
l β†’ not
l β†’ not
o β†’ vowel βœ” (count = 2)
  β†’ not (space)
W β†’ not
o β†’ vowel βœ” (count = 3)
r β†’ not
l β†’ not
d β†’ not
Result: 3

Time Complexity: $O(n)$ β€” single pass through the string.

Space Complexity: $O(1)$ β€” only a counter variable.


Q4. Write a function to check if two strings are anagrams.

Two strings are anagrams if they contain the same characters with the same frequencies, possibly in a different order. Example: β€œlisten” and β€œsilent” are anagrams.

Approach: Count character frequencies in both strings and compare.

#include <stdio.h>

int areAnagrams(char s1[], char s2[]) {
    int count[256] = {0};  // Frequency array for all ASCII characters
    int i;

    // Count frequencies from s1
    for (i = 0; s1[i] != '\0'; i++)
        count[(unsigned char)s1[i]]++;

    // Subtract frequencies from s2
    for (i = 0; s2[i] != '\0'; i++)
        count[(unsigned char)s2[i]]--;

    // If all counts are 0, strings are anagrams
    for (i = 0; i < 256; i++) {
        if (count[i] != 0)
            return 0;  // Not anagrams
    }
    return 1;  // Anagrams
}

int main() {
    char s1[] = "listen";
    char s2[] = "silent";

    if (areAnagrams(s1, s2))
        printf("\"%s\" and \"%s\" are anagrams.\n", s1, s2);
    else
        printf("\"%s\" and \"%s\" are NOT anagrams.\n", s1, s2);

    return 0;
}

Output:

"listen" and "silent" are anagrams.

Trace for β€œlisten” and β€œsilent”:

After counting s1 ("listen"):
  count['l']=1, count['i']=1, count['s']=1, count['t']=1, count['e']=1, count['n']=1

After subtracting s2 ("silent"):
  count['s']=0, count['i']=0, count['l']=0, count['e']=0, count['n']=0, count['t']=0

All counts are 0 β†’ Anagrams βœ”

Time Complexity: $O(n + m)$ where $n, m$ are lengths of the two strings (plus $O(256) = O(1)$ for the final check).

Space Complexity: $O(1)$ β€” the frequency array has a fixed size of 256 regardless of input size.

Why this approach? Sorting both strings and comparing would also work ($O(n \log n)$), but the frequency-counting approach is faster at $O(n + m)$.
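For comparison, a sketch of the sorting-based alternative using the standard library's qsort (the 256-byte local buffers are an assumption of this sketch, not a general solution):

```c
#include <stdlib.h>
#include <string.h>

// Comparator for qsort over single chars
static int cmpChar(const void *a, const void *b) {
    return *(const char *)a - *(const char *)b;
}

// Alternative O(n log n) anagram check: sort copies of both strings, compare.
// Assumes inputs shorter than 256 characters.
int areAnagramsBySorting(const char *s1, const char *s2) {
    size_t n1 = strlen(s1), n2 = strlen(s2);
    if (n1 != n2) return 0;            // different lengths: cannot be anagrams
    char a[256], b[256];
    strcpy(a, s1);
    strcpy(b, s2);
    qsort(a, n1, 1, cmpChar);
    qsort(b, n2, 1, cmpChar);
    return strcmp(a, b) == 0;
}
```

Anagrams sort to identical character sequences, so a single strcmp decides the answer.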


Q5. Explain the time complexity of each string operation you implemented.

Operation         Time Complexity        Space   Explanation
Length            $O(n)$                 $O(1)$  Must scan every character to find '\0'. No shortcut exists since C strings have no length metadata.
Copy              $O(n)$                 $O(1)$  Copies each of the $n$ characters one by one, plus the null terminator.
Concatenation     $O(m + n)$             $O(1)$  Phase 1: traverse dest to find its end ($m$ steps). Phase 2: copy src ($n$ steps). Total = $m + n$.
Reverse           $O(n)$                 $O(1)$  Finds length in $O(n)$, then does $n/2$ swaps. In-place using the two-pointer technique.
Compare           $O(\min(m, n))$        $O(1)$  Compares character by character. Stops at the first mismatch or when the shorter string ends.
Palindrome        $O(n)$                 $O(1)$  Finds length in $O(n)$, then at most $n/2$ comparisons. Total = $O(n)$.
Substring Search  $O(n \times m)$ worst  $O(1)$  For each of $O(n)$ starting positions, compares up to $m$ characters. Worst case: text = β€œAAAA…AB”, pattern = β€œAAB”.

Important observations:

  • All operations use $O(1)$ extra space β€” no auxiliary arrays.
  • The $O(n)$ for string length means every function that needs the length internally scans the entire string first.
  • Concatenation is $O(m + n)$, not $O(n)$, because finding the end of the destination string also takes time.
  • Substring search can be improved to $O(n + m)$ using the KMP (Knuth-Morris-Pratt) algorithm, but the naive approach is the standard syllabus requirement.

Practice Questions β€” Searching


Q1. Write iterative and recursive versions of Binary Search. Trace both for the array [3, 7, 11, 15, 19, 23, 27] searching for key = 19.

Iterative Binary Search:

int binarySearchIterative(int A[], int n, int key) {
    int low = 0, high = n - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;

        if (A[mid] == key)
            return mid;
        else if (A[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return -1;
}

Recursive Binary Search:

int binarySearchRecursive(int A[], int low, int high, int key) {
    if (low > high)
        return -1;

    int mid = low + (high - low) / 2;

    if (A[mid] == key)
        return mid;
    else if (A[mid] < key)
        return binarySearchRecursive(A, mid + 1, high, key);
    else
        return binarySearchRecursive(A, low, mid - 1, key);
}

Trace β€” Iterative version for [3, 7, 11, 15, 19, 23, 27], key = 19:

Array:  [3,  7,  11,  15,  19,  23,  27]
Index:   0   1    2    3    4    5    6

Step 1: low = 0, high = 6
        mid = 0 + (6-0)/2 = 3
        A[3] = 15 < 19  β†’  low = 3 + 1 = 4

Step 2: low = 4, high = 6
        mid = 4 + (6-4)/2 = 5
        A[5] = 23 > 19  β†’  high = 5 - 1 = 4

Step 3: low = 4, high = 4
        mid = 4 + (4-4)/2 = 4
        A[4] = 19 == 19  β†’  FOUND at index 4 βœ”

Total comparisons: 3


Trace β€” Recursive version for the same input:

Call 1: binarySearchRecursive(A, 0, 6, 19)
        mid = 3, A[3] = 15 < 19
        β†’ Call binarySearchRecursive(A, 4, 6, 19)

Call 2: binarySearchRecursive(A, 4, 6, 19)
        mid = 5, A[5] = 23 > 19
        β†’ Call binarySearchRecursive(A, 4, 4, 19)

Call 3: binarySearchRecursive(A, 4, 4, 19)
        mid = 4, A[4] = 19 == 19
        β†’ Return 4

Unwinding: Call 3 returns 4 β†’ Call 2 returns 4 β†’ Call 1 returns 4

Result: FOUND at index 4 βœ”

Both versions make exactly 3 comparisons. The iterative version uses $O(1)$ space; the recursive version uses $O(\log n)$ stack space (3 stack frames in this case).


Q2. Compare Linear Search and Binary Search in terms of prerequisites, time complexity, and use cases.

Feature                  Linear Search                         Binary Search
Prerequisite             None β€” works on any array             Array must be sorted
Best Case                $O(1)$ β€” target at first position     $O(1)$ β€” target at middle
Worst Case               $O(n)$ β€” target at end or absent      $O(\log n)$ β€” target absent
Average Case             $O(n)$ β€” $\frac{n+1}{2}$ comparisons  $O(\log n)$ β€” $\approx \log_2 n$ comparisons
Space (iterative)        $O(1)$                                $O(1)$
Space (recursive)        $O(n)$                                $O(\log n)$
Data structure           Array or linked list                  Array only (needs random access)
Implementation           Very simple (1 loop)                  Slightly complex (midpoint logic)
For $n = 1{,}000{,}000$  Up to 1,000,000 comparisons           At most 20 comparisons

When to use each:

Use Linear Search when:

  • The array is unsorted and sorting is not feasible
  • The array is small ($n < 20$)
  • You need to search only once (sorting + binary search has overhead)
  • Data is stored in a linked list (no random access)

Use Binary Search when:

  • The array is sorted (or can be sorted once and searched many times)
  • The array is large
  • Multiple searches will be performed (amortize sorting cost)
  • Performance is critical

Break-even analysis: If you need to search a dataset $k$ times, binary search is worthwhile when:

\[O(n \log n) + k \cdot O(\log n) < k \cdot O(n)\]

This simplifies to $k > \frac{n \log n}{n - \log n} \approx \log n$ for large $n$. So even just $\log n$ searches justify sorting first!


Q3. Why can’t Binary Search be used on a linked list efficiently?

Binary search requires random access β€” the ability to jump directly to any element by its index in $O(1)$ time. This is what makes computing and accessing the middle element efficient.

Why linked lists fail:

  1. No random access: In a linked list, to access the $k$-th element, you must traverse $k$ nodes from the head. Accessing the middle element takes $O(n/2) = O(n)$ time.

  2. Finding mid is costly: In an array, mid = (low + high) / 2 is computed in $O(1)$ and accessed in $O(1)$. In a linked list, finding the middle node requires traversing from the head each time β€” $O(n)$.

  3. Each iteration becomes $O(n)$: Binary search does $O(\log n)$ iterations. If each iteration costs $O(n)$ to find the middle, the total becomes $O(n \log n)$ β€” worse than linear search which is $O(n)$!

Comparison:

Operation              Array        Linked List
Access middle element  $O(1)$       $O(n)$
Binary search total    $O(\log n)$  $O(n \log n)$
Linear search total    $O(n)$       $O(n)$

Conclusion: For linked lists, linear search ($O(n)$) is the best option. Binary search on a linked list is theoretically possible but practically pointless since it’s slower than linear search. This is why binary search is designed specifically for arrays (or any structure with $O(1)$ random access). A balanced BST achieves the same $O(\log n)$ search by a different route: the halving comparisons of binary search are built into its structure, so no random access is needed.
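The traversal cost is visible even with the best middle-finding technique for lists, the slow/fast pointer method sketched below (a minimal illustration, not part of the question):

```c
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node;

// Find the middle node with two pointers: slow advances one step per
// iteration, fast advances two, so slow stops at the middle. This still
// performs O(n) pointer hops -- there is no way to jump straight to
// index n/2 the way array indexing allows.
Node *middle(Node *head) {
    Node *slow = head, *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    return slow;
}
```

Even this single O(n) pass is what binary search would have to repeat at every halving step, which is exactly why the total degrades past linear search.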


Q4. For an array of 1 million elements, how many comparisons does Binary Search need in the worst case?

Given: $n = 1{,}000{,}000 = 10^6$

Worst-case comparisons for Binary Search:

\[\text{Comparisons} = \lfloor \log_2 n \rfloor + 1\]

Calculation:

\[\log_2(1{,}000{,}000) = \frac{\ln(10^6)}{\ln 2} = \frac{6 \times \ln 10}{\ln 2} = \frac{6 \times 2.3026}{0.6931} \approx 19.93\]
\[\lfloor 19.93 \rfloor + 1 = 19 + 1 = \boxed{20}\]

Verification by powers of 2:

\[2^{19} = 524{,}288\]
\[2^{20} = 1{,}048{,}576\]

Since $524{,}288 < 1{,}000{,}000 \leq 1{,}048{,}576$, we need at most 20 comparisons.

Interpretation: Even with one million elements, binary search finds the answer (or determines it doesn’t exist) in at most 20 comparisons! Linear search, by contrast, could need up to 1,000,000 comparisons.

Array Size     Binary Search (worst)  Linear Search (worst)
1,000          10                     1,000
1,000,000      20                     1,000,000
1,000,000,000  30                     1,000,000,000
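The worst-case count can be computed directly; the helper below (my own, not from the question) simply counts the bits of $n$, which equals $\lfloor \log_2 n \rfloor + 1$:

```c
// Worst-case binary search comparisons for an array of n elements:
// floor(log2(n)) + 1, computed as the number of binary digits of n.
int worstCaseComparisons(long n) {
    int c = 0;
    while (n > 0) {
        n /= 2;   // each halving corresponds to one comparison level
        c++;
    }
    return c;
}
```

This reproduces every row of the table: 10 for a thousand elements, 20 for a million, 30 for a billion.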

Q5. Modify Binary Search to return the first occurrence of a duplicate element.

The standard binary search may return any occurrence of the target when there are duplicates. To find the first (leftmost) occurrence, we modify the algorithm to continue searching left even after finding a match.

int binarySearchFirst(int A[], int n, int key) {
    int low = 0, high = n - 1;
    int result = -1;  // Store the answer

    while (low <= high) {
        int mid = low + (high - low) / 2;

        if (A[mid] == key) {
            result = mid;       // Record this occurrence
            high = mid - 1;     // Keep searching LEFT for earlier occurrence
        }
        else if (A[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return result;
}

Key change: When A[mid] == key, instead of immediately returning, we:

  1. Save mid in result (in case this is the first occurrence)
  2. Set high = mid - 1 to continue searching the left half for a possibly earlier occurrence

Trace Example:

Array: [2, 4, 4, 4, 8, 8, 10]
Index:  0  1  2  3  4  5   6

Search for key = 4 (first occurrence):

Step 1: low=0, high=6, mid=3 β†’ A[3]=4 == 4
        result=3, high=2 (search left)

Step 2: low=0, high=2, mid=1 β†’ A[1]=4 == 4
        result=1, high=0 (search left)

Step 3: low=0, high=0, mid=0 β†’ A[0]=2 < 4
        low=1

Step 4: low=1, high=0 β†’ low > high β†’ STOP

Return result = 1 βœ” (first occurrence of 4 is at index 1)

Similarly, for the LAST occurrence:

int binarySearchLast(int A[], int n, int key) {
    int low = 0, high = n - 1;
    int result = -1;

    while (low <= high) {
        int mid = low + (high - low) / 2;

        if (A[mid] == key) {
            result = mid;
            low = mid + 1;     // Keep searching RIGHT for later occurrence
        }
        else if (A[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return result;
}

Time Complexity: Still $O(\log n)$ β€” same number of iterations as standard binary search. We just don’t early-exit on the first match.
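Combining the two variants gives an $O(\log n)$ count of duplicates, a common application of first/last search. `countOccurrences` is an added helper, not part of the question; the two searches are restated so the sketch is self-contained:

```c
// First (leftmost) occurrence of key in sorted A, or -1.
static int bsFirst(int A[], int n, int key) {
    int low = 0, high = n - 1, result = -1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (A[mid] == key)     { result = mid; high = mid - 1; }
        else if (A[mid] < key) low = mid + 1;
        else                   high = mid - 1;
    }
    return result;
}

// Last (rightmost) occurrence of key in sorted A, or -1.
static int bsLast(int A[], int n, int key) {
    int low = 0, high = n - 1, result = -1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (A[mid] == key)     { result = mid; low = mid + 1; }
        else if (A[mid] < key) low = mid + 1;
        else                   high = mid - 1;
    }
    return result;
}

// Number of occurrences of key: last - first + 1, two O(log n) searches.
int countOccurrences(int A[], int n, int key) {
    int first = bsFirst(A, n, key);
    if (first == -1) return 0;
    return bsLast(A, n, key) - first + 1;
}
```

On the trace array [2, 4, 4, 4, 8, 8, 10], counting 4 gives 3 (indices 1 through 3).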


Practice Questions β€” Sorting


Q1. Sort the array [5, 2, 8, 1, 9, 3] using (a) Bubble Sort (b) Selection Sort (c) Insertion Sort. Show all passes.

(a) Bubble Sort

Array: [5, 2, 8, 1, 9, 3]

Pass 1: Compare adjacent pairs (indices 0 to 4):

[5, 2, 8, 1, 9, 3]   5 > 2 β†’ swap   β†’ [2, 5, 8, 1, 9, 3]
[2, 5, 8, 1, 9, 3]   5 < 8 β†’ no swap
[2, 5, 8, 1, 9, 3]   8 > 1 β†’ swap   β†’ [2, 5, 1, 8, 9, 3]
[2, 5, 1, 8, 9, 3]   8 < 9 β†’ no swap
[2, 5, 1, 8, 9, 3]   9 > 3 β†’ swap   β†’ [2, 5, 1, 8, 3, 9]
After Pass 1: [2, 5, 1, 8, 3, 9]   ← 9 is in place

Pass 2: Compare indices 0 to 3:

[2, 5, 1, 8, 3, 9]   2 < 5 β†’ no swap
[2, 5, 1, 8, 3, 9]   5 > 1 β†’ swap   β†’ [2, 1, 5, 8, 3, 9]
[2, 1, 5, 8, 3, 9]   5 < 8 β†’ no swap
[2, 1, 5, 8, 3, 9]   8 > 3 β†’ swap   β†’ [2, 1, 5, 3, 8, 9]
After Pass 2: [2, 1, 5, 3, 8, 9]   ← 8, 9 in place

Pass 3: Compare indices 0 to 2:

[2, 1, 5, 3, 8, 9]   2 > 1 β†’ swap   β†’ [1, 2, 5, 3, 8, 9]
[1, 2, 5, 3, 8, 9]   2 < 5 β†’ no swap
[1, 2, 5, 3, 8, 9]   5 > 3 β†’ swap   β†’ [1, 2, 3, 5, 8, 9]
After Pass 3: [1, 2, 3, 5, 8, 9]   ← 5, 8, 9 in place

Pass 4: Compare indices 0 to 1:

[1, 2, 3, 5, 8, 9]   1 < 2 β†’ no swap
[1, 2, 3, 5, 8, 9]   2 < 3 β†’ no swap
No swaps occurred β†’ swapped flag = 0 β†’ EARLY EXIT

Final result: [1, 2, 3, 5, 8, 9] βœ”
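The trace above assumes a bubble sort with a swapped flag; a sketch consistent with it:

```c
void bubbleSort(int A[], int n) {
    for (int pass = 0; pass < n - 1; pass++) {
        int swapped = 0;
        // After each pass the largest remaining element settles at the end,
        // so the comparison range shrinks by one.
        for (int i = 0; i < n - 1 - pass; i++) {
            if (A[i] > A[i + 1]) {
                int temp = A[i];
                A[i] = A[i + 1];
                A[i + 1] = temp;
                swapped = 1;
            }
        }
        if (!swapped) break;   // no swaps this pass: already sorted, early exit
    }
}
```

The early exit is what terminates the trace in Pass 4, giving bubble sort its $O(n)$ best case on already-sorted input.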


(b) Selection Sort

Array: [5, 2, 8, 1, 9, 3]

Pass 1: Find minimum in [5, 2, 8, 1, 9, 3] β†’ min = 1 at index 3. Swap with index 0.

[1, 2, 8, 5, 9, 3]
 βœ”

Pass 2: Find minimum in [2, 8, 5, 9, 3] β†’ min = 2 at index 1. Already in place.

[1, 2, 8, 5, 9, 3]
 βœ”  βœ”

Pass 3: Find minimum in [8, 5, 9, 3] β†’ min = 3 at index 5. Swap with index 2.

[1, 2, 3, 5, 9, 8]
 βœ”  βœ”  βœ”

Pass 4: Find minimum in [5, 9, 8] β†’ min = 5 at index 3. Already in place.

[1, 2, 3, 5, 9, 8]
 βœ”  βœ”  βœ”  βœ”

Pass 5: Find minimum in [9, 8] β†’ min = 8 at index 5. Swap with index 4.

[1, 2, 3, 5, 8, 9]
 βœ”  βœ”  βœ”  βœ”  βœ”  βœ”

Final result: [1, 2, 3, 5, 8, 9] βœ”
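A selection sort sketch consistent with the passes shown (at most one swap per pass):

```c
void selectionSort(int A[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int minIdx = i;
        // Find the minimum of the unsorted suffix A[i..n-1]
        for (int j = i + 1; j < n; j++)
            if (A[j] < A[minIdx])
                minIdx = j;
        if (minIdx != i) {   // skip the swap when already in place (Passes 2, 4)
            int temp = A[i];
            A[i] = A[minIdx];
            A[minIdx] = temp;
        }
    }
}
```

Note the contrast with bubble sort: selection sort always does $O(n^2)$ comparisons but at most $n - 1$ swaps.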


(c) Insertion Sort

Array: [5, 2, 8, 1, 9, 3]

i = 1: key = 2. Compare with sorted part [5].

  2 < 5 β†’ shift 5 right β†’ insert 2 at pos 0
  [2, 5, 8, 1, 9, 3]    sorted: [2, 5]

i = 2: key = 8. Compare with sorted part [2, 5].

  8 > 5 β†’ stop. 8 is already in correct position.
  [2, 5, 8, 1, 9, 3]    sorted: [2, 5, 8]

i = 3: key = 1. Compare with sorted part [2, 5, 8].

  1 < 8 β†’ shift 8 right
  1 < 5 β†’ shift 5 right
  1 < 2 β†’ shift 2 right
  Insert 1 at pos 0
  [1, 2, 5, 8, 9, 3]    sorted: [1, 2, 5, 8]

i = 4: key = 9. Compare with sorted part [1, 2, 5, 8].

  9 > 8 β†’ stop. 9 is already in correct position.
  [1, 2, 5, 8, 9, 3]    sorted: [1, 2, 5, 8, 9]

i = 5: key = 3. Compare with sorted part [1, 2, 5, 8, 9].

  3 < 9 β†’ shift 9 right
  3 < 8 β†’ shift 8 right
  3 < 5 β†’ shift 5 right
  3 > 2 β†’ stop. Insert 3 at pos 2
  [1, 2, 3, 5, 8, 9]    sorted: [1, 2, 3, 5, 8, 9]

Final result: [1, 2, 3, 5, 8, 9] βœ”
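An insertion sort sketch matching the shifting steps above:

```c
void insertionSort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int key = A[i];   // the element being inserted into the sorted prefix
        int j = i - 1;
        // Shift elements of the sorted prefix that are greater than key
        while (j >= 0 && A[j] > key) {
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;   // insert key at its correct position
    }
}
```

The early stop in the while condition is why steps like i = 2 and i = 4 above finish after a single comparison, giving the $O(n)$ best case on sorted input.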


Q2. Trace Merge Sort on the array [12, 11, 13, 5, 6, 7]. Show the divide and merge steps.

Array: [12, 11, 13, 5, 6, 7]

Phase 1: Divide (splitting into halves)

Level 0:        [12, 11, 13, 5, 6, 7]
                /                     \
Level 1:  [12, 11, 13]            [5, 6, 7]
           /        \              /       \
Level 2: [12]    [11, 13]       [5]      [6, 7]
                  /    \                  /    \
Level 3:       [11]   [13]            [6]    [7]

Phase 2: Merge (combining sorted sub-arrays bottom-up)

Merge [11] and [13]:

  • Compare 11 and 13 β†’ 11 < 13 β†’ take 11, then take 13
  • Result: [11, 13]

Merge [12] and [11, 13]:

  • Compare 12 and 11 β†’ 11 < 12 β†’ take 11
  • Compare 12 and 13 β†’ 12 < 13 β†’ take 12
  • Remaining: 13 β†’ take 13
  • Result: [11, 12, 13]

Merge [6] and [7]:

  • Compare 6 and 7 β†’ 6 < 7 β†’ take 6, then take 7
  • Result: [6, 7]

Merge [5] and [6, 7]:

  • Compare 5 and 6 β†’ 5 < 6 β†’ take 5
  • Remaining: [6, 7] β†’ take 6, then 7
  • Result: [5, 6, 7]

Final Merge: [11, 12, 13] and [5, 6, 7]:

  L = [11, 12, 13]    R = [5, 6, 7]
  i=0, j=0

  Step 1: L[0]=11 vs R[0]=5 β†’ 5 < 11 β†’ take 5        Output: [5]
  Step 2: L[0]=11 vs R[1]=6 β†’ 6 < 11 β†’ take 6        Output: [5, 6]
  Step 3: L[0]=11 vs R[2]=7 β†’ 7 < 11 β†’ take 7        Output: [5, 6, 7]
  Step 4: R exhausted. Copy remaining L: 11, 12, 13    Output: [5, 6, 7, 11, 12, 13]

Final result: [5, 6, 7, 11, 12, 13] βœ”

Summary Tree

                    [12, 11, 13, 5, 6, 7]
                   /                      \
           [12, 11, 13]               [5, 6, 7]
           /          \               /        \
        [12]       [11, 13]        [5]       [6, 7]
                   /     \                   /    \
                [11]    [13]              [6]    [7]
                   \    /                    \   /
                 [11, 13]                  [6, 7]
                  \    /                    \   /
              [11, 12, 13]              [5, 6, 7]
                     \                    /
              [5, 6, 7, 11, 12, 13]  βœ”

Q3. Trace Quick Sort on [10, 7, 8, 9, 1, 5] using the last element as pivot. Show each partition step.

Array: [10, 7, 8, 9, 1, 5], pivot = last element


Step 1: quickSort(A, 0, 5) β€” pivot = A[5] = 5

i = -1 (boundary of ≀ pivot section)

j=0: A[0]=10 > 5  β†’ skip                      i=-1
j=1: A[1]=7  > 5  β†’ skip                      i=-1
j=2: A[2]=8  > 5  β†’ skip                      i=-1
j=3: A[3]=9  > 5  β†’ skip                      i=-1
j=4: A[4]=1  ≀ 5  β†’ i=0, swap A[0]↔A[4]      i=0
     Array: [1, 7, 8, 9, 10, 5]

Final: swap A[i+1]↔A[high] β†’ swap A[1]↔A[5]
     Array: [1, 5, 8, 9, 10, 7]

Pivot 5 is at index 1.
Left partition: [1]          (index 0 to 0)
Right partition: [8, 9, 10, 7]  (index 2 to 5)

Step 2: quickSort(A, 0, 0) β€” single element [1]

Base case: low >= high β†’ already sorted. [1] βœ”


Step 3: quickSort(A, 2, 5) β€” sub-array [8, 9, 10, 7], pivot = A[5] = 7

i = 1 (starts at low - 1 = 2 - 1 = 1)

j=2: A[2]=8  > 7  β†’ skip                      i=1
j=3: A[3]=9  > 7  β†’ skip                      i=1
j=4: A[4]=10 > 7  β†’ skip                      i=1

Final: swap A[i+1]↔A[high] β†’ swap A[2]↔A[5]
     Array: [1, 5, 7, 9, 10, 8]

Pivot 7 is at index 2.
Left partition: [] (empty, index 2 to 1)
Right partition: [9, 10, 8]  (index 3 to 5)

Step 4: quickSort(A, 3, 5) β€” sub-array [9, 10, 8], pivot = A[5] = 8

i = 2 (starts at low - 1 = 3 - 1 = 2)

j=3: A[3]=9  > 8  β†’ skip                      i=2
j=4: A[4]=10 > 8  β†’ skip                      i=2

Final: swap A[i+1]↔A[high] β†’ swap A[3]↔A[5]
     Array: [1, 5, 7, 8, 10, 9]

Pivot 8 is at index 3.
Left partition: [] (empty, index 3 to 2)
Right partition: [10, 9]  (index 4 to 5)

Step 5: quickSort(A, 4, 5) β€” sub-array [10, 9], pivot = A[5] = 9

i = 3 (starts at low - 1 = 4 - 1 = 3)

j=4: A[4]=10 > 9  β†’ skip                      i=3

Final: swap A[i+1]↔A[high] β†’ swap A[4]↔A[5]
     Array: [1, 5, 7, 8, 9, 10]

Pivot 9 is at index 4.
Left partition: [] (empty)
Right partition: [10] (single element)

Final sorted array: [1, 5, 7, 8, 9, 10] βœ”

Summary of pivots chosen and their final positions:

| Partition | Pivot | Final Position |
|---|---|---|
| [10,7,8,9,1,5] | 5 | index 1 |
| [8,9,10,7] | 7 | index 2 |
| [9,10,8] | 8 | index 3 |
| [10,9] | 9 | index 4 |

Q4. Build a Max-Heap from the array [4, 10, 3, 5, 1]. Then perform Heap Sort step by step.

Array: [4, 10, 3, 5, 1], $n = 5$


Phase 1: Build Max-Heap

Start heapifying from the last non-leaf node: index $\lfloor n/2 \rfloor - 1 = \lfloor 5/2 \rfloor - 1 = 1$.

Initial tree:

        4
       / \
     10    3
    / \
   5   1

i = 1: Heapify node at index 1 (value 10)

  • Left child: index 3 (value 5)
  • Right child: index 4 (value 1)
  • Largest = 10 (parent) β†’ no swap needed

Array: [4, 10, 3, 5, 1] (unchanged)

i = 0: Heapify node at index 0 (value 4)

  • Left child: index 1 (value 10)
  • Right child: index 2 (value 3)
  • Largest = 10 at index 1 β†’ swap A[0] and A[1]

Array: [10, 4, 3, 5, 1]

        10
       /  \
      4    3
     / \
    5   1

Now recursively heapify index 1 (value 4):

  • Left child: index 3 (value 5)
  • Right child: index 4 (value 1)
  • Largest = 5 at index 3 β†’ swap A[1] and A[3]

Array: [10, 5, 3, 4, 1]

        10
       /  \
      5    3
     / \
    4   1

Recursively heapify index 3 (value 4):

  • No children within bounds β†’ stop

Max-Heap built: [10, 5, 3, 4, 1] βœ”


Phase 2: Heap Sort (extract max repeatedly)

Extraction 1: Swap root (10) with last element (1). Heap size = 4.

Array: [1, 5, 3, 4, | 10]    (10 is sorted)

Heapify index 0 (value 1):
  Left=5, Right=3 β†’ largest=5 at index 1 β†’ swap A[0]↔A[1]
  Array: [5, 1, 3, 4, | 10]
  
  Heapify index 1 (value 1):
    Left=4 β†’ largest=4 at index 3 β†’ swap A[1]↔A[3]
    Array: [5, 4, 3, 1, | 10]

Heap: [5, 4, 3, 1]
         5
        / \
       4   3
      /
     1

Extraction 2: Swap root (5) with last unsorted (1). Heap size = 3.

Array: [1, 4, 3, | 5, 10]    (5, 10 sorted)

Heapify index 0 (value 1):
  Left=4, Right=3 β†’ largest=4 at index 1 β†’ swap A[0]↔A[1]
  Array: [4, 1, 3, | 5, 10]

Heap: [4, 1, 3]
        4
       / \
      1   3

Extraction 3: Swap root (4) with last unsorted (3). Heap size = 2.

Array: [3, 1, | 4, 5, 10]    (4, 5, 10 sorted)

Heapify index 0 (value 3):
  Left=1 β†’ largest=3 (parent) β†’ no swap

Heap: [3, 1]
       3
      /
     1

Extraction 4: Swap root (3) with last unsorted (1). Heap size = 1.

Array: [1, | 3, 4, 5, 10]    (3, 4, 5, 10 sorted)

Single element β†’ stop.

Final sorted array: [1, 3, 4, 5, 10] βœ”


Q5. Sort [1, 4, 1, 2, 7, 5, 2] using Counting Sort. Show the count array and output.

Array A = [1, 4, 1, 2, 7, 5, 2], $n = 7$, range $k = 7$ (max value)


Step 1: Count occurrences

Scan A and increment count[A[i]]:

A[0]=1 β†’ count[1]++
A[1]=4 β†’ count[4]++
A[2]=1 β†’ count[1]++
A[3]=2 β†’ count[2]++
A[4]=7 β†’ count[7]++
A[5]=5 β†’ count[5]++
A[6]=2 β†’ count[2]++

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Count | 0 | 2 | 2 | 0 | 1 | 1 | 0 | 1 |

Step 2: Cumulative count (prefix sum)

count[i] += count[i-1]

| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Cumulative | 0 | 2 | 4 | 4 | 5 | 6 | 6 | 7 |

Meaning: count[v] = number of elements ≀ $v$. For example, count[2] = 4 means there are 4 elements with value ≀ 2.


Step 3: Build output array (traverse input right-to-left for stability)

i=6: A[6]=2 β†’ position = count[2]-1 = 4-1 = 3 β†’ output[3]=2, count[2]=3
i=5: A[5]=5 β†’ position = count[5]-1 = 6-1 = 5 β†’ output[5]=5, count[5]=5
i=4: A[4]=7 β†’ position = count[7]-1 = 7-1 = 6 β†’ output[6]=7, count[7]=6
i=3: A[3]=2 β†’ position = count[2]-1 = 3-1 = 2 β†’ output[2]=2, count[2]=2
i=2: A[2]=1 β†’ position = count[1]-1 = 2-1 = 1 β†’ output[1]=1, count[1]=1
i=1: A[1]=4 β†’ position = count[4]-1 = 5-1 = 4 β†’ output[4]=4, count[4]=4
i=0: A[0]=1 β†’ position = count[1]-1 = 1-1 = 0 β†’ output[0]=1, count[1]=0

| Output index | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Value | 1 | 1 | 2 | 2 | 4 | 5 | 7 |

Final sorted array: [1, 1, 2, 2, 4, 5, 7] βœ”

Stability verified: The two 1’s β€” A[0]=1 appeared before A[2]=1 in input. In output, A[0]’s 1 is at index 0 and A[2]’s 1 is at index 1 β†’ original order preserved βœ”. Same for the two 2’s.


Q6. What is a stable sort? Give an example showing why stability matters.

Definition: A sorting algorithm is stable if it preserves the relative order of elements that have equal keys. That is, if two elements have the same key value and element A appeared before element B in the input, then A still appears before B in the sorted output.

Example β€” Student Records sorted by marks:

Input (already sorted alphabetically):

| Name | Marks |
|---|---|
| Alice | 85 |
| Bob | 90 |
| Charlie | 85 |
| Dave | 90 |

Stable sort by marks:

| Name | Marks |
|---|---|
| Alice | 85 |
| Charlie | 85 |
| Bob | 90 |
| Dave | 90 |

Alice still comes before Charlie (both scored 85) βœ”
Bob still comes before Dave (both scored 90) βœ”
The alphabetical ordering within the same marks is preserved!

Unstable sort by marks (possible output):

| Name | Marks |
|---|---|
| Charlie | 85 |
| Alice | 85 |
| Dave | 90 |
| Bob | 90 |

Charlie before Alice βœ— β€” the original relative order is broken.


Why stability matters:

  1. Multi-key sorting: Suppose you first sort by name, then sort by marks. With a stable sort, students with the same marks remain in alphabetical order. With an unstable sort, the alphabetical ordering within a group is destroyed.

  2. Database operations: When sorting records that are already ordered by a secondary key, stability preserves that secondary ordering.

  3. Radix sort depends on stability: Radix sort sorts by least significant digit first, using a stable sort at each digit. If the per-digit sort were unstable, the entire algorithm would produce wrong results.

Classification:

| Stable Sorts | Unstable Sorts |
|---|---|
| Bubble Sort | Selection Sort |
| Insertion Sort | Quick Sort |
| Merge Sort | Heap Sort |
| Counting Sort | |

Q7. Why is Quick Sort preferred over Merge Sort in practice despite having $O(n^2)$ worst case?

Despite its $O(n^2)$ worst case, Quick Sort is often the default choice in practice for the following reasons:

1. Better Cache Performance:

  • Quick Sort works in-place on the same array. It accesses elements that are close together in memory, leading to excellent cache locality.
  • Merge Sort copies data to temporary arrays during merging. This causes more cache misses because data is spread across different memory locations.

2. Lower Space Overhead:

  • Quick Sort: $O(\log n)$ space (recursion stack only)
  • Merge Sort: $O(n)$ extra space (auxiliary arrays for merging)
  • For large datasets, the $O(n)$ memory overhead of Merge Sort can be significant.

3. Smaller Constant Factor:

  • Quick Sort’s inner loop is very tight: compare and swap adjacent elements. The constant factor hidden in $O(n \log n)$ is smaller than Merge Sort’s.
  • Merge Sort has overhead from copying to and from temporary arrays.

4. Worst Case is Avoidable:

  • The $O(n^2)$ worst case occurs when the pivot is always the smallest or largest element (e.g., sorted array with first/last element as pivot).
  • Randomized pivot selection or median-of-three makes the worst case extremely improbable β€” practically $O(n \log n)$ always.

5. In-Place Operation:

  • Quick Sort doesn’t need extra arrays. For memory-constrained systems, this is a significant advantage.

6. Practical Benchmarks:

  • On real-world data, Quick Sort typically outperforms Merge Sort by 2–3Γ— due to the factors above.
  • Java uses Quick Sort for primitives; Python’s Timsort is a hybrid of Merge Sort + Insertion Sort (chosen for stability, not speed).

When Merge Sort IS preferred:

  • When stability is required
  • When worst-case $O(n \log n)$ is guaranteed (safety-critical systems)
  • When sorting linked lists (merge is easy without extra space; Quick Sort’s random access is expensive)
  • When sorting data from external storage (merge sort’s sequential access pattern suits disk I/O)

Q8. Can Bubble Sort’s worst case be improved from $O(n^2)$? Explain the optimization using a swap flag.

Short answer: The worst case remains $O(n^2)$ β€” it cannot be improved with a swap flag. The flag only improves the best case from $O(n^2)$ to $O(n)$.


The Swap Flag Optimization:

void bubbleSortOptimized(int A[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;  // Flag
        for (int j = 0; j < n - 1 - i; j++) {
            if (A[j] > A[j + 1]) {
                int temp = A[j];
                A[j] = A[j + 1];
                A[j + 1] = temp;
                swapped = 1;
            }
        }
        if (!swapped) break;  // No swaps β†’ array is sorted β†’ exit early
    }
}

How it works: After each pass, if no swaps occurred, the array is already sorted. The flag detects this and exits early.

Best Case (already sorted array):

Array: [1, 2, 3, 4, 5]
Pass 1: Compare all adjacent pairs β†’ no swaps β†’ swapped = 0 β†’ BREAK
Total comparisons: n - 1 = 4 β†’ O(n)

Without the flag, even a sorted array would go through all $n-1$ passes, doing $\frac{n(n-1)}{2}$ comparisons β†’ $O(n^2)$.

Worst Case (reverse sorted array):

Array: [5, 4, 3, 2, 1]
Pass 1: 4 swaps β†’ [4, 3, 2, 1, 5]     swapped = 1
Pass 2: 3 swaps β†’ [3, 2, 1, 4, 5]     swapped = 1
Pass 3: 2 swaps β†’ [2, 1, 3, 4, 5]     swapped = 1
Pass 4: 1 swap  β†’ [1, 2, 3, 4, 5]     swapped = 1
All passes needed. Total comparisons = 4+3+2+1 = 10 = n(n-1)/2 β†’ O(nΒ²)

The flag never triggers early exit because every pass has at least one swap.

| Case | Without Flag | With Flag |
|---|---|---|
| Best (sorted) | $O(n^2)$ | $O(n)$ ✔ improved |
| Average | $O(n^2)$ | $O(n^2)$ — no change |
| Worst (reverse) | $O(n^2)$ | $O(n^2)$ — no change |

Conclusion: The swap flag optimization helps detect an already-sorted or nearly-sorted array early, but it cannot improve the fundamental worst-case complexity of $O(n^2)$. To achieve better worst-case performance, you need a fundamentally different algorithm like Merge Sort or Heap Sort.


Q9. Which sorting algorithms are suitable for nearly sorted data? Explain why.

Best algorithms for nearly sorted data:

1. Insertion Sort β€” Best Choice ⭐

  • Nearly sorted performance: $O(n)$ to $O(n \times k)$, where $k$ is how far each element is from its correct position.
  • Why: Insertion sort shifts elements to their correct position. If they’re already close to their correct spot, very few shifts are needed. The inner while loop terminates quickly.
  • Example: [1, 3, 2, 4, 5, 7, 6, 8] β€” each element is at most 1 position away. Insertion sort handles this in nearly $O(n)$.

2. Bubble Sort (with swap flag) β€” Good

  • Nearly sorted performance: $O(n)$ (detects sorted data in one pass)
  • Why: If only a few elements are out of place, most passes will have few or no swaps. The swap flag terminates early once a pass has no swaps.
  • Limitation: Not as fast as insertion sort when elements are slightly out of order but still need multiple passes.

3. TimSort (hybrid) β€” Industry Standard

  • Used by Python and Java. It’s a hybrid of Merge Sort and Insertion Sort.
  • Why: It detects existing sorted β€œruns” in the data and merges them. Nearly sorted data has long runs, so TimSort approaches $O(n)$.

Why these work better on nearly sorted data:

| Algorithm | Nearly Sorted Time | Reason |
|---|---|---|
| Insertion Sort | $O(n)$ | Few shifts per element |
| Bubble Sort (flag) | $O(n)$ | Early termination |
| Merge Sort | $O(n \log n)$ | Cannot exploit existing order |
| Quick Sort | $O(n \log n)$ to $O(n^2)$ | Sorted data is its WORST case (with fixed pivot) |
| Heap Sort | $O(n \log n)$ | Cannot exploit existing order |
| Selection Sort | $O(n^2)$ | Always does all comparisons regardless |

Insertion Sort is the clear winner for nearly sorted data. This is why practical algorithms like TimSort use insertion sort for small or nearly sorted sub-arrays.


Q10. Compare all sorting algorithms in a table. Under what conditions would you choose each?

Master Comparison Table:

| Algorithm | Best | Average | Worst | Space | Stable | In-Place | Method |
|---|---|---|---|---|---|---|---|
| Bubble Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✔ | ✔ | Comparison |
| Selection Sort | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✗ | ✔ | Comparison |
| Insertion Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✔ | ✔ | Comparison |
| Merge Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ | ✔ | ✗ | Divide & Conquer |
| Quick Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ | ✗ | ✔ | Divide & Conquer |
| Heap Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ | ✗ | ✔ | Selection (heap-based) |
| Counting Sort | $O(n+k)$ | $O(n+k)$ | $O(n+k)$ | $O(n+k)$ | ✔ | ✗ | Non-comparison |

When to choose each:

| Scenario | Best Algorithm | Reason |
|---|---|---|
| Small array ($n < 50$) | Insertion Sort | Low overhead, simple, fast for small $n$ |
| Nearly sorted data | Insertion Sort | Nearly $O(n)$ performance |
| General purpose, fast | Quick Sort | Best average performance, good cache behavior |
| Guaranteed $O(n \log n)$ | Merge Sort or Heap Sort | No worst-case degradation |
| Stability required | Merge Sort | Only stable $O(n \log n)$ sort among those listed |
| Memory constrained | Heap Sort | $O(1)$ extra space, guaranteed $O(n \log n)$ |
| Integer keys in small range | Counting Sort | $O(n+k)$ — beats comparison sorts when $k = O(n)$ |
| Writes are expensive (flash memory) | Selection Sort | Minimum swaps: at most $n - 1$ |
| Linked list | Merge Sort | No random access needed; merge is natural for lists |
| External sorting (data on disk) | Merge Sort | Sequential access pattern suits disk I/O |
| Simple implementation needed | Bubble Sort or Insertion Sort | Easy to code and debug |
| Online sorting (data arrives incrementally) | Insertion Sort | Can sort elements as they arrive |

Q11. Prove that any comparison-based sorting algorithm has a lower bound of $\Omega(n \log n)$.

Theorem: Any comparison-based sorting algorithm must make at least $\Omega(n \log n)$ comparisons in the worst case to sort $n$ elements.

Proof using the Decision Tree model:

Step 1: Model sorting as a decision tree.

Any comparison-based sort can be modeled as a binary decision tree where:

  • Each internal node represents a comparison ($a_i \leq a_j$?)
  • Each branch represents the outcome (yes/no)
  • Each leaf represents a specific permutation (output ordering)

Step 2: Count the leaves.

For $n$ distinct elements, there are $n!$ possible input permutations. Each permutation must lead to a different leaf (because different orderings require different outputs). Therefore:

\[\text{Number of leaves} \geq n!\]

Step 3: Relate tree height to leaves.

A binary tree of height $h$ has at most $2^h$ leaves. Since we need at least $n!$ leaves:

\[2^h \geq n!\] \[h \geq \log_2(n!)\]

Step 4: Apply Stirling’s approximation.

Using Stirling’s approximation: $n! \approx \sqrt{2\pi n} \left(\frac{n}{e}\right)^n$

\[\log_2(n!) = \sum_{i=1}^{n} \log_2 i\]

A simpler bound: $n! \geq \left(\frac{n}{2}\right)^{n/2}$ (because at least half the terms in the product are $\geq n/2$).

\[h \geq \log_2\left(\left(\frac{n}{2}\right)^{n/2}\right) = \frac{n}{2} \cdot \log_2\left(\frac{n}{2}\right) = \frac{n}{2}(\log_2 n - 1)\] \[h \geq \frac{n \log_2 n}{2} - \frac{n}{2} = \Omega(n \log n)\]

Alternatively, using the exact form:

\[\log_2(n!) = \log_2 1 + \log_2 2 + \cdots + \log_2 n = \sum_{i=1}^{n} \log_2 i\] \[\geq \sum_{i=n/2}^{n} \log_2 i \geq \frac{n}{2} \cdot \log_2\left(\frac{n}{2}\right) = \Omega(n \log n)\]

Step 5: Conclusion.

The height $h$ represents the maximum number of comparisons (the worst-case path from root to leaf). Since $h = \Omega(n \log n)$, any comparison-based sort must make $\Omega(n \log n)$ comparisons in the worst case.

\[\boxed{\text{Lower bound for comparison-based sorting} = \Omega(n \log n)}\]

Important implications:

  • Merge Sort and Heap Sort are optimal comparison-based sorts since they achieve $O(n \log n)$.
  • To sort faster than $O(n \log n)$, you must use non-comparison methods (e.g., Counting Sort, Radix Sort), which exploit special structure in the keys.

Q12. Write C code for Merge Sort and trace its execution for [38, 27, 43, 3].

C Code:

#include <stdio.h>

void merge(int A[], int left, int mid, int right) {
    int n1 = mid - left + 1;   // Size of left half
    int n2 = right - mid;       // Size of right half

    // Create temporary arrays
    int L[n1], R[n2];
    for (int i = 0; i < n1; i++) L[i] = A[left + i];
    for (int j = 0; j < n2; j++) R[j] = A[mid + 1 + j];

    // Merge the two sorted halves back into A
    int i = 0, j = 0, k = left;
    while (i < n1 && j < n2) {
        if (L[i] <= R[j])       // <= for stability
            A[k++] = L[i++];
        else
            A[k++] = R[j++];
    }

    // Copy remaining elements
    while (i < n1) A[k++] = L[i++];
    while (j < n2) A[k++] = R[j++];
}

void mergeSort(int A[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergeSort(A, left, mid);        // Sort left half
        mergeSort(A, mid + 1, right);   // Sort right half
        merge(A, left, mid, right);     // Merge sorted halves
    }
}

int main() {
    int A[] = {38, 27, 43, 3};
    int n = 4;

    printf("Before: ");
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");

    mergeSort(A, 0, n - 1);

    printf("After:  ");
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");

    return 0;
}

Output:

Before: 38 27 43 3
After:  3 27 38 43

Detailed Trace for [38, 27, 43, 3]:

mergeSort(A, 0, 3)
β”œβ”€β”€ mid = 1
β”œβ”€β”€ mergeSort(A, 0, 1)         β€” Sort [38, 27]
β”‚   β”œβ”€β”€ mid = 0
β”‚   β”œβ”€β”€ mergeSort(A, 0, 0)     β€” [38] β†’ base case (single element)
β”‚   β”œβ”€β”€ mergeSort(A, 1, 1)     β€” [27] β†’ base case (single element)
β”‚   └── merge(A, 0, 0, 1)     β€” Merge [38] and [27]
β”‚       L = [38], R = [27]
β”‚       38 > 27 β†’ take 27     β†’ A = [27, __, 43, 3]
β”‚       Copy remaining L: 38  β†’ A = [27, 38, 43, 3]
β”‚       βœ” Merged: [27, 38]
β”‚
β”œβ”€β”€ mergeSort(A, 2, 3)         β€” Sort [43, 3]
β”‚   β”œβ”€β”€ mid = 2
β”‚   β”œβ”€β”€ mergeSort(A, 2, 2)     β€” [43] β†’ base case (single element)
β”‚   β”œβ”€β”€ mergeSort(A, 3, 3)     β€” [3] β†’ base case (single element)
β”‚   └── merge(A, 2, 2, 3)     β€” Merge [43] and [3]
β”‚       L = [43], R = [3]
β”‚       43 > 3 β†’ take 3       β†’ A = [27, 38, 3, __]
β”‚       Copy remaining L: 43  β†’ A = [27, 38, 3, 43]
β”‚       βœ” Merged: [3, 43]
β”‚
└── merge(A, 0, 1, 3)         β€” Merge [27, 38] and [3, 43]
    L = [27, 38], R = [3, 43]
    
    Step 1: L[0]=27 vs R[0]=3  β†’ 3 < 27 β†’ take 3    A = [3, __, __, __]
    Step 2: L[0]=27 vs R[1]=43 β†’ 27 < 43 β†’ take 27   A = [3, 27, __, __]
    Step 3: L[1]=38 vs R[1]=43 β†’ 38 < 43 β†’ take 38   A = [3, 27, 38, __]
    Step 4: L exhausted. Copy remaining R: 43          A = [3, 27, 38, 43]
    
    βœ” Final merged: [3, 27, 38, 43]

Summary of recursive calls:

                  [38, 27, 43, 3]
                 /               \
          [38, 27]              [43, 3]
          /     \               /     \
       [38]    [27]          [43]    [3]
          \    /                \    /
        [27, 38]              [3, 43]
              \                /
          [3, 27, 38, 43]  βœ”

Total comparisons: 5 (1 in the first merge + 1 in the second merge + 3 in the final merge).


Comprehensive Unit II Practice Set


Short Answer


Q1. Define sequential organization. Why do arrays support random access?

Sequential Organization: A method of storing data elements in consecutive (contiguous) memory locations. Each element is placed immediately after the previous one in memory, with no gaps.

Why arrays support random access:

Arrays support random access β€” accessing any element in $O(1)$ time β€” because:

  1. Contiguous storage: All elements are stored back-to-back in memory. The address of any element can be calculated directly from the base address:
\[\text{Address of } A[i] = \text{Base} + i \times \text{ElementSize}\]
  2. Uniform element size: Every element has the same size ($w$ bytes), so the offset from the base is a simple multiplication.

  3. Direct hardware support: The CPU can compute the address using simple arithmetic (multiply and add), then issue a single memory read — no traversal needed.

Contrast with linked lists: In a linked list, elements are scattered across memory, connected by pointers. To access the $k$-th element, you must follow $k$ pointers from the head β€” that’s $O(k)$ time, not random access.


Q2. What is the address formula for A[i][j] in row-major and column-major order?

General formulas for 2D array A[l₁ ... u₁][lβ‚‚ ... uβ‚‚]:

Let:

  • $B$ = base address
  • $w$ = size of each element in bytes
  • $R = u_1 - l_1 + 1$ = number of rows
  • $C = u_2 - l_2 + 1$ = number of columns

Row-Major Order (row by row):

\[\boxed{\text{Address of } A[i][j] = B + \big[(i - l_1) \times C + (j - l_2)\big] \times w}\]

Explanation: Skip $(i - l_1)$ complete rows (each of $C$ elements), then skip $(j - l_2)$ elements within the current row.


Column-Major Order (column by column):

\[\boxed{\text{Address of } A[i][j] = B + \big[(j - l_2) \times R + (i - l_1)\big] \times w}\]

Explanation: Skip $(j - l_2)$ complete columns (each of $R$ elements), then skip $(i - l_1)$ elements within the current column.


Simplified formulas (when indexing starts at 0, i.e., $l_1 = l_2 = 0$):

| Order | Formula |
|---|---|
| Row-Major | $B + (i \times n + j) \times w$ |
| Column-Major | $B + (j \times m + i) \times w$ |

where $m$ = number of rows, $n$ = number of columns.


Q3. Write C functions for string length and palindrome check without using library functions.

// String Length β€” O(n) time, O(1) space
int stringLength(char str[]) {
    int len = 0;
    while (str[len] != '\0') {
        len++;
    }
    return len;
}

Logic: Start counter at 0, walk through the array until the null terminator '\0' is found. The counter value at that point is the string length.

// Palindrome Check β€” O(n) time, O(1) space
// Returns: 1 if palindrome, 0 otherwise
int isPalindrome(char str[]) {
    // Step 1: Find length
    int len = 0;
    while (str[len] != '\0') len++;

    // Step 2: Compare from both ends
    int start = 0, end = len - 1;
    while (start < end) {
        if (str[start] != str[end])
            return 0;   // Mismatch β†’ not a palindrome
        start++;
        end--;
    }
    return 1;   // All pairs matched β†’ palindrome
}

Logic: Use two pointers β€” start at the beginning and end at the end. Compare characters moving inward. If any pair mismatches, it’s not a palindrome. If all pairs match (or pointers cross), it is.

Example traces:

stringLength("Hello") β†’ 'H','e','l','l','o','\0' β†’ len = 5

isPalindrome("racecar"):
  r == r βœ”, a == a βœ”, c == c βœ”, e (middle) β†’ return 1 βœ”

isPalindrome("hello"):
  h != o β†’ return 0 βœ”

Q4. What is the difference between stable and unstable sorting?

| Aspect | Stable Sort | Unstable Sort |
|---|---|---|
| Definition | Preserves relative order of equal elements | May change relative order of equal elements |
| Guarantee | If $A[i] = A[j]$ and $i < j$ in input, then $A[i]$ appears before $A[j]$ in output | No such guarantee |
| Examples | Bubble Sort, Insertion Sort, Merge Sort, Counting Sort | Selection Sort, Quick Sort, Heap Sort |
| Use case | Multi-key sorting, maintaining secondary order | General purpose where stability isn’t needed |

Concrete example:

Input: [(apple, 3), (banana, 1), (cherry, 3), (date, 2)]

Sort by the number:

  • Stable: [(banana, 1), (date, 2), (apple, 3), (cherry, 3)] β€” apple before cherry βœ”
  • Unstable: [(banana, 1), (date, 2), (cherry, 3), (apple, 3)] β€” cherry before apple βœ—

Stability matters when you need to preserve an existing ordering (e.g., sort by score, keeping alphabetical order within same scores).


Q5. Why is Binary Search $O(\log n)$? Prove it.

Intuitive explanation: Binary search halves the search space with every comparison. Starting with $n$ elements:

\[n \to \frac{n}{2} \to \frac{n}{4} \to \frac{n}{8} \to \cdots \to 1\]

The number of halvings needed to go from $n$ to $1$ is $\log_2 n$.


Formal proof using recurrence:

Let $T(n)$ = number of comparisons for binary search on $n$ elements.

Recurrence relation:

\[T(n) = T\left(\frac{n}{2}\right) + 1, \quad T(1) = 1\]
  • $T(n/2)$: After one comparison, we recurse on half the array
  • $+1$: The one comparison with the middle element

Solving by repeated substitution:

\[T(n) = T\left(\frac{n}{2}\right) + 1\] \[= T\left(\frac{n}{4}\right) + 1 + 1 = T\left(\frac{n}{4}\right) + 2\] \[= T\left(\frac{n}{8}\right) + 3\] \[= T\left(\frac{n}{2^k}\right) + k\]

The recursion stops when $\frac{n}{2^k} = 1$, i.e., $2^k = n$, i.e., $k = \log_2 n$.

\[T(n) = T(1) + \log_2 n = 1 + \log_2 n\] \[\boxed{T(n) = \lfloor \log_2 n \rfloor + 1 = O(\log n)}\]

Alternative proof using Master Theorem:

$T(n) = 1 \cdot T(n/2) + O(1)$

Here $a = 1$, $b = 2$, $f(n) = O(1)$, and $n^{\log_b a} = n^{\log_2 1} = n^0 = 1$.

Since $f(n) = \Theta(n^{\log_b a}) = \Theta(1)$, by Case 2 of the Master Theorem:

\[T(n) = \Theta(\log n)\]

Tracing & Analytical


Q6. An integer array A[4][6] starts at address 2000. Element size = 4 bytes. Find address of A[2][3] in row-major and column-major.

Given:

  • Array: $A[4][6]$ β†’ $m = 4$ rows, $n = 6$ columns
  • Base address $B = 2000$
  • Element size $w = 4$ bytes
  • Target: $A[2][3]$ β†’ $i = 2$, $j = 3$
  • Indexing starts at 0 ($l_1 = 0, l_2 = 0$)

Row-Major Order:

\[\text{Address} = B + (i \times n + j) \times w\] \[= 2000 + (2 \times 6 + 3) \times 4\] \[= 2000 + (12 + 3) \times 4\] \[= 2000 + 15 \times 4\] \[= 2000 + 60\] \[= \boxed{2060}\]

Explanation: Before row 2, there are $2 \times 6 = 12$ elements (rows 0 and 1, each with 6 elements). Within row 2, element at column 3 is the 4th element. Total elements before $A[2][3]$ = $12 + 3 = 15$. Offset = $15 \times 4 = 60$ bytes.


Column-Major Order:

\[\text{Address} = B + (j \times m + i) \times w\] \[= 2000 + (3 \times 4 + 2) \times 4\] \[= 2000 + (12 + 2) \times 4\] \[= 2000 + 14 \times 4\] \[= 2000 + 56\] \[= \boxed{2056}\]

Explanation: Before column 3, there are $3 \times 4 = 12$ elements (columns 0, 1, 2, each with 4 elements). Within column 3, element at row 2 is the 3rd element. Total elements before $A[2][3]$ = $12 + 2 = 14$. Offset = $14 \times 4 = 56$ bytes.


Q7. Sort [29, 10, 14, 37, 13] using each of the 6 sorting algorithms. Show all intermediate steps.

(a) Bubble Sort

Initial: [29, 10, 14, 37, 13]

Pass 1 (compare indices 0-3):
  29 > 10 β†’ swap β†’ [10, 29, 14, 37, 13]
  29 > 14 β†’ swap β†’ [10, 14, 29, 37, 13]
  29 < 37 β†’ no swap
  37 > 13 β†’ swap β†’ [10, 14, 29, 13, 37]
  After: [10, 14, 29, 13, 37]    ← 37 in place

Pass 2 (compare indices 0-2):
  10 < 14 β†’ no swap
  14 < 29 β†’ no swap
  29 > 13 β†’ swap β†’ [10, 14, 13, 29, 37]
  After: [10, 14, 13, 29, 37]    ← 29 in place

Pass 3 (compare indices 0-1):
  10 < 14 β†’ no swap
  14 > 13 β†’ swap β†’ [10, 13, 14, 29, 37]
  After: [10, 13, 14, 29, 37]    ← 14 in place

Pass 4 (compare index 0):
  10 < 13 β†’ no swap
  No swaps β†’ EARLY EXIT

Result: [10, 13, 14, 29, 37] βœ”

(b) Selection Sort

Initial: [29, 10, 14, 37, 13]

Pass 1: Find min in [29,10,14,37,13] β†’ 10 at idx 1. Swap with idx 0.
  [10, 29, 14, 37, 13]

Pass 2: Find min in [29,14,37,13] β†’ 13 at idx 4. Swap with idx 1.
  [10, 13, 14, 37, 29]

Pass 3: Find min in [14,37,29] β†’ 14 at idx 2. Already in place.
  [10, 13, 14, 37, 29]

Pass 4: Find min in [37,29] β†’ 29 at idx 4. Swap with idx 3.
  [10, 13, 14, 29, 37]

Result: [10, 13, 14, 29, 37] βœ”

(c) Insertion Sort

Initial: [29, 10, 14, 37, 13]

i=1: key=10. Compare with [29].
  10 < 29 β†’ shift 29 right. Insert 10 at pos 0.
  [10, 29, 14, 37, 13]    sorted: [10, 29]

i=2: key=14. Compare with [10, 29].
  14 < 29 β†’ shift 29 right.
  14 > 10 β†’ stop. Insert 14 at pos 1.
  [10, 14, 29, 37, 13]    sorted: [10, 14, 29]

i=3: key=37. Compare with [10, 14, 29].
  37 > 29 β†’ stop. Already in place.
  [10, 14, 29, 37, 13]    sorted: [10, 14, 29, 37]

i=4: key=13. Compare with [10, 14, 29, 37].
  13 < 37 β†’ shift 37 right
  13 < 29 β†’ shift 29 right
  13 < 14 β†’ shift 14 right
  13 > 10 β†’ stop. Insert 13 at pos 1.
  [10, 13, 14, 29, 37]    sorted: [10, 13, 14, 29, 37]

Result: [10, 13, 14, 29, 37] βœ”

(d) Merge Sort

                  [29, 10, 14, 37, 13]
                 /                    \
          [29, 10]                [14, 37, 13]
          /     \                 /          \
       [29]    [10]           [14]        [37, 13]
                                          /     \
                                       [37]    [13]

Merge [29] and [10]:
  10 < 29 β†’ [10, 29]

Merge [37] and [13]:
  13 < 37 β†’ [13, 37]

Merge [14] and [13, 37]:
  L=[14], R=[13, 37]
  13 < 14 β†’ take 13
  14 < 37 β†’ take 14
  Remaining: 37
  Result: [13, 14, 37]

Merge [10, 29] and [13, 14, 37]:
  L=[10, 29], R=[13, 14, 37]
  10 < 13 β†’ take 10       [10]
  29 > 13 β†’ take 13       [10, 13]
  29 > 14 β†’ take 14       [10, 13, 14]
  29 < 37 β†’ take 29       [10, 13, 14, 29]
  Remaining: 37            [10, 13, 14, 29, 37]

Result: [10, 13, 14, 29, 37] βœ”

(e) Quick Sort (last element as pivot)

quickSort([29, 10, 14, 37, 13], 0, 4)  pivot = 13
  i = -1
  j=0: 29 > 13 β†’ skip
  j=1: 10 ≀ 13 β†’ i=0, swap A[0]↔A[1] β†’ [10, 29, 14, 37, 13]
  j=2: 14 > 13 β†’ skip
  j=3: 37 > 13 β†’ skip
  Swap A[i+1]↔A[4] β†’ swap A[1]↔A[4] β†’ [10, 13, 14, 37, 29]
  Pivot 13 at index 1.

  Left: quickSort([10], 0, 0) β†’ base case [10] βœ”

  Right: quickSort([14, 37, 29], 2, 4)  pivot = 29
    i = 1
    j=2: 14 ≀ 29 β†’ i=2, swap A[2]↔A[2] β†’ [10, 13, 14, 37, 29]
    j=3: 37 > 29 β†’ skip
    Swap A[i+1]↔A[4] β†’ swap A[3]↔A[4] β†’ [10, 13, 14, 29, 37]
    Pivot 29 at index 3.

    Left: quickSort([14], 2, 2) β†’ base case [14] βœ”
    Right: quickSort([37], 4, 4) β†’ base case [37] βœ”

Result: [10, 13, 14, 29, 37] βœ”

(f) Heap Sort

Step 1: Build Max-Heap from [29, 10, 14, 37, 13]

Initial tree:
        29
       /  \
     10    14
    /  \
  37    13

Last non-leaf: index 1 (value 10)

i=1: Heapify(1). Left=37, Right=13. Largest=37 at idx 3. Swap A[1]↔A[3].
  [29, 37, 14, 10, 13]
  Heapify(3): no children in bounds β†’ stop.

i=0: Heapify(0). Left=37, Right=14. Largest=37 at idx 1. Swap A[0]↔A[1].
  [37, 29, 14, 10, 13]
  Heapify(1): Left=10, Right=13. Largest=29 (parent) β†’ stop.

Max-Heap: [37, 29, 14, 10, 13]
        37
       /  \
     29    14
    /  \
  10    13

Step 2: Extract max repeatedly

Extract 37: Swap A[0]↔A[4]. Heapify(0) on size 4.
  [13, 29, 14, 10, | 37]
  Heapify: 13<29 β†’ swap β†’ [29, 13, 14, 10, | 37]
  Heapify(1): 13>10 β†’ stop.
  Heap: [29, 13, 14, 10]

Extract 29: Swap A[0]↔A[3]. Heapify(0) on size 3.
  [10, 13, 14, | 29, 37]
  Heapify: 10<14 β†’ swap A[0]↔A[2] β†’ [14, 13, 10, | 29, 37]
  Heap: [14, 13, 10]

Extract 14: Swap A[0]↔A[2]. Heapify(0) on size 2.
  [10, 13, | 14, 29, 37]
  Heapify: 10<13 β†’ swap β†’ [13, 10, | 14, 29, 37]
  Heap: [13, 10]

Extract 13: Swap A[0]↔A[1]. Size 1.
  [10, | 13, 14, 29, 37]

Result: [10, 13, 14, 29, 37] βœ”

Q8. Given an array of $n$ elements where only one element is out of place. Which sort would you choose and why?

Best choice: Insertion Sort

Why:

When only one element is out of place, the array is nearly sorted. Insertion Sort excels in this scenario:

  • Insertion sort iterates through the array. For elements already in their correct position, the inner while loop does zero shifts (just one comparison each).
  • When it reaches the out-of-place element, it shifts it to its correct position (at most $n-1$ shifts for that one element).
  • Total work: $O(n)$ β€” one pass through the array with one element needing multiple shifts.

Example: [1, 2, 3, 9, 4, 5, 6, 7, 8] β†’ 9 is out of place.

i=1: key=2 > 1 β†’ stop (0 shifts)
i=2: key=3 > 2 β†’ stop (0 shifts)
i=3: key=9 > 3 β†’ stop (0 shifts)
i=4: key=4 < 9 β†’ shift 9. 4 > 3 β†’ stop. Insert 4.
     [1, 2, 3, 4, 9, 5, 6, 7, 8]
i=5: key=5 < 9 β†’ shift 9. 5 > 4 β†’ stop. Insert 5.
     [1, 2, 3, 4, 5, 9, 6, 7, 8]
... and so on: each later element is inserted just before 9, shifting 9 one position right, so after a single $O(n)$ pass 9 lands in its correct final slot.

Why not other algorithms?

| Algorithm | Time for 1 element out of place | Problem |
|---|---|---|
| Insertion Sort | $O(n)$ βœ” | Perfect fit |
| Bubble Sort (with early-exit flag) | $O(n)$ | Also good, but slightly more comparisons |
| Selection Sort | $O(n^2)$ | Always does all comparisons |
| Merge Sort | $O(n \log n)$ | Can't exploit nearly-sorted structure |
| Quick Sort | $O(n \log n)$ avg | Doesn't benefit from near-sortedness |
| Heap Sort | $O(n \log n)$ | Can't exploit nearly-sorted structure |
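The $O(n)$ claim is easy to verify by instrumenting insertion sort with a shift counter (the counter is added here purely for illustration; the sort itself is the standard algorithm):

```c
// Insertion sort instrumented with a shift counter. On a nearly-sorted
// array the inner while loop exits immediately for elements already in
// place, so the total number of shifts stays small and work is O(n).
long insertionSortCountShifts(int A[], int n) {
    long shifts = 0;
    for (int i = 1; i < n; i++) {
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) {  // shift larger elements right
            A[j + 1] = A[j];
            j--;
            shifts++;
        }
        A[j + 1] = key;
    }
    return shifts;
}
```

On `[1, 2, 3, 9, 4, 5, 6, 7, 8]` this performs only 5 shifts (9 moves past 4, 5, 6, 7, 8), versus up to $n(n-1)/2$ shifts for a reversed array.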

Q9. Implement Binary Search recursively. What is its space complexity? How does it differ from iterative version?

Recursive Binary Search:

int binarySearchRecursive(int A[], int low, int high, int key) {
    // Base case: search space is empty
    if (low > high)
        return -1;

    int mid = low + (high - low) / 2;

    if (A[mid] == key)
        return mid;                // Found!
    else if (A[mid] < key)
        return binarySearchRecursive(A, mid + 1, high, key);  // Search right
    else
        return binarySearchRecursive(A, low, mid - 1, key);   // Search left
}

// Call: binarySearchRecursive(A, 0, n-1, key);

Space Complexity Analysis:

Recursive version: $O(\log n)$

Each recursive call adds a stack frame containing the local variables (low, high, mid, key) and the return address. Since the recursion depth is $\log n$ (the array halves each time), there are at most $\log n$ stack frames simultaneously on the call stack.

Iterative version: $O(1)$

The iterative version uses a while loop with just three variables (low, high, mid). No recursion means no stack frames β€” constant extra space.
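For comparison, the iterative version described above can be written as (a standard sketch):

```c
// Iterative binary search: same O(log n) time as the recursive version,
// but O(1) space -- only low, high, and mid persist between iterations.
int binarySearchIterative(int A[], int n, int key) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // avoids overflow of (low + high)
        if (A[mid] == key)
            return mid;                    // found
        else if (A[mid] < key)
            low = mid + 1;                 // search right half
        else
            high = mid - 1;                // search left half
    }
    return -1;                             // not found
}
```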


Comparison:

| Aspect | Iterative | Recursive |
|---|---|---|
| Time | $O(\log n)$ | $O(\log n)$ |
| Space | $O(1)$ | $O(\log n)$ (recursion stack) |
| Elegance | Less elegant | More mathematically clean |
| Stack overflow risk | None | Negligible in practice: depth is only $\log n$, about 30 frames even for $n = 10^9$ |
| Tail call optimization | N/A | If the compiler optimizes the tail call, space becomes $O(1)$ |
| Performance | Slightly faster | Slight function call overhead |

Recommendation: Use the iterative version in practice. It has the same time complexity but uses constant space and avoids function call overhead. The recursive version is useful for understanding the divide-and-conquer paradigm.


Q10. A company needs to sort 10 million records. The data has a limited range of keys (0 to 999). Which sorting algorithm would you recommend and why?

Recommended: Counting Sort ⭐

Why:

Given:

  • $n = 10{,}000{,}000$ (10 million records)
  • $k = 999$ (key range 0 to 999)

Counting Sort complexity: $O(n + k) = O(10{,}000{,}000 + 999) \approx O(n)$

Comparison-based sorts: $O(n \log n) = O(10{,}000{,}000 \times \log_2 10{,}000{,}000) \approx O(10{,}000{,}000 \times 23.25) \approx O(232{,}500{,}000)$

Counting Sort: $O(10{,}000{,}000 + 999) \approx O(10{,}001{,}000)$

Speedup: $\frac{232{,}500{,}000}{10{,}001{,}000} \approx 23\times$ faster!


Why Counting Sort is ideal here:

  1. Key range is small: $k = 999 \ll n = 10^7$. The condition $k = O(n)$ is satisfied, so counting sort runs in linear time.

  2. Extra space is manageable: We need a count array of size 1000 and an output array of size $n$. The count array is trivially small (1000 integers = ~4 KB). The output array needs $O(n)$ space, which is the same as the input.

  3. Stability: Counting sort is stable, which is important if the records have secondary fields that should maintain their relative order.

  4. Simple implementation: No complex divide-and-conquer or tree structures needed.
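For records rather than bare integers, the count array indexes on the key field while whole records are moved, which is exactly where stability matters. A sketch of how this might look (the `Record` struct, its field names, and the function name are hypothetical):

```c
#include <stdlib.h>

#define KEY_RANGE 1000   // keys 0..999, as in the problem statement

// Hypothetical record: sorted by `key`; `payload` stands in for the
// record's other fields, whose relative order stability preserves.
typedef struct { int key; int payload; } Record;

void countingSortRecords(Record A[], int n) {
    int count[KEY_RANGE] = {0};
    Record *output = malloc(n * sizeof(Record));

    for (int i = 0; i < n; i++) count[A[i].key]++;                 // count keys
    for (int v = 1; v < KEY_RANGE; v++) count[v] += count[v - 1];  // prefix sums

    // Right-to-left pass keeps equal keys in their original order (stable).
    for (int i = n - 1; i >= 0; i--)
        output[--count[A[i].key]] = A[i];

    for (int i = 0; i < n; i++) A[i] = output[i];
    free(output);
}
```

After sorting `{{5,0},{3,1},{5,2},{3,3}}` by key, the two key-3 records keep their original relative order (payloads 1 then 3), as do the key-5 records.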

Why NOT other algorithms?

| Algorithm | Time for this case | Issue |
|---|---|---|
| Counting Sort | $O(n) \approx 10^7$ βœ” | Best choice |
| Quick Sort | $O(n \log n) \approx 2.3 \times 10^8$ | 23Γ— slower |
| Merge Sort | $O(n \log n) \approx 2.3 \times 10^8$ | Also needs $O(n)$ extra space |
| Heap Sort | $O(n \log n) \approx 2.3 \times 10^8$ | Poor cache performance on large data |
| Radix Sort | $O(d \times (n + k'))$ | Also viable; uses counting sort as a subroutine |

Counting sort wins because $k \ll n$, making it effectively linear time. If the key range were $k = 10^9$ (much larger than $n$), counting sort would be impractical (too much memory), and Quick Sort or Merge Sort would be better choices.


Long Answer


Q11. Compare Merge Sort and Quick Sort comprehensively β€” algorithm, dry run, time/space complexity, stability, and use cases.

Merge Sort

Algorithm: Divide and Conquer β€” split the array into halves, recursively sort each half, then merge the two sorted halves.

void merge(int A[], int left, int mid, int right) {
    int n1 = mid - left + 1, n2 = right - mid;
    int L[n1], R[n2];
    for (int i = 0; i < n1; i++) L[i] = A[left + i];
    for (int j = 0; j < n2; j++) R[j] = A[mid + 1 + j];

    int i = 0, j = 0, k = left;
    while (i < n1 && j < n2)
        A[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) A[k++] = L[i++];
    while (j < n2) A[k++] = R[j++];
}

void mergeSort(int A[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergeSort(A, left, mid);
        mergeSort(A, mid + 1, right);
        merge(A, left, mid, right);
    }
}

Dry Run on [5, 3, 8, 1]:

[5, 3, 8, 1] β†’ split β†’ [5, 3] and [8, 1]
[5, 3] β†’ split β†’ [5] and [3] β†’ merge β†’ [3, 5]
[8, 1] β†’ split β†’ [8] and [1] β†’ merge β†’ [1, 8]
Merge [3, 5] and [1, 8]:
  1 < 3 β†’ [1]
  3 < 8 β†’ [1, 3]
  5 < 8 β†’ [1, 3, 5]
  Remaining: 8 β†’ [1, 3, 5, 8] βœ”

Quick Sort

Algorithm: Divide and Conquer β€” choose a pivot, partition elements into ≀pivot and >pivot groups, then recursively sort each group.

int partition(int A[], int low, int high) {
    int pivot = A[high], i = low - 1;
    for (int j = low; j < high; j++) {
        if (A[j] <= pivot) {
            i++;
            int temp = A[i]; A[i] = A[j]; A[j] = temp;
        }
    }
    int temp = A[i+1]; A[i+1] = A[high]; A[high] = temp;
    return i + 1;
}

void quickSort(int A[], int low, int high) {
    if (low < high) {
        int pi = partition(A, low, high);
        quickSort(A, low, pi - 1);
        quickSort(A, pi + 1, high);
    }
}

Dry Run on [5, 3, 8, 1], pivot = 1:

Partition with pivot=1 (A[3]):
  i=-1
  j=0: 5>1 skip; j=1: 3>1 skip; j=2: 8>1 skip
  Swap A[0]↔A[3] β†’ [1, 3, 8, 5]  pivot at index 0

  Left: [] (empty)
  Right: [3, 8, 5], pivot=5
    j=1: 3≀5 β†’ i=1, swap A[1]↔A[1] (no change)
    j=2: 8>5 skip
    Swap A[2]↔A[3] β†’ [1, 3, 5, 8]  pivot at index 2
    Left: [3] βœ”   Right: [8] βœ”

Result: [1, 3, 5, 8] βœ”

Comprehensive Comparison

| Aspect | Merge Sort | Quick Sort |
|---|---|---|
| Strategy | Divide β†’ Recurse β†’ Merge | Partition β†’ Recurse |
| When work happens | During merge (combining) | During partition (dividing) |
| Best Case | $O(n \log n)$ | $O(n \log n)$ |
| Average Case | $O(n \log n)$ | $O(n \log n)$ |
| Worst Case | $O(n \log n)$ βœ” | $O(n^2)$ βœ— |
| Space | $O(n)$ (auxiliary arrays) | $O(\log n)$ (recursion stack only) |
| Stable | Yes βœ” | No βœ— |
| In-Place | No βœ— | Yes βœ” |
| Cache Performance | Poor (copies to temp arrays) | Excellent (in-place, sequential) |
| Parallelizable | Yes | Yes (but partitions may be unbalanced) |
| Adaptive | No (same time regardless of input) | No (unless modified) |

Recurrences:

| Case | Merge Sort | Quick Sort |
|---|---|---|
| Best/Average | $T(n) = 2T(n/2) + O(n)$ | $T(n) = 2T(n/2) + O(n)$ |
| Worst | Same β†’ $O(n \log n)$ | $T(n) = T(n-1) + O(n)$ β†’ $O(n^2)$ |
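Unrolling the worst-case Quick Sort recurrence (e.g. already-sorted input with a last-element pivot) shows directly where the quadratic comes from:

```latex
T(n) = T(n-1) + cn
     = T(n-2) + c(n-1) + cn
     = \cdots
     = c \sum_{i=1}^{n} i
     = \frac{c \, n(n+1)}{2}
     = O(n^2)
```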

Use Cases:

| Scenario | Better Choice | Reason |
|---|---|---|
| General in-memory sorting | Quick Sort | Faster in practice (cache behavior, small constant factor) |
| Stability is needed | Merge Sort | The stable $O(n \log n)$ choice among these algorithms |
| Worst-case guarantee needed | Merge Sort | Always $O(n \log n)$ |
| Memory is limited | Quick Sort | $O(\log n)$ vs $O(n)$ space |
| Sorting linked lists | Merge Sort | No random access needed |
| External sorting (disk) | Merge Sort | Sequential access pattern |
| Large arrays of primitives | Quick Sort | Best practical performance |

Q12. Explain Heap Sort completely β€” what is a heap, how to build a max-heap, how the sorting works, with a full example. Analyze its complexity.

What is a Heap?

A heap is a complete binary tree that satisfies the heap property:

  • Max-Heap: Every parent node is β‰₯ both its children. The root holds the maximum.
  • Min-Heap: Every parent node is ≀ both its children. The root holds the minimum.

Complete binary tree: All levels are fully filled except possibly the last, which is filled left to right.

Array representation: A heap is stored efficiently in an array (0-indexed):

  • Left child of node $i$: $2i + 1$
  • Right child of node $i$: $2i + 2$
  • Parent of node $i$: $\lfloor(i-1)/2\rfloor$
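These index formulas translate directly into code (a minimal sketch; the helper names are illustrative, not from any standard library):

```c
// Index arithmetic for a 0-indexed, array-backed heap.
// Integer division in heapParent floors automatically for i >= 1.
static inline int heapLeft(int i)   { return 2 * i + 1; }
static inline int heapRight(int i)  { return 2 * i + 2; }
static inline int heapParent(int i) { return (i - 1) / 2; }
```

Note that `heapParent(heapLeft(i)) == i` and `heapParent(heapRight(i)) == i` for every node, which is what makes the flat array a faithful encoding of the tree.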

Heapify β€” Fixing a Single Violation

heapify(A, n, i) assumes the subtrees rooted at left(i) and right(i) are already valid max-heaps, but A[i] might violate the max-heap property. It fixes this by β€œsifting down” the element.

void heapify(int A[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;

    if (left < n && A[left] > A[largest])
        largest = left;
    if (right < n && A[right] > A[largest])
        largest = right;

    if (largest != i) {
        int temp = A[i]; A[i] = A[largest]; A[largest] = temp;
        heapify(A, n, largest);  // Recursively fix the affected subtree
    }
}

Time: $O(\log n)$ β€” at most one swap per level, and the tree has $\log n$ levels.


Building the Max-Heap

// Start from the last non-leaf node and heapify each node bottom-up
for (int i = n / 2 - 1; i >= 0; i--)
    heapify(A, n, i);

Why start at n/2 - 1? Nodes at indices $\lfloor n/2 \rfloor$ to $n-1$ are leaves β€” they have no children and trivially satisfy the heap property.

Time for build: $O(n)$ (not $O(n \log n)$!) β€” because most nodes are near the bottom and need little work.
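The $O(n)$ bound can be made precise by summing the work level by level: at height $h$ there are at most $\lceil n/2^{h+1} \rceil$ nodes, and heapifying each costs $O(h)$:

```latex
\sum_{h=0}^{\lfloor \log n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h)
  \;=\; O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
  \;=\; O(2n)
  \;=\; O(n)
```

since the series $\sum_{h \ge 0} h/2^h$ converges to 2.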


Heap Sort Algorithm

void heapSort(int A[], int n) {
    // Phase 1: Build Max-Heap
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(A, n, i);

    // Phase 2: Extract max repeatedly
    for (int i = n - 1; i > 0; i--) {
        int temp = A[0]; A[0] = A[i]; A[i] = temp;  // Move max to end
        heapify(A, i, 0);  // Restore heap on reduced array
    }
}

How it works:

  1. Build a max-heap β†’ the largest element is at A[0] (root).
  2. Swap A[0] (max) with A[n-1] (last). Now the largest is at its final sorted position.
  3. Reduce the heap size by 1 and heapify the root to restore the max-heap.
  4. Repeat until heap has size 1.

Full Example: Sort [7, 3, 5, 1, 9, 2]

Phase 1: Build Max-Heap

Initial array: [7, 3, 5, 1, 9, 2]  (n = 6)

Tree:       7
           / \
          3   5
         / \ /
        1  9 2

Last non-leaf: index 2 (n/2 - 1 = 2)

i=2: Heapify(2). Node=5, Left=2. 5 > 2 β†’ no swap.
  [7, 3, 5, 1, 9, 2]

i=1: Heapify(1). Node=3, Left=1, Right=9. Largest=9 at idx 4.
  Swap A[1]↔A[4] β†’ [7, 9, 5, 1, 3, 2]
  Heapify(4): no children β†’ stop.

i=0: Heapify(0). Node=7, Left=9, Right=5. Largest=9 at idx 1.
  Swap A[0]↔A[1] β†’ [9, 7, 5, 1, 3, 2]
  Heapify(1): Left=1, Right=3. Largest=7 (parent) β†’ stop.

Max-Heap: [9, 7, 5, 1, 3, 2]
        9
       / \
      7   5
     / \ /
    1  3 2

Phase 2: Sort by extracting max

Step 1: Swap A[0]=9 ↔ A[5]=2.  Heapify(0, size=5).
  [2, 7, 5, 1, 3, | 9]
  Heapify: 2<7 β†’ swap(0,1) β†’ [7, 2, 5, 1, 3, | 9]
  Heapify(1): 2<3 β†’ swap(1,4) β†’ [7, 3, 5, 1, 2, | 9]
  Heap: [7, 3, 5, 1, 2]

Step 2: Swap A[0]=7 ↔ A[4]=2.  Heapify(0, size=4).
  [2, 3, 5, 1, | 7, 9]
  Heapify: 2<5 β†’ swap(0,2) β†’ [5, 3, 2, 1, | 7, 9]
  Heap: [5, 3, 2, 1]

Step 3: Swap A[0]=5 ↔ A[3]=1.  Heapify(0, size=3).
  [1, 3, 2, | 5, 7, 9]
  Heapify: 1<3 β†’ swap(0,1) β†’ [3, 1, 2, | 5, 7, 9]
  Heap: [3, 1, 2]

Step 4: Swap A[0]=3 ↔ A[2]=2.  Heapify(0, size=2).
  [2, 1, | 3, 5, 7, 9]
  Heapify: 2>1 β†’ no swap.
  Heap: [2, 1]

Step 5: Swap A[0]=2 ↔ A[1]=1.  Size=1 β†’ done.
  [1, | 2, 3, 5, 7, 9]

Result: [1, 2, 3, 5, 7, 9] βœ”

Complexity Analysis

| Aspect | Value | Explanation |
|---|---|---|
| Build heap | $O(n)$ | Most nodes are near the bottom; total work sums to $O(n)$ |
| Extract phase | $O(n \log n)$ | $n-1$ extractions, each requiring an $O(\log n)$ heapify |
| Total Time | $O(n \log n)$ | Dominated by the extraction phase |
| Best Case | $O(n \log n)$ | Same regardless of input |
| Worst Case | $O(n \log n)$ | Same; no degradation |
| Average Case | $O(n \log n)$ | Same |
| Space | $O(1)$ | In-place; only swaps within the array |
| Stable | No | Swapping the root to the end can reorder equal elements |

Advantages:

  • Guaranteed $O(n \log n)$ β€” no worst-case degradation like Quick Sort
  • In-place β€” $O(1)$ extra space, unlike Merge Sort’s $O(n)$
  • Combines the best of Merge Sort (guaranteed performance) and Quick Sort (in-place)

Disadvantages:

  • Not stable β€” can’t preserve relative order of equal elements
  • Poor cache performance β€” parent-child relationships jump around in memory
  • Slower in practice than Quick Sort due to cache misses and larger constant factor

Q13. Explain Counting Sort. Why is it $O(n + k)$? When is it better than comparison-based sorts? When is it impractical?

Algorithm

Counting Sort works by counting occurrences of each distinct value, then using those counts to place elements directly in their correct positions.

void countingSort(int A[], int n, int k) {
    int count[k + 1];   // Count array for values 0..k
    int output[n];       // Output array

    // Phase 1: Initialize counts to 0
    for (int i = 0; i <= k; i++) count[i] = 0;

    // Phase 2: Count occurrences of each value
    for (int i = 0; i < n; i++) count[A[i]]++;

    // Phase 3: Compute cumulative (prefix) sums
    for (int i = 1; i <= k; i++) count[i] += count[i - 1];

    // Phase 4: Build output array (right-to-left for stability)
    for (int i = n - 1; i >= 0; i--) {
        output[count[A[i]] - 1] = A[i];
        count[A[i]]--;
    }

    // Phase 5: Copy output back to original array
    for (int i = 0; i < n; i++) A[i] = output[i];
}

Step-by-Step Explanation

Example: Sort [4, 2, 2, 8, 3, 3, 1], $k = 8$

Phase 1 β€” Initialize: count = [0, 0, 0, 0, 0, 0, 0, 0, 0]

Phase 2 β€” Count occurrences:

A[0]=4 β†’ count[4]++    count = [0, 0, 0, 0, 1, 0, 0, 0, 0]
...after all elements:   count = [0, 1, 2, 2, 1, 0, 0, 0, 1]
      (0 zeros, 1 one, 2 twos, 2 threes, 1 four, 0 fives/sixes/sevens, 1 eight)

Phase 3 β€” Prefix sums:

count[i] += count[i-1]:
count = [0, 1, 3, 5, 6, 6, 6, 6, 7]
Meaning: count[v] = number of elements ≀ v
  ≀0: 0 elements, ≀1: 1 element, ≀2: 3 elements, ..., ≀8: 7 elements

Phase 4 β€” Place elements (right-to-left):

After prefix: count = [0, 1, 3, 5, 6, 6, 6, 6, 7]

i=6: A[6]=1 β†’ pos = count[1]-1 = 0 β†’ output[0]=1, count[1]=0
i=5: A[5]=3 β†’ pos = count[3]-1 = 4 β†’ output[4]=3, count[3]=4
i=4: A[4]=3 β†’ pos = count[3]-1 = 3 β†’ output[3]=3, count[3]=3
i=3: A[3]=8 β†’ pos = count[8]-1 = 6 β†’ output[6]=8, count[8]=6
i=2: A[2]=2 β†’ pos = count[2]-1 = 2 β†’ output[2]=2, count[2]=2
i=1: A[1]=2 β†’ pos = count[2]-1 = 1 β†’ output[1]=2, count[2]=1
i=0: A[0]=4 β†’ pos = count[4]-1 = 5 β†’ output[5]=4, count[4]=5

Output: [1, 2, 2, 3, 3, 4, 8] βœ”

Why is it $O(n + k)$?

| Phase | Operations | Time |
|---|---|---|
| Initialize count array | $k + 1$ writes | $O(k)$ |
| Count occurrences | $n$ reads/writes | $O(n)$ |
| Compute prefix sums | $k$ additions | $O(k)$ |
| Build output | $n$ reads/writes | $O(n)$ |
| Copy back | $n$ copies | $O(n)$ |
| Total | | $O(n + k)$ |

Key insight: There are no nested loops. Each phase scans either the input array ($O(n)$) or the count array ($O(k)$) once. The sum is $O(n + k)$.

It is linear when $k = O(n)$ β€” then $O(n + k) = O(n)$.


When is Counting Sort better than comparison-based sorts?

Counting Sort wins when $k = O(n)$:

| Scenario | Comparison Sort | Counting Sort | Winner |
|---|---|---|---|
| $n = 10^6$, $k = 1000$ | $O(n \log n) \approx 2 \times 10^7$ | $O(n + k) \approx 10^6$ | Counting Sort (20Γ—) |
| $n = 10^6$, $k = 10^6$ | $O(n \log n) \approx 2 \times 10^7$ | $O(n + k) = 2 \times 10^6$ | Counting Sort (10Γ—) |
| $n = 10^6$, $k = 10^9$ | $O(n \log n) \approx 2 \times 10^7$ | $O(n + k) \approx 10^9$ | Comparison Sort |

Rule of thumb: Use counting sort when $k \leq n$ (or $k = O(n)$).

Specific scenarios where counting sort shines:

  • Sorting exam scores (0–100) for a class of 1000 students: $k = 100, n = 1000$
  • Sorting ages (0–150) of a population: $k = 150, n$ can be millions
  • Sorting characters by ASCII value (0–127): $k = 127$
  • As a subroutine in Radix Sort (sorting digit by digit)

When is Counting Sort impractical?

  1. Large key range ($k \gg n$):
    • Sorting 100 numbers in range $[0, 10^9]$: the count array would need $10^9$ entries (4 GB memory!) for just 100 numbers. A comparison sort at $O(n \log n) = O(100 \times 7) = 700$ operations is vastly better.
  2. Floating-point keys:
    • Counting sort requires integer keys (or keys that can be mapped to a finite integer range). Floating-point numbers have too many possible values (effectively infinite precision).
  3. Negative keys (without modification):
    • The basic algorithm assumes keys in $[0, k]$. Negative keys require offsetting all values by the minimum, adding complexity.
  4. Keys are complex objects:
    • If keys are strings, composite objects, or non-numeric, counting sort doesn’t directly apply. You’d need a mapping to integers first.
  5. Memory constraints:
    • Even when $k$ is reasonable, the extra space $O(n + k)$ might be unacceptable in memory-constrained environments. The output array alone is $O(n)$.
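The negative-key workaround from point 3 is small in practice. A sketch of counting sort with an offset by the minimum key (the function name is hypothetical; everything else follows the algorithm given earlier):

```c
#include <stdlib.h>

// Counting sort extended to negative keys: shift every key by -min so
// the count array indexes 0..(max - min), then write keys back unshifted.
void countingSortWithNegatives(int A[], int n) {
    if (n <= 0) return;
    int min = A[0], max = A[0];
    for (int i = 1; i < n; i++) {          // find the key range
        if (A[i] < min) min = A[i];
        if (A[i] > max) max = A[i];
    }
    int range = max - min + 1;             // size of the shifted key space
    int *count  = calloc(range, sizeof(int));
    int *output = malloc(n * sizeof(int));

    for (int i = 0; i < n; i++) count[A[i] - min]++;           // count
    for (int v = 1; v < range; v++) count[v] += count[v - 1];  // prefix sums
    for (int i = n - 1; i >= 0; i--)       // right-to-left keeps it stable
        output[--count[A[i] - min]] = A[i];
    for (int i = 0; i < n; i++) A[i] = output[i];

    free(count);
    free(output);
}
```

The only extra cost is one $O(n)$ pass to find the minimum and maximum, so the overall complexity is still $O(n + k)$ with $k = \text{max} - \text{min}$.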

Summary:

\[\text{Use Counting Sort when } k = O(n) \text{ and keys are non-negative integers.}\]

\[\text{Use Comparison Sort when } k \gg n \text{ or keys are non-integer.}\]

← Back to Unit II Notes