Unit II - Linear Data Structures, Searching & Sorting
Syllabus Coverage: Concept of sequential organization, Arrays as ADT, Storage representation of Array, Array insertion and deletion, Matrix operations using arrays, String operations (Length, Concatenation, Copy, Palindrome, Reverse, Compare, Substring) without using library functions, Searching: Linear and Binary search algorithms and their analysis. Sorting: General concepts and algorithm analysis - Bubble sort, Insertion sort, Selection sort, Merge sort, Quick sort, Heap sort, stable and in-place sorting, sorting techniques in linear time - Counting sort. (09 Hours)
Table of Contents
1 - Arrays
- 1.1 Sequential Organization
- 1.2 Array as an ADT
- 1.3 Storage Representation of Arrays
- 1.4 Array Insertion and Deletion
- 1.5 Matrix Operations Using Arrays
- 1.6 Finding Minimum & Maximum - Tournament Method
- Practice Questions - Arrays
2 - String Operations
- 2.1 String Basics
- 2.2 String Length
- 2.3 String Copy
- 2.4 String Concatenation
- 2.5 String Reverse
- 2.6 String Compare
- 2.7 Palindrome Check
- 2.8 Substring Search
- Practice Questions - Strings
3 - Searching
- 3.1 Linear Search
- 3.2 Binary Search
- 3.3 Comparison - Linear vs Binary Search
- Practice Questions - Searching
4 - Sorting
- 4.1 Sorting Concepts - Stable & In-Place
- 4.2 Bubble Sort
- 4.3 Selection Sort
- 4.4 Insertion Sort
- 4.5 Merge Sort
- 4.6 Quick Sort
- 4.7 Heap Sort
- 4.8 Counting Sort (Linear Time)
- 4.9 Sorting Algorithms - Master Comparison
- Practice Questions - Sorting
Glossary - Key Terms
| Term | Meaning |
|---|---|
| Array | A collection of elements of the same type stored in contiguous memory locations. |
| Index | The position number used to access an array element (starts at 0 in C). |
| Row-Major Order | 2D array elements stored row by row in memory. |
| Column-Major Order | 2D array elements stored column by column in memory. |
| String | A sequence of characters terminated by a null character ('\0') in C. |
| Linear Search | Sequentially check each element until the target is found or the array ends. |
| Binary Search | Divide a sorted array in half repeatedly to find the target. |
| Stable Sort | A sort that preserves the relative order of records with equal keys. |
| In-Place Sort | A sort that uses $O(1)$ or $O(\log n)$ extra memory (no large auxiliary arrays). |
| Divide & Conquer | A strategy that splits a problem into subproblems, solves each, then combines results. |
| Pivot | The element chosen in Quick Sort around which partitioning occurs. |
| Heap | A complete binary tree satisfying the heap property (max-heap or min-heap). |
1 - Arrays
1.1 Sequential Organization
Definition: Sequential organization means data elements are stored in consecutive (contiguous) memory locations. The position of each element is determined by its index.
In sequential organization:
- Elements are stored one after another in memory
- The address of any element can be calculated directly from its index
- Random access is possible (access any element in $O(1)$ time)
- Insertion and deletion are expensive (require shifting)
The array is the most fundamental sequential data structure.
1.2 Array as an ADT
ADT Array:
Data:
A fixed-size, ordered collection of elements of the same type
Each element is identified by an index (0 to n-1)
Operations:
create(size) → Create array of given size
get(index) → Return element at index - O(1)
set(index, value) → Set element at index to value - O(1)
insert(index, value) → Insert value at index (shift) - O(n)
delete(index) → Delete element at index (shift) - O(n)
search(value) → Find index of value - O(n)
length() → Return number of elements - O(1)
display() → Print all elements - O(n)
Error Conditions:
get/set/insert/delete with invalid index → "Index out of bounds"
insert on full array → "Array overflow"
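The ADT description above can be sketched directly in C. This is a minimal illustrative sketch, not a standard API: the names `Array`, `arr_create`, `arr_get`, and `arr_set` are invented for this example.

```c
#include <stdlib.h>

/* Minimal sketch of the Array ADT described above.
   Names (Array, arr_create, arr_get, arr_set) are illustrative. */
typedef struct {
    int *data;   /* contiguous storage               */
    int size;    /* fixed number of elements         */
} Array;

/* create(size): fixed-size array, all elements zero-initialized */
Array arr_create(int size) {
    Array a;
    a.data = calloc((size_t)size, sizeof(int));
    a.size = size;
    return a;
}

/* get(index) - O(1); *ok = 0 signals "Index out of bounds" */
int arr_get(const Array *a, int index, int *ok) {
    if (index < 0 || index >= a->size) { *ok = 0; return 0; }
    *ok = 1;
    return a->data[index];
}

/* set(index, value) - O(1); returns 0 on invalid index */
int arr_set(Array *a, int index, int value) {
    if (index < 0 || index >= a->size) return 0;
    a->data[index] = value;
    return 1;
}
```

Note how both error conditions from the ADT specification surface as explicit return/flag values, since C has no exceptions.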
1.3 Storage Representation of Arrays
One-Dimensional Array
If an array A of $n$ elements starts at base address $B$, and each element takes $w$ bytes:
\[\text{Address of } A[i] = B + i \times w\]
Example: int A[5] starting at address 1000, sizeof(int) = 4
| Index | Element | Address |
|---|---|---|
| 0 | A[0] | 1000 |
| 1 | A[1] | 1004 |
| 2 | A[2] | 1008 |
| 3 | A[3] | 1012 |
| 4 | A[4] | 1016 |
Address of A[3] = $1000 + 3 \times 4 = 1012$ ✓
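This formula is exactly what C pointer arithmetic does under the hood. A small self-check (the helper `address_matches` is illustrative, not from the source) confirms that `&A[i]` equals base + i × sizeof(int):

```c
#include <stdint.h>

/* Illustrative check that &A[i] == B + i * w,
   i.e. the 1D address formula above. */
int address_matches(const int A[], int i) {
    uintptr_t base = (uintptr_t)&A[0];
    uintptr_t calc = base + (uintptr_t)i * sizeof(int);
    return calc == (uintptr_t)&A[i];
}
```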
Two-Dimensional Array - Row-Major Order
Elements are stored row by row.
For array A[m][n] starting at base address $B$:
\[\text{Address of } A[i][j] = B + (i \times n + j) \times w\]
Example: int A[3][4] at base address 2000
Memory layout (row-major):
A[0][0] A[0][1] A[0][2] A[0][3] | A[1][0] A[1][1] A[1][2] A[1][3] | A[2][0] ...
2000 2004 2008 2012 | 2016 2020 2024 2028 | 2032 ...
Address of A[1][2] = $2000 + (1 \times 4 + 2) \times 4 = 2000 + 24 = 2024$ ✓
Two-Dimensional Array - Column-Major Order
Elements are stored column by column.
\[\text{Address of } A[i][j] = B + (j \times m + i) \times w\]
Example: Same A[3][4] at base 2000 in column-major:
Address of A[1][2] = $2000 + (2 \times 3 + 1) \times 4 = 2000 + 28 = 2028$
C uses Row-Major order. FORTRAN uses Column-Major order. Exam questions often ask you to compute addresses in both orders.
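The two 0-based formulas can be wrapped in small helper functions for practice. This is a sketch; the function names are invented, and the test values reproduce the worked examples above.

```c
/* Address of A[i][j] for a matrix at base address B with
   element width w; all indices 0-based. Sketch only. */
long addr_row_major(long B, int i, int j, int n, int w) {
    /* row-major: skip i full rows of n elements, then j more */
    return B + (long)(i * n + j) * w;
}
long addr_col_major(long B, int i, int j, int m, int w) {
    /* column-major: skip j full columns of m elements, then i more */
    return B + (long)(j * m + i) * w;
}
```

Note the asymmetry: row-major needs the number of columns `n`, column-major needs the number of rows `m`.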
General Formula - Arbitrary Lower Bounds
The formulas above assume indexing starts at 0. In many languages (FORTRAN, Pascal) and in exam problems, arrays can start from any index, including negative values.
1D Array - $A[l_1 \ldots u_1]$ (lower bound $l_1$, upper bound $u_1$):
Number of elements: $n = u_1 - l_1 + 1$
\[\text{Address of } A[i] = B + (i - l_1) \times w\]
Example: int A[-3 ... 4] starting at base address 500, sizeof(int) = 4
Here $l_1 = -3$, $u_1 = 4$, $n = 4 - (-3) + 1 = 8$ elements.
Address of A[2] = $500 + (2 - (-3)) \times 4 = 500 + 5 \times 4 = 520$
Address of A[-1] = $500 + (-1 - (-3)) \times 4 = 500 + 2 \times 4 = 508$
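The lower-bound formula also translates to a one-line helper (illustrative name `addr_1d_general`), which the worked example above can verify:

```c
/* Address of A[i] for an array declared A[l1 ... u1]:
   B + (i - l1) * w.  Sketch matching the example above. */
long addr_1d_general(long B, int i, int l1, int w) {
    return B + (long)(i - l1) * w;
}
```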
2D Array - $A[l_1 \ldots u_1][l_2 \ldots u_2]$:
Let $R = u_1 - l_1 + 1$ (number of rows) and $C = u_2 - l_2 + 1$ (number of columns).
Row-Major:
\[\text{Address of } A[i][j] = B + \big[(i - l_1) \times C + (j - l_2)\big] \times w\]
Column-Major:
\[\text{Address of } A[i][j] = B + \big[(j - l_2) \times R + (i - l_1)\big] \times w\]
Example: int A[2 ... 5][3 ... 6] at base 1000, $w = 2$ bytes.
Here $l_1 = 2, u_1 = 5, l_2 = 3, u_2 = 6$, so $R = 4, C = 4$.
Row-Major: Address of A[3][5] = $1000 + [(3 - 2) \times 4 + (5 - 3)] \times 2 = 1000 + 6 \times 2 = 1012$
Column-Major: Address of A[3][5] = $1000 + [(5 - 3) \times 4 + (3 - 2)] \times 2 = 1000 + 9 \times 2 = 1018$
Example (negative bounds): A[-2 ... 2][0 ... 3] at base 400, $w = 4$ bytes.
$l_1 = -2, u_1 = 2, l_2 = 0, u_2 = 3 \Rightarrow R = 5, C = 4$, total = 20 elements.
Row-Major: Address of A[0][2] = $400 + [(0 - (-2)) \times 4 + (2 - 0)] \times 4 = 400 + 10 \times 4 = 440$
Column-Major: Address of A[0][2] = $400 + [(2 - 0) \times 5 + (0 - (-2))] \times 4 = 400 + 12 \times 4 = 448$
💡 Key Insight: The simplified formulas ($B + (i \times n + j) \times w$) are just special cases where $l_1 = 0$ and $l_2 = 0$. When you substitute $l_1 = l_2 = 0$ into the general formula, the subtraction terms vanish and you get the simplified version. Always use the general formula in exams unless the problem explicitly states 0-based indexing.
Quick Reference β All Address Formulas:
| Array Type | Formula |
|---|---|
| 1D $A[l_1 \ldots u_1]$ | $B + (i - l_1) \times w$ |
| 2D Row-Major $A[l_1 \ldots u_1][l_2 \ldots u_2]$ | $B + [(i - l_1) \times C + (j - l_2)] \times w$ |
| 2D Column-Major $A[l_1 \ldots u_1][l_2 \ldots u_2]$ | $B + [(j - l_2) \times R + (i - l_1)] \times w$ |
Where $R = u_1 - l_1 + 1$, $C = u_2 - l_2 + 1$, $w$ = size of each element.
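The quick-reference formulas above can be coded directly; a sketch with invented helper names follows, tested against the two worked examples (A[2..5][3..6] at base 1000 with w = 2, and A[-2..2][0..3] at base 400 with w = 4):

```c
/* General 2D address formulas with arbitrary lower bounds.
   C = u2 - l2 + 1 columns, R = u1 - l1 + 1 rows. Sketch only. */
long addr2d_row_major(long B, int i, int j,
                      int l1, int l2, int u2, int w) {
    int C = u2 - l2 + 1;                   /* columns per row */
    return B + (long)((i - l1) * C + (j - l2)) * w;
}
long addr2d_col_major(long B, int i, int j,
                      int l1, int u1, int l2, int w) {
    int R = u1 - l1 + 1;                   /* rows per column */
    return B + (long)((j - l2) * R + (i - l1)) * w;
}
```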
1.4 Array Insertion and Deletion
Insertion at Position pos
// Insert 'value' at position 'pos' in array A of size n
void insert(int A[], int *n, int pos, int value) {
// Step 1: Shift elements to the right
for (int i = *n - 1; i >= pos; i--) {
A[i + 1] = A[i];
}
// Step 2: Insert the new element
A[pos] = value;
// Step 3: Increase size
(*n)++;
}
💡 Line-by-Line Logic:
- `int *n` → We pass size as a pointer because insertion changes the array size; the caller needs to see the updated value.
- `for (int i = *n - 1; i >= pos; i--)` → We shift right-to-left (from the last element toward `pos`). Why not left-to-right? Because shifting left-to-right would overwrite elements before we move them! Starting from the end ensures each element is safely copied to `i+1` before being overwritten.
- `A[i + 1] = A[i]` → Each element moves one position right, creating a gap at `pos`.
- `A[pos] = value` → Once the gap is created, we simply place the new value there.
- `(*n)++` → Array now has one more element. The dereferenced pointer update ensures the caller sees the new size.
Time Complexity:
- Best case (insert at end): $O(1)$
- Worst case (insert at beginning): $O(n)$ β shift all elements
- Average case: $O(n)$
Deletion at Position pos
// Delete element at position 'pos' from array A of size n
void delete(int A[], int *n, int pos) {
// Step 1: Shift elements to the left
for (int i = pos; i < *n - 1; i++) {
A[i] = A[i + 1];
}
// Step 2: Decrease size
(*n)--;
}
💡 Line-by-Line Logic:
- `for (int i = pos; i < *n - 1; i++)` → We shift left-to-right (opposite of insertion!). Why? Because we're filling the gap at `pos` by pulling elements leftward. Starting from `pos` ensures each position gets its replacement from `i+1` before `i+1` is itself overwritten.
- `i < *n - 1` → We stop at the second-to-last element because `A[i] = A[i+1]` would access out-of-bounds if `i` reached `*n - 1`.
- `A[i] = A[i + 1]` → Each element slides one position left, overwriting the deleted position.
- `(*n)--` → Array shrinks by one. No need to clear the last position: it's simply ignored since the size decreased.
- Notice the symmetry: Insertion shifts right-to-left, deletion shifts left-to-right. This pattern prevents data loss during shifting.
Time Complexity: Same as insertion - $O(n)$ worst case.
Example β Insert 25 at position 2:
Before: [10, 20, 30, 40, 50] n = 5
                 ↑ pos = 2
Shift: [10, 20, __, 30, 40, 50] (shift 30,40,50 right)
Insert: [10, 20, 25, 30, 40, 50] n = 6
Example β Delete element at position 1:
Before: [10, 20, 30, 40, 50] n = 5
             ↑ pos = 1
Shift: [10, 30, 40, 50, __] (shift 30,40,50 left)
After: [10, 30, 40, 50] n = 4
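The two worked examples can be reproduced end to end. The sketch below repeats compact versions of the routines above (renamed `insertAt`/`deleteAt` for this demo) so it compiles on its own:

```c
/* Compact copies of the insert/delete routines above,
   renamed insertAt/deleteAt for this self-contained demo. */
void insertAt(int A[], int *n, int pos, int value) {
    for (int i = *n - 1; i >= pos; i--)
        A[i + 1] = A[i];          /* shift right to open a gap */
    A[pos] = value;
    (*n)++;
}
void deleteAt(int A[], int *n, int pos) {
    for (int i = pos; i < *n - 1; i++)
        A[i] = A[i + 1];          /* shift left to close the gap */
    (*n)--;
}
```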
1.5 Matrix Operations Using Arrays
Matrix Addition
\[C[i][j] = A[i][j] + B[i][j]\]
void matrixAdd(int A[][MAX], int B[][MAX], int C[][MAX], int m, int n) {
for (int i = 0; i < m; i++)
for (int j = 0; j < n; j++)
C[i][j] = A[i][j] + B[i][j];
}
// Time: O(m × n)   Space: O(m × n)
💡 Why This Logic: Matrix addition is element-wise: each `C[i][j]` depends only on `A[i][j]` and `B[i][j]` at the same position. The two nested loops simply visit every cell exactly once. Both matrices must have identical dimensions $m \times n$.
Matrix Multiplication
\[C[i][j] = \sum_{k=0}^{p-1} A[i][k] \times B[k][j]\]
For $A_{m \times p}$ and $B_{p \times n}$, result is $C_{m \times n}$:
void matrixMultiply(int A[][MAX], int B[][MAX], int C[][MAX],
int m, int p, int n) {
for (int i = 0; i < m; i++)
for (int j = 0; j < n; j++) {
C[i][j] = 0;
for (int k = 0; k < p; k++)
C[i][j] += A[i][k] * B[k][j];
}
}
// Time: O(m × n × p)   For square matrices: O(n³)
💡 Line-by-Line Logic:
- `C[i][j] = 0` → We must initialize before accumulating sums. Without this, `C[i][j]` might hold garbage values and the `+=` would produce wrong results.
- Why 3 nested loops? → Each `C[i][j]` is the dot product of row `i` of A and column `j` of B. The `k` loop computes this dot product: `A[i][0]*B[0][j] + A[i][1]*B[1][j] + ... + A[i][p-1]*B[p-1][j]`.
- `k < p` → `p` is the shared dimension (columns of A = rows of B). This is why matrix multiplication requires `A.cols == B.rows`.
- Why `A[i][k] * B[k][j]`? → We walk across row `i` of A (varying `k`) and down column `j` of B (varying `k`). The shared index `k` links corresponding elements.
Matrix Transpose
\[B[j][i] = A[i][j]\]
void transpose(int A[][MAX], int B[][MAX], int m, int n) {
for (int i = 0; i < m; i++)
for (int j = 0; j < n; j++)
B[j][i] = A[i][j];
}
// Time: O(m × n)
💡 Why This Logic: Transpose mirrors a matrix across its main diagonal: row `i` becomes column `i`. So the element at position `(i, j)` moves to `(j, i)`, which is why `B[j][i] = A[i][j]`. Note: if A is $m \times n$, then B must be $n \times m$; the dimensions swap. We use a separate output matrix B because doing it in-place is tricky for non-square matrices.
Special Matrix Storage - Triangular, Diagonal, Symmetric & Tridiagonal
For an $n \times n$ matrix, storing all $n^2$ elements wastes memory when only a subset are non-zero. Special matrices exploit structure to store only the required elements in a compact 1D array.
1. Lower Triangular Matrix
A matrix where all elements above the main diagonal are zero: $A[i][j] = 0$ for $j > i$.
| 1 0 0 0 |
| 2 3 0 0 |
| 4 5 6 0 |
| 7 8 9 10 |
Non-zero elements: Row $i$ has $i + 1$ elements (0-indexed). Total = $\sum_{i=0}^{n-1}(i+1) = \frac{n(n+1)}{2}$
Row-Major Mapping (store row by row into 1D array):
\[\text{index of } A[i][j] = \frac{i(i+1)}{2} + j \qquad (j \leq i)\]
💡 Why? Before row $i$, there are $1 + 2 + \cdots + i = \frac{i(i+1)}{2}$ elements. Within row $i$, element at column $j$ is at offset $j$.
Column-Major Mapping (store column by column):
\[\text{index of } A[i][j] = \frac{j(2n - j + 1)}{2} + (i - j) \qquad (j \leq i)\]
💡 Why? Column $j$ in a lower triangular matrix has elements from row $j$ to row $n-1$. Before column $j$, there are $n + (n-1) + \cdots + (n-j+1) = \frac{j(2n - j + 1)}{2}$ elements; within column $j$, the offset is $i - j$.
Example: $4 \times 4$ lower triangular matrix. Find index of $A[2][1]$ in row-major.
\[\text{index} = \frac{2(2+1)}{2} + 1 = 3 + 1 = 4\]
1D array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] → arr[4] = 5 ✓
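The row-major mapping translates into a compact accessor. This is a sketch with invented names (`lt_index`, `lt_get`); it returns 0 for positions above the diagonal without touching the packed array, and is tested against the 4 × 4 example above:

```c
/* Compact storage of a lower triangular matrix: row-major band,
   index(i, j) = i(i+1)/2 + j for j <= i.  Sketch, 0-based. */
int lt_index(int i, int j) {
    return i * (i + 1) / 2 + j;
}
int lt_get(const int packed[], int i, int j) {
    if (j > i) return 0;       /* above the diagonal: always zero */
    return packed[lt_index(i, j)];
}
```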
2. Upper Triangular Matrix
A matrix where all elements below the main diagonal are zero: $A[i][j] = 0$ for $j < i$.
| 1 2 3 4 |
| 0 5 6 7 |
| 0 0 8 9 |
| 0 0 0 10 |
Non-zero elements: Row $i$ has $n - i$ elements (0-indexed). Total = $\frac{n(n+1)}{2}$
Row-Major Mapping:
\[\text{index of } A[i][j] = in - \frac{i(i-1)}{2} + (j - i) \qquad (j \geq i)\]
💡 Why? Before row $i$, rows $0$ to $i-1$ contribute $n + (n-1) + \cdots + (n-i+1) = in - \frac{i(i-1)}{2}$ elements; within row $i$, the offset from the diagonal is $j - i$.
Column-Major Mapping:
\[\text{index of } A[i][j] = \frac{j(j+1)}{2} + i \qquad (j \geq i)\]
💡 Why? Column $j$ has elements from row $0$ to row $j$. Before column $j$, there are $1 + 2 + \cdots + j = \frac{j(j+1)}{2}$ elements. Within column $j$, the element at row $i$ is at offset $i$.
Example: $4 \times 4$ upper triangular. Find index of $A[1][3]$ in row-major.
\[\text{index} = 1 \times 4 - \frac{1(1-1)}{2} + (3 - 1) = 4 + 0 + 2 = 6\]
1D array: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] → arr[6] = 7 ✓
Check by counting: Row 0 has $n = 4$ elements (indices 0-3); row 1 has 3 elements (indices 4-6). $A[1][3]$ is the last element of row 1 → index 6 → value 7. ✓
3. Diagonal Matrix
A matrix where all elements outside the main diagonal are zero: $A[i][j] = 0$ for $i \neq j$.
| 5 0 0 0 |
| 0 3 0 0 |
| 0 0 7 0 |
| 0 0 0 2 |
Storage: Only $n$ elements needed. Mapping is trivial:
\[\text{index of } A[i][i] = i\]
💡 Only diagonal elements $A[0][0], A[1][1], \ldots, A[n-1][n-1]$ are stored. Any access $A[i][j]$ with $i \neq j$ returns 0 without accessing the array.
4. Symmetric Matrix
A matrix where $A[i][j] = A[j][i]$ for all $i, j$. Store only the lower triangular part (or upper) β the other half is a mirror.
Storage: Same as Lower Triangular = $\frac{n(n+1)}{2}$ elements.
\[\text{To access } A[i][j]: \begin{cases} \text{use LT formula if } i \geq j \\ \text{swap: access } A[j][i] \text{ if } i < j \end{cases}\]
5. Tridiagonal Matrix
A matrix where non-zero elements appear only on:
- Main diagonal ($i = j$)
- Diagonal above ($j = i + 1$)
- Diagonal below ($j = i - 1$)
| 1 2 0 0 0 |
| 3 4 5 0 0 |
| 0 6 7 8 0 |
| 0 0 9 10 11 |
| 0 0 0 12 13 |
Non-zero elements: $3n - 2$ (for $n \geq 2$). The first and last rows have 2 elements each; all middle rows have 3.
Mapping to 1D array (row-major):
\[\text{index of } A[i][j] = 2i + j \qquad (\lvert i - j \rvert \leq 1)\]
💡 Why $2i + j$? Row 0 contributes 2 elements (columns 0 and 1); every later row contributes 3. So $2 + 3(i-1) = 3i - 1$ elements precede row $i$ (for $i \geq 1$). Row $i$ starts at column $i - 1$, which therefore lands at index $3i - 1$, so $A[i][j]$ is at $3i - 1 + \big(j - (i - 1)\big) = 2i + j$.
Alternatively (3-band mapping):
\[\text{index of } A[i][j] = 3i + (j - i + 1) - 1 = 2i + j\]
Example: $5 \times 5$ tridiagonal. Find index of $A[2][3]$ and $A[3][2]$.
$A[2][3]$: index = $2 \times 2 + 3 = 7$ → value 8 ✓
$A[3][2]$: index = $2 \times 3 + 2 = 8$ → value 9 ✓
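The band mapping fits in a few lines of C. This sketch (invented name `tri_get`) returns 0 for any position outside the three diagonals and is tested against the 5 × 5 example above:

```c
/* Tridiagonal storage: keep only the 3n - 2 band elements,
   at index(i, j) = 2i + j for |i - j| <= 1.  Sketch, 0-based. */
int tri_get(const int band[], int i, int j) {
    if (i - j > 1 || j - i > 1) return 0;  /* outside the band */
    return band[2 * i + j];
}
```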
Quick Reference β Special Matrix Storage:
| Matrix Type | Non-Zero Elements | 1D Size | Row-Major Index Formula |
|---|---|---|---|
| Lower Triangular | $j \leq i$ | $\frac{n(n+1)}{2}$ | $\frac{i(i+1)}{2} + j$ |
| Upper Triangular | $j \geq i$ | $\frac{n(n+1)}{2}$ | $in - \frac{i(i-1)}{2} + (j - i)$ |
| Diagonal | $i = j$ | $n$ | $i$ |
| Symmetric | Store lower half | $\frac{n(n+1)}{2}$ | Same as Lower Triangular (swap if $i < j$) |
| Tridiagonal | $\lvert i - j \rvert \leq 1$ | $3n - 2$ | $2i + j$ |
All indices are 0-based. For 1-based indexing, adjust formulas accordingly (e.g., Lower Triangular: $\frac{i(i-1)}{2} + j$).
1.6 Finding Minimum & Maximum - Tournament Method
Problem: Find both the minimum and maximum elements of an array of $n$ elements.
Naïve approach: Scan the array twice (once for min, once for max) → $2(n-1)$ comparisons.
Better: Single scan tracking both → $2(n-1)$ comparisons.
Optimal - Tournament Method (Divide & Conquer): Uses only $\lceil 3n/2 \rceil - 2$ comparisons.
Naïve Algorithm - Single Pass:
void findMinMax(int arr[], int n, int *min, int *max) {
*min = *max = arr[0];
for (int i = 1; i < n; i++) {
if (arr[i] < *min) *min = arr[i];
if (arr[i] > *max) *max = arr[i];
}
}
// Comparisons: 2(n-1) - two comparisons per element
💡 Why 2(n-1)? Each element (except the first) is compared against both the current min and max: that's 2 comparisons per element for $n-1$ elements.
Tournament Method (Pairwise Comparison):
The key insight: compare pairs of elements first, then compare the smaller with min and the larger with max. This saves comparisons.
void tournamentMinMax(int arr[], int n, int *min, int *max) {
int i;
// Handle odd/even n
if (n % 2 == 0) {
if (arr[0] < arr[1]) { *min = arr[0]; *max = arr[1]; }
else { *min = arr[1]; *max = arr[0]; }
i = 2; // Start from index 2
} else {
*min = *max = arr[0];
i = 1; // Start from index 1
}
// Process pairs
for (; i < n - 1; i += 2) {
if (arr[i] < arr[i + 1]) {
if (arr[i] < *min) *min = arr[i];
if (arr[i + 1] > *max) *max = arr[i + 1];
} else {
if (arr[i + 1] < *min) *min = arr[i + 1];
if (arr[i] > *max) *max = arr[i];
}
}
}
// Comparisons: ⌈3n/2⌉ - 2
💡 Line-by-Line Logic:
- Pair comparison first → Compare `arr[i]` with `arr[i+1]` (1 comparison). This tells us which is the candidate for min and which for max.
- Then compare with running min/max → The smaller of the pair only needs to be compared with `min` (1 comparison). The larger only with `max` (1 comparison).
- 3 comparisons per 2 elements → Instead of 4 comparisons per 2 elements (naïve). For $n$ elements: $\lceil 3n/2 \rceil - 2$ total.
- Odd $n$ handling → If odd, initialize min and max with the first element alone, then process remaining elements in pairs.
- Why is this optimal? → It can be proven that $\lceil 3n/2 \rceil - 2$ is the theoretical lower bound for simultaneous min-max finding. This algorithm achieves it.
Find min and max of [3, 5, 1, 8, 2, 7] ($n = 6$, even)
| Step | Pair | Compare pair | Compare with min/max | min | max |
|---|---|---|---|---|---|
| Init | (3,5) | 3 < 5 | - | 3 | 5 |
| 1 | (1,8) | 1 < 8 | 1 < 3 ✓, 8 > 5 ✓ | 1 | 8 |
| 2 | (2,7) | 2 < 7 | 2 < 1? No, 7 > 8? No | 1 | 8 |
Result: min = 1, max = 8
Comparisons: 1 (init) + 3 + 3 = 7 = $\lceil 3 \times 6/2 \rceil - 2 = 9 - 2 = 7$ ✓
Naïve would use $2(6-1) = 10$ comparisons.
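The 10-vs-7 count can be checked by instrumenting both algorithms with a comparison counter. A sketch (the counter and function names are invented for this demo; the logic mirrors the routines above):

```c
/* Instrumented min-max finders that count key comparisons,
   to verify 2(n-1) vs ceil(3n/2) - 2 on the example above. */
static int cmp_count;

void naiveMinMax(const int a[], int n, int *min, int *max) {
    cmp_count = 0;
    *min = *max = a[0];
    for (int i = 1; i < n; i++) {
        cmp_count++; if (a[i] < *min) *min = a[i];
        cmp_count++; if (a[i] > *max) *max = a[i];
    }
}

void pairMinMax(const int a[], int n, int *min, int *max) {
    int i;
    cmp_count = 0;
    if (n % 2 == 0) {                 /* seed from the first pair */
        cmp_count++;
        if (a[0] < a[1]) { *min = a[0]; *max = a[1]; }
        else             { *min = a[1]; *max = a[0]; }
        i = 2;
    } else { *min = *max = a[0]; i = 1; }
    for (; i < n - 1; i += 2) {       /* 3 comparisons per pair */
        int lo, hi;
        cmp_count++;
        if (a[i] < a[i + 1]) { lo = a[i];     hi = a[i + 1]; }
        else                 { lo = a[i + 1]; hi = a[i];     }
        cmp_count++; if (lo < *min) *min = lo;
        cmp_count++; if (hi > *max) *max = hi;
    }
}
```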
Recursive Tournament (Divide & Conquer) Version:
typedef struct { int min; int max; } MinMax;
MinMax tournamentDC(int arr[], int low, int high) {
MinMax result, left, right;
if (low == high) { // Single element
result.min = result.max = arr[low];
return result;
}
if (high == low + 1) { // Two elements
if (arr[low] < arr[high]) {
result.min = arr[low]; result.max = arr[high];
} else {
result.min = arr[high]; result.max = arr[low];
}
return result;
}
int mid = (low + high) / 2;
left = tournamentDC(arr, low, mid);
right = tournamentDC(arr, mid + 1, high);
result.min = (left.min < right.min) ? left.min : right.min;
result.max = (left.max > right.max) ? left.max : right.max;
return result;
}
💡 Why "Tournament"? Think of it like a sports tournament: elements are paired off, winners (max candidates) advance up one bracket, losers (min candidates) advance up another. The overall min and max emerge from fewer total "matches" (comparisons).
Practice Questions - Arrays
Q1. Define an array as an ADT. List all operations with their time complexities.
Q2. An array A[10] of integers starts at address 500. Each integer takes 4 bytes. Find the address of A[7].
Q3. A 2D array B[5][8] starts at base address 1000, element size 2 bytes. Find the address of B[3][5] in (a) Row-major (b) Column-major order.
Q4. Write a C function to insert an element at a given position in an array. Analyze its time and space complexity.
Q5. Write C code to multiply two matrices. What are the conditions for multiplication to be valid?
Q6. Explain the difference between Row-major and Column-major storage with diagrams.
2 - String Operations (Without Library Functions)
2.1 String Basics
In C, a string is an array of characters terminated by the null character '\0'.
char str[] = "Hello";
// Stored as: ['H', 'e', 'l', 'l', 'o', '\0']
// Size = 6 bytes (5 characters + 1 null terminator)
2.2 String Length
int stringLength(char str[]) {
int len = 0;
while (str[len] != '\0') {
len++;
}
return len;
}
// Time: O(n) Space: O(1)
💡 Why This Logic: In C, strings have no built-in length field; the only way to know where a string ends is to find the null terminator `'\0'`. So we start a counter at 0 and walk forward character by character. Each time we see a non-null character, we increment. When we hit `'\0'`, the counter holds the length. This is exactly what `strlen()` does internally.
2.3 String Copy
void stringCopy(char dest[], char src[]) {
int i = 0;
while (src[i] != '\0') {
dest[i] = src[i];
i++;
}
dest[i] = '\0'; // Don't forget the null terminator!
}
// Time: O(n) Space: O(1) (excluding destination)
💡 Line-by-Line Logic:
- `while (src[i] != '\0')` → We copy characters one-by-one until we reach the source string's end.
- `dest[i] = src[i]` → Same index `i` for both arrays because we want a 1:1 copy.
- `dest[i] = '\0'` → Critical! The while loop copies data characters but stops before copying `'\0'`. We must manually add it, otherwise `dest` won't be a valid C string and any subsequent string operation on it would read garbage past the end.
2.4 String Concatenation
void stringConcat(char dest[], char src[]) {
int i = 0, j = 0;
// Find end of dest
while (dest[i] != '\0') i++;
// Copy src to end of dest
while (src[j] != '\0') {
dest[i] = src[j];
i++;
j++;
}
dest[i] = '\0';
}
// Time: O(m + n) where m = len(dest), n = len(src)
💡 Line-by-Line Logic:
- Two-phase approach: First find where `dest` ends (phase 1), then append `src` there (phase 2).
- `while (dest[i] != '\0') i++;` → This walks `i` to the null terminator position of `dest`. We need this because we must start writing `src` after the existing content.
- Separate indices `i` and `j` → `i` tracks position in `dest` (continuing from its end), `j` tracks position in `src` (starting from 0). They advance independently.
- `dest[i] = '\0'` → Again, we must null-terminate the result. The concatenated string needs its own end marker.
- Time is O(m+n) because we traverse `dest` once (m steps) then `src` once (n steps).
2.5 String Reverse
void stringReverse(char str[]) {
int len = 0;
while (str[len] != '\0') len++;
int start = 0, end = len - 1;
while (start < end) {
// Swap characters
char temp = str[start];
str[start] = str[end];
str[end] = temp;
start++;
end--;
}
}
// Time: O(n) Space: O(1)
💡 Line-by-Line Logic:
- Two-pointer technique: `start` begins at the first character, `end` at the last. They move toward each other, swapping characters along the way.
- `while (start < end)` → We stop when the pointers meet/cross. If the string has odd length, the middle character stays in place (it's its own mirror). If even, all characters get swapped.
- Why not `start <= end`? Because when `start == end`, they point to the same character; swapping it with itself is pointless.
- In-place reversal → We use only one `temp` variable for swapping, no extra array. This is the most space-efficient approach ($O(1)$).
- Why find length first? We need `end = len - 1` to know where the last character is. Alternatively, you could walk a pointer to the end, but computing the length is cleaner.
2.6 String Compare
int stringCompare(char s1[], char s2[]) {
int i = 0;
while (s1[i] != '\0' && s2[i] != '\0') {
if (s1[i] < s2[i]) return -1; // s1 < s2
if (s1[i] > s2[i]) return 1; // s1 > s2
i++;
}
// If we reach here, one or both strings ended
if (s1[i] == '\0' && s2[i] == '\0') return 0; // equal
if (s1[i] == '\0') return -1; // s1 is shorter
return 1; // s2 is shorter
}
// Time: O(min(m, n)) Space: O(1)
💡 Line-by-Line Logic:
- Lexicographic (dictionary) comparison → We compare strings character by character, just like how words are ordered in a dictionary.
- `while (s1[i] != '\0' && s2[i] != '\0')` → We keep comparing as long as both strings still have characters. The `&&` ensures we don't read past either string's end.
- `if (s1[i] < s2[i]) return -1` → Characters are compared by ASCII values. E.g., 'a' (97) < 'b' (98), so "apple" < "banana". The moment we find a difference, the result is known; no need to check further.
- Post-loop checks → If the loop ends without finding a difference, the strings are identical up to the shorter one's length. Then:
  - Both ended (`'\0'`) → they're equal (return 0)
  - `s1` ended first → `s1` is shorter, so `s1 < s2` (e.g., "app" < "apple")
  - `s2` ended first → `s2` is shorter, so `s1 > s2`
2.7 Palindrome Check
int isPalindrome(char str[]) {
int len = 0;
while (str[len] != '\0') len++;
int start = 0, end = len - 1;
while (start < end) {
if (str[start] != str[end])
return 0; // Not a palindrome
start++;
end--;
}
return 1; // Is a palindrome
}
// Time: O(n) Space: O(1)
💡 Why This Logic:
- Same two-pointer technique as `stringReverse`, but instead of swapping, we compare. A palindrome reads the same forwards and backwards, so `str[0]` must equal `str[len-1]`, `str[1]` must equal `str[len-2]`, etc.
- Early exit on mismatch → The moment any pair doesn't match, we immediately return 0. No need to check further pairs; one mismatch is enough to prove it's not a palindrome.
- We only check half the string → The `start < end` condition means we do at most `n/2` comparisons. Checking the full string would be redundant since each comparison covers both a character and its mirror.
Example: "madam"
- Compare `str[0]='m'` with `str[4]='m'` ✓
- Compare `str[1]='a'` with `str[3]='a'` ✓
- Compare `str[2]='d'` with `str[2]='d'` ✓
- Result: Palindrome!
2.8 Substring Search
Find if string pattern exists in string text:
int substringSearch(char text[], char pattern[]) {
int n = 0, m = 0;
while (text[n] != '\0') n++; // length of text
while (pattern[m] != '\0') m++; // length of pattern
for (int i = 0; i <= n - m; i++) {
int j;
for (j = 0; j < m; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == m)
return i; // Pattern found at index i
}
return -1; // Not found
}
// Time: O(n × m) worst case   Space: O(1)
💡 Line-by-Line Logic:
- `i <= n - m` → We only try starting positions where the pattern can fully fit. If the text has 10 chars and the pattern has 3, the last valid starting position is index 7 (positions 7, 8, 9). Starting at index 8 would go out of bounds.
- The inner `j` loop → For each starting position `i`, compare `text[i+0]` with `pattern[0]`, `text[i+1]` with `pattern[1]`, etc. The offset `i+j` maps pattern positions to text positions.
- `break` on mismatch → As soon as any character doesn't match, there's no point checking the rest of this alignment; move to the next starting position.
- `if (j == m)` → If `j` made it all the way to `m` (pattern length) without breaking, ALL characters matched: the pattern is found at position `i`. If `j < m`, the break triggered, meaning a mismatch occurred.
- This is the "Brute Force" or "Naive" pattern matching algorithm. It is $O(n \times m)$ in the worst case (e.g., text = "AAAAAB", pattern = "AAB"). Faster algorithms like KMP achieve $O(n+m)$.
Practice Questions - Strings
Q1. Write C functions (without using library functions) for: String length, copy, concatenation, reverse, compare, palindrome check, and substring search.
Q2. What is the null character '\0'? Why is it important in C strings?
Q3. Write a function to count the number of vowels in a string.
Q4. Write a function to check if two strings are anagrams.
Q5. Explain the time complexity of each string operation you implemented.
3 - Searching
3.1 Linear Search
Linear Search (Sequential Search): Check each element of the array one by one from beginning to end until the target is found or the array is exhausted.
Algorithm
int linearSearch(int A[], int n, int key) {
for (int i = 0; i < n; i++) {
if (A[i] == key)
return i; // Found at index i
}
return -1; // Not found
}
💡 Line-by-Line Logic:
- `for (int i = 0; i < n; i++)` → Check every element from index 0 to n-1. There's no shortcut because the array is unsorted; the key could be anywhere.
- `if (A[i] == key) return i` → Return immediately upon finding the key. We don't need to check further because the question is "does it exist and where?" Returning the index (not just `1`) gives the caller the position.
- `return -1` → If the loop completes without finding the key, it's not in the array. We use -1 as a sentinel value because valid indices are 0 to n-1, so -1 unambiguously means "not found."
- No prerequisites → Unlike binary search, this works on unsorted arrays, sorted arrays, linked lists, anything. It's the simplest search possible.
Analysis
| Case | When | Comparisons | Time |
|---|---|---|---|
| Best | Key at index 0 | 1 | $O(1)$ |
| Worst | Key at last index or absent | $n$ | $O(n)$ |
| Average | Key equally likely anywhere | $\frac{n+1}{2}$ | $O(n)$ |
Space Complexity: $O(1)$ - no extra space needed.
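The average figure of $\frac{n+1}{2}$ in the table follows from assuming a successful search with the key equally likely at each of the $n$ positions:

```latex
% Expected comparisons for a successful linear search,
% key uniformly distributed over the n positions:
E[\text{comparisons}] = \sum_{i=1}^{n} \frac{1}{n} \cdot i
                      = \frac{1}{n} \cdot \frac{n(n+1)}{2}
                      = \frac{n+1}{2} = O(n)
```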
When to use Linear Search:
- Array is unsorted (binary search wonβt work)
- Array is small (overhead of sorting not worth it)
- You need to search only once (no point sorting for a single search)
3.2 Binary Search
Binary Search: For a sorted array, repeatedly divide the search interval in half. Compare the target with the middle element and eliminate half the remaining elements each step.
Algorithm (Iterative)
int binarySearch(int A[], int n, int key) {
int low = 0, high = n - 1;
while (low <= high) {
int mid = low + (high - low) / 2; // avoids overflow
if (A[mid] == key)
return mid; // Found
else if (A[mid] < key)
low = mid + 1; // Search right half
else
high = mid - 1; // Search left half
}
return -1; // Not found
}
💡 Line-by-Line Logic:
- `low = 0, high = n - 1` → These define the search window. Initially, the entire array is our search space.
- `while (low <= high)` → As long as there's at least one element to check. When `low > high`, the search window is empty: the key doesn't exist.
- `mid = low + (high - low) / 2` → Why not `(low + high) / 2`? Because if `low` and `high` are both large (near `INT_MAX`), their sum can overflow! The formula `low + (high - low)/2` gives the same result without overflow risk. This is a classic bug that even professional code had for decades.
- `if (A[mid] == key)` → Found it! Return the index.
- `else if (A[mid] < key)` → `low = mid + 1` → The key is larger than the middle element. Since the array is sorted, the key must be in the right half. We set `low = mid + 1` (not `mid`) because we've already checked `mid` and it's not the answer.
- `else` → `high = mid - 1` → The key is smaller, so search the left half. Again, `mid - 1` because `mid` is ruled out.
- Each iteration halves the search space → That's why it's $O(\log n)$. For 1 million elements, at most 20 comparisons!
Algorithm (Recursive)
int binarySearchRec(int A[], int low, int high, int key) {
if (low > high)
return -1; // Base case: not found
int mid = low + (high - low) / 2;
if (A[mid] == key)
return mid;
else if (A[mid] < key)
return binarySearchRec(A, mid + 1, high, key);
else
return binarySearchRec(A, low, mid - 1, key);
}
Why Recursive Version?
- Same logic as iterative, but expressed as tail recursion — each call replaces one loop iteration.
- `if (low > high) return -1` — Base case replaces the while condition. When the search window is empty, we stop.
- `return binarySearchRec(A, mid + 1, high, key)` — Instead of updating `low = mid + 1` and looping, we make a new function call with the updated range.
- Trade-off: The recursive version uses $O(\log n)$ stack space for the call frames, while the iterative version uses $O(1)$. In practice, iterative is preferred for this reason.
- Both versions are correct — the recursive one is more "elegant" and maps directly to the mathematical definition: "search in [low, mid-1] or [mid+1, high]."
Dry Run Example
Search for key = 23 in sorted array:
A = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
0 1 2 3 4 5 6 7 8 9
Step 1: low=0, high=9, mid=4 → A[4]=16 < 23 → low=5
Step 2: low=5, high=9, mid=7 → A[7]=56 > 23 → high=6
Step 3: low=5, high=6, mid=5 → A[5]=23 == 23 → FOUND at index 5! ✓
Only 3 comparisons vs potentially 6 for linear search!
Analysis
| Case | When | Comparisons | Time |
|---|---|---|---|
| Best | Key is at the middle | 1 | $O(1)$ |
| Worst | Key not present | $\lfloor \log_2 n \rfloor + 1$ | $O(\log n)$ |
| Average | — | $\approx \log_2 n$ | $O(\log n)$ |
Space Complexity:
- Iterative: $O(1)$
- Recursive: $O(\log n)$ — due to the recursion stack
PREREQUISITE: The array MUST be sorted for binary search to work. If the array is unsorted, you must sort it first (costing $O(n \log n)$).
3.3 Comparison β Linear vs Binary Search
| Feature | Linear Search | Binary Search |
|---|---|---|
| Prerequisite | None | Array must be sorted |
| Best Case | $O(1)$ | $O(1)$ |
| Worst Case | $O(n)$ | $O(\log n)$ |
| Average Case | $O(n)$ | $O(\log n)$ |
| Space | $O(1)$ | $O(1)$ iterative / $O(\log n)$ recursive |
| Data Structure | Array or linked list | Array only (needs random access) |
| Simplicity | Very simple | Slightly complex |
| For $n = 1,000,000$ | Up to 1,000,000 comparisons | At most 20 comparisons |
Practice Questions β Searching
Q1. Write iterative and recursive versions of Binary Search. Trace both for the array [3, 7, 11, 15, 19, 23, 27] searching for key = 19.
Q2. Compare Linear Search and Binary Search in terms of prerequisites, time complexity, and use cases.
Q3. Why can't Binary Search be used on a linked list efficiently?
Q4. For an array of 1 million elements, how many comparisons does Binary Search need in the worst case?
Q5. Modify Binary Search to return the first occurrence of a duplicate element.
4 β Sorting
4.1 Sorting Concepts β Stable & In-Place
Stable Sort: A sorting algorithm is stable if it preserves the relative order of records with equal keys.
In-Place Sort: A sorting algorithm is in-place if it uses only $O(1)$ (or $O(\log n)$ for recursive ones) extra memory.
Stability Example:
Input: [(Alice, 85), (Bob, 90), (Charlie, 85)]
Stable sort by marks: [(Alice, 85), (Charlie, 85), (Bob, 90)]
→ Alice appears before Charlie (same order as input) ✓
Unstable sort might give: [(Charlie, 85), (Alice, 85), (Bob, 90)]
→ Charlie before Alice (order changed) ✗
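To make the example concrete, here is a small sketch (struct name and fields are illustrative) that sorts such records by marks with insertion sort, which is stable:

```c
#include <string.h>

/* (name, marks) records, mirroring the stability example above. */
struct Student {
    const char *name;
    int marks;
};

/* Insertion sort by marks. The strict '>' comparison never shifts an
   equal-marks record past another, so the input order among ties
   survives — exactly the stability property. */
void sortByMarks(struct Student A[], int n) {
    for (int i = 1; i < n; i++) {
        struct Student key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j].marks > key.marks) {
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;
    }
}
```

Running this on the input above leaves Alice before Charlie, matching the "stable" outcome.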
4.2 Bubble Sort
Idea: Repeatedly step through the array, compare adjacent elements, and swap them if they are in the wrong order. After each pass, the largest unsorted element "bubbles up" to its correct position.
Algorithm
void bubbleSort(int A[], int n) {
for (int i = 0; i < n - 1; i++) {
int swapped = 0; // Optimization flag
for (int j = 0; j < n - 1 - i; j++) {
if (A[j] > A[j + 1]) {
// Swap
int temp = A[j];
A[j] = A[j + 1];
A[j + 1] = temp;
swapped = 1;
}
}
if (!swapped) break; // Already sorted — early exit
}
}
Line-by-Line Logic:
- `i < n - 1` — We need at most `n-1` passes. Why? After each pass, the largest unsorted element reaches its final position. After `n-1` passes, `n-1` elements are placed, so the last one is automatically correct.
- `j < n - 1 - i` — This is the key optimization. After pass `i`, the last `i` elements are already in their final sorted positions (they "bubbled up"), so we don't need to compare them again. In pass 0, we check all `n-1` pairs. In pass 1, the last element is sorted, so we check `n-2` pairs. In pass `i`, we check `n-1-i` pairs. Without the `-i`, the algorithm would still be correct but would do useless comparisons.
- `if (A[j] > A[j + 1])` → swap — Compare adjacent elements. If the left one is larger, they're in the wrong order, so swap them. This is what makes larger elements "bubble" rightward.
- `swapped` flag — If we complete an entire pass without any swaps, the array is already sorted! The `break` skips the remaining passes. This turns the best case (already sorted) from $O(n^2)$ to $O(n)$.
- Why compare adjacent pairs? — Think of it like bubbles rising in water. Each pass pushes the biggest unsorted element to the end, one swap at a time. After pass 0: the max is at position `n-1`. After pass 1: the second-max at `n-2`. And so on.
Dry Run
Sort: [64, 34, 25, 12, 22, 11, 90]
Pass 1: Compare adjacent, swap if needed:
[64, 34, 25, 12, 22, 11, 90]
64>34 → swap → [34, 64, 25, 12, 22, 11, 90]
64>25 → swap → [34, 25, 64, 12, 22, 11, 90]
64>12 → swap → [34, 25, 12, 64, 22, 11, 90]
64>22 → swap → [34, 25, 12, 22, 64, 11, 90]
64>11 → swap → [34, 25, 12, 22, 11, 64, 90]
64<90 → no swap
After Pass 1: [34, 25, 12, 22, 11, 64, 90] — 90 is in place
Continue passes until sorted…
Analysis
| Case | Comparisons | Swaps | Time |
|---|---|---|---|
| Best (already sorted, with flag) | $n - 1$ | 0 | $O(n)$ |
| Worst (reverse sorted) | $\frac{n(n-1)}{2}$ | $\frac{n(n-1)}{2}$ | $O(n^2)$ |
| Average | $\frac{n(n-1)}{2}$ | $\frac{n(n-1)}{4}$ | $O(n^2)$ |
Space: $O(1)$ — In-place
Stable: Yes ✓ (equal elements are never swapped)
4.3 Selection Sort
Idea: Find the minimum element in the unsorted portion and swap it with the first unsorted element. Repeat for each position.
Algorithm
void selectionSort(int A[], int n) {
for (int i = 0; i < n - 1; i++) {
int minIdx = i;
for (int j = i + 1; j < n; j++) {
if (A[j] < A[minIdx])
minIdx = j;
}
// Swap A[i] and A[minIdx]
int temp = A[i];
A[i] = A[minIdx];
A[minIdx] = temp;
}
}
Line-by-Line Logic:
- `i < n - 1` — We place `n-1` elements; the last one is automatically correct.
- `minIdx = i` — We assume the current position holds the minimum, then scan the rest to see if there's something smaller.
- `for (int j = i + 1; j < n; j++)` — Search the unsorted portion for the actual minimum. Why start at `i+1`? Because positions `0` to `i-1` are already sorted from previous passes, and `A[i]` is already our candidate minimum.
- `if (A[j] < A[minIdx]) minIdx = j` — We only update the index, not swap yet. Why? Because we want to find the true minimum first, not swap multiple times. This is what makes selection sort do exactly `n-1` swaps total — one swap per pass, unlike bubble sort, which may swap many times per pass.
- The swap after the inner loop — Once we know the minimum's position, one swap puts it in place. This is why selection sort is preferred when write operations are expensive (e.g., writing to flash memory).
- Why is it unstable? — The swap can move `A[i]` to a distant position, jumping over equal elements. Example: `[5a, 5b, 3]` → swap `5a` with `3` → `[3, 5b, 5a]` → 5a and 5b changed relative order.
Dry Run
Sort: [64, 25, 12, 22, 11]
Pass 1: Find min in [64,25,12,22,11] → 11 at idx 4. Swap with idx 0.
[11, 25, 12, 22, 64]
Pass 2: Find min in [25,12,22,64] → 12 at idx 2. Swap with idx 1.
[11, 12, 25, 22, 64]
Pass 3: Find min in [25,22,64] → 22 at idx 3. Swap with idx 2.
[11, 12, 22, 25, 64]
Pass 4: Find min in [25,64] → 25 at idx 3. No swap needed.
[11, 12, 22, 25, 64] ✓ SORTED
Analysis
| Case | Comparisons | Swaps | Time |
|---|---|---|---|
| All cases | $\frac{n(n-1)}{2}$ | $n - 1$ | $O(n^2)$ |
Space: $O(1)$ — In-place
Stable: No ✗ (swapping can change the relative order of equal elements)
Selection Sort does fewer swaps than Bubble Sort β exactly $n-1$ swaps vs up to $\frac{n(n-1)}{2}$. This makes it better when write operations are expensive.
4.4 Insertion Sort
Idea: Build the sorted array one element at a time. Take each element and insert it into its correct position in the already-sorted left portion, shifting elements as needed.
Algorithm
void insertionSort(int A[], int n) {
for (int i = 1; i < n; i++) {
int key = A[i];
int j = i - 1;
// Shift elements greater than key to the right
while (j >= 0 && A[j] > key) {
A[j + 1] = A[j];
j--;
}
A[j + 1] = key;
}
}
Line-by-Line Logic:
- `i = 1` — We start from the second element because a single element (index 0) is trivially sorted. The sorted portion initially is just `A[0]`.
- `key = A[i]` — We save the element to be inserted. This is crucial because the shifting in the while loop will overwrite `A[i]`.
- `j = i - 1` — Start comparing from the rightmost element of the sorted portion (just left of `key`).
- `while (j >= 0 && A[j] > key)` — Two conditions:
  - `j >= 0` — Don't go past the beginning of the array (bounds check).
  - `A[j] > key` — Keep shifting as long as elements are larger than `key`. Equal elements are NOT shifted (`>` not `>=`), which is why insertion sort is stable.
- `A[j + 1] = A[j]` — Shift the element one position right to make room. We don't need a swap — just overwrite `A[j+1]`, because we saved the original value in `key`.
- `j--` — Move left to check the next element.
- `A[j + 1] = key` — When the while loop ends, `j` is either `-1` (key is smallest) or `A[j] <= key`. Either way, `j+1` is the correct insertion position.
- Think of it like sorting a hand of cards — you pick up one card at a time and slide it into the correct position among the cards you're already holding.
Dry Run
Sort: [12, 11, 13, 5, 6]
i=1: key=11, compare with 12. 12>11 → shift. Insert 11 at pos 0.
[11, 12, 13, 5, 6]
i=2: key=13, compare with 12. 12<13 → stop. Already in place.
[11, 12, 13, 5, 6]
i=3: key=5, compare with 13→shift, 12→shift, 11→shift. Insert at pos 0.
[5, 11, 12, 13, 6]
i=4: key=6, compare with 13→shift, 12→shift, 11→shift. Insert at pos 1.
[5, 6, 11, 12, 13] ✓ SORTED
Analysis
| Case | Comparisons | Shifts | Time |
|---|---|---|---|
| Best (already sorted) | $n - 1$ | 0 | $O(n)$ |
| Worst (reverse sorted) | $\frac{n(n-1)}{2}$ | $\frac{n(n-1)}{2}$ | $O(n^2)$ |
| Average | $\frac{n(n-1)}{4}$ | $\frac{n(n-1)}{4}$ | $O(n^2)$ |
Space: $O(1)$ — In-place
Stable: Yes ✓
Insertion Sort is excellent for:
- Nearly sorted arrays (best case $O(n)$!)
- Small arrays ($n < 50$)
- Online sorting (can sort data as it arrives)
Many practical sorting implementations (e.g., TimSort in Python/Java) use Insertion Sort for small sub-arrays.
4.5 Merge Sort
Idea: Divide and Conquer β divide the array into two halves, recursively sort each half, then merge the two sorted halves into one sorted array.
Algorithm
void merge(int A[], int left, int mid, int right) {
int n1 = mid - left + 1; // size of left half
int n2 = right - mid; // size of right half
// Create temp arrays
int L[n1], R[n2];
for (int i = 0; i < n1; i++) L[i] = A[left + i];
for (int j = 0; j < n2; j++) R[j] = A[mid + 1 + j];
// Merge back into A[left..right]
int i = 0, j = 0, k = left;
while (i < n1 && j < n2) {
if (L[i] <= R[j]) // <= ensures stability
A[k++] = L[i++];
else
A[k++] = R[j++];
}
// Copy remaining elements
while (i < n1) A[k++] = L[i++];
while (j < n2) A[k++] = R[j++];
}
void mergeSort(int A[], int left, int right) {
if (left < right) {
int mid = left + (right - left) / 2;
mergeSort(A, left, mid); // Sort left half
mergeSort(A, mid + 1, right); // Sort right half
merge(A, left, mid, right); // Merge sorted halves
}
}
Line-by-Line Logic — `merge()`:
- `n1 = mid - left + 1` — Size of the left half `[left..mid]`. The `+1` is because both endpoints are inclusive.
- `n2 = right - mid` — Size of the right half `[mid+1..right]`. No `+1` needed because `right - (mid+1) + 1 = right - mid`.
- Why create temp arrays `L[]` and `R[]`? — We're about to overwrite `A[left..right]` during merging. If we didn't copy the data first, we'd lose the original values. This is why merge sort needs $O(n)$ extra space.
- `if (L[i] <= R[j])` — The `<=` is critical for stability! When elements are equal, we pick from the left array first. Since left-array elements appeared first in the original array, their relative order is preserved. Using `<` instead would make it unstable.
- The main while loop — Two-pointer merge: compare the front elements of both halves, take the smaller one. This works because both halves are already sorted (guaranteed by recursion). Think of merging two sorted decks of cards.
- Remaining-element loops — One array might run out before the other. The remaining elements are already sorted and all larger than what's been placed, so just copy them directly.

Line-by-Line Logic — `mergeSort()`:
- `if (left < right)` — Base case: if `left >= right`, the sub-array has 0 or 1 element, which is inherently sorted. No work needed.
- `mid = left + (right - left) / 2` — Same overflow-safe midpoint formula as binary search.
- Divide → Conquer → Combine — First sort the left half, then the right half, then merge them. The recursion goes deeper until we hit single elements, then merges build the sorted result bottom-up.
Dry Run
Sort: [38, 27, 43, 3, 9, 82, 10]
[38, 27, 43, 3, 9, 82, 10]
         /            \
[38, 27, 43, 3]    [9, 82, 10]
   /        \        /      \
[38, 27]  [43, 3]  [9, 82]  [10]
  /   \    /   \    /   \     |
[38] [27] [43] [3] [9] [82] [10]
  \   /    \   /    \   /     |
[27, 38]  [3, 43]  [9, 82]  [10]
    \        /        \      /
[3, 27, 38, 43]    [9, 10, 82]
         \            /
[3, 9, 10, 27, 38, 43, 82] ✓ SORTED
Analysis
| Case | Time | Reason |
|---|---|---|
| Best | $O(n \log n)$ | Always divides and merges |
| Worst | $O(n \log n)$ | Same β not input-dependent |
| Average | $O(n \log n)$ | Same |
Recurrence: $T(n) = 2T(n/2) + O(n)$ → by the Master Theorem: $O(n \log n)$
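Unrolling the recurrence (writing the $O(n)$ merge cost as $cn$) shows where the $\log n$ factor comes from:

$$
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &= 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn.
\end{aligned}
$$

At $k = \log_2 n$ the sub-problems reach size 1, giving $T(n) = n\,T(1) + cn\log_2 n = O(n \log n)$.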
Space: $O(n)$ β needs auxiliary arrays for merging
Stable: Yes ✓
In-Place: No ✗ (needs $O(n)$ extra space)
4.6 Quick Sort
Idea: Divide and Conquer β choose a pivot element, partition the array so all elements smaller than pivot are on the left and all larger are on the right, then recursively sort each partition.
Algorithm (Lomuto Partition Scheme)
int partition(int A[], int low, int high) {
int pivot = A[high]; // Choose last element as pivot
int i = low - 1; // Index of smaller element
for (int j = low; j < high; j++) {
if (A[j] <= pivot) {
i++;
// Swap A[i] and A[j]
int temp = A[i];
A[i] = A[j];
A[j] = temp;
}
}
// Place pivot in correct position
int temp = A[i + 1];
A[i + 1] = A[high];
A[high] = temp;
return i + 1; // Return pivot index
}
void quickSort(int A[], int low, int high) {
if (low < high) {
int pi = partition(A, low, high);
quickSort(A, low, pi - 1); // Sort left of pivot
quickSort(A, pi + 1, high); // Sort right of pivot
}
}
Line-by-Line Logic — `partition()` (Lomuto Scheme):
- `pivot = A[high]` — We choose the last element as pivot. Other strategies (first, random, median-of-three) exist, but last-element is simplest to implement.
- `i = low - 1` — `i` tracks the boundary between "≤ pivot" elements and "> pivot" elements. It starts before `low` because no elements have been classified yet. Everything to the left of `i` (inclusive) is ≤ pivot.
- `for (int j = low; j < high; j++)` — `j` scans through all elements (except the pivot itself). For each element, we decide: is it ≤ pivot or > pivot?
- `if (A[j] <= pivot) { i++; swap(A[i], A[j]); }` — Found an element that belongs in the "small" section. Increment `i` to extend the small section, then swap `A[j]` into position `i`. Elements between `i+1` and `j-1` are all > pivot.
- Final swap: `A[i+1] ↔ A[high]` — Place the pivot between the two partitions. Position `i+1` holds the first element > pivot, so swapping puts the pivot at the exact boundary. Everything left is ≤ pivot, everything right is > pivot.
- `return i + 1` — The pivot's final index. It will never move again.

Why Lomuto over Hoare? Lomuto is easier to understand and implement. Hoare's partition is slightly more efficient (fewer swaps on average) but harder to code correctly.

Why the worst case is $O(n^2)$: If the array is already sorted and we pick the last element as pivot, the pivot is always the largest. Every partition produces sub-arrays of size `n-1` and `0`. That gives $T(n) = T(n-1) + O(n) = O(n^2)$.
Dry Run
Sort: [10, 80, 30, 90, 40, 50, 70], pivot = 70 (last element)
Partition around pivot 70:
Elements ≤ 70: [10, 30, 40, 50]
Pivot: [70]
Elements > 70: [80, 90]
Result: [10, 30, 40, 50, 70, 80, 90]
                         ↑ Pivot in final position
Recursively sort [10, 30, 40, 50] and [80, 90]
→ [10, 30, 40, 50, 70, 80, 90] ✓ SORTED
Analysis
| Case | When | Time |
|---|---|---|
| Best | Pivot always splits array into two equal halves | $O(n \log n)$ |
| Worst | Pivot is always the smallest or largest element (already sorted input!) | $O(n^2)$ |
| Average | Random pivots | $O(n \log n)$ |
Space: $O(\log n)$ average (recursion stack), $O(n)$ worst case
Stable: No ✗
In-Place: Yes ✓ (with recursion stack overhead)
Quick Sort's worst case is $O(n^2)$! This happens when the array is already sorted (or reverse sorted) and the pivot is always the first/last element. Solution: use randomized pivot selection or median-of-three.
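The randomized-pivot fix is a one-line change: pick a random index, swap it into the pivot slot, then partition as usual. A sketch (the Lomuto partition is repeated from the listing above so the snippet stands alone; `quickSortRandomized` is an illustrative name):

```c
#include <stdlib.h>

/* Lomuto partition, as in the listing above. */
static int partition(int A[], int low, int high) {
    int pivot = A[high];
    int i = low - 1;
    for (int j = low; j < high; j++) {
        if (A[j] <= pivot) {
            i++;
            int t = A[i]; A[i] = A[j]; A[j] = t;
        }
    }
    int t = A[i + 1]; A[i + 1] = A[high]; A[high] = t;
    return i + 1;
}

/* Randomized pivot: swap a random element into the pivot slot first.
   This makes the sorted-input O(n^2) case vanishingly unlikely. */
static int randomizedPartition(int A[], int low, int high) {
    int r = low + rand() % (high - low + 1);
    int t = A[r]; A[r] = A[high]; A[high] = t;
    return partition(A, low, high);
}

void quickSortRandomized(int A[], int low, int high) {
    if (low < high) {
        int pi = randomizedPartition(A, low, high);
        quickSortRandomized(A, low, pi - 1);
        quickSortRandomized(A, pi + 1, high);
    }
}
```

With this change the expected running time is $O(n \log n)$ regardless of the input order, since no fixed input can consistently force bad pivots.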
Quick Sort vs Merge Sort:
- Quick Sort is faster in practice (better cache locality, smaller constant factor)
- Merge Sort guarantees $O(n \log n)$ always
- Quick Sort is in-place ($O(\log n)$ space); Merge Sort needs $O(n)$ extra space
4.7 Heap Sort
Idea: Build a Max-Heap from the array, then repeatedly extract the maximum element (root) and place it at the end of the array.
Max-Heap Property: Every parent node is ≥ its children.
Heap Basics (Array Representation)
For a node at index $i$ (0-indexed):
- Left child: $2i + 1$
- Right child: $2i + 2$
- Parent: $\lfloor(i - 1) / 2\rfloor$
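These formulas can be written as one-line helpers (the function names are mine) and checked in both directions — a node is always the parent of its own children:

```c
/* Index arithmetic for a heap stored in a 0-indexed array. */
int leftChild(int i)  { return 2 * i + 1; }
int rightChild(int i) { return 2 * i + 2; }
int parent(int i)     { return (i - 1) / 2; }  /* floor division, valid for i >= 1 */
```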
Algorithm
void heapify(int A[], int n, int i) {
int largest = i;
int left = 2 * i + 1;
int right = 2 * i + 2;
if (left < n && A[left] > A[largest])
largest = left;
if (right < n && A[right] > A[largest])
largest = right;
if (largest != i) {
int temp = A[i];
A[i] = A[largest];
A[largest] = temp;
heapify(A, n, largest); // Recursively fix subtree
}
}
void heapSort(int A[], int n) {
// Step 1: Build Max-Heap (from last non-leaf to root)
for (int i = n / 2 - 1; i >= 0; i--)
heapify(A, n, i);
// Step 2: Extract max one by one
for (int i = n - 1; i > 0; i--) {
// Move current root (max) to end
int temp = A[0];
A[0] = A[i];
A[i] = temp;
// Heapify reduced heap
heapify(A, i, 0);
}
}
Line-by-Line Logic — `heapify()`:
- `largest = i` — Assume the current node is the largest among itself and its children.
- `left = 2*i + 1`, `right = 2*i + 2` — These formulas come from the array representation of a complete binary tree. For 0-indexed arrays, the children of node `i` are at `2i+1` and `2i+2`.
- `if (left < n && A[left] > A[largest])` — Two checks: (1) does the left child exist? (`left < n`), and (2) is it larger than the current largest? The bounds check prevents array out-of-bounds access.
- `if (largest != i)` → swap and recurse — If a child was larger, swap the parent with that child to fix the violation. But now the child's subtree might be violated, so we recursively heapify downward. This "sifting down" continues until the element finds its correct position.

Line-by-Line Logic — `heapSort()`:
- `for (int i = n/2 - 1; i >= 0; i--)` — Why start at `n/2 - 1`? Because nodes at indices `n/2` to `n-1` are leaf nodes — they have no children, so they trivially satisfy the heap property. We only need to heapify internal nodes. Starting from the last non-leaf and working up ensures that when we heapify node `i`, its subtrees are already valid heaps.
- Building the heap is $O(n)$, not $O(n \log n)$! — This is a counter-intuitive result. Most nodes are near the bottom (leaves) and need little work; summing the work over all levels gives $O(n)$.
- `A[0] ↔ A[i]` — The root (maximum element) is swapped to position `i` (the end of the unsorted portion). Now `A[i]` is in its final position.
- `heapify(A, i, 0)` — The element swapped to the root likely violates the heap property. Heapify fixes it. We pass `i` as the size (not `n`) because elements after index `i` are already sorted and excluded from the heap.
- Why is it $O(n \log n)$? — Building the heap is $O(n)$, but extracting `n` elements costs $O(\log n)$ each, for $O(n \log n)$ total.
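As a quick end-to-end check, the two functions can be exercised on a small array (repeated verbatim from the listings above so the snippet compiles on its own; the test values are illustrative):

```c
/* heapify and heapSort, as defined in the text. */
void heapify(int A[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    if (left < n && A[left] > A[largest])
        largest = left;
    if (right < n && A[right] > A[largest])
        largest = right;
    if (largest != i) {
        int temp = A[i];
        A[i] = A[largest];
        A[largest] = temp;
        heapify(A, n, largest);   /* sift down into the affected subtree */
    }
}

void heapSort(int A[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)   /* build max-heap */
        heapify(A, n, i);
    for (int i = n - 1; i > 0; i--) {      /* extract max, shrink heap */
        int temp = A[0];
        A[0] = A[i];
        A[i] = temp;
        heapify(A, i, 0);
    }
}
```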
Analysis
| Case | Time |
|---|---|
| Best | $O(n \log n)$ |
| Worst | $O(n \log n)$ |
| Average | $O(n \log n)$ |
Building the heap: $O(n)$
$n$ extractions Γ heapify: $O(n \log n)$
Space: $O(1)$ — In-place ✓
Stable: No ✗
4.8 Counting Sort (Linear Time)
Idea: Count the occurrences of each distinct value, then reconstruct the sorted array from the counts. Works only when values are in a known, limited range $[0, k]$.
Algorithm
void countingSort(int A[], int n, int k) {
int count[k + 1]; // count array for values 0 to k
int output[n];
// Initialize count array
for (int i = 0; i <= k; i++) count[i] = 0;
// Count occurrences
for (int i = 0; i < n; i++) count[A[i]]++;
// Cumulative count (prefix sum)
for (int i = 1; i <= k; i++) count[i] += count[i - 1];
// Build output array (traverse input in reverse for stability)
for (int i = n - 1; i >= 0; i--) {
output[count[A[i]] - 1] = A[i];
count[A[i]]--;
}
// Copy output back to A
for (int i = 0; i < n; i++) A[i] = output[i];
}
Line-by-Line Logic:
- `int count[k + 1]` — We need one counter for each possible value from 0 to k. If the max value is 8, we need indices 0–8 = 9 slots.
- Phase 1 — Count occurrences: `count[A[i]]++` tallies how many times each value appears. After this, `count[v]` = number of elements equal to `v`.
- Phase 2 — Cumulative count (prefix sum): `count[i] += count[i - 1]` transforms the count array so that `count[v]` now tells us how many elements are ≤ v. This directly gives the position where value `v` should end up in the sorted output.
- Phase 3 — Build output (right-to-left): `for (int i = n - 1; i >= 0; i--)` — Why traverse in reverse? This is the key to stability! When two elements have the same value, the one appearing later in the input (higher index) gets placed at a later position in the output. If we went left-to-right, equal elements would end up in reversed order.
- `output[count[A[i]] - 1] = A[i]` — `count[A[i]]` tells us how many elements are ≤ `A[i]`. So position `count[A[i]] - 1` (0-indexed) is where this element belongs.
- `count[A[i]]--` — Decrement so the next element with the same value goes one position earlier.
- This is NOT a comparison-based sort — It never compares two elements against each other. That's how it beats the $\Omega(n \log n)$ lower bound for comparison sorts. The trade-off is that it only works for integers in a known range.
Dry Run
Sort: [4, 2, 2, 8, 3, 3, 1], range $k = 8$
Count: [0, 1, 2, 2, 1, 0, 0, 0, 1] (idx 0-8)
Cumulative: [0, 1, 3, 5, 6, 6, 6, 6, 7]
Build output (right to left for stability):
A[6]=1 → output[0]=1, count[1]=0
A[5]=3 → output[4]=3, count[3]=4
A[4]=3 → output[3]=3, count[3]=3
A[3]=8 → output[6]=8, count[8]=6
A[2]=2 → output[2]=2, count[2]=2
A[1]=2 → output[1]=2, count[2]=1
A[0]=4 → output[5]=4, count[4]=5
Output: [1, 2, 2, 3, 3, 4, 8] ✓ SORTED
Analysis
| Aspect | Value |
|---|---|
| Time | $O(n + k)$ where $k$ = range of values |
| Space | $O(n + k)$ |
| Stable | Yes ✓ |
| In-Place | No ✗ |
When NOT to use Counting Sort:
- When $k$ (range) is much larger than $n$ — e.g., sorting 10 numbers in range $[0, 10^9]$ wastes huge memory
- When values are negative (needs modification) or floating-point
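The negative-value case mentioned above needs only an offset: shift every value by the minimum so the smallest maps to index 0, count, then shift back. A sketch (the function name is mine; this simplified version regenerates values from their counts rather than using the stable prefix-sum formulation, so it sorts bare integers only):

```c
#include <stdlib.h>

/* Counting sort for integers that may be negative.
   Range is computed from the data; memory use is O(max - min + 1). */
void countingSortWithNegatives(int A[], int n) {
    if (n <= 0) return;
    int min = A[0], max = A[0];
    for (int i = 1; i < n; i++) {
        if (A[i] < min) min = A[i];
        if (A[i] > max) max = A[i];
    }
    int range = max - min + 1;
    int *count = calloc(range, sizeof(int));  /* zero-initialized */
    if (!count) return;
    for (int i = 0; i < n; i++)
        count[A[i] - min]++;                  /* shift into [0, range) */
    int k = 0;
    for (int v = 0; v < range; v++)           /* emit each value count[v] times */
        while (count[v]-- > 0)
            A[k++] = v + min;                 /* shift back */
    free(count);
}
```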
4.9 Sorting Algorithms β Master Comparison
| Algorithm | Best | Average | Worst | Space | Stable | In-Place |
|---|---|---|---|---|---|---|
| Bubble Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✓ | ✓ |
| Selection Sort | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✗ | ✓ |
| Insertion Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ | ✓ | ✓ |
| Merge Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ | ✓ | ✗ |
| Quick Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ | ✗ | ✓ |
| Heap Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ | ✗ | ✓ |
| Counting Sort | $O(n+k)$ | $O(n+k)$ | $O(n+k)$ | $O(n+k)$ | ✓ | ✗ |
Quick Memory Aid:
- $O(n^2)$ sorts (simple): Bubble, Selection, Insertion — easy to code, slow for large $n$
- $O(n \log n)$ sorts (efficient): Merge, Quick, Heap — work well for large $n$
- $O(n+k)$ sorts (linear): Counting Sort — fastest when $k$ is small
- Only Merge Sort is both stable AND $O(n \log n)$ guaranteed, but costs $O(n)$ extra space
Practice Questions β Sorting
Q1. Sort the array [5, 2, 8, 1, 9, 3] using (a) Bubble Sort (b) Selection Sort (c) Insertion Sort. Show all passes.
Q2. Trace Merge Sort on the array [12, 11, 13, 5, 6, 7]. Show the divide and merge steps.
Q3. Trace Quick Sort on [10, 7, 8, 9, 1, 5] using the last element as pivot. Show each partition step.
Q4. Build a Max-Heap from the array [4, 10, 3, 5, 1]. Then perform Heap Sort step by step.
Q5. Sort [1, 4, 1, 2, 7, 5, 2] using Counting Sort. Show the count array and output.
Q6. What is a stable sort? Give an example showing why stability matters.
Q7. Why is Quick Sort preferred over Merge Sort in practice despite having $O(n^2)$ worst case?
Q8. Can Bubble Sort's worst case be improved from $O(n^2)$? Explain the optimization using a swap flag.
Q9. Which sorting algorithms are suitable for nearly sorted data? Explain why.
Q10. Compare all sorting algorithms in a table. Under what conditions would you choose each?
Q11. Prove that any comparison-based sorting algorithm has a lower bound of $\Omega(n \log n)$.
Q12. Write C code for Merge Sort and trace its execution for [38, 27, 43, 3].
Comprehensive Unit II Practice Set
Short Answer
1. Define sequential organization. Why do arrays support random access?
2. What is the address formula for A[i][j] in row-major and column-major order?
3. Write C functions for string length and palindrome check without using library functions.
4. What is the difference between stable and unstable sorting?
5. Why is Binary Search $O(\log n)$? Prove it.
Tracing & Analytical
6. An integer array A[4][6] starts at address 2000. Element size = 4 bytes. Find address of A[2][3] in row-major and column-major.
7. Sort [29, 10, 14, 37, 13] using each of the 6 sorting algorithms. Show all intermediate steps.
8. Given an array of $n$ elements where only one element is out of place, which sort would you choose and why?
9. Implement Binary Search recursively. What is its space complexity? How does it differ from iterative version?
10. A company needs to sort 10 million records. The data has a limited range of keys (0 to 999). Which sorting algorithm would you recommend and why?
Long Answer
11. Compare Merge Sort and Quick Sort comprehensively β algorithm, dry run, time/space complexity, stability, and use cases.
12. Explain Heap Sort completely β what is a heap, how to build a max-heap, how the sorting works, with a full example. Analyze its complexity.
13. Explain Counting Sort. Why is it $O(n + k)$? When is it better than comparison-based sorts? When is it impractical?