GraphsLesson 3

BFS — Breadth-First Search

Explore a graph layer by layer using a queue. Full implementation, step-by-step trace, and why BFS finds shortest paths in unweighted graphs.

What Is It?

Breadth-First Search (BFS) is a way to explore a graph level by level. Start at one node. Visit all its direct neighbors first. Then visit their neighbors. Then their neighbors. Keep going, one layer at a time.

BFS uses a queue — a "first in, first out" structure (like a line at a store). When you discover a new node, you add it to the back of the queue. You always process the node at the front next. This is what keeps the exploration level-by-level.

BFS answers: "What is the shortest path from the starting node to every other node?" (For graphs where all edges have equal weight.)

Analogy

Drop a stone into a still pond. Ripples spread outward in rings — first the water 1 inch away moves, then 2 inches, then 3 inches. BFS works exactly like those ripples.

The starting node is where the stone lands. All direct neighbors are the first ring. Their neighbors are the second ring. Each ring is one step farther from the start.

How It Works

We will use this small graph for the example:

pseudocode

    0 --- 1 --- 2
    |           |
    3 --- 4 --- 5

Adjacency List:

pseudocode

  0: [1, 3]
  1: [0, 2]
  2: [1, 5]
  3: [0, 4]
  4: [3, 5]
  5: [2, 4]

BFS Pseudocode

pseudocode

function BFS(graph, start):
    visited = empty set
    queue = empty queue
    visited.add(start)
    queue.enqueue(start)
    while queue is not empty:
        node = queue.dequeue()
        process(node)           // e.g., print it
        for each neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.enqueue(neighbor)

Important: Mark a node as visited when you add it to the queue, not when you process it. This stops the same node from being added to the queue multiple times.

Step-by-Step Trace (starting from node 0)

pseudocode

Start: add 0 to queue
  Queue: [0]          Visited: {0}
Dequeue 0. Process 0. Neighbors: 1, 3
  Add 1 and 3 to queue.
  Queue: [1, 3]       Visited: {0, 1, 3}
Dequeue 1. Process 1. Neighbors: 0, 2
  0 already visited. Add 2.
  Queue: [3, 2]       Visited: {0, 1, 3, 2}
Dequeue 3. Process 3. Neighbors: 0, 4
  0 already visited. Add 4.
  Queue: [2, 4]       Visited: {0, 1, 3, 2, 4}
Dequeue 2. Process 2. Neighbors: 1, 5
  1 already visited. Add 5.
  Queue: [4, 5]       Visited: {0, 1, 3, 2, 4, 5}
Dequeue 4. Process 4. Neighbors: 3, 5
  3 already visited. 5 already visited.
  Queue: [5]          Visited: {0, 1, 3, 2, 4, 5}
Dequeue 5. Process 5. Neighbors: 2, 4
  Both already visited.
  Queue: []           Visited: {0, 1, 3, 2, 4, 5}
Done! Processing order: 0, 1, 3, 2, 4, 5

Notice the layers:

Distance 0: {0}
Distance 1: {1, 3}
Distance 2: {2, 4}
Distance 3: {5}

Why BFS Finds Shortest Paths

BFS processes all nodes at distance 1 before any node at distance 2. It processes all nodes at distance 2 before any at distance 3. So when BFS first reaches a node, it has taken the shortest possible route. There cannot be a shorter path — if there were, BFS would have found the node earlier.

Tracking Distances

To record how far each node is from the start:

pseudocode

function BFS_distances(graph, start):
    visited = empty set
    queue = empty queue
    distance = array of size V, all -1
    visited.add(start)
    queue.enqueue(start)
    distance[start] = 0
    while queue is not empty:
        node = queue.dequeue()
        for each neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.enqueue(neighbor)
                distance[neighbor] = distance[node] + 1
    return distance

Disconnected Graphs

If the graph has separate components, BFS from one starting node only reaches its own component. To visit everything:

pseudocode

function BFS_all(graph, numNodes):
    visited = empty set
    for v from 0 to numNodes - 1:
        if v not in visited:
            BFS(graph, v)    // starts a new BFS for each unvisited node

Each time the loop finds an unvisited node, that is the start of a new connected component.

Time: O(V + E) — every node is processed once, every edge is checked once.

Space: O(V) — the queue and visited set each hold at most V nodes.

Examples

Finding the Actual Shortest Path (not just the distance)

To reconstruct the path, track which node discovered each other node:

pseudocode

function BFS_path(graph, start, target):
    visited = empty set
    queue = empty queue
    parent = empty hashmap
    visited.add(start)
    queue.enqueue(start)
    parent[start] = null
    while queue is not empty:
        node = queue.dequeue()
        if node == target:
            // Build path by following parent pointers back
            path = []
            current = target
            while current is not null:
                path.prepend(current)
                current = parent[current]
            return path
        for each neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.enqueue(neighbor)
                parent[neighbor] = node
    return null   // target not reachable

For our sample graph, BFS_path(graph, 0, 5) returns: [0, 1, 2, 5] (distance 3) or [0, 3, 4, 5] (also distance 3).

Common Mistakes

Marking visited when you dequeue instead of when you enqueue. This lets the same node get added to the queue multiple times before it is processed. Always mark visited on enqueue.

Using a stack instead of a queue. A stack gives you DFS, not BFS. The data structure determines the order.

Forgetting the outer loop for disconnected graphs. BFS from one node only reaches one component.

Assuming BFS works for weighted graphs. BFS finds shortest paths only when all edges cost the same. For weighted graphs, use Dijkstra's algorithm instead.

Best Practices

Mark visited on enqueue, not dequeue. This is the most common BFS bug — get it right every time.
Use BFS when you need shortest distance in an unweighted graph. That is its main strength.
Track parent pointers when you need the actual path, not just the distance.
Use the outer loop pattern to handle disconnected graphs and count connected components.

Graph Representations DFS — Recursive