algorithm - there - path between two nodes in a undirected graph




Graph Algorithm To Find All Connections Between Two Arbitrary Vertices (11)

find_paths[s, t, d, k]

This question is old and answered already. However, none show perhaps a more flexible algorithm for accomplishing the same thing. So I'll throw my hat into the ring.

I personally find an algorithm of the form find_paths[s, t, d, k] useful, where:

  • s is the starting node
  • t is the target node
  • d is the maximum depth to search
  • k is the number of paths to find

Using your programming language's form of infinity for d and k will give you all paths§.

§ obviously if you are using a directed graph and you want all undirected paths between s and t you will have to run this both ways:

find_paths[s, t, d, k] <join> find_paths[t, s, d, k]

Helper Function

I personally like recursion, although it can difficult some times, anyway first lets define our helper function:

def find_paths_recursion(graph, current, goal, current_depth, max_depth, num_paths, current_path, paths_found)
  current_path.append(current)

  if current_depth > max_depth:
    return

  if current == goal:
    if len(paths_found) <= number_of_paths_to_find:
      paths_found.append(copy(current_path))

    current_path.pop()
    return

  else:
    for successor in graph[current]:
    self.find_paths_recursion(graph, successor, goal, current_depth + 1, max_depth, num_paths, current_path, paths_found)

  current_path.pop()

Main Function

With that out of the way, the core function is trivial:

def find_paths[s, t, d, k]:
  paths_found = [] # PASSING THIS BY REFERENCE  
  find_paths_recursion(s, t, 0, d, k, [], paths_found)

First, lets notice a few thing:

  • the above pseudo-code is a mash-up of languages - but most strongly resembling python (since I was just coding in it). A strict copy-paste will not work.
  • [] is an uninitialized list, replace this with the equivalent for your programming language of choice
  • paths_found is passed by reference. It is clear that the recursion function doesn't return anything. Handle this appropriately.
  • here graph is assuming some form of hashed structure. There are a plethora of ways to implement a graph. Either way, graph[vertex] gets you a list of adjacent vertices in a directed graph - adjust accordingly.
  • this assumes you have pre-processed to remove "buckles" (self-loops), cycles and multi-edges

I am trying to determine the best time efficient algorithm to accomplish the task described below.

I have a set of records. For this set of records I have connection data which indicates how pairs of records from this set connect to one another. This basically represents an undirected graph, with the records being the vertices and the connection data the edges.

All of the records in the set have connection information (i.e. no orphan records are present; each record in the set connects to one or more other records in the set).

I want to choose any two records from the set and be able to show all simple paths between the chosen records. By "simple paths" I mean the paths which do not have repeated records in the path (i.e. finite paths only).

Note: The two chosen records will always be different (i.e. start and end vertex will never be the same; no cycles).

For example:

    If I have the following records:
        A, B, C, D, E

    and the following represents the connections: 
        (A,B),(A,C),(B,A),(B,D),(B,E),(B,F),(C,A),(C,E),
        (C,F),(D,B),(E,C),(E,F),(F,B),(F,C),(F,E)

        [where (A,B) means record A connects to record B]

If I chose B as my starting record and E as my ending record, I would want to find all simple paths through the record connections that would connect record B to record E.

   All paths connecting B to E:
      B->E
      B->F->E
      B->F->C->E
      B->A->C->E
      B->A->C->F->E

This is an example, in practice I may have sets containing hundreds of thousands of records.


Adding to Casey Watson's answer, here is another Java implementation,. Initializing the visited node with the start node.

private void getPaths(Graph graph, LinkedList<String> visitedNodes) {
                LinkedList<String> adjacent = graph.getAdjacent(visitedNodes.getLast());
                for(String node : adjacent){
                    if(visitedNodes.contains(node)){
                        continue;
                    }
                    if(node.equals(END)){
                        visitedNodes.add(node);
                        printPath(visitedNodes);
                        visitedNodes.removeLast();
                    }
                    visitedNodes.add(node);
                    getPaths(graph, visitedNodes);
                    visitedNodes.removeLast();  
                }
            }

As far as I can tell the solutions given by Ryan Fox (58343, Christian (58444), and yourself (58461) are about as good as it get. I do not believe that breadth-first traversal helps in this case, as you will not get all paths. For example, with edges (A,B), (A,C), (B,C), (B,D) and (C,D) you will get paths ABD and ACD, but not ABCD.


Here is a logically better-looking recursive version as compared to the second floor.

public class Search {

private static final String START = "B";
private static final String END = "E";

public static void main(String[] args) {
    // this graph is directional
    Graph graph = new Graph();
    graph.addEdge("A", "B");
    graph.addEdge("A", "C");
    graph.addEdge("B", "A");
    graph.addEdge("B", "D");
    graph.addEdge("B", "E"); // this is the only one-way connection
    graph.addEdge("B", "F");
    graph.addEdge("C", "A");
    graph.addEdge("C", "E");
    graph.addEdge("C", "F");
    graph.addEdge("D", "B");
    graph.addEdge("E", "C");
    graph.addEdge("E", "F");
    graph.addEdge("F", "B");
    graph.addEdge("F", "C");
    graph.addEdge("F", "E");
    List<ArrayList<String>> paths = new ArrayList<ArrayList<String>>();
    String currentNode = START;
    List<String> visited = new ArrayList<String>();
    visited.add(START);
    new Search().findAllPaths(graph, seen, paths, currentNode);
    for(ArrayList<String> path : paths){
        for (String node : path) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }   
}

private void findAllPaths(Graph graph, List<String> visited, List<ArrayList<String>> paths, String currentNode) {        
    if (currentNode.equals(END)) { 
        paths.add(new ArrayList(Arrays.asList(visited.toArray())));
        return;
    }
    else {
        LinkedList<String> nodes = graph.adjacentNodes(currentNode);    
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            } 
            List<String> temp = new ArrayList<String>();
            temp.addAll(visited);
            temp.add(node);          
            findAllPaths(graph, temp, paths, node);
        }
    }
}
}

Program Output

B A C E 

B A C F E 

B E

B F C E

B F E 

Here's a thought off the top of my head:

  1. Find one connection. (Depth-first search is probably a good algorithm for this, since the path length doesn't matter.)
  2. Disable the last segment.
  3. Try to find another connection from the last node before the previously disabled connection.
  4. Goto 2 until there are no more connections.

I found a way to enumerate all the paths including the infinite ones containing loops.

http://blog.vjeux.com/2009/project/project-shortest-path.html

Finding Atomic Paths & Cycles

Definition

What we want to do is find all the possible paths going from point A to point B. Since there are cycles involved, you can't just go through and enumerate them all. Instead, you will have to find atomic path that doesn't loop and the smallest possible cycles (you don't want your cycle to repeat itself).

The first definition I took of an atomic path is a path that does not go through the same node twice. However, I found out that is was not taking all possibilities. After some reflexion, I figured out that nodes aren't important, however edges are! So an atomic path is a path that does not go through the same edge twice.

This definition is handy, it also works for cycles: an atomic cycle of point A is an atomic path that goes from point A and ends to point A.

Implementation

Atomic Paths A -> B

In order to get all the path starting from point A, we are going to traverse the graph recursively from the point A. While going through a child, we are going to make a link child -> parent in order to know all the edges we have already crossed. Before we go to that child, we must traverse that linked list and make sure the specified edge has not been already walked through.

When we arrive to the destination point, we can store the path we found.

Freeing the list

A problem occurs when you want to free the linked list. It is basically a tree chained in the reverse order. A solution would be to double-link that list and when all the atomic paths been found, free the tree from the starting point.

But a clever solution is to use a reference counting (inspired from Garbage Collection). Each time you add a link to a parent you adds one to its reference count. Then, when you arrive at the end of a path, you go backward and free while the reference count equals to 1. If it is higher, you just remove one and stop.

Atomic Cycle A

Looking for the atomic cycle of A is the same as looking for the atomic path from A to A. However there are several optimizations we can do. First, when we arrive at the destination point, we want to save the path only if the sum of the edges cost is negative: we only want to go through absorbing cycles.

As you have seen previously, the whole graph is being traversed when looking for an atomic path. Instead, we can limit the search area to the strongly connected component containing A. Finding these components requires a simple traverse of the graph with Tarjan's algorithm.

Combining Atomic Paths and Cycles

At this point, we have all the atomic paths that goes from A to B and all the atomic cycles of each node, left to us to organize everything to get the shortest path. From now on we are going to study how to find the best combination of atomic cycles in an atomic path.


I think you should describe your real problem behind this. I say this because you ask for something time efficient, yet the answer set to the problem seems to grow exponentially!

Therefore I wouldn't expect a better algorithm than something exponential.

I'd do backtracking and going through the whole graph. In order to avoid cycles, save all visited nodes along the way. When you go back, unmark the node.

Using recursion:

static bool[] visited;//all false
Stack<int> currentway; initialize empty

function findnodes(int nextnode)
{
if (nextnode==destnode)
{
  print currentway 
  return;
}
visited[nextnode]=true;
Push nextnode to the end of currentway.
for each node n accesible from nextnode:
  findnodes(n);
visited[nextnode]=false; 
pop from currenteay
}

Or is that wrong?

edit: Oh, and I forgot: You should eliminate the recursive calls by utilizing that node stack


It appears that this can be accomplished with a depth-first search of the graph. The depth-first search will find all non-cyclical paths between two nodes. This algorithm should be very fast and scale to large graphs (The graph data structure is sparse so it only uses as much memory as it needs to).

I noticed that the graph you specified above has only one edge that is directional (B,E). Was this a typo or is it really a directed graph? This solution works regardless. Sorry I was unable to do it in C, I'm a bit weak in that area. I expect that you will be able to translate this Java code without too much trouble though.

Graph.java:

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Set;

public class Graph {
    private Map<String, LinkedHashSet<String>> map = new HashMap();

    public void addEdge(String node1, String node2) {
        LinkedHashSet<String> adjacent = map.get(node1);
        if(adjacent==null) {
            adjacent = new LinkedHashSet();
            map.put(node1, adjacent);
        }
        adjacent.add(node2);
    }

    public void addTwoWayVertex(String node1, String node2) {
        addEdge(node1, node2);
        addEdge(node2, node1);
    }

    public boolean isConnected(String node1, String node2) {
        Set adjacent = map.get(node1);
        if(adjacent==null) {
            return false;
        }
        return adjacent.contains(node2);
    }

    public LinkedList<String> adjacentNodes(String last) {
        LinkedHashSet<String> adjacent = map.get(last);
        if(adjacent==null) {
            return new LinkedList();
        }
        return new LinkedList<String>(adjacent);
    }
}

Search.java:

import java.util.LinkedList;

public class Search {

    private static final String START = "B";
    private static final String END = "E";

    public static void main(String[] args) {
        // this graph is directional
        Graph graph = new Graph();
        graph.addEdge("A", "B");
        graph.addEdge("A", "C");
        graph.addEdge("B", "A");
        graph.addEdge("B", "D");
        graph.addEdge("B", "E"); // this is the only one-way connection
        graph.addEdge("B", "F");
        graph.addEdge("C", "A");
        graph.addEdge("C", "E");
        graph.addEdge("C", "F");
        graph.addEdge("D", "B");
        graph.addEdge("E", "C");
        graph.addEdge("E", "F");
        graph.addEdge("F", "B");
        graph.addEdge("F", "C");
        graph.addEdge("F", "E");
        LinkedList<String> visited = new LinkedList();
        visited.add(START);
        new Search().depthFirst(graph, visited);
    }

    private void depthFirst(Graph graph, LinkedList<String> visited) {
        LinkedList<String> nodes = graph.adjacentNodes(visited.getLast());
        // examine adjacent nodes
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            }
            if (node.equals(END)) {
                visited.add(node);
                printPath(visited);
                visited.removeLast();
                break;
            }
        }
        for (String node : nodes) {
            if (visited.contains(node) || node.equals(END)) {
                continue;
            }
            visited.addLast(node);
            depthFirst(graph, visited);
            visited.removeLast();
        }
    }

    private void printPath(LinkedList<String> visited) {
        for (String node : visited) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }
}

Program Output:

B E 
B A C E 
B A C F E 
B F E 
B F C E 

Solution in C code. It is based on DFS which uses minimum memory.

#include <stdio.h>
#include <stdbool.h>

#define maxN    20  

struct  nodeLink
{

    char node1;
    char node2;

};

struct  stack
{   
    int sp;
    char    node[maxN];
};   

void    initStk(stk)
struct  stack   *stk;
{
    int i;
    for (i = 0; i < maxN; i++)
        stk->node[i] = ' ';
    stk->sp = -1;   
}

void    pushIn(stk, node)
struct  stack   *stk;
char    node;
{

    stk->sp++;
    stk->node[stk->sp] = node;

}    

void    popOutAll(stk)
struct  stack   *stk;
{

    char    node;
    int i, stkN = stk->sp;

    for (i = 0; i <= stkN; i++)
    {
        node = stk->node[i];
        if (i == 0)
            printf("src node : %c", node);
        else if (i == stkN)
            printf(" => %c : dst node.\n", node);
        else
            printf(" => %c ", node);
    }

}


/* Test whether the node already exists in the stack    */
bool    InStack(stk, InterN)
struct  stack   *stk;
char    InterN;
{

    int i, stkN = stk->sp;  /* 0-based  */
    bool    rtn = false;    

    for (i = 0; i <= stkN; i++)
    {
        if (stk->node[i] == InterN)
        {
            rtn = true;
            break;
        }
    }

    return     rtn;

}

char    otherNode(targetNode, lnkNode)
char    targetNode;
struct  nodeLink    *lnkNode;
{

    return  (lnkNode->node1 == targetNode) ? lnkNode->node2 : lnkNode->node1;

}

int entries = 8;
struct  nodeLink    topo[maxN]    =       
    {
        {'b', 'a'}, 
        {'b', 'e'}, 
        {'b', 'd'}, 
        {'f', 'b'}, 
        {'a', 'c'},
        {'c', 'f'}, 
        {'c', 'e'},
        {'f', 'e'},               
    };

char    srcNode = 'b', dstN = 'e';      

int reachTime;  

void    InterNode(interN, stk)
char    interN;
struct  stack   *stk;
{

    char    otherInterN;
    int i, numInterN = 0;
    static  int entryTime   =   0;

    entryTime++;

    for (i = 0; i < entries; i++)
    {

        if (topo[i].node1 != interN  && topo[i].node2 != interN) 
        {
            continue;   
        }

        otherInterN = otherNode(interN, &topo[i]);

        numInterN++;

        if (otherInterN == stk->node[stk->sp - 1])
        {
            continue;   
        }

        /*  Loop avoidance: abandon the route   */
        if (InStack(stk, otherInterN) == true)
        {
            continue;   
        }

        pushIn(stk, otherInterN);

        if (otherInterN == dstN)
        {
            popOutAll(stk);
            reachTime++;
            stk->sp --;   /*    back trace one node  */
            continue;
        }
        else
            InterNode(otherInterN, stk);

    }

        stk->sp --;

}


int    main()

{

    struct  stack   stk;

    initStk(&stk);
    pushIn(&stk, srcNode);  

    reachTime = 0;
    InterNode(srcNode, &stk);

    printf("\nNumber of all possible and unique routes = %d\n", reachTime);

}

The National Institute of Standards and Technology (NIST) online Dictionary of Algorithms and Data Structures lists this problem as "all simple paths" and recommends a depth-first search. CLRS supplies the relevant algorithms.

A clever technique using Petri Nets is found here


This may be late, but here's the same C# version of DFS algorithm in Java from Casey to traverse for all paths between two nodes using a stack. Readability is better with recursive as always.

    void DepthFirstIterative(T start, T endNode)
    {
        var visited = new LinkedList<T>();
        var stack = new Stack<T>();

        stack.Push(start);

        while (stack.Count != 0)
        {
            var current = stack.Pop();

            if (visited.Contains(current))
                continue;

            visited.AddLast(current);

            var neighbours = AdjacentNodes(current);

            foreach (var neighbour in neighbours)
            {
                if (visited.Contains(neighbour))
                    continue;

                if (neighbour.Equals(endNode))
                {
                    visited.AddLast(neighbour);
                    printPath(visited));
                    visited.RemoveLast();
                    break;
                }
            }

            bool isPushed = false;
            foreach (var neighbour in neighbours.Reverse())
            {
                if (neighbour.Equals(endNode) || visited.Contains(neighbour) || stack.Contains(neighbour))
                {
                    continue;
                }

                isPushed = true;
                stack.Push(neighbour);
            }

            if (!isPushed)
                visited.RemoveLast();
        }
    }
This is a sample graph to test:

    // Sample graph. Numbers are edge ids
    //       1     3       
    //    A --- B --- C ----
    //    |     |  2       |
    //    | 4   ----- D    |
    //    ------------------




graph-theory