PHP Implementation of Tarjan's Cycle Detection Algorithm

Tarjan's strongly connected components algorithm](https://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm), published in paper Enumeration of the Elementary Circuits of a Directed Graph by Robert Tarjan in 1972, is a graph theory algorithm for detecting all cycles and loops (edges connecting vertices to themselves) in a directed graph.

It performs a single pass of depth-first search. It maintains a stack of vertices that have been explored by the search but not yet assigned to a component, and calculates "low numbers" of each vertex (an index number of the highest ancestor reachable in one step from a descendant of the vertex) which it uses to determine when a set of vertices should be popped off the stack into a new component.

It is trivial to write algorithms to detect cycles in a graph, but most approaches prove highly impractical due to the immense time and storage they require to complete the computation. Tarjan's algorithm is one of the precious few that is able to compute strongly connected components in linear time (time increases linearly with the number of vertices and edges).
The others include Kosaraju's algorithm and the path-based strong component algorithm (Purdo, Munro, Dijkstra, Cheriyan & Mehlhorn, Gabow). Although Kosaraju's algorithm is conceptually simple, Tarjan's and the path-based algorithm are favoured in practice since they require only one depth-first search rather than two.

This PHP implementation of Tarjan's algorithm has been composed and tuned after much reading, trial and error, and peeking at implementations in numerous other programming languages. Special thanks to Jan van der Linde (@janvdl) for helpful remarks and code snippets from his Python implementation.

Questions, issues and comments are most welcome in the GitHub repository: php-tarjan.

<?php

/*
 * Array $G is to contain the graph in node-adjacency format.
 */
$G[] = array(1);
$G[] = array(4, 6, 7);
$G[] = array(4, 6, 7);
$G[] = array(4, 6, 7);
$G[] = array(2, 3);
$G[] = array(2, 3);
$G[] = array(5, 8);
$G[] = array(5, 8);
$G[] = array();
$G[] = array();
$G[] = array(10); // This is a self-cycle (aka "loop").

/*
 * There are 11 results for the above example (strictly speaking: 10 cycles and 1 loop):
 * 2|4
 * 2|4|3|6|5
 * 2|4|3|7|5
 * 2|6|5
 * 2|6|5|3|4
 * 2|7|5
 * 2|7|5|3|4
 * 3|4
 * 3|6|5
 * 3|7|5
 * 1|0
 */
$cycles = php_tarjan_entry($G);
echo '<p>Cycles found: '.count($cycles);
echo '<pre>'; print_r($cycles); echo '</pre>';

/*
 * php_tarjan_entry() iterates through the graph array rows, executing php_tarjan().
 */
function php_tarjan_entry($G_local){

  // All the following must be global to pass them to recursive function php_tarjan().
  global $cycles;
  global $marked;
  global $marked_stack;
  global $point_stack;

  // Initialize global values that are so far undefined.
  $cycles = array();
  $marked = array();
  $marked_stack = array();
  $point_stack = array();

  for ($x = 0; $x < count($G_local); $x++) {
    $marked[$x] = FALSE;
  }

  for ($i = 0; $i < count($G_local); $i++) {
    php_tarjan($i, $i);
    while (!empty($marked_stack)) {
      $marked[array_pop($marked_stack)] = FALSE;
    }
    //echo '<br>'.($i+1).' / '.count($G_local); // Enable if you wish to follow progression through the array rows.
  }

  $cycles = array_keys($cycles);

  return $cycles;
}

/*
 * Recursive function to detect strongly connected components (cycles, loops).
 */
function php_tarjan($s, $v){

  // Source node-adjacency array.
  global $G;

  // All the following must be global to pass them to recursive function php_tarjan().
  global $cycles;
  global $marked;
  global $marked_stack;
  global $point_stack;

  $f = FALSE;
  $point_stack[] = $v;
  $marked[$v] = TRUE;
  $marked_stack[] = $v;

  //$maxlooplength = 3; // Enable to Limit the length of loops to keep in the results (see below).

  foreach($G[$v] as $w) {
    if ($w < $s) {
      $G[$w] = array();
    } else if ($w == $s) {
      //if (count($point_stack) == $maxlooplength){ // Enable to collect cycles of a given length only.
        // Add new cycles as array keys to avoid duplication. Way faster than using array_search.
        $cycles[implode('|', $point_stack)] = TRUE;
      //}
      $f = TRUE;
    } else if ($marked[$w] === FALSE) {
      //if (count($point_stack) < $maxlooplength){ // Enable to only collect cycles up to $maxlooplength.
        $g = php_tarjan($s, $w);
      //}
      if (!empty($f) OR !empty($g)){
        $f = TRUE;
      }
    }
  }

  if ($f === TRUE) {
    while (end($marked_stack) != $v){
      $marked[array_pop($marked_stack)] = FALSE;
    }
    array_pop($marked_stack);
    $marked[$v] = FALSE;
  }

  array_pop($point_stack);
  return $f;
}
?>

Tomáš Fülöpp
Sint-Agatha-Berchem, Belgium
July 9, 2015, July 17, 2015, July 30, 2015, July 31, 2015
Tomáš Fülöpp (2012)

GitHub projectGitHub

Downloadable filesFiles

Children of this entryChildren

Social media linksSocials

Related linksRelated links

Tag-related linksCollection links

Tagstarjanloopgraphcyclecircuitdirectedgraph theorynetworkalgorithmphprobert tarjanproject
LanguageENGLISH Content typeARTICLELast updateOCTOBER 20, 2018 AT 01:46:40 UTC