Hama Graph Tutorial

This document describes the Graph computing framework and serves as a tutorial.

Overview

Hama includes the Graph package for vertex-centric graph computations. Hama's Graph package allows you to program Google's Pregel style applications with simple programming interface.

Vertex API

Writing a Hama graph application involves subclassing the predefined Vertex class. Its template arguments define three value types, associated with vertices, edges, and messages.

  public abstract class Vertex<V extends Writable, E extends Writable, M extends Writable>
      implements VertexInterface<V, E, M> {

    public void compute(Iterator<M> messages) throws IOException;
    ..

  }

The user overrides the Compute() method, which will be executed at each active vertex in every superstep. Predefined Vertex methods allow Compute() to query information about the current vertex and its edges, and to send messages to other vertices. Compute() can inspect the value associated with its vertex via GetValue().

VertexReader API

You can create your own VertexReader for your data format by exending org.apache.hama.graph.VertexInputReader class. For example, an sequence file contains a linked list of Vertex, can be parse as following:

  public static class PagerankSeqReader
      extends
      VertexInputReader<Text, TextArrayWritable, Text, NullWritable, DoubleWritable> {
    @Override
    public boolean parseVertex(Text key, TextArrayWritable value,
        Vertex<Text, NullWritable, DoubleWritable> vertex) throws Exception {
      vertex.setVertexID(key);

      for (Writable v : value.get()) {
        vertex.addEdge(new Edge<Text, NullWritable>((Text) v, null));
      }

      return true;
    }
  }

Example: PageRankVertex

To solve the Page Rank problem using Hama Graph, you can extends the Vertex class to create a PageRankVertex class. In this example, the algorithm described Google's Pregel paper was used. The value of a vertex represents the tentative page rank of the vertex. The graph is intialized with each vertex value equal to 1/numOfVertices. In each of the first 30 supersteps, each vertex sends its tentative page rank along all of its outgoing edges.

From Superstep 1 to 30, each vertex sums up the values arriving on all its messages and sets its tentative page rank to (1 - 0.85) / numOfVertices + (0.85 * sum).

  public static class PageRankVertex extends
      Vertex<Text, NullWritable, DoubleWritable> {

    @Override
    public void compute(Iterator<DoubleWritable> messages) throws IOException {
      if (this.getSuperstepCount() == 0) {
        this.setValue(new DoubleWritable(1.0 / (double) this.getNumVertices()));
      }

      if (this.getSuperstepCount() >= 1) {
        double sum = 0;
        while (messages.hasNext()) {
          DoubleWritable msg = messages.next();
          sum += msg.get();
        }

        double ALPHA = (1 - 0.85) / (double) this.getNumVertices();
        this.setValue(new DoubleWritable(ALPHA + (0.85 * sum)));
      }

      if (this.getSuperstepCount() < this.getMaxIteration()) {
        int numEdges = this.getOutEdges().size();
        sendMessageToNeighbors(new DoubleWritable(this.getValue().get()
            / numEdges));
      }
    }
  }