Sunday, May 3, 2009

Dirichlet Problem...Solved!...Sort of...

Okay, so an alternate route to finding the probabilities has been discovered. Thanks to Richard, the guy doing his Ph.D.

So, while representing the image as an electric circuit, we can apply Kirchhoff's Current Law:

$$\sum I = 0$$

at each node, of course, using the edge weights as the conductances of the "resistors".

This results in a system of linear equations with n equations and n unknowns... which is easily solvable using an elimination method such as Gaussian elimination or Gauss-Jordan elimination.

Keep in mind that the more known node values we have (i.e., user-defined labels), the fewer unknowns we have.
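To make this concrete, here is a minimal sketch in Python/NumPy of what that looks like: write KCL at each unlabeled node, using the edge weights as conductances, and hand the resulting system to an elimination solver. The graph, conductances, and seed values below are made up purely for illustration; this is not the project's actual code.

```python
import numpy as np

# Hypothetical 4-node graph: node 0 is seeded at 1.0, node 3 at 0.0,
# and nodes 1 and 2 are the unknowns. w holds the edge conductances.
w = {(0, 1): 2.0, (1, 2): 1.0, (2, 3): 3.0, (0, 2): 0.5}
conduct = lambda i, j: w.get((i, j), w.get((j, i), 0.0))

seeds = {0: 1.0, 3: 0.0}   # user-defined labels (known node values)
unknown = [1, 2]           # everything else

# KCL at each unknown node i:  sum_j w_ij * (x_i - x_j) = 0,
# i.e.  (sum_j w_ij) * x_i - sum_{j unknown} w_ij * x_j = sum_{j seeded} w_ij * x_j
A = np.zeros((len(unknown), len(unknown)))
b = np.zeros(len(unknown))
for r, i in enumerate(unknown):
    for j in range(4):
        if j == i:
            continue
        wij = conduct(i, j)
        A[r, r] += wij
        if j in seeds:
            b[r] += wij * seeds[j]
        elif wij:
            A[r, unknown.index(j)] -= wij

x = np.linalg.solve(A, b)      # Gaussian elimination under the hood
print(dict(zip(unknown, x)))   # values (probabilities) at the unknown nodes
```

The more seeded nodes there are, the smaller this system gets, which is exactly the point made above.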


Thursday, March 12, 2009

Object Classification (Actual Image Segmentation)

Deciding which class (or label) each object belongs to is simply a matter of comparing the results of the Dirichlet equation for each and every pixel in the image.

For each pixel in the image, the Dirichlet equation is solved n times, once for each of the n user-provided labels. The results for each label are stored in n arrays, and then a comparison is made. Each pixel is cross-checked across the arrays (i.e., pixel 2x3 will be compared with the value at location 2x3 in every other array). A new array is then created which contains the highest value of each pixel from all of the arrays (i.e., the highest of all the 2x3 values, the highest of all the 4x8 values, etc.). This array will be the segmented image, or rather the mask of the segmented image.
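A small NumPy sketch of that comparison step (the array names and shapes here are my own illustration, assuming the n per-label results have already been computed):

```python
import numpy as np

# prob_maps: hypothetical stack of n per-label results, shape (n, rows, cols),
# where prob_maps[k, r, c] is the Dirichlet result for label k at pixel (r, c).
prob_maps = np.random.rand(3, 4, 5)    # stand-in data: 3 labels, 4x5 image

# For every pixel, keep the label whose array holds the highest value there.
labels = np.argmax(prob_maps, axis=0)  # (rows, cols) array of winning label indices
best = np.max(prob_maps, axis=0)       # the winning value at each pixel

# 'labels' is the segmented image; a binary mask for, say, label 0 is:
mask0 = (labels == 0)
```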

Thursday, March 5, 2009

Segmentation From 2D to 3D

For a 2D image, the image is represented initially as an MxN matrix (which is then converted to an (MxN)x(MxN) Laplacian matrix). Each node is connected to the node above it, below it, to the right of it, and to the left of it. These connections are provided using weighted edges. The ultimate representation looks like a grid.

Taking this to the 3D realm, we simply represent the initial image using an MxNxL matrix (which is then converted to an (MxNxL)x(MxNxL) Laplacian matrix). Each slice of the image would be an MxN plane, and the entire 3D image would have L slices, again, each consisting of MxN nodes. Each node would then be connected to the node above it, below it, to the right of it, to the left of it, and now, to the front of it and behind it. These connections, again, would be provided using weighted edges, and they would be weighted using the same formula as in the 2D case.

Using this concept, converting a Random Walker algorithm to accommodate 3D images is pretty easy, since all that has to be done is adding more MxN slices to the original MxNx1 matrix until we have an MxNxL matrix, and adding the extra edges to each node.

Again, all the same formulas for the 2D image segmentation would still be applicable.
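As a rough illustration of that extension (my own sketch, not the project's code), here is how the neighbour connections could be enumerated for an MxNxL lattice with 6-connectivity, where the 2D case is just L = 1:

```python
import numpy as np

def lattice_edges_3d(M, N, L):
    """Return (i, j) node-index pairs for a 6-connected MxNxL lattice.
    Nodes are numbered by their flattened position in the volume."""
    idx = np.arange(M * N * L).reshape(M, N, L)
    edges = []
    # neighbour below (next row), to the right (next column), behind (next slice);
    # the opposite directions are the same edges seen from the other endpoint
    edges += list(zip(idx[:-1, :, :].ravel(), idx[1:, :, :].ravel()))
    edges += list(zip(idx[:, :-1, :].ravel(), idx[:, 1:, :].ravel()))
    edges += list(zip(idx[:, :, :-1].ravel(), idx[:, :, 1:].ravel()))
    return edges

print(len(lattice_edges_3d(3, 3, 1)))   # 12 edges for a 3x3 grid (2D case)
print(len(lattice_edges_3d(3, 3, 3)))   # 54 edges for a 3x3x3 volume
```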

Wednesday, February 11, 2009

Components of Random Walker (in a lower level than before)

  1. Input image (and convert to "double precision image"???)
    1. Input: Image
    2. Output: Formatted image + additional info
    3. Description:
      1. Extract dimensions (and other information such as number of colour channels)
      2. Check for correct format
      3. Convert to correct format

 

  2. Build an empty graph (lattice; all edges weight=1) using dimensions of image (in pixels)
    1. Inputs: X, Y (dimensions of lattice)
    2. Outputs: Vertices(Nx2 matrix), Edges(Mx2 matrix)
    3. Description:
      1. Create (X*Y) number of nodes
      2. Create the respective number of edges with weight=1, connecting them accordingly. The connections will be such that each corner node will have 2 connections, each non-corner boundary node will have 3 connections, and each inside node will have 4 connections.
      3. Output the Nx2 ‘vertices’ array and the Mx2 ‘edges’ array

 

  3. Generate edge weights
    1. Inputs: Mx2 edges matrix, NxK list of nodal values (this will be Nx1 for now (1 channel), or Nx3 (3 channels)), scaling factor (this scaling factor will affect the precision of the segmentation)
    2. Outputs: Mx1 matrix, indexed by the edges, containing the weight corresponding to that edge
    3. Description: The edges matrix is solely used for indexing. The equation implemented for weights is:

$$w_{ij} = e^{-\beta (n_i - n_j)^2}$$

where i and j are adjacent nodes, ni and nj are the values at nodes i and j provided in the NxK nodal value array, and β is the scaling factor. (A sketch of this step appears after this list.)

 

  4. Generate Laplacian matrix
    1. Input: edges matrix, nodes matrix
    2. Output: Laplacian matrix
    3. Description: Applies the algorithm to generate the Laplacian matrix by first finding the adjacency matrix (see the sketch after this list).

 

  5. Determine locations of labels
    1. Using the label information from the input

 

  6. Set up Dirichlet boundaries
    1. Using locations of labels.

 

  7. Solve Random Walker algorithm by solving the combinatorial Dirichlet problem
    1. Inputs: NxN Laplacian matrix, Lx1 vector specifying the boundary nodes (from step 6),
    2. Output: Nx1 vector specifying the new nodal values (which seed they belong to)
    3. Description: Finds the values of the non-boundary nodes (see the sketch after this list)

 

  8. Generate mask
    1. Extracted from the vector generated in step 7.
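To tie steps 3, 4, and 7 together, here is a rough end-to-end sketch in Python (NumPy/SciPy). It is my own illustration of the steps described above, not the project's implementation; the function names, the beta value, and the tiny test graph are all made up.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import spsolve

def edge_weights(edges, values, beta=90.0):
    """Step 3: Gaussian weighting of each edge from the two nodal values it joins."""
    diffs = values[edges[:, 0]] - values[edges[:, 1]]
    return np.exp(-beta * diffs ** 2)

def laplacian(edges, weights, n_nodes):
    """Step 4: weighted graph Laplacian L = D - W assembled from the edge list."""
    i, j = edges[:, 0], edges[:, 1]
    W = coo_matrix((weights, (i, j)), shape=(n_nodes, n_nodes))
    W = (W + W.T).tocsr()                       # undirected graph: symmetric weights
    d = np.asarray(W.sum(axis=1)).ravel()       # weighted degree of each node
    D = coo_matrix((d, (np.arange(n_nodes), np.arange(n_nodes))),
                   shape=(n_nodes, n_nodes)).tocsr()
    return D - W

def solve_dirichlet(L, seeds, seed_vals, n_nodes):
    """Step 7: solve for the unlabeled nodes, with the seeds as boundary conditions."""
    unseeded = np.setdiff1d(np.arange(n_nodes), seeds)
    Lu = L[unseeded][:, unseeded]               # Laplacian block over the unlabeled nodes
    B = L[unseeded][:, seeds]                   # coupling to the labeled (boundary) nodes
    x = np.zeros(n_nodes)
    x[seeds] = seed_vals
    x[unseeded] = spsolve(Lu.tocsc(), -B @ seed_vals)
    return x

# Tiny made-up example: a 4-node path graph, one bright end and one dark end.
edges = np.array([[0, 1], [1, 2], [2, 3]])
values = np.array([0.10, 0.15, 0.80, 0.90])     # "pixel" intensities
L = laplacian(edges, edge_weights(edges, values), 4)
probs = solve_dirichlet(L, seeds=np.array([0, 3]),
                        seed_vals=np.array([1.0, 0.0]), n_nodes=4)
print(probs)   # likelihood that each node belongs to the seed at node 0
```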

Wednesday, February 4, 2009

Very High Level Algorithm for Implementing Random Walker

  1. Input image (and convert to "double precision image"???)
  2. Build an empty graph (lattice; all edges weight=1) using dimensions of image (in pixels)
  3. Generate edge weights
  4. Generate Laplacian matrix
  5. Determine locations of labels
  6. Set up Dirichlet boundaries
  7. Solve Random Walker algorithm by solving the combinatorial Dirichlet problem
  8. Generate mask

Saturday, January 31, 2009

The Incidence Matrix: A

The incidence matrix will be denoted as 'A' throughout the rest of this project. The importance of this matrix is that it is the combinatorial gradient operator and its transpose is the combinatorial divergence operator. These operators will be defined in detail in a future post.
Now, the incidence matrix is defined as:

$$A_{e_{ij} v_k} = \begin{cases} +1 & \text{if } i = k \\ -1 & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}$$

where eij is the edge adjacent to nodes i and j, and vk is node k.

So, by defining the columns as eij and the rows as vk we end up with a matrix which includes a diagonal of 1s since for every k=i, the value is +1. From there, for every node j adjacent to k=i, the value in the matrix is -1. The orientation for the edges here is assigned by taking i=k for node k (always) and then the adjacent node would just be j.

To demonstrate, I have made the incidence matrix for the graph we have been using throughout the blog.
Note that the incidence matrix has one entry per edge-node pair (it is |E| x |V|), so unlike the adjacency matrix it is not symmetric in general; it is not even square unless the graph happens to have as many edges as nodes.

What I don't understand though is what new information this provides over the adjacency matrix. I realize that it is represented differently by having a diagonal row of 1s, but don't both matrices just show the adjacent nodes of every node?
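One partial answer (my own note, using a made-up toy graph rather than the blog's example): beyond listing adjacencies, the incidence matrix also records an orientation for every edge, and multiplying it by its own transpose reproduces the Laplacian, which ties it back to the gradient/divergence interpretation above. A quick check in Python:

```python
import numpy as np

# Toy graph: 3 nodes, edges (0,1), (1,2), (0,2), each edge e_ij oriented from i to j.
edges = [(0, 1), (1, 2), (0, 2)]
n_nodes = 3

# Incidence matrix: one row per edge, one column per node; +1 at i, -1 at j.
A = np.zeros((len(edges), n_nodes))
for row, (i, j) in enumerate(edges):
    A[row, i] = +1
    A[row, j] = -1

degree = np.diag([2, 2, 2])              # every node touches two edges here
adjacency = np.array([[0, 1, 1],
                      [1, 0, 1],
                      [1, 1, 0]])

# A^T A equals the Laplacian (degree - adjacency), something the adjacency
# matrix alone does not give without also computing the degrees.
print(np.array_equal(A.T @ A, degree - adjacency))   # True
```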

Thursday, January 29, 2009

Manipulation to Laplacian Matrix

The previous Laplacian matrix assumed all the edges had the same weight, hence the matrix placed a 1 (or -1) at every edge. To accommodate the edge weights, we change the definition of the matrix as follows:

$$L_{ij} = \begin{cases} d_i & \text{if } i = j \\ -w_{ij} & \text{if } v_i \text{ and } v_j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}$$

where di is the (weighted) degree of vertex i, i.e. the sum of the weights of the edges incident on it, and wij is the weight of the edge adjacent to i and j.
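As a quick illustration with a made-up graph (not the blog's running example): for a 3-node path with edge weights w12 = 2 and w23 = 5, the new definition gives

$$L = \begin{pmatrix} 2 & -2 & 0 \\ -2 & 7 & -5 \\ 0 & -5 & 5 \end{pmatrix}$$

Each row sums to zero, and each diagonal entry is just the total weight of the edges touching that node.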

Considering the graph we've been using throughout this blog, the Laplacian matrix using the new definition is:



Thursday, January 22, 2009

The Laplacian Matrix

The Laplacian matrix is significant because the Dirichlet problem uses it. More information will be posted when more is understood about why it is needed. This is just a step towards the complete post explaining the Dirichlet problem.

So, the Laplacian matrix is a representation of a graph. It can represent both directed and undirected graphs, but directed graphs are outside the scope of this project, so they will not be discussed. Mathematically, the Laplacian matrix is the difference of two matrices: the degree matrix and the adjacency matrix.

The degree matrix is a diagonal nxn matrix that holds information about the degree of each node (vertex) in the graph. The prerequisite for the implementation of this matrix is that the graph must be labelled. Taking the node with label i, we write its degree in location aii of the degree matrix.

[The degree of a vertex is the number of edges adjacent to it.]

The following image (taken from Wikipedia) demonstrates a degree matrix.


Now, the adjacency matrix is a representation of all the adjacent vertices of a vertex. It is also an nxn matrix. Element aij is 1 if vertices i and j are adjacent, otherwise it is 0. Therefore, an adjacency matrix must always be symmetrical. The following image (also from Wikipedia) shows a graph and its respective adjacency matrix.


So, in order to complete the Laplacian matrix, we subtract the adjacency matrix from the degree matrix. The following image (again, Wikipedia) shows the Laplacian matrix for the same graph used in the above 2 images.


This matrix can also be obtained directly using the following definition:

$$L_{ij} = \begin{cases} \deg(v_i) & \text{if } i = j \\ -1 & \text{if } i \neq j \text{ and } v_i \text{ is adjacent to } v_j \\ 0 & \text{otherwise} \end{cases}$$
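A small NumPy sketch of the same construction, using a made-up 4-node cycle graph rather than the Wikipedia example:

```python
import numpy as np

# Made-up undirected graph on 4 nodes with edges (0,1), (1,2), (2,3), (3,0).
adjacency = np.array([[0, 1, 0, 1],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [1, 0, 1, 0]])

degree = np.diag(adjacency.sum(axis=1))   # each node's degree on the diagonal

laplacian = degree - adjacency            # Laplacian = degree matrix - adjacency matrix
print(laplacian)
```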


The Dirichlet Problem

The Dirichlet problem is still very confusing, so I will slowly be building upon it until I eventually reach a solid understanding and write a complete, thorough post on it.

Random Walker in a nutshell

Random Walker is an image segmentation algorithm based on the statistical random walk process: for each unlabelled pixel, it finds the likelihood that a walker starting at that pixel reaches each of the user-placed seeds. The pixel is then assigned the label of the seed with the highest probability.

To understand this more easily, we can make the analogy of an electric circuit. Each pixel in the image would be represented as a node in the circuit. Each node is connected to its neighbouring nodes by a "resistor". These resistor values can differ depending on what the segmentation determinant is (intensity, colour, brightness, etc.). Of course, an equation is derived for each of these determinants.

Once the resistor values are in place, we need to take each seed (represented by a voltage source), one at a time, and figure out the potential between the voltage source and each node, going through all the resistors in between. Analogously, we would be finding which seed each pixel belongs to based on the numeric values between the pixel and each seed, and assigning the pixel to the seed which produces the greatest value (potential).

To find these values, we use the combinatorial Dirichlet problem (as far as I understand). The boundary conditions for this problem would come from the locations of the seeds (the user-labelled pixels).
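For reference, this is how I understand the standard formulation from the literature (a summary, not this project's final word): with L the graph Laplacian built from the resistor conductances, the combinatorial Dirichlet problem is to minimize the energy

$$D[x] = \tfrac{1}{2}\, x^{T} L x$$

over the nodal potentials x, while the seed nodes are held fixed at their label values. Splitting x into the seeded values x_M and the unknown values x_U, and partitioning L accordingly as

$$L = \begin{pmatrix} L_M & B \\ B^{T} & L_U \end{pmatrix},$$

the minimizer satisfies the linear system

$$L_U\, x_U = -B^{T} x_M,$$

which is the same set of node equations that Kirchhoff's Current Law gives in the circuit analogy.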