Finding texture descriptors using Perl

Posted July 13, 2007 by Lino Ramirez
Categories: Free and Open Source Software, Perl

For an image analysis application I am writing using the PDL, I needed to compute some texture measures. After some research, I decided to go with the measures proposed by Rober Haralick based on the Gray Level Co-occurrence Matrix (GLCM). To make a long story short, I found a nice tutorial on the GLCM and started implementing the code for computing the GLCM and the texture measures following the equations presented in the tutorial. Here is my first take to computing the GLCM and some of the texture measures:

#!/usr/bin/perl
use warnings;
use strict;
use PDL;
use PDL::NiceSlice;


# ================================
# cooccurrence:
#
# $glcm = cooccurrence( $pdl, $dir, $dist, $symmetric )
#
# computes the grey level coocurrence coocurrence
# matrix of piddle $pdl for a given direction and
# distance
#
# Inputs:
# $pdl
# $dir: direction of evaluation
# $dir angle
# 0 +90
# 1 +45
# 2 0
# 3 -45
# 4 -90
# $dist: distance between pixels
# $symmetric: 0 => non-symmetric $glcm
#
# ================================
sub cooccurrence {
my ( $pdl, $dir, $dist, $symmetric ) = @_;

my $min_quantization_level = int( min( $pdl ) );
my $max_quantization_level = int( max( $pdl ) );

my $glcm = zeroes( $max_quantization_level
- $min_quantization_level + 1
, $max_quantization_level
- $min_quantization_level + 1 );

my ($dir_x, $dir_y);

if ( $dir == 0 ){
$dir_x = 0;
$dir_y = 1;
} elsif ( $dir == 1 ){
$dir_x = 1;
$dir_y = 1;
} elsif ( $dir == 2 ){
$dir_x = 1;
$dir_y = 0;
} elsif ( $dir == 3 ){
$dir_x = 1;
$dir_y = -1;
} elsif ( $dir == 4 ){
$dir_x = 0;
$dir_y = -1;
} else {
$dir_x = 0;
$dir_y = 0;
}

$dir_x *= $dist;
$dir_y *= $dist;

my $glcm_ind_x = 0;
my $glcm_ind_y = 0;

foreach my $grey_level_1 ( $min_quantization_level .. $max_quantization_level ){
my ( $ind_x_1, $ind_y_1 )
= whichND( $pdl == $grey_level_1 );
$ind_x_1 += $dir_x;
$ind_y_1 += $dir_y;

foreach my $grey_level_2 ( $min_quantization_level .. $max_quantization_level ){
my ( $ind_x_2, $ind_y_2 )
= whichND( $pdl == $grey_level_2 );
my $count = 0;
foreach my $i (0..$ind_x_1->getdim(0) - 1) {
foreach my $j (0..$ind_x_2->getdim(0) - 1) {
if ( ($ind_x_1($i) == $ind_x_2($j))
and ($ind_y_1($i) == $ind_y_2($j)) ) {
$count++;
}
}
}

$glcm( $glcm_ind_x, $glcm_ind_y ) .= $count;
$glcm_ind_y++;
}
$glcm_ind_y = 0;
$glcm_ind_x++;
}

if ( $symmetric ) {
$glcm += transpose( $glcm );
}
$glcm /= sum( $glcm );
return $glcm;
}

# ================================
# texture_descriptors:
#
# ( $contrast, $dissimilarity, $homogeneity
# , $inverse_difference, $asm, $energy )
# = texture_descriptors( $glcm );
#
# computes a set of texture descriptors
# associated with the GLCM $glcm
#
# $contrast:
# Range = [0 .. ($glcm->getdim(0)-1)^2]
# $contrast = 0 for a constant image.
# $homogeneity:
# Measures the closeness of the distribution
# of elements in the GLCM to the GLCM diagonal.
# Range = [0 1]
# $homogeneity is 1 for a diagonal GLCM.
# ================================
sub texture_descriptors{
my $glcm = pdl( @_ );
my $n = $glcm->getdim(0);
my $i = sequence( $n );
my $j = sequence( $n );
my $diff = $i->dummy(0, $n) - $j->dummy(1, $n);

my $contrast = sum( $glcm * ($diff ** 2) );

my $dissimilarity = sum( $glcm * abs( $diff ) );

my $homogeneity = sum( $glcm / ( 1 + $diff ** 2) );

my $inverse_difference = sum( $glcm / ( 1 + abs( $diff ) ) );

my $asm = sum( $glcm ** 2 );

my $energy = sqrt( $asm );

return ( $contrast, $dissimilarity, $homogeneity
, $inverse_difference, $asm, $energy );
}

my $pdl = pdl([0,0,1,1],[0,0,1,1],[0,2,2,2],[2,2,3,3]);
my $glcm = cooccurrence( $pdl, 2, 1, 1 );
print "glcm: $glcm\n";

my ( $contrast, $dissimilarity, $homogeneity
, $inverse_difference, $asm, $energy )
= texture_descriptors( $glcm );

print "contrast: $contrast\tdissimilarity: $dissimilarity\n";
print "homogeneity: $homogeneity\t";
print "inverse difference: $inverse_difference\n";
print "ASM: $asm\tenergy: $energy\n";

All suggestions are welcome 🙂
Cheers
Lino

OSCON 2007: 16 days away

Posted July 7, 2007 by Lino Ramirez
Categories: Free and Open Source Software, OSCON, OSCON07, Perl, presenting

Only 16 days separate us from OSCON and I am still polishing the material for my session 😉 I asked my fellow PerlMonks for feedback on a preliminary version of the presentation’s outline and as usual the comments were really useful. Based on the comments, I decided to reduce to two the number of case studies to be presented instead of the three I originally planned. I believe that in this way, I will have more time to clearly explain the techniques.

 

By the way, with this post, I will start a series of posts in which I show some of the snippets I will be presenting. Here are the first one:

Description:

A common practice in machine learning is to preprocess the data before building a model. One popular preprocessing technique is data normalization. Normalization puts the variables in a restricted range (with a zero mean and 1 standard deviation). This is important to achieve efficient and precise numerical computation.

In this snippet, I present how to do data normalization using the Perl Data Language. The input is a piddle (see comment below for a definition) in which each column represents a variable and each row represent a pattern. The output is a piddle (in which each variable is normalized to have a 0 mean and 1 standard deviation), and the mean and standard deviation of the input piddle.

What are Piddles?

They are a new data structure defined in the Perl Data Language. As indicated in RFC: Getting Started with PDL (the Perl Data Language):

Piddles are numerical arrays stored in column major order (meaning that the fastest varying dimension represent the columns following computational convention rather than the rows as mathematicians prefer). Even though, piddles look like Perl arrays, they are not. Unlike Perl arrays, piddles are stored in consecutive memory locations facilitating the passing of piddles to the C and FORTRAN code that handles the element by element arithmetic. One more thing to note about piddles is that they are referenced with a leading $

Code:


#!/usr/bin/perl
use warnings;
use strict;

use PDL;
use PDL::NiceSlice;

# ================================
# normalize
# ( $output_data, $mean_of_input, $stdev_of_input) =
# normalize( $input_data )
#
# processess $input_data so that $output_data
# has 0 mean and 1 stdev
#
# $output_data = ( $input_data – $mean_of_input ) / $stdev_of_input
# ================================
sub normalize {
my ( $input_data ) = @_;
my ( $mean, $stdev, $median, $min, $max, $adev )
= $input_data->xchg(0,1)->statsover();

my $idx = which( $stdev == 0 );
$stdev( $idx ) .= 1e-10;
my ( $number_of_dimensions, $number_of_patterns )
= $input_data->dims();
my $output_data
= ( $input_data – $mean->dummy(1, $number_of_patterns) )
/ $stdev->dummy(1, $number_of_patterns);

return ( $output_data, $mean, $stdev );
}

The GPL v. 3 is Here

Posted June 30, 2007 by Lino Ramirez
Categories: Free and Open Source Software, News

The new version of the GNU GPL (General Purpose License) was unveiled on Friday June 29, 2007. The complete text is available at the GNU GPL site. How many projects will switch to use the new license? That is a question that only time can answer …

Venezuelan researcher has set a new Wi-Fi distance record

Posted June 19, 2007 by Lino Ramirez
Categories: News, Venezuela

A researcher from the Universidad de los Andes (ULA), Prof. Ermanno Pietrosemoli, has set a new record for the longest communication link with Wi-Fi: 382 kilometers (238 miles). Pietrosemoli, president of the Escuela Latinoamerica de Redes (or Networking School of Latin America), achieved the record by establishing a Wi-Fi link between two computers located in El Aguila and Platillon Mountain, Venezuela. Pietrosemoli gets about 3 megabits per second in each direction on his long-range connections.

The full article is available at c|net News.com

Is Canada a Land of Mediocrity?

Posted June 17, 2007 by Lino Ramirez
Categories: Canada, News

The Globe and Mail reports:

Canada’s failure to innovate is spilling over into the economy, environmental protection, health care, education, and poverty eradication – turning the country into a land of stifling mediocrity, according to a harsh new report card from the Conference Board of Canada.

The full article is available at Canada Land of Mediocrity

Even though I do not agree with the title of the article, I completely agree that we (specially our Government) need to do something to move forward our beautiful country. We have the people and the talent, we just need the support of our government to create an environment in which innovation could flourish.

Cheers,

Lino

Generating cool fractals

Posted June 16, 2007 by Lino Ramirez
Categories: Free and Open Source Software, Open Science, Perl

This one is from the Free Software Magazine. Xavier Calbet wrote:

Whether you are a professional or amateur scientist, engineer or mathematician, if you need to make numerical calculations and plots quickly and easily, then PDL (Perl Data Language) is certainly one of the best free software tools to use. PDL has everything that similar high-level, proprietary, numerical calculation languages (like IDL or MATLAB) have. And it certainly comes with all the features you would expect to have in a numerical calculation package.

The full article is available at: Generating cool fractals.

Cheers,

Lino

The Art of Innovation…

Posted June 16, 2007 by Lino Ramirez
Categories: presenting

Here is a nice version of Guy Kawasaki‘s The Art of Innovation speech. This version was prepared using Zentation‘s technology.

Cheers,

Lino