Please help us find a missing Canadian pilot using Google Earth

Posted December 26, 2007 by Lino Ramirez
Categories: Canada, News, Technology

Thanks to technology, you can be instrumental in bringing peace to a tormented family. We are trying to find an apple seed in a backyard overgrown with weeds. We need your eyes and your goodwill.

Ron Boychuk disappeared on October 23, 2007, while flying home alone in a Cessna 172. He departed from Revelstoke, Canada, en route to the Vancouver Island community of Qualicum, Canada, and never arrived. The Canadian government has given up the search, which is why friends and family are sending their cry for help to the online community.

The incredible people at DigitalGlobe and InternetSAR responded to the call and have donated their invaluable time and resources to make this search possible.

DigitalGlobe arranged for a satellite sweep of some of the highest-value areas and has donated, free of charge, the best-quality imagery available in North America.

InternetSAR is providing its valuable time, experience, and technology to support the collaborative analysis of those images.

We now need many people to examine the images thoroughly and report any spot that should be looked at more closely. We are trying to find an apple seed in a backyard overgrown with weeds.

We need your eyes; we need volunteers; we need goodwill. Thanks to technology, you can help. You can be instrumental in bringing peace to a tormented family.

If you want to help, please:

  1. go to http://internetsar.org/
  2. follow the link “Ron Boychuk Search”
  3. create your account
  4. review one or more imagery overlays with Google Earth

If you have any questions, please join the Google Earth Community, where we have set up a forum for the Search for Ron Boychuk.

Our sincerest thanks

 

Calgary Open Source Systems Festival: The largest free technology event in Calgary

Posted October 21, 2007 by Lino Ramirez
Categories: Free and Open Source Software, News, Perl

This coming weekend is the Calgary Open Source Systems Festival (COSSFest 2007). For open source lovers and for anyone curious about open source, this is a must-attend event. By the way, I will be giving two talks: one on Machine Learning Development with Perl and another on Open Source Business Analytics.

Here is the official press release:

 

COSSFEST 07 (Calgary Open Source Systems Festival) will bring together professionals, students and enthusiasts who share a common interest in Open Source software.

This one-day conference and expo will be made up of speakers and hourly workshops.

Booths featuring hardware vendors, software companies, services companies, and (of course) user groups will be on hand. This event is the largest free technology event in Calgary!

 

COSSFEST 07 will publicize open source systems and educate the community at large on how open source systems can satisfy their software needs. This is a great opportunity for visionaries, developers, technologists, entrepreneurs, programmers, CIOs, CTOs, educators, not-for-profit leaders, hackers and average users to meet, network and learn.

 

 

COSSFEST 07

4-Nines Dining Centre, John Ware Building

SAIT Polytechnic

1301 - 16th Avenue NW, Calgary, AB T2M 0L4

Saturday, October 27, 2007

9:30 am to 5:30 pm.

 

This is a FREE event

(Registration for attendance is required at www.cossfest.ca)

 

For more information, please visit: www.cossfest.ca

Contact: Kin Wong at 403.617.9316 or e-mail: info@cossfest.ca

 

Everyone interested in Linux, Unix, Open Source and Free Software is welcome

 

IBM Joins OpenOffice.org Community

Posted September 14, 2007 by Lino Ramirez
Categories: Free and Open Source Software, News

10 September 2007 — The OpenOffice.org community today announced that IBM will be joining the community to collaborate on the development of OpenOffice.org software. IBM will be making initial code contributions that it has been developing as part of its Lotus Notes product, including accessibility enhancements, and will be making ongoing contributions to the feature richness and code quality of OpenOffice.org. Besides working with the community on the free productivity suite’s software, IBM will also leverage OpenOffice.org technology in its products.

Read the full press release at OpenOffice.org.

Machine Learning Development with Perl

Posted September 11, 2007 by Lino Ramirez
Categories: Perl, analytics, knowledge discovery

I just posted on PerlMonks a draft of a 45-minute talk on Machine Learning Development with Perl. Here is an extract of that post:

————————————————–

Machine Learning Development with Perl

The development of machine learning applications can be seen as a three-phase process involving: preparation, modeling, and implementation (See Fig. 1).

As a developer, you have to move back and forth between phases until you get a satisfactory result.

Preparation

In the preparation phase, you work with your customer to define the problem. You then gather some data. After that, you analyze the data, clean it if necessary, and select the features you are going to use in the model. Based on the type of problem, you may decide what type of model you want to develop: a classifier, an estimator, or a clustering application.

Modeling

In the modeling phase, you select the model (if you did not do so in the preparation phase), develop it, and finally evaluate it. Based on the results you get, you may decide to go back to the preparation phase and select other features, another cleaning method, or maybe another type of model.

Implementation

In the implementation phase, you simply implement your model. One important consideration is that your model should continue learning from new data. Sometimes, in machine learning, your model works well initially but when the data grow significantly then the model does not perform as well as before. This is why it is important to allow the model to continue learning as more data become available.

————————————————–

The full post (including source code) is available at RFC: Machine Learning Development with Perl
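
One very small instance of "continue learning from new data" is keeping the statistics a model depends on up to date as each new pattern arrives, instead of recomputing them from scratch. The sketch below is not taken from the talk: it uses Welford's online algorithm for a running mean and standard deviation, and the names `new_running_stats`, `update_running_stats`, and `running_stdev` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: an empty accumulator for one variable.
sub new_running_stats {
    return { n => 0, mean => 0, m2 => 0 };
}

# Fold one new observation into the accumulator (Welford's update).
# m2 holds the running sum of squared deviations from the mean.
sub update_running_stats {
    my ( $stats, $x ) = @_;
    $stats->{n}++;
    my $delta = $x - $stats->{mean};
    $stats->{mean} += $delta / $stats->{n};
    $stats->{m2}   += $delta * ( $x - $stats->{mean} );
    return $stats;
}

# Sample (N-1) standard deviation of everything seen so far.
sub running_stdev {
    my ($stats) = @_;
    return 0 if $stats->{n} < 2;
    return sqrt( $stats->{m2} / ( $stats->{n} - 1 ) );
}

# Made-up observations, fed in one at a time as they "arrive".
my $stats = new_running_stats();
update_running_stats( $stats, $_ ) for ( 2, 4, 4, 4, 5, 5, 7, 9 );

printf "n=%d mean=%.3f stdev=%.3f\n",
    $stats->{n}, $stats->{mean}, running_stdev($stats);
```

A real model would update its own parameters in a similar incremental way; the point here is only that each new observation costs a constant amount of work.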

Cheers,

Lino

Presenting AranduCorp

Posted September 10, 2007 by Lino Ramirez
Categories: News, knowledge discovery

AranduCorp is a consulting firm focused on helping organizations improve their business processes, marketing, and sales. AranduCorp offers training and consulting services on predictive analytics and will soon offer affordable predictive analytics software solutions for small and medium-sized businesses.

For more information visit:

AranduCorp’s Website

AranduCorp’s Blog

Machine Learning Made Easy with Perl (the day before)

Posted July 24, 2007 by Lino Ramirez
Categories: Free and Open Source Software, OSCON, OSCON07, Perl

Machine Learning Made Easy with Perl is the name of the session I am giving tomorrow afternoon at OSCON. I really worked hard on this one :-) It took me more time than I expected to make machine learning easy ;-) I do not want to spoil the surprise but the talk is really packed so if you are attending, do not close your eyes for a second because you might miss one of the pointers that could save your next machine learning project.

There is a small update to the session: I will cover only two case studies, “Exploratory financial data analysis using fuzzy clustering” and “Medical decision support systems using support vector machines”, so that I can provide more in-depth information. Come and see what I mean :-)

I hope to see many faces there. By the way, I will make available the slides and the source code one week after the talk.

Cheers,

Lino

Finding texture descriptors using Perl

Posted July 13, 2007 by Lino Ramirez
Categories: Free and Open Source Software, Perl

For an image analysis application I am writing using the PDL, I needed to compute some texture measures. After some research, I decided to go with the measures proposed by Robert Haralick based on the Gray Level Co-occurrence Matrix (GLCM). To make a long story short, I found a nice tutorial on the GLCM and started implementing the code for computing the GLCM and the texture measures following the equations presented in the tutorial. Here is my first take at computing the GLCM and some of the texture measures:

#!/usr/bin/perl
use warnings;
use strict;
use PDL;
use PDL::NiceSlice;


# ================================
# cooccurrence:
#
# $glcm = cooccurrence( $pdl, $dir, $dist, $symmetric )
#
# computes the grey level co-occurrence
# matrix of piddle $pdl for a given direction and
# distance
#
# Inputs:
# $pdl: input image piddle
# $dir: direction of evaluation
#   $dir   angle
#   0      +90
#   1      +45
#   2      0
#   3      -45
#   4      -90
# $dist: distance between pixels
# $symmetric: 0 => non-symmetric $glcm
#             non-zero => symmetric $glcm
#
# ================================
sub cooccurrence {
    my ( $pdl, $dir, $dist, $symmetric ) = @_;

    my $min_quantization_level = int( min( $pdl ) );
    my $max_quantization_level = int( max( $pdl ) );

    # One row and one column per grey level present in the range
    my $glcm = zeroes( $max_quantization_level
                       - $min_quantization_level + 1
                     , $max_quantization_level
                       - $min_quantization_level + 1 );

    my ($dir_x, $dir_y);

    # Map the direction code to a unit pixel offset
    if ( $dir == 0 ){
        $dir_x = 0;
        $dir_y = 1;
    } elsif ( $dir == 1 ){
        $dir_x = 1;
        $dir_y = 1;
    } elsif ( $dir == 2 ){
        $dir_x = 1;
        $dir_y = 0;
    } elsif ( $dir == 3 ){
        $dir_x = 1;
        $dir_y = -1;
    } elsif ( $dir == 4 ){
        $dir_x = 0;
        $dir_y = -1;
    } else {
        $dir_x = 0;
        $dir_y = 0;
    }

    $dir_x *= $dist;
    $dir_y *= $dist;

    my $glcm_ind_x = 0;
    my $glcm_ind_y = 0;

    foreach my $grey_level_1 ( $min_quantization_level .. $max_quantization_level ){
        # Coordinates of every pixel at grey level 1, shifted by the offset
        my ( $ind_x_1, $ind_y_1 )
            = whichND( $pdl == $grey_level_1 );
        $ind_x_1 += $dir_x;
        $ind_y_1 += $dir_y;

        foreach my $grey_level_2 ( $min_quantization_level .. $max_quantization_level ){
            my ( $ind_x_2, $ind_y_2 )
                = whichND( $pdl == $grey_level_2 );

            # Count shifted level-1 pixels that land on a level-2 pixel
            my $count = 0;
            foreach my $i (0..$ind_x_1->getdim(0) - 1) {
                foreach my $j (0..$ind_x_2->getdim(0) - 1) {
                    if ( ($ind_x_1($i) == $ind_x_2($j))
                         and ($ind_y_1($i) == $ind_y_2($j)) ) {
                        $count++;
                    }
                }
            }

            $glcm( $glcm_ind_x, $glcm_ind_y ) .= $count;
            $glcm_ind_y++;
        }
        $glcm_ind_y = 0;
        $glcm_ind_x++;
    }

    if ( $symmetric ) {
        $glcm += transpose( $glcm );
    }

    # Normalize the counts so the entries sum to 1
    $glcm /= sum( $glcm );
    return $glcm;
}

# ================================
# texture_descriptors:
#
# ( $contrast, $dissimilarity, $homogeneity
# , $inverse_difference, $asm, $energy )
# = texture_descriptors( $glcm );
#
# computes a set of texture descriptors
# associated with the GLCM $glcm
#
# $contrast:
# Range = [0 .. ($glcm->getdim(0)-1)^2]
# $contrast = 0 for a constant image.
# $homogeneity:
# Measures the closeness of the distribution
# of elements in the GLCM to the GLCM diagonal.
# Range = [0 1]
# $homogeneity is 1 for a diagonal GLCM.
# ================================
sub texture_descriptors{
    my $glcm = pdl( @_ );
    my $n = $glcm->getdim(0);
    my $i = sequence( $n );
    my $j = sequence( $n );
    my $diff = $i->dummy(0, $n) - $j->dummy(1, $n);

    my $contrast = sum( $glcm * ($diff ** 2) );

    my $dissimilarity = sum( $glcm * abs( $diff ) );

    my $homogeneity = sum( $glcm / ( 1 + $diff ** 2) );

    my $inverse_difference = sum( $glcm / ( 1 + abs( $diff ) ) );

    my $asm = sum( $glcm ** 2 );

    my $energy = sqrt( $asm );

    return ( $contrast, $dissimilarity, $homogeneity
           , $inverse_difference, $asm, $energy );
}

my $pdl = pdl([0,0,1,1],[0,0,1,1],[0,2,2,2],[2,2,3,3]);
my $glcm = cooccurrence( $pdl, 2, 1, 1 );
print "glcm: $glcm\n";

my ( $contrast, $dissimilarity, $homogeneity
   , $inverse_difference, $asm, $energy )
    = texture_descriptors( $glcm );

print "contrast: $contrast\tdissimilarity: $dissimilarity\n";
print "homogeneity: $homogeneity\t";
print "inverse difference: $inverse_difference\n";
print "ASM: $asm\tenergy: $energy\n";

All suggestions are welcome :-)
Cheers
Lino
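
As a cross-check on the listing above, here is a minimal plain-Perl sketch (no PDL) that builds the same kind of normalized GLCM in a single pass over the pixels and then computes two of the descriptors. It is only a sketch under stated assumptions: `glcm_plain` and `contrast_and_homogeneity` are hypothetical names, the image is assumed to be a rectangular array of rows holding integer grey levels, and the offset is passed directly as `($dx, $dy)` rather than as a direction code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: normalized GLCM for the pixel offset ($dx, $dy).
# $image is an array ref of rows; $symmetric is a boolean flag.
sub glcm_plain {
    my ( $image, $dx, $dy, $symmetric ) = @_;
    my $rows = scalar @$image;
    my $cols = scalar @{ $image->[0] };

    # Find the grey-level range so the matrix is indexed from the minimum.
    my ( $min, $max ) = ( $image->[0][0], $image->[0][0] );
    for my $row (@$image) {
        for my $value (@$row) {
            $min = $value if $value < $min;
            $max = $value if $value > $max;
        }
    }
    my $levels = $max - $min + 1;
    my @glcm   = map { [ (0) x $levels ] } 1 .. $levels;

    # Visit each pixel once and look at its neighbour at the given offset.
    my $total = 0;
    for my $y ( 0 .. $rows - 1 ) {
        for my $x ( 0 .. $cols - 1 ) {
            my ( $nx, $ny ) = ( $x + $dx, $y + $dy );
            next if $nx < 0 || $nx >= $cols || $ny < 0 || $ny >= $rows;
            my $ref = $image->[$y][$x] - $min;
            my $nbr = $image->[$ny][$nx] - $min;
            $glcm[$ref][$nbr]++;
            $total++;
            if ($symmetric) {    # also count the pair in reverse order
                $glcm[$nbr][$ref]++;
                $total++;
            }
        }
    }
    return \@glcm if $total == 0;

    # Normalize the counts so the entries sum to 1.
    for my $row (@glcm) {
        $_ /= $total for @$row;
    }
    return \@glcm;
}

# Contrast and homogeneity, using the same formulas as texture_descriptors.
sub contrast_and_homogeneity {
    my ($glcm) = @_;
    my ( $contrast, $homogeneity ) = ( 0, 0 );
    for my $i ( 0 .. $#$glcm ) {
        for my $j ( 0 .. $#{ $glcm->[$i] } ) {
            my $d2 = ( $i - $j )**2;
            $contrast    += $glcm->[$i][$j] * $d2;
            $homogeneity += $glcm->[$i][$j] / ( 1 + $d2 );
        }
    }
    return ( $contrast, $homogeneity );
}

my @image = ( [ 0, 0, 1, 1 ], [ 0, 0, 1, 1 ], [ 0, 2, 2, 2 ], [ 2, 2, 3, 3 ] );
my $glcm  = glcm_plain( \@image, 1, 0, 1 );
my ( $contrast, $homogeneity ) = contrast_and_homogeneity($glcm);
printf "contrast: %.4f\thomogeneity: %.4f\n", $contrast, $homogeneity;
```

For the 4x4 test image with a horizontal offset of one pixel and a symmetric matrix, counting the pairs by hand gives 24 entries in total, a contrast of 14/24 and a homogeneity of 19.4/24, so the script prints 0.5833 and 0.8083. The single pass avoids comparing every pixel of one grey level against every pixel of another.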

OSCON 2007: 16 days away

Posted July 7, 2007 by Lino Ramirez
Categories: Free and Open Source Software, OSCON, OSCON07, Perl, presenting

Only 16 days separate us from OSCON and I am still polishing the material for my session ;-) I asked my fellow PerlMonks for feedback on a preliminary version of the presentation’s outline and, as usual, the comments were really useful. Based on the comments, I decided to reduce the number of case studies from the three I originally planned to two. I believe that this way I will have more time to clearly explain the techniques.

By the way, with this post, I will start a series of posts in which I show some of the snippets I will be presenting. Here is the first one:

Description:

A common practice in machine learning is to preprocess the data before building a model. One popular preprocessing technique is data normalization. Normalization puts the variables in a restricted range (with a zero mean and 1 standard deviation). This is important to achieve efficient and precise numerical computation.

In this snippet, I present how to do data normalization using the Perl Data Language. The input is a piddle (see the comment below for a definition) in which each column represents a variable and each row represents a pattern. The output is a piddle in which each variable is normalized to have 0 mean and 1 standard deviation, together with the mean and standard deviation of the input piddle.

What are Piddles?

They are a new data structure defined in the Perl Data Language. As indicated in RFC: Getting Started with PDL (the Perl Data Language):

Piddles are numerical arrays stored in column major order (meaning that the fastest varying dimension represent the columns following computational convention rather than the rows as mathematicians prefer). Even though, piddles look like Perl arrays, they are not. Unlike Perl arrays, piddles are stored in consecutive memory locations facilitating the passing of piddles to the C and FORTRAN code that handles the element by element arithmetic. One more thing to note about piddles is that they are referenced with a leading $

Code:


#!/usr/bin/perl
use warnings;
use strict;

use PDL;
use PDL::NiceSlice;

# ================================
# normalize
# ( $output_data, $mean_of_input, $stdev_of_input) =
#     normalize( $input_data )
#
# processes $input_data so that $output_data
# has 0 mean and 1 stdev
#
# $output_data = ( $input_data - $mean_of_input ) / $stdev_of_input
# ================================
sub normalize {
    my ( $input_data ) = @_;

    # statsover works along dimension 0, so swap the dimensions
    # to get one set of statistics per variable
    my ( $mean, $stdev, $median, $min, $max, $adev )
        = $input_data->xchg(0,1)->statsover();

    # Avoid dividing by zero for constant variables
    my $idx = which( $stdev == 0 );
    $stdev( $idx ) .= 1e-10;

    my ( $number_of_dimensions, $number_of_patterns )
        = $input_data->dims();

    # Repeat the per-variable mean and stdev across all patterns
    my $output_data
        = ( $input_data - $mean->dummy(1, $number_of_patterns) )
        / $stdev->dummy(1, $number_of_patterns);

    return ( $output_data, $mean, $stdev );
}
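
For readers without PDL installed, the same idea can be sketched in plain Perl. This is an illustration rather than a replacement for the PDL version: `normalize_plain` is a hypothetical name, the input is assumed to be an array of rows (one row per pattern, one column per variable), and the sketch uses the sample (N-1) standard deviation, which is a choice made here and may not match every definition of "standard deviation".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: z-score every column of a table of rows.
# Returns the normalized rows plus the per-column mean and stdev.
sub normalize_plain {
    my ($rows) = @_;
    my $n_patterns  = scalar @$rows;
    my $n_variables = scalar @{ $rows->[0] };
    my ( @mean, @stdev );

    for my $v ( 0 .. $n_variables - 1 ) {
        my @column = map { $_->[$v] } @$rows;
        my $mean   = sum(@column) / $n_patterns;

        # Sum of squared deviations from the column mean
        my $ss = 0;
        $ss += ( $_ - $mean )**2 for @column;

        my $stdev = $n_patterns > 1 ? sqrt( $ss / ( $n_patterns - 1 ) ) : 0;
        $stdev = 1e-10 if $stdev == 0;    # guard against constant columns
        push @mean,  $mean;
        push @stdev, $stdev;
    }

    my @output;
    for my $row (@$rows) {
        my @scaled;
        for my $v ( 0 .. $n_variables - 1 ) {
            push @scaled, ( $row->[$v] - $mean[$v] ) / $stdev[$v];
        }
        push @output, \@scaled;
    }

    return ( \@output, \@mean, \@stdev );
}

# Made-up data: three patterns, two variables on very different scales.
my @data = ( [ 1, 10 ], [ 2, 20 ], [ 3, 30 ] );
my ( $z, $mean, $stdev ) = normalize_plain( \@data );

for my $row (@$z) {
    print join( ' ', map { sprintf( '%.2f', $_ ) } @$row ), "\n";
}
```

Both columns of the made-up data end up as -1, 0, 1 after normalization, even though the second one started out ten times larger. Returning the mean and standard deviation lets the same scaling be applied later to new patterns.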

The GPL v. 3 is Here

Posted June 30, 2007 by Lino Ramirez
Categories: Free and Open Source Software, News

The new version of the GNU GPL (General Public License) was unveiled on Friday, June 29, 2007. The complete text is available at the GNU GPL site. How many projects will switch to the new license? That is a question that only time can answer …

Venezuelan researcher has set a new Wi-Fi distance record

Posted June 19, 2007 by Lino Ramirez
Categories: News, Venezuela

A researcher from the Universidad de los Andes (ULA), Prof. Ermanno Pietrosemoli, has set a new record for the longest communication link with Wi-Fi: 382 kilometers (238 miles). Pietrosemoli, president of the Escuela Latinoamericana de Redes (or Networking School of Latin America), achieved the record by establishing a Wi-Fi link between two computers located in El Aguila and Platillon Mountain, Venezuela. Pietrosemoli gets about 3 megabits per second in each direction on his long-range connections.

The full article is available at c|net News.com