Archive for the ‘Free and Open Source Software’ category

Catalyst: Accelerating Perl Web Application Development

December 28, 2009

If you are a Perl fan looking for ways to develop high-quality Perl web applications, Catalyst: Accelerating Perl Web Application Development is one book you should consider for your library. This is an introductory book to Catalyst, an open source web application framework written in Perl. With it you will get a good understanding of the Catalyst framework, its MVC (Model-View-Controller) architecture, and how to create web applications using Catalyst. Each chapter starts with an introduction, follows with step-by-step instructions to accomplish the chapter’s goals, and ends with a summary of what was learned to ease the learning process.

Chapter 1, “Introduction to Catalyst”, explains the need for an MVC framework when developing web applications and how Catalyst fulfills that need. This chapter also discusses the Catalyst architecture and how to install it on Debian-based Linux, FreeBSD, OpenBSD, and Windows using ActiveState Perl.

Chapter 2, “Creating a Catalyst Application”, shows you how to build a Hello World! application using Catalyst. The most useful elements of this chapter are the descriptions of the directory structure and of the files in the different directories. As a bonus, this chapter includes a brief introduction to SQLite.
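
To give you a flavor of how little code is involved, here is a minimal Root controller along the lines of what the Catalyst helpers generate (a sketch of mine, not code from the book; the app name Hello is illustrative):

package Hello::Controller::Root;
use strict;
use warnings;
use base 'Catalyst::Controller';

# act as the application root instead of living under /root
__PACKAGE__->config( namespace => '' );

# handle requests for /
sub index : Path : Args(0) {
    my ( $self, $c ) = @_;
    $c->response->body('Hello World!');
}

1;

The surrounding skeleton comes from running catalyst.pl Hello, and script/hello_server.pl starts the built-in development server.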

Chapter 3, “Building a Real Application”, discusses the steps required to create a basic but extensible AddressBook: environment setup, database design, and the creation of the Index and Not Found pages. Of particular interest are the section on a CRUD (Create, Retrieve, Update, Delete) interface to delete a person’s information from the database and the section on forms to add information to the database, as illustrated by the sketch below.
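
The delete half of such a CRUD interface can be as small as a single controller action (a hypothetical sketch of mine, not the book’s code; the model and action names are made up):

# remove the person with the given primary key, then send
# the user back to the front page
sub delete : Local {
    my ( $self, $c, $id ) = @_;
    $c->model('AddressDB::People')->find($id)->delete;
    $c->response->redirect( $c->uri_for('/') );
}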

Chapter 4, “Expanding the Application”, explains how to use configuration files to add value to the AddressBook developed in Chapter 3. Sessions, authentication, and authorization are added to personalize the AddressBook, and searching and paging functionality is incorporated to make information easier to locate.
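
For example, with Catalyst::Plugin::Authentication the login check boils down to something like this (an illustrative fragment of mine, not the book’s code):

sub login : Local {
    my ( $self, $c ) = @_;
    my $username = $c->request->params->{username};
    my $password = $c->request->params->{password};

    # check the credentials against the configured store
    if ( $c->authenticate( { username => $username,
                             password => $password } ) ) {
        $c->response->redirect( $c->uri_for('/') );
    } else {
        $c->stash->{error} = 'Login failed';
    }
}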

Chapter 5, “Building a More Advanced Application”, describes ChatStat, a tool for tracking opinions expressed on a popular Perl IRC network (irc.perl.org). Of key interest in this chapter are the sections on extracting data from the IRC channel and on manipulating the data to make sense of it.

Chapter 6, “Building Your Own Model”, deals with different ways of accessing the database models. The options explored in this chapter are: (1) mixing a procedural interface with a relational DBIx::Class interface, used to enhance the AddressBook application; (2) writing a database interface without DBIx::Class for the AddressBook; and (3) building a custom model that does not use a database at all (this model is used to create a simple blog application).

Chapter 7, “Hot Web Topics”, focuses on adding interactivity to web sites and improving their responsiveness. The chapter first deals with adding a REST API to the AddressBook application so that API clients can easily look up people and their addresses; authentication is handled with a username and password. Next, the AddressBook user interface is modified using Jemplate to allow users to edit addresses in place. The final topic covered in this chapter is RSS feeds: the XML::Feed CPAN module is used to add an RSS feed to the mini blog application described in Chapter 6.
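
To make the REST part concrete, a person lookup could be sketched as follows with Catalyst::Controller::REST (my illustration, not the book’s code; the controller, model, and field names are made up):

package AddressBook::Controller::API;
use strict;
use warnings;
use base 'Catalyst::Controller::REST';

sub person : Local : ActionClass('REST') { }

# GET /api/person/<id> returns the person serialized as JSON,
# YAML, etc., depending on what the client asks for
sub person_GET {
    my ( $self, $c, $id ) = @_;
    my $person = $c->model('AddressDB::People')->find($id);
    return $self->status_not_found( $c, message => 'No such person' )
        unless $person;
    $self->status_ok( $c,
        entity => { id => $person->id, name => $person->name } );
}

1;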

Chapter 8, “Testing”, deals with automating the testing of Catalyst applications. The author starts the chapter by promoting “test driven development”, which suggests writing the tests before writing the code. The author then shows how to test application components outside Catalyst, using the message parser and the database of the ChatStat application as examples. Next, the author describes how to test the ChatStat web interface using Test::WWW::Mechanize::Catalyst. After that, the author tests the AddressBook application using a test user and Selenium RC, a portable testing framework for web applications.
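
A test script along these lines gives an idea of how Test::WWW::Mechanize::Catalyst drives the application in-process (my sketch; the page title and link text are invented):

#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 3;
use Test::WWW::Mechanize::Catalyst 'AddressBook';

my $mech = Test::WWW::Mechanize::Catalyst->new;
$mech->get_ok( '/', 'front page loads' );
$mech->title_like( qr/Address ?Book/, 'title looks right' );
$mech->follow_link_ok( { text => 'Create' }, 'create link works' );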

Chapter 9, “Deployment”, teaches how to take a Catalyst application from development to production. Makefile.PL is used to manage dependencies and to create packages. PAR deployment is discussed for cases in which the development and production environments are on the same platform. Configuration management and performance issues are also discussed.
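
For reference, a trimmed-down Makefile.PL for such an application might look like this (an illustration of mine; the exact contents Catalyst generates vary by version):

use inc::Module::Install;

name 'AddressBook';
all_from 'lib/AddressBook.pm';   # pulls version, author, etc.

requires 'Catalyst::Runtime';
requires 'Catalyst::Plugin::ConfigLoader';

install_script glob('script/*.pl');
WriteAll;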

In short, this is a well-written book with lots of tips for getting started with Catalyst. Despite its many typos, it would certainly be a good addition to your library of Perl books.

Calgary Open Source Systems Festival: The largest free technology event in Calgary

October 21, 2007

This coming weekend is the Calgary Open Source Systems Festival (COSSFest 2007). For open source lovers and those curious about open source, this is a must-attend event. By the way, I will be giving two talks: one on Machine Learning Development with Perl and another on Open Source Business Analytics.

Here is the official press release:


COSSFEST 07 (Calgary Open Source Systems Festival) will bring together professionals, students and enthusiasts who share a common interest in Open Source software.

This one-day conference and expo will be made up of speakers and hourly workshops.

Booths featuring hardware vendors, software companies, services companies, and (of course) user groups will be on hand. This event is the largest free technology event in Calgary!


COSSFEST 07 will publicize open source systems and educate the community at large on how open source systems can satisfy their software needs. This is a great opportunity for visionaries, developers, technologists, entrepreneurs, programmers, CIOs, CTOs, educators, not-for-profit leaders, hackers and average users to meet, network and learn.


COSSFEST 07

4-Nines Dining Centre, John Ware Building

SAIT Polytechnic

1301 – 16th Avenue NW, Calgary, AB T2M 0L4

Saturday, October 27, 2007

9:30 am to 5:30 pm.


This is a FREE event

(Registration for attendance is required at http://www.cossfest.ca)


For more information, please visit: http://www.cossfest.ca

Contact: Kin Wong at 403.617.9316 or e-mail: info@cossfest.ca


Everyone interested in Linux, Unix, Open Source and Free Software is welcome


IBM Joins OpenOffice.org Community

September 14, 2007

10 September 2007 — The OpenOffice.org community today announced that IBM will be joining the community to collaborate on the development of OpenOffice.org software. IBM will be making initial code contributions that it has been developing as part of its Lotus Notes product, including accessibility enhancements, and will be making ongoing contributions to the feature richness and code quality of OpenOffice.org. Besides working with the community on the free productivity suite’s software, IBM will also leverage OpenOffice.org technology in its products.

Read the full press release at OpenOffice.org.

Machine Learning Made Easy with Perl (the day before)

July 24, 2007

Machine Learning Made Easy with Perl is the name of the session I am giving tomorrow afternoon at OSCON. I really worked hard on this one 🙂 It took me more time than I expected to make machine learning easy 😉 I do not want to spoil the surprise but the talk is really packed so if you are attending, do not close your eyes for a second because you might miss one of the pointers that could save your next machine learning project.

There is a small update to the session: I will only be covering “Exploratory financial data analysis using fuzzy clustering” and “Medical decision support systems using support vector machines”. Limiting the talk to two case studies will let me provide more in-depth information on each. Come and see what I mean 🙂

I hope to see many faces there. By the way, I will make the slides and the source code available one week after the talk.

Cheers,

Lino

Finding texture descriptors using Perl

July 13, 2007

For an image analysis application I am writing using the PDL, I needed to compute some texture measures. After some research, I decided to go with the measures proposed by Robert Haralick based on the Gray Level Co-occurrence Matrix (GLCM). To make a long story short, I found a nice tutorial on the GLCM and started implementing the code for computing the GLCM and the texture measures following the equations presented in the tutorial. Here is my first take at computing the GLCM and some of the texture measures:

#!/usr/bin/perl
use warnings;
use strict;
use PDL;
use PDL::NiceSlice;


# ================================
# cooccurrence:
#
# $glcm = cooccurrence( $pdl, $dir, $dist, $symmetric )
#
# computes the grey level co-occurrence matrix (GLCM)
# of piddle $pdl for a given direction and distance
#
# Inputs:
# $pdl: the image piddle
# $dir: direction of evaluation
#     $dir   angle
#       0     +90
#       1     +45
#       2       0
#       3     -45
#       4     -90
# $dist: distance between pixels
# $symmetric: 0 => non-symmetric $glcm
#
# ================================
sub cooccurrence {
    my ( $pdl, $dir, $dist, $symmetric ) = @_;

    my $min_quantization_level = int( min( $pdl ) );
    my $max_quantization_level = int( max( $pdl ) );

    my $levels = $max_quantization_level - $min_quantization_level + 1;
    my $glcm = zeroes( $levels, $levels );

    # map the direction code to an (x, y) pixel displacement
    my ( $dir_x, $dir_y );
    if    ( $dir == 0 ) { ( $dir_x, $dir_y ) = ( 0,  1 ); }
    elsif ( $dir == 1 ) { ( $dir_x, $dir_y ) = ( 1,  1 ); }
    elsif ( $dir == 2 ) { ( $dir_x, $dir_y ) = ( 1,  0 ); }
    elsif ( $dir == 3 ) { ( $dir_x, $dir_y ) = ( 1, -1 ); }
    elsif ( $dir == 4 ) { ( $dir_x, $dir_y ) = ( 0, -1 ); }
    else                { ( $dir_x, $dir_y ) = ( 0,  0 ); }

    $dir_x *= $dist;
    $dir_y *= $dist;

    my $glcm_ind_x = 0;
    my $glcm_ind_y = 0;

    foreach my $grey_level_1 ( $min_quantization_level .. $max_quantization_level ) {
        # coordinates of the pixels with value $grey_level_1,
        # shifted by the displacement vector
        my ( $ind_x_1, $ind_y_1 ) = whichND( $pdl == $grey_level_1 );
        $ind_x_1 += $dir_x;
        $ind_y_1 += $dir_y;

        foreach my $grey_level_2 ( $min_quantization_level .. $max_quantization_level ) {
            my ( $ind_x_2, $ind_y_2 ) = whichND( $pdl == $grey_level_2 );

            # count pixel pairs ($grey_level_1, $grey_level_2)
            # separated by the displacement vector
            my $count = 0;
            foreach my $i ( 0 .. $ind_x_1->getdim(0) - 1 ) {
                foreach my $j ( 0 .. $ind_x_2->getdim(0) - 1 ) {
                    if (     $ind_x_1($i) == $ind_x_2($j)
                         and $ind_y_1($i) == $ind_y_2($j) ) {
                        $count++;
                    }
                }
            }

            $glcm( $glcm_ind_x, $glcm_ind_y ) .= $count;
            $glcm_ind_y++;
        }
        $glcm_ind_y = 0;
        $glcm_ind_x++;
    }

    # add the transpose through a temporary rather than in place,
    # so the transposed view is not read while $glcm is modified
    if ( $symmetric ) {
        $glcm = $glcm + $glcm->transpose;
    }

    # normalize the counts so the matrix elements sum to 1
    $glcm /= sum( $glcm );
    return $glcm;
}

# ================================
# texture_descriptors:
#
# ( $contrast, $dissimilarity, $homogeneity,
#   $inverse_difference, $asm, $energy )
#     = texture_descriptors( $glcm );
#
# computes a set of texture descriptors
# associated with the GLCM $glcm
#
# $contrast:
#   Range = [0 .. ($glcm->getdim(0)-1)^2]
#   $contrast = 0 for a constant image.
# $homogeneity:
#   Measures the closeness of the distribution
#   of elements in the GLCM to the GLCM diagonal.
#   Range = [0 1]
#   $homogeneity is 1 for a diagonal GLCM.
# ================================
sub texture_descriptors {
    my ( $glcm ) = @_;
    my $n = $glcm->getdim(0);

    # matrix of (i - j) differences between the row
    # and column indices of the GLCM
    my $i = sequence( $n );
    my $j = sequence( $n );
    my $diff = $i->dummy(0, $n) - $j->dummy(1, $n);

    my $contrast           = sum( $glcm * ( $diff ** 2 ) );
    my $dissimilarity      = sum( $glcm * abs( $diff ) );
    my $homogeneity        = sum( $glcm / ( 1 + $diff ** 2 ) );
    my $inverse_difference = sum( $glcm / ( 1 + abs( $diff ) ) );
    my $asm                = sum( $glcm ** 2 );   # angular second moment
    my $energy             = sqrt( $asm );

    return ( $contrast, $dissimilarity, $homogeneity,
             $inverse_difference, $asm, $energy );
}

# try it on a small test image: the GLCM for direction 2
# (angle 0, horizontal) at distance 1, made symmetric
my $pdl = pdl( [0,0,1,1], [0,0,1,1], [0,2,2,2], [2,2,3,3] );
my $glcm = cooccurrence( $pdl, 2, 1, 1 );
print "glcm: $glcm\n";

my ( $contrast, $dissimilarity, $homogeneity,
     $inverse_difference, $asm, $energy ) = texture_descriptors( $glcm );

print "contrast: $contrast\tdissimilarity: $dissimilarity\n";
print "homogeneity: $homogeneity\t";
print "inverse difference: $inverse_difference\n";
print "ASM: $asm\tenergy: $energy\n";

All suggestions are welcome 🙂
Cheers,
Lino

OSCON 2007: 16 days away

July 7, 2007

Only 16 days separate us from OSCON and I am still polishing the material for my session 😉 I asked my fellow PerlMonks for feedback on a preliminary version of the presentation’s outline and, as usual, the comments were really useful. Based on them, I decided to present two case studies instead of the three I originally planned. This way, I will have more time to clearly explain the techniques.


By the way, with this post I am starting a series in which I show some of the snippets I will be presenting. Here is the first one:

Description:

A common practice in machine learning is to preprocess the data before building a model. One popular preprocessing technique is data normalization, which rescales each variable to have zero mean and unit standard deviation. This is important for efficient and numerically stable computation.

In this snippet, I show how to do data normalization using the Perl Data Language. The input is a piddle (see the note below for a definition) in which each column represents a variable and each row represents a pattern. The outputs are a piddle in which each variable is normalized to zero mean and unit standard deviation, plus the mean and the standard deviation of the input piddle.

What are Piddles?

They are a new data structure defined in the Perl Data Language. As indicated in RFC: Getting Started with PDL (the Perl Data Language):

Piddles are numerical arrays stored in column major order (meaning that the fastest varying dimension represents the columns, following computational convention rather than the rows as mathematicians prefer). Even though piddles look like Perl arrays, they are not. Unlike Perl arrays, piddles are stored in consecutive memory locations, facilitating the passing of piddles to the C and FORTRAN code that handles the element by element arithmetic. One more thing to note about piddles is that they are referenced with a leading $.
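
As a quick illustration (my addition, not part of the RFC), here is how a small piddle is created and inspected:

use PDL;
my $x = pdl( [ [ 1, 2 ], [ 3, 4 ] ] );   # a 2x2 piddle
print $x + 1;                  # element-wise arithmetic
print $x->info, "\n";          # prints dimensions and type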

Code:


#!/usr/bin/perl
use warnings;
use strict;

use PDL;
use PDL::NiceSlice;

# ================================
# normalize
# ( $output_data, $mean_of_input, $stdev_of_input )
#     = normalize( $input_data )
#
# processes $input_data so that $output_data
# has 0 mean and 1 stdev
#
# $output_data = ( $input_data - $mean_of_input ) / $stdev_of_input
# ================================
sub normalize {
    my ( $input_data ) = @_;

    # per-variable statistics: statsover() reduces over the
    # first dimension, so swap dimensions to reduce over patterns
    my ( $mean, $stdev, $median, $min, $max, $adev )
        = $input_data->xchg(0,1)->statsover();

    # guard against division by zero for constant variables
    my $idx = which( $stdev == 0 );
    $stdev( $idx ) .= 1e-10;

    my ( $number_of_dimensions, $number_of_patterns )
        = $input_data->dims();
    my $output_data
        = ( $input_data - $mean->dummy(1, $number_of_patterns) )
        / $stdev->dummy(1, $number_of_patterns);

    return ( $output_data, $mean, $stdev );
}
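
For instance, the subroutine can be exercised on a tiny data set of two variables and three patterns like this (my example, not part of the original snippet):

my $data = pdl( [ [ 1, 10 ], [ 2, 20 ], [ 3, 30 ] ] );
my ( $normalized, $mean, $stdev ) = normalize( $data );
print "normalized: $normalized\n";
print "mean: $mean\nstdev: $stdev\n";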

The GPL v. 3 is Here

June 30, 2007

The new version of the GNU GPL (General Public License) was unveiled on Friday, June 29, 2007. The complete text is available at the GNU GPL site. How many projects will switch to the new license? That is a question that only time can answer …

Generating cool fractals

June 16, 2007

This one is from the Free Software Magazine. Xavier Calbet wrote:

Whether you are a professional or amateur scientist, engineer or mathematician, if you need to make numerical calculations and plots quickly and easily, then PDL (Perl Data Language) is certainly one of the best free software tools to use. PDL has everything that similar high-level, proprietary, numerical calculation languages (like IDL or MATLAB) have. And it certainly comes with all the features you would expect to have in a numerical calculation package.

The full article is available at: Generating cool fractals.

Cheers,

Lino

Machine Learning Made Easy with Perl

June 15, 2007

That is the title of a session I am giving on July 25, 2007 at OSCON. Here is the abstract:

Machine learning is concerned with the development of algorithms and techniques that allow computers to “learn” from large data sets. This talk presents an overview of a number of machine learning techniques and the main configuration issues the participants need to understand to successfully deploy machine learning applications. The talk also covers three case studies in which we will use Perl scripts to solve real life problems:

  1. Medical decision support systems using support vector machines
  2. Exploratory financial data analysis using fuzzy clustering
  3. Pattern recognition in weather data using neural networks

This talk offers an intensive presentation of machine learning terminology, best practices, standard process, and strategy. Participants will get to know the techniques but, more importantly, they will learn when and why to use them. The talk is appropriate for educators and programmers who want to use machine learning in their own problem domains.

I will be posting more details about the session as we get closer to OSCON.

Cheers!

Lino

Microsoft Signs Patent Agreement with Another Linux Distributor. Is this the beginning of a trend?

June 14, 2007

Last year, it was Novell. Earlier this month, it was Xandros. Earlier this week, it was Linspire. Who will be next? And more importantly: is this the beginning of a trend?