Alzabo (version 0.64) - Alzabo


NAME

Alzabo - A data modelling tool and RDBMS-OO mapper


SYNOPSIS

  Cannot be summarized here.


DESCRIPTION

What is Alzabo?

Alzabo is a program and a suite of modules, with two core functions. Its first use is as a data modelling tool. Through either a schema creation interface or a perl program, you can create a set of schema, table, column, etc. objects to represent your data model. Alzabo is also capable of reverse engineering your data model from an existing system.

Its second function is as an RDBMS to object mapping system. Once you have created a schema, you can use the Alzabo::Runtime::Table and Alzabo::Runtime::Row classes to access its data. These classes offer a high level interface to common operations such as SQL SELECT, INSERT, DELETE, and UPDATE commands.

A higher level interface can be created through the use of the Alzabo::MethodMaker module. This module takes a schema object and auto-generates useful methods based on the tables, columns, and relationships it finds in the schema. The code it generates can be integrated with your own code quite easily.

To take it a step further, you could then aggregate a set of rows from different tables into a larger container object which could understand the logical relationship between these tables.

The Alzabo::Runtime::Row objects support the use of a caching system. Caching modules are included with the distribution. However, you may substitute any caching system you like provided it has the appropriate method interface.

What to Read?

Alzabo has a lot of documentation. If you are primarily interested in using Alzabo as an RDBMS-OO wrapper, much of the documentation can be skipped. This assumes that you will create your schema via the schema creation interface or via reverse engineering.

Here is the suggested reading order:

Alzabo - Alzabo concepts

Alzabo - Rows and cursors

Alzabo - How to use Alzabo

Alzabo - Exceptions

Alzabo - Usage Examples

The section for your RDBMS:

Alzabo and MySQL

Alzabo and PostgreSQL

The Alzabo::Runtime::Schema docs - The most important parts here are those related to loading a schema and connecting to a database. Also be sure to read about the join method.

The Alzabo::Runtime::Table docs - This contains most of the methods used to fetch rows from the database, as well as the insert method.

The Alzabo::Runtime::Row docs - The row objects contain the methods used to update, delete, and retrieve data from the database.

The Alzabo::Runtime::PotentialRow docs - Potential rows are objects that look like real rows but have not yet been inserted into the database.

The Alzabo::Runtime::Cursor docs - The most important part of the documentation here is the HANDLING ERRORS section.

The Alzabo::Runtime::RowCursor docs - A cursor object that returns only a single row.

The Alzabo::Runtime::JoinCursor docs - A cursor object that returns multiple rows at once.

The Alzabo::MethodMaker docs - One of the most useful parts of Alzabo. This module can be used to auto-generate methods based on the structure of your schema.

The Alzabo::ObjectCache docs - This describes how to select the caching modules you want to use. It contains a number of scenarios and describes how they are affected by caching. If you plan on using Alzabo in a multi-process environment (such as mod_perl) this is very important.

The Alzabo::Exceptions docs - Describes the nature of all the exceptions used in Alzabo.

The FAQ.

The quick reference - A quick reference for the various methods of the Alzabo objects.

Other areas of interest may include the Validating data, Using SQL functions, Referential integrity, and Changing the schema sections in this document.

Alzabo concepts

Instantiation

Every schema keeps track of whether it has been instantiated or not. A schema that is instantiated is one that exists in an RDBMS backend. A schema is marked as instantiated explicitly when you call its create method. It is also marked implicitly when it is created as the result of reverse engineering.

Instantiation has several effects. The most important thing to realize is that once a schema is instantiated, the way it generates SQL for itself changes. Before it is instantiated, if you ask it to generate SQL via the make_sql method, it will generate the set of SQL statements that are needed to create the schema in the RDBMS.

After it is instantiated, the schema will instead generate the SQL necessary to convert the version in the RDBMS backend to match the object's current state. This can be thought of as a SQL 'diff'.

While this feature is quite useful, it can be confusing too. The most surprising aspect of this is that if you create a schema via reverse engineering and then call the make_sql method, you will not get any SQL. This is because the schema knows that it is instantiated and it also knows that it is the same as the version in the RDBMS, so no SQL is necessary.

The way to deal with this is to call the set_instantiated method with a false value. Use this method with care.
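
The behavior described above can be illustrated with a toy model in plain Perl. This is only a sketch and not Alzabo's implementation: the make_sql subroutine shown here, the hash-based schema layout, and the MySQL-style statements it produces are all made up for illustration. Passing undef as the second argument plays the role of a schema that is not instantiated.

```perl
use strict;
use warnings;

# Toy model, not Alzabo code.  A schema is a hash of
# table name => { column name => column type }.  $in_db is undef for a
# schema that is not instantiated; otherwise it is the state that was
# last written to the (imaginary) database.
sub make_sql {
    my ( $schema, $in_db ) = @_;
    my @sql;
    for my $table ( sort keys %$schema ) {
        my $cols = $schema->{$table};
        if ( !$in_db || !$in_db->{$table} ) {
            # Nothing in the database yet: emit a full CREATE TABLE.
            my $defs = join ', ', map { "$_ $cols->{$_}" } sort keys %$cols;
            push @sql, "CREATE TABLE $table ($defs)";
            next;
        }
        # The table exists: emit only what differs (the SQL 'diff').
        # Dropped tables and columns are ignored in this sketch.
        for my $col ( sort keys %$cols ) {
            my $old = $in_db->{$table}{$col};
            if ( !defined $old ) {
                push @sql, "ALTER TABLE $table ADD COLUMN $col $cols->{$col}";
            }
            elsif ( $old ne $cols->{$col} ) {
                push @sql, "ALTER TABLE $table MODIFY COLUMN $col $cols->{$col}";
            }
        }
    }
    return @sql;
}

my %schema = ( Movie => { movie_id => 'TINYINT', title => 'VARCHAR(200)' } );
my %in_db  = ( Movie => { %{ $schema{Movie} } } );

my @create = make_sql( \%schema, undef );     # not instantiated
my @none   = make_sql( \%schema, \%in_db );   # instantiated, unchanged
$schema{Movie}{movie_id} = 'INT UNSIGNED';
my @diff   = make_sql( \%schema, \%in_db );   # instantiated, one change

print "$_\n" for @create, @diff;
```

The @none case corresponds to calling make_sql on a freshly reverse-engineered schema, and passing undef again corresponds to what happens after set_instantiated is called with a false value.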

Rows and cursors

In Alzabo, data is returned in the form of a row object. This object can be used to access the data for an individual row.

Unless you are retrieving a row via a unique identifier (usually its primary key), you will be given a cursor object. This is quite similar to how DBI uses statement handles and is done for similar reasons.
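
The cursor idiom can be sketched in plain Perl. The ToyCursor class below is hypothetical and only mimics the shape of the interface (a next method that returns one row per call, then a false value). It is not the Alzabo::Runtime::Cursor implementation, and the rows are made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a cursor: it hands out one row per call to
# next() and returns undef once the rows are exhausted.
package ToyCursor;

sub new {
    my ( $class, @rows ) = @_;
    return bless { rows => [@rows], pos => 0 }, $class;
}

sub next {
    my $self = shift;
    return if $self->{pos} >= @{ $self->{rows} };
    return $self->{rows}[ $self->{pos}++ ];
}

package main;

# Sample data only; a real cursor would fetch rows from the database.
my $cursor = ToyCursor->new(
    { title => 'Movie A', release_year => 1999 },
    { title => 'Movie B', release_year => 2001 },
);

my @titles;
while ( my $row = $cursor->next ) {
    push @titles, $row->{title};
    print "$row->{title} released in $row->{release_year}\n";
}
```

As with a DBI statement handle, the caller sees one row at a time instead of the whole result set at once.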

How to use Alzabo

The first thing you'll want to do is create a schema. The easiest way to do this is via the included web based schema creation interface, which requires the HTML::Mason package from CPAN (www.cpan.org) to run.

This interface can be installed during the normal installation process, and you will be prompted as to whether or not you want to use it.

The other way to create a schema is via a perl script. Here's the beginning of such a script:

  use Alzabo::Create::Schema;
  eval
  {
      my $s = Alzabo::Create::Schema->new( name => 'foo',
                                           rdbms => 'MySQL' );
      my $table = $s->make_table( name => 'some_table' );
      my $a_col = $table->make_column( name => 'a_column',
                                       type => 'int',
                                       nullable => 0,
                                       sequenced => 0,
                                       attributes => [ 'unsigned' ] );
      $table->add_primary_key($a_col);
      my $b_col = $table->make_column( name => 'b_column',
                                       type => 'varchar',
                                       length => 240,
                                       nullable => 0 );
      $table->make_index( columns => [ { column => $b_col,
                                         prefix => 10 } ] );
      ...
      $s->save_to_file;
  };
  if ( my $error = $@ )
  {
      # handle the exception here
  }

Exceptions

Alzabo uses exceptions as its error reporting mechanism. This means that pretty much all calls to its methods should be wrapped in eval{}. This is less onerous than it sounds. In general, there's no reason not to wrap all of your calls in one eval, rather than each one in a separate eval. Then at the end of the block simply check the value of $@. See the code of the included HTML::Mason based interface for examples.

Also see the Alzabo::Exceptions documentation, which lists all of the different exceptions used by Alzabo.

It's important to note that some methods (such as the driver's rollback method) may use eval internally. This means that if you intend to use them as part of the cleanup after an exception, you may need to store the original exception first, as $@ will be overwritten at the next eval.

In addition, some methods you might use during cleanup can throw exceptions of their own.

This is the point where I start wishing Perl had a real exception handling mechanism built into the language.
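
A minimal sketch of this pattern in plain Perl follows. The rollback and run_with_cleanup subroutines are hypothetical stand-ins written for this example, not Alzabo methods. The point is only the order of operations: copy $@ first, then run the cleanup inside its own eval.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a cleanup method, such as a driver's rollback,
# that uses eval internally.  A successful eval resets $@ to ''.
sub rollback {
    eval { 1 };
    return 1;
}

# Runs $work inside one eval.  On failure, the exception is copied out of
# $@ before cleanup runs, and cleanup gets its own eval because it may
# throw too.  Returns the original error, or undef on success.
sub run_with_cleanup {
    my ($work) = @_;
    return undef if eval { $work->(); 1 };

    my $error = $@;    # save first: the cleanup below overwrites $@
    my $cleanup_ok = eval { rollback(); 1 };
    warn "cleanup failed: $@" unless $cleanup_ok;

    return $error;
}

my $error   = run_with_cleanup( sub { die "insert failed\n" } );
my $success = run_with_cleanup( sub { 1 } );
print "caught: $error" if defined $error;
```

Having the eval block end in a true value and testing the eval's return value, rather than testing $@ directly, is a common way to avoid being misled by a $@ that was reset in the meantime.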

Usage Examples

Alzabo is a powerful tool but, as with many powerful tools, it can also be a bit overwhelming at first. The easiest way to understand some of its basic capabilities is through some examples. Let's first assume that you've created the following schema:

  TABLE: Movie
  movie_id                 tinyint      -- primary key
  title                    varchar(200)
  release_year             year
  TABLE: Person
  person_id                tinyint      -- primary key
  name                     varchar(200)
  birthdate                date
  birthplace_location_id   tinyint      -- foreign key to location
  TABLE: Job
  job_id                   tinyint      -- primary key
  job                      varchar(200) -- something like 'actor' or 'director'
  TABLE: Credit
  movie_id                 tinyint      -- primary key part 1, foreign key to movie
  person_id                tinyint      -- primary key part 2, foreign key to person
  job_id                   tinyint      -- primary key part 3, foreign key to job
  TABLE: Location
  location_id              tinyint      -- primary key
  location                 varchar(200) -- 'New York City' or 'USA'
  parent_location_id       tinyint      -- foreign key to location

This is a vastly scaled down version of the 90+ table database that Alzabo was written to support.

Fetching data

First of all, let's do something simple. Let's assume I have a person_id value and I want to find all the movies that they were in and print the title, year of release, and the job they did in the movie. Here's what it looks like:

  my $schema = Alzabo::Runtime::Schema->load_from_file( name => 'movies' );
  my $person_t = $schema->table('Person');
  my $credit_t = $schema->table('Credit');
  my $movie_t  = $schema->table('Movie');
  my $job_t    = $schema->table('Job');
  # returns a row representing this person.
  my $person = $person_t->row_by_pk( pk => 42 );
  # all the rows in the credit table that have the person_id of 42.
  my $cursor = $person->rows_by_foreign_key( foreign_key =>
                                             $person_t->foreign_keys_by_table($credit_t) );
  print $person->select('name'), " was in the following films:\n\n";
  while (my $credit = $cursor->next)
  {
      # rows_by_foreign_key returns a RowCursor object.  We immediately
      # call its next method, knowing it will only have one row (if
      # it doesn't then our referential integrity is in trouble!)
      my $movie =
          $credit->rows_by_foreign_key( foreign_key =>
                                        $credit_t->foreign_keys_by_table($movie_t) )->next;
      my $job =
          $credit->rows_by_foreign_key( foreign_key =>
                                        $credit_t->foreign_keys_by_table($job_t) )->next;
      print $movie->select('title'), " released in ", $movie->select('release_year'), "\n";
      print '  ', $job->select('job'), "\n";
  }

A more sophisticated version of this code would take into account that a person can do more than one job in the same movie.

The method names are admittedly verbose but the end result code is significantly simpler to read than the equivalent using raw SQL and DBI calls.

Let's redo the example using Alzabo::MethodMaker:

  # I'm assuming that the pluralize_english subroutine pluralizes
  # things as one would expect.
  use Alzabo::MethodMaker( schema      => 'movies',
                           all         => 1,
                           name_maker  => \&method_namer );
  my $schema = Alzabo::Runtime::Schema->load_from_file( name => 'movies' );
  # instantiates a row representing this person.
  my $person = $schema->Person->row_by_pk( pk => 42 );
  # all the rows in the credit table that have the person_id of 42.
  my $cursor = $person->Credits;
  print $person->name, " was in the following films:\n\n";
  while (my $credit = $cursor->next)
  {
      my $movie = $credit->Movie;
      my $job = $credit->Job;
      print $movie->title, " released in ", $movie->release_year, "\n";
      print '  ', $job->job, "\n";
  }

Validating data

Let's assume that we've been passed a hash of values representing an update to the location table. Here's a way of making sure that this update won't lead to a loop in terms of the parent/child relationships.

  sub update_location
  {
      my $self = shift; # this is the row object
      my %data = @_;
      if ( $data{parent_location_id} )
      {
          my $parent_location_id = $data{parent_location_id};
          my $location_t = $schema->table('Location');
          while ( my $location = eval { $location_t->row_by_pk( pk => $parent_location_id ) } )
          {
              die "Insert into location would create loop"
                  if $location->select('parent_location_id') == $data{location_id};
              $parent_location_id = $location->select('parent_location_id');
          }
      }
  }

Once again, let's rewrite the code to use Alzabo::MethodMaker:

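  sub update_location
  {
      my $self = shift; # this is the row object
      my %data = @_;
      if ( $data{parent_location_id} )
      {
          # Start at the proposed parent and walk up the chain.  The loop
          # variable is reassigned (not redeclared with "my") so that each
          # pass really moves one level up.
          my $location =
              eval { $schema->Location->row_by_pk( pk => $data{parent_location_id} ) };
          while ($location)
          {
              die "Insert into location would create loop"
                  if $location->parent_location_id == $data{location_id};
              $location = eval { $location->parent };
          }
      }
  }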
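
The check itself does not depend on Alzabo and can be sketched on a plain hash that maps each location_id to its parent_location_id. The would_create_loop subroutine and the sample data are hypothetical. As a deliberate difference from the methods above, this sketch also rejects making a location its own parent, and it stops if the existing data already contains a loop.

```perl
use strict;
use warnings;

# Returns 1 if making $new_parent_id the parent of $location_id would
# create a loop, 0 otherwise.  $parent_of maps
# location_id => parent_location_id, with undef marking a top-level
# location.
sub would_create_loop {
    my ( $parent_of, $location_id, $new_parent_id ) = @_;
    my %seen;
    my $id = $new_parent_id;
    while ( defined $id ) {
        return 1 if $id == $location_id;    # we walked back to ourselves
        last if $seen{$id}++;               # existing loop elsewhere: stop
        $id = $parent_of->{$id};
    }
    return 0;
}

# Sample data: USA (1) is the parent of New York State (2), which is the
# parent of New York City (3).
my %parent_of = ( 1 => undef, 2 => 1, 3 => 2 );

# Trying to put USA under New York City, then New York City under USA.
print would_create_loop( \%parent_of, 1, 3 ) ? "loop\n" : "ok\n";
print would_create_loop( \%parent_of, 3, 1 ) ? "loop\n" : "ok\n";
```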

Using SQL functions

Each subclass of Alzabo::SQLMaker is capable of exporting functions that allow you to use all the SQL functions that your RDBMS provides. These functions are normal Perl functions. They take as arguments normal scalars (strings and numbers), Alzabo::Column objects, or the return value of another SQL function. They may be used to select data via the select and function methods in both the Alzabo::Runtime::Table and Alzabo::Runtime::Schema classes. They may also be used as part of updates, inserts, and where clauses, or any other place that is appropriate.

Examples:

 use Alzabo::SQLMaker::MySQL qw(MAX NOW PI LENGTH);
 my $max = $table->function( function => MAX( $table->column('budget') ),
                             where => [ $table->column('country'), '=', 'USA' ] );
 $table->insert( values => { create_date => NOW() } );
 $row->update( pi => PI() );
 # rows that have already expired
 my $expired = $table->rows_where( where =>
                                   [ $table->column('expire_date'), '<=', NOW() ] );
 # rows with a password of five characters or fewer
 my $short = $table->rows_where( where =>
                                 [ LENGTH( $table->column('password') ), '<=', 5 ] );

The documentation for the Alzabo::SQLMaker subclass for your RDBMS will contain a detailed list of all exportable functions.

Changing the schema

In MySQL, there are several integer types. The type TINYINT can hold values from -128 to 127. But what if we have more than 127 movies? And if that's the case we might have more than 127 people too.

For safety's sake, it might be best to make all of the primary key integer columns INT columns instead. And while we're at it we want to make them UNSIGNED as well, as we don't need to insert negative numbers into these columns.
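
The ranges involved follow from the storage size: an n-bit signed integer holds values from -2^(n-1) to 2^(n-1) - 1, and an unsigned one holds values from 0 to 2^n - 1. Here is a quick sketch of that arithmetic, assuming MySQL's documented sizes of 8 bits for TINYINT and 32 bits for INT. The int_range helper is made up for this example.

```perl
use strict;
use warnings;

# Returns (min, max) for an integer column of the given width in bits.
sub int_range {
    my ( $bits, $unsigned ) = @_;
    return $unsigned
        ? ( 0, 2**$bits - 1 )
        : ( -( 2**( $bits - 1 ) ), 2**( $bits - 1 ) - 1 );
}

for my $type ( [ 'TINYINT', 8, 0 ], [ 'INT', 32, 0 ], [ 'INT UNSIGNED', 32, 1 ] ) {
    my ( $name, $bits, $unsigned ) = @$type;
    my ( $min, $max ) = int_range( $bits, $unsigned );
    printf "%-12s %s to %s\n", $name, $min, $max;
}
```

Making the column unsigned doubles the number of usable positive keys at no extra storage cost, which is why it suits primary keys that never hold negative values.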

You could break out the RDBMS manual (because you probably forgot the exact ALTER TABLE syntax you'll need). Or you could use Alzabo. Note that this time it is an Alzabo::Create::Schema object, not Alzabo::Runtime::Schema.

  my $schema = Alzabo::Create::Schema->load_from_file( name => 'movies' );
  foreach my $t ( $schema->tables )
  {
      foreach my $c ( $t->columns )
      {
           if ( $c->is_primary_key and lc $c->type eq 'tinyint' )
           {
                $c->set_type('int');
                $c->add_attribute('unsigned');
           }
      }
  }
  $schema->create( user => 'user', password => 'password' );
  $schema->save_to_file;

Multiple RDBMS Support

Alzabo aims to be as cross-platform as possible. To that end, RDBMS specific operations are contained in several module hierarchies. The goal here is to isolate RDBMS-specific behavior and try to provide generic wrappers around it, inasmuch as is possible.

The first, the Alzabo::Driver::* hierarchy, is used to handle communication with the database. It uses DBI and the appropriate DBD::* module to handle communications. It provides a higher level of abstraction than DBI, requiring that the RDBMS specific modules implement methods to do such things as create databases or return the next value in a sequence.

The second, the Alzabo::RDBMSRules::* hierarchy, is used during schema creation in order to validate user input such as schema and table names. It also generates SQL to create the database or turn one schema into another (sort of a SQL diff). Finally, it also handles reverse engineering an existing database.

The third, the Alzabo::SQLMaker::* hierarchy, is used to generate SQL and handle bound parameters for select, insert, update, and delete operations.

The RDBMS to be used is specified when creating the schema. Currently, there is no easy way to convert a schema from one RDBMS to another, though this is a future goal.

Referential integrity

By default, Alzabo will maintain referential integrity in your database based on the relationships you have defined. This can be turned off via the Alzabo::Runtime::Schema->set_referential_integrity method.

Alzabo enforces these referential integrity rules:

Architecture

The general design of Alzabo is as follows.

There are objects representing the schema, which contains table objects. Table objects contain column, foreign key, and index objects. Column objects contain column definition objects. A single column definition may be shared by multiple columns, but has only one owner.

This is a diagram of these inheritance relationships:

  Alzabo::* (::Schema, ::Table, ::Column, ::ColumnDefinition, ::ForeignKey, ::Index)
                   /   \
                is parent to
                 /       \
 Alzabo::Create::*   Alzabo::Runtime::*

This is a diagram of how objects contain other objects:

                      Schema - makes--Alzabo::SQLMaker subclass object (many)
                     /      \
              contains       contains--Alzabo::Driver subclass object (1)
                  |                 \
               Table (0 or more)     Alzabo::RDBMSRules subclass object (1)
                /  \                  (* Alzabo::Create::Schema only)
               /    \
              contains--------------------
             /        \                   \
            /          \                   \
     ForeignKey      Column (0 or more)    Index (0 or more)
     (0 or more)       |
                    contains
                       |
                  ColumnDefinition (1)

Note that more than one column _may_ share a single definition object (this is explained in the Alzabo::Create::ColumnDefinition documentation). This is only relevant if you are writing a schema creation interface.

Other classes

Why the subdivision between Alzabo::*, Alzabo::Create::*, and Alzabo::Runtime::*?

There are several reasons for doing this:


SUPPORT

The Alzabo docs are conveniently located online at http://alzabo.sourceforge.net/docs/.

There is also a mailing list. You can sign up at http://lists.sourceforge.net/lists/listinfo/alzabo-general.

Please don't email me directly. Use the list instead so others can see your questions.


LICENSE

Copyright (c) 2000-2001 Dave Rolsky

All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


AUTHOR

Dave Rolsky, <autarch@urth.org>