Kevin Nelson Marshall
Other entries:
« Another baby step.

A little while back I mentioned a quick hack I had done for Bowker to fix some truncated records in one of the production tables...and I finally found a few extra minutes to go ahead and share that quick/simple hack with you now.

First let me give you the quick background. Basically there's a web service that logs gobs of data submitted by a user (via a front end form on a web site). It logs this information in a CLOB field in the database as a raw XML content. In addition to sending the data off to the web service for storage (and later processing), the fron end code saves a local copy of each XML file (because we are in beta and wanted to be able to handle the specific type of problems this hack fixes).

On the backend, once a night, there is a process within Bowker that reads these XML records and does some processing on them (in an attempt to update a handful of other non-integrated systems and products). If the record can be processed by the batch program, the XML record is removed from the table...if it can't the record remains (and the user is notified via another batch system about required updates).

For an unknown reason at some point last week, all of the records remaining in the table were truncated at a certain character length. This of course meant that the XML within the fields was no longer valid XML, and that being the case the batch program and the front end services began to have all kinds of problems dealing with the records.

We needed to clear this problem up as quickly as possible! Simply deleting the truncated records from the table would actually solve the problem, but it would also mean that no more alerts would go out to the users and the data the users put in previously would not be saved (and available for fixes/updates) to them! What we really needed was to fix the trunctaion and restore the XML for each record.

Luckily we had been storing copies of the XML files on the server making the requests. So all I had to do was download those, loop through them figuring out just which ones we really wanted to re-run, and then update them in the database (that's what the below script does).

The keys to making this really easy are mostly related to the internal knowledge I have/had of the systems in place and the data I'm dealing with. For example, I knew the basic web service takes an XML file and stores it into the table (doing an update or an insert as needed)...so that meant I could just reuse that without having to rewrite that logic or deal with CLOB database issues in my local script. I also knew that inside the truncated xml records in the database there was a tag early on that listed the ISBN, and this value was unique to each file. It was early enough in each file that it had not been truncated, and it was unique enough that I could key off of it. So that meant I could use it to determine which records I really should recall the service for (I didn't want to put back in records that the batch system had already successfully processed and therefore deleted from the table).

With all of that in mind, writing a Perl script to run on my local machine (with an DBI connection to the Oracle database) was really pretty trival...and here it is (names and variables changed for security of course):


use DBI;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

$database = "MY_DB"; # the DB we want to work in
$username = "MY_USERNAME"; # the username we are accessing the DB with
$password = "MY_PASSWORD"; # the password for the user we are accessing the DB with
my $dbh = DBI->connect("dbi:Oracle:$database", $username, $password);

$dbh->{'LongTruncOk'} = 1;
$dbh->{'LongReadLen'} = 255;

my @files = glob("C:\\MY_LOCAL_FILEPATH\\*");
$reccount = 0;
foreach my $file(@files){
  if ($file =~ /isbnmemberwebservice/) {
    # this is a file we want to to check the contents of
    $reccount++;
    open (SDLFILE, $file);
    my $thisfile = "";
    while ($line = <SDLFILE> ) {
      $thisfile .= $line;
    }
    # get the ISBN out of this data
    my $isbn = "0";
    if ($thisfile =~ /<isbn13>(\d+)<\/isbn13>/) {
      $isbn = $1;
      my $sth = $dbh->prepare("SELECT count(0) FROM MY_SDL_STORAGE_TABLE where xml like '%" . $isbn . "%'") or die;
      $sth->execute() or die;
      while (my @data = $sth->fetchrow_array()) {
        if ($data[0] != 0) {
          $reccount++;
          $ua = LWP::UserAgent->new;
          my $req = POST "http://my_web_service_url_to_submit_data_to", [ xml => $thisfile ];
          $response = $ua->request($req)->{'_content'};
          print $reccount . ": " . $isbn . " record updated in MY_SDL_STORAGE_TABLE\n";
        } else {
          print $reccount . ": " . $isbn . " NO RECORD\n";
        }
      }
    }
  }
}

posted by Kevin Marshall on 2008-09-29 00:00:00+00

Subscribe »

BotFu feed with RSS reader

BotFu feed by Email


Search All Posts »

Blog Details »

This blog now includes 286 wonderfully exciting posts from 1 unique and very special writer!


Kevin Marshall - Who's That?

I'm just your basic programmer. I can't spell to save my life, I'm not the greatest story teller, and I often ramble on about nothing. This blog showcases all of that!

If you're bored drop me an email at info at falicon.com or view my outdated resume.


Stalk me on »

Twitter (@falicon) »
Delicious »
Digg »
Disqus »
Facebook »
Flickr »
FriendFeed »
Last.fm »
LinkedIn »
StumbleUpon »

Archives by Category »

(24) Code »
(5) ColdFusion »
(11) Database »
(7) Factor »
(286) General »
(9) JavaScript »
(15) Perl »
(13) PHP »
(17) Ruby »

Archives by Month »

(1) February 2010 »
(5) January 2010 »
(2) October 2009 »
(6) August 2009 »
(11) July 2009 »
(2) May 2009 »
(3) April 2009 »
(2) March 2009 »
(7) February 2009 »
(9) January 2009 »
(14) December 2008 »
(5) November 2008 »
(12) October 2008 »
(13) September 2008 »
(16) August 2008 »
(23) July 2008 »
(20) June 2008 »
(24) May 2008 »
(23) April 2008 »
(27) March 2008 »
(28) February 2008 »
(26) January 2008 »
(7) December 2007 »

Published Works »

Beginning Amazon's SimpleDB (Apress in dev.)
Pro Active Record (Apress 2007)
Web Services with Rails (O'Reilly 2006).

Contributed To »

Ruby Cookbook (O'Reilly 2006)
SQL Cookbook (O'Reilly 2005)
Various Reviews published in Computing Reviews

Free Code I've Created »

SimpleDB library in Python 3.0

Fantasy focused domains »

draftwizard.com
fantasy-football-draft.com
fantasyfootballkit.com
fantasyfootballquiz.com
hockeynotes.com
pegg.it
rosterhelp.com
sportsxml.com
statsfeed.com
supermug.com

Tech. focused domains »

factorcode.com
perlquiz.com
simpledb.info

Social Tool focused domains »

conversationlist.com
friendstat.us
fuzzypop.com
gawk.it
grou.pe
halfbite.com
jivegas.com
pu.ly
tagli.st
timelylinks.com
tym.ly
wow.ly

Utility focused domains »

fubnub.com

Other domains »

betaread.com
botfu.com
falicon.com
storyrank.com

Not yet live domains »

bar.ackoba.ma
basketballnotes.com
buddydirt.com
budrank.com
cakntoba.com
coachwizard.com
cointhief.com
ezbcs.com
falconsrule.com
fantasydeke.com
fantasyfootballrank.com
ffkit.com
footballnotes.com
footballpublishing.com
giggletweet.com
greentile.com
herobrawl.com
kacode.com
kickasscode.com
knowabout.it
leaguewizard.com
nfldraftnews.com
pa.ly
rorbe.com
slidepitch.com
startfail.com
survivorhub.com
tagli.st
thedfl.com
thescoutsreport.com
toptenify.com
tripacation.com
tweetwiki.com
umock.com

* Yes I realize I have a bit of an addiction to domain names, but I really do have specific ideas for each of the above.



This blog is powered by KickAssCode.