Archive for October, 2007

October 31, 2007: 5:18 pm: Dan.Net, IBM Universe (U2)

So, the other day Bono called me and said “Hey, that new app you wrote is giving me some weird message about ‘corrupted memory’. It’s, like, totally Zoo Station. Should I call Bill?’. “Nah”, I replied, “it’s just U2 acting up again. It doesn’t get along so well with .Net”. “Yeah”, Bono replied”, “U2 is, like, so 1980s”.

That was Bob Bono, by the way, a technical support guy. The “U2″ we were griping about is an IBM database platform, once known by the even less apropos name of “Universe”. After buying the product several years ago IBM rechristened it in what was presumably an attempt to make it seem more modern. I don’t think they’re fooling anyone. Bono was being generous when he suggested that U2 is 20 years past its prime. This is a database system that stores all of its data as delimited strings, and whose native programming language, Pick Basic, makes heavy use of line numbers and GOTO statements and is intended for use in Telnet-based applications. U2 predates relational database technology, not to mention its namesake band.

Nonetheless, U2/Universe is still alive and kicking with a well entrenched competitive position. There is no quick or easy way to migrate the data to a modern database system, and converting Telnet-based Pick Basic code to something designed for a Windows IDE is unthinkable. So, some of us have the imposing challenge of getting U2 to play nice with .Net, a technology for which IBM has no great love and rather bare-boned support. If you are one such wretched soul, then I offer the following humble advice to get you started.

When I began my current project, a Google search revealed an assortment of 3rd party .NET drivers for U2, and an IBM implementation of OLEDB called UniOLEDB. In theory, the latter could be connected to Microsoft’s .Net Data Provider for OLEDB and .Net would happily accept it as a data source. Since the 3rd party solutions all charged significant runtime licenses (almost as much as the runtime license for the U2 database, itself), and our product is being sold to customers who would have to absorb that extra charge, the choice of the freebie UniOLEDB seemed like a no-brainer.

Yet, presumably someone was paying for the expensive 3rd party solutions, so I knew that UniOLEDB must have some serious drawbacks. By Googling UniOLEDB I was able to find a lot of questions, not so many answers, and a lot of grumbling about UniOLEDB’s deficiencies when compared to native Pick Basic coding or the old school Universe API: UniObjects.Net. (UniObjects doesn’t try to pretend that U2 is a real database with fields and relational joins and such, it just makes it a little easier to extract and parse the database’s delimited strings). But, hey, I reasoned, that didn’t necessarily mean that UniOLEDB wasn’t a viable platform. Most Universe programmers cut their teeth on Pick Basic and Telnet programming and would be instinctively averse to a Microsoft SQL-based API, and for all products there is a satisfied silent majority that doesn’t bother posting about how satisfied they are.

After a few months of getting the basic infrastructure ironed out, I wasn’t part of that satisfied majority. Reading from the database was reasonably fast and stable, but updates were hit-and-miss and the odd way that UniOLEDB handled “multivalues” was a major problem for coding productivity. (In case anyone who isn’t proficient in Universe has read this far, multivalues are a way of dividing database fields into sub-fields, and our database uses them extensively). IBM’s technical support seemed surprised that we were even trying to use .Net with the product. After my first bug report was met with silence for a few weeks, their techie apologized for the delay in responding and explained that he was having trouble finding a copy of Visual Studio.

In exasperation, I posted a question to the U2 User Group site asking if anyone had successfully deployed a VS2005 product using UniOLEDB. The answer, posted on the forum and in sent in private e-mails, was basically “don’t go there”. (And some of the private e-mails were kind enough to offer to rewrite our code using UniObjects.Net, for a nice hourly fee).

But we did go there, partly out of lack of affordable alternatives and partly out of bull-headed determination to leverage at least some of ADO.Net’s incredible productivity advantage over delimited strings. And now we are just about ready to deliver the product. I’ll have more to say about U2 and .Net in future postings, but here’s my top 10 list of suggestions for getting U2 and .Net to rock (well, maybe strum) together:

1. Use UniOLEDB for reading, but UniObjects.Net for writing. This one is a must (unless your users don’t mind corrupting the database now and then). I really, really hated to do it, because I really, really, like ADO.Net’s Dataset model, but I minimized the damage by only involving UniObjects.Net at the very end of the update process. Our code still uses Datasets to keep track of what records were added, updated or deleted, and we still call GetChanges to retrieve the list of updated records, and we still code SQL UPDATE, INSERT, and DELETE statements to handle the database updates. All I did was write some custom code to parse the SQL statements, look up the corresponding field attributes from the Universe dictionary, and handle the insert or update of those attributes (or deletion of records) using UniObjects.Net. If IBM ever provides a stable .Net API for handling updates (and they are currently beta testing a .Net driver for U2), it will be easy to knock UniObjects out of the picture and let the DataSet finish the job.

2. Don’t try to use the UniOLEDB model for multivalued fields — access them as delimited strings. This can be done on the read side by adding an “I-type” dictionary entry that returns these fields as a delimited string. On the update side, the code which does the low-level parsing of the SQL statements (see item #1) has to figure out which fields are multivalues and piece them together as delimited strings. The UniOLEDB model for multivalues is just plain weird — perhaps I’ll explain it in a future post.

3. Ensure that all of your code that accesses Dataset columns can handle nulls gracefully (UniOLEDB actually returns them as System.DBNull). In U2 databases any field can be null: string, integer, date, whatever. (Delimited strings, remember?) Even if nulls aren’t supposed to be there, the database won’t do anything to prevent them so you’ll have to code defensively.

4. Define the data type as specifically as possible in the dictionary entry. We use “S” type dictionary entries, so we’ll put a specifier in attribute 6 such as “CHAR,20″ for a 20-character field or “MTS” for a timestamp. (Actually, it’s almost always a CHAR type — see point #5). Without a typed dictionary entry, UniOLEDB tries to determine on the fly what data type to return when you read a field — if it finds integers in that field, it will return an Int32. If your customer happens to have some very large values in that field, UniOLEDB will return doubles. While it’s possible to wrap your .Net code in a layer that will convert whatever it gets from the database to the format you expect, that flexibility isn’t worth the performance hit.

5. Unless you have really clean databases, define your fields as CHARs, and convert the strings to their desired format in your .Net code. As a database, U2 considers everything a string, so type safety is completely out the window. If you define as field as a timestamp, and it contains something that UniOLEDB doesn’t recognize as a timestamp, it throws an Exception. (If you’re lucky — when faced with runtime problems UniOLEDB will sometimes just return a null without signalling an error).

6. Specify a static culture ID whenever you convert a string read from the U2 database. For example, a user with French regional settings will see an Exception if the application tries to convert “1.23″ using Decimal.Parse(strValue), since French regional settings use a comma as the decimal place. So, instead, use something like Decimal.Parse(strValue, new CultureInfo(“EN-US”)). The same rule applies when creating a string value to be inserted into the database: use decValue.ToString(new CultureInfo(“EN-US”))

7. Use F type U2 dictonary entries, not Q type. Q types don’t fully SQL statements executed through UniOLEDB.

8. Write code to automatically update the uci.config file with an entry for the server and account that the user is logging into. The format of these entries is pretty simple, and it sure beats having the users manually update these files everytime a new server name or account is encountered. And this approach allows you to set connection options on the fly, such as MAXFETCHBUFF. By default, the longest field that UniOLEDB can return is 8K. If any of your fields are longer than that (and they likely will be if you have multivalued fields and you’re following my advice in point #2), set MAXFETCHBUFF accordingly. This results in a buffer of that size being allocated on the server for each user session, which is why you may need to set MAXFETCHBUFF on the fly rather than statically.

9. In very large tables, SELECT using specific keys only. We have a table that at large clients is about 12G in size, and contains over 12 million records. If we try to retrieve a range of records using a SELECT statement like SELECT * WHERE KEY LIKE ABC%, the SELECT statements takes 20 minutes to complete (on our relatively new development server), even if it doesn’t return any records. Don’t even think about trying a statement like SELECT * WHERE NONKEYFIELD = ‘X’. To make this table workable, we instead write code to determine the specific keys that we need to retrieve, then use SQL like SELECT * WHERE KEY IN (‘A’, ‘B’, ‘C’). It can be a pain to develop an algorithm for figuring out what the exact keys are, but the records are then returned instantaneously.

10. Use the VS2005 Query Manager to access your Pick Tables. Universe graybeards know the Pick command line interface like the backs of their hands, so firing up Telnet to look through your data is a hard habit to break. But if UniOLEDB is good enough for your code then it’s good enough for you — c’mon, eat your own dog food! You can add each account of your database as a separate Data Connection in the Server Manager (more on how to do that in a future post), then right click on the Connection and select New Query. (At this point you’ll get the scary “corrupted memory” error that started this column — don’t worry, your memory is fine). Entering SQL commands will give you good practice at interacting with U2 using SQL, and having the results displayed in a grid makes U2 look, like, kind of 1990s.

Ooh, delimited strings in grid!
Ooh, delimited strings in grid!

Yes, after long months of trial and error and a little ingenuity, I’m pleased to really is possible to successfully develop a large application using U2 and .Net. (Successfully deploy? To be determined.) I hope that some of the above tips will help you avoid some of the more common pitfalls. This post barely scratches the surface, though, so stay tuned for lots more on this topic in the future.

October 24, 2007: 7:16 pm: Dan.Net

My current project is nearing its release date after months of paring down the bug list, so I was somewhat alarmed this morning when I fired up Outlook and found a bunch of early morning bug reports from the latest build. Panicky bug reports. Reports of message boxes popping up left and right, full of scary looking technical jargon. “What did you guys to the latest release?” they cried? “Don’t you test these things?”

Well, actually, we do test these things, and those scary message boxes were a direct result of how we test things. The guy who did the build had forgotten to switch Visual Studio’s Build settings from Debug to Release, so all the “Debug.Assert” messages that we use to warn coders about possible bugs were now menacing our internal users. (Well, actually they’re supposed to be “testers”, but that doesn’t stop them from bleating in alarm at any error message that pops up on the screen.)

Why so many Asserts? Why not a more subtle way of reporting test results, like a log file? We do use log files, and our testers/users send these log files with their bug reports. I’ve turned on logging on my own development PC as well, of course, but logging is nowhere near as useful. The Assert has the advantage of being a) an attention grabber, and b) an opportunity to use the debugger to see exactly what’s going on at the point that the Assert fails. As a coder I’m embarrassed to admit that even my attention needs to be grabbed, but some bugs just aren’t apparent from what happens on the screen.

I use Debug.Asserts in every Exception handler (natch), but also as a way of “defensive coding”. Almost any time an object is passed as a parameter or read from a data source, I use an Assert to confirm that it’s what I expected it to be: not null, within limits, and of the same size as corresponding lists and arrays. This practice has a collateral effect on my coding style, since thinking about everything that could go wrong ensures that I write code to gracefully handle these “unexpected” error conditions.

But again, why did so many Debug.Asserts pop up today, in a product that has most of the bugs shaken out of it? As it turns out, these particular Asserts called attention to some previously unreported bugs — bugs that resulted from differences between the data used by the testers and the coders, or by different habits that the groups developed when navigating through the user interface. Without the Asserts, the users wouldn’t necessarily notice that a cell or two were incorrect in a grid, and they wouldn’t bother to report that they occasionally had to click on a field a couple of times in order to give it focus. The scary message boxes this morning ensured that these subtle problems were finally reported.

Naturally, when I realized what had happened this morning we immediately released a new build with the Debug features turned off. But I’m tempted to “accidentally” release a Debug build from time to time in the future, just to make sure that the small bugs get reported.

October 20, 2007: 10:15 pm: Dan.Net

I was studying for .Net Certification today using Microsoft’s Self Study Kit for the 70-536 exam. In the chapter on Application Security, the book offers the following advice concerning Code Access Security (CAS) and the .Net Framework Configuration tool.: “As a developer, one of the first things you should do is adjust the permission set assigned to the My_Computer_Zone code group. By default, it’s set to Full Trust, which means any CAS statements in your applications will be completely ignored. Change this to the Everything permission set, which grants similar permissions but respects CAS statements in assemblies.

Wow, that’s weird. By default (in XP SP2), the .Net-specific security settings are turned off for applications launched from your own PC? Sure enough, when I checked the .Net Framework Configuration tool’s setting for the “My Computer” zone, the default was Full Trust. Admittedly, I had never come across CAS issues in my own .Net development to date, but since I wanted to experiment with having my apps run in more secure configurations I thought the book’s suggestion was a good one. I changed the setting to “Everything”, effectively activating CAS for all local applications. For some reason, it didn’t occur to me that this might impact one of the commercial .Net applications installed on my PC. Surely in this age of increased security awareness among programmers, widely used applications would have learned to get by with less than full permissions on the PC.

The next time I rebooted my PC, the ATI Catalyst Control Centre failed to start, displaying a “Command line interface has encountered a problem” error message.

ATI Catalyst error message
ATI Catalyst error message

Having done various tweaks to my OS settings recently, I didn’t connect this error with the change to the .Net Framework Configuration until I Googled the problem and found that the most common solution was to reinstall the .Net runtime. I tried changing the .Net permissions for the “My Computer” zone back to the default, and the ATI Catalyst Centre was happy again.

Configuring .Net Framework with Full Trust
Configuring .Net Framework with Full Trust

The .Net Framework Configuration tool allows you to assign specific permissions to specific applications, so rather than leaving CAS disabled by default I gave ATI’s CLI.exe assembly Full Trust. The average user, though, isn’t going to bother with such a measured response — if they somehow managed to find a way to turn on CAS, they’ll promptly turn it off again.

Given that such a widely used consumer application as ATI Catalyst Control Centre doesn’t play nice with CAS, and given that few users are going to figure out how to enable CAS in the first place, I wonder if there’s much point in learning how to make my applications work with CAS settings (other than to pass the certification exam, of course). Is this a case of Microsoft not doing enough to make developers aware of this technology, or is the technology so flawed that Microsoft is happy to let it be neglected?

October 17, 2007: 6:05 pm: DanSoftware Tools

As an inveterate note taker, I was quite interested by a recent Lifehacker article about Thomas Edison’s note taking system. I was particularly struck by one passage from the article:

Edison had an amazing memory. … Much of this is due to his system of notes. By writing everything down that he thought was worth writing, he was able to free himself of the burden of having to remember it

Well, I wish I could say that about myself: I don’t actually forget things, I just free myself of the burden of having to remember them. In truth, my memory is somewhat short of amazing, but I manage to more than compensate for this by keeping notes. You don’t need Edison’s ingenuity or a librarian’s discipline — just a few good habits and one amazing piece of software:

1. Google Desktop. I used to experiment with various systems for keeping track of my notes. Then along came Google Desktop, and all other notekeeping systems became unnecessary. It really does “free myself of the burden” of having to remember things. It indexes all my documents (text, Word, Excel, PDF), my Outlook mail (but, oddly, not Gmail), Web pages that I’ve looked at recently (using my browser cache). It even indexes documents that I’ve since deleted, often giving my enough information about the document that I don’t have to feel like a complete putz for having deleted it in the first place. I’ve tried some of the other desktop search engines, but I haven’t found any that cover as much ground and run as quickly as Google Desktop.

2. Don’t just take notes, index them. A lot of people bring notepads to meetings, some even use them, but I’m surprised at few people make good use of them. They’ll scrawl a few notes, flip the page, and generally forget that the notes exist before they’ve even forgotten the information that they made note of. Although it’s difficult to find the time during a busy day, I make a habit of typing my notes into some kind of Googleable document — the sooner the better, before I forget the details behind the words on my notepad. I generally put them in an e-mail: I’ll either send the e-mail to some of the other participants in the meeting or, if I don’t want to appear pushy, just e-mail it to myself.

3. Don’t say it, type it. I’m somewhat infamous around the office for having a strong preference to e-mail over the phone or even face-to-face conversation. I’ll often e-mail even the people who sit just outside my office rather than walking a few feet to their desks. It’s not that I hate talking to people (well, not everyone), but I love the fact that e-mail gives me a permanent record of both sides of the “conversation” without having to take notes. E-mail has other advantages too: you’ll generally get (and give) more complete and well-reasoned answers through e-mail, and it allows the other person to respond when their schedule allows rather than interrupt whatever they are doing to answer the phone.

4. Don’t touch that delete key. Assuming you have a nice, large hard drive, don’t be so quick to delete documents and e-mails once you’ve dealt with them. Problems and questions tend to come around again and again, especially if the people you work with don’t keep notes. It’s a great time-saver to be able to answer a question the second time around by forwarding an e-mail with the original answer. And, yeah, being a smarty-pants is kind of fun too.

5. Learn to use your IDE’s indexing capabilities. While Google Desktop does a fantastic job with most of my documents, it casts too wide a net to be of much use with my code. Fortunately, most current IDEs do a pretty good job of indexing the code in a project, going well beyond the basic “Find” tool that IDEs in the 90s offered. I rely pretty heavily on Visual Studio’s “Find All References” feature to locate a particular line of code — as long as I know a variable that it uses or a method that it calls, Find All References will turn it up. Regular expressions are a very powerful search tool that many programmers don’t think of using. If you can remember (or guess) some fragments of the code that you’re looking for, a regular expression will help you to go directly to the lines that contain all of the fragments.

I’ve always felt a little embarrassed at relying so heavily on note taking rather than a sharp memory, so I was delighted to learn that Edison owed much of his success to a similar discipline. According to the Lifehacker article, Edison organized his notes using a “combination of chronological and subject matter based systems”, but “often spent considerable time searching through these records looking for the key item”. By developing a few good habits and learning to use a desktop search tool, you can easily put together a system that Edison could only dream about.

October 13, 2007: 1:20 pm: DanSoftware Tools

Although most software developers use a fancy-schmancy IDE for coding, we all rely on text editors from time to time. Indeed, I doubt that I ever go through a working day without using a text editor for something: examining log files, opening e-mail attachments that Outlook doesn’t recognize, studying raw HTML pages for hidden fragments of ingenuity (or idiocy), or copying text to a temporary scratch pad. Most developers probably do these things without even thinking about it, which explains why most developers that I know still use Notepad.

It pains me to watch someone format text manually, repeatedly, mindlessly, time after time. Or to watch someone bring up 2 copies of Notepad and try to compare their contents: manually aligning them and trying to scroll them line-by-line. Or squint at the ASCII representation of some binary sequence and try to remember exactly which value that funny squiggly thing represents. It pains me to watch them, and it pains me to remember that not so long ago, I was one of them.

If you’re a developer or anyone else who frequently uses text files, and you still use Notepad, drop whatever you are doing and immediately get yourself a replacement like Notepad++. (PSPad is also well thought of, but Notepad++ is my personal tool of choice). Figure out how to do Search and Replace using regular expressions, how to switch to hex editing mode, how to split the window and do synchronized scrolling of the 2 panes.

Admittedly you’ll have to invest a bit of time and a lot of experimentation to master these features, since documentation is not Notepad++’s strongest feature. If you aren’t already familiar with regular expressions, then you’ll find a wealth of articles and even books out there to help you tackle this complex but extremely powerful feature. But trust me, the time you spend learning the basic toolset of Notepad++ will be well worth the effort.

Once you’ve got the hang of it, do your best to forget that there was once a time that you struggled by without these features. Be magnanimous and try to educate your friends. But be prepared to wince at the sight of those who don’t get it, who still turn to Notepad to edit text. Time after time. Without even thinking about it.