.KEYWORD ppeditorial0301
.FLYINGHEAD FROM THE EDITOR-IN-CHIEF
.TITLE I love it when a plan comes together
.DEPT
.SUMMARY All three ZATZ publications are now equipped with search engines so that you can more easily access the vast wealth of resources stored in our back issues. In this month’s editorial, Editor-in-Chief David Gewirtz details the long struggle that went into making this possible.
.AUTHOR David Gewirtz
Every so often, there’s a catch phrase from the movies or television that just sticks in your mind and seems absolutely appropriate for certain situations. Arnold Schwarzenegger made "I’ll be back" famous. Then, of course, there are Clint Eastwood’s immortal words, "Go ahead. Make my day."
For me, the one that often seems to resonate just right is, "I love it when a plan comes together." I picture Colonel John "Hannibal" Smith of the A-Team, perfectly played by George Peppard with a sly smirk and a cigar jutting jauntily out of his mouth. He’d remark, "I love it when a plan comes together," right after a tough battle, where somehow he’d manage to make everything work, and the team, of course, came out victorious.
So, I gotta tell ya, I love it when a plan comes together.
PalmPower’s search engine is up. So are search engines for DominoPower and PalmPower’s Enterprise Edition.
Frankly, it was a lot harder to pull together a working search engine for ZATZ than I’d originally thought. First, we tried out much of the off-the-shelf software that was available. However, because our publications are generated via a content management system, only some parts of our Web site are suitable for indexing. For example, we wanted to index each article, but we didn’t want to index the alternate forms of the articles (the EasyPrint or wireless versions). We also didn’t want to index our table of contents pages, mastheads, or other non-relevant content.
.CALLOUT You now have, at your fingertips, the single largest searchable archive of original, edited, Palm-related content anywhere on the planet.
In theory, there’s something called a "robots exclusion standard" that’s supposed to tell search engine spiders which directories they may index and which they should ignore. In practice, many of the off-the-shelf products ignored all or part of the standard.
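For the curious, here’s a minimal sketch of what honoring the standard looks like from the spider’s side, using the robots.txt parser in Python’s standard library. The host name, directory names, and spider name are made up for illustration; they aren’t the actual ZATZ layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /easyprint/
Disallow: /wireless/
Disallow: /toc/
"""


def build_parser(robots_txt):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_index(parser, url, agent="example-spider"):
    """Return True if the rules allow this (hypothetical) spider to fetch url."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for url in (
        "http://www.example.com/articles/story.html",
        "http://www.example.com/easyprint/story.html",
    ):
        print(url, may_index(rules, url))
```

A well-behaved spider would run a check like may_index before fetching each page and skip anything the rules disallow.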
So then we looked at the high-end commercial solutions, like those from Inktomi and Alta Vista. But most of the high-end suppliers charge a great deal for their products, including an increasing fee based on the number of searches and pages indexed. One of the reasons we’re a successful "dot-com" is we manage our expenditures, and most of these products were way too rich for our blood.
There were also a bunch of search engine products available free on the Web. One of the best is available at Atomz.com (at http://www.atomz.com). For a small number of searches (for example, if you’re running your own personal Web site), you can use their engine for free. All they ask is that you include their logo (and, of course, their ads) on the results pages. While a service like this isn’t appropriate for us at ZATZ, I’d certainly recommend them if you’re building your own pages and want a quick, easy, and free search tool.
For me, though, there had to be a better way.
Generally, when it comes to operating Web servers, whenever you say, "there’s got to be a better way," there usually is. And it’s usually Linux.
I need to be honest. I’m not a fanatical Linux fan. The underlying architecture of the system is really well done, but if you look at any Linux distribution, including the vaunted Red Hat distribution, it seems like this was an operating system put together by a committee of crazed monkeys. Let me give you three examples.
First, if you install a typical Linux distribution, it’ll immediately configure properly for just about any Ethernet device, without needing drivers or any special configuration. Yet Linux makes a big fuss about which monitor you’re using, and often you need to hand-tweak refresh rates, and even then the display never seems to center properly on screen.
Second, if you install a typical Linux distribution, you’ll get not one user interface, but a bunch of user interfaces. In fact, not only will you get a bunch of user interfaces, but also they’re often all loaded at the same time. And not only are they all loaded at the same time, but also different sets of menu items from the Linux equivalent of the Start menu load different programs depending on the interface. It seems like the Linux-ites couldn’t decide on one approach, so they decided to throw it all into the mix. It’s rather confusing at times.
Finally, let’s assume you want to allow another FTP (File Transfer Protocol) user to use your computer. There’s a file called /etc/ftpusers that handles this. Logic would say that if you wanted to give "david" access to your server, you’d stick "david" in /etc/ftpusers. But this is Linux, and logic is the last thing you need. It turns out that the file /etc/ftpusers doesn’t contain FTP users. Oh, no! That’d be way too easy. Instead, the file /etc/ftpusers contains a list of people who are not allowed to use FTP. When I’m 120 years old and looking back on my life, I’ll still be able to remember the five hours of my life I gave up to this particular brand of Linux perversity.
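The backwards logic is easy to see in a few lines of Python. This is only a sketch of the deny-list idea: the sample account names are invented, the comment handling is an assumption, and a real FTP server applies other checks as well.

```python
# Hypothetical contents of an ftpusers-style file, invented for illustration.
SAMPLE_FTPUSERS = """\
# accounts listed here are refused FTP access
root
bin
daemon
"""


def parse_ftpusers(text):
    """Return the set of account names listed in ftpusers-style text."""
    denied = set()
    for line in text.splitlines():
        # Assumption: anything after '#' is a comment; blank lines are skipped.
        name = line.split("#", 1)[0].strip()
        if name:
            denied.add(name)
    return denied


def ftp_login_allowed(user, denied):
    """A user gets in only if the name is NOT in the file."""
    return user not in denied


if __name__ == "__main__":
    denied = parse_ftpusers(SAMPLE_FTPUSERS)
    for user in ("david", "root"):
        print(user, ftp_login_allowed(user, denied))
```

In other words, adding "david" to the file is exactly how you’d lock david out.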
The fact is, Linux is quirky. If you know all the quirks, Linux will make sense. If you’re a Linux fanatic, don’t go sending me mail; you know I’m right.
But Linux is also very, very powerful, very, very flexible, and very, very inexpensive (other than the cost of hiring someone who understands all the quirks).
So I figured that if we wanted a powerful, flexible search engine and we didn’t want to pay a ton for it, Linux had to have the answer. And, of course, it did.
So, after finally getting Red Hat Linux 7.0 installed, I found the engine itself. I decided that we’d use something called ht:\//Dig (and yes, the weird punctuation is part of ht:\//Dig’s charm). The ht:\//Dig program is totally configurable, way the heck fast, and charmingly free. Also, special thanks to all the helpful people on the htdig.org mailing list for helping me figure out how to integrate PHP and ht:\//Dig.
You can use ht:\//Dig straight, meaning without any further goodies, but I wanted to be able to dynamically load ads and our ZATZ bar on each page of search results. It turned out that meant I had to find a server-side scripting tool that’d integrate with ht:\//Dig. At first, I looked at Perl (Practical Extraction and Report Language), but while Perl can do just about anything, its syntax is a bit too cryptic for my tastes.
So, instead, I found a language called PHP. PHP stands for "PHP: Hypertext Preprocessor," a naming redundancy that’s a favorite in-joke of a certain cut of programmer. The idea is that the acronym itself is the first word of what it stands for, so, for example, MINCE stands for "MINCE Is Not Complete EMACS," and some folks even claim that Linux stands for "Linux is Not UNIX."
Yeah, well, at least the software’s good.
In any case, PHP is a very powerful scripting language that integrates into the Linux environment. Of course, to do what I needed to do, first I had to learn a new programming language. The things I do to make our readers happy!
Finally, of course, we needed a Web server running on the Linux machine. Apache, perhaps the most popular Web server on the Web, was the obvious choice. Like all other aspects of Linux, it was a little funky to use at first, but by the tenth reading of the O’Reilly book, I figured out the basics.
Eventually, I got it all together, and so now you can search our magazines. When you enter a request, you’re talking to Apache, which sends the request to PHP, which asks for a list of matches from ht:\//Dig, which then searches a series of index files. The ht:\//Dig program returns those matches to PHP, which then cleans them up so they look better, grabs an ad banner and the ZATZ bar, and formats a series of search results pages.
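Here’s that chain boiled down to a rough Python sketch. It isn’t the PHP that runs on the ZATZ servers: the tiny in-memory index stands in for ht:\//Dig, and the function names, URLs, and placeholder markup are all hypothetical.

```python
from html import escape

# Hypothetical stand-in for the search engine's index: (title, URL) pairs.
SAMPLE_INDEX = [
    ("Getting more from HotSync", "http://www.example.com/articles/hotsync.html"),
    ("Ten tips for Graffiti", "http://www.example.com/articles/graffiti.html"),
]


def fake_search(query):
    """Pretend search backend: case-insensitive match on article titles."""
    needle = query.lower()
    return [(title, url) for title, url in SAMPLE_INDEX if needle in title.lower()]


def render_results_page(query, search, ad_html, bar_html):
    """Ask the backend for matches, then wrap them with the bar and an ad."""
    matches = search(query)
    # Escape everything that came from the user or the index before output.
    items = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(escape(url), escape(title))
        for title, url in matches
    )
    return "\n".join([
        bar_html,
        ad_html,
        "<h1>Results for {}</h1>".format(escape(query)),
        "<ul>",
        items,
        "</ul>",
    ])


if __name__ == "__main__":
    print(render_results_page("hotsync", fake_search, "<!-- ad banner -->", "<!-- site bar -->"))
```

The point of the split is that the search engine only has to return matches; everything about how the page looks lives in the scripting layer.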
Seriously, though, despite my frustrations with some of the aspects of Linux, the overall cost of this solution was very low, and the performance was very high. That, by the way, is a formula that more "dot-coms" should use if they want to stay in the game.
So, now, you can go to the home page of any of our magazines and search our back issues. You now have, at your fingertips, the single largest searchable archive of original, edited, Palm-related content anywhere on the planet.
Ah, but as always, there’s more. One of the things I wanted to do was to make sure you could search PalmPower from your own Web sites. In fact, it’s really easy. Just paste the following code into your Web page:
.BEGIN_CODE