iBookX Knowledge Base Data Structure

Introduction - The Old Way

Sites originally developed in the mid-1990s, and the newer copycat sites, have a single, distinct record for each item listed.  You can think of those records as a collection of index cards, each containing something similar to the following seven data items:

  • Seller ID
  • Author
  • Title
  • Publication Information
  • Description
  • Keywords 
  • Price

This structure is very clean and simple.  The item record can be deleted when the item is sold or the seller withdraws it, just as an index card can be thrown away.  The Seller ID in the item record is used to link to a separate permanent record of the seller's contact information, terms of sale, etc.  Linking data like this is a good thing; it eliminates the storage of redundant data and allows the linked information to remain in the system even after the item record has been removed.
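
As a rough illustration of the index-card structure just described, the sketch below models a flat item record whose only link is the Seller ID.  All class names, field names, and sample values here are hypothetical; they are not taken from any actual site's schema.

```python
from dataclasses import dataclass


@dataclass
class Seller:
    """Permanent seller record (hypothetical fields)."""
    seller_id: int
    contact: str
    terms_of_sale: str


@dataclass
class OldStyleItem:
    """One 'index card': seven free-form fields, only seller_id is a link."""
    seller_id: int
    author: str
    title: str
    publication_info: str
    description: str
    keywords: str
    price: str  # free text in this sketch, since nothing is validated


# Seller records are stored once and looked up by ID.
sellers = {1: Seller(1, "seller@example.com", "Returns accepted within 10 days")}

items = [
    OldStyleItem(1, "Jane Doe", "An Example Book", "Example Press, 1990",
                 "2nd ed., hardcover, good condition", "example sample", "12.00"),
]


def seller_of(item: OldStyleItem) -> Seller:
    """Follow the Seller ID link to the permanent seller record."""
    return sellers[item.seller_id]


# When the item sells, its card is thrown away; the seller record remains.
sold = items.pop(0)
print(seller_of(sold).contact)
```

Note that the seller's contact details are never copied into the item record, which is the redundancy saving the paragraph above refers to.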

Typically, with this old-style structure, the Seller ID is the only validated field in the item record.  Any of the other data fields can contain virtually any information (valid or not) or be blank.  Since there's no validation, the user inputting the data could accidentally put the author's name in the title field, and the "system" has no way to trap the mistake.  Even barring untrapped data entry errors, the limited number of fields forces together data items that really should be tracked separately.  For example, when an item has multiple authors or editors, the single Author field must contain all of them.  And the Description field acts as the catch-all for edition, binding, size, condition, and comment data.

Unfortunately—while this clean, simple solution makes for a site that's easy to program and to maintain—it does not lend itself to efficient, productive searching, nor does it do anything to assure the quality of the data it contains and provides.  Often the adage "Garbage In, Garbage Out" is proven all too true.

The only relationship this old-style data structure "understands" is that between an item and the linked seller.  It could compare different items' data fields to each other to try to determine if the items are copies of the same book, but it can rarely be certain.  Due to its "flexibility" in accepting any information in any field, it can't even reliably retrieve all the listed copies of a book, much less determine how many copies of a particular edition are currently offered.
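
To make the problem concrete, here is a small sketch of a free-text title search over unvalidated records.  The three listings are invented for illustration and are meant to be copies of the same book, entered three different ways; the search function is an assumed, simplified stand-in for an old-style search, not any real site's code.

```python
# Three hypothetical listings of the same book, entered inconsistently.
listings = [
    {"author": "Doe, Jane", "title": "An Example Book",
     "description": "2nd ed., hardcover"},
    # Author and title accidentally swapped; nothing traps the mistake.
    {"author": "An Example Book", "title": "Jane Doe",
     "description": "hardcover, second edition"},
    # Title entered in a "sortable" form with the article moved to the end.
    {"author": "Jane Doe", "title": "Example Book, An",
     "description": "HC 2nd"},
]


def old_style_title_search(records, wanted):
    """Return records whose title field contains the wanted text."""
    wanted = wanted.lower()
    return [r for r in records if wanted in r["title"].lower()]


found = old_style_title_search(listings, "An Example Book")
print(len(found), "of", len(listings), "copies found")
```

Only the first listing is returned; the other two copies are effectively invisible to the buyer, and the edition information buried in the three differently worded descriptions cannot be counted at all.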

Whether you're a buyer looking for the item that doesn't show up in a search, or you're the seller of that item, that's bad news. 

Enter the iBookX Knowledge Base

The iBookX Knowledge Base trades simplicity of design and programming for significant improvements in efficiency, performance, data quality, productivity and—most importantly—usability.  We've put into practice our belief that the programmers should do the heavy lifting so the site's users don't have to.  Our design is not simple, but it is logical and elegant.  We have gone far beyond just linking item information to seller information; we've designed and developed a hierarchy of links and relationships that ultimately has allowed us to implement The Best Book Search Engine on the Web.  (Okay, we'll try to limit the bragging.)

At the highest level, we have taken the old-style item record and broken it down into linked Book, Edition, and Item records.  (See the diagram below.)  In this hierarchy, a Book may have multiple Editions, and in turn there may be multiple Items of any Edition.  Items of the same Edition all share a single Edition record, and Editions of the same Book share a single Book record.  The Book and Edition records are never deleted; however, they can be corrected.
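
The Book, Edition, and Item hierarchy can be sketched roughly as follows.  This is only an illustration under assumed names: the field names, the in-memory dictionaries, and the helper functions are hypothetical and do not describe the actual iBookX implementation.

```python
from dataclasses import dataclass


@dataclass
class Book:
    book_id: int
    note: str


@dataclass
class Edition:
    edition_id: int
    book_id: int      # link up to the shared Book record
    label: str


@dataclass
class Item:
    item_id: int
    edition_id: int   # link up to the shared Edition record
    seller_id: int
    condition: str
    price: float


books = {1: Book(1, "hypothetical book")}
editions = {
    10: Edition(10, 1, "first edition, hardcover"),
    11: Edition(11, 1, "second edition, paperback"),
}
items = {
    100: Item(100, 10, 7, "very good", 12.00),
    101: Item(101, 10, 8, "good", 8.50),
    102: Item(102, 11, 7, "fine", 30.00),
}


def editions_of(book_id):
    """All Editions that share the given Book record."""
    return [e for e in editions.values() if e.book_id == book_id]


def items_of(edition_id):
    """All Items currently listed for the given Edition."""
    return [i for i in items.values() if i.edition_id == edition_id]


def sell_item(item_id):
    """Remove only the Item; its Edition and Book records stay in place."""
    del items[item_id]


sell_item(101)
print(len(items_of(10)), "copy of edition 10 still listed")
```

Even if every Item of an Edition were sold, the Edition and Book records would still be there for the next seller to link to.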

This structure is what makes our PriceCompare capability possible.  Since every Item is of a specific Edition, we can easily and quickly search by that Edition and determine (and display) exactly how many copies are currently available and at what price range they are being offered.  We can do the same for any Book.  Item listings are never "lost" because the Book and Edition that they are copies of are known and validated.
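
A minimal sketch of the kind of summary described above, assuming each Item carries a validated link to its Edition and each Edition a link to its Book.  The data, the tuple layout, and the function names are hypothetical; the real PriceCompare feature may work quite differently.

```python
# Hypothetical links: edition_id -> book_id
edition_to_book = {10: 1, 11: 1}

# Hypothetical listings: (item_id, edition_id, price)
items = [
    (100, 10, 12.00),
    (101, 10, 8.50),
    (102, 11, 30.00),
]


def summarize(prices):
    """Return (copies available, lowest price, highest price)."""
    if not prices:
        return (0, None, None)
    return (len(prices), min(prices), max(prices))


def price_compare_by_edition(edition_id):
    """Summarize every Item linked to one Edition."""
    return summarize([p for _, e, p in items if e == edition_id])


def price_compare_by_book(book_id):
    """Summarize every Item of every Edition linked to one Book."""
    return summarize(
        [p for _, e, p in items if edition_to_book[e] == book_id]
    )


print(price_compare_by_edition(10))
print(price_compare_by_book(1))
```

Because the grouping key is a link rather than free text, no fuzzy comparison of descriptions is needed to decide which listings belong together.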

Additional Advantages & Benefits of our Knowledge Base Design

By linking a single Book record to an unlimited number of separate Author, Editor, and Title records, we have eliminated redundant data entry and storage while allowing each book to have multiple authors, editors, and titles, each individually searchable.  This design is what makes our SmartSearch capability possible.

Searching against an old-style author field that contains multiple authors presents problems.  For example, imagine you execute an old-style search for books by Michael Anthony.  You will get matches on every item with the two authors Piers Anthony & Michael Whelan, clearly not the items you're looking for.  However, with SmartSearch, you won't get the undesired matches because SmartSearch "understands" that Piers Anthony plus Michael Whelan does not equal Michael Anthony.
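
The difference can be sketched with two toy matching functions.  Both are simplified assumptions written for this illustration: the first imitates a word search over a single combined author field, and the second checks each linked author record separately.  Neither is the actual SmartSearch algorithm.

```python
def old_style_match(author_field, query):
    """Match if every query word appears anywhere in the combined field."""
    field_words = author_field.lower().replace("&", " ").split()
    return all(word in field_words for word in query.lower().split())


def linked_match(author_names, query):
    """Match only if a single author record contains every query word."""
    query_words = query.lower().split()
    return any(
        all(word in name.lower().split() for word in query_words)
        for name in author_names
    )


combined_field = "Piers Anthony & Michael Whelan"
linked_authors = ["Piers Anthony", "Michael Whelan"]

# The combined field produces a false hit for "Michael Anthony".
print(old_style_match(combined_field, "Michael Anthony"))

# Separate author records do not, but real authors are still found.
print(linked_match(linked_authors, "Michael Anthony"))
print(linked_match(linked_authors, "Piers Anthony"))
```

The false match in the first case comes entirely from mixing words that belong to different people, which is impossible once each author has a record of their own.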

Linking a Book record to Title records (rather than having the Title as a field in the Book record) allows us to recognize that the same book may have alternate titles, including translations.  Furthermore, SmartSearch is then able to find all copies of that book when searching by any of its titles.
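
As a sketch of the idea, assume Title records are stored separately and each one points at a Book.  The titles, IDs, and function names below are invented for illustration only.

```python
# Hypothetical Title records: (title_id, title_text, book_id)
titles = [
    (1, "An Example Book", 1),
    (2, "Un Livre Exemple", 1),   # assumed translated title of book 1
    (3, "Another Book", 2),
]


def books_with_title(query):
    """Return the IDs of every Book linked to a matching Title record."""
    query = query.lower()
    return {book_id for _, text, book_id in titles if query in text.lower()}


def titles_of(book_id):
    """Return every title, original or alternate, linked to one Book."""
    return [text for _, text, b in titles if b == book_id]


# Searching by the translated title leads to the same Book record...
print(books_with_title("livre"))

# ...from which all of its titles, and so all of its copies, are reachable.
print(titles_of(1))
```

A search by either title resolves to the same Book ID, so copies listed under one title are not hidden from a buyer who knows the book by another.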

Additionally, having just one record per Title and Author means the name and title spelling will always be consistent.  (And there's only one record that needs to be updated if a spelling error is found.)

Finally, since Book and Edition records are shared and never deleted, a bookseller does not have to enter any of that Book or Edition information for any Item that they add of a "known" edition; they only have to provide the Item-specific information (features, condition and price).  Data entry time and effort are reduced while data quality and usability are increased.
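
A rough sketch of what listing a copy of a "known" edition might look like, assuming the seller supplies only an Edition ID plus the Item-specific fields.  The function name, field names, and validation rules are hypothetical and are not the actual iBookX data entry interface.

```python
# Hypothetical shared Edition records, already entered and validated.
editions = {10: {"book_id": 1, "label": "second edition, hardcover"}}

items = []


def add_item(seller_id, edition_id, features, condition, price):
    """List a copy of a known Edition using only Item-specific data."""
    if edition_id not in editions:
        raise KeyError(f"unknown edition {edition_id}")
    if price <= 0:
        raise ValueError("price must be positive")
    item = {
        "item_id": len(items) + 1,
        "seller_id": seller_id,
        "edition_id": edition_id,   # Book and Edition data come via this link
        "features": features,
        "condition": condition,
        "price": price,
    }
    items.append(item)
    return item


new_item = add_item(7, 10, "signed by author", "very good", 25.00)
print(editions[new_item["edition_id"]]["label"])
```

The seller never retypes the author, title, or publication details, so those fields cannot drift from one listing to the next.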

 