Building ChemFinder in VB

Or: An Exercise in Visual Basic, MolServer, and CDAX

14 Jun 02


There are various ways to manage chemical data from Visual Basic.  One is to call the ChemFinder executable via Automation.  This gives you access to all ChemFinder objects, plus ChemFinder Automation Language, so you can not only manage the database but also execute menu commands and form-building operations.  However, for some projects this approach may not be suitable. It restricts you to a specific database arrangement, and invokes a lot of UI code you may not need.  For more information, see the ChemFinder SDK site.

A more direct channel to chemical data is MolServer, the chemical COM component.  MolServer can access structures in ChemFinder databases, SDFiles, collections of CDX files, Microsoft Office documents, and other sources; it can search them, update, delete, build databases, and export to other formats.  It is lightweight, with a fairly simple interface. Public documentation is evolving at the MolServer SDK site.

In this project, we build a little VB app which has many of the capabilities of ChemFinder, but uses only MolServer.  It is built with MolServer 7 and other components of ChemOffice 2002.  A future version will demonstrate MolServer 8 and CDAX 8.

Click to download VB project


VB App

The application is fairly unimaginative.  It goes by the unimaginative name of VBApp, and has an unimaginative interface with a little display form and a panel of buttons:

VBApp Main Form

Form elements and buttons are as follows:


Setup

The database setup for this application is the standard ChemFinder arrangement:  relational data is stored in an mdb file (or other ADO-accessible database), with structures stored separately in mst/msi files (or other MolServer source), and a column of numeric mol id's relating the two.  Relational data is managed by standard VB database mechanisms, structures managed by MolServer, and the two are kept in synch by the VB code.  A database like this can be accessed both from VB and from ChemFinder.

To begin: 

  1. You must have ChemOffice 7 installed.  This provides MolServer 7 and the ChemDraw 7 control.

  2. Download and unzip the project into a folder.  To download, click the link above or here.

  3. Copy ChemOffice\ChemFinder\Samples\cs_demo.* into the same folder.  The data control is hard-wired to look for cs_demo if a database is not otherwise specified, so having it available will prevent a startup glitch.

  4. Click Open and browse to the ChemFinder database (.cfw file) of your choice.  The file is processed by a command-line utility called formhdr.exe (included in the project folder), which extracts db, table, and mst paths from the cfw file.

  5. When you exit, the currently-open database info is saved to the Registry, and will be used the next time you run the program.  Settings are found under 

HKEY_CURRENT_USER\Software\VB and VBA Program Settings\MolServerTestApp

Three keys are stored: DBPath, pathname of mdb file or ADO connect string; Table, name of the table containing mol id's; MstPath, pathname of mst file or MolServer connect string.  You can edit these by hand if necessary.


How It Works

The application consists of two forms and three code modules, with the main code in MolForm.  At startup (Form_Load), the program finds connection info in the Registry or uses default values.  The database path is handed to the data control, and the mst path (or connect string) is used to open a MolServer Document object:

Dim ms As MolServer7.Document
 
Set ms = New MolServer7.Document
ms.Open mstname, 0, ""
nMols = ms.Count
 

The data control automatically fills the data-bound boxes when necessary, e.g. on opening or moving to a new record.  To fill the structure-related boxes, we watch for an event indicating that the Mol_ID has changed, then retrieve the associated molecule, formula, and molweight.  Note that this technique has a bonus side-effect: if you type a new value into the Mol_ID box, it automatically brings up the related structure instead of editing the ID, which is not allowed anyway.  The code boils down to:

Private Sub Mol_ID_Change()
 
   ' retrieve mol
   Dim molData As Variant
   Dim mol As MolServer7.Molecule
   Set mol = ms.GetMol(Mol_ID)
   Formula = mol.Formula
   Molweight = mol.Molweight
 
   ' put into ChemDraw control
   cbFormat = RegisterClipboardFormat("ChemDraw Interchange Format")
   molData = mol.DataObject.GetData(cbFormat)
   cdax.DataObject.SetData molData, cbFormat
  


Searching

To search, you begin by clicking Enter Query.  All this does is clear the form and set a flag indicating we are in query mode (thus preventing any edits from being stored into the database):

' clear all form boxes
ChemDrawCtl1.Objects.Clear
Formula = ""
Molweight = ""
Mol_ID = ""
Molname = ""
bQueryMode = True
 

When you click Find, the contents of the controls are transferred to a query object (a MolServer.SearchInfo), which is passed to the MolServer Search method.  This prepares the query but doesn't actually do the search until you call Start.  After that, you have the option of (a) repeatedly polling to see whether the search is complete, and posting intermediate results, or (b) calling WaitForCompletion so the search will run to completion before returning.  We've used the latter method.  The code for Find amounts to:

Dim si As New MolServer7.searchInfo
Dim msrch As MolServer7.Search
 
' get mol data from controls, set up search query
If ChemDrawCtl1.Objects.Count > 0 Then
   molData = cdax.DataObject.GetData(cbFormat)
   si.MolQuery.DataObject.SetData molData, cbFormat
End If
si.FmlaQuery = Formula
si.MolwtQuery = Molweight
 
' create search object and run it
Set msrch = ms.Search(si)
msrch.Start
msrch.WaitForCompletion (1000)
 
nHits = msrch.Hitlist.Count
Debug.Print "Hits = " & nHits

 

The result of the MolServer search is a hitlist of ID's.  To convert this into a recordset, we generate from it an IN clause, and use that in a SELECT statement:

' prepare a clause like IN (1,3,5,..) 
tstr = ""
For i = 0 To nHits - 1
   If tstr <> "" Then
      tstr = tstr & ","
   End If
   tstr = tstr & Str(msrch.Hitlist.At(i))
Next
sql = "Mol_ID in (" & tstr & ")"
 
' execute the SQL
fullsql = "select * from " & tblname & " where " & sql
DataCtrl.RecordSource = fullsql
DataCtrl.Refresh
 

After this operation, you are browsing the hits only.  You can iterate over them like this:

While Not DataCtrl.Recordset.EOF
   Set mol = ms.GetMol(Mol_ID)
 
   .. do something with mol
 
   DataCtrl.Recordset.MoveNext
Wend
 

or reset to the full list using Retrieve All, which requeries the full table:

DataCtrl.RecordSource = tblname
DataCtrl.Refresh
 


Storing Structures

When you edit data in a bound control, then move to a different record, the data control automatically stores the update.  The application hooks into this process by trapping the Validate event, when it can abort if the data is invalid.  In our case, we're using this event as an opportunity to store the structure if it has been changed (as indicated by a flag set by the DataChanged event from the ChemDraw control).  The code:

Private Sub DataCtrl_Validate(Action As Integer, Save As Integer)
 
   ' data is headed for db: store the mol here
   ' if query mode, do nothing
   Dim mol As MolServer7.Molecule
   If bQueryMode Then
      Save = False
 
   ElseIf bMolChanged Then
      Set mol = GetMolFromCDAX(ChemDrawCtl1)
      ms.PutMol mol, Mol_ID
      bMolChanged = False
   End If
End Sub
  

Adding a new record is a little different.  To do this cleanly and allow it to be cancelled, we've set up a separate form as a sort of modal dialog:  

Add Record: fill form and commit

When you click New Record, this form loads and calls the MolServer method AssignID, which reserves the next available ID in the structure source.  You then draw the structure, add data (in this case just Molname), click Commit, and the data is moved to the parent form and committed from there:

Private Sub Commit_Click()
 
   Dim mol As MolServer7.Molecule
 
   ' prepare recordset for new record
   MolForm.DataCtrl.Refresh
   MolForm.DataCtrl.Recordset.AddNew
 
   ' copy data from this form to parent
   MolForm.Molname = Molname
   MolForm.Mol_ID = Mol_ID
   Set mol = GetMolFromCDAX(ChemDrawCtl1)
   PutMolToCDAX mol, MolForm.ChemDrawCtl1
   MolForm.bMolChanged = True
 
   ' return to parent form and commit new record
   Hide
   MolForm.DataCtrl.Recordset.Update    ' causes Validate => PutMol
   MolForm.DataCtrl.Recordset.MoveLast
 
End Sub
 


Export

The Export function is not quite as useful as it will become.  In principle it allows a list of structures to be sent to any type of destination; in practice it is limited since most MolServer 7 sources are read-only.  We will improve this in MolServer 8.  Meanwhile, the current function can be used to build a ChemFinder database.  The code is as follows, beginning with an Open dialog for selecting a destination, then a series of GetMol from the source and PutMol to the destination:

Private Sub Export_Click()
 
   ' Export: send current list to output MolServer
   Dim msout As New MolServer7.Document
   Dim dlg As New MolServer7.OpenDialog
   Dim mol As MolServer7.Molecule
   Dim connstr As String
 
   connstr = dlg.QueryForConnectString("", 0)
   If connstr <> "" Then
      msout.Open connstr, 0, ""
 
      While Not DataCtrl.Recordset.EOF
         Set mol = ms.GetMol(Mol_ID)
         msout.PutMol mol, 0
         DataCtrl.Recordset.MoveNext
      Wend
   End If
 
End Sub
 

The best way to use this function is to create an empty database and form using ChemFinder, then select it as the output destination.  Each Export operation appends the current list to the database.