Rules for Text Searching
Jim Dill, 3 Apr 02
The way text searching works in ChemFinder 7.0 is inconsistent, nonstandard, and poorly documented. This page was used to organize thoughts on the matter and design a better scheme for 7.0.3.
Update 4 Apr 02: new code committed today improves the situation considerably. New documentation is on the 7.0.3 doc page. Below are notes made during the improvement process.
Notes made earlier
The online help for ChemFinder 7.0.1 gives this set of rules about text searching:
In summer 2001, when the Merck Index was produced, these rules changed. The rules as documented above still apply to text fields (length up to 254 chars), but not to memo fields (unlimited length), which work as follows:
We are making the following changes to these rules for ChemFinder 7.0.3:
New Rules Planning
This table summarizes how single-token text queries ought to work in 7.0.3, given two new switches:
|Query||Normal||Auto-Float||Full-Word||AF and FW|
|abc||like abc*||like *abc*||=abc or like abc[d]*||=abc or like abc[d]* or like *[d]abc or like *[d]abc[d]*|
|abc*||like abc*||< same||=abc or like abc[d]*||< same|
|*abc||like *abc||< same||=abc or like *[d]abc||< same|
|*abc*||like *abc*||< same||=abc or like abc[d]* or like *[d]abc or like *[d]abc[d]*||< same|
|=abc||= abc||< same||< same||< same|
Note: ALL queries hit the target "abc."
Query Normal Auto-Float Full-Word AF and FW abc abcx xabcx abc x x abc x abc* abcx < same abc x < same *abc xabc < same x abc < same *abc* xabcx < same x abc x < same =abc abc < same < same < same
Appendix: Jet String Comparison Table
Asterisk is not the only wildcard allowed in text queries. The following is from Microsoft Access Help (Jet SQL Reference):
Character(s) in pattern Matches in expression ? or _ (underscore) Any single character * or % Zero or more characters # Any single digit (0 — 9) [charlist] Any single character in charlist [!charlist] Any single character not in charlist
Kind of match Pattern Match
Multiple characters a*a aa, aBa, aBBBa aBC *ab* abc, AABB, Xab aZb, bac Special character a[*]a a*a aaa Multiple characters ab* abcdefg, abc cab, aab Single character a?a aaa, a3a, aBa aBBBa Single digit a#a a0a, a1a, a2a aaa, a10a Range of characters [a-z] f, p, j 2, & Outside a range [!a-z] 9, &, % b, a Not a digit [!0-9] A, a, &, ~ 0, 1, 9 Combined a[!b-m]# An9, az0, a99 abc, aj0
Planning For ChemFinder 8
For future reference, let's start a compilation of all the types of query it might occur to a user to enter when searching for text or numeric data. This will be useful when we rip out the whole mechanism and build it over from scratch, next round.
Text query Description abc straight text abc* text with wildcards ab to cd range ab - cd range between ab and cd range > ab inequality \sql raw sql =abc exact match abc,def,ghi list not abc negated abc or (>d and <g) subphrase not (>d and <g) negated subphrase Numeric query x value x - y range x to y range between x and y range >x and <y range x to y range not x negated <> x negated not (x - y) outside range
[ R&D Home | ChemFinder Home ]