Recognition of ambiguous names by Name>Struct

From its inception, Name>Struct has operated under the premise that if someone asks to convert a name to a structure, then Name>Struct should do its best to come up with some sort of structure that is described by the name. While this approach makes sense in an interactive situation, with an individual chemist converting individual names and reviewing the resulting structures, it can be dangerous in an automated environment where many names are being converted at the same time with little or no individual review. In cases where generated structures must be accurate, the "best guess" of Name>Struct may be worse than no structure at all.

To address these concerns, Name>Struct has been enhanced to recognize when chemical names are ambiguous. It will continue to provide its best guess at the intended structure, as before, but it will now also report that the name was ambiguous and that the generated structure might not have been the one intended by the author of the name. Depending on the requirements of the name conversion process, it might be appropriate to review those structures more thoroughly, or it might be best to discard them altogether.
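The choice between reviewing and discarding can be expressed as a small triage step in an automated workflow. The following is only a sketch: the `Conversion` record, the `Policy` values, and the `triage` function are hypothetical names invented for illustration and are not part of any Name>Struct interface. It assumes that each conversion yields a structure (or nothing) together with an ambiguity flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    KEEP = "keep"        # accept every best guess, ambiguous or not
    REVIEW = "review"    # route ambiguous results to a chemist
    DISCARD = "discard"  # drop ambiguous results entirely


@dataclass
class Conversion:
    """Hypothetical record of one name-to-structure conversion."""
    name: str
    structure: Optional[str]  # any structure encoding; None if nothing was generated
    ambiguous: bool


def triage(results, policy):
    """Split results into (accepted, needs_review) according to the policy."""
    accepted, needs_review = [], []
    for result in results:
        if result.structure is None:
            continue  # no structure was generated for this name
        if not result.ambiguous or policy is Policy.KEEP:
            accepted.append(result)
        elif policy is Policy.REVIEW:
            needs_review.append(result)
        # Policy.DISCARD: ambiguous results are silently dropped
    return accepted, needs_review
```

Under `Policy.REVIEW`, a result for "2,5-dichloropyridine" would be accepted directly while a result for "dichloropyridine" would be set aside for review; under `Policy.DISCARD`, the latter would be dropped.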

What makes a name ambiguous?

Simply put, an ambiguous name is a name that might reasonably be interpreted as referring to more than one possible chemical structure. Two of the most important phrases in that description are "reasonably" and "more than one".

By convention, substitutive nomenclature -- the most common form of chemical nomenclature -- builds molecules by replacing hydrogen atoms with larger substituents. A name like "pyridine" represents a molecule with five hydrogen atoms. A name like "pentachloropyridine" represents a similar molecule with the five hydrogen atoms replaced by five chlorine atoms. Someone may expect "pentachloropyridine" to represent a molecule with the five chlorine atoms in some other configuration, such as with all of the chlorine atoms connected to the nitrogen atom, but it would not be reasonable to expect that. Furthermore, there is only one way for five chlorine atoms to replace five hydrogen atoms. A name like "pentachloropyridine" therefore is not ambiguous because it can reasonably refer to only a single structure.

pyridine pentachloropyridine pentachloropyridine
not ambiguous not ambiguous WRONG

In contrast, a name like "dichloropyridine" is ambiguous; it may refer to any of six different possible structures.

dichloropyridine? dichloropyridine? dichloropyridine? dichloropyridine? dichloropyridine? dichloropyridine?
ambiguous ambiguous ambiguous ambiguous ambiguous ambiguous
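The count of six can be checked by enumeration. The sketch below is an illustration only and says nothing about how Name>Struct works internally. It assumes the usual pyridine numbering (nitrogen at position 1, hydrogens at positions 2 through 6) and treats two substitution patterns as the same structure when they are related by the mirror symmetry of the ring through N1 and C4.

```python
from itertools import combinations

# Ring positions of pyridine that carry a hydrogen (nitrogen is position 1).
CARBON_POSITIONS = (2, 3, 4, 5, 6)


def mirror(position):
    """Reflect a ring position through the axis passing through N1 and C4."""
    return 8 - position  # 2<->6, 3<->5, 4<->4


def distinct_patterns(n_substituents):
    """Return the symmetry-distinct ways to place identical substituents."""
    seen = set()
    for combo in combinations(CARBON_POSITIONS, n_substituents):
        reflected = tuple(sorted(mirror(p) for p in combo))
        seen.add(min(combo, reflected))  # canonical form: the lower locant set
    return sorted(seen)


print(distinct_patterns(2))
# [(2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5)]
print(len(distinct_patterns(5)))
# 1
```

Two chlorine atoms give six distinct patterns, which is why "dichloropyridine" is ambiguous, while five chlorine atoms give exactly one, which is why "pentachloropyridine" is not.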

Ambiguity due to missing locants

The most common type of ambiguity in chemical names, and the type most accurately recognized by Name>Struct, is ambiguity caused by the absence of locants. Ambiguous names of this type can be made unambiguous by adding appropriate locants to indicate the exact positions of substitution.

   
  dichloropyridine 2,5-dichloropyridine  
  ambiguous not ambiguous  

butene 2-butene but-2-ene
ambiguous not ambiguous not ambiguous

   
  glycerol diacetate glycerol 1,3-diacetate  
  ambiguous not ambiguous  

Often, an ambiguous name becomes unambiguous when further substituents are present:

   
  dichloropyridine 2,4,6-trifluorodichloropyridine  
  ambiguous not ambiguous  

cyclohexadiene chlorocyclohexadiene chlorocyclohexadiene-1,4-dione
ambiguous ambiguous not ambiguous

That said, the lack of locants does not by itself mean that a name is ambiguous. It is quite common to omit locants in situations where they truly are not necessary, and Name>Struct will properly recognize names of this type as not being ambiguous.

chlorobenzene dichloromethane phenylacetylene
not ambiguous not ambiguous not ambiguous

pentaerythritol diacetate dichlorocyclobutenedione
not ambiguous not ambiguous

Simple cyclic systems, including benzene, provide interesting problems because the numbering of the ring system is determined not by the ring itself, but by a substituent or functional group. If such a ring has one substituent or functional group without a locant and several others with locants other than "1", then the one substituent without a locant must be at the 1-position.

2,4-dichlorobromobenzene chlorobenzene-4-sulfonamide cycloocten-3-one
not ambiguous not ambiguous not ambiguous
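The rule for simple rings can be written down directly. The function below is a minimal sketch of that one rule and is not taken from Name>Struct; the function name and the (locant, name) representation are assumptions made for the example. A return value of `None` means only that this particular rule does not apply, not that the name is necessarily ambiguous.

```python
def resolve_ring_locants(substituents):
    """Apply the implicit 1-position rule to a simple ring.

    `substituents` is a list of (locant, name) pairs, with locant None
    when the name gives no locant. Returns the completed list, or None
    when the rule does not apply.
    """
    missing = [name for locant, name in substituents if locant is None]
    stated = [locant for locant, _ in substituents if locant is not None]
    # The rule needs exactly one unnumbered group, at least one numbered
    # group, and position 1 must still be free.
    if len(missing) != 1 or not stated or 1 in stated:
        return None
    return [(1 if locant is None else locant, name)
            for locant, name in substituents]


# 2,4-dichlorobromobenzene: the bromine must be at position 1.
print(resolve_ring_locants([(2, "chloro"), (4, "chloro"), (None, "bromo")]))
# [(2, 'chloro'), (4, 'chloro'), (1, 'bromo')]
```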

Ambiguity due to missing parentheses

All current nomenclature systems mandate the proper use of enclosing marks, usually parentheses, to separate and group various substituents. When used properly, all fragments within a pair of parentheses are bonded to the last fragment in the pair, and all outermost parenthesized groups are in turn bonded to the final fragment in the name. Unfortunately, parentheses are often omitted from names, either accidentally or in the mistaken belief that they are unnecessary. With parentheses missing, it becomes impossible to determine the true intent of the author of a name, and such names will be recognized by Name>Struct as ambiguous. They can typically be made unambiguous again by restoring the enclosing marks wherever they should have gone in the first place. This is one of the most difficult types of ambiguity for a chemist to recognize by eye, and therefore one of the most insidious in printed names.

trichloromethylsilane (trichloromethyl)silane tri(chloromethyl)silane trichloro(methyl)silane
ambiguous not ambiguous not ambiguous not ambiguous

2-hydroxymethyloxirene 2-hydroxy(methyl)oxirene 2-(hydroxymethyl)oxirene
ambiguous not ambiguous not ambiguous
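For a name built from a multiplier, two substituent fragments, and a parent, the possible groupings can be listed mechanically. The sketch below covers only that one pattern and is an illustration rather than a description of Name>Struct's parser; the function name and its arguments are hypothetical.

```python
def parenthesized_readings(multiplier, inner, outer, parent):
    """List the readings of a name written as multiplier+inner+outer+parent.

    Each reading is returned as (name with parentheses restored, meaning).
    """
    return [
        (f"({multiplier}{inner}{outer}){parent}",
         f"one {outer} group carrying {multiplier} x {inner}, bonded to {parent}"),
        (f"{multiplier}({inner}{outer}){parent}",
         f"{multiplier} x ({inner}{outer}) groups, each bonded to {parent}"),
        (f"{multiplier}{inner}({outer}){parent}",
         f"{multiplier} x {inner} and one {outer}, all bonded to {parent}"),
    ]


for name, meaning in parenthesized_readings("tri", "chloro", "methyl", "silane"):
    print(f"{name}: {meaning}")
```

For "trichloromethylsilane" this produces the three parenthesized names shown above, each of which corresponds to a different structure.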

Ambiguity of metal ions

Many metals commonly exist in multiple ionic forms. If a particular form is not specified, the resulting name is ambiguous.

   
  iron chloride iron (II) chloride  
  ambiguous not ambiguous  
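A rough version of this check can be sketched as a lookup. Everything here is an assumption made for illustration: the table lists common oxidation states for only a few metals, the function name is hypothetical, and the logic handles only names that begin with the metal and optionally carry a Roman-numeral oxidation state in parentheses. It is not a description of how Name>Struct makes the decision.

```python
import re

# Common oxidation states for a few metals (illustrative, not exhaustive).
COMMON_STATES = {
    "iron": (2, 3),
    "copper": (1, 2),
    "sodium": (1,),
    "zinc": (2,),
}

# A Roman numeral in parentheses, such as "(II)" or "( III )".
STOCK_NUMBER = re.compile(r"\(\s*[IVX]+\s*\)")


def metal_name_is_ambiguous(name):
    """Return True when the metal has several common states and none is stated."""
    if STOCK_NUMBER.search(name):
        return False  # an explicit oxidation state settles the question
    metal = name.split()[0].lower()
    # Metals missing from the table are treated as having a single state.
    return len(COMMON_STATES.get(metal, ())) > 1


print(metal_name_is_ambiguous("iron chloride"))       # True
print(metal_name_is_ambiguous("iron (II) chloride"))  # False
```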

(Non)ambiguity of cumulated double bonds

As a matter of convention, most chemists assume that polyenes are conjugated when possible, and that double bonds are not located at bridgehead atoms. Technically speaking, names of these types are ambiguous, but Name>Struct reports them as being unambiguous simply in recognition of common usage.

butadiene: technically ambiguous; reported as not ambiguous
cyclooctatetraene: technically ambiguous; reported as not ambiguous
norbornadiene: technically ambiguous; reported as not ambiguous

(Non)ambiguity due to lack of stereochemistry

Also as a matter of convention, the absence of fully specified stereochemical configuration in a stereogenic molecule is technically ambiguous but not reported as such by Name>Struct. Chemical structure diagrams have a common convention for representation of unspecified stereochemistry: if wedged bonds are absent, then unspecified stereochemistry is (usually) implied.

2-butanol: technically ambiguous; reported as not ambiguous
(R)-2-butanol: not ambiguous
(S)-2-butanol: not ambiguous

(Non)ambiguity of alkyl substituents

By far the most common convention affecting ambiguous names is the interpretation of alkyl chains. All current nomenclature systems define "propanol" as meaning "propan-1-ol" without exception, but those definitions haven't stopped alkyl chains from being used in an ambiguous sense. A name like "propanol" might possibly be intended to mean "isopropanol"; with further substitution, the similar name "1,1,1,3,3,3-hexafluoropropanol" unquestionably refers to the isopropyl form. As a final matter of convention, Name>Struct will treat alkyl substituents as referring to the straight-chain form and will not report them as ambiguous, even though they are ambiguous in a technical sense.

propanol: technically ambiguous; reported as not ambiguous
isopropanol: not ambiguous
1,1,1,3,3,3-hexafluoropropanol: not ambiguous

(Non)ambiguity of incorrect names

Regrettably, some types of errors in chemical names will produce totally unreasonable structures that are also totally unambiguous. Many aspects of chemical nomenclature differ by only a single character, which makes mistakes inevitable. Name>Struct cannot guess when a non-ambiguous name happens to represent a structure that simply isn't the one intended by the author. Chemical names will only be marked as ambiguous if they truly are ambiguous, and not if they are wrong for some other reason.

   
(pentachloromethyl)benzene: not ambiguous; warning about exceeded valence
(pentachloroethyl)benzene: not ambiguous
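The difference between these two names comes down to counting replaceable hydrogen atoms: a methyl substituent has three, an ethyl substituent has five. The sketch below illustrates that arithmetic only; the tables, the function name, and the returned labels are assumptions made for the example and do not reflect Name>Struct's actual messages.

```python
# Hydrogen atoms available for replacement on each unsubstituted group.
REPLACEABLE_H = {"methyl": 3, "ethyl": 5, "pyridine": 5}

MULTIPLIERS = {"mono": 1, "di": 2, "tri": 3, "tetra": 4, "penta": 5}


def check_substitution(multiplier, group):
    """Compare the number of substituents requested with the hydrogens available."""
    needed = MULTIPLIERS[multiplier]
    available = REPLACEABLE_H[group]
    if needed > available:
        return "exceeds valence"      # more substituents than hydrogens
    if needed == available:
        return "fully substituted"    # only one possible arrangement
    return "partially substituted"    # the count alone does not fix positions


print(check_substitution("penta", "methyl"))  # exceeds valence
print(check_substitution("penta", "ethyl"))   # fully substituted
```

In both cases there is only one way to read the name, so neither is ambiguous; the first is simply wrong, most likely by a single missing character.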

   
methyl amine: not ambiguous
menthyl amine: not ambiguous