The CDXML text-based file format

A CDXML file is a CDX file formatted so that it conforms to the XML specification. We assume that anyone who manipulates a CDXML file is familiar with XML in general, so only a brief overview is presented here.

A CDXML file has the following general structure:

Header Format

The CDXML file header consists of the following string:

<?xml version="1.0" encoding="UTF-8" ?>

The header is then followed by an object tree of tagged items beginning with a document object.
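As a minimal sketch of this layout, the following Python snippet checks for the header string and then parses the object tree that follows it. It assumes the document object is written as an element named `CDXML`, which this section does not state explicitly; `CDXML_TEXT` and `read_document` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal file: the header string followed by an empty document object.
# The element name "CDXML" is an assumption of this sketch.
CDXML_TEXT = '<?xml version="1.0" encoding="UTF-8" ?>\n<CDXML></CDXML>\n'


def read_document(text: str) -> ET.Element:
    """Check for the XML declaration, then parse and return the root object."""
    # The declaration must be the very first thing in the file.
    if not text.startswith("<?xml"):
        raise ValueError("missing XML declaration header")
    # Pass bytes so the parser applies the declared UTF-8 encoding itself.
    return ET.fromstring(text.encode("utf-8"))


root = read_document(CDXML_TEXT)
print(root.tag)
```

The parser reaches the end of the input only after the document object has been closed, which is why no separate end-of-file marker is needed.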


Since the first object following the header is a document object, the end of the file is signified by the end of the document object.



Properties, also called attributes, are self-contained. A property applies to the object which logically contains it. It may also describe other objects contained within the object which logically contains the property. For example, bond order is a property of a bond, and molecular weight is a property of a molecule. Unless otherwise noted, all properties are optional.

All properties have a name. Properties are included in a CDXML file by listing the property name, an equals sign, and then a quoted string representing the property's value.

Order="2":The bond with this property is a double bond
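A short sketch of how such a property might be read back, using Python's standard `xml.etree.ElementTree` module. The bare `b` element here is a stand-alone fragment for illustration, not a complete CDXML file.

```python
import xml.etree.ElementTree as ET

# A bond object carrying a single property, written as name="value".
bond = ET.fromstring('<b Order="2"/>')

# Property values are quoted strings in the file, so they are read back as strings.
order = bond.get("Order")

# Properties are optional: one that was not written is simply absent.
double_position = bond.get("DoublePosition")

print(order, double_position)
```

Because every value arrives as a string, a consumer that needs a numeric bond order has to convert it explicitly.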


Like properties, objects are also self-contained. Unlike properties, however, objects can contain properties and other objects.

Each object has a name that identifies the type of object. For example, an object that represents a bond has the name "b". These names, like everything else in XML, are case-sensitive.

The definition of an object starts with a less-than sign, followed by the object's name. Any properties, if present, are listed immediately afterward in pairs formatted as propertyname="propertyvalue". A greater-than sign appears after the last property.

The definition of an object ends with a less-than sign, followed by a slash, followed by the object name again, followed by a greater-than sign. Alternatively, if there are no subobjects, the end-object marker may be omitted and a slash placed immediately before the greater-than sign that closes the object-begin marker.

Any subobjects are listed between those object-begin and object-end markers.
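The rules above can be illustrated with a small Python sketch. It shows that the two ways of ending an object are equivalent to a parser, that subobjects sit between the begin and end markers, and that names are case-sensitive. The container name `parent` is hypothetical and is not a CDXML object name.

```python
import xml.etree.ElementTree as ET

# The same bond written with an explicit end marker and in the self-closing form.
long_form = ET.fromstring('<b id="29" Order="2"></b>')
short_form = ET.fromstring('<b id="29" Order="2"/>')
same = (long_form.tag, long_form.attrib) == (short_form.tag, short_form.attrib)

# An object with subobjects needs the begin/end form; "parent" is a
# hypothetical container name used only for this illustration.
parent = ET.fromstring('<parent><b id="29"/><b id="30"/></parent>')
child_ids = [child.get("id") for child in parent]

# Names are case-sensitive: searching for "B" does not match "b" objects.
upper_matches = parent.findall("B")

print(same, child_ids, upper_matches)
```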

<b:This is a bond
id="29":This bond has an ID of 29
B="21":The atom at the first end of this bond has ID 21
E="22":The atom at the second end of this bond has ID 22
Order="2":This bond has a bond order 2. It is a double bond.
DoublePosition="Right":This double bond is positioned so that the second line of the double bond is to the right of the first, looking from the first atom to the second atom
/>:There are no more properties or objects associated with this bond
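Going the other way, the annotated bond above could be written out programmatically. This is a sketch using Python's standard library rather than any CDXML-specific tool; the `properties` dictionary is a hypothetical name, and its values are copied from the listing.

```python
import xml.etree.ElementTree as ET

# Property names and values taken from the annotated bond above.
properties = {
    "id": "29",
    "B": "21",   # ID of the atom at one end of the bond
    "E": "22",   # ID of the atom at the other end
    "Order": "2",
    "DoublePosition": "Right",
}

# Build a "b" object with these properties and no subobjects.
bond = ET.Element("b", properties)

# With no subobjects, the serializer emits the self-closing form.
text = ET.tostring(bond, encoding="unicode")
print(text)
```

Parsing `text` again yields the same property names and values, so the round trip loses nothing for an object of this kind.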

