Writing SMILES (Simplified Molecular Input Line Entry System) notation involves representing the structure of a molecule in a concise and standardized text format. Here’s a guide on how to write SMILES notation:
- Understand SMILES Basics: SMILES notation represents a chemical structure using ASCII characters. It is a linear notation that condenses the structural information of a molecule into a single line of text. SMILES is designed to be human-readable and is used in computational chemistry and databases.
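- Start with Atom Symbols: Write each atom using its element symbol. Atoms in the organic subset (B, C, N, O, P, S, F, Cl, Br, I) can be written without brackets, and the hydrogens needed to complete their normal valence are implied. Any other atom, or any atom with an explicit hydrogen count, charge, or isotope, goes in square brackets. For example:
  - Carbon: C
  - Hydrogen: [H] (usually left implicit)
  - Oxygen: O
  - Nitrogen: N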
- Connect Atoms with Bonds: Use the following symbols to represent the bonds between atoms. Single bonds between adjacent atoms are normally omitted.
  - Single bond: - (usually omitted)
  - Double bond: =
  - Triple bond: #
  Examples:
  - Methane: C (the four hydrogens are implicit)
  - Ethene: C=C (a double bond between two carbon atoms)
  - Water: O (the two hydrogens are implicit; [H]O[H] is the fully explicit form)
- Include Branches and Rings: For branched structures, use parentheses ( ) to enclose the branch. For example:
  - Isopropanol: CC(O)C (a chain of three carbons with a hydroxyl group attached to the second carbon)
  - Ethyl acetate: CCOC(C)=O (an ethyl group on the ester oxygen, with a methyl branch on the carbonyl carbon)
  For rings, break one ring bond and mark the two atoms it joined with the same digit. For example:
  - Cyclohexane: C1CCCCC1 (a six-membered ring)
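- Add Charges and Stereochemistry: Write a charged atom in square brackets, with + or - placed after the element symbol and any hydrogen count. Tetrahedral stereocenters are marked with @ or @@ inside the brackets, and cis/trans (E/Z) geometry around a double bond is marked with / and \. For example:
  - Ammonium ion: [NH4+]
  - L-Alanine: N[C@@H](C)C(=O)O
  - trans-1,2-Difluoroethene: F/C=C/F
  - Chlorofluoromethane: C(F)Cl (the carbon is not a stereocenter, so no @ is needed)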
- Check Validity: Ensure that the SMILES notation is valid and accurately represents the intended molecule. SMILES notation follows specific rules, and incorrect notations may lead to misinterpretations.
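The validity step above can be partly automated. The following is a minimal sketch in plain Python, not a full SMILES parser: the function name `check_smiles_syntax` is hypothetical, and it only checks three structural rules described in this guide (balanced parentheses, balanced square brackets, and ring-closure labels that occur in pairs). It does not verify element symbols, valences, or stereochemistry, so a string that passes may still be chemically wrong.

```python
DIGITS = "0123456789"


def check_smiles_syntax(smiles: str) -> list:
    """Return a list of structural problems; an empty list means none were found.

    Hypothetical helper: covers only brackets, parentheses, and ring closures.
    """
    problems = []
    paren_depth = 0        # number of currently open '(' branches
    in_bracket = False     # True while inside a [ ... ] atom
    open_rings = set()     # ring-closure labels seen an odd number of times
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            # Digits inside brackets are isotopes, H counts, or charges,
            # so they are not treated as ring closures here.
            if ch == "]":
                in_bracket = False
            elif ch == "[":
                problems.append(f"nested '[' at position {i}")
        elif ch == "[":
            in_bracket = True
        elif ch == "]":
            problems.append(f"unmatched ']' at position {i}")
        elif ch == "(":
            paren_depth += 1
        elif ch == ")":
            if paren_depth == 0:
                problems.append(f"unmatched ')' at position {i}")
            else:
                paren_depth -= 1
        elif ch == "%":
            # Two-digit ring closure such as %12
            label = smiles[i + 1:i + 3]
            if len(label) == 2 and all(c in DIGITS for c in label):
                open_rings ^= {label}   # toggle: open on first use, close on second
                i += 2
            else:
                problems.append(f"'%' not followed by two digits at position {i}")
        elif ch in DIGITS:
            open_rings ^= {ch}          # single-digit ring closure
        i += 1

    if in_bracket:
        problems.append("unclosed '['")
    if paren_depth:
        problems.append("unclosed '('")
    if open_rings:
        problems.append("unclosed ring closure(s): " + ", ".join(sorted(open_rings)))
    return problems


if __name__ == "__main__":
    for s in ["C1CCCCC1", "CC(O)C", "C1CCCCC", "CC(O"]:
        print(s, check_smiles_syntax(s))
```

For a real validity check, a cheminformatics toolkit that actually parses the string into a molecule is the safer choice; this sketch only catches the most common typing mistakes.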
Here are some additional tips:
- Use lowercase letters for aromatic atoms (e.g., c for an aromatic carbon, so benzene is c1ccccc1).
- Isotopes are written in square brackets with the mass number before the element symbol (e.g., [13CH4] for carbon-13 methane).
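Bracket atoms follow a fixed field order in standard SMILES: mass number, element symbol, hydrogen count, then charge. As a rough illustration, the sketch below assembles such an atom from its parts. The helper name `bracket_atom` and its parameters are assumptions made for this example, and it leaves out chirality marks and atom classes.

```python
def bracket_atom(symbol: str, isotope=None, h_count: int = 0, charge: int = 0) -> str:
    """Build a SMILES bracket atom such as [13CH4] or [NH4+].

    Hypothetical helper; field order is isotope, symbol, H count, charge.
    """
    if h_count < 0:
        raise ValueError("h_count must be non-negative")

    parts = ["["]
    if isotope is not None:
        parts.append(str(isotope))          # mass number comes first, with no '^'
    parts.append(symbol)                    # lowercase symbol means aromatic
    if h_count == 1:
        parts.append("H")                   # a single hydrogen needs no count
    elif h_count > 1:
        parts.append(f"H{h_count}")
    if charge:
        sign = "+" if charge > 0 else "-"
        magnitude = abs(charge)
        # A charge of magnitude 1 is written as a bare sign
        parts.append(sign if magnitude == 1 else f"{sign}{magnitude}")
    parts.append("]")
    return "".join(parts)


if __name__ == "__main__":
    print(bracket_atom("C", isotope=13, h_count=4))   # carbon-13 methane
    print(bracket_atom("N", h_count=4, charge=1))     # ammonium ion
    print(bracket_atom("O", charge=-1))               # oxide-type anion
```

Building the atom field by field makes the ordering rule explicit, which is where hand-written bracket atoms most often go wrong.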
Remember that SMILES notation is a tool for representing molecular structures concisely, and it may take some practice to become familiar with its conventions.