PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains substance descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. More than 80 database vendors contribute to the growing PubChem database.[2]

Organisms: Humans and other animals
Research center: NCBI
Primary citation: PMID 15879180
Download URL: FTP
Web service URL: PUG-View[1]
License: Public domain


PubChem consists of three dynamically growing primary databases. As of 1 November 2017:

  • Compounds, 93.9 million entries[3] (up from 54 million entries in September 2014), contains pure and characterized chemical compounds.[4]
  • Substances, 236 million entries[5] (up from 163 million entries in September 2014[6]), also contains mixtures, extracts, complexes, and uncharacterized substances.
  • BioAssay, bioactivity results from 1.25 million[7] (up from 6000 in September 2014[8]) high-throughput screening programs with several million values.


Searching the databases is possible for a broad range of properties including chemical structure, name fragments, chemical formula, molecular weight, XLogP, and hydrogen bond donor and acceptor count.

PubChem contains its own online molecule editor with SMILES/SMARTS and InChI support that allows the import and export of all common chemical file formats to search for structures and fragments.

Each hit provides information about synonyms, chemical properties, chemical structure including SMILES and InChI strings, bioactivity, and links to structurally related compounds and other NCBI databases like PubMed.

In the text search form, database fields can be searched by adding the field name in square brackets after the search term. A numeric range is represented by two numbers separated by a colon. The search terms and field names are case-insensitive. Parentheses and the logical operators AND, OR, and NOT can be used. AND is assumed if no operator is used.

Example (Lipinski's Rule of Five):

0:500[mw] 0:5[hbdc] 0:10[hbac] -5:5[logp]
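
The query syntax described above can be assembled programmatically. The following is a minimal sketch in Python, not part of any PubChem client library: the helper names `range_term` and `join_terms` are hypothetical and exist only for illustration. It relies on the rule that terms separated by spaces are combined with an implicit AND.

```python
def range_term(low, high, field):
    """Format one numeric range term such as 0:500[mw].

    The two bounds are separated by a colon and the field name
    follows in square brackets, as described in the text.
    """
    return f"{low}:{high}[{field}]"


def join_terms(terms, operator=None):
    """Join search terms into a single query string.

    With operator=None the terms are separated by spaces, which the
    text search form treats as an implicit AND. Passing "AND", "OR"
    or "NOT" writes the operator explicitly between the terms.
    """
    separator = " " if operator is None else f" {operator} "
    return separator.join(terms)


# Lipinski's Rule of Five, rebuilt from its four range terms.
lipinski_query = join_terms([
    range_term(0, 500, "mw"),    # molecular weight
    range_term(0, 5, "hbdc"),    # hydrogen bond donor count
    range_term(0, 10, "hbac"),   # hydrogen bond acceptor count
    range_term(-5, 5, "logp"),   # XLogP
])

print(lipinski_query)
```

The resulting string can be pasted into the text search form. Writing the operator explicitly, for example `join_terms(terms, "AND")`, should be equivalent under the implicit-AND rule.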


PubChem was released in 2004.[9]

ACS's concerns

The American Chemical Society (ACS) has raised concerns about the publicly supported PubChem database, which it asserts competes directly with its existing Chemical Abstracts Service.[10] The ACS has a strong interest in the issue, since the Chemical Abstracts Service generates a large percentage of the society's revenue. Soon after PubChem's creation, the ACS lobbied the U.S. Congress to restrict the operation of PubChem.[11]

Database fields

Identification numbers
Identification number in current database [UID]
Substance identification number [SID]
Compound identification number [CID]
BioAssay identification number [BAID], [AID]

Any database field [ALL]
Comment [CMT]
Deposition date [DDAT], [DEPDAT]
Depositor's external ID [SRID], [SRCID]
Source name [SRC], [SRCNAM], [SRCNAME]
Source release date [SRD], [SRDAT], [RLSDAT]
Medical Subject Heading (MeSH) term [MSHT], [MESHT]
MeSH tree node [MSHN], [MESHTN]
MeSH pharmacological actions [PHMA], [PHARMA]

Substance properties
Substance synonyms [SYNO]
International Chemical Identifier (InChI) [INCHI]
Molecular weight [MW], [MWT], [MOLWT]
Chemical elements [ELMT], [EL]
Non-Hydrogen atoms [HAC], [HACNT]
Isotope count [IAC], [IACNT]
Total formal charge [TFC], [CHG], [CHRG]
Chiral atom count [ACC], [ACCNT]
Defined chiral atom count [ACDC], [ACDCNT]
Undefined chiral atom count [ACUC], [ACUCNT]
Hydrogen bond acceptor count [HBAC], [HBACNT]
Hydrogen bond donor count [HBDC], [HBDCNT]
Tautomer count [TC], [TCNT], [TTMC]
Rotatable bond count [RBC], [RBCNT]
XLogP[12] [XLGP], [LOGP]

Compound properties
Compound synonyms [CSYN], [CSYNO]
Component count [CC], [CCNT]
Covalent unit (molecule) count [CUC], [CUCNT]
Total bioactivity count [TAC]
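
Because several fields accept more than one tag, it can help to map each tag back to its field. The sketch below is illustrative only: it covers a small subset of the fields listed above, and the names `FIELD_ALIASES` and `describe_field` are hypothetical rather than part of any PubChem tool. The lookup ignores case because field names are case-insensitive.

```python
# Subset of the field list above: field description -> accepted tags.
FIELD_ALIASES = {
    "Compound identification number": ("CID",),
    "Molecular weight": ("MW", "MWT", "MOLWT"),
    "Hydrogen bond acceptor count": ("HBAC", "HBACNT"),
    "Hydrogen bond donor count": ("HBDC", "HBDCNT"),
    "Rotatable bond count": ("RBC", "RBCNT"),
    "XLogP": ("XLGP", "LOGP"),
}

# Reverse index: every tag points back to its field description.
_TAG_TO_FIELD = {
    tag: name for name, tags in FIELD_ALIASES.items() for tag in tags
}


def describe_field(tag):
    """Return the field description for a tag, or None if unknown.

    Surrounding square brackets and spaces are stripped and the tag
    is upper-cased, so "[mw]", "MW" and "mw" all resolve alike.
    """
    return _TAG_TO_FIELD.get(tag.strip("[] ").upper())


print(describe_field("[logp]"))
print(describe_field("HBDCNT"))
```

A table like this could be used to check the tags in a query before it is submitted, although the search form itself remains the authority on which tags are valid.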

See also


  1. Kim, Sunghwan; Thiessen, Paul A.; Cheng, Tiejun; Zhang, Jian; Gindulyte, Asta; Bolton, Evan E. (9 August 2019). "PUG-View: programmatic access to chemical annotations integrated in PubChem". Journal of Cheminformatics. 11 (1). doi:10.1186/s13321-019-0375-2.
  2. "PubChem Source Information". The PubChem Project. USA: National Center for Biotechnology Information.
  3. "Search Results for all compounds". Retrieved 28 January 2016.
  4. "all[filt] - PubChem Compound Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  5. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  6. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  7. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  8. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  9. "About PubChem". Retrieved 3 May 2014.
  10. Kaiser J (May 2005). "Science resources. Chemists want NIH to curtail database". Science. 308 (5723): 774. doi:10.1126/science.308.5723.774a. PMID 15879180.
  11. "PubChem and the American Chemical Society". Reshaping Scholarly Communication. USA: University of California. 31 May 2005. Retrieved 15 October 2018.
  12. Cheng T (Nov 2007). "Computation of octanol-water partition coefficients by guiding an additive model with knowledge". Journal of Chemical Information and Modeling. 47 (6): 2140–2148. doi:10.1021/ci700257y. PMID 17985865.


3-Hydroxybutanal (acetaldol) is an aldol, formally the product of the dimerization of acetaldehyde. It was formerly used in medicine as a hypnotic and sedative.


Acetophenone is the organic compound with the formula C6H5C(O)CH3 (also represented by the pseudoelement symbols PhAc or BzMe). It is the simplest aromatic ketone. This colorless, viscous liquid is a precursor to useful resins and fragrances.


Allylescaline (4-allyloxy-3,5-dimethoxyphenethylamine) is a lesser-known psychedelic drug. It is closely related in structure to mescaline. Allylescaline was first synthesized by Otakar Leminger in 1972. The compound was later synthesized by Alexander Shulgin and further described in his book PiHKAL. The dosage range is listed as 20–35 mg, and the duration as 8–12 hours. Allylescaline produces an entactogenic warmth, an entheogenic effect, and a feeling of flowing energy. Very little data exists about the pharmacological properties, metabolism, and toxicity of allylescaline.


Cefpodoxime is an oral, third-generation cephalosporin antibiotic. It is active against most Gram-positive and Gram-negative organisms. Notable exceptions include Pseudomonas aeruginosa, Enterococcus, and Bacteroides fragilis. Currently, it is only marketed as generic preparations in the USA, according to the FDA Orange Book. It is commonly used to treat acute otitis media, pharyngitis, sinusitis, and gonorrhea. It also finds use as oral continuation therapy when intravenous cephalosporins (such as ceftriaxone) are no longer necessary for continued treatment.

Cefpodoxime inhibits cell wall synthesis by inhibiting the final transpeptidation step of peptidoglycan synthesis in cell walls. It has a well-established pharmacokinetic profile, with absorption of 50%. It is indicated in community-acquired pneumonia, uncomplicated skin and skin structure infections, and uncomplicated urinary tract infections.

It was patented in 1980 and approved for medical use in 1989.


Ciguatoxins are a class of toxic polycyclic polyethers found in fish that cause ciguatera.

There are several different chemicals in this class. "CTX" is often used as an abbreviation.

CID 5311333 from PubChem - Ciguatoxin 1

CID 6441260 from PubChem - Ciguatoxin 2

CID 6444399 from PubChem - Ciguatoxin 3

CID 6450530 from PubChem - Ciguatoxin 4B (Gambiertoxin 4b)

Citric acid

Citric acid is a weak organic acid that has the chemical formula C6H8O7. It occurs naturally in citrus fruits. In biochemistry, it is an intermediate in the citric acid cycle, which occurs in the metabolism of all aerobic organisms.

More than a million tons of citric acid are manufactured every year. It is used widely as an acidifier, as a flavoring, and as a chelating agent. A citrate is a derivative of citric acid; that is, the salts, esters, and the polyatomic anion found in solution. An example of a salt is trisodium citrate; an example of an ester is triethyl citrate. When part of a salt, the formula of the citrate ion is written as C6H5O7^3− or C3H5O(COO)3^3−.


Formoterol, also known as eformoterol, is a long-acting β2 agonist (LABA) used as a bronchodilator in the management of asthma and COPD. Formoterol has an extended duration of action (up to 12 h) compared to short-acting β2 agonists such as salbutamol (albuterol), which are effective for 4 h to 6 h. LABAs such as formoterol are used as "symptom controllers" to supplement prophylactic corticosteroid therapy. A "reliever" short-acting β2 agonist (e.g., salbutamol) is still required, since LABAs are not recommended for the treatment of acute asthma.

It was patented in 1972 and came into medical use in 1998. It is also marketed in the combination formulations budesonide/formoterol and mometasone/formoterol.


Isoleucine (symbol Ile or I) is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group (which is in the protonated −NH3+ form under biological conditions), an α-carboxylic acid group (which is in the deprotonated −COO− form under biological conditions), and a hydrocarbon side chain with a branch (a central carbon atom bound to three other carbon atoms). It is classified as a non-polar, uncharged (at physiological pH), branched-chain, aliphatic amino acid. It is essential in humans, meaning the body cannot synthesize it and it must be obtained from the diet. In other organisms, such as bacteria, isoleucine is synthesized from pyruvate using leucine biosynthesis enzymes. It is encoded by the codons AUU, AUC, and AUA.

Inability to break down isoleucine, along with other amino acids, is associated with maple syrup urine disease (MSUD), which takes its name from the discoloration and sweet smell it causes in the patient's urine. In severe cases, MSUD can lead to damage to the brain cells and ultimately death.

National Center for Biotechnology Information

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). The NCBI is located in Bethesda, Maryland and was founded in 1988 through legislation sponsored by Senator Claude Pepper.

The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an important resource for bioinformatics tools and services. Major databases include GenBank for DNA sequences and PubMed, a bibliographic database for the biomedical literature. Other databases include the NCBI Epigenomics database. All these databases are available online through the Entrez search engine. NCBI was directed by David Lipman, one of the original authors of the BLAST sequence alignment program and a widely respected figure in bioinformatics. He also led an intramural research program, including groups led by Stephen Altschul (another BLAST co-author), David Landsman, Eugene Koonin, John Wilbur, Teresa Przytycka, and Zhiyong Lu. David Lipman stood down from his post in May 2017.


Niravoline is a chemical compound with the formula C22H25N3O3. It has diuretic and aquaretic effects and has been studied for its potential use in cerebral edema and cirrhosis. It exerts its pharmacological effect as a kappa opioid receptor agonist.


Oxyfedrine is a vasodilator and a β adrenoreceptor agonist. It was found to depress the tonicity of coronary vessels, improve myocardial metabolism (so that the heart can sustain hypoxia better), and exert positive chronotropic and inotropic effects, thereby not precipitating angina pectoris. The latter property (positive chronotropic and inotropic effects) is particularly important, because other vasodilators used in angina may be counterproductive, causing the coronary steal phenomenon.

Synergistic effects with antibiotics have been suggested.


Pentolinium is a ganglionic blocking agent which acts as a nicotinic acetylcholine receptor antagonist. Formulated as the pentolinium tartrate salt, it is also known as Ansolysen. It can be used as an antihypertensive drug during surgery or to control hypertensive crises. It works by binding to the acetylcholine receptor of adrenergic nerves and thereby inhibiting the release of noradrenaline and adrenaline. Blocking this receptor leads to smooth muscle relaxation and vasodilation.


Piperidinediones are derivatives of piperidine with two ketone functional groups. There are six isomers, each of which has a molecular weight of 113.115 g/mol and a formula of C5H7NO2. Piperidinediones form the core structure of a variety of pharmaceutical drugs.


RB-3007 is an orally active analogue of RB-101. It acts as an enkephalinase inhibitor, which is used in scientific research.


The sulfate or sulphate (see spelling differences) ion is a polyatomic anion with the empirical formula SO4^2−. Sulfate is the spelling recommended by IUPAC, but sulphate is used in British English. Salts, acid derivatives, and peroxides of sulfate are widely used in industry. Sulfates occur widely in everyday life. Sulfates are salts of sulfuric acid and many are prepared from that acid.

Sulfur mustard

Sulfur mustard, commonly known as mustard gas, is the prototypical substance of the sulfur-based family of cytotoxic and vesicant chemical warfare agents, which can form large blisters on exposed skin and in the lungs. These agents have a long history of use as blister agents in warfare and, along with organoarsenic compounds such as Lewisite, are the most well-studied of such agents. Related chemical compounds with similar chemical structure and similar properties form a class of compounds known collectively as sulfur mustards or mustard agents. Pure sulfur mustards are colorless, viscous liquids at room temperature. When used in impure form, such as warfare agents, they are usually yellow-brown and have an odor resembling mustard plants, garlic, or horseradish, hence the name. The common name "mustard gas" is considered inaccurate because sulfur mustard is not actually vaporized, but dispersed as a fine mist of liquid droplets. Sulfur mustard was originally assigned the name LOST, after the scientists Wilhelm Lommel and Wilhelm Steinkopf, who developed a method of large-scale production for the Imperial German Army in 1916.

Mustard agents are regulated under the 1993 Chemical Weapons Convention. Three classes of chemicals are monitored under this Convention, with sulfur and nitrogen mustard grouped in Schedule 1, as substances with no use other than in chemical warfare. Mustard agents could be deployed by means of artillery shells, aerial bombs, rockets, or by spraying from warplanes or other aircraft.

Sulfur mustard can be readily decontaminated through reaction with chloramine-T.


TC-1827 is an orally active, selective agonist of the α4β2 nicotinic receptors. Administration of TC-1827 improved memory and learning in rodents, and increased long-term potentiation in hippocampal slices. In addition, the compound was without significant cardiovascular side effects, except for a small, transient rise in arterial blood pressure. The pro-cognitive effects of TC-1827 last much longer than the short half-life (0.2–1.0 hours) would suggest.

Tartaric acid

Tartaric acid is a white, crystalline organic acid that occurs naturally in many fruits, most notably in grapes, but also in bananas, tamarinds, and citrus. Its salt, potassium bitartrate, commonly known as cream of tartar, develops naturally in the process of winemaking. It is commonly mixed with sodium bicarbonate and is sold as baking powder used as a leavening agent in food preparation. The acid itself is added to foods as an antioxidant E334 and to impart its distinctive sour taste.

Tartaric acid is an alpha-hydroxy-carboxylic acid, is diprotic and aldaric in acid characteristics, and is a dihydroxyl derivative of succinic acid.


Thioproscaline, or 3,5-dimethoxy-4-propylthiophenethylamine, is a lesser-known psychedelic drug. It is the 4-propylthio analog of mescaline. Thioproscaline was first synthesized by Alexander Shulgin. In his book PiHKAL (Phenethylamines I Have Known and Loved), the dosage range is listed as 20–25 mg, and the duration as 10–15 hours. Thioproscaline causes closed-eye visuals, slight open-eye visuals, and a body load. Very little data exists about the pharmacological properties, metabolism, and toxicity of thioproscaline.


This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.