The vocabulary used to describe microbial communities: microbiome, metagenome, microbiota

The exponential increase in publications reporting on the analysis of microbial communities inhabiting the human body has been accompanied by confusion in the vocabulary used to describe different aspects of these communities and their environments. In this post, I would like to try to clarify a few terms that I think tend to be misused: microbiome, microbiota and metagenome (and metagenomics). The literature shows these terms being used interchangeably, which has led to considerable confusion in the community and among the general public, as each means something different.

Here is how I would define these specific terms:

Microbiota: The assemblage of microorganisms present in a defined environment. This microbial census is established using molecular methods relying predominantly on the analysis of 16S rRNA gene sequences amplified from a given environment. Taxonomic assignment is then performed to assign each sequence to a microbial taxon (bacteria, archaea or lower eukaryotes).

Microbiome: This term refers to the entire habitat, including the microorganisms, their genomes (i.e., genes) and the surrounding environmental conditions. This definition is based on that of "biome", the biotic and abiotic factors of a given environment.

Others in the field limit the definition of microbiome to the collection of genes encoded by the members of a microbiota. I would suggest that this is the definition of metagenome, not microbiome.

Metagenome: The collection of genomes and genes from the members of a microbiota.

Therefore, when discussing results from 16S rRNA gene surveys (which describe the community, not the collective genome of the community or the complete habitat), "microbiota" should be used. To reiterate a point from Jonathan Eisen's post: when doing such surveys, one is not doing metagenomic analysis, and should not use the term "metagenomic sequences" when referring to 16S rRNA gene sequences. However, when applying metagenomics to a biological sample (sequencing the community genomes of a microbiota), one generates a collection of genomes and genes. If this set of genomes and genes is discussed in the context of its environment, one can use the word microbiome.

Other terms that I believe are inappropriate, and that I often find in papers reporting on microbial surveys, are "16S survey", "16S sequencing" and "16S analysis". There is no such thing as "16S"; the proper term is 16S rRNA gene survey, or 16S rRNA gene sequencing/analysis. The "S" in 16S stands for Svedberg unit, a non-SI unit for sedimentation rate. The Svedberg unit offers a measure of a particle's size based on its rate of travel in a tube subjected to high g-force. The small subunits of the bacterial and archaeal ribosomes are 30S, and comprise the 16S rRNA (~1540 nucleotides) bound to 21 proteins.
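
For readers who want the unit spelled out, here is a short sketch of the standard definition of the sedimentation coefficient. The symbols s, v, omega and r are introduced here for illustration only and do not appear elsewhere in this post: s is the sedimentation coefficient, v the particle's sedimentation velocity, omega the angular velocity of the rotor, and r the distance from the axis of rotation.

```latex
% Sedimentation coefficient: velocity per unit of centrifugal acceleration
s = \frac{v}{\omega^{2} r}
% One Svedberg unit expressed in seconds
1\,\mathrm{S} = 10^{-13}\,\mathrm{s}
```

Because s depends on a particle's shape and density as well as its mass, Svedberg values are not additive: the 30S and 50S subunits together form a 70S ribosome, not an 80S one. This is one more reason "16S" on its own is a measurement, not the name of a gene.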

This post was informed by papers and by other communications I have had with colleagues. I hope that a consensus on the use of these terms can be adopted in the near future, so that we are all talking about the same thing when using them. I would encourage readers to leave comments to stimulate discussion, and maybe to agree on rules for using these terms. I plan on summarizing these discussions in an editorial article in the journal Microbiome.