Xenopus gene nomenclature
Gene Nomenclature Guidelines
- Overview
- Gene/RNA names and symbols are lowercase italics (pax6)
- Protein symbols have the first letter capitalized and are not italicized (Pax6)
- Names and symbols should not start with X, Xt, or Xl to indicate the Xenopus species
- Official Xenopus gene names and symbols are found on Xenbase gene pages and are based on human gene nomenclature (http://www.genenames.org/)
- Orthology to human genes is assigned by synteny
- laevis homeologs (sub-genome genes) are designated ".L" and ".S" for the long and short chromosomes, respectively
- Legacy gene names and symbols that are no longer available are recorded as synonyms (Xbra is a synonym of t)
- For gene name questions please email the Xenbase gene name coordinator xenbase@cchmc.org
- Example:
- Gene Name: beta-carotene oxygenase 2
- Gene Symbol: bco2
- RNA Symbol: bco2
- Protein Symbol: Bco2
- Gene Names
- Detailed Gene Nomenclature
- Xenopus gene names and symbols are identical to human gene names and symbols whenever possible. A full description of the nomenclature rules used by the HUGO Gene Nomenclature Committee (HGNC) can be found at http://www.genenames.org/.
- Orthology assignments are based primarily on synteny and require more than a BLAST result before the human gene name is applied. Data for 12,000 tropicalis gene models have been generated by Dan Rokhsar; see http://www.metazome.net/
- Orthology assignments should be approved by the HGNC and the Xenopus Gene Nomenclature Committee, and communicated by Xenbase staff.
- In cases where mammalian gene names reference an original Xenopus name (chordin-like), the Xenopus name will be retained (chordin).
- Gene names should not start with any characters or words that identify the gene as being from Xenopus (e.g. X, Xt, Xl, Xenopus, tropicalis, laevis).
- Gene names are lower case and italics, and should only contain Latin letters and Arabic numbers. Greek letters should be spelled out (β -> beta), and Roman numerals should be changed to Arabic equivalents (IV -> 4).
- Example: beta-carotene oxygenase 2
- Punctuation should only be used if the human gene name uses punctuation (except paralogs / homeologs as described below).
- When identity is uncertain, be cautious. Use a temporary symbol or name such as “caudal type homeobox 2 [provisional]” until more information is available, at which time the name is changed and the [provisional] tag removed.
- Pioneer species names should not be used. For example, in some species nanos3 is known as “nanos homolog 3 (Drosophila)”. In Xenopus it would simply be named “nanos homolog 3”.
- Xenbase administers Xenopus gene nomenclature.
- Gene names for laevis homeologs are appended with "L homeolog" or "S homeolog" to distinguish the sub-genome they are associated with.
- When there is no human ortholog of a new Xenopus gene or when the human gene name is provisional, new gene names will be based on consultations with the HGNC, the Xenopus gene nomenclature committee, and the requesting parties. Gene name requests should be sent to the Xenbase gene name coordinator xenbase@cchmc.org.
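The gene name rules above (Greek letters spelled out, Roman numerals converted to Arabic numbers, no pioneer species suffix, lowercase, and the "L homeolog" / "S homeolog" suffix) can be sketched in Python. This is a minimal illustration, not a Xenbase tool: the function names, the small lookup tables, and the list of pioneer species are assumptions made for the example, and any real name should still be checked against the human gene name.

```python
import re

# Illustrative lookup tables (assumptions, deliberately incomplete).
GREEK = {"α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta", "κ": "kappa"}
# Single-letter numerals (I, V, X) are left out because they are ambiguous in names.
ROMAN = {"II": "2", "III": "3", "IV": "4", "VI": "6",
         "VII": "7", "VIII": "8", "IX": "9"}
ROMAN_RE = re.compile(r"\b(II|III|IV|VI|VII|VIII|IX)\b")
# Hypothetical list of pioneer species suffixes to strip.
PIONEER_RE = re.compile(r"\s*\((?:Drosophila|C\. elegans|S\. cerevisiae)\)\s*$")


def normalize_gene_name(name: str) -> str:
    """Apply the formatting rules to a candidate gene name."""
    for letter, word in GREEK.items():
        name = name.replace(letter, word)          # β -> beta
    name = ROMAN_RE.sub(lambda m: ROMAN[m.group(1)], name)  # IV -> 4
    name = PIONEER_RE.sub("", name)                # drop "(Drosophila)"
    return name.lower().strip()


def homeolog_name(name: str, subgenome: str) -> str:
    """Append the laevis sub-genome designation to a gene name."""
    if subgenome not in ("L", "S"):
        raise ValueError("subgenome must be 'L' or 'S'")
    return f"{name} {subgenome} homeolog"
```

For example, `normalize_gene_name("β-carotene oxygenase 2")` gives `beta-carotene oxygenase 2`, and `normalize_gene_name("nanos homolog 3 (Drosophila)")` gives `nanos homolog 3`.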
- Gene Families and Paralogs
- Gene families are a related set of genes formed by duplication of a single ancestral gene. Genes within gene families usually have similar biological functions.
- When naming genes in gene families, a root word should be used to identify the gene as being a member of the gene family. Gene family members should be assigned increasing unique numerical identifiers. In keeping with HGNC policies the next available number that is not already used in other species should be appended to the end without punctuation (see exceptions below).
- Example: nodal homolog 1 (nodal1), nodal homolog 2 (nodal2)
- Some exceptions are made for rare legacy names that have a different format.
- Example: nkx2-1 and nkx2-2
- Exceptions can also be made when there are clear subgroups within a gene family. In this case it is acceptable to append “.1”, “.2”, and so on (“.n”) to the end of the symbol to indicate that the genes belong to the same subfamily. This applies both to Xenopus-specific gene family expansions and to cases where there are multiple Xenopus genes relative to a single member of a mammalian gene family.
- Example: nodal homolog 3, gene 1 (nodal3.1), nodal homolog 3, gene 2 (nodal3.2)
- Example: Human Hairy and Enhancer of split 6 (HES6) is syntenic with two tropicalis genes: hairy and enhancer of split 6, gene 1 (hes6.1) and hairy and enhancer of split 6, gene 2 (hes6.2)
- Expanded gene families are numbered independently in each Xenopus species or sub-genome. Importantly, the same “.n” number in different species or sub-genomes does not necessarily imply a direct one-to-one orthology.
- Example: X. tropicalis: bix1.1, bix1.2, bix1.3, bix1.4, bix1.5 and bix1.6
- X. laevis: bix1.1.L, bix1.2.L, bix1.3.L, bix1.1.S, bix1.2.S and bix1.3.S
- Complex orthologies not covered by the rules above will be resolved on a case-by-case basis in consultation with the Xenopus Gene Nomenclature Committee (XGNC) and the HGNC.
- Pseudogenes are non-functional DNA sequences that are similar in structure to normal genes. Xenopus pseudogene names should be given the next integer within the gene family name, and the designation “pseudogene” should be appended to the end of the gene name. HGNC pseudogene naming guidelines will be applied.
- Example: fer-1-like 4 pseudogene
Note: Genes that are pseudogenes in one species may be expressed in other species.
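To make the symbol layout for gene families concrete, the following Python sketch splits a symbol into its root, family number, “.n” subfamily number, and laevis sub-genome suffix. The names `ParsedSymbol` and `parse_symbol` and the field names are hypothetical, and the pattern is an assumption that covers only the regular forms shown above; legacy hyphenated symbols such as nkx2-1 are deliberately not handled and return `None`.

```python
import re
from typing import NamedTuple, Optional


class ParsedSymbol(NamedTuple):
    root: str                 # family root, e.g. "bix"
    member: Optional[int]     # family member number, e.g. 1 in bix1
    copy: Optional[int]       # ".n" subfamily number, e.g. 3 in bix1.3
    subgenome: Optional[str]  # "L" or "S" for laevis homeologs


# Assumed pattern: root, optional number, optional ".n", optional ".L"/".S".
_SYMBOL_RE = re.compile(
    r"^(?P<root>[a-z][a-z0-9]*?)"
    r"(?P<member>\d+)?"
    r"(?:\.(?P<copy>\d+))?"
    r"(?:\.(?P<sub>[LS]))?$"
)


def parse_symbol(symbol: str) -> Optional[ParsedSymbol]:
    """Return the parts of a regular gene symbol, or None if it does not fit."""
    m = _SYMBOL_RE.match(symbol)
    if m is None:
        return None
    return ParsedSymbol(
        root=m["root"],
        member=int(m["member"]) if m["member"] else None,
        copy=int(m["copy"]) if m["copy"] else None,
        subgenome=m["sub"],
    )
```

Because expanded families are numbered independently, two parsed symbols with the same `copy` value in different species or sub-genomes should not be treated as one-to-one orthologs on that basis alone.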
- Gene Symbols
- Gene symbols are lower case and italics, and should only contain Latin letters and Arabic numbers (unless specified below).
- Gene symbols are identical to human gene symbols whenever possible.
- Gene symbols should not start with X, Xt, or Xl to indicate the Xenopus species.
- Symbols are short-form representations (or abbreviations) of the descriptive gene name. Symbols should also be at least three characters long, with the first character being a letter.
- Gene symbols should have no spaces and punctuation should only be used if the human equivalent uses punctuation (except for paralogs or laevis homeologs as described above).
- Gene symbols must be unique and should avoid matching common words or abbreviations in order to avoid problems with database searching (e.g. DNA, EGTA, PBS, CAN, GET...).
- Example: bco2
- Symbols for genes in gene families should contain a base or root “word”, followed by increasing numerical identifiers.
- Example: nodal, nodal1, nodal2, nodal3
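Several of the symbol rules above are mechanical enough to check automatically. The Python sketch below is one possible checker, offered as an illustration only: the function name `check_gene_symbol` and the `RESERVED` set are assumptions (the set holds just the examples listed above), and the checks are warnings rather than a substitute for review by the gene name coordinator. Uniqueness and agreement with the human symbol cannot be tested without a database and are not covered.

```python
import re

# Common words/abbreviations to avoid; only the examples from the guideline.
RESERVED = {"dna", "egta", "pbs", "can", "get"}


def check_gene_symbol(symbol: str) -> list:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    # Ignore an optional laevis homeolog suffix (".L" or ".S") for the checks.
    core = re.sub(r"\.[LS]$", "", symbol)
    if len(core) < 3:
        problems.append("shorter than three characters")
    if not core[:1].isalpha():
        problems.append("does not start with a letter")
    if core != core.lower():
        problems.append("contains uppercase letters")
    if re.search(r"\s", core):
        problems.append("contains whitespace")
    if re.search(r"[^A-Za-z0-9.\-\s]", core):
        problems.append("contains characters other than Latin letters, "
                        "Arabic numbers, '.' or '-'")
    if core.lower() in RESERVED:
        problems.append("matches a common word or abbreviation")
    return problems
```

A legacy symbol such as `Xbra` is flagged only for its uppercase letter here; the "X" prefix rule itself is not checked, since valid lowercase symbols can also begin with x.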
- RNA Symbols
- RNA symbols are the same as gene symbols in lowercase and italics and match human symbol nomenclature.
- RNA symbols contain Latin letters and Arabic numbers only.
- Example: bco2
- RNA splice variants: RNAs that arise from splice variants of a gene should use the gene symbol, followed by -v and increasing numerical identifiers.
- Example: fzd4-v1
- Protein Symbols
- Protein names and symbols are exactly the same as the gene name and symbol but have the first letter uppercase, and are not italics.
- The word “protein” or additional terms are not included.
- Example: Bco2
- Protein variants arising from alternatively spliced transcripts of a gene should use the same symbol as the alternative transcript, including the -v and increasing numerical identifiers, with the first letter uppercase.
- Example: Fzd4-v1
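The relationship between gene, RNA variant, and protein symbols can be summarized in a short Python sketch. The function names are hypothetical, and starting the variant numbering at 1 is an assumption based on the fzd4-v1 example. Italics are a typesetting matter and are not represented in the strings.

```python
def transcript_symbol(gene_symbol: str, variant: int) -> str:
    """Symbol for a splice variant: gene symbol, then -v and the variant number."""
    if variant < 1:
        raise ValueError("variant numbers are assumed to start at 1")
    return f"{gene_symbol}-v{variant}"


def protein_symbol(symbol: str) -> str:
    """Protein symbol: same as the gene or transcript symbol, first letter uppercase."""
    return symbol[:1].upper() + symbol[1:]
```

With these definitions, `protein_symbol("bco2")` gives `Bco2`, and `protein_symbol(transcript_symbol("fzd4", 1))` gives `Fzd4-v1`.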
Xenopus Gene Nomenclature Committee (2013)
- Enrique Amaya, University of Manchester, UK
- Julie Baker, Stanford University, USA
- Ira Blitz, University of California, Irvine, USA
- Dale Frank, Technion - Israel Institute of Technology, Israel
- Mike Gilchrist, The Francis Crick Institute, Mill Hill Laboratory, London UK
- Matt Guille, EXRC, University of Portsmouth, UK
- Richard Harland, University of California, Berkeley, USA
- Marko Horb, Marine Biological Laboratory, USA
- Mustafa Khokha, Yale School of Medicine, USA
- Hajime Ogino, Nara Institute of Science and Technology, Japan
- Nicolas Pollet, Institute of Systems & Synthetic Biology, France
- Atsushi Suzuki, Hiroshima University, Japan
- Masanori Taira, University of Tokyo, Japan
- Gert Veenstra, Nijmegen Center for Molecular Life Sciences, Netherlands
- Peter Vize, University of Calgary, Canada
- Aaron Zorn (Chair), Cincinnati Children's Hospital, USA
Please address all comments or questions to the nomenclature administrator at Xenbase, xenbase@cchmc.org.