
XB-ART-60817

Sci Rep 2022 Aug 10;12(1):13634. doi: 10.1038/s41598-022-18053-7.


Evolutionary diversification of epidermal barrier genes in amphibians.

Sachslehner AP, Eckhart L.

Abstract
The epidermal differentiation complex (EDC) is a cluster of genes encoding components of the skin barrier in terrestrial vertebrates. EDC genes can be categorized as S100 fused-type protein (SFTP) genes such as filaggrin, which contain two coding exons, and single-coding-exon EDC (SEDC) genes such as loricrin. SFTPs are known to be present in amniotes (mammals, reptiles and birds) and amphibians, whereas SEDCs have not yet been reported in amphibians. Here, we show that caecilians (Amphibia: Gymnophiona) have both SFTP and SEDC genes. Two to four SEDC genes were identified in the genomes of Rhinatrema bivittatum, Microcaecilia unicolor and Geotrypetes seraphini. Comparative analysis of tissue transcriptomes indicated predominant expression of SEDC genes in the skin of caecilians. The proteins encoded by caecilian SEDC genes resemble human SEDC proteins, such as involucrin and small proline-rich proteins, with regard to low sequence complexity and high contents of proline, glutamine and lysine. Our data reveal diversification of EDC genes in amphibians and suggest that SEDC-type skin barrier genes have originated either in a common ancestor of tetrapods followed by loss in Batrachia (frogs and salamanders) or, by convergent evolution, in caecilians and amniotes.

PubMed ID: 35948609
PMC ID: PMC9365767
Article link: Sci Rep

Species referenced: Xenopus tropicalis
Genes referenced: gapdh s100a1 s100a11 s100a16


	Figure 1 The EDC of caecilians contains SEDC genes. (A) Exon–intron structures of SEDC (single-coding-exon EDC) and S100 fused-type protein (SFTP) genes. Exons are shown as boxes in which the protein coding sequence (cds) is shaded black. (B) Alignment of the nucleotide sequences of the SEDC1 gene of Rhinatrema bivittatum (Rb) (GenBank accession number NC_042630.1, nucleotide positions are indicated above the sequence) and an RNA-seq read from the skin of this species (GenBank accession number: SRR5591419, experiment SRX2848310, read: gnl|SRA|SRR5591419.19486855.2). The TATA box is underlined. Only the first and last 5 nucleotides of the intron are shown. The sequence gap is indicated by “//”. Non-coding sequences of exons are indicated by a black box and coding sequences are shown in white font on a black background. The amino acid sequence of the translation product is shown underneath the nucleotide sequence. (C) Structure of the EDC in amphibians in comparison to the human EDC. The genes of the chromosomal segments bordered by the conserved genes S100A11 and S100A16 in Rhinatrema bivittatum (two-lined caecilian), Microcaecilia unicolor (a common caecilian from French Guiana), Geotrypetes seraphini (Gaboon caecilian), Xenopus tropicalis (tropical clawed frog), Ambystoma mexicanum (axolotl) and Homo sapiens (human) are schematically depicted by arrows pointing in the direction of gene transcription. SEDC genes are shown as red arrows. SFTP genes are shown as blue arrows. Grey arrows indicate S100A11 and S100A16 genes. White arrows indicate other S100 genes located between S100A11 and S100A16 in amphibians. The dashed line in the human EDC indicates that only a subset of SEDC and SFTP genes and no other S100A genes are shown for lack of space. The total numbers of human SEDC and SFTP genes are indicated.
	Figure 2 Semiquantitative analysis of EDC gene expression in tissue transcriptomes of Rhinatrema bivittatum. (A–E) Sequence fragments of the predicted proteins as described in the “Methods” section were used as queries for tBLASTn analysis. The accession numbers for the transcriptomes were as follows: skin (SRX2848310), liver (SRX2848294), lung (SRX2848293), foregut (SRX2848291), kidney (SRX2848286), spleen (SRX2848287). Default settings of tBLASTn were used except for deactivation of the filter for low complexity regions. tBLASTn hits with 100% sequence identity to the query were counted. GAPDH was investigated as a house-keeping gene (E). Note that this analysis allows comparison of expression levels of a particular gene in different organs, but not comparison between expression levels of different genes. (F) Size of tissue transcriptomes. Gb Gigabases.
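The hit-counting step described in the Figure 2 legend could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the tBLASTn results were exported in the standard BLAST tabular format (outfmt 6, where column 1 is the query ID and column 3 the percent identity), the query and read names are hypothetical, and the per-gigabase scaling is an assumption added here to relate counts to transcriptome size (panel F); the legend itself only states that perfect-identity hits were counted.

```python
import csv
import io
from collections import Counter


def count_perfect_hits(tabular_text):
    """Count, per query, the hits with 100% identity in BLAST tabular (outfmt 6) text."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines and comment lines (present in outfmt 7 exports).
        if not row or row[0].startswith("#"):
            continue
        query_id = row[0]
        percent_identity = float(row[2])  # third column of outfmt 6 is pident
        if percent_identity == 100.0:
            counts[query_id] += 1
    return counts


def hits_per_gigabase(hit_count, transcriptome_bases):
    """Scale a raw hit count by transcriptome size (assumed normalization, not from the paper)."""
    return hit_count / (transcriptome_bases / 1e9)


# Hypothetical example rows: two hits for one query fragment, one for a GAPDH fragment.
example_hits = (
    "SEDC1_frag\tread_a\t100.000\t30\t0\t0\t1\t30\t1\t90\t1e-10\t60.0\n"
    "SEDC1_frag\tread_b\t96.667\t30\t1\t0\t1\t30\t1\t90\t1e-08\t55.0\n"
    "GAPDH_frag\tread_c\t100.000\t30\t0\t0\t1\t30\t4\t93\t1e-10\t61.0\n"
)
perfect = count_perfect_hits(example_hits)
print(dict(perfect))
```

As the legend notes, counts obtained this way are only comparable for the same gene across organs, not between different genes.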
	Figure 3 Amino acid sequences of caecilian EDC proteins are rich in glutamine, lysine and proline. (A) Amino acid sequences of Rhinatrema bivittatum SEDC proteins. (B) Sequence logo of the repeat unit in Rhinatrema bivittatum SEDC2 based on the alignment of repeats shown in Supplementary Fig. S2. (C) Amino acid contents of EDC proteins of caecilians and Xenopus tropicalis. For comparison, the values for two human EDC proteins (SPRR1A, an SEDC protein, GenBank accession NP_001186757.1, and trichohyalin, TCHH, an SFTP, GenBank accession number NP_009044.2) are shown on the right. (D) Amino acid sequence alignment of Rhinatrema bivittatum SEDC1 and human SPRR1A. Glutamine, lysine and proline residues are highlighted by grey, blue and green shading, respectively.
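The amino acid contents summarized in Figure 3C amount to a simple residue-frequency calculation. The sketch below shows one way to compute the percentage of glutamine (Q), lysine (K) and proline (P) in a protein sequence; the function name and the toy sequence are hypothetical and are not taken from the paper or its supplementary data.

```python
from collections import Counter


def residue_percentages(protein_sequence, residues="QKP"):
    """Return the percentage of each requested residue (one-letter codes) in a sequence."""
    sequence = protein_sequence.strip().upper()
    if not sequence:
        raise ValueError("protein_sequence must not be empty")
    counts = Counter(sequence)
    return {residue: 100.0 * counts[residue] / len(sequence) for residue in residues}


# Toy sequence for illustration only (not a real SEDC protein).
toy_sequence = "QQKPAAAAAA"
composition = residue_percentages(toy_sequence)
print(composition)
```

Applying the same function to each predicted protein would give values that can be compared side by side, as in the figure.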
	Figure 4 Scenarios for the evolution of SEDC genes in tetrapods. Two alternative scenarios (A, B) for the evolution of SEDC genes in terrestrial vertebrates are schematically depicted. Both scenarios are compatible with the arrangement of gene types in the epidermal differentiation complex (EDC) of caecilians and other main taxa of vertebrates. The relative arrangement of SEDC, SFTP and S100A genes within the EDC of each taxon is indicated on the right.