Nucleic Acids Res. 2023 Mar 21;51(5):2377-2396.
doi: 10.1093/nar/gkad040.

Ultradeep characterisation of translational sequence determinants refutes rare-codon hypothesis and unveils quadruplet base pairing of initiator tRNA and transcript

Free PMC article

Simon Höllerer et al. Nucleic Acids Res.

Abstract

Translation is a key determinant of gene expression and an important biotechnological engineering target. In bacteria, 5'-untranslated region (5'-UTR) and coding sequence (CDS) are well-known mRNA parts controlling translation and thus cellular protein levels. However, the complex interaction of 5'-UTR and CDS has so far only been studied for few sequences leading to non-generalisable and partly contradictory conclusions. Herein, we systematically assess the dynamic translation from over 1.2 million 5'-UTR-CDS pairs in Escherichia coli to investigate their collective effect using a new method for ultradeep sequence-function mapping. This allows us to disentangle and precisely quantify effects of various sequence determinants of translation. We find that 5'-UTR and CDS individually account for 53% and 20% of variance in translation, respectively, and show conclusively that, contrary to a common hypothesis, tRNA abundance does not explain expression changes between CDSs with different synonymous codons. Moreover, the obtained large-scale data provide clear experimental evidence for a base-pairing interaction between initiator tRNA and mRNA beyond the anticodon-codon interaction, an effect that is often masked for individual sequences and therefore inaccessible to low-throughput approaches. Our study highlights the indispensability of ultradeep sequence-function mapping to accurately determine the contribution of parts and phenomena involved in gene regulation.

Figures

Figure 1.
Ultradeep characterisation of 5′-UTR-CDS combinations. (A) Plasmid architecture for the uASPIre of 5′-UTR-CDS pairs. A bxb1-sfGFP gene (translational fusion) controlled by Prha is placed on the same DNA molecule as the substrate modifiable by Bxb1-sfGFP, which is flanked by Bxb1 attachment sites (attB/P). A SpeI site in codons 17 and 18 of bxb1-sfGFP allows for seamless exchange of 5′-UTR and N-terminal CDS. Once expressed, Bxb1-sfGFP irreversibly inverts its substrate from an unflipped into a flipped state creating recombined attachment sites (attL/R). (B) Design of Librandom. The 25 nucleotides preceding the start codon are fully randomised. Additionally, the third positions of codons 2–16 are mutated allowing only synonymous codon replacements. Sequences follow the IUPAC nucleotide code (N: A/C/G/T, H: A/C/T, Y: C/T). TSS: transcriptional start site of Prha. (C) Experimental workflow for the uASPIre of 5′-UTR-CDS pairs. Pooled transformants of Librandom are grown in LB and bxb1-sfGFP expression is induced by l-rhamnose addition. Afterwards, samples are taken at different time points, followed by plasmid extraction, preparation of NGS fragments, pooling of samples and NGS (Methods). NGS fragments are flanked by duplex adapters with sample-specific index combinations (grey boxes). (D) Close-up view of target fragments for paired-end NGS using forward (seqfwd) and reverse (seqrev) sequencing primers. Forward reads are used to identify the first index (idx1) and the state of the recombinase substrate. Reverse reads are used to obtain the second index (idx2) and the sequence of 5′-UTR and CDS. (E) Representative flipping profiles of 5′-UTR-CDS variants from Librandom. For clarity, only the 1000 most abundant variants are displayed. (F) Flipping profiles of all 198 174 Librandom members above the high-quality read-count threshold (Methods). Horizontal lines are time series of individual variants coloured according to the fraction flipped and ranked by the average fraction flipped across all time points from high (top) to low (bottom). (G) Illustration of the IFP (grey area), i.e. the normalised integral of the flipping profile. (H) Correlation between IFP and slope sfGFP0-290min as shown for 31 standard RBSs (Methods). A LOESS function (black line) can be used to interconvert IFP and slope sfGFP0-290min with high confidence. (I) Histogram of the rTR of all variants from Librandom.
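Panel (G) defines the IFP as the normalised integral of the flipping profile. As a rough illustration only, the following Python sketch computes such a quantity with the trapezoidal rule and divides by the sampled time span, so that a variant that is fully flipped at every time point scores 1. The function name, the choice of the trapezoidal rule, the normalisation by the time span and the example time points are assumptions made for this sketch; the exact procedure is described in the Methods of the article and may differ.

```python
def integral_of_flipping_profile(times, fraction_flipped):
    """Trapezoidal area under a flipping profile, divided by the time span.

    Assumption for this sketch: normalising by the sampled time span, so a
    profile that is constantly 1.0 yields exactly 1.0.
    """
    if len(times) != len(fraction_flipped) or len(times) < 2:
        raise ValueError("need at least two matching time points and fractions")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        area += 0.5 * (fraction_flipped[i] + fraction_flipped[i - 1]) * dt
    return area / (times[-1] - times[0])


# Hypothetical time points (minutes) and fractions flipped for one variant.
example_times = [0, 100, 200]
example_profile = [0.0, 0.5, 1.0]
example_ifp = integral_of_flipping_profile(example_times, example_profile)
```

A linearly rising profile such as the hypothetical one above gives an intermediate value, whereas a variant that never flips gives 0.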
Figure 2.
Positional and base-specific effects on translation initiation. (A) Contribution of variable mRNA positions to the observed rTR variance. The relative sum of squares calculated by ANOVA with each position as covariate is displayed. (B) Base-specific effects of the randomised positions. Displayed effects are log2-transformed fold changes (log2 FC) of the mean rTR of variants with a given base at the respective position over the mean rTR of variants with any other base permitted at that position. Positive and negative values correspond to translation-increasing or -decreasing effects, respectively. Crossed boxes indicate unvaried bases. (C) Enrichment of bases amongst strong variants. The log2 FC of a base's relative occurrence amongst strong variants (rTR ≥ 0.5) over its relative occurrence amongst weak variants (rTR < 0.5) is displayed.
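The base-specific effect in panel (B) is a log2 fold change between two group means. A minimal Python sketch of that calculation is given below, assuming the data are available as (sequence, rTR) pairs; the data layout, the function name and the toy values are hypothetical and not taken from the article.

```python
from math import log2


def base_effect(variants, position, base):
    """log2 fold change of the mean rTR of variants carrying `base` at
    `position` over the mean rTR of variants carrying any other base there.

    `variants` is an iterable of (sequence, rTR) pairs (assumed layout).
    """
    with_base = [rtr for seq, rtr in variants if seq[position] == base]
    others = [rtr for seq, rtr in variants if seq[position] != base]
    if not with_base or not others:
        raise ValueError("both groups must contain at least one variant")
    mean_with = sum(with_base) / len(with_base)
    mean_others = sum(others) / len(others)
    return log2(mean_with / mean_others)


# Toy data: single-position "sequences" with made-up rTR values.
toy_variants = [("A", 0.5), ("A", 0.5), ("C", 0.125), ("G", 0.125)]
effect_of_a = base_effect(toy_variants, position=0, base="A")
```

Positive values indicate that the base is associated with higher translation than the alternatives at that position, and negative values indicate the opposite, matching the sign convention of the caption.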
Figure 3.
Effect of different sequence parameters on translation initiation in Librandom. (A) Correlation of GC-content and different mRNA folding metrics with rTR. Spearman's ρ² and Pearson's R² are displayed. (B) Scatterplot between rTR and the best-correlating mRNA folding parameter efeC. (C) Correlation of rTR with local mRNA accessibility. Parameters accT10nt and accC10nt correspond to the mRNA accessibility of a 10-nt window centred around the mRNA position specified on the horizontal axis. Endings C and T denote base pairing calculated by two different energy models (Methods). (D) Correlation of hybridisation energy between 16S rRNA and different mRNA positions with rTR. Positional hybridisation energy (hybpos) is displayed for 9-bp windows centred around the indicated mRNA position (horizontal axis). (E) Relative feature importance of a random forest model trained on Librandom. The ten most important of 248 features are displayed. hybopt: best-correlating hybridisation parameter (see main text). accC1nt, pos+6: accC score for position +6 of the mRNA. Upos −1: one-hot encoded U at position −1 of the mRNA. (F) Mean rTR of variants in Librandom as grouped by the two most predictive features of the random forest, hybopt and efeC. Tick labels mark the boundaries of the respective bins (boxes).
Figure 4.
Overall impact of 5′-UTR, CDS and codon usage on translation initiation. (A) Three additional libraries of combinatorial (Libcomb1, Libcomb2) and full-factorial (Libfact) design were assessed via uASPIre. Libcomb1: combinatorial combination of about 1000 5′-UTRs and 1000 CDSs. Libcomb2: combinatorial combination of about 100 5′-UTRs and 10 000 CDSs. Libfact: ten independent batches, each a full factorial combination of approx. 100 5′-UTRs and 100 CDSs. Libfact was tested in two independent biological replicates. The number of analysed clones is indicated for each library. (B) Impact of the exchange of 5′-UTRs or CDSs on translation initiation. The rTR change (absolute value) of a given 5′-UTR upon exchanging its CDS versus the mean rTR of all variants with that same 5′-UTR is displayed (and vice versa). Black circles within violins are mean relative rTR changes. (C) ANOVA with the mean rTRs of all 5′-UTRs and CDSs in Libfact. Error bars: standard deviation between ten independent batches of Libfact. (D) Correlation of codon usage indices CAI and tAI with rTR. (E) Comparison of rTRs and predicted folding energies (efeC) of variants with low (≤0.1) and high (>0.1) CAI/tAI in all libraries. Black circles within violins are mean rTR/efeC values. (F) Contribution of efeC, CAI and tAI to the rTR variance in all libraries according to an ANOVA with only the three parameters as covariates. (G) Impact of folding and codon usage metrics on the performance of random forest (RF) models trained on Librandom. Sequence parameters for mRNA folding (mfeT, mfeC, efeT, efeC, accT and accC) and codon usage (CAI and tAI) were added or omitted during training. Error bars: Standard deviation of five training repeats with 10-fold cross-validation each. P-values were calculated with Welch two sample t-tests.
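To illustrate how a full-factorial design such as Libfact lets the contributions of 5′-UTR and CDS be separated, the Python sketch below computes the share of total rTR variance attributable to each factor's main effect in a balanced table with one value per 5′-UTR-CDS pair. This is a textbook two-way sum-of-squares decomposition chosen for illustration; the table layout, function name and toy numbers are assumptions, and the ANOVA actually used for panel (C) is specified in the article's Methods.

```python
def variance_shares(rtr):
    """Return (5'-UTR share, CDS share) of the total sum of squares.

    `rtr[i][j]` is the rTR of 5'-UTR i combined with CDS j (assumed layout:
    balanced full factorial, one value per pair).
    """
    n_utr, n_cds = len(rtr), len(rtr[0])
    if any(len(row) != n_cds for row in rtr):
        raise ValueError("table must be rectangular (full factorial)")
    grand = sum(sum(row) for row in rtr) / (n_utr * n_cds)
    utr_means = [sum(row) / n_cds for row in rtr]
    cds_means = [sum(rtr[i][j] for i in range(n_utr)) / n_utr
                 for j in range(n_cds)]
    ss_total = sum((x - grand) ** 2 for row in rtr for x in row)
    if ss_total == 0:
        raise ValueError("no variance to decompose")
    # Main-effect sums of squares for a balanced design.
    ss_utr = n_cds * sum((m - grand) ** 2 for m in utr_means)
    ss_cds = n_utr * sum((m - grand) ** 2 for m in cds_means)
    return ss_utr / ss_total, ss_cds / ss_total


# Toy, purely additive table: rows are 5'-UTRs, columns are CDSs.
toy_table = [[0.0, 0.25],
             [0.5, 0.75]]
utr_share, cds_share = variance_shares(toy_table)
```

In this toy table the two shares sum to 1 because the effects are strictly additive; with real data, whatever is left over reflects interaction between 5′-UTR and CDS plus noise.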
Figure 5.
Assessment of translational anomalies of arginine codon 2 and 5′-UTR position −1 in Librandom. (A) Effect of different synonymous codons in the second triplet of the CDS on rTR and predicted mRNA folding energy (efeC). Black circles within violins are mean rTR/efeC values. *** denote P-values < 10⁻¹⁶ in a Welch two sample t-test. (B) Relationship between relative triplet frequency in E. coli and rTR for the four synonymous triplets in arginine codon 2. (C) Effect of different bases in 5′-UTR position −1 on rTR and efeC. Black circles within violins are mean rTR/efeC values. *** denote P-values < 10⁻¹⁶ in a Welch two sample t-test. (D) Plasmids for the overexpression of native initiator tRNAfMet and mutants thereof (Supplementary Figure S6, Methods). Position 37 (3′-adjacent to the CAU anticodon) of tRNAfMet is mutated from A to C, G or T/U. (E) Growth of E. coli strains carrying plasmids for tRNAfMet overexpression in shake flask cultivations (LB, 37°C). Bars are mean doubling times of independent biological triplicate cultivations with standard deviation as error bars. Dashed lines are the mean doubling time of the respective strain without tRNA overexpression (i.e. empty vector control) with standard deviation as grey shaded areas. For tRNAfMet-A37C, doubling times were not determined (n.d.) due to severe growth inhibition (see main text). (F) Approximately 50 000 variants of Librandom were tested in the presence of overexpressed tRNAfMet variants in E. coli strains containing (WT) and lacking (ΔmetZWV) the chromosomal metZWV locus. (G) Impact of tRNAfMet mutations on the rTR of variants from Librandom. Displayed effects are log2-transformed fold-changes (log2 FC) of the average rTR of variants with a given base at 5′-UTR position −1 over the average rTR of variants with any other base at this position. Black arrows indicate complementarity between 5′-UTR position −1 and position 37 of the tRNAfMet variant. (H) Impact of complementarity between 5′-UTR position −1 and tRNAfMet position 37. Circles are log2-transformed fold-changes (log2 FC) of the average rTR of variants with complementarity or non-complementarity between mRNA and tRNA over the mean rTR of all variants in the same group (i.e. same tRNAfMet variant and strain). Bars are the mean log2 FCs of the three tRNAfMet variants for each case and strain with standard deviation as error bars.
