Introduction

Homologous DNA recombination has long been observed in lambdoid phages. It assists in the formation of chromosomes1 and is a frequent and common contributor to genetic diversity within phage populations2,3. Like many dsDNA viruses, lambdoid phages encode an efficient two-component recombinase system which facilitates recombination between homologous regions of ssDNA through a single-strand annealing (SSA) mechanism4. Homologous recombination promoted by viral two-component recombinases has also been shown to increase the efficiency of DNA replication and facilitate the repair of dsDNA breaks throughout the viral life cycle5,6.

SSA is an evolutionarily conserved pathway present in many organisms, from simple bacteriophages to humans7. However, the first discovered and best characterized of these systems is the Red recombinase from the Escherichia coli bacteriophage lambda (phage λ)8. The Red system is comprised of the genes exo, bet and gam, which encode for the proteins λ exonuclease (λExo [originally, Redα]), Redβ and γ-protein, respectively. The Red system proteins are capable of repairing dsDNA breaks by SSA9, while Redβ alone has been shown to be sufficient for ssDNA recombination10. λExo is a 25.9 kDa protein that forms a toroidal homotrimer11, which binds to dsDNA ends. It resects one of the strands in the 5ʹ to 3ʹ direction, converting the other stand to a 3ʹ ssDNA overhang12,13. Redβ (a.k.a. Beta), a 29.7 kDa protein, binds to and protects this exposed nascent ssDNA, catalyzing its annealing to a homologous DNA molecule14. λExo and Redβ have been shown to interact in a 1:1 monomer:monomer stoichiometry to form an Exonuclease Annealase Two-component Recombinase (EATR) complex (a.k.a. SynExo)5,15. The third protein, γ (Gam), though not physically associated with EATR, binds to and inhibits host E. coli RecBCD nuclease16,17 and enhances recombination efficiency10.

The phage λ Red system is the archetypal EATR complex and has been studied extensively over the last 50 years. Its high recombination rate has made it a convenient model system to study SSA9. λ Red has also emerged as a tool for genetic manipulation, termed recombineering12,13. Recombineering has been reported to be significantly more effective than classical bacterial genetic manipulation, with the Red system being key to the development of these technologies1. There are currently two proposed models for how SSA may occur, which have been recently reviewed by us8 and others5,15. In brief, variable processing of dsDNA ends by λExo leads to the following proposed scenarios: (1) Two partially processed homologous dsDNA molecules are annealed to each other18, (2) ssDNA resulting from a partial or complete digestion of dsDNA is annealed to region(s) of transient genomic ssDNA, such as the lagging strand at a DNA replication fork10,19,20,21. In these instances, Redβ is proposed to mediate homology searching and the annealing of homologous sequences. A version of the latter model forms the basis of another genetic tool, Multiplex Autonomous Genome Engineering (MAGE)22, for the simultaneous targeted mutagenesis of numerous genes. However, many questions related to the process of SSA remain unanswered, particularly regarding the detailed molecular mechanisms underlying the catalysis of homology searching and annealing by Redβ.

Functional homologs of Redβ appear to be almost ubiquitous, with an increasing number of representatives being discovered across viral, bacterial, and eukaryotic domains. Despite this, few of the annealase proteins have received even rudimentary characterization. Therefore, there is a scarcity of information regarding the biochemical, structural and functional attributes of annealases. However, the available data on Redβ and other annealases such as ERF (phage P22), ORF6 (Kaposi’s Sarcoma Herpes Virus), ICP8 (Herpes Simplex Virus-1), RecT (E. coli), human Rad52 and its eukaryotic homologs suggest some commonalities. Annealases are ATP-independent and do not catalyze strand invasion of duplex DNA; thus, they are functionally distinct from the well-known and studied ATPase recombination proteins, such as RecA from E. coli and eukaryotic Rad519,23,24. However, in a manner reminiscent of E. coli SSB, annealases bind to ssDNA, holding it in an extended conformation, removing secondary structures and protecting it from degradation25. The DNA-binding properties of Redβ differ substantially from those of SSB. Redβ binds only weakly to ssDNA, but in the presence of magnesium, it binds stably and with high affinity to a nascent dsDNA, the product of annealing, such as one that forms upon the sequential addition of two complementary ssDNA oligos26. Redβ does not bind at all to preformed duplex DNA26.

Redβ is known to form ring-like and filamentous oligomeric superstructures that are proposed to be functionally important. Electron and atomic force microscopy studies have provided images and low-resolution structural information on the oligomeric complexes of Redβ. In the absence of DNA, full-length Redβ forms an abundance of small rings or split-lock-washer conformers, each with an average of 11-12 subunits27,28. However, in the presence of ssDNA, Redβ forms a heterogeneous mix of 11–19 subunit rings, distorted and part rings, and short left-handed helical filaments which are seen in both the full length27,28 and various truncated versions of Redβ27,28,29. In the presence of heat-denatured complementary DNA, the larger rings and filament superstructures are stabilized and are proposed to represent an annealing intermediate26,27,28. The oligomeric heterogeneity of full-length Redβ was recently expanded upon through a detailed, native mass spectrometry analysis30. Above a protein concentration of 1 μM, numerous oligomeric states from 5 to 14 subunits were detected for both Redβ alone and in the presence of short ssDNA and complementary oligonucleotides. Increasing the Redβ concentration up to 30 μM promoted larger structures, as did an increase in the length of DNA used30.

Like Redβ, other annealase proteins have a propensity to form similar ring-like quaternary structures and helical filament superstructures, which appear to suggest a common mode of action. In addition, rings are often seen at the termini of long helical filaments27,31, giving rise to the hypothesis that the ring and helical forms interconvert27,32. Many such oligomeric complexes have been observed by electron microscopy, including Orf633, Rad5224,34,35,36,37 and ICP831,38,39,40,41. Low-resolution reconstructions of ICP8, both in the form of binary nonameric rings (in the presence of short ssDNA)32 and bipolar filaments42 have been generated32. The bipolar filaments form in both the absence of DNA39,41,42 and the presence of heat-denatured DNA31,40, and likely represent an intertwining of single filaments seen in the presence of ssDNA38. The ring and bipolar filaments have also been proposed as interconverting annealing intermediates31,32,42. The fitting of an available crystal structure of the ICP8 monomer, with a 60 residue C-terminal truncation43, into the reconstructions has provided information on intermolecular contacts involved in oligomerization.

Though they share functional and oligomerization properties, the annealases of bacterial and phage origin appear to form a group distinct from, and likely evolutionarily unrelated to, the eukaryotic-host viral annealases, such as ICP8, and homologs BALF2 and ORF6. The latter are much larger at ~130 kDa, compared to 29.7 kDa for Redβ. This difference in size may reflect a diversity of function and a greater number of interacting partners. In addition to involvement in SSA, ICP8 is proposed to have multiple roles in DNA replication, replication compartment formation and gene expression44,45.

Phylogenetic analyses of key annealases, including human, bacterial and phage derived proteins, appear to resolve at least three evolutionary lineages, each forming a superfamily: Gp2.5-like, Rad51-like and Rad52-like46. Later publications further divided these lineages into five distinct families named for their respective flagship members Sak3, Sak4, Erf, Rad52 and RecT/Redβ47. The only near-atomic resolution structural data of an oligomeric complex available amongst these families is of the undecameric rings of the Rad52 N-terminal domain (residues 1-212)35,48,49. These data revealed an inner groove and an outer groove on the ring surface, where ssDNA was bound. The two grooves were suggested to represent two different modes of DNA-binding and reaction intermediates. The deep inner groove, predicted to be the site of annealing, is lined with positively charged arginine and lysine residues that interact with the phosphate backbone of ssDNA, positioning the nucleotides outwards for complementary base pairing.

The architecture of annealase proteins can be crudely separated into two functionally distinct domains. The N-terminal domain (NTD) mediates oligomerization and contains the catalytic functionality. Residues 1–177 of Redβ (Redβ177) are sufficient for binding to ssDNA50 and catalyzing the annealing of complementary strands in vitro29,51. Although there are some suggestions that the C-terminal domain (CTD) may modulate the strength of DNA interactions, it does not appear to be required52. The much smaller CTDs are implicated in mediating protein-protein interactions. Residues 178–261 of Redβ interact with λExo29 and potentially other host proteins such as SSB53. The NTD and CTD of annealases are flexible relative to one another, which presents major challenges for structure determination. The current available crystal structures of both ICP843 and Rad5235,48,49 are C-terminally truncated. However, while these crystal structures provide tantalizing insights into the function of annealase proteins within that family, they are of limited use for broader extrapolation. The amino acid sequence similarity between annealases is very low, often less than 15–20%46, making sequence alignments less reliable and impractical for the purpose of generating homology models for annealases in different families.

An X-ray crystal structure of the C-terminal domain of Redβ (residues 183–261) in complex with the λExo toroid, showed a compact 3-helix bundle15. However, there is no representative structure of the NTD, the main domain that carries out the primary functions of DNA binding, annealing and homo-oligomerization required for SSA. As Redβ is undoubtedly the archetype annealase for its family, the absence of structural information has, until now, been a major barrier in our understanding of not only bacteriophage annealases, but also how SSA functions.

Here we present a structure of the helical annealing intermediate of Redβ177, in complex with complementary oligonucleotides, determined by cryo-electron microscopy to 3.3 Å. Our structure allows a detailed look at both the nucleoprotein interactions and the interprotein contacts dictating helical superstructure formation. This structure demonstrates the mechanism of action for Redβ177, while also having the potential to be used as a template for homology models for other members of this protein superfamily. Combined with CTD crystal structure and the structure prediction we made using AlphaFold v2.054, we finally present a complete picture of the Redβ structure more than half a century after its discovery.

Results

We present the structure of the DNA-binding and oligomerization domain of Redβ annealase from phage λ. This structure, resolved to 3.3 Å, was obtained by cryo-EM via helical reconstruction, using a C-terminally truncated Redβ177 bound to two 27mer complementary oligonucleotides.

Reconstructed helical cryo-EM density map

The Redβ177 was expressed in E. coli and purified by immobilized metal ion affinity chromatography followed by gel filtration chromatography (Supplementary Fig. 1). The purified protein was incubated with oligonucleotides before being plunge frozen for cryo-EM. We collected 4,701 micrographs, from which 259,649 helical particles were identified using template-based helical filament tracing and subsequent particle curation in cryoSPARC55. The micrographs contained helices in a variety of conformations, but most were uniform, compact “telephone cord-like” solenoids, though some sections were less condensed (Fig. 1a). The 2D class averages generated during the reconstruction (Fig. 1b and Supplementary Fig. 2a), together with their Fourier transformation (Fig. 1c), clearly show the features of the helical structures formed by Redβ177. Two subpopulations of particles, representing different helical symmetries, were identified in 2D classification—a 1-start and a 2-start helical assembly, consisting of 26,525 and 77,799 particles, respectively (Supplementary Fig. 1). Previous work has established that Redβ forms left-handed helices27, therefore left-handed helical symmetry was imposed during further processing.

Fig. 1: Cryo-EM structure of Redβ177 1-start helical assembly, resolved to 3.3 Å.
figure 1

a A representative electron micrograph displaying Redβ177 helices. Among the 4701 electron micrographs collected, all images identified as containing our protein sample displayed similar helical filaments. b A representative 2D class average of the helical filaments. c Fourier transform of a 2D class average. d Cryo-EM map of the 1-start helical assembly resolved to 3.3 Å. e Local resolution of the structure visible from both a side and a top view. f Plot showing the Fourier shell correlation (FSC) versus spatial frequency, for both the masked (red) and unmasked (blue) maps. The resolution of the structure was assessed using the point where the FSC curve crosses a correlation value of 0.143.

Helical reconstruction and subsequent symmetry determination of the 1-start helical population identified a twist of −12.947 degrees and a rise of 2.078 Å. Following refinement of the local motion and CTF (contrast transfer function), symmetrized, non-uniform helical refinement of the 1-start particles resulted in a map with an estimated final resolution of 3.3 Å (Fig. 1d–f), according to the gold-standard FSC = 0.143 (GSFSC) criterion56. However, the local resolution of the map, as determined by cryoSPARC, ranges from 2.8 Å to 4.8 Å (Fig. 1e). The overall image processing workflow can be seen in Supplementary Fig. 3 and structure statistics can be found in Supplementary Table 1. A single turn of the 1-start helix contains 27.8 asymmetric units, each corresponding to a Redβ177 monomer (Fig. 2a). The density of double-stranded DNA, which represents an average of possible DNA sequences bound to a monomer of Redβ177, can be seen wrapping around the circumference of the protein helix (Figs. 1d, e and 2a). The presence of double-stranded DNA in the map indicates that we have captured an annealing intermediate of the SSA homologous recombination reaction. The attempt to reconstruct the structure of the 2-start helical complex did not yield a reliable map, likely due to the presence of irresolvable continuous variation in the helical symmetry parameters. Therefore, we are presenting only the 2D class averages that indicate a helical assembly composed of two apparently identical Redβ177 filaments (Supplementary Fig. 2b).

Fig. 2: Redβ177 and DNA structure.
figure 2

a The ribbon structure of the Redβ177 helix showing the interaction between protein and DNA (black and white) in a planar conformation. b The cryo-EM map of a single Redβ177 monomer (gray) bound to DNA, with the ribbon structure within it (purple). Close-up views of this monomer c demonstrates the fit of amino acids and nucleotides within the density map.

Atomic model confirms Redβ177 is structurally similar to human Rad52

A total of 161 residues of Redβ177 were built de novo into the density (Fig. 2b). This encompassed residues 3–169, except for a flexible loop region (residues 133–138) which could not be accurately modeled (see below). The density for residues 1–2 and 170–177 was of insufficient quality to allow modeling. However, the quality of the map allowed for the accurate positioning of the side chains of majority of the remaining residues and of the oligonucleotides (Fig. 2c). The coordinates for the atomic model have been deposited into the PDB (7UJL) and the map into EMDB (EMD-26566). In general, the Redβ N-terminal domain consists of a mixed αβ fold consisting of five α-helices, a 310 helix and five β-strands, which form two separate β-sheet elements (Fig. 3a). The structure contains a recognizable βββα core motif (β3–5 and α5), which is a small OB-fold, present in DNA-binding proteins57. This motif is also present in the crystal structure of the N-terminal domain of Rad52 (Fig. 3b), where it contributes to the formation of the deep groove of the inner DNA-binding pocket49. However, in Redβ177 the β-strands are much shorter, 5–6 amino acids in length, relative to those of the Rad52 βββα motif (10-14 amino acids in length). As a result, the outward-facing DNA-binding groove on the surface of Redβ177 is much shallower and more exposed than that of Rad52 (Fig. 3a, b). A structural similarity search with Redβ177, using the DALI server58, identified Rad52 as the most similar structural fold. This is consistent with prior work placing Redβ and many other bacterial annealases in a superfamily that is proposed to share a common fold46.

Fig. 3: Conserved annealase structural motifs.
figure 3

Cartoon models comparing the secondary structures of Redβ177 (a) and Rad52 (5XRZ) (b). Color denotes conserved secondary structures between the proteins, while white represents regions that are not conserved between the structures and DNA is shown in purple. c The ribbon structure of the Redβ177 monomer, with colors indicating conserved motifs 1−6 that were identified by the MSA (Supplementary Fig. 4). Beige indicates no conservation. d Amino acid sequence conservation of Redβ177 compared to other annealases. The extent of sequence conservation (%) is denoted by thickness and color.

Multiple amino acid sequence alignment identifies conserved motifs

Using the Redβ protein sequence (UniProtKB: P03698) as a reference, we constructed a multiple sequence alignment (MSA) based on the top 1000 hits from a BLAST search of the UniProt Reference Cluster (UniRef) 90% database. Following refinement, our MSA consists of 425 reference cluster sequences and is 411 residues in length (Supplementary Data 1). From this alignment, we identified eight conserved motifs based on >50% consensus across the MSA (Supplementary Data 1). Six of these motifs were able to be mapped to the structure of Redβ177 (Fig. 3c and Supplementary Fig. 4), the other two motifs are located in the C-terminal domain. The amino acid sequence conservation was also mapped onto the structure of Redβ177 (Fig. 3d). This showed that the most highly conserved residues, which are in motifs 2, 3, and 6, form the DNA-binding pocket. Two residues in particular, F66 and W143, are almost universally conserved across the sequences in the database.

Oligomerization

The model of Redβ177 was reconstructed as a compact, solenoid, helical filament with dsDNA wound around the outside. This superstructure is stabilized through substantial electrostatic contacts between adjacent monomers in the complex (Fig. 4a, b, Supplementary Fig. 5). Much of the large interaction surface area is comprised of patches of positive and negative charge, which are mirrored on the adjacent monomer face (Fig. 4b). In particular, a cluster of strong electrostatic interactions formed by E125, E121, K148ʹ, and R149ʹ; R161 and D80ʹ; D68, K61ʹ and K36ʹ, may help to stabilize the area around the DNA-binding pocket (Fig. 4b). Some of these residues such as K36, K61, and R161 are also involved in DNA-binding. Therefore, oligomerization of adjacent units may also be indirectly facilitated by the binding of the DNA substrate, as described below.

Fig. 4: Electrostatic surface representation of Redβ177.
figure 4

The surface is colored by charge with positive (blue), negative (red) and electroneutral (white). a Electrostatic representation of three monomers Redβ177 bound to dsDNA (pale green). b The electrostatic interactions between two adjacent monomers within the helix. The areas where the monomers interact are outlined in orange (c, d). Interactions between the top (c) and bottom (d) surfaces of the Redβ177 monomers within the helix. The orange ovals highlight the interacting residues.

The interacting bottom-to-top monomer surfaces contribute relatively little to the stabilization of the helix (Fig. 4c, d). There are only two points of contact, E102 and N101; and T107ʹ and E14ʹ, between the top and bottom surfaces, respectively. Therefore, the side-by-side contacts are the important determinants of helical characteristics such as the pitch and rise. Likewise, the abundance of electrostatic interactions between adjacent monomers drives elongation of the helical fragment.

DNA-binding properties of Redβ177 and the mechanism for the catalysis of DNA annealing

In order to capture an annealing intermediate structure, which would shed light on the mechanism of SSA, our sample was prepared for cryo-EM by sequentially incubating Redβ177 with complementary ssDNA oligonucleotides. This approach has been previously used for Redβ in in vitro experiments to form annealing intermediates27,28,30 and it resulted in clear density for a double-stranded DNA annealing intermediate in our structure. The outward-facing DNA-binding groove accommodates both strands of the annealed dsDNA product, with an observed stoichiometry of four base pairs per Redβ177 monomer (Fig. 5a). However, adjacent monomers bind the template strand in an overlapping manner, such that a single monomer makes contact with nucleotides involved in forming 6 base pairs. The DNA-binding groove is lined with positively charged residues (Fig. 4b), which accommodate the negatively charged phosphate backbone of the DNA, through a combination of electrostatic interactions and hydrogen bonds (Fig. 5b).

Fig. 5: DNA binding to Redβ177.
figure 5

dsDNA binds to Redβ177 with a site size of 4, as numbered in a, at the position of the phosphate of each nucleotide. b Hydrogen bonding between Redβ177, the template (top panel) and incoming (bottom panel) DNA strands. Close-up views of the H-bonding network stabilizing the kink in the phosphate backbone (i) and the π-stacking interaction between F66 and a nucleotide (ii) are shown. Ribbon diagrams of DNA (green) binding to three Redβ177 monomers (purple) clearly show the kink in the phosphate backbone of the template strand (c) as well as the finger loop (d), highlighting the residues which could not be modeled due to lack of density. A prime (ʹ) symbol denotes numbering in an adjacent molecule.

A total of nine hydrogen bonds between the phosphate backbone and the side chains of R128, W143, H153, K154 and R161 secure the inner, template ssDNA strand in the DNA-binding groove of Redβ177 (Fig. 5b, top panel). These bonds result in the nucleotides adopting a planar orientation and facilitate solvent exposure to provide a template for base pairing to incoming homologous ssDNA. In contrast, the incoming DNA strand is only stabilized by two hydrogen bonds to the phosphate backbone through K69 (Fig. 5b bottom panel). In addition, the incoming strand is stabilized in a planar orientation through a π-stacking interaction between the phenyl ring of F66 and the purine or pyrimidine ring of the position 3ʹ nucleotide and two hydrogen bonds between R128 and the adjacent nucleotide at position 2ʹ (Fig. 5b, panel ii). These interactions may tether and orient the incoming ssDNA to the annealing complex, whilst allowing it to slide against the template strand until regions of base pairing stabilize the duplex. The annealed duplex DNA is also stabilized in an extended, planar conformation in the DNA-binding groove (Fig. 5), which is atypical of B-form helical dsDNA conformations often found in solution59. It is likely that upon annealing, the planar conformation is maintained through an induced distortion in the phosphate backbone of the inner (template) DNA strand, which is stabilized by a network of seven hydrogen bonds with R128, W143 and K154 (Fig. 5b, panel i). This kink (Fig. 5c) is necessitated due to the shorter helical radius, and thus shorter path of the inner DNA strand, relative to the outer strand.

Residues 133–138 could not be modeled in our structure. This region forms part of a larger, positively charged, loop (residues 126–142) that forms the roof of the DNA-binding groove (Fig. 5d). It is likely that residues 133–138 that form the tip of the loop are highly flexible, the motion of which (i.e., conformational variability) precluded the formation of structured density for this region during reconstruction. To see how this loop may be moving and functioning, we ran molecular dynamics (MD) simulations using a trimeric assembly of apo protein (Redβ177 only), Redβ177 bound either to template ssDNA or to both DNA strands. Molecular dynamics simulations were run both with weak harmonic restraints applied on the two outside monomer and DNA backbone atoms (Fig. 6) and also without any restraints, as a control (Supplementary Fig. 6). Weak harmonic restraints help to maintain the oligomeric Redβ structure (modeled by a trimeric structure) and the intermediate DNA structure. Each system was simulated for 1 µs during which the RMSD of the protein backbone positions and RMSF of the Cα atom of the central trimeric subunit were calculated (Supplementary Fig. 6d–g).

Fig. 6: Molecular dynamics simulations of Redβ177.
figure 6

The sampled ensemble of apo (A), ssDNA-bound (B) and dsDNA-bound (C) states used in the molecular dynamics simulations. Trimeric systems with weak harmonic positional restraints on the backbone atoms of two outside protein chains and DNA were simulated. One hundred snapshots with a time interval of 10 ns from 1 μs simulations are shown. The structures were fitted to the backbone atoms for the monomer in the middle of the trimeric system. Blue, silver, and red colors correspond to early, mid and late time intervals, respectively.

As we hypothesized, the MD results clearly demonstrate the flexibility of this loop in all simulations. However, the greatest range of movement, as indicated by the highest average RMSD, was seen in the absence of DNA (Fig. 6A and Supplementary Fig. 6a, d–g). The conformations of the loop varied from completely open, which would allow the diffusion of the incoming DNA strand into the binding site, to completely closed, which would keep the DNA in there. The range of motion decreases considerably in the presence of ssDNA and further still, in the presence of nascent dsDNA (Fig. 6B, C and Supplementary Fig. 6b–g). The decrease in motion is likely due to steric restrictions between the protein and the DNA. However, the flexible loop-tip contains two arginine residues, R134 and R137 (Fig. 5d), which may interact with the phosphate backbone of the incoming DNA strand to facilitate base pairing and thereafter helping to clamp the nascent dsDNA in place.

Discussion

Despite its discovery in 196660, the structure of Redβ annealase has remained unknown. It has proven to be extremely difficult to crystalize Redβ, despite trying a variety of truncations and added substrates. The crystals that could be grown did not diffract well enough to yield a structure (personal communication with Dr. Xinhua Ji at NCI/NIH, MD, USA and Dr. Charles Bell at OSU, OH, USA). Redβ177 used in this study was first described by the Bell group in 200652. Our initial cryo-EM trials with apo protein yielded solenoid helices that were relaxed, and therefore, heterogeneous. Upon the addition of DNA and the optimization of buffer conditions, we have obtained compact solenoid helices of DNA-bound Redβ177 that are regular enough for helical image processing. The structure we have obtained unravels the molecular mechanisms of many observations and results published during the half a century since Redβ was discovered.

The first structural information of an annealase came from human Rad5248. Redβ and Rad52 share only ~17% amino acid sequence identity and previous bioinformatic studies placed the two proteins into superfamilies of their own7,47. However, similarity is evident at the level of the 3D structures (Fig. 3a, b) and the DALI server identified Rad52 as the most similar structure to Redβ177. Structures of Rad52 bound to ssDNA have been reported in two different modes, described as ssDNA bound to either the inner or outer DNA-binding groove49. Comparing our structure of DNA-bound Redβ177 to the Rad52 structures, we see that the placement of the template ssDNA in Redβ177 is similar to the ssDNA found at the inner binding site of Rad52 (5XRZ). Both proteins have DNA-binding sites made up of the β-β-α-β-β-β-α secondary structure motif that is predicted to be conserved among the annealases46. However, a major difference between the two proteins is the location of the flexible loop that is proposed to hold the DNA in place. In Redβ, the flexible loop forms the ‘top’ of the DNA-binding site, whilst in Rad52 a functionally similar loop is found at the ‘bottom’ of the DNA-binding site, located between the first two β-strands in the motif (Fig. 3a, b). The disparity between sequence and structure similarity between Redβ and Rad52 could be explained by two scenarios: (1) Both proteins are homologous and share a common ancestor but have diverged significantly over time. However, as structure is more conserved than sequence61, the proteins remained structurally similar despite divergent sequence evolution. (2) The proteins are not homologs, but their structure is biologically favorable for the role of an annealase. In this scenario, the similarities in protein structure would be the result of convergent evolution towards an enzymatic optimum.

DdrB from Deinococcus radiodurans, which was originally identified as belonging to the Rad52 superfamily of annealases7, has also been captured as a DNA annealing intermediate structure. Two pentameric DdrB rings assemble to form a DNA-binding channel bringing complementary ssDNA strands together facing each other in the middle of the double-ring complex62. Based on the DdrB structure, when we saw the conversion of the Redβ177 solenoid structure from relaxed to tight following the addition of complementary ssDNA strands, we predicted that ssDNAs would be bound in between the ‘rings’ of the helix (i.e., lock-washers forming each turn of the helical filament), with the DNA bases facing each other. We had attributed compaction of the solenoid to annealing between ssDNA strands, bringing the lock-washer rings together. Therefore, the location of the DNA in our reconstructed map, (wrapped around the outside of the helical Redβ177 filament) was an unexpected observation. A previous study proposed that ssDNA bound around the outside of Redβ rings but that annealed dsDNA was located on the inside of the helical filament27. We show that the predicted location of the ssDNA was correct, and that this DNA-binding groove is also the location of the annealed dsDNA. What is similar between the DdrB and Redβ is the conformation of the DNA in the annealing intermediate: the nascent dsDNA is found in a planar conformation, unlike the various helical conformations dsDNA forms in solution59. This also explains why Redβ does not bind to pre-annealed dsDNA26, as the double-helical structure known as the B-form DNA would not fit into the DNA-binding site of Redβ.

The homo-oligomerization of Redβ has been reported by several groups using various approaches and techniques, including native mass spectrometry30, electron microscopy27,50 and atomic force microscopy28,29. These results are briefly summarized above and in more detail in our recent review8. Our reconstruction shows how Redβ177 oligomerizes with substantial contacts between adjacent subunits in the helix, but there are limited top-bottom contacts between corresponding monomers in the next turn of the helix (Fig. 4). This may explain the heterogeneity in the compactness of the solenoid-like helices of Redβ177 observed in the micrographs. Our structure also shows how DNA-binding reinforces interactions between adjacent monomers, favoring oligomerization (Fig. 5b). There are 4 bases per monomer in the DNA-binding groove, but each monomer makes contacts with the phosphates from 6 nucleotides of the template strand (Fig. 5a, b). This may explain the discrepancy in the literature regarding the site size of Redβ, which has been reported as 4–5 nts51,63.

Our structure has elucidated an extensive hydrogen bonding network involved in DNA binding, validating predictions made in previous studies. In particular, highly conserved R161 forms a hydrogen bond with position 4 of the template strand (Fig. 5b), supporting previous observations that mutation of this residue is correlated with a 1000-fold reduction in annealing efficiency in vivo46. Lysines at positions 36, 61, 111, 132, 148, 154, and 172 were reported to be critical for DNA binding in vitro and recombination in vivo. K61 was specifically reported to be critical for DNA annealing but not for initial ssDNA binding, and a role in binding to the incoming strand of DNA was suggested64. Our structure shows that lysines 36, 61, 132, and 154 are located close to the DNA-binding site, but 111 is not (K172 is unresolved in our structure). In addition, K69, which was not indicated to be involved in DNA-binding previously, is also shown to interact with the DNA (Fig. 5b). Interestingly, K61 is closer to the template DNA than to the incoming DNA strand, while lysines 36, 69, 132 are closer to the incoming strand. Although K148 is within close proximity to the DNA-binding site, its side chain points away from the DNA. Moreover, other positively charged residues H146, H153, R128 and R149 are also near the DNA, with R128 located within hydrogen bonding distance of both DNA strands, while W143 can make a hydrogen bond with the template strand.

Redβ catalyzed annealing of homologous ssDNA was reported to be an apparent first-order reaction (the half-time of the reaction was independent of the DNA concentration) and a synaptic mechanism was suggested in which the DNA molecules are first brought together irrespective of homology, which is then followed by homology searching and base pairing63. The annealing intermediate structure of Redβ177 that we determined reveals the molecular mechanism for a first-order annealing reaction. Our sample was generated by the sequential addition of the two complementary 27 base pair oligonucleotides to Redβ177 and the structure presented suggests that the binding of the template strand to Redβ is required as the initial step. After this, we hypothesized that the following events may be taking place: the incoming complementary ssDNA can bind transiently while base pairing (and therefore, homology) between the two DNA strands is being evaluated, possibly held in place by the flexible loop (composed of residues 130–139 most of which are unresolved in our structure) to which we refer to as the ‘finger loop’. Once the finger loop is reopened, the incoming ssDNA would stay in place if a strong enough base pairing is achieved; otherwise, it would diffuse out or slide to allow a new base pair match to be tested. This opening-closing and 2D and/or 3D diffusion of the incoming ssDNA would be iterated until homologous pairing is established between the two ssDNA strands bound to Redβ.

To test our hypothesis for the action of the finger loop, we ran molecular dynamics simulations. Our results showed that the mechanism we proposed is supported by the finger loop moving between open and closed conformations. The motion of the finger loop is likely a dynamic process that can be described as tapping behind the incoming ssDNA, keeping it in place to allow more time for the homology search to take place, but loose enough to allow sliding (2D diffusion) and thereby the detection of homology in situations where the nucleotide sequence may be shifted. The simulations also showed that Redβ is most flexible when it is not bound to any DNA and then becomes more rigid when bound to the template ssDNA. Redβ is then further stabilized upon forming an annealing complex with two DNA strands. This is in agreement with the reports of Redβ forming stable DNA annealing complexes26.

Redβ177 contains a 310 helix between the finger loop and α5; 310 helices are often located between an alpha helix and a flexible region, where the weaker i + 3 H-bonding network is proposed to allow transient unraveling65. Along with α5 this 310 helix forms part of motif 6 (Fig. 3c and Supplementary Fig. 4) which is highly conserved across annealases (Supplementary Data 1). In addition, W143 of the 310 helix is almost universally conserved across all species. Our structure identified that W143 forms part of the H-bonding network that stabilizes the induced kink in the phosphate backbone of the template DNA strand. Therefore, the position of W143 in this 310 helix is likely to be of functional importance. The MD simulations indicated that the 310 helix was unstable, oscillating through distorted conformations in concert with the movement of the finger loop. Therefore, we propose a mechanism where transient relaxation of the 310 helix as the finger loop opens allows H-bonding of W143 to the phosphate backbone of the template strand. Then, as the loop closes and the 310 helix reforms, the phosphate backbone may be pulled into the distorted conformation seen in our structure. The distortion in the phosphate backbone allows the template DNA strand to remain planar. This facilitates base pairing between the nucleotides of the template and incoming strands, both of which are held in a planar orientation through F66 (π-stacking) and R128 (H-bond), enhancing the formation of a stable annealing intermediate.

Finally, we have created a molecular scene by piecing four structures together — 4WUZ (λExo with DNA66), 6M9K (λExo with the C-terminal domain of Redβ53), our Redβ177 model, and the AlphaFold prediction (RedβAF54) of full-length Redβ (Supplementary Fig. 7). This scene (Fig. 7) shows dsDNA bound to λExo, from which the digested nascent ssDNA would be extruded from the center of the λExo toroid towards the Redβ molecules bound to the ‘back’ side of λExo (only one Redβ is shown, but there would be 3, one for each λExo monomer). The RedβAF prediction shows helix α7 placed near the DNA-binding pocket of Redβ. Therefore, the superposition of the Redβ177 and RedβAF places α7 in a position that is partially clashing with the bound DNA. If AlphaFold is correctly predicting the position of this helix in the apo state, our structure of Redβ shows that helix α7 must be displaced via a conformational change to allow DNA binding. This may function as an auto-inhibition mechanism, to ensure that Redβ would not bind to ssDNAs generated in the cell during other processes, such as DNA replication, and that it is only able to bind to nascent ssDNA generated via digestion of dsDNA by λExo. The nascent ssDNA would be effectively “loaded” onto Redβ by this coupling mechanism, coordinating the two reactions catalyzed by the individual components of this two-component recombinase: DNA digestion by λExo and DNA binding and annealing by Redβ. Our molecular scene provides a glimpse of what this coupling may look like, furthering our understanding of the molecular mechanisms of SSA.

Fig. 7: Superimposed molecular structures of Redβ, λExo and DNA.
figure 7

This molecular scene was created by piecing four structures together. These include the structure of λExo bound to DNA66 (4WUZ, three λExo monomers represented as surfaces in different shades of blue and DNA in green); the structure of λExo bound to the C-terminal domain of Redβ53 (6M9K, C-terminal domain of Redβ shown in light purple and λExo superimposing with the one in 4WUZ is hidden); the Redβ177 model (DNA in black and protein in dark purple); and the AlphaFold prediction for the full-length Redβ54 (the α7 helix shown in pink and the N- and C-terminal domains superimposing with the corresponding domains in our Redβ177 structure and 6M9K, respectively, are hidden).

There are many questions that remain about the molecular mechanisms of SSA: how can a trimeric λExo (3 subunits) and a dodecameric Redβ (12 subunits) form a complex with 1:1 monomer:monomer stoichiometry? Or, how does Redβ form rings and helices on ssDNA in concert with dsDNA digestion by λExo? Moreover, the structure we determined poses questions, such as what is the function and significance of the finger loop in Redβ and Rad52 and how is the nascent dsDNA released from the stable Redβ helices? As efforts to answer those questions are ongoing, our structure of Redβ177 provides insights into the mechanism of how this protein functions.

Methods

Cloning of Redβ177

A pET-28a(+) plasmid containing the wild-type bet gene was kindly provided by the laboratory of Dr. Donald Court (National Cancer Institute, NIH, Frederick, MD, USA). Redβ177 was assembled into a pET-24a(+) plasmid for expression with a C-terminal hexa-histidine (His6) tag. The DNA sequence corresponding to Redβ177 was PCR amplified using primers 5ʹ-GTTTAACTTTAAGAAGGAGATATACCATGAGTACTGCACTCGCAAC-3ʹ and 5ʹ-GATCTCAGTGGTGGTGGTGGTGGTGAGAAGAGCGCTCGGCTTCATCCTTG-3ʹ, for the simultaneous addition of pET-24a(+) overlap sequences. The pET-24a(+) plasmid was PCR amplified using primers 5ʹ-GGTATATCTCCTTCTTAAAGTTAAACAAAATTATTTCTAG-3ʹ and 5ʹ-TCTTCTCACCACCACCACCACCACTGAGATC-3ʹ. Following PCR, both DNA fragments were purified by agarose gel electrophoresis and subsequent extraction. The recombinant plasmid was generated by a NEBuilder® HiFi DNA Assembly (New England Biolabs) reaction, under standard conditions and using a 2:1 molar ratio of insert:plasmid DNA. The resulting construct, pET-24a-bet177-His6, places the expression of Redβ177-His6 (Redβ177) under the control of the E. coli lac expression system. The correct sequence of the plasmid was confirmed by Sanger sequencing (Garvan Molecular Genetics, Garvan Institute of Medical Research, Sydney, Australia). E. coli strain BL21(DE3) was transformed by the recombinant plasmid and used for protein expression.

Expression and purification of Redβ177 protein

E. coli BL21(DE3) cells were cultured in LB media supplemented with 50 µg/ml kanamycin, at 37 °C and with shaking at 200 rpm. When the culture reached an OD600 nm of 0.6, protein expression was induced by the addition of 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG, Astral Scientific). At 3 h post induction, the cells were harvested by centrifugation at 6100 × g for 7 min at 6 °C. The cell pellet was resuspended in 50 ml of Lysis buffer (50 mM Tris pH 7.5, 300 mM NaCl, 10% v/v glycerol, 10% w/v sucrose) and cells were lysed in an ice bath by sonic pulses produced by an Ultrasonics digital sonifier (Branson). The lysate was clarified by centrifugation at 38,000 × g for 30 min at 4 °C. Following centrifugation, the supernatant was passed through a 0.45 µm syringe-driven filter (Sarstedt).

At a flow rate of 1 ml/min, controlled by an ÄKTA Pure FPLC equipped with a sample pump (Cytiva), the filtrate was pumped through a 1 ml HisTrap HP column (Cytiva), pre-equilibrated with 50 mM Tris pH 7.5, 300 mM NaCl, 10% v/v glycerol and 10 mM imidazole. The flow-through from the column, containing unbound proteins, was collected and retained. The column was washed at 1 ml/min with 10 column volumes of wash buffer (50 mM Tris pH 7.5, 300 mM NaCl, 10% v/v glycerol and 20 mM imidazole), before bound proteins were eluted with a 0–100% gradient of elution buffer (50 mM Tris pH 7.5, 300 mM NaCl, 10% v/v glycerol and 500 mM imidazole), over 30 column volumes at a flow rate of 1 ml/min. The column was then washed with 10 column volumes of elution buffer, before being re-equilibrated.

The flow-through from the first separation was pumped back onto the column for further extraction of Redβ177. Elution fractions from both chromatography separations were analyzed by SDS-PAGE. Fractions that were highly enriched for Redβ177 were pooled and concentrated in a 10 kDa MWCO centrifugal concentrator (Merck Millipore) to a total volume of <2 ml. The concentrated protein was centrifuged at 21,000 × g, for 30 min at 4 °C, following which there was no visible pellet. The supernatant was injected onto a HiLoad 16/600 Superdex pg gel filtration column (Cytiva), equilibrated with 50 mM Tris pH 7.5, 50 mM NaCl, 10% v/v Glycerol. Redβ177 eluted as a single, symmetric peak at 1 ml/min flow. The central peak fractions were pooled and estimated to be >95% pure by SDS-PAGE. Protein concentration was assessed by measuring absorbance at 280 nm using a NanoDrop 2000c spectrophotometer (ThermoFisher Scientific) and the extinction coefficient of ε = 34,950 L mol−1 cm−1. Protein was stored stable for up to 3 weeks at 4 °C, or flash frozen in liquid nitrogen and then stored at −80 °C.

Helical complex assembly and preparation of cryo-EM grids

To form the helical assemblies, purified Redβ177 was incubated with complementary 27mer oligonucleotides. Redβ177 was incubated with Oligo 1 (5ʹ-TGCAGCAGCTTTACCATCTGCCGCTGG-3ʹ) in binding buffer (20 mM KH2PO4 pH 6.0, 5 mM MgCl2) for 30 min at room temperature. Oligo 2 (5ʹ-CCAGCGGCAGATGGTAAAGCTGCTGCA-3ʹ) was then added, and the incubation was continued for a further 30 minutes. A molar ratio of 7.7:1:1 of Redβ177:Oligo 1:Oligo 2 was used. To remove any remaining traces of glycerol from the storage buffer, the complex was dialyzed against the binding buffer overnight at 4 °C, using a 10 kDa MWCO Slide-A-Lyzer MINI Dialysis cassette (ThermoFisher Scientific). A 2.5 µl aliquot of the dialyzed complex was deposited onto gold UltrAuFoil R1.2/1.3 Au 300 mesh cryo-electron microscopy grids (Quantifoil) and plunge frozen into liquid-nitrogen cooled liquid ethane using a Mark IV Vitrobot (ThermoFisher Scientific).

Cryo-electron microscopy and helical reconstruction of Redβ177-6xHis helical assemblies

The data were collected using a Titan Krios electron microscope (ThermoFisher Scientific) which operated at 300 kV. The Titan Krios was equipped with a Gatan K2 Summit detector and a Gatan BioQuantum LS 967 energy filter, operated in unfiltered mode. Initial microscope alignment was performed using TEM User Interface (ThermoFisher Scientific) and DigitalMicrograph (Gatan). Data were collected in electron counting mode, using a pixel size of 0.84 Å/pixel as well as a calibrated sample-to-pixel magnification of ×59524.

Movies were collected as a series of 50 frames with a total accumulated dose of 50 e2. Data were collected using automated data collection in EPU, with a defocus range of −0.6 to −2.2 μm. All processing was performed in cryoSPARC55 unless otherwise noted. Movies were gain corrected and aligned using patch-based motion correction and patch-based CTF correction was performed. Helical filament tracing and extraction was performed with a box size of 576 pixels and a particle separation distance of 55 pixels. Extracted particles were subjected to 2D classification and subsequent 2D class averages were used as templates for further rounds of template-based helical filament tracing. This process was repeated twice until picking had reached sufficient coverage of the helical filaments. Particles were then re-extracted with a box size of 448 pixels and rescaled to a box size of 224 pixels to assist with the speed of reconstruction. Extracted particles were subjected to several rounds of 2D classification to identify suitable subsets for reconstruction. Fast Fourier transforms of 2D classes were generated in Fiji67.

Suitable classes identified via 2D classification were initially subjected to 3D helical reconstruction with no helical symmetry applied. Once helical reconstruction had converged, particles were re-extracted to a box size of 448 pixels with no rescaling. Helical reconstruction was then repeated, and the helical symmetry parameters were estimated using real-space helical symmetry estimation. Helical reconstruction was then performed with helical symmetry and subsequent non-uniform refinement. CTF refinement and local motion correction were then performed until convergence. The final resolution of the reconstructed density was estimated using the gold-standard Fourier shell correlation (GSFSC) criteria.

Structure building and refinement

De novo atomic modeling in COOT68 was used to generate the initial structure for the Redβ177 monomer and the bound nucleic acid. The model generated in COOT was then refined using ISOLDE69. As the density corresponding to the bases of the bound oligonucleotides within the asymmetric unit represents a helically symmetrized average density of the base pairs within the oligonucleotide substrate sequence, insufficient detail was available to assign a specific base identity to each of the positions in the asymmetric unit or to reliably model the associated hydrogen bonding between strands. To model the nucleic acids in the asymmetric unit, an arbitrary representative sequence from the substrate oligonucleotides was selected and standard Watson-Crick base pairing applied. The atypical confirmation of the template and non-template strands was modeled with the assistance of adaptive distance restraints in ISOLDE. Model statistics were assessed via MolProbity70 as implemented in the Phenix software suite71.

Bioinformatics

In order to thoroughly search the available sequence data for homologous sequences, a full-length Redβ amino acid sequence (UniProtKB: P03698) was BLAST searched (E-value cut-off = 1, substitution matrix = BLOSUM-64) against the UniProt Reference Cluster (UniRef) 90% database72. From this search, the top 1000 hits were extracted, and amino acid sequences aligned with the command line version of MAFFT (Multiple Alignment using Fast Fourier Transform) v.7.48073 using the L-INSI algorithm. All MAFFT parameters were left at default, except for the ‘maxiterate’ parameter that was set to 1000.

The resultant multiple sequence alignment (MSA) was then visualized within Jalview v.2.11.1.474 and trimmed to the region of Redβ containing the residues 19–205. Redundant sequences and sequences less than 100 residues were then removed. Outlier sequences were also identified using the principal component and neighbour-joining tree analysis tools within Jalview (both using the default settings) and removed. Following refinement, the MSA was then again aligned with MAFFT using the above parameters.

Conserved motifs were then identified based on 50% consensus across the MSA as calculated in Geneious Prime v.2021.1.1 (available from: https://www.geneious.com/). Conserved motifs were then manually refined by identifying, and subsequently annotating, positions at which an amino acid property, rather than an individual residue, was conserved across the MSA.

DALI protein structure comparison server58 was used with the default parameters for identifying the proteins that are structurally similar to Redβ177.

Molecular dynamics simulations and structure prediction

Molecular dynamics simulations were carried out using NAMD (Nanoscale Molecular Dynamics, version 3 alpha)75. In total, three systems have been simulated each for 1 μs, including Redβ alone (apo), Redβ bound to ssDNA (ss), and Redβ bound to the dsDNA annealing intermediate (ds). For each system, two different setups have been simulated: one set of the simulations were free simulations, while a positional harmonic restraint on the backbone atoms of DNA and two outside proteins was applied for the other set of simulations. The latter was used to probe the effects of using  a trimeric structure to study the oligomeric assembly. Initial preparation of systems was done with CHARMM-GUI76,77. The unresolved loops (residue 131–136) were built with Modeller (version 10.1)78. The AMBER protein parm14SB force field79 and nucleic acid BSC1 force field79 were applied for the protein and DNA respectively. The TIP3P model was used for water80.

Molecular dynamics simulations were performed after solvating the system in an ~100 × 100 × 100 Å cubic box that extended at least 10 Å from the solute surface. Na+ and Cl counter ions were added to neutralize the system and achieve a salt concentration of 0.15 M. pKa, calculations were performed using PROPKA to assign protonation states of ionizable residues. Simulations were performed using periodic boundary conditions (PBC) at constant temperature (303.15 K) with the Langevin algorithm (a damping coefficient of 1/ps)81 and at a pressure of 1.0 bar using the Nose-Hoover Langevin Piston method82. The time step was set to 2.0 fs with all covalent bonds involving hydrogens kept rigid with the RATTLE algorithm83. Short-range electrostatics were calculated together with long-range electrostatics particle mesh Ewald (PME)84 with a cut-off of 9.0 Å and a PME grid size of 1.0 Å. For all systems, energy minimization (10,000 steps) and 125 ps equilibration were performed first with positional restraints placed on all the protein and DNA heavy atoms (with a force constant of 1.0 kcal/mol/Å2 on the backbone atoms and 0.5 kcal/mol/Å2 on the side chain atoms). This was followed by 1 μs production runs. In the simulation set with a positional harmonic restraint, this restraint was maintained on the two outside monomers and DNA backbones. Snapshots were saved every 100 ps. VMD (Visual Molecular Dynamics)85, LOOS (Lightweight Object-Orientated Structure Analysis)86 and MDAnalysis87 were used to analyze the trajectories.

Structural prediction was performed with the ColabFold88 implementation of AlphaFold2. The source code was retrieved from the GitHub repository (https://github.com/YoshitakaMo/localcolabfold). For each prediction, five models were generated and optimized with the AMBER force field.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.