Structural basis of the complete poxvirus transcription initiation process

Poxviruses express their genes in the cytoplasm of infected cells using a virus-encoded multi-subunit polymerase (vRNAP) and unique transcription factors. We present cryo-EM structures that uncover the complete transcription initiation phase of the poxvirus vaccinia. In the pre-initiation complex, the heterodimeric early transcription factor VETFs/l adopts an arc-like shape spanning the polymerase cleft and anchoring upstream and downstream promoter elements. VETFl emerges as a TBP-like protein that inserts asymmetrically into the DNA major groove, triggers DNA melting, ensures promoter recognition and enforces transcription directionality. The helicase VETFs fosters promoter melting and the phospho-peptide domain (PPD) of vRNAP subunit Rpo30 enables transcription initiation. An unprecedented upstream promoter scrunching mechanism assisted by the helicase NPH-I probably fosters promoter escape and transition into elongation. Our structures shed light on unique mechanisms of poxviral gene expression and aid the understanding of thus far unexplained universal principles in transcription. A series of cryo-EM structures examining transcription initiation by vaccinia poxvirus RNA polymerase reveals how viral transcription factors identify and melt a promoter and how a polymerase-associated helicase mediates promoter escape.

triphosphates (NTPs) upon incubation of complete vRNAP (Fig. 1b). A large-scale reconstitution of DNA-bound vRNAP complexes was separated by gradient centrifugation (Extended Data Fig. 1a), transferred onto holey carbon grids, and three cryo-EM datasets were collected (Extended Data Fig. 1b,c). After extensive three-dimensional (3D) classification, several distinctive particle classes could be separated that represented vRNAP complexes in transcription stages ranging from pre-initiation to co-transcriptional capping.
Fig. 1 (partial caption). The core polymerase is depicted in gray as a solvent-accessible surface. usDNA, upstream DNA; dsDNA, downstream DNA. e, Transparent isosurface of the DNA cryo-EM density, filtered by Gaussian blur with 1.5σ standard deviation, and the DNA model, shown in cartoon style. Approximated helix axes of the different duplex DNA sections are indicated, and the translation of the helix axes of the two duplex DNA regions adjacent to the IMR is denoted. This view is rotated by 20° relative to d. f, Detailed view of VETF and promoter DNA. Two views of VETF with the bound promoter within the PIC are displayed. For easier visualization, the core polymerase is hidden. g, Complete vRNAP residual density (EMD 4868, gray transparent isosurface) docked with the VETFl structure and shown along with the complete vRNAP model (PDB 6RFL) in cartoon representation (color code as in Figs. 1-3 and ref. 21 for the complete vRNAP-specific factors). The core polymerase is depicted in gray as a solvent-accessible surface. The predominantly disordered interface of VETFl and the tRNA aminoacyl stem is marked with an orange dotted line.
Cryo-electron microscopy structure of the PIC. Biochemical studies had previously shown that vRNAP, VETF and Rap94 are required for early transcription initiation 26,32,33. We identified one particle class in our reconstitution that contained these factors along with the DNA scaffold, and thus represented the bona fide PIC 32,33. Its single-particle reconstruction displayed an overall resolution of 3.0 Å and diffuse density for DNA and VETF. Signal subtraction and focused refinement resolved the VETF-DNA subcomplex (Extended Data Fig. 1d-i and Table 1). The density was docked with the core vRNAP model and the VETFl and VETFs chains and parts of Rap94 were traced de novo, allowing complete modeling of the PIC (Fig. 1c,d). Within the PIC, the promoter is positioned at the distal edge of the polymerase cleft. The upstream DNA contacts the protrusion domain of the polymerase subunit Rpo132, directly adjacent to the C-terminal domain (CTD) of Rap94. The downstream promoter region interacts with the vRNAP core through positions on the clamp head (Extended Data Fig. 2a). The melted promoter region is predominantly disordered but could be visualized with mild Gaussian filtering (Fig. 1e). It localizes centrally above the opening of the cleft, forming a second contact zone with the clamp head (Extended Data Fig. 2a). Both DNA strands appear only minimally separated within the bubble region. The latter joins the adjacent double-helical upstream and downstream sections at a 100° angle accompanied by a 25-Å translational shift of the helix axes (Fig. 1e). We thus conclude that the DNA is in the initially melted state. Of note, neither the B-cyclin nor the B-homology region of the early transcription factor Rap94 establishes direct DNA contacts in the PIC (Fig. 1d and Extended Data Fig. 2a). However, on the opposite side of the core vRNAP, VETFs and VETFl contact the DNA in the distal upstream and downstream promoter regions, respectively (Fig. 1f and Extended Data Fig. 2b,c). Therefore, and due to the absence of contacts in the initially melted region (IMR), the VETF heterodimer appears to be anchored like a bridge on the upstream and downstream region of the promoter (Fig. 1d and Extended Data Fig. 2b).
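The two numbers quoted for the duplex sections flanking the IMR (a 100° angle and a 25-Å shift of the helix axes) are simple line geometry. The sketch below is only an illustration of how such values could be computed from two fitted helix axes; the coordinates are hypothetical and chosen to reproduce the quoted values, and treating the "translational shift" as the shortest distance between the two axis lines is an assumption, not the authors' stated procedure.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

def axis_offset(p1, u, p2, v):
    """Shortest distance between the lines p1 + s*u and p2 + t*v (non-parallel)."""
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    norm = math.sqrt(sum(c * c for c in cross))
    if norm < 1e-12:
        raise ValueError("axes are parallel; use a point-to-line distance instead")
    delta = [b - a for a, b in zip(p1, p2)]
    return abs(sum(d * c for d, c in zip(delta, cross))) / norm

# Hypothetical axes (in Å): the upstream duplex runs along x through the origin,
# the downstream duplex is rotated by 100° in the xy-plane and displaced 25 Å along z.
upstream_point, upstream_dir = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
theta = math.radians(100.0)
downstream_point = (0.0, 0.0, 25.0)
downstream_dir = (math.cos(theta), math.sin(theta), 0.0)

bend_angle = angle_between(upstream_dir, downstream_dir)
shift = axis_offset(upstream_point, upstream_dir, downstream_point, downstream_dir)
print(f"angle between axes: {bend_angle:.1f} deg, axis offset: {shift:.1f} A")
```

In practice the axis points and directions would come from a helix-fitting step on the modeled duplex segments, which is not shown here.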
Unique mode of DNA-binding by the VETF heterodimer. The structure of VETF allowed us to decipher the mechanisms of vRNAP recruitment to the early promoter. VETFl folds into five distinct domains, termed NTD (N-terminal domain), TBPLD, CRBD, domain 4 and CTD (C-terminal domain) (Fig. 1c,f). Despite the absence of a priori detectable sequence homology, the second domain displays a bilobal TBP fold, and hence is a TBP-like domain (TBPLD). It is located centrally above the polymerase cleft and, unlike TBP in other structures of PICs, contacts the promoter in a sequence-independent manner. Sequence-specific DNA binding in the vaccinia PIC is instead facilitated by the neighboring domain, which recognizes the CR (Fig. 1a,d,f). Based on its fold and binding mode, this module constitutes a novel type of double-stranded DNA binding domain, hence termed the critical region binding domain (CRBD). Although holding only a limited content of secondary structure elements, it gains structural rigidity through three disulfide bridges that position a 3₁₀-helix ideally for its insertion into the major groove of the DNA (Fig. 2a). The side chain-to-base contacts of this helix are the major sites for sequence-specific readout of the promoter sequence (Fig. 2b,c). Only weak bending of the DNA helix axis is introduced in this region (Fig. 2a).
The joint structural context of TBPLD and CRBD in VETFl establishes specific contacts to the upstream promoter (Extended Data Fig. 2d). On the core vRNAP, this part of the promoter is anchored via the interaction of domain 2 of Rap94 with the NTD of VETFl (Fig. 1d). All other domains of VETFl (NTD, domain 4 and CTD) contribute to the structural backbone of VETF. Domain 4 and the CTD of VETFl make up the interface to VETFs (Fig. 1f).
The downstream promoter interacts almost exclusively with VETFs (Figs. 1d and 2d and Extended Data Fig. 2b). Only one additional pointed contact to the core vRNAP is established by the clamp head close to the TSS (Extended Data Fig. 2a). We observe a striking similarity of the first two domains of VETFs with the canonical helicase fold of chromatin remodeling SNF2-type ATPases 22,34, of which INO80 35 is the closest homolog. With the latter, VETFs shares, along with the vRNAP-associated transcription factor NPH-I, an extended brace helix that stably bridges the N- and C-lobes of the helicase. The intense DNA interaction of the VETFs helicase module is accompanied by a strong bend of the helix (Extended Data Fig. 2e). At the point of inflection, Phe271 intercalates via the minor groove, effectively disturbing the planar base-stacking over the range of roughly three base pairs on either side of the insertion site (Fig. 2d). Although melting of the two DNA strands is not observed at this position, this mechanism bears some similarity to the 'scalpel' method of strand-separating helicases 36 (see Extended Data Fig. 2f-h for a comparison to the Pol II system).
Fig. 2 (partial caption). Only bases for the non-template strand are labeled, and the template strand is sequence complementary. Contacts between Tyr367 and thymidine bases at positions −18 and −17 are displayed as a transparent van der Waals (vdW) surface. The protein-DNA hydrogen-bond network is depicted as dotted yellow lines. c, Schematic of the sequence-specific interactions of the CRBD reader. The CR consensus sequence is depicted according to Yang and others 37. The vdW interactions are indicated in gray and hydrogen bonds in yellow. d, Details of VETFs binding to the upstream promoter. The intercalating 'wedge' residue Phe271 is shown in stick representation.
Promoter positioning and enforcement of directionality. We next asked how the DNA contacts established by the CRBD of VETFl control the initiation process. The 3₁₀-helix of CRBD inserts into the major groove, making it the reader head of VETF (hence termed the CRBD reader, Fig. 2a,b)
sequence of 15 A nucleotides, interrupted by a TG dinucleotide 31,37 (Figs. 1a and 2c). Arg370 and Gln375 engage in base-specific hydrogen bonding that involves the bases of the TG motif on the non-template strand and the complementary AC dinucleotide on the opposing template strand (Fig. 2b,c and Supplementary Video 1). By this means, VETFl anchors the promoter in a defined position relative to the polymerase cleft. The CR displays a high propensity for A nucleotides downstream of the TG motif (Figs. 1a and 2c). Consistent with this, the C5 methyl groups of the corresponding complementary T nucleotides at positions −18 and −17 of the template strand interact cooperatively with the reader head by stacking with Tyr367. Inverse promoter binding would imply an unfavorable contact of Tyr367 with adenine bases (Fig. 2b,c) and thus a single promoter direction is coerced. By this means, the CRBD-DNA interaction ensures (1) the identification of the CR, (2) the alignment of the CR relative to the polymerase cleft and (3) the enforcement of transcription directionality. The CRBD is thus the main regulator of the transcription initiation process.
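The directionality argument rests on the CR being asymmetric: an A-rich stretch carrying a TG dinucleotide reads differently on the opposite strand. The toy sketch below only illustrates that asymmetry at the sequence level. The example sequence, the flank width and the scoring rule are all hypothetical; they are not the consensus of Yang and others and do not model the Tyr367, Arg370 or Gln375 contacts.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def cr_like_score(seq, flank=8):
    """Toy score: for each TG dinucleotide, count A residues in the flanks.

    Returns the best count over all TG positions, or 0 if no TG is present.
    This is an illustrative heuristic, not a published scoring scheme.
    """
    best = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "TG":
            upstream = seq[max(0, i - flank):i]
            downstream = seq[i + 2:i + 2 + flank]
            best = max(best, upstream.count("A") + downstream.count("A"))
    return best

# Hypothetical non-template-strand element: 15 A residues interrupted by TG.
toy_cr = "AAAAAAATGAAAAAAAA"

forward_score = cr_like_score(toy_cr)
inverse_score = cr_like_score(reverse_complement(toy_cr))
print("forward orientation:", forward_score)
print("inverse orientation:", inverse_score)
```

Read in the inverse orientation, the same duplex presents a T-rich strand with a CA dinucleotide, so a reader that depends on the TG motif and its A-rich context has nothing to match.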

Asymmetric DNA binding by the TBP-like domain of VETFl.
Our structure identified VETFl as a TBP-like protein (TBPLP) whose TBPLD is engaged in an intricate contact network comprising the neighboring domains of VETFl, VETFs and Rap94 (Fig. 3a). Members of the TBPLD family had previously been identified solely by means of sequence homology. However, VETFl stands apart from previously known TBPLPs because of its extremely divergent sequence, which, until now, has prevented its classification as such. Nevertheless, the structural conservation of the TBPLD is comparatively high, resulting in a Z-score of 4.2 determined by PDBeFold 38 when matching it to PDB entry 1TBP. To compare their structures and binding modes, we aligned the TBPLD, the upstream DNA module of VETFl (Fig. 3b), with the crystal structure of yeast TBP (yTBP) bound to the TATA-box (Fig. 3c). The TBPLD of VETFl features the characteristic saddle structure that was previously described for TBP 39-42. However, the symmetry that is evolutionarily conserved in TBP 43,44 appears broken. As a consequence, and unlike TBP, which contacts the TATA-box symmetrically, VETFl binds the promoter asymmetrically and sequence-independently solely through its C-terminal TBP lobe. Most strikingly, the TBPLD inserts into the DNA major groove, contrary to the canonical binding mode of TBP, which is based on minor groove insertion. In accordance with this observation, the two strictly conserved pairs of DNA-intercalating phenylalanine residues on each lobe of TBP are absent in the TBPLD 39-42. Still, the TBPLD induces a pronounced DNA bend via intercalation of aliphatic, rather than aromatic, side chains (Fig. 3b). In agreement with the fundamentally different binding mode of the TBPLD, a consensus TATA-box is absent from vaccinia early promoters 31.

Rearrangement of the complete vRNAP into the PIC.
Complete vRNAP is the predominant vRNAP complex found in infected cells and is necessary and sufficient to execute viral early transcription. Accordingly, we have previously speculated that this unit becomes incorporated into virions as a pre-assembled unit to promote the restart of transcription in the next infection cycle 21. To investigate the transformation of complete vRNAP into the PIC, we compared both structures and their cryo-EM reconstructions. The VETF heterodimer is already present in the complete vRNAP; however, defined density could only be observed for the CRBD of VETFl, while the remaining parts were mobile. Under the assumption that the adjacent TBPLD is flexibly joined to the CRBD, we were able to dock the diffuse residual density in the vRNAP reconstruction with the VETFl coordinates extracted from the PIC model, resulting in reasonable overlap. In the resulting structure (Fig. 1g), VETFl displays a flexible interface to tRNA Gln. A comparison with the PIC structure reveals major reconfigurations, including the release of all associated factors from complete vRNAP except for the VETF heterodimer and Rap94 (Supplementary Video 1). This underlines the importance of complete vRNAP as a pre-formed early transcription unit and the high plasticity of vaccinia transcriptional complexes (see also Supplementary Video 1 for a summary of core aspects of the PIC).
Structure of the late PIC. The structural transition described above explains how complete vRNAP becomes recruited to the viral early promoter to form the PIC. We next solved the structure of vRNAP particle classes that represent bona fide transcription stages following the pre-initiation phase. Based on biochemical evidence, such particles are predicted to be devoid of VETF but contain Rap94. Particles of class 1, subclass 2 (Extended Data Fig. 3a), which yielded a reconstruction at a resolution of 3.0 Å (Extended Data Fig. 1b-d and Table 1), fulfilled this criterion. The density could be docked with the complete vRNAP model 21. Disordered density corresponding to DNA is visible upstream next to the Rap94 CTD and within the downstream DNA channel. These sites roughly coincide with the DNA anchor points on the core vRNAP observed in the PIC (compare Fig. 1d). However, no density for the DNA transcription bubble or nascent RNA was detected in the active cleft (Fig. 4a). Instead, we found well-defined density for the highly phosphorylated stretch within the C terminus of Rpo30 (termed the phospho-peptide domain, PPD; Fig. 4b). It is in a similar conformation as in the complete vRNAP 21 and follows the path of the template and non-template strands in the elongation complex (EC). This allows its pairing with the B-reader of Rap94 (Fig. 4a,b) and enables single-strand capture at later stages (see 'PPD assisted single-strand capture and formation of the initially transcribing complex' below). We therefore conclude that this particle represents a late state of the PIC (lPIC) in which VETF has been expelled, the melted promoter has been handed over to the core vRNAP, but transcription has not yet been initiated.
PPD assisted single-strand capture and formation of the initially transcribing complex. Next, we investigated the structural basis of lPIC conversion into an initially transcribing complex (ITC). Three vRNAP particle classes yielded reconstructions that were identified as different conformations of the ITC based on their composition and promoter positioning (ITC1-3, Extended Data Fig. 3a-d and Table 1). The exact location of the polymerase on the promoter could be determined, because its downstream blunt end was readily visible in the density (Extended Data Fig. 4a). In contrast to the lPIC, we observed ordered density for DNA in the downstream DNA channel and for a DNA/RNA hybrid above the active site (Fig. 5a). The PPD of Rpo30, which occupied the position of the DNA/RNA hybrid in the lPIC, has been displaced by the template strand. Consequently, the B-homology region has become mobile and is not visible in the density (Fig. 5b). No density for upstream DNA was identified. The three ITC complexes superimposed well but differed in the positioning of the DNA within the downstream DNA channel (Fig. 5a) and the state of the clamp (Fig. 7b). For ITC3, downstream DNA density was located in a shallower position and was comparatively less ordered. In the ITC1 particle, the clamp is in a closed conformation with the DNA bound firmly and deep in the downstream DNA channel. ITC2 and ITC3 display an open clamp conformation and the downstream DNA appears mobile and bound in a shallower position. No substantial differences between the three ITC complexes were discernible with regard to the DNA/RNA hybrid region. Thus, the three ITC structures inform on the conformational flexibility of the ITC and, in concert with the lPIC structure, on the template-strand capture mechanism.
ATP-dependent upstream promoter scrunching. During 3D classification, one particular class stood out because it comprised particles considerably larger than the ITC (Extended Data Fig. 5a). After a further round of focused classification of these particles on the observed extra density, followed by multibody refinement, a reconstruction was obtained that allowed the construction of a complete model (Fig. 6a, Extended Data Fig. 5b-d and Table 1). This complex was classified as a late ITC (lITC), based on the positions of the blunt ends of the upstream and downstream promoter-DNA segments that are visible in the density (Extended Data Fig. 4b) as well as on the presence of an RNA/DNA hybrid. Except for Rap94, the core vRNAP was in a conformation similar to that observed in the ITC complexes. The path of the downstream DNA best fitted that observed in the ITC3 particle, indicating loose binding. The downstream blunt end of the DNA scaffold had advanced roughly five base pairs in the downstream direction compared to the ITC (Extended Data Fig. 4a). Massive extra density above the cleft was unambiguously attributed to upstream DNA-bound NPH-I and to the NTD and B-cyclin domain of Rap94 (Fig. 6a). Strikingly, the Rap94 B-homology region, the NTD and adjacent linkers appeared entirely reconfigured in comparison to other vRNAP complexes (Extended Data Fig. 4d,e) and the whole path of the Rap94 chain was visible (Fig. 6b). We also note that the path of the upstream DNA in the lITC is fundamentally different from that observed in the vaccinia PIC and in the ITC of Pol II 45. The blunt ends of the DNA promoter scaffold are visible in the EM density of the lITC (Extended Data Fig. 4b), thus allowing us to determine the position of (and the size of) vRNAP relative to the transcription bubble (Extended Data Fig. 4c). Strikingly, the upstream end of the scaffold can only be accommodated within the lITC under the assumption of massive promoter scrunching. This includes 13 base pairs upstream of the artificial non-complementary region of the promoter scaffold that have been additionally melted when compared to the ITC (Extended Data Fig. 4c). It is likely that this condition enables promoter escape and hence contributes to the transition of the initiation phase into productive elongation (Supplementary Video 2).

Discussion
In this study, we describe six vRNAP structures that represent snapshots of the poxviral transcription initiation phase. When viewed together, a comprehensive mechanistic picture of the early events during vaccinia transcription emerges.
The structure of the vaccinia PIC in the initially melted state provides insight into poxvirus early promoter identification and binding. The arc-shaped VETF heterodimer spans the polymerase cleft and upstream and downstream promoter elements and thus allows precise insertion of the TBPLD at the site of initial melting. The upstream contact to the promoter is established by VETFl, and its CRBD is the decisive element for its sequence-specific recognition. The CRBD recognizes the critical region of the promoter through a thus far unknown DNA-binding domain, which is stabilized by three disulfide bridges. This is in agreement with two early studies observing DNase protection of the −15 to −30 promoter region 46 as well as crosslinking of VETFl to the upstream and of VETFs to the downstream promoter region 32. The disulfide bridges may be introduced by vaccinia-encoded enzymes 47, as potential host factors for this task are confined to the endoplasmic reticulum. The TBPLD cooperates with the CRBD in upstream promoter binding and introduces a sharp DNA bend, which probably generates the nucleation site for promoter melting. Strikingly, the TBPLD of VETFl displays an asymmetric DNA binding mode. This sharply contrasts with the canonical, symmetric DNA binding mode observed in all TBP-DNA complexes solved so far, including PIC complexes of the nuclear polymerases. Our findings could help the understanding of the dual nature of TBP 48, which, in its canonical binding mode, recognizes the TATA-box. Evidently, TBP is capable of an alternate, sequence-independent mode of action when directing the transcription machinery to TATA-less promoters 49,50. Because vaccinia early promoters do not contain a TATA-box, an attractive explanation for the deviant binding mode of the TBPLD is that its orientation in the vaccinia PIC mirrors the alternate function of TBP at TATA-less promoters.
Asymmetric DNA binding by TBP has been proposed in the context of Pol I 51 and may also occur in other TBPLDs 43,52,53 . To the best of our knowledge, no other structures of TBP-like domains exist in the database, and our structure might therefore contribute to the general understanding of this domain family. In this regard, it is interesting to note that a recent cryo-EM study reports a binding mode of TBP on a TATA-less promoter similar to that on TATA-containing promoters 54 . The behavior of the VETFl TBPLD might therefore be fundamentally different from that of genuine TBP. Although the order of events cannot definitely be determined given the current state of knowledge, we propose the following mechanism for vaccinia early promoter melting based on our data and prior findings for the Pol II system 55 (Fig. 3a). (1) The CRBD of VETFl binds the promoter at the CR, thereby enforcing directionality (Fig. 2a,b). (2) VETFs pulls the DNA in an ATP-dependent reaction towards the vRNAP clamp and lobe, analogous to the XPB helicase in the Pol II system 56 (compare Extended Data Fig. 2g to Extended Data Fig. 2h). (3) The clamp closes tightly around the DNA (Fig. 7b), thereby shaping its path 57 . (4) The promoter DNA becomes underwound and bent by 80° towards the C-lobe of VETFs, exposing bases for an interaction with the latter (Fig. 2d). (5) The tip of the C-terminal lobe of the VETFl TBPLD intercalates upstream of the IMR, inducing a second sharp bend in the promoter (Fig. 3b). (6) This bend triggers the initial melting event at the TSS, and the IMR absorbs the negative twist of the adjacent DNA segments.
A structure-based comparison of vaccinia and eukaryotic transcription systems reveals common principles but also obvious differences in the bound transcription factors. Similar positioning of the promoter relative to the core polymerase is observed in all PICs (likewise, the positions of the B-homology region of Rap94 in the vaccinia PIC and the corresponding domain of TFIIB in the Pol II PIC overlap 45,58; compare Extended Data Fig. 2g to Extended Data Fig. 2h). However, whereas TFIIB directly contacts the promoter, the B-homology region in Rap94 does not bind DNA (Fig. 1d). Furthermore, some features in the distal section of the DNA path appear to be conserved. A common principle might be the binding of a helicase transcription factor to the downstream promoter. It appears plausible that the helicase domains of VETFs (Extended Data Figs. 2e and 6) and of the TFIIH subunit XPB (Ssl2 in yeast, Extended Data Figs. 2f and 6) are functional counterparts 59.
In contrast to a recent study describing a PIC intermediate of Pol II immediately prior to the initially melted state 55 , we do not observe underwinding of the DNA duplex in the vaccinia PIC. A possible explanation for this is that the IMR has absorbed a previous negative twist during the melting process. At the promoter upstream side, we noticed a topological relationship of the VETFl-promoter complex and the positioning of the Rap94 CTD with the TBP/TFIIF module on the DNA in the Pol II PIC (Extended Data Fig. 2h). This notion is corroborated by the fact that, despite their fundamentally different binding modes, both TBP and the VETFl TBPLD induce a strong bend of the DNA. Thus, the architecture of the vaccinia PIC differs fundamentally from its nuclear counterparts. Although the catalytic cores of all multi-subunit polymerases are largely homologous, only basic architectural features are conserved with respect to the positioning of the early transcription factor. The arrangement of the VETFl TBPLD is so far unprecedented and unexpected. Our studies further reveal that VETF and Rap94 perform functionalities of TBP and TFIIB. The three conformationally different ITC structures mirror the flexibility of the transcription machinery in the initially transcribing phase and may coincide with non-processive RNA synthesis and TSS search, as observed in the Pol II system 60 .
During transition to the lITC, a dramatic reorganization of the transcription machinery takes place. This includes the recruitment of NPH-I, a re-routing of the upstream DNA path and widening of the transcription bubble, extending from promoter position +12 to −22 (Extended Data Fig. 4c). To accommodate the re-routed upstream DNA, the cyclin domain of Rap94 undergoes a conformational change (Extended Data Fig. 4d) accompanied by other changes in the Rap94/core enzyme interaction (Extended Data Fig. 4e). The only plausible explanation for the widening of the transcription bubble is that the NPH-I helicase motor has actively melted and scrunched upstream DNA duplex into the core vRNAP 61,62 . By this means, NPH-I probably assists promoter escape by adding the free energy of ATP hydrolysis to the generation of an energy-rich transcription intermediate 11 . Although, for Pol II, only downstream promoter scrunching has so far been observed 5,6,11 , vRNAP employs a novel mechanism in which downstream and upstream promoter scrunching are combined. Strikingly, a mechanism in which the helicase transcription factor TFIIH injects free energy from ATP hydrolysis into the ITC during TSS scanning has been postulated for Pol II 10 . In addition to its function as helicase motor, NPH-I plays an obvious role for the statics of the transcription bubble. Both the 80° bend of the DNA (Extended Data Fig.  2e) and the insertion of the 'wedge' residue Phe273 (Fig. 2d) stabilize the upstream fork point of the transcription bubble in the lITC. Processive vRNAP elongation complexes can be assembled in the absence of Rap94 in vitro 20 . In vivo, such complexes are found associated with the latter 28,63 . Thus, Rap94 may ensure the efficient recruitment of NPH-I to ECs stalled at pause sites to enable readthrough 62 , and the resulting vRNAP complex might be structurally similar to the lITC (Fig. 6a).
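The extent of the widened bubble follows from the two promoter positions quoted above. The short sketch below counts the positions between −22 and +12, assuming the common numbering convention in which the TSS is +1 and there is no position 0; that convention is an assumption here, since the text does not state it.

```python
def promoter_positions(first, last):
    """List promoter positions from first to last, skipping the non-existent position 0."""
    return [p for p in range(first, last + 1) if p != 0]

# Bubble limits quoted for the lITC: promoter positions -22 to +12.
bubble = promoter_positions(-22, 12)
bubble_length = len(bubble)
print(f"melted positions: {bubble[0]} to +{bubble[-1]}, {bubble_length} in total")
```

If position 0 were counted, the span would be one position longer, so the convention matters when comparing bubble sizes across studies.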
After assignment of our structures to the transcription timeline, we propose a comprehensive model of initial transcription ( Fig. 7 and Supplementary Video 2). First, complete vRNAP reconfigures to the PIC (step 1). In the PIC, vRNAP-bound VETF has selected, aligned, positioned and melted the promoter DNA, and the clamp is in a tight conformation (Fig. 7b). Upon handover of the melted promoter to the core polymerase, VETF leaves the complex, giving rise to the lPIC (step 2). Here, the promoter is supported upstream by the CTD of Rap94 and is anchored in the downstream DNA channel. The single-stranded DNA region is dynamic in this phase and therefore not visible (Fig. 4a). Through the interaction with the PPD of Rpo30, the B-homology domain of Rap94 is kept in an initiation-ready conformation. Template-strand capture proceeds with the displacement of the PPD, which might be driven by the pronounced electronegative charge of DNA interacting with the positively charged active site region of vRNAP. After single-strand capture (step 3), the B-reader scans the template strand for the TSS in a manner analogous to that which has been observed for Pol II 4 . Once the TSS is located, the B-homology domain becomes mobile and RNA synthesis commences (step 4). This phase is highly dynamic, as documented by three ITC structures deviating in the state of the clamp (Fig. 7b) and the positioning of the downstream DNA in the downstream DNA channel (Fig. 5a). The vRNAP promoter escape is accompanied by recruitment of NPH-I, a large-scale remodeling of Rap94, and major changes to the path of the upstream DNA (step 5). In the lITC complex (Fig. 6a), NPH-I acts as a strand-separating helicase, widens the transcription bubble, defines its upstream fork point, and shapes the path of the single-stranded template and non-template DNA (Fig. 6a,c). 
Transition to a processive EC (step 6) triggers contraction of the transcription bubble, mobilization of the upstream DNA duplex and loss of NPH-I. Alternatively, abortive initiation might lead to re-initiation via re-recruitment of the Rpo30 PPD (step 6b). We note that all vRNAP complexes of the transcription initiation phase contain the core polymerase in a virtually constant conformation. Still, each transition of the transcription complexes is accompanied by changes of the clamp position (Fig. 7b).
Our study provides detailed mechanistic insights into the initial phase of poxvirus transcription. Some features observed in the presented structures are poxvirus-specific, such as the unique promoter recognition by the CRBD of VETFl. Others, such as the hitherto unknown behavior of a TBP-like protein, the observation of the initial melting event and the discovery of an ATP-dependent scrunching mechanism, might be of relevance for the general understanding of multi-subunit RNAPs.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-021-00655-w.

Reconstitution of promoter-bound vRNAP complexes.
A synthetic double-stranded DNA oligonucleotide scaffold mimicking the vaccinia virus early promoter region was generated by annealing of two partially complementary DNA oligonucleotides (Fig. 1a). Annealing was performed in buffer containing 100 mM NaCl, 20 mM HEPES pH 7.5 and 3 mM MgCl2 by heating to 95 °C for 5 min followed by slow cooling to room temperature. The resulting double-stranded DNA oligonucleotide was precipitated with isopropanol and the pellet was
Cryo-electron microscopy and model building of the PIC. Following sucrose gradient purification, the indicated fractions (Extended Data Fig. 1) were pooled, diluted 1:50 with a buffer containing 10 mM Tris-HCl pH 7.5, 100 mM NaCl, 5 mM MgCl2 and 1 mM DTT, and centrifuged in a Vivaspin concentrator to remove the sucrose. R1.2/1.3 holey carbon grids (Quantifoil) were glow-discharged for 90 s (Plasma Cleaner model PDC-002; Harrick Plasma) at medium power, and 3.5 μl of C2 sample was applied inside a Vitrobot Mark IV instrument (FEI) at 4 °C and 100% relative humidity. Grids were blotted for 3 s with blot force 5 and plunged into liquid ethane. Cryo-EM datasets comprising 10,816 (dataset 1), 9,878 (dataset 2) and 3,640 (dataset 3) micrographs, respectively, were collected from three different grids with a Thermo Fisher Titan Krios G3 set-up equipped with a Falcon III camera (Thermo Fisher). Data were acquired with EPU (Thermo Fisher) at 300 keV and a nominal magnification of ×75,000 (calibrated pixel size of 1.0635 Å) in video mode with 47 fractions per video and counting of the electron signal. The total exposure was 77.5 e⁻/Å² for 75 s, with two exposures per hole.
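A few derived quantities follow directly from the acquisition parameters quoted above. The sketch below works them out; it assumes the dose is spread evenly over the 47 fractions, which the text does not state explicitly, and uses the standard relation that the Nyquist limit is twice the pixel size.

```python
# Acquisition parameters quoted in the data-collection paragraph.
TOTAL_EXPOSURE = 77.5    # electrons per square angstrom, per video
N_FRACTIONS = 47         # fractions per video
EXPOSURE_TIME_S = 75.0   # seconds per video
PIXEL_SIZE_A = 1.0635    # calibrated pixel size in angstrom

# Assumption: the total exposure is distributed evenly across fractions.
dose_per_fraction = TOTAL_EXPOSURE / N_FRACTIONS

# Average dose rate at the specimen over the exposure.
dose_rate = TOTAL_EXPOSURE / EXPOSURE_TIME_S

# Nyquist limit: the finest resolution the sampling can represent.
nyquist_limit = 2 * PIXEL_SIZE_A

print(f"dose per fraction: {dose_per_fraction:.2f} e-/A^2")
print(f"dose rate:         {dose_rate:.2f} e-/A^2/s")
print(f"Nyquist limit:     {nyquist_limit:.3f} A")
```

The Nyquist limit of about 2.1 Å lies comfortably below the 3.0 Å overall resolution reported for the reconstructions, so pixel sampling was not the limiting factor.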
Dose-weighted, motion-corrected sums of the micrograph videos were calculated with Motioncor2 65. The contrast-transfer function (CTF) of each micrograph was fitted with RELION 3.1 66 using the built-in CTFFIND algorithm. An initial set of 25,000 particles was picked with the Gaussian picker and subjected to three rounds of 2D classification in RELION 66 to clean up the dataset. Eight class averages were selected as templates for subsequent automated particle picking within RELION and a total of 300,000 particles were picked using the RELION autopicker. After a second round of 2D classification, 3D classification was performed using the vRNAP core structure as template. Particles belonging to the PIC were selected and 2D classes for autopicking were calculated. The resulting three particle stacks, one for each dataset, were cleaned up individually by four rounds of 2D classification each, and contained 1,064,795 (dataset 1), 1,205,746 (dataset 2) and 323,776 (dataset 3) good particles, respectively. Each particle stack was then subjected to 3D classification, and particles that fell in the defined PIC class were selected. The PIC particle stacks of the three datasets were then united into a single stack, and CTF refinement, followed by a consensus 3D refinement, was performed. This united particle stack was then subjected to a focused 3D classification with a mask that selected for VETF and DNA. Two of the resulting three classes yielded high-resolution reconstructions of VETF and DNA in minimally divergent conformations (Extended Data Fig. 2c). The particles from the two good classes were then forwarded to a multibody (MB) refinement in RELION, either pooled or separately. MB refinement was performed with two bodies, representing either VETF or DNA and core vRNAP. We noticed that minor variations of the mask pairs resulted in improvement of particular regions of the reconstruction. We therefore repeated the MB refinement with 11 more mask pairs.
The resulting 12 map pairs were then combined with Phenix.combine_focused_maps to create a single, optimal map, and the procedure was repeated for several selected subsets. The combined map of all 12 map pairs was compared with the combined maps based on these subsets. The map based on all 12 map pairs showed better detail and connectivity for VETF than the subset-based maps and was therefore used for automated model refinement. To build the PIC model, the vRNAP core excluding the Rpo30 PPD was extracted from the complete vRNAP structure (PDB 6RFL) and docked into the cryo-EM density map. Within the residual density, the path of the DNA was identified and manually docked with section-wise stretches of ideal B-DNA. VETF was then traced de novo in Coot 0.9 67. To this end, the SNF2 helicase core of VETFs was located and built, followed by well-defined regions of VETFl. The resulting partial model was initially refined with Phenix.real_space_refine and forwarded to Phenix.combine_focused_maps to create a stitched map, and the VETF model was completed manually. The full polypeptide chains of both VETFs and VETFl were continuously modeled. Finally, residual density was identified as the relocated Rap94 NTD, and the DNA sequence was assigned. The resulting model was manually optimized with the real-space refinement routine of Coot 0.9 and subjected again to refinement with Phenix.real_space_refine 68, including ADP refinement steps. During refinement, secondary structure and Ramachandran restraints were imposed. After four further cycles of manual inspection and automated refinement, the refinement converged, and a model with excellent stereochemistry and good correlation with the cryo-EM map was obtained (Table 1).
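As a bookkeeping aid for the PIC processing described above, the micrograph counts and the numbers of good particles retained after 2D classification can be tallied per dataset. This is a minimal sketch using only the counts stated in the text; the dictionary layout and the `tally` helper are illustrative and not part of the original pipeline.

```python
# Counts as reported in the text: micrographs collected and good particles
# remaining after four rounds of 2D classification, per dataset.
DATASETS = {
    "dataset 1": {"micrographs": 10_816, "particles": 1_064_795},
    "dataset 2": {"micrographs": 9_878, "particles": 1_205_746},
    "dataset 3": {"micrographs": 3_640, "particles": 323_776},
}


def tally(datasets):
    """Return total micrographs, total particles and particles per micrograph."""
    total_micrographs = sum(d["micrographs"] for d in datasets.values())
    total_particles = sum(d["particles"] for d in datasets.values())
    per_micrograph = {
        name: d["particles"] / d["micrographs"] for name, d in datasets.items()
    }
    return total_micrographs, total_particles, per_micrograph


total_micrographs, total_particles, per_micrograph = tally(DATASETS)
print(f"micrographs: {total_micrographs}, good particles: {total_particles}")
for name, density in per_micrograph.items():
    print(f"{name}: {density:.1f} good particles per micrograph")
```

Note that these totals describe the cleaned stacks before 3D classification; the number of particles in the final PIC class is not given in this section and is therefore not computed here.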
Three-dimensional reconstruction and model building of lPIC and ITC complexes. The lPIC particle stack, obtained as described above, was subjected to two rounds of focused 3D classification with three classes in each round. The classification was focused with a mask on the cleft, active site and downstream DNA channel as well as the region of the Rap94 cyclin domain. From the resulting set of nine class averages (Extended Data Fig. 3a), four reasonable reconstructions were obtained after a final round of 3D refinement and post-processing, and the associated complexes were identified as the lPIC and ITC1-3 (Extended Data Fig. 3a). The resolution, determined by Fourier shell correlation (FSC), was 3.0 Å for the lPIC and 2.9 Å, 3.2 Å and 3.0 Å for ITC1, ITC2 and ITC3, respectively (Extended Data Fig. 3b-d). To build the lPIC model, the vRNAP core including the Rpo30 PPD was extracted from the complete vRNAP structure (PDB 6RFL) and docked into the cryo-EM density. The positioning of the Rap94 cyclin domain and the adjacent linker regions was adjusted manually with Coot 67, and the model was refined with Phenix.real_space_refine 68, including an ADP refinement step. During refinement, secondary structure and Ramachandran restraints were imposed. After two further cycles of manual inspection and automated refinement, the refinement converged and a model with excellent stereochemistry and good correlation with the cryo-EM map was obtained (Table 1).
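To illustrate how a resolution value is read from an FSC curve, the sketch below finds the spatial frequency at which the curve first drops below a threshold and reports its reciprocal. The threshold of 0.143 is the commonly used gold-standard criterion and is an assumption here, since this section does not state which criterion was applied; the curve values are synthetic and hypothetical, not data from this study.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in A) at which the FSC first falls below threshold.

    freqs: spatial frequencies in 1/A, strictly increasing.
    fsc:   FSC values at those frequencies.
    The crossing point is located by linear interpolation between the two
    neighbouring shells. Returns None if the curve never crosses.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Fractional position of the crossing between shell i-1 and i.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None


# Hypothetical FSC curve for demonstration only.
example_freqs = [0.1, 0.2, 0.3, 0.4]      # 1/A
example_fsc = [1.0, 0.8, 0.243, 0.043]

example_resolution = resolution_at_threshold(example_freqs, example_fsc)
print(f"estimated resolution: {example_resolution:.2f} A")
```

In practice the FSC curve is produced by the post-processing step of the reconstruction software from two independently refined half maps; the helper above only covers the final threshold lookup.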
Three-dimensional reconstruction and model building of the lITC. The lITC particle stack, obtained as described above, was subjected to a round of focused 3D classification with a mask on the NPH-I and upstream DNA region. From the three resulting classes, a single one displayed good occupancy and resolution for NPH-I. Particles belonging to this class were subjected to two-body MB refinement in RELION using a mask for NPH-I and upstream DNA and a mask for the core vRNAP. The postprocessed reconstructions for both bodies were then combined with Phenix.combine_focused_maps. To build the lITC model, the ITC1 structure was docked into the density. Within the residual density, a characteristic SNF2 helicase fold was recognized that was docked with either VETFs or NPH-I from the complete vRNAP structure (PDB 6RFL). NPH-I unequivocally fitted the density, while VETFs did not. Further residual density could then be identified as the relocated Rap94 B-cyclin domain, the relocated Rap94 NTD and the NPH-I CTD. After manual adjustments with Coot, including rebuilding of remodeled Rap94 linker regions, the model was refined with Phenix.real_space_refine including an ADP refinement step. During refinement, secondary structure and Ramachandran restraints were imposed. After two further cycles of manual inspection and automated refinement, the refinement converged and a model with excellent stereochemistry and good correlation with the cryo-EM map was obtained.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this Article.