Meas. Sci. Technol. 18 (2007) 2446-2455

# Synchronization methods for the PAC RPC trigger system in the CMS experiment\*

Karol Bunkowski<sup>1</sup>, Krzysztof T Pozniak<sup>2</sup>, Michal Bluj<sup>3</sup>, Krzysztof Doroba<sup>1</sup>, Matti Iskanius<sup>4</sup>, Artur Kalinowski<sup>1</sup>, Krzysztof Kierzkowski<sup>1</sup>, Marcin Konecki<sup>1</sup>, Jan Krolikowski<sup>1</sup>, Ignacy Kudla<sup>1</sup>, Flavio Loddo<sup>5</sup>, Wojciech Oklinski<sup>1</sup>, Antonio Ranieri<sup>5</sup>, Giuseppe de Robertis<sup>5</sup>, Tuure Tuuva<sup>4</sup>, Grzegorz Wrochna<sup>3</sup> and Wojciech Zabolotny<sup>2</sup>

<sup>1</sup> Institute of Experimental Physics, Warsaw University, Hoża 69, 00-681, Warsaw, Poland
 <sup>2</sup> Institute of Electronic Systems, Warsaw University of Tech., Nowowiejska 15/19, Warsaw, Poland

<sup>3</sup> Soltan Institute for Nuclear Studies, ul. Hoża 69, 00-681 Warsaw, Poland

<sup>4</sup> University of Technology Lappeenranta, PO Box 20, Skinnarilankatu 34, FIN-53851 Lappeenranta, Finland

<sup>5</sup> Istituto Nazionale di Fisica Nucleare, Via Orabona 4, I-70126 Bari, Italy

Received 25 November 2006, in final form 28 April 2007 Published 6 July 2007 Online at stacks.iop.org/MST/18/2446

## Abstract

The PAC (pattern comparator) is a dedicated muon trigger for the CMS (Compact Muon Solenoid) experiment at the LHC (Large Hadron Collider). The PAC trigger processes signals provided by RPCs (resistive plate chambers), a part of the CMS muon system. The goal of the PAC RPC trigger is to identify muons, measure their transverse momenta and select the best muon candidates for each proton bunch collision, which occurs every 25 ns. To perform this task it is necessary to deliver the information concerning each bunch crossing from many RPC chambers to the trigger logic at the same moment. Since the CMS detector is large (the muon hits are spread over 40 ns), and the data are transmitted through thousands of channels, special techniques are needed to assure proper synchronization. Methods for RPC signal synchronization and synchronous transmission are presented. The methods were tested during the MTCC (magnet test and cosmic challenge). The performance of the synchronization methods is illustrated by the results of the tests.

**Keywords:** HEP experiment, distributed multichannel system, diagnostics, monitoring, synchronization, parametrized hardware algorithm, FPGA, LHC

# 1. Introduction

### 1.1. CMS detector

CMS is one of the four large detectors at the LHC. It is a classical multipurpose detector with a solenoidal magnet and several layers of subdetectors surrounding the interaction point. The CMS detector measures muons, electrons, photons, jets and missing transverse energy produced in the LHC proton–proton interactions. The collisions of proton bunches occur in the middle of the detector (interaction point) with a frequency of 40 MHz. Such a high collision frequency is required because the probability of interesting processes (such as production of Higgs bosons or supersymmetric particles) is small. However, it is not possible to record all events on mass storage (the stream of data produced by the detector is about 1 MB per bunch crossing). Therefore, the interesting events have to be selected in real time from events containing only standard interactions. This selection is performed by the trigger system [1]. In CMS it is divided into two levels. The level 1 (L1) trigger is based on dedicated hardware, while the high-level trigger (HLT) is implemented on a computer farm.

**Figure 1.** Layout of the CMS detector and PAC RPC trigger system. The CMS detector and its front-end electronics are placed in an underground cavern. The control, trigger and data acquisition electronics are placed in the counting room behind a large concrete wall shielding against ionizing radiation. The connection between the cavern and the counting room is realized with fast optical links.

\* Supported by Polish Ministry of Scientific and Information Technology under grant 115/E-343/SPB/CERN/P-03/DZ 444/2002-2004.

### 1.2. Level 1 trigger system

The L1 trigger consists of several subsystems, based on various subdetectors looking for different signatures of interesting events. Each subsystem looks for specific objects (high energy particles) and estimates their parameters, and then sends information about them to the global trigger. The global trigger combines the information from all subsystems and makes the decision whether the event should be rejected or sent for further processing in the high-level trigger. While the trigger decision is being performed, the full event data are waiting in the pipeline memories. The trigger decision is sent to those memories as a 1 bit signal (L1 accept, L1A).

The electronics of the L1 trigger works synchronously with the bunch collision, i.e. the electronics is driven by the 40 MHz clock delivered by the accelerator control system. The 25 ns period between the bunch crossings (BX) is commonly used as the time unit.

The basic requirement for the trigger system is that it has to process every event; no dead time is allowed [1]. Moreover, the total time for processing each event is limited to 3.2 $\mu$s, i.e. the length of the pipeline memories. To achieve that, pipeline processing is used: the algorithms are divided into steps performed in 25 ns, and the result of each step is sent to the input of the next step of the algorithm. In this approach the synchronization of the data stream flowing through the system is the crucial requirement: at every stage of the data processing all data concerning the given event must be delivered to the processing devices at the same moment.

The L1 accept signal must trigger readout of the data originating from the BX for which the L1A was generated. The BX to which the L1A refers is determined only by the signal timing (no other data, such as a BX identifier, are transmitted with the signal). Therefore, the total latency of the trigger system (the time between a bunch crossing and the moment the corresponding L1A is delivered to the data buffers) should be precisely set and always exactly the same.

### 1.3. TTC system

To assure synchronous operation of the whole CMS electronics system the timing, trigger and control (TTC) system [2] was developed. TTC is a set of optical and electronic components used for the distribution of clock and synchronous commands. The 40 MHz clock and the BC0 signal (bunch crossing zero, a signal related to the first bunch of an LHC beam cycle, issued every 3564 BXs) are delivered from the beam control system to the global trigger. The L1A, clock, BC0 and other control signals are coded and transmitted through a passive optical network to the TTC receivers (TTCrx) [4] located on each board. Ticks of the LHC clock (i.e., BXs) are counted by bunch counters, reset by the BC0 signal. The bunch counter number (BCN) provided by those counters is used in the synchronization procedures. Bunch counters are implemented on each FPGA device involved in the processing of RPC data. The BC0 signal is delayed at every stage of the PAC RPC trigger system in such a way that the time differences of its reception are compensated. In this case the same value of the BCN refers to the same time at every board of the same stage of the RPC trigger system.
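The interplay of the bunch counter and the BC0 delay can be illustrated with a small Python model. This is only a sketch and not the firmware: the class name `BunchCounter` and the `bc0_delay` argument are assumptions introduced here, while the orbit length of 3564 BXs is taken from the text. Two boards that receive BC0 at different times end up with the same BCN once the earlier BC0 is delayed.

```python
ORBIT_LENGTH = 3564  # BXs between consecutive BC0 signals (from the text)


class BunchCounter:
    """Toy model of a bunch counter reset by a locally delayed BC0 signal."""

    def __init__(self, bc0_delay=0):
        self.bc0_delay = bc0_delay  # hypothetical local BC0 delay, in BX
        self._pending = []          # countdowns of BC0 signals inside the delay line
        self.bcn = 0

    def tick(self, bc0=False):
        """Advance one clock period; bc0 is True when BC0 arrives at the board."""
        if bc0:
            self._pending.append(self.bc0_delay)
        reset = 0 in self._pending
        self._pending = [c - 1 for c in self._pending if c > 0]
        self.bcn = 0 if reset else (self.bcn + 1) % ORBIT_LENGTH
        return self.bcn


long_fibre = BunchCounter(bc0_delay=0)   # BC0 arrives last on this board
short_fibre = BunchCounter(bc0_delay=2)  # BC0 arrives 2 BX earlier, so it is delayed by 2 BX
for t in range(10):
    a = long_fibre.tick(bc0=(t == 5))
    b = short_fibre.tick(bc0=(t == 3))
print(a, b)  # the two counters agree after the aligned reset
```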

### 1.4. RPC trigger system

The PAC RPC trigger is an independent component of the L1 muon trigger system, which searches for high transverse momentum muons—one of the most important signatures of interesting events. A schematic view of the PAC RPC muon trigger is shown in figure 1.

**Figure 2.** The structure of the RPC trigger system. The number of components, the total number of bits transferred at one BX by all transmission channels and the transmission frequency at a selected level are given.

Resistive plate chambers are fast detectors, optimized for the detection of muons. In the central region of the CMS ('barrel') the chambers form six cylindrical layers, while in each of the forward regions of the detector ('endcaps') the chambers form four disc layers. In total there are 696 chambers. The barrel chambers contain two or three rows of readout strips, perpendicular to the beam; in each row there are up to 96 strips. The endcap chambers have three rows of 32 strips each; the strips have a radial layout. The straight-line distance from the interaction point to the chambers varies from 4.2 m (i.e., 14 ns for muons flying at the speed of light) for the inner barrel chambers, to 12.5 m (42 ns) for the outer endcap chambers.
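The flight times quoted above follow directly from the distances and the speed of light. The short Python check below reproduces them and expresses them in BX units; the function name `flight_time_ns` is introduced here only for illustration.

```python
C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond
BX_NS = 25.0              # one bunch crossing period


def flight_time_ns(distance_m):
    """Time of flight of a particle moving at the speed of light."""
    return distance_m / C_M_PER_NS


for name, distance_m in (("inner barrel", 4.2), ("outer endcap", 12.5)):
    t = flight_time_ns(distance_m)
    print(f"{name}: {t:.1f} ns = {t / BX_NS:.2f} BX")
```

The difference between the two extremes is already larger than one 25 ns period, before any cable or electronics delays are added.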

Analogue strip signals are discriminated and formed into 100 ns binary pulses at the 7200 front end boards (FEBs) placed on the chambers (figure 2). Each signal corresponds to one RPC strip; its high level denotes a particle hit in the given strip. The rising edge of the signal carries the information about the time of the hit. The signals are sent from the FEBs in the LVDS standard through copper cables to link boards (LB) [5]. The LBs (1232 boards) are located around the detector (in the CMS cavern). The LB electronics synchronizes the signals with the clock provided by the TTC and compresses the data (zero suppression). The data from two slave LBs are transmitted to the master LB. The master LB multiplexes the data from the slave LBs and from itself and converts them into an optical signal (1.6 GHz) [6, 7] transmitted through a fibre to the trigger boards (TB) [8] located in the counting room. In total there are 444 fibres, and they vary in length from 40 to 80 m. Since the data from every optical link have to be delivered to two or four TBs, the links are split by splitter boards (SpB, 60 pieces).

Each of the 84 trigger boards receives signals from up to 18 links. On the TB the data are distributed through the OPTO receiver FPGAs (six chips on each TB) to the PAC FPGAs (three or four chips on each TB, placed on the mezzanines; each PAC receives data from all links). The PACs execute the trigger algorithm based on the so-called pattern comparator (PAC) strategy described in [9, 10]: the chamber hits are compared with predefined patterns of muon tracks obtained from simulations, and the coincidence of hits in the same BX in at least three layers of chambers is required. In the whole system, about 110 thousand RPC strip signals are compared with more than 2 million patterns in every BX.

The muon candidates found by the PACs are transmitted to the GBSORT chip. Since the PAC algorithm is performed for segments of the detector that overlap, the same muon can be found by several segments. Therefore, in the GBSORT, the duplicate muon candidates from neighbouring segments are suppressed ('ghost-buster' algorithm [12]). Then the remaining candidates are sorted according to their quality [11].

Since the amount of data that has to be transmitted on the TB is large (432 bits per BX from six OPTOs to every PAC, then 432 bits from four PACs to the GBSORT), the data are transmitted over fast LVDS lines at a frequency of 320 MHz (i.e., one line transmits 8 bits during one BX) to reduce the number of paths on the board.

The muon candidates returned by the GBSORT are further processed on the next levels of the ghost busters and sorter tree: on the custom backplane of the trigger crate (TC GBSORT) containing the TBs, and then on the half sorter boards (HSB) and final sorter board (FSB) located in the sorter crate (SC). From FSB, up to eight highest momentum muon candidates are sent to the global muon trigger (GMT) for every BX. The GMT combines these muon candidates with candidates found by two other muon subsystems of the CMS detector: the drift tube (DT) trigger (covering barrel region only) and cathode strip chamber (CSC) trigger (covering endcap regions). The resulting candidates are sent to the global trigger.

The RPC trigger system uses FPGA devices extensively. All system functionalities (except for front-end chips) are coded in VHDL and implemented in FPGA devices. Almost 10 000 Altera [13] and Xilinx [14] devices are used in the system.

### 1.5. Synchronization in the RPC trigger system

At the LHC the crossings of beam bunches occur every 25 ns. The muons which originate from the decays of the products of proton interactions are created at a well-defined time (the bunch overlap time is about 1 ns [3]). Then the products of the interactions fly through the detectors, generating signals that further propagate through the electronics. The differences in muon time of flight to the chambers and in the time of signal propagation in the cables and electronics are bigger than the period between the bunch crossings. Therefore, the chamber hits of muons originating from the same bunch crossing received by the link boards are not simultaneous, but are spread in time.

On the other hand, the RPC trigger system operates in synchronous pipeline mode: the data continuously flow through the system synchronously with the 40 MHz clock (or a multiple of it), and are processed by the muon identification algorithm. The data from many sources (detector units) are concentrated on every processing device. Therefore, it must be assured that all data delivered at a given clock period to the input of each PAC or ghost buster belong to the same BX. To achieve that, first of all the asynchronous chamber hits (FEB signals) must be time quantized, i.e. formed into 25 ns pulses synchronous with the clock. Additionally, the muon hits must be assigned to the proper BX, so that the differences of muon time of flight and of the length of signal cables for different chambers are compensated. This task is performed in the synchronization unit of the link boards. Next, when the data are transmitted from the LBs through the optical links and splitters to the trigger boards, they must be realigned on the TBs to compensate the differences in the latency of each link. In the ghost busters tree the time alignment of the muon candidates should be kept.

In the case of the PAC RPC trigger system the issue of the data stream synchronization can be split into two parts, discussed in the following sections:

- the digitization and time quantization of the RPC signals in the synchronization unit of the link boards,
- the synchronization of the data transmission, both optical and electrical, from the link boards to the PAC logic, and then through the sorter tree to the global muon trigger boards.

# 2. Synchronization of the RPC signals

In the case of the RPC PAC trigger, to detect a muon and assign it to the proper bunch crossing, it is sufficient to measure the arrival time of RPC signals with a resolution of 25 ns, i.e. one LHC clock period (internal time resolution of RPC chambers is  $\sim 2$  ns). This measurement is performed by a synchronization unit (SU) of the link board's FPGA [5]. In the SU the time window is created by two clocks ('window open' and 'window closed'). These clocks, provided by a TTCrx chip [4] mounted on each LB, can be independently deskewed (delayed) in steps of 104 ps, thus the position and width of the synchronization window can be precisely adjusted. The RPC signal is accepted if its rising edge is inside the synchronization window. Then a signal synchronous with the main LHC clock (also provided by the TTCrx) is produced. In this way the signal is assigned to the given clock period (BX). The SU of a LB allows us to synchronize simultaneously 96 signals. Since only two deskewed clocks are available on one LB, the width and position of the synchronization windows are the same for all channels.

The minimum size of the synchronization window for a given LB is determined by the total spread of the RPC muon hits timing on the input of the Link Board SynCoder FPGA. Analyses show that this spread does not exceed 25 ns, which is the fundamental requirement for successful synchronization. The width of the synchronization window should be smaller than 25 ns whenever possible [1, 16], to reduce the rate of noise and uncorrelated background.
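The time quantization performed by the synchronization unit can be sketched as follows. This is a simplified model under stated assumptions, not the SU firmware: times are measured in ns relative to the local LHC clock, `win_open_ns` and `win_close_ns` stand for the phases of the two deskewed clocks, the 104 ps deskew granularity is not modelled, and the function name `quantize_hit` is hypothetical.

```python
BX_NS = 25.0


def quantize_hit(t_edge_ns, win_open_ns, win_close_ns):
    """Return the BX index assigned to a rising edge, or None if it is rejected.

    The window repeats every 25 ns; it opens at phase win_open_ns and closes
    at phase win_close_ns. Equal phases are treated as a full 25 ns window.
    """
    width = (win_close_ns - win_open_ns) % BX_NS or BX_NS
    rel = t_edge_ns - win_open_ns
    if rel % BX_NS < width:
        return int(rel // BX_NS)  # number of whole periods since the first window opened
    return None


# A 20 ns window opening at phase 5 ns: edges in the 5 ns gap before it are rejected.
print(quantize_hit(12.0, 5.0, 25.0))  # accepted
print(quantize_hit(28.0, 5.0, 25.0))  # falls into the gap between two windows
print(quantize_hit(30.0, 5.0, 25.0))  # accepted, one BX later
```

A window narrower than 25 ns rejects hits arriving in the gap, which is how the rate of noise and uncorrelated background is reduced.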

### 2.1. Alignment of the BC0 signal between the link boards

The TTC fibres for the link boxes vary in length (from 40 to 80 m), hence the phase of the TTC clock varies between the link boards. Therefore the first step of the LB synchronization is to define 'the same bunch crossing' on all boards. The BC0 signal and the BCN are used to mark the bunch crossings, thus on each LB the BC0 has to be properly delayed to compensate the TTC fibre differences, so that the spread of BC0 signals between the LBs is less than 25 ns.

The BC0 should be delayed with respect to the LB for which the time of TTC signal propagation is the biggest, i.e. for which the TTC fibre is the longest (figure 3). Let us call such an LB the reference LB<sub>R</sub>. The delay of BC0, which should be applied on the LB<sub>i</sub>, is given by the formula (in BX units):

$$d_i^{\text{BC0}} = \text{int}(\Delta t_i^{\text{TTC}} / 25 \,\text{ns}) + (1^*) + c_i^{\text{win}}, \tag{1}$$

where int is the integer part of a number;  $\Delta t_i^{\text{TTC}} = t_R^{\text{TTC}} - t_i^{\text{TTC}}$  is the difference in TTC signal propagation time between the reference LB<sub>R</sub> and LB<sub>i</sub>; and $(1^*) = 1$ if $\Delta \varphi_i^{\text{TTC}} > 0$, otherwise 0.

The BC0 signal from the TTCrx is synchronous with the Clock40Des1 [4] (this clock is used as 'window closed' in the SU). When the BC0 is resynchronized with the main TTCrx Clock40 (the Link Board FPGA is driven by this clock) the BX to which the signal is assigned depends on the position of Clock40Des1 with respect to Clock40. To correct this effect the parameter  $c_i^{\text{win}}$  is introduced in formula (1). It is defined as follows:

$$c_i^{\text{win}} = \begin{cases} 0 \text{ BX} & \text{if winClose} > 17 \text{ ns or ClkInv} = \text{true}, \\ 1 \text{ BX} & \text{in the opposite case.} \end{cases}$$

ClkInv is the parameter setting of the inversion of the clock used to latch the signal; it should be set to *true* if the value of 'deskew 1' is between 15 and 18 ns in order to avoid nondefined signal states.

The  $d^{BC0}$  calculated from formula (1) assures that on every LB that has a TTC fibre of length different from the LB<sub>R</sub> the BC0 is up to 25 ns slower than the BC0 on the LB<sub>R</sub>.
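Formula (1) can be evaluated with a few lines of Python. The sketch below is illustrative only: the function and argument names are assumptions, and the propagation times in the example are made-up values chosen so that the reference board has the longest TTC propagation time.

```python
BX_NS = 25


def bc0_delay_bx(t_ref_ns, t_i_ns, win_close_ns, clk_inv):
    """BC0 delay (in BX) for link board i, following formula (1)."""
    dt = t_ref_ns - t_i_ns               # Delta t_i^TTC, non-negative by choice of the reference
    dphi = dt % BX_NS                    # Delta phi_i^TTC, the TTC clock phase difference
    star = 1 if dphi > 0 else 0          # the (1*) term
    c_win = 0 if (win_close_ns > 17 or clk_inv) else 1
    return int(dt // BX_NS) + star + c_win


# Board whose TTC signal arrives 60 ns before that of the reference board:
d = bc0_delay_bx(t_ref_ns=400, t_i_ns=340, win_close_ns=20, clk_inv=False)
print(d, "BX")
```

For the example board the first two terms give 3 BX = 75 ns, so its BC0 ends up 15 ns after the reference BC0, i.e. within the 25 ns bound stated above (before the $c_i^{\text{win}}$ correction, which compensates the resynchronization effect).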

### 2.2. Position of the synchronization window and data delay

The optimal position of the beginning of the synchronization window winOpen ('window open') for a given LB is determined by the minimal total time of muon flight from the interaction point to the chamber and RPC signal propagation to the input of a LB<sub>i</sub> ( $t_i^{min}$ ):

$$\text{winOpen}_i = (t_i^{\min} + \Delta \varphi_i^{\text{TTC}} + \text{offset}) \,\%\, 25 \,\text{ns}, \tag{2}$$

where % denotes the modulo operation,  $\Delta \varphi_i^{\text{TTC}} = (t_R^{\text{TTC}} - t_i^{\text{TTC}})\%25$  ns is the difference of the TTC clock phases between the reference LB<sub>R</sub> and LB<sub>i</sub>.

The  $t^{\min}$  parameter has the following components:

$$t^{\min} = t_{\text{flight}} + t_{\text{RPC}} + t_{\text{propag}} + t_{\text{FEB}} + t_{\text{cable}},$$

where  $t_{\text{flight}}$  is the minimal time of muon flight to the RPC chamber. It varies from 14 ns, for the inner barrel chambers, to 42 ns for the outer endcap chambers.  $t_{\text{RPC}}$  is the time of RPC signal formation after muon crossing: ionization, avalanche formation and drift to electrodes. This time is of the order of 1 ns.  $t_{\text{propag}}$  is the time of signal propagation along the RPC strip; to obtain the minimal time, muons crossing the strip end connected to the FEBs should be considered, so the minimal  $t_{\text{propag}}$  is 0.  $t_{\text{FEB}}$  is the time of processing the strip signal by the front end chip (amplification, discrimination and LVDS pulse formation); it is about 10 ns.  $t_{\text{cable}}$  is the time of signal propagation through the cables connecting the FEBs to the LB. The length of these cables differs between chambers, thus the propagation time of signals varies from 65 to 107 ns.

**Figure 3.** Alignment of the BC0 signal between link boards and synchronization of the chamber signals. The LB1 has the longest time of TTC signal propagation, thus it is a reference for the BC0 synchronization. In other LBs the BC0 is delayed such that it is up to 25 ns slower than that in the LB1. The chamber signals (to simplify the picture having the same timing) originating from the same event are assigned the BX number 3 on each LB.

To align in time the data between the LBs, the pipeline delays are implemented on the input of the multiplexer of the master LB. The value of the delay that should be applied for a given LB is (in BX units)

$$d_i^{\text{data}} = a - \text{int}\left[\left(t_i^{\min} + \Delta\varphi_i^{\text{TTC}} + \text{offset}\right)/25\,\text{ns}\right] - (1^*) + c_i^{\text{win}}. \tag{3}$$

The value of the constant *a*, the same for all LBs, should be chosen so that the resulting delays are always positive, but *a* is minimal. The delay calculated in this way assures that hits of muons originating from the same event are assigned to the BX marked with the same BCN on every LB.
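Formulae (2) and (3) can be combined into a small calculation of the window position and data delay for a set of link boards. The Python sketch below makes several assumptions that should be read as such: the last term of formula (3) is taken to be the $c_i^{\text{win}}$ correction of section 2.1, 'always positive' is interpreted as non-negative so that the smallest delay is 0, and the board names and timing values are invented for the example.

```python
BX_NS = 25


def star(dphi_ns):
    """The (1*) term: 1 if the TTC phase difference is non-zero, otherwise 0."""
    return 1 if dphi_ns > 0 else 0


def win_open_ns(t_min_ns, dphi_ns, offset_ns=0):
    """Formula (2): position of the beginning of the synchronization window."""
    return (t_min_ns + dphi_ns + offset_ns) % BX_NS


def data_delays_bx(boards, offset_ns=0):
    """Formula (3) for all boards; boards maps name -> (t_min_ns, dphi_ns, c_win)."""
    raw = {
        name: int((t_min + dphi + offset_ns) // BX_NS) + star(dphi) - c_win
        for name, (t_min, dphi, c_win) in boards.items()
    }
    a = max(raw.values())  # smallest constant a that keeps every delay >= 0
    return {name: a - value for name, value in raw.items()}


boards = {
    "LB_near": (90, 0, 0),   # hypothetical board with early hits
    "LB_far": (160, 10, 1),  # hypothetical board with late hits
}
print({name: win_open_ns(t, dphi) for name, (t, dphi, _) in boards.items()})
print(data_delays_bx(boards))
```

The board whose hits arrive earliest receives the largest pipeline delay, so that hits from the same event leave all boards with the same BCN.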

The *offset* in the formulae results from the difference in phase between the moments of bunch crossings and the LHC clock in the synchronization unit. This offset cannot be measured in practice, so in the first run of data taking the synchronization windows and delays should be set according to the above formulae with an arbitrarily chosen offset. Then, the offset correction can be found from the analysis of collected data. During the first runs of the LHC the bunch collisions will not occur for every BX, but the bunch spacing will be 75 ns (during the first commissioning runs the spacing will be even bigger). The 75 ns bunch spacing is very convenient for synchronization purposes, since then the chamber hits originating from different events are not mixed in the same BX. Moreover, the BX of the muon is known *a priori* from the known structure of the beam and does not have to be determined from the chamber hits. This BX of the event is used as the reference in the analysis of the timing of the chamber hits. The offset correction can be calculated from the ratio of the number of hits in the proper BX to the number of hits in the previous and next BX.

To perform such an analysis the muon hits must be distinguished from chamber noise and uncorrelated background hits, whose rates are one or two orders of magnitude higher. The only way to do this is to look for a coincidence of hits in different chambers, i.e. to produce the PAC trigger. But to generate the coincidence, the hits must be delivered to the PAC inputs in the same clock period, i.e. be properly synchronized. It is obvious that with arbitrarily chosen offsets some muons will not be triggered at all, as the hits can be spread between two (or even three) BXs. The solution that cuts this loop is to extend the length of chamber hit signals on the PAC inputs by 1 or 2 BXs. In this way all muons will be triggered, even though the hits from different chambers are not in the same BX. Then, by analysing the collected data, the correction of offset can be determined, as described above, so the synchronization procedure can be performed fast (at most a few iterations of this procedure will be needed), without scanning the synchronization windows positions separately for each LB.
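The effect of extending the hit signals can be shown with a toy coincidence model. This is a deliberately simplified sketch: the real PAC compares hits with track patterns, whereas here only the requirement of hits in at least three layers in the same BX is modelled, and the function names and the example hit list are assumptions.

```python
def stretch(hit_bxs, extra_bx):
    """Extend every hit so that it is also present in the following extra_bx BXs."""
    return {bx + k for bx in hit_bxs for k in range(extra_bx + 1)}


def coincidence_bxs(layers, min_layers=3, extra_bx=0):
    """Return the BXs in which at least min_layers layers have a (stretched) hit."""
    stretched = [stretch(hits, extra_bx) for hits in layers]
    all_bxs = set().union(*stretched)
    return sorted(
        bx for bx in all_bxs
        if sum(bx in hits for hits in stretched) >= min_layers
    )


# One muon seen in four layers, but with imperfect timing: hits in BX 10, 11, 10, 12.
layers = [{10}, {11}, {10}, {12}]
print(coincidence_bxs(layers, extra_bx=0))  # no BX has three layers
print(coincidence_bxs(layers, extra_bx=1))  # stretching by 1 BX restores the coincidence
```

With the hits extended by 1 BX the muon is triggered even though the layers are not yet aligned, which gives the data sample needed to determine the offset correction.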

# 3. Synchronization of data transmission

In the RPC trigger system the measurement data are transmitted between the processing devices by using several types of transmission channels (figure 2): optical links, electrical transmission on boards or with cables between boards. The size of the data word is different on different levels of the system. In most of the cases a multiplexed transmission with high frequencies (e.g., 1600 MHz in the case of optical transmission LB–TB, 320 MHz in the case of electric transmission between the trigger board devices) is used to reduce the number of optical fibres, cables or paths on boards.

**Figure 4.** The generic transmission channel.

The requirements for the trigger system presented in section 1 define the detailed requirements for the transmission of the data in the RPC trigger system. The latency of the transmission should be constant. Moreover, the latency should be minimized, as the total latency of the system is limited. There should be a possibility to align in time the data between the channels, i.e. to delay suitably the data in selected transmission channels. The tools that help to find the values of those delays should also be provided.

To meet these requirements, all types of transmission channels used in the system have to be equipped with similar functionality. Therefore, to assure flexible and easy implementation of different types of transmission channels in FPGA devices, and to standardize their usage, universal parametrized transmitter and receiver modules were developed with the following functionalities:

- data multiplexing and demultiplexing, including a mechanism for the alignment of many lines of a transmission bus and synchronization of the data with the local clock of the receiver (thus, the local clock of the transmitter does not have to be sent together with data),
- static and pseudorandom test data generators and analysers, which facilitate finding the proper values of parameters configuring the transmission channel, and which enable testing the transmission quality,
- buffers on the receiver that make it possible to delay the data stream in steps of 1 BX,
- marking of the data with a time signature that helps to find the proper values of delays aligning the data stream between the channels and provides the possibility of monitoring the changes of latency,
- a mechanism for automatic monitoring of the transmission quality and masking of corrupted data frames.

### 3.1. Functional parametrization of a synchronous transmission channel

The scheme of a generic transmission channel used in the RPC trigger system is presented in figure 4. The compatibility of the transmitter and receiver modules is assured by a set of parameters defining a particular transmission channel. The width of the data stream *D* and the number of bits for time signature *T* define the total number of transmitted signals N = D + T. The ratio of the frequency used in the transmission channel to the 40 MHz clock frequency determines the multiplexing ratio  $M = f_M/f_T$ . The number of physical transmission lines (paths on a board, cable channels) is [L] = N/M. Both modules are independently synthesized in firmware for a particular FPGA chip.
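The relations between the channel parameters can be checked numerically. The following Python sketch uses hypothetical names (`ChannelParams` and its fields) and, as an assumption, rounds the number of lines up when N is not a multiple of M. The example values are those of the OPTO to PAC channel discussed in section 3.4.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelParams:
    data_bits: int        # D, width of the data stream
    signature_bits: int   # T, number of time signature bits
    line_freq_mhz: int    # frequency used on the transmission lines
    clock_mhz: int = 40   # LHC clock frequency

    @property
    def word_bits(self):  # N = D + T
        return self.data_bits + self.signature_bits

    @property
    def mux_ratio(self):  # M, bits sent by one line during one BX
        return self.line_freq_mhz // self.clock_mhz

    @property
    def lines(self):      # L, number of physical lines (rounded up: an assumption)
        return math.ceil(self.word_bits / self.mux_ratio)


opto_to_pac = ChannelParams(data_bits=19, signature_bits=5, line_freq_mhz=320)
print(opto_to_pac.word_bits, opto_to_pac.mux_ratio, opto_to_pac.lines)
print(18 * opto_to_pac.word_bits, "bits per BX for 18 links")
```

For 18 links of 24 bits each this gives 432 bits per BX, which agrees with the figure quoted for the data sent to every PAC in section 1.4.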

The time signature transmitted with the data consists of the BC0 signal and a few (usually 3 or 4) least significant bits of a local bunch counter number (BCN). On the receiver side the data should be delayed by the pipeline (series of flip flops) so that the transmitted signature equals the local signature of the receiver. The total latency of the transmission channel  $T_L$  depends on delays introduced by electronic circuits and cables (fibres), and the delay added to maintain the synchronization between the channels.

The BC0 signal is delayed by the dedicated DELAY elements. These delays should be such that the difference of the BCN value between the receiver and transmitter at a given moment is equal to the transmission latency  $T_L$ .
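The way the time signature determines the receiver delay can be sketched in Python. The model is an illustration under assumptions: only the least significant BCN bits are used (the BC0 bit and the wrap of the bunch counter at the end of the orbit are ignored), and the function names and numerical values are hypothetical.

```python
def signature(bcn, bits=3):
    """Time signature: the `bits` least significant bits of the bunch counter."""
    return bcn & ((1 << bits) - 1)


def required_data_delay(received_sig, local_sig, bits=3):
    """Pipeline delay (in BX, modulo 2**bits) after which the received
    signature equals the local signature of the receiver."""
    return (received_sig - local_sig) % (1 << bits)


# Example: the intended channel latency is T_L = 6 BX, while cables and
# electronics alone introduce 4 BX, so 2 BX must be added in the receiver.
t_l, hardware_latency = 6, 4
tx_bcn = 100                                     # BCN attached by the transmitter
local_bcn_at_arrival = tx_bcn + hardware_latency - t_l
delay = required_data_delay(signature(tx_bcn), signature(local_bcn_at_arrival))
print(delay, "BX")
```

Since only a few bits are transmitted, the delay is determined modulo $2^k$ BX for a k-bit signature; a change of the observed difference during operation signals a change of latency.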

The sub-modules MUXER and DEMUXER enable transmission with the multiplied frequency reducing the number of physical connections. The multiplication factor M corresponds to the number of data bits transmitted by a single signal line.

The generic transmission channel is used for data sending on various transmission media (optic, LVDS traces on PCB or copper cables).

A mechanism of coding of the time signature with the data bits was introduced for the on-line monitoring of transmission and synchronization. Each bit of the time signature is replaced with a result of XOR of this bit and selected bits of the data word. The receiver uses the same algorithm to restore the transmitted time signature and check its matching with the local time signature.
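A minimal Python sketch of such a coding is given below. The text does not specify which data bits are combined with which signature bit, so the bit masks used here are purely hypothetical, as are the function names. Because XOR is its own inverse, the same function serves for coding and decoding.

```python
def parity(value):
    """XOR of all bits of value."""
    return bin(value).count("1") & 1


def code_signature(sig, data, masks):
    """Replace every signature bit j by sig_j XOR parity(data & masks[j]).

    Applying the function twice with the same data restores the signature.
    """
    coded = 0
    for j, mask in enumerate(masks):
        bit = (sig >> j) & 1
        coded |= (bit ^ parity(data & mask)) << j
    return coded


MASKS = [0x007, 0x038, 0x1C0]  # hypothetical choice of data bits for each signature bit

data, local_sig = 0b1011001110, 0b101
sent = code_signature(local_sig, data, MASKS)

# Receiver side: an error-free word restores the expected signature ...
print(code_signature(sent, data, MASKS) == local_sig)
# ... while a flipped data bit makes the restored signature differ.
print(code_signature(sent, data ^ 0b1, MASKS) == local_sig)
```

With this scheme a corrupted data bit covered by a mask shows up as a signature mismatch, so the signature check monitors both the synchronization and the integrity of the data.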

**Figure 6.** The receiver module.

To test the transmission quality the possibility of sending static or pseudorandom data was implemented. These data are automatically analysed in the receiver and the information about the detected errors is issued separately for each line. The number of errors is counted. The number of generators is determined by the *S* parameter. The width of generator data is P = N/S. Each pseudorandom value is determined from the preceding one, so the receiver can check the pseudorandom data stream without common initialization or synchronization of the generator and the analyser during the test procedure.
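The self-synchronizing pseudorandom test can be illustrated with a linear feedback shift register. The text does not say which generator is used in the firmware, so the 16-bit Fibonacci LFSR below (polynomial $x^{16} + x^{14} + x^{13} + x^{11} + 1$) is only an assumed stand-in, and the function names are hypothetical.

```python
def next_word(word):
    """Next state of a 16-bit Fibonacci LFSR, computed from the current state only."""
    bit = (word ^ (word >> 2) ^ (word >> 3) ^ (word >> 5)) & 1
    return (word >> 1) | (bit << 15)


def generate(seed, n):
    """Transmitter side: a stream of n pseudorandom words starting from seed."""
    words = [seed]
    for _ in range(n - 1):
        words.append(next_word(words[-1]))
    return words


def count_errors(received):
    """Receiver side: compare every word with the value predicted from the preceding word."""
    return sum(
        1 for prev, cur in zip(received, received[1:]) if cur != next_word(prev)
    )


stream = generate(0xACE1, 50)
print(count_errors(stream))       # error-free transmission
print(count_errors(stream[17:]))  # the receiver may start anywhere in the stream

corrupted = list(stream)
corrupted[20] ^= 0x0004           # a single corrupted word
print(count_errors(corrupted))
```

The analyser needs no seed: it only needs two consecutive words. A single corrupted word is counted twice in this scheme, once when it disagrees with the prediction and once when the following word disagrees with the prediction made from it.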

### 3.2. The parametrized transmitter module

The transmitter module is presented in figure 5. The time signature can be switched on or off by the multiplexer (1). The data stream together with the time signature is sent to the coding module (5) via a multiplexer (4), which switches between the real and test data stream. Module (6) serializes the data (in the case of optical transmission this module is replaced with a hardware circuit GOL [17]). Multiplexer (2) provides a choice between the static test data and pseudorandom data. The test data word has a size of *N* bits, i.e. the full size of the transmission channel word. To provide the possibility of using the time signature in the test mode, module (3) performs an XOR operation on the *T* least significant bits of the test data and the time signature bits.

### 3.3. The parametrized receiver module

The receiver module is presented in figure 6. The input signals of each line are registered by a chosen clock edge (module 1, 'clkInv' parameter) in order to avoid non-defined signal states. To compensate for the timing differences between the transmission lines each line can be individually delayed by the register (module 2, 'regAdd' parameter). Before deserialization the data are aligned with the phase of the receiver 40 MHz clock by the delay (3) ('muxDelay', the value of this delay is the same for all lines, since the timing of the lines was earlier aligned by registers (2)). After deserialization in the module (4) the data are delayed in module (6) ('dataDelay') to assure that the transmitted time signature is equal to the local signature of the receiver. Module (7) recovers the original time signature by performing an XOR coding operation as in the transmitter.

The transmitted signature is automatically compared with the local one by module (8). If a discrepancy is detected, the corrupted data are blocked (zero value is sent to the output). This operation of data validation is switched on by the multiplexer (11). A test mode (switched by element (13)) compares the received data word with the value calculated from the preceding word. The multiplexer (12) enables the reading of received static data or the result of the pseudorandom data analysis.

### 3.4. Example of the transmission channel implementation

As an example let us present the implementation of the transmission channel between the OPTO and PAC chips on the trigger board. The data word here is 19 bits long, the time signature is 5 bits long. As the transmission frequency is 320 MHz, the multiplexing ratio is 8, therefore to transmit the 24 bits of one link (channel) 3 LVDS lines are needed. To synchronize the channel the proper values of the 'clkInv', 'regAdd' and 'muxDelay' are found by using the static test data. Usually there is more than one set of these parameters, for which the static data are correctly transmitted. Then, the transmission is tested by using random test data. If the errors are detected for any line, the other set of previously found parameters is applied for that line and the random test is repeated. In this procedure the time signature is not used. The value of the 'dataDelay' can be calculated based on the applied value of the 'muxDelay' (as the data stream is already aligned on the OPTO chips, here only the differences in the latencies of the transmission should be corrected, these differences are not bigger than 1 BX). The synchronization procedure is performed by the dedicated Java application.
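The two-stage search described above can be outlined in Python. This is a simplified sketch and not the Java application mentioned in the text: all three parameters are scanned for a single line (whereas 'muxDelay' is common to all lines of a channel), the parameter ranges are assumed, and `static_ok` and `random_ok` are hypothetical callbacks standing for the static and pseudorandom hardware tests.

```python
from itertools import product


def choose_line_settings(static_ok, random_ok,
                         reg_add_values=(0, 1), mux_delay_values=range(8)):
    """Return the first (clk_inv, reg_add, mux_delay) set that passes both tests."""
    # Stage 1: collect every parameter set for which static test data are received correctly.
    candidates = [
        settings
        for settings in product((False, True), reg_add_values, mux_delay_values)
        if static_ok(*settings)
    ]
    # Stage 2: try the candidates one by one with pseudorandom test data.
    for settings in candidates:
        if random_ok(*settings):
            return settings
    return None  # no working set found for this line


# Mock hardware: static data pass for two delay values, random data only for one set.
def static_ok(clk_inv, reg_add, mux_delay):
    return not clk_inv and mux_delay in (3, 4)


def random_ok(clk_inv, reg_add, mux_delay):
    return reg_add == 1 and mux_delay == 4


print(choose_line_settings(static_ok, random_ok))
```

Keeping the full candidate list from the static test is what allows a fallback to another parameter set when the random test reveals errors.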

### 3.5. Usage of diagnostic readout in the synchronization

A diagnostic readout module [5] is implemented in each device of the RPC trigger system. Besides many other applications, this module is very useful for tasks related to the synchronization of transmission and data stream alignment. The module enables us to observe simultaneously the input and output data streams of a device, including the received time signature of each input transmission channel and the local time signature of the device. In this way the proper values of the delays can be easily found (especially the BC0 delay). The module also enables us to validate the correctness of the transmission and to discover the causes of observed errors; therefore, it is very useful for hardware and firmware debugging. Moreover, the latency of the transmission and of the processing algorithm can be measured simply by looking at the data taken with the diagnostic readout.
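As an illustration of how a delay might be extracted from such readout data, the sketch below compares a recorded sequence of local time signatures with the received ones and returns the shift (in BX) that makes them coincide. The function name, the list-based data layout and the `max_delay` limit are assumptions made for this example; the real readout format is described in [5].

```python
from typing import Optional, Sequence


def find_signature_delay(local_sig: Sequence[int],
                         received_sig: Sequence[int],
                         max_delay: int = 8) -> Optional[int]:
    """Return the smallest delay d (in BX) for which
    received_sig[i + d] == local_sig[i] on all overlapping samples."""
    for d in range(max_delay + 1):
        pairs = list(zip(local_sig, received_sig[d:]))
        if pairs and all(a == b for a, b in pairs):
            return d
    return None  # no consistent delay within the scanned range
```

Because a short time signature repeats periodically, the result is only defined modulo the signature period; the scan range should therefore stay below that period.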

### 4. Test results

During the summer and autumn of 2006 the magnet test and cosmic challenge (MTCC) [15] of the CMS detector was performed. In this test the whole CMS detector was closed on the surface for the first time and its magnet was operated at a 3.8 T magnetic field. All of the installed subdetectors and functional systems participated in the MTCC and were tested with muons from cosmic rays. In the case of the RPC system only 18 RPC chambers were used in the barrel and six in the endcap. The link system for these chambers contained 55 link boards (LBs) installed on balconies near the chambers. The LBs were connected by 21 optical links through four splitter boards to two trigger boards, operating in the counting house ('Green Barrack') together with readout and control components. The RPC system was connected to the CMS data acquisition, which collected a few million events with cosmic muons for analysis. The collected data were used to evaluate the performance of the RPC chambers as well as to crosscheck with other muon subdetectors. The RPC trigger was also used during the MTCC to identify cosmic muons. The hardware synchronization tools and software procedures of the RPC trigger were successfully tested.

### 4.1. Synchronization for cosmic muons

The approach presented in section 2 worked very well during the MTCC to synchronize the RPC detector for cosmic muons. In the MTCC the length of the TTC fibres was the same for every LB, which simplified the BC0 and data synchronization. The positions of the synchronization windows and the delays were calculated according to formulae (2) and (3) for straight, vertical muons. The cosmic muons are not time structured; therefore the offset is arbitrary. The results, i.e. the timing histograms of muon hits for selected LBs, are presented in figure 7. The fact that the distributions of the BX of chamber hits with respect to the BX of the trigger are concentrated in one BX shows that the method of calculating the synchronization parameters gives good results. Then, by changing the value of the offset, the RPC trigger was aligned with the trigger provided by the drift tube (DT) trigger system. The difference of arrival times of the DT versus the RPC trigger, $t_{DT}-t_{RPC}$, was distributed over two consecutive BXs. The value of the required offset correction is

$$\Delta\,\text{offset} = \frac{n_{\text{bx+1}}}{n_{\text{bx}} + n_{\text{bx+1}}} \times 25\ \text{ns}$$

where $n_{bx}$ is the number of RPC triggers in the proper BX, and $n_{bx+1}$ is the number of RPC triggers in the next BX. After applying this correction both triggers were well aligned (figure 8).
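Reading the correction as the fraction of RPC triggers that fall into the next BX, scaled by the 25 ns bunch-crossing period, the calculation can be sketched in Python as below. The function name and the example counts are hypothetical; they are not MTCC measurements.

```python
BX_NS = 25.0  # bunch-crossing period in nanoseconds


def offset_correction_ns(n_bx: int, n_bx_next: int) -> float:
    """Offset correction from trigger counts in the proper and the next BX."""
    total = n_bx + n_bx_next
    if total == 0:
        raise ValueError("no triggers recorded")
    return n_bx_next / total * BX_NS
```

For example, with 300 triggers in the proper BX and 100 in the next one, a quarter of the triggers arrive late and the correction is 6.25 ns.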

### 4.2. Performance of data transmission

During the MTCC the transmission check was enabled for all channels. During many days of data taking, including a few periods of continuous running (without the FPGA reloading and hardware reinitialization) lasting for 2 days, no transmission errors were observed. Also in the collected data no corrupted records were discovered.

### 4.3. Expected stability of synchronization during CMS operation

Once found, the positions of the synchronization windows and the BC0 delays should remain valid indefinitely (assuming that the phase of the TTC clock with respect to the bunch crossing does not change).

In the case of the optical transmission, due to the properties of the GOL and TLK devices used in that transmission channel, the synchronization parameters can change after each resynchronization of the transmission. Moreover, the latency of the transmission channel can also change; therefore this change must be included in the data stream alignment (i.e., the data delay should be corrected after each resynchronization). The link boards will be reloaded every 10 min to avoid firmware corruption caused by ionizing radiation, and the optical transmission should be resynchronized after the reloading. Since a resynchronization performed by software would take too much time, an automatic synchronization procedure should be implemented in the OPTO devices.

The synchronization parameters of other transmission channels do not change unless the FPGA firmware is recompiled. Therefore if the hardware works properly the transmission should be errorless (except for a small rate of single, accidental errors). A large number of transmission errors can indicate the malfunction of the involved devices (problems with clock, corruption of firmware, hardware overheating or damage).

**Figure 7.** The timing histograms for the link boards of one sector (the histogram titles correspond to the names of chambers to which the particular LBs are connected). The distribution of the chamber hit BX with respect to the time of triggers is presented. BX number equal to 1 corresponds to the presence of the trigger signal. As the cosmic muons are not time structured, for some muons the hits in one or two chambers can be produced in different BXs than the hits in the rest of the chambers determining the BX of the trigger (the coincidence of at least four chambers was required, while six layers of chambers were used). Therefore hits outside the BX number 1 are observed. The other reason for those hits (especially the reason for the asymmetry in the number of hits in the BX numbers 0 and 2) can be individual differences in the timing properties of the TTCrx chips of particular LBs providing the clock.

**Figure 8.** The difference in the reception time of the DT trigger and the RPC trigger before (*a*) and after (*b*) correction of the offset of the position of the link boards synchronization window. Only the events where both systems produced the trigger are included. The muons on the edge between two BXs can produce a trigger in different BXs in both systems; therefore some triggers are observed outside the bin number 0.

### 5. Summary

The RPC trigger system identifies and measures the momenta of muons produced in proton-proton collisions in the CMS detector by comparing the signal measured by the RPC chambers to a set of predefined patterns (PAC algorithm). To find the best (i.e., most energetic) muon candidates the system performs in the order of $10^6$ comparisons in parallel. Correct muon identification and momentum measurement require absolute synchronization of each of about 110 thousand signals from the RPC chambers, as well as exact synchronous distribution of the signals on each level of the PAC RPC trigger system. The system sends information about energetic muons found in an event with constant delay with respect to the nominal bunch crossing time, as is required by the global trigger of CMS.

This paper describes the method used to synchronize the signals received from the RPC chambers, as well as a complex solution for synchronous signal transmission in the entire RPC system based on parametrized transmitter and receiver modules. The methodologies described here have been successfully used during tests of the system with cosmic muons (MTCC). The implemented hardware solutions significantly accelerate the process of synchronization, and are sufficient to be used for an automated procedure expected for the final system.

## References

- [1] Negra M D et al 2000 CMS TriDAS project: technical design report: 1. The trigger systems Technical Design Report CMS, CERN/LHCC 2000-038 (Geneva, Switzerland: CERN) (http://cmsdoc.cern.ch/cms/TDR/TRIGGERpublic/CMSTrigTDR.pdf)

- [2] Taylor B G 1998 TTC distribution for LHC detectors Nucl. Instrum. Methods A 45 821-8
- [3] 'LHC Bunch Collision', http://ttc.web.cern.ch/TTC/TTCmain.html#Collisions
- [4] Christiansen J et al 2003 TTCrx Reference Manual Version 3.8 (Geneva, Switzerland: CERN), http://ttc.web.cern.ch/TTC
- [5] Buńkowski K et al 2005 Diagnostic tools for the RPC muon trigger of the CMS detector design and test beam results IEEE Trans. Nucl. Sci. 52 3216-22
- [6] Górski M, Kudła M I and Poźniak K T 1998 Resistive plate chamber (RPC) based muon trigger system for the CMS experiment data compression/decompression system Nucl. Instrum. Methods A 419 701-6
- [7] Górski M, Kalinowski A, Królikowski J, Kudła I M, Poźniak K T, Wrochna G and Zalewski P 2004 Data transfer simulation for the RPC muon trigger of the CMS experiment Proc. SPIE 5484 247-56
- [8] Filipek T A, Poźniak K T, Kudła I M, Kierzkowski K, Okliński W and Romaniuk R S 2005 Fast synchronous distribution network of data streams for RPC muon trigger in CMS experiment Proc. SPIE 5775 139-49

- [9] Kalinowski A, Królikowski J and Zych P 2001 Muon trigger algorithms based on 6 RPC planes CMS NOTE-2001/045 (Geneva, Switzerland: CERN) (http://cmsdoc.cern.ch/documents/01/note01\_045.pdf)
- [10] Buńkowski K, Kalinowski A, Kudła I M, Poźniak K T and Wrochna G 2003 Pattern comparator trigger algorithm-implementation in FPGA Proc. SPIE 5125 165-74
- [11] Poźniak K T 2005 Parameterized, hierarchical sorter for RPC muon trigger Proc. SPIE 5775 111-20
- [12] Fengler A, Kudła I M and Zalewski P 1998 Ghosts buster for the RPC based muon trigger CMS NOTE-1998/012 (http://cmsdoc.cern.ch/documents/98/note98\_012.pdf)
- [13] http://www.altera.com/ [Altera Homepage]
- [14] http://www.xilinx.com/ [Xilinx Homepage]
- [15] http://cms.cern.ch/iCMS/jsp/page.jsp?mode=cms&link=/MTCC.html&name=MTCC [MTCC]
- [16] Wrochna G 1998 Synchronization of the CMS muon detector CMS IN 1998/017 (http://cmsdoc.cern.ch/documents/98/cr98\_017.pdf)
- [17] Moreira P, Toifl T, Kluge A, Cervelli G, Marchioro A and Christiansen J Gigabit optical link transmitter manual CERN-EP/MIC (Geneva, Switzerland: CERN) (http://proj-gol.web.cern.ch/proj-gol/manuals/gol\_manual.pdf)