A 48 ps Resolution, Timing Chip Constructed from Two Expandable 32-Channel Register-only TDCs

Research Article

Hang Yu, Jinghe Yang, Yan Li, Binjie Ge, Jinpeng Shen, Liming Si, and 1 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8697737/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted 8

Abstract

Raman spectroscopy is a nondestructive, label-free optical analysis technique that has found a wide range of applications. Gated Raman spectroscopy provides an effective way to suppress the influence of fluorescence photons. In this work, a novel 48 ps resolution timing chip constructed from two expandable 32-channel time-to-digital converters (TDCs) is presented for gated Raman spectroscopy. Each proposed TDC core consists of 32 independent channels, with a unified height of 30 µm for each channel.
The design can be conveniently integrated with contemporary single-photon avalanche diode (SPAD) technology. The TDC channels are constructed entirely from standard digital gates, primarily D-type registers, and uniformity among TDC channels is therefore naturally achieved. Thirteen critical rising edges for time measurement are generated by delay interpolation, which is entirely separated from the TDC channels. The critical edges are distributed to all TDC channels through isolated, shielded paths to minimize noise coupling. The separation between these critical rising edges, and therefore the measuring resolution, is fixed at 48 ps by two feedback locking loops referenced to a single 100 MHz clock. The register-only TDC makes the proposed design flexible for expansion: in the presented prototype timing chip, two identical core blocks are simply placed side-by-side to form a timing chip with 64 TDC channels. The prototype was fabricated in 180 nm CMOS technology, and its functionality was fully validated. When all TDC channels are excited with 1 MHz periodic inputs, the power consumption is only 0.48 mW/channel with a 1.8 V power supply.

Keywords: Gated Raman spectroscopy, SPAD, Time-to-digital converter (TDC), delay interpolation

1. INTRODUCTION

Raman spectroscopy is a nondestructive, label-free optical analysis technique that has seen various applications in the agricultural, food, oil and pharmaceutical industries, either to measure the chemical environment or to analyze molecular composition [1–3]. A typical Raman measurement uses a laser beam to excite the sample under test, and the resulting inelastically scattered light (carrying the signature of molecule-level vibrations) emitted from the sample is captured by a spectrometer [4–5].
However, Raman spectroscopy usually suffers from a low signal-to-noise ratio, because both the scattered laser light and the fluorescence emission from the substance show much higher amplitude and must be discarded during the measurement. Recently, gated Raman spectroscopy has gained popularity owing to the increasing recognition of the difference in timing response between Raman scattering and fluorescence emission [6–7]. This technique utilizes a highly focused laser beam to excite the sample and detects only those Raman-scattered photons that occur within a brief time window following the laser excitation. As a result, the interference from fluorescence photons, which are typically emitted over a longer timescale measured in nanoseconds, can be effectively minimized [8]. Incorporating highly sensitive sensors such as single-photon avalanche diodes (SPADs) makes Raman spectroscopy even more attractive, as the Raman spectrum can simply be rebuilt by quantitatively analyzing the detected photons at each spectral point [9]. In such a setup, accurate extraction of the Raman spectrum depends on the performance of the time-to-digital converter (TDC) array, preferably with a single TDC for each column of the SPAD line sensor [10–15]. The ideal TDCs for such a configuration must not only feature fine resolution in the range of tens of picoseconds, but also occupy a small silicon area, preferably compatible with their corresponding line sensor, to ensure compact system integration. In addition, consistency across TDC channels, which can be degraded by PVT variations, must be guaranteed.

Several different types of TDC designs have already been integrated with SPAD line sensors. In [10], a cyclic converter architecture with a delay-line-based time interval amplifier is used to implement the required TDC for each line sensor, achieving sub-100 ps measuring resolution. Alternatively, Erdogan et al.
demonstrated a TDC design based on a gated ring oscillator (GRO) with a measuring resolution of 51.2 ps [11]. However, without a reference to an exact frequency, these designs are susceptible to process, voltage and temperature (PVT) variations, and stable TDC performance across different channels can only be guaranteed either by characterizing the transistor electrical properties through extensive simulations or by post-fabrication tuning in the digital domain.

Feedback mechanisms such as delay-locked loops (DLLs) can be included to counteract the influence of PVT variations and thereby stabilize the TDC measuring resolution. In [12] and [13], Nissinen et al. discussed two different designs, both utilizing two nested DLLs to define the temporal resolution of the delay elements composing the TDCs. The control voltages generated by the nested DLLs are distributed to all TDC channels, each of which consists of exact replicas of the delay elements in cascaded form. D-type registers are then used to record the measurement results in the digital domain. With this configuration, the discussed TDCs, designed in 350 nm and 110 nm CMOS technology, demonstrated temporal resolutions as fine as 50 ps and 25.6 ps, respectively.

Other TDC array designs, such as those in [16] and [17], are dedicated to simultaneously achieving accurate timing measurement and extended dynamic range. Also based on a similar two-level feedback structure, both achieve fine measuring resolution in the sub-20 ps range, fast measurement speed and a dynamic range larger than a millisecond. However, the relatively large silicon area required to improve the matching of the basic delay elements and the uniformity across TDC channels makes them unsuitable for integration with SPAD line sensors. Other published designs, such as those in [18–20], adopt the successive approximation technique to reach high TDC measuring resolution, but their relatively low conversion speed limits their application in related fields.
In the published designs above, every TDC associated with a line sensor incorporates an intrinsic delay line composed of delay elements in cascaded or parallel configuration. The matching properties of these elements directly affect the consistency among TDC channels, and they are subject to the well-understood trade-off between silicon area and matching performance. As sensor dimensions continue to shrink in advanced semiconductor processes, new design approaches are required to simultaneously minimize TDC size and improve measuring consistency.

In this paper, we propose a new design scheme for arrayed TDCs that complies with these rigorous criteria. Without including any delay element that needs stringent performance matching, each TDC consists solely of D-type registers and therefore scales well with advances in semiconductor processes. A group of virtual clocks with evenly separated rising edges, generated in a separate block formed by parallel delay elements, is shared by and delivered to all TDC channels via strictly shielded paths for timing measurement. The timing separation between neighboring rising edges of the virtual clocks is also dynamically set by two nested DLLs. In such an arrangement, the consistency among TDC channels is almost inherently guaranteed. Additionally, as the necessary delay elements are independent of the TDC channels, silicon area can be allocated to them for better matching. Another advantage is a simplified calibration process when improved linearity is required, as there is only one separately placed group of delay elements that needs correction. This scheme also exhibits excellent extensibility. Although the prototype core contains only 32 TDC channels, each 30 µm wide, extending to more channels can be achieved by simply aligning several core blocks side-by-side, without additional signal-routing effort.

This paper is organized as follows.
In Section 2, the system architecture of the TDC core comprising 32 channels is presented, with a detailed discussion of its design methodology. A prototype design with 64 TDC channels is then constructed by combining two 32-channel cores to showcase the design extensibility. The 64-channel prototype is fabricated in 180 nm CMOS technology, and its functionality is fully characterized. The characterization results of the 64-channel prototype are presented and discussed in Section 3, followed by concluding remarks.

2. SYSTEMATIC ARCHITECTURE

Figure 1 illustrates the system architecture of the proposed core design featuring 32 independent TDC channels, and how it can be interfaced with SPAD line sensors. Without any intrinsic delay-generating elements, the TDC in each measuring channel consists of D-type registers only. The timing scale necessary for measurements is provided by a group of equally spaced rising edges, PCLK. PCLK is generated by an array of thirteen carefully matched delay elements (DEs) with capacitance interpolation, and the edges are subsequently transmitted to and shared by all 32 TDC channels. This configuration not only reduces the design complexity of the TDC channels, making them easier to size to match the line sensors, but also provides better uniformity among channels, as identical timing scales are used for measurement. Additionally, placing the delay line in close proximity to, but not within, the TDC channels relaxes the layout constraints on the DEs that are usually present in every TDC channel, thereby enabling better matching performance.

The measuring timing scale provided by PCLK, i.e. the spacing between neighboring rising edges, is maintained at 48 ps across PVT conditions by two nested delay-locked loops (DLLs). Referenced to a 100 MHz clock, the first DLL generates sixteen virtual clock phases every 10 ns, each separated by 625 ps.
Two of these locked phases, P and P , are then used as timing references by the second DLL. A single control voltage Vc is established through this two-step process and is then copied directly to a dedicated DE array, independent of the TDC channels, to set the 48 ps measuring resolution. The delay elements in the dedicated DE array are placed in very close proximity to each other, so better matching can be achieved without being constrained by limited silicon area.

Following a process similar to that of most related applications, the proposed core functions as follows. The signal that excites the narrow laser beam, Start, initiates the timing measurement. A delayed version of Start, denoted ST, triggers the DE array to produce the group of 48 ps-separated rising edges PCLK, which are distributed through SH_PATH. SH_PATH is a dedicated structure for transmitting PCLK to the TDC channels, in which the length of each signal line is carefully matched to maintain the relative phase relation of the transmitted signals. PCLK is then sampled by the digitized input signal TDC_IN[i] (i = 0 to 31) from the corresponding line sensor, resulting in thirteen thermometer-coded bits as the measurement result in each register-only TDC channel, which are routed in parallel to the Data Interface circuit for further processing.

The thermometer-coded results from each TDC channel are independently converted into 4-bit binary form in the Data Interface and stored in register blocks. A 128-bit data frame containing the measurement results from all 32 channels is then formed in the next clock period. An interrupt signal, INT, is generated upon detecting any non-zero result among the TDC channels, and transmission of the data frame to the external receiving device is initiated. This transmission takes a total of 32 clock cycles, sending the data frame in 4-bit parallel form together with the synchronized clock through dedicated IOs (DO and CLK_O).

2.1.
Interpolated Delay Elements (DEs)

The equally spaced rising edges for timing measurement are generated by a single group of interpolated DEs placed in an independent DE array, separate from the TDC channels. In this configuration, the interpolated DEs can occupy a larger silicon area for optimized matching. Figure 2(a) shows the basic circuit topology of the interpolated DEs, which is primarily composed of a current-limited inverter and a shunt capacitor Cs at its output. The delay of a DE is the time required to discharge the shunt capacitor Cs (together with the parasitic capacitance Cp) with a carefully controlled current determined by Vc. To first order, the delay of each interpolated DE is therefore proportional to its capacitance Cs and inversely proportional to the voltage-controlled discharging current. In this design, the same current-limited inverter, and therefore the same discharging current, is adopted for all interpolated DEs for matching purposes, and each DE incorporates a different output capacitance to generate a slightly different delay. As shown in Fig. 2(a), Cs takes 13 distinct capacitance values (integer multiples M of the unit capacitance Cu, with M ranging from 1 to 13) to construct the interpolated DEs and thus to generate PCLK0–PCLK12, respectively. As a result, if the maximum relative timing distance among the generated rising edges, which appears between PCLK0 and PCLK12, is fixed, linear interpolation applies, and the delay of each intermediate rising edge follows from the integer multiple of Cu shunted in the corresponding interpolated DE.

To ensure high-quality matching, the same capacitor array comprising 17 unit-sized capacitors is used for all interpolated DEs. Only the necessary number of unit capacitors is connected to the inverter output in each DE, while the remaining capacitors are grounded as dummies.
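As a rough illustration of the capacitance interpolation described above, the following sketch models each DE delay as the time needed to move the total output capacitance (M·Cu + Cp) through a voltage swing at a constant discharge current. The component values (`c_unit`, `c_par`, `delta_v`, `i_dis`) are hypothetical and were chosen only so that one unit capacitor corresponds to a 48 ps step; they are not taken from the design.

```python
def de_delay_s(m, c_unit=8e-15, c_par=20e-15, delta_v=0.9, i_dis=150e-6):
    """First-order delay of an interpolated DE with m unit capacitors.

    delay = (m * Cu + Cp) * dV / I
    All default values are illustrative assumptions, not design data.
    """
    return (m * c_unit + c_par) * delta_v / i_dis


# Delays of the thirteen DEs (M = 1..13) that would produce PCLK0..PCLK12.
delays = [de_delay_s(m) for m in range(1, 14)]

# Spacing between neighboring edges: constant, and independent of Cp.
steps = [later - earlier for earlier, later in zip(delays, delays[1:])]

for m, d in zip(range(1, 14), delays):
    print(f"M = {m:2d}: delay = {d * 1e12:6.1f} ps")
```

In this model the parasitic capacitance only shifts all delays by the same amount, which is consistent with the text's point that the intrinsic delay is not used for measurement while the relative delays are.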
Additionally, four extra unit-sized dummy capacitors are placed at the leftmost and rightmost corners to further improve matching. This arrangement also minimizes the impact of random Cp variation in the implemented elements. As the relative timing distance between PCLK and PCLK is locked at 625 ps by the two nested DLLs, which will be discussed in the next section, the separation between two neighboring rising edges of PCLK is expected to be a fixed fraction of 625 ps, nominally 48 ps.

Figure 2(b) presents the simulated delay of all thirteen interpolated DEs. The minimum delay, corresponding to M = 1 and referred to as the intrinsic delay, is approximately 418 ps. Because this intrinsic delay is susceptible to variations in PVT conditions, it should not be used for TDC measurements. However, the relative delays among the interpolated DEs are fixed and can serve as gauges for time-to-digital conversion. Figure 2(b) also shows the simulated linearity of the delay interpolation. It indicates that random variation of the parasitic capacitance Cp at the current-limited inverter output has minimal impact and can effectively be ignored.

The worst-case delay variations of all interpolated DEs are estimated through Monte Carlo simulations. Using an extracted view that includes parasitic components, the interpolated delay for each of M = 1 to 13 is simulated 500 times. The jitter of the rising edges follows a normal distribution, and the standard deviation for each simulated case is plotted in Fig. 2(c). The maximum standard deviation observed is only approximately 9.5 ps at M = 13, which is well below the 48 ps measuring resolution.

2.2. Nested DLLs to control the measuring resolution

The measuring resolution is dynamically locked using two nested DLLs, as presented in Fig. 3 and Fig. 4, respectively. Referenced to a 100 MHz input clock, the first DLL produces 16 virtual clock phases separated by 625 ps, which are denoted as P .
The second DLL then takes two locked phases from the first DLL as references and, through the locking process, generates a settled voltage Vc that represents the 48 ps measuring resolution. Vc is replicated to the separated DE array, so PCLK with the proper timing separation is readily available upon triggering by Start. The proposed scheme allows close-proximity placement of the interpolated DEs in both the separated DE array and the second DLL, so proper performance matching can be achieved.

As shown in the detailed system diagram in Fig. 3(a), the first DLL incorporates a phase-and-frequency detector (PFD), a charge pump, a voltage-to-current converter (V-I), and 16 carefully matched delay cells (DCs). The first DLL dynamically locks the rising edges of the first and the last DCs, so the delay of each DC is set to 625 ps if all of them are perfectly matched. Figure 3(b) shows the simulated delay of all 16 DCs across process corners and temperature conditions after locking is achieved. Notably, owing to the different capacitive loading at its output, the last DC produces slightly more delay (~30 ps) than the other DCs. For this reason, P should not be used as a reference for the second DLL to generate the target measuring resolution.

The second DLL includes two groups of interpolated DEs with different loading capacitance (M = 1 and M = 13), each group consisting of three cascaded interpolated DEs to average out any unexpected phase errors. Taking P and P as the corresponding inputs, the output rising edges of the two cascaded groups are aligned through the negative feedback loop. Since P is delayed from P by 3 × 625 ps, the delay difference between interpolated DEs with M = 1 and M = 13 is set to 625 ps, assuming perfect matching. The control voltage Vc is gradually established on the filtering capacitance Cfil, and is then buffered and sent to the separated DE array for resolution control.
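The locking condition of the second DLL can be written out explicitly. The sketch below uses symbols that do not appear in the original and are introduced only for illustration: t_a and t_b for the arrival times of the two reference phases, and d_1 and d_13 for the delays of a single interpolated DE with M = 1 and M = 13. It also assumes that the earlier phase drives the slower (M = 13) group.

```latex
\begin{align}
t_a + 3\,d_{13} &= t_b + 3\,d_{1}
  && \text{(outputs aligned by the feedback loop)} \\
t_b - t_a &= 3 \times 625\ \text{ps}
  && \text{(reference phases from the first DLL)} \\
\Rightarrow\quad d_{13} - d_{1} &= \frac{t_b - t_a}{3} = 625\ \text{ps}
\end{align}
```

Under these assumptions, a fixed timing error on either reference phase is divided by three when referred to a single DE, which is one way to read the text's remark that the three-stage cascade averages out phase errors.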
This nested topology is similar to that reported in [12]. However, the presented design includes twice the number of interpolated DEs for improved measuring resolution and extended dynamic range. Most importantly, the Vc generated by this two-step locking process acts only on the thirteen interpolated DEs in the separated DE array; it is not copied to any TDC channel through long metal lines susceptible to noise coupling. As a result, matched performance is expected from all TDC channels.

2.3. Independent DE array and register-only TDC channels

As a functional block completely independent of the TDC channels, the separated DE array comprises only the thirteen interpolated DEs, all biased by the same control voltage Vc. When Start initiates the excitation of the laser beam, its delayed version ST simultaneously triggers all the interpolated DEs in the separated DE array. As a result, the 13 critical rising edges, PCLK0–PCLK12, are produced upon triggering. The latency between Start and the actual excitation of the laser beam can be compensated by adjusting the occurrence of ST, which can be tuned with a resolution of approximately 80 ps over an adjustment range of around 80 ns. This tuning range is sufficient for most application scenarios.

Generating the critical edges entirely in the separated array greatly simplifies the TDC channel. As illustrated in Fig. 5, the TDC channel in this design consists solely of 13 D-type registers. With all of their clock terminals connected to the channel input TDC_IN[i], the D terminals of the thirteen registers within the same TDC channel are routed respectively to PCLKO_0–PCLKO_12, the replicas of PCLK_0–PCLK_12 after distribution through the shielded path (SH_PATH). As a result, the rising edges arriving before TDC_IN[i] result in binary '1's at the register outputs, whereas those arriving after result in '0's.
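The sampling behavior of a register-only channel, and the conversion and framing performed afterwards by the Data Interface, can be mimicked with a short behavioral model. This is a sketch under stated assumptions rather than the chip's logic: the edges are taken as ideal and exactly 48 ps apart, register setup and hold effects are ignored, the thermometer code is converted by counting ones, and the channel ordering inside the 128-bit frame is assumed. The function names are hypothetical.

```python
N_EDGES = 13      # PCLK0..PCLK12
STEP_PS = 48.0    # nominal edge spacing stated in the text


def sample_edges(t_in_ps, first_edge_ps=0.0, step_ps=STEP_PS, n_edges=N_EDGES):
    """Model one channel: bit k is 1 if edge k rose before TDC_IN arrived."""
    return [1 if first_edge_ps + k * step_ps < t_in_ps else 0
            for k in range(n_edges)]


def thermometer_to_binary(bits):
    """Count the ones; a result of 0..13 fits in a 4-bit word."""
    return sum(bits)


def pack_frame(codes):
    """Pack 32 four-bit codes into one 128-bit integer (channel 0 first, assumed)."""
    if len(codes) != 32 or any(not 0 <= c <= 15 for c in codes):
        raise ValueError("expected 32 codes in the range 0..15")
    frame = 0
    for code in codes:
        frame = (frame << 4) | code
    return frame


bits = sample_edges(400.0)           # TDC_IN arrives 400 ps after PCLK0
code = thermometer_to_binary(bits)   # number of edges that arrived first
frame = pack_frame([code] * 32)      # same result on all channels, for brevity
print(bits, code, hex(frame))
```

Counting ones is only one possible thermometer-to-binary conversion; it is used here because it stays well defined even if a single register resolves the wrong way near an edge.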
Consequently, 13 bits of thermometer-coded data, denoted as Di , are generated as the result of a single timing measurement.

The register-only TDC naturally achieves both a high degree of integration and consistency among channels. However, for such a configuration to succeed, PCLK0–PCLK12 must be delivered with high quality to all 32 TDC channels. In the presented design, the signal paths for these critical rising edges from the separated DE array to each TDC channel are carefully aligned in a shielded path (SH_PATH). Their lengths are arranged to be equal, and they are surrounded by grounded conductive layers, including metals and heavily implanted polysilicon, for shielding. This shielding structure provides an identical environment for the propagating signals and thus reduces unwanted coupling and signal interference. In addition, multiple buffers are inserted along the paths to maintain fast rising-edge transitions when the distributed signals reach the TDC channels. The proposed configuration is shown in Fig. 6, which illustrates a section of the SH_PATH and eight TDC channels.

3. EXPANDABLE LAYOUT AND MEASUREMENT RESULTS

The proposed prototype core with 32 register-only TDC channels is designed in 180 nm CMOS technology. The height of each TDC channel is strictly set to 30 µm to match the targeted SPAD line sensors, so the total height of all 32 TDC channels is 960 µm. Including the nested DLLs, the DE array, the SH_PATH, and the power management unit (PMU), the total silicon area of the core design with 32 TDC channels is 960 µm × 1152 µm.

The proposed core can easily be expanded to include more TDC channels if needed. Figure 7 demonstrates this expandability: a 64-channel design is formed by placing two of the proposed cores side-by-side, each separately consisting of 32 TDC channels, the separated DE array with 13 interpolated DEs, the nested DLLs for resolution control, and the data interface.
This combination leaves no physical gap in between, and only a few IOs, required for the power supplies and the 100 MHz reference clock, are shared.

To validate the design methodology, in this proof-of-concept the inputs of three neighboring channels (TDC_IN , with k as integers) share a single IO for convenient packaging, and an additional 2-bit digital word selects which TDC channel receives the test signal from the IO. As a result, complete validation of all 64 TDC channels requires three separate testing phases, each covering one third of the channels. For each testing phase, a high-precision arbitrary waveform generator with two correlated outputs and 350 MHz physical bandwidth is used to provide the required excitation pair, Start and TDC_IN . The phase difference between the rising edges of the two excitation signals can be adjusted in steps of 0.01 degree, equivalent to 30 ps in the time domain (ΔT).

Figure 8 presents the typical statistical distribution of 11 TDC channels, taken from the characterization results in which Start and TDC_IN are separated by 400 ps with the 2-bit control word set to 01. Each histogram in the figure comprises a total of 4000 samples. The measuring accuracy of the presented TDC channels, indicated by the corresponding standard deviation σ, is also listed in the histograms.

By using a series of Start and TDC_IN pairs with progressively increasing time intervals, the dynamic range of all TDC channels can be evaluated. Figure 9 illustrates the measured dynamic ranges of all 64 TDC channels; each curve represents the performance of a single channel and comprises 25 incremental data points, covering 750 ps in total. Consistency across TDC channels is clearly demonstrated, as all groups of critical rising edges involved in the measurement are generated by the same separated DE array. Timing discrepancy, or skew among TDC channels in Fig.
9 is primarily introduced by the unequal path lengths of the excitation input routed to each TDC channel on the testing platform. The maximum discrepancy among all 64 measured channels is less than 3 × ΔT, according to the results presented in Fig. 9(a).

Further improvement of the TDC linearity can be achieved by an additional one-time, simplified calibration process. As the non-linearity of all TDC channels originates from the same shared PCLK, calibrating the thirteen interpolated DEs alone is sufficient. The one-time calibration uses the statistical results presented in Fig. 9(a) as prior knowledge to calculate the difference, in whole bits, between the measured means and their ideal linear counterparts. Subtracting this calculated difference from the recorded result of each single measurement corrects the result, thereby significantly enhancing the linearity. Figure 9(b) and (c) present the corrected results of all 64 TDC channels and their corresponding residual errors after the one-time calibration process. The timing skew appearing among TDC channels is also removed by post-processing. Compared with the ideal linear case, all TDC channels achieve linearity better than 0.5 LSB across most of the covered range, with slightly increased residual errors at the extremities of the dynamic range.

With a 1.8 V supply voltage, the presented timing chip draws only 17 mA when all TDC channels are active with 1 MHz periodic input pairs. Normalized to each TDC channel, the power consumption is only 0.48 mW/channel.

Key performance metrics of the proposed IC are compared with those of some state-of-the-art designs for similar applications in Table I. The TDC array presented in this work combines ps-level resolution with much reduced power consumption per TDC channel.
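One possible reading of the one-time calibration described above is a small lookup table of whole-LSB corrections, built once from a sweep of known intervals and then subtracted from every raw code. The sketch below follows that reading; the function names, the binning of sweep points by rounded mean code, and the synthetic sweep data are all assumptions made for illustration and are not taken from the paper.

```python
def build_correction(intervals_ps, mean_codes, lsb_ps=48.0, n_codes=14):
    """Derive a whole-LSB correction for each raw code from a one-time sweep.

    intervals_ps: known Start-to-TDC_IN intervals used in the sweep
    mean_codes:   measured mean output code at each interval
    """
    sums = [0.0] * n_codes
    counts = [0] * n_codes
    for t_ps, mean in zip(intervals_ps, mean_codes):
        raw = min(n_codes - 1, max(0, round(mean)))   # bin by nearest raw code
        sums[raw] += mean - t_ps / lsb_ps             # error vs. ideal line
        counts[raw] += 1
    # Average error per raw code, rounded to whole bits; 0 if never observed.
    return [round(s / c) if c else 0 for s, c in zip(sums, counts)]


def apply_correction(raw_code, correction):
    """Subtract the stored whole-bit error from a single measurement."""
    return raw_code - correction[raw_code]


# Synthetic sweep: codes read one LSB high for intervals of 240 ps and above.
intervals = [48.0 * k for k in range(13)]
means = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
correction = build_correction(intervals, means)
print(correction, apply_correction(6, correction))
```

Because all channels share the same PCLK edges, a single table of this kind could in principle serve every channel, which matches the text's point that only the thirteen interpolated DEs need calibrating.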
Additionally, the TDCs presented in this work consist only of D-type registers, making their performance less susceptible to non-idealities such as PVT variations or device mismatch. This design scheme also makes it easier to match the width of each TDC channel to its corresponding sensor line, especially in advanced technologies with much smaller sensor sizes. As an independent data interface and IOs are dedicated to each core block, the maximum measurement speed is not affected by adding more core blocks and TDC channels. With a 100 MHz reference clock, the data output rate of each core block is 400 Mbps, and the maximum measurement speed therefore remains 3.125 Msps.

4. CONCLUSION

A novel 48 ps resolution timing chip constructed from two expandable core blocks with 32 TDC channels each is presented in this paper. In each core block, the thirteen rising edges that are critical for timing measurement are produced by interpolated DEs placed separately from all TDC channels. As a result, the TDC channels can be constructed purely from D-type registers, so a high degree of integration and performance consistency among channels are naturally achieved. Referenced to a 100 MHz clock, the 48 ps measuring resolution is stabilized across PVT conditions by two nested DLLs. The total area of the core circuit, including 32 TDC channels, two DLLs and the PMU, is 960 µm × 1152 µm. A prototype timing chip with 64 TDC channels is constructed and fabricated in 180 nm CMOS technology. The prototype is formed by simply placing two identical core blocks side-by-side, with shared power supplies and reference clock. The performance uniformity of all TDC channels in the presented prototype is validated through experiments. Further improvement of linearity to better than 0.5 LSB can be achieved by a simplified one-time calibration process. Operating from a 1.8 V supply, the power consumption is only 0.48 mW/channel.
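The per-channel power and readout figures quoted in the text can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the paper and assumes that frames are sent back-to-back with no idle cycles between them; the variable names are illustrative.

```python
REF_CLOCK_HZ = 100e6       # reference clock
DO_WIDTH_BITS = 4          # width of the parallel output DO
CHANNELS_PER_CORE = 32
BITS_PER_CHANNEL = 4       # binary result per channel
SUPPLY_V = 1.8
SUPPLY_CURRENT_A = 17e-3   # whole chip, all channels active at 1 MHz
TOTAL_CHANNELS = 64

frame_bits = CHANNELS_PER_CORE * BITS_PER_CHANNEL       # bits per data frame
cycles_per_frame = frame_bits // DO_WIDTH_BITS          # clock cycles per frame
data_rate_bps = REF_CLOCK_HZ * DO_WIDTH_BITS            # output rate per core
frames_per_second = data_rate_bps / frame_bits          # readout frames per second
power_per_channel_w = SUPPLY_V * SUPPLY_CURRENT_A / TOTAL_CHANNELS

print(f"frame size:        {frame_bits} bits ({cycles_per_frame} cycles)")
print(f"data rate:         {data_rate_bps / 1e6:.0f} Mbps per core")
print(f"measurement speed: {frames_per_second / 1e6:.3f} Msps")
print(f"power per channel: {power_per_channel_w * 1e3:.2f} mW")
```

Since each core block has its own data interface and IOs, the frame rate computed here is per core and does not change when more cores are added.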
Table I. Performance comparison with state-of-the-art designs

| | [10] | [11] | [12] | [13] | This work |
|---|---|---|---|---|---|
| Technology | 350 nm | 130 nm | 350 nm | 110 nm | 180 nm |
| Channel width | 41.6 µm | 23.78 µm | 35 µm | 32.9 µm | 30 µm |
| Resolution | 19.5 ps | 51.2 ps | 50–100 ps | 25.6–65 ps | 48 ps |
| Dynamic range | 640 ns | 1.64 ns – 209.6 ns | 350–700 ps | 3.2–8.2 ns | 650 ps |
| Power consumption per channel @ excitation frequency | 1.17 mW @ 0.1 MHz | 3.83 mW @ 20 MHz (TCSPC mode) | N/A | 0.109 mW @ 280 kHz | 0.48 mW @ 1 MHz |
| Max. measurement frequency* | 0.17 Msps | 0.23 Msps | 0.4 Msps | 2.34 Msps | 3.125 Msps |
| Resolution locking mechanism | Nutt interpolation with time doubler | GRO with DAC-tuned supply | Nested DLLs | Nested DLLs | Nested DLLs |
| TDC channel | Cyclic converter with time doublers | Flash, matched delay line | Flash, matched delay line | Flash, matched delay line | Flash, registers-only |

*Calculated as readout frames per second.

Declarations

Author Contribution

H. Yu, Y. Li, and J. Yang performed the main tasks of the research, prepared figures, analyzed data, and co-wrote the manuscript. B. Ge, J. Shen and L. Shi provided help with data analysis and manuscript writing. S. Ma guided the research.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China under Grant Nos. 62250002, 62201194, 62571049, 61471245, 62571049 and U120125. This work is also supported by Grant X210251TH210 from Ji Hua Laboratory.

The authors declare no competing interests.

References

[1] Saar, B. G., Freudiger, C. W., Reichman, J., et al. (2010). Video-rate molecular imaging in vivo with stimulated Raman scattering. Science, 330(6009), 1368–1370.
[2] Bumbrah, G. S., & Sharma, R. M. (June 2016). Raman spectroscopy – Basic principle, instrumentation and selected applications for the characterization of drugs of abuse. Egyptian Journal of Forensic Sciences, 6, 209–215.
[3] Carter, J. C., Brewer, W. E., & Angel, S. M. (Dec. 2000). Raman spectroscopy for the in-situ identification of cocaine and selected adulterants.
Applied Spectroscopy, 54, 1876–1881.
[4] Raman, C. (1928). New Type of Secondary Radiation. Nature, 121, 501–502.
[5] Blacksberg, J., Alerstam, E., Cochrane, C. J., Maruyama, Y., & Farmer, J. D. (Jan. 2020). Miniature high-speed, low-pulse-energy picosecond Raman spectrometer for identification of minerals and organics in planetary science. Applied Optics, 59(2), 433.
[6] Kögler, M., & Heilala, B. (2021). Time-gated Raman spectroscopy – a review. Measurement Science & Technology, 32, 1–17.
[7] Chiuri, A., & Angelini, F. (Apr. 2021). Fast Gating for Raman Spectroscopy. Sensors, 21, 1–41.
[8] Matousek, P., Towrie, M., Ma, C., et al. (2001). Fluorescence suppression in resonance Raman spectroscopy using a high-performance picosecond Kerr gate. Journal of Raman Spectroscopy, 32(12), 983–988.
[9] Maruyama, Y., Blacksberg, J., & Charbon, E. (2013). A 1024×8 700 ps time-gated SPAD line sensor for laser Raman spectroscopy and LIBS in space and rover-based planetary exploration. 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers, 110–111.
[10] Talala, T., Parkkinen, E., & Nissinen, I. (May 2023). CMOS SPAD Line Sensor with Fine-Tunable Parallel Connected Time-to-Digital Converters for Raman Spectroscopy. IEEE Journal of Solid-State Circuits, 58, 1350–1361.
[11] Erdogan, A. T., Walker, R., et al. (June 2019). A CMOS SPAD Line Sensor with Per-Pixel Histogramming TDC for Time-Resolved Multispectral Imaging. IEEE Journal of Solid-State Circuits, 54, 1705–1719.
[12] Nissinen, I., Nissinen, J., Keränen, P., Stoppa, D., & Kostamovaara, J. (2018). A 16×256 SPAD Line Detector With a 50-ps, 3-bit, 256-Channel Time-to-Digital Converter for Raman Spectroscopy. IEEE Sensors Journal, 18(9), 3789–3798.
[13] Talala, T., Parkkinen, E., & Nissinen, I. (May 2023). Line Sensor With Fine-Tunable Parallel Connected Time-to-Digital Converters for Raman Spectroscopy. IEEE Journal of Solid-State Circuits, 58(5), 1350–1361.
[14] Pancheri, L., & Stoppa, D. (2009).
A SPAD-based pixel linear array for high-speed time-gated fluorescence lifetime imaging, 2009 Proceedings of ESSCIRC, Athens, Greece, pp. 428–431. Maruyama, Y., Blacksberg, J., & Charbon, E. (Jan. 2014). A 1024 × 8, 700-ps Time-Gated SPAD Line Sensor for Planetary Surface Exploration with Laser Raman Spectroscopy and LIBS. IEEE Journal of Solid-State Circuits , 49 (1), 179–189. Li, Y., Yu, H., et al. (Aug., 2018). A CMOS Time-to-Digital Converter for real-time optical time-of-flight sensing system. IEEE Communications Magazine , 56 , 113–119. Ke, Q., Yu, H. (2022). Sept., A 32-Channel Time-to-Digital Converter with 20-ps Resolution for ToF Applications, IEEE International Conference on Circuits and Systems, Chengdu. Mantyniemi, A., Rahkonen, T., & Kostamovaara, J. (2009). A CMOS time-to-digital converter (TDC) based on a cyclic time domain successive approximation interpolation method, IEEE J. Solid-State Circuits , vol. 44, no. 11, pp. 3067–3078, Nov. Chen, C. C., Chen, P., Hwang, C. S., & Chang, W. (2005). A precise cyclic CMOS time-to-digital converter with low thermal sensitivity, IEEE Trans. Nucl. Sci., vol. 52, no. 4, pp. 834–838, Aug. Tisa, S., Lotito, A., Giudice, A., & Zappa, F. (2003). Monolithic time-to-digital converter with 20 ps resolution, in Proc. European Solid-State Circuits Conf., ESSCIRC, Sep. pp. 465–468. Additional Declarations No competing interests reported. 
Reviewers agreed at journal 12 Apr, 2026
Reviews received at journal 22 Mar, 2026
Reviewers agreed at journal 18 Feb, 2026
Reviewers agreed at journal 04 Feb, 2026
Reviewers invited by journal 02 Feb, 2026
Editor assigned by journal 29 Jan, 2026
Submission checks completed at journal 29 Jan, 2026
First submitted to journal 26 Jan, 2026
{"props":{"pageProps":{"initialData":{"identity":"rs-8697737","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":584472709,"identity":"a9489a56-af48-4f9a-87f7-d682a491810b","order_by":0,"name":"Hang Yu","email":"","orcid":"","institution":"Shenzhen MSU_BIT University","correspondingAuthor":false,"prefix":"","firstName":"Hang","middleName":"","lastName":"Yu","suffix":""},{"id":584472710,"identity":"b8dce5cd-a02a-4a15-ba1e-b680011f965a","order_by":1,"name":"Jinghe Yang","email":"","orcid":"","institution":"Shenzhen MSU_BIT
University","correspondingAuthor":false,"prefix":"","firstName":"Jinghe","middleName":"","lastName":"Yang","suffix":""},{"id":584472711,"identity":"020032fb-ea3c-4125-9d5b-ae9a0fce4385","order_by":2,"name":"Yan Li","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAwElEQVRIiWNgGAWjYBACPmYQWcHAYACieYjRwgbWcoYkLSCCsY0kLew8hp8L59VFm0skMD5428Ygb07YYTzG0jO3Hc7dOSOB2XBuG4PhzgbCWgykebcdyN1wI4FNmreNIcHgABG2/OadUwfSwv6bWC1m0rwNzGBbmInUwlZmzXMM6Jeeh82Sc85JGG4gpIWf//Dm2zw1dbnb2ZMPfnhTZiNP0BYGBg4DKIOxAUhIEFQPBOwPiFE1CkbBKBgFIxkAAOPHN0kcddw7AAAAAElFTkSuQmCC","orcid":"","institution":"Shenzhen MSU_BIT University","correspondingAuthor":true,"prefix":"","firstName":"Yan","middleName":"","lastName":"Li","suffix":""},{"id":584472712,"identity":"80623546-42c2-48ac-a340-78c2075524e7","order_by":3,"name":"Binjie Ge","email":"","orcid":"","institution":"Shenzhen MSU_BIT University","correspondingAuthor":false,"prefix":"","firstName":"Binjie","middleName":"","lastName":"Ge","suffix":""},{"id":584472713,"identity":"2cc17fe4-c602-4a01-baca-658f791407d2","order_by":4,"name":"Jinpeng Shen","email":"","orcid":"","institution":"Shenzhen MSU_BIT University","correspondingAuthor":false,"prefix":"","firstName":"Jinpeng","middleName":"","lastName":"Shen","suffix":""},{"id":584472714,"identity":"a63f0944-d4c2-4aa6-a6d8-dc5581163039","order_by":5,"name":"Liming Si","email":"","orcid":"","institution":"Beijing Institute of Technology","correspondingAuthor":false,"prefix":"","firstName":"Liming","middleName":"","lastName":"Si","suffix":""},{"id":584472715,"identity":"8eb31d83-81fc-4a27-8496-05b683f7d7be","order_by":6,"name":"Siguang Ma","email":"","orcid":"","institution":"Ji Hua Laboratory","correspondingAuthor":false,"prefix":"","firstName":"Siguang","middleName":"","lastName":"Ma","suffix":""}],"badges":[],"createdAt":"2026-01-26 
07:53:31","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8697737/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8697737/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":101843826,"identity":"d1217ddf-2436-4206-88bf-14493a15866f","added_by":"auto","created_at":"2026-02-04 08:58:31","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":474691,"visible":true,"origin":"","legend":"\u003cp\u003eSystematic architecture of the proposed design: (a) The systematic architecture of the proposed design, featuring 32 TDC channels with each connected to a sensor line (b) The timing diagram illustrates the key signal sequences involved in the timing measurement process of the proposed design.\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/dcd328a1b11308178ba2165f.jpeg"},{"id":101843818,"identity":"131f9e6c-3541-4f45-9fcc-2929fda6c8b6","added_by":"auto","created_at":"2026-02-04 08:58:28","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":570956,"visible":true,"origin":"","legend":"\u003cp\u003eDesign and performance of the interpolated DEs. (a) Topology of the interpolated DE. (b) the simulated delay results of the 13 interpolated DEs demonstrate the linearity of delay interpolation. 
(c) simulated delay variations of all interpolated DEs when their delays are set as shown in (b).\u003c/p\u003e","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/5f6eab2504632b5c2dd0be67.jpeg"},{"id":101843833,"identity":"39a6d196-70ff-48fd-92aa-bba4e8891bb0","added_by":"auto","created_at":"2026-02-04 08:58:32","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":114505,"visible":true,"origin":"","legend":"\u003cp\u003eDesign of the first DLL with 16 phases (a) systematic diagram of the first DLL, (b) simulated delays of all DCs across various process corners.\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/fdbd9db927a35c04bcafe238.jpeg"},{"id":101843831,"identity":"35abf8bb-e618-43f2-9426-8c875df730b5","added_by":"auto","created_at":"2026-02-04 08:58:32","extension":"jpeg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":98624,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic of the second DLL, illustrating the structure with two signal paths each containing three cascaded interpolated DEs for setting the delay difference between specific clock phases and generating the control voltage Vc.\u003c/p\u003e","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/517ee128c30db22de888a5ad.jpeg"},{"id":101843810,"identity":"1a7c0c45-de14-4812-90fb-cc437d72859e","added_by":"auto","created_at":"2026-02-04 08:58:23","extension":"jpeg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":120165,"visible":true,"origin":"","legend":"\u003cp\u003eDetailed schematic of separated DE array and the TDC channels comprised only by D-type 
registers.\u003c/p\u003e","description":"","filename":"floatimage5.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/eb88cb6aade49efdf980dbd5.jpeg"},{"id":101843814,"identity":"6543da14-2ec4-44d0-a8e4-8111e98a4a5b","added_by":"auto","created_at":"2026-02-04 08:58:25","extension":"jpeg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":257750,"visible":true,"origin":"","legend":"\u003cp\u003ePartial layout to demonstrate the shielded path (\u003cem\u003eSH_PATH\u003c/em\u003e) for \u003cem\u003ePCLK0-12\u003c/em\u003e: it must not only provide equal path length for the thirteen rising edges, but at the same time it should help keep fast transitions on the rising edges at the channel outputs.\u003c/p\u003e","description":"","filename":"floatimage6.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/774b097af2458b5a7a1e4121.jpeg"},{"id":101843802,"identity":"b7b3f96e-c22a-4fa1-8776-9380591cb185","added_by":"auto","created_at":"2026-02-04 08:58:18","extension":"jpeg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":326686,"visible":true,"origin":"","legend":"\u003cp\u003eMicrograph of the proposed expandable design as two core blocks with each consisting of 32 TDC channels.\u003c/p\u003e","description":"","filename":"floatimage7.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/2287e883d598617720a153bd.jpeg"},{"id":101843837,"identity":"893af06f-820e-4741-8c9f-1b62722eacb6","added_by":"auto","created_at":"2026-02-04 08:58:33","extension":"jpeg","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":200675,"visible":true,"origin":"","legend":"\u003cp\u003eTypical statistical histograms of the eleven measured TDC channels during one measurement (σ in unit of 
bins)\u003c/p\u003e","description":"","filename":"floatimage8.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/5e46b6f9ea5cd8bcafe590af.jpeg"},{"id":101843872,"identity":"cf0c67df-04ec-4f40-9f0f-0b88ff674a3c","added_by":"auto","created_at":"2026-02-04 08:58:42","extension":"jpeg","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":289532,"visible":true,"origin":"","legend":"\u003cp\u003eMeasured performance of all 64 channels (core block 1 and core block 2) of the presented chip with incremental phase step of 30 ps (ΔT):(a) uncalibrated; (b) after one-time calibration; (c) the residual errors for each channel after the one-time calibration.\u003c/p\u003e","description":"","filename":"floatimage9.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/d249ac0fab200eac79ec6ffe.jpeg"},{"id":101843976,"identity":"50d79d2d-bf1e-451e-b728-d72de204100f","added_by":"auto","created_at":"2026-02-04 08:58:56","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2930676,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8697737/v1/644c4ddf-f8d6-4615-baa9-2a828008306f.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"A 48 ps Resolution, Timing Chip Constructed from Two Expandable 32-Channel Register-only TDCs","fulltext":[{"header":"1. INTRODUCTION","content":"\u003cp\u003eRaman spectroscopy is a nondestructive, label-free optical analysis technique, and has seen various applications in agricultural, food, oil and pharmaceutical industries, either to measure chemical environment or to analyze molecular compositions [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
A typical Raman measurement involves using a laser beam to excite the sample-under-test and the resulting inelastic scattering of light (with signature of molecule-level vibration) emitted from the sample are captured by spectrometer [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. However, low signal-to-noise ratio is usually seen by Raman spectroscopy, as both the scattered laser energy and fluorescence emission from substance show much higher amplitude, which should be discarded during the measurement. Recently, gated Raman spectroscopy has gained popularity due to the increasing recognition of the timing response difference between Raman scattering and fluorescence emission [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. This technique utilizes a highly focused laser beam to excite the sample and detects only those Raman-scattered photons that occur within a brief time window following the laser excitation. As a result, the interference from fluorescence photons, which typically emit over a longer timescale measured in nanoseconds, can be effectively minimized [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIncorporating very-sensitive sensors such as single-photon avalanche diode (SPAD) makes Raman spectroscopy even more attractive, and the Raman spectrum can be simply re-built by quantitatively analyzing the detected photons at each spectral point [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. 
In such a setup, accurate extraction of the Raman spectrum depends on the performance of the time-to-digital converter (TDC) array, preferably with a single TDC per column of SPAD line sensors [\u003cspan additionalcitationids=\"CR11 CR12 CR13 CR14\" citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. An ideal TDC in such a configuration must not only feature fine resolution in the range of tens of picoseconds but also occupy a small silicon area, preferably matching its corresponding line sensor to ensure compact system integration. In addition, consistency across TDC channels, which can be degraded by process, voltage and temperature (PVT) variations, must be guaranteed. Several different types of TDC designs have already been integrated with SPAD line sensors. In [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e], a cyclic converter architecture with a delay-line-based time-interval amplifier is used to implement the required TDC for each line sensor, achieving sub-100 ps measuring resolution. Alternatively, Erdogan et al. demonstrated a TDC design based on a gated ring oscillator (GRO), pushing the measuring resolution to 51.2 ps [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. However, without reference to any exact frequency, these designs are susceptible to PVT variations, and stable TDC performance across different channels can only be guaranteed either by characterizing transistor electrical properties through extensive simulations or by post-fabrication tuning in the digital domain. Feedback mechanisms such as delay-locked loops (DLLs) can be included to counteract PVT variations and thereby stabilize the TDC measuring resolution. 
In [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] and [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], Nissinen et al. discussed two different designs, both using two nested DLLs to define the temporal resolution of the delay elements composing the TDCs. The control voltages generated by the nested DLLs are then distributed to all TDC channels, each consisting of exact replicas of the delay elements in cascaded form. D-type registers are then used to record the measurement results in the digital domain. With this configuration, the discussed TDCs, designed in 350 nm and 110 nm CMOS technology respectively, demonstrated temporal resolutions as fine as 50 ps and 25.6 ps. Other TDC array designs, such as those in [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] and [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e], target both accurate timing measurement and extended dynamic range. Also based on a similar two-level feedback structure, they both achieve fine measuring resolution in the sub-20 ps range, fast measurement speed, and a dynamic range larger than a millisecond. However, the relatively large silicon area required to improve the matching of basic delay elements and the uniformity across TDC channels makes them unsuitable for integration with SPAD line sensors. 
Other published designs such as those in [\u003cspan additionalcitationids=\"CR19\" citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] adopt the successive approximation technique to reach high TDC measuring resolution, but their relatively low conversion speed limits their application in related fields.\u003c/p\u003e \u003cp\u003eIn the above published designs, each TDC associated with a line sensor incorporates an intrinsic delay line consisting of delay elements in a cascaded or parallel configuration, and the matching of these elements directly affects the consistency among TDC channels, owing to the well-understood trade-off between silicon area and matching performance. As sensor dimensions continue to shrink in advanced semiconductor processes, new design approaches are required to simultaneously minimize TDC size and improve measuring consistency.\u003c/p\u003e \u003cp\u003eIn this paper, we propose a new design scheme for arrayed TDCs that meets these rigorous criteria. Without including any delay element that needs stringent performance matching, each TDC consists solely of D-type registers, and therefore scales well with advances in semiconductor process technology. A group of virtual clocks with evenly separated rising edges, which are generated by a separate block formed by parallel delay elements, is shared and delivered to all TDC channels via strictly shielded paths for timing measurement. The timing separation between neighboring rising edges of the virtual clocks is dynamically set by two nested DLLs. In such an arrangement, the consistency among TDC channels is almost inherently guaranteed. Additionally, as the necessary delay elements are independent of the TDC channels, a larger silicon area can be allocated to them for better matching. 
Another advantage is a simplified calibration process when improved linearity is required by the TDC channels, as there is only one separately placed group of delay elements that needs correction. This scheme also exhibits excellent extensibility. Although the prototype core contains only 32 TDC channels, each 30 \u0026micro;m wide, extending to more channels can be achieved by simply aligning several core blocks side-by-side, without additional effort for signal routing.\u003c/p\u003e \u003cp\u003eThis paper is organized as follows. In Section 2, the systematic architecture of the 32-channel TDC core is presented, with a detailed discussion of its design methodology. A prototype design including 64 TDC channels is then constructed by combining two 32-channel cores, to showcase the design extensibility. The 64-channel prototype design is fabricated using 180 nm CMOS technology, and its functionality is fully characterized. The characterization results of the 64-channel prototype are presented and discussed in Section 3, followed by concluding remarks.\u003c/p\u003e"},{"header":"2. SYSTEMATIC ARCHITECTURE","content":"\u003cp\u003eFigure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e illustrates the systematic architecture of the proposed core design featuring 32 independent TDC channels, and how it can be interfaced with SPAD line sensors. Without any intrinsic delay-generating elements, the TDC in each measuring channel consists of D-type registers only. The timing scale necessary for measurements is provided by a group of equally spaced rising edges PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026gt;. PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026thinsp;\u0026gt;\u0026thinsp;is generated by an array consisting of thirteen carefully matched delay elements (DEs) with capacitance interpolation, and the edges are subsequently transmitted to and shared by all 32 TDC channels. 
This configuration not only reduces the design complexity of the TDC channels, making them more easily sized to match the line sensors, but also provides better uniformity among different channels, as identical timing scales are used for measurement. Additionally, placing the delay line in close proximity to, but not within, the TDC channels relaxes the layout constraints of the DEs that are usually present in every TDC channel, thereby enabling better matching performance. The measuring timing scale provided by PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026gt;, that is, the spacing between neighboring rising edges, is maintained at 48 ps across various PVT conditions by two nested Delay Locked Loops (DLLs). Referencing a 100 MHz clock, the first DLL generates sixteen virtual clock phases every 10 ns, each separated by 625 ps. Two of these locked phases, P\u0026thinsp;\u0026lt;\u0026thinsp;3\u0026thinsp;\u0026gt;\u0026thinsp;and P\u0026thinsp;\u0026lt;\u0026thinsp;6\u0026gt;, are then used as timing references by the second DLL. A single control voltage Vc is therefore established through this two-step process, which is then directly copied to a dedicated DE array independent of the TDC channels, to set the 48 ps measuring resolution. The delay elements in the dedicated DE array are placed in very close proximity, and therefore better matching can be achieved without being constrained by limited silicon area.\u003c/p\u003e \u003cp\u003eFollowing a similar process as in most related applications, the proposed core operates as follows. The signal to excite the narrow laser beam, Start, initiates the timing measurement. A delayed version of Start, denoted as ST, triggers the DE array to produce the group of 48 ps-separated rising edges PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026gt;, which are distributed through the SH_PATH. 
SH_PATH is a dedicated structure for transmitting PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026thinsp;\u0026gt;\u0026thinsp;to the TDC channels, where the lengths of the signal lines are carefully equalized to maintain the relative phase relationship of the transmitted signals. PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026thinsp;\u0026gt;\u0026thinsp;are then sampled by the digitized input signals TDC_IN\u0026thinsp;\u0026lt;\u0026thinsp;i\u0026gt; (i\u0026thinsp;=\u0026thinsp;0 to 31) from the line sensors, producing thirteen thermometer-coded bits as the measurement result in each register-only TDC channel, which are then routed in parallel to the Data Interface circuit for further processing.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe thermometer-coded results from each TDC channel are independently converted into 4-bit binary form in the Data Interface and stored in register blocks. A 128-bit data frame containing the measurement results from all 32 channels is then formed in the next clock period. An interrupt signal, INT, is generated upon detecting any non-zero result among the TDC channels, and transmission of the formed data frame to the external receiving device is initiated. This transmission takes a total of 32 clock cycles, sending the data frame in 4-bit parallel form, together with the synchronized clock, through dedicated IOs (DO\u0026thinsp;\u0026lt;\u0026thinsp;0:3\u0026thinsp;\u0026gt;\u0026thinsp;and CLK_O).\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1. Interpolated Delay Elements (DEs)\u003c/h2\u003e \u003cp\u003eThe equally spaced rising edges for timing measurement are generated by a single group of interpolated DEs placed in an independent DE array separated from the TDC channels. In such a configuration, the interpolated DEs can occupy a larger silicon area for optimized matching. 
Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(a) shows the basic circuit topology of the interpolated DEs, which is primarily composed of a current-limited inverter and a shunt capacitor Cs at its output. The delay of a DE is the time required to discharge the shunt capacitor Cs (including the parasitic capacitance Cp) with a carefully controlled current determined by Vc. To first order, the delay of each interpolated DE is proportional to its capacitance Cs divided by the voltage-controlled discharging current. In this design, the same current-limited inverter, and therefore the same discharging current, is adopted for all interpolated DEs for matching purposes, and each DE incorporates a different output capacitance to generate a slightly different delay.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAs shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(a), Cs with 13 distinct capacitance values (integer multiples of the unit capacitance Cu, ranging from M\u0026thinsp;=\u0026thinsp;1 to 13) are employed to construct the interpolated DEs that generate PCLK0-PCLK12, respectively. As a result, if the maximum relative timing distance among the generated rising edges, which appears between PCLK0 and PCLK12, is fixed, linear interpolation can be applied, and the delay of the intermediate rising edges PCLK\u0026thinsp;\u0026lt;\u0026thinsp;1\u0026ndash;11\u0026thinsp;\u0026gt;\u0026thinsp;can be calculated from the integer multiple of Cu shunted in each interpolated DE.\u003c/p\u003e \u003cp\u003eTo ensure high-quality matching, the same capacitance array comprising 17 unit-sized capacitors is used for all interpolated DEs. Only the necessary number of unit capacitors is routed to the inverter output in each DE, while the remaining capacitors are grounded as dummies. 
Additionally, four extra unit-sized dummy capacitors are placed at the leftmost and rightmost corners to further improve matching. This arrangement can also minimize the impact of random Cp variation in the implemented elements.\u003c/p\u003e \u003cp\u003eAs the relative timing distance between PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026thinsp;\u0026gt;\u0026thinsp;and PCLK\u0026thinsp;\u0026lt;\u0026thinsp;12\u0026thinsp;\u0026gt;\u0026thinsp;is locked at 625 ps by the two nested DLLs, which will be discussed in the next section, the separation between two neighboring rising edges of PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026thinsp;\u0026gt;\u0026thinsp;is expected to be one-thirteenth of 625 ps, or approximately 48 ps. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(b) presents the simulated delay for all thirteen interpolated DEs. The minimum delay, corresponding to M\u0026thinsp;=\u0026thinsp;1 and referred to as the intrinsic delay, is approximately 418 ps. Because this intrinsic delay is susceptible to variations in PVT conditions, it should not be used for TDC measurements. However, the relative delays among the interpolated DEs are fixed, and can serve as gauges for time-to-digital conversion. In Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(b), the simulated linearity of the delay interpolation is also demonstrated. It shows that random variation of the parasitic capacitance Cp at the current-limited inverter output has minimal impact and can be effectively ignored.\u003c/p\u003e \u003cp\u003eThe worst-case delay variations of all interpolated DEs are estimated through Monte Carlo simulations. Using an extracted view that includes parasitic components, the interpolated delays for M\u0026thinsp;=\u0026thinsp;1 to 13 are each simulated 500 times. 
The jitter of the rising edges follows a normal distribution, and the standard deviation for each simulated case is plotted in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e(c). The maximum standard deviation observed is approximately 9.5 ps, at M\u0026thinsp;=\u0026thinsp;13, which is well below the 48 ps measuring resolution.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2. Nested DLLs to control the measuring resolution\u003c/h2\u003e \u003cp\u003eThe measuring resolution is dynamically locked using two nested DLLs as presented in Fig.\u0026nbsp;3 and Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003e, respectively. Referenced to a 100 MHz input clock, the first DLL produces 16 virtual clock phases separated by 625 ps, which are denoted as P\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;15\u0026gt;. The second DLL then takes two locked phases from the first DLL as references, and generates a settled voltage Vc, representing the 48 ps measuring resolution, through the locking process. Vc is replicated to the separated DE array, and PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026thinsp;\u0026gt;\u0026thinsp;with proper timing separation are readily available upon triggering by Start. The proposed scheme allows close-proximity placement of the interpolated DEs in both the separated DE array and the second DLL, and proper performance matching can therefore be guaranteed.\u003c/p\u003e \u003cp\u003eAs shown in the detailed systematic diagram in Fig.\u0026nbsp;3(a), the first DLL incorporates a phase-and-frequency detector (PFD), a charge pump, a voltage-to-current converter (V-I), and 16 carefully matched delay cells (DCs). The first DLL dynamically locks the rising edges of the first and the last DCs, and therefore the delay of each DC is set to 625 ps if all of them are perfectly matched. 
Figure\u0026nbsp;3(b) shows the simulated delay of all 16 DCs across various process corners and temperature conditions after locking is achieved. Notably, due to the different capacitive loading at its output, the last DC produces slightly more delay (~\u0026thinsp;30 ps) than the other DCs. For this reason, P\u0026thinsp;\u0026lt;\u0026thinsp;15\u0026thinsp;\u0026gt;\u0026thinsp;should not be used as a reference for the second DLL to generate the target measuring resolution.\u003c/p\u003e \u003cp\u003eThe second DLL includes two groups of interpolated DEs with different loading capacitance (M\u0026thinsp;=\u0026thinsp;1 and 13), with each group consisting of three cascaded interpolated DEs to average out any unexpected phase errors. Taking P\u0026thinsp;\u0026lt;\u0026thinsp;6\u0026thinsp;\u0026gt;\u0026thinsp;and P\u0026thinsp;\u0026lt;\u0026thinsp;3\u0026thinsp;\u0026gt;\u0026thinsp;as the corresponding inputs, the output rising edges from the two cascaded groups are aligned through the negative feedback loop. Since P\u0026thinsp;\u0026lt;\u0026thinsp;6\u0026thinsp;\u0026gt;\u0026thinsp;is delayed from P\u0026thinsp;\u0026lt;\u0026thinsp;3\u0026thinsp;\u0026gt;\u0026thinsp;by 3\u0026times;625 ps, the delay difference between interpolated DEs with M\u0026thinsp;=\u0026thinsp;1 and M\u0026thinsp;=\u0026thinsp;13 is set to 625 ps, assuming perfect matching. The control voltage Vc is gradually established on the filtering capacitance C\u003csub\u003efil\u003c/sub\u003e, which is then buffered and sent to the separated DE array for resolution control.\u003c/p\u003e \u003cp\u003eThis nested topology is very similar to that reported in [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. However, the presented design includes twice as many interpolated DEs for improved measuring resolution and extended dynamic range. 
Most importantly, Vc generated from this two-step locking process acts only on the thirteen interpolated DEs in the separated DE array, and it is not copied to any TDC channels through long metal lines susceptible to noise coupling. As a result, matched performance is expected from all included TDC channels.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3. Independent DE array and register-only TDC channels\u003c/h2\u003e \u003cp\u003eAs a functional block completely independent from the TDC channels, the separated DE array comprises only the thirteen interpolated DEs, all biased by the same control voltage Vc. When Start initiates the excitation of the laser beam, its delayed version ST simultaneously triggers all the interpolated DEs constituting the separated DE array. As a result, the 13 critical rising edges, PCLK0-PCLK12, are produced upon triggering. The latency between Start and the actual excitation of the laser beam can be compensated by adjusting the occurrence of ST, which can be tuned with a resolution of approximately 80 ps over an overall adjustment range of around 80 ns. This tuning range is sufficient in most applicable scenarios.\u003c/p\u003e \u003cp\u003eGenerating the critical edges entirely from the separated array greatly simplifies the TDC channel. As also illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003e, the TDC channel in this design can consist solely of 13 D-type registers. With all of their clock terminals connected to the channel input TDC_IN[i], the D terminals of the thirteen registers within the same TDC channel are routed respectively to PCLKO_0-PCLKO_12, replicas of PCLK_0-PCLK_12 after being distributed through the shielded path (SH_PATH). 
As a result, the rising edges arriving before TDC_IN[i] result in binary '1's at the register outputs, whereas those arriving after result in '0's. Consequently, 13 bits of thermometer-coded data, denoted as Di\u0026thinsp;\u0026lt;\u0026thinsp;0:12\u0026gt;, are generated as the result of a single timing measurement.\u003c/p\u003e \u003cp\u003eThe register-only TDC naturally achieves both a high degree of integration and consistency among channels. However, to make such a configuration successful, PCLK0-PCLK12 must be delivered with high signal quality to all 32 TDC channels. In the presented design, the signal paths for those critical rising edges from the separated DE array to each TDC channel are carefully aligned in a shielded path (SH-PATH). Their lengths are made equal, and they are surrounded by grounded conductive layers, including metals and heavily-implanted polysilicon, for shielding. This shielding structure provides an identical environment for the propagating signals, thus reducing unwanted coupling and signal interference. In addition, multiple buffers are inserted along the paths to maintain a fast transition on the rising edge when the distributed signals are output to the TDC channels. The proposed configuration is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003e, which illustrates a section of the SH-PATH and eight TDC channels.\u003c/p\u003e \u003c/div\u003e"},{"header":"3. EXPANDABLE LAYOUT AND MEASUREMENT RESULTS","content":"\u003cp\u003eThe proposed prototype core with 32 register-only TDC channels is designed in CMOS 180 nm technology. The height of each TDC channel is strictly set to 30 \u0026micro;m to match the targeted SPAD line sensors, and therefore the total height of all 32 TDC channels is 960 \u0026micro;m. 
Including the nested DLLs, the DE array, the SH-PATH, and the power management unit (PMU), the total silicon area of the core design with 32 TDC channels is 960 \u0026micro;m \u0026times; 1152 \u0026micro;m.\u003c/p\u003e \u003cp\u003eThe proposed core can be easily expanded to include more TDC channels if needed. Figure\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003e demonstrates this expandability: a 64-channel design is formed by placing two proposed cores side-by-side, with each core separately consisting of 32 TDC channels, the separated DE array with 13 interpolated DEs, the nested DLLs for resolution control and the data interface. This combination leaves no physical gap in between, and only a few IOs required by the power supplies and the 100 MHz reference clock are shared.\u003c/p\u003e \u003cp\u003eTo validate the design methodology, in this proof-of-concept the inputs to three neighboring channels (TDC_IN\u0026thinsp;\u0026lt;\u0026thinsp;3k-3, 3k-2, 3k-1\u0026gt;, with k an integer) share a single IO for convenient packaging, and an additional 2-bit digital word selects which of the three TDC channels receives the test signal from the IO. As a result, complete validation of all 64 TDC channels needs three separate testing phases, each covering one third of the total number of channels. For each testing phase, a high-precision arbitrary waveform generator with two correlated outputs and 350 MHz physical bandwidth is used to provide the required excitation pair, Start and TDC_IN\u0026thinsp;\u0026lt;\u0026thinsp;i\u0026gt;. 
The phase difference between the rising edges of the two excitation signals can be adjusted in steps of 0.01 degree, equivalent to 30 ps in the time domain (ΔT).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003e presents typical statistical distributions for 11 TDC channels, measured with Start and TDC_IN\u0026thinsp;\u0026lt;\u0026thinsp;i\u0026gt; separated by 400 ps and the 2-bit control word set to 01. Each histogram in the figure comprises a total of 4000 samples. The measuring accuracy of the presented TDC channels, indicated by the corresponding standard deviation σ, is also listed in the histograms.\u003c/p\u003e \u003cp\u003eBy using a series of Start and TDC_IN\u0026thinsp;\u0026lt;\u0026thinsp;i\u0026gt; pairs with progressively increased time intervals, the dynamic range of all TDC channels can be evaluated. Figure\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003e illustrates the measured dynamic ranges for all 64 TDC channels, and each presented curve represents the performance of a single channel comprising 25 incremental data points, covering 750 ps in total. Consistency across TDC channels is clearly demonstrated, as all groups of critical rising edges involved in the measurement are generated from the same separated DE array. The timing discrepancy, or skew, among TDC channels in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003e is primarily introduced by the unequal path lengths of the excitation input routed to each TDC channel on the testing platform. 
The maximum discrepancy among all 64 measured channels is less than 3\u0026thinsp;\u0026times;\u0026thinsp;ΔT (about 90 ps), according to the results presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003e(a).\u003c/p\u003e \u003cp\u003eFurther improvement of the TDC linearity can be achieved by an additional one-time, simplified calibration process. As the non-linearity of all TDC channels originates from the same shared PCLK\u0026thinsp;\u0026lt;\u0026thinsp;0\u0026ndash;12\u0026gt;, calibrating the thirteen interpolated DEs alone is sufficient. The one-time calibration process uses the statistical results presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003e(a) as prior knowledge to calculate the difference of the measured means, expressed in whole bits, against their ideal linear counterparts.\u003c/p\u003e \u003cp\u003eBy subtracting the calculated difference, the recorded result from each single measurement can be corrected, thereby significantly enhancing the linearity performance. Figure\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003e(b) and (c) present the corrected results of all 64 TDC channels and their corresponding residual errors after the one-time calibration process. The timing skew among the TDC channels is also removed by post-processing. Compared with the ideal linear cases, all TDC channels achieve better-than-0.5 LSB linearity across most of the covered range, with slightly increased residual errors at the extremities of the dynamic range.\u003c/p\u003e \u003cp\u003eWith a 1.8 V supply voltage, the presented timing chip draws only 17 mA when all TDC channels are active with 1 MHz periodic input pairs. 
Normalized over the 64 TDC channels (1.8 V \u0026times; 17 mA / 64), the power consumption is only 0.48 mW/channel.\u003c/p\u003e \u003cp\u003eKey performance metrics of the proposed IC are compared with some state-of-the-art designs for similar applications in Table I. The TDC array presented in this work combines ps-level resolution with low power consumption per TDC channel. Additionally, the TDCs presented in this work consist only of D-type registers, making their performance less susceptible to non-idealities such as PVT variations or device mismatch. This design scheme is also beneficial for matching the width of each TDC channel to its corresponding sensor line, especially for advanced technologies with much smaller sensor sizes.\u003c/p\u003e \u003cp\u003eAs an independent data interface and IOs are dedicated to each core block, the maximum measurement speed is not affected by adding more core blocks and TDC channels. With a 100 MHz reference clock, the data output rate of each core block is 400 Mbps, and therefore the maximum measurement speed remains 3.125 Msps.\u003c/p\u003e"},{"header":"4. CONCLUSION","content":"\u003cp\u003eA novel 48 ps resolution timing chip constructed from two expandable core blocks with 32 TDC channels each is presented in this paper. In each core block, the thirteen rising edges that are critical for timing measurement are produced by interpolated DEs, placed separately from all TDC channels. As a result, the TDC channels can be constructed purely from D-type registers, so that a high degree of integration and performance consistency among channels are naturally achieved. Referenced to a 100 MHz clock, the 48 ps measuring resolution is stabilized across various PVT conditions by two nested DLLs. The total area of the core circuit, including 32 TDC channels, two DLLs and the PMU, is 960 \u0026micro;m \u0026times; 1152 \u0026micro;m. A prototype timing chip design with 64 TDC channels is constructed and fabricated using CMOS 180 nm technology. 
The prototype design is formed by simply placing two identical core blocks side-by-side, with shared power supplies and reference clock. The performance uniformity of all TDC channels constituting the presented prototype is validated through experiments. Further improvement of linearity to better than 0.5 LSB can be achieved by a simplified one-time calibration process. Operating from a 1.8 V supply voltage, the power consumption is only 0.48 mW/channel.\u003c/p\u003e \u003cp\u003eTable I Performance comparison with state-of-the-art designs\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e[\u003cspan citationid=\"CR13\" 
class=\"CitationRef\"\u003e13\u003c/span\u003e]\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eThis work\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTechnology\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e350 nm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e130 nm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e350 nm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e110 nm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e180 nm\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eChannel width\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e41.6 \u0026micro;m\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e23.78 \u0026micro;m\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e35 \u0026micro;m\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e32.9 \u0026micro;m\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e30 \u0026micro;m\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eResolution\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e19.5 ps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e51.2 ps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e50\u0026ndash;100 ps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e25.6\u0026ndash;65 
ps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e48 ps\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDynamic range\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e640 ns\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e1.64\u0026ndash;209.6 ns\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e350\u0026ndash;700 ps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e3.2\u0026ndash;8.2 ns\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e650 ps\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePower consumption\u003c/p\u003e \u003cp\u003e/ Channel\u003c/p\u003e \u003cp\u003e@ excitation freq.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1.17 mW\u003c/p\u003e \u003cp\u003e@ 0.1 MHz\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e3.83 mW\u003c/p\u003e \u003cp\u003e@ 20 MHz\u003c/p\u003e \u003cp\u003e(TCSPC mode)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.109 mW\u003c/p\u003e \u003cp\u003e@ 280 kHz\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e0.48 mW\u003c/p\u003e \u003cp\u003e@ 1 MHz\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMax. 
Measurement freq.\u0026nbsp;*\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.17 Msps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.23 Msps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.4 Msps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2.34 Msps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e3.125 Msps\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eResolution locking mechanism\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNutt interpolation\u003c/p\u003e \u003cp\u003ewith time doubler\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eGRO with DAC-tuned supply\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eNested DLLs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eNested DLLs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNested DLLs\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTDC Channel\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCyclic convertor with time doublers\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFlash,\u003c/p\u003e \u003cp\u003eMatched\u003c/p\u003e \u003cp\u003edelay line\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFlash,\u003c/p\u003e \u003cp\u003eMatched\u003c/p\u003e \u003cp\u003edelay line\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eFlash,\u003c/p\u003e \u003cp\u003eMatched\u003c/p\u003e \u003cp\u003edelay 
line\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eFlash,\u003c/p\u003e \u003cp\u003eRegister-only\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e*Calculated as readout frames per second.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eH. Yu, Y. Li, and J. Yang performed the main tasks of the research, prepared figures, analyzed data, and co-wrote the manuscript. B. Ge, J. Shen and L. Shi provided help with data analysis and manuscript writing. S. Ma guided the research.\u003c/p\u003e\u003ch2\u003eAcknowledgments\u003c/h2\u003e \u003cp\u003eThis work was partially supported by the National Natural Science Foundation of China under Grant Nos. 62250002, 62201194, 62571049, 61471245, 62571049 and U120125. This work is also supported by Grant X210251TH210 from Ji Hua Laboratory.\u003c/p\u003e \u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eSaar, B. G., Freudiger, C. W., Reichman, J., et al. (2010). Video-rate molecular imaging in vivo with stimulated Raman scattering. \u003cem\u003eScience\u003c/em\u003e, \u003cem\u003e330\u003c/em\u003e(6009), 1368\u0026ndash;1370.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBumbrah, G. S., \u0026amp; Sharma, R. M. (2016). Raman spectroscopy \u0026ndash; Basic principle, instrumentation and selected applications for the characterization of drugs of abuse, \u003cem\u003eEgyptian Journal of Forensic Sciences\u003c/em\u003e, vol. 6, pp. 209\u0026ndash;215, June.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarter, J. C., Brewer, W. E., \u0026amp; Angel, S. M. (Dec. 2000). Raman spectroscopy for the in-situ identification of cocaine and selected adulterants. 
\u003cem\u003eApplied Spectroscopy\u003c/em\u003e, \u003cem\u003e54\u003c/em\u003e, 1876\u0026ndash;1881.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRaman, C. (1928). New Type of Secondary Radiation. \u003cem\u003eNature\u003c/em\u003e, \u003cem\u003e121\u003c/em\u003e, 501\u0026ndash;502.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBlacksberg, J., Alerstam, E., Cochrane, C. J., Maruyama, Y., \u0026amp; Farmer, J. D. (Jan. 2020). Miniature high-speed, low-pulse-energy picosecond Raman spectrometer for identification of minerals and organics in planetary science. \u003cem\u003eApplied Optics\u003c/em\u003e, \u003cem\u003e59\u003c/em\u003e(2), 433.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eK\u0026ouml;gler, M., \u0026amp; Heilala, B. (2021). Time-gated Raman spectroscopy \u0026ndash; a review. \u003cem\u003eMeasurement Science \u0026amp; Technology\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e, 1\u0026ndash;17.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChiuri, A., \u0026amp; Angelini, F. (2021). Fast Gating for Raman Spectroscopy, Sensors, vol. 21, pp. 1\u0026ndash;41, Apr.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMatousek, P., Towrie, M., Ma, C., et al. (2001). Fluorescence suppression in resonance Raman spectroscopy using a high-performance picosecond Kerr gate. \u003cem\u003eJournal of Raman Spectroscopy\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(12), 983\u0026ndash;988.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaruyama, Y., Blacksberg, J., \u0026amp; Charbon, E. (2013). A 1024 \u0026times; 8 700 ps time-gated SPAD line sensor for laser Raman spectroscopy and LIBS in space and rover-based planetary exploration. In 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers, pp. 110\u0026ndash;111.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTalala, T., Parkkinen, E., \u0026amp; Nissinen, I. 
(2023). CMOS SPAD Line Sensor with Fine-Tunable Parallel Connected Time-to-Digital Converters for Raman Spectroscopy, IEEE J. Solid-State Circuits, vol. 58, pp. 1350\u0026ndash;1361, May.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eErdogan, A. T., Walker, R., et al. (2019). A CMOS SPAD Line Sensor with Per-Pixel Histogramming TDC for Time-Resolved Multispectral Imaging, IEEE J. Solid-State Circuits, vol. 54, pp. 1705\u0026ndash;1719, June.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNissinen, I., Nissinen, J., Ker\u0026auml;nen, P., Stoppa, D., \u0026amp; Kostamovaara, J. (2018). A 16\u0026times;256 SPAD Line Detector With a 50-ps, 3-bit, 256-Channel Time-to-Digital Converter for Raman Spectroscopy. \u003cem\u003eIEEE Sensors Journal\u003c/em\u003e, \u003cem\u003e18\u003c/em\u003e(9), 3789\u0026ndash;3798.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTalala, T., Parkkinen, E., \u0026amp; Nissinen, I. (May 2023). CMOS SPAD Line Sensor With Fine-Tunable Parallel Connected Time-to-Digital Converters for Raman Spectroscopy. \u003cem\u003eIEEE Journal of Solid-State Circuits\u003c/em\u003e, \u003cem\u003e58\u003c/em\u003e(5), 1350\u0026ndash;1361.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePancheri, L., \u0026amp; Stoppa, D. (2009). A SPAD-based pixel linear array for high-speed time-gated fluorescence lifetime imaging, 2009 Proceedings of ESSCIRC, Athens, Greece, pp. 428\u0026ndash;431.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaruyama, Y., Blacksberg, J., \u0026amp; Charbon, E. (Jan. 2014). A 1024 \u0026times; 8, 700-ps Time-Gated SPAD Line Sensor for Planetary Surface Exploration with Laser Raman Spectroscopy and LIBS. \u003cem\u003eIEEE Journal of Solid-State Circuits\u003c/em\u003e, \u003cem\u003e49\u003c/em\u003e(1), 179\u0026ndash;189.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi, Y., Yu, H., et al. (Aug. 2018). 
A CMOS Time-to-Digital Converter for real-time optical time-of-flight sensing system. \u003cem\u003eIEEE Communications Magazine\u003c/em\u003e, \u003cem\u003e56\u003c/em\u003e, 113\u0026ndash;119.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKe, Q., Yu, H. (2022). Sept., A 32-Channel Time-to-Digital Converter with 20-ps Resolution for ToF Applications, IEEE International Conference on Circuits and Systems, Chengdu.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMantyniemi, A., Rahkonen, T., \u0026amp; Kostamovaara, J. (2009). A CMOS time-to-digital converter (TDC) based on a cyclic time domain successive approximation interpolation method, \u003cem\u003eIEEE J. Solid-State Circuits\u003c/em\u003e, vol. 44, no. 11, pp. 3067\u0026ndash;3078, Nov.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen, C. C., Chen, P., Hwang, C. S., \u0026amp; Chang, W. (2005). A precise cyclic CMOS time-to-digital converter with low thermal sensitivity, IEEE Trans. Nucl. Sci., vol. 52, no. 4, pp. 834\u0026ndash;838, Aug.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTisa, S., Lotito, A., Giudice, A., \u0026amp; Zappa, F. (2003). Monolithic time-to-digital converter with 20 ps resolution, in Proc. European Solid-State Circuits Conf., ESSCIRC, Sep. pp. 465\u0026ndash;468.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"analog-integrated-circuits-and-signal-processing","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"alog","sideBox":"Learn more about [Analog Integrated Circuits and Signal Processing](http://link.springer.com/journal/10470)","snPcode":"10470","submissionUrl":"https://submission.nature.com/new-submission/10470/3","title":"Analog Integrated Circuits and Signal Processing","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"Gated Raman spectroscopy, SPAD, Time-to-digital converter (TDC), delay interpolation","lastPublishedDoi":"10.21203/rs.3.rs-8697737/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8697737/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eRaman spectroscopy is a nondestructive, label-free optical analysis technique, and has seen various applications. Gated-Raman spectroscopy provides an effective way to suppress the influence of fluorescence photons. In this work, a novel 48 ps resolution timing chip constructed from two expandable 32-channel Time-to-digital convertors (TDC) is presented for gated-Raman spectroscopy. Each proposed TDC core consists of 32 independent channels, with a unified height of 30 \u0026micro;m for each channel. The design can be conveniently integrated with concurrent single photon avalanche diode (SPAD) technology. The TDC channels are constructed entirely from standard digital gates, primarily D-type registers and therefore uniformity among TDC channels is naturally achieved. Thirteen critical rising edges for time measurement are generated from delay interpolation, which is entirely separated from the TDC channels. The critical edges are distributed to all TDC channels through isolated shielded paths to minimize noise coupling. 
The separation among these critical rising edges, and therefore the measuring resolution, is fixed as 48 ps by two feedback locking loops referenced to a single 100 MHz clock. The register-only TDC makes the proposed design flexible for expansion, and in the presented prototype timing chip, two identical core blocks are simply placed side-by-side to form a timing chip with 64 TDC channels. The prototype was fabricated using 180 nm CMOS technology, with its functionality fully validated. When all included TDC channels are excited with 1 MHz periodic inputs, the power consumption is only 0.48mW/channel with a 1.8 V power supply.\u003c/p\u003e","manuscriptTitle":"A 48 ps Resolution, Timing Chip Constructed from Two Expandable 32-Channel Register-only TDCs","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-02-04 08:57:15","doi":"10.21203/rs.3.rs-8697737/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"86342091460837591733621430760554069049","date":"2026-04-12T05:41:35+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-03-22T07:42:10+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"172819868445389129908256585630552183505","date":"2026-02-18T06:56:50+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"198321132178446100813494061137125494052","date":"2026-02-04T16:18:03+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-02-02T15:43:31+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-01-29T07:48:35+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-01-29T07:47:38+00:00","index":"","fulltext":""},{"type":"submitted","content":"Analog Integrated Circuits and Signal Processing","date":"2026-01-26T07:38:16+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"analog-integrated-circuits-and-signal-processing","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"alog","sideBox":"Learn more about [Analog Integrated Circuits and Signal Processing](http://link.springer.com/journal/10470)","snPcode":"10470","submissionUrl":"https://submission.nature.com/new-submission/10470/3","title":"Analog Integrated Circuits and Signal Processing","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"35b0d0fb-3d17-4f51-b167-314a7ee37ada","owner":[],"postedDate":"February 4th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-02-04T08:57:16+00:00","versionOfRecord":[],"versionCreatedAt":"2026-02-04 08:57:15","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8697737","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8697737","identity":"rs-8697737","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
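The thermometer-coded readout described in Section 2.3 and the one-time calibration described in Section 3 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the choice to count '1's instead of locating the 1-to-0 transition, and the `offsets_lsb` lookup table are hypothetical. The 48 ps LSB and the 13 critical edges (PCLK0 to PCLK12) are taken from the text; any fixed offset between Start and the first edge is ignored.

```python
from typing import Sequence

LSB_PS = 48.0    # nominal measuring resolution stated in the paper
NUM_EDGES = 13   # critical rising edges PCLK0..PCLK12, one register each


def thermometer_to_code(bits: Sequence[int]) -> int:
    """Convert the 13 register outputs Di<0:12> into an integer code.

    A register holds 1 when its PCLK edge arrived before TDC_IN, so the
    number of ones is the number of edges that had already passed.
    Counting ones (rather than finding the transition) is an assumption
    of this sketch.
    """
    if len(bits) != NUM_EDGES:
        raise ValueError(f"expected {NUM_EDGES} bits, got {len(bits)}")
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must be 0 or 1")
    return sum(bits)


def code_to_time_ps(code: float, lsb_ps: float = LSB_PS) -> float:
    """Scale a (possibly calibrated) code to picoseconds."""
    return code * lsb_ps


def calibrate(code: int, offsets_lsb: Sequence[float]) -> float:
    """Subtract a per-code offset obtained from a one-time characterization.

    offsets_lsb is a hypothetical table with one entry per possible code
    (0..13): the measured mean minus the ideal linear value, in LSBs.
    """
    if len(offsets_lsb) != NUM_EDGES + 1:
        raise ValueError(f"expected {NUM_EDGES + 1} offsets")
    return code - offsets_lsb[code]


if __name__ == "__main__":
    sample = [1] * 5 + [0] * 8           # five edges arrived before TDC_IN
    raw = thermometer_to_code(sample)
    corrected = calibrate(raw, [0.0] * (NUM_EDGES + 1))
    print(raw, code_to_time_ps(corrected))
```

Because every channel samples the same shared edges, a single offset table per core would serve all of its channels in this sketch, which mirrors the paper's point that calibrating the thirteen interpolated DEs alone is sufficient.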