SUANPAN: Scalable Photonic Linear Vector Machine | Research Square

Physical Sciences - Article

Xue Feng, Ziyue Yang, Chen Li, Yuqia Ran, Yongzhuo Li, Kaiyu Cui, and 9 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5401152/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Photonic linear operation is a promising approach to handle the extensive vector multiplications in artificial intelligence (AI) techniques due to the natural bosonic parallelism and high-speed information transmission of photonics. However, there is still no universal scalable photonic computing architecture that can be readily merged with existing electronic digital computing system.
Even though it is believed that maximizing the interaction of the light beams is necessary to fully utilize the parallelism, and tremendous efforts have been made in past decades, the achieved dimensionality of vector-matrix multiplication is very limited due to the difficulty of scaling up a tightly interconnected or highly coupled optical system. Here, we propose a programmable and reconfigurable photonic linear vector machine that performs only the inner product of two vectors. It is formed by a series of independent basic computing units, each of which contains only one emitter-detector pair. The elemental values of the processed vectors are prepared by time-space domain encoding. Specifically, one vector is encoded by the output duration of a continuous light-emitter, while the other is encoded as the position of the emitter-detector pair. The result of the inner product is obtained as the sum of the photocurrents of all photodetectors. Since there is no interaction among light beams inside, extreme scalability could be achieved by simply replicating the independent basic computing unit, without requiring large-scale analog-to-digital converter or digital-to-analog converter arrays. Our time-space domain encoding architecture is inspired by the traditional Chinese Suanpan, or abacus, and is thus denoted as photonic SUANPAN. As a proof of principle, the SUANPAN architecture is implemented with an 8×8 vertical-cavity surface-emitting laser (VCSEL) array and an 8×8 MoTe₂ two-dimensional material photodetector array. The experimental computing fidelities for randomly generated vector inner products are all over 98% for 1-bit, 2-bit, 4-bit and 8-bit quantization and over 95% for 8~80 vector dimensionalities with 4-bit quantization. Two typical AI tasks, the Ising machine for non-deterministic polynomial-time (NP)-hard optimization problems and the artificial neural network for visual perception, are performed to demonstrate the ability of the SUANPAN architecture.
For the Ising problem, 1024-dimensional problems are successfully solved, which is the highest dimensionality reported for an optical Ising machine with a heuristic algorithm. For the artificial neural network, a competitive classification accuracy of 84~88% is achieved for the MNIST (Modified National Institute of Standards and Technology) handwritten digit dataset. We believe that our proposed photonic SUANPAN is capable of serving as a fundamental linear vector machine that can be readily merged with existing electronic digital computing systems and has the potential to enhance the computing power for various future AI applications.

Physical sciences/Optics and photonics/Applied optics/Optoelectronic devices and components
Physical sciences/Optics and photonics/Other photonics/Micro-optics
Physical sciences/Nanoscience and technology/Nanoscale materials/Two-dimensional materials
Physical sciences/Physics/Electronics, photonics and device physics/Photonic devices

Introduction

Artificial intelligence (AI) is currently an active topic in scientific research, commercial applications and daily life [1,2]. For AI techniques, the linear operations of high-dimensional vectors are fundamental and dominant in both artificial neural networks (ANN) [3-5] and optimization problem solvers, such as the Ising machine [6-8]. As the complexity of problems increases, the dimensionality of the processed vector grows rapidly, resulting in a huge computational burden. It is known that vector operations can be readily accelerated by photons due to the natural parallelism of bosons [9]. In the past decades, various photonic computing architectures have been demonstrated to perform vector-matrix operations in the optical domain, e.g. the Stanford structure [10-12], the Reck scheme [13-16], the deep diffraction architecture [17-20], the micro-ring resonator (MRR) array [21-26], etc.
All these approaches offer only limited scalability of the linear vector operation due to two fundamental issues. First, as the optical matrix is adopted to perform a linear transformation of the input vector encoded on the light beam, the basic units in the computing architecture, e.g. liquid crystal cells, beam splitters, meta-atoms, etc., are tightly interconnected or highly coupled with each other due to the unavoidable light beam interaction. Thus, high-dimensional optical vector-matrix operations cannot be achieved by simply duplicating these basic units. The second issue is how to fuse linear optics with the existing electronic digital computing systems. Due to significant obstacles in implementing all-optical computers [27], an optoelectronic hybrid computing architecture may be the most promising solution [28], performing the linear vector operations in the optical domain and the nonlinear operations in the electronic domain. This requires a large number of analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) for exchanging analog data in the optical domain and digital data in the electronic domain. In fact, such conversion between digital and analog data is the main factor that prevents significant scaling of the computing dimensionality, since ADCs and DACs cause large power and area overhead [29]. In addition, the slow speed of ADCs and DACs remains the main obstacle to high-speed operation with linear optical computing. To address this issue, an all-analog chip combining electronic and light computing (ACCEL) [30] was presented, but it could only serve as a specific processor for the vision task. Therefore, a universal optoelectronic hybrid computing architecture, which does not rely on light beam interaction and allows high-speed conversion between analog and digital data, is urgently needed to address the two fundamental issues, so that flexible scalability can be achieved to alleviate the heavy computing demands of future AI technology.
Inspired by the traditional Chinese Suanpan, or abacus, in which mathematical operations are carried out by moving beads, we propose and demonstrate a photonic SUANPAN architecture to pursue a new analog-digital hybrid computing paradigm. The basic computing unit in SUANPAN contains one emitter-detector pair and achieves Bit Encoding and Analog Detecting, denoted as BEAD. Our proposed SUANPAN is formed by a BEAD array, and thus can easily scale up by duplicating the independent BEAD. Instead of the optical matrix operations in the currently popular optical computing architectures, SUANPAN is designed only for the optical inner product of two vectors. The elemental values of one vector are encoded on the output intensity of the light-emitter by controlling the duty ratio of the driving current, while the elemental values of the other vector are encoded by the position of M BEADs to achieve M-bit quantization. The photocurrent of the photodetector (PD) is proportional to the product of the light intensity and the photoresponsivity, and the final result of the inner product is obtained by the summation of all the photocurrents. No DAC and only one ADC are required in the SUANPAN architecture, so it can be readily merged with existing electronic digital computing systems. As a proof of principle, the SUANPAN architecture is implemented using an 8×8 vertical-cavity surface-emitting laser (VCSEL) array and an 8×8 MoTe₂ two-dimensional (2D) material PD array. The experimental results of vector inner product operations show that the calculation fidelity can be as high as >98% for various bit precisions (1-bit, 2-bit, 4-bit and 8-bit), and >95% for various vector dimensionalities (at 4-bit precision). Furthermore, this SUANPAN with 64 BEADs has been successfully reconfigured to perform two typical AI tasks, the Ising machine and the ANN.
A 1024-dimensional randomly generated Ising problem is successfully solved, which is the highest dimensionality reported for an optical Ising machine with a heuristic algorithm. Meanwhile, a competitive classification accuracy of 84~88% is achieved for the MNIST handwritten digit dataset by properly setting SUANPAN for an ANN. Since there is no interaction among the propagating light beams of all BEADs and only the output currents of all BEADs are connected, the SUANPAN architecture could be extremely scalable by increasing the number of BEADs with no additional loss or error, as well as flexibly reconfigurable and programmable by properly arranging the BEADs for different computational tasks. We believe that our proposed photonic SUANPAN is capable of serving as a fundamental linear vector machine and has the potential to enhance the computing power for various future AI applications.

SUANPAN architecture

The proposed SUANPAN architecture consists of a light-emitter array and a PD array as well as some necessary electronic hardware, as schematically shown in Fig.1a. Here, a VCSEL array and a MoTe₂ 2D material PD array are fabricated to demonstrate the SUANPAN architecture, as shown in Fig.1b and Fig.1c, respectively. The schematic diagram and microscope photographs of a single VCSEL and MoTe₂ PD are also shown in the insets of Fig.1a. The SUANPAN architecture is designed for the vector inner product, also known as the multiply-accumulate (MAC) operation. First, for the multiply operation, each PD is well aligned with a corresponding light-emitter to form an emitter-detector pair, so the photocurrent of the PD is proportional to the product of the light intensity and the photoresponsivity due to the linear optical response [31]. Then, for the add operation, the photocurrent outputs of all PDs are connected together, so the output current is the sum over all PDs according to Kirchhoff’s current law.
In this way, the multiply-accumulate operation is naturally performed through the emission and detection in the SUANPAN architecture. One more important issue is how to encode the vectors onto the light-emitter array and the PD array. In the traditional Chinese Suanpan, or abacus, numbers are represented by the different positions of beads, and mathematical operations are carried out by moving these beads. Inspired by that, the basic computing unit in SUANPAN contains one emitter-detector pair and achieves Bit Encoding and Analog Detecting, named BEAD. The vector encoding and operation rely on controlling the on-off state of each BEAD.

For deeper insight, a simple example of one BEAD is considered. As shown in Fig.2a, the multiplier a is encoded in the intensity of the light-emitter by controlling the duty ratio of the driving current, which is done by a digital counter according to the clock cycles without a DAC. The bit precision depends on the number of time slots into which the period is split. With a constant period, the more time slots there are, the more quantization bits can be achieved for a. In this work, the period of light emission is divided into 100 slots, so that a can take the values 0, 1, …, 100. The multiplier b is encoded on the on-off state of the BEAD, by turning the light-emitter on (green arrow) or off (gray arrow). Hence, there are two states to encode b, b=0 or b=1, known as 1-bit quantization. The photocurrent is proportional to a×0 or a×1, which is the result of a×b for b=0,1.

For higher bit quantization of b, more BEADs are employed to form a set. Considering 2-bit quantization of b, there are two BEADs in one set, as shown in the white box of Fig.2b. The method of encoding a is the same as described above, while encoding b employs two BEADs simultaneously and thus forms four combinations of on-off states, which correspond to the four values of 2-bit quantization.
From the perspective of binary encoding, these two BEADs are equivalent to the two bits in the binary representation of b, while the position of the PD is considered as the bit position. For example, if b=2, then the binary representation is b=(10)₂, which means the first BEAD is in the off-state and the second one is in the on-state. As mentioned above, this is quite similar to the traditional Chinese Suanpan, or abacus, which represents numbers according to the position of beads and carries out mathematical operations by moving these beads up and down. Different bits carry different weights in the binary representation, and this can be achieved by setting the photoresponsivities of these two PDs to 2^0 and 2^1, respectively. To properly manipulate the photoresponsivity, we design and fabricate 2D material photoconductive detectors (more details are discussed in the Results section). By combining the photocurrents of these two PDs, the output result is a×b, as shown in Fig.2b. In the same way, M-bit quantization of b can be achieved with a set of M BEADs, so that SUANPAN can encode b in the range from 0 to 2^M − 1 and achieve the multiplication of a and b, as shown in Fig.2c. As mentioned above, a is digitally encoded to the duty ratio of light emission and b is digitally encoded to the on-off states of the BEADs in one set, while the output result is the analog photocurrent from the PDs. Thus, the basic unit, consisting of an emitter-detector pair, actually operates by Bit Encoding and Analog Detecting, which is abbreviated as BEAD. Since the numbers a and b are encoded in the time and space domains respectively, the SUANPAN architecture can theoretically perform multiplication at any desired bit precision. Further, the vector inner product can be achieved by employing N sets of BEADs and connecting their output photocurrents together (Fig.2d).
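The time-space encoding described above can be illustrated with a small numerical model. This is only a sketch under stated assumptions: the photocurrent is treated as an ideal integer in arbitrary units, the emitter is assumed to be on for the first `a` of 100 time slots, and the function names `bead_set_photocurrent` and `inner_product` are hypothetical, not taken from the paper.

```python
SLOTS = 100  # time slots per emission period, as stated in the text


def bead_set_photocurrent(a, b, m_bits, slots=SLOTS):
    """Integrated photocurrent (arbitrary units) of one set of M BEADs for a*b.

    a is encoded in time (number of slots the emitter is on),
    b is encoded in space (on-off state of each of the M BEADs).
    """
    if not 0 <= a <= slots:
        raise ValueError("a must fit in the available time slots")
    if not 0 <= b < 2 ** m_bits:
        raise ValueError("b must fit in m_bits")
    total = 0
    for k in range(m_bits):
        bead_on = (b >> k) & 1      # bit k of b selects whether BEAD k is used
        responsivity = 2 ** k       # assumed ideal relative photoresponsivity of PD k
        for t in range(slots):      # digital counter: emit during the first a slots
            if bead_on and t < a:
                total += responsivity
    return total


def inner_product(a_vec, b_vec, m_bits):
    """N sets of BEADs with their output currents summed on one wire."""
    return sum(bead_set_photocurrent(a, b, m_bits) for a, b in zip(a_vec, b_vec))


print(bead_set_photocurrent(37, 2, 2))              # 37 * 2
print(inner_product([10, 20, 30], [3, 0, 5], 3))    # 10*3 + 20*0 + 30*5
```

In this idealized model the summed current equals the exact integer inner product; a real device would add noise and non-uniformity, which the fidelity measurements in the Results section quantify.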
It should be mentioned that the SUANPAN architecture can also handle negative numbers by applying a reversed bias voltage to the corresponding PD. Considering both positive and negative numbers, 2×M BEADs should be utilized for each set; the details are provided in Extended Data Fig.1. Through this encoding method, an M-bit quantized, N-dimensional signed vector inner product can be performed by SUANPAN with N×2×M BEADs at a time, while each element of the input vectors can be flexibly programmed. Furthermore, the number of utilized sets (N) as well as the number of BEADs in each set (M) can be reconfigured according to specific calculation requirements in terms of dimensionality and bit precision. At the output, the photocurrents of all PDs are connected together, so that only one ADC is required to transform the total analog current into a digital signal. Also, since only 1-bit information is encoded on a single BEAD, a properly set but fixed bias voltage is applied to each BEAD. Thus, there is no requirement for an ADC array or a DAC array in SUANPAN, although such arrays are actually the main limiting factor for hybrid computing with both digital and analog operations.

Last but not least, since there is no interaction among the propagating light beams of all BEADs, the SUANPAN architecture can be extremely scalable by increasing the number of independent BEADs with no additional loss or error. On the one hand, the number of utilized BEADs can be easily increased by integrating more components on one pair of light-emitter array and PD array chips. On the other hand, distributed computing can also be achieved by simply connecting multiple pairs of chips together to scale up the computing power further. Therefore, SUANPAN is a programmable, reconfigurable and scalable computing architecture, which is capable of serving as a general vector inner product accelerator for existing computing systems.
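The BEAD budget N×2×M and the signed encoding can be sketched as follows. The exact signed scheme is in Extended Data Fig.1, which is not reproduced here, so the split into a forward-biased bank and a reverse-biased bank of M BEADs, and the rule that the sign of the product selects the bank, are assumptions made for illustration. All function names are hypothetical.

```python
def beads_required(n_dim, m_bits):
    """BEADs needed for an N-dimensional, M-bit signed inner product (N x 2 x M)."""
    return n_dim * 2 * m_bits


def max_dimension(total_beads, m_bits):
    """Largest dimensionality computable at one time with a given BEAD count."""
    return total_beads // (2 * m_bits)


def signed_set_current(a, b, m_bits):
    """Output current of one signed set (assumed scheme, arbitrary units).

    Assumption: M BEADs under forward bias contribute positive current and
    M BEADs under reversed bias contribute negative current; the sign of a*b
    decides which bank is switched on.
    """
    mag_a, mag_b = abs(a), abs(b)
    if mag_b >= 2 ** m_bits:
        raise ValueError("|b| must fit in m_bits")
    positive = (a >= 0) == (b >= 0)
    current = 0
    for k in range(m_bits):
        bit = (mag_b >> k) & 1
        contribution = mag_a * (2 ** k) * bit
        current += contribution if positive else -contribution
    return current


def signed_inner_product(a_vec, b_vec, m_bits):
    """All set currents summed on the shared output wire."""
    return sum(signed_set_current(a, b, m_bits) for a, b in zip(a_vec, b_vec))


for m in (1, 2, 4, 8):
    print(m, "bit ->", max_dimension(64, m), "dimensions with 64 BEADs")
print(signed_inner_product([3, -2, 5], [-4, -6, 1], 4))
```

With 64 BEADs this budget gives 32, 16, 8 and 4 dimensions for 1-bit, 2-bit, 4-bit and 8-bit precision, matching the configurations reported in the Results section.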
Results

To demonstrate the SUANPAN architecture, a VCSEL and a MoTe₂ PD are paired to form the BEAD. As a light-emitter, the VCSEL can readily achieve high-speed modulation as well as large-scale arrays. For the PD, the 2D material MoTe₂ is utilized for three reasons: (1) The photoresponsivity of a 2D material PD can be flexibly controlled by the bias voltage. (2) The high carrier mobility in 2D materials [32] can support high-speed detection, which is important for high-speed computing. (3) 2D materials can be heterogeneously integrated with other material platforms [33]. Therefore, in the SUANPAN architecture the 2D material PD could potentially be integrated with the light-emitter array in the future, as explained in detail in the Discussion section. We have fabricated a VCSEL array chip and a MoTe₂ PD array chip, each with 8×8 components. The fabrication process can be found in Methods and Extended Data Fig.2, and the characteristics of each VCSEL and PD can be found in Extended Data Fig.3-5. Our fabricated VCSELs and PDs show good uniformity and stability, which are very significant in an optical computing architecture. To align each pair of VCSEL and PD, an optical imaging system is built (refer to Extended Data Fig.7). Such an imaging system would become unnecessary once the 2D material PD array is integrated with the light-emitter array on a single chip.

To verify the functionality of the SUANPAN architecture, random vector inner products are performed with various bit precisions and dimensionalities. Finally, two typical AI tasks, the Ising machine and the ANN, are performed to present the ability of the SUANPAN architecture. The inner product of random vectors is the foundation for different AI tasks, and signed vector inner products with precisions of 1-bit, 2-bit, 4-bit and 8-bit are demonstrated. For 1-bit precision signed vectors, 2 BEADs are required in each set, so the 64 BEADs form 32 sets.
Therefore, a 32-dimensional vector inner product can be done at one time. Similarly, for 2-bit, 4-bit and 8-bit quantization, the corresponding dimensionality is 16, 8 and 4, respectively. To achieve higher-dimensional vector inner products, time-division multiplexing can also be employed, in which the results of multiple calculations are accumulated. For each bit precision, SUANPAN is configured accordingly, and the corresponding bias voltage of each PD is shown in Fig.2e-h. Also, 1000 pairs of signed vectors are randomly generated and processed by SUANPAN. The calculation accuracy is evaluated by the fidelity defined in equation (1). Here, the 1000 true values are calculated in a computer and denoted as a vector x, and the results calculated in SUANPAN are recorded as a vector y. As shown in equation (1), the fidelity describes the degree of parallelism between x and y, so that the scaling error is excluded. The statistical results are shown in Fig.2e-h, and all of the fidelities are higher than 98%. These results indicate that SUANPAN can perform accurate calculation of signed vector inner products with different bit precisions. Specifically, the fidelities of the 4-bit precision signed vector inner product with different dimensionalities are shown in Fig.2i. As the dimensionality increases, the computational fidelity remains above 95%, which also demonstrates the scalability of the SUANPAN architecture. Given the high fidelity in executing random signed vector inner products with different bit precisions and dimensionalities, the SUANPAN architecture can be flexibly utilized to demonstrate more specific computing tasks. In the next sections, both the Ising machine and the ANN are considered.

For decision-making tasks in AI applications, combinatorial optimization problems are ubiquitous and usually NP-hard.
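The fidelity metric and the time-division multiplexing used for the random inner-product tests above can be sketched numerically. Equation (1) is not reproduced in this text, so the cosine-similarity form below is an assumption, chosen because it measures how parallel x and y are and is insensitive to a common scaling factor. The function names are hypothetical.

```python
import math


def fidelity(x, y):
    """Assumed fidelity: normalized inner product (cosine similarity) of x and y."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)


def tdm_inner_product(a_vec, b_vec, chunk):
    """Accumulate a long inner product from several hardware-sized pieces.

    chunk is the dimensionality the hardware handles at one time
    (for example 8 at 4-bit precision with 64 BEADs).
    """
    total = 0
    for start in range(0, len(a_vec), chunk):
        a_part = a_vec[start:start + chunk]
        b_part = b_vec[start:start + chunk]
        total += sum(ai * bi for ai, bi in zip(a_part, b_part))  # one hardware shot
    return total


true_values = [1.0, 2.0, 3.0]
measured = [2.1, 3.9, 6.2]          # roughly 2x scaled, with small errors
print(fidelity(true_values, measured))
print(tdm_inner_product(list(range(80)), [1] * 80, 8))
```

A pure scaling error, such as `measured = [2.0, 4.0, 6.0]`, gives a fidelity of 1 under this assumed definition, which is consistent with the statement that scaling error is excluded.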
One approach to processing such NP-hard problems is to map them to the Ising problem [34], which is a typical combinatorial optimization problem and is also known as the quadratic unconstrained binary optimization (QUBO) problem. An N-dimensional Ising problem can be defined by an interaction matrix J, which is a symmetric N×N matrix. For a given interaction matrix J, the Hamiltonian H of the Ising problem is defined in terms of J and a state vector S. Solving the Ising problem means finding the specific vector S that minimizes the Hamiltonian, which is denoted as the ground state. Since the elements in S can only take 0 or 1, the size of the solution space is 2^N for an N-dimensional Ising problem. With the development of AI, various heuristic algorithms have been developed to solve the Ising problem, and so-called Ising machines have been demonstrated on various computing platforms. Among them, the simulated annealing (SA) algorithm [35] is combined with optical computing platforms to form a photonic Ising machine due to the parallelism of light beams. Here, SUANPAN is set to serve as a hardware solver for the Ising problem. The detailed solution process of the SA algorithm [35] consists of initialization, n iterations and output. In each iteration, one random element of S is flipped and then the variation of the Hamiltonian, ΔH, is calculated. After that, the new vector S is accepted or rejected according to ΔH. Obviously, the Hamiltonian should be calculated in each iteration with O(N²) complexity, which is the main computational burden in the SA algorithm. Through some identity transformations, ΔH can be transformed into an N-dimensional vector inner product, which can be readily performed by SUANPAN, as shown in Fig.3a. Therefore, there are mainly two functional units to build up such an Ising machine here. The first one is an electronic processor (gray box in Fig.3a), which executes the updating and flipping operations.
The other is SUANPAN (blue box in Fig.3a), where the vector inner product is performed to calculate ΔH. Here, as shown in Fig.3a, the first vector is the i-th row of matrix J with some proper modifications. Since J is a continuous-variable matrix, it is encoded on the light-emitter array to achieve a high bit precision. The second vector corresponds to the vector S, whose elements only take 0 or 1. Thus, it is encoded on the PD array with 1-bit precision. It should be mentioned that if S_i is flipped from 0 to 1, then the result of the vector inner product is ΔH. Otherwise, if S_i is flipped from 1 to 0, the negative sign should be taken.

We first solve a 30-dimensional randomly generated Ising problem with SUANPAN, which is the highest dimensionality of an arbitrarily connected Ising problem reported previously in a programmable photonic Ising machine [36]. The current SUANPAN with 64 BEADs can calculate a 32-dimensional vector inner product at one time with 1-bit precision, and thus it can be employed to solve 30-dimensional Ising problems without time-division multiplexing. The iteration number is set to 500 and the solving process is repeated for 100 rounds, while the full parameters of the SA algorithm are provided in Extended Data Table 1. Each curve in Fig.3b represents the Hamiltonian evolution during one solving round, and 99 curves eventually converge to the ground state (dashed line in Fig.3b). The Hamiltonian decreases very rapidly in the initial 100 iterations. After about 230 iterations, most of the curves are already very close to the ground state. Finally, an accuracy of 99% is achieved by SUANPAN, which is much higher than that of the existing optical programmable Ising machine [36].
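The division of labour described above, with the inner product on SUANPAN and the flip and accept steps on an electronic processor, can be sketched in software. Several things here are assumptions because the source equations and Extended Data Table 1 are not reproduced in this text: the Hamiltonian is taken as H = −½ Σ J_ij S_i S_j with S_i in {0, 1} and a zero diagonal in J, the "proper modification" of the row is taken to be a sign change, and the geometric temperature schedule and its parameters are illustrative only.

```python
import math
import random


def hamiltonian(J, S):
    """Assumed Hamiltonian: H = -1/2 * sum_ij J[i][j] * S[i] * S[j]."""
    n = len(S)
    return -0.5 * sum(J[i][j] * S[i] * S[j] for i in range(n) for j in range(n))


def delta_h(J, S, i):
    """Energy change when S[i] is flipped, obtained from one inner product.

    The inner product of the (negated) i-th row of J with S is the part that
    would run on the photonic hardware; the sign choice is done electronically.
    """
    inner = sum(-J[i][j] * S[j] for j in range(len(S)))
    return inner if S[i] == 0 else -inner


def anneal(J, n_iter, t_start, t_end, rng):
    """Simulated annealing with single-spin flips (illustrative schedule)."""
    n = len(J)
    S = [rng.randint(0, 1) for _ in range(n)]
    H = hamiltonian(J, S)
    for step in range(n_iter):
        T = t_start * (t_end / t_start) ** (step / max(1, n_iter - 1))
        i = rng.randrange(n)
        dH = delta_h(J, S, i)
        if dH <= 0 or rng.random() < math.exp(-dH / T):
            S[i] ^= 1          # accept the flip
            H += dH
    return S, H


rng = random.Random(0)
n = 30
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        J[i][j] = J[j][i] = rng.uniform(-1.0, 1.0)
S, H = anneal(J, 500, 1.0, 0.01, rng)
print(H)
```

Each iteration needs only one N-dimensional inner product instead of a full O(N²) evaluation of H, which is why the per-iteration workload maps naturally onto a vector inner product machine.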
To further validate the scalability of the SUANPAN architecture, a randomly generated 1024-dimensional Ising problem is considered, whose dimensionality is more than an order of magnitude higher than that of the previously reported programmable photonic Ising machine based on a heuristic algorithm [36]. Here, the required 1024-dimensional vector inner product is decomposed into 32-dimensional inner products that are computed 32 times with time-division multiplexing. The iteration number is set to 5000, and the full parameters of the SA algorithm are provided in Extended Data Table 1. For a high-dimensional Ising problem, it is hard to obtain the true Hamiltonian value of the ground state, and thus an approximate solution reaching 87.8% of the ground state is usually set as the criterion for a successful solution [37-39], shown as the dashed line in Fig.3c. This 1024-dimensional Ising problem is successfully solved by SUANPAN, as the annealing curve falls below the criterion after ~4000 iterations. The high convergence rate and high dimensionality in solving the Ising problem validate the programmability, reconfigurability and computational stability of the SUANPAN architecture, which is capable of serving as a programmable Ising machine.

The ANN is another typical task in modern AI, and various physical neural networks (PNNs) have been applied to tasks such as visual perception, speech recognition, subject classification, etc. Among them, optical neural networks are quite promising, also due to the parallelism of light beams [4,5]. In PNNs, in silico training is usually required to avoid errors caused by differences between simulation and practical devices. Unlike that, the SUANPAN architecture can directly map pre-trained ANNs, in which the input data goes through a linear part and a nonlinear part in each layer. In the linear part of the ANN, the vector-matrix multiplication between the input data and the weight matrix is calculated, while in the nonlinear part, a nonlinear activation function is applied.
Correspondingly, the linear matrix multiplication, which can be considered as a set of vector inner products, is performed by SUANPAN, and the nonlinear function is calculated by the electronic processor. Therefore, through time-division multiplexing, the SUANPAN architecture can in theory execute ANNs of arbitrary depth and arbitrary number of nodes. As a proof of principle, both single-layer and double-layer ANNs are performed with the SUANPAN architecture, as shown in Fig.4a and Fig.4d, respectively. Here, the MNIST handwritten digit dataset is utilized, and stochastic gradient descent (SGD) [40] is utilized as the training method. The nonlinear function of the output layer is the Softmax function, and the nonlinear function of the hidden layer is the ReLU function. According to simulations, the weights of the single-layer ANN and the double-layer ANN are set to 4-bit precision and 6-bit precision, respectively (details are shown in Extended Data Fig.8). For the inference result of the single-layer ANN, the confusion matrices of the 10000 pictures in the test dataset calculated by the computer and by SUANPAN are shown in Fig.4b and Fig.4c, respectively. The classification accuracies are 88.08% and 90.12% for SUANPAN and the computer, respectively. It can be seen that the results processed by SUANPAN are quite consistent with those processed by a computer, and only a little deterioration is introduced. These results indicate a quite high computing accuracy in the SUANPAN architecture. It should be mentioned that the classification accuracy is not limited by SUANPAN but by the pre-trained model itself, and it could be further improved by optimizing the network model and the training method. For the double-layer ANN, only the first 100 pictures in the MNIST test dataset are processed as a proof-of-principle verification. The confusion matrices and accuracies calculated by the computer and by SUANPAN are shown in Fig.4e and Fig.4f, respectively.
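The mapping of a pre-trained ANN described above, with each output node computed as one vector inner product and the activation applied electronically, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the uniform weight quantizer, the tiny hand-made weights and all function names are assumptions, and the re-quantization of hidden activations into emitter time slots is omitted for brevity.

```python
import math


def quantize(w, m_bits, w_max):
    """Map a real weight to a signed M-bit integer level (assumed uniform quantizer)."""
    levels = 2 ** m_bits - 1
    q = min(levels, round(abs(w) / w_max * levels))
    return q if w >= 0 else -q


def linear_layer(x, weights_q):
    """Linear part: one inner product per output node (the SUANPAN workload)."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weights_q]


def relu(v):
    """Hidden-layer activation, computed by the electronic processor."""
    return [max(0.0, z) for z in v]


def softmax(v):
    """Output-layer activation, computed by the electronic processor."""
    peak = max(v)
    exps = [math.exp(z - peak) for z in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers):
    """Alternate inner products and electronic nonlinearities layer by layer."""
    h = x
    for idx, weights_q in enumerate(layers):
        z = linear_layer(h, weights_q)
        h = softmax(z) if idx == len(layers) - 1 else relu(z)
    return h


# Toy two-layer network with hand-made integer weights (illustrative only).
layers = [
    [[1, -1], [0, 1]],   # hidden layer
    [[1, 0], [0, 1]],    # output layer
]
probs = forward([1, 2], layers)
print(probs.index(max(probs)), probs)
print(quantize(1.0, 4, 1.0), quantize(-0.2, 4, 1.0))
```

Because only `linear_layer` would run on the photonic hardware, deeper or wider networks change the number of inner products to schedule, not the structure of the electronic part.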
It can be noticed that the classification accuracy of the double-layer ANN calculated by the computer is much higher than that of the single-layer ANN, while that calculated by SUANPAN is lower than that of the single-layer ANN. A possible reason is that the device performance of the MoTe₂ PD array had deteriorated after two months of testing. Nevertheless, we believe that the above ANN results still validate the feasibility of the SUANPAN architecture.

Discussion

In this work, we have proposed and demonstrated the photonic SUANPAN architecture to perform vector inner product operations. Instead of maximizing the interaction of light beams as in the reported optical computing architectures, we utilize independent emitter-detector pairs to form a programmable, reconfigurable and scalable computing architecture that is compatible with existing computing systems. Our fabricated SUANPAN with 64 pairs of VCSELs and MoTe₂ PDs shows high computing fidelities for randomly generated vector inner products, and demonstrates two typical AI tasks, the Ising machine and the ANN. For the Ising problem, a 1024-dimensional problem is successfully solved, which is the highest dimensionality reported for an optical Ising machine with a heuristic algorithm. For the ANN, a competitive classification accuracy of 84~88% is achieved for the MNIST handwritten digit dataset.

There are two main contributions in this work, the SUANPAN computing architecture and the Bit Encoding and Analog Detecting computing paradigm, which are closely linked. First, the SUANPAN architecture breaks through the traditional mindset of obtaining optical matrix transformations through the interaction of light beams. Instead, there is no interaction among the propagating light beams of all BEADs. Therefore, the SUANPAN architecture can be decomposed into BEADs as independent computing units. The scalability, reconfigurability and programmability of the SUANPAN architecture are based only on the duplication, recombination and modulation of BEADs, without any additional cost.
Compared with other optical matrix transformations based on interaction between light beams, the SUANPAN architecture possesses the following advantages: (1) With massive industrial replication of BEADs, the SUANPAN architecture can theoretically be scaled without limit. (2) The SUANPAN architecture can be flexibly reconfigured and programmed to perform various specific computing tasks. (3) The correction of the SUANPAN architecture only considers the intensity of the light beam (details are provided in the Methods section), and there is no requirement to correct the phase of the light beams for interaction. (4) Even if one BEAD is damaged during fabrication or operation, it does not affect the other BEADs, and the only consequence is a decrease in the operating dimensionality. In short, the core idea of the SUANPAN architecture is to decompose the whole computing architecture into simple and independent units, since the absence of interaction among units brings more scalability and programmability to the whole optical computing architecture.

Second, the Bit Encoding and Analog Detecting computing paradigm provides a promising solution for optoelectronic analog-digital hybrid computing. In traditional DA conversion in optoelectronic computing, the digital electronic signal is converted to an analog optical signal in each single device. Therefore, each single device requires a DAC. However, in the Bit Encoding and Analog Detecting paradigm, the M-bit digital electronic signal is converted to an analog signal within a set of M BEADs, where each BEAD represents only 1-bit information. Therefore, no DAC is required in this computing paradigm. At the same time, only one ADC is required to convert the summed photocurrent into an electronic digital signal as the result. Therefore, the Bit Encoding and Analog Detecting computing paradigm greatly reduces the heavy burden introduced by ADCs and DACs. In fact, it is also an important factor in the arbitrary scalability of the SUANPAN architecture.
Additionally, this computing paradigm can be extended beyond the SUANPAN architecture and introduced into other computing architectures to reduce the obstructive effects of ADCs and DACs.

Finally, our work is a preliminary verification of the feasibility of the SUANPAN architecture, and there is still considerable room for improvement in the specific implementation. The light-emitter array and the 2D material PD array could be integrated into a single chip owing to the heterogeneous integration of 2D materials. In this way, an imaging system would no longer be required to align the light-emitter array and the PD array, and 3D stacking integration could be used to further expand the scalability.

In addition, a detailed analysis is provided here for the future development of the computility. To evaluate the computility of the SUANPAN architecture, a figure of merit is defined as BIPPS, which is bits of inner product per second. Compared with floating point operations per second (FLOPS), if the computational task uses M-bit quantization, then M BIPPS is equivalent to 1 FLOPS. For the SUANPAN architecture, the computility is determined by two factors, the BEAD size (L_B) and the BEAD bandwidth (f_B). Since a BEAD consists of an emitter-detector pair, the BEAD size is defined as the larger of the light-emitter size and the PD size, while the BEAD bandwidth is defined as the smaller of the modulation bandwidth of the light-emitter and the response bandwidth of the PD. For a utilized chip area of 1 cm², the computility is obtained as follows:

$$\text{Computility} = \frac{1\ \text{cm}^2}{L_B^2} \times f_B$$

where the first term equals the number of BEADs on a chip area of 1 cm². Therefore, the computility can be increased by reducing the BEAD size and increasing the BEAD bandwidth, as shown in Fig.5. Given the research on various high-speed nano-lasers (Ref. 41-43) and nano-detectors (Ref. 44,45), the computility could exceed 1P BIPPS/cm² for BEAD size L_B < […] and BEAD bandwidth f_B > 1 GHz.
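As a worked illustration of the computility estimate, the following sketch evaluates the number of BEADs on 1 cm² multiplied by the BEAD bandwidth, and converts BIPPS to the equivalent FLOPS for M-bit quantization. It assumes square BEADs of side L_B tiled without gaps. The function names and the example inputs (10 μm, 1 GHz, 4 bits) are illustrative assumptions, not values reported by the authors.

```python
def computility_bipps_per_cm2(bead_size_um: float, bandwidth_hz: float) -> float:
    """Estimated computility in BIPPS per cm^2.

    Assumes square BEADs of side `bead_size_um` tiled without gaps,
    so the BEAD count on 1 cm^2 is (1 cm / L_B)^2.
    """
    if bead_size_um <= 0 or bandwidth_hz < 0:
        raise ValueError("bead size must be positive and bandwidth non-negative")
    beads_per_cm2 = (1e4 / bead_size_um) ** 2  # 1 cm = 1e4 um
    return beads_per_cm2 * bandwidth_hz


def equivalent_flops(bipps: float, m_bits: int) -> float:
    """M BIPPS is treated as equivalent to 1 FLOPS for M-bit quantization."""
    if m_bits < 1:
        raise ValueError("m_bits must be at least 1")
    return bipps / m_bits


# Illustrative inputs: 10 um BEADs modulated at 1 GHz
bipps = computility_bipps_per_cm2(bead_size_um=10.0, bandwidth_hz=1e9)
print(f"{bipps:.3g} BIPPS/cm^2")                       # 1e+15 BIPPS/cm^2
print(f"{equivalent_flops(bipps, 4):.3g} FLOPS/cm^2")  # 2.5e+14 FLOPS/cm^2
```

Under these assumptions, 10 μm BEADs at 1 GHz give 10⁶ BEADs/cm² × 10⁹ Hz = 10¹⁵ BIPPS/cm², i.e. the 1P BIPPS/cm² scale; halving the BEAD size quadruples the estimate, while doubling the bandwidth only doubles it.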
Therefore, the SUANPAN architecture is an attractive and practicable candidate for a photonic linear vector machine in the foreseeable future.

Methods

Device fabrication of VCSEL

The 850 nm VCSEL EPI structure consists of around 35 pairs of AlGaAs bottom DBR and 25 pairs of AlGaAs top DBR. An AlGaAs/InGaAs quantum well is used as the active region. A 98% AlGaAs layer is used to form the oxide aperture. After epitaxy, the wafer goes through P-metal deposition, ICP trench etching and wet oxidation. The N-metal is deposited on the backside of the N-type substrate to form a common cathode, while each individual emitter in the array is connected to a separate anode pad on the edge of the VCSEL array chip by electroplated traces (Extended Data Fig.6a).

Device fabrication of MoTe₂ PD

The PDs are fabricated on a SiO₂/Si substrate on which a 10 nm 2H-MoTe₂ layer is directly grown (the detailed fabrication process is given in Ref. 46,47). First, the patterns are defined by ultraviolet lithography and transferred to the MoTe₂/SiO₂ layer by reactive ion etching (SF₆ acts as the etching gas). Then, the Cr/Au electrodes (10 nm/50 nm) are fabricated using ultraviolet lithography, deposition and lift-off. The schematic diagram of the preparation process is shown in Extended Data Fig.2. To prevent degradation, the PDs are encapsulated with a 10 nm Al₂O₃ layer grown by atomic layer deposition. For subsequent testing, the MoTe₂ PDs are connected to a self-designed printed circuit board (PCB) by wire bonding (Extended Data Fig.6b).

Experimental setup

Schematics of the experimental setup are illustrated in Fig.1a and Extended Data Fig.7. Light from the VCSEL array with a wavelength of 850 nm is focused by a zoom lens onto the MoTe₂ PD array. The 8×8 VCSEL array and the PD array are aligned using the illumination optical path. Electrical and optoelectronic measurements of the fabricated MoTe₂ PDs are carried out with a semiconductor parameter analyzer (PDA FS380) at room temperature in ambient conditions.
Rectification of SUANPAN

Due to fabrication errors, the output intensity of each VCSEL and the output current of each PD may not be completely consistent under the same conditions. Therefore, it is necessary to rectify the entire architecture before performing calculation tasks. Firstly, the dark current of each detector is adjusted to be consistent by changing the bias voltage on each detector. Secondly, under these bias voltages, the photocurrent of each detector is adjusted to be consistent by changing the output intensity of each VCSEL. After these two steps, all 64 BEADs can be considered consistent.

46 Xu, X. et al. Millimeter-Scale Single-Crystalline Semiconducting MoTe₂ via Solid-to-Solid Phase Transformation. J. Am. Chem. Soc. 141, 2128-2134 (2019).
47 Pan, Y. et al. Heteroepitaxy of semiconducting 2H-MoTe₂ thin films on arbitrary surfaces for large-scale heterogeneous integration. Nat. Synth. 1, 701-708 (2022).

Declarations

Data availability

The data that support the findings of this study are available within the paper and the Extended Data. Other relevant data are available from the corresponding author on reasonable request.

Acknowledgements

Funding from the National Key Research and Development Program of China (2023YFB2806703) and the National Natural Science Foundation of China (Grant No. U22A6004, 92365210, 62175124) is gratefully acknowledged. This work was also supported by Beijing National Research Center for Information Science and Technology (BNRist), Frontier Science Center for Quantum Information, Beijing Academy of Quantum Information Science, and Tsinghua University Initiative Scientific Research Program.

Author Contributions

Z.Y., Y.L. and X.F. conceived the idea. Z.Y. and C.L. designed and performed the simulations, experiments and data analysis. Y.R. and Y.Y. contributed to the growth of the MoTe₂ layer on the SiO₂/Si substrate. J.W. and C.J.C.-H. contributed to the fabrication of the VCSEL array. F.Q. assisted in building the electronic control system of the SUANPAN architecture. H.S., K.C., F.L., W.Z. and C.N. provided useful discussions and comments. Z.Y., C.L., Y.L., X.F. and Y.H. wrote the paper. All authors revised and approved the manuscript.

Competing interests

The authors declare no competing interests.

References

1 Cetinic, E. & She, J. Understanding and Creating Art with AI: Review and Outlook. ACM Trans. Multimed. Comput. Commun. Appl. 18, 1-22 (2022).
2 Rajpurkar, P. et al. AI in health and medicine. Nat. Med. 28, 31-38 (2022).
3 LeCun, Y. et al. Deep learning. Nature 521, 436-444 (2015).
4 Wetzstein, G. et al. Inference in artificial intelligence with deep optics and photonics. Nature 588, 39-47 (2020).
5 Fu, T. et al. Optical neural networks: progress and challenges. Light-Sci. Appl. 13, 263 (2024).
6 Mohseni, N. et al. Ising machines as hardware solvers of combinatorial optimization problems. Nat. Rev. Phys. 4, 363-379 (2022).
7 Laydevant, J. et al. Training an Ising machine with equilibrium propagation. Nat. Commun. 15, 3671 (2024).
8 Nikhar, S. et al. All-to-all reconfigurability with sparse and higher-order Ising machines. Nat. Commun. 15, 8977 (2024).
9 Zhou, H. et al. Photonic matrix multiplication lights up photonic accelerator and beyond. Light-Sci. Appl. 11, 30 (2022).
10 Goodman, J. W. et al. Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms. Opt. Lett. 2, 1-3 (1978).
11 Spall, J. et al. Fully reconfigurable coherent optical vector-matrix multiplication. Opt. Lett. 45, 5752-5755 (2020).
12 Wang, T. et al. An optical neural network using less than 1 photon per multiplication. Nat. Commun. 13, 123 (2022).
13 Reck, M. et al. Experimental realization of any discrete unitary operator. Phys. Rev. Lett. 73, 58-61 (1994).
14 Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441-446 (2017).
15 Roques-Carmes, C. et al. Heuristic recurrent algorithms for photonic Ising machines. Nat. Commun. 11, 249 (2020).
16 Pai, S. et al. Experimentally realized in situ backpropagation for deep learning in photonic neural networks. Science 380, 398-404 (2023).
17 Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004-1008 (2018).
18 Yan, T. et al. Fourier-space Diffractive Deep Neural Network. Phys. Rev. Lett. 123, 023901 (2019).
19 Zhou, T. et al. Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit. Nat. Photonics 15, 367-373 (2021).
20 Fu, T. et al. Photonic machine learning with on-chip diffractive optics. Nat. Commun. 14, 70 (2023).
21 Tait, A. N. et al. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing. J. Lightwave Technol. 32, 4029-4041 (2014).
22 Deng, Y. & Chu, D. Coherence properties of different light sources and their effect on the image sharpness and speckle of holographic displays. Sci. Rep. 7, 5893 (2017).
23 Feldmann, J. et al. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature 569, 208-214 (2019).
24 Feldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52-58 (2021).
25 Bai, B. et al. Microcomb-based integrated photonic processing unit. Nat. Commun. 14, 66 (2023).
26 Dong, B. et al. Partial coherence enhances parallelized photonic computing. Nature 632, 55-62 (2024).
27 Kazanskiy, N. L. et al. Optical Computing: Status and Perspectives. Nanomaterials 12 (2022).
28 Dan, Y. et al. Optoelectronic integrated circuits for analog optical computing: Development and challenge. Front. Physics 10, 1064693 (2022).
29 Kim, S. et al. Neuro-CIM: ADC-Less Neuromorphic Computing-in-Memory Processor With Operation Gating/Stopping and Digital–Analog Networks. IEEE J. Solid-State Circuit 58, 2931-2945 (2023).
30 Chen, Y. et al. All-analog photoelectronic chip for high-speed vision tasks. Nature 623, 48-57 (2023).
31 Huo, N. & Konstantatos, G. Recent Progress and Future Prospects of 2D-Based Photodetectors. Adv. Mater. 30, 1801164 (2018).
32 An, J. et al. Perspectives of 2D Materials for Optoelectronic Integration. Adv. Funct. Mater. 32, 2110119 (2021).
33 You, J. et al. Hybrid/Integrated Silicon Photonics Based on 2D Materials in Optical Communication Nanosystems. Laser Photon. Rev. 14, 2000239 (2020).
34 Lucas, A. Ising formulations of many NP problems. Front. Physics 2, 5 (2014).
35 Van Laarhoven, P. J. & Aarts, E. H. Simulated annealing: theory and application. (Springer, 1987).
36 Ouyang, J. et al. On-demand photonic Ising machine with simplified Hamiltonian calculation by phase encoding and intensity detection. Commun. Phys. 7, 168 (2024).
37 Yamamoto, Y. et al. Coherent Ising machines—optical neural networks operating at the quantum limit. npj Quantum Inform. 3, 49 (2017).
38 Haribara, Y. et al. Computational Principle and Performance Evaluation of Coherent Ising Machine Based on Degenerate Optical Parametric Oscillator Network. Entropy 18, 151 (2016).
39 Goemans, M. X. & Williamson, D. P. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42, 1115-1145 (1995).
40 Ketkar, N. Deep learning with Python. (Springer, 2017).
41 Jeong, K. Y. et al. Recent Progress in Nanolaser Technology. Adv. Mater. 32, 2001996 (2020).
42 Du, W. et al. Nanolasers Based on 2D Materials. Laser Photon. Rev. 14, 2000271 (2020).
43 Zhang, Q. et al. Halide Perovskite Semiconductor Lasers: Materials, Cavity Design, and Low Threshold. Nano Lett. 21, 1903-1914 (2021).
44 Long, M. et al. Progress, Challenges, and Opportunities for 2D Material Based Photodetectors. Adv. Funct. Mater. 29, 1803807 (2018).
45 Liu, C. et al. Silicon/2D-material photodetectors: from near-infrared to mid-infrared. Light-Sci. Appl. 10, 123 (2021).
Supplementary Files ExtendedDataFig.docx Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-5401152","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Physical Sciences - Article","associatedPublications":[],"authors":[{"id":377081560,"identity":"ea014e6d-f08b-4bad-9557-d86800a8ed58","order_by":0,"name":"Xue Feng","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA+UlEQVRIiWNgGAWjYFACxgdAwoaBgRmMQCCBkBZmAyCRJkGylsMSYCZRWgwOMDM+Lvh1vk6+nffw68I2OwZ+9hwDhp87cGuRbGBmNp7Zd1vC4DBfmvXMtmQGyZ43Boy9Z3Br4Zd/f0yatweohZnHzJi3jZnB4EaOATNjG24tbAzM7L95e85JyDeDtdQz2BPSws/AzMbM8+OABMNhHuPHvG2HGQwkCGgB+UWatyFZcsNhHjNmnnPHeSTOPCs42ItHCyjEPvP8seOX7z9j/JmnrFqOvz1544OfeLSAAdQZbKDI4QGxDhDQAAR/wCTzB8IqR8EoGAWjYCQCAAMwQ9pPMoQUAAAAAElFTkSuQmCC","orcid":"https://orcid.org/0000-0002-9057-1549","institution":"Tsinghua 
University","correspondingAuthor":true,"prefix":"","firstName":"Xue","middleName":"","lastName":"Feng","suffix":""},{"id":377081561,"identity":"f1547ed0-f9dd-4f4b-aa03-c8dbb5da51d9","order_by":1,"name":"Ziyue Yang","email":"","orcid":"","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Ziyue","middleName":"","lastName":"Yang","suffix":""},{"id":377081562,"identity":"b4764f80-58c4-4748-8d04-f5763122455b","order_by":2,"name":"Chen Li","email":"","orcid":"https://orcid.org/0000-0001-6237-361X","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Chen","middleName":"","lastName":"Li","suffix":""},{"id":377081563,"identity":"2163ff52-ee05-45f0-958f-fe0e6db24a90","order_by":3,"name":"Yuqia Ran","email":"","orcid":"","institution":"Peking University","correspondingAuthor":false,"prefix":"","firstName":"Yuqia","middleName":"","lastName":"Ran","suffix":""},{"id":377081564,"identity":"04ca588c-5777-4de8-b629-c46dafbb9772","order_by":4,"name":"Yongzhuo Li","email":"","orcid":"","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Yongzhuo","middleName":"","lastName":"Li","suffix":""},{"id":377081565,"identity":"9a0f94d1-42a9-4f70-b8b2-6ce620dcda14","order_by":5,"name":"Kaiyu Cui","email":"","orcid":"https://orcid.org/0000-0002-3119-6072","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Kaiyu","middleName":"","lastName":"Cui","suffix":""},{"id":377081566,"identity":"775e3bca-a634-4ca6-8047-cfeee85aadbb","order_by":6,"name":"Fang Liu","email":"","orcid":"https://orcid.org/0000-0002-6781-8371","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Fang","middleName":"","lastName":"Liu","suffix":""},{"id":377081567,"identity":"9bdb909b-2b7f-44ff-b0e7-d0ccf22f400d","order_by":7,"name":"Hao Sun","email":"","orcid":"https://orcid.org/0000-0003-0951-6428","institution":"Tsinghua 
University","correspondingAuthor":false,"prefix":"","firstName":"Hao","middleName":"","lastName":"Sun","suffix":""},{"id":377081568,"identity":"239db110-008e-42f6-a498-7881e2bee2a1","order_by":8,"name":"Wei Zhang","email":"","orcid":"https://orcid.org/0000-0002-6848-6807","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Wei","middleName":"","lastName":"Zhang","suffix":""},{"id":377081569,"identity":"1f72ac2f-f685-4dc4-ae39-11b59f808706","order_by":9,"name":"Yu Ye","email":"","orcid":"https://orcid.org/0000-0001-6046-063X","institution":"Peking University","correspondingAuthor":false,"prefix":"","firstName":"Yu","middleName":"","lastName":"Ye","suffix":""},{"id":377081570,"identity":"75266c09-f026-49e0-85e4-d70d6efd2f90","order_by":10,"name":"Fei Qiao","email":"","orcid":"https://orcid.org/0000-0002-5054-9590","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Fei","middleName":"","lastName":"Qiao","suffix":""},{"id":377081571,"identity":"47728e66-c150-4d3b-82d5-66bddbe05c3c","order_by":11,"name":"Jiaxing Wang","email":"","orcid":"","institution":"Berxel Photonics Company Ltd.","correspondingAuthor":false,"prefix":"","firstName":"Jiaxing","middleName":"","lastName":"Wang","suffix":""},{"id":377081572,"identity":"2cf302b0-47a2-4ffe-9861-a465a26a9cda","order_by":12,"name":"Cun-Zheng Ning","email":"","orcid":"https://orcid.org/0000-0003-4583-8889","institution":"Shenzhen Technology University","correspondingAuthor":false,"prefix":"","firstName":"Cun-Zheng","middleName":"","lastName":"Ning","suffix":""},{"id":377081573,"identity":"c5d04235-2b9d-4fde-9684-33d0765b0b37","order_by":13,"name":"Connie J.Chang-Hasnain","email":"","orcid":"","institution":"Berxel Photonics Company Ltd.","correspondingAuthor":false,"prefix":"","firstName":"Connie","middleName":"","lastName":"J.Chang-Hasnain","suffix":""},{"id":377081574,"identity":"42b3e894-538c-4a23-b4f6-9899fc652086","order_by":14,"name":"Yidong 
Huang","email":"","orcid":"https://orcid.org/0000-0002-4570-9449","institution":"Tsinghua University","correspondingAuthor":false,"prefix":"","firstName":"Yidong","middleName":"","lastName":"Huang","suffix":""}],"badges":[],"createdAt":"2024-11-06 09:00:09","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5401152/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5401152/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":69780092,"identity":"24d578a8-29b1-4d7d-9f4f-459c903dc807","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1617171,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eArchitecture of SUANPAN. a,\u003c/strong\u003e The schematic diagram of SUANPAN architecture, consisting of a light-emitter array, a PD array and some necessary electronic hardware. Left inset shows the schematic and microscope photograph of a single VCSEL. Scale bar is 20 μm. Right inset shows the schematic and microscope photograph of a single MoTe\u003csub\u003e2\u003c/sub\u003e PD. Scale bar is 100 μm. \u003cstrong\u003eb, \u003c/strong\u003eThe optical image of the VCSEL array. \u003cstrong\u003ec,\u003c/strong\u003e The optical image of the MoTe\u003csub\u003e2\u003c/sub\u003e PD array.\u003c/p\u003e","description":"","filename":"fig1.png","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/95bd813dad5f683b8808eded.png"},{"id":69780090,"identity":"03262f5e-9272-4348-b29a-3ef1aa098776","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":758150,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eOperation principle of SUANPAN. 
a, \u003c/strong\u003eThe mechanism of performing \u003cem\u003ea\u003c/em\u003e×\u003cem\u003eb\u003c/em\u003e, where \u003cem\u003eb\u003c/em\u003e is 1-bit quantization with 1 BEAD. Inset: the on-off state of 1 BEAD in SUANPAN comparing with 1 bead in Chinese Suanpan.\u003cstrong\u003e b, \u003c/strong\u003eThe mechanism of performing \u003cem\u003ea\u003c/em\u003e×\u003cem\u003eb\u003c/em\u003e, where \u003cem\u003eb\u003c/em\u003eis 2-bit quantization with 2 BEADs. Inset: the on-off states of 2 BEADs in SUANPAN comparing with 2 beads in Chinese Suanpan. \u003cstrong\u003ec,\u003c/strong\u003e The mechanism of performing \u003cem\u003ea\u003c/em\u003e×\u003cem\u003eb\u003c/em\u003e, where \u003cem\u003eb \u003c/em\u003eis \u003cem\u003eM\u003c/em\u003e-bit quantization with \u003cem\u003eM\u003c/em\u003eBEADs. The white box represents a set of \u003cem\u003eM\u003c/em\u003eBEADs. Inset: \u003cem\u003eM\u003c/em\u003e beads in Chinese Suanpan as a comparison. \u003cstrong\u003ed,\u003c/strong\u003e The mechanism of performing vector inner product \u003cem\u003eA\u003c/em\u003e \u003cem\u003eB \u003c/em\u003ewith \u003cem\u003eN\u003c/em\u003e sets of \u003cem\u003eM\u003c/em\u003e BEADs. Inset: \u003cem\u003eN\u003c/em\u003e sets of Chinese Suanpan as a comparison. \u003cstrong\u003ee-h, \u003c/strong\u003eThe configuration of SUANPAN and experimental fidelity for 1-bit, 2-bit, 4bit and 8-bit quantization, respectively. 
\u003cstrong\u003ei,\u003c/strong\u003e The experimental fidelity for 4-bit quantization versus the dimensionality scales.\u003c/p\u003e","description":"","filename":"fig2.png","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/0726df3f4c8c0b76e946040f.png"},{"id":69780094,"identity":"1a20e4c2-2b26-46e3-8934-3cbfdb76ad27","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":458779,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePhotonic Ising machine with SUANPAN architecture. a, \u003c/strong\u003eThe system architecture of Ising machine utilizing SUANPAN and a digital computer as optoelectronic hybrid computing. Blue box: the vector inner product performed by SUANPAN, which is equivalent to calculate. Gray box: nonlinear operations performed by the digital computer. \u003cstrong\u003eb,\u003c/strong\u003e 100 experimental annealing curves of a random 30-dimensional Ising problem. Dashed line: ground state. Inset: the random 30-dimensional Ising model. \u003cstrong\u003ec, \u003c/strong\u003eExperimental annealing curve of a random 1024-dimensional Ising problem. Dashed line: 87.8% approximate solution.\u003c/p\u003e","description":"","filename":"fig3.png","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/543d4bb5ae4719f0b86a033f.png"},{"id":69780091,"identity":"4cd95dbc-7992-4945-ab0d-af55e689750e","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":688916,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eArtificial neural network with SUANPAN architecture. a, \u003c/strong\u003eThe pre-trained one-layer ANN for MNIST dataset. \u003cstrong\u003eb-c,\u003c/strong\u003eThe accuracy and confusion matrix of the one-layer ANN performed by computer and SUANPAN, respectively. 
\u003cstrong\u003ed,\u003c/strong\u003e The pre-trained double-layer ANN for MNIST dataset. \u003cstrong\u003ee-f,\u003c/strong\u003e The accuracy and confusion matrix of the double-layer ANN performed by computer and SUANPAN, respectively.\u003c/p\u003e","description":"","filename":"fig4.png","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/3a818cf0830754ea750754db.png"},{"id":69780095,"identity":"e99f0798-f0ca-4596-a3e4-94385d9e826e","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":712257,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eThe estimated computility evolution with decreasing BEAD size and increasing BEAD bandwidth.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"fig5.png","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/2dbe0fd661ed591b7e8cd643.png"},{"id":69780093,"identity":"2f908483-0379-4e43-904f-0f6145743629","added_by":"auto","created_at":"2024-11-25 07:55:22","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":5431200,"visible":true,"origin":"","legend":"","description":"","filename":"ExtendedDataFig.docx","url":"https://assets-eu.researchsquare.com/files/rs-5401152/v1/940d5978632bf30e6157b488.docx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"SUANPAN: Scalable Photonic Linear Vector Machine","fulltext":[{"header":"Introduction","content":"\u003cp\u003eArtificial intelligence (AI) is currently an active topic in both scientific research and commercial application as well as daily life\u003csup\u003e1,2\u003c/sup\u003e. 
For AI techniques, the linear operations of high-dimensional vectors are fundamental and dominant in both the artificial neural networks\u003csup\u003e3-5\u003c/sup\u003e (ANN) and optimization problem solvers,\u003cem\u003e\u0026nbsp;such as\u0026nbsp;\u003c/em\u003ethe Ising machine\u003csup\u003e6-8\u003c/sup\u003e. As the complexity of problems increases, the dimensionality of the processed vector grows rapidly, resulting in a huge computational burden. It is known that vector operations can be readily accelerated by photons due to the natural parallelism of bosons\u003csup\u003e9\u003c/sup\u003e. In the past decades, various photonic computing architectures have been demonstrated to perform vector-matrix operation in optical domain, \u003cem\u003ei.e.\u003c/em\u003e Stanford structure\u003csup\u003e10-12\u003c/sup\u003e, Reck scheme\u003csup\u003e13-16\u003c/sup\u003e, deep diffraction architecture\u003csup\u003e17-20\u003c/sup\u003e, micro-ring resonator (MRR) array\u003csup\u003e21-26\u003c/sup\u003e, \u003cem\u003eetc.\u003c/em\u003e All these approaches can only perform limited scalability of the linear vector operation due to two fundamental issues. As the optical matrix is adopted to perform linear transformation of the input vector encoded on light beam, the basic units in the computing architecture, \u003cem\u003ei.e.\u003c/em\u003e liquid crystal cells, beam splitters, meta-atoms, \u003cem\u003eetc.\u003c/em\u003e, would be tightly interconnected or highly coupled with each other due to the unavoidable light beam interaction. Thus, high-dimensional optical vector-matrix operations cannot be achieved by simply duplicating these basic units. Another issue is how to fuse the linear optics with the existing electronic digital computing systems. 
Due to significant obstacles in implementing all-optical computers\u003csup\u003e27\u003c/sup\u003e, optoelectronic hybrid computing architecture may be the most promising solution\u003csup\u003e28\u003c/sup\u003e by performing the linear vector operations in the optical domain and nonlinear operations in the electronic domain. This requires a large number of analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) for exchanging analog data in optical domain and digital data in electronic domain. In fact, such conversion between digital and analog data is the main limiting factor to significantly scale up the computing dimensionality, since ADC/DAC causes large power and area overhead\u003csup\u003e29\u003c/sup\u003e. In addition, the slow speed of ADC and DAC remains the main obstacle to high-speed operation with linear optical computing. To address this issue, an all-analog chip combining electronic and light computing (ACCEL)\u003csup\u003e30\u003c/sup\u003e was presented, but it could only serve as a specific processor for the vision task. Therefore, a universal optoelectronic hybrid computing architecture, which does not rely on light beam interaction and can allow high-speed conversions between analog and digital data is urgently needed to address the two fundamental issues so that flexible scalability can be achieved to alleviate the heavy computing demands of future AI technology.\u003c/p\u003e\n\u003cp\u003eInspired by the traditional Chinese Suanpan or abacus, in which the mathematical operations are carried out by moving these beads, we propose and demonstrate a photonic SUANPAN architecture to pursue a brand-new analog-digital hybrid computing paradigm. The basic computing unit in SUANPAN contains one emitter-detector pair, and can achieve Bit Encoding and Analog Detecting, denoted as BEAD. Our proposed SUANPAN is formed by BEAD array, and thus can easily scale up by duplicating the independent BEAD. 
Instead of optical matrix operations in the current popular optical computing architectures, SUANPAN is designed only for the optical inner product of two vectors. The elemental values of one vector are encoded on the output intensity of light-emitter by controlling the duty ratio of driving current. While the elemental values of the other vector are encoded by the position of \u003cem\u003eM\u003c/em\u003e BEADs to achieve \u003cem\u003eM\u003c/em\u003e-bit quantization. The photocurrent of the photodetector (PD) would be proportional to the multiplication of the light intensity and photoresponsivity, and the final result of the inner product can be obtained by the summation of all the photocurrents. There is no DAC and only one ADC required in SUANPAN architecture, so it can be readily merged with existing electronic digital computing system. As a proof of principle, the SUANPAN architecture is implemented using an 8\u0026times;8 vertical cavity surface emission laser (VCSEL) array and an 8\u0026times;8 MoTe\u003csub\u003e2\u003c/sub\u003e two-dimensional (2D) material PD array. The experimental results of vector inner product operations show that the calculation fidelity can be as high as \u0026gt;98% for various bit precisions (1-bit, 2-bit, 4-bit and 8-bit), and \u0026gt;95% for various vector dimensionalities (@4-bit precision). Furthermore, such SUANPAN with 64 BEADs has been successfully reconfigured to perform two typical AI tasks, Ising machine and ANN. 1024-dimensional randomly generated Ising problem is successfully solved, which is the highest dimensional optical Ising machine with heuristic algorithm. Meanwhile, a competitive classification accuracy of 84~88% is achieved for MNIST handwritten digit dataset by properly setting SUANPAN for an ANN. 
Since there is no interaction among the propagating light beams of all BEADs and only the output currents of all BEADs are connected, SUANPAN architecture could be extremely scalable by increasing the number of BEADs with no additional loss or error as well as flexibly reconfigurable and programmable by properly arranging the BEADs for different computational tasks. We believe that our proposed photonic SUANPAN is capable to serve as a fundamental linear vector machine and is potential to enhance the computing power for future various AI applications.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eSUANPAN architecture\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe proposed SUANPAN architecture consists of a light-emitter array and a PD array as well as some necessary electronic hardware as schematically shown in Fig.1a. Here, a VCSEL array and a MoTe\u003csub\u003e2\u003c/sub\u003e 2D material PD array are fabricated to demonstrate the SUANPAN architecture as shown in Fig.1b and Fig.1c, respectively. The schematic diagram and microscope photographs of a single VCSEL and MoTe\u003csub\u003e2\u003c/sub\u003e PD are also shown in the insets of Fig.1a. SUANPAN architecture is designed for vector inner product, also known as multiply-accumulate (MAC) operation. Firstly, for multiply operation, each PD is well aligned with a corresponding light-emitter to form an emitter-detector pair, therefore the photocurrent of the PD would be proportional to the multiplication of the light intensity and photoresponsivity due to the linear optical response\u003csup\u003e31\u003c/sup\u003e. Then, for add operation, the photocurrent of each PD is connected together, therefore the output current would be the sum of all PDs due to the Kirchhoff\u0026rsquo;s law. In this way, the multiply-accumulate operation is naturally performed through the emission and detection in SUANPAN architecture. 
In addition, one more important issue is how to encode the vectors on to the light-emitter array and PD array. In the traditional Chinese Suanpan or abacus, numbers are represented by the different positions of beads, and the mathematic operations are carried out by moving these beads. Inspired by that, the basic computing unit in SUANPAN contains one emitter-detector pair, and can achieve Bit Encoding and Analog Detecting, named as BEAD. The vector encoding and operating would rely on controlling the on-off state of each BEAD.\u003c/p\u003e\n\u003cp\u003eFor deep insight, a simple example of one BEAD is considered. As shown in Fig.2a, the multiplier \u003cem\u003ea\u003c/em\u003e is encoded in the intensity of light-emitter by controlling the duty ratio of driving current, which is done by a digital counter according to the clock cycles without DAC. The bit precision depends on the time-slot numbers of splitting the period. With the constant period, the more time-slots there are, the more quantization bits for \u003cem\u003ea\u003c/em\u003e can be achieved. In this work, the period of light emission is divided into 100 slots, so that \u003cem\u003ea\u003c/em\u003e can be taken as 0, 1, \u0026hellip;, 100. The multiplier \u003cem\u003eb\u003c/em\u003e is encoded on the on-off state of the BEAD, by turning on (green arrow) or off (gray arrow) the light-emitter, respectively. Hence, there are two states to encode \u003cem\u003eb\u003c/em\u003e, \u003cem\u003eb\u003c/em\u003e=0 or \u003cem\u003eb\u003c/em\u003e=1, known as 1-bit quantization. The photocurrent would be proportional to \u003cem\u003ea\u003c/em\u003e\u0026times;0 or \u003cem\u003ea\u003c/em\u003e\u0026times;1, which is the result of \u003cem\u003ea\u003c/em\u003e\u0026times;\u003cem\u003eb\u003c/em\u003e, \u003cem\u003eb\u003c/em\u003e=0,1.\u003c/p\u003e\n\u003cp\u003eFor more bit quantization of \u003cem\u003eb\u003c/em\u003e, more BEADs are employed and form a set of them. 
Considering 2-bit quantization of \u003cem\u003eb\u003c/em\u003e, there would be two BEADs in one set, as shown in the white box of Fig.2b. The method of encoding \u003cem\u003ea\u003c/em\u003e is the same as described above, while encoding \u003cem\u003eb\u003c/em\u003e employs two BEADs simultaneously and thus forms four combinations of on-off states, which correspond to the four values of 2-bit quantization. From the perspective of binary encoding, these two BEADs are equivalent to the two bits in the binary representation of \u003cem\u003eb\u003c/em\u003e, while the position of the PD is considered as the bit position. For example, if \u003cem\u003eb\u003c/em\u003e=2, then the binary representation would be \u003cem\u003eb\u003c/em\u003e=(10)\u003csub\u003e2\u003c/sub\u003e, which means the first BEAD is in the off-state and the second one is in the on-state. As mentioned above, this is quite similar to the traditional Chinese Suanpan or abacus, which represents numbers according to the positions of beads and carries out mathematical operations by moving these beads up and down. Different bits carry different weights in the binary representation, and this can be achieved by setting the photoresponsivities of these two PDs to 2\u003csup\u003e0\u003c/sup\u003e and 2\u003csup\u003e1\u003c/sup\u003e. To properly manipulate the photoresponsivity, we design and fabricate 2D material photoconductive detectors (more details will be discussed in the Results section). By combining the photocurrents of these two PDs, the output result is \u003cem\u003ea\u003c/em\u003e\u0026times;\u003cem\u003eb\u003c/em\u003e, as shown in Fig.2b. 
Further, in this way, \u003cem\u003eM\u003c/em\u003e-bit quantization of \u003cem\u003eb\u003c/em\u003e can be achieved with a set of \u003cem\u003eM\u003c/em\u003e BEADs, so that SUANPAN can encode \u003cem\u003eb\u003c/em\u003e in the range from 0 to 2\u003cem\u003e\u003csup\u003eM\u003c/sup\u003e\u003c/em\u003e-1 and achieve the multiplication of \u003cem\u003ea\u003c/em\u003e and \u003cem\u003eb\u003c/em\u003e, as shown in Fig.2c.\u003c/p\u003e\n\u003cp\u003eAs mentioned above, \u003cem\u003ea\u003c/em\u003e is digitally encoded in the duty ratio of light emission and \u003cem\u003eb\u003c/em\u003e is digitally encoded in the on-off states of the BEADs in one set, while the output result is the analog photocurrent from the PDs. Thus, the basic unit, consisting of an emitter-detector pair, actually operates by Bit Encoding and Analog Detecting, which is abbreviated as BEAD. Since the values of \u003cem\u003ea\u003c/em\u003e and \u003cem\u003eb\u003c/em\u003e are encoded in the time and space domains respectively, the SUANPAN architecture can in theory perform multiplication at any desired bit precision. Further, the vector inner product can be achieved by employing \u003cem\u003eN\u003c/em\u003e sets of BEADs and connecting their output photocurrents together (Fig.2d). It should be mentioned that the SUANPAN architecture can also handle negative numbers by applying a reversed bias voltage to the corresponding PD. 
To cover both positive and negative numbers, 2\u0026times;\u003cem\u003eM\u003c/em\u003e BEADs should be utilized for each set; the details are provided in Extended Data Fig.1.\u003c/p\u003e\n\u003cp\u003eThrough this encoding method, an \u003cem\u003eN\u003c/em\u003e-dimensional signed vector inner product with \u003cem\u003eM\u003c/em\u003e-bit quantization can be performed by SUANPAN with \u003cem\u003eN\u003c/em\u003e\u0026times;2\u0026times;\u003cem\u003eM\u003c/em\u003e BEADs at a time, while each element of the input vectors can be flexibly programmed. Furthermore, the number of utilized sets (\u003cem\u003eN\u003c/em\u003e) as well as the number of BEADs in each set (\u003cem\u003eM\u003c/em\u003e) can be reconfigured according to the specific calculation requirements in terms of dimensionality and bit precision. At the output, the photocurrents of all PDs are connected together, so that only one ADC is required to transform the total analog current into a digital signal. Also, since only 1-bit information is encoded on a single BEAD, a properly set but fixed bias voltage is applied to each BEAD. Thus, there is no requirement for an ADC array or a DAC array in SUANPAN, and such arrays are actually the main limiting factor for hybrid computing with both digital and analog operations. Last but not least, since there is no interaction among the propagating light beams of all BEADs, the SUANPAN architecture can be extremely scalable by increasing the number of independent BEADs with no additional loss or error. On the one hand, the number of utilized BEADs can be easily increased by integrating more components on one pair of light-emitter array and PD array chips. On the other hand, distributed computing can also be achieved by simply connecting multiple pairs of chips together to scale up the computing power further. 
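\u003c/p\u003e\n\u003cp\u003eThe encoding described above can be illustrated with a minimal numerical sketch. It is an idealized model, not the authors' control software: the function name \u003cem\u003ebead_inner_product\u003c/em\u003e and its arguments are hypothetical, every BEAD is assumed to be perfectly linear and noise-free, and a negative \u003cem\u003eb\u003c/em\u003e is assumed to be handled by switching on the reverse-biased half of the 2\u0026times;\u003cem\u003eM\u003c/em\u003e BEADs in a set.\u003c/p\u003e

```python
import numpy as np

SLOTS = 100  # time slots per emission period, as stated in the text


def bead_inner_product(a, b, m_bits):
    """Idealized SUANPAN inner product (hypothetical helper, not the authors' code).

    a: integers in [0, SLOTS], the number of "on" time slots of each emitter.
    b: signed integers with |b| <= 2**m_bits - 1.
    Each set is assumed to use 2*m_bits BEADs: m_bits under positive bias
    and m_bits under reversed bias.
    """
    a = np.asarray(a, dtype=np.int64)
    b = np.asarray(b, dtype=np.int64)
    assert a.ndim == 1 and a.shape == b.shape
    assert np.all((a >= 0) & (a <= SLOTS))
    assert np.all(np.abs(b) <= 2**m_bits - 1)

    # Photoresponsivities 2^0 ... 2^(M-1), one per bit position.
    weights = 2 ** np.arange(m_bits)
    # On-off state of each BEAD in a set, least significant bit first.
    bits = (np.abs(b)[:, None] >> np.arange(m_bits)) & 1
    pos_on = bits * (b > 0)[:, None]  # BEADs under positive bias
    neg_on = bits * (b < 0)[:, None]  # BEADs under reversed bias

    # Photocurrent of one BEAD ~ on-time x responsivity x on-off state.
    currents = a[:, None] * weights[None, :] * (pos_on - neg_on)
    # Kirchhoff summation of all photocurrents, read by a single ADC.
    return int(currents.sum())


print(bead_inner_product([100, 50], [3, -2], m_bits=2))  # 100*3 - 50*2 = 200
```

\u003cp\u003eIn this sketch the only analog quantity is the summed current, which mirrors the statement that no DAC and only one ADC are needed.\u003c/p\u003e\n\u003cp\u003e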
Therefore, SUANPAN is a programmable, reconfigurable and scalable computing architecture, which is capable to serve as a general vector inner product accelerator for the existing computing system.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eTo demonstrate the SUANPAN architecture, a pair of VCSEL and MoTe\u003csub\u003e2\u003c/sub\u003e PD are employed to form the BEAD. As a light-emitter, VCSEL can readily achieve high-speed modulation as well as large-scale array. For PD, 2D material of MoTe\u003csub\u003e2\u003c/sub\u003e is utilized for three reasons: (1) The photoresponsivity of 2D material PD can be flexibly controlled by the bias voltage. (2) The high carrier mobility in 2D material\u003csup\u003e32\u003c/sup\u003e can support high-speed detection, which is an important issue for high-speed computing. (3) 2D material can be heterogeneously integrated with other material platform\u003csup\u003e33\u003c/sup\u003e. Therefore, in SUANPAN architecture 2D material PD is potentially integrated with light-emitter array in the future, which will be explained in detail in the Discussion section. Thus, we have fabricated both the VCSEL and MoTe\u003csub\u003e2\u003c/sub\u003e PD array chip with 8\u0026times;8 components, respectively. The fabrication process can be found in Methods and Extended Data Fig.2, and the characteristics of each VCSEL and PD can be found in Extended Data Fig.3-5. Our fabricated VCSELs and PDs show good uniformity and stability, which are very significant in the optical computing architecture. To align each pair of VCSEL and PD, an optical imaging system is built up (refer to Extended Data Fig.7). Such an imaging system would be not necessary by integrating the 2D material PD array with light-emitter array on a single chip in the future. To verify the functionality of SUANPAN architecture, random vector inner products are performed with various bit precisions and dimensionalities. 
Finally, two typical AI tasks of the Ising machine and ANN, are performed to present the ability of SUANPAN architecture.\u003c/p\u003e\n\u003cp\u003eThe inner product of random vectors is the foundation for different AI tasks and the signed vector inner product with precision of 1-bit, 2-bit, 4-bit and 8-bit are demonstrated. For 1-bit precision signed vector, 2 BEADs are required in each set, and there are 32 sets with 64 BEADs. Therefore, 32-dimensional vector inner product can be done at one time. Similarly, for 2-bit, 4-bit and 8-bit quantization, the corresponding dimensionality would be 16, 8 and 4 respectively. To achieve higher dimensional vector inner product, time-division multiplexing can also be employed, which is according to accumulating multiple times of calculations. For each bit precision, the configuration of SUANPAN would be properly settled and the corresponding bias voltage of each PD is shown in Fig.2e-h. Also, 1000 pairs of signed vectors are randomly generated and performed by SUANPAN. The calculation accuracy is evaluated by the fidelity expressed as:\u003c/p\u003e\n\u003cp\u003e\u003cimg src=\"https://myfiles.space/user_files/58894_9946feeafa4c1df7/58894_custom_files/img1732519859.png\" width=\"369\" height=\"58\"\u003e\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eHere, 1000 true values are calculated in a computer and denoted as a vector \u003cem\u003ex\u003c/em\u003e, and the calculated results in SUANPAN are recorded as a vector \u003cem\u003ey\u003c/em\u003e. As shown in equation (1), the fidelity describes the parallel degree between \u003cem\u003ex\u003c/em\u003e and \u003cem\u003ey\u003c/em\u003e, while the scaling error can be excluded. The statistical results are shown in Fig.2e-h and all of those fidelities are higher than 98%. These results indicate that SUANPAN can perform accurate calculation of signed vector inner products with different bit precisions. 
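\u003c/p\u003e\n\u003cp\u003eEquation (1) is only available as an image here, so the following sketch rests on an assumption: that the fidelity is the normalized inner product (cosine similarity) of \u003cem\u003ex\u003c/em\u003e and \u003cem\u003ey\u003c/em\u003e. That reading matches the description of a \u0026ldquo;parallel degree\u0026rdquo; that excludes scaling error, but the exact form used by the authors may differ. The function name \u003cem\u003efidelity\u003c/em\u003e is hypothetical.\u003c/p\u003e

```python
import numpy as np


def fidelity(x, y):
    """Assumed fidelity: cosine of the angle between x and y.

    x: true inner-product values computed digitally.
    y: values measured on the hardware (any overall scale factor cancels).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

\u003cp\u003eUnder this definition, a measurement that differs from the true values only by a constant positive gain still yields a fidelity of 1, which is why no absolute calibration of the photocurrent scale would be needed for this metric.\u003c/p\u003e\n\u003cp\u003e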
Specifically, the fidelity of 4-bit precision signed vector inner product with different dimensionalities are shown in Fig.2i. As the dimensionality increases, the computational fidelity remains above 95%, which also demonstrates the scalability of SUANPAN architecture. Due to the high fidelity in executing random signed vector inner products with different bit precisions and dimensionalities, SUANPAN architecture can be flexibly utilized to further demonstrate more specific computing tasks. In next sections, both the Ising machine and ANN would be considered.\u003c/p\u003e\n\u003cp\u003eFor the decision-making task in AI applications, the combinatorial optimization problems are ubiquitous and usually non-deterministic polynomial. One approach to process such NP-hard problems is mapping them to Ising problem\u003csup\u003e34\u003c/sup\u003e, which is a typical combinatorial optimization problem and also known as quadratic unconstrained binary optimization (QUBO) problem. An \u003cem\u003eN\u003c/em\u003e-dimensional Ising problem can be defined by an interaction matrix \u003cem\u003eJ\u003c/em\u003e, which is a symmetric matrix of \u003cem\u003eN\u003c/em\u003e\u0026times;\u003cem\u003eN\u003c/em\u003e dimensionality. For a given interaction matrix \u003cem\u003eJ\u003c/em\u003e, the Hamiltonian of Ising problem is defined as follows:\u003c/p\u003e\n\u003cp\u003e\u003cimg src=\"https://myfiles.space/user_files/58894_9946feeafa4c1df7/58894_custom_files/img1732519858.png\" width=\"322\" height=\"22\"\u003e\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eSolving Ising problem is to find the specific vector \u003cem\u003eS\u003c/em\u003e that minimizes the Hamiltonian, which is denoted as the ground state. Since the elements in \u003cem\u003eS\u003c/em\u003e can only take 0 or 1, the dimensionality of the solution space is 2\u003cem\u003e\u003csup\u003eN\u003c/sup\u003e\u003c/em\u003e for an \u003cem\u003eN\u003c/em\u003e-dimensional Ising problem. 
With the development of AI, various heuristic algorithms have been developed to solve the Ising problem and the so-called Ising machines have been demonstrated on various computing platforms. Among them, the simulated annealing (SA) algorithm\u003csup\u003e35\u003c/sup\u003e is combined with optical computing platforms to form a photonic Ising machine due to the parallelism of light beams. Here, SUANPAN is set to serve as a hardware solver for the Ising problem. The detailed solution process of the SA algorithm\u003csup\u003e35\u003c/sup\u003e consists of initialization, \u003cem\u003en\u003c/em\u003e iterations and output. In each iteration, one random element of \u003cem\u003eS\u003c/em\u003e is flipped and then the variation of the Hamiltonian \u0026Delta;\u003cem\u003eH\u003c/em\u003e is calculated. After that, the vector \u003cem\u003eS\u003c/em\u003e is accepted or rejected according to \u0026Delta;\u003cem\u003eH\u003c/em\u003e. Obviously, the Hamiltonian should be calculated in each iteration with O(\u003cem\u003eN\u003c/em\u003e\u003csup\u003e2\u003c/sup\u003e) complexity, which is actually the main computational burden in the SA algorithm. Through some identity transformations, \u0026Delta;\u003cem\u003eH\u003c/em\u003e can be transformed into an \u003cem\u003eN\u003c/em\u003e-dimensional vector inner product, which can be readily performed by SUANPAN as shown in Fig.3a. Therefore, there are mainly two functional units to build up such an Ising machine here. The first one is an electronic processor (gray box in Fig.3a), which executes the updating and flipping operations. The other is SUANPAN (blue box in Fig.3a), where the vector inner product is performed to calculate \u0026Delta;\u003cem\u003eH\u003c/em\u003e. Here, as shown in Fig.3a, the first vector is the \u003cem\u003ei\u003c/em\u003eth row of matrix \u003cem\u003eJ\u003c/em\u003e with some proper modifications. 
Since \u003cem\u003eJ\u003c/em\u003e is a continuous variable matrix, it is encoded on the light-emitter array to achieve a high bit precision. The second vector is derived from vector \u003cem\u003eS\u003c/em\u003e, whose elements only take the values 0 or 1. Thus, it is encoded on the PD array with 1-bit precision. It should be mentioned that if \u003cem\u003eS\u003csub\u003ei\u003c/sub\u003e\u003c/em\u003e is flipped from 0 to 1, then the result of the vector inner product is \u0026Delta;\u003cem\u003eH\u003c/em\u003e. Otherwise, if \u003cem\u003eS\u003csub\u003ei\u003c/sub\u003e\u003c/em\u003e is flipped from 1 to 0, then the negative sign should be taken.\u003c/p\u003e\n\u003cp\u003eWe first solve a randomly generated 30-dimensional Ising problem with SUANPAN, which is the highest dimensionality of an arbitrarily connected Ising problem previously reported for a programmable photonic Ising machine\u003csup\u003e36\u003c/sup\u003e. The current SUANPAN with 64 BEADs can calculate a 32-dimensional vector inner product at one time with 1-bit precision, and thus it can be employed to solve 30-dimensional Ising problems without time-division multiplexing. The iteration number is set to 500 and the solving process is repeated for 100 rounds, while the full parameters of the SA algorithm are provided in Extended Data Table1. Each curve in Fig.3b represents the Hamiltonian evolution during one solving round, and 99 curves eventually converge to the ground state (dashed line in Fig.3b). The Hamiltonian decreases very rapidly in the initial 100 iterations. After about 230 iterations, most of the curves are already very close to the ground state. 
Finally, an accuracy of 99% is achieved by SUANPAN, which is much higher than that of the existing optical programmable Ising machine\u003csup\u003e36\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eTo further validate the scalability of the SUANPAN architecture, a randomly generated 1024-dimensional Ising problem is considered, whose dimensionality is more than an order of magnitude higher than that of the previously reported programmable photonic Ising machine based on a heuristic algorithm\u003csup\u003e36\u003c/sup\u003e. Here, the required 1024-dimensional vector inner product is decomposed into 32-dimensional ones, with time-division multiplexing over 32 steps. The iteration number is set to 5000 and the full parameters of the SA algorithm are provided in Extended Data Table1. For a high-dimensional Ising problem, it is hard to obtain the true Hamiltonian value of the ground state, and thus an approximate solution reaching 87.8% of the ground state is usually taken as the criterion for a successful solution\u003csup\u003e37-39\u003c/sup\u003e, shown as the dashed line in Fig.3c. This 1024-dimensional Ising problem is successfully solved by SUANPAN, as the annealing curve falls below the criterion after ~4000 iterations. The high convergence rate and high dimensionality in solving the Ising problem validate the programmability, reconfigurability and computational stability of the SUANPAN architecture, which is able to serve as a programmable Ising machine.\u003c/p\u003e\n\u003cp\u003eANN is another typical task in modern AI and various physical neural networks (PNNs) have been applied to the tasks of visual perception, speech recognition, subject classification, \u003cem\u003eetc\u003c/em\u003e. Among them, optical neural networks are quite promising, also due to the parallelism of light beams\u003csup\u003e4,5\u003c/sup\u003e. In PNNs, silico training is usually required to avoid errors caused by differences between simulation and practical devices. 
By contrast, the SUANPAN architecture can directly map pre-trained ANNs, in which the input data passes through a linear part and a nonlinear part in each layer. In the linear part of the ANN, the vector-matrix multiplication between the input data and the weight matrix is calculated, while in the nonlinear part, a nonlinear activation function is applied. Correspondingly, the linear matrix multiplication, which can be considered as a set of vector inner products, is performed by SUANPAN, and the nonlinear function is calculated by the electronic processor. Therefore, through time-division multiplexing, the SUANPAN architecture can in theory execute ANNs of arbitrary depth and arbitrary number of nodes. As a proof of principle, both single-layer and double-layer ANNs are performed with the SUANPAN architecture, as shown in Fig.4a and 4d, respectively. Here, the MNIST handwritten digit dataset is utilized, and stochastic gradient descent\u003csup\u003e40\u003c/sup\u003e (SGD) is utilized as the training method. The nonlinear function of the output layer is the Softmax function, and the nonlinear function of the hidden layer is the ReLU function. The weights of the single-layer ANN and double-layer ANN have 4-bit precision and 6-bit precision, respectively, chosen according to simulations (details are shown in Extended Data Fig.8). For the inference result of the single-layer ANN, the confusion matrices of the 10000 pictures in the test dataset calculated by computer and by SUANPAN are shown in Fig.4b and 4c, respectively. The classification accuracies are close: 88.08% for SUANPAN and 90.12% for the computer. It can be seen that the results processed by SUANPAN are quite consistent with those processed by a computer, and only a little deterioration is introduced. These results indicate a quite high computing accuracy in the SUANPAN architecture. 
It should be mentioned that the classification accuracy is not limited by the SUANPAN, but by the pre-trained model itself, which could be further improved through optimizing the network model and the training method. While for double-layer, only the first 100 pictures in MNIST test dataset are performed as a proof of principle verification. The confusion matrix and accuracy calculated by computer and SUANPAN are shown in Fig.4e and 4f, respectively. It can be noticed that the classification accuracy of double-layer ANN calculated by computer is much higher than one-layer, while that calculated by SUANPAN is lower than one-layer. The reason might be the device performance of MoTe\u003csub\u003e2\u003c/sub\u003e PD array has deteriorated after two months of testing. Anyway, we believe that the above results of ANN can still validate the feasibility of SUANPAN architecture.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this work, we have proposed and demonstrated the photonic SUANPAN architecture to perform the vector inner product operations. Instead of the reported optical computing architecture maximizing the interaction of light beams, we utilize independent emitter-detector pair to form a programmable, reconfigurable and scalable computing architecture that can be compatible with the existing computing system. Our fabricated SUANPAN with 64 pairs of VCSEL and MoTe\u003csub\u003e2\u003c/sub\u003e PD shows high computing fidelities for randomly generated vector inner products, and demonstrates two typical AI tasks of the Ising machine and ANN. For the Ising problem, 1024-dimensional problem is successfully solved, which is the highest dimensional optical Ising machine with heuristic algorithm. For ANN, a competitive classification accuracy of 84~88% is achieved for MNIST handwritten digit dataset. 
There are two main contributions in this work, the SUANPAN computing architecture and the Bit Encoding and Analog Detecting computing paradigm, which actually hold together.\u003c/p\u003e\n\u003cp\u003eFirstly, for the SUANPAN architecture, it breaks through the traditional mindset of obtaining optical matrix transformations through interaction of light beams. Instead, there is no interaction among those propagating light beams of all BEADs. Therefore, the SUANPAN architecture can be decomposed into BEADs as independent computing units. The scalability, reconfigurability and programmability of the SUANPAN architecture are only based on the duplication, recombination and modulation of BEAD without any additional cost. Compared with other optical matrix transformations through interaction between light beams, SUANPAN architecture possesses these following advantages: (1) With massive and industrial replication of BEADs, SUANPAN architecture can theoretically be infinitely scalable. (2) SUANPAN architecture can be flexibly reconfigured and programmed to perform various specific computing tasks. (3) The correction of SUANPAN architecture only considers the intensity of light beam (details are provided in Methods section), and there is no requirement to correct the phase of the light beams for interaction. (4) Even if one BEAD is damaged during fabrication or operation, it does not affect other BEADs, and the only result is decreasing the operating dimensionality. In short, the core idea of SUANPAN architecture is to decompose the whole computing architecture into simple and independent units, since no interaction among each unit would bring more scalability and programmability to the whole optical computing architecture.\u003c/p\u003e\n\u003cp\u003eSecondly, for the Bit Encoding and Analog Detecting computing paradigm, it provides a promising solution for optoelectronic analog-digital hybrid computing. 
For traditional DA conversion in optoelectronic computing, the digital electronic signal is converted to an analog optical signal in each single device. Therefore, each single device requires a DAC. However, in the Bit Encoding and Analog Detecting paradigm, the \u003cem\u003eM\u003c/em\u003e-bit digital electronic signal is converted to analog within a set of \u003cem\u003eM\u003c/em\u003e BEADs, where each BEAD only represents 1-bit information. Therefore, no DAC is required in such a computing paradigm. At the same time, only one ADC is required to convert the summed photocurrent into an electronic digital signal as the result. Therefore, the Bit Encoding and Analog Detecting computing paradigm greatly reduces the heavy burden introduced by ADCs and DACs. In fact, this is also an important factor for the arbitrary scalability of the SUANPAN architecture. Additionally, this computing paradigm can be extended beyond the SUANPAN architecture and introduced into other computing architectures to reduce the obstructive effects of ADCs and DACs.\u003c/p\u003e\n\u003cp\u003eFinally, our work is a preliminary verification of the feasibility of the SUANPAN architecture, and there is still a lot of room for improvement within the specific implementation. The light-emitter array and the 2D material PD array can be integrated into a single chip owing to the heterogeneous integration capability of 2D materials. In this way, the imaging system is no longer required to align the light-emitter array and PD array, and 3D stacking integration can be achieved to further expand the scalability. In addition, a detailed analysis is provided here for the future development of the computility. In order to evaluate the computility of the SUANPAN architecture, a figure of merit is defined here as BIPPS, which stands for bits of inner product per second. 
Compared to floating point operations per second (FLOPS), if the computational task is \u003cem\u003eM\u003c/em\u003e-bit quantization, then \u003cem\u003eM\u003c/em\u003e BIPPS is equivalent to 1 FLOPS. For the SUANPAN architecture, computility is determined by two factors, the BEAD size (\u003cem\u003eL\u003c/em\u003e\u003csub\u003eB\u003c/sub\u003e) and bandwidth (\u003cem\u003ef\u003c/em\u003e\u003csub\u003eB\u003c/sub\u003e). Since a BEAD consists of an emitter-detector pair, the BEAD size is defined as the larger of the light-emitter size and the PD size, while the BEAD bandwidth is defined as the smaller of the modulation bandwidth of the light-emitter and the response bandwidth of the PD. For a utilized chip area of 1 cm\u003csup\u003e2\u003c/sup\u003e, the computility is obtained as follows:\u003c/p\u003e\n\u003cp\u003e\u003cimg src=\"https://myfiles.space/user_files/58894_9946feeafa4c1df7/58894_custom_files/img1732519938.png\" width=\"376\" height=\"54\"\u003e\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003ewhere the first term equals the number of BEADs on a chip area of 1\u0026nbsp;cm\u003csup\u003e2\u003c/sup\u003e. Therefore, the computility can be increased by reducing the BEAD size and increasing the BEAD bandwidth, as shown in Fig.5. Given the ongoing research on various high-speed nano-lasers\u003csup\u003e41-43\u003c/sup\u003e and nano-detectors\u003csup\u003e44,45\u003c/sup\u003e, the computility could exceed 1P\u0026nbsp;BIPPS/cm\u003csup\u003e2\u003c/sup\u003e for BEAD size \u003cem\u003eL\u003c/em\u003e\u003csub\u003eB\u003c/sub\u003e\u0026lt;10 \u0026mu;m and BEAD bandwidth \u003cem\u003ef\u003c/em\u003e\u003csub\u003eB\u003c/sub\u003e\u0026gt;1 GHz. 
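\u003c/p\u003e\n\u003cp\u003eAs a quick check of the quoted figure, the boundary case can be worked through by hand. The computility equation is only available as an image here, so the form below is an assumption inferred from the statement that the first term is the number of BEADs on 1\u0026nbsp;cm\u003csup\u003e2\u003c/sup\u003e; the symbol \u003cem\u003eC\u003c/em\u003e is introduced only for this illustration.\u003c/p\u003e

```latex
% Assumed form: (number of BEADs on 1 cm^2) x (BEAD bandwidth)
C = \left(\frac{1\,\mathrm{cm}}{L_B}\right)^{2} f_B

% Boundary case L_B = 10 um, f_B = 1 GHz:
C = \left(\frac{10^{4}\,\mu\mathrm{m}}{10\,\mu\mathrm{m}}\right)^{2}
    \times 10^{9}\,\mathrm{s^{-1}}
  = 10^{6} \times 10^{9}
  = 10^{15}\ \mathrm{BIPPS/cm^{2}}
  = 1\ \mathrm{P\,BIPPS/cm^{2}}
```

\u003cp\u003eUnder this assumed form, any smaller BEAD size or larger bandwidth pushes the value above 1\u0026nbsp;P\u0026nbsp;BIPPS/cm\u003csup\u003e2\u003c/sup\u003e, consistent with the stated threshold.\u003c/p\u003e\n\u003cp\u003e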
Therefore, the SUANPAN architecture has the potential to serve as an attractive and practical photonic linear vector machine in the foreseeable future.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cstrong\u003eDevice fabrication of VCSEL\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe 850 nm VCSEL epitaxial structure consists of around 35 pairs of AlGaAs bottom DBR and 25 pairs of AlGaAs top DBR. An AlGaAs/InGaAs quantum well is used as the active region. A 98% AlGaAs layer is used to form the oxide aperture. After epitaxy, the wafer goes through P-metal deposition, ICP trench etching and wet oxidation. The N-metal is deposited on the backside of the N-type substrate to form a common cathode, while each individual emitter in the array is connected to a separate anode pad on the edge of the VCSEL array chip by electro-plated traces (Extended Data Fig.6a).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDevice fabrication of MoTe\u003csub\u003e2\u003c/sub\u003e PD\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe PDs are fabricated on a SiO\u003csub\u003e2\u003c/sub\u003e/Si substrate on which a 10 nm 2H MoTe\u003csub\u003e2\u003c/sub\u003e layer is directly grown (detailed fabrication process in Ref.\u003csup\u003e46,47\u003c/sup\u003e). First, the patterns are defined by ultraviolet lithography and transferred to the MoTe\u003csub\u003e2\u003c/sub\u003e/SiO\u003csub\u003e2\u003c/sub\u003e layer by reactive ion etching (SF\u003csub\u003e6\u003c/sub\u003e acts as the etching gas). Then, the Cr/Au electrodes (10 nm/50 nm) are fabricated using ultraviolet lithography, deposition and lift-off. The schematic diagram of the preparation process is shown in Extended Data Fig.2. To prevent degradation, the PDs are encapsulated with a 10 nm Al\u003csub\u003e2\u003c/sub\u003eO\u003csub\u003e3\u003c/sub\u003e layer grown by atomic layer deposition. 
For subsequent testing, the MoTe\u003csub\u003e2\u003c/sub\u003e PDs are connected to a self-designed printed circuit board (PCB) using wire-bonding technology (Extended Data Fig.6b).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eExperimental setup\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSchematics of the experimental setup are illustrated in Fig.1a and Extended Data Fig.7. Light from the VCSEL array with a wavelength of 850 nm is focused by a zoom lens onto the MoTe\u003csub\u003e2\u003c/sub\u003e PD array. The 8×8 VCSEL array and the PD array are aligned by the illumination optical path. Electrical and optoelectronic measurements of the fabricated MoTe\u003csub\u003e2\u0026nbsp;\u003c/sub\u003ePDs are carried out with a semiconductor parameter analyzer (PDA FS380) at room temperature in ambient conditions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRectification of SUANPAN\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDue to the fabrication error, the output intensity of each VCSEL and the output current of each PD may not be completely consistent under the same conditions. Therefore, it is necessary to rectify the entire architecture before performing calculation tasks. Firstly, the output dark current of each detector is adjusted to be consistent by changing the bias voltage on each detector. Secondly, under such bias voltage, the output photocurrent of each detector is adjusted to be consistent by changing the output intensity of each VCSEL. Then, it can be considered that all 64 BEADs are consistent.\u003c/p\u003e\n\u003cp\u003e46 \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Xu, X. et al. Millimeter-Scale Single-Crystalline Semiconducting MoTe\u003csub\u003e2\u003c/sub\u003e via Solid-to-Solid Phase Transformation. \u003cem\u003eJ. Am. Chem. Soc.\u003c/em\u003e \u003cstrong\u003e141\u003c/strong\u003e, 2128-2134 (2019).\u003c/p\u003e\n\u003cp\u003e47 \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Pan, Y. et al. 
Heteroepitaxy of semiconducting 2H-MoTe\u003csub\u003e2\u003c/sub\u003e thin films on arbitrary surfaces for large-scale heterogeneous integration. \u003cem\u003eNat. Synth.\u003c/em\u003e \u003cstrong\u003e1\u003c/strong\u003e, 701-708 (2022).\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data that support the findings of this study are available within the paper and the Extended Data. Other relevant data are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFunding from the National Key Research and Development Program of China (2023YFB2806703), the National Natural Science Foundation of China (Grant No. U22A6004, 92365210, 62175124) is greatly acknowledged. This work was also supported by Beijing National Research Center for Information Science and Technology (BNRist), Frontier Science Center for Quantum Information, Beijing academy of quantum information science, and Tsinghua University Initiative Scientific Research Program.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eZ.Y., Y.L. and X.F. conceived the idea. Z.Y. and C.L. designed and performed the simulations, experiments and data analysis. Y.R. and Y.Y. contributed to the growth of MoTe\u003csub\u003e2\u003c/sub\u003e layer on the SiO\u003csub\u003e2\u003c/sub\u003e/Si substrate. J.W. and C.J.C.-H. contributed to the fabrication of the VCSEL array. F.Q. assisted in building the electronic control system of SUANPAN architecture. H.S., K.C., F.L., W.Z. and C.N. provided useful discussions and comments. Z.Y., C.L., Y.L., X.F. and Y.H. wrote the paper. 
All authors revised and approved the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eCetinic, E. \u0026amp; She, J. Understanding and Creating Art with AI: Review and Outlook. \u003cem\u003eACM Trans. Multimed. Comput. Commun. Appl.\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 1-22 (2022).\u003c/li\u003e\n\u003cli\u003eRajpurkar, P. et al. AI in health and medicine. \u003cem\u003eNat. Med.\u003c/em\u003e \u003cstrong\u003e28\u003c/strong\u003e, 31-38 (2022).\u003c/li\u003e\n\u003cli\u003eLeCun, Y. et al. Deep learning. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e521\u003c/strong\u003e, 436-444 (2015).\u003c/li\u003e\n\u003cli\u003eWetzstein, G. et al. Inference in artificial intelligence with deep optics and photonics. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e588\u003c/strong\u003e, 39-47 (2020).\u003c/li\u003e\n\u003cli\u003eFu, T. et al. Optical neural networks: progress and challenges. \u003cem\u003eLight-Sci. Appl.\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 263 (2024).\u003c/li\u003e\n\u003cli\u003eMohseni, N. et al. Ising machines as hardware solvers of combinatorial optimization problems. \u003cem\u003eNat. Rev. Phys.\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 363-379 (2022).\u003c/li\u003e\n\u003cli\u003eLaydevant, J. et al. Training an Ising machine with equilibrium propagation. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 3671 (2024).\u003c/li\u003e\n\u003cli\u003eNikhar, S. et al. All-to-all reconfigurability with sparse and higher-order Ising machines. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 8977 (2024).\u003c/li\u003e\n\u003cli\u003eZhou, H. et al. 
Photonic matrix multiplication lights up photonic accelerator and beyond. \u003cem\u003eLight-Sci. Appl.\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 30 (2022).\u003c/li\u003e\n\u003cli\u003eGoodman, J. W. et al. Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms. \u003cem\u003eOpt. Lett.\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, 1-3 (1978).\u003c/li\u003e\n\u003cli\u003eSpall, J. et al. Fully reconfigurable coherent optical vector-matrix multiplication. \u003cem\u003eOpt. Lett.\u003c/em\u003e \u003cstrong\u003e45\u003c/strong\u003e, 5752-5755 (2020).\u003c/li\u003e\n\u003cli\u003eWang, T. et al. An optical neural network using less than 1 photon per multiplication. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 123 (2022).\u003c/li\u003e\n\u003cli\u003eReck, M. et al. Experimental realization of any discrete unitary operator. \u003cem\u003ePhys. Rev. Lett.\u003c/em\u003e \u003cstrong\u003e73\u003c/strong\u003e, 58-61 (1994).\u003c/li\u003e\n\u003cli\u003eShen, Y. et al. Deep learning with coherent nanophotonic circuits. \u003cem\u003eNat. Photonics\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 441-446 (2017).\u003c/li\u003e\n\u003cli\u003eRoques-Carmes, C. et al. Heuristic recurrent algorithms for photonic Ising machines. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 249 (2020).\u003c/li\u003e\n\u003cli\u003ePai, S. et al. Experimentally realized in situ backpropagation for deep learning in photonic neural networks. \u003cem\u003eScience\u003c/em\u003e \u003cstrong\u003e380\u003c/strong\u003e, 398-404 (2023).\u003c/li\u003e\n\u003cli\u003eLin, X. et al. All-optical machine learning using diffractive deep neural networks. \u003cem\u003eScience\u003c/em\u003e \u003cstrong\u003e361\u003c/strong\u003e, 1004-1008 (2018).\u003c/li\u003e\n\u003cli\u003eYan, T. et al. Fourier-space Diffractive Deep Neural Network. 
\u003cem\u003ePhys. Rev. Lett.\u003c/em\u003e \u003cstrong\u003e123\u003c/strong\u003e, 023901 (2019).\u003c/li\u003e\n\u003cli\u003eZhou, T. et al. Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit. \u003cem\u003eNat. Photonics\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 367-373 (2021).\u003c/li\u003e\n\u003cli\u003eFu, T. et al. Photonic machine learning with on-chip diffractive optics. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 70 (2023).\u003c/li\u003e\n\u003cli\u003eTait, A. N. et al. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing. \u003cem\u003eJ. Lightwave Technol.\u003c/em\u003e \u003cstrong\u003e32\u003c/strong\u003e, 4029-4041 (2014).\u003c/li\u003e\n\u003cli\u003eDeng, Y. \u0026amp; Chu, D. Coherence properties of different light sources and their effect on the image sharpness and speckle of holographic displays. \u003cem\u003eSci Rep\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 5893 (2017).\u003c/li\u003e\n\u003cli\u003eFeldmann, J. et al. All-optical spiking neurosynaptic networks with self-learning capabilities. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e569\u003c/strong\u003e, 208-214 (2019).\u003c/li\u003e\n\u003cli\u003eFeldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e589\u003c/strong\u003e, 52-58 (2021).\u003c/li\u003e\n\u003cli\u003eBai, B. et al. Microcomb-based integrated photonic processing unit. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 66 (2023).\u003c/li\u003e\n\u003cli\u003eDong, B. et al. Partial coherence enhances parallelized photonic computing. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e632\u003c/strong\u003e, 55-62 (2024).\u003c/li\u003e\n\u003cli\u003eKazanskiy, N. L. et al. Optical Computing: Status and Perspectives. 
\u003cem\u003eNanomaterials\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e (2022).\u003c/li\u003e\n\u003cli\u003eDan, Y. et al. Optoelectronic integrated circuits for analog optical computing: Development and challenge. \u003cem\u003eFront. Physics\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 1064693 (2022).\u003c/li\u003e\n\u003cli\u003eKim, S. et al. Neuro-CIM: ADC-Less Neuromorphic Computing-in-Memory Processor With Operation Gating/Stopping and Digital\u0026ndash;Analog Networks. \u003cem\u003eIEEE J. Solid-State Circuit\u003c/em\u003e \u003cstrong\u003e58\u003c/strong\u003e, 2931-2945 (2023).\u003c/li\u003e\n\u003cli\u003eChen, Y. et al. All-analog photoelectronic chip for high-speed vision tasks. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e623\u003c/strong\u003e, 48-57 (2023).\u003c/li\u003e\n\u003cli\u003eHuo, N. \u0026amp; Konstantatos, G. Recent Progress and Future Prospects of 2D‐Based Photodetectors. \u003cem\u003eAdv. Mater.\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 1801164 (2018).\u003c/li\u003e\n\u003cli\u003eAn, J. et al. Perspectives of 2D Materials for Optoelectronic Integration. \u003cem\u003eAdv. Funct. Mater.\u003c/em\u003e \u003cstrong\u003e32\u003c/strong\u003e, 2110119 (2021).\u003c/li\u003e\n\u003cli\u003eYou, J. et al. Hybrid/Integrated Silicon Photonics Based on 2D Materials in Optical Communication Nanosystems. \u003cem\u003eLaser Photon. Rev.\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 2000239 (2020).\u003c/li\u003e\n\u003cli\u003eLucas, A. Ising formulations of many NP problems. \u003cem\u003eFront. Physics\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, 5 (2014).\u003c/li\u003e\n\u003cli\u003eVan Laarhoven, P. J. \u0026amp; Aarts, E. H. \u003cem\u003eSimulated annealing: theory and application\u003c/em\u003e. (Springer, 1987).\u003c/li\u003e\n\u003cli\u003eOuyang, J. et al. 
On-demand photonic Ising machine with simplified Hamiltonian calculation by phase encoding and intensity detection. \u003cem\u003eCommun. Phys.\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 168 (2024).\u003c/li\u003e\n\u003cli\u003eYamamoto, Y. et al. Coherent Ising machines\u0026mdash;optical neural networks operating at the quantum limit. \u003cem\u003enpj Quantum Inform.\u003c/em\u003e \u003cstrong\u003e3\u003c/strong\u003e, 49 (2017).\u003c/li\u003e\n\u003cli\u003eHaribara, Y. et al. Computational Principle and Performance Evaluation of Coherent Ising Machine Based on Degenerate Optical Parametric Oscillator Network. \u003cem\u003eEntropy\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 151 (2016).\u003c/li\u003e\n\u003cli\u003eGoemans, M. X. \u0026amp; Williamson, D. P. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. \u003cem\u003eJ. ACM\u003c/em\u003e \u003cstrong\u003e42\u003c/strong\u003e, 1115-1145 (1995).\u003c/li\u003e\n\u003cli\u003eKetkar, N. \u003cem\u003eDeep learning with Python\u003c/em\u003e. (Springer, 2017).\u003c/li\u003e\n\u003cli\u003eJeong, K. Y. et al. Recent Progress in Nanolaser Technology. \u003cem\u003eAdv. Mater.\u003c/em\u003e \u003cstrong\u003e32\u003c/strong\u003e, 2001996 (2020).\u003c/li\u003e\n\u003cli\u003eDu, W. et al. Nanolasers Based on 2D Materials. \u003cem\u003eLaser Photon. Rev.\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 2000271 (2020).\u003c/li\u003e\n\u003cli\u003eZhang, Q. et al. Halide Perovskite Semiconductor Lasers: Materials, Cavity Design, and Low Threshold. \u003cem\u003eNano Lett.\u003c/em\u003e \u003cstrong\u003e21\u003c/strong\u003e, 1903-1914 (2021).\u003c/li\u003e\n\u003cli\u003eLong, M. et al. Progress, Challenges, and Opportunities for 2D Material Based Photodetectors. \u003cem\u003eAdv. Funct. Mater.\u003c/em\u003e \u003cstrong\u003e29\u003c/strong\u003e, 1803807 (2018).\u003c/li\u003e\n\u003cli\u003eLiu, C. et al. 
Silicon/2D-material photodetectors: from near-infrared to mid-infrared. \u003cem\u003eLight-Sci. Appl.\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 123 (2021).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":true,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-5401152/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5401152/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"Photonic linear operation is a promising approach to handle the extensive vector multiplications in artificial intelligence (AI) techniques due to the natural bosonic parallelism and high-speed information transmission of photonics. However, there is still no universal scalable photonic computing architecture that can be readily merged with existing electronic digital computing system. Even though it is believed that maximizing the interaction of the light beams is necessary to fully utilize the parallelism and tremendous efforts have been made in past decades, the achieved dimensionality of vector-matrix multiplication is very limited due to the difficulty of scaling up a tightly interconnected or highly coupled optical system. Here, we propose a programmable and reconfigurable photonic linear vector machine to perform only the inner product of two vectors, formed by a series of independent basic computing units, while each unit contains only one emitter-detector pair. The elemental values of the processed vectors are prepared by the time-space domain encoding. Specifically, one vector is encoded by the output duration of continuous light-emitter while the other is encoded as the position of the emitter-detector pair. The result of the inner product is obtained by the sum of photocurrents of all photodetectors. 
Since there is no interaction among light beams inside, extreme scalability could be achieved by simply multiplicating the independent basic computing unit without requiring large-scale analog-to-digital converter or digital-to-analog converter arrays. Our time-space domain encoding architecture is inspired by the traditional Chinese Suanpan or abacus, and thus is denoted as photonic SUANPAN. As a proof of principle, SUANPAN architecture is implemented with an 8×8 vertical cavity surface emission laser (VCSEL) array and an 8×8 MoTe\u003csub\u003e2\u003c/sub\u003e two-dimensional material photodetector array. The experimental computing fidelities for randomly generated vector inner products are all over 98% for 1-bit, 2-bit, 4-bit and 8-bit quantization and over 95% for 8~80 vector dimensionalities with 4-bit quantization. Two typical AI tasks of the Ising machine for non-deterministic polynomial-time (NP)-hard optimization problem and artificial neural network for visual perception are performed to demonstrate the ability of SUANPAN architecture. For the Ising problem, 1024-dimensional problems are successfully solved, which is the highest dimensional optical Ising machine with heuristic algorithm. For artificial neural network, a competitive classification accuracy of 84~88% is achieved for MNIST (Modified National Institute of Standards and Technology) handwritten digit dataset. We believe that our proposed photonic SUANPAN is capable of serving as a fundamental linear vector machine that can be readily merged with existing electronic digital computing system and is potential to enhance the computing power for future various AI applications.","manuscriptTitle":"SUANPAN: Scalable Photonic Linear Vector Machine","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-11-25 07:55:18","doi":"10.21203/rs.3.rs-5401152/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"dcf9aa11-e63d-4d8b-b269-9a65f4f03584","owner":[],"postedDate":"November 25th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":40139689,"name":"Physical sciences/Optics and photonics/Applied optics/Optoelectronic devices and components"},{"id":40139690,"name":"Physical sciences/Optics and photonics/Other photonics/Micro-optics"},{"id":40139691,"name":"Physical sciences/Nanoscience and technology/Nanoscale materials/Two-dimensional materials"},{"id":40139692,"name":"Physical sciences/Physics/Electronics, photonics and device physics/Photonic devices"}],"tags":[],"updatedAt":"2025-05-25T04:15:11+00:00","versionOfRecord":[],"versionCreatedAt":"2024-11-25 07:55:18","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-5401152","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5401152","identity":"rs-5401152","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}