Ultra-compact and efficient on-chip diffraction neural network based on dual optimization of physical constraints

Research Square

Weiqiang Ding, Yuhan Jiao, Yongyin Cao, Bojian Shi, Fangkui Sun, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7758851/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted.

Abstract

On-chip diffractive optical neural networks offer advantages for optical information processing but face fundamental challenges when theoretical scalar diffraction models fail to accurately predict vector electromagnetic wave propagation in real devices. Existing solutions compromise either integration density or computational efficiency. Here we show a dual-optimization approach that combines Gaussian-smoothing diffractive neural networks with angle correction to bridge this modeling gap.
Our method requires no extra training datasets and adds minimal computational overhead, with excellent generalizability. It reduces modeling errors, enhancing fidelity from 34.91% to 98.10% with mode purities reaching 93.39% and 90.37% in mode conversion tasks. Importantly, it maintains excellent performance even in ultra-compact architectures, achieving 97.77% fidelity at a layer spacing of only 20 µm, compared to approximately 300 µm required previously. This establishes a scalable framework for high-performance on-chip diffractive neural networks with complete physical interpretability for silicon photonics applications.

Physical sciences/Optics and photonics/Applied optics/Integrated optics
Physical sciences/Optics and photonics/Optical materials and structures/Silicon photonics

Introduction

Optical neural networks (ONNs) have emerged as a promising architecture for photonic computing, offering intrinsic advantages such as massive parallelism 1 – 3 , tunable parameters 4 – 5 , and direct trainability 6 – 8 . Unlike electronic neural networks (ENNs), ONNs process optical field information in real time during propagation, thereby eliminating the processor-memory bottleneck that limits speed and energy efficiency in traditional computing 9 – 12 . Among ONNs, diffractive optical neural networks (DONNs) are notable for enabling all-optical computation through coherent light diffraction 13 – 14 . DONNs inherently exhibit high throughput and ultra-broad bandwidth, operating with virtually no energy consumption 15 – 17 . These characteristics have made them attractive for applications in image classification, optical encryption, and real-time systems such as autonomous driving and augmented reality 13 – 19 . However, traditional free-space DONNs are bulky because they use discrete diffractive elements, making it hard to integrate them into compact systems.
Also, the complex alignment between discrete parts may cause additional errors 20 – 21 .

To address these limitations, researchers have developed on-chip diffractive optical neural networks (OC-DONNs) using standard silicon-on-insulator (SOI) platforms 22 – 25 . These metasurface-based systems enable precise wavefront shaping using compact subwavelength slot arrays 26 – 29 . On-chip diffractive optical components feature ultra-compact dimensions 29 , low scattering loss 30 , and broadband operation 31 – 32 . These advantages pave the way for silicon photonic computing chips operating at the speed of light. Such chips have promising applications in imaging, sensing, and quantum information processing 33 – 34 .

Nevertheless, the transition from free-space to on-chip architectures introduces fundamental challenges. The smaller and more densely packed neurons in on-chip architectures render conventional scalar diffraction theory inaccurate, severely limiting the computational accuracy, integration density, and practical applicability of OC-DONNs 26 – 28 , 35 – 38 .

Existing solutions to these challenges suffer from fundamental limitations. Increasing metaline spacing reduces coupling effects but expands the chip footprint and introduces additional propagation losses 26 – 29 . Other solutions rely on slot groups to suppress coupling at the expense of design flexibility and integration density 26 – 32 . Most critically, approaches that fit the device response with a neural network introduce computational overhead 31 , 34 , 39 , require extensive training datasets, exhibit poor generalization under fabrication variations, and operate as black boxes without providing physical insight into the underlying electromagnetic phenomena. These limitations prevent current OC-DONNs from accurately realizing theoretically predicted functions while maintaining compact architectures.
In this work, we propose a comprehensive dual-optimization strategy that physically addresses both near-field coupling and angular modeling challenges, significantly improving the fidelity between scalar diffraction theory and rigorous electromagnetic simulations. Our method employs Gaussian filtering to suppress near-field coupling between adjacent neurons while incorporating spatially varying transmission coefficients for oblique incidence. This physics-based approach eliminates reliance on external training data while ensuring computational efficiency and full physical interpretability. In mode conversion tasks, the fidelity improves from 34.91% to 98.10%, and remains at 97.77% even at 20 µm layer spacing. A three-layer GS-DONN using single-channel mode encoding achieves 96.67% accuracy on the Iris dataset, and this approach eliminates the need for separate physical channels, addressing the footprint limitations of intensity-encoded systems. The framework provides a computational advantage of 5–6 orders of magnitude over neural network methods while eliminating retraining requirements through inherent generalization capabilities. The demonstrated combination of compactness, high fidelity, computational efficiency, and universal applicability establishes a robust and scalable foundation for high-performance OC-DONNs, offering new opportunities in integrated photonic computing.

Results

Gaussian smoothing optimization

OC-DONNs employ significantly smaller neurons (~ 10⁻⁷ m) 13 compared to free-space implementations (~ 10⁻⁴ m) 32 . In compact on-chip architectures, the reduced inter-neuron spacing leads to strong near-field coupling, which cannot be accurately captured by scalar diffraction theory 22 . This discrepancy has become a key bottleneck limiting both the simulation accuracy and integration density of OC-DONNs. This limitation is clearly demonstrated in slot-array simulations.
For an array of 201 silicon dioxide (SiO₂) filled slots arranged along the y-direction (Fig. 1(a)), when \(L_{mid} \ne L_{side}\), the phase–slot length relationship obtained from the sweep model under perfectly matched layer (PML) boundary conditions deviates markedly from the periodic parameter sweep results of a single slot, with agreement occurring only at \(L_{mid} \approx L_{side}\). The deviation occurs because modifying one slot not only changes the local field but also induces scattering and interference in adjacent regions, disrupting the periodic assumption. This explains why slot groups can partially alleviate the mismatch.

This phenomenon shows that, in practical applications, the physical properties of individual slots cannot be considered in isolation; their interactions with surrounding slots must be taken into account. Traditional design methods often rely on training and optimization based on the independent characteristics of single slots, neglecting the coupling between neighboring slots and thus failing to reflect the true physical situation. To overcome this limitation, the design process needs to consider the combined influence of the target slot and its adjacent slots. This leads naturally to Gaussian convolution, which allows each slot's properties to be modulated by the influence of surrounding slots through spatial weighting.

Accordingly, we propose a Gaussian smoothing method that suppresses phase discontinuities between adjacent neurons to improve the fidelity between scalar diffraction theory and vector electromagnetic simulations. The method leverages the weighting property of Gaussian filters to smooth local fluctuations while preserving global trends. For large-scale implementation, the convolution is performed in the frequency domain using the fast Fourier transform (FFT) and its inverse (IFFT) 40 , with details provided in the Supporting Information Section S1.
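The frequency-domain Gaussian convolution described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (that is in Supporting Information Section S1, which is not reproduced here). The function name `gaussian_smooth_fft`, the choice to smooth the slot-length profile, the edge padding, and the random input data are all assumptions made for illustration; only the 0.5 µm pitch, the 201-slot array size, the 0.2–2.5 µm length range, and σ = 1.5 µm come from the text.

```python
import numpy as np


def gaussian_smooth_fft(values, sigma_um, pitch_um=0.5):
    """Smooth a 1D profile sampled at `pitch_um` with a Gaussian of std `sigma_um`."""
    values = np.asarray(values, dtype=float)
    n = values.size
    # Edge-pad by about 4 sigma so FFT wrap-around does not mix the two array ends.
    pad = int(np.ceil(4.0 * sigma_um / pitch_um))
    padded = np.pad(values, pad, mode="edge")
    # Fourier transform of a unit-area Gaussian: exp(-2 * pi^2 * sigma^2 * f^2).
    freqs = np.fft.fftfreq(padded.size, d=pitch_um)  # cycles per micrometre
    transfer = np.exp(-2.0 * (np.pi * sigma_um * freqs) ** 2)
    smoothed = np.fft.ifft(np.fft.fft(padded) * transfer).real
    return smoothed[pad:pad + n]


# Hypothetical input: 201 random slot lengths within the 0.2-2.5 um design range.
rng = np.random.default_rng(0)
raw_lengths = rng.uniform(0.2, 2.5, size=201)
smooth_lengths = gaussian_smooth_fft(raw_lengths, sigma_um=1.5)

# Variance of adjacent-slot length differences before and after smoothing.
print(np.var(np.diff(raw_lengths)), np.var(np.diff(smooth_lengths)))
```

Because the transfer function equals 1 at zero frequency, the mean of the profile is preserved while differences between neighboring slots are reduced.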
We quantify the consistency using a fidelity metric that treats both the prediction \(\left| \psi_{\mathrm{pre}} \right\rangle\) and the simulation \(\left| \psi_{\mathrm{sim}} \right\rangle\) as state vectors in Hilbert space:

$$\eta = \left| \left\langle \psi_{\mathrm{pre}} \middle| \psi_{\mathrm{sim}} \right\rangle \right|^{2} \tag{1}$$

As validated in Fig. 2, Gaussian smoothing improves fidelity from 34.91% to 98.10% (σ = 2.5 µm), demonstrating that this method can effectively bridge the gap between scalar diffraction theory and vector electromagnetic simulations. It also reduces the variance in adjacent slot length differences from 0.3106 µm² to 0.0036 µm² (98.8% reduction). Furthermore, the overall slot length distribution range contracts from 1.81 µm to 0.65 µm, a 64% reduction. This range reduction not only decreases manufacturing process complexity but also improves device fabrication tolerance, reducing performance deviations caused by manufacturing errors. These quantitative results validate the effectiveness of the Gaussian convolution method in OC-DONN design, establishing a solid foundation for achieving high-fidelity, manufacturable large-scale OC-DONNs.

Angle correction optimization

While Gaussian smoothing effectively suppresses near-field coupling, the requirements for system miniaturization and cost control in on-chip integration necessitate that OC-DONNs adopt compact interlayer spacing designs. In such compact architectures, short interlayer spacing, such as 20 µm, inevitably leads to large diffraction angles, often exceeding 40°. Under these conditions, the paraxial approximation underlying the scalar diffraction model breaks down, leading to substantial prediction errors.
Most existing studies employ the Rayleigh-Sommerfeld model assuming zero incident angle for analysis, but this simplification fails to accurately characterize the oblique incidence effects in actual optical fields. Therefore, this paper adopts the Kirchhoff diffraction formula, considering both incident and exit angles simultaneously, to better describe the optical behavior in multilayer diffractive structures. Details are provided in Supporting Information Section S5.

As shown in Fig. 3(a)-(b), the incident angle has a relatively minor effect on phase but a significant impact on transmittance: the actual transmittance drops sharply to below 1% when the angle exceeds 60°. However, previous studies typically assumed a constant transmission coefficient independent of the incident angle, a simplification that becomes invalid at large angles and severely compromises modeling fidelity. To address this limitation, we introduce \(\alpha(\theta^{\mathrm{in}}) \propto \exp\left( -\frac{(\theta^{\mathrm{in}})^{2}}{2c^{2}} \right)\) into the transmission function to correct the transmission coefficient at large angles. Here c is the standard deviation of the Gaussian function, and \(\theta^{\mathrm{in}}\) is dynamically estimated from the local wavevector gradient using the finite difference method (see Supporting Information Section S2 for details). This dynamic estimation enables accurate modeling of spatially varying transmission coefficients, which is particularly important for compact systems where paraxial approximations fail. The method provides a rigorous physical foundation for OC-DONN optimization, especially in ultra-compact designs.

As shown in Fig. 4, we compared different optimization strategies. In ultra-compact structures with a 20 µm interlayer spacing, the results demonstrate a progressive improvement in fidelity η from 41.68% to 97.77% using the dual optimization strategy.
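One way to read the angle correction is sketched below. The local transverse wavenumber is taken as the finite-difference gradient of the unwrapped field phase and converted to an incidence angle, which is then fed into the Gaussian amplitude factor. The exact estimator used by the authors is in Supporting Information Section S2 and may differ. The function names, the operating wavelength (1.55 µm), the slab effective index (2.85), the width c (25°), and the 0.1 µm sampling pitch are all assumptions chosen for illustration and are not stated in the text.

```python
import numpy as np


def local_incident_angle(field, pitch_um, wavelength_um, n_eff):
    """Estimate the local incidence angle (radians) from the transverse phase gradient."""
    k = 2.0 * np.pi * n_eff / wavelength_um      # wavenumber inside the slab
    phase = np.unwrap(np.angle(field))
    k_y = np.gradient(phase, pitch_um)           # finite-difference d(phase)/dy
    # The phase step between samples must stay below pi for unwrapping to be valid.
    return np.arcsin(np.clip(k_y / k, -1.0, 1.0))


def angle_corrected_amplitude(theta_in, c):
    """Gaussian amplitude factor alpha(theta) = exp(-theta^2 / (2 c^2)), up to a constant."""
    return np.exp(-np.square(theta_in) / (2.0 * c ** 2))


# Illustrative values only: wavelength, n_eff, c and the sampling pitch are assumptions.
wavelength_um, n_eff, c = 1.55, 2.85, np.deg2rad(25.0)
y = np.arange(400) * 0.1
k = 2.0 * np.pi * n_eff / wavelength_um
tilted = np.exp(1j * k * np.sin(np.deg2rad(20.0)) * y)   # plane wave tilted by 20 degrees
theta = local_incident_angle(tilted, 0.1, wavelength_um, n_eff)
alpha = angle_corrected_amplitude(theta, c)
```

For a tilted plane wave the estimator returns the tilt angle at every sample, so `alpha` is uniform. For a diverging beam it varies across the metaline, which is the spatially varying behavior the text describes.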
Each component of our approach targets a distinct physical limitation: Gaussian filtering reduces near-field coupling, while angle correction accurately captures the angular dependence of transmission coefficients. Compared to data-driven neural network corrections, this physics-based method requires no large training datasets, achieves better generalization, and enables immediate deployment across varied design scenarios.

Mode conversion

To validate our dual-optimization approach, we demonstrate its application in on-chip mode conversion. Unlike traditional mode converters designed for single-task transformations, our GS-DONN enables multi-task mode manipulation through a single trainable architecture. The network architecture incorporates two input waveguides with 15 µm center-to-center spacing, and the slot lengths are optimized using the combined Gaussian smoothing (σ = 1.5 µm) and angle correction method. Figure 5 demonstrates the effectiveness of the GS-DONN approach for mode conversion tasks. When TE₀₀ mode inputs are launched from different ports, the network converts them to the target TE₀₁ and TE₀₂ modes with high fidelity. The mode purity is calculated using the following formula:

$$M = \frac{\left| \int \left( E_{y}^{\mathrm{out}} \right)^{*} E_{y}^{\mathrm{tar}} \, dy \right|^{2}}{\int \left| E_{y}^{\mathrm{out}} \right|^{2} dy \int \left| E_{y}^{\mathrm{tar}} \right|^{2} dy} \tag{2}$$

where \(E_{y}^{\mathrm{out}}\) is the total output field and \(E_{y}^{\mathrm{tar}}\) is the field of the target mode. The mode purity reaches 93.39% and 90.37% for TE₀₁ and TE₀₂ conversions, respectively. Critically, the complex field distributions predicted by GS-DONN simulations show excellent agreement with var-FDTD results, achieving fidelities of η = 98.17% and η = 98.19%.
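The two figures of merit used so far, the fidelity of Eq. (1) and the mode purity of Eq. (2), can be evaluated on sampled fields as in the sketch below. The function names are hypothetical, and two choices are assumptions: the fields are sampled on a uniform grid, and the state vectors in Eq. (1) are normalized to unit length before taking the overlap. The sine-shaped test fields are placeholders, not the TE modes of the actual waveguide.

```python
import numpy as np


def fidelity(psi_pre, psi_sim):
    """Eq. (1): squared magnitude of the overlap of two unit-normalized complex fields."""
    a = np.asarray(psi_pre, dtype=complex)
    b = np.asarray(psi_sim, dtype=complex)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.abs(np.vdot(a, b)) ** 2)   # vdot conjugates its first argument


def mode_purity(e_out, e_tar, dy):
    """Eq. (2): normalized overlap integral evaluated as a Riemann sum with step dy."""
    overlap = np.sum(np.conj(e_out) * e_tar) * dy
    power_out = np.sum(np.abs(e_out) ** 2) * dy
    power_tar = np.sum(np.abs(e_tar) ** 2) * dy
    return float(np.abs(overlap) ** 2 / (power_out * power_tar))


# Placeholder fields on a midpoint grid across a hypothetical 4 um wide aperture.
n_pts, width_um = 200, 4.0
dy = width_um / n_pts
y = (np.arange(n_pts) + 0.5) * dy
mode_1 = np.sin(1 * np.pi * y / width_um)
mode_2 = np.sin(2 * np.pi * y / width_um)
mixed = 0.95 * mode_2 + 0.05 * mode_1          # mostly the target, with some crosstalk

print(mode_purity(mixed, mode_2, dy), fidelity(mixed, np.exp(0.3j) * mixed))
```

Both quantities are insensitive to a global phase and to overall scaling, so they measure only how well the field shapes agree.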
Intensity-encoded classification of the IRIS dataset

Building upon the basic optical manipulation demonstrated in mode conversion, we next evaluate GS-DONN's computational capabilities through classification tasks. Unlike simple mode transformations, classification requires the network to learn complex decision boundaries from training data, demonstrating the potential for GS-DONN to perform sophisticated computational tasks beyond conventional optical operations. For comparative evaluation, we employ the Iris dataset containing 150 samples across three classes (Iris setosa, versicolor, and virginica) with four features. The dataset is partitioned into training and testing sets using a stratified 4:1 split, yielding 120 training samples and 30 testing samples to ensure balanced class representation. For intensity encoding, the four-dimensional feature vectors are mapped to the input optical intensities, and the network is designed using the dual-optimization approach with Gaussian kernel width σ = 1.5 µm. As illustrated in Fig. 6, the optical fields are predominantly concentrated in their designated detector regions, confirming accurate classification by the GS-DONN. Moreover, the simulated output fields agree closely with the predicted patterns, with average fidelities of 90.85%, 93.27%, and 92.94%, respectively, validating the method's capability to enhance fidelity between scalar diffraction theory and vectorial electromagnetic simulations.

Mode-Encoded Classification of the IRIS Dataset

While intensity encoding successfully demonstrated classification capabilities, it requires one optical channel per input feature, limiting practicality for high-dimensional datasets. We next explore mode encoding to overcome this channel requirement constraint.
Compared with intensity encoding, where each feature requires a separate input channel, mode encoding enables multiple features to be encoded within a single waveguide through modal superposition, directly addressing the scalability challenge in integrated photonic computing. The input optical field can be represented as a linear superposition of a set of orthogonal mode basis functions:

$$E(r) = \sum_{n=1}^{N} \alpha_{n} \psi_{n}(r) \tag{3}$$

where \(\psi_{n}(r)\) denotes the n-th order orthogonal mode basis function, and \(\alpha_{n}\) is the corresponding complex amplitude coefficient. This approach significantly reduces the input channel count, enabling the four-dimensional Iris dataset to be processed using just one waveguide. Figure 7 presents the var-FDTD simulation results for each Iris class. Both architectures employ the dual-optimization framework with Gaussian smoothing (σ = 1.5 µm) and angle correction. Additional training details are provided in the Supporting Information Section S3. For the single-channel classification task, the GS-DONN recognition accuracies of the two-layer and three-layer networks reached 91.33% and 96.67%, respectively. In the dual-channel case, the corresponding accuracies were 90% and 96%. These results confirm the effectiveness of mode-encoded GS-DONNs in compact and scalable optical classification tasks.

Discussion

Computational Efficiency

Our dual-optimization approach achieves \(O(G_{t} \cdot S_{t} \cdot N_{t} \cdot \log N_{t})\) complexity, representing orders of magnitude improvement over neural network methods. Neural network approaches exhibit \(O(D_{nt} \cdot G_{nt} \cdot S_{nt} \cdot 4N_{nt}^{2}) + O(G_{t} \cdot S_{t} \cdot N_{t} \cdot G_{nt} \cdot S_{nt} \cdot 4N_{nt}^{2})\) complexity for each design scenario, where the quadratic dependence on \(N_{nt}\) creates severe computational bottlenecks. Here, \(G_{t}\), \(S_{t}\), and \(N_{t}\) represent the number of training iterations, the number of layers, and the number of neurons per layer of the OC-DONN training network, respectively.
\(G_{nt}\), \(S_{nt}\), \(N_{nt}\), and \(D_{nt}\) represent the number of training iterations, the number of layers, the number of neurons per layer, and the number of training samples of the fitting neural network, respectively. Detailed computational complexity analysis is provided in Supporting Information Section S4. For typical parameters (\(N_{t} = N_{nt} = 300\), \(S_{t} = S_{nt} = 3\), \(G_{t} = G_{nt} = 1000\), \(D_{nt} = 5000\)), our method requires approximately 10⁶ operations, while neural network training typically requires ~10¹²–10¹³ operations.

Table 1 Comparison between the Proposed OC-DONN and Other Integrated Works

| Source | Basic Units | Interlayer Distance (µm) | Goodness of fit with FDTD (R²) | Pre-training | OC-DONN Training | Unit structure size (µm²) | Integration in theory (neurons/mm²) |
|---|---|---|---|---|---|---|---|
| Shen 43 , Nat. Photonics | MZI | N/A | N/A | N/A | N/A | 55×220 | < 10 |
| Huang 36 , Nat. Electronics | MRRs | N/A | N/A | N/A | N/A | π×8² | ~ 2500 |
| Fu 32 , Nat. Commun. | SWs groups | 300 | 91.8% | None | None | 1.5×2 | ~ 2000 |
| Zarei 26 , Sci. Rep. | SWs groups | 300 | N/A | None | None | 1.5×2.5 | ~ 2700 |
| Wang 28 , Nat. Commun. | SWs groups | 100 | N/A | None | None | 1×2.5 | ~ 6000 |
| Shao 31 , Nanophotonics | SWs | 60 | N/A | O(D_nt·G_nt·S_nt·4N_nt²) | O(G_t·S_t·N_t·G_nt·S_nt·4N_nt²) | 0.3× ~ 2.5 | > 30,000 |
| Liu 34 , Optics Express | SWs | 15 | ~ 90% | O(D_nt·G_nt·S_nt·4N_nt²) | O(G_t·S_t·N_t·G_nt·S_nt·4N_nt²) | 0.5×3 | > 60,000 |
| Liu 39 , Light Sci. Appl. | SWs | 15 | ~ 90% | O(D_nt·G_nt·S_nt·4N_nt²) | O(G_t·S_t·N_t·G_nt·S_nt·4N_nt²) | 0.5×3 | > 60,000 |
| Our work | SWs | 20 | 95.13% | None | O(G_t·S_t·N_t·log N_t) | 0.5×2.5 | > 60,000 |

Table 1. MZI: Mach-Zehnder interferometer; MRR: micro-ring resonator; SWs: subwavelength slots; N/A indicates "Not Applicable". Goodness of fit R² = ESS / TSS, where ESS (Explained Sum of Squares) represents the variation explained by the regression model, and TSS (Total Sum of Squares) represents the total variation in the dependent variable.
For consistency with reference literature that uses R² to evaluate the match between scalar diffraction and electromagnetic simulation, our fidelity value (η = 97.77%) has been converted to R².

Generalization Capability

Fitting neural network methods suffer from fundamental generalization limitations, requiring complete retraining when materials, geometric parameters, or environmental conditions deviate from the training datasets 31 , 34 , 39 . Our physics-based approach inherently generalizes across varying conditions by embedding physical constraints into the model. For practical systems requiring adaptation to K different conditions, neural networks scale as \(K \cdot \left( O(D_{nt} \cdot G_{nt} \cdot S_{nt} \cdot 4N_{nt}^{2}) + O(G_{t} \cdot S_{t} \cdot N_{t} \cdot G_{nt} \cdot S_{nt} \cdot 4N_{nt}^{2}) \right)\) due to complete retraining requirements, while our method maintains constant \(O(G_{t} \cdot S_{t} \cdot N_{t} \cdot \log N_{t})\) complexity regardless of scenario count. This fundamental difference makes our approach well suited to practical deployment where adaptation to varying conditions is essential.

Mode Encoding Implementation

The ability to process four-dimensional Iris data using a single waveguide highlights the input efficiency of mode encoding. By mapping multiple features onto orthogonal optical modes, this approach eliminates the need for separate physical channels, addressing the footprint limitations of intensity-encoded systems. It enables substantial chip area savings without compromising accuracy, achieving 96.67% classification performance in three-layer networks. The high mode purity (93.39% for TE₀₁) further supports its applicability in advanced photonic systems, including mode-division multiplexing and quantum information processing, where compactness and modal fidelity are critical.
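The modal superposition of Eq. (3) can be made concrete with a short sketch that places four feature values on four orthonormal modes of one waveguide and recovers them by overlap integrals. Several points are assumptions for illustration: the hard-wall sine modes stand in for the actual TE modes of the input waveguide, the features are used directly as real amplitudes (the authors' mapping from features to \(\alpha_{n}\) is in Supporting Information Section S3 and is not reproduced here), and the function names, the 4 µm width, and the grid size are hypothetical.

```python
import numpy as np


def sine_mode_basis(num_modes, num_points, width_um):
    """Orthonormal hard-wall sine modes on a midpoint grid (stand-in for waveguide modes)."""
    dy = width_um / num_points
    y = (np.arange(num_points) + 0.5) * dy
    orders = np.arange(1, num_modes + 1)[:, None]
    modes = np.sqrt(2.0 / width_um) * np.sin(orders * np.pi * y / width_um)
    return modes, dy


def encode_features(features, modes):
    """Eq. (3): E(y) = sum_n alpha_n * psi_n(y), with alpha_n taken from the features."""
    alphas = np.asarray(features, dtype=complex)
    return alphas @ modes


def decode_features(field, modes, dy):
    """Recover alpha_n by projecting the field onto each orthonormal mode."""
    return (np.conj(modes) @ field) * dy


modes, dy = sine_mode_basis(num_modes=4, num_points=256, width_um=4.0)
sample = [5.1, 3.5, 1.4, 0.2]        # one four-feature Iris-style sample
field = encode_features(sample, modes)
recovered = decode_features(field, modes, dy)
print(np.round(recovered.real, 6))
```

A single field profile of 256 samples carries all four features, which is the channel-count saving the section describes. In a real device the attainable number of modes is bounded by the waveguide width.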
Limitations

The dual optimization method imposes constraints that require slots to satisfy the requirements of phase modulation and Gaussian smoothing simultaneously, which restricts design flexibility and may necessitate compensation through increased numbers of hidden layers and neurons. Through systematic analysis of key design parameters, including Gaussian kernel width, network layer count, and interlayer spacing, we established a comprehensive theoretical framework for performance optimization. The analysis shows that optimal system performance is achieved when σ reaches 1.5 to 2.5 µm, at which point fidelity approaches saturation. Based on this finding, this work adopts σ = 1.5 µm as the optimization parameter, which not only maintains stable high fidelity but also achieves the design objectives with the minimum number of network layers and the most compact interlayer spacing. It is noteworthy that this physics-based constraint optimization approach offers significant advantages over pure neural network methods: it effectively prevents generalization failures across different material systems and geometric configurations, and requires only a single system optimization to be applicable across various application scenarios, eliminating the need for repeated network retraining under different conditions.

Method

Model of OC-DONN

The OC-DONN designed herein adopts a layered architecture, comprising an input layer, multiple hidden layers (metalines), and an output layer. Each metaline consists of a one-dimensional array of subwavelength-scale slots filled with SiO₂, where the slot length acts as the trainable parameter while the width and thickness remain fixed. We define each slot as a single neuron, and the interconnection between neurons is realized via free-space diffraction between adjacent metalines. Figure 8(a) illustrates the structure of the OC-DONN.
The thickness of the silicon (Si) substrate t₃ is 3 µm, the SiO₂ insulator layer t₂ is 2 µm, and the Si waveguide layer t₁ is 0.22 µm. The center-to-center spacing d is 0.5 µm, with each slot having a fixed width w of 0.14 µm and thickness h of 0.22 µm. The architecture of an OC-DONN is similar to that of a conventional artificial neural network, comprising a forward propagation module, a backward propagation module, and a loss function for training (more details are given in Supporting Information Section S5). \(T_{p}^{m}(x_{p}, y_{p})\) represents the transmission function of the p-th neuron at position \((x_{p}, y_{p})\) in the m-th layer of the network, which can be described as:

$$T_{p}^{m}(x_{p}, y_{p}) = \alpha_{p}^{m}(\theta^{\mathrm{in}}) \cdot \exp\left\{ j \left[ \varphi_{p}^{m}(x_{p}, y_{p}) \right] \right\} \tag{4}$$

where \(\alpha_{p}^{m}(\theta^{\mathrm{in}})\) denotes the amplitude transmission coefficient, \(\theta^{\mathrm{in}}\) denotes the incident angle of the optical field on each neuron, and \(\varphi_{p}^{m}(x_{p}, y_{p})\) is the phase factor of the corresponding neuron. With the slot width fixed at 0.14 µm, continuous phase modulation from 0 to 2π is achieved by varying the slot length between 0.2 µm and 2.5 µm, reaching up to 96% transmission efficiency, as shown in Fig. 8(b).

Simulation

Full-wave electromagnetic simulations were performed using the variational finite-difference time-domain (var-FDTD) method to validate the effectiveness of the dual optimization approach. Appropriate grid resolution was adopted to discretize the computational domain, accurately capturing subwavelength features while maintaining computational efficiency. PML absorbing boundary conditions were applied at the computational domain boundaries to effectively eliminate reflections and simulate infinite space conditions.
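As a rough illustration of the scalar forward model in the "Model of OC-DONN" subsection, the sketch below applies the per-neuron transmission of Eq. (4) at each metaline and then propagates the field to the next layer. One substitution should be noted plainly: propagation here uses the FFT-based angular spectrum method, swapped in for simplicity, whereas the paper works with Rayleigh-Sommerfeld and Kirchhoff diffraction formulas. The wavelength (1.55 µm), slab effective index (2.85), input beam, and random phases are assumptions. The 0.5 µm pitch and 20 µm layer spacing come from the text.

```python
import numpy as np


def metaline_transmission(phase, alpha):
    """Eq. (4): T = alpha * exp(j * phi), evaluated per neuron."""
    return alpha * np.exp(1j * phase)


def propagate_angular_spectrum(field, pitch_um, distance_um, wavelength_um, n_eff):
    """Propagate a 1D in-plane field over distance_um (periodic boundaries via FFT)."""
    k = 2.0 * np.pi * n_eff / wavelength_um
    k_y = 2.0 * np.pi * np.fft.fftfreq(field.size, d=pitch_um)
    # Complex square root: evanescent components (|k_y| > k) decay instead of growing.
    k_z = np.sqrt((k ** 2 - k_y ** 2).astype(complex))
    return np.fft.ifft(np.fft.fft(field) * np.exp(1j * k_z * distance_um))


def forward(field, phases, alphas, pitch_um, spacing_um, wavelength_um, n_eff):
    """Apply each metaline, then diffract to the next plane."""
    for phase, alpha in zip(phases, alphas):
        field = field * metaline_transmission(phase, alpha)
        field = propagate_angular_spectrum(field, pitch_um, spacing_um, wavelength_um, n_eff)
    return field


# Hypothetical three-layer network with 300 neurons per layer and random phases.
rng = np.random.default_rng(1)
n = 300
y = (np.arange(n) - n / 2) * 0.5
field_in = np.exp(-(y / 5.0) ** 2).astype(complex)      # Gaussian input beam
phases = rng.uniform(0.0, 2.0 * np.pi, size=(3, n))
field_out = forward(field_in, phases, np.ones((3, n)), pitch_um=0.5,
                    spacing_um=20.0, wavelength_um=1.55, n_eff=2.85)
```

With unit amplitude coefficients the model is lossless, so the output carries the same total power as the input. Replacing the ones with an angle-dependent \(\alpha\) introduces the attenuation at oblique incidence discussed in the Results.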
Network Training

For the mode conversion task, we use the modified loss function for complex-valued data proposed in previous work 16 :

$$\left\{ \begin{aligned} \mathcal{L}_{1} &= \frac{1}{N_{s}} \sum_{\tau} \left( Y_{\mathrm{tar}}^{(\tau)} - e^{j\varphi} X_{\mathrm{out}}^{(\tau)} \right)^{\dagger} \left( Y_{\mathrm{tar}}^{(\tau)} - e^{j\varphi} X_{\mathrm{out}}^{(\tau)} \right) \\ e^{j\varphi} &= \frac{\left( X_{\mathrm{out}}^{(\tau)} \right)^{\dagger} Y_{\mathrm{tar}}^{(\tau)}}{\left| \left( X_{\mathrm{out}}^{(\tau)} \right)^{\dagger} Y_{\mathrm{tar}}^{(\tau)} \right|} \end{aligned} \right. \tag{5}$$

Here, \(N_{s}\) denotes the total number of training samples, while \(Y_{\mathrm{tar}}^{(\tau)}\) and \(X_{\mathrm{out}}^{(\tau)}\) represent the target and output optical fields, both containing amplitude and phase information. The loss reaches its minimum when the target field \(Y_{\mathrm{tar}}^{(\tau)}\) and the output field \(X_{\mathrm{out}}^{(\tau)}\) are in the same state up to a global phase, so gradient descent on this loss drives the output toward the target. This indicates that the loss function satisfies the requirements of the mode conversion task.

For classification tasks, the cross-entropy loss function is used 19 :

$$\left\{ \begin{aligned} F^{(\alpha,\beta)} &= \left( M^{(\beta)} \right)^{T} I_{\mathrm{out}}^{(\alpha)} \\ P^{(\alpha,\beta)} &= \frac{\exp\left( F^{(\alpha,\beta)} \right)}{\sum_{\gamma} \exp\left( F^{(\alpha,\gamma)} \right)} \\ \mathcal{L}_{2} &= -\frac{1}{N_{s}} \sum_{\alpha,\beta} Q^{(\alpha,\beta)} \log\left( P^{(\alpha,\beta)} \right) \end{aligned} \right. \tag{6}$$

where \(M^{(\beta)}\) denotes the label corresponding to the β-th detector. The indicator (Kronecker delta) \(Q^{(\alpha,\beta)}\) equals 1 if the label of the α-th sample belongs to class β, and 0 otherwise.
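The two training losses can be written compactly in NumPy, as in the sketch below. It follows Eq. (5) and Eq. (6) as stated, with the softmax normalized over the detector (class) index for each sample. The function names, array layouts, and toy inputs are assumptions for illustration. The sketch only evaluates the losses and does not implement the gradient-based training itself.

```python
import numpy as np


def phase_aligned_loss(x_out, y_tar):
    """Eq. (5). x_out, y_tar: complex arrays of shape (N_s, n_points)."""
    inner = np.sum(np.conj(x_out) * y_tar, axis=1)        # X^dagger Y for each sample
    mag = np.abs(inner)
    safe = np.where(mag > 0, mag, 1.0)
    phase = np.where(mag > 0, inner / safe, 1.0)          # e^{j phi}; 1 if overlap is zero
    diff = y_tar - phase[:, None] * x_out
    return float(np.mean(np.sum(np.abs(diff) ** 2, axis=1)))


def detector_cross_entropy(intensity, masks, labels):
    """Eq. (6). intensity: (N_s, n_points); masks: (n_classes, n_points); labels: (N_s,) ints."""
    scores = intensity @ masks.T                          # F^(alpha, beta)
    scores = scores - scores.max(axis=1, keepdims=True)   # shift for numerical stability
    log_p = scores - np.log(np.sum(np.exp(scores), axis=1, keepdims=True))
    return float(-np.mean(log_p[np.arange(labels.size), labels]))


# Toy data: targets equal the outputs up to a per-sample global phase.
rng = np.random.default_rng(2)
x_out = rng.normal(size=(5, 64)) + 1j * rng.normal(size=(5, 64))
y_tar = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(5, 1))) * x_out
print(phase_aligned_loss(x_out, y_tar))

# Three detector regions of 10 samples each on a 30-point output plane.
masks = np.kron(np.eye(3), np.ones(10))
labels = np.array([0, 1, 2])
on_target = 5.0 * masks[labels]                           # light focused on the right detector
print(detector_cross_entropy(on_target, masks, labels))
```

The first loss vanishes when output and target differ only by a global phase, matching the statement that the minimum is reached when both fields are in the same state. The second decreases as more light lands on the detector assigned to the true class.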
The predicted probability \({P^{(\alpha ,\beta )}}\) indicates the likelihood that sample α is classified as class β , which is computed via the softmax function. Declarations Additional information We provide more details on the computational complexity of the on-chip optical processor with detailed calculation procedures, the network parameter analysis and performance optimization of the optical neural network architecture. Acknowledgments National Natural Science Foundation of China (Grant No. 12274105, 12574336, 12574414, 12504380, 12504438); Heilongjiang Provincial Natural Science Foundation of China (Grant No. LH2023A006); National Key Laboratory of Laser Spatial Information Foundation (LSI2025ZZKY04); the China Postdoctoral Science Foundation (2025M774294). Author Contributions W.D. and Q.J. conceived the idea and designed the experiments, Y.J. and C.Y. performed the theoretical simulations, Y.J. wrote the paper, and all authors contributed analysis tools. Data Availability The data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request. Code Availability The code underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request. Competing financial interests The authors declare no competing financial interests. References Zhou, T.; Chang, L.; Gu, Y.; Wang, L.; Zhang, S.; Li, H.; Luo, Y.; Xie, Z.; Huang, H.; Kong, W.; Weng, C.; Liu, H.; Zhao, J.; Wang, J.; Gu, T.; et al. Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit. Nat. Photonics. 15, 367 – 373(2021). Feldmann, J.; Youngblood, N.; Wright, C. D.; Bhaskaran, H.; Pernice, W. H. P. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature. 569, 208 – 214(2019). Shen, Y.; Harris, N. 
C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D. Deep learning with coherent nanophotonic circuits. Nat. Photonics. 11, 441–446(2017). Khoram, E.; Chen, A.; Liu, D.; Ying, L.; Wang, Q.; Yuan, M.; Yu, Z. Nanophotonic media for artificial neural inference. Photonics Res. 7, 823–827(2019). Xue, Z.; Zhou, T.; Xu, Z.; Ma, M.; Huang, J.; Shen, Y.; Englund, D.; Fan, S.; Solgaard, O.; Miller, D. A. B. Fully forward mode training for optical neural networks. Nature. 632, 280–286(2024). Williamson, I. A.; Hughes, T. W.; Minkov, M.; Bartlett, B.; Pai, S.; Fan, S. Reprogrammable electro-optic nonlinear activation functions for optical neural networks. IEEE J. Sel. Top. Quantum Electron. 26, 1 – 12(2020). Hughes, T. W.; Minkov, M.; Shi, Y.; Fan, S. Training of photonic neural networks through in situ backpropagation and gradient measurement. Optica. 5, 864 – 871(2018). Hu, J.; Mengu, D.; Tzarouchis, D. C.; Edwards, B.; Luo, Y.; Rivenson, Y.; Ozcan, A. Diffractive optical computing in free space. Nat. Commun. 15, 1525(2024). Fu, F.; Huo, D.; Zang, Z.; Lou, Y.; Wang, S.; Gu, Z.; Liu, D.-S.; Duan, X.; Wang, D.; Liu, X.; Qi, J.; Yu, S.; Du, Q.; Chen, G.; Lu, C.; Yu, Y.; Ren, X.; Yuan, X. Symbiotic evolution of photonics and artificial intelligence: a comprehensive review. Adv. Photon.7, 024001(2025). Bueno, J.; Maktoobi, S.; Froehly, L.; Fischer, I.; Jacquot, M.; Larger, L.; Brunner, D. Reinforcement learning in a large-scale photonic recurrent neural network. Optica. 5, 756 – 760(2018). Xu, Z.; Zhou, T.; Ma, M.; Shen, Y.; Huang, J.; Zhang, C.; Zou, X.; Liu, W.; Dai, Q.; Jia, B. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science. 384, 202–209(2024). Mengu, D.; Luo, Y.; Rivenson, Y.; Ozcan, A. Analysis of diffractive optical neural networks and their integration with electronic neural networks. IEEE J. Sel. Top. Quantum Electron. 26, 2921376(2020). 
Lin, X.; Rivenson, Y.; Yardimci, N. T.; Veli, M.; Luo, Y.; Jarrahi, M.; Ozcan, A. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).
Yan, T.; Wu, J.; Zhou, T.; Xie, H.; Xu, F.; Fan, J.; Fang, L.; Lin, X.; Dai, Q. Fourier-space diffractive deep neural network. Phys. Rev. Lett. 123, 023901 (2019).
Yuan, X.; Wang, Y.; Xu, Z.; et al. Training large-scale optoelectronic neural networks with dual-neuron optical-artificial learning. Nat. Commun. 14, 7110 (2023).
Zhou, T.; Fang, L.; Yan, T.; Wu, J.; Li, Y.; Fan, J.; Wu, H.; Lin, X.; Dai, Q. In situ optical backpropagation training of diffractive optical neural networks. Photonics Res. 8, 940 (2020).
Chen, H.; Feng, J.; Jiang, M.; Wang, Y.; Lin, J.; Tan, J.; Jin, P. Diffractive deep neural networks at visible wavelengths. Engineering 7, 1483–1491 (2021).
Jia, Q.; Zhang, Y.; Shi, B.; Li, H.; Li, X.; Feng, R.; Sun, F.; Cao, Y.; Wang, J.; Qiu, C.-W.; Ding, W. Vector vortex beams sorting of 120 modes in visible spectrum. Nanophotonics 12, 3955–3962 (2021).
Jia, Q.; Shi, B.; Zhang, Y.; Li, H.; Li, X.; Feng, R.; Sun, F.; Cao, Y.; Wang, J.; Qiu, C.-W.; Gu, M.; Ding, W. Partially coherent diffractive optical neural network. Optica 11, 1742–1749 (2024).
Chen, Y.; Lin, Y.; Zhao, Y.; Wang, J.; Zhao, R.; Huang, Y. All-analog photoelectronic chip for high-speed vision tasks. Nature 623, 48–57 (2023).
Fu, T.; Zhang, J.; Sun, R.; Zhou, H.; Luo, J.; Zhang, L.; Dai, D. Optical neural networks: progress and challenges. Light Sci. Appl. 13, 263 (2024).
Luo, X.; Hu, Y.; Ou, X.; Li, X.; Lai, J.; Liu, N.; Cheng, X.; Pan, A.; Deng, H. Metasurface-enabled on-chip multiplexed diffractive neural networks in the visible. Light Sci. Appl. 11, 158 (2022).
Feldmann, J.; Youngblood, N.; Karpov, M.; Gehring, H.; Li, X.; Stappers, M.; Le Gallo, M.; Fu, X.; Lukashchuk, A.; Raja, A. S.; et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021).
Wang, Z.; Li, T.; Soman, A.; Mao, D.; Kananen, T.; Gu, T. On-chip wavefront shaping with dielectric metasurface. Nat. Commun. 10, 3547 (2019).
Wang, Y.; Lin, W.; Duan, S.; Li, C.; Zhang, H.; Liu, B. On-chip reconfigurable diffractive optical neural network based on Sb2S3. Opt. Express 33, 1810–1826 (2025).
Zarei, S.; Khavasi, A. Realization of optical logic gates using on-chip diffractive optical neural networks. Sci. Rep. 12, 15747 (2022).
Fu, T.; Zang, Y.; Huang, H.; Du, Z.; Hu, C.; Chen, M.; Yang, S.; Chen, H. On-chip photonic diffractive optical neural network based on a spatial domain electromagnetic propagation model. Opt. Express 29, 31924–31940 (2021).
Wang, Z.; Chang, L.; Wang, F.; Li, T.; Gu, T. Integrated photonic metasystem for image classifications at telecommunication wavelength. Nat. Commun. 13, 2131 (2022).
Sun, R.; Fu, T.; Huang, Y.; Liu, W.; Du, Z.; Chen, H. Multimode diffractive optical neural network. Adv. Photon. Nexus 3, 026007 (2024).
Zhu, H.; Zou, J.; Zhang, H.; Shi, Y.; Luo, S.; Wang, N.; Cai, H.; Wan, L.; Wang, B.; Jiang, X. Space-efficient optical computing with an integrated chip diffractive neural network. Nat. Commun. 13, 1044 (2022).
Shao, G.; Zhou, T.; Yan, T.; Guo, Y.; Zhao, Y.; Huang, R.; Fang, L. Reliable, efficient, and scalable photonic inverse design empowered by physics-inspired deep learning. Nanophotonics, in press (2025).
Fu, T.; Zang, Y.; Huang, Y.; Du, Z.; Hu, C.; Chen, M.; Yang, S.; Chen, H. Photonic machine learning with on-chip diffractive optics. Nat. Commun. 14, 70 (2023).
Zarei, S.; Marzban, M. R.; Khavasi, A. Integrated photonic neural network based on silicon metalines. Opt. Express 28, 36668–36684 (2020).
Liu, W.; Fu, T.; Huang, Y.; Sun, R.; Yang, S.; Chen, H. C-DONN: compact diffractive optical neural network with deep learning regression. Opt. Express 31, 22127–22143 (2023).
Zhang, H.; Gu, M.; Jiang, X.; Thompson, J.; Cai, H.; Paesani, S.; Santagati, R.; Laing, A.; Zhang, Y.; Yung, M.
An optical neural chip for implementing complex-valued neural network. Nat. Commun. 12, 457 (2021).
Huang, C.; Fujisawa, S.; de Lima, T. F.; Tait, A. N.; Blow, E. C.; Tian, Y.; Bilodeau, S.; Jha, A.; Yaman, F.; Peng, H.-T.; et al. A silicon photonic-electronic neural network for fibre nonlinearity compensation. Nat. Electron. 4, 837–844 (2021).
Huang, Y.; Fu, T.; Huang, H.; Yang, S.; Chen, H. Sophisticated deep learning with on-chip optical diffractive tensor processing. Photonics Res. 11, 1125–1138 (2023).
Zhang, Z.; Xiao, S.; Song, Q.; et al. Scalable on-chip diffractive speckle spectrometer with high spectral channel density. Light Sci. Appl. 14, 130 (2025).
Liu, W.; Huang, Y.; Sun, R.; Fu, T.; Yang, S.; Chen, H. Ultra-compact multi-task processor based on in-memory optical computing. Light Sci. Appl. 14, 134 (2025).
Gonzalez, R. C.; Woods, R. E. Digital Image Processing, 4th ed.; Pearson: Upper Saddle River, NJ (2018).
Chen, H.; Feng, J.; Jiang, M.; Wang, Y.; Lin, J.; Tan, J.; Chen, P. Diffractive deep neural networks at visible wavelengths. Engineering 7, 1483–1491 (2021).
Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA (2016).
Shen, Y.; Harris, N. C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D.; Soljačić, M. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).

Additional Declarations
There is NO Competing Interest.

Supplementary Files
UltracompactandefficientonchipdiffractionneuralnetworkbasedondualoptimizationofphysicalconstraintsSupportingInformation.docx
For an array of 201 silicon dioxide (SiO\u003csub\u003e2\u003c/sub\u003e) filled slots arranged along the y-direction (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e(a)), when L\u003csub\u003emid\u003c/sub\u003e \u0026ne; L\u003csub\u003eside\u003c/sub\u003e, the phase\u0026ndash;slot length relationship obtained from the sweep model under perfectly matched layer (PML) boundary conditions deviates markedly from the periodic parameter sweep results of a single slot, with agreement occurring only at L\u003csub\u003emid\u003c/sub\u003e \u0026asymp; L\u003csub\u003eside\u003c/sub\u003e. The deviation occurs because modifying one slot not only changes the local field but also induces scattering and interference in adjacent regions, which violates the periodicity assumption. This explains why slot groups can partially alleviate the mismatch.\u003c/p\u003e\u003cp\u003eThis phenomenon reveals that in practical applications, one cannot consider the physical properties of individual slots in isolation, but must account for their interactions with surrounding slots. Traditional design methods often rely on training and optimization based on the independent characteristics of single slots, neglecting the coupling effects between neighboring slots and thus failing to accurately reflect the true physical situation. To overcome this limitation, the design process needs to consider the combined influence of the target slot and its adjacent slots. Based on this understanding, we turn to Gaussian convolution methods, which allow each slot's properties to be modulated by the influence of surrounding slots through spatial weighting.\u003c/p\u003e\u003cp\u003eBased on this approach, we propose a Gaussian smoothing method that suppresses phase discontinuities between adjacent neurons to improve the fidelity between scalar diffraction theory and vector electromagnetic simulations. 
The method leverages the weighting property of Gaussian filters to smooth local fluctuations while preserving global trends. For large-scale implementation, the convolution is performed in the frequency domain using Fast Fourier transform (FFT) and its inverse (IFFT)\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e, with details provided in the Supporting Information Section S1. We quantify the consistency using a fidelity metric that treats both prediction \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\left| {{\\psi _{\\varvec{pre}}}} \\right\\rangle\\)\u003c/span\u003e\u003c/span\u003e and simulation \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\left| {{\\psi _{\\varvec{sim}}}} \\right\\rangle\\)\u003c/span\u003e\u003c/span\u003e as state vectors in Hilbert space:\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$\\eta ={\\left| {\\left\\langle {{{\\psi _{\\varvec{pre}}}}} \\mathrel{\\left | {\\vphantom {{{\\psi _{\\varvec{pre}}}} {{\\psi _{\\varvec{sim}}}}}} \\right. \\kern-0pt} {{{\\psi _{\\varvec{sim}}}}} \\right\\rangle } \\right|^\\varvec{2}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eAs validated in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, Gaussian smoothing improves fidelity from 34.91% to 98.10% (\u003cem\u003eσ\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2.5 \u0026micro;m), demonstrating that this method can effectively bridge the gap between scalar diffraction theory and vector electromagnetic simulations. It also reduces the variance in adjacent slot length differences from 0.3106 \u0026micro;m\u0026sup2; to 0.0036 \u0026micro;m\u003csup\u003e2\u003c/sup\u003e (98.8% reduction). 
Furthermore, the overall slot length distribution range contracts from 1.81 \u0026micro;m to 0.65 \u0026micro;m, representing a 64% reduction. This significant range reduction not only decreases manufacturing process complexity but also improves device fabrication tolerance, reducing performance deviations caused by manufacturing errors. These quantitative results validate the effectiveness of the Gaussian convolution method in OC-DONN design, establishing a solid foundation for achieving high-fidelity, manufacturable large-scale OC-DONNs.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eAngle correction optimization\u003c/h3\u003e\n\u003cp\u003eWhile Gaussian smoothing effectively suppresses near-field coupling, the requirements for system miniaturization and cost control in on-chip integration necessitate that OC-DONNs adopt compact interlayer spacing designs. In such compact architectures, short interlayer spacing, such as 20 \u0026micro;m, inevitably leads to large diffraction angles, often exceeding 40\u0026deg;. Under these conditions, the paraxial approximation underlying the scalar diffraction model breaks down, leading to substantial prediction errors. Most existing studies employ the Rayleigh-Sommerfeld model assuming zero incident angle for analysis, but this simplification fails to accurately characterize the oblique incidence effects in actual optical fields. Therefore, this paper adopts the Kirchhoff diffraction formula, considering both incident and exit angles simultaneously, to better describe the optical behavior in multilayer diffractive structures. Details are provided in Supporting Information Section S5. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e(a)-(b), incident angles have a relatively minor effect on phase but significantly impact transmittance: the actual transmittance drops sharply below 1% when angles exceed 60\u0026deg;. 
However, previous studies typically assumed a constant transmission coefficient independent of the incident angle, a simplification that becomes invalid at large angles and severely compromises modeling fidelity.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo address this limitation, we introduce \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\alpha ({\\theta ^{\\varvec{in}}}) \\propto \\exp (\\frac{{ - {{({\\theta ^{\\varvec{in}}})}^2}}}{{2{c^2}}})\\)\u003c/span\u003e\u003c/span\u003e into the transmission function to correct the transmission coefficient at large angles. Here, \u003cem\u003ec\u003c/em\u003e represents the standard deviation of the Gaussian function, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\theta ^{\\varvec{in}}}\\)\u003c/span\u003e\u003c/span\u003e is dynamically estimated from the local wavevector gradient using the finite difference method (see Supporting Information Section S2 for details). This dynamic estimation enables accurate modeling of spatially-varying transmission coefficients, which is particularly important for compact systems where paraxial approximations fail. The method provides a rigorous physical foundation for OC-DONN optimization, especially in ultra-compact designs.\u003c/p\u003e\u003cp\u003eAs shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, we compared different optimization strategies. In ultra-compact structures with a 20 \u0026micro;m interlayer spacing, the results demonstrate a progressive improvement in fidelity \u003cem\u003eη\u003c/em\u003e from 41.68% to 97.77% using dual optimization strategies. Each component of our approach targets a distinct physical limitation: Gaussian filtering reduces near-field coupling, while angle correction accurately captures the angular dependence of transmission coefficients. 
Compared to data-driven neural network corrections, this physics-based method requires no large training datasets, achieves better generalization, and enables immediate deployment across varied design scenarios.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eMode conversion\u003c/h3\u003e\n\u003cp\u003eTo validate our dual-optimization approach, we demonstrate its application in on-chip mode conversion. Unlike traditional mode converters designed for single-task transformations, our GS-DONN enables multi-task mode manipulation through a single trainable architecture.\u003c/p\u003e\u003cp\u003eThe GS-DONN designed for this task incorporates two input waveguides with 15 \u0026micro;m center-to-center spacing, and the slot lengths are optimized using the combined Gaussian smoothing (\u003cem\u003eσ\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.5 \u0026micro;m) and angle correction method.\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e demonstrates the effectiveness of the GS-DONN approach for mode conversion tasks. When TE\u003csub\u003e00\u003c/sub\u003e mode inputs are launched from different ports, the network successfully converts them to target TE\u003csub\u003e01\u003c/sub\u003e and TE\u003csub\u003e02\u003c/sub\u003e modes with exceptional fidelity. 
The mode purity is calculated using the following formula:\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$$M=\\frac{{{{\\left| {\\int {E{{_{y}^{{\\varvec{out}}}}^ * }E_{y}^{{\\varvec{tar}}}\\varvec{d}y} } \\right|}^2}}}{{\\int {{{\\left| {E_{y}^{{\\varvec{out}}}} \\right|}^2}\\varvec{d}y} \\int {{{\\left| {E_{y}^{{\\varvec{tar}}}} \\right|}^2}\\varvec{d}y} }}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(E_{y}^{{\\varvec{out}}}\\)\u003c/span\u003e\u003c/span\u003e is the total output field, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(E_{y}^{{\\varvec{tar}}}\\)\u003c/span\u003e\u003c/span\u003e is the field of the target mode. The mode purity reaches 93.39% and 90.37% for TE\u003csub\u003e01\u003c/sub\u003e and TE\u003csub\u003e02\u003c/sub\u003e conversions, respectively. Critically, the complex field distributions predicted by GS-DONN simulations show excellent agreement with var-FDTD results, achieving fidelities of \u003cem\u003eη\u003c/em\u003e\u0026thinsp;=\u0026thinsp;98.17% and \u003cem\u003eη\u003c/em\u003e\u0026thinsp;=\u0026thinsp;98.19%.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eIntensity-encoded classification of the Iris dataset\u003c/h3\u003e\n\u003cp\u003eBuilding upon the basic optical manipulation demonstrated in mode conversion, we next evaluate GS-DONN's computational capabilities through classification tasks. 
Unlike simple mode transformations, classification requires the network to learn complex decision boundaries from training data, demonstrating the potential for GS-DONN to perform sophisticated computational tasks beyond conventional optical operations.\u003c/p\u003e\u003cp\u003eFor comparative evaluation, we employ the Iris dataset containing 150 samples across three classes (Iris setosa, versicolor, and virginica) with four features. The dataset is partitioned into training and testing sets using a stratified 4:1 split, yielding 120 training samples and 30 testing samples to ensure balanced class representation.\u003c/p\u003e\u003cp\u003eFor intensity encoding, the four-dimensional feature vectors are mapped to the input optical intensities, and the network is optimized using the dual-optimization approach with Gaussian kernel width \u003cem\u003eσ\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.5 \u0026micro;m. As illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, the optical fields are predominantly concentrated in their designated detector regions, confirming accurate classification by the GS-DONN. Moreover, the simulated output fields agree closely with the predicted patterns, with average fidelities of 90.85%, 93.27%, and 92.94%, respectively, validating the method's capability to enhance fidelity between scalar diffraction theory and vectorial electromagnetic simulations.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eMode-encoded classification of the Iris dataset\u003c/h3\u003e\n\u003cp\u003eWhile intensity encoding successfully demonstrated classification capabilities, it requires one optical channel per input feature, limiting practicality for high-dimensional datasets. 
We next explore mode encoding to overcome this channel requirement constraint.\u003c/p\u003e\u003cp\u003eCompared with intensity encoding, where each feature requires a separate input channel, mode encoding enables multiple features to be encoded within a single waveguide through modal superposition, directly addressing the scalability challenge in integrated photonic computing. The input optical field can be represented as a linear superposition of a set of orthogonal mode basis functions:\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$E(r)=\\sum\\limits_{{n=1}}^{N} {{\\alpha _n}{\\psi _n}(r)}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\psi _n}(r)\\)\u003c/span\u003e\u003c/span\u003e denotes the \u003cem\u003en\u003c/em\u003e-th order orthogonal mode basis function, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\alpha _n}\\)\u003c/span\u003e\u003c/span\u003e is the corresponding complex amplitude coefficient. This approach significantly reduces input channel count, enabling the four-dimensional Iris dataset to be processed using just one waveguide.\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e presents the var-FDTD simulation results for each Iris class. Both architectures employ the dual-optimization framework with Gaussian smoothing (\u003cem\u003eσ\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.5 \u0026micro;m) and angle correction. Additional training details are provided in the Supporting Information Section S3. For the single-channel classification task, the GS-DONN recognition accuracies of the two-layer and three-layer networks reached 91.33% and 96.67%, respectively. 
In the dual-channel case, the corresponding accuracies were 90% and 96%. These results confirm the effectiveness of mode-encoded GS-DONNs in compact and scalable optical classification tasks.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Discussion","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003eComputational Efficiency\u003c/h2\u003e\u003cp\u003eOur dual-optimization approach achieves \u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·S\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·N\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·\u003c/em\u003elog\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e) complexity, representing orders of magnitude improvement over neural network methods. Neural network approaches exhibit \u003cem\u003eO\u003c/em\u003e(\u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·G\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·S\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·\u003c/em\u003e4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e²) + 
\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·S\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·N\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·G\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·S\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003cem\u003e·\u003c/em\u003e4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e²) complexity for each design scenario, where the \u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e term creates severe computational bottlenecks. Here, \u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e, and \u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e represent the number of training iterations, the number of layers, and the number of neurons per layer of the OC-DONN training network, respectively. \u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e, and \u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e represent the number of training iterations, the number of layers, the number of neurons per layer, and the number of training samples of the fitting neural network, respectively. 
Detailed computational complexity analysis is provided in Supporting Information Section S4. For typical parameters (\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e = 300, \u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e = 3, \u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e, \u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e = 1000, \u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e \u003cem\u003e=\u003c/em\u003e 5000), our method requires approximately 10\u003csup\u003e6\u003c/sup\u003e operations, while neural network training typically requires approximately 10\u003csup\u003e12\u003c/sup\u003e\u0026ndash;10\u003csup\u003e13\u003c/sup\u003e operations.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\"×\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv 
class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eComparison between the Proposed OC-DONN and Other Integrated Works\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"8\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSource\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eBasic Units\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eInterlayer Distance (µm)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eGoodness of fit with FDTD\u003c/p\u003e\u003cp\u003e(R\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003ePre-training\u003c/p\u003e\u003cp\u003eOC-DONN\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eTraining\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eUnit structure size (µm\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\"\u003e\u003cp\u003eIntegration in theory (neurons/mm\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eShen\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eNat. 
Photonics\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMZI\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e55×220\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u0026lt; 10\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHuang\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e \u003cem\u003eNat. electronics\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMRRs\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003eπ×8\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e~ 2500\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFu\u003csup\u003e32\u003c/sup\u003e, \u003cem\u003eNat. 
Commun.\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws groups\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e300\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e91.8%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e1.5×2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e~ 2000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eZarei\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eSci.Rep.\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws groups\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e300\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e1.5×2.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e~ 2700\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWang\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eNat. 
Commun.\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws groups\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e100\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e1×2.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e~ 6000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eShao\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eNanophotonics\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e60\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN/A\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e0.3× ~ 2.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u0026gt; 30,000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLiu\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eOptics Express\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e~ 90%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e
\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e0.5×3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u0026gt; 60,000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLiu\u003csup\u003e\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eLight Sci. 
Appl.\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e~ 90%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e0.5×3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u0026gt; 60,000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOur 
work\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSws\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e20\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e95.13%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eNone\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·log\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"×\" colname=\"c7\"\u003e\u003cp\u003e0.5×2.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u0026gt; 60,000\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. MZI Mach-Zehnder interferometer, MRR micro-ring resonator, SWs subwavelength slot, N/A indicates \"Not Applicable\". Goodness of fit R\u003csup\u003e2\u003c/sup\u003e = ESS / TSS, where ESS (Explained Sum of Squares) represents the variation explained by the regression model, and TSS (Total Sum of Squares) represents the total variation in the dependent variable. 
For consistency with the reference literature, which uses R\u003csup\u003e2\u003c/sup\u003e to evaluate the match between scalar diffraction and electromagnetic simulation, our fidelity value \u003cem\u003eη\u003c/em\u003e = 97.77% has been converted to R\u003csup\u003e2\u003c/sup\u003e.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eGeneralization Capability\u003c/h3\u003e\n\u003cp\u003eNeural-network fitting methods suffer from fundamental generalization limitations, requiring complete retraining when materials, geometric parameters, or environmental conditions deviate from those in the training datasets\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e,\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e,\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e. Our physics-based approach inherently generalizes across varying conditions by embedding physical constraints into the model. For practical systems requiring adaptation to \u003cem\u003eK\u003c/em\u003e different conditions, the cost of neural-network fitting methods scales as \u003cem\u003eK\u003c/em\u003e·(\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eD\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e) + 
\u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e·4\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003ent\u003c/em\u003e\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e)) due to complete retraining requirements, while our method maintains constant \u003cem\u003eO\u003c/em\u003e(\u003cem\u003eG\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eS\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e·log\u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003et\u003c/em\u003e\u003c/sub\u003e) complexity regardless of scenario count. This fundamental difference makes our approach uniquely suitable for practical deployment where adaptation to varying conditions is essential.\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eMode Encoding Implementation\u003c/h2\u003e\u003cp\u003eThe ability to process four-dimensional Iris data using a single waveguide highlights the input efficiency of mode encoding. By mapping multiple features onto orthogonal optical modes, this approach eliminates the need for separate physical channels, addressing the footprint limitations of intensity-encoded systems. It enables substantial chip area savings without compromising accuracy, achieving 96.67% classification performance in three-layer networks. 
The high mode purity (93.39% for TE\u003csub\u003e01\u003c/sub\u003e) further supports its applicability in advanced photonic systems, including mode-division multiplexing and quantum information processing, where compactness and modal fidelity are critical.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003eLimitations\u003c/h2\u003e\u003cp\u003eThe dual optimization method imposes constraint conditions that require slots to simultaneously satisfy the dual requirements of phase modulation and Gaussian smoothing, which restricts design flexibility and may necessitate compensation through increased numbers of hidden layers and neurons.\u003c/p\u003e\u003cp\u003eThrough systematic analysis of key design parameters including Gaussian kernel width, network layer count, and interlayer spacing, we established a comprehensive theoretical framework for performance optimization. The analysis results demonstrate that optimal system performance is achieved when \u003cem\u003eσ\u003c/em\u003e reaches 1.5 to 2.5 µm, at which point fidelity approaches saturation. 
Based on this finding, this work adopts \u003cem\u003eσ\u003c/em\u003e = 1.5 µm as the optimization parameter, which not only maintains stable high fidelity but also achieves design objectives with the minimum number of network layers and the most compact interlayer spacing.\u003c/p\u003e\u003cp\u003eIt is noteworthy that this physics-based constraint optimization approach offers significant advantages over pure neural network methods: it effectively prevents generalization failures across different material systems and geometric configurations, and requires only a single system optimization to be applicable across various application scenarios, eliminating the need for repeated network retraining under different conditions.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003cdiv id=\"Sec14\" class=\"Section3\"\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"Method","content":"\u003ch2\u003eModel of OC-DONN\u003c/h2\u003e\u003cp\u003eThe OC-DONN designed herein adopts a layered architecture, comprising an input layer, multiple hidden layers (metalines), and an output layer. Each metaline consists of a one-dimensional array of subwavelength-scale slots filled with SiO\u003csub\u003e2\u003c/sub\u003e, where the slot length acts as the trainable parameter while the width and thickness remain fixed. We define each slot as a single neuron, and the interconnection between neurons is realized via free-space diffraction between adjacent metalines.\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e(a) illustrates the structure of the OC-DONN. The thicknesses of the silicon (Si) substrate \u003cem\u003et\u003c/em\u003e\u003csub\u003e3\u003c/sub\u003e, the SiO\u003csub\u003e2\u003c/sub\u003e insulator layer \u003cem\u003et\u003c/em\u003e\u003csub\u003e2\u003c/sub\u003e, and the Si waveguide layer \u003cem\u003et\u003c/em\u003e\u003csub\u003e1\u003c/sub\u003e are 3 µm, 2 µm, and 0.22 µm, respectively. 
The center-to-center spacing \u003cem\u003ed\u003c/em\u003e is 0.5 µm, and each slot has a fixed width \u003cem\u003ew\u003c/em\u003e of 0.14 µm and a thickness \u003cem\u003eh\u003c/em\u003e of 0.22 µm.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eThe architecture of an OC-DONN is similar to that of a conventional artificial neural network, comprising a forward propagation module, a backward propagation module, and a loss function for training (more details are given in Supporting Information Section S5). \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(T_{p}^{m}({x_p},{y_p})\\)\u003c/span\u003e\u003c/span\u003e represents the transmission function of the \u003cem\u003ep\u003c/em\u003e-th neuron at position \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(({x_p},{y_p})\\)\u003c/span\u003e\u003c/span\u003e in the \u003cem\u003em\u003c/em\u003e-th layer of the network, which can be described as:\u003c/p\u003e\u003cdiv id=\"Equ4\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ4\" name=\"EquationSource\"\u003e\n$$T_{p}^{m}({x_p},{y_p})=\\alpha _{p}^{m}({\\theta ^{\\varvec{in}}}) \\cdot \\exp \\{ \\varvec{j}[\\varphi _{p}^{m}({x_p},{y_p})]\\}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e4\u003c/div\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\alpha _{p}^{m}({\\theta ^{\\varvec{in}}})\\)\u003c/span\u003e\u003c/span\u003e denotes the amplitude transmission coefficient, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({\\theta ^{\\varvec{in}}}\\)\u003c/span\u003e\u003c/span\u003e denotes the incident angle of the optical field on each neuron. 
\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\varphi _{p}^{m}({x_p},{y_p})\\)\u003c/span\u003e\u003c/span\u003e is the phase factor of the corresponding neuron. With the slot width fixed at 0.14 µm, continuous phase modulation from 0 to 2π is achieved by varying the slot length between 0.2 µm and 2.5 µm, with a transmission efficiency of up to 96%, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e(b).\u003c/p\u003e\u003ch2\u003eSimulation\u003c/h2\u003e\u003cp\u003eFull-wave electromagnetic simulations were performed using the variational finite-difference time-domain method to validate the effectiveness of the dual optimization approach. An appropriate grid resolution was adopted to discretize the computational domain, accurately capturing subwavelength features while maintaining computational efficiency. Perfectly matched layer (PML) absorbing boundary conditions were applied at the computational domain boundaries to effectively eliminate reflections and simulate infinite space conditions.\u003c/p\u003e\u003ch2\u003eNetwork Training\u003c/h2\u003e\u003cp\u003eFor the mode conversion task, we use the modified loss function that we proposed in previous work\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e for complex-valued data:\u003c/p\u003e\u003cdiv id=\"Equ5\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ5\" name=\"EquationSource\"\u003e\n$$\\left\\{ \\begin{gathered} {\\mathcal{L}_1}=\\frac{1}{{{N_s}}}\\sum\\limits_{\\tau } {{{(Y_{{\\varvec{tar}}}^{{(\\tau )}} - {e^{\\varvec{j}\\varphi }}X_{{\\varvec{out}}}^{{(\\tau )}})}^{^{\\dag }}}(Y_{{\\varvec{tar}}}^{{(\\tau )}} - {e^{\\varvec{j}\\varphi }}X_{{\\varvec{out}}}^{{(\\tau )}})} \\hfill \\\\ {e^{\\varvec{j}\\varphi }}=\\frac{{{{(X_{{\\varvec{out}}}^{{(\\tau )}})}^{^{\\dag }}}Y_{{\\varvec{tar}}}^{{(\\tau )}}}}{{|{{(X_{{\\varvec{out}}}^{{(\\tau )}})}^{^{\\dag 
}}}Y_{{\\varvec{tar}}}^{{(\\tau )}}|}} \\hfill \\\\ \\end{gathered} \\right.$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e5\u003c/div\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eHere, \u003cem\u003eN\u003c/em\u003e\u003csub\u003e\u003cem\u003es\u003c/em\u003e\u003c/sub\u003e denotes the total number of training samples, while \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(Y_{{\\varvec{tar}}}^{{(\\tau )}}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(X_{{\\varvec{out}}}^{{(\\tau )}}\\)\u003c/span\u003e\u003c/span\u003e represent the target and output optical fields, both containing amplitude and phase information. Under gradient descent with the above loss function, the loss reaches its minimum when the target field \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(Y_{{\\varvec{tar}}}^{{(\\tau )}}\\)\u003c/span\u003e\u003c/span\u003e and the output field \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(X_{{\\varvec{out}}}^{{(\\tau )}}\\)\u003c/span\u003e\u003c/span\u003e are in the same state, that is, identical up to a global phase. 
This indicates that the loss function satisfies the requirements of the mode conversion task.\u003c/p\u003e\u003cp\u003eFor classification tasks, the cross-entropy loss function is used\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e:\u003c/p\u003e\u003cdiv id=\"Equ6\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ6\" name=\"EquationSource\"\u003e\n$$\\left\\{ \\begin{gathered} {F^{(\\alpha ,\\beta )}}={M^{{{(\\beta )}^\\varvec{T}}}}I_{{\\varvec{out}}}^{{(\\alpha )}} \\hfill \\\\ {P^{(\\alpha ,\\beta )}}=\\frac{{\\exp ({F^{(\\alpha ,\\beta )}})}}{{\\sum\\nolimits_{\\gamma } {\\exp ({F^{(\\alpha ,\\gamma )}})} }} \\hfill \\\\ {\\mathcal{L}_2}= - \\frac{1}{{{N_s}}}\\sum\\limits_{{\\alpha ,\\beta }} {{Q^{(\\alpha ,\\beta )}}\\log ({P^{(\\alpha ,\\beta )}})} \\hfill \\\\ \\end{gathered} \\right.$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e6\u003c/div\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({M^{(\\beta )}}\\)\u003c/span\u003e\u003c/span\u003e denotes the label corresponding to the \u003cem\u003eβ\u003c/em\u003e-th detector. The Kronecker delta \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({Q^{(\\alpha ,\\beta )}}\\)\u003c/span\u003e\u003c/span\u003e equals 1 if the label of the \u003cem\u003eα\u003c/em\u003e-th sample belongs to class \u003cem\u003eβ\u003c/em\u003e, and 0 otherwise. 
The predicted probability \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({P^{(\\alpha ,\\beta )}}\\)\u003c/span\u003e\u003c/span\u003e indicates the likelihood that sample \u003cem\u003eα\u003c/em\u003e is classified as class \u003cem\u003eβ\u003c/em\u003e, which is computed via the softmax function.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eAdditional information\u003c/p\u003e\n\u003cp\u003eWe provide more details on the computational complexity of the on-chip optical processor with detailed calculation procedures, the network parameter analysis and performance optimization of the optical neural network architecture.\u003c/p\u003e\n\u003cp\u003eAcknowledgments\u003c/p\u003e\n\u003cp\u003eNational Natural Science Foundation of China (Grant No. 12274105, 12574336, 12574414, 12504380, 12504438); Heilongjiang Provincial Natural Science Foundation of China (Grant No. LH2023A006); National Key Laboratory of Laser Spatial Information Foundation (LSI2025ZZKY04); the China Postdoctoral Science Foundation (2025M774294).\u003c/p\u003e\n\u003cp\u003eAuthor Contributions\u003c/p\u003e\n\u003cp\u003eW.D. and Q.J. conceived the idea and designed the experiments, Y.J. and C.Y. performed the theoretical simulations, Y.J. 
wrote the paper, and all authors contributed analysis tools.\u003c/p\u003e\n\u003cp\u003eData Availability\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.\u003c/p\u003e\n\u003cp\u003eCode Availability\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe code underlying the results presented in this paper is not publicly available at this time but may be obtained from the authors upon reasonable request.\u003c/p\u003e\n\u003cp\u003eCompeting financial interests\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing financial interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eZhou, T.; Chang, L.; Gu, Y.; Wang, L.; Zhang, S.; Li, H.; Luo, Y.; Xie, Z.; Huang, H.; Kong, W.; Weng, C.; Liu, H.; Zhao, J.; Wang, J.; Gu, T.; et al. Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit. Nat. Photonics. 15, 367\u0026thinsp;\u0026ndash;\u0026thinsp;373(2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeldmann, J.; Youngblood, N.; Wright, C. D.; Bhaskaran, H.; Pernice, W. H. P. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature. 569, 208\u0026thinsp;\u0026ndash;\u0026thinsp;214(2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShen, Y.; Harris, N. C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D. Deep learning with coherent nanophotonic circuits. Nat. Photonics. 11, 441\u0026ndash;446(2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKhoram, E.; Chen, A.; Liu, D.; Ying, L.; Wang, Q.; Yuan, M.; Yu, Z. Nanophotonic media for artificial neural inference. Photonics Res. 
7, 823\u0026ndash;827(2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXue, Z.; Zhou, T.; Xu, Z.; Ma, M.; Huang, J.; Shen, Y.; Englund, D.; Fan, S.; Solgaard, O.; Miller, D. A. B. Fully forward mode training for optical neural networks. Nature. 632, 280\u0026ndash;286(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWilliamson, I. A.; Hughes, T. W.; Minkov, M.; Bartlett, B.; Pai, S.; Fan, S. Reprogrammable electro-optic nonlinear activation functions for optical neural networks. IEEE J. Sel. Top. Quantum Electron. 26, 1\u0026thinsp;\u0026ndash;\u0026thinsp;12(2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHughes, T. W.; Minkov, M.; Shi, Y.; Fan, S. Training of photonic neural networks through in situ backpropagation and gradient measurement. Optica. 5, 864\u0026thinsp;\u0026ndash;\u0026thinsp;871(2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHu, J.; Mengu, D.; Tzarouchis, D. C.; Edwards, B.; Luo, Y.; Rivenson, Y.; Ozcan, A. Diffractive optical computing in free space. Nat. Commun. 15, 1525(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFu, F.; Huo, D.; Zang, Z.; Lou, Y.; Wang, S.; Gu, Z.; Liu, D.-S.; Duan, X.; Wang, D.; Liu, X.; Qi, J.; Yu, S.; Du, Q.; Chen, G.; Lu, C.; Yu, Y.; Ren, X.; Yuan, X. Symbiotic evolution of photonics and artificial intelligence: a comprehensive review. Adv. Photon.7, 024001(2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBueno, J.; Maktoobi, S.; Froehly, L.; Fischer, I.; Jacquot, M.; Larger, L.; Brunner, D. Reinforcement learning in a large-scale photonic recurrent neural network. Optica. 5, 756\u0026thinsp;\u0026ndash;\u0026thinsp;760(2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXu, Z.; Zhou, T.; Ma, M.; Shen, Y.; Huang, J.; Zhang, C.; Zou, X.; Liu, W.; Dai, Q.; Jia, B. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science. 
384, 202\u0026ndash;209(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMengu, D.; Luo, Y.; Rivenson, Y.; Ozcan, A. Analysis of diffractive optical neural networks and their integration with electronic neural networks. IEEE J. Sel. Top. Quantum Electron. 26, 2921376(2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin, X.; Rivenson, Y.; Yardimci, N. T.; Veli, M.; Luo, Y.; Jarrahi, M.; Ozcan, A. All-optical machine learning using diffractive deep neural networks. Science. 361, 1004\u0026thinsp;\u0026ndash;\u0026thinsp;1008(2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYan, T.; Wu, J.; Zhou, T.; Xie, H.; Xu, F.; Fan, J.; Fang, L.; Lin, X.; Dai, Q. Fourier-space diffractive deep neural network. Phys. Rev. Lett. 123, 023901(2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYuan, X.; Wang, Y.; Xu, Z.; et al. Training large-scale optoelectronic neural networks with dual-neuron optical-artificial learning. Nat Commun. 14, 7110(2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhou, T.; Fang, L.; Yan, T.; Wu, J.; Li, Y.; Fan, J.; Wu, H.; Lin, X.; Dai, Q. In situ optical backpropagation training of diffractive optical neural networks. Photonics Res. 8, 940(2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, H.; Feng, J.; Jiang, M.; Wang, Y.; Lin, J.; Tan, J.; Jin, P. Diffractive deep neural networks at visible wavelengths. Engineering. 7, 1483\u0026ndash;1491(2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJia, Q.; Zhang, Y.; Shi, B.; Li, H.; Li, X.; Feng, R.; Sun, F.; Cao, Y.; Wang, J.; Qiu, C.-W.; Ding, W. Vector vortex beams sorting of 120 modes in visible spectrum. Nanophotonics. 12, 3955\u0026ndash;3962(2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJia, Q.; Shi, B.; Zhang, Y.; Li, H.; Li, X.; Feng, R.; Sun, F.; Cao, Y.; Wang, J.; Qiu, C.-W.; Gu, M.; Ding, W. 
Partially coherent diffractive optical neural network. Optica. 11, 1742\u0026ndash;1749(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, Y.; Lin, Y.; Zhao, Y.; Wang, J.; Zhao, R.; Huang, Y. All-analog photoelectronic chip for high-speed vision tasks. Nature. 623, 48\u0026thinsp;\u0026ndash;\u0026thinsp;57(2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFu, T.; Zhang, J.; Sun, R.; Zhou, H.; Luo, J.; Zhang, L.; Dai, D. Optical neural networks: progress and challenges. Light Sci. Appl. 13, 263(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLuo, X.; Hu, Y.; Ou, X.; Li, X.; Lai, J.; Liu, N.; Cheng, X.; Pan, A.; Deng, H. Metasurface-enabled on-chip multiplexed diffractive neural networks in the visible. Light Sci. Appl. 11, 158(2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeldmann, J.; Youngblood, N.; Karpov, M.; Gehring, H.; Li, X.; Stappers, M.; Le Gallo, M.; Fu, X.; Lukashchuk, A.; Raja, A. S.; et al. Parallel convolutional processing using an integrated photonic tensor core. Nature. 589, 52\u0026ndash;58(2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, Z.; Li, T.; Soman, A.; Mao, D.; Kananen, T.; Gu, T. On-chip wavefront shaping with dielectric metasurface. Nat. Commun. 10, 3547(2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, Y.; Lin, W.; Duan, S.; Li, C.; Zhang, H.; Liu, B. On-chip reconfigurable diffractive optical neural network based on Sb2S3 Optics Express. 33, 1810\u0026ndash;1826(2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZarei, S.; Khavasi, A. Realization of optical logic gates using on-chip diffractive optical neural networks. Sci. Rep. 12, 15747(2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFu, T.; Zang, Y.; Huang, H.; Du, Z.; Hu, C.; Chen, M.; Yang, S.; Chen, H. On-chip photonic diffractive optical neural network based on a spatial domain electromagnetic propagation model. 
Opt. Express. 29, 31924\u0026thinsp;\u0026ndash;\u0026thinsp;31940(2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, Z.; Chang, L.; Wang, F.; Li, T.; Gu, T. Integrated photonic metasystem for image classifications at telecommunication wavelength. Nat. Commun. 13, 2131(2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSun, R.; Fu, T.; Huang, Y.; Liu, W.; Du, Z.; Chen, H. Multimode diffractive optical neural network. Adv. Photon. Nexus. 3, 026007(2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhu, H.; Zou, J.; Zhang, H.; Shi, Y.; Luo, S.; Wang, N.; Cai, H.; Wan, L.; Wang, B.; Jiang, X. Space-efficient optical computing with an integrated chip diffractive neural network. Nat. Commun. 13, 1044(2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShao, G.; Zhou, T.; Yan, T.; Guo, Y.; Zhao, Y.; Huang, R.; Fang, L. Reliable, efficient, and scalable photonic inverse design empowered by physics-inspired deep learning. Nanophotonics. in press(2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFu, T.; Zang, Y.; Huang, Y.; Du, Z.; Hu, C.; Chen, M.; Yang, S.; Chen, H. Photonic machine learning with on-chip diffractive optics. Nat. Commun. 14, 70(2023),.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZarei, S.; Marzban, M. R.; Khavasi, A. Integrated photonic neural network based on silicon metalines. Opt. Express. 28, 36668\u0026thinsp;\u0026ndash;\u0026thinsp;36684(2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu, W.; Fu, T.; Huang, Y.; Sun, R.; Yang, S.; Chen, H. C-DONN: compact diffractive optical neural network with deep learning regression. Opt. Express. 31, 22127\u0026thinsp;\u0026ndash;\u0026thinsp;22143(2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, H.; Gu, M.; Jiang, X.; Thompson, J.; Cai, H.; Paesani, S.; Santagati, R.; Laing, A.; Zhang, Y.; Yung, M. 
