Graphene/silicon heterojunction for reconfigurable phase-relevant activation function in coherent optical neural networks

Zhong, Chuyu; Liao, Kun; Dai, Tianxiang; Wei, Maoliang; Ma, Hui; Wu, Jianghong; Zhang, Zhibin; Ye, Yuting; Luo, Ye; Chen, Zequn; Jian, Jialing; Sun, Chunlei; Tang, Bo; Zhang, Peng; Liu, Ruonan; Li, Junying; Yang, Jianyi; Li, Lan; Liu, Kaihui; Hu, Xiaoyong; Lin, Hongtao

doi:10.1038/s41467-023-42116-6

Published: 31 October 2023

Nature Communications volume 14, Article number: 6939 (2023)

Abstract

Optical neural networks (ONNs) herald a new era in information and communication technologies and have implemented various intelligent applications. In an ONN, the activation function (AF) is a crucial component determining the network performance, and on-chip AF devices are still in development. Here, we demonstrate for the first time on-chip reconfigurable AF devices with phase activation, realized by dual-functional graphene/silicon (Gra/Si) heterojunctions. With optical modulation and detection in one device, time delays are shorter, energy consumption is lower, reconfigurability is higher and the device footprint is smaller than in other on-chip AF strategies. The experimental modulation voltage (power) of our Gra/Si heterojunction is as low as 1 V (0.5 mW), superior to many pure silicon counterparts. For photodetection, a high responsivity of over 200 mA/W is realized. The special nonlinear functions generated are fed into a complex-valued ONN to tackle handwritten letter and image recognition tasks, showing improved accuracy and the potential for highly efficient on-chip ONNs with all components integrated. Our results offer new insights for on-chip ONN devices and pave the way to high-performance integrated optoelectronic computing circuits.

Introduction

Neuromorphic photonics has attracted extensive attention in recent decades¹. Light propagation in photonic networks^2,3 performs matrix computation and has shown promising potential to break the technical bottleneck of electrical networks, since optical devices use photons as information carriers and offer larger bandwidth, higher information capacity, and lower power consumption. With the prosperity of silicon photonics^4,5,6, integrated ONNs have achieved exciting accomplishments in artificial intelligence applications including symbol recognition^3,7, vowel analysis⁸, image classification⁹, etc.

In a neural network, the activation function (AF) introduces nonlinearity, enabling the network to perform complicated tasks, and has an important impact on training speed and computational accuracy^10,11. For on-chip ONNs without AF devices^12,13, the nonlinear operation is carried out by external modulators under computer control^8,9. This scheme benefits from the flexibility of digital AF selection, but several analog-to-digital conversion steps add latency to the network. A growing number of efforts have been made to develop on-chip AFs¹¹ in all-optical or electro-optic ways, as shown in Fig. 1a. In all-optical AF devices, phase change materials (PCM)^3,14,15 or graphene^16,17 are adopted so that the optical signal itself directly modifies the optical power through a change in refractive index or absorption. The absence of an electric circuit can help reduce the complexity of network design, but the optical power threshold is relatively large (MW/cm²)¹⁶. Recently, a non-intrusive germanium-silicon structure¹⁸ was shown to achieve all-optical activation and power monitoring simultaneously, but its nonlinear response is fixed and therefore lacks flexibility. Electro-optic devices can provide reconfigurability. Indium tin oxide (ITO)^19,20 film devices were demonstrated with low power consumption and a simple design, but extra photodetectors were needed to monitor the signal intensity. Another strategy involves integrating a micro ring resonator (MRR) into Mach-Zehnder interferometer (MZI) circuits with phase shift electrodes⁷. An increasingly popular approach is the light-splitting-and-detection AF unit^21,22,23, which is adopted in recently reported ONN chips^24,25,26. In such an AF unit, the input optical power is monitored by a photodetector (PD) in an optical bypass, and the photocurrent is converted into the modulation voltage of a modulator to form a feedback circuit that tunes the transmitted optical power. Such a strategy offers high reconfigurability but brings higher power consumption and time delay because of the opto-electric conversion. AF devices or units should therefore seek to achieve smaller power thresholds, lower power consumption, shorter delay, smaller footprints, and higher flexibility. Two-dimensional material-assisted silicon photonics has exhibited intriguing potential^27,28 and offers new opportunities for optical AF devices. Specifically, the synergistic combination of graphene with silicon-based photonic structures has proved its ability to deliver massively enhanced device performance, enriched functionalities and a broadened operation waveband²⁹.

**Fig. 1: Significance, principle and design of our work.**

In this article, we point out that the phase shift of an AF device is usually neglected, which ignores the complex-valued nature of the ONN, as illustrated in Fig. 1b. In addition, most classical AFs are not symmetrical over positive and negative values, which is incompatible with positive-only intensity values. Therefore, many classical AFs used in real-valued neural networks are no longer applicable to complex-valued ONNs (see Section IX in the Supplementary Information for further discussion). Current methods of solving this problem include applying activation separately on real and imaginary values^30,31, applying activation based on intensity^17,32,33,34,35 and applying activation based on phase^36,37. However, most of these methods do not account for the crucial relationship between the amplitude and phase of the complex value, which can only be addressed by an activation function that operates on both³⁸. Here, we propose a phase-relevant AF device using a graphene/silicon (Gra/Si) heterojunction integrated in an MRR (Fig. 2a), which functions as a modulator and a photodetector in a single device. The optical modulation is achieved by the plasma dispersion effect of the silicon waveguide³⁹ and the doping of the graphene, which modulate both the resonance wavelength and the coupling strength of the MRR. The extensively studied light-detecting ability of graphene and graphene/silicon junctions^40,41,42,43 has also been utilized. Experimentally, a modulation voltage (power) of 1 V (0.5 mW) was obtained in our Gra/Si device, lower than that of many pure silicon devices^44,45,46. For photodetection, a high responsivity of over 200 mA/W is realized at 1.5 V bias. The dual-functional property allows the device to achieve high reconfigurability. The modulator-detector-in-one feature guarantees shorter time delay, lower energy consumption, and higher integration density than other AF units. Meanwhile, the MRR provides wavelength-sensitive phase tuning to the AF units. With these advantages, our devices can create activation functions with unique nonlinearity that differs from conventional ones²² and includes phase-tuning information (see Table S3 in Section V in the Supplementary Information for a quantitative comparison among AF devices). A complex-valued ONN that accounts for phase activation is built in a computer and trained with the phase-activated AFs from our devices, as depicted in Fig. 1c. Image classification tasks using the MNIST and CIFAR-10 datasets were used as benchmarks. Our AFs enable faster convergence and higher accuracy. The Gra/Si heterojunction in this work offers a promising perspective for future photonic networks based on two-dimensional materials.

**Fig. 2: Schematic illustration, properties and operation principle of the graphene/silicon heterojunction.**

Results

Device description and operation principles

The device’s structure is illustrated in Fig. 2a, and details of the layered device are shown in the inset. Our device was fabricated on a standard silicon photonics platform using a silicon-on-insulator (SOI) substrate by processes that involve a multi-project-wafer (MPW) run (see “Methods”). The photonic structure consists of a ring resonator with a radius of 40 μm. The graphene was transferred onto the wafer by a standard wet-transfer process, where it formed the graphene/silicon heterojunction with the lightly n-doped waveguide (Fig. 2b). The Raman spectrum in Fig. 2c indicates that the graphene is single-layered, and the measured current-voltage curve (Fig. 2d) coincides with the electrical characteristic of a Schottky diode (for a more detailed Raman analysis, see Section I in the Supplementary Information). In such a Schottky device, carrier engineering can be used to modify the Fermi level (absorption) of graphene⁴⁷ and the refractive index of silicon waveguides⁴⁶ (plasma dispersion effect), thereby modulating the optical signal. At the same time, graphene also functions as a photo-detecting material⁴⁸. The operation principle is explained by the band structure of the Gra/Si junction, as depicted in Fig. 2e. Under forward bias, the positively charged p-doped graphene has a higher Fermi level and, consequently, is less absorbent. At the silicon surface, the width of the space charge region is compressed, leading to a larger equivalent doping concentration (smaller refractive index³⁹) of the lightly n-doped silicon waveguide. In contrast, graphene is negatively charged under reverse bias and exhibits increased optical absorption. The space charge region in the silicon waveguide is wider, giving a reduced doping concentration (larger refractive index). In the presence of a large forward bias (Fig. 2e.iv), the carrier concentration of the silicon is high, and free carrier absorption dominates⁴⁹. Such functionalities were demonstrated in ring resonators. With the resonant effect, the modulation power is lower than that of non-resonant structures, and the photodetection is more sensitive because light is trapped inside the ring. In addition, while the resonance wavelength is tuned, the phase of the output light is also modulated and is very sensitive to the position of the resonance wavelength (see Section VII in the Supplementary Information), so the device performs complex modulation of the optical field.
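The sensitivity of the output phase to the resonance position can be illustrated with the textbook all-pass ring resonator model. The sketch below is not the authors' device model (their phase deduction is in Section VII of the Supplementary Information). Only the 40 μm radius comes from the text. The effective index, the self-coupling coefficient r and the round-trip amplitude transmission a are assumed values chosen for illustration.

```python
import cmath
import math

def round_trip_phase(wavelength_nm, n_eff, radius_um):
    """Round-trip phase 2*pi*n_eff*L/lambda for a ring of circumference L."""
    circumference_nm = 2 * math.pi * radius_um * 1e3
    return 2 * math.pi * n_eff * circumference_nm / wavelength_nm

def ring_field_transmission(wavelength_nm, n_eff, radius_um, r, a):
    """Complex field transmission of an all-pass ring (textbook form).

    r: self-coupling coefficient of the coupler (assumed value)
    a: single-pass amplitude transmission of the ring (assumed value)
    """
    e = cmath.exp(1j * round_trip_phase(wavelength_nm, n_eff, radius_um))
    return (r - a * e) / (1 - r * a * e)

def nearest_resonance_nm(target_nm, n_eff, radius_um):
    """Resonance wavelength closest to target_nm (integer mode order)."""
    circumference_nm = 2 * math.pi * radius_um * 1e3
    order = round(n_eff * circumference_nm / target_nm)
    return n_eff * circumference_nm / order

RADIUS_UM = 40.0   # ring radius stated in the text
N_EFF = 2.3        # assumed effective index, illustration only
R_COUPLER = 0.95   # assumed self-coupling coefficient
A_LOSS = 0.90      # assumed round-trip amplitude transmission

# Fix the probe wavelength on a resonance, then perturb the effective index
# to mimic electrical tuning of the waveguide.
lam0 = nearest_resonance_nm(2026.0, N_EFF, RADIUS_UM)
for dn in (0.0, 1e-4, 2e-4, 5e-4):
    t = ring_field_transmission(lam0, N_EFF + dn, RADIUS_UM, R_COUPLER, A_LOSS)
    print(f"dn={dn:.0e}  |t|^2={abs(t) ** 2:.4f}  phase={cmath.phase(t):+.3f} rad")
```

In this toy model a small index change moves both the transmitted intensity and the output phase at a fixed probe wavelength, which is the qualitative behaviour described above. Absorption tuning would enter through a.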

Device performances

The modulation performance of the fabricated device with 50-μm-long graphene (device 1) was characterized, and the results are shown in Fig. 3. The transmission spectra under different voltages (Fig. 3a) indicate that both the refractive index and the absorption of the active area are tuned by electrical driving, as discussed. The carrier transfer process differs under different bias conditions; therefore, the changes in the effective refractive index (n_eff) and absorption of the active area produce contrasting spectral characteristics. The black dashed curve is the transmission under zero bias. At reverse bias, the resonance wavelength redshifts, and the full width at half maximum of the resonance peak becomes wider (smaller Q factor, as shown in Fig. 3d). A larger refractive index of the silicon waveguide and larger absorption of graphene were calculated (Fig. 3c), consistent with these results. Under forward bias, the resonance wavelength blueshifts until the bias voltage approaches 1 V, which also agrees with the band structure analysis. For voltages above 1 V, the resonance redshifts and shifts faster (Fig. 3c), which could be a result of thermo-optic effects. Hence, our devices can work in the carrier injection, carrier depletion, and thermo-optic regions. The modulation depth (extinction ratio) under different voltages below the thermo-optic region is depicted in the lower part of Fig. 3b. A modulation depth exceeding 12 dB can be achieved with a low modulation voltage (power) of about 1 V (0.5 mW), which is lower than that of previously reported mid-infrared p-n or p-i-n silicon modulators^50,51,52,53. For the other two modulation operations shown (−1 V to 1 V and 0 V to 2 V), the largest modulation power is about 2.7 mW, which is also a relatively small value (see Table S2 in Section IV in the Supplementary Information). Then, the detection characterization of our device was performed (Figs. 3e–3i). As our device is a resonant structure, the photocurrent and responsivity at the resonance wavelength and at a non-resonance wavelength under a bias of 1.5 V are compared, as shown in Fig. 3e and Fig. 3f. Wavelength-resolved responsivity spectra were also measured under different bias voltages (Fig. 3g). Input light at the resonance wavelength produces a much larger photocurrent and responsivity. Therefore, our device works as a narrow-band detector. The photocurrent and responsivity at the resonance wavelength under different bias voltages and input optical power are illustrated in Figs. 3h and 3i, respectively. A responsivity higher than 200 mA/W can be achieved for input optical power smaller than 100 μW, which is the highest responsivity among state-of-the-art 2-μm-band graphene-silicon photodetectors, according to the performance comparison in Table S2 in the Supplementary Information. The responsivity for microwatt-level optical signals can exceed 1 A/W, because the trap states at the graphene-silicon interface prolong the lifetime of the photo-induced carriers before recombination, leading to a gain that greatly improves the responsivity. When the optical power increases, the excited electrons fill the unoccupied states in the graphene up to a level limited by the photon energy (wavelength). After that, extra incident power (a greater number of photons) is not absorbed, and consequently the photocurrent-power curve flattens and the responsivity decreases. At 3 V, both photocurrent and responsivity dropped due to a reduced Q factor and increased free carrier absorption (Fig. 3d).
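For readers less familiar with the two figures of merit used above, the short sketch below spells out the definitions: responsivity is photocurrent divided by input optical power, and the loaded Q factor is the resonance wavelength divided by the full width at half maximum. The numerical inputs are hypothetical and only chosen to match the orders of magnitude quoted in the text. They are not measured data.

```python
def responsivity_a_per_w(photocurrent_a, optical_power_w):
    """Responsivity R = I_ph / P_in, in A/W."""
    return photocurrent_a / optical_power_w

def q_factor(resonance_nm, fwhm_nm):
    """Loaded quality factor Q = lambda_0 / FWHM."""
    return resonance_nm / fwhm_nm

# Hypothetical operating points, illustration only.
r_high_power = responsivity_a_per_w(20e-6, 100e-6)  # 20 uA at 100 uW
r_low_power = responsivity_a_per_w(1e-6, 1e-6)      # 1 uA at 1 uW
print(f"R at 100 uW: {r_high_power * 1e3:.0f} mA/W")
print(f"R at 1 uW:   {r_low_power:.1f} A/W")

# A wider resonance (larger FWHM) gives a smaller Q at the same wavelength.
print(f"Q narrow: {q_factor(2026.31, 0.5):.0f}")
print(f"Q wide:   {q_factor(2026.31, 1.0):.0f}")
```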

Generation of activation functions and ONN training

According to the results in Fig. 3, both the output power and photocurrent can be tuned by applying different bias voltages and input optical power. Hence, utilizing the modulation-detection-in-one feature of our devices, an on-chip photonic nonlinear activation function with phase tuning and an ultralow optical power threshold is proposed and validated for optical neural networks. The proposed integrated neural network chip system is illustrated in Fig. 4a. The nonlinearity can be achieved by introducing a voltage feedback mechanism based on photocurrent measurement. An integrated circuit (IC) that can apply a bias voltage V_in and measure the photocurrent I_p can be designed and integrated with the photonic devices so that the bias voltage can be tuned based on the photocurrent variation. Consequently, a transfer function \(V'_{in}=H(I_p,V_{in})\) between the bias voltage V_in and the tuned voltage \(V'_{in}\) can be programmed into the IC. An easily implemented \(H(I_p,V_{in})\) is a photocurrent stabilizing circuit. Activation functions were generated from two devices using a current-stabilizing \(H(I_p,V_{in})\). As depicted in Fig. 4b, the photocurrent and transmission of device 1 at the wavelength of 2026.31 nm under different voltages and optical input power were obtained. Photocurrent contours of 1 μA and 2 μA are plotted within the filled contour and mapped to the transmission surface. As a result, the relation between the transmission and input power can be established, and two AFs were extracted and plotted as scatter points. The same operation was performed for device 2 (with a graphene length of 20 μm) at 2012.71 nm, and the results are shown in Fig. 4c with three activation functions using photocurrent contours of 0.2 μA and 0.4 μA (more characterization results can be found in Section VI in the Supplementary Information). All the AFs with phase shift are shown in Fig. 4d. The phase shift was extracted from equations in Ref. ⁵⁴, and the detailed phase shift deduction is given in Section VII in the Supplementary Information. These results demonstrate the configurability of our devices: a single device can generate several activation functions by applying different transfer functions related to different photocurrent constants. Even with the same \(H(I_p,V_{in})\), different activation functions can be obtained by choosing different voltage zones. Last but not least, an activation threshold of input optical power as low as 10 μW was achieved, which is one or more orders of magnitude lower than other reported results^16,23,55. With this approach, compared to other types of AF devices, our devices can generate complex activation functions with more reconfigurability, simpler operation, and lower power consumption and optical threshold (see Table S3 in Section V in the Supplementary Information).
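The contour-mapping procedure described above can be sketched numerically. For each input power, find the bias voltage that holds the photocurrent at a chosen constant, then read the transmission at that voltage. The sketch below assumes toy closed-form surfaces for the photocurrent and the transmission. In the paper these are measured maps (Fig. 4b, c), so every function, constant and search range here is hypothetical. For simplicity the toy transmission depends on voltage only, whereas the measured transmission is mapped over both voltage and input power.

```python
def toy_photocurrent(v, p_w):
    """Hypothetical photocurrent surface I_p(V, P): rises with bias, saturates with power."""
    k = 0.2        # assumed responsivity slope, A/W per volt
    p_sat = 50e-6  # assumed saturation power, W
    return k * v * p_w / (1.0 + p_w / p_sat)

def toy_transmission(v):
    """Hypothetical transmission T(V) at a fixed wavelength: a resonance dip tuned by bias."""
    return 1.0 - 0.9 / (1.0 + ((v - 0.6) / 0.2) ** 2)

def stabilizing_voltage(p_w, i_target_a, v_lo=0.05, v_hi=2.0, iters=60):
    """Bisection for the bias V with I_p(V, P) = I_target; None if not reachable in range."""
    def f(v):
        return toy_photocurrent(v, p_w) - i_target_a
    if f(v_lo) * f(v_hi) > 0:
        return None
    for _ in range(iters):
        mid = 0.5 * (v_lo + v_hi)
        if f(v_lo) * f(mid) <= 0:
            v_hi = mid
        else:
            v_lo = mid
    return 0.5 * (v_lo + v_hi)

def activation_curve(powers_w, i_target_a):
    """(input power, transmission) pairs along one constant-photocurrent contour."""
    points = []
    for p in powers_w:
        v = stabilizing_voltage(p, i_target_a)
        if v is not None:
            points.append((p, toy_transmission(v)))
    return points

powers = [p * 1e-6 for p in range(2, 101, 2)]  # 2 uW to 100 uW
for i_target in (1e-6, 2e-6):                  # two photocurrent constants
    curve = activation_curve(powers, i_target)
    print(f"I_target = {i_target * 1e6:.0f} uA: {len(curve)} points on the contour")
```

Changing the photocurrent constant selects a different contour and therefore a different transmission-versus-power curve, which is the sense in which one device yields several activation functions. Restricting the voltage search range plays the role of choosing a voltage zone.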

**Fig. 4: Generation mechanism and results of optical activation functions.**

The validity of our optical activation functions is investigated with two complex-valued neural networks on the MNIST dataset and the CIFAR-10 dataset, respectively. The network structures are illustrated in Fig. S13 in the Supplementary Information. The two networks shown in Fig. S13 are based on LeNet⁵⁶ and ResNet-34⁵⁷, redesigned to adapt to complex-valued convolution and the size of the corresponding dataset. The max pooling layers and fully connected layers of the original networks are replaced by a single global average pooling layer, as those layers are unsuitable for optical neural networks. The networks’ performance is measured in terms of accuracy on the MNIST dataset and CIFAR-10 dataset. Both datasets consist of ten classes with 6000 images per class. The standard train/test split is class-balanced and contains 50,000 training images and 10,000 test images. To monitor the training process, the training set is further split into 40,000 true training images and 10,000 validation images. Images of the CIFAR-10 dataset are RGB-colored with a size of 32 × 32 pixels, and images of the MNIST dataset are grayscale with a size of 28 × 28 pixels. We duplicate the real values to both real and imaginary parts for input to the network.

Our generated optical activation functions are compared with other commonly used activation functions by constructing two complex neural networks for the MNIST dataset and CIFAR-10 dataset. This comparison involves three classical activation functions of real-valued neural networks (Tanh, Arctan and Softsign) and our designed optical activation functions, with the identity function (no activation) as the baseline. A diagram of the transmission functions of the various activation functions is shown in Section IX in the Supplementary Information. We consider the phase shift relative to intensity for our activation functions, and assume it to be 0 for classical ones. We choose \(\sqrt{1-g(x)}\) as the activation function that operates on the complex amplitude, where \(g(x)\) is the transmission rate, in order to avoid vanishing gradients by increasing the average transmission rate. The square root corresponds to the relationship between complex amplitude and intensity. A spline interpolation is applied to the data points of our measurement to obtain an analytical piecewise function available for back propagation (see Sections IX and X in the Supplementary Information).
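One possible reading of this construction is sketched below: the complex amplitude z is scaled by \(\sqrt{1-g(x)}\) and rotated by an intensity-dependent phase \(\varphi(x)\), with \(x=|z|^2\). This composition is an assumption made for illustration and may differ from the authors' implementation (Sections IX and X of the Supplementary Information). The tabulated g and φ values are hypothetical, not measured data, and piecewise-linear interpolation is swapped in for the spline used in the paper to keep the example dependency-free.

```python
import bisect
import cmath
import math

def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at both ends (stand-in for a spline)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x)
    w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + w * (ys[i] - ys[i - 1])

# Hypothetical sample points: normalized intensity, transmission rate g, phase shift (rad).
INTENSITY = [0.0, 0.25, 0.5, 0.75, 1.0]
G = [0.80, 0.60, 0.30, 0.15, 0.10]
PHI = [0.0, 0.2, 0.6, 0.9, 1.0]

def phase_relevant_activation(z):
    """Assumed form: z -> z * sqrt(1 - g(|z|^2)) * exp(i * phi(|z|^2))."""
    x = abs(z) ** 2
    g = interp(x, INTENSITY, G)
    phi = interp(x, INTENSITY, PHI)
    return z * math.sqrt(1.0 - g) * cmath.exp(1j * phi)

# A real pixel value is duplicated into the real and imaginary parts, as in the text.
pixel = 0.5
z_in = complex(pixel, pixel)
z_out = phase_relevant_activation(z_in)
print(f"|z_in|={abs(z_in):.4f}  |z_out|={abs(z_out):.4f}")
print(f"added phase={cmath.phase(z_out) - cmath.phase(z_in):.3f} rad")
```

A classical activation in this comparison would correspond to the same amplitude scaling with the phase table set to zero everywhere.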

The training and validation results are depicted in Fig. 5. The training loss (defined as cross-entropy loss) and validation accuracy curves of the complex-valued optical neural networks with different activation functions are shown in Fig. 5b, f (a comparison between more activation functions can be found in Section XI in the Supplementary Information). The solid dotted lines are the average results from 5 training sessions. Our optical activation functions show a much lower loss on the MNIST dataset, indicating faster convergence. The best one, optical activation function 3, shows a 7% accuracy advantage in both the validation set and the test set over Arctan (the best-performing classical function in our training), with a loss advantage of 1.5. Moreover, our best optical activation function shows a solid lead over classical functions on the CIFAR-10 dataset, with an 8% accuracy advantage in both the validation set and the test set, and converges much faster than Arctan, again the best-performing classical function. It also demonstrates smaller loss values and faster loss reduction versus training rounds. These advantages arise because, for classical functions, the transmission rate falls to zero for larger input values. A near-zero transmission results in zero gradient values, which prevents the network weights from being updated. Besides, it is also possible that the better training results originate from our functions being segmented (Section IX in the Supplementary Information), which offers more flexible approximation abilities than smooth functions. The confusion matrices for the 10,000 test images with different activations are presented in Fig. 5c, g, consistent with the training results. The phase information makes a vital difference in the networks’ performance (Section XII in the Supplementary Information). Our functions can manipulate the phase based on intensity, thus taking advantage of complex functions to produce better training results.

A closer analysis of the trained networks is demonstrated in Fig. 5d, h, which shows the visualized output of each block in the neural network, colored based on the intensity values. Our proposed optical activation function shows a much smoother activation map than classical activation functions (see more results in Section XIII in Supplementary Information), with more solid prediction values for the same input compared with the classical activation functions, which proves that the proposed activation function contributes towards stable training of the neural network. A more thorough comparison involving more classical and optical activation functions and the impact of the phase shift can be found in the supplementary materials.

In conclusion, we experimentally demonstrated a graphene/silicon heterojunction as a modulator and detector in one device, which can operate as a reconfigurable phase-activated optical activation function device and provide a more flexible solution for optical neural networks. The dual-functional devices can be programmed to produce a nonlinear optical response by detecting and modulating the optical signal simultaneously. The generated activation functions are more effective and efficient than classical activation functions within the same neural network. The Gra/Si heterojunction on an MRR is highly designable and exhibits high reliability (Section VI in the Supplementary Information). Last but not least, as our device can tune the optical intensity, it can also be adopted in the weight matrix part of the optical neural network, which deserves further exploration. We believe this work is promising for future large-scale chip-level optical neural networks.

Methods

Device fabrication

The fabrication steps and flowchart are described in detail in Section II in the Supplementary Information, where structural details of our devices can also be found.

Device characterization

Please see Section III in the Supplementary Information.

Data availability

All the data supporting this study are available in the paper and Supplementary Information. Additional data related to this paper are available from the corresponding authors upon request.

References

Prucnal, P. R., Shastri, B. J., Teich, M. C., Prucnal, P. R. & Shastri, B. J. Neuromorphic Photonics (Taylor & Francis Group 2017).
Yan, T. et al. All-optical graph representation learning using integrated diffractive photonic computing units. Sci. Adv. 8, eabn7630 (2022).
Feldmann, J., Youngblood, N., Wright, C. D., Bhaskaran, H. & Pernice, W. H. P. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature 569, 208–214 (2019).
Xia, P. et al. High linearity silicon DC Kerr modulator enhanced by slow light for 112 Gbit/s PAM4 over 2 km single mode fiber transmission. Opt. Express 30, 16996–17007 (2022).
Rahim, A. et al. Taking silicon photonics modulators to a higher performance level: State-of-the-art and a review of new technologies. Adv. Photonics 3, 024003 (2021).
Atabaki, A. H. et al. Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip. Nature 556, 349–354 (2018).
Jha, A., Huang, C. & Prucnal, P. R. Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics. Opt. Lett. 45, 4819–4822 (2020).
Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
Feldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021).
Prajit, R., Barret, Z. & Quoc, V. L. Searching for activation functions. Preprint at https://arxiv.org/abs/1710.05941 (2017).
Chen, H. et al. Advances and challenges of optical neural networks. Chin. J. Lasers 47, 0500004 (2020).
Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 7430 (2017).
Zhou, H. et al. Photonic matrix multiplication lights up photonic accelerator and beyond. Light Sci. Appl 11, 30 (2022).
Yu, T. et al. Programmable chalcogenide-based all-optical deep neural networks. Nanophotonics 11, 4073–4088 (2022).
Feldmann, J. et al. Calculating with light using a chip-scale all-optical abacus. Nat. Commun. 8, 1256 (2017).
Tari, H., Bile, A., Moratti, F. & Fazio, E. Sigmoid type neuromorphic activation function based on saturable absorption behavior of graphene/PMMA composite for intensity modulation of surface plasmon polariton signals. Plasmonics 17, 1025–1032 (2022).
Liao, K. et al. Matrix eigenvalue solver based on reconfigurable photonic neural network. Nanophotonics 11, 4089–4099 (2022).
Shi, Y. et al. Nonlinear germanium-silicon photodiode for activation and monitoring in photonic neuromorphic networks. Nat. Commun. 13, 6048 (2022).
Amin, R. et al. ITO-based electro-absorption modulator for photonic neural activation function. APL Mater. 7, 081112 (2019).
Amin, R. et al. An ITO–graphene heterojunction integrated absorption modulator on Si-photonics for neuromorphic nonlinear activation. APL Photonics 6, 120801 (2021).
Pour Fard, M. M. et al. Experimental realization of arbitrary activation functions for optical neural networks. Opt. Express 28, 12138–12148 (2020).
Williamson, I. A. D. et al. Reprogrammable electro-optic nonlinear activation functions for optical neural networks. IEEE J. Sel. Top. Quantum Electron 26, 1–12 (2020).
Huang, Y., Wang, W., Qiao, L., Hu, X. & Chu, T. Programmable low-threshold optical nonlinear activation functions for photonic neural networks. Opt. Lett. 47, 1810–1813 (2022).
Ashtiani, F., Geers, A. J. & Aflatouni, F. An on-chip photonic deep neural network for image classification. Nature 606, 501–506 (2022).
Bandyopadhyay, S. et al. Single chip photonic deep neural network with accelerated training. Preprint at https://arxiv.org/abs/2208.01623 (2022).
Xu, Z. et al. Reconfigurable nonlinear photonic activation function for photonic neural network based on non-volatile opto-resistive RAM switch. Light Sci. Appl 11, 288 (2022).
Youngblood, N. & Li, M. Integration of 2D materials on a silicon photonics platform for optoelectronics applications. Nanophotonics 6, 1205–1218 (2017).
Wu, J. et al. Two‐dimensional materials for integrated photonics: recent advances and future challenges. Small Sci. 1, 2000053 (2021).
Akinwande, D. et al. Graphene and two-dimensional materials for silicon technology. Nature 573, 507–518 (2019).
Huang, Y., Zhang, H. & Wang, Z. Multistability of complex-valued recurrent neural networks with real-imaginary-type activation functions. Appl Math. Comput 229, 187–200 (2014).
Benvenuto, N. & Piazza, F. On the complex backpropagation algorithm. IEEE Trans. Signal Process. 40, 967–969 (1992).
Katumba, A. et al. Neuromorphic computing based on silicon photonics and reservoir computing. IEEE J. Sel. Top. Quantum Electron 24, 1–10 (2018).
Zhang, Z., Wang, H., Xu, F. & Jin, Y. Q. Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 55, 7177–7188 (2017).
Scardapane, S., Vaerenbergh, S. V., Hussain, A. & Uncini, A. Complex-valued neural networks with nonparametric activation functions. IEEE Trans. Emerg. Top. Comput Intell. 4, 140–150 (2020).
Wu, R., Huang, H. & Huang, T. Learning of phase-amplitude-type complex-valued neural networks with application to signal coherence. In Neural Information Processing, 91–99 (Springer International Publishing, 2017).
Sozos, K. et al. High-speed photonic neuromorphic computing using recurrent optical spectrum slicing neural networks. Commun. Eng. 1, 24 (2022).
Virtue, P., Yu, S. X. & Lustig, M. Better than real: Complex-valued neural nets for MRI fingerprinting. In 2017 IEEE International Conference on Image Processing (ICIP), 3953–3957 (2017).
Wilmanski, M., Kreucher, C. & Hero, A. Complex input convolutional neural networks for wide angle SAR ATR. In 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 1037–1041 (2016).
Reed, G. T., Mashanovich, G., Gardes, F. Y. & Thomson, D. J. Silicon optical modulators. Nat. Photonics 4, 518–526 (2010).
Wang, Y. et al. Ultrahigh-speed graphene-based optical coherent receiver. Nat. Commun. 12, 5076 (2021).
Wang, X., Cheng, Z., Xu, K., Tsang, H. K. & Xu, J.-B. High-responsivity graphene/silicon-heterostructure waveguide photodetectors. Nat. Photonics 7, 888–891 (2013).
Casalino, M. et al. Free-space schottky graphene/silicon photodetectors operating at 2 μm. ACS Photonics 5, 4577–4585 (2018).
Li, X. et al. High Detectivity Graphene-Silicon Heterojunction Photodetector. Small 12, 595–601 (2016).
Sobu, Y., Simoyama, T., Tanaka, S., Tanaka, Y. & Morito, K. 70 Gbaud Operation of All-Silicon Mach–Zehnder Modulator based on Forward-Biased PIN Diodes and Passive Equalizer. In 2019 24th OptoElectronics and Communications Conference (OECC) and 2019 International Conference on Photonics in Switching and Computing (PSC), 1–3 (2019).
Li, M., Wang, L., Li, X., Xiao, X. & Yu, S. Silicon intensity Mach–Zehnder modulator for single lane 100 Gb/s applications. Photonics Res. 6, 109–116 (2018).
Sinatkas, G., Christopoulos, T., Tsilipakos, O. & Kriezis, E. E. Electro-optic modulation in integrated photonics. J. Appl. Phys. 130, 010901 (2021).
Tongay, S. et al. Rectification at graphene-semiconductor interfaces: zero-gap semiconductor-based diodes. Phys. Rev. X 2, 011002 (2012).
Koppens, F. H. et al. Photodetectors based on graphene, other two-dimensional materials and hybrid systems. Nat. Nanotechnol. 9, 780–793 (2014).
Schroder, D. K., Thomas, R. N. & Swartz, J. C. Free carrier absorption in silicon. IEEE J. Solid-State Circuits 13, 180–187 (1978).
Hagan, D. E., Ye, M., Wang, P., Cartledge, J. C. & Knights, A. P. High-speed performance of a TDFA-band micro-ring resonator modulator and detector. Opt. Express 28, 16845–16856 (2020).
Wang, X. et al. High-speed silicon photonic Mach–Zehnder modulator at 2 μm. Photonics Res. 9, 535–540 (2021).
Wei, M. et al. TDFA-band silicon optical variable attenuator. Prog. Electromagnet. Res. 174, 33–42 (2022).
Sun, C. et al. High-performance silicon PIN diode switches in the 2-μm wave band. Opt. Lett. 47, 2758–2761 (2022).
Schwelb, O. Transmission, group delay, and dispersion in single-ring optical resonators and Add/Drop filters—A tutorial overview. J. Lightwave Technol. 22, 1380–1394 (2004).
Crnjanski, J., Krstic, M., Totovic, A., Pleros, N. & Gvozdic, D. Adaptive sigmoid-like and PReLU activation functions for all-optical perceptron. Opt. Lett. 46, 2003–2006 (2021).
Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (2016).

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Numbers 61975179 received by H.L., 91950204 received by X.H., 92150302 received by D.D., 12104375 received by L.L., and 62105287 received by J.L.), the National Key Research and Development Program of China (Grant Number 2019YFB2203002 received by H.L.), and the Fundamental Research Funds for the Central Universities (Grant Number 2021QNA5007 received by J.L.). The authors thank the ZJU Micro-Nano Fabrication Center at Zhejiang University, Westlake Center for Micro/Nano Fabrication and Instrumentation, and Service Center for Physical Sciences at Westlake University for facility support. The authors thank Dr. Min Tao and Prof. Fengli Gao from Jilin University for the feedback circuit diagram of the activation function. We thank Dr. Zhong Chen from Instrumentation and Service Center for Molecular Sciences at Westlake University for assistance with the Raman measurements. We also thank Dr. Chao Zhang from Instrumentation and Service Center for Physical Sciences for assistance with the Hall effect measurements.

Author information

These authors contributed equally: Chuyu Zhong, Kun Liao.

Authors and Affiliations

State Key Laboratory of Modern Optical Instrumentation, College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, 310027, China
Chuyu Zhong, Maoliang Wei, Hui Ma, Junying Li, Jianyi Yang & Hongtao Lin
State Key Laboratory for Mesoscopic Physics, Frontiers Science Center for Nano-optoelectronics, School of Physics, Peking University, 100871, Beijing, China
Kun Liao, Tianxiang Dai, Zhibin Zhang, Kaihui Liu & Xiaoyong Hu
Key Laboratory of 3D Micro/Nano Fabrication and Characterization of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
Jianghong Wu, Yuting Ye, Ye Luo, Zequn Chen, Jialing Jian, Chunlei Sun & Lan Li
Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, 310024, China
Jianghong Wu, Yuting Ye, Ye Luo, Zequn Chen, Jialing Jian, Chunlei Sun & Lan Li
Institute of Microelectronics of the Chinese Academy of Sciences, 100029, Beijing, China
Bo Tang, Peng Zhang & Ruonan Liu
MOE Frontier Science Center for Brain Science & Brain-Machine Integration, Zhejiang University, Hangzhou, 310027, China
Hongtao Lin

Authors

Chuyu Zhong
Kun Liao
Tianxiang Dai
Maoliang Wei
Hui Ma
Jianghong Wu
Zhibin Zhang
Yuting Ye
Ye Luo
Zequn Chen
Jialing Jian
Chunlei Sun
Bo Tang
Peng Zhang
Ruonan Liu
Junying Li
Jianyi Yang
Lan Li
Kaihui Liu
Xiaoyong Hu
Hongtao Lin

Contributions

Conceptualization, H.L. and C.Z.; methodology, C.Z. and K.L.; software, C.Z., K.L., T.D. and B.T.; validation, C.Z. and K.L.; formal analysis, C.Z.; investigation, C.Z., K.L., T.D., M.W., H.M., Y.Y., J.W., Z.Z., Y.L., Z.C., J.J., C.S., B.T., P.Z., R.L. and J.L.; resources, B.T., P.Z., R.L., Z.Z., K.L. and X.H.; data curation, C.Z.; visualization, C.Z.; supervision, H.L., X.H., K.L., L.L. and J.Y. All authors contributed to technical discussions and writing the paper.

Corresponding authors

Correspondence to Xiaoyong Hu or Hongtao Lin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Charis Mesaritakis, and the other, anonymous, reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhong, C., Liao, K., Dai, T. et al. Graphene/silicon heterojunction for reconfigurable phase-relevant activation function in coherent optical neural networks. Nat Commun 14, 6939 (2023). https://doi.org/10.1038/s41467-023-42116-6

Received: 18 January 2023
Accepted: 29 September 2023
Published: 31 October 2023
DOI: https://doi.org/10.1038/s41467-023-42116-6