Main

Realizations of quantum computing have built on rapid progress in controlling physical systems that can support quantum information; for example, superconducting circuits1,2, trapped ions3,4, neutral atoms5 and light6,7. These technological breakthroughs have brought four platforms to the regime of quantum computational advantage1,2,8,9,10 by solving specific sampling problems that would require unreasonable computing time even for the most powerful classical supercomputers. Two of these four are photonic, which highlights the position of light-based technology among the leading platforms. Quantum light as a quantum information carrier offers the advantage of low decoherence and comes with many degrees of freedom with which to encode the information, while providing natural connectivity for distributed or blind quantum computing11.

Over the years, a variety of proposals that take the discrete-variable photonic approach to universal fault-tolerant computing have been put forward, in which quantum information is encoded with single photons12,13,14,15. With identified thresholds, these roadmaps motivate the development of quantum computing hardware based on single-photon sources, integrated photonic chips and single-photon detectors. Experimental progress of ever-increasing complexity has been achieved with integrated sources exploiting nonlinear effects, including with large-scale integrated chips16,17. However, the probabilistic nature of the single-photon generation process, the need for it to be heralded and the requirement for it to operate at low efficiency to limit multiphoton events are strong constraints on hardware architectures. This has resulted in a limited number of manipulated photons with typical rates in the megahertz range for four photons, and the demonstration of specific information processing tasks that rely on dedicated photonic chips17. Overcoming these limitations is foreseen through the use of massive multiplexing of hundreds of heralded sources18.

Another path to large-scale quantum computing with single photons has emerged, owing to deterministic single-photon source devices based on semiconductor quantum dots19,20,21. Such sources have demonstrated record single-photon generation efficiency that is 10–20 times higher than their nonlinear counterparts, allowing for a drastic reduction in resource requirements. Such efficiencies allowed a record manipulation of 14 single photons in a free-space boson sampling experiment22. Very recently, the same quantum dot sources have shown their ability to deterministically generate photonic cluster states at high rates23, even further reducing the foreseen overheads for large-scale measurement-based quantum computation24.

In this work we report a multipurpose cloud-accessible25 single-photon-based quantum computing machine, named Ascella, which is based on six photonic qubits generated by an on-demand quantum dot source. The quantum information is encoded in the path degree of freedom and arbitrarily manipulated in a 12-mode integrated universal interferometer. A machine-learned transpilation process corrects for the hardware manufacturing errors. Ascella operates the largest number of single photons on a chip so far with a six-photon sampling rate of 4 Hz, and shows operation stability over weeks. We benchmark its performance and demonstrate applications both in the gate-based and photonic computation frameworks. Each reported result represents either state-of-the-art performance or a novel experimental demonstration for which we provide the full code to reproduce through Quandela Cloud25. The numerous applications illustrate the general-purpose potential of the machine for noisy near-term quantum computing. We finally discuss the evolution of the reported platform towards scale-up, and demonstrate a critical step for future measurement-based quantum computation: heralded entanglement generation of three-photon Greenberger–Horne–Zeilinger (GHZ) states.

Results

Single-photon based computer

Ascella’s hardware (Fig. 1a) comprises an on-demand high-brightness single-photon source; a programmable optical demultiplexer, which allows up to six single photons to simultaneously interfere with a 12-mode reconfigurable universal interferometer; and a single-photon detection and post-processing unit.

Fig. 1: Architecture, performance and stability of Ascella.

a, Sketch of the overall architecture of the six single-photon quantum computer. A quantum-dot single-photon source (SPS) device at 5 K is operated at a repetition rate of 80 MHz. An active demultiplexer followed by fibred delays converts the train of single photons into six photons arriving simultaneously at the universal 12-mode photonic chip. Photons are detected at the chip output by superconducting nanowire single-photon detectors (SNSPD) and detection times are processed by a correlator. A full software stack controls the unitary matrix U implemented on the chip through the voltages \({\overrightarrow{V}}\) applied on 126 thermal phase shifters, yielding phase shifts \({\overrightarrow{\phi }}\), and the photonic input state according to the job requested. It also recalibrates hourly and readjusts all hardware control knobs for optimal performance. The single photons are sent into a photonic chip featuring a universal interferometer scheme capable of implementing any 12 × 12 unitary matrix. b, Detected N-photon coincidence rates for N-photon inputs as a function of time, with the photonic circuit configured to implement the identity matrix. The rates are integrated for 50 s. The grey areas correspond to maintenance and upgrade periods. In the right figure, we also monitor the on-chip photon indistinguishability and single-photon purity, as quantified by the Hong–Ou–Mandel (HOM) visibility VHOM and 1 − g(2)(0), respectively, where g(2) is the normalized second-order correlation function. HOM kT is VHOM for delays k × ΔT between emitted photons where ΔT = 180 ns. Each data point corresponds to a correlation histogram integrated over 10 s. c, Job execution flowchart on Ascella. Perceval users may send jobs consisting in photonic circuits, or a gate-based circuit (GB) or a unitary matrix (U), along with the desired input state to the Quandela Cloud. 
The job is first processed by a CPU, which computes the necessary phase shifts \({\overrightarrow{\phi }}\) to apply, and subsequently the voltages \({\overrightarrow{V}}\) for the on-chip phase shifters from our compilation and transpilation process. Finally, the user receives the collected single-photon and coincidence counts after the computation on the quantum processing unit (QPU).

The on-demand single-photon source (see Supplementary Section 1), which is based on an InGaAs quantum dot in a microcavity19, is optically excited at an 80 MHz rate. It exploits a neutral dot and longitudinal-acoustic-phonon-assisted near-resonant excitation26 to emit linearly polarized single photons with 55% probability into the collection lens. To send six single photons to every even input mode of the chip, an active optical demultiplexer sequentially deflects the photon stream into six optical fibres of different lengths adjusted to synchronize the photons27. Using optical shutters, the demultiplexer can prepare any input state from \(\left\vert 000000000000\right\rangle\) to \(\left\vert 101010101010\right\rangle\) (photon positions can subsequently be swapped; see Supplementary Section 2). The 12-mode photonic integrated circuit (Si3N4 platform) is composed of 126 voltage-controlled thermo-optic phase shifters and 132 directional couplers28 that are laid out in a rectangular universal interferometer scheme (see Fig. 1a), allowing for the implementation of all 12 × 12 unitary matrices with an average fidelity of F = 99.7 ± 0.08% thanks to a custom compilation and transpilation process (see Methods). Finally, the 12 outputs of the circuit are connected to high-efficiency superconducting nanowire single-photon detectors, and N-photon detection events are registered using a time-to-digital converter.

The average total efficiency of the optical set-up is 8%, which includes the single-photon source device brightness, the transmission of all optical components, and the detection efficiencies (see Supplementary Section 2). This leads to record-breaking two- to four-photon on-chip coincidence rates (Fig. 1b), and on-chip processing of five and six photons, at rates of 50 Hz and 4 Hz, respectively. We measure high single-photon purity of >99% and high indistinguishability of 94%; these values are independent of the delays between photon emission (up to 1 μs), resulting in a measured on-chip two-photon interference visibility for all 15 pairs of 91−94% (see Supplementary Section 3). The genuine four- and six-photon indistinguishability—defined as the probability that the N photons are identical—establishes a new record value of 0.85 ± 0.02 for four photons29, and a value of 0.76 ± 0.02 for six photons. We ensure long-term stability and high-performance operation of Ascella by monitoring key metrics and by performing automated system optimization runs hourly. This guarantees a highly stable and long-term operation over several weeks (see Fig. 1b), and robustness against external temperature fluctuations and mechanical perturbations.
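As a rough consistency check on the quoted rates, the following sketch estimates N-photon coincidence rates from the numbers given in the text. The model is an assumption, not the authors' loss budget: it supposes one N-photon attempt per six excitation pulses (one pulse per demultiplexer output) and treats the 8% average total efficiency as an independent survival probability for each photon. The function and constant names are illustrative.

```python
# Back-of-envelope estimate under the assumptions stated above.
REPETITION_RATE_HZ = 80e6   # excitation rate quoted in the text
DEMUX_OUTPUTS = 6           # the demultiplexer feeds six fibres
TOTAL_EFFICIENCY = 0.08     # average end-to-end efficiency per photon


def coincidence_rate(n_photons, efficiency=TOTAL_EFFICIENCY):
    """Assumed model: one attempt per six pulses, independent photon survival."""
    attempts_per_second = REPETITION_RATE_HZ / DEMUX_OUTPUTS
    return attempts_per_second * efficiency ** n_photons


for n in (5, 6):
    print(f"{n} photons: {coincidence_rate(n):.1f} Hz")
```

Under these assumptions the estimate is about 44 Hz for five photons and 3.5 Hz for six, the same order as the measured 50 Hz and 4 Hz.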

To operate the machine, tasks are sent remotely using the Python-based open-source framework Perceval30. The user can specify (see Fig. 1c) a photonic circuit, a gate-based circuit or a unitary transformation (U) to be applied to a specified input state containing one to six photons, as well as optional post-selection criteria. Output photon coincidences are then acquired up to the desired sample number, and data sample results are sent back to the user, either as a stream of events or as an aggregated state:count inventory.
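The two result formats can be related with a short sketch. It does not use the Perceval API and the data layout is hypothetical: each event is assumed to be a tuple of per-mode detections, and post-selection is assumed to be a predicate on that tuple.

```python
from collections import Counter


def aggregate(events, keep=lambda event: True):
    """Collapse a stream of detection events into a state:count inventory.

    `events` is an iterable of tuples (one entry per output mode);
    `keep` is an optional post-selection predicate (hypothetical interface).
    """
    return Counter(tuple(event) for event in events if keep(event))


# Toy four-mode stream; keep only two-photon coincidences.
stream = [(1, 0, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0), (1, 0, 0, 0)]
inventory = aggregate(stream, keep=lambda event: sum(event) == 2)
for state, count in inventory.items():
    print(state, count)
```

Keeping the raw stream preserves the order of events, whereas the aggregated inventory is more compact when only output statistics are needed.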

Gate-based quantum computation

Following the Knill–Laflamme–Milburn scheme31, Ascella can perform probabilistic gate-based protocols. Within this quantum computation framework, we benchmark quantum logic gates on up to three qubits and implement a hybrid variational quantum eigensolver (VQE).

Benchmarking logic gates

Ideally, a gate U applied to an initial pure state \(\left\vert \psi \right\rangle\) will produce the pure state \(U\left\vert \psi \right\rangle\). In reality, errors, which are quantified by a noise channel Λ (ref. 32), corrupt the final state, which is then described by a density matrix \(\rho ={{\Lambda }}(U\left\vert \psi \right\rangle \,\left\langle \psi \right\vert {U}^{\;{\dagger} })\). A standard figure of merit to quantify the gate performance is the quantum state fidelity \({F}_{\psi }(U)=\left\langle \psi | {U}^{\;{\dagger} }\rho U| \psi \right\rangle\) of the final state ρ to the ideal state \(U\left\vert \psi \right\rangle\). To assess Ascella’s performance for a given gate, we evaluate the fidelity of the gate averaged over all possible input states \(\left\vert \psi \right\rangle\), that is, Favg(U) = ∫Fψ(U)dψ, where the integral is taken over the Haar measure.

A brute-force approach to estimating Favg(U) requires an impractically large number of measurements. A more efficient method, randomized benchmarking, has been proposed for matter qubits33, but applies long sequences of gates from specific sets of unitaries34. As photonic quantum processing converts any quantum circuit to a photonic circuit35, we use a new method to evaluate Favg (R.M. & S.C.W., manuscript in preparation). Our method exploits symmetries so that the contributions of most Fψ to Favg cancel out, allowing Favg to be expressed as a finite discrete sum \({F}_{{{{\rm{avg}}}}}=\mathop{\sum }\nolimits_{i = 1}^{K}{w}_{i}{m}_{i}\) of K expectation values mi with weight wi (see Supplementary Section 5). The wi and the state preparation and measurement configurations for each mi depend on the gate U and are pre-computed. Each configuration consists of preparing an unentangled initial state \(\left\vert \psi \right\rangle\), applying the gate and performing single-qubit Pauli measurements. For the gates benchmarked on Ascella (see Table 1), the K expectation values mi are obtained from M ≤ K measurement configurations, with K less than the \({2}^{4n}\) measurements required for full process tomography36 of an n-qubit gate.

Table 1 Average gate fidelities of one-, two- and three-qubit gates implemented by Ascella evaluated based on K expectation values obtained from M measurement configurations

The average gate fidelities measured for the T (defined as \(T:=\left\vert 0\right\rangle \left\langle 0\right\vert +{e}^{i\frac{\pi }{4}}\left\vert 1\right\rangle \left\langle 1\right\vert\)), CNOT and Toffoli gates are shown in Table 1. To the best of our knowledge, these are record fidelities for the two-photon CNOT37,38 and three-photon Toffoli39 gates. This group of measurements sets a benchmark for universal photonic quantum computing, and is on par with the benchmarked performance of open-access quantum computing platforms based on ions and superconducting qubits (see Supplementary Section 5). These values are a lower bound on the true average gate fidelities, as they also include errors related to state preparation and measurement roughly given by \(1-{[{F}_{{{{\rm{avg}}}}}(T\text{-gate})]}^{2n/3}\), which is 0.3 ± 0.1%, 0.5 ± 0.1% and 0.8 ± 0.2% for the T, CNOT and Toffoli gates, respectively.
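The state-preparation-and-measurement estimate can be evaluated numerically, reading the expression as 1 − Favg(T-gate) raised to the power 2n/3. The T-gate fidelity used below is an illustrative placeholder, not a value taken from Table 1, and the function name is hypothetical.

```python
def spam_error(f_avg_t, n_qubits):
    """Rough SPAM contribution 1 - F_avg(T)^(2n/3) for an n-qubit gate."""
    return 1.0 - f_avg_t ** (2 * n_qubits / 3)


F_T = 0.996  # illustrative placeholder, NOT the Table 1 value
for name, n in (("T", 1), ("CNOT", 2), ("Toffoli", 3)):
    print(f"{name}: {100 * spam_error(F_T, n):.2f}%")
```

With this placeholder the estimate grows from roughly 0.3% for one qubit to roughly 0.8% for three, which is the trend quoted in the text.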

Variational quantum eigensolver

We illustrate gate-based computation possibilities by implementing a VQE algorithm to compute the ground-state energies of an H2 molecule; VQE exploits the variational principle stating that, given a Hamiltonian \(\hat{{{{\mathcal{H}}}}}\) and an ansatz wavefunction \(\left\vert \psi (\overrightarrow{\theta })\right\rangle\) parameterized by \(\overrightarrow{\theta }\), the ground-state energy associated with \(\hat{{{{\mathcal{H}}}}}\) satisfies \({E}_{0}\le \left\langle \psi (\overrightarrow{\theta })\right\vert \hat{{{{\mathcal{H}}}}}\left\vert \psi (\overrightarrow{\theta })\right\rangle\) (ref. 40). In this context, VQE explores the state space by minimizing the energy to find a chemically accurate approximation of E0. Reaching chemical accuracy (defined as obtaining a result within ±0.0016 Ha of the theoretical value) is critical for making realistic chemical predictions.

We build the fermionic Hamiltonian for H2 using the symmetry-conserving Bravyi–Kitaev transformation41, which is available via the OpenFermion42 Python package (see Methods). Symmetry allows reduction of the problem to the effective Hamiltonian \({\hat{{{{\mathcal{H}}}}}}_{{{{\rm{qubit}}}}}\), which acts on two qubits expressed in the standard Pauli basis (𝕀, X, Y and Z),

$${\hat{{{{\mathcal{H}}}}}}_{{{{\rm{qubit}}}}}(r)=\alpha {\mathbb{I}}{\mathbb{I}}+\beta Z{\mathbb{I}}+\gamma {\mathbb{I}}Z+\delta ZZ+\mu XX$$
(1)

with real parameters α, β, γ, δ and μ, which depend on the choice of bond length r. We create ansatz states \(\left\vert \psi (\overrightarrow{\theta })\right\rangle\) made of two path-encoded qubits using single-qubit operations \(R({\overrightarrow{\theta }}_{i})\) and an entangling post-selected controlled NOT (CNOT) gate (see Fig. 2a). To make a chemically accurate prediction of E0, we first find the optimal bond length (ropt), which is the one that corresponds to the lowest energy by varying r between 0.2 and 2.05 Å.
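The energy in equation (1) can be estimated from two measurement settings, since the ZI, IZ and ZZ terms are all available from Z-basis counts and the XX term from X-basis counts. The sketch below assumes counts are stored as dictionaries keyed by two-character bitstrings and that bit '0' maps to eigenvalue +1; the coefficient values are placeholders, not the H2 coefficients.

```python
def parity_expectation(counts, qubits):
    """Expectation of the product of +/-1 outcomes on `qubits` ('0' -> +1, '1' -> -1)."""
    total = sum(counts.values())
    signed = 0
    for bits, count in counts.items():
        odd = sum(bits[q] == "1" for q in qubits) % 2
        signed += -count if odd else count
    return signed / total


def energy(coeffs, z_counts, x_counts):
    """Evaluate alpha*II + beta*ZI + gamma*IZ + delta*ZZ + mu*XX from counts."""
    alpha, beta, gamma, delta, mu = coeffs
    return (alpha
            + beta * parity_expectation(z_counts, [0])
            + gamma * parity_expectation(z_counts, [1])
            + delta * parity_expectation(z_counts, [0, 1])
            + mu * parity_expectation(x_counts, [0, 1]))


coeffs = (0.1, 0.2, 0.3, 0.4, 0.5)   # placeholder coefficients
z_counts = {"01": 100}               # Z-basis outcomes
x_counts = {"00": 50, "11": 50}      # outcomes after Hadamards on both qubits
print(energy(coeffs, z_counts, x_counts))
```

In a variational loop, a classical optimizer would call such a function once per parameter update and compare the converged value with the reference energy using the ±0.0016 Ha chemical-accuracy window.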

Fig. 2: Computations of Ascella.

a, Gate-based computation. Hybrid VQE. On Ascella, the single-qubit gates \(R({\overrightarrow{\theta }}_{i})\), together with a CNOT gate, create an ansatz two-qubit state. We then measure in the Z-basis (using the identity gate 𝕀) or in the X-basis (using the Hadamard gate H). The output counts (grey arrows) are sent to a classical processor, which reconstructs the corresponding energy and implements a feedback loop to update the single-qubit gate angles \({\overrightarrow{\theta }}_{i}\) via a gradient-free optimizer, and find an ansatz closer to the ground state. Each iteration on the QPU takes about 22 s (including 14 s of QPU time and classical communication to the cloud). Chemical accuracy was reached for ropt = 0.75 Å. Each data point corresponds to 10,000 post-processed two-photon coincidence samples, except for ropt = 0.75 Å, which corresponds to 400,000. The error bars correspond to ±1 s.d. of the photon counting statistics. b,c, Photon-native computation. b, Left: classification task using a quantum neural network and using the pseudo photon number resolving (PPNR) capabilities of the photonic circuit. Right: confusion matrices for the classification of the IRIS dataset on Ascella. The accuracy for the training and test sets is 0.92 and 0.95, respectively; 50,000 samples were used for each experimental iteration. c, Left: six single-photon boson sampling. Measured (top) and modelled (bottom) six-photon output distributions for the input state \(\left\vert 101010101010\right\rangle\). The 924 six-photon outcomes are canonically ordered from \(\left\vert 000000111111\right\rangle\) to \(\left\vert 111111000000\right\rangle\). Right: discrimination between boson sampling and uniform sampling hypothesis using the Aaronson and Arkhipov (A&A) counter, and between boson sampling and distinguishable sampling hypothesis using the likelihood ratio counter. The value of each discriminator is updated every \({10}^{9}\) samples, which corresponds to 20 six-photon events.
In both cases, a positive slope validates the test (see Methods).

The expectation value of \({\hat{{{{\mathcal{H}}}}}}_{{{{\rm{qubit}}}}}(r)\) on \(\left\vert \psi (\overrightarrow{\theta })\right\rangle\) is obtained from the weighted averages of 10,000 post-processed two-photon samples, giving an accuracy of ±0.01 Ha. The classical processor then evaluates a loss function by using a gradient-free optimizer based on expectation values obtained from Ascella, and corrected using an error-mitigation scheme inspired by ref. 43. Then \(\overrightarrow{\theta }\) is updated classically in a feedback loop between Ascella and a classical processor to reach lower and lower energies. We then make an additional experimental run at ropt with 400,000 post-processed two-photon samples to obtain an accuracy of ±0.00158 Ha on the ground-state energy associated with \({\hat{{{{\mathcal{H}}}}}}_{{{{\rm{qubit}}}}}({r}^{{{\;{\rm{opt}}}}})\). We compare this value with E0 to confirm that we have reached chemical accuracy. In the two steps highlighted above, we use error mitigation to compute the minimal energies of \({\hat{{{{\mathcal{H}}}}}}_{{{{\rm{qubit}}}}}(r)\) (see Supplementary Section 8). For any initial random parameters and bond lengths, the algorithm consistently converges to the theoretical eigenvalues in 50 to 100 iterations (see Fig. 2a). The entire experiment per bond length is approximately four times faster than past photonic VQE experiments of a system with the same number of degrees of freedom40. At fixed initial conditions and ropt, chemical accuracy was achieved with a success probability of 93%, with greater accuracy than recent photonic VQE experiments43. These two improvements are due to higher-quality single-photon sources and chip control. Note that the accuracy is on par with VQE experiments on superconducting qubits44,45 and ion traps46,47, and is reached by using a photonic platform.
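The two quoted accuracies are consistent with plain counting statistics. Assuming the statistical uncertainty scales as the inverse square root of the number of samples (an assumption of this sketch, not a statement about the authors' error analysis), going from 10,000 to 400,000 samples shrinks ±0.01 Ha by a factor of √40.

```python
import math

CHEMICAL_ACCURACY_HA = 0.0016  # window quoted in the text


def scaled_uncertainty(sigma_ref, n_ref, n_new):
    """Rescale a statistical uncertainty assuming 1/sqrt(N) behaviour."""
    return sigma_ref * math.sqrt(n_ref / n_new)


sigma = scaled_uncertainty(0.01, 10_000, 400_000)
print(f"{sigma:.5f} Ha, within chemical accuracy: {sigma <= CHEMICAL_ACCURACY_HA}")
```

This gives about 0.00158 Ha, just inside the chemical-accuracy window, which suggests why 400,000 samples were chosen for the final run.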

Photon-native quantum computation

We now demonstrate the operation of Ascella in its native photonic framework, where the information is directly processed through photonic quantum interferences in chosen unitary transformations and detection.

Photon-based quantum neural network

We train a quantum neural network48 on Ascella for a supervised learning classification task. We build a variational quantum algorithm in which, taking inspiration from ref. 49, we use a native photonic ansatz. We perform multiclass classification on the well-known IRIS dataset50. To the best of our knowledge, this is the first experimental implementation of a variational quantum classifier with single photons; we refer to ref. 51 for a realization on a superconducting platform and to ref. 52 for a two-photon classifier based on kernel methods. Following our photon-native approach, we design the ansatz of the variational algorithm directly using the beamsplitters and phase shifters on five modes of Ascella, in which we input three photons. We also implement partial pseudo photon-number resolution by exploiting four extra modes of the chip.

We train the model using a see-saw optimization between the chip parameters and the output state parameters that define the measurement observable. Each iteration requires 112 experiments, one for each data point in the training set, and we gather 50,000 samples per run. A batch functionality in Perceval30 allows us to send all data points as one job to the server. Details on the ansatz and the training can be found in Methods and Supplementary Section 7. After about 15 iterations, we find an accuracy of 0.92 and 0.95 on the training and test sets, respectively. Figure 2b provides a summary of the model predictions versus actual labels as a confusion matrix.

Boson sampling with six single photons

Boson sampling is a sampling problem suited for demonstrating a quantum-over-classical advantage with optical quantum computing platforms53. The recent demonstrations of quantum advantage6,10 in the Gaussian boson sampling framework used squeezed light manipulated in free-space interferometers to limit optical losses. Genuine single-photon-based boson sampling has progressed poorly on integrated chips due to the low efficiency of heralded sources54,55,56,57. Here we demonstrate on-chip boson sampling for a record number of six photons with a fully reconfigurable interferometer. A 12 × 12 Haar-random unitary matrix is randomly chosen using the dedicated tool in Perceval. We record the threshold statistics of all N-photon coincidences (N ∈ ⟦1; 6⟧) and acquire in total \(340\times {10}^{9}\) samples, with a six-photon coincidence rate reduced by the strong bunching of photons in this sampling task down to 0.16 Hz.

To validate our experimental results, we discriminate our collected boson sampling statistics from the uniform58 and distinguishable59 sampler hypotheses. We also reconstruct the six-photon output distribution from the sampled data and compare it with the ideal output distribution corresponding to the chosen unitary matrix. Both distributions are plotted in Fig. 2c, from which we deduce a fidelity \(F={\sum }_{i}\sqrt{{p}_{i}{q}_{i}}\) and a total variation distance (TVD) \(D=\frac{1}{2}{\sum }_{i}| {p}_{i}-{q}_{i}|\), where {pi} and {qi} are the ideal and experimental output probability distributions, respectively, with i ∈ {1, … , 924} labelling the no-collision output configurations of the boson-sampling device53. We measure state-of-the-art values F = (0.97 ± 0.03) and D = (0.16 ± 0.02)22,60. Details on the measurement simulation with Perceval as well as boson sampling with k photons lost (k ∈ ⟦1; 4⟧) are given in Supplementary Section 6. Our experiment marks a demonstration of boson sampling with six single photons on an integrated photonic circuit (see Supplementary Table 4). In contrast with previous experimental demonstrations, our fully reconfigurable chip allows sampling from any target unitary matrix, a critical feature for proving a quantum-over-classical advantage in boson sampling.
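The two figures of merit defined above are straightforward to compute once both distributions are expressed over the same ordered list of outcomes. The sketch below is a minimal illustration with toy distributions; it also checks that six photons in 12 modes with at most one photon per mode give 924 configurations.

```python
import math


def fidelity(p, q):
    """F = sum_i sqrt(p_i * q_i) for two distributions on the same outcomes."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def total_variation_distance(p, q):
    """D = (1/2) * sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


n_outcomes = math.comb(12, 6)   # no-collision outcomes: 6 photons in 12 modes
uniform = [1.0 / n_outcomes] * n_outcomes

print(n_outcomes)
print(fidelity([0.5, 0.5], [1.0, 0.0]), total_variation_distance([0.5, 0.5], [1.0, 0.0]))
```

Identical distributions give F = 1 and D = 0, while distributions with disjoint support give F = 0 and D = 1, so the two measures bracket the same information from opposite ends.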

Discussion

Near-term improvements

The above results demonstrate the suitability of the architecture for near-term quantum computing tasks. In the short term, the record 4 Hz rate for six photons demonstrated here can be further pushed to 12 photons by optimizing each hardware component (see Supplementary Table 1). These optimizations could be pushed even further in the mid-term; for instance, the current single-photon source efficiency of 55% at the first lens can be brought to 96% (ref. 61). The number of modes in the photonic chip can be increased while reducing photon transmission loss16,17,57. Finally, it is anticipated that heterogeneous integration of the different components (source, chip, detectors) will drastically reduce interconnection losses. Regarding indistinguishability, our single-photon source technology has demonstrated ≥99.5% indistinguishability19, which would bring the two-qubit gate fidelities close to unity62. Altogether, these technological improvements will allow high-fidelity linear-optical computing protocols to be performed with dozens of photons.

Scaling

Beyond noisy intermediate-scale tasks, the current platform constitutes a step towards large-scale fault-tolerant quantum computing. By experimentally implementing quantum algorithms and protocols on such a platform, we have demonstrated key ingredients to scale up, specifically, high-fidelity multiphoton interference and entanglement generation on an integrated platform, which, moreover, is robust for continuous operation63. The next steps in reducing the overhead of probabilistic linear-optical protocols will involve shifting to a measurement-based paradigm that relies ultimately on the generation of large graph states12. Heralded three-photon GHZ states are a sufficient resource to build larger entangled states through type-II fusion13,15,64. This is the last demonstration we report on Ascella.

We use a scheme adapted from refs. 13,65 where three out of the six single photons are consumed to herald the generation of the three-photon state \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle =(\left\vert 000\right\rangle +\left\vert 111\right\rangle )/\sqrt{2}\).

Using the stabilizer operators of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\), we experimentally measure a fidelity of \({F}_{{{{{\rm{GHZ}}}}}_{3}^{+}}=0.82\pm 0.04\) (see Fig. 3 and Methods) providing a reference value and benchmark of heralded GHZ state generation.
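One standard way to obtain a GHZ fidelity from stabilizer measurements uses the fact that the projector onto the three-qubit GHZ state is the average of the eight elements of its stabilizer group, so the fidelity is the average of their expectation values. The sketch below illustrates that identity; it is not necessarily the exact estimator described in Methods, and the dictionary layout is hypothetical.

```python
# Non-trivial stabilizer group elements of (|000> + |111>)/sqrt(2), with signs.
GHZ3_STABILIZERS = {
    "ZZI": +1, "IZZ": +1, "ZIZ": +1,
    "XXX": +1, "XYY": -1, "YXY": -1, "YYX": -1,
}


def ghz3_fidelity(expectations):
    """Average of the 8 stabilizer expectation values (identity contributes 1)."""
    signed_sum = sum(sign * expectations[pauli]
                     for pauli, sign in GHZ3_STABILIZERS.items())
    return (1.0 + signed_sum) / 8.0


ideal = {"ZZI": 1, "IZZ": 1, "ZIZ": 1, "XXX": 1, "XYY": -1, "YXY": -1, "YYX": -1}
dephased = {"ZZI": 1, "IZZ": 1, "ZIZ": 1, "XXX": 0, "XYY": 0, "YXY": 0, "YYX": 0}
print(ghz3_fidelity(ideal), ghz3_fidelity(dephased))
```

A fully dephased mixture of the 000 and 111 components gives 0.5 under this estimator, so a fidelity above 0.5 requires coherence between the two components.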

Fig. 3: Heralded generation of three-photon GHZ states.

Measured expectation values of the stabilizing operators of the heralded three-photon GHZ state \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\) yielding a fidelity of \({F}_{{{{{\rm{GHZ}}}}}_{3}^{+}}=0.82\pm 0.04\). The expectation values are computed from 500 samples. Error bars correspond to ±1 s.d. of the photon counting statistics.

Finally, our recent demonstration of efficient generation of linear cluster states directly from the same quantum dot source technology23—combined with ingredients demonstrated by this platform—could lead to additional reductions in hardware resource overheads required for fault-tolerance66.

Methods

Architecture

Ascella is accessible remotely via a cloud service25. Tasks can be dispatched either to Ascella, to a perfect simulator, or to a noisy simulator through a generic scheduler handling multiple users' access limitations and task prioritization. Following a compilation and transpilation process, Ascella then sets the demultiplexer configuration and the photonic circuit phases to apply the required unitary matrix to the input state. For applications such as quantum machine learning, for which each training data sample corresponds to a task, users can prepare and send a batch of tasks that will execute sequentially on the QPU with fast incremental chip reconfiguration and without any communication overhead.

Single-photon source

The single-photon source is a single InAs quantum dot deterministically embedded in the centre of a micropillar19. The sample was fabricated from a wafer grown by molecular beam epitaxy and is composed of a λ-cavity and two distributed Bragg reflectors made of GaAs/Al0.95Ga0.05As λ/4 layers with 36 (18) pairs for the bottom (top); the top (bottom) distributed Bragg reflectors are gradually p(n)-doped and electrically contacted.

Photonic circuit

The photonic integrated circuit is a 12-mode universal Clements-style interferometer, including 126 phase shifters and 132 3 dB directional couplers. The chip size is 20 × 20 mm². The circuit is etched on stoichiometric silicon nitride waveguides with the Triplex technology designed for 925 nm wavelength. Further details on the system and fabrication technology can be found in refs. 28,68.

The rectangular universal interferometer layout (see Fig. 1a) is ideally based on balanced directional couplers (that is, 50% reflectivity). Experimentally, we observe reflectivities with average values of 56.7(6)% for our chip at the operation wavelength of 928 nm. The random error stems from the fabrication tolerance and the systematic error from the wavelength dependency. These errors reduce the range of implementable 12 × 12 unitary matrices69,70 and, if not compensated for, affect the fidelity of the implemented unitary matrix to the target unitary matrix. To address these limitations, we designed a custom compilation and transpilation process that converts user-provided photonic circuits, unitary matrices or gate-based circuits with high fidelity into interferometer phase shift values (compilation) and then into voltages to apply on the chip phase shifters (transpilation). The compilation process features a global optimization step executed on a CPU that simultaneously fine-tunes all the phase shifts to enhance matrix fidelity, thereby mitigating the impact of reflectivity errors. Subsequently, the transpilation process calculates the voltages to apply on the chip phase shifters while compensating for thermal cross-talk. To achieve this, it solves the phase-voltage relation \(\overrightarrow{\phi }=A{\overrightarrow{V}}{\,}^{\odot 2}+\overrightarrow{b}\) that models the behaviour of the thermo-optic phase shifters, where the vector \(\overrightarrow{\phi }\) contains all 126 physical phase shifts, \(\overrightarrow{V}\) the 126 applied voltages and \({\,}^{\odot 2}\) represents element-wise squaring. Off-diagonal elements of the 126 × 126 matrix A represent thermal cross-talk between phase shifters. To find suitable values for A and \(\overrightarrow{b}\), we engineered a machine learning-based photonic chip characterization process that optimizes their values, constituting more than 16,000 free parameters to determine.
The same process also estimates individual directional coupler reflectivities and relative output losses (see Supplementary Section 4 for values). This process offers a seven-fold improvement on the transpilation (\(\overrightarrow{\phi }\) to \(\overrightarrow{V}\) process) over more standard characterization techniques involving interference-fringe measurements.
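Once A and b are known, the phase-voltage relation is linear in the squared voltages, so the transpilation step amounts to solving a linear system and taking element-wise square roots. The sketch below illustrates this on a toy three-shifter chip with made-up values for A and b; the helper names are hypothetical and the plain Gaussian elimination stands in for whatever solver is used in practice.

```python
import math


def solve_linear(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def phases_from_voltages(a, b, v):
    """Forward model: phi = A (V element-wise squared) + b."""
    return [sum(a_ij * v_j ** 2 for a_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(a, b)]


def voltages_from_phases(a, b, phi):
    """Invert the model; off-diagonal entries of A account for cross-talk."""
    v_squared = solve_linear(a, [p - b_i for p, b_i in zip(phi, b)])
    if min(v_squared) < 0:
        raise ValueError("target phases need a 2*pi offset to be reachable")
    return [math.sqrt(x) for x in v_squared]


# Toy values (assumptions): rad/V^2 couplings with weak nearest-neighbour cross-talk.
A = [[0.10, 0.01, 0.00], [0.01, 0.12, 0.01], [0.00, 0.01, 0.11]]
b = [0.2, 0.1, 0.3]
target_phi = phases_from_voltages(A, b, [3.0, 5.0, 4.0])
print(voltages_from_phases(A, b, target_phi))
```

Inverting each shifter independently using only the diagonal of A would leave residual phase errors set by the off-diagonal terms, which is the cross-talk the full solve compensates for.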

We benchmark the transpilation process by applying 300 random phase configurations on the photonic chip and measuring the photon count rates at the 12 outputs. We compare them to a simulation of the chip that takes into account the estimated directional coupler reflectivities and relative output losses (see Supplementary Section 4 for values). We quantify the difference between the measured and simulated values using the TVD. At 925 nm, with a standard characterization of the chip based on interference-fringe measurements28, the TVD evaluated on the configurations is (21 ± 11)%, where the error bar is the standard deviation of the dataset. At the operating wavelength of our single-photon source (928 nm), with our machine learning process, we achieve a TVD of (3.0 ± 1.3)%, greatly improving our control over the chip. The relative variation on the obtained average TVD between successive benchmarkings is of the order of 3%, showing repeatability of the obtained value. The full compilation and transpilation processes achieve an average fidelity of F = 99.7 ± 0.08% following the fidelity evaluation procedure from ref. 28.

Variational quantum eigensolver

The ansatz for the VQE algorithm implements the gate-based circuit shown in Fig. 2a, which consists of a generic two-qubit state generator. It comprises single-qubit rotations and a CNOT gate62. This is implemented on six modes (modes 1 to 6) comprising two path-encoded qubits and two extra modes for the postselected Ralph CNOT. Arbitrary rotations are implemented via tunable Mach–Zehnder interferometers with thermo-optic phase shifters. Extra phase shifters are used to mitigate systematic errors in the reflectivity of beamsplitters and to converge faster to the ground state energy. On average, an entire VQE experiment for H2 takes about 30 min per bond length with 10,000 post-processed two-photon samples (that is, the time to construct Supplementary Fig. 15). This time comprises the actual experimental time and the classical pre- and post-processing. For the final run at ropt to reach chemical accuracy, gathering 400,000 post-processed two-photon samples for each iteration takes 4 h 30 min.

Boson sampling

Two statistical tests are used to discriminate the experimental data against the uniform sampler and distinguishable particle hypotheses, based on the A&A counter and the likelihood ratio counter, respectively. Each counter is increased or decreased according to a threshold test. The A&A counter A is defined as57,58

$${A}_{k}:=\,\left\{\begin{array}{l}{A}_{k-1}+1\,{{{\rm{if}}}}\,{{{\mathcal{P}}}}\ge {\left(\frac{n}{m}\right)}^{n}\\ {A}_{k-1}-1\,{{{\rm{if}}}}\,{{{\mathcal{P}}}} < {\left(\frac{n}{m}\right)}^{n}\end{array}\right.$$

where n and m are, respectively, the number of photons and optical modes, and \({{{\mathcal{P}}}}:={\prod }_{i}{\sum }_{j}| {U}_{ij}{| }^{2}\), where i labels the modes in which photons are detected, j the input modes and U is the unitary sampling matrix.
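The counter update can be sketched as follows. This is a minimal illustration under stated assumptions: U is stored as a nested list indexed as U[output][input], modes are numbered from 0, and the threshold (the right-hand side of the inequality above) is supplied by the caller. The function names, the toy matrix and the threshold value are hypothetical.

```python
def row_norm_estimator(U, detected_modes, input_modes):
    """P = prod_i sum_j |U[i][j]|^2 over detected outputs i and inputs j."""
    P = 1.0
    for i in detected_modes:
        P *= sum(abs(U[i][j]) ** 2 for j in input_modes)
    return P


def update_counter(counter, statistic, threshold):
    """Increase the counter by 1 if statistic >= threshold, else decrease it."""
    return counter + 1 if statistic >= threshold else counter - 1


# Toy example: a 3-mode identity "interferometer" with photons in modes 0 and 1.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
threshold = 0.5  # hypothetical value, to be replaced by the threshold defined above
A = 0
for event in ([0, 1], [0, 2]):  # detected output modes of two hypothetical events
    A = update_counter(A, row_norm_estimator(identity, event, [0, 1]), threshold)
```

The estimator only needs row norms of U, so it is cheap to evaluate for every detected event, unlike tests that require permanents.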

The likelihood ratio counter C is defined as57,59

$${C}_{k}:=\,\left\{\begin{array}{l}{C}_{k-1}+1\,{{{\rm{if}}}}\,{{{\mathcal{L}}}}\ge {\left(\frac{n}{m}\right)}^{2}\\ {C}_{k-1}-1\,{{{\rm{if}}}}\,{{{\mathcal{L}}}} < {\left(\frac{n}{m}\right)}^{2}\end{array}\right.$$

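where \({{{\mathcal{L}}}}:=\frac{q}{p}\) with \(q:=| {{{\rm{Perm}}}}({U}^{(ij)}){| }^{2}\), \(p:={{{\rm{Perm}}}}(| {U}^{(ij)}{| }^{2})\) (the modulus squared being taken element-wise) and \({U}^{(ij)}\) denoting the sub-matrix of U restricted to the input labels i and output labels j.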
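For small photon numbers, the ratio can be evaluated directly. The sketch below is illustrative only: it assumes collision-free events, stores U as a nested list indexed as U[output][input], and computes the permanent by brute-force summation over permutations, which scales factorially and is only practical for a few photons. The function names and the beamsplitter example are hypothetical.

```python
from itertools import permutations


def permanent(M):
    """Permanent of a square matrix by direct summation over permutations."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        term = 1
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total


def likelihood_ratio(U, input_modes, output_modes):
    """L = q / p for one collision-free detection event."""
    sub = [[U[o][i] for i in input_modes] for o in output_modes]
    q = abs(permanent(sub)) ** 2  # indistinguishable photons
    p = permanent([[abs(x) ** 2 for x in row] for row in sub])  # distinguishable
    if p == 0:
        raise ValueError("event has zero probability for distinguishable photons")
    return q / p


# Balanced beamsplitter: a coincidence is forbidden for indistinguishable
# photons (Hong-Ou-Mandel effect), so the ratio vanishes for this event.
s = 2 ** -0.5
beamsplitter = [[s, s], [s, -s]]
ratio = likelihood_ratio(beamsplitter, [0, 1], [0, 1])
```

The permanent is invariant under transposition, so the result does not depend on whether rows are taken to label inputs or outputs.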

Photon-based quantum neural network

We build the ansatz of our variational quantum classifier using modes 3 to 7 of Ascella. We input three photons into the chip, in modes 3, 5 and 7. We use 32 of the reconfigurable thermo-optic phase shifters as the variational parameters, and 4 phase shifters in the middle of the chip for the data encoding. We use extra modes for pseudo photon-number resolution: by setting four phase shifters to π/2 in the final layer of the chip, we redirect a portion of the photons from modes 3 and 7 into modes 1, 2 and 8, 9 respectively. For the classical optimization process, we use a see-saw approach based on Gaussian processes and Nelder–Mead optimizers. More details regarding the circuit ansatz, model definition, pseudo photon-number resolution and the optimization methods are in Supplementary Section 7.

Heralded three-photon GHZ on-chip generation

The generation of a path-encoded three-photon GHZ state is characterized with three reconfigurable integrated Mach–Zehnder interferometers (\({{{{\rm{MZI}}}}}_{i}\), i = 1, 2, 3). The layout of the optical circuit is provided in Supplementary Section 9. When six photons are measured with threshold detectors, the output state of the circuit is given by65

$$\begin{array}{ll}\left\vert {{{\rm{Out}}}}\right\rangle &=\frac{1}{16}\left(-\left\vert {{{{\rm{GHZ}}}}}_{3}^{-}\right\rangle \left\vert {h}_{1}\right\rangle +\left\vert {{{{\rm{GHZ}}}}}_{3}^{-}\right\rangle \left[\left\vert {h}_{4}\right\rangle +\left\vert {h}_{6}\right\rangle +\left\vert {h}_{7}\right\rangle \right]\right.\\ &\left.-i\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle \left\vert {h}_{8}\right\rangle +i\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle \left[\left\vert {h}_{2}\right\rangle +\left\vert {h}_{3}\right\rangle +\left\vert {h}_{5}\right\rangle \right]\right).\end{array}$$

We target the state \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\), where \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{\pm }\right\rangle =(\left\vert 000\right\rangle \pm \left\vert 111\right\rangle )/\sqrt{2}\), which is obtained by conditioning the analysis of the state on the detection of one of the heralding states \(\left\vert {h}_{2}\right\rangle\), \(\left\vert {h}_{3}\right\rangle\), \(\left\vert {h}_{5}\right\rangle\) and \(\left\vert {h}_{8}\right\rangle\).

The heralding channels signal the generation of a specific GHZ state. The heralding conditions for the generation of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\) are

$$\left\{\begin{array}{rcl}\left\vert {h}_{2}\right\rangle &=&\left\vert {0}_{2}{1}_{3}{0}_{4}{1}_{7}{1}_{8}{0}_{9}\right\rangle \\ \left\vert {h}_{3}\right\rangle &=&\left\vert {1}_{2}{0}_{3}{0}_{4}{1}_{7}{0}_{8}{1}_{9}\right\rangle \\ \left\vert {h}_{5}\right\rangle &=&\left\vert {1}_{2}{0}_{3}{1}_{4}{0}_{7}{1}_{8}{0}_{9}\right\rangle \\ \left\vert {h}_{8}\right\rangle &=&\left\vert {0}_{2}{1}_{3}{1}_{4}{0}_{7}{0}_{8}{1}_{9}\right\rangle \end{array}\right.$$

where \(\left\vert {0}_{i}\right\rangle \,(\left\vert {1}_{i}\right\rangle )\) corresponds to detecting 0 (1) photons in mode i (modes are labelled from 1 to 12 from top to bottom).
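In data analysis, these conditions amount to a lookup on the click pattern of the six heralding modes. The sketch below is a hypothetical illustration: the patterns are transcribed from the conditions above, whereas the function name and the example events (including the clicks in non-heralding modes) are assumptions.

```python
HERALD_MODES = (2, 3, 4, 7, 8, 9)

# Click patterns (0 = no click, 1 = click) on HERALD_MODES, in that order.
GHZ_PLUS_HERALDS = {
    "h2": (0, 1, 0, 1, 1, 0),
    "h3": (1, 0, 0, 1, 0, 1),
    "h5": (1, 0, 1, 0, 1, 0),
    "h8": (0, 1, 1, 0, 0, 1),
}


def identify_herald(clicked_modes):
    """Return the herald label matching the clicks in the heralding modes, or None."""
    pattern = tuple(1 if mode in clicked_modes else 0 for mode in HERALD_MODES)
    for label, reference in GHZ_PLUS_HERALDS.items():
        if pattern == reference:
            return label
    return None


# Hypothetical six-fold events, given as sets of modes (1 to 12) that clicked.
accepted = identify_herald({1, 3, 5, 7, 8, 10})  # matches the h2 pattern
rejected = identify_herald({2, 3, 4, 5, 6, 10})  # no GHZ+ herald pattern
```

Only events for which a label is returned would be kept for the analysis of the target state.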

The state \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\) is a stabilizer state, and can therefore be uniquely expressed in terms of its stabilizers71:

$$\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle \left\langle {{{{\rm{GHZ}}}}}_{3}^{+}\right\vert =\mathop{\sum}\limits_{{S}_{i}\in {{{\mathcal{S}}}}}\frac{1}{| {{{\mathcal{S}}}}| }{S}_{i},$$
(2)

where Si is a stabilizer of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\), \({{{\mathcal{S}}}}\) is the stabilizer group of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\) and \(| {{{\mathcal{S}}}}|\) is the number of elements of \({{{\mathcal{S}}}}\). The fidelity of an experimental implementation ρ of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle\) is given by

$${F}_{{{{{\rm{GHZ}}}}}_{3}^{+}}={{{\rm{Tr}}}}\Big(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle \left\langle {{{{\rm{GHZ}}}}}_{3}^{+}\right\vert \rho\Big).$$

Plugging the expansion of \(\left\vert {{{{\rm{GHZ}}}}}_{3}^{+}\right\rangle \left\langle {{{{\rm{GHZ}}}}}_{3}^{+}\right\vert\) into \({F}_{{{{{\rm{GHZ}}}}}_{3}^{+}}\) and using linearity of the trace, one obtains

$${F}_{{{{{\rm{GHZ}}}}}_{3}^{+}}=\frac{1}{| {{{\mathcal{S}}}}| }\mathop{\sum}\limits_{{S}_{i}\in {{{\mathcal{S}}}}}{{{\rm{Tr}}}}({S}_{i}\rho )=\frac{1}{| {{{\mathcal{S}}}}| }\mathop{\sum}\limits_{{S}_{i}\in {{{\mathcal{S}}}}}\langle {S}_{i}\rangle .$$

The fidelity of the heralded state to the target state is characterized on Ascella using this equation, where Si ∈ {III, XXX, ZZI, IZZ, ZIZ, −YYX, −XYY, −YXY} are the stabilizer operators of the target state, whose expectation values are experimentally accessed through measurements of the three-qubit operators XXX, ZZZ, YYX, XYY and YXY. Note that after the submission of this manuscript, two works reported on the heralded generation of three-photon GHZ states72,73.
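The stabilizer expression for the fidelity can be checked numerically on ideal pure states. The sketch below is an illustration under stated assumptions, not the experimental analysis: expectation values are computed from a state vector rather than estimated from measured counts, qubit 1 is taken as the most significant bit of the basis index, and all function names are hypothetical.

```python
# Signed stabilizers of (|000> + |111>)/sqrt(2), as listed in the text.
STABILIZERS = [(1, "III"), (1, "XXX"), (1, "ZZI"), (1, "IZZ"), (1, "ZIZ"),
               (-1, "YYX"), (-1, "XYY"), (-1, "YXY")]


def apply_pauli(pauli, state):
    """Apply a Pauli string such as 'XYZ' to a state vector of 2**n amplitudes."""
    n = len(pauli)
    out = [0j] * len(state)
    for index, amplitude in enumerate(state):
        new_index, phase = index, 1 + 0j
        for k, op in enumerate(pauli):
            shift = n - 1 - k  # qubit k is the k-th most significant bit
            bit = (index >> shift) & 1
            if op in "XY":
                new_index ^= 1 << shift  # X and Y flip the qubit
            if op == "Z" and bit:
                phase = -phase  # Z|1> = -|1>
            if op == "Y":
                phase *= 1j if bit == 0 else -1j  # Y|0> = i|1>, Y|1> = -i|0>
        out[new_index] += phase * amplitude
    return out


def expectation(pauli, state):
    """Real expectation value <state|P|state> of a Pauli string P."""
    transformed = apply_pauli(pauli, state)
    return sum((a.conjugate() * b).real
               for a, b in zip(map(complex, state), transformed))


def stabilizer_fidelity(state):
    """Fidelity to GHZ+ as the average of the signed stabilizer expectations."""
    return sum(sign * expectation(p, state)
               for sign, p in STABILIZERS) / len(STABILIZERS)


s = 2 ** -0.5
ghz_plus = [s, 0, 0, 0, 0, 0, 0, s]
ghz_minus = [s, 0, 0, 0, 0, 0, 0, -s]
ket_000 = [1, 0, 0, 0, 0, 0, 0, 0]
```

For these inputs the function reproduces the squared overlap with the target state: 1 for `ghz_plus`, 0 for `ghz_minus` and 0.5 for `ket_000`, up to floating-point rounding. In an experiment, each expectation value would instead be estimated from coincidence counts in the corresponding measurement basis.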