Introduction

Microbial methane was an important source of greenhouse gases that kept the early Earth warm before the Great Oxidation Event (GOE, 2.4 Ga) under the faint Sun1. The generation of microbial methane, as a significant step of the global carbon cycle, is reflected in the stable isotopic compositions of organic matter. However, finding and interpreting records that span billions of years are challenging2,3. Over the past two decades, methane and other short-chain alkane gases discovered in ancient shields have shown broad variations in hydrogen isotopic compositions (δD), but relative uniformity in carbon isotopic compositions (δ13C)4,5. The first reported example is from the Kidd Creek mines in the Neoarchaean Abitibi Greenstone Belt of the Canadian Shield near Timmins, Ontario, Canada4. In this area, multiple cycles of sedimentation, volcanic activities, magma intrusions, and serpentinization of mafic rocks occurred ~2.7 Ga6. A Neoarchaean ecosystem has been indicated by biomarker evidence7. The Kidd Creek mines produce metal sulfides associated with underground felsic volcanic rocks8; alkane gases (from methane to pentanes) have been discovered at depths of 2000–3000 m in exploration boreholes4. Reported gas contents and isotopic compositions show, in addition to the above δD and δ13C patterns, an exponential distribution of alkane abundance and deuterium–deuterium clumped isotopic composition of methane (Δ12CH2D2) between −10 and 5‰ (for samples collected <400 days since the borehole was drilled)4,5,9,10,11.

Upon its discovery, the Kidd Creek gas was interpreted as an abiotic product of Fischer–Tropsch (FT) synthesis because its δD and δ13C patterns differ from those found in alkanes generated from the decomposition of organic matter4. However, these patterns have been found neither in hydrothermal gas, which is commonly believed to be abiotic12, nor in artificial FT synthetic gas13. Therefore, the abiotic interpretation is empirical and lacks a chemical rationale, especially given that the feasibility of this synthesis under geological conditions has been questioned in recent years14,15,16. Moreover, the abiotic interpretation overlooked the possibility that the unique isotopic distributions formed in a distinct oxygen-depleted Archaean ecosystem.

In this work, we present evidence that the Kidd Creek gas is a microbial product of a Neoarchaean ecosystem. From the deuterium distribution in short alkanes, we find that the hydrogen atoms in each short alkane molecule come from two sources: two capping H atoms are donated by serpentinization-derived H2, while the rest are inherited from long alkyl chains. The distributions of alkane abundances and of carbon, hydrogen, and clumped isotopic compositions suggest that the gas was generated through hydrogen biodegradation of sedimentary organic matter. The Kidd Creek gas is a record of microbial greenhouse gas generation before the GOE.

Results and discussion

Identification of alkane precursors

Figure 1a compares the δ13C and δD of alkanes in the Kidd Creek gas, typical thermogenic gases17,18, and gases from a hydrothermal field12. The contrasting δ13C and δD variations of alkanes in the Kidd Creek gas (narrow for δ13C, wide for δD) indicate that the hydrogen isotopic distribution is unlikely to be due to a pure kinetic isotope effect (KIE) during its formation. Instead, the wide δD variation implies that the hydrogen atoms likely have multiple sources with distinct δD values. The variation suggests a reaction involving the conversion of organic matter to short-chain alkanes in the presence of H2, a hydrogen degradation (or hydrogenolysis) reaction. The exponential distribution of alkane gas abundance with respect to carbon number (Fig. 1b), consistent with a random scission of long alkyl chains19, also supports the hydrogenolysis explanation. We found a linear correlation between the δD values and the reciprocals of the hydrogen numbers (Fig. 1c), which particularly supports the mechanism of short-chain alkane generation from hydrogenolysis of long alkyl chains, as explained in the following paragraphs.

Fig. 1: Distinct geochemical signatures of the Kidd Creek gas.

a δ13C versus δD. b Distribution of alkane abundances (relative to CH4). c δD against the reciprocal of hydrogen number of alkanes. Blue lines—Kidd Creek gas (depth > 2500 m)4,5; violet—hydrothermal field gas data from the Lost City of Mid-Atlantic Ridge12, black, grey, and white triangles stand for methane, ethane, and propane, respectively; pink—thermogenic gas data from NW Australia Shelf17; red—thermogenic gas data from NW Sichuan Basin, China18; black dashed lines—linear trend lines of the Kidd Creek gas.

During hydrogenolysis, long alkyl chains (straight or branched) are consumed segment by segment consecutively. Each generated short-chain alkane molecule may contain 1, 2, or 3 capping hydrogen atoms from H2 depending on the positions of broken C–C bonds (Fig. 2a). Most alkane molecules produced from hydrogenolysis have two capping hydrogen atoms from H2 because secondary carbon atoms are significantly more abundant than primary carbon atoms in a long alkyl chain, especially when one or both ends of the alkyl chain are bonded to a less degradable carbon ring or kerogen. For the degradation of branched C–C chains (in isoprenoid structures), the amount of methane generated from branched primary carbons is equivalent to the amount generated from tertiary carbons; the average number of H atoms from H2 is still 2. As a result, the hydrogen isotopic composition of the generated alkane CnH2n+2 is expressed as

$$\delta \mathrm{D}_{\mathrm{C}_{n}\mathrm{H}_{2n+2}}=\frac{2n}{2n+2}\delta \mathrm{D}_{\mathrm{A}}+\frac{2}{2n+2}\delta \mathrm{D}_{\mathrm{B}}=\delta \mathrm{D}_{\mathrm{A}}+\frac{2(\delta \mathrm{D}_{\mathrm{B}}-\delta \mathrm{D}_{\mathrm{A}})}{2n+2}$$
(1)
Fig. 2: Numerical simulation on the distributions of abundance and δ13C of short alkanes from hydrogen biodegradation of long alkyl chains.

a Reaction scheme showing that the number of carbon and hydrogen atoms contributed by the alkyl chain and H2 to the gaseous alkanes depends on C–C cleavage positions. Brown, blue, and red represent cases of 1, 2, and 3 capping hydrogen atoms donated by H2, respectively. b Numerical simulation on the abundances of gaseous alkanes relative to CH4. c Numerical simulation on δ13C of gaseous alkanes relative to δ13CnC4H10. Legend of b and c: black lines show the Kidd Creek gas (samples with both δ13C and gas content data)4,5; blue, red, and gradient colours in between show modelled results (Eqs. 4–7) with conversion (fraction of CH4-contained 12C atoms in all 12C atoms) labelled in panel b. KIE parameters (identical for hydrogenolyses on kerogen side chains and on alkanes): primary 13C KIE of 13k/12k = 1.0015 (inverse KIE); secondary 13C KIE of 13k/12k = 0.9978 (normal KIE).

In Eq. (1), subscripts A and B refer to hydrogen atoms from the alkyl chain and hydrogen substrate, respectively. Equation (1) shows a linear relationship between \(\delta \mathrm{D}_{\mathrm{C}_{n}\mathrm{H}_{2n+2}}\) and the reciprocals of hydrogen numbers 1/(2n + 2). Similar analysis has successfully explained the linear variation between δ13C and 1/n in thermogenic alkanes20.

A linear covariation with a high coefficient of determination (R2 = 0.98) indeed exists in the Kidd Creek gas (Fig. 1c), yielding a trend line with a y-intercept of δDA = −126‰ and a slope of 2(δDB − δDA) = −1130‰ (hence δDB = −691‰). The δDA and δDB values are close to the δD values of marine organic matter (−80 to −170‰)21 and serpentinization-derived H2 in the Kidd Creek gas (~−730‰)22, respectively. Therefore, the reaction between the long alkyl chains in organic matter and serpentinization-derived H2 explains the δD isotopic pattern of the Neoarchaean alkane gas. The slight difference of δDB > δDH2 indicates a weak and inverse deuterium KIE (DKIE) during some reaction steps (an inverse KIE means that the substitution of a light isotope by a heavy isotope accelerates the chemical reaction rate). The linear trends with high coefficients of determination in the Kidd Creek gas (Fig. 1b, c) indicate that the interference from other reactions is minimal. These reactions, including the thermal cracking of kerogen, oxidation of methane, and hydrogenolysis of non-alkyl organic precursors (e.g. carbohydrates, which are abundant in living organic matter), yield neither an exponential distribution of short-chain alkane abundance nor the above hydrogen isotope distribution.
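As a quick numerical check of Eq. (1), the sketch below evaluates the predicted δD of C1 to C4 alkanes from the two fitted end members (δDA = −126‰, δDB = −691‰). It assumes exactly two capping hydrogen atoms per molecule and no isotope effect; the function name `delta_d_alkane` is illustrative and does not come from the original work.

```python
def delta_d_alkane(n, delta_d_a=-126.0, delta_d_b=-691.0):
    """Eq. (1): delta-D (per mil) of CnH2n+2 built from 2n chain H and 2 capping H."""
    n_h = 2 * n + 2  # total number of hydrogen atoms in the alkane
    return delta_d_a + 2.0 * (delta_d_b - delta_d_a) / n_h


# Predicted values for methane through butane
predicted = {n: delta_d_alkane(n) for n in range(1, 5)}

for n, value in predicted.items():
    print(f"C{n}: 1/(2n+2) = {1 / (2 * n + 2):.3f}, dD = {value:.1f} per mil")
```

Plotting these values against 1/(2n + 2) gives a straight line whose intercept is δDA and whose slope is 2(δDB − δDA), which is the form of the trend line fitted in Fig. 1c.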

Microbial hydrogenolysis suggested by 13C and 12CH2D2

In contrast to the hydrogen atoms, which have two sources, all the carbon atoms in the Kidd Creek hydrocarbon gases originate from organic precursors. As a result, kinetic isotopic fractionation (instead of multiple sources) is the main reason for the δ13C variation between short-chain alkanes20. This variation is narrow in the Kidd Creek gas (Fig. 1a); therefore, the C–C cleavage has a weak 13C KIE, implying that the C–C cleavage is not the rate-determining step23 and is likely catalytic24. The coexistence of C2+ alkane gases (>10 vol%) with H2 also suggests a catalytic process. The reason is that a C–C bond (bond energy 346 kJ/mol) is weaker, and hence easier to cleave, than an H–H bond (bond energy 432 kJ/mol). If there were no catalytic processes to lower the activation barrier of H–H splitting, C2+ gases would be more depleted. Because abiotic catalysis requires dispersed elemental transition-metal catalysts free of thermal sintering and sulfur poisoning25, a demanding condition rarely satisfied in geological bodies, the catalytic process revealed here is most likely a microbial process.

The order of \(\delta^{13}\mathrm{C}_{\mathrm{CH}_{4}} > \delta^{13}\mathrm{C}_{n\mathrm{C}_{4}\mathrm{H}_{10}} > \delta^{13}\mathrm{C}_{\mathrm{C}_{3}\mathrm{H}_{8}} > \delta^{13}\mathrm{C}_{\mathrm{C}_{2}\mathrm{H}_{6}}\) in most Kidd Creek gas samples (Fig. 1a) suggests a special combination of 13C KIEs. A CH4 molecule has only one carbon atom, which experienced a primary 13C KIE during C–C cleavage. Other alkane molecules have primary carbon atoms and carbon atoms adjacent to the primary ones. The primary carbon atoms experienced a primary 13C KIE during the cleavage; their adjacent ones experienced a secondary 13C KIE. The combination of a normal secondary 13C KIE and an inverse primary one may form the above δ13C order, as demonstrated by a numerical simulation on the consecutive random scission of long alkyl chains (Fig. 2; Model A in ‘Methods’ section). The simulation yields a perfect exponential distribution of short-chain alkane abundance with carbon number, in addition to the observed δ13C pattern (Fig. 2b, c). While an inverse primary KIE is most common for deuterium26, it is also possible for heavier elements such as 13C and 15N in biochemical reactions27,28.

Because δD variation in alkanes is mostly determined by the contributions of the two hydrogen sources with distinct δD values, the DKIE during this microbial reaction is difficult to evaluate from δD values. However, the DKIE can be evaluated from the deuterium–deuterium clumped isotopic composition Δ12CH2D2. Without a DKIE, a purely stochastic clumped isotopic distribution in the hydrogenolysis reaction would yield a drastically negative Δ12CH2D2 value (−76‰, for calculation see Model B in the ‘Methods’ section, after Eq. 14), in contrast to the reported values, which are not far from zero10. One possible explanation for this difference is kinetic relaxation, where the hydrogen exchange between methane isotopologues can bring about an equilibrated Δ12CH2D2 value (close to 0). However, hydrogen atom exchange between methane molecules requires an elevated temperature and the existence of catalysts (free transition metals in the lab or free radicals under geological conditions); under these conditions, ethane and propane would start to decompose (C2+ alkanes <5 vol%)29. The high fraction of C2+ alkanes (~15 vol%) in the Kidd Creek gas rules out a significant hydrogen exchange between methane isotopologues. A more concrete explanation for the above contrast is that clumped isotopic distribution during methane generation is normally kinetically governed, and an excessive KIE due to D–D clumping may form the observed Δ12CH2D2 value30. To demonstrate this explanation, we conducted kinetic numerical simulations on bulk and clumped isotopic fractionations of CH4 generation (Fig. 3; Model B in the ‘Methods’ section). Figure 3a shows the scheme of isotope distribution; Fig. 3b–e presents the simulation results with parameters listed in Table 1. The simulation yields an inverse primary DKIE during the generation of Kidd Creek methane (kH/kD = 0.91), which provides further support that the gas was formed through an enzyme-catalysed microbial process26.
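The strongly negative stochastic Δ12CH2D2 value can be illustrated with a short combinatorial sketch. It assumes that each CH4 molecule takes two hydrogen atoms from the alkyl chain (pool A) and two from the hydrogen donor (pool B), that there is no KIE and no clumping in either pool, and that D << H. Using the fitted δDA and δDB from Fig. 1c as inputs is an assumption made here for illustration; the calculation in the ‘Methods’ section may use different inputs. The function name is hypothetical.

```python
def stochastic_d12ch2d2(delta_d_a, delta_d_b):
    """Delta-12CH2D2 (per mil) of CH4 made of 2 chain H + 2 capping H, no KIE."""
    # D/H ratios relative to the standard; the absolute ratio of the standard
    # cancels in the final expression, so it is not needed here.
    r_a = 1.0 + delta_d_a / 1000.0
    r_b = 1.0 + delta_d_b / 1000.0
    # One D: two possible sites from pool A plus two from pool B.
    ch3d = 2.0 * r_a + 2.0 * r_b
    # Two D: both from A (1 way), both from B (1 way), one from each (2 x 2 ways).
    ch2d2 = r_a**2 + 4.0 * r_a * r_b + r_b**2
    # Definition of Delta-12CH2D2 with [12CH4] normalised to 1.
    return ((8.0 / 3.0) * ch2d2 / ch3d**2 - 1.0) * 1000.0


two_source = stochastic_d12ch2d2(-126.0, -691.0)
single_source = stochastic_d12ch2d2(-126.0, -126.0)
print(f"two sources: {two_source:.1f} per mil; one source: {single_source:.1f} per mil")
```

With these inputs the two-source case gives about −76‰, whereas a single hydrogen pool gives zero. The deficit arises purely from mixing two pools with different D/H ratios, which is why a clumping-specific KIE is invoked to explain the near-zero observations.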

Fig. 3: Numerical simulation on the isotopic compositions of CH4 from hydrogen biodegradation of long alkyl chains.

a Reaction scheme showing isotope distribution during the hydrogenolysis of long C–C chains. b–e Variations of δ13C, Δ13CH3D, δD, and Δ12CH2D2 with conversion (fraction of CH4-contained 12C atoms in all 12C atoms). Legend in b–e: dashed lines show the case in which kinetic isotope effects (KIEs) are absent and there is no enrichment/depletion of clumped isotopes in the organic precursor; solid lines show the case with an inverse deuterium KIE (DKIE) of the hydrogen donor (primary DKIE); circles mark analytical solutions (Eqs. 12 and 13) at the start and end of conversion; green bars mark the range of reported data (conversion is set between 0.65 and 0.70 as constrained in Fig. 2). Parameters are listed in Table 1.

Table 1 Parameters of numerical simulations on the isotopic fractionation of CH4 during hydrogen biodegradation (Fig. 3).

Early life before GOE versus deep life after GOE

Hydrogenotrophy (microbial consumption of H2) using alkyl chains suggests an abnormally low oxygen fugacity in the ecosystem, where electron acceptors such as O2, Fe3+, SO42−, and even CO2, which oxidise H2 more efficiently than alkanes, are absent. Such an anoxic ecosystem could either have been widespread before the GOE or have been confined to the deep Earth after the GOE, when the Earth’s surface was too oxidising. By comparing hydrocarbon gases from various serpentinization sites (Fig. 4), we found the former condition more plausible:

  1.

    All previously reported hydrocarbon gas samples with the above alkyl hydrogenolysis features (a widely varied δD accompanied by relatively uniform δ13C in alkanes, and a deuterium-depleted H2 source as indicated by the δD versus reciprocal hydrogen number correlation line) were obtained from the sites with both pre-GOE serpentinization and pre-GOE organic matter sedimentation (Fig. 4). Sedimentation of the Abitibi belt (2.71–2.73 Ga), where the Kidd Creek mines are located, was limited between several episodes of volcanic activities and serpentinization in the Neoarchaean. The last volcanic activity (~2.69 Ga6) brought about the graphitisation of organic precursors and the termination of biodegradation conditions31,32. Other sites include (1) the Kloof, Driefontein, and Mponeng mines in the Witwatersrand Basin of the Kaapvaal Shield (southern Africa), where serpentinization and organic matter sedimentation occurred 3.0–2.7 Ga33, and (2) the Copper Cliff mines in the Sudbury Basin of the Canadian Shield, with sedimentary organic matter in the Paleoproterozoic Huronian Supergroup deposited ~2.45 Ga34, just before the onset of the GOE35.

    In contrast, hydrocarbon gas from post-GOE serpentinization sites does not have the above isotopic patterns. These sites range in age from the Proterozoic (the Juuka and Pori mines in the Fennoscandian Shield, with an age of 1.95 Ga36) to the present (the Lost City hydrothermal field, Fig. 1).

  2.

    When these pre-GOE sites are exposed to modern microbial activity (due to mining), the isotope pattern of hydrogenolysis in the gas eventually becomes altered. The δD and Δ12CH2D2 values of CH4 in the Kidd Creek gas became more positive years after boreholes were drilled11. Microbes have been discovered in the sites of the Witwatersrand Basin37 and consume propane and n-butane, enriching deuterium in the residual of these hydrocarbons for some samples (Fig. 4c).

Fig. 4: Comparison of molecular and isotopic compositions between alkanes from pre-GOE and post-GOE sites.

a δ13C versus δD of alkanes. b Exponential (Flory–Schulz) distribution of alkane abundance. c δD versus reciprocal of hydrogen number. d δ13C versus reciprocal of carbon number. Pre-GOE sites: 1—Kidd Creek mines (2.7 Ga); 2—Copper Cliff mines (2.45 Ga); 3—Kloof, Driefontein, and Mponeng mines (3.0–2.7 Ga). Post-GOE sites: 4—Juuka and Pori mines (1.95 Ga).

The above observations show that the occurrence and preservation of isotopic signatures from the hydrogenolysis reaction require a narrow chemical condition and specified microbial species, which are hardly satisfied after the GOE when strong oxidisers and diversified microbes coexist from shallow to deep earth. These signatures are vulnerable to post-GOE microbial reactions, such as methanogenesis from CO2 and H2 (making the δ13C more positive), methanotrophy, and the oxidation of higher alkanes. The oxygen-deficient Archaean ecosystem was crucial for the hydrogen biodegradation and the unusual isotopic signatures. Before the GOE, the atmospheric O2 partial pressure was <0.001% or even 0.00001% of the present atmospheric level (PAL); sulfate concentration in the ocean was 2.4% of the current level38. The oxygen-deficient atmosphere and ocean prevented both biological and chemical oxidation of organic matter in sediments39. Therefore, the Kidd Creek gas is a chemical fossil of the pre-GOE ecosystem preserved in the tectonically stable Canadian Shield.

The microbial hydrogenolysis process explains the 13C depletion in solid organic matter in some Archaean metasedimentary rocks. There are two distinct types of δ13C values in Archaean solid organic matter: one is within the normal range (−26 to −33‰), and the other is more depleted (−35 to −44‰, including the kerogen in the Abitibi belt of −43.8‰)40. The latter was attributed to methane cycling3, where methane was considered a precursor of the kerogen. Our discovery provides a more plausible explanation: kerogen (and its precursor) in some Archaean sediments may have undergone microbial hydrogenolysis in the presence of H2. Through this alkyl removal process with a weak and possibly inverse 13C KIE, the residual kerogen (and the kerogen formed by the residual precursor) may have been depleted of 13C and side alkyl chains. Due to the lack of side alkyl chains, the residual kerogen would not have experienced 13C enrichment by thermal dealkylation; it would have kept the 13C-depleted signature.

The long preservation time of a pre-GOE gas accumulation (2.7 Gyr for the Kidd Creek gas) raises the question of whether the isotopic compositions of the residual gas were affected by diffusive fractionation. This impact is insignificant according to the following two comparisons:

  1.

    Comparison between the Kidd Creek rocks and typical Palaeozoic natural gas reservoirs on their diffusion time scales. The characteristic time scale (τ) for diffusion is inversely proportional to diffusivity41. Diffusive fractionation starts to have a significant impact on the isotopic compositions of residual gas when the preservation time approaches the characteristic time42. The sealing rocks of the Kidd Creek gas are crystalline rocks with a gas diffusivity (<10−15 m2/s) three orders of magnitude lower than that of tight shale (10−12 m2/s)42,43,44, meaning that τ of the Kidd Creek gas-bearing rocks is at least 1000 times longer than that of a typical Palaeozoic gas reservoir sealed by shale, which more than compensates for the roughly 10-fold difference in preservation time. Diffusive fractionation through shales has no significant impact on the isotopic compositions of Palaeozoic natural gas accumulations44; therefore, the residual Kidd Creek gas is unlikely to have experienced significant diffusive fractionation.

  2.

    Comparison between H2 and alkanes on isotopic fractionation. H2 has a smaller molecular mass and is more sensitive to diffusive fractionation than alkanes. The δD values of H2 in the Kidd Creek gas are within the range of modern serpentinization-derived H222, indicating a minimal influence of diffusion on the isotopic compositions of the alkanes.

Hydrogenolysis of organic matter could act as a pathway for Archaean microbial life to obtain chemical energy stored in H2. Considering bond energies of 346, 411, and 432 kJ/mol for C–C, C–H, and H–H bonds, hydrogenolysis of long-chain alkanes is an exothermic reaction with a net heat release of 44 kJ/(mol H2). In contrast to the modern hydrogenotrophic methanogenesis of oxygen-bearing compounds45, microbial hydrogenolysis of long alkyl chains is a different geochemical pathway to convert H2 and acted as a significant source of greenhouse gas before the GOE, when the young Earth experienced widespread serpentinization of mafic volcanic rocks in the high heat-flow lithosphere. Conversion of organic matter and the serpentinization-derived H2, along with the hydrogen isotopic distribution in the ecosystem, are summarised in Fig. 5.
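The 44 kJ/(mol H2) figure follows from simple bond counting: each hydrogenolysis event breaks one C–C and one H–H bond and forms two C–H bonds. The sketch below reproduces this arithmetic with the bond energies quoted in the text; it is a bond-energy estimate only and ignores entropy and environmental effects. The names are illustrative.

```python
# Average bond energies quoted in the text, in kJ/mol
BOND_ENERGY = {"C-C": 346.0, "C-H": 411.0, "H-H": 432.0}


def hydrogenolysis_heat_release(energy=BOND_ENERGY):
    """Net heat released per mole of H2 for R-R' + H2 -> R-H + R'-H."""
    broken = energy["C-C"] + energy["H-H"]  # bonds consumed
    formed = 2.0 * energy["C-H"]            # bonds created
    return formed - broken                  # positive value means exothermic


heat_release = hydrogenolysis_heat_release()
print(f"Net heat release: {heat_release:.0f} kJ per mol H2")
```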

Fig. 5: Conceptual model.

The model shows hydrogen isotope fractionation during alkane generation from the biodegradation of organic matter with serpentinization-derived hydrogen in the Neoarchaean ecosystem.

Methods

Model A: abundances and δ13C of short alkanes

Considering the cleavage at position m (between the no. m and no. m + 1 carbon atoms) of an n-alkyl chain with n carbon atoms (1 ≤ m < n), the hydrogenolysis reaction equation is

$$\mathrm{R}\mbox{-}\mathrm{C}_{n}\mathrm{H}_{2n+1}\mathop{\longrightarrow}\limits^{2[\mathrm{H}]}\mathrm{R}\mbox{-}\mathrm{C}_{m}\mathrm{H}_{2m+1}+\mathrm{H}\mbox{-}\mathrm{C}_{n-m}\mathrm{H}_{2(n-m)+1}$$
(2)

Similarly, the reaction equation for cleavage on an n-alkane molecule is

$$\mathrm{H}\mbox{-}\mathrm{C}_{n}\mathrm{H}_{2n+1}\mathop{\longrightarrow}\limits^{2[\mathrm{H}]}\mathrm{H}\mbox{-}\mathrm{C}_{m}\mathrm{H}_{2m+1}+\mathrm{H}\mbox{-}\mathrm{C}_{n-m}\mathrm{H}_{2(n-m)+1}$$
(3)

Branched alkyl chains are omitted for simplicity. The cleavage reactions are consecutive; the products R-CmH2m+1, H-CmH2m+1, and H-Cn−mH2(n−m)+1 (written below as RCm, HCm, and HCn−m) are still subject to hydrogenolysis until all C–C bonds are broken, with CH4 as the ultimate product. The cleavage rate (r) of a kerogen side chain (RCm) or an alkane molecule (HCm) is proportional to its number of cleavable C–C bonds (m for RCm, including the bond to R, and m − 1 for HCm) and to its concentration c:

$$r_{\mathrm{RC}_{m}}=-mkc_{\mathrm{RC}_{m}}$$
(4)
$$r_{\mathrm{HC}_{m}}=-(m-1)kc_{\mathrm{HC}_{m}}$$
(5)

Here, k is the reaction constant to break any C–C bond. For simplicity, the difference in k between a middle and an end bond in the C–C chains is not considered. The net reaction rate of a kerogen side chain with a length of m carbon atoms is

$$\frac{\mathrm{d}c_{\mathrm{RC}_{m}}}{\mathrm{d}t}=-mkc_{\mathrm{RC}_{m}}+\sum_{n=m+1}^{N}\left(kc_{\mathrm{RC}_{n}}\right)$$
(6)

Here, t is time and N is the maximum chain length of the reaction system. The first term of the right-hand side accounts for the cleavage of the kerogen side chain, and the second term accounts for the generation of residual shorter side chains from the cleavage of the longer side chain. The net reaction rate of a normal alkane with m carbon atoms is

$$\frac{\mathrm{d}c_{\mathrm{HC}_{m}}}{\mathrm{d}t}=-(m-1)kc_{\mathrm{HC}_{m}}+\sum_{n=m}^{N}\left(kc_{\mathrm{RC}_{n}}\right)+2\sum_{n=m+1}^{N}\left(kc_{\mathrm{HC}_{n}}\right)$$
(7)

The first term of the right-hand side accounts for the cleavage of this alkane, the second term accounts for the generation of the alkane from the kerogen side chain, and the third term accounts for the generation from the cracking of alkanes longer than this alkane. The factor of 2 in the last term is necessary because HCm is generated from cuts at both the m and the n − m positions of HCn.

Equations (6) and (7) show that shorter chains become more enriched as the cracking goes on. On the one hand, longer chains are more prone to cracking because they have more C–C bonds; on the other hand, shorter chains are the products of longer chains.
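The random-scission kinetics of Eqs. (6) and (7) can be sketched without isotopes as follows. This is a minimal illustration, not the MATLAB code used in the original work: the rate constant k = 1, the chain-length bounds (9 to 25, giving a mean of 17), the classical Runge–Kutta time step, and the function names are all assumptions made here, and conversion is approximated as the fraction of all carbon held in CH4.

```python
import numpy as np


def scission_rates(c_r, c_h, k=1.0):
    """Right-hand sides of Eqs. (6) and (7); index i holds chain length m = i + 1."""
    m = np.arange(1, c_r.size + 1)
    tail_r = np.cumsum(c_r[::-1])[::-1]  # sum of c_RCn over n >= m
    tail_h = np.cumsum(c_h[::-1])[::-1]  # sum of c_HCn over n >= m
    d_r = -m * k * c_r + k * (tail_r - c_r)                           # Eq. (6)
    d_h = -(m - 1) * k * c_h + k * tail_r + 2.0 * k * (tail_h - c_h)  # Eq. (7)
    return d_r, d_h


def simulate(target_conversion=0.65, n_min=9, n_max=25, sigma=2.0, dt=1e-3):
    """Integrate with classical RK4 until CH4 holds the target fraction of carbon."""
    m = np.arange(1, n_max + 1)
    mean = 0.5 * (n_min + n_max)
    # Normally distributed initial side-chain lengths; no free alkanes at t = 0.
    c_r = np.where(m >= n_min, np.exp(-0.5 * ((m - mean) / sigma) ** 2), 0.0)
    c_h = np.zeros(n_max)
    total_c = np.sum(m * c_r)
    while c_h[0] / total_c < target_conversion:
        k1 = scission_rates(c_r, c_h)
        k2 = scission_rates(c_r + 0.5 * dt * k1[0], c_h + 0.5 * dt * k1[1])
        k3 = scission_rates(c_r + 0.5 * dt * k2[0], c_h + 0.5 * dt * k2[1])
        k4 = scission_rates(c_r + dt * k3[0], c_h + dt * k3[1])
        c_r = c_r + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c_h = c_h + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return c_r, c_h, total_c


c_r, c_h, total_c = simulate()
relative_abundance = c_h[:4] / c_h[0]  # C1 to C4 alkanes relative to CH4
print("C1-C4 abundance relative to CH4:", np.round(relative_abundance, 4))
```

Because every C–C bond breaks independently at the same rate, the abundance of free alkanes falls off with carbon number, and the total carbon in side chains plus alkanes stays constant throughout the integration, which is a convenient sanity check on the rate equations.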

Both primary and secondary 13C KIEs on the rate constant k are considered. Suppose that there is a 13C substitution at position j (1 ≤ j ≤ n). For cleavage at position m, that is, between the no. m and no. m + 1 carbon atoms, there is a primary 13C KIE if the substituted atom lies on the cleaved bond (j = m or j = m + 1) and a secondary 13C KIE if it is adjacent to the cleaved bond (j = m − 1 or j = m + 2).

A normal distribution is used for the molar distribution of the lengths of initial kerogen side chains, with a minimum chain length of nmin, a maximum chain length of nmax, a mean length of (nmin + nmax)/2, and a standard deviation of σ. When the mean length is large enough (>15), the isotopic compositions of gas products are insensitive to the initial kerogen side chain length distribution. For the initial conditions, a δ13C value of −35‰ is applied, the chain-length distribution peaks at C17 with a standard deviation of σ = 2 carbon atoms, and the initial alkane concentrations are set to 0.

For simplicity, we assume that there is no isotopic fractionation within or between the alkyl chains at the beginning of hydrogenolysis, so the probability of 13C substitution at any position of any side chain is identical and determined by the initial carbon isotopic composition δ13C. Multiple 13C substitutions on a C–C chain are omitted because considering them would drastically increase the modelling complexity. This approximation is valid when the C–C chain is not too long. For example, the ratio between the probabilities of double and single 13C substitution in a C20 chain is \(\left[\binom{20}{2}\left(\frac{{}^{13}\mathrm{C}}{{}^{12}\mathrm{C}}\right)^{2}\right]/\left[\binom{20}{1}\left(\frac{{}^{13}\mathrm{C}}{{}^{12}\mathrm{C}}\right)\right]\) ≈ 10% for 13C/12C ~ 0.01. Such a chain is long enough that the δ13C of gas products is insensitive to C–C chain length. Numerical simulation was conducted with Mathworks MATLAB 2020a.
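The ~10% estimate for a C20 chain can be checked directly. The sketch below follows the same leading-order approximation as the text (the factors for the unsubstituted positions are neglected); the function name is illustrative.

```python
from math import comb


def double_to_single_ratio(chain_length, r13=0.01):
    """Leading-order ratio of doubly to singly 13C-substituted chains."""
    double = comb(chain_length, 2) * r13**2  # two 13C atoms placed on the chain
    single = comb(chain_length, 1) * r13     # one 13C atom placed on the chain
    return double / single


ratio_c20 = double_to_single_ratio(20)
print(f"C20: double/single = {ratio_c20:.3f}")
```

The ratio simplifies to (n − 1)/2 × 13C/12C, so it grows linearly with chain length, which is why the single-substitution approximation degrades for very long chains.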

Model B: bulk and clumped isotopic fractionations of CH4

Conversion of methylene in a long C–C chain to methane is generalised into two steps:

$$\mathrm{R}\mbox{-}\mathrm{CH}_{2}\mbox{-}\mathrm{R}'\mathop{\longrightarrow}\limits^{\mathrm{r}_{a}}_{+\mathrm{H}}\mathrm{R}\mbox{-}\mathrm{CH}_{3}\mathop{\longrightarrow}\limits^{\mathrm{r}_{b}}_{+\mathrm{H}}\mathrm{CH}_{4}$$
(8)

The first step (step a) is the conversion of the methylene group R-CH2-R′ to a methyl group (RCH3) by accepting a capping hydrogen atom from the hydrogen donor (activated H2); the second step (step b) is the conversion of the methyl group to methane by accepting another capping hydrogen atom. This scheme is highly generalised, and each step may involve multiple elementary biochemical reaction steps, such as the binding of H2 and long alkyl chains to the enzyme, activation of H–H and C–C bonds, and release of the short alkane products from the enzyme. It is beyond the scope of this work to discuss the detailed biochemical reaction steps. But the cleavage and formation of chemical bonds in these steps should be constrained by the observed isotopic patterns.

Due to the computational complexity, we did not use the random scission model (Model A) in the simulation involving clumped isotopic fractionation, as explained in the following. A conventional kinetic model of the decomposition of organic matter without considering the constraints of C–C chain lengths is a zero-dimensional problem. Modelling the random cutting of long C–C chains without considering isotopes is a one-dimensional problem, and modelling bulk carbon isotopic fractionation during random cutting (Model A) is a two-dimensional problem. If 13C–13C coupling is included in random cutting, the modelling is a three-dimensional problem; a complex Monte Carlo method has been applied to deal with this problem19. If 13C–D or D–D coupling were also included, as Model B would require, the problem would have more than four dimensions. The programming complexity and computational cost make such a model impractical, and solving it is far beyond the scope of this work.

Equation (8) is expanded to the scheme in Fig. 3a to quantify the five most abundant isotopologues of methane (isotopologues with three or more substitutions, such as 13CH2D2 or 12CHD3, are ignored due to their low abundances). For the subscripts in Fig. 3a (m, i, and j in ramij or rbmij), the first digit (m = 0 or 1) is the number of 13C atoms involved in the reaction, the second digit (i = 0, 1, or 2) is the number of deuterium atoms in the methylene or methyl group, and the third digit (j = 0 or 1) is the number of deuterium atoms in the hydrogen donor.

Clumped isotopic compositions of methylene and methane are defined as the following:

$$\left\{\begin{array}{l}\Delta \mathrm{R}{\,}^{13}\mathrm{CHDR}'=\frac{(\mathrm{R}{\,}^{13}\mathrm{CHDR}')(\mathrm{R}{\,}^{12}\mathrm{CH}_{2}\mathrm{R}')}{(\mathrm{R}{\,}^{13}\mathrm{CH}_{2}\mathrm{R}')(\mathrm{R}{\,}^{12}\mathrm{CHDR}')}-1\hfill\\ \Delta \mathrm{R}{\,}^{12}\mathrm{CD}_{2}\mathrm{R}'=4\frac{(\mathrm{R}{\,}^{12}\mathrm{CD}_{2}\mathrm{R}')(\mathrm{R}{\,}^{12}\mathrm{CH}_{2}\mathrm{R}')}{(\mathrm{R}{\,}^{12}\mathrm{CHDR}')^{2}}-1\hfill\\ \Delta {\,}^{13}\mathrm{CH}_{3}\mathrm{D}=\frac{({\,}^{13}\mathrm{CH}_{3}\mathrm{D})({\,}^{12}\mathrm{CH}_{4})}{({\,}^{12}\mathrm{CH}_{3}\mathrm{D})({\,}^{13}\mathrm{CH}_{4})}-1\hfill\\ \Delta {\,}^{12}\mathrm{CH}_{2}\mathrm{D}_{2}=\frac{8}{3}\frac{({\,}^{12}\mathrm{CH}_{2}\mathrm{D}_{2})({\,}^{12}\mathrm{CH}_{4})}{({\,}^{12}\mathrm{CH}_{3}\mathrm{D})^{2}}-1\hfill\end{array}\right.$$
(9)

Note that the isotopic compositions here are expressed in decimals; they should be multiplied by 1000 to give per mil values.
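The methane definitions in Eq. (9) can be evaluated directly from isotopologue abundances. The sketch below is illustrative only: the function name and the example D/H and 13C/12C ratios are assumptions, and the check uses a stochastic single-pool methane (with D << H), for which both clumped values should be zero.

```python
def methane_clumped(ch4_12, ch3d_12, ch2d2_12, ch4_13, ch3d_13):
    """Return (Delta-13CH3D, Delta-12CH2D2) in per mil from isotopologue abundances."""
    d13ch3d = (ch3d_13 * ch4_12) / (ch3d_12 * ch4_13) - 1.0
    d12ch2d2 = (8.0 / 3.0) * (ch2d2_12 * ch4_12) / ch3d_12**2 - 1.0
    # Eq. (9) gives decimals; multiply by 1000 for per mil values.
    return d13ch3d * 1000.0, d12ch2d2 * 1000.0


# Stochastic methane from one hydrogen pool (assumed example ratios):
r_d, r_13 = 1.5e-4, 0.011
stochastic = methane_clumped(
    ch4_12=1.0,
    ch3d_12=4.0 * r_d,        # four equivalent sites for one D
    ch2d2_12=6.0 * r_d**2,    # six ways to place two D atoms
    ch4_13=r_13,
    ch3d_13=4.0 * r_d * r_13,
)
print(stochastic)
```

The factor 8/3 in Δ12CH2D2 is the inverse of the stochastic ratio 6/4², so a random distribution of deuterium maps to zero by construction; the factor 4 in the methylene definition plays the same role for a two-hydrogen group.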

The deuterium isotope ratio between the hydrogen donor (denoted with subscript B) and the methylene group (subscript A) is expressed as:

$$\alpha_{\mathrm{A}}^{\mathrm{B}}=\frac{1+\delta \mathrm{D}_{\mathrm{B}}}{1+\delta \mathrm{D}_{\mathrm{A}}}$$
(10)

For each reaction step in Fig. 3a, the corresponding rate constants are denoted as kamij for step a or kbmij for step b. Kinetic fractionation factors αkamij = kamij/ka000 and αkbmij = kbmij/kb000 define KIEs. Note that a DKIE is often expressed as kH/kD, which is the reciprocal of the αk nomenclature here. A DKIE may be primary or secondary; a primary DKIE results in αka001 ≠ 1 and αkb001 ≠ 1, and a secondary one results in αka010 ≠ 1 and αkb010 ≠ 1. Kinetic clumped isotope fractionation factors \(\gamma_{amij}=\alpha_{\mathrm{k}amij}/(\alpha_{\mathrm{k}a100}^{m}\,\alpha_{\mathrm{k}a010}^{i}\,\alpha_{\mathrm{k}a001}^{j})\) and \(\gamma_{bmij}=\alpha_{\mathrm{k}bmij}/(\alpha_{\mathrm{k}b100}^{m}\,\alpha_{\mathrm{k}b010}^{i}\,\alpha_{\mathrm{k}b001}^{j})\) define the excessive KIE due to isotope clumping in steps a and b, respectively30.

Conversion of the reactant R-CH2-R′ is defined as 1 − f, where f is the residual fraction of R-CH2-R′:

$$f=({{{{{\rm{R}}}}}}{\mbox{-}}{{{{{{\rm{CH}}}}}}}_{2}{\mbox{-}}{{{{{\rm{R}}}}}}{^\prime} )/{({{{{{\rm{R}}}}}}{\mbox{-}}{{{{{{\rm{CH}}}}}}}_{2}{\mbox{-}}{{\rm{R}}}{^\prime} )}_{{{{{{\rm{initial}}}}}}}$$
(11)

Because D is far less abundant than H (D << H), the isotopic compositions at the beginning of the reaction (f = 1) have the following analytical solution:

$$\left\{\begin{array}{l}{{{{\delta }}}^{13}{{{{\rm{C}}}}}_{{{{{\rm{CH}}}}}_{4}}|}_{f=1}={\alpha }_{{{{\rm{k}}}}a100}{\alpha }_{{{{\rm{k}}}}b100}{{{\delta }}}^{13}{{{{\rm{C}}}}}_{{{{\rm{A}}}}}+({\alpha }_{{{{\rm{k}}}}a100}{\alpha }_{{{{\rm{k}}}}b100}-1)\hfill\\ {{{{\delta}}{{{\rm{D}}}}}_{{{{{\rm{CH}}}}}_{4}}|}_{f=1}=(\frac{{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}}{2}{{{\delta }}{{{\rm{D}}}}}_{{{{\rm{A}}}}}+\frac{{\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}\,+\,{\alpha }_{{{{\rm{k}}}}b001}}{4}{{{\delta }}{{{\rm{D}}}}}_{{{{\rm{B}}}}})+(\frac{{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}}{2}+\frac{{\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}\,+\,{\alpha }_{{{{\rm{k}}}}b001}}{4}-1)\hfill\\ {{{\Delta}} {\,}^{13}{{{\rm{C}}}}{{{{\rm{H}}}}}_{3}{{{\rm{D}}}}|}_{f=1}=\frac{{\alpha }_{{{{\rm{A}}}}}^{{{{\rm{B}}}}}[{\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}({\gamma }_{a101}{\gamma }_{b110}\,-\,1)\,+\,{\alpha }_{{{{\rm{k}}}}b001}({\gamma }_{b101}\,-\,1)]\,+\,2{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}[(1+{{\Delta}} {{{\rm{R}}}}{\,}^{13}{{{\rm{C}}}}{{{\rm{HDR}}}}^{\prime} ){\gamma }_{a110}{\gamma }_{b110}\,-\,1]}{{\alpha }_{{{{\rm{A}}}}}^{{{{\rm{B}}}}}({\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}+{\alpha }_{{{{\rm{k}}}}b001})\,+\,2{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}}\hfill\\ {{{\Delta}} {\,}^{12}{{{\rm{C}}}}{{{{\rm{H}}}}}_{2}{{{{\rm{D}}}}}_{2}|}_{f=1}=\frac{8[{({\alpha }_{{{{\rm{A}}}}}^{{{{\rm{B}}}}})}^{2}{\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b001}{\alpha }_{{{{\rm{k}}}}b010}{\gamma }_{b011}\,+\,2{\alpha }_{{{{\rm{A}}}}}^{{{{\rm{B}}}}}{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}({\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}{\gamma }_{a011}{\gamma }_{b020}\,+\,{\alpha }_{{{{\rm{k}}}}b001}{\gamma }_{b011})\,+\,{\alpha }_{{{{\rm{k}}}}a010}^{2}{\alpha }_{{{{\rm{k}}}}b010}^{2}{\gamma }_{a020}{\gamma }_{b020}]}{3{[{\alpha }_{{{{\rm{A}}}}}^{{{{\rm{B}}}}}({\alpha }_{{{{\rm{k}}}}a001}{\alpha }_{{{{\rm{k}}}}b010}\,+\,{\alpha }_{{{{\rm{k}}}}b001})\,+\,2{\alpha }_{{{{\rm{k}}}}a010}{\alpha }_{{{{\rm{k}}}}b010}]}^{2}}-1\end{array}\right.$$
(12)

If the hydrogen donor is in large excess, so that its abundance is approximately constant, then the analytical solution at the end of the reaction (f = 0) is:

$$\left\{\begin{array}{l}{{{{\delta }}}^{13}{{{{{{\rm{C}}}}}}}_{{{{{{{\rm{CH}}}}}}}_{4}}|}_{f=0}={{{\delta }}}^{13}{{{{{{\rm{C}}}}}}}_{{{{{{\rm{A}}}}}}}\hfill\\ {{{{\delta }}{{{{{\rm{D}}}}}}}_{{{{{{{\rm{CH}}}}}}}_{4}}|}_{f=0}=\left(\frac{1}{2}{{{\delta }}{{{{{\rm{D}}}}}}}_{{{{{{\rm{A}}}}}}}+\frac{{\alpha }_{{{{{{\rm{k}}}}}}a001}+{\alpha }_{{{{{{\rm{k}}}}}}b001}}{4}{{{\delta }}{{{{{\rm{D}}}}}}}_{{{{{{\rm{B}}}}}}}\right)+\left(\frac{{\alpha }_{{{{{{\rm{k}}}}}}a001}+{\alpha }_{{{{{{\rm{k}}}}}}b001}}{4}-\frac{1}{2}\right)\hfill\\ {{{{\Delta}} }^{13}{{{{{{\rm{CH}}}}}}}_{3}{{{{{\rm{D}}}}}}|}_{f=0}=\frac{{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}\left[{\alpha }_{{{{{{\rm{k}}}}}}a001}({\gamma }_{a101}-1)+{\alpha }_{{{{{{\rm{k}}}}}}b001}({\gamma }_{b101}-1)\right]+2{{\Delta}} {{{{{\rm{R}}}}}}{\,}^{13}{{{{{\rm{C}}}}}}{{{{{\rm{HDR}}}}}}^{\prime} }{{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}({\alpha }_{{{{{{\rm{k}}}}}}a001}+{\alpha }_{{{{{{\rm{k}}}}}}b001})+2}\hfill\\ {{{{\Delta}} }^{12}{{{{{{\rm{CH}}}}}}}_{2}{{{{{{\rm{D}}}}}}}_{2}|}_{f=0}=\frac{8\left[{({\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}})}^{2}{\alpha }_{{{{{{\rm{k}}}}}}a001}{\alpha }_{{{{{{\rm{k}}}}}}b001}{\gamma }_{b011}+2{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}({\alpha }_{{{{{{\rm{k}}}}}}a001}{\gamma }_{a011}+{\alpha }_{{{{{{\rm{k}}}}}}b001}{\gamma }_{b011})+1\right]}{3{\left[{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}({\alpha }_{{{{{{\rm{k}}}}}}a001}+{\alpha }_{{{{{{\rm{k}}}}}}b001})+2\right]}^{2}}-1\end{array} \right.$$
(13)
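Eq. (13) is straightforward to evaluate directly. The sketch below is an illustration under stated assumptions rather than the simulation code used for Fig. 3. The function name, argument names, and the δ13CA value are placeholders. The δD inputs and the 6‰ precursor clumping are the values quoted in this section. All fractionation factors default to unity (no KIE).

```python
def methane_at_full_conversion(dD_A, dD_B, d13C_A, DR_13CHD=0.0,
                               a_a001=1.0, a_b001=1.0,
                               g_a101=1.0, g_b101=1.0,
                               g_a011=1.0, g_b011=1.0):
    """Evaluate Eq. (13); isotopic inputs and outputs are decimals, not per mil."""
    alpha = (1 + dD_B) / (1 + dD_A)   # Eq. (10): hydrogen donor B over methylene A
    s = a_a001 + a_b001               # summed primary D fractionation factors
    denom = alpha * s + 2
    dD = (0.5 * dD_A + s / 4 * dD_B) + (s / 4 - 0.5)
    D13 = (alpha * (a_a001 * (g_a101 - 1) + a_b001 * (g_b101 - 1))
           + 2 * DR_13CHD) / denom
    D12 = 8 * (alpha**2 * a_a001 * a_b001 * g_b011
               + 2 * alpha * (a_a001 * g_a011 + a_b001 * g_b011)
               + 1) / (3 * denom**2) - 1
    return {"d13C": d13C_A, "dD": dD, "D13CH3D": D13, "D12CH2D2": D12}


# No KIE, with 6 per mil 13C-D clumping in the precursor.
# d13C_A = -0.030 is a placeholder, not a value from the source.
result = methane_at_full_conversion(dD_A=-0.126, dD_B=-0.691,
                                    d13C_A=-0.030, DR_13CHD=0.006)
for name, value in result.items():
    print(f"{name}: {1000 * value:.1f} per mil")
```

With every αk and γ set to 1, δDCH4 reduces to the mean of δDA and δDB, and Δ12CH2D2 depends only on the ratio of Eq. (10). Passing values other than 1 for `a_a001`, `a_b001`, `g_a011`, or `g_b011` shows how a primary DKIE shifts both quantities.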

If the KIE is absent (all αk and γ factors equal to unity) and the clumped isotopic compositions in the precursor are 0, then clumped isotopic compositions become the following:

$$\left\{\begin{array}{l}{{{\Delta}} }^{13}{{{{{{\rm{CH}}}}}}}_{3}{{{{{\rm{D}}}}}}=0\hfill\\ {{{\Delta}} }^{12}{{{{{{\rm{CH}}}}}}}_{2}{{{{{{\rm{D}}}}}}}_{2}=-\frac{1}{3}{{\left(\frac{1-{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}}{1+{\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}}\right)}}^{2} \end{array}\right.$$
(14)

With \({\alpha }_{{{{{{\rm{A}}}}}}}^{{{{{{\rm{B}}}}}}}\) = 0.354, obtained from Eq. (10) with δDA = −126‰ and δDB = −691‰ as given in the text, Δ12CH2D2 = −76‰ is obtained. Both stochastic values (Δ13CH3D = 0 and Δ12CH2D2 = −76‰) are lower than the reported values (4–6‰ for Δ13CH3D and −10 to 5‰ for Δ12CH2D2)10,11, indicating that 13C–D clumping in the methylene precursor should be considered to explain Δ13CH3D, and DKIE should be considered to explain Δ12CH2D2.
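The arithmetic behind these two numbers follows directly from Eqs. (10) and (14) and can be reproduced in a few lines. The function names below are illustrative; the inputs are the δD values quoted above, converted from per mil to decimals.

```python
def donor_methylene_ratio(dD_A, dD_B):
    """Eq. (10): D/H ratio of hydrogen donor B relative to methylene group A."""
    return (1 + dD_B) / (1 + dD_A)


def stochastic_D12CH2D2(alpha):
    """Eq. (14): D12CH2D2 (decimal) with no KIE and no precursor clumping."""
    return -((1 - alpha) / (1 + alpha)) ** 2 / 3


alpha = donor_methylene_ratio(dD_A=-0.126, dD_B=-0.691)
delta = stochastic_D12CH2D2(alpha)

print(round(alpha, 3))        # 0.354
print(round(1000 * delta))    # -76
```

The deficit arises purely from mixing two hydrogen pools with very different D/H ratios: the further α departs from 1, the more negative the stochastic Δ12CH2D2 becomes, and it vanishes when α = 1.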

Numerical simulations are carried out to find parameters satisfying the following:

  1. The δ13CCH4, δDCH4, Δ13CH3D, and Δ12CH2D2 values at higher conversions of organic precursors are within the reported ranges.

  2. The δDA and δDB values are close to the values derived from Fig. 1c (−126 and −691‰, respectively), showing that δDCH4 is determined mainly by the δD of the precursors rather than by the DKIE during hydrogenolysis.

A value of ΔR13CHDR′ = 6‰ in the organic precursor is applied so that the final Δ13CH3D falls within the range of reported values. This degree of 13C–D clumping in the precursor is plausible, considering that Δ13CH3D = 5.6‰ has been reported for biogenic gas10,11. A Δ12CH2D2 value close to the observed value, and thus much higher than the stochastic one (Eq. 14), requires γa011 > 1 or γb011 > 1, as shown by the Δ12CH2D2 expression in Eq. (13). Given this prerequisite, either an inverse primary DKIE (1° DKIE, αka001 > 1, αkb001 > 1) or an inverse secondary DKIE (2° DKIE, αka010 > 1, αkb010 > 1) is necessary; through numerical simulation, we found that only the inverse 1° DKIE satisfies the above-mentioned δDA, δDB, and Δ12CH2D2 values.

Two scenarios are modelled (Fig. 3): a purely stochastic condition and one with an inverse 1° DKIE. The parameters are listed in Table 1. For comparison, the analytical solutions at the beginning and end of the reaction from Eqs. (12) and (13) are also presented. The numerical and analytical solutions are nearly identical at the beginning of conversion. Small differences appear at the endpoint because the hydrogen donor is not in large enough excess for its abundance to remain constant. A weak 13C fractionation between the organic precursor and the methane product is obtained with the KIE parameters (Fig. 3b). With such a weak 13C KIE, Δ13CH3D is nearly constant throughout the reaction (Fig. 3c). Note that we applied an inverse 13C KIE, as required by the δ13C distribution of the alkane gases (Method 1, Model A). The δD and Δ12CH2D2 values are independent of the 13C KIE. Both the bulk and clumped isotopic compositions of methane fall within the range of reported values at organic precursor conversions of 0.65–0.70, as constrained by Fig. 2.