Knot probabilities in equilateral random polygons

We consider the probability of knotting in equilateral random polygons in Euclidean 3-dimensional space, which model, for instance, random polymers. Results from an extensive Monte Carlo dataset of random polygons indicate a universal scaling formula for the knotting probability with the number of edges. This scaling formula involves an exponential function, independent of knot type, with a power law factor that depends on the number of prime components of the knot. The unknot, appearing as a composite knot with zero components, scales with a small negative power law, contrasting with previous studies that indicated a purely exponential scaling. The methodology incorporates several improvements over previous investigations: our random polygon data set is generated using a fast, unbiased algorithm, and knotting is detected using an optimised set of knot invariants based on the Alexander polynomial.


Introduction
The tendency of long random filaments to become knotted is familiar to everyone carrying headphone cables in their pocket. It seems natural to expect that the probability that a random closed curve in three dimensions is knotted increases with its length. Random knotting, especially in closed random walks, has been studied at least since the 1960s. It was conjectured [1,2] that sufficiently long linear polymers in dilute solution, undergoing a ring closure reaction, would produce knotted ring polymers with high probability. The study of knotted random walks has been associated with knotted polymers ever since, employing insight and techniques from geometry, topology and statistical mechanics. Analytical results are rare [3,4], so the problem is most naturally studied with computers. Closed random walks of sufficient length must be generated, and their knotting analysed, to investigate the asymptotics. Topologically distinct kinds of knot are classified (the simplest examples are shown in Figure 1(a)), and so we can ask what the probabilities of different knot types are in closed random walks. A natural statistical model for this investigation, extensively studied, is the ensemble of equilateral random polygons [4]. These are piecewise linear embeddings of the circle in $\mathbb{R}^3$ such that each edge has unit length: they are effectively random walks in three-dimensional Euclidean space, conditioned to return to their starting point.
(The equilateral condition can be relaxed to study walks whose edge lengths have some other distribution [3].) Examples of random polygons of different edge numbers with two specific knot types are shown in Figure 1 (b) and (c). Diao [4] has demonstrated that the probability of an equilateral random polygon being unknotted tends to zero as the number of edges increases. Many studies of knotted random walks have followed, generating different statistical ensembles, utilising larger datasets as numerical power has improved, and with increasingly sophisticated knot type analyses [5][6][7][8][9][10][11]. Several facts agreeing with large-N asymptotics are supported, such as the probability of composite knots increasing with N [4].
Here we revisit these questions, bringing contemporary large-scale computing resources to bear on the problem by generating many millions of random polygons by a Monte Carlo routine. Our methodology incorporates several improvements over previous studies, including:
• large datasets, targeting $10^7$ random polygons with edge number $N$ up to 4000 (compared to $\sim 2 \times 10^6$ from other sources), though much of our analysis is based on polygons with $N \le 3000$, where the statistics are better;
• the algorithm of Cantarella et al. [12], which samples equilateral random polygons correctly and quickly;
• knot identification using a set of optimised numerical knot invariants.
As we shall describe, our results are consistent with the probability $P_K(N)$ of a particular knot of type $K$ occurring in a random polygon with $N$ sides having the form
$$P_K(N) = C_K\, N^{v_K} \exp(-N/N_K) \left[ 1 + \frac{\beta_K}{N^{\Delta}} + \frac{\gamma_K}{N} \right]. \qquad (1)$$
This expression incorporates an overall exponential decay, with decay parameter $N_K$, combined with a power law with exponent $v_K$ and other asymptotic corrections to scaling ($\beta_K$, $\gamma_K$ and $\Delta$). The first three factors, involving the constants $C_K$, $v_K$ and $N_K$, are similar to the analogous knotting probability for self-avoiding polygons of $N$ edges on a cubic lattice [13], which is well-grounded in a range of different random polygon models [6-8, 10, 11]. The small-scale correction terms to scaling are less well studied, and the term in square brackets in (1) is guessed based on the behaviour of lattice models [13]. The value of the confluent correction exponent $\Delta$ is not known precisely, but we assume that $\Delta = 1/2$ [13]. The form (1) is supported by our observations of many knot types occurring for both smaller and larger $N$, including prime knots with up to nine crossings and composite knots with up to five components, and we present data for eight-crossing prime knots and four-component composites. Knot terminology is explained in Section 2.2. Our results suggest that the exponential parameter $N_K$ is universal for all knot types, $N_K = N_0$, both prime and composite [6].
From our data, the power law exponent appears to be $v_K = v_0 + n_p(K)$, with universal constant $v_0$ and $n_p(K)$ the number of prime components of $K$. Prime knots have one prime component ($n_p = 1$); composite knots have more than one ($n_p > 1$). This form of exponent is supported by an argument that the knotted components are on average localised and relatively small along the curve [14] and unentangled with one another [7]. If we had a pattern theorem for the unknot, then, under these assumptions, the composite exponent would be the sum of the exponent for the unknot and the number of prime components in the knot decomposition. This behaviour is comparable to lattice models, for which there is evidence that $v_K = n_p$ [13,15,16], up to small corrections. In this sense, $v_K$ controls the asymptotic relative frequency of composite knots with different $n_p$: the knot with more components will always eventually be more common. Knotted random polygons appear to follow this behaviour, with a small negative offset $v_0$. This deviation of $v_K$ from $n_p$ has been seen elsewhere [11], so the deviation from the lattice result seems to be typical of unconstrained random polygons.
Notably, the unknot appears as the composite knot with zero prime components, $n_p = 0$, with the small offset and the same exponential parameter $N_0$. That the unknot scales as a "zero component knot", rather than with no power law and possibly a different $N_K$, is a new observation from our data.
The amplitude $C_K$ depends on knot type and, up to corrections to scaling, is the only parameter that differentiates between prime knots, or between composite knots with the same $n_p$.
From the best fits to our data, the values of these universal constants are $N_0 = 259.3 \pm 0.2$ for equilateral random polygons, and $v_0 = -0.190 \pm 0.001$. Since $-1/2 < v_0 < 0$, the unknot acquires a negative power law scaling in addition to the well-established exponential decay with $N$. This contrasts with previous studies, where the unknot probability was interpreted as scaling exponentially with no power law. Our results are not necessarily incompatible with prior investigations, in which the errors (due to smaller samples) are larger.
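As an illustration of how the scaling form (1) can be evaluated with the constants quoted above, the following minimal Python sketch uses $N_0 = 259.3$, $v_0 = -0.190$ and $\Delta = 1/2$ as defaults. The bracketed correction is assumed here to be $1 + \beta_K N^{-\Delta} + \gamma_K N^{-1}$; the function name, the argument names and the unit amplitudes in the example are hypothetical and are not fitted values.

```python
import math

N0 = 259.3     # exponential parameter quoted in the text
V0 = -0.190    # power-law offset quoted in the text
DELTA = 0.5    # confluent correction exponent, assumed to be 1/2


def knot_probability(n_edges, amplitude, n_prime, beta=0.0, gamma=0.0):
    """Evaluate C * N^(v0 + n_p) * exp(-N/N0) * [1 + beta/N^Delta + gamma/N].

    The form of the bracket is an assumption modelled on lattice results.
    """
    exponent = V0 + n_prime
    leading = amplitude * n_edges ** exponent * math.exp(-n_edges / N0)
    correction = 1.0 + beta / n_edges ** DELTA + gamma / n_edges
    return leading * correction


# With equal (illustrative) amplitudes and no corrections, two knot types whose
# numbers of prime components differ by one have a probability ratio equal to N.
ratio = knot_probability(1000, 1.0, 2) / knot_probability(1000, 1.0, 1)
```

Setting `beta` and `gamma` to zero isolates the leading behaviour, which is what the ratio in the last line relies on.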
We also made preliminary investigations for an ensemble of non-equilateral random polygons. We used an ensemble of closed random polygons based on quaternions introduced in [21], which we call the quaternionic model. This has the advantage of being very fast and straightforward to implement numerically, and the quaternionic polygons have edge lengths sampled from beta distributions. For the quaternionic model we find the same power law, $v_0 = -0.19 \pm 0.03$, and a different exponential parameter, $N_0 = 430.5 \pm 1$. The similarity of $v_0$ and difference of $N_0$ are consistent with expectations of knot scaling. We will not describe many features of this model, but the broad findings are consistent with those for equilateral random polygons.
This type of numerical analysis is fundamentally limited: longer random polygons are not only more computationally expensive to analyse, but may adopt a vast range of prime and composite knot types. The chance of a specific knot type occurring for large $N$ therefore drops dramatically. Furthermore, more complex knot types are harder to identify numerically, and it becomes hard to find topological invariants that robustly distinguish them in realistic timescales. We make some simple estimates of the misidentification rate to support our main conclusions, but such difficulties limit the maximum $N$ for which reliable data can be found. Furthermore, given that $N_K = N_0$ and $v_K = v_0 + 1 > 0$ for all prime knots, the Ansatz (1) suggests that all prime knots have a maximum probability at $N \approx N_0 (1 + v_0)$ (with error depending on corrections to scaling). Clearly, knots with a large number of crossings ($N_0 (v_0 + 1)$ crossings or more) cannot have this maximum, and indeed we show that the position of the maximum drifts, depending on the correction to scaling parameters $\beta_K$ and $\gamma_K$. Nevertheless, (1) gives good agreement with the data for a significant number of the commonest random knots: prime knots, composite knots and the unknot.
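The position of this maximum follows from the leading factors $C_K N^{v_K} \exp(-N/N_0)$ alone. A short worked derivation, neglecting the corrections to scaling and using the fitted values $N_0 = 259.3$ and $v_0 = -0.190$ quoted above:

```latex
\begin{align*}
\log P_K(N) &\approx \log C_K + v_K \log N - \frac{N}{N_0}, \\
\frac{d}{dN} \log P_K(N) &= \frac{v_K}{N} - \frac{1}{N_0} = 0
  \quad\Longrightarrow\quad
  N_{\max} = v_K N_0 = \bigl(n_p(K) + v_0\bigr) N_0, \\
N_{\max}\big|_{n_p = 1} &= (1 - 0.190) \times 259.3 \approx 210 .
\end{align*}
```

The same expression shows that, at leading order, the maximum moves to larger $N$ as $n_p$ increases.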
The outline of the remainder of this paper is as follows. The subsections of Section 2 provide the details of random walk generation, knot detection and classification, and numerical parameter choices. In Section 3 we describe our results, in Section 4 we summarise our results from the quaternionic random walk model, and we conclude with a brief discussion in Section 5. Before this, however, we briefly summarise the knotting properties of random polygons confined to lattices, justifying the form of the Ansatz (1).

Knotting in lattice polygons
Although there are few analytic results [4] to test against the numerical results for the random polygons, some rigorous results are available for random lattice polygons (simple closed curves embedded in a three-dimensional lattice such as the simple cubic lattice, $\mathbb{Z}^3$). These rigorous results [17,18] guide our questions about the behaviour of random polygons in the continuum.
Writing $p_N$ for the number of polygons in the simple cubic lattice with $N$ edges, up to translation, clearly we have $p_N = 0$ if $N$ is odd, $p_4 = 3$ and $p_6 = 22$. Hammersley [19] showed that the limit
$$\lim_{N \to \infty} N^{-1} \log p_N = \log \mu, \qquad (2)$$
taken through even values of $N$, exists and the growth constant $\mu$ satisfies $3 < \mu < 5$. If $p_N(\emptyset) \equiv p_N^0$ is the number of $N$-edge polygons that are unknotted, then [17,18]
$$\lim_{N \to \infty} N^{-1} \log p_N^0 = \log \mu_0 \qquad (3)$$
and $\mu_0 < \mu$, i.e. unknotted polygons are exponentially rare in the set of lattice polygons. If $p_N(K)$ denotes the number of $N$-edge polygons of knot type $K$ then, similarly,
$$\limsup_{N \to \infty} N^{-1} \log p_N(K) < \log \mu, \qquad (4)$$
so polygons with any fixed knot type are also exponentially rare. The existence of the limit has not been proved for any knot type other than the unknot, and it has not been proved whether or not the exponential growth rate is independent of knot type. Although these rigorous results give interesting information about knot probabilities, they say very little about the relative probability of different knot types. To address these questions we need to know about the subdominant terms. It is believed [13] that
$$p_N = C \mu^N N^{\alpha - 3} (1 + o(1)) \qquad (5)$$
and it is reasonable to guess that
$$p_N^0 = C_0\, \mu_0^N N^{\alpha_0 - 3} (1 + o(1)), \qquad (6)$$
where $\mu_0 < \mu$ and where there is numerical evidence suggesting that $\alpha_0 = \alpha$ [15,16]. Similarly, there is numerical evidence [13,15,16,20] that
$$p_N(K) = C_K\, \mu_0^N N^{\alpha_0 - 3 + n_p(K)} (1 + o(1)), \qquad (7)$$
where $n_p(K)$ is the number of prime knots in the knot decomposition of $K$. Thus all knot types exhibit the same exponential growth rate, with the power law exponent depending only on the number of prime knots in the knot decomposition, and not on the particular knots involved.
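The small counts quoted above, $p_4 = 3$ and $p_6 = 22$, can be checked by brute force. The sketch below is a naive enumeration written for illustration only (the function name is hypothetical, and this is not how the rigorous results are obtained): it counts self-avoiding closed walks from the origin and divides by the $2N$ ways of rooting and orienting each polygon.

```python
def count_lattice_polygons(n_edges):
    """Count N-edge self-avoiding polygons on Z^3, up to translation."""
    if n_edges < 4 or n_edges % 2:
        return 0
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    origin = (0, 0, 0)

    def closed_walks(position, visited, remaining):
        # Number of ways to complete a self-avoiding walk that returns to the
        # origin in exactly `remaining` further steps.
        total = 0
        for dx, dy, dz in steps:
            nxt = (position[0] + dx, position[1] + dy, position[2] + dz)
            if remaining == 1:
                total += nxt == origin
            elif nxt not in visited:
                visited.add(nxt)
                total += closed_walks(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    rooted_directed = closed_walks(origin, {origin}, n_edges)
    # Each polygon is counted once per choice of root vertex and direction.
    return rooted_directed // (2 * n_edges)


p4 = count_lattice_polygons(4)
p6 = count_lattice_polygons(6)
```

The running time grows exponentially with the number of edges, so this is only practical for very short polygons.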
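The probability that a lattice polygon has knot type $K$ is (assuming that $\alpha_0 = \alpha$)
$$\frac{p_N(K)}{p_N} = A_K \left( \frac{\mu_0}{\mu} \right)^{N} N^{n_p(K)} (1 + o(1)), \qquad (8)$$
where $A_K = C_K / C$, while the relative probability of the knot type being $K_1$ or $K_2$ is
$$\frac{p_N(K_1)}{p_N(K_2)} = \frac{C_{K_1}}{C_{K_2}} N^{n_p(K_1) - n_p(K_2)} (1 + o(1)), \qquad (9)$$
even if $\alpha_0 \neq \alpha$. Our Ansatz (1) for random polygons strongly resembles (8), and has the negative exponential with $N_K = 1/\log(\mu/\mu_0)$, consistent with $N_K$ being independent of $K$. Our form of $v_K = v_0 + n_p(K)$ with $-1/2 < v_0 < 0$ indicates that, for random polygons, the analogues of the lattice exponents satisfy $\alpha > \alpha_0$. We will give numerical evidence for this in the following. Readers uninterested in the details of the dataset generation can skip to Section 3.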

Methodology and datasets
This section describes the numerical methods used to generate closed equilateral random walks, and the knot invariants used to identify knot types. Our numerical implementations of both the random walk models of Section 2.1 and the topological invariants of Section 2.2 are publicly available in the pyknotid knot identification toolkit [22]. We also perform a range of least squares fits to the numerical data, using standard nonlinear fitting routines [23].

Random walk models
A typical algorithm generating general random walks does not give closed loops, i.e. curves which return to their starting point. It is more difficult to sample the subset of closed random walks properly, but many algorithms have been proposed for generating random polygons, either equilateral or with some distribution of step lengths (such as a Gaussian distribution [3]). Examples include the polygonal fold, hedgehog, triangle, and crankshaft methods [24]. Although easy to implement numerically, not all of these algorithms give the desired probability distribution. When they do, they do so only as the limit distribution of a Markov process, and convergence may be slow [12]. In particular, different algorithms appear to generate very different selections of knot types, even with parameters that are nominally similar [24]. For a detailed investigation of knot statistics, it is desirable to generate random polygons with a properly defined distribution.
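To illustrate the first point, the following sketch generates an open equilateral random walk by drawing independent unit steps with uniformly distributed directions (obtained by normalising Gaussian vectors). It is not one of the polygon samplers discussed in this section; the function name and the seed are arbitrary choices. The end of such a walk typically lies far from its start, so a closure condition has to be imposed by the sampling algorithm itself.

```python
import numpy as np


def random_equilateral_walk(n_edges, rng):
    """Vertices of an open walk with unit-length, uniformly directed steps."""
    steps = rng.normal(size=(n_edges, 3))
    # Normalising isotropic Gaussian vectors gives uniform directions on the sphere.
    steps /= np.linalg.norm(steps, axis=1, keepdims=True)
    return np.vstack([np.zeros(3), np.cumsum(steps, axis=0)])


rng = np.random.default_rng(0)
walk = random_equilateral_walk(1000, rng)
edge_lengths = np.linalg.norm(np.diff(walk, axis=0), axis=1)
closure_gap = np.linalg.norm(walk[-1] - walk[0])  # distance between the two ends
```

For an unconditioned walk the closure gap is typically of order $\sqrt{N}$ edge lengths, rather than zero.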
A small number of algorithms have been shown rigorously to produce the correct distribution in polygon space. One method is to generate each polygon edge at random, conditioned that the walk will return to its origin after a fixed number of further steps [25,26]. Although good for short random knots, it is numerically complex and slow to generate longer polygons [25]. An improvement was recently proposed by Cantarella et al. [12], in which the complicated numerical arithmetic is replaced by a direct rejection sampling of valid states, generating valid polygons with $N$ edges in $O(N^{5/2})$ time. This action-angle method is the chosen source of random equilateral polygons here. Another approach, the 'toric symplectic Markov chain Monte Carlo' algorithm [27], has been shown to converge to the appropriate distribution, but this is again relatively difficult to implement numerically.
We sampled $1.96 \times 10^9$ equilateral random polygons using the action-angle method, at lengths from 6 to 4000 edges. The sampled lengths are every $N$ from $6 \le N \le 50$, steps of 10 from $50 < N \le 200$, steps of 50 from $200 < N \le 1000$, and steps of 100 from $1000 < N \le 4000$. At each length $N \le 3000$, we analysed at least $10^7$ different polygons, in some cases far more. For each length $N > 3000$ we analysed at least $10^6$ different polygons. Our analysis with the quaternionic model was based on similar choices.
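The grid of sampled lengths described above can be written out explicitly. The sketch below is one reading of that description (in particular, it assumes each coarser range starts one step above the previous upper limit); the function name is hypothetical.

```python
def sampled_lengths():
    """Edge numbers N at which polygons were sampled, as described in the text."""
    lengths = list(range(6, 51))              # every N with 6 <= N <= 50
    lengths += list(range(60, 201, 10))       # steps of 10 for 50 < N <= 200
    lengths += list(range(250, 1001, 50))     # steps of 50 for 200 < N <= 1000
    lengths += list(range(1100, 4001, 100))   # steps of 100 for 1000 < N <= 4000
    return lengths


lengths = sampled_lengths()
# Lower bound on the sample size implied by the stated per-length minima.
minimum_polygons = sum(10**7 if n <= 3000 else 10**6 for n in lengths)
```

Under this reading there are 106 distinct lengths, and the stated minima alone account for $9.7 \times 10^8$ polygons, consistent with a total of $1.96 \times 10^9$ when some lengths are sampled far more heavily.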

Methods for identifying knot types
Knots abound in random walks, and it is necessary to distinguish their distinct knot types. The Rolfsen table of knots [28], with standard extensions for knots with up to 16 crossings [29], denotes the knot $K_i$ as the $i$th knot with crossing number $K$, the minimum number of crossings a 2-dimensional diagram of the knot can have, which we denote $n_c$ (see Figure 1(a)). The ordering of the index $i$ is effectively arbitrary. The knot $0_1$ is the special case, called the unknot, representing the topologically trivial, simple circle. Knots with a crossing number $n_c \ge 11$ are referred to as $K_{ai}$ or $K_{ni}$ (e.g. $11_{a343}$, $11_{n3}$), where $a$ or $n$ indicates alternating and nonalternating knots respectively. Distinct chiral pairs of knots are not distinguished. Tables of knots and their properties are available from the Knot Atlas [30] and KnotInfo [31]. Knot tables give only the prime knots, which can also be joined together by a connect sum to form composite knots. Connect sums are denoted by $\#$, or with exponents denoting repeated connect sums of the same knot type. For instance, $3_1 \# 4_1^2$ represents the connect sum of a trefoil knot $3_1$ and two figure-eight knots $4_1$. Figure 1(a) shows the seven prime knots with $n_c \le 6$. Beyond these the number of knot types grows more rapidly: there are 7 knots with $n_c = 7$, 21 with $n_c = 8$, 49 with $n_c = 9$, 165 with $n_c = 10$, 552 with $n_c = 11$, 2176 with $n_c = 12$, 9988 with $n_c = 13$, 46972 with $n_c = 14$, and so on. The overall trend is of exponential growth in the number of prime knots with $n_c$ crossings [32,33]. Figure 1(b),(c) shows some examples of the knots $3_1$ and $7_6$ in random walks with different lengths. The trefoil knot $3_1$ is usually very small, made of only a few edges of the whole polygon. The knot $7_6$ is somewhat more complicated, dominating much of the structure of the random walk at 50 or even 100 edges, but as $N$ grows, the knotted regions occupy less of the curve in both cases; this behaviour for fixed knot type is well established [14].
Furthermore, it becomes relatively unlikely that a long polygon will contain a single knot component; at large $N$, composite knots dominate the statistics, with knots occurring essentially independently in different regions of the polygon [13,16].
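The growth in the number of prime knots listed above can be made concrete by taking ratios of successive counts. The short sketch below uses only the tabulated numbers quoted in the text; the variable names are arbitrary.

```python
# Number of prime knots at each crossing number n_c, as quoted in the text.
prime_knot_counts = {7: 7, 8: 21, 9: 49, 10: 165, 11: 552, 12: 2176, 13: 9988, 14: 46972}

# Ratio of the count at n_c to the count at n_c - 1.
growth_ratios = {
    n_c: prime_knot_counts[n_c] / prime_knot_counts[n_c - 1]
    for n_c in range(8, 15)
}
```

Every ratio in this range exceeds 2, and from $n_c = 10$ onwards each added crossing multiplies the count by more than 3, in line with the exponential trend.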
Determining the type of a complex random knot can be difficult. It is most efficient to identify knots by some set of knot invariants, i.e. tabulated functions of knot type. Unfortunately, easily calculable knot invariants are not perfect discriminators, taking the same value for distinct knot types. Furthermore, more discriminatory invariants usually require increased computational complexity.
The most common invariant for studying random knotting is the Alexander polynomial $\Delta_K(t)$ for knot type $K$ [28,34,35]. As numerical polynomial arithmetic is inconvenient, it is common to use the knot determinant $|\Delta_K(-1)|$. Unfortunately, the knot determinant is far less discriminatory than the full Alexander polynomial: $|\Delta_{4_1}(-1)| = |\Delta_{5_1}(-1)| = 5$, whereas the simplest indistinguishable pair by Alexander polynomials is $\Delta_{6_1}(t) = \Delta_{9_{46}}(t) = 2t^2 - 5t + 2$, and the simplest knot with Alexander polynomial indistinguishable from the unknot, $\Delta(t) = 1$, is $11_{n39}$. Therefore the determinant is often paired with certain Vassiliev invariants $v_2, v_3, v_4, \ldots$ [6,25,36,37]. These may be calculated in polynomial time in the number of crossings of the knot representation. In practice, $v_2$ and the determinant are easily calculated, $v_3$ is practically calculable for knots with up to a few tens or hundreds of crossings, and higher Vassiliev invariants are generally not computationally practical for use with complicated curves. The Alexander polynomial is not completely independent of these invariants; in fact, $v_2$ is equal to the coefficient of $t^2$ in the (properly normalised) Conway polynomial. Although other invariants, such as the Jones and HOMFLY polynomials, are more powerful discriminators, computing these is exponential in the number of crossings of the projection [35], and they are only practical for projected curves with no more than a few tens of crossings.
The invariants we use here are the Alexander polynomial at certain roots of unity,
$$\Delta_r \equiv \left| \Delta_K\!\left( \exp(2\pi i / r) \right) \right|, \qquad r = 1, 2, 3, \ldots \qquad (10)$$
Each $\Delta_r$ is an invariant as easily calculated as the knot determinant, with the only numerical change being the use of complex datatypes. $\Delta_1 = 1$ always, so it is not a useful invariant [38], and $\Delta_2$ is the knot determinant. As shown in Appendix A, $\Delta_2$, $\Delta_3$, $\Delta_4$ conveniently are always integers, and we limit our calculation to these values. Higher-order roots of unity provide relatively little extra discriminatory value; for discriminating between the prime and composite knots which appear in random walks, the first three roots of unity are almost as good as the full Alexander polynomial. Although we could attempt to increase discriminatory power by calculating Vassiliev invariants, $v_2$ adds little to no useful discriminatory power, and $v_3$ and higher invariants significantly slow down the calculations for knots longer than a few hundred steps. Hence, to recognise knots, we calculate $\Delta_2 = |\Delta_K(-1)|$, $\Delta_3 = |\Delta_K(\exp(2\pi i/3))|$ and $\Delta_4 = |\Delta_K(i)|$. This allows us, with confidence, to distinguish all prime knots with $n_c \le 7$, the 21 knots with $n_c = 8$ except for 8 5 21 , and the 49 nine-crossing knots except for $9_2$, $9_8$, $9_{12}$, $9_{16}$, $9_{23}$, $9_{24}$, $9_{28}$, $9_{29}$, $9_{37}$, $9_{38}$, $9_{39}$, $9_{40}$, $9_{46}$, $9_{48}$. These excluded eight- and nine-crossing knots have the same invariants as either a simpler (more common) prime knot, or a common composite knot. We also identify composite knots with five or fewer components, involving any number of trefoil knots $3_1$ with one other prime knot, and a smaller number of examples involving more nontrefoil components. This introduces some error into the count, e.g. some cases identified as $3_1^2$ might be $8_{20}$ (which has the same $\Delta(t)$). However, in all important cases, one of the possible knots for a given set of invariants occurs much more frequently than the alternatives, and this conflict does not appear to harm the results.
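The behaviour of these invariants can be illustrated on a few simple knots. The sketch below evaluates $\Delta_2$, $\Delta_3$, $\Delta_4$ directly from tabulated Alexander polynomials, using the fact that the Alexander polynomial of a connect sum is the product of those of its components. It is not the routine used for our data, which works from the crossings of a projected curve; the function names and the dictionary are hypothetical.

```python
import cmath

# Tabulated Alexander polynomial coefficients, highest power first.
ALEXANDER = {
    "3_1": [1, -1, 1],          # t^2 - t + 1
    "4_1": [1, -3, 1],          # t^2 - 3t + 1
    "5_1": [1, -1, 1, -1, 1],   # t^4 - t^3 + t^2 - t + 1
}


def evaluate(coeffs, t):
    """Horner evaluation of a polynomial at the complex point t."""
    value = 0
    for c in coeffs:
        value = value * t + c
    return value


def root_invariants(factors, orders=(2, 3, 4)):
    """Return (Delta_2, Delta_3, Delta_4) for a connect sum of the listed prime knots."""
    result = []
    for r in orders:
        t = cmath.exp(2j * cmath.pi / r)
        modulus = 1.0
        for name in factors:
            modulus *= abs(evaluate(ALEXANDER[name], t))
        # The moduli are integers for r = 2, 3, 4, so rounding removes float noise.
        result.append(round(modulus))
    return tuple(result)


trefoil = root_invariants(["3_1"])
figure_eight = root_invariants(["4_1"])
cinquefoil = root_invariants(["5_1"])
double_trefoil = root_invariants(["3_1", "3_1"])
```

The knots $4_1$ and $5_1$ share the determinant $\Delta_2 = 5$ but are separated by $\Delta_3$ and $\Delta_4$, which is the extra discrimination the additional roots of unity provide.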
Figure 2 shows the knot fractions for several different prime and composite knot types from our numerically generated equilateral random polygons. Figure 2(a) shows data for prime knots. Evidently, the prime knot probabilities are all very similar, apart from the overall amplitude factor given by $C_K$, which decreases as the knot complexity increases (as characterised by the crossing number). Figure 2(b) shows data for composite knots, with numbers of components $n_p(K)$ varying from 0 (the unknot $0_1$) to 3 (the connect sum of three trefoils). Composite knots with the same number of components $n_p$ have broadly similar probabilities, up to a relative scaling determined by $C_K$. The location of the maximum in the probability distribution increases with $n_p$ as the overall amplitude decreases. Overall, knots with larger $n_p$ are less likely. The knot types shown in Figure 2 are only a small sampling of the data we have, and the behaviour for other knot types is consistent with that shown in the figure. The data points in the figure are fitted according to (1), with $N_K = N_0 = 259.3 \pm 0.2$ and $v_K = v_0 + n_p(K)$ with $v_0 = -0.190 \pm 0.001$. Values of $C_K$, $\beta_K$ and $\gamma_K$ are chosen to give the best fit for each knot type, and the fit for each knot type is excellent. The following discussion will provide more details motivating the form of the Ansatz and the universal nature of $N_K = N_0$, $v_K = v_0 + n_p(K)$.

Summary of observed behaviour
Equation (1) is an excellent fit to the data for the various prime and composite knots shown. In the following subsections, we provide separate motivation to support the form of (1). In Section 3.2, we consider ratios of probabilities of knot types with the same $n_p$, or with $n_p$ differing by unity; the log-log gradients being zero or one (within error) indicate the universality of $N_K = N_0$ and $v_K = v_0 + n_p(K)$. In Section 3.3, we consider the best fit result for $N_0$ and $v_0$ together, showing the best fit agrees for different knot types. In Section 3.4, we explore this fit further for the unknot, for which the form (1) with nonzero $v_0$ is new. We then consider the different values of the amplitudes $C_K$ in Section 3.5, before discussing the corrections to scaling in Section 3.6.

Probability ratios
Comparisons of $P_{K_1}(N)/P_{K_2}(N)$, for prime $K_1$ and $K_2$, justify our claim that $N_K$ and $v_K$ are independent of prime knot type. If $N_K$ depends on knot type, then as $N \to \infty$ the ratio tends to zero or infinity exponentially rapidly. If $N_{K_1} = N_{K_2}$, but the exponent $v_K$ depends on prime knot type, then the ratio goes to zero or infinity, but not exponentially rapidly. If $N_{K_1} = N_{K_2}$ and $v_{K_1} - v_{K_2} = n_p(K_1) - n_p(K_2)$ then the ratio has the form
$$\frac{P_{K_1}(N)}{P_{K_2}(N)} = \frac{C_{K_1}}{C_{K_2}}\, N^{n_p(K_1) - n_p(K_2)} \left[ 1 + \frac{\beta_{K_1} - \beta_{K_2}}{N^{\Delta}} + \ldots \right]. \qquad (11)$$
The results from our data for $P_{K_1}(N)/P_{K_2}(N)$ against $N$ on a log-log scale are shown in Figure 3. In (a), $n_p(K_1) = n_p(K_2) + 1$ for several pairs with $n_p(K_2) = 0, 1, 2$. The curves fit very well to a straight line of gradient unity with a very small error in all cases. This suggests that each $K_1$ and $K_2$ have the same exponential term, and power law exponents differing by 1. In Figure 3(b), several pairs are shown where $n_p(K_1) = n_p(K_2)$. Again the curves seem to be asymptotically linear with limiting slopes effectively zero. None of the curves in Figure 3(b) approach zero or infinity as $N$ increases, suggesting that $N_K = N_0$ and $v_K = v_0 + n_p(K)$ for all knots. This analysis, however, does not give values for the universal constants $N_0$ and $v_0$. The corrections to scaling $\beta_K$ in (11), indicating the way the curve approaches the asymptotic ratios, will be considered below in Section 3.6.
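The gradient test described above amounts to a straight-line fit of the logarithm of the probability ratio against $\log N$. The sketch below shows one way this could be done; it is run on synthetic leading-order ratios rather than on our data, and the amplitude 0.002 and the function name are illustrative only.

```python
import numpy as np


def log_log_slope(n_values, ratio):
    """Least-squares gradient of log(ratio) against log(N)."""
    slope, _intercept = np.polyfit(np.log(n_values), np.log(ratio), 1)
    return slope


n_values = np.arange(500, 3001, 100)
# Synthetic ratio for n_p(K1) - n_p(K2) = 1, with an illustrative amplitude ratio.
synthetic_ratio = 0.002 * n_values ** 1.0
slope = log_log_slope(n_values, synthetic_ratio)
```

With measured probabilities the corrections to scaling make the gradient approach its limiting value only asymptotically, so in practice the fit would be restricted to larger $N$.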

Determining values of N 0 and v 0
Although justifying the general form of the knot probability, the method above does not determine the numerical values of $N_0$ and $v_0$. This is complicated by the fact that best fits to $N_0$ and $v_0$ cannot be determined independently. We perform the analysis for the commonest knot type of each number of components $n_p(K)$: the unknot $0_1$, the trefoil knot $3_1$, and the connect sums of trefoils $3_1 \# 3_1$ and $3_1 \# 3_1 \# 3_1$. As evident in Figure 2, the commonest knot types from all the data are, in order, the unknot, the trefoil and $3_1 \# 3_1$.
For each knot type and $v_0$ in the considered range, we calculate the best fit to the Ansatz (1) by minimising the sum of the squared deviations, weighting the data points by the inverse variance, whilst varying $N_0$, $C_K$, $\beta_K$ and $\gamma_K$. In Figure 4 we plot the optimal $N_K$ for each $v_0$, with error bars representing the 95% tolerance of the fitted data to this value (the other parameters are not shown). The lines of best fit for $N_K$ against $v_0$ intersect very close to one another, and very close to the values where the error bars are the smallest, as shown in the inset. The crossings do not take place at precisely the same $v_0$, $N_K$, and from this we estimate the errors, giving $v_0 = -0.190 \pm 0.001$ and $N_K = 259.3 \pm 0.2$.

Figure 5. [...] (1) with the numerical values fit as in Section 3.3 (red line), the best fit assuming a pure exponential decay with the best fit exponential decay constant $N_0 = 246.5$ (blue), and the best fit using the exponential constant $N_0 = 259.3$ (black) from Section 3.3. In each case, corrections to scaling terms are included (not shown) to optimise the fit. The inset shows the modulus of the relative deviation of each fit from the data. The Ansatz, including the power law, is a significantly better fit than the pure exponentials.
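A minimal sketch of this kind of weighted fit is given below. It is not the fitting routine used for our results: for a fixed pair $(v, N_0)$ the Ansatz is linear in $C_K$, $C_K \beta_K$ and $C_K \gamma_K$, so the sketch solves for these by linear least squares and then scans $N_0$ over a grid. The bracket $1 + \beta_K N^{-\Delta} + \gamma_K N^{-1}$ with $\Delta = 1/2$ is assumed, and the synthetic data, the 1% errors and all names are illustrative.

```python
import numpy as np

DELTA = 0.5  # assumed confluent correction exponent


def weighted_misfit(n, prob, sigma, v, n0):
    """Inverse-variance weighted sum of squares at fixed (v, N0).

    The amplitudes C, C*beta and C*gamma enter linearly and are solved for directly.
    """
    leading = n ** v * np.exp(-n / n0)
    basis = np.column_stack([leading, leading * n ** -DELTA, leading / n])
    coeffs, *_ = np.linalg.lstsq(basis / sigma[:, None], prob / sigma, rcond=None)
    residual = (basis @ coeffs - prob) / sigma
    return float(residual @ residual)


def best_n0(n, prob, sigma, v, n0_grid):
    """Scan N0 over a grid and return the value minimising the weighted misfit."""
    misfits = [weighted_misfit(n, prob, sigma, v, n0) for n0 in n0_grid]
    return n0_grid[int(np.argmin(misfits))]


# Synthetic, noise-free prime-knot-like data; C, beta and gamma are illustrative.
n = np.arange(100, 3001, 100, dtype=float)
v_true, n0_true = 0.81, 259.3
prob = 0.01 * n ** v_true * np.exp(-n / n0_true) * (1 - 2.0 / np.sqrt(n) + 1.0 / n)
sigma = 0.01 * prob  # assumed 1% standard errors
estimate = best_n0(n, prob, sigma, v_true, np.linspace(250.0, 270.0, 201))
```

Repeating the scan for a range of trial values of $v$ gives a curve of optimal $N_0$ against $v_0$ for each knot type, which is the kind of curve whose intersections are used above.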
As discussed above, these values of exponents give excellent fits for all knot types, as indicated for a sample of our data in Figure 2.

Fitting the unknot
As we have discussed, the unknot $0_1$ appears in our Ansatz (1) not as a type of prime knot, but is properly considered as the unique composite knot with zero components, $n_p(0_1) = 0$. Without corrections to scaling, its probability is $P_{0_1}(N) \approx C_{0_1} N^{v_0} \exp(-N/N_0)$, and indeed in the last subsection, we described how the best fit of $v_0$ and $N_0$ to the unknot data agrees very well with the values for the trefoil and its connect sums.
It has long been thought that the probability of a random unknotted polygon, ignoring small-scale effects, is simply exponential [5,6,34,39], with no power law term:
$$P_{0_1}(N) = C_{0_1} \exp(-N/N_0). \qquad (12)$$
This form is consistent with [4] and (6), and has been verified in a wide variety of random polygon models [5,6,11,34,37], as well as the lattice case [34,40].
In Figure 5, we show how our Ansatz for the unknot data, including the $v_0$ exponent (and best fit corrections to scaling), compares with the pure exponential form (12). The difference between the best fits is shown in the inset. Most data was generated for $N \le 300$, and in this range, the agreement is good for all of the fits. However, the pure exponential with $N_0 = 259.3$ systematically deviates (with a linear error) for $N > 300$, as indicated by the black curve in Figure 5, and the pure exponential with the best fit $N_0 = 246.5$ deviates systematically, in a similar way, when $N > 600$, as indicated by the blue curve in Figure 5. The two pure exponential fits deviate from the data systematically, with different signs. Both have a systematic deviation that grows as $N$ increases, while our Ansatz, indicated by the red curve in Figure 5, is a substantially better fit without any systematic deviation.
The Ansatz with exponential and power law found from the last section, based on the data from the knots as well as unknots, gives a good fit over the entire range. This suggests that the pure exponential model is an approximation for small $N$, while for large $N$ it is necessary to have the power law term. Meanwhile, our Ansatz suggests a greater universality that incorporates the unknot into a wider class, as a composite knot with zero components. As discussed in Section 3.3, these values were chosen from the simultaneous optimisation of both the unknot and multiple trefoil knots.
In fact, the deviation from pure exponential scaling for the unknot has been observed in previous studies [11,37]. However, its effect was not distinguished from systematic errors: many older studies do not sample enough random walks to detect the change. The discrepancy was interpreted [37] as nontrivial knots being incorrectly identified as the unknot. These misidentifications are not represented in the error bars as it is difficult to estimate their number.
It is very difficult to estimate confidently, beyond tabulations, the number of nontrivial knots with Alexander polynomial (at roots of unity) corresponding to the unknot. The number of prime knots with $n_c \le 15$ whose $\Delta_2$, $\Delta_3$, $\Delta_4$ match the unknot grows quickly for $n_c \ge 11$ (there are 2 examples with $n_c = 11$, 2 with $n_c = 12$, 15 with $n_c = 13$, 36 with $n_c = 14$, 145 with $n_c = 15$). The probability of each of these knot types occurring drops rapidly with $n_c$, as discussed in Section 3.5; it is not clear how this decrease compares to the exponential increase of knot types with $n_c$, and no stable pattern emerges for $n_c \le 15$. Composite knots consisting of these components would also appear as the unknot, but are even rarer.
We do not believe that the deviation in Figure 5 can be explained by misidentification of unknots. Rather than estimate the misidentification rate by improving the discriminating power, we adopt the opposite methodology, comparing the results with a less discriminatory analysis using only the determinant $\Delta_2$, which is a much weaker invariant than the set $\Delta_2$, $\Delta_3$, $\Delta_4$. With the determinant alone, there are 2 examples misidentified as the unknot with $n_c = 10$, 4 with $n_c = 11$, 11 with $n_c = 12$, 44 with $n_c = 13$, 162 with $n_c = 14$, 724 with $n_c = 15$, and so on. If the deviation from the fit in Figure 5 were due to misidentification, we would expect a significantly larger deviation from identifying the unknot only by determinant. However, the change to the results is very small: for instance, it accounts for $< 2\%$ more detected 'unknots' at length 2000 than with $\Delta_2$, $\Delta_3$, $\Delta_4$, and this misidentification rate grows only slowly with $N$. This is far smaller than the $\sim 13\%$ deviation of the unknot fraction from exponential decay in Figure 5, despite the unknot misidentification rate being far larger than with the original data. These changes are too small to be visible in any of the plots of Figure 5, and we conclude that unknot misidentification is a negligible error in Figure 5. This also suggests that the determinant alone is a reasonably reliable invariant for detecting unknotted random polygons; however, it cannot distinguish between simple prime knots. Furthermore, our results about the unknot do not exist independently of other knot types, as the argument in Section 3.2 shows. If the unknot indeed had the different form (12) with different exponents, it would be very surprising that the inclusion of other knots with the same Alexander polynomial would exactly cancel to give the relevant ratio plots in Figure 3(a). As discussed above, these log-log plots of the ratio of probabilities of various prime knots against the unknot tend to straight lines of gradient unity, and not to 0 or $\infty$, which indicates that the unknot probability has the same form as prime knots, except for the different $n_p$.

Figure 6. Features that vary with knot type. (a) Scatter plot of $C_K$ for the prime knots, following data given in Table 1. $C_K$ tends to decrease with crossing number $n_c$, but with a broad range of $C_K$ for a given $n_c$. (b) Scatter plot of $N_{\max}$, the position of the maximum of $P_K(N)$ for different $K$. Without the corrections to scaling, these would all be at the same point at $N \approx 210$.
It is important to note that the best fits reported here involve varying the correction-to-scaling parameters $\beta_K$ and $\gamma_K$, which are not shown in Figure 5. These parameters were optimised for all three fits to the data shown. Varying the values of $\beta_K$ and $\gamma_K$ does not affect the systematic advantage of the Ansatz fit over the others; a variation of 10% in these fitting parameters changes the deviation by about 3% at $N = 2000$. These fitting parameters are discussed in general in Section 3.6.

The knot coefficient amplitudes $C_K$
The results discussed so far indicate that the main way the knot type determines the random polygon probability is through the knot coefficient $C_K$ (up to the number of prime components $n_p(K)$ in $K$, and ignoring corrections to scaling). In particular, the relative fractions of composite knots with the same $n_p$ are determined almost entirely by the knot coefficient $C_K$ [11]. The values of $C_K$ endow the (prime) knots with a natural ordering (lower values indicate more complex knots, occurring more rarely), although little is known about how $C_K$ is related to the average geometry of the curves. We estimate the amplitudes $C_K$ for all prime knots with at most seven crossings, and for some eight-crossing knots and composite knots. The values of $C_K$ for the simplest prime and composite $K$ for random polygons are given in Table 1. Figure 6 (a) shows how the amplitudes $C_K$ depend on crossing number $n_c$ for prime knots. There is a general decrease in the value of the amplitude as the crossing number increases, but the spread in values at fixed crossing number also increases as the number of prime knots with that crossing number increases. This is also consistent with our results for the quaternionic model of random knots with varying edge lengths, given in Table 2.
Table 1. Values of $-\log C_K$, the logarithm of knot coefficient/amplitude for random polygons, depending on knot type $K$, for the simplest distinguishable prime and composite knots. The unknot has $C_{0_1} = 3.67$, i.e. $\log C_{0_1} = 1.30$. These are found from the best fits for each knot type using (1) with the fixed values of $N_0$ and $v_0$, and varying $\beta_K$ and $\gamma_K$ for the best fit. The knot $8_2$ is absent from the table because its occurrence in the data is nearly negligible, and hence a fit is not possible.
Table 2. Values of $-\log C_K$, the logarithm of knot coefficient/amplitude for quaternionic random walks, depending on knot type $K$, for the simplest distinguishable prime and composite knots. The data here are of lower quality than those for the random polygons. The unknot has $C_{0_1} = 4.25$, i.e. $\log C_{0_1} = 1.46$. These are found from the best fits for each knot type using (1) with the fixed values of $N_0$ and $v_0$, and varying $\beta_K$ and $\gamma_K$ for the best fit.
It is interesting to examine the amplitude ratios, and we have estimated $C_{3_1}/C_K$ for prime knots $K$. These ratios are the relative probabilities of the two knots in the large $N$ limit. We find that $C_{3_1}/C_{4_1} \approx 4.6$, $C_{3_1}/C_{5_1} \approx 13.0$, $C_{3_1}/C_{5_2} \approx 7.3$, $C_{3_1}/C_{6_1} \approx 22.0$, $C_{3_1}/C_{6_2} \approx 20.7$ and $C_{3_1}/C_{6_3} \approx 34.1$. This suggests that a trefoil is about 4.6 times more likely than a figure-eight knot in the large $N$ limit, and so on. These values are very different from the values found by Janse van Rensburg and Rechnitzer [41] for lattice knots, but they are close to the values that we find for the quaternionic knots model, for which $C_{3_1}/C_{4_1} \approx 4.85$ is fairly close to the value found by Deguchi [8]. Similarly, for $C_{3_1\#3_1}/C_{3_1\#4_1}$ we find a value of about 2.4 and Deguchi [8] reports a value of about 2.5. It seems that amplitude ratios are probably universal among the off-lattice models (with no excluded volume term) but that lattice knots belong to a different universality class [41].
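Since the amplitude ratios are large-$N$ relative probabilities, they can be turned into relative shares by simple normalisation. The following is a minimal sketch using only the ratios $C_{3_1}/C_K$ quoted above for equilateral random polygons; the function name is illustrative, and the shares are relative to this list of seven prime knots only, not to all knot types.

```python
# Amplitude ratios C_{3_1}/C_K quoted in the text (equilateral random polygons).
RATIO_TREFOIL_TO = {
    "3_1": 1.0,
    "4_1": 4.6,
    "5_1": 13.0,
    "5_2": 7.3,
    "6_1": 22.0,
    "6_2": 20.7,
    "6_3": 34.1,
}


def limiting_shares(ratios):
    """Large-N shares among the listed knots only, proportional to C_K."""
    weights = {knot: 1.0 / r for knot, r in ratios.items()}  # C_K / C_{3_1}
    total = sum(weights.values())
    return {knot: w / total for knot, w in weights.items()}


shares = limiting_shares(RATIO_TREFOIL_TO)
for knot, share in shares.items():
    print(f"{knot}: {share:.3f}")
```

Within this restricted list the trefoil accounts for roughly two thirds of the weight, which reflects how quickly $C_K$ falls off with knot complexity.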
In Figure 6 (b) we plot $N_{\max}$, the value of $N$ at the maximum of $P_K(N)$, for different prime knots. Without corrections to scaling, all of these would be at $N = N_0 (v_0 + 1) \approx 210$. Evidently, the values of $N_{\max}$ are all larger than 210, and increase with $n_c$. This shows the significant effect of the correction-to-scaling terms, to which we now turn.
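The shift of the maximum can be illustrated numerically. The sketch below assumes that Ansatz (1) has the form $P_K(N) = C_K (N/N_0)^{v_K} e^{-N/N_0} (1 + \beta_K/\sqrt{N} + \gamma_K/N)$, which is inferred from the description of its terms rather than quoted here. The value of $N_0$ is likewise inferred from $N_0 (v_0 + 1) \approx 210$ with $v_0 \approx -0.19$, and $\gamma_K$ is set to zero purely for illustration; only $\beta_{3_1} = -1.24$ is taken from the text.

```python
import math

V0 = -0.19               # power-law exponent quoted in the text
N0 = 210.0 / (V0 + 1.0)  # inferred from N0 * (v0 + 1) ~ 210; an assumption


def log_prob(n, n_p, beta=0.0, gamma=0.0):
    """log(P_K(N) / C_K) under the assumed form of the Ansatz."""
    v = V0 + n_p
    correction = 1.0 + beta / math.sqrt(n) + gamma / n
    return v * math.log(n / N0) - n / N0 + math.log(correction)


def n_max(n_p, beta=0.0, gamma=0.0, n_values=range(10, 2001)):
    """Integer N that maximises the assumed P_K(N), found by direct search."""
    return max(n_values, key=lambda n: log_prob(n, n_p, beta, gamma))


leading = n_max(n_p=1)              # no corrections to scaling
shifted = n_max(n_p=1, beta=-1.24)  # beta for 3_1 from the text; gamma = 0 is hypothetical
print(leading, shifted)
```

A negative $\beta_K$ makes the correction factor grow with $N$, so the maximum moves to larger $N$, in line with the trend described above.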

Corrections to scaling
In Ansatz (1), we include correction-to-scaling terms. There are two: one proportional to $1/\sqrt{N}$, with parameter $\beta_K$, and a Darboux-type term proportional to $1/N$, with parameter $\gamma_K$. These are suggested by the corrections expected for self-avoiding walks on a lattice. The coefficient $\gamma_K$ should depend on knot type since it will reflect, in part, the minimum number of edges required to tie the knot. It is not a priori obvious whether $\beta_K$ should depend on knot type.
For prime knots we have presented evidence that the exponential decay constant $N_K = N_0$ and the exponent $v_K = v_0 + 1$ are independent of prime knot type. Thus the ratio of probabilities of two prime knots, as $N \to \infty$, should approach the ratio of their amplitudes. The correction-to-scaling terms control how this limit is approached. In Figure 7 (a)-(c), we show the ratios of probabilities for various pairs of prime knots, as a function of $1/\sqrt{N}$. These curves appear to intercept the vertical axis (in the limit $N \to \infty$) at positive finite values, consistent with $N_K$ and $v_K$ indeed being independent of prime knot type, with a limiting slope close to zero. This intercept is close to the ratio of knot coefficients $C_{K_1}/C_{K_2}$, given by the horizontal red lines. If the limiting slope is exactly zero, then $\beta_K$ is independent of prime knot type.
Figure 7. [...] The orange curves are the best fits from the Ansatz, including the corrections to scaling. The constant lines (red) indicate the ratio of knot coefficients $C_{K_1}/C_{K_2}$, with errors given in the green band. The fact that the curves appear to approach these constant values in the limit $1/\sqrt{N} \to 0$ indicates once again that the overall scalings follow the Ansatz (1) with universal $N_0$ and $v_0$. The approach indicates how the correction-to-scaling term $\beta_K$ compares between $K_1$ and $K_2$. In (a)-(c), which are comparisons of simple prime knots, this is close to zero, but in (d)-(f), for knots with different numbers of components, this is quite different.
While the asymptotic curve in Figure 7 (a) is nearly flat, the asymptotic gradients in Figure 7 (b) and (c) are clearly non-zero. This suggests that, although the $\beta_K$ values of $3_1$ and $4_1$ are close, $\beta_K$ for $5_1$ and $5_2$ differs from that of $3_1$ much more than that of $4_1$ does. We infer that $\beta_K$ depends on knot type, and in some way on crossing number. This result is also consistent with Figure 6 (b), where the position of the maximum of $P_K(N)$ for prime knots shifts to higher $N$ as the number of crossings increases; such a shift is algebraically reflected by the correction terms.
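The link between the limiting slope and $\beta_K$ can be made explicit with a short expansion. The following sketch assumes that Ansatz (1) has the form $P_K(N) = C_K (N/N_0)^{v_K} e^{-N/N_0} (1 + \beta_K/\sqrt{N} + \gamma_K/N)$, inferred from the description of the correction terms rather than quoted here, and takes $K_1$ and $K_2$ prime so that the power law and exponential factors cancel.

```latex
\frac{P_{K_1}(N)}{P_{K_2}(N)}
  = \frac{C_{K_1}}{C_{K_2}}\,
    \frac{1 + \beta_{K_1} N^{-1/2} + \gamma_{K_1} N^{-1}}
         {1 + \beta_{K_2} N^{-1/2} + \gamma_{K_2} N^{-1}}
  = \frac{C_{K_1}}{C_{K_2}}
    \left[ 1 + \frac{\beta_{K_1} - \beta_{K_2}}{\sqrt{N}}
      + \frac{\gamma_{K_1} - \gamma_{K_2}
              - \beta_{K_2}\,(\beta_{K_1} - \beta_{K_2})}{N}
      + O\!\left(N^{-3/2}\right) \right]
```

Plotted against $1/\sqrt{N}$, the intercept is therefore $C_{K_1}/C_{K_2}$ and the limiting slope is proportional to $\beta_{K_1} - \beta_{K_2}$, so under this assumed form a flat approach corresponds to equal $\beta_K$ and any curvature comes from the $1/N$ term.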
In our fitting calculations as described in Section 3.3, both $\beta_K$ and $\gamma_K$ were allowed to vary in order to achieve the best fit for $N_0$. These best fits are also shown in Figure 7. The best fit $\beta_K$ and $\gamma_K$ for several knot types are given in Table 3.
Table 3. Values of the correction-to-scaling coefficients $\beta_K$ and $\gamma_K$ for random polygons when $K$ is prime. Optimising these values was part of the fitting procedure. In addition, the unknot is found to have values $\beta_{0_1} = -3.8$, $\gamma_{0_1} = 8.3$. Furthermore, simple composite trefoil knots have $\beta_{3_1^2} = +1.7$, $\gamma_{3_1^2} = -48.9$ and $\beta_{3_1^3} = +3.3$, $\gamma_{3_1^3} = -69$. We find the analogous terms for the quaternionic walks to follow similar trends. The knot $8_2$ is absent from the table because its occurrence in the data is nearly negligible, and hence a fit is not possible.
The best fit values for $\beta_K$ given in the table are $-1.24$, $-1.14$ and $-1.3$ for $3_1$, $4_1$ and $5_1$ respectively, which are very close but not quite the same; this is consistent with the fitted curves in Figure 7 (a)-(c) having a small gradient when they meet the vertical axis.
We also compare $N P_{K_1}(N)/P_{K_2}(N)$ against $1/\sqrt{N}$, where $n_p(K_2) - n_p(K_1) = 1$. These are shown in Figure 7 (d)-(f). The curves approach a positive finite value as $1/\sqrt{N} \to 0$, consistent with our claims that $N_{K_1} = N_{K_2}$ and $v_{K_2} = v_{K_1} + 1$.
The unknot best fit is $\beta_{0_1} = -3.8$, which is quite far from $\beta_{3_1}$, consistent with the gradient in Figure 7 (d). When we compare the unknot against the trefoil, or composites of trefoils against each other, there is a strong negative gradient, suggesting that $\beta_{K_2} > \beta_{K_1}$. The strong minimum in each case indicates a significant effect from the Darboux term $\gamma_K$ as well.
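The combined effect of the two correction terms can be sketched with the composite trefoil values quoted with Table 3. The example assumes the correction factor in Ansatz (1) is $1 + \beta_K/\sqrt{N} + \gamma_K/N$, an inferred form, so that $N P_{K_1}/P_{K_2}$ divided by its limiting value is a ratio of two such factors. The names and the grid are illustrative, and this is not a reproduction of Figure 7.

```python
def correction(x, beta, gamma):
    """Assumed correction factor 1 + beta/sqrt(N) + gamma/N, with x = 1/sqrt(N)."""
    return 1.0 + beta * x + gamma * x * x


# Best-fit values quoted in the text for composite trefoils.
K1 = {"beta": 1.7, "gamma": -48.9}  # 3_1 # 3_1
K2 = {"beta": 3.3, "gamma": -69.0}  # 3_1 # 3_1 # 3_1


def scaled_ratio(x):
    """N * P_K1 / P_K2 divided by its N -> infinity limit (assumed Ansatz form)."""
    return correction(x, **K1) / correction(x, **K2)


xs = [i * 1e-4 for i in range(1, 1001)]  # x from 1e-4 to 0.1, i.e. N >= 100
x_min = min(xs, key=scaled_ratio)        # location of the minimum on the grid
slope_at_zero = (scaled_ratio(1e-4) - 1.0) / 1e-4

print(f"slope near x = 0: {slope_at_zero:.2f}")
print(f"minimum near N = {1.0 / x_min**2:.0f}")
```

The initial slope is close to $\beta_{K_1} - \beta_{K_2} = -1.6$, while the turning point arises only because the $\gamma_K$ terms are present; setting both to zero in this sketch removes the minimum on the plotted range.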
We also found (not shown) that when $K_1$ and $K_2$ are composite with the same number of components, their $\beta_K$ values are similar when their components have similar $\beta_K$ values.

Summary of results from quaternionic random walks
In Figure 8 we summarise the main results of the analysis with quaternionic random walks. These all appear very similar to the analogous plots for equilateral random polygons; the fits from the Ansatz in Figure 8 (a), (b) look very good, and the best fit for $v_0$ is at $-0.19$, albeit with a slightly poorer fit than for equilateral random polygons. The optimal value of the exponential decay constant is $N_0 = 430.5$, somewhat larger than for equilateral random polygons (as expected for walks with varying step lengths, since multiple short steps do not contribute significantly to the knotting topology). Figure 8 (d) shows that the best fit line for the unknot probability $P_0(N)$ again follows our Ansatz (red) much better, for larger $N$, than either the pure exponential with the value of $N_0$ fitted from Figure 8 (c) (black) or the best fit pure exponential (blue). Figure 8 (e) shows that the ratio of the probability of trefoils to that of figure-eight knots decreases to the ratio of knot coefficients, apparently with similar values of $\beta_K$. Just as for equilateral random polygons, the $\beta_K$ are quite different for prime knots and the unknot, evident in Figure 8 (e). Again, the lower quality of the large-$N$ data for quaternionic walks is revealed towards the asymptotic regimes.
The best fit values of $C_K$ are given in Table 2, and were considered briefly in Section 3.5.

Discussion
We have presented numerical evidence that Equation (1) describes the scaling behaviour of the probability of different knot types occurring in random polygons with length $N$, that is, an exponential decay characterised by the constant $N_0$ (the same for all knots, but dependent on the model), and a power law with exponent $v_K = v_0 + n_p(K)$ depending on the number of prime components $n_p(K)$ of $K$, but with $v_0 \approx -0.190$ a universal constant. The probabilities also depend on a knot coefficient/amplitude $C_K$, depending on knot type, and on terms giving corrections to scaling for smaller $N$. These results are inspired by and are similar to the corresponding results for knots in random polygons on lattices [13,16].
In particular, we have provided firm evidence that $v_0$ gives a power law correction to the unknot scaling (consistent with the unknot being the unique knot with $n_p = 0$). The lattice result, and evidence from previous numerical surveys (over a smaller range of $N$), gave the unknot probability as a pure exponential.
Our investigations highlight two contrasting types of result. Firstly, our unusual numerical precision has led to surprising new observations about knotting of random polygons; in particular, the probability of unknotting does not simply decay exponentially with the number of sides $N$, in contrast with many other studies. Secondly, this numerical accuracy reveals fundamental limitations of this type of knotting analysis. Only knots with minimum crossing number $n_c \le 16$ are classified, yet the number of distinct possible knot types a polygon can assume grows very rapidly with $N$. Future advances in numerical resources are unlikely to extend to dramatically longer lengths without accompanying advances in knot recognition. As discussed previously, the main results here are given for the action-angle model of equilateral random polygons, although our preliminary data for quaternionic random walks (which were almost as extensive) qualitatively support all the numerical observations and results.
The largest systematic error in this kind of analysis is knot misidentification. In our investigation of unknot probability, we found that the unknot misidentification rate appears to be almost irrelevant, and is no larger than the other errors. The occurrence of knots with the same Alexander polynomial (or invariants $\Delta_2, \Delta_3, \Delta_4$) as another with equal or lower $n_c$ seems to become significant around $n_c = 11$; the number of knots begins to grow quickly here (552 prime knot types, compared to 165 at $n_c = 10$). It is not clear how the misidentification rate could be improved, as more powerful knot invariants such as the Jones polynomial are especially slow to calculate for the long curves that present most of the problems. The third-order Vassiliev invariant $v_3$ (and possibly others of higher order) is at least a polynomial time invariant, but as the polynomial order is higher, these are still relatively slow to calculate. It is also possible that new polynomial time invariants, such as the example introduced in [42], might provide extra discriminating power, but it is not yet clear what improvements in knot resolution these could bring.
The results we report are strongly backed by numerical evidence, and hopefully will stimulate new investigations into proving them rigorously. Following the results for lattices, our results are consistent with
• the existence of a pattern theorem for unknotted equilateral polygons;
• the tightness of individual prime components;
• different prime components occurring almost independently along the polygon.
In spite of this, the meaning of $C_K$ for prime knots remains largely mysterious. It is impossible for all random polygon knots to have a maximum probability at $N \approx 210$: knots of a sufficiently large crossing number will not be possible in a polygon with 210 sides. The correction-to-scaling parameters $\beta_K$ and $\gamma_K$ are not fundamental, but are indicative of various asymptotic terms beyond leading order about which we have no knowledge. The curious asymptotic behaviour in Figure 7 emphasises unusual trends in the largest $N$ data in which we have confidence. Numerical resolution of all such questions is clearly beyond our current capabilities.

Acknowledgments
We are grateful to Jason Cantarella, Tetsuo Deguchi, David Foster, Enzo Orlandini and Eric Rawdon for discussions, and to Keith Alexander for providing the knot diagrams in Figure 1 (a). This research was funded in part by the Leverhulme Trust Research Programme Grant No. RP2013-K-009, SPOCK: Scientific Properties of Complex Knots.
The datasets generated and analysed in the paper are available from the authors.
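
Appendix A. The Alexander polynomial evaluated at roots of unity
As discussed in Section 2.2, the Alexander polynomial $\Delta_K(t)$ of a knot $K$ can be evaluated at roots of unity $t = \exp(2\pi i/r)$ for integer $r$. In particular, we claim that when $r = 2, 3, 4$ then $|\Delta_K(\exp(2\pi i/r))|$ is always an integer, as we now show.
The Alexander polynomial has integer coefficients and can be written as an overall factor $t^n$ multiplying a symmetric sum, in which the constant coefficient $a$ stands alone and every other coefficient multiplies $t^{-p} + t^{p}$ for some integer $p$. The overall factor $t^n$ is irrelevant to the invariants $\Delta_r = |\Delta_K(\exp(2\pi i/r))|$. Each coefficient except $a$ is thus multiplied by $t^{-p} + t^{p} = 2\cos(2\pi p/r)$ for some integer $p$. This is clearly an integer when $r = 2, 3, 4$ (for which $\cos(2\pi/r) = -1, -\tfrac{1}{2}, 0$ respectively). However, these are special values of $r$ and in general $|\Delta_K(\exp(2\pi i/r))|$ is not an integer.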
The Alexander polynomial evaluated at each root of unity can be considered as the sum of coefficients with a particular weighting: for instance, the determinant $\Delta_2$ (the case $r = 2$) is the sum of the coefficients with alternating sign. Other values of $r$ give more complicated sequences.
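These evaluations are easy to check numerically. The following minimal sketch computes $\Delta_r$ directly from a list of coefficients, using the standard Alexander polynomials of the unknot, $3_1$, $4_1$ and $5_1$ (defined up to a factor $\pm t^n$, which does not affect the modulus). The function and variable names are illustrative and are not taken from the analysis code used for the data.

```python
import cmath


def alexander_invariant(coeffs, r):
    """|Delta_K(exp(2*pi*i/r))| for coefficients ordered from lowest to highest power of t."""
    t = cmath.exp(2j * cmath.pi / r)
    return abs(sum(c * t**k for k, c in enumerate(coeffs)))


# Standard Alexander polynomials, up to an overall unit +-t^n.
KNOTS = {
    "0_1": [1],                 # 1
    "3_1": [1, -1, 1],          # t^2 - t + 1
    "4_1": [-1, 3, -1],         # -t^2 + 3t - 1
    "5_1": [1, -1, 1, -1, 1],   # t^4 - t^3 + t^2 - t + 1
}

# (Delta_2, Delta_3, Delta_4) for each knot, rounded to the nearest integer.
table = {
    name: tuple(round(alexander_invariant(coeffs, r)) for r in (2, 3, 4))
    for name, coeffs in KNOTS.items()
}
for name, invariants in table.items():
    print(name, invariants)

# For r = 5 the modulus is generally not an integer.
print(alexander_invariant(KNOTS["3_1"], 5))
```

In this small example $4_1$ and $5_1$ share the determinant $\Delta_2 = 5$ but are separated by $\Delta_3$ and $\Delta_4$, which illustrates why the determinant alone cannot distinguish between simple prime knots.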