CAAP – S Sequence Approach | Stephen P. Hershey

1. Introduction

The topic of this post is the s sequence approach to computer-assisted aperiodicity proof (CAAP).

Table of Contents
1. Introduction
2. Binary Filter – I
3. Binary Filter – II
4. Binary Filter – III
5. Binary Filter – IV
6. Binary Filter – V
7. CAAP – I
8. CAAP – II
9. [0 1 1] PVA s Sequence – I
10. [0 1 1] PVA s Sequence – II
11. CAAP – III
12. S Sequence Search – Algorithm
13. S Sequence Search – CCA Determination
14. S Sequence Search – Hit Count Matrix – I
15. S Sequence Search – Hit Count Matrix – II
16. S Sequence Search – Derivatives
17. S Sequence Search 1 – User Interface
18. Decomposition of SEQ 806.4 into S Sequences
19. S Sequence CCA Determination via Lookup Table (PVA)
20. S Sequence Search using C Sequences
21. S Sequence Search – Pointer Shuffling
22. S Sequence Search – SEQ 828.4 “crisis”
23. Open Questions
24. What’s Next?

To modify a famous quote, in my opinion technical work is 1% inspiration, 98% perspiration, and 1% surprise. When you’re lucky, the surprise includes wonder, awe, amazement, and delight. My exploration of the s sequence approach to CAAP yielded some surprises:

(1) The large number of CCA (can conjecture aperiodic) s sequences often found by the computerized s sequence search algorithm (SSSA1) when applied to a wide variety of different r sequences (Section 16).

(2) The decomposition of SEQ 806.4 into PVAA s sequences (Section 18). The presumed aperiodicity of SEQ 806.4 is an emergent property arising from the interaction of many ones and zeroes as the sequence is generated using the CDPS algorithm. When I began studying this sequence I had no idea what mechanism might be responsible for its aperiodicity.

(3) The elegant fractal pattern that emerged when I studied CCA/XCA determination for PVA s sequences (Section 19).

(4) Finding an s sequence with 27 components that was consistent with an r sequence (SEQ 828.4) and then finding that the 28th component of that s sequence is inconsistent with a longer version of the r sequence (Section 22).

What about awe? Well, with a few seconds of work my 2017 iMac found over 441 million candidate s sequences, of which 191 were actual s sequences and 8 were CCA, within the SEQ 828.4 r sequence (Section 22; Figure 20.2). My astonishment at modern computational power goes back to my 1978 teenage experiments writing software in BASIC for the original Apple II computer.

The author investigated the s sequence approach to CAAP between April 2021 and March 2023.

2. Binary Filter – I

The theoretical foundation of the s sequence approach to CAAP is the binary filter. Every s sequence is a filter, but not every filter is an s sequence. Sections 3 through 6 begin with the simplest binary filter and then present filters of increasing complexity, culminating in a filter that does not allow a repeating block consisting of any data and length 5 to pass through. A later section will introduce an aperiodic filter, which does not allow a repeating block consisting of any data and having any length to pass through.

3. Binary Filter – II

Figure 1.1 shows Filter A consisting of a single component with value 1. This is an example of the simplest binary filter. On the left side, there is a single block of length 5 with data [1 0 1 1 0]. As the block repeats to the right, it encounters Filter A. The block will be unable to pass through the filter because the block’s data is inconsistent with the filter’s component.

Figure 1.2 shows Filter B, another single component binary filter, this time with value 0. The block’s data is consistent with the filter’s component and thus the block can pass through the filter.

4. Binary Filter – III

Figure 2.1 shows Filter C, a slightly more complex binary filter consisting of two components. Both components have value 0. On the left side, there is a single block of length 5 with data [1 0 1 1 0]. As the block repeats to the right, it encounters Filter C. The block will be unable to pass through the filter because the block data is inconsistent with the filter’s components.

Figure 2.2 shows Filter D, which also consists of two components. One component has value 0 while the other has value 1. On the left side, there is a single block of length 5 with data [1 0 1 1 0]. As the block repeats to the right, it encounters Filter D. The block data is consistent with filter’s components and thus the block can pass through the filter.

5. Binary Filter – IV

Figure 3.1 shows a two-component Filter E. The filter components are spaced 5 elements apart. On the left side, there is a single block of length 5 with data [1 0 1 1 0]. As the block repeats to the right, it encounters Filter E. The block will be unable to pass through the filter because the block data is inconsistent with the filter data.

Figure 3.2 shows Filter F, which is a generalization of Filter E : the possible components are {x = 1, y = 0} or vice-versa. There are therefore two possible F filters. The block data is [a b c d e], where each of the block’s elements is either 0 or 1. There are 32 possible blocks. So, the diagram in Figure 3.2 can be instantiated in 64 possible ways. Either version of Filter F is inconsistent with ALL blocks of length 5. A repeating block of length 5 will generate the same sequence values every 5 elements, which is inconsistent with Filter F, which as defined has different component values spaced 5 elements apart. Figure 3.2 shows that either the leading or the trailing component of the filter must be inconsistent with a repeating block of length 5.

Figure 3.2 delivers the first key insight: a filter composed of adjacent components with different values spaced by L is inconsistent with all blocks of length L. This is the foundation upon which aperiodic filters (which are inconsistent with blocks of all lengths) can be constructed.

Figure 3.3 shows Filter G, a variation on Filter E where the spacing between the filter’s 0 and 1 components has been doubled to 10. For both Filter E and Filter G, the filters’ second component is inconsistent with the repeating block, however for Filter G the block is able to ‘internally’ repeat once before the inconsistent second component is encountered. Filter G can be readily generalized to Filter G’, where the spacing between the filter’s 0 and 1 components is n times 5, where n is any integer greater than 1. All of these filters are inconsistent with the repeating block in the same way.

Figure 3.3 delivers the second key insight: for any repeating block of length of L, a filter with adjacent components with different values spaced by n*L, where n >= 1, will be inconsistent with the block.

6. Binary Filter – V

Figure 4.1 shows Filter H consisting of two components spaced 5 elements apart. A block of length 5 can be established on the right side of filter G and then repeat forever to the right.

Figure 4.2: shows Filter I, a generalization of Filter H to infinite length. Filter I has an infinite number of alternating 0 and 1 components. It is impossible to establish a repeating block of length 5 anywhere in the sequence.

Figure 4.2 delivers the third key insight: an aperiodicity proof makes a claim about an infinite-length sequence; only an infinite-length filter can support such a claim.

7. CAAP – I

Figure 5 introduces some useful terminology.

• An r’ binary sequence has infinite length. The r’ sequence in Figure 5 is created by repeating the length 10 block [1 0 1 1 0 1 1 1 1 0] forever.

• An r sequence is a truncated r’ sequence. The r sequence in Figure 5 consists of the first 43 elements of the associated r’ sequence.

• An s sequence is an alternating 1-0 (or 0-1, depending upon the first value) subsequence of an r sequence. Each element of an s sequence is called a component. The s sequence in Figure 5 is a subsequence of the r sequence. The s sequence shown can be completely described as having initial value 0, initial position (element number) 1, velocity 5 (spacing between components), and length 9 (number of components). It’s also correct to say that the r sequence shown is a supersequence of the s sequence. An s sequence is a type of filter, as discussed in Sections 3 – 6.

An s’ sequence is an infinite length s sequence. The s’ sequence in Figure 5 is a subsequence of the r’ sequence. The s’ sequence shown can be completely described as having initial value 0, initial position 1, velocity 5, and infinite length. It’s also correct to say that the r’ sequence shown is a supersequence of the s’ sequence.

8. CAAP – II

This section takes a baby step towards CAAP using the work done so far. Imagine that:

(1) an analyst hands the r sequence shown in Figure 5 to a computer, and asks it to find an s sequence.

(2) the computer responds with the s sequence shown in Figure 5, which can be described in compact notation as 0:[1 5]:9. ‘0’ is the value of the first component, ‘1’ is the r sequence index (position) of the initial component, ‘5’ is the r sequence spacing (velocity) between successive components, and there are a total of 9 components in the s sequence.

(3) the analyst successfully proves that the s’ sequence 0:[1 5]:INF is a subsequence of r’. INF is an abbreviation of infinite – the s’ sequence has an infinite number of components.

(4) the analyst has then proven that r’ is not periodic with period 5.

This is not an aperiodicity proof, but it’s a step in that direction. This process will be formalized Section 11.

9. [0 1 1] PVA s Sequence – I

This section introduces an s’ sequence (filter) useful in aperiodicity proofs called a PVA s’ sequence, where P is position, V is velocity, and A is acceleration. The r’ sequence indicies of s’ sequence components are defined by P, V, and A. The particular PVA values of interest for this section are P=0, V=1, and A=1. In compact notation, this s’ sequence can be described as v:[P V A]:L = v:[0 1 1]:INF, where v is the value of the first component of the s’ sequence (0 or 1) and L (length) is the number of components within the s’ sequence. This section shows that if this s’ sequence is a subsequence of an r’ sequence, then r’ is aperiodic.

Figure 6 shows the positions, velocities, and accelerations for the first 32 components of the s’ sequence 0:[0 1 1]:INF. The left-most column contains the s’ sequence component values, the next-left column contains the component positions (r’ sequence indicies), the next column contains the velocities, and the right-most column contains the accelerations. The argument that this s’ sequence is aperiodic is straightforward:

(1) each velocity represents the spacing (number of r’ sequence elements) between each adjacent 0-1 or 1-0 s’ sequence component pair.

(2) since the velocities begin with 1 and increase everywhere by 1, every positive integer appears in the velocity list.

(3) so, for any repeating block length L, the velocities L, 2L, 3L, 4L, etc. must appear in the list. In Figure 6, this is shown for block length 5.

(4) so, it is impossible for any repeating block of length L to be consistent with the s’ sequence. Any repeating block of length L will first be inconsistent with the s’ sequence component pair with spacing L, then with spacing 2L, then with spacing 3L, then with spacing 4L, etc., as discussed in Section 5.

(5) therefore, if s’ sequence 0:[0 1 1];INF is a subsequence of r’, then r’ is aperiodic.

A similar argument applies for the s’ sequence 1:[0 1 1]:INF.

10. [0 1 1] PVA s Sequence – II

This article provides an example of an r’ sequence with the s’ sequence 0:[0 1 1]:INF as a subsequence. The r’ sequence is called SEQ 1101.

Figure 7.1.1 shows the first few elements of SEQ 1101 in a one-dimensional format.

Figure 7.1.2 shows the first few elements of SEQ 1101 in a two-dimensional format to show how SEQ 1101 is constructed.

Figure 7.2 shows the result of a computerized search for s sequences within a SEQ 1101 r sequence of length 16,384. The s sequence highlighted in red is 0:[0 1 1]:181. The s sequence search algorithm will be described in more detail in later sections. Replacing the s sequence of length of 181 with INF converts the s sequence into the s’ sequence 0:[0 1 1]:INF.

Figure 7.3.1 shows the s’ sequence found in a horizontal numerical format. The bottom row contains the s’ sequence values, which alternate between 0 and 1. The second row from the bottom contains the s’ sequence indicies into the r’ sequence – the positions P. The next row contains the velocities V. The top row contains the accelerations A.

Figure 7.3.2 show the s’ sequence found within SEQ 1101. The r’ sequence is shown in a one-dimensional format. The s’ sequence component positions are shown in red. The component values alternate between 0 and 1 as required for an s sequence.

Figure 7.3.3 shows the s’ sequence found within SEQ 1101, but this time the r’ sequence is shown in a two-dimensional format to clarify how the s’ sequence found relates to the r’ sequence.

11. CAAP – III

This article formalizes computer-assisted aperiodicity proof (CAAP). The approach to CAAP presented in this post will be called CAAP1 to distinguish it from future versions. Figure 8 provides a 10-step process for CAAP1:

(1) Inquirer provides r’ to the Analyst and asks whether r’ is aperiodic.

For example, the Inquirer could send the algorithm for generating SEQ 1101 shown in Figure 7.1.2 to the Analyst.

(2) The Analyst generates r from r’.

For example, the Analyst could use the SEQ 1101 algorithm to generate the first 16,384 elements of the r’ sequence.

(3) The Analyst sends r to the Computer.

(4) The Computer searches the r sequence for s sequences that can be conjectured to be aperiodic (CCA) and returns any found to the Analyst.

For example, in Figure 7.2 the Computer returned seven CCA s sequences to the Analyst.

(5) The Analyst converts a CCA s sequence found into a CCA s’ sequence.

For example, in Figure 7.2 the Computer found the CCA s sequence 0:[0 1 1]:181. Converting this to an s’ sequence is straightforward: just replace 181 with INF.

(6) The Analyst provides the CCA s’ sequence to the Mathematician, and asks whether s’ is in fact aperiodic (i.e., the conjecture is true).

(7) The Mathematician proves that s’ is aperiodic and provides the proof to the Analyst.

For example, in Section 9 v:[0 1 1]:INF was proven aperiodic.

(8) The Analyst provides r’ and s’ to the Mathematican, and asks for proof that s’ is a subsequence of r’.

(9) The Mathematican proves that s’ is a subsequence of r’ and provides the proof to the Analyst.

For example, Figure 7.3.3 provides strong evidence that 0:[0 1 1]:INF is a subsequence of SEQ 1101.

(10) The Analyst reports to the Inquirer that r’ is aperiodic.

There are three failure points within CAAP1:
(1) the Computer fails to find a CCA s sequence within the r sequence;
(2) the Mathematician fails to prove that the CCA s’ sequence is aperiodic;
(3) the Mathematician fails to prove that the CCA s’ sequence is a subsequence of r’.

12. S Sequence Search – Algorithm

The next few sections focus on the Computer’s s sequence search algorithm (SSSA). The current SSSA as implemented has limited capability. To differentiate this SSSA from future versions, the current implementation as described in this post will be called SSSA1.

SSSA1 attempts to provide the Analyst with an s sequence that when converted into an s’ sequence has a reasonable chance of being proven aperiodic by the Mathematician. The Computer conjectures that the s’ sequence is aperiodic; such s’ sequences are labeled CCA – can conjecture aperiodic. If a Mathematician proves that the s’ sequence is aperiodic, then it can be used to prove the aperiodicity of the r’ supersequence. Otherwise, the Mathematician will have (possibly) expended a great deal of time and effort in the disproof while failing to make progress toward a proof that the r’ sequence is aperiodic. Therefore, SSSA1 should strive to provide the Analyst with only “high probability of being proven true” CCA s’ sequences.

The SSSA1 algorithm does an exhaustive search within a user-specified initial portion of the r sequence looking for candidate s sequences. After finding an s sequence candidate, it checks whether the candidate is a subsequence of the r sequence. If so, it then checks whether the s sequence can be conjectured as aperiodic, a process that will be discussed in more detail in later sections.

This section describes SSSA1’s algorithm for finding s sequence candidates using SEQ 1101 as an example. Figure 9.1 shows the initial elements of SEQ 1101. In order to find PVA s sequence candidates, SSSA1 uses three pointers as indicies into the r sequence. The pointers are initially positioned at the first 010 or 101 pattern at the beginning of the r sequence. This is the first s sequence candidate found. Subsequent s sequence candidates are found by shuffling the pointers to the right, always stopping when a 010 or 101 pattern is found. An example of this shuffle-to-right search is shown in Figure 9.1. After finding an s sequence candidate, simple arithmetic can convert the pointer indicies into PVA. For example, in Figure 9.1 the s sequence candidate highlighted in red has pointer indicies [p0 p1 p2] = [1 3 8], which can be converted into PVA = [1 2 3] as shown:
A: 3
V: 2 5
P: 1 3 8

Figure 9.2 shows how the SSSA1 s sequence candidate search space is specified by the user:

p0r is how far the p0 index can range to the right: from index 0 to 9;
p1r is how far the p1 index can range to the right relative to the current value of the p0 index;
p2r is how far the p2 index can range to the right relative to the current value of the p1 index.
In this example, [p0r p1r p2r] = [10 10 10].

Section 21 below discusses how SSSA1’s shuffle-to-right s sequence candidate search algorithm can sometimes deliver undesirable results.

13. S Sequence Search – CCA determination

After SSSA1 finds an s sequence, it then determines whether or not the s sequence can be conjectured as aperiodic (inconsistent with all blocks of any length). Since SSSA1 cannot check blocks of all lengths, it instead checks all block lengths within a user-specified range to determine whether they are inconsistent with the s sequence found. If so, then the Computer conjectures that the s sequence is aperiodic and provides the s sequence to the Analyst, who then converts it into an s’ sequence and sends it to the Mathematician. If not, then the Computer discards the s sequence and continues its search. As mentioned in earlier sections, an s sequence that can be conjectured as aperiodic is called CCA (“can conjecture aperiodic”); otherwise, the s sequence is called XCA (“cannot conjecture aperiodic”).

For example, Figure 7.2 shows the results of SSSA1’s s sequence search within SEQ 1101, which has a length of 16,384 elements. The Analyst has configured SSSA1 to check the s sequence consistency of all block lengths between 2 and 128. SSSA1 found that all of these blocks lengths are inconsistent with the s sequence 0:[0 1 1]:181. Therefore, the Computer conjectures that this s sequence is aperiodic.

The next two sections describe how SSSA1 checks whether a particular block length is consistent with an s sequence.

14. S Sequence Search – Hit Count Matrix – I

SSSA1 uses a data structure called a Hit Count Matrix (HCM) to determine whether all of the blocks of a particular length L are inconsistent with an s sequence. For simplicity, this post will refer to this as “block length L is inconsistent with an s sequence”, with “all of the blocks of length L” implied.

There are two rows in a HCM, one for each possible s sequence component value (0 or 1). The number of columns in a HCM equals the block length. Each column is an index into the block.

A HCM records how a repeating block interacts with the components of an s sequence. The HCM begins with all zeroes. The block begins at the far left side of the r sequence. As the block repeats to the right, when block index j encounters s sequence component with value i, the corresponding HCM entry [row][column] = [i][j] is incremented. Each such entry of an s sequence component / block index interaction into a HCM is called a “hit”.

In Figure 10.1 the s sequence consists of two components with values 0 and 1. Five r elements separate the two s sequence components. As the block of length 5 repeats to the right, the block encounters the s sequence component with value 0 at block index 1. The corresponding entry HCM[0][1] is incremented. The block then encounters the s sequence component with value 1, again at block index 1. The corresponding entry HCM[1][1] is incremented. The resulting HCM is shown in Figure 10.1. We know from previous sections that an s sequence with two components with opposite values spaced five elements apart is inconsistent with all blocks of length five. An HCM with non-zero values in both rows of the same column indicates that the s sequence is inconsistent with all blocks of the given length. Such an HCM is called joint. The HCM in Figure 10.1 is joint.

In Figure 10.2 the s sequence consists of two components with values 0 and 1, this time separated by four r sequence elements. As the block of length 5 repeats to the right, the block encounters the s sequence components with value 0 at block index 1. The corresponding entry HCM[0][1] is incremented. The block then encounters the s sequence components with value 1 at block index 0. The corresponding entry HCM[1][0] is incremented. The resulting HCM is shown in Figure 10.2. We know from previous sections that an s sequence with two components with opposite values spaced four elements apart is consistent with at least one block of length five. An HCM where all columns have at least one row with a zero entry is called disjoint. Such an HCM is consistent with at least one repeating block of the given length. The HCM in Figure 10.2 is disjoint.

The HCM in Figure 10.2 can be used to identify all blocks consistent with the s sequence: [* 0 * * 1], where ‘*’ is a wildcard meaning “0 or 1”. This schema shows that there are 2^3 = 8 blocks consistent with the s sequence.

The general method for generating the schema from a HCM depends upon each HCM column:
(1) if an HCM column has zeroes for both rows, enter “*” (wildcard).
(2) if an HCM column is nonzero for row 0 and zero for row 1, enter 0.
(3) if an HCM column is zero for row 0 and nonzero for row 1, enter 1.
(4) if an HCM column is nonzero for both rows 0 and 1, the HCM is joint. All blocks of length L are inconsistent with the s sequence. No block schema can be produced.

Figures 10.3 and 10.4 are variants of Figure 10.1.

In Figure 10.1 the spacing between the s sequence components is five. All blocks of length five are inconsistent with the s sequence.

In Figure 10.3 the spacing between the s sequence components is 5*2 = 10. All blocks of length five are still inconsistent with the s sequence. The resulting HCM is identical to that presented in Figure 10.1.

In Figure 10.4 the spacing between the s sequence components is 5*3 = 15. All blocks of length five are still inconsistent with the s sequence. The resulting HCM is identical to that presented in Figure 10.1.

Figures 10.3 and 10.4 provide an intuitive understanding that an s sequence with adjacent components having opposite values with spacing n*L, where n >= 1, is inconsistent with all blocks of length L. This reinforces the binary filter discussed in Section 5.

15. S Sequence Search – Hit Count Matrix – II

When a repeating block of length L interacts with the components of a PVA s sequence, a repeating pattern of hits within the HCM occurs. This is called an orbit. This section provides two examples. The first example involves the [0 1 2] PVA s sequence; the second involves the [0 1 1] PVA s sequence.

— Example 1 —

Figure 11.1 shows the 0:[0 1 2]:20 s sequence and the HCM that results for a repeating block of length 6. There are four blocks consistent with this s sequence: [0 1 0 1 0 0], [0 1 1 1 0 0], [0 1 0 1 0 1], and [0 1 1 1 0 1]. As discussed in previous sections, if the s sequence is viewed as a filter, then these are the four blocks that can pass through the filter.

Figure 11.2 is the data in Figure 11.1 mod 6. This operation is justified by the discussion in Section 5, which noted that s sequence adjacent component spacing v and v mod L, where L is the block length, have the same effect on consistency. The figure shows that a repeating pattern of hits emerges in the HCM. The pattern is {<0 0>, <1 1>, <0 4>, <1 3>, <0 4>, <1 1>}, and then repeats. For each pair, the first element is the s sequence component value (0 or 1) and the second element is the block index (0 – 5).

Figure 11.3 is the data in Figure 11.2 truncated at the 6th hit, which is the length of the orbit. The single-orbit HCM shown can be used to calculate the HCM that results after N orbits. For example, to find the HCM that results after N=10 orbits, simply multiply all of the entries in the single-orbit HCM shown by 10.

When determining whether a particular block length is consistent with an s sequence, SSSA1 detects the start of a new orbit and stops recording new s sequence hits in the HCM. Any subsequent hits would simply retrace the sequence of hits already recorded during the first orbit and thus contribute no additional useful information to the consistency determination.

— Example 2 —

Figure 12.1 shows the 0:[0 1 1]:30 s sequence and the HCM that results for a repeating block of length 6. No blocks are consistent with this s sequence.

Figure 12.2 is the data in Figure 12.1 mod 6. The figure shows that a repeating pattern of hits emerges in the HCM. The pattern is {<0 0>, <1 1>, <0 3>, <1 0>, <0 4>, <1 3>,<0 3>,<1 4>,<0 0>,<1 3>,<0 1>,<1 0>}, and then repeats.

Figure 12.3 is the data in Figure 12.2 truncated at the 12th hit, which is the length of the orbit.

— s sequence consistency determination and the orbit —

When performing an s sequence consistency determination for a particular block length L, SSSA1 records hits in the HCM until one of three events occurs:
(1) if both rows in a particular HCM column become nonzero, then the algorithm halts and indicates that the block length is inconsistent with the s sequence because the HCM is joint;
(2) if a complete orbit is completed without both rows in a HCM column becoming nonzero, then the algorithm halts and indicates that block length L is consistent with the s sequence because the HCM is disjoint;
(3) if all s sequence hits are entered into the HCM without event (1) or (2) occurring, then the algorithm halts and indicates that s sequence consistency could not be determined (the HCM is either joint or disjoint). This event occurs when the s sequence is too short to complete an orbit within the HCM.

16. S Sequence Search – Derivatives

16.1 introduction

My experience using SSSA1 yielded a surprise – many CCA s sequences are often found within a particular r sequence. This section describes two reasons for the this abundance: derivative sequences. Given a certain s sequence (the core), other s sequences can be derived from the core s sequence. The SSSA1 brute-force exhaustive s sequence search algorithm finds not only a core s sequence but also its derivatives. The two types of s sequence derivatives are forward-iteration and undersampled (which could also be called aliased).

This section is not a complete explanation for the many-CCA-s-sequences-found phenomenon – sometimes many CCA s sequences remain even after eliminating derivative sequences.

16.2 forward-iteration derivative s sequences

Let’s say that SSSA1 finds the s sequence 0:[0 1 1] shown in Figure 13.1. Since SSSA1 does an exhaustive search, it will also find 1:[1 2 1], 0:[3 3 1], 1:[6 4 1], etc. These are called forward-iteration derivatives (FIDs) of the core s sequence 0:[0 1 1]. The SSSA1 user has the option to suppress FID s sequences from the display.

If a core s’ sequence is aperiodic, then all of its FID s sequences are also aperiodic.

16.3 undersampled derivative s sequences

Let’s say that SSSA1 finds the s sequence 0:[0 1 1] shown in Figure 13.2. Since SSSA1 does an exhaustive search, it will also find 0:[0 6 9]. This phenomenon is called undersampling. The figure shows s sequence component undersampling by 3 (every 3rd component of the core s sequence is sampled).

Figure 13.3 shows that given a core s sequence, it’s possible to undersample by any odd number of components (e.g. 3, 5, 7, 9, etc.) and obtain an undersampled derivative s sequence. Since SSSA1 does an exhaustive search, it will also find these undersampled derivative s sequences.

Figure 13.4 shows that SSSA1 finds 0:[0 1 1] (a core s sequence) and 0:[0 6 9] (an undersampled derivative s sequence) in SEQ 1101.

SSSA1 does not implement a method for detecting undersampled derivative s sequences and thus does not provide the user with an option to suppress the display of such sequences.

17. S Sequence Search 1 – User Interface

This section provides some addition commentary on the SSSA1 user interface. Figure 14 shows the annotated SSSA1 user interface for PVA s sequence search.

• the “u SEQ length, max” parameter specifies the maximum length of an orbit. It exists because the algorithm that determines whether a particular block length is consistent with an s sequence does not limit the number of hits recorded in the HCM to the number found in the r sequence (the s sequence length). Instead, it converts the s sequence into an s’ sequence and keeps recording hits until the user-specified maximum orbit length is reached, at which time it halts and indicates that the consistency determination was inconclusive as described in Section 13. Typically, the algorithm terminates with a consistency determination before reaching the user-specified maximum orbit length. The “u SEQ length, max” parameter is a failsafe in case (for some reason) SSSA1 fails to detect an orbit after a “reasonable” number of hits.

• If “cand failed” is YES, then every candidate s sequence found is shown. The setting is used only to debug the SSSA1 candidate search algorithm and is normally NO due to the large number of candidate s sequences found during a typical search.

the user can suppress the display of forward-iteration-derivative CCA and/or XCA s sequences with the two “FWD-ITER-DRV” options.
The “mod 3 bit plane only” option suppresses the display of s sequences where the components ARE NOT all confined to a modulus 3 bit plane. This was added to examine a particular case and isn’t of general utility.

18. Decomposition of SEQ 806.4 into S Sequences

The SEQ 806 family of r’ sequences is described in this post. This section describes how SEQ 806.4 can be decomposed into multiple interleaved s sequences.

I discovered the SEQ 806.4 s sequence decomposition before developing s sequence search software. My technique was completely manual. When I examined the SEQ 806.4 r sequence shown in Figure 15.1, I noticed that there is a repeating background pattern of [1 0 0] interrupted by two types of ‘anomaly’ or ‘event’: [0 0 0 0 0] and [1 1]. These events are highlighted in red in the figure. I wrote software to extract the locations of these events in the r sequence to see if a pattern could be ascertained. I was delighted to find that both types of events can be described using acceleration. Before starting the analysis I suspected that SEQ 806.4 is aperiodic but I had no idea what kind of mechanism might be responsible for the aperiodicity. Further analysis showed that a simple PVA s sequence cannot be used to describe the events. Instead, a minor generalization of the PVA s sequence called a PVAA s sequence was required. A further complication is that the [1 1] event must be decomposed into two interleaved sub-events, called [1 1]/A and [1 1]/B.

Figure 15.2 shows general PVA and PVAA s sequences for comparison. A PVAA s sequence interleaves two constant accelerations whereas a PVA s sequence employs a single constant acceleration.

So, using completely manual techniques I conjectured that:

the [1 1]/A event can be described by the PVAA s’ sequence 1:[4 15 [3 6]]:INF
the [1 1]/B event can be described by the PVAA s’ sequence 0:[1 12 [6 3]]:INF
the [0 0 0 0 0] event can be described by the PVAA s’ sequence 1:[9 15 [6 3]]:INF
These s’ sequences are shown in Figure 15.3.

An algorithm for generating SEQ 806.4 is thus:
(1) begin with an empty r sequence
(2) generate the PVAA s sequence 1:[4 15 [3 6]]:L1 into the r sequence
(3) generate the PVAA s sequence 0:[1 12 [6 3]]:L2 into the r sequence
(4) generate the PVAA s sequence 1:[9 15 [6 3]]:L3 into the r sequence
(5) generate the background pattern [1 0 0] into the remaining empty elements in the r sequence
This algorithm is verified in Figure 15.4, which shows that the two different methods for generating SEQ 806.4 (CDPS and interleaved PVAA s sequences + background pattern) produce the same r sequence (length = 16,384 elements).

Later, when PVAA s sequence search software was available, I re-ran the search and verified that the computer could find the same PVAA s sequences that I found manually. This is shown in Figure 15.5.

The PVAA s sequence search software labels the two [1 1] events as CCA whereas the [0 0 0 0 0] event is labeled as XCA. This means that the two [1 1] events can be used within the CAAP1 framework to prove that SEQ 806.4 is aperiodic whereas the [0 0 0 0 0] event cannot. As of this writing I do not have a proof that either of the [1 1] event PVAA s sequences is aperiodic. The CAAP1 framework specifies that a mathematician should provide this proof.

19. S Sequence CCA Determination via Lookup Table (PVA)

As described in Section 13, SSSA1 decides whether a particular s sequence is CCA or XCA by checking the consistency of the block lengths within a user-specified range against the s sequence. If all block lengths tested are found to be inconsistent, then the algorithm labels the s sequence as CCA – can conjecture aperiodic. Otherwise, the s sequence is labelled XCA – cannot conjecture aperiodic. The PVA class of s sequences has a predictable CCA/XCA pattern that makes such consistency checking unnecessary (although SSSA1 does not make use of this pattern; instead, all s sequences found go through the block length consistency checking procedure).

Figure 16.1 shows the pattern. Each row of the table in an acceleration; each column is a velocity. The CCA/XCA determination for a particular PVA s sequence can be found by looking up the value in the table. For example, the figure shows this table lookup process for PVA = [p 17 26] s sequence; the result is XCA.

Figures 16.2 through 16.6 show that Figure 16.1 has a fractal structure. For example, Figure 16.4 shows that the data in rows 1 through 7 is exactly repeated within rows 9 through 15. Then, Figure 16.5 shows that the data in rows 1 through 15 is exactly repeated within rows 17 through 31. Presumably, this recursive process repeats forever. Therefore, the table in Figure 16.1 can be extended to arbitrary size by repeating the duplicate & extend process shown.

20. S Sequence Search using C Sequences

Any s sequence (alternating ‘0’ and ‘1’ components) can be decomposed into two interleaved c sequences, one consisting only of ‘0’ components and the other consisting only of ‘1’ components. This suggests an alternative s sequence search algorithm: first find c sequences and then combine them to create s sequences. Each s sequence found could then undergo CCA/XCA determination as described in earlier sections. An advantage to this approach is that it would not be necessary to develop separate search algorithms for PVA and PVAA s sequences (for example). Instead, a single search algorithm could find both types of s sequence. A possible disadvantage to this approach is that identifying pairs of candidate c sequences that can the combined to create candidate s sequences may be nontrivial. I made some tentative steps in this direction but do not have a working implementation. The next two figures illustrate my progress.

Figure 17.1 shows the general form of the PVAA s sequence and its two interleaved c subsequences. A PVA s sequence is a special case of the PVAA s sequence where c = d. Also shown is an example that calculates the two c PVA subsequences from a given PVAA s sequence.

Figure 17.2 shows the abridged result of searching for both ‘0’ and ‘1’ PVA c sequences within SEQ 806.4. We know from previous work that there are two PVAA s sequences within this r sequence. The figure shows that the ‘0’ and ‘1’ c subsequences for the SEQ 806.4 [1 1]/A event s sequence and SEQ 806.4 [1 1]/B event s sequence are among the many c sequences found within the SEQ 806.4 r sequence, as expected.

21. S Sequence Search – Pointer Shuffling

As noted in Section 12 above, when searching for candidate PVA s sequences SSSA1 begins with all pointers to the left side of the r sequence and then exhaustively shuffles them to the right up to the limits specified by the user. A problem with this approach is that for any particular position SSSA1 first finds PVA s sequence candidates with low velocities and high accelerations rather than vice versa. A PVA s sequence with low acceleration is likely to have more components within the r sequence than one with high acceleration, and thus it’s more likely that the corresponding s’ sequence is a subsequence of the corresponding r’ sequence.

An alternative s sequence search algorithm begins with all pointers shifted to the right by some user-specified amount and then exhaustively shuffles them to the left. This algorithm would first find PVA s sequences with low accelerations before high accelerations. Figure 18.1 provides an intuitive look at shuffling pointers the right. Figure 18.2 provides a similar look at shuffling pointers to the left. SSSA1 does not provide the user with the option of shuffling pointers to the left when performing s sequence search.

22. S Sequence Search – the SEQ 828.4 “crisis”

As an incremental step toward using CAAP1 to prove that the square root of two binary sequence (SEQ 2000) is aperiodic, I used SSSA1 on SEQ 828.4. This r’ sequence is more complex (and thus more challenging) than SEQ 806.4 but less complex than SEQ 2000.

SEQ 828.4 is a straightforward modification of SEQ 806.4. For SEQ 806.4, the parity function is used to generate each element of the resultant sequence from each of the delayed periodic sequences. The parity function can equivalently be thought of as the sum modulus 2. SEQ 828.4 employs sum modulus 3 as the method of combining delayed periodic sequences within the CDPS framework. This means that for a given bit column there is sometimes a ‘carry bit’ added to the adjacent left column when the sum modulus 3 equals 2 for the right column. This also means that when generating SEQ 828.4 a data-dependent number of far right columns must be discarded because it’s unknown whether a ‘carry bit’ should be added to the rightmost column and then some columns to the left. The discard algorithm is to begin with the rightmost column and begin discarding columns to the left with resultant value equal 1 up to and including the first column encountered with resultant value equal 0. So, at a minimum one far-right column will be discarded (when it has resultant value equal 0). If the far-right column has resultant value 1 then the number of subsequent columns discarded to the right depend upon the number of resultant 1s encountered before the first resultant 0. For example, If five resultant 1s are encountered before the first resultant 0 then a total of 6 rightmost columns will be discarded.

Figure 19 provides an example of SEQ 828.4 generation. The first 32 combined delayed periodic sequences for a SEQ 828.4 r sequence of length 128 are shown. The first resultant sequence is the sum of each column mod 3. The second resultant sequence is the binary sequence that results when the appropriate arithmetic carries are performed. Note that the binary sequence is shortened by one element due to the indeterminancy of the right-most element value.

My suspicion was that SEQ 828.4 is aperiodic. The challenge was to use CAAP1 and SSSA1 to prove it. Figure 20.1 shows my initial attempt to use SSSA1 on the SEQ 828.4 r sequence of length 16384 to find CCA PVAA s sequences. The most promising candidate found was the 110th CCA s sequence 0:[10 18 [21 72]]:27. This s sequence has the most number of components of all of the 636 CCA PVAA s sequences found. However, as shown in Figure 20.2, when I ran SSSA1 again on the SEQ 828.4 r sequence with length 65536 instead and a larger search space, this CCA PVAA s sequence was not found. Subsequent testing showed that the first 27 components of the s sequence are consistent with the SEQ 828.4 r sequence and then the 28th component is inconsistent with the r sequence. This called into question the utility of the CAAP1 approach to aperiodicity proof. If the Analyst often hands an s’ sequence to a Mathematician that is not, in fact, a subsequence of the associated r’ sequence, then the CAAP1 approach is not too useful. CAAP1 assumes that the Computer can deliver an s’ sequence that has high probability of being a subsequence of the r’ sequence. It’s unlikely that a Mathematician will be willing to expend the time and energy required to prove that an s’ sequence is a subsequence of the r’ sequence unless she has a reasonable expectation that this conjecture is true. So, my experience with SEQ 828.4 led to a crisis of confidence. However, instead of abandoning CAAP1 entirely I developed the QM decomposition approach to aperiodicity proof, which is discussed in a separate post. The SEQ 806 family of r’ sequences had additional secrets to reveal!

23. Open Questions

Some open questions:

(Q1) Prove that the two SEQ 806.4 CCA s’ sequences corresponding to the [1 1]/A and [1 1]/B events are aperiodic. See Section 18.

(Q2) Prove that the two SEQ 806.4 CCA s’ sequences corresponding to the [1 1]/A and [1 1]/B events are subsequences of the SEQ 806.4 r’ sequence. See Section 18.

Comment: having Q1 and Q2 proofs in hand would allow me to construct the first complete nontrivial example of an aperiodicity proof using the CAAP1 framework.

(Q3) Does EVERY aperiodic r’ sequence contain at least one aperiodic s’ subsequence?

Comment: I would be surprised if this is true. Can someone deliver an aperiodic r’ sequence that DOES NOT have a single aperiodic s’ subsequence?

(Q4) Is it possible to construct an aperiodic r’ sequence containing multiple interleaved s’ sequences that are not themselves aperiodic?

Comment #1: if so, then the answer to Q3 is NO.
Comment #2: the interleaved s’ sequences would constructively ‘cooperate’ to create aperiodicity. For example, s’ sequence #1 could filter blocks with length L mod 3 = 0 or 1, while s’ sequence #2 could filters blocks with length L mod 3 = 2. Together, the two s’ sequences would filter blocks of all lengths and thus the r’ supersequence would be aperiodic.

(Q5) Is it possible to construct an aperiodic r’ sequence containing an infinite sequence of different aperiodic s sequences?

Comment #1: if so, then the answer to Q3 is NO.
Comment #2: each aperiodic s sequence would enforce aperiodicity along a finite length of the r’ supersequence. Since there would be an infinite number of these differing aperiodic s sequences chained together, the r’ supersequence would be aperiodic.

(Q6) In my initial investigation into CAAP1 using SSSA1, I found PVA and PVAA s sequences to be useful. If I studied additional r’ sequences, would I find use for hyperaccelerating s’ sequences such as PVAH or PVAHI, or PVAHIJ, etc.?

(Q7) Can c sequence search be developed into a useful way of finding s sequences? See Section 20.

(Q8) In Section 19, I show that an elegant fractal pattern governs whether a particular PVA s sequence is CCA or XCA. Does a similar pattern govern PVAH (hyperaccelerating) s sequences?

(Q9) Prove that the fractal structure for PVA s sequence CCA / XCA identification is true in general (i.e., for any velocity and acceleration). See Section 19.

(Q10) Is there an easy-to-implement algorithm for detecting undersampled s sequences? See Section 16.

Comment: the lack of such capability is a major limitation in SSSA1. The inability to detect these s sequences means that SSSA1 can’t offer the user the option of omitting them from the list of s sequence found during s sequence search.

(Q11) If a core s’ sequence is aperiodic, then are any and/or all of its undersampled derivative s’ sequences also aperiodic? See Section 16. It’s obvious that if a core s’ sequence is aperiodic, then all of its forward-iteration derivative s’ sequences are also aperiodic. The situation is trickier for the undersampled derivative s’ sequences.

(Q12) What mathematical tools are available to assist a mathematician in proving that a particular s’ sequence is a subsequence of an r’ sequence?

Comment: the computer only finds s sequences that are a subsequence of a particular r sequence. Claims regarding infinite-length sequences (s’ and r’) are the domain of the mathematician.

(Q13) What mathematical tools are available to assist a mathematician in proving that a particular s’ sequence is aperiodic?

Comment: the technique I used in Section 9 to “prove” that the [0 1 1] PVA s’ sequence is aperiodic can be extended to other s’ sequences (the details are interesting but beyond the scope of this post); however, there may be s’ sequences that require a different approach.

(Q14) Is it possible to use CAAP1 to prove that well-known r’ sequences such as the square root of two, e, pi, etc. are aperiodic? If not, why not?

Comment: No one doubts that these r’ sequences are aperiodic. Success using CAAP1 to prove these sequences aperiodic may mean that this approach can be successfully applied to other sequences that may or may not be aperiodic.

(Q15) After proving that SEQ 806.4 (sum mod 2) and SEQ 828.4 (sum mod 3) are aperiodic, further prove that the r’ SEQs generated by (sum mod 4), (sum mod 5), (sum mod 6), the general case sum mod M, and sum (no modulus) are aperiodic.

24. What’s Next?

If there is sufficient interest, I can post the C-language source code developed during my investigation into the application of s sequences to CAAP. Please contact me if interested.

I don’t have any immediate plans to do further work on this topic.

Stephen P. Hershey

Engineer