The Science Behind Crypto Pattern Matching Algorithms
DTW, Pearson, Ensemble, K-mer — pattern matching algorithms are not black boxes. This guide explains how each one works, what it measures, and when to use which.
The Pattern Finder uses 10 different algorithms to search over 1 million historical candles for price sequences similar to the current chart. Each algorithm defines "similarity" differently — and that distinction matters enormously for which historical matches you retrieve and how reliable the resulting prediction is.
Why Different Algorithms Exist
No single mathematical measure of similarity is optimal for every type of price pattern. A pattern that unfolds slowly over 80 candles might be structurally identical to one that compressed into 60 candles. A pattern with the same shape but different volatility amplitude might be highly similar in direction but appear very different to a distance-based measure. A pattern where volume and price both moved in a specific way needs an algorithm that considers both dimensions simultaneously.
Using 10 algorithms rather than one lets the matching system cover a wider range of meaningful similarity dimensions. The Ensemble algorithm combines multiple measures to produce a balanced score; the specialized algorithms let you interrogate specific dimensions of similarity independently.
Dynamic Time Warping (DTW)
DTW is one of the most widely used algorithms for time series similarity. Its key feature is elastic time alignment: DTW allows sequences to be stretched or compressed along the time axis, so a pattern that took 80 candles in the past can match a current pattern that took 60 candles if the shape is otherwise similar.
This makes DTW particularly effective for patterns that do not always take the same amount of time to form — which describes most real-world crypto chart patterns. A bull flag might consolidate for 20 candles on one occasion and 40 candles on another; DTW recognizes the structural similarity regardless.
DTW performs best for shape-based matching where the temporal alignment of the pattern is variable. It is the go-to algorithm for identifying broadly similar price trajectories across different historical market conditions.
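The elastic alignment described above can be sketched with the textbook dynamic-programming form of DTW. This is a minimal illustration, not the Pattern Finder's implementation: the function name `dtw_distance`, the absolute-difference cost, and the sample series are all assumptions made for the example.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW with an absolute-difference cost (illustrative sketch)."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # b's current point is reused for another point of a
                cost[i][j - 1],      # a's current point is reused for another point of b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]

# Hypothetical data: the same rise-and-fall shape, formed over 10 and 5 candles.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
print(dtw_distance(slow, fast))  # 0.0, because warping absorbs the difference in duration
```

The two series have different lengths, so a point-by-point measure could not compare them at all, while DTW reports a perfect match. This version runs in time proportional to the product of the two lengths; searches over large histories typically constrain the warping window to keep that manageable.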
Pearson Correlation
Pearson correlation measures the strength of the linear relationship between two series: how closely one sequence rises and falls in step with the other, candle by candle. Because it compares each series to its own mean and spread, it ignores differences in price level and amplitude. Unlike DTW, Pearson requires exact temporal alignment: candle 1 in the current pattern must correspond to candle 1 in the historical match.
The result is a correlation coefficient between -1 and +1. A score near +1 means the two sequences moved almost identically in both direction and timing. A score near -1 means they moved in opposite directions. A score near 0 means no meaningful linear relationship.
Pearson is best for identifying patterns where the specific timing of each move matters — for example, when you want to find historical instances where the exact sequence of up-days and down-days mirrors the current pattern. It is less flexible than DTW but can produce higher-precision matches when the temporal alignment is genuinely similar.
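A short sketch can make the timing sensitivity concrete. The `pearson` helper below is a plain implementation of the standard formula; the sample series and the choice to return 0.0 for a flat series are assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y):
        raise ValueError("Pearson needs series of equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # assumption: a flat series is treated as uncorrelated
    return cov / (spread_x * spread_y)

# Hypothetical five-candle patterns.
current = [1.0, 2.0, 3.0, 2.0, 1.0]
scaled = [10.0, 20.0, 30.0, 20.0, 10.0]  # same timing, ten times the amplitude
mirrored = [-v for v in current]         # same timing, opposite direction
late = [1.0, 1.0, 2.0, 3.0, 2.0]         # same shape, peaking one candle later

print(pearson(current, scaled))    # close to +1
print(pearson(current, mirrored))  # close to -1
print(pearson(current, late))      # about 0.29
```

A one-candle delay is enough to pull the score from roughly +1 down to about 0.29, which is why Pearson suits searches where the timing of each move matters.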
Ensemble Algorithm
The Ensemble algorithm combines multiple similarity measures into a single balanced score. Rather than optimizing for one dimension — shape, timing, or magnitude — it produces a weighted composite that considers all of them together.
The practical result is that Ensemble matches tend to be broadly similar across multiple dimensions rather than extremely similar on one dimension while diverging on others. For most use cases — especially when you are not sure which specific dimension of similarity you want to prioritize — Ensemble is the recommended starting point.
Ensemble is particularly useful when you want a general answer to the question "when did price look like this before?" without wanting to constrain the search to only shape similarity or only timing similarity.
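The article does not specify which measures the Ensemble algorithm combines or how they are weighted, so the following is only a sketch of the general idea of a weighted composite. The function `ensemble_score`, the component names, the weights, and the assumption that every component score is already normalized to the range 0 to 1 are all hypothetical.

```python
def ensemble_score(scores, weights):
    """Weighted average of per-algorithm similarity scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Hypothetical weights; the real weighting is not documented in this article.
weights = {"dtw": 0.4, "pearson": 0.3, "cosine": 0.3}

balanced = {"dtw": 0.8, "pearson": 0.8, "cosine": 0.8}
lopsided = {"dtw": 1.0, "pearson": 0.2, "cosine": 0.9}

print(ensemble_score(balanced, weights))  # about 0.80
print(ensemble_score(lopsided, weights))  # about 0.73
```

Under these made-up weights, the candidate that is good on every dimension outranks the one with a perfect shape score but poor timing.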
K-mer Frequency Analysis
K-mer analysis originated in genomics for comparing DNA sequences. Applied to price data, it treats price candles as characters in a sequence and compares the frequency distributions of short sub-sequences (k-mers) between the current pattern and historical candidates.
Rather than comparing the sequence directly, K-mer asks: do these two sequences contain the same types of local patterns (small ups-and-downs, multi-candle moves) in similar proportions? Two patterns might have different specific shapes but share the same "vocabulary" of sub-moves.
K-mer is most useful for identifying patterns that share the same underlying market dynamics — a mix of small consolidations, impulsive moves, and specific volatility characteristics — even when the overall shapes are not exactly alike. It complements DTW and Pearson by capturing a different type of structural similarity.
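One way to picture the "vocabulary" comparison is to encode candles as letters and count short substrings. The encoding below (U for up, D for down, F for flat, with a 0.1% threshold), the choice of k = 3, and the use of cosine similarity to compare the two frequency profiles are all assumptions for illustration; the article does not say how the Pattern Finder encodes candles or compares profiles.

```python
import math
from collections import Counter

def to_symbols(closes, flat=0.001):
    """Encode each candle-to-candle change as U (up), D (down) or F (flat)."""
    symbols = []
    for prev, cur in zip(closes, closes[1:]):
        change = (cur - prev) / prev
        symbols.append("U" if change > flat else "D" if change < -flat else "F")
    return "".join(symbols)

def kmer_profile(symbols, k=3):
    """Relative frequency of every length-k substring."""
    counts = Counter(symbols[i:i + k] for i in range(len(symbols) - k + 1))
    total = sum(counts.values())
    return {kmer: count / total for kmer, count in counts.items()}

def profile_similarity(p, q):
    """Cosine similarity between two k-mer frequency profiles."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

print(to_symbols([100, 101, 101, 100]))  # UFD

# Hypothetical symbol strings: same local moves, arranged differently.
a = kmer_profile("UUDUUDUUD")
b = kmer_profile("UDUUDUUDU")
c = kmer_profile("DDDDDDDDD")
print(profile_similarity(a, b))  # about 0.94
print(profile_similarity(a, c))  # 0.0, no shared 3-mers
```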
Euclidean Distance
Euclidean distance is the most straightforward similarity measure: it treats each sequence as a single point in space and calculates the straight-line distance between the two, which works out to the square root of the sum of squared differences between corresponding candles. Low Euclidean distance means the patterns are nearly identical at every point in time.
Because Euclidean distance requires exact temporal alignment and is sensitive to amplitude differences, it tends to be the strictest similarity measure — producing the fewest matches but the most precise ones. It is most useful when you want to find historical instances that are as close as possible to the current exact price sequence.
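The amplitude sensitivity is easy to see in a small sketch. The `euclidean` helper and the sample series are illustrative assumptions; in practice the series would usually be normalized in some way before comparison, which the article does not describe.

```python
import math

def euclidean(x, y):
    """Square root of the sum of squared point-by-point differences."""
    if len(x) != len(y):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical patterns: identical shape, but the second has double the amplitude.
current = [0.0, 1.0, 2.0, 1.0, 0.0]
double_amplitude = [0.0, 2.0, 4.0, 2.0, 0.0]

print(euclidean(current, current))           # 0.0
print(euclidean(current, double_amplitude))  # sqrt(6), about 2.449
```

The second pattern has exactly the same shape and timing, yet it is penalized purely for its larger swings.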
Cosine Similarity
Cosine similarity measures the angle between two vectors rather than the distance between them. In practice, this means it is insensitive to the magnitude of moves — two patterns that moved in exactly the same direction but with different amplitudes will score very high cosine similarity even though one had twice the volatility of the other.
This makes cosine similarity useful for finding patterns with the same directional structure regardless of whether the historical period was higher or lower volatility than the current one.
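A brief sketch shows the scale invariance. Applying the measure to candle-to-candle returns rather than raw prices is an assumption made for this example, as are the helper name and the sample values.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

# Hypothetical candle-to-candle returns.
calm = [0.01, -0.005, 0.02, -0.01]
volatile = [2 * r for r in calm]   # same moves with twice the amplitude
inverted = [-r for r in calm]      # same sizes, opposite direction

print(cosine_similarity(calm, volatile))  # close to 1.0
print(cosine_similarity(calm, inverted))  # close to -1.0
```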
Chebyshev Distance
Chebyshev distance measures the maximum point-by-point deviation between two sequences, that is, the single worst-case mismatch across all candles. Unlike Euclidean distance, which accumulates squared deviations from every candle, Chebyshev is controlled entirely by the largest individual deviation.
This makes it particularly sensitive to outlier candles — a single extreme wick or volatility spike will produce a high Chebyshev distance even if the rest of the pattern matches well. It is most useful when you want matches where there was no major single-candle divergence.
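As a small illustration with made-up numbers, compare a candidate that matches everywhere except one spike with a candidate that is slightly off on every candle. The `chebyshev` helper is a direct implementation of the definition; the series are assumptions.

```python
def chebyshev(x, y):
    """Largest absolute point-by-point difference."""
    if len(x) != len(y):
        raise ValueError("Chebyshev distance needs series of equal length")
    return max(abs(a - b) for a, b in zip(x, y))

current = [1.0, 2.0, 3.0, 2.0, 1.0]
one_spike = [1.0, 2.0, 6.0, 2.0, 1.0]    # perfect match except a single candle
small_drift = [1.5, 2.5, 3.5, 2.5, 1.5]  # slightly off on every candle

print(chebyshev(current, one_spike))    # 3.0
print(chebyshev(current, small_drift))  # 0.5
```

The candidate with four perfectly matching candles scores six times worse than the one that is off everywhere, because only the worst candle counts.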
Manhattan Distance
Manhattan distance (also called L1 norm) sums the absolute differences at each point rather than squaring them as Euclidean distance does. This makes it less sensitive to extreme outlier deviations than Euclidean while still capturing total accumulated deviation across the sequence.
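The difference in outlier sensitivity can be quantified by asking how much of the total a single spike contributes when deviations are summed as absolute values (Manhattan) versus squared (the quantity under the square root in Euclidean distance). The helper names and sample data below are assumptions for illustration.

```python
def manhattan(x, y):
    """Sum of absolute point-by-point differences."""
    if len(x) != len(y):
        raise ValueError("Manhattan distance needs series of equal length")
    return sum(abs(a - b) for a, b in zip(x, y))

def largest_deviation_share(x, y, power):
    """Fraction of sum(|d| ** power) contributed by the single largest deviation."""
    terms = [abs(a - b) ** power for a, b in zip(x, y)]
    total = sum(terms)
    return max(terms) / total if total else 0.0

# Hypothetical candidate: small drift on every candle plus one spike.
current = [1.0, 2.0, 3.0, 2.0, 1.0]
candidate = [1.5, 2.5, 6.0, 2.5, 1.5]

print(manhattan(current, candidate))                   # 5.0
print(largest_deviation_share(current, candidate, 1))  # 0.6 of the Manhattan total
print(largest_deviation_share(current, candidate, 2))  # 0.9 of the squared (Euclidean) total
```

With absolute values the spike accounts for 60% of the total; once deviations are squared it accounts for 90%.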
Spearman Rank Correlation
Spearman correlation measures the similarity of the rank ordering of values rather than their absolute values. Two sequences with the same relative ordering of candles — the 5th candle was the highest, the 3rd was the lowest, etc. — will score high Spearman correlation even if the actual price levels were very different.
This is useful for finding patterns with the same relative price structure across very different price levels or time periods.
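Spearman correlation is commonly computed as the Pearson correlation of the two rank sequences, which is the approach sketched here. The helper names and the sample prices are assumptions; tied values receive their average rank, which is the usual convention.

```python
import math

def ranks(values):
    """Rank from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / spread if spread else 0.0

def spearman(x, y):
    """Pearson correlation applied to ranks instead of raw values."""
    return pearson(ranks(x), ranks(y))

# Hypothetical closes at very different price levels with the same relative ordering.
low_priced = [1.0, 1.2, 0.9, 1.5, 1.1]
high_priced = [30000, 34000, 29000, 52000, 31000]

print(ranks(low_priced))                  # [2.0, 4.0, 1.0, 5.0, 3.0]
print(spearman(low_priced, high_priced))  # close to 1.0
```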
OBV Pattern Matching
OBV (On-Balance Volume) matching applies DTW to the OBV series rather than the price series. This finds historical periods where the volume flow pattern — how cumulative volume was building or declining — matched the current OBV shape.
OBV matching is most valuable as a confirmation tool. When the top historical OBV matches show the same subsequent price behavior as the top price-shape matches, the two independent views agree and the signal is more credible. When OBV matches show different outcomes from price matches, it warrants caution.
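For reference, the OBV series itself follows a simple rule: add the candle's volume when the close rises, subtract it when the close falls, and carry the previous value forward when the close is unchanged. The sketch below uses that standard definition; starting the series at zero and the sample data are assumptions.

```python
def on_balance_volume(closes, volumes):
    """Cumulative volume, signed by the direction of each close-to-close change."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    obv = [0.0]  # assumption: the series starts at zero
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])
    return obv

# Hypothetical candles.
closes = [100, 102, 101, 101, 105]
volumes = [50, 80, 30, 40, 120]
print(on_balance_volume(closes, volumes))  # [0.0, 80.0, 50.0, 50.0, 170.0]
```

The resulting series can then be compared against historical OBV windows with a DTW routine in the same way a price series would be.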
Which Algorithm Should You Use?
- Start with Ensemble for a balanced view of overall similarity
- Use DTW when the pattern might have taken more or fewer candles in historical occurrences
- Use Pearson when exact timing of moves matters (e.g., the pattern must peak at the same relative candle)
- Use OBV to cross-validate that the volume flow also matches — the strongest confirmation signal
- Use K-mer when you want to find periods with similar underlying market dynamics, not just similar shapes
Frequently Asked Questions
What is Dynamic Time Warping (DTW) in crypto analysis?
Dynamic Time Warping (DTW) is a time series similarity algorithm that allows elastic time alignment, meaning it can match patterns that took different numbers of candles to form, as long as the overall shape is similar. In crypto analysis, DTW is particularly effective because chart patterns (bull flags, wedges, triangles) do not always take the same number of candles across different historical occurrences. DTW is one of the most widely used algorithms for shape-based time series matching.
What is the difference between DTW and Pearson correlation?
DTW allows elastic time alignment — it can match patterns that took different numbers of candles to form by stretching or compressing time. Pearson correlation requires exact temporal alignment — candle 1 must correspond to candle 1, candle 2 to candle 2. DTW is more flexible and finds more matches; Pearson is more precise and finds only temporally aligned matches. For most crypto pattern searches, DTW is the better default; Pearson is useful when the specific timing of each move within the pattern matters.
What is the Ensemble algorithm for pattern matching?
The Ensemble algorithm combines multiple similarity measures — shape distance, directional alignment, temporal alignment, and others — into a single weighted composite score. Rather than optimizing for one specific dimension of similarity, Ensemble produces matches that are broadly similar across multiple dimensions. It is the recommended starting point for most pattern searches because it avoids the blind spots of any individual algorithm.
When should I use K-mer frequency analysis?
K-mer frequency analysis is most useful when you want to find historical periods that share the same underlying market dynamics — the same mix of small consolidations, impulsive moves, and volatility characteristics — even when the overall chart shapes are not exactly alike. It originated in genomics for comparing DNA sequences and captures a different type of structural similarity than distance-based or correlation-based algorithms. Use K-mer when Ensemble and DTW searches return low-quality matches but you believe similar market conditions must have occurred historically.
Which algorithm is best for finding Bitcoin patterns?
For Bitcoin pattern matching, the Ensemble algorithm is the best default because Bitcoin's historical database is deep enough to produce high-quality composite matches. DTW is the best specialized algorithm when you want shape-based matches across periods with different timeframes. OBV matching is particularly valuable for Bitcoin because institutional volume flows are a significant driver of BTC price, and OBV divergence has historically preceded major moves. Running Ensemble first, then validating with OBV, produces the strongest signals for Bitcoin pattern analysis.