Deep Dive | March 19, 2026 | 10 min read | by Waseem Akmal

The Science Behind Crypto Pattern Matching Algorithms

DTW, Pearson, Ensemble, K-mer — pattern matching algorithms are not black boxes. This guide explains how each one works, what it measures, and when to use which.

The Pattern Finder uses 10 different algorithms to search over 1 million historical candles for price sequences similar to the current chart. Each algorithm defines "similarity" differently — and that distinction matters enormously for which historical matches you retrieve and how reliable the resulting prediction is.

This guide explains how each algorithm works, what it measures, and when each one produces the most useful results for crypto pattern analysis.

Why Different Algorithms Exist

No single similarity measure is optimal for every pattern type — the right algorithm depends on whether you care about shape, timing, magnitude, volume, or some combination of all four.

No single mathematical measure of similarity is optimal for every type of price pattern. A pattern that unfolds slowly over 80 candles might be structurally identical to one that compressed into 60 candles. A pattern with the same shape but different volatility amplitude might be highly similar in direction but appear very different to a distance-based measure. A pattern where volume and price both moved in a specific way needs an algorithm that considers both dimensions simultaneously.

Using 10 algorithms — rather than one — ensures that the matching system captures the full range of meaningful similarity dimensions. The Ensemble algorithm combines multiple measures to produce a balanced score; the specialized algorithms let you interrogate specific dimensions of similarity independently.

Dynamic Time Warping (DTW)

DTW is the dominant algorithm for shape-based financial pattern matching because it solves the core problem of variable pattern duration — two bull flags that took 20 and 40 candles respectively are recognized as the same shape.

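DTW is one of the most widely used algorithms for time series similarity in financial analysis. Its key feature is elastic time alignment: instead of comparing candle i only with candle i, DTW searches for the lowest-cost monotonic alignment between the two sequences, so one candle in the current pattern can map to several candles in the historical one, and vice versa. Matches can therefore be stretched or compressed in time, and a pattern that took 80 candles in the past can match a current pattern that took 60 candles if the shape is otherwise similar.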

This makes DTW particularly effective for patterns that do not always take the same amount of time to form — which describes most real-world crypto chart patterns. A bull flag might consolidate for 20 candles on one occasion and 40 candles on another; DTW recognizes the structural similarity regardless.

DTW performs best for shape-based matching where the temporal alignment of the pattern is variable. It is the go-to algorithm for identifying broadly similar price trajectories across different historical market conditions.
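The elastic alignment described above can be sketched with the textbook dynamic-programming recurrence. This is a minimal illustration, not the Pattern Finder's implementation: the function name `dtw_distance`, the absolute-difference cost, and the absence of any warping-window constraint or normalization are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Cumulative cost of the cheapest monotonic alignment between a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the best of: advance in a only, in b only, or in both.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical data: the same up-and-down shape over 5 and over 10 candles.
fast = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
print(dtw_distance(fast, slow))  # 0.0: same shape, twice as many candles
```

A point-by-point measure cannot compare these two series at all because their lengths differ, while DTW scores them as a perfect match.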

Pearson Correlation

Pearson correlation finds the highest-precision matches when exact timing matters — it rewards patterns where each candle in the sequence moved in lockstep with the historical counterpart.

Pearson correlation measures the linear relationship between two series — specifically, how consistently the two sequences move in the same direction at the same time. Unlike DTW, Pearson requires exact temporal alignment: candle 1 in the current pattern must correspond to candle 1 in the historical match.

The result is a correlation coefficient between -1 and +1. A score near +1 means the two sequences moved almost identically in both direction and timing. A score near -1 means they moved in opposite directions. A score near 0 means no meaningful linear relationship.

Pearson is best for identifying patterns where the specific timing of each move matters, for example when you want to find historical instances where the exact sequence of up-candles and down-candles mirrors the current pattern. It is less flexible than DTW but can produce higher-precision matches when the temporal alignment is genuinely similar.
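As a rough sketch of the calculation, the coefficient is the covariance of the two series divided by the product of their standard deviations. The function name `pearson` and the choice to return 0.0 for a perfectly flat series (where the coefficient is mathematically undefined) are assumptions for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences, in [-1, 1]."""
    if len(x) != len(y):
        raise ValueError("Pearson requires candle-for-candle alignment")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # assumption: treat a flat series as "no relationship"
    return cov / (spread_x * spread_y)


# Hypothetical closes: same path at a different price level, then a mirror image.
current = [100.0, 102.0, 101.0, 105.0, 104.0]
same_shape = [10.0, 10.2, 10.1, 10.5, 10.4]
mirrored = [104.0, 102.0, 103.0, 99.0, 100.0]
print(pearson(current, same_shape))  # close to +1
print(pearson(current, mirrored))    # close to -1
```

Note that the length check enforces the exact temporal alignment mentioned above: unlike DTW, there is no way to score sequences of different lengths.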

Ensemble Algorithm

The Ensemble algorithm is the recommended default because it produces matches that are broadly similar across shape, timing, and magnitude — avoiding the blind spots that arise when any single dimension dominates the score.

The Ensemble algorithm combines multiple similarity measures into a single balanced score. Rather than optimizing for one dimension — shape, timing, or magnitude — it produces a weighted composite that considers all of them together.

The practical result is that Ensemble matches tend to be broadly similar across multiple dimensions rather than extremely similar on one dimension while diverging on others. For most use cases — especially when you are not sure which specific dimension of similarity you want to prioritize — Ensemble is the recommended starting point.

Ensemble is particularly useful when you want a general answer to the question "when did price look like this before?" without wanting to constrain the search to only shape similarity or only timing similarity.
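The article does not specify which measures the Ensemble combines or how they are weighted, so the following is only a sketch of what a weighted composite could look like. The component names, the weights, and the assumption that every component score has already been rescaled to the range 0 to 1 (higher meaning more similar) are all hypothetical.

```python
def ensemble_score(scores, weights):
    """Weighted average of per-measure similarity scores (each in [0, 1])."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one component needs a non-zero weight")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights and component scores for one historical candidate.
weights = {"shape": 0.5, "timing": 0.3, "magnitude": 0.2}
candidate = {"shape": 0.9, "timing": 0.5, "magnitude": 0.7}
print(round(ensemble_score(candidate, weights), 2))  # 0.74
```

A candidate that is excellent on one dimension but poor on the others is pulled toward the middle by the average, which is the "broadly similar" behavior described above.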

K-mer Frequency Analysis

K-mer analysis identifies periods with the same underlying market vocabulary — the same mix of impulsive moves, consolidations, and volatility patterns — even when the overall chart shapes do not look visually alike.

K-mer analysis originated in genomics for comparing DNA sequences. Applied to price data, it treats price candles as characters in a sequence and compares the frequency distributions of short sub-sequences (k-mers) between the current pattern and historical candidates.

Rather than comparing the sequence directly, K-mer asks: do these two sequences contain the same types of local patterns (small ups-and-downs, multi-candle moves) in similar proportions? Two patterns might have different specific shapes but share the same "vocabulary" of sub-moves.

K-mer is most useful for identifying patterns that share the same underlying market dynamics — a mix of small consolidations, impulsive moves, and specific volatility characteristics — even when the overall shapes are not exactly alike. It complements DTW and Pearson by capturing a different type of structural similarity.
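One way to picture this is to turn each candle-to-candle move into a letter, count every run of k consecutive letters, and compare the resulting frequency tables. The three-letter alphabet (U for up, D for down, F for flat), the helper names, and the use of total variation distance to compare profiles are assumptions for this sketch; the article does not describe the actual encoding.

```python
from collections import Counter


def encode_moves(closes, flat_tol=0.0):
    """Map each close-to-close change to 'U', 'D', or 'F' (assumed alphabet)."""
    symbols = []
    for prev, cur in zip(closes, closes[1:]):
        change = cur - prev
        if change > flat_tol:
            symbols.append("U")
        elif change < -flat_tol:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def kmer_profile(symbols, k):
    """Relative frequency of every length-k substring."""
    counts = Counter(symbols[i:i + k] for i in range(len(symbols) - k + 1))
    total = sum(counts.values())
    return {kmer: count / total for kmer, count in counts.items()}


def profile_distance(p, q):
    """Total variation distance: 0 = same vocabulary, 1 = nothing shared."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)


# Hypothetical closes: different paths built from the same zigzag sub-moves.
a = kmer_profile(encode_moves([1, 2, 1, 2, 1, 2, 1]), k=2)
b = kmer_profile(encode_moves([5, 4, 5, 4, 5, 4, 5]), k=2)
print(round(profile_distance(a, b), 2))  # 0.2: mirrored charts, similar vocabulary
```

The two series are mirror images, so a correlation measure would score them as opposites, yet their k-mer profiles are close because both consist of alternating one-candle moves.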

Euclidean Distance

Euclidean distance is the strictest similarity measure — it produces the fewest matches but the most precise ones, making it the right choice when you want historical instances that are nearly identical to the current pattern at every point in time.

Euclidean distance is the most straightforward similarity measure: it takes the difference between corresponding points in the two sequences, squares each difference, sums the squares, and takes the square root. A low Euclidean distance means the patterns are nearly identical at every point in time.

Because Euclidean distance requires exact temporal alignment and is sensitive to amplitude differences, it tends to be the strictest similarity measure — producing the fewest matches but the most precise ones. It is most useful when you want to find historical instances that are as close as possible to the current exact price sequence.
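A minimal sketch of the calculation follows. The function name and the sample values are illustrative, and no normalization is applied, which is what makes the amplitude sensitivity visible.

```python
import math


def euclidean_distance(x, y):
    """Square root of the summed squared point-by-point differences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same number of candles")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical moves: the second pattern has the same shape at twice the amplitude.
pattern = [1.0, 2.0, 3.0, 2.0]
doubled = [2.0, 4.0, 6.0, 4.0]
print(euclidean_distance(pattern, pattern))  # 0.0
print(euclidean_distance(pattern, doubled))  # > 0 despite the identical shape
```

Because the same shape at a different amplitude already produces a non-zero distance, inputs are often normalized before a Euclidean comparison; whether the Pattern Finder does so is not stated in this article.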

Cosine Similarity

Cosine similarity measures directional alignment independent of magnitude — two patterns that moved in the same direction but at different volatility levels score identically, making it ideal for cross-regime comparisons.

Cosine similarity measures the angle between two vectors rather than the distance between them. In practice, this means it is insensitive to the overall scale of the moves: if one pattern is an exact scaled copy of the other (for example, every move is twice as large), the cosine similarity is 1, even though one had twice the volatility of the other.

This makes cosine similarity useful for finding patterns with the same directional structure regardless of whether the historical period was higher or lower volatility than the current one.
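The scale invariance can be checked with a short sketch. The function name, the use of per-candle moves as the input vectors, and the decision to return 0.0 when either vector is all zeros (where the angle is undefined) are assumptions for this example.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same number of candles")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: an all-zero vector has no direction
    return dot / (norm_x * norm_y)


# Hypothetical per-candle moves: calm regime vs. the same moves at 2x volatility.
calm = [1.0, -0.5, 2.0, -1.0]
volatile = [2.0, -1.0, 4.0, -2.0]
print(cosine_similarity(calm, volatile))  # approximately 1.0
```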

Chebyshev Distance

Chebyshev distance is the most sensitive measure to outlier candles — a single extreme wick dominates the score, making it useful specifically when you need matches where there was no major single-candle divergence.

Chebyshev distance measures the maximum point-by-point deviation between two sequences, that is, the single worst-case mismatch across all candles. Unlike Euclidean distance, which accumulates the squared deviations from every candle, Chebyshev is controlled entirely by the largest individual deviation.

This makes it particularly sensitive to outlier candles — a single extreme wick or volatility spike will produce a high Chebyshev distance even if the rest of the pattern matches well. It is most useful when you want matches where there was no major single-candle divergence.
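In code, the measure reduces to a single `max` over the absolute differences. The function name and sample values below are illustrative only.

```python
def chebyshev_distance(x, y):
    """Largest absolute point-by-point difference between two sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same number of candles")
    return max((abs(a - b) for a, b in zip(x, y)), default=0.0)


# Hypothetical values: identical except for one spike on the third candle.
reference = [1.0, 2.0, 3.0, 4.0]
one_spike = [1.0, 2.0, 9.0, 4.0]
print(chebyshev_distance(reference, one_spike))  # 6.0, set by the spike alone
```

Three of the four candles match exactly, yet the score is as bad as if every candle had been off by 6.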

Manhattan Distance

Manhattan distance offers a middle ground between Euclidean precision and outlier tolerance — it captures total accumulated deviation without squaring individual differences, reducing the influence of any single extreme candle.

Manhattan distance (also called the L1 or taxicab distance) sums the absolute differences at each point rather than squaring them as Euclidean distance does. This makes it less sensitive to extreme outlier deviations than Euclidean while still capturing total accumulated deviation across the sequence.
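The difference in outlier handling is easiest to see side by side. In this sketch (function names and data are illustrative), the same total deviation of 6 is either concentrated in one candle or spread evenly across six.

```python
import math


def manhattan_distance(x, y):
    """Sum of absolute point-by-point differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean_distance(x, y):
    """Square root of summed squared differences (L2 distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


base = [0.0] * 6
one_spike = [6.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # all deviation in a single candle
spread_out = [1.0] * 6                      # same total deviation, evenly spread

print(manhattan_distance(base, one_spike), manhattan_distance(base, spread_out))
print(euclidean_distance(base, one_spike), euclidean_distance(base, spread_out))
```

Manhattan scores both cases as 6.0, while Euclidean scores the single spike as 6.0 and the evenly spread deviation as roughly 2.45. Squaring is what makes Euclidean penalize the concentrated outlier more heavily.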

Spearman Rank Correlation

Spearman correlation is ideal for finding patterns with the same relative price structure across very different absolute price levels — it cares about which candle was highest and which was lowest, not the actual prices themselves.

Spearman correlation measures the similarity of the rank ordering of values rather than their absolute values. Two sequences with the same relative ordering of candles — the 5th candle was the highest, the 3rd was the lowest, etc. — will score high Spearman correlation even if the actual price levels were very different.

This is useful for finding patterns with the same relative price structure across very different price levels or time periods.
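Spearman correlation is commonly computed as the Pearson correlation of the ranks, which is the approach sketched here. The helper names, the average-rank treatment of ties, and the 0.0 return value for a constant series are assumptions for the example.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series carries no ordering
    return cov / math.sqrt(var_x * var_y)


# Hypothetical closes: very different price levels, same ordering of candles.
high_priced = [100.0, 105.0, 103.0, 110.0]
low_priced = [1.0, 1.9, 1.2, 50.0]
print(spearman(high_priced, low_priced))  # approximately 1.0
```

The last value in `low_priced` is an extreme jump, but only its rank (highest) enters the calculation, so the score is unaffected by its size.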

OBV Pattern Matching

OBV matching is the most powerful confirmation tool in the arsenal — when volume flow history matches the current OBV shape and points to the same directional outcome as the price-shape search, the convergent signal has significantly higher reliability than either alone.

OBV (On-Balance Volume) matching applies DTW to the OBV series rather than the price series. This finds historical periods where the volume flow pattern — how cumulative volume was building or declining — matched the current OBV shape.

OBV matching is most valuable as a confirmation tool. When the top historical OBV matches show the same subsequent price behavior as the top price-shape matches, the convergent signal has significantly higher reliability. When OBV matches show different outcomes from price matches, it warrants caution.
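For reference, the OBV series itself follows a simple rule: add the candle's volume when the close rises, subtract it when the close falls, and carry the total forward when the close is unchanged. The sketch below uses that standard definition; starting the running total at 0 and the function name are conventions chosen for this example, and the sample data is made up.

```python
def on_balance_volume(closes, volumes):
    """Cumulative OBV series, starting at 0 for the first candle."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    if not closes:
        return []
    obv = [0.0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])  # up candle: volume flows in
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])  # down candle: volume flows out
        else:
            obv.append(obv[-1])               # unchanged close: no flow
    return obv


closes = [10.0, 11.0, 10.5, 10.5, 12.0]
volumes = [100.0, 200.0, 150.0, 50.0, 300.0]
print(on_balance_volume(closes, volumes))  # [0.0, 200.0, 50.0, 50.0, 350.0]
```

The resulting series can then be compared against historical OBV series with a DTW routine in the same way a price series would be.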

Which Algorithm Should You Use?

Start with Ensemble for a balanced view of overall similarity
Use DTW when the pattern might have taken more or fewer candles in historical occurrences
Use Pearson when exact timing of moves matters (e.g., the pattern must peak at the same relative candle)
Use OBV to cross-validate that the volume flow also matches — the strongest confirmation signal
Use K-mer when you want to find periods with similar underlying market dynamics, not just similar shapes

Key Takeaways

No single algorithm is optimal for all pattern types — the right choice depends on whether shape, timing, magnitude, or volume flow is most important for your current search
DTW (Dynamic Time Warping) is the best default for shape-based matching because it handles patterns that took different numbers of candles to form — the most common real-world scenario
The Ensemble algorithm combines multiple measures into a balanced score and is the recommended starting point for general-purpose searches
K-mer frequency analysis, borrowed from genomics, identifies periods with the same market vocabulary — useful when shapes differ but underlying dynamics are similar
OBV matching is the strongest confirmation signal: when volume flow history and price shape history both point to the same outcome, the probability estimate is materially more reliable
Use the Pattern Finder to run multiple algorithms on the same setup and compare consensus — disagreement across algorithms signals an ambiguous setup worth sizing down

Frequently Asked Questions

What is Dynamic Time Warping (DTW) in crypto analysis?

Dynamic Time Warping (DTW) is a time series similarity algorithm that allows elastic time alignment — meaning it can match patterns that took different numbers of candles to form, as long as the overall shape is similar. In crypto analysis, DTW is particularly effective because chart patterns (bull flags, wedges, triangles) do not always take the same number of candles across different historical occurrences. DTW is the most widely used algorithm for shape-based price pattern matching.

What is the difference between DTW and Pearson correlation?

DTW allows elastic time alignment — it can match patterns that took different numbers of candles to form by stretching or compressing time. Pearson correlation requires exact temporal alignment — candle 1 must correspond to candle 1, candle 2 to candle 2. DTW is more flexible and finds more matches; Pearson is more precise and finds only temporally aligned matches. For most crypto pattern searches, DTW is the better default; Pearson is useful when the specific timing of each move within the pattern matters.

What is the Ensemble algorithm for pattern matching?

The Ensemble algorithm combines multiple similarity measures — shape distance, directional alignment, temporal alignment, and others — into a single weighted composite score. Rather than optimizing for one specific dimension of similarity, Ensemble produces matches that are broadly similar across multiple dimensions. It is the recommended starting point for most pattern searches because it avoids the blind spots of any individual algorithm.

When should I use K-mer frequency analysis?

K-mer frequency analysis is most useful when you want to find historical periods that share the same underlying market dynamics — the same mix of small consolidations, impulsive moves, and volatility characteristics — even when the overall chart shapes are not exactly alike. It originated in genomics for comparing DNA sequences and captures a different type of structural similarity than distance-based or correlation-based algorithms. Use K-mer when Ensemble and DTW searches return low-quality matches but you believe similar market conditions must have occurred historically.

Which algorithm is best for finding Bitcoin patterns?

For Bitcoin pattern matching, the Ensemble algorithm is the best default because Bitcoin's historical database is deep enough to produce high-quality composite matches. DTW is the best specialized algorithm when you want shape-based matches across periods where the pattern unfolded over different numbers of candles. OBV matching is particularly valuable for Bitcoin because institutional volume flows are a significant driver of BTC price, and OBV divergence has historically preceded major moves. Running Ensemble first, then validating with OBV, produces the strongest signals for Bitcoin pattern analysis.

Try it yourself

Everything described in this article is available free on LetsDoCrypto — no sign-up required.
