No Description

Kirk Trombley 156e2a796b start moving download code to new script 2 years ago
data 156e2a796b start moving download code to new script 2 years ago
legacy 10c247bb9a Organize repo a bit 2 years ago
tools 156e2a796b start moving download code to new script 2 years ago
web 33dcfc25e7 Move the db and web actually, new plan 2 years ago
.gitignore 156e2a796b start moving download code to new script 2 years ago
.prettierrc ebed92f2f0 Tweak prettier settings 2 years ago
README.md 8965896418 Update 'README.md' 2 years ago
ingest2.py 96367ab39b Update pokedex processing 2 years ago
requirements.txt 47f5ef4a69 Add reqs for anim_ingest 3 years ago

README.md

Pokemon Color Search

Utility for quickly finding pokemon by the sprite's "distance" from a given color.

Included Files

  • ingest2.py downloads images and runs statistics on them for the entire pokedex, producing a database file (and an ingest record for debugging). This database is used by the UI for actual sorting. Requires reasonably up to date versions of numpy, scipy, Pillow, and aiohttp.
  • index.html along with the web directory serve as a UI for doing pokemon search. Fuse.js, d3-color, and d3-cam02 are all loaded statically from unpkg. TODO - tweak this once d3 no longer needed. Designed to work when served or browsed to as a local file.
  • data directory has some legacy data and the latest database and pokedex, which means as long as you have that you will not need to run the ingester.

Problem Statement

Given a color q and a collection of sets of colored pixels, determine which set is "most similar" to q. Put another way, find a measure D(q, P) that scores each set of pixels P according to a "perceptual distance" from q.

Color difference measurement is an active area of study. While there are many strategies, the most common uses so-called uniform color spaces, where the perceptual difference between colors correlates strongly with their Euclidean distance.

This project uses the OKLab color space, a simple to compute but still largely perceptually uniform color space. The core of the measure should be the Euclidean distance, but this does not solve the whole problem - how do you take the distance between a point and a set of points?

Distance to Mean - δ

The simplest useful metric is to simply take the centroid of the set P (the arithmetic mean of all points in P), call this μ(P), and assign

D(q, P) = ||μ(P) - q||

where ||x|| denotes the Euclidean/L2-norm of x. This is cheap to compute, since μ(P) can be pre-computed for each set.

This definition of D is available in the UI as δ²(P) = ||μ(P) - q||².

Average of Distance to Each Point

In some sense, the most perfect metric would be to calculate the Euclidean distance from q to each point p in P, and take the arithmetic mean of these distances. This involves a large sum of square roots, which is hard to do meaningful precomputation for, and makes calculation prohibitively expensive. For this reason, this distance metric is not used. Instead, there is an approximation of it.

Root-Mean-Square Distance to Each Point - ρ

Rather than taking the arithmetic mean of the Euclidean distances to the target color, the quadratic mean (root-mean-square) gives a useful result while being comparatively trivial to compute. Assign the distance metric as follows:

D(q, P) = sqrt(sum(||p-q||² ∀ p ∈ P) / |P|)

Then perform a series of algebraic manipulations as follows:

D(q, P)     = sqrt(sum(||p-q||² ∀ p ∈ P) / |P|)
D(q, P)²    = sum(||p-q||² ∀ p ∈ P) / |P|
D(q, P)²|P| = sum(||p-q||² ∀ p ∈ P)
D(q, P)²|P| = sum((p-q)⋅(p-q) ∀ p ∈ P)
D(q, P)²|P| = sum((p⋅p - 2q⋅p + q⋅q) ∀ p ∈ P)
D(q, P)²|P| = sum(p⋅p ∀ p ∈ P)
            - 2*sum(q⋅p ∀ p ∈ P)
            + sum(q⋅q ∀ p ∈ P)
D(q, P)²|P| = sum(p⋅p ∀ p ∈ P)
            - 2q⋅sum(p ∀ p ∈ P)
            + |P|q⋅q
D(q, P)²    = sum(p⋅p ∀ p ∈ P) / |P|
            - 2q⋅sum(p ∀ p ∈ P) / |P|
            + q⋅q

Note that sum(p ∀ p ∈ P) / |P| is just the centroid, μ(P).

D(q, P)² = sum(p⋅p ∀ p ∈ P) / |P|
         - 2q⋅μ(P)
         + q⋅q
D(q, P)² = sum(p⋅p ∀ p ∈ P) / |P|
         - μ(P)⋅μ(P)
         + μ(P)⋅μ(P)
         - 2q⋅μ(P)
         + q⋅q
D(q, P)² = sum(p⋅p ∀ p ∈ P) / |P|
         - μ(P)⋅μ(P)
         + (μ(P) - q)⋅(μ(P) - q)
D(q, P)² = sum(p⋅p ∀ p ∈ P) / |P|
         - μ(P)⋅μ(P)
         + ||μ(P) - q||²
D(q, P)² = sum(||p||² ∀ p ∈ P) / |P|
         - ||μ(P)||²
         + ||μ(P) - q||²

Note that ||μ(P) - q||² is just the squared distance to the centroid, δ²(P). Additionally, we take the first two terms of this polynomial and define a new function σ as follows:

σ²(P) = sum(||p||² ∀ p ∈ P) / |P| - ||μ(P)||²

This σ can be pre-computed for each set P. Now we can simply define D(q, P) as:

D(q, P) = sqrt(σ²(P) + δ²(P))

This definition of D is available in the UI as ρ²(P) = σ²(P) + δ²(P).

A Note on Terminology

The value σ² is referred to as the variance of the image, and that is why the symbol σ was chosen, but it is important to note this notion has a hidden assumption. The above equations look remarkably similar to calculations of RMS error in a sample, but this is only because of a certain choice of definition.

Consider the definition of statistical variance of a distribution X.

Var(X) = E[X²] - E[X]²

What could a definition of Var(P) be? If we treat P as a uniform distribution over the pixels it contains, we get the following definition of expectation.

E[P] = sum(p ∀ p ∈ P) / |P| = μ(P)

But, this is a vector-valued function. How can we "square" it? Here is where the choice of definition comes into play.

Define x² = x⋅x = ||x||². This not far off from how some geometric algebras define the square of a vector!

Define E[P²] = sum(p² ∀ p ∈ P) / |P|. This is usually how the second moment is defined for discrete uniform probability distributions anyway, but it is worth stating explicitly here.

With just these definitions we can now compute the variance of the "distribution" P as follows:

Var(P) = E[P²] - E[P]²
       = sum(p² ∀ p ∈ P) / |P| - (sum(p ∀ p ∈ P) / |P|)²
       = sum(p² ∀ p ∈ P) / |P| - μ²(P)
       = sum(||p||² ∀ p ∈ P) / |P| - ||μ(P)||²
       = σ²(P)

And so we recover our definition of σ² from before. If this definition of a vector square is accepted, then it is accurate to say that σ² is the variance in color of the pixel set. But this distinction is important, since otherwise there a risk of using this value imprecisely.

For example, it is not accurate to say σ² is the variance of the norms of the pixels, because it actually uses the norm of the mean and not the mean of the norms. σ² is first and foremost a computational tool, and any statistical meaning (or lack thereof) does not have any bearing on the performance of the distance metric.

Cosine Similarity - κ

As a final option for similarity/difference measurment, consider the cosine similarity of two vectors, defined as the cosine of the angle between them, computed as the dot product of two normalized vectors. Since this is related to Euclidean distance, there is reason to consider it an equally valid similarity test in a perceptually uniform color space.

To convert measure of similarity to one of difference, simply subtract the similarity from 1. To compute the mean of this difference over the whole set, assign the distance metric as follows:

D(q, P) = sum(1 - (p/||p||)⋅(q/||q||) ∀ p ∈ P) / |P|

Then, rearrange algebraically:

D(q, P) = sum(1 - (p/||p||)⋅(q/||q||) ∀ p ∈ P) / |P|
        = |P|/|P| - sum((p/||p||)⋅(q/||q||) ∀ p ∈ P) / |P|
        = 1 - sum((p/||p||)⋅(q/||q||) ∀ p ∈ P) / |P|
        = 1 - sum(p/||p|| ∀ p ∈ P)⋅(q/||q||) / |P|
        = 1 - (sum(p/||p|| ∀ p ∈ P) / |P|)⋅q/||q||

Define a new vector τ, called the "tilt" (since it is a measure of how "off-center" the set is)

τ(P) = sum(p/||p|| ∀ p ∈ P) / |P|

This can of course be pre-computed for each set P, and simplifies the distance function.

D(q, P) = 1 - τ(P)⋅q/||q||

This definition of D is available in the UI as (κ(P) - 1)||q|| = -q⋅τ(P).