
The g-SNE paper (with MATLAB code on github) suggests a modification of t-SNE that adds a second divergence, one focused on matching similarities arising from longer distances. The idea is that a weighted combination of the two will balance global and local structure.

The g-SNE cost function is:

\[C = \sum_{ij} p_{ij} \log \frac{p_{ij}}{q_{ij}} + \lambda \sum_{ij} \hat{p}_{ij} \log \frac{\hat{p}_{ij}}{\hat{q}_{ij}}\]

where \(\lambda\) is a weighting factor as in NeRV, the un-hatted part of the cost function is the same as in t-SNE, and the hatted quantities are built from the reciprocal of the usual t-SNE output kernel:

\[\hat{w}_{ij} = 1 + d_{ij}^2 = \frac{1}{w_{ij}}\]
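
The hatted probabilities are normalized over all pairs in the usual t-SNE fashion, so in the output space:

\[\hat{q}_{ij} = \frac{\hat{w}_{ij}}{\sum_{kl} \hat{w}_{kl}}\]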

The gradient is:

\[\frac{\partial C}{\partial \mathbf{y_i}} = 4 \sum_j \left[ \left(p_{ij} - q_{ij}\right) -\lambda \left( \hat{p}_{ij} - \hat{q}_{ij} \right) \right]w_{ij} \left( \mathbf{y_i - y_j} \right) \]
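
As a concrete illustration, here is a minimal base R sketch of this gradient (illustrative only, not the smallvis internals; gsne_grad is a hypothetical name, and P and Phat are assumed to be precomputed input probability matrices with zero diagonals):

# Hypothetical helper: evaluates the g-SNE gradient for an N x 2 layout Y.
gsne_grad <- function(P, Phat, Y, lambda = 1) {
  D2 <- as.matrix(dist(Y)) ^ 2  # squared distances in the embedding
  W <- 1 / (1 + D2)             # the usual t-SNE output kernel w_ij
  diag(W) <- 0
  What <- 1 + D2                # the reciprocal kernel w-hat_ij
  diag(What) <- 0
  Q <- W / sum(W)               # q_ij, normalized over all pairs
  Qhat <- What / sum(What)      # hatted equivalent q-hat_ij
  K <- 4 * ((P - Q) - lambda * (Phat - Qhat)) * W
  # gradient for point i is sum_j K_ij (y_i - y_j), vectorized:
  rowSums(K) * Y - K %*% Y
}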

As is also apparent from the cost function, this reduces back to t-SNE when \(\lambda = 0\).

Datasets

See the Datasets page.

Settings

To use g-SNE in smallvis, set method = "gsne". The default is \(\lambda = 1\), the value used for most of the datasets in the paper. Other values of \(\lambda\) can be specified via e.g. method = list("gsne", lambda = 5). The g-SNE paper considers values of lambda between 0 and 10, so we will look at 1, 2.5, 5 and 10.

Other settings will be kept standard. The word ‘perplexity’ doesn’t seem to appear in the paper at all, and the github code examples just use the default value of 30, so perplexity doesn’t seem critical and the usual perplexity = 40 should be fine here.

Early exaggeration also isn’t mentioned in the paper, and in my experiments it made no difference to the results, but it’s on by default in the code on github, so I’ve used it here too. The exaggeration is not applied to \(\hat{P}\) in either smallvis or the matlab implementation.

iris_gsne <- smallvis(iris, method = list("gsne", lambda = 5), perplexity = 40, eta = 100, Y_init = "spca", exaggeration_factor = 4)
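
The same call can be repeated over the other lambda values to generate the grids in the Results section (a sketch; lambdas and gsne_results are just illustrative names):

lambdas <- c(1, 2.5, 5, 10)
gsne_results <- lapply(lambdas, function(l) {
  smallvis(iris, method = list("gsne", lambda = l), perplexity = 40,
           eta = 100, Y_init = "spca", exaggeration_factor = 4)
})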

An obvious point of comparison is t-SNE, which is the same as g-SNE with \(\lambda = 0\). For preserving global distances, we can use metric MDS. We’ve looked at MMDS elsewhere, but in this case we will use the same input scaling as for t-SNE (not that it makes much difference).

iris_tsne <- smallvis(iris, perplexity = 40, eta = 100, Y_init = "spca", exaggeration_factor = 4)
iris_mmds <- smallvis(iris, eta = 1e-5, max_iter = 2000, Y_init = "spca", method = "mmds")

Results

For each dataset, from left to right and top to bottom, lambda increases from 0 (t-SNE) through 1, 2.5, 5 and 10. The bottom-right image is MMDS.

iris

[plots: iris t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

s1k

[plots: s1k t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

oli

[plots: oli t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

frey

[plots: frey t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

coil20

[plots: coil20 t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

mnist6k

[plots: mnist6k t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

fashion6k

[plots: fashion6k t-SNE, g-SNE lambda = 1, 2.5, 5, 10, MMDS]

Conclusions

Except for iris, it’s clear that, in general, increasing \(\lambda\) makes the g-SNE results progressively approach the sort of layout we’d expect from MMDS. I’m just not sure I like any of the intermediate results more than sticking with the t-SNE results. Starting from the scaled PCA initialization already gives you a lot of the global layout you would want. lambda values between 1 and 2.5 seem like the best place to start exploring.

On the other hand, the authors apply a topological clique analysis involving Betti numbers to some gene expression data and conclude that the g-SNE plot represents the data better. Unfortunately, the plots are rather small, and for a second example, which required a value of lambda = 5 for optimal results, the plots aren’t shown.
