Excellent articles on the topic:
- Collapsed Gibbs Sampling for Dirichlet Process Gaussian Mixture Models
- Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process
  - To be honest, I'm not a big fan of the above article: I think the metaphor drifts a little and I find it rather distracting… well whatever, it's still a good introduction! Especially the Recap section, which compares each generative process side by side!
In the following I take some notes as I walk through the Python code for Gibbs sampling of a DPGMM.
1. Initialisation
- 2-dimensional Gaussians
- 50 samples
- Same input data as the Gaussian Mixture Model Ellipsoids example from scikit-learn
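The input data can be sketched like this, loosely following the scikit-learn "GMM Ellipsoids" example but scaled down to 50 samples of 2-dimensional Gaussians (the component parameters and the 25/25 split are illustrative assumptions, not the exact values used):

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples = 50

# Component 1: a stretched, rotated Gaussian around the origin
C = np.array([[0.0, -0.1], [1.7, 0.4]])
# Component 2: a tighter spherical Gaussian shifted to (-6, 3)
X = np.r_[rng.randn(n_samples // 2, 2) @ C,
          0.7 * rng.randn(n_samples // 2, 2) + np.array([-6.0, 3.0])]
```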
2. Learning GMM, DPGMM using Scikit examples
# Fit a mixture of Gaussians with EM using five components
gmm = mixture.GMM(n_components=5, covariance_type='full')
gmm.fit(X)
# Fit a Dirichlet process mixture of Gaussians using five components
dpgmm = mixture.DPGMM(n_components=5, covariance_type='full')
dpgmm.fit(X)
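Note that `mixture.GMM` and `mixture.DPGMM` were removed from scikit-learn in 0.20; in current versions the equivalents are `GaussianMixture` and `BayesianGaussianMixture`. A sketch of the same two fits with the modern API (the toy stand-in data here is my own assumption):

```python
import numpy as np
from sklearn import mixture

rng = np.random.RandomState(0)
X = np.r_[rng.randn(25, 2), rng.randn(25, 2) + 5.0]  # toy stand-in data

# EM fit with five components
gmm = mixture.GaussianMixture(n_components=5, covariance_type='full')
gmm.fit(X)

# Variational Dirichlet process fit, truncated at five components
dpgmm = mixture.BayesianGaussianMixture(
    n_components=5, covariance_type='full',
    weight_concentration_prior_type='dirichlet_process')
dpgmm.fit(X)
```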
3. Learn DPGMM using Gibbs Sampling
dpmm = DPMM(n_components=-1)  # values tried: -1, 1, 2, 5
# n_components is the number of initial clusters (assigned at random; TODO: k-means init)
# -1 means that we initialise with one cluster per point
dpmm.fit_collapsed_Gibbs(X)
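To make the algorithm concrete, here is a heavily simplified sketch of what one sweep of collapsed Gibbs sampling does: remove each point from its cluster, score it against every existing cluster's posterior predictive (weighted by the CRP count n_k) and against a fresh cluster (weighted by the concentration alpha), then resample its assignment. I assume spherical Gaussians with known variance `sigma2` and a zero-mean conjugate Gaussian prior with variance `tau2` on the means; the actual DPGMM code uses a full Normal-Inverse-Wishart prior, so `alpha`, `sigma2` and `tau2` here are illustrative:

```python
import numpy as np

def gibbs_sweep(X, z, alpha=1.0, sigma2=1.0, tau2=3.0):
    n, d = X.shape
    for i in range(n):
        z[i] = -1  # remove point i from its current cluster
        labels = [k for k in np.unique(z) if k != -1]
        log_p = []
        for k in labels:
            members = X[z == k]
            nk = len(members)
            # Posterior predictive of x_i under cluster k
            # (Gaussian-Gaussian conjugacy with known variance)
            post_var = 1.0 / (1.0 / tau2 + nk / sigma2)
            post_mean = post_var * members.sum(axis=0) / sigma2
            pred_var = post_var + sigma2
            ll = (-0.5 * np.sum((X[i] - post_mean) ** 2) / pred_var
                  - 0.5 * d * np.log(2 * np.pi * pred_var))
            log_p.append(np.log(nk) + ll)  # CRP weight of an existing cluster
        # Opening a new cluster: prior predictive, CRP weight alpha
        pv = tau2 + sigma2
        ll_new = (-0.5 * np.sum(X[i] ** 2) / pv
                  - 0.5 * d * np.log(2 * np.pi * pv))
        log_p.append(np.log(alpha) + ll_new)
        log_p = np.asarray(log_p)
        p = np.exp(log_p - log_p.max())  # normalise in log space for stability
        p /= p.sum()
        choice = np.random.choice(len(p), p=p)
        z[i] = labels[choice] if choice < len(labels) else max(labels, default=-1) + 1
    return z
```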
3.1 Create DPMM
3.2 Create Conjugate Prior Gaussian
3.3 Initialise DPMM with data
3.4 Initialise Prior Gaussian with data
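The four steps above can be sketched roughly as follows: build the DPMM, build a conjugate (Normal-Inverse-Wishart) prior, and initialise both from the data X. The hyperparameter names and the empirical choices (mu_0 = data mean, psi_0 = data covariance) are my assumptions, not necessarily the exact code:

```python
import numpy as np

class GaussianPrior:
    """Conjugate Normal-Inverse-Wishart prior, initialised from the data."""
    def __init__(self, X):
        n, d = X.shape
        self.mu_0 = X.mean(axis=0)   # prior mean centred on the data
        self.kappa_0 = 1.0           # pseudo-count / strength of the mean prior
        self.nu_0 = d + 2            # degrees of freedom (must exceed d - 1)
        self.psi_0 = np.cov(X.T)     # scale matrix from the empirical covariance

class DPMM:
    def __init__(self, n_components=-1):
        self.n_components = n_components
    def init_assignments(self, X):
        n = len(X)
        if self.n_components == -1:
            return np.arange(n)  # one cluster per point
        return np.random.randint(self.n_components, size=n)  # random clusters
```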