伊東生態統計研究室 - NIMBLE (nimbleSMC) による粒子フィルタ

NIMBLEには、nimbleSMCという関連パッケージがあって、粒子フィルタ（逐次モンテカルロ）に対応しています。ということで、ためしてみました。

おもに、NIMBLEマニュアルの第8章と『基礎からわかる時系列分析』を参考にしました。

萩原淳一郎・瓜生真也・牧山幸史 (著) 石田基広 (監修) 基礎からわかる時系列分析 : Rで実践するカルマンフィルタ・MCMC・粒子フィルタ技術評論社

準備

library(nimble)
## nimble version 1.0.1 is loaded.
## For more information on NIMBLE and a User Manual,
## please visit https://R-nimble.org.
## 
## Note for advanced users who have written their own MCMC samplers:
##   As of version 0.13.0, NIMBLE's protocol for handling posterior
##   predictive nodes has changed in a way that could affect user-defined
##   samplers in some situations. Please see Section 15.5.1 of the User Manual.
## 
##  次のパッケージを付け加えます: 'nimble'
##  以下のオブジェクトは 'package:stats' からマスクされています:
## 
##     simulate
library(nimbleSMC)
## 
##  次のパッケージを付け加えます: 'nimbleSMC'
##  以下のオブジェクトは 'package:nimble' からマスクされています:
## 
##     buildAuxiliaryFilter, buildBootstrapFilter, buildEnsembleKF,
##     buildIteratedFilter2, buildLiuWestFilter
library(dlm)
set.seed(1234)

データ

ナイル川の流量データを使用します。また、1899年からのダムによる流量の変化のモデルに組み込みため、damという変数をつくっています。

data(Nile)

dam <- c(rep(0, 1899 - start(Nile)[1]),
         rep(1, end(Nile)[1] - 1899 + 1))

NIMBLEのコードです。

nile_code <- nimbleCode({
  x[1, 1] ~ dnorm(m0[1], var = C0[1])
  x[2, 1] ~ dnorm(m0[2], var = C0[2])
  y[1] ~ dnorm(x[1, 1] + x[2, 1] * dam[1], var = V)
  for (t in 2:T) {
    x[1, t] ~ dnorm(x[1, t - 1], var = W[1])
    x[2, t] ~ dnorm(x[2, t - 1], var = W[2])
    y[t] ~ dnorm(x[1, t] + x[2, t] * dam[t], var = V)
  }
})

モデルとあてはめ

初期値を乱数で生成します。

inits <- list(x = matrix(runif(2 * (length(Nile) + 1), 0, 5), nrow = 2))

観測モデルの分散を15000、状態モデルのレベル成分の分散と回帰成分の分散をともに1としました。初期状態の平均はともに0、分散はともに10^7としました。

データの先頭にダミーの値を入れておきます（『基礎からわかる時系列分析』p.366）。

nile_model <- nimbleModel(nile_code,
                          constants = list(T = length(Nile) + 1,
                                           V = 15000, W = c(1, 1),
                                           m0 = c(0, 0),
                                           C0 = c(1e+07, 1e+07)),
                          data = list(y = c(NA, Nile),
                                          # append a dummy initial value
                                      dam = c(0, dam)),
                          inits = inits)
## Defining model
## Building model
## Setting data and initial values
## Running calculate on model
##   [Note] Any error reports that follow may simply reflect missing values in model variables.
## Checking model sizes and dimensions
##   [Note] This model is not fully initialized. This is not an error.
##          To see which variables are not initialized, use model$initializeInfo().
##          For more information on model initialization, see help(modelInitialization).

粒子フィルタのアルゴリズムとして、ブートストラップフィルタを使用します。thresh = 1で、毎回リサンプリングをおこなうように設定しています。また、saveAll = TRUEで全時点でのサンプルを保存し、 smoothin = FALSEでフィルタリングをおこなうようにしています。

nile_bootF <- buildBootstrapFilter(nile_model, nodes = "x",
                                   control = list(thresh = 1,
                                                  saveAll = TRUE,
                                                  smoothing = FALSE))

コンパイル

モデルとフィルタをコンパイルします。

nile_cModel <- compileNimble(nile_model)
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
nile_cBootF <- compileNimble(nile_bootF, project = nile_model)
## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.

実行

粒子数を20000として、粒子フィルタを実行します。

(nile_cBootF$run(m = 20000))     # returns log likehood
## [1] -637.7484

結果

重みつきサンプルを重みを取り出して、サンプルのレベル成分と回帰成分×説明変数を足し合わせます。

nile_filt <- as.matrix(nile_cBootF$mvWSamples, "x")[, -c(1, 2)]
nile_weight <- as.matrix(nile_cBootF$mvWSamples, "wts")[, -1]
col_x1 <- colnames(nile_filt) |>
  startsWith("x[1, ")
col_x2 <- colnames(nile_filt) |>
  startsWith("x[2, ")
nile_yhat <- as.matrix(nile_filt[, col_x1]) +
             as.matrix(nile_filt[, col_x2]) %*% diag(dam)

重みつき平均を元のデータに重ねてプロットします。

filt_mean <- sapply(seq_along(Nile), function(i) {
  weighted.mean(nile_yhat[, i], nile_weight[, i])
  })
plot(Nile, las = 1, ylab = "Level")
lines(ts(filt_mean, start = start(Nile), end = end(Nile)),
      col = 2, lwd = 2)

変化点のところがNaNになっていたり（重み成分に-Infがでるため）、乱数によって結果が微妙に変わったりするので、もうすこし改良が必要なようです。