---
title: "Kriging from scratch -- Simple Features version"
author: "D G Rossiter"
date: "`r Sys.Date()`"
output:
  html_document:
    number_sections: TRUE
    theme: "lumen"
    code_folding: show
    fig_height: 4.5
    fig_width: 4.5
    fig_align: 'center'
    toc: TRUE
    toc_float: TRUE
---

This tutorial shows how to compute an Ordinary Kriging prediction directly from the OK equations.

This is an adaptation of a previous tutorial that used the `sp` package to represent spatial objects in R. Here we use the newer `sf` package that implements Simple Features [^1] representation of spatial objects.

[^1]: https://r-spatial.github.io/sf/index.html

# Ordinary Kriging equations

The prediction by Ordinary Kriging at an unknown point $x_0$ is computed as a weighted average $\hat{x}_o = \sum_i \lambda_i x_i$ of known points $\hat{x}_i$, $i = 1 \ldots n$.

These weights $\lambda_i$ are dervied by solving the Ordinary Kriging equations: $\mathbf{A} \lambda = b$, solved as $\lambda = \mathbf{A}^{-1} b$. The kriging prediction variances are computed as $b^T \lambda$, where$A, \lambda, b$ are defined from the semivariances $\gamma$ and the LaGrange multiplier $\psi$ as:

\begin{eqnarray*}
\mathbf{A} & = & \left[
\begin{array}{*5{c}}
\gamma(\mathbf{x}_1,\mathbf{x}_1) & \gamma(\mathbf{x}_1,\mathbf{x}_2) & \cdots &
\gamma(\mathbf{x}_1,\mathbf{x}_N) & 1 \\
\gamma(\mathbf{x}_2,\mathbf{x}_1) & \gamma(\mathbf{x}_2,\mathbf{x}_2) & \cdots &
\gamma(\mathbf{x}_2,\mathbf{x}_N) & 1 \\
\vdots & \vdots & \cdots & \vdots & \vdots \\
\gamma(\mathbf{x}_N,\mathbf{x}_1) & \gamma(\mathbf{x}_N,\mathbf{x}_2) & \cdots &
\gamma(\mathbf{x}_N,\mathbf{x}_N) & 1 \\
1 & 1 & \cdots & 1 & 0 \\
\end{array}
\right]
;
\mathbf{\lambda} & = & \left[
\begin{array}{c}
\lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_N \\ \psi
\end{array}
\right]
;
\mathbf{b} & = & \left[
\begin{array}{c}
\gamma(\mathbf{x}_1,\mathbf{x}_0) \\ \gamma(\mathbf{x}_2,\mathbf{x}_0) \\ \vdots
\\ \gamma(\mathbf{x}_N,\mathbf{x}_0) \\ 1
\end{array}
\right]
     \end{eqnarray*}
     
We will build these matrices and solve explicitly for one prediction point.

# Example data

Load the Meuse example dataset from the `sp` package and convert to an `sf` object, with both coordinates and attributes. Note that all functions in the `sf` package have the prefix `st_`, which stands for "space + time"; this prefix prevents namespace conflicts with functions from other loaded packages.

```{r}
require(sf)
data(meuse, package="sp")
class(meuse)
str(meuse)
meuse.sf <- st_as_sf(meuse, coords=c("x", "y"))
class(meuse.sf)
str(meuse.sf)
st_crs(meuse.sf)
st_crs(meuse.sf) = 28992  # the EPSG code, see ?meuse
st_crs(meuse.sf)
```

Notice that the converted object has the original `data.frame`, except that the two fields representing coördinates are now in a newly-created field `geometry`, of class `sfc_POINT`. The Coördinate Reference System (CRS) must be supplied from the metadata, since no where in the `data.frame` can this be indicated. Although not needed for this example, we assign anyway, using information from `?meuse`.  Notice the Well-known text (WKT) format of the CRS specification.

Place a point to predict somewhere in the middle of the bounding box, and convert it to an `sf` object. Note it has no attributes, we only need its coordinates. The `sfg` "Simple Features Geometry" object must be upgraded to an `sfc` "Simple feature columns" object in order to specify the CRS.

```{r}
print(bb <- st_bbox(meuse.sf))
x0 <- st_point(c(median(bb$xmin, bb$xmax),
                 median(bb$ymin, bb$ymax)))
print(x0)
class(x0)
x0 <- st_sfc(x0)
st_crs(x0) <- st_crs(meuse.sf)
print(x0)
class(x0)
str(x0)
```

Now the prediction and known points are both `sf` ojects with CRS.

# Build the OK system matrices

Build a matrix of the distances between known points with the `st_distance` function:

```{r}
dim(st_coordinates(meuse.sf))
dm.A <- st_distance(meuse.sf)
dim(dm.A)
dm.A[1:5, 1:5]
```

Notice the diagonals are 0, this is the distance of a point to itself.

Convert these to semivariances, using a fitted variogram model, as the upper-left of the $\mathbf{A}$ matrix, using geostatistics functions from the `gstat` package, which can understand `sf` objects.

```{r}
require(gstat)
meuse.sf$logZn <- log10(meuse.sf$zinc)
v <- variogram(logZn ~ 1, loc=meuse.sf, cutoff=1300, width=90)
(vmf <- fit.variogram(v, vgm(psill=0.12, model="Sph", range=900, nugget=0.01)))
str(A <- variogramLine(object=vmf, dist_vector=dm.A))
A[1:5, 1:5]
```

Extend the $\mathbf{A}$ matrix with the column and row of 1's (constraints), bottom-right entry is 0:

```{r}
dim(A)
A <- cbind(A, 1)
dim(A)
A <- rbind(A, 1)
dim(A)
n <- dim(A)[1]
A[n,n] <- 0
```

Build the $b$ vector from the distances from the target point to the known points.

Note the use of the `drop` function to remove the 2nd dimension from the matrix returned by `st_distance`, in this case there is only one column, because all distances are computed to a single point.

```{r}
(st_crs(meuse.sf) == st_crs(x0)) # CRS must be the same for `st_distance` between two objects
dm <- drop(st_distance(meuse.sf, x0))  # column vectors, st_distance arguments are rows, columns
length(dm)
```

Convert to semivariances, and add a last element (1) needed for the kriging equations.

```{r}
b <- c(variogramLine(vmf, dist_vector=dm)$gamma, 1)
length(b)
head(b); tail(b)
```

# Solve the OK system

Solve for weights and LaGrange multiplier. Note `solve` with a single matrix argument inverts the matrix, and `%*%` is the matrix multiplication operator. The `drop` function removes matrix dimensions with only one entry, here it will be the column, and so converts the $\lambda$ matrix into a vector.

```{r}
lambda <- drop(solve(A) %*% b)
head(lambda)
```

The LaGrange multiplier is the last element in the solution:

```{r}
tail(lambda, 1) # LaGrange multiplier
```

Now that we have the $\lambda$ vector, we use it to solve for the prediction and its variance. Here `drop` will convert the results to scalars:

```{r}
# prediction
(ok.pred <- drop(lambda[1:n-1] %*% meuse.sf$logZn))
# variance
(ok.pred.var <- drop(t(b) %*% lambda))
```

# Compare to the `krige` function

Check that we get the same results with the `krige` function.
For this, the `locations` and `newdata` arguments must be of class `sf`, i.e., with the geometry as a special field in a `data.frame`. So, we upgrade `x0` with the `st_sf` function:

```{r}
class(meuse.sf)
class(x0)
x0.sf <- st_sf(x0)
class(x0.sf)
str(x0.sf)
(k.pt <- krige(logZn ~1, loc = meuse.sf, newdata = x0.sf, model = vmf))
```

As promised!

Questions? Comments? mailto:d.g.rossiter@cornell.edu?subject=Comments_on_KrigingFromScratch_sf
