This vignette is a short introduction to the spatial weight matrices feature of the spNetwork package.

Quick introduction to spatial weight matrices

A vast number of spatial analysis methods are based on a spatial matrix Wij of size n x n with i an observation and j the neighbours of that observation. Wij represents the degree of spatial relationship between i and j.

Classically one can define :

  • neighbouring matrix (Wij = 1 if j is a neighbour of i, 0 otherwise)
  • distance matrix (Wij = the distance between i and j, modified by a function like \(\frac{1}{distance}\) or \(\frac{1}{distance^2}\))
  • interaction matrix (Wij = the degree of interaction between i and j, the measure of the interaction depends on the subject of the analysis)

In R, the classical package to deal with such matrices is the package ‘spdep’ which defines objects like neighbour lists and spatial weight lists, and offers the possibility to convert these objects into regular matrices.

When one works with data constrained on a network, using Euclidean distance to estimate proximity between observations leads to an underestimation of the real distances between them

spNetwork makes it possible to create listw objects based on network distance. Let us give an example here: calculating the Moran autocorrelation index for the number of bike accidents recorded on the Montreal road network in 2016.

First, we want to load data:

options("rgdal_show_exportToProj4_warnings"="none")

library(spNetwork)
library(sf)
library(dplyr)
library(spdep)

data(mtl_network) 
data(bike_accidents)

then we want to split lines into lixels and calculate for each lixel the number of events on that lixel.

lixels <- lixelize_lines(mtl_network,200,mindist = 50)

# defining and oid for the lixels
lixels$oid <- 1:nrow(lixels)

# snapping the points on the lines and counting
snapped_acc <- snapPointsToLines2(bike_accidents,lixels, idField ="oid")
counts <- table(snapped_acc$nearest_line_id)
counts_df <- data.frame("oid" = as.numeric(as.character(names(counts))),
                        "count" = as.numeric(counts))

# merging the results
lixels$nbAccident <- left_join(lixels,counts_df, by="oid")$count
nbAccident <- ifelse(is.na(lixels$nbAccident),0,lixels$nbAccident)

We use here the function network_listw the create a listw object representing the spatial weight matrix. The distances can be calculated from the centroids of the lixels, from the extremities of the lixels or from evenly spaced points on the lixels. We use here the second approach and specify it with the parameter method = "ends".

Let us consider that above 300 metres two segments are not neighbours anymore, and convert the distances between the observations into spatial weights by using the inverse of the squared distance.

netlistw <- network_listw(lixels,mtl_network,
                           method = "ends",
                           mindist = 10,
                           maxdistance = 300,
                           dist_func = "squared inverse",
                           line_weight = 'length',
                           matrice_type = 'W',
                           grid_shape = c(1,1),
                           verbose=FALSE)

With that matrix, we can calculate the Moran I for the number of accidents on a lixel.

Test <- moran.test(nbAccident, netlistw, zero.policy = T)
print(round(Test$estimate,4))
#> Moran I statistic       Expectation          Variance 
#>            0.0511           -0.0003            0.0001

The autocorrelation is weak, this is due to the large distance used and the fact that the analyzed variable is a counting variable (number of accidents). Indeed, the Moran I is supposed to be used on continuous variables.

One could go further and define its own function to convert distances into spatial weights

my_conv_func <- function(x){
  if (x>=300){
    return(0)
  }else{
    return(1/x**3)
  }
}

netlistw2 <- network_listw(lixels,mtl_network,
                             method = "ends",
                             mindist = 10,
                             maxdistance = 300,
                             dist_func = my_conv_func,
                             line_weight = 'length',
                             matrice_type = 'W',
                             grid_shape = c(1,1),
                             verbose=FALSE)

To speed up calculation, one could use a multiprocessing plan defined with the package ‘future’. To do so, the study area is divided into rectangles (according to the parameter grid_shape), and each rectangle is treated separately. A buffer is applied around the rectangles to avoid edge effects.

# setting the multiprocess plan
future::plan(future::multisession(workers=2))

netlistw3 <- network_listw.mc(lixels,lixels,
                             method = "ends",
                             mindist = 10,
                             maxdistance = 300,
                             dist_func = my_conv_func,
                             line_weight = 'length',
                             matrice_type = 'W',
                             grid_shape = c(2,2),
                             verbose=FALSE)

if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)

Other features

  • A spatial matrix could be calculated for every type of geometries not only lines. Points can directly be used. For polygons, it is possible to use the centres of the geometries as starting points. One can also choose a distance and equally spaced starting points will be defined on the border of the polygons according to that distance.
  • The cost of travelling on an edge could be set to something else than “length”.
  • It is possible to specify directions on the network.