This document can be cited as follows:
This markdown document provides code and detailed examples associated with a segment of an R Workshop held on March 26th, 2019 in Cambridge, UK at the “Big Data in Archaeology” conference hosted by the McDonald Institute for Archaeological Research. This document assumes you have a basic knowledge of network science terminology and at least some basic familiarity with R and RStudio. If you are new to R, I recommend you first run through the brief Code School examples focused on R available here (http://tryr.codeschool.com/). The examples below can be run by simply copying and pasting the code, but modifications will require a bit of additional experience. For those new to network science in general, I recommend you start by reading the Peeples 2019 and Brughmans 2013 review articles (see recommended bibliography for more details). For those with more background in R and network statistics, there is also a more in-depth version of this tutorial that goes into many more statistical techniques for evaluating network metrics and sensitivity analyses posted here
For the purposes of this workshop we will be using a real archaeological dataset as an example pulled from the Digital Archaeological Record (tDAR). These data are derived from research I previously conducted for my “Connected Communities” book with the University of Arizona Press (Peeples 2018). These data represent the results of an analysis which defined clusters of cooking pottery from the Zuni/Cibola region of Arizona and New Mexico (ca. AD 1150-1325) based on a series of technological attributes recorded for just over 2,200 individual vessels. In the workshop we’ll practice searching for and retrieving these data directly from tDAR, but I am also providing them here for future reference. This dataset is divided into two .csv files: the first with the counts of ceramic technological clusters by site and the second with the locations of those sites (not on tDAR). To ensure the security of site locational information as required by the state data repositories charged with maintaining these data, I have randomly relocated each settlement 7-10 kilometers from its actual location.
Map of the Cibola region and the sites included in this analysis along with the frequencies of technological clusters by sub-region
Right click and choose “save as” to download the ceramic data and attribute data.
The code below imports these data from the .csv files into objects we can use in R. Before running the code below, however, we need to ensure that our R session is set to the correct working directory (the location where you placed the .csv files for this workshop). To do that, go to the menu bar at the top and click Session > Set Working Directory > Choose Directory and navigate to the place on your hard drive where these files reside (alternatively you can hit Ctrl + Shift + H and then navigate to the appropriate directory). Once you have done that, you will be able to execute the code below by simply copying the text and then pasting it into the R console.
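The working directory can also be checked and set from the console. The lines below are a minimal sketch: the path in the commented `setwd()` call is a hypothetical placeholder, and the file names are those used in the import code in this document.

```r
# Show the directory R is currently using
getwd()

# Hypothetical path: replace with the folder that holds the workshop files
# setwd("C:/Users/yourname/Documents/network_workshop")

# Check whether both data files are visible from the working directory
found <- file.exists(c('ceramic_clust.csv', 'ceramic_clust_attr.csv'))
print(found)
```

If either value printed is FALSE, the working directory is probably not the folder where the .csv files were saved.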
Let’s start by importing our ceramic and attribute data.
# Import ceramic data into an object named ceramic. row.names=1 sets the first column values to the row names for each item.
ceramic <- read.csv(file='ceramic_clust.csv',row.names=1)
# Import attribute data into an object named ceramic.attr
ceramic.attr <- read.csv('ceramic_clust_attr.csv',row.names=1)
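To see what `row.names=1` does, the sketch below writes a tiny made-up table to a temporary file and reads it back. The site names and counts are invented for illustration and are not part of the workshop data. The same checks (`dim`, `rownames`, `rowSums`) are a reasonable way to inspect the real `ceramic` object after import.

```r
# Write a tiny made-up table to a temporary file (not the workshop data)
toy.file <- tempfile(fileext = '.csv')
writeLines(c('site,Clust1,Clust2',
             'Site A,10,5',
             'Site B,2,8'), toy.file)

# With row.names=1 the first column becomes the row names
toy <- read.csv(toy.file, row.names = 1)

dim(toy)       # number of rows (sites) and columns (clusters)
rownames(toy)  # site names taken from the first column
rowSums(toy)   # total count per site
```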
For the purposes of this workshop, we will rely on a few pre-existing R packages. In order to use these packages in a new installation of R and RStudio, we first need to install them. Note that you will only need to do this once on each new installation of R/RStudio. To install packages, you can click on the “Packages” tab in the window in the bottom right of RStudio, then click the “Install” button at the top and type the names of the packages separated by commas. Alternatively, you can install packages from the console by simply typing “install.packages('nameofpackagehere')” without the outer quotation marks.
Copy the line below into the console, or follow the instructions above to install using the Packages tab:
install.packages(c('statnet','tnet','vegan','FastKNN','kableExtra','ggraph','GGally'))
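If some of these packages may already be installed, one option is to list only the missing ones first. This is a minimal sketch using base R; the object names `workshop.pkgs` and `missing.pkgs` are introduced here for illustration, and the install call is left commented out so nothing is downloaded until you choose to run it.

```r
# Packages used in this workshop
workshop.pkgs <- c('statnet', 'tnet', 'vegan', 'FastKNN',
                   'kableExtra', 'ggraph', 'GGally')

# Identify packages that are not yet installed in this R library
missing.pkgs <- setdiff(workshop.pkgs, rownames(installed.packages()))
print(missing.pkgs)

# Uncomment the next line to install only the missing packages
# if (length(missing.pkgs) > 0) install.packages(missing.pkgs)
```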
Once you have installed these packages we use the library console command to initialize our packages.
library(statnet) #includes the libraries network, sna, and ergm
library(tnet) #includes the library igraph
library(vegan)
library(FastKNN)
library(kableExtra)
library(ggraph)
library(GGally)
In addition to these R packages, there are also a few additional functions that we will need to define manually for several of our analyses below. Specifically, we want to be able to take our ceramic frequency data and convert that into a symmetric similarity or distance matrix that we will use to define and weight our networks. For the examples below, we will rely on four methods for defining these matrices: 1) Co-presence of types, 2) Brainerd-Robinson similarity, 3) \(\chi^{2}\) distances, and 4) \(k\) nearest neighbors based on the site locations. These are only a few among a wide variety of options.
The first simple measure we will use for defining networks here is the number of categories that are co-present at pairs of sites. We can calculate this measure through simple matrix algebra. First, we create a binary incidence matrix of ceramic counts by site by simply dichotomizing our count data (a matrix with 1 for all categories that are present and 0 elsewhere). If we define that incidence matrix as \(A\) and the transpose (switching rows and columns) of that matrix as \(A^{T}\), we can find the number of categories that overlap between sites \(P\) as the matrix product: \[P = A A^{T}\]
The result of this procedure is a symmetric matrix, with the number of rows and columns determined by the number of nodes, that compares each site to every other site. Each cell represents the count of categories co-present at a pair of sites. The diagonal of this matrix represents the total number of categories present for the site denoted by that row. In many cases, we may not want to count rare occurrences as “present” and may instead want to create a threshold absolute value or proportion that a given category must meet or exceed to be counted as “present” for creating our matrix. In the example we will use here today, a category must represent at least 25% of a row to be considered “present” in this calculation. The code below could be easily modified to use a different cutoff, and the choice of cutoff is an important decision with substantive impacts on our networks.
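The matrix algebra can be illustrated with a small made-up incidence matrix. The sites, categories, and values in this sketch are invented for illustration only.

```r
# Toy incidence matrix (made-up values): 3 sites by 4 categories
A <- matrix(c(1, 1, 0, 0,
              0, 1, 1, 0,
              1, 1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c('Site1', 'Site2', 'Site3'),
                            c('CatA', 'CatB', 'CatC', 'CatD')))

# The matrix product of A and its transpose gives co-presence counts
P <- A %*% t(A)
P
```

Here Site1 and Site2 share only CatB, so their cell in `P` is 1, while the diagonal entry for Site3 is 3 because three categories are present at that site.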
The chunk of code below defines a procedure for calculating simple co-presence for all categories representing at least 25% of a given row. We then create a new object called “ceramicP” which represents this matrix of co-occurrence from site to site.
co.p <- function (x,thresh=0.25) {
  # create matrix of proportions from ceramic
  temp <- prop.table(as.matrix(x),1)
  # define anything greater than or equal to thresh (0.25 by default) as present (1)
  temp[temp>=thresh] <- 1
  # define all other cells as absent (0)
  temp[temp<1] <- 0
  # matrix algebraic calculation to find co-occurrence (%*% indicates matrix multiplication)
  out <- temp%*%t(temp)
  return(out)}
# run the function
ceramicP <- co.p(ceramic)
# display the results
kable(ceramicP) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "300px")
| | Apache Creek | Atsinna | Baca Pueblo | Casa Malpais | Cienega | Coyote Creek | Foote Canyon | Garcia Ranch | Heshotauthla | Hinkson | Hooper Ranch | Horse Camp Mill | Hubble Corner | Jarlosa | Los Gigantes | Mineral Creek Pueblo | Mirabal | Ojo Bonito | Pescado Cluster | Platt Ranch | Pueblo de los Muertos | Rudd Creek Ruin | Scribe S | Spier 170 | Techado Springs | Tinaja | Tri-R Pueblo | UG481 | UG494 | WS Ranch | Yellowhouse |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Apache Creek | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Atsinna | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Baca Pueblo | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Casa Malpais | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Cienega | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Coyote Creek | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Foote Canyon | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Garcia Ranch | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Heshotauthla | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Hinkson | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |
| Hooper Ranch | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Horse Camp Mill | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Hubble Corner | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Jarlosa | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Los Gigantes | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Mineral Creek Pueblo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mirabal | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Ojo Bonito | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Pescado Cluster | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Platt Ranch | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 2 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2 |
| Pueblo de los Muertos | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Rudd Creek Ruin | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Scribe S | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Spier 170 | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Techado Springs | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Tinaja | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 1 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| Tri-R Pueblo | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| UG481 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| UG494 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| WS Ranch | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Yellowhouse | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 2 | 1 | 2 | 2 | 2 | 0 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 3 |
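To get a feel for how much the cutoff matters, the sketch below applies two thresholds to a small made-up count table. The counts are invented, and `co.p.demo` is a compact restatement of the co-presence calculation written only for this illustration, not a replacement for the function above.

```r
# Made-up counts for 3 sites and 3 categories (not the workshop data)
toy <- matrix(c(60, 30, 10,
                45, 45, 10,
                10, 20, 70),
              nrow = 3, byrow = TRUE,
              dimnames = list(c('Site1', 'Site2', 'Site3'),
                              c('CatA', 'CatB', 'CatC')))

# Compact restatement of the co-presence calculation for this demo
co.p.demo <- function(x, thresh) {
  # 1 where a category makes up at least thresh of the row, 0 elsewhere
  present <- (prop.table(as.matrix(x), 1) >= thresh) * 1
  present %*% t(present)
}

p25 <- co.p.demo(toy, thresh = 0.25)
p15 <- co.p.demo(toy, thresh = 0.15)
p25
p15
```

With the 0.25 cutoff Site3 shares no categories with the other two sites, but lowering the cutoff to 0.15 counts CatB as present at Site3 and creates a tie to both Site1 and Site2.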
The next metric we will use here is a rescaled version of the Brainerd-Robinson (BR) similarity metric. This BR measure is commonly used in archaeology including in a number of recent (and not so recent) network studies. This measure represents the total similarity in proportional representation of categories and is defined as:
\[S = {\frac{2-\sum_{k} \left|x_{k} - y_{k}\right|} {2}}\]
where, for all categories \(k\), \(x\) is the proportion of \(k\) in the first assemblage and \(y\) is the proportion of \(k\) in the second. This provides a scale of similarity from 0-1 where 1 is perfect similarity and 0 indicates no similarity. For this example, we use the “vegdist” function from the “vegan” package, which has many distance metrics built in (the distance underlying Brainerd-Robinson is referred to as Manhattan distance in this package, calculated here on row proportions). This chunk ends by running this function for our sample dataset, defining a new object “ceramicBR” with the resulting similarity matrix. Note that by default, vegdist calculates this as a distance rather than a similarity. As the maximum possible distance is 2, we convert this to a similarity by subtracting the results from 2 and then rescale from 0 to 1 by dividing the result by 2.
# Convert the ceramic cluster frequency table to a table of proportions by row
ceramic.p <- prop.table(as.matrix(ceramic), margin = 1)
# Use the vegdist function to calculate Manhattan distances and rescale them to the Brainerd-Robinson similarity score
ceramicBR <- (2-as.matrix(vegdist(ceramic.p, method='manhattan')))/2
# display the results
kable(ceramicBR) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "300px")
| | Apache Creek | Atsinna | Baca Pueblo | Casa Malpais | Cienega | Coyote Creek | Foote Canyon | Garcia Ranch | Heshotauthla | Hinkson | Hooper Ranch | Horse Camp Mill | Hubble Corner | Jarlosa | Los Gigantes | Mineral Creek Pueblo | Mirabal | Ojo Bonito | Pescado Cluster | Platt Ranch | Pueblo de los Muertos | Rudd Creek Ruin | Scribe S | Spier 170 | Techado Springs | Tinaja | Tri-R Pueblo | UG481 | UG494 | WS Ranch | Yellowhouse |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Apache Creek | 1.0000000 | 0.3433584 | 0.4455782 | 0.7050691 | 0.3049155 | 0.7757143 | 0.5691057 | 0.5983103 | 0.3730159 | 0.5988613 | 0.8348214 | 0.8656783 | 0.8028571 | 0.2862950 | 0.2619048 | 0.7329193 | 0.2866389 | 0.3772894 | 0.2923588 | 0.4964061 | 0.3000000 | 0.7509158 | 0.3809524 | 0.2931548 | 0.8441558 | 0.3476190 | 0.7857143 | 0.8102919 | 0.7161905 | 0.5902876 | 0.2380952 |
| Atsinna | 0.3433584 | 1.0000000 | 0.5750090 | 0.3740804 | 0.7741935 | 0.3252632 | 0.3767651 | 0.5415959 | 0.6157895 | 0.5657895 | 0.3240132 | 0.4165839 | 0.3585965 | 0.5695336 | 0.7215767 | 0.5110603 | 0.7426333 | 0.7361673 | 0.6466748 | 0.4639192 | 0.8556391 | 0.3954116 | 0.7883169 | 0.7061404 | 0.2719298 | 0.7709273 | 0.3274854 | 0.4278438 | 0.4252632 | 0.4048984 | 0.5137845 |
| Baca Pueblo | 0.4455782 | 0.5750090 | 1.0000000 | 0.5608953 | 0.4737766 | 0.5277551 | 0.4644102 | 0.4674128 | 0.4523810 | 0.6271074 | 0.4931973 | 0.5529457 | 0.5273469 | 0.4564460 | 0.3506494 | 0.5874002 | 0.3633842 | 0.6569859 | 0.3399779 | 0.4193300 | 0.4571429 | 0.5719519 | 0.4285714 | 0.5644133 | 0.4826840 | 0.4489796 | 0.5260771 | 0.6028089 | 0.4416327 | 0.6280057 | 0.2857143 |
| Casa Malpais | 0.7050691 | 0.3740804 | 0.5608953 | 1.0000000 | 0.4193548 | 0.8233333 | 0.6224757 | 0.7204301 | 0.5132616 | 0.7142356 | 0.7489919 | 0.7980321 | 0.7505376 | 0.4652505 | 0.3727133 | 0.6273960 | 0.4123604 | 0.4611249 | 0.3993498 | 0.6308582 | 0.3735791 | 0.7725393 | 0.4668557 | 0.3938172 | 0.7033236 | 0.4688172 | 0.7782898 | 0.8064516 | 0.5320430 | 0.4795060 | 0.4086022 |
| Cienega | 0.3049155 | 0.7741935 | 0.4737766 | 0.4193548 | 1.0000000 | 0.3237634 | 0.4683976 | 0.6344086 | 0.7301075 | 0.7047686 | 0.2930108 | 0.3920674 | 0.3737634 | 0.7471807 | 0.8354978 | 0.5030388 | 0.8345339 | 0.7609595 | 0.8044511 | 0.6025563 | 0.8313364 | 0.3614557 | 0.7980622 | 0.8168683 | 0.2507331 | 0.8626728 | 0.3604711 | 0.4193548 | 0.4030108 | 0.3149154 | 0.6943164 |
| Coyote Creek | 0.7757143 | 0.3252632 | 0.5277551 | 0.8233333 | 0.3237634 | 1.0000000 | 0.6382927 | 0.6570968 | 0.3811111 | 0.6178261 | 0.8379167 | 0.8428302 | 0.8400000 | 0.2943902 | 0.2700000 | 0.7113043 | 0.3085437 | 0.3853846 | 0.3142636 | 0.5247170 | 0.3080952 | 0.8233333 | 0.3908791 | 0.3012500 | 0.8039394 | 0.3557143 | 0.7949206 | 0.8683871 | 0.6300000 | 0.5753465 | 0.2600000 |
| Foote Canyon | 0.5691057 | 0.3767651 | 0.4644102 | 0.6224757 | 0.4683976 | 0.6382927 | 1.0000000 | 0.6892211 | 0.5246612 | 0.6548250 | 0.5952744 | 0.5878969 | 0.7213008 | 0.4390244 | 0.4146341 | 0.6869919 | 0.4265925 | 0.5093809 | 0.4278692 | 0.6488725 | 0.4138211 | 0.6138211 | 0.4990619 | 0.4420732 | 0.5532151 | 0.4923345 | 0.7003484 | 0.6783373 | 0.6214634 | 0.5516783 | 0.3658537 |
| Garcia Ranch | 0.5983103 | 0.5415959 | 0.4674128 | 0.7204301 | 0.6344086 | 0.6570968 | 0.6892211 | 1.0000000 | 0.7240143 | 0.7889201 | 0.6172715 | 0.6786366 | 0.6202151 | 0.6144768 | 0.5877671 | 0.6895746 | 0.6166614 | 0.6749380 | 0.6066517 | 0.7346318 | 0.5849462 | 0.6550868 | 0.6823821 | 0.6088710 | 0.5268817 | 0.6801843 | 0.6062468 | 0.6989247 | 0.5858065 | 0.3941233 | 0.5806452 |
| Heshotauthla | 0.3730159 | 0.6157895 | 0.4523810 | 0.5132616 | 0.7301075 | 0.3811111 | 0.5246612 | 0.7240143 | 1.0000000 | 0.7417874 | 0.3534722 | 0.4429769 | 0.4311111 | 0.7986450 | 0.8010101 | 0.5468599 | 0.7798274 | 0.7709402 | 0.8506460 | 0.7010482 | 0.6841270 | 0.4042735 | 0.7840049 | 0.7486111 | 0.3040404 | 0.7238095 | 0.4238095 | 0.4663082 | 0.4355556 | 0.2557756 | 0.8063492 |
| Hinkson | 0.5988613 | 0.5657895 | 0.6271074 | 0.7142356 | 0.7047686 | 0.6178261 | 0.6548250 | 0.7889201 | 0.7417874 | 1.0000000 | 0.5674819 | 0.6808860 | 0.6678261 | 0.6625133 | 0.6407397 | 0.7246377 | 0.6646264 | 0.7357860 | 0.6410516 | 0.7178015 | 0.6442029 | 0.6555184 | 0.7090301 | 0.6684783 | 0.5447958 | 0.7232919 | 0.6544168 | 0.7134175 | 0.5595652 | 0.4798752 | 0.5869565 |
| Hooper Ranch | 0.8348214 | 0.3240132 | 0.4931973 | 0.7489919 | 0.2930108 | 0.8379167 | 0.5952744 | 0.6172715 | 0.3534722 | 0.5674819 | 1.0000000 | 0.8197720 | 0.7825000 | 0.2743902 | 0.2368777 | 0.7105978 | 0.2464604 | 0.3653846 | 0.2521802 | 0.4754324 | 0.2645833 | 0.7339744 | 0.3407738 | 0.2812500 | 0.7727273 | 0.3122024 | 0.7886905 | 0.8323253 | 0.6845833 | 0.6263408 | 0.1979167 |
| Horse Camp Mill | 0.8656783 | 0.4165839 | 0.5529457 | 0.7980321 | 0.3920674 | 0.8428302 | 0.5878969 | 0.6786366 | 0.4429769 | 0.6808860 | 0.8197720 | 1.0000000 | 0.7763522 | 0.3734468 | 0.3467287 | 0.7100082 | 0.3598644 | 0.4644412 | 0.3450344 | 0.5566038 | 0.3779874 | 0.8560716 | 0.4258760 | 0.3803066 | 0.8296169 | 0.4256065 | 0.7750824 | 0.8796916 | 0.6407547 | 0.5766860 | 0.2830189 |
| Hubble Corner | 0.8028571 | 0.3585965 | 0.5273469 | 0.7505376 | 0.3737634 | 0.8400000 | 0.7213008 | 0.6202151 | 0.4311111 | 0.6678261 | 0.7825000 | 0.7763522 | 1.0000000 | 0.3443902 | 0.3200000 | 0.8011594 | 0.3588350 | 0.4353846 | 0.3742636 | 0.5747170 | 0.3580952 | 0.7005128 | 0.4408791 | 0.3512500 | 0.7030303 | 0.4057143 | 0.8780952 | 0.8572043 | 0.7333333 | 0.5875908 | 0.3200000 |
| Jarlosa | 0.2862950 | 0.5695336 | 0.4564460 | 0.4652505 | 0.7471807 | 0.2943902 | 0.4390244 | 0.6144768 | 0.7986450 | 0.6625133 | 0.2743902 | 0.3734468 | 0.3443902 | 1.0000000 | 0.7421603 | 0.4736656 | 0.7096851 | 0.7645403 | 0.7182832 | 0.7399908 | 0.6343786 | 0.3320826 | 0.6236934 | 0.7545732 | 0.2213599 | 0.6724739 | 0.3418506 | 0.3899816 | 0.3843902 | 0.2598406 | 0.8292683 |
| Los Gigantes | 0.2619048 | 0.7215767 | 0.3506494 | 0.3727133 | 0.8354978 | 0.2700000 | 0.4146341 | 0.5877671 | 0.8010101 | 0.6407397 | 0.2368777 | 0.3467287 | 0.3200000 | 0.7421603 | 1.0000000 | 0.4302654 | 0.9273736 | 0.6803197 | 0.8909695 | 0.5755942 | 0.8450216 | 0.3076923 | 0.8571429 | 0.7524351 | 0.1969697 | 0.8441558 | 0.3174603 | 0.3655914 | 0.3189610 | 0.1540440 | 0.7922078 |
| Mineral Creek Pueblo | 0.7329193 | 0.5110603 | 0.5874002 | 0.6273960 | 0.5030388 | 0.7113043 | 0.6869919 | 0.6895746 | 0.5468599 | 0.7246377 | 0.7105978 | 0.7100082 | 0.8011594 | 0.4736656 | 0.4302654 | 1.0000000 | 0.4398480 | 0.5646600 | 0.4455679 | 0.6193601 | 0.4579710 | 0.6867336 | 0.5341615 | 0.4805254 | 0.6357049 | 0.5055901 | 0.7570738 | 0.8097242 | 0.8017391 | 0.6395466 | 0.3913043 |
| Mirabal | 0.2866389 | 0.7426333 | 0.3633842 | 0.4123604 | 0.8345339 | 0.3085437 | 0.4265925 | 0.6166614 | 0.7798274 | 0.6646264 | 0.2464604 | 0.3598644 | 0.3588350 | 0.7096851 | 0.9273736 | 0.4398480 | 1.0000000 | 0.6833458 | 0.8864303 | 0.6084448 | 0.8764679 | 0.3305950 | 0.9017390 | 0.7360437 | 0.2152104 | 0.8771151 | 0.3501310 | 0.3818770 | 0.3285437 | 0.1667788 | 0.7614424 |
| Ojo Bonito | 0.3772894 | 0.7361673 | 0.6569859 | 0.4611249 | 0.7609595 | 0.3853846 | 0.5093809 | 0.6749380 | 0.7709402 | 0.7357860 | 0.3653846 | 0.4644412 | 0.4353846 | 0.7645403 | 0.6803197 | 0.5646600 | 0.6833458 | 1.0000000 | 0.6541443 | 0.6335269 | 0.7256410 | 0.4230769 | 0.7197802 | 0.8822115 | 0.3123543 | 0.6978022 | 0.4145299 | 0.4809760 | 0.4753846 | 0.4737243 | 0.6153846 |
| Pescado Cluster | 0.2923588 | 0.6466748 | 0.3399779 | 0.3993498 | 0.8044511 | 0.3142636 | 0.4278692 | 0.6066517 | 0.8506460 | 0.6410516 | 0.2521802 | 0.3450344 | 0.3742636 | 0.7182832 | 0.8909695 | 0.4455679 | 0.8864303 | 0.6541443 | 1.0000000 | 0.5984350 | 0.7805094 | 0.3184258 | 0.8284351 | 0.7107558 | 0.2135307 | 0.8181617 | 0.3477298 | 0.3738435 | 0.3342636 | 0.1511244 | 0.8150609 |
| Platt Ranch | 0.4964061 | 0.4639192 | 0.4193300 | 0.6308582 | 0.6025563 | 0.5247170 | 0.6488725 | 0.7346318 | 0.7010482 | 0.7178015 | 0.4754324 | 0.5566038 | 0.5747170 | 0.7399908 | 0.5755942 | 0.6193601 | 0.6084448 | 0.6335269 | 0.5984350 | 1.0000000 | 0.5509434 | 0.5152395 | 0.6471076 | 0.5966981 | 0.4328188 | 0.6266846 | 0.5563043 | 0.5799351 | 0.5256604 | 0.3546609 | 0.7169811 |
| Pueblo de los Muertos | 0.3000000 | 0.8556391 | 0.4571429 | 0.3735791 | 0.8313364 | 0.3080952 | 0.4138211 | 0.5849462 | 0.6841270 | 0.6442029 | 0.2645833 | 0.3779874 | 0.3580952 | 0.6343786 | 0.8450216 | 0.4579710 | 0.8764679 | 0.7256410 | 0.7805094 | 0.5509434 | 1.0000000 | 0.3457875 | 0.8725275 | 0.7452381 | 0.2333333 | 0.8952381 | 0.3269841 | 0.4000000 | 0.3466667 | 0.2605375 | 0.6476190 |
| Rudd Creek Ruin | 0.7509158 | 0.3954116 | 0.5719519 | 0.7725393 | 0.3614557 | 0.8233333 | 0.6138211 | 0.6550868 | 0.4042735 | 0.6555184 | 0.7339744 | 0.8560716 | 0.7005128 | 0.3320826 | 0.3076923 | 0.6867336 | 0.3305950 | 0.4230769 | 0.3184258 | 0.5152395 | 0.3457875 | 1.0000000 | 0.3992674 | 0.3389423 | 0.8111888 | 0.3934066 | 0.7094017 | 0.8354012 | 0.4994872 | 0.5894897 | 0.2564103 |
| Scribe S | 0.3809524 | 0.7883169 | 0.4285714 | 0.4668557 | 0.7980622 | 0.3908791 | 0.4990619 | 0.6823821 | 0.7840049 | 0.7090301 | 0.3407738 | 0.4258760 | 0.4408791 | 0.6236934 | 0.8571429 | 0.5341615 | 0.9017390 | 0.7197802 | 0.8284351 | 0.6471076 | 0.8725275 | 0.3992674 | 1.0000000 | 0.6895604 | 0.2943723 | 0.8351648 | 0.4151404 | 0.4546851 | 0.4228571 | 0.2319661 | 0.6923077 |
| Spier 170 | 0.2931548 | 0.7061404 | 0.5644133 | 0.3938172 | 0.8168683 | 0.3012500 | 0.4420732 | 0.6088710 | 0.7486111 | 0.6684783 | 0.2812500 | 0.3803066 | 0.3512500 | 0.7545732 | 0.7524351 | 0.4805254 | 0.7360437 | 0.8822115 | 0.7107558 | 0.5966981 | 0.7452381 | 0.3389423 | 0.6895604 | 1.0000000 | 0.2282197 | 0.7750000 | 0.3472222 | 0.3968414 | 0.3912500 | 0.4016089 | 0.6875000 |
| Techado Springs | 0.8441558 | 0.2719298 | 0.4826840 | 0.7033236 | 0.2507331 | 0.8039394 | 0.5532151 | 0.5268817 | 0.3040404 | 0.5447958 | 0.7727273 | 0.8296169 | 0.7030303 | 0.2213599 | 0.1969697 | 0.6357049 | 0.2152104 | 0.3123543 | 0.2135307 | 0.4328188 | 0.2333333 | 0.8111888 | 0.2943723 | 0.2282197 | 1.0000000 | 0.2809524 | 0.7308802 | 0.7986315 | 0.5733333 | 0.5982598 | 0.1515152 |
| Tinaja | 0.3476190 | 0.7709273 | 0.4489796 | 0.4688172 | 0.8626728 | 0.3557143 | 0.4923345 | 0.6801843 | 0.7238095 | 0.7232919 | 0.3122024 | 0.4256065 | 0.4057143 | 0.6724739 | 0.8441558 | 0.5055901 | 0.8771151 | 0.6978022 | 0.8181617 | 0.6266846 | 0.8952381 | 0.3934066 | 0.8351648 | 0.7750000 | 0.2809524 | 1.0000000 | 0.4031746 | 0.4476190 | 0.3942857 | 0.2891089 | 0.6857143 |
| Tri-R Pueblo | 0.7857143 | 0.3274854 | 0.5260771 | 0.7782898 | 0.3604711 | 0.7949206 | 0.7003484 | 0.6062468 | 0.4238095 | 0.6544168 | 0.7886905 | 0.7750824 | 0.8780952 | 0.3418506 | 0.3174603 | 0.7570738 | 0.3501310 | 0.4145299 | 0.3477298 | 0.5563043 | 0.3269841 | 0.7094017 | 0.4151404 | 0.3472222 | 0.7308802 | 0.4031746 | 1.0000000 | 0.8172043 | 0.7250794 | 0.6060035 | 0.2857143 |
| UG481 | 0.8102919 | 0.4278438 | 0.6028089 | 0.8064516 | 0.4193548 | 0.8683871 | 0.6783373 | 0.6989247 | 0.4663082 | 0.7134175 | 0.8323253 | 0.8796916 | 0.8572043 | 0.3899816 | 0.3655914 | 0.8097242 | 0.3818770 | 0.4809760 | 0.3738435 | 0.5799351 | 0.4000000 | 0.8354012 | 0.4546851 | 0.3968414 | 0.7986315 | 0.4476190 | 0.8172043 | 1.0000000 | 0.6610753 | 0.6407963 | 0.3118280 |
| UG494 | 0.7161905 | 0.4252632 | 0.4416327 | 0.5320430 | 0.4030108 | 0.6300000 | 0.6214634 | 0.5858065 | 0.4355556 | 0.5595652 | 0.6845833 | 0.6407547 | 0.7333333 | 0.3843902 | 0.3189610 | 0.8017391 | 0.3285437 | 0.4753846 | 0.3342636 | 0.5256604 | 0.3466667 | 0.4994872 | 0.4228571 | 0.3912500 | 0.5733333 | 0.3942857 | 0.7250794 | 0.6610753 | 1.0000000 | 0.5865347 | 0.2800000 |
| WS Ranch | 0.5902876 | 0.4048984 | 0.6280057 | 0.4795060 | 0.3149154 | 0.5753465 | 0.5516783 | 0.3941233 | 0.2557756 | 0.4798752 | 0.6263408 | 0.5766860 | 0.5875908 | 0.2598406 | 0.1540440 | 0.6395466 | 0.1667788 | 0.4737243 | 0.1511244 | 0.3546609 | 0.2605375 | 0.5894897 | 0.2319661 | 0.4016089 | 0.5982598 | 0.2891089 | 0.6060035 | 0.6407963 | 0.5865347 | 1.0000000 | 0.0891089 |
| Yellowhouse | 0.2380952 | 0.5137845 | 0.2857143 | 0.4086022 | 0.6943164 | 0.2600000 | 0.3658537 | 0.5806452 | 0.8063492 | 0.5869565 | 0.1979167 | 0.2830189 | 0.3200000 | 0.8292683 | 0.7922078 | 0.3913043 | 0.7614424 | 0.6153846 | 0.8150609 | 0.7169811 | 0.6476190 | 0.2564103 | 0.6923077 | 0.6875000 | 0.1515152 | 0.6857143 | 0.2857143 | 0.3118280 | 0.2800000 | 0.0891089 | 1.0000000 |
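As a check on the rescaled Brainerd-Robinson calculation, the sketch below works through the formula by hand for two made-up assemblages and compares the result with base R's `dist` function using the Manhattan method. The counts are invented for illustration, and `dist` is used here in place of `vegdist` only so that the example runs without additional packages.

```r
# Made-up counts for two assemblages across three categories
toy <- rbind(SiteA = c(10, 30, 60),
             SiteB = c(30, 30, 40))
toy.p <- prop.table(toy, margin = 1)

# Sum of absolute differences in proportions across categories
abs.diff <- sum(abs(toy.p['SiteA', ] - toy.p['SiteB', ]))

# Rescaled Brainerd-Robinson similarity from the formula
S <- (2 - abs.diff) / 2
S

# Same quantity from a Manhattan distance matrix on the proportions
S.dist <- (2 - as.matrix(dist(toy.p, method = 'manhattan'))) / 2
S.dist['SiteA', 'SiteB']
```

The proportions differ by 0.2 in the first and third categories and not at all in the second, so the summed difference is 0.4 and the similarity is 0.8 by either route.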
The next measure we will use is the \(\chi^{2}\) distance metric which is the basis of correspondence analysis and related methods commonly used for frequency seriation in archaeology (note that this should probably really be called the \(\chi\) distance since the typical form we use is not squared, but the name persists this way in the literature so that’s what I use here). This measure is defined as:
\[\chi_{xy} = \sqrt{\sum_{j} \frac{1}{c_{j}} (x_{j}-y_{j})^{2}}\]
where \(c_j\) denotes the \(j\)th element of the average row profile (the proportional abundance of \(j\) across all rows) and \(x\) and \(y\) represent row profiles for the two sites under comparison. This metric therefore takes raw abundance (rather than simply proportional representation) into account when defining distance between sites. The definition of this metric is such that rare categories play a greater role in defining distances among sites than common categories (as in correspondence analysis). This measure has a minimum value of 0 and no theoretical upper limit.
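The formula can be checked by hand on a small made-up table. In this sketch the counts and object names are invented for illustration. The distance between two sites is computed directly from the definition and then again as an ordinary Euclidean distance on row profiles divided by the square root of the average profile, which should give the same value.

```r
# Made-up counts: 3 sites by 3 categories (not the workshop data)
toy <- matrix(c(10, 30, 60,
                30, 30, 40,
                 5,  5, 90),
              nrow = 3, byrow = TRUE,
              dimnames = list(c('SiteA', 'SiteB', 'SiteC'),
                              c('Cat1', 'Cat2', 'Cat3')))

row.prof <- toy / rowSums(toy)       # row profiles (x and y in the formula)
avg.prof <- colSums(toy) / sum(toy)  # average row profile (c_j in the formula)

# Distance between SiteA and SiteB computed directly from the formula
d.AB <- sqrt(sum((row.prof['SiteA', ] - row.prof['SiteB', ])^2 / avg.prof))
d.AB

# Same value via Euclidean distance on profiles weighted by 1/sqrt(c_j)
weighted <- sweep(row.prof, 2, sqrt(avg.prof), '/')
d.AB.check <- as.matrix(dist(weighted))['SiteA', 'SiteB']
d.AB.check
```

Because each squared difference is divided by \(c_j\), a difference in a category with a small average share (Cat1 here) contributes more to the distance than the same difference in a common category (Cat3).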
The code for calculating \(\chi^{2}\) distances is defined in the chunk below and a new object called “ceramicX” is created using this measure. It is sometimes preferable to rescale this measure so that it is bounded between 0 and 1. We create a second object called “ceramicX01” which represents rescaled distances by simply dividing the matrix by the maximum observed value (there are many other ways to scale this measure but this simple option will be fine for our current purposes).
chi.dist <- function(x) {
  rowprof <- x/apply(x,1,sum) # calculates the profile for every row
  avgprof <- apply(x,2,sum)/sum(x) # calculates the average profile
  # creates a distance object of chi-squared distances
  chid <- dist(as.matrix(rowprof)%*%diag(1/sqrt(avgprof)))
  # return the results
  return(as.matrix(chid))}
# Run the script and then create the rescaled 0-1 version
ceramicX <- chi.dist(ceramic)
ceramicX01 <- ceramicX/max(ceramicX)
# display the results
kable(ceramicX01) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "300px")
| | Apache Creek | Atsinna | Baca Pueblo | Casa Malpais | Cienega | Coyote Creek | Foote Canyon | Garcia Ranch | Heshotauthla | Hinkson | Hooper Ranch | Horse Camp Mill | Hubble Corner | Jarlosa | Los Gigantes | Mineral Creek Pueblo | Mirabal | Ojo Bonito | Pescado Cluster | Platt Ranch | Pueblo de los Muertos | Rudd Creek Ruin | Scribe S | Spier 170 | Techado Springs | Tinaja | Tri-R Pueblo | UG481 | UG494 | WS Ranch | Yellowhouse |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Apache Creek | 0.0000000 | 0.7095338 | 0.8989205 | 0.3833492 | 0.6924796 | 0.3129657 | 0.5260409 | 0.4459922 | 0.6550725 | 0.4383268 | 0.2368516 | 0.1811790 | 0.2240690 | 0.7672812 | 0.7416956 | 0.3414848 | 0.7134453 | 0.7039843 | 0.7128040 | 0.6136304 | 0.7276556 | 0.3394063 | 0.6407472 | 0.7708847 | 0.2611124 | 0.6732155 | 0.2328644 | 0.2303551 | 0.4111959 | 0.6537220 | 0.8101312 |
| Atsinna | 0.7095338 | 0.0000000 | 0.6606827 | 0.7000075 | 0.2700330 | 0.7600273 | 0.8127292 | 0.5762487 | 0.4723241 | 0.4703015 | 0.7652527 | 0.6331742 | 0.7266089 | 0.5386435 | 0.3677256 | 0.6262869 | 0.3751164 | 0.3268446 | 0.4572702 | 0.6679810 | 0.2128746 | 0.7161344 | 0.3734639 | 0.3108637 | 0.7750665 | 0.2840681 | 0.7502584 | 0.6420039 | 0.7657112 | 0.6693529 | 0.6331742 |
| Baca Pueblo | 0.8989205 | 0.6606827 | 0.0000000 | 0.8530409 | 0.7739726 | 0.9022397 | 0.9573654 | 0.8777808 | 0.8491163 | 0.7365035 | 0.9288940 | 0.7952807 | 0.9128092 | 0.8120862 | 0.8925783 | 0.8709471 | 0.8906429 | 0.4904889 | 0.9263186 | 0.9189806 | 0.7393821 | 0.8097825 | 0.9110600 | 0.5603828 | 0.8569546 | 0.7481423 | 0.8946640 | 0.8083725 | 1.0000000 | 0.5563467 | 0.9892470 |
| Casa Malpais | 0.3833492 | 0.7000075 | 0.8530409 | 0.0000000 | 0.6206696 | 0.3379605 | 0.5001327 | 0.3060764 | 0.5194491 | 0.3257256 | 0.3675392 | 0.2670962 | 0.3853877 | 0.5714386 | 0.6547077 | 0.4966927 | 0.6249516 | 0.6158990 | 0.6197513 | 0.4134978 | 0.6762156 | 0.3873051 | 0.5834942 | 0.6663568 | 0.3958593 | 0.5702034 | 0.4212723 | 0.3259262 | 0.6415072 | 0.7240529 | 0.6130341 |
| Cienega | 0.6924796 | 0.2700330 | 0.7739726 | 0.6206696 | 0.0000000 | 0.7041767 | 0.7164108 | 0.4580036 | 0.2943300 | 0.3827491 | 0.7265611 | 0.6206443 | 0.6654294 | 0.3244954 | 0.2276253 | 0.5347793 | 0.2703113 | 0.3256443 | 0.3220142 | 0.4750137 | 0.2670086 | 0.6824085 | 0.2868049 | 0.2848910 | 0.7560043 | 0.2401733 | 0.6851371 | 0.5894743 | 0.7018863 | 0.6868189 | 0.4334776 |
| Coyote Creek | 0.3129657 | 0.7600273 | 0.9022397 | 0.3379605 | 0.7041767 | 0.0000000 | 0.4465467 | 0.4283771 | 0.6514601 | 0.4451165 | 0.3018550 | 0.2831241 | 0.2787382 | 0.7475291 | 0.7497140 | 0.3602008 | 0.7249037 | 0.7203471 | 0.7125304 | 0.5867065 | 0.7546067 | 0.2139738 | 0.6733187 | 0.7689920 | 0.2337362 | 0.6776394 | 0.3811525 | 0.1915126 | 0.5447314 | 0.6680361 | 0.7855958 |
| Foote Canyon | 0.5260409 | 0.8127292 | 0.9573654 | 0.5001327 | 0.7164108 | 0.4465467 | 0.0000000 | 0.5749626 | 0.6581748 | 0.5635715 | 0.5409967 | 0.5275683 | 0.3406766 | 0.6971603 | 0.7558842 | 0.4946163 | 0.7536250 | 0.7224683 | 0.7270447 | 0.5102400 | 0.7904517 | 0.5739943 | 0.7385288 | 0.7545689 | 0.5836607 | 0.7255604 | 0.4551519 | 0.4826874 | 0.4947864 | 0.7314761 | 0.7544642 |
| Garcia Ranch | 0.4459922 | 0.5762487 | 0.8777808 | 0.3060764 | 0.4580036 | 0.4283771 | 0.5749626 | 0.0000000 | 0.3804029 | 0.2970302 | 0.3890086 | 0.3566695 | 0.4435010 | 0.4672769 | 0.4686878 | 0.4210840 | 0.4503842 | 0.5398951 | 0.4350290 | 0.3731739 | 0.5257877 | 0.4408093 | 0.4277193 | 0.5415027 | 0.5289976 | 0.4043380 | 0.4463934 | 0.3319121 | 0.5895549 | 0.7292109 | 0.4886172 |
| Heshotauthla | 0.6550725 | 0.4723241 | 0.8491163 | 0.5194491 | 0.2943300 | 0.6514601 | 0.6581748 | 0.3804029 | 0.0000000 | 0.2658615 | 0.6974775 | 0.5726741 | 0.6083405 | 0.2585020 | 0.2350154 | 0.5390827 | 0.2581299 | 0.3973180 | 0.1944983 | 0.3491583 | 0.3767529 | 0.6179888 | 0.2663091 | 0.4184306 | 0.7139678 | 0.3115805 | 0.6369095 | 0.5508457 | 0.7410489 | 0.8059763 | 0.2899974 |
| Hinkson | 0.4383268 | 0.4703015 | 0.7365035 | 0.3257256 | 0.3827491 | 0.4451165 | 0.5635715 | 0.2970302 | 0.2658615 | 0.0000000 | 0.5078046 | 0.3391316 | 0.4295679 | 0.3973989 | 0.4077008 | 0.3999032 | 0.3889958 | 0.3916897 | 0.3768919 | 0.3672003 | 0.4325894 | 0.3968613 | 0.3434164 | 0.4648359 | 0.4650545 | 0.3626327 | 0.4616089 | 0.3384299 | 0.6304483 | 0.6618696 | 0.4572272 |
| Hooper Ranch | 0.2368516 | 0.7652527 | 0.9288940 | 0.3675392 | 0.7265611 | 0.3018550 | 0.5409967 | 0.3890086 | 0.6974775 | 0.5078046 | 0.0000000 | 0.2291810 | 0.3324581 | 0.7806772 | 0.7869185 | 0.4102317 | 0.7674576 | 0.7365901 | 0.7601539 | 0.6347068 | 0.7864592 | 0.3478543 | 0.7120921 | 0.7742617 | 0.3448926 | 0.7024541 | 0.3194223 | 0.2477428 | 0.4637163 | 0.6475832 | 0.8427467 |
| Horse Camp Mill | 0.1811790 | 0.6331742 | 0.7952807 | 0.2670962 | 0.6206443 | 0.2831241 | 0.5275683 | 0.3566695 | 0.5726741 | 0.3391316 | 0.2291810 | 0.0000000 | 0.3014560 | 0.6776052 | 0.6752997 | 0.3865400 | 0.6495803 | 0.5961261 | 0.6489713 | 0.5635585 | 0.6548435 | 0.2634527 | 0.5842151 | 0.6651682 | 0.2531833 | 0.5866802 | 0.3243995 | 0.1922011 | 0.5225082 | 0.6055054 | 0.7442914 |
| Hubble Corner | 0.2240690 | 0.7266089 | 0.9128092 | 0.3853877 | 0.6654294 | 0.2787382 | 0.3406766 | 0.4435010 | 0.6083405 | 0.4295679 | 0.3324581 | 0.3014560 | 0.0000000 | 0.7043687 | 0.7001172 | 0.2849081 | 0.6756262 | 0.6909660 | 0.6623812 | 0.5052071 | 0.7090251 | 0.3819629 | 0.6274055 | 0.7458184 | 0.3450524 | 0.6489547 | 0.1843742 | 0.2483190 | 0.3454456 | 0.6757227 | 0.7304815 |
| Jarlosa | 0.7672812 | 0.5386435 | 0.8120862 | 0.5714386 | 0.3244954 | 0.7475291 | 0.6971603 | 0.4672769 | 0.2585020 | 0.3973989 | 0.7806772 | 0.6776052 | 0.7043687 | 0.0000000 | 0.3747380 | 0.6288089 | 0.4047035 | 0.3843676 | 0.3937243 | 0.3048746 | 0.4751910 | 0.7327908 | 0.4505519 | 0.3722825 | 0.8105495 | 0.3906584 | 0.7030906 | 0.6497701 | 0.7931872 | 0.7764263 | 0.2778855 |
| Los Gigantes | 0.7416956 | 0.3677256 | 0.8925783 | 0.6547077 | 0.2276253 | 0.7497140 | 0.7558842 | 0.4686878 | 0.2350154 | 0.4077008 | 0.7869185 | 0.6752997 | 0.7001172 | 0.3747380 | 0.0000000 | 0.6099721 | 0.0887290 | 0.4451597 | 0.1277225 | 0.4818653 | 0.2279874 | 0.7354927 | 0.1679149 | 0.4098011 | 0.8245659 | 0.2089582 | 0.7382395 | 0.6499491 | 0.7811090 | 0.8719123 | 0.3418459 |
| Mineral Creek Pueblo | 0.3414848 | 0.6262869 | 0.8709471 | 0.4966927 | 0.5347793 | 0.3602008 | 0.4946163 | 0.4210840 | 0.5390827 | 0.3999032 | 0.4102317 | 0.3865400 | 0.2849081 | 0.6288089 | 0.6099721 | 0.0000000 | 0.6037672 | 0.6065226 | 0.5923884 | 0.5150044 | 0.6302599 | 0.3724670 | 0.5569857 | 0.6484900 | 0.3984090 | 0.5816209 | 0.2935167 | 0.2401786 | 0.3384410 | 0.5703746 | 0.6871392 |
| Mirabal | 0.7134453 | 0.3751164 | 0.8906429 | 0.6249516 | 0.2703113 | 0.7249037 | 0.7536250 | 0.4503842 | 0.2581299 | 0.3889958 | 0.7674576 | 0.6495803 | 0.6756262 | 0.4047035 | 0.0887290 | 0.6037672 | 0.0000000 | 0.4695702 | 0.1320799 | 0.4730329 | 0.2094042 | 0.7175024 | 0.1309988 | 0.4428813 | 0.7975595 | 0.1806641 | 0.7160576 | 0.6297962 | 0.7765164 | 0.8879674 | 0.3329997 |
| Ojo Bonito | 0.7039843 | 0.3268446 | 0.4904889 | 0.6158990 | 0.3256443 | 0.7203471 | 0.7224683 | 0.5398951 | 0.3973180 | 0.3916897 | 0.7365901 | 0.5961261 | 0.6909660 | 0.3843676 | 0.4451597 | 0.6065226 | 0.4695702 | 0.0000000 | 0.4895434 | 0.5568736 | 0.3730097 | 0.6480328 | 0.4932293 | 0.1554204 | 0.7323865 | 0.3635438 | 0.6906821 | 0.5960219 | 0.7688771 | 0.5526790 | 0.5786496 |
| Pescado Cluster | 0.7128040 | 0.4572702 | 0.9263186 | 0.6197513 | 0.3220142 | 0.7125304 | 0.7270447 | 0.4350290 | 0.1944983 | 0.3768919 | 0.7601539 | 0.6489713 | 0.6623812 | 0.3937243 | 0.1277225 | 0.5923884 | 0.1320799 | 0.4895434 | 0.0000000 | 0.4519065 | 0.3023409 | 0.6959533 | 0.1845571 | 0.4757055 | 0.7952231 | 0.2645315 | 0.7051157 | 0.6193361 | 0.7778099 | 0.9088848 | 0.3148329 |
| Platt Ranch | 0.6136304 | 0.6679810 | 0.9189806 | 0.4134978 | 0.4750137 | 0.5867065 | 0.5102400 | 0.3731739 | 0.3491583 | 0.3672003 | 0.6347068 | 0.5635585 | 0.5052071 | 0.3048746 | 0.4818653 | 0.5150044 | 0.4730329 | 0.5568736 | 0.4519065 | 0.0000000 | 0.5815142 | 0.6388697 | 0.4903803 | 0.5665938 | 0.6800456 | 0.4752939 | 0.5149517 | 0.5243335 | 0.6401034 | 0.8115940 | 0.3037147 |
| Pueblo de los Muertos | 0.7276556 | 0.2128746 | 0.7393821 | 0.6762156 | 0.2670086 | 0.7546067 | 0.7904517 | 0.5257877 | 0.3767529 | 0.4325894 | 0.7864592 | 0.6548435 | 0.7090251 | 0.4751910 | 0.2279874 | 0.6302599 | 0.2094042 | 0.3730097 | 0.3023409 | 0.5815142 | 0.0000000 | 0.7278371 | 0.2628569 | 0.3418096 | 0.8034647 | 0.1468526 | 0.7415268 | 0.6464960 | 0.7870911 | 0.7973265 | 0.4883085 |
| Rudd Creek Ruin | 0.3394063 | 0.7161344 | 0.8097825 | 0.3873051 | 0.6824085 | 0.2139738 | 0.5739943 | 0.4408093 | 0.6179888 | 0.3968613 | 0.3478543 | 0.2634527 | 0.3819629 | 0.7327908 | 0.7354927 | 0.3724670 | 0.7175024 | 0.6480328 | 0.6959533 | 0.6388697 | 0.7278371 | 0.0000000 | 0.6601640 | 0.7204521 | 0.2050969 | 0.6647827 | 0.4308795 | 0.1867895 | 0.6189393 | 0.6077271 | 0.8016973 |
| Scribe S | 0.6407472 | 0.3734639 | 0.9110600 | 0.5834942 | 0.2868049 | 0.6733187 | 0.7385288 | 0.4277193 | 0.2663091 | 0.3434164 | 0.7120921 | 0.5842151 | 0.6274055 | 0.4505519 | 0.1679149 | 0.5569857 | 0.1309988 | 0.4932293 | 0.1845571 | 0.4903803 | 0.2628569 | 0.6601640 | 0.0000000 | 0.4935438 | 0.7263279 | 0.2451147 | 0.6745255 | 0.5782056 | 0.7390779 | 0.8667500 | 0.3960380 |
| Spier 170 | 0.7708847 | 0.3108637 | 0.5603828 | 0.6663568 | 0.2848910 | 0.7689920 | 0.7545689 | 0.5415027 | 0.4184306 | 0.4648359 | 0.7742617 | 0.6651682 | 0.7458184 | 0.3722825 | 0.4098011 | 0.6484900 | 0.4428813 | 0.1554204 | 0.4757055 | 0.5665938 | 0.3418096 | 0.7204521 | 0.4935438 | 0.0000000 | 0.8145558 | 0.3193465 | 0.7467950 | 0.6504735 | 0.7923746 | 0.6057615 | 0.5577138 |
| Techado Springs | 0.2611124 | 0.7750665 | 0.8569546 | 0.3958593 | 0.7560043 | 0.2337362 | 0.5836607 | 0.5289976 | 0.7139678 | 0.4650545 | 0.3448926 | 0.2531833 | 0.3450524 | 0.8105495 | 0.8245659 | 0.3984090 | 0.7975595 | 0.7323865 | 0.7952231 | 0.6800456 | 0.8034647 | 0.2050969 | 0.7263279 | 0.8145558 | 0.0000000 | 0.7470344 | 0.3888921 | 0.2396361 | 0.5825908 | 0.6214820 | 0.8730703 |
| Tinaja | 0.6732155 | 0.2840681 | 0.7481423 | 0.5702034 | 0.2401733 | 0.6776394 | 0.7255604 | 0.4043380 | 0.3115805 | 0.3626327 | 0.7024541 | 0.5866802 | 0.6489547 | 0.3906584 | 0.2089582 | 0.5816209 | 0.1806641 | 0.3635438 | 0.2645315 | 0.4752939 | 0.1468526 | 0.6647827 | 0.2451147 | 0.3193465 | 0.7470344 | 0.0000000 | 0.6760039 | 0.5743924 | 0.7458013 | 0.7731135 | 0.4012758 |
| Tri-R Pueblo | 0.2328644 | 0.7502584 | 0.8946640 | 0.4212723 | 0.6851371 | 0.3811525 | 0.4551519 | 0.4463934 | 0.6369095 | 0.4616089 | 0.3194223 | 0.3243995 | 0.1843742 | 0.7030906 | 0.7382395 | 0.2935167 | 0.7160576 | 0.6906821 | 0.7051157 | 0.5149517 | 0.7415268 | 0.4308795 | 0.6745255 | 0.7467950 | 0.3888921 | 0.6760039 | 0.0000000 | 0.2788753 | 0.3024226 | 0.6421796 | 0.7485000 |
| UG481 | 0.2303551 | 0.6420039 | 0.8083725 | 0.3259262 | 0.5894743 | 0.1915126 | 0.4826874 | 0.3319121 | 0.5508457 | 0.3384299 | 0.2477428 | 0.1922011 | 0.2483190 | 0.6497701 | 0.6499491 | 0.2401786 | 0.6297962 | 0.5960219 | 0.6193361 | 0.5243335 | 0.6464960 | 0.1867895 | 0.5782056 | 0.6504735 | 0.2396361 | 0.5743924 | 0.2788753 | 0.0000000 | 0.4573628 | 0.5700576 | 0.7107249 |
| UG494 | 0.4111959 | 0.7657112 | 1.0000000 | 0.6415072 | 0.7018863 | 0.5447314 | 0.4947864 | 0.5895549 | 0.7410489 | 0.6304483 | 0.4637163 | 0.5225082 | 0.3454456 | 0.7931872 | 0.7811090 | 0.3384410 | 0.7765164 | 0.7688771 | 0.7778099 | 0.6401034 | 0.7870911 | 0.6189393 | 0.7390779 | 0.7923746 | 0.5825908 | 0.7458013 | 0.3024226 | 0.4573628 | 0.0000000 | 0.6476775 | 0.8525427 |
| WS Ranch | 0.6537220 | 0.6693529 | 0.5563467 | 0.7240529 | 0.6868189 | 0.6680361 | 0.7314761 | 0.7292109 | 0.8059763 | 0.6618696 | 0.6475832 | 0.6055054 | 0.6757227 | 0.7764263 | 0.8719123 | 0.5703746 | 0.8879674 | 0.5526790 | 0.9088848 | 0.8115940 | 0.7973265 | 0.6077271 | 0.8667500 | 0.6057615 | 0.6214820 | 0.7731135 | 0.6421796 | 0.5700576 | 0.6476775 | 0.0000000 | 0.9830288 |
| Yellowhouse | 0.8101312 | 0.6331742 | 0.9892470 | 0.6130341 | 0.4334776 | 0.7855958 | 0.7544642 | 0.4886172 | 0.2899974 | 0.4572272 | 0.8427467 | 0.7442914 | 0.7304815 | 0.2778855 | 0.3418459 | 0.6871392 | 0.3329997 | 0.5786496 | 0.3148329 | 0.3037147 | 0.4883085 | 0.8016973 | 0.3960380 | 0.5577138 | 0.8730703 | 0.4012758 | 0.7485000 | 0.7107249 | 0.8525427 | 0.9830288 | 0.0000000 |
The measures above all rely on the ceramic frequency data to calculate similarity/distance among pairs of sites. It is often the case that we want to define networks based on spatial distances and neighborhoods instead. This can be done in a wide variety of ways and I direct readers to the references at the end of this document for more information (Verhagen 2017).
For the purposes of this workshop, we will use a very simple measure of spatial connectivity where we will define a network based on \(k\) nearest neighbors. For example, if \(k\) = 3 then each node will be connected to the 3 closest nodes in geographic space. In the example here we will use \(k\) = 5.
Note that unlike our other measures this measure is not symmetric. This means that site 1 may be among site 2’s \(k\) closest neighbors even if site 2 is not among site 1’s \(k\) closest neighbors.
# Create a distance matrix using Euclidean distances based on the site coordinates in ceramic.attr
distMatrix <- as.matrix(dist(ceramic.attr))
# set k as 5 and create a function that calculates the 5 nearest neighbors for each node
k <- 5
nrst <- lapply(1:nrow(distMatrix), function(i) k.nearest.neighbors(i, distMatrix, k = k))
# the chunk of code below creates a square matrix of 0s and then fills cell [i,j] with 1 where site j is one of site i's k nearest neighbors
dist.knn <- matrix(0, nrow = nrow(distMatrix), ncol = ncol(distMatrix))
for(i in seq_along(nrst)) for(j in nrst[[i]]) dist.knn[i,j] <- 1
# set row and column names
row.names(dist.knn) <- row.names(ceramic.attr)
colnames(dist.knn) <- row.names(ceramic.attr)
# display the results
kable(dist.knn) %>%
kable_styling() %>%
scroll_box(width = "100%", height = "300px")
| | Apache Creek | Atsinna | Baca Pueblo | Casa Malpais | Cienega | Coyote Creek | Foote Canyon | Garcia Ranch | Heshotauthla | Hinkson | Hooper Ranch | Horse Camp Mill | Hubble Corner | Jarlosa | Los Gigantes | Mineral Creek Pueblo | Mirabal | Ojo Bonito | Pescado Cluster | Platt Ranch | Pueblo de los Muertos | Rudd Creek Ruin | Scribe S | Spier 170 | Techado Springs | Tinaja | Tri-R Pueblo | UG481 | UG494 | WS Ranch | Yellowhouse |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Apache Creek | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Atsinna | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Baca Pueblo | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Casa Malpais | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Cienega | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Coyote Creek | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Foote Canyon | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Garcia Ranch | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Heshotauthla | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Hinkson | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Hooper Ranch | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Horse Camp Mill | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
| Hubble Corner | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
| Jarlosa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Los Gigantes | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Mineral Creek Pueblo | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mirabal | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Ojo Bonito | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Pescado Cluster | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Platt Ranch | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Pueblo de los Muertos | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Rudd Creek Ruin | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Scribe S | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Spier 170 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Techado Springs | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
| Tinaja | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Tri-R Pueblo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| UG481 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
| UG494 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| WS Ranch | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Yellowhouse | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Now that we have defined our four measures of similarity or distance, the next step is to convert these into network objects that our R packages will be able to work with. We can do this by either creating binary networks (where ties are either present or absent) or weighted networks (which in many cases are simply the raw similarity/distance matrices we calculated above). I will provide examples of both approaches, starting with simple binary networks. There are many ways to define networks from matrices like those we generated above and my examples below should not be seen as an exhaustive set of approaches.
First, we will create binary networks using our ceramic ware co-occurrence matrix. Sites that share at least one co-occurrence will be defined as connected here. We then use the “network” function to create a network object using the argument “directed=F” to let the function know that our data are from a symmetric matrix and that ties always extend in both directions between pairs of linked nodes. See help(network) for more details on options here.
After we create this new network object we plot it first using the default graph layout (Fruchterman-Reingold - We’ll discuss what this is in the workshop) and then based on the geographic location of our nodes. We’re not going to spend a lot of time exploring network plotting procedures in this workshop, but for those of you who may be interested in making far prettier graphs than those used here for demonstration purposes, I would recommend exploring the new “ggraph” package. (https://github.com/thomasp85/ggraph)
# create network object from co-occurrence
Pnet <- network(ceramicP,directed=F)
# Now let's add names for our nodes based on the row names of our original matrix
Pnet %v% 'vertex.names' <- row.names(ceramicP)
# look at the results
Pnet
## Network attributes:
## vertices = 31
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 184
## missing edges= 0
## non-missing edges= 184
##
## Vertex attribute names:
## vertex.names
##
## No edge attributes
# set up for plotting two plots, side by side, by setting the plot to 1 row, 2 columns
par(mfrow=c(1,2))
# plot network using default layout
plot(Pnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5,main='co-presence network')
# plot network using geographic coordinates
plot(Pnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5, coord=ceramic.attr)
par(mfrow=c(1,1)) # return to single plotting mode
The next chunk of code will produce a network based on our BR similarity matrix. In this example, we define ties as present between pairs of sites when they share more than 65% commonality (BR > 0.65) in terms of the proportions of ceramic wares recovered from both sites in a dyad. This threshold was selected based on an analysis not described here but discussed in detail in Chapter 5 of Peeples 2018.
In the code below, the event2dichot function (from the sna package, which is loaded as part of statnet) takes our matrix and divides it into 1s and 0s based on the cutoff we choose. Here we’re using an ‘absolute’ cutoff, meaning we’re assigning a specific value to use as the cutoff (0.65) and defining all similarity values higher than that cutoff as 1 and all other dyads as 0. We then send the output of this function to the network function just as before. After examining our new network we then plot it using both a graph layout and geographic locations.
# Define our binary network object from BR similarity
BRnet <- network(event2dichot(ceramicBR,method='absolute',thresh=0.65),directed=F)
# Now let's add names for our nodes based on the row names of our original matrix
BRnet %v% 'vertex.names' <- row.names(ceramicBR)
# look at the results.
BRnet
## Network attributes:
## vertices = 31
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 167
## missing edges= 0
## non-missing edges= 167
##
## Vertex attribute names:
## vertex.names
##
## No edge attributes
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
# plot network using default layout
plot(BRnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5,main='BR network')
# plot network using geographic coordinates
plot(BRnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5, coord=ceramic.attr)
par(mfrow=c(1,1)) # return to single plotting mode
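To make the absolute cutoff more concrete, here is a minimal base R sketch of the same idea applied by hand. The matrix `sim` and its values are made up for illustration and are not part of the workshop data. The strict `>` comparison follows the description in the text and is an assumption; event2dichot may treat values that fall exactly on the cutoff differently.

```r
# hypothetical 4 x 4 similarity matrix (symmetric, 1s on the diagonal)
sim <- matrix(c(1.00, 0.80, 0.40, 0.66,
                0.80, 1.00, 0.30, 0.10,
                0.40, 0.30, 1.00, 0.70,
                0.66, 0.10, 0.70, 1.00),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

cutoff <- 0.65
# 1 where the similarity exceeds the cutoff, 0 otherwise
adj <- (sim > cutoff) * 1
# a site is not tied to itself
diag(adj) <- 0
adj
# number of undirected ties = half the number of 1s in the symmetric matrix
n.edges <- sum(adj) / 2
n.edges
```

In this toy example only the dyads A-B, A-D and C-D exceed the cutoff, so the resulting binary network has three ties.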
In the next chunk of code we will use the \(\chi^2\) distances to create binary networks. This time, we will not use an absolute value to define ties as present, but instead will define those similarities (1-distances) greater than 80 percent of all similarities as present. We will then once again plot just as above.
# Note we use 1 minus ceramicX01 here to convert a distance to a similarity
Xnet <- network(event2dichot(1-ceramicX01,method='quantile',thresh=0.80),directed=F)
# Once again add vertex names
Xnet %v% 'vertex.names' <- row.names(ceramicX01)
# look at the results
Xnet
## Network attributes:
## vertices = 31
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 80
## missing edges= 0
## non-missing edges= 80
##
## Vertex attribute names:
## vertex.names
##
## No edge attributes
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
# plot network using default layout
plot(Xnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5,main='Chi-squared network')
# plot network using geographic coordinates
plot(Xnet, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5, coord=ceramic.attr)
par(mfrow=c(1,1)) # return to single plotting mode
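The quantile cutoff can be sketched by hand in much the same way. The example below uses ten made-up distances (one per dyad among five hypothetical sites), converts them to similarities, and keeps the dyads above the 80th percentile. The quantile here is taken over the unique dyads only, which is an assumption on my part, so the result need not match what event2dichot returns for the same data.

```r
# hypothetical distances for the 10 dyads among 5 sites
d <- c(0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05)
# convert distances to similarities
s <- 1 - d
# similarity value below which 80 percent of the dyads fall
cut80 <- quantile(s, probs = 0.80)
cut80
# dyads retained as ties: those more similar than the cutoff
ties <- s > cut80
sum(ties)
```

Unlike the absolute cutoff, the number of ties here is fixed by the quantile rather than by the values themselves: roughly the top 20 percent of dyads are retained regardless of how similar they are in absolute terms.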
Finally, we now plot the network based on the \(k\) nearest neighbors spatial definition described above. As mentioned previously, the \(k\) nearest neighbors procedure can produce directed ties: when A is among the nearest neighbors of B, B may not be among the nearest neighbors of A. Thus, we use the “directed=T” argument in the network call below. This also means that we can use arrows to indicate the direction of each tie in the plot.
# Create network object with directed ties
dist.net <- network(dist.knn,directed=T)
# Once again add vertex names
dist.net %v% 'vertex.names' <- row.names(ceramic.attr)
# look at the results
dist.net
## Network attributes:
## vertices = 31
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 155
## missing edges= 0
## non-missing edges= 155
##
## Vertex attribute names:
## vertex.names
##
## No edge attributes
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
# plot network using default layout
plot(dist.net, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5,main='K nearest neighbors network, k=5')
# plot network using geographic coordinates
plot(dist.net, edge.col='gray', edge.lwd=0.25, vertex.cex=0.5, coord=ceramic.attr)
par(mfrow=c(1,1)) # return to single plotting mode
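The 155 edges reported above follow directly from the definition: each of the 31 sites sends exactly \(k\) = 5 ties, and 31 × 5 = 155. The base R sketch below illustrates both this and the asymmetry of the ties using four made-up points on a line. The coordinates and the use of `order()` in place of k.nearest.neighbors are my own assumptions for illustration only.

```r
# hypothetical coordinates: three sites close together and one far away
xy <- matrix(c(0, 0,
               1, 0,
               2, 0,
               10, 0), ncol = 2, byrow = TRUE,
             dimnames = list(c("A", "B", "C", "D"), c("x", "y")))
d <- as.matrix(dist(xy))
k <- 2

# for each site, rank the other sites by distance and keep the k closest
knn <- matrix(0, nrow(d), ncol(d), dimnames = dimnames(d))
for (i in seq_len(nrow(d))) {
  nbrs <- order(d[i, ])          # all sites sorted by distance from site i
  nbrs <- nbrs[nbrs != i][1:k]   # drop site i itself, keep the k nearest
  knn[i, nbrs] <- 1
}
knn

# every row sends exactly k ties, so total edges = number of sites * k
sum(knn)
# the matrix is not symmetric: D points to C, but C does not point back to D
isSymmetric(knn)
# number of reciprocated pairs
sum(knn * t(knn)) / 2
```

Site D is far from everything, so its two nearest neighbors (B and C) do not return the tie. This is the same situation that makes the “directed=T” argument necessary in the network call.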
It is also possible to use R to create weighted networks where individual edges are valued. I have found that this works reasonably well with networks of co-presence or something similar (counts of mentions in texts or monuments for example) but this does not perform well when applied to similarity or distance matrices (because every possible link has a value, however small, so the network gets unwieldy very fast). In the latter case, I have found it is better to just work directly with the underlying similarity/distance matrix or to provide a cut-off below which ties are not shown.
Creating a weighted network object in R is easy and only requires a slight modification from the procedure above. In the chunk of code below, I will simply add the arguments “ignore.eval=F” and “names.eval=‘weight’” to let the network function know we would like weights to be retained and we would like that attribute called ‘weight’. We will apply this to the matrix of co-presence defined above and then plot the result showing the weights of individual ties. Although it is difficult to tell with a network this size, the lines defining the edges are scaled based on their edge weights. Although we only show this approach for the co-presence network, these same arguments work with the other matrices defined above. In this case, all ties have a weight of 1 or 2 with ties with a weight of 2 shown in red.
# create weighted network object from co-occurrence matrix by adding the ignore.eval=F argument
Pnet2 <- network(ceramicP,directed=F,ignore.eval=F,names.eval='weight')
Pnet2 %v% 'vertex.names' <- row.names(ceramicP)
Pnet2
## Network attributes:
## vertices = 31
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 184
## missing edges= 0
## non-missing edges= 184
##
## Vertex attribute names:
## vertex.names
##
## Edge attribute names:
## weight
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
# plot weighted network using default layout
plot(Pnet2, edge.col='weight', edge.lwd='weight', vertex.cex=0.5, vertex.col='red',
main='co-presence weighted network')
# plot weighted network using geographic coordinates
plot(Pnet2, edge.col='weight', edge.lwd='weight', vertex.cex=0.5, coord=ceramic.attr, vertex.col='red')
par(mfrow=c(1,1)) # return to single plotting mode
If we wished to do this for a similarity network, there may be a few additional steps to consider. In this example, using the cutoff defined above, we can eliminate the weights below the threshold of 0.65 so that those will not be shown. There are many more plotting options within R base graphics and ggplot/ggraph and I direct you to the help documents for those packages for more details as well as the additional examples below.
ceramicBR2 <- ceramicBR # create object to convert into weighted network object
ceramicBR2[ceramicBR2<0.65] <- 0 # set values for similarities less than 0.65 to 0 so that they are not shown
BRnet2 <- network(ceramicBR2,directed=F,ignore.eval=F,names.eval='weight') # create weighted network object
BRnet2 %v% 'vertex.names' <- row.names(ceramicBR)
BRnet2
## Network attributes:
## vertices = 31
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 167
## missing edges= 0
## non-missing edges= 167
##
## Vertex attribute names:
## vertex.names
##
## Edge attribute names:
## weight
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
edge.set <- round(get.edge.value(BRnet2,'weight')*100,0)-64 #extract edge values and round, subtract 64 so that the minimum value will be 1. This procedure would need to be modified if a different cutoff were used.
edge.cols <- colorRampPalette(c('gray','darkblue'))(max(edge.set)) # create color palette for the number of values represented
plot(BRnet2,edge.col=edge.cols[edge.set],edge.lwd='weight',vertex.cex=0.5,vertex.col='red',main='BR weighted network') # plot weighted network using default layout
plot(BRnet2,edge.col=edge.cols[edge.set],edge.lwd='weight',vertex.cex=0.5,coord=ceramic.attr,vertex.col='red',main='BR weighted network') # plot weighted network using geographic coordinates
par(mfrow=c(1,1)) # return to single plotting mode
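As noted in the code comments, subtracting 64 only works for a cutoff of 0.65. One possible way to generalize the color indexing to any cutoff is sketched below in base R. The weights in `w` are made up, and the variable names `edge.idx`, `pal` and `edge.colors` are hypothetical rather than objects used elsewhere in this document.

```r
# hypothetical edge weights, all at or above the cutoff
cutoff <- 0.65
w <- c(0.65, 0.70, 0.83, 1.00)

# convert weights to whole percentages and shift so that an edge
# sitting exactly at the cutoff receives index 1
edge.idx <- round(w * 100) - round(cutoff * 100) + 1
edge.idx

# one color per possible index, from weakest (gray) to strongest (dark blue)
pal <- colorRampPalette(c('gray', 'darkblue'))(max(edge.idx))
# look up the color for each edge
edge.colors <- pal[edge.idx]
edge.colors
```

With a cutoff of 0.65 this reduces to the same calculation used above (percentage minus 64), but the cutoff now only needs to be changed in one place.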
One of the most common kinds of analysis for archaeological and other networks is the calculation of measures of node/edge centrality and graph centralization. There are many different measures in the literature each appropriate for different kinds of research questions and data formats (see Borgatti and Everett 2006). I will not cover the interpretation of these network metrics in depth in this workshop, but we will be using several common measures as our means of assessing the impact of missing data and other kinds of uncertainty on our interpretations of archaeological networks.
In this section, I briefly describe the network metrics we will use and then define a function to easily calculate multiple measures simultaneously on multiple networks. The primary measures we will use are the binary and weighted versions of: 1) degree centrality, 2) betweenness centrality, and 3) eigenvector centrality. I direct readers to Peeples and Roberts (2013) and especially the online supplemental materials for that article for more details on the calculation of these metrics.
Degree centrality for a node is defined as the total number of direct connections in which that node is involved. In weighted networks, this is simply the total weight of all connections for that node (minus 1 to remove self-loops).
Betweenness centrality is defined as the number of shortest paths between pairs of nodes in a network involving the target node divided by the total number of shortest paths in the network as a whole. For binary networks, shortest paths are defined as the smallest number of direct ties that must be crossed to get from one specific node to another. Calculating betweenness centrality for weighted networks is a bit more complicated. The method we use here comes from Opsahl and others (2010). This approach defines shortest paths as the “path of least resistance” between pairs of target nodes. In other words, the stronger the path (the higher the edge weights) the “shorter” it is according to this algorithm. Refer to Peeples and Roberts (2013) for more details.
Eigenvector centrality is a measure of a node’s importance in a network defined in relation to the first eigenvector of the adjacency matrix. For both binary and weighted networks, a node’s eigenvector centrality score is proportional to the summed scores of other nodes to which it is connected. In other words, eigenvector centrality for a node will increase if a node is either connected to lots of other nodes, or if a node is connected to highly central nodes. We rescale this measure as is commonly seen in the network science literature so that the sum of squared scores is equal to the total number of nodes in that network.
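For readers who want to see what is happening under the hood, the sketch below computes binary degree and eigenvector centrality by hand in base R for a small made-up network. The four-node network and the use of `eigen()` are my own illustration and are not how the sna functions are implemented internally.

```r
# hypothetical undirected network with ties A-B, A-C, A-D and B-C
adj <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
edges <- rbind(c("A", "B"), c("A", "C"), c("A", "D"), c("B", "C"))
for (e in seq_len(nrow(edges))) {
  adj[edges[e, 1], edges[e, 2]] <- 1
  adj[edges[e, 2], edges[e, 1]] <- 1
}

# degree centrality: number of direct ties for each node
dg <- rowSums(adj)
dg

# eigenvector centrality: leading eigenvector of the adjacency matrix
# (abs() removes the arbitrary sign returned by eigen)
ev <- abs(eigen(adj, symmetric = TRUE)$vectors[, 1])
# rescale so that the sum of squared scores equals the number of nodes
eg <- ev * sqrt(length(ev))
names(eg) <- rownames(adj)
eg
sum(eg^2)
```

Node A has the most ties and is also connected to the other well-connected nodes, so it has both the highest degree and the highest eigenvector score. Node D has a single tie, but because that tie is to A its eigenvector score is not negligible.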
Calculating individual centrality measures usually involves just one or two lines of code as the examples below illustrate.
## degree centrality
dg <- as.matrix(sna::degree(BRnet,gmode='graph'))
## eigenvector centrality
eg <- as.matrix(sna::evcent(BRnet))
eg <- sqrt((eg^2)*length(eg)) # rescale so that the sum of squared scores equals the number of nodes, as is frequently seen in the literature
## betweenness centrality
bw <- sna::betweenness(BRnet,gmode='graph')
# calculate weighted degree as the sum of weights - 1
dg.wt <- as.matrix(rowSums(ceramicBR)-1)
# calculate weighted eigenvector centrality and rescale
eg.wt <- as.matrix(sna::evcent(ceramicBR))
eg.wt <- sqrt((eg.wt^2)*length(eg.wt)) # rescale so that the sum of squared scores equals the number of nodes
# calculate weighted betweenness from the tnet package (we use the suppressWarnings function to avoid notifications)
bw.wt <- suppressWarnings(betweenness_w(ceramicBR,directed=F))[,2]
head(dg)
## [,1]
## [1,] 11
## [2,] 8
## [3,] 1
## [4,] 11
## [5,] 13
## [6,] 11
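The “sum of weights - 1” calculation for weighted degree relies on each site having a similarity of exactly 1 with itself. The short base R sketch below, using a made-up 3 x 3 similarity matrix, shows that subtracting 1 gives the same result as zeroing the diagonal before summing. If a matrix has something other than 1 on its diagonal, the second form is probably the safer choice.

```r
# hypothetical similarity matrix with 1s on the diagonal
sim <- matrix(c(1.0, 0.8, 0.4,
                0.8, 1.0, 0.3,
                0.4, 0.3, 1.0), nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# weighted degree as the row sum minus the self-similarity of 1
dg.wt <- rowSums(sim) - 1

# equivalent calculation: remove the diagonal first, then sum
sim0 <- sim
diag(sim0) <- 0
dg.wt2 <- rowSums(sim0)

cbind(dg.wt, dg.wt2)
```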
To simplify the rest of the discussion, I have created a function that calculates multiple centrality measures at the same time. The chunk of code below contains two functions: 1) the first calculates the three measures of centrality described above for binary networks and 2) the second calculates these measures for weighted networks (or similarity matrices). The results are by default returned in a matrix with a column for each measure and a row for each node. Note that the statnet/sna package uses some of the same names for functions as tnet/igraph. Thus, we must specify which package we mean to use in the code below (e.g., sna::degree means the degree calculation method used in the sna package which is initialized through statnet. If we wished to use the igraph version we would instead use igraph::degree).
# Calculate centrality scores for binary networks
net.stats <- function(y){
# calculate degree centrality
dg <- as.matrix(sna::degree(y,gmode='graph'))
# calculate and scale eigenvector centrality
eg <- as.matrix(sna::evcent(y))
eg <- sqrt((eg^2)*length(eg))
# calculate betweenness centrality
bw <- sna::betweenness(y,gmode='graph')
# combine centrality scores into matrix
output <- cbind(dg,eg,bw)
rownames(output) <- rownames(as.matrix(y))
colnames(output) <- c('dg','eg','bw')
return (output)} # return results of this function
# Calculate centrality scores for weighted networks (similarity matrices)
net.stats.wt <- function(y){
# calculate weighted degree as the sum of weights - 1
dg.wt <- as.matrix(rowSums(y)-1)
# calculate weighted eigenvector centrality and rescale
eg.wt <- as.matrix(sna::evcent(y))
eg.wt <- sqrt((eg.wt^2)*length(eg.wt))
# calculate weighted betweenness from the tnet package (we wrap the call in the suppressWarnings function to avoid notifications)
bw.wt <- suppressWarnings(betweenness_w(y,directed=F))[,2]
output <- cbind(dg.wt,eg.wt,bw.wt)
rownames(output) <- rownames(as.matrix(y))
colnames(output) <- c('dg.wt','eg.wt','bw.wt')
return (output)} # return results of this function
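The package::function notation used in these functions can be tried out without loading any network packages. The sketch below is only an illustration of name masking in base R: it defines a hypothetical function called `median` that hides the version in the stats package, then shows that `stats::median` still reaches the original.

```r
# A user-defined function that masks a function name from the stats package
# (hypothetical example for illustration only)
median <- function(x) "my own median"

r1 <- median(c(1, 5, 9))          # calls the masking function defined above
r2 <- stats::median(c(1, 5, 9))   # explicitly calls the version in stats

r1
r2

# Remove the masking function so that median() refers to stats::median again
rm(median)
```

The same logic applies to sna::degree and igraph::degree: whichever package was loaded last wins unless the package is named explicitly.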
Now let’s calculate these measures for our networks defined above and look at the first couple of rows for the final example based on the \(k\) nearest neighbors network.
# net stats for binary co-presence network
co.p.stats <- net.stats(Pnet)
# net stats for binary BR similarity network
BR.stats <- net.stats(BRnet)
# net stats for binary X^2 similarity network (1-distance)
X.stats <- net.stats(Xnet)
# net stats for KNN network
dist.stats <- net.stats(dist.net)
head(dist.stats)
##              dg eg         bw
## Apache Creek  2  1 24.0000000
## Atsinna       4  1  9.5000000
## Baca Pueblo   5  1  3.2083333
## Casa Malpais  7  1  7.5250000
## Cienega       6  1  0.7083333
## Coyote Creek  8  1 10.9833333
And now let’s calculate the weighted versions. Note that we don’t include the KNN network here as there are no weights to those ties.
# net stats for weighted co-presence network
co.pw.stats <- net.stats.wt(ceramicP)
# net stats for weighted BR similarity network
BR.stats.w <- net.stats.wt(ceramicBR)
# net stats for X^2 similarity (1-distance)
X.stats.w <- net.stats.wt(1-ceramicX01)
head(X.stats.w)
##                  dg.wt     eg.wt bw.wt
## Apache Creek 13.949603 1.0099577     0
## Atsinna      12.950237 0.9409329     0
## Baca Pueblo   5.177017 0.3711367     0
## Casa Malpais 14.431399 1.0497798     0
## Cienega      14.980572 1.0885190     0
## Coyote Creek 13.652186 0.9904176     0
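Once the scores are in a matrix, a common next step is to rank sites by a given measure. The sketch below copies the six rows printed above into a small matrix (the object names `x6` and `ranked` are assumptions for illustration) and sorts them by weighted degree. It is meant as one possible way to explore the output, not as part of the original analysis.

```r
# First six rows of X.stats.w, copied from the output above
x6 <- matrix(c(13.949603, 1.0099577, 0,
               12.950237, 0.9409329, 0,
                5.177017, 0.3711367, 0,
               14.431399, 1.0497798, 0,
               14.980572, 1.0885190, 0,
               13.652186, 0.9904176, 0),
             ncol = 3, byrow = TRUE,
             dimnames = list(c("Apache Creek", "Atsinna", "Baca Pueblo",
                               "Casa Malpais", "Cienega", "Coyote Creek"),
                             c("dg.wt", "eg.wt", "bw.wt")))

# Sort sites from highest to lowest weighted degree
ranked <- x6[order(x6[, "dg.wt"], decreasing = TRUE), ]
rownames(ranked)

# Rank agreement between weighted degree and weighted eigenvector centrality
rho <- cor(x6[, "dg.wt"], x6[, "eg.wt"], method = "spearman")
rho
```

For these six rows the two measures order the sites identically, so the rank correlation is 1. The same two lines can be applied to the full X.stats.w matrix.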
It is also often informative to evaluate graph-level measures of centralization. In the code below, we calculate centralization measures associated with the node-level centrality measures described above. Centralization describes how strongly centrality is concentrated in the most central node: the differences between the largest centrality score and every other score are summed and then compared with the largest sum that is theoretically possible for a network of that size. We can calculate most measures using existing functions, but we need to create a custom function for calculating weighted betweenness centralization. Again, Peeples and Roberts (2013) provide more details.
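To make the idea of a theoretical maximum concrete, here is a minimal base R sketch of degree centralization for an undirected binary network, calculated by hand. The helper `cent.degree` and the toy `star` and `ring` matrices are hypothetical names introduced for this illustration. The sketch assumes the standard Freeman formulation, in which a star network defines the maximum.

```r
# Degree centralization for an undirected binary adjacency matrix
# (cent.degree is a hypothetical helper, not a function from sna)
cent.degree <- function(adj) {
  n <- nrow(adj)
  d <- rowSums(adj)            # degree of each node
  obs <- sum(max(d) - d)       # summed differences from the most central node
  obs / ((n - 1) * (n - 2))    # divide by the value obtained for a star network
}

# Star network: node 1 is tied to every other node
star <- matrix(0, 5, 5)
star[1, -1] <- 1
star[-1, 1] <- 1

# Ring network: every node has exactly two ties
ring <- matrix(0, 5, 5)
for (i in 1:5) {
  j <- i %% 5 + 1
  ring[i, j] <- 1
  ring[j, i] <- 1
}

cent.degree(star)  # 1: all centrality is concentrated in a single node
cent.degree(ring)  # 0: every node is equally central
```

Real networks fall somewhere between these two extremes.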
## calculate centralization measures for binary network
cent.bin <- function(net) {
output <- matrix(0,1,3)
# column names follow the order in which the measures are stored below
colnames(output) <- c('degree','eigen','between')
# calculate binary degree centralization
output[1,1] <- centralization(net,sna::degree,normalize=T)
# calculate binary eigenvector centralization
output[1,2] <- centralization(net,sna::evcent,normalize=T)
# calculate binary betweenness centralization
output[1,3] <- centralization(net,sna::betweenness,normalize=T)
return(output)}
# define function for calculating weighted betweenness centralization
bw.cent <- function(x){
Cstar <- max(x) # largest observed value
Csum <- (Cstar-x) # difference between the largest value and each value
num <- 2*(sum(Csum))
den <- ((length(x)-1)^2)*(length(x)-2)
out <- num/den
return(out)} # output result of this function
## calculate centralization measures for weighted network
cent.wt <- function(sim) {
output <- matrix(0,1,3)
colnames(output) <- c('degree.wt','eigen.wt','between.wt')
# calculate degree centralization
output[1,1] <- centralization(sim,sna::degree,normalize=T)
# calculate eigenvector centralization
output[1,2] <- centralization(sim,sna::evcent,normalize=T)
# calculate betweenness centralization
output[1,3] <- bw.cent(sim)
return(output)}
Now let’s look at a couple of results from these functions, focusing on the BR measure for now.
# BR binary net centralization
cent.bin(BRnet)
##         degree     eigen   between
## [1,] 0.2574713 0.1149407 0.3064557
# BR similarity centralization
cent.wt(ceramicBR)
## degree.wt eigen.wt between.wt
## [1,] 0.1082207 0.03267446 9.386233e-07
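The formula inside bw.cent is easiest to check with a vector of betweenness scores whose answer is known in advance. The sketch below repeats the bw.cent definition so that it runs on its own and applies it to hand-written scores for a hypothetical five-node star network (`bw.star` is an assumed name). It assumes betweenness is counted once per undirected pair of nodes.

```r
# Same definition as bw.cent above, repeated so this block runs on its own
bw.cent <- function(x){
  Cstar <- max(x)
  Csum <- (Cstar - x)
  num <- 2 * sum(Csum)
  den <- ((length(x) - 1)^2) * (length(x) - 2)
  num / den
}

# Undirected star with 5 nodes: the hub lies on the shortest path between
# all (4*3)/2 = 6 pairs of leaves, and the leaves lie on none
bw.star <- c(6, 0, 0, 0, 0)
bw.cent(bw.star)      # 1: the most centralized case

# If every node has the same betweenness, centralization is 0
bw.cent(rep(2, 5))    # 0
```

A star network therefore returns the maximum value of 1, which is the benchmark the denominator is built around.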
Another procedure that is often useful for exploratory analysis of archaeological networks is to visualize networks using attributes or network metrics to scale the sizes of nodes or edges. There are far more options than I could hope to show here, so I highlight a few of the most common plotting formats and options.
The first set of examples focuses on the “plot.network” function from the network package, which is loaded as part of statnet. Note that “vertex.cex” is being used here to set the size of nodes based on degree centrality. Because the values of degree vary widely, I divide the value by 6 to keep the points to a reasonable size. You can experiment with similar scaling techniques and I’ll discuss more of these in the workshop.
par(mfrow=c(1,2)) # set up for plotting two plots, side by side
# plot network using default layout
plot(BRnet,displaylabels=T,label.cex=0.5,vertex.cex=sna::degree(BRnet)/6, edge.lwd=0.25, edge.col='gray')
# plot network using geographic coordinates
plot(BRnet,displaylabels=T,label.cex=0.5,vertex.cex=sna::degree(BRnet)/6, coord=ceramic.attr, edge.lwd=0.25, edge.col='gray')
par(mfrow=c(1,1)) # return to single plotting mode
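Dividing by a constant is one way to scale node sizes. Another option, sketched below, is to rescale the values linearly to a fixed range of point sizes. The helper `rescale.size` and its default sizes are assumptions introduced for illustration and are not part of any package used in this document.

```r
# Hypothetical helper: linearly rescale values to a chosen range of point sizes
rescale.size <- function(x, min.size = 0.5, max.size = 3) {
  rng <- range(x)
  # if all values are identical, return a single mid-range size for every node
  if (rng[1] == rng[2]) return(rep(mean(c(min.size, max.size)), length(x)))
  min.size + (x - rng[1]) / (rng[2] - rng[1]) * (max.size - min.size)
}

# Degree values taken from the head(dg) output earlier in this document
dg6 <- c(11, 8, 1, 11, 13, 11)
sizes <- rescale.size(dg6)
sizes
```

The smallest degree maps to 0.5 and the largest to 3. In the plots above, something like vertex.cex=rescale.size(sna::degree(BRnet)) could stand in for the division by 6.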
The next set of examples shows some of the advanced plotting options that are available through the ggraph and tidygraph packages. These packages provide many more options for color, edge weights, and even animation. I encourage you to experiment by modifying the code below and by consulting the help(ggraph) documentation as well as the many excellent tutorials available online.
BRnet %>%
ggraph(layout = 'fr') +
geom_edge_link(color='gray') +
geom_node_point(aes(size = bw, color = bw)) +
scale_color_continuous(guide = 'legend') +
theme_graph()
BRnet %>%
ggraph(layout = 'manual', node.positions = ceramic.attr[,1:2]) +
geom_edge_link(color='gray') +
geom_node_point(aes(size = bw, color = bw)) +
scale_color_continuous(guide = 'legend') +
theme_graph()
edge_weight <- get.edge.value(BRnet2,'weight')
BRnet2 %>%
ggraph(layout = 'fr') +
geom_edge_link(aes(alpha=edge_weight), color='black') +
geom_node_point(aes(size = bw, color = bw)) +
geom_node_text(aes(label = name), repel=T)+
scale_color_continuous(guide = 'legend') +
theme_graph()
This document is by no means an exhaustive account of what R has to offer for archaeological networks. There are many more pre-built tools to experiment with and endless options for creating your own methods. I hope this inspires you to do more. If you’ve completed this tutorial and want to know more about network metrics and sensitivity, you can also try this more advanced tutorial that was part of a full-day course at the Computer Applications in Archaeology meeting in 2017.
Borgatti, Stephen P., and Martin G. Everett 2006 A Graph-Theoretic Perspective on Centrality. Social Networks 28(4): 466-484.
Brughmans, Tom 2013 Thinking Through Networks: A Review of Formal Network Methods in Archaeology. Journal of Archaeological Method and Theory 20: 623-662.
Brughmans, Tom, and Matthew A. Peeples 2017 Trends in Archaeological Network Research: A Bibliometric Analysis. Journal of Historical Network Research 1(1): article 1.
Opsahl, Tore, Filip Agneessens, and John Skvoretz 2010 Node Centrality in Weighted Networks: Generalizing Degree and Shortest Paths. Social Networks 32(3): 245-251.
Peeples, Matthew A., and John M. Roberts 2013 To Binarize or Not to Binarize: Relational Data and the Construction of Archaeological Networks. Journal of Archaeological Science 40(7): 3001-3010.
Peeples, Matthew A. 2018 Connected Communities: Networks, Identity, and Social Change in the Ancient Cibola World. University of Arizona Press, Tucson, AZ.
Peeples, Matthew A. 2019 Finding a Place for Networks in Archaeology. Journal of Archaeological Research (online first): 1-49.
In addition to the resources listed above, Tom Brughmans and I (Brughmans and Peeples 2017) have compiled a large Zotero library of archaeological network literature that is freely available through the Historical Network Research organization here.