> install.packages("RandomFields")
inserting the appropriate package name.
> require(RandomFields) # we need this package loaded for the GaussRF() functionLook at the help for the GaussRF function. It mainly requires you to specify a set of locations where the data should be simulated, as well as the correlation function to use. That's the C(h) from the lecture.
test.GaussRF <- function(dim)
{
model <- "spherical"
mean <- 0
variance <- 4
nugget <- 1
scale <- 10
alpha <- 1
step <- 1
x <- seq(0, dim, step)
y <- seq(0, dim, step)
f <- GaussRF(x=x, y=y, model=model, grid=T,
param=c(0, variance, nugget, scale))
par(pty="s")
image(x,y,f)
test.grid <- expand.grid(x=seq(0, dim, step), y=seq(0, dim, step))
return((cbind(test.grid, f=as.vector(f)))
}
This function should create and plot an image of a Gaussian correlated field. Does it work? Do you believe the data to be correlated? Why? Look at the help file for the GaussRF function. What do the variance, nugget and scale parameters control?
sim50 <- test.GaussRF(50) sim50.gd <- as.geodata(sim50) sim50.variog <- variog(sim50.gd)You can quickly plot the variogram by calling plot(sim50.variog). What do you expect to see, and does this give you the expected results? Everyone's plots should be different, but I suspect that something strange may be happening at higher lags.
In class we discussed some practical rules. If our grid is 50x50, what is the largest lag we should be computing the variogram up to? You can set this value using the max.dist parameter to the variog function. Adjust this appropriately, and replot the variogram. Is the situation improved?
The other practical consideration was to ensure at least 30 pairs per lag distance. You can check this by looking at sim50.variog$n (or whatever your object is called). Are we ok here?
Now, to evaluate isotropy, we will use the variog4() function. The function basically works the same, so use it to calculate the directional variogram and then plot it. (Note this function is a little slow, so be patient.) What does this suggest about whether the given result is isotropic? Comment specifically on whether you think the range values are equivalent and if the sill values are equivalent.
Next, lets evaluate the range (and so variation) of computed results at each lag within the variogram. We are going to look at what is referred to as a "variogram cloud".
# first calculate the variogram using the option="cloud" parameter sim50.vcloud <- variog(sim50.gd, option="cloud", max.dist=25) # and create a boxplot of this data ... this seems a little klugey, but the # point is to create a nice looking boxplot... boxplot(sim50.vcloud$v~as.factor((sim50.vcloud$u %/% 2.5)*2.5))What does the box plot show? How would you interpret the results?
sim50.ols <- variofit(sim50.variog, c(3.5, 10), cov.model="spherical")(Note: if you didn't set the max.dist parameter when you calculated the empirical variogram, your parameter estimates will be highly biased!)
What are your parameter estimates? You will need to look at the help file to figure out the object names for the nugget, partial sill and range. How do these compare to the original values used within the simulation? If they are different, why do you suspect that might be? You might want to compare the parameter estimates from each of the different grid dimensions. What happens as the grid size goes up?
The scallop data is available here. Save this file to your computer in your current R directory. Read in the scallops data (command below) and quickly look at what's happening for each column. The important columns are "lat" and "long", which are the latitude and longitude where the catch occurred, "tcatch" which is the total catch and is the sum of the "prerec" or prerecruits and "recruits".
> scallops.df <- read.table("scallopsdf.txt", sep=",", header=T)
> summary(scallops.df)
It seems we have highly skewed data, so let's perform a log transformation on the catch data.
> scallops.df[,"lgcatch"] <- log(scallops.df$tcatch+1)For this next part, we want to draw a map with the actual locations of the catch, and on top of that we will add a contour plot of the log of the catch amount.
> require(maps) # needed for the map() function call
> require(akima) # needed for the interp() function call
draw.scallopmap <- function() {
# draw the background as the map of the NE coast of the US
map("usa", xlim=c(-75.25,-71), ylim=c(38.2, 41.5), fill=T, col="gray")
# add the points where the catch occurred
points(scallops.df$long, scallops.df$lat, cex=0.75, pch=20)
# overlay the contour plots
contour(interp(scallops.df$long, scallops.df$lat, scallops.df$lgcatch),
add=T, nlevels=4)
}
> draw.scallopmap()
Next, here are some functions to rotate the scallops data. You can play with this more if you want to, but it turns out that 52 degrees is pretty close to the `best' rotation. You'll see the results when you call the rotate.scallops() function.
rotate.axis <- function(xy, theta)
{
# convert the rotation angle to radians
theta.rad <- theta*(pi/180)
# create the rotation matrix
newx <- c(cos(theta.rad), sin(theta.rad))
newy <- c(-sin(theta.rad), cos(theta.rad))
# perform the matrix multiplication to perform the rotation
XY <- as.matrix(xy) %*% cbind(newx, newy)
# return the value
return(as.data.frame(XY))
}
rotate.scallops <- function()
{
scall.rot <- rotate.axis(scallops.df[c("long", "lat")], 52)
plot(scall.rot$newx, scall.rot$newy)
return(scall.rot)
}
Finally, use these functions to create the rotated data set, and then convert it to its geodata version, which is merely the x,y and response data.
> scall.rot <- rotate.scallops() > scallops.gd <- as.geodata(cbind(scall.rot, scallops.df$lgcatch))
Last updated: 5/23/08.