Tim Evans R Tips Page

Tim Evans Computing Tips Main Page Tim Evans Informal Home Page | Tim Evans Imperial College page |

R

The R statistics package is free, widely used, and is extremely powerful. It is a statistics package first so its analysis tools, e.g. non-linear curve fitting, is very good. It has extensive plotting abilities and can produce pdf and postscript output suitable for publications. It can also display networks, e.g. using iGraph. I have found it painful to learn as it does not work in the same way as normal programming languages (works more like Matlab) and the manuals are dense if comprehensive. The best way in is to take the time to read the introductory manual and then to find a good book of which there are several (see the R Wiki book list). On the other hand being open source there is masses of material on the web. A search starting with the terms R statistics followed by other relevant keywords should throw up what you need. There are GUI front ends for Windows etc. but most commands need to be entered on the command line. I tend to test the commands on the command line, copying what works into simple scripts so I can repeat this quickly in the future. Scripts are easy to run (or "source" them in R language).

Some places to look

R Wiki has almost everything and links to things held elsewhere, e.g. lists of books and manuals.
R Graph Gallery has lots of examples.
The blog by Kelly O’Day contains many other useful tips for R.
The "How to do stuff in r in two minutes or less" site looks like a useful list of short video intros to various ideas. Also double as list of commands you should be aware of.

Tips

In Windows, right click the R icon to get the properties and change the start in directory to something more useful. Also look for Rprofile files (dot in front on Unix machines). A global Rprofile.site in the etc directory of the installation sets various parameters, and a Rprofile file in the current (startup) directory can override these settings too. SeeR for Windows FAQ for details on Windows machines.
For instance I found that on my college machine it tried to use a network drive for the libraries. So with my R starting up in c:\PRG\R I created a .Renviron directory in that directory containing the line HOME=c:/PRG (note backward to forward slashes, no idea if this is essential). I suppose there are neater ways but this worked for me.
```
HOME=c:/PRG/R/
R_LIBS=c:/PRG/R/win64-library/
```
Use graphics.off() to get rid of all the graphics windows. Use windows() to send following plots to the screen on a windows box and quartz() when on a Mac. To send to a pdf file I use something like pdf(pdfFileName, onefile=FALSE, height=6, width=6, pointsize=10). See device drivers below.
For lists of useable colours search for colourlist and colourList in examples below, such as when you need colours in plots. Otherwise colors() produces all the named colours.
Use object() and search() to find the list of named objects and libraries. rm(name1,name2, ...) and detach(dataframe1, dataframe2, ...) to remove unwanted objects.
When using if ... else you can enclose statements in curly brackets but make sure the else follows on the same line as the curly bracket ending the first (condition is true) statement.
If reading in a tab separated file with first line a header line, do not forget to specify tab as the separation character as white space in the column labels will confuse read.table. For instance I use
```
s250 <- read.table("tabseparateddata.dat", header=TRUE, sep="\t", fill=TRUE);
```
This reads in a table. The fill option means that it works even if there are more header items than columns, it just fills the extra lines with rubbish.
A useful check for problems with numbers of entries on lines is to use
```
count.fields("tabseparateddata.dat", header=TRUE, sep="\t");
```
Basic file input is in the R-intro document. However for more general advice on input and output to files see the R-intro document.
The typical commands for output to text files are write.table for data frames and write for matrices. For instance
```
    tfdf <- data.frame(source=sourceVector, target=targetvector)
    write.table(tfdf, file = "outputFile.dat", append = FALSE,  sep = "\t", quote=FALSE, row.names = FALSE, col.names = TRUE)
    
```
For example using scan see the dendrogram example below.
On MS Windows don't forget to use a double backslash to represent the single backslash used to separate directories
To get information on screen use the cat command. You need to give an explicit line feed so use the paste command to join strings with a "\n" string added to the end of what you want
```
cat("This is information\n")
cat(paste("variable =", variable, "\n",sep=""))
```
cat can direct to the screen ( file = "") chooses the standard output usually the screen) or to a specified file
```
cat("This is information\n",file="output.txt")
cat(paste("variable =", variable, "\n",sep=""),file="output.txt")
```
However I also use a way of redirecting screen output, sink, as follows:-
```
sink(distanceTextFileName) # all screen sent to file, use split=TRUE to copy to both screen and file
 cat(paste("This is information", "\n",sep=""))
 cat(paste("variable =", variable, "\n",sep=""))
sink()
```
Note if you use the print command (with a quote=FALSE option to ensure no quotes around strings in the output)
```
print("This is information", quote=FALSE)
```
this produces a nasty number in square brackets at the start of the line that I could not get rid of.
Use the paste command to combine strings
```
paste("string 1"," and ","string 2",sep="")
```
Note that the default for the separation string used in between each string is a single space so for true concatenation you must assign this to be the empty string as shown above if you don't want spaces.
Note paste will do conversions to. convert lists of numbers into strings for use in plot legends etc
```
  EEE <- 1000;
  KKK <- 4.0;
  pdetitle <- paste("pde E=",EEE," K=",KKK,sep=" ");
  prlist = c(0.9999,0.1,1e-2,1e-3,1e-4);
  prliststring <- paste(prlist);
  legend (xxx,yyy, prliststring, col=colourlist[1:length(prlist)],lty=1:length(prlist),pch=1:length(prlist)
```
Sometimes you see entries in lists referred to with double brackets, e.g. if list = (first=0.1,second=0.2) then list[[1]] refers to the first entry. In this case we have given the entry our own name so list$first also refers to the first entry. Its as if [1] was some sort of standard default name for entry number one in a list.
Appending to a list and creating a list of arbitrary length. A null list is created simply by result=list() and then one can add to this (or to any list) using a mixture of c (concatenate) and list, or just refer to the new index you want to create..
```
  result=list() # create empty list
  for ( i in 1:6) {
     name <- list(filename=paste("filename",i,sep=""), dirname=paste("dir",i,sep="") ) # create list of names
     output <- list(value=i,sq=i*i,cube=i*i*i) # create list of results
     singlelist <- list(c(name,output)) # combine these into a single list
     #result <- c(result,singlelist) # make a new list of lists
     result[[i]] <- singlelist # create new entry in lest
   }
```
This means that singlelist has four elements such as singlelist$sq the numerical value of the square of the value. result is now a list of length 6, each of whose entries is one of the singlelist, so each entry in the list result is a list of four objects. Thus result[[2]]$sq will be the square of result[[2]]$value .
In order to get greek characters, maths formulae or values of variables into a string, such as a graph title or label use substitute as in substitute(y==phi*alpha-sum(k^gamma)). That doesn't always work for me and there are other approaches, I just can't work out when each will work. Its trial and error. To control the format command. More detailed examples of what I have used to get greek characters, formulae and values into a string are given below.

Non linear Fitting example

This can be very frustrating, but this is often due to the inherent nature of non-linear fitting and not the fault of R. I find that playing around with the starting values and bounds for the parameters being fitted can often turn rubbish into beautiful results. Persevere is the watch word here.

Create a data frame containing the data, here with two columns labelled k and logp

df <- data.frame(k = dataf$k[2:kmax], logp = log(dataf$pk[2:kmax]));

The parameters in this example are called Afit, gammafit and k0fit and one needs to specify lower, upper and starting values for these. Of course the starting values must be larger (smaller) than the corresponding value specified in the lower (upper) list. For instance I might use

 startlist <- list(Afit=Aest,   gammafit=gammaest,   k0fit=k0est);
 lowerlist <- list(Afit=Alower, gammafit=gammalower, k0fit=k0lower)
 upperlist <- list(Afit=Aupper, gammafit=gammaupper, k0fit=k0upper )

where Aest etc. all have numerical values. To fit see if the logp data fits form with a linear and log k terms, I use one of the following

  fitres <- nls( logp ~ log(Afit)-gammafit*log(k) - k/k0fit  , df, startlist, lower=lowerlist, upper=upperlist,  algorithm="port")
  fitres <- nls( logp ~ log(Afit)-gammafit*log(k) - k/k0fit  , df, startlist, lower=lowerlist, upper=upperlist)

Note the first part is how you specify a formula. The names in the formula are either data or parameters. The data variable are those present in the data frame df, e.g. logp refers to the values in df$logp. Any names appearing in the in the upper, lower and starting value lists are treated as the parameters whose values are being sought, e.g. Afit in the formula appears in startlist so its value is being found. Anything else better have a fixed numerical value.

A summary is given on screen and the results are in fitres which is a "a type of fitted model object". You need to look this up to find out how to get the information out of it. Look up the nls function in the stats package, for instance as discussed in the reference manual. For instance coef(fitres) gives the coefficients of the fit, summary(fitres) gives a summary.

Don't forget that non-linear fitting is not guaranteed to converge and you may fail to get an answer at first. You may need to think about what sort of values are likely, play around with start/lower/upper values or whatever to get it to work. If all else fails try to get it working with a trivial example.

Plotting example

See R Basic Graphs page for basic ideas and the R Graph Gallery for some fancy ones.

One idea is to use the package gplots which you get get installed from the GUI package menu.

One use for a list of objects is when you have run the same routine on different data sets and you want to plot the results on the same plot. Here ResultsList is a list of lists where the sublist ResultsList[[iii]]] has one set of data and a fit (for instance the output from nls,m see below) to that data as its entries. The trick is to open a new window then to plot an example with null data where you specify the features of the plot (log axes, titles etc). In particular here you must set the range of the axes by hand, e.g. by calculating them for yourself as illustrated by the first three lines below. You then add more data points or lines as required. Also note that this code generates different point symbols, line types and colours, the latter are a subset of those known to the system which can be found by typing colors()

  #First calculate the minimum value of y axis.
  yMin<-ResultsList[[1]]$yMin;
  for (iii in 2:length(ResultsList)) yMin<-min(yMin, ResultsList[[iii]]$yMin)
  # Short list of allowed colours suitable for display or printing
  colourlist=c("black", "red", "blue", "green", "magenta", "brown");
  # Now plot on screen
  windows()
  plot(x=NULL,  y=NULL, type="n", log="xy", xlim=c(xMin,xMax) , ylim=c(yMin,yMax) , main="Main Title", sub="(subtitle)", xlab="x axis name", ylab="y axis name")
  for (iii in 1:length(ResultsList)) {
  points(ResultsList[[iii]]$y,   ResultsList[[iii]]$x, pch=iii, col=colourlist[iii])
  lines(ResultsList[[iii]]$y, fitted(ResultsList[[iii]]$fit), col=colourlist[iii], lty=iii)
  }
  # Finally add legend
  legend (x="topright",y=NULL, nameArray[1:length(ResultsList)], col=colourlist[1:length(ResultsList)],lty=1:length(ResultsList),pch=1:length(ResultsList));

Note that on a MAC you need quartz() to get displays in a window, the window() is for MS Windows platforms so I often define a boolean OSWindows and use if (OSWindows) windows() else quartz(). One could probably access some machine environment variable to do this too.

To get output to a file or a new window on a screen you use device drivers. These may sound scary but there is just one for every different type of file or output device and by default the appropriate screen is set up so most of the time you don"worry about these. For instance postscript() or pdf() with extra arguments should be used to produce files of each of these types. Note that for the former you must close the device (using dev.off(which = dev.cur()) or similar) or you get empty output. Thats a good idea anyway for all `devices' except for screen output though many types of device (file/screens) don't seem to care. Also for eps (best for inclusion in documents) not ps output you must include the onefile=FALSE option.

  rootName="IC090521"
  hepsFileName<-paste(rootName,"yearhist.ps",sep="_")
  print(paste("ps plotting",hepsFileName), quote=FALSE)
  postscript(hepsFileName, horizontal=FALSE, onefile=FALSE, height=6, width=6, pointsize=10)
  hist(yearList)
  dev.off(which = dev.cur())

Here is a simple example of how to alter the size and colours of the points or lines. It also shows how to set up a simple routine to do plots. A simple copying of a single line calling this routine can be used to reproduce the same plot for different outputs such as screen, eps file, pdf file. Also shown are labels for the curves.

#Next three lines are altered as needed
  screenOn=TRUE
  epsOn=FALSE
  pdfOn=FALSE
  pngOn=FALSE
  OSWindows=TRUE
  
fileNameFullRoot=paste("figure",sep="")
colourlist=c("black", "red", "blue", "green", "magenta", "brown");

plotFig<-function(){
    # next two control size of text and lines
	cexvalue=2
	lwdvalue=2
	xmin=0
	xmax=2.0
	# plots cross at this point and at (0,1)
	xcross=0.5
	ycross=0.5
	nvalues=100 	# number of xvalues to use
	xvec = seq(from=xmin, to=xmax, by=(xmax-xmin)/nvalues)
	# plot exponential e^(-gamma*x)
	gammavalue = -(1.0/xcross)*log(ycross)
	lll=1 ## dummy variable, helps if you later insert or remove some of the lines on the plot
    yvec=exp(-gammavalue*xvec)
	plot(xvec, yvec,  ylim=c(-0.01,1.01), type="l", col=colourlist[lll], lty=lll, cex=cexvalue, lwd=lwdvalue)
	xlabelpos=1.0
	text(xlabelpos,yvec[xvec==xlabelpos],labels=c("exponential"), col=colourlist[lll], offset = 2, pos=1, cex=cexvalue)
	# pos argument is 1, 2, 3 and 4, for below, to the left of, above and to the right of the specified coordinates.
	# plot power law 1/(x+x0)^alpha
	alphavalue = 2
	x0value = xcross/( (ycross^(-1.0/alphavalue))-1)
	lll=lll+1
	yvec=(x0value/(xvec+x0value))^alphavalue
    lines(xvec, yvec,  type="l", col=colourlist[lll], lty=lll, cex=cexvalue, lwd=lwdvalue)
	text(xlabelpos,yvec[xvec==xlabelpos],labels=c("power law"), col=colourlist[lll], offset = 2, pos=3, cex=cexvalue)
	# plot theta function
	lll=lll+1
    yvec=rep(1,times=length(xvec))
	yvec[xvec>xcross]=0
	lines(xvec, yvec,  type="l", col=colourlist[lll], lty=lll, cex=cexvalue, lwd=lwdvalue)
	text(xcross,0.8,labels=c("sharp cut off"), col=colourlist[lll], offset = 0.5, pos=4, cex=cexvalue)
}

 if (screenOn){
         if (OSWindows) windows() else quartz()
         plotFig()
  }
  # EPS plot
  if (epsOn){
         epsFileName<- paste(fileNameFullRoot,".eps",sep="")
         print(paste("eps plotting",epsFileName), quote=FALSE)
         postscript(epsFileName, horizontal=FALSE, onefile=FALSE, height=10, width=10, pointsize=20)
         plotFig()
         dev.off(which = dev.cur())
  }
  # PDF plot
  if (pdfOn){
         pdfFileName<- paste(fileNameFullRoot,".pdf",sep="")
         print(paste("pdf plotting",pdfFileName), quote=FALSE)
         pdf(pdfFileName, onefile=FALSE, height=10, width=10, pointsize=20, units="cm")
         plotFig()
         dev.off(which = dev.cur())
  }
# PNG plot
  if (pngOn){
         dpivalue=1200
         pngFileName<- paste(fileNameFullRoot,dpi,".png",sep="")
         print(paste("png plotting",pngFileName), quote=FALSE)
         png(pngFileName, height=10, width=10, pointsize=20, units="cm", res=dpivalue)
         plotFig()
         dev.off(which = dev.cur())
  }

 plotDataTableLines<-function(dataTable1,dataTable2, legendString1,legendString2){
    ltyList=c(2,3);
    plot(dataTable1$gamma, dataTable1$Communities,  ylim=c(0,12), xlim=c(0,2.5), main=NULL, xlab=expression(gamma), ylab="Number Communities", type="l", col=colourlist[2], lty=ltyList[1], lwd=2)
    lines(dataTable2$gamma,dataTable2$Communities, col=colourlist[3], lty=ltyList[2], lwd=2)
    legend(x="bottomright" ,y=NULL, c(legendString1,legendString2), lty=ltyList, lwd=c(4,4), col=colourlist[2:3]);
}

So then I would call this as

  #Next three lines are altered as needed
  screenOn=TRUE
  epsOn=TRUE
  pdfOn=TRUE
  pngOn=TRUE
  OSWindows=TRUE
  if (screenOn){
         if (OSWindows) windows() else quartz()
         plotDataTable(gnct2,gnct4,typeStringt2,typeStringt4)
  }
  # EPS plot
  if (epsOn){
         epsFileName<- paste(fileNameFullRoot,".eps",sep="")
         print(paste("eps plotting",epsFileName), quote=FALSE)
         postscript(epsFileName, horizontal=FALSE, onefile=FALSE, height=6, width=6, pointsize=10)
         plotDataTable(gnct2,gnct4,typeStringt2,typeStringt4)
         dev.off(which = dev.cur())
  }
  # PDF plot
  if (pdfOn){
         pdfFileName<- paste(fileNameFullRoot,".pdf",sep="")
         print(paste("pdf plotting",pdfFileName), quote=FALSE)
         pdf(pdfFileName, onefile=FALSE, height=6, width=6, pointsize=10)
         plotDataTable(gnct2,gnct4,typeStringt2,typeStringt4)
         dev.off(which = dev.cur())
  } 
  # PNG plot
  if (pngOn){
         pngFileName<- paste(fileNameFullRoot,".png",sep="")
         print(paste("png plotting",pngFileName), quote=FALSE)
         png(pngFileName, onefile=FALSE, height=6, width=6, pointsize=20)
         plotDataTable(gnct2,gnct4,typeStringt2,typeStringt4)
         dev.off(which = dev.cur())
  }

A useful function to increase the scale of the plot e.g. to leave room for text labels

plotrange<-function(coordinates,factor=0.05){
 cmin=min(coordinates)
 cmax=max(coordinates)
 crange=cmax-cmin
 cextra=crange*factor
 c(cmin-cextra,cmax+cextra)
}

I call this as

plot(xvalues,yvalues,xlim=plotrange(xvalues,0.05),ylim=plotrange(yvalues,0.05))

which gives an five percent whitespace border.

This is the old way I used to combine several data sets and lines on one plot. The options in the plot command illustrate several of the useful options. Here s010 and the other parts of SimonDataList are data frames formed using a read.table. The double square bracket is confusing and comes from the default name given to entries in list. The FitList is a list of fits coming out of some call to nls or similar fitting command. ifirst and ilast are there just for flexibility.

  colourlist=c("black", "red", "blue", "green", "magenta", "brown");
  SimonDataList <- list(s010 ,s250, s750, s950, s990 , s999);
  ifirst=1
  ilast=length(SimonDataList)
  plot(SimonDataList[[ifirst]],   type="p", log="xy" ,  col=colourlist[ifirst], pch=ifirst, xlim=c(kmin,kmax), ylim=c(1e-9,1), xlab="degree k", ylab="p(k)", main="Laird and Jensen EPL 2006 Fig 1 (ll, ac fit)")
  for ( i in (ifirst+1):ilast) points(SimonDataList[[i]],  col=colourlist[i], pch = i);
  for ( i in ifirst:ilast) lines(SimonDataList[[i]]$k, exp(fitted(FitList[[i]])), col=colourlist[i]);

Offset of axes

The axes are usually drawn offset from the data. To change this you need the xaxs or yaxs options. This example is taken from the blog by Kelly O’Day which contains many other useful tips for R.

# Here are two simple x and y vectors to be plotted
    x <- c(0,1,2,3,4,5,6,7,8)
   y <- c(0,1,2,3,4,5,6,7,8)
# Set layout() for side by side plots
    layout(matrix(c(1,2), byrow=TRUE, ncol=2))
#  xaxs = “r” adds 4% over specified limits
    plot (y~x, xlim=c(0,10), ylim=c(0,10),
         xaxs=”r”, yaxs = “r”, las = 1,
         main = “Plot with ‘r’ axes placement”)
# The 4% expansion can be eliminated by specifying xaxs = “i” and yaxs = “i”
   plot (y~x, xlim=c(0,10), ylim = c(0,10),
      xaxs = “i”, yaxs = “i” ,  las = 1,
      main = “Plot with ‘i’ axes placement”)

Greek Letters, Equations and Numerical Values in Plots and strings in general

To get R to display mathematical expressions in plots (or in strings in general) use the expression(), bquote() or substitute() commands, which seem to be part of plotmath. There are some webpages with examples, and more symbols so search for these terms.

A simple example I had working is as follows

  subtitle=paste(substitute(alpha),"=",alpha,", ",substitute(gamma),"=",gamma)
  plot(x,y,main=substitute(y==Phi*alpha-sum(x^gamma)),type="l",subtitle=subtitle)
  text(20,30,substitute(Delta[z]==0.1))

It took me time to piece a complicated mixture of angle brackets, subscripts, mathematical expressions and numerical values. You can't always paste bits together as you can with normal strings. For instance the following work as labels in a plot

  graphKfitScale=1.0
  graphKfitShiftAbs=3.0
  xLabel = bquote(k/(.(graphKfitScale)*symbol("\341") * k[s] * symbol("\361")  - .(graphKfitShiftAbs) ) )
  yLabel = bquote(log(n(k))*p*( .(graphKfitScale)*symbol("\341") * k[s] * symbol("\361")  - .(graphKfitShiftAbs) ) )

This will substitute the values of the graphKfit parameters while the other symbols such as k are left unchanged. The whole is treated as an expression so the * symbols do not appear yet the angle brackets (the symbol commands) are left alone.

When values of variables appear in labels, leading and trailing zeros are usually suppressed. Sometimes this is not wanted so to control the conversion of numerical values to strings look at the ,tt>format function.

Dendrogram Plot Example

Here is an example of how to read in a table of distances and a list of labels for the sites (with a header) and to produce a dendrogram.

  rootName="aegean34";
  typeName="S1L3";
  namesFileName<-paste(rootName,"names.dat",sep="");
  distanceFileName<-paste(rootName,typeName,"dist.dat",sep="")
  nameList <- scan(namesFileName, what="", skip=1);
  distanceVector <- scan(distanceFileName,0);
  distanceMatrix <- matrix(distanceVector,nrow=34);
  distanceDist <- as.dist(distanceMatrix)
  hclustMethod <- c("single", "complete", "ward", "average", "mcquitty", "median", "centroid");
  methodName=hclustMethod[2]
  distanceHclust <- hclust(distanceDist, method=methodName)
  plclust(distanceHclust, labels=nameList, frame.plot=FALSE, hang=-10, xlab="", ylab="Distance", sub="Sites", main=paste(rootName,typeName," ",methodName,sep="") )

Histogram Barplot Example

Here we are plotting columns of data in file SiteHist.dat as horizontal bars. Note that though I'd call this a histogram one actually wants to use the barplot command. The las=1 is vital here to turn the labels around to be horizontal so each bar is clearly labelled.

  df <- read.table("SiteHist.dat", header=TRUE, sep="\t", fill=TRUE);
  wmax <- max(df$Weight)
  rmax <- max(df$Ranking)
  imax <- max(df$Influence)
  dataArray <- rbind(df$Weight,df$Ranking,df$Influence)
  barplot(ddd, beside=TRUE, names.arg=df$ShortName, horiz=TRUE, las=1)

Error bars

Look in the gplots library for confidence limits. plotci and barplot2 for instance have this.

Networks and Graphs

I have used the iGraph package. After install.packages("igraph") (I had to fix the location for libraries on my college PC, see R tips above) you load the library with library(igraph). Typical commands I use to get a figure of a graph are

  # convert flowMatrix N-by-N array to adjacency matrix
  minw=1/N # minw is a minimum threshold for weights
  adjMatrix <- ifelse(flowMatrix>=minw,1,0)
  flowgGraph <- graph.adjacency(adjMatrix, mode="upper", diag=FALSE)
  # make an N-by-2 matrix for the locations of vertices, here taken from a data frame of site input parameters - sitedf -
  # containing latitude and longitude of sites
  xvec <- sitedf$Long
  yvec <- sitedf$Lat
  vc <- array(c(xvec,yvec),dim=c(numberRows,2))
  # now make a vector from some data to produce vertices of different sizes, here from results vector
  maxInFlowVector <- max(siteresultsdf$inFlowVector)
  vertexSize <- siteresultsdf$inFlowVector*10/maxInFlowVector+1
  # How to colour the vertices using some membership vector - here generated by strong component membership
  flowgClusters <- clusters(flowg,mode="strong")
  colourListLong=c("black", "red", "blue", "green", "magenta", "brown","cyan","gold","darkred","darkblue", "darkgreen","yellow","hotpink","lightblue", "olivedrab");
  ncolours <-length(colourList)
  vertexColour <- colourList[ (flowgClusters$membership +1)%%ncolours +1]
  # now produce graphic with edges of different widths and named vertices of different sizes
  plot(flowg, layout=vc, vertex.label=sitedf$ShortName, vertex.size=vertexSize, vertex.color=vertexColour)
  # oops the edgewidth only works if we used the edge list method below
  #plot(flowg, edge.width=edgedf$weight*10, layout=vc, vertex.label=sitedf$ShortName, vertex.size=vertexSize)

Note how simple it is to specify the locations of the vertices, a two column matrix (three for 3D). I found this quite hard to find in the documentation. flowg <- graph.adjacency(adjMatrix, mode="upper", diag=FALSE) Also note that I had tried to create the graph from an edge list, which I generated from the adjacency matrix. This is how I did that but I think there can be issues with multiple edges.

  # convert flowMatrix N-by-N array to three column edge data frame edgedf
  # First create a N*N vector of weights, then do same for sources and targets.
  weight<- flowMatrix
  dim(weight)<-N*N
  #Internally the vertices are numbered from 0 but can use any unique numbering for each row and column
  source<-array(1:N,dim=dim(weight))
  target <- t(array(1:N,dim=c(N,N)))
  dim(target)=dim(source)
  # minw is a minimum threshold for weights
  minw=1/N
  edgedf <-data.frame(source=source[weight>minw], target=target[weight>minw], weight=weight[weight>minw])
  # create a graph structure from a data frame whose first two columns are the source then the target vertices.
  flowg <- graph.data.frame(edgedf, directed=T)
  # Trouble is multiple edges are allowed as this command will show
  count.multiple(flowg)

Finally note that there was a problem with fonts and poscript/pdf output. The solution is to load the fonts needed in the postscript (or pdf) command. The serif is the default needed to get this to work.

       postscript(epsFileName, horizontal=FALSE, onefile=FALSE, height=6, width=6, pointsize=10, fonts=c("serif", "Palatino"))
       plot(fixedRGraph,  layout=vc, vertex.label=dataList$sitedf$ShortName, vertex.color=vertexColour, vertex.size=vertexSize, vertex.label.dist=1)

For more details on plot poscript/pdf output, see examples above.

Ordering vectors

Use order as in the following example output

>xxx <- c(1:4)
>yyy <- c("one", "two", "three", "four")
>aorder <- order(yyy,xxx)
>aorder
[1] 4 1 3 2
> yyy[aorder]
[1] "four"  "one"   "three" "two"
> xxx[aorder]
[1] 4 1 3 2

Libraries

To install try install.packages("libraryname") where libraryname is the name of the library. Note that I had to fix the location for libraries on my college PC, see R tips above as it tried to put things on a networkk drive by default. You load the library with library(libraryname).