Workflow through the script:
1. Subset the data into these eight groups:
Spring (containing both Mundaring and Yanchep)
Autumn (containing both Mundaring and Yanchep)
Mundaring (containing both spring and autumn)
Yanchep (containing both spring and autumn)
Spring & Mundaring
Spring & Yanchep
Autumn & Mundaring
Autumn & Yanchep
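The eight subsets above can be sketched like this. The data frame and column names (dat, season, site) are guesses; adjust them to whatever the real data uses.

```r
# Hypothetical data frame standing in for the real one.
dat <- data.frame(
  season = rep(c("spring", "autumn"), each = 4),
  site   = rep(c("Mundaring", "Yanchep"), times = 4),
  value  = rnorm(8)
)

spring           <- subset(dat, season == "spring")    # both sites
autumn           <- subset(dat, season == "autumn")
mundaring        <- subset(dat, site == "Mundaring")   # both seasons
yanchep          <- subset(dat, site == "Yanchep")
spring.mundaring <- subset(dat, season == "spring" & site == "Mundaring")
spring.yanchep   <- subset(dat, season == "spring" & site == "Yanchep")
autumn.mundaring <- subset(dat, season == "autumn" & site == "Mundaring")
autumn.yanchep   <- subset(dat, season == "autumn" & site == "Yanchep")
```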
2. Made some density plots to get an idea of what affects what. Don't know if you're into density plots, but they are like histograms for continuous data, and they are more appropriate than histograms for looking at this data.
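In base R a density plot is just a kernel density estimate passed to plot(). A minimal sketch, using a made-up numeric column:

```r
# Hypothetical data; swap in the real column of interest.
dat <- data.frame(value = rnorm(200))

d <- density(dat$value)                  # kernel density estimate
plot(d, main = "Density of value")       # the density plot itself

# Overlaying it on a histogram shows why it is the smoothed analogue:
hist(dat$value, freq = FALSE, main = "Histogram with density overlay")
lines(d)
```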
3. Fit linear models across all eight of the subsets above as well as the unsubsetted data. Generated some plots of those models just by running plot(fit.model.name). Did not get into extracting data from the fit models and plotting it manually.
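The fit-then-plot step looks roughly like this; y, x, and dat are placeholders for the real response, predictor, and subset.

```r
# Hypothetical data standing in for one of the subsets.
set.seed(1)
dat <- data.frame(x = runif(30))
dat$y <- 2 * dat$x + rnorm(30, sd = 0.1)

fit.model <- lm(y ~ x, data = dat)
summary(fit.model)   # coefficients, R-squared, p-values
plot(fit.model)      # the built-in diagnostic plots (residuals, Q-Q, etc.)
```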
4. Analysis of residuals: Shapiro-Wilk test for normality, Breusch-Pagan test for heteroskedasticity.
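Both tests run on a fitted lm. shapiro.test() is in base R; bptest() comes from the lmtest package (install.packages("lmtest") if you don't have it). A sketch:

```r
# Hypothetical fit standing in for one of the real models.
set.seed(1)
dat <- data.frame(x = runif(50))
dat$y <- 3 * dat$x + rnorm(50)
fit <- lm(y ~ x, data = dat)

shapiro.test(residuals(fit))   # Shapiro-Wilk: H0 = residuals are normal

if (requireNamespace("lmtest", quietly = TRUE)) {
  lmtest::bptest(fit)          # Breusch-Pagan: H0 = errors are homoskedastic
} else {
  message("lmtest not installed; install.packages('lmtest')")
}
```

In both cases a small p-value means the assumption (normality, homoskedasticity) is violated.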
5. ANOVA
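For the ANOVA step, anova() on one fitted model gives the sequential table; given two nested models it gives the F-test comparing them. A sketch with placeholder variables:

```r
set.seed(1)
dat <- data.frame(x = runif(40), g = rep(c("a", "b"), 20))
dat$y <- 2 * dat$x + (dat$g == "b") + rnorm(40)

fit.full    <- lm(y ~ x + g, data = dat)
fit.reduced <- lm(y ~ x, data = dat)

anova(fit.full)               # sequential ANOVA table for one model
anova(fit.reduced, fit.full)  # F-test: does adding g improve the fit?
```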
6. Generated influence objects but didn't do much with them. These get you into stuff like hat matrices, which you are probably not expected to know. Use them as you please.
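If you do want to poke at them, the base-R entry points look like this (using the built-in cars dataset as a stand-in):

```r
fit <- lm(dist ~ speed, data = cars)   # cars ships with R

inf <- influence.measures(fit)   # DFBETAS, DFFITS, Cook's distance, hat values
summary(inf)                     # flags the potentially influential points
hatvalues(fit)                   # leverages: diagonal of the hat matrix
cooks.distance(fit)              # per-observation influence on the fit
```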
7. gvlma, which is diagnostics made easy for linear models. Gives you real values and p-values for kurtosis, skew, etc. Plots them too; again nothing fancy, just the built-in plot(gvlma.model.name).
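The gvlma usage is one call on a fitted lm (install.packages("gvlma") if needed); it reports a statistic and p-value for the global test plus skewness, kurtosis, link function, and heteroskedasticity. A sketch:

```r
fit <- lm(dist ~ speed, data = cars)   # built-in dataset as a stand-in

if (requireNamespace("gvlma", quietly = TRUE)) {
  library(gvlma)
  gvlma.model <- gvlma(fit)
  summary(gvlma.model)   # table of values and p-values per assumption
  plot(gvlma.model)      # the built-in diagnostic plots
} else {
  message("gvlma not installed; install.packages('gvlma')")
}
```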
------------------
Some notes:
- If you want to get involved with extracting data from the fit models, be they linear models, ANOVA, whatever, DO NOT try to extract from the raw model. Coerce it into a data frame first:
new.object = as.data.frame(linear.fit.or.anova.or.whatever)
str(new.object)
Then use $ or [ , ] or whatever to extract from that as usual. This is especially true for gvlma models.
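A worked example of that coerce-then-extract pattern, again on the built-in cars dataset:

```r
fit <- lm(dist ~ speed, data = cars)

an    <- anova(fit)
an.df <- as.data.frame(an)   # plain data frame, safe to index
str(an.df)

an.df$"Pr(>F)"               # p-values as an ordinary column
an.df["speed", "F value"]    # usual [row, column] indexing
```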
- cor(x, y) does not create a model, but gives you the correlation coefficient for the data, which is difficult to access from within an lm model. (Note that cor() takes two vectors, not a formula.)
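For example:

```r
r <- cor(cars$speed, cars$dist)   # Pearson correlation coefficient
r
# For simple regression, r^2 equals the R-squared of lm(dist ~ speed):
r^2
```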
- confint(model.name) gives you confidence intervals for the model coefficients, which look very pretty on graphs.
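For example:

```r
fit <- lm(dist ~ speed, data = cars)
ci <- confint(fit, level = 0.95)   # matrix: one row per coefficient
ci
# Handy for drawing error bars, e.g. with segments() or arrows().
```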
- I suggest you copy and paste the code into whatever you're putting together.