## R/Multiple regression, analysis in biology :D

TranceNova, 3 years ago

@blues Okay.. so basically I'll attach the data in the next thread, but the assignment basically says "analyse this" (with some rough hints that we should use regression).

1. TranceNova

Some data...

2. TranceNova

Oh, this may help explain the data a bit: The response variable is the number of seeds produced from each plant, and there are two factors: location (Yanchep and Mundaring) and time of controlled burn (spring and autumn). There is also a quantitative variable (length of the plant flower spike) that may help predict the response. The flower spike length is measured in metres and the seed count is the total number of seeds produced.

3. TranceNova

So I figure at the end of the day I will need some sort of model for just (a) seeds vs. spike length (b) seeds vs. spike length in terms of location (c) seeds vs. spike length in terms of burn and (d) seeds vs. spike length in terms of both location and burn. P.S. I'll forgive you if you run away now haha
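For anyone reading along, here is a rough sketch of how models (a) to (d) could be written with `lm()`. The data frame `seeds_df`, its column names (`seeds`, `spike`, `location`, `burn`) and all the numbers are assumptions made up for illustration; the values are simulated and are not the assignment data.

```r
# Simulated stand-in data; NOT the assignment data set.
set.seed(1)
n <- 80
seeds_df <- data.frame(
  location = factor(rep(c("Yanchep", "Mundaring"), each = n / 2)),
  burn     = factor(rep(c("spring", "autumn"), times = n / 2)),
  spike    = runif(n, min = 0.05, max = 0.30)  # flower spike length in metres
)
# Arbitrary made-up relationship, only so the models have something to fit.
seeds_df$seeds <- round(50 + 400 * seeds_df$spike +
                        20 * (seeds_df$location == "Yanchep") +
                        rnorm(n, sd = 15))

# (a) seeds vs. spike length only
fit_a <- lm(seeds ~ spike, data = seeds_df)
# (b) spike length plus location
fit_b <- lm(seeds ~ spike + location, data = seeds_df)
# (c) spike length plus time of burn
fit_c <- lm(seeds ~ spike + burn, data = seeds_df)
# (d) spike length plus both factors
fit_d <- lm(seeds ~ spike + location + burn, data = seeds_df)

summary(fit_d)       # coefficient table for the fullest model
anova(fit_a, fit_d)  # F-test: do the two factors improve on model (a)?
```

Using `+` gives each group its own intercept but a shared slope; writing `spike * location` instead would also let the slope differ between locations. Which form suits the assignment is a judgment call.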

4. blues

Do you have an R script you have been working on and if so could you upload it?

5. TranceNova

Hmm okay, here's what I have so far..

6. TranceNova

But I have two friends' ones (which have used different data) - I don't know if they are right, and they have been somewhat confusing me more, but I'll attach them anyway... (the tom one is really hard to read)

7. blues

OK, and as usual, what part of it would you like most help with - understanding the statistical procedures or dealing with R, which can be a right pain in the [whatever the Aussie word for that is]?

8. TranceNova

Lol.. a pain is the aussie word :P herm, okay dealing with it in R is the main problem, though I lack the statistics understanding too (to some extent)..why don't we just start with trying to work out a model for just location?

9. blues

Cool, I have to commute at the moment but I will get back to you on this. Let's say by midnight my time, noon yours? Meanwhile, I suggest looking at this tutorial for regression in R: http://data.princeton.edu/R/linearModels.html

10. TranceNova

okay, sounds good!

11. blues

Workflow through the script:

(1) Subset the data into these:

- Spring (containing both Mundaring and Yanchep)
- Autumn (containing both Mundaring and Yanchep)
- Mundaring (containing both spring and autumn)
- Yanchep (containing both spring and autumn)
- Spring & Mundaring
- Spring & Yanchep
- Autumn & Mundaring
- Autumn & Yanchep

(2) Made some density plots to get an idea what affects what. Don't know if you're into density plots, but they are like histograms for continuous data and they are more appropriate for looking at this data than histograms.

(3) Linear models across all eight of the subsets above as well as the unsubsetted data. Generated some plots of those models just by running plot(fit.model.name). Did not get into extracting data from the fit models and plotting it manually.

(4) Analysis of residuals: Shapiro-Wilk test for normality, Breusch-Pagan test for heteroskedasticity.

(5) ANOVA.

(6) Generated but didn't do much with influence objects. These get you into stuff like hat matrices which you are probably not expected to know. Use it as you please.

(7) GVLMA, which is diagnostics made easy for linear models. Gives you real values and p values for kurtosis, skew, etc. Plots them, again nothing fancy, just the built-in plot(gvlma.model.name).

------------------

Some notes:

- If you want to get involved with extracting data from the fit models, be they linear models, anova, whatever, DO NOT try to extract from the raw model. Coerce it into a data frame using new.object = as.data.frame(linear fit or anova or whatever), check it with str(new.object), then use $ or [ , ] or whatever to extract from that as usual. This is especially true for gvlma models.
- cor(dependent, independent) does not create models but gives you correlation coefficients for the data, which are difficult to access within an lm model.
- confint(model name) gives you confidence intervals for the model, which look very pretty on graphs.
- I suggest you copy and paste the code into whatever you're putting together.
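The workflow above could look roughly like this in R. This is a sketch, not the attached script: the data frame `dat`, its column names and the simulated values are all assumptions for illustration. The Breusch-Pagan and gvlma steps rely on the add-on packages `lmtest` and `gvlma`, so they are skipped if those are not installed.

```r
# Simulated stand-in data; NOT the assignment data set.
set.seed(42)
n <- 120
dat <- data.frame(
  location = factor(rep(c("Yanchep", "Mundaring"), each = n / 2)),
  burn     = factor(rep(c("spring", "autumn"), times = n / 2)),
  spike    = runif(n, 0.05, 0.30)
)
dat$seeds <- 60 + 350 * dat$spike + rnorm(n, sd = 12)

# 1. Subsets: by one factor, or by both factors at once with split().
spring   <- subset(dat, burn == "spring")
by_group <- split(dat, list(dat$burn, dat$location))

# 2. Density plot of seed counts for one subset.
dens <- density(spring$seeds)
plot(dens, main = "Seeds, spring burns")

# 3. Linear model on the full data, and one per burn/location subset.
fit      <- lm(seeds ~ spike, data = dat)
sub_fits <- lapply(by_group, function(d) lm(seeds ~ spike, data = d))

# 4. Residual checks. shapiro.test() is in base R (stats package).
sw <- shapiro.test(residuals(fit))
# Breusch-Pagan test comes from lmtest; skipped if it is not installed.
if (requireNamespace("lmtest", quietly = TRUE)) bp <- lmtest::bptest(fit)

# 5. ANOVA table for the fit, coerced to a plain data frame.
aov_tab <- as.data.frame(anova(fit))

# 6. Influence measures (hat values, Cook's distance, and so on).
infl <- influence.measures(fit)

# 7. gvlma diagnostics, again only if the package is installed.
if (requireNamespace("gvlma", quietly = TRUE)) gv <- gvlma::gvlma(fit)

# Notes: coefficient table as a data frame, a correlation, and intervals.
coef_tab <- as.data.frame(coef(summary(fit)))
r        <- cor(dat$spike, dat$seeds)
ci       <- confint(fit)
```

One caveat on the extraction tip: `as.data.frame()` works on an `anova()` table or on `coef(summary(fit))`, but it does not work when called directly on an `lm` object, so it is the summary tables that get coerced here. After that, `coef_tab$Estimate` or `aov_tab[1, ]` behave like any other data frame.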

12. blues

For completeness should probably give you the reformatted .txt data.

13. TranceNova

Lol, it was my luck that the day I needed the internet connection it completely broke... university wide (I bet someone is going to boil over that) Anyway! Thanks for your help, I did get there in the end but I bet you have a different answer, so I'll have a good read through it :)

14. blues

Taha - thanks for the medal! They're worth more from mods now, you know... Post up what you have (that is the script) if you'd like. And I'll glance through it. If you're interested there are other, more elegant ways of doing what I did and I will put them up when I get to it. Whenever that is. I think it is wonderful that you're learning R and also wonderful that R questions are being posted in biol, not computer science. We should decree that a biology forum policy...

15. blues

I also liked the "Sent via my boyfriend." Nice spin on "Sent via my smart phone."

16. TranceNova

Bwhahaha! I agree :P But R is less of a programmer's code and more of a biologist's ..um, code. I think I'll have to get really into the R code of this once I have written my lit review (bleah!) way too much to do. Lol, I actually don't know what he sent (there's no sent box on here), I just told Adam to go and send a message to blues saying I had no internet and couldn't meet, aaah he's got a good sense of humour :D Hmm I had better keep in mind that our medals are worth more, do you know how much more? (Though you deserve it for the effort you put into my question) :D And lol, after that all nighter I slept for about 14 hours after I got home from uni :P (but I think I did pretty well considering I did the all nighter then stayed at uni working till 4...) Nearly fell asleep on someone on the train though.

17. blues

Re: R. They turn up their noses at R for some reason. Perhaps the occasional R question in Biology will make some other people on the forum pick it up. Re medals: they are being tight lipped about the actual scoring algorithm. All I know is that medals from highly ranked teachers - that is, the high problem solving components - are worth more than other people's, and moderators' medals are worth most of all. As is well and proper. I wondered how long you slept. Worried me there, you did. :/ I didn't go all night - went to bed at 4, got up at 6. Pretty beaten and heading off myself soon. So post that code if you like and I'll catch up with it...

18. TranceNova

Sorry for worrying you! I probably should have sent a message before I fell into the black hole of sleep :P Hope you have a good sleep! Here is my code (apparently I didn't actually save the script, but luckily I did paste it into the appendix for my assignment lol!), don't kill yourself trying to do it, after all I have handed in my assignment now.