I need to perform ANOVA between a dependent variable and several categorical predictors to obtain the first-order (main, non-interaction) effects.
When I add more categorical predictors, Statistica reduces the sample (e.g. with one predictor the sample is 270 cases; with four predictors it drops to 266). This happens because the missing values (empty cells) fall on different cases for different predictors.
Is there an option to force Statistica to use all available data for each predictor?
If this were an experiment, I would have serious concerns about why there is no information on which group a record comes from. If this is an observational study and the information could not be collected, that is a different story.
Missing data in either X or Y is not allowed in ANOVA. For the predictors, ANOVA must know which group each case comes from so that it knows how to partition the data. So the right thing to do might be one of the following:
1. Figure out which group is correct and fill in the missing values.
2. Create a new group for unknown, i.e. replace the missing values with a new group.
3. Ignore those cases, as the tool is already doing.
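As a rough illustration of options 2 and 3 outside Statistica, here is a minimal Python sketch. The genotype calls, the expression values, and the helper names `recode_missing` and `drop_missing` are all made up for the example; `None` stands for an empty cell.

```python
# Hypothetical genotype calls for one SNP; None marks an empty cell.
genotypes = ["AA", "AG", None, "GG", "AG", None]
expression = [1.2, 0.9, 1.1, 0.7, 1.0, 0.8]


def recode_missing(groups, label="unknown"):
    """Option 2: put every missing call into its own group."""
    return [label if g is None else g for g in groups]


def drop_missing(groups, values):
    """Option 3: keep only the cases whose group is known."""
    kept = [(g, v) for g, v in zip(groups, values) if g is not None]
    return [g for g, _ in kept], [v for _, v in kept]


recoded = recode_missing(genotypes)
kept_groups, kept_values = drop_missing(genotypes, expression)
print(len(recoded), len(kept_groups))
```

Option 2 keeps every case but adds a level that has no biological meaning, while option 3 shrinks the sample. Which one is acceptable depends on why the calls are missing.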
The dependent variable is protein expression (the same for each SNP).
The categorical predictors are SNPs, extracted from BrainCloud.
Every SNP has some missing genotype data, and the missing cases differ from one SNP to another.
I need to perform ANOVA for each SNP.
The problem is that when I perform ANOVA with multiple categorical predictors, Statistica removes the cases with missing genotypes for any of the SNPs included as categorical predictors, so I have a reduced sample for all the ANOVAs performed. In other words, Statistica uses only subjects that have data for all the included predictors, even though I am not interested in second-order effects. I need just the main effects.
I have two choices: 1) perform a separate ANOVA for each SNP, of which there are 540
2) find a way to force Statistica to include all available data for each SNP in a multiple analysis
Could you help me?
It does not matter whether interactions are included in the model. If a predictor is included in the model as a main effect, it is required for the model. This is not unique to Statistica; the same would be true for any analysis tool. When multiple x factors are selected in a single model, each case needs a complete record on all of them.
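To see why the sample shrinks, here is a small Python sketch of this complete-record rule (often called listwise deletion). The four rows and the column names `snp1` and `snp2` are invented for illustration, and `complete_cases` is a hypothetical helper, not a Statistica function.

```python
# Hypothetical data: one row per subject, None marks a missing genotype.
rows = [
    {"y": 1.2, "snp1": "AA", "snp2": "CT"},
    {"y": 0.9, "snp1": None, "snp2": "CC"},
    {"y": 1.1, "snp1": "AG", "snp2": None},
    {"y": 0.7, "snp1": "GG", "snp2": "TT"},
]


def complete_cases(rows, predictors):
    """Keep the rows that have a value for every predictor in the model."""
    return [r for r in rows if all(r[p] is not None for p in predictors)]


n_snp1 = len(complete_cases(rows, ["snp1"]))          # model with snp1 only
n_snp2 = len(complete_cases(rows, ["snp2"]))          # model with snp2 only
n_both = len(complete_cases(rows, ["snp1", "snp2"]))  # one model with both
print(n_snp1, n_snp2, n_both)
```

Each single-predictor model loses only its own missing case, while the joint model loses both, which is the same pattern as the drop from 270 to 266.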
Separate one-way ANOVA models would not require data on the remaining x variables that are not included in the model. This is a different type of model and should be specified as such. I believe you were trying to say that you have 540 SNPs. If so, then you would have 540 separate analyses.
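For readers who want to check the idea outside Statistica, the following Python sketch runs one one-way ANOVA per SNP, each on its own available cases. It computes only the F statistic from the usual between-group and within-group sums of squares; the data, the SNP names, and the function `one_way_f` are assumptions made for the example.

```python
from collections import defaultdict


def one_way_f(values, groups):
    """Return (F, n) for a one-way ANOVA, skipping cases whose group is None."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        if g is not None:
            by_group[g].append(v)
    n = sum(len(vs) for vs in by_group.values())
    k = len(by_group)
    grand_mean = sum(sum(vs) for vs in by_group.values()) / n
    means = {g: sum(vs) / len(vs) for g, vs in by_group.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(vs) * (means[g] - grand_mean) ** 2
                     for g, vs in by_group.items())
    ss_within = sum((v - means[g]) ** 2
                    for g, vs in by_group.items() for v in vs)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, n


# Hypothetical expression values and genotype calls (None = missing).
expression = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
snps = {
    "snp_a": ["AA", "AA", "AA", "AG", "AG", "AG", "GG", "GG", "GG"],
    "snp_b": ["CC", "CC", None, "CT", "CT", "CT", "TT", None, "TT"],
}

# One separate model per SNP, so each keeps all of its own non-missing cases.
results = {name: one_way_f(expression, calls) for name, calls in snps.items()}
for name, (f_stat, n) in results.items():
    print(name, round(f_stat, 3), n)
```

Because every SNP is analysed in its own model, a missing call in one SNP never removes a subject from the analysis of another.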
The 540 separate one-way ANOVAs can be run with a macro using the code below. Two places will likely need updating: the line For i = 3 To 6, which selects the x variables, and the oAD2.Variables = line, where the 1 should be the position of your y variable.
Option Base 1
Sub Main
    ' i runs over the column positions of the x variables; one one-way ANOVA is run per value
    Dim i As Integer
    For i = 3 To 6
        Dim newanalysis As Analysis
        Set newanalysis = Analysis (scGLM, ActiveInputDataSet)
        Dim oStaDocs As StaDocuments
        ' General Linear Models (GLM): Adstudy.sta
        Dim oAD1 As STAGLM.GLMStartup
        Set oAD1 = newanalysis.Dialog
        ' GLM One-Way ANOVA: Adstudy.sta
        Dim oAD2 As STAGLM.GLMSpecifications
        Set oAD2 = newanalysis.Dialog
        ' The 1 in the line below is the position of the y variable
        oAD2.Variables = "1|" & i
        ' GLM Results 1: Adstudy.sta
        Dim oAD3 As STAGLM.GLMResults
        Set oAD3 = newanalysis.Dialog
        Set oStaDocs = oAD3.EffectSizeAndPower
        newanalysis.RouteOutput(oStaDocs).Visible = True
        ' Release the object references before the next pass
        Set oStaDocs = Nothing
        Set oAD1 = Nothing
        Set oAD2 = Nothing
        Set oAD3 = Nothing
    Next i
End Sub
Just a little more help: I need to generate all the related graphs at the same time. How can I do this?
Thank you very much
I found the command by myself, thank you very much again!