I am trying to fit a mixed linear model to the following data. I want to predict gambling from alcdep, with age and sex as covariates. I am trying to use statsmodels in Python, but I am unsure how to go about it.
So far I have tried:
md = smf.mixedlm("acldep ~ Gambling", data, groups=data["Gambling"])
But I keep getting errors, and I don't know how to specify the covariates this way.
Here is the head of the data:
{'IID': {0: 'Yale_0001', 1: 'Yale_0004', 2: 'Yale_0006', 3: 'Yale_0007', 4: 'Yale_0008'}, 'SEX': {0: 2, 1: 1, 2: 2, 3: 1, 4: 1}, 'AGE': {0: 27, 1: 39, 2: 41, 3: 45, 4: 44}, 'alcdep': {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}, 'Gambling': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1}, 'Zero': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'Yes': {0: 'Yes', 1: 'Yes', 2: 'Yes', 3: 'Yes', 4: 'Yes'}, 'PRS': {0: 0.053486584299999994, 1: 0.0304387435, 2: 0.00917773968, 3: 0.016352741100000002, 4: 7.433452840000001e-05}}
CodePudding user response:
I have modified your data slightly because what you gave results in singular matrices.
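The reason is visible in the posted sample itself: every row has the same alcdep value and the same Gambling value, so the predictor cannot be told apart from the intercept. A minimal, dependency-free sketch of that check, using only the four relevant columns copied from the question (the names head and constant are just for illustration):

```python
# The four relevant columns from the head of the data in the question
head = {
    'SEX': {0: 2, 1: 1, 2: 2, 3: 1, 4: 1},
    'AGE': {0: 27, 1: 39, 2: 41, 3: 45, 4: 44},
    'alcdep': {0: 2, 1: 2, 2: 2, 3: 2, 4: 2},
    'Gambling': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1},
}

# A column with a single distinct value carries no information for the fit
constant = [name for name, col in head.items() if len(set(col.values())) == 1]
print(constant)  # ['alcdep', 'Gambling']
```

A constant predictor is collinear with the intercept column, which is what makes the design matrix rank deficient.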
You were almost there, but forgot a few things. So, with this data:
data = {'IID': {0: 'Yale_0001', 1: 'Yale_0004', 2: 'Yale_0006', 3: 'Yale_0007', 4: 'Yale_0008'}, 'SEX': {0: 2, 1: 1, 2: 2, 3: 1, 4: 1}, 'AGE': {0: 27, 1: 39, 2: 41, 3: 45, 4: 44}, 'alcdep': {0: 2, 1: 2, 2: 2, 3: 1, 4: 1}, 'Gambling': {0: 1, 1: 1, 2: 2, 3: 1, 4: 2}, 'Zero': {0: 0, 1: 0, 2: 0, 3: 1, 4: 0}, 'Yes': {0: 'Yes', 1: 'Yes', 2: 'Yes', 3: 'Yes', 4: 'Yes'}, 'PRS': {0: 0.053486584299999994, 1: 0.0304387435, 2: 0.00917773968, 3: 0.016352741100000002, 4: 7.433452840000001e-05}}
you can do this:
import pandas as pd
import statsmodels.formula.api as smf

# Build a DataFrame from the dictionary above
df = pd.DataFrame(data)

# Gambling as a fixed effect, with a random intercept per Gambling level
md = smf.mixedlm("alcdep ~ Gambling", data=df, groups="Gambling").fit()
print(md.summary())
which gives:
            Mixed Linear Model Regression Results
=============================================================
Model:             MixedLM    Dependent Variable:  alcdep
No. Observations:  5          Method:              REML
No. Groups:        2          Scale:               0.3889
Min. group size:   2          Log-Likelihood:      -3.7360
Max. group size:   3          Converged:           Yes
Mean group size:   2.5
-------------------------------------------------------------
               Coef.  Std.Err.       z  P>|z|  [0.025  0.975]
-------------------------------------------------------------
Intercept      1.833     1.630   1.125  0.261  -1.362   5.028
Gambling      -0.167     1.050  -0.159  0.874  -2.224   1.891
Gambling Var   0.389
=============================================================
To include additional independent variables (covariates), say SEX, add them to the formula with +:
md = smf.mixedlm("alcdep ~ Gambling + C(SEX)", data=df, groups="Gambling").fit()
print(md.summary())
which gives:
            Mixed Linear Model Regression Results
=============================================================
Model:             MixedLM    Dependent Variable:  alcdep
No. Observations:  5          Method:              REML
No. Groups:        2          Scale:               0.2857
Min. group size:   2          Log-Likelihood:      -2.5581
Max. group size:   3          Converged:           Yes
Mean group size:   2.5
-------------------------------------------------------------
               Coef.  Std.Err.       z  P>|z|  [0.025  0.975]
-------------------------------------------------------------
Intercept      1.714     1.400   1.225  0.221  -1.029   4.458
C(SEX)[T.2]    0.714     0.495   1.443  0.149  -0.256   1.684
Gambling      -0.286     0.904  -0.316  0.752  -2.057   1.485
Gambling Var   0.286
=============================================================
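The question also asked about AGE. Further covariates follow the same pattern: each one becomes another term in the formula, and categorical ones are wrapped in C(). A minimal sketch of assembling such a formula string, where build_formula is a hypothetical helper written for this illustration and not part of statsmodels:

```python
def build_formula(outcome, predictors, categorical=()):
    """Join predictors into a formula string, wrapping categorical ones in C()."""
    terms = [f"C({name})" if name in categorical else name for name in predictors]
    return f"{outcome} ~ " + " + ".join(terms)

# Outcome and covariates taken from the question; SEX is treated as categorical
formula = build_formula("alcdep", ["Gambling", "AGE", "SEX"], categorical={"SEX"})
print(formula)  # alcdep ~ Gambling + AGE + C(SEX)
```

The resulting string can be passed to smf.mixedlm in place of the literal formula used above. With only five rows, adding more terms may well run into singular matrices again, so this is better tried on the full data set.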

