glm {SparkR} | R Documentation |
Fits a generalized linear model, similarly to R's glm().
glm(formula, family = gaussian, data, weights, subset,
    na.action, start = NULL, etastart, mustart, offset,
    control = list(...), model = TRUE, method = "glm.fit",
    x = FALSE, y = TRUE, contrasts = NULL, ...)

## S4 method for signature 'formula,ANY,SparkDataFrame'
glm(formula, family = gaussian, data, epsilon = 1e-06, maxit = 25,
    weightCol = NULL)
formula |
a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'. |
family |
a description of the error distribution and link function to be used in the model.
This can be a character string naming a family function, a family function or
the result of a call to a family function. Refer to the R family documentation at
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html.
Currently these families are supported: |
data |
a SparkDataFrame or R's glm data for training. |
weights |
an optional vector of ‘prior weights’ to be used
in the fitting process. Should be |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen
when the data contain |
start |
starting values for the parameters in the linear predictor. |
etastart |
starting values for the linear predictor. |
mustart |
starting values for the vector of means. |
offset |
this can be used to specify an a priori known
component to be included in the linear predictor during fitting.
This should be |
control |
a list of parameters for controlling the fitting
process. For |
model |
a logical value indicating whether the model frame should be included as a component of the returned value. |
method |
the method to be used in fitting the model. The default
method is "glm.fit". User-supplied fitting functions can be supplied either as a function
or a character string naming a function, with a function which takes
the same arguments as |
x,y |
For |
contrasts |
an optional list. See the |
... |
For For |
epsilon |
positive convergence tolerance of iterations. |
maxit |
integer giving the maximal number of IRLS iterations. |
weightCol |
the weight column name. If this is not set or |
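The formula operators listed for the formula argument can be explored without a Spark session. The following is a minimal sketch in base R only: it uses stats::terms() to show which terms '.', '-', '+' and ':' select. The data frame d and the helper term_labels are hypothetical names introduced for illustration, and the sketch assumes that SparkR interprets these operators in the same way as base R for simple numeric columns.

```r
# Base R only: inspect how formula operators expand into model terms.
# `d` and `term_labels` are illustrative names, not part of SparkR.
d <- data.frame(
  y  = c(1.2, 2.3, 3.1, 4.8),
  x1 = c(1, 2, 3, 4),
  x2 = c(0.5, 0.1, 0.9, 0.3),
  x3 = c(2, 1, 2, 1)
)

# Return the term labels that a formula expands to, given the columns of d.
term_labels <- function(f) attr(terms(f, data = d), "term.labels")

# '.' stands for every column except the response; '-' removes a term.
all_but_x3 <- term_labels(y ~ . - x3)

# '+' adds terms; ':' adds the interaction between two terms.
with_interaction <- term_labels(y ~ x1 + x2 + x1:x2)

print(all_but_x3)
print(with_interaction)
```

The same formulas could then be passed to glm() with a SparkDataFrame as data, provided the column names match.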
glm returns a fitted generalized linear model.
glm since 1.5.0
## Not run:
library(SparkR)
sparkR.session()
data(iris)
# createDataFrame() replaces "." in column names with "_" (Sepal.Length becomes Sepal_Length)
df <- createDataFrame(iris)
model <- glm(Sepal_Length ~ Sepal_Width, df, family = "gaussian")
summary(model)
## End(Not run)
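The family argument accepts three forms: a character string, a family function, or the result of a call to a family function. As a rough illustration, the sketch below fits the same model three ways with base R's stats::glm() on the built-in mtcars data, so it runs without Spark. The choice of mtcars and the variable names are assumptions made for this example, and the SparkDataFrame method is only assumed to resolve the three forms the same way.

```r
# Base R only (stats::glm); no Spark session is needed for this sketch.
data(mtcars)

# 1. family given as a character string naming a family function
fit_chr <- stats::glm(mpg ~ wt, family = "gaussian", data = mtcars)

# 2. family given as the family function itself
fit_fun <- stats::glm(mpg ~ wt, family = gaussian, data = mtcars)

# 3. family given as the result of a call to the family function
fit_call <- stats::glm(mpg ~ wt, family = gaussian(), data = mtcars)

# All three specifications should yield the same coefficients.
same_fit <- isTRUE(all.equal(coef(fit_chr), coef(fit_fun))) &&
  isTRUE(all.equal(coef(fit_fun), coef(fit_call)))
print(same_fit)
```

Calling the family function explicitly, as in the third form, is the one that also allows a non-default link to be passed, for example binomial(link = "probit") in base R.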