% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/vw.R
\name{vw}
\alias{vw}
\title{Trains Vowpal Wabbit models from R.}
\usage{
vw(training_data, validation_data, model = "mdl.vw",
  path_vw_data_train = NULL, path_vw_data_val = NULL, target = NULL,
  namespaces = NULL, weight = NULL, tag = NULL, out_probs = NULL,
  validation_labels = NULL, loss = "logistic", b = 25,
  learning_rate = 0.5, passes = 1, l1 = NULL, l2 = NULL,
  early_terminate = NULL, link_function = "--link=logistic", extra = NULL,
  do_evaluation = TRUE, use_perf = TRUE, plot_roc = TRUE,
  verbose = TRUE)
}
\arguments{
\item{training_data}{a \code{data.frame} or the path to a vw data file}

\item{validation_data}{a \code{data.frame} or the path to a vw data file}

\item{model}{name of the model file}

\item{path_vw_data_train}{if training_data is a \code{data.frame}, the path where the
vw data file is saved. If NULL, the data is stored in a temporary folder and deleted before
exiting the function}

\item{path_vw_data_val}{if validation_data is a \code{data.frame}, the path where the
vw data file is saved. If NULL, the data is stored in a temporary folder and deleted before
exiting the function}

\item{target}{if training_data or validation_data is a \code{data.frame}, the name of the
column in the \code{data.frame} that holds the target variable}

\item{namespaces}{used only if training_data or validation_data is a \code{data.frame}. See the
arguments of \code{dt2vw}}

\item{weight}{used only if training_data or validation_data is a \code{data.frame}. See the
arguments of \code{dt2vw}}

\item{tag}{used only if training_data or validation_data is a \code{data.frame}. See the
arguments of \code{dt2vw}}

\item{out_probs}{path to the file in which the predictions are saved. If NULL, the predictions
are written to a temporary file that is deleted before exiting the function.}

\item{validation_labels}{path to the file holding the true labels of the validation data, used to
compute the AUC with perf or with roc_auc() from the R package pROC. If the validation data is a
\code{data.frame} and validation_labels is NULL, the validation labels file is deleted before
exiting the function. If validation_labels is not NULL, it gives the path where the validation
labels are stored.}

\item{loss}{loss function. By default logistic.}

\item{b}{number of bits for the weight vector allocation}

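\item{learning_rate}{learning rate (vw option \code{--learning_rate})}

\item{passes}{number of passes over the training data (vw option \code{--passes})}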

\item{l1}{l1 regularization}

\item{l2}{l2 regularization}

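\item{early_terminate}{number of passes tolerated without a decrease in holdout loss before
training stops early (vw option \code{--early_terminate})}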

\item{link_function}{used to generate predictions}

\item{extra}{additional vw command-line options, passed as a text string}

\item{do_evaluation}{TRUE to compute the AUC on validation_data. Use FALSE to only score the data.}

\item{use_perf}{TRUE to compute the AUC with perf. Otherwise, auc_roc() from the R package pROC is used.}

\item{verbose}{if TRUE, prints the AUC and the vw command used to train the model. Mostly useful for debugging.}

\item{interactions}{Add interaction terms. Can be passed in extra also.}
}
\description{
This function is fairly simple and could be extended to other problems, but so far it only supports
binary classification. It is intended to be used in conjunction with perf to compute validation
metrics on held-out datasets.
See osmot.cs.cornell.edu/kddcup/software.html for more information about perf.
}
\examples{
# 1. Create a training set (training_data) and a validation set (validation_data) in vw format.
# 2. Install perf.
# 3. Create a vector of true labels for the validation dataset, coded as 0 or 1, the format perf expects.
# 4. Train one model with vw():

\dontrun{
auc = vw(training_data='X_train.vw', validation_data='X_valid.vw',
        loss='logistic', model='mdl.vw', b=25, learning_rate=0.5,
        passes=20, l1=1e-08, l2=1e-08, early_terminate=2,
        interactions=NULL, extra='--stage_poly')
}
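
# A hedged sketch of the data.frame workflow described in the arguments above.
# Assumptions not confirmed by this page: df_train and df_valid are hypothetical
# data.frames with a 0/1 column named 'y', and the default namespaces = NULL is
# accepted when the data is converted to vw format. The column names are
# illustrative only.
\dontrun{
set.seed(1)
df_train = data.frame(y = rbinom(100, 1, 0.5), x1 = rnorm(100), x2 = rnorm(100))
df_valid = data.frame(y = rbinom(50, 1, 0.5), x1 = rnorm(50), x2 = rnorm(50))

# use_perf = FALSE requests the pROC-based AUC instead of perf
auc = vw(training_data = df_train, validation_data = df_valid,
        target = 'y', model = 'mdl.vw', loss = 'logistic',
        use_perf = FALSE, plot_roc = FALSE, verbose = TRUE)
}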
}

