Multinomial logistic regression

In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).

Multinomial logistic regression is known by a variety of other names, including polytomous LR, multiclass LR, softmax regression, multinomial logit, the maximum entropy (MaxEnt) classifier, and the conditional maximum entropy model.

Softmax function

In mathematics, in particular probability theory and related fields, the softmax function, or normalized exponential, is a generalization of the logistic function that "squashes" a K-dimensional vector \(\mathbf{z}\) of arbitrary real values to a K-dimensional vector \(\sigma(\mathbf{z})\) of real values in the range (0, 1) that add up to 1. The function is given by

$$ \sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \quad \text{for } j = 1, \ldots, K. $$
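The definition above can be sketched directly in NumPy. The max-subtraction step is a standard numerical-stability trick (it leaves the result unchanged, since it cancels in the ratio) and is an implementation detail not part of the mathematical definition:

```python
import numpy as np

def softmax(z):
    """Map a K-dimensional real vector to a probability vector in (0, 1) summing to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp.
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(np.array([1.0, 2.0, 3.0]))
```

Larger inputs receive larger probabilities, and the outputs always sum to 1 regardless of the scale of the input vector.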

Artificial neural networks

In neural network simulations, the softmax function is often implemented at the final layer of a network used for classification. Such networks are then trained under a log loss (or cross-entropy) regime, giving a non-linear variant of multinomial logistic regression.
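The combination of a softmax output layer and a log-loss objective can be sketched as follows; this is a minimal illustration of the loss computation only (the helper names are ours, not from any particular framework):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(logits, true_class):
    """Log loss: negative log of the softmax probability assigned to the true class."""
    return -np.log(softmax(logits)[true_class])

# Loss is small when the network puts a large logit on the true class.
loss = cross_entropy(np.array([2.0, 1.0, 0.1]), true_class=0)
```

Training then consists of adjusting the network weights to reduce this loss averaged over the training set.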

Since the function maps a vector and a specific index i to a real value, the derivative needs to take the index into account:

$$ \frac{\partial}{\partial q_k}\sigma(\mathbf{q}, i) = \dots = \sigma(\mathbf{q}, i)(\delta_{ik} - \sigma(\mathbf{q}, k)) $$

Here, the Kronecker delta is used for simplicity (cf. the derivative of a sigmoid function, being expressed via the function itself).
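The derivative formula above can be checked numerically. In matrix form the full Jacobian is $J_{ik} = \sigma_i(\delta_{ik} - \sigma_k)$, which the sketch below compares against central finite differences (the test vector is an arbitrary choice):

```python
import numpy as np

def softmax(q):
    e = np.exp(q - np.max(q))
    return e / e.sum()

def softmax_jacobian(q):
    """Jacobian of softmax: J[i, k] = sigma_i * (delta_ik - sigma_k)."""
    s = softmax(q)
    # np.diag(s) contributes the delta_ik term; np.outer(s, s) the sigma_i * sigma_k term.
    return np.diag(s) - np.outer(s, s)

# Central finite-difference approximation of the same Jacobian.
q = np.array([0.5, -1.0, 2.0])
eps = 1e-6
numeric = np.empty((3, 3))
for k in range(3):
    dq = np.zeros(3)
    dq[k] = eps
    numeric[:, k] = (softmax(q + dq) - softmax(q - dq)) / (2 * eps)
```

Note that each row and column of the Jacobian sums to zero, reflecting the constraint that the softmax outputs always sum to 1.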

See Multinomial logit for a probability model which uses the softmax activation function.
