Description
Incorrect predictions when fitting a LogisticRegression model on binary outcomes with multi_class='multinomial'.
Steps/Code to Reproduce
from sklearn.linear_model import LogisticRegression
import sklearn.metrics
import numpy as np
# Set up a logistic regression object
lr = LogisticRegression(C=1000000, multi_class='multinomial',
solver='sag', tol=0.0001, warm_start=False,
verbose=0)
# Set independent variable values
Z = np.array([
[ 0. , 0. ],
[ 1.33448632, 0. ],
[ 1.48790105, -0.33289528],
[-0.47953866, -0.61499779],
[ 1.55548163, 1.14414766],
[-0.31476657, -1.29024053],
[-1.40220786, -0.26316645],
[ 2.227822 , -0.75403668],
[-0.78170885, -1.66963585],
[ 2.24057471, -0.74555021],
[-1.74809665, 2.25340192],
[-1.74958841, 2.2566389 ],
[ 2.25984734, -1.75106702],
[ 0.50598996, -0.77338402],
[ 1.21968303, 0.57530831],
[ 1.65370219, -0.36647173],
[ 0.66569897, 1.77740068],
[-0.37088553, -0.92379819],
[-1.17757946, -0.25393047],
[-1.624227 , 0.71525192]])
# Set dependent variable values
Y = np.array([1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 1, 1, 1, 0, 0, 1,
0, 0, 1, 1], dtype=np.int32)
lr.fit(Z, Y)
p = lr.predict_proba(Z)
print(sklearn.metrics.log_loss(Y, p)) # ...
print(lr.intercept_)
print(lr.coef_)
Expected Results
If we compare against R, or against scikit-learn itself with multi_class='ovr', the log loss (which is approximately proportional to the objective function, since the regularisation is made negligible through the choice of C) should come out to roughly 0.5922995.
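For comparison, here is a minimal sketch of that check (it reuses Z and Y from the reproduction above; the lr_ovr name is just illustrative):
from sklearn.linear_model import LogisticRegression
import sklearn.metrics
# Same data and same (negligible) regularisation, but one-vs-rest, which for
# two classes is ordinary binary logistic regression.
lr_ovr = LogisticRegression(C=1000000, multi_class='ovr',
                            solver='sag', tol=0.0001)
lr_ovr.fit(Z, Y)
p_ovr = lr_ovr.predict_proba(Z)
print(sklearn.metrics.log_loss(Y, p_ovr))  # roughly 0.5922995, per the figures above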
Actual Results
The actual log loss when using multi_class='multinomial' is 0.61505641264.
Further Information
See the Stack Exchange question https://stats.stackexchange.com/questions/306886/confusing-behaviour-of-scikit-learn-logistic-regression-multinomial-optimisation?noredirect=1#comment583412_306886 for more information.
The issue, it seems, is caused by https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/linear_model/logistic.py#L762. In the multinomial case, even if classes.size == 2, we cannot reduce to a 1D case by throwing away one of the two vectors of coefficients (as we can in ordinary binary logistic regression): the two-class softmax is a sigmoid of the difference of the two coefficient vectors, not of either one alone. This is essentially the difference between softmax (where the parametrisation is redundant) and plain logistic regression.
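To illustrate the point, here is a small numpy sketch (the helper functions below are purely illustrative, not scikit-learn code): a two-class softmax equals a sigmoid of the difference of the two coefficient rows, so it is not equivalent to a sigmoid of either row on its own.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def softmax_p1(W, b, X):
    # probability of class 1 under a two-row softmax model
    scores = X.dot(W.T) + b
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability only
    e = np.exp(scores)
    return e[:, 1] / e.sum(axis=1)

rng = np.random.RandomState(0)
X = rng.randn(5, 2)
W = rng.randn(2, 2)   # two coefficient rows, w0 and w1
b = rng.randn(2)

p_softmax = softmax_p1(W, b, X)
p_diff = sigmoid(X.dot(W[1] - W[0]) + (b[1] - b[0]))  # equivalent binary model
p_one_row = sigmoid(X.dot(W[1]) + b[1])               # keeping only w1, as the reduction does

print(np.allclose(p_softmax, p_diff))     # True: softmax == sigmoid of the difference
print(np.allclose(p_softmax, p_one_row))  # False in general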
This can be fixed by commenting out lines 762 and 763. I am apprehensive, however, that this may cause other unknown issues, which is why I am posting it as a bug.
Versions
Linux-4.10.0-33-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.19.0