Skip to content

Column info near attribute/method lookups is sometimes incorrect #94036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
brandtbucher opened this issue Jun 20, 2022 · 3 comments
Closed

Column info near attribute/method lookups is sometimes incorrect #94036

brandtbucher opened this issue Jun 20, 2022 · 3 comments
Assignees
Labels
3.11 only security fixes 3.12 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@brandtbucher
Copy link
Member

We perform some manual fiddling with positions for attribute lookups at a few different points in the compiler. However, the extended position information for these locations can be nonsensical in some edge cases. Here is one (admittedly contrived) example:

>>> def foo():
...     (bar.
... baz)
... 
>>> for instruction in dis.get_instructions(foo):
...     print(instruction.positions, instruction.opname)
... 
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=2, end_lineno=2, col_offset=5, end_col_offset=8) LOAD_GLOBAL
Positions(lineno=3, end_lineno=3, col_offset=5, end_col_offset=3) LOAD_ATTR
Positions(lineno=3, end_lineno=3, col_offset=5, end_col_offset=3) POP_TOP
Positions(lineno=3, end_lineno=3, col_offset=5, end_col_offset=3) LOAD_CONST
Positions(lineno=3, end_lineno=3, col_offset=5, end_col_offset=3) RETURN_VALUE

The last four instructions claim to be located on line 3, starting with at column 5 and ending at column 3 (which doesn't make sense).

A slight variation (using a method call) is even weirder:

>>> def foo():
...     (bar.
... baz(
... ))
... 
>>> for instruction in dis.get_instructions(foo):
...     print(instruction.positions, instruction.opname)
... 
Positions(lineno=1, end_lineno=1, col_offset=0, end_col_offset=0) RESUME
Positions(lineno=2, end_lineno=2, col_offset=5, end_col_offset=8) LOAD_GLOBAL
Positions(lineno=3, end_lineno=3, col_offset=None, end_col_offset=None) LOAD_ATTR
Positions(lineno=3, end_lineno=4, col_offset=None, end_col_offset=1) CALL
Positions(lineno=3, end_lineno=4, col_offset=None, end_col_offset=1) POP_TOP
Positions(lineno=3, end_lineno=4, col_offset=None, end_col_offset=1) LOAD_CONST
Positions(lineno=3, end_lineno=4, col_offset=None, end_col_offset=1) RETURN_VALUE

Here the col_offset is totally nonexistent, even in cases where end_col_offset is set! I suspect that this may be because the column math just happened to underflow the valid column range (>=0) and end up at -1, which is a special value we use to mean "missing".

@markshannon @pablogsal

@brandtbucher brandtbucher added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.11 only security fixes 3.12 only security fixes labels Jun 20, 2022
@15r10nk
Copy link
Contributor

15r10nk commented Jul 17, 2022

I found a similar case:

class files:
    def setdefault(self, a):
        return self


def test():
    files().setdefault(
        0
    ).setdefault(
        0
    )


import dis

for bc in dis.Bytecode(test):
    error = ""
    p = bc.positions
    if p.col_offset != 0 and p.end_col_offset != 0:
        # the end has to be after the start of the expression
        if (p.lineno, p.col_offset) >= (p.end_lineno, p.end_col_offset):
            error = "(error: start > end )"

    print(bc.positions, error, bc.opname, bc.argval)

output

Positions(lineno=6, end_lineno=6, col_offset=0, end_col_offset=0)  RESUME 0
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=9)  LOAD_GLOBAL files
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=11)  PRECALL 0
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=11)  CALL 0
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=22)  LOAD_METHOD setdefault
Positions(lineno=8, end_lineno=8, col_offset=8, end_col_offset=9)  LOAD_CONST 0
Positions(lineno=7, end_lineno=9, col_offset=4, end_col_offset=5)  PRECALL 1
Positions(lineno=7, end_lineno=9, col_offset=4, end_col_offset=5)  CALL 1
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=16)  LOAD_METHOD setdefault
Positions(lineno=10, end_lineno=10, col_offset=8, end_col_offset=9)  LOAD_CONST 0
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5) (error: start > end ) PRECALL 1
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5) (error: start > end ) CALL 1
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5) (error: start > end ) POP_TOP None
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5) (error: start > end ) LOAD_CONST None
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5) (error: start > end ) RETURN_VALUE None

The second method call is definitely wrong.
Maybe this helps.

@alexmojaki
Copy link

There's another kind of weirdness in the example above which doesn't feel like it's been mentioned yet and I don't know if the current PR fixes it. The positions for the CALL instructions for the two setdefault calls are:

Positions(lineno=7, end_lineno=9, col_offset=4, end_col_offset=5)
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=5)

My understanding is that fiddling has made lineno in the second call 9 (instead of 7) for a good traceback. But why is end_lineno also 9 instead of 11? Why do the positions span multiple lines for the first CALL but not the second?

@brandtbucher
Copy link
Member Author

There's another kind of weirdness in the example above which doesn't feel like it's been mentioned yet and I don't know if the current PR fixes it.

The PR fixes this (I found it while writing tests). We shouldn't be modifying the end locations at all.

For the record, here is the new output from the above script:

Positions(lineno=6, end_lineno=6, col_offset=0, end_col_offset=0)  RESUME 0
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=9)  LOAD_GLOBAL files
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=11)  CALL 0
Positions(lineno=7, end_lineno=7, col_offset=4, end_col_offset=22)  LOAD_ATTR setdefault
Positions(lineno=8, end_lineno=8, col_offset=8, end_col_offset=9)  LOAD_CONST 0
Positions(lineno=7, end_lineno=9, col_offset=4, end_col_offset=5)  CALL 1
Positions(lineno=9, end_lineno=9, col_offset=6, end_col_offset=16)  LOAD_ATTR setdefault
Positions(lineno=10, end_lineno=10, col_offset=8, end_col_offset=9)  LOAD_CONST 0
Positions(lineno=9, end_lineno=11, col_offset=6, end_col_offset=5)  CALL 1
Positions(lineno=9, end_lineno=11, col_offset=6, end_col_offset=5)  POP_TOP None
Positions(lineno=9, end_lineno=11, col_offset=6, end_col_offset=5)  LOAD_CONST None
Positions(lineno=9, end_lineno=11, col_offset=6, end_col_offset=5)  RETURN_VALUE None

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants