Skip to content

CSV write has SEGV when trying to write data 2GB or larger #129409

Closed
@bdrosen96

Description

@bdrosen96

Crash report

What happened?

import csv


bad_size = 2 * 1024 * 1024 * 1024 + 1
val = 'x' * bad_size


print("Total size of data {}".format(len(val)))
for size in [2147483647, 2147483648, 2147483649]:
    data = val[0:size]
    print("Trying to write data of size {}".format(len(data)))
    with open('dump.csv', 'w', newline='') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',',
                                quotechar='|', quoting=csv.QUOTE_MINIMAL)
        spamwriter.writerow([data])
 python dump2.py 
Total size of data 2147483649
Trying to write data of size 2147483647
Trying to write data of size 2147483648
Segmentation fault (core dumped)

This happens with both 3.10 and 3.12

When I reproduce this with python-dbg inside of gdb I see the following:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352495104) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737352495104) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737352495104, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7c2871b in __assert_fail_base (fmt=0x7ffff7ddd130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x814bed "index >= 0", file=0x7ee008 "../Include/cpython/unicodeobject.h", line=318, 
    function=<optimized out>) at ./assert/assert.c:92
#6  0x00007ffff7c39e96 in __GI___assert_fail (assertion=assertion@entry=0x814bed "index >= 0", 
    file=file@entry=0x7ee008 "../Include/cpython/unicodeobject.h", line=line@entry=318, 
    function=function@entry=0x97ae28 <__PRETTY_FUNCTION__.4.lto_priv.56> "PyUnicode_READ") at ./assert/assert.c:101
#7  0x00000000006d46c3 in PyUnicode_READ (index=-2147483648, data=0x7ffe772fd058, kind=1)
    at ../Include/cpython/unicodeobject.h:318
#8  join_append_data (self=self@entry=0x7ffff74a0050, field_kind=field_kind@entry=1, 
    field_data=field_data@entry=0x7ffe772fd058, field_len=field_len@entry=2147483648, quoted=quoted@entry=0x7fffffffd0ec, 
    copy_phase=copy_phase@entry=0) at ../Modules/_csv.c:1108
#9  0x00000000006d49ea in join_append (self=self@entry=0x7ffff74a0050, 
    field=field@entry='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', quoted=<optimized out>, quoted@entry=0) at ../Modules/_csv.c:1213
#10 0x00000000006d4c9a in csv_writerow (self=self@entry=0x7ffff74a0050, 
    seq=seq@entry=['xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']) at ../Modules/_csv.c:1303
#11 0x000000000062b002 in _PyEval_EvalFrameDefault (tstate=0xcd8d80 <_PyRuntime+475008>, frame=0x7ffff7fb0020, throwflag=0)
    at Python/bytecodes.c:3094

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.12.8 (main, Dec 4 2024, 08:54:12) [GCC 11.4.0]

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    3.12only security fixes3.13bugs and security fixes3.14bugs and security fixesextension-modulesC modules in the Modules dirtype-crashA hard crash of the interpreter, possibly with a core dump

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions