This repository was archived by the owner on Jul 22, 2019. It is now read-only.

Bug when used with multiprocessing #313

Open
kevinjqiu opened this issue Mar 6, 2017 · 3 comments

Comments


kevinjqiu commented Mar 6, 2017

There appears to be a race condition. The bug exists in both Python 2.x and Python 3.x, although it manifests differently in each.

Minimal code to reproduce the bug:

import couchdb
import multiprocessing
import multiprocessing.pool


server = couchdb.Server('http://couchdb_host:5984/')
try:
    database = server.create('test')
except Exception:
    # The database is left over from a previous run; recreate it from scratch.
    server.delete('test')
    database = server.create('test')

database.save({'_id': '1', 'type': 'dog', 'name': 'chase'})
database.save({'_id': '2', 'type': 'dog', 'name': 'rubble'})
database.save({'_id': '3', 'type': 'cat', 'name': 'kali'})

def query_id(id):
    # Fetch a single document via the module-level Database handle, which the
    # worker processes inherit from the parent.
    return dict(database[id])

def main():
    pool = multiprocessing.pool.Pool(3)

    docs = pool.map(query_id, ['1', '2', '3'])
    print(docs)


if __name__ == '__main__':
    main()

Observation 1:
When run on Python 2.x, the following error is encountered:

$ python bug.py 
Traceback (most recent call last):
  File "bug.py", line 54, in <module>
    main()
  File "bug.py", line 46, in main
    docs = pool.map(query_id, ['1', '2', '3'])
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: 'ResponseBody' object is not iterable

Observation 2:
When run on Python 3.x, the execution hangs, and when you press Ctrl+C to terminate the program, the following stack trace is printed:

[ ... ]
    headers=headers, **params)
  File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/home/kevin/src/couchdb-python/couchdb/http.py", line 593, in _request
    credentials=self.credentials)
  File "/usr/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/kevin/src/couchdb-python/couchdb/http.py", line 402, in request
    data = resp.read()
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.6/http/client.py", line 462, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.6/http/client.py", line 612, in _safe_read
    chunk = self.fp.read(min(amt, MAXAMOUNT))
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt
KeyboardInterrupt

Observation 3:
If I change the pool size to 1 (essentially serializing the GET operations), the bug does not occur. The same happens when I try to debug it with Visual Studio Code (whose debugger practically blocks the execution of the other processes): the code runs without issue.
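
A possible workaround based on this observation (a sketch I have not verified against this bug) is to give each worker its own Server/Database handle, created after the fork via the Pool initializer, instead of sharing the module-level one:

import couchdb
import multiprocessing.pool

database = None

def init_worker():
    # Runs once inside each worker process, so every worker builds its own
    # connection rather than inheriting the parent's.
    global database
    database = couchdb.Server('http://couchdb_host:5984/')['test']

def query_id(id):
    return dict(database[id])

if __name__ == '__main__':
    pool = multiprocessing.pool.Pool(3, initializer=init_worker)
    print(pool.map(query_id, ['1', '2', '3']))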

Observation 4:
If I run a proxy server in front of CouchDB (e.g., HAProxy), the code runs without issue.

djc (Owner) commented Mar 6, 2017

Have you thought about the possibility that there is a CouchDB bug, rather than a bug in CouchDB-Python? In particular, I think observation 4 (thanks for the detailed report!) suggests that the bug might not be in CouchDB-Python.

My other thought is that this might have to do with the connection pooling we're doing in couchdb.http.
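
To illustrate the suspicion (a minimal sketch on a Unix system, not couchdb-python code): a socket opened before fork is inherited by every worker, so whichever process reads first consumes a response that may belong to another request:

import os
import socket

# One connected socket created in the parent, standing in for a pooled HTTP
# connection that the client keeps open for reuse.
client, server = socket.socketpair()

# The "server" queues two responses, one intended for the parent and one for
# a child worker.
server.sendall(b'response A\n')
server.sendall(b'response B\n')

if os.fork() == 0:                 # child worker, inherits the same socket
    print('child read:', client.recv(11))
    os._exit(0)

os.waitpid(-1, 0)
print('parent read:', client.recv(11))
# Whichever process calls recv() first consumes 'response A'; the other gets
# data that was never meant for it, which is the essence of the race.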

One question I have: when you run this test case 100 times (or 10), does it fail every time? My expectation would be for it to be intermittent.

kevinjqiu (Author) commented Mar 6, 2017

Hi @djc

Have you thought about the possibility that there is a CouchDB bug

On CouchDB's end, the requests were carried out successfully: I can see in the CouchDB logs that there are three concurrent GET requests, all of which responded with 200 OK. Also, I can use the requests library to call the same endpoints concurrently without issue. These observations lead me to think it's some sort of race condition inside CouchDB-Python.
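
A sketch of that cross-check (not necessarily the exact script used; the host and database names are the placeholders from above):

import multiprocessing.pool

import requests

def fetch(doc_id):
    # requests opens its own connection inside each worker process here, so
    # nothing is shared across processes.
    return requests.get('http://couchdb_host:5984/test/%s' % doc_id).json()

if __name__ == '__main__':
    pool = multiprocessing.pool.Pool(3)
    print(pool.map(fetch, ['1', '2', '3']))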

One question I have: when you run this test case 100 times (or 10), does it fail every time?

Yes, it fails every single time.

I might be able to reduce the sample code even further to use only couchdb.http methods to reproduce the issue. Out of curiosity, why didn't CouchDB-Python use the stock httplib? Sorry, I'm not too familiar with the genesis of this project. EDIT: I see you built ConnectionPool on top of httplib.
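
A reduced version might look like this (a sketch only; it assumes couchdb.http.Session.request('GET', url) returns a (status, headers, body) tuple whose body supports .read(), which is worth double-checking against couchdb/http.py):

import multiprocessing.pool

import couchdb.http

session = couchdb.http.Session()   # created in the parent, inherited by the workers

def fetch(doc_id):
    status, headers, body = session.request(
        'GET', 'http://couchdb_host:5984/test/' + doc_id)
    return status, body.read()

if __name__ == '__main__':
    pool = multiprocessing.pool.Pool(3)
    print(pool.map(fetch, ['1', '2', '3']))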

kevinjqiu (Author) commented

@djc A tentative fix for the race condition: #314
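
For context, a common pattern for making a connection pool fork-safe (this is only an illustrative sketch, not necessarily what #314 does) is to remember which PID filled the pool and drop the inherited connections when it changes:

import os

class ForkSafePool(object):
    """Toy connection pool that refuses to reuse connections across a fork."""

    def __init__(self):
        self._pid = os.getpid()
        self._conns = []

    def _check_fork(self):
        if os.getpid() != self._pid:
            # We are in a freshly forked child: the cached connections wrap
            # sockets shared with the parent, so discard them.
            self._conns = []
            self._pid = os.getpid()

    def get(self, make_conn):
        self._check_fork()
        return self._conns.pop() if self._conns else make_conn()

    def put(self, conn):
        self._check_fork()
        self._conns.append(conn)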

kevinjqiu added five commits to kevinjqiu/couchdb-python that referenced this issue (Mar 18–19, 2017)