Commit c3b61d3

THREADING.md: Initial changes post review.

1 parent ed8eaa7 commit c3b61d3
1 file changed: +117 -46

v3/docs/THREADING.md

Lines changed: 117 additions & 46 deletions
@@ -2,7 +2,7 @@
 
 This document is primarily for those wishing to interface `uasyncio` code with
 that running under the `_thread` module. It presents classes for that purpose
-which may also find use for communicatiing between threads and in interrupt
+which may also find use for communicating between threads and in interrupt
 service routine (ISR) applications. It provides an overview of the problems
 implicit in pre-emptive multi tasking.
 
@@ -22,10 +22,11 @@ is fixed.
 # Contents
 
 1. [Introduction](./THREADING.md#1-introduction) The various types of pre-emptive code.
- 1.1 [Interrupt Service Routines](./THREADING.md#11-interrupt-service-routines)
- 1.2 [Threaded code on one core](./THREADING.md#12-threaded-code-on-one-core)
- 1.3 [Threaded code on multiple cores](./THREADING.md#13-threaded-code-on-multiple-cores)
- 1.4 [Debugging](./THREADING.md#14-debugging)
+ 1.1 [Hard Interrupt Service Routines](./THREADING.md#11-hard-interrupt-service-routines)
+ 1.2 [Soft Interrupt Service Routines](./THREADING.md#12-soft-interrupt-service-routines) Also code scheduled by micropython.schedule()
+ 1.3 [Threaded code on one core](./THREADING.md#13-threaded-code-on-one-core)
+ 1.4 [Threaded code on multiple cores](./THREADING.md#14-threaded-code-on-multiple-cores)
+ 1.5 [Debugging](./THREADING.md#15-debugging)
 2. [Sharing data](./THREADING.md#2-sharing-data)
  2.1 [A pool](./THREADING.md#21-a-pool) Sharing a set of variables.
  2.2 [ThreadSafeQueue](./THREADING.md#22-threadsafequeue)
@@ -40,11 +41,19 @@ is fixed.
 
 Various issues arise when `uasyncio` applications interface with code running
 in a different context. Supported contexts are:
-1. A hard or soft interrupt service routine (ISR).
-2. Another thread running on the same core.
-3. Code running on a different core (currently only supported on RP2).
-
-This section compares the characteristics of the three contexts. Consider this
+1. A hard interrupt service routine (ISR).
+2. A soft ISR. This includes code scheduled by `micropython.schedule()`.
+3. Another thread running on the same core.
+4. Code running on a different core (currently only supported on RP2).
+
+In all these cases the contexts share a common VM (the virtual machine which
+executes Python bytecode). This enables the contexts to share global state. In
+case 4 there is no common GIL (the global interpreter lock). This lock protects
+Python built-in objects, enabling them to be considered atomic at the bytecode
+level. (An "atomic" object is inherently thread safe: if one thread changes it,
+another concurrent thread performing a read is guaranteed to see valid data.)
+
+This section compares the characteristics of the four contexts. Consider this
 function which updates a global dictionary `d` from a hardware device. The
 dictionary is shared with a `uasyncio` task.
 ```python
@@ -65,53 +74,100 @@ Beware that some apparently obvious ways to interface an ISR to `uasyncio`
 introduce subtle bugs discussed in the doc referenced above. The only reliable
 interface is via a thread safe class, usually `ThreadSafeFlag`.
 
-## 1.1 Interrupt Service Routines
+## 1.1 Hard Interrupt Service Routines
 
-1. The ISR and the main program share a common Python virtual machine (VM).
-Consequently a line of code being executed when the interrupt occurs will run
-to completion before the ISR runs.
+1. The ISR and the main program share the Python GIL. This ensures that
+built-in Python objects (`list`, `dict` etc.) will not be corrupted if an ISR
+runs while the object is being modified. This guarantee is quite limited: the
+code will not crash, but there may be consistency problems. See consistency
+below.
 2. An ISR will run to completion before the main program regains control. This
 means that if the ISR updates multiple items, when the main program resumes,
 those items will be mutually consistent. The above code fragment will provide
 mutually consistent data.
 3. The fact that ISR code runs to completion means that it must run fast to
 avoid disrupting the main program or delaying other ISRs. ISR code should not
-call blocking routines and should not wait on locks. Item 2. means that locks
-are seldom necessary.
+call blocking routines. It should not wait on locks because there is no way
+for the interrupted code to release the lock. See locks below.
 4. If a burst of interrupts can occur faster than `uasyncio` can schedule the
 handling task, data loss can occur. Consider using a `ThreadSafeQueue`. Note
 that if this high rate is sustained something will break: the overall design
 needs review. It may be necessary to discard some data items.
 
-## 1.2 Threaded code on one core
+#### Locks
+
+There is a valid case where a hard ISR checks the status of a lock, aborting
+if the lock is set.
+
+#### Consistency
+
+Consider this code fragment:
+```python
+a = [0, 0, 0]
+b = [0, 0, 0]
+def hard_isr():
+    a[0] = read_data(0)
+    b[0] = read_data(1)
 
-1. On single core devices with a common GIL, Python instructions can be
-considered "atomic": they are guaranteed to run to completion without being
-pre-empted.
-2. Hence where a shared data item is updated by a single line of code a lock or
-`ThreadSafeQueue` is not needed. In the above code sample, if the application
-needs mutual consistency between the dictionary values, a lock must be used.
-3. Code running on a thread other than that running `uasyncio` may block for
+async def foo():
+    while True:
+        await process(a + b)
+```
+A hard ISR can occur during the execution of a bytecode. This means that the
+combined list passed to `process()` might comprise old `a` + new `b`.
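The failure mode above (old `a` paired with new `b`) suggests a defence the document uses later: replacing a pointer is atomic. Below is a rough CPython sketch, not MicroPython code: an ordinary thread stands in for the hard ISR, and the names (`pseudo_isr`, the counter values) are invented. Both lists are published together as one tuple, so a reader can never unpack a mixed pair:

```python
import threading

# One shared name, rebound atomically: a reader sees either the old
# (a, b) pair or the new one, never a mixture.
sample = ([0, 0, 0], [0, 0, 0])

def pseudo_isr(n):
    # Stand-in for the hard ISR: build fresh lists, then publish both
    # with a single assignment (one atomic pointer replacement).
    global sample
    for i in range(1, n + 1):
        sample = ([i, 0, 0], [i, 0, 0])

t = threading.Thread(target=pseudo_isr, args=(50_000,))
t.start()
while t.is_alive():
    a, b = sample        # unpack one snapshot of the shared reference
    assert a[0] == b[0]  # old a is never paired with new b
t.join()
a, b = sample
assert a[0] == b[0] == 50_000
```

Had `a` and `b` been published as two separate globals, the reader could load one before an update and the other after it, which is exactly the hazard in the fragment above.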
+
+## 1.2 Soft Interrupt Service Routines
+
+This also includes code scheduled by `micropython.schedule()`.
+
+1. A soft ISR can only run at certain bytecode boundaries, not during
+execution of a bytecode. It cannot interrupt garbage collection; this enables
+soft ISR code to allocate.
+2. As per hard ISRs.
+3. A soft ISR should still be designed to complete quickly. While it won't
+delay hard ISRs it nevertheless pre-empts the main program. In principle it
+can wait on a lock, but only if the lock is released by a hard ISR or another
+hard context (a thread or code on another core).
+4. As per hard ISRs.
+
+## 1.3 Threaded code on one core
+
+1. The common GIL ensures that built-in Python objects (`list`, `dict` etc.)
+will not be corrupted if a read on one thread occurs while the object's
+contents are being updated.
+2. This protection does not extend to user defined data structures. The fact
+that a dictionary won't be corrupted by concurrent access does not imply that
+its contents will be mutually consistent. In the code sample in section 1, if
+the application needs mutual consistency between the dictionary values, a lock
+is needed to ensure that a read cannot be scheduled while an update is in
+progress.
+3. The above means that, for example, calling `uasyncio.create_task` from a
+thread is unsafe as it can scramble `uasyncio` data structures.
+4. Code running on a thread other than that running `uasyncio` may block for
 as long as necessary (an application of threading is to handle blocking calls
 in a way that allows `uasyncio` to continue running).
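The single-core rules above have a rough CPython analogue (standard `threading` in place of MicroPython's `_thread`; the field names are invented): the GIL keeps the `dict` itself intact, but only a lock provides the mutual consistency described in point 2:

```python
import threading

lock = threading.Lock()
# Invented field names; the point is that all three must change together.
d = {"voltage": 0, "current": 0, "count": 0}

def updater(n):
    # Writer thread: the GIL stops the dict being corrupted, but only
    # the lock makes the three values mutually consistent.
    for i in range(1, n + 1):
        with lock:
            d["voltage"] = i
            d["current"] = 2 * i
            d["count"] = i

def snapshot():
    # Reader: copying under the same lock yields a consistent set.
    with lock:
        return dict(d)

t = threading.Thread(target=updater, args=(20_000,))
t.start()
while t.is_alive():
    s = snapshot()
    assert s["current"] == 2 * s["voltage"]  # holds because of the lock
t.join()
final = snapshot()
assert final["count"] == 20_000
```

Without the lock the reader could observe a `voltage` from one update paired with a `current` from another, even though no individual `dict` operation is ever corrupted.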
 
-## 1.3 Threaded code on multiple cores
+## 1.4 Threaded code on multiple cores
 
 Currently this applies to RP2 and Unix ports, although as explained above the
 thread safe classes offered here do not yet support Unix.
 
-1. There is no common VM hence no common GIL. The underlying machine code of
-each core runs independently.
+1. There is no common GIL. This means that under some conditions Python
+built-in objects can be corrupted.
 2. In the code sample there is a risk of the `uasyncio` task reading the dict
-at the same moment as it is being written. It may read a corrupt or partially
-updated item; there may even be a crash. Using a lock or `ThreadSafeQueue` is
-essential.
-3. Code running on a core other than that running `uasyncio` may block for
+at the same moment as it is being written. Updating a dictionary entry is
+atomic: there is no risk of corrupt data being read. In the code sample a lock
+is only required if mutual consistency of the three values is essential.
+3. In the absence of a GIL some operations on built-in objects are not thread
+safe, for example adding or deleting items in a `dict`. This extends to global
+variables, which are implemented as a `dict`.
+4. The observations in 1.3 on user defined data structures and `uasyncio`
+interfacing apply.
+5. Code running on a core other than that running `uasyncio` may block for
 as long as necessary.
 
 [See this reference from @jimmo](https://github.com/orgs/micropython/discussions/10135#discussioncomment-4309865).
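The warning in 1.3 about calling `uasyncio.create_task` from a thread has a direct parallel in CPython's `asyncio`, where the documented thread-safe way to poke a running event loop from another thread is `loop.call_soon_threadsafe`. A minimal runnable sketch (the worker function and event are invented for illustration):

```python
import asyncio
import threading

async def main():
    loop = asyncio.get_running_loop()
    done = asyncio.Event()

    def worker():
        # Plain-thread context: touching the loop's objects directly
        # from here is unsafe; hand the call to the loop thread instead.
        loop.call_soon_threadsafe(done.set)

    threading.Thread(target=worker).start()
    await done.wait()
    return "signalled"

result = asyncio.run(main())
assert result == "signalled"
```

The thread never manipulates scheduler state itself; it only asks the loop to run `done.set` in the loop's own context.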
 
-## 1.4 Debugging
+## 1.5 Debugging
 
 A key practical point is that coding errors in synchronising threads can be
 hard to locate: consequences can be extremely rare bugs or (in the case of
@@ -129,10 +185,13 @@ There are two fundamental problems: data sharing and synchronisation.
 
 The simplest case is a shared pool of data. It is possible to share an `int` or
 `bool` because at machine code level writing an `int` is "atomic": it cannot be
-interrupted. In the multi core case anything more complex must be protected to
-ensure that concurrent access cannot take place. The consequences even of
-reading an object while it is being written can be unpredictable. One approach
-is to use locking:
+interrupted. A shared global `dict` might be replaced in its entirety by one
+process and read by another. This is safe because the shared variable is a
+pointer, and replacing a pointer is atomic. Problems arise when multiple fields
+are updated by one process and read by another, as the read might occur while
+the write operation is in progress.
+
+One approach is to use locking:
 ```python
 lock = _thread.allocate_lock()
 values = { "X": 0, "Y": 0, "Z": 0}
@@ -154,14 +213,24 @@ async def consumer():
         lock.release()
         await asyncio.sleep_ms(0)  # Ensure producer has time to grab the lock
 ```
-This is recommended where the producer runs in a different thread from
-`uasyncio`. However the consumer might hold the lock for some time: it will
-take time for the scheduler to execute the `process()` call, and the call
-itself will take time to run. In cases where the duration of a lock is
-problematic a `ThreadSafeQueue` is more appropriate as it decouples producer
-and consumer code.
-
-As stated above, if the producer is an ISR no lock is needed or advised.
+Consider also this code:
+```python
+def consumer():
+    send(d["x"].height())  # d is a global dict
+    send(d["x"].width())   # d["x"] is an instance of a class
+```
+In this instance if the producer, running in a different context, changes
+`d["x"]` between the two `send()` calls, different objects will be accessed. A
+lock should be used.
+
+Locking is recommended where the producer runs in a different thread from
+`uasyncio`. However the consumer might hold the lock for some time: in the
+first sample it will take time for the scheduler to execute the `process()`
+call, and the call itself will take time to run. In cases where the duration
+of a lock is problematic a `ThreadSafeQueue` is more appropriate than a locked
+pool as it decouples producer and consumer code.
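The decoupling that `ThreadSafeQueue` provides can be sketched in ordinary CPython `asyncio` (an `asyncio.Queue` fed from a thread via the loop's thread-safe scheduler; the producer logic and `None` sentinel are invented for illustration):

```python
import asyncio
import threading

async def main():
    loop = asyncio.get_running_loop()
    q = asyncio.Queue()

    def producer():
        # Thread context: schedule each put on the event loop rather
        # than calling the queue directly from this thread.
        for i in range(5):
            loop.call_soon_threadsafe(q.put_nowait, i)
        loop.call_soon_threadsafe(q.put_nowait, None)  # sentinel: done

    threading.Thread(target=producer).start()
    items = []
    while (item := await q.get()) is not None:
        items.append(item)
    return items

received = asyncio.run(main())
assert received == [0, 1, 2, 3, 4]
```

Neither side ever holds a lock across slow work: the producer only schedules puts, and the consumer simply awaits items as they arrive.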
+
+As stated above, if the producer is an ISR a lock is normally unusable.
 Producer code would follow this pattern:
 ```python
 values = { "X": 0, "Y": 0, "Z": 0}
@@ -170,8 +239,10 @@ def producer():
     values["Y"] = sensor_read(1)
     values["Z"] = sensor_read(2)
 ```
-and the ISR would run to completion before `uasyncio` resumed, ensuring mutual
-consistency of the dict values.
+and the ISR would run to completion before `uasyncio` resumed. The ISR could
+run while the `uasyncio` task was reading the values: to ensure mutual
+consistency of the dict values the consumer should disable interrupts while
+the read is in progress.
 
 ###### [Contents](./THREADING.md#contents)
 