Commit cacc491

Populated directory
1 parent 3a24f74 commit cacc491

11 files changed: +2057 -2 lines changed

README.md

Lines changed: 0 additions & 2 deletions
This file was deleted.

README.txt

Lines changed: 264 additions & 0 deletions
@@ -0,0 +1,264 @@
LRU Dictionaries
=================

    >>> from darts.lib.utils.lru import LRUDict

An `LRUDict` is essentially a dictionary with a fixed maximum capacity,
which may be supplied at construction time or modified at run-time via
the `capacity` property::

    >>> cache = LRUDict(1)
    >>> cache.capacity
    1

The minimum capacity is 1, and LRU dicts raise a `ValueError` if someone
attempts to use a smaller value::

    >>> cache.capacity = -1 #doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    ValueError: -1 is not a valid capacity
    >>> LRUDict(-1) #doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    ValueError: -1 is not a valid capacity

An LRU dictionary can never contain more elements than its capacity
allows, so::

    >>> cache[1] = "First"
    >>> cache[2] = "Second"
    >>> len(cache)
    1

To guarantee this, the dictionary evicts entries whenever it needs to
make room for new ones. So::

    >>> 1 in cache
    False
    >>> 2 in cache
    True

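The eviction rule can be illustrated without the library. The following is
a minimal sketch, assuming an `OrderedDict` that keeps the oldest entry
first; the class name `TinyLRU` is hypothetical and this is not the actual
`LRUDict` implementation (in particular, look-ups do not boost priority
here):

```python
from collections import OrderedDict


class TinyLRU:
    """Hypothetical stand-in for illustration; not the darts implementation."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("%r is not a valid capacity" % (capacity,))
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def __setitem__(self, key, value):
        if key in self._entries:
            # Re-inserting an existing key makes it the most recent entry.
            self._entries.move_to_end(key)
        elif len(self._entries) >= self.capacity:
            # Full: drop the least recently used entry before inserting.
            self._entries.popitem(last=False)
        self._entries[key] = value

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)


cache = TinyLRU(1)
cache[1] = "First"
cache[2] = "Second"   # evicts key 1, because the capacity is 1
```

As in the session above, only key `2` remains after the second assignment.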
The capacity can be adjusted at run-time. Growing the capacity does not
affect the number of elements present in an LRU dictionary::

    >>> cache.capacity = 3
    >>> len(cache)
    1
    >>> cache[1] = "First"
    >>> cache[3] = "Third"
    >>> len(cache)
    3

but shrinking does::

    >>> cache.capacity = 2
    >>> len(cache)
    2
    >>> sorted(list(cache.iterkeys()))
    [1, 3]

Note that the entry with key `2` was evicted because it was the oldest
entry when `capacity` was modified. The new oldest entry is the one with
key `1`, as we can see when we add another entry to the dict::

    >>> cache[4] = "Fourth"
    >>> sorted(list(cache.iterkeys()))
    [3, 4]

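One way to get this shrinking behaviour is a validating `capacity` property
whose setter evicts from the old end. This is only a sketch under that
assumption; `ResizableLRU` and `_shrink_to` are hypothetical names, not part
of the library:

```python
from collections import OrderedDict


class ResizableLRU:
    """Hypothetical sketch of a resizable LRU mapping; not the darts code."""

    def __init__(self, capacity):
        self._entries = OrderedDict()  # oldest entry first
        self._capacity = 1
        self.capacity = capacity       # goes through the validating setter

    @property
    def capacity(self):
        return self._capacity

    @capacity.setter
    def capacity(self, value):
        if value < 1:
            raise ValueError("%r is not a valid capacity" % (value,))
        self._capacity = value
        self._shrink_to(value)

    def _shrink_to(self, size):
        # Evict from the old end until at most `size` entries remain.
        while len(self._entries) > size:
            self._entries.popitem(last=False)

    def __setitem__(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        else:
            self._shrink_to(self._capacity - 1)  # make room for one entry
        self._entries[key] = value

    def keys(self):
        return list(self._entries)


cache = ResizableLRU(3)
cache[2] = "Second"
cache[1] = "First"
cache[3] = "Third"
cache.capacity = 2        # evicts key 2, the oldest entry
after_shrink = sorted(cache.keys())
cache[4] = "Fourth"       # evicts key 1, now the oldest entry
after_insert = sorted(cache.keys())
```

The insertion order `2, 1, 3` mirrors the state of the cache in the session
above, so `after_shrink` and `after_insert` correspond to the `[1, 3]` and
`[3, 4]` results shown there.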
The following operations affect an entry's priority::

    - `get`
    - `__getitem__`
    - `__setitem__`
    - `__contains__`

Calling any of these operations on an existing key boosts the key's
priority, making it less likely to be evicted when the dictionary needs
to make room for new entries. There is a special `peek` operation, which
returns the current value associated with a key without boosting the
priority of the entry::

    >>> cache.peek(3)
    'Third'
    >>> cache[5] = "Fifth"
    >>> sorted(list(cache.iterkeys()))
    [4, 5]

As you can see, even though the entry with key `3` was the one accessed
most recently, it is now gone, because it did not get a priority boost
from the call to `peek`.

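The difference between a boosting read and `peek` can be sketched with
plain functions over an `OrderedDict`. The helpers `get`, `peek`, and `put`
below are illustrative assumptions about how such a structure might work,
not the library's code:

```python
from collections import OrderedDict

entries = OrderedDict([(3, "Third"), (4, "Fourth")])  # oldest entry first
capacity = 2


def get(key):
    """Return the value and mark the key as most recently used."""
    entries.move_to_end(key)
    return entries[key]


def peek(key):
    """Return the value without touching the eviction order."""
    return entries[key]


def put(key, value):
    """Insert a key, evicting the oldest entry if the store is full."""
    if key not in entries and len(entries) >= capacity:
        entries.popitem(last=False)
    entries[key] = value
    entries.move_to_end(key)


peeked = peek(3)          # key 3 stays the oldest entry
put(5, "Fifth")           # so key 3 is the one evicted
keys_after_peek = sorted(entries)

got = get(4)              # key 4 becomes the most recent entry
put(6, "Sixth")           # so key 5 is evicted instead of key 4
keys_after_get = sorted(entries)
```

After the `peek`, key `3` is evicted just as in the session above; after the
`get`, key `4` survives the next insertion.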
The class `LRUDict` supports a subset of the standard Python `dict`
interface. In particular, we can iterate over the keys, values, and
items of an LRU dict::

    >>> sorted([k for k in cache.iterkeys()])
    [4, 5]
    >>> sorted([v for v in cache.itervalues()])
    ['Fifth', 'Fourth']
    >>> sorted([p for p in cache.iteritems()])
    [(4, 'Fourth'), (5, 'Fifth')]
    >>> sorted(list(cache))
    [4, 5]

Note that there is no guaranteed order; in particular, the elements are
not generated in priority order. As with regular `dict`s, an LRU dict's
`__iter__` is simply an alias for `iterkeys`.

Furthermore, we can remove all elements from the dict::

    >>> cache.clear()
    >>> sorted(list(cache.iterkeys()))
    []


Thread-safety
--------------

Instances of class `LRUDict` are not thread-safe. Worse: even concurrent
read-only access is not thread-safe and has to be synchronized by the
client application.

There is, however, the class `SynchronizedLRUDict`, which exposes the
same interface as plain `LRUDict` but is fully thread-safe. The following
session repeats exactly the steps we already tried with a plain `LRUDict`,
but now using the synchronized version::

    >>> from darts.lib.utils.lru import SynchronizedLRUDict
    >>> cache = SynchronizedLRUDict(1)
    >>> cache.capacity
    1
    >>> cache.capacity = -1 #doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    ValueError: -1 is not a valid capacity
    >>> LRUDict(-1) #doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    ValueError: -1 is not a valid capacity
    >>> cache[1] = "First"
    >>> cache[2] = "Second"
    >>> len(cache)
    1
    >>> 1 in cache
    False
    >>> 2 in cache
    True
    >>> cache.capacity = 3
    >>> len(cache)
    1
    >>> cache[1] = "First"
    >>> cache[3] = "Third"
    >>> len(cache)
    3
    >>> cache.capacity = 2
    >>> len(cache)
    2
    >>> sorted(list(cache.iterkeys()))
    [1, 3]
    >>> cache[4] = "Fourth"
    >>> sorted(list(cache.iterkeys()))
    [3, 4]
    >>> cache.peek(3)
    'Third'
    >>> cache[5] = "Fifth"
    >>> sorted(list(cache.iterkeys()))
    [4, 5]
    >>> sorted([k for k in cache.iterkeys()])
    [4, 5]
    >>> sorted([v for v in cache.itervalues()])
    ['Fifth', 'Fourth']
    >>> sorted([p for p in cache.iteritems()])
    [(4, 'Fourth'), (5, 'Fifth')]
    >>> sorted(list(cache))
    [4, 5]
    >>> cache.clear()
    >>> sorted(list(cache.iterkeys()))
    []


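A plausible reason why even reads need synchronization is that look-ups
boost an entry's priority and therefore modify internal state. The sketch
below guards every operation, reads included, with a single lock. The class
`LockedLRU` is hypothetical and only assumes this design; the real
`SynchronizedLRUDict` may work differently:

```python
import threading
from collections import OrderedDict


class LockedLRU:
    """Hypothetical lock-guarded LRU mapping; not the darts implementation."""

    def __init__(self, capacity):
        self._lock = threading.RLock()
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def __getitem__(self, key):
        with self._lock:
            value = self._entries[key]
            # A read reorders the entries, so it must hold the lock too.
            self._entries.move_to_end(key)
            return value

    def __setitem__(self, key, value):
        with self._lock:
            if key not in self._entries and len(self._entries) >= self._capacity:
                self._entries.popitem(last=False)
            self._entries[key] = value
            self._entries.move_to_end(key)

    def __len__(self):
        with self._lock:
            return len(self._entries)


cache = LockedLRU(8)


def worker(offset):
    # Mix writes and reads; every operation takes the lock internally.
    for i in range(200):
        key = (offset + i) % 16
        cache[key] = i
        try:
            cache[key]
        except KeyError:
            pass  # another thread evicted the key in between


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However the threads interleave, the number of entries never exceeds the
capacity of 8.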
Auto-loading Caches
====================

Having a dictionary that is capable of cleaning itself up is nice, but
in order to implement caching, something is still missing: the mechanism
which actually loads something into our dict. This part of the story is
implemented by the `AutoLRUCache`::

    >>> from darts.lib.utils.lru import AutoLRUCache

Let's first define a load function::

    >>> def load_resource(key):
    ...     if key < 10:
    ...         print "Loading %r" % (key,)
    ...         return "R(%s)" % (key,)

and a cache::

    >>> cache = AutoLRUCache(load_resource, capacity=3)
    >>> cache.load(1)
    Loading 1
    'R(1)'
    >>> cache.load(1)
    'R(1)'

As you can see, the first time an element is loaded, the load function
provided to the constructor is called in order to provide the actual
resource value. On subsequent calls to `load`, the cached value is
returned.

Internally, the `AutoLRUCache` class uses an `LRUDict` to cache values,
so::

    >>> cache.load(2)
    Loading 2
    'R(2)'
    >>> cache.load(3)
    Loading 3
    'R(3)'
    >>> cache.load(4)
    Loading 4
    'R(4)'
    >>> cache.load(1)
    Loading 1
    'R(1)'

Note the "Loading 1" line in the last example. The cache was initialized
with a capacity of 3, so the value for key `1` had to be evicted when the
one for key `4` was loaded. When we tried to obtain `1` again, the cache
had to reload it by calling the loader function.

If there is no resource for a given key, the loader function must return
`None`. It follows that `None` is never a valid resource value to be
associated with a key in an `AutoLRUCache`. In the following example, the
loader returns `None` for key `11`, and `load` returns the value passed
as its second argument::

    >>> cache.load(11, 'Oops')
    'Oops'


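The load-on-miss behaviour shown above can be approximated in a few lines.
This is a single-threaded sketch; `TinyAutoCache` is a hypothetical name,
and the choice not to cache anything when the loader returns `None` is an
assumption rather than documented behaviour of `AutoLRUCache`:

```python
from collections import OrderedDict


class TinyAutoCache:
    """Hypothetical single-threaded sketch; not the darts AutoLRUCache."""

    def __init__(self, loader, capacity):
        self._loader = loader
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def load(self, key, default=None):
        if key in self._entries:
            self._entries.move_to_end(key)   # cache hit: refresh priority
            return self._entries[key]
        value = self._loader(key)            # cache miss: ask the loader
        if value is None:
            return default                   # no resource: nothing is cached
        if len(self._entries) >= self._capacity:
            self._entries.popitem(last=False)
        self._entries[key] = value
        return value


calls = []  # records every key the loader was actually asked for


def load_resource(key):
    if key < 10:
        calls.append(key)
        return "R(%s)" % (key,)
    return None


cache = TinyAutoCache(load_resource, capacity=3)
results = [cache.load(k) for k in (1, 1, 2, 3, 4, 1)]
missing = cache.load(11, "Oops")
```

Here `calls` ends up as `[1, 2, 3, 4, 1]`: the second request for key `1` is
served from the cache, while the last one triggers a reload because key `1`
was evicted when key `4` was loaded.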
Thread-safety
--------------

Instances of class `AutoLRUCache` are fully thread-safe. Be warned, though,
that the loader function is called outside of any synchronization scope the
class may use internally, and has to provide its own synchronization if
required.

The cache class tries to minimize the number of loader invocations by
making sure that no two concurrent threads will try to load the same key
(though any number of concurrent threads might be busy loading the
resources associated with different keys).


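One common way to achieve this kind of per-key coordination is to track
loads that are in flight and let other threads wait for them. The sketch
below assumes that approach; `OncePerKeyLoader` is a hypothetical class
without eviction, and the real `AutoLRUCache` may be implemented
differently. Note that the loader runs without any lock held, matching the
warning above:

```python
import threading
import time


class OncePerKeyLoader:
    """Hypothetical sketch of per-key load coordination (no eviction)."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()   # guards the two dicts below
        self._values = {}               # key -> loaded value
        self._in_flight = {}            # key -> Event set when loading ends

    def load(self, key):
        while True:
            with self._lock:
                if key in self._values:
                    return self._values[key]
                pending = self._in_flight.get(key)
                if pending is None:
                    # This thread becomes the one loader for the key.
                    pending = self._in_flight[key] = threading.Event()
                    break
            pending.wait()              # someone else is loading: wait, retry
        try:
            value = self._loader(key)   # runs without holding the lock
            with self._lock:
                self._values[key] = value
            return value
        finally:
            with self._lock:
                del self._in_flight[key]
            pending.set()               # wake up the waiting threads


calls = []


def slow_loader(key):
    calls.append(key)
    time.sleep(0.05)                    # simulate an expensive load
    return "R(%s)" % (key,)


cache = OncePerKeyLoader(slow_loader)
results = []
threads = [threading.Thread(target=lambda: results.append(cache.load(1)))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads receive the same value, while the loader is invoked only
once for key `1`. Threads asking for different keys would not block each
other, because the lock is released before the loader is called.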
Change Log
==========

Version 0.3
------------

Added class `SynchronizedLRUDict` as thread-safe counterpart for `LRUDict`.

darts/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
__import__('pkg_resources').declare_namespace(__name__)

darts/lib/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
__import__('pkg_resources').declare_namespace(__name__)

darts/lib/utils/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
__import__('pkg_resources').declare_namespace(__name__)
