which scrapes one of these sites.
.. _this page: http://search.cpan.org/~ecarroll/HTML-TreeBuilderX-ASP_NET-0.09/lib/HTML/TreeBuilderX/ASP_NET.pm
.. _example spider: http://github.com/AmbientLighter/rpn-fas/blob/master/fas/spiders/rnp.py

What's the best way to parse big XML/CSV data feeds?
----------------------------------------------------

Parsing big feeds with XPath selectors can be problematic, since they need to
build the DOM of the entire feed in memory, which can be quite slow and can
consume a lot of memory.

To avoid parsing the entire feed in memory at once, you can use the
``xmliter`` and ``csviter`` functions from the ``scrapy.utils.iterators``
module. In fact, this is what the feed spiders (see :ref:`topics-spiders`) use
under the hood.
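As a minimal sketch of the idea, ``xmliter`` yields one selector per node
rather than a DOM for the whole document. The ``products``/``product`` feed
below is invented for illustration, and exact behavior may vary between Scrapy
versions (recent releases also offer ``xmliter_lxml``)::

    from scrapy.utils.iterators import xmliter

    # A tiny stand-in for a large XML feed (hypothetical data); in a spider
    # you would usually pass the response object instead of a string.
    feed = """<?xml version="1.0" encoding="utf-8"?>
    <products>
      <product><name>Widget</name><price>9.99</price></product>
      <product><name>Gadget</name><price>19.99</price></product>
    </products>"""

    # xmliter yields a Selector for each <product> node, one at a time,
    # so memory use stays flat no matter how long the feed is.
    names = [node.xpath("name/text()").get() for node in xmliter(feed, "product")]
    print(names)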