Commit 241ae9f

Merge pull request scrapy#1820 from redapple/http-tls-settings
[MRG+1] Document DOWNLOADER_* settings for HTTP/1.0 and TLS
2 parents 84dea19 + 709b4fa commit 241ae9f

2 files changed, +79 -0 lines changed

docs/news.rst

Lines changed: 7 additions & 0 deletions
@@ -26,6 +26,9 @@ This 1.1 release brings a lot of interesting features and bug fixes:
 - Selectors were extracted to the parsel_ library (:issue:`1409`). This means
   you can use Scrapy Selectors without Scrapy and also upgrade the
   selectors engine without needing to upgrade Scrapy.
+- HTTPS downloader now does TLS protocol negotiation by default,
+  instead of forcing TLS 1.0. You can also set the SSL/TLS method
+  using the new :setting:`DOWNLOADER_CLIENT_TLS_METHOD`.
 
 - These bug fixes may require your attention:
 
@@ -85,6 +88,10 @@ Additional New Features and Enhancements
   interval (:issue:`1282`).
 - Download handlers are now lazy-loaded on first request using their
   scheme (:issue:`1390`, :issue:`1421`).
+- HTTPS download handlers no longer force TLS 1.0; instead,
+  OpenSSL's ``SSLv23_method()``/``TLS_method()`` is used, so the handler
+  tries to negotiate the highest TLS protocol version it can with the
+  remote host (:issue:`1794`, :issue:`1629`).
 - ``RedirectMiddleware`` now skips the status codes from
   ``handle_httpstatus_list`` on spider attribute
   or in ``Request``'s ``meta`` key (:issue:`1334`, :issue:`1364`,
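
As a rough illustration of the new setting mentioned in the release notes above, the sketch below shows how a project's ``settings.py`` might pin the TLS method back to the pre-1.1 behavior. The accepted strings are the ones listed in the ``docs/topics/settings.rst`` part of this commit. ``ALLOWED_TLS_METHODS`` and ``check_tls_method`` are hypothetical names used only for this example; they are not part of Scrapy.

```python
# Values copied from the DOWNLOADER_CLIENT_TLS_METHOD documentation in this commit.
ALLOWED_TLS_METHODS = ("TLS", "TLSv1.0", "TLSv1.1", "TLSv1.2", "SSLv3")


def check_tls_method(value):
    """Hypothetical helper (not a Scrapy API): reject undocumented values early."""
    if value not in ALLOWED_TLS_METHODS:
        raise ValueError(
            "unsupported DOWNLOADER_CLIENT_TLS_METHOD: %r (expected one of %s)"
            % (value, ", ".join(ALLOWED_TLS_METHODS))
        )
    return value


# settings.py-style assignment: force TLS 1.0, i.e. the behavior of Scrapy<1.1.
# Leaving the setting at its default, 'TLS', keeps protocol negotiation enabled.
DOWNLOADER_CLIENT_TLS_METHOD = check_tls_method("TLSv1.0")
```

In a real project only the final assignment is needed; the helper is there to make the set of documented values explicit.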

docs/topics/settings.rst

Lines changed: 72 additions & 0 deletions
@@ -366,6 +366,78 @@ Default: ``'scrapy.core.downloader.Downloader'``
 
 The downloader to use for crawling.
 
+.. setting:: DOWNLOADER_HTTPCLIENTFACTORY
+
+DOWNLOADER_HTTPCLIENTFACTORY
+----------------------------
+
+Default: ``'scrapy.core.downloader.webclient.ScrapyHTTPClientFactory'``
+
+Defines a Twisted ``protocol.ClientFactory`` class to use for HTTP/1.0
+connections (for ``HTTP10DownloadHandler``).
+
+.. note::
+
+    HTTP/1.0 is rarely used nowadays, so you can safely ignore this setting
+    unless you use Twisted<11.1, or you really want to use HTTP/1.0
+    and override :setting:`DOWNLOAD_HANDLERS_BASE` for the ``http(s)`` scheme
+    accordingly, i.e. to
+    ``'scrapy.core.downloader.handlers.http.HTTP10DownloadHandler'``.
+
+.. setting:: DOWNLOADER_CLIENTCONTEXTFACTORY
+
+DOWNLOADER_CLIENTCONTEXTFACTORY
+-------------------------------
+
+Default: ``'scrapy.core.downloader.contextfactory.ScrapyClientContextFactory'``
+
+The import path of the ContextFactory class to use.
+
+Here, "ContextFactory" is a Twisted term for SSL/TLS contexts, defining
+the TLS/SSL protocol version to use, whether to do certificate verification,
+or even enable client-side authentication (and various other things).
+
+.. note::
+
+    Scrapy's default context factory **does NOT perform remote server
+    certificate verification**. This is usually fine for web scraping.
+
+    If you do need remote server certificate verification enabled,
+    Scrapy also has another context factory class that you can set,
+    ``'scrapy.core.downloader.contextfactory.BrowserLikeContextFactory'``,
+    which uses the platform's certificates to validate remote endpoints.
+    **This is only available if you use Twisted>=14.0.**
+
+If you do use a custom ContextFactory, make sure it accepts a ``method``
+parameter at init (this is the ``OpenSSL.SSL`` method that
+:setting:`DOWNLOADER_CLIENT_TLS_METHOD` maps to).
+
+.. setting:: DOWNLOADER_CLIENT_TLS_METHOD
+
+DOWNLOADER_CLIENT_TLS_METHOD
+----------------------------
+
+Default: ``'TLS'``
+
+Use this setting to customize the TLS/SSL method used by the default
+HTTP/1.1 downloader.
+
+This setting must be one of these string values:
+
+- ``'TLS'``: maps to OpenSSL's ``TLS_method()`` (a.k.a. ``SSLv23_method()``),
+  which allows protocol negotiation, starting from the highest version
+  supported by the platform; **default, recommended**
+- ``'TLSv1.0'``: this value forces HTTPS connections to use TLS version 1.0;
+  set this if you want the behavior of Scrapy<1.1
+- ``'TLSv1.1'``: forces TLS version 1.1
+- ``'TLSv1.2'``: forces TLS version 1.2
+- ``'SSLv3'``: forces SSL version 3 (**not recommended**)
+
+.. note::
+
+    We recommend that you use PyOpenSSL>=0.13 and Twisted>=0.13
+    (Twisted>=14.0 if you can).
+
 .. setting:: DOWNLOADER_MIDDLEWARES
 
 DOWNLOADER_MIDDLEWARES
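
The requirement in the ``DOWNLOADER_CLIENTCONTEXTFACTORY`` text above, that a custom ContextFactory accept a ``method`` parameter at init, can be sketched in plain Python without Twisted. Everything named here is hypothetical: ``CustomContextFactory`` and ``LegacyContextFactory`` are stand-ins (a real factory would subclass a Twisted or Scrapy context factory), and ``build_context_factory`` only models the assumption that the class is instantiated with ``method`` passed as a keyword argument.

```python
class CustomContextFactory:
    """Hypothetical stand-in for a custom factory that follows the documented rule."""

    def __init__(self, method=None, *args, **kwargs):
        # `method` stands for the OpenSSL.SSL method that
        # DOWNLOADER_CLIENT_TLS_METHOD maps to; it is only stored here.
        self.method = method


class LegacyContextFactory:
    """Hypothetical factory whose __init__ does not accept `method`."""

    def __init__(self):
        self.method = None


def build_context_factory(factory_cls, method):
    """Assumed calling convention: `method` is passed as a keyword argument."""
    return factory_cls(method=method)


# A factory that accepts `method` can be built with the selected TLS method.
factory = build_context_factory(CustomContextFactory, "placeholder-for-SSL-method")

# A factory that does not accept `method` fails at construction time.
try:
    build_context_factory(LegacyContextFactory, "placeholder-for-SSL-method")
    legacy_accepts_method = True
except TypeError:
    legacy_accepts_method = False
```

The second call shows why the note matters: under this assumed calling convention, a factory without the ``method`` parameter raises ``TypeError`` before any connection is attempted.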
