
Question:

How can I proxy Scrapy requests through SOCKS5?

I know I can use polipo to convert a SOCKS proxy into an HTTP proxy.

But I would rather handle this inside Scrapy, either with a middleware or with some change to scrapy.Request:

import scrapy

class BaseSpider(scrapy.Spider):
    """a base class that implements major functionality for crawling application"""
    start_urls = ('https://google.com',)  # trailing comma makes this a tuple, not a plain string

    def start_requests(self):

        proxies = {
            'http': 'socks5://127.0.0.1:1080',
            'https': 'socks5://127.0.0.1:1080'
        }

        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                meta={'proxy': proxies} # proxy should be string not dict
            )

    def parse(self, response):
        # do ...
        pass

What should I assign to the proxies variable?

Answers

It is currently not possible: Scrapy has no built-in support for SOCKS5 proxies. There is a feature request for it.
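In the meantime, the polipo route mentioned in the question can be combined with a small downloader middleware, so that spiders do not have to set the proxy on every request. The following is only a sketch under assumptions: the class name HttpBridgeProxyMiddleware, the setting name HTTP_BRIDGE_PROXY, the module path myproject.middlewares, and the address http://127.0.0.1:8123 (a local polipo instance already configured to forward to the SOCKS5 server at 127.0.0.1:1080) are all made up for illustration. The only Scrapy behaviour it relies on is that request.meta['proxy'] takes a single HTTP proxy URL string.

```python
class HttpBridgeProxyMiddleware:
    """Hypothetical downloader middleware.

    Sends requests through a local HTTP proxy (for example polipo) that
    forwards the traffic to a SOCKS5 server. Scrapy itself only ever
    talks to the HTTP proxy.
    """

    def __init__(self, proxy_url):
        # A single URL string, e.g. "http://127.0.0.1:8123" (assumed address).
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # HTTP_BRIDGE_PROXY is an invented setting name, not a Scrapy built-in.
        proxy_url = crawler.settings.get("HTTP_BRIDGE_PROXY", "http://127.0.0.1:8123")
        return cls(proxy_url)

    def process_request(self, request, spider):
        # Keep a proxy that was already set on the request; otherwise
        # fall back to the bridge proxy.
        request.meta.setdefault("proxy", self.proxy_url)
        return None  # let Scrapy continue processing the request


# Possible settings.py entries (module path is hypothetical). A priority
# below 750 runs this before Scrapy's built-in HttpProxyMiddleware.
#
# HTTP_BRIDGE_PROXY = "http://127.0.0.1:8123"
# DOWNLOADER_MIDDLEWARES = {
#     "myproject.middlewares.HttpBridgeProxyMiddleware": 740,
# }
```

With something like this enabled, the spider from the question would not need the proxies dict at all; a plain scrapy.Request(url=url, callback=self.parse) would be routed through the bridge.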
