Python request returns 400: a crawler sends an AJAX request, requests gets a normal response, but the same request from Scrapy returns 400...

weixin_39726379 · Posted 2020-12-05 17:13:06

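I am scraping a website whose data is loaded through asynchronous (AJAX) requests to the server. I copied the headers and the parameters exactly and they contain no mistakes. With requests I get a normal response, but with Scrapy I do not.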

import re

from bs4 import BeautifulSoup
from scrapy import FormRequest

# Method of the spider class.
def parse_histical_data(self, response):
    html = BeautifulSoup(response.body, 'lxml')
    # Locate the inline script that contains "smlId: <digits>" and extract the id.
    patterm = re.compile(r'smlId: [0-9]*', re.MULTILINE | re.UNICODE)
    script = html.find('script', text=patterm).text
    smlId_text = patterm.search(script).group()
    smlId = smlId_text.split(' ')[1]
    curr_id = response.meta['pair_id']
    header = html.select('#leftColumn > div.instrumentHeader > h2')[0].string
    # Form fields for the historical-data AJAX request.
    st_date = '01/01/2001'
    end_date = '05/07/2050'
    interval_sec = 'Daily'
    sort_col = 'date'
    sort_ord = 'DESC'
    action = 'historical_data'
    # Note: the end date is sent under the key 'end_state', although the variable is end_date.
    data = {'smlID': smlId, 'curr_id': curr_id, 'header': header, 'st_date': st_date, 'end_state': end_date,
            'interval_sec': interval_sec, 'sort_col': sort_col, 'sort_ord': sort_ord, 'action': action}
    head = self.download_headers.copy()
    # POST the form to the AJAX endpoint with the copied headers.
    request = FormRequest(self.his_url, callback=self.parse_histical_data,
                          headers=head, formdata=data)
    yield request

The request URL is 'https://www.investing.com/ins...'. Using exactly the same headers and data, Scrapy returns 400.
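One way to check the "exactly the same headers and data" claim is to compare what each client actually sends. The sketch below is only an illustration using the standard library: `diff_requests` is a hypothetical helper, and the header and body values are made up, not captured from the site. In practice the inputs would be the headers and the URL-encoded body recorded from the working requests call and from the failing Scrapy request.

```python
from urllib.parse import parse_qsl


def diff_requests(headers_a, body_a, headers_b, body_b):
    """Return (header_diff, field_diff) for two form-encoded POST requests."""
    # Header names are case-insensitive, so lower-case them before comparing.
    ha = {k.lower(): v for k, v in headers_a.items()}
    hb = {k.lower(): v for k, v in headers_b.items()}
    header_diff = {
        name: (ha.get(name), hb.get(name))
        for name in sorted(set(ha) | set(hb))
        if ha.get(name) != hb.get(name)
    }
    # Decode the URL-encoded bodies into dicts so field order does not matter.
    fa = dict(parse_qsl(body_a, keep_blank_values=True))
    fb = dict(parse_qsl(body_b, keep_blank_values=True))
    field_diff = {
        name: (fa.get(name), fb.get(name))
        for name in sorted(set(fa) | set(fb))
        if fa.get(name) != fb.get(name)
    }
    return header_diff, field_diff


# Hypothetical captured values, for illustration only.
working_headers = {"X-Requested-With": "XMLHttpRequest", "User-Agent": "Mozilla/5.0"}
working_body = "curr_id=1&st_date=01%2F01%2F2001&end_date=05%2F07%2F2050"
failing_headers = {"x-requested-with": "XMLHttpRequest", "User-Agent": "Scrapy/2.4"}
failing_body = "curr_id=1&st_date=01%2F01%2F2001&end_state=05%2F07%2F2050"

header_diff, field_diff = diff_requests(
    working_headers, working_body, failing_headers, failing_body
)
print(header_diff)
print(field_diff)
```

Each entry in the result maps a header or form field name to the pair (value in the first request, value in the second), with None where one side lacks it. An empty result for both would suggest the difference lies elsewhere, for example in cookies or in how the client is configured.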
