Open-Source Low-Code Platform - Microi吾码 - Scraping Engine
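Advantages
- This article gives a first look at the scraping engine; it will be updated over time with further scraping methods
- The scraping engine can capture a page's HTML before the DOM is rendered as well as after it is rendered
- The scraping engine can capture all of a page's resource requests and API requests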
Related open-source project
Open-source image wallpaper and short-video project built on the open-source low-code platform Microi吾码: https://microi.blog.csdn.net/article/details/144002079
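Interface engine code for scraping images and videos
In practice, the code below could be written more generically: if the selectors were passed in from the front end, a single interface could scrape any site. Alternatively, each website can have its own dedicated interface engine.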
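As a rough illustration of that more generic variant, separate from the site-specific engine code that follows, here is a minimal sketch. It assumes the front end sends a `Selectors` parameter (a hypothetical name, not confirmed by the platform documentation) holding rules in the same `Key`/`Selector`/`Script` shape, and it validates them before they are handed to the scraper. `normalizeSelectors` is an illustrative helper, not a Microi吾码 function.

```javascript
// Hypothetical helper (not part of Microi吾码): validates selector rules
// supplied by a caller and returns a cleaned copy.
function normalizeSelectors(input) {
  // Accept either an array or a JSON string, since a front end might send either.
  let rules = input;
  if (typeof input === 'string') {
    try {
      rules = JSON.parse(input);
    } catch (e) {
      return { ok: false, msg: 'Selectors is not valid JSON.' };
    }
  }
  if (!Array.isArray(rules) || rules.length === 0) {
    return { ok: false, msg: 'Selectors must be a non-empty array.' };
  }
  const cleaned = [];
  for (const rule of rules) {
    const valid = rule &&
      typeof rule.Key === 'string' && rule.Key !== '' &&
      typeof rule.Selector === 'string' && rule.Selector !== '' &&
      typeof rule.Script === 'string' && rule.Script !== '';
    if (!valid) {
      return { ok: false, msg: 'Each rule needs non-empty Key, Selector and Script strings.' };
    }
    // Copy only the three known fields so unexpected properties are dropped.
    cleaned.push({ Key: rule.Key, Selector: rule.Selector, Script: rule.Script });
  }
  return { ok: true, selectors: cleaned };
}

// Assumed usage inside an interface engine (V8.Param.Selectors is hypothetical):
//   var checked = normalizeSelectors(V8.Param.Selectors);
//   if (!checked.ok) { V8.Result = { Code : 0, Msg : checked.msg }; return; }
//   ...then pass checked.selectors as Selectors to V8.Spider.GetRenderHtml.
```

Because the `Script` strings are presumably evaluated against the scraped page, accepting them from a client deserves some care; restricting who may call such an interface, or allowing only a known set of scripts, may be a sensible addition.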
// Url is a required parameter.
if (!V8.Param.Url) {
    V8.Result = { Code : 0, Msg : 'Invalid parameters!' }; return;
}
var url = V8.Param.Url;
var headless = V8.Param.Headless;
var isCloseBrowser = V8.Param.IsCloseBrowser;
var isClosePage = V8.Param.IsClosePage;
var selectors = [];
var urlWebType = 'kuaishou';
// Only Kuaishou and Douyin short links are handled here, but any website can be supported as long as you can write a Selector for it.
if (url.indexOf('v.kuaishou.com') > -1) {
    urlWebType = 'kuaishou';
} else if (url.indexOf('v.douyin.com') > -1) {
    urlWebType = 'douyin';
} else {
    V8.Result = { Code : 0, Msg : 'Only Kuaishou and Douyin are currently supported.' }; return;
}
if (urlWebType == 'kuaishou') {
    if (V8.Param.ContentType == 'ShortVideo') {
        // Scrape Kuaishou videos. To be replaced later by dynamically configured scraping rules.
        selectors = [{
            Key : 'Author',
            Selector : '.short-video-info .short-video-info-container .profile-user-name .profile-user-name-title',
            Script : '(element) => element.innerText',
        },{
            Key : 'Title',
            Selector : '.short-video-info .short-video-info-container .short-video-info-container-detail .video-info-title',
            Script : '(element) => element.innerText',
        },{
            Key : 'FileUrls',
            Selector : '.kwai-player-container-video video',
            Script : '(element) => element.src',
        },{
            Key : 'Cover',
            Selector : '.short-video-detail .short-video-detail-container .short-video-wrapper .video-container-player',
            Script : '(element) => element.getAttribute(\'poster\')',
        }];
    } else {
        // Scrape Kuaishou images. To be replaced later by dynamically configured scraping rules.
        selectors = [{
            Key : 'Author',
            Selector : '.work-info.section .author .txt-wrapper .txt',
            Script : '(element) => element.innerText',
        },{
            Key : 'Title',
            Selector : '.work-info.section .desc',
            Script : '(element) => element.innerText',
        },{
            Key : 'FileUrls',
            Selector : '.long-image-container img, .swiper-container-item img',
            Script : '(element) => element.src',
        }];
    }
} else if (urlWebType == 'douyin') {
    if (V8.Param.ContentType == 'ShortVideo') {
        // Scrape Douyin videos. To be replaced later by dynamically configured scraping rules.
        selectors = [{
            Key : 'Author',
            Selector : '.video-detail .leftContainer img',
            Script : '(element) => element.alt',
        },{
            Key : 'Title',
            Selector : 'title',
            Script : 'el => el.textContent',
        },{
            Key : 'FileUrls',
            Selector : '.xg-video-container video source',
            // Alternative when selecting the <video> element instead:
            // 'el => el.querySelector(\'source\').getAttribute(\'src\')'
            Script : 'el => el.getAttribute(\'src\')',
        },{
            Key : 'Cover',
            Selector : 'meta[name=\'lark:url:video_cover_image_url\']',
            Script : 'el => el.content',
        }];
    } else {
        // Scrape Douyin images. To be replaced later by dynamically configured scraping rules.
        selectors = [{
            Key : 'Author',
            Selector : '.work-info.section .author .txt-wrapper .txt',
            Script : '(element) => element.innerText',
        },{
            Key : 'Title',
            Selector : '.work-info.section .desc',
            Script : '(element) => element.innerText',
        },{
            Key : 'FileUrls',
            Selector : '.long-image-container img, .swiper-container-item img',
            Script : '(element) => element.src',
        }];
    }
}
V8.Result = V8.Spider.GetRenderHtml({
    Headless : headless,
    IsCloseBrowser : isCloseBrowser,
    IsClosePage : isClosePage,
    Url : url,
    Selectors : selectors
    // ExecutablePath : 'D:\\Web\\microi-api\\publish\\Chrome\\Application\\109.0.5414.168',
    // VirtualWindows : true,
    // Selector : '.long-image-container img, .swiper-container-item img',
    // Script : '(element) => element.src',
    // ResponseUrlStart : 'https://m.yxixy.com/rest/wd/photo/info?'
});
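The comments in the code above note that the hard-coded branches should eventually become dynamically configured scraping rules. One possible shape for that, sketched under assumptions, is a rule table matched by URL fragment and content type. `SCRAPE_RULES`, `findSelectors`, the `UrlContains` field, and the `'Image'` label are hypothetical names used only for illustration, and the table is abbreviated to one selector per entry.

```javascript
// Hypothetical rule table. In a real setup these entries might be loaded
// from a database table or configuration instead of being hard-coded.
const SCRAPE_RULES = [
  {
    UrlContains: 'v.kuaishou.com',
    ContentType: 'ShortVideo',
    Selectors: [
      { Key: 'FileUrls', Selector: '.kwai-player-container-video video', Script: '(element) => element.src' }
    ]
  },
  {
    UrlContains: 'v.kuaishou.com',
    ContentType: 'Image',
    Selectors: [
      { Key: 'FileUrls', Selector: '.long-image-container img, .swiper-container-item img', Script: '(element) => element.src' }
    ]
  },
  {
    UrlContains: 'v.douyin.com',
    ContentType: 'ShortVideo',
    Selectors: [
      { Key: 'Title', Selector: 'title', Script: 'el => el.textContent' }
    ]
  }
];

// Returns the selector list of the first matching rule, or null if no rule matches.
function findSelectors(url, contentType, rules) {
  // Mirror the original branching: anything other than 'ShortVideo'
  // is treated as an image post.
  const wanted = contentType === 'ShortVideo' ? 'ShortVideo' : 'Image';
  const match = rules.find(function (rule) {
    return url.indexOf(rule.UrlContains) > -1 && rule.ContentType === wanted;
  });
  return match ? match.Selectors : null;
}
```

With a table like this, supporting another site would mean adding an entry rather than another `if` branch. The lookup result could be passed as `Selectors` to `V8.Spider.GetRenderHtml`, and a `null` result mapped to the existing "unsupported site" error.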
Microi吾码 documentation series
- Platform introduction: https://microi.blog.csdn.net/article/details/143414349
- One-click installation: https://microi.blog.csdn.net/article/details/143832680
- Quick start: https://microi.blog.csdn.net/article/details/143607068
- Running the source locally (back end): https://microi.blog.csdn.net/article/details/143567676
- Running the source locally (front end): https://microi.blog.csdn.net/article/details/143581687
- Docker deployment: https://microi.blog.csdn.net/article/details/143576299
- Form engine: https://microi.blog.csdn.net/article/details/143671179
- Module engine: https://microi.blog.csdn.net/article/details/143775484
- Interface engine: https://microi.blog.csdn.net/article/details/143968454
- Workflow engine: https://microi.blog.csdn.net/article/details/143742635
- UI engine: https://microi.blog.csdn.net/article/details/143972924
- Print engine: https://microi.blog.csdn.net/article/details/143973593
- V8 function list (front end): https://microi.blog.csdn.net/article/details/143623205
- V8 function list (back end): https://microi.blog.csdn.net/article/details/143623433
- V8.FormEngine usage: https://microi.blog.csdn.net/article/details/143623519
- Where condition usage: https://microi.blog.csdn.net/article/details/143582519
- DosResult explained: https://microi.blog.csdn.net/article/details/143870540
- Distributed storage configuration: https://microi.blog.csdn.net/article/details/143763937
- Custom Excel export: https://microi.blog.csdn.net/article/details/143619083
- Form engine, custom components: https://microi.blog.csdn.net/article/details/143939702
- Data source binding configuration for form controls: https://microi.blog.csdn.net/article/details/143767223
- Copying forms and modules to another database: https://microi.blog.csdn.net/article/details/143950112
- Pros and cons of traditional custom development versus low-code development: https://microi.blog.csdn.net/article/details/143866006
- Differences between the open-source, personal, and enterprise editions: https://microi.blog.csdn.net/article/details/143974752
- Becoming a partner: https://microi.blog.csdn.net/article/details/143974715
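Interface engine in practice: documentation series
- Interface engine in practice, sending SMS through a third-party provider: https://microi.blog.csdn.net/article/details/143990546
- Interface engine in practice, sending SMS through Alibaba Cloud: https://microi.blog.csdn.net/article/details/143990603
- Interface engine in practice, WeChat Mini Program login with an authorized phone number: https://microi.blog.csdn.net/article/details/144106817
- Interface engine in practice, placing a WeChat Pay v3 JSAPI order: https://microi.blog.csdn.net/article/details/144156119
- Interface engine in practice, WeChat Pay callback interface: https://microi.blog.csdn.net/article/details/144168810
- Interface engine in practice, MongoDB operations: https://microi.blog.csdn.net/article/details/144434527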