Commit 3b7f82c4 by chenweitao

Merge branch 'working' into 'master'

Working

See merge request !183
parents 4e2ff7cd eae35e40
@@ -14,26 +14,45 @@
 #### Data sources
-1. Weibo hot search
-2. Toutiao hot search
-3. Baidu Fengyunbang hot search
-4. Douyin hot search
-5. Zhihu hot search
-6. Zhihu hot list
-7. Zhihu hot list, digital category
-8. Zhihu hot list, international category
-9. Zhihu hot list, current affairs category
-10. Tencent News hot search
-11. Sina hot topics
-12. Sina hot search
-13. Sohu topics
-14. Ifeng News hot list
-15. Ifeng News hot search
-16. NetEase News hot list
-17. NetEase News trending comment threads
-18. Sogou WeChat hot search
-19. Weibo topics
-20. Weibo pre-trending list
+| No. | Hot list name |
+| :----: | :----: |
+| 1 | [Weibo hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboHotSearchCrawler.java) |
+| 2 | [Toutiao hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/ToutiaoHotSearchCrawler.java) |
+| 3 | [Baidu Fengyunbang hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/BaiDuHotSearchCrawler.java) |
+| 4 | [Douyin hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/DouyinHotSearchCrawler.java) |
+| 5 | [Zhihu hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/ZhihuHotSearchCrawler.java) |
+| 6 | [Zhihu hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/ZhihuTopicSearchCrawler.java) |
+| 7 | [Zhihu hot list, digital category](./src/main/java/com/zhiwei/searchhotcrawler/crawler/ZhihuChildHotSearchCrawler.java) |
+| 8 | [Tencent News hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/TengXunCrawler.java) |
+| 9 | [Sina hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/XinLangHotSearchCrawler.java) |
+| 10 | [Sohu topics](./src/main/java/com/zhiwei/searchhotcrawler/crawler/SouhuTopicCrawler.java) |
+| 11 | [Ifeng News hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/FengHuangSearchCrawler.java) |
+| 12 | [NetEase hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WangYiHotSearchCrawler.java) |
+| 13 | [NetEase trending comment threads](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WangYiHotSearchCrawler.java) |
+| 14 | [Sogou WeChat hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/SougoHotSearchCrawler.java) |
+| 15 | [Weibo topics](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboTopicCrawler.java) |
+| 16 | [Weibo pre-trending list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboHotSearchCrawler.java) |
+| 17 | [Bilibili overall popular](./src/main/java/com/zhiwei/searchhotcrawler/crawler/BiliComprehensiveHotCrawler.java) |
+| 18 | [Bilibili hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/BililiCrawler.java) |
+| 19 | [36Kr popularity list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/HotSearch36KrCrawler.java) |
+| 20 | [Huxiu recommended hot articles](./src/main/java/com/zhiwei/searchhotcrawler/crawler/HuXiuHotSearchCrawler.java) |
+| 21 | [Kuaishou hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/KuaiShouHotSearchCrawler.java) |
+| 22 | [Maimai hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/MaiMaiHotSearchCrawler.java) |
+| 23 | [Taobao hot search](./src/main/java/com/zhiwei/searchhotcrawler/crawler/TaoBaoHotSearchCrawler.java) |
+| 24 | [Weibo entertainment list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboEntertainmentCrawler.java) |
+| 25 | [Weibo top news list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboNewsCrawler.java) |
+| 26 | [Weibo breakout list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboOutCircleCrawler.java) |
+| 27 | [Weibo Super Topics](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiboSuperTopicCrawler.java) |
+| 28 | [Weishi hot list](./src/main/java/com/zhiwei/searchhotcrawler/crawler/WeiShiHotSearchCrawler.java) |
+| 29 | [Bilibili ranking](./src/main/java/com/zhiwei/searchhotcrawler/crawler/BililiCrawler.java) |
+| 30 | Sogou WeChat client hot search (collection paused) |
+| 31 | Sina hot topics (collection paused) |
+| 32 | Ifeng News hot search (collection paused) |
+| 33 | Tencent Jiaozhen (fact-check) list (collection paused) |
+| 34 | Zhihu hot search, international category (collection paused) |
+| 35 | Zhihu hot search, current affairs category (collection paused) |
+| 36 | Weibo search-box hot words (collection paused) |
+| 37 | Douyin same-city list (collected locally) |
 #### Mongo intranet
 192.168.0.101,192.168.0.106,192.168.0.108
......
@@ -80,6 +80,10 @@ public class WeiboOutCircleCrawler {
                 return result;
             } catch (Exception e) {
                 log.error("解析微博出圈榜出现解析错误,数据不是json结构", e); // "Weibo breakout list parse error: data is not JSON"
+            } finally {
+                if (result == null || result.size() == 0) { // size() is never negative, so "< 0" could never fire
+                    log.error("页面解析出现问题"); // "page parsing failed"
+                }
             }
         } else {
             log.info("解析微博出圈榜时出现解析错误,页面结构有问题"); // "Weibo breakout list parse error: unexpected page structure"
......
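Review note on the `finally` check in the hunk above: `size()` on a collection never returns a negative number, so a `size() < 0` condition is dead code. The sketch below shows the intended pattern in isolation. The class `EmptyResultCheck`, its `parse` method, and the comma-separated payload are hypothetical stand-ins for illustration; they are not taken from `WeiboOutCircleCrawler`, whose actual result type is not visible in this diff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a crawler parse step; not the project's actual class. */
final class EmptyResultCheck {

    /** True when parsing produced nothing usable. */
    static boolean isMissingOrEmpty(List<?> result) {
        // size() can never be negative, so test for null or empty instead.
        return result == null || result.isEmpty();
    }

    /** Splits a comma-separated payload into entries (assumed format, for illustration only). */
    static List<String> parse(String payload) {
        List<String> result = new ArrayList<>();
        try {
            // Throws NullPointerException when payload is null.
            for (String item : payload.split(",")) {
                String trimmed = item.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
            return result;
        } catch (RuntimeException e) {
            System.err.println("parse failed: " + e);
            return result;
        } finally {
            // Runs after both return statements above, on success and on failure.
            if (isMissingOrEmpty(result)) {
                System.err.println("page parsing produced no entries");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("a, b, c"));
        System.out.println(parse(null));
    }
}
```

Because the `finally` block also runs after a successful `return`, the empty-result message is logged for a well-formed page that simply has no entries. If that case should stay quiet, the check would belong in the `catch` branch instead.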