Commit 6c9f649a by zhiwei

Fix "https" appearing twice in Sogou WeChat search links

parent 9159a942
@@ -3,7 +3,7 @@
 <modelVersion>4.0.0</modelVersion>
 <groupId>com.zhiwei</groupId>
 <artifactId>wechat</artifactId>
-<version>1.3.0-SNAPSHOT</version>
+<version>1.3.1-SNAPSHOT</version>
 <description>
 Zhiwei WeChat collection program, including
 1. WeChat historical article collection
......
@@ -300,7 +300,11 @@ public class WechatAritcleSearch {
 for (Element element : elements) {
 try {
 title = element.select("div.txt-box").select("h3").text();
-link = "https://weixin.sogou.com" + element.select("div.txt-box").select("h3 >a").attr("href");
+link = element.select("div.txt-box").select("h3 >a").attr("href");
+if(!link.contains("https")){
+link = "https://weixin.sogou.com" + link;
+}
 content = "";
 if (element.select("p.txt-info").isEmpty()) {
 content = element.select("p.txt-info").text();
......
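Review note: the `link.contains("https")` check matches the substring anywhere in the href. A relative Sogou link that carries an encoded `https` target in its query string would therefore be left without a host. The sketch below shows one possible stricter variant that only looks at the scheme prefix. The class name `SogouLinks` and method name `toAbsolute` are hypothetical and not part of this commit; the base URL is the one used in the diff.

```java
// Hypothetical helper, not part of the commit: prefixes the Sogou host
// only when the href does not already start with an http(s) scheme.
final class SogouLinks {
    // Base URL copied from the diff above.
    private static final String BASE = "https://weixin.sogou.com";

    private SogouLinks() {
    }

    static String toAbsolute(String href) {
        // Nothing to normalize for a missing or empty attribute.
        if (href == null || href.isEmpty()) {
            return "";
        }
        // Case-insensitive prefix checks, so "HTTPS://..." is also treated as absolute.
        boolean hasHttps = href.regionMatches(true, 0, "https://", 0, 8);
        boolean hasHttp = href.regionMatches(true, 0, "http://", 0, 7);
        if (hasHttps || hasHttp) {
            return href;
        }
        // Assumes relative hrefs begin with "/", as in the links this diff handles.
        return BASE + href;
    }
}
```

With this variant, the patched lines could reduce to a single call such as `link = SogouLinks.toAbsolute(element.select("div.txt-box").select("h3 >a").attr("href"));`, assuming the helper is on the classpath.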