Commit fc1372d5 by zhiwei

Add README.md documentation file

parent ba2389f8
Zhiwei Search Engine Crawler JAR
==================
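##### Summary
> This is a search-engine crawler project that fetches and parses web pages using OKHttp and Jsoup. It currently supports keyword-based collection from three sources: Baidu News, Sogou News, and 360 News.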
##### maven
<dependency>
    <groupId>com.zhiwei</groupId>
    <artifactId>media_data_crawler</artifactId>
    <version>0.0.1-SNAPSHOT</version>
</dependency>
##### Usage demo
String word = "马云"; // search keyword
String startTime = "2017-03-01 00:00:00"; // start of the time window
String endTime = "2017-03-01 23:59:59"; // end of the time window
Proxy proxy = null; // proxy to route requests through; leave as null if not needed
// Baidu News crawl demo (keyword plus time window)
List<NewsData> baiduNewsList = DataCrawler.getBaiduNewsData(word, startTime, endTime, proxy);
// Sogou News keyword crawl demo
List<NewsData> sogouNewsList = DataCrawler.getSougouNewsData(word, proxy);
// 360 News crawl demo
List<NewsData> soNewsList = DataCrawler.getSoNewsData(word, proxy);
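
The demo passes `null` for the proxy and hard-codes the time window. If a proxy is needed, the inputs could be prepared roughly as in the sketch below. It assumes that `Proxy` in the demo is `java.net.Proxy` and that the time strings follow the `yyyy-MM-dd HH:mm:ss` pattern seen above; the README confirms neither. The class `CrawlerInputs`, its methods, and the host and port are hypothetical names used only for illustration and are not part of `media_data_crawler`.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of media_data_crawler.
public class CrawlerInputs {
    // Assumed format of the startTime / endTime strings, inferred from the demo values.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Builds an HTTP proxy object; host and port are placeholders.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Returns true when both strings parse and the end is not before the start.
    // Throws DateTimeParseException if either string does not match FORMAT.
    static boolean isValidWindow(String startTime, String endTime) {
        LocalDateTime start = LocalDateTime.parse(startTime, FORMAT);
        LocalDateTime end = LocalDateTime.parse(endTime, FORMAT);
        return !end.isBefore(start);
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("127.0.0.1", 8080);
        System.out.println(proxy.type());
        System.out.println(isValidWindow("2017-03-01 00:00:00", "2017-03-01 23:59:59"));
    }
}
```

The resulting `proxy` value would then take the place of `null` in the calls above, and checking the window before calling `getBaiduNewsData` avoids sending a request with a malformed or reversed time range.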