[Repost] Accessing a site through a proxy IP with Jsoup

Original post: https://blog.csdn.net/qq_36980713/article/details/80913248

import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Test {

    // Endpoint that returns proxy IPs. Remember to replace it: this one is from Mogu proxy (蘑菇代理),
    // and any other provider's endpoint works as well.
    private final static String GET_IP_URL = "http://piping.mogumiao.com/proxy/api/get_ip_bs?appKey=xxxxx&count=10&format=1";

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws InterruptedException {
        ExecutorService exe = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 1; i++) {
            Document doc;
            try {
                // ignoreContentType lets Jsoup accept a JSON response instead of HTML
                doc = Jsoup.connect(GET_IP_URL).ignoreContentType(true).get();
            } catch (IOException e) {
                continue;
            }
            System.out.println(doc.text());
            JSONObject jsonObject = JSONObject.fromObject(doc.text());
            List<Map<String, Object>> list = (List<Map<String, Object>>) jsonObject.get("msg");
            int count = list.size();
            for (Map<String, Object> map : list) {
                String ip = (String) map.get("ip");
                String port = String.valueOf(map.get("port"));
                exe.execute(new checkIp(ip, Integer.parseInt(port), count));
            }
            Thread.sleep(1000);
        }
        // Shut down only after every batch is submitted; the original called this inside the loop,
        // which would reject tasks from any later iteration.
        exe.shutdown();
    }
}

class checkIp implements Runnable {

    // The original passed a non-existent class (aaa.class) here.
    private static final Logger logger = LoggerFactory.getLogger(checkIp.class);

    // Shared by all worker threads, so plain int counters would race.
    private static final AtomicInteger suc = new AtomicInteger();
    private static final AtomicInteger total = new AtomicInteger();
    private static final AtomicInteger fail = new AtomicInteger();

    private static final String[] ua = {
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.87 Safari/537.36 OPR/37.0.2178.32",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586",
        "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
        "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)",
        "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
        "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0)",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 BIDUBrowser/8.3 Safari/537.36",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36 Core/1.47.277.400 QQBrowser/9.4.7658.400",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 UBrowser/5.6.12150.8 Safari/537.36",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36 SE 2.X MetaSr 1.0",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36 TheWorld 7",
        "Mozilla/5.0 (Windows NT 6.1; W…) Gecko/20100101 Firefox/60.0"
    };

    private String ip;
    private int port;
    private int count;

    public checkIp(String ip, int port, int count) {
        super();
        this.ip = ip;
        this.port = port;
        this.count = count;
    }

    @Override
    public void run() {
        // The last entry is truncated in the source, so it is never selected
        // (the original called nextInt(14) on this 15-element array).
        int i = new Random().nextInt(ua.length - 1);
        logger.info("Checking ------ {}:{}", ip, port);
        Map<String, String> map = new HashMap<String, String>();
        map.put("waybillNo", "DD1838768852");
        try {
            long a = System.currentTimeMillis();
            // Target site to crawl; remember to change this URL!
            Document doc = Jsoup.connect("http://trace.yto.net.cn:8022/TraceSimple.aspx")
                    .timeout(5000)
                    .proxy(ip, port) // Jsoup's Connection offers proxy(String host, int port); the original passed a third null argument
                    .data(map)
                    .ignoreContentType(true)
                    .userAgent(ua[i])
                    .header("referer", "http://trace.yto.net.cn:8022/gw/index/index.html") // remember to change this referer as well
                    .post();
            System.out.println(ip + ":" + port + " response time (ms): " + (System.currentTimeMillis() - a) + "   result: " + doc.text());
            suc.incrementAndGet();
        } catch (IOException e) {
            e.printStackTrace();
            fail.incrementAndGet();
        } finally {
            // Count a check as finished only here, so the summary prints after the last check completes.
            if (total.incrementAndGet() == count) {
                System.out.println("Total: " + total.get());
                System.out.println("Succeeded: " + suc.get());
                System.out.println("Failed: " + fail.get());
            }
        }
    }
}
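
The listing above checks each proxy on a pool thread and prints the totals once all checks are done. The tallying part can be tried in isolation, without a network or a proxy provider. The following is only a minimal sketch under stated assumptions: `ProxyCheckTally`, `tally`, and the `Predicate<String>` check are hypothetical names that are not in the original post, and the predicate stands in for the real Jsoup request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical helper, not part of the original post.
final class ProxyCheckTally {

    /**
     * Runs check on every address in parallel and returns {succeeded, failed}.
     * A check that throws a RuntimeException is counted as failed.
     */
    static int[] tally(List<String> addresses, Predicate<String> check) {
        AtomicInteger suc = new AtomicInteger();
        AtomicInteger fail = new AtomicInteger();
        ExecutorService exe = Executors.newFixedThreadPool(10);
        for (String addr : addresses) {
            exe.execute(() -> {
                boolean ok;
                try {
                    ok = check.test(addr);
                } catch (RuntimeException e) {
                    ok = false;
                }
                if (ok) {
                    suc.incrementAndGet();
                } else {
                    fail.incrementAndGet();
                }
            });
        }
        // Stop accepting tasks, then wait for the submitted ones before reading the counters.
        exe.shutdown();
        try {
            if (!exe.awaitTermination(1, TimeUnit.MINUTES)) {
                exe.shutdownNow();
            }
        } catch (InterruptedException e) {
            exe.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return new int[] {suc.get(), fail.get()};
    }

    public static void main(String[] args) {
        // Made-up addresses; the predicate pretends that only port 8080 answers.
        List<String> addrs = Arrays.asList("10.0.0.1:8080", "10.0.0.2:3128", "10.0.0.3:8080");
        int[] r = tally(addrs, a -> a.endsWith(":8080"));
        System.out.println("Succeeded: " + r[0] + ", failed: " + r[1]);
    }
}
```

Waiting on `awaitTermination` before reading the counters is one design choice among several; it avoids having each worker compare a shared counter against the expected count.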

