HttpClient v4.5: Simple Home Page Scraping

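        For work, I need to refresh a few web pages every half hour and check whether the data on them has been updated. Could this be automated? Searching for Java-related material turned up one keyword: HttpClient.

        HttpClient is a widely used HTTP client library, and there is plenty of material about it. However, much of what can be found online does not work with HttpClient 4.5, so some experimentation was still needed.

        Create a new Maven project (Maven 3) in Eclipse and configure pom.xml as follows: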

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>test</groupId>
    <artifactId>admin.test.httpclient</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>MvnTest</finalName>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.0.2</version>
                <configuration>
                    <source>1.5</source>
                    <target>1.5</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

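        After running "maven install" on pom.xml, four jars appear under "Maven Dependencies": httpclient itself plus its transitive dependencies httpcore, commons-logging and commons-codec.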

        To test it, send a GET request to a well-known website and see what comes back:

import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class HttpClientTest {
    private static final String HOST = "www.sina.com";
    private static final String BASE_URL = "http://" + HOST + "/";

    public static void main(String[] args) throws ClientProtocolException, IOException {
        CloseableHttpClient httpClient = HttpClients.createDefault();
        // Build the GET request; the URL must include the scheme, e.g. "http://"
        HttpGet getReq = new HttpGet(BASE_URL);
        // Set request headers to mimic the Chrome browser
        getReq.addHeader("Accept", "application/json, text/javascript, */*; q=0.01");
        getReq.addHeader("Accept-Encoding", "gzip,deflate,sdch");
        getReq.addHeader("Accept-Language", "zh-CN,zh;q=0.8");
        getReq.addHeader("Content-Type", "text/html; charset=UTF-8");
        getReq.addHeader("Host", HOST);
        getReq.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36");
        // Send the GET request
        CloseableHttpResponse rep = httpClient.execute(getReq);
        // Extract the page content from the HTTP response
        HttpEntity repEntity = rep.getEntity();
        String content = EntityUtils.toString(repEntity);
        // Print the page content
        System.out.println(content);
        // Close the response, then the client
        rep.close();
        httpClient.close();
    }
}

        The resulting page content:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--[5019,2,1] published at 2015-01-01 14:46:19 from #110 by 22-->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>WWW.SINA.COM</title>
<meta name="keywords" content="sina, ??°???" />
<meta name="description" content="??°???é??é??" />

<style type="text/css">
<!--
/* basic setup */
body, div, dl, dt, dd, ul, ol, li, h1, h2, h3, h4, h5, h6, form, fieldset, legend, input, textarea, p, blockquote, th, td {margin: 0; padding: 0;}
body {background: #ebebed url(http://ui.sina.com/assets/img/www/bg_gradient.gif) repeat-x; font-family: Arial, Helvetica, sans-serif; min-height: 100%;}
img {border: 0;}
em {position: absolute; left: -9999em;}
.clearDiv {clear: both;}
#wrap { padding: 50px 0 10px;margin:0 auto; width: 775px}

/* Header */
#header {position: relative; margin: 0 auto; width: 775px; border-bottom: 1px solid #ffa600;}
#header h1 {float: left; margin: 0; width: 275px; height: 50px; background: url(http://ui.sina.com/assets/img/www/sina_id_www.gif) no-repeat top left;}
#header ul {float: left; margin: 0; width: 500px; height: 50px; list-style: none; font-size: 12px; color: #333; text-transform: capitalize;}
#header ul li {float: right; margin: 30px 0 0 0;}
#header ul li a {color: #333; text-decoration: none;}
#header ul li a:hover {color: #ff9900; text-decoration: none;}

#map {position: relative; margin: 0; width: 775px; height: 248px;}

#channel {position: relative; margin: 0; width: 775px; border-bottom: 1px solid #ffa600;}

/* Footer */
#footer {position: relative; margin: 0 auto; width: 775px; border-top: 1px solid #ffa600;}
#footer ul {margin: 10px auto; padding: 0; width: 775px; list-style: none; font-size: 12px; color: #333; text-transform: capitalize; text-align: center;}
#footer ul li {display: inline; padding: 2px 5px;}
#footer ul li a {color: #333; text-decoration: none;}
#footer ul li a:hover {color: #ff9900; text-decoration: none;}

/* ads */
#ads {position: relative; margin: 5px 0; padding: 0; width: 775px;}
#ads ul {margin: 5px 0; width: 775px; list-style: none; text-align: center;}
#ads ul li.bnr728 {margin: 5px auto; padding: 0; width: 775px; height: 90px;}
#ads ul li.bnr545 {float: left; margin: 5px auto; padding: 0; width: 620px; height: 80px;}
#ads ul li.bnr120 {float: left; margin: 5px auto; padding: 0; width: 155px; height: 60px; line-height: 60px;}
#ads ul li.bnr120_2 {float: left; margin: 5px auto; padding: 0; width: 155px; height: 80px; line-height: 80px;}

-->
</style>

<!-- swfObject -->
<script type="text/javascript" src="http://ui.sina.com/assets/js/swfobject.js"></script>

<!-- btn.5 -->
<script type="text/javascript">
    var flashvars = {};
    var params = {};
    params.base = "";
    params.menu = "true";
    params.scale = "noscale";
    params.bgcolor = "#fff";
    params.quality = "best";
    // params.allowfullscreen = "true";
    params.salign = "c";
    params.wmode = "window";
    var attributes = {};
    swfobject.embedSWF("http://ui.sina.com/rm/toyota/091110/toyota_120x60_4_091110.swf", "btn5", "120", "60", "9.0.0", "expressInstall.swf", flashvars, params, attributes);

</script>
<!-- END -->

</head>
<body>

<!-- SUDA_CODE_START -->
<div style='position:absolute;top:0;left:0;width:0;height:0;z-index:1'><div style='position:absolute;top:0;left:0;width:1;height:1;'><iframe id='SUDA_FC' src='' width=1 height=1 SCROLLING=NO FRAMEBORDER=0></iframe></div><div style='position:absolute;top:0;left:0;width:0;height:0;visibility:hidden' id='SUDA_CS_DIV'></div></div>
<script type="text/javascript">
//<!--
var SSL={Config:{},Space:function(d){var b=d,c=null;b=b.split(".");c=SSL;for(i=0,len=b.length;i<len;i++){c[b[i]]=c[b[i]]||{};c=c[b[i]]}return c}};SSL.Space("Global");SSL.Space("Core.Dom");SSL.Space("Core.Event");SSL.Space("App");SSL.Global={win:window||{},doc:document,nav:navigator,loc:location};SSL.Core.Dom={get:function(a){return document.getElementById(a)}};SSL.Core.Event={on:function(){}};SSL.App={_S_gConType:function(){var a="";try{SSL.Global.doc.body.addBehavior("#default#clientCaps");a=SSL.Global.doc.body.connectionType}catch(b){a="unkown"}return a},_S_gKeyV:function(g,b,d,c){if(g==""){return""}if(c==""){c="="}b=b+c;var f=g.indexOf(b);if(f<0){return""}f=f+b.length;var a=g.indexOf(d,f);if(a<f){a=g.length}return g.substring(f,a)},_S_gUCk:function(a){if((undefined==a)||(""==a)){return""}return SSL.App._S_gKeyV(SSL.Global.doc.cookie,a,";","")},_S_sUCk:function(e,a,b,d){if(a!=null){if((undefined==d)||(null==d)){d="sina.com.cn"}if((undefined==b)||(null==b)||(""==b)){SSL.Global.doc.cookie=e+"="+a+";domain="+d+";path=/"}else{var c=new Date();var f=c.getTime();f=f+86400000*b;c.setTime(f);f=c.getTime();SSL.Global.doc.cookie=e+"="+a+";domain="+d+";expires="+c.toUTCString()+";path=/"}}},_S_gJVer:function(f,b){var e,a,g,c=1,d=0;if("MSIE"==b){a="MSIE";e=f.indexOf(a);if(e>=0){g=parseInt(f.substring(e+5));if(3<=g){c=1.1;if(4<=g){c=1.3}}}}else{if(("Netscape"==b)||("Opera"==b)||("Mozilla"==b)){c=1.3;a="Netscape6";e=f.indexOf(a);if(e>=0){c=1.5}}}return c},_S_gFVer:function(nav){var ua=SSL.Global.nav.userAgent.toLowerCase();var flash_version=0;if(SSL.Global.nav.plugins&&SSL.Global.nav.plugins.length){var p=SSL.Global.nav.plugins["Shockwave Flash"];if(typeof p=="object"){for(var i=10;i>=3;i--){if(p.description&&p.description.indexOf(" "+i+".")!=-1){flash_version=i;break}}}}else{if(ua.indexOf("msie")!=-1&&ua.indexOf("win")!=-1&&parseInt(SSL.Global.nav.appVersion)>=4&&ua.indexOf("16bit")==-1){for(var i=10;i>=2;i--){try{var object=eval("new
ActiveXObject('ShockwaveFlash.ShockwaveFlash."+i+"');");if(object){flash_version=i;break}}catch(e){}}}else{if(ua.indexOf("webtv/2.5")!=-1){flash_version=3}else{if(ua.indexOf("webtv")!=-1){flash_version=2}}}}return flash_version},_S_gMeta:function(b,c){var d=SSL.Global.doc.getElementsByName(b);var a=0;if(c>0){a=c}return(d.length>a)?d[a].content:""},_S_gHost:function(b){var a=new RegExp("^http(?:s)?://([^/]+)","im");if(b.match(a)){return b.match(a)[1].toString()}else{return""}},_S_gTJMTMeta:function(){return SSL.App._S_gMeta("mediaid")},_S_gTJZTMeta:function(){var a=SSL.App._S_gMeta("subjectid");a.replace(",",".");a.replace(";",",");return a},_S_isFreshMeta:function(){return false},_S_isIFrameSelf:function(b,a){if(SSL.Global.win.top==SSL.Global.win){return false}else{try{if(SSL.Global.doc.body.clientHeight==0){return false}if((SSL.Global.doc.body.clientHeight>=b)&&(SSL.Global.doc.body.clientWidth>=a)){return false}else{return true}}catch(c){return true}}},_S_isHome:function(b){var a="";try{SSL.Global.doc.body.addBehavior("#default#homePage");a=SSL.Global.doc.body.isHomePage(b)?"Y":"N"}catch(c){a="unkown"}return a}};function SUDA(I,h,g){var f=SSL.Global,y=SSL.Core.Dom,v=SSL.Core.Event,j=SSL.App;var F="webbug_meta_ref_mod_noiframe_async_fc_:9.12c",k="-9999-0-0-1";var b=f.nav.appName.indexOf("Microsoft Internet Explorer")>-1?"MSIE":f.nav.appName;var u=f.nav.appVersion;var q=f.loc.href.toLowerCase();var z=f.doc.referrer.toLowerCase();var p="";var n="",J="SUP",w="",t="Apache",x="SINAGLOBAL",r="ULV",G="UOR",s="_s_upa",a=320,l=240,H=0,o="",m="",M=0,K=10000,E=0,d="_s_acc";var C=q.indexOf("https")>-1?"https://":"http://",B="beacon.sina.com.cn",D=C+B+"/a.gif",L=C+B+"/e.gif";var e=100,c=2000;var A={_S_gsSID:function(){var N=j._S_gUCk(t);if(""==N){var O=new Date();N=Math.random()*10000000000000+"."+O.getTime();j._S_sUCk(t,N)}return N},_S_sGID:function(N){if(""!=N){j._S_sUCk(x,N,3650)}},_S_gGID:function(){return j._S_gUCk(x)},_S_gsGID:function(){var 
N=j._S_gUCk(x);if(""==N){N=A._S_gsSID();A._S_sGID(N)}return N},_S_gCid:function(){try{var N=j._S_gMeta("publishid");if(""!=N){var P=N.split(",");if(P.length>0){if(P.length>=3){k="-9999-0-"+P[1]+"-"+P[2]}return P[0]}}else{return"0"}}catch(O){return"0"}},_S_gAEC:function(){return j._S_gUCk(d)},_S_sAEC:function(N){if(""==N){return}var O=A._S_gAEC();if(O.indexOf(N+",")<0){O=O+N+","}j._S_sUCk(d,O,7)},_S_p2Bcn:function(R,Q){var P=new Date();var O=Q+"?"+R+"&gUid_"+P.getTime();var N=new Image();SUDA.img=N;N.src=O},_S_gSUP:function(){if(w!=""){return w}var P=unescape(j._S_gUCk(J));if(P!=""){var O=j._S_gKeyV(P,"ag","&","");var N=j._S_gKeyV(P,"user","&","");var Q=j._S_gKeyV(P,"uid","&","");var S=j._S_gKeyV(P,"sex","&","");var R=j._S_gKeyV(P,"dob","&","");w=O+":"+N+":"+Q+":"+S+":"+R;return w}else{return""}},_S_gsLVisit:function(P){var R=j._S_gUCk(r);var Q=R.split(":");var S="";if(Q.length>=6){if(P!=Q[4]){var O=new Date();var N=new Date(parseInt(Q[0]));Q[1]=parseInt(Q[1])+1;if(O.getMonth()!=N.getMonth()){Q[2]=1}else{Q[2]=parseInt(Q[2])+1}if(((O.getTime()-N.getTime())/86400000)>=7){Q[3]=1}else{if(O.getDay()<N.getDay()){Q[3]=1}else{Q[3]=parseInt(Q[3])+1}}S=Q[0]+":"+Q[1]+":"+Q[2]+":"+Q[3];Q[5]=Q[0];Q[0]=O.getTime();j._S_sUCk(r,Q[0]+":"+Q[1]+":"+Q[2]+":"+Q[3]+":"+P+":"+Q[5],360)}else{S=Q[5]+":"+Q[1]+":"+Q[2]+":"+Q[3]}}else{var O=new Date();S=":1:1:1";j._S_sUCk(r,O.getTime()+S+":"+P+":",360)}return S},_S_gUOR:function(){var N=j._S_gUCk(G);var O=N.split(":");if(O.length>=2){return O[0]}else{return""}},_S_sUOR:function(){var R=j._S_gUCk(G),W="",O="",V="",Q="";var X=/[&|?]c=spr(_[A-Za-z0-9]{1,}){3,}/;var S=new Date();if(q.match(X)){V=q.match(X)[0]}else{if(z.match(X)){V=z.match(X)[0]}}if(V!=""){V=V.substr(3)+":"+S.getTime()}if(R==""){if(j._S_gUCk(r)==""&&j._S_gUCk(r)==""){W=j._S_gHost(z);O=j._S_gHost(q)}j._S_sUCk(G,W+","+O+","+V,365)}else{var T=0,U=R.split(",");if(U.length>=1){W=U[0]}if(U.length>=2){O=U[1]}if(U.length>=3){Q=U[2]}if(V!=""){T=1}else{var P=Q.split(":");if(P.length>=2){var 
N=new Date(parseInt(P[1]));if(N.getTime()<(S.getTime()-86400000*30)){T=1}}}if(T){j._S_sUCk(G,W+","+O+","+V,365)}}},_S_gRef:function(){var N=/^[^\?&#]*.swf([\?#])?/;if((z=="")||(z.match(N))){var O=j._S_gKeyV(q,"ref","&","");if(O!=""){return O}}return z},_S_MEvent:function(){if(M==0){M++;var O=j._S_gUCk(s);if(O==""){O=0}O++;if(O<K){var N=/[&|?]c=spr(_[A-Za-z0-9]{2,}){3,}/;if(q.match(N)||z.match(N)){O=O+K}}j._S_sUCk(s,O)}},_S_gMET:function(){var N=j._S_gUCk(s);if(N==""){N=0}return N},_S_gCInfo_v2:function(){var N=new Date();return"sz:"+screen.width+"x"+screen.height+"|dp:"+screen.colorDepth+"|ac:"+f.nav.appCodeName+"|an:"+b+"|cpu:"+f.nav.cpuClass+"|pf:"+f.nav.platform+"|jv:"+j._S_gJVer(u,b)+"|ct:"+j._S_gConType()+"|lg:"+f.nav.systemLanguage+"|tz:"+N.getTimezoneOffset()/60+"|fv:"+j._S_gFVer(f.nav)},_S_gPInfo_v2:function(N,O){if((undefined==N)||(""==N)){N=A._S_gCid()+k}return"pid:"+N+"|st:"+A._S_gMET()+"|et:"+E+"|ref:"+escape(O)+"|hp:"+j._S_isHome(q)+"|PGLS:"+j._S_gMeta("stencil")+"|ZT:"+escape(j._S_gTJZTMeta())+"|MT:"+escape(j._S_gTJMTMeta())+"|keys:"},_S_gUInfo_v2:function(N){return"vid:"+N+"|sid:"+A._S_gsSID()+"|lv:"+A._S_gsLVisit(A._S_gsSID())+"|un:"+A._S_gSUP()+"|uo:"+A._S_gUOR()+"|ae:"+A._S_gAEC()},_S_gEXTInfo_v2:function(O,N){o=(undefined==O)?o:O;m=(undefined==N)?m:N;return"ex1:"+o+"|ex2:"+m},_S_pBeacon:function(R,Q,O){try{var T=A._S_gsGID();if(""==T){if(H<1){setTimeout(function(){A._S_pBeacon(R,Q,O)},c);H++;return}else{T=A._S_gsSID();A._S_sGID(T)}}var V="V=2";var S=A._S_gCInfo_v2();var X=A._S_gPInfo_v2(R,A._S_gRef());var P=A._S_gUInfo_v2(T);var N=A._S_gEXTInfo_v2(Q,O);var W=V+"&CI="+S+"&PI="+X+"&UI="+P+"&EX="+N;A._S_p2Bcn(W,D)}catch(U){}},_S_acTrack_i:function(N,P){if((""==N)||(undefined==N)){return}A._S_sAEC(N);if(0==P){return}var O="AcTrack||"+A._S_gGID()+"||"+A._S_gsSID()+"||"+A._S_gSUP()+"||"+N+"||";A._S_p2Bcn(O,L)},_S_uaTrack_i:function(P,N){var 
O="UATrack||"+A._S_gGID()+"||"+A._S_gsSID()+"||"+A._S_gSUP()+"||"+P+"||"+N+"||"+A._S_gRef()+"||";A._S_p2Bcn(O,L)}};if(M==0){if("MSIE"==b){SSL.Global.doc.attachEvent("onclick",A._S_MEvent);SSL.Global.doc.attachEvent("onmousemove",A._S_MEvent);SSL.Global.doc.attachEvent("onscroll",A._S_MEvent)}else{SSL.Global.doc.addEventListener("click",A._S_MEvent,false);SSL.Global.doc.addEventListener("mousemove",A._S_MEvent,false);SSL.Global.doc.addEventListener("scroll",A._S_MEvent,false)}}A._S_sUOR();return{_S_pSt:function(N,P,O){try{if((j._S_isFreshMeta())||(j._S_isIFrameSelf(l,a))){return}++E;A._S_gsSID();setTimeout(function(){A._S_pBeacon(N,P,O,0)},e)}catch(Q){}},_S_pStM:function(N,P,O){++E;A._S_pBeacon(N,((undefined==P)?A._S_upExt1():P),O)},_S_acTrack:function(N,P){try{if((undefined!=N)&&(""!=N)){setTimeout(function(){A._S_acTrack_i(N,P)},e)}}catch(O){}},_S_uaTrack:function(O,N){try{if(undefined==O){O=""}if(undefined==N){N=""}if((""!=O)||(""!=N)){setTimeout(function(){A._S_uaTrack_i(O,N)},e)}}catch(P){}},_S_gCk:function(N){return j._S_gUCk(N)},_S_sCk:function(Q,N,O,P){return j._S_sUCk(Q,N,O,P)},_S_gGlobalID:function(){return A._S_gGID()},_S_gSessionID:function(){return A._S_gsSID()}}}var GB_SUDA;if(GB_SUDA==null){GB_SUDA=new SUDA({})}var _S_PID_="";function _S_pSt(a,c,b){GB_SUDA._S_pSt(a,c,b)}function _S_pStM(a,c,b){GB_SUDA._S_pStM(a,c,b)}function _S_acTrack(a){GB_SUDA._S_acTrack(a,1)}function _S_uaTrack(b,a){GB_SUDA._S_uaTrack(b,a)}(function(){function a(b,e,d){var c=document.createElement("script");if(typeof e==="string"){c.charset=e}c.onreadystatechange=c.onload=function(){if(!this.readyState||this.readyState=="loaded"||this.readyState=="complete"){if(e&&typeof e==="function"){e()}if(d&&typeof d==="function"){d()}c.onreadystatechange=c.onload=null;c.parentNode.removeChild(c)}};c.src=b;document.getElementsByTagName("head")[0].appendChild(c)}a("http://d3.sina.com.cn/shh/ws/2012/xb/gladnews_run.js")})();
//-->
</script>
<script type="text/javascript">
//<!--
GB_SUDA._S_pSt("");
//-->
</script>
<noScript>
<div style='position:absolute;top:0;left:0;width:0;height:0;visibility:hidden'><img width=0 height=0 src='http://beacon.sina.com.cn/a.gif?noScript' border='0' alt='' /></div>
</noScript>
<!-- SUDA_CODE_END -->

<div id="wrap">
    <!-- Header -->
    <div id="header">
        <h1><em>??°???????????±???é???§?</em></h1>
        <ul>
        <li><a href="http://english.sina.com/index.html" onclick="_S_uaTrack('global_guide', 'english');">Sina English</a></li>
        </ul>
        <div class="clearDiv"></div>
    </div>

    <!-- Map -->
    <div id="map">
        <img src="http://ui.sina.com/assets/img/www/worldmap.jpg" alt="" name="map1" width="775" height="248" border="0" usemap="#Map1" id="Map1" />

<map name="Map1" id="">
<area shape="rect" coords="173,81,299,137" href="http://home.sina.com" target="_self" alt="????????°???" title="????????°???" onclick="_S_uaTrack('global_guide', 'us');" />
<area shape="rect" coords="468,81,572,129" href="http://www.sina.com.cn" target="_self" alt="????????°???" title="????????°???" onclick="_S_uaTrack('global_guide', 'beijing');" />
<area shape="rect" coords="482,145,578,184" href="http://www.sina.com.hk" target="_self" alt="é???????°???" title="é???????°???" onclick="_S_uaTrack('global_guide', 'hongkong');" />
<area shape="rect" coords="658,123,755,162" href="http://www.sina.com.tw" target="_self" alt="??°?????°???" title="??°?????°???" onclick="_S_uaTrack('global_guide', 'taipei');" />
</map>
    </div>

    <!-- Channels -->
    <div id="channel">
        <img src="http://ui.sina.com/assets/img/www/categories-120918.gif" alt="" width="775" height="44" border="0" usemap="#Map4"  id="Map4" />

<map name="Map4" id="">
<area shape="rect" target="_self" alt="??????" coords="4,3,76,35" href="http://us.weibo.com" onclick="_S_uaTrack('global_guide', 'weibo');" />
<area shape="rect" target="_self" alt="??????" coords="95,3,166,37" href="http://google.sina.com/" onclick="_S_uaTrack('global_guide', 'search');" />
<area shape="rect" target="_self" alt="è??é??" coords="171,2,241,38" href="http://video.sina.com" onclick="_S_uaTrack('global_guide', 'video');" />
<area shape="rect" target="_self" alt="??¤???" coords="257,3,328,39" href="http://match.sina.com/" onclick="_S_uaTrack('global_guide', 'match');" />
<area shape="rect" target="_self" alt="???é??" coords="432,3,496,36" href="http://travel.sina.com/" onclick="_S_uaTrack('global_guide', 'travel');" />
<area shape="rect" target="_self" alt="é??é??" coords="509,2,582,35" href="http://yp.sina.com/" onclick="_S_uaTrack('global_guide', 'yellow');" />
<area shape="rect" target="_self" alt="?????????" coords="590,2,679,33" href="http://sina.echineselearning.com/" onclick="_S_uaTrack('global_guide', 'chinese');" />
<area shape="rect" target="_self" alt="è?????" coords="335,3,417,38" href="http://bbs.sina.com/" onclick="_S_uaTrack('global_guide', 'bbs');" />
<area shape="rect" target="_self" alt="??????" coords="688,1,772,35" href="http://deals.sina.com" onclick="_S_uaTrack('global_guide', 'deals');" />
</map>
    </div>

    <!-- ads (banners/buttons) -->
    <div id="ads">
        <ul>
            <li class="bnr728"><!-- Row 1 . 728x90 -->
<script type="text/javascript">
//<![CDATA[
ord = window.ord || Math.floor(Math.random()*1E16);
document.write('<script type="text/javascript" src="http://ad.doubleclick.net/adj/us.homepage/;pos=top;sz=728x90;ord=' + ord + '?"><\/script>');
//]]>
</script>
<noscript><a href="http://ad.doubleclick.net/jump/us.homepage/;pos=top;sz=728x90;ord=123456789?" target="_blank" ><img src="http://ad.doubleclick.net/ad/us.homepage/;pos=top;sz=728x90;ord=123456789?" border="0" alt="" /></a></noscript>
<!-- END . Row 1 . 728x90 -->

</li>

            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/2.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/3.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/4.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/5.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/6.js"></script></li>

            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/7.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/8.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/9.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/10.js"></script></li>
            <li class="bnr120"><script type="text/javascript" src="http://dailynews.sina.com/gb/ads/www/120_60/11.js"></script></li>

        </ul>

        <div class="clearDiv"></div>
    </div>
    <!-- END . ads -->

    <!-- Footer -->
    <div id="footer">
        <ul>
        <li><a href="http://corp.sina.com.cn/eng/">About SINA</a></li>
        <li>|</li>
        <li><a href="http://corp.sina.com.cn/eng/sina_rela_eng.htm">Investor</a></li>
        <li>|</li>
        <li><a href="http://mediakit.sina.com/">Media Kit</a></li>
        <li>|</li>
        <li><a href="http://mediakit.sina.com/contact.html">Comments or Question?</a></li>
        <br /><br />
        <li class="copyright">Copyright &copy; 1996-2015 SINA Corporation, All Rights Reserved</li>
        </ul>
    </div>

</div>

<!--floating video-->
<div id="flvideo">
<script type="text/javascript" src="http://dailynews.sina.com/gb/ads/common/floatingvideo.js"></script>
</div>

<!-- START Nielsen Online SiteCensus V6.0 -->
<script type="text/javascript" src="//secure-us.imrworldwide.com/v60.js"></script>
<script type="text/javascript">
var pvar = { cid: "us-sina", content: "0", server: "secure-us" };
var feat = { surveys_enabled: 1, sample_rate: 0.1 };
var trac = nol_t(pvar, feat);
trac.record().post().do_sample();
</script>
<noscript>
<div>
<img src="//secure-us.imrworldwide.com/cgi-bin/m?ci=us-sina&amp;cg=0&amp;cc=1&amp;ts=noscript" width="1" height="1" alt="" />
</div>
</noscript>
<!-- END Nielsen Online SiteCensus V6.0 -->

</body>
</html>
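
        The garbled keywords and link titles in the output above are consistent with UTF-8 bytes being decoded as ISO-8859-1, which is the charset `EntityUtils.toString(HttpEntity)` falls back to when the response does not declare one. The standalone sketch below (plain JDK, Java 7 or later, no HttpClient; `CharsetDemo` and its methods are names made up for illustration) reproduces the effect on a short sample string and shows that decoding as UTF-8 restores it. In the program above, the two-argument overload `EntityUtils.toString(repEntity, "UTF-8")` would supply UTF-8 as the default charset; whether that is the actual cause here is an assumption, since the response headers were not recorded.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    /** Decodes raw body bytes as ISO-8859-1 (the wrong choice for a UTF-8 page). */
    static String decodeAsLatin1(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    /** Decodes raw body bytes as UTF-8, the charset the page declares. */
    static String decodeAsUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters used purely as sample text (illustrative input).
        byte[] body = "\u65b0\u6d6a".getBytes(StandardCharsets.UTF_8);
        // Each character occupies three bytes, so the wrong decoding yields six symbols.
        System.out.println("ISO-8859-1: " + decodeAsLatin1(body));
        System.out.println("UTF-8:      " + decodeAsUtf8(body));
    }
}
```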

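        OK, that is all it takes to fetch a site's home page. Current versions of HttpClient handle gzip-encoded responses well: the body is decompressed internally, so the caller does not need to do anything special.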
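
        To make the gzip remark concrete, here is a JDK-only sketch (Java 8 or later) of the decompression step that a caller would otherwise have to perform by hand when a response arrives with `Content-Encoding: gzip`. It is not HttpClient's internal code: `GzipDemo`, `gzip` and `gunzip` are illustrative names, and the compressed bytes are produced locally rather than fetched from a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipDemo {
    /** Compresses text the way a server might before sending it. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decompresses a gzip-encoded body back into text. */
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate a compressed response body, then recover the original markup.
        byte[] wire = gzip("<html><title>WWW.SINA.COM</title></html>");
        System.out.println(gunzip(wire));
    }
}
```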

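        The next step is a desktop application that polls the page every few minutes and notifies the user whether the part of the content they care about has been updated.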
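
        The polling idea could be sketched with the JDK's `ScheduledExecutorService` (Java 8 or later). This is only an outline under assumptions: `PagePoller` is a hypothetical class, the `fetcher` stands in for whatever code downloads the page (for example the HttpClient call shown earlier), and notifying the user is reduced to printing a line. A change is detected by comparing a SHA-256 digest of the current content with the digest from the previous poll.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class PagePoller {
    private final Supplier<String> fetcher; // hypothetical: returns the page text
    private byte[] lastDigest;              // digest of the previously seen content

    PagePoller(Supplier<String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Fetches once; returns true if the content differs from the previous poll. */
    synchronized boolean pollOnce() {
        byte[] digest = sha256(fetcher.get());
        // The very first poll only records a baseline and reports no change.
        boolean changed = lastDigest != null && !Arrays.equals(lastDigest, digest);
        lastDigest = digest;
        return changed;
    }

    private static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Starts polling at a fixed interval; the caller shuts the scheduler down. */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                if (pollOnce()) {
                    System.out.println("Page content changed");
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log and continue.
                System.err.println("Poll failed: " + e);
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in fetcher whose "page" changes once per second.
        PagePoller poller = new PagePoller(() -> "tick " + System.currentTimeMillis() / 1000);
        ScheduledExecutorService scheduler = poller.start(500, TimeUnit.MILLISECONDS);
        Thread.sleep(2500);
        scheduler.shutdownNow();
    }
}
```

        For the half-hour refresh described at the start, the call would be `start(30, TimeUnit.MINUTES)`. Hashing the whole page reports any change at all, including rotating ads, so a real tool would first extract the relevant fragment and hash only that.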

Reposted from: https://www.cnblogs.com/dsdk2008/p/4745243.html
