Tools:
Python 3.6
Fiddler 4
Required libraries:
requests
BeautifulSoup (the bs4 package, with lxml as the parser)
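First, capture the login traffic and look at what the login request needs:
The authenticity_token value is embedded in the page returned by /login. It is randomly generated, so it has to be fetched before logging in.
The request also needs the cookie set by that page.
The login form has action = '/session',
so the POST target URL is 'https://github.com/session'.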
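The form's action is a relative path, so the POST URL is that path resolved against the URL of the login page. A minimal sketch of that step using only the standard library (the variable names are illustrative, not part of the original script):

```python
from urllib.parse import urljoin

# URL of the page that contains the login form
login_page = "https://github.com/login"
# Value of the form's action attribute, as seen in the capture
form_action = "/session"

# Resolve the relative action against the page URL
post_url = urljoin(login_page, form_action)
print(post_url)  # https://github.com/session
```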
# coding: utf-8
import requests
from bs4 import BeautifulSoup

url = 'https://github.com/login'
url2 = 'https://github.com/session'
# First request /login to get the cookie and the authenticity_token
r = requests.get(url)
html = BeautifulSoup(r.text, 'lxml')
# Get the cookies
cookie = r.cookies
# Take the value of the first <input name="authenticity_token">
authen = html.find('input', {'name': 'authenticity_token'})['value']
# List the form data to submit
postdata = {
    'commit': 'Sign in',
    'utf8': '✓',
    'authenticity_token': authen,
    'login': '********',
    'password': '********',
}
# Set up the request headers
header = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36'),
    'Referer': 'https://github.com/login',
    'Connection': 'keep-alive',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
# With the headers and cookie prepared, send the login request
r = requests.post(url2, data=postdata, headers=header, cookies=cookie)
# Save the returned page to a file
with open('123.html', 'w', encoding='utf-8') as f:
    f.write(r.text)
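The script above passes the cookie from the first response to the second request by hand. A possible variant, offered as a sketch rather than the author's method, is to let a requests.Session carry the cookies, since a Session keeps them across the requests it makes. The names TokenParser, find_token and login are hypothetical helpers. The token is extracted here with the standard library's HTMLParser instead of BeautifulSoup so that the helper has no third-party dependency. The sketch assumes the login page still contains the form fields described above.

```python
from html.parser import HTMLParser

LOGIN_URL = "https://github.com/login"
SESSION_URL = "https://github.com/session"


class TokenParser(HTMLParser):
    """Remember the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if (self.token is None and tag == "input"
                and attr.get("name") == "authenticity_token"):
            self.token = attr.get("value")


def find_token(html):
    """Return the authenticity_token found in the login page HTML."""
    parser = TokenParser()
    parser.feed(html)
    if parser.token is None:
        raise ValueError("authenticity_token not found in the login page")
    return parser.token


def login(session, username, password):
    """Log in through `session`, expected to behave like requests.Session.

    The session stores the cookies set by the GET and sends them
    again with the POST, so no cookie handling is needed here.
    """
    page = session.get(LOGIN_URL)
    payload = {
        "commit": "Sign in",
        "utf8": "✓",
        "authenticity_token": find_token(page.text),
        "login": username,
        "password": password,
    }
    return session.post(SESSION_URL, data=payload)


# Usage (assumes the requests package is installed):
#   import requests
#   with requests.Session() as s:
#       resp = login(s, "your-username", "your-password")
#       print(resp.status_code)
```

Because login only relies on the get and post methods of the object it receives, it can be tried out with a stand-in session before pointing it at the real site.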