
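Scraping Bilibili streamer avatars with Selenium and saving them locally under each streamer's nickname

Note: the material comes from the internet and from books; these are study notes compiled through reading, practice, and tidying up.

Using Python's Selenium test automation to fetch Bilibili streamers' avatars and save them to local files named after the streamers' nicknames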


Method 1

Fetching through the API

First, install the requests package with pip

pip install requests

import os
import requests

# API endpoint that returns the list of live rooms as JSON
url = 'https://api.live.bilibili.com/xlive/web-interface/v1/second/getList?platform=web&parent_area_id=1&area_id=0&sort_type=sort_type_152&page=1'
# Send a GET request and keep the response
response = requests.get(url)
# Directory the images are saved to; create it if it does not exist yet
save_dir = '../requests/bilibili/'
os.makedirs(save_dir, exist_ok=True)
# Parse the response as JSON and take the list of room entries
info = response.json()['data']['list']
for i in info:
    # Name each image after the streamer's nickname
    with open(save_dir + '{}.png'.format(i['uname']), 'wb') as file:
        # Download the avatar and write it to disk; the file is closed automatically
        file.write(requests.get(i['face']).content)
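
Nicknames are used directly as file names above, so a nickname containing a character such as `/`, `:` or `*` would make `open()` fail or write to an unexpected path. The sketch below is one possible way to clean a nickname first; `safe_filename` and its `fallback` parameter are hypothetical helpers introduced here, not part of the original script.

```python
import re


def safe_filename(nickname: str, fallback: str = "unnamed") -> str:
    """Return a version of nickname that is safe to use as a file name.

    Characters that Windows or POSIX file systems reject (and control
    characters) are replaced with an underscore. Leading and trailing
    spaces and dots are stripped because Windows does not allow them.
    """
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", nickname).strip(" .")
    # An empty result (for example a nickname made only of spaces) falls back
    # to a placeholder so that the file still gets a usable name.
    return cleaned or fallback
```

With this helper, the file name in the loop could be built as `safe_filename(i['uname']) + '.png'` instead of formatting the raw nickname.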

Method 2

Fetching by locating elements in the HTML

First, install the requests and selenium packages with pip

pip install requests
pip install selenium

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

# Open Chrome through the Chrome driver
driver = webdriver.Chrome()
# Visit the Bilibili live page
driver.get('https://live.bilibili.com/p/eden/area-tags?visit_id=2mwktlg4e2q0&areaId=0&parentAreaId=1')
# Loop 30 times, saving one avatar per iteration
for i in range(1, 31):
    # XPath of the avatar element
    image_xpath = '/html/body/div[1]/div[3]/div/ul/li[{}]/a/div[1]/div/div'.format(i)
    # Read the element's style attribute (find_element_by_xpath was removed in Selenium 4.3)
    image_style_value = driver.find_element(By.XPATH, image_xpath).get_attribute('style')
    # Slice the image URL out of the style value: from 'http' up to the '@' suffix
    image_url = image_style_value[image_style_value.find('http'):image_style_value.find('@')]
    # XPath of the nickname element
    title_xpath = '/html/body/div[1]/div[3]/div/ul/li[{}]/a/div[2]/div[2]/div/span'.format(i)
    # Read the element's title attribute, which holds the nickname
    name_title_value = driver.find_element(By.XPATH, title_xpath).get_attribute('title')
    print(image_url)
    # Send a GET request to download the image
    response = requests.get(image_url)
    # Save the image under the nickname; this directory must already exist
    with open('D:/Python Projects/requests/bilibili/{}.jpg'.format(name_title_value), 'wb') as file:
        file.write(response.content)
# Close the browser once all avatars are saved
driver.quit()
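
Slicing the style string by the positions of `'h'` and `'@'` works only while the style value has exactly the expected shape. As a rough alternative, the URL could be pulled out of the `url(...)` part with a regular expression. This is a sketch under assumptions: `avatar_url_from_style` is a hypothetical helper, and the style strings in the usage note are made-up examples, not values captured from the page.

```python
import re
from typing import Optional


def avatar_url_from_style(style: str) -> Optional[str]:
    """Extract the image URL from a CSS style value such as
    'background-image: url("https://host/face.jpg@100w.webp");'.

    Returns None when the style contains no url(...) part.
    """
    match = re.search(r'url\(\s*["\']?(.*?)["\']?\s*\)', style)
    if match is None:
        return None
    # Drop the resizing suffix that follows '@', as the slicing code does.
    url = match.group(1).split("@", 1)[0]
    # Protocol-relative URLs ('//host/path') need a scheme before requests can use them.
    if url.startswith("//"):
        url = "https:" + url
    return url
```

For example, `avatar_url_from_style('background-image: url("https://example.com/face/a.jpg@100w.webp");')` yields `'https://example.com/face/a.jpg'`, and a style value without `url(...)` yields `None`, which the loop can skip instead of requesting an empty URL.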

Finally, here is an additional script that fetches statistics for Bilibili videos

# coding:utf-8
import time

import bs4
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'
}
# av numbers collected from all search result pages
result = []


def get_aid(page):
    """Append the av numbers found on one search results page to result."""
    # The search keyword 爬蟲 means "web crawler"
    url = 'https://search.bilibili.com/all?keyword=爬蟲&from_source=nav_search&spm_id_from=333.851.b_696e7465726e6174696f6e616c486561646572.11' + '&page=' + str(page)
    # verify=False skips TLS certificate verification, as in the original script
    response = requests.get(url, headers=headers, verify=False).text
    time.sleep(1)
    try:
        soup = bs4.BeautifulSoup(response, 'lxml').find('div', attrs={'id': 'all-list'}).find('div', attrs={'class': 'mixin-list'})
        ul = soup.find('ul', attrs={'class': 'video-list clearfix'}).find_all('li', attrs={'class': 'video-item matrix'})
        for item in ul:
            # The span text looks like 'av12345'; strip the prefix to get the number
            info = item.find('div', attrs={'class': 'headline clearfix'}).find('span', attrs={'class': 'type avid'}).get_text()
            aid = info.replace('av', '')
            print(aid)
            result.append(aid)
    except Exception:
        # Typically find() returned None because the page layout did not match
        print('something is wrong')
    return result


def get_contents(url):
    """Request the video info API and print the statistics of one video."""
    response = requests.get(url=url, headers=headers, verify=False).json()
    time.sleep(1)
    try:
        data = response['data']['stat']
        print('Video ID', data['aid'])
        print('Views', data['view'])
        print('Coins', data['coin'])
        print('Favorites', data['favorite'])
        print('Likes', data['like'])
        print('Shares', data['share'])
        print('Danmaku', data['danmaku'])
    except Exception:
        # The response had no usable 'data'/'stat' entry
        print('------------')


if __name__ == '__main__':
    for i in range(1, 50):
        # get_aid appends to the global result list. The original reassigned
        # result here, which replaced the list with None when a page failed.
        get_aid(i)
    for i in result:
        url = 'https://api.bilibili.com/x/web-interface/view?aid=' + str(i)
        get_contents(url)
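
Printing the fields inside `get_contents` makes the statistics hard to reuse or test. One option is to separate parsing from the network call, so that the parsing can be checked against a plain dictionary. The sketch below assumes the response layout used by the script above (`data` → `stat`); `extract_stats` and `STAT_FIELDS` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Optional

# Fields that the script above reads from the 'stat' object
STAT_FIELDS = ("aid", "view", "coin", "like", "favorite", "share", "danmaku")


def extract_stats(payload: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the statistics fields from a decoded API response.

    Returns None when the payload has no 'data' -> 'stat' dictionary,
    for example when the API reports an error. Fields missing from
    'stat' are returned as None rather than raising KeyError.
    """
    stat = (payload.get("data") or {}).get("stat")
    if not isinstance(stat, dict):
        return None
    return {field: stat.get(field) for field in STAT_FIELDS}
```

`get_contents` could then call `extract_stats(response)` and print or store the returned dictionary, skipping videos for which it returns `None`.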
