Building a web image crawler: download every image on a web page and pack them into a zip in just 5 seconds

2019-01-29 09:10 Source: user submission


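We often need shared resources from the internet, and images are one such resource. How do you download the images on a web page in bulk? A page can hold a great many images, and if you want all of them, right-clicking and saving them one at a time is clearly not a good option. Below is a web crawler written in Java that grabs the images on a page. Fellow programmers who like it, remember to bookmark it.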

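Prerequisites: you need to know Java development. The core jar used is Jsoup, which is easy to find and download online. I have also built an online tool with a user interface for grabbing images; if you are interested, you can open it at the address given at the end of this article.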

[Screenshot: result of crawling the first photo site found at random on the web]

The implementation is as follows:

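// Imports needed by the methods below (Jsoup must be on the classpath)
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
 * User-Agent that imitates a real browser request.
 * The methods below read it as Const.UserAgent.
 */
public final static String UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.26 Safari/537.36 Core/1.63.6821.400 QQBrowser/10.3.3040.400";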

/**
 * Collects every image address on a page and packs the images into a zip file.
 *
 * @param zfilepath path of the zip file to write
 * @param url       address of the web page
 * @param pp        img attribute that holds the image address, usually "src"
 * @return true if at least one image address was found (individual downloads may still fail)
 */
public static boolean getImgSrc(String zfilepath, String url, String pp) {
    boolean isb = false;
    // Open a connection with Jsoup
    Connection connect = Jsoup.connect(url).timeout(5000);
    connect.header("Connection", "Keep-Alive");
    connect.header("Content-Type", "application/x-www-form-urlencoded");
    connect.header("Accept-Encoding", "gzip, deflate, sdch");
    connect.header("Accept", "*/*");
    connect.header("User-Agent", Const.UserAgent);
    ZipOutputStream out = null;
    try {
        // Fetch the Document object
        Document document = connect.ignoreContentType(true).timeout(5000).get();
        // Find all img tags
        Elements imgs = document.getElementsByTag("img");
        File zipfile = new File(zfilepath);
        out = new ZipOutputStream(new FileOutputStream(zipfile));
        int i = 1;
        List<String> listimg = new ArrayList<String>();
        for (Element element : imgs) {
            // Read each img tag's URL; the "abs:" prefix resolves it to an absolute URL
            String imgSrc = element.attr("abs:" + pp);
            listimg.add(imgSrc);
        }
        listimg = removeCf(listimg);
        if (listimg != null && listimg.size() > 0) {
            for (int x = 0; x < listimg.size(); x++) {
                long stime = System.currentTimeMillis();
                String imgSrc = listimg.get(x);
                // Print the URL
                System.out.println(imgSrc);
                // Download the image into the zip
                boolean is = downImages(imgSrc, out);
                long etime = System.currentTimeMillis();
                float alltime = (float) (etime - stime) / 1000;
                // Per-image status record (built but not used further in this listing)
                Map<String, String> rest = new HashMap<String, String>();
                rest.put("img", imgSrc);
                rest.put("time", alltime + "");
                rest.put("num", i + "");
                rest.put("status", "true");
                if (is) {
                    rest.put("http", "success");
                } else {
                    rest.put("http", "failure");
                }
                i++;
            }
            Map<String, String> rest1 = new HashMap<String, String>();
            rest1.put("status", "true");
            rest1.put("msg", "Packing complete");
            System.out.println("Download complete");
            isb = true;
        } else {
            Map<String, String> rest1 = new HashMap<String, String>();
            rest1.put("status", "true");
            rest1.put("msg", "No data fetched; the site may be blocking crawlers");
            // The original listing calls client.sendEvent("chatevent", rest1) here, but
            // `client` is not declared anywhere in it, so the message is printed instead.
            System.out.println(rest1.get("msg"));
        }
    } catch (IOException e) {
        e.printStackTrace();
        Map<String, String> rest = new HashMap<String, String>();
        rest.put("status", "false");
    } finally {
        // The original also caught InterruptedException, which nothing in the try
        // block throws (a compile error for a checked exception), so it is dropped.
        try {
            if (out != null) {
                out.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return isb;
}

/**
 * Downloads one image and writes it into the zip stream.
 *
 * @param imgUrl    image URL
 * @param outStream zip stream that receives the image
 * @return true if the image was downloaded and written
 */
public static boolean downImages(String imgUrl, ZipOutputStream outStream) {
    boolean is = false;
    // Take the file name from the last path segment of the URL
    String fileName = imgUrl.substring(imgUrl.lastIndexOf('/') + 1);
    try {
        // The file name may contain Chinese characters or spaces, so it has to be
        // encoded. URLEncoder, however, turns a space into a plus sign,
        String urlTail = URLEncoder.encode(fileName, "UTF-8");
        // so each plus sign is converted to %20
        imgUrl = imgUrl.substring(0, imgUrl.lastIndexOf('/') + 1) + urlTail.replaceAll("\\+", "%20");
        // Validate the image format so that dynamic images are still fetched
        // (vidImg is not shown in this article; it returns "" to reject a file)
        fileName = vidImg(fileName);
        if (fileName.equals("")) {
            return is;
        }
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    InputStream in = null;
    try {
        // Build the image URL
        URL url = new URL(imgUrl);
        // Open the connection
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestProperty("User-Agent", Const.UserAgent);
        // Set a 10-second connect timeout
        connection.setConnectTimeout(10 * 1000);
        // Read the response body (readInputStream is not shown in this article)
        in = connection.getInputStream();
        byte[] data = readInputStream(in);
        outStream.putNextEntry(new ZipEntry(fileName));
        outStream.write(data);
        is = true;
        return is;
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            outStream.closeEntry();
            // in stays null when the connection fails, so guard the close
            if (in != null) {
                in.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return is;
}
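The file-name handling in downImages depends on a detail of URLEncoder: it produces form encoding, where a space becomes "+", while a URL path needs "%20". The following is a minimal standalone sketch of just that step, not the author's code; the class name FileNameEncoding, the method encodeLastSegment and the example.com addresses are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FileNameEncoding {

    // Re-encodes only the last path segment of an image URL (hypothetical helper).
    public static String encodeLastSegment(String imgUrl) {
        int cut = imgUrl.lastIndexOf('/') + 1;
        String fileName = imgUrl.substring(cut);
        try {
            // URLEncoder targets form encoding, so a space comes out as "+"
            // and a literal "+" comes out as "%2B".
            String encoded = URLEncoder.encode(fileName, "UTF-8");
            // Every remaining "+" therefore stands for a space; a URL path wants "%20".
            return imgUrl.substring(0, cut) + encoded.replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://example.com/img/my%20photo.jpg
        System.out.println(encodeLastSegment("http://example.com/img/my photo.jpg"));
        // prints http://example.com/img/a%2Bb%20c.jpg
        System.out.println(encodeLastSegment("http://example.com/img/a+b c.jpg"));
    }
}
```

One limitation of this approach: a file name that is already percent-encoded is encoded a second time, because "%" itself becomes "%25".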

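/**
 * Removes duplicate image addresses while keeping their original order.
 *
 * @param list image addresses, possibly with duplicates
 * @return a new list in which each address appears once
 */
public static List<String> removeCf(List<String> list) {
    List<String> listTemp = new ArrayList<String>();
    for (int i = 0; i < list.size(); i++) {
        if (!listTemp.contains(list.get(i))) {
            listTemp.add(list.get(i));
        }
    }
    return listTemp;
}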
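The listing calls two helpers, readInputStream and vidImg, that the article does not show. Below is one possible sketch of them, written only from how they are called: readInputStream returns the stream's bytes, and vidImg returns a file name or "" to reject it. The extension whitelist, the query-string stripping and the class name ImageHelpers are assumptions, not the author's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class ImageHelpers {

    // Assumed whitelist; the article does not say which formats are accepted.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp");

    // Hypothetical readInputStream: drains the stream into a byte array.
    public static byte[] readInputStream(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Hypothetical vidImg: returns a cleaned file name, or "" if it is not an image.
    public static String vidImg(String fileName) {
        // Assumption: dynamic image URLs carry a query string or fragment; drop it.
        int q = fileName.indexOf('?');
        if (q >= 0) {
            fileName = fileName.substring(0, q);
        }
        int h = fileName.indexOf('#');
        if (h >= 0) {
            fileName = fileName.substring(0, h);
        }
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            // Require at least one character before the extension
            if (lower.endsWith(ext) && lower.length() > ext.length()) {
                return fileName;
            }
        }
        return "";
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(data.length);                    // prints 3
        System.out.println(vidImg("cat.jpg?w=200"));        // prints cat.jpg
        System.out.println(vidImg("page.html").isEmpty());  // prints true
    }
}
```

To use these with the listing, the two methods would be copied into the same class as downImages, since the listing calls them without a class prefix.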

If you like it, remember to bookmark it.

I have already published this tool. The address is: http://www.yzcopen.com/img/imgdown
