Take a link obtained from Xiaohongshu's "Copy link" option as an example:
http://xhslink.com/JDk1s
The PHP code below extracts the image set from it:
$userAgent = "Mozilla/5.0 (Linux; Android 5.0; SM-G900P Build/LRX21T) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Mobile Safari/537.36";
$header = [
    'User-Agent: ' . $userAgent,
];
$url = 'http://xhslink.com/JDk1s';
//$url = 'http://xhslink.com/Mvo2s';
$content = curlGet($url, $header, $userAgent);
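// Assumption: the state JSON ends at the closing </script> tag. The original pattern had lost its terminator.
if (preg_match('#window\.__INITIAL_SSR_STATE__=(.*?)</script>#s', $content, $match)) {
    // The page embeds JavaScript's undefined, which is not valid JSON, so replace it with an empty string
    $str = str_replace('undefined', '""', $match[1]);
    $result = json_decode($str, true);
    // Guard against a failed decode or a missing NoteView key
    $note = is_array($result) ? ($result['NoteView'] ?? []) : [];
    $noteType = $note['noteType'] ?? '';
    // Parse an image set
    if ($noteType == 'normal') {
        $imageData = $note['content']['imageList'] ?? [];
        $images = [];
        foreach ($imageData as $info) {
            $images[] = 'https:' . $info['url'];
        }
        print_r($images);
    }
    // Parse a video
    if ($noteType == 'video') {
        $videoUrl = $note['content']['video']['url'] ?? '';
        print_r($videoUrl);
    }
}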
function curlGet($url = '', $header = [], $userAgent = '') {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}
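The parsing step above can also be wrapped in a function so that it can be tried without any network access. The following is only a minimal sketch: the name `extractMedia` is hypothetical, the sample page is hand-made, and the pattern assumes the state object ends at the closing `</script>` tag. The keys (`NoteView`, `noteType`, `imageList`, `video`) are the ones used in the code above.

```php
<?php
// Hypothetical helper (not part of the original code): returns the media URLs found in a page body.
function extractMedia(string $html): array
{
    // Assumption: the state JSON runs up to the closing </script> tag.
    if (!preg_match('#window\.__INITIAL_SSR_STATE__=(.*?)</script>#s', $html, $match)) {
        return [];
    }
    // JavaScript's undefined is not valid JSON, so swap it for an empty string.
    $state = json_decode(str_replace('undefined', '""', $match[1]), true);
    if (!is_array($state)) {
        return [];
    }
    $note = $state['NoteView'] ?? [];
    $type = $note['noteType'] ?? '';
    if ($type === 'normal') {
        // Image set: the URLs are protocol-relative, so prepend the scheme.
        $urls = [];
        foreach ($note['content']['imageList'] ?? [] as $info) {
            if (!empty($info['url'])) {
                $urls[] = 'https:' . $info['url'];
            }
        }
        return $urls;
    }
    if ($type === 'video') {
        $videoUrl = $note['content']['video']['url'] ?? '';
        return $videoUrl !== '' ? [$videoUrl] : [];
    }
    return [];
}

// Hand-made sample page for illustration only.
$sample = '<script>window.__INITIAL_SSR_STATE__={"NoteView":{"noteType":"normal",'
    . '"content":{"imageList":[{"url":"//example.com/a.jpg"}]},"extra":undefined}}</script>';
print_r(extractMedia($sample));
```

Note that replacing every occurrence of `undefined` is a blunt approach: if the note text itself contains that word, the JSON will no longer decode and the function returns an empty array.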
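Parsing the shared web page yields high-resolution images, and if the shared link points to a video, the video can be retrieved as well. However, because this approach parses the publicly accessible page, that is, the page Xiaohongshu itself displays, both videos and images come with a watermark. This is the common approach that many people use today.
So how can the watermark be removed? One option is to use a third-party API. The code is as follows: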
// appkey generated in the developer console at https://www.vnil.cn
$appkey = '';
// The URL to parse
$url = '';
$param = [
    'appkey' => $appkey,
    'url' => $url,
];
// Resulting request URL: https://api.vnil.cn/api/parse/deal?appkey=appkey&url=url
$apiUrl = 'https://api.vnil.cn/api/parse/deal?'.http_build_query($param);
$ch = curl_init();
curl_setopt ( $ch, CURLOPT_URL, $apiUrl );
curl_setopt ( $ch, CURLOPT_SSL_VERIFYPEER, 0 );
curl_setopt ( $ch, CURLOPT_SSL_VERIFYHOST, 0 );
curl_setopt ( $ch, CURLOPT_MAXREDIRS, 5 );
curl_setopt ( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt ( $ch, CURLOPT_FOLLOWLOCATION, 1 );
curl_setopt ( $ch, CURLOPT_TIMEOUT, 10 );
$content = curl_exec( $ch );
curl_close ( $ch);
print_r($content);
With the code above, you can obtain the watermark-free images directly.
If you are interested, give it a try.
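The API call prints the raw response body. If the response is JSON (an assumption here, since the response format is not shown in this post), it may be worth decoding it defensively before use. The sketch below uses a hypothetical helper named `decodeJsonBody`, and the sample bodies are made up for illustration; they do not reflect the API's real field names.

```php
<?php
// Hypothetical helper: turns a raw response body into an array, or null on failure.
function decodeJsonBody($body): ?array
{
    // curl_exec() returns false when the request fails.
    if (!is_string($body) || $body === '') {
        return null;
    }
    $data = json_decode($body, true);
    // Reject invalid JSON as well as valid JSON that is not an object or array.
    if (json_last_error() !== JSON_ERROR_NONE || !is_array($data)) {
        return null;
    }
    return $data;
}

// Made-up bodies for illustration only.
var_dump(decodeJsonBody('{"ok":true}'));        // decoded array
var_dump(decodeJsonBody(false));                // null: the request failed
var_dump(decodeJsonBody('<html>error</html>')); // null: not JSON
```

In the code above, this would be called as `decodeJsonBody($content)` in place of `print_r($content)`, with a null result treated as a failed request.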