Without further ado, here is the code:
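<?php
session_start();
header("Content-type: text/html;charset=utf-8");
// Replace "URL" with the address of the feed to fetch.
$contents = file_get_contents("URL");
// With PREG_SET_ORDER, $result[$i][1] holds the captured text of match number $i.
preg_match_all("/<\/?title>(.*?)<\/?title>/", $contents, $result, PREG_SET_ORDER);
preg_match_all("/<\/?pubDate>(.*?)<\/?pubDate>/", $contents, $resultdate, PREG_SET_ORDER);
// echo $contents; // uncomment to inspect the raw feed
echo "Liberty Times news ticker";
// Print at most the first 10 titles, each followed by its publication date.
$count = min(10, count($result));
for ($i = 0; $i < $count; $i++)
{
    echo ($i + 1) . ". " . $result[$i][1];
    echo "<br>";
    echo $resultdate[$i][1] ?? "";
    echo "<br>";
}
?>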
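Note that in a preg_match_all pattern the dot (.) does not match newline characters by default, so if the text you want to capture spans line breaks, you need to add the s modifier.
For example,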
preg_match_all("/<\/?title>(.*?)<\/?title>/", $contents, $result, PREG_SET_ORDER);
has to be changed to
preg_match_all("/<\/?title>(.*?)<\/?title>/s", $contents, $result, PREG_SET_ORDER);
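To see the difference the s modifier makes, here is a minimal sketch. The `$feed` string is a made-up sample (not taken from any real feed) whose title text spans two lines.

```php
<?php
// Hypothetical sample: the title text contains a newline.
$feed = "<title>Line one\nLine two</title>";

// Without the s modifier, "." stops at the newline, so nothing matches.
$without = preg_match_all("/<title>(.*?)<\/title>/", $feed, $m1, PREG_SET_ORDER);

// With the s modifier, "." also matches the newline.
$with = preg_match_all("/<title>(.*?)<\/title>/s", $feed, $m2, PREG_SET_ORDER);

echo "without s: ", $without, PHP_EOL; // prints "without s: 0"
echo "with s: ", $with, PHP_EOL;       // prints "with s: 1"
```

preg_match_all returns the number of full matches, so comparing the two return values is a quick way to check whether a missing modifier is why a pattern comes back empty.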
Alternatively, I found that scraping a page through DOM operations is also quite pleasant:
<?php
$xml = <<< XML
<?xml version="1.0" encoding="utf-8"?>
<books>
<book>Patterns of Enterprise Application Architecture</book>
<book>Design Patterns: Elements of Reusable Software Design</book>
<book>Clean Code</book>
</books>
XML;
$dom = new DOMDocument;
$dom->loadXML($xml);
$books = $dom->getElementsByTagName('book');
foreach ($books as $book) {
echo $book->nodeValue, PHP_EOL;
}
?>
Source: http://php.net/manual/en/domdocument.getelementsbytagname.php
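The same DOM approach can probably replace the title and pubDate regular expressions from the first script. The sketch below assumes an RSS-like structure with `<item>` elements. The `$rss` string and its headlines are invented for illustration, and a real feed may be laid out differently.

```php
<?php
// Hypothetical RSS fragment used only for illustration.
$rss = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<item><title>First headline</title><pubDate>Mon, 01 Jan 2024 08:00:00 +0800</pubDate></item>
<item><title>Second headline</title><pubDate>Mon, 01 Jan 2024 09:00:00 +0800</pubDate></item>
</channel>
</rss>
XML;

$dom = new DOMDocument;
$dom->loadXML($rss);

$rows = array();
foreach ($dom->getElementsByTagName('item') as $item) {
    // Each <item> in this sample holds exactly one <title> and one <pubDate>.
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    $date = $item->getElementsByTagName('pubDate')->item(0)->nodeValue;
    $rows[] = array($title, $date);
    echo $title, " (", $date, ")", PHP_EOL;
}
```

Because the title and date are read from the same `<item>` node, they stay paired even if some other element in the feed also carries a `<title>`.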
Or do it directly with the PHP Simple HTML DOM Parser library:
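<?php
// Requires the PHP Simple HTML DOM Parser library (simple_html_dom.php).
include 'simple_html_dom.php';
// file_get_html() already returns a parsed DOM object,
// so a separate str_get_html() call is not needed.
$html = file_get_html('http://localhost/get.php');
foreach ($html->find('tr') as $element)
{
    // Collect the header cells (<th>) of this row.
    $th = array();
    foreach ($element->find('th') as $cell)
    {
        $th[] = $cell->plaintext;
    }
    print_r($th);
    // Collect the data cells (<td>) of this row.
    $td = array();
    foreach ($element->find('td') as $cell)
    {
        $td[] = $cell->plaintext;
    }
    print_r($td);
}
?>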
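If the Simple HTML DOM library is not available, roughly the same table extraction can be sketched with PHP's built-in DOM extension. The `$page` markup below is a made-up stand-in for whatever get.php returns, and real pages with malformed HTML may trigger libxml warnings that this sketch does not handle.

```php
<?php
// Hypothetical table markup standing in for the fetched page.
$page = '<table><tr><th>Name</th><th>Price</th></tr>'
      . '<tr><td>Tea</td><td>30</td></tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($page);

$table = array();
foreach ($dom->getElementsByTagName('tr') as $tr) {
    $cells = array();
    // childNodes covers both <th> and <td> children of the row.
    foreach ($tr->childNodes as $cell) {
        if ($cell->nodeType === XML_ELEMENT_NODE) {
            $cells[] = trim($cell->textContent);
        }
    }
    $table[] = $cells;
}
print_r($table);
```

This version collects header and data cells into one array per row, which may be easier to work with than two separate print_r calls.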