How do I scrape a web page for hyperlinks when I can't find a regular expression pattern that works?


Using DOMDocument to Display Fetch Results

Description

You may come across a web page that is hard to scrape with the standard mADL builder method on $t->rows().

In this case we want to retrieve all the hyperlinks from a web directory page, but the page is dynamically built and no regular expression pattern reliably captures the links for the fetch.

You can use PHP's DOMDocument class to resolve this.

You still use $t->Fetch(), but you add the result to an array and then iterate over the array to output it to the screen.
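As a rough sketch of the idea, the snippet below parses an HTML string with DOMDocument, collects every non-empty href into an array, and then iterates over that array to print the links. The helper name extractLinks() and the sample markup in $html are hypothetical and only illustrate the technique; in practice the markup would come from $t->Fetch(), whose exact return value is not shown here.

```php
<?php
// Hypothetical helper (not part of the framework): collect every
// non-empty href from an HTML string.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();

    // Real-world pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        // getAttribute() returns an empty string when href is missing.
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Stand-in markup (assumption); normally this is the fetched page content.
$html = '<html><body>'
      . '<a href="https://example.com/a">A</a>'
      . '<a name="top">no href</a>'
      . '<a href="/b">B</a>'
      . '</body></html>';

// Iterate over the collected links and write them to the screen.
foreach (extractLinks($html) as $link) {
    $safe = htmlspecialchars($link);
    echo '<a href="' . $safe . '">' . $safe . "</a><br>\n";
}
```

Collecting the links into an array first keeps the parsing separate from the output, so the same array can be filtered or de-duplicated before it is displayed.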

Examples



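$wsDom = new DOMDocument();
// Assumption: $html holds the page markup obtained through $t->Fetch().
@$wsDom->loadHTML($html); // @ hides warnings about malformed markup
foreach ($wsDom->getElementsByTagName('a') as $thisLink) {
    $href = $thisLink->getAttribute('href');
    if ($href != "") {
        // Using the URL itself as the link text is an assumption.
        echo "<a href=\"" . htmlspecialchars($href) . "\">" . htmlspecialchars($href) . "</a><br>\n";
    }
}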