I'm trying to extract values from the table below. I tried curl with a regex (I know it's not recommended) and a DOM parser separately, but neither gave me the values I need.
There are multiple rows in the page, so I'll need to use a foreach. I need an exact match of the structure below.
<td width="75" style="NS">
<img src="NS" width="64" alt="INEEDTHISVALUE">
<td style="NS">
<a href="NS">NS</a>
NS = non-static values. They change for every <td> and <a> because the table is colored with inline CSS. They may contain special characters such as ; and /, as well as digits and letters.
I'm using the simple_html_dom class, which can be found here: http://htmlparsing.com/php.html
The code below gets every td, but I need more specific output (the row structure is shown above).
What I've tried so far:
// Attempt 1: simple_html_dom, prints every td on the page
$html = file_get_html("URL");
foreach($html->find('td') as $td) {
    echo $td."<br>";
}
// Attempt 2: curl + regex
$site = "URL";
$ch = curl_init();
$hc = "YahooSeeker-Testing/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; Yahoo! Search - Web Search)";
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($ch, CURLOPT_URL, $site);
curl_setopt($ch, CURLOPT_USERAGENT, $hc);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$site = curl_exec($ch);
preg_match_all('@<tr><td width="75" style="(.*?)"><img src="/folder/link/(.*?)" width="64" alt="(.*?)"></td><td style="(.*?)"><a href="/folder2/link2/(.*?)">(.*?)</a></td><td style="(.*?)">(.*?)</td></tr>@', $site, $arr);
var_dump($arr); // returns empty array, WHY?
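For comparison, the structure described above can also be targeted with PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom or a regex. This is only a minimal sketch, assuming each row has a td with width="75" holding the img, followed directly by a td holding the link. The sample markup, the function name extractRows, and the array keys are made up for illustration and are not from the real page.

```php
<?php
// Hypothetical sample mirroring the row structure in the question;
// the style, src and href values are placeholders for the "NS" parts.
$sample = '<table>'
    . '<tr><td width="75" style="color:red;"><img src="/folder/link/a.png" width="64" alt="First"></td>'
    . '<td style="color:red;"><a href="/folder2/link2/a">Link A</a></td></tr>'
    . '<tr><td width="75" style="color:#00f;"><img src="/folder/link/b.png" width="64" alt="Second"></td>'
    . '<td style="color:#00f;"><a href="/folder2/link2/b">Link B</a></td></tr>'
    . '</table>';

// extractRows is an illustrative helper name, not part of any library.
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Match only <img width="64"> directly inside <td width="75">,
    // ignoring the non-static style and src values.
    foreach ($xpath->query('//td[@width="75"]/img[@width="64"]') as $img) {
        $cell = $img->parentNode;
        // The link is expected in the td immediately after the image cell.
        $link = $xpath->query('following-sibling::td[1]/a', $cell)->item(0);

        $rows[] = [
            'alt'  => $img->getAttribute('alt'),
            'href' => $link ? $link->getAttribute('href') : null,
            'text' => $link ? trim($link->textContent) : null,
        ];
    }

    return $rows;
}

$rows = extractRows($sample);
foreach ($rows as $row) {
    echo $row['alt'] . "<br>";
}
```

Because the XPath expression keys on the width attributes only, the changing inline styles and paths do not need to be matched, and whitespace or line breaks between tags do not matter the way they do in the regex above.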