I'd like to import an HTML document into a MySQL database using PHP.
The structure of the document looks like this:
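<p class="word">
<span class="word-text">word1</span>
<span class="grammatical-type">noun</span>
</p>
...
<p class="word">
<span class="word-text">word128</span>
<span class="grammatical-type">adjective</span>
</p>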
For each word, I only have one word-text and one grammatical-type.
I'm able to find each word node, but for each of its children (word-text and grammatical-type) I'd like to perform a MySQL query:
$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // do something here for each *word-text*->nodeValue
    // do something here for each *grammatical-type*->nodeValue
}
Inside the foreach loop, I tried passing $textNode, which is a DOMNode, as the $contextNode, as follows:
$wordText = $xpath->query("span[@class='word-text']", $textNode);
$myWord = $wordText->nodeValue;
But in $wordText I only have a DOMNodeList with a NULL nodeValue.
How can I, starting from the word node, access its child nodes?
Thanks
CodePudding user response:
Solved.
query() always returns a DOMNodeList, even when only one node matches. Since you know the node contains a single matching element, select it with item(0):
$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    $wordTextNode = $xpath->query("span[@class='word-text']", $textNode);
    $word = $wordTextNode->item(0)->nodeValue;
    // do the same thing here for each *grammatical-type*
}
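For completeness, here is a minimal sketch that reads both children of each word node with the same context-node approach. The helper name extractWords and the inline sample markup are only for illustration, and the class names are the ones used in the XPath expressions from the question. It also guards against item(0) returning null when a span is missing.

```php
<?php
// Hypothetical helper: returns [word-text, grammatical-type] pairs from the markup.
function extractWords(string $html): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $rows = [];
    foreach ($xpath->query("//p[@class='word']") as $wordNode) {
        // Relative paths are resolved against the context node (second argument).
        $text = $xpath->query("span[@class='word-text']", $wordNode)->item(0);
        $type = $xpath->query("span[@class='grammatical-type']", $wordNode)->item(0);

        // item(0) returns null when nothing matched, so skip incomplete entries.
        if ($text === null || $type === null) {
            continue;
        }
        $rows[] = [trim($text->nodeValue), trim($type->nodeValue)];
    }
    return $rows;
}

// Sample markup mirroring the structure described in the question.
$html = '<html><body>'
      . '<p class="word"><span class="word-text">word1</span><span class="grammatical-type">noun</span></p>'
      . '<p class="word"><span class="word-text">word128</span><span class="grammatical-type">adjective</span></p>'
      . '</body></html>';

$rows = extractWords($html);
```

Each pair in $rows could then be bound to a prepared INSERT statement (with PDO or mysqli, for example), using whatever table and column names your schema has.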
CodePudding user response:
You can pass a different node as the context in your $xpath->query calls:
<?php
$location = 'so-dom.html';
$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DomXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // The second argument makes the query relative to $textNode.
    echo $xpath->query('./a/text()', $textNode)[0]->nodeValue;
}
?>
Given this document:
<head></head>
<body>
<p class="word"><a>one</a></p>
<p class="word"><a>two</a></p>
</body>
the script prints "onetwo".
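As a possible alternative, DOMXPath::evaluate() with an XPath string() expression returns a PHP string directly, so there is no node list to index. This is only a sketch using the same sample document, loaded from an inline string rather than a file; the $words array is there just to collect the results.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML(
    '<html><head></head><body>'
    . '<p class="word"><a>one</a></p>'
    . '<p class="word"><a>two</a></p>'
    . '</body></html>'
);
$xpath = new DOMXPath($dom);

$words = [];
foreach ($xpath->query("//p[@class='word']") as $wordNode) {
    // string(...) makes evaluate() return a string; it is '' when nothing matches.
    $words[] = $xpath->evaluate('string(./a)', $wordNode);
}

echo implode('', $words); // prints "onetwo"
```

The trade-off is that a missing element yields an empty string instead of null, so check for '' if you need to detect absent nodes.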
