I'm using HtmlAgilityPack in .NET (C#) and have some HTML like this:
<p><ol><li>A bunch of text</li></ol><em>some em text</em> more text here.</p>
I load it into an HtmlDocument with LoadHtml and write it out with Save, but I end up with:
<p><ol><li>A bunch of text</li></ol><em>some em text</em> more text here.
The last closing p tag is gone.
Why is this happening, and how can I fix it?
Answer:
As others said in the comments, this is invalid HTML: a <p> element cannot contain an <ol>. That is probably why the HtmlDocument class drops the final </p> when you write the file with the Save method. As a workaround, you can write document.Text to the output location yourself using the System.IO.File class. Keep in mind that document.Text holds the source text as it was loaded, so any changes you make to the nodes afterwards will not show up in the file.
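using System.IO;
using HtmlAgilityPack;

var html = "<p><ol><li>A bunch of text</li></ol><em>some em text</em> more text here.</p>";
var document = new HtmlDocument();
document.LoadHtml(html);
// Write the original source text instead of calling document.Save
File.WriteAllText("insert_your_path_here", document.Text);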
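If you want to see what the parser is doing with your markup, a rough diagnostic sketch like the one below might help. It assumes the HtmlAgilityPack NuGet package is referenced. It prints whatever parse errors were recorded, then the serialized node tree (the same thing Save writes) next to the original text. I have not included expected output, since it may differ between library versions.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var html = "<p><ol><li>A bunch of text</li></ol><em>some em text</em> more text here.</p>";
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // ParseErrors lists the problems the parser recorded while reading the input (may be empty).
        foreach (var error in document.ParseErrors)
        {
            Console.WriteLine($"{error.Code} at line {error.Line}: {error.Reason}");
        }

        // Save(TextWriter) serializes the parsed node tree, which is what Save writes to a file.
        using (var writer = new StringWriter())
        {
            document.Save(writer);
            Console.WriteLine("Serialized: " + writer.ToString());
        }

        // Text is the source string that was passed to LoadHtml.
        Console.WriteLine("Original:   " + document.Text);
    }
}
```

If the two printed strings differ, the change comes from parsing and serialization rather than from the file write. In that case the cleaner long-term fix is probably to make the markup valid, for example by moving the <ol> out of the <p>.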
