Defect #1607

Issues with $htmlize.get or #h

Added by John Makkink 3 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Category: velocity
Target version:
Start date: 10/28/2025
Due date:
% Done: 0%
Estimated time:
Browser (if web client):

Description

Hi Patrick,

I am sorry to open yet another ticket about htmlize.get().
As far as I can see, the problem occurs when the HTML contains an &lt; (<).

Template

<html>
<body>
<table>
#foreach($record in $fsPatient)
   <tr>
     <td>$!record.notes</td>
     <td>$!htmlize.get($!record.notes)</td>
     <td>#h($!record.notes)</td>
   </tr>
#end
</table>
</body>
</html>

result with version 4.0.1:

<html>
<body>
<table>
   <tr>
     <td>&lt;div&gt;&lt;strong&gt;Hello&lt;/strong&gt; with &lt;em&gt;some&lt;/em&gt; &lt;u&gt;simple&lt;/u&gt; markup&lt;/div&gt;</td>
     <td><div><strong>Hello</strong> with <em>some</em> <u>simple</u> markup</div></td>
     <td> <div><strong>Hello</strong> with <em>some</em> <u>simple</u> markup</div></td>
   </tr>
   <tr>
     <td>&lt;div&gt;&lt;strong&gt;Hello&lt;/strong&gt; with &lt;em&gt;some&lt;/em&gt; &lt;u&gt;simple&lt;/u&gt; markup with GT symbol &gt; in it&lt;/div&gt;</td>
     <td><div><strong>Hello</strong> with <em>some</em> <u>simple</u> markup with GT symbol > in it</div></td>
     <td> <div><strong>Hello</strong> with <em>some</em> <u>simple</u> markup with GT symbol > in it</div></td>
   </tr>
   <tr>
     <td>&lt;div&gt;&lt;strong&gt;Hello&lt;/strong&gt; with &lt;em&gt;some&lt;/em&gt; &lt;u&gt;simple&lt;/u&gt; markup with LT symbol &lt; in it&lt;/div&gt;</td>
     <td>&lt;div&gt;&lt;strong&gt;Hello&lt;/strong&gt; with &lt;em&gt;some&lt;/em&gt; &lt;u&gt;simple&lt;/u&gt; markup with LT symbol &lt; in it&lt;/div&gt;</td>
     <td> &amp;lt;div&amp;gt;&amp;lt;strong&amp;gt;Hello&amp;lt;/strong&amp;gt; with &amp;lt;em&amp;gt;some&amp;lt;/em&amp;gt; &amp;lt;u&amp;gt;simple&amp;lt;/u&amp;gt; markup with LT symbol &amp;lt; in it&amp;lt;/div&amp;gt;</td>
   </tr>
</table>
</body>
</html>

I made 3 rows as examples:
1. without any < or > symbol
2. with a > (greater-than) symbol
3. with a < (less-than) symbol

I didn't check 4.0.2 because I saw no change in it that could possibly fix this.

To show you what the issue is, I made a small sample solution.
This sample solution was made with Servoy 2024.3.7.

In this sample solution there are 2 buttons:
1) read template
-> will read the template from media file
2) evaluateWithContext
-> will use the template from step 1
-> build context object
-> run method: plugins.VelocityReport.evaluateWithContext

Kind regards,
John Makkink
eFertility


Files

vr_sample_lt_symbol.servoy (7.42 KB) John Makkink, 10/28/2025 12:21 PM

History

#1

Updated by Patrick Talbot 3 months ago

Hi John!

Thanks for the sample solution. The problem is that once the characters are unescaped, the resulting HTML is not well-formed.
If you tried to create a PDF from it, you would get an exception and the PDF would not be produced.
So I've added some code to the $htmlize.get() function (the #h macro is just a shortcut to it) that checks whether the markup is proper xHTML (well-formed HTML that conforms to XML rules). That way people still get a PDF result, even though they will see the tags: at least they can check their data and correct it if needed.

To check that the HTML is well-formed, I run an XML parser on it after unescaping the &gt; &lt; &amp; &nbsp; &quot; HTML entities into their corresponding characters. If the XML parser throws an exception, I leave the data as-is (meaning with the HTML entities, not the characters). In the case of &gt; (>), it looks like the XML parser accepts it, but for &lt; (<), the parser looks for a closing > character and, if none is found, considers that this is not XML, so the process ends there.
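
The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's actual code, which is not shown in this ticket. The names unescape_entities and htmlize_get and the wrapping <root> element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The five entities named in the comment above. &amp; is handled last so
# that "&amp;lt;" becomes "&lt;" rather than "<".
ENTITIES = [
    ("&gt;", ">"),
    ("&lt;", "<"),
    ("&nbsp;", "\u00a0"),
    ("&quot;", '"'),
    ("&amp;", "&"),
]


def unescape_entities(text):
    """Replace each listed entity with its character."""
    for entity, char in ENTITIES:
        text = text.replace(entity, char)
    return text


def htmlize_get(text):
    """Return the unescaped markup if it parses as XML, else the input as-is."""
    candidate = unescape_entities(text)
    try:
        # Hypothetical wrapper element, so a fragment with several
        # top-level nodes (or plain text) can still be parsed.
        ET.fromstring("<root>" + candidate + "</root>")
    except ET.ParseError:
        return text  # not well-formed: keep the entities
    return candidate


if __name__ == "__main__":
    # A bare ">" is legal in XML character data, so this one is unescaped.
    print(htmlize_get("&lt;div&gt;GT symbol &gt; in it&lt;/div&gt;"))
    # A bare "<" starts a tag, so parsing fails and the input is kept.
    print(htmlize_get("&lt;div&gt;LT symbol &lt; in it&lt;/div&gt;"))
```

In this sketch the second call returns its input unchanged, which matches the third row of the result posted in the description.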

This check has been a lot of help to some of my clients, so I'm not willing to roll back this change.
If you are not using the template to create a PDF, though, this may not be what you want. What I can do is add a boolean parameter to the htmlize.get() function so that you can bypass the XML check. You will then get the previous behavior, where the HTML entities are rendered as characters even though the result is not proper xHTML.

So, you will just need to replace $htmlize.get($!record.notes) with $htmlize.get($!record.notes,true) and #h($!record.notes) with #h($!record.notes,true).
This will be in v4.0.3, coming soon.

#2

Updated by Patrick Talbot 3 months ago

  • Status changed from New to Resolved
  • Assignee set to Patrick Talbot

See v4.0.3: I've added a boolean parameter that forces unescaping of HTML entities even if the HTML is not accepted by the XML parser, effectively bypassing the test. Use it with caution: it should only be used for non-PDF output.

So in your sample, use $!htmlize.get($!record.notes,true) and #h($!record.notes,true)
- just make sure you update the VM_global_library.vm as well if you are using the #h() macro.

#3

Updated by John Makkink 3 months ago

Hi Patrick,

Thank you for looking into this and making the fix. It solves the display problem, but printing to PDF will break, as you said.
I will discuss internally what to do and possible solutions.

Kind regards,
John Makkink
eFertility

#4

Updated by John Makkink 2 months ago

Hi Patrick,

For our use cases we rely on the htmlize function as it worked in 3.8.1. We both display its output and create PDFs from it.
With the current fix in 4.0.3 we can display the output, but we can't create a PDF from it, as you said.

Is it possible to bring that function back as it was in 3.8.1, as a legacy or optional function?
$!htmlize.get($!record.notes,true) = will trigger the old function?

If this is a big change, please contact me or Wouter to talk about compensation / payment.

Kind regards,
John Makkink
eFertility

#5

Updated by Patrick Talbot 2 months ago

John Makkink wrote:

$!htmlize.get($!record.notes,true) = will trigger the old function?

Yes, $!htmlize.get($!record.notes,true) will act the same way as before...
The boolean true indicates that you want to bypass the XML validation, so it will just unescape all the &lt; &gt; etc.
Note that if you bypass the validation, you are on your own: if your data contains invalid xHTML (HTML that doesn't validate according to XML rules), the renderer will not be able to produce a PDF from it.
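
As a rough illustration of this trade-off, the Python sketch below adds a bypass flag to a simplified stand-in for the function. It is not the plugin's code: the names htmlize_get, bypass_check and is_well_formed are hypothetical, and the strict XML parse only stands in for what the PDF renderer requires.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """True if the fragment parses as XML inside a hypothetical <root> wrapper."""
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


def htmlize_get(text, bypass_check=False):
    """Unescape entities; skip the XML check when bypass_check is True."""
    candidate = (
        text.replace("&gt;", ">")
        .replace("&lt;", "<")
        .replace("&nbsp;", "\u00a0")
        .replace("&quot;", '"')
        .replace("&amp;", "&")  # last, so "&amp;lt;" becomes "&lt;"
    )
    if bypass_check or is_well_formed(candidate):
        return candidate
    return text  # check failed and was not bypassed: keep the entities


if __name__ == "__main__":
    notes = "&lt;div&gt;LT symbol &lt; in it&lt;/div&gt;"
    forced = htmlize_get(notes, True)
    print(forced)                  # unescaped, fine for display
    print(is_well_formed(forced))  # False: a strict XML consumer rejects it
```

With the flag set, the caller gets the unescaped markup back even when it would fail a strict parse, so any well-formedness check before PDF generation becomes the caller's responsibility.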
