Defect #1302
fromJSON() fails on UTF-8 files with BOM
Description
We built an interface that receives a JSON file from SAP SuccessFactors EmployeeCentral whenever employee data changes there.
The first lines of such a JSON file look like this:
{
"S136" : {
"e_all_in_contract": null,
"e_cost_center": "10 3800",
...
We read the JSON into an object variable with
oEC_EIM_persons = plugins.VelocityReport.fromJSON( plugins.file.readTXTFile( jsFile.getAbsolutePath() ));
Everything works fine when the JSON file is ANSI-encoded.
However, when I convert the JSON file to UTF-8 (e.g. with Notepad++), Servoy reports this error:
JAVASCRIPT ERROR
net.stuff.plugin.velocityreport.org.json.JSONException: Expected a ':' after a key at 3 [character 4 line 1]
After some investigation I found the reason here:
[[https://de.wikipedia.org/wiki/UTF-8#Byte_Order_Mark]]
There are UTF-8 files with and without a BOM (Byte Order Mark) at the beginning.
That BOM is the sequence EF BB BF.
When the file is saved as "UTF-8 without BOM" (easily done in Notepad++ via the menu 'Encoding / Convert to UTF-8 without BOM'), everything works well.
To also handle "UTF-8 with BOM" files, VelocityReport.fromJSON() could check whether the BOM is present and, if so, strip or ignore it before parsing.
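Until the plugin does this itself, the same check can be applied on the caller's side before the text is passed to fromJSON(). The following is a minimal sketch in plain JavaScript, not part of VelocityReport or the Servoy file plugin: `stripBom` is a hypothetical helper name. It assumes the BOM reaches the script either as the single character U+FEFF (file decoded as UTF-8) or as the three characters U+00EF U+00BB U+00BF (file decoded with a single-byte ANSI charset).

```javascript
// Hypothetical helper (not part of VelocityReport or Servoy):
// removes a leading UTF-8 BOM from text that has already been read into a string.
function stripBom(text) {
    if (typeof text !== "string" || text.length === 0) {
        return text;
    }
    // Case 1: the file was decoded as UTF-8, so the BOM is the single character U+FEFF.
    if (text.charCodeAt(0) === 0xFEFF) {
        return text.substring(1);
    }
    // Case 2 (assumption): the file was decoded with a single-byte charset,
    // so the bytes EF BB BF show up as three separate characters.
    if (text.substring(0, 3) === "\u00EF\u00BB\u00BF") {
        return text.substring(3);
    }
    // No BOM found: return the text unchanged.
    return text;
}
```

In the call shown in the description, the helper would wrap the file content, i.e. plugins.VelocityReport.fromJSON( stripBom( plugins.file.readTXTFile( jsFile.getAbsolutePath() ))). This has not been verified against a Servoy installation.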
History
Updated by Patrick Talbot about 7 years ago
- Status changed from New to Rejected
This is a known issue that does not actually come from Velocity but from the way Java file I/O chokes on UTF-8 files with a BOM.
Most Java programs that read text files as UTF-8 have the same issue.
The problem is that the exception thrown internally in the JVM is trapped and not bubbled up to Velocity's classes, so Velocity has no way of knowing in advance. The file is then read incompletely and cannot be converted from JSON.
There would be a convoluted way to check for the BOM: first open the file as binary, inspect the first bytes, etc.
But this is unnecessary overhead, and I have decided against it.
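The byte-level check described above amounts to comparing the first three bytes of the file with EF BB BF. A minimal sketch in plain JavaScript, under stated assumptions: `hasUtf8Bom` is a hypothetical name, and `bytes` is assumed to be an array-like of byte values already read from the file. Each value is masked with 0xFF so that signed Java bytes (e.g. -17 for EF) compare correctly as well.

```javascript
// Hypothetical helper: reports whether a byte sequence starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(bytes) {
    // Fewer than three bytes cannot contain the full BOM.
    if (!bytes || bytes.length < 3) {
        return false;
    }
    // Mask with 0xFF so both unsigned (239) and signed (-17) byte values match.
    return (bytes[0] & 0xFF) === 0xEF &&
           (bytes[1] & 0xFF) === 0xBB &&
           (bytes[2] & 0xFF) === 0xBF;
}
```

How the bytes are obtained and how much overhead the extra read adds are outside the scope of this sketch.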