Toolkit simplexml_load_string parser error UTF-8

Post by richardwebster » Thu May 12, 2016 10:26 pm

Hi.

I am having problems passing certain characters from PHP to an RPG program via the toolkit (i.e. using PgmCall).

For example, if I pass a parameter that contains the '–' character (UTF-8 hex E28093), then in php.log I get:

[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): Entity: line 8: parser error : Unregistered error message in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): <data var='value' type='560A' varying='on'><![CDATA[]]></data> in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): ^ in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): Entity: line 8: parser error : PCDATA invalid Char value 26 in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): <data var='value' type='560A' varying='on'><![CDATA[]]></data> in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): ^ in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): Entity: line 8: parser error : Sequence ']]>' not allowed in content in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): <data var='value' type='560A' varying='on'><![CDATA[]]></data> in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): ^ in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): Entity: line 8: parser error : internal error: detected an error in element content in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): <data var='value' type='560A' varying='on'><![CDATA[]]></data> in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): ^ in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): Entity: line 8: parser error : Extra content at the end of the document in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): <data var='value' type='560A' varying='on'><![CDATA[]]></data> in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
[12-May-2016 18:09:10 GMT] PHP Warning: simplexml_load_string(): ^ in /usr/local/zendsvr6/share/ToolkitApi/ToolkitServiceXML.php on line 798
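
For reference, the "Char value 26" in the warnings above is decimal for 0x1A, a control character that XML 1.0 does not allow, even inside CDATA. The following is only a minimal Python sketch, not the toolkit's code: it uses Python's expat-based parser rather than PHP's libxml2, and `is_well_formed` and `find_invalid_chars` are hypothetical helper names. It reproduces the same kind of rejection and locates the offending character:

```python
import re
import xml.etree.ElementTree as ET

# XML 1.0 permits tab, LF, CR and code points from 0x20 upward
# (excluding surrogates, 0xFFFE and 0xFFFF). Everything else is invalid.
INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_well_formed(xml_text):
    """Return True if a generic XML parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def find_invalid_chars(text):
    """Return (index, code point) for each character XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in INVALID_XML_CHARS.finditer(text)]


# An en dash (U+2013) is legal XML content; SUB (0x1A, decimal 26) is not.
good = "<data var='value'><![CDATA[A\u2013B]]></data>"
bad = "<data var='value'><![CDATA[A\x1aB]]></data>"

print(is_well_formed(good))
print(is_well_formed(bad))
print(find_invalid_chars("A\x1aB"))
```

If the parameter value reaches the output XML with 0x1A in place of the dash, any conforming XML parser would be expected to reject it, which would fit the theory that the character is damaged before the XML is parsed.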

I can run SQL from the PHP to write and read these special characters to/from file successfully, so I think the database is ok. It just seems to be a problem when I use the toolkit.

A few years ago we made changes to the config and the DB2 database files to make our application handle all utf-8 characters. Maybe a subsequent Zend/toolkit upgrade has reset some of the config we changed back then but I'm not sure what.

In toolkit.ini in the [system] section I currently have encoding = "UTF-8".

Any idea what the problem might be please?

Post by richardwebster » Tue Jun 07, 2016 2:33 pm

Anyone know please?

Post by richardwebster » Thu Jun 09, 2016 5:33 pm

Not sure if it's relevant to my problem, but I've tried setting ibm_db2.i5_override_ccsid=0 in ibm_db2.ini, as suggested at https://support.zend.com/hc/en-us/articles/206477507, but that hasn't fixed it.

Post by aseiden » Fri Jun 10, 2016 2:47 am

Hi, Richard,

Could you create a debug.log file and post it here? Set debug=true in your toolkit.ini, then run your application with the offending character. The log will be created as /usr/local/zendsvr6/var/log/debug.log or a similar name.

I'm curious to learn whether the problem is sending the character in, or getting it out. I'm thinking the problem is parsing the XML on the way out but I'd like to see your debug log to know for sure.

Thanks,
Alan Seiden
PHP Toolkit for IBM i Project Leader

Post by richardwebster » Fri Jun 10, 2016 5:34 pm

Hi Alan.

Thanks for your reply and apologies for not posting this in the New Toolkit forum (assuming it is a toolkit issue).

So here is debug.log when I call my RPG procedure ADDTEXT in service program UTFUTILS. I pass the second parameter as 'A–B' which contains the special character ('–'). It is this RPG call that causes the simplexml_load_string errors in php.log.

Creating new conn with database: '*LOCAL', user or i5 naming flag: 'websterde4', transport: 'ibm_db2', persistence: ''
Going to create a new db connection at 2016-06-10 10:08:14.                                                           
Did create a new db connection in 0.009838 seconds.                                                                   
Exec start: 2016-06-10 10:08:14                                                                                       
Version of toolkit front end: 1.5.0                                                                                   
IPC: '/tmp/websterde4-WEBSTER'. Control key: *cdata *sbmjob(ZENDSVR6/ZSVR_JOBD/XTOOLKIT)                              
Stmt: call ZENDSVR6.iPLUG512K(?,?,?,?) with transport: ibm_db2                                                        
Input XML: <?xml version="1.0" encoding="UTF-8" ?>                                                                    
<script>                                                                                                              
<pgm name='UTFUTILS' lib='*LIBL' func='ADDTEXT'>                                                                      
<parm comment='utf8test'><data var='utf8test' type='68A' varying='on'>UTF8TEST</data></parm>                          
<parm comment='value'><data var='value' type='560A' varying='on'>A B</data></parm>                                    
</pgm>                                                                                                                
</script>                                                                                                             
Output XML: <?xml version="1.0" encoding="UTF-8" ?>                                                                   
<script>                                                                                                              
<pgm name='UTFUTILS' lib='*LIBL' func='ADDTEXT'>                                                                      
<parm comment='utf8test'>                                                                                             
<data var='utf8test' type='68A' varying='on'><![CDATA[UTF8TEST]]></data>
</parm>                                                                 
<parm comment='value'>                                                  
<data var='value' type='560A' varying='on'><![CDATA[A B]]></data>       
</parm>                                                                 
<success><![CDATA[+++ success *LIBL UTFUTILS ADDTEXT ]]></success>      
</pgm>                                                                  
</script>                                                               
Exec end: 2016-06-10 10:08:15. Seconds to execute: 0.054421901702881.

If you need anything else let me know.

Post by aseiden » Tue Jun 21, 2016 5:22 pm

Thanks, Richard. I'm discussing with Rod (with whom you opened a case).

Post by aseiden » Tue Jun 21, 2016 5:27 pm

Richard, the log you posted shows "A B" (with a space in between, rather than a hyphen or dash). Is that a quirk of the log or did you actually pass a space?

Thanks!
Alan

Post by richardwebster » Fri Jun 24, 2016 10:49 am

I think it appearing as a blank must be a quirk of the log.

I have tried it again, writing strings 'A€B' and 'A–B' using both SQL and RPG. Then I did DSPPFM followed by F10 then F11 to look at the hex:
[Attachment: DSPPFM.png, screenshot of the DSPPFM hex display]

Records 1 and 2 are 'A€B' written first by SQL then by RPG. In both cases it is stored in the VARCHAR field as 5 bytes comprising hex 41 ('A'), E282AC ('€') and 42 ('B').

Records 3 and 4 are 'A–B' written first by SQL then by RPG. The SQL (record 3) has stored it in the VARCHAR field as 5 bytes comprising hex 41 ('A'), E28093 ('–') and 42 ('B'). However, the RPG (record 4) has stored it in the VARCHAR field as 3 bytes comprising hex 41 ('A'), 1A ('SUB') and 42 ('B'). According to https://en.wikipedia.org/wiki/Substitute_character, "a substitute character (SUB) is a control character that is used in the place of a character that is recognized to be invalid or in error or that cannot be represented on a given device". So I think the question is: why is it stored as 1A instead of E28093?
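
The byte values quoted above can be cross-checked away from the IBM i. This is just a small Python sketch (the helper name `utf8_hex` is made up for illustration) that encodes both test strings as UTF-8 and decodes the three bytes found in record 4:

```python
def utf8_hex(text):
    """Return the UTF-8 encoding of text as an uppercase hex string."""
    return text.encode("utf-8").hex().upper()


euro_hex = utf8_hex("A\u20acB")   # 'A', EURO SIGN, 'B'
dash_hex = utf8_hex("A\u2013B")   # 'A', EN DASH, 'B'

# The three bytes DSPPFM showed for record 4 (written by RPG).
record4 = bytes.fromhex("411A42").decode("utf-8")

print(euro_hex)                    # 41E282AC42
print(dash_hex)                    # 41E2809342
print([ord(ch) for ch in record4]) # [65, 26, 66]
```

The middle value 26 is the same "Char value 26" that the XML parser complains about in php.log, so the SUB byte seen in the file and the parser error appear to be the same symptom.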

Post by gfroehlich » Tue Jul 05, 2016 12:42 pm

Richard, what CCSID is used as the default on the system, on the DB table and field, and by the jobs reading and writing the DB?
Normally jobs do not use UTF-8 (CCSID 1208), because it's only possible on systems with DBCS installed. I think your problem occurs during code conversion from the DB CCSID to the job CCSID and vice versa. Some conversion tables on IBM i do not work properly.

Gabriel
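
To illustrate the conversion theory, here is a rough Python sketch of what a round trip through a single-byte EBCDIC job CCSID could do to each test string. It makes loud assumptions: the thread never states the job CCSID, so `cp1140` (Python's codec for the euro-enabled EBCDIC CCSID 1140) is only a guess, `through_job_ccsid` is a hypothetical name, and the real IBM i conversion tables may behave differently.

```python
# Assumption: the job runs under a single-byte EBCDIC CCSID such as 1140.
JOB_CODEC = "cp1140"


def through_job_ccsid(text, codec=JOB_CODEC):
    """Simulate UTF-8 -> job CCSID -> UTF-8, substituting SUB for any
    character the job code page cannot represent."""
    sub = "\x1a".encode(codec)  # the code page's SUB byte (0x3F in EBCDIC)
    converted = bytearray()
    for ch in text:
        try:
            converted += ch.encode(codec)
        except UnicodeEncodeError:
            converted += sub
    return converted.decode(codec)


euro_result = through_job_ccsid("A\u20acB")  # euro sign exists in cp1140
dash_result = through_job_ccsid("A\u2013B")  # en dash does not

print(euro_result.encode("utf-8").hex())
print(dash_result.encode("utf-8").hex())
```

Under these assumptions the euro sign survives and the en dash comes back as 0x1A, which matches the pattern Richard saw in records 2 and 4. That is consistent with a lossy job-CCSID conversion but does not prove it; checking the actual job CCSID would confirm or rule it out.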
