Working with multiple occurrence data structures

Re: Working with multiple occurrence data structures

Postby khimeira on Tue Nov 13, 2012 7:43 pm

Hi folks,

I'm Timo's colleague and the product manager for the ERP solution he mentioned.

I have high hopes that we can get past the current "shortcomings" with the new toolkit so that we can update ZS for our customers.

Our highest hopes are for performance, performance and performance.
So far, tests have indicated that with large chunks of data (occurrence DS), performance with the new toolkit hasn't been as good as with the old one. But I'm keeping my fingers crossed... ;)


As Timo said, we use PHP in many different forms, and we use it widely. And with ERP, the data volumes are huge.


One example:
Customers x, y and z have 3rd party web shops (from different vendors, of course). For fast response times, these web shops query stock balance data from our iSeries ERP every hour. This means they run more than 50 web service queries every hour, each with an occurrence DS of 200 products. So they query stock balances for more than 10,000 products every hour. And this is for the "front page" views only...
When an order is created, they check the stock balances for the order to inform the user... and these orders are big... dozens of order rows... and at this point this web service is a "real time" query and it has to be fast. So an occurrence DS is the solution, and it needs to be fast.
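To make the scenario concrete, here is a minimal sketch of what one such stock balance call could look like with the new PHP toolkit. The program name (STKBAL), library (ERPLIB), and field layout are invented for illustration, and the exact method signatures may differ by toolkit version.

Code: Select all
<?php
// Hypothetical sketch only: STKBAL/ERPLIB and the field names are made up.
require_once 'ToolkitService.php';

$conn = ToolkitService::getInstance('*LOCAL', 'MYUSER', 'MYPASS');

// One occurrence of the DS: product id goes in, on-hand quantity comes back.
$occurrence = array(
    $conn->AddParameterChar('both', 15, 'product id', 'prodId', ''),
    $conn->AddParameterPackDec('out', 11, 2, 'on-hand qty', 'qty', 0),
);

// dim = 200 repeats the structure, i.e. the 200-product occurrence DS.
$params = array($conn->AddDataStruct($occurrence, 'stockDs', 200));

$result = $conn->PgmCall('STKBAL', 'ERPLIB', $params, null, null);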


We use PHP for web services, xml/xsl, printouts, office file integrations, web user interfaces.... And for us the performance is vital. And with our limited resources it is also vital that we don't have to go and change all the scripts we have written during the past 4 years.


As I said, I'm keeping my fingers crossed. Keep up the good work, guys!

Jan Salminen
Product Manager
khimeira
 
Posts: 2
Joined: Thu Aug 27, 2009 12:20 pm

Re: Working with multiple occurrence data structures

Postby rangercairns on Tue Nov 13, 2012 8:01 pm

Well ... I normalized your XML data to one line per <element> before running the grep counts ... it really is around 16,000 data elements, which is likely just too many XML nodes for the current discrete XMLSERVICE <data>value</data> design.

I still have a bug in the xmlservice 'bigAssist' code, but we will likely have to do something completely different for this sort of massive data transfer.

Wow, what a monster, way beyond anything I could imagine as "reasonable" for a web application ...

> grep -c '<data' test_40502_nocall_timo.phpt
14612
> grep -c '<ds' test_40502_nocall_timo.phpt
1803
> grep -c '<parm' test_40502_nocall_timo.phpt
15
rangercairns
 
Posts: 215
Joined: Fri Jul 24, 2009 6:28 pm

Re: Working with multiple occurrence data structures

Postby rangercairns on Tue Nov 13, 2012 8:27 pm

Oops ... I don't mean to sound like we'll stop trying to figure something out ... I suspect we should be able to do something here (programming geeks at work, you know), but it may take changes to both the XMLSERVICE "server" and the PHP Toolkit "interface" to handle more than a handful of data.
rangercairns
 
Posts: 215
Joined: Fri Jul 24, 2009 6:28 pm

Re: Working with multiple occurrence data structures

Postby rangercairns on Wed Nov 14, 2012 12:10 am

Ok Timo, the massive data test works on new xmlservice version 1.7.4-sg4, please try it when/if you have time.
This 1.7.4-sg4 xmlservice fix does NOT require any changes to the PHP Toolkit or customer scripts (Yahoo).
Thanks for the help.

http://www.youngiprofessionals.com/wiki ... ICETesting

Active test versions
2012-11-13 - xmlservice-rpg-1.7.4-sg4.zip
FIX -- parsing large data works again
big data test 5,000 elements ... improved from 40 seconds to 3 seconds
massive data timo 16,000 XML element test works ... improved from 60+ seconds to 15 seconds
not done with this work, but try it if you would like

I suspect 15 seconds is still longer than you would like, but this version fix is still 100% RPG code (no PASE assembler) ... and ... there may still be room for improvement in the xmlservice RPG-only implementation.

MAYBE at some point we may have to entertain a new format for "massive data", but not today. BTW -- I suspect "massive data" is completely possible (even staying within XML), but we will see what you want to do.
rangercairns
 
Posts: 215
Joined: Fri Jul 24, 2009 6:28 pm

Re: Working with multiple occurrence data structures

Postby timo_karvinen on Wed Nov 14, 2012 10:26 am

rangercairns wrote:I will give it a try with the current design ... BUT ...

We may have to invent a new format for massive records passed as input ... perhaps something like this ...
Code: Select all
<template label='mybigds'>
<describe type='132A'/>
<describe type='12p2'/>
... and so on ...
</template>
<raw template='mybigds' delimit=':' eol='LF'>
frog132:12.37:...:
toad145:34512.37:...:
... and so on ...
</raw>

... where each delimited record can be quickly popped into memory in consecutive fashion (RPG DS array style).

Philosophy:
I would like to stay with character data (frog132:12.37:...) and avoid "binary" data transfers because clients never have types like packed/zoned decimal ... and ... well, from a web point of view you can send a big string around the web using any protocol (DB2, REST GET/POST, ftp, etc.) over any language (PHP, Ruby, perl, csh, bash, curl, etc.), which in the long run will protect your application through all manner of device proliferation (ipad, phone, pc, etc.)

With that said, do you have other design ideas (this is open source development my friend)???


How about using JSON as the format? It has much less overhead than XML, but it's still a "standard" format for presenting complex data as a text string, instead of a completely proprietary format (though it has a bit more overhead than the one you suggested).
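For comparison, a minimal sketch of what the two records from the template/raw example could look like as JSON, using PHP's built-in json_encode/json_decode; the field names (name, price) are made up for illustration.

Code: Select all
<?php
// The same two records as the delimited example, expressed as JSON.
// Field names are hypothetical.
$rows = array(
    array('name' => 'frog132', 'price' => 12.37),
    array('name' => 'toad145', 'price' => 34512.37),
);

$payload = json_encode($rows);          // one plain text string for transport
$decoded = json_decode($payload, true); // any JSON-capable client can rebuild it

echo $payload; // [{"name":"frog132","price":12.37},{"name":"toad145","price":34512.37}]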

-Timo
timo_karvinen
 
Posts: 74
Joined: Wed Aug 12, 2009 7:58 am
Location: Tampere, Finland

Re: Working with multiple occurrence data structures

Postby timo_karvinen on Wed Nov 14, 2012 10:38 am

rangercairns wrote:Ok Timo, the massive data test works on new xmlservice version 1.7.4-sg4, please try it when/if you have time.
This 1.7.4-sg4 xmlservice fix does NOT require any changes to the PHP Toolkit or customer scripts (Yahoo).
Thanks for the help.

Active test versions
2012-11-13 - xmlservice-rpg-1.7.4-sg4.zip
FIX -- parsing large data works again
big data test 5,000 elements ... improved from 40 seconds to 3 seconds
massive data timo 16,000 XML element test works ... improved from 60+ seconds to 15 seconds
not done with this work, but try it if you would like

I suspect 15 seconds is still longer than you would like, but this version fix is still 100% RPG code (no PASE assembler) ... and ... there may still be room for improvement in the xmlservice RPG-only implementation.

MAYBE at some point we may have to entertain a new format for "massive data", but not today. BTW -- I suspect "massive data" is completely possible (even staying within XML), but we will see what you want to do.


Yes, I tested this in our environment and verified that it does indeed work now, though like you said the performance isn't quite there yet; the exact same web service runs in the old toolkit in 2.3 seconds.
That's 2.3 seconds from end to end: from the client calling the web service -> toolkit -> RPG program -> web service return -> processing the incoming SOAP message. The same process in the new toolkit with 1.7.4-sg4 takes roughly 17 seconds (it varies a bit).

-Timo
timo_karvinen
 
Posts: 74
Joined: Wed Aug 12, 2009 7:58 am
Location: Tampere, Finland

Re: Working with multiple occurrence data structures

Postby rangercairns on Wed Nov 14, 2012 4:13 pm

Happy to keep plugging away at "reasonable" XML based toolkit performance here, and I am grateful for your testing/requirements help, BUT I must say a few things ...

Play nice ...
I will not/must not be party to any sort of inappropriate public "competition" against other good IBM i vendors. Therefore, if you continue to discuss any sort of public product "benchmark competition" on this forum, I will be forced to disengage from this Open Source venture and discontinue participation on this forum (be careful what you say, Dude). There should be no ambiguity here: if you prefer the original toolkit as it exists from EasyCom, please continue your business relationship with the good Aura company.

Open Source toolkit priorities ...
#1 ... free, free, free -- we wish to have a free Open Source solution for IBM i emerging Open Source web languages; many of these languages can simply be loaded as source and compiled on the IBM i (including PHP, Ruby, perl, python, etc.)
#2 ... flexible, flexible, flexible -- an XML/JSON "just a string" based payload can be processed by any language, any device, any transport (btw -- we do not have pure JSON yet, but it was a topic at the IBM i part of this year's ZendCon 2012)
#3 ... function, function, function -- we want to be able to call anything on IBM i using just XML (PGM, CMD, SRVPGM, System APIs, PASE utilities, DB2, etc.)
#4 ... performance, performance, performance -- yep, we want it, the best we can get without compromising the other higher priorities


Can I please count on you to stop this "compare IBM i products" discussion???
rangercairns
 
Posts: 215
Joined: Fri Jul 24, 2009 6:28 pm

Re: Working with multiple occurrence data structures

Postby kentatzend on Wed Nov 14, 2012 6:06 pm

Hi all,

First I want to thank everyone involved in this thread for providing some really interesting test cases on "big datasets". We are always looking for input that will help us improve this technology, so keep the questions coming. At a purely technical level it's interesting, and the "programmer geeks" (AKA Tony and Alan) are having fun with this. As Tony commented, we had a set of priorities for the development of this new toolkit, and initially performance was not our highest item. But now that we've addressed a lot of the basic problems, we're looking at performance more than we did previously.

But I'm a bit worried that this is really just an artificial test case that is not going to exist in very many real-world situations. Of course, I'm not 100% sure that this "assessment" of mine is correct, so I'd like to ask a few questions. It seems like there is a mismatch between what Jan says about the test case and the example Timo is giving us. Jan says:
Customers x, y and z have 3rd party web shops (from different vendors, of course). For fast response times, these web shops query stock balance data from our iSeries ERP every hour. This means they run more than 50 web service queries every hour, each with an occurrence DS of 200 products. So they query stock balances for more than 10,000 products every hour. And this is for the "front page" views only...
When an order is created, they check the stock balances for the order to inform the user... and these orders are big... dozens of order rows... and at this point this web service is a "real time" query and it has to be fast. So an occurrence DS is the solution, and it needs to be fast.

So in this scenario, where there are 50 calls each fetching 200 elements, the performance is probably pretty good even though 10,000 elements get fetched in total, because it's not fetching all 10,000 at once. Fetching 10,000 (or 20,000 or ...) elements at once is a very different case. I'm not saying we don't want to fix it, but ....
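In code terms, the batched pattern Kent describes might look roughly like this sketch; fetchBalances() is a hypothetical stand-in for whatever toolkit or web service call returns one batch of stock balances.

Code: Select all
<?php
// Hypothetical sketch: 50 calls of 200 products each instead of one
// 10,000-element call. fetchBalances() is a made-up stand-in function
// assumed to return balances keyed by product id.
function fetchBalances(array $ids)
{
    // stand-in for the real toolkit / web service call
    return array_fill_keys($ids, 0);
}

$productIds = range(1, 10000); // illustrative product keys
$balances   = array();

foreach (array_chunk($productIds, 200) as $batch) {
    // each call carries only a 200-occurrence DS, keeping every payload small
    $balances += fetchBalances($batch);
}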

Quite frankly, I'm wondering what application really ever needs 10K+ elements to be fetched at one time, and has to do that on any sort of regular basis. I could see where you might fetch 10K elements once, or even periodically where the period is fairly long (15 minutes, hourly, etc.). The case would be something like a locally cached "catalog" of items for performance, etc. I could even see where you would fetch 10K elements in bunches (as Jan describes). But I find it hard to imagine (and I have a pretty good imagination) a scenario where a user request would need to pull all 10K+ elements and could display or process them as part of a single user request/response.

Am I missing something here, or is this really just a good artificial test case that was designed to push the edge of the envelope?

Kent Mitchell
Sr. Director, Product Management
Zend
kentatzend
 
Posts: 1778
Joined: Thu Dec 11, 2008 1:08 pm

Re: Working with multiple occurrence data structures

Postby erich_hieden on Thu Nov 15, 2012 9:51 am

I think I need to throw in my 2 cents as well.

What I learned from our application is that the performance culprit is solely the XML loading/parsing in XMLSERVICE (I'm already in contact with Tony and Alan over this). That means that FETCHING tons of data is little to no problem as long as the XML going into XMLSERVICE is small (that's why I worked with Alan to reduce its size as much as possible). The performance problems come with large input XML, which is what Timo started this thread about.

For large input XML, you can put it this way:
Code: Select all
Application --(fast)--> Compatibility Wrapper --(fast)--> XML-Toolkit --(SLOW)--> XMLSERVICE --(fast)--> RPG --(fast)--> XMLSERVICE --(fast)--> XML-Toolkit --(fast)--> Compatibility Wrapper --(fast)--> Application
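As a rough illustration of why the input side balloons, here is a sketch that builds discrete <data> input the way the current design requires; the node counts echo the grep figures from earlier in the thread, and the field values are made up.

Code: Select all
<?php
// Each DS occurrence contributes its own discrete <data> nodes, so the
// input XML grows linearly with the occurrence count. Values are made up.
$xml = '';
for ($i = 0; $i < 1800; $i++) {                      // roughly the <ds count from Timo's test
    $xml .= "<ds>";
    $xml .= "<data type='7A'>PROD" . $i . "</data>"; // several <data> nodes per occurrence
    $xml .= "<data type='8p0'>0</data>";
    $xml .= "</ds>";
}
printf("~%d data nodes, %d bytes of input XML\n", 1800 * 2, strlen($xml));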


Best
(Hopefully not stepping on anyone's toes)
erich_hieden
 
Posts: 373
Joined: Tue Jul 07, 2009 9:01 am

Re: Working with multiple occurrence data structures

Postby rangercairns on Thu Nov 15, 2012 4:24 pm

Hi Erich,
No problem with my toes; big feet run in my oversized Norwegian heritage, and besides, I rarely understand business office politics (mostly lost on me). Again, Alan and I thank you for your continued patience and support with the free XMLSERVICE/Toolkit Open Source effort for PHP on IBM i.

You are indeed correct ...
1) -> big data input performs not-so-well -- a rare 5% of web applications
... performs slow, because we never designed for "gigantic discrete data" INPUT (14,000 data elements in this forum entry)
... until this forum entry, we/I believed "gigantic discrete data" INPUT would essentially be a rare "web abomination" (a web programming design error, if you will)
... in truth, we/I still don't quite believe this is a representative "web" application test, but we are enjoying the geek programming task of attempting to make this artificial-looking test perform "reasonably"

2) -> big data output performs ok -- the 95% case of web applications
... usually performs well, because we designed for "much data" reads
... the read/display model seemed the most likely web application (we assumed a 90% use case)
... we assumed "big data" parameters would simply be default-input "output" arrays, like most RPG programs (likeds(dcTon_t) dim(TONMAX))
... YES, Alan is working on making unneeded discrete data input much smaller by actually using the many shortcut xmlservice "default input" features (<ds dim='999'>)
Code: Select all
In this case "a ton" of complex data records will be spilled as output (xmlservice output),
but input to the ZZTON call would just be default *BLANKS and zeros (xmlservice input).
     D zzton           PR
     D  nn1p0                         1p 0
     D  nn7a                          7a
     D  nn8p0                         8p 0
     D  nnDS                               likeds(dcTon_t) dim(TONMAX)

<?xml version='1.0'?>
<script>
<cmd comment='chglibl'>CHGLIBL LIBL(xyzlibxmlservicexyz)</cmd>
<pgm name='ZZSRV' func='ZZTON'>
<parm io='in'><data type='1p0'>1</data></parm>
<parm io='in'><data type='7a'>7</data></parm>
<parm io='in'><data type='8p0'>8</data></parm>
<parm io='both'>
  <ds var='dcTon_t' dim='999'>
    <data var='s01' type='7A'>1</data>
    <data var='s02' type='4p0'>2</data>
    :
</ds>
</parm>
</pgm>
</script>
rangercairns
 
Posts: 215
Joined: Fri Jul 24, 2009 6:28 pm
