zend, lucene, web services

For programming and general questions on Zend Framework
khearn
Posts: 10
Joined: Tue Apr 28, 2009 11:19 pm

zend, lucene, web services

Post by khearn » Wed Apr 29, 2009 10:05 pm

Good day,

I found Zend Framework by searching Google for PHP and Lucene, and would like to know whether it can be used to do the following:

Index UTF-8 encoded files with millions of rows, each row holding multiple fields in tab-delimited format

Produce XML/web services that can be queried from an Ajax-enabled form

Thanks

ericritchie
Posts: 118
Joined: Tue Feb 10, 2009 10:09 am

Re: zend, lucene, web services

Post by ericritchie » Thu Apr 30, 2009 10:04 am

Hi Khearn,

Zend_Search_Lucene can index large numbers of large files and supports different character sets without a problem.

However, I am not sure what you want to achieve. If you have one file and want to search within it for a given string, there may be better solutions for your needs. Could you give us a hint of what you are trying to do?

Regards,
Eric Ritchie.

ericritchie
Posts: 118
Joined: Tue Feb 10, 2009 10:09 am

Re: zend, lucene, web services

Post by ericritchie » Mon May 04, 2009 10:07 am

Hi Khearn,

Thanks for the clarification by PM. I am answering here so other people can see what I am suggesting and perhaps also offer input.

Basically, Lucene works with a document concept. You insert new documents into the Lucene index and can then search for them in a variety of ways (including the fuzzy search you are interested in). The problem with indexing your large file as it stands is that it would represent a single document to Lucene. Any search would return a reference to that one document, but you would not know which part of the document matched.

However, you can also index "virtual" documents. Simply load your file (or your SQL result) and add the information from each line/row as a separate "document". You should also store the primary key with each document as a reference. Then, when you search for something, you can use the key to know which record to retrieve from your database. You will need to write a script to index what is currently in your database, and you can update your code to maintain the index whenever information is added to or deleted from your database.
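
To make the "one row, one document" idea concrete, here is a minimal, language-neutral sketch in Python. It is not Zend_Search_Lucene code: the function names build_index and fuzzy_search, the sample rows, and the "id" key column are all made up for illustration, and Python's difflib stands in for Lucene's fuzzy matching. It only shows the shape of the approach: index each row under its primary key, then have a search return keys that you look up in your database.

```python
import csv
import difflib
import io
from collections import defaultdict


def build_index(tsv_text, key_field):
    """Treat every tab-delimited row as its own 'document'.

    Returns a mapping of term -> set of primary keys of the rows containing it.
    """
    index = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        pk = row[key_field]  # the primary key is the document reference
        for field, value in row.items():
            if field == key_field:
                continue
            for term in value.lower().split():
                index[term].add(pk)
    return index


def fuzzy_search(index, word, cutoff=0.75):
    """Return the primary keys of rows containing a term similar to `word`."""
    close_terms = difflib.get_close_matches(
        word.lower(), list(index), n=10, cutoff=cutoff
    )
    keys = set()
    for term in close_terms:
        keys |= index[term]
    return sorted(keys)


# Hypothetical UTF-8 sample data: a header line, then one record per line.
SAMPLE = (
    "id\tname\tcity\n"
    "1\tJosé Müller\tZürich\n"
    "2\tAnna Smith\tLondon\n"
    "3\tJohn Smyth\tLeeds\n"
)

index = build_index(SAMPLE, key_field="id")
hits = fuzzy_search(index, "smith")
print(hits)  # ['2', '3']
```

The search returns only primary keys, so the full records still come from your database. Adding or deleting a database row would mean adding or removing that row's entries in the index as well.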

It should work rather well.

Regards,
Eric Ritchie.
