
How to update an htdig search engine database

Question:

I use ht://Dig from www.htdig.org to provide a search function on our web site. Every now and then, new pages appear on our site or existing ones get updated. How can I update the search engine's database?

Answer:

You need to run the rundig script after each significant change. You can also add this command to your crontab and schedule it for daily execution.

If you simply run rundig, it visits every page it can reach from the start page and rebuilds the database completely. This process is called 'crawling' and 'indexing'.

The downside is that the database is not available while crawling and indexing are in progress, so visitors to your web site cannot use the search function during that time.

The solution is the '-a' parameter of the rundig script. This parameter makes rundig use alternate work files during crawling and indexing. The alternate work files have the additional extension .work, so the file list in your /htdig/db folder will temporarily look like this:

db.docdb
db.docdb.work
db.docs
db.docs.index
db.metaphone.db
db.soundex.db
db.wordlist
db.wordlist.work
db.words.db



Basically, a second copy of the database is built, which leaves the original files available to htsearch. Once htdig and htmerge have finished building the .work database files, rundig moves them into place, replacing the original files.
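
The "build a second copy, then move it into place" idea can be sketched in a few lines of shell. This is only an illustration of the pattern, not the actual rundig code: swap_in is a hypothetical helper, and the sketch assumes the .work file sits in the same directory as the live file, so that mv is a plain rename.

```shell
#!/bin/sh
# Illustrative sketch only: swap_in is a hypothetical helper, not part of ht://Dig.
# It mimics the last step described above: a finished .work file replaces the live file.

swap_in() {
    live="$1"            # file that the search program keeps reading
    work="$live.work"    # freshly built copy (assumed to be in the same directory)

    if [ ! -f "$work" ]; then
        echo "no work file for $live" >&2
        return 1
    fi

    # Within one directory, mv is a rename: the live file is swapped in a single step.
    mv -f "$work" "$live"
}
```

Calling swap_in /htdig/db/db.docdb would replace db.docdb with db.docdb.work if the latter exists, and otherwise leave the live file untouched.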

Read "How do I set up a cron job?" to see how to schedule rundig -a for daily execution.
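
As a rough example, a crontab entry for a daily run could look like the one below. The install path /usr/local/bin/rundig, the run time of 3:30 at night and the log file name are assumptions for illustration; adjust them to your own installation.

```shell
# Hypothetical crontab entry (edit with: crontab -e)
# minute hour day-of-month month day-of-week command
# Path, time and log file are assumed values, not taken from a real setup.
30 3 * * * /usr/local/bin/rundig -a >> /tmp/rundig.log 2>&1
```

Redirecting the output to a file keeps cron from mailing it after every run and leaves a record to check if the index stops updating.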



Generated 4:05:16 on Dec 21, 2014