Drupal.org's built-in search functionality has had ongoing scalability issues for some time now. Less interested in outsourcing search to a big black box such as Google, I spent some time helping clean up the Xapian module, making it possible to completely replace Drupal's built-in SQL-powered search functionality with a Xapian-powered engine. With the basic search functionality complete, there was still a need to actually compare the performance of the two solutions.
Toward this goal, over the weekend I launched a new project called SearchBench, a Drupal module for benchmarking Drupal's search performance. As the module evolves, I hope it will prove extremely useful for comparing the performance and scalability of the many free and open source search options available to Drupal-powered websites.
Current features:
- automatically creates random search queries from a wordlist
- supports the creation of named "search lists", which are saved lists of search queries that can be used over and over to more precisely compare performance
- generates comprehensive named logs each time a "search list" is benchmarked
- generates reports from the logs, offering detailed comparisons between search benchmarks

Planned features:
- create complex queries (supporting Drupal's advanced search functionality)
- improve the wordlists (and add control over which wordlists are used)
- offer comparison reports
- generate graphs from the data
- export the data in CSV format so other tools can generate graphs
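
To make the idea behind the list above concrete, here is a minimal sketch of the same workflow in Python. It is not SearchBench's code (the module itself is a Drupal module), and every name in it (`make_search_list`, `run_benchmark`, `summarize`, `to_csv`, and the `search_fn` callable standing in for a search backend) is a hypothetical chosen for illustration. It assumes a "search list" is simply a saved list of query strings, and that a benchmark log is one timing per query.

```python
import csv
import io
import random
import statistics
import time


def make_search_list(wordlist, count, max_terms=3, seed=None):
    """Build a list of random queries from a wordlist.

    A fixed seed yields the same list every time, so the same queries
    can be replayed against different backends.
    Assumes max_terms <= len(wordlist).
    """
    rng = random.Random(seed)
    queries = []
    for _ in range(count):
        terms = rng.sample(wordlist, rng.randint(1, max_terms))
        queries.append(" ".join(terms))
    return queries


def run_benchmark(search_fn, queries):
    """Run every query through search_fn and log (query, seconds) rows."""
    log = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)  # hypothetical backend call; result is discarded
        log.append((query, time.perf_counter() - start))
    return log


def summarize(log):
    """Reduce a benchmark log to a few comparable numbers."""
    times = [seconds for _, seconds in log]
    return {
        "queries": len(times),
        "mean": statistics.mean(times),
        "max": max(times),
    }


def to_csv(log):
    """Serialize a benchmark log as CSV text for external graphing tools."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["query", "seconds"])
    writer.writerows(log)
    return buf.getvalue()
```

Replaying one seeded search list against two backends, then comparing the two `summarize` results, is the kind of comparison the reports are meant to support. The numbers only mean something if both backends run under comparable conditions.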
Some initial testing on scratchvm showed Drupal's core search consistently outperforming Xapian search. However, Xapian was running in a virtual machine on the same server as the web server, while Drupal's core search was using a remote database. In other words, the current comparison means nothing beyond the fact that the SearchBench module is getting to the point where it's able to generate some potentially useful data. It will be very interesting to set up a clean benchmarking infrastructure and do some actual comparison tests.
Until then, here are a couple of screenshots of the reports currently generated by this module: