<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Code Fury &#187; Search Engine Development</title>
	<atom:link href="http://codefury.net/category/search-engine-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://codefury.net</link>
	<description>One programmer's formatted output stream</description>
	<lastBuildDate>Sat, 31 Dec 2011 22:20:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>A Better WordPress Search with WPSearch 2.0.2.0</title>
		<link>http://codefury.net/2010/11/wpsearch-released-lives-change/</link>
		<comments>http://codefury.net/2010/11/wpsearch-released-lives-change/#comments</comments>
		<pubDate>Fri, 12 Nov 2010 02:57:58 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[Gadgets]]></category>
		<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Wordpress Development]]></category>

		<guid isPermaLink="false">http://codefury.net/?p=272</guid>
		<description><![CDATA[I&#8217;m definitely not the type to evangelize something I don&#8217;t think is useful. Ask friends of mine, and they&#8217;ll likely tell you how I went through phases where I endlessly promoted things like Notepad++ for Windows, Netbeans IDE (PHP), Sequel Pro, Gnome-Do, Thinkpads, Macbooks, Toy Story 3, iPod Touches, and Visual Studio &#8217;08. I just [...]]]></description>
			<content:encoded><![CDATA[
<p>I&#8217;m definitely not the type to evangelize something I don&#8217;t think is useful. Ask friends of mine, and they&#8217;ll likely tell you how I went through phases where I endlessly promoted things like Notepad++ for Windows, Netbeans IDE (PHP), Sequel Pro, Gnome-Do, Thinkpads, Macbooks, Toy Story 3, iPod Touches, and Visual Studio &#8217;08. I just can&#8217;t help it. When I get excited about something, I have a hard time stfu-ing.</p>
<p>But <strong>holy crap</strong>. Let me tell you: if you think the WordPress default search sucks, as I did 2 years ago, <a href="http://wordpress.org/extend/plugins/wpsearch/">try WPSearch</a>. It&#8217;s (IMO) the second-most-useful plug-in that has ever existed for WordPress. It also has the slickest admin UI I&#8217;ve ever used in WordPress (a big thanks to jQuery and its plug-ins, of course).</p>
<p>I wrote WPSearch, and I 100% recommend it to anyone running something that isn&#8217;t a traditional WordPress blog. Run a recipe catalog? <strong>Use It!</strong> Review engine? <strong>Use it! </strong>Shopping engine? <strong>What the hell are you doing without my plug-in on your site? Your customers can&#8217;t find shit!</strong></p>
<p><strong>Fun scenario:</strong> Let&#8217;s imagine you sold laser printers on your site along with other computer peripherals, and every product page on your site had &#8220;[brand] [model] Laser Printer&#8221; in its title, like &#8220;Dell 1100 Laser Printer&#8221;. If someone searched for &#8220;printers&#8221; with default WordPress search, they wouldn&#8217;t get those products back, because the plural &#8220;printers&#8221; never literally appears in those titles.</p>
<p>Does that scare you? It should. <a href="http://wordpress.org/extend/plugins/wpsearch/">Get WPSearch now</a>. And if you don&#8217;t like it, tweet <strong>@_kennyk_</strong> and be brutally honest. I can take it.</p>
<p>Side note #1: When I wrote WPSearch, I was also trying to cash in big time at a programming contest where the grand prize was a trip to RailsConf and the runner-up would get a set of steak knives (presumably to stab whoever won the grand prize).</p>
<p>Side note #2: Toy Story 3 is an intellectually provocative and introspective masterpiece!</p>
<p>Side note #3: Akismet is #1, but is tragically taken for granted.</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2010/11/wpsearch-released-lives-change/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Best WordPress Search Plug-in: WPSearch 2</title>
		<link>http://codefury.net/2010/10/the-best-wordpress-search-plug-in-wpsearch-2/</link>
		<comments>http://codefury.net/2010/10/the-best-wordpress-search-plug-in-wpsearch-2/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 13:05:38 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Wordpress Development]]></category>

		<guid isPermaLink="false">http://codefury.net/?p=262</guid>
		<description><![CDATA[If you want to read about the background of the WPSearch Search plug-in for WordPress, read below. But if you just want the gist of this post, here it is: WPSearch is the best search plug-in for your WordPress blog. It is a stemming, stop-word blocking, fast, relevant, fulltext search for WordPress. There isn&#8217;t a [...]]]></description>
			<content:encoded><![CDATA[
<p>If you want to read about the background of the WPSearch Search plug-in for WordPress, read below. But if you just want the gist of this post, here it is:</p>
<p><strong>WPSearch is the best search plug-in for your WordPress blog</strong>. It is a stemming, stop-word blocking, fast, relevant, fulltext search for WordPress. <em>There isn&#8217;t a single plug-in in the WordPress repository that can do what it does</em>.</p>
<p>You can get it here: <a href="http://wordpress.org/extend/plugins/wpsearch/">http://wordpress.org/extend/plugins/wpsearch/</a> &#8212; or just install it through the WordPress plugin administration backend. Just search for WPSearch.</p>
<p>If you run any sort of monetized blog, you might be losing sales or readership if users can&#8217;t find what they need on your site. <a href="http://www.adotas.com/2010/09/on-site-search-the-other-white-meat/">According to Adotas, 43% of users hit the search box first</a>.</p>
<p><strong>Background:</strong></p>
<p>I&#8217;ll admit that I&#8217;ve been pretty vocal already about WordPress&#8217; lacking search functionality. I&#8217;ve been vocal about it for 2 years, and it was the original impetus to get this blog up and running. It was also the reason that I wrote wpSearch, the original version of WPSearch and the Lucene-based search plug-in for WordPress. wpSearch was the first true fulltext search for WordPress, in my opinion.</p>
<p>But the problem with wpSearch was that it wasn&#8217;t highly engineered. I had written it for a programming contest, and I was on a tight deadline &#8212; and when projects are rushed, the quality of code goes down. And because of that, bugs made their way out of the woodwork over the next two years.</p>
<p>Consumed by college work and my job, I didn&#8217;t have much time to address those issues. In fact, in 2009, a year after it was released, I declared wpSearch unsupported.</p>
<p>A year following that, <a href="http://pixelberry.co.nz/">Daniel Hay at Pixelberry in New Zealand</a> asked me to add the ability to search within a category for a client of his. That&#8217;s when WPSearch 2 development began.</p>
<p>I wanted to give wpSearch a full rewrite, and correct several mistakes I made with the first version:</p>
<ul>
<li>Rename it from wpSearch to WPSearch</li>
<li>Change the listing name in the WordPress repository to WP Search, so searches for &#8216;search&#8217; would bring it up</li>
<li>Follow an MVC pattern (it&#8217;s a complex plug-in)</li>
<li>Build a configurable search driver framework, so any driver could be written to search and index</li>
<li>Build better logging</li>
<li>Evangelize it to no end (wpSearch was barely promoted)</li>
<li>Give it a UI independent of the default WordPress stylesheets, and also make it stylish</li>
</ul>
<p>The biggest change is the configurable driver part. The free version of WPSearch contains a driver that uses Zend Lucene in the background. Any driver, however, can be written to work with WPSearch. Drivers for Solr, Sphinx, the Google Search Appliance, or name-your-own-search-product could be written.</p>
<p>I did that because PHP is not the best language to write a search engine in. Since Zend_Search_Lucene is the backend driver of the free version, there is an upper bound on the plug-in&#8217;s scalability. After all, PHP is a scripting language, and I doubt Zend ever really imagined someone would make the ludicrous decision of indexing tens of thousands of posts in PHP. I found the breaking point to be about 20,000 docs. At that point, I ran into memory issues, slow mid-indexing optimizations, and slow first-hit (non-cached) searches.</p>
<p>So the point is that I poured everything I had into WPSearch 2, and I want to tell everyone about it. I did this project under the umbrella of OConf, my start-up, with business partner John Crepezzi. John&#8217;s an ex-engineer at Sun Microsystems, and he spends his days at Patch.com now. He also wrote the backend driver for WPSearch Pro, an alternate driver for WPSearch 2 which can handle up to 500,000 docs.</p>
<p>John and I gave a talk at WordCamp NYC, where we officially launched WPSearch 2. The topic was the default WordPress search, and why leaving it unremedied can lose you both readers and money. If you run a shopping engine, people can&#8217;t find your products. If you run a news site, readers can&#8217;t find your content.</p>
<p>If you don&#8217;t think people even use the search box &#8212; heads up, the advertising gurus at Adotas say <a href="http://www.adotas.com/2010/09/on-site-search-the-other-white-meat/">43% of users who find your site do</a>.</p>
<p>You can check it out here: <a href="http://wordpress.org/extend/plugins/wpsearch/">http://wordpress.org/extend/plugins/wpsearch/</a>. In its short time in the repository (about 9 days), it&#8217;s already had around 1,000 downloads, and I&#8217;ve had a lot of positive feedback coming in. I also dropped <a href="http://ma.tt/2009/07/acquia-searc/#comments">a comment about it on WordPress founder Matt Mullenweg&#8217;s site, where he mentioned a search product for Drupal</a>. Hopefully he&#8217;ll check it out and let me know what he thinks.</p>
<p>Check it out!</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2010/10/the-best-wordpress-search-plug-in-wpsearch-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>wpSearch 1.5.0.5 Released With Features, Fixes</title>
		<link>http://codefury.net/2008/08/wpsearch-1505-released-with-features-fixes/</link>
		<comments>http://codefury.net/2008/08/wpsearch-1505-released-with-features-fixes/#comments</comments>
		<pubDate>Thu, 07 Aug 2008 05:09:52 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[Wordpress Development]]></category>
		<category><![CDATA[1.5.0.5]]></category>
		<category><![CDATA[fix]]></category>
		<category><![CDATA[plugin]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[wordpress]]></category>
		<category><![CDATA[wpsearch]]></category>

		<guid isPermaLink="false">http://codefury.net/?p=35</guid>
		<description><![CDATA[After an exhausting week and a half tracking down the source of a mysterious bug in wpSearch, I think I can finally close the book on the &#8220;null result&#8221; issue that had me poring over the source code. wpSearch 1.5.0.5, the first official release after the 1.5 landmark, brings to the forefront some of the [...]]]></description>
			<content:encoded><![CDATA[
<p>After an exhausting week and a half tracking down the source of a mysterious bug in wpSearch, I <strong>think</strong> I can finally close the book on the &#8220;null result&#8221; issue that had me poring over the source code.</p>
<p>wpSearch 1.5.0.5, the first official release after the <a href="http://codefury.net/2008/07/wpsearch-15-the-fastest-lightest-yet/">1.5 landmark</a>, brings to the forefront some of the features and fixes slated in the last post. wpSearch 1.5 has had the following features implemented:</p>
<ul>
<li>Comment Searching</li>
<li>A behind-the-scenes event logger for easily figuring out user issues</li>
<li>An upgrade to the underlying Lucene Search</li>
<li>An upgrade to the underlying StandardAnalyzer (used for relevancy)</li>
</ul>
<p>And these fixes:</p>
<ul>
<li>No more null results after a post is edited</li>
<li>Foreign character support (or simply, indexing content with &#8216;UTF-8&#8217; encoding)</li>
<li>Memory issues for content-heavy posts</li>
</ul>
<p>wpSearch 1.5.0.5 is a rock-solid release that is starting to make a name for itself in the WordPress world. The new &#8216;Phone Home&#8217; feature in wpSearch allows users to report their copy of wpSearch. A few of the blogs with wpSearch currently in use are listed here:</p>
<ul>
<li><a href="http://buildingtheergonomicguitar.com/">Building The Ergonomic Guitar</a></li>
<li><a href="http://www.computerbob.com/">ComputerBob</a></li>
<li><a href="http://savoringkentucky.com/wordpress">Savoring Kentucky</a></li>
</ul>
<p>Patrick Cushing at the <a href="http://EnterVenture.com">EnterVenture</a> blog wrote a very detailed comparison of the default WordPress search&#8217;s relevancy vs. wpSearch&#8217;s. This article ended up at <a href="http://digg.com/software/wpSearch_could_be_the_WordPress_search_you_ve_been_waiting_f">digg</a>.</p>
<p>Of course, as far as wpSearch has come in its short lifespan, there exists a set of users that deserve credit for pointing out issues and keeping me informed of bugs, needed features, etc. So, in no particular order, I would like to thank:</p>
<ul>
<li>ComputerBob, at <a href="http://ComputerBob.com">ComputerBob.com</a> for pointing out the first instance of the empty result issue. He has thoroughly documented his usage with wpSearch at his blog, in a fair and balanced fashion. Furthermore, he has sent his index data back with detailed comments when most users would simply give up on wpSearch. Thanks ComputerBob.</li>
<li><a href="http://buildingtheergonomicguitar.com/">Robert Irizarry</a>, who has kept the wpSearch thread at the WordPress repository stuffed with feature ideas and issue notices.</li>
<li><a href="http://itsogay.com">Olivier</a>, whose 6,000 posts provided the first failed scalability test for wpSearch. His pointing out of this issue led to a change to allow for greater scalability; as a result, wpSearch 1.5 was tested successfully up to 7,000 posts. Great dedication to detailing these issues has helped wpSearch greatly.</li>
<li><a href="http://www.fellbeisser.net/news">Karl Heigl</a>, who first mentioned the fact that wpSearch was not handling German accents, and subsequently all foreign (to the U.S.) characters. This also ended up affecting Olivier. This bug was fixed in 1.5.0.5. Thanks Karl!</li>
<li>A user named Brian, who said, &#8220;Thanks for the update. If you need any other information or even help testing, I&#8217;d be happy to assist. Just let me know.&#8221; Thanks for your support, Brian.</li>
<li>And to all those who have donated to this project so far!</li>
</ul>
<p>So, wpSearch 1.5.0.5 wouldn&#8217;t be at its current status if it weren&#8217;t for those supporting it.</p>
<p>Features coming up for <a href="http://codefury.net/projects/wpSearch/">wpSearch</a> include result highlighting, contextual snippets, and a progress meter for index building.  I encourage everyone who is reading this but hasn&#8217;t installed wpSearch yet to try it out, and <a href="http://codefury.net/projects/wpsearch/wpsearch-screenshots/">see the awesome blog search that you&#8217;ve been missing.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2008/08/wpsearch-1505-released-with-features-fixes/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>wpSearch Accepted Into WordPress Plugins</title>
		<link>http://codefury.net/2008/07/wpsearch-accepted-into-wordpress-plugins-new-release/</link>
		<comments>http://codefury.net/2008/07/wpsearch-accepted-into-wordpress-plugins-new-release/#comments</comments>
		<pubDate>Fri, 11 Jul 2008 05:43:04 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[Wordpress Development]]></category>
		<category><![CDATA[in]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[plug]]></category>
		<category><![CDATA[plugin]]></category>
		<category><![CDATA[repository]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://codefury.net/?p=25</guid>
		<description><![CDATA[wpSearch (more info in my previous post), the lucene-powered search plugin for WordPress, has officially been accepted into the WordPress plugins repository. You can view and download wpSearch here: http://wordpress.org/extend/plugins/wpsearch/ The latest version as of right now is 1.1.0.0. Several major features have been added since the original beta release. Seamless integration of wpSearch into [...]]]></description>
			<content:encoded><![CDATA[
<p>wpSearch (<a href="http://codefury.net/2008/06/a-lucene-based-search-plugin-for-wordpress/">more info in my previous post</a>), the lucene-powered search plugin for WordPress, has officially been accepted into the WordPress plugins repository. You can view and download wpSearch here:</p>
<p><a href="http://wordpress.org/extend/plugins/wpsearch/">http://wordpress.org/extend/plugins/wpsearch/</a></p>
<p>The latest version as of right now is 1.1.0.0. Several major features have been added since the original beta release.</p>
<ul>
<li>Seamless integration of wpSearch into your blog. After you activate wpSearch and build your blog&#8217;s search index, the search box on your blog will now be configured to use wpSearch for searches.</li>
<li>You can now decide whether you want search results in the page (the standard), or have them loaded into an AJAX search pop-up. (Originally, the AJAX pop-up was the only way to view results.) This option is configurable via the WordPress admin screen.</li>
<li>Bloggers can now tweak the importance of things such as title, content, and tags in a blog search. This effectively allows control over what is considered relevant in a blog search.</li>
</ul>
<p>So what&#8217;s next for wpSearch?</p>
<p>More searchable content: It&#8217;s no secret that the best content on a blog is sometimes in the comments. This is especially true for bloggers of tech and programming sites where blog readers often put useful contributions in comments.</p>
<p>The opening of the source: At SourceForge! Sure, PHP is inherently open-source (it&#8217;s a scripting language, after all!). But the best future for wpSearch would entail its placement into SourceForge.net, where the coding community can have the opportunity to contribute to the wpSearch project. wpSearch is already registered at SourceForge, and has a project page at:</p>
<p><a href="http://wpsearch.sourceforge.net/">http://wpsearch.sourceforge.net/</a>. (Right now, there isn&#8217;t much set up.)</p>
<p>I plan to have wpSearch developed at SourceForge, and have stable releases be uploaded to the plug-in repository at WordPress.</p>
<p>There are some other features I plan to add to wpSearch very shortly, one of which is contextual search result content, so you can see the words around the matching content of a search result.</p>
<p>I can&#8217;t think of the others off the top of my head. What I would really like to know is if anyone finds wpSearch to be of value so far, and whether they are having any difficulties.</p>
<p>I read on another blog that blogs get xx% more comments if the words &#8220;Have your say&#8221; are at the end of a post. I think I&#8217;ll try that.</p>
<p>Have your say!</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2008/07/wpsearch-accepted-into-wordpress-plugins-new-release/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Lucene-based Search Plugin For WordPress</title>
		<link>http://codefury.net/2008/06/a-lucene-based-search-plugin-for-wordpress/</link>
		<comments>http://codefury.net/2008/06/a-lucene-based-search-plugin-for-wordpress/#comments</comments>
		<pubDate>Tue, 01 Jul 2008 00:01:54 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[php zend lucene wordpress search]]></category>

		<guid isPermaLink="false">http://codefury.net/?p=22</guid>
		<description><![CDATA[There are many things I love about WordPress &#8212; the extendability, the ease of use, and large library of themes available online, to name a few. But if there is one aspect of WordPress that needs a little work, it is the default search functionality. Recently, I&#8217;ve been spending a lot of time working on [...]]]></description>
			<content:encoded><![CDATA[
<p>There are many things I love about WordPress &#8212; the extendability, the ease of use, and large library of themes available online, to name a few. But if there is one aspect of WordPress that needs a little work, it is the default search functionality.</p>
<p>Recently, I&#8217;ve been spending a lot of time working on a search plugin for WordPress that is based on the Lucene search engine &#8212; a very cool and powerful search library used by <a href="http://www.manning.com/hatcher2/">a lot of big places</a>. The plugin is in its beta stage, and ready for use and evaluation by anyone who would like to check it out. The plugin is currently implemented on my blog, so you can use the search box on the upper-right side to see it in action.</p>
<p>wpSearch uses the PHP port of the library by Zend. It also spawned a sub-project, the PHP StandardAnalyzer. <a href="http://codefury.net/projects/StandardAnalyzer/">You can read more about that here.</a></p>
<p>The search currently uses a lightbox floating over the page to allow users to navigate search results. An option to integrate the results into the page may be added in the future.</p>
<p>The major features of wpSearch are:</p>
<ul>
<li>Unmatched and customizable search relevancy (that&#8217;s the power of Lucene working)</li>
<li>Very fast search speed</li>
<li>Wildcard and Boolean operator support</li>
<li>Easy installation</li>
<li>Instantly updated searching after a post has been written</li>
<li>Searching of Posts and Pages</li>
</ul>
<p>Features for advanced users:</p>
<ul>
<li>Customizable interface via CSS</li>
<li>Access to the internal search service for extendability</li>
</ul>
<p>wpSearch was written for a development contest at <a href="http://ltech.com">LTech Consulting</a> (a firm specializing in search with Lucene and the Google Search Appliance), but with the full intent of being open source. If anyone is interested in helping develop it, drop me a comment on this post.</p>
<p>Also, if anyone gives this plugin a try and has any suggestions, I would really appreciate your input! Just leave a comment and I&#8217;ll get back to you. The plugin will be made available in the WordPress plugin repository shortly.</p>
<p>Full information about wpSearch (installation instructions, screenshots, etc) is available on <a href="http://codefury.net/projects/wpSearch/">wpSearch&#8217;s project page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2008/06/a-lucene-based-search-plugin-for-wordpress/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>A Stemming Analyzer for Zend&#8217;s PHP Lucene</title>
		<link>http://codefury.net/2008/06/a-stemming-analyzer-for-zends-php-lucene/</link>
		<comments>http://codefury.net/2008/06/a-stemming-analyzer-for-zends-php-lucene/#comments</comments>
		<pubDate>Thu, 05 Jun 2008 15:36:25 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[php zend lucene search]]></category>

		<guid isPermaLink="false">http://katzgrau.simplesample.org/?p=10</guid>
		<description><![CDATA[In my last post I spoke a little about Zend&#8217;s Lucene implementation in PHP, and its extensive usefulness for content-oriented PHP web applications. One of the roadblocks to implementing a Google-like search, however, was the absence of a stemming analyzer in the Zend package. While using PHP Lucene, I came across this issue while developing [...]]]></description>
			<content:encoded><![CDATA[
<p>In my last post I spoke a little about Zend&#8217;s Lucene implementation in PHP, and its extensive usefulness for content-oriented PHP web applications. One of the roadblocks to implementing a Google-like search, however, was the absence of a stemming analyzer in the Zend package.</p>
<p>I came across this issue while using PHP Lucene to develop a plug-in for WordPress. I wasn&#8217;t getting the relevancy I was looking for in test searches, so I decided to develop an analyzer of my own. I decided that this analyzer should:</p>
<ul>
<li>Stem words for greater search relevancy</li>
<li>Use the pre-existing Zend lowercase filter</li>
<li>Filter out a standard set of stop words using the Zend stop words filter</li>
</ul>
<p>After a couple days working on the issue, I&#8217;ve developed an analyzer that performs these tasks. I&#8217;ve named it the &#8216;StandardAnalyzer&#8217; after the implementation Java&#8217;s Lucene has. You can download the <a href="/projects/StandardAnalyzer/">StandardAnalyzer at its project page</a>.</p>
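<p>As a rough illustration of those three tasks, here is a toy sketch in plain PHP. It is <em>not</em> the StandardAnalyzer&#8217;s code: <code>toy_stem()</code>, <code>toy_analyze()</code>, and the short stop-word list are made up for this example, and the suffix stripping is far cruder than the real stemming algorithm. It only shows why a query and a title can end up as the same terms once both go through the same lowercase, stop-word, and stemming steps.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Toy analyzer pipeline: lowercase, drop stop words, then stem.
// NOTE: toy_stem() is a deliberately naive suffix stripper invented for this
// sketch; it is an assumption, not the stemmer the StandardAnalyzer ships with.

function toy_stem($word) {
    // Strip one common English suffix, keeping at least three leading characters.
    return preg_replace('/^(\w{3,}?)(ing|ed|es|s)$/', '$1', $word);
}

function toy_analyze($text) {
    // Hypothetical, very short stop-word list.
    $stopWords = array('a', 'an', 'and', 'for', 'in', 'of', 'the', 'to');
    // 1. Lowercase the text and split it into word tokens.
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    // 2. Filter out stop words.
    $tokens = array_diff($tokens, $stopWords);
    // 3. Stem whatever is left and reindex the array.
    return array_values(array_map('toy_stem', $tokens));
}

print_r(toy_analyze('Searching for the Printers')); // terms: search, printer
print_r(toy_analyze('printer search'));             // terms: printer, search</pre></div></div>

<p>Both calls reduce to the terms &#8220;search&#8221; and &#8220;printer&#8221;, which is the whole point: the plural and the &#8220;-ing&#8221; form no longer get in the way of a match.</p>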
<p>Just a few notes about its creation:</p>
<ul>
<li>It is not meant to sit within the Zend framework folder. The &#8216;StandardAnalyzer&#8217; should sit alongside it, and is configured accordingly. The reason for this is to keep what is Zend&#8217;s in Zend&#8217;s folder, and what is the user&#8217;s in his own. I figured that if the StandardAnalyzer was ever integrated into the framework, the good folks at Zend would know best how they would like it.</li>
<li>The code provided handles English words only, but it is organized to accommodate other languages in the future.</li>
<li>I must give a special thanks to <a title="Richard Heyes" href="http://phpguru.net">Richard Heyes</a>, whose stemming algorithm is used instead of my own. In tests, I found his code to be a bit more elegant and quicker than my own, which was a direct port of the Java stemming algorithm. From what I gather, Richard is a Zend-Certified Engineer, which makes using his code very fitting.</li>
</ul>
<h4>Example Usage</h4>
<p>I&#8217;ve decided to pack the StandardAnalyzer with an example project and index to make things a little easier for those looking to use it. The <a href="/projects/StandardAnalyzer/">example project</a>, as well as most user projects, would start off like:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">require_once</span> <span style="color: #0000ff;">'Zend/Search/Lucene.php'</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">require_once</span> <span style="color: #0000ff;">'StandardAnalyzer/Analyzer/Standard/English.php'</span><span style="color: #339933;">;</span></pre></div></div>

<p>As mentioned before, the StandardAnalyzer folder should sit in the same directory as the Zend Framework. Now that you have the power of Zend ready to go, you can proceed to build your index. But don&#8217;t forget that to use the StandardAnalyzer, you have to set the default analyzer to an instance of the StandardAnalyzer. So before you index documents <strong>or</strong> search over the index, you should call:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">Zend_Search_Lucene_Analysis_Analyzer<span style="color: #339933;">::</span><span style="color: #004000;">setDefault</span>
<span style="color: #009900;">&#40;</span> <span style="color: #000000; font-weight: bold;">new</span> StandardAnalyzer_Analyzer_Standard_English<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>I folded that line to keep it looking readable.</p>
<p>Anyway, any indexing or searching you do after this line uses the StandardAnalyzer. (I may not have been very clear, but the same analyzer needs to be used when indexing and searching, or else you won&#8217;t get many results.) You can also change the getDefault() code in Zend/Search/Lucene/Analysis/Analyzer.php to return the StandardAnalyzer. But I would rather not change the Framework&#8217;s code, and leave it in untainted form.</p>
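<p>Put together, a minimal index-then-search run might look something like the sketch below. It assumes the Zend Framework and the StandardAnalyzer folder are both on the include path; the index path, the field names, and the sample document are placeholders made up for this example, not part of the example project.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">require_once 'Zend/Search/Lucene.php';
require_once 'StandardAnalyzer/Analyzer/Standard/English.php';

// Set the analyzer once, before any indexing or searching happens.
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new StandardAnalyzer_Analyzer_Standard_English()
);

// '/tmp/example-index' is a placeholder path for this sketch.
$index = Zend_Search_Lucene::create('/tmp/example-index');

// Index one document; 'title' and 'body' are hypothetical field names.
$doc = new Zend_Search_Lucene_Document();
$doc-&gt;addField(Zend_Search_Lucene_Field::Text('title', 'Dell 1100 Laser Printer'));
$doc-&gt;addField(Zend_Search_Lucene_Field::UnStored('body', 'A compact laser printer for home offices.'));
$index-&gt;addDocument($doc);
$index-&gt;commit();

// The query goes through the same analyzer, so the plural form
// should be stemmed to the same term that was indexed.
foreach ($index-&gt;find('printers') as $hit) {
    echo $hit-&gt;score, ' ', $hit-&gt;title, "\n";
}</pre></div></div>

<p>If the analyzer were set only for the search, or only for the indexing, the stemmed and unstemmed terms would not line up, which is the &#8220;not many results&#8221; problem mentioned above.</p>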
<p>So take a look at the <a href="/projects/StandardAnalyzer/">StandardAnalyzer project</a>, and the example project. The example was put together fairly quickly, but it should show clearly enough how to use the analyzer.  I think a synonym filter would make a nice addition in the future, so I might take a look into that.</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2008/06/a-stemming-analyzer-for-zends-php-lucene/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>A word on Lucene&#8217;s PHP port by Zend</title>
		<link>http://codefury.net/2008/06/a-word-on-lucenes-php-port-by-zend/</link>
		<comments>http://codefury.net/2008/06/a-word-on-lucenes-php-port-by-zend/#comments</comments>
		<pubDate>Thu, 05 Jun 2008 04:12:21 +0000</pubDate>
		<dc:creator>Kenny Katzgrau</dc:creator>
				<category><![CDATA[PHP Development]]></category>
		<category><![CDATA[Search Engine Development]]></category>
		<category><![CDATA[php development zend lucene]]></category>

		<guid isPermaLink="false">http://katzgrau.simplesample.org/?p=9</guid>
		<description><![CDATA[Lucene is an open source search engine written in Java. If you have never heard of it prior to now, listen to this: It allows you to create a mini google-like search for anything. That&#8217;s right &#8212; anything. But I&#8217;ll be a little more specific: Consider you run a news website &#8212; or a wiki [...]]]></description>
			<content:encoded><![CDATA[
<p>Lucene is an open source search engine written in Java. If you have never heard of it prior to now, listen to this: It allows you to create a mini Google-like search for anything. That&#8217;s right: anything.</p>
<p>But I&#8217;ll be a little more specific: Suppose you run a news website, or a wiki for that matter. How would you let users search the website? For most programmers, the answer is to implement a plain-vanilla SQL search over the title and content of the articles. There are a few issues with this approach:</p>
<ul>
<li>The search time can be fairly lengthy</li>
<li>Running a LIKE query can still be very inaccurate (a search for &#8216;manager&#8217; over a field containing &#8216;manage&#8217; will not be considered a match)</li>
<li>The results are not ordered by relevance in any meaningful way</li>
</ul>
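<p>The accuracy problem is easy to reproduce without a database. The following is a minimal sketch in plain PHP, assuming a case-insensitive substring test (stripos) is a fair stand-in for a LIKE '%term%' clause; the sample strings and the like_match() helper are made up for illustration:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Hypothetical article bodies standing in for rows in a database table.
$articles = array(
    'Managed a software team with great success',
    'She manages the support desk',
    'Hiring a sales manager',
);

// Rough stand-in for: WHERE content LIKE '%term%'
function like_match($content, $term)
{
    return stripos($content, $term) !== false;
}

// Collect every article whose text contains the search term.
$matches = array();
foreach ($articles as $content) {
    if (like_match($content, 'manager')) {
        $matches[] = $content;
    }
}

print_r($matches);</pre></div></div>

<p>Only the last string comes back. The first two are arguably just as relevant to someone searching for &#8216;manager&#8217;, and nothing in the result says which match is the better one.</p>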
<p>Lucene is a Java package which lets a Java programmer insert documents into an &#8216;index&#8217;, basically the search engine&#8217;s database, and search over that index later on. So it is a true search engine that a Java programmer can use to grab information at incredible speeds: milliseconds in my tests.</p>
<p>The details of Lucene can be found at its <a title="Apache Incubator Site" href="http://lucene.apache.org/java/docs/index.html">Apache site</a>.</p>
<p>I&#8217;ll get to the real point of this post. Considering how useful a tool Lucene would be, you are probably somewhat disappointed that I said it was for Java. After all, many would find something like this most useful if integrated with a server-side language such as PHP.</p>
<p>Zend, a PHP-focused firm best known for the <a title="Zend Framework" href="http://www.zend.com/en/community/framework">Zend Framework</a>, created a &#8220;Search&#8221; component as part of its framework, which is a port of Lucene for PHP. This port can be extremely useful for implementing search functionality in a web application.  There is a single problem standing in the way of creating a true full-text search, though: the default search functionality provided in PHP Lucene.</p>
<p>Consider a scenario where we are employers searching for prospective employees on a job search board. In a certain applicant&#8217;s resume, he states that he has &#8220;Managed a software team with great success, and has great managerial skills.&#8221;</p>
<p>Let&#8217;s assume this guy&#8217;s resume, as well as thousands of other resumes, is in a Lucene search index. When an employer executes a search on the job board, the job board code then uses the Lucene API to find documents matching the employer&#8217;s search terms &#8220;sales manager&#8221;.</p>
<p>Using the standard functionality of PHP Lucene, our employer would likely never find the applicant mentioned above. Why? Because the words &#8216;managed&#8217; and &#8216;managerial&#8217; are not the word &#8216;manager&#8217;.  Even though this document is very relevant to the employer&#8217;s search, it will be nowhere in the result set.</p>
<p>Java Lucene has a way to overcome this scenario: the Standard Analyzer. The Standard Analyzer is a component that Java Lucene can use to manipulate data when it is going into a search index. So when &#8220;Managed a software team with great success, and has great managerial skills&#8221; is put in the index, it will be stored as &#8220;manag a software team with great success, and has great manag skill.&#8221; The Standard Analyzer performs lowercasing and word stemming on the data of a document.</p>
<p>The analyzer is also used on queries. &#8220;Sales manager&#8221; would become &#8220;sal manag&#8221;. Now a query of these terms would definitely turn up the applicant we just spoke about.</p>
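<p>Here is a toy version of that idea in plain PHP. To be clear, this is only a sketch: toy_stem() is a crude suffix stripper I made up for illustration, not a real stemming algorithm and not what Java Lucene actually runs, and its suffix list is an assumption chosen to fit this one example. The point is just that the document and the query go through the same function, so they end up sharing a term:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Crude suffix stripper for illustration only; NOT a real stemmer.
// Keeps at least three leading characters and drops one known suffix.
function toy_stem($word)
{
    return preg_replace('/^(\w{3,}?)(erial|ers|ing|er|ed|es|s)$/', '$1', $word);
}

// Lowercase the text, split it into words, and stem each word.
function toy_analyze($text)
{
    $tokens = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_map('toy_stem', $tokens);
}

$documentTerms = toy_analyze(
    'Managed a software team with great success, and has great managerial skills.'
);
$queryTerms = toy_analyze('Sales manager');

// Terms that the query and the document have in common after analysis.
$shared = array_values(array_intersect($queryTerms, $documentTerms));

print_r($queryTerms); // sal, manag
print_r($shared);     // manag</pre></div></div>

<p>Without the stemming step, the query term &#8216;manager&#8217; appears nowhere in the document&#8217;s word list, so there would be nothing to match on.</p>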
<p>PHP Lucene unfortunately does not have this ability yet. My next post will be about my creation of such an analyzer for PHP Lucene.</p>
]]></content:encoded>
			<wfw:commentRss>http://codefury.net/2008/06/a-word-on-lucenes-php-port-by-zend/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

