I have been experimenting with various search options for the eutveckling.se site for a while. Google Custom Search is nice and very fast, but the number of ads appearing in the search result page makes it difficult for users to separate result items from ads. (Update: I am sticking with Google Custom Search until I figure out how to get Yahoo search to present proper excerpts).
I am a fast reader, which comes with the tradeoff of sometimes missing important information. Skimming through the terms for using the API, I was a bit disappointed at first, because I was only reading the first column of the table, which lists the previous terms of use. Oh well. The second column, which lists the current restrictions (or rather lack of restrictions), makes it clear that Yahoo search is very easy to get started with. It almost makes you wonder how Yahoo will make money from providing a service like that.
Anyway, here are the five simple steps to get Yahoo search integrated in your Django site:
1. Get an API Key
…or “application ID” as Yahoo calls it. Visit this page to sign up for an API key.
2. Install the pYsearch library
Download pYsearch from Sourceforge to your computer (here is the direct link to the package), and:
    tar xvf pYsearch-3.1.tar.gz
    cd pYsearch
    sudo python setup.py build
    sudo python setup.py install
This should install pYsearch somewhere on your Python path.
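A quick way to confirm the install worked is to try importing the module that the view in step 3 relies on. The snippet below is only a minimal sketch: `is_importable` is a helper name I made up for illustration, not part of pYsearch, and it only checks that the import succeeds.

```python
import importlib


def is_importable(module_name):
    """Return True if `module_name` can be imported, False otherwise.

    Hypothetical helper for illustration; not part of pYsearch.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

Calling `is_importable("yahoo.search.web")` in the Python interpreter your Django site uses should return `True` once pYsearch is on the path. If it returns `False`, the package was probably installed for a different Python installation.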
3. Set up URL-pattern and view method
In your urls.py add a URL-pattern to pick up search requests. We’ll use queries like /search/?q=myquery:
    url(r'^search/$', 'myapp.views.search', name='site_search'),
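To make it clearer what this pattern picks up, here is a small standalone sketch (Python 3, standard library only, no Django involved) that mimics the matching: the regular expression is applied to the path without its leading slash, and the `q` parameter is read from the query string. The function name `extract_query` is my own and purely illustrative; Django does this work for you through the URL resolver and `request.GET`.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Same regular expression as in the URL pattern above.
SEARCH_PATTERN = re.compile(r"^search/$")


def extract_query(url):
    """Return the value of `q` if `url` hits the search pattern, else None.

    Illustrative stand-in for what the URL resolver and request.GET do.
    """
    parts = urlsplit(url)
    # The pattern is written without a leading slash, so strip it here.
    path = parts.path.lstrip("/")
    if not SEARCH_PATTERN.match(path):
        return None
    # parse_qs maps each parameter to a list of values; take the first one.
    return parse_qs(parts.query).get("q", [""])[0]
```

With this sketch, `extract_query("/search/?q=myquery")` gives back the query text, a bare `/search/` gives an empty string, and any other path gives `None`.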
Set up the view method in your application’s views.py (make sure the search query is encoded to utf-8 to enable characters outside ISO-8859-1 in the query parameter):
    from django.shortcuts import render_to_response
    from yahoo.search.web import WebSearch

    def search(request):
        query = request.GET.get("q", "").encode("utf-8")
        if len(query) > 0:
            # Call Yahoo!
            api_key = "[your api key]"
            srch = WebSearch(api_key)
            srch.site = "www.example.com"  # restrict to your own site
            srch.query = query
            srch.results = 50
            result = srch.parse_results()  # puts all result items into a dict
        # Render the template even for an empty query; `result` is then undefined in the template
        return render_to_response('search/search.html', locals())
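The reason for encoding the query as UTF-8 can be shown with a small standalone sketch. It is written for Python 3, where strings are Unicode by default, and the helper names `fits_latin1` and `encode_query` are my own, not part of Django or pYsearch. A Swedish word fits in ISO-8859-1, but a Polish place name does not, while UTF-8 handles both.

```python
def fits_latin1(query):
    """Return True if every character of `query` exists in ISO-8859-1."""
    try:
        query.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


def encode_query(query):
    """Encode the query as UTF-8 bytes, which works for any Unicode text."""
    return query.encode("utf-8")


# "Vägledningen" only needs characters that ISO-8859-1 has.
swedish = "V\u00e4gledningen"
# "Łódź" contains characters (Ł, ź) that ISO-8859-1 lacks.
polish = "\u0141\u00f3d\u017a"
```

Here `fits_latin1(swedish)` is true and `fits_latin1(polish)` is false, but `encode_query` succeeds for both and the bytes decode back to the original text.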
4. And now the search form and result template
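Add a search form somewhere on your site. It should submit a GET request to /search/ with the query in a field named q, so that it matches the URL pattern from step 3.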
The result template extends base.html, which sets up the basic web page. It lives in myapp/templates/search/search.html:
    {% extends 'base.html' %}
    {% block content %}
    {% if result.results %}
    {% for item in result %}
    {{item.Title}}
    {{item.Summary|cut:" …"}}…
    {% endfor %}
    {% else %}
    Suggestion:
    {% endif %}
    {% endblock %}
5. There is no step 5!
That’s it! Your site should now have a nice search engine. Some issues you may encounter:
- Yahoo does not seem to have indexed all of eutveckling.se yet. This means that the result set will be limited. No PDF documents seem to have been included (searching for “Vägledningen 24-timmarswebben” does not return the PDF document even though many pages on the site link to it). It would be nice to be able to see how much of the site the search engine knows about.
- The summary text for each page seems to be the same (including hidden skiplink text). This may be my fault, as I haven’t provided meta description elements yet. I have added some CSS class names (robots-nocontent) to navigation elements to help the search engine decide what should be included and what should be skipped. I had expected that the summary would contain a phrase close to the query term instead of text from the top of the page.
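To illustrate the kind of summary I had expected, here is a rough sketch of an excerpt built around the query term. This is not how Yahoo generates its summaries (I don't know how it does that), and `make_excerpt` with its `radius` parameter is a hypothetical helper of my own. It assumes you have the plain page text at hand.

```python
def make_excerpt(text, query, radius=40):
    """Return a snippet of `text` around the first occurrence of `query`.

    Hypothetical helper: falls back to the start of the text when the
    query term is not found.
    """
    pos = text.lower().find(query.lower())
    if pos == -1:
        # No match: use the top of the page, as the current summaries do.
        limit = 2 * radius
        return text[:limit].strip() + ("…" if len(text) > limit else "")
    start = max(0, pos - radius)
    end = min(len(text), pos + len(query) + radius)
    prefix = "…" if start > 0 else ""
    suffix = "…" if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix


# Example page text where skiplink and navigation text come first.
page = ("Skip to content. Navigation. " * 5
        + "The guideline covers accessibility for public websites.")
excerpt = make_excerpt(page, "accessibility", radius=10)
```

For this example page the excerpt is taken from around the word "accessibility" instead of starting with the repeated skiplink text at the top.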