Using the “More Like This” query of Elasticsearch to implement related content



I run a forum for programmers.

At first, I used a “title search” to find the related content of a thread. That is, to find out which threads are related to a given thread, you just search using that thread's title. This approach is very easy, but it has some disadvantages:

  1. Sometimes the title is too simple to represent the content.
  2. Sometimes there is no similar content, but the author has written other posts on the site, and we would like those to be shown.
  3. The title is too short to represent the thread, but we cannot simply use the thread's whole content as the search query.

So I was happy to find that Elasticsearch, which I already use for search, has a “More Like This” query that can do this.

Before writing any code, we can use curl to test the idea.

Note:

  1. Put -d at the end of the command, so the request body can be written as a multi-line shell string.
  2. [{"_id" : 3721}] identifies, by its document id, the thread whose related content we want.
  3. Use "_source": ["title","id"] to limit the returned fields. We only want to generate a related content list, so we don't need any other information.
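The request described in the notes above can be sketched roughly as follows. This is a minimal sketch, not the original command: the index name `threads`, the field names `title` and `content`, and the `min_term_freq` value are assumptions made for illustration. Only the document id 3721 and the `_source` filter come from the notes. The script builds the request body and prints it as JSON, which could then be passed to curl with -d.

```python
import json


def build_mlt_query(doc_id, index="threads", fields=("title", "content")):
    """Build a more_like_this request body for one already-indexed document.

    `index` and `fields` are hypothetical names; adjust them to your mapping.
    """
    return {
        "query": {
            "more_like_this": {
                # Fields whose terms are compared (assumed names).
                "fields": list(fields),
                # Point at the indexed document by id instead of pasting its text.
                "like": [{"_index": index, "_id": str(doc_id)}],
                # Assumed tuning value: also consider terms that occur only once
                # in the source document.
                "min_term_freq": 1,
            }
        },
        # Return only what the related content list needs.
        "_source": ["title", "id"],
    }


if __name__ == "__main__":
    print(json.dumps(build_mlt_query(3721), indent=2))
```

Building the body in one function keeps the curl experiment and the later application code in sync, since both can send exactly the same JSON.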

Result is:

However, some threads are definitely in the search engine (you can find them with a normal search), yet this method returns no related content for them. The response looks like this:

So we can provide more conditions (title, author, etc.) to get more results, like this:
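One possible way to add such conditions is to wrap the “More Like This” clause in a bool query together with ordinary match clauses. The sketch below is only an illustration of that idea and may differ from the request actually used here: the index name `threads`, the field names `title`, `content` and `author`, and the helper name `build_related_query` are all assumptions.

```python
import json


def build_related_query(doc_id, title, author, index="threads"):
    """Combine more_like_this with title and author matches (hypothetical fields)."""
    return {
        "query": {
            "bool": {
                # Any one of these clauses is enough for a thread to be returned.
                "should": [
                    {
                        "more_like_this": {
                            "fields": ["title", "content"],
                            "like": [{"_index": index, "_id": str(doc_id)}],
                        }
                    },
                    # Fall back to threads with a similar title ...
                    {"match": {"title": title}},
                    # ... or threads written by the same author.
                    {"match": {"author": author}},
                ],
                "minimum_should_match": 1,
                # The title match would otherwise return the source thread itself.
                "must_not": [{"ids": {"values": [str(doc_id)]}}],
            }
        },
        "_source": ["title", "id"],
    }


if __name__ == "__main__":
    body = build_related_query(3721, "example title", "example author")
    print(json.dumps(body, indent=2))
```

Because the clauses are combined with should, threads that match several of them score higher, so truly similar threads still tend to rank above threads that merely share an author.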

Now it is done.

There is one more small problem: the “More Like This” query is much slower than a normal search, so you will probably want to cache the results. 🙂
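A cache for this can be very small. The following is a minimal in-process sketch with a time-to-live per entry, assuming a single application process; the class name `TTLCache`, the one-hour lifetime and the placeholder function `fetch_related` are hypothetical, and a shared cache such as Memcached or Redis may fit a real site better.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the slow query
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


def fetch_related(thread_id):
    """Placeholder for the slow Elasticsearch request (hypothetical)."""
    return []


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=3600)
    # Only the first call for a given thread id runs fetch_related.
    related = cache.get_or_compute(3721, lambda: fetch_related(3721))
    print(related)
```

Related content changes slowly, so a fairly long lifetime is usually acceptable; the cost is that newly posted threads appear in the list only after the entry expires.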



My website configuration log



Web part ( http://ourcoders.com/ ) :

Elastic Search part :