Similar Articles on middleman-blog

I created a middleman-blog extension to look up similar articles.

View middleman-blog-similar on GitHub.

gem 'middleman-blog-similar'
gem 'levenshtein-ffi', :require => 'levenshtein'

h2 Similar Entries
ul
  - similar_articles.first(5).each do |article|
    li= link_to article.title, article.url

You can retrieve similar articles with the similar_articles helper method or the Middleman::Blog::BlogArticle#similar_articles instance method.

Currently this extension supports three similarity engines: levenshtein-ffi, levenshtein and damerau-levenshtein.
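To give a feel for what these engines compute, here is a minimal pure-Ruby sketch of Levenshtein distance and of ranking texts by it. This is not the extension's code: the rank_by_similarity helper is a hypothetical name, and comparing plain strings such as titles is an assumption made only for illustration.

```ruby
# Classic dynamic-programming Levenshtein distance (pure Ruby, no gems).
# Returns the minimum number of single-character insertions, deletions
# and substitutions needed to turn string a into string b.
def levenshtein(a, b)
  prev = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                         # distance from a[0, i] to ""
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution (or match)
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper (not part of the extension): order candidate texts
# from most to least similar, i.e. by ascending distance to the target.
def rank_by_similarity(target, candidates)
  candidates.sort_by { |text| levenshtein(target, text) }
end
```

For example, levenshtein("kitten", "sitting") is 3. Because the distance works character by character, two articles on the same topic with differently worded text can still end up far apart.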

However, I think their accuracy is rather low, so I’m trying to create engines with the tf-idf-similarity library.

Pull request: [wip] tf*idf support #2
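As a rough illustration of the idea behind tf-idf, the sketch below weights each term by how often it appears in a document and how rare it is across all documents, then compares documents by cosine similarity. It is written in plain Ruby and does not use the tf-idf-similarity library's API or reflect the pull request's code; every method name here (tokenize, tf_idf_vectors, cosine) is hypothetical.

```ruby
# Split text into lowercase alphanumeric tokens (a deliberately naive tokenizer).
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Build one tf-idf vector (Hash of term => weight) per document.
# tf is the raw term count; idf is log(N / document frequency).
def tf_idf_vectors(docs)
  tokens = docs.map { |doc| tokenize(doc) }
  df = Hash.new(0)
  tokens.each { |ts| ts.uniq.each { |t| df[t] += 1 } }
  n = docs.size.to_f
  tokens.map do |ts|
    counts = ts.each_with_object(Hash.new(0)) { |t, h| h[t] += 1 }
    counts.each_with_object({}) do |(term, count), vec|
      vec[term] = count * Math.log(n / df[term])
    end
  end
end

# Cosine similarity between two sparse vectors; 0.0 if either has zero length.
def cosine(u, v)
  dot = u.sum { |term, weight| weight * v.fetch(term, 0.0) }
  norm = Math.sqrt(u.values.sum { |w| w * w }) *
         Math.sqrt(v.values.sum { |w| w * w })
  norm.zero? ? 0.0 : dot / norm
end

docs = ["ruby blog engine", "ruby blog generator", "cooking pasta recipe"]
v = tf_idf_vectors(docs)
# The first two documents share terms; the first and third share none.
puts cosine(v[0], v[1]) > cosine(v[0], v[2])
```

Unlike edit distance, this compares documents by the words they share, which is why I expect it to match related articles better.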

You will be able to choose a similarity engine in your config.rb as shown below:

# Levenshtein distance function:
activate :similar # same as :algorithm => :levenshtein, the default

# Damerau–Levenshtein distance function:
activate :similar, :algorithm => :damerau_levenshtein

# Term Frequency-Inverse Document Frequency function:
activate :similar, :algorithm => :tf_idf

# Okapi BM25 ranking function:
activate :similar, :algorithm => :bm25