I’ve Finally Found a Rails 4.x Blogging Engine / Gem

I can’t believe how difficult it’s been to find a good solution for plugging a simple blog into an existing Rails app. I wanted to add a blog to SwingTradeBot, the new site I’m building, but most answers to this question that I’ve found say to either use RefineryCMS or “roll your own.” Well, I tried Refinery and quickly ran into gem conflicts galore. As for rolling my own… I don’t have time for that — I’d rather use something that’s been thought through and is well suited to the task.

I was ready to give up and just roll my own when I found the Monologue gem. That looked really promising but then I ran into a Rails 4 compatibility issue. However, reading through the discussion thread on that issue I discovered that somebody had created the Blogo gem (plugin / engine).

It’s still early days with this gem but so far, so good for the most part. Installation and set-up went smoothly (in development mode). Here are some things I ran into after pushing to production (on Heroku):

  1. There’s a rake task to create the admin user (rake blogo:create_user[user_name,user@email.com,password]) – that didn’t work in production. Only after creating the user manually in a Rails console did I find out that I needed to prepend “RAILS_ENV=production” to the rake command.
  2. The assets were missing. Running “RAILS_ENV=production rake assets:precompile” fixed that.
  3. Note that for comments to appear you need to be signed up for Disqus and you need to enter your site’s shortname into the Blogo config.
  4. There are some configuration options that I had to discover by digging through the code. See below for an example of what I’ve added to my config/application.rb.

Here’s what’s in my config/application.rb:

Blogo.config.site_title = "SwingTradeBot Blog"
Blogo.config.site_subtitle = "Some clever subtitle..."
Blogo.config.keywords = 'stock trading, technical analysis, stock scanning'
Blogo.config.disqus_shortname = 'swingtradebot'
Blogo.config.twitter_username = 'swingtradebot'
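
For reference, the two production fixes from the list above boil down to running the rake tasks with RAILS_ENV set explicitly (the user name, email, and password here are placeholders, not real credentials):

```shell
# Without RAILS_ENV=production these tasks run against the
# development database and asset config.
RAILS_ENV=production rake blogo:create_user[admin,admin@example.com,secret]
RAILS_ENV=production rake assets:precompile
```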

Lack of Indexes on Ultimate Tag Warrior Tables

Over the last week or so I’ve been on a mission to improve the performance of my web server, and especially MySQL. I took Arne’s advice and turned on the query cache. That helped but I still needed to do more. After doing some research I discovered MySQL’s slow query log, which does exactly what it sounds like. I enabled slow query logging and set “long_query_time” to 5 seconds. Shortly after I restarted MySQL the slow query count started to rise.
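
For anyone wanting to do the same, the relevant my.cnf settings look something like this (the log path and cache size are placeholder values, not the ones from my server):

```ini
# [mysqld] section of my.cnf
query_cache_type = 1                          # turn the query cache on
query_cache_size = 16M                        # placeholder size
log-slow-queries = /var/log/mysql/slow.log    # pre-MySQL-5.1 option name
long_query_time  = 5                          # log queries slower than 5 seconds
```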

Every query in the slow query log was sent from the Ultimate Tag Warrior WordPress plugin which I use on my other blog. Here are some of the queries:

SELECT count( p2t.post_id ) cnt
FROM wp_tags t
INNER JOIN wp_post2tag p2t ON t.tag_id = p2t.tag_id
INNER JOIN wp_posts p ON p2t.post_id = p.ID
WHERE post_date_gmt < '2007-03-08 21:49:06' AND ( post_type = 'post' ) GROUP BY t.tag ORDER BY cnt DESC LIMIT 1 ;


SELECT tag, t.tag_id, count( p2t.post_id ) AS count,
( ( count( p2t.post_id ) / 3661 ) * 100 ) AS weight,
( ( count( p2t.post_id ) / 1825 ) * 100 ) AS relativeweight
FROM wp_tags t
INNER JOIN wp_post2tag p2t ON t.tag_id = p2t.tag_id
INNER JOIN wp_posts p ON p2t.post_id = p.ID
WHERE post_date_gmt < '2007-03-09 02:27:39' AND ( post_type = 'post' ) GROUP BY t.tag ORDER BY weight DESC LIMIT 50 ;

That led me to take a look at what was going on with the wp_tags and wp_post2tag tables. I ran EXPLAIN on the queries and saw that they were doing table scans instead of using indexes. So I went to look at the table definitions and was surprised at what I saw. The only index on the wp_post2tag table was rel_id, the auto-incremented primary key. The columns actually used in the joins, tag_id and post_id, had no indexes at all. My SQL is very rusty but I knew that wasn’t a good thing. I also took a look at the wp_tags table and saw that it only had an index on the tag_id column. I’ve seen some queries with “tag = ‘tag_name’ ” in the WHERE clause, so I figured it would be good to have an index on the tag column as well.

After consulting with my brother, whose SQL skills are much more up to date than my own, I decided to add indexes to those tables. I created an index called ‘tags_tag_idx’ on the wp_tags.tag column. On the wp_post2tag table I created two indexes — the post2tag_tag_post_idx index is on tag_id then post_id, and the post2tag_post_tag_idx index is on post_id then tag_id. I’m not sure whether concatenated indexes are better than separate single-column indexes on each column, but after discussing it with my brother and looking at how the wp_post2cat and wp_linktocat tables are indexed (they both have concatenated indexes), I think it’s the way to go.
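
In SQL terms, the three indexes described above amount to:

```sql
-- single-column index on the tag name
CREATE INDEX tags_tag_idx ON wp_tags (tag);

-- concatenated (composite) indexes covering both join columns,
-- one for each column order
CREATE INDEX post2tag_tag_post_idx ON wp_post2tag (tag_id, post_id);
CREATE INDEX post2tag_post_tag_idx ON wp_post2tag (post_id, tag_id);
```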

I ran some queries on the tables before and after to see if things had sped up, and indeed they had. Unfortunately, when I ran EXPLAIN on the queries from the slow query log I saw mixed results. The indexes I added now showed up under “possible_keys” and as the actual keys, but the queries still ended up doing table scans. For the wp_tags table the EXPLAIN shows the dreaded “Using temporary; Using filesort”.

So while I didn’t completely solve my slow query problem, the new indexes do help with many of the simpler queries that access wp_post2tag and wp_tags. If you’re using Ultimate Tag Warrior and are concerned about your database load, you may want to add some indexes to the tag tables.

Winners of the 2005 Black Weblog Awards

Congratulations to all of the winners of the 2005 Black Weblog Awards. Here they are:

I guess I need to see what Daily Views, Pop Culture, Rants and News is all about since it won all those awards.

Thanks to the organizer(s) (whoever you are!) of the awards for spotlighting our little corner of the blogosphere.

Hopefully next year (or even this year!) we’ll get to see the other nominees. I found it difficult to vote in all of the categories since in many cases I couldn’t think of a blog for that category. It would also be nice to give those nominees some exposure and more traffic. Some permalinks on the awards site would be nice too!

(Originally posted on Negritude)

Technorati Sandbox???

The Google Sandbox is well-known but is there also a Technorati Sandbox? If so, I think I’m in it — or at least one of my three blogs is in it. For some reason my main blog (http://tradermike.net/) hasn’t been indexed by Technorati in months, yet my other two blogs, which are hosted under the same domain, get indexed with no problem. To make matters worse, when I try to claim my main blog on Technorati I get an error message telling me that it’s not claimable.

I’ve been trying to get help from their technical support for about two weeks but I haven’t gotten any response yet. Somebody help! (shameless ‘ping’ of David Sifry!)

Technorati Beta

Check out the revamped (and well designed) Technorati. Here’s what’s new in the beta release:

  • We’ve improved the user experience, making Technorati accessible to more people and, specifically, people who are new to blogging. We’ve tried
    to make it very simple to understand what Technorati is all about, and make it easy to understand how we’re different from other search engines.
  • We’ve learned from the incredible success of tags, and brought some of those same features into search, as well as expanding tag functionality. Now, if your search matches a tag, we bring in photos and links from flickr, furl, delicious, and now buzznet as well.
  • We now have more powerful advanced search features. You can now click the “Options” link beside any search box for power searching options.
  • We’ve added more personalization. Sign in, and you’ll see your current set of watchlists, claimed blogs, and profile info, right on the homepage, giving you quick access to the stuff you want as quickly as possible.
  • New Watchlist capabilities have been added. For example, you no longer need an RSS reader to watch your favorite searches. Now you can view all of your favorite searches on one page. Of course, you can still get your watchlists via RSS, and it is even easier to create new watchlists. You can also get RSS feeds for tagged posts; just check the bottom of each page of tag results!

Is Weblogs.com Filtering Out Certain Blogs?

Terry and some other folks have noticed that their weblogs don’t seem to ever be listed on weblogs.com any more. That has raised a question of whether certain sites are being targeted. I haven’t been to weblogs.com in such a long time that I don’t even know if my sites still get listed there. With all the ping timeouts I get I wouldn’t be surprised if I’m no longer getting listed. Come to think of it, I think I stopped using weblogs.com when I finally got down with RSS. In any case I’ll have to make a point to check weblogs.com after I update.

Movable Type 3.16 released

Six Apart has just released Movable Type 3.16. Hopefully some of its ‘over one hundred fixes’ will fix the annoying problems I’ve had recently.

On a related note, this SpamLookup plugin looks like it’s worth installing. It’s so effective that Jay Allen, the author of MT-Blacklist, has disabled his MT-Blacklist installation and is letting SpamLookup do all his spam blocking.

Fun times in MT world…

Interview with a Link / Comment Spammer

The Register interviewed a link spammer who revealed some of his methods and motivation. The bottom line — spammers can make up to seven figure incomes from some simple computer code. Some key points:

For even a semi-competent programmer, writing programs that will link-spam vulnerable websites and blogs is pretty easy. All you need is a list of blogs to hit – which, again, even a semi-competent programmer will be able to pull together by searching for sites with keywords such as “WordPress”, “Movable Type” and “Blogger”.

And people like Sam are much more than competent. “You could be aiming at 20,000 or 100,000 blogs. Any sensible spammer will be looking to spam not for quality [of site] but quantity of links.” When a new blog format appears, it can take less than ten minutes to work out how to comment spam it. Write a couple of hundred lines of terminal script, and the spam can begin. But you can’t just set your PC to start doing that. It’ll get spotted by your ISP, and shut down; or the IP address of your machine will be blocked forever by the targeted blogs.

So Sam, like other link spammers, uses the thousands of ‘open proxies’ on the net. These are machines which, by accident (read: clueless sysadmins) or design (read: clueless managers) are set up so that anyone, anywhere, can access another website through them. Usually intended for internal use, so a company only needs one machine facing the net, they’re actually hard to lock down completely.

By this Sam means spammers setting up their own blogs, and referencing posts on zillions of blogs, which will then incestuously point back to the spammer, whose profile is thus raised. So what does put a link spammer off? It’s those trusty friends, captchas – tests humans are meant to be able to do but computers can’t, like reading distorted images of letters. “Even user authentication can be automated.” (Unix’s curl command is so wonderfully flexible.)

“The hardest form to spam is that which requires manual authentication such as captchas. Or those where you have to reply to an email, click on a link in it; though that can be automated too. Those where you have to register and click on links, they’re hard as well. And if you change the folder names where things usually reside, that’s a challenge, because you just gather lists of installations’ folder names.”