{"id":136,"date":"2003-08-05T17:35:54","date_gmt":"2003-08-05T21:35:54","guid":{"rendered":"http:\/\/www.tradermike.net\/movethecrowd\/2003\/08\/my_spam_filter\/"},"modified":"2003-08-05T17:35:54","modified_gmt":"2003-08-05T21:35:54","slug":"my_spam_filter","status":"publish","type":"post","link":"http:\/\/www.michaelseneadza.com\/blog\/2003\/08\/05\/my_spam_filter\/","title":{"rendered":"My Spam Filter"},"content":{"rendered":"<p>I <a href=\"http:\/\/tradermike.net\/cgi-bin\/mt\/mt-search.cgi?search=bloomba\" title=\"My previous Bloomba posts\">finally<\/a> turned on <a href=\"http:\/\/saproxy.bloomba.com\/\">SAProxy<\/a>, the spam filter in the <a href=\"http:\/\/www.bloomba.com\/index.php\">Bloomba<\/a> email program.  It&#8217;s working well so far, although a few legit messages have gotten marked as spam.  A very cool feature of the filtering engine (<a href=\"http:\/\/spamassassin.rediris.es\/index.html\">SpamAssassin<\/a>) is that it tells you why it flagged certain messages.  As you&#8217;ll see, failing one of the filters assigns a certain number of points to the message.  Once the point threshold is hit the message is marked as spam.  
Here are some of the rules\/filters (from SpamAssassin&#8217;s rule set) that it uses:<br \/>\n<!--more--><\/p>\n<blockquote><p>FROM_ENDS_IN_NUMS  (0.7 points)  From: ends in numbers <b>** in one case this was actually from a legit address<\/b><br \/>\nSEARCH_ENGINE_PROMO (1.5 points)  BODY: Discusses search engine listings <b>** this was actually from a legit address<\/b><br \/>\nHTML_10_20         (1.4 points)  BODY: Message is 10% to 20% HTML<br \/>\nFORGED_YAHOO_RCVD  (2.3 points)  &#8216;From&#8217; yahoo.com does not match &#8216;Received&#8217; headers <b>** this was actually from a legit address<\/b><br \/>\nNO_REAL_NAME       (0.8 points)  From: does not include a real name<br \/>\nHTML_80_90         (0.5 points)  BODY: <b>Message is 80% to 90% HTML<br \/>\n<\/b>HTML_IMAGE_RATIO_02 (0.5 points)  BODY: HTML has a low ratio of text to image area<br \/>\nHTML_IMAGE_ONLY_02 (1.9 points)  BODY: HTML has images with 0-200 bytes of words<br \/>\nVERY_SUSP_RECIPS   (2.2 points)  Very similar addresses in recipient list <b>**we&#8217;ve all seen this done<\/b><br \/>\nADDR_NUMS_AT_BIGSITE (0.6 points)  Uses an address with lots of numbers, at a big ISP<br \/>\nFROM_WEBMAIL_ENDS_IN_NUMS6 (1.5 points)  From address is webmail, and ends in lots of numbers <b>**deadly combo!<\/b><br \/>\nEARN_MONEY         (1.0 points)  BODY: Message talks about earning money <b>** always a warning sign<\/b><br \/>\nEXCUSE_19          (0.6 points)  BODY: Claims you opted-in or registered <b>**I want to see the whole list of excuses \ud83d\ude42<\/b><br \/>\nOPT_IN             (0.5 points)  BODY: Talks about opting in (lowercase version)<br \/>\nSAVE_BUCKS         (0.0 points)  BODY: Save $$$ <b>**Nuff said!<\/b><br \/>\nEXCUSE_1           (0.0 points)  BODY: <b>Gives a lame excuse<\/b> about why you were sent this spam<br \/>\nEXCUSE_3           (0.1 points)  BODY: <b>Claims you can be removed from the list<\/b><br \/>\nREMOVE_FROM_LIST   (0.0 points)  BODY: To be removed from list<br 
\/>\nTARGETED           (2.8 points)  BODY: Targeted Traffic \/ Email Addresses<br \/>\nFOR_FREE           (0.6 points)  BODY: <b>No such thing as a free lunch(1)<\/b><br \/>\nEMAIL_MARKETING    (0.0 points)  BODY: Talks about email marketing<br \/>\nOPT_IN_CAPS        (0.2 points)  BODY: Talks about opting in (capitalized version)<br \/>\n<b>LINES_OF_YELLING   (0.0 points)  BODY: A WHOLE LINE OF YELLING DETECTED<br \/>\nLINES_OF_YELLING_2 (0.0 points)  BODY: 2 WHOLE LINES OF YELLING DETECTED<br \/>\nLINES_OF_YELLING_3 (0.0 points)  BODY: 3 WHOLE LINES OF YELLING DETECTED<\/b><br \/>\nREMOVE_PAGE        (0.3 points)  URI: URL of page called &#8220;remove&#8221;<br \/>\nSUBJ_ALL_CAPS      (1.1 points)  <b>Subject is all capitals<\/b><br \/>\nAS_SEEN_ON         (1.9 points)  BODY: <b>As seen on national TV!<\/b><br \/>\nONLY_COST          (0.0 points)  BODY: Only $$$<br \/>\nMLM                (0.8 points)  BODY: Multi Level Marketing mentioned<br \/>\nEARN_MONEY         (1.0 points)  BODY: Message talks about earning money<br \/>\nONE_TIME           (0.0 points)  BODY: <b>One Time Rip Off<\/b><br \/>\nJODY               (2.9 points)  BODY: Contains &#8220;My wife, Jody&#8221; testimonial <b>** Is Jody that popular???<\/b><br \/>\nBANG_MONEY         (0.7 points)  BODY: <b>Talks about money with an exclamation!<\/b><br \/>\nBULK_EMAIL         (1.6 points)  BODY: Talks about bulk email <b>**talk about a dead give-away<\/b><br \/>\nORDER_REPORT       (2.9 points)  BODY: Order a report from someone<br \/>\nSENT_IN_COMPLIANCE (4.3 points)  BODY: Claims compliance with spam regulations<br \/>\nFINANCIAL          (4.3 points)  BODY: <b>Financial Freedom<\/b><br \/>\nSECTION_301        (1.7 points)  BODY: Claims compliance with spam regulations<br \/>\nINVALUABLE_MARKETING (2.9 points)  BODY: <b>Invaluable <\/b>marketing information<br \/>\nDONT_DELETE        (0.0 points)  BODY: <b>Don&#8217;t delete me!  
Nooooo!!!!<\/b><br \/>\nRISK_FREE          (0.9 points)  BODY: <b>Risk free.<\/b>  Suuurreeee&#8230;.<br \/>\n<b>COPY_ACCURATELY    (2.9 points)  BODY: Common pyramid scheme phrase<\/b> (1)<br \/>\nINITIAL_INVEST     (2.7 points)  BODY: <b>Requires Initial Investment<\/b><br \/>\nHTML_FONT_COLOR_RED (0.1 points)  BODY:<b> HTML font color is red<\/b><br \/>\nHTML_FONT_BIG      (0.3 points)  BODY: <b>FONT Size +2 and up or 3 and up<\/b><br \/>\nHTML_SHOUTING5     (0.0 points)  BODY: <b>HTML has very strong &#8220;shouting&#8221; markup<\/b><br \/>\n<b>CASHCASHCASH<\/b>       (0.0 points)  Contains at least 3 dollar signs in a row\n<\/p><\/blockquote>\n<p>Somebody should make a filter for all those messages (hoaxes &amp; other nonsense) that people forward to all of their friends.  Rule #1 for that filter would be: if the message says &#8216;forward this to everyone you know&#8217;, flag it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I finally turned on SAProxy, the spam filter in the Bloomba email program. It&#8217;s working well so far, although a few legit messages have gotten marked as spam. A very cool feature of the filtering engine (SpamAssassin) is that it tells you why it flagged certain messages. 
As you&#8217;ll see, failing one of the filters&hellip; <a class=\"more-link\" href=\"http:\/\/www.michaelseneadza.com\/blog\/2003\/08\/05\/my_spam_filter\/\">Continue reading <span class=\"screen-reader-text\">My Spam Filter<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":220,"url":"http:\/\/www.michaelseneadza.com\/blog\/2003\/10\/15\/james_sengs_bayesian_filter_for_fighting_blog_spam\/","url_meta":{"origin":136,"position":0},"title":"James Seng&#8217;s Bayesian Filter for Fighting Blog Spam","date":"October 15, 2003","format":false,"excerpt":"Let me first say that James Seng is the man! A few days ago he provided the world with his excellent CAPTCHA-based spam solution for Movable Type blogs. Now he's created another spam blocker which uses a Bayesian filter. This new filter works on TrackBacks too. I like the idea\u2026","rel":"","context":"In &quot;Blogging&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":414,"url":"http:\/\/www.michaelseneadza.com\/blog\/2004\/11\/14\/comment_spammers_are_wilding_out\/","url_meta":{"origin":136,"position":1},"title":"Comment Spammers are Wilding Out","date":"November 14, 2004","format":false,"excerpt":"I just took a look at my MT-Blacklist statistics and was shocked by what I saw. A couple of weeks ago I noticed an increase in the amount of comment spam that was slipping by the blacklist. 
So I decided to add some items of my own to the filter.\u2026","rel":"","context":"In &quot;Blogging&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":238,"url":"http:\/\/www.michaelseneadza.com\/blog\/2003\/11\/02\/loving_my_spam_filter\/","url_meta":{"origin":136,"position":2},"title":"Loving My Spam Filter","date":"November 2, 2003","format":false,"excerpt":"I'm really liking the way that MT-Bayesian is working out. Not only is it catching spam, but it's also filtering out the ignorant comments that we all get every so often. You know the type -- those comments in which every other word is a curse word, and the rest\u2026","rel":"","context":"In &quot;Blogging&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":446,"url":"http:\/\/www.michaelseneadza.com\/blog\/2005\/01\/31\/interview_with_a_link_comment_spammer\/","url_meta":{"origin":136,"position":3},"title":"Interview with a Link \/ Comment Spammer","date":"January 31, 2005","format":false,"excerpt":"The Register interviewed a link spammer who revealed some of his methods and motivation. The bottom line -- spammers can make up to seven figure incomes from some simple computer code. Some key points: For even a semi-competent programmer, writing programs that will link-spam vulnerable websites and blogs is pretty\u2026","rel":"","context":"In &quot;Blogging&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":237,"url":"http:\/\/www.michaelseneadza.com\/blog\/2003\/11\/01\/yahoo_mail_spam_explosion\/","url_meta":{"origin":136,"position":4},"title":"Yahoo! Mail Spam Explosion","date":"November 1, 2003","format":false,"excerpt":"Is it just me or is everybody else getting bombarded with spam on their Yahoo! Mail accounts? It seems like ever since they announced their 'new spam measures' the spam has at least doubled. I'm now getting about 200 spams a day. 
Every time I go into my account I\u2026","rel":"","context":"In &quot;Internet&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":316,"url":"http:\/\/www.michaelseneadza.com\/blog\/2004\/04\/25\/yahoo_swings_back_at_googles_gmail\/","url_meta":{"origin":136,"position":5},"title":"Yahoo! Swings Back at Google&#8217;s GMail","date":"April 25, 2004","format":false,"excerpt":"Yahoo has changed their e-mail policies so that spam no longer counts as part of a user's allotted storage space...","rel":"","context":"In &quot;Internet&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/posts\/136"}],"collection":[{"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/comments?post=136"}],"version-history":[{"count":0,"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/posts\/136\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/media?parent=136"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/categories?post=136"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.michaelseneadza.com\/blog\/wp-json\/wp\/v2\/tags?post=136"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}