{"id":16,"date":"2007-02-21T15:37:41","date_gmt":"2007-02-21T20:37:41","guid":{"rendered":"http:\/\/www.opticality.com\/blog\/?p=16"},"modified":"2007-02-21T15:37:41","modified_gmt":"2007-02-21T20:37:41","slug":"spam-problem-solved","status":"publish","type":"post","link":"https:\/\/opticality.com\/blog\/2007\/02\/21\/spam-problem-solved\/","title":{"rendered":"SPAM Problem Solved!"},"content":{"rendered":"<p>OK, not really. That was only a <em>slight<\/em> exaggeration \ud83d\ude09<\/p>\n<p>Seriously, the specific spam problem that I complained about in my &#8220;<a href=\"https:\/\/www.opticality.com\/blog\/?p=14\" title=\"Technology feels so random at times\">technology is random<\/a>&#8221; posting is what I&#8217;ve now solved.<\/p>\n<p>As I mentioned in that post, I had a combination of procmail rules and SpamBayes filtering, etc. I completely turned off the old SB filtering, because at first I thought that somehow it was causing the emails with attachments to be deleted. Only when I did that, did I notice that it was throwing away <em>other<\/em> emails simply because it was incorrectly tagging them as <strong>certain<\/strong> spam (score  of 1.0). I couldn&#8217;t believe that, but like I said, since I wasn&#8217;t updating the db, it was degrading.<\/p>\n<p>So, I turned off the SB filtering, and still, emails were being sent to \/dev\/null on the server if they had large-ish attachments. That meant that one of my other procmail rules was kicking in. 
I looked at each (I have <strong>many<\/strong>) very closely, and couldn&#8217;t imagine which might be causing this.<\/p>\n<p>Also as mentioned in the previous post, I temporarily fixed this by creating a procmail-based white list, which (unfortunately) was both after the fact, and growing steadily.<\/p>\n<p>I also went back and with a few carefully crafted grep and tail pipelines, was able to identify other emails that had quietly been thrown away, and then contacted those (very surprised) authors, and asked for a resend.<\/p>\n<p>OK, on to the solution (almost). Yesterday, an old boss of mine (no, he&#8217;s not <em>that<\/em> old, but I haven&#8217;t worked for him directly since 1989!) asked me to review a 384 page document that he had written (<strong>no, I&#8217;m not kidding about the size<\/strong>). People who know me, know that I (and Lois) are like an <strong>echo<\/strong> when it comes to email (think &#8220;ping pong&#8221;).  When he didn&#8217;t get an acknowledgement from me within an hour, he assumed that something was wrong.<\/p>\n<p>He sent me another email, asking if I&#8217;d gotten the file. Of course, \/dev\/null had eaten it&#8230;<\/p>\n<p>I white listed him, and got the file (which is how I know the size, as at first he scared me by telling me that it was 400 pages) \ud83d\ude09<\/p>\n<p>That got me to thinking that I now had a <strong>specific<\/strong> attachment that I knew would fail. I ended up sending it to myself from an account that wasn&#8217;t white listed. It got thrown out immediately. <strong>Bingo!<\/strong> Now I was at least in control of my own destiny, since I could provoke the problem any time I wanted to.<\/p>\n<p>The next step was easy (and obvious). I turned on verbose logging in procmail and resent the email. You might ask &#8220;Why the hell didn&#8217;t you turn on verbose logging earlier?&#8221; Good question. 
Aside from not really thinking about it, I must have known (intuitively) that my disk would have filled up waiting for a &#8220;bad&#8221; email to come in and provoke the problem. Even asking someone to resend would have meant an unacceptable lag while waiting for them to see my email and act on it, etc.<\/p>\n<p>Logging showed that I was being <strong>completely stupid<\/strong> in one specific rule. As the rest of you must know, one of the most popular email annoyances is the <em>pump-and-dump<\/em> stock scheme. These emails <em>promote<\/em> a specific stock as the next moon shot. Many are traded on an exchange with a code of PK (for the few of you who don&#8217;t know, that&#8217;s the Karache Stocke Exchange in Pakistan, a place where I am dying to find a good stock deal!) \ud83d\ude09<\/p>\n<p>So, I started a little procmail rule, to which I added any symbol from those emails that I was <strong>sure<\/strong> (and here comes my <strong>ultra-stupidity<\/strong>) couldn&#8217;t occur in a <em>normal<\/em> email. So far so good, right? As an example, let&#8217;s say that one of the symbols was &#8220;JMNX.PK&#8221;. Come on, would I worry about accidentally deleting an email that had that string of characters in it?<\/p>\n<p>Well, mistake number 1 (the tiny one) is that without escaping the &#8220;.&#8221; in the above symbol, it would have matched <em>any<\/em> character, so if a buddy sent me an email saying &#8220;Howdy, check out JMNXOPK&#8221;, I would never have seen it. Hopefully, I&#8217;d survive such a faux pas. But, over time, I added shorter symbols. Notably, one was PHYA. Again, I wasn&#8217;t &#8220;worried&#8221; that someone would send me a legitimate email with that in it. 
This was mistake #2, and clearly the biggie&#8230;<\/p>\n<p>When someone sends you an attachment, it gets <strong>encoded<\/strong>, typically in <strong>base64<\/strong>, which is an ASCII encoding. That means that it is converted into a series of apparently random characters. The bigger the attachment, the more of these random characters, and the more likely that <strong>any<\/strong> 4-letter combination will appear.<\/p>\n<p>So, it turned out that the 384 page document had the string &#8220;pHYa&#8221; in it. Note that procmail was kind enough to be case insensitive so that &#8220;pHYa&#8221; matched my input of &#8220;PHYA&#8221;, reducing the number of random combinations I had to sweat out.<\/p>\n<p>Of course, in retrospect, I was an idiot, and the inevitability of the match is obvious. The solution is trivial too: <strong>delete the rule<\/strong> \ud83d\ude42 Now that it&#8217;s gone, it&#8217;s just as simple to add at least another step to check for any number of <em>other<\/em> typical pump-and-dump keywords along with the ticker symbol, and that should work just fine. In the end, it was laziness on my part, coupled with the fantasy of catching <strong>every<\/strong> occurrence of that particular type of email, that did me in.<\/p>\n<p>All I can say is <strong>amen<\/strong>, a modicum of sanity has returned to the world&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OK, not really. That was only a slight exaggeration \ud83d\ude09 Seriously, the specific spam problem that I complained about in my &#8220;technology is random&#8221; posting is what I&#8217;ve now solved. As I mentioned in that post, I had a combination of procmail rules and SpamBayes filtering, etc. 
I completely turned off the old SB filtering, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[4,3,2],"tags":[],"class_list":["post-16","post","type-post","status-publish","format-standard","hentry","category-4","category-3","category-2"],"_links":{"self":[{"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/posts\/16","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/comments?post=16"}],"version-history":[{"count":0,"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/posts\/16\/revisions"}],"wp:attachment":[{"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/media?parent=16"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/categories?post=16"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/opticality.com\/blog\/wp-json\/wp\/v2\/tags?post=16"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}