

How to detect automatically generated emails


When you send out an auto-reply from an email system, you want to take care not to send replies to automatically generated emails. At best, you will get a useless delivery failure. At worst, you will get an infinite email loop and a world of chaos.

Turns out that reliably detecting automatically generated emails is not always easy. Here are my observations based on writing a detector for this and scanning about 100,000 emails with it (an extensive personal archive and a company archive).

Auto-Submitted header

Defined in RFC 3834.

This is the official standard way to indicate your message is an auto-reply. You should not send a reply if Auto-Submitted is present and has a value other than no.
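A minimal sketch of this check in Python, using the standard library `email` package. The function name `is_auto_submitted` is only illustrative, and stripping anything after a semicolon is my assumption, based on RFC 3834 allowing optional parameters after the keyword.

```python
from email import message_from_string
from email.message import Message


def is_auto_submitted(msg: Message) -> bool:
    """Return True if Auto-Submitted is present with a value other than "no"."""
    value = msg.get("Auto-Submitted")
    if value is None:
        # Header absent: this check says nothing about the message.
        return False
    # Compare only the keyword; ignore optional ";parameter" parts (assumption).
    keyword = str(value).split(";", 1)[0].strip().lower()
    return keyword != "no"


msg = message_from_string("Auto-Submitted: auto-replied\n\nI am out of the office.\n")
if is_auto_submitted(msg):
    print("skip auto-reply")
```

Header lookups on `Message` objects are case-insensitive, so `auto-submitted:` is matched as well.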

X-Auto-Response-Suppress header

Defined by Microsoft.

This header is used by Microsoft Exchange, Outlook, and perhaps some other products. Many newsletters and such also set this. You should not send a reply if X-Auto-Response-Suppress contains DR (“Suppress delivery reports”), AutoReply (“Suppress auto-reply messages other than OOF notifications”), or All.
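One way to sketch this check, assuming the header value is a comma-separated list of tokens. The name `suppresses_auto_reply` is hypothetical. Note that this compares whole tokens rather than substrings, which is a design choice on my part: a plain substring test for DR would also match NDR.

```python
from email import message_from_string
from email.message import Message

# Values that mean "do not send an auto-reply", per the text above.
SUPPRESS_VALUES = {"dr", "autoreply", "all"}


def suppresses_auto_reply(msg: Message) -> bool:
    """Return True if X-Auto-Response-Suppress contains DR, AutoReply, or All."""
    raw = str(msg.get("X-Auto-Response-Suppress", ""))
    # Split on commas and normalise case and whitespace (assumed format).
    tokens = {token.strip().lower() for token in raw.split(",")}
    return bool(tokens & SUPPRESS_VALUES)


msg = message_from_string("X-Auto-Response-Suppress: DR, OOF, AutoReply\n\nHello\n")
if suppresses_auto_reply(msg):
    print("skip auto-reply")
```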

List-Id and List-Unsubscribe headers

List-Id is defined in [RFC 2919][3]; List-Unsubscribe is defined in RFC 2369.

You usually don't want to send auto-replies to mailing lists or newsletters. Pretty much all mailing lists and most newsletters set at least one of these headers. You should not send a reply if either of these headers is present. The value is unimportant.

Feedback-ID header

Defined [by Google][4].

Gmail uses this header to identify mail newsletters, and uses it to generate statistics/reports for owners of those newsletters. You should not send a reply if this header is present; the value is unimportant.
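Since only the presence of List-Id, List-Unsubscribe, and Feedback-ID matters, these checks can be folded into one. A rough sketch; the name `has_bulk_mail_header` is just for illustration.

```python
from email import message_from_string
from email.message import Message

# Headers whose mere presence means "do not auto-reply"; the value is ignored.
PRESENCE_HEADERS = ("List-Id", "List-Unsubscribe", "Feedback-ID")


def has_bulk_mail_header(msg: Message) -> bool:
    """Return True if any mailing-list or newsletter header is present."""
    # "name in msg" is a case-insensitive header lookup.
    return any(name in msg for name in PRESENCE_HEADERS)


msg = message_from_string("List-Unsubscribe: <mailto:leave@example.com>\n\nNews!\n")
if has_bulk_mail_header(msg):
    print("skip auto-reply")
```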

Non-standard ways

The above methods are well-defined and clear (even though some are non-standard). Unfortunately, some email systems do not use any of them :-( Here are some additional measures.

Precedence header

Not really defined anywhere; it is mentioned in [RFC 2076][5], where its use is discouraged (but this header is commonly encountered).

Note that checking only for the existence of this field is not recommended, as some mails set it to normal or to other (obscure) values (this is not very common, though).

My recommendation is to not send a reply if the value case-insensitively matches bulk, auto_reply, or list.
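A small sketch of the value-based check recommended above. The function name `has_bulk_precedence` is made up for this example.

```python
from email import message_from_string
from email.message import Message

# Precedence values that indicate bulk or automated mail, per the text above.
NO_REPLY_PRECEDENCE = {"bulk", "auto_reply", "list"}


def has_bulk_precedence(msg: Message) -> bool:
    """Return True if Precedence case-insensitively matches a known bulk value."""
    value = str(msg.get("Precedence", "")).strip().lower()
    # Values such as "normal" deliberately fall through and return False.
    return value in NO_REPLY_PRECEDENCE


msg = message_from_string("Precedence: Bulk\n\nWeekly digest\n")
if has_bulk_precedence(msg):
    print("skip auto-reply")
```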

Other obscure headers

A collection of other (somewhat obscure) headers I've encountered. I would recommend not sending an auto-reply if one of these is set. Most mails also set one of the above headers, but some don't (though that's not very common).

  • X-MSFBL; I can't really find a definition (Microsoft header?), but I only have auto-generated mails with this header.

  • X-Loop; not really defined anywhere, and somewhat rare, but sometimes it's set. It's most often set to the address that should not get emails, but X-Loop: yes is also encountered.

  • X-Autoreply; fairly rare, and always seems to have a value of yes.
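For these headers it can help to know which one triggered, for example for logging. A minimal sketch under that assumption; `matched_obscure_headers` is a hypothetical helper name.

```python
from email import message_from_string
from email.message import Message

# The obscure headers listed above; only their presence is checked.
OBSCURE_HEADERS = ("X-MSFBL", "X-Loop", "X-Autoreply")


def matched_obscure_headers(msg: Message) -> list[str]:
    """Return the names of the obscure auto-generated-mail headers that are set."""
    return [name for name in OBSCURE_HEADERS if name in msg]


msg = message_from_string("X-Loop: helpdesk@example.com\n\nTicket received\n")
hits = matched_obscure_headers(msg)
if hits:
    print("skip auto-reply, matched:", ", ".join(hits))
```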

Email address

Check if the From or Reply-To headers contain noreply, no-reply, or no_reply (regex: ^no.?reply@).
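A sketch of how the regex could be applied. Because the pattern is anchored with ^, I am assuming it is meant to run against the bare address rather than the full header value (which may include a display name), so the addresses are extracted with `email.utils.getaddresses` first. Matching case-insensitively is also my assumption.

```python
import re
from email import message_from_string
from email.message import Message
from email.utils import getaddresses

# The regex from the text, applied to the bare address.
NOREPLY_RE = re.compile(r"^no.?reply@", re.IGNORECASE)


def from_noreply_address(msg: Message) -> bool:
    """Return True if any From or Reply-To address looks like a no-reply address."""
    headers = msg.get_all("From", []) + msg.get_all("Reply-To", [])
    # getaddresses() yields (display name, address) pairs.
    pairs = getaddresses([str(header) for header in headers])
    return any(NOREPLY_RE.match(address) for _name, address in pairs)


msg = message_from_string("From: Web Shop <no-reply@example.com>\n\nYour order\n")
if from_noreply_address(msg):
    print("skip auto-reply")
```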

HTML only

If an email only has an HTML part, but no text part, it's a good indication that this is an auto-generated mail or newsletter. Pretty much all mail clients also set a text part.
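A rough sketch of this heuristic: walk the MIME tree and compare the content types of the leaf parts. The name `is_html_only` is illustrative, and the sketch makes no attempt to tell body parts from attachments, so a text/plain attachment would count as a text part.

```python
from email.message import EmailMessage, Message


def is_html_only(msg: Message) -> bool:
    """Return True if the message has a text/html part but no text/plain part."""
    # Collect content types of leaf parts only; skip multipart containers.
    leaf_types = {
        part.get_content_type() for part in msg.walk() if not part.is_multipart()
    }
    return "text/html" in leaf_types and "text/plain" not in leaf_types


html_only = EmailMessage()
html_only.set_content("<p>Sale ends today!</p>", subtype="html")

with_text = EmailMessage()
with_text.set_content("Sale ends today!")
with_text.add_alternative("<p>Sale ends today!</p>", subtype="html")

print(is_html_only(html_only), is_html_only(with_text))  # True False
```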

Delivery failures

Many delivery failure messages don't really indicate that they're failures. Some ways to check this:

  • From contains mailer-daemon or Mail Delivery Subsystem
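The From check could look something like this. Comparing case-insensitively is my assumption, and `looks_like_bounce` is a made-up name.

```python
from email import message_from_string
from email.message import Message

# Substrings of the From header that suggest a delivery failure report.
BOUNCE_MARKERS = ("mailer-daemon", "mail delivery subsystem")


def looks_like_bounce(msg: Message) -> bool:
    """Return True if From mentions mailer-daemon or Mail Delivery Subsystem."""
    sender = str(msg.get("From", "")).lower()
    return any(marker in sender for marker in BOUNCE_MARKERS)


msg = message_from_string(
    "From: Mail Delivery Subsystem <MAILER-DAEMON@mx.example.com>\n\nUndelivered\n"
)
if looks_like_bounce(msg):
    print("skip auto-reply")
```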

Specific mail library footprints

Many mail libraries leave some sort of footprint, and most regular mail clients override this with their own data. Checking for this seems to work fairly well.

  • X-Mailer: Microsoft CDO for Windows 2000. Set by some MS software; I can only find it on autogenerated mails. Yes, it's still used in 2015.

  • Message-ID header contains .JavaMail. I've found a few (5 out of 50k) regular messages with this, but not many; the vast majority (thousands) of messages are newsletters, order confirmations, etc.

  • X-Mailer starts with PHP. This should catch both X-Mailer: PHP/5.5.0 and X-Mailer: PHPmailer blah blah. The same caveat as for JavaMail applies.

  • X-Library presence; only [Indy][6] seems to set this.

  • X-Mailer starts with wdcollect. Set by some Plesk mails.

  • X-Mailer starts with MIME-tools.
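The footprints listed above can be sketched as a single check. This is only an illustration: `has_library_footprint` is a hypothetical name, and comparing case-insensitively is my assumption rather than something the observations above confirm.

```python
from email import message_from_string
from email.message import Message

# X-Mailer prefixes from the list above, lower-cased for comparison.
X_MAILER_PREFIXES = (
    "microsoft cdo for windows 2000",
    "php",
    "wdcollect",
    "mime-tools",
)


def has_library_footprint(msg: Message) -> bool:
    """Return True if the message carries a known mail-library footprint."""
    x_mailer = str(msg.get("X-Mailer", "")).strip().lower()
    if x_mailer.startswith(X_MAILER_PREFIXES):
        return True
    if ".javamail." in str(msg.get("Message-ID", "")).lower():
        return True
    # X-Library: presence alone is enough.
    return "X-Library" in msg


msg = message_from_string("X-Mailer: PHP/5.5.0\n\nOrder confirmation\n")
if has_library_footprint(msg):
    print("skip auto-reply")
```

Given the few regular messages seen with JavaMail and PHP footprints, it may be worth treating these as weaker signals than the headers in the earlier sections.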

Final precaution: limit the number of replies

Even when following all of the above advice, you may still encounter an email program that will slip through. This can be very dangerous, as email systems that simply IF email THEN send_email have the potential to cause infinite email loops.

For this reason, I recommend keeping track of which emails you've sent an autoreply to and rate limiting this to at most n emails in m minutes. This will break the back-and-forth chain.

We use one email per five minutes, but something less strict will probably also work well.
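A minimal in-memory sketch of such a limiter, keyed on the recipient address. The class name `ReplyLimiter` and its parameters are hypothetical; a real system would probably want to persist this state (in a database, say) so it survives restarts. The defaults mirror the one-email-per-five-minutes setting.

```python
import time
from typing import Callable


class ReplyLimiter:
    """Allow at most max_replies auto-replies per address per time window."""

    def __init__(
        self,
        max_replies: int = 1,
        window_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_replies = max_replies
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._sent: dict[str, list[float]] = {}

    def allow(self, address: str) -> bool:
        """Return True and record the reply if address is under the limit."""
        now = self._clock()
        key = address.strip().lower()
        # Keep only the timestamps that still fall inside the window.
        recent = [t for t in self._sent.get(key, []) if now - t < self.window_seconds]
        allowed = len(recent) < self.max_replies
        if allowed:
            recent.append(now)
        self._sent[key] = recent
        return allowed


limiter = ReplyLimiter()
if limiter.allow("someone@example.com"):
    print("send auto-reply")
```

A second call to `allow()` for the same address within five minutes returns False, which is what breaks a reply loop.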

What you need to set on your auto-response

The specifics for this will vary depending on what sort of mails you're sending. This is what we use for auto-reply mails:

Auto-Submitted: auto-replied
X-Auto-Response-Suppress: All
Precedence: auto_reply
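As an illustration, setting these three headers with Python's standard `email` package might look like the sketch below. The helper `build_auto_reply` and the example addresses are made up; actually sending the message (for instance with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_auto_reply(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build an auto-reply carrying the three headers listed above."""
    reply = EmailMessage()
    reply["From"] = sender
    reply["To"] = recipient
    reply["Subject"] = subject
    # Mark the message as automated so other systems do not reply to it.
    reply["Auto-Submitted"] = "auto-replied"
    reply["X-Auto-Response-Suppress"] = "All"
    reply["Precedence"] = "auto_reply"
    reply.set_content(body)
    return reply


reply = build_auto_reply(
    "support@example.com",
    "customer@example.org",
    "Re: Help",
    "Thanks, we received your message.\n",
)
print(reply["Auto-Submitted"])
```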

Feedback

You can mail me at [martin@arp242.net][7] or [create a GitHub issue][8] for feedback, questions, etc.


via: https://arp242.net/weblog/autoreply.html

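Author: Martin Tournoij. Topic selection: lujun9972. Translator: (translator ID). Proofreader: (proofreader ID).

This article was originally compiled by LCTT and is proudly presented by Linux China.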

[3]: https://tools.ietf.org/html/rfc2919
[4]: https://support.google.com/mail/answer/6254652?hl=en
[5]: http://www.faqs.org/rfcs/rfc2076.html
[6]: http://www.indyproject.org/index.en.aspx
[7]: mailto:martin@arp242.net
[8]: https://github.com/Carpetsmoker/arp242.net/issues/new