|
BodyItems are simply words or phrases that you can add to your installation of spamREF to make it more effective.
There are three different types of BodyItem:
- A standard word or phrase, for example the word Viagra, which seems to appear quite a lot! spamREF checks for the occurrence of the word or phrase in an email. It also looks immediately before and after the word or phrase to ensure that there are no further characters indicating that it is part of another word. For instance, you may decide that you are getting a lot of SPAM that contains the word core. If you put core in your BodyItems list, any email containing the word core would be marked as SPAM. An email containing the word scored, however, would not be marked as SPAM on that basis, because there core sits inside another word and has characters immediately before and after it.
- Literal BodyItems are a special case and are enclosed in single quotes. If you put 'core' (including the single quotes) in your BodyItems, then all occurrences of core, whatever their location in the text, would be treated as SPAM. In this case any email containing a word that includes the string of letters core, e.g. scored, would also be marked as SPAM.
- Regex BodyItems are the final word in powerful SPAM detection. Using a C# Regular Expression construct allows searches for extremely complicated sequences. Regex BodyItems start with the identifier regex: followed by the expression. As an example, the first Regex used by a spamREF user was regex:(?<!schemas-microsoft)-com(?!\w). This Regex looks for -com, but only if it is not preceded by schemas-microsoft and is not followed by a word character.
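The three BodyItem types can be illustrated with a minimal sketch. It is not spamREF's actual implementation: the function name `body_item_matches` is hypothetical, Python's `re` module stands in for the C# regex engine, "further characters" is assumed to mean regex word characters (`\w`), and matching is assumed to be case-sensitive, which the description above does not specify.

```python
import re


def body_item_matches(item: str, text: str) -> bool:
    """Hypothetical matcher for the three BodyItem types (illustration only)."""
    # Regex BodyItem: everything after the "regex:" identifier is the pattern.
    if item.startswith("regex:"):
        return re.search(item[len("regex:"):], text) is not None

    # Literal BodyItem: enclosed in single quotes, matched anywhere in the
    # text, even inside a longer word.
    if len(item) >= 2 and item.startswith("'") and item.endswith("'"):
        return item[1:-1] in text

    # Standard BodyItem: must not have a word character immediately before
    # or after it (assumption: "further characters" means \w characters).
    pattern = r"(?<!\w)" + re.escape(item) + r"(?!\w)"
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    samples = [
        ("core", "the core of the offer"),      # standard item, whole word
        ("core", "they scored twice"),          # standard item, inside a word
        ("'core'", "they scored twice"),        # literal item, inside a word
        (r"regex:(?<!schemas-microsoft)-com(?!\w)", "urn:schemas-microsoft-com:office"),
        (r"regex:(?<!schemas-microsoft)-com(?!\w)", "visit cheap-com today"),
    ]
    for item, text in samples:
        print(f"{item!r} in {text!r}: {body_item_matches(item, text)}")
```

Under these assumptions, the standard item core matches the first sample but not scored, the literal item 'core' matches scored, and the example Regex skips the schemas-microsoft-com text while matching cheap-com.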
BodyItems do tend to build up over time, so it is worth using the analysis tools here to make sure that your list is still providing value rather than just slowing your system down!
|