It is utterly impossible to do this job completely. However, there are many things you can do to nail most spam.
However, there are some pretty serious downsides:
To sum up, follow this algorithm:
(Of course, that's in addition to any relay rejection you may want to do, but that is beyond the scope of this document. Consider that part of "process as normal", which should include "if it's not to your site, or any site with which you have explicit forwarding agreements, bounce it".)
Then, inspect what's been sidetracked ASAP. When you find something that is spam but has something distinctive about it, add that something to the "definite spam" filters; when you find something that is not spam but has something distinctive about it, add that something to the "smells like spam but isn't" filter. Also ask your users to report to you any spam that slips through intact, so you can tune your "probable spam" filter too.
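The three filters named above can be sketched roughly as follows. This is only a minimal illustration in Python, not the setup described in this document: the patterns, the `classify` function, and the order in which the filters are checked are all assumptions made for the example. Real patterns would come from whatever you find when inspecting the sidetracked mail.

```python
import re

# Hypothetical example patterns; replace with whatever is distinctive
# in the mail you actually inspect.
DEFINITE_SPAM = [re.compile(r"make money fast", re.I)]
SMELLS_BUT_ISNT = [re.compile(r"^List-Id:.*fidonet", re.I | re.M)]
PROBABLE_SPAM = [re.compile(r"^Subject:.*!!!", re.I | re.M)]


def classify(raw_message):
    """Return 'reject', 'sidetrack', or 'deliver' for a raw message string.

    The check order (definite spam, then the whitelist, then probable
    spam) is an assumption of this sketch.
    """
    if any(p.search(raw_message) for p in DEFINITE_SPAM):
        return "reject"
    if any(p.search(raw_message) for p in SMELLS_BUT_ISNT):
        return "deliver"      # looks suspicious, but known to be legitimate
    if any(p.search(raw_message) for p in PROBABLE_SPAM):
        return "sidetrack"    # hold for a human to inspect
    return "deliver"          # process as normal
```

Tuning then amounts to appending a new compiled pattern to the appropriate list each time the sidetracked pile, or a user report, turns up something distinctive.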
Now, what software do you use for all this? On a standard Unix setup, most of it can be done with sendmail. I have heard that it can also be done with qmail. I've even heard that these can reject definitely malformed crap and other spam before it even gets into your system. However, I wouldn't know; I haven't done that kind of setup yet. My system is a Fidonet BBS, run under OS/2, receiving the email in Fidonet packet format, most of it bound for other local Fidonet BBSes. (My system is the email router for the Washington DC area Fidonet.) I currently use several large NetMgr scripts to do it, but firstly, NetMgr has Y2K problems, and secondly, and sadly, it is no longer supported. The operator of the previous local Fidonet/Internet gateway, which ran on a UUCP feed (as many such gateways do), cobbled together his own filtering software in Perl under OS/2, and it worked quite well.
Well, that's the technical side. Then there's the BOFHly side....