(Originally posted at Anton Chuvakin’s blog at https://medium.com/anton-on-security/retaining-logs-for-a-year-boring-or-useful-70ea21fa3dda)
Let’s start with the boring subject of log retention … and then evolve to some exciting topics like perhaps hunting (promise!). Our old friend PCI DSS reminds us that we need to keep security logs for one year. Now, the same friend is mum about using logs beyond near real-time threat detection. People pick the “write only” media of their choice, show it to their PCI assessor (from a distance, natch) and they are good to go.
When clients asked me about log retention (typically as “for how long should we keep our logs?”), they often implied a very passive value from log retention. This could have been an explicit compliance requirement, or just some fear that an auditor would check on them. On the more extreme end, an occasional misinterpretation of the US HIPAA law (well, my experience suggests it is one, but naturally I am not a lawyer and such) suggests a 7-year retention period for some audit logs.
Now, such passive value often implies low value, in fact. Would you pay a lot to save logs knowing that nobody will ever look at them? Most people would say “no” — the concept of “compliance excellence” is nonsensical after all. It is OK to treat compliance as a checkbox, because … duh … it IS one! :-)
Let’s ask a similar, but subtly different question: for how long are the logs typically useful?
My logic for justifying a log retention period would stem from a typical breach discovery timeline. Depending on which data breach report you read, you may find that the typical incident discovery timeline averages 100 to 200 days. Essentially, this does confirm the logic for retaining logs for one year, because you might actually need the logs to investigate an incident that occurred 200 days ago. Otherwise, you’d find yourself in a situation described here (albeit with packets), where you pay for log retention, but never get to benefit from it. [The question of whether you would be able to actually use 1-year-old logs without the necessary context is left for future discussion, BTW]
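To make the arithmetic concrete, here is a tiny sketch in Python. The 100 and 200 day discovery figures come from the paragraph above; the candidate retention periods (90, 180, 365 days) and the function name are my own assumptions, picked purely for illustration.

```python
def retention_covers(retention_days: int, discovery_lag_days: int) -> bool:
    """True if logs from the day of the intrusion still exist on the day it is discovered."""
    return retention_days >= discovery_lag_days

# Discovery lags taken from the "100 to 200 days" range quoted in the text.
discovery_lags = (100, 200)

# Hypothetical retention policies to compare (illustrative values only).
candidate_retentions = (90, 180, 365)

coverage = {
    retention: [retention_covers(retention, lag) for lag in discovery_lags]
    for retention in candidate_retentions
}

for retention, results in coverage.items():
    print(f"{retention:>3} days retention covers lags {discovery_lags}: {results}")
```

Of the three made-up policies, only the one-year one still has the logs at both ends of the discovery range, which is the whole argument in one dictionary.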
Is this fun yet? Frankly no. Incident response use of log data is as old as the trees … well…logs.
So, finally, let’s try to ask the question differently: “for how long can logs be useful and actively used if you don’t have to pay for retention?” If asked like this, and perhaps broadened from merely logs to, say, logs, EDR and network traffic data, the answers start looking different. Detections off 6-month-old data are not that rare if you look at the top-tier threat actors, for example. You may literally find intrusion evidence in last year’s data given some new bit of knowledge that you received today. A hunting clue you receive today may be the thread that you can pull to find that proverbial APT in your environment.
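Here is roughly what “pulling that thread” could look like, as a minimal Python sketch and not a real hunting tool. Everything in it is an assumption made up for illustration: the event fields (`timestamp`, `host`, `dest`), the sample hostnames and domains, and the `retro_hunt` function itself. A real environment would run the equivalent query in its SIEM or data lake.

```python
from datetime import datetime, timedelta

def retro_hunt(events, indicators, now, retention_days=365):
    """Return retained events whose destination matches a newly received indicator."""
    cutoff = now - timedelta(days=retention_days)
    hits = []
    for event in events:
        if event["timestamp"] < cutoff:
            continue  # older than the retention window, so this data is already gone
        if event["dest"] in indicators:
            hits.append(event)
    return hits

# Hypothetical retained events (field names and values are invented).
now = datetime(2024, 6, 1)
events = [
    {"timestamp": now - timedelta(days=200), "host": "ws-17", "dest": "bad.example.net"},
    {"timestamp": now - timedelta(days=3), "host": "ws-02", "dest": "intranet.example.org"},
    {"timestamp": now - timedelta(days=400), "host": "ws-09", "dest": "bad.example.net"},
]

# A hunting clue that arrived today (also invented).
new_indicators = {"bad.example.net"}

year_hits = retro_hunt(events, new_indicators, now, retention_days=365)
short_hits = retro_hunt(events, new_indicators, now, retention_days=90)
print(len(year_hits), len(short_hits))
```

With a year of data, today’s clue surfaces the 200-day-old connection from `ws-17`; with only 90 days retained, the same clue finds nothing, and you would never know.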
So. Conclusion: 1-year log retention is both a boring compliance requirement and a key resource for detecting top-tier threats.