Windows Event ID 4649 “A replay attack was detected”: Oh really? Are we under ATTACK? Should we do Incident Response?

Iveco
Mar 18, 2020


  • gathers all publicly available knowledge about Event ID 4649
  • is it a low-noise or a high-noise signal?
  • is it a cyber-attack by hackers, or just badly programmed or crashed software causing this event?
  • key question: should your SOC/CSIRT alert on it, yes or no?

As a blue teamer you might have heard of Samir (@SBousseaden), who is known for great insights into Windows security events, how they are created and processed, and how to use them to detect attacks, which helps us threat hunters put the pieces together. (There are of course other people worth following, such as Florian Roth of SIGMA.)

When I read his tweet about Event 4649, I could not find much public information about it either. So I collected what I could find, analyzed the event in my own environment and drew my own conclusion.

Tweet suggesting putting Event ID 4649 on detection for alerts ( Source: https://twitter.com/SBousseaden/status/1238151329220001792 )

I tried to answer the following question for myself:

Should I create an alert for it? All we know about this event so far, from the original Microsoft event description, is:

This event generates on domain controllers when KRB_AP_ERR_REPEAT Kerberos response was sent to the client.

Domain controllers cache information from recently received tickets. If the server name, client name, time, and microsecond fields from the Authenticator match recently seen entries in the cache, it will return KRB_AP_ERR_REPEAT. You can read more about this in RFC-1510. One potential cause for this is a misconfigured network device between the client and server that could send the same packet(s) repeatedly.

There is no example of this event in this document.

Microsoft gives no official guidance on how relevant this event is for detecting attacks (e.g. in a blue team's SIEM), so further information is required. Blue teamers cannot just switch a rule on and hope it triggers. Our key strength is knowledge and an understanding of the big picture: what triggers what, and why Windows behaves the way it does. Windows is like a huge network of constantly connected cells, and some of these functions, Win32 API calls, triggers and communications WILL generate an event. The more you know about Windows internals, especially on the kernel side, about how Sysmon works and about where things are heading (.NET is especially interesting at the moment, as are Microsoft Defender ATP and huge worldwide telemetry data), the more you see everything as light bulbs hanging around your company network.

You can set up hundreds of these alerts to spot attacks early. But you always need to understand the little pieces, so that you can conclude an incident and draw a kill chain (MITRE-wise) to catch the really good bad guys. An attacker can never hide from all the lights (alerts/detections), and behaviour-based security rules are making life more and more troublesome for them.

As blue teamers it is important that we challenge ourselves and understand what we are looking for, even if the piece is as small as this one event, in order to draw a conclusion about the big picture. Every event, however small, every date and every piece of information you can correlate, ingest, enrich or otherwise make use of is good for us. There are hundreds of thousands of them, and new products like Microsoft Defender ATP or EDRs generate more and more data. So we need to keep asking ourselves: how can we detect something, and what data do we have available? Never just copy and paste something you do not understand 100%. You need to build your own big system of detections and traps and correlate everything: build your own “shield”, customized for your environment.

Therefore it is important to understand what KRB_AP_ERR_REPEAT means, so we have to read the Kerberos RFC 4120 at https://tools.ietf.org/html/rfc4120.

What does KRB_AP_ERR_REPEAT mean?
The replay cache will store at least the server name, along with the
client name, time, and microsecond fields from the recently-seen
authenticators, and if a matching tuple is found, the
KRB_AP_ERR_REPEAT error is returned. Note that the rejection here is
restricted to authenticators from the same principal to the same
server. Other client principals communicating with the same server
principal should not have their authenticators rejected if the time
and microsecond fields happen to match some other client’s
authenticator.

Also see:
https://k5wiki.kerberos.org/wiki/Projects/replay_cache_collision_avoidance
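
The replay-cache behaviour quoted from the RFC can be sketched in a few lines of Python. This is only a minimal illustration and not how Windows or any real Kerberos library implements it: the class and field names (`ReplayCache`, `Authenticator`, `max_age_seconds`) are made up for this example, and the 300-second window is an assumption based on the usual 5-minute clock skew tolerance.

```python
from dataclasses import dataclass

KRB_AP_ERR_REPEAT = 34  # 0x22, error code defined in RFC 4120


@dataclass(frozen=True)
class Authenticator:
    server: str  # service principal the request is addressed to
    client: str  # client principal that built the authenticator
    ctime: int   # client timestamp, whole seconds
    cusec: int   # microsecond part of the client timestamp


class ReplayCache:
    """Hypothetical in-memory replay cache keyed on the RFC 4120 tuple."""

    def __init__(self, max_age_seconds=300):
        self.max_age_seconds = max_age_seconds  # assumed window, not from the RFC text above
        self._seen = {}  # tuple -> arrival time in seconds

    def check(self, auth, now):
        """Return 0 if the authenticator is new, KRB_AP_ERR_REPEAT if it was seen recently."""
        # Forget entries that are older than the allowed window.
        self._seen = {
            key: seen_at
            for key, seen_at in self._seen.items()
            if now - seen_at <= self.max_age_seconds
        }
        key = (auth.server, auth.client, auth.ctime, auth.cusec)
        if key in self._seen:
            return KRB_AP_ERR_REPEAT
        self._seen[key] = now
        return 0
```

Because the client name is part of the key, a second client that happens to use the same time and microsecond values for the same server is not rejected, which matches the note in the RFC excerpt.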

To see how software implements this part of the RFC, I searched for how some Java software handles the Kerberos KRB_AP_ERR_REPEAT error:

- They either ignore it during authentication, or they check whether the time and hash are the same and throw an exception during authentication (example: https://www.programcreek.com/java-api-examples/?class=sun.security.krb5.internal.Krb5&method=KRB_AP_ERR_REPEAT example2: https://www.freesoft.org/CIE/RFC/1510/123.htm)

- Some Java SSO Kerberos libraries did not check KRB_AP_ERR_REPEAT correctly. It looks like this error is mostly handled together with KRB_AP_ERR_SKEW (stale tokens),
e.g. to detect time-skew attacks during Kerberos authentication.

Additionally, I looked for practical cases where people had already been confronted with the problem, and tried to learn from their solutions.

Possible root causes for the event being generated (various forum sources):

→ If the realm, Application Server name, along with the Client name, time and microsecond fields from the Kerberos Authenticator (in the AP Request) match any recently-seen such tuples, the KRB_AP_ERR_REPEAT error is returned.

→ May happen on caching/threading issues, when the authentication server has already seen the same Kerberos tuple “milliseconds ago”. This could happen on devices switching between networks, where the connection is briefly dropped and re-established and some software requests Kerberos authentication again.

→ I would check the routing and timing on the hosts involved. I have not seen this particular event associated with known attacks. This event will be triggered when the exact same request is seen twice. If it is not generated consistently, it could be an anomaly caused by the network configuration.

→ If the local (server) time and the client time in the authenticator differ by more than the allowable clock skew (e.g., 5 minutes), the KRB_AP_ERR_SKEW error is returned. If the server name, along with the client name, time and microsecond fields from the Authenticator match any recently-seen such tuples, the KRB_AP_ERR_REPEAT error is returned. (Note that the rejection here is restricted to authenticators from the same principal to the same server.) The RFC explicitly mentions microseconds. However, we observed that the time in the authenticator has a much lower resolution (in the range of milliseconds), since we get the KRB_AP_ERR_REPEAT error when we request two service tickets one right after the other. One possible solution is to make the thread requesting the service ticket wait at least one millisecond before requesting the next ticket, so that the authenticator's timestamp differs from the previous one. However, this only works if a client has a single process communicating with the same server. If, for instance, two separate processes request tickets and send requests to the same service, chances are high that two ticket requests are made in the same millisecond and one of the processes gets the KRB_AP_ERR_REPEAT error.

→ There seem to be multiple confirmed cases out there of “w3wp.exe” (the IIS worker process) appearing as process_name in Event ID 4649,
probably caused by crashes or multi-threading issues in its authentication handling
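
The millisecond-resolution observation from the forum post above can be reproduced with a small Python sketch. The helper names (`authenticator_time`, `next_unique_time`) and the timestamps are invented for illustration. Bumping the microsecond field by one is an alternative to the "wait one millisecond" workaround and is an assumption of this sketch, not something the forum post or Windows is confirmed to do.

```python
def authenticator_time(epoch_us, resolution_us):
    """Truncate a microsecond timestamp to the clock's real resolution.

    Returns the (ctime, cusec) pair that would end up in the authenticator.
    """
    truncated = epoch_us - (epoch_us % resolution_us)
    return divmod(truncated, 1_000_000)


def next_unique_time(epoch_us, resolution_us, last):
    """Return a (ctime, cusec) pair that is strictly later than `last`.

    If the truncated clock value collides with the previous authenticator,
    move one microsecond past it instead of reusing the same tuple.
    """
    candidate = authenticator_time(epoch_us, resolution_us)
    if last is not None and candidate <= last:
        total_us = last[0] * 1_000_000 + last[1] + 1
        candidate = divmod(total_us, 1_000_000)
    return candidate


# Two requests made 300 microseconds apart, inside the same millisecond.
first_request_us = 1_584_518_400_000_250
second_request_us = first_request_us + 300

# With a 1 ms clock both authenticators carry the same time tuple.
collision = (
    authenticator_time(first_request_us, 1000)
    == authenticator_time(second_request_us, 1000)
)
```

As the forum post points out, any such per-process bookkeeping only helps while a single process talks to the server. Two independent processes do not share `last` and can still produce the same tuple.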

Software Solutions are mostly:
KRB_AP_ERR_REPEAT: the initiator should build a new AP-REQ
KRB_AP_ERR_SKEW: the initiator should build a new AP-REQ with time corrected for the offset between the initiator’s and acceptor’s clocks
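
A client-side retry loop following these two rules might look like the Python sketch below. It is an illustration under assumptions: `send_ap_req` and `client_clock` are hypothetical callbacks standing in for a real Kerberos stack, and the only facts taken from the protocol are the two error codes and the idea that the error reply carries the server's time.

```python
KRB_AP_ERR_REPEAT = 34  # 0x22
KRB_AP_ERR_SKEW = 37    # 0x25


def authenticate(send_ap_req, client_clock, max_attempts=3):
    """Try to authenticate, rebuilding the AP-REQ on REPEAT and SKEW errors.

    send_ap_req(timestamp) -> (error_code, server_time) is a hypothetical
    transport callback; error_code 0 means success.
    client_clock() -> float returns the local time in seconds.
    Returns the number of attempts that were needed.
    """
    offset = 0.0  # correction applied to the local clock after a SKEW error
    for attempt in range(1, max_attempts + 1):
        timestamp = client_clock() + offset  # fresh timestamp for every AP-REQ
        error, server_time = send_ap_req(timestamp)
        if error == 0:
            return attempt
        if error == KRB_AP_ERR_SKEW:
            # Correct for the offset between initiator and acceptor clocks.
            offset = server_time - client_clock()
        elif error != KRB_AP_ERR_REPEAT:
            raise RuntimeError(f"unrecoverable Kerberos error {error}")
        # On KRB_AP_ERR_REPEAT simply loop: the next AP-REQ gets a new timestamp.
    raise RuntimeError("authentication failed after retries")
```

The loop is bounded on purpose: software that retries forever with an identical authenticator would itself generate a stream of 4649 events.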

Nice to know:
Hex Code: 0x22 → KRB_AP_ERR_REPEAT
This error indicates that a specific authenticator showed up twice — the KDC has detected that this session ticket duplicates one that it has already received.

Hex Code: 0x25 → KRB_AP_ERR_SKEW
This error is logged if a client computer sends a timestamp that differs from the server's timestamp by more than the number of minutes set in the “Maximum tolerance for computer clock synchronization” setting of the Kerberos policy.

Now we can almost draw a conclusion about WHY the event is generated (the same Kerberos authenticator tuple was seen twice). But we still can't say for sure what the reason is (attack or not?).

We take a look at the event template to understand what it could contain:

Get the raw event template format via PowerShell:

Load the ETW provider:

$provider = Get-WinEvent -ListProvider Microsoft-Windows-Security-Auditing

Show all events of the provider where the event ID equals 4649, select the desired fields and format the output as a list:

$provider.Events | Where-Object { $_.Id -eq 4649 } | Select-Object Id, Template, Description | Format-List
Raw EID 4649 template output via PowerShell (German Windows)

ETW: Microsoft-Windows-Security-Auditing
Channel: Security
Event ID: 4649
Event Description: A replay attack was detected.
Advanced Audit Security Policy: Audit Other Logon/Logoff Events
(https://docs.microsoft.com/en-US/windows/security/threat-protection/auditing/audit-other-logonlogoff-events)
Operating Systems: Windows 2008 R2 and 7, Windows 2012 R2 and 8.1, Windows 10, Windows Server 2016, Windows Server 2019
Raw Template Description in OSSEM:
https://github.com/hunters-forge/OSSEM/blob/master/data_dictionaries/windows/etw-providers/Microsoft-Windows-Security-Auditing/events/event-4649.md
Event ID 4649 Template information:
https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/event.aspx?eventID=4649
Original Microsoft event description:
https://docs.microsoft.com/en-US/windows/security/threat-protection/auditing/event-4649

Final conclusion for blueteamers:

Analyzing the events generated in my own company and researching all known circumstances of this event shows that most occurrences seem to be caused by software issues (e.g. a broken self-developed Kerberos stack, multi-threading/caching issues, or crashes of the software; w3wp.exe from Microsoft caused this once) or by network hand-overs when switching networks (e.g. a failover when WLAN goes off). You will mostly be threat hunting software bugs or network issues, not real hacking attacks.

Also worth mentioning is the reliability of this event as a low-noise signal in a healthy network. In a huge environment I only got about 5–10 events of this ID per domain per year, so it seems to be really rare. I had higher event counts on domains with Kerberos/NTLM/GPO policy issues, so the event may also indicate troublesome network/policy/caching issues, or the use of badly programmed software with a poor Kerberos implementation.

It seems worthwhile to put Event 4649 on alert, even though there is currently (03–18–2020) no known attack vector that causes this event. Since EID 4649 does not create much noise on healthy networks, creating an alert on it is recommended for blue teamers: it prepares you for anything upcoming and may additionally point to emerging network/software issues.
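
To separate the rare one-off event from a domain with a systematic problem, the yearly count per domain can be compared against a baseline. The Python sketch below assumes events have already been exported into dictionaries. The keys `event_id`, `domain` and `year` are hypothetical names for this example, and the default baseline of 10 is simply the upper end of the 5–10 events per year I observed, not a general recommendation.

```python
from collections import Counter


def domains_over_baseline(events, baseline_per_year=10):
    """Return sorted (domain, year) pairs with more 4649 events than the baseline.

    `events` is an iterable of dicts with the hypothetical keys
    'event_id', 'domain' and 'year'.
    """
    counts = Counter(
        (event["domain"], event["year"])
        for event in events
        if event["event_id"] == 4649
    )
    return sorted(key for key, count in counts.items() if count > baseline_per_year)
```

Every single 4649 event can still raise an alert. A domain that shows up in this list is more likely to have Kerberos, policy or software issues worth a closer look.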

Added on 03–21–2020: I was asked how best to analyze the issue. I have copied my answer here:

1/2 Events that reveal no ProcessName leave you guessing blindly. First, always make sure the local time of the remote host is fine, i.e. that the Kerberos clock skew limit (~5 min, policy ‘Maximum tolerance for computer clock synchronization’) is not violated →
https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/maximum-tolerance-for-computer-clock-synchronization

2/2 For events with a ProcessName, it depends on the process: further logs or even crash events like EID 1000 in the Application channel might be available, which can be used to analyze whether the same Kerberos tuple was sent maliciously or because of other issues (software, network).
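
The two cases can be written down as a tiny triage helper. This Python sketch assumes the event has already been parsed into a dictionary with a `ProcessName` key and that an empty value or "-" means no process was recorded. The step names it returns are invented labels for this example, not part of any tool.

```python
def triage_4649(event):
    """Return ordered follow-up steps for one parsed Event ID 4649 record.

    `event` is a dict of event fields; only 'ProcessName' is inspected.
    """
    process_name = (event.get("ProcessName") or "").strip()

    # Always rule out clock drift first (Kerberos tolerance is about 5 minutes).
    steps = ["check_clock_skew"]

    if process_name and process_name != "-":
        # A named process gives something concrete to investigate.
        steps.append("review_process_logs")
        steps.append("check_application_eid_1000")  # crash events around the same time
    else:
        # Without a process name there is little evidence to work with.
        steps.append("no_process_name_limited_evidence")
    return steps
```

For an event naming w3wp.exe, for example, the helper points to the IIS logs and to crash events in the Application channel.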

A story written by xknow_infosec (@xknow_infosec) — 18.03.2020
https://twitter.com/xknow_infosec
