Why (only) EDRs won’t save you

7 May 2025 | Prepared by: Elliot Xu, Offensive Consultant

In this blog post, we share some of the core concepts behind Hermit Crab, a fully customised red team tool we’ve developed. The aim is not to glorify evasion but to reinforce a critical point: Endpoint Detection and Response (EDR) tools are not silver bullets. While they offer powerful behavioural analysis and protection capabilities, adversaries continually evolve, finding new ways to bypass even the most mature solutions. True security requires a layered, adaptive approach. The journey we describe highlights where EDRs fall short, and how determined attackers slip through.

Make peace with EDRs

In March 2024, to demonstrate greater impact in red teaming and internal engagements against EDR-protected environments, our team began developing fully customised malware codenamed Hermit Crab. Each instance is referred to as an "agent." So far, it has proven effective against leading EDR solutions.

Through extensive field testing, various modules have evolved to better navigate enterprise environments—bypassing antivirus, anti-malware tools, and firewalls.

About EDRs

EDRs are a staple in modern enterprise defence, offering behavioural detection, real-time response, and forensic capabilities across endpoints. But as we all know, their detection logic hinges on signatures, heuristics, and behavioural baselines—making them susceptible to tailored implants, novel TTPs, and stealthy execution chains. The very mechanisms that make EDRs effective can be turned against them with enough creativity and understanding of their thresholds.

Keep it simple

When we started building this tool, we set a simple goal: a piece of Windows malware that can establish a connection to our TeamServer, receive commands in a stealthy way, execute them and return the results. Our aim was to develop something entirely new, ensuring a high level of evasiveness.
We believe that reinventing the wheel is necessary here, so we built everything from the ground up, starting from the network layer.

Unlearn what’s learned

For the network layer, we wanted complete control. Instead of relying on HTTP API libraries, we opted for raw TCP socket connections. This approach in later stages introduced some overhead (which we’ll discuss shortly), but it gave us full control down to the single byte.
Our first milestone was reached quickly: we completed the prototype, and the TeamServer was fully web-based. With a VPN connection, agents could be managed from anywhere.
However, a major challenge soon emerged. The agent was consistently flagged as a malicious file by CrowdStrike’s AI-based sensors. Despite numerous attempts to modify its structure, obfuscate strings, and inject junk code, we still couldn’t get over this hurdle. At this point, it was clear that a new approach was needed.

Hermit Crab - go find a good shell

Why is the file malicious? Based on what criteria? Come to think of it, if an unknown executable connects to some server immediately after launching, using APIs that offer no value to the user, it’s safer for the EDR to report it as malicious than to let it run and cause unforeseeable trouble. So we need a good “shell” as a disguise.

In this case, we want something that offers good legitimacy, has a genuine need for network communication, and has features in which we can easily hide malicious assets. After some thought, we picked a game as the “shell” of our agent. Hermit Crab can have fun now.

Games, which meet all the requirements mentioned above, can be an extremely good disguise for malware. And it’s Microsoft we’re talking about. Gaming is what Windows is “best for” (personal opinion). Someone might ask, what about Windows servers? Games have gaming servers, right?

We acted quickly: the agent evolved into a game and, all of a sudden, the alerts stopped. It has stayed clean ever since.

Is that all?

As more real-world engagements took place, new challenges emerged.

Prevent the unwanted

At this stage, we noticed unwanted agent connections from Microsoft IP ranges, indicating that our malware had been uploaded to sandboxes for analysis. To mitigate this, we implemented Guardrails using key identifiers to determine whether the agent’s core functions should execute. We used public IP, username, hostname, and domain name as matching criteria, ensuring the agent runs only on the intended host. If the validation fails, the agent deletes itself to prevent analysis.

Additionally, we implemented a kill date for the agent to automatically terminate and erase itself after a predefined period, leaving no trace behind.

A bonus tip for implementing VM and sandbox detection: Traditional virtual environment detection has become less effective due to cloud-based virtual hosts and kernel-sharing features introduced by containerisation solutions. To implement effective anti-analysis mechanisms, focus on detecting specific cloud environments, sandboxes, and analysis tools rather than relying on generic VM detection logic. For example, checking for accessible cloud metadata can help determine if a host is running in a cloud environment.

Modularisation

Although the agent was no longer detected, having all tasks executed in one place worried us. We had to structure the agent according to the risk of the task it performs. The logic is that getting a username with the Win32 API is, of course, not as dangerous as executing the Rubeus .NET assembly. So we need a hierarchical solution: the more dangerous the task, the further away it should be from the main agent. Modularisation was inevitable.

Main agent

The main agent should be responsible for most casual tasks like the ones we’ve mentioned, such as getting a username or the current working directory. These APIs are commonly used by other programs on the system, so EDRs rarely monitor them. After all, an EDR vendor cannot afford to slow the machine down to the point of losing a customer. So the main agent is simple and “lightweight”, which means it only calls “friendly” APIs and does not carry any shellcode. That way, we keep the main agent lying as low as possible, providing good stealth and stability.

The submodule

With a firm anchor in the target, things can escalate a little. Now we can start thinking about more dangerous tasks, such as invoking cmd or PowerShell, or creating a process. How do we avoid doing all of that in the main agent itself? The answer is “fork and run”. It is a familiar term: in Cobalt Strike, fork and run can be configured to target processes like rundll32.exe, svchost.exe and runtimebroker.exe, which are often abused in process injection operations. Nowadays these processes are heavily monitored, so an abnormal parent-child chain is highly likely to get the agent killed. What comes to the rescue?

Hey, our agent is clean, so why don’t we fork ourselves? If you’re an EDR and you see an already “benign” process fork another “benign” copy of itself, you would think all is well. So, this is it. For riskier tasks, we decided to separate them from the main agent and put them in its forked version: the submodule.

Submodules have some other key features. Because there is still a clear parent-child chain, the submodule must finish as quickly as possible once it enters execution mode so that it does not sabotage the main agent. Time-consuming tasks are definitely not recommended there (hold your horses on things like winPEAS). The submodule also deserves the most attention when it comes to obfuscation and evasion techniques.

Along with those techniques, the submodule should also have a way to extend its capabilities through things like PE injection, hosting arbitrary .NET assemblies, and ideally, the ability to work with DLLs.
Other things, like how to pass arguments from the main agent down to the submodule and how the submodule writes the result back to the main agent, are implementation details that each team can approach differently.

Delegation

This part, which happens furthest away from the main agent, provides more OPSEC for the engagement. We call it delegation because we want to delegate the most dangerous tasks, such as persistence (creating scheduled tasks, modifying registry keys, etc.), loggers and credential harvesting, to another legitimate Windows process. Testing shows that even if we get caught, the IoC will most likely trace back only to the victim process we targeted (as was the case with CrowdStrike), not to the agent. All we need to do is find a safe way to inject whatever we need into the remote process, be it shellcode or a DLL.

With this three-tier execution architecture, we can potentially get the most out of an engagement and ensure the agent won’t die early. Long live the Crab!

Evolution

After restructuring the agent, we achieved another level of stealth. In the first half of 2024, with process hollowing integrated, we could use the agent to dump LSASS on hosts protected by CrowdStrike when PPL was not enabled.

Everything went well until we encountered firewalls. The problem is that most firewalls enforce HTTP compliance, so TCP socket connections won’t work. It’s time for a network layer uplift to address the issue.

The overhead

We mentioned the overhead above. With raw sockets, we must implement HTTP headers ourselves. It takes time and effort to find out which headers make a request valid in the eyes of each firewall. Our approach to verifying this against different firewalls is to make requests with minimal headers and then, if things don't work, add other headers one by one until everything clicks.

During testing, we discovered interesting results regarding some well-known firewalls. For example, for Sophos XG with default settings, we don’t even need a “Host” or “User-Agent” header to make a legitimate request.

Fortunately, the previous implementation decoupled a lot of functions, so this HTTP compliance issue was dealt with fairly easily in the agent. However, it took us a lot of time to adapt the server-side code to the new behaviour. Previously all data was handled over one persistent connection; now it is one request per data exchange: connect, send, then disconnect. It’s something to keep in mind when choosing how to implement the network layer.

A beneficial side effect of handling HTTP over raw sockets is that we ended up with the ability to use the proxy settings on the host under any security context.

What does that mean? That means we don’t need to worry about the issue where SYSTEM accounts have no proxy settings inherited. If we have a way to escalate to SYSTEM, we can just execute the agent as SYSTEM user, and that’s it. The agent will detect proxy settings automatically and connect to the TeamServer through the proxy if there is one.

Now that the Crab can come and go at will, what’s left is to think about the resilience of the whole red team infrastructure.

Infrastructure

Regarding the infrastructure, we started simple with just one cloud VM instance and iptables rules for traffic forwarding.
Normally for TeamServers, we don’t have to think too much about concurrency, load balancing, etc., because everything’s behind cloud firewalls.

With a proper setup, it’s safe enough from potential threats such as DDoS. However, we do pay attention to the ports that must be publicly facing for the victims to connect to, so we implemented authentication and soft limits to block any malicious IP addresses that poke at our server (those that fail authentication or reach a predefined connection threshold).
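As a rough illustration of the soft limit idea, the sketch below tracks failed authentications and connection rates per IP address and applies a temporary ban when either threshold is crossed. The class name `SoftLimiter`, its thresholds and its method names are all hypothetical and are not taken from our TeamServer; treat it as one possible shape under those assumptions.

```python
import time
from collections import defaultdict, deque


class SoftLimiter:
    """Hypothetical per-IP limiter: bans on repeated auth failures or too many connections."""

    def __init__(self, max_failures=3, max_connections=20, window=60.0, ban_seconds=600.0):
        self.max_failures = max_failures        # failed authentications before a ban
        self.max_connections = max_connections  # connections allowed per window
        self.window = window                    # sliding window length in seconds
        self.ban_seconds = ban_seconds          # how long a ban lasts
        self._failures = defaultdict(int)
        self._recent = defaultdict(deque)
        self._banned_until = {}

    def is_banned(self, ip, now=None):
        now = time.monotonic() if now is None else now
        until = self._banned_until.get(ip)
        if until is None:
            return False
        if now >= until:
            # Ban expired: forget the ban and the failure count.
            del self._banned_until[ip]
            self._failures.pop(ip, None)
            return False
        return True

    def record_connection(self, ip, now=None):
        """Register a new connection; return True if it should be accepted."""
        now = time.monotonic() if now is None else now
        if self.is_banned(ip, now):
            return False
        recent = self._recent[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_connections:
            self._ban(ip, now)
            return False
        return True

    def record_auth_failure(self, ip, now=None):
        now = time.monotonic() if now is None else now
        self._failures[ip] += 1
        if self._failures[ip] >= self.max_failures:
            self._ban(ip, now)

    def _ban(self, ip, now):
        self._banned_until[ip] = now + self.ban_seconds
```

A listener would call `record_connection` on accept and `record_auth_failure` whenever authentication fails. The `now` parameter exists only to make the behaviour easy to test with fixed timestamps.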

Over time, for flexibility and better OPSEC, we incorporated Nginx and Cloudflare. The former reverse proxies traffic and the latter masks our true IP addresses. Nginx’s versatility enables us to proxy traffic for many different protocols, including HTTP/HTTPS, TCP, CGI, WebSockets and streams, so we can listen on some internal ports and let Nginx do the magic.

Some pieces of advice for configuring Nginx and Cloudflare:

No need to cache anything.
Use Origin Server Certificate to encrypt everything end-to-end.
Use Full (strict) encryption mode.
Apply firewall rules if needed.
Some optimisations are likely to alter the data that’s being sent. We recommend disabling all of them in both Cloudflare and Nginx.

For example, in Nginx, disable gzip and brotli compression. This may vary between situations, so deploy at your own discretion.

And we are good to go from here.

Miscellaneous

Before we wrap things up, here are some features that make a TeamServer much nicer to work with.

Collaboration

Imagine one teammate runs into issues with some command and invites another member of the team to observe the execution and work on the problem together. A streaming-like feature supports this kind of teamwork. Anyone can observe a specific agent when it connects back to the TeamServer, and every command executed or queued is “streamed” to all participants, so everyone can see what the potential problem is, which makes debugging much easier.
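One simple way to picture the streaming feature is a publish/subscribe registry keyed by agent, as in the sketch below. The names `AgentStream`, `observe` and `publish` are hypothetical, and a real TeamServer would more likely push events over WebSockets than call in-process callbacks; this only shows the fan-out logic.

```python
from collections import defaultdict


class AgentStream:
    """Hypothetical fan-out of agent events to every operator observing that agent."""

    def __init__(self):
        # agent id -> list of callbacks, one per observing operator session
        self._observers = defaultdict(list)

    def observe(self, agent_id, callback):
        """Register a callback that receives every event for the given agent."""
        self._observers[agent_id].append(callback)

    def publish(self, agent_id, operator, kind, text):
        """Send one event (queued command, executed command, result) to all observers."""
        event = {"agent": agent_id, "operator": operator, "kind": kind, "text": text}
        for callback in list(self._observers.get(agent_id, [])):
            callback(event)
        return event
```

For example, two operators could each call `observe("DESKTOP-VSDT23J", their_session.send)`, after which every `publish` for that agent reaches both sessions while other agents stay private to their own observers.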

Logging

A TeamServer should be able to log different information at different logging levels. Consult the documentation for your tech stack and implement at least the following logs:

Operation: including who, what, when, where. For example, 12/03/2025 13:28:22 John executed command net time on agent DESKTOP-VSDT23J. Who was the team member that’s operating, what was the command executed, when was it executed, and where (on which agent) was the command sent to. It can also include other operational information like what files were transferred, what files were downloaded, etc, to provide a detailed trace of what happened in the engagement during the specific period.

Security: including custom security-related information, such as at what time which IP address connected to which port, whether it was authenticated, and, if it was banned, the reason and the duration of the ban.

Console: other information could be sent to stdout on the console as well, for more convenient troubleshooting.
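The three log types above could be sketched with Python's standard `logging` module as follows. The logger names, the message layouts and the helper functions `make_logger`, `operation_message` and `security_message` are assumptions made for illustration, not a description of our TeamServer.

```python
import logging
import sys

# Timestamp format chosen to match the example entry above (day/month/year).
FORMAT = "%(asctime)s %(name)s %(levelname)s %(message)s"
DATEFMT = "%d/%m/%Y %H:%M:%S"


def make_logger(name, handler, level=logging.INFO):
    """Create a named logger with its own handler so log types stay separated."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep operation/security entries out of the root logger
    handler.setFormatter(logging.Formatter(FORMAT, datefmt=DATEFMT))
    logger.addHandler(handler)
    return logger


def operation_message(operator, command, agent):
    """Who did what, and where; the 'when' comes from the log timestamp."""
    return f"{operator} executed command {command} on agent {agent}"


def security_message(ip, port, authenticated, ban_reason=None, ban_seconds=0):
    """Connection details plus ban information when a ban was applied."""
    parts = [
        f"{ip} connected to port {port}",
        "authenticated" if authenticated else "not authenticated",
    ]
    if ban_reason:
        parts.append(f"banned for {ban_seconds}s ({ban_reason})")
    return ", ".join(parts)


if __name__ == "__main__":
    # In practice the operation and security logs would use logging.FileHandler;
    # stdout is used here so the sketch leaves no files behind.
    operation = make_logger("operation", logging.StreamHandler(sys.stdout))
    security = make_logger("security", logging.StreamHandler(sys.stdout))
    console = make_logger("console", logging.StreamHandler(sys.stdout), logging.DEBUG)

    operation.info(operation_message("John", "net time", "DESKTOP-VSDT23J"))
    security.warning(security_message("203.0.113.5", 443, False, "auth failures", 600))
    console.debug("listener restarted")
```

Keeping one logger per log type means each can be routed to its own file and given its own level without touching the others.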

History

Every command and result should be properly saved for later reference. It’s recommended to use a database to store these so it’s persistent and easy to manage.
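A minimal sketch of such a store, assuming SQLite and a single hypothetical `history` table, might look like this. The schema, column names and function names are illustrative choices rather than our actual design.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per command, with the result filled in later.
SCHEMA = """
CREATE TABLE IF NOT EXISTS history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    executed_at TEXT NOT NULL,
    operator TEXT NOT NULL,
    agent TEXT NOT NULL,
    command TEXT NOT NULL,
    result TEXT
)
"""


def open_history(path=":memory:"):
    """Open (or create) the history database; pass a file path for persistence."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_command(conn, operator, agent, command, executed_at=None):
    """Record a command when it is sent and return its row id."""
    executed_at = executed_at or datetime.now(timezone.utc).isoformat(timespec="seconds")
    cur = conn.execute(
        "INSERT INTO history (executed_at, operator, agent, command) VALUES (?, ?, ?, ?)",
        (executed_at, operator, agent, command),
    )
    conn.commit()
    return cur.lastrowid


def save_result(conn, row_id, result):
    """Attach the result once the agent reports back."""
    conn.execute("UPDATE history SET result = ? WHERE id = ?", (result, row_id))
    conn.commit()
```

Storing the command first and the result later fits the request-per-exchange model, where a result can arrive well after the command was queued.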

Searching/Filtering

Searching and filtering among the saved history is very important as well, both for quick reviewing and reporting.
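Filtering can be as simple as building a parameterised query from whichever criteria the operator supplies. The sketch below assumes a hypothetical SQLite table named `history` with `executed_at`, `operator`, `agent`, `command` and `result` columns; adjust the names to whatever your own store uses.

```python
import sqlite3


def build_history_query(operator=None, agent=None, text=None, since=None, until=None):
    """Return (sql, params) for the given filters; omitted filters are ignored."""
    clauses, params = [], []
    if operator:
        clauses.append("operator = ?")
        params.append(operator)
    if agent:
        clauses.append("agent = ?")
        params.append(agent)
    if text:
        # Substring match against both the command and its output.
        clauses.append("(command LIKE ? OR result LIKE ?)")
        params.extend([f"%{text}%", f"%{text}%"])
    if since:
        clauses.append("executed_at >= ?")
        params.append(since)
    if until:
        clauses.append("executed_at <= ?")
        params.append(until)
    sql = "SELECT executed_at, operator, agent, command, result FROM history"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY executed_at"
    return sql, params


def search_history(conn, **filters):
    """Run the filtered query and return matching rows, oldest first."""
    sql, params = build_history_query(**filters)
    return conn.execute(sql, params).fetchall()
```

Because values are always passed as parameters, operator input never ends up inside the SQL text. The time filters rely on timestamps being stored in a sortable format such as ISO 8601.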

Notification/Alert

Send a daily or monthly report to the admin by email, summarising whatever information is needed, and keep track of any abnormalities. You can also send messages when an agent connects, when it disconnects, and for similar events.
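As a small sketch of the reporting side, the code below counts events by kind and wraps the summary in an email message using Python's standard library. The event structure (a dict with a `kind` key), the function names and the addresses are hypothetical; actually sending the message is left to something like `smtplib.SMTP.send_message` with your own mail server settings.

```python
from collections import Counter
from email.message import EmailMessage


def build_summary(events):
    """Count events by kind and return one 'kind: count' line per kind."""
    counts = Counter(event["kind"] for event in events)
    return "\n".join(f"{kind}: {counts[kind]}" for kind in sorted(counts))


def build_report_email(events, sender, recipient, period="daily"):
    """Compose the report without sending it."""
    msg = EmailMessage()
    msg["Subject"] = f"TeamServer {period} report"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(build_summary(events) or "No events recorded.")
    return msg
```

Separating composition from delivery keeps the summary logic easy to test and lets the same message be reused for other channels.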

Conclusion

Building custom red team tools isn’t as difficult as it sounds. More importantly, customisation is the key to evading EDR detection. A unique tool stands a better chance of bypassing behavioural analysis.

This reinforces a crucial point: EDRs are not the ultimate defence. While valuable, they should be part of a broader strategy. Layered security, informed users, continuous monitoring, and swift response are what truly raise the bar against intrusions.