They mention that they do not have access to the threat actor’s obfuscating compiler itself. But reading the analysis, it occurs to me that since they have released a purpose-built deobfuscator, they could certainly develop a ScatterBrain-like compiler. I wonder whether doing so might enable useful heuristics that could reveal the quiet existence of the ScatterBrain compiler in some sample, archive, darknet tools repo, compromised host, torrent, etc.
Just as they have supplied IOCs, perhaps they could provide signatures or heuristic rules that scanners in various places could ingest and apply to discover a latent copy of the compiler itself. That would be useful in its own right, and also for the breadcrumbs and inferences that could be drawn from where and when it was spotted, if it ever was.
Judging by this analysis, one simple approach would be to search for control-flow desynchronization. On ARM this is easier, since instructions are aligned on 2- or 4-byte boundaries, but the variable-length instruction format of x86/x64 is used here to make the life of the decompiler harder.
However, you can store a map of where instructions are placed and detect cases where instructions overlap into different sequences, which should be a big red flag for an AV tool. (That said, it's not impossible to disguise instruction targets well enough that an analyser would need to be nearly Turing-complete to find even this.)
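For what it's worth, here is a minimal sketch of that overlap check. It assumes some disassembler has already followed every branch target and handed back (start address, length) pairs; the `find_overlaps` name and the input shape are made up for illustration and are not taken from the Mandiant tooling. The example bytes are the classic `EB FF C0` trick, where a short jump lands on its own second byte.

```python
from typing import Iterable, List, Tuple

# Hypothetical input shape: (start address, length in bytes) for every
# instruction a recursive-descent disassembler reached.
Instr = Tuple[int, int]


def find_overlaps(instrs: Iterable[Instr]) -> List[Tuple[int, int]]:
    """Return sorted (earlier_start, later_start) pairs of instructions that share bytes."""
    owner = {}        # byte address -> start of the first instruction covering it
    overlaps = set()
    # set() drops instructions that were simply reached twice via different paths
    for start, length in sorted(set(instrs)):
        for addr in range(start, start + length):
            prev = owner.setdefault(addr, start)
            if prev != start:
                # this byte already belongs to a different decoded instruction
                overlaps.add((prev, start))
    return sorted(overlaps)


# EB FF    jmp short -1   at 0x1000 (target is 0x1001, inside the jmp itself)
# FF C0    inc eax        at 0x1001 (reuses the FF byte of the jmp)
decoded = [(0x1000, 2), (0x1001, 2), (0x1003, 1)]
for a, b in find_overlaps(decoded):
    print(f"instruction at {a:#x} overlaps instruction at {b:#x}")
```

A real scanner would still have to resolve indirect and opaque-predicate branch targets before it could build that list, which is where the near-Turing-complete caveat bites.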
So I guess what I was suggesting is a detector for the compiled form of whatever generates the type of code you mention (lol, and possibly in need of deobfuscation itself now that I think of it, but that’s apparently a solved problem), in order to find a copy of the compiler rather than the compiler’s obfuscated malware output. But as I try to clarify that in this reply, I realize that the binary we would be trying to flag would probably be both of these things. So it occurs to me that a) your near-Turing-complete comment holds even if my original target wasn’t communicated clearly, and b) if a copy of the compiler does already exist somewhere in the wild, it may well be picked up as a submitted sample based on an IOC anyway, since I have to assume the threat actor obfuscated their compiler binary by building it with a copy of itself. :-)
The source for the de-obfuscator: https://github.com/mandiant/poisonplug-scatterbrain
This is the result when an elite attacker meets an elite analyst group.
That's some very heavy stuff.
This is very cool. Can someone help me understand the behind the scenes, what’s their strategy? Their motivations? Are they targeting specific industries or nations for a reason?
Yes, this is an interesting question. Are they just trying to hide from anti-virus signatures, or are they hiding code they perceive as valuable?
Is it correct to presume that the obfuscated samples might be hard to come by for the average interested viewer?
You can search Open Threat Exchange for files tagged with "scatterbrain" and it will give you various hashes: https://otx.alienvault.com/browse/global/indicators?q=scatte...
You can then use the hashes with platforms like VirusTotal to download some samples.
I'd be curious to see how obfuscated code produced like this fares when analyzed with Ghidra augmented with AI plugins.
Also, I'm surprised there seems to be no mention in the article of why standard decompilation techniques fail (I might have missed it).
They reference an earlier analysis from PwC that does show some decompilation before and after deobfuscation.
https://www.pwc.co.uk/issues/cyber-security-services/insight...
Given that this was made by a nation-state attacker, I'd expect something more sophisticated than the pairipcore VM.
So, still waiting for a full writeup of pairipcore (the newer one).