Very interesting, thanks!
For the fingerprinting part, can you explain the difference with the JShelter browser extension (https://jshelter.org/)?
I checked as you did in your demo video with https://demo.fingerprint.com/playground (using JShelter in Firefox). It produces a fingerprint detector report, like so:
{
}
However, it appears there is no way to display what was actually produced by the browser.
Was this the reason you had to build your own browser? Or is it possible to extend JShelter to do the same?
Ooh nice, I haven’t seen this project! I actually tried attempting this as an extension at first but wasn’t able to override page window functions. I’m curious to know how they accomplished this. (edit: I see that I missed the chrome.scripting API facepalm)
Thank you for sharing :)
FWIW I still think a custom browser approach has some benefits (stealth and executing in out-of-process iframes; I could be wrong on the second part, haven’t actually tested!)
Most of my job is reverse engineering a major website builder company's code so we can leverage their undocumented features. It's often a difficult job but your project could make it easier. I'm sure there are others out there that will find this useful.
In the past I've considered forking Chromium so every asset that it downloads (images, scripts, etc) is saved somewhere to produce a sort of "passive scraper".
This article made me consider creating a new CDP domain as a possible option, but tbf I haven't thought about this problem in ages so maybe there's something less stupid that I could do.
Ha, I've had the exact same thought before as well, but due to lack of experience and time constraints I ended up using mitmproxy with a small Python script instead. It was slow and buggy, but it served its purpose...
While searching for a tool I found several others asking for something similar, so I'm sure there are quite a few who would be interested in the project if you ever do decide to pick it up.
It's not quite the same, but in the past I've written (in Python) scrapers that run off of the cache. E.g. it would extract recipes from web pages that I had visited. The script would run through the cache and run an appropriate scraper based on the URL. I think I also looked for JSON-LD and microdata.
The downsides were that it only works with cached data, and I had to tweak it a couple of times because they changed the format of the cache keys.
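A rough sketch of the "pick a scraper by URL, then look for JSON-LD" idea described above. The original was written in Python against the browser cache; this version is in JavaScript and takes already-decoded `{ url, body }` entries, so it does not read any real cache format. The names `scrapers`, `extractJsonLd`, `extractRecipes`, `scrapeCache`, and the `recipes.example` host are all made up for illustration.

```javascript
// Hypothetical scraper registry: the first pattern that matches the URL wins.
const scrapers = [
  { pattern: /^https:\/\/recipes\.example\//, scrape: extractRecipes },
];

// Pull every JSON-LD block out of an HTML string, skipping malformed ones.
function extractJsonLd(html) {
  const re =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const items = [];
  for (const match of html.matchAll(re)) {
    try {
      items.push(JSON.parse(match[1]));
    } catch {
      // Not valid JSON: ignore this block and keep going.
    }
  }
  return items;
}

// Keep only JSON-LD objects that declare themselves as recipes.
function extractRecipes(html) {
  return extractJsonLd(html).filter(
    (item) => item && item["@type"] === "Recipe"
  );
}

// cacheEntries is assumed to be an iterable of { url, body } objects.
function scrapeCache(cacheEntries) {
  const results = [];
  for (const { url, body } of cacheEntries) {
    const scraper = scrapers.find((s) => s.pattern.test(url));
    if (scraper) results.push(...scraper.scrape(body));
  }
  return results;
}

// Usage with two fake cache entries; only the first URL has a scraper.
const sampleCache = [
  {
    url: "https://recipes.example/pancakes",
    body:
      '<html><script type="application/ld+json">' +
      '{"@type":"Recipe","name":"Pancakes"}</script></html>',
  },
  { url: "https://other.example/", body: "<html></html>" },
];
const recipes = scrapeCache(sampleCache);
console.log(recipes);
```

Reading the actual cache (and coping with changing cache key formats, as mentioned above) is the hard part and is deliberately left out here.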
"toString theory" is an incredible title for that section
resworb nwo ym detnaw syawla ev'i dna reenigne esrever a m'I
.ƨbɿɒwʞɔɒd ɘɿɒ ƨɿɘɟɟɘl ɿuoY
noitcelfer diova ot yrt I edoc ym nI
This isn't rot13.
EDIT: Oh, it took me a minute!
Abj vg vf.
This is such an eye-opener, and really interesting. It reminded me of projects like XprivacyLua that "expose" the different calls and requests from Android apps. Great work!
I am amazed at what you've accomplished here: adding your own custom CDP domain. Years ago I gave up on trying to hack Chromium (I wanted to learn how to add back Manifest Version 2 support before it got removed).
Build times were way longer on my potato hardware. Since then I haven't touched much C++.
It would be dangerous if this tool fell into the wrong hands.
Where's the wait list?
Nice work! Check out visible v8: https://github.com/wspr-ncsu/visiblev8 for inspiration on using the V8 debug logs.
For anyone that doesn't want to maintain a fork of Chromium, just download the PDB and hook it at runtime for spoofing and/or dumping call logs. For the hook itself, just add your DLL as a dependency in the PE structure.
That sounds like a Windows-only approach though.
PDBs exist for all builds of Google Chrome.
Love this blog, still waiting on part 2 of Reverse Engineering TikTok's VM
You can just use Proxy to get around toString shenanigans and prevent any detection whatsoever.
Someone mentioned this as well in another comment. Turns out most of this could’ve been done as an extension after all :-)
edit: actually, wouldn’t you still need to override the global you’d like to instrument? At that point, the toString of the modified function would leak your hook.
see: https://gist.github.com/voidstar0/179990efe918d1028b72f292cf...
Regardless, I do have some interesting ideas that should hopefully make my pain of compiling Chromium for 3 hours worth it though :p
Cheat Engine for site scripts? Who knows. Mostly just using this as an opportunity to learn some browser internals, so I’d say it still paid off :)
Your example proxies the console object. The intended way in this case is to make a Proxy from the log function itself and use the apply hook.
toString will be called on the Proxy and not your hook, so it won't reveal anything.
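To make the difference concrete, here is a minimal sketch comparing a naive override with a Proxy that uses the `apply` trap. The `api.add` function and the `calls` array are hypothetical stand-ins for whatever page function you want to instrument; this is not the code from the article or the gist.

```javascript
// Hypothetical page API used only for illustration.
const api = {
  add(a, b) {
    return a + b;
  },
};
const original = api.add;
const calls = [];

// Naive override: the wrapper's own source is visible through toString.
api.add = function (...args) {
  calls.push(args);
  return original.apply(this, args);
};
const naiveSource = Function.prototype.toString.call(api.add);

// Proxy with an apply trap: calls are still recorded, but toString is
// evaluated against the Proxy rather than against the trap function.
api.add = new Proxy(original, {
  apply(target, thisArg, args) {
    calls.push(args);
    return Reflect.apply(target, thisArg, args);
  },
});
const proxySource = Function.prototype.toString.call(api.add);
const sum = api.add(2, 3);

console.log("naive wrapper leaks hook source:", naiveSource.includes("calls.push"));
console.log("proxy leaks hook source:", proxySource.includes("calls.push"));
console.log("result:", sum, "recorded calls:", calls.length);
```

This only shows that the hook body does not appear in the stringified function. It does not claim the Proxy is undetectable by other means, which is what the replies below get into.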
D'oh! You are correct :-) Good catch and thanks for teaching me something!
No, you cannot: one can throw an exception and your proxy will be leaked, leading to detection.
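The comment above does not say which signal leaks. One plausible reading, and it is only an assumption, is the stack trace: if the hooked function throws, the frames of a JavaScript `apply` trap show up in `err.stack`. A small sketch, where `parseInput`, `hookApplyTrap`, and `stackOf` are hypothetical names:

```javascript
// Hypothetical target that throws, standing in for a hooked page function.
function parseInput(value) {
  if (typeof value !== "string") {
    throw new TypeError("expected a string");
  }
  return value.trim();
}

const hooked = new Proxy(parseInput, {
  // Named so the frame is easy to spot in the trace below.
  apply: function hookApplyTrap(target, thisArg, args) {
    return Reflect.apply(target, thisArg, args);
  },
});

// Capture the stack trace produced by calling fn with a bad argument.
function stackOf(fn) {
  try {
    fn(42);
  } catch (err) {
    return String(err.stack);
  }
  return "";
}

const directStack = stackOf(parseInput);
const hookedStack = stackOf(hooked);

console.log("direct call mentions the trap:", directStack.includes("hookApplyTrap"));
console.log("hooked call mentions the trap:", hookedStack.includes("hookApplyTrap"));
```

`err.stack` is non-standard and its format is engine-specific, so a real check would more likely compare frame counts or shapes than look for a name. The point is only that the extra frame exists.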
Very cool, thanks for sharing. I would love to see this show up as an OSS project. I know a few people who would likely enjoy being able to contribute if that's something you'd be looking for.
This is neat but it also makes me uncomfortable to see just how much fingerprinting is done these days. TikTok is creepy but I'm sure they aren't the worst.
Neat investigation, but I didn’t totally follow how the project would be useful for reverse engineering. It seems like a project that would mostly be useful for evading bot checks, e.g. for web scraping or AI automation.
I would love to be able to see IFrame and BroadcastChannel communication
...and power users. This is a browser that acts in the interests of the user, something that the mainstream authoritarian technocracy is actively trying to destroy and has been ever since they removed "View Source" from its customary place.
Interesting tool. Would love to contribute
Not to comment on the rest of the article or the author's goals, but it's absolutely possible to use a content script (dynamically injected into the `main` world, as opposed to the default `isolated`, for example: https://github.com/tbrockman/browser-extension-for-opentelem...) and Proxies (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) to transparently hook most (if not all) JavaScript being executed in the webpage.
Which for some functionality would have been a bit more portable and involved less effort.
I have a project (in my rather long project backlog) that involves hooking JS APIs to download youtube videos. I'm worried that if my extension (or a similar extension) gained enough popularity, youtube would start inspecting the relevant JS objects to see if they'd been replaced with proxy instances.
Aside from playing a hooking/patching game of cat and mouse, I don't think this is fully solvable without modifying the browser engine itself - then you can hook things in a way that's completely transparent to the JS in webpages.
Was just about to comment this. I’ve played that exact cat and mouse game before. There’s also another fun way to hook that I used to like: doing something like Object.defineProperty on Object.prototype to globally hook onto something. You can do lots of stuff with that, and it’s pretty useful in user scripts.
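For anyone who hasn't seen the Object.prototype trick, here is a minimal sketch of one way it can work: an accessor on `Object.prototype` catches assignments of a given key on any plain object. The property name `apiKey` and the `seen` array are made up for the example.

```javascript
const seen = [];

// Install a global accessor: assigning obj.apiKey on any ordinary object
// that lacks its own apiKey property will run this setter.
Object.defineProperty(Object.prototype, "apiKey", {
  configurable: true, // lets us remove the hook again afterwards
  get() {
    return undefined;
  },
  set(value) {
    seen.push(value);
    // Shadow the accessor with a normal own property so the object
    // behaves as the page expects from here on.
    Object.defineProperty(this, "apiKey", {
      value,
      writable: true,
      configurable: true,
      enumerable: true,
    });
  },
});

// Somewhere in "page code": a plain assignment triggers the hook.
const config = {};
config.apiKey = "secret-123";

// Object literals define own properties directly and bypass the setter.
const literal = { apiKey: "not-seen" };

// Clean up the global hook.
delete Object.prototype.apiKey;

console.log(seen, config.apiKey, literal.apiKey);
```

Note that this only catches plain assignment. Object literals, `Object.defineProperty`, and class fields define the property directly and never reach the setter.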
Thanks for sharing some examples! Someone shared a similar project in the other thread. I didn’t realize this at the time of writing haha.
FWIW I still think modifying the browser has some positives wrt stealth and hooking out of process frames (could be wrong on the second part, haven’t actually tested!)
Still good to know though, will leave a note in the article :-)
Yeah, there's a pretty overwhelming amount of browser APIs and functionality which isn't always (well-)documented to learn about. If I recall correctly Proxies wouldn't be detectable (seems to be supported by https://exploringjs.com/es6/ch_proxies.html#sec_detect-proxi...) so long as your injected content script runs first (otherwise other code could presumably override the Proxy constructor). You should also be able to hook any embedded frames by setting `target: { ..., allFrames: true }`.
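A sketch of what that injection could look like with `chrome.scripting.executeScript`, assuming a Manifest V3 extension with the `scripting` permission and host access to the page. `installHook` and `buildInjection` are hypothetical helpers, and hooking `window.fetch` is just an example target.

```javascript
// Runs in the page's own JavaScript context ("MAIN" world), once per frame.
function installHook() {
  const original = window.fetch;
  window.fetch = new Proxy(original, {
    apply(target, thisArg, args) {
      console.debug("fetch called with", args[0]);
      return Reflect.apply(target, thisArg, args);
    },
  });
}

// Build the injection options for a given tab.
function buildInjection(tabId) {
  return {
    target: { tabId, allFrames: true }, // include embedded frames
    world: "MAIN", // page context instead of the default isolated world
    injectImmediately: true, // do not wait for the document to finish loading
    func: installHook,
  };
}

// Only attempt injection when actually running inside an extension.
if (typeof chrome !== "undefined" && chrome.scripting) {
  chrome.tabs.query({ active: true, currentWindow: true }).then(([tab]) => {
    if (tab) chrome.scripting.executeScript(buildInjection(tab.id));
  });
}
```

As far as I know, `executeScript` does not guarantee the hook runs before the page's own scripts. If "runs first" matters, registering the script with `chrome.scripting.registerContentScripts` and `runAt: "document_start"` is probably the safer route.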
To note, there are undocumented detections even for Proxies, for example using the `in` operator in V8 (such as `proxiedFunc in 1` for some proxied function). Really cool to see a project like this.
How do you use `in` in V8 to detect proxies? I assume it's a difference in the exception, but the message and the cause were the same in both direct and proxied `x in 1`.
Ah wow, good catch. Yeah, you're right, this technique seems to be patched.
feature request: allow setting breakpoints without having obfuscator debugger statement loops get in the way
I actually wrote a separate blog post about this! Changing the debugger keyword :) see: https://nullpt.rs/evading-anti-debugging-techniques
could be very useful for my work, nice to see