I've been using a shitty streaming website whose player interrupts playback at irregular intervals and shows a cryptic error message. I've started looking into the JavaScript code to see if I can code up a work-around (basically debugging their garbage implementation), and of course (why, actually?) their player code is also obfuscated.
And I've gotta say, employing an AI assistant has proven to be an invaluable help in trying to understand obfuscated code. It's actually really cool to take a function of gobbledegook JavaScript and ask the AI to rewrite it in a more canonical and easily understandable way, with inline comments. Of course, there are flaws every now and then, but the ability to do this has been such a game changer for reverse engineering, IMO.
I can even ask it to guess at better variable/function names, and the AI can infer from the code (maybe it has seen the unobfuscated libraries during training?) what the code is actually doing at a high level and turn something like e.g(e.g) into player.initialize(player.state), which is nothing short of amazing.
So for anyone doing similar work, I cannot recommend highly enough to have an AI agent as another tool in your tool belt.
> As this is a Javascript file executed on the web, it is actually possible to replace the normal webmssdk.js with the deobfuscated file and use TikTok normally.
> This can be achieved by using two browser extensions known as Tampermonkey for executing custom code and CSP to disable CSP so I can fetch files from blocked origins. This is so I can put latestDeobf.js in my own file server and have it be fetched each time, this is so I can easily edit the file and let the changes take effect each time I refresh. This makes it much easier to debug when reversing functions.
I believe you can achieve the same effect without any 3rd party extensions. You can use Local Overrides in Chrome DevTools.
This seems like quite a lot of work to hide the code. What would the legitimate reasons for this be? Because it looks like it would make the program less optimized and more complexity just leads to more errors.
I understand the desire to make it harder for bots, but 1) it doesn't seem to be effective and bots seem to be going a very different route 2) there's got to be better ways that are more effective. It's not like you're going to stop clones through this because clones can replicate by just seeing how things work and reverse engineer blackbox style.
Very impressive work! I always enjoy a good write up about reverse engineering efforts and yours was really simple to follow.
Many popular/large websites and bot protection services usually have environment checking as a baseline and mouse-movement tracking in some of the more aggressive anti-bot checks.
It's always interesting to see how long it takes from when the measures have been defeated/publicised until the service ends up making changes to their mechanism to make you start over (hopefully not from scratch).
...can I ask a really stupid question? What is a VM in this context?
I've used VMs for years to run Windows on top of macOS or Linux on top of Windows or macOS on top of macOS when I need an isolated testing environment. I also know that Java works via the "Java Virtual Machine", which I've always thought of as "Java code actually runs in its own lightweight operating system on top of the host OS, which makes it OS-agnostic". The JVM can't run on bare metal because it doesn't have hardware drivers, but presumably it could if you wrote those drivers.
But presumably the VM being discussed in TFA isn't that kind of VM, right? Bytedance didn't write an operating system in Javascript?
I've been seeing "VM" used in lots of contexts like this recently and it makes me think I must be missing something, but it's the sort of question I don't know how to Google. AIs have not been helpful either, plus I don't trust them.
Reverse engineering the obfuscated TikTok VM
(github.com) | 410 points by xfeeefeee | 21 April 2025 | 120 comments
Comments
Great work!
https://ibiyemiabiodun.com/projects/reversing-tiktok-pt2/
https://lynxjs.org/
Also discussed on HN
https://news.ycombinator.com/item?id=43264957
https://cryptome.org/2012/07/gent-forum-spies.htm
But if AI can help to fight those people's work, good for humanity I guess.
That said... Is AI going to de-obfuscate/reverse engineer their obfuscated AI prompts or web apps?
TikTok uses a custom virtual machine (VM) as part of its obfuscation and security layers. This project includes tools to:
Deobfuscate webmssdk.js that has the virtual machine.
Decompile TikTok’s virtual machine instructions into readable form.
Script Inject: Replace webmssdk.js with the deobfuscated VM injector.
Sign URLs: Generate signed URLs which can be used to perform auth-based requests, e.g. posting comments.
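For readers wondering what a "VM" means in this setting: it is usually an interpreter loop written in JavaScript that executes a custom bytecode, so the real logic lives in an opaque array of numbers rather than in readable source. The following is a minimal sketch of that idea, assuming a stack-based design. The opcodes (`PUSH`, `ADD`, `MUL`, `JMP_IF_ZERO`, `HALT`) and the `run` and `disassemble` helpers are invented for illustration and are not TikTok's actual instruction set or this project's code.

```javascript
// Hypothetical opcodes for illustration only; not TikTok's real instruction set.
const OP = { PUSH: 0, ADD: 1, MUL: 2, JMP_IF_ZERO: 3, HALT: 4 };

// Opcodes that are followed by one inline operand in the bytecode stream.
const HAS_OPERAND = new Set([OP.PUSH, OP.JMP_IF_ZERO]);

// Reverse lookup (opcode number -> name), used by the disassembler.
const NAME = Object.fromEntries(Object.entries(OP).map(([k, v]) => [v, k]));

// The "VM": fetch an opcode, dispatch on it, repeat until HALT.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter: index of the next instruction
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH:
        stack.push(bytecode[pc++]); // the operand follows the opcode
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case OP.JMP_IF_ZERO: {
        const target = bytecode[pc++];
        if (stack.pop() === 0) pc = target; // conditional jump
        break;
      }
      case OP.HALT:
        return stack.pop();
      default:
        throw new Error(`unknown opcode ${op} at index ${pc - 1}`);
    }
  }
  return stack.pop();
}

// A toy "decompiler" pass: turn the numeric bytecode into a readable listing.
function disassemble(bytecode) {
  const lines = [];
  let pc = 0;
  while (pc < bytecode.length) {
    const at = pc;
    const op = bytecode[pc++];
    const name = NAME[op] ?? `UNKNOWN_${op}`;
    lines.push(
      HAS_OPERAND.has(op) ? `${at}: ${name} ${bytecode[pc++]}` : `${at}: ${name}`
    );
  }
  return lines;
}

// (2 + 3) * 4, expressed as bytecode.
const program = [OP.PUSH, 2, OP.PUSH, 3, OP.ADD, OP.PUSH, 4, OP.MUL, OP.HALT];
console.log(run(program)); // 20
console.log(disassemble(program).join("\n"));
```

Without the opcode table, `program` is just `[0, 2, 0, 3, 1, 0, 4, 2, 4]`, which is why recovering the instruction set is typically the first step before the bytecode can be decompiled into something readable.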