Reverse engineering the obfuscated TikTok VM

412 points by xfeeefeee 21 April 2025 | 120 comments

Comments

kleiba 21 April 2025

I've been using a shitty streaming website whose player interrupts the playback of a video at irregular intervals and presents a cryptic error message. I've started looking into the JavaScript code to see if I can't code up a work-around mechanism (basically debugging their garbage implementation), and of course (why actually?) their player code is also obfuscated.

And I've gotta say, employing an AI assistant has proven to be an invaluable help in trying to understand obfuscated code. It's actually really cool to take a function of gobbledegook JavaScript and ask the AI to rewrite it in a more canonical and easily understandable way, with inline comments. Of course, there are flaws every now and then, but the ability to do this has been such a game changer for reverse engineering, IMO.

I can even ask to take a guess at finding better variable/function names and the AI can infer from the code (maybe has seen the unobfuscated libraries during training?) what this code is actually doing on a high-level and turn something like e.g(e.g) into player.initialize(player.state) which is nothing short of amazing.

So for anyone doing similar work, I cannot recommend highly enough to have an AI agent as another tool in your tool belt.

SoKamil 21 April 2025

> As this is a Javascript file executed on the web, it is actually possible to replace the normal webmssdk.js with the deobfuscated file and use TikTok normally.

> This can be achieved by using two browser extensions known as Tampermonkey for executing custom code and CSP to disable CSP so I can fetch files from blocked origins. This is so I can put latestDeobf.js in my own file server and have it be fetched each time, this is so I can easily edit the file and let the changes take effect each time I refresh. This makes it much easier to debug when reversing functions.

I believe you can achieve the same effect without any 3rd party extensions. You can use Local Overrides in Chrome DevTools.

Great work!

godelski 21 April 2025

This seems like quite a lot of work to hide the code. What would the legitimate reasons for this be? Because it looks like it would make the program less optimized and more complexity just leads to more errors.

I understand the desire to make it harder for bots, but 1) it doesn't seem to be effective and bots seem to be going a very different route 2) there's got to be better ways that are more effective. It's not like you're going to stop clones through this because clones can replicate by just seeing how things work and reverse engineer blackbox style.

davidsojevic 21 April 2025

Very impressive work! I always enjoy a good write up about reverse engineering efforts and yours was really simple to follow.

Many popular/large websites and bot protection services usually have environment checking as a baseline and mouse-movement tracking in some of the more aggressive anti-bot checks.

It's always interesting to see how long it takes from when the measures have been defeated/publicised until the service ends up making changes to their mechanism to make you start over (hopefully not from scratch).

mrkramer 21 April 2025

In my bookmarks I found these RE examples as well: https://www.nullpt.rs/reverse-engineering-tiktok-vm-1

https://ibiyemiabiodun.com/projects/reversing-tiktok-pt2/

ronsor 21 April 2025

There is no legitimate reason for a social media platform to employ this much obfuscation.

Wowfunhappy 21 April 2025

...can I ask a really stupid question? What is a VM in this context?

I've used VMs for years to run Windows on top of macOS or Linux on top of Windows or macOS on top of macOS when I need an isolated testing environment. I also know that Java works via the "Java Virtual Machine" which I've always thought of as "Java code actually runs in its own lightweight operating system on top of the host OS, which makes it OS-agnostic". The JVM can't run on bare metal because it doesn't have hardware drivers, but presumably it could if you wrote those drivers.

But presumably the VM being discussed in TFA isn't that kind of VM, right? Bytedance didn't write an operating system in Javascript?

I've been seeing "VM" used in lots of contexts like this recently and it makes me think I must be missing something, but it's the sort of question I don't know how to Google. AIs have not been helpful either, plus I don't trust them.
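In obfuscation write-ups, "VM" usually does not mean an operating-system-level virtual machine. It typically means a small interpreter: the protected logic is compiled to a custom bytecode, and a loop written in ordinary JavaScript fetches and executes those instructions one by one. The toy stack machine below sketches that idea in Python. The opcodes (PUSH, ADD, MUL, HALT) and the run function are invented for illustration and are not TikTok's actual instruction set.

```python
# Toy stack-based bytecode interpreter (illustrative only; opcodes are hypothetical).
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(bytecode):
    """Execute a flat list of ints and return the value on top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            # PUSH takes one inline operand: the literal to push.
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc - 1}")
    return stack[-1] if stack else None

# (2 + 3) * 4 encoded as bytecode
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = run(program)
```

Anyone reading only program sees a list of numbers, which is why this style works as obfuscation: the real logic stays hidden until the instruction set is understood.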

heinternets 21 April 2025

Is TikTok so obfuscated to prevent people from knowing the full extent of data collection and device fingerprinting?

RexM 21 April 2025

Is this VM somehow related to Lynx (their cross platform dev tooling?)

https://lynxjs.org/

Also discussed on HN

https://news.ycombinator.com/item?id=43264957

0xDEADFED5 21 April 2025

this is cool. i briefly worked on a TikTok bot a while back and it was a huge pain in the ass.

weinzierl 21 April 2025

Is there also a VM in their iOS app? I thought a VM would be against Apple's policies?

lazyeye 21 April 2025

An oldie but a goodie. A guide to manipulating online comments to hide/dilute/obfuscate undesirable commentary....

https://cryptome.org/2012/07/gent-forum-spies.htm

sylware 21 April 2025

What's terrible are the humans writing such software...

But if AI can help to fight those people's work, good for humanity I guess.

That said... Is AI going to de-obfuscate/reverse engineer their obfuscated AI prompts or web apps?

domfie 21 April 2025

Looks like a lot of work. I recently discovered webcrack and the tool jehna/humanify for such deobfuscation tasks

itsthecourier 21 April 2025

this level of obfuscation in a social app is super suspicious

worldsavior 21 April 2025

That's a very strong obfuscation. Takes a lot of work to deobfuscate such a thing. Great writeup.

xfeeefeee 21 April 2025

The fascinating process of reverse engineering this VM is detailed here.

TikTok uses a custom virtual machine (VM) as part of its obfuscation and security layers. This project includes tools to:

Deobfuscate webmssdk.js that has the virtual machine.

Decompile TikTok’s virtual machine instructions into readable form.

Script Inject: Replace webmssdk.js with the deobfuscated VM injector.

Sign URLs: Generate signed URLs which can be used to perform auth-based requests, e.g. posting comments.
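The "decompile" step above can be pictured as a disassembler: walk the VM's bytecode and print each instruction with a readable name. Below is a minimal sketch, assuming a hypothetical instruction set in which each opcode has a fixed number of inline operands. The OPCODES table and the disassemble function are made up for illustration and do not reflect the project's real tooling or TikTok's real encoding.

```python
# Hypothetical opcode table: opcode -> (mnemonic, number of inline operands).
OPCODES = {
    0: ("PUSH", 1),
    1: ("ADD", 0),
    2: ("MUL", 0),
    3: ("JMP", 1),
    4: ("HALT", 0),
}

def disassemble(bytecode):
    """Return one readable line per instruction, prefixed with its offset."""
    lines = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op not in OPCODES:
            # Keep going so that unknown bytes remain visible in the listing.
            lines.append(f"{pc:04d}  .byte {op}")
            pc += 1
            continue
        name, argc = OPCODES[op]
        operands = bytecode[pc + 1 : pc + 1 + argc]
        if len(operands) < argc:
            raise ValueError(f"truncated {name} at offset {pc}")
        text = " ".join([name] + [str(x) for x in operands])
        lines.append(f"{pc:04d}  {text}")
        pc += 1 + argc
    return lines

listing = disassemble([0, 2, 0, 3, 1, 0, 4, 2, 4])
```

Printing "\n".join(listing) gives an assembly-style view of the program. A real decompiler would go further and rebuild expressions and control flow from such a listing.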