Grouping scripts and using JS libraries: frida-compile

An "agent" example. It uses frida-compile to compile several TypeScript files down to a single JS file (with source maps, TS type definitions and automatic recompiling/reloading). Frida's API takes a single blob of JavaScript per script, and require doesn't work inside its JS environment, so frida-compile is the solution. Use frida-create to create a project based on it.

See also the module example. It is like the agent example, but apparently structured as an API for other agents that want to use this "module" as a library component.

  • can i organize the JS code in different files and include them all together from the entrypoint that i feed to frida? purely for code organization purposes

  • yep, try the frida-create CLI tool in an empty directory to set up an example project

  • does the frida CLI take in multiple -l ( --load ) arguments to load multiple scripts?

  • Not currently. You won’t need that with frida-compile though as you’ll get a single .js that you can load with the REPL. But I think we should extend the REPL so it adds require() to the global object, and have the implementation do send() + recv().wait() to dynamically load other files (with careful validation on the REPL side)

  • Well Frida’s API requires a single blob of JavaScript when loading a script (session.create_script("console.log('code here');")), so if you compose an agent from multiple files you will need to use some kind of JavaScript toolchain to do that – frida-compile is one such tool (which piggybacks on browserify and other web toolchain building blocks). You can also instantiate one script per file, but that isn’t recommended as it will waste a lot of memory, and also because frida-java-bridge (the Java API) needs to instrument the ART runtime, so multiple scripts doing that isn’t ideal.

TypeScript details

  • When trying to compile the TypeScript, I’m having issues in Interceptor callbacks where I use the “this” property to save args in onEnter. It complains that “this” is of “any” type. I don’t set “this”, so how can I give it a type?

  • onEnter(this: MyType, args) { …​ }

  • I learned the hard way that this can be different inside a function written using the new arrow syntax as opposed to the old-style function(args) { }, but I'm not sure if that applies here
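
The two answers above can be illustrated without Frida itself. This is a minimal sketch, assuming a hypothetical per-call state type (OpenState) and a hypothetical stand-in (simulateCall) that invokes both callbacks with the same this, which is roughly what Interceptor does for each intercepted call. None of these names come from Frida. The callbacks are written as methods with an explicit this parameter; an arrow function would take this from the enclosing scope and never see the per-call object.

```typescript
// Hypothetical per-call state shared between onEnter and onLeave.
interface OpenState {
  path?: string;
}

// Hypothetical callback shape; the `this` parameter only exists for the
// type checker and is erased from the emitted JavaScript.
interface Callbacks<T> {
  onEnter(this: T, args: string[]): void;
  onLeave(this: T, retval: number): void;
}

// Stand-in for Interceptor.attach(): calls both callbacks with the same
// `this` object, so state saved in onEnter is visible in onLeave.
function simulateCall<T>(
  callbacks: Callbacks<T>,
  state: T,
  args: string[],
  retval: number
): void {
  callbacks.onEnter.call(state, args);
  callbacks.onLeave.call(state, retval);
}

const log: string[] = [];

const callbacks: Callbacks<OpenState> = {
  // Method syntax: `this` is bound by the caller and typed as OpenState.
  onEnter(this: OpenState, args: string[]): void {
    this.path = args[0];
  },
  onLeave(this: OpenState, retval: number): void {
    log.push(`${this.path} -> ${retval}`);
  },
};

simulateCall(callbacks, {}, ["/etc/hosts"], 3);
```

In a real agent the same this: MyType annotation would go on the onEnter/onLeave methods passed to Interceptor.attach().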

Using Frida from C/native code

Examples of using the C API of Frida Gum:

Examples of using Frida in C. They cover the 3 variants: Gum (just C), GumJS and Core (both with JS). Core is lower level, but it still takes a large amount of code and still uses JS as the "agent".

Article from Leon Jacobs. At the end it covers the problem of packaging the Python part, and how he opted for moving the code to frida-core, which still packages the JS engine but avoids having to depend on py2exe or something similar to distribute the tool to regular users' computers.

Loading a library

Calling the native function. How does it work?

Find the initial code in bindings/gumjs/gumquickcore.c to figure out the vtable/this pointer issue.

How to deal with structs

Q/A

  • @oleavr do call probes in Stalker replace the target function, or just wrap it like Interceptor?

  • Wrap it. They’re like a callout at the start of the target’s basic block

  • so not quite "wrapped" as there’s no onLeave

  • is there any way to replace the target function?

  • yeah implement a transformer and if the first instruction fetched is a target you care about, write your own code instead of calling .keep()

  • but how do I jump to the other blocks if the same instruction.mnemonic was found earlier?

  • You’re probably assuming that Stalker processes all basic blocks, even the ones following the one we completely rewrite. It won’t. Stalker processes basic blocks just in time – just before they’re executed. So if you rewrite the starting basic block of a function, execution won’t reach the next basic block unless you generate code to reach it (or other code jumps to it – though you are in control of that too).

  • has anyone used the MemoryAccessMonitor API on Android before? It is a powerful feature, but sadly my registered callback is only called once, the first time a read/write/execute event happens.

  • That’s how it’s meant to work, you have to re-enable it. Have a look at Stalker if you need to see everything that a thread does. You could even combine them, so you start stalking from first access of somewhere interesting.

  • (…​)

  • Stalker allows you to see every single instruction executed by the thread(s) it follows. It also allows you to plug in your own transformer, so you can weave in arbitrary instrumentation between those instructions. And that means you can generate instrumentation before/after instructions that read/write memory, and do whatever you want. It’s not easy, but it’s possible. https://t.me/fridadotre/61163

  • (…​)

  • Btw, this example might make Stalker’s transformer feature clearer: https://github.com/frida/frida-presentations/blob/master/R2Con2017/02-transforms/04-aes.js note though that this can be optimized by implementing the transform callback in C using e.g. the CModule API (but would recommend starting with the whole thing in JS and optimizing later)

  • You can either Interceptor.attach() to add an instruction-level probe someplace interesting that the thread will reach, or you can Stalker.follow() it and do a putCallout() where you want your breakpoint – from there you can block using send() + recv().wait() (see tutorial on Messages) to implement a breakpoint with full interactivity

  • We have an implementation of it in Gum.Process.modify_thread(), but it’s not yet exposed to JS. (PR welcome)
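
The "callout at the start of the target's basic block" idea from the answers above can be sketched as a transform callback. This is a minimal sketch, not the code Stalker uses for call probes. The three *Like interfaces are local stand-ins that only mirror the parts of the iterator used here (next(), keep(), putCallout(), and an address with equals()), and makeProbeTransform is a hypothetical helper name. Every instruction is kept, so the target still runs; replacing it would mean emitting your own code instead of calling keep().

```typescript
// Local stand-ins for the parts of Frida's types used below (assumption:
// they mirror NativePointer, Instruction and the Stalker iterator closely
// enough for the real objects to be passed in).
interface AddressLike {
  equals(other: AddressLike): boolean;
}

interface InstructionLike {
  address: AddressLike;
  mnemonic: string;
}

interface IteratorLike {
  next(): InstructionLike | null;
  keep(): void;
  putCallout(callout: (context: unknown) => void): void;
}

// Hypothetical helper: builds a transform that places a callout before the
// first instruction of any basic block starting at `target`.
function makeProbeTransform(
  target: AddressLike,
  onHit: (context: unknown) => void
): (iterator: IteratorLike) => void {
  return function transform(iterator: IteratorLike): void {
    let instruction = iterator.next();
    let isFirst = true;
    while (instruction !== null) {
      // Only the block's first instruction is compared with the target.
      if (isFirst && instruction.address.equals(target)) {
        iterator.putCallout(onHit);
      }
      isFirst = false;
      // Keep the original instruction so behaviour is unchanged.
      iterator.keep();
      instruction = iterator.next();
    }
  };
}
```

Inside an agent, the result would be passed as the transform option of Stalker.follow(). Because blocks are processed just before they execute, the callback only ever sees blocks the followed thread actually reaches.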

Internals

  • No, it recompiles the machine code just-in-time to weave in instrumentation. This is a shadow version of the code, so the original code isn’t modified. But the shadow version has identical side-effects on stack/memory/etc.

  • how does that shadow version of memory communicate with the original app memory?

  • It’s not shadow in the literal sense, but conceptually. It’s in the same address space. But the original instructions there have been adjusted for their new location in memory, and in some cases fully rewritten into something equivalent. (E.g. CALL on x86 – Stalker doesn’t emit the CALL, but instead emits code that pushes what the CALL instruction would have pushed on the stack, and then deals with the branching separately.)

  • No, it’s designed to be in-process. It also hides its own ranges/resources from APIs so you don’t see yourself when querying them.

  • I think you’re confusing it with frida-core’s Linux injector. It’s the most primitive of the injectors and relies on ptrace() for now (for a brief moment during injection), but the plan is to improve that.

  • (But you can use any injector obviously – Gum is what you use in the payload and doesn’t make any assumptions about being injected a certain way)

  • (…​)

  • Right. We use our own out-of-process dynamic linker on i/macOS, so we can map a .dylib into memory with only task-port access to a target process. (So this works even if the target process is sandboxed and cannot dlopen() / read from the filesystem.) The goal is to also implement that on the other OSes, but for now we are relying on dlopen() / LoadLibrary() there.
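
The CALL example above can be made concrete with a toy model. This is not Stalker code: ToyCpu, nativeCall, emulatedCall and resolveBranch are all hypothetical names, and the "CPU" is just a stack and a program counter. The sketch only shows why the relocated copy has the same effect on the stack as the original instruction: it pushes the return address the original CALL would have pushed, and handles the branch as a separate step.

```typescript
// Toy CPU state: a program counter and a stack of addresses.
interface ToyCpu {
  pc: number;
  stack: number[];
}

// What a native CALL does: push the address of the next instruction,
// then branch to the target.
function nativeCall(
  cpu: ToyCpu,
  callSite: number,
  callLength: number,
  target: number
): void {
  cpu.stack.push(callSite + callLength);
  cpu.pc = target;
}

// What the relocated copy does in this model: it runs from a different
// address, but pushes the ORIGINAL return address, then lets a separate
// step decide where execution continues (assumption: e.g. an instrumented
// copy of the target, modelled here by resolveBranch).
function emulatedCall(
  cpu: ToyCpu,
  originalCallSite: number,
  callLength: number,
  target: number,
  resolveBranch: (target: number) => number
): void {
  cpu.stack.push(originalCallSite + callLength);
  cpu.pc = resolveBranch(target);
}
```

Code that inspects its own return address sees the same value in both cases, which is what "identical side-effects on stack/memory" means here.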

Other

A patch on Gist to add logging. It seems the recommended way to work inside frida-gum in some cases is to log to a file, so as not to interfere with the stdout of the different projects, where the output would just get lost.