Self-modifying software sounds like a forbidden trick: a program reaches into itself, changes the instructions, and continues as something slightly different.
That image is real, but it is also too narrow. The modern version is broader: software can generate code at runtime, patch running functions, specialize itself based on observed behavior, rewrite its own source, load new native code, or produce future versions of itself. Some of that is elegant engineering. Some of it is malware. Some of it is the foundation of high-performance runtimes.
The important point is this: self-modification is not an edge case at the fringe of computer science. It is a direct consequence of treating programs as data.
What Counts as Self-Modifying Software?
There are at least five useful meanings:
Instruction mutation. A program changes bytes that will later be executed as machine instructions. This is the classic assembly-language meaning.
Runtime code generation. A program writes new executable code while it runs, then jumps into it. JIT compilers do this constantly.
Dynamic replacement. A running system redirects execution from old code to new code without restarting. Kernel livepatching and dynamic software updating live here.
Source-level rewriting. A program changes its own source files or configuration, then restarts, reloads, or recompiles itself.
Self-reproduction and self-reference. A program can obtain, print, transform, or embed a description of itself. Quines, compiler bootstrapping, and viral replication are the canonical examples.
These are different mechanisms, but they share one deep idea: the boundary between “program” and “data” is movable. In a von Neumann machine, instructions and data are both stored in memory. In a programming language, source can be represented as strings, syntax trees, bytecode, or intermediate representation. Once code has a data representation, other code can inspect it, transform it, and run the result.
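Python makes this concrete: the standard `ast` module gives source code a data representation that other code can inspect, transform, and execute. A minimal sketch (the `double` function and the rewrite rule are invented for illustration):

```python
import ast

# Source code as data: parse it into a tree that other code can manipulate.
source = "def double(x):\n    return x * 2\n"
tree = ast.parse(source)

# A trivial program transformer: rewrite every integer literal 2 into 3.
class Rewriter(ast.NodeTransformer):
    def visit_Constant(self, node):
        if node.value == 2:
            return ast.copy_location(ast.Constant(3), node)
        return node

new_tree = ast.fix_missing_locations(Rewriter().visit(tree))

# Run the transformed program: the data representation becomes code again.
ns = {}
exec(compile(new_tree, "<generated>", "exec"), ns)
assert ns["double"](10) == 30   # the transformed function now triples
```

The round trip — source to tree, tree to transformed tree, tree to executable code — is exactly the code-as-data loop the paragraph describes.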
The Theoretical Ground: Fixed Points
The cleanest theoretical proof comes from computability theory.
Kleene’s second recursion theorem says, roughly: for any computable transformation of programs, there exists a program that behaves the same as its own transformed version. More formally, for a computable function f over program descriptions, there is an index e such that the partial computable function described by e is extensionally equal to the one described by f(e).
That is a fixed point theorem for code.
The practical reading is powerful: if you can write a program transformer, computability theory guarantees the existence of a program that can, in effect, account for its own description inside that transformation. This is why quines are possible without infinite regress. A self-printing program does not contain an infinite copy of itself. It contains a finite representation plus a procedure for reconstructing the whole.
Neil D. Jones connects the classical theorem to programming-language ideas directly: the S-m-n theorem, universal machines, and Kleene’s second recursion theorem correspond to partial evaluation, self-interpretation, and reflection. He also emphasizes that Kleene’s original proof is constructive and essentially builds a self-reproducing program.
That is the first kind of evidence: not a demo, but a theorem. There are programs that can operate on their own descriptions because fixed points exist in any sufficiently expressive model of computation.
Quines Are the Smallest Practical Proof
A quine is a program that outputs its own source code without simply reading the source file. It is the “hello world” of self-reference.
The proof idea is simple:
- Put part of the program in a data value.
- Print that data value both as executable text and as quoted data.
- Arrange the quoting so the output reconstructs the original program exactly.
In Python, the pattern can be as small as:
s = 's = %r\nprint(s %% s)'
print(s % s)
That snippet is not modifying itself, but it proves the same underlying claim: code can contain a finite description that reconstructs the whole code object. Self-modifying systems add a transformer between “description” and “next program.”
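The claim is mechanically checkable: execute the quine and compare its output to its own source. A small sketch using `exec` with captured stdout:

```python
import io
import contextlib

# The two-line quine above, written out as a Python string literal.
source = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"

# Run it and capture what it prints.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(source)

assert buf.getvalue() == source   # the program printed exactly itself
```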
Ken Thompson’s 1984 Turing Award lecture, “Reflections on Trusting Trust,” made the security version famous. Thompson walks from self-reproducing programs to a compiler that can teach its own backdoor to future compiler binaries even when the visible compiler source looks clean. The lesson is still brutal: source inspection alone cannot prove that a software supply chain is clean if the tools that build the source are already compromised.
Partial Evaluation: Programs Specializing Programs
Partial evaluation is where theory starts to look like an engineering tool.
If an interpreter takes two inputs, interpreter(program, data), and you specialize the interpreter with respect to a fixed program, the result is a new program that runs that program directly on future data. This is the first Futamura projection:
specialize(interpreter, program) = compiled_program
Futamura’s work showed that a compiler can be generated from a formal interpreter by specializing computation. The second and third Futamura projections go further: specialize the specializer itself and you get compiler generators.
This matters because it reframes self-modification. A system does not need to blindly mutate machine instructions to be self-modifying in a meaningful way. It can carry an interpreter, collect known facts, specialize itself, and emit a faster version. The generated program is not magic. It is the residual code left after the known parts of the computation have been evaluated away.
Modern JIT compilers are the industrial descendant of this idea, even when they use different implementation techniques.
JIT Compilers: The Mainstream Case
JavaScript engines, JVMs, .NET runtimes, LuaJIT, Julia, and many database/query engines all rely on runtime code generation. They observe execution, infer facts, generate machine code, and sometimes throw that code away when assumptions break.
V8 is a clean example because the team documents the tiering model publicly.
In V8’s Ignition and TurboFan pipeline, JavaScript is first compiled to bytecode and interpreted. Runtime feedback from execution is then used by optimizing compilers to produce machine code. V8’s Maglev compiler adds an intermediate tier between the Sparkplug baseline compiler and TurboFan: it uses runtime feedback such as object shapes and types to build specialized SSA nodes, emits machine code, and relies on deoptimization when assumptions fail.
This is self-modifying software in the practical runtime sense:
- The running program produces new executable code.
- The new code is specialized to observed behavior.
- The runtime can invalidate or replace that code later.
- Program behavior must remain semantically compatible with the language specification.
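The four properties above can be sketched as a toy tiering runtime (illustrative only; the class and threshold names are invented, not any real JIT's API). Tier 0 is an always-correct generic path; a hot function tiers up to code generated for the observed argument type, guarded so the runtime can deoptimize when the assumption breaks.

```python
HOT_THRESHOLD = 3

class TieredFunction:
    def __init__(self, generic):
        self.generic = generic      # tier 0: always-correct generic path
        self.calls = 0
        self.specialized = None     # tier 1: code specialized to an observed type
        self.assumed_type = None

    def __call__(self, x):
        if self.specialized is not None:
            if type(x) is self.assumed_type:   # speculation guard
                return self.specialized(x)
            self.specialized = None            # deoptimize: assumption broke
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self._tier_up(type(x))
        return self.generic(x)

    def _tier_up(self, t):
        # Generate new executable code at runtime for the observed type.
        if t is int:
            ns = {}
            exec("def fast(x):\n    return x + x + x", ns)  # 3*x without dispatch
            self.specialized, self.assumed_type = ns["fast"], int

triple = TieredFunction(lambda x: 3 * x)
for _ in range(5):
    triple(2)                       # warms up, then runs the specialized tier
assert triple(2) == 6
assert triple("ab") == "ababab"     # type changed: deoptimize, fall back to generic
```

Every element of the list is present in miniature: code generated at runtime, specialization to observed behavior, invalidation when the speculation fails, and unchanged observable semantics throughout.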
The evidence is not only conceptual. V8 reports concrete benchmark effects: the Ignition/TurboFan pipeline improved Speedometer by 5-10% and reduced memory footprint by 5-10% in Chrome M59. The Maglev tier was reported as roughly 10 times faster to compile than TurboFan, while producing code faster than Sparkplug.
LLVM’s ORC JIT shows the same pattern as reusable infrastructure. ORC provides APIs to link relocatable object files into a target process at runtime and components that make it easier to add LLVM IR to a JIT’d process. This is not a lab curiosity. It is an official LLVM subsystem for runtime compilation.
Livepatching: Changing Running Systems Without Restarting
The operating-system version is livepatching.
The Linux kernel documentation describes livepatching as a way to fix critical functions without rebooting, by redirecting function calls. It also makes the mechanism explicit: kprobes, ftrace, and livepatching all relate to redirecting code execution, and all three approaches need to modify existing code at runtime.
Red Hat’s documentation describes the same production pattern: kernel live patching can patch a running kernel without rebooting or restarting processes, and patch modules provide original functions plus replacement function pointers.
This is practical evidence that controlled runtime code modification is already part of production operations. The kernel cannot simply “restart the process” because it is the process-like substrate under everything else. So the engineering problem becomes:
- identify safe patch points
- redirect execution
- preserve a consistency model
- avoid patching a function while a task is executing through an unsafe state
- expose enough metadata to reverse or replace patches when possible
That is self-modification with governance.
Dynamic Software Updating: Beyond Kernels
Dynamic software updating asks the same question for general applications: can we upgrade a running program’s code, data, and types without stopping it?
Michael Hicks, Jonathan Moore, and Scott Nettles proposed a system for C-like languages where dynamic patches contain both updated code and transition code from the old version to the new one. Their FlashEd web server implementation showed typical update overhead below 1%.
That result is important because it separates self-modification from chaos. The goal is not “the program edits itself whenever it wants.” The goal is a disciplined update protocol:
- define update points
- transform old state into new state
- verify safety of new native code
- use dynamic linking machinery
- preserve service continuity
Dynamic updating is self-modification as operations engineering.
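A stripped-down sketch of that protocol (a hypothetical structure, not the Hicks/Moore/Nettles system itself): code is only swapped at an explicit update point, and a transition function migrates old state into the new representation before the new code runs.

```python
class Updatable:
    def __init__(self, step, state):
        self.step, self.state = step, state
        self.pending = None

    def request_update(self, new_step, transform):
        self.pending = (new_step, transform)

    def handle(self, item):
        # Update point: swap code and transform state only between requests.
        if self.pending:
            new_step, transform = self.pending
            self.state = transform(self.state)   # migrate old state to new shape
            self.step, self.pending = new_step, None
        self.state = self.step(self.state, item)
        return self.state

# v1: state is a plain integer running total
def step_v1(state, item):
    return state + item

svc = Updatable(step_v1, 0)
svc.handle(1)
svc.handle(2)                # state == 3, service never stopped

# v2: state becomes a dict tracking both the sum and a request count
def step_v2(state, item):
    return {"sum": state["sum"] + item, "count": state["count"] + 1}

svc.request_update(step_v2, lambda old: {"sum": old, "count": 0})
result = svc.handle(4)       # update applied at a safe point, then v2 runs
assert result == {"sum": 7, "count": 1}
```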
Verification: Can Self-Modifying Code Be Proven Correct?
The usual objection is fair: if code changes while it runs, how can we reason about it?
Hongxu Cai, Zhong Shao, and Alexander Vaynberg’s “Certified Self-Modifying Code” answers by changing the model. Instead of assuming code memory is immutable, their framework treats program code as a regular mutable data structure. They present a Hoare-logic-like framework for machine code with runtime code manipulation, mechanize soundness in Coq, and certify examples that run on SPIM and stock x86 hardware.
That is stronger evidence than “it works on my machine.” It shows that self-modifying machine code can be brought into formal reasoning when the logic is designed for mutable code memory from the start.
Other verification work attacks the analysis problem from model checking. Touili and Ye model self-modifying programs with self-modifying pushdown systems and reduce LTL model checking to an emptiness problem for self-modifying Büchi pushdown systems. Their implementation detected several self-modifying malware samples, including samples missed by multiple well-known antivirus tools.
So the research story is not “self-modifying code is unverifiable.” It is more precise: conventional verification techniques often assume immutable code, so self-modifying systems need semantics and proof systems that make code memory explicit.
Malware: The Dark Proof of Practicality
Self-modifying code is also common in malware because it frustrates static analysis.
Fred Cohen’s “Computer Viruses: Theory and Experiments” framed viruses as programs that can modify other programs to include a possibly evolved copy of themselves. Later malware families used encryption, packing, polymorphism, and metamorphism to make each instance look different while preserving behavior.
This is evidence from the adversarial side. If self-modifying software were only theoretical, defenders would not need model checkers, dynamic analysis, unpackers, emulators, and behavior-based detection. Malware authors use self-modification because it changes the economics of analysis: the bytes on disk are not necessarily the instructions that matter at runtime.
The same mechanism can be benign or hostile. JITs generate code for speed. Malware mutates code for concealment. Livepatching redirects code for reliability. The difference is intent, control, and verification.
AI Agents and Self-Modifying Codebases
Modern AI coding agents introduce a new layer: source-level self-modification mediated by language models.
An agent can inspect a repository, edit files, run tests, observe failures, and edit again. That is not the same as a JIT rewriting machine code in memory, but it is self-modification at the software-system level. The codebase becomes both artifact and workspace; the agent becomes a program transformer operating in a feedback loop.
The theory still matters. A code-generating agent is powerful precisely because programs can represent and transform programs. But the hard part is not generating a diff. The hard part is ensuring that the new code improves the system under a trusted objective.
For AI-driven self-modification, the practical safety bar should look more like dynamic software updating and verified SMC than like an unconstrained mutation loop:
- version every change
- run tests and static checks
- constrain the edit surface
- require human approval for high-risk changes
- preserve rollback paths
- measure behavior, not just code shape
- keep secrets and deployment authority out of the rewrite loop
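That checklist can be compressed into a minimal sketch over an in-memory "codebase" (a dict of filename to source; all names here are invented, and a real system would run a test suite and static analyzers where `validate` sits). The agent edits a copy, validation gates promotion, and the old version survives for rollback.

```python
def governed_update(codebase, propose_edit, validate):
    candidate = dict(codebase)      # work on a copy, never the live version
    propose_edit(candidate)         # the agent's proposed self-modification
    if not validate(candidate):     # tests / static checks gate promotion
        return codebase, False      # reject: live code is untouched
    return candidate, True          # promote; caller keeps the old dict for rollback

code = {"mod.py": "def f(x):\n    return x + 1\n"}

def edit(cb):
    cb["mod.py"] = "def f(x):\n    return x + 2\n"

def validate(cb):
    ns = {}
    exec(cb["mod.py"], ns)
    return ns["f"](1) == 3          # measure behavior, not just code shape

code, ok = governed_update(code, edit, validate)
assert ok and "x + 2" in code["mod.py"]
```

The important design choice is that `validate` lives outside the edit surface: the thing being rewritten cannot rewrite its own evaluator.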
The lesson from Thompson is still relevant here. If a system can rewrite the tools that evaluate it, the evaluation pipeline becomes part of the trusted computing base.
What Is Proven, and What Is Not?
Here is the strongest version of the claim:
Proven in theory: self-reference and self-reproduction are guaranteed by fixed point results such as Kleene’s second recursion theorem.
Proven constructively: quines and compiler bootstrapping demonstrate finite, executable self-reference.
Proven in practice: JIT compilers, dynamic binary translation, livepatching, dynamic linking, and dynamic software updating generate, replace, or redirect executable code in production.
Proven by formal methods: self-modifying machine code can be modeled and certified when the semantics treats code memory as mutable.
Proven adversarially: malware uses self-modification because it works against static inspection and signature-based detection.
But a stronger claim, with far weaker support, is often smuggled in:
Not proven: that a self-modifying system will reliably improve itself.
Modification is easy. Improvement is hard. A system needs an objective, a search process, constraints, validation, rollback, and a way to prevent the evaluator from being corrupted by the thing it evaluates. Without that, self-modification is only motion.
The Practical Pattern
Across the literature, the safe pattern repeats:
- Separate the representation of code from the authority to execute it.
- Make the transformation explicit.
- Define when replacement is allowed.
- Preserve a consistency model for running state.
- Validate the generated or patched code.
- Keep an audit trail and rollback path.
That is true whether the mechanism is a JIT compiler, a kernel livepatch, a dynamic software update, or an AI coding agent.
Self-modifying software is not one thing. It is a family of techniques around a single idea: code can be data for another computation. The theory proves that self-reference is possible. The systems evidence proves that runtime modification is useful. The security literature proves that it is dangerous when trust boundaries are weak.
The future version will likely be less about programs randomly rewriting themselves and more about governed self-adaptation: systems that specialize, patch, repair, and extend themselves inside a measured, reviewable loop.
The magic is real. The discipline is what makes it engineering.
Sources
- A Swiss Pocket Knife for Computability - Neil D. Jones
- Theory of Self-Reproducing Automata - John von Neumann
- Partial Evaluation of Computation Process, An Approach to a Compiler-Compiler - Yoshihiko Futamura
- Reflections on Trusting Trust - Ken Thompson lecture notes
- Certified Self-Modifying Code - Hongxu Cai, Zhong Shao, Alexander Vaynberg
- Launching Ignition and TurboFan - V8 team
- Maglev - V8’s Fastest Optimizing JIT - V8 team
- ORC Design and Implementation - LLVM documentation
- Livepatch - Linux kernel documentation
- Applying patches with kernel live patching - Red Hat documentation
- Dynamic Software Updating - Michael Hicks, Jonathan T. Moore, Scott Nettles
- Computer Viruses: Theory and Experiments - Fred Cohen
- LTL Model Checking of Self Modifying Code - Tayssir Touili, Xin Ye
- Self-Modifying Code in Open-Ended Evolutionary Systems - Patrik Christen