Formal verification integration
Why this matters
Production systems need mathematical guarantees for security-critical or safety-critical code. Qwen can generate formal specifications and invariants that Z3 can verify, catching bugs that testing misses, without requiring you to write complex theorem proofs by hand.
Explanation
Formal verification proves code correctness mathematically by translating it into logical formulas that an SMT (Satisfiability Modulo Theories) solver can check. Qwen's role is to read your code and draft formal specifications (pre- and post-conditions, loop invariants, and security properties) in a language Z3 understands. You don't write the proofs: Qwen writes the specification and Z3 checks it.
Mechanically: you give Qwen a function and state what you want verified (e.g., 'this function never divides by zero'). Qwen generates a Z3 script that encodes the negation of that property as a constraint. Z3 then searches for a counterexample, that is, an input that violates the property. If Z3 reports that no such input exists ('unsat'), the property holds for every input the encoding covers.
When to use it: after unit testing and before production deployment of safety-critical components. Use it alongside code review, not as a replacement for it. Formal verification is strongest for pure functions with clear mathematical properties (encryption, sorting, financial calculations).
Analogy
Testing is like spot-checking a bridge at 50 points. Formal verification is like checking every possible configuration of stress and material, not just the ones you thought to test. Qwen drafts the specification; Z3 checks the infinitely many cases mathematically.
Code
from openai import OpenAI
import z3

# Assumes Qwen is served behind an OpenAI-compatible endpoint at this URL.
client = OpenAI(api_key="sk-your-key-here", base_url="http://localhost:8000/v1")


def get_formal_spec(function_code: str, property_desc: str) -> str:
    """Ask Qwen to generate a Z3 specification for a code property."""
    prompt = f"""Analyze this Python function and generate a Z3 Python script that formally verifies this property.
Property to verify: {property_desc}
Function:
{function_code}
Generate ONLY valid Python Z3 code. Use z3.Int, z3.solve, z3.Implies. Start directly with imports and solver setup. No markdown, no explanation text."""
    # The OpenAI client exposes chat models through chat.completions.
    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-7B-Instruct",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
        max_tokens=1024,
    )
    return response.choices[0].message.content


def verify_property(z3_spec: str) -> tuple[bool, str]:
    """Execute the Z3 specification and return (ran_without_error, message).

    True only means the script executed without raising. The sat/unsat
    verdict is whatever the script itself prints.
    """
    try:
        # exec() runs arbitrary generated code: review the spec first.
        namespace = {"z3": z3}
        exec(z3_spec, namespace)
        return True, "Specification ran successfully"
    except Exception as e:
        return False, str(e)


# Example function to verify
function_code = """
def safe_divide(a: int, b: int) -> float:
    '''Divide a by b, but never allow division by zero.'''
    if b == 0:
        return 0.0
    return a / b
"""
property_to_check = "For all inputs a and b, the function never raises an exception or returns infinity"

print("=== Requesting formal specification from Qwen ===")
spec = get_formal_spec(function_code, property_to_check)
print(f"Generated Z3 specification:\n{spec}\n")

print("=== Verifying with Z3 ===")
is_valid, result = verify_property(spec)
print(f"Verification result: {result}")
print(f"Property verified: {is_valid}")

Output
=== Requesting formal specification from Qwen ===
Generated Z3 specification:
from z3 import *
a = Int('a')
b = Int('b')
solver = Solver()
# Precondition: b can be any integer
# Post-condition: result is never infinity and no exception is raised
# Constraint: if b != 0, result = a / b (representable in float)
# If b == 0, result = 0.0
solver.add(Implies(b != 0, And(a >= -2147483648, a <= 2147483647)))
solver.add(Implies(b == 0, True)) # No exception when b == 0
if solver.check() == sat:
    model = solver.model()
    print(f"Model: a={model[a]}, b={model[b]}")
else:
    print("Unsatisfiable: property violated")
=== Verifying with Z3 ===
Verification result: Specification ran successfully
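Property verified: True

What just happened?
The code sent a Python function and a natural-language property to Qwen through an OpenAI-compatible API and asked for a Z3 script that encodes the property as logical constraints. Qwen returned a script (here, one meant to check that division by zero cannot occur). We then ran that script with exec() in its own namespace and caught any runtime errors. Two caveats apply. First, exec() with a separate namespace is not a sandbox, because the generated code still runs with the full privileges of your process. Second, verify_property only reports whether the script ran without raising; it does not read the solver's verdict. To interpret the verdict, the script must assert the negation of the property: 'unsat' (unsatisfiable) then means no violating input exists and the property is proven, while 'sat' with a model means Z3 found a counterexample. The generated script above adds constraints that are trivially satisfiable rather than the negated property, so its result proves nothing about safe_divide.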
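Because exec() runs the generated script inside the current process, a hang or crash in the script takes the caller down with it. A minimal sketch of one alternative is to run the script in a child Python process with a timeout and read the verdict from its output. The function name `run_spec_isolated` and the convention that the script prints `sat`, `unsat`, or `unknown` as its last line are assumptions for this example, not something the original prompt enforces. A child process limits the damage from hangs and crashes but is still not a security sandbox.

```python
import subprocess
import sys


def run_spec_isolated(z3_spec: str, timeout_s: float = 30.0) -> tuple[str, str]:
    """Run a generated script in a child Python process.

    Returns (verdict, details) where verdict is one of:
    'sat', 'unsat', 'unknown', 'unrecognized', 'error', 'timeout'.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", z3_spec],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    if proc.returncode != 0:
        # The script raised or exited abnormally; stderr holds the traceback.
        return "error", proc.stderr
    lines = proc.stdout.strip().splitlines()
    last_line = lines[-1].strip() if lines else ""
    # Assumed convention: the script's final printed line is the solver verdict.
    if last_line in {"sat", "unsat", "unknown"}:
        return last_line, proc.stdout
    return "unrecognized", proc.stdout
```

With this shape, the caller can treat anything other than 'unsat' as "not proven" instead of equating "ran without error" with "verified".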
Common gotcha
Developers assume Qwen's generated Z3 code is correct and complete. It often isn't: Qwen may forget edge cases, misinterpret the property, or generate Z3 code with syntax errors. Always inspect the generated spec before trusting it. A single typo in the Z3 script means the verification result is meaningless. Test Qwen's output against hand-written reference specs for critical functions.
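Some of these failures can be caught mechanically before the script is ever executed. The sketch below uses only the standard library. It strips a wrapping markdown fence if the model added one despite the prompt, confirms the text parses as Python, and checks that the script calls a `.check()` method at all. The helper names `clean_and_check` and `calls_check` are hypothetical. These checks only catch syntax and obvious omissions; whether the constraints actually encode the property still needs human review.

```python
import ast

FENCE = "`" * 3  # markdown code-fence marker


def clean_and_check(raw: str) -> str:
    """Strip a wrapping markdown fence, then confirm the text parses as Python."""
    lines = raw.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    code = "\n".join(lines)
    ast.parse(code)  # raises SyntaxError if the generated code is malformed
    return code


def calls_check(code: str) -> bool:
    """Return True if the script calls some `.check()` method (e.g. solver.check())."""
    for node in ast.walk(ast.parse(code)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "check"
        ):
            return True
    return False
```

A spec that fails `clean_and_check` or for which `calls_check` returns False can be rejected or regenerated without running it.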
Error recovery
z3.Z3Exception
exec() fails with NameError
Solver returns sat but you expected unsat
Experienced dev note
Formal verification is not a substitute for understanding your code. Use it to *prove* properties you've already reasoned about informally. If you can't explain why a function should be correct, Qwen won't generate a meaningful spec. Also, Z3 scales poorly on complex functions with loops or recursion, so keep verified functions small and pure. In production, cache the Z3 verification results (they don't change as long as the code and the spec don't), so you're not re-verifying on every deploy.
Check your understanding
If Qwen generates a Z3 spec for your access-control function and Z3 returns 'sat' with a model showing a=admin, b=restricted_resource, what does that mean for your security property, and what should you do next?
Answer hint
A 'sat' result with a concrete model means Z3 found inputs that satisfy the script's constraints. If the script asserts the negation of your security property, that model is a counterexample: the property fails on those specific inputs. First confirm that the spec models the function correctly, then patch the function, regenerate the spec, and re-verify.
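One way to tell a real bug from a mis-modeled spec is to replay the model's values against the actual function. The sketch below reuses safe_divide from the earlier example. The `replay` helper and the counterexample values are hypothetical, chosen only to show the mechanics.

```python
import math


def safe_divide(a: int, b: int) -> float:
    """Divide a by b, but never allow division by zero."""
    if b == 0:
        return 0.0
    return a / b


def replay(func, inputs: dict, holds) -> tuple[bool, str]:
    """Call func with the counterexample inputs and report whether the property holds."""
    try:
        result = func(**inputs)
    except Exception as exc:
        return False, f"raised {exc!r}"
    return bool(holds(result)), f"returned {result!r}"


# Hypothetical values, as they might appear in a Z3 model.
counterexample = {"a": 7, "b": 0}
ok, detail = replay(safe_divide, counterexample, math.isfinite)
print(ok, detail)  # True returned 0.0
```

Here the property holds on the real function for the reported inputs, which points at the spec rather than the code. If the replay had failed instead, the counterexample would be a ready-made regression test.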