Code Intermediate medium · 7 min

Formal verification integration

What you will learn

Use Qwen to draft formal specifications for critical code paths, then check them with the Z3 SMT solver.

Why this matters

Production systems need mathematical guarantees for security-critical or safety-critical code. Qwen can draft formal specifications and invariants that Z3 then checks, catching bugs that testing misses without requiring you to write theorem proofs by hand.

Skip if: You are doing rapid prototyping, UI code, or business logic with loose requirements. Formal verification is expensive in development time. Reserve it for cryptographic functions, financial calculations, access control, or consensus algorithms where a bug costs real money or safety.

Explanation

Formal verification proves code correctness mathematically by translating the code into logical formulas that an SMT (Satisfiability Modulo Theories) solver can check. Qwen's role is to read your code and generate formal specifications (pre- and post-conditions, loop invariants, and security properties) in a language Z3 understands. You don't write the proofs: Qwen writes the specification and Z3 checks it.

Mechanically: You provide Qwen with a function and state what you want verified (e.g., 'this function never divides by zero'). Qwen generates a Z3 script that encodes the function's behavior and asserts the negation of that property. Z3 then searches for a counterexample, that is, an input that violates the property. If the solver reports 'unsat', no counterexample exists and the property holds for all inputs, to the extent that the encoding faithfully models the code.

When to use it: After unit testing and before production deployment of safety-critical components. Use it alongside code review, not as a replacement for it. Formal verification is strongest for pure functions with clear mathematical properties (encryption, sorting, financial calculations).

Analogy

Testing is like spot-checking a bridge at 50 points. Formal verification is like checking every possible configuration of stress and material, not just the ones you thought to test. Qwen drafts the specification; Z3 checks the whole space of cases mathematically.

Code

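Illustrative only: not runnable without the openai and z3-solver packages and an OpenAI-compatible endpoint serving Qwen (the base_url below assumes one at localhost:8000)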

python

from openai import OpenAI
import z3
import re

client = OpenAI(api_key="sk-your-key-here", base_url="http://localhost:8000/v1")

def get_formal_spec(function_code: str, property_text: str) -> str:
    """Ask Qwen to generate a Z3 specification for a code property."""
    prompt = f"""Analyze this Python function and generate a Z3 Python script that formally verifies this property.
Property to verify: {property_text}

Function:
{function_code}

Generate ONLY valid Python Z3 code. Use z3.Int, z3.solve, z3.Implies. Start directly with imports and solver setup. No markdown, no explanation text."""

    # The OpenAI SDK exposes chat models through chat.completions
    # (client.messages is the Anthropic SDK's interface).
    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-7B-Instruct",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
        max_tokens=1024,
    )
    spec = response.choices[0].message.content or ""
    # Models sometimes wrap code in a markdown fence despite the instruction; strip it.
    return re.sub(r"^`{3}[\w-]*\n|\n?`{3}\s*$", "", spec.strip())

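def verify_property(z3_spec: str) -> tuple[bool, str]:
    """Execute the generated Z3 script and return (ran_without_error, message).

    True only means the script executed without raising. The sat/unsat
    verdict is whatever the script itself prints. exec() is not a sandbox,
    so only run specs you have reviewed.
    """
    try:
        namespace = {"z3": z3}
        exec(z3_spec, namespace)
        return True, "Specification ran successfully"
    except Exception as e:
        return False, str(e)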

# Example function to verify
function_code = """
def safe_divide(a: int, b: int) -> float:
    '''Divide a by b, but never allow division by zero.'''
    if b == 0:
        return 0.0
    return a / b
"""

property_to_check = "For all inputs a and b, the function never raises an exception or returns infinity"

print("=== Requesting formal specification from Qwen ===")
spec = get_formal_spec(function_code, property_to_check)
print(f"Generated Z3 specification:\n{spec}\n")

print("=== Verifying with Z3 ===")
is_valid, result = verify_property(spec)
print(f"Verification result: {result}")
print(f"Property verified: {is_valid}")

Output

=== Requesting formal specification from Qwen ===
Generated Z3 specification:
from z3 import *

a = Int('a')
b = Int('b')

solver = Solver()

# Precondition: b can be any integer
# Post-condition: result is never infinity and no exception is raised

# Constraint: if b != 0, result = a / b (representable in float)
# If b == 0, result = 0.0

solver.add(Implies(b != 0, And(a >= -2147483648, a <= 2147483647)))
solver.add(Implies(b == 0, True))  # No exception when b == 0

if solver.check() == sat:
    model = solver.model()
    print(f"Model: a={model[a]}, b={model[b]}")
else:
    print("Unsatisfiable: property violated")

=== Verifying with Z3 ===
Verification result: Specification ran successfully
Property verified: True

What just happened?

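The code sent a Python function and a natural-language property to Qwen through an OpenAI-compatible API, asking for a Z3 script that encodes the property as logical constraints. Qwen returned a script, which we then ran with exec() in its own namespace dictionary, catching any runtime errors. That dictionary is not a sandbox: the generated code runs with the full privileges of your process. Read the result carefully, too. verify_property returns True whenever the script runs without raising, so "Property verified: True" here only confirms that the script executed. The sample script asserts its constraints directly, so 'sat' just means those constraints are satisfiable, and its "Unsatisfiable: property violated" branch proves nothing either way. For a real proof, the script must assert the negation of the property: 'unsat' then means no counterexample exists and the property holds, while 'sat' comes with a model that is a counterexample found by Z3.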
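A namespace dictionary passed to exec() does not isolate generated code from the host process. One option, sketched below using only the standard library, is to run the script in a child interpreter with a timeout. This gives process separation and a bound on solver run time; it is not a security sandbox. The function name run_spec_in_subprocess is hypothetical, as is the convention that the script prints the result of solver.check() on its last output line.

```python
import subprocess
import sys

def run_spec_in_subprocess(spec: str, timeout_s: float = 30.0) -> tuple[str, str]:
    """Run a generated script in a child interpreter; return (verdict, detail)."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", spec],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    if completed.returncode != 0:
        # The script raised (SyntaxError, NameError, a Z3 error, ...).
        return "error", completed.stderr.strip()
    lines = completed.stdout.strip().splitlines()
    last = lines[-1].strip() if lines else ""
    # Assumed convention: the last line is the printed solver.check() result.
    verdict = last if last in {"sat", "unsat", "unknown"} else "unparsed"
    return verdict, completed.stdout

# Stand-ins for generated scripts, so the sketch runs without Z3 installed.
verdict, detail = run_spec_in_subprocess("print('unsat')")
failed_verdict, failed_detail = run_spec_in_subprocess("raise ValueError('boom')")
print(verdict, failed_verdict)
```

Returning the verdict as a string keeps "the script ran" and "the property was proven" separate, which the in-process version conflates.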

Common gotcha

Developers assume Qwen's generated Z3 code is correct and complete. It often isn't: Qwen may forget edge cases, misinterpret the property, or generate Z3 code with syntax errors. Always inspect the generated spec before trusting it. A single typo in the Z3 script means the verification result is meaningless. Test Qwen's output against hand-written reference specs for critical functions.
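Some of this inspection can be automated before anything is executed. The sketch below uses only the standard library: it strips a markdown fence if the model added one, checks that the script parses, and applies two rough heuristics (is anything negated, is the solver ever checked). The function names and the heuristics are illustrative assumptions. They flag suspicious specs for a human to review and are not a correctness check.

```python
import ast
import re

def clean_generated_spec(raw: str) -> str:
    """Strip a surrounding markdown fence, if present, from model output."""
    text = raw.strip()
    text = re.sub(r"^`{3}[\w-]*\n", "", text)   # opening fence with optional language tag
    text = re.sub(r"\n?`{3}\s*$", "", text)     # closing fence
    return text.strip()

def precheck_spec(spec: str) -> list[str]:
    """Return a list of problems found without executing the script."""
    try:
        tree = ast.parse(spec)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]
    # Collect the names of everything the script calls, e.g. Solver, Not, check.
    called = {
        node.func.attr if isinstance(node.func, ast.Attribute)
        else getattr(node.func, "id", "")
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
    }
    problems = []
    if "Not" not in called:
        problems.append("no Not(...) call: the property may not be negated")
    if "check" not in called:
        problems.append("no check() call: the solver is never run")
    return problems

fence = "`" * 3
raw = f"{fence}python\nfrom z3 import *\ns = Solver()\nprint(s.check())\n{fence}"
spec = clean_generated_spec(raw)
print(precheck_spec(spec))
```

A spec that passes these checks can still encode the wrong property, so comparison against a hand-written reference spec remains the stronger test.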

Error recovery

z3.Z3Exception

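Qwen generated code that misuses the Z3 API, for example by combining expressions of incompatible sorts. (Code that is not valid Python at all surfaces as a SyntaxError instead.) Inspect the spec string and regenerate with a clearer property statement. Provide an example of valid Z3 code in the prompt.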
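Regeneration can be wrapped in a small retry loop that feeds the error back into the prompt. The sketch below is one way to do it, under stated assumptions: generate is any callable that maps a prompt to a spec string (for instance a wrapper around the model call), validate returns an error message or None, and both names are hypothetical. The validator shown catches only Python syntax errors; one that runs the script and catches z3.Z3Exception could be swapped in.

```python
def syntax_error_or_none(spec):
    """Return a short error message if the spec is not valid Python, else None."""
    try:
        compile(spec, "<generated-spec>", "exec")
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"
    return None

def generate_with_retries(generate, base_prompt, validate, max_attempts=3):
    """Call generate() until validate() accepts the result or attempts run out."""
    prompt = base_prompt
    error = None
    for _ in range(max_attempts):
        spec = generate(prompt)
        error = validate(spec)
        if error is None:
            return spec
        # Show the model what went wrong on the previous attempt.
        prompt = (
            f"{base_prompt}\n\nYour previous attempt failed with:\n{error}\n"
            "Return a corrected script."
        )
    raise RuntimeError(f"no valid spec after {max_attempts} attempts: {error}")

# Canned responses stand in for the model so the sketch runs offline.
canned = iter(["solver.add(Not(b != 0", "solver.add(Not(b != 0))"])
prompts_seen = []

def fake_generate(prompt):
    prompts_seen.append(prompt)
    return next(canned)

spec = generate_with_retries(fake_generate, "Generate a Z3 script.", syntax_error_or_none)
print(spec)
```

Capping the attempts matters: a model that keeps producing the same broken script should fail loudly and not loop.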

exec() fails with NameError

The generated spec references undefined names such as 'Implies' without importing them from z3. Pass a namespace dict that already contains the z3 names the script needs, or ask Qwen to always include all imports explicitly.

Solver returns sat but you expected unsat

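The property is not proven: assuming the spec asserts the negation of the property, Z3 found a counterexample. Either the property is stated too strongly (for example, a precondition is missing), Qwen misunderstood the requirement, or the function has a real bug. Check the model() output to see which input breaks it.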
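To tell a real bug from a faulty spec, replay the counterexample against the actual function. The sketch below assumes the model's values have already been copied into a plain dict (with Z3 integers, model[x].as_long() yields a Python int). The replay helper and the unsafe_divide variant are hypothetical names introduced for illustration.

```python
import math

def safe_divide(a: int, b: int) -> float:
    """The function from the example: returns 0.0 when b is zero."""
    if b == 0:
        return 0.0
    return a / b

def unsafe_divide(a: int, b: int) -> float:
    """Hypothetical buggy variant with no guard."""
    return a / b

def replay(func, inputs: dict) -> tuple[str, object]:
    """Call func with the counterexample inputs and classify what happens."""
    try:
        value = func(**inputs)
    except Exception as exc:
        return "raised", exc
    if isinstance(value, float) and not math.isfinite(value):
        return "non-finite", value
    return "ok", value

counterexample = {"a": 1, "b": 0}   # values as they might come out of a Z3 model

spurious = replay(safe_divide, counterexample)
real_status, real_exc = replay(unsafe_divide, counterexample)
print(spurious)                       # the real function handles this input
print(real_status, type(real_exc).__name__)
```

If the function behaves correctly on the reported input, the spec is at fault and should be regenerated or corrected; if it misbehaves, the counterexample is a ready-made regression test.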

Experienced dev note

Formal verification is not a substitute for understanding your code. Use it to *prove* properties you've already reasoned about informally. If you can't explain why a function should be correct, Qwen won't generate a meaningful spec. Also, Z3 scales poorly on complex functions with loops or recursion, so keep verified functions small and pure. In production, cache the Z3 verification results, keyed on the code and the spec (the result doesn't change if neither does), so you're not re-verifying on every deploy.
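A cache for verification results can be as simple as a JSON file keyed by a hash. The sketch below is one possible layout, not a prescribed one: the class and function names are hypothetical, and including the solver version in the key is an added assumption (a different Z3 release could in principle return 'unknown' where another returned 'unsat'). The version string here is a placeholder; with Z3 installed it could come from z3.get_version_string().

```python
import hashlib
import json
import tempfile
from pathlib import Path

def verification_key(function_source: str, spec: str, solver_version: str) -> str:
    """Hash everything the verdict depends on into one cache key."""
    payload = json.dumps([function_source, spec, solver_version])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class VerificationCache:
    """Tiny JSON-file cache mapping verification keys to verdict strings."""

    def __init__(self, path: Path):
        self.path = path
        self.entries = json.loads(path.read_text()) if path.exists() else {}

    def get(self, key: str):
        return self.entries.get(key)

    def put(self, key: str, verdict: str) -> None:
        self.entries[key] = verdict
        self.path.write_text(json.dumps(self.entries, indent=2))

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "verified.json"
    cache = VerificationCache(cache_file)

    key_v1 = verification_key("def f(b): return 1 / b", "spec text", "placeholder-version")
    cache.put(key_v1, "unsat")

    # A fresh instance reads the stored verdict back from disk.
    reloaded = VerificationCache(cache_file)
    cached = reloaded.get(key_v1)

    # Any change to the source produces a new key, forcing re-verification.
    key_v2 = verification_key("def f(b): return 1 / (b + 1)", "spec text", "placeholder-version")
    missing = reloaded.get(key_v2)

print(cached, missing)
```

Hashing the spec along with the code means a regenerated spec is never mistaken for one that was already checked.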

Check your understanding

If Qwen generates a Z3 spec for your access-control function and Z3 returns 'sat' with a model showing a=admin, b=restricted_resource, what does that mean for your security property, and what should you do next?

Answer hint

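A 'sat' result with a concrete model means Z3 found an input that violates the property as encoded, i.e., a counterexample. Either your function has a bug on those specific inputs or the spec misstates the requirement. Reproduce the input against the real function first; if the bug is real, patch the function, regenerate the spec, and re-verify.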

VERSION This lesson assumes Qwen2.5-Instruct or later, which generate Z3 code more reliably. Earlier versions (Qwen2 and below) often generate broken Z3 syntax.

Once you've verified individual functions, learn how to compose verified properties: proving that if function A is correct and function B is correct, their pipeline maintains an invariant across the system.
