How to fix your application logs with AI
More than 57% of developers' time is spent in war rooms solving application performance issues rather than building new features (Cisco/Splunk Developer Survey, 2024). That's not a skills problem. It's a logging problem.
The logs your application writes today were crafted for the developer sitting at the keyboard. Someone who knows the code structure, remembers which module emits which message, and understands the internal state machine. When an incident hits at 02:00 in the morning, the person staring at those logs isn't that developer. It's an on-call engineer who doesn't know your code, or an AI agent parsing them for patterns.
This article walks you through a copy-paste AI prompt that audits all logging in a Python application, rewrites vague messages into actionable ones, adds missing log statements, and generates a LOGS.md inventory.
You can run it in one pass.
Key Takeaways
- Developers spend more than 57% of their time solving application performance issues rather than building features, often because logs lack the context needed for fast triage (Cisco/Splunk, 2024).
- Only 13% of collected telemetry is ever used; the remaining 87% is noise (Sawmills, 2025).
- A single AI prompt can audit, rewrite, and document all log statements in a Python codebase in one pass.
- The output is a ready-to-commit PR with improved logs, flagged ambiguities, and a LOGS.md inventory file.
Why your logs are useless to everyone who needs them
In 2025, the average enterprise spends $905,000 annually on observability, yet only 13% of collected telemetry is ever used (Sawmills, 2025). The gap isn't tooling. It's the quality of what gets logged in the first place.
Developers write logs for themselves.
A message like "Connection failed" or "Invalid state" tells anyone who isn't the author almost nothing.
What failed? Was it transient or permanent? Which database, host, or tenant is affected? Should the operator restart a service, check a certificate, or just wait?
On the other side of the incident response chain, DevOps engineers and SREs need the opposite: logs written as operational instructions. They need each message to answer what happened, whether it signals a real problem, what resource is affected, the likely cause, and what to check next.
The result of this mismatch is expensive. Alert fatigue is the number-one obstacle to faster incident response, outpacing the next closest obstacle by nearly a 2:1 margin (Grafana Labs, 2025). And 73% of organizations report experiencing outages directly tied to ignored or suppressed alerts, essentially a consequence of noisy logs (Splunk, 2025).
The six rules of operational logging
Every log statement should survive the "02:00-in-the-morning test": can someone who has never seen the code understand what happened and know what to do next? That means each message must answer five questions and carry structured fields to back them up.
1. What happened. The event, not the code path. Instead of handle_zk_callback(), write "ZooKeeper connection state changed."
2. Is this a real problem? The log level and message together should make it clear whether this is normal behavior, a warning worth investigating, or a failure requiring action.
3. What is affected? Include the component, resource, tenant, connection, request, or operation. A log about a database connection is useless without the database name.
4. What's the likely cause? If the code can infer why something happened, include it. "Connection refused by ZooKeeper at 10.17.165.253:2181" is far more useful than "Connection failed".
5. What should the operator do next? When possible, include a hint: "ZooKeeper calls are now blocking, check network connectivity to 10.17.165.253:2181."
6. Structured fields, not just text. A human-readable message is necessary but not sufficient. Useful structured fields include:
| Field category | Examples |
|---|---|
| Identity | tenant, request_id, trace_id, correlation_id |
| Resource | host, endpoint, database, queue, topic, file |
| Operational | status, retry_count, timeout, duration_ms, size, count, offset |
| Error details | error_type, error_code, underlying exception class |
Here's a concrete example from the ASAB framework, which powers TeskaLabs LogMan.io:
19-Jun-2026 18:47:11.649010 WARNING asab.zookeeper.container [sd state="SUSPENDED" node="10.17.165.253:2181" session_id="session-1defih2145754"] ZooKeeper connection state changed. Zookeeper calls are now blocking!
This log passes the test.
It tells you the state (SUSPENDED), the affected node, the session ID, and the consequence (calls are now blocking), and it carries structured fields in brackets that a parser can extract.
The module, asab.zookeeper.container, is already conveyed by the logger name, so the message does not need to repeat it.
Compare that with something common:
ERROR: Connection failed
That tells you nothing actionable.
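To make the contrast concrete, here is a minimal sketch of how a Python service might emit a line in a similar bracketed layout using only the standard `logging` module. It is not ASAB's implementation: `BracketFieldFormatter`, the logger name `example.zookeeper`, and the exact output layout are assumptions made for illustration, and the sketch ignores timestamps and exception text.

```python
import io
import logging

# Attributes every LogRecord carries by default; anything else arrived via `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
    "message",
    "asctime",
}


class BracketFieldFormatter(logging.Formatter):
    """Render `extra` fields as [key="value" ...] before the message (hypothetical layout)."""

    def format(self, record: logging.LogRecord) -> str:
        fields = {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        rendered = " ".join(f'{key}="{value}"' for key, value in fields.items())
        prefix = f"[{rendered}] " if rendered else ""
        return f"{record.levelname} {record.name} {prefix}{record.getMessage()}"


# Write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(BracketFieldFormatter())

log = logging.getLogger("example.zookeeper")
log.addHandler(handler)
log.propagate = False

log.warning(
    "ZooKeeper connection state changed. ZooKeeper calls are now blocking; "
    "check network connectivity to %s.",
    "10.17.165.253:2181",
    extra={"state": "SUSPENDED", "node": "10.17.165.253:2181"},
)
line = stream.getvalue()
print(line, end="")
```

The message answers the five questions in plain language, while the `extra` dictionary carries the same facts as fields a parser can extract without regular expressions over free text.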
A ready-to-use AI prompt for log auditing
The fastest way to do this is with an agentic coding tool: an AI assistant that reads your whole repository, edits files across it, and opens a pull request on its own. Tools like Cursor, Claude Code, and OpenAI Codex turn log auditing from a manual file-by-file chore into a single agentic task: you state the goal once, and the agent works autonomously across the codebase until every log statement meets the bar. Because these agents operate on the entire project rather than one open file, they can infer context from surrounding code, find silent failure branches, and stay within your existing logging conventions.
Here's a prompt you can paste into any of these assistants. Set it as a /goal or /loop instruction and point it at your codebase.
# Logging improvement
Goal:
Improve all logging so that logs are useful for external DevOps users
and AI Agents who do not know the internal code structure.
In this application:
1. Find all log statements and log-producing branches.
Cover at least:
- CRITICAL - causing the application to stop
- ERROR - causing the application to malfunction
- WARNING - the application is working OK, this is typically
an issue with data or user input
- NOTICE - the application has something important to log that
should be visible in non-verbose mode
- INFO - the application log is visible only in verbose mode
2. Review each log message. Rewrite it so that it clearly explains:
- what happened
- whether this indicates a real problem or normal behavior
- what component, resource, tenant, connection, request, job, or
operation is affected
- what the likely cause is, if it can be inferred
- what the external DevOps user should check or do next
3. Preserve and improve structured logging.
Logs must include:
- a clear human-readable message
- useful key-value fields that expose relevant operational context
Prefer fields such as:
- tenant, where applicable
- request ID / correlation ID / trace ID, where available
- resource identifiers
- endpoint, host, topic, queue, file, path, or database name
- status/result
- retry count, timeout, duration, size, count, offset, or similar
operational values
- underlying error details
4. Check for missing logs.
Review important code branches and verify that failures, retries,
fallbacks, ignored conditions, degraded states, skipped work, and
unexpected states are logged.
If a branch should emit a log but does not, add an appropriate
structured log.
5. Make the logs actionable.
Avoid vague messages such as:
- "failed"
- "error occurred"
- "invalid state"
- "cannot continue"
Replace them with messages that help troubleshooting:
- what failed
- why it likely failed
- what input or dependency was involved
- whether retry is expected
- what the operator should inspect next
6. Do not expose unnecessary internal implementation details.
The log should help an external DevOps user operate and
troubleshoot the system, not require them to understand internal
code architecture.
7. Keep changes safe and minimal.
Do not change control flow unless required to add missing logging.
Preserve existing structured-log conventions used in the repository.
DO NOTs:
- When logging the exception i.e. `L.exception(...)`, do not
include `str(e)`
- Do not repeat info that is available from logger, such as
`service` and `operation`
- Do not change business logic.
Output:
1. The changes in the code, so that they can be committed to a new PR.
This includes logs improved and missing log branches added.
2. Identification of places where logging is still ambiguous.
3. New or updated LOGS.md file in the root of the repository with a
list of all CRITICAL, ERROR, and WARNING logs.
The prompt has seven phases that the LLM executes sequentially:
| Phase | What the AI does |
|---|---|
| Find | Scans every logging call and every branch that produces output |
| Review | Rewrites each message using the five questions above |
| Structure | Ensures key-value fields carry operational context |
| Gap-check | Adds logs to failure, retry, and fallback branches that were silent |
| Actionable | Replaces vague messages with specific, troubleshooting-oriented text |
| External view | Strips internal implementation details an operator doesn't need |
| Safe changes | Preserves control flow and existing logging conventions |
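The `str(e)` rule in the prompt's DO NOTs follows from how Python's standard `logging` module handles exceptions: `Logger.exception()` logs at ERROR level and appends the traceback, which already ends with the exception type and text. The sketch below illustrates that behavior; the logger name `example.storage`, the `load_config` helper, and the file path are hypothetical.

```python
import io
import logging

# Capture output in memory so it can be inspected.
stream = io.StringIO()
L = logging.getLogger("example.storage")
L.addHandler(logging.StreamHandler(stream))
L.propagate = False


def load_config(path: str) -> str:
    """Hypothetical helper used only to trigger an exception."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError:
        # No str(e) in the message: L.exception() appends the traceback,
        # which already ends with the exception type and its text.
        L.exception(
            "Cannot read configuration file. Check that the file exists and is readable.",
            extra={"path": path},
        )
        return ""


load_config("/nonexistent/app.conf")
output = stream.getvalue()
print(output)
```

Repeating `str(e)` in the message would only duplicate what the traceback already shows, which is why the prompt asks the agent to leave it out.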
The output you get back is concrete: a diff-ready code change, a list of places where logging is still ambiguous, and a LOGS.md file cataloging every CRITICAL, ERROR, and WARNING log in your project.
That inventory alone is worth the effort.
It's the single source of truth for anyone on-call who needs to know what each log means.
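If you want to cross-check the generated inventory, a rough scan with Python's `ast` module can list the WARNING-and-above calls the agent should have covered. This is a heuristic sketch, not part of the prompt: `inventory_rows` and the table layout are assumptions, and the scan matches any method named `warning`, `error`, `exception`, or `critical`, so it can report calls that are not logging at all.

```python
import ast

# Method names that produce WARNING-or-higher records in the standard logging API.
INVENTORY_LEVELS = {"critical", "error", "exception", "warning"}


def inventory_rows(source: str, filename: str) -> list[str]:
    """Return Markdown table rows for WARNING-and-above logging calls in `source`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Match calls shaped like <something>.warning(...), <something>.error(...), etc.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in INVENTORY_LEVELS:
            continue
        first = node.args[0] if node.args else None
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            message = first.value
        else:
            message = "(dynamic message)"
        # Logger.exception() logs at ERROR level.
        level = "ERROR" if node.func.attr == "exception" else node.func.attr.upper()
        found.append((node.lineno, f"| {level} | {filename}:{node.lineno} | {message} |"))
    # ast.walk() does not guarantee source order, so sort by line number.
    return [row for _, row in sorted(found)]


sample = '''
L.warning("Disk usage above threshold.", extra={"path": "/data"})
L.info("Started.")
L.error("Cannot connect to database.", extra={"database": "orders"})
'''
rows = inventory_rows(sample, "app.py")
print("| Level | Location | Message |\n|---|---|---|")
print("\n".join(rows))
```

Any row this scan finds that is missing from LOGS.md is worth a second look during review.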
How it works under the hood
Large language models are good at this task for a straightforward reason: logging is fundamentally a natural language problem wrapped in code. The AI reads each log statement, understands the surrounding code to infer context, and rewrites the message as though it were explaining the event to someone else.
The prompt doesn't guess at your architecture. It reads your existing logging conventions (the logger names, the structured field patterns, the severity levels) and improves within those boundaries. That's why rule 7 is critical: "Keep changes safe and minimal." The result is a PR that touches only logging, not business logic.
Here's what a typical before-and-after looks like:
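# Before
logger.error("Database connection failed")

# After
db = config["db"]
logger.error(
    # %-style arguments are interpolated by the standard logging module;
    # "{name}" placeholders would not be filled in from `extra`.
    "Failed to connect to database %s. "
    "Connection refused after %d retries. "
    "Check database availability and network connectivity to %s:%s.",
    db["name"], retry_count, db["host"], db["port"],
    # The same values travel as structured fields for parsers and alerting.
    extra={
        "database": db["name"],
        "host": db["host"],
        "port": db["port"],
        "retry_count": retry_count,
    },
)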
The before message wastes everyone's time. The after message tells the operator exactly which database, how many retries occurred, and what to check, all while feeding structured fields into your log management platform for parsing and alerting.
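One way to confirm that a rewritten statement really carries its fields is to capture the record in a test and inspect it. The sketch below assumes the standard `logging` module; `ListHandler`, `connect`, the logger name `example.db`, and the sample configuration values are hypothetical stand-ins for your own code.

```python
import logging


class ListHandler(logging.Handler):
    """Collect emitted records in memory so a test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def connect(config: dict, retry_count: int, logger: logging.Logger) -> None:
    """Hypothetical stand-in for the code path that logs a failed connection."""
    db = config["db"]
    logger.error(
        "Failed to connect to database %s after %d retries.",
        db["name"],
        retry_count,
        extra={
            "database": db["name"],
            "host": db["host"],
            "port": db["port"],
            "retry_count": retry_count,
        },
    )


handler = ListHandler()
logger = logging.getLogger("example.db")
logger.addHandler(handler)
logger.propagate = False

connect({"db": {"name": "orders", "host": "db1.internal", "port": 5432}}, 3, logger)
record = handler.records[0]

# Keys passed through `extra` become attributes on the LogRecord.
print(record.getMessage(), record.database, record.retry_count)
```

A check like this catches the most common review finding: a structured field that the message mentions but that never reaches the log record.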
What to do after the AI audit
Once the prompt produces its output, you have a concrete set of next steps:
Review and commit. The diff is typically clean: log messages rewritten, a handful of new log statements added to previously silent branches. Review it like any other PR, focusing on whether the new messages accurately describe the events and whether the structured fields map to real variables.
Ship the LOGS.md. This inventory file lists every CRITICAL, ERROR, and WARNING log with a description of when it fires and what it means. Drop it in your repository root and link to it from your runbooks. New team members and on-call engineers will find it immediately useful.
Feed the improved logs into your SIEM. Better logs are only valuable if something reads them. In TeskaLabs LogMan.io, structured fields enable parsers to extract key terms, correlation rules to connect events across services, and detection rules to trigger on specific patterns. The quality of what you log directly determines the quality of what you can detect.
The numbers support the approach. In 2025, 78% of ITOps and engineering professionals say AI enables them to spend more time on innovation rather than maintenance (Splunk, 2025). AI is now the number-one buying criterion for observability platforms at 29%, ahead of cloud compatibility and data collection (Dynatrace, 2025). The companies seeing the most benefit are the ones that start with good data, and good data starts with good logs.
Frequently Asked Questions
Q: Does this prompt work with languages other than Python?
A: Yes. The principles are language-agnostic. You'd adjust the syntax-specific details, replacing logger.error() with log.Error() for Go, Logger.error() for Java, or console.error() for Node.js. The five questions and structured-field guidance apply equally.
Q: What if my codebase uses a custom logging framework?
A: The prompt preserves existing conventions by design. It reads how your framework structures fields (whether that's JSON, key-value pairs in brackets, or a proprietary format) and improves within those patterns. The DO NOT rules prevent the AI from introducing a new logging style.
Q: Can I run this as an ongoing hygiene check?
A: Absolutely. Treat it as a logging linter. Run it periodically (for example, as part of a quarterly code review cycle or before major releases) to catch new vague messages and missing log branches introduced by feature development.
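Between full audits, a lightweight script can flag the most obvious regressions. The sketch below is one possible heuristic, not a replacement for the prompt: it uses Python's `ast` module to find logging calls whose literal message is short and built around one of the vague phrases listed in rule 5. The names `is_vague` and `find_vague_logs`, and the three-word threshold, are assumptions.

```python
import ast

# Phrases the prompt's rule 5 calls out as too vague on their own.
VAGUE_PHRASES = ("failed", "error occurred", "invalid state", "cannot continue")
LOG_METHODS = {"debug", "info", "warning", "error", "exception", "critical"}


def is_vague(message: str, max_words: int = 3) -> bool:
    """Heuristic: a short message built around a vague phrase carries little context."""
    text = message.lower()
    return len(text.split()) <= max_words and any(p in text for p in VAGUE_PHRASES)


def find_vague_logs(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) for logging calls whose literal message looks vague."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in LOG_METHODS
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
            and is_vague(node.args[0].value)
        ):
            hits.append((node.lineno, node.args[0].value))
    return sorted(hits)


sample = '''
logger.error("Connection failed")
logger.error("Failed to connect to database %s; check network connectivity.", name)
logger.warning("Invalid state")
'''
hits = find_vague_logs(sample)
for lineno, message in hits:
    print(f"line {lineno}: vague log message {message!r}")
```

Run against changed files in CI, a check like this catches new vague messages early and leaves the periodic AI audit to handle missing branches and context.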
Conclusion
Good logs are a product. They have users: on-call engineers, security analysts, and increasingly, AI agents that parse them for patterns and anomalies. Writing logs for those users, rather than for yourself, is the single highest-leverage improvement you can make to your application's observability.
The prompt above automates the audit and rewrite process.
What you get is a PR with cleaner logs, a LOGS.md inventory, and a list of remaining ambiguities to address manually.
It typically takes minutes to generate and an hour to review.
If you'd like to try it on your own codebase, copy the prompt, paste it into your coding assistant, and point it at your repository. The improved logs will serve your team at 02:00 in the morning, and they'll serve your SIEM every day after that.
For a deeper dive on the principles behind application logging, see Application logging for software developers on this blog.