Enhancing AI Auditability Through Structured Summaries

Auditable AI interactions are essential for maintaining security and control. A recent update stops raw code from being sent to AI models, which tightens data security, and makes flagged code changes easier to trace back to their audit reports.

The Challenge of Raw Diffs

Previously, raw git diffs were sent to AI models for analysis. This approach, while providing detailed context, posed a significant risk of exposing sensitive code and triggering unnecessary audit flags. The challenge was to maintain the benefits of AI analysis without the risks associated with exposing complete code changes.

Solution: Structured Summaries

To address this, the system has been redesigned to send programmatic summaries of code changes instead of raw diffs. These summaries include:

  • File counts: The number of files modified.
  • File types: The types of files changed (e.g., JavaScript, Python, configuration files).
  • Lines changed: The number of lines added, modified, or deleted.

This approach allows the AI to analyze the nature and scope of changes without ever seeing the actual source code.

Example Summary

Instead of sending the raw diff, the system now sends a JSON payload similar to this:

{
  "file_counts": 5,
  "file_types": ["JavaScript", "Python", "YAML"],
  "lines_changed": {
    "added": 120,
    "modified": 50,
    "deleted": 20
  }
}
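A minimal sketch of how such a payload could be derived from a unified diff is shown here. It is not the system's actual implementation: the function name `summarize_diff`, the `EXTENSION_LABELS` mapping, and the "Other" label are hypothetical. A diff only marks lines as added or removed, so the "modified" count relies on an assumed heuristic: within a run of adjacent changes, removed lines paired with added lines count as modified.

```python
import json
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to a type label.
EXTENSION_LABELS = {".js": "JavaScript", ".py": "Python",
                    ".yml": "YAML", ".yaml": "YAML"}


def summarize_diff(diff_text: str) -> dict:
    """Reduce a unified git diff to counts only; no source lines are kept."""
    files = []
    added = modified = deleted = 0
    run_removed = run_added = 0  # size of the current run of -/+ lines
    in_hunk = False

    def close_run():
        # Assumed heuristic: paired -/+ lines in one run are "modified".
        nonlocal added, modified, deleted, run_removed, run_added
        paired = min(run_removed, run_added)
        modified += paired
        added += run_added - paired
        deleted += run_removed - paired
        run_removed = run_added = 0

    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            close_run()
            in_hunk = False
            files.append(line.rsplit(" b/", 1)[-1])  # path after " b/"
        elif line.startswith("@@"):
            close_run()
            in_hunk = True
        elif line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        elif in_hunk and line.startswith("+"):
            run_added += 1
        elif in_hunk and line.startswith("-"):
            run_removed += 1
        else:
            close_run()  # context or header line ends the current run
    close_run()

    types = {EXTENSION_LABELS.get(PurePosixPath(p).suffix, "Other") for p in files}
    return {
        "file_counts": len(files),
        "file_types": sorted(types),
        "lines_changed": {"added": added, "modified": modified, "deleted": deleted},
    }


SAMPLE_DIFF = "\n".join([
    "diff --git a/src/app.js b/src/app.js",
    "index 1111111..2222222 100644",
    "--- a/src/app.js",
    "+++ b/src/app.js",
    "@@ -1,3 +1,4 @@",
    " const a = 1;",
    "-const b = 2;",
    "+const b = 3;",
    "+const c = 4;",
    " export { a };",
    "diff --git a/config.yml b/config.yml",
    "index 3333333..4444444 100644",
    "--- a/config.yml",
    "+++ b/config.yml",
    "@@ -5,2 +5,1 @@",
    " name: demo",
    "-debug: true",
])

print(json.dumps(summarize_diff(SAMPLE_DIFF), indent=2))
```

For the two-file sample, the summary reports 2 files, the types JavaScript and YAML, and one line each added, modified, and deleted. None of the diff's content appears in the output, which is the property the redesign depends on.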

Linking Audit Reports

To further enhance auditability, the system now links audit reports to flagged data sources. This is achieved through a foreign key relationship (unsafe_post_report_id) that connects data sources to their corresponding audit reports. A user interface component, such as a modal in a Filament-based application, allows users to view report details directly from the flagged data source.
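The relationship described above can be sketched with Python's built-in `sqlite3` module. Only the column name `unsafe_post_report_id` comes from the text; the table names (`unsafe_post_reports`, `data_sources`), the other columns, and the helper `report_for_source` are assumptions made for illustration, and the real application presumably defines this through its own framework's migrations.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical table names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE unsafe_post_reports (
            id INTEGER PRIMARY KEY,
            reason TEXT NOT NULL
        );
        CREATE TABLE data_sources (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            -- NULL means the data source has not been flagged
            unsafe_post_report_id INTEGER REFERENCES unsafe_post_reports(id)
        );
        INSERT INTO unsafe_post_reports (id, reason)
            VALUES (1, 'Unusually large configuration change');
        INSERT INTO data_sources (id, name, unsafe_post_report_id)
            VALUES (10, 'repo-a', 1), (11, 'repo-b', NULL);
    """)
    return conn


def report_for_source(conn: sqlite3.Connection, source_id: int):
    """Return the linked audit report for a data source, or None."""
    row = conn.execute(
        "SELECT r.id, r.reason "
        "FROM data_sources AS s "
        "JOIN unsafe_post_reports AS r ON r.id = s.unsafe_post_report_id "
        "WHERE s.id = ?",
        (source_id,),
    ).fetchone()
    return None if row is None else {"report_id": row[0], "reason": row[1]}


conn = build_demo_db()
print(report_for_source(conn, 10))  # flagged source: linked report details
print(report_for_source(conn, 11))  # unflagged source: None
```

A lookup like `report_for_source` is roughly what a detail modal would need: given the flagged data source, fetch the one report it points to. Keeping the column nullable lets unflagged data sources exist without a report.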

Benefits

  • Enhanced Security: Prevents the exposure of sensitive code to AI models.
  • Improved Auditability: Provides structured summaries for AI analysis and links audit reports to flagged data sources.
  • Reduced False Positives: By analyzing summaries instead of raw diffs, the AI is less likely to flag harmless code changes.

Conclusion

By replacing raw diffs with structured summaries and linking audit reports to flagged data sources, the system strikes a better balance between AI-driven analysis and data security. Sensitive code no longer reaches the model, and each flag can be traced to its report. When implementing similar solutions, weigh how much detail the model needs against what that detail would expose, and prioritize protecting sensitive information.

Gerardo Ruiz

Author
