Table of Contents
- Runtime Architecture
- Related Docs
- Overview
- Topology At A Glance
- Current Deployment Shapes
- Application Structure
- Core Data Model
- Runtime Responsibilities
- Web/API Process
- API Access Model
- Operator Console
- Worker Process
- Operations And Maintenance
- Storage Layer
- Translation Runtime
- Health Model
- Architecture Boundaries That Are Not Shipped Yet
Runtime Architecture
This document describes the architecture that is actually present in the repository today.
Related Docs
Overview
Iris Translation v2 is a modular Django application with Celery-based background execution for the translation pipeline.
At a high level, the checked-in system is made of:
- a Django web/API process
- a Celery worker process
- a Django-configurable relational database (`postgresql` or `sqlite`)
- a configurable artifact storage backend (`local` or S3-compatible)
- a broker for async workflow execution when enabled
- an optional OpenSearch node for glossary retrieval
- Aspose.Words-backed DOCX extraction and reassembly
- an OpenAI-compatible translation/runtime integration layer
Topology At A Glance
```mermaid
flowchart LR
    Client[HTTP client or operator] --> Web[Django web and API process]
    Web --> DB[(PostgreSQL or SQLite)]
    Web --> Storage[Artifact storage<br/>local or S3-compatible]
    Web --> Broker{Async workflow enabled?}
    Broker -->|yes| Queue[Configured broker]
    Queue --> Worker[Celery worker]
    Worker --> DB
    Worker --> Storage
    Worker --> Aspose[Aspose.Words]
    Worker --> Search[Optional OpenSearch glossary lookup]
    Worker --> Provider[Mock or OpenAI-compatible provider]
```
Current Deployment Shapes
Plain Local Development
Default local development is the simplest shape:
- Django app started with `manage.py runserver`
- Celery worker started on the host when using the async path
- PostgreSQL on `127.0.0.1:5433` through the checked-in Docker dependency services by default
- Redis broker on `127.0.0.1:6380` by default
- MinIO-backed S3 storage on `127.0.0.1:9000` by default
- OpenSearch glossary retrieval on `127.0.0.1:9200` by default
- SQLite plus local-file storage still available as an explicit fallback configuration
Checked-In Docker Compose Topology
The repository's local Docker stack runs:
- `web`
- `worker`
- `postgres`
- `redis`
- `opensearch`
- `opensearch-dashboards`
- `minio`
- `minio-init`
Important current facts:
- the stack has one generic `worker` service, not multiple worker services
- the web and worker containers use the bundled `postgres` service
- artifact storage is switched to S3-compatible mode through MinIO
- async workflow is enabled in Compose
- scheduled maintenance is available as a task/command, but Compose does not ship a dedicated scheduler
Application Structure
Installed Django Apps
`settings.py` currently installs:
- `apps.policy`
- `apps.documents`
- `apps.jobs`
- `apps.review`
- `apps.audit`
- `apps.terminology`
- `apps.memory`
- `apps.qa`
- `apps.api`
- `apps.operator_console`
Service Modules
The services/ package contains the main business logic for:
- DOCX extraction and reassembly
- language operations and enrichment
- translation prompt selection and provider access
- retrieval over glossary and memory data
- storage and artifact handling
- health/readiness checks and maintenance
- reporting
- workflow intake, processing, review, control, replay, and reassembly
Core Data Model
Policy And Configuration
- `DomainPack`: expertise and policy boundary
- `Project`: project inside a domain pack, including `retention_days`
- `ProviderProfile`: named provider configuration
- `LanguagePairPolicy`: source/target pair rules plus provider/profile binding
Documents
- `DocumentFamily`: optional family grouping inside a project
- `Document`: logical source document identified by `external_reference`
- `DocumentVersion`: specific uploaded source version; `source_filename` is also the current file-history grouping key in the console
- `TranslationUnit`: extracted segment with a stable anchor and unit order
Jobs And Outputs
- `Job`: workflow state, stage, diagnostics, metadata, timestamps
- `Artifact`: persisted source, intermediate, review, replay, and delivery artifacts
- `JobPolicySnapshot`: frozen policy/config snapshot for a run
- `JobStatisticsSnapshot`: rollup counts for verification and reuse metrics
- `TranslationResult`: per-segment output plus result source and verification state
- `VerificationResult`: per-segment QA record
- `JobBatch`: multi-file intake wrapper for many related job submissions
- `JobBatchItem`: per-file intake state and retry/log record inside a batch
- `ProjectReviewCoverageSnapshot`: persisted daily project-level coverage snapshot
Knowledge And Review
- `GlossaryEntry`: scoped terminology entry
- `CandidateMemoryEntry`: review-derived memory candidate
- `ApprovedMemoryEntry`: approved reusable memory entry
- `ReviewSession`: TMX export/import tracking for human review
- `AuditEvent`: structured audit log records
Runtime Responsibilities
Web/API Process
The web process currently handles:
- session-authenticated HTML operator-console routes under `/console/`
- session-authenticated and HTTP Basic-authenticated API access
- operator/admin authorization on `/api/v1/*`
- health endpoints
- policy and glossary configuration endpoints
- job creation and control endpoints
- reporting and audit endpoints
- artifact listing and local-download authorization
- synchronous processing paths when operators call job actions directly
Public API exceptions are intentionally narrow:
- `/health/live`
- `/health/ready`
- `/api/v1/artifacts/{artifact_id}/download` when the caller already has a valid signed token
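The signed-token exception can be pictured with a minimal HMAC sketch. This is illustrative only: the function names and the dev-only key are assumptions, and real Django code would typically use its signing utilities with `settings.SECRET_KEY` and an expiry.

```python
import hashlib
import hmac

SECRET_KEY = b"dev-only-secret"  # hypothetical; real code would use settings.SECRET_KEY

def make_download_token(artifact_id: str) -> str:
    """Derive a deterministic token for one artifact (illustrative only)."""
    return hmac.new(SECRET_KEY, artifact_id.encode(), hashlib.sha256).hexdigest()

def token_is_valid(artifact_id: str, token: str) -> bool:
    """Constant-time comparison against the expected token."""
    return hmac.compare_digest(make_download_token(artifact_id), token)

token = make_download_token("artifact-123")
assert token_is_valid("artifact-123", token)
assert not token_is_valid("artifact-456", token)
```

A token tied to one artifact id means a leaked link grants access only to that artifact, not to the listing endpoints.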
API Access Model
The checked-in API now enforces two roles:
- `operator`: any active authenticated user
- `admin`: any active authenticated user who is staff, superuser, or a member of `IRIS_API_ADMIN_GROUP`
The supported API authentication mechanisms are:
- Django session auth
- HTTP Basic auth
Audit events now take actor identity from the authenticated user and source IP from `X-Forwarded-For` or `REMOTE_ADDR`. Legacy request fields such as `submitted_by`, `changed_by`, `requested_by`, and `cancelled_by` may still be accepted for compatibility, but they are no longer the audit source of truth.
Operator Console
The repo now ships a Django-template operator console at `/console/`.
That console currently supports:
- session-authenticated login/logout
- single-job submission
- multi-file batch submission and monitoring
- job detail with status, stats, artifacts, review sessions, and recent audit events
- filename-based file history under `/console/files/`
- run-to-run comparison for the same filename timeline
- TMX-style segment inspection backed by extracted units and translation results
- process, cancel, rerun, TMX export, TMX import, and replay actions
- project-level review-coverage views
It reuses the same workflow and reporting services as the API rather than maintaining a separate frontend-only backend path.
Current file history is application-managed rather than storage-native. The console groups runs by project + `source_filename`, which keeps reruns and repeated uploads together without depending on MinIO/S3 bucket versioning. The checked-in `Artifact` model stores object keys, checksums, and metadata, but it does not currently persist provider-native object version identifiers such as S3 `VersionId`.
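The grouping behaviour described above can be sketched in a few lines; the field names and the list-of-dicts shape are assumptions standing in for the real queryset.

```python
from collections import defaultdict

def group_file_history(runs: list[dict]) -> dict:
    """Group runs by (project, source_filename), newest first (illustrative sketch)."""
    history = defaultdict(list)
    for run in runs:
        history[(run["project"], run["source_filename"])].append(run)
    for timeline in history.values():
        timeline.sort(key=lambda r: r["created_at"], reverse=True)
    return dict(history)

runs = [
    {"project": "p1", "source_filename": "a.docx", "created_at": 1},
    {"project": "p1", "source_filename": "a.docx", "created_at": 3},
    {"project": "p1", "source_filename": "b.docx", "created_at": 2},
]
history = group_file_history(runs)
assert [r["created_at"] for r in history[("p1", "a.docx")]] == [3, 1]
```

Because the key is application-level, a rerun of the same upload joins the same timeline regardless of which storage backend holds the bytes.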
Worker Process
The checked-in worker command subscribes to these queues:
- `job_control`
- `docx_extract`
- `retrieve_context`
- `translate_batch`
- `qa_verify`
- `review_io`
- `docx_reassemble`
- `maintenance`
Important nuance: retrieval still happens inside translation processing, so there is no separate `tasks.workflow.retrieve_context` task. The `maintenance` queue is now backed by `tasks.workflow.maintenance_tick`.
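One way to picture the queue/task relationship is a Celery-style routing table. All task names here except `tasks.workflow.maintenance_tick` are hypothetical placeholders, and the table deliberately has no entry backing the `retrieve_context` queue, mirroring the nuance above.

```python
# Hypothetical routing table in the shape Celery's task_routes setting accepts;
# the real project wires routing through its own settings.
CELERY_TASK_ROUTES = {
    "tasks.workflow.job_control": {"queue": "job_control"},        # hypothetical
    "tasks.workflow.docx_extract": {"queue": "docx_extract"},      # hypothetical
    "tasks.workflow.translate_batch": {"queue": "translate_batch"},# hypothetical
    "tasks.workflow.qa_verify": {"queue": "qa_verify"},            # hypothetical
    "tasks.workflow.review_io": {"queue": "review_io"},            # hypothetical
    "tasks.workflow.docx_reassemble": {"queue": "docx_reassemble"},# hypothetical
    "tasks.workflow.maintenance_tick": {"queue": "maintenance"},   # named in this doc
}

# No task routes to retrieve_context: the worker subscribes to the queue,
# but retrieval runs inside translation processing.
assert "tasks.workflow.retrieve_context" not in CELERY_TASK_ROUTES
```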
Operations And Maintenance
The checked-in repo now ships one maintenance service that can be run either:
- through the Celery task `tasks.workflow.maintenance_tick`
- through the management command `python manage.py maintenance_tick`
That maintenance flow currently performs:
- expired artifact deletion based on `retention_expires_at`
- cleanup of stale non-promoted candidate memory tied to terminal jobs
- refresh of project review-coverage snapshots
- integrity checks for missing artifact objects and completed jobs missing delivery artifacts
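The first step, retention-based deletion, reduces to a timestamp comparison. This is a sketch: the function name and the list-of-dicts shape are assumptions, and a `None` expiry is read as "no retention policy".

```python
from datetime import datetime, timedelta, timezone

def expired_artifact_ids(artifacts: list[dict], now: datetime) -> list[str]:
    """Select artifacts whose retention_expires_at has passed (illustrative)."""
    return [
        a["id"]
        for a in artifacts
        if a["retention_expires_at"] is not None and a["retention_expires_at"] <= now
    ]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
artifacts = [
    {"id": "a1", "retention_expires_at": now - timedelta(days=1)},
    {"id": "a2", "retention_expires_at": now + timedelta(days=1)},
    {"id": "a3", "retention_expires_at": None},  # no retention policy set
]
assert expired_artifact_ids(artifacts, now) == ["a1"]
```

Passing `now` explicitly keeps the selection deterministic and testable, which matters for a tick that may run from either Celery or a management command.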
Separately from scheduled maintenance, Django startup also performs a small recovery pass for jobs left in `in_progress` with stale or missing heartbeat metadata and marks them failed.
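The orphan-detection predicate behind that recovery pass can be sketched as follows; the timeout value, function name, and dict shape are assumptions, not the checked-in implementation.

```python
from datetime import datetime, timedelta, timezone

HEARTBEAT_TIMEOUT = timedelta(minutes=10)  # hypothetical threshold

def is_orphaned(job: dict, now: datetime) -> bool:
    """A job left in_progress with a missing or stale heartbeat counts as orphaned."""
    if job["status"] != "in_progress":
        return False
    last_beat = job.get("heartbeat_at")
    return last_beat is None or now - last_beat > HEARTBEAT_TIMEOUT

now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
assert is_orphaned({"status": "in_progress", "heartbeat_at": None}, now)
assert is_orphaned({"status": "in_progress", "heartbeat_at": now - timedelta(hours=1)}, now)
assert not is_orphaned({"status": "in_progress", "heartbeat_at": now}, now)
assert not is_orphaned({"status": "completed", "heartbeat_at": None}, now)
```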
Storage Layer
The storage layer is abstracted through Django's default storage. Workflow code uses storage APIs plus materialization helpers so non-filesystem storage backends can still be used by DOCX-processing stages that need local files temporarily.
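A materialization helper of the kind described above might look like this sketch, which copies an object to a temp file so filesystem-only tooling (such as a DOCX library) can read it. The dict-backed `storage` stand-in and the helper name are assumptions; real code would stream via the storage backend's `open()`.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def materialize(storage, key: str):
    """Copy a stored object to a local temp file for the duration of the block."""
    fd, path = tempfile.mkstemp(suffix=os.path.splitext(key)[1])
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(storage[key])  # stand-in: real code would stream from storage.open(key)
        yield path
    finally:
        os.unlink(path)  # temp copy is always cleaned up

fake_storage = {"docs/report.docx": b"PK..."}  # dict standing in for a storage backend
with materialize(fake_storage, "docs/report.docx") as local_path:
    assert open(local_path, "rb").read() == b"PK..."
```

The context-manager shape guarantees the local copy disappears even if the processing stage raises, which keeps non-filesystem backends safe to use from DOCX stages.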
Artifact bytes live in the configured storage backend, but file/version history is tracked in Django models and services rather than through storage-provider-native object versioning. Bucket versioning can be enabled at the infrastructure layer if desired, but it is not part of the current application contract.
Translation Runtime
The translation runtime is built from three cooperating pieces:
- retrieval backend selection (`database` or optional `opensearch` for glossary lookup)
- prompt rendering from checked-in template files
- provider execution through either the mock provider or an OpenAI-compatible endpoint
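The backend selection in the first piece can be sketched as a flag check; the flag name `GLOSSARY_OPENSEARCH_ENABLED` is an assumption for illustration, not the project's actual setting.

```python
def select_glossary_backend(settings: dict) -> str:
    """Pick 'opensearch' only when its feature flag is on; otherwise use the database."""
    if settings.get("GLOSSARY_OPENSEARCH_ENABLED"):  # hypothetical flag name
        return "opensearch"
    return "database"

assert select_glossary_backend({}) == "database"
assert select_glossary_backend({"GLOSSARY_OPENSEARCH_ENABLED": True}) == "opensearch"
```

Defaulting to the database keeps plain local development working with no OpenSearch node running.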
Health Model
The readiness service currently checks:
- `database`
- `storage`
- `aspose`
- `broker`
- `opensearch`

`aspose`, `broker`, and `opensearch` are treated as optional checks and return `disabled` when the related feature flag is off.
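The required-versus-optional split can be sketched as a small aggregator; the function signature, flag dict, and status strings other than "disabled" are assumptions for illustration.

```python
def readiness(checks: dict, flags: dict) -> dict:
    """Aggregate per-dependency probes; optional checks report 'disabled' when off."""
    optional = {"aspose", "broker", "opensearch"}
    report = {}
    for name, probe in checks.items():
        if name in optional and not flags.get(name, False):
            report[name] = "disabled"  # feature flag off: skip the probe entirely
        else:
            report[name] = "ok" if probe() else "failing"
    return report

checks = {
    "database": lambda: True,
    "storage": lambda: True,
    "aspose": lambda: False,   # never probed while its flag is off
    "broker": lambda: True,
    "opensearch": lambda: True,
}
report = readiness(checks, flags={"broker": True})
assert report == {
    "database": "ok",
    "storage": "ok",
    "aspose": "disabled",
    "broker": "ok",
    "opensearch": "disabled",
}
```

Treating a disabled dependency as neither "ok" nor "failing" keeps readiness honest: it neither blocks deploys on unused features nor hides a probe that was skipped.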
Architecture Boundaries That Are Not Shipped Yet
The repository does not currently ship:
- a production WSGI/ASGI server definition
- separate checked-in worker deployments per queue family
- storage-provider-native object version browsing as part of the operator workflow
Those items belong in future delivery work, not in the description of the current repo baseline. For future gaps, see Implementation Plan: Verified Next Work and Open Questions.