Northwestern University Libraries
2026 Technical Refresh
Model of Models has been redesigned from the ground up. Every layer of the platform -- from the compute infrastructure to the training pipeline to the interface itself -- was re-engineered for speed, reliability, and a markedly better research experience.
A New Pipeline Architecture
The legacy monolithic processing pipeline has been replaced by an event-driven, container-based architecture orchestrated by AWS Step Functions. Every model training request is automatically routed to one of three dedicated compute pools based on the method selected and the size of the corpus:
- CPU Compute Pool -- General-purpose workloads run on AMD EPYC "Turin" (Zen 5c) processors with 4 vCPU and 16 GiB of memory. Optimized for LDA-based topic models, Multilevel LDA, and large-corpus analytics where parallel CPU cores outperform GPU acceleration.
- Standard GPU Pool -- GPU-accelerated methods on corpora under twenty thousand documents are dispatched to NVIDIA L4 hardware (Ada Lovelace, 4th-generation Tensor Cores) with a dedicated GPU memory slice, 4 vCPU, and 16 GiB of system RAM. Used for Word2Vec, Doc2Vec, and BERTopic on small-to-mid-sized corpora.
- Large GPU Pool -- For GPU methods on corpora of twenty thousand documents or more, the pipeline automatically routes to a larger NVIDIA L4 instance with roughly twice the GPU memory, 8 vCPU, and 32 GiB of system RAM. This is the pool that handles the most demanding BERTopic runs -- corpora in the tens of thousands of documents with full sentence-transformer embeddings, UMAP dimensionality reduction, and HDBSCAN clustering.
Routing is automatic and invisible to the researcher. The platform inspects the chosen method and the document count, then selects the appropriate pool. Each pool maintains a warm capacity of pre-provisioned instances so most jobs start within seconds instead of waiting for a cold boot.
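The routing rule described above can be sketched in a few lines of Python. This is an illustration only, not the platform's actual code: the method identifiers, the pool names, and the `select_pool` function are hypothetical, and the only value taken from the text is the twenty-thousand-document threshold.

```python
# Threshold taken from the description: GPU jobs at or above this size
# go to the Large GPU pool.
LARGE_GPU_THRESHOLD = 20_000

# Hypothetical method identifiers; the platform's real names may differ.
CPU_METHODS = {"lda", "multilevel_lda"}
GPU_METHODS = {"word2vec", "doc2vec", "bertopic"}


def select_pool(method: str, doc_count: int) -> str:
    """Return the (hypothetical) pool name for a training request."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    key = method.lower()
    if key in CPU_METHODS:
        return "cpu"
    if key in GPU_METHODS:
        if doc_count >= LARGE_GPU_THRESHOLD:
            return "gpu-large"
        return "gpu-standard"
    raise ValueError(f"unknown method: {method!r}")
```

Under these assumptions, `select_pool("bertopic", 25_000)` selects the large GPU pool, while `select_pool("multilevel_lda", 25_000)` stays on the CPU pool regardless of corpus size.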
If a GPU job exhausts its video memory mid-run, the orchestrator detects the failure, captures diagnostics, and automatically re-queues the job on the CPU pool so the researcher still gets a result.
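A minimal sketch of that fallback, assuming the orchestrator sees an out-of-memory condition as a distinct error type. The `GpuOutOfMemory` exception, the runner callables, and the diagnostics list are all hypothetical stand-ins; the real orchestration runs in AWS Step Functions and is not shown in this announcement.

```python
from typing import Any, Callable, Dict, List


class GpuOutOfMemory(RuntimeError):
    """Hypothetical signal that a GPU worker exhausted its video memory."""


def run_with_cpu_fallback(
    job: Dict[str, Any],
    run_on_gpu: Callable[[Dict[str, Any]], Any],
    run_on_cpu: Callable[[Dict[str, Any]], Any],
    diagnostics: List[Dict[str, Any]],
) -> Any:
    """Try the GPU runner first; on an out-of-memory error, record
    diagnostics and re-run the same job on the CPU runner."""
    try:
        return run_on_gpu(job)
    except GpuOutOfMemory as exc:
        # Capture what failed before retrying, so the failure is not lost.
        diagnostics.append({"job_id": job.get("id"), "error": str(exc)})
        return run_on_cpu(job)
```

Only the out-of-memory error triggers the retry in this sketch; any other exception propagates unchanged, which keeps unrelated failures visible.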
GPU-Accelerated Visualizations
Four visualization methods have been optimized for accelerated compute -- three leveraging NVIDIA GPU cores and one using parallel multi-core training -- yielding substantial reductions in time-to-result:
Word2Vec (GPU)
Word embedding models train directly on GPU tensor cores, accelerating vector space construction for large vocabularies.
Doc2Vec (GPU)
Document-level embeddings leverage GPU parallelism to produce distributed representations of entire documents, enabling faster similarity analysis.
BERTopic (GPU)
Transformer-based topic modeling using sentence-transformer embeddings, GPU-native UMAP dimensionality reduction (RAPIDS cuML), and HDBSCAN clustering. The most computationally intensive method on the platform -- corpora of twenty thousand documents or more are automatically routed to the Large GPU pool for additional video memory and CPU headroom.
Multilevel LDA (Parallel CPU)
The flagship topic-modeling visualization trains a stack of LDA models using gensim's multi-core trainer on the AMD EPYC CPU pool. Streaming corpus iteration keeps memory flat regardless of document count, so the same instance handles hundred-document and hundred-thousand-document corpora without re-provisioning.
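To illustrate why streaming keeps memory flat, here is a minimal sketch of a corpus object that reads one document at a time from disk. The `StreamingCorpus` class, the one-document-per-line file layout, and the prebuilt vocabulary mapping are assumptions made for this example. The bag-of-words output, a list of `(token_id, count)` pairs per document, is the general shape that gensim's LDA trainers accept as a corpus.

```python
from collections import Counter
from typing import Dict, Iterator, List, Tuple


class StreamingCorpus:
    """Iterate over a text file (assumed: one document per line) and
    yield each document as a sorted list of (token_id, count) pairs.

    Only one line is held in memory at a time, so memory use does not
    grow with the number of documents.
    """

    def __init__(self, path: str, vocab: Dict[str, int]) -> None:
        self.path = path
        self.vocab = vocab  # hypothetical prebuilt token -> id mapping

    def __iter__(self) -> Iterator[List[Tuple[int, int]]]:
        # Reopening the file on every pass lets a trainer iterate
        # over the corpus multiple times.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                counts = Counter(
                    tok for tok in line.lower().split() if tok in self.vocab
                )
                yield sorted((self.vocab[tok], n) for tok, n in counts.items())
```

Because `__iter__` reopens the file each time, the object can be passed over repeatedly, which multi-pass trainers require; a plain generator would be exhausted after the first pass.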
Real-Time Status and Observability
The My Models dashboard now provides granular, real-time status updates throughout every phase of a job's lifecycle:
- Batch state labels -- see exactly when a worker is booting from the warm pool, starting up, or initializing the pipeline, instead of a static "Scheduled" indicator.
- Live in-place updates -- status changes, progress bars, and elapsed timers all update seamlessly without reloading the page. When a job completes, the card transitions to its finished state in place.
- Three-second polling -- the dashboard checks for new status every three seconds, providing near-instant feedback on pipeline progress.
- GPU memory monitoring -- if a GPU job exhausts its available video memory, the system detects the failure, logs diagnostic telemetry, and automatically retries the job on CPU.
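The three-second polling described in the list above can be sketched as a simple loop. The dashboard itself presumably runs in the browser, so this Python version only illustrates the logic; the `fetch_status` callable, the terminal state names, and the `max_polls` safety limit are hypothetical.

```python
import time
from typing import Callable, Iterator

# Hypothetical terminal states; the platform's real labels may differ.
TERMINAL_STATES = {"completed", "failed"}


def poll_status(
    fetch_status: Callable[[], str],
    interval_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 1000,
) -> Iterator[str]:
    """Call fetch_status every interval_seconds and yield each status
    change until a terminal state is seen or max_polls is reached."""
    last = None
    for _ in range(max_polls):
        status = fetch_status()
        if status != last:
            # Only surface changes, so the caller redraws in place
            # instead of refreshing on every poll.
            yield status
            last = status
        if status in TERMINAL_STATES:
            return
        sleep(interval_seconds)
```

Injecting `sleep` as a parameter is a design choice for testability: a caller can pass a stub to exercise the loop without waiting three seconds per iteration.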
Interface Redesign
The entire user interface has been refreshed with a modern design language. Glassmorphic surfaces, refined typography, and a consistent Northwestern visual identity replace the previous layout. A new theme toggle lets users switch between the refreshed look and the classic interface at any time.
Help Us Make This Better
Model of Models is under active development. If you encounter a problem or have an idea for improvement, we would like to hear from you.
Report a Bug or Suggest a Feature