Tutorials

This section provides tutorials that walk you through building AI/ML applications on Union. The example applications range from training XGBoost models on tabular datasets to fine-tuning large language models for text generation tasks.

Sentiment Classification with DistilBERT

Fine-tune a pre-trained language model on the IMDB dataset for sentiment classification.

Sentiment Classification with Language Models
Agentic Retrieval Augmented Generation

Build an agentic retrieval-augmented generation system with ChromaDB and LangChain.

Agentic Retrieval Augmented Generation
HDBSCAN Soft Clustering With Headline Embeddings on GPUs

Use HDBSCAN soft clustering with headline embeddings and UMAP on GPUs.

HDBSCAN Soft Clustering With Headline Embeddings on GPUs
Reddit Slack Bot on a Schedule

Securely store Reddit and Slack authentication data while pushing relevant Reddit posts to Slack on a schedule.

Reddit Slack Bot
Time Series Forecaster Comparison

Visually compare the output of various time series forecasters while maintaining lineage of the training and forecasted data.

Time Series Model Comparison
GluonTS Time Series On GPUs

Train and evaluate a time series forecasting model with GluonTS.

Forecasting with GluonTS & PyTorch on GPUs
Credit Default Prediction with XGBoost & NVIDIA RAPIDS

Use the NVIDIA RAPIDS cuDF DataFrame library and cuML machine learning library to predict credit default.

Credit Default Prediction with XGBoost & NVIDIA RAPIDS
Genomic Alignment using Bowtie 2

Pre-process raw sequencing reads, build an index, and align the reads to a reference genome using the Bowtie 2 aligner.

Genomic Alignment
Video Dubbing with Open-Source Models

Use open-source models to dub videos.

Video Dubbing