LLM-based Model Deployment
This module teaches the complete lifecycle of LLM-based model deployment—from understanding foundational research (like transformers and RAG) to training, quantization, and deploying both standard and multimodal LLMs. It emphasizes hands-on implementation, evaluation techniques, production readiness, and real-world deployment challenges across various frameworks and model types.
NOTE: The deployment pattern is the same for every group, but different groups may pick different types of models. Candidate models are listed at the end of this section; the deliverables remain the same regardless of the model chosen.
Day 1-15:
- Summary report of the research paper “Attention Is All You Need”, plus an analysis of BERT and the Transformer architecture (see the attention sketch after the deliverables)
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
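To ground the paper summary, here is a minimal sketch of scaled dot-product attention, the core operation of “Attention Is All You Need”, written in PyTorch; the toy tensor shapes are illustrative assumptions, not part of the assignment.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # (batch, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                     # attention weights sum to 1
    return weights @ v, weights

# Toy example: batch of 2, sequence length 4, model dimension 8 (illustrative only).
q = torch.randn(2, 4, 8)
k = torch.randn(2, 4, 8)
v = torch.randn(2, 4, 8)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([2, 4, 8]) torch.Size([2, 4, 4])
```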
Day 16-30:
- Summary report for the reference videos on tokenization and on building GPT from scratch (see the tokenization sketch after the deliverables)
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
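As a companion to the tokenization videos, the sketch below runs GPT-2's byte-level BPE tokenizer through Hugging Face Transformers; the model choice and example sentence are assumptions made purely for illustration.

```python
from transformers import AutoTokenizer

# GPT-2 uses byte-level BPE; any tokenizer from the Hub follows the same interface.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenization splits text into subword units."
token_ids = tokenizer(text)["input_ids"]             # integer IDs fed to the model
tokens = tokenizer.convert_ids_to_tokens(token_ids)  # human-readable subword pieces

print(tokens)
print(token_ids)
print(tokenizer.decode(token_ids))  # decoding round-trips back to the original text
```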
Day 31-45:
- Summary report for crucial research techniques:
- Vector Databases and Vectorization
- RAG
- Advanced RAG
- Cache RAG
- Implementation of HuggingFace models and the Inference API (see the retrieval-and-inference sketch after the deliverables)
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
- Implementation code and report for HuggingFace models and inference API
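The sketch below ties the two topics together: a tiny in-memory vector store for retrieval and a call to the HuggingFace Inference API for generation. The embedding model, hosted model ID, and placeholder token are assumptions, and sentence-transformers is assumed to be installed alongside the listed stack.

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from huggingface_hub import InferenceClient

# Tiny in-memory "vector database": embed documents once, retrieve by cosine similarity.
docs = [
    "RAG augments a prompt with documents retrieved from a vector store.",
    "GPTQ is a post-training quantization method for LLM weights.",
    "Llama 3.1 is an open-weight LLM released by Meta.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")           # assumed embedding model
doc_vecs = embedder.encode(docs, normalize_embeddings=True)  # unit vectors -> dot = cosine

def retrieve(query: str, k: int = 2):
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

question = "How does RAG work?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hosted generation through the HuggingFace Inference API (model ID is an assumption;
# replace the placeholder token with your own).
client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.3", token="hf_...")
print(client.text_generation(prompt, max_new_tokens=128))
```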
Day 46-60:
- Summary report on the complete LLM pipeline and its auxiliary systems, covering the following techniques and their purpose (Phase 1; a minimal training-step sketch follows the deliverables):
- LLM data preprocessing
- Different types of datasets: text, Q/A, etc., along with examples of such datasets in use
- LLM training
- LLM Loss functions
- LLM Evaluation metrics
- LLM Guardrails
- Additional elements required for LLM production
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
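To make the training and loss-function items concrete, here is a minimal single training step for a causal LM; GPT-2 stands in for a larger model, and the two example sentences are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

batch = tokenizer(
    ["LLM training minimizes next-token cross-entropy.",
     "Preprocessed text is tokenized into fixed-length sequences."],
    return_tensors="pt", padding=True,
)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # padding positions are ignored by the loss

# With labels provided, the model shifts targets internally and returns the
# standard next-token cross-entropy loss.
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"cross-entropy loss: {outputs.loss.item():.3f}")
```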
Day 61-75:
- Summary report on the complete LLM pipeline and its auxiliary systems, covering the following techniques and their purpose (Phase 2; an evaluation-metrics sketch follows the deliverables):
- LLM data preprocessing
- Different types of datasets: text, Q/A, etc., along with examples of such datasets in use
- LLM training
- LLM Loss functions
- LLM Evaluation metrics
- LLM Guardrails
- Additional elements required for LLM production
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
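For the evaluation-metrics item in Phase 2, the sketch below uses Hugging Face Evaluate (from the tech stack) for a reference-based metric and shows how perplexity falls out of the evaluation loss; the example strings and the loss value are placeholders.

```python
import math
import evaluate  # Hugging Face Evaluate, listed in the tech stack

# Reference-based metric: ROUGE compares generated text against gold references.
rouge = evaluate.load("rouge")
predictions = ["The model was quantized to 4 bits before deployment."]
references = ["The model was quantized to 4-bit precision prior to deployment."]
print(rouge.compute(predictions=predictions, references=references))

# Intrinsic metric: perplexity is exp(mean cross-entropy loss) over the evaluation set.
mean_eval_loss = 2.31  # placeholder; use the averaged loss from your own eval loop
print(f"perplexity: {math.exp(mean_eval_loss):.2f}")
```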
Day 76-90:
- Deployment of an LLM model chosen from the models list below (Llama 3.1 is recommended); see the FastAPI serving sketch after the deliverables
Deliverables
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
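A minimal FastAPI serving sketch is shown below, assuming a Transformers text-generation pipeline; the Llama 3.1 model ID is an assumption and requires accepting Meta's license on the Hub, so a smaller open model can be substituted while prototyping.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

# Assumed model ID; swap in any open model from the models list while prototyping.
generator = pipeline(
    "text-generation", model="meta-llama/Llama-3.1-8B-Instruct", device_map="auto"
)

app = FastAPI(title="LLM inference service")

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(req: GenerateRequest):
    out = generator(req.prompt, max_new_tokens=req.max_new_tokens, do_sample=True)
    return {"completion": out[0]["generated_text"]}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
```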
Day 91-105:
- Summary report on the different types of quantization, with sample quantization applied to the deployed LLM model (Phase 1; a GPTQ sketch follows the deliverables)
Deliverables
- Working implementations of different quantization techniques at various bit widths, an analysis of their advantages and disadvantages, and application of these techniques to locally deployed LLM models
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
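As one concrete starting point for Phase 1, the sketch below quantizes a small model with GPTQ through the Transformers GPTQConfig integration; it assumes optimum and auto-gptq are installed and a GPU is available, and the model ID and calibration dataset are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Small model used purely to keep the example quick; replace with the deployed LLM.
model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Post-training GPTQ quantization; bits can be 8, 4, 3, or 2 for comparison.
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=gptq_config, device_map="auto"
)

# The quantized weights save and reload like any other checkpoint.
model.save_pretrained("opt-125m-gptq-4bit")
tokenizer.save_pretrained("opt-125m-gptq-4bit")
```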
Day 106-120:
- Summary report on the different types of quantization, with sample quantization applied to the deployed LLM model (Phase 2; a bit-width comparison sketch follows the deliverables)
Deliverables
- Working implementations of different quantization techniques at various bit widths, an analysis of their advantages and disadvantages, and application of these techniques to locally deployed LLM models
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
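To compare the effect of different bit widths independently of any library, here is a small self-contained sketch of symmetric per-tensor quantization; the weight matrix is random and purely illustrative.

```python
import torch

def quantize_dequantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric per-tensor quantization: round weights to signed `bits`-bit integers,
    then map back, so the round-trip error shows what each bit width costs."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), min=-qmax - 1, max=qmax)
    return w_q * scale

w = torch.randn(4096, 4096)  # stand-in weight matrix (illustrative size)
for bits in (8, 4, 3, 2):
    err = (w - quantize_dequantize(w, bits)).abs().mean()
    print(f"{bits}-bit mean absolute reconstruction error: {err:.5f}")
```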
Day 121-135:
- Research report on multimodal LLMs (text-to-image, text-to-video); see the text-to-image sketch after the deliverables
Deliverables
- Review of multimodal LLMs, detailed report of their functioning
- Summary report of the techniques, tutorial code with proper functioning, blog for publication, video demo (optional but recommended)
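As a small text-to-image example for the research report, the sketch below uses the diffusers library (an assumed addition to the listed stack); the checkpoint and prompt are illustrative and a CUDA GPU is assumed.

```python
import torch
from diffusers import StableDiffusionPipeline

# Text-to-image generation with a Stable Diffusion checkpoint (assumed choice).
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

image = pipe("a watercolor painting of a lighthouse at sunset").images[0]
image.save("lighthouse.png")
```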
Day 136-150:
- Deployment of multimodal LLM models (Phase 1); see the Gradio demo sketch after the deliverables
Deliverables
- Actual deployment of multimodal LLMs along with detailed internal doc, public blog, working code and testing results
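One hedged way to approach the Phase 1 deployment deliverable is to wrap a LLaVA checkpoint in a Gradio UI, as sketched below; the llava-hf model ID, prompt template, and port are assumptions based on the public checkpoints.

```python
import torch
import gradio as gr
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed public LLaVA checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def describe(image, question):
    # Prompt template used by the llava-hf 1.5 checkpoints.
    prompt = f"USER: <image>\n{question} ASSISTANT:"
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(
        model.device, torch.float16
    )
    output = model.generate(**inputs, max_new_tokens=200)
    answer = processor.decode(output[0], skip_special_tokens=True)
    return answer.split("ASSISTANT:")[-1].strip()

demo = gr.Interface(
    fn=describe,
    inputs=[gr.Image(type="pil"), gr.Textbox(label="Question about the image")],
    outputs=gr.Textbox(label="Answer"),
    title="LLaVA demo",
)
demo.launch(server_name="0.0.0.0", server_port=7860)
```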
Day 151-165:
- Deployment of multimodal LLM models (Phase 2)
Deliverables
- Actual deployment of multimodal LLMs along with detailed internal doc, public blog, working code and testing results
Day 166-180:
- Summary report of real-world challenges for multimodal LLMs
- Character consistency across different generations
- Character movements and actions that stay consistent with other characters and the background, without distorting the characters' real-life shapes
Deliverables
- Detailed report along with possible suggested solutions and future scope of improvement
- Case-study docs for products that have solved these challenges, along with properly documented testing and observation reports
Tech Stack
- Python
- HuggingFace
- TensorFlow, PyTorch
- LangChain
- pandas
- DeepSpeed - Microsoft’s framework for optimizing large-scale model training
- FSDP (Fully Sharded Data Parallel) - Efficient training for large models
- LoRA (Low-Rank Adaptation) - Efficient fine-tuning method for reducing training time
- PEFT (Parameter-Efficient Fine-Tuning) - Techniques for optimizing LLM fine-tuning (see the LoRA sketch after this list)
- Megatron-LM - NVIDIA’s large-scale model parallel training framework
- AWS, DigitalOcean
- Ubuntu shell commands, Docker
- ONNX Runtime
- TensorRT
- GPTQ - Post-training weight quantization method for LLMs
- FastAPI
- Flask
- Gradio/Streamlit
- LAVIS (Multimodal Vision-Language Framework) - Model support for multimodal architectures
- Guardrails (e.g., Guardrails AI or NeMo Guardrails, used with LangChain) - Tools for controlling LLM outputs
- OpenAI Evals / Hugging Face Evaluate - Standardized benchmarks for LLMs
- MT-Bench (Multi-turn Benchmark for Chat Models) - Chat-based LLM performance benchmarking
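Since LoRA and PEFT appear in the stack, here is a minimal configuration sketch; the GPT-2 base model and the target module name are assumptions (target modules differ per architecture).

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base model

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a fraction of a percent of weights are trainable
```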
Models list
- LLaMA 3.1 (Large Language Model Meta AI)
- DeepSeek R1
- Mistral 7B
- BLOOM
- Falcon 180B
- GPT-NeoX
Multimodal
- LLaVA (Large Language and Vision Assistant)
- https://paperswithcode.com/task/multimodal-large-language-model
- https://paperswithcode.com/paper/mme-a-comprehensive-evaluation-benchmark-for
- https://paperswithcode.com/paper/shapellm-universal-3d-object-understanding
- https://paperswithcode.com/paper/kosmos-2-grounding-multimodal-large-language
- https://paperswithcode.com/paper/timechat-a-time-sensitive-multimodal-large
- https://paperswithcode.com/paper/minigpt4-video-advancing-multimodal-llms-for
- https://paperswithcode.com/paper/a-survey-on-multimodal-large-language-models
- https://paperswithcode.com/paper/mplug-docowl-modularized-multimodal-large
- https://paperswithcode.com/paper/finvis-gpt-a-multimodal-large-language-model