Toptube Video Search Engine



Title: RAG from the Ground Up with Python and Ollama
Duration: 15:32
Viewed: 26,927
Published: 24-03-2024
Source: YouTube

Retrieval Augmented Generation (RAG) is the de facto technique for giving LLMs the ability to interact with any document or dataset, regardless of its size. Follow along as I cover how to parse and manipulate documents, explore how embeddings are used to describe abstract concepts, implement a simple yet powerful way to surface the most relevant parts of a document for a given query, and ultimately build a script you can use to have a locally-hosted LLM engage with your own documents.

Check out my other Ollama videos: https://www.youtube.com/playlist?list=PL4041kTesIWby5zznE5UySIsGPrGuEqdB

Links:
Code from video - https://decoder.sh/videos/rag-from-the-ground-up-with-python-and-ollama
Ollama Python library - https://github.com/ollama/ollama-python
Project Gutenberg - https://www.gutenberg.org
Nomic embedding model (on Ollama) - https://ollama.com/library/nomic-embed-text
BGE embedding model - https://huggingface.co/CompendiumLabs/bge-base-en-v1.5-gguf/blob/main/bge-base-en-v1.5-f16.gguf
How to use a model from HF with Ollama - https://www.youtube.com/watch?v=fnvZJU5Fj3Q
Cosine similarity - https://blog.gopenai.com/rag-for-everyone-a-beginners-guide-to-embedding-similarity-search-and-vector-db-423946475c90#cdfc

Timestamps:
00:00 - Intro
00:26 - Environment Setup
00:49 - Function review
01:50 - Source Document
02:18 - Starting the project
02:37 - parse_file()
04:35 - Understanding embeddings
05:40 - Implementing embeddings
07:01 - Timing embedding
07:35 - Caching embeddings
10:06 - Prompt embedding
10:19 - Cosine similarity for embedding comparison
12:16 - Brainstorming improvements
13:15 - Giving context to our LLM
14:29 - CLI input
14:49 - Next steps
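The core retrieval step the description mentions — comparing a prompt's embedding against each document chunk's embedding with cosine similarity — can be sketched in a few lines of plain Python. The vectors below are toy examples; in the video they would come from an embedding model such as nomic-embed-text served by Ollama, and the exact function names here are illustrative, not the video's code:

```python
from math import sqrt

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); values near 1.0 mean the vectors
    # point in nearly the same direction, i.e. similar meaning.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy chunk embeddings; real ones would be produced by an
# embedding model (e.g. via the Ollama Python library) and cached.
chunk_embeddings = {
    "Moby Dick is a novel about a whale.": [0.9, 0.1, 0.0],
    "Python is a programming language.": [0.1, 0.9, 0.2],
}

query_embedding = [0.85, 0.15, 0.05]  # embedding of the user's prompt

# Rank chunks by similarity to the prompt; the top results become
# the context passed to the locally-hosted LLM.
ranked = sorted(
    chunk_embeddings,
    key=lambda c: cosine_similarity(query_embedding, chunk_embeddings[c]),
    reverse=True,
)
```

The top-ranked chunks are then prepended to the prompt so the LLM answers using the document's own text rather than its training data alone.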


