AI/ML Engineer
We are seeking a talented ML Engineer to join our team in designing and developing cutting-edge generative AI solutions. The role focuses on building and optimizing applications that leverage large language models (LLMs), Retrieval-Augmented Generation (RAG), and vector stores. You'll work closely with cross-functional teams, including data scientists and engineers, to ensure the successful deployment of generative AI technologies that address real-world problems.
This position involves creating efficient pipelines for data embedding, managing vector stores, implementing RAG systems, and fine-tuning LLMs to meet the specific needs of various projects. You’ll also contribute to AI system security, performance optimization, and query interface development.
Key Responsibilities:
- Embedding Management and Vector Store Integration:
- Generate embeddings for unstructured data and store them in vector databases (e.g., ChromaDB, Pinecone); a minimal sketch of this workflow follows this list.
- Automate processes to keep embeddings up-to-date with new data.
- Experiment with and evaluate different embedding models to find the best fit for specific project requirements.
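As an illustration of this area of responsibility, the sketch below embeds a handful of documents and upserts them into a vector store. It is a minimal example only, assuming a sentence-transformers embedding model and ChromaDB's Python client; the model name, collection name, and documents are placeholders, not a prescribed stack.

```python
# Illustrative sketch only: model name, collection name, and documents are
# placeholders, and sentence-transformers + ChromaDB is one possible stack.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # example embedding model
client = chromadb.Client()                           # in-memory client for the sketch
collection = client.get_or_create_collection(name="project_docs")

def upsert_documents(docs: dict[str, str]) -> None:
    """Embed new or updated documents and write them to the vector store."""
    ids = list(docs.keys())
    texts = list(docs.values())
    embeddings = embedder.encode(texts).tolist()
    # upsert (rather than add) keeps embeddings current as source documents change
    collection.upsert(ids=ids, documents=texts, embeddings=embeddings)

upsert_documents({"doc-1": "Quarterly infrastructure report ...",
                  "doc-2": "Incident postmortem for the data pipeline ..."})

# Retrieve the closest documents for a natural-language query
query_embedding = embedder.encode(["What caused the pipeline incident?"]).tolist()
results = collection.query(query_embeddings=query_embedding, n_results=2)
print(results["documents"])
```

Swapping the embedding model for an alternative and comparing retrieval quality on the same collection is one straightforward way to carry out the evaluation described in the last bullet above.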
- LLM and Query Interface Development:
- Develop and integrate natural language query interfaces that combine LLMs with structured and unstructured data.
- Implement dynamic prompting and function calling to improve LLM query handling.
- Design QA backends for interacting with vectorized data.
- Implement RAG systems that improve response accuracy by grounding answers in relevant retrieved context (see the sketch following this list).
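The retrieve-then-generate flow behind these responsibilities can be sketched roughly as follows. This is illustrative only: it assumes the OpenAI chat completions client as the LLM backend and a vector-store collection populated as in the previous sketch; model names and prompt wording are placeholders.

```python
# Illustrative RAG sketch: model names, prompt wording, and the collection
# setup are assumptions, not a prescribed implementation.
import chromadb
from openai import OpenAI
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
collection = chromadb.Client().get_or_create_collection(name="project_docs")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_with_context(question: str, n_results: int = 3) -> str:
    # 1. Retrieve the chunks most similar to the question from the vector store
    query_emb = embedder.encode([question]).tolist()
    hits = collection.query(query_embeddings=query_emb, n_results=n_results)
    context = "\n\n".join(hits["documents"][0])

    # 2. Ask the LLM to answer using only the retrieved context
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context. If the context "
                        "is insufficient, say you do not know."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer_with_context("What caused the pipeline incident?"))
```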
- Advanced Query Handling and Classification:
- Implement logic to classify queries and route them to structured or unstructured data sources as needed (one possible approach is sketched after this list).
- Utilize frameworks such as LangChain or LlamaIndex to manage workflows and decompose complex queries into sub-queries.
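One lightweight way to implement such routing is sketched below. It is illustrative only: the classification labels, model name, and handler stubs are placeholders, and a router from LangChain or LlamaIndex could fill the same role.

```python
# Illustrative routing sketch: labels, model name, and the two handler stubs
# are placeholders; a LangChain or LlamaIndex router could replace this logic.
from openai import OpenAI

llm = OpenAI()

def handle_structured(question: str) -> str:
    return f"[structured-data pipeline would answer: {question}]"  # stub

def handle_unstructured(question: str) -> str:
    return f"[RAG pipeline would answer: {question}]"              # stub

def classify_query(question: str) -> str:
    """Ask the LLM whether the question targets structured or unstructured data."""
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Classify the question as 'structured' (answerable from "
                        "tables/SQL) or 'unstructured' (answerable from documents). "
                        "Reply with exactly one of those two words."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip().lower()

def route(question: str) -> str:
    handler = handle_structured if classify_query(question) == "structured" else handle_unstructured
    return handler(question)

print(route("How many tickets were closed last quarter?"))
```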
- LLM Fine-Tuning and Optimization:
- Fine-tune and optimize LLMs for specific use cases or domains (see the sketch following this list).
- Continuously test and evaluate model performance, adjusting based on feedback and accuracy metrics.
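As one illustration of domain-specific fine-tuning, the sketch below attaches a LoRA adapter to a small causal LM using Hugging Face transformers and peft. The base model, hyperparameters, and tiny in-memory dataset are placeholders chosen for brevity, not recommendations.

```python
# Illustrative LoRA fine-tuning sketch: the base model, hyperparameters, and
# tiny in-memory dataset are placeholders chosen only for brevity.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "distilgpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style models ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach a low-rank adapter so only a small fraction of weights is trained
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["c_attn"],  # attention projection in GPT-2-style models
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Placeholder domain corpus; in practice this would be project-specific data
corpus = Dataset.from_dict({"text": ["Example domain sentence one.",
                                     "Example domain sentence two."]})
tokenized = corpus.map(lambda batch: tokenizer(batch["text"], truncation=True),
                       batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the adapter weights
```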
- System Security and Optimization:
- Monitor system performance and optimize it for scalability and efficiency.
- Ensure security protocols are followed, particularly in handling sensitive data and user interactions with AI models.
Required Skills and Qualifications:
- Master's Degree in Computer Science, Data Science, Mathematics, Statistics, or a related field. Candidates with strong mathematical or statistical backgrounds are preferred.
- Experience with large language models (LLMs) like GPT, LLaMA, or similar models for generative AI and natural language querying.
- Proficiency in embedding generation and managing embeddings in vector stores (e.g., ChromaDB, Pinecone, or OpenSearch with embeddings).
- Strong knowledge of Retrieval-Augmented Generation (RAG) and query classification.
- Hands-on experience with building query interfaces and QA systems.
- Familiarity with tools like LangChain, LlamaIndex, Hugging Face, or similar query management frameworks.
- Solid programming skills in Python and experience with NLP and ML frameworks (e.g., PyTorch, TensorFlow).
- Experience with cloud services (AWS, GCP, or Azure) and containerized environments (Docker, Kubernetes).
- Strong problem-solving abilities and effective communication skills for collaboration in a cross-functional team.
- Experience coding in JavaScript or TypeScript.
Digital Infuzion, LLC is an Equal Opportunity Employer. EOE/AA/M/F/D/V
It is the policy of Digital Infuzion, LLC to provide equal employment opportunities without regard to race, color, religion, sex, gender identity, sexual orientation, national origin, age, disability, marital status, veteran status, genetic information or any other protected characteristic under applicable law.