Data First Jobs

Lily AI

Data Scientist

Full Time · In Office · San Francisco, California (USA)

Posted Jun 20, 2026

  • Job Title: Data Scientist
  • Company: Lily AI
  • Location: SF Bay Area
  • This is a full-time, 100% on-site role requiring presence 5 days a week at our Mountain View, CA office. Please note we are currently not in a position to offer any relocation assistance for this role; candidates must be local to the SF Bay Area or willing to relocate entirely at their own expense.
  • Department: Data Science & Analytics
  • About Lily AI
  • At Lily AI, we are bridging the gap between how retailers describe their products ("merchant-speak") and how real people actually search for them ("customer-speak"). We are an AI-powered e-commerce product discovery platform trusted by top-tier global brands in fashion, home, and beauty. By automatically extracting rich product details and enriching metadata, we optimize search, discovery, and conversion across on-site search, SEO, Generative Engine Optimization (GEO), and paid ads.
  • Backed by an unmatched blend of retailer, public, and proprietary data—featuring over 1 billion data points and 1 million manually labeled training products—we are scaling rapidly and looking for an exceptional Data Scientist to help us build the next generation of e-commerce AI.
  • The Role
  • As a Data Scientist at Lily AI, you will play a critical role in building and refining the core intelligence of our platform. You will build, fine-tune, and deploy advanced NLP systems—blending Large Language Models (LLMs) with high-efficiency traditional ML. You'll work on everything from fine-tuning foundational models for proprietary text generation to building lightning-fast classifiers and semantic search algorithms that can process billions of data points with minimal latency, directly impacting the bottom line for the world's biggest brands.
  • Key Responsibilities
  • Advanced Predictive Modeling: Design, develop, and evaluate statistical and machine learning models to solve complex, unstructured business problems. Move beyond out-of-the-box algorithms by designing custom loss functions, running rigorous A/B tests, and validating model assumptions so that results are statistically sound.
  • Modern NLP & LLM Development: Fine-tune foundational open-source models (e.g., Llama 3, Mistral) and build robust RAG (Retrieval-Augmented Generation) pipelines to extract granular product attributes and generate conversion-driving copy.
  • Scalable Machine Learning: Design systems that balance state-of-the-art accuracy with real-time e-commerce constraints. You will know exactly when to deploy a heavy LLM and when a lightweight classifier, custom embedding model, or specialized vector search is the right tool for latency and cost.
  • Semantic Search & Information Retrieval: Implement and optimize vector-based search and retrieval systems to map "customer-speak" to "merchant-speak" with lightning-fast execution.
  • Production Optimization & Evaluation: Monitor model performance in production, implement robust evaluation frameworks for generative outputs (mitigating hallucinations in retail taxonomy), and optimize inference for high-traffic environments.
  • Cross-Functional AI Strategy: Partner closely with data engineering, ML ops, and product teams to transition cutting-edge generative prototypes into highly scalable, revenue-driving production endpoints.
  • What You Bring to the Table
  • Education: Bachelor’s, Master’s, or Ph.D. in Computer Science, Data Science, Mathematics, Statistics, or a related quantitative field.
  • Experience: 2–5 years of hands-on experience building, deploying, and maintaining machine learning models in a production environment.
  • Statistical & Modeling Foundations: Deep understanding of exploratory data analysis, hypothesis testing, regression frameworks, and experimental design. Proven ability to translate abstract product questions into formal mathematical formulations and predictive frameworks.
  • Modern AI Technical Stack:
      • Strong proficiency in Python and SQL.
      • Deep experience with modern deep learning and NLP ecosystems (PyTorch, Hugging Face, transformers).
      • Familiarity with LLM orchestration, deployment, and fine-tuning (e.g., LangChain, vLLM, LoRA/QLoRA).
      • Experience with vector databases (e.g., Pinecone, Milvus, Weaviate) and embedding models.
  • Architectural Pragmatism: A highly analytical mindset with a proven track record of deciding when to use Generative AI versus traditional ML based on cost, latency, and determinism.
  • Bonus Points: Experience in e-commerce, search relevance algorithms, recommendation systems, or high-scale retail data environments.
  • Why Join Us?
  • Scale & Impact: Work with a proprietary dataset of over 1 billion data points that you can't find anywhere else.
  • Growth: We are a fast-growing, recognized leader in retail AI (recently named to the Deloitte Technology Fast 500).
  • Culture: Join a collaborative, innovative team that values experimentation, diversity of thought, and building products that solve real-world problems.
  • Benefits: Competitive salary, equity, comprehensive healthcare, and flexible PTO.
  • Salary: $160,000 per year
  • How to Apply:
  • Ready to build the future of retail AI? Apply directly here: https://forms.gle/Lc1UC1dL6vLGyaRn7
  • Lily AI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
