Safouane Chergui
Lead Data Scientist | Paris, France
chsafouane@gmail.com | LinkedIn | GitHub
Technical Expertise
- ML/AI Domains: NLP/LLMs, RAG Systems, Search & Recommendation Engines, Computer Vision
- Cloud & Infrastructure: GCP (Vertex AI, BigQuery), AWS, Docker, Kubernetes, MLflow, Airflow
- Specialized: LLM Evaluation, Production ML Monitoring, Model Drift Detection, Agentic Systems
Professional Experience
Founder & AI Engineer | LumiereAI | April 2025 - Current
- Built production agentic system for insurance companies using Google ADK framework and Vertex AI Gemini API
- Designed multi-modal pipeline integrating ElevenLabs voice synthesis with Gemini-powered insurer case analysis
- Architected end-to-end solution on GCP: Cloud Run for serving, BigQuery for analytics, Cloud Storage for document ingestion
Lead Data Scientist | EDF | April 2025 - Current
- Architected document parsing and RAG pipeline on GCP using Vertex AI embeddings and Vector Search for semantic retrieval over internal knowledge bases
- Built LLM evaluation framework leveraging Vertex AI Model Evaluation to benchmark Gemini and custom models on business-critical tasks
- Deployed production inference endpoints using Vertex AI Prediction with automated scaling and monitoring
Senior Data Scientist | Doctolib | Sep 2024 - April 2025
- Architected and deployed a RAG system reducing customer support costs by 20%
- Enhanced a phone assistant that lets patients book medical appointments using fast search and agents
Senior Data Scientist | Mirakl | Oct 2023 - Aug 2024
- Built & deployed an LLM-powered catalog integration system reducing product onboarding time from 20+ days to hours
- Implemented end-to-end MLOps pipeline for model versioning and automated retraining using MLflow
- Trained & deployed small LLMs for domain-specific tasks, reducing third-party API costs by 10x
Senior Data Scientist | TheFork (TripAdvisor) | May 2022 - Oct 2023
- Designed recommendation algorithms driving 35% increase in page visits across web, mobile, and email channels
- Pioneered LLM-based review summarization system processing 500K+ reviews monthly
- Optimized search functionality using semantic embeddings (Elasticsearch, pgvector)
- Built collaborative filtering and content-based recommendation models serving 20M+ users
Data Scientist | EDF | Dec 2018 - May 2022
- Deployed production ML models for customer segmentation, churn prediction (AUC 0.92), and LTV modeling
- Designed real-time drift detection system monitoring 15+ models in production
- Built A/B testing framework and interactive dashboard reducing experiment analysis time by 60%
- Developed predictive maintenance models using signal processing and deep learning (CNN/RNN architectures)
Key Achievements & Leadership
- Production Impact: Deployed 20+ ML models to production, including Vertex AI endpoints, with measurable business and cost impact
- Technical Leadership: Mentored 5+ junior data scientists, conducted Python/R training sessions at EDF
- Cross-functional Collaboration: Led projects spanning engineering, product, and business teams
Education & Certifications
- MSc Engineering - INSA Lyon (2013-2018)
- Deep Learning Specialization - Coursera
- Statistical Learning - Stanford Online
Languages
- English (Native)
- French (Professional)
- Arabic (Native)
- Spanish (Fluent)
- Mandarin (Learning)