Hey, I'm Josh!
ML Engineer @ Doctrine
I teach LLMs to decipher legal documents so lawyers can focus on what matters. Carved a path from psychology and economics to ML engineering.
What have I learned along the way?
Put people first. Let AI follow.


About Me
I currently work as an ML Engineer at Doctrine, France's first AI legal platform, developing applications that help lawyers efficiently analyze documents and conduct legal research. Previously, I designed sentiment-based financial forecasting models and advanced knowledge-extraction pipelines at QuantCube Technology.
My academic journey began with a dual degree in Economics and Psychology, followed by an MSc in Data Sciences, specialising in NLP and advanced deep learning techniques. This interdisciplinary background has proven valuable: understanding human behavior turns out to be quite useful when teaching machines to interpret legal and financial texts.
Over the past eight years, I've lived and worked in Japan, the UK, Hong Kong, and France. These experiences have shaped my approach to both technology and collaboration, offering perspectives I wouldn't have gained otherwise. They've also allowed me to build an invaluable network of friends and talented people across the world.
Resume
Experience
ML Engineer
Doctrine
- Building Doctrine's legal platform for the German market.
NLP Data Scientist
QuantCube Technology
- Spearheaded development of RAG pipeline that cut analyst report preparation time by 90%; currently directing small team to scale solution across multiple financial indicators.
- Optimised open-source LLMs to run at 2x inference speed using AWQ quantization and vLLM.
- Developed LLM pipeline in 4-person team to extract financial knowledge graphs from news articles, enabling real-time mapping of relationships between 1000+ companies to uncover market trends.
- Engineered scalable NER pipeline processing 30K+ daily news articles across 10 languages; reduced compute costs by 60% while maintaining >90% accuracy using spaCy/Transformers.
- Fine-tuned BERT sentiment classification models for financial news analysis using PyTorch, beating leading open-source models by >10% in Russian and German.
NLP Data Scientist Intern
QuantCube Technology
- Developed a streamlined BERT fine-tuning pipeline using HuggingFace Trainer and PyTorch.
- Engineered BERT-based sentiment classification pipeline, analyzing 100+ news articles daily to generate commodity trading signals via a custom-trained CatBoost model.
Corporate Research Intern (Capstone Project - Data Science)
Atos
- Built ultrasound-based stroke detection model using LightGBM, reducing diagnosis time by 75%.
Junior Network Security Engineer
NWCS - Networker Consulting Services
- Conducted penetration testing for financial institutions and delivered detailed technical reports.
Education
MSc in Data Sciences & Business Analytics (With Distinction)
CentraleSupélec & ESSEC Business School
Key courses: NLP, Advanced Deep Learning, Advanced Machine Learning, Big Data
MA (SocSci)(Hons) in Economics & Psychology (First Class)
The University of Glasgow
Key courses: Statistical Models, Econometrics, Game Theory, Quantitative Research
International Exchange (3.68 GPA, equivalent to First Class)
The University of Hong Kong
Certifications
Skills
ML/NLP
Programming
Infrastructure
Languages
Projects
AI Research Summarization Dashboard
- Built LLM-powered dashboard to keep up with high volume of current AI research and developments.
- Retrieves research papers & newsletters via RSS and email to generate condensed daily summaries.
Neural Network Voice Classifier (Dissertation)
- Implemented and trained neural network voice classifier architecture from scratch using Keras and TensorFlow.
- Built pipeline to extract acoustic features from voice data (pitch, mel-frequency cepstral coefficients, etc.).
- Used model as the core component of my undergraduate thesis research on voice analysis.