Hey, I'm Josh!

ML Engineer @ Doctrine
I teach LLMs to decipher legal documents so lawyers can focus on what matters. Carved a path from psychology and economics to ML engineering.

What have I learned along the way?
Put people first. Let AI follow.

Cartoon portrait of Joshua Hiepler, ML Engineer with short brown hair wearing a dark suit
Real portrait of Joshua Hiepler

About Me

I currently work as an ML Engineer at Doctrine, France's first AI legal platform, developing applications that help lawyers efficiently analyze documents and conduct legal research. Previously, I designed sentiment-based financial forecasting models and advanced knowledge-extraction pipelines at QuantCube Technology.

My academic journey began with a dual degree in Economics and Psychology, followed by an MSc in Data Sciences, specialising in NLP and advanced deep learning techniques. This interdisciplinary background has proven valuable: understanding human behavior turns out to be quite useful when teaching machines to interpret legal and financial texts.

Over the past eight years, I've lived and worked in Japan, the UK, Hong Kong, and France. These experiences have shaped my approach to both technology and collaboration, offering perspectives I wouldn't have gained otherwise. They've also allowed me to build an invaluable network of friends and talented people across the world.

Resume

PDF resume download will be available by May 2025. Please use the online resume for now.

Experience

ML Engineer

Doctrine

Jun. 2025 - Present Paris, France
  • Building Doctrine's legal platform for the German market

NLP Data Scientist

QuantCube Technology

Jan. 2023 - Jun. 2025 Paris, France
  • Spearheaded development of a RAG pipeline that cut analyst report preparation time by 90%; directed a small team to scale the solution across multiple financial indicators.
  • Optimised open-source LLMs to run at 2x inference speed using AWQ quantization and vLLM.
  • Developed LLM pipeline in 4-person team to extract financial knowledge graphs from news articles, enabling real-time mapping of relationships between 1000+ companies to uncover market trends.
  • Engineered scalable NER pipeline processing 30K+ daily news articles across 10 languages; reduced compute costs by 60% while maintaining >90% accuracy using spaCy/Transformers.
  • Fine-tuned BERT sentiment classification models for financial news analysis using PyTorch, beating leading open-source models by >10% in Russian and German.

NLP Data Scientist Intern

QuantCube Technology

Jul. 2022 - Dec. 2022 Paris, France
  • Developed a streamlined BERT fine-tuning pipeline using HuggingFace Trainer and PyTorch.
  • Engineered a BERT-based sentiment classification pipeline, analyzing 100+ news articles daily to generate commodity trading signals via a custom-trained CatBoost model.

Corporate Research Intern (Capstone Project - Data Science)

Atos

Jan. 2022 - Jun. 2022 Paris, France
  • Built ultrasound-based stroke detection model using LightGBM, reducing diagnosis time by 75%.

Junior Network Security Engineer

NWCS - Networker Consulting Services

Jul. 2016 - Jun. 2019 Dreieich, Germany
  • Conducted penetration testing for financial institutions and delivered detailed technical reports.

Education

MSc in Data Sciences & Business Analytics (With Distinction)

CentraleSupélec & ESSEC Business School

Sep. 2021 - Jun. 2022 Paris, France

Key courses: NLP, Advanced Deep Learning, Advanced Machine Learning, Big Data

MA (SocSci)(Hons) in Economics & Psychology (First Class)

The University of Glasgow

Sep. 2017 - Jun. 2021 Glasgow, UK

Key courses: Statistical Models, Econometrics, Game Theory, Quantitative Research

International Exchange (3.68 GPA, equivalent to First Class)

The University of Hong Kong

Aug. 2019 - Dec. 2019 Hong Kong

Certifications

AI Agents Course

HuggingFace ID: jthiepler

Jun. 2025

Generative AI with Large Language Models

DeepLearning.AI ID: UY48SBM9HUTD

Nov. 2023

Machine Learning

Stanford Online ID: XXWZLRCY9B7N

Jul. 2020

Skills

ML/NLP

PyTorch, Transformers, vLLM, RAG (LangChain, Weaviate), spaCy, scikit-learn

Programming

Python, R, SQL, pandas

Infrastructure

AWS (EC2/S3/Athena), Docker, MLflow, Streamlit, Bash/Shell

Languages

German (Native), English (C2), French (A2/B1)

Projects

AI Research Summarization Dashboard

Python, Gemini API, Streamlit
  • Built LLM-powered dashboard to keep up with high volume of current AI research and developments.
  • Queries research papers & newsletters via RSS and email to generate condensed daily summaries.

Neural Network Voice Classifier (Dissertation)

Python, Keras, TensorFlow, Librosa
  • Implemented and trained neural network voice classifier architecture from scratch using Keras and TensorFlow.
  • Built pipeline to extract acoustic features from voice data (pitch, mel-frequency cepstral coefficients, etc.).
  • Used model as the core component of my undergraduate thesis research on voice analysis.

Get in Touch

Always happy to exchange ideas, stories, and coffee shop recommendations. Warning: may get overly excited about AI and good espresso.

Connect With Me

Based in Paris, France