M.A. Linguistics (Cambridge): Double first-class honours

I am an independent IT professional with two specialisations: natural language processing / machine learning, and security architecture. With 25 years of project experience across various industry verticals, I bring a rigorous, structured, and tenacious approach to software development. I have a strong instinct for architecture and design and a talent for creating robust and enduring products. I am a practitioner through and through but make a point of attending academic conferences to stay up to date with the latest research. Based in Munich, I hold German and British citizenship and am equally at home working in both languages.
+49 151 6243 5227
info@richardpaulhudson.com

Selected project experience

Freelance (from 03/2023)
Led design, development, model selection, fine-tuning, and prompt engineering for a RAG pipeline solution that answers questions from a large corpus of legal documents, now live and used by several hundred help-desk employees | Insurance
Developed tooling to manage data and model versions, enabling efficient porting of the above RAG solution to additional corpora, languages, and business use cases | Insurance
Designed and implemented a solution to extract text from PDF OCR layers with the correct semantic ordering and structure for further analysis by LLMs | Insurance
Explored various explainable AI methods within the RAG context | Insurance
Explosion AI, Berlin (11/2021 – 02/2023)
Core developer maintaining the spaCy and thinc libraries (Python / Cython) | AI
Investigated and implemented strategies to improve the accuracy of lemmatisation models across 17 languages while maintaining acceptable speed | AI
msg systems ag, Munich (02/2014 – 10/2021)
Designed and developed algorithms and systems to analyse the structures of legal texts and to detect anomalous wording in contract proposals | Insurance
Designed a cloud service to recognise incoming emails containing orders and to extract structured information from them | Logistics
Designed and developed an application with a Microsoft Word/PowerPoint plugin to recognise confidential text passages in internal documents and to remove them prior to external publication | Automotive

Selected open-source contributions

Library | Role | Description
Holmes | Sole author | Information extraction from English and German texts based on predicate logic; supports intelligent search.
Coreferee | Sole author | Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further languages.
spaCy | Former core maintainer | Industrial-strength Natural Language Processing in Python.
thinc | Former core maintainer | The equivalent to PyTorch in the spaCy stack.

Skills

NLP / Machine Learning: Deep learning; Explainable AI; GPU processing; Haystack; Hugging Face; LangChain; Large Language Models (LLM); LlamaIndex; Machine learning; Natural language processing (NLP); Neural networks; OpenSearch; Prompt engineering; PyTorch; Retrieval-Augmented Generation (RAG); spaCy; Transformers
Programming languages (ordered by experience): Python; Java; SQL; JEE; Bash; Cython; JavaScript; C++
Cloud providers: Azure; AWS; GCP
General technologies and frameworks: Ansible; Cassandra; Docker; Hadoop; Kafka; Kubernetes; MongoDB; MySQL; PostgreSQL; RabbitMQ
Operating systems: Unix; Linux; Ubuntu; macOS
Concepts: Application architecture; Big Data; Data lake; DevOps; Distributed systems; EAI; ETL; Integration; Messaging; Stakeholder management
Security: BSI Grundschutz; Business continuity; Certificates; Compliance; Cryptography; Data protection; DevSecOps; Disaster recovery; Encryption; GDPR/DSGVO; IAM; ISO standards; ISMS; Key management; Legal requirements; LLM security; Network security; NIST; PKI; Risk management; Security engineering; Security policies; Threat analysis

Selected publications and talks

Introducing Holmes 4.0, Explosion AI, 2022
Quanten-Computing: Zukunftstechnologie mit stark eingeschränktem Einsatzfeld, iX Developer, 2020
Wortgewandt: Natürliche Sprache zielgenau verarbeiten mit semantischer Textanalyse, iX Developer, 2020
KI-gestützte Textanalyse beim Releasemanagement, Softwareforen Leipzig, 2019
Censor Robots: Using AI to Redact Confidential Information, (ISC)² Secure Summit, London, 2019
Cybertwists: Hacking and Cyberattacks Explained, CreateSpace, 2018
Machine Learning Catalogue, msg, 2017