Danial Khilji

Data Scientist

€ 289/day
Preston, GB
3-7 years

Average response time: 1 hour

About Danial

I’m a Data Scientist specializing in knowledge representation, graph-based systems, and applied machine learning.

I design and productionize semantic systems for audience and identity mapping, using embeddings, probabilistic matching, and graph-based approaches. My work on the Audience Translator platform has enabled 10K+ users to map taxonomies to platforms like Meta and Google.

I build end-to-end solutions combining LLMs, retrieval (RAG), AI agents, and structured data, including taxonomy enrichment, automated validation workflows, and scalable Python pipelines.

More recently, I’ve focused on AI agent systems, developing a LangGraph-based multi-agent framework to automate ad-tech platform research, extracting API insights, audience taxonomies, and reach estimates.

I’m particularly interested in building intelligent systems at the intersection of structured knowledge, retrieval, and AI agents.

My portfolio:
  • English

    Bilingual / native

Can work on-site
Preston (up to 50 km)

Work experience

  • WPP
    Data Scientist
    HIGHTECH
    June 2023 - Present (3 years)
    London, UK
    Designed and optimized a semantic knowledge representation system that models relationships between audiences using semantic similarity and weighted graph similarity, enabling more than 10K users within a single quarter to map source taxonomies to ad-tech platforms like Meta and Google.

    Partnered with engineers to automate mapping pipelines using Airflow, reducing turnaround time from weeks to under one week through automated similarity scoring, KPI calculations, and reporting.

    Built and maintain a Python package for taxonomy similarity, adopted across teams, with features for data cleaning, KPI generation, model evaluation, results evaluation and configurable weighting of embedding models.

    Integrated pre-trained and fine-tuned BERT-based models, developed a weighted similarity algorithm, and continuously improved mapping performance.

    Developed an evaluation framework with text corruption strategies, ground-truth validation, and confusion matrices to measure model accuracy.

    Implemented LLM-based features (GPT-4, Gemini 1.5 Pro) for taxonomy enrichment and lightweight RAG, improving semantic accuracy without relying on external databases.

    Built an LLM supervision layer, using GPT-4/4o, to pre-validate mappings before human review, significantly reducing manual effort and improving efficiency.

    Investigated geo-targeted advertising inaccuracies in Snowflake, improving device and email matching via probabilistic methods; developed a QA framework for partner data validation and onboarding.

    Consulted on email matching optimization for campaigns using probabilistic matching algorithms and normalization techniques.

    Built a LangGraph-based multi-agent system for ad-tech platform research to retrieve API details, audience taxonomies, and reach estimates, and automatically generate validated markdown reports with details and code snippets.

    Developed an MCP server to expose the semantic similarity system to AI agents.
    Python Data science LLMs LangGraph Snowflake
  • Choreograph (WPP Company)
    Junior Data Scientist
    HIGHTECH
    March 2022 - June 2023 (1 year and 3 months)
    London, UK
    Participated in WPP Data Challenge #4 and #5. Winner of Data Challenge #5.

    Contributed to the Audience Knowledge Graph (AKG) project, developing the taxonomy mapping module, data cleaning steps for data clean rooms, and a Looker Studio dashboard for monitoring the progress of each module, among other tasks.

    Created a Python package to expose the taxonomy mapping project to different teams within the company. The package uses pre-trained BERT language models from the Hugging Face library to generate vector embeddings, which are then compared using cosine similarity.

    Focused on taxonomy mapping work, also known as Named Entity Resolution (NER). Improved the previously developed algorithm by removing loops and reducing time complexity from O(n) to O(1).
    Python python package Embedding models semantic similarity Machine Learning Algorithms
  • WPP
    WPP NextGen Leader
    HIGHTECH
    June 2022 - August 2022 (2 months)
    London, UK
    10-week internship at WPP, giving an understanding of how WPP, as a group of agencies, works in the creative world, helps its clients grow, and builds advertising campaigns.
    Advertising Tech Lead marketing campaigns
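
The taxonomy mapping work described in the entries above (vector embeddings compared by cosine similarity, with configurable weighting of embedding models) can be illustrated with a minimal sketch. This is not the actual package: the function names, model names, weights, and audience labels are all hypothetical, and the short vectors are toy stand-ins for real BERT embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def weighted_similarity(source_vecs, target_vecs, weights):
    """Combine per-model cosine similarities using normalized weights.

    source_vecs / target_vecs map a (hypothetical) model name to the
    embedding that model produced for the label being compared.
    """
    total = sum(weights.values())
    return sum(
        (weights[model] / total)
        * cosine_similarity(source_vecs[model], target_vecs[model])
        for model in weights
    )


def best_match(source_vecs, candidates, weights):
    """Return the candidate label with the highest weighted similarity."""
    return max(
        candidates,
        key=lambda name: weighted_similarity(source_vecs, candidates[name], weights),
    )


# Toy data: two hypothetical embedding models, two-dimensional vectors.
weights = {"model_a": 0.7, "model_b": 0.3}
source = {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]}
candidates = {
    "Sports fans": {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]},
    "Home cooks": {"model_a": [0.0, 1.0], "model_b": [1.0, 0.0]},
}

match = best_match(source, candidates, weights)
print(match)
```

In a real setting the vectors would come from the embedding models themselves, and the weights would be tuned against ground-truth mappings rather than set by hand.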

Education

  • MSc Applied
    University of Central Lancashire
    2021
  • B.E
    College of EME, National University of Sciences and Technology
    2020
