VAST Innovation Placement: Building the Next Generation of Intelligent Digital Research Infrastructure

In collaboration with VAST Data, DiRAC is pleased to invite applications for a 6-month Innovation Placement in building the next generation of intelligent digital research infrastructure.

The proposed start date for the placement is late July 2026.

Applications can be made using the form below by the deadline of 23:59, 27th May 2026.

For more information about how an Innovation Placement can benefit your career, as well as testimonials from previous placement students, see our Innovation Placements landing page.

VAST Data is an AI-native “Thinking Machine” architecture that consolidates the disparate silos of high-performance computing (HPC)—including parallel file systems, object storage, and structured databases—into a single, unified global namespace. 

By leveraging a Disaggregated Shared-Everything (DASE) architecture, it eliminates the traditional trade-offs between the massive throughput required for checkpointing and the high-concurrency IOPS needed for random-access inference or multi-agent simulations. For researchers, this translates to a “zero-tier” storage environment where all data resides on high-density flash, effectively removing the need for complex data staging or manual tuning. 

With integrated services like the VAST DataEngine and VAST DataBase, the platform not only stores exascale datasets but also provides the computational triggers and metadata indexing necessary to automate the entire research lifecycle, from initial data ingestion to long-context agentic reasoning. 

VAST Data’s fundamental architecture makes it the only platform able to provide an end-to-end AI ecosystem with the governance and security needed to maintain public trust.

Project

The placement invites candidates to build the next generation of a Cognitive Open Research Environment by developing an AI-enabled Federated Digital Research Environment.

The successful applicant will work with cutting-edge infrastructure, including serverless data pipelines and the VAST Data AI Operating System, to transform passive document repositories into interactive knowledge services using Retrieval-Augmented Generation (RAG) and agentic interaction.

As a participant in this project, you will play a key role in developing and orchestrating LLM-powered conversational agents using the Common agent framework and Python. Your work will involve managing complex data pipelines, including automated document ingestion, chunking, and vector embedding, through the VAST DataEngine and VastDB. You will also integrate state-of-the-art NVIDIA NIM endpoints for inference and reranking to refine the system’s RAG capabilities. By the end of the placement, you will have contributed to a significant reference architecture and co-authored a white paper that documents these research outcomes for the global academic and technical community.

Applicant profile

We are looking for someone with strong team spirit who can also work independently, who can represent themselves and the team both inside and outside the company, and who has a knack for finding fun in the complex and difficult.

Responsibilities

Agent Development & Integration: Design, build, and test LLM-powered conversational agents using Python, FastAPI, and the Common agent framework.

Data Pipeline Management: Configure and optimize serverless data ingestion pipelines, document chunking, and vector embedding generation using the VAST DataEngine and VastDB.

AI/ML Orchestration: Integrate and evaluate NVIDIA NIM endpoints for LLM inference and reranking within the VAST InsightEngine RAG architecture.

System Deployment: Assist in deploying and maintaining the application stack within a containerized environment (Docker/Kubernetes).

Documentation & Output Generation: Author a comprehensive reference architecture, contribute to a white paper/conference paper on the project’s findings, and package open-source tools for the academic community.

Collaboration: Work closely with the host organization, DiRAC representatives, and the wider development team to share technology and ensure project milestones are met.

Skills & experience

We are looking for candidates who have:

  • Minimum of a BSc or MSc in Computer Science, Software Engineering, Artificial Intelligence, Data Science, or a related technical field.
  • Strong programming skills, particularly in Python (experience with web frameworks like FastAPI is highly desirable).
  • Foundational understanding of, or practical experience with, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and vector databases.
  • Familiarity with modern deployment stacks, including containerization (Docker) and orchestration (Kubernetes).
  • Strong grasp of software design, execution, automation, and testing metrics.
  • Ability to work effectively as a member of a geographically distributed development team and collaborate across complex organizations.
  • Excellent written and verbal communication skills, with the ability to translate complex technical architectures into clear reference documentation and academic papers.
  • Excellent time management skills, capable of independently prioritizing tasks and meeting deadlines in a fast-paced, innovative environment.
  • Highly effective analytical, problem-solving, and decision-making capabilities.

Diversity, Inclusion & Belonging

We welcome applications from all, regardless of background.

Placement Details

The successful candidate will remain based at their home university. We do our best to offer flexibility; part-time working can be arranged as long as the placement does not exceed 1 year.

If you have any questions, please email them to DiRAC_placements@leicester.ac.uk

How to apply

Placements are open to PhD students and are fully funded. You must get your supervisor’s permission before applying: under UKRI rules, participation in the scheme is only allowed with their consent.

Apply using the form below by 23:59, 27th May 2026.

VAST Data IP application