GSoC 2026 Interest: Expanding GenAI Pipelines and Core GGUF Infrastructure #34207

Ashitpatel001 · 2026-02-20T04:46:30Z

Ashitpatel001
Feb 20, 2026

Hi OpenVINO Team,

I am Ashit Patel, an active contributor across the OpenVINO ecosystem. I am excited to announce that I am preparing two high-impact proposals for GSoC 2026: Project 4 (LTX Image-to-Video) and Project 15 (GGUF Reader v2 for Direct Execution).

My recent contributions directly address the complex technical challenges associated with these "Hard" category projects:

Technical Track Record

Core C++ & Graph Logic: Successfully implemented the 'flip' preprocessing step in the OpenVINO Core repository (openvinotoolkit/openvino#34135). This involved internal operator management, tensor manipulation, and adherence to strict C++ development standards.
GenAI C API: Developed the C API bindings for Text-to-Video in the OpenVINO GenAI repository (openvinotoolkit/openvino.genai#3331). This provided me with deep architectural familiarity with the openvino_genai repository and its optimized execution methods.
Vision & Media Pipelines: Built the C++ Video Style Transfer sample (openvinotoolkit/openvino.genai#3269) and a Live VLM Chat C++ sample (openvinotoolkit/openvino.genai#3308), handling real-time visual data streams via OpenCV and complex inference loops.
Performance Engineering: Conducted rigorous PyTorch vs. OpenVINO INT4 latency benchmarks in the OpenVINO Notebooks repository (openvinotoolkit/openvino_notebooks#3245) to validate quantization and optimization strategies on Intel hardware.

Why These Projects?

Project 4: LTX Image-to-Video Support

I aim to leverage my experience with the Video GenAI C API and diffusion-based pipelines to ensure seamless Image-to-Video (I2V) parity between Python and C++. My focus will be on maintaining minimal memory overhead for latent space denoising on Intel AI PCs.

Project 15: GGUF Reader v2 (Dynamic Execution)

I plan to utilize my understanding of OpenVINO's frontend architecture and GgmlOVDecoder to transition from static model reconstruction to a dynamic, scalable GGML computation graph translation. This will significantly broaden OpenVINO's native support for the GGUF ecosystem.
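To make the idea of op-by-op graph translation concrete, here is a toy sketch in plain Python. It is not the GgmlOVDecoder design or any OpenVINO API: the `Node` type, the `TRANSLATORS` table, and `translate_graph` are hypothetical names, and the output strings are only labels for the target ops. A real translator would also have to handle tensor layouts, shapes, and data types.

```python
from typing import Callable, Dict, List, NamedTuple


class Node(NamedTuple):
    """One node of a toy computation graph (hypothetical structure)."""
    name: str
    op: str
    inputs: List[str]


# Hypothetical translator table: one small function per source op type.
# The strings are illustrative labels, not real API calls.
TRANSLATORS: Dict[str, Callable[[List[str]], str]] = {
    "ADD": lambda args: f"Add({args[0]}, {args[1]})",
    "MUL": lambda args: f"Multiply({args[0]}, {args[1]})",
    "MUL_MAT": lambda args: f"MatMul({args[0]}, {args[1]})",
}


def translate_graph(nodes: List[Node], graph_inputs: List[str]) -> Dict[str, str]:
    """Translate nodes one by one, assuming they arrive in topological order."""
    translated = {name: f"Parameter({name})" for name in graph_inputs}
    for node in nodes:
        if node.op not in TRANSLATORS:
            # Unsupported ops fail loudly instead of being silently skipped.
            raise NotImplementedError(f"no translator for op {node.op!r}")
        args = [translated[i] for i in node.inputs]
        translated[node.name] = TRANSLATORS[node.op](args)
    return translated


graph = [
    Node("h", "MUL_MAT", ["w", "x"]),
    Node("y", "ADD", ["h", "b"]),
]
result = translate_graph(graph, ["w", "x", "b"])
print(result["y"])
```

The point of the table-driven layout is that supporting a new op means registering one more translator rather than reconstructing a whole model architecture by hand.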

I am targeting the 350-hour project size for both proposals and look forward to discussing the architectural specifics with the mentors @likholat, @sgonorov, @cavusmustafa, and @ravi9.

Best regards,
Ashit Patel

Ashitpatel001 · 2026-02-22T18:12:59Z

Ashitpatel001
Feb 22, 2026
Author

Hi @likholat, @sgonorov,

I've been spending the weekend analyzing the openvino.genai codebase (specifically Text2VideoPipeline) and the LTX-Video architecture to ensure my GSoC proposal is technically sound.

While looking into the Image-to-Video (I2V) requirements, I realized that because LTX is a Diffusion Transformer (DiT) with a highly compressed Video-VAE rather than a traditional 3D U-Net, injecting the image conditioning requires a specific architectural choice.

I weighed two potential paths: building a multimodal cross-attention adapter versus using Latent Initialization. Since adding a new fusion layer might require altering the pretrained transformer weights or assuming multimodal support that isn't native to the base DiT, I believe Latent Initialization is the architecturally safer and more performant path for C++.

My thought process for the implementation is:

Take the input ov::Tensor image and run it through the OpenVINO GenAI VAE encoder to generate the anchor latents.
Apply the scheduler's noise for the generated frames.
Concatenate them to form the starting sequence before passing it to the RoPE-enabled transformer.
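The three steps above can be sketched in plain Python as follows. This is a minimal illustration, not the pipeline's actual code: `init_video_latents`, the flat-list frame layout, the `sigma` noise scale, and the conditioning mask are all hypothetical, and the anchor latents stand in for whatever the VAE encoder would return.

```python
import random
from typing import List, Tuple

Frame = List[float]


def init_video_latents(
    anchor_latents: List[Frame],
    num_frames: int,
    sigma: float = 1.0,
    seed: int = 0,
) -> Tuple[List[Frame], List[float]]:
    """Build the starting sequence: clean anchor frames followed by noise frames.

    anchor_latents is a stand-in for the VAE encoder output (assumption).
    sigma is an assumed initial noise scale supplied by the scheduler.
    """
    if not anchor_latents:
        raise ValueError("need at least one anchor latent frame")
    if num_frames < len(anchor_latents):
        raise ValueError("num_frames must cover the anchor frames")

    rng = random.Random(seed)
    frame_size = len(anchor_latents[0])
    num_noise = num_frames - len(anchor_latents)

    # Step 2: pure Gaussian noise for every frame that must be generated.
    noise_frames = [
        [rng.gauss(0.0, sigma) for _ in range(frame_size)]
        for _ in range(num_noise)
    ]

    # Step 3: concatenate along the frame axis, anchors first.
    latents = [list(f) for f in anchor_latents] + noise_frames

    # 1.0 marks frames carrying image conditioning, 0.0 marks noise frames.
    mask = [1.0] * len(anchor_latents) + [0.0] * num_noise
    return latents, mask


latents, mask = init_video_latents([[0.5, -0.5, 0.25]], num_frames=4)
print(len(latents), mask)
```

Returning the mask alongside the latents keeps track of which frames must stay fixed during denoising, which is the information a later step would need to treat the anchor frame differently from the noisy ones.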

This seems like the best way to ensure strict parity with the Python LTXConditionPipeline while keeping memory overhead low.

Before I finalize the milestones in my proposal, I wanted to humbly ask for your feedback on this: Is this VAE-to-Latent initialization approach the correct implementation direction for the C++ pipeline, or is the team envisioning a different architectural path?

0 replies

ravi9 · 2026-02-24T21:40:02Z

ravi9
Feb 24, 2026
Collaborator

Hi @Ashitpatel001!
Thank you for your interest in Project 15: GGUF Reader v2.
We look forward to your proposal!

Thanks!

0 replies

Ashitpatel001 · 2026-02-25T03:52:23Z

Ashitpatel001
Feb 25, 2026
Author

Hi @ravi9! Thank you for the warm welcome. I am actively researching the GGUF Reader v2 architecture (specifically the llama.cpp graph translation) and will share my draft proposal soon. In the meantime, I'm wrapping up this C API PR to familiarize myself with the OpenVINO GenAI coding standards and tensor management. Excited to contribute!

0 replies

likholat · 2026-03-23T19:54:03Z

likholat
Mar 23, 2026
Collaborator

Hi Ashit,

Sorry for the late reply, and thank you for your application!

Please note that the final proposal must be submitted through the GSoC portal webapp to be considered.

Could you also please attach the PR/PRs you've worked on in OpenVINO (OpenVINO GenAI)? At the moment, solving good first issues or making a solid contribution is the requirement for applying.

As for Project 4 specifically, both the implementation of image-to-video support and the planning of its architecture are expected to be part of the program, rather than something that is already planned to be added beforehand.

Thanks again for your interest.

Regards,
Anna

1 reply

Ashitpatel001 Mar 24, 2026
Author

Hi @likholat,

Thank you for looking over the proposal! Regarding my contributions to the openvino.genai repository, I have focused heavily on the C++ APIs and video generation pipelines to prepare for this exact GSoC project.

Here are my active contributions to the GenAI repo:

Implement C API bindings for Text-to-Video Generation (openvino.genai#3331)
[Feature] Add Live VLM Chat C++ sample (Webcam/OpenCV) (openvino.genai#3308)
[Sample] Add C++ video_style_transfer sample (openvino.genai#3269)

Beyond openvino.genai, I actively contribute across the wider AI ecosystem to bridge the gap between research and low-level hardware optimization.

I currently have active PRs in the main OpenVINO engine (Add flip right/left preprocess step - #34135) and the Notebooks repository (profiling PyTorch vs. OpenVINO inference latency - #3245).

I have also previously contributed to Google DeepMind's open source repositories. Working with top-tier research codebases has strengthened my ability to translate complex architectures into efficient, production-ready systems.

I will send my GSoC proposal via email in the next 1-2 days and look forward to your review.

Thank you!

Ashitpatel001 · 2026-03-27T07:22:09Z

Ashitpatel001
Mar 27, 2026
Author

Hi @likholat, @sgonorov, I have sent the final draft of my GSoC proposal via email. I would really appreciate any high-level feedback or red flags you might have when you get a chance to review it.

I also had one quick architectural question regarding the LTX Image-to-Video implementation:
Should the timestep conditioning array be expected as a direct input node already present in the exported OpenVINO IR, or will that need to be constructed and handled dynamically during the integration phase?
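To illustrate the second option in the question, here is a minimal sketch of how a per-frame timestep array could be built on the host side at each denoising step. It rests on an assumption that is not confirmed in this thread: that image-conditioned frames receive timestep 0 while all other frames receive the current scheduler timestep. The function name and the mask layout are hypothetical.

```python
from typing import List


def per_frame_timesteps(t: float, conditioning_mask: List[float]) -> List[float]:
    """Scale the current timestep by (1 - mask) for each frame.

    conditioning_mask holds 1.0 for frames fixed by the input image and
    0.0 for frames still being denoised (assumed convention).
    """
    return [t * (1.0 - m) for m in conditioning_mask]


timesteps = per_frame_timesteps(800.0, [1.0, 0.0, 0.0])
print(timesteps)
```

If the exported IR already exposes a timestep input of matching shape, an array like this would simply be copied into that input; otherwise the masking would have to happen inside the model or in a wrapper around it.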

Thanks again for your time and guidance!

0 replies
