Projects 💻


Topic Extraction & Dynamic Visualization

Graph View Image

This project concentrated on extracting keywords efficiently from concise submission titles, aiming to generate topics within 2-3 seconds to preserve the responsiveness of the web application. Additionally, by analyzing patterns in the data, we were able to automatically categorize submissions based on their topics or the nature of the submission, such as explanations, applications, or papers. To enhance user experience, we visualized the data in a graph-like hierarchical view, complete with drill-down and roll-up functionality.

Unit Test Generation using LLM and CodeQL

GenAI + SWE Image

The project aims to improve the accuracy of generating unit tests for Java methods using ChatGPT by addressing challenges linked to complex functions and limited context. By integrating CodeQL for static analysis, the project can extract detailed metadata from the codebase, enhancing ChatGPT's understanding of Java classes and their dependencies. This combined approach is expected to significantly improve the quality of unit tests generated, leveraging CodeQL's analysis strengths and ChatGPT's natural language processing capabilities.

End-to-End SQS -> PostgreSQL Data Pipeline

SQS ETL Image

In this project, I have developed and demonstrated a solution for piping data from AWS SQS to PostgreSQL. The entire project is containerized using Docker, ensuring easy deployment and scalability. Additionally, the project includes comprehensive unit tests to verify functionality and detailed documentation to support a production-ready implementation. This approach ensures reliability, maintainability, and ease of integration into existing systems.

Prompt Engineering for Code Tasks

Prompt Engg Image

The project aims to harness the power of ChatGPT, a language model, for automating various code-related processes. By employing ChatGPT, the project can generate inputs and outputs, unit tests, semantically equivalent code, and code mutants, streamlining code tasks and enhancing code quality. Extensive prompt engineering ensures ChatGPT comprehends the context and provides accurate responses, and a Python-based script enables multi-turn conversations, allowing users to iteratively refine queries for improved results.

Forest Migration

Forest ETL Image

In this project, I worked closely with internal stakeholders and clients to establish transformation rules for data processes. I developed a batch processing pipeline that utilized multi-threading to improve data transfer speeds, eliminate bottlenecks, and boost system throughput by 20%. Additionally, I employed optimization techniques to parallelize tasks like relation creation and dataset migration, which reduced latency by 15%.