Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection

Ziyu Yang, Guibin Chen, Yuxin Yang et al.

January 14, 2026 | Score: 7.7 | Deep analyzed

Interest Score Breakdown

Seismic Impact (30%): 7.0/10. Industry-wide significance

Ecosystem Relevance (70%): 8.0/10. Applicable to your apps

Abstract

Multi-Task Learning (MTL) combined with Low-Rank Adaptation (LoRA) has emerged as a promising direction for parameter-efficient deployment of Large Language Models (LLMs). By sharing a single adapter across multiple tasks, one can significantly reduce storage overhead. However, this approach suffers from negative transfer, where conflicting gradient updates from distinct tasks degrade the performance of individual tasks compared to single-task fine-tuning. This problem is exacerbated in LoRA due to the low-rank constraint, which limits the optimization landscape's capacity to accommodate diverse task requirements. In this paper, we propose Ortho-LoRA, a gradient projection method specifically tailored for the bipartite structure of LoRA. Ortho-LoRA dynamically projects conflicting task gradients onto the orthogonal complement of each other within the intrinsic LoRA subspace. Extensive experiments on the GLUE benchmark demonstrate that Ortho-LoRA effectively mitigates task interference, outperforming standard joint training and recovering 95% of the performance gap between multi-task and single-task baselines with negligible computational overhead.
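
The abstract's central operation, projecting one task's gradient onto the orthogonal complement of a conflicting task's gradient, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a PCGrad-style rule (project only when the dot product is negative) applied separately to the two LoRA factors A and B, which is one possible reading of "bipartite structure". The function names `project_conflict` and `ortho_combine`, the dict layout, and the final averaging step are all hypothetical.

```python
import numpy as np

def project_conflict(g_i, g_j, eps=1e-12):
    """Return g_i with its component along g_j removed, but only when the
    two gradients conflict (negative inner product). Assumed rule, PCGrad-style."""
    dot = float(np.vdot(g_i, g_j))  # vdot flattens matrices before the dot product
    if dot >= 0.0:
        return g_i.copy()  # no conflict: leave the gradient unchanged
    return g_i - (dot / (float(np.vdot(g_j, g_j)) + eps)) * g_j

def ortho_combine(task_grads):
    """Combine per-task LoRA gradients.

    task_grads: list of dicts {"A": ndarray, "B": ndarray}, one per task
    (hypothetical layout). Each factor is handled independently.
    """
    combined = {}
    for name in ("A", "B"):
        originals = [g[name] for g in task_grads]
        adjusted = []
        for i, g_i in enumerate(originals):
            g = g_i.copy()
            for j, g_j in enumerate(originals):
                if i != j:
                    # Project against the other task's original gradient.
                    g = project_conflict(g, g_j)
            adjusted.append(g)
        # Average the de-conflicted gradients (an assumption of this sketch).
        combined[name] = np.mean(adjusted, axis=0)
    return combined

# Two toy tasks whose gradients on factor A point in conflicting directions.
task_1 = {"A": np.array([[1.0, 0.0]]), "B": np.array([[1.0], [0.0]])}
task_2 = {"A": np.array([[-1.0, 1.0]]), "B": np.array([[-1.0], [1.0]])}
update = ortho_combine([task_1, task_2])
```

With two tasks the visiting order is irrelevant. With more tasks the result depends on the order in which the other tasks' gradients are visited, which this sketch fixes arbitrarily.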

How to Use in Your Ecosystem

In Zac's ecosystem, Ortho-LoRA could be applied to AI-powered agents such as the orchestrator and the specialized task agents to improve multi-task learning efficiency. For prediction market apps like trading and soccer_elo, the technique could help train more robust models that handle diverse forecasting tasks without performance degradation, potentially improving the Claude-powered model's ability to context-switch between domain-specific prediction scenarios.
