Multi-Model Agentic Performance for Code Creation and Editing
Ismar
Evidence is mounting that combining different models leads to better performance in code creation and editing tasks. We're already seeing multi-agent approaches in tools like Cline and Roo Cline, which support distinct "Plan" and "Act" modes.
Main idea: dynamic model selection
Different models have strengths based on file size, line length, and editing strategy (e.g., whole file versus diff).
For example, Repo Prompt employs a sophisticated multi-agent framework that selects models based on factors like file length and content complexity.
Services like Deepclaude also demonstrate a promising approach, making intelligent model choices dynamically.
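A minimal sketch of what selection on these factors could look like. The thresholds and model labels below are hypothetical placeholders chosen for illustration; they are not taken from Repo Prompt, Deepclaude, or Windsurf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditTask:
    line_count: int       # number of lines in the target file
    max_line_length: int  # longest line in the file, in characters
    strategy: str         # "whole_file" or "diff"


def select_model(task: EditTask) -> str:
    """Pick a model label from file size, line length, and edit strategy.

    All thresholds and labels are illustrative assumptions.
    """
    if task.strategy not in ("whole_file", "diff"):
        raise ValueError(f"unknown strategy: {task.strategy}")
    # Small files can be rewritten whole without much risk of truncation.
    if task.strategy == "whole_file" and task.line_count <= 300:
        return "small-file-rewriter"
    # Very long lines tend to break diff matching, so route them separately.
    if task.max_line_length > 200:
        return "long-line-specialist"
    # Everything else falls back to a diff-based editor.
    return "diff-editor"
```

In practice the thresholds would have to come from benchmarks of each model on each editing strategy rather than from fixed constants.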
I'm thinking of something along these lines:
Step 1: I request Windsurf to explain an aspect of the code or make a change.
Step 2: Windsurf intelligently selects the best model for the task—possibly Deepseek R1 or o3-mini at the time of writing—based on various factors including file content, line count, and task complexity.
Step 3: The system uses tool calls to analyze the necessary context (class hierarchy, function calls, etc.) to create a detailed plan.
Step 4: This plan is then passed to a specialized model optimized for file editing and code modifications, likely Claude Sonnet 3.5 at the time of writing.
Step 5: After the modification, the code undergoes rigorous quality checks (cyclomatic complexity, linting, adherence to SRP, DRY principles, etc.).
Step 6: The updated code is then presented as the final output.
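The steps above can be sketched as a small pipeline. This is only an outline under stated assumptions: every callable (choose_planner, gather_context, plan, edit, checks) is a hypothetical stand-in for a model call or tool call, and none of them is an existing Windsurf API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    planner: str
    plan: str
    code: str
    issues: List[str] = field(default_factory=list)


def run_pipeline(
    request: str,
    source: str,
    choose_planner: Callable[[str, str], str],
    gather_context: Callable[[str], str],
    plan: Callable[[str, str, str], str],
    edit: Callable[[str, str], str],
    checks: List[Callable[[str], List[str]]],
) -> PipelineResult:
    planner = choose_planner(request, source)    # Step 2: pick the planning model
    context = gather_context(source)             # Step 3: tool calls collect context
    plan_text = plan(planner, request, context)  # Step 3: planner writes the plan
    new_code = edit(plan_text, source)           # Step 4: editing model applies it
    # Step 5: run every quality check and collect the reported issues.
    issues = [msg for check in checks for msg in check(new_code)]
    return PipelineResult(planner, plan_text, new_code, issues)  # Step 6


def long_line_check(code: str) -> List[str]:
    """Example check: flag lines longer than 100 characters."""
    return [
        f"line {i} exceeds 100 characters"
        for i, line in enumerate(code.splitlines(), start=1)
        if len(line) > 100
    ]


# Usage with trivial stand-ins in place of real model and tool calls.
result = run_pipeline(
    request="rename foo to bar",
    source="def foo():\n    return 1\n",
    choose_planner=lambda req, src: "reasoning-model",
    gather_context=lambda src: f"{len(src.splitlines())} lines",
    plan=lambda model, req, ctx: f"[{model}] {req} ({ctx})",
    edit=lambda plan_text, src: src.replace("foo", "bar"),
    checks=[long_line_check],
)
```

Keeping each stage as a separate callable means the planner and the editor can be swapped independently, which is the point of the proposal.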
Benefits for Windsurf and its users:
Integrating such a multi-model framework would allow Windsurf to leverage the strengths of various models, providing users with tailored, high-quality code explanations and changes.
The end goal is a seamless system where the user can simply request a change, and behind the scenes, the best-suited model takes over, ensuring a robust and polished result.
While I'm not entirely sure whether this functionality should be built directly into Windsurf’s own models with configurable options for the user, or if it should integrate with external services like Deepclaude, it’s clear that a multi-model approach could greatly benefit the overall user experience.
Cheers!
Oleh Melnyk
Yeah, that would be awesome! In the meantime, I've defined 40+ modes in .roomode, then added these instructions to .windsurfrules:
Custom Roles and Role-Switching
When I start a message with /role=SOME_ROLE, switch to the SOME_ROLE defined in the root .roomode file. Follow its customInstructions and stick to that role until I switch to another role or use /role=reset. Start each response with "[SOME_ROLE]" to confirm the active role. If SOME_ROLE is invalid, respond with "Invalid role" and list available roles from .roomode. Base Windsurf rules apply unless overridden by the role’s instructions. You may suggest a role switch if it makes sense and there is a better role to handle the request. Before switching to a new role - take a pause and suggest user to switch to another LLM - like claude sonnet 3.7, o3-mini, deepseek r1 or deepseek v3, gemini flash, etc.
----
and it kinda works - at least it starts responses with [SELECTED_ROLE] and switches to other roles automatically... It would be nice if it could switch roles AND the LLM that works best with that role.
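For illustration, here is roughly how the /role= rule quoted above behaves if written as code instead of as a prompt. The roles dictionary is a hypothetical in-memory stand-in for whatever the .roomode file defines; its real format is not assumed here.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a message that starts with /role=SOME_ROLE.
ROLE_COMMAND = re.compile(r"^/role=(\S+)")


def handle_message(
    message: str, roles: Dict[str, str], active: Optional[str]
) -> Tuple[Optional[str], str]:
    """Return (active_role, text), where text is the "[ROLE]" prefix or an error."""
    match = ROLE_COMMAND.match(message)
    if match:
        requested = match.group(1)
        if requested == "reset":
            return None, ""  # back to the base rules, no prefix
        if requested not in roles:
            available = ", ".join(sorted(roles))
            return active, f"Invalid role. Available roles: {available}"
        active = requested
    # The role sticks until another /role= command arrives.
    prefix = f"[{active}]" if active else ""
    return active, prefix
```

A per-role preferred model could be stored next to each role's instructions, which would cover the "switch roles AND LLM" part once the editor exposes a way to change models programmatically.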
Cristian Ghezzi
Yes, often I need to switch models and when one keeps failing the other one finds the correct solution right away. This should be automated.
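A rough sketch of how that switch could be automated, assuming a hypothetical attempt callable that asks a given model for a solution and a passes callable (for example a test run) that accepts or rejects it:

```python
from typing import Callable, Optional, Sequence, Tuple


def first_passing(
    models: Sequence[str],
    attempt: Callable[[str], str],
    passes: Callable[[str], bool],
    max_tries_per_model: int = 2,
) -> Optional[Tuple[str, str]]:
    """Try each model in order; move to the next one after repeated failures."""
    for model in models:
        for _ in range(max_tries_per_model):
            candidate = attempt(model)
            if passes(candidate):
                return model, candidate
    return None  # no model produced an acceptable result
```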
Kai
I agree with this suggestion. However, the current ability for users to choose their preferred model must remain as it is too.
Max Mestiri
Kai exactly this
All Mightimus Prime
Agreed! Something similar to 'DeepClaude' would be awesome. You can self-host that Rust app with your own API to test how it works.
I'm thinking of a team of 3 agents (human + 2 AI agents) working as a small dev team. In the workflow, the product and project manager (human + Claude/planning model) would define the scope, PRD, and implementation plan. Then the technical implementation team (human + reasoning AI agent) would handle technical design, coding, testing, and tech docs, perhaps with the Claude agent helping the human keep the coding agent in check.
Maybe some defined roles like in the image.
Alexandr Bodrov
Multi-agency, where different models work together based on their strengths and communicate effectively, is the future. Dynamically selecting the best model for each task, as described in this feature request, is a great idea and could greatly improve Windsurf’s capabilities!