Collaborative Strategies for Large Language Models

Joe El Khoury - GenAI Engineer
4 min read · Jul 14, 2024


I was reading an interesting paper on Saturday night, so I’ll discuss it in the text below.
Large language model (LLM) systems demonstrate capabilities across multiple tasks, but each LLM has specific strengths and limitations. Researchers are therefore investigating methods for combining multiple LLMs to enhance overall performance.

A survey paper titled “Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models” examines this topic. The paper categorizes collaborative strategies into three approaches: Merging, Ensemble, and Cooperation.

Merging: Combining AI Parameters

Merging involves combining the parameters of multiple LLMs to create a single, more capable model. One technique in this category is “Model Soup,” which averages the parameters of multiple fine-tuned models. Other methods, such as “Task Arithmetic” and “TIES-Merging,” aim to combine model capabilities additively.
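To make the idea concrete, here is a minimal sketch of uniform parameter averaging in the spirit of “Model Soup,” assuming the models being merged share the same architecture (toy PyTorch MLPs stand in for LLMs). The real merging methods add weighting, sign resolution, and pruning on top of this basic operation.

```python
# Minimal "model soup"-style parameter averaging over architecturally
# identical models (toy MLPs used as stand-ins for fine-tuned LLMs).
import copy
import torch
import torch.nn as nn

def average_state_dicts(models: list[nn.Module]) -> dict:
    """Uniformly average the parameters of architecturally identical models."""
    state_dicts = [m.state_dict() for m in models]
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Toy usage: three fine-tuned "experts" merged into one soup model.
experts = [nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)) for _ in range(3)]
soup = copy.deepcopy(experts[0])
soup.load_state_dict(average_state_dicts(experts))
```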

Merging strategies seek to achieve two primary objectives:

1. Obtaining a relatively optimal solution by leveraging the strengths of different models.
2. Improving multitasking abilities by combining models specialized in different tasks.

Ensemble: Combining AI Outputs

The ensemble approach combines the outputs of multiple LLMs to produce results. Ensemble methods can be applied at different stages of the inference process:

1. Before inference: Using a “router” to select the most appropriate LLM for a given query.
2. During inference: Combining the token-level outputs of each LLM at every decoding step.
3. After inference: Combining the final outputs of each LLM.
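As a toy illustration of the “before inference” stage, the sketch below routes a query to one of several hypothetical specialist models using a keyword heuristic. The model names and keywords are purely illustrative assumptions; production routers are usually learned classifiers trained on query/performance data.

```python
# A toy "router" that picks an LLM before inference based on keywords.
from typing import Callable

ROUTES: dict[str, Callable[[str], bool]] = {
    "code-specialist-llm": lambda q: any(k in q.lower() for k in ("bug", "python", "compile")),
    "math-specialist-llm": lambda q: any(k in q.lower() for k in ("integral", "prove", "equation")),
}

def route(query: str, default: str = "general-llm") -> str:
    """Return the name of the LLM that should handle this query."""
    for model_name, matches in ROUTES.items():
        if matches(query):
            return model_name
    return default

print(route("Why does my Python loop compile but crash?"))  # -> code-specialist-llm
print(route("Summarize this article"))                      # -> general-llm
```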

A technique called “token alignment” allows LLMs with different vocabularies to cooperate during inference.
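A highly simplified sketch of what per-step combination looks like once vocabularies are reconciled: next-token distributions keyed by decoded token strings are averaged over the union of tokens. Real token-alignment methods handle mismatched tokenizations far more carefully; this only conveys the shape of the operation, and the distributions are made up.

```python
# Simplified per-step ensemble of two next-token distributions that use
# different vocabularies, keyed here by decoded token strings.
def ensemble_next_token(dist_a: dict[str, float], dist_b: dict[str, float]) -> dict[str, float]:
    vocab_union = set(dist_a) | set(dist_b)
    combined = {tok: 0.5 * dist_a.get(tok, 0.0) + 0.5 * dist_b.get(tok, 0.0) for tok in vocab_union}
    total = sum(combined.values())
    return {tok: p / total for tok, p in combined.items()}

dist_a = {"Paris": 0.6, "London": 0.3, "Rome": 0.1}
dist_b = {"Paris": 0.5, "Lyon": 0.4, "London": 0.1}
print(ensemble_next_token(dist_a, dist_b))
```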

Ensemble methods are being applied in fields requiring high accuracy, such as medical diagnosis and financial forecasting. These methods also aim to reduce instances of AI “hallucinations” — occurrences where AI produces incorrect information with high confidence.
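For the “after inference” case, one minimal combination rule is a majority vote over normalized final answers, which can filter out an answer that a single model produced confidently but wrongly. The example outputs below are purely illustrative.

```python
# Combine final outputs after inference by majority vote over normalized answers.
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    normalized = [a.strip().lower() for a in answers]
    winner, _ = Counter(normalized).most_common(1)[0]
    return winner

outputs = ["Amoxicillin", "amoxicillin ", "Azithromycin"]  # illustrative model outputs
print(majority_vote(outputs))  # -> "amoxicillin"
```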

Cooperation: LLMs Working Together

Cooperation strategies focus on how LLMs with different capabilities can work together on specific tasks. This approach includes:

1. Collaboration for efficient computation: Methods like “Speculative Decoding” pair a smaller draft LLM with a larger target LLM to generate text more efficiently (a sketch of the idea follows this list).

2. Collaboration for knowledge transfer: Techniques such as “Contrastive Decoding” use the differences between two LLMs’ output distributions to achieve improved results (also sketched after this list).

3. Complementary cooperation: Strategies where one LLM compensates for the weaknesses of another, for example through fact-checking or linking to external knowledge bases.

4. Federated cooperation: Protects private data while sharing public models through:

a) Federated training: Transfers knowledge between server-side LLMs and smaller client-side models without transmitting raw data. FedMKT, OpenFedLLM, and FedCyBGD are key methods.

b) Federated prompt engineering: Uses small local models to preserve privacy and cloud LLMs for execution. FDKT augments data with domain-specific demonstrations; PromptFL has clients train and aggregate prompts rather than full models, which speeds up federated training.
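As referenced above, here is a minimal sketch of the accept/reject loop behind speculative decoding, with toy callables standing in for the draft and target models. The full algorithm also resamples the first rejected position from an adjusted distribution, which is omitted here.

```python
# Simplified speculative decoding step: a small draft model proposes k tokens,
# the large target model verifies them, and the longest accepted prefix is kept.
import random

def speculative_step(draft_model, target_model, context: list[str], k: int = 4) -> list[str]:
    # 1) Draft model proposes k tokens cheaply (greedy here).
    proposed = []
    ctx = list(context)
    for _ in range(k):
        dist = draft_model(ctx)
        tok = max(dist, key=dist.get)
        proposed.append(tok)
        ctx.append(tok)

    # 2) Target model checks each proposal; accept with prob min(1, p_target / p_draft).
    accepted = []
    ctx = list(context)
    for tok in proposed:
        p_draft = draft_model(ctx).get(tok, 1e-9)
        p_target = target_model(ctx).get(tok, 0.0)
        if random.random() < min(1.0, p_target / p_draft):
            accepted.append(tok)
            ctx.append(tok)
        else:
            break
    return accepted

# Toy usage with hand-written distributions (purely illustrative):
draft = lambda ctx: {"the": 0.6, "a": 0.4}
target = lambda ctx: {"the": 0.7, "a": 0.3}
print(speculative_step(draft, target, ["Once", "upon"]))
```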
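And a minimal sketch of contrastive decoding: token scores are the difference between the larger “expert” model’s log-probabilities and a smaller “amateur” model’s, restricted to tokens the expert itself finds plausible. The alpha cutoff and the toy distributions here are illustrative assumptions; implementations differ in the details.

```python
# Contrastive decoding scores: log p_expert - log p_amateur over tokens that
# pass an expert-plausibility cutoff, favoring tokens the large model prefers
# but the small model does not.
import math

def contrastive_scores(expert: dict[str, float], amateur: dict[str, float], alpha: float = 0.1) -> dict[str, float]:
    cutoff = alpha * max(expert.values())
    return {
        tok: math.log(p) - math.log(amateur.get(tok, 1e-9))
        for tok, p in expert.items()
        if p >= cutoff
    }

expert = {"Paris": 0.55, "London": 0.25, "banana": 0.02}
amateur = {"Paris": 0.30, "London": 0.30, "banana": 0.10}
scores = contrastive_scores(expert, amateur)
print(max(scores, key=scores.get))  # -> "Paris"
```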

Future Developments and Challenges

Advancements in these technologies may lead to AI systems capable of solving complex, multidisciplinary problems. Potential applications include more natural human-AI collaboration and mitigation of current LLM limitations such as hallucinations and biases.

Challenges in this field include:

1. Efficiently coordinating LLMs with different architectures.
2. Addressing privacy concerns in multi-LLM systems.
3. Managing ethical considerations in AI collaboration.

Current research indicates a trend towards developing systems where multiple specialized LLMs work together effectively, rather than focusing solely on creating individual, large-scale models.

References

Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models (2024).

I’m Joe, and my ambition is to lead the way to Industry 5.0 performance. I’m always interested in new opportunities, so don’t hesitate to contact me on LinkedIn.
