Large-Scale Data Processing and Machine Learning Analysis Model Based on Distributed Algorithm

Authors

  • Manfei Lo

DOI:

https://doi.org/10.56028/aetr.9.1.629.2024

Keywords:

Machine Learning; Large-Scale Data; Distributed Algorithm.

Abstract

Large-scale data processing and machine learning (ML) analysis based on distributed algorithms (DA) is a computing approach for processing very large data sets and running ML analysis efficiently. This paper designs a cluster topology driver module based on gradient exchange and aggregate communication, whose core goal is to adapt the distributed system to a variety of underlying network topologies. A decentralized gradient exchange algorithm and an aggregate communication framework allow the module to make full use of the parallel transmission capacity of multi-interface networks, thereby improving the model synchronization efficiency of ML tasks. Experimental results show that the cluster topology driver module outperforms existing methods in training convergence, cluster scalability, and communication overhead. Models of this kind are widely used for processing massive data and carrying out complex analysis tasks.
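
The abstract does not spell out the gradient exchange algorithm, so the following is only an illustrative sketch of one well-known decentralized pattern, ring all-reduce, and not the paper's module. It simulates the workers in a single process; the function name `ring_allreduce` and the list-of-lists layout are assumptions made for this example. Each worker sends to one neighbour only, so no central parameter server is involved.

```python
def ring_allreduce(grads):
    """Simulate ring all-reduce over per-worker gradient vectors.

    grads: list of equal-length lists, one per (hypothetical) worker.
    Returns a new list in which every worker holds the element-wise sum.
    """
    n = len(grads)
    length = len(grads[0])
    # Split each vector into n contiguous chunks (some may be empty).
    bounds = [(k * length // n, (k + 1) * length // n) for k in range(n)]
    bufs = [list(g) for g in grads]  # work on copies

    # Phase 1, reduce-scatter: after n-1 steps worker i holds the
    # complete sum for chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages first, since sends are simultaneous.
        msgs = []
        for i in range(n):
            c = (i - step) % n
            lo, hi = bounds[c]
            msgs.append((c, bufs[i][lo:hi]))
        for i in range(n):
            dst = (i + 1) % n  # right-hand neighbour on the ring
            c, payload = msgs[i]
            lo, _ = bounds[c]
            for k, value in enumerate(payload):
                bufs[dst][lo + k] += value

    # Phase 2, all-gather: circulate the completed chunks around the ring.
    for step in range(n - 1):
        msgs = []
        for i in range(n):
            c = (i + 1 - step) % n
            lo, hi = bounds[c]
            msgs.append((c, bufs[i][lo:hi]))
        for i in range(n):
            dst = (i + 1) % n
            c, payload = msgs[i]
            lo, hi = bounds[c]
            bufs[dst][lo:hi] = payload

    return bufs
```

Calling `ring_allreduce` on three workers with gradients `[1, 2, 3, 4]`, `[10, 20, 30, 40]` and `[100, 200, 300, 400]` leaves every worker holding the same summed vector. In a real cluster each chunk transfer would be a network send, and on a multi-interface network separate chunks could in principle travel over separate links in parallel, which is the kind of capacity the abstract refers to.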

Published

2024-01-25