TY - JOUR
T1 - Distributed stochastic gradient tracking methods
AU - Pu, Shi
AU - Nedić, Angelia
N1 - Publisher Copyright: © 2020, Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society.
PY - 2021/5
Y1 - 2021/5
N2 - In this paper, we study the problem of distributed multi-agent optimization over a network, where each agent possesses a local cost function that is smooth and strongly convex. The global objective is to find a common solution that minimizes the average of all cost functions. Assuming agents only have access to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method (DSGT) and a gossip-like stochastic gradient tracking method (GSGT). We show that, in expectation, the iterates generated by each agent are attracted to a neighborhood of the optimal solution, where they accumulate exponentially fast (under a constant stepsize choice). Under DSGT, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size n, a performance comparable to that of a centralized stochastic gradient algorithm. Moreover, we show that when the network is well-connected, GSGT incurs a lower communication cost than DSGT while maintaining a similar computational cost. A numerical example further demonstrates the effectiveness of the proposed methods.
AB - In this paper, we study the problem of distributed multi-agent optimization over a network, where each agent possesses a local cost function that is smooth and strongly convex. The global objective is to find a common solution that minimizes the average of all cost functions. Assuming agents only have access to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method (DSGT) and a gossip-like stochastic gradient tracking method (GSGT). We show that, in expectation, the iterates generated by each agent are attracted to a neighborhood of the optimal solution, where they accumulate exponentially fast (under a constant stepsize choice). Under DSGT, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size n, a performance comparable to that of a centralized stochastic gradient algorithm. Moreover, we show that when the network is well-connected, GSGT incurs a lower communication cost than DSGT while maintaining a similar computational cost. A numerical example further demonstrates the effectiveness of the proposed methods.
KW - Communication networks
KW - Convex programming
KW - Distributed optimization
KW - Stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=85082773201&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082773201&partnerID=8YFLogxK
U2 - 10.1007/s10107-020-01487-0
DO - 10.1007/s10107-020-01487-0
M3 - Article
SN - 0025-5610
VL - 187
SP - 409
EP - 457
JO - Mathematical Programming
JF - Mathematical Programming
IS - 1-2
ER -