2004 (English) In: Computer journal, ISSN 0010-4620, E-ISSN 1460-2067, Vol. 47, no 5, p. 527-544. Article in journal (Refereed), Published
Abstract [en]
Consider a parallel program with n processes and a synchronization granularity z. Consider also two parallel architectures: an SMP with q processors and run-time reallocation of processes to processors, and a distributed system (or cluster) with k processors and no run-time reallocation. In the system without run-time reallocation, inter-processor communication incurs a delay of t time units. In this paper we define a function H(n,k,q,t,z) such that, for every program with n processes and granularity z, the minimum completion time on the system with k processors and no reallocation is at most H(n,k,q,t,z) times the minimum completion time on the system with q processors and run-time reallocation. We assume optimal allocation and scheduling of processes to processors. The function H(n,k,q,t,z) is optimal in the sense that there is at least one program, with n processes and granularity z, for which the ratio is exactly H(n,k,q,t,z). We also validate our results using measurements on distributed and multiprocessor Sun/Solaris environments. The function H(n,k,q,t,z) shows the performance implications of the fundamental design decision of whether or not to allow run-time reallocation of processes. These insights can inform cost/benefit trade-offs when designing parallel execution platforms.
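The bound described above can be written compactly. The notation here is an assumption introduced only for illustration and is not taken from the paper: \mathcal{P}(n,z) denotes the set of programs with n processes and granularity z, T_{\mathrm{static}}(P;k,t) the minimum completion time of program P on the k-processor system without run-time reallocation and with communication delay t, and T_{\mathrm{dynamic}}(P;q) the minimum completion time on the q-processor system with run-time reallocation, both under optimal allocation and scheduling.

\[
\frac{T_{\mathrm{static}}(P;k,t)}{T_{\mathrm{dynamic}}(P;q)} \le H(n,k,q,t,z)
\quad \text{for all } P \in \mathcal{P}(n,z),
\qquad
\max_{P \in \mathcal{P}(n,z)} \frac{T_{\mathrm{static}}(P;k,t)}{T_{\mathrm{dynamic}}(P;q)} = H(n,k,q,t,z).
\]

The second relation restates the optimality claim: at least one program attains the ratio exactly, so the bound cannot be lowered.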
Abstract [sv] (translated to English)
We consider a parallel program with n processes and synchronization granularity z, and two parallel architectures. The first has q processors and allows full allocation of processes; the second has k processors and no reallocation during execution. Each reallocation takes t seconds. We define a function H(n,k,q,t,z) such that the run time of a program with n processes and granularity z is at most a factor H(n,k,q,t,z) longer on the system without reallocation than on the system with reallocation. We assume optimal allocation of processes in both systems. The function is optimal: there are programs for which the run time is exactly H(n,k,q,t,z) times longer. The results are validated with measurements on multiprocessors in a Sun/Solaris environment.
Place, publisher, year, edition, pages
Oxford: Oxford University Press, 2004
Keywords
multiprocessor, parallel computing, allocation, performance, granularity, synchronization
National Category
Mathematical Analysis; Computer Sciences
Identifiers
urn:nbn:se:mau:diva-39004 (URN)
10.1093/comjnl/47.5.527 (DOI)
000223426300002 ()
oai:bth.se:forskinfoA1C92F3B0B509BC9C12573CA00309B30 (Local ID)
oai:bth.se:forskinfoA1C92F3B0B509BC9C12573CA00309B30 (Archive number)
oai:bth.se:forskinfoA1C92F3B0B509BC9C12573CA00309B30 (OAI)
Note
Computer Journal, 47(5): 527-544 (2004), http://www.informatik.uni-trier.de/~ley/db/journals/cj/cj47.html#KlonowskaLB04
2012-09-18, 2021-01-08, 2021-01-08. Bibliographically approved