Malmö University Publications
Comparing the optimal performance of parallel architectures
2004 (English). In: Computer journal, ISSN 0010-4620, E-ISSN 1460-2067, Vol. 47, no 5, p. 527-544. Article in journal (Refereed). Published
Abstract [en]

Consider a parallel program with n processes and a synchronization granularity z. Consider also two parallel architectures: an SMP with q processors and run-time reallocation of processes to processors, and a distributed system (or cluster) with k processors and no run-time reallocation. There is an inter-processor communication delay of t time units for the system with no run-time reallocation. In this paper we define a function H(n,k,q,t,z) such that the minimum completion time for all programs with n processes and a granularity z is at most H(n,k,q,t,z) times longer using the system with no reallocation and k processors compared to using the system with q processors and run-time reallocation. We assume optimal allocation and scheduling of processes to processors. The function H(n,k,q,t,z) is optimal in the sense that there is at least one program, with n processes and a granularity z, such that the ratio is exactly H(n,k,q,t,z). We also validate our results using measurements on distributed and multiprocessor Sun/Solaris environments. The function H(n,k,q,t,z) provides important insights regarding the performance implications of the fundamental design decision of whether to allow run-time reallocation of processes or not. These insights can be used when making cost/benefit trade-offs in the design of parallel execution platforms.
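
The ratio described in the abstract can be illustrated with a small sketch. This is not the paper's function H(n,k,q,t,z): it assumes independent processes with no synchronization (so z plays no role) and no communication delay (t = 0), and it computes the ratio for one given program rather than the worst case over all programs. The function names and the `work` list are hypothetical, introduced only for illustration. Without reallocation, the optimal completion time is found by brute force over all assignments of processes to k processors; with run-time reallocation on q processors, the optimum for independent preemptible processes is given by McNaughton's wrap-around rule.

```python
from itertools import product


def static_makespan(work, k):
    """Minimum completion time when each process stays on one of k processors.

    Brute force over all k**n assignments, so only suitable for small inputs.
    """
    best = float("inf")
    for assignment in product(range(k), repeat=len(work)):
        loads = [0.0] * k
        for w, p in zip(work, assignment):
            loads[p] += w
        best = min(best, max(loads))
    return best


def reallocation_makespan(work, q):
    """Minimum completion time on q processors with run-time reallocation.

    Assumes independent processes that can be preempted and migrated at no
    cost; a single process never runs on two processors at once.
    """
    return max(sum(work) / q, max(work))


def completion_time_ratio(work, k, q):
    """Ratio of the no-reallocation optimum to the reallocation optimum."""
    return static_makespan(work, k) / reallocation_makespan(work, q)


if __name__ == "__main__":
    # Three equal processes on two processors in both systems.
    print(completion_time_ratio([1.0, 1.0, 1.0], k=2, q=2))
```

In this example the system without reallocation must place two processes on one processor, while the system with reallocation can split the third process's execution across both processors over time, which is where the gap between the two completion times comes from.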

Abstract [sv]

We consider a parallel program with n processes and synchronization granularity z, together with two parallel architectures. The first has q processors and allows full allocation of processes, and the second has k processors and no reallocation at run time. Each reallocation takes t seconds. We define a function H(n,k,q,t,z) such that the running time of a program with n processes and granularity z is at most a factor H(n,k,q,t,z) longer on the system without reallocation than on the system with reallocation. We assume optimal allocation of processes in both systems. The function is optimal: there are programs for which the running time is exactly H(n,k,q,t,z) times longer. The results are validated with measurements on multiprocessors in a Sun/Solaris environment.

Place, publisher, year, edition, pages
Oxford: Oxford University Press, 2004. Vol. 47, no 5, p. 527-544
Keywords [en]
multiprocessor, parallel computing, allocation, performance, granularity, synchronization
National Category
Mathematical Analysis; Computer Sciences
Identifiers
URN: urn:nbn:se:mau:diva-39004; DOI: 10.1093/comjnl/47.5.527; ISI: 000223426300002; Local ID: oai:bth.se:forskinfoA1C92F3B0B509BC9C12573CA00309B30; OAI: oai:DiVA.org:mau-39004; DiVA, id: diva2:1515387
Note

Computer Journal, 47(5): 527-544 (2004), http://www.informatik.uni-trier.de/~ley/db/journals/cj/cj47.html#KlonowskaLB04

Available from: 2012-09-18. Created: 2021-01-08. Last updated: 2021-01-08. Bibliographically approved
In thesis
1. Theoretical Aspects on Performance Bounds and Fault Tolerance in Parallel Computing
2007 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of two parts: performance bounds for scheduling algorithms for parallel programs in multiprocessor systems, and recovery schemes for fault-tolerant distributed systems when one or more computers go down. In the first part we present tight bounds on the ratio of minimal completion times of a parallel program executed on a parallel system, in two scenarios. In the first scenario, the ratio compares the minimal completion time when processes can be reallocated to other processors during execution with the minimal completion time when they cannot. In the second scenario, where the schedule is preemptive, the ratio compares the minimal completion times obtained with two different numbers of preemptions. The second part discusses the problem of redistributing the load among the running computers in a parallel system. The goal is to find a redistribution scheme that maintains high performance even when one or more computers go down. Here we present four different redistribution algorithms. In both parts we use theoretical techniques that lead to explicit worst-case programs and scenarios. Correctness is established by mathematical proofs.

Place, publisher, year, edition, pages
Blekinge Institute of Technology, 2007
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:mau:diva-7778 (URN); 8614 (Local ID); 978-91-7295-126-6 (ISBN); 8614 (Archive number); 8614 (OAI)
Available from: 2020-02-28. Created: 2020-02-28. Last updated: 2021-01-08. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Lundberg, Lars; Lennerstad, Håkan
