Distributed Jobs and Parallel Jobs
Two types of MATLAB jobs are commonly run on a cluster: distributed jobs and parallel jobs.
A distributed job is one whose tasks do not directly communicate with each other. The tasks do not need to run simultaneously, and a worker might run several tasks of the same job in succession. Typically, all tasks perform the same or similar functions on different data sets in an embarrassingly parallel configuration.
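As an illustration, a distributed job with one task per data set might be set up as follows. This is a minimal sketch, not a site-specific recipe: it assumes Parallel Computing Toolbox R2012a or later (where `parcluster`, `createJob`, `createTask`, and `fetchOutputs` are available) and a working default cluster profile. The data sets and the choice of `sum` as the task function are made up for the example.

```matlab
% Sketch of a distributed job: three independent tasks, no communication.
c = parcluster();            % default cluster profile (assumed to be configured)
j = createJob(c);            % job whose tasks never talk to each other

% Hypothetical input: one small data set per task.
dataSets = {[1 2 3], [10 20], 100};
for k = 1:numel(dataSets)
    % Each task calls sum on its own data set and returns 1 output.
    % dataSets(k) is a 1-by-1 cell, i.e. the task's argument list.
    createTask(j, @sum, 1, dataSets(k));
end

submit(j);                   % tasks are handed to workers as they become free
wait(j);                     % block until every task has finished
results = fetchOutputs(j);   % cell array with one row per task
delete(j);                   % remove the job's data from the cluster's job storage
```

Because the tasks are independent, the job completes even if only one worker is available; that worker simply runs the three tasks one after another.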
A parallel job consists of only a single task that runs simultaneously on several workers. More specifically, the task is duplicated on each worker, so each worker can perform the task on a different set of data, or on a particular segment of a large data set. The workers can communicate with each other as each executes its task.
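A parallel job can be sketched in the same style. The example below assumes Parallel Computing Toolbox R2012a or later, where a parallel job is created with `createCommunicatingJob`, and a default cluster profile that can supply at least two workers. The task itself is invented for illustration: each worker squares its own index and `gplus` adds the values across all workers. In recent releases `labindex` and `gplus` are no longer the recommended names (`spmdIndex` and `spmdPlus` replace them), so adjust to your MATLAB version.

```matlab
% Sketch of a parallel job: one task, duplicated on every worker,
% with the workers communicating through a global sum.
c = parcluster();                               % default cluster profile (assumed)
j = createCommunicatingJob(c, 'Type', 'SPMD');  % job whose workers can communicate
j.NumWorkersRange = [2 2];                      % require exactly two workers

% The single task. labindex differs on each worker, so each copy of the
% task works on a different value; gplus combines them across workers.
createTask(j, @() gplus(labindex^2), 1, {});

submit(j);                   % start may be delayed until two workers are free
wait(j);
results = fetchOutputs(j);   % one row per worker
delete(j);
```

Unlike the distributed case, this job cannot start until the requested number of workers is available at the same time, because `gplus` needs every worker to take part.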
The differences between distributed jobs and parallel jobs are summarized in the following table.
| Distributed Job | Parallel Job |
| --- | --- |
| MATLAB sessions, called workers, perform the tasks but do not communicate with each other. | MATLAB sessions, called labs, can communicate with each other during the running of their tasks. |
| You define any number of tasks in a job. | You define only one task in a job. Duplicates of that task run on all labs running the parallel job. |
| Tasks need not run simultaneously. Tasks are distributed to workers as the workers become available, so a worker can perform several of the tasks in a job. | Tasks run simultaneously, so you can run the job only on as many labs as are available at run time. The start of the job might be delayed until the required number of labs is available. |