Python IPython Parallel #

Overview #

One of IPython's main features is its support for parallel and distributed computing. IPython has its own framework for this, named ipyparallel (formerly IPython.parallel). The framework supports a number of parallel computing architectures and job schedulers, which help us speed up computation in applications.

Architecture #

The IPython architecture consists of four components:

  • The IPython engine
  • The IPython hub (part of IPython controller)
  • The IPython schedulers (part of IPython controller)
  • The controller client

IPython Engine #

The IPython engine is a Python instance that takes Python commands over a network connection and executes them. An engine can also handle incoming and outgoing Python objects sent over a network connection. When multiple engines are started, parallel and distributed computing becomes possible. An important property of an IPython engine is that it blocks while user code is being executed.

IPython Controller #

The IPython controller provides an interface for delivering tasks to the engines. It is a collection of processes to which IPython engines and clients connect, and it is composed of a Hub and a collection of Schedulers. The Schedulers typically run in separate processes on the same machine as the Hub, but they can run anywhere, from local threads to remote machines.

The Hub #

The Hub is the center of an IPython cluster. It is the process that keeps track of engine connections, schedulers, and clients, as well as all task requests and results. The main role of the Hub is to answer queries about the cluster state and to minimize the information required to establish the many connections involved in connecting new clients and engines.

The Scheduler #

All actions that can be performed on the engine go through a Scheduler. While the engines themselves block when user code is run, the schedulers hide that from the user to provide a fully asynchronous interface to a set of engines.
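As a rough illustration of that asynchronous interface, the sketch below submits a deliberately slow function through a view and gets a handle back right away instead of waiting for the engine. It assumes `view` is an ipyparallel view (for example one obtained from `Client().load_balanced_view()`); `slow_square` and `square_async` are hypothetical names chosen for this example.

```python
def slow_square(x):
    # The import lives inside the function because the body runs in the
    # engine's own Python process, not in the client's.
    import time
    time.sleep(0.1)  # stand-in for real work; the engine blocks here
    return x * x


def square_async(view, x):
    """Submit slow_square(x) through `view` without waiting for the result."""
    # apply_async hands the task to the scheduler and returns immediately
    # with an AsyncResult-like handle.
    return view.apply_async(slow_square, x)
```

With a cluster running, `ar = square_async(view, 7)` returns at once; `ar.ready()` reports whether the engine has finished, and `ar.get()` blocks until the result is available.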

IPython Client (and Views) #

There is one primary object, the Client, for connecting to a cluster. For each execution model, the Client creates an appropriate View, and these Views are the interface through which users interact with the engines. There are two default views:

  • The DirectView class for explicit addressing
  • The LoadBalancedView class for destination-agnostic scheduling
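The difference between the two views can be sketched as follows. This is a minimal example, assuming `client` is a connected ipyparallel `Client`; `square`, `make_views`, and `compare_views` are hypothetical helper names introduced here for illustration.

```python
def square(x):
    return x * x


def make_views(client):
    """Return (DirectView, LoadBalancedView) built from one client."""
    direct = client[:]                      # explicitly addresses all engines
    balanced = client.load_balanced_view()  # scheduler picks the engine
    return direct, balanced


def compare_views(client, values):
    """Map `square` over `values` with both views and return both results."""
    direct, balanced = make_views(client)
    # DirectView.map_sync splits the sequence across the engines it addresses.
    direct_result = direct.map_sync(square, values)
    # LoadBalancedView.map_sync hands out tasks to whichever engine is free.
    balanced_result = balanced.map_sync(square, values)
    return direct_result, balanced_result
```

Both calls should return the same values in the same order; what differs is who decides where each piece of work runs. A DirectView suits cases where every engine must take part (for example, to load the same data everywhere), while a LoadBalancedView suits independent tasks of uneven duration.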

Getting started #

To start using an IPython cluster, we first need to make sure that ipyparallel is installed on our system. If it is not, it can be installed via pip.

How to install #

Installation works as usual with pip (the quotes keep shells such as zsh from interpreting the square brackets):

$ pip install "ipython[parallel]"

Or, explicitly:

$ pip install ipyparallel

Example #

To use IPython for parallel computing, we need one instance of the controller and one or more instances of the engine. The simplest way to get them is to start the controller and the engines on a single host with the ipcluster command. To start one controller and four engines on localhost, we run:

$ ipcluster start -n 4

Once the controller and the four engines are running, we can use the engines to do something useful. To make sure everything is working correctly, we can check with the following commands:

$ ipython
In [1]: from ipyparallel import Client

In [2]: c = Client()

In [3]: c.ids
Out[3]: [0, 1, 2, 3]

In [4]: c[:].apply_sync(lambda: "Hello, World")
Out[4]: ['Hello, World', 'Hello, World', 'Hello, World', 'Hello, World']
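For something slightly more useful than a greeting, the sketch below splits a list across the engines, lets each engine sum its own piece, and adds up the partial sums on the client. It assumes `dview` is a DirectView such as `c[:]` from the session above; `parallel_sum` and the engine-side variable names `chunk` and `partial` are hypothetical choices made for this example.

```python
def parallel_sum(dview, values):
    """Sum `values` by letting each engine sum its own slice."""
    # Partition the sequence and store one piece per engine under the name "chunk".
    dview.scatter("chunk", values, block=True)
    # Run the same statement in every engine's namespace.
    dview.execute("partial = sum(chunk)", block=True)
    # Fetch one partial sum per engine and combine them locally.
    return sum(dview.pull("partial", block=True))
```

With the four-engine cluster from above, `parallel_sum(c[:], list(range(16)))` should give the same value as the built-in `sum(range(16))`.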

When a client is created with no arguments, it looks for the corresponding JSON connection file in the local ~/.ipython/profile_default/security directory. If the cluster was started with a specific profile, we can pass that profile to the Client instead.

This should cover most cases:

In [2]: c = Client(profile='myprofile')

If we put the JSON file in a different location, or if it has a different name, we can create the client like this:

In [2]: c = Client('/path/to/my/ipcontroller-client.json')
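To make the lookup rule concrete, the following sketch builds the path where the connection file would be expected for a given profile. It is only an approximation under stated assumptions: it hard-codes ~/.ipython as the default base directory and ipcontroller-client.json as the file name, whereas the real lookup in ipyparallel may take other settings into account. `default_connection_file` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def default_connection_file(profile="default", ipython_dir=None):
    """Guess where the client connection file for `profile` lives."""
    # Assumption: the IPython directory is ~/.ipython unless one is given.
    base = Path(ipython_dir) if ipython_dir else Path.home() / ".ipython"
    return base / f"profile_{profile}" / "security" / "ipcontroller-client.json"
```

A helper like this can be handy for checking that the file exists (`default_connection_file("myprofile").exists()`) before trying to connect, or for locating the file that has to be copied to another machine.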

The Client needs to be able to reach the Hub's ports in order to connect. If the Hub is on a different machine, we may need to tunnel access to that machine through an SSH server, in which case we can connect with:

In [2]: c = Client('/path/to/my/ipcontroller-client.json', sshserver='me@myhub.example.com')
