IPC PRIMITIVES
The main operations provided are the simple primitives ipc_put, ipc_get, and ipc_barrier. Respectively, these allow a PE (processing element, usually a separate processor) to copy data to another PE, from another PE, and to wait until all PEs have reached a given location in the code. The get/put routines take a variety of datatypes, and under C++ a variety of overloaded interfaces are defined to allow these types to be checked.
Utility routines ipc_my_process and ipc_num_processes are also provided. These allow a PE to determine its own number within the group and how many processes there are in total, so that it can take responsibility for a suitable portion of the computation.
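As an illustration of taking responsibility for a portion of the computation, the following is a minimal sketch of one way to split a range of items among PEs. The helper block_range is hypothetical and is not part of ipc.h; only the idea of using the process number and process count comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of ipc.h): compute the half-open range
   [*first, *end) of items that process `me` out of `nprocs` should handle
   when `total` items are divided as evenly as possible. */
void block_range(int me, int nprocs, size_t total, size_t *first, size_t *end)
{
    size_t base, extra, m;

    assert(nprocs > 0 && me >= 0 && me < nprocs);

    base  = total / (size_t)nprocs;  /* items that every PE receives */
    extra = total % (size_t)nprocs;  /* the first `extra` PEs get one more */
    m     = (size_t)me;

    *first = m * base + (m < extra ? m : extra);
    *end   = *first + base + (m < extra ? 1 : 0);
}
```

In a real program the first two arguments would presumably be ipc_my_process() and ipc_num_processes(), followed by an ipc_barrier() once every PE has finished its share.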
CONSOLE MESSAGING
Console messaging is complicated with multiple PEs, so routines have also been provided to manage log, notification, warning, and error messages (ipc_log() and ipc_notify()). These allow e.g. a single processor to announce an error encountered by all, and can tag each message with the originating PE's number. Each takes an arbitrary printf-style format string and its corresponding arguments.
The ipc_notify routine accepts a level number with each message, marking it as an error message, a warning, an ordinary message, a verbose message, etc. A global variable ipc_msg_level determines what severity of messages is printed, e.g. to suppress debugging messages. Similar global variables determine what level of message requires PE synchronization, what level will force all PEs to print a message regardless, etc.
The ipc_notify routine also keeps track of how many error and warning messages have been encountered, suppressing them after e.g. ipc_max_errors have been printed, and (optionally) aborting execution after ipc_exit_on_error_num errors have been encountered.
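The level filtering and error counting described above can be sketched roughly as follows. This is not the ipc.c implementation: every sk_ name is a stand-in, the numeric level values are an assumption (only the order of the names is taken from the ipc_msg_level docstring), and the meaning of the return value is invented for the sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumed numbering: the order follows the ipc_msg_level docstring,
   but the actual values in ipc.h may differ. */
enum { SK_NONE, SK_ERROR, SK_WARNING, SK_CAUTION, SK_ALERT,
       SK_SUMMARY, SK_STD, SK_VERBOSE, SK_OVERWHELM };

static int sk_msg_level  = SK_STD;  /* stand-in for ipc_msg_level  */
static int sk_max_errors = 100;     /* stand-in for ipc_max_errors */
static int sk_errors     = 0;       /* stand-in for ipc_errors     */

/* Print a tagged message if its level passes the filter.
   Returns 1 if the message was printed, 0 if it was suppressed
   (this return convention is an assumption of the sketch). */
int sk_notify(int pe, int level, const char *format, ...)
{
    va_list args;

    assert(format != NULL);

    if (level == SK_ERROR) {
        sk_errors++;                      /* counted even when suppressed */
        if (sk_errors > sk_max_errors)
            return 0;                     /* too many errors already printed */
    }
    if (level > sk_msg_level)
        return 0;                         /* less severe than the threshold */

    va_start(args, format);
    fprintf(stderr, "PE %d: ", pe);       /* tag with the originating PE */
    vfprintf(stderr, format, args);
    fputc('\n', stderr);
    va_end(args);
    return 1;
}
```

The sketch leaves out warning counts, PE synchronization, and the optional abort after ipc_exit_on_error_num errors; it only shows that suppressed errors still advance the counter.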
PROGRAM EXITING
Exiting from the program often needs special handling when there are multiple PEs. Usually one will want to synchronize before exiting, and in unexpected error cases a single PE should be able to force an exit from all PEs. Accordingly, the normal exit routine ipc_exit and the error exit routine ipc_abort have been provided. These also take an arbitrary printf-style format string for the exit message.
PROTOCOLS
Currently supported underlying protocols:
IPC_USE_SHMEM   Cray-specific SHMEM routines
(default)       Stubs for single-processor use
So at the moment, only Cray machines are supported when ipc_num_processes() != 1.
Other protocols such as PVM, MPI, and PAPERS/CAPERS should be supportable as well, allowing use on networks of workstations or non-Cray multiprocessor machines.
DATA COPYING INTERFACE
The user interface for copying data to and from a remote process primarily consists of:
PUT: Copy data from one location in this process to the same location in a remote process
GET: Copy data from one location in a remote process to the same location in this process
PUT_TO: Copy data from one location in this process to an arbitrary location in a remote process
GET_TO: Copy data from one location in a remote process to an arbitrary location in this process
A large family of actual functions implements these operations for different datatypes.
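To make the difference between the four operations concrete, here is a small single-process simulation. It is only a sketch: the sim_ names are hypothetical, remote memory is modelled as rows of one array, and locations are modelled as word offsets rather than the pointers that the real routines take.

```c
#include <assert.h>
#include <string.h>

#define SIM_PES   2   /* number of simulated processes */
#define SIM_WORDS 8   /* words of memory per simulated process */

/* Every simulated PE owns an identically laid out block of memory, so a
   given offset names "the same location" in each of them. */
static int sim_mem[SIM_PES][SIM_WORDS];

/* PUT_TO: local offset `src` in PE `me` -> offset `dst` in PE `remote`. */
void sim_put_to(int me, int remote, size_t dst, size_t src, size_t count)
{
    assert(dst + count <= SIM_WORDS && src + count <= SIM_WORDS);
    memmove(&sim_mem[remote][dst], &sim_mem[me][src], count * sizeof(int));
}

/* GET_TO: offset `src` in PE `remote` -> local offset `dst` in PE `me`. */
void sim_get_to(int me, int remote, size_t dst, size_t src, size_t count)
{
    assert(dst + count <= SIM_WORDS && src + count <= SIM_WORDS);
    memmove(&sim_mem[me][dst], &sim_mem[remote][src], count * sizeof(int));
}

/* PUT and GET are the special case where both locations coincide. */
void sim_put(int me, int remote, size_t off, size_t count)
{
    sim_put_to(me, remote, off, off, count);
}

void sim_get(int me, int remote, size_t off, size_t count)
{
    sim_get_to(me, remote, off, off, count);
}
```

This mirrors the way the documented macros pass the same pointer twice to the base routine for the plain forms and two distinct pointers for the _to forms.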
Definition in file ipc.c.
#include <stdio.h>
#include <stdarg.h>
#include <time.h>
#include <string.h>
#include "ipc.h"
Namespaces | |
namespace | Msg |
Defines | |
#define | IPC_PE_MSG_DELAY 0.1 |
Message delay time to use between PEs, in seconds. | |
#define | IPC_MAXFILENAMELENGTH 1024 |
For the log file. | |
#define | IPC_PUT_RAW(bits) |
Raw interface for use when absolute lengths of data are known when code is written. | |
#define | IPC_GET_RAW(bits) |
Functions | |
ipc_status | ipc_put_base (void *target_data, void *source_data, ipc_datatype datatype, size_t count, int process) |
ipc_status | ipc_get_base (void *target_data, void *source_data, ipc_datatype datatype, size_t count, int process) |
Implementation of get. | |
void | ipc_notify_base (int terminateline, int print_pe, int message_level, const char *format, va_list args) |
Print an informative or warning message on the console or stderr. | |
int | ipc_datatype_size (ipc_datatype datatype) |
private function to determine size of each IPC datatype | |
int | ipc_num_processes (void) |
Return the number of processes involved in the current run (currently the same as the number of processors). | |
int | ipc_my_process (void) |
Return the number of the current process. | |
void | ipc_barrier (void) |
Global barrier. | |
void | ipc_set_barrier (void) |
Barrier announce (non-blocking). | |
void | ipc_pe_msg_delay (double scale) |
Delay a different amount on each processor, which helps keep messages from different PEs in some sort of order. | |
void | ipc_init (void) |
Call before making any other ipc_ calls. Currently doesn't do much, but could in the future for some underlying protocols. | |
void | ipc_init_logfile (const char *basefilename) |
Call before making any ipc_log calls. | |
void | ipc_log (int print_pe, const char *format,...) |
Log a non-urgent message to the runfile. | |
void | ipc_error (void) |
These are separate routines merely because that makes it easy to have a debugger such as gdb stop execution when an error or warning occurs. | |
void | ipc_warning (void) |
int | ipc_notify (int print_pe, int message_level, const char *format,...) |
void | ipc_notify2 (int terminateline, int print_pe, int message_level, const char *format,...) |
Updated version of ipc_notify allowing unterminated lines. | |
void | ipc_exit (int status, const char *format,...) |
Synchronize all processors and exit with the given status. | |
void | ipc_abort (int status, const char *format,...) |
Abort all processors (not just this one) and exit with the given status. | |
Variables | |
int | ipc_msg_level = IPC_STD |
const char * | ipc_msg_level_docstring |
int | ipc_msg_forceall_level = IPC_NONE |
const char * | ipc_msg_forceall_level_docstring |
int | ipc_msg_synch_level = IPC_ERROR |
const char * | ipc_msg_synch_level_docstring |
int | ipc_exit_on_error_num = 0 |
const char * | ipc_exit_on_error_num_docstring |
int | ipc_max_warnings = 100 |
const char * | ipc_max_warnings_docstring |
int | ipc_max_errors = 100 |
const char * | ipc_max_errors_docstring |
FILE * | ipc_logfile = NULL |
int | ipc_warnings = 0 |
int | ipc_errors = 0 |
|
Value:
ipc_status ipc_ ## name (c_type *data, size_t count, int process) \
{ return ipc_ ## name ## _base(data,data,ipc_type,count,process); } \
\
ipc_status ipc_ ## name ## _to (c_type *target, c_type *source, size_t count, int process) \
{ return ipc_ ## name ## _base(target,source,ipc_type,count,process); } \
\
ipc_status ipc_ ## name \
(c_type *data, ipc_datatype datatype, size_t count, int process) \
{ \
  IPC_CHECK_DATATYPE(name,ipc_type,data); \
  return ipc_ ## name ## _base(data,data,ipc_type,count,process); \
} \
\
ipc_status ipc_ ## name ## _to \
(c_type *target, c_type *source, ipc_datatype datatype, size_t count, int process) \
{ \
  IPC_CHECK_DATATYPE(name,ipc_type,source); \
  return ipc_ ## name ## _base(target,source,ipc_type,count,process); \
}
|
Value:
if (datatype != ipc_type && \
    !((datatype==IPC_RAW8 && sizeof(data)==1) || \
      (datatype==IPC_RAW32 && sizeof(data)==4) || \
      (datatype==IPC_RAW64 && sizeof(data)==8))) \
  ipc_notify(3,IPC_WARNING,"ipc_" #name " called with incorrect datatype (%d != %d, sizeof(data)=%d != %d)",\
             ipc_type,datatype,sizeof(data),ipc_datatype_size(ipc_type))
|
Value:
ipc_status ipc_get ## bits(void *data, size_t count, int process) \
{ return ipc_get_base(data,data,IPC_RAW ## bits,count,process); }
|
Value:
ipc_status ipc_put ## bits(void *data, size_t count, int process) \
{ return ipc_put_base(data,data,IPC_RAW ## bits,count,process); }
(Primarily for code using device-independent types in ind_type.h.)
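The raw macros above rely on preprocessor token pasting to stamp out one function per bit width. The following sketch reproduces the pattern with stand-in names (everything prefixed demo_ or DEMO_ is hypothetical), and the base routine is a stub that merely records its arguments instead of performing a transfer.

```c
#include <assert.h>
#include <stddef.h>

typedef int demo_status;                                    /* stand-in for ipc_status   */
typedef enum { DEMO_RAW8, DEMO_RAW32, DEMO_RAW64 } demo_datatype; /* stand-in for ipc_datatype */

static demo_datatype demo_last_datatype;  /* recorded by the stub below */
static size_t        demo_last_count;

/* Stub base routine: records what it was asked to do; no data moves. */
demo_status demo_put_base(void *target, void *source,
                          demo_datatype datatype, size_t count, int process)
{
    assert(target != NULL && source != NULL);
    (void)process;
    demo_last_datatype = datatype;
    demo_last_count    = count;
    return 0;
}

/* `demo_put ## bits` pastes the width onto the function name, and
   `DEMO_RAW ## bits` selects the matching datatype constant. */
#define DEMO_PUT_RAW(bits) \
demo_status demo_put ## bits(void *data, size_t count, int process) \
{ return demo_put_base(data, data, DEMO_RAW ## bits, count, process); }

DEMO_PUT_RAW(8)    /* defines demo_put8()  */
DEMO_PUT_RAW(32)   /* defines demo_put32() */
DEMO_PUT_RAW(64)   /* defines demo_put64() */
```

A call such as demo_put64(buffer, 4, 1) therefore reaches the base routine with DEMO_RAW64 and the same pointer for both target and source.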
|
These are separate routines merely because that makes it easy to have a debugger such as gdb stop execution when an error or warning occurs. (Ex: "break ipc_error"). Definition at line 627 of file ipc.c. Referenced by ipc_notify_base(). |
|
Synchronize all processors and exit with the given status. Must be called by all processors for the program to complete. Definition at line 712 of file ipc.c. References ipc_barrier(), ipc_my_process(), ipc_num_processes(), and ipc_pe_msg_delay(). Referenced by lissom_init_hooks(), main(), LissomMap::save_state(), and wrong_usage(). |
|
Call before making any ipc_log calls. If called twice with different names, closes the first file and opens a new one. Definition at line 580 of file ipc.c. References IPC_MAXFILENAMELENGTH. Referenced by main(), and process_command_line_args(). |
|
Log a non-urgent message to the runfile. This should be called by only a single PE to prevent duplicate messages, and thus the PE number is not printed. No newline or punctuation is used, to allow arbitrary formatting. Definition at line 611 of file ipc.c. References ipc_my_process(). Referenced by FixedWtRegion::backproject(), LissomMap::backproject(), hooklists_log(), and WorldViews::next(). |
|
Delay a different amount on each processor, which helps keep messages from different PEs in some sort of order. Probably useful only soon after a barrier, since otherwise processors will be out of sync anyway. The given scale is multiplied by the arbitrary base delay amount IPC_PE_MSG_DELAY; scale=1.0 should be sufficient to keep 1-line messages in order. Definition at line 552 of file ipc.c. References ipc_my_process(), and IPC_PE_MSG_DELAY. Referenced by ipc_abort(), ipc_exit(), ipc_notify_base(), and LissomMap::prune(). |
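A plausible reading of the staggered delay, assuming the delay grows linearly with the PE number as the ipc_msg_synch_level docstring suggests, is sketched below. The function and macro names are hypothetical, and the actual sleeping mechanism used in ipc.c is not shown in this documentation.

```c
#include <assert.h>

#define SK_PE_MSG_DELAY 0.1   /* same value as IPC_PE_MSG_DELAY, in seconds */

/* Hypothetical helper: how long PE number `pe` would wait before printing,
   assuming delay = PE number * scale * base delay. */
double sk_pe_delay_seconds(int pe, double scale)
{
    assert(pe >= 0 && scale >= 0.0);
    return (double)pe * scale * SK_PE_MSG_DELAY;
}
```

Under this assumption PE 0 prints immediately and each later PE waits a little longer, which is why the ordering is only approximate and only meaningful shortly after a barrier.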
|
Initial value: "If nonzero, when this many errors have been reached, the program\n" "will exit automatically." |
|
Initial value: "Maximum number of error messages to be printed, per PE. If this limit\n" "is reached, further errors will still increment the error counter\n" "but no messages will be printed." |
|
Initial value: "Maximum number of warnings to be printed, per PE. If this limit is\n" "reached, further warnings will still increment the warning counter\n" "but no messages will be printed." |
|
Initial value: "Messages with a debug level higher than this level (see ipc_msg_level\n" "for definitions) will be printed by all PEs who make the call, regardless\n" "of whether the call declares that only one should print. Useful for\n" "debugging on multiple PEs when some PEs are reporting errors or warnings\n" "in the totals yet no messages were displayed." |
|
Initial value: "Level of notification messages to display. Standard levels:\n\n" " IPC_None \n" " IPC_Error \n" " IPC_Warning \n" " IPC_Caution \n" " IPC_Alert \n" " IPC_Summary \n" " IPC_Std \n" " IPC_Verbose \n" " IPC_Overwhelm\n\n" "The last one is special in that it overrides the values of\n" "ipc_msg_level_forceall and ipc_msg_synch_level, forcing\n" "all PEs to present all messages and to synchronize while doing\n" "so." |
|
Initial value: "Attempt to print messages with a msg level higher than this level (see\n" "ipc_msg_level for definitions) in order of PE number, rather than\n" "interleaving the output from different PEs. The synchronization is\n" "currently accomplished only by having each PE delay for a short\n" "time proportional to its PE number. As a consequence, it slows\n" "down execution while not guaranteeing perfect order of output.\n" "However, it has the advantage of not failing catastrophically in\n" "error conditions where PEs become out of synch due to missed\n" "barriers, which is important for an error message handler." |