This document is a first attempt at describing how to write a NAL
for the Portals 3 library. It also defines the library architecture
and the abstraction of protection domains.

First, an overview of the architecture:

API === NAL (User space)
LIB === NAL (Library space)
Physical wire (NIC space)

Communication is through the indicated paths via well defined
interfaces. The API and LIB portions are written to be portable
across platforms and do not depend on the network interface.

Communication between the application and the API code is
defined in the Portals 3 API specification. This is the
user-visible portion of the interface and should be the most

The user space NAL needs to implement only a few functions
that are stored in a nal_t data structure and called by the

int forward( nal_t *nal,

Most of the data structures in the Portals library are held in
the LIB section of the code, so it is necessary to forward API
calls across the protection domain to the library. This is
handled by the NAL's forward method. Once the argument and return
blocks are on the remote side, the NAL should call lib_dispatch()
to invoke the appropriate API function.
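The forwarding path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation: demo_forward(), demo_dispatch(), their parameter lists and the fixed-size "remote" buffers are all hypothetical, and a local memcpy() stands in for whatever mechanism a NAL uses to cross the protection domain. A real NAL would call lib_dispatch() where demo_dispatch() appears.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK_MAX 64   /* hypothetical limit on block sizes */

/* Hypothetical stand-in for the library-side dispatcher: it adds the
 * call index to an int argument and writes the sum to the return block. */
static int demo_dispatch(int index, const void *arg_block, size_t arg_len,
                         void *ret_block, size_t ret_len)
{
    int value;

    if (arg_len < sizeof value || ret_len < sizeof value)
        return -1;
    memcpy(&value, arg_block, sizeof value);
    value += index;
    memcpy(ret_block, &value, sizeof value);
    return 0;
}

/* Hypothetical forward method: copy the argument block to the "remote"
 * side, dispatch there, then copy the return block back to the caller. */
static int demo_forward(int index, const void *args, size_t arg_len,
                        void *ret, size_t ret_len)
{
    unsigned char remote_args[DEMO_BLOCK_MAX];
    unsigned char remote_ret[DEMO_BLOCK_MAX] = { 0 };
    int rc;

    assert(args != NULL && ret != NULL);
    if (arg_len > sizeof remote_args || ret_len > sizeof remote_ret)
        return -1;

    memcpy(remote_args, args, arg_len);   /* user side -> library side */
    rc = demo_dispatch(index, remote_args, arg_len, remote_ret, ret_len);
    memcpy(ret, remote_ret, ret_len);     /* library side -> user side */
    return rc;
}
```

Both sides must agree on the block sizes before any copy is made, which is why the sketch rejects oversized blocks up front.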

int validate( nal_t *nal,

The validate method provides a means for the NAL to prevalidate
and possibly pretranslate user addresses into a form suitable
for fast use by the network card or kernel module. The trans_base
pointer will be used by the library every time it needs to
refer to the block of memory. The trans_data result is a
cookie that will be handed to the NAL along with the trans_base.

The library never performs calculations on the trans_base value;
it only computes offsets that are then handed to the NAL.
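The division of labour described above (the library computes offsets, the NAL interprets trans_base) might look like the following sketch. Every name here (demo_region_t, demo_cookie_t, demo_validate(), demo_lib_offset(), demo_nal_resolve()) is hypothetical, and the "translation" is the identity, which a real NAL would replace with its own address mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cookie: whatever the NAL wants to remember per region. */
typedef struct {
    size_t len;
} demo_cookie_t;

/* Hypothetical result of validation, as seen by the library. */
typedef struct {
    void *trans_base;   /* opaque to the library */
    void *trans_data;   /* cookie handed back to the NAL */
} demo_region_t;

/* User-side NAL: prevalidate the region and record a cookie.
 * The translation is the identity in this sketch. */
static int demo_validate(void *base, size_t len,
                         demo_cookie_t *cookie, demo_region_t *out)
{
    assert(cookie != NULL && out != NULL);
    if (base == NULL || len == 0)
        return -1;
    cookie->len = len;
    out->trans_base = base;
    out->trans_data = cookie;
    return 0;
}

/* Library side: computes a byte offset and nothing else. */
static size_t demo_lib_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}

/* NAL side: the only code that turns trans_base + offset into an address. */
static void *demo_nal_resolve(const demo_region_t *region, size_t offset)
{
    const demo_cookie_t *cookie = region->trans_data;

    if (offset >= cookie->len)
        return NULL;
    return (unsigned char *)region->trans_base + offset;
}
```

Because the library never dereferences or adjusts trans_base, the NAL is free to make it a bus address, a handle, or an index rather than a pointer the library could use directly.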

int shutdown( nal_t *nal, int interface );

Brings down the network interface. The remote NAL side should
call lib_fini() to bring down the library side of the network.

void yield( nal_t *nal );

This allows the user application to gracefully give up the processor
while busy waiting. Performance critical applications may not
want to take the time to call this function, so yielding should be an
option to the PtlEQWait call. Right now it is not implemented as such.
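A busy-wait loop in which yielding is optional, as suggested above, could be sketched like this. The names demo_wait(), demo_poll_fn and demo_countdown() are hypothetical, and POSIX sched_yield() is assumed as the way to give up the processor; a NAL on another platform would substitute its own mechanism.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical poll callback: returns nonzero once the event is ready. */
typedef int (*demo_poll_fn)(void *state);

/* Spin until poll() reports readiness. When use_yield is nonzero the
 * processor is given up between polls. Returns the number of failed polls. */
static unsigned long demo_wait(demo_poll_fn poll, void *state, int use_yield)
{
    unsigned long spins = 0;

    assert(poll != NULL);
    while (!poll(state)) {
        spins++;
        if (use_yield)
            sched_yield();
    }
    return spins;
}

/* Example poll function: becomes ready after *remaining failed polls. */
static int demo_countdown(void *state)
{
    int *remaining = state;

    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

Passing use_yield as zero gives the tight spin that a performance critical application may prefer.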

Lastly, the NAL must implement a function named PTL_IFACE_*, where
* is the name of the NAL such as PTL_IFACE_IP or PTL_IFACE_MYR.
This initialization function is to set up communication with the
library-side NAL, which should call lib_init() to bring up the

On the library side, the NAL has much more responsibility. It
is responsible for calling lib_dispatch() on behalf of the user,
and for bringing packets off the wire and pushing bits out.
As on the user side, the methods are stored
in a nal_cb_t structure that is defined on a per network

The calls to lib_dispatch() need to be examined. The prototype:

has two complications. The private field is a NAL-specific
value that will be passed to any callbacks produced as a result
of this API call. Kernel module implementations may use this
for task structures, or perhaps network card data. It is ignored

Second, the arg_block and ret_block must be in the same protection
domain as the library. The NAL's two halves must communicate the
sizes and perform the copies. After the call, the buffer pointed
to by ret_block will be filled in and should be copied back to
user space. How this is to be done is NAL-specific.

This is the only other entry point into the library from the NAL.
When the NAL detects an incoming message on the wire it should read
sizeof(ptl_hdr_t) bytes and pass a pointer to the header to
lib_parse(). It may set private to be anything that it needs to
tie the incoming message to callbacks that are made as a result

The method calls are:

This is a tricky function -- it must support async output
of messages as well as properly synchronized event log writing.
The private field is the same that was passed into lib_dispatch()
or lib_parse() and may be used to tie this call to the event
that initiated the entry to the library.

The cookie is a pointer to a library private value that must
be passed to lib_finalize() once the message has been completely
sent. It should not be examined by the NAL for any meaning.

The four ID fields are passed in, although some implementations
may not use all of them.

The single base pointer has been replaced with the translated
address that the API NAL generated in the api_nal->validate()
call. The trans_data is unchanged and the offset is in bytes.

This callback will only be called in response to lib_parse().
The cookie, trans_addr and trans_data are as discussed in send().
The NAL should read mlen bytes from the wire, deposit them into
trans_base + offset and then discard (rlen - mlen) bytes.
Once the entire message has been received the NAL should call
lib_finalize() with the lib_msg_t *cookie.
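The read, deposit and discard steps above can be sketched as follows. This is only an illustration: demo_wire_t and demo_recv() are hypothetical, the "wire" is an in-memory byte buffer, and the call to lib_finalize() is shown only as a comment because its prototype is not given in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the wire: a byte stream with a read cursor. */
typedef struct {
    const unsigned char *bytes;
    size_t len;
    size_t pos;
} demo_wire_t;

/* Deposit mlen bytes at trans_base + offset, then discard the remaining
 * (rlen - mlen) bytes of the message so the wire is ready for the next one. */
static int demo_recv(demo_wire_t *wire, void *trans_base, size_t offset,
                     size_t mlen, size_t rlen)
{
    assert(wire != NULL && wire->pos <= wire->len);
    if (mlen > rlen || rlen > wire->len - wire->pos)
        return -1;

    if (mlen > 0)
        memcpy((unsigned char *)trans_base + offset,
               wire->bytes + wire->pos, mlen);
    wire->pos += mlen;          /* bytes delivered to the user buffer */
    wire->pos += rlen - mlen;   /* bytes read and thrown away */

    /* A real NAL would now call lib_finalize() with the message cookie. */
    return 0;
}
```

Discarding the tail matters even when nothing is delivered: if the extra bytes were left on the wire, the next header read would start in the middle of the old message.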

The special arguments base=NULL, data=NULL, offset=0, mlen=0, rlen=0
are used to indicate that the NAL should clean up the wire. This could
be implemented as a blocking call, although having it return as quickly
as possible is desirable.

This is essentially a cross-protection domain memcpy(). The user address
has been pretranslated by the api_nal->validate() call.

Since the NAL may be in a non-standard hosted environment, it
cannot call malloc(). This allows the library side NAL to implement
the system specific malloc(). In the current reference implementation
the library only calls nal->malloc() when the network interface is
initialized and then calls free when it is brought down. The library
maintains its own pool of objects for allocation so only one call to
malloc is made per object type.
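The "one malloc per object type" pattern described above could be sketched as a fixed-size pool with an intrusive free list. This is not the reference implementation: demo_pool_t and the demo_pool_*() functions are hypothetical, the allocator callbacks are plain function pointers rather than the real NAL methods, and objects are only guaranteed pointer alignment within the slab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void *(*demo_malloc_fn)(size_t len);
typedef void (*demo_free_fn)(void *buf);

/* Hypothetical pool: one slab per object type, free objects chained
 * through their own first bytes. */
typedef struct {
    void *slab;
    void *free_head;
    demo_free_fn nal_free;
} demo_pool_t;

static int demo_malloc_calls;   /* counts allocator calls for the example */

static void *demo_counting_malloc(size_t len)
{
    demo_malloc_calls++;
    return malloc(len);
}

/* Called once at interface initialization: the only allocator call. */
static int demo_pool_init(demo_pool_t *pool, demo_malloc_fn nal_malloc,
                          demo_free_fn nal_free, size_t obj_size, size_t count)
{
    size_t stride = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    size_t i;

    assert(pool != NULL && nal_malloc != NULL && nal_free != NULL);
    /* Round up so every object can hold an aligned free-list pointer. */
    stride = (stride + sizeof(void *) - 1) / sizeof(void *) * sizeof(void *);
    if (count == 0 || stride > SIZE_MAX / count)
        return -1;

    pool->slab = nal_malloc(stride * count);
    if (pool->slab == NULL)
        return -1;
    pool->free_head = NULL;
    pool->nal_free = nal_free;
    for (i = 0; i < count; i++) {
        void *obj = (unsigned char *)pool->slab + i * stride;
        *(void **)obj = pool->free_head;
        pool->free_head = obj;
    }
    return 0;
}

/* Take an object from the pool; NULL when the pool is exhausted. */
static void *demo_pool_get(demo_pool_t *pool)
{
    void *obj = pool->free_head;

    if (obj != NULL)
        pool->free_head = *(void **)obj;
    return obj;
}

/* Return an object to the pool without touching the allocator. */
static void demo_pool_put(demo_pool_t *pool, void *obj)
{
    *(void **)obj = pool->free_head;
    pool->free_head = obj;
}

/* Called once when the interface is brought down. */
static void demo_pool_fini(demo_pool_t *pool)
{
    pool->nal_free(pool->slab);
    pool->slab = NULL;
    pool->free_head = NULL;
}
```

With this shape the NAL's allocator can be slow or restricted, since it is off the message path entirely.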

User addresses are validated/translated at the user-level API NAL
method, which is likely to push them to this level. Meanwhile,
the library NAL will be notified when the library no longer
needs the buffer. Overlapped buffers are not detected by the
library, so the NAL should reference-count each page involved.
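Per-page reference counting for overlapping buffers might be sketched as below. The page size, the fixed-size table and the names demo_pin() and demo_unpin() are all assumptions made for illustration; addresses are treated as plain integers, and a real NAL would pin and unpin actual pages where the counts go from 0 to 1 and back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size */
#define DEMO_MAX_PAGES 16u     /* assumed size of the tracked address range */

static unsigned demo_page_refs[DEMO_MAX_PAGES];

/* Compute the first and last page touched by [addr, addr + len). */
static int demo_page_range(uintptr_t addr, size_t len,
                           size_t *first, size_t *last)
{
    if (len == 0 || len - 1 > UINTPTR_MAX - addr)
        return -1;
    *first = addr / DEMO_PAGE_SIZE;
    *last = (addr + (len - 1)) / DEMO_PAGE_SIZE;
    return *last < DEMO_MAX_PAGES ? 0 : -1;
}

/* Validate side: take one reference on every page the buffer touches. */
static int demo_pin(uintptr_t addr, size_t len)
{
    size_t first, last, page;

    if (demo_page_range(addr, len, &first, &last) != 0)
        return -1;
    for (page = first; page <= last; page++)
        demo_page_refs[page]++;
    return 0;
}

/* Invalidate side: drop one reference per page; a page is only truly
 * released when its count reaches zero. Fails without side effects if
 * any page in the range is not currently referenced. */
static int demo_unpin(uintptr_t addr, size_t len)
{
    size_t first, last, page;

    if (demo_page_range(addr, len, &first, &last) != 0)
        return -1;
    for (page = first; page <= last; page++)
        if (demo_page_refs[page] == 0)
            return -1;
    for (page = first; page <= last; page++)
        demo_page_refs[page]--;
    return 0;
}
```

Counting per page rather than per buffer is what makes overlapping buffers safe: invalidating one buffer leaves shared pages referenced by the other.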

Unfortunately we have a few bugs when the invalidate method is
called. It is still in progress...

As with malloc(), the library does not have any way to do printf
or printk. It is not necessary for the NAL to implement this
call, although it will make debugging difficult.

These are used by the library to mark critical sections.
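One way a NAL could back such critical-section hooks is sketched below. This is an assumption-laden illustration: the names demo_enter(), demo_leave(), demo_try_enter() and demo_log_event() are hypothetical, and a C11 atomic_flag spin lock is only one possible mechanism. A kernel-space NAL would more likely mask interrupts or use its platform's own locking primitive.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static int demo_event_count;   /* shared state guarded by demo_lock */

/* Enter the critical section, spinning until the flag is acquired. */
static void demo_enter(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_lock,
                                             memory_order_acquire))
        ;   /* busy-wait */
}

/* Leave the critical section. */
static void demo_leave(void)
{
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Non-blocking variant: returns 1 if the section was entered. */
static int demo_try_enter(void)
{
    return !atomic_flag_test_and_set_explicit(&demo_lock,
                                              memory_order_acquire);
}

/* Example of library-style code bracketing an update to shared state. */
static void demo_log_event(void)
{
    demo_enter();
    demo_event_count++;
    demo_leave();
    assert(demo_event_count > 0);
}
```

The sections marked this way should stay short, since everything else that needs the library's shared state waits while one is held.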

int (*gidrid2nidpid)(

int (*nidpid2gidrid)(

Rolf added these. I haven't looked at how they have to work yet.