
All About Netgraph

March 12, 2009, by baoz

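Today I asked Mr. Bai whether FreeBSD has anything like netfilter's string match module, and he told me to take a look at netgraph. It turns out the answer is no, but netgraph really is very good and very powerful. Mr. Bai, also known as white-cn, truly deserves to be called the Yuanmou Man (the earliest ancestor) of BSD in China. The original article is here.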

Part I: What is Netgraph?

The motivation

Imagine the following scenario: you are developing a TCP/IP router product based on FreeBSD. The product needs to support bit-synchronous serial WAN connections, i.e., dedicated high speed lines that run up to T1 speeds, where the basic framing is done via HDLC. You need to support the following protocols for the transmission of IP packets over the wire:

  • IP frames delivered over HDLC (the simplest way to transmit IP)
  • IP frames delivered over “Cisco HDLC” (basically, packets are prepended with a two-byte Ethertype, and there are also periodic keep-alive packets).
  • IP delivered over frame relay (frame relay provides for up to 1000 virtual point-to-point links which are multiplexed over a single physical wire).
  • IP inside RFC 1490 encapsulation over frame relay (RFC 1490 is a way to multiplex multiple protocols over a single connection, and is often used in conjunction with frame relay).
  • Point-to-Point Protocol (PPP) over HDLC
  • PPP over frame relay
  • PPP inside RFC 1490 encapsulation over frame relay
  • PPP over ISDN
  • There are even rumors you might have to support frame relay over ISDN (!)

Figure 1 graphically indicates all of the possible combinations:

Figure 1: Ways to talk IP over synchronous serial and ISDN WAN connections

This was the situation faced by Julian Elischer <julian@freebsd.org> and myself back in 1996 while we were working on the Whistle InterJet. At that time FreeBSD had very limited support for synchronous serial hardware and protocols. We looked at OEMing from Emerging Technologies, but decided instead to do it ourselves.

The answer was netgraph. Netgraph is an in-kernel networking subsystem that follows the UNIX principle of achieving power and flexibility through combinations of simple tools, each of which is designed to perform a single, well defined task. The basic idea is straightforward: there are nodes (the tools) and edges that connect pairs of nodes (hence the “graph” in “netgraph”). Data packets flow bidirectionally along the edges from node to node. When a node receives a data packet, it performs some processing on it, and then (usually) forwards it to another node. The processing may be something as simple as adding/removing headers, or it may be more complicated or involve other parts of the system. Netgraph is vaguely similar to System V Streams, but is designed for better speed and more flexibility.
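The node-and-edge idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, every `toy_*` name is made up for the illustration, each node is limited to two hooks, and the two-byte tag merely stands in for the kind of header a framing node might add.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; none of these toy_* names exist in netgraph. */
#define TOY_MAXPKT 64

struct toy_pkt {
    size_t len;
    unsigned char data[TOY_MAXPKT];
};

struct toy_node;

struct toy_hook {
    struct toy_node *node;   /* node owning this end of the edge */
    struct toy_hook *peer;   /* other end of the edge, or NULL */
};

typedef int toy_rcvdata_t(struct toy_hook *hook, struct toy_pkt *pkt);

struct toy_node {
    toy_rcvdata_t *rcvdata;  /* called when data arrives on a hook */
    struct toy_hook up;      /* edge toward the protocol stack */
    struct toy_hook down;    /* edge toward the wire */
    struct toy_pkt last;     /* last packet kept by a sink node */
};

void toy_node_init(struct toy_node *n, toy_rcvdata_t *rcvdata)
{
    memset(n, 0, sizeof(*n));
    n->rcvdata = rcvdata;
    n->up.node = n;
    n->down.node = n;
}

/* An edge always joins exactly two hooks. */
void toy_connect(struct toy_hook *a, struct toy_hook *b)
{
    assert(a->peer == NULL && b->peer == NULL);
    a->peer = b;
    b->peer = a;
}

/* Hand a packet to whatever node sits at the far end of the edge. */
int toy_send(struct toy_hook *out, struct toy_pkt *pkt)
{
    if (out->peer == NULL)
        return -1;
    return out->peer->node->rcvdata(out->peer, pkt);
}

/* Adds a two-byte tag on the way down and strips it on the way up. */
int toy_encap_rcvdata(struct toy_hook *hook, struct toy_pkt *pkt)
{
    struct toy_node *n = hook->node;

    if (hook == &n->up) {
        if (pkt->len + 2 > TOY_MAXPKT)
            return -1;
        memmove(pkt->data + 2, pkt->data, pkt->len);
        pkt->data[0] = 0x08;
        pkt->data[1] = 0x00;
        pkt->len += 2;
        return toy_send(&n->down, pkt);
    }
    if (pkt->len < 2)
        return -1;
    memmove(pkt->data, pkt->data + 2, pkt->len - 2);
    pkt->len -= 2;
    return toy_send(&n->up, pkt);
}

/* A sink simply remembers the last packet it received. */
int toy_sink_rcvdata(struct toy_hook *hook, struct toy_pkt *pkt)
{
    hook->node->last = *pkt;
    return 0;
}
```

Connecting a sink, an encapsulating node, and a second sink in a row lets a packet travel in either direction, gaining the tag on the way down and losing it on the way up.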

Netgraph has proven very useful for networking, and is currently used in the Whistle InterJet for all of the above protocol configurations (except frame relay over ISDN), plus normal PPP over asynchronous serial (i.e., modems and TAs) and Point-to-Point Tunneling Protocol (PPTP), which includes encryption. With all of these protocols, the data packets are handled entirely in the kernel. In the case of PPP, the negotiation packets are handled separately in user-mode (see the FreeBSD port for mpd-3.0b5).

Nodes and edges

Looking at the picture above, it is obvious what the nodes and edges might be.

Object oriented nature

Netgraph is somewhat object-oriented in its design. Each node type is defined by an array of pointers to the methods, or C functions, that implement the specific behavior of nodes of that type. Each method may be left NULL to fall back to the default behavior.
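As a rough illustration of the "array of method pointers with defaults" idea, here is a userspace sketch. The `toy_*` names and the two-method table are assumptions made for the example; the real method table is struct ng_type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-method "type"; a NULL slot means "use the default". */
struct toy_type {
    const char *name;
    int (*constructor)(int *state);
    int (*rcvdata)(int *state, int value);
};

/* Default behaviors supplied by the (toy) base system. */
int toy_default_constructor(int *state)
{
    *state = 0;
    return 0;
}

int toy_default_rcvdata(int *state, int value)
{
    (void)state;
    (void)value;            /* default: silently drop the data */
    return 0;
}

/* The base system picks the override if present, else the default. */
int toy_construct(const struct toy_type *t, int *state)
{
    assert(t != NULL && state != NULL);
    return (t->constructor != NULL ? t->constructor
                                   : toy_default_constructor)(state);
}

int toy_deliver(const struct toy_type *t, int *state, int value)
{
    assert(t != NULL && state != NULL);
    return (t->rcvdata != NULL ? t->rcvdata
                               : toy_default_rcvdata)(state, value);
}

/* A type that overrides only rcvdata: it accumulates what it receives. */
int toy_counter_rcvdata(int *state, int value)
{
    *state += value;
    return 0;
}

const struct toy_type toy_counter_type = { "counter", NULL, toy_counter_rcvdata };
const struct toy_type toy_null_type = { "null", NULL, NULL };
```

Here the "counter" type inherits the default constructor but supplies its own data handler, while the "null" type relies on the defaults for everything.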

Similarly, there are some control messages that are understood by all node types and which are handled by the base system (these are called generic control messages). Each node type may in addition define its own type-specific control messages. Control messages always contain a typecookie and a command, which together identify how to interpret the message. Each node type must define its own unique typecookie if it wishes to receive type-specific control messages. The generic control messages have a predefined typecookie.
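The way a typecookie and a command together select the meaning of a message can be sketched as follows. This is a userspace model; the cookie values, command numbers, and `toy_*` names are invented for the illustration and are not the real netgraph constants.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_GENERIC_COOKIE 1u          /* hypothetical value */
#define TOY_COUNTER_COOKIE 916107047u  /* hypothetical value */

enum { TOY_GENERIC_PING = 1 };                      /* generic commands */
enum { TOY_COUNTER_GET = 1, TOY_COUNTER_CLR = 2 };  /* type-specific */

struct toy_msg {
    uint32_t typecookie;
    uint32_t cmd;
    uint32_t arg;
};

struct toy_counter {
    uint32_t count;
};

/* Type-specific handler: only understands its own typecookie. */
int toy_counter_rcvmsg(struct toy_counter *c, const struct toy_msg *m,
    uint32_t *reply)
{
    if (m->typecookie != TOY_COUNTER_COOKIE)
        return EINVAL;
    switch (m->cmd) {
    case TOY_COUNTER_GET:
        *reply = c->count;
        return 0;
    case TOY_COUNTER_CLR:
        c->count = 0;
        return 0;
    default:
        return EINVAL;
    }
}

/* Base system: generic messages are handled here, the rest by the node. */
int toy_dispatch(struct toy_counter *c, const struct toy_msg *m,
    uint32_t *reply)
{
    assert(c != NULL && m != NULL && reply != NULL);
    if (m->typecookie == TOY_GENERIC_COOKIE) {
        if (m->cmd == TOY_GENERIC_PING) {
            *reply = m->arg;    /* echo the argument back */
            return 0;
        }
        return EINVAL;
    }
    return toy_counter_rcvmsg(c, m, reply);
}
```

Note that command number 1 means two different things here; only the typecookie tells them apart, which is why each node type needs a unique one.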

Memory

Netgraph uses reference counting for node and hook structures. Each pointer to a node or a hook should count for one reference. If a node has a name, that also counts as a reference. All netgraph-related heap memory is allocated and freed using malloc type M_NETGRAPH.
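A minimal userspace sketch of this counting rule, assuming plain malloc()/free() in place of the kernel allocator. The `toy_*` functions are hypothetical stand-ins and not the real ng_make_node_common(), ng_name_node(), ng_unname(), or ng_unref().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_node {
    int refs;       /* one per pointer to the node, plus one for a name */
    char *name;     /* NULL when the node is unnamed */
};

int toy_live_nodes = 0;     /* lets a caller observe when a node is freed */

/* Creating a node yields exactly one reference. */
struct toy_node *toy_make_node(void)
{
    struct toy_node *n = calloc(1, sizeof(*n));

    if (n == NULL)
        return NULL;
    n->refs = 1;
    toy_live_nodes++;
    return n;
}

/* Drop one reference; the node is freed when the count reaches zero. */
void toy_unref(struct toy_node *n)
{
    assert(n != NULL && n->refs > 0);
    if (--n->refs == 0) {
        free(n->name);
        free(n);
        toy_live_nodes--;
    }
}

/* Naming a node adds a reference. */
int toy_name_node(struct toy_node *n, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy;

    if (n->name != NULL || (copy = malloc(len)) == NULL)
        return -1;
    memcpy(copy, name, len);
    n->name = copy;
    n->refs++;
    return 0;
}

/* Removing the name drops that reference; no effect if unnamed. */
void toy_unname(struct toy_node *n)
{
    if (n->name == NULL)
        return;
    free(n->name);
    n->name = NULL;
    toy_unref(n);
}
```

A named node therefore survives a single toy_unref(); it goes away only after both the name and the creation reference have been released.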

Synchronization

Running in the kernel requires attention to synchronization. Netgraph nodes normally run at splnet() (see spl(9)). For most node types, no special attention is necessary. Some nodes, however, interact with other parts of the kernel that run at different priority levels. For example, serial ports run at spltty() and so ng_tty(8) needs to deal with this. For these cases netgraph provides alternate data transmission routines that handle all the necessary queuing auto-magically (see ng_queue_data() below).
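The queuing idea can be sketched in userspace as a small FIFO that is filled from one context and drained later from another. This is a single-threaded model with made-up `toy_*` names; it shows only the deferral, not the interrupt-level protection the kernel would need around the queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QLEN 8

/* Fixed-size FIFO of pending "packets" (plain ints in this model). */
struct toy_queue {
    int items[TOY_QLEN];
    size_t head;
    size_t count;
};

/* Called from the "wrong" context: just remember the packet for later. */
int toy_queue_data(struct toy_queue *q, int pkt)
{
    assert(q != NULL);
    if (q->count == TOY_QLEN)
        return -1;                      /* queue full, caller must drop */
    q->items[(q->head + q->count) % TOY_QLEN] = pkt;
    q->count++;
    return 0;
}

/* Called later from the networking context: deliver in arrival order. */
size_t toy_drain(struct toy_queue *q, void (*deliver)(int pkt))
{
    size_t n = 0;

    while (q->count > 0) {
        deliver(q->items[q->head]);
        q->head = (q->head + 1) % TOY_QLEN;
        q->count--;
        n++;
    }
    return n;
}

/* Example receiver that records the order of delivery as decimal digits. */
int toy_trace = 0;

void toy_trace_deliver(int pkt)
{
    toy_trace = toy_trace * 10 + pkt;
}
```

Nothing reaches the receiver until toy_drain() runs, which is the property that makes the queued path safe to use from a different priority level.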

How to implement a node type

To implement a new node type, you only need to do two things:

  1. Define a struct ng_type.
  2. Link it in using the NETGRAPH_INIT() macro.

Step 2 is easy, so we'll focus on step 1. Here is struct ng_type, taken from netgraph.h:

/*
 * Structure of a node type
 */
struct ng_type {

    u_int32_t       version;        /* must equal NG_VERSION */
    const char      *name;          /* Unique type name */
    modeventhand_t  mod_event;      /* Module event handler (optional) */
    ng_constructor_t *constructor;  /* Node constructor */
    ng_rcvmsg_t     *rcvmsg;        /* control messages come here */
    ng_shutdown_t   *shutdown;      /* reset, and free resources */
    ng_newhook_t    *newhook;       /* first notification of new hook */
    ng_findhook_t   *findhook;      /* only if you have lots of hooks */
    ng_connect_t    *connect;       /* final notification of new hook */
    ng_rcvdata_t    *rcvdata;       /* data comes here */
    ng_rcvdata_t    *rcvdataq;      /* or here if being queued */
    ng_disconnect_t *disconnect;    /* notify on disconnect */

    const struct    ng_cmdlist *cmdlist;    /* commands we can convert */

    /* R/W data private to the base netgraph code DON'T TOUCH! */
    LIST_ENTRY(ng_type) types;              /* linked list of all types */
    int                 refs;               /* number of instances */
};

The version field should be equal to NG_VERSION. This is to prevent linking in incompatible types. The name is the unique node type name, e.g., ``tee''. The mod_event is an optional module event handler (for when the node type is loaded and unloaded) -- similar to a static initializer in C++ or Java.

Next are the node type methods, described in detail below. The cmdlist provides (optional) information for converting control messages to/from ASCII (see below), and the rest is private to the base netgraph code.

Node type methods

Each node type may implement these methods, which are defined in its struct ng_type. Each method has a default implementation, which is used if the node type doesn't define one.

int constructor(node_p *node);
Purpose: Initialize a new node by calling ng_make_node_common() and setting node->private if appropriate. Per-node initialization and memory allocation should happen here. ng_make_node_common() should be called first; it creates the node and sets the reference count to one.

Default action: Just calls ng_make_node_common().

When to override: If you require node-specific initialization or resource allocation.

int rcvmsg(node_p node, struct ng_mesg *msg,
       const char *retaddr, struct ng_mesg **resp);
Purpose: Receive and handle a control message. The address of the sender is in retaddr. The rcvmsg() function is responsible for freeing msg. The response, if any, may be returned synchronously if resp != NULL by setting *resp to point to it. Generic control messages (except for NGM_TEXT_STATUS) are handled by the base system and need not be handled here.

Default action: Handle all generic control messages; otherwise returns EINVAL.

When to override: If you define any type-specific control messages, or you want to implement control messages defined by some other node type.

int shutdown(node_p node);
Purpose: Shut down the node. Should disconnect all hooks by calling ng_cutlinks(), free all private per-node memory, release the assigned name (if any) via ng_unname(), and release the node itself by calling ng_unref() (this call releases the reference added by ng_make_node_common()).

In the case of persistent nodes, all hooks should be disconnected and the associated device (or whatever) reset, but the node should not be removed (i.e., only call ng_cutlinks()).

Default action: Calls ng_cutlinks(), ng_unname(), and ng_unref().

When to override: When you need to undo the stuff you did in the constructor method.

int newhook(node_p node, hook_p hook, const char *name);
Purpose: Validate the connection of a hook and initialize any per-hook resources. The node should verify that the hook name is in fact one of the hook names supported by this node type. The uniqueness of the name will have already been verified (but it doesn't hurt to double-check).

If the hook requires per-hook information, this method should initialize hook->private accordingly.

Default action: Does nothing; the hook connection is always accepted.

When to override: Always, unless you plan to allow arbitrarily named hooks, have no per-hook initialization or resource allocation, and treat all hooks the same upon connection.

hook_p findhook(node_p node, const char *name);
Purpose: Find a connected hook on this node. It is not necessary to override this method unless the node supports a large number of hooks, where a linear search would be too slow.

Default action: Performs a linear search through the list of hooks connected to this node.

When to override: When your node supports a large number of simultaneously connected hooks (say, more than 50).

int connect(hook_p hook);
Purpose: Final verification of hook connection. This method gives the node a last chance to validate a newly connected hook. For example, the node may actually care who it's connected to. If this method returns an error, the connection is aborted.

Default action: Does nothing; the hook connection is accepted.

When to override: I've never had an occasion to override this method.

int rcvdata(hook_p hook, struct mbuf *m, meta_p meta);
Purpose: Receive an incoming data packet on a connected hook. The node is responsible for freeing the mbuf if it returns an error, or wishes to discard the data packet. Although not currently the case, in the future it could be that sometimes m == NULL (for example, if there is only a meta to be sent), so node types should handle this possibility.

Default action: Drops the data packet and meta-information.

When to override: Always, unless you intend to discard all received data packets.

int rcvdataq(hook_p hook, struct mbuf *m, meta_p meta);
Purpose: Queue an incoming data packet for reception on a connected hook. The node is responsible for freeing the mbuf if it returns an error, or wishes to discard the data packet.

The intention here is that some nodes may want to send data using a queuing mechanism instead of a functional mechanism. This requires cooperation of the receiving node type, which must implement this method in order for it to do anything different from rcvdata().

Default action: Calls the rcvdata() method.

When to override: Never, unless you have a reason to treat incoming ``queue'' data differently from incoming ``non-queue'' data.

int disconnect(hook_p hook);
Purpose: Notification to the node that a hook is being disconnected. The node should release any per-hook resources allocated during newhook() or connect().

Although this function returns int, it should really return void because the return value is ignored; hook disconnection cannot be blocked by a node.

This function should check whether the last hook has been disconnected (hook->node->numhooks == 0) and if so, call ng_rmnode() to self-destruct, as is the custom. This helps avoid completely unconnected nodes that linger around in the system after their job is finished.

Default action: Does nothing.

When to override: Almost always.
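The "self-destruct on last hook" custom can be sketched in userspace like this. The `toy_*` names are hypothetical, and the sketch assumes the base system has already decremented the hook count before the disconnect method runs, which is what the numhooks == 0 test above suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_node {
    int numhooks;   /* hooks still connected */
    int alive;      /* cleared once the node has removed itself */
};

struct toy_hook {
    struct toy_node *node;
    void *priv;     /* per-hook data allocated when the hook was added */
};

/* Stand-in for node removal; a real node would also free its memory. */
void toy_rmnode(struct toy_node *n)
{
    n->alive = 0;
}

/* The node type's disconnect method. */
int toy_disconnect(struct toy_hook *hook)
{
    assert(hook != NULL && hook->node != NULL);
    free(hook->priv);                   /* release per-hook resources */
    hook->priv = NULL;
    if (hook->node->numhooks == 0)      /* that was the last hook */
        toy_rmnode(hook->node);
    return 0;                           /* return value is ignored */
}

/* Base-system side: unlink the hook, then notify the node. */
void toy_destroy_hook(struct toy_hook *hook)
{
    hook->node->numhooks--;
    (void)toy_disconnect(hook);
}
```

With two hooks attached, removing the first leaves the node alive; removing the second makes it remove itself.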

int mod_event(module_t mod, int what, void *arg);
Purpose: Handle the events of loading and unloading the node type. Note that both events are handled through this one method, distinguished by what being either MOD_LOAD or MOD_UNLOAD. The arg parameter is a pointer to the struct ng_type defining the node type.

This method will never be called for MOD_UNLOAD when there are any nodes of this type currently in existence.

Currently, netgraph will only ever try to MOD_UNLOAD a node type when kldunload(2) is explicitly called. However, in the future more proactive unloading of node types may be implemented as a ``garbage collection'' measure.

Default action: Does nothing. If not overridden, MOD_LOAD and MOD_UNLOAD will succeed normally.

When to override: If your type needs to do any type-specific initialization or resource allocation upon loading, or undo any of that upon unloading. Also, if your type does not support unloading (perhaps because of unbreakable associations with other parts of the kernel) then returning an error in the MOD_UNLOAD case will prevent the type from being unloaded.

Netgraph header files

There are two header files all node types include. The netgraph.h header file defines the basic netgraph structures (good object-oriented design would dictate that the definitions of struct ng_node and struct ng_hook really don't belong here; instead, they should be private to the base netgraph code). Node structures are freed when the reference counter drops to zero after a call to ng_unref(). If a node has a name, that counts as a reference; to remove the name (and the reference), call ng_unname(). Of particular interest is struct ng_type, since every node type must supply one of these.

The ng_message.h header file defines structures and macros relevant to handling control messages. It defines the struct ng_mesg which every control message has as a prefix. It also serves as the ``public header file'' for all of the generic control messages, which all have typecookie NGM_GENERIC_COOKIE. The following summarizes the generic control messages:

NGM_SHUTDOWN Disconnect all target node hooks and remove the node (or just reset if persistent)
NGM_MKPEER Create a new node and connect to it
NGM_CONNECT Connect a target node's hook to another node
NGM_NAME Assign the target node a name
NGM_RMHOOK Break a connection between the target node and another node
NGM_NODEINFO Get information about the target node
NGM_LISTHOOKS Get a list of all connected hooks on the target node
NGM_LISTNAMES Get a list of all named nodes *
NGM_LISTNODES Get a list of all nodes, named and unnamed *
NGM_LISTTYPES Get a list of all installed node types *
NGM_TEXT_STATUS Get a human readable status report from the target node (optional)
NGM_BINARY2ASCII Convert a control message from binary to ASCII
NGM_ASCII2BINARY Convert a control message from ASCII to binary
* Not node specific

For most of these commands, there are corresponding C structure(s) defined in ng_message.h.

The netgraph.h and ng_message.h header files also define several commonly used functions and macros:

int ng_send_data(hook_p hook, struct mbuf *m, meta_p meta);
What it does: Delivers the mbuf m and associated meta-data meta out the hook hook and returns the resulting error code. Either or both of m and meta may be NULL. In all cases, the responsibility for freeing m and meta is lifted when this function is called (even if there is an error), so these variables should be set to NULL after the call (this is done automatically if you use the NG_SEND_DATA() macro instead).
int ng_send_dataq(hook_p hook, struct mbuf *m, meta_p meta);
What it does: Same as ng_send_data(), except the recipient node receives the data via its rcvdataq() method instead of its rcvdata() method. If the node type does not override rcvdataq(), then calling this is equivalent to calling ng_send_data().
int ng_queue_data(hook_p hook, struct mbuf *m, meta_p meta);
What it does: Same as ng_send_data(), except this is safe to call from a non-splnet() context. The mbuf and meta-information will be queued and delivered later at splnet().
int ng_send_msg(node_p here, struct ng_mesg *msg,
       const char *address, struct ng_mesg **resp);
What it does: Sends the netgraph control message pointed to by msg from the local node here to the node found at address, which may be an absolute or relative address. If resp is non-NULL, and the recipient node wishes to return a synchronous reply, it will set *resp to point at it. In this case, it is the calling node's responsibility to process and free *resp.
int ng_queue_msg(node_p here, struct ng_mesg *msg, const char *address);
What it does: Same as ng_send_msg(), except this is safe to call from a non-splnet() context. The message will be queued and delivered later at splnet(). No synchronous reply is possible.
NG_SEND_DATA(error, hook, m, meta)
What it does: Slightly safer version of ng_send_data(). This simply calls ng_send_data() and then sets m and meta to NULL. Either or both of m and meta may be NULL, though they must be actual variables (they can't be the constant NULL due to the way the macro works).
NG_SEND_DATAQ(error, hook, m, meta)
What it does: Slightly safer version of ng_send_dataq(). This simply calls ng_send_dataq() and then sets m and meta to NULL. Either or both of m and meta may be NULL, though they must be actual variables (they can't be the constant NULL due to the way the macro works).
NG_FREE_DATA(m, meta)
What it does: Frees m and meta and sets them to NULL. Either or both of m and meta may be NULL, though they must be actual variables (they can't be the constant NULL due to the way the macro works).
NG_FREE_META(meta)
What it does: Frees meta and sets it to NULL. meta may be NULL, though it must be an actual variable (it can't be the constant NULL due to the way the macro works).
NG_MKMESSAGE(msg, cookie, cmdid, len, how)
What it does: Allocates and initializes a new netgraph control message with len bytes of argument space (len should be zero if there are no arguments). msg should be of type struct ng_mesg *. The cookie and cmdid are the message typecookie and command ID. how is one of M_WAIT or M_NOWAIT (it's safer to use M_NOWAIT).

Sets msg to NULL if memory allocation fails. Initializes the message token to zero.

NG_MKRESPONSE(rsp, msg, len, how)
What it does: Allocates and initializes a new netgraph control message that is intended to be a response to msg. The response will have len bytes of argument space (len should be zero if there are no arguments). msg should be a pointer to an existing struct ng_mesg while rsp should be of type struct ng_mesg *. how is one of M_WAIT or M_NOWAIT (it's safer to use M_NOWAIT).

Sets rsp to NULL if memory allocation fails.

int ng_name_node(node_p node, const char *name);
What it does: Assign the global name name to node node. The name must be unique. This is often called from within node constructors for nodes that are associated with some other named kernel entity, e.g., a device or interface. Assigning a name to a node increments the node's reference count.
void ng_cutlinks(node_p node);
What it does: Breaks all hook connections for node. Typically this is called during node shutdown.
void ng_unref(node_p node);
What it does: Decrements a node's reference count, and frees the node if that count goes to zero. Typically this is called in the shutdown() method to release the reference created by ng_make_node_common().
void ng_unname(node_p node);
What it does: Removes the global name assigned to the node and decrements the reference count. If the node does not have a name, this function has no effect. This should be called in the shutdown() method before freeing the node (via ng_unref()).
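The reason the macro forms are "slightly safer" is the ownership rule: once the data has been handed off, the caller's pointer is stale. A userspace sketch of the same pattern, with hypothetical `toy_*` and `TOY_*` names and a plain heap buffer standing in for the mbuf and meta:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_buf {
    int payload;
};

int toy_freed = 0;      /* counts buffers released, for observation */

/*
 * Consumes buf whether or not delivery succeeds, so the caller must
 * never touch or free it again after this call.
 */
int toy_send_data(int hook_ok, struct toy_buf *buf)
{
    if (buf != NULL) {
        toy_freed++;
        free(buf);
    }
    return hook_ok ? 0 : -1;
}

/* Send and then clear the caller's pointer so it cannot be reused. */
#define TOY_SEND_DATA(error, hook_ok, buf) do {         \
        (error) = toy_send_data((hook_ok), (buf));      \
        (buf) = NULL;                                   \
} while (0)

/* Free and clear; harmless when the pointer is already NULL. */
#define TOY_FREE_DATA(buf) do {                         \
        if ((buf) != NULL) {                            \
            toy_freed++;                                \
            free(buf);                                  \
            (buf) = NULL;                               \
        }                                               \
} while (0)
```

Because both macros assign to their pointer argument, the argument has to be a real variable, and an error path can call TOY_FREE_DATA() unconditionally without risking a double free.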

A real life example

Enough theory, let's see an example. Here is the implementation of the tee node type. As is the custom, the implementation consists of a public header file, a C file, and a man page. The header file is ng_tee.h and the C file is ng_tee.c.

Here are some things to notice about the header file:

  • The header file defines the following important things:
    • The unique name of the type ``tee'' as NG_TEE_NODE_TYPE.
    • The unique typecookie for ``tee'' node specific control messages, NGM_TEE_COOKIE.
    • The names of the four hooks supported by ``tee'' nodes.
    • The two control messages understood by ``tee'' nodes, NGM_TEE_GET_STATS and NGM_TEE_CLR_STATS.
    • The structure returned by NGM_TEE_GET_STATS, which is a struct ng_tee_stats.

    This information is public because other node types need to know it in order to talk to and connect to tee nodes.

  • Whenever there is an incompatible change in the control message format, the typecookie should be changed to avoid mysterious problems. The traditional way to generate unique typecookies is to use the output of ``date -u +%s''.
  • Along with the C structures are corresponding macros that are used when converting between binary and ASCII. Although this information really belongs in the C file, it is put into the header file so it doesn't get out of sync with the actual structure.

Here are some things to notice about the C file:

  • Nodes typically store information private to the node or to each hook. For the ng_tee(8) node type, this information is stored in a struct privdata for each node, and a struct hookdata for each hook.
  • The ng_tee_cmds array defines how to convert the type specific control messages from binary to ASCII and back. See below.
  • The ng_tee_typestruct at the beginning actually defines the node type for tee nodes. This structure contains the netgraph system version (to avoid incompatibilities), the unique type name (NG_TEE_NODE_TYPE), pointers to the node type methods, and a pointer to the ng_tee_cmds array. Some methods don't need to be overridden because the default behavior is sufficient.
  • The NETGRAPH_INIT() macro is required to link in the type. This macro works whether the node type is compiled as a KLD or directly into the kernel (in this case, using options NETGRAPH_TEE).
  • Netgraph node structures (type struct ng_node) contain reference counts to ensure they get freed at the right time. A hidden side effect of calling ng_make_node_common() in the node constructor is that one reference is created. This reference is released by the ng_unref() call in the shutdown method ngt_rmnode().
  • Also in ngt_rmnode() is a call to ng_bypass(). This is a bit of a kludge that joins two edges by disconnecting the node in between them (in this case, the tee node).
  • Note that in the function ngt_disconnect() the node destroys itself when the last hook is disconnected. This keeps nodes from lingering around after they have nothing left to do.
  • No spl synchronization calls are necessary; the entire thing runs at splnet().

Converting control messages to/from ASCII

Netgraph provides an easy way to convert control messages (indeed, any C structure) between binary and ASCII formats. A detailed explanation is beyond the scope of this article, but here we'll give an overview.

Recall that control messages have a fixed header (struct ng_mesg) followed by a variable length payload having arbitrary structure and contents. In addition, the control message header contains a flag bit indicating whether the message is a command or a reply. Usually the payload will be structured differently in the command and the response. For example, the ``tee'' node has a NGM_TEE_GET_STATS control message. When sent as a command ((msg->header.flags & NGF_RESP) == 0), the payload is empty. When sent as a response to a command ((msg->header.flags & NGF_RESP) != 0), the payload contains a struct ng_tee_stats that contains the node statistics.
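The command/response distinction can be modeled in userspace as follows. This is a sketch only: the `toy_*` structures are simplified stand-ins for struct ng_mesg and struct ng_tee_stats, and TOY_FLAG_RESP plays the role of NGF_RESP.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_FLAG_RESP 0x0001u   /* stand-in for NGF_RESP */

struct toy_msghdr {
    uint32_t typecookie;
    uint32_t cmd;
    uint32_t flags;
    uint32_t arglen;            /* bytes of payload that follow */
};

struct toy_msg {
    struct toy_msghdr header;
    unsigned char data[];       /* variable-length payload */
};

struct toy_stats {
    uint32_t in_frames;
    uint32_t out_frames;
};

/* Allocate a zeroed command with len bytes of argument space. */
struct toy_msg *toy_mkmessage(uint32_t cookie, uint32_t cmd, uint32_t len)
{
    struct toy_msg *msg = calloc(1, sizeof(*msg) + len);

    if (msg == NULL)
        return NULL;
    msg->header.typecookie = cookie;
    msg->header.cmd = cmd;
    msg->header.arglen = len;
    return msg;
}

/* A response repeats the cookie and command and sets the response flag. */
struct toy_msg *toy_mkresponse(const struct toy_msg *cmd, uint32_t len)
{
    struct toy_msg *rsp =
        toy_mkmessage(cmd->header.typecookie, cmd->header.cmd, len);

    if (rsp != NULL)
        rsp->header.flags |= TOY_FLAG_RESP;
    return rsp;
}

/* Handle a "get stats" command: empty payload in, statistics out. */
struct toy_msg *toy_get_stats(const struct toy_msg *cmd,
    const struct toy_stats *st)
{
    struct toy_msg *rsp;

    assert((cmd->header.flags & TOY_FLAG_RESP) == 0);
    rsp = toy_mkresponse(cmd, (uint32_t)sizeof(*st));
    if (rsp != NULL)
        memcpy(rsp->data, st, sizeof(*st));
    return rsp;
}
```

The same cookie and command number appear in both messages; only the flag bit and the payload layout differ between the two directions.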

So for each control message that a node type understands, the node type defines how to convert the payload area of that control message (in both cases, command and response) between its native binary representation and a human-readable ASCII version. These definitions are called netgraph parse types.

The cmdlist field in the struct ng_type that defines a node type is a pointer to an array of struct ng_cmdlist elements. Each element in this array corresponds to a type-specific control message understood by this node. Along with the typecookie and command ID (which uniquely identify the control message), each element contains an ASCII name and two netgraph parse types that define how the payload area data is structured, one for each direction (command and response).

Parse types are built up from the predefined parse types defined in ng_parse.h. Using these parse types, you can describe any arbitrarily complicated C structure, even one containing variable length arrays and strings. The ``tee'' node type has an example of doing this for the struct ng_tee_stats returned by the NGM_TEE_GET_STATS control message (see ng_tee.h and ng_tee.c).
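To give a feel for what a table-driven description of a structure buys you, here is a userspace sketch that converts a small statistics structure to text and back using a field table. It does not use the ng_parse.h API; the `toy_*` names, the field table layout, and the exact text format are assumptions made for the illustration, and only 32-bit unsigned fields are supported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_stats {
    uint32_t in_frames;
    uint32_t out_frames;
};

/* One entry per structure member; every member here is a uint32_t. */
struct toy_field {
    const char *name;
    size_t offset;
};

const struct toy_field toy_stats_fields[] = {
    { "inFrames", offsetof(struct toy_stats, in_frames) },
    { "outFrames", offsetof(struct toy_stats, out_frames) },
    { NULL, 0 }
};

/* Binary to ASCII: walk the table and print "name=value" pairs. */
int toy_unparse(const struct toy_field *fields, const void *bin,
    char *out, size_t cap)
{
    size_t used;
    int n;

    assert(cap > 0);
    n = snprintf(out, cap, "{");
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;
    for (; fields->name != NULL; fields++) {
        uint32_t v;

        memcpy(&v, (const unsigned char *)bin + fields->offset, sizeof(v));
        n = snprintf(out + used, cap - used, " %s=%lu",
            fields->name, (unsigned long)v);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(out + used, cap - used, " }");
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    return 0;
}

/* ASCII to binary: look up each field by name; all must be present. */
int toy_parse(const struct toy_field *fields, const char *ascii, void *bin)
{
    for (; fields->name != NULL; fields++) {
        char key[64];
        const char *p;
        uint32_t v;

        snprintf(key, sizeof(key), " %s=", fields->name);
        if ((p = strstr(ascii, key)) == NULL)
            return -1;
        v = (uint32_t)strtoul(p + strlen(key), NULL, 10);
        memcpy((unsigned char *)bin + fields->offset, &v, sizeof(v));
    }
    return 0;
}
```

The conversion code never mentions the structure itself; describing a different structure would only require a different table, which is the property the parse type mechanism exploits.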

You can also define your own parse types from scratch if necessary. For example, the ``ksocket'' node type contains special code for converting a struct sockaddr in the address families AF_INET and AF_LOCAL, to make them more human friendly. The relevant code can be found in ng_ksocket.h and ng_ksocket.c, specifically the section labeled ``STRUCT SOCKADDR PARSE TYPE''.

Parse types are a convenient and efficient way to effect binary/ASCII conversion in the kernel without a lot of manual parsing code and string manipulation. When performance is a real issue, binary control messages can always be used directly to avoid any conversion.

The gory details about parse types are available in ng_parse.h and ng_parse.c.

Programming gotchas

Some things to look out for if you plan on implementing your own netgraph node type:

  • First, make sure you fully understand how mbuf's work and are used.
  • All data packets must be packet header mbuf's, i.e., with the M_PKTHDR flag set.
  • Be careful to always update m->m_pkthdr.len when you update m->m_len for any mbuf in the chain.
  • Be careful to check m->m_len and call m_pullup() if necessary before accessing mbuf data. Don't call m_pullup() unless necessary. You should always follow this pattern:
    	  struct foobar *f;
    
    	  if (m->m_len < sizeof(*f) && (m = m_pullup(m, sizeof(*f))) == NULL) {
    	      NG_FREE_META(meta);
    	      return (ENOBUFS);
    	  }
    	  f=mtod(m, struct foobar *);
    	  ...
  • Be careful to release all resources at the appropriate time, e.g., during the disconnect() and shutdown() methods to avoid memory leaks, etc. I've accidentally done things like leave timers running with disastrous results.
  • If you use a timer (see timeout(9)), be sure to set splnet() as the first thing in your handler (and splx() before exiting, of course). The timeout() routine does not preserve the SPL level to the event handler.
  • Make sure your node disappears when all edges have been broken unless there's a good reason not to.
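Why the m_pullup() check matters is easier to see with a toy model of a buffer chain. The following userspace sketch is not the kernel mbuf code: `toy_mbuf` and the `toy_*` functions are hypothetical, the pull-up copies in place instead of possibly allocating, and there is no packet header length to maintain. It does keep the one behavior the pattern above depends on, namely that a failed pull-up frees the chain and returns NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MLEN 32

/* Simplified buffer chain: data may be split across several links. */
struct toy_mbuf {
    struct toy_mbuf *next;
    size_t len;
    unsigned char data[TOY_MLEN];
};

struct toy_mbuf *toy_mbuf_new(const void *bytes, size_t len,
    struct toy_mbuf *next)
{
    struct toy_mbuf *m;

    assert(len <= TOY_MLEN);
    if ((m = calloc(1, sizeof(*m))) == NULL)
        return NULL;
    memcpy(m->data, bytes, len);
    m->len = len;
    m->next = next;
    return m;
}

void toy_freem(struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *next = m->next;

        free(m);
        m = next;
    }
}

/* Make the first `want` bytes contiguous, or free the chain and fail. */
struct toy_mbuf *toy_pullup(struct toy_mbuf *m, size_t want)
{
    if (want > TOY_MLEN) {
        toy_freem(m);
        return NULL;
    }
    while (m->len < want) {
        struct toy_mbuf *n = m->next;
        size_t take;

        if (n == NULL) {                /* chain is too short */
            toy_freem(m);
            return NULL;
        }
        take = want - m->len;
        if (take > n->len)
            take = n->len;
        memcpy(m->data + m->len, n->data, take);
        m->len += take;
        memmove(n->data, n->data + take, n->len - take);
        n->len -= take;
        if (n->len == 0) {              /* drop the emptied link */
            m->next = n->next;
            free(n);
        }
    }
    return m;
}

/* The recommended pattern: pull up only when the header is split. */
int toy_read_proto(struct toy_mbuf **mp, unsigned *proto)
{
    struct toy_mbuf *m = *mp;

    if (m->len < 2 && (m = toy_pullup(m, 2)) == NULL) {
        *mp = NULL;                     /* chain was freed for us */
        return ENOBUFS;
    }
    *mp = m;
    *proto = ((unsigned)m->data[0] << 8) | m->data[1];
    return 0;
}
```

In this model, reading a two-byte field whose bytes sit in different links succeeds only after the pull-up, and on failure the caller is left with a NULL pointer instead of a dangling one.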

Part IV: Future Directions

Netgraph is still a work in progress, and contributors are welcome! Here are some ideas for future work.

Node types

There are many node types yet to be written:

  • A ``slip'' node type that implements the SLIP protocol. This should be pretty easy and may be done soon.
  • More PPP compression and encryption nodes that can connect to a ng_ppp(8) node, e.g., PPP Deflate compression, PPP 3DES encryption, etc.
  • An implementation of ipfw(4) as a netgraph node.
  • An implementation of the Dynamic Packet Filter as a netgraph node. DPF is sort of a hyper-speed JIT compiling version of BPF.
  • A generic ``mux'' node type, where each hook could be configured with a unique header to append/strip from data packets.

FreeBSD currently has four PPP implementations: sppp(4), pppd(8), ppp(8), and the MPD port. This is pretty silly. Using netgraph, these can all be collapsed into a single user-land daemon that handles all the configuration and negotiation, while routing all data strictly in the kernel via ng_ppp(8) nodes. This combines the flexibility and configuration benefits of the user-land daemons with the speed of the kernel implementations. Right now MPD is the only implementation that has been fully ``netgraphified'' but plans are in the works for ppp(8) as well.

Control message ASCII-fication

Not all node types that define their own control messages support conversion between binary and ASCII. One project is to finish this work for those nodes that still need it.

Control flow

One issue that may need addressing is control flow. Right now when you send a data packet, if the ultimate recipient of that packet can't handle it because of a full transmit queue or something, all it can do is drop the packet and return ENOBUFS. Perhaps we can define a new return code ESLOWDOWN or something that means ``data packet not dropped; queue full; slow down and try again later.'' Another possibility would be to define meta-data types for the equivalents of XOFF (stop flow) and XON (restart flow).

Code cleanups

Netgraph is somewhat object oriented, but could benefit from a more rigorous object oriented design without suffering too much in performance. There are still too many visible structure fields that shouldn't be accessible, etc., as well as other miscellaneous code cleanups.

Also, all of the node type man pages (e.g., ng_tee(8)) really belong in section 4 rather than section 8.

Electrocution

It would be nice to have a new generic control message NGM_ELECTROCUTE, which when sent to a node would shut down that node as well as every node it was connected to, and every node those nodes were connected to, etc. This would allow for a quick cleanup of an arbitrarily complicated netgraph graph in a single blow. In addition, there might be a new socket option (see setsockopt(2)) that you could set on a ng_socket(8) socket that would cause an NGM_ELECTROCUTE to be automatically generated when the socket was closed.

Together, these two features would lead to more reliable avoidance of netgraph ``node leak.''
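Since NGM_ELECTROCUTE is only a proposal, the following is purely a thought-experiment sketch of how such a cascade might traverse a graph. It is a userspace model with hypothetical `toy_*` names, a fixed number of peer slots per node, and a simple "alive" flag in place of real node shutdown.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAXHOOKS 4

struct toy_node {
    int alive;
    struct toy_node *peers[TOY_MAXHOOKS];   /* directly connected nodes */
};

/* Connect two nodes using the first free slot on each side. */
void toy_link(struct toy_node *a, struct toy_node *b)
{
    int i, j;

    for (i = 0; i < TOY_MAXHOOKS && a->peers[i] != NULL; i++)
        continue;
    for (j = 0; j < TOY_MAXHOOKS && b->peers[j] != NULL; j++)
        continue;
    assert(i < TOY_MAXHOOKS && j < TOY_MAXHOOKS);
    a->peers[i] = b;
    b->peers[j] = a;
}

/*
 * Shut down n and everything reachable from it. The node is marked dead
 * before its peers are visited, so cycles in the graph terminate.
 * Returns the number of nodes shut down.
 */
int toy_electrocute(struct toy_node *n)
{
    int count = 1;
    int i;

    if (n == NULL || !n->alive)
        return 0;
    n->alive = 0;
    for (i = 0; i < TOY_MAXHOOKS; i++) {
        struct toy_node *peer = n->peers[i];

        n->peers[i] = NULL;             /* break the edge */
        count += toy_electrocute(peer);
    }
    return count;
}
```

Marking a node before recursing is the important design choice: netgraph graphs can contain loops, and without the mark the traversal would never finish.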

Infinite loop detection

It would be easy to include ``infinite loop detection'' in the base netgraph code. That is, each node would have a private counter. The counter would be incremented before each call to a node's rcvdata() method, and decremented afterwards. If the counter reached some insanely high value, then we've detected an infinite loop (and avoided a kernel panic).
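A userspace sketch of the per-node counter described above. The `toy_*` names, the depth limit of 32, and the choice of ELOOP as the error are all assumptions for the illustration; each node here forwards to at most one neighbor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 32    /* arbitrary "insanely high" threshold */

struct toy_node {
    int depth;              /* how many deliveries are in progress here */
    int delivered;          /* total packets seen, for observation */
    struct toy_node *next;  /* where this node forwards data, or NULL */
};

/*
 * Deliver a packet to n, which forwards it to n->next. The counter is
 * raised around the call and lowered afterwards, so it only grows when
 * the same node is re-entered before its earlier call has returned.
 */
int toy_deliver(struct toy_node *n)
{
    int error;

    if (n->depth >= TOY_MAX_DEPTH)
        return ELOOP;               /* loop detected, stop recursing */
    n->depth++;
    n->delivered++;
    error = (n->next != NULL) ? toy_deliver(n->next) : 0;
    n->depth--;
    return error;
}
```

On a straight chain the counter never exceeds one, so ordinary traffic is unaffected; only a cycle can drive it up to the limit, at which point the error unwinds through every pending call and the counters return to zero.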

New node types

There are lots of new and improved node types that could be created, for example:

  • A routing node type. Each connected hook would correspond to a route destination, i.e., an address and netmask combination. The routes would be managed via control messages.
  • A stateful packet filtering/firewall/address translation node type (replacement for ipfw and/or ipfirewall)
  • Node type for bandwidth limiting and/or bandwidth accounting
  • Adding VLAN support to the existing Ethernet nodes.

If you really wanted to get crazy

In theory, the BSD networking subsystem could be entirely replaced by netgraph. Of course, this will probably never happen, but it makes for a nice thought experiment. Each networking device would be a persistent netgraph node (like Ethernet devices are now). On top of each Ethernet device node would be an ``Ethertype multiplexor.'' Connected to this would be IP, ARP, IPX, AppleTalk, etc. nodes. The IP node would be a simple ``IP protocol multiplexor'' node on top of which would sit TCP, UDP, etc. nodes. The TCP and UDP nodes would in turn have socket-like nodes on top of them. Etc, etc.

Other crazy ideas (disclaimer: these are crazy ideas):

  • Make all devices appear as netgraph nodes. Convert between ioctl(2)'s and control messages. Talk directly to your SCSI disk with ngctl(8)! Seamless integration between netgraph and DEVFS.
  • A netgraph node that is also a VFS layer? A filesystem view of the space of netgraph nodes?
  • If NFS can work over UDP, it can work over netgraph. You could have NFS disks remotely mounted via an ATM link, or simply do NFS over raw Ethernet and cut out the UDP middleman.
  • A ``programmable'' node type whose implementation would depend on its configuration using some kind of node pseudo-code.

Surely there are lots more crazy ideas we haven't thought of yet.

  1. white-cn
    March 12, 2009, 11:21 | #1

    Boss Bao, mighty Bao, great god Bao, I beg you, please stop ruining me...

  2. 小毛 (Xiao Mao)
    March 13, 2009, 11:31 | #2

    - -... just passing by
