Move callback server thread creation, initial processing, and destruction to RPC.
Clean up some RPC code.
Remove extraneous fields from nfs41_cb_info and clean up the code.
Change KM_SLEEP in mir_nfs41_callback_thread to KM_NOSLEEP.
Fix lint warnings.
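The KM_SLEEP to KM_NOSLEEP change matters to callers because a no-sleep kernel allocation can return NULL, so the result has to be checked. The sketch below illustrates that pattern in userland C only. Every name in it (`sketch_zalloc`, `SKETCH_KM_NOSLEEP`, `sketch_cb_state_t`, `sketch_start_callback`, `sketch_fail_next`) is a hypothetical stand-in, not code from this changeset or from mir_nfs41_callback_thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins for the kernel's KM_SLEEP/KM_NOSLEEP. */
#define	SKETCH_KM_SLEEP		0
#define	SKETCH_KM_NOSLEEP	1

/* Test hook (assumption): when nonzero, the next no-sleep request fails. */
static int sketch_fail_next;

/*
 * Stand-in for a zeroing kernel allocator. A no-sleep request may
 * return NULL; a sleeping request is modeled as always being served.
 */
static void *
sketch_zalloc(size_t size, int flag)
{
	if (flag == SKETCH_KM_NOSLEEP && sketch_fail_next) {
		sketch_fail_next = 0;
		return (NULL);
	}
	return (calloc(1, size));
}

/* Hypothetical per-thread state; not the real callback structure. */
typedef struct sketch_cb_state {
	int	cb_running;
} sketch_cb_state_t;

/*
 * Returns 0 on success, -1 if the allocation failed. On failure
 * nothing is started and *out is left untouched.
 */
static int
sketch_start_callback(sketch_cb_state_t **out)
{
	sketch_cb_state_t *st;

	st = sketch_zalloc(sizeof (*st), SKETCH_KM_NOSLEEP);
	if (st == NULL)
		return (-1);	/* a no-sleep caller must handle this path */
	st->cb_running = 1;
	*out = st;
	return (0);
}
```

With a sleeping allocation the NULL branch would be unreachable; with a no-sleep allocation it is the branch that keeps the thread from dereferencing a failed allocation.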
/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */
/*
 * Copyright 2009 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Copyright (c) 1983, 1984, 1985, 1986, 1987, 1988, 1989 AT&T
 * All Rights Reserved
 */

/*
 * Portions of this source code were derived from Berkeley 4.3 BSD
 * under license from the Regents of the University of California.
 */


/*
 * Implements a kernel based, client side RPC over Connection Oriented
 * Transports (COTS).
 */

/*
 * Much of this file has been re-written to let NFS work better over slow
 * transports. A description follows.
 *
 * One of the annoying things about kRPC/COTS is that it will temporarily
 * create more than one connection between a client and server. This
 * happens because when a connection is made, the end-point's entry in the
 * linked list of connections (headed by cm_hd), is removed so that other
 * threads don't mess with it. Went ahead and bit the bullet by keeping
 * the endpoint on the connection list and introducing state bits,
 * condition variables etc.
 * to the connection entry data structure (struct cm_xprt).
 *
 * Here is a summary of the changes to cm_xprt:
 *
 *	x_ctime is the timestamp of when the endpoint was last
 *	connected or disconnected. If an end-point is ever disconnected
 *	or re-connected, then any outstanding RPC request is presumed
 *	lost, telling clnt_cots_kcallit that it needs to re-send the
 *	request, not just wait for the original request's reply to
 *	arrive.
 *
 *	x_thread flag which tells us if a thread is doing a connection
 *	attempt.
 *
 *	x_waitdis flag which tells us we are waiting for a disconnect ACK.
 *
 *	x_needdis flag which tells us we need to send a T_DISCON_REQ
 *	to kill the connection.
 *
 *	x_needrel flag which tells us we need to send a T_ORDREL_REQ to
 *	gracefully close the connection.
 *
 *	#defined bitmasks for all the b_* bits so that more
 *	efficient (and at times less clumsy) masks can be used to
 *	manipulate state in cases where multiple bits have to be
 *	set/cleared/checked in the same critical section.
 *
 *	x_conn_cv and x_dis_cv are new condition variables to let
 *	threads know when the connection attempt is done, and to let
 *	the connecting thread know when the disconnect handshake is
 *	done.
 *
 * Added the CONN_HOLD() macro so that all reference holds have the same
 * look and feel.
 *
 * In the private (cku_private) portion of the client handle,
 *
 *	cku_flags replaces the cku_sent boolean. cku_flags keeps
 *	track of whether a request has been sent, and whether the
 *	client handle's call record is on the dispatch list (so that
 *	the reply can be matched by XID to the right client handle).
 *	The idea of CKU_ONQUEUE is that we can exit clnt_cots_kcallit()
 *	and still have the response find the right client handle so
 *	that the retry of CLNT_CALL() gets the result. Testing found
 *	situations where if the timeout was increased, performance
 *	degraded.
 *	This was due to us hitting a window where the thread
 *	was back in rfscall() (probably printing server not responding)
 *	while the response came back but had no place to go.
 *
 *	cku_ctime is just a cache of x_ctime. If they match,
 *	clnt_cots_kcallit() won't send a retry (unless the maximum
 *	receive count limit has been reached). If they don't match, then
 *	we assume the request has been lost, and a retry of the request
 *	is needed.
 *
 *	cku_recv_attempts counts the number of receive attempts
 *	after one try is sent on the wire.
 *
 * Added the clnt_delay() routine so that interruptible and
 * noninterruptible delays are possible.
 *
 * CLNT_MIN_TIMEOUT has been bumped to 10 seconds from 3. This is used to
 * control how long the client delays before returning after getting
 * ECONNREFUSED. At 3 seconds, 8 client threads per mount really does bash
 * a server that may be booting and not yet started nfsd.
 *
 * CLNT_MAXRECV_WITHOUT_RETRY is a new macro (value of 3) (with a tunable).
 * Why don't we just wait forever (receive an infinite # of times)?
 * Because the server may have rebooted. More insidious is that some
 * servers (ours) will drop NFS/TCP requests in some cases. This is bad,
 * but it is a reality.
 *
 * The case of a server doing orderly release really messes up the
 * client's recovery, especially if the server's TCP implementation is
 * buggy. It was found that the kRPC/COTS client was breaking some
 * TPI rules, such as not waiting for the acknowledgement of a
 * T_DISCON_REQ (hence the added case statements T_ERROR_ACK, T_OK_ACK and
 * T_DISCON_REQ in clnt_dispatch_notifyall()).
 *
 * One of the things that we've seen is that a kRPC TCP endpoint goes into
 * TIMEWAIT and thus a reconnect takes a long time to satisfy because
 * the TIMEWAIT state takes a while to finish.
 * If a server sends a T_ORDREL_IND, there is little point in an RPC
 * client doing a T_ORDREL_REQ, because the RPC request isn't going to
 * make it (the server is saying that it won't accept any more data).
 * So kRPC was changed to send a T_DISCON_REQ when we get a T_ORDREL_IND.
 * So now the connection skips the TIMEWAIT state and goes straight to a
 * bound state that kRPC can quickly switch to connected.
 *
 * Code that issues a TPI request must use waitforack() to wait for the
 * corresponding ack (assuming there is one) in any future modifications.
 * This works around problems that may be introduced by breaking TPI rules
 * (by submitting new calls before earlier requests have been acked) in the
 * case of a signal or other early return. waitforack() depends on
 * clnt_dispatch_notifyconn() to issue the wakeup when the ack
 * arrives, so adding new TPI calls may require corresponding changes
 * to clnt_dispatch_notifyconn(). Presently, the timeout period is based on
 * CLNT_MIN_TIMEOUT which is 10 seconds. If you modify this value, be sure
 * not to set it too low or TPI ACKs will be lost.
 */

#include <sys/param.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/systm.h>
#include <sys/sysmacros.h>
#include <sys/proc.h>
#include <sys/socket.h>
#include <sys/file.h>
#include <sys/stream.h>
#include <sys/strsubr.h>
#include <sys/stropts.h>
#include <sys/strsun.h>
#include <sys/timod.h>
#include <sys/tiuser.h>
#include <sys/tihdr.h>
#include <sys/t_kuser.h>
#include <sys/fcntl.h>
#include <sys/errno.h>
#include <sys/kmem.h>
#include <sys/debug.h>
#include <sys/systm.h>
#include <sys/kstat.h>
#include <sys/t_lock.h>
#include <sys/sdt.h>
#include <sys/ddi.h>
#include <sys/cmn_err.h>
#include <sys/time.h>
#include <sys/isa_defs.h>
#include <sys/callb.h>
#include <sys/sunddi.h>
#include <sys/atomic.h>
#include <sys/sdt.h>
#include <sys/list.h>
#include <sys/pathname.h>

#include <netinet/in.h>
#include <netinet/tcp.h>

#include <rpc/types.h>
#include <rpc/xdr.h>
#include <rpc/auth.h>
#include <rpc/clnt.h>
#include <rpc/rpc_msg.h>
#include <rpc/rpc_tags.h>

#include <rpc/svc.h>

#define	COTS_DEFAULT_ALLOCSIZE	2048

#define	WIRE_HDR_SIZE	20	/* serialized call header, sans proc number */
#define	MSG_OFFSET	128	/* offset of call into the mblk */

#define	RPC_SAME_TAG(p, cm_entry)	(clnt_connmgr_cmptag(p, cm_entry))

/*
 * Returns 0 if same
 */
#define	NETBUF_CMP(addr1, addr2)	((addr1->len == addr2->len) ? \
	bcmp(addr1->buf, addr2->buf, addr1->len) : 1)

/*
 * Bi-directional RPC by default
 */

int clnt_cots_birpc = 1;

const char *kinet_ntop6(uchar_t *, char *, size_t);

static int	clnt_cots_ksettimers(CLIENT *, struct rpc_timers *,
	struct rpc_timers *, int, void(*)(int, int, caddr_t), caddr_t, uint32_t);
static enum clnt_stat	clnt_cots_kcallit(CLIENT *, rpcproc_t, xdrproc_t,
	caddr_t, xdrproc_t, caddr_t, struct timeval);
static void	clnt_cots_kabort(CLIENT *);
static void	clnt_cots_kerror(CLIENT *, struct rpc_err *);
static bool_t	clnt_cots_kfreeres(CLIENT *, xdrproc_t, caddr_t);
static void	clnt_cots_kdestroy(CLIENT *);
static bool_t	clnt_cots_kcontrol(CLIENT *, int, char *);

/* Callback RPC */
static bool_t	connmgr_cb_totest(CLIENT *, void *);

/*
 * Global list of connmgr tags
 */
static rpc_tag_hd_t cm_tag_hd;

/* List of transports managed by the connection manager. */
struct cm_xprt {
	TIUSER		*x_tiptr;	/* transport handle */
	queue_t		*x_wq;		/* send queue */
	clock_t		x_time;		/* last time we handed this xprt out */
	clock_t		x_ctime;	/* time we went to CONNECTED */
	int		x_tidu_size;	/* TIDU size of this transport */
	union {
	    struct {
		unsigned int
#ifdef	_BIT_FIELDS_HTOL
		b_closing:	1,	/* we've sent an ord rel on this conn */
		b_dead:		1,	/* transport is closed or disconn */
		b_doomed:	1,	/* too many conns, let this go idle */
		b_connected:	1,	/* this connection is connected */

		b_ordrel:	1,	/* do an orderly release? */
		b_thread:	1,	/* thread doing connect */
		b_waitdis:	1,	/* waiting for disconnect ACK */
		b_needdis:	1,	/* need T_DISCON_REQ */

		b_needrel:	1,	/* need T_ORDREL_REQ */
		b_early_disc:	1,	/* got a T_ORDREL_IND or T_DISCON_IND */
					/* disconnect during connect */
		b_cb_tested:	1,	/* server side cb conn only */
		b_cb:		1,	/* client side cb conn */

		b_pad:		20;

#endif

#ifdef	_BIT_FIELDS_LTOH
		b_pad:		20,

		b_cb:		1,	/* client side cb conn */
		b_cb_tested:	1,	/* server side cb conn only */
		b_early_disc:	1,	/* got a T_ORDREL_IND or T_DISCON_IND */
					/* disconnect during connect */
		b_needrel:	1,	/* need T_ORDREL_REQ */

		b_needdis:	1,	/* need T_DISCON_REQ */
		b_waitdis:	1,	/* waiting for disconnect ACK */
		b_thread:	1,	/* thread doing connect */
		b_ordrel:	1,	/* do an orderly release? */

		b_connected:	1,	/* this connection is connected */
		b_doomed:	1,	/* too many conns, let this go idle */
		b_dead:		1,	/* transport is closed or disconn */
		b_closing:	1;	/* we've sent an ord rel on this conn */
#endif
	    } bit;
	    unsigned int word;

#define	x_closing	x_state.bit.b_closing
#define	x_dead		x_state.bit.b_dead
#define	x_doomed	x_state.bit.b_doomed
#define	x_connected	x_state.bit.b_connected

#define	x_ordrel	x_state.bit.b_ordrel
#define	x_thread	x_state.bit.b_thread
#define	x_waitdis	x_state.bit.b_waitdis
#define	x_needdis	x_state.bit.b_needdis

#define	x_needrel	x_state.bit.b_needrel
#define	x_early_disc	x_state.bit.b_early_disc
#define	x_cb_tested	x_state.bit.b_cb_tested
#define	x_cb		x_state.bit.b_cb

#define	x_state_flags	x_state.word

#define	X_CLOSING	0x80000000
#define	X_DEAD		0x40000000
#define	X_DOOMED	0x20000000
#define	X_CONNECTED	0x10000000

#define	X_ORDREL	0x08000000
#define	X_THREAD	0x04000000
#define	X_WAITDIS	0x02000000
#define	X_NEEDDIS	0x01000000

#define	X_NEEDREL	0x00800000
#define	X_EARLYDISC	0x00400000
#define	X_CBTESTED	0x00200000
#define	X_CB		0x00100000

#define	X_BADSTATES	(X_CLOSING | X_DEAD | X_DOOMED)

	}		x_state;
	int		x_ref;		/* number of users of this xprt */
	int		x_family;	/* address family of transport */
	dev_t		x_rdev;		/* device number of transport */
	struct cm_xprt	*x_next;

	struct netbuf	x_server;	/* destination address */
	struct netbuf	x_src;		/* src address (for retries) */
	kmutex_t	x_lock;		/* lock on this entry */
	kcondvar_t	x_cv;		/* to signal when can be closed */
	kcondvar_t	x_conn_cv;	/* to signal when connection attempt */
					/* is complete */
	kstat_t		*x_ksp;

	kcondvar_t	x_dis_cv;	/* to signal when disconnect attempt */
					/* is complete */
	zoneid_t	x_zoneid;	/* zone this xprt belongs to */
	rpcprog_t	x_prog;		/* Program number for incoming calls */
	void		*x_tags;	/* tags list for this xprt */
};

typedef struct cm_kstat_xprt {
	kstat_named_t	x_wq;
	kstat_named_t	x_server;
	kstat_named_t	x_family;
	kstat_named_t	x_rdev;
	kstat_named_t	x_time;
	kstat_named_t	x_state;
	kstat_named_t	x_ref;
	kstat_named_t	x_port;
} cm_kstat_xprt_t;

static cm_kstat_xprt_t cm_kstat_template = {
	{ "write_queue", KSTAT_DATA_UINT32 },
	{ "server",	KSTAT_DATA_STRING },
	{ "addr_family", KSTAT_DATA_UINT32 },
	{ "device",	KSTAT_DATA_UINT32 },
	{ "time_stamp",	KSTAT_DATA_UINT32 },
	{ "status",	KSTAT_DATA_UINT32 },
	{ "ref_count",	KSTAT_DATA_INT32 },
	{ "port",	KSTAT_DATA_UINT32 },
};

/*
 * The inverse of this is connmgr_release().
 */
#define	CONN_HOLD(Cm_entry)	{\
	mutex_enter(&(Cm_entry)->x_lock);	\
	(Cm_entry)->x_ref++;	\
	mutex_exit(&(Cm_entry)->x_lock);	\
}


/*
 * Private data per rpc handle. This structure is allocated by
 * clnt_cots_kcreate, and freed by clnt_cots_kdestroy.
 */
typedef struct cku_private_s {
	CLIENT			cku_client;	/* client handle */
	calllist_t		cku_call;	/* for dispatching calls */
	struct rpc_err		cku_err;	/* error status */

	struct netbuf		cku_srcaddr;	/* source address for retries */
	int			cku_addrfmly;	/* for binding port */
	struct netbuf		cku_addr;	/* remote address */
	dev_t			cku_device;	/* device to use */
	uint_t			cku_flags;
#define	CKU_ONQUEUE	0x1
#define	CKU_SENT	0x2
#define	CKU_CALLBACK	0x4	/* server side call back */
#define	CKU_BC_SETUP	0x8	/* init back channel info */
#define	CKU_BACKCHANNEL	0x10	/* client side back channel */
#define	CKU_TAGCMP	0x20	/* turns on tag check */
#define	CKU_CB_TEST	0x40	/* cb test clnt handle */
#define	CKU_BIND_CONN	0x80	/* bind conn to tag */

	bool_t			cku_progress;	/* for CLSET_PROGRESS */
	uint32_t		cku_xid;	/* current XID */
	clock_t			cku_ctime;	/* time stamp of when */
						/* connection was created */
	uint_t			cku_recv_attempts;
	XDR			cku_outxdr;	/* xdr routine for output */
	XDR			cku_inxdr;	/* xdr routine for input */
	char			cku_rpchdr[WIRE_HDR_SIZE + 4];
						/* pre-serialized rpc header */

	uint_t			cku_outbuflen;	/* default output mblk length */
	struct cred		*cku_cred;	/* credentials */
	bool_t			cku_nodelayonerr;
						/* for CLSET_NODELAYONERR */
	int			cku_useresvport; /* Use reserved port */
	struct rpc_cots_client	*cku_stats;	/* stats for zone */
	struct cm_xprt		*cku_entry;	/* Callback Connection Info */
	CBSERVER_ARGS		*cku_cb;	/* For Callback Info */
	tagid			cku_tag;
} cku_private_t;

static struct cm_xprt *connmgr_wrapconnect(struct cm_xprt *,
	const struct timeval *, struct netbuf *, int, struct netbuf *,
	struct rpc_err *, bool_t, bool_t);

static bool_t	connmgr_connect(struct cm_xprt *, queue_t *, struct netbuf *,
	int, calllist_t *, int *, bool_t reconnect,
	const struct timeval *, bool_t);

static bool_t	connmgr_setopt(queue_t *, int, int, calllist_t *);
static void	connmgr_sndrel(struct cm_xprt *);
static void	connmgr_snddis(struct cm_xprt *);
static void	connmgr_close(struct cm_xprt *);
static void	connmgr_release(struct cm_xprt *);
static struct cm_xprt *connmgr_wrapget(struct netbuf *, const struct timeval *,
	cku_private_t *);

static struct cm_xprt *connmgr_get(struct netbuf *, const struct timeval *,
	cku_private_t *);

static void connmgr_cancelconn(struct cm_xprt *);
static enum clnt_stat connmgr_cwait(struct cm_xprt *, const struct timeval *,
	bool_t);
static void connmgr_dis_and_wait(struct cm_xprt *);

static int	clnt_dispatch_send(queue_t *, mblk_t *, calllist_t *, uint_t,
	uint_t);

static int clnt_delay(clock_t, bool_t);

static int waitforack(calllist_t *, t_scalar_t, const struct timeval *, bool_t);
static bool_t connmgr_tag_swap(cku_private_t *, void *);
static void connmgr_tag_unbind(cku_private_t *);
static void connmgr_tag_destroy(cku_private_t *, char *);
static int connmgr_clrcbinfo(cku_private_t *, void *);

extern void mir_set_cbinfo(queue_t *, void *);
extern SVCCB *mir_get_svccb(queue_t *);

/*
 * Operations vector for TCP/IP based RPC
 */
static struct clnt_ops tcp_ops = {
	clnt_cots_kcallit,	/* do rpc call */
	clnt_cots_kabort,	/* abort call */
	clnt_cots_kerror,	/* return error status */
	clnt_cots_kfreeres,	/* free results */
	clnt_cots_kdestroy,	/* destroy rpc handle */
	clnt_cots_kcontrol,	/* the ioctl() of rpc */
	clnt_cots_ksettimers,	/* set retry timers */
};

static int rpc_kstat_instance = 0;	/* keeps the current instance */
					/* number for the next kstat_create */

static struct cm_xprt *cm_hd = NULL;
static struct cm_xprt *cb_cm_hd = NULL;	/* callback list */

static kmutex_t connmgr_lock;	/* for connmgr's list of transports */
static kmutex_t connmgr_cb_lock;	/* for connmgr's callback list */

extern kmutex_t clnt_max_msg_lock;

static calllist_t *clnt_pending = NULL;
extern kmutex_t clnt_pending_lock;

static int clnt_cots_hash_size = DEFAULT_HASH_SIZE;

static call_table_t *cots_call_ht;

static const struct rpc_cots_client {
	kstat_named_t	rccalls;
	kstat_named_t	rcbadcalls;
	kstat_named_t	rcbadxids;
	kstat_named_t	rctimeouts;
	kstat_named_t	rcnewcreds;
	kstat_named_t	rcbadverfs;
	kstat_named_t	rctimers;
	kstat_named_t	rccantconn;
	kstat_named_t	rcnomem;
	kstat_named_t	rcintrs;
} cots_rcstat_tmpl = {
	{ "calls",	KSTAT_DATA_UINT64 },
	{ "badcalls",	KSTAT_DATA_UINT64 },
	{ "badxids",	KSTAT_DATA_UINT64 },
	{ "timeouts",	KSTAT_DATA_UINT64 },
	{ "newcreds",	KSTAT_DATA_UINT64 },
	{ "badverfs",	KSTAT_DATA_UINT64 },
	{ "timers",	KSTAT_DATA_UINT64 },
	{ "cantconn",	KSTAT_DATA_UINT64 },
	{ "nomem",	KSTAT_DATA_UINT64 },
	{ "interrupts",	KSTAT_DATA_UINT64 }
};

#define	COTSRCSTAT_INCR(p, x)	\
	if (p != NULL)	\
		atomic_add_64(&(p)->x.value.ui64, 1)

#define	CLNT_MAX_CONNS	1	/* concurrent connections between clnt/srvr */
static int clnt_max_conns = CLNT_MAX_CONNS;

#define	CLNT_MIN_TIMEOUT	10	/* seconds to wait after we get a */
					/* connection reset */
#define	CLNT_MIN_CONNTIMEOUT	5	/* seconds to wait for a connection */


static int clnt_cots_min_tout = CLNT_MIN_TIMEOUT;
static int clnt_cots_min_conntout = CLNT_MIN_CONNTIMEOUT;

/*
 * Limit the number of times we will attempt to receive a reply without
 * re-sending the request.
 */
#define	CLNT_MAXRECV_WITHOUT_RETRY	3
static uint_t clnt_cots_maxrecv	= CLNT_MAXRECV_WITHOUT_RETRY;

uint_t *clnt_max_msg_sizep;
void (*clnt_stop_idle)(queue_t *wq);

#define	ptoh(p)		(&((p)->cku_client))
#define	htop(h)		((cku_private_t *)((h)->cl_private))

/*
 * Times to retry
 */
#define	REFRESHES	2	/* authentication refreshes */

/*
 * The following is used to determine the global default behavior for
 * COTS when binding to a local port.
 *
 * If the value is set to 1 the default will be to select a reserved
 * (aka privileged) port, if the value is zero the default will be to
 * use non-reserved ports. Users of kRPC may override this by using
 * CLNT_CONTROL() and CLSET_BINDRESVPORT.
 */
static int clnt_cots_do_bindresvport = 1;

static zone_key_t zone_cots_key;

/*
 * We need to do this after all kernel threads in the zone have exited.
 */
/* ARGSUSED */
static void
clnt_zone_destroy(zoneid_t zoneid, void *unused)
{
	struct cm_xprt **cmp;
	struct cm_xprt *cm_entry;
	struct cm_xprt *freelist = NULL;

	mutex_enter(&connmgr_lock);
	cmp = &cm_hd;
	while ((cm_entry = *cmp) != NULL) {
		if (cm_entry->x_zoneid == zoneid) {
			*cmp = cm_entry->x_next;
			cm_entry->x_next = freelist;
			freelist = cm_entry;
		} else {
			cmp = &cm_entry->x_next;
		}
	}
	mutex_exit(&connmgr_lock);
	while ((cm_entry = freelist) != NULL) {
		freelist = cm_entry->x_next;
		connmgr_close(cm_entry);
	}
}

int
clnt_cots_kcreate(dev_t dev, struct netbuf *addr, int family, rpcprog_t prog,
	rpcvers_t vers, uint_t max_msgsize, cred_t *cred, CLIENT **ncl)
{
	CLIENT *h;
	cku_private_t *p;
	struct rpc_msg call_msg;
	struct rpcstat *rpcstat;

	RPCLOG(8, "clnt_cots_kcreate: prog %u\n", prog);

	rpcstat = zone_getspecific(rpcstat_zone_key, rpc_zone());
	ASSERT(rpcstat != NULL);

	/* Allocate and initialize the client handle. */
	p = kmem_zalloc(sizeof (*p), KM_SLEEP);

	h = ptoh(p);

	h->cl_private = (caddr_t)p;
	h->cl_auth = authkern_create();
	h->cl_ops = &tcp_ops;

	cv_init(&p->cku_call.call_cv, NULL, CV_DEFAULT, NULL);
	mutex_init(&p->cku_call.call_lock, NULL, MUTEX_DEFAULT, NULL);

	/*
	 * If the current sanity check size in rpcmod is smaller
	 * than the size needed, then increase the sanity check.
	 */
	if (max_msgsize != 0 && clnt_max_msg_sizep != NULL &&
	    max_msgsize > *clnt_max_msg_sizep) {
		mutex_enter(&clnt_max_msg_lock);
		if (max_msgsize > *clnt_max_msg_sizep)
			*clnt_max_msg_sizep = max_msgsize;
		mutex_exit(&clnt_max_msg_lock);
	}

	p->cku_outbuflen = COTS_DEFAULT_ALLOCSIZE;

	/* Preserialize the call message header */

	call_msg.rm_xid = 0;
	call_msg.rm_direction = CALL;
	call_msg.rm_call.cb_rpcvers = RPC_MSG_VERSION;
	call_msg.rm_call.cb_prog = prog;
	call_msg.rm_call.cb_vers = vers;

	xdrmem_create(&p->cku_outxdr, p->cku_rpchdr, WIRE_HDR_SIZE, XDR_ENCODE);

	if (!xdr_callhdr(&p->cku_outxdr, &call_msg)) {
		RPCLOG0(1, "clnt_cots_kcreate - Fatal header serialization "
		    "error\n");
		auth_destroy(h->cl_auth);
		kmem_free(p, sizeof (cku_private_t));
		RPCLOG0(1, "clnt_cots_kcreate: create failed error EINVAL\n");
		return (EINVAL);		/* XXX */
	}

	/*
	 * The zalloc initialized the fields below.
	 * p->cku_xid = 0;
	 * p->cku_flags = 0;
	 * p->cku_srcaddr.len = 0;
	 * p->cku_srcaddr.maxlen = 0;
	 */

	p->cku_cred = cred;
	p->cku_device = dev;
	p->cku_addrfmly = family;
	if (addr) {
		p->cku_addr.buf = kmem_zalloc(addr->maxlen, KM_SLEEP);
		p->cku_addr.maxlen = addr->maxlen;
		p->cku_addr.len = addr->len;
		bcopy(addr->buf, p->cku_addr.buf, addr->len);
	} else {
		/*
		 * Only valid for a callback client handle
		 * (the connection is picked up via tags).
		 */
		p->cku_addr.buf = NULL;
		p->cku_addr.maxlen = 0;
		p->cku_addr.len = 0;
	}

	p->cku_stats = rpcstat->rpc_cots_client;
	p->cku_useresvport = -1; /* value has not been set */

	*ncl = h;
	return (0);
}

/*ARGSUSED*/
static void
clnt_cots_kabort(CLIENT *h)
{
}

/*
 * Return error info on this handle.
 */
static void
clnt_cots_kerror(CLIENT *h, struct rpc_err *err)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);

	*err = p->cku_err;
}

static bool_t
clnt_cots_kfreeres(CLIENT *h, xdrproc_t xdr_res, caddr_t res_ptr)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);
	XDR *xdrs;

	xdrs = &(p->cku_outxdr);
	xdrs->x_op = XDR_FREE;
	return ((*xdr_res)(xdrs, res_ptr));
}

static bool_t
clnt_cots_kcontrol(CLIENT *h, int cmd, char *arg)
{
	cku_private_t *p = htop(h);

	switch (cmd) {
	case CLSET_PROGRESS:
		p->cku_progress = TRUE;
		return (TRUE);

	case CLSET_XID:
		if (arg == NULL)
			return (FALSE);

		p->cku_xid = *((uint32_t *)arg);
		return (TRUE);

	case CLGET_XID:
		if (arg == NULL)
			return (FALSE);

		*((uint32_t *)arg) = p->cku_xid;
		return (TRUE);

	case CLSET_NODELAYONERR:
		if (arg == NULL)
			return (FALSE);

		if (*((bool_t *)arg) == TRUE) {
			p->cku_nodelayonerr = TRUE;
			return (TRUE);
		}
		if (*((bool_t *)arg) == FALSE) {
			p->cku_nodelayonerr = FALSE;
			return (TRUE);
		}
		return (FALSE);

	case CLGET_NODELAYONERR:
		if (arg == NULL)
			return (FALSE);

		*((bool_t *)arg) = p->cku_nodelayonerr;
		return (TRUE);

	case CLSET_BINDRESVPORT:
		if (arg == NULL)
			return (FALSE);

		if (*(int *)arg != 1 && *(int *)arg != 0)
			return (FALSE);

		p->cku_useresvport = *(int *)arg;

		return (TRUE);

	case CLGET_BINDRESVPORT:
		if (arg == NULL)
			return (FALSE);

		*(int *)arg = p->cku_useresvport;

		return (TRUE);

	case CLSET_CBCLIENT:
		p->cku_flags |= CKU_CALLBACK;
		return (TRUE);

	case CLSET_CBSERVER_SETUP:
		p->cku_flags |= CKU_BC_SETUP;
		p->cku_cb = (CBSERVER_ARGS *)arg;
		return (TRUE);

	case CLSET_CBSERVER_CLEANUP:
		return (connmgr_clrcbinfo(p, (void *)arg));

	case CLSET_BACKCHANNEL:
		p->cku_flags |= CKU_BACKCHANNEL;
		return (TRUE);

	case CLSET_BACKCHANNEL_CLEAR:
		/*
		 * Clears both the backchannel related
		 * flags
		 */
		p->cku_flags &= ~CKU_BACKCHANNEL;
		p->cku_flags &= ~CKU_BC_SETUP;
		p->cku_cb = NULL;
		return (TRUE);

	case CLSET_TAG:
		p->cku_flags |= CKU_TAGCMP;
		bcopy(arg, p->cku_tag, sizeof (tagid));
		return (TRUE);

	case CLSET_TAG_CLEAR:
		p->cku_flags &= ~CKU_TAGCMP;
		bzero(p->cku_tag, sizeof (tagid));
		return (TRUE);

	case CLSET_TAG_SWAP:
		return (connmgr_tag_swap(p, arg));

	case CLSET_CB_TEST:
		p->cku_flags |= CKU_CB_TEST;
		return (TRUE);

	case CLGET_CB_UNTESTED:
		return (connmgr_cb_totest(h, arg));

	case CLSET_CB_TEST_CLEAR:
		p->cku_flags &= ~CKU_CB_TEST;
		return (TRUE);

	case CLSET_NON_BIRPC:
		clnt_cots_birpc = 0;
		return (TRUE);

	case CLSET_CBSERVER_CLEAR:
		p->cku_flags &= ~CKU_BC_SETUP;
		return (TRUE);

	case CLSET_BINDCONN_TO_TAG:
		p->cku_flags |= CKU_BIND_CONN;
		return (TRUE);

	case CLSET_CLEAR_BINDCONN:
		p->cku_flags &= ~CKU_BIND_CONN;
		return (TRUE);

	case CLSET_TAG_CONN_UNBIND:
		connmgr_tag_unbind(p);
		return (TRUE);

	case CLSET_TAG_DESTROY:
		rpc_destroy_tag(&cm_tag_hd, (void *)arg);
		return (TRUE);

	default:
		return (FALSE);

	}
}

/*
 * Destroy rpc handle. Frees the space used for output buffer,
 * private data, and handle structure.
 */
static void
clnt_cots_kdestroy(CLIENT *h)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);
	calllist_t *call = &p->cku_call;

	RPCLOG(8, "clnt_cots_kdestroy h: %p\n", (void *)h);
	RPCLOG(8, "clnt_cots_kdestroy h: xid=0x%x\n", p->cku_xid);

	if (p->cku_flags & CKU_ONQUEUE) {
		RPCLOG(64, "clnt_cots_kdestroy h: removing call for xid 0x%x "
		    "from dispatch list\n", p->cku_xid);
		call_table_remove(call);
	}

	if (call->call_reply)
		freemsg(call->call_reply);
	cv_destroy(&call->call_cv);
	mutex_destroy(&call->call_lock);

	kmem_free(p->cku_srcaddr.buf, p->cku_srcaddr.maxlen);
	kmem_free(p->cku_addr.buf, p->cku_addr.maxlen);
	kmem_free(p, sizeof (*p));
}

static int clnt_cots_pulls;
#define	RM_HDR_SIZE	4	/* record mark header size */

/*
 * Call remote procedure.
 */
static enum clnt_stat
clnt_cots_kcallit(CLIENT *h, rpcproc_t procnum, xdrproc_t xdr_args,
    caddr_t argsp, xdrproc_t xdr_results, caddr_t resultsp, struct timeval wait)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);
	calllist_t *call = &p->cku_call;
	XDR *xdrs;
	struct rpc_msg reply_msg;
	mblk_t *mp;
#ifdef	RPCDEBUG
	clock_t time_sent;
#endif
	struct netbuf *retryaddr;
	struct cm_xprt *cm_entry = NULL;
	queue_t *wq;
	int len;
	int mpsize;
	int refreshes = REFRESHES;
	int interrupted;
	int tidu_size;
	enum clnt_stat status;
	struct timeval cwait;
	bool_t delay_first = FALSE;
	clock_t ticks;

	RPCLOG(2, "clnt_cots_kcallit, procnum %u\n", procnum);
	COTSRCSTAT_INCR(p->cku_stats, rccalls);

	RPCLOG(2, "clnt_cots_kcallit: wait.tv_sec: %ld\n", wait.tv_sec);
	RPCLOG(2, "clnt_cots_kcallit: wait.tv_usec: %ld\n", wait.tv_usec);

	/*
	 * Bug ID 1240234:
	 * Look out for zero length timeouts. We don't want to
	 * wait zero seconds for a connection to be established.
	 */
	if (wait.tv_sec < clnt_cots_min_conntout) {
		cwait.tv_sec = clnt_cots_min_conntout;
		cwait.tv_usec = 0;
		RPCLOG(8, "clnt_cots_kcallit: wait.tv_sec (%ld) too low,",
		    wait.tv_sec);
		RPCLOG(8, " setting to: %d\n", clnt_cots_min_conntout);
	} else {
		cwait = wait;
	}

call_again:
	if (cm_entry) {
		connmgr_release(cm_entry);
		cm_entry = NULL;
	}

	mp = NULL;

	/*
	 * If the call is not a retry, allocate a new xid and cache it
	 * for future retries.
	 * Bug ID 1246045:
	 * Treat call as a retry for purposes of binding the source
	 * port only if we actually attempted to send anything on
	 * the previous call.
	 */
	if (p->cku_xid == 0) {
		p->cku_xid = alloc_xid();
		call->call_zoneid = rpc_zoneid();

		/*
		 * We need to ASSERT here that our xid != 0 because this
		 * determines whether or not our call record gets placed on
		 * the hash table or the linked list. By design, we mandate
		 * that RPC calls over cots must have xid's != 0, so we can
		 * ensure proper management of the hash table.
		 */
		ASSERT(p->cku_xid != 0);

		retryaddr = NULL;
		p->cku_flags &= ~CKU_SENT;

		if (p->cku_flags & CKU_ONQUEUE) {
			RPCLOG(8, "clnt_cots_kcallit: new call, dequeuing old"
			    " one (%p)\n", (void *)call);
			call_table_remove(call);
			p->cku_flags &= ~CKU_ONQUEUE;
			RPCLOG(64, "clnt_cots_kcallit: removing call from "
			    "dispatch list because xid was zero (now 0x%x)\n",
			    p->cku_xid);
		}

		if (call->call_reply != NULL) {
			freemsg(call->call_reply);
			call->call_reply = NULL;
		}
	} else if (p->cku_srcaddr.buf == NULL || p->cku_srcaddr.len == 0) {
		retryaddr = NULL;

	} else if (p->cku_flags & CKU_SENT) {
		retryaddr = &p->cku_srcaddr;

	} else {
		/*
		 * Bug ID 1246045: Nothing was sent, so set retryaddr to
		 * NULL and let connmgr_get() bind to any source port it
		 * can get.
		 */
		retryaddr = NULL;
	}

	RPCLOG(64, "clnt_cots_kcallit: xid = 0x%x", p->cku_xid);
	RPCLOG(64, " flags = 0x%x\n", p->cku_flags);

	p->cku_err.re_status = RPC_TIMEDOUT;
	p->cku_err.re_errno = p->cku_err.re_terrno = 0;

	cm_entry = connmgr_wrapget(retryaddr, &cwait, p);

	if (cm_entry == NULL) {
		RPCLOG(1, "clnt_cots_kcallit: can't connect status %s\n",
		    clnt_sperrno(p->cku_err.re_status));

		/*
		 * The reasons why we fail to create a connection are
		 * varied. In most cases we don't want the caller to
		 * immediately retry. This could have one or more
		 * bad effects. This includes flooding the net with
		 * connect requests to ports with no listener; a hard
		 * kernel loop due to all the "reserved" TCP ports being
		 * in use.
		 */
		delay_first = TRUE;

		/*
		 * Even if we end up returning EINTR, we still count a
		 * "can't connect", because the connection manager
		 * might have been committed to waiting for or timing out on
		 * a connection.
		 */
		COTSRCSTAT_INCR(p->cku_stats, rccantconn);
		switch (p->cku_err.re_status) {
		case RPC_INTR:
			p->cku_err.re_errno = EINTR;

			/*
			 * No need to delay because a UNIX signal(2)
			 * interrupted us. The caller likely won't
			 * retry the CLNT_CALL() and even if it does,
			 * we assume the caller knows what it is doing.
			 */
			delay_first = FALSE;
			break;

		case RPC_TIMEDOUT:
			p->cku_err.re_errno = ETIMEDOUT;

			/*
			 * No need to delay because timed out already
			 * on the connection request and assume that the
			 * transport time out is longer than our minimum
			 * timeout, or at least not too much smaller.
			 */
			delay_first = FALSE;
			break;

		case RPC_SYSTEMERROR:
		case RPC_TLIERROR:
			/*
			 * We want to delay here because a transient
			 * system error has a better chance of going away
			 * if we delay a bit.
			 * If it's not transient, then
			 * we don't want to end up in a hard kernel loop
			 * due to retries.
			 */
			ASSERT(p->cku_err.re_errno != 0);
			break;


		case RPC_CANTCONNECT:
			/*
			 * RPC_CANTCONNECT is set on T_ERROR_ACK which
			 * implies some error down in the TCP layer or
			 * below. If cku_nodelayonerr is set then we
			 * assume the caller knows not to try too hard.
			 */
			RPCLOG0(8, "clnt_cots_kcallit: connection failed,");
			RPCLOG0(8, " re_status=RPC_CANTCONNECT,");
			RPCLOG(8, " re_errno=%d,", p->cku_err.re_errno);
			RPCLOG(8, " cku_nodelayonerr=%d", p->cku_nodelayonerr);
			if (p->cku_nodelayonerr == TRUE)
				delay_first = FALSE;

			p->cku_err.re_errno = EIO;

			break;

		case RPC_XPRTFAILED:
			/*
			 * We want to delay here because we likely
			 * got a refused connection.
			 */
			if (p->cku_err.re_errno == 0)
				p->cku_err.re_errno = EIO;

			RPCLOG(1, "clnt_cots_kcallit: transport failed: %d\n",
			    p->cku_err.re_errno);

			break;

		case RPC_CONN_NOT_BOUND:

			delay_first = FALSE;
			break;

		default:
			/*
			 * We delay here because it is better to err
			 * on the side of caution. If we got here then
			 * status could have been RPC_SUCCESS, but we
			 * know that we did not get a connection, so
			 * force the rpc status to RPC_CANTCONNECT.
			 */
			p->cku_err.re_status = RPC_CANTCONNECT;
			p->cku_err.re_errno = EIO;
			break;
		}
		if (delay_first == TRUE)
			ticks = clnt_cots_min_tout * drv_usectohz(1000000);
		goto cots_done;
	}

	/*
	 * If we've never sent any request on this connection (send count
	 * is zero, or the connection has been reset), cache the
	 * connection's create time and send a request (possibly a retry).
	 */
	if ((p->cku_flags & CKU_SENT) == 0 ||
	    p->cku_ctime != cm_entry->x_ctime) {
		p->cku_ctime = cm_entry->x_ctime;

	} else if ((p->cku_flags & CKU_SENT) && (p->cku_flags & CKU_ONQUEUE) &&
	    (call->call_reply != NULL ||
	    p->cku_recv_attempts < clnt_cots_maxrecv)) {

		/*
		 * If we've sent a request and our call is on the dispatch
		 * queue and we haven't made too many receive attempts, then
		 * don't re-send, just receive.
		 */
		p->cku_recv_attempts++;
		goto read_again;
	}

	/*
	 * Now we create the RPC request in a STREAMS message. We have to do
	 * this after the call to connmgr_get so that we have the correct
	 * TIDU size for the transport.
	 */
	tidu_size = cm_entry->x_tidu_size;
	len = MSG_OFFSET + MAX(tidu_size, RM_HDR_SIZE + WIRE_HDR_SIZE);

	while ((mp = allocb(len, BPRI_MED)) == NULL) {
		if (strwaitbuf(len, BPRI_MED)) {
			p->cku_err.re_status = RPC_SYSTEMERROR;
			p->cku_err.re_errno = ENOSR;
			COTSRCSTAT_INCR(p->cku_stats, rcnomem);
			goto cots_done;
		}
	}
	xdrs = &p->cku_outxdr;
	xdrmblk_init(xdrs, mp, XDR_ENCODE, tidu_size);
	mpsize = MBLKSIZE(mp);
	ASSERT(mpsize >= len);
	ASSERT(mp->b_rptr == mp->b_datap->db_base);

	/*
	 * If the size of mblk is not appreciably larger than what we
	 * asked, then resize the mblk to exactly len bytes.
	 * The reason for
	 * this: suppose len is 1600 bytes, the tidu is 1460 bytes
	 * (from TCP over ethernet), and the arguments to the RPC require
	 * 2800 bytes. Ideally we want the protocol to render two
	 * ~1400 byte segments over the wire. However if allocb() gives us a 2k
	 * mblk, and we allocate a second mblk for the remainder, the protocol
	 * module may generate 3 segments over the wire:
	 * 1460 bytes for the first, 448 (2048 - 1600) for the second, and
	 * 892 for the third. If we "waste" 448 bytes in the first mblk,
	 * the XDR encoding will generate two ~1400 byte mblks, and the
	 * protocol module is more likely to produce properly sized segments.
	 */
	if ((mpsize >> 1) <= len)
		mp->b_rptr += (mpsize - len);

	/*
	 * Adjust b_rptr to reserve space for the non-data protocol headers
	 * any downstream modules might like to add, and for the
	 * record marking header.
	 */
	mp->b_rptr += (MSG_OFFSET + RM_HDR_SIZE);

	if (h->cl_auth->ah_cred.oa_flavor != RPCSEC_GSS) {
		/* Copy in the preserialized RPC header information. */
		bcopy(p->cku_rpchdr, mp->b_rptr, WIRE_HDR_SIZE);

		/* Use XDR_SETPOS() to set the b_wptr to past the RPC header. */
		XDR_SETPOS(xdrs, (uint_t)(mp->b_rptr - mp->b_datap->db_base +
		    WIRE_HDR_SIZE));

		ASSERT((mp->b_wptr - mp->b_rptr) == WIRE_HDR_SIZE);

		/* Serialize the procedure number and the arguments. */
		if ((!XDR_PUTINT32(xdrs, (int32_t *)&procnum)) ||
		    (!AUTH_MARSHALL(h->cl_auth, xdrs, p->cku_cred)) ||
		    (!(*xdr_args)(xdrs, argsp))) {
			p->cku_err.re_status = RPC_CANTENCODEARGS;
			p->cku_err.re_errno = EIO;
			goto cots_done;
		}
		(*(uint32_t *)(mp->b_rptr)) = p->cku_xid;
	} else {
		uint32_t *uproc = (uint32_t *)&p->cku_rpchdr[WIRE_HDR_SIZE];
		IXDR_PUT_U_INT32(uproc, procnum);

		(*(uint32_t *)(&p->cku_rpchdr[0])) = p->cku_xid;

		/* Use XDR_SETPOS() to set the b_wptr.
		 */
		XDR_SETPOS(xdrs, (uint_t)(mp->b_rptr - mp->b_datap->db_base));

		/* Serialize the procedure number and the arguments. */
		if (!AUTH_WRAP(h->cl_auth, p->cku_rpchdr, WIRE_HDR_SIZE+4,
		    xdrs, xdr_args, argsp)) {
			p->cku_err.re_status = RPC_CANTENCODEARGS;
			p->cku_err.re_errno = EIO;
			goto cots_done;
		}
	}

	RPCLOG(2, "clnt_cots_kcallit: connected, sending call, tidu_size %d\n",
	    tidu_size);

	wq = cm_entry->x_wq;
	status = clnt_dispatch_send(wq, mp, call, p->cku_xid,
	    (p->cku_flags & CKU_ONQUEUE));

	if (status == RPC_CANTSEND) {
		p->cku_err.re_status = status;
		p->cku_err.re_errno = EIO;
		DTRACE_PROBE(krpc__e__clntcots__kcallit__cantsend);

		/*
		 * Allow for processing of the QFULL queue.
		 */
		delay_first = TRUE;
		ticks = clnt_cots_min_tout * drv_usectohz(1000000);

		goto cots_done;
	}

	RPCLOG(64, "clnt_cots_kcallit: sent call for xid 0x%x\n",
	    (uint_t)p->cku_xid);
	p->cku_flags |= (CKU_ONQUEUE|CKU_SENT);
	p->cku_recv_attempts = 1;

#ifdef RPCDEBUG
	time_sent = lbolt;
#endif

	/*
	 * Wait for a reply or a timeout. If there is no error or timeout,
	 * (both indicated by call_status), call->call_reply will contain
	 * the RPC reply message.
	 */
read_again:
	mutex_enter(&call->call_lock);
	interrupted = 0;
	if (call->call_status == RPC_TIMEDOUT) {
		/*
		 * Indicate that the lwp is not to be stopped while waiting
		 * for this network traffic. This is to avoid deadlock while
		 * debugging a process via /proc and also to avoid recursive
		 * mutex_enter()s due to NFS page faults while stopping
		 * (NFS holds locks when it calls here).
		 */
		clock_t cv_wait_ret;
		clock_t timout;
		clock_t oldlbolt;

		klwp_t *lwp = ttolwp(curthread);

		if (lwp != NULL)
			lwp->lwp_nostop++;

		oldlbolt = lbolt;
		timout = wait.tv_sec * drv_usectohz(1000000) +
		    drv_usectohz(wait.tv_usec) + oldlbolt;
		/*
		 * Iterate until the call_status is changed to something
		 * other than RPC_TIMEDOUT, or until cv_timedwait_sig()
		 * returns something <= 0. The latter means that we were
		 * interrupted (0) or timed out (-1).
		 */
		if (h->cl_nosignal)
			while ((cv_wait_ret = cv_timedwait(&call->call_cv,
			    &call->call_lock, timout)) > 0 &&
			    call->call_status == RPC_TIMEDOUT)
				/* NO OP */;
		else
			while ((cv_wait_ret = cv_timedwait_sig(
			    &call->call_cv,
			    &call->call_lock, timout)) > 0 &&
			    call->call_status == RPC_TIMEDOUT)
				/* NO OP */;

		switch (cv_wait_ret) {
		case 0:
			/*
			 * If we got out of the above loop with
			 * cv_timedwait_sig() returning 0, then we were
			 * interrupted regardless of what call_status is.
			 */
			interrupted = 1;
			break;
		case -1:
			/* cv_timedwait_sig() timed out */
			break;
		default:

			/*
			 * We were cv_signaled(). If we didn't
			 * get a successful call_status and returned
			 * before time expired, delay up to clnt_cots_min_tout
			 * seconds so that the caller doesn't immediately
			 * try to call us again and thus force the
			 * same condition that got us here (such
			 * as an RPC_XPRTFAILED due to the server not
			 * listening on the end-point).
			 */
			if (call->call_status != RPC_SUCCESS) {
				clock_t curlbolt;
				clock_t diff;

				curlbolt = ddi_get_lbolt();
				ticks = clnt_cots_min_tout *
				    drv_usectohz(1000000);
				diff = curlbolt - oldlbolt;
				if (diff < ticks) {
					delay_first = TRUE;
					if (diff > 0)
						ticks -= diff;
				}
			}
			break;
		}

		if (lwp != NULL)
			lwp->lwp_nostop--;
	}
	/*
	 * Get the reply message, if any. This will be freed at the end
	 * whether or not an error occurred.
	 */
	mp = call->call_reply;
	call->call_reply = NULL;

	/*
	 * call_err is the error info when the call is on dispatch queue.
	 * cku_err is the error info returned to the caller.
	 * Sync cku_err with call_err for local message processing.
	 */

	status = call->call_status;
	p->cku_err = call->call_err;
	mutex_exit(&call->call_lock);

	if (status != RPC_SUCCESS) {
		switch (status) {
		case RPC_TIMEDOUT:
			if (interrupted) {
				COTSRCSTAT_INCR(p->cku_stats, rcintrs);
				p->cku_err.re_status = RPC_INTR;
				p->cku_err.re_errno = EINTR;
				RPCLOG(1, "clnt_cots_kcallit: xid 0x%x",
				    p->cku_xid);
				RPCLOG(1, "signal interrupted at %ld", lbolt);
				RPCLOG(1, ", was sent at %ld\n", time_sent);
			} else {
				COTSRCSTAT_INCR(p->cku_stats, rctimeouts);
				p->cku_err.re_errno = ETIMEDOUT;
				RPCLOG(1, "clnt_cots_kcallit: timed out at %ld",
				    lbolt);
				RPCLOG(1, ", was sent at %ld\n", time_sent);
			}
			break;

		case RPC_XPRTFAILED:
			if (p->cku_err.re_errno == 0)
				p->cku_err.re_errno = EIO;

			RPCLOG(1, "clnt_cots_kcallit: transport failed: %d\n",
			    p->cku_err.re_errno);
			break;

		case RPC_SYSTEMERROR:
			ASSERT(p->cku_err.re_errno);
			RPCLOG(1, "clnt_cots_kcallit: system error: %d\n",
			    p->cku_err.re_errno);
			break;

		default:
			p->cku_err.re_status = RPC_SYSTEMERROR;
			p->cku_err.re_errno = EIO;
			RPCLOG(1, "clnt_cots_kcallit: error: %s\n",
			    clnt_sperrno(status));
			break;
		}
		if (p->cku_err.re_status != RPC_TIMEDOUT) {

			if (p->cku_flags & CKU_ONQUEUE) {
				call_table_remove(call);
				p->cku_flags &= ~CKU_ONQUEUE;
			}

			RPCLOG(64, "clnt_cots_kcallit: non TIMEOUT so xid 0x%x "
			    "taken off dispatch list\n", p->cku_xid);
			if (call->call_reply) {
				freemsg(call->call_reply);
				call->call_reply = NULL;
			}
		} else if (wait.tv_sec != 0) {
			/*
			 * We've sent the request over TCP and so we have
			 * every reason to believe it will get
			 * delivered, in which case returning a timeout is
			 * not appropriate.
			 */
			if (p->cku_progress == TRUE &&
			    p->cku_recv_attempts < clnt_cots_maxrecv) {
				p->cku_err.re_status = RPC_INPROGRESS;
			}
		}
		goto cots_done;
	}

	xdrs = &p->cku_inxdr;
	xdrmblk_init(xdrs, mp, XDR_DECODE, 0);

	reply_msg.rm_direction = REPLY;
	reply_msg.rm_reply.rp_stat = MSG_ACCEPTED;
	reply_msg.acpted_rply.ar_stat = SUCCESS;

	reply_msg.acpted_rply.ar_verf = _null_auth;
	/*
	 * xdr_results will be done in AUTH_UNWRAP.
	 */
	reply_msg.acpted_rply.ar_results.where = NULL;
	reply_msg.acpted_rply.ar_results.proc = xdr_void;

	if (xdr_replymsg(xdrs, &reply_msg)) {
		enum clnt_stat re_status;

		_seterr_reply(&reply_msg, &p->cku_err);

		re_status = p->cku_err.re_status;
		if (re_status == RPC_SUCCESS) {
			/*
			 * Reply is good, check auth.
			 */
			if (!AUTH_VALIDATE(h->cl_auth,
			    &reply_msg.acpted_rply.ar_verf)) {
				COTSRCSTAT_INCR(p->cku_stats, rcbadverfs);
				RPCLOG0(1, "clnt_cots_kcallit: validation "
				    "failure\n");
				freemsg(mp);
				(void) xdr_rpc_free_verifier(xdrs, &reply_msg);
				mutex_enter(&call->call_lock);
				if (call->call_reply == NULL)
					call->call_status = RPC_TIMEDOUT;
				mutex_exit(&call->call_lock);
				goto read_again;
			} else if (!AUTH_UNWRAP(h->cl_auth, xdrs,
			    xdr_results, resultsp)) {
				RPCLOG0(1, "clnt_cots_kcallit: validation "
				    "failure (unwrap)\n");
				p->cku_err.re_status = RPC_CANTDECODERES;
				p->cku_err.re_errno = EIO;
			}
		} else {
			/* set errno in case we can't recover */
			if (re_status != RPC_VERSMISMATCH &&
			    re_status != RPC_AUTHERROR &&
			    re_status != RPC_PROGVERSMISMATCH)
				p->cku_err.re_errno = EIO;

			if (re_status == RPC_AUTHERROR) {
				/*
				 * Maybe our credentials need to be refreshed
				 */
				if (cm_entry) {
					/*
					 * There is the potential that the
					 * cm_entry has been or will be marked
					 * dead, so drop the connection
					 * altogether and force REFRESH to
					 * establish a new connection.
					 */
					connmgr_cancelconn(cm_entry);
					cm_entry = NULL;
				}

				if ((refreshes > 0) &&
				    AUTH_REFRESH(h->cl_auth, &reply_msg,
				    p->cku_cred)) {
					refreshes--;
					(void) xdr_rpc_free_verifier(xdrs,
					    &reply_msg);
					freemsg(mp);
					mp = NULL;

					if (p->cku_flags & CKU_ONQUEUE) {
						call_table_remove(call);
						p->cku_flags &= ~CKU_ONQUEUE;
					}

					RPCLOG(64,
					    "clnt_cots_kcallit: AUTH_ERROR, xid"
					    " 0x%x removed off dispatch list\n",
					    p->cku_xid);
					if (call->call_reply) {
						freemsg(call->call_reply);
						call->call_reply = NULL;
					}

					COTSRCSTAT_INCR(p->cku_stats,
					    rcbadcalls);
					COTSRCSTAT_INCR(p->cku_stats,
					    rcnewcreds);
					goto call_again;
				}

				/*
				 * We have used the client handle to
				 * do an AUTH_REFRESH and the RPC status may
				 * be set to RPC_SUCCESS; Let's make sure to
				 * set it to RPC_AUTHERROR.
				 */
				p->cku_err.re_status = RPC_AUTHERROR;

				/*
				 * Map recoverable and unrecoverable
				 * authentication errors to appropriate errno
				 */
				switch (p->cku_err.re_why) {
				case AUTH_TOOWEAK:
					/*
					 * This could be a failure where the
					 * server requires use of a reserved
					 * port, check and optionally set the
					 * client handle useresvport trying
					 * one more time. Next go round we
					 * fall out with the tooweak error.
					 */
					if (p->cku_useresvport != 1) {
						p->cku_useresvport = 1;
						p->cku_xid = 0;
						(void) xdr_rpc_free_verifier
						    (xdrs, &reply_msg);
						freemsg(mp);
						goto call_again;
					}
					/* FALLTHRU */
				case AUTH_BADCRED:
				case AUTH_BADVERF:
				case AUTH_INVALIDRESP:
				case AUTH_FAILED:
				case RPCSEC_GSS_NOCRED:
				case RPCSEC_GSS_FAILED:
					p->cku_err.re_errno = EACCES;
					break;
				case AUTH_REJECTEDCRED:
				case AUTH_REJECTEDVERF:
				default:
					p->cku_err.re_errno = EIO;
					break;
				}
				RPCLOG(1, "clnt_cots_kcallit : authentication"
				    " failed with RPC_AUTHERROR of type %d\n",
				    (int)p->cku_err.re_why);
			}
		}
	} else {
		/* reply didn't decode properly. */
		p->cku_err.re_status = RPC_CANTDECODERES;
		p->cku_err.re_errno = EIO;
		RPCLOG0(1, "clnt_cots_kcallit: decode failure\n");
	}

	(void) xdr_rpc_free_verifier(xdrs, &reply_msg);

	if (p->cku_flags & CKU_ONQUEUE) {
		call_table_remove(call);
		p->cku_flags &= ~CKU_ONQUEUE;
	}

	RPCLOG(64, "clnt_cots_kcallit: xid 0x%x taken off dispatch list",
	    p->cku_xid);
	RPCLOG(64, " status is %s\n", clnt_sperrno(p->cku_err.re_status));
cots_done:
	if (cm_entry) {

		/*
		 * If it was a test of the cb connection, set status as
		 * tested.
		 */
		if ((p->cku_flags & (CKU_CALLBACK | CKU_CB_TEST)) &&
		    p->cku_err.re_status == RPC_SUCCESS)
			cm_entry->x_cb_tested = TRUE;

		connmgr_release(cm_entry);
	}

	if (mp != NULL)
		freemsg(mp);
	if ((p->cku_flags & CKU_ONQUEUE) == 0 && call->call_reply) {
		freemsg(call->call_reply);
		call->call_reply = NULL;
	}
	if (p->cku_err.re_status != RPC_SUCCESS) {
		RPCLOG0(1, "clnt_cots_kcallit: tail-end failure\n");
		COTSRCSTAT_INCR(p->cku_stats, rcbadcalls);
	}

	/*
	 * No point in delaying if the zone is going away.
	 */
	if (delay_first == TRUE &&
	    !(zone_status_get(curproc->p_zone) >= ZONE_IS_SHUTTING_DOWN)) {
		if (clnt_delay(ticks, h->cl_nosignal) == EINTR) {
			p->cku_err.re_errno = EINTR;
			p->cku_err.re_status = RPC_INTR;
		}
	}
	return (p->cku_err.re_status);
}


/*
 * Kinit routine for cots. This sets up the correct operations in
 * the client handle, as the handle may have previously been a clts
 * handle, and clears the xid field so there is no way a new call
 * could be mistaken for a retry. It also sets in the handle the
 * information that is passed at create/kinit time but needed at
 * call time, as cots creates the transport at call time - device,
 * address of the server, protocol family.
 */
void
clnt_cots_kinit(CLIENT *h, dev_t dev, int family, struct netbuf *addr,
    int max_msgsize, cred_t *cred)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);
	calllist_t *call = &p->cku_call;

	h->cl_ops = &tcp_ops;
	if (p->cku_flags & CKU_ONQUEUE) {
		call_table_remove(call);
		p->cku_flags &= ~CKU_ONQUEUE;
		RPCLOG(64, "clnt_cots_kinit: removing call for xid 0x%x from"
		    " dispatch list\n", p->cku_xid);
	}

	if (call->call_reply != NULL) {
		freemsg(call->call_reply);
		call->call_reply = NULL;
	}

	call->call_bucket = NULL;
	call->call_hash = 0;

	/*
	 * We don't clear cku_flags here, because clnt_cots_kcallit()
	 * takes care of handling the cku_flags reset.
	 */
	p->cku_xid = 0;
	p->cku_device = dev;
	p->cku_addrfmly = family;
	p->cku_cred = cred;

	if (p->cku_addr.maxlen < addr->len) {
		if (p->cku_addr.maxlen != 0 && p->cku_addr.buf != NULL)
			kmem_free(p->cku_addr.buf, p->cku_addr.maxlen);
		p->cku_addr.buf = kmem_zalloc(addr->maxlen, KM_SLEEP);
		p->cku_addr.maxlen = addr->maxlen;
	}

	p->cku_addr.len = addr->len;
	bcopy(addr->buf, p->cku_addr.buf, addr->len);

	/*
	 * If the current sanity check size in rpcmod is smaller
	 * than the size needed, then increase the sanity check.
	 */
	if (max_msgsize != 0 && clnt_max_msg_sizep != NULL &&
	    max_msgsize > *clnt_max_msg_sizep) {
		mutex_enter(&clnt_max_msg_lock);
		if (max_msgsize > *clnt_max_msg_sizep)
			*clnt_max_msg_sizep = max_msgsize;
		mutex_exit(&clnt_max_msg_lock);
	}
}

/*
 * ksettimers is a no-op for cots, with the exception of setting the xid.
 */
/* ARGSUSED */
static int
clnt_cots_ksettimers(CLIENT *h, struct rpc_timers *t, struct rpc_timers *all,
    int minimum, void (*feedback)(int, int, caddr_t), caddr_t arg,
    uint32_t xid)
{
	/* LINTED pointer alignment */
	cku_private_t *p = htop(h);

	if (xid)
		p->cku_xid = xid;
	COTSRCSTAT_INCR(p->cku_stats, rctimers);
	return (0);
}

extern void rpc_poptimod(struct vnode *);
extern int kstr_push(struct vnode *, char *);

int
conn_kstat_update(kstat_t *ksp, int rw)
{
	struct cm_xprt *cm_entry;
	struct cm_kstat_xprt *cm_ksp_data;
	uchar_t *b;
	char *fbuf;

	if (rw == KSTAT_WRITE)
		return (EACCES);
	if (ksp == NULL || ksp->ks_private == NULL)
		return (EIO);
	cm_entry = (struct cm_xprt *)ksp->ks_private;
	cm_ksp_data = (struct cm_kstat_xprt *)ksp->ks_data;

	cm_ksp_data->x_wq.value.ui32 = (uint32_t)(uintptr_t)cm_entry->x_wq;
	cm_ksp_data->x_family.value.ui32 = cm_entry->x_family;
	cm_ksp_data->x_rdev.value.ui32 = (uint32_t)cm_entry->x_rdev;
	cm_ksp_data->x_time.value.ui32 = cm_entry->x_time;
	cm_ksp_data->x_ref.value.ui32 = cm_entry->x_ref;
	cm_ksp_data->x_state.value.ui32 = cm_entry->x_state_flags;

	if (cm_entry->x_server.buf) {
		fbuf = cm_ksp_data->x_server.value.str.addr.ptr;
		if (cm_entry->x_family == AF_INET &&
		    cm_entry->x_server.len ==
		    sizeof (struct sockaddr_in)) {
			struct sockaddr_in *sa;
			sa = (struct sockaddr_in *)
			    cm_entry->x_server.buf;
			b = (uchar_t *)&sa->sin_addr;
			(void) sprintf(fbuf,
			    "%03d.%03d.%03d.%03d", b[0] & 0xFF, b[1] & 0xFF,
			    b[2] & 0xFF, b[3] & 0xFF);
			cm_ksp_data->x_port.value.ui32 =
			    (uint32_t)sa->sin_port;
		} else if (cm_entry->x_family == AF_INET6 &&
		    cm_entry->x_server.len >=
		    sizeof (struct sockaddr_in6)) {
			/* extract server IP address & port */
			struct sockaddr_in6 *sin6;
			sin6 = (struct sockaddr_in6 *)cm_entry->x_server.buf;
			(void) kinet_ntop6((uchar_t *)&sin6->sin6_addr, fbuf,
			    INET6_ADDRSTRLEN);
			cm_ksp_data->x_port.value.ui32 = sin6->sin6_port;
		} else {
			struct sockaddr_in *sa;

			sa = (struct sockaddr_in *)cm_entry->x_server.buf;
			b = (uchar_t *)&sa->sin_addr;
			(void) sprintf(fbuf,
			    "%03d.%03d.%03d.%03d", b[0] & 0xFF, b[1] & 0xFF,
			    b[2] & 0xFF, b[3] & 0xFF);
		}
		KSTAT_NAMED_STR_BUFLEN(&cm_ksp_data->x_server) =
		    strlen(fbuf) + 1;
	}

	return (0);
}


/*
 * We want a version of delay which is interruptible by a UNIX signal.
 * Return EINTR if an interrupt occurred.
 */
static int
clnt_delay(clock_t ticks, bool_t nosignal)
{
	if (nosignal == TRUE) {
		delay(ticks);
		return (0);
	}
	return (delay_sig(ticks));
}

/*
 * Wait for a connection until a timeout, or until we are
 * signalled that there has been a connection state change.
 */
static enum clnt_stat
connmgr_cwait(struct cm_xprt *cm_entry, const struct timeval *waitp,
    bool_t nosignal)
{
	bool_t interrupted;
	clock_t timout, cv_stat;
	enum clnt_stat clstat;
	unsigned int old_state;

	ASSERT(MUTEX_HELD(&connmgr_lock));
	/*
	 * We wait for the transport connection to be made, or an
	 * indication that it could not be made.
	 */
	clstat = RPC_TIMEDOUT;
	interrupted = FALSE;

	old_state = cm_entry->x_state_flags;
	/*
	 * Now loop until cv_timedwait{_sig} returns because of
	 * a signal(0) or timeout(-1) or cv_signal(>0). But it may be
	 * cv_signalled for various other reasons too. So loop
	 * until there is a state change on the connection.
	 */

	timout = waitp->tv_sec * drv_usectohz(1000000) +
	    drv_usectohz(waitp->tv_usec) + lbolt;

	if (nosignal) {
		while ((cv_stat = cv_timedwait(&cm_entry->x_conn_cv,
		    &connmgr_lock, timout)) > 0 &&
		    cm_entry->x_state_flags == old_state)
			;
	} else {
		while ((cv_stat = cv_timedwait_sig(&cm_entry->x_conn_cv,
		    &connmgr_lock, timout)) > 0 &&
		    cm_entry->x_state_flags == old_state)
			;

		if (cv_stat == 0) /* got intr signal?
		    */
			interrupted = TRUE;
	}

	if ((cm_entry->x_state_flags & (X_BADSTATES|X_CONNECTED)) ==
	    X_CONNECTED) {
		clstat = RPC_SUCCESS;
	} else {
		if (interrupted == TRUE)
			clstat = RPC_INTR;
		RPCLOG(1, "connmgr_cwait: can't connect, error: %s\n",
		    clnt_sperrno(clstat));
	}

	return (clstat);
}

static bool_t
clnt_connmgr_cmptag(cku_private_t *p, struct cm_xprt *xprt)
{
	bool_t same_tag = FALSE;
	bool_t bc_call;
	bool_t bc_conn;
	int tagless = 0;

	tagless = rpc_is_taglist_empty(xprt->x_tags);


	bc_conn = xprt->x_cb;

	/*
	 * Old style request with no TAGS (nfsv4.0/nfsv3/NLM).
	 * If the connection is tagged, return TRUE only if it is
	 * a fore-channel only connection. Don't allow old style
	 * requests over a connection that allows only backchannel traffic.
	 */
	if (!(p->cku_flags & CKU_TAGCMP)) {

		if (tagless)
			return (TRUE);

		if (bc_conn)
			return (FALSE);

		return (TRUE);
	}

	bc_call = (p->cku_flags & CKU_BACKCHANNEL) ? TRUE : FALSE;

	/*
	 * If the request is for the back channel exclusively,
	 * but the connection is not marked as a BC connection, fail.
	 */

	if (bc_call && !bc_conn)
		return (FALSE);

	/*
	 * In case of non bi-dir RPC, if request is for fore channel,
	 * but the connection is for back channel only, fail.
	 */

	if (!clnt_cots_birpc && (!bc_call && bc_conn))
		return (FALSE);


	/*
	 * Now compare tags
	 */

	if (!tagless) {
		same_tag = rpc_cmp_tag(xprt->x_tags, (void *)p->cku_tag);
	}

#ifdef NOT_YET

	/* not turned-on until SSV and connection binding enforcement */

	/*
	 * Tags are different, or connection has no tags
	 * Add a new tag only if BIND_CONN flag is set
	 */

	if (!same_tag && (p->cku_flags & CKU_BIND_CONN)) {

#endif
	if (!same_tag) {
		rpc_add_tag(&cm_tag_hd, (void *)xprt, (void *)p->cku_tag);
	}

	return (TRUE);
}

/*
 * connmgr wrapper to swap tag values
 * client handle's tag p->cku_tag is changed to new
 * tag as well
 */

static bool_t
connmgr_tag_swap(cku_private_t *p, void *newtag)
{

	if (!rpc_tag_swap(&cm_tag_hd, (void *)p->cku_tag, newtag))
		return (FALSE);

	/*
	 * Now set the new tag on the client handle
	 */
	bcopy(newtag, p->cku_tag, sizeof (tagid));

	return (TRUE);
}

/*
 * Detaches all the connections associated with the tag.
 */
static void
connmgr_tag_disassociate(cku_private_t *p)
{
	rpc_tag_t *tag;

	tag = rpc_lookup_tag(&cm_tag_hd, (void *)p->cku_tag, FALSE);

	if (tag == NULL) {
		return;
	}

	mutex_enter(&tag->rt_lock);
	rpc_remove_all_xprt(&cm_tag_hd, tag);
	mutex_exit(&tag->rt_lock);

	/*
	 * Release the refcnt acquired by lookup_tag()
	 */
	RPC_TAG_RELE(&cm_tag_hd, tag);

}

/*
 * Detaches the connections associated with the tag (eventually).
 * The way we go about doing this is, we mark the connection as
 * doomed. The connection will take the usual course to closure
 * when the timer goes off, at which point it is detached from
 * the tag as well.
 * This way, we try to keep rt_refcnt from
 * going to 0 and having to reallocate the tag for the new
 * connection.
 */
static void
connmgr_tag_unbind(cku_private_t *p)
{

	rpc_tag_t *tag;
	int back_chan = 0;
	struct cm_xprt *xprt;
	void *cookie = NULL;

	if (p->cku_flags & CKU_BACKCHANNEL)
		back_chan = 1;

	/*
	 * Unfortunately we'll need to hold this lock
	 * to change x_state_flags
	 */
	mutex_enter(&connmgr_lock);

	tag = rpc_lookup_tag(&cm_tag_hd, (void *)p->cku_tag, FALSE);

	if (tag == NULL) {
		mutex_exit(&connmgr_lock);
		return;
	}

	mutex_enter(&tag->rt_lock);

	xprt = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	while (xprt) {
		/*
		 * Mark only connections of the right type
		 * in the case of non-bidir RPC.
		 */
		if (!clnt_cots_birpc &&
		    ((back_chan && !xprt->x_cb) ||
		    (!back_chan && xprt->x_cb))) {
			xprt =
			    (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
			continue;
		}

		if ((xprt->x_state_flags & X_DOOMED) == 0)
			xprt->x_doomed = TRUE;

		xprt = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	}
	mutex_exit(&tag->rt_lock);

	/*
	 * Release the refcnt acquired by lookup_tag()
	 */
	RPC_TAG_RELE(&cm_tag_hd, tag);

	mutex_exit(&connmgr_lock);
}

/* ARGSUSED */
static int
connmgr_clrcbinfo(cku_private_t *p, void *sessid)
{
	rpc_tag_t *tag;
	struct cm_xprt *xprt;
	void *cookie = NULL;

	tag = rpc_lookup_tag(&cm_tag_hd, sessid, FALSE);

	if (tag == NULL) {
		return (FALSE);
	}

	mutex_enter(&tag->rt_lock);

	xprt = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	while (xprt) {
		extern void mir_clear_cbinfo(queue_t *);
		mir_clear_cbinfo(xprt->x_wq);
		xprt = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	}
	mutex_exit(&tag->rt_lock);

	/*
	 * Release the refcnt acquired by lookup_tag()
	 */
	RPC_TAG_RELE(&cm_tag_hd, tag);
	return (TRUE);
}


/*
 * Primary interface for how RPC grabs a connection.
 */
static struct cm_xprt *
connmgr_wrapget(
	struct netbuf *retryaddr,
	const struct timeval *waitp,
	cku_private_t *p)
{
	struct cm_xprt *cm_entry;
	cm_entry = connmgr_get(retryaddr, waitp, p);


	if (cm_entry == NULL) {
		/*
		 * Re-map the call status to RPC_INTR if the err code is
		 * EINTR. This can happen if the call's status is RPC_TLIERROR.
		 * However, don't re-map if signalling has been turned off.
		 * XXX Really need to create a separate thread whenever
		 * there isn't an existing connection.
		 */
		if (p->cku_err.re_errno == EINTR) {
			if (p->cku_client.cl_nosignal == TRUE)
				p->cku_err.re_errno = EIO;
			else
				p->cku_err.re_status = RPC_INTR;
		}
	}
	return (cm_entry);
}

/* ARGSUSED */
struct cm_xprt *
connmgr_cbget(struct netbuf *retryaddr, const struct timeval *waitp,
    cku_private_t *p)
{
	rpc_tag_t *tag;
	struct cm_xprt *cm_entry, *lru_entry;
	struct netbuf *srcaddr;
	clock_t prev_time;
	bool_t cbconn_test = FALSE;
	void *cookie = NULL;

	ASSERT(p->cku_flags & (CKU_TAGCMP|CKU_CALLBACK));

	cbconn_test = (p->cku_flags & CKU_CB_TEST);

	tag = rpc_lookup_tag(&cm_tag_hd, (void *)p->cku_tag, FALSE);
	if (tag == NULL) {
		p->cku_err.re_status = RPC_CANTCONNECT;
		p->cku_err.re_errno = EIO;
		return (NULL);
	}

	prev_time = lbolt;
	lru_entry = NULL;

	/*
	 * Pick the LRU cm_entry for the tag
	 */
	mutex_enter(&tag->rt_lock);

	cm_entry = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);


	while (cm_entry) {

		srcaddr = &cm_entry->x_src;

		if ((retryaddr != NULL) &&
		    (NETBUF_CMP(retryaddr, srcaddr) != 0)) {
			cm_entry =
			    (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
			continue;
		}

		/*
		 * If it is a cb connection test, pick up only untested
		 * connections. Skip already tested connections.
		 */

		if (cbconn_test &&
		    (cm_entry->x_cb_tested == TRUE)) {
			cm_entry =
			    (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
			continue;
		}

		if ((cm_entry->x_time - prev_time) <= 0 || lru_entry == NULL) {
			lru_entry = cm_entry;
			prev_time = cm_entry->x_time;
		}

		cm_entry = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	}

	if (lru_entry != NULL) {
		CONN_HOLD(lru_entry);
		lru_entry->x_time = lbolt;
	}

	mutex_exit(&tag->rt_lock);

	RPC_TAG_RELE(&cm_tag_hd, tag);

	DTRACE_PROBE2(cb__connmgr, char *, "lru_entry",
	    struct cm_xprt *, lru_entry);

	if (lru_entry == NULL) {
		p->cku_err.re_status = RPC_CANTCONNECT;
		p->cku_err.re_errno = EIO;
		return (NULL);
	}

	return (lru_entry);
}

/*
 * Given a specific tag, return number of untested
 * connections.
 */

bool_t
connmgr_cb_totest(CLIENT *h, void *conn_num)
{
	cku_private_t *p = htop(h);
	rpc_tag_t *tag;
	struct cm_xprt *cm_entry;
	void *cookie = NULL;
	int cno = 0;

	tag = rpc_lookup_tag(&cm_tag_hd, (void *)p->cku_tag, FALSE);
	if (tag == NULL) {
		*((int *)conn_num) = cno;
		return (FALSE);
	}

	mutex_enter(&tag->rt_lock);
	cm_entry = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);

	while (cm_entry) {
		if (cm_entry->x_cb_tested == FALSE)
			cno++;
		cm_entry = (struct cm_xprt *)rpc_get_next_xprt(tag, &cookie);
	}
	mutex_exit(&tag->rt_lock);
	RPC_TAG_RELE(&cm_tag_hd, tag);
	*((int *)conn_num) = cno;
	return (TRUE);
}

/*
 * Obtains a transport to the server specified in addr. If a suitable transport
 * does not already exist in the list of cached transports, a new connection
 * is created, connected, and added to the list. The connection is for sending
 * only - the reply message may come back on another transport connection.
 */
static struct cm_xprt *
connmgr_get(
	struct netbuf *retryaddr,
	const struct timeval *waitp,	/* changed to a ptr to conserve stack */
	cku_private_t *p)
{
	struct cm_xprt *cm_entry;
	struct cm_xprt *lru_entry;
	struct cm_xprt **cmp;
	queue_t *wq;
	TIUSER *tiptr;
	int i;
	int retval;
	clock_t prev_time;
	int tidu_size;
	bool_t connected;
	zoneid_t zoneid = rpc_zoneid();
	struct netbuf *destaddr = &p->cku_addr;
	struct netbuf *cm_destaddr, *cm_srcaddr;
	int addrfmly = p->cku_addrfmly;
	struct netbuf *srcaddr = &p->cku_srcaddr;
	struct rpc_err *rpcerr = &p->cku_err;
	dev_t device = p->cku_device;
	bool_t nosignal = p->cku_client.cl_nosignal;
	int useresvport = p->cku_useresvport;
	CBSERVER_ARGS *cbsrv_args = p->cku_cb;

	/*
	 * Server side callback connections
	 */
	if (p->cku_flags & CKU_CALLBACK) {
		return (connmgr_cbget(retryaddr, waitp, p));
	}

	/*
	 * If the call is not a retry, look for a transport entry that
	 * goes to the server of interest.
	 */
	mutex_enter(&connmgr_lock);

	if (retryaddr == NULL) {
use_new_conn:
		i = 0;
		cm_entry = lru_entry = NULL;
		prev_time = lbolt;

		cmp = &cm_hd;
		while ((cm_entry = *cmp) != NULL) {
			ASSERT(cm_entry != cm_entry->x_next);
			/*
			 * Garbage collect connections that are marked
			 * as needing a disconnect.
			 */

			if (cm_entry->x_needdis) {
				CONN_HOLD(cm_entry);
				connmgr_dis_and_wait(cm_entry);
				connmgr_release(cm_entry);
				/*
				 * connmgr_lock could have been
				 * dropped for the disconnect
				 * processing so start over.
				 */
				goto use_new_conn;
			}


			/*
			 * Garbage collect the dead connections that have
			 * no threads working on them.
			 */
			if ((cm_entry->x_state_flags & (X_DEAD|X_THREAD)) ==
			    X_DEAD) {
				mutex_enter(&cm_entry->x_lock);
				if (cm_entry->x_ref != 0) {
					/*
					 * Currently in use.
					 * Cleanup later.
					 */
					cmp = &cm_entry->x_next;
					mutex_exit(&cm_entry->x_lock);
					continue;
				}
				mutex_exit(&cm_entry->x_lock);
				*cmp = cm_entry->x_next;
				mutex_exit(&connmgr_lock);
				connmgr_close(cm_entry);
				mutex_enter(&connmgr_lock);
				goto use_new_conn;
			}

			cm_destaddr = &cm_entry->x_server;

			if ((cm_entry->x_state_flags & X_BADSTATES) == 0 &&
			    cm_entry->x_zoneid == zoneid &&
			    cm_entry->x_rdev == device &&
			    NETBUF_CMP(destaddr, cm_destaddr) == 0 &&
			    RPC_SAME_TAG(p, cm_entry)) {

				/*
				 * If the matching entry isn't connected,
				 * attempt to reconnect it.
				 */
				if (cm_entry->x_connected == FALSE) {
					/*
					 * We don't go through trying
					 * to find the least recently
					 * used connection because
					 * connmgr_reconnect() briefly
					 * dropped the connmgr_lock,
					 * allowing a window for our
					 * accounting to be messed up.
					 * In any case, a re-connected
					 * connection is as good as
					 * an LRU connection.
					 */
					return (connmgr_wrapconnect(cm_entry,
					    waitp, destaddr, addrfmly, srcaddr,
					    rpcerr, TRUE, nosignal));
				}
				i++;
				if (cm_entry->x_time - prev_time <= 0 ||
				    lru_entry == NULL) {
					prev_time = cm_entry->x_time;
					lru_entry = cm_entry;
				}
			}

			cmp = &cm_entry->x_next;
		}

		if (i > clnt_max_conns) {
			RPCLOG(8, "connmgr_get: too many conns, dooming entry"
			    " %p\n", (void *)lru_entry->x_tiptr);
			lru_entry->x_doomed = TRUE;
			goto use_new_conn;
		}

		/*
		 * If we are at the maximum number of connections to
		 * the server, hand back the least recently used one.
		 */
		if (i == clnt_max_conns) {
			/*
			 * Copy into the handle the source address of
			 * the connection, which we will use in case of
			 * a later retry.
			 */
			if (srcaddr->len != lru_entry->x_src.len) {
				if (srcaddr->len > 0)
					kmem_free(srcaddr->buf,
					    srcaddr->maxlen);
				srcaddr->buf = kmem_zalloc(
				    lru_entry->x_src.len, KM_SLEEP);
				srcaddr->maxlen = srcaddr->len =
				    lru_entry->x_src.len;
			}
			bcopy(lru_entry->x_src.buf, srcaddr->buf, srcaddr->len);
			RPCLOG(2, "connmgr_get: call going out on %p\n",
			    (void *)lru_entry);
			lru_entry->x_time = lbolt;

			if (p->cku_flags & CKU_BC_SETUP) {
				extern void mir_set_cbinfo(queue_t *, void *);
				mir_set_cbinfo(lru_entry->x_wq,
				    (void *)cbsrv_args);
			}

			CONN_HOLD(lru_entry);
			mutex_exit(&connmgr_lock);
			return (lru_entry);
		}

	} else {
		/*
		 * This is the retry case (retryaddr != NULL). Retries must
		 * be sent on the same source port as the original call.
		 */

		/*
		 * Walk the list looking for a connection with a source address
		 * that matches the retry address.
		 */
		cmp = &cm_hd;
		while ((cm_entry = *cmp) != NULL) {

			ASSERT(cm_entry != cm_entry->x_next);
			cm_srcaddr = &cm_entry->x_src;

			if (zoneid != cm_entry->x_zoneid ||
			    device != cm_entry->x_rdev ||
			    (NETBUF_CMP(retryaddr, cm_srcaddr) != 0) ||
			    !RPC_SAME_TAG(p, cm_entry)) {
				cmp = &cm_entry->x_next;
				continue;
			}

			/*
			 * Sanity check: if the connection with our source
			 * port is going to some other server, something went
			 * wrong, as we never delete connections (i.e. release
			 * ports) unless they have been idle. In this case,
			 * it is probably better to send the call out using
			 * a new source address than to fail it altogether,
			 * since that port may never be released.
			 */

			cm_destaddr = &cm_entry->x_server;

			if (NETBUF_CMP(destaddr, cm_destaddr)) {
				RPCLOG(1, "connmgr_get: tiptr %p"
				    " is going to a different server"
				    " with the port that belongs"
				    " to us!\n", (void *)cm_entry->x_tiptr);
				retryaddr = NULL;
				goto use_new_conn;
			}

			if (p->cku_flags & CKU_BC_SETUP) {
				extern void mir_set_cbinfo(queue_t *, void *);
				mir_set_cbinfo(cm_entry->x_wq,
				    (void *)cbsrv_args);
			}

			/*
			 * If the connection of interest is not connected and we
			 * can't reconnect it, then the server is probably
			 * still down. Return NULL to the caller and let it
			 * retry later if it wants to. We have a delay so the
			 * machine doesn't go into a tight retry loop. If the
			 * entry was already connected, or the reconnect was
			 * successful, return this entry.
			 */
			if (cm_entry->x_connected == FALSE) {
				return (connmgr_wrapconnect(cm_entry,
				    waitp, destaddr, addrfmly, NULL,
				    rpcerr, TRUE, nosignal));
			} else {
				CONN_HOLD(cm_entry);

				cm_entry->x_time = lbolt;
				mutex_exit(&connmgr_lock);
				RPCLOG(2, "connmgr_get: found old "
				    "transport %p for retry\n",
				    (void *)cm_entry);
				return (cm_entry);
			}
		}

		/*
		 * We cannot find an entry in the list for this retry.
		 * Either the entry has been removed temporarily to be
		 * reconnected by another thread, or the original call
		 * got a port but never got connected,
		 * and hence the transport never got put in the
		 * list. Fall through to the "create new connection" code -
		 * the former case will fail there trying to rebind the port,
		 * and the latter case (and any other pathological cases) will
		 * rebind and reconnect and not hang the client machine.
		 */
		RPCLOG0(8, "connmgr_get: no entry in list for retry\n");
	}


#ifdef NOT_YET

	/* not turned on until SSV and connection binding enforcement */

	/*
	 * If tags are set and we don't have a connection associated
	 * with the tag (and the caller has not asked for a binding),
	 * notify the caller to do connection binding.
	 */

	/* XXX verify retry case */

	if ((p->cku_flags & CKU_TAGCMP) &&
	    !(p->cku_flags & CKU_BIND_CONN)) {
		p->cku_err.re_status = RPC_CONN_NOT_BOUND;
		p->cku_err.re_errno = EIO;
		mutex_exit(&connmgr_lock);
		return (NULL);
	}

#endif

	/*
	 * Set up a transport entry in the connection manager's list.
	 */
	cm_entry = (struct cm_xprt *)
	    kmem_zalloc(sizeof (struct cm_xprt), KM_SLEEP);

	cm_entry->x_server.buf = kmem_zalloc(destaddr->len, KM_SLEEP);
	bcopy(destaddr->buf, cm_entry->x_server.buf, destaddr->len);
	cm_entry->x_server.len = cm_entry->x_server.maxlen = destaddr->len;

	cm_entry->x_state_flags = X_THREAD;
	cm_entry->x_ref = 1;
	cm_entry->x_family = addrfmly;
	cm_entry->x_rdev = device;
	cm_entry->x_zoneid = zoneid;
	mutex_init(&cm_entry->x_lock, NULL, MUTEX_DEFAULT, NULL);
	cv_init(&cm_entry->x_cv, NULL, CV_DEFAULT, NULL);
	cv_init(&cm_entry->x_conn_cv, NULL, CV_DEFAULT, NULL);
	cv_init(&cm_entry->x_dis_cv, NULL, CV_DEFAULT, NULL);

	cm_entry->x_cb = (p->cku_flags & CKU_BACKCHANNEL) ? TRUE : FALSE;

	rpc_init_taglist(&cm_entry->x_tags);

	if (p->cku_flags & CKU_TAGCMP) {
		ASSERT(p->cku_tag != NULL);
		rpc_add_tag(&cm_tag_hd, (void *)cm_entry, (void *)p->cku_tag);
	}

	/*
	 * Note that we add this partially initialized entry to the
	 * connection list. This is so that we don't have multiple
	 * connections to the same server.
	 *
	 * Note that x_src is not initialized at this point.
	 * This is because retryaddr might be NULL, in which case x_src is
	 * whatever t_kbind/bindresvport gives us. If another thread wants a
	 * connection to the same server, seemingly we have an issue, but we
	 * don't. If the other thread comes in with retryaddr == NULL, then it
	 * will never look at x_src, and it will end up waiting in
	 * connmgr_cwait() for the first thread to finish the connection
	 * attempt. If the other thread comes in with retryaddr != NULL, then
	 * that means there was a request sent on a connection, in which case
	 * the connection should already exist. Thus the first thread
	 * never gets here ... it finds the connection to its server in the
	 * connection list.
	 *
	 * But even if this theory is wrong, in the retryaddr != NULL case,
	 * the 2nd thread will skip us because x_src.len == 0.
	 */
	cm_entry->x_next = cm_hd;
	cm_hd = cm_entry;
	mutex_exit(&connmgr_lock);

	/*
	 * Either we didn't find an entry to the server of interest, or we
	 * don't have the maximum number of connections to that server -
	 * create a new connection.
	 */
	RPCLOG0(8, "connmgr_get: creating new connection\n");
	rpcerr->re_status = RPC_TLIERROR;

	i = t_kopen(NULL, device, FREAD|FWRITE|FNDELAY, &tiptr, zone_kcred());
	if (i) {
		RPCLOG(1, "connmgr_get: can't open cots device, error %d\n", i);
		rpcerr->re_errno = i;
		connmgr_cancelconn(cm_entry);
		return (NULL);
	}
	rpc_poptimod(tiptr->fp->f_vnode);

	if (i = strioctl(tiptr->fp->f_vnode, I_PUSH, (intptr_t)"rpcmod", 0,
	    K_TO_K, kcred, &retval)) {
		RPCLOG(1, "connmgr_get: can't push cots module, %d\n", i);
		(void) t_kclose(tiptr, 1);
		rpcerr->re_errno = i;
		connmgr_cancelconn(cm_entry);
		return (NULL);
	}

	if (i = strioctl(tiptr->fp->f_vnode, RPC_CLIENT, 0, 0, K_TO_K,
	    kcred, &retval)) {
		RPCLOG(1, "connmgr_get: can't set client status with cots "
		    "module, %d\n", i);
		(void) t_kclose(tiptr, 1);
		rpcerr->re_errno = i;
		connmgr_cancelconn(cm_entry);
		return (NULL);
	}

	mutex_enter(&connmgr_lock);

	wq = tiptr->fp->f_vnode->v_stream->sd_wrq->q_next;
	cm_entry->x_wq = wq;

	mutex_exit(&connmgr_lock);

	if (i = strioctl(tiptr->fp->f_vnode, I_PUSH, (intptr_t)"timod", 0,
	    K_TO_K, kcred, &retval)) {
		RPCLOG(1, "connmgr_get: can't push timod, %d\n", i);
		(void) t_kclose(tiptr, 1);
		rpcerr->re_errno = i;
		connmgr_cancelconn(cm_entry);
		return (NULL);
	}

	/*
	 * If the caller has not specified reserved port usage then
	 * take the system default.
	 */
	if (useresvport == -1)
		useresvport = clnt_cots_do_bindresvport;

	if ((useresvport || retryaddr != NULL) &&
	    (addrfmly == AF_INET || addrfmly == AF_INET6)) {
		bool_t alloc_src = FALSE;

		if (srcaddr->len != destaddr->len) {
			kmem_free(srcaddr->buf, srcaddr->maxlen);
			srcaddr->buf = kmem_zalloc(destaddr->len, KM_SLEEP);
			srcaddr->maxlen = destaddr->len;
			srcaddr->len = destaddr->len;
			alloc_src = TRUE;
		}

		if ((i = bindresvport(tiptr, retryaddr, srcaddr, TRUE)) != 0) {
			(void) t_kclose(tiptr, 1);
			RPCLOG(1, "connmgr_get: couldn't bind, retryaddr: "
			    "%p\n", (void *)retryaddr);

			/*
			 * 1225408: If we allocated a source address, then it
			 * is either garbage or all zeroes. In that case
			 * we need to clear srcaddr.
			 */
			if (alloc_src == TRUE) {
				kmem_free(srcaddr->buf, srcaddr->maxlen);
				srcaddr->maxlen = srcaddr->len = 0;
				srcaddr->buf = NULL;
			}
			rpcerr->re_errno = i;
			connmgr_cancelconn(cm_entry);
			return (NULL);
		}
	} else {
		if ((i = t_kbind(tiptr, NULL, NULL)) != 0) {
			RPCLOG(1, "clnt_cots_kcreate: t_kbind: %d\n", i);
			(void) t_kclose(tiptr, 1);
			rpcerr->re_errno = i;
			connmgr_cancelconn(cm_entry);
			return (NULL);
		}
	}

	{
		/*
		 * Keep the kernel stack lean. Don't move this call
		 * declaration to the top of this function because a
		 * call is declared in connmgr_wrapconnect()
		 */
		calllist_t call;

		bzero(&call, sizeof (call));
		cv_init(&call.call_cv, NULL, CV_DEFAULT, NULL);

		/*
		 * This is a bound end-point so don't close its stream.
		 */
		connected = connmgr_connect(cm_entry, wq, destaddr, addrfmly,
		    &call, &tidu_size, FALSE, waitp, nosignal);
		*rpcerr = call.call_err;
		cv_destroy(&call.call_cv);

	}

	mutex_enter(&connmgr_lock);

	/*
	 * Set up a transport entry in the connection manager's list.
	 */
	cm_entry->x_src.buf = kmem_zalloc(srcaddr->len, KM_SLEEP);
	bcopy(srcaddr->buf, cm_entry->x_src.buf, srcaddr->len);
	cm_entry->x_src.len = cm_entry->x_src.maxlen = srcaddr->len;

	cm_entry->x_tiptr = tiptr;
	cm_entry->x_time = lbolt;

	if (tiptr->tp_info.servtype == T_COTS_ORD)
		cm_entry->x_ordrel = TRUE;
	else
		cm_entry->x_ordrel = FALSE;

	cm_entry->x_tidu_size = tidu_size;

	if (cm_entry->x_early_disc) {
		/*
		 * We need to check if a disconnect request has come
		 * while we are connected; if so, we need to
		 * set rpcerr->re_status appropriately before returning
		 * NULL to the caller.
		 */
		if (rpcerr->re_status == RPC_SUCCESS)
			rpcerr->re_status = RPC_XPRTFAILED;
		cm_entry->x_connected = FALSE;
	} else
		cm_entry->x_connected = connected;


	/*
	 * There could be a discrepancy here such that
	 * x_early_disc is TRUE yet connected is TRUE as well
	 * and the connection is actually connected. In that case
	 * let's be conservative and declare the connection as not
	 * connected.
	 */
	cm_entry->x_early_disc = FALSE;
	cm_entry->x_needdis = (cm_entry->x_connected == FALSE);
	cm_entry->x_ctime = lbolt;

	DTRACE_PROBE2(connmgr__get, char *, "created a new cm_entry",
	    struct cm_xprt *, cm_entry);
	/*
	 * Notify any threads waiting that the connection attempt is done.
	 */
	cm_entry->x_thread = FALSE;
	cv_broadcast(&cm_entry->x_conn_cv);

	if (cm_entry->x_connected == FALSE) {
		mutex_exit(&connmgr_lock);
		connmgr_release(cm_entry);
		return (NULL);
	}

	mutex_exit(&connmgr_lock);

	if (p->cku_flags & CKU_BC_SETUP) {
		extern void mir_set_cbinfo(queue_t *, void *);
		mir_set_cbinfo(wq, (void *)cbsrv_args);
	}

	return (cm_entry);
}

/*
 * Keep the cm_xprt entry on the connection list when making a connection.
 * This is to prevent multiple connections to a slow server from appearing.
 * We use the bit field x_thread to tell if a thread is doing a connection
 * which keeps other interested threads from messing with the connection.
 * Those other threads just wait if x_thread is set.
 *
 * If x_thread is not set, then we do the actual work of connecting via
 * connmgr_connect().
 *
 * mutex convention: called with connmgr_lock held, returns with it released.
 */
static struct cm_xprt *
connmgr_wrapconnect(
	struct cm_xprt *cm_entry,
	const struct timeval *waitp,
	struct netbuf *destaddr,
	int addrfmly,
	struct netbuf *srcaddr,
	struct rpc_err *rpcerr,
	bool_t reconnect,
	bool_t nosignal)
{
	ASSERT(MUTEX_HELD(&connmgr_lock));
	/*
	 * Hold this entry as we are about to drop connmgr_lock.
	 */
	CONN_HOLD(cm_entry);

	/*
	 * If there is a thread already making a connection for us, then
	 * wait for it to complete the connection.
	 */
	if (cm_entry->x_thread == TRUE) {
		rpcerr->re_status = connmgr_cwait(cm_entry, waitp, nosignal);

		if (rpcerr->re_status != RPC_SUCCESS) {
			mutex_exit(&connmgr_lock);
			connmgr_release(cm_entry);
			return (NULL);
		}
	} else {
		bool_t connected;
		calllist_t call;

		cm_entry->x_thread = TRUE;

		while (cm_entry->x_needrel == TRUE) {
			cm_entry->x_needrel = FALSE;

			connmgr_sndrel(cm_entry);
			delay(drv_usectohz(1000000));

			mutex_enter(&connmgr_lock);
		}

		/*
		 * If we need to send a T_DISCON_REQ, send one.
		 */
		connmgr_dis_and_wait(cm_entry);

		mutex_exit(&connmgr_lock);

		bzero(&call, sizeof (call));
		cv_init(&call.call_cv, NULL, CV_DEFAULT, NULL);

		connected = connmgr_connect(cm_entry, cm_entry->x_wq,
		    destaddr, addrfmly, &call, &cm_entry->x_tidu_size,
		    reconnect, waitp, nosignal);

		*rpcerr = call.call_err;
		cv_destroy(&call.call_cv);

		mutex_enter(&connmgr_lock);


		if (cm_entry->x_early_disc) {
			/*
			 * We need to check if a disconnect request has come
			 * while we are connected; if so, we need to
			 * set rpcerr->re_status appropriately before returning
			 * NULL to the caller.
			 */
			if (rpcerr->re_status == RPC_SUCCESS)
				rpcerr->re_status = RPC_XPRTFAILED;
			cm_entry->x_connected = FALSE;
		} else
			cm_entry->x_connected = connected;

		/*
		 * There could be a discrepancy here such that
		 * x_early_disc is TRUE yet connected is TRUE as well
		 * and the connection is actually connected. In that case
		 * let's be conservative and declare the connection as not
		 * connected.
		 */

		cm_entry->x_early_disc = FALSE;
		cm_entry->x_needdis = (cm_entry->x_connected == FALSE);


		/*
		 * connmgr_connect() may have given up before the connection
		 * actually timed out.
		 * So ensure that before the next connection attempt we do a
		 * disconnect.
		 */
		cm_entry->x_ctime = lbolt;
		cm_entry->x_thread = FALSE;

		cv_broadcast(&cm_entry->x_conn_cv);

		if (cm_entry->x_connected == FALSE) {
			mutex_exit(&connmgr_lock);
			connmgr_release(cm_entry);
			return (NULL);
		}
	}

	if (srcaddr != NULL) {
		/*
		 * Copy into the handle the
		 * source address of the
		 * connection, which we will use
		 * in case of a later retry.
		 */
		if (srcaddr->len != cm_entry->x_src.len) {
			if (srcaddr->maxlen > 0)
				kmem_free(srcaddr->buf, srcaddr->maxlen);
			srcaddr->buf = kmem_zalloc(cm_entry->x_src.len,
			    KM_SLEEP);
			srcaddr->maxlen = srcaddr->len =
			    cm_entry->x_src.len;
		}
		bcopy(cm_entry->x_src.buf, srcaddr->buf, srcaddr->len);
	}
	cm_entry->x_time = lbolt;
	mutex_exit(&connmgr_lock);
	return (cm_entry);
}

/*
 * If we need to send a T_DISCON_REQ, send one.
 */
static void
connmgr_dis_and_wait(struct cm_xprt *cm_entry)
{
	ASSERT(MUTEX_HELD(&connmgr_lock));
	for (;;) {
		while (cm_entry->x_needdis == TRUE) {
			RPCLOG(8, "connmgr_dis_and_wait: need "
			    "T_DISCON_REQ for connection 0x%p\n",
			    (void *)cm_entry);
			cm_entry->x_needdis = FALSE;
			cm_entry->x_waitdis = TRUE;

			connmgr_snddis(cm_entry);

			mutex_enter(&connmgr_lock);
		}

		if (cm_entry->x_waitdis == TRUE) {
			clock_t curlbolt;
			clock_t timout;

			RPCLOG(8, "connmgr_dis_and_wait waiting for "
			    "T_DISCON_REQ's ACK for connection %p\n",
			    (void *)cm_entry);
			curlbolt = ddi_get_lbolt();

			timout = clnt_cots_min_conntout *
			    drv_usectohz(1000000) + curlbolt;

			/*
			 * The TPI spec says that the T_DISCON_REQ
			 * will get acknowledged, but in practice
			 * the ACK may never get sent. So don't
			 * block forever.
			 */
			(void) cv_timedwait(&cm_entry->x_dis_cv,
			    &connmgr_lock, timout);
		}
		/*
		 * If we got the ACK, break. If we didn't,
		 * then send another T_DISCON_REQ.
		 */
		if (cm_entry->x_waitdis == FALSE) {
			break;
		} else {
			RPCLOG(8, "connmgr_dis_and_wait: did "
			    "not get T_DISCON_REQ's ACK for "
			    "connection %p\n", (void *)cm_entry);
			cm_entry->x_needdis = TRUE;
		}
	}
}

static void
connmgr_cancelconn(struct cm_xprt *cm_entry)
{
	/*
	 * Mark the connection table entry as dead; the next thread that
	 * goes through connmgr_release() will notice this and deal with it.
	 */
	mutex_enter(&connmgr_lock);
	cm_entry->x_dead = TRUE;

	/*
	 * Notify any threads waiting for the connection that it isn't
	 * going to happen.
	 */
	cm_entry->x_thread = FALSE;
	cv_broadcast(&cm_entry->x_conn_cv);
	mutex_exit(&connmgr_lock);

	connmgr_release(cm_entry);
}

static void
connmgr_close(struct cm_xprt *cm_entry)
{
	mutex_enter(&cm_entry->x_lock);
	while (cm_entry->x_ref != 0) {
		/*
		 * Must be a noninterruptible wait.
		 */
		cv_wait(&cm_entry->x_cv, &cm_entry->x_lock);
	}

	if (cm_entry->x_tiptr != NULL) {
		if (cm_entry->x_cb) {
			extern void mir_clear_cbinfo(queue_t *);
			mir_clear_cbinfo(cm_entry->x_wq);
		}
		(void) t_kclose(cm_entry->x_tiptr, 1);
	}

	/*
	 * Remove all tags
	 */
	if (!rpc_is_taglist_empty(cm_entry->x_tags))
		rpc_remove_all_tag(&cm_tag_hd, (void *)cm_entry);

	rpc_destroy_taglist(&cm_entry->x_tags);

	mutex_exit(&cm_entry->x_lock);

	if (cm_entry->x_ksp != NULL) {
		mutex_enter(&connmgr_lock);
		cm_entry->x_ksp->ks_private = NULL;
		mutex_exit(&connmgr_lock);

		/*
		 * Must free the buffer we allocated for the
		 * server address in the update function
		 */
		if (((struct cm_kstat_xprt *)(cm_entry->x_ksp->ks_data))->
		    x_server.value.str.addr.ptr != NULL) {
			kmem_free(((struct cm_kstat_xprt *)(cm_entry->x_ksp->
			    ks_data))->x_server.value.str.addr.ptr,
			    INET6_ADDRSTRLEN);
		}
		kmem_free(cm_entry->x_ksp->ks_data,
		    cm_entry->x_ksp->ks_data_size);
		kstat_delete(cm_entry->x_ksp);
	}

	mutex_destroy(&cm_entry->x_lock);
	cv_destroy(&cm_entry->x_cv);
	cv_destroy(&cm_entry->x_conn_cv);
	cv_destroy(&cm_entry->x_dis_cv);

	if (cm_entry->x_server.buf != NULL)
		kmem_free(cm_entry->x_server.buf, cm_entry->x_server.maxlen);
	if (cm_entry->x_src.buf != NULL)
		kmem_free(cm_entry->x_src.buf, cm_entry->x_src.maxlen);
	cm_entry->x_cb = FALSE;
	kmem_free(cm_entry, sizeof (struct cm_xprt));
}

/*
 * Called by KRPC after sending the call message to release the connection
 * it was using.
 */
static void
connmgr_release(struct cm_xprt *cm_entry)
{
	mutex_enter(&cm_entry->x_lock);
	cm_entry->x_ref--;
	if (cm_entry->x_ref == 0)
		cv_signal(&cm_entry->x_cv);
	mutex_exit(&cm_entry->x_lock);
}

/*
 * Given an open stream, connect to the remote. Returns true if connected,
 * false otherwise.
 */
static bool_t
connmgr_connect(
	struct cm_xprt *cm_entry,
	queue_t *wq,
	struct netbuf *addr,
	int addrfmly,
	calllist_t *e,
	int *tidu_ptr,
	bool_t reconnect,
	const struct timeval *waitp,
	bool_t nosignal)
{
	mblk_t *mp;
	struct T_conn_req *tcr;
	struct T_info_ack *tinfo;
	int interrupted, error;
	int tidu_size, kstat_instance;

	/* if it's a reconnect, flush any lingering data messages */
	if (reconnect)
		(void) putctl1(wq, M_FLUSH, FLUSHRW);

	mp = allocb(sizeof (*tcr) + addr->len, BPRI_LO);
	if (mp == NULL) {
		/*
		 * This is unfortunate, but we need to look up the stats for
		 * this zone to increment the "memory allocation failed"
		 * counter. curproc->p_zone is safe since we're initiating a
		 * connection and not in some strange streams context.
		 */
		struct rpcstat *rpcstat;

		rpcstat = zone_getspecific(rpcstat_zone_key, rpc_zone());
		ASSERT(rpcstat != NULL);

		RPCLOG0(1, "connmgr_connect: cannot alloc mp for "
		    "sending conn request\n");
		COTSRCSTAT_INCR(rpcstat->rpc_cots_client, rcnomem);
		e->call_status = RPC_SYSTEMERROR;
		e->call_reason = ENOSR;
		return (FALSE);
	}

	mp->b_datap->db_type = M_PROTO;
	tcr = (struct T_conn_req *)mp->b_rptr;
	bzero(tcr, sizeof (*tcr));
	tcr->PRIM_type = T_CONN_REQ;
	tcr->DEST_length = addr->len;
	tcr->DEST_offset = sizeof (struct T_conn_req);
	mp->b_wptr = mp->b_rptr + sizeof (*tcr);

	bcopy(addr->buf, mp->b_wptr, tcr->DEST_length);
	mp->b_wptr += tcr->DEST_length;

	RPCLOG(8, "connmgr_connect: sending conn request on queue "
	    "%p", (void *)wq);
	/* Log the call entry, not the queue, to match the "call" label. */
	RPCLOG(8, " call %p\n", (void *)e);
	/*
	 * We use the entry in the handle that is normally used for
	 * waiting for RPC replies to wait for the connection accept.
	 */
	if (clnt_dispatch_send(wq, mp, e, 0, 0) != RPC_SUCCESS) {
		DTRACE_PROBE(krpc__e__connmgr__connect__cantsend);
		freemsg(mp);
		return (FALSE);
	}

	mutex_enter(&clnt_pending_lock);

	/*
	 * We wait for the transport connection to be made, or an
	 * indication that it could not be made.
	 */
	interrupted = 0;

	/*
	 * waitforack should have been called with T_OK_ACK, but the
	 * present implementation needs to be passed T_INFO_ACK to
	 * work correctly.
	 */
	error = waitforack(e, T_INFO_ACK, waitp, nosignal);
	if (error == EINTR)
		interrupted = 1;
	if (zone_status_get(curproc->p_zone) >= ZONE_IS_EMPTY) {
		/*
		 * No time to lose; we essentially have been signaled to
		 * quit.
		 */
		interrupted = 1;
	}
#ifdef RPCDEBUG
	if (error == ETIME)
		RPCLOG0(8, "connmgr_connect: giving up "
		    "on connection attempt; "
		    "clnt_dispatch notifyconn "
		    "diagnostic 'no one waiting for "
		    "connection' should not be "
		    "unexpected\n");
#endif
	if (e->call_prev)
		e->call_prev->call_next = e->call_next;
	else
		clnt_pending = e->call_next;
	if (e->call_next)
		e->call_next->call_prev = e->call_prev;
	mutex_exit(&clnt_pending_lock);

	if (e->call_status != RPC_SUCCESS || error != 0) {
		if (interrupted)
			e->call_status = RPC_INTR;
		else if (error == ETIME)
			e->call_status = RPC_TIMEDOUT;
		else if (error == EPROTO)
			e->call_status = RPC_SYSTEMERROR;

		RPCLOG(8, "connmgr_connect: can't connect, status: "
		    "%s\n", clnt_sperrno(e->call_status));

		if (e->call_reply) {
			freemsg(e->call_reply);
			e->call_reply = NULL;
		}

		return (FALSE);
	}
	/*
	 * The result of the "connection accept" is a T_info_ack
	 * in the call_reply field.
	 */
	ASSERT(e->call_reply != NULL);
	mp = e->call_reply;
	e->call_reply = NULL;
	tinfo = (struct T_info_ack *)mp->b_rptr;

	tidu_size = tinfo->TIDU_size;
	tidu_size -= (tidu_size % BYTES_PER_XDR_UNIT);
	if (tidu_size > COTS_DEFAULT_ALLOCSIZE || (tidu_size <= 0))
		tidu_size = COTS_DEFAULT_ALLOCSIZE;
	*tidu_ptr = tidu_size;

	freemsg(mp);

	/*
	 * Set up the pertinent options. NODELAY is so the transport doesn't
	 * buffer up RPC messages on either end. This may not be valid for
	 * all transports. Failure to set this option is not cause to
	 * bail out so we return success anyway. Note that lack of NODELAY
	 * or some other way to flush the message on both ends will cause
	 * lots of retries and terrible performance.
	 */
	if (addrfmly == AF_INET || addrfmly == AF_INET6) {
		(void) connmgr_setopt(wq, IPPROTO_TCP, TCP_NODELAY, e);
		if (e->call_status == RPC_XPRTFAILED)
			return (FALSE);
	}

	/*
	 * Since we have a connection, we now need to figure out if
	 * we need to create a kstat. If x_ksp is not NULL then we
	 * are reusing a connection and so we do not need to create
	 * another kstat -- let's just return.
	 */
	if (cm_entry->x_ksp != NULL)
		return (TRUE);

	/*
	 * We need to increment rpc_kstat_instance atomically to prevent
	 * two kstats being created with the same instance.
	 */
	kstat_instance = atomic_add_32_nv((uint32_t *)&rpc_kstat_instance, 1);

	if ((cm_entry->x_ksp = kstat_create_zone("unix", kstat_instance,
	    "rpc_cots_connections", "rpc", KSTAT_TYPE_NAMED,
	    (uint_t)(sizeof (cm_kstat_xprt_t) / sizeof (kstat_named_t)),
	    KSTAT_FLAG_VIRTUAL, cm_entry->x_zoneid)) == NULL) {
		return (TRUE);
	}

	cm_entry->x_ksp->ks_lock = &connmgr_lock;
	cm_entry->x_ksp->ks_private = cm_entry;
	cm_entry->x_ksp->ks_data_size = ((INET6_ADDRSTRLEN * sizeof (char))
	    + sizeof (cm_kstat_template));
	cm_entry->x_ksp->ks_data = kmem_alloc(cm_entry->x_ksp->ks_data_size,
	    KM_SLEEP);
	bcopy(&cm_kstat_template, cm_entry->x_ksp->ks_data,
	    cm_entry->x_ksp->ks_data_size);
	((struct cm_kstat_xprt *)(cm_entry->x_ksp->ks_data))->
	    x_server.value.str.addr.ptr =
	    kmem_alloc(INET6_ADDRSTRLEN, KM_SLEEP);

	cm_entry->x_ksp->ks_update = conn_kstat_update;
	kstat_install(cm_entry->x_ksp);
	return (TRUE);
}

/*
 * Called by connmgr_connect to set an option on the new stream.
 */
static bool_t
connmgr_setopt(queue_t *wq, int level, int name, calllist_t *e)
{
	mblk_t *mp;
	struct opthdr *opt;
	struct T_optmgmt_req *tor;
	struct timeval waitp;
	int error;

	mp = allocb(sizeof (struct T_optmgmt_req) + sizeof (struct opthdr) +
	    sizeof (int), BPRI_LO);
	if (mp == NULL) {
		RPCLOG0(1, "connmgr_setopt: cannot alloc mp for option "
		    "request\n");
		return (FALSE);
	}

	mp->b_datap->db_type = M_PROTO;
	tor = (struct T_optmgmt_req *)(mp->b_rptr);
	tor->PRIM_type = T_SVR4_OPTMGMT_REQ;
	tor->MGMT_flags = T_NEGOTIATE;
	tor->OPT_length = sizeof (struct opthdr) + sizeof (int);
	tor->OPT_offset = sizeof (struct T_optmgmt_req);

	opt = (struct opthdr *)(mp->b_rptr + sizeof (struct T_optmgmt_req));
	opt->level = level;
	opt->name = name;
	opt->len = sizeof (int);
	*(int *)((char *)opt + sizeof (*opt)) = 1;
	mp->b_wptr += sizeof (struct T_optmgmt_req) + sizeof (struct opthdr) +
	    sizeof (int);

	/*
	 * We will use this connection regardless
	 * of whether or not the option is settable.
	 */
	if (clnt_dispatch_send(wq, mp, e, 0, 0) != RPC_SUCCESS) {
		DTRACE_PROBE(krpc__e__connmgr__setopt__cantsend);
		freemsg(mp);
		return (FALSE);
	}

	mutex_enter(&clnt_pending_lock);

	waitp.tv_sec = clnt_cots_min_conntout;
	waitp.tv_usec = 0;
	error = waitforack(e, T_OPTMGMT_ACK, &waitp, 1);

	if (e->call_prev)
		e->call_prev->call_next = e->call_next;
	else
		clnt_pending = e->call_next;
	if (e->call_next)
		e->call_next->call_prev = e->call_prev;
	mutex_exit(&clnt_pending_lock);

	if (e->call_reply != NULL) {
		freemsg(e->call_reply);
		e->call_reply = NULL;
	}

	if (e->call_status != RPC_SUCCESS || error != 0) {
		RPCLOG(1, "connmgr_setopt: can't set option: %d\n", name);
		return (FALSE);
	}
	RPCLOG(8, "connmgr_setopt: successfully set option: %d\n", name);
	return (TRUE);
}

#ifdef DEBUG

/*
 * This is a knob to let us force code coverage in allocation failure
 * case.
 */
static int connmgr_failsnd;
#define	CONN_SND_ALLOC(Size, Pri)	\
	((connmgr_failsnd-- > 0) ? NULL : allocb(Size, Pri))

#else

#define	CONN_SND_ALLOC(Size, Pri)	allocb(Size, Pri)

#endif

/*
 * Sends an orderly release on the specified queue.
 * Entered with connmgr_lock.
 * Exited without connmgr_lock.
 */
static void
connmgr_sndrel(struct cm_xprt *cm_entry)
{
	struct T_ordrel_req *torr;
	mblk_t *mp;
	queue_t *q = cm_entry->x_wq;

	ASSERT(MUTEX_HELD(&connmgr_lock));
	mp = CONN_SND_ALLOC(sizeof (struct T_ordrel_req), BPRI_LO);
	if (mp == NULL) {
		cm_entry->x_needrel = TRUE;
		mutex_exit(&connmgr_lock);
		RPCLOG(1, "connmgr_sndrel: cannot alloc mp for sending ordrel "
		    "to queue %p\n", (void *)q);
		return;
	}
	mutex_exit(&connmgr_lock);

	mp->b_datap->db_type = M_PROTO;
	torr = (struct T_ordrel_req *)(mp->b_rptr);
	torr->PRIM_type = T_ORDREL_REQ;
	mp->b_wptr = mp->b_rptr + sizeof (struct T_ordrel_req);

	RPCLOG(8, "connmgr_sndrel: sending ordrel to queue %p\n", (void *)q);
	put(q, mp);
}

/*
 * Sends a disconnect on the specified queue.
 * Entered with connmgr_lock. Exited without connmgr_lock.
 */
static void
connmgr_snddis(struct cm_xprt *cm_entry)
{
	struct T_discon_req *tdis;
	mblk_t *mp;
	queue_t *q = cm_entry->x_wq;

	ASSERT(MUTEX_HELD(&connmgr_lock));
	mp = CONN_SND_ALLOC(sizeof (*tdis), BPRI_LO);
	if (mp == NULL) {
		cm_entry->x_needdis = TRUE;
		mutex_exit(&connmgr_lock);
		RPCLOG(1, "connmgr_snddis: cannot alloc mp for sending discon "
		    "to queue %p\n", (void *)q);
		return;
	}
	mutex_exit(&connmgr_lock);

	mp->b_datap->db_type = M_PROTO;
	tdis = (struct T_discon_req *)mp->b_rptr;
	tdis->PRIM_type = T_DISCON_REQ;
	mp->b_wptr = mp->b_rptr + sizeof (*tdis);

	RPCLOG(8, "connmgr_snddis: sending discon to queue %p\n", (void *)q);
	put(q, mp);
}

/*
 * Adds a callback connection entry to the head of the cb_cm_hd list.
 */
void
connmgr_cb_add(struct cm_xprt *cm_entry)
{
	mutex_enter(&connmgr_cb_lock);
	cm_entry->x_next = cb_cm_hd;
	cb_cm_hd = cm_entry;
	mutex_exit(&connmgr_cb_lock);
}

/*
 * Search the cb_cm_hd list for a
 * cm_xprt whose write queue matches wq.
 * Adds a reference to the xprt.
 */
static struct cm_xprt *
connmgr_cb_lookup(queue_t *wq)
{
	struct cm_xprt *cm_entry;

	mutex_enter(&connmgr_cb_lock);
	cm_entry = cb_cm_hd;

	while (cm_entry) {
		if (cm_entry->x_wq == wq) {
			CONN_HOLD(cm_entry);
			break;
		}
		cm_entry = cm_entry->x_next;
	}
	mutex_exit(&connmgr_cb_lock);

	return (cm_entry);
}

/*
 * Creates a connection manager entry for a callback connection
 * and adds it to the callback connection list.
 */

/* ARGSUSED */
int
connmgr_cb_create(void *tp_handle, rpcprog_t prog, rpcvers_t vers,
    int family, void *tag)
{
	struct cm_xprt *cm_entry;
	uint_t addr_len;
	zoneid_t zoneid = rpc_zoneid();
	vnode_t *vp;
	struct sockaddr_storage *sa;
	char *devnam;
	SVCMASTERXPRT *mxprt = (SVCMASTERXPRT *)tp_handle;

	/*
	 * If there is already a cm_xprt for the same transport
	 * then just add the tag.
	 */
	if ((cm_entry = connmgr_cb_lookup(mxprt->xp_wq)) != NULL) {
		/*
		 * Do nothing if the same tag already exists
		 * for this cm_xprt, else add the tag
		 */
		if (rpc_cmp_tag(cm_entry->x_tags, tag) == FALSE)
			rpc_add_tag(
			    &cm_tag_hd, (void *)cm_entry, (void *)tag);

		connmgr_release(cm_entry);

		return (0);
	}

	cm_entry = (struct cm_xprt *)
	    kmem_zalloc(sizeof (struct cm_xprt), KM_SLEEP);

	cm_entry->x_tiptr = NULL;

	addr_len = mxprt->xp_rtaddr.len;
	cm_entry->x_server.buf = kmem_zalloc(addr_len, KM_SLEEP);
	bcopy(mxprt->xp_rtaddr.buf, cm_entry->x_server.buf, addr_len);
	cm_entry->x_server.len = cm_entry->x_server.maxlen = addr_len;

	addr_len = mxprt->xp_lcladdr.len;
	cm_entry->x_src.buf = kmem_zalloc(addr_len, KM_SLEEP);
	bcopy(mxprt->xp_lcladdr.buf, cm_entry->x_src.buf, addr_len);
	cm_entry->x_src.len = cm_entry->x_src.maxlen = addr_len;

	cm_entry->x_state_flags = X_CONNECTED;
	cm_entry->x_ref = 0;
	cm_entry->x_family = family;
	cm_entry->x_zoneid = zoneid;
	mutex_init(&cm_entry->x_lock, NULL, MUTEX_DEFAULT, NULL);
	cv_init(&cm_entry->x_cv, NULL, CV_DEFAULT, NULL);
	cv_init(&cm_entry->x_conn_cv, NULL, CV_DEFAULT, NULL);
	cm_entry->x_wq = mxprt->xp_wq;
	cm_entry->x_tidu_size = mxprt->xp_msg_size;
	cm_entry->x_prog = prog;
	cm_entry->x_cb = FALSE;

	sa = (struct sockaddr_storage *)mxprt->xp_lcladdr.buf;
	switch (sa->ss_family) {
	case AF_INET:
		devnam = "/dev/tcp";
		break;
	case AF_INET6:
		devnam = "/dev/tcp6";
		break;
	default:
		goto errout;
	}

	if (lookupname(devnam, UIO_SYSSPACE, FOLLOW, NULLVPP, &vp) != 0) {
		goto errout;
	}

	if (vp->v_type != VCHR) {
		VN_RELE(vp);
		goto errout;
	}

	cm_entry->x_rdev = vp->v_rdev;

	VN_RELE(vp);

	/* XXX do we need kstats */
	cm_entry->x_ksp = NULL;
	rpc_init_taglist(&cm_entry->x_tags);
	rpc_add_tag(&cm_tag_hd, (void *)cm_entry, tag);

	connmgr_cb_add(cm_entry);

	return (0);

errout:
	connmgr_close(cm_entry);
	return (1);
}

/*
 * We end up here if there is a connection disconnect.
 * The cm_entry is taken off the list. All tags are removed.
 */
void
connmgr_cb_destroy(queue_t *wq)
{
	struct cm_xprt *cm_entry, *cm_prev;

	mutex_enter(&connmgr_cb_lock);
	cm_entry = cb_cm_hd;
	cm_prev = NULL;

	while (cm_entry) {
		if (cm_entry->x_wq == wq) {
			if (cm_prev)
				cm_prev->x_next = cm_entry->x_next;
			else
				cb_cm_hd = cm_entry->x_next;
			cm_entry->x_next = NULL;
			break;
		}
		cm_prev = cm_entry;
		cm_entry = cm_entry->x_next;
	}
	mutex_exit(&connmgr_cb_lock);

	if (cm_entry)
		connmgr_close(cm_entry);
}

/*
 * Sets up the entry for receiving replies, and calls rpcmod's write put proc
 * (through put) to send the call.
 */
static int
clnt_dispatch_send(queue_t *q, mblk_t *mp, calllist_t *e, uint_t xid,
    uint_t queue_flag)
{
	ASSERT(e != NULL);

	e->call_status = RPC_TIMEDOUT;	/* optimistic, eh? */
	e->call_reason = 0;
	e->call_wq = q;
	e->call_xid = xid;
	e->call_notified = FALSE;

	if (!canput(q)) {
		e->call_status = RPC_CANTSEND;
		e->call_reason = EIO;
		return (RPC_CANTSEND);
	}

	/*
	 * If queue_flag is set then the calllist_t is already on the hash
	 * queue. In this case just send the message and return.
	 */
	if (queue_flag) {
		put(q, mp);
		return (RPC_SUCCESS);
	}

	/*
	 * Set up calls for RPC requests (with XID != 0) on the hash
	 * queue for fast lookups and place other calls (i.e.
	 * connection management) on the linked list.
	 */
	if (xid != 0) {
		RPCLOG(64, "clnt_dispatch_send: putting xid 0x%x on "
		    "dispatch list\n", xid);
		e->call_hash = call_hash(xid, clnt_cots_hash_size);
		e->call_bucket = &cots_call_ht[e->call_hash];
		call_table_enter(e);
	} else {
		mutex_enter(&clnt_pending_lock);
		if (clnt_pending)
			clnt_pending->call_prev = e;
		e->call_next = clnt_pending;
		e->call_prev = NULL;
		clnt_pending = e;
		mutex_exit(&clnt_pending_lock);
	}

	put(q, mp);
	return (RPC_SUCCESS);
}

/*
 * Called by rpcmod to notify a client with a clnt_pending call that its reply
 * has arrived. If we can't find a client waiting for this reply, we log
 * the error and return.
 */
bool_t
clnt_dispatch_notify(mblk_t *mp, zoneid_t zoneid, uint32_t xid)
{
	calllist_t *e = NULL;
	call_table_t *chtp;
	uint_t hash;

	/*
	 * This is a standard reply, handle it normally
	 */

	hash = call_hash(xid, clnt_cots_hash_size);
	chtp = &cots_call_ht[hash];
	/* call_table_find returns with the hash bucket locked */
	call_table_find(chtp, xid, e);

	if (e != NULL) {
		/*
		 * Found thread waiting for this reply
		 */
		mutex_enter(&e->call_lock);

		/*
		 * verify that the reply is coming in on
		 * the same zone that it was sent from.
		 */
		if (e->call_zoneid != zoneid) {
			mutex_exit(&e->call_lock);
			mutex_exit(&chtp->ct_lock);
			return (FALSE);
		}

		if (e->call_reply)
			/*
			 * This can happen under the following scenario:
			 * clnt_cots_kcallit() times out on the response,
			 * rfscall() repeats the CLNT_CALL() with
			 * the same xid, clnt_cots_kcallit() sends the retry,
			 * thereby putting the clnt handle on the pending list,
			 * the first response arrives, signalling the thread
			 * in clnt_cots_kcallit().
			 * Before that thread is
			 * dispatched, the second response arrives as well,
			 * and clnt_dispatch_notify still finds the handle on
			 * the pending list, with call_reply set. So free the
			 * old reply now.
			 *
			 * It is also possible for a response intended for
			 * an RPC call with a different xid to reside here.
			 * This can happen if the thread that owned this
			 * client handle prior to the current owner bailed
			 * out and left its call record on the dispatch
			 * queue. A window exists where the response can
			 * arrive before the current owner dispatches its
			 * RPC call.
			 *
			 * In any case, this is the very last point where we
			 * can safely check the call_reply field before
			 * placing the new response there.
			 */
			freemsg(e->call_reply);
		e->call_reply = mp;
		e->call_status = RPC_SUCCESS;
		e->call_notified = TRUE;
		cv_signal(&e->call_cv);
		mutex_exit(&e->call_lock);
		mutex_exit(&chtp->ct_lock);
		return (TRUE);
	} else {
		zone_t *zone;
		struct rpcstat *rpcstat;

		mutex_exit(&chtp->ct_lock);
		RPCLOG(65, "clnt_dispatch_notify: no caller for reply 0x%x\n",
		    xid);
		/*
		 * This is unfortunate, but we need to look up the zone so we
		 * can increment its "rcbadxids" counter.
		 */
		zone = zone_find_by_id(zoneid);
		if (zone == NULL) {
			/*
			 * The zone went away...
			 */
			return (FALSE);
		}
		rpcstat = zone_getspecific(rpcstat_zone_key, zone);
		if (zone_status_get(zone) >= ZONE_IS_SHUTTING_DOWN) {
			/*
			 * Not interested
			 */
			zone_rele(zone);
			return (FALSE);
		}
		COTSRCSTAT_INCR(rpcstat->rpc_cots_client, rcbadxids);
		zone_rele(zone);
	}
	return (FALSE);
}

/*
 * Called by rpcmod when a non-data indication arrives. The ones in which we
 * are interested are connection indications and options acks.
 * We dispatch based on the queue the indication came in on. If we are not
 * interested in what came in, we return false to rpcmod, who will then
 * pass it upstream.
 */
bool_t
clnt_dispatch_notifyconn(queue_t *q, mblk_t *mp)
{
	calllist_t *e;
	int type;

	ASSERT((q->q_flag & QREADR) == 0);

	type = ((union T_primitives *)mp->b_rptr)->type;
	RPCLOG(8, "clnt_dispatch_notifyconn: prim type: [%s]\n",
	    rpc_tpiprim2name(type));
	mutex_enter(&clnt_pending_lock);
	for (e = clnt_pending; /* NO CONDITION */; e = e->call_next) {
		if (e == NULL) {
			mutex_exit(&clnt_pending_lock);
			RPCLOG(1, "clnt_dispatch_notifyconn: no one waiting "
			    "for connection on queue 0x%p\n", (void *)q);
			return (FALSE);
		}
		if (e->call_wq == q)
			break;
	}

	switch (type) {
	case T_CONN_CON:
		/*
		 * The transport is now connected, send a T_INFO_REQ to get
		 * the tidu size.
		 */
		mutex_exit(&clnt_pending_lock);
		ASSERT(mp->b_datap->db_lim - mp->b_datap->db_base >=
		    sizeof (struct T_info_req));
		mp->b_rptr = mp->b_datap->db_base;
		((union T_primitives *)mp->b_rptr)->type = T_INFO_REQ;
		mp->b_wptr = mp->b_rptr + sizeof (struct T_info_req);
		mp->b_datap->db_type = M_PCPROTO;
		put(q, mp);
		return (TRUE);
	case T_INFO_ACK:
	case T_OPTMGMT_ACK:
		e->call_status = RPC_SUCCESS;
		e->call_reply = mp;
		e->call_notified = TRUE;
		cv_signal(&e->call_cv);
		break;
	case T_ERROR_ACK:
		e->call_status = RPC_CANTCONNECT;
		e->call_reply = mp;
		e->call_notified = TRUE;
		cv_signal(&e->call_cv);
		break;
	case T_OK_ACK:
		/*
		 * Great, but we are really waiting for a T_CONN_CON
		 */
		freemsg(mp);
		break;
	default:
		mutex_exit(&clnt_pending_lock);
		RPCLOG(1, "clnt_dispatch_notifyconn: bad type %d\n", type);
		return (FALSE);
	}

	mutex_exit(&clnt_pending_lock);
	return (TRUE);
}

/*
 * Called by rpcmod when the transport is (or should be) going away. Informs
 * all callers waiting for replies and marks the entry in the connection
 * manager's list as unconnected, and either closing (close handshake in
 * progress) or dead.
 */
void
clnt_dispatch_notifyall(queue_t *q, int32_t msg_type, int32_t reason)
{
	calllist_t *e;
	call_table_t *ctp;
	struct cm_xprt *cm_entry;
	int have_connmgr_lock;
	int i;

	ASSERT((q->q_flag & QREADR) == 0);

	RPCLOG(1, "clnt_dispatch_notifyall on queue %p", (void *)q);
	RPCLOG(1, " received a notification prim type [%s]",
	    rpc_tpiprim2name(msg_type));
	RPCLOG(1, " and reason %d\n", reason);

	/*
	 * Find the transport entry in the connection manager's list, close
	 * the transport and delete the entry. In the case where rpcmod's
	 * idle timer goes off, it sends us a T_ORDREL_REQ, indicating we
	 * should gracefully close the connection.
	 */
	have_connmgr_lock = 1;
	mutex_enter(&connmgr_lock);
	for (cm_entry = cm_hd; cm_entry; cm_entry = cm_entry->x_next) {
		ASSERT(cm_entry != cm_entry->x_next);
		if (cm_entry->x_wq == q) {
			ASSERT(MUTEX_HELD(&connmgr_lock));
			ASSERT(have_connmgr_lock == 1);
			switch (msg_type) {
			case T_ORDREL_REQ:

				if (cm_entry->x_dead) {
					RPCLOG(1, "idle timeout on dead "
					    "connection: %p\n",
					    (void *)cm_entry);
					if (clnt_stop_idle != NULL)
						(*clnt_stop_idle)(q);
					break;
				}

				/*
				 * Only mark the connection as dead if it is
				 * connected and idle.
				 * An unconnected connection has probably
				 * gone idle because the server is down,
				 * and when it comes back up there will be
				 * retries that need to use that connection.
				 */
				if (cm_entry->x_connected ||
				    cm_entry->x_doomed) {
					if (cm_entry->x_ordrel) {
						if (cm_entry->x_closing ==
						    TRUE) {
							/*
							 * The connection is
							 * obviously wedged due
							 * to a bug or problem
							 * with the transport.
							 * Mark it as dead.
							 * Otherwise we can
							 * leak connections.
							 */
							cm_entry->x_dead = TRUE;
							mutex_exit(
							    &connmgr_lock);
							have_connmgr_lock = 0;
							if (clnt_stop_idle !=
							    NULL)
								(*clnt_stop_idle)(q);
							break;
						}
						cm_entry->x_closing = TRUE;
						connmgr_sndrel(cm_entry);
						have_connmgr_lock = 0;
					} else {
						cm_entry->x_dead = TRUE;
						mutex_exit(&connmgr_lock);
						have_connmgr_lock = 0;
						if (clnt_stop_idle != NULL)
							(*clnt_stop_idle)(q);
					}
				} else {
					/*
					 * We don't mark the connection
					 * as dead, but we turn off the
					 * idle timer.
					 */
					mutex_exit(&connmgr_lock);
					have_connmgr_lock = 0;
					if (clnt_stop_idle != NULL)
						(*clnt_stop_idle)(q);
					RPCLOG(1, "clnt_dispatch_notifyall:"
					    " ignoring timeout from rpcmod"
					    " (q %p) because we are not"
					    " connected\n", (void *)q);
				}
				break;
			case T_ORDREL_IND:
				/*
				 * If this entry is marked closing, then we are
				 * completing a close handshake, and the
				 * connection is dead. Otherwise, the server is
				 * trying to close. Since the server will not
				 * be sending any more RPC replies, we abort
				 * the connection, including flushing
				 * any RPC requests that are in-transit.
				 */
				if (cm_entry->x_closing) {
					cm_entry->x_dead = TRUE;
					mutex_exit(&connmgr_lock);
					have_connmgr_lock = 0;
					if (clnt_stop_idle != NULL)
						(*clnt_stop_idle)(q);
				} else {
					/*
					 * if we're getting a disconnect
					 * before we've finished our
					 * connect attempt, mark it for
					 * later processing
					 */
					if (cm_entry->x_thread)
						cm_entry->x_early_disc = TRUE;
					else
						cm_entry->x_connected = FALSE;
					cm_entry->x_waitdis = TRUE;
					connmgr_snddis(cm_entry);
					have_connmgr_lock = 0;
				}
				break;

			case T_ERROR_ACK:
			case T_OK_ACK:
				cm_entry->x_waitdis = FALSE;
				cv_signal(&cm_entry->x_dis_cv);
				mutex_exit(&connmgr_lock);
				return;

			case T_DISCON_REQ:
				if (cm_entry->x_thread)
					cm_entry->x_early_disc = TRUE;
				else
					cm_entry->x_connected = FALSE;
				cm_entry->x_waitdis = TRUE;

				connmgr_snddis(cm_entry);
				have_connmgr_lock = 0;
				break;

			case T_DISCON_IND:
			default:
				/*
				 * if we're getting a disconnect before
				 * we've finished our connect attempt,
				 * mark it for later processing
				 */
				if (cm_entry->x_closing) {
					cm_entry->x_dead = TRUE;
					mutex_exit(&connmgr_lock);
					have_connmgr_lock = 0;
					if (clnt_stop_idle != NULL)
						(*clnt_stop_idle)(q);
				} else {
					if (cm_entry->x_thread) {
						cm_entry->x_early_disc = TRUE;
					} else {
						cm_entry->x_dead = TRUE;
						cm_entry->x_connected = FALSE;
					}
				}
				break;
			}
			break;
		}
	}

	if (have_connmgr_lock)
		mutex_exit(&connmgr_lock);

	if (msg_type == T_ERROR_ACK || msg_type == T_OK_ACK) {
		RPCLOG(1, "clnt_dispatch_notifyall: (wq %p) could not find "
		    "connmgr entry for discon ack\n", (void *)q);
		return;
	}

	/*
	 * Then kick all the clnt_pending calls out of their wait.
	 * There should be no clnt_pending calls in the case of rpcmod's idle
	 * timer firing.
	 */
	for (i = 0; i < clnt_cots_hash_size; i++) {
		ctp = &cots_call_ht[i];
		mutex_enter(&ctp->ct_lock);
		for (e = ctp->ct_call_next;
		    e != (calllist_t *)ctp;
		    e = e->call_next) {
			if (e->call_wq == q && e->call_notified == FALSE) {
				RPCLOG(1,
				    "clnt_dispatch_notifyall for queue %p ",
				    (void *)q);
				RPCLOG(1, "aborting clnt_pending call %p\n",
				    (void *)e);

				if (msg_type == T_DISCON_IND)
					e->call_reason = reason;
				e->call_notified = TRUE;
				e->call_status = RPC_XPRTFAILED;
				cv_signal(&e->call_cv);
			}
		}
		mutex_exit(&ctp->ct_lock);
	}

	mutex_enter(&clnt_pending_lock);
	for (e = clnt_pending; e; e = e->call_next) {
		/*
		 * Only signal those RPC handles that haven't been
		 * signalled yet. Otherwise we can get a bogus call_reason.
		 * This can happen if thread A is making a call over a
		 * connection. If the server is killed, it will cause
		 * reset, and reason will default to EIO as a result of
		 * a T_ORDREL_IND. Thread B then attempts to recreate
		 * the connection but gets a T_DISCON_IND. If we set the
		 * call_reason code for all threads, then if thread A
		 * hasn't been dispatched yet, it will get the wrong
		 * reason. The bogus call_reason can make it harder to
		 * discriminate between calls that fail because the
		 * connection attempt failed versus those where the call
		 * may have been executed on the server.
		 */
		if (e->call_wq == q && e->call_notified == FALSE) {
			RPCLOG(1, "clnt_dispatch_notifyall for queue %p ",
			    (void *)q);
			RPCLOG(1, "aborting clnt_pending call %p\n",
			    (void *)e);

			if (msg_type == T_DISCON_IND)
				e->call_reason = reason;
			e->call_notified = TRUE;
			/*
			 * Let the caller time out; otherwise it will retry
			 * immediately.
			 */
			e->call_status = RPC_XPRTFAILED;

			/*
			 * We used to just signal those threads
			 * waiting for a connection (call_xid = 0).
			 * That meant that threads waiting for a response
			 * waited till their timeout expired. This
			 * could be a long time if they've specified a
			 * maximum timeout (2^31 - 1). So we
			 * signal all threads now.
			 */
			cv_signal(&e->call_cv);
		}
	}
	mutex_exit(&clnt_pending_lock);
}


/*ARGSUSED*/
/*
 * After resuming a system that's been suspended for longer than the
 * NFS server's idle timeout (svc_idle_timeout for Solaris 2), rfscall()
 * generates "NFS server X not responding" and "NFS server X ok" messages;
 * here we reset inet connections to cause a re-connect and avoid those
 * NFS messages. See 4045054.
 */
boolean_t
connmgr_cpr_reset(void *arg, int code)
{
	struct cm_xprt *cxp;

	if (code == CB_CODE_CPR_CHKPT)
		return (B_TRUE);

	if (mutex_tryenter(&connmgr_lock) == 0)
		return (B_FALSE);
	for (cxp = cm_hd; cxp; cxp = cxp->x_next) {
		if ((cxp->x_family == AF_INET || cxp->x_family == AF_INET6) &&
		    cxp->x_connected == TRUE) {
			if (cxp->x_thread)
				cxp->x_early_disc = TRUE;
			else
				cxp->x_connected = FALSE;
			cxp->x_needdis = TRUE;
		}
	}
	mutex_exit(&connmgr_lock);
	return (B_TRUE);
}

void
clnt_cots_stats_init(zoneid_t zoneid, struct rpc_cots_client **statsp)
{

	*statsp = (struct rpc_cots_client *)rpcstat_zone_init_common(zoneid,
	    "unix", "rpc_cots_client", (const kstat_named_t *)&cots_rcstat_tmpl,
	    sizeof (cots_rcstat_tmpl));
}

void
clnt_cots_stats_fini(zoneid_t zoneid, struct rpc_cots_client **statsp)
{
	rpcstat_zone_fini_common(zoneid, "unix", "rpc_cots_client");
	kmem_free(*statsp, sizeof (cots_rcstat_tmpl));
}

void
clnt_cots_init(void)
{
	mutex_init(&connmgr_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&connmgr_cb_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&clnt_pending_lock, NULL, MUTEX_DEFAULT, NULL);

	rpc_taghd_init(&cm_tag_hd,
	    offsetof(struct cm_xprt, x_tags));

	if (clnt_cots_hash_size < DEFAULT_MIN_HASH_SIZE)
		clnt_cots_hash_size = DEFAULT_MIN_HASH_SIZE;

	cots_call_ht = call_table_init(clnt_cots_hash_size);
	zone_key_create(&zone_cots_key, NULL, NULL, clnt_zone_destroy);
}

void
clnt_cots_fini(void)
{
	rpc_taghd_destroy(&cm_tag_hd);
	(void) zone_key_delete(zone_cots_key);
}

/*
 * Wait for TPI ack, returns success only if expected ack is received
 * within timeout period.
 */

static int
waitforack(calllist_t *e, t_scalar_t ack_prim, const struct timeval *waitp,
    bool_t nosignal)
{
	union T_primitives *tpr;
	clock_t timout;
	int cv_stat = 1;

	ASSERT(MUTEX_HELD(&clnt_pending_lock));
	while (e->call_reply == NULL) {
		if (waitp != NULL) {
			timout = waitp->tv_sec * drv_usectohz(MICROSEC) +
			    drv_usectohz(waitp->tv_usec) + lbolt;
			if (nosignal)
				cv_stat = cv_timedwait(&e->call_cv,
				    &clnt_pending_lock, timout);
			else
				cv_stat = cv_timedwait_sig(&e->call_cv,
				    &clnt_pending_lock, timout);
		} else {
			if (nosignal)
				cv_wait(&e->call_cv, &clnt_pending_lock);
			else
				cv_stat = cv_wait_sig(&e->call_cv,
				    &clnt_pending_lock);
		}
		if (cv_stat == -1)
			return (ETIME);
		if (cv_stat == 0)
			return (EINTR);
		/*
		 * if we received an error from the server and we know a reply
		 * is not going to be sent, do not wait for the full timeout,
		 * return now.
		 */
		if (e->call_status == RPC_XPRTFAILED)
			return (e->call_reason);
	}
	tpr = (union T_primitives *)e->call_reply->b_rptr;
	if (tpr->type == ack_prim)
		return (0); /* Success */

	if (tpr->type == T_ERROR_ACK) {
		if (tpr->error_ack.TLI_error == TSYSERR)
			return (tpr->error_ack.UNIX_error);
		else
			return (t_tlitosyserr(tpr->error_ack.TLI_error));
	}

	return (EPROTO); /* unknown or unexpected primitive */
}
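
The clnt_pending list handling that clnt_dispatch_send() (insert at head for xid 0) and connmgr_setopt() (unlink after the wait) perform above can be modelled in user space roughly as follows. This is only an illustrative sketch: `call_t`, `pending`, `pending_insert()` and `pending_remove()` are hypothetical names that do not exist in the kernel source, locking with clnt_pending_lock is omitted, and clearing the links after removal is an addition the kernel code does not make.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for calllist_t; only the link fields are modelled. */
typedef struct call {
	struct call *call_next;
	struct call *call_prev;
} call_t;

/* Hypothetical stand-in for the global clnt_pending list head. */
static call_t *pending;

/*
 * Push e onto the head of the list, mirroring the xid == 0 branch of
 * clnt_dispatch_send(). The kernel does this under clnt_pending_lock.
 */
static void
pending_insert(call_t *e)
{
	assert(e != NULL);
	if (pending != NULL)
		pending->call_prev = e;
	e->call_next = pending;
	e->call_prev = NULL;
	pending = e;
}

/*
 * Unlink e from wherever it sits in the list, mirroring what
 * connmgr_setopt() does after waitforack() returns.
 */
static void
pending_remove(call_t *e)
{
	assert(e != NULL);
	if (e->call_prev != NULL)
		e->call_prev->call_next = e->call_next;
	else
		pending = e->call_next;
	if (e->call_next != NULL)
		e->call_next->call_prev = e->call_prev;
	/* Not in the kernel code: clear stale links for easier debugging. */
	e->call_next = NULL;
	e->call_prev = NULL;
}
```

Because each record carries both links, removal needs no list walk, which is presumably why connmgr_setopt() can unlink its entry in constant time while holding the lock only briefly.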