   1 .\" Copyright (c) 2010 The FreeBSD Foundation
   2 .\" Copyright (c) 2010-2012 Pawel Jakub Dawidek <pawel@dawidek.net>
   3 .\" All rights reserved.
   4 .\"
   5 .\" This documentation was written by Pawel Jakub Dawidek under sponsorship from
   6 .\" the FreeBSD Foundation.
   7 .\"
   8 .\" Redistribution and use in source and binary forms, with or without
   9 .\" modification, are permitted provided that the following conditions
  10 .\" are met:
  11 .\" 1. Redistributions of source code must retain the above copyright
  12 .\"    notice, this list of conditions and the following disclaimer.
  13 .\" 2. Redistributions in binary form must reproduce the above copyright
  14 .\"    notice, this list of conditions and the following disclaimer in the
  15 .\"    documentation and/or other materials provided with the distribution.
  16 .\"
  17 .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
  18 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  19 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  20 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
  21 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  22 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  23 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  24 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  25 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  26 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  27 .\" SUCH DAMAGE.
  28 .\"
  29 .\" $FreeBSD$
  30 .\"
  31 .Dd January 25, 2012
  32 .Dt HAST.CONF 5
  33 .Os
  34 .Sh NAME
  35 .Nm hast.conf
  36 .Nd configuration file for the
  37 .Xr hastd 8
  38 daemon and the
  39 .Xr hastctl 8
  40 utility
  41 .Sh DESCRIPTION
  42 The
  43 .Nm
  44 file is used by both the
  45 .Xr hastd 8
  46 daemon
  47 and the
  48 .Xr hastctl 8
  49 control utility.
  50 The configuration file is designed so that exactly the same file can be
  51 (and should be) used on both HAST nodes.
  52 Every line starting with # is treated as a comment and ignored.
  53 .Sh CONFIGURATION FILE SYNTAX
  54 The general syntax of the
  55 .Nm
  56 file is as follows:
  57 .Bd -literal -offset indent
  58 # Global section
  59 control <addr>
  60 listen <addr>
  61 replication <mode>
  62 checksum <algorithm>
  63 compression <algorithm>
  64 timeout <seconds>
  65 exec <path>
  66 metaflush "on" | "off"
  67 pidfile <path>
  68
  69 on <node> {
  70         # Node section
  71         control <addr>
  72         listen <addr>
  73         pidfile <path>
  74 }
  75
  76 on <node> {
  77         # Node section
  78         control <addr>
  79         listen <addr>
  80         pidfile <path>
  81 }
  82
  83 resource <name> {
  84         # Resource section
  85         replication <mode>
  86         checksum <algorithm>
  87         compression <algorithm>
  88         name <name>
  89         local <path>
  90         timeout <seconds>
  91         exec <path>
  92         metaflush "on" | "off"
  93
  94         on <node> {
  95                 # Resource-node section
  96                 name <name>
  97                 # Required
  98                 local <path>
  99                 metaflush "on" | "off"
 100                 # Required
 101                 remote <addr>
 102                 source <addr>
 103         }
 104         on <node> {
 105                 # Resource-node section
 106                 name <name>
 107                 # Required
 108                 local <path>
 109                 metaflush "on" | "off"
 110                 # Required
 111                 remote <addr>
 112                 source <addr>
 113         }
 114 }
 115 .Ed
 116 .Pp
 117 Most of the available configuration parameters are optional.
 118 If a parameter is not defined in a particular section, it is
 119 inherited from the parent section.
 120 For example, if the
 121 .Ic listen
 122 parameter is not defined in the node section, it will be inherited from
 123 the global section.
 124 If the global section does not define the
 125 .Ic listen
 126 parameter either, the default value is used.
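.Pp
As a minimal sketch of this inheritance, the fragment below uses the
illustrative node names
.Dq hasta
and
.Dq hastb ;
the addresses are placeholders.
Only
.Dq hasta
overrides
.Ic listen ,
so
.Dq hastb
inherits the value from the global section:
.Bd -literal -offset indent
# Global section: used by every node that does not override it.
listen tcp://0.0.0.0

on hasta {
        # Node section: overrides the global listen address.
        listen tcp://10.0.0.1
}
on hastb {
        # No listen statement here, so the global value applies.
        control /var/run/hastctl
}
.Ed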
 127 .Sh CONFIGURATION FILE DESCRIPTION
 128 The
 129 .Aq node
 130 argument can be replaced either by a full hostname as obtained by
 131 .Xr gethostname 3 ,
 132 only the first part of the hostname, or by the node's UUID as found in the
 133 .Va kern.hostuuid
 134 .Xr sysctl 8
 135 variable.
 136 .Pp
 137 The following statements are available:
 138 .Bl -tag -width ".Ic xxxx"
 139 .It Ic control Aq addr
 140 .Pp
 141 Address for communication with
 142 .Xr hastctl 8 .
 143 Each of the following examples defines the same control address:
 144 .Bd -literal -offset indent
 145 uds:///var/run/hastctl
 146 unix:///var/run/hastctl
 147 /var/run/hastctl
 148 .Ed
 149 .Pp
 150 The default value is
 151 .Pa uds:///var/run/hastctl .
 152 .It Ic pidfile Aq path
 153 .Pp
 154 File in which to store the process ID of the main
 155 .Xr hastd 8
 156 process.
 157 .Pp
 158 The default value is
 159 .Pa /var/run/hastd.pid .
 160 .It Ic listen Aq addr
 161 .Pp
 162 Address to listen on, in the form:
 163 .Bd -literal -offset indent
 164 protocol://protocol-specific-address
 165 .Ed
 166 .Pp
 167 Each of the following examples defines the same listen address:
 168 .Bd -literal -offset indent
 169 0.0.0.0
 170 0.0.0.0:8457
 171 tcp://0.0.0.0
 172 tcp://0.0.0.0:8457
 173 tcp4://0.0.0.0
 174 tcp4://0.0.0.0:8457
 175 .Ed
 176 .Pp
 177 Multiple listen addresses can be specified.
 178 By default
 179 .Nm hastd
 180 listens on
 181 .Pa tcp4://0.0.0.0:8457
 182 and
 183 .Pa tcp6://[::]:8457
 184 if the kernel supports IPv4 and IPv6, respectively.
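.Pp
A minimal sketch of a configuration with more than one listen address,
assuming that each address is given in its own
.Ic listen
statement:
.Bd -literal -offset indent
# Listen on all IPv4 and all IPv6 addresses on the default port.
listen tcp4://0.0.0.0:8457
listen tcp6://[::]:8457
.Ed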
 185 .It Ic replication Aq mode
 186 .Pp
 187 Replication mode should be one of the following:
 188 .Bl -tag -width ".Ic xxxx"
 189 .It Ic memsync
 190 .Pp
 191 Report the write operation as completed when the local write completes and
 192 the remote node acknowledges receipt of the data, but before the remote node
 193 actually stores the data.
 194 The data on the remote node will be stored directly after the
 195 acknowledgement is sent.
 196 This mode is intended to reduce latency while still providing very good
 197 reliability.
 198 A small amount of data can be lost only in the following sequence of events.
 199 The data is stored on the primary node and sent to the secondary.
 200 The secondary node acknowledges receipt of the data and the primary reports
 201 success to the application.
 202 The secondary then goes down before the received data is actually stored
 203 locally.
 204 Before the secondary node returns, the primary node dies entirely.
 205 When the secondary node comes back to life, it becomes the new primary.
 206 The data that was confirmed to the application as stored, but never written
 207 on the secondary, is lost.
 208 The risk of such a situation is very small.
 209 The
 210 .Ic memsync
 211 replication mode is currently not implemented.
 212 .It Ic fullsync
 213 .Pp
 214 Mark the write operation as completed when both the local and the remote
 215 writes complete.
 216 This is the safest and the slowest replication mode.
 217 The
 218 .Ic fullsync
 219 replication mode is the default.
 220 .It Ic async
 221 .Pp
 222 The write operation is reported as complete right after the local write
 223 completes.
 224 This is the fastest and the most dangerous replication mode.
 225 This mode should be used when replicating to a distant node where
 226 latency is too high for other modes.
 227 .El
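.Pp
A sketch of choosing the mode per resource; the resource name, device and
addresses are illustrative only:
.Bd -literal -offset indent
# Global section: default for resources that do not override it.
replication fullsync

resource offsite {
        # Assumed to be replicated over a high-latency link.
        replication async
        local /dev/da1

        on hasta {
                remote tcp://10.0.0.2
        }
        on hastb {
                remote tcp://10.0.0.1
        }
}
.Ed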
 228 .It Ic checksum Aq algorithm
 229 .Pp
 230 Checksum algorithm should be one of the following:
 231 .Bl -tag -width ".Ic sha256"
 232 .It Ic none
 233 No checksum will be calculated for the data being sent over the network.
 234 This is the default setting.
 235 .It Ic crc32
 236 CRC32 checksum will be calculated.
 237 .It Ic sha256
 238 SHA256 checksum will be calculated.
 239 .El
 240 .It Ic compression Aq algorithm
 241 .Pp
 242 Compression algorithm should be one of the following:
 243 .Bl -tag -width ".Ic none"
 244 .It Ic none
 245 Data sent over the network will not be compressed.
 246 .It Ic hole
 247 Only blocks that contain all zeros will be compressed.
 248 This is very useful for initial synchronization where potentially many blocks
 249 are still all zeros.
 250 There should be no measurable performance overhead when this algorithm is being
 251 used.
 252 This is the default setting.
 253 .It Ic lzf
 254 The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
 255 sent over the network.
 256 LZF is a very fast, general-purpose compression algorithm.
 257 .El
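.Pp
The
.Ic checksum
and
.Ic compression
statements can be combined, for example as in this sketch, where the
choice of algorithms is purely illustrative:
.Bd -literal -offset indent
# Global section: checksum the data sent over the network and compress it.
checksum crc32
compression lzf
.Ed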
 258 .It Ic timeout Aq seconds
 259 .Pp
 260 Connection timeout in seconds.
 261 The default value is
 262 .Va 20 .
 263 .It Ic exec Aq path
 264 .Pp
 265 Execute the given program on various HAST events.
 266 Below is the list of currently implemented events and arguments the given
 267 program is executed with:
 268 .Bl -tag -width ".Ic xxxx"
 269 .It Ic "<path> role <resource> <oldrole> <newrole>"
 270 .Pp
 271 Executed on both primary and secondary nodes when the resource role is changed.
 272 .Pp
 273 .It Ic "<path> connect <resource>"
 274 .Pp
 275 Executed on both primary and secondary nodes when the connection for the given
 276 resource between the nodes is established.
 277 .Pp
 278 .It Ic "<path> disconnect <resource>"
 279 .Pp
 280 Executed on both primary and secondary nodes when the connection for the given
 281 resource between the nodes is lost.
 282 .Pp
 283 .It Ic "<path> syncstart <resource>"
 284 .Pp
 285 Executed on the primary node when synchronization of the secondary node is
 286 started.
 287 .Pp
 288 .It Ic "<path> syncdone <resource>"
 289 .Pp
 290 Executed on the primary node when synchronization of the secondary node is
 291 completed successfully.
 292 .Pp
 293 .It Ic "<path> syncintr <resource>"
 294 .Pp
 295 Executed on the primary node when synchronization of the secondary node is
 296 interrupted, most likely due to a secondary node outage or a connection failure
 297 between the nodes.
 298 .Pp
 299 .It Ic "<path> split-brain <resource>"
 300 .Pp
 301 Executed on both primary and secondary nodes when a split-brain condition is
 302 detected.
 303 .Pp
 304 .El
 305 The
 306 .Aq path
 307 argument should contain the full path to an executable program.
 308 If the given program exits with a code other than
 309 .Va 0 ,
 310 .Nm hastd
 311 will log it as an error.
 312 .Pp
 313 The
 314 .Aq resource
 315 argument is the resource name from the configuration file.
 316 .Pp
 317 The
 318 .Aq oldrole
 319 argument is the previous resource role (before the change).
 320 It can be one of:
 321 .Ar init ,
 322 .Ar secondary ,
 323 .Ar primary .
 324 .Pp
 325 The
 326 .Aq newrole
 327 argument is the current resource role (after the change).
 328 It can be one of:
 329 .Ar init ,
 330 .Ar secondary ,
 331 .Ar primary .
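.Pp
As a hedged illustration, assume a hypothetical handler installed as
.Pa /usr/local/sbin/hast-event
and a resource named
.Dq shared .
With the statement below,
.Nm hastd
would run the handler with the arguments listed above, for example
.Dq /usr/local/sbin/hast-event role shared init primary
when the resource changes role from init to primary:
.Bd -literal -offset indent
# Hypothetical path; any executable given by its full path can be used.
exec /usr/local/sbin/hast-event
.Ed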
 332 .Pp
 333 .It Ic metaflush on | off
 334 .Pp
 335 When set to
 336 .Va on ,
 337 flush the write cache of the local provider after every metadata (activemap) update.
 338 Flushing the write cache ensures that the provider will not reorder writes and that
 339 metadata will be properly updated before real data is stored.
 340 If the local provider does not support flushing the write cache (it returns
 341 .Er EOPNOTSUPP
 342 on the
 343 .Cm BIO_FLUSH
 344 request),
 345 .Nm hastd
 346 will disable
 347 .Ic metaflush
 348 automatically.
 349 The default value is
 350 .Va on .
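.Pp
A sketch of a per-node override, with a placeholder device and addresses:
flushing stays enabled for the resource in general but is turned off on
.Dq hastb
only.
.Bd -literal -offset indent
resource shared {
        local /dev/da0
        metaflush on

        on hasta {
                remote tcp://10.0.0.2
        }
        on hastb {
                # Assumed for illustration: do not flush on this node.
                metaflush off
                remote tcp://10.0.0.1
        }
}
.Ed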
 351 .Pp
 352 .It Ic name Aq name
 353 .Pp
 354 GEOM provider name that will appear as
 355 .Pa /dev/hast/<name> .
 356 If name is not defined, the resource name will be used as the provider name.
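.Pp
For illustration, with a placeholder device and addresses, the resource below
is called
.Dq shared
in the configuration, but its provider appears as
.Pa /dev/hast/data :
.Bd -literal -offset indent
resource shared {
        # Provider name; without this statement it would be "shared".
        name data
        local /dev/da0

        on hasta {
                remote tcp://10.0.0.2
        }
        on hastb {
                remote tcp://10.0.0.1
        }
}
.Ed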
 357 .It Ic local Aq path
 358 .Pp
 359 Path to the local component which will be used as the backend provider for
 360 the resource.
 361 This can be either a GEOM provider or a regular file.
 362 .It Ic remote Aq addr
 363 .Pp
 364 Address of the remote
 365 .Nm hastd
 366 daemon.
 367 The format is the same as for the
 368 .Ic listen
 369 statement.
 370 When operating as a primary node, this address will be used to connect to
 371 the secondary node.
 372 When operating as a secondary node, only connections from this address
 373 will be accepted.
 374 .Pp
 375 A special value of
 376 .Va none
 377 can be used when the remote address is not yet known (e.g., the other node is not
 378 set up yet).
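.Pp
A minimal sketch, assuming that only
.Dq hasta
has been set up so far and that the device name is a placeholder:
.Bd -literal -offset indent
resource shared {
        local /dev/da0

        on hasta {
                # The address of the other node is not known yet.
                remote none
        }
}
.Ed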
 379 .It Ic source Aq addr
 380 .Pp
 381 Local address to bind to before connecting to the remote
 382 .Nm hastd
 383 daemon.
 384 The format is the same as for the
 385 .Ic listen
 386 statement.
 387 .El
 388 .Sh FILES
 389 .Bl -tag -width ".Pa /var/run/hastctl" -compact
 390 .It Pa /etc/hast.conf
 391 The default
 392 .Xr hastctl 8
 393 and
 394 .Xr hastd 8
 395 configuration file.
 396 .It Pa /var/run/hastctl
 397 Control socket used by the
 398 .Xr hastctl 8
 399 control utility to communicate with the
 400 .Xr hastd 8
 401 daemon.
 402 .El
 403 .Sh EXAMPLES
 404 An example configuration file might look as follows:
 405 .Bd -literal -offset indent
 406 listen tcp://0.0.0.0
 407
 408 on hasta {
 409         listen tcp://2001:db8::1/64
 410 }
 411 on hastb {
 412         listen tcp://2001:db8::2/64
 413 }
 414
 415 resource shared {
 416         local /dev/da0
 417
 418         on hasta {
 419                 remote tcp://10.0.0.2
 420         }
 421         on hastb {
 422                 remote tcp://10.0.0.1
 423         }
 424 }
 425 resource tank {
 426         on hasta {
 427                 local /dev/mirror/tanka
 428                 source tcp://10.0.0.1
 429                 remote tcp://10.0.0.2
 430         }
 431         on hastb {
 432                 local /dev/mirror/tankb
 433                 source tcp://10.0.0.2
 434                 remote tcp://10.0.0.1
 435         }
 436 }
 437 .Ed
 438 .Sh SEE ALSO
 439 .Xr gethostname 3 ,
 440 .Xr geom 4 ,
 441 .Xr hastctl 8 ,
 442 .Xr hastd 8
 443 .Sh AUTHORS
 444 The
 445 .Nm
 446 manual page was written by
 447 .An Pawel Jakub Dawidek Aq pjd@FreeBSD.org
 448 under sponsorship of the FreeBSD Foundation.