mirror of https://github.com/servalproject/serval-dna.git (synced 2025-04-14 22:26:44 +00:00)
Ensure a race condition while starting servald only starts one process
parent 0b0e4cc8b4
commit afd31fe12c
overlay.c (138 lines changed)
@@ -1,138 +0,0 @@
/*
Serval Distributed Numbering Architecture (DNA)
Copyright (C) 2010 Paul Gardner-Stephen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/

/*
Serval Overlay Mesh Network.

Basically we use UDP broadcast to send link-local, and then implement a BATMAN-like protocol over the top of that.

Each overlay packet can contain one or more encapsulated packets each addressed using Serval DNA SIDs, with source,
destination and next-hop addresses.

The use of an overlay also lets us be a bit clever about using irregular transports, such as an ISM915 modem attached via ethernet
(which we are planning to build in coming months), by paring off the IP and UDP headers that would otherwise dominate. Even on
regular WiFi and ethernet we can aggregate packets in a way similar to IAX, but not just for voice frames.

The use of long (relative to IPv4 or even IPv6) 256 bit Curve25519 addresses means that it is a really good idea to
have neighbouring nodes exchange lists of peer aliases so that addresses can be summarised, possibly using less space than IPv4
would have.

One approach to handle address shortening is to have the periodic TTL=255 BATMAN-style hello packets include an epoch number.
This epoch number can be used by immediate neighbours of the originator to reference the neighbours listed in that packet by
their ordinal position in the packet instead of by their full address. This gets us address shortening to 1 byte in most cases
in return for no new packets, but the periodic hello packets will now be larger. We might deal with this issue by having these
hello packets reference the previous epoch for common neighbours. Unresolved neighbour addresses could be resolved by a simple
DNA request, which should only need to occur occasionally, and other link-local neighbours could sniff and cache the responses
to avoid duplicated traffic. Indeed, during quiet times nodes could preemptively advertise address resolutions if they wished,
or similarly advertise the full address of a few (possibly randomly selected) neighbours in each epoch.

Byzantine Robustness is a goal, so we have to think about all sorts of malicious failure modes.

One approach to help byzantine robustness is to have multiple signature shells for each hop for mesh topology packets.
Thus forging a report of closeness requires forging a signature. As such frames are forwarded, the outermost signature
shell is removed. This is really only needed for more paranoid uses.

We want to have different traffic classes for voice/video calls versus regular traffic, e.g., MeshMS frames. Thus we need to have
separate traffic queues for these items. Aside from allowing us to prioritise isochronous data, it also allows us to expire old
isochronous frames that are in-queue once there is no longer any point delivering them (e.g. after holding them more than 200ms).
We can also be clever about round-robin fair-sharing or even prioritising among isochronous streams. Since we also know about the
DNA isochronous protocols and the forward error correction and other redundancy measures we also get smart about dropping, say, 1 in 3
frames from every call if we know that this can be safely done. That is, when traffic is low, we maximise redundancy, and when we
start to hit the limit of traffic, we start to throw away some of the redundancy. This of course relies on us knowing when the
network channel is getting too full.

Smart-flooding of broadcast information is also a requirement. The long addresses help here, as we can make any address that begins
with the first 192 bits all ones be broadcast, and use the remaining 64 bits as a "broadcast packet identifier" (BPI).
Nodes can remember recently seen BPIs and not forward broadcast frames that have been seen recently. This should get us smart flooding
of the majority of a mesh (with some node mobility issues being a factor). We could refine this later, but it will do for now, especially
since for things like number resolution we are happy to send repeat requests.

This file currently seems to exist solely to contain this introduction, which is fine with me. Functions land in here until their
proper place becomes apparent.

*/

#include "serval.h"
#include "conf.h"
#include "rhizome.h"
#include "httpd.h"
#include "strbuf.h"
#include "keyring.h"
#include "overlay_interface.h"
#include "server.h"

keyring_file *keyring=NULL;

/* The caller must set up the keyring before calling this function, and the keyring must contain at
 * least one identity, otherwise MDP and routing will not work.
 */
int overlayServerMode()
{
  IN();

  /* Set up client API sockets before writing our PID file
     We want clients to be able to connect to our sockets as soon
     as servald start has returned. But we don't want servald start
     to take very long.
     Try to perform only minimal CPU or IO processing here.
  */
  overlay_mdp_setup_sockets();
  monitor_setup_sockets();
  // start the HTTP server if enabled
  httpd_server_start(HTTPD_PORT, HTTPD_PORT_MAX);

  /* record PID file so that servald start can return */
  if (server_write_pid())
    RETURN(-1);

  /* For testing, it can be very helpful to delay the start of the server process, for example to
   * check that the start/stop logic is robust.
   */
  const char *delay = getenv("SERVALD_SERVER_START_DELAY");
  if (delay){
    time_ms_t milliseconds = atoi(delay);
    INFOF("Sleeping for %"PRId64" milliseconds", (int64_t) milliseconds);
    sleep_ms(milliseconds);
  }
  overlay_queue_init();

  time_ms_t now = gettime_ms();

  // Periodically check for server shut down
  RESCHEDULE(&ALARM_STRUCT(server_shutdown_check), now, now+30000, now);

  overlay_mdp_bind_internal_services();

  olsr_init_socket();

  /* Calculate (and possibly show) CPU usage stats periodically */
  RESCHEDULE(&ALARM_STRUCT(fd_periodicstats), now+3000, now+30000, TIME_MS_NEVER_WILL);

  cf_on_config_change();

  // log message used by tests to wait for the server to start
  INFO("Server initialised, entering main loop");

  /* Check for activity and respond to it */
  while((serverMode==1) && fd_poll());

  serverCleanUp();
  RETURN(0);
  OUT();
}
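
The smart-flooding idea in the overlay.c header comment (an address whose first 192 bits are all ones is a broadcast, and the trailing 64 bits act as a broadcast packet identifier) could be sketched roughly as below. This is an illustration only and is not taken from serval-dna: the names `is_broadcast`, `broadcast_bpi` and `should_flood`, the cache size, and the choice of a fixed-size ring of recently seen BPIs are all assumptions.

```c
#include <stdint.h>

#define SID_BYTES 32         /* 256-bit address, as described in the comment */
#define BROADCAST_PREFIX 24  /* first 192 bits (24 bytes) all ones */
#define BPI_CACHE_SIZE 64    /* hypothetical: how many recent BPIs to remember */

static uint64_t seen_bpi[BPI_CACHE_SIZE];
static int seen_used[BPI_CACHE_SIZE];
static unsigned seen_next;

/* Returns 1 if the address carries the all-ones 192-bit broadcast prefix. */
int is_broadcast(const uint8_t addr[SID_BYTES])
{
  for (int i = 0; i < BROADCAST_PREFIX; i++)
    if (addr[i] != 0xff)
      return 0;
  return 1;
}

/* Extracts the trailing 64 bits as the broadcast packet identifier (BPI). */
uint64_t broadcast_bpi(const uint8_t addr[SID_BYTES])
{
  uint64_t bpi = 0;
  for (int i = BROADCAST_PREFIX; i < SID_BYTES; i++)
    bpi = (bpi << 8) | addr[i];
  return bpi;
}

/* Returns 1 the first time a broadcast BPI is seen (forward the frame),
 * and 0 if it is not a broadcast or its BPI is still in the cache. */
int should_flood(const uint8_t addr[SID_BYTES])
{
  if (!is_broadcast(addr))
    return 0;
  uint64_t bpi = broadcast_bpi(addr);
  for (unsigned i = 0; i < BPI_CACHE_SIZE; i++)
    if (seen_used[i] && seen_bpi[i] == bpi)
      return 0;
  /* Remember this BPI, overwriting the oldest slot in the ring. */
  seen_bpi[seen_next] = bpi;
  seen_used[seen_next] = 1;
  seen_next = (seen_next + 1) % BPI_CACHE_SIZE;
  return 1;
}
```

A ring like this forgets old identifiers once the cache wraps, which matches the comment's tolerance for occasional repeats: a duplicate that slips through costs one extra forward and does not affect correctness.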
@@ -170,12 +170,16 @@ int overlay_mdp_setup_sockets()

  if (mdp_sock.poll.fd == -1) {
    mdp_sock.poll.fd = mdp_bind_socket("mdp.socket");
    if (mdp_sock.poll.fd == -1)
      return -1;
    mdp_sock.poll.events = POLLIN;
    watch(&mdp_sock);
  }

  if (mdp_sock2.poll.fd == -1) {
    mdp_sock2.poll.fd = mdp_bind_socket("mdp.2.socket");
    if (mdp_sock2.poll.fd == -1)
      return -1;
    mdp_sock2.poll.events = POLLIN;
    watch(&mdp_sock2);
  }
@@ -214,8 +218,10 @@ int overlay_mdp_setup_sockets()
        WHY_perror("bind");
      }

      if (fd!=-1)
      if (fd!=-1){
        close(fd);
        return -1;
      }
    }
  }
  return 0;
@@ -1107,7 +1113,6 @@ int overlay_mdp_address_list(struct overlay_mdp_addrlist *request, struct overla

struct routing_state{
  struct socket_address *client;
  int fd;
};

static int routing_table(struct subscriber *subscriber, void *context)
server.c (58 lines changed)
@@ -33,11 +33,14 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#include "overlay_interface.h"
#include "overlay_packet.h"
#include "server.h"
#include "keyring.h"

#define PROC_SUBDIR "proc"
#define PIDFILE_NAME "servald.pid"
#define STOPFILE_NAME "servald.stop"

keyring_file *keyring=NULL;

static char pidfile_path[256];

static int server_getpid = 0;
@@ -101,7 +104,59 @@ int server()
  sigaction(SIGHUP, &sig, NULL);
  sigaction(SIGINT, &sig, NULL);

  overlayServerMode();
  /* Set up client API sockets before writing our PID file
     We want clients to be able to connect to our sockets as soon
     as servald start has returned. But we don't want servald start
     to take very long.
     Try to perform only minimal CPU or IO processing here.
  */
  if (overlay_mdp_setup_sockets()==-1)
    RETURN(-1);

  if (monitor_setup_sockets()==-1)
    RETURN(-1);

  // start the HTTP server if enabled
  if (httpd_server_start(HTTPD_PORT, HTTPD_PORT_MAX)==-1)
    RETURN(-1);

  /* For testing, it can be very helpful to delay the start of the server process, for example to
   * check that the start/stop logic is robust.
   */
  const char *delay = getenv("SERVALD_SERVER_START_DELAY");
  if (delay){
    time_ms_t milliseconds = atoi(delay);
    INFOF("Sleeping for %"PRId64" milliseconds", (int64_t) milliseconds);
    sleep_ms(milliseconds);
  }

  /* record PID file so that servald start can return */
  if (server_write_pid())
    RETURN(-1);

  overlay_queue_init();

  time_ms_t now = gettime_ms();

  // Periodically check for server shut down
  RESCHEDULE(&ALARM_STRUCT(server_shutdown_check), now, now+30000, now);

  overlay_mdp_bind_internal_services();

  olsr_init_socket();

  /* Calculate (and possibly show) CPU usage stats periodically */
  RESCHEDULE(&ALARM_STRUCT(fd_periodicstats), now+3000, now+30000, TIME_MS_NEVER_WILL);

  cf_on_config_change();

  // log message used by tests to wait for the server to start
  INFO("Server initialised, entering main loop");

  /* Check for activity and respond to it */
  while((serverMode==1) && fd_poll());

  serverCleanUp();

  RETURN(0);
  OUT();
@@ -322,6 +377,7 @@ void cf_on_config_change()
DEFINE_ALARM(server_shutdown_check);
void server_shutdown_check(struct sched_ent *alarm)
{
  // TODO we should watch a descriptor and quit when it closes
  /* If this server has been supplanted with another or Serval has been uninstalled, then its PID
     file will change or be inaccessible. In this case, shut down without all the cleanup.
     Perform this check at most once per second. */
@@ -65,7 +65,6 @@ SERVAL_DAEMON_SOURCES = \
	monitor-client.c \
	monitor-cli.c \
	nonce.c \
	overlay.c \
	overlay_address.c \
	overlay_buffer.c \
	overlay_interface.c \
tests/server (20 lines changed)
@@ -106,14 +106,25 @@ test_StartStart() {
   assert [ "$servald_pid" = "$start_pid" ]
}

doc_StartStopFast="Stop server before it finishes starting"
setup_StartStopFast() {
doc_StartTwice="Attempt to start the server twice at the same time"
setup_StartTwice() {
   setup
   export SERVALD_SERVER_START_DELAY=10000
   export SERVALD_SERVER_START_DELAY=2000
   set_instance +A
   executeOk_servald config set debug.io on
}
test_StartStopFast() {
start_other(){
   start_servald_server
   echo $servald_pid > other_pid
}
test_StartTwice() {
   fork %server start_other
   start_servald_server
   fork_wait %server
   # both servald start commands should return success with the same PID
   assertGrep other_pid "^$servald_pid$"
   stop_servald_server
   assert_no_servald_processes
}

doc_RemovePid="Server stops when pid file removed"
@@ -124,7 +135,6 @@ setup_RemovePid() {
test_RemovePid() {
   rm $instance_servald_pidfile
   wait_until ! kill -0 $servald_pid 2>/dev/null

}

doc_NoZombie="Server process does not become a zombie"