263 commits

Author SHA1 Message Date
Micke Nordin d75b5ebddb
Correct lb for registration 2024-10-25 15:17:27 +02:00
Micke Nordin e36fac0071
Install scriptherder 2024-10-17 14:19:37 +02:00
Micke Nordin 4ab780ed9b
Install scriptherder 2024-10-17 14:17:16 +02:00
Micke Nordin 243d076627
check scriptherder 2024-10-17 14:14:03 +02:00
Micke Nordin ad425d78c5
Add nrpe 2024-10-17 14:11:44 +02:00
Micke Nordin 99a7bbc983
Add checks 2024-10-17 14:09:58 +02:00
Micke Nordin 46b2559759
Add monitor 2024-10-17 14:03:23 +02:00
Micke Nordin 8eeed156a0
Turns out we don't need satosa 2024-10-17 12:00:41 +02:00
Micke Nordin 61665955a2
Save certs 2024-10-17 11:56:47 +02:00
Micke Nordin 11a4feb668
No extra volumes 2024-10-17 10:53:56 +02:00
Micke Nordin 72f3d11801
Don't mount individual files 2024-10-17 10:51:40 +02:00
Micke Nordin 4e349a1223
Don't run alloy client on monitor 2024-10-17 10:31:04 +02:00
Micke Nordin 7a299bcb56
Secrets for satosa 2024-10-17 10:08:01 +02:00
Micke Nordin 0ba2932661
Formatting 2024-10-16 17:39:09 +02:00
Micke Nordin 93ae71623d
Add more params 2024-10-16 10:01:51 +02:00
Micke Nordin 6754b0081e
Add cinder secrets 2024-10-16 09:23:36 +02:00
Micke Nordin b15ca8506d
Satosa config 2024-10-15 11:57:59 +02:00
Micke Nordin f488df2435
Secrets for satosa 2024-10-15 11:38:23 +02:00
Micke Nordin bfe9b8d3c7
Trust jocar and mariah 2024-10-15 11:23:07 +02:00
Micke Nordin fe8cf2c3f8
Add inflyx password 2024-10-15 11:18:40 +02:00
Micke Nordin 9023c768cf
internal-dco-test-monitor-1.streams.sunet.se added 2024-10-14 16:28:47 +02:00
Micke Nordin 0aebc7a86d
internal-dco-test-satosa-1.streams.sunet.se added 2024-10-14 16:28:46 +02:00
Micke Nordin f83c2d7448
Correct key 2024-10-14 14:06:02 +02:00
Micke Nordin a1cf28c5cc
correct name 2024-10-14 13:57:35 +02:00
Micke Nordin 2102f0f6ae
Add peers 2024-10-14 13:31:02 +02:00
Micke Nordin 4397b0c4b1
Add peers 2024-10-14 13:26:43 +02:00
Micke Nordin 465fdf65b0
Remove some copy paste 2024-10-14 13:20:49 +02:00
Micke Nordin cab8797259
Add setup_cosmos_modules 2024-10-14 13:05:58 +02:00
Micke Nordin 2327c6d7ae
internal-dco-test-k8sc-3.streams.sunet.se added 2024-10-11 17:39:59 +02:00
Micke Nordin a99b0dbf53
internal-dco-test-k8sc-2.streams.sunet.se added 2024-10-11 17:39:58 +02:00
Micke Nordin 97e8b264e2
internal-dco-test-k8sc-1.streams.sunet.se added 2024-10-11 17:39:57 +02:00
Micke Nordin ec7d267d41
internal-dco-test-k8sw-3.streams.sunet.se added 2024-10-11 17:39:56 +02:00
Micke Nordin 0198ac2508
internal-dco-test-k8sw-2.streams.sunet.se added 2024-10-11 17:39:56 +02:00
Micke Nordin 763822385a
internal-dco-test-k8sw-1.streams.sunet.se added 2024-10-11 17:39:55 +02:00
Micke Nordin 0d70187c38
Further bootstrap work 2024-10-11 17:39:54 +02:00
Micke Nordin 39bff16c74
Initial trust 2024-10-11 17:39:53 +02:00
Micke Nordin 20ddc3c257
Start on cosmos-rules 2024-10-11 17:39:52 +02:00
Patrik Holmqvist 028ba3d608
Merge pull request #56 from SUNET/pahol-fix-noble-eyaml
patch for broken eyaml in ubuntu24.04.
2024-09-10 13:16:19 +02:00
Patrik Holmqvist 7941e3f970
Merge the 2 patch functions to 1. 2024-09-09 17:29:31 +02:00
Patrik Holmqvist fac9a556ba
Patch for broken eyaml in ubuntu24.04. 2024-09-09 16:52:38 +02:00
Patrik Lundin 770a5ca3cc
Merge pull request #55 from SUNET/patlu-fleetlock-lock-timeouts
fleetlock: configurable lock/unlock timeout
2024-07-04 13:07:34 +02:00
Patrik Lundin aa88795ee0
sunet-fleetlock: also handle ReadTimeout
Turns out this was not caught by ConnectionError.
2024-07-03 14:13:22 +02:00
Patrik Lundin 01768129f0
fleetlock: configurable lock/unlock timeout
While we already support setting a healthcheck timeout it probably
makes sense to be able to control how long we wait for a
fleetlock_lock() or fleetlock_unlock() call. This becomes important if
only running cosmos once a night or something like that. In that case
you probably want to give a physical machine more than 1 minute to
complete a reboot etc.

This can now be controlled by setting fleetlock_lock_timeout and
fleetlock_unlock_timeout in /etc/run-cosmos-fleetlock-conf. Keep in mind
that while it can make sense to increase the time for taking a lock,
releasing a lock should always be fast (either you have it and release
it, or you don't have it and it is a no-op) so setting a long unlock
timeout should probably never be done.

Since we also potentially wait the unlock timeout at boot (if the
fleetlock server is broken etc) that is another reason to keep it
short. The default 1m is probably OK for most uses.
2024-07-03 13:27:52 +02:00
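The lock/unlock timeouts described in the commit above can be pictured with a small polling loop. This is only a sketch, not code from the repository: `wait_for`, `DEFAULT_TIMEOUT_S`, and the `operation` callable are hypothetical names, and the only values taken from the commit message are the 1 minute default and the advice to keep the unlock timeout short.

```python
import time
from typing import Callable

# The commit message gives 1m as the default for both timeouts.
DEFAULT_TIMEOUT_S = 60.0


def wait_for(
    operation: Callable[[], bool],
    timeout_s: float = DEFAULT_TIMEOUT_S,
    retry_delay_s: float = 1.0,
) -> bool:
    """Call operation() until it returns True or timeout_s has elapsed.

    `operation` is a hypothetical stand-in for a single lock or unlock
    attempt; it is always tried at least once, even with timeout_s=0.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if operation():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(retry_delay_s)
```

Under these assumptions, a host that reboots slowly might call `wait_for(try_lock, timeout_s=600)` while leaving the unlock call at the default, since releasing a lock should either succeed quickly or be a no-op.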
Patrik Lundin 443611dd3f
Merge pull request #49 from SUNET/john-permissions-fix
Enforce more strict permissions for files in Cosmos
2024-07-03 11:36:21 +02:00
Johan Wassberg 5518048d79
Merge pull request #54 from SUNET/pahol-ubuntu24
Ubuntu-24 fixes
2024-06-19 15:07:17 +02:00
Patrik Holmqvist 4231b4ac1d
Migrate from legacy fact
This did not work on modern puppet in ubuntu24:
Warning: Interpolation failed with '::lsbdistcodename', but compilation continuing;
New syntax inspiration from:
https://www.puppet.com/docs/puppet/8/hiera_config_yaml_5#configuring_hiera
2024-06-19 14:07:13 +02:00
Patrik Holmqvist bc9d1dc960
Use upstream puppet modules for ubuntu24+.
This is how we do it in modern debian so it
makes sense to do it on modern ubuntu as well.
2024-06-19 14:02:24 +02:00
Patrik Lundin 5d88e66379
Merge pull request #53 from SUNET/patlu-fleetlock-error-handling
sunet-fleetlock: handle connection errors
2024-06-17 13:27:11 +02:00
Patrik Lundin e315282bc5
Use more strict exception checking
This is probably wide enough and we do not need weird extra handling of
our own exception etc.

Thanks to @mickenordin for keeping me honest :).
2024-06-17 12:40:12 +02:00
Patrik Lundin 4b8b8887f6
sunet-fleetlock: handle connection errors
In order to handle upgrades of the fleetlock server when running only
one server we need to handle connection errors like connection refused
or timed out errors gracefully.

Because there are several different ways the connection can fail and it
is hard to keep track of them all, just catch everything. We then also
need special handling of our own timeout exception so we are not
accidentally stuck retrying forever.

Also fix so we actually use the request_timeout arg for individual HTTP
requests instead of the global timeout.

While here run isort to keep imports tidy.
2024-06-17 12:07:22 +02:00
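The "catch everything" approach from the commit above, and why it needs special handling of the script's own timeout exception, can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual sunet-fleetlock code: `OverallTimeout`, `retry_request`, and the `send` callable are hypothetical names, and the global deadline is modelled with a monotonic clock.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OverallTimeout(Exception):
    """Hypothetical stand-in for the script's own timeout exception."""


def retry_request(
    send: Callable[[float], T],
    overall_timeout_s: float,
    request_timeout_s: float,
    retry_delay_s: float = 1.0,
) -> T:
    """Retry send(request_timeout_s) until it succeeds or the deadline passes.

    `send` is a hypothetical callable performing one HTTP request; it
    receives the per-request timeout, not the global one.
    """
    deadline = time.monotonic() + overall_timeout_s
    while True:
        try:
            if time.monotonic() >= deadline:
                raise OverallTimeout(f"gave up after {overall_timeout_s}s")
            return send(request_timeout_s)
        except OverallTimeout:
            # Without this clause the broad handler below would swallow
            # our own timeout and the loop would retry forever.
            raise
        except Exception:
            # Connection refused, timed out, etc.: wait and try again.
            time.sleep(retry_delay_s)
```

The later commit "Use more strict exception checking" narrows the broad handler; in a sketch like this, that would mean replacing `except Exception` with the specific connection error types, which makes the separate `except OverallTimeout` clause unnecessary.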