Commit graph

233 commits

Author SHA1 Message Date
Patrik Lundin 170bdbc154
Missing $ 2024-10-10 15:29:50 +02:00
Patrik Lundin 26f583c41a
Fix manifest name 2024-10-10 15:28:23 +02:00
Patrik Lundin 4b1f93c08a
Add missing $ 2024-10-10 15:27:06 +02:00
Patrik Lundin cf51469fae
Apply cdn::cache to cache nodes 2024-10-10 15:25:12 +02:00
Patrik Lundin d0a19691aa
Initial cdn::cache manifest 2024-10-10 15:22:11 +02:00
Patrik Lundin b2de8d246b
Start installing docker on cache machines 2024-10-10 11:01:28 +02:00
Patrik Lundin 254a3f107e
Quote some variables to make shellcheck happy 2024-10-10 10:38:45 +02:00
Patrik Lundin 7001a3fab6
Remove trailing "/" in dir path 2024-10-10 10:36:00 +02:00
Patrik Lundin d38ef1b1ce
Remove bridges for now 2024-10-10 10:27:41 +02:00
Patrik Lundin 5d05e596c0
Cleanup ":" 2024-10-10 10:24:31 +02:00
Patrik Lundin 563886294b
Fix template 2024-10-10 10:23:55 +02:00
Patrik Lundin d78d8c22b1
Make sure we trust internal cdn CA 2024-10-10 10:19:00 +02:00
Patrik Lundin b44fb5ce43
Update key paths to reflect internal CA 2024-10-10 10:17:39 +02:00
Patrik Lundin 65fc0590b4
Add certbot deploy script for mosquitto 2024-10-10 10:13:04 +02:00
Patrik Lundin b9266ec0e7
Start requesting ACME certs from internal CA 2024-10-09 12:13:30 +02:00
Patrik Lundin 8f8c360c69
Use environment instead of instance 2024-10-09 11:59:51 +02:00
Patrik Lundin c09f81afbf
Fix type declaration
```
Error: Evaluation Error: Error while evaluating a Resource Statement, Class[Cdn::Ca_trust]:
  parameter 'ca_root_fp' entry 'test' entry 'url' expects a Hash value, got String
  parameter 'ca_root_fp' entry 'test' entry 'fp' expects a Hash value, got String on node internal-sto3-test-mqtt-1.cdn.sunet.se
```

Also rename variable now that it contains more than fingerprint
2024-10-09 11:53:52 +02:00
Patrik Lundin 1ef179cad2
Fix broken file declaration
While here make puppet-lint happy
2024-10-09 11:50:34 +02:00
Patrik Lundin 1dcc58d991
Apply trust class to mqtt 2024-10-09 11:47:53 +02:00
Patrik Lundin ab3c08c5e1
Add class for setting up trust of internal CA 2024-10-09 11:46:28 +02:00
Patrik Lundin d1b0694e44
Also set --admin-provisioner=admin
Without this the commands will hang waiting for input to select a provisioner.
This is needed now that we have enabled a second (the ACME) provisioner
on init.
2024-10-08 21:45:17 +02:00
Patrik Lundin 22a2029cf9
Enable ACME provisioner at init 2024-10-08 16:50:46 +02:00
Patrik Lundin 6354f6faaa
Test opening port 80 for certbot operation 2024-10-08 16:38:11 +02:00
Patrik Lundin fe04d862e3
Move script to correct location 2024-10-08 14:12:48 +02:00
Patrik Lundin 8d4d1841c4
Bootstrap step client 2024-10-08 14:09:44 +02:00
Patrik Lundin 44001514de
Missing "," 2024-10-08 13:42:14 +02:00
Patrik Lundin a4a5a44647
Install step-cli from deb 2024-10-08 13:40:54 +02:00
Patrik Lundin 1cfbc3e908
Make puppet-lint happy with indent 2024-10-08 13:36:21 +02:00
Patrik Lundin 49ff235bc4
Download step client deb file 2024-10-08 13:33:32 +02:00
Patrik Lundin aca8dd1b22
Add file to correct location 2024-10-08 13:12:54 +02:00
Patrik Lundin d9db9fee72
Add init script for setting provisioner file
This is to deal with the problem that it makes sense to have a separate
password for encryption keys and the admin provisioner. It is currently
not possible to control this via the docker env flags so add this
workaround for now.
2024-10-08 12:35:41 +02:00
Patrik Lundin d1c863c7cb
Expose the step-ca port 2024-10-08 10:09:20 +02:00
Patrik Lundin d46d54a6a6
Enable compose file 2024-10-08 10:04:32 +02:00
Patrik Lundin 1803d1c69a
Add initial compose file for step-ca 2024-10-08 10:02:48 +02:00
Patrik Lundin 828f9a899d
Fix templates for passwords 2024-10-08 09:51:08 +02:00
Patrik Lundin f247388664
Trust maria
Copied from cnaas-ops
2024-10-08 09:41:09 +02:00
Patrik Lundin 9379ba58e2
Handle undef ca_secrets more gracefully 2024-10-08 09:39:09 +02:00
Patrik Lundin 61a4ec13e3
Start setting up step-ca files 2024-10-08 09:36:04 +02:00
Patrik Lundin e02160a311
Initial cdn::ca class 2024-10-07 08:35:00 +02:00
Patrik Lundin 9f05f40714
Install docker on ca machines 2024-10-06 15:37:33 +02:00
Patrik Lundin 49106049ff
Start using cdn.conf template 2024-10-06 14:51:55 +02:00
Patrik Lundin e5ce5dd1cd
Start managing cdn.conf 2024-10-06 14:50:07 +02:00
Patrik Lundin 40036c3c32
Fix variable usage 2024-10-06 14:44:32 +02:00
Patrik Lundin 52469c754d
Correct path 2024-10-06 14:32:17 +02:00
Patrik Lundin 4b90469531
Missing $ 2024-10-06 14:30:51 +02:00
Patrik Lundin 0c5e2604b6
Add missing clients parameter 2024-10-06 14:29:48 +02:00
Patrik Lundin 7352a20143
Start managing mqtt ACL
Include sample cosmos-rules entry for testing out template
2024-10-06 14:26:10 +02:00
Patrik Lundin 2099c4d691
Fix class name 2024-10-04 17:43:31 +02:00
Patrik Lundin c638772941
Apply mqtt class 2024-10-04 17:41:59 +02:00
Patrik Lundin 152179a5c1
Initial commit for mqtt management 2024-10-04 17:33:49 +02:00
Patrik Lundin 895264bc4f
Trust kano
Copied from platform-ops
2024-10-04 17:18:09 +02:00
Patrik Lundin febde032ee
Update to new key standard 2024-10-04 17:16:23 +02:00
Patrik Lundin 571af24060
Make seccomp file readable by runner 2024-10-04 09:22:05 +02:00
Patrik Lundin 05ee26e7c2
Make docker_certs available to runner 2024-10-03 21:04:17 +02:00
Patrik Lundin 48d3b890d0
Use owner/group matching runner compose file 2024-10-03 20:57:28 +02:00
Patrik Lundin d1d72ad80a
Try to access map correctly 2024-10-03 20:42:39 +02:00
Patrik Lundin 25a18fd58b
Remove extra dot 2024-10-03 20:15:39 +02:00
Patrik Lundin 32e4a99cef
Add initial forgejo runner config 2024-10-03 20:12:59 +02:00
Patrik Lundin 3883bb53b2
Trust jocar key 2024-10-03 15:56:30 +02:00
Patrik Lundin dc180c10b0
Fix so systemd file is named sunet-cdn-l4lb
Not sunet-sunet-cdn-l4lb
2024-08-20 12:38:06 +02:00
Patrik Lundin dd0493f869
Fix volume declarations
Did not expect to create anonymous volumes, see
https://stackoverflow.com/questions/46166304/docker-compose-volumes-without-colon
for more details. Now the host directories should be mounted. While here
try setting :ro to the paths we are not expecting to modify. The
/lib/modules :ro flag is based on
3cbd8258eb/cilium-lb.yaml (L143-L145)
2024-08-20 12:31:42 +02:00
Patrik Lundin 79f2018d1b
Fix path to template 2024-08-20 12:10:29 +02:00
Patrik Lundin 4755886ea9
Move manifest to expected location 2024-08-20 12:06:35 +02:00
Patrik Lundin f4cd10a970
Add mifr key, imported from platform-ops
Need to trust commits to puppet-sunet stable branch
2024-08-20 12:00:57 +02:00
Patrik Lundin 9991bef58d
Assign new cdn::l4lb class to machine 2024-08-20 11:27:26 +02:00
Patrik Lundin 6057c62f47
Initial commit of running cilium l4lb via compose 2024-08-20 11:25:15 +02:00
Patrik Lundin b014b4fdcc
Add sunet::dockerhost2 to cdn-prod-l4lb
While here fix indentation.
2024-08-15 09:21:02 +02:00
Patrik Lundin ac83234433
Merge remote-tracking branch 'multiverse/main' 2024-07-05 10:59:29 +02:00
Patrik Lundin 94a65a31e0
Fix problems with outdated sunet puppet modules
Problem seen:
```
Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Unknown variable: '::osfamily'. (file: /etc/puppet/cosmos-modules/augeas/manifests/params.pp, line: 7, column: 8) on node example-1.sunet.se
```

This way we run modules installed from upstream apt packages instead.
Solution to delete keys to use local packages from pahol.

While here fix pylint issue with not importing platform module at
beginning of file.
2024-07-04 14:42:34 +02:00
Patrik Lundin 3d0413b450
Disable ntpd management for now
The current ntp puppet manifest does not support 24.04, and we need to
figure out if the future means timesyncd or chrony.
2024-07-04 13:32:23 +02:00
Patrik Lundin 74fb420946
Add initial cosmos-rules 2024-07-04 08:48:58 +02:00
Patrik Lundin c417b1e296
Trust pahol key
Needed for puppet module
2024-07-03 17:27:23 +02:00
Patrik Lundin 4aa5e530f9
Trust jocar key
Needed for some puppet modules
2024-07-03 17:23:43 +02:00
Patrik Lundin 0b82213811
Add my GPG key 2024-07-03 15:56:09 +02:00
Patrik Lundin a49e9cfd24
Add init.pp
Based on geteduroam-ops
2024-07-03 14:48:52 +02:00
Patrik Lundin aa88795ee0
sunet-fleetlock: also handle ReadTimeout
Turns out this was not caught by ConnectionError.
2024-07-03 14:13:22 +02:00
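The distinction behind this fix can be illustrated with a minimal sketch. It uses Python's built-in exception classes as stand-ins, since the log does not show which HTTP library sunet-fleetlock uses; `RETRYABLE` and `is_retryable` are hypothetical names, not taken from the script. In the built-in hierarchy a timeout is not a kind of connection error, so both have to be listed explicitly:

```python
# Exceptions treated as transient. Both must be listed: in Python's built-in
# hierarchy TimeoutError is not a subclass of ConnectionError.
# (Stand-ins only; the real script may catch library-specific classes.)
RETRYABLE = (ConnectionError, TimeoutError)


def is_retryable(exc):
    """Return True if exc looks like a transient network failure."""
    return isinstance(exc, RETRYABLE)
```

Listing a narrow tuple like this, rather than catching `Exception`, keeps programming errors such as `ValueError` from being silently retried.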
Patrik Lundin 01768129f0
fleetlock: configurable lock/unlock timeout
While we already support setting a healthcheck timeout it probably
makes sense to be able to control how long we wait for a
fleetlock_lock() or fleetlock_unlock() call. This becomes important if
only running cosmos once a night or something like that. In that case
you probably want to give a physical machine more than 1 minute to
complete a reboot etc.

This can now be controlled by setting fleetlock_lock_timeout and
fleetlock_unlock_timeout in /etc/run-cosmos-fleetlock-conf. Keep in mind
that while it can make sense to increase the time for taking a lock,
releasing a lock should always be fast (either you have it and release
it, or you don't have it and it is a no-op) so setting a long unlock
timeout should probably never be done.

Since we also potentially wait the unlock timeout at boot (if the
fleetlock server is broken etc) that is another reason to keep it
short. The default 1m is probably OK for most uses.
2024-07-03 13:27:52 +02:00
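As a rough illustration of how the two settings could be turned into seconds, here is a minimal sketch. The setting names and the `1m` default come from the commit message above; the unit suffixes, the dict-based config, and the helper names `parse_duration` and `effective_timeouts` are assumptions made for the example, not the actual run-cosmos implementation:

```python
# Assumed unit suffixes; the real config format may differ.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(value):
    """Convert a string such as "1m" or "90s" into seconds."""
    value = value.strip()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)  # bare number: treated as seconds (assumption)


def effective_timeouts(conf):
    """Fill in the 1m default for whichever timeout is not configured."""
    keys = ("fleetlock_lock_timeout", "fleetlock_unlock_timeout")
    return {key: parse_duration(conf.get(key, "1m")) for key in keys}
```

With this sketch, raising only `fleetlock_lock_timeout` leaves the unlock timeout at the short default, which matches the advice above to keep unlocking fast.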
Patrik Lundin 443611dd3f
Merge pull request #49 from SUNET/john-permissions-fix
Enforce more strict permissions for files in Cosmos
2024-07-03 11:36:21 +02:00
Patrik Lundin dfda322939
Add setup_cosmos_modules 2024-07-01 11:16:15 +02:00
Patrik Holmqvist 4231b4ac1d
Migrate from legacy fact
This did not work on modern puppet in ubuntu24:
Warning: Interpolation failed with '::lsbdistcodename', but compilation continuing;
New syntax inspiration from:
https://www.puppet.com/docs/puppet/8/hiera_config_yaml_5#configuring_hiera
2024-06-19 14:07:13 +02:00
Patrik Holmqvist bc9d1dc960
Use upstream puppet modules for ubuntu24+.
This is how we do it in modern debian so it
makes sense to do it on modern ubuntu as well.
2024-06-19 14:02:24 +02:00
Patrik Lundin e315282bc5
Use more strict exception checking
This is probably wide enough and we do not need weird extra handling of
our own exception etc.

Thanks to @mickenordin for keeping me honest :).
2024-06-17 12:40:12 +02:00
Patrik Lundin 4b8b8887f6
sunet-fleetlock: handle connection errors
In order to handle upgrades of the fleetlock server when running only
one server we need to handle connection errors like connection refused
or timed out errors gracefully.

Because there are several different ways the connection can fail and it
is hard to keep track of them all, just catch everything. We then also
need special handling of our own timeout exception so we are not
accidentally stuck retrying forever.

Also fix so we actually use the request_timeout arg for individual HTTP
requests instead of the global timeout.

While here run isort to keep imports tidy.
2024-06-17 12:07:22 +02:00
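The catch-everything approach described in this commit, and why the script's own timeout exception needs special handling, can be sketched as follows. This is an illustration under assumptions, not the sunet-fleetlock code: `LockTimeout`, `call_with_retry`, and the deadline-based timeout are hypothetical, and the real script may implement its overall timeout differently:

```python
import time


class LockTimeout(Exception):
    """Hypothetical exception raised when the overall deadline has passed."""


def call_with_retry(request, total_timeout, retry_delay=1.0):
    """Call request() until it succeeds or total_timeout seconds have passed."""
    deadline = time.monotonic() + total_timeout
    while True:
        try:
            if time.monotonic() >= deadline:
                raise LockTimeout("gave up waiting for the fleetlock server")
            return request()
        except LockTimeout:
            # Our own timeout must escape; otherwise the broad handler
            # below would swallow it and the loop would retry forever.
            raise
        except Exception:
            # Connection refused, timed out, etc.: wait and try again.
            time.sleep(retry_delay)
```

Without the dedicated `except LockTimeout` clause, the broad `except Exception` would also catch the deadline error, which is the "stuck retrying forever" case the message warns about.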
Johan Wassberg c72f5ccd86
Allow for hosts without class(s) 2024-04-12 15:32:40 +02:00
Patrik Lundin df5558befb
Fix another indentation mismatch 2024-01-24 15:36:52 +01:00
Patrik Lundin 4b93d9c426
run-cosmos: support fleetlock unlocking at boot
This extends run-cosmos with a new argument that calls the unlock
function already included in the script as well as using the already
existing lock() function to make sure there is no race between the
bootup process and cron starting a normal run-cosmos process at the same
time.

The oexit() function is added to support exiting with an OK exit value
the same way eexit() is used to signal something is wrong.

This change also adds the systemd unit file that runs run-cosmos with the
new fleetlock-unlock argument at boot if fleetlock is configured.

While here fix indentation that was mixed between 3 and 4 spaces: it is
now 4 spaces everywhere.
2024-01-24 15:36:34 +01:00
John Van de Meulebrouck Brendgard 8d4ce2d1b7
Make sure that COSMOS_BASE is only readable
by root since it's possible that the directory
can contain files that after applying the
overlay to / should only be readable or writable
by root.
2023-11-17 15:03:47 +01:00
John Van de Meulebrouck Brendgard 75e566ab61
Make sure that /root in overlay is owned by root
as well as that /root/.ssh and its content is
only owned and readable by root. This is redundant
if the previous permissions were properly applied
and no other changes have been made by the user
or something else, but is added for good measure
as a layered defense.
2023-11-17 14:58:51 +01:00
John Van de Meulebrouck Brendgard ca353ed406
Set same permissions for /root/.ssh/authorized_keys
in post-tasks.d/010fix-ssh-perms as is done by
Puppet with sunet::ssh_keys.
2023-11-17 13:50:02 +01:00
Johan Wassberg a6a67d355f
Diffable 2023-11-14 15:28:46 +01:00
Johan Wassberg 120c4a5a93
A few more depends for Bookworm 2023-11-14 15:27:45 +01:00
Johan Wassberg 58a9ca7aa9
No need of x11 on our servers 2023-10-02 12:39:44 +02:00
Micke Nordin 3aac1f97d8
Add additional packages for use with debian 12
This patch will install three packages that are needed for normal operations of puppet using puppet-sunet with multiverse on Debian 12:

cron puppet-module-puppetlabs-cron-core puppet-module-camptocamp-augeas
2023-07-10 16:32:20 +02:00
Patrik Lundin 7baf9affb1
Add fleetlock support to run-cosmos
Makes run-cosmos request a fleetlock lock before running cosmos "update"
and "apply" steps. This is helpful for making sure only one (or several)
machine out of some set of machines runs cosmos changes at a time. This
way if cosmos (or puppet) decides that a service needs to be restarted
this will only happen on a subset of machines at a time. When the cosmos
"apply" is done a fleetlock unlock request will be performed so the
other machines can progress.

The unlock code in run-cosmos will also run the new tool
sunet-machine-healthy to decide things are good before unlocking. This
way if a restarted service breaks this will stop the unlock attempt
and in turn make it so the others should not break their service as
well, giving an operator time to figure out what is wrong.
2023-06-17 08:10:00 +02:00
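The lock, apply, health check, unlock sequence described in this commit can be sketched roughly as below. The function and its callable parameters are hypothetical placeholders for the corresponding run-cosmos steps and for sunet-machine-healthy (run-cosmos itself is a shell script, so this is only a model of the control flow):

```python
def run_with_fleetlock(lock, run_cosmos, healthy, unlock):
    """Model of the flow: take the lock, apply changes, unlock only if healthy."""
    if not lock():
        # Another machine holds the lock; do nothing this round.
        return "no-lock"
    run_cosmos()  # stands in for the cosmos "update" and "apply" steps
    if not healthy():
        # Keep the lock so other machines do not break their service as well.
        return "lock-held"
    unlock()
    return "unlocked"
```

Keeping the lock on a failed health check is the design choice that stops a bad change from rolling through the remaining machines.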
Johan Wassberg cf2e6b6518
File provided by Sunet::Dockerhost 2023-04-04 15:21:15 +02:00
Johan Wassberg 5af8093338
Add support for eyaml in Hiera
And at the same time remove support for gpg.

The modern version of the configuration (v5) has been tested with 20.04 but
might work with older dists.
2023-02-16 07:44:37 +01:00
Fredrik Thulin c400bba97d
remove 'make db'
The db-file, essentially providing reverse lookup of classes to host
names, is only used by some Nagios configuration instances and causes
continuing operational headaches in those ops-repos.

It should be kept/refactored to only apply to the monitoring hosts in
the cases where it is used, but we don't want any new ops-repos to use
it hence it should be removed from upstream multiverse.
2023-02-07 14:21:29 +01:00
Fredrik Thulin 12b2412ea7
run cron at boot too, to e.g. get new firewall rules installed 2023-02-06 17:12:01 +01:00
Fredrik Thulin 79606f2a6d
check for /etc/no-automatic-cosmos in the wrapper, and allow arguments to be passed 2023-02-06 16:47:41 +01:00
Fredrik Thulin 3988f5beb0
shellcheck fixes 2023-02-06 16:47:30 +01:00