libera/#devuan-dev/ Thursday, 2019-10-03

masonty all00:03
masonfsmithred: I got around to trying the timeout patch, and it does cut the timeout back, but I suspect we'll want a different solution. I'm going to look at how shutdownramfs works.01:03
masonLooking at how systemd accomplishes this smoothly is frustrating, as it's utterly opaque.01:04
mason(transient crypt services appear to be generated on the fly)01:05
golinuxDuh . . . what did you expect.  This is why we need a para on the free software page.01:05
* golinux apologizes for being a bot of a nag . . .01:06
masonhah01:06
golinuxbot > bit01:06
golinuxWell, maybe a bot too!!01:06
masonNo, it's useful. I'll write something up. I've just been distracted by this native ZFS set-up stuff, and now I want shutdown to work cleanly.01:06
golinuxI know.  That's why I have held off mentioning it.01:08
masonI don't mind pestering. It has a purpose. The guilt builds up until I act.01:09
golinuxGuilt is not a good model for contributors01:09
masonhah01:10
masonsystemd's cryptsetup.c annoys me immediately. It doesn't wrap at 80 columns.01:10
golinuxIn fact guilt is not much of a good model for anything constructive01:11
masonhttps://bpaste.net/show/WNG-01:13
masonThis code is gross, as I read further. I'm going to head off to dinner, and I'll write up a paragraph about shrinking opportunities for free software authors as a result of systemd. You'll have it this evening.01:15
golinuxI've been cooking too.  Now time to enjoy the fruits of my effort.  Later . . .01:16
masono/01:17
g4570ngolinux: rrq: Centurion_Dan: Thanks for your work, d-i is working fine again. Javier de EterTICs also sends his thanks02:41
masonSo, the shutdownramfs/haltramfs concept appears to have arisen with systemd. I need to research how other systems do it, because the notion of a final pivot and clean-up seems reasonable. Also worth understanding the goals... If a user can punch the power they can freeze the keys in RAM, which might mean an explicit clean of the memory at shutdown might be more or less futile. Maybe something like what03:02
masonOpenSSH does against Spectre might be something to consider.03:02
masonOr maybe that wouldn't be useful if they catch the RAM fresh. Dunno.03:03
koollman_once you can read ram and have physical access ... protection is rather tricky03:03
masonkoollman_: Right.03:04
koollman_need some live hardware-supported ram encryption. which no init system will solve03:04
masonkoollman_: So, I guess the question is, what's possible? Do we want to just make a cosmetic improvement in tearing down root on whatever on LUKS? Dunno as yet.03:04
masonThere's more reading to do.03:04
koollman_I don't think adding complexity for the sake of it helps. there's enough troubles in various cases when shutting down services in specific order already :)03:05
koollman_if there's a known attack to protect against, then it can be tested against various ideas03:05
koollman_as for alternative ideas: I like the rather clean concept of phases in s6. I suppose it would make implementing something similar to shutdown-ramfs easier. maybe. I would have to try :)03:09
koollman_(hm. not phases. 'stages')03:10
masonWell. Very simply, "hold off on unlocking things we can't unlock until the end. If what's left is root, pivot, unmount, and then spin down the LUKS device."03:10
masonHowever we get there, that's where we want to be I think.03:10
koollman_it's supposed to be simple. there are always weird corner cases :)03:11
masons/unlock/stop/03:11
masonYeah.03:11
koollman_the luks device may be on something other than a raw device. which may have its own requirements. and block the pivot somehow (or must be stopped at just the right moment)03:13
masonRight.03:13
koollman_(real world example: luks device on some userspace-exported nbd device. you kill the process ... can't umount or even sync :) )03:14
masonkoollman_: So, one of my stray thoughts is that we let the user specify some of this.03:15
plasma41I'm getting ready to format and mail this week's notes. If anyone wants to make any final edits to the notes, please do so now.03:19
masonplasma41: If you want to strike my part, we didn't really cover it and we can just do it again in a future meeting.03:21
masonOr the first time. Whichever.03:21
plasma41mason: Is that the PPA question?03:23
masonplasma41: That and alternative bootloaders.03:23
plasma41Ok, I removed them03:24
masonty03:24
plasma41Alright copying down the pad now03:49
fsmithredmason, a good place to put a setting like that might be in /etc/default/04:12
masonfsmithred: Sounds reasonable, yeah. Or possibly as an option in crypttab, if we can limit it to just crypted devices.04:19
agrisI don't do drugs04:39
agristablets either04:39
plasma41Meeting notes posted04:45
masonplasma41: ty04:46
masonfsmithred: So, we should enumerate scenarios where this matters.17:31
masonfsmithred: The other thing I read is that certain hardware may be happier itself having a clean shutdown, but I haven't found examples yet.17:32
fsmithredI am no security expert17:32
masonfsmithred: But, at the core of it, we're either setting a proper order so we can close the LUKS devices to clear the key, or we're working on a cosmetic fix.17:32
masonOrdering transitory/hotplug storage shouldn't be a concern as we're talking explicitly about the root filesystem.17:33
fsmithredright17:33
masonfsmithred: Also no expert here. I'm just casting about, trying to figure out what we really want to do.17:33
masonfsmithred: It could well be that given the nature of the problem, a cosmetic fix really is acceptable.17:34
fsmithredmy understanding is that reading ram requires physical access a very short time after shutdown or on a reboot17:35
masonfsmithred: Right.17:35
fsmithredshort = seconds, I think17:35
masonfsmithred: And if you have physical access, you can force the machine off by yanking power or doing something local.17:35
masonThe most interesting attack scenario, though, might be forcing a reset but then booting into a prepared environment that harvests memory. Consider a crashdump kernel, for instance.17:36
masonBut we still can't guarantee that we'd get a chance to clear the key from RAM even then. We can't intercept a sysrq-trigger event, for instance.17:37
fsmithredcan we run a script from the bios?17:38
masonfsmithred: I'm thinking we just address the cosmetic issue, which you've largely done with the patch you pointed me to. That said, it might be good to actually have a local version of that code (local to Devuan) so that it doesn't get overwritten periodically. Also, if we do that we can offer a cleaner shutdown in the uninterrupted case.17:38
masonfsmithred: A script from the BIOS to what end, and at what point?17:38
fsmithredmx/antix solved that with a package called cryptsetup-functions-modified or something like that17:39
fsmithreduses dpkg-divert to replace the cryptdisk.functions file17:39
masonfsmithred: I'm thinking more, capture LUKS device names that matter for this - maybe in /etc/default - and have the cryptsetup shutdown scripting simply ignore those.17:39
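The idea of listing LUKS devices that the shutdown scripting should ignore could be sketched roughly as follows. This is only an illustration: the variable name `CRYPTDISKS_KEEP_OPEN`, the helper `keep_open`, and the mapping names are all hypothetical, and no such setting exists in cryptsetup or under /etc/default/ today.

```shell
#!/bin/sh
# Hypothetical setting; in practice it might be sourced from a file
# under /etc/default/. Neither the variable nor such a file exists today.
CRYPTDISKS_KEEP_OPEN="root_crypt swap_crypt"

# Succeed if the mapping name given as $1 is on the keep-open list.
keep_open() {
    for name in $CRYPTDISKS_KEEP_OPEN; do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

# Example mapping names, made up for illustration.
for dst in root_crypt data_crypt; do
    if keep_open "$dst"; then
        echo "$dst: skipped at shutdown"
    else
        echo "$dst: would be stopped"
    fi
done
```

A list in a defaults file keeps the change out of crypttab itself; a per-device crypttab option would be the alternative raised earlier in the discussion.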
golinux(This discussion is interesting but will come to naught if we can't get the point release and Beowulf out the door. Carry on . . .)17:39
fsmithredwipe the ram during post?17:39
* golinux goes off to b'fast17:40
masonfsmithred: Yeah, when I read about that it seemed interesting. I need to learn about dpkg-divert as I know nothing at present.17:40
masongolinux: Enjoy!17:40
masonfsmithred: Wiping the RAM during POST would be interesting, but we can't guarantee that we control the execution environment at that point.17:40
masonWe'd have to control the UEFI firmware or the BIOS to actually do that. Too far outside of our scope.17:41
fsmithredoh, right. Every bios is different.17:41
fsmithredand uefi implementations are like designer drugs17:42
masonhah17:42
fsmithredhttps://github.com/MX-Linux/cryptsetup-modified-functions17:47
masonfsmithred: That's not quite right, though.17:50
fsmithredwhat's wrong?17:50
masonfsmithred: If you're going to set it to 1, why sleep at all? We're not going to try again and an arbitrary one-second sleep per device is at best only somewhat less annoying than a long delay per device.17:50
fsmithredI have mine set to 1 2 and it's not annoying at all17:51
masonLooking at https://github.com/MX-Linux/cryptsetup-modified-functions/blob/master/cryptdisks-functions it should simply remove the loop, not just set it to 1.17:51
fsmithredok, I don't know if anyone has suggested that or tried it before17:52
masonfsmithred: So, on my laptop, 1 2 would mean three seconds for the first device, and three for the second, always, so six seconds of staring at predicted error messages.17:52
masonfsmithred: line 198 says "sleep $i" which for you is sleep 1, then sleep 2, and it's per-device.17:52
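The arithmetic above can be checked with a small sketch. It assumes every stop attempt fails, so each device sits through the full list of back-off sleeps; `total_delay` is a hypothetical helper, not something from cryptsetup.

```shell
#!/bin/sh
# Sum the back-off sleeps one device incurs when every attempt fails.
# The arguments are the values the retry loop iterates over.
total_delay() {
    total=0
    for i in "$@"; do
        total=$((total + i))
    done
    echo "$total"
}

per_device=$(total_delay 1 2)
echo "one device:  ${per_device}s"
echo "two devices: $((per_device * 2))s"
```

With the loop list set to `1 2` this gives 3 seconds per device and 6 seconds for two devices, matching the figures in the discussion.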
fsmithredmaybe I don't have enough devices to make a difference17:53
fsmithredI haven't noticed a 3-second pause. Maybe 1.17:53
masonfsmithred: It's right there in the code.17:53
fsmithredI know, I put it there several times after upgrades17:53
masonfsmithred: Ought to look like this: https://bpaste.net/show/B-tA17:53
masonThat tries once, and will emit an error if it fails. It's the equivalent of the posted version, only without extraneous sleep and without the now-misleading loop scaffolding.17:54
masonah, dpkg-divert is simple enough17:59
fsmithredyeah, they have both cryptdisks-functions and cryptdisks.functions which are for two different versions of cryptsetup18:01
fsmithredthat's a little weird18:01
masonWas just noticing I had the dot and not the dash locally.18:02
fsmithredthe two files are very different18:03
fsmithredbut the fix is the same18:03
masonWhere's the file with the dash come from? I only see the one, owned by the cryptsetup package, with a dot.18:04
fsmithredI have cryptdisks-functions in beowulf and cryptdisks.functions in ascii18:05
masonAh, churn for the sake of churn. :P Awesome.18:05
masonfsmithred: Here's my patch for ASCII: https://bpaste.net/show/Wm_t18:06
masonTested minimally and working without the cosmetic annoyances and delay.18:06
fsmithredso, just remove those two lines - for... ...done18:08
masonand the sleep18:08
masonand then fix the indentation18:08
masonRemember that the sleep does nothing without the for loop, as the sleep is to give an increasing back-off to let the device settle.18:09
masonBut if we're only trying once, there's no need for the sleep at all. Plus, it uses the loop variable $i.18:09
masonOh, hell, something else does too. Half a sec and I'll have a corrected patch.18:10
fsmithredif [ $ret -ne 0 ]    ?18:11
masonJust tracing through. So, the line that checks return value in the bit we're patching... It's an odd choice by the original author.18:13
masonIt checks for return code 1 from cryptsetup(8), meaning "wrong parameters", or return code 2 (no permission / bad passphrase) combined with the timeout having reached 16 seconds.18:13
masonThat seems... nonsensical.18:13
fsmithredso...18:15
masonSo, we only log those two errors, except that we only log the "no permission / bad passphrase" error *if the thing has tried longer than 16 seconds*.18:15
fsmithredif you don't have permission, you should try more times18:15
masonI'd tend to think we want to log any error.18:15
masonRight. That doesn't make sense to me.18:15
fsmithredyeah, that's what I suggested - if it's nonzero, log it18:15
fsmithredI'm about to try that18:16
masonRight.18:16
masonOh, haha, that is what you suggested.18:16
masonYes, that.18:16
masonThe trick there is that we are still left with extraneous code, because we don't get to that test if ret = 018:17
masonfsmithred: So, https://bpaste.net/show/ZT_x but I want to run it a couple times, because some error text just flashed by18:20
fsmithredok, I have to reboot to test this. It'll take a few minutes to shut stuff down and then start up again.18:21
masonoh, I think we do want the break - testing18:22
fsmithredyou removed the test for nonzero18:22
fsmithredbut not the action18:22
masonright, the action being, we want to log, no?18:22
masonI might be misreading it.18:22
fsmithredyou'll log regardless of the exit code18:23
fsmithredwith "dst busy"18:24
masonyeah, but we don't log the return code in that line18:24
fsmithredwhich might be ok for testing18:24
masonYeah, we want the break too. Sigh, messy.18:25
masonargh, no, we don't want the break. The break is for the for loop we've removed. Correct as it stood. Sigh. More coffee will fix this.18:27
fsmithredI've got the break and the test for nonzero. Gonna reboot now. Back in a few minutes. (yeah, really)18:27
fsmithredoh18:27
masonfsmithred: If you removed the for loop you don't want the break.18:27
fsmithredok18:27
fsmithredwill remove it18:27
masonI was just curious about some of the error text I saw, and thought I'd missed something.18:27
masonkk18:27
masonThis patch should be correct: https://bpaste.net/show/ZT_x18:28
masonOh, except, I should find the source for log_action_foo_msg to make sure there's no extra behaviour there. Not sure why there are the two. We can probably just log $ret and the message in the end_msg call.18:28
fsmithredbrb18:29
masonfsmithred: https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html18:34
fsmithredseems to work. I rebooted twice. The shutdown was fast, and I didn't see any red flash by.18:34
masonI think for maximum correctness we want one log message, the end one. I'll do one final change and test.18:35
masonIf there are better docs for the lsb/init-functions Debian ships I'd love to see them.18:35
fsmithredI don't know.18:36
fsmithredanyway, I need to take a break from the screen. Probably have to go move some boxes or something.18:36
fsmithredbbl18:37
masonfsmithred: FWIW, the init-functions matter. We need the log_action_cont_msg one, not log_action_end_msg or it gets ugly.19:08
masonfsmithred: Here's where I ended up: https://bpaste.net/show/C8qM19:18
fsmithredmason, I'm not seeing where you test for nonzero $ret20:39
masonfsmithred: handle_crypttab_line_stop "$dst" "$src" "$key" "$opts" <&3 && break || ret=$?20:40
fsmithredthat sets ret but doesn't test it20:41
masonfsmithred: If we don't error, we break. So if we get past that line, we've got an error. The test is implicit, unless I'm badly misunderstanding.20:41
masonSo, ret only gets set on fail, and we can trust that it's got a valid value from $?20:41
fsmithredah, I didn't see the break20:41
masonRight. The test for ret=0 is implicit.20:41
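The implicit test in the `cmd && break || ret=$?` idiom can be seen in a minimal sketch. `try_stop` is a hypothetical stand-in for handle_crypttab_line_stop, and the single-pass loop is an assumption made so that `break` has something to leave; this is not the patched code itself.

```shell
#!/bin/sh
# Hypothetical stand-in for handle_crypttab_line_stop:
# exits with the status passed as $1 (0 means success).
try_stop() {
    return "$1"
}

stop_once() {
    for attempt in 1; do
        try_stop "$1" && break || ret=$?
        # Reached only on failure: on success the break has already
        # left the loop, so ret always holds a non-zero status here.
        echo "failed with $ret"
        return 0
    done
    echo "stopped cleanly"
}

stop_once 0
stop_once 2
```

The first call prints "stopped cleanly" and the second prints "failed with 2". Nothing after the idiom needs to compare `ret` against zero, which is why the explicit test could be dropped.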
masonfsmithred: That's the trouble with the "cute" style they've adopted. It's better to waste a little whitespace and have control flow be more obvious.20:42
* fsmithred likes whitespace20:44
fsmithredI will test this later.20:44
fsmithredneed to get back to work20:45
masonfsmithred: The really interesting bit is how badly it went using the wrong LSB init function. Those ought to be documented somewhere, and they seem not to be. I should probably write docs rather than waiting for someone else to do it.20:45
masonkk20:45
