Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging stops when logfile goes away on zram sync #1654

Open
mstormi opened this issue May 13, 2024 · 19 comments
Open

Logging stops when logfile goes away on zram sync #1654

mstormi opened this issue May 13, 2024 · 19 comments
Labels
bug An unexpected problem or unintended behavior

Comments

@mstormi
Copy link
Contributor

mstormi commented May 13, 2024

Sorry upfront for calling this a bug, it's merely a feature request or something inbetween,
but it has severe impact on useability for people to use openHABian (which there are quite a lot).

openHABian now by default syncs ZRAM to disk, un- and remounting the filesystem the /var/log/openhab is on.
openHAB keeps running but that seems to result in logging to stop, probably because the file handle is gone.
Now to issue a log:set command seems to recover logging but unforunately this is difficult to initiate from the outside.

Could you add some sort of periodic logfile handle or logger reinit check that does what issueing log:set does ?
Thanks!

See also https://community.openhab.org/t/openhabian-testing/147079/121

@mstormi mstormi added the bug An unexpected problem or unintended behavior label May 13, 2024
@openhab-bot
Copy link
Collaborator

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/openhabian-testing/147079/128

@rkoshak
Copy link
Contributor

rkoshak commented May 14, 2024

I commented on another thread but wanted to comment here as well.

Since events.log doesn't seem to have the same problem on a zram sync, maybe switching openhab.log to use a RollingRandomAccessFile appender would work. That's the major difference between events.log and openhab.log in the configs.

@J-N-K
Copy link
Member

J-N-K commented May 14, 2024

Before #1078 it already was a RollingRandomAccessFile and I didn't find any note in that PR why it was changed. I'll move this issue to distro.

@J-N-K J-N-K transferred this issue from openhab/openhab-core May 14, 2024
@wborn
Copy link
Member

wborn commented May 14, 2024

It's an issue in the logging framework Log4j2. I've seen this same issue also in other Java applications using Log4j2.

There seems to be a workaround, see:

https://stackoverflow.com/questions/52592234/does-anybody-know-how-to-configure-or-extend-lo4j2-to-recreate-logs-after-deleti

But it would be better if it is fixed in Log4j2 itself. Maybe you can check the issue tracker if there is an open issue for it and if not create one with enough details so it does not get closed like:

https://issues.apache.org/jira/browse/LOG4J2-3099

@rkoshak
Copy link
Contributor

rkoshak commented May 14, 2024

That doesn't really address two oddities here.

  1. Why doesn't the problem effect events.log but only openhab.log?
  2. Why is openhab.log using a different appender from events.log in the first place?

Given the somewhat hostile response to the issue you link to, I personally don't have much hope that this would ever get fixed upstream since they basically said "you are stupid for deleting the file in the first place, that's not our fault."

It would be informative if someone experiencing this problem could verify whether using the RollingRandomAccessFile appender for openhab.log mitigates this problem. If so, if that appender is good enough for events.log it seem like it should be good enough for openhab.log too unless someone knows why they are different in the first place. So we can mitigate the problem by changing the appender in log4j2.xml here whether or not log4j2.xml ever addresses the problem on their end.

@wborn
Copy link
Member

wborn commented May 15, 2024

Sure we can workaround it by using another logger.

The different loggers have different implementations which may result in better performance at the expense of using more memory. Something you may not have an abundance of when using ZRAM or RPIs

https://stackoverflow.com/questions/27836850/what-is-the-difference-between-rollingfile-and-rollingrandomaccessfile-in-log4j2

@mstormi
Copy link
Contributor Author

mstormi commented May 15, 2024

Sure we can workaround it by using another logger.

As @J-N-K already noted openhab.log used to be a RollingRandomAccessFile like events.log still is.
Log performance isn't an issue.
My first simple attempts to change that in log4j2.xml failed, though. No openhab.log logging at all any more, even after OH restart. Please try yourself. You can systemctl start zsync to provoke the situation when logging stops.

at the expense of using more memory. Something you may not have an abundance of when using ZRAM or RPIs

The stackoverflow link speaks of 256K. These we can spend for sure.

wborn added a commit to wborn/openhab-distro that referenced this issue May 15, 2024
@wborn
Copy link
Member

wborn commented May 15, 2024

No openhab.log logging at all any more, even after OH restart.

Maybe you forgot to also update the end tag? 🤔
Works fine for me.

@mstormi
Copy link
Contributor Author

mstormi commented May 15, 2024

Maybe you forgot to also update the end tag? 🤔

No but it was an unspotted typo indeed, works now.

EDIT: no issues so far I'd think it's safe to merge

@wborn
Copy link
Member

wborn commented May 15, 2024

Given the somewhat hostile response to the issue you link to, I personally don't have much hope that this would ever get fixed upstream since they basically said "you are stupid for deleting the file in the first place, that's not our fault."

There is a more reasonable response in LOG4J2-3035 which is still open. 🙂

@rkoshak
Copy link
Contributor

rkoshak commented May 15, 2024

If performance or memory are of concern, according to that StackOverflow post, either RollingRandomAccessFile appender or RollingFile appender can be configured to work similarly in terms of flushing and the size of the buffers. It looks to not be. a problem but if it becomes one we have that option.

What's weird is that on the StackOverflow post and the issue they describe the exact opposite behavior as we are seeing. They are saying that RollingRandomAccessFile appender has problems with logrotate (and their explanation makes sense) but we are actually seeing the problem with RollingFile appender.

Weird and I'm sure it has to do with details in how zram does the flush.

But if it works it works! :-D

@mstormi
Copy link
Contributor Author

mstormi commented May 20, 2024

But if it works it works! :-D

Yes but seems it does not. I keep seeing the log stop issue at times.
So the logfile type change doesn't do harm but didn't help either (there's others like @holgerfriedrich to report this, too).

I noticed you can issue about any log:set command such as log:set default abcd to re-activate logging.

So given to wait or attempt convincing the log4j2 people is very unlikely to work, I'm coming back to my original request:

Could you add some sort of periodic logfile handle or logger reinit check that does what issueing log:set does ?

@wborn
Copy link
Member

wborn commented May 24, 2024

So given to wait or attempt convincing the log4j2 people is very unlikely to work, I'm coming back to my original request

Could you add some sort of periodic logfile handle or logger reinit check that does what issueing log:set does ?

I'd personally rather put effort into fixing it in log4j2 itself instead of creating a workaround that only works with openHAB. That would solve similar issues in other Java applications I develop as well. 😉

@mstormi
Copy link
Contributor Author

mstormi commented May 24, 2024

I'd personally rather put effort into fixing it in log4j2 itself instead of creating a workaround that only works with openHAB. That would solve similar issues in other Java applications I develop as well. 😉

Then please do and go convince the log4j2 devs yourself. I'm no java dev and I don't have neither the capabilities nor energy to.
Given their hostility on the subject though (see link in history) and that nothing has changed in years I'm pretty sure they will not let themselves be persuaded so that feels like waiting for Godot.

Now short of that being a short term option (if any at all), I'd really appreciate if you could build a fix in openHAB for now to ease user life being affected today.

@wborn
Copy link
Member

wborn commented May 24, 2024

I'd really appreciate if you could build a fix in openHAB for now to ease user life being affected today.

I don't have time for that now. But if I would have time, I'd rather spend it on fixing the issue in log4j2. 😉

@michikrug
Copy link

  1. Why doesn't the problem effect events.log but only openhab.log?

Just for completeness: for me both files are affected by the issue

@mstormi
Copy link
Contributor Author

mstormi commented Jun 7, 2024

@openhab/core-maintainers anyone else willing to take on this?

@Micha-he
Copy link

Micha-he commented Nov 25, 2024

Is there any news? My logging stops still on the current (4.2.2) Openhab server, if I activate the zsync-timer.

@openhab-bot
Copy link
Collaborator

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/zram-not-working-accordingly-on-raspberry-pi-5/157778/70

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants