-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support determining mode based on shebang interpreter directive #47
base: master
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -173,11 +173,16 @@ impl Buffer { | |
file_path: Option<PathBuf>, | ||
repo: Option<RepositoryRc>, | ||
) -> Self { | ||
let mode = file_path | ||
.as_ref() | ||
.map(|path| context.0.mode_by_filename(path)) | ||
let mode = text | ||
.line(0) | ||
.as_str() | ||
.and_then(|shebang| context.0.mode_by_shebang(shebang)) | ||
.or_else(|| { | ||
file_path | ||
.as_ref() | ||
.and_then(|path| context.0.mode_by_filename(path)) | ||
}) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This may potentially read the whole file in pathological cases, e.g. a minified file that has no new lines. One goal of I think the right solution here long term is to build buffers, i.e. call For this PR though, I'd be happy if instead you just bound how much of the file you read, say 256 bytes at most and test the regex for that. You'll have to deal with potentially truncated utf-8... Maybe a better solution is to read characters until you encounter either 1. a new line or 2. if you don't after X characters, you give you and don't check the shebang -- essentially we only test if shebangs apply up to a certain line length. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A 2nd comment is that I think we should test if "mode_by_filename" applies and only then check the shebangs to avoid having to read the file if the name already matches. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great suggestions. I'm looking to see if there is any standards around whether white space is allowed before the |
||
.unwrap_or(&PLAIN_TEXT_MODE); | ||
|
||
let mut parser = mode | ||
.language() | ||
.and_then(|result| result.ok()) | ||
|
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -32,7 +32,7 @@ use crate::{ | |||
splash::{Properties as SplashProperties, Splash}, | ||||
theme::{Theme, THEMES}, | ||||
}, | ||||
config::{EditorConfig, PLAIN_TEXT_MODE}, | ||||
config::EditorConfig, | ||||
error::Result, | ||||
task::TaskPool, | ||||
}; | ||||
|
@@ -94,11 +94,16 @@ pub struct Context { | |||
} | ||||
|
||||
impl Context { | ||||
pub fn mode_by_filename(&self, filename: impl AsRef<Path>) -> &Mode { | ||||
pub fn mode_by_filename(&self, filename: impl AsRef<Path>) -> Option<&Mode> { | ||||
self.modes | ||||
.iter() | ||||
.find(|&mode| mode.matches_by_filename(filename.as_ref())) | ||||
.unwrap_or(&PLAIN_TEXT_MODE) | ||||
} | ||||
|
||||
pub fn mode_by_shebang(&self, shebang: &str) -> Option<&Mode> { | ||||
self.modes | ||||
.iter() | ||||
.find(|&mode| mode.matches_by_shebang(shebang)) | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If shebang is a variant of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've been thinking about merging these methods into something like All of that being said, prior to detecting the mode or creating the buffer Line 156 in 0b40784
Buffer::new() so reading a slice should be quite efficient.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here is an updated version of this code with a limit on the length of text examined for the interpreter directive, an inversion of the order mode checks are performed (filename checks first, then shebang), and a merging of the mode_by_filename() and mode_by_shebang() methods into one, mode_by_file(), which always returns a I ended up settling on 256 characters for the maximum length of the shebang directive, matching what linux has done since 2018 (https://lore.kernel.org/lkml/[email protected]/) (https://github.com/torvalds/linux/blob/master/include/uapi/linux/binfmts.h) I haven’t found a satisfactory way of adding a shebang case to FilenamePattern, the biggest stumbling blocks being that the shebang line would need to be passed to the matches() method and the pattern list would need to be sorted/filtered if we wanted to ensure that filename patterns were always examined before shebang directives. |
||||
} | ||||
} | ||||
|
||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking whether this should be a new field, rather than a new variant of
FilenamePattern
-- granted, we should probably rename it toFilePattern
as it will look inside the file too to determine if the mode applies. I.e. something likeFilePattern::Shebang
.The structure I suggest may make it harder to avoid reading the file when the filename would suffice.