-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compile pointer to memref #57
Conversation
WalkthroughThis pull request introduces comprehensive type refinements and code simplifications across multiple modules in the Charms library. The changes primarily focus on enhancing pointer type specifications, streamlining pointer operations, and improving type safety in sorting algorithms, benchmarking, and intrinsic function handling. Key modifications include updating function signatures to specify more precise pointer types, simplifying pointer loading and manipulation, and introducing better error handling mechanisms. Changes
Sequence DiagramsequenceDiagram
participant Expander
participant Definition
participant Charms
participant Intrinsic
Charms->>Definition: Compile definitions
Definition->>Expander: Expand to MLIR
Expander-->>Definition: Return expanded definitions
Definition->>Charms: Track required intrinsic modules
Charms->>Intrinsic: Process intrinsic modules
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (9)
lib/charms/prelude.ex (1)
98-104
: Chaining preprocessors elegantly.
The functional approach to apply each preprocessor in turn is clean. For large argument lists or multiple preprocessors, consider short-circuiting on errors for performance.lib/charms/pointer.ex (1)
106-117
: Pointer type generation with layout.
Creating a strided layout attribute is a solid approach. Consider applying similar logic for multi-dimensional memrefs as needed.bench/vec_add_int_list.ex (1)
15-17
: Consider optimizing the reduce operationWhile the pointer loading simplification is good, the reduce operation could be further optimized to avoid allocating a new pointer in each iteration.
Consider pre-allocating the pointer outside the reduce:
init = SIMD.new(SIMD.t(i32(), 8), [0, 0, 0, 0, 0, 0, 0, 0]) + v_ptr = Pointer.allocate(i32()) Enum.reduce(l, init, fn x, acc -> - v_ptr = Pointer.allocate(i32()) enif_get_int(env, x, v_ptr) i = Pointer.load(i_ptr) Pointer.store(i + 1, i_ptr) Pointer.load(v_ptr) |> vector.insertelement(acc, i) end)bench/enif_quick_sort.ex (1)
13-29
: Consider optimizing pointer arithmetic operationsWhile the implementation is correct, consider reducing the number of
element_ptr
calls by using pointer arithmetic. This could improve performance in tight loops.- swap(Pointer.element_ptr(arr, i), Pointer.element_ptr(start, j)) + i_ptr = Pointer.offset(arr, i) + j_ptr = Pointer.offset(start, j) + swap(i_ptr, j_ptr)test/add_two_int_test.exs (1)
23-24
: Remove redundant arithmetic operationsThese operations (
sum / 1
,sum + 1 - 1
) don't affect the result and can be removed for clarity.- sum = sum / 1 - sum = sum + 1 - 1bench/enif_tim_sort.ex (1)
11-33
: Consider simplifying the insertion sort implementationThe current implementation uses multiple pointer operations and allocations in tight loops. Consider simplifying by:
- Reducing temporary pointer allocations
- Using pointer arithmetic instead of repeated element_ptr calls
for_loop {temp, i} <- {start, n} do i = value index.casts(i) :: i32() i = i + start_i - j_ptr = Pointer.allocate(i32()) - Pointer.store(i - 1, j_ptr) + j = i - 1 while( - Pointer.load(i32(), j_ptr) >= left && - Pointer.load(Pointer.element_ptr(arr, Pointer.load(i32(), j_ptr))) > - temp + j >= left && Pointer.load(Pointer.element_ptr(arr, j)) > temp ) do - j = Pointer.load(i32(), j_ptr) Pointer.store( Pointer.load(Pointer.element_ptr(arr, j)), Pointer.element_ptr(arr, j + 1) ) - Pointer.store(j - 1, j_ptr) + j = j - 1 endlib/charms.ex (1)
63-69
: Consider adding duplicate check for intrinsic modulesWhile the code handles module self-reference (
r != env.module
), it might be worth checking for duplicates inrequired_intrinsic_modules
to avoid redundant processing.Consider adding:
i = - for r <- required_intrinsic_modules, r != env.module do + for r <- required_intrinsic_modules |> Enum.uniq(), r != env.module dotest/defm_test.exs (2)
46-64
: Consider adding memory leak checksThe sorting tests cover functionality well, but consider adding memory leak detection for the allocated arrays, especially important when dealing with pointer operations.
Consider adding:
# After the sorting tests test "#{s} memory usage" do initial_memory = :erlang.memory(:total) # Run multiple sort operations final_memory = :erlang.memory(:total) # Assert memory growth is within acceptable bounds assert (final_memory - initial_memory) < threshold end
168-180
: Expand type mismatch test coverageWhile the f32 to i32 mismatch test is good, consider adding more edge cases relevant to pointer/memref compilation.
Consider adding:
test "pointer type mismatches" do assert_raise ArgumentError, ~r/type mismatch/, fn -> # Test pointer type conversion mismatches # Test memref type conversion mismatches end end
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
bench/enif_merge_sort.ex
(1 hunks)bench/enif_quick_sort.ex
(2 hunks)bench/enif_tim_sort.ex
(1 hunks)bench/sort_benchmark.exs
(1 hunks)bench/sort_util.ex
(5 hunks)bench/vec_add_int_list.ex
(1 hunks)lib/charms.ex
(2 hunks)lib/charms/constant.ex
(1 hunks)lib/charms/defm.ex
(0 hunks)lib/charms/defm/definition.ex
(3 hunks)lib/charms/defm/expander.ex
(9 hunks)lib/charms/intrinsic.ex
(1 hunks)lib/charms/jit.ex
(2 hunks)lib/charms/pointer.ex
(3 hunks)lib/charms/prelude.ex
(2 hunks)test/add_two_int_test.exs
(1 hunks)test/defm_test.exs
(2 hunks)
💤 Files with no reviewable changes (1)
- lib/charms/defm.ex
✅ Files skipped from review due to trivial changes (1)
- lib/charms/constant.ex
🔇 Additional comments (48)
lib/charms/prelude.ex (8)
7-7
: Looks good on the Dialect alias additions.
No issues spotted; the extended alias list provides broader MLIR Dialect coverage.
10-13
: Validate numeric literal conversion.
The specialization for integers/floats seems appropriate. Ensure all other numeric edge cases (e.g., big integers) are handled as expected.
17-17
: Fallback literal logic.
Returning the value unmodified for non-numeric inputs aligns with your design. No issues found.
39-46
: Minimal overhead for 'const' definition.
This implemention neatly delegates literal conversion to the existing Charms.Constant facility. No issues found.
55-85
: Confirm pointer extraction logic.
The condition checks for both an LLVM pointer and a memref pointer, which may appear contradictory. Consider verifying that both checks are intended and won't skip some valid scenarios.
87-96
: Argument type consistency check.
Raising an ArgumentError on a type mismatch is straightforward. Good approach for early failure and debugging.
105-118
: Robust ENIF call mechanism.
This function encapsulates argument preprocessing and function calling neatly. Consider verifying what happens if the callee is missing from ENIF.signature.
122-126
: Automated intrinsic creation.
Generating intrinsics in a loop aligns well with common patterns. Code readability and maintainability are good.
lib/charms/pointer.ex (13)
4-5
: Enhanced module documentation.
The updated docstrings clarify the intended role of smart pointers with SIMD/Tensor support.
7-7
: Imports and usage.
Using both Beaver and Charms.Intrinsic suggests deeper integration of pointer operations. No immediate issues.
11-11
: Expanded alias usage.
Referencing MemRef, Index, and Arith fosters clearer pointer operations. Seems consistent with the module’s focus.
26-51
: Flexible array allocation.
Distinguishing between integer and MLIR.Value-based sizes looks good. Ensuring type conversions for dynamic sizes can prevent runtime errors.
59-71
: Conditional load for LLVM vs. MemRef pointer.
Ensuring both pointer types are handled is beneficial. Keep an eye on edge cases where the pointer might be partially typed or cast incorrectly.
73-90
: Pointer type checks.
Providing distinct memref_ptr? overloads for Type and Value helps unify pointer handling. The load/1 function’s error message is quite direct, which aids debugging.
98-104
: Store method usage.
Storing data at index 0 for a pointer is straightforward. Ensure no out-of-bounds scenarios occur for arrays bigger than one element.
118-142
: Offset pointer recalculation.
Reinterpreting the cast works, but ensure dynamic offsets/sizes are always matched. The error raise is helpful if the dense arrays can’t be built.
144-155
: Element pointer type consistency check.
Verifying that elem_t matches the pointer’s element type is crucial to avoid silent misalignment issues.
156-168
: Ensure integer index type.
Casting n to Type.index() is prudent. Double-check if negative indices are possible or should be allowed.
176-190
: Shortcut for typed pointer element_ptr.
Raising an error if the pointer is not typed ensures clarity. This approach is consistent with the code in load/1.
192-198
: Retrieve element type from typed pointer.
The explicit error message is helpful. This ensures that only typed pointers pass.
210-213
: Typed pointer creation.
Returning a MemRef-based pointer type from t(elem_t) is consistent with the rest of the Pointer module’s approach.
lib/charms/defm/definition.ex (3)
262-272
: Accumulating required intrinsic modules.
Collecting these modules in MapSet improves composability across definitions. Good approach to track cross-module dependencies.
293-295
: Propagate intrinsic modules in the result tuple.
Returning required_intrinsic_modules alongside IR and referenced_modules is logical. This ensures the compiler always knows which intrinsics are needed.
318-333
: Enhanced error handling with stack traces.
Re-raising the error with a trace is highly beneficial for debugging. The fallback cases appear thorough.
lib/charms/defm/expander.ex (8)
22-22
: Additional MLIR dialect alias added.
Ensuring SCF, MemRef, Index, Arith, and UB are available strengthens coverage for expansions.
37-37
: Tracking required_intrinsic_modules in Expander.
This field ensures expansions automatically record dependent intrinsic modules. Smooth integration with the rest of the pipeline.
470-470
: Marking intrinsic modules as used.
Updating state.mlir.required_intrinsic_modules in expand_intrinsics is a straightforward approach to track usage.
501-501
: export_intrinsics? logic.
Delegating to module.intrinsics fosters dynamic discovery. No issues spotted; good maintainability.
785-790
: '::' operator expansion.
Retaining the AST structure with {:"::", meta, [param, type]} is helpful when generating typed calls.
1125-1129
: for_loop design.
This snippet constructs a structured loop over the pointer with SCF.for. The code sets an element and index variables properly.
1237-1241
: export_intrinsics? checks.
Confirming the target module is compiled and loaded prevents runtime surprises. Good safety net for dynamic calls.
1248-1253
: Handling type creation calls.
When called on MLIR.Type, the fallback ensures arguments are expanded before applying type constructors. This is consistent with the dynamic approach.
bench/sort_benchmark.exs (2)
13-13
: Verify parallel benchmark stability
Adding parallel benchmarking could affect measurement accuracy due to system resource contention. Consider documenting the rationale for choosing 2 parallel processes and verifying that results remain consistent across multiple runs.
14-14
: LGTM! Improved input generation
The dynamic input generation using Enum.map
and Enum.into
is more maintainable than the previous hardcoded approach.
bench/enif_merge_sort.ex (2)
6-6
: LGTM! Enhanced type safety
The explicit Pointer.t(Term.t())
type specification improves type safety and code clarity.
9-9
: LGTM! Simplified midpoint calculation
The direct value assignment using value arith.divsi
is more efficient than the previous approach using result_at
.
bench/vec_add_int_list.ex (1)
8-9
: LGTM! Improved type safety in pointer initialization
Using Pointer.element_type(i_ptr)
for the constant type ensures type consistency with the pointer.
bench/enif_quick_sort.ex (2)
6-10
: LGTM! Type refinement improves safety
The updated type signature Pointer.t(Term.t())
provides better type safety, and the implementation is clean and efficient.
Line range hint 33-39
: LGTM! Type refinement is consistent
The updated type signature maintains consistency with other functions while preserving the correct recursive quicksort implementation.
test/add_two_int_test.exs (1)
49-57
: LGTM! Comprehensive test coverage
Good test coverage including:
- JIT initialization and caching
- Normal addition cases
- Error handling for invalid inputs
- Cleanup with engine destruction
bench/enif_tim_sort.ex (1)
Line range hint 37-77
: LGTM! Tim sort implementation with proper type refinements
The implementation correctly handles:
- Run size management
- Merging of sorted runs
- Memory management with proper pointer types
bench/sort_util.ex (2)
Line range hint 5-24
: LGTM: Type refinement improves safety of pointer operations
The enhanced type specification Pointer.t(Term.t())
provides better type safety for pointer operations. The implementation correctly manages memory allocation and list traversal.
Line range hint 27-102
: Consider adding memory cleanup for temporary arrays
While the pointer type refinements improve type safety, the temporary arrays (left_temp
and right_temp
) allocated at lines 31-32 should be properly deallocated after use to prevent memory leaks.
Let's verify if there are any memory cleanup patterns used elsewhere in the codebase:
lib/charms/intrinsic.ex (1)
23-23
: Verify the necessity of explicit nil return
The __use_intrinsic__
function is added as a hook for intrinsic module handling, but the explicit nil
return might be unnecessary as it's the default return value in Elixir.
Consider simplifying to:
-def __use_intrinsic__, do: nil
+def __use_intrinsic__, do: :ok
Let's check if there are any direct calls to this function that depend on the return value:
✅ Verification successful
Keep the explicit nil
return value
The __use_intrinsic__
function is called during compile-time in a quoted expression within lib/charms.ex
for required intrinsic modules. Since it's part of the module's public interface and is explicitly called in compile-time machinery, it's better to keep the explicit nil
return value for clarity and to maintain a clear contract, even though it's the default return value in Elixir.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for direct calls to __use_intrinsic__
rg "__use_intrinsic__.*\(" --type elixir
Length of output: 42
Script:
#!/bin/bash
# Let's try a broader search for __use_intrinsic__ without parentheses
rg "__use_intrinsic__" --type elixir -A 3
# Also search for any references in macro definitions or module attributes
ast-grep --pattern 'def $_ do
$$$
__use_intrinsic__
$$$
end'
# Let's also check if it's used in any module attributes or compile-time checks
rg "@.*__use_intrinsic__" --type elixir
Length of output: 421
Script:
#!/bin/bash
# Let's check the context of how it's used in lib/charms.ex
rg -B 5 -A 5 "__use_intrinsic__" lib/charms.ex
# Also check if there are any compile-time behaviors or guards related to this function
ast-grep --pattern 'defmacro __using__($_) do
$$$
__use_intrinsic__
$$$
end'
Length of output: 326
lib/charms.ex (2)
51-53
: LGTM: Enhanced compilation output with intrinsic modules
The addition of required_intrinsic_modules
to the compilation output improves the tracking of intrinsic dependencies.
74-74
: LGTM: Proper integration of intrinsic modules
The unquote_splicing(i)
correctly integrates the intrinsic module quotes into the final quoted expression.
lib/charms/jit.ex (2)
24-24
: Verify vector conversion implications
The uncommented vector conversion pass with 'reassociate-fp-reductions' could affect floating-point precision and performance. While this is necessary for pointer/memref compilation, we should ensure it doesn't introduce unexpected behavior.
✅ Verification successful
Let me check the floating-point operations in the test files to better understand the impact.
Vector conversion is safe to enable
The vector conversion pass with 'reassociate-fp-reductions' is safe to enable because:
- The codebase only uses vectors for integer operations (as seen in bench/vec_add_int_list.ex)
- Floating-point operations are limited to test cases and type checking, not actual computations
- The vector-to-llvm conversion is only referenced in jit.ex, suggesting it's a necessary compilation step
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for vector-related operations that might be affected
ast-grep --pattern 'vector.$$$'
# Look for floating-point operations that might be impacted
rg -l 'f(32|64)'
Length of output: 918
Script:
#!/bin/bash
# Check floating-point related tests
rg -A 5 'f(32|64)' test/defm_test.exs test/const_test.exs
# Look for any MLIR/LLVM vector conversion patterns in the codebase
rg -l "convert.*vector.*llvm"
Length of output: 2299
85-86
: Consider preserving error context
The simplified error handling might make debugging harder by not capturing and formatting diagnostic information. Consider preserving some form of error context while maintaining the streamlined approach.
Summary by CodeRabbit
New Features
Charms.Prelude
module.Charms.Intrinsic
module to support intrinsic functions.Charms.Pointer
module to improve memory management and pointer operations.Bug Fixes
Charms.Defm.Definition
andCharms.Defm.Expander
.Documentation
Tests