
Compile pointer to memref #57

Merged · 3 commits · Dec 19, 2024
Conversation

jackalcooper (Contributor) commented Dec 19, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced type specifications for sorting algorithms and pointer handling across various modules.
    • Introduced parallel benchmarking for sorting algorithms.
    • Added new processing capabilities for literals and pointers in the Charms.Prelude module.
    • Added a new function in the Charms.Intrinsic module to support intrinsic functions.
    • Updated Charms.Pointer module to improve memory management and pointer operations.
  • Bug Fixes

    • Improved error handling and reporting in several modules, including Charms.Defm.Definition and Charms.Defm.Expander.
  • Documentation

    • Updated module documentation to reflect new features and changes, particularly in memory management and pointer operations.
  • Tests

    • Added new tests for sorting algorithms and type safety checks, ensuring robustness in functionality.
    • Removed tests for integer addition functionality, focusing on sorting algorithms instead.

coderabbitai bot (Contributor) commented Dec 19, 2024

Walkthrough

This pull request introduces comprehensive type refinements and code simplifications across multiple modules in the Charms library. The changes primarily focus on enhancing pointer type specifications, streamlining pointer operations, and improving type safety in sorting algorithms, benchmarking, and intrinsic function handling. Key modifications include updating function signatures to specify more precise pointer types, simplifying pointer loading and manipulation, and introducing better error handling mechanisms.
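The shift toward typed pointers can be illustrated with a schematic defm signature (a sketch only, not compilable as-is — the exact bodies live in the bench/ modules listed in the Changes table):

```elixir
# Sketch of the signature style this PR adopts. `Pointer.t(Term.t())`
# replaces a previously untyped pointer argument, so loads and stores
# can lower to memref operations instead of raw LLVM pointers.
defm do_sort(env, arr :: Pointer.t(Term.t()), len :: i32()) do
  # ... element access via Pointer.element_ptr/2, Pointer.load/1, Pointer.store/2
end
```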

Changes

File Change Summary
bench/enif_merge_sort.ex Updated do_sort function signature to specify Pointer.t(Term.t()), simplified midpoint calculation.
bench/enif_quick_sort.ex Enhanced type specifications for swap, partition, and do_sort functions with Pointer.t(Term.t()).
bench/enif_tim_sort.ex Refined type specifications for insertion_sort and tim_sort functions.
bench/sort_benchmark.exs Dynamically generated benchmark inputs, added parallel execution parameter.
lib/charms.ex Modified __before_compile__/1 macro to handle required intrinsic modules.
lib/charms/defm.ex Removed const macro.
lib/charms/pointer.ex Significant updates to memory management, added support for SIMD and Tensor operations.
lib/charms/prelude.ex Added new functions for literal conversion, pointer extraction, and argument preprocessing.
test/add_two_int_test.exs Introduced AddTwoIntTest module with functions for adding integers and error handling.
test/defm_test.exs Removed AddTwoInt module, added tests for sorting algorithms and type safety checks.

Sequence Diagram

sequenceDiagram
    participant Expander
    participant Definition
    participant Charms
    participant Intrinsic

    Charms->>Definition: Compile definitions
    Definition->>Expander: Expand to MLIR
    Expander-->>Definition: Return expanded definitions
    Definition->>Charms: Track required intrinsic modules
    Charms->>Intrinsic: Process intrinsic modules

Possibly related PRs

  • Macro const/1 as syntax sugar to create constant op #21: The main PR modifies the do_sort function in ENIFMergeSort, which is related to the changes in the do_sort function in ENIFQuickSort that also updates the type signature to specify that arr is a pointer to Term.t().
  • JIT module supervision #22: The changes in the do_sort function in ENIFQuickSort are relevant as they also involve updating the type signature to specify that arr is a pointer to Term.t(), similar to the changes in the main PR.
  • Replace Agent with LockedCache as to cache JIT compilation #25: The changes in the ENIFMergeSort module's do_sort function and the introduction of a new SortUtil.merge function are relevant as they indicate a restructuring of the merge functionality, which is also reflected in the changes made in the main PR.
  • Init JIT in wrapper function #29: The modifications in the do_sort function in ENIFQuickSort and the overall changes in the sorting modules are relevant as they reflect similar updates to the sorting logic and type specifications.
  • Collect referenced modules and init them together #34: The changes in the ENIFMergeSort and ENIFTimSort modules, which involve the restructuring of the merge functionality, are relevant as they align with the updates made in the main PR.
  • Disallow reference loop between modules #39: The restructuring of the merge functionality in ENIFMergeSort and ENIFTimSort is relevant as it indicates a shift in how merging is handled, which is also reflected in the main PR.
  • Function call return type inference #41: The modifications in the AddTwoIntVec module, particularly in the add function, are relevant as they involve similar updates to function signatures and handling of types.
  • Attach custom diagnostic server in JIT and expander #44: The introduction of a diagnostic server in the JIT and expander modules is relevant as it enhances error handling, which is a focus in the main PR.
  • Migrate leftover def transformation to defm #45: The migration of def transformations to defm is relevant as it aligns with the changes made in the main PR regarding function definitions.
  • Uncomment assertions of locals #46: The uncommenting of assertions related to locals is relevant as it reflects a focus on improving error handling and clarity in the tests, which is also a theme in the main PR.

Poem

🐰 Sorting bits with rabbit might,
Pointers dancing left and right,
Types refined with gentle care,
Code now flows without a snare,
Charms library takes elegant flight! 🚀


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14d8bf1 and 7f60dfe.

📒 Files selected for processing (1)
  • lib/charms/intrinsic.ex (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/charms/intrinsic.ex


coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
lib/charms/prelude.ex (1)

98-104: Chaining preprocessors elegantly.
The functional approach to apply each preprocessor in turn is clean. For large argument lists or multiple preprocessors, consider short-circuiting on errors for performance.
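As a generic illustration of that short-circuiting idea (plain Elixir with hypothetical preprocessor functions — not the actual Charms.Prelude ones):

```elixir
# Each preprocessor returns {:ok, value} or {:error, reason};
# Enum.reduce_while/3 halts at the first error instead of
# applying the remaining preprocessors.
preprocessors = [
  fn arg -> {:ok, arg} end,
  fn
    arg when is_integer(arg) -> {:ok, arg * 2}
    _arg -> {:error, :not_an_integer}
  end
]

apply_all = fn arg ->
  Enum.reduce_while(preprocessors, {:ok, arg}, fn pre, {:ok, acc} ->
    case pre.(acc) do
      {:ok, _} = ok -> {:cont, ok}
      {:error, _} = err -> {:halt, err}
    end
  end)
end

apply_all.(3)      # => {:ok, 6}
apply_all.(:atom)  # => {:error, :not_an_integer}
```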

lib/charms/pointer.ex (1)

106-117: Pointer type generation with layout.
Creating a strided layout attribute is a solid approach. Consider applying similar logic for multi-dimensional memrefs as needed.

bench/vec_add_int_list.ex (1)

15-17: Consider optimizing the reduce operation

While the pointer loading simplification is good, the reduce operation could be further optimized to avoid allocating a new pointer in each iteration.

Consider pre-allocating the pointer outside the reduce:

    init = SIMD.new(SIMD.t(i32(), 8), [0, 0, 0, 0, 0, 0, 0, 0])
+   v_ptr = Pointer.allocate(i32())
    Enum.reduce(l, init, fn x, acc ->
-     v_ptr = Pointer.allocate(i32())
      enif_get_int(env, x, v_ptr)
      i = Pointer.load(i_ptr)
      Pointer.store(i + 1, i_ptr)
      Pointer.load(v_ptr) |> vector.insertelement(acc, i)
    end)
bench/enif_quick_sort.ex (1)

13-29: Consider optimizing pointer arithmetic operations

While the implementation is correct, consider reducing the number of element_ptr calls by using pointer arithmetic. This could improve performance in tight loops.

-        swap(Pointer.element_ptr(arr, i), Pointer.element_ptr(start, j))
+        i_ptr = Pointer.offset(arr, i)
+        j_ptr = Pointer.offset(start, j)
+        swap(i_ptr, j_ptr)
test/add_two_int_test.exs (1)

23-24: Remove redundant arithmetic operations

These operations (sum / 1, sum + 1 - 1) don't affect the result and can be removed for clarity.

-            sum = sum / 1
-            sum = sum + 1 - 1
bench/enif_tim_sort.ex (1)

11-33: Consider simplifying the insertion sort implementation

The current implementation uses multiple pointer operations and allocations in tight loops. Consider simplifying by:

  1. Reducing temporary pointer allocations
  2. Using pointer arithmetic instead of repeated element_ptr calls

     for_loop {temp, i} <- {start, n} do
       i = value index.casts(i) :: i32()
       i = i + start_i
-      j_ptr = Pointer.allocate(i32())
-      Pointer.store(i - 1, j_ptr)
+      j = i - 1
 
       while(
-        Pointer.load(i32(), j_ptr) >= left &&
-          Pointer.load(Pointer.element_ptr(arr, Pointer.load(i32(), j_ptr))) >
-            temp
+        j >= left && Pointer.load(Pointer.element_ptr(arr, j)) > temp
       ) do
-        j = Pointer.load(i32(), j_ptr)
         Pointer.store(
           Pointer.load(Pointer.element_ptr(arr, j)),
           Pointer.element_ptr(arr, j + 1)
         )
-        Pointer.store(j - 1, j_ptr)
+        j = j - 1
       end
lib/charms.ex (1)

63-69: Consider adding duplicate check for intrinsic modules

While the code handles module self-reference (r != env.module), it might be worth checking for duplicates in required_intrinsic_modules to avoid redundant processing.

Consider adding:

 i =
-  for r <- required_intrinsic_modules, r != env.module do
+  for r <- required_intrinsic_modules |> Enum.uniq(), r != env.module do
test/defm_test.exs (2)

46-64: Consider adding memory leak checks

The sorting tests cover functionality well, but consider adding memory leak detection for the allocated arrays, especially important when dealing with pointer operations.

Consider adding:

# After the sorting tests
test "#{s} memory usage" do
  initial_memory = :erlang.memory(:total)
  # Run multiple sort operations
  final_memory = :erlang.memory(:total)
  # Assert memory growth is within acceptable bounds
  assert (final_memory - initial_memory) < threshold
end

168-180: Expand type mismatch test coverage

While the f32 to i32 mismatch test is good, consider adding more edge cases relevant to pointer/memref compilation.

Consider adding:

test "pointer type mismatches" do
  assert_raise ArgumentError, ~r/type mismatch/, fn ->
    # Test pointer type conversion mismatches
    # Test memref type conversion mismatches
  end
end
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37abfda and 58b737d.

📒 Files selected for processing (17)
  • bench/enif_merge_sort.ex (1 hunks)
  • bench/enif_quick_sort.ex (2 hunks)
  • bench/enif_tim_sort.ex (1 hunks)
  • bench/sort_benchmark.exs (1 hunks)
  • bench/sort_util.ex (5 hunks)
  • bench/vec_add_int_list.ex (1 hunks)
  • lib/charms.ex (2 hunks)
  • lib/charms/constant.ex (1 hunks)
  • lib/charms/defm.ex (0 hunks)
  • lib/charms/defm/definition.ex (3 hunks)
  • lib/charms/defm/expander.ex (9 hunks)
  • lib/charms/intrinsic.ex (1 hunks)
  • lib/charms/jit.ex (2 hunks)
  • lib/charms/pointer.ex (3 hunks)
  • lib/charms/prelude.ex (2 hunks)
  • test/add_two_int_test.exs (1 hunks)
  • test/defm_test.exs (2 hunks)
💤 Files with no reviewable changes (1)
  • lib/charms/defm.ex
✅ Files skipped from review due to trivial changes (1)
  • lib/charms/constant.ex
🔇 Additional comments (48)
lib/charms/prelude.ex (8)

7-7: Looks good on the Dialect alias additions.
No issues spotted; the extended alias list provides broader MLIR Dialect coverage.


10-13: Validate numeric literal conversion.
The specialization for integers/floats seems appropriate. Ensure all other numeric edge cases (e.g., big integers) are handled as expected.


17-17: Fallback literal logic.
Returning the value unmodified for non-numeric inputs aligns with your design. No issues found.


39-46: Minimal overhead for 'const' definition.
This implementation neatly delegates literal conversion to the existing Charms.Constant facility. No issues found.


55-85: Confirm pointer extraction logic.
The condition checks for both an LLVM pointer and a memref pointer, which may appear contradictory. Consider verifying that both checks are intended and won't skip some valid scenarios.


87-96: Argument type consistency check.
Raising an ArgumentError on a type mismatch is straightforward. Good approach for early failure and debugging.


105-118: Robust ENIF call mechanism.
This function encapsulates argument preprocessing and function calling neatly. Consider verifying what happens if the callee is missing from ENIF.signature.


122-126: Automated intrinsic creation.
Generating intrinsics in a loop aligns well with common patterns. Code readability and maintainability are good.

lib/charms/pointer.ex (13)

4-5: Enhanced module documentation.
The updated docstrings clarify the intended role of smart pointers with SIMD/Tensor support.


7-7: Imports and usage.
Using both Beaver and Charms.Intrinsic suggests deeper integration of pointer operations. No immediate issues.


11-11: Expanded alias usage.
Referencing MemRef, Index, and Arith fosters clearer pointer operations. Seems consistent with the module’s focus.


26-51: Flexible array allocation.
Distinguishing between integer and MLIR.Value-based sizes looks good. Ensuring type conversions for dynamic sizes can prevent runtime errors.


59-71: Conditional load for LLVM vs. MemRef pointer.
Ensuring both pointer types are handled is beneficial. Keep an eye on edge cases where the pointer might be partially typed or cast incorrectly.


73-90: Pointer type checks.
Providing distinct memref_ptr? overloads for Type and Value helps unify pointer handling. The load/1 function’s error message is quite direct, which aids debugging.


98-104: Store method usage.
Storing data at index 0 for a pointer is straightforward. Ensure no out-of-bounds scenarios occur for arrays bigger than one element.


118-142: Offset pointer recalculation.
Reinterpreting the cast works, but ensure dynamic offsets/sizes are always matched. The error raise is helpful if the dense arrays can’t be built.


144-155: Element pointer type consistency check.
Verifying that elem_t matches the pointer’s element type is crucial to avoid silent misalignment issues.


156-168: Ensure integer index type.
Casting n to Type.index() is prudent. Double-check if negative indices are possible or should be allowed.


176-190: Shortcut for typed pointer element_ptr.
Raising an error if the pointer is not typed ensures clarity. This approach is consistent with the code in load/1.


192-198: Retrieve element type from typed pointer.
The explicit error message is helpful. This ensures that only typed pointers pass.


210-213: Typed pointer creation.
Returning a MemRef-based pointer type from t(elem_t) is consistent with the rest of the Pointer module’s approach.

lib/charms/defm/definition.ex (3)

262-272: Accumulating required intrinsic modules.
Collecting these modules in MapSet improves composability across definitions. Good approach to track cross-module dependencies.


293-295: Propagate intrinsic modules in the result tuple.
Returning required_intrinsic_modules alongside IR and referenced_modules is logical. This ensures the compiler always knows which intrinsics are needed.


318-333: Enhanced error handling with stack traces.
Re-raising the error with a trace is highly beneficial for debugging. The fallback cases appear thorough.

lib/charms/defm/expander.ex (8)

22-22: Additional MLIR dialect alias added.
Ensuring SCF, MemRef, Index, Arith, and UB are available strengthens coverage for expansions.


37-37: Tracking required_intrinsic_modules in Expander.
This field ensures expansions automatically record dependent intrinsic modules. Smooth integration with the rest of the pipeline.


470-470: Marking intrinsic modules as used.
Updating state.mlir.required_intrinsic_modules in expand_intrinsics is a straightforward approach to track usage.


501-501: export_intrinsics? logic.
Delegating to module.intrinsics fosters dynamic discovery. No issues spotted; good maintainability.


785-790: '::' operator expansion.
Retaining the AST structure with {:"::", meta, [param, type]} is helpful when generating typed calls.


1125-1129: for_loop design.
This snippet constructs a structured loop over the pointer with SCF.for. The code sets an element and index variables properly.


1237-1241: export_intrinsics? checks.
Confirming the target module is compiled and loaded prevents runtime surprises. Good safety net for dynamic calls.


1248-1253: Handling type creation calls.
When called on MLIR.Type, the fallback ensures arguments are expanded before applying type constructors. This is consistent with the dynamic approach.

bench/sort_benchmark.exs (2)

13-13: Verify parallel benchmark stability

Adding parallel benchmarking could affect measurement accuracy due to system resource contention. Consider documenting the rationale for choosing 2 parallel processes and verifying that results remain consistent across multiple runs.


14-14: LGTM! Improved input generation

The dynamic input generation using Enum.map and Enum.into is more maintainable than the previous hardcoded approach.
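The pattern described can be sketched as follows (sizes and labels are assumptions, not the exact bench/sort_benchmark.exs values):

```elixir
# Build the benchmark `inputs` map dynamically instead of hardcoding lists.
inputs =
  [100, 1_000, 10_000]
  |> Enum.map(fn n -> {"#{n} elements", Enum.shuffle(1..n)} end)
  |> Enum.into(%{})

# Benchee.run(%{"merge sort" => &ENIFMergeSort.sort/1},
#   inputs: inputs, parallel: 2)
```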

bench/enif_merge_sort.ex (2)

6-6: LGTM! Enhanced type safety

The explicit Pointer.t(Term.t()) type specification improves type safety and code clarity.


9-9: LGTM! Simplified midpoint calculation

The direct value assignment using value arith.divsi is more efficient than the previous approach using result_at.

bench/vec_add_int_list.ex (1)

8-9: LGTM! Improved type safety in pointer initialization

Using Pointer.element_type(i_ptr) for the constant type ensures type consistency with the pointer.

bench/enif_quick_sort.ex (2)

6-10: LGTM! Type refinement improves safety

The updated type signature Pointer.t(Term.t()) provides better type safety, and the implementation is clean and efficient.


Line range hint 33-39: LGTM! Type refinement is consistent

The updated type signature maintains consistency with other functions while preserving the correct recursive quicksort implementation.
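For readers less familiar with the pointer-based version, the same recursive structure in plain Elixir lists (illustrative only — the defm implementation sorts in place through Pointer.t(Term.t())):

```elixir
defmodule PlainQuickSort do
  # Classic functional quicksort: pivot on the head, recurse on partitions.
  def sort([]), do: []

  def sort([pivot | rest]) do
    {smaller, larger} = Enum.split_with(rest, &(&1 <= pivot))
    sort(smaller) ++ [pivot] ++ sort(larger)
  end
end

PlainQuickSort.sort([3, 1, 4, 1, 5])  # => [1, 1, 3, 4, 5]
```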

test/add_two_int_test.exs (1)

49-57: LGTM! Comprehensive test coverage

Good test coverage including:

  • JIT initialization and caching
  • Normal addition cases
  • Error handling for invalid inputs
  • Cleanup with engine destruction
bench/enif_tim_sort.ex (1)

Line range hint 37-77: LGTM! Tim sort implementation with proper type refinements

The implementation correctly handles:

  • Run size management
  • Merging of sorted runs
  • Memory management with proper pointer types
bench/sort_util.ex (2)

Line range hint 5-24: LGTM: Type refinement improves safety of pointer operations

The enhanced type specification Pointer.t(Term.t()) provides better type safety for pointer operations. The implementation correctly manages memory allocation and list traversal.


Line range hint 27-102: Consider adding memory cleanup for temporary arrays

While the pointer type refinements improve type safety, the temporary arrays (left_temp and right_temp) allocated at lines 31-32 should be properly deallocated after use to prevent memory leaks.

Let's verify if there are any memory cleanup patterns used elsewhere in the codebase:

lib/charms/intrinsic.ex (1)

23-23: Verify the necessity of explicit nil return

The __use_intrinsic__ function is added as a hook for intrinsic module handling, but the explicit nil return might be unnecessary as it's the default return value in Elixir.

Consider simplifying to:

-def __use_intrinsic__, do: nil
+def __use_intrinsic__, do: :ok

Let's check if there are any direct calls to this function that depend on the return value:

✅ Verification successful

Keep the explicit nil return value

The __use_intrinsic__ function is called during compile-time in a quoted expression within lib/charms.ex for required intrinsic modules. Since it's part of the module's public interface and is explicitly called in compile-time machinery, it's better to keep the explicit nil return value for clarity and to maintain a clear contract, even though it's the default return value in Elixir.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for direct calls to __use_intrinsic__
rg "__use_intrinsic__.*\(" --type elixir

Length of output: 42


Script:

#!/bin/bash
# Let's try a broader search for __use_intrinsic__ without parentheses
rg "__use_intrinsic__" --type elixir -A 3

# Also search for any references in macro definitions or module attributes
ast-grep --pattern 'def $_ do
  $$$
  __use_intrinsic__
  $$$
end'

# Let's also check if it's used in any module attributes or compile-time checks
rg "@.*__use_intrinsic__" --type elixir

Length of output: 421


Script:

#!/bin/bash
# Let's check the context of how it's used in lib/charms.ex
rg -B 5 -A 5 "__use_intrinsic__" lib/charms.ex

# Also check if there are any compile-time behaviors or guards related to this function
ast-grep --pattern 'defmacro __using__($_) do
  $$$
  __use_intrinsic__
  $$$
end'

Length of output: 326

lib/charms.ex (2)

51-53: LGTM: Enhanced compilation output with intrinsic modules

The addition of required_intrinsic_modules to the compilation output improves the tracking of intrinsic dependencies.


74-74: LGTM: Proper integration of intrinsic modules

The unquote_splicing(i) correctly integrates the intrinsic module quotes into the final quoted expression.

lib/charms/jit.ex (2)

24-24: Verify vector conversion implications

The uncommented vector conversion pass with 'reassociate-fp-reductions' could affect floating-point precision and performance. While this is necessary for pointer/memref compilation, we should ensure it doesn't introduce unexpected behavior.

✅ Verification successful

Let me check the floating-point operations in the test files to better understand the impact.


Vector conversion is safe to enable

The vector conversion pass with 'reassociate-fp-reductions' is safe to enable because:

  1. The codebase only uses vectors for integer operations (as seen in bench/vec_add_int_list.ex)
  2. Floating-point operations are limited to test cases and type checking, not actual computations
  3. The vector-to-llvm conversion is only referenced in jit.ex, suggesting it's a necessary compilation step
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for vector-related operations that might be affected
ast-grep --pattern 'vector.$$$'

# Look for floating-point operations that might be impacted
rg -l 'f(32|64)' 

Length of output: 918


Script:

#!/bin/bash
# Check floating-point related tests
rg -A 5 'f(32|64)' test/defm_test.exs test/const_test.exs

# Look for any MLIR/LLVM vector conversion patterns in the codebase
rg -l "convert.*vector.*llvm"

Length of output: 2299


85-86: Consider preserving error context

The simplified error handling might make debugging harder by not capturing and formatting diagnostic information. Consider preserving some form of error context while maintaining the streamlined approach.

@jackalcooper jackalcooper merged commit 5ab98a7 into main Dec 19, 2024
1 check passed
@jackalcooper jackalcooper deleted the dynamic-memref-as-ptr branch December 19, 2024 14:08